Comparing Exploration Approaches in Deep Reinforcement Learning for Traffic Light Control

Research output: Chapter in Book/Conference proceedings › Conference contribution › Scientific › Peer-reviewed



Identifying the most efficient exploration approach for deep reinforcement learning in traffic light control is not a trivial task, and can be a critical step in developing reinforcement learning solutions that effectively reduce traffic congestion. It is common to use baseline dithering methods such as ε-greedy. However, the value of more evolved exploration approaches in this setting has not yet been determined. This paper addresses this concern by comparing the performance of the popular deep Q-learning algorithm using one baseline and two state-of-the-art exploration approaches, as well as their combination. Specifically, ε-greedy is used as a baseline and compared to Bootstrapped DQN, randomized prior functions, and their combination. This is done in three different traffic scenarios capturing different traffic profiles. The results obtained suggest that the higher the complexity of the traffic scenario, and the larger the observation space of the agent, the larger the gain from efficient exploration. This is illustrated by the improved performance observed in agents that use efficient exploration and enjoy a larger observation space in the complex traffic scenarios.
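The abstract names ε-greedy as the baseline dithering method. The paper's own implementation is not given here, but the general ε-greedy action-selection rule can be sketched as follows (a minimal illustration, assuming a discrete action space such as a set of traffic light phases; the function name and signature are hypothetical):

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Baseline dithering exploration: with probability epsilon select a
    uniformly random action (explore); otherwise act greedily with respect
    to the current Q-value estimates (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))  # explore: random action index
    # exploit: index of the action with the highest estimated Q-value
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Bootstrapped DQN and randomized prior functions, by contrast, drive exploration through an ensemble of value heads rather than injected action noise, which is why the paper refers to them as more evolved approaches.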
Original language: English
Title of host publication: BNAIC/BeneLearn 2020
Publisher: RU Leiden
Publication status: Published - 2020
Event: BNAIC/BENELEARN 2020 - Leiden, Netherlands
Duration: 19 Nov 2020 – 20 Nov 2020




