Reinforcement Learning based Algorithm with Safety Handling and Risk Perception

S. Shyamsundar, Tommaso Mannucci, Erik-Jan van Kampen

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

1 Citation (Scopus)
59 Downloads (Pure)

Abstract

Navigation in an unknown or uncertain environment is a challenging task for an autonomous agent. The agent is expected to behave independently and to learn a suitable action for each situation. Reinforcement learning can help the agent adapt to an unknown environment and learn the right actions to take. This paper presents the setup and results of a reinforcement learning problem that combines Q-learning with a Safety Handling Exploration with Risk Perception Algorithm (SHERPA) for safe exploration in an unknown environment. The agent must explore its environment safely and learn the optimal action for a given situation from the feedback it receives from the environment. The results show that the agent learns a value function that converges to within 10% of the optimal values after 5000 iterations. The simulation results also show that the proposed approach ensures that the agent explores an unknown environment safely and learns the desirable actions for a given situation.
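
The idea of pairing Q-learning with a safety check during exploration can be illustrated with a small sketch. This is not the SHERPA algorithm and does not reproduce the paper's setup: the 4x4 grid, the rewards, the hyperparameters, and the helper names (`safe_actions`, `train`, `greedy_path`) are all assumptions made for illustration, and the "risk perception" here is only a one-step lookahead that rejects actions leading into a hazard cell.

```python
import random

# Hypothetical 4x4 grid (an assumption, not the paper's environment):
# 'S' start, 'G' goal, 'X' hazard (fatal state), '.' free cell.
GRID = [
    "S..X",
    ".X..",
    "....",
    "X..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ROWS, COLS = len(GRID), len(GRID[0])


def step(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < ROWS and 0 <= nc < COLS):
        nr, nc = r, c
    return (nr, nc)


def is_hazard(state):
    return GRID[state[0]][state[1]] == "X"


def is_goal(state):
    return GRID[state[0]][state[1]] == "G"


def safe_actions(state):
    """Simplified one-step risk check: drop actions that enter a hazard."""
    return [a for a in range(len(ACTIONS)) if not is_hazard(step(state, a))]


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning where exploration is limited to safe actions."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS)
         for a in range(len(ACTIONS))}
    hazard_visits = 0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap on episode length
            allowed = safe_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(allowed)  # explore, but only safely
            else:
                action = max(allowed, key=lambda a: q[(state, a)])
            nxt = step(state, action)
            hazard_visits += is_hazard(nxt)  # stays 0 if the filter works
            done = is_goal(nxt)
            reward = 10.0 if done else -1.0
            best_next = 0.0 if done else max(
                q[(nxt, a)] for a in safe_actions(nxt))
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q, hazard_visits


def greedy_path(q, start=(0, 0), max_steps=20):
    """Follow the greedy policy, restricted to actions the risk check allows."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if is_goal(state):
            break
        action = max(safe_actions(state), key=lambda a: q[(state, a)])
        state = step(state, action)
        path.append(state)
    return path


q, hazard_visits = train()
path = greedy_path(q)
print("hazard visits during training:", hazard_visits)
print("greedy path:", path)
```

The sketch assumes the agent can perceive hazards one step ahead and that the environment is deterministic. Under those assumptions the filter keeps the agent out of hazard cells for the whole of training, while the Q-learning update itself is unchanged.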
Original language: English
Title of host publication: 2016 IEEE Symposium Series on Computational Intelligence
Subtitle of host publication: Athens, Greece
Editors: Y Jin, S. Kollias
Publisher: IEEE
Number of pages: 7
DOIs
Publication status: E-pub ahead of print - 2016
Event: 2016 IEEE Symposium Series on Computational Intelligence - Athens, Greece
Duration: 6 Oct 2016 → 9 Oct 2016
http://ssci2016.cs.surrey.ac.uk/

Conference

Conference: 2016 IEEE Symposium Series on Computational Intelligence
Abbreviated title: SSCI 2016
Country/Territory: Greece
City: Athens
Period: 6/10/16 → 9/10/16
