Where to go next: Learning a Subgoal Recommendation Policy for Navigation in Dynamic Environments

Bruno Brito, Michael Everett, Jonathan Patrick How, Javier Alonso-Mora

Research output: Contribution to journal › Article › Scientific › peer-review


Abstract

Robotic navigation in environments shared with other robots or humans remains challenging because the intentions of the surrounding agents are not directly observable and the environment conditions are continuously changing. Local trajectory optimization methods, such as model predictive control (MPC), can deal with those changes but require global guidance, which is not trivial to obtain in crowded scenarios. This paper proposes to learn, via deep Reinforcement Learning (RL), an interaction-aware policy that provides long-term guidance to the local planner. In particular, in simulations with cooperative and non-cooperative agents, we train a deep network to recommend a subgoal for the MPC planner. The recommended subgoal is expected to help the robot make progress toward its goal while accounting for the expected interaction with other agents. Based on the recommended subgoal, the MPC planner then optimizes the robot's inputs while satisfying its kinodynamic and collision-avoidance constraints. Our approach substantially reduces the number of collisions compared to prior MPC frameworks, and improves both travel time and number of collisions compared to deep RL methods in cooperative, competitive, and mixed multi-agent scenarios.
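The two-stage architecture described in the abstract — a learned policy that recommends a subgoal, followed by an MPC planner that tracks it under constraints — can be illustrated with a minimal sketch. Everything below is a hypothetical toy: the hand-crafted scoring function stands in for the deep RL policy the paper actually trains, and the sampling-based one-step tracker stands in for a real MPC solver; function names, candidate subgoals, and agent positions are illustrative assumptions, not the paper's implementation.

```python
import math

def recommend_subgoal(goal, candidates, agents, interaction_weight=1.0):
    """Stand-in for the learned RL policy: score candidate subgoals by
    progress toward the goal minus a penalty for proximity to other
    agents. (The paper learns this interaction-aware scoring with deep
    RL; here it is hand-crafted for illustration.)"""
    def score(c):
        progress = -math.dist(c, goal)
        crowding = sum(1.0 / (math.dist(c, a) + 1e-3) for a in agents)
        return progress - interaction_weight * crowding
    return max(candidates, key=score)

def mpc_step(robot, subgoal, agents, v_max=1.0, safe_dist=0.5, n_samples=64):
    """Toy one-step 'MPC': sample velocity commands on the reachable set,
    discard those violating the collision-avoidance constraint, and pick
    the one that best tracks the recommended subgoal."""
    best, best_cost = robot, float("inf")
    for i in range(n_samples):
        ang = 2 * math.pi * i / n_samples
        nxt = (robot[0] + v_max * math.cos(ang),
               robot[1] + v_max * math.sin(ang))
        if any(math.dist(nxt, a) < safe_dist for a in agents):
            continue  # candidate input violates the collision constraint
        cost = math.dist(nxt, subgoal)
        if cost < best_cost:
            best, best_cost = nxt, cost
    return best

# One planning cycle: the policy recommends a subgoal, the MPC tracks it.
robot, goal = (0.0, 0.0), (10.0, 0.0)
agents = [(2.0, 0.0)]                       # an agent blocks the direct path
candidates = [(3.0, 2.0), (3.0, -2.0), (3.0, 0.0)]
subgoal = recommend_subgoal(goal, candidates, agents)
robot = mpc_step(robot, subgoal, agents)
```

With the agent sitting on the straight line to the goal, the crowding penalty steers the recommendation toward a laterally offset subgoal, and the MPC step then moves the robot toward it while keeping the safety distance — capturing, in miniature, the division of labor between long-term guidance and local constrained optimization.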

Original language: English
Pages (from-to): 4616-4623
Journal: IEEE Robotics and Automation Letters
Volume: 6
Issue number: 3
DOIs
Publication status: Published - 2021

Bibliographical note

Accepted Author Manuscript

Keywords

  • Collision avoidance
  • Deep Reinforcement Learning
  • Motion and Path Planning in Dynamic Environments or for Multi-robot Systems
  • Navigation
  • Planning
  • Robot kinematics
  • Robots
  • Training
  • Vehicle dynamics
