Where to go next: Learning a Subgoal Recommendation Policy for Navigation in Dynamic Environments

Bruno Brito, Michael Everett, Jonathan Patrick How, Javier Alonso-Mora

Research output: Contribution to journal › Article › Scientific › peer-review


Abstract

Robotic navigation in environments shared with other robots or humans remains challenging because the intentions of the surrounding agents are not directly observable and the environment conditions are continuously changing. Local trajectory optimization methods, such as model predictive control (MPC), can deal with those changes but require global guidance, which is not trivial to obtain in crowded scenarios. This paper proposes to learn, via deep Reinforcement Learning (RL), an interaction-aware policy that provides long-term guidance to the local planner. In particular, in simulations with cooperative and non-cooperative agents, we train a deep network to recommend a subgoal for the MPC planner. The recommended subgoal is expected to help the robot make progress toward its goal while accounting for the expected interaction with other agents. Based on the recommended subgoal, the MPC planner then optimizes the robot's inputs while satisfying its kinodynamic and collision-avoidance constraints. Our approach substantially reduces the number of collisions compared to prior MPC frameworks, and improves both travel time and number of collisions compared to deep RL methods in cooperative, competitive, and mixed multi-agent scenarios.
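The two-stage architecture described in the abstract — a learned policy that recommends a subgoal, followed by an MPC planner that tracks it under constraints — can be illustrated with a minimal sketch. Everything below is a hypothetical toy: the hand-crafted scoring function stands in for the deep RL policy the paper actually trains, and the sampling-based one-step tracker stands in for a real MPC solver; function names, candidate subgoals, and agent positions are illustrative assumptions, not the paper's implementation.

```python
import math

def recommend_subgoal(goal, candidates, agents, interaction_weight=1.0):
    """Stand-in for the learned RL policy: score candidate subgoals by
    progress toward the goal minus a penalty for proximity to other
    agents. (The paper learns this interaction-aware scoring with deep
    RL; here it is hand-crafted for illustration.)"""
    def score(c):
        progress = -math.dist(c, goal)
        crowding = sum(1.0 / (math.dist(c, a) + 1e-3) for a in agents)
        return progress - interaction_weight * crowding
    return max(candidates, key=score)

def mpc_step(robot, subgoal, agents, v_max=1.0, safe_dist=0.5, n_samples=64):
    """Toy one-step 'MPC': sample velocity commands on the reachable set,
    discard those violating the collision-avoidance constraint, and pick
    the one that best tracks the recommended subgoal."""
    best, best_cost = robot, float("inf")
    for i in range(n_samples):
        ang = 2 * math.pi * i / n_samples
        nxt = (robot[0] + v_max * math.cos(ang),
               robot[1] + v_max * math.sin(ang))
        if any(math.dist(nxt, a) < safe_dist for a in agents):
            continue  # candidate input violates the collision constraint
        cost = math.dist(nxt, subgoal)
        if cost < best_cost:
            best, best_cost = nxt, cost
    return best

# One planning cycle: the policy recommends a subgoal, the MPC tracks it.
robot, goal = (0.0, 0.0), (10.0, 0.0)
agents = [(2.0, 0.0)]                       # an agent blocks the direct path
candidates = [(3.0, 2.0), (3.0, -2.0), (3.0, 0.0)]
subgoal = recommend_subgoal(goal, candidates, agents)
robot = mpc_step(robot, subgoal, agents)
```

With the agent sitting on the straight line to the goal, the crowding penalty steers the recommendation toward a laterally offset subgoal, and the MPC step then moves the robot toward it while keeping the safety distance — capturing, in miniature, the division of labor between long-term guidance and local constrained optimization.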

Original language: English
Pages (from-to): 4616-4623
Journal: IEEE Robotics and Automation Letters
Volume: 6
Issue number: 3
DOIs
Publication status: Published - 2021

Bibliographical note

Accepted Author Manuscript

Keywords

  • Collision avoidance
  • Deep Reinforcement Learning
  • Motion and Path Planning in Dynamic Environments or for Multi-robot Systems
  • Navigation
  • Planning
  • Robot kinematics
  • Robots
  • Training
  • Vehicle dynamics
