Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning

Ye Zhou; Hann Woei Ho

doi:10.1016/j.engappai.2022.105152

Ye Zhou*, Hann Woei Ho

*Corresponding author for this work

Control & Simulation

Research output: Contribution to journal › Article › Scientific › peer-review

4 Citations (Scopus)

12 Downloads (Pure)

Abstract

Hierarchical Reinforcement Learning (HRL) provides an option to solve complex guidance and navigation problems with high-dimensional spaces, multiple objectives, and a large number of states and actions. The current HRL methods often use the same or similar reinforcement learning methods within one application so that multiple objectives can be easily combined. Since there is not a single learning method that can benefit all targets, hybrid Hierarchical Reinforcement Learning (hHRL) was proposed to use various methods to optimize the learning with different types of information and objectives in one application. The previous hHRL method, however, requires manual task-specific designs, which involves engineers’ preferences and may impede its transfer learning ability. This paper, therefore, proposes a systematic online guidance and navigation method under the framework of hHRL, which generalizes training samples with a function approximator, decomposes the state space automatically, and thus does not require task-specific designs. The simulation results indicate that the proposed method is superior to the previous hHRL method, which requires manual decomposition, in terms of the convergence rate and the learnt policy. It is also shown that this method is generally applicable to non-stationary environments changing over episodes and over time without the loss of efficiency even with noisy state information.

Original language	English
Article number	105152
Number of pages	9
Journal	Engineering Applications of Artificial Intelligence
Volume	114
DOIs	https://doi.org/10.1016/j.engappai.2022.105152
Publication status	Published - 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Function approximation
Hybrid Hierarchical Reinforcement Learning
Non-stationary environment
Online guidance and navigation
State space decomposition

Access to Document

10.1016/j.engappai.2022.105152

1_s2.0_S0952197622002676_main: Final published version, 1.14 MB

Cite this

@article{0797bee50a87478484d0335d6604b4c2,

title = "Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning",

abstract = "Hierarchical Reinforcement Learning (HRL) provides an option to solve complex guidance and navigation problems with high-dimensional spaces, multiple objectives, and a large number of states and actions. The current HRL methods often use the same or similar reinforcement learning methods within one application so that multiple objectives can be easily combined. Since there is not a single learning method that can benefit all targets, hybrid Hierarchical Reinforcement Learning (hHRL) was proposed to use various methods to optimize the learning with different types of information and objectives in one application. The previous hHRL method, however, requires manual task-specific designs, which involves engineers{\textquoteright} preferences and may impede its transfer learning ability. This paper, therefore, proposes a systematic online guidance and navigation method under the framework of hHRL, which generalizes training samples with a function approximator, decomposes the state space automatically, and thus does not require task-specific designs. The simulation results indicate that the proposed method is superior to the previous hHRL method, which requires manual decomposition, in terms of the convergence rate and the learnt policy. It is also shown that this method is generally applicable to non-stationary environments changing over episodes and over time without the loss of efficiency even with noisy state information.",

keywords = "Function approximation, Hybrid Hierarchical Reinforcement Learning, Non-stationary environment, Online guidance and navigation, State space decomposition",

author = "Ye Zhou and Ho, {Hann Woei}",

note = "Green Open Access added to TU Delft Institutional Repository {\textquoteleft}You share, we take care!{\textquoteright} – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. ",

year = "2022",

doi = "10.1016/j.engappai.2022.105152",

language = "English",

volume = "114",

journal = "Engineering Applications of Artificial Intelligence",

issn = "0952-1976",

publisher = "Elsevier",

}

TY - JOUR

T1 - Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning

AU - Zhou, Ye

AU - Ho, Hann Woei

N1 - Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2022

Y1 - 2022

AB - Hierarchical Reinforcement Learning (HRL) provides an option to solve complex guidance and navigation problems with high-dimensional spaces, multiple objectives, and a large number of states and actions. The current HRL methods often use the same or similar reinforcement learning methods within one application so that multiple objectives can be easily combined. Since there is not a single learning method that can benefit all targets, hybrid Hierarchical Reinforcement Learning (hHRL) was proposed to use various methods to optimize the learning with different types of information and objectives in one application. The previous hHRL method, however, requires manual task-specific designs, which involves engineers’ preferences and may impede its transfer learning ability. This paper, therefore, proposes a systematic online guidance and navigation method under the framework of hHRL, which generalizes training samples with a function approximator, decomposes the state space automatically, and thus does not require task-specific designs. The simulation results indicate that the proposed method is superior to the previous hHRL method, which requires manual decomposition, in terms of the convergence rate and the learnt policy. It is also shown that this method is generally applicable to non-stationary environments changing over episodes and over time without the loss of efficiency even with noisy state information.

KW - Function approximation

KW - Hybrid Hierarchical Reinforcement Learning

KW - Non-stationary environment

KW - Online guidance and navigation

KW - State space decomposition

UR - http://www.scopus.com/inward/record.url?scp=85133698566&partnerID=8YFLogxK

U2 - 10.1016/j.engappai.2022.105152

DO - 10.1016/j.engappai.2022.105152

M3 - Article

AN - SCOPUS:85133698566

SN - 0952-1976

VL - 114

JO - Engineering Applications of Artificial Intelligence

JF - Engineering Applications of Artificial Intelligence

M1 - 105152

ER -
