Abstract
Reinforcement learning is a paradigm for learning decision-making tasks through interaction with an environment. Function approximators mitigate part of the curse of dimensionality when learning in high-dimensional state and/or action spaces, but learning a good policy directly in a high-dimensional state space can still be time-consuming. A method is proposed that initially limits the state and action space to a subset of the variables of the Markov Decision Process, so that the agent first learns a coarse policy. The agent is then gradually exposed to new state and action variables, increasing the dimensionality of the state and action space until it matches that of the full control problem. A local function approximator has been developed that supports this expansion of the state and action space. The concept is applied to the Model-Learning Actor-Critic, a model-based Heuristic Dynamic Programming algorithm. Its functioning is demonstrated by training a reinforcement learning agent for 2-dimensional hover control of a Parrot AR 2.0 quad-rotor. It is shown that the agent learns faster and achieves a better policy when exposed to the state and action variables gradually rather than all at once from the start.
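The core idea of the abstract can be sketched as a staged masking of the agent's observation. This is a minimal illustrative sketch in plain Python, not the paper's implementation: the class name, the zero-masking of hidden variables, and the stage schedule are all assumptions made for illustration.

```python
# Hypothetical sketch of gradual state-variable exposure: the agent first
# sees only a subset of the MDP's state variables, and further variables
# are unmasked in stages until the full problem dimensionality is reached.

class GradualExposure:
    def __init__(self, full_dim, stages):
        self.full_dim = full_dim  # dimensionality of the full MDP state
        self.stages = stages      # each stage lists the exposed state indices
        self.stage = 0            # start with the coarsest subset

    def observe(self, full_state):
        """Return the state with not-yet-exposed variables zeroed out."""
        active = set(self.stages[self.stage])
        return [x if i in active else 0.0 for i, x in enumerate(full_state)]

    def advance(self):
        """Expose the next group of variables, up to the full state."""
        if self.stage < len(self.stages) - 1:
            self.stage += 1


# Example: a 4-dimensional state revealed over three stages
exposure = GradualExposure(full_dim=4, stages=[[0, 1], [0, 1, 2], [0, 1, 2, 3]])
s = [0.5, -0.2, 1.0, 0.3]
print(exposure.observe(s))  # coarse view: only the first two variables visible
exposure.advance()
print(exposure.observe(s))  # third variable now exposed
```

In the paper's setting, the same expansion applies to action variables as well, and the local function approximator is what allows the learned policy to survive each increase in dimensionality.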
Original language | English |
---|---|
Title of host publication | Proceedings of the 2018 AIAA Information Systems-AIAA Infotech @ Aerospace |
Publisher | American Institute of Aeronautics and Astronautics Inc. (AIAA) |
Number of pages | 19 |
ISBN (Electronic) | 978-1-62410-527-2 |
DOIs | |
Publication status | Published - 2018 |
Event | AIAA Information Systems-AIAA Infotech at Aerospace, 2018 - Kissimmee, United States. Duration: 8 Jan 2018 → 12 Jan 2018. https://doi.org/10.2514/MIAA18 |
Conference
Conference | AIAA Information Systems-AIAA Infotech at Aerospace, 2018 |
---|---|
Country/Territory | United States |
City | Kissimmee |
Period | 8/01/18 → 12/01/18 |
Internet address | |