Flight test of Quadcopter Guidance with Vision-Based Reinforcement Learning

Manan Siddiquee, J. Junell, Erik-jan van Kampen

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review


Reinforcement Learning (RL) has been applied to teach quadcopters guidance tasks. Most applications rely on position information from an absolute reference system such as the Global Positioning System (GPS). This dependence on “absolute position” information is a general limitation in the autonomous flight of Unmanned Aerial Vehicles (UAVs): environments with weak or no GPS signal are difficult for them to traverse. Instead of using absolute position, a UAV can sense its environment and use the information it contains to form a “relative” description of its position. This paper presents the design of an RL agent with relative vision-based states and rewards for teaching a guidance task to a quadcopter. The agent is taught to turn towards a red marker and approach it, in simulation and in flight tests. A more complex task, travelling between a blue and a red marker, is trained in simulation. This work shows that relative vision-based states and rewards can be used with RL to teach quadcopters simple guidance tasks. The performance of the trained agent is inconsistent in simulation and flight test because of the partial observability inherent in the relative description of the state.
Original language: English
Title of host publication: AIAA Scitech 2019 Forum
Subtitle of host publication: 7-11 January 2019, San Diego, California, USA
Number of pages: 21
ISBN (Electronic): 978-1-62410-578-4
Publication status: Published - 2019
Event: AIAA Scitech Forum, 2019 - San Diego, United States
Duration: 7 Jan 2019 to 11 Jan 2019


Conference: AIAA Scitech Forum, 2019
Country/Territory: United States
City: San Diego

