Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles

Qingrui Zhang, Wei Pan, Vasso Reppa

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

This paper presents a novel model-reference reinforcement learning control method for uncertain autonomous surface vehicles. The proposed control combines a conventional model-based control method with deep reinforcement learning. The conventional model-based control ensures that the learning-based control law provides closed-loop stability for trajectory tracking of the overall system, and it also increases the sample efficiency of the deep reinforcement learning. Reinforcement learning, in turn, directly learns a control law that compensates for modeling uncertainties. In the proposed control, a nominal system is employed to design a baseline control law using a conventional control approach; the nominal system also defines the desired performance for the uncertain autonomous vehicles to follow. In comparison with traditional deep reinforcement learning methods, the proposed learning-based control provides stability guarantees and better sample efficiency. We demonstrate the performance of the new algorithm via extensive simulation results.
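The abstract's core structure, a baseline control law designed on a nominal model plus a learned term that compensates for modeling uncertainties, can be sketched as follows. This is a minimal illustrative example, not the paper's actual vehicle model or controller: the 1-D double-integrator dynamics, the PD gains, and the stand-in residual function are all assumptions introduced here for clarity; in the paper the compensation term would be a trained deep RL policy.

```python
# Hypothetical 1-D surge dynamics and gains; illustrative only,
# not the paper's actual model, controller, or learned policy.

def nominal_model(x, v, u, dt=0.01):
    """Nominal (double-integrator) model used to design the baseline
    control and to define the desired reference behavior."""
    return x + v * dt, v + u * dt

def baseline_control(x, v, x_ref, v_ref, kp=4.0, kd=2.0):
    """Conventional model-based (PD-style) law designed on the nominal
    model; provides the stability-guaranteeing part of the control."""
    return kp * (x_ref - x) + kd * (v_ref - v)

def combined_control(x, v, x_ref, v_ref, learned_residual):
    """Model-reference structure: baseline term plus a learned term
    that compensates for modeling uncertainties."""
    return baseline_control(x, v, x_ref, v_ref) + learned_residual(x, v)

# Stand-in for the RL policy output (a trained network in practice),
# e.g. compensating an unmodeled velocity-dependent drag term.
residual = lambda x, v: -0.5 * v

u = combined_control(x=0.0, v=1.0, x_ref=1.0, v_ref=0.0,
                     learned_residual=residual)
print(round(u, 2))  # → 1.5
```

The split matters for both claims in the abstract: the baseline term anchors closed-loop stability regardless of what the learned term does early in training, and the learner only has to fit the (smaller) model mismatch rather than the full control law, which is where the sample-efficiency gain comes from.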

Original language: English
Title of host publication: Proceedings of the 59th IEEE Conference on Decision and Control, CDC 2020
Place of publication: Piscataway, NJ, USA
Publisher: IEEE
Pages: 5291-5296
ISBN (Electronic): 978-1-7281-7447-1
DOIs
Publication status: Published - 2020
Event: 59th IEEE Conference on Decision and Control, CDC 2020 - Virtual, Jeju Island, Korea, Republic of
Duration: 14 Dec 2020 - 18 Dec 2020

Conference

Conference: 59th IEEE Conference on Decision and Control, CDC 2020
Country: Korea, Republic of
City: Virtual, Jeju Island
Period: 14/12/20 - 18/12/20

