Abstract
Preference-based reinforcement learning (RL) is a recent research direction in robot learning that allows humans to teach robots by expressing preferences over pairs of behaviours. However, obtaining realistic robot policies can require humans to answer an arbitrarily large number of queries. In this work, we address this sample-efficiency challenge with a technique that synthesizes queries from a semi-supervised learning perspective. To achieve this, we leverage latent variational autoencoder (VAE) representations of trajectory segments (sequences of state-action pairs). Our approach produces queries that are closely aligned with those labeled by humans, while avoiding excessive uncertainty in the human preference predictions derived from reward estimates. Additionally, by introducing variation without deviating from the original human's intent, it achieves more robust reward function representations. We compare our approach to recent state-of-the-art semi-supervised learning techniques for preference-based RL. Our experimental findings show that we can improve the generalization of the estimated reward function without requiring additional human intervention. Lastly, to confirm the practical applicability of our approach, we conduct experiments with real human users in a simulated social navigation setting. Videos of the experiments can be found at https://sites.google.com/view/rl-sequel
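As a rough illustration of what "human preference predictions as determined by reward estimations" can mean in practice, the sketch below uses the Bradley-Terry formulation that is common in preference-based RL. The abstract does not state which preference model the paper uses, so this is an assumption. The function names `preference_probability` and `is_confident`, and the `margin` threshold, are hypothetical and chosen only for this example.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Probability that segment A is preferred over segment B.

    Assumes a Bradley-Terry model over the summed per-step reward
    estimates of each trajectory segment (an assumption, not confirmed
    by the abstract).
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # Numerically stable logistic function of the return difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def is_confident(prob, margin=0.25):
    """Hypothetical filter: keep a query only when the predicted
    preference is at least `margin` away from the undecided value 0.5."""
    return abs(prob - 0.5) >= margin


if __name__ == "__main__":
    seg_a = [0.5, 1.0, 1.5]  # estimated per-step rewards, segment A
    seg_b = [0.2, 0.3, 0.1]  # estimated per-step rewards, segment B
    p = preference_probability(seg_a, seg_b)
    print(f"P(A preferred over B) = {p:.3f}, confident: {is_confident(p)}")
```

Under these assumptions, a synthesized query whose predicted preference stays close to 0.5 would be treated as too uncertain to use as a pseudo-label. How the paper actually measures and bounds that uncertainty is not specified in the abstract.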
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 2024 |
Publisher | IEEE |
Pages | 9585-9592 |
Number of pages | 8 |
ISBN (Electronic) | 979-8-3503-8457-4 |
DOIs | |
Publication status | Published - 2024 |
Event | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan. Duration: 13 May 2024 → 17 May 2024 |
Publication series
Name | Proceedings - IEEE International Conference on Robotics and Automation |
---|---|
ISSN (Print) | 1050-4729 |
Conference
Conference | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 |
---|---|
Country/Territory | Japan |
City | Yokohama |
Period | 13/05/24 → 17/05/24 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.