MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

M. Peschl; A. Zgonnikov; F.A. Oliehoek; L. Cavalcante Siebert

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficult to trade off different reward functions from multiple experts. We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms into a Pareto-optimal policy. Through maintaining a distribution over scalarization weights, our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies. We empirically demonstrate the effectiveness of MORAL in two scenarios, which model a delivery and an emergency task that require an agent to act in the presence of normative conflicts. Overall, we consider our research a step towards multi-objective RL with learned rewards, bridging the gap between current reward learning and machine ethics literature.

Original language	English
Title of host publication	Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22)
Editors	Catherine Pelachaud, Matthew E. Taylor
Publisher	International Foundation for Autonomous Agents and Multiagent Systems
Pages	1038-1046
ISBN (Print)	978-1-4503-9213-6
Publication status	Published - 2022
Event	AAMAS 2022: 21st International Conference on Autonomous Agents and Multiagent Systems (Virtual), New Zealand. Duration: 9 May 2022 → 13 May 2022

Conference

Conference	AAMAS 2022: 21st International Conference on Autonomous Agents and Multiagent Systems (Virtual)
Country/Territory	New Zealand
Period	9/05/22 → 13/05/22

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Active Learning
Inverse Reinforcement Learning
Multi-Objective Decision-Making
Value Alignment

Access to Document

3535850.3535966 (Final published version, 2.84 MB)

https://dl.acm.org/doi/abs/10.5555/3535850.3535966

Cite this

Peschl, M., Zgonnikov, A., Oliehoek, F. A., & Cavalcante Siebert, L. (2022). MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning. In C. Pelachaud, & M. E. Taylor (Eds.), Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22) (pp. 1038-1046). International Foundation for Autonomous Agents and Multiagent Systems. https://dl.acm.org/doi/abs/10.5555/3535850.3535966

Peschl, M. ; Zgonnikov, A. ; Oliehoek, F.A. et al. / MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22). editor / Catherine Pelachaud ; Matthew E. Taylor. International Foundation for Autonomous Agents and Multiagent Systems, 2022. pp. 1038-1046

@inproceedings{f2a0f20f6bda4157b3dae1f490f6f1fc,

title = "MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning",

abstract = "Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficult to trade off different reward functions from multiple experts. We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms into a Pareto-optimal policy. Through maintaining a distribution over scalarization weights, our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies. We empirically demonstrate the effectiveness of MORAL in two scenarios, which model a delivery and an emergency task that require an agent to act in the presence of normative conflicts. Overall, we consider our research a step towards multi-objective RL with learned rewards, bridging the gap between current reward learning and machine ethics literature.",

keywords = "Active Learning, Inverse Reinforcement Learning, Multi-Objective Decision-Making, Value Alignment",

author = "M. Peschl and A. Zgonnikov and F.A. Oliehoek and {Cavalcante Siebert}, L.",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.; AAMAS 2022: 21st International Conference on Autonomous Agents and Multiagent Systems (Virtual) ; Conference date: 09-05-2022 Through 13-05-2022",

year = "2022",

language = "English",

isbn = "978-1-4503-9213-6",

pages = "1038--1046",

editor = "Pelachaud, {Catherine} and Taylor, {Matthew E.}",

booktitle = "Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22)",

publisher = "International Foundation for Autonomous Agents and Multiagent Systems",

}

Peschl, M, Zgonnikov, A, Oliehoek, FA & Cavalcante Siebert, L 2022, MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning. in C Pelachaud & ME Taylor (eds), Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22). International Foundation for Autonomous Agents and Multiagent Systems, pp. 1038-1046, AAMAS 2022: 21st International Conference on Autonomous Agents and Multiagent Systems (Virtual), New Zealand, 9/05/22. <https://dl.acm.org/doi/abs/10.5555/3535850.3535966>

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning. / Peschl, M.; Zgonnikov, A.; Oliehoek, F.A. et al.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22). ed. / Catherine Pelachaud; Matthew E. Taylor. International Foundation for Autonomous Agents and Multiagent Systems, 2022. p. 1038-1046.

TY - GEN

T1 - MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

AU - Peschl, M.

AU - Zgonnikov, A.

AU - Oliehoek, F.A.

AU - Cavalcante Siebert, L.

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2022

Y1 - 2022

AB - Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficult to trade off different reward functions from multiple experts. We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms into a Pareto-optimal policy. Through maintaining a distribution over scalarization weights, our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies. We empirically demonstrate the effectiveness of MORAL in two scenarios, which model a delivery and an emergency task that require an agent to act in the presence of normative conflicts. Overall, we consider our research a step towards multi-objective RL with learned rewards, bridging the gap between current reward learning and machine ethics literature.

KW - Active Learning

KW - Inverse Reinforcement Learning

KW - Multi-Objective Decision-Making

KW - Value Alignment

M3 - Conference contribution

SN - 978-1-4503-9213-6

SP - 1038

EP - 1046

BT - Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22)

A2 - Pelachaud, Catherine

A2 - Taylor, Matthew E.

PB - International Foundation for Autonomous Agents and Multiagent Systems

T2 - AAMAS 2022: 21st International Conference on Autonomous Agents and Multiagent Systems (Virtual)

Y2 - 9 May 2022 through 13 May 2022

ER -

Peschl M, Zgonnikov A, Oliehoek FA, Cavalcante Siebert L. MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning. In Pelachaud C, Taylor ME, editors, Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS '22). International Foundation for Autonomous Agents and Multiagent Systems. 2022. p. 1038-1046
