Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization

Markus Peschl

doi:10.1145/3461702.3462473

Corresponding author: Markus Peschl

Electrical Engineering, Mathematics and Computer Science

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for adhering to implicit human norms alongside optimizing for a narrow goal objective. Previous methods which incorporate human values into reinforcement learning algorithms either scale poorly or assume hand-crafted state features. Our algorithm drops these assumptions and is able to automatically infer norms from human demonstrations, which allows for integrating it into existing agents in the form of multi-objective optimization. We benchmark our approach in a search-and-rescue grid world and show that, conditioned on respecting human norms, our agent maintains optimal performance with respect to the predefined goal.
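
The multi-objective idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: it assumes a hypothetical discriminator that outputs the probability that a transition came from human demonstrations, turns that probability into a norm reward using the common adversarial inverse-RL form log D - log(1 - D), and combines it with the goal reward through a simple weighted sum. The function names and the weights are illustrative assumptions.

```python
import math


def norm_reward(discriminator_prob: float, eps: float = 1e-8) -> float:
    """Norm reward from a (hypothetical) discriminator output D in [0, 1].

    Uses the adversarial-IRL-style form log D - log(1 - D); D is clamped
    away from 0 and 1 so the logarithms stay finite.
    """
    d = min(max(discriminator_prob, eps), 1.0 - eps)
    return math.log(d) - math.log(1.0 - d)


def scalarized_reward(goal_reward: float,
                      discriminator_prob: float,
                      weights: tuple = (0.5, 0.5)) -> float:
    """Weighted sum of the goal reward and the inferred norm reward.

    The weights are an assumption made for illustration; other
    multi-objective schemes could be substituted here.
    """
    w_goal, w_norm = weights
    return w_goal * goal_reward + w_norm * norm_reward(discriminator_prob)
```

With this form, a transition the discriminator finds indistinguishable from demonstrations (D = 0.5) contributes a norm reward of zero, while transitions judged more or less demonstration-like shift the combined reward up or down.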

Original language	English
Title of host publication	AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
Publisher	Association for Computing Machinery (ACM)
Pages	275-276
Number of pages	2
ISBN (Electronic)	9781450384735
DOIs	https://doi.org/10.1145/3461702.3462473
Publication status	Published - 2021
Event	4th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2021 - Virtual, Online, United States
Duration	19 May 2021 → 21 May 2021

Publication series

Name	AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society

Conference

Conference	4th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2021
Country/Territory	United States
City	Virtual, Online
Period	19/05/21 → 21/05/21

Keywords

deep learning
inverse reinforcement learning
multi-objective optimization
value alignment

Cite this

Peschl, M. (2021). Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization. In AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 275-276). (AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society). Association for Computing Machinery (ACM). https://doi.org/10.1145/3461702.3462473

Peschl, Markus. / Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization. AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. Association for Computing Machinery (ACM), 2021. pp. 275-276 (AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society).

@inproceedings{af9eea331fa74cdd8278d7a0da05736e,
  title = "Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization",
  abstract = "We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for adhering to implicit human norms alongside optimizing for a narrow goal objective. Previous methods which incorporate human values into reinforcement learning algorithms either scale poorly or assume hand-crafted state features. Our algorithm drops these assumptions and is able to automatically infer norms from human demonstrations, which allows for integrating it into existing agents in the form of multi-objective optimization. We benchmark our approach in a search-and-rescue grid world and show that, conditioned on respecting human norms, our agent maintains optimal performance with respect to the predefined goal.",
  keywords = "deep learning, inverse reinforcement learning, multi-objective optimization, value alignment",
  author = "Markus Peschl",
  year = "2021",
  doi = "10.1145/3461702.3462473",
  language = "English",
  series = "AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society",
  publisher = "Association for Computing Machinery (ACM)",
  pages = "275--276",
  booktitle = "AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society",
  address = "United States",
  note = "4th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2021 ; Conference date: 19-05-2021 Through 21-05-2021",
}

Peschl, M 2021, Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization. in AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, Association for Computing Machinery (ACM), pp. 275-276, 4th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2021, Virtual, Online, United States, 19/05/21. https://doi.org/10.1145/3461702.3462473

TY  - GEN
T1  - Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization
AU  - Peschl, Markus
PY  - 2021
Y1  - 2021
N2  - We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for adhering to implicit human norms alongside optimizing for a narrow goal objective. Previous methods which incorporate human values into reinforcement learning algorithms either scale poorly or assume hand-crafted state features. Our algorithm drops these assumptions and is able to automatically infer norms from human demonstrations, which allows for integrating it into existing agents in the form of multi-objective optimization. We benchmark our approach in a search-and-rescue grid world and show that, conditioned on respecting human norms, our agent maintains optimal performance with respect to the predefined goal.
AB  - We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for adhering to implicit human norms alongside optimizing for a narrow goal objective. Previous methods which incorporate human values into reinforcement learning algorithms either scale poorly or assume hand-crafted state features. Our algorithm drops these assumptions and is able to automatically infer norms from human demonstrations, which allows for integrating it into existing agents in the form of multi-objective optimization. We benchmark our approach in a search-and-rescue grid world and show that, conditioned on respecting human norms, our agent maintains optimal performance with respect to the predefined goal.
KW  - deep learning
KW  - inverse reinforcement learning
KW  - multi-objective optimization
KW  - value alignment
UR  - http://www.scopus.com/inward/record.url?scp=85112461561&partnerID=8YFLogxK
U2  - 10.1145/3461702.3462473
DO  - 10.1145/3461702.3462473
M3  - Conference contribution
AN  - SCOPUS:85112461561
T3  - AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
SP  - 275
EP  - 276
BT  - AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
PB  - Association for Computing Machinery (ACM)
T2  - 4th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2021
Y2  - 19 May 2021 through 21 May 2021
ER  -

Peschl M. Training for Implicit Norms in Deep Reinforcement Learning Agents through Adversarial Multi-Objective Reward Optimization. In AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. Association for Computing Machinery (ACM). 2021. p. 275-276. (AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society). doi: 10.1145/3461702.3462473
