Online and offline learning of player objectives from partial observations in dynamic games

Lasse Peters; Vicenç Rubies-Royo; Claire J. Tomlin; Laura Ferranti; Javier Alonso-Mora; Cyrill Stachniss; David Fridovich-Keil

doi:10.1177/02783649231182453

Lasse Peters^*, Vicenç Rubies-Royo, Claire J. Tomlin, Laura Ferranti, Javier Alonso-Mora, Cyrill Stachniss, David Fridovich-Keil

^*Corresponding author for this work

Learning & Autonomous Control

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in a receding-horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely match ground-truth objectives, consistently outperforming state-of-the-art approaches.
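
The coupling described in the abstract can be read as a constrained estimation problem. The block below is a minimal sketch of that reading, not the paper's own formulation: every symbol in it (θ, x, u, y, f, p, Γ, J) is notation assumed here for illustration.

```latex
% Sketch only; all notation below is assumed, not taken from the paper.
% \theta     : unknown cost parameters of all N players
% x_{1:T}    : unobserved joint state trajectory
% u_{1:T}    : unobserved inputs of all players, u_t = (u^1_t, ..., u^N_t)
% y_{1:T}    : noise-corrupted, partial observations of the state
% \Gamma(\theta) : dynamic game with dynamics f and player costs J^i(. ; \theta)
\[
\begin{aligned}
\max_{\theta,\; x_{1:T},\; u_{1:T}} \quad
  & p\left(y_{1:T} \mid x_{1:T}, u_{1:T}\right) \\
\text{s.t.} \quad
  & x_{t+1} = f\left(x_t, u^1_t, \dots, u^N_t\right), \qquad t = 1, \dots, T-1, \\
  & \left(x_{1:T}, u_{1:T}\right) \text{ is a Nash equilibrium of } \Gamma(\theta).
\end{aligned}
\]
% Nash equilibrium here means that no player i can lower its own cost
% by unilaterally changing its inputs:
\[
J^i\left(u^{i}, u^{-i}; \theta\right) \le J^i\left(\tilde{u}^{i}, u^{-i}; \theta\right)
\quad \text{for all } \tilde{u}^{i}, \qquad i = 1, \dots, N.
\]
```

In this reading, the cost parameters and the hidden trajectory are estimated jointly instead of first filtering the states and then fitting costs. If the horizon is extended past the last observation, the same problem would also yield a prediction, which is one way to interpret the receding-horizon use mentioned in the abstract.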

Original language	English
Pages (from-to)	917-937
Number of pages	21
Journal	International Journal of Robotics Research
Volume	42
Issue number	10
DOIs	https://doi.org/10.1177/02783649231182453
Publication status	Published - 2023

Keywords

Inverse dynamic games
inverse optimal control
multi-agent prediction

Access to Document

10.1177/02783649231182453

peters-et-al-2023-online-and-offline-learning-of-player-objectives-from-partial-observations-in-dynamic-games
Final published version, 2.47 MB
Licence: CC BY

Cite this

@article{f4dd7063b31e4edfaeb725db6861cc0e,

title = "Online and offline learning of player objectives from partial observations in dynamic games",

abstract = "Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players{\textquoteright} objectives. In this work, we address this issue by proposing a novel method for learning players{\textquoteright} objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in a receding-horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players{\textquoteright} preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely match ground-truth objectives, consistently outperforming state-of-the-art approaches.",

keywords = "Inverse dynamic games, inverse optimal control, multi-agent prediction",

author = "Lasse Peters and Vicen{\c c} Rubies-Royo and Tomlin, {Claire J.} and Laura Ferranti and Javier Alonso-Mora and Cyrill Stachniss and David Fridovich-Keil",

year = "2023",

doi = "10.1177/02783649231182453",

language = "English",

volume = "42",

pages = "917--937",

journal = "International Journal of Robotics Research",

issn = "0278-3649",

publisher = "SAGE Publishing",

number = "10",

}

TY - JOUR

T1 - Online and offline learning of player objectives from partial observations in dynamic games

AU - Peters, Lasse

AU - Rubies-Royo, Vicenç

AU - Tomlin, Claire J.

AU - Ferranti, Laura

AU - Alonso-Mora, Javier

AU - Stachniss, Cyrill

AU - Fridovich-Keil, David

PY - 2023

Y1 - 2023

N2 - Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in a receding-horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely match ground-truth objectives, consistently outperforming state-of-the-art approaches.

AB - Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in a receding-horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely match ground-truth objectives, consistently outperforming state-of-the-art approaches.

KW - Inverse dynamic games

KW - inverse optimal control

KW - multi-agent prediction

UR - http://www.scopus.com/inward/record.url?scp=85162722851&partnerID=8YFLogxK

U2 - 10.1177/02783649231182453

DO - 10.1177/02783649231182453

M3 - Article

AN - SCOPUS:85162722851

SN - 0278-3649

VL - 42

SP - 917

EP - 937

JO - International Journal of Robotics Research

JF - International Journal of Robotics Research

IS - 10

ER -