Multi-agent reinforcement learning via distributed MPC as a function approximator

Samuel Mallick*, Filippo Airaldi, Azita Dabiri, Bart De Schutter

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review


Abstract

This paper presents a novel approach to multi-agent reinforcement learning (RL) for linear systems with convex polytopic constraints. Existing work on RL has demonstrated the use of model predictive control (MPC) as a function approximator for the policy and value functions. This paper is the first to extend that idea to the multi-agent setting. We propose the use of a distributed MPC scheme as a function approximator, with a structure that allows for distributed learning and deployment. We then show that Q-learning updates can be performed in a distributed manner without introducing nonstationarity, by reconstructing a centralized learning update. The effectiveness of the approach is demonstrated on a numerical example.
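To make the idea in the abstract concrete, the sketch below (not the authors' code) shows the core mechanism under simplifying assumptions: each agent holds local parameters of a value-function approximator, and because the global Q-function is taken here as a sum of local terms, per-agent Q-learning updates driven by a single shared TD error reconstruct the centralized update. A real MPC-based approximator would replace the quadratic stand-in `local_q`, with gradients obtained via parametric sensitivity of the MPC solution; all names (`n_agents`, `theta`, `local_q`, ...) are illustrative.

```python
# Minimal sketch of distributed Q-learning with a shared TD error, assuming an
# additively separable Q-function. This stands in for the paper's distributed
# MPC approximator; it is an illustration, not the published algorithm.
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_x, gamma, alpha = 3, 2, 0.95, 1e-3

# Per-agent parameters. In the paper the approximator is a distributed MPC
# scheme; here a quadratic form Q_i(s, a) = [s; a]^T diag(theta_i) [s; a]
# stands in for the optimal value of agent i's local MPC problem.
theta = [rng.standard_normal(2 * n_x) for _ in range(n_agents)]

def local_q(theta_i, s_i, a_i):
    """Local Q-value and its gradient w.r.t. theta_i (closed form here; with
    a real MPC it would come from parametric sensitivity analysis)."""
    z = np.concatenate([s_i, a_i])
    return float(theta_i @ z**2), z**2  # value, d(value)/d(theta_i)

def global_q(s, a):
    """Global Q-value as the sum of local values: the structure that lets the
    centralized learning update be reconstructed from local pieces."""
    return sum(local_q(theta[i], s[i], a[i])[0] for i in range(n_agents))

# One Q-learning step on a synthetic transition (s, a, r, s_next).
s = [rng.standard_normal(n_x) for _ in range(n_agents)]
a = [rng.standard_normal(n_x) for _ in range(n_agents)]
s_next = [rng.standard_normal(n_x) for _ in range(n_agents)]
r = 1.0

# The greedy next action would come from solving the (distributed) MPC; here
# `a` is reused as a placeholder for argmin_a Q(s_next, a).
td_target = r + gamma * global_q(s_next, a)
td_error = td_target - global_q(s, a)  # one scalar, shared by all agents

# Each agent updates its own parameters with its local gradient. Because the
# TD error is a shared scalar and Q is additively separable, the stacked
# per-agent update equals the centralized update, avoiding the
# nonstationarity that independent per-agent learning would introduce.
for i in range(n_agents):
    _, grad_i = local_q(theta[i], s[i], a[i])
    theta[i] = theta[i] + alpha * td_error * grad_i

print(f"TD error: {td_error:.3f}")
```

In the paper's setting the separability is not exact and the local pieces are coordinated through the distributed MPC scheme (the ADMM keyword below); the point of the sketch is only the reconstruction of a single centralized update from local computations.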

Original language: English
Article number: 111803
Number of pages: 9
Journal: Automatica
Volume: 167
DOIs
Publication status: Published - 2024

Bibliographical note

Green Open Access added to TU Delft Institutional Repository under 'You share, we take care!' - Taverne project, https://www.openaccess.nl/en/you-share-we-take-care.
Otherwise, as indicated in the copyright section, the publisher is the copyright holder of this work and the author uses Dutch legislation to make this work public.

Keywords

  • ADMM
  • Distributed model predictive control
  • Multi-agent reinforcement learning
  • Networked systems
