Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems

Miguel Suau*, Jinke He, Matthijs T.J. Spaan, Frans A. Oliehoek

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simulators of complicated systems that run fast enough for deep RL to be applicable. We focus on domains where agents interact with a reduced portion of a larger environment while still being affected by the global dynamics. Our method combines local simulators with learned models that mimic the influence of the global system. The experiments reveal that incorporating this idea into the deep RL workflow can considerably accelerate the training process, and that it presents several opportunities for future work.
Original language: English
Title of host publication: Proceedings of the 39th International Conference on Machine Learning
Editors: K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvari, G. Niu, S. Sabato
Publisher: PMLR
Pages: 20604-20624
Number of pages: 21
Volume: 162
Publication status: Published - 2022
Event: The 39th International Conference on Machine Learning - Baltimore, United States
Duration: 17 Jul 2022 - 23 Jul 2022
Conference number: 39th

Publication series

Name: Proceedings of Machine Learning Research
Volume: 162
ISSN (Print): 2640-3498

Conference

Conference: The 39th International Conference on Machine Learning
Abbreviated title: ICML 2022
Country/Territory: United States
City: Baltimore
Period: 17/07/22 - 23/07/22

Keywords

  • reinforcement learning (RL)
  • simulation and control
