Event-Based Communication in Distributed Q-Learning

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

We present an approach, inspired by Event Triggered Control (ETC) techniques, to reduce the communication of information needed in a Distributed Q-Learning system. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the MDP and compute a trajectory-dependent triggering signal, which they use in a distributed manner to decide when to communicate information to a central learner in charge of computing updates on the action-value function. These decision functions form an Event-Based distributed Q-learning system (EBd-Q), and we derive convergence guarantees resulting from the reduction of communication. We then apply the proposed algorithm to a cooperative path planning problem, and show how the agents are able to learn optimal trajectories while communicating only a fraction of the information. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent systems.
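
As a rough illustration of the event-based idea described in the abstract, the following minimal sketch runs several agents on a toy chain MDP and lets each agent send a transition to the central learner only when a trigger fires. The abstract does not state the triggering function, so everything specific here is an assumption made for illustration and not the paper's EBd-Q rule: the trigger (absolute temporal-difference error above a fixed `threshold`), the chain environment, the uniformly random exploration policy, the parameter values, and the function name `run_event_based_q_learning`. The sketch also assumes agents can read the shared table directly, which stands in for the central learner broadcasting its updates.

```python
import random


def run_event_based_q_learning(n_agents=3, n_states=6, episodes=200,
                               max_steps=200, alpha=0.5, gamma=0.9,
                               threshold=0.01, seed=0):
    """Toy event-triggered distributed Q-learning on a 1-D chain MDP.

    Hypothetical trigger: an agent transmits a transition only when the
    absolute TD error exceeds `threshold` (an assumption, not the paper's rule).
    """
    rng = random.Random(seed)
    moves = (-1, 1)                      # action 0 = left, action 1 = right
    goal = n_states - 1                  # terminal state, reward 1 on arrival
    q = [[0.0, 0.0] for _ in range(n_states)]  # table held by central learner
    sent = 0                             # transitions actually communicated
    total = 0                            # transitions experienced by agents

    for _ in range(episodes):
        states = [0] * n_agents          # every agent starts at the left end
        for _ in range(max_steps):
            if all(s == goal for s in states):
                break
            for i in range(n_agents):
                s = states[i]
                if s == goal:
                    continue
                a = rng.randrange(2)     # uniform exploration (off-policy)
                s_next = min(max(s + moves[a], 0), goal)
                r = 1.0 if s_next == goal else 0.0
                total += 1

                # Agent-side triggering signal: local TD error.
                bootstrap = 0.0 if s_next == goal else gamma * max(q[s_next])
                td_error = r + bootstrap - q[s][a]

                if abs(td_error) > threshold:
                    # Event fires: communicate; central learner updates Q.
                    sent += 1
                    q[s][a] += alpha * td_error
                states[i] = s_next
    return q, sent, total


q, sent, total = run_event_based_q_learning()
greedy = ["right" if row[1] > row[0] else "left" for row in q[:-1]]
print(f"communicated {sent} of {total} transitions")
print("greedy policy for non-terminal states:", greedy)
```

In this toy setting most transitions carry a TD error below the threshold once the table has settled, so they are never transmitted; raising `threshold` trades accuracy of the learned values for fewer messages.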
Original language: English
Title of host publication: Proceedings of the IEEE 61st Conference on Decision and Control (CDC 2022)
Publisher: IEEE
Pages: 2379-2386
ISBN (Print): 978-1-6654-6761-2
Publication status: Published - 2022
Event: IEEE 61st Conference on Decision and Control (CDC 2022) - Cancún, Mexico
Duration: 6 Dec 2022 to 9 Dec 2022

Conference

Conference: IEEE 61st Conference on Decision and Control (CDC 2022)
Country/Territory: Mexico
City: Cancún
Period: 6/12/22 to 9/12/22

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • Q-learning
  • Markov processes
  • Control systems
  • Trajectory
  • Multi-agent systems
  • Convergence
  • Event-Triggered Control
  • Reinforcement Learning
  • Distributed Systems
