Event-Based Communication in Distributed Q-Learning

D. Jarne Ornia; M. Mazo

doi:10.1109/CDC51059.2022.9992660

Team Manuel Mazo Jr

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

We present an approach to reduce the communication of information needed on a Distributed Q-Learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the MDP and compute a trajectory-dependent triggering signal which they use distributedly to decide when to communicate information to a central learner in charge of computing updates on the action-value function. These decision functions form an Event Based distributed Q learning system (EBd-Q), and we derive convergence guarantees resulting from the reduction of communication. We then apply the proposed algorithm to a cooperative path planning problem, and show how the agents are able to learn optimal trajectories communicating a fraction of the information. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent systems.
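The event-based idea in the abstract can be illustrated with a minimal sketch. This is not the EBd-Q algorithm or the triggering function from the paper: as a stand-in, it assumes each agent transmits a sample to the central learner only when the absolute temporal-difference error exceeds a fixed threshold. The chain MDP, the function name `run_event_based_q_learning`, and all parameter values are hypothetical, and agents read the shared table directly instead of receiving broadcasts.

```python
import random


def run_event_based_q_learning(n_agents=3, n_states=6, episodes=200,
                               alpha=0.5, gamma=0.9, epsilon=0.2,
                               threshold=0.05, seed=0):
    """Toy chain MDP (hypothetical): states 0..n_states-1, actions
    0 (left) and 1 (right), reward 1 on reaching the last state.
    Returns (Q, sent, total): the shared table, the number of samples
    transmitted, and the number of samples generated."""
    rng = random.Random(seed)
    goal = n_states - 1
    # Shared action-value table, updated only by the central learner.
    Q = [[0.0, 0.0] for _ in range(n_states)]
    sent = total = 0
    for _ in range(episodes):
        states = [0] * n_agents
        done = [False] * n_agents
        for _ in range(4 * n_states):  # step cap per episode
            for i in range(n_agents):
                if done[i]:
                    continue
                s = states[i]
                # Epsilon-greedy exploration with random tie-breaking.
                if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                    a = rng.randrange(2)
                else:
                    a = 0 if Q[s][0] > Q[s][1] else 1
                s2 = min(s + 1, goal) if a == 1 else max(s - 1, 0)
                r = 1.0 if s2 == goal else 0.0
                bootstrap = 0.0 if s2 == goal else gamma * max(Q[s2])
                td = r + bootstrap - Q[s][a]
                total += 1
                # Assumed triggering rule: communicate only if the
                # sample would change the estimate noticeably.
                if abs(td) > threshold:
                    sent += 1
                    Q[s][a] += alpha * td  # central learner update
                states[i] = s2
                done[i] = s2 == goal
    return Q, sent, total


if __name__ == "__main__":
    Q, sent, total = run_event_based_q_learning()
    print(f"transmitted {sent} of {total} samples")
```

In this sketch a larger `threshold` reduces communication but leaves a larger residual error in the learned values, which is the kind of trade-off the abstract refers to when it mentions desired and undesired effects.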

Original language	English
Title of host publication	Proceedings of the IEEE 61st Conference on Decision and Control (CDC 2022)
Publisher	IEEE
Pages	2379-2386
ISBN (Print)	978-1-6654-6761-2
DOIs	https://doi.org/10.1109/CDC51059.2022.9992660
Publication status	Published - 2022
Event	IEEE 61st Conference on Decision and Control (CDC 2022) - Cancún, Mexico; Duration: 6 Dec 2022 → 9 Dec 2022

Conference

Conference	IEEE 61st Conference on Decision and Control (CDC 2022)
Country/Territory	Mexico
City	Cancún
Period	6/12/22 → 9/12/22

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Q-learning
Markov processes
Control systems
Trajectory
Multi-agent systems
Convergence
Event-Triggered Control
Reinforcement Learning
Distributed Systems

Access to Document

10.1109/CDC51059.2022.9992660

Event-Based_Communication_in_Distributed_Q-Learning: Final published version, 1.15 MB

Cite this

@inproceedings{1a7b9b7722404e0cb392a951ab1d9451,

title = "Event-Based Communication in Distributed Q-Learning",

abstract = "We present an approach to reduce the communication of information needed on a Distributed Q-Learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the MDP and compute a trajectory-dependent triggering signal which they use distributedly to decide when to communicate information to a central learner in charge of computing updates on the action-value function. These decision functions form an Event Based distributed Q learning system (EBd-Q), and we derive convergence guarantees resulting from the reduction of communication. We then apply the proposed algorithm to a cooperative path planning problem, and show how the agents are able to learn optimal trajectories communicating a fraction of the information. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent systems.",

keywords = "Q-learning, Markov processes, Control systems, Trajectory, Multi-agent systems, Convergence, Event-Triggered Control, Reinforcement Learning, Distributed Systems",

author = "{Jarne Ornia}, D. and M. Mazo",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.; IEEE 61st Conference on Decision and Control (CDC 2022) ; Conference date: 06-12-2022 Through 09-12-2022",

year = "2022",

doi = "10.1109/CDC51059.2022.9992660",

language = "English",

isbn = "978-1-6654-6761-2",

pages = "2379--2386",

booktitle = "Proceedings of the IEEE 61st Conference on Decision and Control (CDC 2022)",

publisher = "IEEE",

address = "United States",

}

TY - GEN

T1 - Event-Based Communication in Distributed Q-Learning

AU - Jarne Ornia, D.

AU - Mazo, M.

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2022

Y1 - 2022

AB - We present an approach to reduce the communication of information needed on a Distributed Q-Learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the MDP and compute a trajectory-dependent triggering signal which they use distributedly to decide when to communicate information to a central learner in charge of computing updates on the action-value function. These decision functions form an Event Based distributed Q learning system (EBd-Q), and we derive convergence guarantees resulting from the reduction of communication. We then apply the proposed algorithm to a cooperative path planning problem, and show how the agents are able to learn optimal trajectories communicating a fraction of the information. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent systems.

KW - Q-learning

KW - Markov processes

KW - Control systems

KW - Trajectory

KW - Multi-agent systems

KW - Convergence

KW - Event-Triggered Control

KW - Reinforcement Learning

KW - Distributed Systems

UR - http://www.scopus.com/inward/record.url?scp=85146989215&partnerID=8YFLogxK

U2 - 10.1109/CDC51059.2022.9992660

DO - 10.1109/CDC51059.2022.9992660

M3 - Conference contribution

SN - 978-1-6654-6761-2

SP - 2379

EP - 2386

BT - Proceedings of the IEEE 61st Conference on Decision and Control (CDC 2022)

PB - IEEE

T2 - IEEE 61st Conference on Decision and Control (CDC 2022)

Y2 - 6 December 2022 through 9 December 2022

ER -
