Abstract
Besides facing the same challenges as single-agent systems, the distributed nature of complex multi-agent systems raises many questions and problems revolving around the constraints imposed by communication. The fact that multi-agent systems require communication to access information, to coordinate, or simply to sense the environment they act on is sometimes overlooked when thinking of (and solving) emerging theoretical challenges. However, research problems related to communication in Cyber-Physical Systems have been a prevalent target of networked control research for decades. In particular, we take inspiration from Event-Triggered Control to study how communication affects performance, safety and robustness in multi-agent systems.
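To make the event-triggered principle concrete, the following is a minimal illustrative sketch (not code from the thesis): an agent re-transmits its state only when it has drifted beyond a threshold from the value its neighbours last received. The class name, the norm-based trigger, and the `threshold` parameter are all assumptions chosen for illustration.

```python
import numpy as np

class EventTriggeredSender:
    """Hypothetical norm-based event trigger: communicate only when the
    current state has drifted more than `threshold` away from the value
    that was last broadcast, so receivers can reuse the stale value."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.last_sent = None  # the value neighbours currently hold

    def maybe_send(self, state: np.ndarray):
        """Return the state if the trigger fires, otherwise None."""
        if self.last_sent is None or np.linalg.norm(state - self.last_sent) > self.threshold:
            self.last_sent = state.copy()
            return state  # event: broadcast a fresh measurement
        return None  # no event: neighbours keep using the last sent value
```

Tightening or relaxing the threshold then trades communication load against how stale the information used by the other agents is allowed to become.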
The work in this dissertation begins by covering a communication-based form of swarm robotics, where, taking inspiration from ants, agents learn to forage cooperatively by communicating through the environment. We study what form of convergence guarantees can be derived in such systems and how these depend on the communication logic, proposing mean-field formulations of such systems. We then draw an analogy between such learning-based swarms and distributed Reinforcement Learning (RL), and propose strategies to safely reduce the communication of information in a general form of distributed Q-Learning problems. We extend these ideas to cooperative Multi-Agent RL systems where agents communicate state measurements with each other, and define so-called robustness surrogate functions (value function robustness certificates). These certificates allow agents to distributedly estimate how robust the joint policies are against a lack of information, and to determine when they need to update other agents with new measurements. Finally, we look into the general problem of robust control in RL systems, and propose a characterization of policy robustness against state measurement noise that allows us to cast robustness as a secondary objective in a lexicographic optimization scheme, applicable to policy gradient algorithms. This addresses the following premise: if we need to learn controllers that are then deployed in possibly uncertain environments, we may want to make sure that “robustifying” the controller does not (excessively) decrease its capacity to successfully solve the original problem (without uncertainty).
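The lexicographic scheme can be illustrated with a hedged sketch: a gradient step that always follows the primary (task) objective, and folds in the robustness objective only once the task objective is near-optimal, after removing any conflicting component. All names, the tolerance logic, and the projection below are assumptions made for illustration, not the algorithm as formulated in the thesis.

```python
import numpy as np

def lexicographic_step(theta, grad_task, grad_robust, task_gap, tol=0.05, lr=1e-2):
    """Hypothetical lexicographic ascent step on policy parameters.

    The task objective always drives the update; the robustness gradient
    is added only once the task objective is within `tol` of its best
    observed value (`task_gap`), and only after removing the component
    that conflicts with the task gradient.
    """
    step = grad_task.copy()
    if task_gap <= tol:
        conflict = min(0.0, grad_task @ grad_robust)  # negative iff the objectives conflict
        # Remove the conflicting component so robustness never undoes task progress.
        grad_robust = grad_robust - (conflict / (grad_task @ grad_task + 1e-12)) * grad_task
        step = step + grad_robust
    return theta + lr * step
```

Under this update, the robustness term locally cannot push the parameters in a direction that decreases the task objective, which matches the stated premise that robustification should not (excessively) degrade task performance.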
The work presented throughout this dissertation covers different problems and moves between overlapping fields, but the methods and techniques proposed share a common principle: as complex multi-agent systems become more applicable to engineering problems, the need for understanding (and simplifying) communication rules is increasingly motivated by safety. Therefore, the problems and solutions considered aim to advance towards a formal understanding and design of communication logic in complex, model-free multi-agent systems.
| Original language | English |
| --- | --- |
| Qualification | Doctor of Philosophy |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 24 Apr 2023 |
| Print ISBNs | 978-94-6384-432-1 |
| DOIs | |
| Publication status | Published - 2023 |
Keywords
- Multi-Agent Systems
- Event-Triggered Control
- Reinforcement Learning (RL)
- Swarm robotics