Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination

Dongge Han*, Wendelin Böhmer, Michael Wooldridge, Alex Rogers

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume, Conference contribution, Scientific, peer-review

4 Citations (Scopus)

Abstract

In a multi-agent system, an agent’s optimal policy will typically depend on the policies chosen by others. Therefore, a key issue in multi-agent systems research is that of predicting the behaviours of others, and responding promptly to changes in such behaviours. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical reinforcement learning framework. However, this approach results in inflexibility of agents if options have an extended duration and are dynamic. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent’s actual behaviour and its broadcast intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options. We evaluate our models empirically on a set of multi-agent pursuit and taxi tasks, and show that our agents learn to adapt flexibly across scenarios that require different termination behaviours.

Original language: English
Title of host publication: PRICAI 2019
Subtitle of host publication: Trends in Artificial Intelligence - 16th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Editors: Abhaya C. Nayak, Alok Sharma
Publisher: Springer
Pages: 80-92
Number of pages: 13
Volume: 11671
ISBN (Print): 9783030299101
DOIs
Publication status: Published - 2019
Externally published: Yes
Event: 16th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2019 - Yanuka Island, Fiji
Duration: 26 Aug 2019 to 30 Aug 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11671 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2019
Country/Territory: Fiji
City: Yanuka Island
Period: 26/08/19 to 30/08/19

Keywords

  • Hierarchical reinforcement learning
  • Multi-agent learning
