Multi-agent hierarchical reinforcement learning with dynamic termination

Dongge Han; Wendelin Boehmer; Michael Wooldridge; Alex Rogers

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

5 Citations (Scopus)

Abstract

In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.
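The abstract does not state the dynamic termination Bellman equation itself, so the following is only a minimal sketch of the general idea of trading flexibility against predictability, not the authors' formulation. It assumes a tabular option-value function and a hypothetical `switch_penalty` that discourages abandoning the broadcast option; the function name, the penalty, and the table layout are all illustrative assumptions.

```python
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Option = Hashable
QTable = Dict[Tuple[State, Option], float]


def continue_or_switch_target(
    q: QTable,
    reward: float,
    next_state: State,
    current_option: Option,
    options: Sequence[Option],
    gamma: float = 0.99,
    switch_penalty: float = 0.1,  # hypothetical cost of breaking the broadcast intention
) -> float:
    """One-step bootstrap target in which the agent may keep or drop its option.

    Illustrative only: this is an assumed form, not the equation from the paper.
    """
    # Value of staying with the currently broadcast option in the next state.
    keep_value = q.get((next_state, current_option), 0.0)
    # Value of terminating and re-selecting greedily, minus the assumed penalty.
    switch_value = max(q.get((next_state, o), 0.0) for o in options) - switch_penalty
    # The agent takes whichever of the two choices is worth more.
    return reward + gamma * max(keep_value, switch_value)
```

With a positive penalty, the agent abandons its current option only when another option is better by more than the penalty, which keeps its behaviour consistent with the intention it has broadcast unless switching clearly pays off.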

Original language	English
Title of host publication	18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Publisher	International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages	2006-2008
Number of pages	3
ISBN (Electronic)	9781510892002
Publication status	Published - 1 Jan 2019
Externally published	Yes
Event	18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 - Montreal, Canada
Duration	13 May 2019 → 17 May 2019

Publication series

Name	Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume	4
ISSN (Print)	1548-8403
ISSN (Electronic)	1558-2914

Conference

Conference	18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Country/Territory	Canada
City	Montreal
Period	13/05/19 → 17/05/19

Keywords

Hierarchical reinforcement learning
Multi-agent learning

Cite this

Han, D., Boehmer, W., Wooldridge, M., & Rogers, A. (2019). Multi-agent hierarchical reinforcement learning with dynamic termination. In 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 (pp. 2006-2008). (Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS; Vol. 4). International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).

Han, Dongge ; Boehmer, Wendelin ; Wooldridge, Michael et al. / Multi-agent hierarchical reinforcement learning with dynamic termination. 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019. International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2019. pp. 2006-2008 (Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS).

@inproceedings{c3c9adcc5cf14325ae725fdbdf4ade5b,
  title = "Multi-agent hierarchical reinforcement learning with dynamic termination",
  abstract = "In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.",
  keywords = "Hierarchical reinforcement learning, Multi-agent learning",
  author = "Dongge Han and Wendelin Boehmer and Michael Wooldridge and Alex Rogers",
  year = "2019",
  month = jan,
  day = "1",
  language = "English",
  series = "Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS",
  publisher = "International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)",
  pages = "2006--2008",
  booktitle = "18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019",
  note = "18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 ; Conference date: 13-05-2019 Through 17-05-2019",
}

Han, D, Boehmer, W, Wooldridge, M & Rogers, A 2019, Multi-agent hierarchical reinforcement learning with dynamic termination. in 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019. Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, vol. 4, International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), pp. 2006-2008, 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019, Montreal, Canada, 13/05/19.

Multi-agent hierarchical reinforcement learning with dynamic termination. / Han, Dongge; Boehmer, Wendelin; Wooldridge, Michael et al.
18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019. International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2019. p. 2006-2008 (Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS; Vol. 4).


TY  - GEN
T1  - Multi-agent hierarchical reinforcement learning with dynamic termination
AU  - Han, Dongge
AU  - Boehmer, Wendelin
AU  - Wooldridge, Michael
AU  - Rogers, Alex
PY  - 2019/1/1
Y1  - 2019/1/1
N2  - In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.
AB  - In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.
KW  - Hierarchical reinforcement learning
KW  - Multi-agent learning
UR  - http://www.scopus.com/inward/record.url?scp=85077045468&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85077045468
T3  - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
SP  - 2006
EP  - 2008
BT  - 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
PB  - International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
T2  - 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Y2  - 13 May 2019 through 17 May 2019
ER  -

Han D, Boehmer W, Wooldridge M, Rogers A. Multi-agent hierarchical reinforcement learning with dynamic termination. In 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019. International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS). 2019. p. 2006-2008. (Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS).
