TY - GEN
T1 - Multi-agent hierarchical reinforcement learning with dynamic termination
AU - Han, Dongge
AU - Boehmer, Wendelin
AU - Wooldridge, Michael
AU - Rogers, Alex
PY - 2019/1/1
Y1 - 2019/1/1
N2 - In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.
AB - In a multi-agent system, an agent's optimal policy will typically depend on the policies of other agents. Predicting the behaviours of others, and responding promptly to changes in such behaviours, is therefore a key issue in multi-agent systems research. One obvious possibility is for each agent to broadcast their current intention, for example, the currently executed option in a hierarchical RL framework. However, this approach results in inflexible agents when options have an extended duration. While adjusting the executed option at each step improves flexibility from a single-agent perspective, frequent changes in options can induce inconsistency between an agent's actual behaviour and its broadcasted intention. In order to balance flexibility and predictability, we propose a dynamic termination Bellman equation that allows the agents to flexibly terminate their options.
KW - Hierarchical reinforcement learning
KW - Multi-agent learning
UR - http://www.scopus.com/inward/record.url?scp=85077045468&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85077045468
T3 - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
SP - 2006
EP - 2008
BT - 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
PB - International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
T2 - 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Y2 - 13 May 2019 through 17 May 2019
ER -