TY - GEN
T1 - A cross-entropy approach to solving Dec-POMDPs
AU - Oliehoek, Frans A.
AU - Kooij, Julian F.P.
AU - Vlassis, Nikos
PY - 2008
Y1 - 2008
N2 - In this paper we focus on distributed multiagent planning under uncertainty. For single-agent planning under uncertainty, the partially observable Markov decision process (POMDP) is the dominant model (see [Spaan and Vlassis, 2005] and references therein). Recently, several generalizations of the POMDP to multiagent settings have been proposed. Here we focus on the decentralized POMDP (Dec-POMDP) model for multiagent planning under uncertainty [Bernstein et al., 2002, Goldman and Zilberstein, 2004]. Solving a Dec-POMDP amounts to finding a set of optimal policies for the agents that maximize the expected shared reward. However, solving a Dec-POMDP has proven to be hard (NEXP-complete): The number of possible deterministic policies for a single agent grows doubly exponentially with the planning horizon, and exponentially with the number of actions and observations available. As a result, the focus has shifted to approximate solution techniques [Nair et al., 2003, Emery-Montemerlo et al., 2005, Oliehoek and Vlassis, 2007].
AB - In this paper we focus on distributed multiagent planning under uncertainty. For single-agent planning under uncertainty, the partially observable Markov decision process (POMDP) is the dominant model (see [Spaan and Vlassis, 2005] and references therein). Recently, several generalizations of the POMDP to multiagent settings have been proposed. Here we focus on the decentralized POMDP (Dec-POMDP) model for multiagent planning under uncertainty [Bernstein et al., 2002, Goldman and Zilberstein, 2004]. Solving a Dec-POMDP amounts to finding a set of optimal policies for the agents that maximize the expected shared reward. However, solving a Dec-POMDP has proven to be hard (NEXP-complete): The number of possible deterministic policies for a single agent grows doubly exponentially with the planning horizon, and exponentially with the number of actions and observations available. As a result, the focus has shifted to approximate solution techniques [Nair et al., 2003, Emery-Montemerlo et al., 2005, Oliehoek and Vlassis, 2007].
UR - http://www.scopus.com/inward/record.url?scp=77951732939&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74930-1_15
DO - 10.1007/978-3-540-74930-1_15
M3 - Conference contribution
AN - SCOPUS:77951732939
SN - 9783540749295
VL - 78
T3 - Studies in Computational Intelligence
SP - 145
EP - 154
BT - Advances in Intelligent and Distributed Computing: Proceedings of the 1st International Symposium on Intelligent and Distributed Computing IDC'2007, Craiova, Romania, October 2007
ER -