Efficient Exploitation of Factored Domains in Bayesian Reinforcement Learning for POMDPs

Sammie Katt, Frans A. Oliehoek, Christopher Amato

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

While the POMDP has proven to be a powerful framework to model and solve partially observable stochastic problems, it assumes accurate and complete knowledge of the environment. When such information is not available, as is the case in many real-world applications, one must learn such a model. The BA-POMDP considers the model as part of the hidden state and explicitly considers the uncertainty over it, and as a result transforms the learning problem into a planning problem. This model, however, grows exponentially with the underlying POMDP size, and becomes intractable for non-trivial problems. In this article we propose a factored framework, the FBA-POMDP, that represents the model as a Bayes-Net, drastically decreasing the number of parameters required to describe the dynamics of the environment. We demonstrate that our approach allows solvers to tackle problems much larger than is possible with the BA-POMDP.
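The parameter reduction mentioned in the abstract can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes a state made of equally sized discrete features, a fixed upper bound on the number of parents each feature has in the Bayes-Net, and it counts transition parameters only (the observation model is left out). The function names and the example numbers are hypothetical.

```python
def tabular_transition_params(num_features, values_per_feature, num_actions):
    """Count entries in a flat transition table T(s' | s, a).

    Assumption: the state space is the full cross product of the features,
    so |S| = values_per_feature ** num_features.
    """
    num_states = values_per_feature ** num_features
    return num_states * num_actions * num_states


def factored_transition_params(num_features, values_per_feature,
                               num_actions, max_parents):
    """Count entries when each next-state feature depends on few parents.

    Assumption: every feature has exactly `max_parents` parent features,
    so each conditional table has values_per_feature ** max_parents rows
    and values_per_feature columns, per action.
    """
    parent_configs = values_per_feature ** max_parents
    return num_features * num_actions * parent_configs * values_per_feature


if __name__ == "__main__":
    # Hypothetical domain: 10 binary features, 4 actions, 3 parents each.
    flat = tabular_transition_params(10, 2, 4)
    factored = factored_transition_params(10, 2, 4, 3)
    print(f"tabular:  {flat}")      # 1024 * 4 * 1024 = 4194304
    print(f"factored: {factored}")  # 10 * 4 * 8 * 2 = 640
```

Under these assumptions the flat count grows exponentially in the number of features, while the factored count grows linearly in the number of features and exponentially only in the number of parents.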
Original language: English
Title of host publication: Adaptive Learning Agents (ALA 2018)
Number of pages: 6
Publication status: Published - 1 Jul 2018
Event: ALA 2018 - Workshop at the Federated AI Meeting 2018 - Stockholm, Sweden
Duration: 14 Jul 2018 → 15 Jul 2019

Conference

Conference: ALA 2018 - Workshop at the Federated AI Meeting 2018
Abbreviated title: ALA 2018
Country: Sweden
City: Stockholm
Period: 14/07/18 → 15/07/19

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • refereed, workshop
