Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning

Thomas M. Moerland; Joost Broekens; Catholijn M. Jonker

Interactive Intelligence

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

In this paper we study how to learn stochastic, multimodal transition dynamics in reinforcement learning (RL) tasks. We focus on evaluating transition function estimation, while we defer planning over this model to future work. Stochasticity is a fundamental property of many task environments. However, discriminative function approximators have difficulty estimating multimodal stochasticity. In contrast, deep generative models do capture complex high-dimensional outcome distributions. First we discuss why, amongst such models, conditional variational inference (VI) is theoretically most appealing for model-based RL. Subsequently, we compare different VI models on their ability to learn complex stochasticity on simulated functions, as well as on a typical RL gridworld with multimodal dynamics. Results show VI successfully predicts multimodal outcomes, but also robustly ignores these for deterministic parts of the transition dynamics. In summary, we show a robust method to learn multimodal transitions using function approximation, which is a key preliminary for model-based RL in stochastic domains.

Original language	English
Title of host publication	SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop
Pages	1-18
Number of pages	18
Publication status	Published - 2017
Event	SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop - Skopje, Macedonia, The Former Yugoslav Republic of
Duration	18 Sept 2017 → 18 Sept 2017

Workshop

Workshop	SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop
Country/Territory	Macedonia, The Former Yugoslav Republic of
City	Skopje
Period	18/09/17 → 18/09/17

Access to Document

SURL-2017_paper_6 (Final published version, 554 KB)

Cite this

@inproceedings{46ad580bbc9c4a96bb40178176fe9f12,

title = "Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning",

abstract = "In this paper we study how to learn stochastic, multimodal transition dynamics in reinforcement learning (RL) tasks. We focus on evaluating transition function estimation, while we defer planning over this model to future work. Stochasticity is a fundamental property of many task environments. However, discriminative function approximators have difficulty estimating multimodal stochasticity. In contrast, deep generative models do capture complex high-dimensional outcome distributions. First we discuss why, amongst such models, conditional variational inference (VI) is theoretically most appealing for model-based RL. Subsequently, we compare different VI models on their ability to learn complex stochasticity on simulated functions, as well as on a typical RL gridworld with multimodal dynamics. Results show VI successfully predicts multimodal outcomes, but also robustly ignores these for deterministic parts of the transition dynamics. In summary, we show a robust method to learn multimodal transitions using function approximation, which is a key preliminary for model-based RL in stochastic domains.",

author = "Moerland, {Thomas M.} and Broekens, Joost and Jonker, {Catholijn M.}",

year = "2017",

language = "English",

pages = "1--18",

booktitle = "SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop",

note = "SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop ; Conference date: 18-09-2017 Through 18-09-2017",

}

Moerland, TM, Broekens, J & Jonker, CM 2017, Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning. in SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop. pp. 1-18, SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop, Skopje, Macedonia, The Former Yugoslav Republic of, 18/09/17.

TY - GEN

T1 - Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning

AU - Moerland, Thomas M.

AU - Broekens, Joost

AU - Jonker, Catholijn M.

PY - 2017

Y1 - 2017

AB - In this paper we study how to learn stochastic, multimodal transition dynamics in reinforcement learning (RL) tasks. We focus on evaluating transition function estimation, while we defer planning over this model to future work. Stochasticity is a fundamental property of many task environments. However, discriminative function approximators have difficulty estimating multimodal stochasticity. In contrast, deep generative models do capture complex high-dimensional outcome distributions. First we discuss why, amongst such models, conditional variational inference (VI) is theoretically most appealing for model-based RL. Subsequently, we compare different VI models on their ability to learn complex stochasticity on simulated functions, as well as on a typical RL gridworld with multimodal dynamics. Results show VI successfully predicts multimodal outcomes, but also robustly ignores these for deterministic parts of the transition dynamics. In summary, we show a robust method to learn multimodal transitions using function approximation, which is a key preliminary for model-based RL in stochastic domains.

M3 - Conference contribution

SP - 1

EP - 18

BT - SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop

T2 - SURL 2017: 1st Scaling-Up Reinforcement Learning (SURL) Workshop

Y2 - 18 September 2017 through 18 September 2017

ER -
