A Unifying Framework for Reinforcement Learning and Planning

Thomas M. Moerland; Joost Broekens; Aske Plaat; Catholijn M. Jonker

doi:10.3389/frai.2022.908353

Corresponding author: Thomas M. Moerland

Interactive Intelligence

Research output: Contribution to journal › Article › Scientific › peer-review

1 Citation (Scopus)

73 Downloads (Pure)

Abstract

Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are reinforcement learning and planning, which both largely have their own research communities. However, if both research fields solve the same problem, then we might be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide. At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions. Altogether, the framework may help provide deeper insight in the algorithmic design space of planning and reinforcement learning.

Original language	English
Article number	908353
Number of pages	25
Journal	Frontiers in Artificial Intelligence
Volume	5
DOIs	https://doi.org/10.3389/frai.2022.908353
Publication status	Published - 2022

Keywords

framework
model-based reinforcement learning
overview
planning
reinforcement learning
synthesis

Access to Document

10.3389/frai.2022.908353

frai-05-908353 (Final published version, 8.08 MB, Licence: CC BY)

Cite this

@article{e29c644e4bf748a58b8df1e8ec7afe91,

title = "A Unifying Framework for Reinforcement Learning and Planning",

abstract = "Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are reinforcement learning and planning, which both largely have their own research communities. However, if both research fields solve the same problem, then we might be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide. At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions. Altogether, the framework may help provide deeper insight in the algorithmic design space of planning and reinforcement learning.",

keywords = "framework, model-based reinforcement learning, overview, planning, reinforcement learning, synthesis",

author = "Moerland, {Thomas M.} and Joost Broekens and Aske Plaat and Jonker, {Catholijn M.}",

year = "2022",

doi = "10.3389/frai.2022.908353",

language = "English",

volume = "5",

journal = "Frontiers in Artificial Intelligence",

issn = "2624-8212",

publisher = "Frontiers Media",

}

TY - JOUR

T1 - A Unifying Framework for Reinforcement Learning and Planning

AU - Moerland, Thomas M.

AU - Broekens, Joost

AU - Plaat, Aske

AU - Jonker, Catholijn M.

PY - 2022

Y1 - 2022

AB - Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are reinforcement learning and planning, which both largely have their own research communities. However, if both research fields solve the same problem, then we might be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide. At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions. Altogether, the framework may help provide deeper insight in the algorithmic design space of planning and reinforcement learning.

KW - framework

KW - model-based reinforcement learning

KW - overview

KW - planning

KW - reinforcement learning

KW - synthesis

UR - http://www.scopus.com/inward/record.url?scp=85134699602&partnerID=8YFLogxK

U2 - 10.3389/frai.2022.908353

DO - 10.3389/frai.2022.908353

M3 - Article

AN - SCOPUS:85134699602

SN - 2624-8212

VL - 5

JO - Frontiers in Artificial Intelligence

JF - Frontiers in Artificial Intelligence

M1 - 908353

ER -
