TY - JOUR
T1 - A Unifying Framework for Reinforcement Learning and Planning
AU - Moerland, Thomas M.
AU - Broekens, Joost
AU - Plaat, Aske
AU - Jonker, Catholijn M.
PY - 2022
Y1 - 2022
AB - Sequential decision making, commonly formalized as optimization of a Markov Decision Process, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are reinforcement learning and planning, which both largely have their own research communities. However, if both research fields solve the same problem, then we might be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide. At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions. Altogether, the framework may help provide deeper insight in the algorithmic design space of planning and reinforcement learning.
KW - framework
KW - model-based reinforcement learning
KW - overview
KW - planning
KW - reinforcement learning
KW - synthesis
UR - http://www.scopus.com/inward/record.url?scp=85134699602&partnerID=8YFLogxK
U2 - 10.3389/frai.2022.908353
DO - 10.3389/frai.2022.908353
M3 - Article
AN - SCOPUS:85134699602
SN - 2624-8212
VL - 5
JO - Frontiers in Artificial Intelligence
JF - Frontiers in Artificial Intelligence
M1 - 908353
ER -