MDP homomorphic networks: Group symmetries in reinforcement learning

Elise van der Pol; Daniel E. Worrall; Herke van Hoof; Frans A. Oliehoek; Max Welling

Interactive Intelligence

Research output: Contribution to journal › Conference article › Scientific › peer-review

41 Citations (Scopus)

90 Downloads (Pure)

Abstract

This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.
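The equivariance constraint mentioned in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes one plausible numerical recipe: average random weight matrices over the group, then use an SVD to extract a basis of the equivariant subspace. The CartPole reflection representations (state negation, swapping the two action logits) and the function names `symmetrize` and `equivariant_basis` are assumptions chosen for illustration.

```python
import numpy as np

def symmetrize(W, in_reps, out_reps):
    """Group-average W: S(W) = 1/|G| * sum_g K_g^{-1} W L_g."""
    terms = [np.linalg.inv(K) @ W @ L for L, K in zip(in_reps, out_reps)]
    return sum(terms) / len(terms)

def equivariant_basis(in_reps, out_reps, n_samples=64, tol=1e-8, seed=0):
    """Numerically find a basis of linear maps W with K_g W = W L_g for all g."""
    rng = np.random.default_rng(seed)
    d_out, d_in = out_reps[0].shape[0], in_reps[0].shape[0]
    samples = rng.standard_normal((n_samples, d_out, d_in))
    # Each symmetrized sample lies in the equivariant subspace; flatten to rows.
    sym = np.stack([symmetrize(W, in_reps, out_reps).ravel() for W in samples])
    # Right singular vectors with nonzero singular values span that subspace.
    _, s, vt = np.linalg.svd(sym, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))
    return vt[:rank].reshape(rank, d_out, d_in)

# Assumed example: reflection symmetry of CartPole (group with two elements).
# State (x, x_dot, theta, theta_dot) is negated; the two action logits swap.
L_reps = [np.eye(4), -np.eye(4)]                       # action on states
K_reps = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]  # action on logits

basis = equivariant_basis(L_reps, K_reps)

# Any linear combination of basis elements is an equivariant weight matrix.
coeffs = np.array([0.5, -1.0, 2.0, 0.3])
W = np.tensordot(coeffs, basis, axes=1)  # shape (2, 4)
state = np.array([0.1, -0.2, 0.05, 0.3])
logits = W @ state
logits_reflected = W @ (L_reps[1] @ state)
```

In this sketch only the coefficients `coeffs` would be trained, so the layer has 4 free parameters instead of the 8 of an unconstrained 2-by-4 weight matrix, which is one concrete sense in which the constraint reduces the size of the solution space.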

Original language	English
Number of pages	12
Journal	Advances in Neural Information Processing Systems
Volume	2020-December
Publication status	Published - 2020
Event	34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online. Duration: 6 Dec 2020 → 12 Dec 2020

Access to Document

NeurIPS-2020-mdp-homomorphic-networks-group-symmetries-in-reinforcement-learning-Paper: Final published version, 653 KB

Cite this

@article{231506a276354c30be3097273e99537e,

title = "MDP homomorphic networks: Group symmetries in reinforcement learning",

abstract = "This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.",

author = "{van der Pol}, Elise and Worrall, {Daniel E.} and {van Hoof}, Herke and Oliehoek, {Frans A.} and Welling, Max",

year = "2020",

language = "English",

volume = "2020-December",

journal = "Advances in Neural Information Processing Systems",

issn = "1049-5258",

note = "34th Conference on Neural Information Processing Systems, NeurIPS 2020 ; Conference date: 06-12-2020 Through 12-12-2020",

}

TY - JOUR

T1 - MDP homomorphic networks: Group symmetries in reinforcement learning

T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020

AU - van der Pol, Elise

AU - Worrall, Daniel E.

AU - van Hoof, Herke

AU - Oliehoek, Frans A.

AU - Welling, Max

PY - 2020

Y1 - 2020

AB - This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.

UR - http://www.scopus.com/inward/record.url?scp=85106087435&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85106087435

SN - 1049-5258

VL - 2020-December

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

Y2 - 6 December 2020 through 12 December 2020

ER -
