Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Deep Reinforcement Learning (RL) is a promising technique for constructing intelligent agents, but it is not always easy to understand the learning process and the factors that impact it. To shed some light on this, we analyze the Latent State Representations (LSRs) that deep RL agents learn, and compare them to what such agents should ideally learn. We propose a crisp definition of 'ideal LSR' based on a bisimulation metric, which measures how behaviorally similar states are. The ideal LSR is one in which the distance between two states is proportional to this bisimulation metric. Intuitively, forming such an ideal representation is highly favorable due to its compactness and generalization properties. Here we investigate whether this type of representation is also desirable in practice. Our experiments suggest that learning representations that are close to this ideal LSR may improve generalization to new irrelevant feature values and modified dynamics. Yet, we show empirically that the extent to which such representations are learned depends on both the network capacity and the state encoding, and that with current techniques the exact ideal LSR is never formed.
Original language: English
Title of host publication: BNAIC/BeneLearn 2021
Subtitle of host publication: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning
Editors: Luis A. Leiva, Cédric Pruski, Réka Markovich, Amro Najjar, Christoph Schommer
Pages: 320-334
Publication status: Published - 2021
Event: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning - Esch-sur-Alzette, Luxembourg
Duration: 10 Nov 2021 to 12 Nov 2021

Conference

Conference: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning
Abbreviated title: BNAIC/BeneLearn 2021
Country/Territory: Luxembourg
City: Esch-sur-Alzette
Period: 10/11/21 to 12/11/21

Keywords

  • Deep Reinforcement Learning
  • Bisimulation Metrics
