Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations

N. Albers; M. Suau de Castro; F.A. Oliehoek

Interactive Intelligence

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Deep Reinforcement Learning (RL) is a promising technique towards constructing intelligent agents, but it is not always easy to understand the learning process and the factors that impact it. To shed some light on this, we analyze the Latent State Representations (LSRs) that deep RL agents learn, and compare them to what such agents should ideally learn. We propose a crisp definition of ’ideal LSR’ based on a bisimulation metric, which measures how behaviorally similar states are. The ideal LSR is that in which the distance between two states is proportional to this bisimulation metric. Intuitively, forming such an ideal representation is highly favorable due to its compactness and generalization properties. Here we investigate if this type of representation is also desirable in practice. Our experiments suggest that learning representations that are close to this ideal LSR may improve upon generalization to new irrelevant feature values and modified dynamics. Yet, we show empirically that the extent to which such representations are learned depends on both the network capacity and the state encoding, and that with the current techniques the exact ideal LSR is never formed.

Original language	English
Title of host publication	BNAIC/BeneLearn 2021
Subtitle of host publication	33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning
Editors	Luis A. Leiva, Cédric Pruski, Réka Markovich, Amro Najjar, Christoph Schommer
Pages	320-334
Publication status	Published - 2021
Event	33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning - Esch-sur-Alzette, Luxembourg. Duration: 10 Nov 2021 → 12 Nov 2021

Conference

Conference	33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning
Abbreviated title	BNAIC/BeneLearn 2021
Country/Territory	Luxembourg
City	Esch-sur-Alzette
Period	10/11/21 → 12/11/21

Keywords

Deep Reinforcement Learning
Bisimulation Metrics

Access to Document

bnaic2021_preproceedings5 (Final published version, 5.35 MB)

Cite this

Albers, N.; Suau de Castro, M.; Oliehoek, F.A. / Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations. BNAIC/BeneLearn 2021: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning. editor / Luis A. Leiva; Cédric Pruski; Réka Markovich; Amro Najjar; Christoph Schommer. 2021. pp. 320-334

@inproceedings{cdb0d337dd7643f28f3dd5a005af0790,

title = "Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations",

abstract = "Deep Reinforcement Learning (RL) is a promising technique towards constructing intelligent agents, but it is not always easy to understand the learning process and the factors that impact it. To shed some light on this, we analyze the Latent State Representations (LSRs) that deep RL agents learn, and compare them to what such agents should ideally learn. We propose a crisp definition of {\textquoteright}ideal LSR{\textquoteright} based on a bisimulation metric, which measures how behaviorally similar states are. The ideal LSR is that in which the distance between two states is proportional to this bisimulation metric. Intuitively, forming such an ideal representation is highly favorable due to its compactness and generalization properties. Here we investigate if this type of representation is also desirable in practice. Our experiments suggest that learning representations that are close to this ideal LSR may improve upon generalization to new irrelevant feature values and modified dynamics. Yet, we show empirically that the extent to which such representations are learned depends on both the network capacity and the state encoding, and that with the current techniques the exact ideal LSR is never formed. ",

keywords = "Deep Reinforcement Learning, Bisimulation Metrics",

author = "N. Albers and {Suau de Castro}, M. and F.A. Oliehoek",

year = "2021",

language = "English",

pages = "320--334",

editor = "Leiva, {Luis A.} and Pruski, {C{\'e}dric} and Markovich, {R{\'e}ka} and Najjar, {Amro} and Schommer, {Christoph}",

booktitle = "BNAIC/BeneLearn 2021",

note = "33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning, BNAIC/BeneLearn 2021 ; Conference date: 10-11-2021 Through 12-11-2021",

}

Albers, N, Suau de Castro, M & Oliehoek, FA 2021, Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations. in LA Leiva, C Pruski, R Markovich, A Najjar & C Schommer (eds), BNAIC/BeneLearn 2021: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning. pp. 320-334, 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning, Esch-sur-Alzette, Luxembourg, 10/11/21.

Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations. / Albers, N.; Suau de Castro, M.; Oliehoek, F.A.
BNAIC/BeneLearn 2021: 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning. ed. / Luis A. Leiva; Cédric Pruski; Réka Markovich; Amro Najjar; Christoph Schommer. 2021. p. 320-334.

TY - GEN

T1 - Using Bisimulation Metrics to Analyze and Evaluate Latent State Representations

AU - Albers, N.

AU - Suau de Castro, M.

AU - Oliehoek, F.A.

PY - 2021

Y1 - 2021

AB - Deep Reinforcement Learning (RL) is a promising technique towards constructing intelligent agents, but it is not always easy to understand the learning process and the factors that impact it. To shed some light on this, we analyze the Latent State Representations (LSRs) that deep RL agents learn, and compare them to what such agents should ideally learn. We propose a crisp definition of ’ideal LSR’ based on a bisimulation metric, which measures how behaviorally similar states are. The ideal LSR is that in which the distance between two states is proportional to this bisimulation metric. Intuitively, forming such an ideal representation is highly favorable due to its compactness and generalization properties. Here we investigate if this type of representation is also desirable in practice. Our experiments suggest that learning representations that are close to this ideal LSR may improve upon generalization to new irrelevant feature values and modified dynamics. Yet, we show empirically that the extent to which such representations are learned depends on both the network capacity and the state encoding, and that with the current techniques the exact ideal LSR is never formed.

KW - Deep Reinforcement Learning

KW - Bisimulation Metrics

M3 - Conference contribution

SP - 320

EP - 334

BT - BNAIC/BeneLearn 2021

A2 - Leiva, Luis A.

A2 - Pruski, Cédric

A2 - Markovich, Réka

A2 - Najjar, Amro

A2 - Schommer, Christoph

T2 - 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning

Y2 - 10 November 2021 through 12 November 2021

ER -
