Multitask Soft Option Learning

Maximilian Igl; Andrew Gambardella; Jinke He; Nantas Nardelli; N Siddharth; Wendelin Böhmer; Shimon Whiteson

Interactive Intelligence

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This “soft” version of options avoids several instabilities during training in a multitask setting, and provides a natural way to learn both intra-option policies and their terminations. Furthermore, it allows fine-tuning of options for new tasks without forgetting their learned policies, leading to faster training without reducing the expressiveness of the hierarchical policy. We demonstrate empirically that MSOL significantly outperforms both hierarchical and flat transfer-learning baselines.
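As a rough illustration of the idea of per-task posteriors regularized by a shared prior, the minimal sketch below computes a KL-penalized objective over categorical distributions. It is not the authors' implementation: the function names, the coefficient `beta`, and the toy numbers are all assumptions made for illustration, and the paper's actual objective may differ.

```python
import math


def kl_categorical(q, p):
    """KL(q || p) for two categorical distributions given as probability lists."""
    # Terms with q_i == 0 contribute nothing, so they are skipped.
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def regularized_objective(expected_returns, task_posteriors, shared_prior, beta=0.1):
    """Sum over tasks of (return - beta * KL(task posterior || shared prior)).

    Hypothetical helper: `beta` trades off task return against staying close
    to the shared prior.
    """
    total = 0.0
    for ret, q in zip(expected_returns, task_posteriors):
        total += ret - beta * kl_categorical(q, shared_prior)
    return total


# Toy setup (assumed values): four options, two tasks.
shared_prior = [0.25, 0.25, 0.25, 0.25]
task_posteriors = [
    [0.25, 0.25, 0.25, 0.25],  # task 0 matches the prior, so no penalty
    [0.70, 0.10, 0.10, 0.10],  # task 1 deviates and pays a KL penalty
]
expected_returns = [1.0, 1.0]

print(regularized_objective(expected_returns, task_posteriors, shared_prior))
```

In this toy setting, a task whose posterior equals the prior pays no penalty, while a task that specializes is charged in proportion to how far it moves away, which is the sense in which the sharing between tasks is "soft".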

Original language	English
Title of host publication	Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
Pages	969-978
Number of pages	10
Volume	124
Publication status	Published - 2020
Event	36th Conference on Uncertainty in Artificial Intelligence (virtual/online event); Duration: 4 Aug 2020 → 6 Aug 2020; Conference number: 36

Publication series

Name	Proceedings of Machine Learning Research

Conference

Conference	36th Conference on Uncertainty in Artificial Intelligence
Abbreviated title	UAI 2020
Period	4/08/20 → 6/08/20

Access to Document

igl20a-1: Final published version, 1.24 MB

http://proceedings.mlr.press/v124/igl20a.html

Cite this

@inproceedings{1f6355309db34a92b2b46b1e38844fe5,
  title = "Multitask Soft Option Learning",
  abstract = "We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This “soft” version of options avoids several instabilities during training in a multitask setting, and provides a natural way to learn both intra-option policies and their terminations. Furthermore, it allows fine-tuning of options for new tasks without forgetting their learned policies, leading to faster training without reducing the expressiveness of the hierarchical policy. We demonstrate empirically that MSOL significantly outperforms both hierarchical and flat transfer-learning baselines.",
  author = "Maximilian Igl and Andrew Gambardella and Jinke He and Nantas Nardelli and N Siddharth and Wendelin B{\"o}hmer and Shimon Whiteson",
  year = "2020",
  language = "English",
  volume = "124",
  series = "Proceedings of Machine Learning Research",
  pages = "969--978",
  booktitle = "Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)",
  note = "36th Conference on Uncertainty in Artificial Intelligence, UAI 2020 ; Conference date: 04-08-2020 Through 06-08-2020",
}

Igl, M, Gambardella, A, He, J, Nardelli, N, Siddharth, N, Böhmer, W & Whiteson, S 2020, Multitask Soft Option Learning. in Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI). vol. 124, Proceedings of Machine Learning Research, pp. 969-978, 36th Conference on Uncertainty in Artificial Intelligence, 4/08/20. <http://proceedings.mlr.press/v124/igl20a.html>

Multitask Soft Option Learning. / Igl, Maximilian; Gambardella, Andrew; He, Jinke et al.
Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI). Vol. 124 2020. p. 969-978 (Proceedings of Machine Learning Research).

TY - GEN
T1 - Multitask Soft Option Learning
AU - Igl, Maximilian
AU - Gambardella, Andrew
AU - He, Jinke
AU - Nardelli, Nantas
AU - Siddharth, N
AU - Böhmer, Wendelin
AU - Whiteson, Shimon
N1 - Conference code: 36
PY - 2020
Y1 - 2020
AB - We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This “soft” version of options avoids several instabilities during training in a multitask setting, and provides a natural way to learn both intra-option policies and their terminations. Furthermore, it allows fine-tuning of options for new tasks without forgetting their learned policies, leading to faster training without reducing the expressiveness of the hierarchical policy. We demonstrate empirically that MSOL significantly outperforms both hierarchical and flat transfer-learning baselines.
M3 - Conference contribution
VL - 124
T3 - Proceedings of Machine Learning Research
SP - 969
EP - 978
BT - Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
T2 - 36th Conference on Uncertainty in Artificial Intelligence
Y2 - 4 August 2020 through 6 August 2020
ER -
