Learning Complex Policy Distribution with CEM Guided Adversarial Hypernetwork

Shi Yuan Tang, F.A. Oliehoek, Athirai A. Irissappane, Jie Zhang

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review


Abstract

The Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which offers greater stability and is insensitive to hyperparameter tuning. CEM resembles population-based evolutionary methods but, rather than maintaining a population, it maintains a distribution over candidate solutions (policies in our case). Usually, a natural exponential family distribution such as a multivariate Gaussian is used to parameterize the policy distribution. Using a multivariate Gaussian limits the quality of CEM policies because the search becomes confined to a less representative subspace. We address this drawback by using an adversarially trained hypernetwork, enabling a richer and more complex representation of the policy distribution. To achieve better training stability and faster convergence, we use a multivariate Gaussian CEM policy to guide our adversarial training process. Experiments demonstrate that our approach outperforms state-of-the-art CEM-based methods by 15.8% in terms of rewards while achieving faster convergence. Results also show that our approach is less sensitive to hyperparameters than other deep-RL methods such as REINFORCE, DDPG and DQN.
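
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of CEM with a diagonal Gaussian over parameter vectors. It is not the authors' implementation and does not include the adversarial hypernetwork; the function name `gaussian_cem`, the toy objective `toy_return`, and all hyperparameter values are assumptions chosen for illustration. In a reinforcement learning setting, `score_fn` would instead return the episode reward of a policy with the given parameters.

```python
import numpy as np

def gaussian_cem(score_fn, dim, iterations=50, population=64,
                 elite_frac=0.2, init_std=2.0, seed=0):
    """Diagonal-Gaussian CEM: sample candidates, keep elites, refit the Gaussian."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.full(dim, init_std)
    n_elite = max(1, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current distribution.
        candidates = rng.normal(mean, std, size=(population, dim))
        scores = np.array([score_fn(c) for c in candidates])
        # Keep the highest-scoring candidates (the elite set).
        elite = candidates[np.argsort(scores)[-n_elite:]]
        # Refit the Gaussian to the elites; the small floor on the
        # standard deviation keeps the search from collapsing to a point.
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3
    return mean, std

# Hypothetical stand-in for an episode return, maximized at `target`.
target = np.array([1.0, -2.0, 0.5])

def toy_return(theta):
    return -np.sum((theta - target) ** 2)

best_mean, best_std = gaussian_cem(toy_return, dim=3)
```

Because every candidate is drawn from a single Gaussian, the search can only represent one unimodal region at a time, which is the limitation the abstract says the hypernetwork is meant to address.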
Original language: English
Title of host publication: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems
Place of Publication: Richland, SC
Publisher: International Foundation for Autonomous Agents and Multiagent Systems
Pages: 1308-1316
Number of pages: 9
ISBN (Electronic): 9781450383073
Publication status: Published - 2021
Event: 20th International Conference on Autonomous Agents and Multiagent Systems - Virtual/online event due to COVID-19
Duration: 3 May 2021 to 7 May 2021
Conference number: 20

Publication series

Name: AAMAS '21
Publisher: International Foundation for Autonomous Agents and Multiagent Systems
ISSN (Electronic): 2523-5699

Conference

Conference: 20th International Conference on Autonomous Agents and Multiagent Systems
Abbreviated title: AAMAS 2021
Period: 3/05/21 to 7/05/21

Keywords

  • Cross-Entropy Method
  • Hypernetworks
  • Generative Adversarial Networks
  • Reinforcement Learning
