EAD-GAN: A Generative Adversarial Network for Disentangling Affine Transforms in Images

Letao Liu*, Xudong Jiang, Martin Saerbeck, Justin Dauwels

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This article proposes a generative adversarial network called the explicit affine disentangled generative adversarial network (EAD-GAN), which explicitly disentangles affine transforms in a self-supervised manner. We propose an affine transform regularizer that forces InfoGAN to exhibit the explicit properties of an affine transform. To facilitate training an affine transform encoder, we decompose the affine matrix into two separate matrices and infer the explicit transform parameters by the least-squares method. Unlike existing approaches, the representations learned by EAD-GAN have a clear physical meaning: transforms such as rotation, horizontal and vertical zoom, skew, and translation are explicitly learned from the training data. We can therefore set each transform parameter individually to generate data with a specific affine transform from the learned network. We show that EAD-GAN successfully disentangles these attributes on the MNIST, CelebA, and dSprites datasets. On the dSprites dataset, EAD-GAN outperforms the state-of-the-art methods by a large margin in disentanglement scores: it achieves MIG and DCI scores of 0.59 and 0.96, respectively, compared with 0.37 and 0.71 for the state-of-the-art methods.
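
The least-squares step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which two matrices the affine matrix is decomposed into, so the rotation, skew, and zoom factorization used here is an assumption, and the function names `compose_affine`, `fit_affine_least_squares`, and `decompose_affine` are hypothetical. The sketch only shows how explicit parameters (rotation, horizontal and vertical zoom, skew, translation) can be read off a 2x3 affine matrix fitted by least squares to point correspondences.

```python
import numpy as np


def compose_affine(theta, sx, sy, shear, tx, ty):
    """Build a 2x3 affine matrix [R @ Sh @ S | t] (assumed factorization)."""
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    shr = np.array([[1.0, shear],
                    [0.0, 1.0]])
    scl = np.diag([sx, sy])
    linear = rot @ shr @ scl
    return np.hstack([linear, np.array([[tx], [ty]])])


def fit_affine_least_squares(src, dst):
    """Fit a 2x3 affine matrix M so that dst ~= src @ M[:, :2].T + M[:, 2].

    src and dst are (N, 2) arrays of matching points, N >= 3, not collinear.
    """
    ones = np.ones((src.shape[0], 1))
    design = np.hstack([src, ones])              # (N, 3) homogeneous points
    solution, *_ = np.linalg.lstsq(design, dst, rcond=None)  # (3, 2)
    return solution.T                            # (2, 3)


def decompose_affine(matrix):
    """Recover (theta, sx, sy, shear, tx, ty) from a 2x3 affine matrix.

    Assumes the R @ Sh @ S factorization above with sx > 0 and a
    positive determinant (no reflection).
    """
    a, b = matrix[0, 0], matrix[0, 1]
    c, d = matrix[1, 0], matrix[1, 1]
    det = a * d - b * c
    sx = np.hypot(a, c)                # length of the first column
    theta = np.arctan2(c, a)           # direction of the first column
    sy = det / sx                      # det(R @ Sh @ S) = sx * sy
    shear = (a * b + c * d) / det      # projection of column 2 on column 1
    tx, ty = matrix[0, 2], matrix[1, 2]
    return (float(theta), float(sx), float(sy), float(shear),
            float(tx), float(ty))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(20, 2))
    target = compose_affine(0.3, 1.2, 0.8, 0.1, 2.0, -1.0)
    moved = points @ target[:, :2].T + target[:, 2]
    print(decompose_affine(fit_affine_least_squares(points, moved)))
```

Because each recovered value corresponds to one named transform, a single parameter can be changed and the matrix rebuilt with `compose_affine`, which mirrors the idea of setting each transform parameter individually.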

Original language: English
Pages (from-to): 3652-3662
Number of pages: 11
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 35
Issue number: 3
Publication status: Published - 2022

Keywords

  • Affine transform
  • disentanglement
  • generative adversarial network (GAN)
