Abstract
Synthetic tabular data is emerging as an alternative way to share knowledge while complying with strict data access regulations, such as the European General Data Protection Regulation (GDPR). Mainstream table synthesizers use methodologies derived from Generative Adversarial Networks (GANs). Although several state-of-the-art (SOTA) tabular GAN algorithms inherit Convolutional Neural Network (CNN)-based architectures, which have proven effective for images, they tend to overlook two critical properties of tabular data: (i) the global correlation across columns, and (ii) the semantic invariance to the column order. Permuting the columns of a table does not alter the semantic meaning of the data, but features extracted by CNNs can change significantly because of their limited convolution kernel size. To address these problems, we propose FCT-GAN, the first conditional tabular GAN to adopt Fourier networks for table synthesis. FCT-GAN enhances permutation-invariant GAN training by strengthening the learning of global correlations via Fourier layers. Extensive evaluation on benchmarks and real-world datasets shows that FCT-GAN can synthesize tabular data with better (up to 27.8%) machine learning utility (i.e., a proxy of global correlations) and higher (up to 26.5%) statistical similarity to real data. FCT-GAN also has the least variation in synthetic data quality among 7 SOTA baselines across 3 different training-data column orders.
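The contrast the abstract draws between a limited convolution kernel and a Fourier layer can be illustrated with a minimal sketch. This is not the FCT-GAN architecture: the helper names (`local_conv`, `dft`, `count_changed`), the example row, and the kernel values are all hypothetical, chosen only to show that a size-3 convolution output depends on three adjacent columns, whereas every coefficient of a discrete Fourier transform depends on all columns.

```python
import cmath

def local_conv(row, kernel):
    """Valid 1-D cross-correlation: each output sees only len(kernel) adjacent columns."""
    k = len(kernel)
    return [sum(row[i + j] * kernel[j] for j in range(k))
            for i in range(len(row) - k + 1)]

def dft(row):
    """Naive discrete Fourier transform: every coefficient sums over all columns."""
    n = len(row)
    return [sum(row[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def count_changed(before, after, tol=1e-9):
    """Count the output positions whose value moved by more than tol."""
    return sum(1 for a, b in zip(before, after) if abs(a - b) > tol)

# One table row with 8 numeric columns (illustrative values, not from the paper).
row = [0.5, 1.0, -0.3, 2.0, 0.0, 1.5, -1.0, 0.7]
# Perturb only the last column.
perturbed = row[:-1] + [row[-1] + 1.0]
kernel = [0.25, 0.5, 0.25]  # hypothetical size-3 filter

conv_changed = count_changed(local_conv(row, kernel), local_conv(perturbed, kernel))
dft_changed = count_changed(dft(row), dft(perturbed))
print(conv_changed, dft_changed)  # prints "1 8"
```

Under these assumptions, changing one column moves a single convolution output but all eight Fourier coefficients, which is one way to read the claim that Fourier layers capture global correlations across columns regardless of where a column sits.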
Original language | English |
---|---|
Title of host publication | CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management |
Place of Publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 4450–4454 |
Number of pages | 5 |
ISBN (Electronic) | 9798400701245 |
ISBN (Print) | 979-8-4007-0124-5 |
Publication status | Published - 2023 |
Event | CIKM 2023: The 32nd ACM International Conference on Information and Knowledge Management, Birmingham, United Kingdom; Duration: 21 Oct 2023 → 25 Oct 2023; Conference number: 32nd |
Publication series
Name | International Conference on Information and Knowledge Management, Proceedings |
---|---|
Conference
Conference | CIKM 2023 |
---|---|
Country/Territory | United Kingdom |
City | Birmingham |
Period | 21/10/23 → 25/10/23 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Fourier transform
- GAN
- tabular data