TrustNet: Learning from Trusted Data Against (A)symmetric Label Noise

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review

Abstract

Big Data systems allow collecting massive datasets to feed data-hungry deep learning models. Labelling these ever-bigger datasets is increasingly challenging, and label errors affect even highly curated sets. This makes robustness to label noise a critical property for weakly-supervised classifiers. Related work on resilient deep networks tends to focus on a limited set of synthetic noise patterns and holds disparate views on their impact, e.g., robustness against symmetric vs. asymmetric noise patterns. In this paper, we first extend the theoretical analysis of test accuracy to any given noise pattern. Based on these insights, we design TrustNet, which first learns the pattern of noise corruption, be it symmetric or asymmetric, from a small set of trusted data. TrustNet is then trained via a robust loss function that weights the given labels against the labels inferred from the learned noise pattern. The weight is adjusted based on model uncertainty across training epochs. We evaluate TrustNet on synthetic label noise for CIFAR-10 and CIFAR-100 and on a large real-world dataset with label noise, Clothing1M. We compare against state-of-the-art methods, demonstrating the strong robustness of TrustNet under a diverse set of noise patterns.
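The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the noise pattern is learned or how the weight is derived from model uncertainty. The sketch assumes the noise pattern is a class-to-class transition matrix estimated by simple counting on trusted pairs of clean and noisy labels, that inferred labels come from Bayes' rule under a uniform class prior, and that the weight `alpha` is supplied by the caller. All function names are hypothetical.

```python
import numpy as np


def estimate_transition_matrix(clean_labels, noisy_labels, num_classes):
    """Count-based estimate of T[i, j] ~ P(noisy = j | clean = i).

    Assumption: counting on trusted (clean, noisy) pairs stands in for
    whatever estimator the paper actually uses.
    """
    counts = np.zeros((num_classes, num_classes))
    for c, n in zip(clean_labels, noisy_labels):
        counts[c, n] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes absent from the trusted set fall back to an identity row (no noise).
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1), np.eye(num_classes))


def infer_clean_distribution(T, noisy_label, prior=None):
    """Posterior over clean classes given a noisy label, via Bayes' rule."""
    num_classes = T.shape[0]
    if prior is None:
        # Assumption: uniform class prior when none is given.
        prior = np.full(num_classes, 1.0 / num_classes)
    joint = T[:, noisy_label] * prior
    total = joint.sum()
    return joint / total if total > 0 else prior


def weighted_loss(probs, given_label, inferred_dist, alpha):
    """alpha * CE(given label) + (1 - alpha) * CE(inferred soft label).

    `alpha` is a plain input here; the abstract says the weight is adjusted
    from model uncertainty across epochs but does not give the rule.
    """
    eps = 1e-12
    ce_given = -np.log(probs[given_label] + eps)
    ce_inferred = -np.sum(inferred_dist * np.log(probs + eps))
    return alpha * ce_given + (1 - alpha) * ce_inferred
```

With `alpha = 1` the loss reduces to ordinary cross-entropy on the given label, and with `alpha = 0` it relies entirely on the label distribution inferred from the estimated noise pattern.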

Original language: English
Title of host publication: Proceedings of the 8th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2021
Publisher: Association for Computing Machinery (ACM)
Pages: 52-62
Number of pages: 11
ISBN (Electronic): 9781450391641
Publication status: Published - 2021
Event: 8th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2021 - Leicester, United Kingdom
Duration: 6 Dec 2021 - 9 Dec 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 8th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2021
Country/Territory: United Kingdom
City: Leicester
Period: 6/12/21 - 9/12/21

Keywords

  • deep neural networks
  • noise estimation
  • noise transition matrix
  • noisy labels in big data
  • robust loss function
