Robust Learning via Golden Symmetric Loss of (un)Trusted Labels

Amirmasoud Ghiassi*, Robert Birke, Lydia Y. Chen*

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

Learning robust deep models against noisy labels becomes ever more critical as today's data is commonly collected from open platforms and subject to adversarial corruption. Information on the label corruption process, i.e., the corruption matrix, can greatly enhance the robustness of deep models, but existing corruption-aware methods still fall behind on hard classes. In this paper, we propose to construct a golden symmetric loss (GSL) based on the estimated corruption matrix, so as to avoid overfitting to noisy labels and to learn effectively from hard classes. GSL is the weighted sum of the corrected regular cross entropy and the reverse cross entropy. By leveraging a small fraction of trusted clean data, we estimate the corruption matrix and use it both to correct the loss and to determine the weights of GSL. We theoretically prove the robustness of the proposed loss function in the presence of dirty labels. We provide a heuristic to adaptively tune the loss weights of GSL according to the noise rate and diversity measured from the dataset. We evaluate our proposed golden symmetric loss on both vision and natural language deep models subject to different types of label noise patterns. Empirical results show that GSL can significantly outperform existing robust training methods across noise patterns, with accuracy improvements of up to 18% on CIFAR-100 and 1% on the real-world noisy dataset Clothing1M.
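The abstract describes GSL as a weighted sum of a corruption-corrected cross entropy and a reverse cross entropy. The sketch below illustrates that construction under stated assumptions: the forward correction multiplies the model's class probabilities by an estimated corruption matrix `T` before taking the log, and the reverse cross entropy follows the common convention of replacing `log 0` on the one-hot target by a negative constant `A`. The function name, the constant `A = -4`, and the exact weighting scheme are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def golden_symmetric_loss(probs, labels, T, alpha, beta, A=-4.0):
    """Illustrative sketch of a golden symmetric loss (GSL).

    probs:  (n, k) model class probabilities p(y | x)
    labels: (n,) observed (possibly noisy) integer labels
    T:      (k, k) estimated corruption matrix, T[i, j] = p(noisy j | true i)
    alpha, beta: weights of the two loss terms
    A:      stand-in value for log(0) in the reverse term (assumption)
    """
    n = len(labels)
    idx = np.arange(n)
    # Forward-corrected cross entropy: push predictions through T so the
    # model is compared against the noisy label distribution.
    corrected = probs @ T
    ce = -np.log(corrected[idx, labels] + 1e-12)
    # Reverse cross entropy with a one-hot target: all off-target mass is
    # penalized through the constant A, giving -A * (1 - p_y).
    rce = -A * (1.0 - probs[idx, labels])
    return float(np.mean(alpha * ce + beta * rce))
```

With an identity corruption matrix the corrected term reduces to plain cross entropy, so the sketch degenerates gracefully in the noise-free case; the paper's heuristic for choosing `alpha` and `beta` from the measured noise rate and diversity is not reproduced here.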

Original language: English
Title of host publication: 2023 SIAM International Conference on Data Mining, SDM 2023
Publisher: Society for Industrial and Applied Mathematics
Pages: 568-576
Number of pages: 9
ISBN (Electronic): 9781611977653
Publication status: Published - 2023
Event: 2023 SIAM International Conference on Data Mining, SDM 2023 - Minneapolis, United States
Duration: 27 Apr 2023 - 29 Apr 2023

Publication series

Name: 2023 SIAM International Conference on Data Mining, SDM 2023

Conference

Conference: 2023 SIAM International Conference on Data Mining, SDM 2023
Country/Territory: United States
City: Minneapolis
Period: 27/04/23 - 29/04/23

Keywords

  • Deep learning models
  • Noisy labels
  • Robust training
  • Symmetric loss function
