Learning Distributions Generated by Single-Layer ReLU Networks in the Presence of Arbitrary Outliers

Saikiran Bulusu, G. Joseph, M. Cenk Gursoy, Pramod K. Varshney

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

We consider a set of data samples such that a fraction of the samples are arbitrary outliers, and the rest are the output samples of a single-layer neural network with rectified linear unit (ReLU) activation. Our goal is to estimate the parameters (weight matrix and bias vector) of the neural network, assuming the bias vector to be non-negative. We estimate the network parameters using the gradient descent algorithm combined with either a median-based or a trimmed-mean-based filter to mitigate the effect of the arbitrary outliers. We then prove that $\tilde{O}\left( \frac{1}{p^2}+\frac{1}{\epsilon^2 p}\right)$ samples and $\tilde{O}\left( \frac{d^2}{p^2}+ \frac{d^2}{\epsilon^2 p}\right)$ time are sufficient for our algorithm to estimate the neural network parameters within an error of $\epsilon$ when the outlier probability is $1-p$, where $2/3 < p \leq 1$ and the problem dimension is $d$ (log factors are ignored here). Our theoretical and simulation results provide insights into the training complexity of ReLU neural networks in terms of the probability of outliers and the problem dimension.
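For intuition, the following is a minimal sketch (in Python/NumPy, not taken from the paper) of the two robust filters named in the abstract: a coordinate-wise median and a coordinate-wise trimmed mean, used here to aggregate per-sample gradient estimates inside a generic gradient-descent loop. The per_sample_grad hook, the trim fraction, and all function names are illustrative assumptions; the paper's actual algorithm and guarantees are as stated above.

import numpy as np

def coordinatewise_median(G):
    # Aggregate per-sample gradients G (shape: n_samples x dim)
    # by taking the median of each coordinate across samples.
    return np.median(G, axis=0)

def coordinatewise_trimmed_mean(G, trim_frac=0.2):
    # For each coordinate, discard the trim_frac smallest and largest
    # values across samples, then average the remainder.
    n = G.shape[0]
    k = int(np.floor(trim_frac * n))
    G_sorted = np.sort(G, axis=0)  # sort each coordinate independently
    return G_sorted[k:n - k].mean(axis=0)

def robust_gradient_descent(theta, per_sample_grad, data, lr=0.1,
                            steps=100, aggregate=coordinatewise_median):
    # Generic GD loop: per_sample_grad(theta, data) is assumed to return
    # one gradient row per sample (shape: n_samples x dim); outlier
    # samples are suppressed by the robust aggregation filter.
    for _ in range(steps):
        G = per_sample_grad(theta, data)
        theta = theta - lr * aggregate(G)
    return theta

Sorting each coordinate independently is what lets both filters tolerate a constant fraction of arbitrarily corrupted samples: a single extreme sample can shift a plain mean arbitrarily far, but it cannot move the median, and it is discarded by the trimming step.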
Original language: English
Title of host publication: 36th Conference on Neural Information Processing Systems 2022
Editors: S. Koyejo
Number of pages: 11
ISBN (Electronic): 9781713871088
Publication status: Published - 2022
Event: 36th Conference on Neural Information Processing Systems - Hybrid Conference, New Orleans, United States
Duration: 28 Nov 2022 – 9 Dec 2022
Conference number: 36

Conference

Conference: 36th Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 28/11/22 – 9/12/22
