Robust Anomaly Detection on Unreliable Data

Zilong Zhao; Sophie Cerf; Robert Birke; Bogdan Robu; Sara Bouchenak; Sonia Ben Mokhtar; Lydia Y. Chen

doi:10.1109/DSN.2019.00068

Data-Intensive Systems

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

29 Citations (Scopus)

Abstract

Classification algorithms have been widely adopted to detect anomalies in various systems, e.g., IoT and cloud, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the field can be unreliable due to careless annotations or malicious data transformations intended to cause incorrect anomaly detection. In this paper, we present a two-layer learning framework for robust anomaly detection (RAD) in the presence of unreliable anomaly labels. The first layer, a quality model, filters out suspicious data, while the second layer, a classification model, detects the anomaly types. We specifically focus on two use cases: (i) detecting 10 classes of IoT attacks and (ii) predicting 4 classes of task failures of big data jobs. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, reaching up to 98% for IoT device attacks (i.e., +11%) and up to 83% for cloud task failures (i.e., +20%), under a significant percentage of altered anomaly labels. Index Terms: Unreliable Data; Anomaly Detection; Failures; Attacks; Machine Learning.
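The two-layer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the models or the filtering rule, so the sketch assumes (i) a small trusted set of correctly labelled samples is available, (ii) a toy nearest-centroid classifier stands in for both layers, and (iii) a sample counts as "suspicious" when the quality model's prediction disagrees with its given label. The names `fit_centroids`, `predict`, and `two_layer_sketch` are hypothetical.

```python
from collections import defaultdict


def fit_centroids(X, y):
    """Toy nearest-centroid model: one mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def two_layer_sketch(X_trusted, y_trusted, X_noisy, y_noisy):
    # Layer 1 (quality model): trained on the trusted set only (assumption).
    quality = fit_centroids(X_trusted, y_trusted)
    # Assumed filtering rule: keep a noisy sample only if the quality model
    # agrees with the label it arrived with.
    kept = [(x, label) for x, label in zip(X_noisy, y_noisy)
            if predict(quality, x) == label]
    # Layer 2 (classification model): trained on trusted + retained samples.
    X_train = list(X_trusted) + [x for x, _ in kept]
    y_train = list(y_trusted) + [label for _, label in kept]
    return fit_centroids(X_train, y_train), kept


# Synthetic two-class data, invented purely for illustration.
X_trusted = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]]
y_trusted = ["normal"] * 3 + ["attack"] * 3
X_noisy = [[0.5, 0.5], [1, 1], [10.5, 10.5], [9, 9], [0, 0.5], [10, 9.5]]
# The second and fourth labels are deliberately flipped.
y_noisy = ["normal", "attack", "attack", "normal", "normal", "attack"]

classifier, kept = two_layer_sketch(X_trusted, y_trusted, X_noisy, y_noisy)
print(len(kept), "of", len(X_noisy), "noisy samples retained")
print(predict(classifier, [0.2, 0.3]), predict(classifier, [9.5, 10.0]))
```

In this toy setting the two flipped labels are filtered out before the second model is trained. Real deployments would need a stronger quality model and a filtering rule tuned to the expected noise level, and the paper itself should be consulted for the actual design.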

Original language	English
Title of host publication	Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019
Subtitle of host publication	Proceedings
Place of Publication	Piscataway
Publisher	IEEE
Pages	630-637
Number of pages	8
ISBN (Electronic)	9781728100562
ISBN (Print)	978-1-7281-0058-6
DOIs	https://doi.org/10.1109/DSN.2019.00068
Publication status	Published - 2019
Event	49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019 - Portland, United States Duration: 24 Jun 2019 → 27 Jun 2019

Publication series

Name	Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019

Conference

Conference	49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019
Country/Territory	United States
City	Portland
Period	24/06/19 → 27/06/19

Keywords

Anomaly Detection
Attacks
Failures
Machine Learning
Unreliable Data

Access to Document

10.1109/DSN.2019.00068

Cite this

Zhao, Z., Cerf, S., Birke, R., Robu, B., Bouchenak, S., Ben Mokhtar, S., & Chen, L. Y. (2019). Robust Anomaly Detection on Unreliable Data. In Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019: Proceedings (pp. 630-637). Article 8809512 (Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019). IEEE. https://doi.org/10.1109/DSN.2019.00068

@inproceedings{8076019570a142f6a0a26afeb39ce67e,
title = "Robust Anomaly Detection on Unreliable Data",
abstract = "Classification algorithms have been widely adopted to detect anomalies in various systems, e.g., IoT and cloud, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the field can be unreliable due to careless annotations or malicious data transformations intended to cause incorrect anomaly detection. In this paper, we present a two-layer learning framework for robust anomaly detection (RAD) in the presence of unreliable anomaly labels. The first layer, a quality model, filters out suspicious data, while the second layer, a classification model, detects the anomaly types. We specifically focus on two use cases: (i) detecting 10 classes of IoT attacks and (ii) predicting 4 classes of task failures of big data jobs. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, reaching up to 98% for IoT device attacks (i.e., +11%) and up to 83% for cloud task failures (i.e., +20%), under a significant percentage of altered anomaly labels. Index Terms: Unreliable Data; Anomaly Detection; Failures; Attacks; Machine Learning.",
keywords = "Anomaly Detection, Attacks, Failures, Machine Learning, Unreliable Data",
author = "Zilong Zhao and Sophie Cerf and Robert Birke and Bogdan Robu and Sara Bouchenak and {Ben Mokhtar}, Sonia and Chen, {Lydia Y.}",
year = "2019",
doi = "10.1109/DSN.2019.00068",
language = "English",
isbn = "978-1-7281-0058-6",
series = "Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019",
publisher = "IEEE",
pages = "630--637",
booktitle = "Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019",
address = "United States",
note = "49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019 ; Conference date: 24-06-2019 Through 27-06-2019",
}

Zhao, Z, Cerf, S, Birke, R, Robu, B, Bouchenak, S, Ben Mokhtar, S & Chen, LY 2019, Robust Anomaly Detection on Unreliable Data. in Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019: Proceedings., 8809512, Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019, IEEE, Piscataway, pp. 630-637, 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019, Portland, United States, 24/06/19. https://doi.org/10.1109/DSN.2019.00068

Robust Anomaly Detection on Unreliable Data. / Zhao, Zilong; Cerf, Sophie; Birke, Robert et al.
Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019: Proceedings. Piscataway: IEEE, 2019. p. 630-637 8809512 (Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019).

TY  - GEN
T1  - Robust Anomaly Detection on Unreliable Data
AU  - Zhao, Zilong
AU  - Cerf, Sophie
AU  - Birke, Robert
AU  - Robu, Bogdan
AU  - Bouchenak, Sara
AU  - Ben Mokhtar, Sonia
AU  - Chen, Lydia Y.
PY  - 2019
Y1  - 2019
AB  - Classification algorithms have been widely adopted to detect anomalies in various systems, e.g., IoT and cloud, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the field can be unreliable due to careless annotations or malicious data transformations intended to cause incorrect anomaly detection. In this paper, we present a two-layer learning framework for robust anomaly detection (RAD) in the presence of unreliable anomaly labels. The first layer, a quality model, filters out suspicious data, while the second layer, a classification model, detects the anomaly types. We specifically focus on two use cases: (i) detecting 10 classes of IoT attacks and (ii) predicting 4 classes of task failures of big data jobs. Our evaluation results show that RAD can robustly improve the accuracy of anomaly detection, reaching up to 98% for IoT device attacks (i.e., +11%) and up to 83% for cloud task failures (i.e., +20%), under a significant percentage of altered anomaly labels. Index Terms: Unreliable Data; Anomaly Detection; Failures; Attacks; Machine Learning.
KW  - Anomaly Detection
KW  - Attacks
KW  - Failures
KW  - Machine Learning
KW  - Unreliable Data
UR  - http://www.scopus.com/inward/record.url?scp=85072121763&partnerID=8YFLogxK
U2  - 10.1109/DSN.2019.00068
DO  - 10.1109/DSN.2019.00068
M3  - Conference contribution
SN  - 978-1-7281-0058-6
T3  - Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019
SP  - 630
EP  - 637
BT  - Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019
PB  - IEEE
CY  - Piscataway
T2  - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019
Y2  - 24 June 2019 through 27 June 2019
ER  -

Zhao Z, Cerf S, Birke R, Robu B, Bouchenak S, Ben Mokhtar S et al. Robust Anomaly Detection on Unreliable Data. In Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019: Proceedings. Piscataway: IEEE. 2019. p. 630-637. 8809512. (Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019). doi: 10.1109/DSN.2019.00068
