TY - GEN
T1 - Federated learning for cyber security
AU - Khramtsova, Ekaterina
AU - Hammerschmidt, Christian
AU - Lagraa, Sofian
AU - State, Radu
PY - 2020
Y1 - 2020
N2 - Managed security service providers increasingly rely on machine-learning methods to exceed traditional, signature-based threat detection and classification methods. As machine learning often improves as more data becomes available, smaller organizations and clients find themselves at a disadvantage: without the ability to share their data and without others willing to collaborate, their machine-learned threat detection will perform worse than the same model in a larger organization. We show that Federated Learning, i.e. collaborative learning without data sharing, successfully helps to overcome this problem. Our experiments focus on a common task in cyber security, the detection of unwanted URLs in network traffic seen by security-as-a-service providers. Our experiments show that i) smaller participants benefit from larger participants; ii) participants seeing different types of malicious traffic can generalize better to unseen types of attacks, increasing performance by 8% to 15% on average, and up to 27% in the extreme case; and iii) participating in Federated training never harms the performance of the locally trained model. In our experiment modeling a security-as-a-service setting, Federated Learning increased detection by up to 30% for some participants in the scheme. This clearly shows that Federated Learning is a viable approach to address issues of data sharing in common cyber security settings.
AB - Managed security service providers increasingly rely on machine-learning methods to exceed traditional, signature-based threat detection and classification methods. As machine learning often improves as more data becomes available, smaller organizations and clients find themselves at a disadvantage: without the ability to share their data and without others willing to collaborate, their machine-learned threat detection will perform worse than the same model in a larger organization. We show that Federated Learning, i.e. collaborative learning without data sharing, successfully helps to overcome this problem. Our experiments focus on a common task in cyber security, the detection of unwanted URLs in network traffic seen by security-as-a-service providers. Our experiments show that i) smaller participants benefit from larger participants; ii) participants seeing different types of malicious traffic can generalize better to unseen types of attacks, increasing performance by 8% to 15% on average, and up to 27% in the extreme case; and iii) participating in Federated training never harms the performance of the locally trained model. In our experiment modeling a security-as-a-service setting, Federated Learning increased detection by up to 30% for some participants in the scheme. This clearly shows that Federated Learning is a viable approach to address issues of data sharing in common cyber security settings.
KW - cyber-security
KW - Federated-learning
KW - Machine-learning
UR - http://www.scopus.com/inward/record.url?scp=85101996393&partnerID=8YFLogxK
U2 - 10.1109/ICDCS47774.2020.00171
DO - 10.1109/ICDCS47774.2020.00171
M3 - Conference contribution
AN - SCOPUS:85101996393
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 1316
EP - 1321
BT - Proceedings - 2020 IEEE 40th International Conference on Distributed Computing Systems, ICDCS 2020
PB - Institute of Electrical and Electronics Engineers (IEEE)
T2 - 40th IEEE International Conference on Distributed Computing Systems, ICDCS 2020
Y2 - 29 November 2020 through 1 December 2020
ER -