FEVERLESS: Fast and Secure Vertical Federated Learning based on XGBoost for Decentralized Labels

Rui Wang; Oguzhan Ersoy; Hangyu Zhu; Yaochu Jin; Kaitai Liang

doi:10.1109/TBDATA.2022.3227326

Cyber Security

Research output: Contribution to journal › Article › Scientific › peer-review

1 Citation (Scopus)

Abstract

Vertical Federated Learning (VFL) enables multiple clients to collaboratively train a global model over vertically partitioned data without leaking private local information. Tree-based models, like XGBoost and LightGBM, have been widely used in VFL to enhance the interpretation and efficiency of training. However, there is a fundamental lack of research on how to conduct VFL securely over distributed labels. This work is the first to fill this gap by designing a novel protocol, called FEVERLESS, based on XGBoost. FEVERLESS leverages secure aggregation via information masking technique and global differential privacy provided by a fairly and randomly selected noise leader to prevent private information from being leaked in the training process. Furthermore, it provides label and data privacy against honest-but-curious adversaries even in the case of collusion of $n - 2$ out of $n$ clients. We present a comprehensive security and efficiency analysis for our design, and the empirical results from our experiments demonstrate that FEVERLESS is fast and secure. In particular, it outperforms the solution based on additive homomorphic encryption in runtime cost and provides better accuracy than the local differential privacy approach.
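
As a rough illustration of the two ingredients the abstract names (masking-based secure aggregation and noise added by a single selected client), the following Python sketch shows how pairwise masks can cancel in a sum. It is not the FEVERLESS protocol: the modulus, the function names, the seeded generator standing in for pairwise shared secrets, and the plain random choice of a noise leader are all assumptions made for illustration, and the noise is passed in as a fixed integer rather than sampled from a calibrated differential privacy mechanism.

```python
import random

MODULUS = 2**32  # hypothetical ring size for the masked arithmetic


def pairwise_masks(n_clients, rng):
    """Return one mask per client; together the masks sum to 0 mod MODULUS."""
    masks = [0] * n_clients
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            # Stand-in for a secret shared between clients i and j.
            shared = rng.randrange(MODULUS)
            masks[i] = (masks[i] + shared) % MODULUS
            masks[j] = (masks[j] - shared) % MODULUS
    return masks


def aggregate(values, noise=0, seed=0):
    """Sum integer client values after masking; one client also adds noise."""
    rng = random.Random(seed)
    n_clients = len(values)
    masks = pairwise_masks(n_clients, rng)
    # Assumption: a plain random draw replaces the paper's fair leader selection.
    leader = rng.randrange(n_clients)
    masked = []
    for i, value in enumerate(values):
        if i == leader:
            value += noise  # only the noise leader perturbs its contribution
        masked.append((value + masks[i]) % MODULUS)
    # Each masked value looks random on its own; the masks cancel in the sum.
    return sum(masked) % MODULUS
```

Calling `aggregate([3, 5, 7])` recovers the plain sum because every pairwise mask is added by one client and subtracted by another, while a nonzero `noise` shifts the aggregate by exactly that amount regardless of which client was chosen as leader.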

Original language	English
Pages (from-to)	1-15
Number of pages	15
Journal	IEEE Transactions on Big Data
DOIs	https://doi.org/10.1109/TBDATA.2022.3227326
Publication status	E-pub ahead of print - 2022

Keywords

Vertical federated learning
XGBoost
privacy preservation
secure aggregation
differential privacy

Access to Document

10.1109/TBDATA.2022.3227326

Cite this

@article{3879fc9f9a6c44a9ae86824e3094a8cd,

title = "{FEVERLESS}: Fast and Secure Vertical Federated Learning based on {XGBoost} for Decentralized Labels",

abstract = "Vertical Federated Learning (VFL) enables multiple clients to collaboratively train a global model over vertically partitioned data without leaking private local information. Tree-based models, like XGBoost and LightGBM, have been widely used in VFL to enhance the interpretation and efficiency of training. However, there is a fundamental lack of research on how to conduct VFL securely over distributed labels. This work is the first to fill this gap by designing a novel protocol, called FEVERLESS, based on XGBoost. FEVERLESS leverages secure aggregation via information masking technique and global differential privacy provided by a fairly and randomly selected noise leader to prevent private information from being leaked in the training process. Furthermore, it provides label and data privacy against honest-but-curious adversaries even in the case of collusion of $n - 2$ out of n clients. We present a comprehensive security and efficiency analysis for our design, and the empirical results from our experiments demonstrate that FEVERLESS is fast and secure. In particular, it outperforms the solution based on additive homomorphic encryption in runtime cost and provides better accuracy than the local differential privacy approach.",

keywords = "Vertical federated learning, XGBoost, privacy preservation, secure aggregation, differential privacy",

author = "Rui Wang and Oguzhan Ersoy and Hangyu Zhu and Yaochu Jin and Kaitai Liang",

year = "2022",

doi = "10.1109/TBDATA.2022.3227326",

language = "English",

pages = "1--15",

journal = "IEEE Transactions on Big Data",

issn = "2332-7790",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

}

TY - JOUR

T1 - FEVERLESS: Fast and Secure Vertical Federated Learning based on XGBoost for Decentralized Labels

AU - Wang, Rui

AU - Ersoy, Oguzhan

AU - Zhu, Hangyu

AU - Jin, Yaochu

AU - Liang, Kaitai

PY - 2022

Y1 - 2022

AB - Vertical Federated Learning (VFL) enables multiple clients to collaboratively train a global model over vertically partitioned data without leaking private local information. Tree-based models, like XGBoost and LightGBM, have been widely used in VFL to enhance the interpretation and efficiency of training. However, there is a fundamental lack of research on how to conduct VFL securely over distributed labels. This work is the first to fill this gap by designing a novel protocol, called FEVERLESS, based on XGBoost. FEVERLESS leverages secure aggregation via information masking technique and global differential privacy provided by a fairly and randomly selected noise leader to prevent private information from being leaked in the training process. Furthermore, it provides label and data privacy against honest-but-curious adversaries even in the case of collusion of $n - 2$ out of n clients. We present a comprehensive security and efficiency analysis for our design, and the empirical results from our experiments demonstrate that FEVERLESS is fast and secure. In particular, it outperforms the solution based on additive homomorphic encryption in runtime cost and provides better accuracy than the local differential privacy approach.

KW - Vertical federated learning

KW - XGBoost

KW - privacy preservation

KW - secure aggregation

KW - differential privacy

UR - http://www.scopus.com/inward/record.url?scp=85144795394&partnerID=8YFLogxK

U2 - 10.1109/TBDATA.2022.3227326

DO - 10.1109/TBDATA.2022.3227326

M3 - Article

AN - SCOPUS:85144795394

SN - 2332-7790

SP - 1

EP - 15

JO - IEEE Transactions on Big Data

JF - IEEE Transactions on Big Data

ER -
