Federated Learning With Heterogeneity-Aware Probabilistic Synchronous Parallel on Edge

Jianxin Zhao; Rui Han; Yongkai Yang; Benjamin Catterall; Chi Harold Liu; Lydia Y. Chen; Richard Mortier; Jon Crowcroft; Liang Wang

doi:10.1109/TSC.2021.3109910

Jianxin Zhao, Rui Han, Yongkai Yang^*, Benjamin Catterall, Chi Harold Liu, Lydia Y. Chen, Richard Mortier, Jon Crowcroft, Liang Wang

^*Corresponding author for this work

Data-Intensive Systems

Research output: Contribution to journal › Article › Scientific › peer-review

6 Citations (Scopus)

13 Downloads (Pure)

Abstract

With the massive amount of data generated from mobile devices and the increase of computing power of edge devices, the paradigm of Federated Learning has attracted great momentum. In federated learning, distributed and heterogeneous nodes collaborate to learn model parameters. However, while providing benefits such as privacy by design and reduced latency, the heterogeneous network presents challenges to the synchronisation methods, or barrier control methods, used in training, with regard to system progress and model convergence. The design of these barrier mechanisms is critical for the performance and scalability of federated learning systems. We propose a new barrier control technique called Probabilistic Synchronous Parallel (PSP). In contrast to existing mechanisms, it introduces a sampling primitive that composes with existing barrier control mechanisms to produce a family of mechanisms with improved convergence speed and scalability. Our proposal is supported with a convergence analysis of the PSP-based SGD algorithm. In practice, we also propose heuristic techniques that further improve the efficiency of PSP. We evaluate the performance of the proposed methods using the federated-learning-specific FEMNIST dataset. The evaluation results show that PSP can effectively achieve a good balance between system efficiency and model accuracy, mitigating the challenge of heterogeneity in federated learning.
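
The abstract describes PSP as a sampling primitive that composes with existing barrier control mechanisms. The following is a minimal sketch of that composition idea only, not the authors' implementation: the names `bsp`, `ssp`, `probabilistic`, `sample_size` and `staleness` are hypothetical, and it assumes each worker can look up the current step of the peers it samples.

```python
import random
from typing import Callable, Dict, List, Optional

# Illustrative sketch; all names are hypothetical, not taken from the paper.
# A barrier decides whether a worker at `my_step` may advance, given the
# steps of the peers it has observed.
Barrier = Callable[[int, List[int]], bool]
Check = Callable[[str, Dict[str, int]], bool]


def bsp(my_step: int, peer_steps: List[int]) -> bool:
    """Bulk synchronous: advance only if no observed peer is behind."""
    return all(s >= my_step for s in peer_steps)


def ssp(staleness: int) -> Barrier:
    """Stale synchronous: tolerate peers lagging by at most `staleness` steps."""
    def check(my_step: int, peer_steps: List[int]) -> bool:
        return all(my_step - s <= staleness for s in peer_steps)
    return check


def probabilistic(barrier: Barrier, sample_size: int,
                  rng: Optional[random.Random] = None) -> Check:
    """Compose a sampling step with an existing barrier.

    The barrier is evaluated on a random sample of peers rather than on
    every peer. Sampling all peers recovers the original barrier; sampling
    none lets the worker advance unconditionally.
    """
    rng = rng or random.Random()

    def check(worker: str, steps: Dict[str, int]) -> bool:
        peers = [w for w in steps if w != worker]
        sampled = rng.sample(peers, min(sample_size, len(peers)))
        return barrier(steps[worker], [steps[w] for w in sampled])

    return check


# Worker "c" is a straggler, as might happen on a heterogeneous edge network.
steps = {"a": 5, "b": 5, "c": 1}
full_bsp = probabilistic(bsp, sample_size=2)       # samples both peers
no_barrier = probabilistic(bsp, sample_size=0)     # samples nobody
sampled_bsp = probabilistic(bsp, sample_size=1, rng=random.Random(0))
```

In this sketch, `full_bsp("a", steps)` blocks worker `a` because the straggler is always observed, while `sampled_bsp("a", steps)` blocks it only when the straggler happens to be drawn. The sample size is the knob that trades synchronisation strictness for progress.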

Original language	English
Article number	9529051
Pages (from-to)	614-626
Number of pages	13
Journal	IEEE Transactions on Services Computing
Volume	15
Issue number	2
DOIs	https://doi.org/10.1109/TSC.2021.3109910
Publication status	Published - 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Federated learning
edge computing
distributed computing
barrier control

Access to Document

10.1109/TSC.2021.3109910

Federated_Learning_With_Heterogeneity-Aware_Probabilistic_Synchronous_Parallel_on_Edge: Final published version, 2.01 MB

Cite this

@article{828dd13b2d574e96b77041b4b5c715ea,

title = "Federated Learning With Heterogeneity-Aware Probabilistic Synchronous Parallel on Edge",

abstract = "With the massive amount of data generated from mobile devices and the increase of computing power of edge devices, the paradigm of Federated Learning has attracted great momentum. In federated learning, distributed and heterogeneous nodes collaborate to learn model parameters. However, while providing benefits such as privacy by design and reduced latency, the heterogeneous network presents challenges to the synchronisation methods, or barrier control methods, used in training, with regard to system progress and model convergence. The design of these barrier mechanisms is critical for the performance and scalability of federated learning systems. We propose a new barrier control technique called Probabilistic Synchronous Parallel (PSP). In contrast to existing mechanisms, it introduces a sampling primitive that composes with existing barrier control mechanisms to produce a family of mechanisms with improved convergence speed and scalability. Our proposal is supported with a convergence analysis of the PSP-based SGD algorithm. In practice, we also propose heuristic techniques that further improve the efficiency of PSP. We evaluate the performance of the proposed methods using the federated-learning-specific FEMNIST dataset. The evaluation results show that PSP can effectively achieve a good balance between system efficiency and model accuracy, mitigating the challenge of heterogeneity in federated learning.",

keywords = "Federated learning, edge computing, distributed computing, barrier control",

author = "Jianxin Zhao and Rui Han and Yongkai Yang and Benjamin Catterall and Liu, {Chi Harold} and Chen, {Lydia Y.} and Richard Mortier and Jon Crowcroft and Liang Wang",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.",

year = "2022",

doi = "10.1109/TSC.2021.3109910",

language = "English",

volume = "15",

pages = "614--626",

journal = "IEEE Transactions on Services Computing",

issn = "1939-1374",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

number = "2",

}

TY - JOUR

T1 - Federated Learning With Heterogeneity-Aware Probabilistic Synchronous Parallel on Edge

AU - Zhao, Jianxin

AU - Han, Rui

AU - Yang, Yongkai

AU - Catterall, Benjamin

AU - Liu, Chi Harold

AU - Chen, Lydia Y.

AU - Mortier, Richard

AU - Crowcroft, Jon

AU - Wang, Liang

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2022

Y1 - 2022

N2 - With the massive amount of data generated from mobile devices and the increase of computing power of edge devices, the paradigm of Federated Learning has attracted great momentum. In federated learning, distributed and heterogeneous nodes collaborate to learn model parameters. However, while providing benefits such as privacy by design and reduced latency, the heterogeneous network presents challenges to the synchronisation methods, or barrier control methods, used in training, with regard to system progress and model convergence. The design of these barrier mechanisms is critical for the performance and scalability of federated learning systems. We propose a new barrier control technique called Probabilistic Synchronous Parallel (PSP). In contrast to existing mechanisms, it introduces a sampling primitive that composes with existing barrier control mechanisms to produce a family of mechanisms with improved convergence speed and scalability. Our proposal is supported with a convergence analysis of the PSP-based SGD algorithm. In practice, we also propose heuristic techniques that further improve the efficiency of PSP. We evaluate the performance of the proposed methods using the federated-learning-specific FEMNIST dataset. The evaluation results show that PSP can effectively achieve a good balance between system efficiency and model accuracy, mitigating the challenge of heterogeneity in federated learning.

AB - With the massive amount of data generated from mobile devices and the increase of computing power of edge devices, the paradigm of Federated Learning has attracted great momentum. In federated learning, distributed and heterogeneous nodes collaborate to learn model parameters. However, while providing benefits such as privacy by design and reduced latency, the heterogeneous network presents challenges to the synchronisation methods, or barrier control methods, used in training, with regard to system progress and model convergence. The design of these barrier mechanisms is critical for the performance and scalability of federated learning systems. We propose a new barrier control technique called Probabilistic Synchronous Parallel (PSP). In contrast to existing mechanisms, it introduces a sampling primitive that composes with existing barrier control mechanisms to produce a family of mechanisms with improved convergence speed and scalability. Our proposal is supported with a convergence analysis of the PSP-based SGD algorithm. In practice, we also propose heuristic techniques that further improve the efficiency of PSP. We evaluate the performance of the proposed methods using the federated-learning-specific FEMNIST dataset. The evaluation results show that PSP can effectively achieve a good balance between system efficiency and model accuracy, mitigating the challenge of heterogeneity in federated learning.

KW - Federated learning

KW - edge computing

KW - distributed computing

KW - barrier control

UR - http://www.scopus.com/inward/record.url?scp=85114711489&partnerID=8YFLogxK

U2 - 10.1109/TSC.2021.3109910

DO - 10.1109/TSC.2021.3109910

M3 - Article

SN - 1939-1374

VL - 15

SP - 614

EP - 626

JO - IEEE Transactions on Services Computing

JF - IEEE Transactions on Services Computing

IS - 2

M1 - 9529051

ER -
