Maverick Matters: Client Contribution and Selection in Federated Learning

Jiyue Huang, Chi Hong, Yang Liu, Lydia Y. Chen*, Stefanie Roos

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

Abstract

Federated learning (FL) enables collaborative learning between parties, called clients, without sharing the original and potentially sensitive data. To ensure fast convergence in the presence of heterogeneous clients, it is imperative to select, in a timely manner, clients who can effectively contribute to learning. A realistic but overlooked case of heterogeneous clients is that of Mavericks, who monopolize the possession of certain data types, e.g., children's hospitals possess most of the data on pediatric cardiology. In this paper, we address the importance and tackle the challenges of Mavericks by exploring two types of client selection strategies. First, we show theoretically and through simulations that the common contribution-based approach, Shapley Value, underestimates the contribution of Mavericks and is hence not effective as a measure to select clients. Then, we propose FedEMD, an adaptive strategy with competitive overhead based on the Wasserstein distance, supported by a proven convergence bound. As FedEMD adapts the selection probability such that Mavericks are preferably selected when the model benefits from improvement on rare classes, it consistently ensures fast convergence in the presence of different types of Mavericks. Compared to existing strategies, including Shapley Value-based ones, FedEMD improves the convergence speed of neural network classifiers with FedAvg aggregation by 26.9%, and its performance is consistent across various levels of heterogeneity.
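
To make the role of the Wasserstein distance more concrete, the following is a minimal sketch of how a client's label distribution could be compared against the pooled label distribution. It is an illustration only and not the FedEMD procedure itself, which the abstract does not spell out. The function names, the sample counts, and the choice to treat class indices as evenly spaced points on a line are all assumptions made for this example.

```python
from itertools import accumulate


def label_distribution(counts):
    """Normalize per-class sample counts into a probability vector."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one sample")
    return [c / total for c in counts]


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two discrete distributions.

    Assumption for this sketch: classes sit at positions 0, 1, 2, ...
    with unit spacing, so the distance equals the sum of absolute
    differences between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Hypothetical per-class sample counts for three clients and four classes.
# The last client plays the Maverick: it alone holds class 3.
client_counts = [
    [100, 100, 100, 0],
    [100, 100, 100, 0],
    [0, 0, 0, 300],
]

# Pooled (global) label distribution across all clients.
global_counts = [sum(column) for column in zip(*client_counts)]
global_dist = label_distribution(global_counts)

# Distance of each client's label distribution from the pooled one.
distances = [
    wasserstein_1d(label_distribution(counts), global_dist)
    for counts in client_counts
]

for index, distance in enumerate(distances):
    print(f"client {index}: distance to global = {distance:.3f}")
```

In this toy setup the Maverick's label distribution lies farther from the pooled distribution than those of the other clients. A selection strategy could use a signal of this kind when deciding how often to pick each client, although how FedEMD maps distances to selection probabilities is not described in the abstract.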

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023, Proceedings
Editors: Hisashi Kashima, Tsuyoshi Ide, Wen-Chih Peng
Publisher: Springer
Pages: 269-282
Number of pages: 14
ISBN (Print): 9783031333767
DOIs
Publication status: Published - 2023
Event: 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023 - Osaka, Japan
Duration: 25 May 2023 to 28 May 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13936 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023
Country/Territory: Japan
City: Osaka
Period: 25/05/23 to 28/05/23

Keywords

  • client selection
  • data heterogeneity
  • federated learning
  • Shapley value
  • Wasserstein distance
