Workflow Optimization for Parallel Split Learning

Joana Tirana, Dimitra Tsigkari, George Iosifidis, Dimitris Chatzopoulos

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review


Abstract

Split learning (SL) has recently been proposed as a way to enable resource-constrained devices to train multi-parameter neural networks (NNs) and participate in federated learning (FL). In a nutshell, SL splits the NN model into parts, and allows clients (devices) to offload the largest part as a processing task to a computationally powerful helper. In parallel SL, multiple helpers can process model parts of one or more clients, thus considerably reducing the maximum training time over all clients (makespan). In this paper, we focus on orchestrating the workflow of this operation, which is critical in highly heterogeneous systems, as our experiments show. In particular, we formulate the joint problem of client-helper assignments and scheduling decisions with the goal of minimizing the training makespan, and we prove that it is NP-hard. We propose a solution method based on the decomposition of the problem by leveraging its inherent symmetry, and a second one that is fully scalable. A wealth of numerical evaluations using our testbed’s measurements allows us to build a solution strategy comprising these methods. Moreover, we show that this strategy finds a near-optimal solution, and achieves a shorter makespan than the baseline scheme by up to 52.3%.
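
To make the makespan objective concrete, the following is a minimal toy sketch, not the paper's formulation or either of its solution methods. It assumes a heavily simplified model in which each helper processes its assigned clients' offloaded tasks one after another, and it ignores client-side computation, communication delays, and the ordering of tasks. The function names (`makespan`, `greedy_assign`, `brute_force_assign`) and the processing times are hypothetical and chosen only for illustration.

```python
from itertools import product


def makespan(assignment, proc_time, num_helpers):
    """Largest total load over helpers; assignment[i] is the helper of client i."""
    load = [0.0] * num_helpers
    for client, helper in enumerate(assignment):
        load[helper] += proc_time[client][helper]
    return max(load)


def greedy_assign(proc_time, num_helpers):
    """Assign each client, in input order, to the helper that would finish it earliest."""
    load = [0.0] * num_helpers
    assignment = []
    for times in proc_time:
        helper = min(range(num_helpers), key=lambda h: load[h] + times[h])
        load[helper] += times[helper]
        assignment.append(helper)
    return assignment


def brute_force_assign(proc_time, num_helpers):
    """Exhaustive search over all assignments; only feasible for tiny instances."""
    candidates = product(range(num_helpers), repeat=len(proc_time))
    return list(min(candidates, key=lambda a: makespan(a, proc_time, num_helpers)))


# Hypothetical processing times in seconds: rows are clients, columns are helpers.
proc_time = [[4.0, 6.0], [3.0, 5.0], [2.0, 9.0], [5.0, 5.0]]
greedy = greedy_assign(proc_time, 2)
best = brute_force_assign(proc_time, 2)
print("greedy makespan:", makespan(greedy, proc_time, 2))
print("exhaustive makespan:", makespan(best, proc_time, 2))
```

On this made-up instance the greedy assignment yields a makespan of 10 seconds while exhaustive search finds 9 seconds, which hints at why assignment decisions matter when helpers are heterogeneous. The exhaustive search grows exponentially with the number of clients, so it serves only as a reference point for very small instances.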
Original language: English
Title of host publication: Proceedings of the IEEE INFOCOM 2024 - IEEE Conference on Computer Communications
Publisher: IEEE
Pages: 1331-1340
Number of pages: 10
ISBN (Electronic): 979-8-3503-8350-8
ISBN (Print): 979-8-3503-8351-5
DOIs
Publication status: Published - 2024
Event: IEEE INFOCOM 2024 - IEEE Conference on Computer Communications - Vancouver, Canada
Duration: 20 May 2024 to 23 May 2024

Conference

Conference: IEEE INFOCOM 2024 - IEEE Conference on Computer Communications
Country/Territory: Canada
City: Vancouver
Period: 20/05/24 to 23/05/24

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • Training
  • Federated learning
  • Computational modeling
  • Artificial neural networks
  • Task analysis
  • Optimization
