Pipetune: Pipeline parallelism of hyper and system parameters tuning for deep learning clusters

Isabelly Rocha, Nathaniel Morris, Lydia Y. Chen, Pascal Felber, Robert Birke, Valerio Schiavoni

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

1 Citation (Scopus)

Abstract

DNN learning jobs are common in today's clusters due to advances in AI-driven services such as machine translation and image recognition. The most critical phase of these jobs for model performance and learning cost is the tuning of hyperparameters. Existing approaches use techniques such as early stopping criteria to reduce the impact of tuning on learning cost. However, these strategies do not consider the impact that certain hyperparameters and system parameters have on training time. This paper presents PIPETUNE, a framework for DNN learning jobs that addresses the trade-offs between these two types of parameters. PIPETUNE takes advantage of the high parallelism and recurring characteristics of such jobs to minimize the learning cost via pipelined, simultaneous tuning of both hyper and system parameters. Our experimental evaluation using three different types of workloads indicates that PIPETUNE reduces tuning time by up to 22.6% and speeds up training by up to 1.7×. PIPETUNE not only improves performance but also lowers energy consumption by up to 29%.

Original language: English
Title of host publication: Middleware 2020
Subtitle of host publication: Proceedings of the 2020 21st International Middleware Conference
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 89-104
Number of pages: 16
ISBN (Print): 978-1-4503-8153-6
Publication status: Published - 2020
Event: Middleware 2020: 21st International Middleware Conference - Delft, Netherlands
Duration: 7 Dec 2020 to 11 Dec 2020
Conference number: 21st

Conference

Conference: Middleware 2020
Country: Netherlands
City: Delft
Period: 7/12/20 to 11/12/20

Keywords

  • Accuracy time trade-off
  • Deep Neural Networks training
  • Parameter tuning
