Automating Deep Neural Network Model Selection for Edge Inference

Bingqian Lu, Jianyi Yang, Lydia Y. Chen, Shaolei Ren

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

8 Citations (Scopus)

Abstract

The ever-increasing size of deep neural network (DNN) models once implied that they were limited to cloud data centers for runtime inference. Nonetheless, the recent plethora of DNN model compression techniques has successfully overcome this limit, making it a reality that DNN-based inference can run on numerous resource-constrained edge devices, including mobile phones, drones, robots, medical devices, wearables, and Internet of Things devices, among many others. Naturally, edge devices are highly heterogeneous in terms of hardware specifications and usage scenarios. On the other hand, compressed DNN models are so diverse that they exhibit different tradeoffs in a multi-dimensional space, and no single model can achieve optimality in terms of all important metrics such as accuracy, latency, and energy consumption. Consequently, how to automatically select a compressed DNN model for an edge device to run inference with optimal quality of experience (QoE) arises as a new challenge. The state-of-the-art approaches either choose a common model for all/most devices, which is optimal for a small fraction of edge devices at best, or apply device-specific DNN model compression, which is not scalable. In this paper, by leveraging the predictive power of machine learning and keeping end users in the loop, we envision an automated device-level DNN model selection engine for QoE-optimal edge inference. To concretize our vision, we formulate the DNN model selection problem as a contextual multi-armed bandit problem, where features of edge devices and DNN models are contexts, and pre-trained DNN models are arms selected online based on the history of actions and users' QoE feedback. We develop an efficient online learning algorithm to balance exploration and exploitation. Our preliminary simulation results validate our algorithm and highlight the potential of machine learning for automating DNN model selection to achieve QoE-optimal edge inference.
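
For readers unfamiliar with contextual bandits, the minimal sketch below illustrates the kind of online model-selection loop the abstract describes: device and model features form the context, pre-trained compressed DNN models are the arms, and the user's QoE feedback is the reward that drives exploration versus exploitation. The LinUCB-style selector, class names, and parameters here are illustrative assumptions only; the paper does not disclose its exact algorithm, and this sketch should not be read as the authors' method.

```python
import numpy as np

class LinUCBModelSelector:
    """Illustrative LinUCB-style contextual bandit for choosing a compressed
    DNN model (arm) for an edge device, given a context built from device and
    model features. A generic sketch, not the paper's actual algorithm."""

    def __init__(self, n_models, context_dim, alpha=1.0):
        self.alpha = alpha  # exploration strength
        # Per-arm ridge-regression statistics for estimating expected QoE.
        self.A = [np.eye(context_dim) for _ in range(n_models)]
        self.b = [np.zeros(context_dim) for _ in range(n_models)]

    def select(self, contexts):
        """contexts[i]: feature vector for (device, candidate model i)."""
        scores = []
        for i, x in enumerate(contexts):
            A_inv = np.linalg.inv(self.A[i])
            theta = A_inv @ self.b[i]                   # estimated QoE weights
            ucb = theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)  # optimism bonus
            scores.append(ucb)
        return int(np.argmax(scores))                   # index of model to deploy

    def update(self, model_idx, x, qoe_feedback):
        """Fold the observed user QoE for the chosen model into its statistics."""
        self.A[model_idx] += np.outer(x, x)
        self.b[model_idx] += qoe_feedback * x

# Hypothetical usage: 3 candidate compressed models, 5-dimensional contexts.
selector = LinUCBModelSelector(n_models=3, context_dim=5)
contexts = [np.random.rand(5) for _ in range(3)]        # device+model features
chosen = selector.select(contexts)
selector.update(chosen, contexts[chosen], qoe_feedback=0.8)  # observed QoE
```

The per-arm upper-confidence bound is one common way to balance exploration and exploitation in contextual bandits; the paper's own algorithm may differ in how it models contexts and confidence.
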
Original language: English
Title of host publication: 2019 IEEE First International Conference on Cognitive Machine Intelligence (CogMI)
Editors: C. Ceballos
Place of Publication: Piscataway
Publisher: IEEE
Pages: 184-193
Number of pages: 10
ISBN (Electronic): 978-1-7281-6737-4
ISBN (Print): 978-1-7281-6738-1
DOIs
Publication status: Published - 2019
Event: 1st IEEE International Conference on Cognitive Machine Intelligence - Los Angeles, United States
Duration: 12 Dec 2019 - 14 Dec 2019
Conference number: 1
http://www.sis.pitt.edu/lersais/cogmi/2019/

Conference

Conference: 1st IEEE International Conference on Cognitive Machine Intelligence
Abbreviated title: CogMI 2019
Country/Territory: United States
City: Los Angeles
Period: 12/12/19 - 14/12/19
Internet address: http://www.sis.pitt.edu/lersais/cogmi/2019/

Keywords

  • Deep neural network
  • Edge inference
  • Model selection
  • Multi-armed bandit
  • Online learning
  • Quality of experience

