Abstract
Edge-cloud jobs are rapidly becoming prevalent in many application domains, posing the challenge of using both resource-constrained edge devices and elastic cloud resources. Efficient resource allocation for such jobs via scheduling algorithms is essential to guarantee their performance, e.g., latency. Deep reinforcement learning (DRL) is increasingly adopted to make scheduling decisions but faces the conundrum of achieving high rewards at a low training overhead. It is unknown whether DRL can be applied to tune, in a timely manner, the scheduling algorithms adopted in response to fast-changing workloads and resources. In this paper, we propose EdgeTuner to effectively leverage DRL to select scheduling algorithms online for edge-cloud jobs. The enabling features of EdgeTuner are a sophisticated DRL model that captures the complex dynamics of edge-cloud jobs/tasks and an effective simulator that emulates the response times of short-running jobs under dynamically changing scheduling algorithms. EdgeTuner trains DRL agents offline by directly interacting with the simulator. We implement EdgeTuner on the Kubernetes scheduler and extensively evaluate it on a Kubernetes cluster testbed driven by production traces. Our results show that EdgeTuner outperforms prevailing scheduling algorithms by achieving significantly lower job response times while accelerating DRL training by more than 180x.
Original language | English |
---|---|
Title of host publication | INFOCOM 2022 - IEEE Conference on Computer Communications |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 880-889 |
Number of pages | 10 |
ISBN (Electronic) | 978-1-6654-5822-1 |
DOIs | |
Publication status | Published - 2022 |
Event | 41st IEEE Conference on Computer Communications, INFOCOM 2022 - Virtual, Online, United Kingdom; Duration: 2 May 2022 → 5 May 2022 |
Publication series
Name | Proceedings - IEEE INFOCOM |
---|---|
Volume | 2022-May |
ISSN (Print) | 0743-166X |
Conference
Conference | 41st IEEE Conference on Computer Communications, INFOCOM 2022 |
---|---|
Country/Territory | United Kingdom |
City | Virtual, Online |
Period | 2/05/22 → 5/05/22 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- DRL
- Edge-cloud workloads
- Kubernetes
- run-time tuning
- scheduling algorithm