Abstract
The demand for additional performance, driven by the rapid growth in the size and importance of data-intensive applications, has considerably increased the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and expose a large number of configuration parameters so that operators can adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on workload runtime and present a simple static solution that reduces the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors, which continuously monitor the underlying system resources and detect contention. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see 34% and 54% reductions in runtime, respectively.
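As a rough illustration of the adaptive idea described in the abstract, the following Python sketch picks a new thread pool size from a measured I/O-wait fraction. It is not the paper's algorithm: the `AdaptivePoolController` class, the `simulate` helper, the threshold values, and the halve-or-increment policy are all assumptions chosen for brevity, and the I/O-wait readings are made-up inputs rather than measurements.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AdaptivePoolController:
    """Hypothetical controller that maps an I/O-wait reading to a pool size."""

    min_threads: int = 1
    max_threads: int = 8
    high_io_wait: float = 0.30  # assumed threshold: above this, treat as contention
    low_io_wait: float = 0.10   # assumed threshold: below this, treat as idle I/O

    def next_size(self, current: int, io_wait: float) -> int:
        """Return the pool size to use for the next monitoring interval."""
        if io_wait > self.high_io_wait:
            proposed = current // 2   # back off quickly under contention
        elif io_wait < self.low_io_wait:
            proposed = current + 1    # cautiously probe for more parallelism
        else:
            proposed = current        # readings in between: keep the size
        # Clamp to the configured bounds.
        return max(self.min_threads, min(self.max_threads, proposed))


def simulate(controller: AdaptivePoolController, start: int,
             readings: Iterable[float]) -> List[int]:
    """Apply the controller to a sequence of I/O-wait readings."""
    sizes = [start]
    for io_wait in readings:
        sizes.append(controller.next_size(sizes[-1], io_wait))
    return sizes


if __name__ == "__main__":
    # Illustrative readings only; a real executor would sample the OS.
    print(simulate(AdaptivePoolController(), 8, [0.5, 0.5, 0.05, 0.2]))
```
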
Original language | English |
---|---|
Title of host publication | Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference |
Subtitle of host publication | Proceedings of the 20th International Middleware Conference |
Place of Publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 176-188 |
Number of pages | 13 |
ISBN (Electronic) | 9781450370097 |
ISBN (Print) | 978-1-4503-7009-7 |
Publication status | Published - 13 Sept 2019 |
Event | ACM/IFIP 20th International Middleware Conference - UC Davis, Davis, CA, United States. Duration: 9 Dec 2019 → 13 Dec 2019. Conference number: 2019. http://2019.middleware-conference.org/ |
Publication series
Name | Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference |
---|---|
Conference
Conference | ACM/IFIP 20th International Middleware Conference |
---|---|
Abbreviated title | Middleware |
Country/Territory | United States |
City | Davis, CA |
Period | 9/12/19 → 13/12/19 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Apache Spark
- Big Data
- Self-Adaptive Executors
Datasets
Self-adaptive Executors for Big Data Processing
Omranian Khorasani, S. (Creator), Epema, D. H. J. (Contributor) & Rellermeyer, J. S. (Contributor), TU Delft - 4TU.ResearchData, 6 Sept 2019
DOI: 10.4121/uuid:38529ffe-00d0-42b0-9b3c-29d192262686
Dataset/Software: Dataset