On the latency-accuracy tradeoff in approximate MapReduce jobs

Juan F. Perez, Robert Birke, Lydia Y. Chen

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

4 Citations (Scopus)

Abstract

To ensure the scalability of big data analytics, approximate MapReduce platforms have emerged that explicitly trade off accuracy for latency. A key step in determining optimal approximation levels is capturing the latency of big data jobs, which has long been deemed challenging due to the complex dependencies among data inputs and map/reduce tasks. In this paper, we use matrix analytic methods to derive stochastic models that can predict a wide spectrum of latency metrics, e.g., averages, tails, and distributions, for approximate MapReduce jobs that are subject to input sampling and task dropping. In addition to capturing the dependency among waves of map/reduce tasks, our models incorporate two job scheduling policies, namely exclusive and overlapping, and two task dropping strategies, namely early and straggler, enabling us to realistically evaluate the potential performance gains of approximate computing. Our numerical analysis shows that the proposed models can guide big data platforms in determining the optimal approximation strategies and degrees of approximation.
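The tradeoff described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's matrix-analytic model; it is a toy simulation under assumptions that the abstract does not confirm: a single map stage, exponentially distributed task times, greedy scheduling onto a fixed number of slots (which produces waves when tasks outnumber slots), "early" dropping read as discarding tasks before they are scheduled, and "straggler" dropping read as ending the stage once the required fraction of tasks has finished. The names `job_latency`, `mean_latency`, and `keep_fraction` are hypothetical.

```python
import heapq
import random


def job_latency(task_times, slots, keep_fraction, strategy):
    """Latency of one map stage under a toy task-dropping strategy.

    task_times    : list of task durations, in scheduling order
    slots         : number of tasks that can run in parallel
    keep_fraction : fraction of tasks whose output is required (0 < f <= 1)
    strategy      : "early" (assumed: drop tasks before scheduling) or
                    "straggler" (assumed: run all, stop once enough finish)
    """
    needed = max(1, round(keep_fraction * len(task_times)))
    if strategy == "early":
        times = task_times[:needed]   # dropped tasks never run
    elif strategy == "straggler":
        times = task_times            # all tasks run; slow ones are abandoned
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    # Greedy list scheduling: each task starts on the slot that frees up first.
    free_at = [0.0] * slots
    heapq.heapify(free_at)
    finish_times = []
    for duration in times:
        start = heapq.heappop(free_at)
        finish_times.append(start + duration)
        heapq.heappush(free_at, start + duration)

    # The stage ends when the `needed`-th task completes.
    finish_times.sort()
    return finish_times[needed - 1]


def mean_latency(n_tasks, slots, keep_fraction, strategy, runs=2000, seed=0):
    """Average stage latency over `runs` samples of exponential task times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        times = [rng.expovariate(1.0) for _ in range(n_tasks)]
        total += job_latency(times, slots, keep_fraction, strategy)
    return total / runs


if __name__ == "__main__":
    for strategy in ("early", "straggler"):
        for keep in (1.0, 0.8, 0.5):
            latency = mean_latency(40, 10, keep, strategy)
            print(f"{strategy:9s} keep={keep:.1f} mean latency={latency:.3f}")
```

Lower values of `keep_fraction` stand in for a higher degree of approximation. The sketch only estimates mean latency by simulation, whereas the models in the paper are reported to predict averages, tails, and full distributions analytically.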

Original language: English
Title of host publication: INFOCOM 2017 - IEEE Conference on Computer Communications
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
ISBN (Electronic): 9781509053360
DOIs: https://doi.org/10.1109/INFOCOM.2017.8057038
Publication status: Published - 2 Oct 2017
Externally published: Yes
Event: 2017 IEEE Conference on Computer Communications, INFOCOM 2017: IEEE Conference on Computer Communications - Atlanta, United States
Duration: 1 May 2017 to 4 May 2017

Conference

Conference: 2017 IEEE Conference on Computer Communications, INFOCOM 2017
Country: United States
City: Atlanta
Period: 1/05/17 to 4/05/17

  • Cite this

    Perez, J. F., Birke, R., & Chen, L. Y. (2017). On the latency-accuracy tradeoff in approximate MapReduce jobs. In INFOCOM 2017 - IEEE Conference on Computer Communications [8057038] Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/INFOCOM.2017.8057038