Catching the response time tail in the cloud

Sebastiano Spicuglia, Mathias Bjorkqvist, Lydia Y. Chen, Walter Binder

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

3 Citations (Scopus)


As modern service systems are pressured to offer competitive prices through cost-effective capacity planning, especially in cloud computing, service level agreements (SLAs) are becoming ever more sophisticated, i.e., they set targets on different percentiles of response times. However, predicting even the average response time is difficult, both for real systems and for abstracted queueing systems that typically simplify system details, and managing SLAs defined on various response time percentiles is harder still. To capture these percentiles efficiently, we first develop a novel, autonomic methodology, termed Burst Based Simulation, which combines burst profiling on real systems with complex, state-dependent simulations. Based on this methodology, we then construct an analysis for SLA management: the prediction of SLA violations given a certain request pattern. We evaluate our approach on two types of service systems, virtualized and bare-metal, across wide ranges of SLAs and traffic loads. The evaluation results show that our methodology achieves an average error below 15% when predicting different response time percentiles and accurately captures SLA violations.
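To make the percentile-based SLA notion in the abstract concrete, the following is a minimal Python sketch of checking measured response times against percentile targets and of computing a relative prediction error. It is not the authors' Burst Based Simulation; the function names (`percentile`, `sla_violations`, `relative_error`), the nearest-rank percentile definition, and the sample values are all assumptions made for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def sla_violations(samples, targets):
    """Return {percentile: observed value} for every target that is exceeded.

    `targets` maps a percentile (e.g. 95) to the maximum allowed
    response time at that percentile (hypothetical SLA format).
    """
    violations = {}
    for p, limit in targets.items():
        observed = percentile(samples, p)
        if observed > limit:
            violations[p] = observed
    return violations


def relative_error(predicted, measured):
    """Relative error of a predicted percentile against the measured one."""
    return abs(predicted - measured) / measured


# Illustrative data: response times of 1..100 ms (made up, not from the paper).
response_times_ms = list(range(1, 101))
sla_targets_ms = {50: 60, 95: 90}  # hypothetical SLA: p50 <= 60 ms, p95 <= 90 ms
print(sla_violations(response_times_ms, sla_targets_ms))
```

In this sketch the median target is met while the 95th-percentile target is not, which illustrates why an SLA defined on several percentiles cannot be judged from the average response time alone.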

Original language: English
Title of host publication: Proceedings of the 2015 IFIP/IEEE International Symposium on Integrated Network Management, IM 2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 6
ISBN (Electronic): 9783901882760
Publication status: Published - 1 Jan 2015
Externally published: Yes
Event: 14th IFIP/IEEE International Symposium on Integrated Network Management, IM 2015 - Ottawa, Canada
Duration: 11 May 2015 to 15 May 2015



