Optimistic recovery for iterative dataflows in action

Sergey Dudoladov, Chen Xu, Sebastian Schelter, Asterios Katsifodimos, Stephan Ewen, Kostas Tzoumas, Volker Markl

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

15 Citations (Scopus)


Over the past years, parallel dataflow systems have been employed for advanced analytics in the field of data mining where many algorithms are iterative. These systems typically provide fault tolerance by periodically checkpointing the algorithm's state and, in case of failure, restoring a consistent state from a checkpoint. In prior work, we presented an optimistic recovery mechanism that in certain cases eliminates the need to checkpoint the intermediate state of an iterative algorithm. In case of failure, our mechanism uses a compensation function to transit the algorithm to a consistent state, from which the execution can continue and successfully converge. Since this recovery mechanism does not checkpoint any state, it achieves optimal failure-free performance while guaranteeing fault tolerance. In this paper, we demonstrate our recovery mechanism with the Apache Flink data processing engine. During our demonstration, attendees will be able to run graph algorithms and trigger failures to observe the algorithms recovering with compensation functions instead of checkpoints.

Original languageEnglish
Title of host publicationSIGMOD'15 -
Subtitle of host publicationProceedings of the 2015 ACM SIGMOD International Conference on Management of Data
PublisherAssociation for Computing Machinery (ACM)
Number of pages5
ISBN (Print)978-1-4503-2758-9
Publication statusPublished - 27 May 2015
Externally publishedYes
EventACM SIGMOD International Conference on Management of Data, SIGMOD 2015 - Melbourne, Australia
Duration: 31 May 20154 Jun 2015


ConferenceACM SIGMOD International Conference on Management of Data, SIGMOD 2015


  • Fault-tolerance
  • Iterative algorithms
  • Optimistic recovery

Fingerprint Dive into the research topics of 'Optimistic recovery for iterative dataflows in action'. Together they form a unique fingerprint.

Cite this