Democratizing Scalable Cloud Applications: Transactional Stateful Functions on Streaming Dataflows

Research output: Thesis › Dissertation (TU Delft)


Abstract

Web applications power almost every aspect of our digitalized society, from entertainment to web shopping, vacation planning and booking, online games, communication, work, and social interaction. However, building scalable and consistent web applications in modern cloud environments requires extensive and diverse expertise in multiple domains, such as cloud computing, software development, and distributed and database systems, in addition to domain knowledge. These requirements restrict the development of such applications to a few highly talented individuals whom only large corporations can hire. In this thesis, we aim to democratize the development and maintenance of such cloud applications by identifying and addressing three key challenges: (i) programmability of cloud applications; (ii) high-performance serializable transactions with fault tolerance guarantees; and (iii) serverless semantics. To address these challenges, we created Stateflow, a high-level, object-oriented, easy-to-use programming model that operates alongside Styx, a novel deterministic dataflow engine that provides high-performance serializable transactions and serverless semantics.

While investigating the challenge of democratizing scalable cloud applications, we discovered that they closely follow the principles behind the streaming dataflow execution model. In Chapter 1, we highlight the similarities between streaming dataflow processing and the current state-of-the-art event-driven microservice architectures and lay a path towards the ideal cloud application runtime. To validate our hypothesis, we created T-Statefun, presented in Chapter 2, by adapting an existing dataflow system to support transactional cloud applications. At the time, the best candidate appeared to be Apache Flink Statefun, a stateful function-as-a-service (SFaaS) system, to which we added transactional support with coordinator functions. With T-Statefun, we showed that a dataflow system can support transactional cloud applications through an SFaaS API. Furthermore, its development helped us identify two significant issues: (i) it was challenging to program, especially after the addition of the coordinator functions; and (ii) due to the disaggregation of state and processing and an inefficient transactional protocol, T-Statefun was lacking in performance.

To address the programmability issue, in Chapter 3 we introduce Stateflow, a user-friendly programming model in which software developers write code in the well-established object-oriented programming style with zero boilerplate; Stateflow then transforms this code into an intermediate representation based on stateful dataflow graphs. While experimenting with Stateflow, we confirmed the messaging and state inefficiencies detected in Chapter 2 and observed the lack of transactional support in the rest of Stateflow’s supported backends. Thus, in Chapter 4, we present in detail the design of Styx, a distributed streaming dataflow system that supports multi-partition deterministic transactions with serializable isolation guarantees through a high-level, standard Python programming model that obviates transaction failure management. Our design choices and novel algorithms allow Styx to outperform state-of-the-art systems by at least one order of magnitude in throughput across all tested workloads.
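The idea of multi-partition deterministic transactions can be illustrated with a toy sketch. The example below is not the Styx API or its algorithm; the names `Partition`, `Transfer`, `partition_of`, `run_epoch`, and `make_state` are hypothetical, and the routing rule is an assumption made for the example. It only shows the general principle: when transactions are applied in a fixed sequence order, every replay of the same input yields the same commits, aborts, and final state, so the outcome is equivalent to a serial execution.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical in-memory state partition holding account balances."""
    balances: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Transfer:
    """A transaction that may span two partitions: move `amount` from `src` to `dst`."""
    src: str
    dst: str
    amount: int


def partition_of(key, partitions):
    # Assumed routing rule: sum of the key's bytes modulo the partition count.
    # It is stable across runs, unlike Python's randomized str hash().
    return partitions[sum(key.encode()) % len(partitions)]


def run_epoch(partitions, transfers):
    """Apply transfers strictly in sequence order and return (seq, outcome) pairs."""
    outcomes = []
    for seq, t in enumerate(transfers):
        src_p = partition_of(t.src, partitions)
        dst_p = partition_of(t.dst, partitions)
        if src_p.balances.get(t.src, 0) < t.amount:
            # The abort depends only on the input order, so it is reproducible.
            outcomes.append((seq, "aborted"))
            continue
        src_p.balances[t.src] -= t.amount
        dst_p.balances[t.dst] = dst_p.balances.get(t.dst, 0) + t.amount
        outcomes.append((seq, "committed"))
    return outcomes


def make_state():
    """Build two partitions with a fixed initial state for the example."""
    parts = [Partition(), Partition()]
    for key, balance in (("alice", 100), ("bob", 0)):
        partition_of(key, parts).balances[key] = balance
    return parts


def snapshot(partitions):
    """Merge all partitions into one dict for inspection."""
    merged = {}
    for p in partitions:
        merged.update(p.balances)
    return merged
```

Running `run_epoch` twice on fresh state from `make_state()` with the same list of transfers produces identical outcomes and balances, which is the property that lets a deterministic engine recover by replaying its input after a failure.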

Styx demonstrates that it is possible to build a high-performance SFaaS system that provides transactional and fault-tolerance guarantees while offering an intuitive programming model with minimal boilerplate. Building on this foundation, we extend Styx with the ability to dynamically and efficiently adapt to varying workloads. To enable this, Chapter 5 explores how Styx can migrate state transactionally, a necessary capability for elasticity, given that Styx maintains application state in memory.

We conclude this thesis by summarizing the key findings and reflecting on the contributions, critically examining the limitations of the proposed methods, and considering their broader ethical and societal implications. Moreover, based on the insights we gained from creating the Stateflow programming model and the Styx runtime, we lay out the new challenges and future directions in the field.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Delft University of Technology
Supervisors/Advisors
  • Houben, G.J.P.M., Promotor
  • Katsifodimos, A., Copromotor
Award date: 14 Jan 2026
Print ISBNs: 978-94-6384-875-6
Electronic ISBNs: 978-94-6518-185-1
Publication status: Published - 2026

Keywords

  • stream processing
  • transactions
  • stateful functions
  • state migration
  • deterministic transactions
  • dataflow programming
  • microservice architectures
  • event-driven programming
  • fault tolerance
