TY - THES
T1 - Democratizing Scalable Cloud Applications
T2 - Transactional Stateful Functions on Streaming Dataflows
AU - Psarakis, K.
PY - 2026
Y1 - 2026
N2 - Web applications power almost every aspect of our digitalized society, from entertainment to web shopping, vacation planning and booking, online games, communication, work, and social interaction. However, building scalable and consistent web applications in modern cloud environments requires extensive and diverse expertise in multiple domains, such as cloud computing, software development, distributed and database systems, and domain knowledge. These requirements make the development of such applications possible only for a few highly talented individuals whom only large corporations can hire. In this thesis, we aim to democratize the development and maintenance of such cloud applications by identifying and addressing three key challenges: (i) programmability of cloud applications; (ii) high-performance serializable transactions with fault tolerance guarantees; and (iii) serverless semantics. To address these, we created Stateflow, a high-level, object-oriented, easy-to-use programming model that operates alongside Styx, a novel deterministic dataflow engine that provides high-performance serializable transactions and serverless semantics. While investigating the challenge of democratizing scalable cloud applications, we discovered that they closely resemble the principles behind the streaming dataflow execution model. In Chapter 1, we highlight the similarities between streaming dataflow processing and the current state-of-the-art event-driven microservice architectures and lay a path towards the ideal cloud application runtime. To validate our hypothesis, we created T-Statefun, presented in Chapter 2, by adapting an existing dataflow system to support transactional cloud applications. At the time, the best candidate appeared to be Apache Flink Statefun, a stateful function-as-a-service (SFaaS) system, to which we added transactional support with coordinator functions.
With T-Statefun, we showed that a dataflow system can support transactional cloud applications through an SFaaS API. Furthermore, its development helped us identify two significant issues: (i) it was challenging to program, especially after the addition of the coordinator functions; and (ii) due to the disaggregation of state and processing and an inefficient transactional protocol, T-Statefun was lacking in performance. To address the programmability issue, in Chapter 3 we introduce Stateflow, a user-friendly programming model where software developers code in the well-established object-oriented programming style with zero boilerplate code, and Stateflow transforms it into an intermediate representation based on stateful dataflow graphs. While experimenting with Stateflow, we verified the inefficiencies detected in Chapter 2 regarding messaging and state, as well as the lack of transactional support in the rest of Stateflow’s supported backends. Thus, in Chapter 4, we present all the details behind the design of Styx, a distributed streaming dataflow system that supports multi-partition deterministic transactions with serializable isolation guarantees through a high-level, standard Python programming model that obviates transaction failure management. Our design choices and novel algorithms allow Styx to outperform state-of-the-art systems in throughput by at least one order of magnitude in all tested workloads. Styx demonstrates that it is possible to build a high-performance SFaaS system that provides transactional and fault-tolerance guarantees while offering an intuitive programming model with minimal boilerplate. Building on this foundation, we extend Styx with the ability to dynamically and efficiently adapt to varying workloads.
To enable this, Chapter 5 explores how Styx can migrate state transactionally, a necessary capability for elasticity, given that Styx maintains application state in memory. We conclude this thesis by summarizing the key findings and reflecting on the contributions, critically examining the limitations of the proposed methods, and considering their broader ethical and societal implications. Moreover, based on the insights we gained from creating the Stateflow programming model and the Styx runtime, we lay out the new challenges and future directions in the field.
AB - Web applications power almost every aspect of our digitalized society, from entertainment to web shopping, vacation planning and booking, online games, communication, work, and social interaction. However, building scalable and consistent web applications in modern cloud environments requires extensive and diverse expertise in multiple domains, such as cloud computing, software development, distributed and database systems, and domain knowledge. These requirements make the development of such applications possible only for a few highly talented individuals whom only large corporations can hire. In this thesis, we aim to democratize the development and maintenance of such cloud applications by identifying and addressing three key challenges: (i) programmability of cloud applications; (ii) high-performance serializable transactions with fault tolerance guarantees; and (iii) serverless semantics. To address these, we created Stateflow, a high-level, object-oriented, easy-to-use programming model that operates alongside Styx, a novel deterministic dataflow engine that provides high-performance serializable transactions and serverless semantics. While investigating the challenge of democratizing scalable cloud applications, we discovered that they closely resemble the principles behind the streaming dataflow execution model. In Chapter 1, we highlight the similarities between streaming dataflow processing and the current state-of-the-art event-driven microservice architectures and lay a path towards the ideal cloud application runtime. To validate our hypothesis, we created T-Statefun, presented in Chapter 2, by adapting an existing dataflow system to support transactional cloud applications. At the time, the best candidate appeared to be Apache Flink Statefun, a stateful function-as-a-service (SFaaS) system, to which we added transactional support with coordinator functions.
With T-Statefun, we showed that a dataflow system can support transactional cloud applications through an SFaaS API. Furthermore, its development helped us identify two significant issues: (i) it was challenging to program, especially after the addition of the coordinator functions; and (ii) due to the disaggregation of state and processing and an inefficient transactional protocol, T-Statefun was lacking in performance. To address the programmability issue, in Chapter 3 we introduce Stateflow, a user-friendly programming model where software developers code in the well-established object-oriented programming style with zero boilerplate code, and Stateflow transforms it into an intermediate representation based on stateful dataflow graphs. While experimenting with Stateflow, we verified the inefficiencies detected in Chapter 2 regarding messaging and state, as well as the lack of transactional support in the rest of Stateflow’s supported backends. Thus, in Chapter 4, we present all the details behind the design of Styx, a distributed streaming dataflow system that supports multi-partition deterministic transactions with serializable isolation guarantees through a high-level, standard Python programming model that obviates transaction failure management. Our design choices and novel algorithms allow Styx to outperform state-of-the-art systems in throughput by at least one order of magnitude in all tested workloads. Styx demonstrates that it is possible to build a high-performance SFaaS system that provides transactional and fault-tolerance guarantees while offering an intuitive programming model with minimal boilerplate. Building on this foundation, we extend Styx with the ability to dynamically and efficiently adapt to varying workloads.
To enable this, Chapter 5 explores how Styx can migrate state transactionally, a necessary capability for elasticity, given that Styx maintains application state in memory. We conclude this thesis by summarizing the key findings and reflecting on the contributions, critically examining the limitations of the proposed methods, and considering their broader ethical and societal implications. Moreover, based on the insights we gained from creating the Stateflow programming model and the Styx runtime, we lay out the new challenges and future directions in the field.
KW - stream processing
KW - transactions
KW - stateful functions
KW - state migration
KW - deterministic transactions
KW - dataflow programming
KW - microservice architectures
KW - event-driven programming
KW - fault tolerance
U2 - 10.4233/uuid:837e043a-c6e3-4f87-a9b1-a59a9ade65f7
DO - 10.4233/uuid:837e043a-c6e3-4f87-a9b1-a59a9ade65f7
M3 - Dissertation (TU Delft)
SN - 978-94-6384-875-6
ER -