Apache Flink is an open-source system for processing streaming and batch data. Flink is built on the philosophy that many classes of data processing applications, including real-time analytics, continuous data pipelines, historical data processing (batch), and iterative algorithms (machine learning, graph analysis), can be expressed and executed as pipelined fault-tolerant dataflows. In this paper, we present Flink’s architecture and expand on how a (seemingly diverse) set of use cases can be unified under a single execution model.
|Number of pages||11|
|Journal||Bulletin of the IEEE Computer Society Technical Committee on Data Engineering|
|Publication status||Published - 2015|