Generating high-performance FPGA accelerator designs for big data analytics with Fletcher and Apache Arrow

Johan Peltenburg; Jeroen van Straten; Matthijs Brobbel; Zaid Al-Ars; H. Peter Hofstee

doi:10.1007/s11265-021-01650-6

Computer Engineering

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. The open source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format to allow zero-copy communication of large tabular data, preventing (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven supported software languages. To deal with the hardware challenges, we present Arrow-specific components, providing easy-to-use, high-performance interfaces to accelerated kernels. The components are combined based on a generic architecture that is specialized according to the application through an extensive infrastructure generation framework that is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.

Original language	English
Pages (from-to)	565-586
Number of pages	22
Journal	Journal of Signal Processing Systems: the journal of DSPtechnologies
Volume	93
Issue number	5
DOIs	https://doi.org/10.1007/s11265-021-01650-6
Publication status	Published - 2021

Keywords

Accelerator
Analytics
Apache Arrow
Big data
FPGA
Fletcher

Access to Document

10.1007/s11265-021-01650-6

Peltenburg2021_Article_GeneratingHigh-PerformanceFPGA: Final published version, 3.17 MB, Licence: CC BY

Cite this

@article{13c2cbb7f92f4ab5bd34ebd2d0cf2308,
title = "Generating high-performance FPGA accelerator designs for big data analytics with Fletcher and Apache Arrow",
abstract = "As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. The open source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format to allow zero-copy communication of large tabular data, preventing (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven supported software languages. To deal with the hardware challenges, we present Arrow-specific components, providing easy-to-use, high-performance interfaces to accelerated kernels. The components are combined based on a generic architecture that is specialized according to the application through an extensive infrastructure generation framework that is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.",

keywords = "Accelerator, Analytics, Apache Arrow, Big data, FPGA, Fletcher",
author = "Peltenburg, Johan and {van Straten}, Jeroen and Brobbel, Matthijs and Al-Ars, Zaid and Hofstee, {H. Peter}",
year = "2021",
doi = "10.1007/s11265-021-01650-6",
language = "English",
volume = "93",
pages = "565--586",
journal = "Journal of Signal Processing Systems: the journal of DSPtechnologies",
issn = "1939-8018",
publisher = "Springer",
number = "5",
}

TY - JOUR
T1 - Generating high-performance FPGA accelerator designs for big data analytics with Fletcher and Apache Arrow
AU - Peltenburg, Johan
AU - van Straten, Jeroen
AU - Brobbel, Matthijs
AU - Al-Ars, Zaid
AU - Hofstee, H. Peter
PY - 2021
Y1 - 2021
N2 - As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. The open source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format to allow zero-copy communication of large tabular data, preventing (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven supported software languages. To deal with the hardware challenges, we present Arrow-specific components, providing easy-to-use, high-performance interfaces to accelerated kernels. The components are combined based on a generic architecture that is specialized according to the application through an extensive infrastructure generation framework that is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.

KW - Accelerator
KW - Analytics
KW - Apache Arrow
KW - Big data
KW - FPGA
KW - Fletcher
UR - http://www.scopus.com/inward/record.url?scp=85101818915&partnerID=8YFLogxK
U2 - 10.1007/s11265-021-01650-6
DO - 10.1007/s11265-021-01650-6
M3 - Article
SN - 1939-8018
VL - 93
SP - 565
EP - 586
JO - Journal of Signal Processing Systems: the journal of DSPtechnologies
JF - Journal of Signal Processing Systems: the journal of DSPtechnologies
IS - 5
ER -
