TY - GEN
T1 - Building High-Performance, Easy-to-Use Polymorphic Parallel Memories with HLS
AU - Stornaiuolo, L.
AU - Rabozzi, M.
AU - Santambrogio, M. D.
AU - Sciuto, D.
AU - Ciobanu, C. B.
AU - Stramondo, G.
AU - Varbanescu, A. L.
PY - 2019
Y1 - 2019
N2 - With the increased interest in energy efficiency, many application domains experiment with Field Programmable Gate Arrays (FPGAs), which promise customized hardware accelerators with high performance and low power consumption. These experiments are possible due to the development of High-Level Languages (HLLs) for FPGAs, which permit non-experts in hardware design languages (HDLs) to program reconfigurable hardware for general purpose computing. However, some of the expert knowledge remains difficult to integrate in HLLs, eventually leading to performance loss for HLL-based applications. One example of such a missing feature is the efficient exploitation of the local memories on FPGAs. A solution to address this challenge is PolyMem, an easy-to-use polymorphic parallel memory that uses BRAMs. In this work, we present HLS-PolyMem, the first complete implementation and in-depth evaluation of PolyMem optimized for the Xilinx Design Suite. Our evaluation demonstrates that HLS-PolyMem is a viable alternative to HLS memory partitioning, the current approach for memory parallelism in Vivado HLS. Specifically, we show that PolyMem offers the same performance as HLS partitioning for simple access patterns, and outperforms partitioning by as much as 13x when combining multiple access patterns for the same data structure. We further demonstrate the use of PolyMem for two different case studies, highlighting the superior capabilities of HLS-PolyMem in terms of performance, resource utilization, flexibility, and usability. Based on all the evidence provided in this work, we conclude that HLS-PolyMem enables the efficient use of BRAMs as parallel memories, without compromising the HLS level or the achievable performance.
AB - With the increased interest in energy efficiency, many application domains experiment with Field Programmable Gate Arrays (FPGAs), which promise customized hardware accelerators with high performance and low power consumption. These experiments are possible due to the development of High-Level Languages (HLLs) for FPGAs, which permit non-experts in hardware design languages (HDLs) to program reconfigurable hardware for general purpose computing. However, some of the expert knowledge remains difficult to integrate in HLLs, eventually leading to performance loss for HLL-based applications. One example of such a missing feature is the efficient exploitation of the local memories on FPGAs. A solution to address this challenge is PolyMem, an easy-to-use polymorphic parallel memory that uses BRAMs. In this work, we present HLS-PolyMem, the first complete implementation and in-depth evaluation of PolyMem optimized for the Xilinx Design Suite. Our evaluation demonstrates that HLS-PolyMem is a viable alternative to HLS memory partitioning, the current approach for memory parallelism in Vivado HLS. Specifically, we show that PolyMem offers the same performance as HLS partitioning for simple access patterns, and outperforms partitioning by as much as 13x when combining multiple access patterns for the same data structure. We further demonstrate the use of PolyMem for two different case studies, highlighting the superior capabilities of HLS-PolyMem in terms of performance, resource utilization, flexibility, and usability. Based on all the evidence provided in this work, we conclude that HLS-PolyMem enables the efficient use of BRAMs as parallel memories, without compromising the HLS level or the achievable performance.
KW - FPGA
KW - High-Level Synthesis
KW - Polymorphic Parallel Memory
UR - http://www.scopus.com/inward/record.url?scp=85068606907&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-23425-6_4
DO - 10.1007/978-3-030-23425-6_4
M3 - Conference contribution
AN - SCOPUS:85068606907
SN - 978-3-030-23424-9
T3 - IFIP Advances in Information and Communication Technology
SP - 53
EP - 78
BT - VLSI-SoC
A2 - Bombieri, Nicola
A2 - Pravadelli, Graziano
A2 - Fujita, Masahiro
A2 - Austin, Todd
A2 - Reis, Ricardo
PB - Springer
CY - Cham
T2 - 26th IFIP/IEEE WG 10.5 International Conference on Very Large Scale Integration, VLSI-SoC 2018
Y2 - 8 October 2018 through 10 October 2018
ER -