Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA

Mairin Kroes; Lucian Petrica; Sorin Cotofana; Michaela Blott

doi:10.1145/3377930.3389808

Mairin Kroes, Lucian Petrica^*, Sorin Cotofana, Michaela Blott

^*Corresponding author for this work

Computer Engineering

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

4 Citations (Scopus)

Abstract

Convolutional Neural Network (CNN) dataflow inference accelerators implemented in Field-Programmable Gate Arrays (FPGAs) have demonstrated increased energy efficiency and lower latency compared to CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories do not typically map well to FPGA On-Chip Memories (OCM), which results in poor OCM utilization and ultimately limits the size and types of CNNs which can be effectively accelerated on FPGAs. In this work, we present a design methodology that improves the mapping efficiency of CNN parameters to FPGA OCM. We frame the mapping as a bin packing problem and determine that traditional bin packing algorithms are not well suited to solve the problem within FPGA- and CNN-specific constraints. We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories. We evaluate these algorithms on a variety of FPGA inference accelerators. Our hybrid mappers converge to optimal solutions in a matter of seconds for all CNN use-cases, achieve an increase of up to 65% in OCM utilization efficiency for deep CNNs, and are up to 200× faster than current state-of-the-art simulated annealing approaches.
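As a rough illustration of the bin packing framing described in the abstract, the sketch below packs a list of memory sizes into fixed-capacity bins with the classic first-fit decreasing heuristic and reports the resulting utilization. It is not the hybrid genetic algorithm or simulated annealing mapper from the paper, and it models none of the FPGA- or CNN-specific constraints the abstract mentions. The function names, the single capacity dimension, and the example sizes are all assumptions chosen for illustration.

```python
from typing import List


def first_fit_decreasing(sizes: List[int], capacity: int) -> List[List[int]]:
    """Pack item sizes into bins of a fixed capacity (first-fit decreasing)."""
    if any(s > capacity for s in sizes):
        raise ValueError("an item exceeds the bin capacity")
    bins: List[List[int]] = []   # items placed in each bin
    free: List[int] = []         # remaining capacity of each bin
    for size in sorted(sizes, reverse=True):
        for i, room in enumerate(free):
            if size <= room:
                # Place the item in the first bin that still has room.
                bins[i].append(size)
                free[i] -= size
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([size])
            free.append(capacity - size)
    return bins


def utilization(bins: List[List[int]], capacity: int) -> float:
    """Fraction of the allocated capacity that actually holds data."""
    return sum(sum(b) for b in bins) / (len(bins) * capacity)


# Hypothetical parameter-memory depths (in words) and a hypothetical bin depth.
memories = [600, 500, 400, 300, 200, 48]
packed = first_fit_decreasing(memories, capacity=1024)
unpacked = [[m] for m in memories]  # baseline: one memory per bin

print(packed)
print(round(utilization(unpacked, 1024), 3), round(utilization(packed, 1024), 3))
```

With these made-up sizes, grouping the memories halves the number of bins compared with giving each memory its own bin. A heuristic like this is only a baseline of the kind the abstract describes as poorly suited to the constrained problem on its own.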

Original language	English
Title of host publication	GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference
Publisher	Association for Computing Machinery (ACM)
Pages	1125-1133
Number of pages	9
ISBN (Electronic)	9781450371285
DOIs	https://doi.org/10.1145/3377930.3389808
Publication status	Published - 2020
Event	2020 Genetic and Evolutionary Computation Conference, GECCO 2020 - Cancun, Mexico
Duration	8 Jul 2020 → 12 Jul 2020

Publication series

Name	GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference

Conference

Conference	2020 Genetic and Evolutionary Computation Conference, GECCO 2020
Country/Territory	Mexico
City	Cancun
Period	8/07/20 → 12/07/20

Access to Document

10.1145/3377930.3389808

Cite this

Kroes, M., Petrica, L., Cotofana, S., & Blott, M. (2020). Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA. In GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference (pp. 1125-1133). (GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/3377930.3389808

Kroes, Mairin ; Petrica, Lucian ; Cotofana, Sorin et al. / Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA. GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference. Association for Computing Machinery (ACM), 2020. pp. 1125-1133 (GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference).

@inproceedings{19a12aa82e624aa5afca8e96e725580e,
  title = "Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA",
  abstract = "Convolutional Neural Network (CNN) dataflow inference accelerators implemented in Field-Programmable Gate Arrays (FPGAs) have demonstrated increased energy efficiency and lower latency compared to CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories do not typically map well to FPGA On-Chip Memories (OCM), which results in poor OCM utilization and ultimately limits the size and types of CNNs which can be effectively accelerated on FPGAs. In this work, we present a design methodology that improves the mapping efficiency of CNN parameters to FPGA OCM. We frame the mapping as a bin packing problem and determine that traditional bin packing algorithms are not well suited to solve the problem within FPGA- and CNN-specific constraints. We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories. We evaluate these algorithms on a variety of FPGA inference accelerators. Our hybrid mappers converge to optimal solutions in a matter of seconds for all CNN use-cases, achieve an increase of up to 65% in OCM utilization efficiency for deep CNNs, and are up to 200× faster than current state-of-the-art simulated annealing approaches.",
  author = "Mairin Kroes and Lucian Petrica and Sorin Cotofana and Michaela Blott",
  year = "2020",
  doi = "10.1145/3377930.3389808",
  language = "English",
  series = "GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference",
  publisher = "Association for Computing Machinery (ACM)",
  pages = "1125--1133",
  booktitle = "GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference",
  address = "United States",
  note = "2020 Genetic and Evolutionary Computation Conference, GECCO 2020 ; Conference date: 08-07-2020 Through 12-07-2020",
}

Kroes, M, Petrica, L, Cotofana, S & Blott, M 2020, Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA. in GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference. GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference, Association for Computing Machinery (ACM), pp. 1125-1133, 2020 Genetic and Evolutionary Computation Conference, GECCO 2020, Cancun, Mexico, 8/07/20. https://doi.org/10.1145/3377930.3389808

Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA. / Kroes, Mairin; Petrica, Lucian; Cotofana, Sorin et al.
GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference. Association for Computing Machinery (ACM), 2020. p. 1125-1133 (GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference).

TY  - GEN
T1  - Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA
AU  - Kroes, Mairin
AU  - Petrica, Lucian
AU  - Cotofana, Sorin
AU  - Blott, Michaela
PY  - 2020
Y1  - 2020
AB  - Convolutional Neural Network (CNN) dataflow inference accelerators implemented in Field-Programmable Gate Arrays (FPGAs) have demonstrated increased energy efficiency and lower latency compared to CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories do not typically map well to FPGA On-Chip Memories (OCM), which results in poor OCM utilization and ultimately limits the size and types of CNNs which can be effectively accelerated on FPGAs. In this work, we present a design methodology that improves the mapping efficiency of CNN parameters to FPGA OCM. We frame the mapping as a bin packing problem and determine that traditional bin packing algorithms are not well suited to solve the problem within FPGA- and CNN-specific constraints. We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories. We evaluate these algorithms on a variety of FPGA inference accelerators. Our hybrid mappers converge to optimal solutions in a matter of seconds for all CNN use-cases, achieve an increase of up to 65% in OCM utilization efficiency for deep CNNs, and are up to 200× faster than current state-of-the-art simulated annealing approaches.
UR  - http://www.scopus.com/inward/record.url?scp=85091785319&partnerID=8YFLogxK
U2  - 10.1145/3377930.3389808
DO  - 10.1145/3377930.3389808
M3  - Conference contribution
T3  - GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference
SP  - 1125
EP  - 1133
BT  - GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference
PB  - Association for Computing Machinery (ACM)
T2  - 2020 Genetic and Evolutionary Computation Conference, GECCO 2020
Y2  - 8 July 2020 through 12 July 2020
ER  -

Kroes M, Petrica L, Cotofana S, Blott M. Evolutionary bin packing for memory-efficient dataflow inference acceleration on FPGA. In GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference. Association for Computing Machinery (ACM). 2020. p. 1125-1133. (GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference). doi: 10.1145/3377930.3389808
