Cerebron: A Reconfigurable Architecture for Spatio-Temporal Sparse Spiking Neural Networks

Qinyu Chen*, Chang Gao, Yuxiang Fu

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

5 Citations (Scopus)
42 Downloads (Pure)


Spiking neural networks (SNNs) are promising alternatives to artificial neural networks (ANNs) since they are more realistic brain-inspired computing models. SNNs exhibit sparse neuron firing over time, i.e., spatiotemporal sparsity, which can be exploited for energy-efficient hardware inference. However, exploiting the spatiotemporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading energy efficiency. Compared to SNNs with simple fully connected structures, SNNs with more complex structures (e.g., standard convolutions, depthwise convolutions, and pointwise convolutions) can handle more complicated tasks but are more difficult to map to hardware. In this work, we propose a novel reconfigurable architecture, Cerebron, which fully exploits the spatiotemporal sparsity in SNNs with maximized data reuse, together with optimization techniques that improve the efficiency and flexibility of the hardware. To achieve flexibility, the reconfigurable compute engine is compatible with a variety of spiking layers and supports inter-computing-unit (CU) and intra-CU reconfiguration. The compute engine exploits data reuse and guarantees parallel data access when processing different convolutions to achieve memory efficiency. A two-step data sparsity exploitation method is introduced to leverage the sparsity of discrete spikes and reduce the computation time. Besides, an online channelwise workload scheduling strategy is designed to further reduce the latency. Cerebron is verified on image segmentation and classification tasks using a variety of state-of-the-art spiking network structures.
Experimental results show that Cerebron achieves at least 17.5× prediction energy reduction and 20× speedup compared with state-of-the-art field-programmable gate array (FPGA)-based accelerators.
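To illustrate the kind of spike-sparsity exploitation the abstract describes, the sketch below shows an event-driven 2-D convolution over a binary spike map. This is a conceptual sketch only, not the paper's hardware pipeline: the function name and gather/scatter structure are illustrative assumptions. Because spikes are binary, each multiply-accumulate collapses into a weight accumulation, and positions with no spikes are skipped entirely.

```python
import numpy as np

def sparse_spiking_conv(spikes, weights):
    """Event-driven convolution over a binary spike map (illustrative sketch).

    spikes  : (H, W) binary array of firing events at one time step
    weights : (K, K) kernel; stride 1, no padding
    """
    H, W = spikes.shape
    K = weights.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    # Step 1: gather only the active (nonzero) spike events.
    events = np.argwhere(spikes)  # coordinates of firing neurons
    # Step 2: scatter each event's weight contributions to the outputs
    # it influences; zero positions never enter the loop.
    for (r, c) in events:
        for kr in range(K):
            for kc in range(K):
                orow, ocol = r - kr, c - kc
                if 0 <= orow < out.shape[0] and 0 <= ocol < out.shape[1]:
                    out[orow, ocol] += weights[kr, kc]
    return out
```

With typical SNN firing rates well below 50%, the event loop touches far fewer positions than a dense convolution would, which is the source of the computation-time savings the two-step method targets; the uneven event counts per channel are also what motivates the online channelwise workload scheduling.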

Original language: English
Pages (from-to): 1425-1437
Number of pages: 13
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Issue number: 10
Publication status: Published - 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.


  • Field-programmable gate array (FPGA)
  • spiking neural network (SNN)
  • workload balancing


