Cerebron: A Reconfigurable Architecture for Spatio-Temporal Sparse Spiking Neural Networks

Qinyu Chen*, Chang Gao, Yuxiang Fu

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

5 Citations (Scopus)
42 Downloads (Pure)


Spiking neural networks (SNNs) are promising alternatives to artificial neural networks (ANNs) since they are more realistic brain-inspired computing models. SNNs exhibit sparse neuron firing over time, i.e., spatiotemporal sparsity, which can be exploited for energy-efficient hardware inference. However, exploiting the spatiotemporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading energy efficiency. Compared to SNNs with simple fully connected structures, SNNs with more complex structures (e.g., standard convolutions, depthwise convolutions, and pointwise convolutions) can handle more complicated tasks but are more difficult to map to hardware. In this work, we propose a novel reconfigurable architecture, Cerebron, which fully exploits the spatiotemporal sparsity in SNNs with maximized data reuse, together with optimization techniques that improve the efficiency and flexibility of the hardware. To achieve flexibility, the reconfigurable compute engine is compatible with a variety of spiking layers and supports inter-computing-unit (CU) and intra-CU reconfiguration. The compute engine exploits data reuse and guarantees parallel data access when processing different convolutions to achieve memory efficiency. A two-step data sparsity exploitation method is introduced to leverage the sparsity of discrete spikes and reduce the computation time. Besides, an online channelwise workload scheduling strategy is designed to further reduce the latency. Cerebron is verified on image segmentation and classification tasks using a variety of state-of-the-art spiking network structures.
Experimental results show that Cerebron achieves at least 17.5× prediction energy reduction and 20× speedup compared with state-of-the-art field-programmable gate array (FPGA)-based accelerators.
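To illustrate the kind of spike-sparsity exploitation the abstract describes, the sketch below shows an event-driven 2-D convolution over a binary spike map. This is a conceptual sketch only, not the paper's hardware pipeline: the function name and gather/scatter structure are illustrative assumptions. Because spikes are binary, each multiply-accumulate collapses into a weight accumulation, and positions with no spikes are skipped entirely.

```python
import numpy as np

def sparse_spiking_conv(spikes, weights):
    """Event-driven convolution over a binary spike map (illustrative sketch).

    spikes  : (H, W) binary array of firing events at one time step
    weights : (K, K) kernel; stride 1, no padding
    """
    H, W = spikes.shape
    K = weights.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    # Step 1: gather only the active (nonzero) spike events.
    events = np.argwhere(spikes)  # coordinates of firing neurons
    # Step 2: scatter each event's weight contributions to the outputs
    # it influences; zero positions never enter the loop.
    for (r, c) in events:
        for kr in range(K):
            for kc in range(K):
                orow, ocol = r - kr, c - kc
                if 0 <= orow < out.shape[0] and 0 <= ocol < out.shape[1]:
                    out[orow, ocol] += weights[kr, kc]
    return out
```

With typical SNN firing rates well below 50%, the event loop touches far fewer positions than a dense convolution would, which is the source of the computation-time savings the two-step method targets; the uneven event counts per channel are also what motivates the online channelwise workload scheduling.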

Original language: English
Pages (from-to): 1425-1437
Number of pages: 13
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Issue number: 10
Publication status: Published - 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.


  • Field-programmable gate array (FPGA)
  • spiking neural network (SNN)
  • workload balancing


