A Power-Efficient Parameter Quantization Technique for CNN Accelerators

Ercan Kalali, Rene van Leuken

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

4 Citations (Scopus)

Abstract

Quantization techniques are widely used in CNN inference to reduce hardware cost at the expense of small accuracy losses. However, even after quantization, the fixed-point quantized CNN weights still incur a multiplication cost. Therefore, a novel CNN quantization technique is introduced that can be implemented without using any multiplier. We evaluated our quantization technique using the VGG-16 and AlexNet networks and the Tiny ImageNet dataset. With 8-bit CNN weights, the quantization technique causes accuracy losses of 0.39% and 0.98% compared to floating-point implementations of VGG-16 and AlexNet, respectively. A fine-tuning method for our quantization is then introduced, which further reduces the accuracy loss. The fine-tuning reduced the accuracy losses of 8-bit quantized VGG-16 and AlexNet to 0.24% and 0.39%, respectively. Two different processing element (PE) architectures, which do not include any multiplier hardware, are designed to perform the multiply-accumulate (MAC) operations of CNN models quantized by our technique. Two different systolic array prototypes employing the two PE architectures are designed for comparison with the traditional fixed-point MAC implementation. The systolic array architectures containing our processing element designs reduced the power consumption of the systolic array by up to 14.2% and 21.6%.
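
The abstract does not say how the weights are encoded, so the following is only an illustrative sketch of how a MAC can avoid multipliers in general. It assumes each weight is rounded to a signed power of two, which is one common multiplier-free scheme and not necessarily the one used in the paper. All function names and parameters (`quantize_pow2`, `shift_mac`, `dot_shift`, `frac_bits`, the exponent range) are hypothetical.

```python
import math

def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Round a real weight to sign * 2**e (assumed encoding, not the paper's)."""
    if w == 0.0:
        return 0, min_exp          # sign 0 encodes a zero weight
    sign = 1 if w > 0 else -1
    e = round(math.log2(abs(w)))   # nearest exponent in the log domain
    e = max(min_exp, min(max_exp, e))
    return sign, e

def shift_mac(acc, x, sign, e, frac_bits=7):
    """Add x * sign * 2**e to acc using only a shift and an add/subtract.

    acc is a fixed-point integer with frac_bits fractional bits;
    x is an integer activation. Requires e >= -frac_bits.
    """
    term = x << (frac_bits + e)
    if sign > 0:
        return acc + term
    if sign < 0:
        return acc - term
    return acc                     # zero weight contributes nothing

def dot_shift(weights, activations, frac_bits=7):
    """Dot product of real weights and integer activations, multiplier-free."""
    acc = 0
    for w, x in zip(weights, activations):
        sign, e = quantize_pow2(w, min_exp=-frac_bits)
        acc = shift_mac(acc, x, sign, e, frac_bits)
    return acc

if __name__ == "__main__":
    acc = dot_shift([0.5, -0.25, 0.126], [3, 4, 8])
    print(acc, acc / 2**7)         # fixed-point result and its real value
```

In this sketch the weight 0.126 is rounded to 2**-3, which illustrates where the accuracy loss of such a scheme would come from: the shift replaces the multiplier, at the price of a coarser weight grid.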
Original language: English
Title of host publication: 2021 24th Euromicro Conference on Digital System Design (DSD)
Subtitle of host publication: Proceedings
Editors: L. O'Conner
Place of Publication: Piscataway
Publisher: IEEE
Pages: 18-23
Number of pages: 6
ISBN (Electronic): 978-1-6654-2703-6
ISBN (Print): 978-1-6654-2704-3
DOIs
Publication status: Published - 2021
Event: 2021 24th Euromicro Conference on Digital System Design (DSD) - Virtual at Palermo, Italy
Duration: 1 Sept 2021 to 3 Sept 2021

Conference

Conference: 2021 24th Euromicro Conference on Digital System Design (DSD)
Abbreviated title: DSD 2021
Country/Territory: Italy
City: Virtual at Palermo
Period: 1/09/21 to 3/09/21

Keywords

  • Quantization
  • deep learning
  • hardware implementation
  • low power
  • ASIC
