A Power-Efficient Parameter Quantization Technique for CNN Accelerators

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Quantization techniques are widely used in CNN inference to reduce hardware cost at the expense of small accuracy losses. However, even after quantization, multiplying by the fixed-point quantized CNN weights still incurs a cost. We therefore introduce a novel CNN quantization technique that can be implemented without any multiplier. We evaluated the technique using the VGG-16 and AlexNet networks on the Tiny ImageNet dataset. With 8-bit CNN weights, it causes accuracy losses of 0.39% and 0.98% relative to floating-point implementations of VGG-16 and AlexNet, respectively. We then introduce a fine-tuning method for our quantization, which further reduces the accuracy losses on 8-bit quantized VGG-16 and AlexNet to 0.24% and 0.39%, respectively. Two processing element (PE) architectures, neither of which includes multiplier hardware, are designed to perform the multiply-accumulate (MAC) operations of CNN models quantized by our technique. Two systolic array prototypes employing the two PE architectures are designed for comparison with the traditional fixed-point MAC implementation. The systolic arrays containing our PE designs reduce power consumption by up to 14.2% and 21.6%.
Original language: English
Title of host publication: 2021 24th Euromicro Conference on Digital System Design (DSD)
Subtitle of host publication: Proceedings
Editors: L. O'Conner
Place of Publication: Piscataway
Publisher: IEEE
Pages: 18-23
Number of pages: 6
ISBN (Electronic): 978-1-6654-2703-6
ISBN (Print): 978-1-6654-2704-3
Publication status: Published - 2021
Event: 2021 24th Euromicro Conference on Digital System Design (DSD) - Virtual at Palermo, Spain
Duration: 1 Sep 2021 – 3 Sep 2021

Conference

Conference: 2021 24th Euromicro Conference on Digital System Design (DSD)
Abbreviated title: DSD 2021
Country/Territory: Spain
City: Virtual at Palermo
Period: 1/09/21 – 3/09/21

Keywords

  • Quantization
  • deep learning
  • hardware implementation
  • low power
  • ASIC
