A Power-Efficient Parameter Quantization Technique for CNN Accelerators

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Quantization techniques are widely used in CNN inference to reduce hardware cost at the expense of small accuracy losses. However, even after quantization, the fixed-point quantized CNN weights still incur a multiplication cost. We therefore introduce a novel CNN quantization technique that can be implemented without any multiplier. We evaluated the technique using the VGG-16 and AlexNet networks on the Tiny ImageNet dataset. With 8-bit CNN weights, it causes accuracy losses of 0.39% and 0.98% relative to the floating-point implementations of VGG-16 and AlexNet, respectively. We then introduce a fine-tuning method for our quantization that further reduces the accuracy loss: on 8-bit quantized VGG-16 and AlexNet, fine-tuning reduces the losses to 0.24% and 0.39%, respectively. Two processing element (PE) architectures, neither of which includes any multiplier hardware, are designed to perform the multiply-accumulate (MAC) operations of CNN models quantized by our technique. Two systolic array prototypes employing the two PE architectures are designed for comparison with the traditional fixed-point MAC implementation. The systolic arrays built from our PE designs reduce the power consumption of the systolic array by up to 14.2% and 21.6%.
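The abstract does not say how the multiplier-free quantization works, so the following is only a minimal sketch of one well-known multiplier-free scheme, power-of-two weight quantization, in which each multiplication becomes a bit shift. It is not claimed to be the authors' technique. The function names (`quantize_pow2`, `shift_mac`, `dot_shift_add`), the exponent range, and the fixed-point format are all assumptions made for illustration.

```python
import math

# Illustrative assumption: weights are rounded to sign * 2**exp.
# This is generic power-of-two quantization, not the paper's method.

def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Round a real weight to (sign, exp) meaning sign * 2**exp.

    Weights whose rounded exponent falls below min_exp become zero,
    encoded as (0, 0). Rounding is done in the log2 domain.
    """
    if w == 0.0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    if exp < min_exp:
        return 0, 0  # too small to represent: treat as zero
    return sign, min(exp, max_exp)

def shift_mac(acc, activation, sign, exp, frac_bits=7):
    """Add activation * sign * 2**exp to acc using a shift and an add.

    acc is a fixed-point integer with frac_bits fractional bits;
    activation is an integer. Requires exp >= -frac_bits.
    """
    if sign == 0:
        return acc
    term = activation << (exp + frac_bits)
    return acc + term if sign > 0 else acc - term

def dot_shift_add(activations, weights, frac_bits=7):
    """Dot product with power-of-two weights; returns a float."""
    acc = 0
    for a, w in zip(activations, weights):
        sign, exp = quantize_pow2(w, min_exp=-frac_bits)
        acc = shift_mac(acc, a, sign, exp, frac_bits)
    return acc / (1 << frac_bits)
```

For instance, `dot_shift_add([4, 8, 16], [0.5, -0.25, 0.12])` rounds the last weight to 2**-3 and accumulates 2 - 2 + 2 using only shifts, additions, and subtractions. In hardware terms, a processing element of this kind would replace the multiplier with a shifter, which is the general motivation behind multiplier-free MAC designs.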
Original language: English
Title of host publication: 2021 24th Euromicro Conference on Digital System Design (DSD)
Subtitle of host publication: Proceedings
Editors: L. O'Conner
Place of Publication: Piscataway
Number of pages: 6
ISBN (Electronic): 978-1-6654-2703-6
ISBN (Print): 978-1-6654-2704-3
Publication status: Published - 2021
Event: 2021 24th Euromicro Conference on Digital System Design (DSD) - Virtual at Palermo, Italy
Duration: 1 Sep 2021 to 3 Sep 2021


Conference: 2021 24th Euromicro Conference on Digital System Design (DSD)
Abbreviated title: DSD 2021
City: Virtual at Palermo


  • Quantization
  • deep learning
  • hardware implementation
  • low power
  • ASIC

