Abstract
This article presents the first keyword spotting (KWS) IC that uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive use of time-encoding schemes allows the analog audio signal to be processed entirely in the time domain, except for the voltage-to-time conversion stage of the analog front end. Because its fundamental building blocks are based on digital logic gates, it offers better technology scalability than conventional voltage-domain designs. Fabricated in a 65-nm CMOS process, the prototype KWS IC occupies 2.03 mm$^2$ and consumes 23 $\mu \text{W}$, including the analog FEx and the digital neural network classifier. The 16-channel time-domain FEx achieves a 54.89-dB dynamic range for a 16-ms frame shift while consuming 9.3 $\mu \text{W}$. Measurement results verify that the proposed IC performs a 12-class KWS task on the Google Speech Command dataset (GSCD) with >86% accuracy and 12.4-ms latency.
Original language | English |
---|---|
Pages (from-to) | 3298-3311 |
Number of pages | 14 |
Journal | IEEE Journal of Solid-State Circuits |
Volume | 57 |
Issue number | 11 |
Publication status | Published - 2022 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Analog
- bandpass filter (BPF)
- classifier
- feature extractor (FEx)
- Google Speech Command dataset (GSCD)
- keyword spotting (KWS)
- rectifier
- recurrent neural network (RNN)
- ring oscillator
- time domain