Abstract
Automatic speech recognition (ASR) should serve every speaker, not only the majority “standard” speakers of a language. To build inclusive ASR, mitigating the bias against speaker groups who speak in a “non-standard” or “diverse” way is crucial. We aim to mitigate the bias against non-native-accented Flemish in a Flemish ASR system. Since this is a low-resource problem, we investigate the optimal type of data augmentation, i.e., speed/pitch perturbation, cross-lingual voice conversion-based methods, and SpecAugment, applied to both native Flemish and non-native-accented Flemish, for bias mitigation. The results showed that specific types of data augmentation applied to both native and non-native-accented speech improve non-native-accented ASR, while applying data augmentation to the non-native-accented speech is more conducive to bias reduction. Combining both gave the largest bias reduction for human-machine interaction (HMI) as well as read-type speech.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
Publisher | IEEE |
Number of pages | 8 |
ISBN (Electronic) | 979-8-3503-0689-7 |
ISBN (Print) | 979-8-3503-0690-3 |
Publication status | Published - 2023 |
Event | 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) - Taipei, Taiwan; Duration: 16 Dec 2023 → 20 Dec 2023 |
Workshop
Workshop | 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
---|---|
Country/Territory | Taiwan |
City | Taipei |
Period | 16/12/23 → 20/12/23 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Speech recognition
- bias mitigation
- non-native accents
- data augmentation
- voice conversion