Abstract
Both human listeners and machines need to adapt their sound categories whenever a new speaker is encountered. This perceptual learning is driven by lexical information. In previous work, we have shown that deep neural network (DNN)-based ASR systems can learn to adapt their phoneme category boundaries from a few labeled examples after exposure (i.e., training) to ambiguous sounds, as humans have been found to do. Here, we investigate the time-course of phoneme category adaptation in a DNN in more detail, with the ultimate aim of assessing the DNN’s ability to serve as a model of human perceptual learning. We do so by providing the DNN with an increasing number of ambiguous retraining tokens (in 10 bins of 4 ambiguous items) and comparing classification accuracy on the ambiguous items in a held-out test set across the different bins. Results showed that DNNs, similar to human listeners, exhibit a step-like function: the DNNs show perceptual learning already after the first bin (only 4 tokens of the ambiguous phone), with little further adaptation for subsequent bins. In follow-up research, we plan to test specific predictions made by the DNN about human speech processing.
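The retraining schedule described in the abstract (10 bins of 4 ambiguous items, presented in increasing numbers) can be illustrated with a minimal sketch. It assumes that the bins are cumulative, so that each retraining step sees all tokens from the earlier bins plus 4 new ones; the abstract does not state this explicitly. The function name `cumulative_bins` and the token identifiers are hypothetical placeholders, not taken from the paper, and no DNN retraining or evaluation is performed here.

```python
from typing import List, Sequence


def cumulative_bins(tokens: Sequence[str], bin_size: int = 4) -> List[List[str]]:
    """Return cumulative retraining sets: bin 1, bins 1-2, bins 1-3, and so on.

    Assumption: retraining data grows by one bin at a time (cumulative).
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    if len(tokens) % bin_size != 0:
        raise ValueError("number of tokens must be a multiple of bin_size")
    return [list(tokens[:end]) for end in range(bin_size, len(tokens) + 1, bin_size)]


# Hypothetical identifiers standing in for the 40 ambiguous retraining tokens.
tokens = [f"ambiguous_{i:02d}" for i in range(40)]

schedule = cumulative_bins(tokens, bin_size=4)
sizes = [len(step) for step in schedule]

# One retraining set per bin; each set is 4 tokens larger than the previous one.
print(len(schedule), sizes)
```

In an actual experiment, each entry of `schedule` would be used to retrain the model, followed by measuring classification accuracy on the held-out ambiguous test items, so that accuracy can be compared across bins.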
Original language | English |
---|---|
Title of host publication | Statistical Language and Speech Processing |
Subtitle of host publication | 7th International Conference, SLSP 2019 |
Editors | C. Martín-Vide, M. Purver, S. Pollak |
Place of Publication | Cham |
Publisher | Springer |
Pages | 3-15 |
Number of pages | 13 |
ISBN (Electronic) | 978-3-030-31372-2 |
ISBN (Print) | 978-3-030-31371-5 |
Publication status | Published - 2019 |
Event | SLSP 2019: Statistical Language and Speech Processing, Ljubljana, Slovenia; duration: 14 Oct 2019 → 16 Oct 2019; conference number: 7th |
Publication series
Name | Part of the Lecture Notes in Computer Science book series; also part of the Lecture Notes in Artificial Intelligence book sub series |
---|---|
Publisher | Springer |
Volume | 11816 |
Conference
Conference | SLSP 2019 |
---|---|
Country/Territory | Slovenia |
City | Ljubljana |
Period | 14/10/19 → 16/10/19 |
Keywords
- Phoneme category adaptation
- Human perceptual learning
- Deep neural networks
- Time-course