Using cross-model learnings for the Gram Vaani ASR Challenge 2022

Research output: Contribution to journalConference articleScientificpeer-review

1 Citation (Scopus)
31 Downloads (Pure)

Abstract

In the diverse and multilingual land of India, Hindi is spoken as a first language by a majority of its population. Efforts are made to obtain data in terms of audio, transcriptions, dictionary, etc. to develop speech-technology applications in Hindi. Similarly, the Gram-Vaani ASR Challenge 2022 provides spontaneous telephone speech, with natural back-ground and regional variations in Hindi. The challenge provides: 100 hours of labeled train-set, 5 hours of labeled dev-set and 1000 hours of unlabeled data-set. For the 'Closed Challenge', we trained an End-to-End (E2E) Conformer model using speed perturbations, SpecAugment techniques and use VTLN to handle any unknown speaker groups in the blind evaluation set. On the dev-set, we achieved a 30.3% WER compared to the 34.8% WER by the Challenge E2E baseline. For the 'Self Supervised Closed Challenge', a semi-supervised learning approach is used. We generate pseudo-transcripts for the unlabeled data using a hybrid TDNN-3gram LM model and trained an E2E model. This is then used as a seed for retraining the E2E model with high confidence data. Cross-model learning and refining of the E2E model gave 25.3% WER on the dev-set compared to ∼33-35% WER by the Challenge baseline that use wav2vec models.

Original languageEnglish
Pages (from-to)4880-4884
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume2022-September
DOIs
Publication statusPublished - 2022
Event23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: 18 Sept 202222 Sept 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • cross-architecture learning
  • end-to-end ASR
  • Gram-Vaani Challenge
  • hybrid ASR
  • semi-supervised learning

Fingerprint

Dive into the research topics of 'Using cross-model learnings for the Gram Vaani ASR Challenge 2022'. Together they form a unique fingerprint.

Cite this