Investigating model performance in language identification: beyond simple error statistics

Suzy J. Styles, Victoria Y.H. Chua, Fei Ting Woon, Hexin Liu, Leibny Paola Garcia Perera, Sanjeev Khudanpur, Andy W.H. Khong, Justin Dauwels

Research output: Contribution to journal › Conference article › Scientific › peer-review



Language development experts need tools that can automatically identify languages in fluent, conversational speech and provide reliable estimates of usage rates at the level of an individual recording. However, LID systems are typically evaluated on metrics such as equal error rate and balanced accuracy, applied at the level of an entire speech corpus. These overview metrics do not provide information about model performance at the level of individual speakers, recordings, or units of speech with different linguistic characteristics. Overview statistics may mask systematic errors for some subsets of the data, so a model can perform worse on data derived from some subsets of human speakers, creating a kind of algorithmic bias. Here, we investigate how well a number of LID systems perform on individual recordings and speech units with different linguistic properties in the MERLIon CCS Challenge, which features accented, code-switched, child-directed speech.
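To illustrate the abstract's central point, the sketch below (with entirely hypothetical speaker IDs and predictions, not data from the challenge) shows how a corpus-level accuracy figure can hide a systematic failure on one speaker: the aggregate looks acceptable while one speaker's recordings are almost always misclassified.

```python
# Minimal sketch with made-up data: an aggregate score can mask
# systematic per-speaker errors that only disaggregation reveals.
from collections import defaultdict

# (speaker_id, true_language, predicted_language) -- illustrative only
predictions = [
    ("spk1", "en", "en"), ("spk1", "en", "en"), ("spk1", "zh", "zh"),
    ("spk2", "en", "en"), ("spk2", "zh", "zh"), ("spk2", "zh", "zh"),
    ("spk3", "zh", "en"), ("spk3", "zh", "en"), ("spk3", "en", "en"),
]

def accuracy(rows):
    """Fraction of rows where the predicted language matches the truth."""
    return sum(truth == pred for _, truth, pred in rows) / len(rows)

overall = accuracy(predictions)  # corpus-level overview statistic

by_speaker = defaultdict(list)
for row in predictions:
    by_speaker[row[0]].append(row)

# Disaggregated view: per-speaker accuracy exposes the hidden error.
per_speaker = {spk: accuracy(rows) for spk, rows in by_speaker.items()}
```

Here `overall` is about 0.78, yet `per_speaker["spk3"]` is only 1/3: the model systematically mislabels spk3's Mandarin as English, a pattern invisible in the corpus-level number.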

Original language: English
Pages (from-to): 4129-4133
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2023
Event: 24th International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 - 24 Aug 2023


  • child-directed speech
  • code-switching
  • language diarization
  • language identification


