BIAS in Flemish automatic speech recognition

Aaricia Herygers; Vass Verkhodanova; Matt Coler; O.E. Scharenborg; Munir Georges

Multimedia Computing

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Research has shown that automatic speech recognition (ASR) systems exhibit biases against different speaker groups, e.g., based on age or gender. This paper presents an investigation into bias in recent Flemish ASR. Seeing as Belgian Dutch, which is also known as Flemish, is often not included in Dutch ASR systems, a state-of-the-art ASR system for Dutch is trained using the Netherlandic Dutch data from the Spoken Dutch Corpus. Using the Flemish data from the JASMIN-CGN corpus, word error rates for various regional variants of Flemish are then compared. In addition, the most misrecognized phonemes are compared across speaker groups. The evaluation confirms a bias against speakers from West Flanders and Limburg, as well as against children, male speakers, and non-native speakers.

Original language	English
Title of host publication	Proceedings of the ESSV Konferenz Elektronische Sprachsignalverarbeitung
Number of pages	8
Publication status	Published - 2023
Event	ESSV Konferenz Elektronische Sprachsignalverarbeitung, Munich, Germany
Duration	1 Mar 2023 → 3 Mar 2023
Conference number	34

Conference

Conference	ESSV Konferenz Elektronische Sprachsignalverarbeitung
Abbreviated title	ESSV 2023
Country/Territory	Germany
City	Munich
Period	1/03/23 → 3/03/23

Cite this

@inproceedings{5051d4fd36d843629f56ef6588cd11ba,

title = "BIAS in Flemish automatic speech recognition",

abstract = "Research has shown that automatic speech recognition (ASR) systems exhibit biases against different speaker groups, e.g., based on age or gender. This paper presents an investigation into bias in recent Flemish ASR. Seeing as Belgian Dutch, which is also known as Flemish, is often not included in Dutch ASR systems, a state-of-the-art ASR system for Dutch is trained using the Netherlandic Dutch data from the Spoken Dutch Corpus. Using the Flemish data from the JASMIN-CGN corpus, word error rates for various regional variants of Flemish are then compared. In addition, the most misrecognized phonemes are compared across speaker groups. The evaluation confirms a bias against speakers from West Flanders and Limburg, as well as against children, male speakers, and non-native speakers. ",

author = "Aaricia Herygers and Vass Verkhodanova and Matt Coler and O.E. Scharenborg and Munir Georges",

year = "2023",

language = "English",

isbn = "978-3-95908-303-4",

booktitle = "Proceedings of the ESSV Konferenz Elektronische Sprachsignalverarbeitung",

note = "ESSV Konferenz Elektronische Sprachsignalverarbeitung, ESSV 2023 ; Conference date: 01-03-2023 Through 03-03-2023",

}

TY - GEN

T1 - BIAS in Flemish automatic speech recognition

AU - Herygers, Aaricia

AU - Verkhodanova, Vass

AU - Coler, Matt

AU - Scharenborg, O.E.

AU - Georges, Munir

N1 - Conference code: 34

PY - 2023

Y1 - 2023

AB - Research has shown that automatic speech recognition (ASR) systems exhibit biases against different speaker groups, e.g., based on age or gender. This paper presents an investigation into bias in recent Flemish ASR. Seeing as Belgian Dutch, which is also known as Flemish, is often not included in Dutch ASR systems, a state-of-the-art ASR system for Dutch is trained using the Netherlandic Dutch data from the Spoken Dutch Corpus. Using the Flemish data from the JASMIN-CGN corpus, word error rates for various regional variants of Flemish are then compared. In addition, the most misrecognized phonemes are compared across speaker groups. The evaluation confirms a bias against speakers from West Flanders and Limburg, as well as against children, male speakers, and non-native speakers.

M3 - Conference contribution

SN - 978-3-95908-303-4

BT - Proceedings of the ESSV Konferenz Elektronische Sprachsignalverarbeitung

T2 - ESSV Konferenz Elektronische Sprachsignalverarbeitung

Y2 - 1 March 2023 through 3 March 2023

ER -
