Whisper-ATC: Open Models for Air Traffic Control Automatic Speech Recognition with Accuracy

Jan van Doorn, Junzi Sun, J.M. Hoekstra, Patrick Jonk, Vincent de Vries

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Recent advances in machine learning have produced new architectures for automatic speech recognition, such as encoder-decoder transformers. For generic speech recognition, very high accuracies are already achievable. In air traffic control, however, automatic speech recognition has traditionally relied on domain-specific models built from limited training data. This study applies this newer transformer architecture to air traffic control and provides a set of fully open automatic speech recognition models with high accuracy. The paper demonstrates how Whisper, a large-scale, weakly supervised automatic speech recognition model, can be fine-tuned on various air traffic control datasets to improve performance. We also evaluate Whisper models of different sizes. The fine-tuned models achieve word error rates of 13.5% on the ATCO2 dataset and 1.17% on the ATCOSIM dataset with a random split (3.88% with a speaker split). The study also shows that fine-tuning with region-specific data can improve performance by up to 60% in real-world scenarios. Finally, we have open-sourced the code base and the models for future research.
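
The word error rates quoted in the abstract follow the standard definition, WER = (S + D + I) / N, where S, D and I are the substituted, deleted and inserted words needed to turn the reference transcript into the model output, and N is the number of reference words. The block below is a minimal sketch of that definition only. The function name `word_error_rate` and the sample transcripts are illustrative assumptions, and the text normalization and tooling used in the paper are not described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (S + D + I) / N using word-level edit distance.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts (not taken from ATCO2 or ATCOSIM): the hypothesis
# drops the word "flight", so one of nine reference words is in error.
reference = "klm one two three descend flight level eight zero"
hypothesis = "klm one two three descend level eight zero"
wer = word_error_rate(reference, hypothesis)
```

Because N counts reference words only, a hypothesis with many insertions can yield a WER above 1.0.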
Original language: English
Title of host publication: Proceedings International Conference on Research in Air Transportation
Editors: Eric Neiderman, Marc Bourgois, Dave Lovell, Hartmut Fricke
Number of pages: 8
Publication status: Published - 2024
Event: International Conference on Research in Air Transportation - Singapore, Singapore
Duration: 1 Jul 2024 to 4 Jul 2024

Conference

Conference: International Conference on Research in Air Transportation
Abbreviated title: ICRAT 2024
Country/Territory: Singapore
City: Singapore
Period: 1/07/24 to 4/07/24

Keywords

  • Air traffic control
  • automatic speech recognition
  • Whisper
  • machine learning
