Detecting and analysing spontaneous oral cancer speech in the wild

Bence Mark Halpern, Rob van Son, Michiel W.M. van den Brekel, Odette Scharenborg

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review


Oral cancer is a disease that affects more than half a million people worldwide every year. Analysis of oral cancer speech has so far focused on read speech. In this paper, we 1) present and 2) analyse a three-hour spontaneous oral cancer speech dataset collected from YouTube, and 3) set baselines for an oral cancer speech detection task on this dataset. The analysis of these explainable machine learning baselines shows that sibilants and stop consonants are the most important indicators for spontaneous oral cancer speech detection.
Original language: English
Title of host publication: Proceedings of Interspeech 2020
Pages: 4826-4830
Number of pages: 5
Publication status: Published - 2020
Event: INTERSPEECH 2020 - Shanghai, Shanghai, China
Duration: 25 Oct 2020 to 29 Oct 2020

Publication series

Name: Interspeech 2020
ISSN (Print): 1990-9772


Conference: INTERSPEECH 2020


  • Corpus
  • Explainable AI
  • Oral cancer speech
  • Pathological speech
