Personalised models for speech detection from body movements using transductive parameter transfer

Ekin Gedik*, Hayley Hung

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

17 Citations (Scopus)
57 Downloads (Pure)

Abstract

We investigate the task of detecting speakers in crowded environments using a single body-worn triaxial accelerometer. Such behaviour is very challenging to model, as people's body movements during speech vary greatly. Similar to previous studies, we assume that body movements are indicative of speech, and we show experimentally, on a 3-h real-world dataset of 18 people, that transductive parameter transfer learning (Zen et al. in Proceedings of the 16th international conference on multimodal interaction, ACM, 2014) can better model individual differences in speaking behaviour, significantly improving on state-of-the-art performance. We also discuss the challenges introduced by the in-the-wild nature of our dataset and show experimentally how they affect detection performance. We reinforce the case for an adaptive approach by comparing the speech detection problem to a more traditional activity (i.e. walking). Finally, we analyse the transfer by considering different source sets, which allows a deeper investigation of the nature of both speech and body movements in the context of transfer learning.
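A minimal sketch of the transductive parameter transfer idea referenced above (following Zen et al. 2014), assuming per-subject linear SVMs and a kernel ridge regression from dataset descriptors to classifier parameters; the descriptor choice, kernel, and all function and variable names are illustrative assumptions, not the paper's exact pipeline:

```python
# Sketch of transductive parameter transfer (TPT): personalised classifiers are
# predicted for an unseen subject from their unlabelled data distribution alone.
# Assumptions (not from the abstract): linear SVMs per source subject, a simple
# mean-feature descriptor per subject, and RBF kernel ridge regression.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.kernel_ridge import KernelRidge

def dataset_descriptor(X):
    """Summarise a subject's unlabelled accelerometer features (here: the mean)."""
    return X.mean(axis=0)

def fit_tpt(source_sets):
    """source_sets: list of (X_i, y_i) per source subject, y_i in {0, 1} (speaking or not)."""
    descriptors, params = [], []
    for X, y in source_sets:
        svm = LinearSVC().fit(X, y)                      # personalised classifier per subject
        params.append(np.hstack([svm.coef_.ravel(), svm.intercept_]))
        descriptors.append(dataset_descriptor(X))
    # Regress classifier parameters from the subjects' data-distribution descriptors.
    return KernelRidge(kernel="rbf").fit(np.vstack(descriptors), np.vstack(params))

def predict_target(reg, X_target):
    """Predict speaking status for an unseen subject without using any of their labels."""
    theta = reg.predict(dataset_descriptor(X_target)[None, :]).ravel()
    w, b = theta[:-1], theta[-1]
    return (X_target @ w + b > 0).astype(int)
```

The key property, as in the abstract, is that the target subject contributes only unlabelled data: their personalised decision boundary is inferred from how their movement statistics relate to those of the source subjects.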

Original language: English
Pages (from-to): 723-737
Number of pages: 15
Journal: Personal and Ubiquitous Computing
Volume: 21
Issue number: 4
DOIs
Publication status: Published - 1 Aug 2017

Keywords

  • Human behaviour
  • Social actions
  • Social signal processing
  • Transfer learning
  • Wearable sensors
