Abstract
This paper describes our use of mixed incentives and the citizen science portal LanguageARC to prepare, collect, and quality-control a large corpus of object namings. The corpus provides speech data to document the under-represented Guanzhong dialect of Chinese, spoken in Shaanxi province in the environs of Xi’an.
Original language | English |
---|---|
Title of host publication | 2nd Workshop on Novel Incentives in Data Collection from People |
Subtitle of host publication | Models, Implementations, Challenges and Results, NIDCP 2022 - Proceedings at LREC 2022 Workshop - Language Resources and Evaluation Conference |
Editors | James Fiumara, Christopher Cieri, Mark Liberman, Chris Callison-Burch |
Publisher | European Language Resources Association (ELRA) |
Pages | 32-37 |
Number of pages | 6 |
ISBN (Electronic) | 9782493814050 |
Publication status | Published - 2022 |
Event | 2nd Workshop on Novel Incentives in Data Collection from People: Models, Implementations, Challenges and Results, NIDCP 2022 - Marseille, France. Duration: 20 Jun 2022 → 25 Jun 2022 |
Publication series
Name | 2nd Workshop on Novel Incentives in Data Collection from People: Models, Implementations, Challenges and Results, NIDCP 2022 - Proceedings at LREC 2022 Workshop - Language Resources and Evaluation Conference |
---|---|
Conference
Conference | 2nd Workshop on Novel Incentives in Data Collection from People: Models, Implementations, Challenges and Results, NIDCP 2022 |
---|---|
Country/Territory | France |
City | Marseille |
Period | 20/06/22 → 25/06/22 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- annotation
- language resources
- linguistic data
- novel incentives
- under-resourced languages