Answer Quality Aware Aggregation for Extractive QA Crowdsourcing

P. Zhu, Z. Wang, J. Yang, C. Hauff*, A. Anand

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

Quality control is essential for creating extractive question answering (EQA) datasets via crowdsourcing. Aggregating the answers, i.e., word spans within a passage, annotated by different crowd workers is one major means of ensuring dataset quality. However, crowd workers fail to reach a consensus on a considerable portion of questions. We introduce a simple yet effective answer aggregation method that takes into account the relations among the answer, the question, and the context passage. We evaluate answer quality from two views: a question answering model, to determine how confident the QA model is about each answer, and an answer verification model, to determine whether the answer is correct. We then compute aggregation scores from each answer's quality and its contextual embedding produced by pre-trained language models. Experiments on a large real-world crowdsourced EQA dataset show that our framework outperforms baselines by around 16% in precision and effectively conducts answer aggregation for the extractive QA task.
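The abstract describes the aggregation only at a high level, and the paper's exact scoring and combination functions are not reproduced here. As a rough illustration of the general idea, the minimal sketch below combines three assumed per-span inputs (a QA-model confidence, a verifier correctness probability, and a contextual embedding) into one aggregation score per crowd-annotated span. The function name aggregate_answers, the weights, and the multiplicative combination rule are all hypothetical choices for illustration, not the authors' method.

```python
import numpy as np

def aggregate_answers(spans, qa_conf, verifier_prob, embeddings,
                      w_qa=0.5, w_ver=0.5):
    """Toy answer aggregation (illustrative, not the paper's method):
    score each crowd-annotated span by (a) its quality, a weighted mix
    of QA-model confidence and verifier probability, and (b) its
    embedding agreement with the other candidate spans, then return
    the highest-scoring span.

    spans         : list of answer strings from different workers
    qa_conf       : per-span confidence from a QA model, in [0, 1]
    verifier_prob : per-span correctness probability from a verifier
    embeddings    : per-span contextual embeddings, shape (n_spans, dim)
    """
    emb = np.asarray(embeddings, dtype=float)
    # cosine similarity between every pair of candidate spans
    norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = norm @ norm.T
    # average agreement of each span with the other candidates
    # (subtract the self-similarity of 1.0 on the diagonal)
    n = len(spans)
    agreement = (sim.sum(axis=1) - 1.0) / max(n - 1, 1)
    # per-span quality: weighted mix of the two model views
    quality = w_qa * np.asarray(qa_conf) + w_ver * np.asarray(verifier_prob)
    scores = quality * agreement
    best = int(np.argmax(scores))
    return spans[best], scores

# Usage: three workers annotated slightly different spans; the third
# is an outlier. Embeddings are synthetic stand-ins for PLM outputs.
spans = ["Abu Dhabi", "Abu Dhabi, UAE", "December 2022"]
qa_conf = [0.90, 0.80, 0.30]
verifier_prob = [0.95, 0.85, 0.20]
rng = np.random.default_rng(0)
base = rng.normal(size=8)
embeddings = [base + rng.normal(scale=0.1, size=8),
              base + rng.normal(scale=0.1, size=8),
              rng.normal(size=8)]
answer, scores = aggregate_answers(spans, qa_conf, verifier_prob, embeddings)
print(answer)  # the consensus span, "Abu Dhabi", should win
```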
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: EMNLP 2022
Publisher: Association for Computational Linguistics (ACL)
Pages: 6147-6159
Number of pages: 13
DOIs
Publication status: Published - 2022
Event: Conference on Empirical Methods in Natural Language Processing 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 - 11 Dec 2022
https://2022.emnlp.org/

Conference

Conference: Conference on Empirical Methods in Natural Language Processing 2022
Abbreviated title: EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 - 11/12/22
Internet address: https://2022.emnlp.org/
