Answer Quality Aware Aggregation for Extractive QA Crowdsourcing

P. Zhu, Z. Wang, J. Yang, C. Hauff*, A. Anand

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review


Abstract

Quality control is essential for creating extractive question answering (EQA) datasets via crowdsourcing. Aggregating the answers, i.e., word spans within a passage, annotated by different crowd workers is one major means of ensuring dataset quality. However, crowd workers fail to reach a consensus on a considerable portion of questions. We introduce a simple yet effective answer aggregation method that takes into account the relations among the answer, the question, and the context passage. We evaluate answer quality from two views: a question answering model, to determine how confident the QA model is about each answer, and an answer verification model, to determine whether the answer is correct. We then compute aggregation scores from each answer's quality and its contextual embedding produced by pre-trained language models. Experiments on a large real-world crowdsourced EQA dataset show that our framework outperforms baselines by around 16% in precision and effectively conducts answer aggregation for the extractive QA task.
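The abstract describes the aggregation only at a high level, and the paper's exact scoring and combination functions are not reproduced here. As a rough illustration of the general idea, the minimal sketch below combines three assumed per-span inputs (a QA-model confidence, a verifier correctness probability, and a contextual embedding) into one aggregation score per crowd-annotated span. The function name aggregate_answers, the weights, and the multiplicative combination rule are all hypothetical choices for illustration, not the authors' method.

```python
import numpy as np

def aggregate_answers(spans, qa_conf, verifier_prob, embeddings,
                      w_qa=0.5, w_ver=0.5):
    """Toy answer aggregation (illustrative, not the paper's method):
    score each crowd-annotated span by (a) its quality, a weighted mix
    of QA-model confidence and verifier probability, and (b) its
    embedding agreement with the other candidate spans, then return
    the highest-scoring span.

    spans         : list of answer strings from different workers
    qa_conf       : per-span confidence from a QA model, in [0, 1]
    verifier_prob : per-span correctness probability from a verifier
    embeddings    : per-span contextual embeddings, shape (n_spans, dim)
    """
    emb = np.asarray(embeddings, dtype=float)
    # cosine similarity between every pair of candidate spans
    norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = norm @ norm.T
    # average agreement of each span with the other candidates
    # (subtract the self-similarity of 1.0 on the diagonal)
    n = len(spans)
    agreement = (sim.sum(axis=1) - 1.0) / max(n - 1, 1)
    # per-span quality: weighted mix of the two model views
    quality = w_qa * np.asarray(qa_conf) + w_ver * np.asarray(verifier_prob)
    scores = quality * agreement
    best = int(np.argmax(scores))
    return spans[best], scores

# Usage: three workers annotated slightly different spans; the third
# is an outlier. Embeddings are synthetic stand-ins for PLM outputs.
spans = ["Abu Dhabi", "Abu Dhabi, UAE", "December 2022"]
qa_conf = [0.90, 0.80, 0.30]
verifier_prob = [0.95, 0.85, 0.20]
rng = np.random.default_rng(0)
base = rng.normal(size=8)
embeddings = [base + rng.normal(scale=0.1, size=8),
              base + rng.normal(scale=0.1, size=8),
              rng.normal(size=8)]
answer, scores = aggregate_answers(spans, qa_conf, verifier_prob, embeddings)
print(answer)  # the consensus span, "Abu Dhabi", should win
```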
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: EMNLP 2022
Publisher: Association for Computational Linguistics (ACL)
Pages: 6147-6159
Number of pages: 13
DOIs
Publication status: Published - 2022
Event: Conference on Empirical Methods in Natural Language Processing 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 - 11 Dec 2022
https://2022.emnlp.org/

Conference

Conference: Conference on Empirical Methods in Natural Language Processing 2022
Abbreviated title: EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 - 11/12/22
Internet address: https://2022.emnlp.org/
