TY - GEN
T1 - Annotator-Centric Active Learning for Subjective NLP Tasks
AU - van der Meer, Michiel
AU - Falk, Neele
AU - Murukannaiah, Pradeep K.
AU - Liscio, Enrico
PY - 2024
Y1 - 2024
N2 - Active Learning (AL) addresses the high costs of collecting human annotations by strategically annotating the most informative samples. However, for subjective NLP tasks, incorporating a wide range of perspectives in the annotation process is crucial to capture the variability in human judgments. We introduce Annotator-Centric Active Learning (ACAL), which incorporates an annotator selection strategy following data sampling. Our objective is two-fold: (1) to efficiently approximate the full diversity of human judgments, and (2) to assess model performance using annotator-centric metrics, which value minority and majority perspectives equally. We experiment with multiple annotator selection strategies across seven subjective NLP tasks, employing both traditional and novel, human-centered evaluation metrics. Our findings indicate that ACAL improves data efficiency and excels in annotator-centric performance evaluations. However, its success depends on the availability of a sufficiently large and diverse pool of annotators to sample from.
UR - http://www.scopus.com/inward/record.url?scp=85217761137&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.emnlp-main.1031
DO - 10.18653/v1/2024.emnlp-main.1031
M3 - Conference contribution
AN - SCOPUS:85217761137
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 18537
EP - 18555
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -