Abstract
We introduce SparCAssist, a general-purpose risk-assessment tool for machine learning models trained for language tasks. It evaluates a model's risk by inspecting its behavior on counterfactuals, i.e., out-of-distribution instances generated from a given data instance. The counterfactuals are generated by replacing tokens in rationale subsequences identified by ExPred, while the replacements are retrieved using HotFlip or masked-language-model-based algorithms. The main purpose of our system is to help human annotators assess a model's risk before deployment. The counterfactual instances generated during the assessment are a by-product and can be used to train more robust NLP models in the future.
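The generation loop the abstract describes (extract a rationale, then substitute tokens inside it to produce counterfactuals) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `mlm_candidates` is a hypothetical stand-in for a real HotFlip or masked-language-model proposer, and the rationale mask is assumed to come from a rationale extractor such as ExPred.

```python
def mlm_candidates(tokens, position, k=3):
    """Hypothetical stand-in for a masked-language-model proposer.
    A real system would mask tokens[position] and rank the MLM's
    top-k vocabulary predictions; here we use a toy substitution table."""
    toy_table = {
        "good": ["bad", "great", "mediocre"],
        "boring": ["thrilling", "dull", "slow"],
    }
    return toy_table.get(tokens[position], [])[:k]

def generate_counterfactuals(tokens, rationale_mask):
    """Replace tokens inside the rationale subsequence (positions where
    rationale_mask is True) to produce candidate counterfactual inputs."""
    counterfactuals = []
    for pos, in_rationale in enumerate(rationale_mask):
        if not in_rationale:
            continue
        for substitute in mlm_candidates(tokens, pos):
            edited = list(tokens)
            edited[pos] = substitute
            counterfactuals.append(" ".join(edited))
    return counterfactuals

tokens = ["the", "movie", "was", "good"]
rationale_mask = [False, False, False, True]  # rationale covers only "good"
print(generate_counterfactuals(tokens, rationale_mask))
# → ['the movie was bad', 'the movie was great', 'the movie was mediocre']
```

In the tool itself, each counterfactual would then be fed back to the model under assessment so a human annotator can judge whether the prediction change is reasonable.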
| Original language | English |
|---|---|
| Title of host publication | SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval |
| Publisher | Association for Computing Machinery (ACM) |
| Pages | 3219-3223 |
| Number of pages | 5 |
| ISBN (Electronic) | 978-1-4503-8732-3 |
| DOIs | |
| Publication status | Published - 2022 |
| Event | 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 - Madrid, Spain. Duration: 11 Jul 2022 → 15 Jul 2022 |
Publication series

| Name | SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval |
|---|---|
Conference

| Conference | 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 |
|---|---|
| Country/Territory | Spain |
| City | Madrid |
| Period | 11/07/22 → 15/07/22 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne project, 'You share, we take care!' (https://www.openaccess.nl/en/you-share-we-take-care). Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.
Keywords
- counterfactual interpretation
- data-annotation tools
- human-in-the-loop machine learning
- interpretable machine learning