Abstract
Although popular and effective, large language models (LLMs) are characterised by a performance vs. transparency trade-off that hinders their applicability to sensitive scenarios. This is the main reason behind the many approaches focusing on local post-hoc explanations recently proposed by the XAI community. However, to the best of our knowledge, a thorough comparison among available explainability techniques is currently missing, mainly due to the lack of a general metric to measure their benefits. We compare state-of-the-art local post-hoc explanation mechanisms for models trained on moral value classification tasks, based on a measure of correlation. By relying on a novel framework for comparing global impact scores, our experiments show that most local post-hoc explainers are only loosely correlated, and highlight huge discrepancies in their results: their “quarrel” about explanations. Finally, we compare the distribution of impact scores obtained from each local post-hoc explainer with human-made dictionaries, and point out that there is no correlation between explanation outputs and the concepts humans consider salient.
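The abstract does not spell out how local explanations are turned into global impact scores or which correlation measure is used. As a rough illustration only, the sketch below assumes that each local explanation is a list of (token, score) pairs, that a global impact score is the mean absolute local score per token, and that two explainers are compared with Pearson correlation over their shared tokens. The function names `global_impact`, `pearson` and `explainer_correlation`, and the toy scores, are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def global_impact(local_explanations):
    """Aggregate local (token, score) pairs into one score per token.

    Assumption: the global score is the mean absolute local score.
    """
    buckets = defaultdict(list)
    for explanation in local_explanations:
        for token, score in explanation:
            buckets[token].append(abs(score))
    return {token: mean(scores) for token, scores in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def explainer_correlation(impact_a, impact_b):
    """Correlate two explainers' global scores over the tokens both scored."""
    shared = sorted(set(impact_a) & set(impact_b))
    return pearson([impact_a[t] for t in shared],
                   [impact_b[t] for t in shared])


# Toy local explanations from two hypothetical explainers for the same two texts.
explainer_a = [[("care", 0.8), ("the", 0.1)], [("harm", -0.6), ("the", 0.1)]]
explainer_b = [[("care", 0.1), ("the", 0.7)], [("harm", 0.2), ("the", -0.9)]]

impact_a = global_impact(explainer_a)
impact_b = global_impact(explainer_b)
correlation = explainer_correlation(impact_a, impact_b)
print(f"correlation between explainers: {correlation:.2f}")
```

In this toy case the two explainers rank the tokens in roughly opposite order, so the correlation comes out negative, which is the kind of disagreement the abstract refers to as a "quarrel". The same per-token scores could in principle be compared against a human-made dictionary of salient words, although the paper's actual procedure may differ.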
Original language | English |
---|---|
Title of host publication | Explainable and Transparent AI and Multi-Agent Systems - 5th International Workshop, EXTRAAMAS 2023, Revised Selected Papers |
Editors | Davide Calvaresi, Amro Najjar, Andrea Omicini, Rachele Carli, Giovanni Ciatto, Reyhan Aydogan, Yazan Mualla, Kary Främling |
Publisher | Springer |
Pages | 97-115 |
Number of pages | 19 |
ISBN (Print) | 9783031408779 |
Publication status | Published - 2023 |
Event | Proceedings of the 5th International Workshop on EXTRAAMAS 2023 - London, United Kingdom. Duration: 29 May 2023 → 29 May 2023 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14127 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Proceedings of the 5th International Workshop on EXTRAAMAS 2023 |
---|---|
Country/Territory | United Kingdom |
City | London |
Period | 29/05/23 → 29/05/23 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- eXplainable Artificial Intelligence
- Local Post-hoc Explanations
- Moral Values Classification
- Natural Language Processing