
Natural Language Counterfactual Explanations in Financial Text Classification

Karol Dobiczek, P. Altmeyer, C.C.S. Liem

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review


Abstract

The use of large language model (LLM) classifiers in finance and other high-stakes domains calls for a high level of trustworthiness and explainability. We focus on counterfactual explanations (CE), a form of explainable AI that explains a model’s output by proposing an alternative to the original input that changes the classification. We use three types of CE generators for LLM classifiers and assess the quality of their explanations on a recent dataset consisting of central bank communications. We compare the generators using a selection of quantitative and qualitative metrics. Our findings suggest that non-expert and expert evaluators prefer CE methods that apply minimal changes; however, the methods we analyze might not handle the domain-specific vocabulary well enough to generate plausible explanations. We discuss shortcomings in the choice of evaluation metrics in the literature on text CE generators and propose refined definitions of the fluency and plausibility qualitative metrics.
Original language: English
Title of host publication: Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Pages: 958–972
ISBN (Electronic): 979-8-89176-261-9
Publication status: Published - 2025
Event: 4th Workshop on Generation, Evaluation and Metrics - Vienna, Austria
Duration: 31 Jul 2025 to 1 Aug 2025
Conference number: 4

Workshop

Workshop: 4th Workshop on Generation, Evaluation and Metrics
Abbreviated title: GEM 2025
Country/Territory: Austria
City: Vienna
Period: 31/07/25 to 1/08/25
