Challenges in the evaluation of conversational search systems

Gustavo Penha, Claudia Hauff

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review

3 Citations (Scopus)
313 Downloads (Pure)

Abstract

The area of conversational search has gained significant traction in the IR research community, motivated by the widespread use of personal assistants. An often-researched task in this setting is conversation response ranking, that is, retrieving the best response for a given ongoing conversation from a corpus of historic conversations. While this is intuitively an important step towards (retrieval-based) conversational search, the empirical evaluation currently employed to evaluate trained rankers is very far from this setup: typically, an extremely small number (e.g., 10) of non-relevant responses and a single relevant response are presented to the ranker. In a real-world scenario, a retrieval-based system has to retrieve responses from a large pool (e.g., several million responses) or determine that no appropriate response can be found. In this paper, we highlight these critical issues in the offline evaluation schemes for tasks related to conversational search. We argue that the evaluation schemes currently in use have critical limitations and simplify the conversational search tasks to a degree that makes it questionable whether we can trust the findings they deliver.
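To make the gap between the two setups concrete, the following is a minimal sketch and is not taken from the paper. It assumes the negatives in a small candidate list are sampled uniformly at random, without replacement, from all non-relevant responses in the pool, and that the ranker assigns fixed, tie-free scores. Under those assumptions, the chance that the single relevant response ends up at rank 1 in the small list follows from a simple counting argument. The function name and parameters are hypothetical.

```python
from math import comb


def prob_top1_in_sampled_list(full_pool_rank, num_negatives_in_pool, num_sampled_negatives):
    """Probability that the relevant response is ranked first in a small
    candidate list, given its rank in the full pool.

    Assumption (not from the paper): negatives are drawn uniformly at
    random, without replacement, and ranker scores have no ties.
    """
    # Negatives that the ranker scores above the relevant response.
    better = full_pool_rank - 1
    if not 0 <= better <= num_negatives_in_pool:
        raise ValueError("full_pool_rank is inconsistent with the pool size")
    if not 0 <= num_sampled_negatives <= num_negatives_in_pool:
        raise ValueError("cannot sample more negatives than the pool contains")

    # Negatives that the ranker scores below the relevant response.
    worse = num_negatives_in_pool - better

    # The relevant response is first only if every sampled negative is a
    # "worse" one: (ways to pick only worse) / (ways to pick any negatives).
    return comb(worse, num_sampled_negatives) / comb(
        num_negatives_in_pool, num_sampled_negatives
    )


# A response ranked 50th among one million pooled responses would not be
# returned first in a full-pool search, yet with 9 sampled negatives it
# is ranked first almost every time.
p = prob_top1_in_sampled_list(50, 999_999, 9)
print(f"P(rank 1 in a 10-candidate list) = {p:.4f}")
```

In this sketch, a ranker that places the relevant response only 50th in the full pool would still score close to a perfect Recall@1 on 10-candidate lists, which illustrates why results from the small-list setup may not transfer to retrieval from a large pool.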

Original language: English
Title of host publication: KDD 2020 Workshop on Conversational Systems Towards Mainstream Adoption, KDD-Converse 2020
Editors: G. Di Fabbrizio, S. Kallumadi, U. Porwal, T. Taula
Number of pages: 5
Volume: 2666
Publication status: Published - 2020
Event: KDD 2020 Workshop on Conversational Systems Towards Mainstream Adoption, KDD-Converse 2020 - Virtual, Online, United States
Duration: 24 Aug 2020 to 24 Aug 2020
http://ceur-ws.org/Vol-2666/

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR-WS
ISSN (Print): 1613-0073

Conference

Conference: KDD 2020 Workshop on Conversational Systems Towards Mainstream Adoption, KDD-Converse 2020
Abbreviated title: KDD-Converse 2020
Country/Territory: United States
Period: 24/08/20 to 24/08/20

Bibliographical note

Virtual Workshop
