Temporal Blind Spots in Large Language Models

Jonas Wallat, Adam Jatowt, Avishek Anand

Research output: Chapter in Book/Conference proceedings › Conference contribution › Scientific › peer-review

Abstract

Large language models (LLMs) have recently gained significant attention due to their unparalleled zero-shot performance on various natural language processing tasks. However, the pre-training data used in LLMs is often confined to a specific corpus, resulting in inherent freshness and temporal scope limitations. Consequently, this raises concerns regarding the effectiveness of LLMs for tasks involving temporal intents. In this study, we aim to investigate the underlying limitations of general-purpose LLMs when deployed for tasks that require a temporal understanding. We pay particular attention to handling factual temporal knowledge through three popular temporal QA datasets. Specifically, we observe low performance on detailed questions about the past and, surprisingly, on rather recent information. In manual and automatic testing, we find multiple temporal errors and characterize the conditions under which QA performance deteriorates. Our analysis contributes to understanding LLM limitations and offers valuable insights into developing future models that can better cater to the demands of temporally oriented tasks. The code is available at https://github.com/jwallat/temporalblindspots.
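As an illustration of the kind of automatic testing the abstract describes, the sketch below probes a model with time-scoped questions, scores answers by lenient exact match, and groups accuracy by the year each question refers to. This is a minimal, hypothetical harness: `toy_model` stands in for a real LLM call, and the two dataset entries are illustrative, not drawn from the three temporal QA datasets used in the paper.

```python
def normalize(text: str) -> str:
    """Lowercase and strip whitespace for a lenient exact-match comparison."""
    return text.strip().lower()

def exact_match(prediction: str, gold: str) -> bool:
    """True if the normalized prediction equals the normalized gold answer."""
    return normalize(prediction) == normalize(gold)

def accuracy_by_year(dataset, model):
    """Return {year: accuracy} over questions whose temporal scope is `year`."""
    hits, totals = {}, {}
    for item in dataset:
        year = item["year"]
        totals[year] = totals.get(year, 0) + 1
        if exact_match(model(item["question"]), item["gold"]):
            hits[year] = hits.get(year, 0) + 1
    return {y: hits.get(y, 0) / totals[y] for y in totals}

# Toy stand-in for an LLM; a real probe would call the model under test here.
def toy_model(question: str) -> str:
    return "London" if "2012" in question else "unknown"

dataset = [
    {"question": "Which city hosted the 2012 Olympics?", "gold": "London", "year": 2012},
    {"question": "Where was WSDM held in 2024?", "gold": "Merida", "year": 2024},
]
print(accuracy_by_year(dataset, toy_model))  # → {2012: 1.0, 2024: 0.0}
```

Binning accuracy by the question's temporal scope is what surfaces the "blind spots" pattern the abstract reports: a per-year breakdown distinguishes weak recall of detailed past facts from missing knowledge of events after the pre-training cutoff.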

Original language: English
Title of host publication: WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 683-692
Number of pages: 10
ISBN (Print): 979-8-4007-0371-3
DOIs
Publication status: Published - 2024
Event: 17th ACM International Conference on Web Search and Data Mining, WSDM 2024 - Merida, Mexico
Duration: 4 Mar 2024 - 8 Mar 2024

Conference

Conference: 17th ACM International Conference on Web Search and Data Mining, WSDM 2024
Country/Territory: Mexico
City: Merida
Period: 4/03/24 - 8/03/24

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author has used Dutch legislation to make this work public.

Keywords

  • large language models
  • question answering
  • temporal information retrieval
  • temporal query intents
