De-DSI: Decentralised Differentiable Search Index

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

This study introduces De-DSI, a novel framework that combines large language models (LLMs) with genuine decentralization for information retrieval by applying the differentiable search index (DSI) concept in a decentralized setting. De-DSI maps previously unseen user queries to document identifiers (docids) without direct access to the documents, operating solely on query-docid pairs. To enhance scalability, an ensemble of DSI models is introduced: the dataset is partitioned into smaller shards, and an individual model is trained on each shard. This approach maintains accuracy by reducing the amount of data each model needs to handle, and it supports scalability by aggregating the outcomes of multiple models. The aggregation uses beam search to identify the top docids from each model and applies a softmax function to normalize their scores; the documents with the highest scores are selected for retrieval. The decentralized implementation achieves retrieval success comparable to centralized methods, with the added benefit that the computational load can be distributed across the network. This setup also allows multimedia items to be retrieved through magnet links, eliminating the need for platforms or intermediaries.
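
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each shard model has already run beam search and returned a list of (docid, raw score) pairs, that softmax is applied separately to each model's beam, and that a docid appearing in several beams keeps its best normalized score. The names `softmax`, `aggregate`, and `top_k`, as well as the example docids, are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# (docid, raw beam-search score, e.g. a sequence log-probability)
Candidate = Tuple[str, float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(per_model_beams: List[List[Candidate]], top_k: int = 5) -> List[Candidate]:
    """Merge the beam outputs of several shard models into one ranked list."""
    merged: Dict[str, float] = {}
    for beam in per_model_beams:
        if not beam:
            continue  # a model that returned nothing contributes nothing
        # Assumption: scores are normalized within each model's own beam.
        probs = softmax([score for _, score in beam])
        for (docid, _), prob in zip(beam, probs):
            # Assumption: keep the best normalized score per docid.
            merged[docid] = max(merged.get(docid, 0.0), prob)
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical beams from two shard models for the same query.
    beams = [
        [("magnet:?xt=urn:btih:aaa", -0.1), ("magnet:?xt=urn:btih:bbb", -2.0)],
        [("magnet:?xt=urn:btih:ccc", -0.5), ("magnet:?xt=urn:btih:ddd", -0.6)],
    ]
    for docid, score in aggregate(beams, top_k=2):
        print(docid, round(score, 3))
```

Normalizing within each beam puts the scores of independently trained models on a common 0-to-1 scale before they are compared, which is one plausible reading of the normalization the abstract mentions.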

Original language: English
Title of host publication: EuroMLSys 2024 - Proceedings of the 2024 4th Workshop on Machine Learning and Systems
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 134-143
Number of pages: 10
ISBN (Print): 979-8-4007-0541-0
Publication status: Published - 2024
Event: 4th Workshop on Machine Learning and Systems, EuroMLSys 2024, held in conjunction with ACM EuroSys 2024 - Athens, Greece
Duration: 22 Apr 2024

Conference

Conference: 4th Workshop on Machine Learning and Systems, EuroMLSys 2024, held in conjunction with ACM EuroSys 2024
Country/Territory: Greece
City: Athens
Period: 22/04/24

Keywords

  • Distributed Systems
  • Information Retrieval
  • Large Language Models (LLMs)
