Large language models as oracles for instantiating ontologies with domain-specific knowledge

Giovanni Ciatto*, Andrea Agiollo, Matteo Magnini, Andrea Omicini

*Corresponding author for this work

Research output: Contribution to journal, Article, Scientific, peer-review

Abstract

Background:
Endowing intelligent systems with semantic data commonly requires designing and instantiating ontologies with domain-specific knowledge. Especially in the early phases, these activities are typically performed manually by human experts, who may rely on their own experience. The resulting process is therefore time-consuming, error-prone, and often biased by the personal background of the ontology designer.

Objective:
To mitigate this issue, we propose a novel domain-independent approach to automatically instantiate ontologies with domain-specific knowledge, leveraging large language models (LLMs) as oracles.

Methods:
Starting from (i) an initial schema composed of inter-related classes and properties and (ii) a set of query templates, our method queries the LLM multiple times and generates instances for both classes and properties from its replies. Thus, the ontology is automatically filled with domain-specific knowledge that is compliant with the initial schema. As a result, the ontology is quickly enriched with many instances, which experts may then keep, adjust, discard, or complement according to their own needs and expertise.
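The population loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the schema, the two query templates, the line-based reply parser, and the canned `fake_oracle` (which stands in for a real LLM call) are all hypothetical names and values chosen for illustration only.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical schema: class names, plus properties as (name, domain, range).
CLASSES = ["Meal", "Ingredient"]
PROPERTIES = [("hasIngredient", "Meal", "Ingredient")]

# Hypothetical query templates (assumed wording, not taken from the paper).
CLASS_TEMPLATE = "List examples of {cls}, one per line."
PROPERTY_TEMPLATE = (
    "List the {range} items related to {subject} via {prop}, one per line."
)


def parse_reply(reply: str) -> List[str]:
    """Turn a bulleted or plain multi-line reply into unique item names."""
    items: List[str] = []
    for line in reply.splitlines():
        name = line.strip(" \t-*•")
        if name and name not in items:
            items.append(name)
    return items


def populate(
    oracle: Callable[[str], str],
) -> Tuple[Dict[str, Set[str]], Set[Tuple[str, str, str]]]:
    """Query the oracle once per class, then once per (property, subject)."""
    instances: Dict[str, Set[str]] = {cls: set() for cls in CLASSES}
    relations: Set[Tuple[str, str, str]] = set()

    # Step 1: instantiate classes.
    for cls in CLASSES:
        reply = oracle(CLASS_TEMPLATE.format(cls=cls))
        instances[cls].update(parse_reply(reply))

    # Step 2: instantiate properties; new objects also become class instances.
    for prop, domain, range_ in PROPERTIES:
        for subject in sorted(instances[domain]):
            prompt = PROPERTY_TEMPLATE.format(
                range=range_, subject=subject, prop=prop
            )
            for obj in parse_reply(oracle(prompt)):
                instances[range_].add(obj)
                relations.add((subject, prop, obj))
    return instances, relations


def fake_oracle(prompt: str) -> str:
    """Canned replies standing in for an LLM (illustrative data only)."""
    if "examples of Meal" in prompt:
        return "- Carbonara\n- Caprese"
    if "examples of Ingredient" in prompt:
        return "- Egg"
    if "Carbonara" in prompt:
        return "- Egg\n- Guanciale"
    if "Caprese" in prompt:
        return "- Tomato\n- Mozzarella"
    return ""


if __name__ == "__main__":
    instances, relations = populate(fake_oracle)
    print(instances)
    print(sorted(relations))
```

Passing the oracle in as a plain callable keeps the loop independent of any particular LLM, so different models could be swapped in and their outputs compared, and the returned sets remain available for an expert to review before anything is committed to the ontology.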

Contribution:
We formalise our method in a general way and instantiate it over various LLMs, as well as on a concrete case study. We report experiments rooted in the nutritional domain, where an ontology of meals and their ingredients is automatically instantiated from scratch, starting from a categorisation of meals and their relationships. There, we analyse the quality of the generated ontologies and compare the ontologies attained by exploiting different LLMs. Experimentally, our approach achieves a quality metric that is up to five times higher than the state of the art, while reducing erroneous entities and relations by up to ten times. Finally, we provide a SWOT analysis of the proposed method.
Original language: English
Article number: 112940
Number of pages: 22
Journal: Knowledge-Based Systems
Volume: 310
Publication status: Published - 2025

Keywords

  • Automation
  • Domain-specific knowledge
  • Large language models
  • Nutrition
  • Ontology population
