A Comparative Study of Ontology Matching Systems via Inferential Statistics

Majid Mohammadi, Wout Hofman, Yao Hua Tan

Research output: Contribution to journal › Article › Scientific › peer-review



Ontology matching systems are typically compared by their average performance over multiple datasets. This paper instead examines alignment systems using statistical inference, arguing that averaging is statistically unsafe and inappropriate. Statistical tests for comparing two or more alignment systems are reviewed theoretically and empirically. For comparing two systems, the Wilcoxon signed-rank test and McNemar's mid-p and asymptotic tests are recommended for their robustness and statistical safety in different circumstances. For comparing multiple systems, the Friedman and Quade tests and their corresponding post-hoc procedures are studied, and their advantages and disadvantages are discussed. The statistical methods are then applied to the benchmark and multifarm tracks of the Ontology Alignment Evaluation Initiative (OAEI) 2015, and the results are reported and visualized with critical difference diagrams.
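As an illustration of the two-system comparison named in the abstract, the following is a minimal sketch of McNemar's mid-p and asymptotic tests, using only the Python standard library. It is not the authors' implementation. The function names and the inputs `b` and `c` are assumptions made for this example: `b` is taken to be the number of paired items on which only the first system is correct, and `c` the number on which only the second is correct.

```python
from math import comb, erfc, sqrt


def mcnemar_mid_p(b: int, c: int) -> float:
    """Two-sided McNemar mid-p value from the two discordant counts.

    b, c are hypothetical inputs: items where only system 1 (b) or only
    system 2 (c) is correct. Concordant items do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) probabilities for 0..k discordant items.
    pmf = [comb(n, i) * 0.5 ** n for i in range(k + 1)]
    # Exact two-sided p-value minus the probability of the observed count.
    return min(1.0, 2.0 * sum(pmf) - pmf[-1])


def mcnemar_asymptotic(b: int, c: int) -> float:
    """Asymptotic McNemar p-value (chi-square, 1 degree of freedom,
    no continuity correction)."""
    n = b + c
    if n == 0:
        return 1.0
    chi2 = (b - c) ** 2 / n
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Made-up counts for illustration only, not results from the paper.
    print(mcnemar_mid_p(15, 5), mcnemar_asymptotic(15, 5))
```

Both functions depend only on the discordant counts, so a large number of items on which the two systems agree does not change the p-value.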

Original language: English
Pages (from-to): 1-14
Journal: IEEE Transactions on Knowledge and Data Engineering
Publication status: Published - 2018

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.


  • Benchmark testing
  • Bergmann
  • Friedman
  • Geoscience
  • Holm
  • McNemar
  • Nemenyi
  • Ontologies
  • Ontology alignment evaluation
  • paired t-test
  • post-hoc
  • Quade
  • Robustness
  • Shaffer
  • Statistical analysis
  • Task analysis
  • Wilcoxon signed-rank

