Any suggestions? Active schema support for structuring web information

Silviu Homoceanu, Christian Pek, Felix Geilert, Wolf-Tilo Balke

Research output: Chapter in Book/Conference proceedings/Edited volume › Chapter › Scientific › peer-review

Abstract

Backed by major Web players, schema.org is the latest broad initiative for structuring Web information. Unfortunately, a representative analysis of a corpus of 733 million Web documents shows that, a year after its introduction, only 1.56% of documents featured any schema.org annotations. A probable reason is that providing annotations is quite tiresome, which hinders widespread adoption. Here, even state-of-the-art tools like Google’s Structured Data Markup Helper offer only limited support. In this paper we propose SASS, a system for automatically finding high-quality schema suggestions for page content, to ease the annotation process. SASS intelligently blends supervised machine learning techniques with simple user feedback. Moreover, additional support features for binding attributes to values reduce the necessary effort even further. We show that SASS is superior to current tools for schema.org annotations.
Original language: English
Title of host publication: International Conference on Database Systems for Advanced Applications
Publication status: Published - 2014
Externally published: Yes
