SmartPub: A Platform for Long-Tail Entity Extraction from Scientific Publications

Research output: Chapter in Book/Conference proceedings/Edited volume, Conference contribution, Scientific, peer-review

This demo presents SmartPub, a novel web-based platform that supports the exploration and visualization of shallow meta-data (e.g., author list, keywords) and deep meta-data (long-tail named entities, which are rare and often relevant only in a specific knowledge domain) from scientific publications. The platform collects documents from different sources (e.g., DBLP and arXiv) and extracts domain-specific named entities from the text of the publications using Named Entity Recognizers (NERs), which we can train with minimal human supervision even for rare entity types. The platform further enables interaction with the Crowd for filtering purposes or training data generation, and provides extended visualization and exploration capabilities. SmartPub will be demonstrated using a sample collection of scientific publications focusing on the computer science domain, and will address the entity types Dataset (i.e., a dataset presented or used in a publication) and Methods (i.e., algorithms used to create, enrich, or analyse a data set).
Original language: English
Title of host publication: Companion Proceedings of the The Web Conference 2018
Place of Publication: Geneva
Publisher: International World Wide Web Conferences Steering Committee
Number of pages: 4
ISBN (Electronic): 978-1-4503-5640-4
Publication status: Published - 2018
Event: WWW 2018: The Web Conference - Bridging natural and artificial intelligence worldwide - Lyon, France
Duration: 23 Apr 2018 to 27 Apr 2018


Conference: WWW 2018
Abbreviated title: WWW 2018


  • Information Extraction

