Facet Embeddings for Explorative Analytics in Digital Libraries

Sepideh Mesbah; Kyriakos Fragkeskos; Christoph Lofi; Alessandro Bozzon; Geert Jan Houben

doi:10.1007/978-3-319-67008-9_8

Sepideh Mesbah^*, Kyriakos Fragkeskos, Christoph Lofi, Alessandro Bozzon, Geert Jan Houben

^*Corresponding author for this work

Web Information Systems

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

10 Citations (Scopus)

Abstract

With the increasing amount of scientific publications in digital libraries, it is crucial to capture “deep meta-data” to facilitate more effective search and discovery, like search by topics, research methods, or data sets used in a publication. Such meta-data can also help to better understand and visualize the evolution of research topics or research venues over time. The automatic generation of meaningful deep meta-data from natural-language documents is challenged by the unstructured and often ambiguous nature of publications’ content. In this paper, we propose a domain-aware topic modeling technique called Facet Embedding which can generate such deep meta-data in an efficient way. We automatically extract a set of terms according to the key facets relevant to a specific domain (i.e. scientific objective, used data sets, methods, or software, obtained results), relying only on limited manual training. We then cluster and subsume similar facet terms according to their semantic similarity into facet topics. To showcase the effectiveness and performance of our approach, we present the results of a quantitative and qualitative analysis performed on ten different conference series in a Digital Library setting, focusing on the effectiveness for document search, but also for visualizing scientific trends.
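The final step the abstract describes, grouping similar facet terms into facet topics by semantic similarity, can be illustrated with a minimal sketch. The abstract does not specify the embedding model or the clustering algorithm, so everything below is an assumption made for illustration: the toy three-dimensional vectors, the greedy single-pass grouping, the 0.9 cosine threshold, and the names `cosine`, `cluster_terms`, and `toy_vectors` are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_terms(vectors, threshold=0.9):
    """Greedy single-pass grouping (an illustrative stand-in, not the paper's method).

    Each term joins the first cluster whose seed term (the cluster's first
    member) is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for term, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(term)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([term])
    return clusters


# Hypothetical toy embeddings for extracted facet terms (methods and data sets).
toy_vectors = {
    "svm": [0.9, 0.1, 0.0],
    "support vector machine": [0.88, 0.12, 0.02],
    "random forest": [0.7, 0.6, 0.1],
    "twitter corpus": [0.0, 0.2, 0.95],
    "tweets data set": [0.05, 0.25, 0.9],
}

facet_topics = cluster_terms(toy_vectors)
for topic in facet_topics:
    print(topic)
```

In this toy setup, spelling variants of the same method or data set end up in one group, which is the kind of subsumption the abstract refers to; a real pipeline would use learned embeddings and a more robust clustering method.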

Original language	English
Title of host publication	Research and Advanced Technology for Digital Libraries
Subtitle of host publication	21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings
Editors	Jaap Kamps, Giannis Tsakonas, Yannis Manolopoulos, Lazaros Iliadis, Ioannis Karydis
Publisher	Springer
Pages	86-99
Number of pages	14
ISBN (Electronic)	978-3-319-67008-9
ISBN (Print)	978-3-319-67007-2
DOIs	https://doi.org/10.1007/978-3-319-67008-9_8
Publication status	Published - 2017
Event	21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017 - Thessaloniki, Greece; Duration: 18 Sept 2017 → 21 Sept 2017

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	10450 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017
Country/Territory	Greece
City	Thessaloniki
Period	18/09/17 → 21/09/17

Access to Document

10.1007/978-3-319-67008-9_8

Cite this

Mesbah, S., Fragkeskos, K., Lofi, C., Bozzon, A., & Houben, G. J. (2017). Facet Embeddings for Explorative Analytics in Digital Libraries. In J. Kamps, G. Tsakonas, Y. Manolopoulos, L. Iliadis, & I. Karydis (Eds.), Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings (pp. 86-99). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10450 LNCS). Springer. https://doi.org/10.1007/978-3-319-67008-9_8

Mesbah, Sepideh ; Fragkeskos, Kyriakos ; Lofi, Christoph et al. / Facet Embeddings for Explorative Analytics in Digital Libraries. Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings. editor / Jaap Kamps ; Giannis Tsakonas ; Yannis Manolopoulos ; Lazaros Iliadis ; Ioannis Karydis. Springer, 2017. pp. 86-99 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{4f584fd1794a439cb48227b4536b158d,

title = "Facet Embeddings for Explorative Analytics in Digital Libraries",

abstract = "With the increasing amount of scientific publications in digital libraries, it is crucial to capture “deep meta-data” to facilitate more effective search and discovery, like search by topics, research methods, or data sets used in a publication. Such meta-data can also help to better understand and visualize the evolution of research topics or research venues over time. The automatic generation of meaningful deep meta-data from natural-language documents is challenged by the unstructured and often ambiguous nature of publications{\textquoteright} content. In this paper, we propose a domain-aware topic modeling technique called Facet Embedding which can generate such deep meta-data in an efficient way. We automatically extract a set of terms according to the key facets relevant to a specific domain (i.e. scientific objective, used data sets, methods, or software, obtained results), relying only on limited manual training. We then cluster and subsume similar facet terms according to their semantic similarity into facet topics. To showcase the effectiveness and performance of our approach, we present the results of a quantitative and qualitative analysis performed on ten different conference series in a Digital Library setting, focusing on the effectiveness for document search, but also for visualizing scientific trends.",

author = "Sepideh Mesbah and Kyriakos Fragkeskos and Christoph Lofi and Alessandro Bozzon and Houben, {Geert Jan}",

year = "2017",

doi = "10.1007/978-3-319-67008-9_8",

language = "English",

isbn = "978-3-319-67007-2",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer",

pages = "86--99",

volume = "10450",

editor = "Jaap Kamps and Giannis Tsakonas and Yannis Manolopoulos and Lazaros Iliadis and Ioannis Karydis",

booktitle = "Research and Advanced Technology for Digital Libraries",

note = "21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017 ; Conference date: 18-09-2017 Through 21-09-2017",

}

Mesbah, S, Fragkeskos, K, Lofi, C, Bozzon, A & Houben, GJ 2017, Facet Embeddings for Explorative Analytics in Digital Libraries. in J Kamps, G Tsakonas, Y Manolopoulos, L Iliadis & I Karydis (eds), Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10450 LNCS, Springer, pp. 86-99, 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Thessaloniki, Greece, 18/09/17. https://doi.org/10.1007/978-3-319-67008-9_8

Facet Embeddings for Explorative Analytics in Digital Libraries. / Mesbah, Sepideh; Fragkeskos, Kyriakos; Lofi, Christoph et al.
Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings. ed. / Jaap Kamps; Giannis Tsakonas; Yannis Manolopoulos; Lazaros Iliadis; Ioannis Karydis. Springer, 2017. p. 86-99 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10450 LNCS).

TY - GEN

T1 - Facet Embeddings for Explorative Analytics in Digital Libraries

AU - Mesbah, Sepideh

AU - Fragkeskos, Kyriakos

AU - Lofi, Christoph

AU - Bozzon, Alessandro

AU - Houben, Geert Jan

PY - 2017

Y1 - 2017

N2 - With the increasing amount of scientific publications in digital libraries, it is crucial to capture “deep meta-data” to facilitate more effective search and discovery, like search by topics, research methods, or data sets used in a publication. Such meta-data can also help to better understand and visualize the evolution of research topics or research venues over time. The automatic generation of meaningful deep meta-data from natural-language documents is challenged by the unstructured and often ambiguous nature of publications’ content. In this paper, we propose a domain-aware topic modeling technique called Facet Embedding which can generate such deep meta-data in an efficient way. We automatically extract a set of terms according to the key facets relevant to a specific domain (i.e. scientific objective, used data sets, methods, or software, obtained results), relying only on limited manual training. We then cluster and subsume similar facet terms according to their semantic similarity into facet topics. To showcase the effectiveness and performance of our approach, we present the results of a quantitative and qualitative analysis performed on ten different conference series in a Digital Library setting, focusing on the effectiveness for document search, but also for visualizing scientific trends.

AB - With the increasing amount of scientific publications in digital libraries, it is crucial to capture “deep meta-data” to facilitate more effective search and discovery, like search by topics, research methods, or data sets used in a publication. Such meta-data can also help to better understand and visualize the evolution of research topics or research venues over time. The automatic generation of meaningful deep meta-data from natural-language documents is challenged by the unstructured and often ambiguous nature of publications’ content. In this paper, we propose a domain-aware topic modeling technique called Facet Embedding which can generate such deep meta-data in an efficient way. We automatically extract a set of terms according to the key facets relevant to a specific domain (i.e. scientific objective, used data sets, methods, or software, obtained results), relying only on limited manual training. We then cluster and subsume similar facet terms according to their semantic similarity into facet topics. To showcase the effectiveness and performance of our approach, we present the results of a quantitative and qualitative analysis performed on ten different conference series in a Digital Library setting, focusing on the effectiveness for document search, but also for visualizing scientific trends.

UR - http://www.scopus.com/inward/record.url?scp=85029580458&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-67008-9_8

DO - 10.1007/978-3-319-67008-9_8

M3 - Conference contribution

AN - SCOPUS:85029580458

SN - 978-3-319-67007-2

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 86

EP - 99

BT - Research and Advanced Technology for Digital Libraries

A2 - Kamps, Jaap

A2 - Tsakonas, Giannis

A2 - Manolopoulos, Yannis

A2 - Iliadis, Lazaros

A2 - Karydis, Ioannis

PB - Springer

T2 - 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017

Y2 - 18 September 2017 through 21 September 2017

ER -

Mesbah S, Fragkeskos K, Lofi C, Bozzon A, Houben GJ. Facet Embeddings for Explorative Analytics in Digital Libraries. In Kamps J, Tsakonas G, Manolopoulos Y, Iliadis L, Karydis I, editors, Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Proceedings. Springer. 2017. p. 86-99. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-319-67008-9_8
