Efficient execution of top-K SPARQL queries

Sara Magliacane; Alessandro Bozzon; Emanuele Della Valle

doi:10.1007/978-3-642-35176-1-22

Sara Magliacane^*, Alessandro Bozzon, Emanuele Della Valle

^*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

36 Citations (Scopus)

Abstract

Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.
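
The contrast the abstract draws between materialize-then-sort and rank-aware, incremental processing can be illustrated with a minimal Python sketch. This is not the join algorithm proposed in the paper: it is a generic threshold-based rank join in the style of HRJN, and the input layout (lists of `(key, score)` pairs sorted by descending score), the additive scoring function, and the names `materialize_then_sort`, `rank_join` and `stats` are all assumptions made for illustration.

```python
import heapq
from itertools import islice
from math import inf


def materialize_then_sort(left, right, k):
    """Baseline: build every join result, sort them all, keep the first k."""
    joined = [(ls + rs, lk) for lk, ls in left for rk, rs in right if lk == rk]
    joined.sort(reverse=True)
    return joined[:k]


def _bound(top, last, done):
    """Upper bound on the score of any join result not produced yet."""
    candidates = []
    for s in (0, 1):
        o = 1 - s
        if done[s] or (done[o] and top[o] is None):
            continue  # no unread tuple on side s, or side o is empty
        unread = inf if last[s] is None else last[s]
        partner = inf if top[o] is None else top[o]
        candidates.append(unread + partner)
    return max(candidates, default=-inf)


def rank_join(left, right, stats):
    """Yield (score, key) join results in non-increasing score order.

    Both inputs must be sorted by descending score. stats["pulled"]
    counts how many input tuples were actually read.
    """
    streams = (iter(left), iter(right))
    seen = ({}, {})        # per side: key -> scores read so far
    top = [None, None]     # highest score read per side
    last = [None, None]    # most recent (lowest) score read per side
    done = [False, False]
    heap = []              # max-heap emulated with negated scores
    side = 0
    while True:
        bound = _bound(top, last, done)
        # A buffered result is safe to emit once no future result can beat it.
        while heap and -heap[0][0] >= bound:
            neg_score, key = heapq.heappop(heap)
            yield (-neg_score, key)
        if all(done):
            return
        if done[side]:
            side = 1 - side
        item = next(streams[side], None)
        if item is None:
            done[side] = True
        else:
            stats["pulled"] += 1
            key, score = item
            if top[side] is None:
                top[side] = score
            last[side] = score
            seen[side].setdefault(key, []).append(score)
            for other in seen[1 - side].get(key, []):
                heapq.heappush(heap, (-(score + other), key))
        side = 1 - side  # alternate between the two inputs


# Hypothetical data: 50 tuples per input, joined on the key.
left = [(f"k{i}", 100 - i) for i in range(50)]
right = [(f"k{i}", 200 - 2 * i) for i in range(50)]
stats = {"pulled": 0}
top3 = list(islice(rank_join(left, right, stats), 3))
```

Because `rank_join` is a generator, asking for only three results stops reading the inputs as soon as the third result is provably final, whereas `materialize_then_sort` always computes the full join. The number of tuples read is available in `stats["pulled"]`.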

Original language	English
Title of host publication	The Semantic Web, ISWC 2012
Subtitle of host publication	11th International Semantic Web Conference, Proceedings
Pages	344-360
Number of pages	17
DOIs	https://doi.org/10.1007/978-3-642-35176-1-22
Publication status	Published - 2012
Externally published	Yes
Event	11th International Semantic Web Conference, ISWC 2012 - Boston, MA, United States
Duration	11 Nov 2012 → 15 Nov 2012

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number	PART 1
Volume	7649 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	11th International Semantic Web Conference, ISWC 2012
Country/Territory	United States
City	Boston, MA
Period	11/11/12 → 15/11/12

Access to Document

10.1007/978-3-642-35176-1-22

Cite this

Magliacane, S., Bozzon, A., & Della Valle, E. (2012). Efficient execution of top-K SPARQL queries. In The Semantic Web, ISWC 2012 : 11th International Semantic Web Conference, Proceedings (pp. 344-360). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7649 LNCS, No. PART 1). https://doi.org/10.1007/978-3-642-35176-1-22

@inproceedings{9ab698200de04a0db8d256ae6e3bb78a,
  title = "Efficient execution of top-K SPARQL queries",
  abstract = "Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.",
  author = "Sara Magliacane and Alessandro Bozzon and {Della Valle}, Emanuele",
  year = "2012",
  doi = "10.1007/978-3-642-35176-1-22",
  language = "English",
  isbn = "978-3-642-35175-4",
  series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
  volume = "7649",
  number = "PART 1",
  pages = "344--360",
  booktitle = "The Semantic Web, ISWC 2012",
  note = "11th International Semantic Web Conference, ISWC 2012 ; Conference date: 11-11-2012 Through 15-11-2012",
}

Magliacane, S, Bozzon, A & Della Valle, E 2012, Efficient execution of top-K SPARQL queries. in The Semantic Web, ISWC 2012 : 11th International Semantic Web Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 1, vol. 7649 LNCS, pp. 344-360, 11th International Semantic Web Conference, ISWC 2012, Boston, MA, United States, 11/11/12. https://doi.org/10.1007/978-3-642-35176-1-22

Efficient execution of top-K SPARQL queries. / Magliacane, Sara; Bozzon, Alessandro; Della Valle, Emanuele.
The Semantic Web, ISWC 2012 : 11th International Semantic Web Conference, Proceedings. 2012. p. 344-360 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7649 LNCS, No. PART 1).

TY - GEN

T1 - Efficient execution of top-K SPARQL queries

AU - Magliacane, Sara

AU - Bozzon, Alessandro

AU - Della Valle, Emanuele

PY - 2012

Y1 - 2012

N2 - Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.

AB - Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.

UR - http://www.scopus.com/inward/record.url?scp=84868533573&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-35176-1-22

DO - 10.1007/978-3-642-35176-1-22

M3 - Conference contribution

AN - SCOPUS:84868533573

SN - 978-3-642-35175-4

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 344

EP - 360

BT - The Semantic Web, ISWC 2012

T2 - 11th International Semantic Web Conference, ISWC 2012

Y2 - 11 November 2012 through 15 November 2012

ER -
