Abstract
Computing containment relations between massive collections of sets is a fundamental operation in data management, for example in graph analytics and data mining applications. Motivated by recent hardware trends, in this paper we present two novel solutions for computing set-containment joins over massive sets: the Patricia Trie-based Signature Join (PTSJ) and PRETTI+, a Patricia trie enhanced extension of the state-of-the-art PRETTI join. The compact trie structure not only enables efficient use of main-memory, but also significantly boosts the performance of both approaches. By carefully analyzing the algorithms and conducting extensive experiments with various synthetic and real-world datasets, we show that, in many practical cases, our algorithms are an order of magnitude faster than the state-of-the-art.
| Original language | English |
|---|---|
| Title of host publication | 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, April 13-17, 2015 |
| Editors | Johannes Gehrke, Wolfgang Lehner, Kyuseok Shim, Sang Kyun Cha, Guy M. Lohman |
| Publisher | IEEE |
| Pages | 303-314 |
| Number of pages | 12 |
| DOIs | |
| Publication status | Published - 2015 |
| Event | ICDE 2015: International Conference on Data Engineering - Seoul, Korea, Democratic People's Republic of Duration: 13 Apr 2015 → 17 Apr 2015 Conference number: 31 |
Conference
| Conference | ICDE 2015: International Conference on Data Engineering |
|---|---|
| Country/Territory | Korea, Democratic People's Republic of |
| City | Seoul |
| Period | 13/04/15 → 17/04/15 |
Fingerprint
Dive into the research topics of 'Efficient and scalable trie-based algorithms for computing set containment relations'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver