Efficient Answering of Set Containment Queries for Skewed Item Distributions
In Proceedings of the 14th International Conference on Extending Database Technology (EDBT '11), Uppsala, Sweden, 2011
2011
Conference/Workshop
- Contact persons: Manolis Terrovitis , Timos Sellis
Abstract.
In this paper we address the problem of efficiently evaluating containment
(i.e., subset, equality, and superset) queries over set-valued data. We
propose a novel indexing scheme, the Ordered Inverted File (OIF) which,
differently from the state-of-the-art, indexes set-valued attributes in an
ordered fashion. We introduce query processing algorithms that practically
treat containment queries as range queries over the ordered postings lists of
OIF and exploit this ordering to quickly prune unnecessary page accesses. OIF
is simple to implement and our experiments on both real and synthetic data
show that it greatly outperforms the current state-of-the-art methods for all
three classes of containment queries.