24th International Conference on Database Systems for Advanced Applications (DASFAA) 2019: 553-570
Abstract. Developing efficient and scalable techniques for pattern queries over large graphs is crucial for modern applications such as social networks, Web analysis, and bioinformatics. In this paper, we address the problem of efficiently finding the homomorphic matches for tree pattern queries with child and descendant edges (mixed pattern queries) over a large data graph. We propose a novel type of materialized views to accelerate the evaluation. Our materialized views are the sets of occurrence lists of the nodes of the pattern in the data graph. They are stored as compressed bitmaps on the inverted lists of the node labels in the data graph. Reachability information between occurrence list nodes is provided by a node reachability index. This technique not only minimizes the materialization space but also reduces CPU and I/O costs by translating view materialization processing into bitwise operations. We provide conditions for view usability using the concept of pattern node coverage. We design a holistic bottom-up algorithm which efficiently computes pattern query matches in the data graph using bitmap views. An extensive experimental evaluation shows that our method evaluates mixed patterns up to several orders of magnitude faster than existing algorithms.