In Inf. Syst. 95: 101616
Abstract. Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., JSON, CSV) generated by scientific experiments, using commodity hardware and without being overwhelmed by the tedious processes of data loading, indexing and query optimization. In this paper, we present our work on enabling efficient query processing over large raw data files for interactive visual exploration and analytics scenarios. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly given the first user query over a raw file and progressively adapted based on user interaction. We evaluate the performance of a prototype implementation against three alternatives and show that our method outperforms them in terms of response time, disk accesses and memory consumption. In particular, during an exploration scenario, the proposed method is in most cases about 5-10× faster than existing solutions and requires significantly fewer memory resources.
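
To make the idea of an index that is built by the first query and then adapted by later ones more concrete, the following is a minimal Python sketch. It is not the VALINOR implementation: the names `Tile`, `TileIndex`, `split_threshold` and `max_depth` are hypothetical, an in-memory list of 2D points stands in for the raw file, and list positions stand in for whatever per-row pointer a real system would keep. The sketch assumes a uniform initial grid and a simple rule that splits a tile into four when a query window only partially covers it.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    x0: float
    y0: float
    x1: float
    y1: float
    depth: int = 0
    rows: list = field(default_factory=list)      # (x, y, row_id) entries
    children: list = field(default_factory=list)  # empty for a leaf tile

    def overlaps(self, qx0, qy0, qx1, qy1):
        # Closed-interval overlap test between the tile and the query window.
        return (self.x0 <= qx1 and qx0 <= self.x1
                and self.y0 <= qy1 and qy0 <= self.y1)

    def contained_in(self, qx0, qy0, qx1, qy1):
        return (qx0 <= self.x0 and self.x1 <= qx1
                and qy0 <= self.y0 and self.y1 <= qy1)

    def split(self):
        # Replace this leaf by four equal sub-tiles and redistribute its rows.
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        d = self.depth + 1
        self.children = [
            Tile(self.x0, self.y0, mx, my, d), Tile(mx, self.y0, self.x1, my, d),
            Tile(self.x0, my, mx, self.y1, d), Tile(mx, my, self.x1, self.y1, d),
        ]
        for x, y, rid in self.rows:
            self.children[(x >= mx) + 2 * (y >= my)].rows.append((x, y, rid))
        self.rows = []


class TileIndex:
    """Hypothetical tile index: built lazily, refined by the queries it serves."""

    def __init__(self, points, grid=2, split_threshold=4, max_depth=8):
        self.points = points          # assumption: stand-in for the raw file
        self.grid = grid
        self.split_threshold = split_threshold
        self.max_depth = max_depth    # guards against endless splitting
        self.tiles = None             # nothing is built until the first query
        self.examined = 0             # rows inspected by the most recent query

    def _build(self):
        # One pass over the data: assign every row to a cell of a uniform grid.
        xs = [p[0] for p in self.points]
        ys = [p[1] for p in self.points]
        x0, y0, g = min(xs), min(ys), self.grid
        w = (max(xs) - x0) / g or 1.0
        h = (max(ys) - y0) / g or 1.0
        self.tiles = [Tile(x0 + i * w, y0 + j * h, x0 + (i + 1) * w, y0 + (j + 1) * h)
                      for j in range(g) for i in range(g)]
        for rid, (x, y) in enumerate(self.points):
            i = min(g - 1, int((x - x0) / w))
            j = min(g - 1, int((y - y0) / h))
            self.tiles[j * g + i].rows.append((x, y, rid))

    def query(self, qx0, qy0, qx1, qy1):
        """Return sorted row ids inside the closed window [qx0, qx1] x [qy0, qy1]."""
        if self.tiles is None:
            self._build()
        self.examined = 0
        hits, stack = [], list(self.tiles)
        while stack:
            t = stack.pop()
            if not t.overlaps(qx0, qy0, qx1, qy1):
                continue
            if t.children:
                stack.extend(t.children)
                continue
            partial = not t.contained_in(qx0, qy0, qx1, qy1)
            if (partial and len(t.rows) > self.split_threshold
                    and t.depth < self.max_depth):
                t.split()             # adapt the index to the explored region
                stack.extend(t.children)
                continue
            self.examined += len(t.rows)
            hits += [rid for x, y, rid in t.rows
                     if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        return hits if not hits else sorted(hits)
```

In this sketch, `TileIndex(points).query(x0, y0, x1, y1)` triggers index construction on the first call, and every later call may refine the tiles around the region being explored, so fewer rows need to be inspected there. Setting `split_threshold` very high disables the adaptation and leaves a static grid, which gives a simple baseline for comparing how many rows each variant examines.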