Department of Distributed and Web Information Systems

The department addresses research, technological, and innovation challenges in the field of data management and processing in Web applications, and in distributed environments in general. In this context, we cover issues related to data modelling and management for the Data Web and the Semantic Web, Knowledge Graphs and Ontologies, integration of heterogeneous data sources, Web services, personalized information retrieval and recommendation, as well as sensor networks and peer-to-peer systems.

The department collaborates closely with members of the other departments of the Institute, sharing results, running projects, and jointly preparing project proposals. The recent activities of our group are balanced between research and the development of innovative applications, and are outlined below.

Web of Data. The Department has successfully coordinated SmartDataLake, an H2020 Research and Innovation Action aiming to enable data linking and analysis over heterogeneous data in data lakes. Our work has focused on developing techniques and algorithms for exploring and analyzing the data lake’s contents represented as a Heterogeneous Information Network (HIN). HINs are graphs comprising different types of nodes (entities) and edges (relationships). A core concept for analyzing HINs is the metapath. Metapaths represent relationships of different semantics between entities of the same or different types, providing a mechanism for exploring and analyzing a HIN from multiple perspectives. Thus, they are fundamental for several types of analyses in HINs. Our outcomes include efficient and scalable algorithms for similarity search and exploration, entity resolution and ranking, link prediction, community detection, and change detection.
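To make the metapath concept concrete, the following is a minimal sketch, not the SmartDataLake implementation. It assumes a hypothetical toy bibliographic HIN (the node names, node types, and helper functions are all invented for illustration) and counts, for a given start entity, how many metapath instances connect it to each other entity.

```python
from collections import defaultdict

# Hypothetical toy HIN (illustrative data only): each node has a type,
# and edges are undirected links between nodes.
NODE_TYPES = {
    "alice": "Author", "bob": "Author", "carol": "Author",
    "p1": "Paper", "p2": "Paper",
    "vldb": "Venue",
}
EDGES = [
    ("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2"),
    ("p1", "vldb"), ("p2", "vldb"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbouring nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def metapath_instances(start, metapath, adj, node_types):
    """Return every path from `start` whose node types match `metapath` in order."""
    if node_types[start] != metapath[0]:
        return []
    paths = [[start]]
    for wanted_type in metapath[1:]:
        # Extend each partial path by one hop, keeping only neighbours
        # of the type required at this position of the metapath.
        paths = [
            path + [nxt]
            for path in paths
            for nxt in sorted(adj[path[-1]])
            if node_types[nxt] == wanted_type
        ]
    return paths


def metapath_neighbors(start, metapath, adj, node_types):
    """Count metapath instances from `start` to each other end node."""
    counts = defaultdict(int)
    for path in metapath_instances(start, metapath, adj, node_types):
        end = path[-1]
        if end != start:  # ignore paths that return to the start entity
            counts[end] += 1
    return dict(counts)


adj = build_adjacency(EDGES)
# Author-Paper-Author: co-authorship.
coauthors = metapath_neighbors("alice", ["Author", "Paper", "Author"], adj, NODE_TYPES)
# Author-Paper-Venue-Paper-Author: authors publishing in the same venue.
same_venue = metapath_neighbors(
    "alice", ["Author", "Paper", "Venue", "Paper", "Author"], adj, NODE_TYPES
)
```

In this toy graph the two metapaths relate the same start author to different sets of authors, which illustrates how metapaths expose different semantics over the same HIN. Path enumeration like this grows quickly with metapath length, so it is only suitable for small examples.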

Dynamics and Evolution of the Data Web. This line of work treats changes to data as interconnected objects that have complex structure with semantic and temporal characteristics, rather than as isolated low-level transformations on data. Our work was supported by the FP7-ICT project DIACHRON, which concluded very successfully in 2016. We continued to work in this direction throughout the present period with two PhDs, entitled “Managing, Querying and Analyzing Big Data on the Web” (Marios Meimaris, 2018) and “Managing Evolution in Web Data through Complex Changes” (Theodora Galani, 2021). The problems we addressed include data models and query languages that treat complex changes as first-class citizens, complex change representation and detection, and RDF versioning and evolution management.

Geosocial networks. The Department has successfully coordinated SLIPO, an H2020 Innovation Action developing technologies for the scalable and quality-assured integration of Points of Interest (POI) Big Data assets. We have developed and applied linked data technologies to address the limitations, gaps, and challenges of the current landscape in integrating, enriching, and sharing POI data. Specifically, we have developed effective and scalable methods for transforming conventional POI formats and schemas into RDF data, interlinking POI entities from different datasets, enriching POI entities with additional metadata, and offering value-added services.

Leveraging Social Data. Our work in this area followed several directions. (1) We used knowledge graphs to provide better “semantic” recommendations of potentially interesting tweets to Twitter users, as well as recommendations of similar users to “follow”. (2) We developed PHONY, a system for generating “feature-agnostic” datasets for fake news detection in social media. (3) Through an internal project (ended in 2019), we continued the development of TwitHoard, an innovative tool for managing data collection campaigns on Twitter. The work in this area resulted in a PhD entitled “Leveraging social networks and knowledge graphs for discovering and recommending engaging and credible information” (Danai Pla-Karidi, 2022).

Urban Innovation. The Department had a leading role in the City.Risks project, an H2020 Research and Innovation Action aiming to increase citizens' perception of security in large urban environments. Our work has focused on methods and algorithms for collecting and processing crime data and related location information from open data catalogues and other Web sources, in order to analyze and predict the spatial distribution of crime in urban areas.
Moreover, in the context of the GSRI-funded project VR-Park, we have developed an innovative system for assessing the activity of visitors in urban parks, using the Pedion Areos park as a case study.