Web Information Systems

Web search engines are widely used for searching information on the Web. Their popularity is due to the simple, intuitive search model they offer (i.e., keyword-based search). However, there are use cases where the information need is complex and is part of a so-called creativity cycle. Consider, for example, a researcher who needs to set up her research agenda and generate innovative ideas. She often has the "big picture" for her search plan, that is, an abstraction of thoughts, ideas and concepts that describes the domain to explore. Based on this initial abstraction, she starts gathering information from several data sources. She studies the material and generates hypotheses. Then she refines her abstractions, searches for new information, and so on, starting a new cycle. Such cycles (creativity cycles) enable discovery and innovation.

New search models and techniques are necessary to promote creativity and innovation. Web applications for the biosciences are one of the use cases adopted. A critical objective is to support creativity cycles, and also to provide effective presentation and visualization of the lists of retrieved resources, so that users are guided in their search and exploration during a creativity cycle:

[Search plan abstraction - Information harvest and retrieval - User personalization - Organizing information]

The scientific and technical challenges are the following:

  • Knowledge representation and abstraction. Knowledge abstraction is a popular feature of mind mapping tools. Mind mapping refers to graphical representations of elements such as concepts, ideas, notes, tasks, or other items related to a topic of study. Mind map elements are organized in branches or groups according to the semantic interpretation given by the user. We study methods to exploit mind maps for gathering, organizing and exchanging information on the Web.
  • Web information harvest and retrieval. In a metasearch paradigm, user queries are propagated to several search engines. Relevant resources are retrieved, merged, ranked and presented to the user. For effective harvest and retrieval, intelligent services are needed to orchestrate the metasearch. Depending on the type of resources requested, certain search engines will be favored over others when answering the query. We are interested in (a) automated web scraping methods, (b) retrieval methods that account for data evolution and change, (c) dynamic facet extraction from result lists, and (d) using the semantics of mind maps to improve the precision of the retrieval and harvest tasks. We put emphasis on searching for scientific material (e.g., papers, technical articles, etc.).
  • User personalization. Users with different backgrounds or viewpoints may interpret the same data in different ways. To avoid such ambiguity, searching may exploit the user context under which information becomes relevant, in order to adapt the presentation of the retrieved resources to user needs (i.e., personalized ranking of the results). We are interested in adapting ranked lists to user interests, taking into consideration (a) users' past search behaviour and (b) the semantics of mind maps created by users in the past.
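
To make the metasearch and personalization steps above more concrete, the following is a minimal sketch, not the method used in this work. It assumes that result lists from several engines are merged with weighted reciprocal rank fusion (a common rank-aggregation technique, chosen here only for illustration), and that personalization is approximated by counting terms shared between a resource description and a set of user interest terms, such as labels taken from a mind map. The engine names, resource identifiers, descriptions, function names and the constant k = 60 are all hypothetical.

```python
from collections import defaultdict


def fuse_rankings(rankings, engine_weights=None, k=60):
    """Merge per-engine ranked lists with weighted reciprocal rank fusion.

    rankings: dict mapping an engine name to its ranked list of resource ids.
    engine_weights: optional dict; a weight above 1.0 favors that engine.
    k: smoothing constant (60 is a conventional choice, assumed here).
    """
    engine_weights = engine_weights or {}
    scores = defaultdict(float)
    for engine, results in rankings.items():
        weight = engine_weights.get(engine, 1.0)
        for rank, resource in enumerate(results, start=1):
            scores[resource] += weight / (k + rank)
    # Highest fused score first; ties are broken by resource id.
    return sorted(scores, key=lambda r: (-scores[r], r))


def personalize(resources, descriptions, interest_terms):
    """Re-rank so that resources sharing more terms with the user's
    interests come first. The sort is stable, so resources with equal
    overlap keep their fused order."""
    interests = {term.lower() for term in interest_terms}

    def overlap(resource):
        words = set(descriptions.get(resource, "").lower().split())
        return len(words & interests)

    return sorted(resources, key=lambda r: -overlap(r))


# Hypothetical data: two engines, four resources.
rankings = {
    "engine_a": ["p1", "p2", "p3"],
    "engine_b": ["p3", "p1", "p4"],
}
descriptions = {
    "p1": "protein folding survey",
    "p2": "web crawling basics",
    "p3": "gene expression analysis",
    "p4": "protein gene interaction networks",
}

fused = fuse_rankings(rankings)
personal = personalize(fused, descriptions, ["gene", "protein"])
print(fused)     # ['p1', 'p3', 'p2', 'p4']
print(personal)  # ['p4', 'p1', 'p3', 'p2']
```

Passing engine_weights (for example, a higher weight for an engine specialized in scientific papers) is one simple way to favor certain engines for certain resource types. A real system would likely replace the term-overlap score with a richer model of user context.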