Most proposed approaches to data integration rely on the notion of a global schema to provide a unified and consistent view of the underlying data sources. While this has been successful for data warehouses, the effort to integrate new sources is usually high, which makes it difficult for such approaches to scale to many sources. Furthermore, for virtual data integration it is challenging to achieve good data quality.
Our work on web data integration focuses on dynamic information fusion of data sources available on the web. Similar to the idea of mashups, we aim for fast development of data integration applications by reusing existing services and entity search engines within a workflow-like data integration. Integration workflows are defined using a script language supporting powerful generic operators.
Our work on web data integration comprises the following projects:
- With iFuice we developed an approach to information fusion of data sources using instance-based peer-to-peer mappings between them.
- Object Matching is a crucial task in data integration systems. Our MOMA framework can be used for defining object matching workflows in a mapping-based P2P environment.
- Data Integration Applications: Based on the results of the above-mentioned projects we design and implement several domain-specific applications, e.g., BioFuice for the integration of biological data. Based on bibliographic data sources we also performed a comprehensive citation analysis of database publications.
- We are currently working on a mashup framework that aims to support online (ad-hoc) data integration in dynamic web applications.
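
To make the idea of instance-based mappings and generic operators from the project list more concrete, here is a minimal Python sketch. It is not the iFuice script language or the MOMA API. The data, the function names (`normalize`, `match`, `compose`), and the similarity threshold are all assumptions chosen for illustration. A mapping is modeled simply as a set of id pairs between two sources; a matcher derives such a mapping from title similarity, and a compose operator chains it with an existing mapping.

```python
from difflib import SequenceMatcher

# Toy data (invented for illustration): objects of two sources, keyed by local id.
PUBS_A = {"a1": "Information Fusion for Web Data", "a2": "Object Matching Workflows"}
PUBS_B = {"b1": "information fusion for web data.", "b2": "Citation Analysis Basics"}
# A hypothetical existing instance-level mapping from source B to a third source C.
MAP_B_C = {("b1", "c7"), ("b2", "c9")}


def normalize(title):
    """Lowercase, replace non-alphanumeric characters, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def match(source, target, threshold=0.9):
    """Attribute-based matcher: pair objects whose titles are similar enough."""
    result = set()
    for sid, stitle in source.items():
        for tid, ttitle in target.items():
            sim = SequenceMatcher(None, normalize(stitle), normalize(ttitle)).ratio()
            if sim >= threshold:
                result.add((sid, tid))
    return result


def compose(first, second):
    """Chain two mappings: keep (x, z) if some y links x to y and y to z."""
    return {(x, z) for (x, y1) in first for (y2, z) in second if y1 == y2}


# A two-step "workflow": match A against B, then reuse the B-to-C mapping.
map_a_b = match(PUBS_A, PUBS_B)
map_a_c = compose(map_a_b, MAP_B_C)
```

In this sketch the A-to-C correspondences are obtained without comparing A and C directly, which is the kind of mapping reuse a peer-to-peer setting allows. A real matcher would likely combine several attributes and similarity measures rather than a single title comparison.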