Integration of Molecular-Biological Data
Molecular-biological annotation data is continuously being collected, curated and made accessible in numerous public data sources.
Integration of this data is a major challenge in bioinformatics. We developed the GenMapper system that physically integrates
heterogeneous annotation data in a flexible way and supports large-scale analysis on the integrated data. It uses a generic data
model to uniformly represent different kinds of annotations originating from different data sources. Existing associations between
objects, which represent valuable biological knowledge, are explicitly utilized to drive data integration and combine annotation
knowledge from different sources. To serve specific analysis needs, powerful operators are provided to derive tailored annotation
views from the generic data representation. GenMapper is operational and has been successfully used for large-scale functional
profiling of genes.
The current version of GenMapper is available here.
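The idea of a generic representation from which tailored views are derived can be illustrated with a minimal sketch. It is not GenMapper's actual schema or operator set: the names `Obj`, `Store`, `compose` and `annotation_view`, as well as the source names and identifiers in the example, are hypothetical and chosen only for illustration. The sketch assumes that every annotation is stored as an association between two objects, each identified by its source and accession, and that a view for one source is obtained by collecting or composing such associations.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Obj:
    source: str     # name of the data source (illustrative)
    accession: str  # identifier of the object within that source


@dataclass
class Store:
    # associations[(src, dst)] holds (accession_in_src, accession_in_dst) pairs
    associations: dict = field(default_factory=lambda: defaultdict(set))

    def add(self, a: Obj, b: Obj) -> None:
        """Record an existing association between two objects."""
        self.associations[(a.source, b.source)].add((a.accession, b.accession))

    def mapping(self, src: str, dst: str) -> set:
        """Return all stored associations from source src to source dst."""
        return set(self.associations.get((src, dst), set()))


def compose(m1: set, m2: set) -> set:
    """Derive a new mapping by joining two mappings on their shared source."""
    by_mid = defaultdict(set)
    for mid, c in m2:
        by_mid[mid].add(c)
    return {(a, c) for a, mid in m1 for c in by_mid[mid]}


def annotation_view(store: Store, src: str, targets: list) -> dict:
    """Build a per-object view: accession -> {target source: sorted accessions}."""
    view = defaultdict(dict)
    for dst in targets:
        for a, b in store.mapping(src, dst):
            view[a].setdefault(dst, []).append(b)
    return {a: {d: sorted(v) for d, v in cols.items()} for a, cols in view.items()}


# Made-up example data: a probe is linked to a gene, the gene to a function term.
store = Store()
store.add(Obj("Chip", "probe_1"), Obj("Gene", "gene_A"))
store.add(Obj("Gene", "gene_A"), Obj("Function", "term_X"))
probe_to_function = compose(store.mapping("Chip", "Gene"),
                            store.mapping("Gene", "Function"))
```

In this sketch, composing the two stored mappings yields a direct probe-to-function mapping that was never stored explicitly, which is one way existing associations can be used to combine annotation knowledge from different sources.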
We also developed a hybrid approach to integrate annotation data for the expression analysis of genes and proteins. Expression data
is materialized in a data warehouse, while annotation data is integrated virtually according to analysis needs. To facilitate
access to many sources, we use the commercial product SRS (Sequence Retrieval System) from LION bioscience.
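The split between materialized and virtual integration can be sketched as follows. This is only an illustration under stated assumptions: the table layout, the function names and the `fetch` callable are hypothetical, an in-memory SQLite database stands in for the warehouse, and the SRS interface itself is not modelled. Expression values are stored locally, while annotations are requested from an external source only for the genes an analysis actually needs.

```python
import sqlite3
from functools import lru_cache


def build_warehouse(rows):
    """Materialize expression data locally (in-memory SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE expression (gene_id TEXT, sample TEXT, value REAL)")
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def make_annotator(fetch):
    """Wrap a hypothetical remote lookup so each gene is fetched at most once."""
    @lru_cache(maxsize=None)
    def annotate(gene_id):
        return fetch(gene_id)
    return annotate


def highly_expressed_with_annotation(conn, annotate, threshold):
    """Select genes from the warehouse, then pull annotations on demand."""
    cur = conn.execute(
        "SELECT gene_id, MAX(value) FROM expression "
        "GROUP BY gene_id HAVING MAX(value) >= ? ORDER BY gene_id",
        (threshold,),
    )
    return [(gene_id, peak, annotate(gene_id)) for gene_id, peak in cur]
```

Only genes that pass the expression filter trigger an annotation lookup, so the annotation sources are never copied in full.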
BioFuice is a novel approach for integrating data from different private and public data sources and ontologies. BioFuice follows
a peer-to-peer-like data integration approach based on bidirectional mappings. Sources and mappings are associated with a domain
model to support semantically meaningful interoperability. BioFuice extends the generic iFuice integration platform, which provides
specific operators for data fusion and supports workflow-like script programs. BioFuice supports explorative data analysis as well
as query and search capabilities. We have applied BioFuice in different research projects, for example to integrate protein
interactions, to detect non-coding RNA and to annotate genes based on expression experiments.
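The mapping-based, peer-to-peer-like style of integration can be illustrated with a small sketch. It does not reproduce the BioFuice or iFuice operators: the class `MappingGraph`, its methods, and the source names and identifiers used with it are hypothetical. The sketch assumes that each mapping is a set of object pairs between two sources, that every mapping can be followed in both directions, and that a query between sources without a direct mapping is answered by chaining mappings along a path.

```python
from collections import defaultdict, deque


class MappingGraph:
    """Sources are nodes; each bidirectional mapping is an edge between two sources."""

    def __init__(self):
        # edges[src][dst] = {id_in_src: set of ids_in_dst}
        self.edges = defaultdict(dict)

    def add_mapping(self, src, dst, pairs):
        """Register a mapping and make it usable in both directions."""
        fwd = self.edges[src].setdefault(dst, defaultdict(set))
        bwd = self.edges[dst].setdefault(src, defaultdict(set))
        for a, b in pairs:
            fwd[a].add(b)
            bwd[b].add(a)

    def find_path(self, src, dst):
        """Breadth-first search for a shortest chain of mappings from src to dst."""
        queue = deque([[src]])
        seen = {src}
        while queue:
            path = queue.popleft()
            if path[-1] == dst:
                return path
            for nxt in self.edges[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

    def traverse(self, src, ids, dst):
        """Follow the mappings along the path and return the reachable target ids."""
        path = self.find_path(src, dst)
        if path is None:
            return set()
        current = set(ids)
        for a, b in zip(path, path[1:]):
            table = self.edges[a][b]
            current = {t for i in current for t in table.get(i, ())}
        return current
```

Because each mapping is stored in both directions, the same registered mappings answer queries from proteins to annotations and from annotations back to proteins, without routing every source through a central global schema.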
Posters and Talks:
Kirsten, T.; Do, H.H.; Sosna, D.; Rahm, E.; Krohn, K.; Eszlinger, M.; Paschke, R.: Gene Expression Warehousing in Leipzig. Poster for the Workshop on Databases and Data Integration in Genome Research, Berlin, February 2002.