Seminar Bio-Databases WS 02/03: Literature

Prof. Dr. E. Rahm:
Seminar "Bio-Databases" (winter semester 02/03)


Topic 1: Introduction to Bioinformatics / Bio-Databases

Characterization of DNA and protein sequences as character strings
Structure of DNA and proteins; amino acids
Genetic code; transcription
Secondary and tertiary structures

Andreas D. Baxevanis, B. F. Francis Ouellette: Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, John Wiley & Sons, 2001

Head-Gordon, T., Wooley, J.C.: Computational Challenges in Structural and Functional Genomics. IBM Systems Journal, 40(2), 2001

François Bry, Peer Kröger: Introduction to Molecular Biology Databases.

Moussouni, F., Paton, N.W., Hayes, A., Oliver, S., Goble, C.A. and Brass, A.: Database Challenges for Genome Information in the Post Sequencing Phase. Proc. 10th Database and Expert Systems Applications (DEXA), T. Bench-Capon et al. (eds), Springer-Verlag, 540-549, 1999.


Topic 2: Sequence Analysis, Searching Gene Databases

Dan Gusfield: Algorithms on Strings, Trees, and Sequences, Cambridge University Press, 1997

Michael S. Waterman: Introduction to Computational Biology: Maps, Sequences and Genomes

Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison: Biological Sequence Analysis



Topic 3: Overview of Bio-Databases, in Particular Gene Annotation Databases

Apweiler, R.: Introduction to Molecular Biology Databases.

Gelbart, W.M.: Databases in Genomic Research. Science, 1998, Vol. 282.

Baxevanis, A.D.: The Molecular Biology Database Collection: an updated compilation of biological database resources. Nucleic Acids Research, 2001, Vol. 29, No. 1.

EST Databases:

The Gene Discovery Page:

Biological databases:

Biology / Genetics / Microbiology databases:

Inman, J.T. et al.: A High-Throughput Distributed DNA Sequence Analysis and Database System. IBM Systems Journal, 40(2), 2001

Mangalam, H. et al.: GeneX: An Open Source Gene Expression Database and Integrated Tool Set. IBM Systems Journal, 40(2), 2001

Paton, N.W., Khan, S.A., Hayes, A., Moussouni, F., Brass, A., Eilbeck, K., Goble, C.A., Hubbard, S. and Oliver, S.G.: Conceptual Modelling of Genomic Information, Bioinformatics, Vol 16, No 6, 548-558, 2000.

Aberer, K.: The Use of Object-Oriented Data Models for Biomolecular Databases. Proc. OOCNS 95 (Object-Oriented Computing in the Natural Sciences), Heidelberg, Germany, 1995.


Topic 4: Gene Expression Analysis: Introduction; Database Requirements

Main focus:
Differential display, SAGE, Northern / Southern blotting, microarrays

Wen, X., Fuhrman, S., Michaels, G. S., Carr, D. B., Smith, S., Barker, J. L., Somogyi, R. (1998): Large-scale temporal gene expression mapping of central nervous system development. Proc. Natl. Acad. Sci. U. S. A. 95: 334-339

Alwine, J.C., Kemp, D.J. & Stark, G.R.: Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proc. Natl. Acad. Sci. USA 74, 5350-5354 (1977).

Liang, P. & Pardee, A.B.: Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257, 967-971 (1992).

Velculescu, V.E., Zhang, L., Vogelstein, B. & Kinzler, K.W.: Serial analysis of gene expression. Science 270, 484-487 (1995).

Schena, M. et al.: Quantitative Monitoring of Gene Expression Patterns with a complementary DNA Microarray. Science 270, 467-470, 1995.

Lockhart, D.J. et al.: Expression Monitoring by Hybridization to High-density Oligonucleotide Arrays. Nat. Biotechnol. 14, 1675-1680, 1996.

[Na99] Nature Genetics Supplement, 21:1, 1999.

H. H. Do, T. Kirsten, E. Rahm: Comparative Evaluation of Microarray-based Gene Expression Databases. IZBI, Univ. Leipzig, Sep. 2002

Vasmatzis, G. et al.: Discovery of Three Genes Specifically Expressed in Human Prostate by Expressed Sequence Tag Database Analysis. Proc. Natl. Acad. Sci. 95, 300-304, 1998

Bassett, D.E., Eisen, M.B., Boguski, M.S.: Gene expression informatics - it's all in your mine. Nature genetics supplement, 1999, Vol. 21.

Cheung, V.G. et al.: Expression profiling using cDNA microarrays. Nature genetics supplement, 1999, Vol. 21.

Ermolaeva, O. et al.: Data Management and Analysis for Gene Expression Arrays. Nature Genetics, 1998, Vol. 20.

Gardiner-Garden, M., Littlejohn, T.G.: A Comparison of Microarray Databases. Briefings in Bioinformatics 2:2. pp. 143-158. 2001.

Sherlock, G. et al.: The Stanford Microarray Database. Nucleic Acids Research, 2001, Vol. 29, No. 1

Microarray databases on the WWW:


Topic 5: Gene Expression Analysis: Data Mining Methods

Tsur, S.: Data Mining in the Bioinformatics Domain. Proc. 26th Intl. Conf. on VLDB, 2000

Lee, C., Irizarry, K.: The GeneMine System for Genome/Proteome Annotation and Collaborative Data Mining. IBM Systems Journal, 40(2), 2001

Brazma, A. et al.: Gene Expression Data Mining and Analysis. In Jordan, B. (Ed.): DNA Microarrays: Gene Expression Applications, pp. 106-129, Springer-Verlag, 2002.

Dopazo, J.: Microarray data processing and analysis. In Lin, S.M., Johnson, K.F. (Eds.): Microarray Data Analysis II. pp. 43-63, Kluwer Academic, 2002.

Kerr, M.K., M. Martin, and G.A. Churchill: Analysis of variance for gene expression microarray data. J. Comp. Biol. 7:819-837, 2000

Dudoit, S., Y.-H. Yang, M.C. Callow, and T.P. Speed: Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12(1), 2002.

Pan, W.: A Comparative Review of Statistical Methods for Discovering Differentially Expressed Genes in Replicated Microarray Experiments. Bioinformatics, 18, 546-554, 2002.

Brazma and Vilo: Gene expression data analysis. FEBS Letters, 480, 2000, 17-24.

Brown et al.: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. USA, 2000, Vol. 97, 262-267.

D'haeseleer et al.: Mining the gene expression matrix: inferring gene relationships from large scale gene expression data. Information Processing in Cells and Tissues, 1998, 203-212.

Eisen et al.: Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA, 1998, Vol. 95, 14863-14868.

Getz et al.: Coupled two-way clustering analysis of gene microarray data. Proc. Natl. Acad. Sci. USA, 2000, Vol 97, 12079-12084.

Hastie et al.: Supervised harvesting of expression trees. Genome Biology, 2001, Vol. 2, No. 1, 0003.1-0003.12.


Topic 6: Protein Databases

Laurie Hammel, Jignesh M. Patel: Searching on the Secondary Structure of Protein Sequences, VLDB 2002

The Protein Data Bank: Nucleic Acids Research, Vol. 28, No. 1, 235-242, 2000

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 1997

The SWISS-PROT protein sequence data bank and its supplement TrEMBL, Nucleic Acids Research, 1997

Martin Ester, Hans-Peter Kriegel, Thomas Seidl, Xiaowei Xu: Formbasierte Suche nach komplementären 3D-Oberflächen in einer Protein-Datenbank. 373-382, BTW 1995


Topic 7: Pathway Databases

David Gilbert, Seminar "Metabolic Pathway Databases and Tools", 2001

P. D. Karp, M. Krummenacker, S. Paley, and J. Wagg: Integrated pathway-genome databases and their role in drug discovery. Trends in Biotechnology, 17:275-281, 1999.

Kanehisa, M.: KEGG - From genes to biochemical pathways. In "Molecular Biology Databases" (Letovsky, S., ed.), Kluwer Academic Press (1998)

Wittig, U. and A. De Beuckelaer (2001): Analysis and comparison of metabolic pathway databases. Brief Bioinform 2(2): 126-42.


Topic 8: Integration of Biological Data

Aberer, K., Hemm, K.: A Methodology for Building a Data Warehouse in a Scientific Environment. 1st IFCIS Intl. Conf. on Cooperative Information Systems, 1996.

Berti-Equille, L.: Integration of Biological Data and Quality-Driven Source Negotiation. Proceedings ER 2001: 256-269.

Chen, L., Jamil, H. M.: On Using Remote User Defined Functions as Wrappers for Biological Database Interoperability. Technical Report TR-IDB-2002-02. (also in: Special Issue of the International Journal of Cooperative Information Systems (IJCIS) on Data Management and Modeling Support in Bioinformatics, March 2003).

Cornell, M., Paton, N., Goble, C.A. et al.: GIMS - A Data Warehouse for Storage and Analysis of Genome Sequence and Functional Data. Proceedings 2nd IEEE International Symposium on Bioinformatics and Bioengineering (BIBE'2001): 15-22.

Davidson, S.B., Overton, C., Buneman, P.: Challenges in Integrating Biological Data Sources. Computational Biology 2 (1995), pp 557-572.

Eckman, B.A. et al.: Optimized Seamless Integration of Biomolecular Data. Proceedings 2nd IEEE International Symposium on Bioinformatics and Bioengineering (BIBE'2001): 23-32.

Goble, C.A., Stevens, R. et al.: Transparent Access to Multiple Bioinformatics Information Sources. IBM Systems Journal Special issue on deep computing for the life sciences, 40(2):532 - 552, 2001.

Haas, L.M. et al.: DiscoveryLink: A System for Integrated Access to Life Sciences Data Sources. IBM Systems Journal, 40(2), 2001

Jamil, H.M.: Achieving Interoperability of Genome Databases through Intelligent Web Mediator. Proc. 1st IEEE Intern. Symposium on Bioinformatics and Bioengineering (BIBE'2000): 118-125.

Karp, P.D., Krummenacker, M., Paley, S., and Wagg, J.: Integrated Pathway/Genome Databases and their Role in Drug Discovery. Trends in Biotechnology 17(7): 275-281, 1999.

Leser, U.: Designing a Global Information Resource for Molecular Biology. Proc. Datenbanken in Büro, Technik und Wissenschaft (BTW), Freiburg, Germany, pp. 362-369, Springer Verlag.

Also: various further contributions from:
Proceedings 1st IEEE International Symposium on Bioinformatics and Bioengineering (BIBE'2000)
Proceedings 2nd IEEE International Symposium on Bioinformatics and Bioengineering (BIBE'2001)


Topic 9: Query Languages for Bio-Databases

Che, D.; Aberer, K.: A Query System in a Biological Database. Proceedings 11th International Conference on Scientific and Statistical Database Management 1999: 158-167.

Chen, L.; Jamil, H. M.: Supporting Remote User Defined Functions in Heterogeneous Biological Databases. Proceedings IEEE International Conference on Bioinformatics and Biomedical Engineering (BIBE 2001), pp. 144-152, 2001

Jamil, H. M.: GQL: A Reasonable Complex SQL for Genomic Databases. Proceedings 1st IEEE International Symposium on Bioinformatics and Bioengineering (BIBE'2000): 50-59.

Mork, P. et al.: PQL: A Declarative Query Language over Dynamic Biological Data. Proceedings American Medical Informatics Association (AMIA) Annual Symposium, 2002.

Shindyalov, I.N.; Bourne, P.E.: Protein Data Representation and Query Using Optimized Data Decomposition. Proceedings Computer Applications in the Biosciences 1997, 13, 487-496.

SRS (Sequence Retrieval System)

DELPHOS - Protein sequence query and retrieval system

AQL - Acedb Query Language


Topic 10: Web Services in Bioinformatics




Stein, Lincoln: Creating a bioinformatics nation. Nature, vol. 417, 9 May 2002, pp. 119-120.

Stewart, B.: Lincoln Stein's Keynote: Building a Bioinformatics Nation,

Slides for Lincoln Stein's keynote:

Cerami, E.: Web Services for Bioinformatics (with 2 examples)

Whitehead Institute: OmniGene.

Rodden, T.: The myGrid project. (with accompanying slides)