Duration

2008

Description

The following pages show detailed results of our ontology evolution study, presented at DILS 2008, which covers 16 life science ontologies under active development.

Overview

Ontologies have become increasingly important in recent years, especially in the life sciences. Typically, they consist of a harmonized vocabulary of terms, the so-called concepts, that describe and structure a domain of interest. A prominent example in the life sciences is the Gene Ontology, a heavily used ontology source providing sub-ontologies for molecular functions, biological processes and cellular components. Other life science ontologies are collected and made centrally available by the OBO (Open Biomedical Ontologies) Foundry. In the life sciences, biological objects such as genes and proteins are typically associated (“annotated”) with ontology concepts to describe their properties consistently and semantically, e.g., the molecular functions and biological processes in which proteins are involved.

Life science ontologies are usually modeled explicitly by ontology developers and scientists. They therefore reflect both a community agreement (at least among the ontology developers) and the state of domain knowledge at the time of modeling. Consequently, ontologies evolve whenever they are adapted to new or changed community agreements, or to new research results and insights that affect the covered domain knowledge. Since ontology changes can lead to outdated annotations of biological objects, particularly when ontology concepts have been deleted or have become obsolete, it is of interest how stable ontologies are, and how many and what types of changes have been made.

The analysis covers an interval of 45 months (May 2004 to Feb 2008), i.e., we consider only ontology versions within this interval and at most one ontology version per month. If more than one version of an ontology is available for a month, we use the first available version for our analysis.
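The version selection rule described above can be sketched as follows. This is a minimal illustration and not the tooling used in the study; the function name `first_version_per_month` and the `(release_date, label)` input format are assumptions made for this example.

```python
from datetime import date


def first_version_per_month(versions,
                            start=date(2004, 5, 1),
                            end=date(2008, 2, 29)):
    """Keep at most one version per calendar month: the earliest one.

    `versions` is an iterable of (release_date, label) pairs
    (a hypothetical input format). Versions released outside
    the interval [start, end] are ignored.
    """
    selected = {}
    for released, label in sorted(versions):
        if not (start <= released <= end):
            continue
        key = (released.year, released.month)
        # sorted() yields the earliest release of each month first,
        # so setdefault keeps it and skips later releases of that month.
        selected.setdefault(key, (released, label))
    return [selected[key] for key in sorted(selected)]
```

For instance, given releases on 3 May 2004 and 20 May 2004, only the 3 May version would enter the analysis.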

 

Basic versioning updates

Information about analyzed ontologies and their versions:

  • |C|(start) … number of concepts in first version
  • |C|(end) … number of concepts in latest version
  • grow … ratio between |C|(end) and |C|(start)
  • t(start) … timestamp of first version
  • t(end) … timestamp of latest version
  • k … number of versions
| ontology | \|C\|(start) | \|C\|(end) | grow | t(start) | t(end) | k | description |
|---|---|---|---|---|---|---|---|
| NCI Thesaurus | 35814 | 63924 | 1.78 | May. 04 | Dec. 07 | 39 | broad coverage of cancer domain |
| GeneOntology | 17368 | 25995 | 1.50 | May. 04 | Feb. 08 | 44 | aggregation of all GO sub-ontologies |
| – Biological Process | 8625 | 15001 | 1.74 | May. 04 | Feb. 08 | 44 | annotation of gene products (biological role) |
| – Molecular Function | 7336 | 8818 | 1.20 | May. 04 | Feb. 08 | 44 | annotation of gene products (molecular function) |
| – Cellular Components | 1407 | 2176 | 1.55 | May. 04 | Feb. 08 | 44 | annotation of gene products (cellular location) |
| ChemicalEntities | 10236 | 18007 | 1.76 | Oct. 04 | Jan. 08 | 28 | chemical compounds of biological relevance |
| FlyAnatomy | 6090 | 6222 | 1.02 | Nov. 04 | Dec. 07 | 16 | anatomy of Drosophila melanogaster |
| MammalianPhenotype | 4175 | 6077 | 1.46 | Aug. 05 | Jan. 08 | 15 | terms for annotating mammalian phenotypic data |
| AdultMouseAnatomy | 2416 | 2745 | 1.14 | Aug. 05 | Sep. 07 | 15 | adult anatomy of the mouse (Mus) |
| ZebrafishAnatomy | 1389 | 2172 | 1.56 | Nov. 05 | Oct. 07 | 12 | anatomy and development of the Zebrafish |
| Sequence | 981 | 1463 | 1.49 | Aug. 05 | Feb. 08 | 26 | structured CV for sequence annotation |
| ProteinModification | 1074 | 1128 | 1.05 | Jun. 06 | Nov. 07 | 14 | description of protein chemical modifications |
| CellType | 687 | 857 | 1.25 | Jun. 04 | Jun. 07 | 19 | cell types from prokaryotes to mammals |
| PlantStructure | 681 | 835 | 1.23 | Jul. 05 | Feb. 08 | 22 | plant morphological and anatomical structures |
| ProteinProteinInteraction | 194 | 819 | 4.22 | Aug. 05 | Feb. 08 | 19 | annotation of protein interaction experiments |
| FlyBaseCV | 658 | 693 | 1.05 | Nov. 05 | Apr. 07 | 7 | used for various aspects of annotation by FlyBase |
| Pathway | 427 | 593 | 1.39 | Nov. 05 | Jan. 08 | 22 | CV for pathways, annotation of gene products |
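As a quick check of the grow column, the ratio |C|(end) / |C|(start) can be recomputed from the concept counts listed above. The snippet below is only a sketch; the helper name `growth_ratio` is an assumption for this example, and the three ontologies are an arbitrary sample.

```python
def growth_ratio(c_start, c_end):
    """Return |C|(end) / |C|(start), rounded to two decimals as in the table."""
    return round(c_end / c_start, 2)


# (|C|(start), |C|(end)) pairs copied from the table above.
sizes = {
    "NCI Thesaurus": (35814, 63924),
    "GeneOntology": (17368, 25995),
    "ProteinProteinInteraction": (194, 819),
}

ratios = {name: growth_ratio(start, end) for name, (start, end) in sizes.items()}

# The three GO sub-ontologies add up to the aggregated GeneOntology row.
go_start_total = 8625 + 7336 + 1407
go_end_total = 15001 + 8818 + 2176
```

A ratio close to 1 (e.g., FlyAnatomy) indicates an almost constant size, whereas ProteinProteinInteraction more than quadrupled within the observed period.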

 

Evolution statistics at a glance

Information about evolution in ontologies

  • Full period: May 2004 - Feb. 2008
  • Last year: Feb. 2007 - Feb. 2008
  • Add(full) … average number of added concepts per month in full period
  • Del(full) … average number of deleted concepts per month in full period
  • Obs(full) … average number of concepts marked obsolete per month in full period
  • adr … add-delete ratio: Add(full) / (Del(full) + Obs(full))
  • add-frac(full) … fraction of added concepts per month in full period
  • del-frac(full) … fraction of deleted concepts per month in full period
  • obs-frac(full) … fraction of concepts marked obsolete per month in full period
  • Add(lastYear) … average number of added concepts per month in last year
  • Del(lastYear) … average number of deleted concepts per month in last year
  • Obs(lastYear) … average number of concepts marked obsolete per month in last year
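| ontology | Add(full) | Del(full) | Obs(full) | adr | add-frac(full) | del-frac(full) | obs-frac(full) | Add(lastYear) | Del(lastYear) | Obs(lastYear) |
|---|---|---|---|---|---|---|---|---|---|---|
| NCI Thesaurus | 627 | 2 | 12 | 42.4 | 1.3% | 0.0% | 0.0% | 416 | 0 | 5 |
| GeneOntology | 200 | 12 | 4 | 12.2 | 0.9% | 0.1% | 0.0% | 222 | 20 | 5 |
| – Biological Process | 146 | 7 | 2 | 16.2 | 1.2% | 0.1% | 0.0% | 133 | 10 | 2 |
| – Molecular Function | 36 | 3 | 2 | 6.8 | 0.4% | 0.0% | 0.0% | 69 | 7 | 3 |
| – Cellular Components | 18 | 2 | 0 | 8.9 | 1.0% | 0.1% | 0.0% | 19 | 3 | 0 |
| ChemicalEntities | 256 | 62 | 0 | 4.1 | 1.8% | 0.5% | 0.0% | 384 | 67 | 0 |
| FlyAnatomy | 5 | 1 | 1 | 3.3 | 0.1% | 0.0% | 0.0% | 6 | 0 | 0 |
| MammalianPhenotype | 65 | 2 | 9 | 6.0 | 1.2% | 0.0% | 0.2% | 74 | 2 | 3 |
| AdultMouseAnatomy | 11 | 0 | 0 | 30.9 | 0.4% | 0.0% | 0.0% | 1 | 0 | 0 |
| ZebrafishAnatomy | 33 | 5 | 1 | 5.5 | 1.8% | 0.3% | 0.1% | 45 | 2 | 1 |
| Sequence | 19 | 3 | 2 | 4.1 | 1.5% | 0.3% | 0.2% | 19 | 0 | 0 |
| ProteinModification | 5 | 2 | 1 | 1.5 | 0.4% | 0.2% | 0.1% | 7 | 0 | 2 |
| CellType | 5 | 1 | 0 | 2.8 | 0.7% | 0.2% | 0.1% | 1 | 0 | 0 |
| PlantStructure | 5 | 0 | 1 | 6.1 | 0.7% | 0.0% | 0.1% | 3 | 0 | 0 |
| ProteinProteinInteraction | 21 | 0 | 0 | 41.7 | 2.7% | 0.0% | 0.2% | 4 | 0 | 0 |
| FlyBaseCV | 1 | 0 | 1 | 2.1 | 0.2% | 0.0% | 0.1% | 0 | 0 | 0 |
| Pathway | 7 | 1 | 0 | 7.9 | 1.3% | 0.2% | 0.0% | 6 | 2 | 0 |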
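To make the measures above more concrete, the following sketch shows one way the changes between two ontology versions could be counted and how adr follows from them. It assumes that versions are compared by concept identifier and that each concept carries an obsolete flag; the `{concept_id: is_obsolete}` format and both function names are hypothetical and are not taken from the study.

```python
def diff_versions(old, new):
    """Compare two versions given as {concept_id: is_obsolete} dicts
    (an assumed, simplified representation of an ontology version)."""
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    # Concepts present in both versions that switched to obsolete.
    to_obsolete = {c for c in set(old) & set(new) if new[c] and not old[c]}
    return added, deleted, to_obsolete


def add_delete_ratio(add, delete, obsolete):
    """adr = Add / (Del + Obs); returns None if nothing was removed."""
    removed = delete + obsolete
    return add / removed if removed else None


# Toy example with made-up concept identifiers.
old = {"A": False, "B": False, "C": False}
new = {"A": False, "B": True, "D": False, "E": False}
added, deleted, to_obsolete = diff_versions(old, new)
adr = add_delete_ratio(len(added), len(deleted), len(to_obsolete))
```

The table lists monthly averages rounded to whole numbers, so recomputing adr from the rounded Add, Del and Obs columns reproduces some rows (e.g., 146 / (7 + 2) ≈ 16.2 for Biological Process) but can deviate for rows with very small Del and Obs values.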

 

Comparative Trend Charts

Detailed Evolution Results