Description
The following pages show detailed results of our ontology evolution study, presented at DILS 2008, which covers 16 life science ontologies under active development.
Overview
Ontologies have become increasingly important in recent years, especially in the life sciences. Typically, they consist of a harmonized vocabulary of terms, the so-called concepts, that describe and structure a domain of interest. A prominent example in the life sciences is the Gene Ontology, a heavily used ontology providing sub ontologies for molecular functions, biological processes and cellular components. Other life science ontologies are collected and made centrally available by the OBO (Open Biomedical Ontologies) Foundry. Biological objects such as genes and proteins are typically associated ("annotated") with ontology concepts to describe their properties consistently and semantically, e.g., the molecular functions and biological processes in which proteins are involved.
Life science ontologies are usually modeled explicitly by ontology developers and scientists. They are therefore shaped mainly by two factors: a community agreement (at least among the ontology developers) and the state of domain knowledge that the ontology represents. Ontologies evolve whenever they are adapted to new or changed community agreements, or to new research results and insights that affect the covered domain knowledge. Since ontology changes can leave annotations of biological objects outdated, particularly when ontology concepts have been deleted or have become obsolete, it is of interest how stable ontologies are, how many changes have been made, and of what type.
The analysis covers an interval of 45 months (May 2004 to Feb 2008), i.e., we consider only ontology versions within this interval and at most one ontology version per month. When more than one version is available for a month, we use the first available version for our analysis.
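The version selection rule described above can be sketched as follows. This is only an illustration, not the tooling used in the study: the function name `first_version_per_month` and the sample release dates are hypothetical.

```python
from datetime import date

def first_version_per_month(versions, start=(2004, 5), end=(2008, 2)):
    """Keep the earliest version of each month inside [start, end].

    `versions` is an iterable of release dates; `start` and `end` are
    (year, month) tuples taken from the interval stated in the text.
    """
    selected = {}
    for released in sorted(versions):
        month = (released.year, released.month)
        # Skip versions outside the interval and later versions of a month
        if start <= month <= end and month not in selected:
            selected[month] = released
    return [selected[m] for m in sorted(selected)]

# Hypothetical release dates, for illustration only
versions = [
    date(2004, 4, 20),  # before the interval, dropped
    date(2004, 5, 3),   # first version in May 2004, kept
    date(2004, 5, 17),  # second version in May 2004, dropped
    date(2004, 7, 1),   # kept
    date(2008, 3, 2),   # after the interval, dropped
]
picked = first_version_per_month(versions)
```

A month without any release simply contributes no version, which is consistent with version counts k below the number of months in the tables.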
Basic versioning updates
Information about analyzed ontologies and their versions:
- |C|(start) … number of concepts in first version
- |C|(end) … number of concepts in latest version
- grow … growth ratio |C|(end) / |C|(start)
- t(start) … timestamp of first version
- t(end) … timestamp of latest version
- k … number of versions
ontology | \|C\|(start) | \|C\|(end) | grow | t(start) | t(end) | k | description |
---|---|---|---|---|---|---|---|
NCI Thesaurus | 35814 | 63924 | 1.78 | May. 04 | Dec. 07 | 39 | broad coverage of cancer domain |
GeneOntology | 17368 | 25995 | 1.50 | May. 04 | Feb. 08 | 44 | aggregation of all GO sub ontologies |
– Biological Process | 8625 | 15001 | 1.74 | May. 04 | Feb. 08 | 44 | annotation of gene products (biological role) |
– Molecular Function | 7336 | 8818 | 1.20 | May. 04 | Feb. 08 | 44 | annotation of gene products (molecular function) |
– Cellular Components | 1407 | 2176 | 1.55 | May. 04 | Feb. 08 | 44 | annotation of gene products (cellular location) |
ChemicalEntities | 10236 | 18007 | 1.76 | Oct. 04 | Jan. 08 | 28 | chemical compounds of biological relevance |
FlyAnatomy | 6090 | 6222 | 1.02 | Nov. 04 | Dec. 07 | 16 | anatomy of Drosophila melanogaster |
MammalianPhenotype | 4175 | 6077 | 1.46 | Aug. 05 | Jan. 08 | 15 | terms for annotating mammalian phenotypic data |
AdultMouseAnatomy | 2416 | 2745 | 1.14 | Aug. 05 | Sep. 07 | 15 | adult anatomy of the mouse (Mus) |
ZebrafishAnatomy | 1389 | 2172 | 1.56 | Nov. 05 | Oct. 07 | 12 | anatomy and development of the Zebrafish |
Sequence | 981 | 1463 | 1.49 | Aug. 05 | Feb. 08 | 26 | structured CV for sequence annotation |
ProteinModification | 1074 | 1128 | 1.05 | Jun. 06 | Nov. 07 | 14 | description of protein chemical modifications |
CellType | 687 | 857 | 1.25 | Jun. 04 | Jun. 07 | 19 | cell types from prokaryotes to mammals |
PlantStructure | 681 | 835 | 1.23 | Jul. 05 | Feb. 08 | 22 | plant morphological and anatomical structures |
ProteinProteinInteraction | 194 | 819 | 4.22 | Aug. 05 | Feb. 08 | 19 | annotation of protein interaction experiments |
FlyBaseCV | 658 | 693 | 1.05 | Nov. 05 | Apr. 07 | 7 | used for various aspects of annotation by FlyBase |
Pathway | 427 | 593 | 1.39 | Nov. 05 | Jan. 08 | 22 | CV for pathways, annotation of gene products |
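As a quick check of the grow column, the ratio can be recomputed from the concept counts in the table. The sketch below uses three rows copied from the table; the helper name `growth_ratio` is our own.

```python
def growth_ratio(c_start, c_end):
    """Ratio between the concept counts of the latest and the first version."""
    return c_end / c_start

# (|C|(start), |C|(end)) pairs copied from the table above
sizes = {
    "NCI Thesaurus": (35814, 63924),
    "GeneOntology": (17368, 25995),
    "ProteinProteinInteraction": (194, 819),
}

# Round to two decimals, as in the grow column
ratios = {name: round(growth_ratio(s, e), 2) for name, (s, e) in sizes.items()}
```

A ratio close to 1.0 (e.g., FlyAnatomy) indicates an ontology whose size barely changed over the period, whereas ProteinProteinInteraction more than quadrupled.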
Evolution statistics at a glance
Information about evolution in ontologies
- Full period: May 2004 to Feb 2008
- Last year: Feb 2007 to Feb 2008
- Add(full) … average number of added concepts per month in full period
- Del(full) … average number of deleted concepts per month in full period
- Obs(full) … average number of concepts marked obsolete per month in full period
- adr … add-delete ratio Add(full) / (Del(full) + Obs(full))
- add-frac(full) … fraction of added concepts per month in full period
- del-frac(full) … fraction of deleted concepts per month in full period
- obs-frac(full) … fraction of concepts marked obsolete per month in full period
- Add(lastYear) … average number of added concepts per month in last year
- Del(lastYear) … average number of deleted concepts per month in last year
- Obs(lastYear) … average number of concepts marked obsolete per month in last year
ontology | Add(full) | Del(full) | Obs(full) | adr | add-frac(full) | del-frac(full) | obs-frac(full) | Add(lastYear) | Del(lastYear) | Obs(lastYear) |
---|---|---|---|---|---|---|---|---|---|---|
NCI Thesaurus | 627 | 2 | 12 | 42.4 | 1.3% | 0.0% | 0.0% | 416 | 0 | 5 |
GeneOntology | 200 | 12 | 4 | 12.2 | 0.9% | 0.1% | 0.0% | 222 | 20 | 5 |
– Biological Process | 146 | 7 | 2 | 16.2 | 1.2% | 0.1% | 0.0% | 133 | 10 | 2 |
– Molecular Function | 36 | 3 | 2 | 6.8 | 0.4% | 0.0% | 0.0% | 69 | 7 | 3 |
– Cellular Components | 18 | 2 | 0 | 8.9 | 1.0% | 0.1% | 0.0% | 19 | 3 | 0 |
ChemicalEntities | 256 | 62 | 0 | 4.1 | 1.8% | 0.5% | 0.0% | 384 | 67 | 0 |
FlyAnatomy | 5 | 1 | 1 | 3.3 | 0.1% | 0.0% | 0.0% | 6 | 0 | 0 |
MammalianPhenotype | 65 | 2 | 9 | 6.0 | 1.2% | 0.0% | 0.2% | 74 | 2 | 3 |
AdultMouseAnatomy | 11 | 0 | 0 | 30.9 | 0.4% | 0.0% | 0.0% | 1 | 0 | 0 |
ZebrafishAnatomy | 33 | 5 | 1 | 5.5 | 1.8% | 0.3% | 0.1% | 45 | 2 | 1 |
Sequence | 19 | 3 | 2 | 4.1 | 1.5% | 0.3% | 0.2% | 19 | 0 | 0 |
ProteinModification | 5 | 2 | 1 | 1.5 | 0.4% | 0.2% | 0.1% | 7 | 0 | 2 |
CellType | 5 | 1 | 0 | 2.8 | 0.7% | 0.2% | 0.1% | 1 | 0 | 0 |
PlantStructure | 5 | 0 | 1 | 6.1 | 0.7% | 0.0% | 0.1% | 3 | 0 | 0 |
ProteinProteinInteraction | 21 | 0 | 0 | 41.7 | 2.7% | 0.0% | 0.2% | 4 | 0 | 0 |
FlyBaseCV | 1 | 0 | 1 | 2.1 | 0.2% | 0.0% | 0.1% | 0 | 0 | 0 |
Pathway | 7 | 1 | 0 | 7.9 | 1.3% | 0.2% | 0.0% | 6 | 2 | 0 |
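The adr column can be approximated from the other columns as sketched below. The function name `add_delete_ratio` is hypothetical. The table values were presumably computed from unrounded monthly averages, so recomputing adr from the rounded Add, Del and Obs columns only approximates the published figures, and it is undefined where both Del and Obs round to 0 (e.g., AdultMouseAnatomy).

```python
def add_delete_ratio(added, deleted, obsoleted):
    """adr = Add / (Del + Obs); returns None when nothing was removed."""
    removed = deleted + obsoleted
    if removed == 0:
        # Rounded inputs can hide small nonzero averages, so report "undefined"
        return None
    return added / removed

# Rounded monthly averages copied from the table above
chemical_entities = add_delete_ratio(256, 62, 0)   # about 4.1, as in the table
adult_mouse_anatomy = add_delete_ratio(11, 0, 0)   # None: Del and Obs round to 0
```

An adr well above 1 means that additions dominate deletions and obsolete markings, which holds for every ontology in the table.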