Bioinformatics. Sep 1, 2009; 25(17): 2292–2293.
Published online Jun 26, 2009. doi:  10.1093/bioinformatics/btp392
PMCID: PMC2734318

libAnnotationSBML: a library for exploiting SBML annotations

Abstract

Summary: The Systems Biology Markup Language (SBML) is an established community XML format for the markup of biochemical models. With the introduction of SBML level 2 version 3, specific model entities, such as species or reactions, can now be annotated using ontological terms. These annotations, which are encoded using the resource description framework (RDF), make it possible to assign definite terms to individual components, allowing software to unambiguously identify such components and thus link the models to existing data resources.

libSBML is an application programming interface library for the manipulation of SBML files. While libSBML provides the facilities for reading and writing such annotations from and to models, it is beyond the scope of libSBML to provide interpretation of these terms. The libAnnotationSBML library introduced here acts as a layer on top of libSBML, linking SBML annotations to the web services that describe these ontological terms. Two applications that use this library are described: SbmlSynonymExtractor finds name synonyms of SBML model entities, and SbmlReactionBalancer checks SBML files to determine whether specified reactions are elementally balanced.

Availability: http://mcisb.sourceforge.net/

Contact: neil.swainston@manchester.ac.uk

1 INTRODUCTION

The minimum information requested in the annotation of biochemical models (MIRIAM; Le Novère et al., 2005) defines guidelines for annotation of biochemical models. The annotation of models with the MIRIAM standard provides a number of significant advantages in the development of computational tools and applications that can reason over them (Kell and Mendes, 2008).

An example is the task of comparing or merging two biochemical models. Before the introduction of MIRIAM, individual components of SBML models (Hucka et al., 2003) were identified solely by free-text, human-readable, name attributes, often resulting in equivalent components being named differently in different models. As naming conventions are non-standard, it is impossible to definitively match these components computationally, and the process of model merging then requires human input to resolve ambiguities. Providing MIRIAM-compliant annotations allows a component to be unambiguously identified by associating it with one or more terms from publicly available databases such as ChEBI (Degtyarenko et al., 2008) or KEGG (Kanehisa et al., 2000) (Fig. 1).

Fig. 1.
Simplified example of MIRIAM-compliant SBML species elements, annotated with ChEBI and KEGG terms, respectively.

2 FEATURES

The species elements in Figure 1 are both annotated with MIRIAM-compliant terms. libSBML (Bornstein et al., 2008) provides the facility for reading a given SBML element's annotation and hence could be used to determine that species1 and species2 are annotated with ChEBI term CHEBI:4167 and KEGG Compound C00031, respectively. From this, it may be concluded that the compounds represented by these species are different. However, manual inspection of the database references in ChEBI and KEGG shows that both species are annotated with references that share the same chemical structure, and hence are equivalent.

Performing such a comparison computationally is beyond the scope of libSBML. To do so, the annotations must be ‘dereferenced’ by querying the two databases via their web service interfaces. This task is complicated by the fact that each web service exposes its own non-standard interface.

The libAnnotationSBML library creates a unified framework for supporting MIRIAM-compliant annotations by wrapping these divergent web services into a Java API, allowing each web service to be queried in a consistent manner. The library itself can act as a layer on top of the libSBML API.

The library is built dynamically by querying the MIRIAM web service (Laibe and Le Novère, 2007), which provides a collection of data types that are recommended for use in model annotation. The web service provides details of each of these data types, including names, URNs and physical URLs to resources. From this, a collection of Ontology objects is instantiated, one for each data type specified in MIRIAM.

Individual OntologyTerm objects are built up from an Ontology object and a unique identifier. Once instantiated, the OntologyTerm provides a number of methods, specified in Figure 2. Each method is implemented by mapping the call to an appropriate call to the data type's web service, where such a web service exists.
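The pairing of a data type with a unique identifier can be illustrated with a minimal sketch. It assumes annotations written as MIRIAM URNs of the form urn:miriam:<data type>:<identifier>, with the identifier percent-encoded (for example urn:miriam:obo.chebi:CHEBI%3A4167). The class MiriamUrn and its fields are hypothetical and are not part of libAnnotationSBML or libSBML; the real library obtains its data types from the MIRIAM web service rather than from string parsing alone.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical helper (not part of libAnnotationSBML): splits a MIRIAM URN into its parts. */
final class MiriamUrn {
    private static final String PREFIX = "urn:miriam:";

    final String dataType;   // e.g. "obo.chebi" or "kegg.compound"
    final String identifier; // e.g. "CHEBI:4167" or "C00031"

    private MiriamUrn(String dataType, String identifier) {
        this.dataType = dataType;
        this.identifier = identifier;
    }

    /** Parses urn:miriam:<dataType>:<percent-encoded identifier>. */
    static MiriamUrn parse(String urn) {
        if (urn == null || !urn.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a MIRIAM URN: " + urn);
        }
        String rest = urn.substring(PREFIX.length());
        int sep = rest.indexOf(':');
        if (sep <= 0 || sep == rest.length() - 1) {
            throw new IllegalArgumentException("Missing data type or identifier: " + urn);
        }
        try {
            // URLDecoder also turns '+' into a space; assumed acceptable for this sketch.
            String id = URLDecoder.decode(rest.substring(sep + 1), "UTF-8");
            return new MiriamUrn(rest.substring(0, sep), id);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        MiriamUrn chebi = parse("urn:miriam:obo.chebi:CHEBI%3A4167");
        MiriamUrn kegg = parse("urn:miriam:kegg.compound:C00031");
        System.out.println(chebi.dataType + " -> " + chebi.identifier);
        System.out.println(kegg.dataType + " -> " + kegg.identifier);
    }
}
```

In this sketch the data type plays the role of the Ontology and the decoded identifier the role of the unique identifier from which a term would be built.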

Fig. 2.
Class diagram showing public methods of OntologyTerm and specialized subclasses ChebiTerm, UniProtTerm and KeggReactionTerm.

The OntologyTerm class can be extended to provide methods specific to the SBML element that is being described. For example, a metabolite species element annotated with a ChEBI term will return a ChebiTerm object, providing a method for accessing the chemical formula of the metabolite. Similarly, a KEGG Reaction annotation will return a KeggReactionTerm object, providing methods for accessing reactants and products, each returned as OntologyTerms themselves.

Applying libAnnotationSBML to the SBML in Figure 1 will associate an OntologyTerm with each of the species. Calling getName() on these ChEBI and KEGG OntologyTerm objects returns ‘D-glucopyranose’ and ‘D-glucose’, respectively. Considering the initial problem of comparing SBML components, this provides an example of why names cannot be used reliably to perform this task. A more reliable approach is to exploit the fact that many data resources cross-reference one another. For example, entries in the ChEBI database can provide details of the equivalent term in KEGG, and vice versa. The OntologyTerm class supports this by implementing a getXrefs() method, which returns cross-references, themselves OntologyTerms, along with a predicate, defined in libSBML, that indicates the relationship between them. When an OntologyTerm references an equivalent entity in a different database, the predicate libsbmlConstants.BQB_IS is returned. In the case of a genomic database entry cross-referencing an entry in a proteomic database, libsbmlConstants.BQB_ENCODES is used. Utilizing this method, it can be determined computationally that the ChEBI and KEGG terms cross-reference one another, and hence species1 and species2 can be unambiguously determined to represent equivalent entities.
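The cross-reference test described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the Term class, the Qualifier enum and the in-memory cross-reference store are all hypothetical stand-ins. In libAnnotationSBML the cross-references come from the databases' web services, and the predicates are the libSBML constants libsbmlConstants.BQB_IS and libsbmlConstants.BQB_ENCODES. The gene and protein identifiers are invented placeholders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class XrefEquivalenceSketch {

    /** Stand-ins for the libSBML biological qualifiers named in the text. */
    enum Qualifier { BQB_IS, BQB_ENCODES }

    /** Hypothetical, minimal analogue of an OntologyTerm: a data type plus an identifier. */
    static final class Term {
        final String dataType;
        final String id;

        Term(String dataType, String id) {
            this.dataType = dataType;
            this.id = id;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Term)) return false;
            Term t = (Term) o;
            return dataType.equals(t.dataType) && id.equals(t.id);
        }

        @Override public int hashCode() {
            return Objects.hash(dataType, id);
        }
    }

    /** In-memory replacement for the web service lookups that would back getXrefs(). */
    private final Map<Term, Map<Term, Qualifier>> xrefs = new HashMap<>();

    void addXref(Term from, Term to, Qualifier qualifier) {
        xrefs.computeIfAbsent(from, k -> new HashMap<>()).put(to, qualifier);
    }

    /** Returns each cross-referenced term together with its relationship to the given term. */
    Map<Term, Qualifier> getXrefs(Term term) {
        return xrefs.getOrDefault(term, Collections.<Term, Qualifier>emptyMap());
    }

    /** Two terms are treated as equivalent if they are identical or linked by BQB_IS in either direction. */
    boolean equivalent(Term a, Term b) {
        return a.equals(b)
                || getXrefs(a).get(b) == Qualifier.BQB_IS
                || getXrefs(b).get(a) == Qualifier.BQB_IS;
    }

    public static void main(String[] args) {
        XrefEquivalenceSketch store = new XrefEquivalenceSketch();
        Term chebi = new Term("obo.chebi", "CHEBI:4167");
        Term kegg = new Term("kegg.compound", "C00031");
        store.addXref(chebi, kegg, Qualifier.BQB_IS);
        System.out.println(store.equivalent(chebi, kegg));
    }
}
```

Checking both directions is a design choice of this sketch: it allows for the case where only one of the two databases records the cross-reference.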

The libAnnotationSBML library facilitates the rapid development of tools to manipulate SBML annotation terms. The library can be used to add annotation to unannotated SBML models, using a similar approach to semanticSBML (Schulz et al., 2006). libAnnotationSBML can annotate both metabolites and proteins, exploiting the search facility that exists in both the ChEBI and UniProt web services (The UniProt Consortium, 2008).

The main focus of libAnnotationSBML, however, is the development of tools that manipulate already annotated models. An example of such a tool is the SbmlSynonymExtractor, which takes annotated SBML as input, and returns a mapping of all species terms to their name synonyms, harvested from ChEBI, KEGG or UniProt. Another tool, the SbmlReactionBalancer, determines whether the reactions specified within an SBML file are elementally balanced by querying the ChEBI web service to retrieve chemical formulae of reaction participants.
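The elemental balance check can be illustrated with a short sketch. It is not the SbmlReactionBalancer itself: the class and method names are hypothetical, the chemical formulae are supplied directly rather than retrieved from the ChEBI web service, and the parser assumes simple formulae of element symbols followed by optional counts (no parentheses, charges or generic groups such as R).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReactionBalanceSketch {
    // An element symbol (one uppercase letter, optional lowercase letter) and an optional count.
    private static final Pattern ELEMENT = Pattern.compile("([A-Z][a-z]?)(\\d*)");

    /** Counts the atoms of each element in a simple formula such as "C6H12O6". */
    static Map<String, Integer> elementCounts(String formula) {
        if (formula == null || formula.isEmpty()) {
            throw new IllegalArgumentException("Empty formula");
        }
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = ELEMENT.matcher(formula);
        int pos = 0;
        while (m.find()) {
            if (m.start() != pos) {
                throw new IllegalArgumentException("Unsupported formula: " + formula);
            }
            int n = m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
            counts.merge(m.group(1), n, Integer::sum);
            pos = m.end();
        }
        if (pos != formula.length()) {
            throw new IllegalArgumentException("Unsupported formula: " + formula);
        }
        return counts;
    }

    /** Sums element counts over one side of a reaction; keys are formulae, values are stoichiometries. */
    static Map<String, Integer> sideTotals(Map<String, Integer> side) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> participant : side.entrySet()) {
            for (Map.Entry<String, Integer> e : elementCounts(participant.getKey()).entrySet()) {
                totals.merge(e.getKey(), e.getValue() * participant.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    /** A reaction is elementally balanced when both sides contain the same atoms in the same numbers. */
    static boolean isBalanced(Map<String, Integer> reactants, Map<String, Integer> products) {
        return sideTotals(reactants).equals(sideTotals(products));
    }

    public static void main(String[] args) {
        // Glucose + ATP -> glucose 6-phosphate + ADP, using neutral formulae.
        Map<String, Integer> reactants = new LinkedHashMap<>();
        reactants.put("C6H12O6", 1);
        reactants.put("C10H16N5O13P3", 1);
        Map<String, Integer> products = new LinkedHashMap<>();
        products.put("C6H13O9P", 1);
        products.put("C10H15N5O10P2", 1);
        System.out.println(isBalanced(reactants, products));
    }
}
```

Keying each side by formula keeps the sketch short, but it means two distinct participants with the same formula would need their stoichiometries combined by the caller.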

libAnnotationSBML was used extensively in the development of a genome-scale model of yeast metabolism, the first model of this scale in which all compartments, metabolites, enzymes and complexes are unambiguously defined using MIRIAM-compliant annotations (Herrgård et al., 2008).

3 IMPLEMENTATION AND DISTRIBUTION

The API is written in Java 1.5 and is dependent upon libSBML v3. It is supported in Linux, Windows and MacOS X and is distributed as source code and associated build files under the open source Academic Free Licence v3.0 from http://mcisb.sf.net/ along with other tools described in this manuscript.

ACKNOWLEDGEMENTS

The authors thank the BBSRC and EPSRC for financial support of the Manchester Centre for Integrative Systems Biology, of which this work was an integral part.

Conflict of Interest: none declared.

REFERENCES

  • Bornstein BJ, et al. LibSBML: an API Library for SBML. Bioinformatics. 2008;24:880–881.
  • Degtyarenko K, et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2008;36:D344–D350.
  • Herrgård M, et al. A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Nature Biotechnol. 2008;26:1155–1160.
  • Hucka M, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19:524–531.
  • Kanehisa M, et al. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30.
  • Kell DB, Mendes P. The markup is the model: reasoning about systems biology models in the Semantic Web era. J. Theor. Biol. 2008;252:538–543.
  • Laibe C, Le Novère N. MIRIAM resources: tools to generate and resolve robust cross-references in Systems Biology. BMC Syst. Biol. 2007;1:58.
  • Le Novère N, et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nature Biotechnol. 2005;23:1509–1515.
  • Schulz M, et al. SBMLmerge, a system for combining biochemical network models. Genome Inform. Ser. 2006;17:62–71.
  • The UniProt Consortium. The universal protein resource (UniProt). Nucleic Acids Res. 2008;36:D190–D195.
