Bioinformatics. 2000 Jul;16(7):628-38.

Object-oriented parsing of biological databases with Python.

Author information

  • European Molecular Biology Laboratory, Meyerhofstrasse 1, Postfach 10.2209, Heidelberg, Germany. chenna@embl-heidelberg.de



While database activity in biology is increasing rapidly, rather little has been done to parse these databases in a simple, object-oriented way.


We present an elegant, simple yet powerful way of parsing biological flat-file databases, taking EMBL, SWISS-PROT and GENBANK as examples. The EMBL and SWISS-PROT formats differ little in structure, whereas the GENBANK format differs substantially from both. Extracting the desired fields of an entry (for example, a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community; this is illustrated with tools for building new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since the parsing instructions are hidden.
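
The idea of a format-independent interface can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Entry`, `EmblLikeParser`, `GenBankParser`, `PARSERS` and `read_entries` are hypothetical, and only the accession number and the sequence are extracted. It assumes the usual flat-file conventions: two-letter line codes (`AC`, `SQ`) in EMBL and SWISS-PROT, keywords (`ACCESSION`, `ORIGIN`) in GENBANK, and a `//` line terminating each entry.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    """Format-independent view of one database entry (hypothetical)."""
    accession: str
    sequence: str

    def subsequence(self, start, end):
        # Feature locations in flat files are 1-based and inclusive.
        return self.sequence[start - 1:end]


def _split_entries(text):
    # Entries in all three formats end with a line containing only "//".
    block = []
    for line in text.splitlines():
        if line.strip() == "//":
            if block:
                yield block
            block = []
        else:
            block.append(line)


def _clean(seq_lines):
    # Remove the position numbers and spacing around the residues.
    return re.sub(r"[^A-Za-z]", "", "".join(seq_lines))


class EmblLikeParser:
    """EMBL and SWISS-PROT: two-letter line codes such as AC and SQ."""

    def parse(self, text):
        for block in _split_entries(text):
            accession, seq_lines, in_seq = "", [], False
            for line in block:
                if in_seq:
                    seq_lines.append(line)
                elif line.startswith("AC") and not accession:
                    # Keep the first (primary) accession number.
                    accession = line[2:].split(";")[0].strip()
                elif line.startswith("SQ"):
                    in_seq = True
            yield Entry(accession, _clean(seq_lines))


class GenBankParser:
    """GENBANK: keyword-based layout (ACCESSION, ORIGIN)."""

    def parse(self, text):
        for block in _split_entries(text):
            accession, seq_lines, in_seq = "", [], False
            for line in block:
                if in_seq:
                    seq_lines.append(line)
                elif line.startswith("ACCESSION"):
                    accession = line.split()[1]
                elif line.startswith("ORIGIN"):
                    in_seq = True
            yield Entry(accession, _clean(seq_lines))


# The caller names a format; the parsing instructions stay hidden.
PARSERS = {
    "embl": EmblLikeParser,
    "swissprot": EmblLikeParser,
    "genbank": GenBankParser,
}


def read_entries(text, fmt):
    return PARSERS[fmt]().parse(text)
```

With this layout, code that consumes `Entry` objects (for example, cutting out a sub-sequence at a feature's coordinates with `entry.subsequence(start, end)`) does not change when the input switches from one database format to another; only the `fmt` argument does.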

[PubMed - indexed for MEDLINE]