Send to

Choose Destination
See comment in PubMed Commons below
Source Code Biol Med. 2008 Apr 18;3:5. doi: 10.1186/1751-0473-3-5.

Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data.

Author information

  • 1Entomology Unit, FAO/IAEA Agriculture and Biotechnology Laboratory, Agency's Laboratories Seibersdorf, International Atomic Energy Agency, A-1400 Vienna, Austria.



The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to search quickly a set of reads for near exact text matches.


A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads.


Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension.


The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.

PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for BioMed Central Icon for PubMed Central
    Loading ...
    Write to the Help Desk