TriageTools: tools for partitioning and prioritizing analysis of high-throughput sequencing data

Nucleic Acids Res. 2013 Apr;41(7):e86. doi: 10.1093/nar/gkt094. Epub 2013 Feb 13.

Abstract

High-throughput sequencing is becoming a popular research tool but carries with it considerable costs in terms of computation time, data storage and bandwidth. Meanwhile, some research applications focusing on individual genes or pathways do not necessitate processing of a full sequencing dataset. Thus, it is desirable to partition a large dataset into smaller, manageable, but relevant pieces. We present a toolkit for partitioning raw sequencing data that includes a method for extracting reads that are likely to map onto pre-defined regions of interest. We show the method can be used to extract information about genes of interest from DNA or RNA sequencing samples in a fraction of the time and disk space required to process and store a full dataset. We report speedup factors between 2.6 and 96, depending on settings and samples used. The software is available at http://www.sourceforge.net/projects/triagetools/.
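The abstract does not describe how reads "likely to map" onto regions of interest are identified. As an illustration only, one common way to pre-filter reads is k-mer matching against the target sequences. The sketch below assumes that approach; it is not taken from TriageTools, and every name and parameter in it (`build_index`, `select_reads`, `k`, `min_hits`) is hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

# Illustrative k-mer pre-filter; NOT the TriageTools implementation.
# All function names and parameters here are hypothetical.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping substring of length k."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(targets: Iterable[str], k: int) -> Set[str]:
    """Collect the k-mers of each target region on both strands."""
    index: Set[str] = set()
    for target in targets:
        index.update(kmers(target, k))
        index.update(kmers(reverse_complement(target), k))
    return index


def select_reads(
    reads: Iterable[Tuple[str, str]],
    index: Set[str],
    k: int,
    min_hits: int = 2,
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs sharing at least min_hits k-mers with the index."""
    for name, seq in reads:
        hits = sum(1 for kmer in kmers(seq, k) if kmer in index)
        if hits >= min_hits:
            yield name, seq
```

In a filter of this kind, only the selected reads would be passed to a full aligner, which is the general source of the time and storage savings the abstract reports. A larger `k` or `min_hits` trades sensitivity for fewer false positives.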

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genes
  • Genes, Neoplasm
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Receptor, Notch1 / genetics
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, RNA / methods
  • Software*

Substances

  • NOTCH1 protein, human
  • Receptor, Notch1