Bioinformatics. 2018 Jul 1;34(13):2168-2176. doi: 10.1093/bioinformatics/bty098.

IDP-denovo: de novo transcriptome assembly and isoform annotation by hybrid sequencing.

Author information

1 Department of Internal Medicine, University of Iowa, Iowa City, USA.
2 Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China.
3 Department of Biostatistics, University of Iowa, Iowa City, USA.

Abstract

Motivation:

In recent years, long read (LR) sequencing technologies, such as Pacific Biosciences and Oxford Nanopore Technologies, have been shown to substantially improve the quality of genome assembly and transcriptome characterization. Because genome assembly by LR sequencing remains costly, generating LRs for transcriptome characterization is the more affordable option. Consequently, when informative transcriptome LR data are available but a high-quality genome is not, a method for de novo transcriptome assembly and annotation is in high demand.

Results:

Without a reference genome, IDP-denovo performs de novo transcriptome assembly, isoform annotation and quantification by integrating the strengths of LRs and short reads. Using the GM12878 human data as a gold standard, we demonstrated that IDP-denovo had superior sensitivity of transcript assembly and high accuracy of isoform annotation. In addition, IDP-denovo outputs two abundance indices to provide a comprehensive expression profile of genes/isoforms. IDP-denovo thus represents a robust approach for transcriptome assembly, isoform annotation and quantification in non-model organism studies. Applying IDP-denovo to a non-model organism, Dendrobium officinale, we discovered a number of novel genes and novel isoforms that were not reported in the existing annotation library. These results reveal a high diversity of gene isoforms in D. officinale.

Availability and implementation:

The Dendrobium officinale dataset used and analyzed in the current study has been deposited in SRA under accession code SRP094520. IDP-denovo is available for download at www.healthcare.uiowa.edu/labs/au/IDP-denovo/.

Supplementary information:

Supplementary data are available at Bioinformatics online.

PMID: 29905763
PMCID: PMC6022631
DOI: 10.1093/bioinformatics/bty098