Send to

Choose Destination
Sci China Life Sci. 2017 Feb;60(2):116-125. doi: 10.1007/s11427-015-0349-4. Epub 2016 Jun 13.

Characterizing and annotating the genome using RNA-seq data.

Chen G1, Shi T2, Shi L1,3,4,5.

Author information

Center for Pharmacogenomics, School of Pharmacy and School of Life Sciences, Fudan University, Shanghai, 201203, China.
The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, 200433, China.
Fudan-Zhangjiang Center for Clinical Genomics, Shanghai, 201203, China.
Zhangjiang Center for Translational Medicine, Shanghai, 201203, China.


Bioinformatics methods for various RNA-seq data analyses are in fast evolution with the improvement of sequencing technologies. However, many challenges still exist in how to efficiently process the RNA-seq data to obtain accurate and comprehensive results. Here we reviewed the strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and transcriptome represent two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially those long noncoding RNAs) still can be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome- guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying those novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases to conduct transcriptomics and genetics studies can improve the results of corresponding analyses.


RNA-seq; de novo assembly; genetic variants; genome-guided transcriptome reconstruction; long noncoding RNA

[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for Springer
Loading ...
Support Center