Genome Biol. 2018 Nov 28;19(1):208. doi: 10.1186/s13059-018-1590-2.

CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise.

Author information

1. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
2. Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
3. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
4. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
5. Institute of Bioinformatics, International Technology Park, Bangalore, India.
6. Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, India.
7. Present address: Center for Individualized Medicine and Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.
8. Departments of Biological Chemistry, Pathology, Neurology, and Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
9. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA. salzberg@jhu.edu.
10. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA. salzberg@jhu.edu.
11. Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA. salzberg@jhu.edu.
12. Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA. salzberg@jhu.edu.

Abstract

We assembled sequences from deep RNA sequencing experiments conducted by the Genotype-Tissue Expression (GTEx) project to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We also detected over 30 million additional transcripts at more than 650,000 genomic loci, nearly all of which are likely nonfunctional, revealing a previously unappreciated amount of transcriptional noise in human cells. The CHESS database is available at http://ccb.jhu.edu/chess.

KEYWORDS: GTEx; Human gene count; RNA sequencing; Transcriptome; Transcriptome assembly
