refTSS: A Reference Data Set for Human and Mouse Transcription Start Sites

J Mol Biol. 2019 Jun 14;431(13):2407-2422. doi: 10.1016/j.jmb.2019.04.045. Epub 2019 May 8.

Abstract

Transcription begins at genomic positions called transcription start sites (TSSs) and produces RNAs; it is regulated mainly by genomic elements and transcription factors that bind around these TSSs. TSSs may therefore be a better unit for integrating data sources related to transcriptional events, including the regulation and production of RNAs. However, although several TSS datasets and promoter atlases are available, a comprehensive reference set that integrates all known TSSs is lacking. We therefore constructed a reference dataset of TSSs (refTSS) for the human and mouse genomes by collecting publicly available TSS annotations and promoter resources, such as FANTOM5, DBTSS, EPDnew, and ENCODE. The dataset consists of the genomic coordinates of TSS peaks, their gene annotations, quality-check results, and conservation between human and mouse. We also developed a web interface for browsing refTSS (http://reftss.clst.riken.jp/). Users can access the resource to collect and integrate data and information about transcriptional regulation and transcription products.
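The abstract does not describe how peaks from the different resources were combined. As a rough illustration only, the sketch below shows one common way to pool TSS peaks from several sources: peaks on the same chromosome and strand are merged when they overlap or lie within a chosen distance, and the contributing sources are recorded. The `Peak` type, the `merge_peaks` function, the `max_gap` parameter, and the half-open coordinate convention are all assumptions made for this example and are not taken from the refTSS pipeline.

```python
from collections import defaultdict
from typing import NamedTuple


class Peak(NamedTuple):
    """Hypothetical record for one TSS peak (not a refTSS data structure)."""
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)
    strand: str  # "+" or "-"
    source: str  # label of the resource the peak came from


def merge_peaks(peaks, max_gap=0):
    """Merge peaks on the same chromosome and strand that overlap,
    touch, or lie within max_gap bases of each other.

    Returns a list of (chrom, start, end, strand, sorted_sources) tuples.
    """
    # Peaks on different chromosomes or strands are never merged.
    groups = defaultdict(list)
    for p in peaks:
        groups[(p.chrom, p.strand)].append(p)

    merged = []
    for (chrom, strand), items in sorted(groups.items()):
        items.sort(key=lambda p: (p.start, p.end))
        first = items[0]
        cur_start, cur_end, cur_sources = first.start, first.end, {first.source}
        for p in items[1:]:
            if p.start <= cur_end + max_gap:
                # Close enough to the current cluster: extend it.
                cur_end = max(cur_end, p.end)
                cur_sources.add(p.source)
            else:
                # Gap is too large: emit the cluster and start a new one.
                merged.append((chrom, cur_start, cur_end, strand, sorted(cur_sources)))
                cur_start, cur_end, cur_sources = p.start, p.end, {p.source}
        merged.append((chrom, cur_start, cur_end, strand, sorted(cur_sources)))
    return merged
```

Calling `merge_peaks` on a list of `Peak` records labeled, for instance, "FANTOM5" and "DBTSS" yields one entry per merged cluster together with the sources that support it. A larger `max_gap` gives broader, fewer clusters; a suitable value would depend on the resolution of the input resources.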

Keywords: annotation; data integration; reference data; transcription start sites; transcriptional regulation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Atlases as Topic
  • Conserved Sequence
  • Databases, Genetic*
  • Gene Expression Regulation
  • Humans
  • Mice
  • Molecular Sequence Annotation
  • Promoter Regions, Genetic
  • Sequence Analysis, DNA / methods*
  • Transcription Initiation Site*