Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data

PLoS One. 2015 Jun 8;10(6):e0129023. doi: 10.1371/journal.pone.0129023. eCollection 2015.

Abstract

Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection, which typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) from the Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the footprints of natural selection in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and that there is substantial constraint on variation in NCS, while positive selection is more likely to be specific to particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants reported as risk alleles by genome-wide association studies showed signatures of negative selection. Our analyses provided compelling evidence of natural selection acting on noncoding sequences in the human genome and advanced our understanding of their functional potentials, which play important roles in disease etiology and human evolution.
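
As an illustration of the derived allele frequency (DAF) comparison mentioned in the abstract, the following is a minimal sketch and not the authors' pipeline. It assumes each variant is summarized as a (derived allele count, total chromosomes) pair; the function names, the 0.05 cutoff, and the toy counts are hypothetical and chosen only for demonstration.

```python
from typing import Iterable, List, Tuple


def derived_allele_frequency(derived_count: int, total_chromosomes: int) -> float:
    """Return the fraction of sampled chromosomes carrying the derived allele."""
    if total_chromosomes <= 0:
        raise ValueError("total_chromosomes must be positive")
    if not 0 <= derived_count <= total_chromosomes:
        raise ValueError("derived_count must lie between 0 and total_chromosomes")
    return derived_count / total_chromosomes


def fraction_low_daf(dafs: Iterable[float], threshold: float = 0.05) -> float:
    """Return the share of variants whose DAF is below `threshold`.

    The 0.05 default is an arbitrary illustrative cutoff, not a value
    taken from the paper.
    """
    values: List[float] = list(dafs)
    if not values:
        raise ValueError("at least one variant is required")
    return sum(1 for f in values if f < threshold) / len(values)


# Made-up (derived count, total chromosomes) pairs for two hypothetical
# variant sets; these are NOT data from the study.
variant_set_a: List[Tuple[int, int]] = [(1, 200), (3, 200), (150, 200), (8, 200)]
variant_set_b: List[Tuple[int, int]] = [(2, 200), (90, 200), (120, 200), (40, 200)]

dafs_a = [derived_allele_frequency(d, n) for d, n in variant_set_a]
dafs_b = [derived_allele_frequency(d, n) for d, n in variant_set_b]

low_a = fraction_low_daf(dafs_a)
low_b = fraction_low_daf(dafs_b)
```

A larger share of low-DAF variants in one set than in another is commonly read as a sign of stronger constraint on that set, although a real analysis would also need ancestral allele inference and a statistical test.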

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Conserved Sequence
  • Evolution, Molecular
  • Gene Frequency
  • Genetic Variation
  • Genome Components*
  • Genome, Human*
  • Genome-Wide Association Study*
  • Genomics*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Quantitative Trait Loci
  • Regulatory Elements, Transcriptional
  • Selection, Genetic*

Grants and funding

These studies were supported by the National Natural Science Foundation of China (NSFC) grants (91331204 and 31171218) and by the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) (XDB13040100). PJ is supported by an International Young Scientist Fellowship of the Chinese Academy of Sciences (2011Y2SB10). SX is a Max-Planck Independent Research Group Leader and a member of the CAS Youth Innovation Promotion Association. SX also gratefully acknowledges the support of the National Program for Top-notch Young Innovative Talents of the "Wanren Jihua" Project. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.