A 3.4-kb Copy-Number Deletion near EPAS1 Is Significantly Enriched in High-Altitude Tibetans but Absent from the Denisovan Sequence

Am J Hum Genet. 2015 Jul 2;97(1):54-66. doi: 10.1016/j.ajhg.2015.05.005. Epub 2015 Jun 11.

Abstract

Tibetan high-altitude adaptation (HAA) has been studied extensively, and many candidate genes have been reported. Subsequent efforts to identify functional HAA variants, however, have had limited success (e.g., no functional variant has been suggested for the top candidate HAA gene, EPAS1). With WinXPCNVer, a method developed in this study, we detected in microarray data a Tibetan-enriched deletion (TED) carried by 90% of Tibetans, 50% of whom were homozygous for the deletion; in contrast, among 2,792 worldwide samples only 3% carried the TED and none were homozygous for it (p < 10^-15). We used long PCR and Sanger sequencing to determine the exact copy number and breakpoints of the TED in 70 additional Tibetan and 182 diverse samples. The TED had identical boundaries (chr2: 46,694,276-46,697,683; hg19) and was 80 kb downstream of EPAS1. Notably, the TED was in strong linkage disequilibrium (LD; r^2 = 0.8) with EPAS1 variants associated with reduced blood concentrations of hemoglobin. It was also in complete LD with the 5-SNP motif, which was suspected to be introgressed from Denisovans, but the deletion itself was absent from the Denisovan sequence. Correspondingly, we detected that footprints of positive selection for the TED occurred 12,803 (95% confidence interval = 12,075-14,725) years ago. We further whole-genome deep sequenced (>60×) seven Tibetans and verified the TED but failed to identify any other copy-number variations with comparable patterns, giving this TED top priority for further study. We speculate that the specific patterns of the TED resulted from its own functionality in HAA of Tibetans or LD with a functional variant of EPAS1.
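
As a rough illustration of two quantities quoted in the abstract, the sketch below computes the length of the reported interval from its hg19 coordinates and the LD statistic r^2 from haplotype frequencies. The function names are hypothetical and are not part of WinXPCNVer. Treating the coordinates as inclusive is an assumption, and the r^2 formula is the standard two-locus definition, not necessarily the estimator used in the study.

```python
def interval_length(start: int, end: int, inclusive: bool = True) -> int:
    """Length in bp of a genomic interval.

    Assumption: both coordinates are 1-based and inclusive unless
    inclusive=False is passed.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Standard two-locus LD statistic r^2.

    p_a  : frequency of allele A at locus 1 (e.g. the deletion allele)
    p_b  : frequency of allele B at locus 2 (e.g. a nearby SNP allele)
    p_ab : frequency of the haplotype carrying both A and B
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom <= 0.0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    return d * d / denom


# Coordinates as reported in the abstract (chr2, hg19).
ted_length = interval_length(46_694_276, 46_697_683)
print(ted_length)  # 3408 bp, i.e. about 3.4 kb

# Hypothetical frequencies, for illustration only (not data from the study):
# two alleles at frequency 0.5 that always occur on the same haplotype.
print(r_squared(p_ab=0.5, p_a=0.5, p_b=0.5))
```

Under the inclusive-coordinate assumption the interval spans 3,408 bp, consistent with the 3.4 kb stated in the title. An r^2 of 1 corresponds to complete LD, and values near 0 indicate that the two alleles co-occur no more often than expected by chance.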

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Biological / genetics*
  • Algorithms
  • Altitude*
  • Animals
  • Base Sequence
  • Basic Helix-Loop-Helix Transcription Factors / genetics*
  • DNA Copy Number Variations / genetics*
  • Ethnicity / genetics*
  • Evolution, Molecular*
  • Genetics, Population
  • Hemoglobins / genetics
  • Hemoglobins / metabolism
  • Hominidae / genetics*
  • Humans
  • Linkage Disequilibrium
  • Microarray Analysis / methods
  • Molecular Sequence Data
  • Polymerase Chain Reaction / methods
  • Sequence Analysis, DNA
  • Tibet

Substances

  • Basic Helix-Loop-Helix Transcription Factors
  • Hemoglobins
  • endothelial PAS domain-containing protein 1