Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation

Am J Hum Genet. 2013 Apr 4;92(4):530-46. doi: 10.1016/j.ajhg.2013.03.004. Epub 2013 Mar 28.

Abstract

The immunoglobulin heavy-chain locus (IGH) encodes variable (IGHV), diversity (IGHD), joining (IGHJ), and constant (IGHC) genes and is responsible for antibody heavy-chain biosynthesis, which is vital to the adaptive immune response. Programmed V-(D)-J somatic rearrangement and the complex duplicated nature of the locus have impeded attempts to reconcile its genomic organization based on traditional B-lymphocyte derived genetic material. As a result, sequence descriptions of germline variation within IGHV are lacking, haplotype inference using traditional linkage disequilibrium methods has been difficult, and the human genome reference assembly is missing several expressed IGHV genes. By using a hydatidiform mole BAC clone resource, we present the most complete haplotype of IGHV, IGHD, and IGHJ gene regions derived from a single chromosome, representing an alternate assembly of ∼1 Mbp of high-quality finished sequence. From this we add 101 kbp of previously uncharacterized sequence, including functional IGHV genes, and characterize four large germline copy-number variants (CNVs). In addition to this germline reference, we identify and characterize eight CNV-containing haplotypes from a panel of nine diploid genomes of diverse ethnic origin, discovering previously unmapped IGHV genes and an additional 121 kbp of insertion sequence. We genotype four of these CNVs by using PCR in 425 individuals from nine human populations. We find that all four are highly polymorphic and show considerable evidence of stratification (Fst = 0.3-0.5), with the greatest differences observed between African and Asian populations. These CNVs exhibit weak linkage disequilibrium with SNPs from two commercial arrays in most of the populations tested.
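The stratification statistic quoted above (Fst) can be illustrated with a minimal sketch. It assumes a biallelic CNV (insertion present or absent), equally weighted populations, and Wright's heterozygosity-based definition Fst = (H_T - H_S) / H_T. The abstract does not state which estimator the authors used, and the function name and allele frequencies below are hypothetical, not data from the study.

```python
from statistics import mean


def fst_biallelic(freqs):
    """Wright's Fst for one biallelic locus.

    freqs: frequency of one allele (e.g. the CNV insertion) in each
    population. Populations are weighted equally (an assumption).
    """
    if not freqs or any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("freqs must be a non-empty list of values in [0, 1]")
    p_bar = mean(freqs)
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each subpopulation.
    h_sub = mean(2.0 * p * (1.0 - p) for p in freqs)
    if h_total == 0.0:
        # Locus is fixed everywhere: no variation to partition.
        return 0.0
    return (h_total - h_sub) / h_total


# Hypothetical insertion-allele frequencies in two populations.
example = fst_biallelic([0.2, 0.8])
print(round(example, 2))  # approximately 0.36
```

With these made-up frequencies the result falls inside the 0.3-0.5 range reported in the abstract, which gives a sense of how large the between-population frequency differences must be to produce such values.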

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Chromosomes, Artificial, Bacterial
  • DNA Copy Number Variations / genetics*
  • Female
  • Gene Fusion / genetics*
  • Genes, Immunoglobulin Heavy Chain*
  • Genetics, Population
  • Genotype
  • Haplotypes / genetics*
  • Humans
  • Hydatidiform Mole / genetics*
  • Immunoglobulin Heavy Chains / genetics*
  • Immunoglobulin Variable Region / genetics*
  • Molecular Sequence Data
  • Pregnancy
  • Sequence Analysis, DNA
  • V(D)J Recombination

Substances

  • Immunoglobulin Heavy Chains
  • Immunoglobulin Variable Region

Associated data

  • GENBANK/AC206018
  • GENBANK/AC231260
  • GENBANK/AC233755
  • GENBANK/AC234135
  • GENBANK/AC234225
  • GENBANK/AC234301
  • GENBANK/AC241513
  • GENBANK/AC241995
  • GENBANK/AC244226
  • GENBANK/AC244393
  • GENBANK/AC244395
  • GENBANK/AC244396
  • GENBANK/AC244397
  • GENBANK/AC244398
  • GENBANK/AC244399
  • GENBANK/AC244400
  • GENBANK/AC244405
  • GENBANK/AC244410
  • GENBANK/AC244411
  • GENBANK/AC244412
  • GENBANK/AC244430
  • GENBANK/AC244449
  • GENBANK/AC244450
  • GENBANK/AC244452
  • GENBANK/AC244456
  • GENBANK/AC244459
  • GENBANK/AC244460
  • GENBANK/AC244463
  • GENBANK/AC244464
  • GENBANK/AC244467
  • GENBANK/AC244468
  • GENBANK/AC244470
  • GENBANK/AC244473
  • GENBANK/AC244476
  • GENBANK/AC244477
  • GENBANK/AC244478
  • GENBANK/AC244480
  • GENBANK/AC244481
  • GENBANK/AC244482
  • GENBANK/AC244483
  • GENBANK/AC244484
  • GENBANK/AC244486
  • GENBANK/AC244487
  • GENBANK/AC244488
  • GENBANK/AC244490
  • GENBANK/AC244491
  • GENBANK/AC244492
  • GENBANK/AC244494
  • GENBANK/AC244495
  • GENBANK/AC244496
  • GENBANK/AC244497
  • GENBANK/AC244500
  • GENBANK/AC245023
  • GENBANK/AC245085
  • GENBANK/AC245090
  • GENBANK/AC245094
  • GENBANK/AC245166
  • GENBANK/AC245243
  • GENBANK/AC245369
  • GENBANK/AC246787
  • GENBANK/AC247036
  • GENBANK/KC162924
  • GENBANK/KC162925
  • GENBANK/KC162926
  • GENBANK/KC713935
  • GENBANK/KC713936
  • GENBANK/KC713937
  • GENBANK/KC713938
  • GENBANK/KC713939
  • GENBANK/KC713940
  • GENBANK/KC713941
  • GENBANK/KC713942
  • GENBANK/KC713943
  • GENBANK/KC713944
  • GENBANK/KC713945
  • GENBANK/KC713946
  • GENBANK/KC713947
  • GENBANK/KC713948
  • GENBANK/KC713949
  • GENBANK/KC713950