J Immunol Methods. 2015 Dec;427:105-16. doi: 10.1016/j.jim.2015.10.009. Epub 2015 Nov 1.

Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

Author information

1. School of Biomedical Engineering, Science and Health Systems, 711 Bossone Building, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA.
2. Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, 405B Stellar Chance Labs, 422 Curie Boulevard, Philadelphia, PA 19104, USA.
3. School of Biomedical Engineering, Science and Health Systems, 711 Bossone Building, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA; Department of Microbiology and Immunology, College of Medicine, 2900 Queen Lane, Philadelphia, PA 19129, USA. Electronic address: uri.hershberg@drexel.edu.

Abstract

Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), Diversity (D, in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences.
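The bundling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes pre-aligned, equal-length germline sequences, assumes that a partial read covers the 3' end of the V gene, and uses a plain Hamming distance with single-linkage grouping as a stand-in for the paper's similarity measure. The function names, the `min_distance` threshold and the toy sequences are all hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def bundle_v_genes(germlines, read_length, min_distance):
    """Group genes whose last `read_length` bases differ at fewer than
    `min_distance` positions (single-linkage grouping via union-find).

    Assumption: sequences are aligned, equal length, and reads cover
    the 3' end of the V gene.
    """
    trimmed = {name: seq[-read_length:] for name, seq in germlines.items()}
    parent = {name: name for name in trimmed}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(trimmed, 2):
        if hamming(trimmed[a], trimmed[b]) < min_distance:
            parent[find(a)] = find(b)  # too similar to tell apart: merge

    groups = {}
    for name in trimmed:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy sequences, not real germline V genes.
toy = {
    "V-A": "ACGTACGTACGT",
    "V-B": "ACGAACGTACGT",  # one mismatch vs V-A, near the 5' end
    "V-C": "ACGTACGTTTGT",  # two mismatches vs V-A, near the 3' end
}

full_length = bundle_v_genes(toy, read_length=12, min_distance=1)
short_reads = bundle_v_genes(toy, read_length=6, min_distance=1)
```

With the full 12 bases every toy gene is distinct, while a 6-base read loses the single position that separates V-A from V-B, so those two fall into one group. Raising `min_distance` plays the role of a higher tolerated mutation or error level: more differences are needed before two genes count as distinguishable.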

KEYWORDS:

B cells; Gene identification; Germline annotation; High throughput sequencing

PMID: 26529062
PMCID: PMC4811607
DOI: 10.1016/j.jim.2015.10.009
[Indexed for MEDLINE]
Free PMC Article