Methods. 2015 Dec;91:40-47. doi: 10.1016/j.ymeth.2015.09.021. Epub 2015 Sep 25.

CoVaMa: Co-Variation Mapper for disequilibrium analysis of mutant loci in viral populations using next-generation sequence data.

Author information

1. Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, TX, USA. Electronic address: arouth@scripps.edu.
2. Integrative Genomics and Bioinformatics Core, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.
3. Infectious Disease Service, San Antonio Military Medical Center, Fort Sam Houston, TX 78234, USA; Infectious Disease Clinical Research Program, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA.
4. Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
5. Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA. Electronic address: betorbet@scripps.edu.

Abstract

Next-Generation Sequencing (NGS) has transformed our understanding of the dynamics and diversity of virus populations for human pathogens and model systems alike. Due to the sensitivity and depth of coverage in NGS, it is possible to measure the frequency of mutations that may be present even at vanishingly low frequencies within the viral population. Here, we describe a simple bioinformatic pipeline called CoVaMa (Co-Variation Mapper), scripted in Python, that detects correlated patterns of mutations in a viral sample. Our algorithm takes NGS alignment data and populates large matrices of contingency tables that correspond to every possible pairwise interaction of nucleotides in the viral genome or amino acids in the chosen open reading frame. These tables are then analyzed using classical linkage disequilibrium to detect and report evidence of epistasis. We test our analysis with simulated data and then apply the approach to find epistatically linked loci in Flock House Virus genomic RNA grown under controlled cell culture conditions. We also reanalyze NGS data from a large cohort of HIV-infected patients and find correlated amino acid substitution events in the protease gene that have arisen in response to anti-viral therapy. This both confirms previous findings and suggests new pairs of interactions within HIV protease. The script is publicly available at http://sourceforge.net/projects/covama.
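
The approach summarized in the abstract (pairwise contingency tables scored with classical linkage disequilibrium) can be illustrated with a minimal Python sketch. This is not CoVaMa's code: the function names `pairwise_tables` and `linkage_disequilibrium`, the read representation (a dict mapping position to nucleotide), and the toy data are all assumptions made for illustration. It uses the textbook definitions D = p(AB) - p(A)p(B) and r² = D² / [p(A)(1 - p(A))p(B)(1 - p(B))].

```python
from collections import Counter
from itertools import combinations


def pairwise_tables(reads):
    """Build one contingency table per pair of positions seen in the same read.

    `reads` is an iterable of dicts mapping position -> nucleotide
    (a hypothetical, simplified stand-in for parsed alignment records).
    Returns {(pos_i, pos_j): Counter({(nt_i, nt_j): count})} with pos_i < pos_j.
    """
    tables = {}
    for read in reads:
        for i, j in combinations(sorted(read), 2):
            tables.setdefault((i, j), Counter())[(read[i], read[j])] += 1
    return tables


def linkage_disequilibrium(table, allele_i, allele_j):
    """Return (D, r_squared) for allele_i at the first locus and allele_j at the second."""
    n = sum(table.values())
    if n == 0:
        return 0.0, 0.0
    # Joint frequency of the two alleles occurring on the same read.
    p_ij = table[(allele_i, allele_j)] / n
    # Marginal frequencies of each allele at its own locus.
    p_i = sum(c for (a, _), c in table.items() if a == allele_i) / n
    p_j = sum(c for (_, b), c in table.items() if b == allele_j) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Toy data (invented): positions 10 and 20, mostly A-C or G-T together.
reads = (
    [{10: "A", 20: "C"}] * 4
    + [{10: "G", 20: "T"}] * 4
    + [{10: "A", 20: "T"}, {10: "G", 20: "C"}]
)
tables = pairwise_tables(reads)
d, r2 = linkage_disequilibrium(tables[(10, 20)], "G", "T")
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

A positive D here means the G and T alleles are found on the same read more often than their individual frequencies predict. A real pipeline would also need to handle sequencing error, coverage thresholds, and multiple testing, none of which this sketch attempts.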

KEYWORDS:

Covariation; Flock House Virus; Human immunodeficiency virus protease; Linkage disequilibrium; RNAseq

PMID: 26408523
PMCID: PMC4684750
DOI: 10.1016/j.ymeth.2015.09.021
[Indexed for MEDLINE]
Free PMC Article
