Format

Send to

Choose Destination
Proteins. 2016 Jul;84(7):869-74. doi: 10.1002/prot.25040. Epub 2016 Apr 16.

ScaffoldSeq: Software for characterization of directed evolution populations.

Author information

1
Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, 55455.

Abstract

ScaffoldSeq is software designed for the numerous applications-including directed evolution analysis-in which a user generates a population of DNA sequences encoding for partially diverse proteins with related functions and would like to characterize the single site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874.

KEYWORDS:

bioinformatics; epistasis; family clustering; high-throughput; sequence analysis; software

PMID:
27018773
PMCID:
PMC5519505
DOI:
10.1002/prot.25040
[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Wiley Icon for PubMed Central
Loading ...
Support Center