Nucleic Acids Res. 2019 Jul 2;47(W1):W623-W631. doi: 10.1093/nar/gkz326.

SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data.

Author information

St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA.
Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France, EU.
Paris Descartes University, Imagine Institute, Paris 75015, France, EU.
Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, UK.
Howard Hughes Medical Institute, New York, NY 10065, USA.
Pediatric Immunology-Hematology Unit, Necker Hospital for Sick Children, Paris 75015, France, EU.
The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.


Human whole-genome sequencing reveals about 4 000 000 genomic variants per individual. These data are mostly stored as VCF-format files. Although many variant analysis methods accept VCF as input, many other tools require DNA or protein sequences, particularly for splicing prediction, sequence alignment, phylogenetic analysis, and structure prediction. However, there is no existing webserver capable of extracting DNA/protein sequences for genomic variants from VCF files in a user-friendly and efficient manner. We developed the SeqTailor webserver to bridge this gap, by enabling rapid extraction of (i) DNA sequences around genomic variants, with customizable window sizes and options to annotate the splice sites closest to the variants and to consider the neighboring variants within the window; and (ii) protein sequences encoded by the DNA sequences around genomic variants, with a built-in SnpEff annotator and customizable window sizes. SeqTailor supports 11 species: human (GRCh37/GRCh38), chimpanzee, mouse, rat, cow, chicken, lizard, zebrafish, fruit fly, Arabidopsis and rice. Standalone programs are provided for command-line-based needs. SeqTailor streamlines the sequence extraction process, and accelerates the analysis of genomic variants with software requiring DNA/protein sequences. It will facilitate the study of genomic variation, by increasing the feasibility of sequence-based analysis and prediction. The SeqTailor webserver is freely available at
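The idea of extracting a DNA sequence around a variant with a customizable window can be illustrated with a minimal Python sketch. This is not SeqTailor's code: the function name `extract_window`, its parameters, and the toy reference string are assumptions made for illustration only, and the sketch ignores splice-site annotation, neighboring variants, and strand.

```python
def extract_window(reference: str, pos: int, ref: str, alt: str, window: int):
    """Return (ref_seq, alt_seq) covering `window` bases on each side of a variant.

    Hypothetical helper, not part of SeqTailor.
    `reference` is one chromosome as a plain string; `pos` is 1-based, as in VCF.
    """
    start = pos - 1            # 0-based index of the first REF base
    end = start + len(ref)     # 0-based index just past the REF allele
    if reference[start:end].upper() != ref.upper():
        raise ValueError("REF allele does not match the reference at this position")
    left = reference[max(0, start - window):start]   # upstream flank, clipped at the sequence start
    right = reference[end:end + window]              # downstream flank, clipped at the sequence end
    return left + ref + right, left + alt + right


# Toy reference sequence (made up for this example).
toy_reference = "ACGTACGTAC"

# A single-nucleotide substitution A>G at position 5, with a 3-base window.
snv_ref, snv_alt = extract_window(toy_reference, pos=5, ref="A", alt="G", window=3)

# A one-base deletion TA>T at position 4, with a 2-base window.
del_ref, del_alt = extract_window(toy_reference, pos=4, ref="TA", alt="T", window=2)
```

The reference and alternate sequences returned for each variant could then be passed to a tool that expects sequence input rather than VCF records.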
