J Mol Diagn. 2019 Dec 16. pii: S1525-1578(19)30436-2. doi: 10.1016/j.jmoldx.2019.10.011. [Epub ahead of print]

Sample Tracking Using Unique Sequence Controls.

Author information

1. Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia; Faculty of Health Science, Simon Fraser University, Burnaby, British Columbia. Electronic address: rmoore@bcgsc.ca.
2. Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia.
3. Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada.
4. Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada.
5. Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada. Electronic address: akarsan@bcgsc.ca.

Abstract

Sample tracking and identity are essential when processing multiple samples in parallel. Sequencing applications often involve high sample numbers, and the data are frequently used in a clinical setting. As such, a simple and accurate intrinsic sample tracking process through a sequencing pipeline is essential. Various solutions have been implemented to verify sample identity, including variant detection at the start and end of the pipeline using arrays or genotyping, bioinformatic comparisons, and optical barcoding of samples. None of these approaches is optimal. To establish a more effective approach using genetic barcoding, we developed a panel of unique DNA sequences cloned into a common vector. A unique DNA sequence is added to the sample when it is first received and can be detected by PCR and/or sequencing at any stage of the process. The control sequences are approximately 200 bases long, have low identity to any sequence in the National Center for Biotechnology Information nonredundant database (<30 bases), and contain no long homopolymer stretches (>7 bases). When a spiked next-generation sequencing library is sequenced, sequence reads derived from this control sequence are generated along with the standard sequencing run and are used to confirm sample identity and determine cross-contamination levels. This approach is used in our targeted clinical diagnostic whole-genome and RNA-sequencing pipelines and is an inexpensive, flexible, and platform-agnostic solution.
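
The abstract does not describe the software used to screen control sequences or to count control-derived reads, so the following Python sketch is only an illustration of the idea under stated assumptions. It checks a candidate sequence against the homopolymer rule (no run longer than 7 bases), assigns reads to controls by exact k-mer matching (an assumed stand-in for whatever alignment step the authors' pipeline uses), and reports the fraction of control reads that do not belong to the expected control as a rough cross-contamination estimate. All function names, the k-mer length, and the example sequences are hypothetical.

```python
from collections import Counter
from itertools import groupby

def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base in seq."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)

def passes_homopolymer_rule(seq, max_run=7):
    """True if seq has no homopolymer stretch longer than max_run bases."""
    return longest_homopolymer(seq) <= max_run

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def count_control_reads(reads, controls, k=25):
    """Count reads per control using exact k-mer matches (assumed method).

    controls maps a control name to its sequence. A k-mer shared by two
    different controls is marked ambiguous (None) and never used.
    """
    kmer_owner = {}
    for name, seq in controls.items():
        for strand in (seq.upper(), revcomp(seq)):
            for i in range(len(strand) - k + 1):
                kmer = strand[i:i + k]
                if kmer in kmer_owner and kmer_owner[kmer] != name:
                    kmer_owner[kmer] = None
                else:
                    kmer_owner[kmer] = name
    counts = Counter()
    for read in reads:
        read = read.upper()
        owners = {kmer_owner.get(read[i:i + k]) for i in range(len(read) - k + 1)}
        owners.discard(None)
        if len(owners) == 1:  # read supports exactly one control
            counts[owners.pop()] += 1
    return counts

def summarize(counts, expected):
    """Return (identity_confirmed, contamination_fraction) for one sample."""
    total = sum(counts.values())
    if total == 0:
        return False, None  # no control reads detected at all
    top_control = counts.most_common(1)[0][0]
    contamination = 1 - counts[expected] / total
    return top_control == expected, contamination
```

In use, `controls` would hold the panel of roughly 200-base control sequences and `expected` the control assigned to the sample at receipt; a sample whose dominant control differs from `expected`, or whose contamination fraction exceeds a laboratory-chosen threshold, would be flagged for review.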
