Bioinformatics. 2016 Sep 1;32(17):i576-i585. doi: 10.1093/bioinformatics/btw454.

Characterizing leader sequences of CRISPR loci.

Author information

1. Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany.
2. Archaea Centre, Department of Biology, University of Copenhagen N, DK2200 Copenhagen N, Denmark.
3. Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany; BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Freiburg im Breisgau, Germany.

Abstract

MOTIVATION:

The CRISPR-Cas system is an adaptive immune system found in many archaea and bacteria that provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and inserted into a CRISPR array, generally adjacent to its so-called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, very little is known about sequence or structural motifs or their possible functions. Leader properties have mainly been characterized through transcriptional initiation data from single organisms, but large-scale characterization of leaders has remained challenging due to their low level of sequence conservation.

RESULTS:

We developed a method to successfully detect leader sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from distinctive sizes to preferential indel localization. CRISPRleader provides a full annotation of the CRISPR array, its strand orientation as well as conserved core leader boundaries that can be uploaded to any genome browser. In addition, it outputs reader-friendly HTML pages for conserved leader clusters from our database.
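
To make the notion of a consensus repeat concrete, the following is a minimal illustrative sketch and not the CRISPRleader implementation. It assumes the repeats of one array are already available as equal-length strings and takes a simple per-column majority vote; the function name `consensus_repeat` and the example sequences are hypothetical.

```python
from collections import Counter


def consensus_repeat(repeats):
    """Return the column-wise majority-vote consensus of equal-length repeats.

    Illustrative assumption: repeats are pre-aligned and of equal length.
    Ties are resolved in favour of the base seen first in that column.
    """
    if not repeats:
        raise ValueError("need at least one repeat")
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("repeats must have equal length (align them first)")
    # zip(*repeats) iterates over alignment columns; pick the most frequent base.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*repeats))


# Made-up repeat instances from a single hypothetical array.
example_repeats = ["GTTTCAGA", "GTTTCAGA", "GTTACAGA", "GTTTCAGG"]
print(consensus_repeat(example_repeats))  # prints GTTTCAGA
```

A real pipeline would also need to handle repeats of unequal length and determine strand orientation, neither of which this sketch attempts.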

AVAILABILITY AND IMPLEMENTATION:

CRISPRleader and multiple sequence alignments for all 195 leader clusters are available at http://www.bioinf.uni-freiburg.de/Software/CRISPRleader/

CONTACT:

costa@informatik.uni-freiburg.de or backofen@informatik.uni-freiburg.de

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID: 27587677
DOI: 10.1093/bioinformatics/btw454
[Indexed for MEDLINE]
