Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures

Jean-Francois Taly; Cedrik Magis; Giovanni Bussotti; Jia-Ming Chang; Paolo Di Tommaso; Ionas Erb; Jose Espinosa-Carrasco; Carsten Kemena; Cedric Notredame

doi:10.1038/nprot.2011.393

Nat Protoc. 2011 Nov;6(11):1669-82. doi: 10.1038/nprot.2011.393.

Authors

Jean-Francois Taly¹, Cedrik Magis, Giovanni Bussotti, Jia-Ming Chang, Paolo Di Tommaso, Ionas Erb, Jose Espinosa-Carrasco, Carsten Kemena, Cedric Notredame

Affiliation

¹ Comparative Bioinformatics Group, Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain.

PMID: 21979275
DOI: 10.1038/nprot.2011.393

Abstract

T-Coffee (Tree-based consistency objective function for alignment evaluation) is a versatile multiple sequence alignment (MSA) method suitable for aligning most types of biological sequences. The main strength of T-Coffee is its ability to combine third-party aligners and to integrate structural (or homology) information when building MSAs. The series of protocols presented here shows how the package can be used to multiply align protein, RNA and DNA sequences. The protein section shows how users can select the most suitable T-Coffee mode for their data set. Detailed protocols cover T-Coffee, the default mode; M-Coffee, a meta version able to combine several third-party aligners into one; PSI (position-specific iterated)-Coffee, the homology-extended mode suitable for remote homologs; and Expresso, the structure-based multiple aligner. We also show how the T-RMSD (tree based on root mean square deviation) option can be used to produce a functionally informative structure-based clustering. RNA alignment procedures are described for R-Coffee, a mode able to use predicted RNA secondary structures when aligning RNA sequences. DNA alignments are illustrated with Pro-Coffee, a multiple aligner specific to promoter regions. We also present some of the many reformatting utilities bundled with T-Coffee. The package is open-source freeware available from http://www.tcoffee.org/.
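As a rough illustration of how the modes named in the abstract map onto command lines, the following minimal Python sketch assembles a `t_coffee` invocation for a chosen mode. It only builds the argument list and does not run the program. The helper name `build_tcoffee_command`, the dictionary `MODE_FLAGS` and the file name `sequences.fasta` are hypothetical. The `-mode` values are assumptions based on the T-Coffee documentation and should be checked against the installed version.

```python
from typing import Dict, List, Optional

# Mapping from the modes named in the abstract to the value passed to
# T-Coffee's -mode flag. These values are assumptions taken from the
# T-Coffee documentation; verify them against your installed version.
MODE_FLAGS: Dict[str, Optional[str]] = {
    "tcoffee": None,           # default mode: no -mode flag
    "mcoffee": "mcoffee",      # meta-aligner combining third-party aligners
    "psicoffee": "psicoffee",  # homology-extended mode for remote homologs
    "expresso": "expresso",    # structure-based alignment
    "rcoffee": "rcoffee",      # RNA, using predicted secondary structures
    "procoffee": "procoffee",  # DNA promoter regions
}


def build_tcoffee_command(fasta_path: str, mode: str = "tcoffee") -> List[str]:
    """Return the argument list for a t_coffee run (hypothetical helper)."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    command = ["t_coffee", fasta_path]
    flag = MODE_FLAGS[mode]
    if flag is not None:
        command += ["-mode", flag]
    return command


if __name__ == "__main__":
    # Print one command line per mode for a placeholder input file.
    for mode in MODE_FLAGS:
        print(" ".join(build_tcoffee_command("sequences.fasta", mode)))
```

Once T-Coffee is installed, the returned list could be passed to `subprocess.run`. Keeping command construction separate from execution makes it easy to inspect the command line before launching a long alignment.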

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Amino Acid Sequence
Base Sequence
DNA / chemistry*
Models, Molecular
Molecular Sequence Data
Nucleic Acid Conformation*
Proteins / chemistry*
RNA / chemistry*
Sequence Alignment / methods*
Software

Substances

Proteins
RNA
DNA