Proteins. 2018 Jul;86(7):759-776. doi: 10.1002/prot.25510. Epub 2018 May 6.

Clustering of multi-domain protein sequences.

Author information

1. Indian Institute of Science Mathematics Initiative, Bangalore, 560012, India.
2. Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India.
3. Institute of Bioinformatics and Applied Biotechnology, Bangalore, 560100, India.

Abstract

The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment-based methods commonly use domain-level information and provide classification only at the level of domains. Such methods cannot account for the contributions of the other domains in a protein or of domain-linker regions, and therefore cannot classify multi-domain proteins as a whole. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins), was previously developed in our laboratory specifically to handle multi-domain protein sequences without requiring domain boundaries or the sequential order of domains to be defined. Through this method we aim to achieve a biologically meaningful classification scheme for multi-domain protein sequences. In this article, CLAP-based classification is explored on 5 datasets of multi-domain proteins, and we present a detailed analysis for proteins containing (1) tyrosine phosphatase and (2) SH3 domains. At the domain level, the CLAP-based classification scheme resulted in a clustering similar to that obtained from an alignment-based method. CLAP-based clusters obtained for full-length datasets were shown to comprise proteins with similar functions and domain architectures. Our study demonstrates that multi-domain proteins can be classified effectively by considering full-length sequences, without requiring identification of domains in the sequence.
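
To illustrate what alignment-free clustering of full-length sequences can look like in general, the following is a minimal sketch. It is not the CLAP algorithm, whose scoring scheme is not described in this abstract. It assumes a generic k-mer count profile, a cosine distance, and single-linkage grouping under an arbitrary threshold; the function names (kmer_profile, cosine_distance, cluster) and the defaults k=3 and threshold=0.5 are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kmer_profile(seq, k=3):
    """Count overlapping k-mers over the full-length sequence (no domain boundaries needed)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(p, q):
    """Return 1 - cosine similarity between two k-mer count profiles."""
    dot = sum(count * q[kmer] for kmer, count in p.items())
    norm = sqrt(sum(c * c for c in p.values())) * sqrt(sum(c * c for c in q.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def cluster(seqs, k=3, threshold=0.5):
    """Single-linkage clustering: link any two sequences closer than the threshold."""
    names = list(seqs)
    profiles = {n: kmer_profile(seqs[n], k) for n in names}
    parent = {n: n for n in names}  # union-find forest over sequence names

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if cosine_distance(profiles[a], profiles[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())
```

Calling cluster with a dictionary that maps sequence identifiers to amino-acid strings returns a sorted list of clusters, each a sorted list of identifiers. Because the profile is computed over the whole sequence, linker regions and every domain contribute to the distance, which is the general property the abstract attributes to full-length, alignment-free comparison.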

KEYWORDS:

alignment-free method; multi-domain protein; protein classification; sequence classification

PMID: 29675880
DOI: 10.1002/prot.25510
