Display Settings:


Send to:

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2007 Jun 1;23(11):1410-7. Epub 2007 Mar 28.

SherLoc: high-accuracy prediction of protein subcellular localization by integrating text and protein sequence data.

Author information

  • 1School of Computing, Queen's University, Kingston, Ontario, Canada. shatkay@cs.queensu.ca



Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations.


We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While applying the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way in which it selects its text sources and features, and integrates those with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previous reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins.


SherLoc, along with Supplementary Information, is available at: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc/

[PubMed - indexed for MEDLINE]
Free full text
PubMed Commons home

PubMed Commons

How to join PubMed Commons

    Supplemental Content

    Icon for HighWire
    Loading ...
    Write to the Help Desk