BMC Bioinformatics. 2007 Oct 30;8:423.

PubMed related articles: a probabilistic topic-based model for content similarity.

Author information

  • 1. College of Information Studies, University of Maryland, College Park, Maryland, USA. jimmylin@umd.edu



We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH in MEDLINE.
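
The idea of deciding whether a document is "about" a topic from Poisson-modeled term frequencies can be illustrated with a small sketch. The abstract does not state the pmra formula, so the code below is not that formula: it simply applies Bayes' rule to a two-Poisson mixture. The rates `lam_elite` and `lam_background`, the prior, and both function names are hypothetical values chosen for illustration.

```python
import math


def poisson_pmf(count, rate):
    """Probability of observing `count` events under a Poisson with mean `rate`."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def prob_about_topic(term_count, lam_elite=3.0, lam_background=0.5, prior=0.1):
    """Posterior probability that a document is about a topic, given how
    often the topic's term occurs in it.

    Assumption (not from the abstract): term counts follow one Poisson in
    documents that are about the topic (`lam_elite`) and another in
    documents that are not (`lam_background`); `prior` is the fraction of
    documents assumed to be about the topic.
    """
    p_elite = prior * poisson_pmf(term_count, lam_elite)
    p_background = (1.0 - prior) * poisson_pmf(term_count, lam_background)
    return p_elite / (p_elite + p_background)


if __name__ == "__main__":
    for count in range(6):
        print(count, round(prob_about_topic(count), 3))
```

With these illustrative parameters the posterior rises with the term count, which is the qualitative behavior such a model is meant to capture.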


The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision.
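
For readers unfamiliar with the baseline, the following is a minimal sketch of a commonly used form of the bm25 scoring function, together with a precision-at-k helper of the kind used to compare rankings. The parameter defaults (`k1=1.2`, `b=0.75`), the non-negative idf variant, and the function names are assumptions for illustration; the abstract does not say which variant, parameters, or precision cutoff the experiments used.

```python
import math


def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """Score one document against a query with a common bm25 variant.

    doc_freq maps each term to the number of documents containing it.
    k1 and b are conventional defaults, assumed here for illustration.
    """
    doc_len = len(doc_terms)
    score = 0.0
    for term in set(query_terms):
        tf = doc_terms.count(term)
        n = doc_freq.get(term, 0)
        if tf == 0 or n == 0:
            continue  # term contributes nothing to this document
        # Non-negative idf variant (the "+ 1" inside the log is an assumption).
        idf = math.log((num_docs - n + 0.5) / (n + 0.5) + 1.0)
        # Term-frequency saturation with document-length normalization.
        norm = tf + k1 * (1.0 - b + b * doc_len / avg_doc_len)
        score += idf * tf * (k1 + 1.0) / norm
    return score


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    return sum(1 for doc_id in top if doc_id in relevant_ids) / k
```

A comparison like the one described would rank the same candidate documents with each model and compute precision over the resulting lists, for example `precision_at_k(ranking, judged_relevant, 5)` with hypothetical inputs.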


Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search.

[PubMed - indexed for MEDLINE]
Free PMC Article
