Nat Protoc. 2009;4(8):1184-91. doi: 10.1038/nprot.2009.97. Epub 2009 Jul 23.

Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt.

Author information

  • Lawrence Berkeley National Laboratory, Berkeley, CA, USA. steffen@stat.berkeley.edu


Genomic experiments produce multiple views of biological systems, among them DNA sequence and copy number variation, and mRNA and protein abundance. Understanding these systems requires integrated bioinformatic analysis. Public databases such as Ensembl provide relationships and mappings between the relevant sets of probe and target molecules. However, the relationships can be biologically complex, and the content of the databases is dynamic. We demonstrate how to use the computational environment R to integrate and jointly analyze experimental datasets, employing BioMart web services to provide the molecule mappings. We also discuss typical problems encountered in making gene-to-transcript-to-protein mappings. The approach provides a flexible, programmable and reproducible basis for state-of-the-art bioinformatic data integration.
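
One mapping problem of the kind the abstract mentions can be illustrated with a small sketch: a single gene may have several transcripts, and only some of them encode a protein, so a gene-to-protein lookup is not one-to-one. The example below is written in Python rather than R, does not call biomaRt or any BioMart web service, and uses invented placeholder identifiers (geneA, txA1, protA1 and so on) and hypothetical function names. It is an assumption-laden illustration, not the protocol's own code.

```python
# Hypothetical toy rows of (gene, transcript, protein); these are NOT real
# Ensembl records, only placeholders to show the shape of the problem.
ROWS = [
    ("geneA", "txA1", "protA1"),
    ("geneA", "txA2", "protA2"),
    ("geneA", "txA3", None),      # a transcript with no protein product
    ("geneB", "txB1", "protB1"),
]


def proteins_by_gene(rows):
    """Group protein identifiers under each gene, skipping empty proteins."""
    mapping = {}
    for gene, _transcript, protein in rows:
        # Register the gene even if none of its transcripts has a protein.
        proteins = mapping.setdefault(gene, [])
        if protein is not None:
            proteins.append(protein)
    return mapping


def ambiguous_genes(mapping):
    """Return, in sorted order, the genes that map to more than one protein."""
    return sorted(gene for gene, prots in mapping.items() if len(prots) > 1)


if __name__ == "__main__":
    mapping = proteins_by_gene(ROWS)
    print(mapping)
    print(ambiguous_genes(mapping))
```

Flagging the ambiguous genes explicitly, rather than silently keeping the first match, leaves the choice of how to resolve a one-to-many relationship to the analyst.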

[PubMed - indexed for MEDLINE]
Free PMC Article