    BMC Bioinformatics. 2004 Jun 23;5:80.

    Mistaken identifiers: gene name errors can be introduced inadvertently when using Excel in bioinformatics.

    Source

    Genomics & Bioinformatics Group, Laboratory of Molecular Pharmacology, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD 20892 USA. barry@discover.nci.nih.gov

    Abstract

    BACKGROUND:

    When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names.

    RESULTS:

    A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered.
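    The two conversion classes described above can be illustrated with a small heuristic check. This is a minimal sketch, not the authors' scripts: the function name `classify_identifier`, the month-prefix list, and both regular expressions are assumptions made for illustration, and they only approximate which strings a spreadsheet would reinterpret. The sample identifiers (SEPT2, MARCH1, DEC1, and the Riken-style 2310009E13) are commonly cited cases of this problem.

```python
import re

# Month-like prefixes a spreadsheet may read as the start of a date.
# This list is an illustrative assumption, not an exhaustive rule set.
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# Month prefix, optional separator, then a one- or two-digit number (e.g. SEPT2).
_DATE_LIKE = re.compile(
    r"^(?:%s)[-/ ]?\d{1,2}$" % "|".join(_MONTHS), re.IGNORECASE
)

# Digits, the letter E, then digits: looks like scientific notation (e.g. 2310009E13).
_FLOAT_LIKE = re.compile(r"^\d+(?:\.\d+)?E[+-]?\d+$", re.IGNORECASE)


def classify_identifier(name: str) -> str:
    """Return 'date-like', 'float-like', or 'ok' for a gene identifier."""
    s = name.strip()
    if _DATE_LIKE.match(s):
        return "date-like"
    if _FLOAT_LIKE.match(s):
        return "float-like"
    return "ok"


if __name__ == "__main__":
    for ident in ["SEPT2", "MARCH1", "DEC1", "2310009E13", "TP53"]:
        print(ident, classify_identifier(ident))
```

    Running such a check on an identifier column before the file is opened in a spreadsheet flags the names at risk, so that the column can be imported as text rather than in the default general format.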

    CONCLUSIONS:

    Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem.

    PMID:
    15214961
    [PubMed - indexed for MEDLINE]
    PMCID: PMC459209
    Free PMC Article
