Time for a unified system of mutation description and reporting: a review of locus-specific mutation databases

Genome Res. 2002 May;12(5):680-8. doi: 10.1101/gr.217702.

Abstract

Mutation databases of human genes are assuming increasing importance in all areas of health care. In addition, more and more experts in the mutations and diseases of particular genes are curating published and unpublished mutations in locus-specific databases (LSDBs). These databases contain such extensive information that they have become known as knowledge bases. We analyzed these databases and their content between June 21, 2001, and July 18, 2001. We were able to access 94 independent websites devoted to the documentation of mutations, which together contained 262 LSDBs. Because each gene in a multigene website generally documented the same criteria, we analyzed one LSDB from each of these websites (i.e., 94 LSDBs) for the presence or absence of 80 content criteria. No criterion studied showed unanimous agreement across the databases. Twenty-two genes were represented by more than one LSDB. Excluding p53, the number of mutations recorded was 23,822, with 1518 polymorphisms. Fifty-four percent of the LSDBs studied were easy to use and 11% were hard to follow; 73% of the databases were displayed through HTML. Three databases received a high score for ease of use and wealth of content. Thus, the study provided a strong case for uniformity of data to make the content maximally useful. To this end, we derived hypothetical content for an ideal LSDB. We also derived a community structure that would increase the chances of mutations being captured rather than left unpublished in a patient's report. We hope the interested community and granting bodies will assist in achieving the vision of a public system that collects and displays all variants discovered.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Computational Biology / trends
  • Data Collection / methods
  • Databases, Genetic / standards*
  • Databases, Genetic / statistics & numerical data
  • Databases, Genetic / trends*
  • Documentation / methods
  • Genes
  • Genetic Diseases, Inborn / genetics
  • Genetic Markers / genetics*
  • Genotype
  • Humans
  • Internet
  • Mutation / genetics*
  • Phenotype
  • Proteins / classification
  • Proteins / genetics
  • Software

Substances

  • Genetic Markers
  • Proteins