Display Settings:

Format

Send to:

Choose Destination
    DNA Seq. 2004 Oct-Dec;15(5-6):362-4.

    DNA sequence error rates in Genbank records estimated using the mouse genome as a reference.

    Source

    University of Edinburgh, School of Biological Sciences, Ashworth Laboratories, UK.

    Abstract

    We estimate DNA sequence error rates in Genbank records containing protein-coding and non-coding DNA sequences by comparing sequences of the inbred mouse strain C57BL/6J, sequenced as part of the mouse genome project and independently by other laboratories. C57BL/6J was produced by more than 100 generations of brother-sister mating, and can be assumed to be virtually free of residual polymorphism and mutational variation, so differences between independent sequences can be attributed to error. The estimated single nucleotide error rate for coding DNA is 0.10% (SE 0.012%), which is substantially lower than previous estimates for error rates in Genbank accessions. The estimated single nucleotide error rate for intronic DNA sequences (0.22%; SE 0.051%) is significantly higher than the rate for coding DNA. Since error rates for the mouse genome sequence are very low, the vast majority of the errors we detected are likely to be in individual Genbank accessions. The frequency of insertion-deletion (indel) errors in non-coding DNA approaches that of single nucleotide errors in non-coding DNA, whereas indel errors are uncommon in coding sequences.

    PMID:
    15621661
    [PubMed - indexed for MEDLINE]

      Supplemental Content

      Save items

      loading

      Recent activity

      Your browsing activity is empty.

      Activity recording is turned off.

      Turn recording back on

      See more...
      Write to the Help Desk