## Computation of Average Heterozygosity and Standard Error for dbSNP RefSNP Clusters

John Spouge, Lon Phan, and Stephen Sherry

Average heterozygosity is computed for each refSNP cluster based on all the variation data submitted for ss# members.
There are three types of variation data in dbSNP: direct measures of heterozygosity in a sample, "binned" allele frequency estimates that can only be resolved to a small number of classes, and point estimates based on moderate to large sample sizes.
Estimates of heterozygosity are computed for each class and then summed with each term weighted by its standard error. This produces a linear estimate of