Biometrics. 2011 Jun;67(2):344-52. doi: 10.1111/j.1541-0420.2010.01455.x. Epub 2010 Jun 16.

Asymptotic conditional singular value decomposition for high-dimensional genomic data.

Author information

Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland 21205-2179, USA. jleek@jhsph.edu

Abstract

High-dimensional data, such as those obtained from a gene expression microarray or second generation sequencing experiment, consist of a large number of dependent features measured on a small number of samples. One of the key problems in genomics is the identification and estimation of factors that associate with many features simultaneously. Identifying the number of factors is also important for unsupervised statistical analyses such as hierarchical clustering. A conditional factor model is the most common model for many types of genomic data, ranging from gene expression, to single nucleotide polymorphisms, to methylation. Here we show that under a conditional factor model for genomic data with a fixed sample size, the right singular vectors are asymptotically consistent for the unobserved latent factors as the number of features diverges. We also propose a consistent estimator of the dimension of the underlying conditional factor model for a finite fixed sample size and an infinite number of features based on a scaled eigen-decomposition. We propose a practical approach for selection of the number of factors in real data sets, and we illustrate the utility of these results for capturing batch and other unmodeled effects in a microarray experiment using the dependence kernel approach of Leek and Storey (2008, Proceedings of the National Academy of Sciences of the United States of America 105, 18718-18723).
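The consistency claim for the right singular vectors can be illustrated with a small simulation. The sketch below is not the authors' code or their estimator: it assumes a plain factor model X = BG + E with independent standard normal loadings, factors, and noise, and the names `simulate` and `subspace_r2` are hypothetical helpers introduced only for this illustration. It measures how much of the latent factor matrix G is captured by the span of the leading right singular vectors of X while the sample size n stays fixed and the number of features m grows.

```python
import numpy as np


def simulate(m, n=20, r=2, seed=0):
    """Simulate an m x n data matrix X = B @ G + E (assumed toy model).

    m: number of features, n: number of samples (fixed and small),
    r: number of latent factors.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((r, n))  # latent factors, one row per factor
    B = rng.standard_normal((m, r))  # feature-specific loadings
    E = rng.standard_normal((m, n))  # independent noise
    return B @ G + E, G


def subspace_r2(X, G):
    """Fraction of the variation in G explained by the top right singular vectors of X."""
    r = G.shape[0]
    # Rows of Vt are the right singular vectors, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:r].T                  # n x r orthonormal basis
    projected = V @ (V.T @ G.T)   # projection of G^T onto span(V)
    return float(np.sum(projected ** 2) / np.sum(G ** 2))


if __name__ == "__main__":
    # Sample size stays at n = 20 while the number of features grows.
    for m in (50, 500, 5000):
        X, G = simulate(m)
        print(m, subspace_r2(X, G))
```

Under these assumptions the reported fraction should approach 1 as m increases, which is the behaviour the abstract describes. The sketch takes the number of factors r as known; the paper's estimator of that dimension, based on a scaled eigen-decomposition, is not reproduced here.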

PMID: 20560929
PMCID: PMC3165001
DOI: 10.1111/j.1541-0420.2010.01455.x
[Indexed for MEDLINE]
Free PMC Article