PLoS One. 2014 Jan 14;9(1):e84696. doi: 10.1371/journal.pone.0084696. eCollection 2014.

Meta-analysis of repository data: impact of data regularization on NIMH schizophrenia linkage results.

Author information

  • 1. Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, Ohio, United States of America.
  • 2. Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America.
  • 3. Genomics Research Branch, The National Institute of Mental Health, Bethesda, Maryland, United States of America.

Abstract

Human geneticists are increasingly turning to study designs based on very large sample sizes to overcome difficulties in studying complex disorders. This in turn almost always requires multi-site data collection and processing of data through centralized repositories. While such repositories offer many advantages, including the ability to return to previously collected data to apply new analytic techniques, they also have some limitations. To illustrate, we reviewed data from seven older schizophrenia studies available from the NIMH-funded Center for Collaborative Genomic Studies on Mental Disorders, also known as the Human Genetics Initiative (HGI), and assessed the impact of data cleaning and regularization on linkage analyses. Extensive data regularization protocols were developed and applied to both genotypic and phenotypic data. Genome-wide nonparametric linkage (NPL) statistics were computed for each study, over various stages of data processing. To assess the impact of data processing on aggregate results, Genome-Scan Meta-Analysis (GSMA) was performed. Examples of increased, reduced and shifted linkage peaks were found when comparing linkage results based on original HGI data to results using post-processed data within the same set of pedigrees. Interestingly, reducing the number of affected individuals tended to increase rather than decrease linkage peaks. But most importantly, while the effects of data regularization within individual data sets were small, GSMA applied to the data in aggregate yielded a substantially different picture after data regularization. These results have implications for analyses based on other types of data (e.g., case-control GWAS or sequencing data) as well as data obtained from other repositories.

PMID: 24454738 [PubMed - indexed for MEDLINE]
PMCID: PMC3891773