Comput Stat Data Anal. 2009 Jan 15;53(3):788-800.

Internal validation inferences of significant genomic features in genome-wide screening.

Author information

  • Department of Biostatistics, St. Jude Children's Research Hospital, 332 N. Lauderdale Street, Memphis, TN 38105-2794.


Although validation of classification and prediction models has been a long-standing topic in statistics and computer learning, the concept of statistical validation in genome-wide screening studies has been vague. Internal validation generally refers to validation procedures based solely on the study dataset. A popular approach to internal validation of identified genomic features has been split-dataset validation. In contrast to this approach, internal validation in genome-wide association screening studies is here defined precisely through the concepts of association profile and profile significance. A general procedure and two specific profile significance measures are developed and compared with the split-dataset validation approach in a simulation study. The simulation results clearly demonstrate the strengths and limitations of the profile significance approach to internal validation, especially its enormous gain in sensitivity (power) and stability over split-dataset validation. The proposed methodology is illustrated by an example of genome-wide SNP association analysis in genetic epidemiology.
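
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of a generic split-dataset validation, not the authors' procedure and not their profile significance measures (which the abstract does not specify). Everything in it is an assumption made for illustration: the function names, the two-group design, the Welch statistic with a normal approximation, the two thresholds, and the simulated data. Features are screened on one random half of the samples and then retested on the other half.

```python
import math
import random
from statistics import mean, variance


def two_sample_p(x, y):
    """Two-sided p-value from a Welch statistic (normal approximation)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def split_dataset_validation(features, labels,
                             alpha_discovery=0.01,
                             alpha_validation=0.05,
                             seed=0):
    """Hypothetical helper: screen on one half, retest on the other.

    features: list of per-feature lists, one value per sample.
    labels:   list of 0/1 group labels, one per sample.
    Returns (discovered, validated) lists of feature indices.
    """
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    half = len(idx) // 2
    discovery_half, validation_half = idx[:half], idx[half:]

    def p_value(feature, samples):
        x = [feature[i] for i in samples if labels[i] == 0]
        y = [feature[i] for i in samples if labels[i] == 1]
        return two_sample_p(x, y)

    # Step 1: screen every feature using the discovery half only.
    discovered = [j for j, f in enumerate(features)
                  if p_value(f, discovery_half) < alpha_discovery]
    # Step 2: retest only the discovered features on the held-out half.
    validated = [j for j in discovered
                 if p_value(features[j], validation_half) < alpha_validation]
    return discovered, validated


# Simulated data (an assumption): 200 samples, 50 features,
# of which the first 3 differ between the two groups.
rng = random.Random(1)
labels = [0] * 100 + [1] * 100
features = [
    [rng.gauss(2.0 * lab if j < 3 else 0.0, 1.0) for lab in labels]
    for j in range(50)
]
discovered, validated = split_dataset_validation(features, labels)
print("discovered:", discovered)
print("validated: ", validated)
```

Because each test in this sketch sees only half of the samples, weaker associations are more easily missed, and the result depends on the particular random split. These are the sensitivity and stability concerns the abstract raises about the split-dataset approach.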
