BMC Genet. 2010; 11: 6. PMCID: PMC2825231

# PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs

^{1}Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan 250012, PR China

^{2}MRC Epidemiology Unit, Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK

Corresponding author.

## Abstract

### Background

Genetic association studies are currently the primary vehicle for identifying and characterizing disease-predisposing variants, and they usually involve multiple single-nucleotide polymorphisms (SNPs). However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods can account for correlations between neighbouring SNPs, yet their assumption of Hardy-Weinberg equilibrium (*HWE*) and potentially large number of degrees of freedom can harm statistical power and robustness. Approaches based on principal component analysis (*PCA*) are preferable in this regard, but their performance varies with the method used to extract principal components (*PC*s).

### Results

A *PCA*-based bootstrap confidence interval test (*PCA-BCIT*), which directly uses the *PC* scores to assess gene-disease association, was developed and evaluated for three ways of extracting *PC*s: from cases only (*CAES*), from controls only (*COES*), and from cases and controls combined (*CES*). Extraction of *PC*s with *COES* is preferred to that with *CAES* and *CES*. Performance of the test was examined via simulations and via analyses of data on rheumatoid arthritis and heroin addiction; the test maintained the nominal level under the null hypothesis and showed performance comparable to that of a permutation test.

### Conclusions

*PCA-BCIT* is a valid and powerful method for assessing gene-disease association involving multiple SNPs.

## Background

Genetic association studies now customarily involve multiple SNPs in candidate genes or genomic regions and play a significant role in identifying and characterizing disease-predisposing variants. A critical challenge in their statistical analysis is how to make optimal use of all available information. Population-based case-control studies have been very popular[1] and typically involve contingency table tests of SNP-disease association[2]. Notably, the genotype-wise Armitage trend test does not require *HWE* and has power equivalent to that of its allele-wise counterpart under *HWE*[3,4]. A thorny issue with testing SNPs individually for linkage disequilibrium (*LD*) in this setting is multiple testing: adjustment methods that assume independence, such as Bonferroni's[5,6], are known to be conservative[7]. It is therefore necessary to seek alternative approaches that use multiple SNPs simultaneously. The genotype-wise Armitage trend test is appealing since it is equivalent to the score test from logistic regression[8] of case-control status on the dosage of disease-predisposing alleles of a SNP. However, testing the effects of multiple SNPs simultaneously via logistic regression does not cure the difficulties of multicollinearity and the curse of dimensionality[9]. Haplotype-based methods have many desirable properties[10] and could possibly alleviate the problem[11-14], but they usually require the assumption of *HWE* and involve a potentially large number of degrees of freedom[7,11,15-18].

It has recently been proposed that *PCA* can be combined with the logistic regression test (*LRT*)[7,16,17] in a unified framework: *PCA* is conducted first to account for between-SNP correlations in a candidate region, then *LRT* is applied as a formal test of the association between *PC* scores (linear combinations of the original SNPs) and disease. Since *PC*s are orthogonal, this avoids multicollinearity and at the same time is less computer-intensive than haplotype-based methods. Studies have shown that *PCA-LRT* is at least as powerful as genotype- and haplotype-based methods[7,16,17]. Nevertheless, the power of *PCA*-based approaches varies with the way in which *PC*s are extracted, e.g., from genotype correlation, *LD*, or other kinds of metrics[17], and in principle *PCA* can be employed in frameworks other than logistic regression[7,16,17]. Here we investigate ways of extracting *PC*s using the genotype correlation matrix from different types of samples in a case-control study, and present a new approach that tests for gene-disease association by direct use of *PC* scores in a *PCA*-based bootstrap confidence interval test (*PCA-BCIT*). We evaluated its performance via simulations and compared it with *PCA-LRT* and the permutation test using real data.

## Methods

### PCA

Assume that *p* SNPs in a candidate region of interest have coded values (*X*_{1}, *X*_{2}, ⋯, *X*_{p}) according to a given genetic model (e.g., the additive model), with correlation matrix *C*. *PCA* solves the following equation,

$$C\,l_i = \lambda_i\,l_i,$$

where *l*_{i}'*l*_{i} = 1, *i* = 1,2, ⋯, *p*, *λ*_{i} is the *i*^{th} largest eigenvalue of *C*, and *l*_{i} = (*l*_{i1}, *l*_{i2}, ⋯, *l*_{ip})' are the loadings of the *PC*s. The score for an individual subject is

$$F_i = l_i' X = l_{i1}X_1 + l_{i2}X_2 + \cdots + l_{ip}X_p,$$

where cov(*F*_{i}, *F*_{j}) = 0, *i* ≠ *j*, and var(*F*_{1}) ≥ var(*F*_{2}) ≥ ⋯ ≥ var(*F*_{p}).
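
The eigen-decomposition described in this section can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `pca_from_correlation` and the toy genotypes are hypothetical, and each SNP is standardised before scoring (an assumption made here so that the variance of each score equals the corresponding eigenvalue of the correlation matrix).

```python
import numpy as np

def pca_from_correlation(genotypes):
    """Return eigenvalues, loadings (as columns) and PC scores for an n x p matrix."""
    X = np.asarray(genotypes, dtype=float)
    # Assumption: standardise each SNP so that cov(Z) equals the correlation matrix C.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    C = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh is appropriate: C is symmetric
    order = np.argsort(eigvals)[::-1]          # so that var(F_1) >= var(F_2) >= ...
    eigvals, loadings = eigvals[order], eigvecs[:, order]
    scores = Z @ loadings                      # column i holds F_i for every subject
    return eigvals, loadings, scores

# Hypothetical toy data: additive coding (0/1/2 allele copies), 200 subjects, 4 SNPs.
rng = np.random.default_rng(0)
toy = rng.integers(0, 3, size=(200, 4))
eigvals, loadings, scores = pca_from_correlation(toy)
```

In this sketch the columns of `loadings` are orthonormal and the sample covariance matrix of `scores` is diagonal, which corresponds to the conditions *l*_{i}'*l*_{i} = 1 and cov(*F*_{i}, *F*_{j}) = 0 stated above.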

### Methods of extracting *PC*s

In principle, *PCA* can be conducted on case-control data via four distinct extracting strategies (*ES*): 0. calculate *PC* scores of individuals in cases and controls separately (*SES*); 1. use cases only (*CAES*) to obtain loadings, which are then used to calculate *PC* scores for subjects in both cases and controls; 2. use controls only (*COES*) to obtain the loadings for both groups; and 3. use cases and controls combined (*CES*) to obtain the loadings for both groups. In a case-control association study, loadings calculated separately from cases and from controls are likely to have different connotations, so we consider only strategies 1-3 hereafter. More formally, let (*X*_{1}, *X*_{2}, ⋯, *X*_{p}) and (*Y*_{1}, *Y*_{2}, ⋯, *Y*_{p}) be *p*-dimensional vectors of SNPs at a given candidate region for cases and controls respectively; then we have,

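Strategy 1 (**CAES**):

$$C_{XX}\,l_i^{X} = \lambda_i^{X}\,l_i^{X},$$

where *C*_{XX} is the correlation matrix of (*X*_{1}, *X*_{2}, ⋯, *X*_{p}), and *l*_{i}^{X}'*l*_{i}^{X} = 1, *i* = 1,2, ⋯, *p*. The *i*^{th} *PC* for cases is calculated by

$$F_i^{X} = l_i^{X\prime} X,$$

and for controls

$$F_i^{Y} = l_i^{X\prime} Y.$$

Strategy 2 (**COES**):

$$C_{YY}\,l_i^{Y} = \lambda_i^{Y}\,l_i^{Y},$$

where *C*_{YY} is the correlation matrix of (*Y*_{1}, *Y*_{2}, ⋯, *Y*_{p}), and *l*_{i}^{Y}'*l*_{i}^{Y} = 1. The *i*^{th} *PC* for controls is calculated by

$$F_i^{Y} = l_i^{Y\prime} Y.$$

And for cases, the *i*^{th} *PC*, *i* = 1,2, ⋯, *p*, is calculated by

$$F_i^{X} = l_i^{Y\prime} X.$$

Strategy 3 (**CES**):

$$C\,l_i = \lambda_i\,l_i,$$

where *C* is the correlation matrix obtained from the pooled data of cases and controls, and *l*_{i}'*l*_{i} = 1. The *i*^{th} *PC* of cases is calculated by

$$F_i^{X} = l_i' X.$$

The *i*^{th} *PC* of controls is calculated by

$$F_i^{Y} = l_i' Y.$$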
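
The three strategies differ only in which sample supplies the loadings, which the following NumPy sketch tries to make explicit. The function names `loadings_from` and `pc_scores` and the toy genotypes are hypothetical. As an assumption, loadings are applied to the coded genotypes without further centring or scaling; other conventions (for example, standardising both groups with a common mean and standard deviation) are possible.

```python
import numpy as np

def loadings_from(reference):
    """Eigenvectors of the correlation matrix of `reference`, largest eigenvalue first."""
    C = np.corrcoef(reference, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    return eigvecs[:, np.argsort(eigvals)[::-1]]

def pc_scores(cases, controls, strategy="COES"):
    """Score both groups with loadings from cases, controls, or the pooled sample."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    reference = {
        "CAES": cases,                         # strategy 1: loadings from cases
        "COES": controls,                      # strategy 2: loadings from controls
        "CES": np.vstack([cases, controls]),   # strategy 3: loadings from pooled data
    }[strategy]
    L = loadings_from(reference)
    # Assumption: the same loadings are applied directly to the coded genotypes
    # of both groups, with no centring or scaling.
    return cases @ L, controls @ L

# Hypothetical toy genotypes (0/1/2 allele counts) for 4 SNPs.
rng = np.random.default_rng(0)
toy_cases = rng.binomial(2, 0.4, size=(150, 4))
toy_controls = rng.binomial(2, 0.3, size=(200, 4))
case_scores, control_scores = pc_scores(toy_cases, toy_controls, strategy="COES")
```

Because eigenvectors are defined only up to sign, the sign of any column of scores is arbitrary; this matters when scores from different resamples are compared.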

### PCA-BCIT

Given a sample of *N* cases and *M* controls with *p*-SNP genotypes (*X*_{1}, *X*_{2}, ⋯, *X*_{N})^{T} and (*Y*_{1}, *Y*_{2}, ⋯, *Y*_{M})^{T}, where *X*_{i} = (*x*_{1i}, *x*_{2i}, ⋯, *x*_{pi}) for the *i*^{th} case and *Y*_{i} = (*y*_{1i}, *y*_{2i}, ⋯, *y*_{pi}) for the *i*^{th} control, *PCA-BCIT* proceeds in three steps:

#### Step 1: Sampling

Replicate samples of cases and controls are drawn with replacement, separately, from the original cases and controls, giving (*X*_{1}^{(b)}, *X*_{2}^{(b)}, ⋯, *X*_{N}^{(b)})^{T} and (*Y*_{1}^{(b)}, *Y*_{2}^{(b)}, ⋯, *Y*_{M}^{(b)})^{T}, *b* = 1,2, ⋯, *B* (*B* = 1000).

#### Step 2: *PCA*

For each replicate sample obtained at Step 1, *PCA* is conducted and a given number of *PC*s is retained with a threshold of 80% explained variance for all three strategies[16]; the *k*^{th} retained *PC* is expressed as $F_k^{X(b)}$ for cases and $F_k^{Y(b)}$ for controls.
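
The 80% rule can be read as keeping the smallest number of leading *PC*s whose eigenvalues sum to at least 80% of the total. A minimal sketch under that reading follows; the function name `n_components_for` and the example eigenvalues are hypothetical.

```python
import numpy as np

def n_components_for(eigvals, threshold=0.80):
    """Smallest k such that the first k PCs explain at least `threshold` of the variance."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # searchsorted returns the first index whose cumulative share reaches the threshold.
    return int(np.searchsorted(cumulative, threshold) + 1)

# Hypothetical eigenvalues of a 9-SNP correlation matrix (they sum to 9).
example = [4.5, 2.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.06, 0.04]
k = n_components_for(example)
```

With these made-up eigenvalues the cumulative shares are 0.50, 0.72 and 0.83 for the first three *PC*s, so three components are retained.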

#### Step 3: *PCA-BCIT*

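**3a**) For each replicate, the mean of the *k*^{th} *PC* in cases is calculated by

$$\bar{F}_k^{X(b)} = \frac{1}{N}\sum_{i=1}^{N} F_{ik}^{X(b)},$$

and that of the *k*^{th} *PC* in controls is calculated by

$$\bar{F}_k^{Y(b)} = \frac{1}{M}\sum_{i=1}^{M} F_{ik}^{Y(b)},$$

where $F_{ik}^{X(b)}$ and $F_{ik}^{Y(b)}$ are the scores on the *k*^{th} *PC* of the *i*^{th} case and the *i*^{th} control in replicate *b*.

**3b**) Given confidence level (1 - α), the confidence interval of the mean of the *k*^{th} *PC* in cases, $\bar{F}_k^{X}$, is estimated by the percentile method, with form

$$\left(\bar{F}^{X}_{k,(\alpha/2)},\ \bar{F}^{X}_{k,(1-\alpha/2)}\right),$$

where $\bar{F}^{X}_{k,(\alpha/2)}$ is the 100(α/2)^{th} percentile of the bootstrap means $\bar{F}_k^{X(1)}, \ldots, \bar{F}_k^{X(B)}$, and $\bar{F}^{X}_{k,(1-\alpha/2)}$ is the 100(1 - α/2)^{th} percentile.

The confidence interval of the mean of the *k*^{th} *PC* in controls, $\bar{F}_k^{Y}$, is estimated by

$$\left(\bar{F}^{Y}_{k,(\alpha/2)},\ \bar{F}^{Y}_{k,(1-\alpha/2)}\right),$$

where $\bar{F}^{Y}_{k,(\alpha/2)}$ is the 100(α/2)^{th} percentile of $\bar{F}_k^{Y(1)}, \ldots, \bar{F}_k^{Y(B)}$, and $\bar{F}^{Y}_{k,(1-\alpha/2)}$ is the 100(1 - α/2)^{th} percentile.

**3c**) The confidence intervals of cases and controls are compared. The null hypothesis is rejected if the two intervals do not overlap, that is, if $\bar{F}_k^{X}$ and $\bar{F}_k^{Y}$ are statistically different[19], indicating that the candidate region is significantly associated with disease at level α. Otherwise, the candidate region is not significantly associated with disease at level α.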
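
The three steps can be put together as in the sketch below, which should be read as one possible rendering under stated assumptions, not as the authors' code. All names (`sorted_eig`, `reference`, `pca_bcit`, `toy_group`) are hypothetical. Three choices are assumptions made here: the number of retained *PC*s is fixed once from the original sample; eigenvector signs in each replicate are aligned to the original-sample loadings, since eigenvectors are defined only up to sign; and loadings are applied to the coded genotypes without centring. The sketch returns a per-*PC* decision and does not combine decisions across *PC*s.

```python
import numpy as np

def sorted_eig(sample):
    """Eigenvalues and eigenvectors of the SNP correlation matrix, largest first."""
    vals, vecs = np.linalg.eigh(np.corrcoef(sample, rowvar=False))
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def reference(cases, controls, strategy):
    """Sample from which the loadings are extracted under each strategy."""
    return {"CAES": cases, "COES": controls,
            "CES": np.vstack([cases, controls])}[strategy]

def pca_bcit(cases, controls, strategy="COES", alpha=0.05, B=1000,
             threshold=0.80, seed=0):
    rng = np.random.default_rng(seed)
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    # Assumption: the number of PCs (k) and the sign convention are fixed once,
    # from the original sample, so that replicates are comparable.
    vals0, L0 = sorted_eig(reference(cases, controls, strategy))
    k = int(np.searchsorted(np.cumsum(vals0) / vals0.sum(), threshold) + 1)
    case_means = np.empty((B, k))
    ctrl_means = np.empty((B, k))
    for b in range(B):
        # Step 1: resample cases and controls separately, with replacement.
        Xb = cases[rng.integers(0, len(cases), len(cases))]
        Yb = controls[rng.integers(0, len(controls), len(controls))]
        # Step 2: loadings from this replicate's reference sample, sign-aligned.
        _, L = sorted_eig(reference(Xb, Yb, strategy))
        L = L[:, :k]
        L = L * np.where(np.sum(L * L0[:, :k], axis=0) < 0, -1.0, 1.0)
        # Step 3a: mean PC scores in each group.
        case_means[b] = (Xb @ L).mean(axis=0)
        ctrl_means[b] = (Yb @ L).mean(axis=0)
    # Step 3b: percentile confidence intervals, one row per retained PC.
    q = [100 * alpha / 2, 100 * (1 - alpha / 2)]
    ci_cases = np.percentile(case_means, q, axis=0).T
    ci_ctrls = np.percentile(ctrl_means, q, axis=0).T
    # Step 3c: flag PCs whose two intervals do not overlap.
    disjoint = (ci_cases[:, 0] > ci_ctrls[:, 1]) | (ci_ctrls[:, 0] > ci_cases[:, 1])
    return ci_cases, ci_ctrls, disjoint

def toy_group(rng, n, p, f):
    """Hypothetical toy genotypes: two noisy copies of a shared haplotype per subject."""
    hap = rng.random((n, 2, 1)) < f        # carrier status of each haplotype
    flip = rng.random((n, 2, p)) < 0.1     # independent per-SNP noise
    return (hap ^ flip).sum(axis=1)        # allele counts 0/1/2, shape (n, p)

toy_rng = np.random.default_rng(1)
toy_cases = toy_group(toy_rng, 300, 5, 0.6)
toy_controls = toy_group(toy_rng, 300, 5, 0.3)
ci_cases, ci_ctrls, disjoint = pca_bcit(toy_cases, toy_controls, B=200)
```

The toy generator builds strongly correlated SNPs with a large frequency difference between groups, so it is only a smoke test. A SNP that becomes monomorphic in a resample would make the correlation matrix undefined, which this sketch does not handle.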

### Simulation studies

We examine the performance of *PCA-BCIT* through simulations with data from the North American Rheumatoid Arthritis (RA) Consortium (NARAC) (868 cases and 1194 controls)[20], taking advantage of the fact that the association between protein tyrosine phosphatase non-receptor type 22 (*PTPN22*) and the development of RA has been established[21-24]. Nine SNPs were selected from the *PTPN22* region (114157960-114215857), and most of them lie within the same LD block (Figure 1). Females are more predisposed to RA (73.85% of cases) and are used in our simulation to ensure homogeneity. The steps of the simulation are as follows.

**Figure 1. LD (*r*^{2}) among nine *PTPN22* SNPs.** The nine *PTPN22* SNPs are rs971173, rs1217390, rs878129, rs11811771, rs11102703, rs7545038, rs1503832, rs12127377, rs11485101. The triangle marks a single LD block within this region: (rs878129, rs11811771, rs11102703, rs7545038, ...

#### Step 1: Sampling

The observed genotype frequencies in the study sample are taken to be the true frequencies in populations of infinite size. Replicate samples of cases and controls of a given size *N* (*N* = 100, 200, ⋯, 1000) are generated; their estimated genotype frequencies are expected to be close to the true population frequencies, while both the allele frequencies and the *LD* structure are maintained. Under the null hypothesis, replicate cases and controls are both sampled with replacement from the controls. Under the alternative hypothesis, replicate cases and controls are sampled with replacement from the cases and the controls respectively.

#### Step 2: *PCA-BCIT*

For each replicate sample, *PCA-BCIT* is conducted with each of the three strategies of extracting *PC*s outlined above to test the association between *PC* scores and disease (RA).

#### Step 3: Evaluating performance of *PCA-BCIT*s

Repeat steps 1 and 2 *K* times (*K* = 1000) under both the null and the alternative hypotheses, and obtain the frequencies (*P*_{α}) of rejecting the null hypothesis at level α (α = 0.05).
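
The evaluation loop itself is independent of the test being evaluated, as the sketch below illustrates. The names `rejection_rate` and `z_test_on_allele_sum` are hypothetical, the genotype pools are simulated toy data rather than the NARAC data, and the z-test is only a simple stand-in so the example runs on its own; it is not *PCA-BCIT*. Any function that takes replicate cases and controls and returns a reject/do-not-reject decision could be passed in its place.

```python
import numpy as np

def rejection_rate(test, cases, controls, n, K=1000, null=True, seed=0):
    """Fraction of K replicate samples (n per group) in which `test` rejects."""
    rng = np.random.default_rng(seed)
    cases, controls = np.asarray(cases), np.asarray(controls)
    # Under the null both groups are drawn from the controls; under the
    # alternative, replicate cases come from cases and controls from controls.
    case_pool = controls if null else cases
    rejections = 0
    for _ in range(K):
        rep_cases = case_pool[rng.integers(0, len(case_pool), n)]
        rep_ctrls = controls[rng.integers(0, len(controls), n)]
        rejections += bool(test(rep_cases, rep_ctrls))
    return rejections / K

def z_test_on_allele_sum(cases, controls):
    """Stand-in test (not PCA-BCIT): two-sample z-test on per-subject allele totals."""
    a, b = cases.sum(axis=1), controls.sum(axis=1)
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return abs(a.mean() - b.mean()) / se > 1.96

# Hypothetical genotype pools standing in for the observed cases and controls.
pool_rng = np.random.default_rng(42)
pool_controls = pool_rng.binomial(2, 0.30, size=(500, 4))
pool_cases = pool_rng.binomial(2, 0.45, size=(500, 4))

type1 = rejection_rate(z_test_on_allele_sum, pool_cases, pool_controls,
                       n=200, K=500, null=True)
power = rejection_rate(z_test_on_allele_sum, pool_cases, pool_controls,
                       n=200, K=500, null=False)
```

Here `type1` estimates the rejection frequency when cases and controls come from the same pool, and `power` estimates it when they come from different pools.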

### Applications

*PCA-BCIT* is applied both to the NARAC data on *PTPN22* in 1493 females (641 cases and 852 controls) described above and to a data set containing nine SNPs near the μ-opioid receptor gene (*OPRM1*) in Han Chinese from Shanghai (91 cases and 245 controls) with the endophenotype of heroin-induced positive responses on first use[25]. There are two LD blocks in the region of the *OPRM1* gene (Figure 2).

## Results

### Simulation study

The performance of *PCA-BCIT* is shown in Table 1 for the three strategies over a range of sample sizes. Strategies 2 and 3 both have type I error rates close to the nominal level (α = 0.05), but those from strategy 1 deviate heavily from it. When the sample size is larger than 800, the power of *PCA-BCIT* is above 0.8, and strategies 2 and 3 slightly outperform strategy 1.

### Applications

For the NARAC data, the Armitage trend test reveals none of the SNPs to be in significant association with RA after Bonferroni correction (Table 2), but the results of *PCA-BCIT* with strategies 2 and 3 show that the first *PC* extracted in the region of *PTPN22* is significantly associated with RA. The results are similar to those from the permutation test (Table 3).

For the *OPRM1* data, the sample characteristics are comparable between cases and controls (Table 4), and three SNPs (rs696522, rs1381376 and rs3778151) show significant association with the endophenotype (Table 5). The results of *PCA-BCIT* with strategies 2 and 3 and of the permutation test are all significant at level α = 0.01. In contrast, the result from *PCA-LRT* is not significant at level α = 0.05 with strategy 2 (Table 3). The apparent separation of cases and controls is shown in Figure 3 for *PCA-BCIT* with strategy 3, suggesting an intuitive interpretation.

## Discussion

In this study, a *PCA*-based bootstrap confidence interval test[19,26-28] (*PCA-BCIT*) is developed to study gene-disease association using all SNPs genotyped in a given region. *PCA*-based approaches have several attractive features. First, they are at least as powerful as genotype- and haplotype-based methods[7,16,17]. Secondly, they capture LD information between correlated SNPs and are easy to compute, without the need to consider multicollinearity and multiple testing. Thirdly, *BCIT* integrates point estimation and hypothesis testing into a single inferential statement of great intuitive appeal[29] and does not rely on distributional assumptions about the statistic used to calculate the confidence interval[19,26-29].

While there are several different but closely related forms of bootstrap confidence interval calculation[28], we focus on percentiles of the asymptotic distribution of *PC*s at given confidence levels to estimate the confidence interval. *PCA-BCIT* is a data-learning method[29], and it is shown in our study to be valid and powerful for a sufficiently large number of replicates. Our investigation of three strategies of extracting *PC*s reveals that strategy 1 is invalid, while strategies 2 and 3 are acceptable. From analyses of real data we find that *PCA-BCIT* compares favourably with *PCA-LRT* and the permutation test. A practical advantage of *PCA-BCIT* is that it offers an intuitive measure of the difference between cases and controls based on the set of SNPs (*PC* scores) in a candidate region (Figure 3). As extraction of *PC*s through *COES* is more in line with the principle of a case-control study, it will be our method of choice given that its performance is comparable to that of *CES*. Nevertheless, *PCA-BCIT* has the limitation that it does not directly handle covariates, as is usually done in a regression model.

## Conclusions

*PCA-BCIT* is both a valid and a powerful *PCA*-based method that captures multi-SNP information in the study of gene-disease association. While extraction of *PC*s by *CAES*, *COES* and *CES* all gave good performance, *COES* appears to be more appropriate to use.

## Abbreviations

*SNP*: single nucleotide polymorphism; *HWE*: Hardy-Weinberg equilibrium; *LD*: linkage disequilibrium; *LRT*: logistic regression test; *PCA*: principal component analysis; *PC*: principal component; *ES*: extracting strategy; *SES*: separate case and control extracting strategy (strategy 0); *CAES*: case-based extracting strategy (strategy 1); *COES*: control-based extracting strategy (strategy 2); *CES*: combined case and control extracting strategy (strategy 3); *BCIT*: bootstrap confidence interval test.

## Authors' contributions

QQP, JHZ, and FZX conceptualized the study, acquired and analyzed the data, and prepared the manuscript. All authors approved the final manuscript.

## Acknowledgements

This work was supported by a grant from the National Natural Science Foundation of China (30871392). We wish to thank Dr. Dandan Zhang (Fudan University) and NARAC for supplying us with the data, and the Associate Editor and anonymous referees for comments which greatly improved the manuscript. Special thanks go to the referee for the insightful comment that extraction of *PC*s with controls is in line with case-control principles.

## References

1. Morton NE, Collins A. Tests and estimates of allelic association in complex inheritance. Proc Natl Acad Sci USA. 1998;95:11389–11393. doi: 10.1073/pnas.95.19.11389.
2. Sasieni PD. From genotypes to genes: doubling the sample size. Biometrics. 1997;53:1253–1261. doi: 10.2307/2533494.
3. Gordon D, Haynes C, Yang Y, Kramer PL, Finch SJ. Linear trend tests for case-control genetic association that incorporate random phenotype and genotype misclassification error. Genet Epidemiol. 2007;31:853–870. doi: 10.1002/gepi.20246.
4. Slager SL, Schaid DJ. Case-control studies of genetic markers: Power and sample size approximations for Armitage's test for trend. Human Heredity. 2001;52:149–153. doi: 10.1159/000053370.
5. Sidak Z. On Multivariate Normal Probabilities of Rectangles: Their Dependence on Correlations. The Annals of Mathematical Statistics. 1968;39:1425–1434.
6. Sidak Z. On Probabilities of Rectangles in Multivariate Student Distributions: Their Dependence on Correlations. The Annals of Mathematical Statistics. 1971;42:169–175. doi: 10.1214/aoms/1177693504.
7. Zhang FY, Wagener D. An approach to incorporate linkage disequilibrium structure into genomic association analysis. Journal of Genetics and Genomics. 2008;35:381–385. doi: 10.1016/S1673-8527(08)60055-7.
8. Balding DJ. A tutorial on statistical methods for population association studies. Nature Reviews Genetics. 2006;7:781–791. doi: 10.1038/nrg1916.
9. Schaid DJ, McDonnell SK, Hebbring SJ, Cunningham JM, Thibodeau SN. Nonparametric tests of association of multiple genes with human disease. American Journal of Human Genetics. 2005;76:780–793. doi: 10.1086/429838.
10. Becker T, Schumacher J, Cichon S, Baur MP, Knapp M. Haplotype interaction analysis of unlinked regions. Genetic Epidemiology. 2005;29:313–322. doi: 10.1002/gepi.20096.
11. Chapman JM, Cooper JD, Todd JA, Clayton DG. Detecting disease associations due to linkage disequilibrium using haplotype tags: A class of tests and the determinants of statistical power. Human Heredity. 2003;56:18–31. doi: 10.1159/000073729.
12. Epstein MP, Satten GA. Inference on haplotype effects in case-control studies using unphased genotype data. American Journal of Human Genetics. 2003;73:1316–1329. doi: 10.1086/380204.
13. Fallin D, Cohen A, Essioux L, Chumakov I, Blumenfeld M, Cohen D, Schork NJ. Genetic analysis of case/control data using estimated haplotype frequencies: Application to APOE locus variation and Alzheimer's disease. Genome Research. 2001;11:143–151. doi: 10.1101/gr.148401.
14. Stram DO, Pearce CL, Bretsky P, Freedman M, Hirschhorn JN, Altshuler D, Kolonel LN, Henderson BE, Thomas DC. Modeling and E-M estimation of haplotype-specific relative risks from genotype data for a case-control study of unrelated individuals. Human Heredity. 2003;55:179–190. doi: 10.1159/000073202.
15. Clayton D, Chapman J, Cooper J. Use of unphased multilocus genotype data in indirect association studies. Genetic Epidemiology. 2004;27:415–428. doi: 10.1002/gepi.20032.
16. Gauderman WJ, Murcray C, Gilliland F, Conti DV. Testing association between disease and multiple SNPs in a candidate gene. Genetic Epidemiology. 2007;31:383–395. doi: 10.1002/gepi.20219.
17. Oh S, Park T. Association tests based on the principal-component analysis. BMC Proc. 2007;1(Suppl 1):S130. doi: 10.1186/1753-6561-1-s1-s130.
18. Wang T, Elston RC. Improved power by use of a weighted score test for linkage disequilibrium mapping. American Journal of Human Genetics. 2007;80:353–360. doi: 10.1086/511312.
19. Heller G, Venkatraman ES. Resampling procedures to compare two survival distributions in the presence of right-censored data. Biometrics. 1996;52:1204–1213. doi: 10.2307/2532836.
20. Plenge RM, Seielstad M, Padyukov L, Lee AT, Remmers EF, Ding B, Liew A, Khalili H, Chandrasekaran A, Davies LRL. TRAF1-C5 as a risk locus for rheumatoid arthritis - A genomewide study. New England Journal of Medicine. 2007;357:1199–1209. doi: 10.1056/NEJMoa073491.
21. Begovich AB, Carlton VE, Honigberg LA, Schrodi SJ, Chokkalingam AP, Alexander HC, Ardlie KG, Huang Q, Smith AM, Spoerke JM. A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis. Am J Hum Genet. 2004;75:330–337. doi: 10.1086/422827.
22. Carlton VEH, Hu XL, Chokkalingam AP, Schrodi SJ, Brandon R, Alexander HC, Chang M, Catanese JJ, Leong DU, Ardlie KG. PTPN22 genetic variation: Evidence for multiple variants associated with rheumatoid arthritis. American Journal of Human Genetics. 2005;77:567–581. doi: 10.1086/468189.
23. Kallberg H, Padyukov L, Plenge RM, Ronnelid J, Gregersen PK, van der Helm-van Mil AHM, Toes REM, Huizinga TW, Klareskog L, Alfredsson L. Gene-gene and gene-environment interactions involving HLA-DRB1, PTPN22, and smoking in two subsets of rheumatoid arthritis. American Journal of Human Genetics. 2007;80:867–875. doi: 10.1086/516736.
24. Plenge RM, Padyukov L, Remmers EF, Purcell S, Lee AT, Karlson EW, Wolfe F, Kastner DL, Alfredsson L, Altshuler D. Replication of putative candidate-gene associations with rheumatoid arthritis in > 4,000 samples from North America and Sweden: Association of susceptibility with PTPN22, CTLA4, and PADI4. American Journal of Human Genetics. 2005;77:1044–1060. doi: 10.1086/498651.
25. Zhang D, Shao C, Shao M, Yan P, Wang Y, Liu Y, Liu W, Lin T, Xie Y, Zhao Y. Effect of mu-opioid receptor gene polymorphisms on heroin-induced subjective responses in a Chinese population. Biol Psychiatry. 2007;61:1244–1251. doi: 10.1016/j.biopsych.2006.07.012.
26. Carpenter J. Test Inversion Bootstrap Confidence Intervals. Journal of the Royal Statistical Society Series B (Statistical Methodology). 1999;61:159–172. doi: 10.1111/1467-9868.00169.
27. Davison AC, Hinkley DV, Young GA. Recent developments in bootstrap methodology. Statistical Science. 2003;18:141–157. doi: 10.1214/ss/1063994969.
28. DiCiccio TJ, Efron B. Bootstrap confidence intervals. Statistical Science. 1996;11:189–212. doi: 10.1214/ss/1032280214.
29. Efron B. Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics. 1979;7:1–26. doi: 10.1214/aos/1176344552.
