Copyright American Journal of Epidemiology. Published by the Johns Hopkins Bloomberg School of Public Health 2008.

Comparison of Statistical Methods for Estimating Genetic Admixture in a Lung Cancer Study of African Americans and Latinos

Correspondence to Dr. Melinda C. Aldrich, University of California, San Francisco, Box 2911 Rock Hall, Mission Bay 582, 1550 4th Street, San Francisco, CA 94143-2911 (e-mail: melinda.aldrich/at/ucsf.edu).

Received January 31, 2008; Accepted June 27, 2008.

Abstract

A variety of methods are available for estimating genetic admixture proportions in populations; however, few investigators have conducted detailed comparisons using empirical data. The authors characterized admixture proportions among self-identified African Americans (n = 535) and Latinos (n = 412) living in the San Francisco Bay Area who participated in a lung cancer case-control study (1998–2003). Individual estimates of genetic ancestry based on 184 informative markers were obtained from a Bayesian approach and 2 maximum likelihood approaches and were compared using descriptive statistics, Pearson correlation coefficients, and Bland-Altman plots. Case-control differences in individual admixture proportions were assessed using 2-sample t tests and logistic regression analysis. Results indicated that Bayesian and frequentist approaches to estimating admixture provide similar estimates and inferences. No difference was observed in admixture proportions between African-American cases and controls, but Latino cases and controls significantly differed according to Amerindian and European genetic ancestry. Differences in admixture proportions between Latino cases and controls were not unexpected, since cases were more likely to have been born in the United States. Genetic admixture proportions provide a quantitative measure of ancestry differences among Latinos that can be used in analyses of genetic risk factors.
Keywords: African Americans; case-control studies; epidemiologic methods; genetics, population; Hispanic Americans; linkage disequilibrium; lung neoplasms; statistics

An abundance of statistical methods and genetic markers are available with which to identify population substructure and estimate genetic ancestry in non-randomly mating populations recently formed from previously isolated populations, hence considered admixed populations (1). Genomic control (2) and structured association are 2 classes of statistical methods developed to control for population heterogeneity in admixed populations. Genomic control is a nonparametric method used to correct the inflated chi-square statistic caused by the presence of population heterogeneity. Numerous structured association methods have been developed for estimating genetic admixture. General approaches include the weighted least squares (3), maximum likelihood (4, 5), and Bayesian (6, 7) methods, although many variations exist (8–13). Structured association methods are often regarded as preferable to genomic control, since applying an inflation factor to the entire genome may either over- or undercorrect in certain scenarios (14). In addition, structured association methods provide a useful estimate of genetic ancestry for multivariable models.

Although self-reported ethnicity is associated with genetic ancestry (15, 16) and debate continues as to the implications of measuring genetic ancestry (17–22), self-reported ethnicity is unlikely to provide an accurate measure of genetic ancestry, as it represents a combination of known and unknown factors which are genetic, social, economic, and behavioral. With developments in statistical methods and identification of genetic markers, genetic ancestry has become a new variable in statistical models used to study disease associations.
However, the majority of published studies providing admixture estimates have been based on convenience samples, rather than epidemiologic investigations. Furthermore, limited comparative data exist for admixture estimation using Bayesian versus maximum likelihood approaches (9, 23–26), and frequently fewer than 50 markers have been used (4, 5, 25, 27–47).

African Americans have an approximately 2-fold higher incidence of lung cancer than Latinos in the United States (48). However, these observations are based upon self-reported ethnicity, and reasons for the disparate incidence rates remain unclear. Admixed populations, such as African Americans and Latinos, offer investigators a valuable opportunity to explore disparities in complex disease. In the present analysis, we compared Bayesian and frequentist approaches to estimating genetic admixture in African Americans and Latinos participating in a lung cancer case-control study.

MATERIALS AND METHODS

Subjects

Newly diagnosed lung cancer patients residing in the San Francisco Bay Area of California were identified through the Northern California Cancer Center and Summit Hospital from September 1998 through March 2003. Incident patients were eligible for participation if they 1) self-identified as African-American or Latino, 2) were 21 years of age or older, 3) resided within one of 5 adjacent counties (Alameda, Contra Costa, Santa Clara, San Francisco, or San Mateo), and 4) had a diagnosis of primary lung cancer. Cases meeting eligibility criteria were requested to participate in an in-person interview and to donate a biologic sample (blood or buccal smear). Cases were not excluded if they had been previously diagnosed with cancer. A total of 368 cases (255 African Americans, 113 Latinos) were included in this analysis.

Potential controls were recruited through 1) random digit dialing, 2) Health Care Financing Administration records for persons aged 65 years or older, and 3) community-based sources.
For each case, approximately 2 controls of the same age (±10 years), sex, and self-identified ethnicity were recruited. Eligible controls were requested to participate in an in-person interview and to donate a biologic sample. Extensive details of case and control recruitment are summarized elsewhere (49). A total of 579 controls (280 African Americans, 299 Latinos) were included in the analysis.

The study, designated the San Francisco Bay Area Lung Cancer Study, was approved by the University of California Committee for the Protection of Human Subjects. Written informed consent was obtained from all participating subjects.

Interview data collection and specimen processing

Demographic and epidemiologic data and biologic specimens were collected during the in-person interviews. Specimens were transported to a laboratory within 48 hours of collection and processed for long-term storage until they were ready for genotyping. When samples had been collected from all study participants, biospecimens were thawed and DNA was isolated by means of automated phenol chloroform extraction using the Autogen 3000 (Autogen, Inc., Holliston, Massachusetts). DNA concentration was measured by fluorescence (PicoGreen, Invitrogen Corporation, Carlsbad, California) and normalized to 30–100 ng/μL, for a total concentration of 150–500 ng. Whole genome amplification was performed on samples yielding insufficient DNA (2 blood samples and 6 buccal samples) in accordance with the Omniplex protocol (Sigma-Aldrich Corporation, St. Louis, Missouri). The amplified product was cleaned with Millipore's Montage PCR96 filter plate (Millipore Corporation, Billerica, Massachusetts) (50).
Marker selection

A panel of 184 autosomal single nucleotide polymorphisms distinguishing the continental ancestor populations comprising Latinos and African Americans (see Web Table 1, which is posted on the Journal’s website (http://aje.oxfordjournals.org/)) was genotyped using DNA from Europeans (San Francisco Bay Area, California; n = 47) (51), West Africans (Bantu and Nilo Saharan speakers, Nigeria; n = 46), and Amerindians (Mayans, Guatemala; n = 46) (52, 53). Mean fixation indices (FST), estimated using FSTAT following the method of Weir and Cockerham (54), were 0.52 for West Africans versus Europeans, 0.52 for West Africans versus Amerindians, and 0.48 for Europeans versus Amerindians.

Genotyping

DNA collected from participants was genotyped at the University of California, Davis, Genome Center using the Illumina Bead Station 500G Golden Gate genotyping platform (Illumina, Inc., San Diego, California) and a custom Illumina panel. A participant was selected for genotyping if he or she was a lung cancer case (Latino or African-American) or a Latino control. A random sample of African-American controls was selected to complete the study. Participants were removed from statistical analyses if, during the interview, they reported belonging to more than 1 ethnic group (n = 44) or the quality of their DNA sample was poor (n = 5); this resulted in a final sample of 947 admixed participants (African-American or Latino).

Statistical analysis

Statistical analyses and admixture estimation procedures were conducted separately for African Americans (n = 535) and Latinos (n = 412). Exact tests for Hardy-Weinberg equilibrium and the linkage disequilibrium measure r² were calculated using SAS/Genetics software (SAS Institute Inc., Cary, North Carolina). Correction for multiple testing among markers was conducted using the false discovery rate (55).
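The false discovery rate correction cited above (55) is commonly applied as the Benjamini-Hochberg step-up procedure. The following is a minimal sketch of that procedure, assuming a list of per-marker P values and a target rate of 0.05; the function name `benjamini_hochberg` and the choice of `alpha` are illustrative assumptions, not details taken from the study's SAS analysis.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which hypotheses are rejected at false discovery rate `alpha`.

    Illustrative helper (hypothetical name): returns a list of booleans in
    the same order as `p_values`.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up P values for ten markers, for demonstration only.
    demo = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(demo))
```

Because the rule is step-up, a marker whose own P value exceeds its rank threshold can still be rejected when a larger P value further down the ordering meets its threshold.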
A 1-sample Kolmogorov-Smirnov test was used to assess deviations from Hardy-Weinberg equilibrium by comparing significance probabilities obtained from the chi-square test with a uniform probability distribution. Maximum likelihood estimates of composite linkage disequilibrium, which makes no assumption about random mating or Hardy-Weinberg equilibrium, were computed (56). Chi-square tests yielded significance probabilities for assessing composite linkage disequilibrium which were compared with a nonparametric distribution from 30 iterations of sampling with replacement from all 15,976 pairwise comparisons of markers on different chromosomes. Using the multilocus genotyping data, individual admixture was estimated in African-American and Latino participants using a maximum likelihood method (designated MLK) developed for this project by one of the authors (S. S.) and written in R. Although an array of approaches have been developed for inferring population structure (3, 4, 6–11, 24, 57–63), STRUCTURE 2.1 (6–8) was selected for comparison, since it is a Bayesian method frequently used in published genetic association studies, single nucleotide polymorphism data can be input, and the software is freely available. IAE3CI (23, 39, 64) was also selected for comparison because it provided an alternate maximum likelihood method, single nucleotide polymorphism data could be input, it was easily implemented, and it was available to the authors. To improve ancestry assignment, ancestral population genotyping data were input in all models along with the genotypes of the admixed participants. Parameters for STRUCTURE were set according to author recommendations. An admixture model with independent allele frequencies was selected for inference of ancestry. 
A Markov chain Monte Carlo scheme (50,000 burn-in length and 50,000 iterations after burn-in) based on Gibbs sampling was implemented to generate the posterior distribution of admixture proportions given the observed genotype at each locus. To estimate the number of K subpopulations, Markov chain Monte Carlo analysis was conducted for K = 2 through K = 5 for African Americans and Latinos independently. The model with the largest log-likelihood was used to select the final K. Ancestral subjects with greater than 5% admixture (3 Amerindians, 6 Africans, and 6 Europeans) were identified using STRUCTURE and were subsequently excluded from Bayesian and maximum likelihood methods estimating admixture proportions for African-American and Latino participants. The maximum likelihood estimation program IAE3CI is based on methods from Hanis et al. (4) and Chakraborty et al. (5) and has been implemented in other studies (23, 38, 64). Bounds for admixture proportions are set to 0 ≤ m ≤ 1, where m represents the contribution of the parental populations to the hybrid population. The MLK program, which also follows estimation methods described by Chakraborty (65, 66) and Hanis et al. (4), initially allowed m to be unconstrained, imposing no minimum or maximum value. Maximum likelihood estimates of m1, m2, and m3 were then adjusted to sum to 1.0 such that a negative value was set to 0 and a value greater than 1 was set to 1.0. Comparisons were made between the estimates obtained from the Bayesian and maximum likelihood methods. Descriptive statistics were estimated and case-control differences were compared using t tests. Scatterplots, correlation coefficients, and Bland-Altman plots compared the admixture estimation approaches from STRUCTURE, IAE3CI, and MLK (constrained 0 ≤ m ≤ 1). 
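To make the maximum likelihood idea concrete, the sketch below estimates one individual's admixture proportions m from genotypes and known ancestral allele frequencies. It is not the MLK or IAE3CI code. It assumes biallelic markers coded as 0, 1, or 2 copies of a reference allele, Hardy-Weinberg proportions within each ancestry, and independent loci. As a swapped-in technique it uses an expectation-maximization update, which keeps every m between 0 and 1 by construction, rather than the unconstrained fit followed by adjustment described above. The helper `constrain` shows one possible reading of that adjustment (clip to [0, 1], then rescale to sum to 1). All function names are hypothetical.

```python
import math


def log_likelihood(m, genotypes, freqs):
    """Binomial log-likelihood of one individual's genotypes given admixture m.

    genotypes: list of 0/1/2 reference-allele counts, one per locus.
    freqs: list of per-locus tuples of reference-allele frequencies,
           one frequency per ancestral population.
    """
    total = 0.0
    for g, p in zip(genotypes, freqs):
        # Expected allele frequency for this individual: sum_k m_k * p_lk.
        q = sum(mk * pk for mk, pk in zip(m, p))
        q = min(max(q, 1e-9), 1 - 1e-9)  # guard against log(0)
        total += g * math.log(q) + (2 - g) * math.log(1 - q)
    return total


def estimate_admixture(genotypes, freqs, n_iter=500):
    """EM estimate of admixture proportions with ancestral frequencies fixed."""
    k = len(freqs[0])
    m = [1.0 / k] * k  # start from equal contributions
    for _ in range(n_iter):
        counts = [0.0] * k
        for g, p in zip(genotypes, freqs):
            # Weight of each ancestry as the origin of a reference allele copy
            # and of an alternative allele copy at this locus.
            w_ref = [mk * pk for mk, pk in zip(m, p)]
            w_alt = [mk * (1 - pk) for mk, pk in zip(m, p)]
            s_ref, s_alt = sum(w_ref), sum(w_alt)
            for j in range(k):
                if s_ref > 0:
                    counts[j] += g * w_ref[j] / s_ref
                if s_alt > 0:
                    counts[j] += (2 - g) * w_alt[j] / s_alt
        total = sum(counts)
        m = [c / total for c in counts]
    return m


def constrain(m):
    """Clip each proportion to [0, 1], then rescale so the proportions sum to 1."""
    clipped = [min(max(x, 0.0), 1.0) for x in m]
    s = sum(clipped)
    return [x / s for x in clipped]


if __name__ == "__main__":
    # Made-up data: six fully informative markers, two per ancestral population.
    freqs = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0), (0, 0, 1), (0, 0, 1)]
    genotypes = [1, 1, 1, 1, 0, 0]
    m_hat = estimate_admixture(genotypes, freqs)
    print(m_hat, log_likelihood(m_hat, genotypes, freqs))
    print(constrain([-0.1, 0.6, 0.5]))
```

An unconstrained optimizer applied to the same log-likelihood can return values of m outside [0, 1], which is the situation the adjustment step addresses.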
European ancestry, which contributed a relatively large amount of ancestry and had the least variability in both Latinos and African Americans, was dichotomized using the median value in each population, and logistic regression analyses were used to compare maximum likelihood admixture estimates of median-or-greater European ancestry between cases and controls. All statistical analyses were conducted using SAS, version 9.1, or R, version 2.5. All P values are 2-sided.

RESULTS

Latino controls were more likely than Latino cases to be foreign-born (Table 1). Mexican ancestry, determined using the reported birthplaces of the respondent and the 2 previous generations, was similar between Latino cases and controls, but controls had greater Central American ancestry than cases (30% and 10%, respectively). Additionally, 23% of cases but only 9% of controls were third-generation US-born.
Marker locations and allele frequencies for the ancestral populations genotyped for this study are displayed in Web Table 1 (http://aje.oxfordjournals.org/). Among the Latino controls, 9 loci did not conform to the expected Hardy-Weinberg equilibrium values (P < 0.05), and among African-American controls, 18 markers were out of Hardy-Weinberg equilibrium (P < 0.05) (data not shown). Correction for multiple testing resulted in only 1 marker being out of equilibrium for both African-American and Latino controls. Among African-American and Latino controls, 58% and 65% of loci had reduced heterozygosity, respectively, as compared with expected Hardy-Weinberg proportions. The Kolmogorov-Smirnov test indicated no departure from Hardy-Weinberg equilibrium for Latino or African-American controls (data not shown).

The largest values for the linkage disequilibrium measure r² were 0.63 and 0.79 for African-American and Latino controls, respectively. Eighteen (0.05%) and 4 (0.01%) marker pairs among all 32,765 pairwise comparisons for African-American and Latino controls, respectively, had r² values greater than 0.20. Assessment of composite linkage disequilibrium showed that the distribution of P values differed from the null distribution derived from 30 iterations of sampling with replacement, providing evidence of allelic association between loci in both African Americans and Latinos (data not shown).

STRUCTURE identified a 3-ancestral-population model (K = 3) that best fit the genotyping data for both African Americans and Latinos. For Latinos, STRUCTURE and MLK yielded the highest correlations for all ancestral populations (r = 0.99; see Figure 1).
Estimated mean admixture proportions from all ancestral populations were similar regardless of whether a Bayesian or maximum likelihood method was applied (Tables 2 and 3). Among Latinos, all 3 estimation methods showed that Amerindian and European genetic ancestry differed significantly between cases and controls (P < 0.01, Table 2). Among African Americans, none of the admixture proportions differed between cases and controls—a result confirmed by all 3 estimation programs (Table 3).
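Agreement between two estimation programs, which the Methods assess with Bland-Altman plots, can also be summarized numerically. The sketch below computes the quantities such a plot displays: the pairwise means, the pairwise differences, the mean difference (bias), and approximate 95% limits of agreement at bias ± 1.96 standard deviations. The function name `bland_altman` and the example values are illustrative assumptions and are not data from this study.

```python
import statistics


def bland_altman(estimates_a, estimates_b):
    """Return the quantities plotted in a Bland-Altman comparison.

    estimates_a, estimates_b: paired admixture estimates for the same
    individuals from two methods (hypothetical inputs).
    """
    # x axis of the plot: mean of the two methods for each individual.
    means = [(a + b) / 2 for a, b in zip(estimates_a, estimates_b)]
    # y axis of the plot: difference between the two methods.
    diffs = [a - b for a, b in zip(estimates_a, estimates_b)]

    bias = statistics.mean(diffs)   # systematic difference between methods
    sd = statistics.stdev(diffs)    # sample standard deviation of differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,  # lower limit of agreement
        "upper": bias + 1.96 * sd,  # upper limit of agreement
    }


if __name__ == "__main__":
    # Made-up European ancestry estimates for four individuals.
    method_a = [0.10, 0.20, 0.30, 0.40]
    method_b = [0.12, 0.18, 0.33, 0.37]
    result = bland_altman(method_a, method_b)
    print(result["bias"], result["lower"], result["upper"])
```

A bias near 0 with narrow limits of agreement indicates that the two methods can be used interchangeably, which a high correlation coefficient alone does not guarantee.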
For the following reasons, the maximum likelihood method MLK (assuming 0 ≤ m ≤ 1) was selected as the preferred program for estimating admixture: 1) admixture estimates were similar between all programs; 2) inferences about differences between case and control admixture were the same for all programs; 3) MLK almost perfectly correlated with STRUCTURE, an approach often used for estimating admixture; 4) epidemiology is solidly grounded in frequentist statistics; 5) fewer input parameters (assumptions) were required for MLK than for STRUCTURE; and 6) MLK required a fraction of the computing time necessitated by STRUCTURE. Figure 7
DISCUSSION

In this study, population substructure was estimated in self-identified African Americans and Latinos from the San Francisco Bay Area. The chi-square test and the Kolmogorov-Smirnov test displayed little if any evidence of deviations from Hardy-Weinberg equilibrium. Together, the presence of reduced heterozygosity and excess allelic associations between the physically unlinked ancestry informative markers suggest that modest population substructure was present among these African-American and Latino controls. The proportions of European, African, and Amerindian ancestry in the Latinos and African Americans participating in this study were consistent with estimates from prior studies (32, 38, 39, 41, 67–69).

Estimates derived from STRUCTURE and MLK consistently showed the greatest similarities. McKeigue et al. stated that Bayesian and frequentist approaches to estimating admixture will give similar estimates provided that sufficiently large samples are used, since the mean of the posterior distribution from Bayesian analyses will be “asymptotically equivalent to the maximum likelihood estimate” (10, p. 173). The results from this analysis are consistent with this statement.

Similar admixture proportions between African-American cases and controls and no association between European ancestry and lung cancer suggest that 1) the cases and controls in this study were well-matched according to genetic admixture proportions, 2) the markers used in this study failed to detect differences in this population, 3) any possible existing population structure did not influence sampling, or 4) environmental factors may play a larger role in the incidence of lung cancer for African Americans. One cannot rule out the possibility that susceptibility to lung cancer in African Americans is mediated by genetic factors.
For Latinos, case-control differences in Amerindian and European genetic ancestry suggest either that these populations were not sufficiently matched on ethnicity or that markers associated with European or Amerindian ancestry may alter lung cancer risk among Latinos. The observed increase in European ancestry among Latino cases as compared with controls is compatible with our previously reported differences in cases and controls (49, 70). Cases ascertained through the population-based cancer registry were more likely to have been born in the United States than controls, many of whom were recruited through community-based sources. Use of community-based control recruitment resulted in our having a greater proportion of more recent immigrants; thus, it is not unexpected that controls had lower levels of European genetic ancestry than cases. Latino controls may not have been selected from the same study base as the cases, possibly resulting in an insufficiently matched study population. Controlling for admixture proportions in genetic association studies is an appropriate strategy for addressing this imbalance, as long as genetic admixture is not in the causal pathway between exposure and disease (11). Although admixture estimates and statistical inferences were similar, both STRUCTURE and MLK have advantages and disadvantages. Advantages of STRUCTURE are that genetically similar clusters can be identified and the ancestry of admixed individuals of unknown origin can be estimated with or without data from ancestral populations of known origin. 
There are several weaknesses of this Bayesian approach: 1) identification of subpopulations is subjective (11); 2) identified clusters may not correspond to actual populations (6, 24); 3) extensive computing time is required; 4) the user must specify a number of priors and other parameters; and 5) a complicated prior distribution is specified, making it difficult to easily interpret the statistical methods being implemented or the consequences of deviations from the assumed model structure. The maximum likelihood program MLK has the advantage that no bounds are imposed on the admixture proportions and therefore proportions greater than 1 or less than 0 are identified. The unconstrained MLK model provides more accurate admixture estimates than a model imposing boundaries on m, since truncating admixture estimates induces a directional bias (Dr. Neil Risch, University of California, San Francisco, personal communication, 2006). When the research aim is to compare mean admixture estimates between cases and controls, an unconstrained MLK model is appropriate. When the aim of estimating admixture is to obtain individual admixture estimates and examine associations with disease, the assumption of 0 ≤ m ≤ 1 can be imposed. Weaknesses of the MLK program include: 1) starting values of m are required, although these are informed by prior studies; 2) the number of subpopulations in the admixed population cannot be estimated; 3) having missing genotyping data eliminates the observation; and 4) the ancestry of admixed individuals of unknown origin cannot be estimated without data from ancestral populations of known origin. Both the STRUCTURE method and the MLK method make several assumptions in these analyses. 
Both of these methods assume that 1) ancestral populations are in Hardy-Weinberg equilibrium within populations; 2) loci are in linkage equilibrium within subpopulations (6); 3) admixture occurred at the same point in time and randomly with respect to genotypes within populations; 4) there is no uncertainty in the ancestral population composition or the allele frequencies (4, 71); 5) no systematic change in allele frequencies has occurred in the parental or hybrid populations (71); and 6) there are only 3 contributing parental populations as determined by the sampling scheme (6). It is difficult to know whether these assumptions hold, since the ancestral populations from which African Americans and Latinos arose are unavailable. It is unlikely that the ancestral population composition and allele frequencies are known without error, and erroneous parental population selection can bias admixture estimates (4). Measurement error of admixture proportions can either bias effect estimates or lead to residual confounding when controlling for confounding by admixture. The ancestral populations genotyped for this study are unlikely to be fully representative of the actual parental populations, since modern descendants of these populations may have undergone genetic events resulting in differing allele frequencies from their ancestors. This limitation, which applies to both the Bayesian and maximum likelihood approaches, is reflected in the unconstrained estimates of m from MLK. While the maximum likelihood model is parsimonious, it can give illegal values (m > 1 or m < 0) when the model fails (72). Model failure indicates that the number of ancestral populations may be incorrectly specified for the individual cases or controls, the ancestral genotype frequencies have error, or both. 
Poor correlations between programs for Amerindian ancestry among the African Americans participating in this study suggest that the Mayan population may not have been a representative Amerindian population for the African Americans, although admixture estimates were similar to those of published reports (39). Strengths of this analysis include the large number of markers (73, 74) and the use of empirical data. Most genetic association studies have estimated admixture in Latinos and African Americans using a small number of markers for few diseases (see Web Tables 2 and 3, respectively (http://aje.oxfordjournals.org/)). Removal of ancestral individuals having less than 95% homogeneous ancestry allowed homogeneous ancestral allele frequencies to inform estimates of admixture, further strengthening this analysis. In summary, genetic association studies conducted in admixed populations should include a panel of markers to identify genetic differences in ancestry. If additional genotyping is too costly, investigators should consider the presence of false associations due to allele frequency differences between cases and controls when interpreting results. A maximum likelihood method provided admixture estimates similar to those of the more computationally intensive Bayesian approach. While there are readily available admixture estimation programs, it is important for genetic epidemiologists to understand the fundamental issues and assumptions contributing to the estimation process. With the ongoing development of statistical tools, identification of more informative genetic markers, and their increasing use in epidemiologic studies, the importance of these issues is emphasized. [Web Tables and Figures]
Acknowledgments

Author affiliations: Department of Medicine, School of Medicine, University of California, San Francisco, San Francisco, California (Melinda C. Aldrich); Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, California (Steve Selvin); Department of Neurological Surgery, School of Medicine, University of California, San Francisco, San Francisco, California (Helen M. Hansen, Margaret R. Wrensch, Jennette D. Sison, John K. Wiencke); Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, California (Lisa F. Barcellos, Patricia A. Buffler); Division of Research, Kaiser Permanente, Oakland, California (Charles P. Quesenberry); Department of Medicine, Division of Biological Sciences, University of Chicago, Chicago, Illinois (Rick A. Kittles); Obras Sociales del Hermano Pedro, Antigua, Guatemala (Gabriel Silva); and Rowe Program in Human Genetics, Departments of Biochemistry, Molecular Medicine, and Internal Medicine, School of Medicine, University of California, Davis, Davis, California (Michael F. Seldin).

This work was supported by grants from the National Institute of Environmental Health Sciences (R01ES06717 to J. K. W. and 2R01ES09137-06 to P. A. B.), the National Institute of Arthritis and Musculoskeletal and Skin Diseases (R01AR050267 to M. F. S.), and the National Institute of Diabetes and Digestive and Kidney Diseases (R01K071185 to M. F. S.).

The authors thank Dr. John Belmont for his support with collection of the Mayan population samples. They also thank the Northern California Cancer Center and Summit Hospital for their assistance with case ascertainment.

Conflict of interest: none declared.

References

1. Ziv E, Burchard EG. Human population structure and genetic association studies. Pharmacogenomics. 2003;4(4):431–441.
2. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55(4):997–1004.
3. Long JC. The genetic structure of admixed populations. Genetics. 1991;127(2):417–428.
4. Hanis CL, Chakraborty R, Ferrell RE, et al. Individual admixture estimates: disease associations and individual risk of diabetes and gallbladder disease among Mexican-Americans in Starr County, Texas. Am J Phys Anthropol. 1986;70(4):433–441.
5. Chakraborty R, Ferrell RE, Stern MP, et al. Relationship of prevalence of non-insulin-dependent diabetes mellitus to Amerindian admixture in the Mexican Americans of San Antonio, Texas. Genet Epidemiol. 1986;3(6):435–454.
6. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945–959.
7. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164(4):1567–1587.
8. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes. 2007;7(4):574–578.
9. Tang H, Peng J, Wang P, et al. Estimation of individual admixture: analytical and study design considerations. Genet Epidemiol. 2005;28(4):289–301.
10. McKeigue PM, Carpenter JR, Parra EJ, et al. Estimation of admixture and detection of linkage in admixed populations by a Bayesian approach: application to African-American populations. Ann Hum Genet. 2000;64(pt 2):171–186.
11. Hoggart CJ, Parra EJ, Shriver MD, et al. Control of confounding of genetic associations in stratified populations. Am J Hum Genet. 2003;72(6):1492–1504.
12. Hoggart CJ, Shriver MD, Kittles RA, et al. Design and analysis of admixture mapping studies. Am J Hum Genet. 2004;74(5):965–978.
13. Parra EJ, Kittles RA, Argyropoulos G, et al. Ancestral proportions and admixture dynamics in geographically defined African Americans living in South Carolina. Am J Phys Anthropol. 2001;114(1):18–29.
14. Marchini J, Cardon LR, Phillips MS, et al. The effects of human population structure on large genetic association studies. Nat Genet. 2004;36(5):512–517.
15. Tang H, Quertermous T, Rodriguez B, et al. Genetic structure, self-identified race/ethnicity, and confounding in case-control association studies. Am J Hum Genet. 2005;76(2):268–275.
16. Mountain JL, Risch N. Assessing genetic contributions to phenotypic differences among ‘racial’ and ‘ethnic’ groups. Nat Genet. 2004;36(11 suppl):S48–S53.
17. Tishkoff SA, Kidd KK. Implications of biogeography of human populations for ‘race’ and medicine. Nat Genet. 2004;36(11 suppl):S21–S27.
18. Rotimi CN. Are medical and nonmedical uses of large-scale genomic markers conflating genetics and ‘race’? Nat Genet. 2004;36(11 suppl):S43–S47.
19. Burchard EG, Ziv E, Coyle N, et al. The importance of race and ethnic background in biomedical research and clinical practice. N Engl J Med. 2003;348(12):1170–1175.
20. Race Ethnicity and Genetics Working Group. The use of racial, ethnic, and ancestral categories in human genetics research. Am J Hum Genet. 2005;77(4):519–532.
21. Fine MJ, Ibrahim SA, Thomas SB. The role of race and genetics in health disparities research. Am J Public Health. 2005;95(12):2125–2128.
22. Krieger N. Stormy weather: race, gene expression, and the science of health disparities. Am J Public Health. 2005;95(12):2155–2160.
23. Tsai HJ, Choudhry S, Naqvi M, et al. Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations. Hum Genet. 2005;118(3–4):424–433.
24. Wu B, Liu N, Zhao H. PSMIX: an R package for population structure inference via maximum likelihood method. BMC Bioinformatics. 2006;7:317.
25. Fernandez JR, Shriver MD, Beasley TM, et al. Association of African genetic admixture with resting metabolic rate and obesity among women. Obes Res. 2003;11(7):904–911.
26. Barnholtz-Sloan JS, Chakraborty R, Sellers TA, et al. Examining population stratification via individual ancestry estimates versus self-reported race. Cancer Epidemiol Biomarkers Prev. 2005;14(6):1545–1551.
27. Alarcon GS, Bastian HM, Beasley TM, et al. Systemic lupus erythematosus in a multi-ethnic cohort (LUMINA) XXXII: [corrected] contributions of admixture and socioeconomic status to renal involvement. Lupus. 2006;15(1):26–31.
28. Choudhry S, Ung N, Avila PC, et al. Pharmacogenetic differences in response to albuterol between Puerto Ricans and Mexicans with asthma. Am J Respir Crit Care Med. 2005;171(6):563–570.
29. Choudhry S, Burchard EG, Borrell LN, et al. Ancestry-environment interactions and asthma risk among Puerto Ricans. Am J Respir Crit Care Med. 2006;174(10):1088–1093.
30. Salari K, Choudhry S, Tang H, et al. Genetic admixture and asthma-related phenotypes in Mexican American and Puerto Rican asthmatics. Genet Epidemiol. 2005;29(1):76–86.
31. Sweeney C, Wolff RK, Byers T, et al. Genetic admixture among Hispanics and candidate gene polymorphisms: potential for confounding in a breast cancer study? Cancer Epidemiol Biomarkers Prev. 2007;16(1):142–150.
32. Ziv E, John EM, Choudhry S, et al. Genetic ancestry and risk factors for breast cancer among Latinas in the San Francisco Bay Area. Cancer Epidemiol Biomarkers Prev. 2006;15(10):1878–1885.
33. Chen H, Hernandez W, Shriver MD, et al. ICAM gene cluster SNPs and prostate cancer risk in African Americans. Hum Genet. 2006;120(1):69–76.
34. Wassel Fyr CL, Kanaya AM, Cummings SR, et al. Genetic admixture, adipocytokines, and adiposity in Black Americans: The Health, Aging, and Body Composition Study. Hum Genet. 2007;121(5):615–624.
35. Gallagher CJ, Keene KL, Mychaleckyj JC, et al. Investigation of the estrogen receptor-alpha gene with type 2 diabetes and/or nephropathy in African-American and European-American populations. Diabetes. 2007;56(3):675–684.
36. Gower BA, Fernandez JR, Beasley TM, et al. Using genetic admixture to explain racial differences in insulin-related phenotypes. Diabetes. 2003;52(4):1047–1051.
37. Higgins PB, Fernandez JR, Goran MI, et al. Early ethnic difference in insulin-like growth factor-1 is associated with African genetic admixture. Pediatr Res. 2005;58(5):850–854.
38. Peralta CA, Ziv E, Katz R, et al. African ancestry, socioeconomic status, and kidney function in elderly African Americans: a genetic admixture analysis. J Am Soc Nephrol. 2006;17(12):3491–3496.
39. Reiner AP, Ziv E, Lind DL, et al. Population structure, admixture, and aging-related phenotypes in African American adults: The Cardiovascular Health Study. Am J Hum Genet. 2005;76(3):463–477.
40. Shaffer JR, Kammerer CM, Reich D, et al. Genetic markers for ancestry are correlated with body composition traits in older African Americans. Osteoporos Int. 2007;18(6):733–741.
41. Tsai HJ, Shaikh N, Kho JY, et al. Beta 2-adrenergic receptor polymorphisms: pharmacogenetic response to bronchodilator among African American asthmatics. Hum Genet. 2006;119(5):547–557.
42. Duggan D, Zheng SL, Knowlton M, et al. Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst. 2007;99(24):1836–1844.
43. Hernandez W, Grenade C, Santos ER, et al. IGF-1 and IGFBP-3 gene variants influence on serum levels and prostate cancer risk in African-Americans. Carcinogenesis. 2007;28(10):2154–2159.
44. Hooker S, Bonilla C, Akereyeni F, et al. NAT2 and NER genetic variants and sporadic prostate cancer susceptibility in African Americans. Prostate Cancer Prostatic Dis. 2007 Nov 20 [Epub ahead of print]. (doi: 10.1038/sj.pcan.4501027).
45. Leak TS, Keene KL, Langefeld CD, et al. Association of the proprotein convertase subtilisin/kexin-type 2 (PCSK2) gene with type 2 diabetes in an African American population. Mol Genet Metab. 2007;92(1–2):145–150.
46. Yende S, Angus DC, Ding J, et al. 4G/5G plasminogen activator inhibitor-1 polymorphisms and haplotypes are associated with pneumonia. Am J Respir Crit Care Med. 2007;176(11):1129–1137.
47. Zuo L, Kranzler HR, Luo X, et al. CNR1 variation modulates risk for drug and alcohol dependence. Biol Psychiatry. 2007;62(6):616–626.
48. Ries LAG, Melbert D, Krapcho M, et al. SEER Cancer Statistics Review, 1975–2004. Bethesda, MD: National Cancer Institute; 2007. (http://seer.cancer.gov/csr/1975_2004/).
49. Wrensch MR, Miike R, Sison JD, et al. CYP1A1 variants and smoking-related lung cancer in San Francisco Bay Area Latinos and African Americans. Int J Cancer. 2005;113(1):141–147.
50. Hansen HM, Wiemels JL, Wrensch M, et al. DNA quantification of whole genome amplified samples for genotyping on a multiplexed bead array platform. Cancer Epidemiol Biomarkers Prev. 2007;16(8):1686–1690.
51. Wiemels JL, Wiencke JK, Kelsey KT, et al. Allergy-related polymorphisms influence glioma status and serum IgE levels. Cancer Epidemiol Biomarkers Prev. 2007;16(6):1229–1235.
52. Tian C, Hinds DA, Shigeta R, et al. A genomewide single-nucleotide-polymorphism panel with high ancestry information for African American admixture mapping. Am J Hum Genet. 2006;79(4):640–649.
53. Tian C, Hinds DA, Shigeta R, et al. A genomewide single-nucleotide-polymorphism panel for Mexican American admixture mapping. Am J Hum Genet. 2007;80(6):1014–1023.
54. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38:1358–1370.
55. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing.
J R Stat Soc Ser B. 1995;57:289–300. 56. Weir BS. Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sunderland, MA: Sinauer Associates; 1996. 57. Pritchard JK, Rosenberg NA. Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet. 1999;65(1):220–228. [PubMed] 58. Guillot G, Mortier F, Estoup A. Geneland: a computer package for landscape genetics. Mol Ecol Notes. 2005;5(3):712–715. 59. Corander J, Waldmann P, Marttinen P, et al. BAPS 2: enhanced possibilities for the analysis of genetic population structure. Bioinformatics. 2004;20(15):2363–2369. [PubMed] 60. Huelsenbeck JP, Andolfatto P. Inference of population structure under a Dirichlet process model. Genetics. 2007;175(4):1787–1802. [PubMed] 61. Purcell S, Sham P. Properties of structured association approaches to detecting population stratification. Hum Hered. 2004;58(2):93–107. [PubMed] 62. Dawson KJ, Belkhir K. A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genet Res. 2001;78(1):59–77. [PubMed] 63. Wang J. Maximum-likelihood estimation of admixture proportions from genetic data. Genetics. 2003;164(2):747–765. [PubMed] 64. Bonilla C, Parra EJ, Pfaff CL, et al. Admixture in the Hispanics of the San Luis Valley, Colorado, and its implications for complex trait gene mapping. Ann Hum Genet. 2004;68(pt 2):139–153. [PubMed] 65. Chakraborty R, Weiss KM. Frequencies of complex diseases in hybrid populations. Am J Phys Anthropol. 1986;70(4):489–503. [PubMed] 66. Chakraborty R. Gene admixture in human populations: models and predictions. Yearb Phys Anthropol. 1986;29:1–43. 67. Collins-Schramm HE, Phillips CM, Operario DJ, et al. Ethnic-difference markers for use in mapping by admixture linkage disequilibrium. Am J Hum Genet. 2002;70(3):737–750. [PubMed] 68. Reiner AP, Carlson CS, Ziv E, et al. 
Genetic ancestry, population sub-structure, and cardiovascular disease-related traits among African-American participants in the CARDIA Study. Hum Genet. 2007;121(5):565–575. [PubMed] 69. Collins-Schramm HE, Chima B, Morii T, et al. Mexican American ancestry-informative markers: examination of population structure and marker characteristics in European Americans, Mexican Americans, Amerindians and Asians. Hum Genet. 2004;114(3):263–271. [PubMed] 70. Cabral DN, Napoles-Springer AM, Miike R, et al. Population- and community-based recruitment of African Americans and Latinos: The San Francisco Bay Area Lung Cancer Study. Am J Epidemiol. 2003;158(3):272–279. [PubMed] 71. Reed TE. Caucasian genes in American Negroes. Science. 1969;165(895):762–768. [PubMed] 72. Chakraborty R, Kamboh MI, Ferrell RE. ‘Unique’ alleles in admixed populations: a strategy for determining ‘hereditary’ population differences of disease frequencies. Ethn Dis. 1991;1(3):245–256. [PubMed] 73. Risch N, Burchard E, Ziv E, et al. Categorization of humans in biomedical research: genes, race and disease. Genome Biol. 2002;3(7):comment2007.1–comment2007.12. [PubMed] 74. Pritchard JK, Donnelly P. Case-control studies of association in structured or admixed populations. Theor Popul Biol. 2001;60(3):227–237. [PubMed] |