
# Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis

^{1}Center for Neurobiology and Behavior, Department of Psychiatry, University of Pennsylvania School of Medicine, Philadelphia;

^{2}Department of Genetics, Trinity College, Dublin; and

^{3}Division of Genetics and Development, Guy’s, King’s & St Thomas’ School of Medicine, London

## Abstract

This is the first of three articles on a meta-analysis of genome scans of schizophrenia (SCZ) and bipolar disorder (BPD) that uses the rank-based genome scan meta-analysis (GSMA) method. Here we used simulation to determine the power of GSMA to detect linkage and to identify thresholds of significance. We simulated replicates resembling the SCZ data set (20 scans; 1,208 pedigrees) and two BPD data sets using very narrow (9 scans; 347 pedigrees) and narrow (14 scans; 512 pedigrees) diagnoses. Samples were approximated by sets of affected sibling pairs with incomplete parental data. Genotypes were simulated and nonparametric linkage (NPL) scores computed for 20 180-cM chromosomes, each containing six 30-cM bins, with three markers/bin (or two, for some scans). Genomes contained 0, 1, 5, or 10 linked loci, and we assumed relative risk to siblings (λ_{sibs}) values of 1.15, 1.2, 1.3, or 1.4. For each replicate, bins were ranked within-study by maximum NPL scores, and the ranks were averaged (*R*_{avg}) across scans. Analyses were repeated with weighted ranks ( for each scan). Two *P* values were determined for each *R*_{avg}: *P*_{AvgRnk} (the pointwise probability) and *P*_{ord} (the probability, given the bin’s place in the order of average ranks). GSMA detected linkage with power comparable to or greater than the underlying NPL scores. Weighting for sample size increased power. When no genomewide significant *P* values were observed, the presence of linkage could be inferred from the number of bins with nominally significant *P*_{AvgRnk}, *P*_{ord}, or (most powerfully) both. The results suggest that GSMA can detect linkage across multiple genome scans.

## Introduction

Schizophrenia (SCZ; locus SCZD [MIM #181500]) and bipolar disorder (BPD; loci MAFD1 [MIM 125480] and MAFD2 [MIM 309200]) are severe, common, genetically complex psychiatric phenotypes. As discussed in parts II and III of this series, there have been at least 20 SCZ and 22 BPD genome scans to date, with others in progress. Most twin and family studies suggest that there are independent genetic factors underlying these disorders (Levinson and Mowry 2000), but overlapping factors remain a possibility (Berrettini 2000): severe BPD cases have symptoms resembling SCZ, “schizoaffective” cases with mixed symptoms have increased familial risks of both disorders, and there are chromosomal regions in which linkage evidence has been reported for both SCZ and BPD. Most of the available genome scan studies were initiated early in the 1990s and have been relatively small; for SCZ, the average study included ~60 pedigrees, and for BPD the average study included 28. There have been only one BPD and three SCZ studies reported with >100 pedigrees in the complete scan, although a number of larger studies are currently in progress.

Although statistically significant findings have been reported, and some of these have been supported by other studies, there is no locus, for either disorder, that has consistently produced evidence for linkage in most studies. It is, therefore, likely that susceptibility is conferred by DNA sequence variations at combinations of loci, each with a small effect on risk (common polygenes), that the loci of greatest effect vary considerably across families and samples (locus heterogeneity), or both. It may be necessary to combine the results of multiple studies to have adequate power to detect these loci, until much larger samples are available.

The present series of articles uses a rank-based method, genome scan meta-analysis (GSMA) (Wise et al. 1999), to evaluate the evidence for linkage across available SCZ and BPD genome scans. This first article briefly reviews the theoretical basis for the method and then describes a set of simulation studies that estimate the power of GSMA to detect linked loci in data sets comparable to the SCZ and BPD data sets, across a range of genetic models, providing empirical standards for assessing statistical significance. The second and third articles then describe the GSMAs of SCZ and BPD, respectively (Lewis et al. 2003 [in this issue]; Segurado et al. 2003 [in this issue]).

Several other strategies are available for meta-analysis of linkage data. The most robust approach would be to obtain the original genotypes for each study, construct a combined map of the markers, and perform new linkage analyses. However, the genotypes are not available for all SCZ and BPD studies and in some cases are restricted by industry relationships. Construction of a combined map has also been problematic, although the new deCode map (Kong et al. 2002) should make this easier in the future. An alternative approach, multiple scan probability (MSP) (Badner and Gershon 2002*b*), has been applied to published SCZ and BPD data (Badner and Gershon 2002*a*). This approach combines *P* values after correcting for the size of the linkage area. GSMA and MSP are compared further below.

GSMA has a number of advantages over other available methods. It requires only placing markers within 30-cM bins, rather than determining precise positional relationships (although precision in localizing the linkage signal is thereby reduced). Because raw data are not required, it is straightforward for investigators to provide the necessary results (linkage scores, *P* values, or ranked data) for each location. When several genetic analyses have been performed in a particular study, results can be maximized to produce a single set of ranks for that study. No assumptions are made about models of inheritance or of genetic heterogeneity. However, GSMA provides no formal test of genetic heterogeneity, and interpretation of genomewide statistical significance currently relies on empirical grounds.

In contrast to the GSMA, meta-analysis in epidemiological studies provides a combined effect size (e.g., relative risk) with its confidence interval. These methods have directly interpretable parameter values and allow testing of heterogeneity between studies, but they require each included study to use the same statistical analysis and to test the same hypothesis. Such methods are difficult to apply to linkage studies, which commonly report a LOD score (i.e., a measure of significance) and not an effect size. Their extension to assess evidence of linkage across a region is also not straightforward. Novel methods for meta-analysis of linkage studies are therefore needed.

In summary, on the basis of the empirical standards for assessing significance reported in this article, the GSMA of SCZ genome scans produced significant evidence for linkage for 12–19 of 120 bins, depending on the approach to assessing significance. By contrast, the GSMA of BPD scans produced no clear statistically significant evidence for linkage, and the analysis was complicated by the many combinations of diagnoses included in linkage analyses by different studies. The bins with nominally significant *P* values for BPD showed no clear overlap with the most significant bins for SCZ. Results for each disorder and their implications are discussed in the second and third articles in this series (Lewis et al. 2003 [in this issue]; Segurado et al. 2003 [in this issue]).

### Rationale for the Simulation Studies

The simulation studies were intended to clarify the kinds of genetic effects that might be detected by GSMA in the SCZ and BPD data sets. Simplifying assumptions were necessary, given the unlimited range of possible genetic models. GSMA should have the greatest power to detect effects that are present in a substantial number of the individual studies (common polygenes) and less power to detect effects that are present in very few families or samples (extreme locus heterogeneity). We therefore designed a set of simulation studies based on the common polygene model. We recognize that there could be susceptibility loci for SCZ and/or BPD that can be detected only in particular ethnic groups or pedigree structures. As in all studies of complex disorders, GSMA can only produce positive evidence for certain effects; it cannot exclude other hypotheses.

There are advantages to using simulated samples of affected sibling pairs (ASPs) to model common polygenes. For families containing a single ASP, if one ignores diagnoses in parents or other relatives and if transmission is not recessive (i.e., if risks are similar in sibs, parents, and offspring), then there is a straightforward relationship between the locus-specific relative risk to siblings of affected probands (λ_{sibs}) and the power to detect linkage for either a single major locus or a multiplicative interaction among loci (James 1971; Risch 1987, 1990). Studying samples of ASPs therefore simplifies the simulations, because the exact choice of parameter values becomes unimportant as long as they predict the desired locus-specific λ_{sibs} (although, in the real case, λ_{sibs} values can be distorted by imprecise estimates of population prevalence, particularly for rarer disorders). A disadvantage of this approach is that extended pedigrees could have greater power under certain genetic models, but it would have been difficult here to select a limited yet plausible range of models. Much of the linkage information in available SCZ and BPD samples is contained in small nuclear families and, thus, in the sharing of alleles by ASPs. Therefore, we decided to assess the power of GSMA in samples of families with two affected sibs each (independent ASPs). Only dominant transmission models were studied, because recessive inheritance is unlikely, given the absence of differences between risks to sibs versus risks to parents/offspring for either disorder (Risch 1990) and because ASPs have less power to detect linkage under dominant transmission and, thus, results should be conservative. For comparison, recessive inheritance has been studied for one of the models at the lowest value of λ_{sibs} (1.15), and power was indeed substantially greater than for dominant transmission.

There is a common misconception that ASP analyses cannot detect linkage in the presence of genetic heterogeneity. For a known risk allele, locus-specific λ_{sibs} could be computed by genotyping a population-based sample to establish the risk allele frequency and penetrances and then determining the relative risk to a carrier proband’s sibs, parents, and offspring versus the population risk. In the absence of recessive effects, risks to siblings, parents, and offspring will be similar, and, in families transmitting the risk allele, ASP identical-by-descent (IBD) allele sharing would be predicted by the locus-specific λ_{sibs}: the proportion of ASPs sharing 0 alleles IBD, *z*_{0}, is 0.25/λ_{sibs} (Risch 1990). However, in linkage studies, the risk locus is unknown, and sharing is measured at nearby marker loci. If 20% of probands carry an allele that increases risk to sibs by threefold, the observed marker *z*_{0} will be the weighted average of *z*_{0} in the 80% of families with no linkage (0.25) and in the 20% of families with linkage (0.25/3=0.0833); thus, populationwide *z*_{0}=0.21667, and power will be similar to that for a locus-specific λ_{sibs}=0.25/0.21667=1.1538. Power will be similarly reduced for nonparametric analyses and for parametric heterogeneity LOD score analyses. Thus, in the studies presented here, genetic effects are expressed as populationwide λ_{sibs}. Results would be similar regardless of whether each locus produced a smaller increase in risk in many families or a larger increase in a small proportion of families.
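The heterogeneity arithmetic above can be reproduced in a few lines. This is only an illustrative sketch; the function name and its arguments are hypothetical and are not part of the original analysis.

```python
def populationwide_lambda_sibs(linked_fraction, lambda_in_linked):
    """Return (marker z0, equivalent populationwide lambda_sibs).

    linked_fraction: proportion of families segregating the risk allele.
    lambda_in_linked: locus-specific lambda_sibs within those families.
    """
    z0_unlinked = 0.25                      # no linkage: P(share 0 alleles IBD)
    z0_linked = 0.25 / lambda_in_linked     # Risch (1990): z0 = 0.25 / lambda_sibs
    z0_marker = (1 - linked_fraction) * z0_unlinked + linked_fraction * z0_linked
    return z0_marker, 0.25 / z0_marker


# Example from the text: 20% of families linked, threefold risk to sibs.
z0, lam = populationwide_lambda_sibs(0.20, 3.0)
print(round(z0, 5), round(lam, 4))
```

With the values from the text, the sketch returns the populationwide *z*_{0} and λ_{sibs} quoted above.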

We therefore estimated, for each actual SCZ or BPD genome scan, the number of ASPs with roughly equivalent linkage information, as described below. Genotypes were simulated for large pools of ASP families (two parents of unknown diagnosis and two affected siblings) on the basis of genetic models predicting specific values of λ_{sibs} or no linkage, and appropriate types and numbers of ASPs were randomly drawn from the pools of families. Given the λ_{sibs} values used here (1.15–1.4), individual samples in each replicate varied greatly in IBD sharing proportions, as would be observed with either a common polygene model or a moderate locus heterogeneity model. Linkage analyses were performed using nonparametric linkage (NPL) *Z*_{all} scores (GENEHUNTER 2.0), which correlate highly with ASP analyses. NPL scores simplified the simulation procedure because, even when multipoint analyses are used (as was the case here), NPL scores maximize at marker loci rather than between loci, so that only one data point per marker had to be considered.

## Material and Methods

### GSMA

GSMA was developed to deal with the diverse study designs, analysis methods, and marker densities used in genome scan studies. GSMA divides the autosomes into 120 bins of ~30 cM in length (X and Y chromosomes are not considered here). The 30-cM bin width is the largest that divides the smallest chromosomes into two bins each, and it ensures that scores within bins are correlated strongly and those between bins more weakly; for the real data sets in the following two articles (Lewis et al. 2003 [in this issue]; Segurado et al. 2003 [in this issue]), we have assessed the effects of bin placement by combining adjacent bins, but we have not comprehensively studied alternative bin widths. For consistency across GSMAs, the boundaries of these bins are defined by Généthon markers, as shown in the following two articles (Lewis et al. 2003 [in this issue]; Segurado et al. 2003 [in this issue]). Thus, markers located between ~0 cM and 30 cM on chromosome 1 are assigned to bin 1.1, those between 30 cM and 60 cM to bin 1.2, etc. To simplify the simulation studies reported here, a 3,600-cM genome was assumed, comprised of 20 180-cM autosomes, each containing six bins.

The ranking procedure is illustrated in appendix A. In brief, for each of *N* studies, the bins are ranked (1 = best) on the basis of the maximum linkage score or lowest *P* value observed within each bin. These are within-study ranks (*R*_{study}). All negative linkage scores are considered equivalent to 0, for consistency with methods that produce no negative values—we did not study the influence of this “floor” effect on results (i.e., ignoring the degree of negativity of a score, which could counterbalance a very positive score in another study). For bins with the same maximum scores (such as those with 0 scores), each bin is assigned the mean value of the ranks in their range. For weighted analyses, each rank is multiplied by the weight for that study—here, the standardized (see below). Then, the average rank (*R*_{avg}) is computed for each bin, across all *N* studies. The mean is 60.5 under the null hypothesis, and values closer to 1 indicate a clustering of evidence for linkage in that bin. Theoretical thresholds of significance for unweighted analyses are shown in order to provide the reader with a sense of the relevant ranges of values for different sample sizes. (Note: previous GSMAs reported *R*_{study} values ranked in descending order and summed rather than averaged them. The two procedures are statistically identical, but it is more intuitive for 1 to be “best.”)
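The unweighted ranking step can be sketched as follows. The function names and the toy scores are hypothetical, and only the unweighted case is shown. With 120 bins, the within-study ranks average (120+1)/2=60.5, which is the null mean quoted above.

```python
def within_study_ranks(scores):
    """Rank bins within one study (1 = best); tied bins share the mean rank."""
    floored = [max(s, 0.0) for s in scores]            # negative scores count as 0
    order = sorted(range(len(floored)), key=lambda i: -floored[i])
    ranks = [0.0] * len(floored)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next bin has the same (floored) score.
        while end + 1 < len(order) and floored[order[end + 1]] == floored[order[start]]:
            end += 1
        mean_rank = ((start + 1) + (end + 1)) / 2       # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def average_ranks(studies):
    """Unweighted R_avg per bin, averaging R_study across studies."""
    per_study = [within_study_ranks(s) for s in studies]
    return [sum(col) / len(per_study) for col in zip(*per_study)]


# Toy data: two studies, five bins each (maximum linkage score per bin).
toy_scores = [
    [2.1, 0.4, -0.3, 0.0, 1.0],
    [1.5, 1.7, 0.2, -1.0, -0.5],
]
print(average_ranks(toy_scores))
```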

The assessment of statistical significance is summarized in the third table of appendix A, in table 1, and in figure 1.

*P*_{ord}: observed and expected ordered *R*_{avg} values. The black line shows the 120 observed *R*_{avg} values, sorted with the first-place bin on the right. The light gray line and vertical error bars show the mean ± 2 SD of the *j*th-place bin **...**

*P*_{AvgRnk} is the pointwise probability of observing a given *R*_{avg} for a bin by chance, in a GSMA of *N* studies (third table in appendix A). This can be determined theoretically (see appendix B) or by permutation: for a GSMA of *N* studies, for each permuted replicate, the observed 120 *R*_{study} values for each study are randomly reassigned to bins and then averaged across studies for each bin. At least 5,000 replicates are produced, or a larger number to produce stable estimates of small or marginal *P* values. If the observed *R*_{avg} is 38.9, then *P*_{AvgRnk} is the proportion of bins in the random replicates with *R*_{avg}≤38.9 (e.g., among 120 bins × 5,000 replicates).
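A minimal sketch of the permutation estimate of *P*_{AvgRnk} is given below. It is not the authors' program: the function name, the toy input, and the reduced replicate count are assumptions for illustration, and the (*r*+1)/(*n*+1) form follows the convention described later in this section.

```python
import random


def p_avgrnk(study_ranks, observed_ravg, n_reps=5000, seed=0):
    """Pointwise permutation P value for one observed R_avg.

    study_ranks: one list of within-study ranks (R_study) per study.
    Counts permuted bins whose R_avg is <= the observed value.
    """
    rng = random.Random(seed)
    n_studies = len(study_ranks)
    n_bins = len(study_ranks[0])
    work = [list(r) for r in study_ranks]     # copies, shuffled in place
    hits = 0
    for _ in range(n_reps):
        totals = [0.0] * n_bins
        for ranks in work:
            rng.shuffle(ranks)                # reassign this study's ranks to bins
            for b, r in enumerate(ranks):
                totals[b] += r
        hits += sum(t / n_studies <= observed_ravg for t in totals)
    return (hits + 1) / (n_reps * n_bins + 1)


# Toy input (hypothetical): 20 studies, each a random ordering of ranks 1..120.
toy_rng = random.Random(1)
toy_studies = [toy_rng.sample(range(1, 121), 120) for _ in range(20)]
print(p_avgrnk(toy_studies, 38.9, n_reps=500))
```

The replicate count is kept small here so the example runs quickly; the text specifies at least 5,000 replicates for real analyses.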

*P*_{ord} (short for “*P*_{AvgRnk|order}”) is the probability of observing a given *R*_{avg} for a bin by chance in bins with the same “place” in the ascending order of *R*_{avg} values in randomly permuted replicates (table 1; fig. 1). Thus, in the permutation test described above, the *R*_{avg} values for each replicate are now sorted by size. If the observed fourth-best *R*_{avg}=31.0, *P*_{ord} is the probability of observing a value ≤31.0 in the 5,000 fourth-place bins in the replicates. Consider the analogy of a race: *P*_{AvgRnk} determines which bins are the “fastest runners,” whereas *P*_{ord} determines whether it is a “fast race” (i.e., whether the top finishers all ran faster than the top finishers in most races). As shown by the simulation studies reported below, it is this aggregate information about the set of most significant bins that provides additional information about linkage when genomewide significance is not observed for *P*_{AvgRnk}.
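The place-by-place comparison can be sketched in the same way. Again, this is an illustrative sketch under stated assumptions rather than the original implementation: the function name and toy data are hypothetical, bins are permuted individually, and the (*r*+1)/(*n*+1) form used in this article is applied.

```python
import random


def p_ord(study_ranks, n_reps=5000, seed=0):
    """Return (sorted observed R_avg, P_ord for each place in that order)."""
    rng = random.Random(seed)
    n_studies = len(study_ranks)
    n_bins = len(study_ranks[0])
    observed = sorted(sum(col) / n_studies for col in zip(*study_ranks))
    work = [list(r) for r in study_ranks]
    counts = [0] * n_bins
    for _ in range(n_reps):
        totals = [0.0] * n_bins
        for ranks in work:
            rng.shuffle(ranks)
            for b, r in enumerate(ranks):
                totals[b] += r
        permuted = sorted(t / n_studies for t in totals)
        # Compare the jth-place permuted value with the jth-place observed value.
        for j in range(n_bins):
            if permuted[j] <= observed[j]:
                counts[j] += 1
    return observed, [(c + 1) / (n_reps + 1) for c in counts]


# Toy input (hypothetical): 5 studies, 30 bins; bin 0 is ranked first in every study.
toy_rng = random.Random(2)
toy_studies = [[1] + toy_rng.sample(range(2, 31), 29) for _ in range(5)]
observed, p_values = p_ord(toy_studies, n_reps=200)
print(observed[0], p_values[0])
```

In this toy case the first-place bin has *R*_{avg}=1, which essentially never arises in permuted replicates, so its *P*_{ord} sits at the smallest value the replicate count allows.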

*P*_{AvgRnk} and *P*_{ord} have been computed as (*r*+1)/(*n*+1), where *r* is the number of replicates exceeding a particular score and *n* is the total number of replicates. As explained by North et al. (2002, p. 439), “if the null hypothesis is true, then the test statistics of the *n* replicates and the test statistic of the actual data are all realizations of the same random variable,” and thus the *P* value is more accurate when the data set being tested is included in the ranking of all known outcomes. Like *P*_{AvgRnk}, *P*_{ord} is a pointwise value: ~5% of bins will have *P*_{ord}≤.05 in unlinked replicates, although these values will be scattered throughout both large and small ranks. The 5% threshold for genomewide significance of each type of *P* value must therefore be corrected for 120 bins: .05/120=.000417. This is larger than the threshold of pointwise *P*=.00002 for linkage results (Lander and Kruglyak 1995), because, for GSMA, inference is restricted to discrete bins of 30 cM. Statistical properties of *P*_{ord}, including nonindependence of the values, are discussed further below.
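As a small numerical check of these two quantities, the snippet below evaluates the (*r*+1)/(*n*+1) estimate and the Bonferroni-corrected threshold. The names are hypothetical. It also shows why the replicate count matters: with 5,000 replicates for a single place, the smallest attainable value is 1/5,001, only about half the genomewide threshold.

```python
def empirical_p(r, n):
    """P value from r replicates exceeding the observed score, out of n replicates."""
    return (r + 1) / (n + 1)


N_BINS = 120
GENOMEWIDE_ALPHA = 0.05 / N_BINS          # Bonferroni correction for 120 bins

smallest_p = empirical_p(0, 5000)         # no replicate beats the observed value
print(GENOMEWIDE_ALPHA, smallest_p, smallest_p < GENOMEWIDE_ALPHA)
```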

The terminology described above is summarized in appendix C.

### Description of the Samples

The simulated GSMA data sets were based on three of the SCZ and BPD data sets described in the second and third papers in this series, respectively.

1. The simulated SCZ data sets (table 2) approximated the linkage information of the 20 genome scan analyses from 17 independent projects included in the SCZ GSMA, using narrow diagnoses (usually SCZ plus schizoaffective disorders). Simulated sample sizes were the reported number of ASPs in a corresponding study or an estimate based on a multicenter SCZ sample (Levinson et al. 2000), where (*N*[*genotyped cases*] − *N*[*informative pedigrees*]) ≈ 1.39 × (*N*[*independent ASPs*]). The proportion of genotyped parents was as reported or was estimated from study descriptions and was increased when many unaffected relatives were genotyped. Average marker spacing was 10 or 15 cM, similar to the corresponding study.
2. The simulated BPD data sets resembled sets of genome scans in the BPD GSMA. These studies considered various diagnostic combinations of bipolar-I, bipolar-II, schizoaffective disorder–bipolar type, recurrent major depression, and sometimes other diagnoses. Thus, the number of affected cases and informative families varied according to the model. The simulated data sets resembled those for the two narrowest models considered in the BPD GSMA: the BP-VN (very narrow) data set (table 3) includes nine simulated samples whose sizes resemble those of the nine BPD scans analyzed under a very narrow diagnostic model, considering only BPI and SAB as affected; the BP-N (narrow) data set (table 4) includes 14 simulated samples whose sizes resemble those of the 14 BPD scans analyzed under a narrow model, considering BPI, SAB, and BPII cases as affected. Numbers of ASPs were determined as for SCZ, with zero, one, or two genotyped parents each in ~33% of families, on the basis of the SCZ data (because pedigree structure details were sometimes unavailable). Average marker spacing was 10 cM.

### Simulation Procedure

Genotypes were created by simulation for chromosomes with either no linked locus, or with one linked disease locus under one of four dominant genetic models:

1. λ_{sibs}=1.15 [*q*(*D*)=0.434, *f*(*dd*)=0.01, *f*(*Dd*,*DD*)=0.1];
2. λ_{sibs}=1.2 [*q*(*D*)=0.0169, *f*(*dd*)=0.006, *f*(*Dd*,*DD*)=0.03];
3. λ_{sibs}=1.3 [*q*(*D*)=0.0173, *f*(*dd*)=0.005, *f*(*Dd*,*DD*)=0.03];
4. λ_{sibs}=1.4 [*q*(*D*)=0.0244, *f*(*dd*)=0.004, *f*(*Dd*,*DD*)=0.025];

where *D* indicates the disease allele, *d* the wild-type allele, *q* the allele frequency, and *f* the penetrance. This range of values was selected because samples of 500–1,000 ASPs have been shown to have reasonable power to detect linkage at values ~1.3–1.4, with power increasing rapidly above this range and decreasing rapidly below it (Hauser et al. 1996).

To confirm that power would be greater for recessive transmission, a single set of replicates was created for the SCZ data set as discussed below, using parameter values predicting λ_{sibs}=1.15 [*q*(*D*)=0.276, *f*(*dd*,*Dd*)=0.01, *f*(*DD*)=0.04].
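The λ_{sibs} values predicted by these parameter sets can be checked with a short calculation. The sketch below assumes the standard single-locus variance-components expression, λ_{sibs} = 1 + (*V*_A/2 + *V*_D/4)/*K*², where *K* is the population prevalence and *V*_A and *V*_D are the additive and dominance variances of the penetrances. The function name is hypothetical, and this is not the software used to generate the simulated data.

```python
def lambda_sibs(q, f_dd, f_Dd, f_DD):
    """Locus-specific lambda_sibs for a diallelic locus under Hardy-Weinberg.

    q: frequency of disease allele D; f_*: penetrance of each genotype.
    """
    p_dd, p_Dd, p_DD = (1 - q) ** 2, 2 * q * (1 - q), q ** 2
    prevalence = p_dd * f_dd + p_Dd * f_Dd + p_DD * f_DD
    # Average effect of substituting D for d.
    alpha = q * (f_DD - f_Dd) + (1 - q) * (f_Dd - f_dd)
    v_additive = 2 * q * (1 - q) * alpha ** 2
    v_dominance = (q * (1 - q) * (f_DD - 2 * f_Dd + f_dd)) ** 2
    return 1 + (v_additive / 2 + v_dominance / 4) / prevalence ** 2


# (target lambda_sibs, q(D), f(dd), f(Dd), f(DD)) as listed in the text.
models = [
    (1.15, 0.434, 0.01, 0.1, 0.1),        # dominant
    (1.2, 0.0169, 0.006, 0.03, 0.03),     # dominant
    (1.3, 0.0173, 0.005, 0.03, 0.03),     # dominant
    (1.4, 0.0244, 0.004, 0.025, 0.025),   # dominant
    (1.15, 0.276, 0.01, 0.01, 0.04),      # recessive comparison model
]
for target, q, f_dd, f_Dd, f_DD in models:
    print(target, round(lambda_sibs(q, f_dd, f_Dd, f_DD), 3))
```

Under this assumption, each parameter set reproduces its stated λ_{sibs} to within rounding.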

LINKAGE-format files were created with sets of families with two parents of unknown diagnosis and two affected offspring. Simulated genotypes were created using GENSIM (Kruglyak et al. 1996; Kruglyak and Daly 1998). The structure of the simulated chromosomes is shown in figure 2. Each chromosome was 180 cM long (six 30-cM bins). Bins contained three markers (10-cM spacing), except for four SCZ scans with 15-cM spacing. Disease loci were conservatively placed halfway between adjacent markers, given that many other aspects of the procedure were idealized (e.g., identical marker maps in all studies, no genotyping errors).


Four types of genomes were created (table 5), with 1, 5, or 10 linked loci in “edge” or “mid” bins. Lower multipoint NPL scores were anticipated near the edges of chromosomes because of reduced information content. The value of λ_{sibs} was constant for all disease loci in a simulated genome. Sets of “linked” chromosomes for 10,000 families were created for each of 24 combinations of four values of λ_{sibs}, two marker densities, and zero, one, or two parents genotyped. “Unlinked” sets of chromosomes for 100,000 families were created for each of six combinations of two marker densities and zero, one, or two parents genotyped. NPL analysis was performed for each family, and tables of NPL scores at each marker locus were stored for each family to create pools from which samples were then drawn.

For each GSMA replicate, a row of NPL scores for a single chromosome was selected randomly from the table with the appropriate λ_{sibs}, marker density, number of parents typed, and chromosome type (linked-edge, linked-mid, or unlinked), for each chromosome for each family in each individual scan in each data set, for each full model (λ_{sibs}, and numbers of linked/edge, linked/mid, and unlinked chromosomes) (table 5). For example, for one of the 100 “ed1” model (see footnote “a” of table 5 for an explanation of model abbreviations) SCZ GSMA replicates with λ_{sibs}=1.15, for sample 1, NPL scores for one linked/edge chromosome and 19 unlinked chromosomes were drawn from the appropriate pools for each of 135 ASP families with two typed parents, 115 with one typed parent, and 80 with no typed parents, and so on for the other 19 SCZ samples, totalling 1,625 pedigrees with 570, 506, and 547 families with two, one, and zero parents typed, respectively (table 2). Within each data set, the NPL scores within each scan were combined across families (), and the maximum NPL scores within the 120 bins for each sample were saved in a table, substituting 0 for negative scores. These scores were ranked across the 120 bins, averaging the ranks of sets of tied bins. The original analyses considered *R*_{study} values in descending order and sums of ranks. Here we present equivalent values in ascending order and average ranks as noted above. Simulations were performed with 100 replicates for each linked model and 1,000 replicates of unlinked data sets (no linkage present).

### Weighting Procedure

*R*_{study} values were weighted for some analyses. Alternative weights were tested on 100 SCZ replicates (ed1md4/1.15). The total NPL score was computed for each marker, extrapolating the values for bins containing two markers (*M*3=*M*2, and *M*2=[*M*1+*M*2]/2), and the best score was selected for each bin. Pearson’s correlation coefficient (*r* and *R*^{2}) was computed between each bin’s NPL score and either the unweighted or weighted *R*_{avg}, comparing three weights: (a) *N*(*pedigrees*), (b) , or (c) . The average *R*^{2} values were 0.79 (unweighted), 0.778 (a), 0.856 (b), and 0.856 (c). Because (b) would excessively downweight studies with large pedigrees, was adopted here. We used *N* from the actual corresponding study, because a GSMA will generally include samples with varying pedigree structures, so that the weight is a rough approximation of relative linkage information.

### Statistical Analyses

Values of *P*_{AvgRnk} were computed as described above (and see appendix A, table 1 and fig. 1). For weighted analyses, *P* values computed by permutation were compared with the actual probability of observing a given value or one more extreme in the 1,000 unlinked replicates, and results agreed quite closely. For example, for SCZ, the 5% threshold was 46.815 by the permutation procedure and 46.715 based on unlinked replicates.

For *P*_{ord}, a new procedure was developed. *P*_{ord} values are pointwise measures; that is, when numbers from 1 to 120 were placed randomly in each of 20 rows (similar to the grids of values for 120 bins for the 20 SCZ samples) and averaged, 4.69% of bins had *P*_{ord}<.05. However, 5.74% of bins met this criterion in 250 unlinked SCZ data sets. We noted that edge bins 1 and 6 had larger (worse) average ranks (61.68 vs. 59.91 for bins 2–5; *t*=8.47 for bin 1 vs. bin 4, 999 df, *P*<.00001) (shown in fig. 3) and slightly higher mean *P*_{ord} values (.51065 vs. .50245; *t*=2.53; *P*=.011 for bin 1 vs. bin 4, chromosome 1), as predicted from the lower linkage information content at the ends of maps. Thus, the smaller (better) *R*_{avg} of a middle bin would be evaluated against the distribution of middle + edge bins, lowering its *P*_{AvgRnk}. When rows were permuted by chromosome rather than by bin, mean *P*_{ord} values for edge versus middle bins were .506972 and .503461 (*t*=0.032 [not significant] for bin 1 vs. bin 4). Therefore, the analyses reported below used permutation by chromosome to compute *P*_{ord}. However, this method (adapted for chromosomes of nonuniform lengths) was not more conservative in the SCZ and BPD GSMAs in the articles that follow and was not used there. Presumably, this phenomenon is more apparent in idealized simulated data.
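One way to read "permutation by chromosome" is that each study's row of ranks is shuffled in whole-chromosome blocks, so that an edge bin always stays in an edge position. The sketch below illustrates that reading only. The function name is hypothetical, it assumes the idealized genome of 20 chromosomes with six bins each, and it should not be taken as the exact procedure used in the simulations.

```python
import random


def permute_by_chromosome(ranks, bins_per_chrom=6, rng=random):
    """Shuffle whole chromosomes, keeping the bin order within each chromosome."""
    blocks = [ranks[i:i + bins_per_chrom]
              for i in range(0, len(ranks), bins_per_chrom)]
    rng.shuffle(blocks)
    return [r for block in blocks for r in block]


# Toy row (hypothetical): ranks 1..120 laid out as 20 chromosomes x 6 bins.
row = list(range(1, 121))
shuffled = permute_by_chromosome(row, rng=random.Random(0))
print(shuffled[:12])
```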


Distributions of *P*_{AvgRnk} and *P*_{ord} were compared in replicates containing linked loci versus the unlinked replicates, to determine how false and true positive results could best be differentiated.

## Results

### Detection of Significant Linkage with NPL Scores versus GSMA

Table 6 shows the proportion of linked bins achieving genomewide suggestive or nominal significance levels for weighted *P*_{AvgRnk} or NPL scores, for SCZ data sets (λ_{sibs}=1.15 or 1.3). GSMA was as powerful as NPL and more powerful for weaker linkage, owing in part to the fact that GSMA considers 30-cM bins so that there is a more modest correction for multiple testing. Figure 4 shows mean numbers of bins achieving genomewide significance (*P*=.05/120=.0004167), using empirical thresholds from 1,000 unlinked replicates (120,000 bins) per data set, for weighted ranks. In contrast to table 6, where only the disease locus bins are considered and mid and edge bins are considered separately, in figure 4 the total number of genomewide significant values is shown, ~20% of which are in bins adjacent to disease bins. Power was excellent to detect at least one such value for SCZ and BP-N data sets if the populationwide locus-specific λ_{sibs} was at least 1.3. When λ_{sibs}=1.15, SCZ data sets had good power when 5 loci were linked, as did BP-N when 10 loci were linked. Power was poor for BP-VN.

*P*≤.0004167 (the threshold for genomewide significance = .05/120), for weighted analyses. Diamonds represent **...**

### Power to Detect Nominally Significant *P*_{AvgRnk}

Power to detect *P*_{AvgRnk} at the 95% and 99% thresholds is shown in figure 5*A–C*. Unweighted results are shown to demonstrate power in the absence of other assumptions, such as weighting schemes. For comparison, table 7 summarizes power for weighted analyses for selected models, typically ~3%–7% greater. The data are illustrated in figure 6. Power at the 95% threshold was high for SCZ data sets (20 samples; 1,625 ASPs); low for BP-VN data sets (nine studies; 501 ASPs), except at the high λ_{sibs}; and intermediate for BP-N (14 studies, 1,017 ASPs). A limitation of GSMA is also illustrated: for weaker genetic effects, as the number of linked loci increases, the power to detect each individual locus declines, but there can be considerable power to detect at least some of the linked loci. Note that analyses presented below consider only λ_{sibs} values of 1.15 and 1.3, which are sufficiently representative.

### Aggregate Significance Thresholds

One can also consider aggregate thresholds of significance: does the number or pattern of bins achieving a criterion exceed chance expectation? For *P*_{AvgRnk}, in the three sets of 1,000 unlinked GSMA replicates, only 5% of data sets had ≥11 bins with *P*_{AvgRnk}<.05. This can be considered as a criterion for concluding that linkage is present somewhere in the genome, although this does not determine which bins are the true positives.

*P*_{ord} is a type of order statistic analysis. The *P* values of sequential order statistics are not independent. Let *X*[*r*] be the *r*th order statistic for the set of average ranks (*R*_{avg}); that is, *X*[*r*] has the value of the *r*th-lowest average rank, and, specifically, the value *X*[1] is the lowest average rank. If *X*[*r*] has a very low *P*_{ord}, there is an increased probability that the next-most-extreme order statistic, *X*[*r*-1], will also be significant, since *X*[*r*-1]<*X*[*r*], by definition, and *X*[*r*-1] therefore has a truncated distribution. The dependency can be determined theoretically by using the distribution function for summed ranks (Wise et al. 1999; appendix B), but it is difficult to compute theoretically the more general probability of observing a cluster of *N* significant *P*_{ord} values. Empirically, we found that only 5% of unlinked replicates had four or more *P*_{ord} values <.05 out of the 10 lowest *R*_{avg} values. This is a second empirical threshold for determining, in aggregate, whether there is evidence of linkage in a GSMA.

### The Relationship between *P*_{ord} and *P*_{AvgRnk} Values

The most striking difference between replicates with and without linked chromosomes was the pattern of *P*_{ord} and *P*_{AvgRnk} values. Table 7 shows the number of bins with *P*_{AvgRnk}<.05, *P*_{ord}<.05, or both, for selected data sets. These data are shown graphically in figure 6, for λ_{sibs}=1.15 data sets, to illustrate the following findings:

- 1. As the number of linked bins and the genetic effect and power increase, more bins have both *P*_{AvgRnk} and *P*_{ord} values <.05, versus a mean of ~0.55 bins with these values in unlinked data sets. This is true even when λ_{sibs}=1.15, where linkage is difficult to detect (Hauser et al. 1996); for example, for SCZ data sets, with 5–10 disease loci, *P*_{AvgRnk} and *P*_{ord} are <.05 for many disease and adjacent bins.
- 2. As power and number of linked loci increase, fewer bins have *only* *P*_{AvgRnk}<.05 but *not* *P*_{ord}<.05. The number of bins with only *P*_{ord}<.05 is relatively constant, except that the number increases for adjacent bins when there are more linked chromosomes, an observation that may be relevant to interpreting the data for the SCZ GSMA in the next article in this series (Lewis et al. 2003 [in this issue]).
- 3. When multiple bins have both *P*_{AvgRnk} and *P*_{ord}<.05, many or most of them contain linked loci, even when genetic effects are too weak to produce a significant *P*_{AvgRnk} for each bin. For example, for the SCZ replicates with 10 linked loci (λ_{sibs}=1.15; ed2md8), an average of almost 11 linked and adjacent bins had *P*_{AvgRnk} and *P*_{ord}<.05, compared with slightly more than 1 unlinked bin (table 7); that is, >90% were linked or adjacent. For unlinked data sets, <5% of GSMA replicates contain four or more such bins for SCZ or BP-N, or five or more bins for the smaller BP-VN data sets. Therefore, conservatively, observing five or more bins with *P*_{AvgRnk} and *P*_{ord}<.05 is a genomewide criterion that suggests linkage in some or all of these bins.
- 4. By contrast, when there is only a single linked locus in the genome, only the magnitude and not the pattern of *P*_{AvgRnk} and *P*_{ord} values can identify linkage.

### Combined *P* Values

Finally, we considered the possibility of a test for the significance of a pair of *P*_{AvgRnk} and *P*_{ord} values (*P*_{comb}). These values are not independent and cannot be combined theoretically. One can determine, from unlinked replicates, the frequency of values more extreme than a given pair; however, if this is done for several pairs, the type I error rate becomes inflated rapidly. For example, if one restricts values to [*P*_{AvgRnk}<.05 and *P*_{ord}<.05], then, for each *P*_{AvgRnk}, there is a maximum *P*_{ord} such that *P*_{comb}<.000417. If one plots all such pairs on a log scale, however, one defines a triangle within which lies 0.001 of all observed pairs of values in unlinked SCZ data sets. There is an infinite number of ways to further restrict this space to reduce type I error, so any one choice would be arbitrary. In practice, a criterion of [*P*_{AvgRnk}<.05 and *P*_{ord}<.05] selects 0.005 of bins in unlinked replicates, and this criterion performs adequately in detecting linkage. For example, in 100 replicates of SCZ ed2md8 data sets (λ_{sibs}=1.15), 65% of disease bins, 23.4% of adjacent bins, 2.2% of other bins on linked chromosomes, and 1% of bins on unlinked chromosomes had *P*_{AvgRnk} and *P*_{ord}<.05; the proportions of each of these types of bins within the *P*_{comb} triangle described above were 58.9%, 38.3%, 2%, and 0.9%. Thus, pending further study, *P*_{comb} does not define significance for individual bins. When multiple bins are associated with *P*_{AvgRnk} and *P*_{ord}<.05, these bins are the most likely to contain linked loci.

### Recessive Transmission

To confirm that, for simple ASP families, power would be greater than reported here for recessive transmission models, 100 SCZ replicates were simulated under the recessive model described above, predicting λ_{sibs}=1.15. Each genome contained 5 linked chromosomes (to simplify the simulation, all were linked/edge) and 15 unlinked chromosomes. The power to detect a given linked bin was 0.94 for *P*<.05, 0.79 for *P*<.0083, and 0.39 for *P*<.000417, considerably greater than that observed for dominant transmission and λ_{sibs}=1.15 (table 6). The mean number of bins with *P*_{AvgRnk} and *P*_{ord}<.05 was 9.48, compared with <6 for dominant transmission (fig. 6).

## Discussion

A rank-based method, GSMA, can have considerable power to detect genetic linkage for the models studied here. Power has been studied for *P*_{AvgRnk}, the probability of observing a bin’s average rank by chance; *P*_{ord}, the probability of observing the *j*th-place bin’s average rank in *j*th-place bins in randomly permuted data; and the co-occurrence of nominally significant values for both measures. For λ_{sibs}≥1.3, even the smallest data set (BP-VN) had good/excellent power to detect at least nominally significant *P*_{AvgRnk}. Genomewide significance can be identified in larger data sets by a *P*_{AvgRnk} corrected for multiple testing (*P*<.000417). GSMA had power comparable to or greater than that of the NPL scores from which the ranked data were derived, although with greatly reduced localization of the linkage signal. Where populationwide genetic effects are weak, the aggregate criteria presented above and summarized in appendix D can still provide significant evidence that one or more bins are likely to contain linked loci. The individual bins most likely to be linked are those with nominally significant values of both *P*_{AvgRnk} and *P*_{ord}. These criteria make it possible to identify a set of likely linkages even when none of them achieves genomewide significance individually. In the second article in this series (Lewis et al. 2003 [in this issue]), these criteria define a set of bins likely to contain loci linked to schizophrenia.

The present analyses have many limitations, including the following:

- 1. The simulations are idealized, with uniform markers and marker spacing across studies and no genotyping errors. Disease loci were therefore placed halfway between flanking markers, which reduces power, but actual power could be less than was observed here.
- 2. We did not study the effects of variable marker density within scans. In the actual SCZ and BPD GSMAs, we sought genome scan data for markers at reasonably uniform density, prior to fine mapping of candidate regions; however, for some studies, the only available analyses included additional markers either in the most positive regions or in candidate regions suggested by earlier studies. Bins with more typed markers will achieve larger average maximum linkage scores, and this bias could be compounded if investigators chose to type more markers when they observed slightly positive scores in previously reported candidate regions; this issue is discussed further in the second article in this series (Lewis et al. 2003 [in this issue]). On the other hand, small improvements in a bin’s within-study rank in a few samples would have little effect on the overall average rank across studies.
- 3. We did not study the effect of our decision in the SCZ and BPD GSMAs to select a maximum linkage score for a bin across all analyses, if multiple linkage analyses had been carried out. GSMA results might vary with different approaches.
- 4. We did not simulate data sets in which λ_{sibs} varied across individual studies. Simulated NPL scores varied considerably across samples within each data set, especially for smaller samples, but heterogeneity across samples was not formally studied. GSMA would seem inappropriate for detecting linkage that exists in only a small proportion of studies. If differences were hypothesized between identifiable subsets of studies (clinical subtype, pedigree structure, ethnic background, etc.), it would be preferable to analyze the subsets separately. Most of the SCZ and BPD samples were from large, outbred, predominantly European populations. It seems likely that GSMA of these samples would have the greatest power to detect genetic effects that are present in many of the samples. But we have not studied power in the case of substantial between-study heterogeneity.
- 5. Other issues not considered here include the power to detect linkage when λ_{sibs} varies across loci in the same genome, the effect of two or more disease loci in the same bin or on the same chromosome, recessive transmission models, and X- or Y-chromosome linkage.

There may be advantages to other approaches to meta-analysis. The best approach would presumably be to perform new analyses using genotypes from each study. For example, several schizophrenia candidate regions have been studied by large multicenter collaborations that genotyped a common set of markers and combined the samples for analysis (Gill et al. 1996; SLCG 1996; Levinson et al. 2000). However, these studies included only a subset of collaborating SCZ linkage samples and did not scan the genome. Dorr et al. (1997) have suggested a logistic regression approach for analyzing allele sharing in multiple samples while taking possible heterogeneity across samples into account. This method was applied by Levinson et al. (2000), but applying it to all samples would require access to raw genotypes from all studies, which was not possible here.

Badner and Gershon (2002*b*) developed the MSP method for meta-analysis of candidate regions and applied it to SCZ and BPD (Badner and Gershon 2002*a*). MSP identifies clusters of positive values from the actual data and assesses significance using *P* values corrected for the size of the region. Similarities and differences in the GSMA and MSP analyses of SCZ and BPD are discussed in the following two articles (Lewis et al. 2003 [in this issue]; Segurado et al. 2003 [in this issue]). We would note here that it could be problematic to combine the lowest *P* values from genome scans, which, particularly for smaller scans, can be severely upwardly biased (Göring et al. 2001). However, the relative power of these methods is not yet clear. For example, perhaps by giving full weight to very low *P* values, MSP could better detect linkage in the presence of substantial heterogeneity across samples. GSMA might be more powerful when small genetic effects were present in all samples.

In conclusion, simulation studies suggest that GSMA can identify regions of the genome that are likely to contain linked loci, if the size of the data set is appropriate to the magnitude of the genetic effect. While the method has excellent power under some circumstances to detect genomewide significant linkage, GSMA could prove most useful when there are many weakly linked loci. In these cases, there may be more nominally significant bins than expected by chance, including an excess of bins with nominally significant *P* values both for the average rank and for the average rank given the order of ranks. No method can exclude linkage in any chromosomal region for a complex disorder, and GSMA data should not be interpreted in this way; however, where direct computation of linkage scores from raw genotypes is not feasible, GSMA provides a useful first step toward evaluating whether data from multiple genome scans provide evidence for linkage in specific chromosomal regions.

## Acknowledgments

The late Dr. Lodewijk Sandkuijl originally suggested designing a power analysis of GSMA. His thoughtful advice on this and many other issues will be greatly missed. Dr. Kenneth Kendler and Mr. Michael Levinson provided helpful suggestions about communicating these ideas more clearly. Dr. Sevilla Detera-Wadleigh’s role in organizing the bipolar meta-analysis provided the impetus to include those samples in the simulation study. This work was supported by National Institute of Mental Health grants MH61602 and K24-MH64197 (to D.F.L.).

## Appendix A: Summary of GSMA Ranking Procedure: *R*_{study}, *R*_{avg}, and *P*_{AvgRnk}

First, select the *most significant linkage score* in each bin for each study (0 if negative):

Maximum linkage score in bin:

| Study | 1.1 | 1.2 | 1.3 | 1.4 | … | 22.2 |
|---|---|---|---|---|---|---|
| 1 | .83 | .41 | 1.90 | 3.19 | … | .32 |
| 2 | 0 | 0 | 0 | .22 | … | .89 |
| … | … | … | … | … | … | … |
| 9 | .37 | .44 | .78 | 1.44 | … | .66 |

Within each study, rank each score (*R*_{study}) from 1 (“best”) to 120 (“worst”), assigning tied scores the mean of the ranks they span (e.g., for study 2, 38 bins, including bins 1.1, 1.2, and 1.3, had scores of 0 and shared ranks 83–120, leading to a rank of 101.5 for each).

Within-study rank (*R*_{study}) for bin:

| Study | 1.1 | 1.2 | 1.3 | 1.4 | … | 22.2 |
|---|---|---|---|---|---|---|
| 1 | 45 | 69 | 9 | 1 | … | 92 |
| 2 | 101.5 | 101.5 | 101.5 | 87 | … | 33 |
| … | … | … | … | … | … | … |
| 9 | 93 | 87.5 | 59 | 9 | … | 58 |
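The within-study ranking step, including the handling of ties, can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up scores; it assumes only what is stated above (negative scores are set to 0, and tied scores share the mean of the ranks they occupy).

```python
def rank_with_ties(scores):
    """Rank scores from 1 (highest score) downward. Negative scores are
    set to 0, and tied scores share the mean of the ranks they occupy."""
    clipped = [max(s, 0.0) for s in scores]
    order = sorted(range(len(clipped)), key=lambda i: -clipped[i])
    ranks = [0.0] * len(clipped)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one
        while end + 1 < len(order) and clipped[order[end + 1]] == clipped[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks

# Made-up study: 82 distinct positive scores, then 38 zero or negative scores
scores = [0.01 * k for k in range(1, 83)] + [0.0] * 30 + [-0.5] * 8
ranks = rank_with_ties(scores)
print(ranks[81], ranks[-1])
```

With 38 bins tied at 0, the tied bins occupy ranks 83 through 120 and each receives (83+120)/2 = 101.5, as in the study 2 row.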

Then compute the average rank (*R*_{avg}) for each bin across studies. Each *R*_{study} can be multiplied by a weighting factor before averaging (e.g., a standardized weight based on sample size). Compute pointwise *P*_{AvgRnk} (probability of the average rank) for each bin (theoretically, or empirically if weighted) to answer the question “By chance, how frequently would any bin have *R*_{avg} this low or lower?”

Weighted within-study rank for bin:

| Study | Weight | 1.1 | 1.2 | 1.3 | 1.4 | … | 22.2 |
|---|---|---|---|---|---|---|---|
| 1 | 2.1 | 94.5 | 144.9 | 18.9 | 2.1 | … | 193.2 |
| 2 | .9 | 91.35 | 91.35 | 91.35 | 78.3 | … | 29.7 |
| … | … | … | … | … | … | … | … |
| 9 | .5 | 46.5 | 43.75 | 29.5 | 4.5 | … | 29 |
| *R*_{avg} | | 82.6 | 88.9 | 46.3 | 26.1 | … | 50.1 |
| Empirical *P*_{AvgRnk} | | .9630 | .9916 | .1350 | .0025 | … | .2102 |
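The weighting and permutation steps can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: weights are taken as given and assumed to be standardized so that dividing the weighted sum by the number of studies gives *R*_{avg}, the null is generated by shuffling ranks across bins within each study, and both function names are hypothetical. The toy input uses only the three study rows shown above, so its output differs from the *R*_{avg} row, which reflects all nine studies.

```python
import random
from bisect import bisect_right

def weighted_ravg(ranks_by_study, weights):
    """Average weighted rank per bin. ranks_by_study[s][b] is the rank of
    bin b in study s; weights[s] is that study's weight."""
    n_studies = len(ranks_by_study)
    n_bins = len(ranks_by_study[0])
    return [
        sum(w * study[b] for study, w in zip(ranks_by_study, weights)) / n_studies
        for b in range(n_bins)
    ]

def empirical_p_avgrnk(ranks_by_study, weights, n_perm=1000, seed=0):
    """Pointwise P per bin: the share of permuted bin values whose weighted
    average rank is as low as or lower than the observed value."""
    rng = random.Random(seed)
    observed = weighted_ravg(ranks_by_study, weights)
    null_values = []
    for _ in range(n_perm):
        # Shuffle each study's ranks across bins independently
        shuffled = [rng.sample(study, len(study)) for study in ranks_by_study]
        null_values.extend(weighted_ravg(shuffled, weights))
    null_values.sort()
    return [bisect_right(null_values, r) / len(null_values) for r in observed]

# Ranks for bins 1.1-1.4 from studies 1, 2, and 9, with their weights
ranks = [[45, 69, 9, 1], [101.5, 101.5, 101.5, 87], [93, 87.5, 59, 9]]
weights = [2.1, 0.9, 0.5]
print(weighted_ravg(ranks, weights))
```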

Thresholds of significance for *R*_{avg} are shown below for GSMAs with different numbers of studies (computed theoretically for unweighted analyses; empirical weighted thresholds will vary slightly).

*R*_{avg} threshold for the given value of *P*_{AvgRnk}:

| No. of Studies | .05 | .01 | .001 |
|---|---|---|---|
| 9 | 41.44 | 34.0 | 26.22 |
| 14 | 45.29 | 39.14 | 32.57 |
| 20 | 47.75 | 42.6 | 36.95 |

## Appendix B: Summed Rank Distribution Function

The GSMA procedure is based on an understanding of the distribution of summed ranks across multiple sets of ranked data, with within-study bins ranked in *descending* order (Wise et al. 1999). Under the null hypothesis of no linkage in a bin, the ranks will be randomly assigned from each study. The distribution function is

$$P\left(\sum_{i=1}^{m} X_i = R\right) = \frac{1}{n^{m}} \sum_{k=0}^{d} (-1)^{k} \binom{m}{k} \binom{R-kn-1}{m-1},$$

where *R* is summed rank, *X*_{i} is the rank of study *i*, *m* is the number of studies, *n* is the number of bins (120), and *d* is the integer part of (*R*-*m*)/*n*. For unweighted analyses, pointwise *P* values for *R* can be determined directly from this distribution, although, for weighted analyses, a permutation procedure is required, as described above.

It is mathematically equivalent to rank bins in ascending order and to average rather than sum the ranks across studies. For any *R*, the equivalent average ascending rank *R*_{avg} is (*n*+1)-(*R*/*m*). Results have been expressed this way in the current article to provide a more intuitive terminology.
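A short sketch of how pointwise *P* values could be computed for unweighted analyses is given below. It uses the standard inclusion-exclusion expression for the sum of *m* independent ranks, each uniform on 1..*n*, with *d* defined as in this appendix; treating that expression as the distribution function referred to here is an assumption, ties are ignored, and the function names are hypothetical. Because the null distribution is symmetric, the same probabilities apply to ascending or descending summed ranks.

```python
from fractions import Fraction
from math import comb

def summed_rank_pmf(R, m, n=120):
    """P(sum of m independent ranks, each uniform on 1..n, equals R)."""
    if R < m or R > m * n:
        return Fraction(0)
    d = (R - m) // n  # integer part of (R - m) / n
    ways = sum((-1) ** k * comb(m, k) * comb(R - k * n - 1, m - 1)
               for k in range(d + 1))
    return Fraction(ways, n ** m)

def lower_tail(s, m, n=120):
    """P(summed ascending rank <= s), i.e. the chance that a bin's average
    rank is s / m or lower when ranks are assigned at random."""
    return float(sum(summed_rank_pmf(r, m, n) for r in range(m, s + 1)))

# A summed rank of 373 over 9 studies is an average rank of about 41.44
print(lower_tail(373, 9))
```

If the tabulated thresholds in appendix A are reproduced, the printed value should be close to .05.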

## Appendix C: Summary of Terminology

- Bin: One of 120 30-cM autosomal segments used as units of analysis in GSMA; bin 2.1 is the first 30 cM of chromosome 2.
- *R*_{study} (within-study rank): The rank of each bin within a single study, based on the maximum linkage score (or lowest *P* value) within it. The bin containing the best score has a rank of 1. All negative and 0 scores are considered to be tied. For weighted analyses, each raw rank is multiplied by the study’s weighting factor.
- *R*_{avg} (average rank): The average of a bin’s within-study ranks or weighted ranks across all studies.
- *P*_{AvgRnk} (probability of *R*_{avg}): The pointwise probability of observing a given *R*_{avg} for a bin in a GSMA of *N* studies, determined by theoretical distribution (unweighted analysis only) or by permutation test (fig. 2).
- *P*_{ord} (probability of *R*_{avg} given the order): The pointwise probability that, for example, a 1st-place, 2nd-place, 3rd-place, etc., bin would achieve *R*_{avg} at least this extreme in a GSMA of *N* studies.
- Genomewide significance: For α=0.05, correction for 120 bins yields a threshold for genomewide significance of .000417 for *P*_{AvgRnk} or *P*_{ord}. For suggestive linkage (a result observed once per scan by chance), α=1/120=0.0083.

## Appendix D: Criteria for Genomewide Significance

For individual bins, the criterion for genomewide significance is *P*_{AvgRnk}<.000417. The aggregate criteria for concluding that linkage is likely to be present in one or more bins are as follows: 11 or more bins with *P*_{AvgRnk}<.05, 4 or more bins with *P*_{ord}<.05 among the 10 best values of *R*_{avg}, or 5 or more bins with both *P*_{AvgRnk}<.05 and *P*_{ord}<.05. Bins with both *P*_{AvgRnk}<.05 and *P*_{ord}<.05 are most likely to contain linked loci. No valid combined significance criterion was identified.
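The aggregate criteria can be applied mechanically once per-bin values are available. The sketch below is a hypothetical helper, not part of the published analysis; it uses the counts of nominally significant bins (*P*<.05) given under Aggregate Significance Thresholds in the main text, and the dictionary keys are illustrative names.

```python
def aggregate_evidence(p_avgrnk, p_ord, r_avg):
    """Apply the three aggregate criteria to per-bin results.
    All three lists are indexed by bin."""
    n_bins = len(r_avg)
    # Criterion 1: number of bins with nominally significant P_AvgRnk
    n_avg = sum(p < 0.05 for p in p_avgrnk)
    # Criterion 2: nominally significant P_ord among the 10 best average ranks
    best10 = sorted(range(n_bins), key=lambda b: r_avg[b])[:10]
    n_ord_top10 = sum(p_ord[b] < 0.05 for b in best10)
    # Criterion 3: bins nominally significant on both measures
    both = [b for b in range(n_bins) if p_avgrnk[b] < 0.05 and p_ord[b] < 0.05]
    return {
        "avgrnk_criterion": n_avg >= 11,
        "ord_criterion": n_ord_top10 >= 4,
        "both_criterion": len(both) >= 5,
        "candidate_bins": both,
    }
```

The `candidate_bins` entry lists the bins that meet both nominal thresholds, which the text identifies as the most likely to contain linked loci.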

## References

Badner JA, Gershon ES (2002*a*) Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol Psychiatry 7:405–411

Badner JA, Gershon ES (2002*b*) Regional meta-analysis of published data supports linkage of autism with markers on chromosome 7. Mol Psychiatry 7:56–66

*P* values from Monte Carlo procedures. Am J Hum Genet 71:439–441

**American Society of Human Genetics**

## Formats:

- Article |
- PubReader |
- ePub (beta) |
- PDF (266K)

- Genome scan meta-analysis of schizophrenia and bipolar disorder, part II: Schizophrenia.[Am J Hum Genet. 2003]
*Lewis CM, Levinson DF, Wise LH, DeLisi LE, Straub RE, Hovatta I, Williams NM, Schwab SG, Pulver AE, Faraone SV, et al.**Am J Hum Genet. 2003 Jul; 73(1):34-48. Epub 2003 Jun 11.* - Genome scan meta-analysis of schizophrenia and bipolar disorder, part III: Bipolar disorder.[Am J Hum Genet. 2003]
*Segurado R, Detera-Wadleigh SD, Levinson DF, Lewis CM, Gill M, Nurnberger JI Jr, Craddock N, DePaulo JR, Baron M, Gershon ES, et al.**Am J Hum Genet. 2003 Jul; 73(1):49-62. Epub 2003 Jun 11.* - Identifying genomic regions for fine-mapping using genome scan meta-analysis (GSMA) to identify the minimum regions of maximum significance (MRMS) across populations.[BMC Genet. 2005]
*Cooper ME, Goldstein TH, Maher BS, Marazita ML.**BMC Genet. 2005 Dec 30; 6 Suppl 1:S42. Epub 2005 Dec 30.* - Two-dimensional genome scan identifies multiple genetic interactions in bipolar affective disorder.[Biol Psychiatry. 2010]
*Fullerton JM, Donald JA, Mitchell PB, Schofield PR.**Biol Psychiatry. 2010 Mar 1; 67(5):478-86. Epub 2009 Dec 22.* - How and why genetic linkage has not solved the problem of psychosis: review and hypothesis.[Am J Psychiatry. 2007]
*Crow TJ.**Am J Psychiatry. 2007 Jan; 164(1):13-21.*

- Meta-Analysis of Genome-Wide Scans Provides Evidence for Sex- and Site-Specific Regulation of Bone Mass[Journal of bone and mineral research : the ...]
*Ioannidis JP, Ng MY, Sham PC, Zintzaras E, Lewis CM, Deng HW, Econs MJ, Karasik D, Devoto M, Kammerer CM, Spector T, Andrew T, Cupples LA, Duncan EL, Foroud T, Kiel DP, Koller D, Langdahl B, Mitchell BD, Peacock M, Recker R, Shen H, Sol-Church K, Spotila LD, Uitterlinden AG, Wilson SG, Kung AW, Ralston SH.**Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research. 2007 Feb; 22(2)173-183* - Meta-Analysis of Repository Data: Impact of Data Regularization on NIMH Schizophrenia Linkage Results[PLoS ONE. ]
*Walters KA, Huang Y, Azaro M, Tobin K, Lehner T, Brzustowicz LM, Vieland VJ.**PLoS ONE. 9(1)e84696* - Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions[European Journal of Human Genetics. 2013]
*Amin N, Hottenga JJ, Hansell NK, Janssens AC, de Moor MH, Madden PA, Zorkoltseva IV, Penninx BW, Terracciano A, Uda M, Tanaka T, Esko T, Realo A, Ferrucci L, Luciano M, Davies G, Metspalu A, Abecasis GR, Deary IJ, Raikkonen K, Bierut LJ, Costa PT, Saviouk V, Zhu G, Kirichenko AV, Isaacs A, Aulchenko YS, Willemsen G, Heath AC, Pergadia ML, Medland SE, Axenovich TI, de Geus E, Montgomery GW, Wright MJ, Oostra BA, Martin NG, Boomsma DI, van Duijn CM.**European Journal of Human Genetics. 2013 Aug; 21(8)876-882* - Identification of shared genetic susceptibility locus for coronary artery disease, type 2 diabetes and obesity: a meta-analysis of genome-wide studies[Cardiovascular Diabetology. ]
*Wu C, Gong Y, Yuan J, Gong H, Zou Y, Ge J.**Cardiovascular Diabetology. 1168* - Meta-analyses of genome-wide linkage scans of anxiety-related phenotypes[European Journal of Human Genetics. 2012]
*Webb BT, Guo AY, Maher BS, Zhao Z, van den Oord EJ, Kendler KS, Riley BP, Gillespie NA, Prescott CA, Middeldorp CM, Willemsen G, de Geus EJ, Hottenga JJ, Boomsma DI, Slagboom EP, Wray NR, Montgomery GW, Martin NG, Wright MJ, Heath AC, Madden PA, Gelernter J, Knowles JA, Hamilton SP, Weissman MM, Fyer AJ, Huezo-Diaz P, McGuffin P, Farmer A, Craig IW, Lewis C, Sham P, Crowe RR, Flint J, Hettema JM.**European Journal of Human Genetics. 2012 Oct; 20(10)1078-1084*

Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis. American Journal of Human Genetics. Jul 2003; 73(1):17.
