- NIHPA Author Manuscripts
- PMC2758777

# Sample size requirements to detect an intervention by time interaction in longitudinal cluster randomized clinical trials

^{1}Division of Biostatistics Department of Epidemiology and Population Health Albert Einstein College of Medicine Bronx, NY, USA

^{2}Department of Psychiatry Weill Medical College of Cornell University New York, NY, USA

^{3}Department of Public Health Weill Medical College of Cornell University New York, NY, USA

## Abstract

In designing a longitudinal cluster randomized clinical trial (cluster-RCT), the interventions are randomly assigned to clusters such as clinics. Subjects within the same clinic will receive the identical intervention. Each will be assessed repeatedly over the course of the study. A mixed-effects linear regression model can be applied in a cluster-RCT with three level data to test the hypothesis that the intervention groups differ in the course of outcome over time. Using a test statistic based on maximum likelihood estimates, we derived closed form formulae for statistical power to detect the intervention by time interaction and the sample size requirements for each level. Importantly, the sample size does not depend on correlations among second level data units and the statistical power function depends on the number of second and third level data units through their product. A simulation study confirmed that theoretical power estimates based on the derived formulae are nearly identical to empirical estimates.

**Keywords:** *longitudinal cluster RCT*, *three level data*, *power*, *sample size*, *intervention by time interaction*, *effect size*

## 1. Introduction

A longitudinal cluster randomized trial (cluster-RCT) assumes a three level data structure in that the time-specific outcome assessments are nested within subjects who, in turn, are nested within the randomized clusters. For instance, consider a study designed to test the effect of an experimental intervention of physician training on the reduction of severity of patients' symptoms of depression over time. In this design, primary care clinics are randomly assigned to either the experimental or the control intervention, and each physician within an experimental clinic is trained to detect and treat depression. Each physician will treat multiple subjects who, in turn, are repeatedly measured on severity of depression symptoms over time.

The primary hypothesis in such a study would focus on the difference in declines of symptom severity over time between subjects who were treated by physicians with and without the experimental intervention. With the three level data of a longitudinal cluster-RCT, the significance of the intervention by time interaction can be tested using a mixed-effects linear regression model [1-3].

Sample size determination and power calculations are essential in designing a cluster-RCT. The number of clusters that is required for a target statistical power must be estimated at the experimental design stage. To this end, we build on sample size formulae for two level data structures [4-6] to derive explicit closed form power functions and sample size formulae for detecting a hypothesized interaction effect. The derivations are based on the distribution of a test statistic that uses the maximum likelihood estimate of the interaction effect. A simulation study followed to verify the statistical power achieved with the estimated sample sizes.

## 2. Statistical Model

A three level mixed-effects linear model for outcome *Y* can be written as follows:

$$Y_{ijk} = \beta_0 + \xi X_{ijk} + \tau T_{ijk} + \delta X_{ijk} T_{ijk} + u_i + u_{j(i)} + e_{ijk} \qquad (1)$$

where *i* = 1, 2, …, 2*N*_{3} is the index for the level three unit (e.g., clinic); *j* = 1, …, *N*_{2} is the index for the level two unit (e.g., subject) nested within each *i*; and *k* = 1, 2, …, *N*_{1} is the index for the level one unit (e.g., repeated outcome observations) within each *j*. The intervention assignment indicator variable *X*_{ijk} = 0 if the *i*-th level three unit is assigned to a control intervention and *X*_{ijk} = 1 if assigned to an experimental intervention; therefore *X*_{ijk} = *X*_{i} for all *j* and *k*. Furthermore, a balanced design is assumed in that Σ_{i} *X*_{i} = *N*_{3}. The time variable is denoted by *T*_{ijk}. In this study, it is assumed that *T*_{ijk} = *T*_{k} for all *i* and *j*, and that time increases from 0 (the baseline) to *T*_{end} = *N*_{1} - 1 (the last time point) in steps of 1 with equal time intervals. Therefore, the parameter ξ represents the intervention effect at baseline, and the parameter τ represents the slope of the time effect, that is, the decline in symptom severity over time. Finally, the intervention by time effect δ is of primary interest, representing the slope difference in outcome *Y* between the intervention groups, or the additional decline in the experimental group. The overall fixed intercept is denoted by β_{0}.

It is assumed that the error term *e*_{ijk} is normally distributed as $N(0,{\sigma}_{e}^{2})$, the level two random intercept ${u}_{j\left(i\right)}\sim N(0,{\sigma}_{2}^{2})$ and the level three random intercept ${u}_{i}\sim N(0,{\sigma}_{3}^{2})$. Among those random components, it is further assumed that ${u}_{i}\perp {u}_{j\left(i\right)}\perp {e}_{ijk}$, i.e., these three random components are mutually independent. In addition, *conditional independence* is assumed for all *u*_{j(i)} and for all *e*_{ijk}, whereas the *u*_{i} are *unconditionally* independent. That is, *u*_{j(i)} are independent conditional on *u*_{i}, and *e*_{ijk} are independent conditional on both *u*_{i} and *u*_{j(i)}. In summary, β_{0}, ξ, τ and δ are fixed effect parameters and the last three terms in model (1) are random effects.

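As the parameter δ is of primary interest, the null hypothesis to be tested is:

$$H_0: \delta = 0 \qquad (2)$$

Under model (1), with its accompanying assumptions such as conditional independence among random components, it can be shown that the elements of the mean vector are

$$E(Y_{ijk}) = \beta_0 + \xi X_i + \tau T_k + \delta X_i T_k \qquad (3)$$

and that the elements of the covariance matrix are:

$$\mathit{Cov}(Y_{ijk}, Y_{i'j'k'}) = 1(i=i')\,\sigma_3^2 + 1(i=i')\,1(j=j')\,\sigma_2^2 + 1(i=i')\,1(j=j')\,1(k=k')\,\sigma_e^2 \qquad (4)$$

where 1(.) is an indicator function. This yields in particular,

$$\mathit{Var}(Y_{ijk}) = \sigma^2 = \sigma_e^2 + \sigma_2^2 + \sigma_3^2$$

Therefore, the correlation among level two data can be written for *j* ≠ *j*' as follows:

$$\rho_2 = \mathit{Corr}(Y_{ijk}, Y_{ij'k'}) = \frac{\sigma_3^2}{\sigma^2} \qquad (5)$$

And the correlation among level one data can be written for *k* ≠ *k*' as:

$$\rho_1 = \mathit{Corr}(Y_{ijk}, Y_{ijk'}) = \frac{\sigma_2^2 + \sigma_3^2}{\sigma^2} \qquad (6)$$

It can easily be seen that ρ_{1} ≥ ρ_{2}, with equality when ${\sigma}_{2}^{2}=0$.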
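The mapping between the variance components and the two correlations can be sketched in a few lines. This is a minimal illustration, assuming ρ_{1} = (σ_2² + σ_3²)/σ² and ρ_{2} = σ_3²/σ², which matches the relations used in the simulation section; the function names are hypothetical.

```python
def correlations(sigma_e2, sigma_2_2, sigma_3_2):
    """Return (rho1, rho2) implied by the three variance components."""
    total = sigma_e2 + sigma_2_2 + sigma_3_2  # total variance sigma^2
    rho1 = (sigma_2_2 + sigma_3_2) / total    # same subject, different times
    rho2 = sigma_3_2 / total                  # same clinic, different subjects
    return rho1, rho2


def variance_components(rho1, rho2, sigma2=1.0):
    """Invert the mapping: return (sigma_e^2, sigma_2^2, sigma_3^2)."""
    if not 0 <= rho2 <= rho1 < 1:
        raise ValueError("need 0 <= rho2 <= rho1 < 1")
    sigma_3_2 = rho2 * sigma2
    sigma_2_2 = (rho1 - rho2) * sigma2
    sigma_e2 = (1 - rho1) * sigma2
    return sigma_e2, sigma_2_2, sigma_3_2
```

For instance, `variance_components(0.5, 0.05)` splits a unit total variance into residual, subject-level and clinic-level parts, and `correlations` applied to that result recovers the two input correlations.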

## 3. Maximum Likelihood Estimate and its Variance

The maximum likelihood estimate (MLE) $\widehat{\delta}$ of the interaction effect is indeed the slope difference between the two groups: that is,

$$\widehat{\delta} = {\widehat{\eta}}_{1} - {\widehat{\eta}}_{0} \qquad (7)$$

where ${\widehat{\eta}}_{g}(g=0,1)$ is the MLE of the slope for the outcome *Y* in the *g*-th group, in which *X*_{i} = *g*. Specifically, for *i* in the *g*-th group,

$${\widehat{\eta}}_{g} = \frac{\sum_{i=1}^{N_3}\sum_{j=1}^{N_2}\sum_{k=1}^{N_1}(T_k-\bar{T})(Y_{ijk}-{\bar{Y}}_{g})}{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)} \qquad (8)$$

where: 1) ${\bar{Y}}_{g}(g=0,1)$ is the overall group mean of the outcome *Y* for the *g*-th group; 2) $\bar{T}={\Sigma}_{k=1}^{{N}_{1}}{T}_{k}/{N}_{1}$ is the “mean” time point; and 3) ${\mathit{Var}}_{p}\left(T\right)={\Sigma}_{k=1}^{{N}_{1}}{({T}_{k}-\bar{T})}^{2}/{N}_{1}$ is the “population variance” of the time variable *T*. In fact, the slope estimate (8), but not the variance of the slope estimate, is the same as that of an ordinary linear regression with *u*_{i} = *u*_{j(i)} = 0 in model (1). The reason for this, on a heuristic level, is that the weights assigned to data points *Y*_{ijk} in estimation of the slopes are identical and the slopes do not depend on the random intercepts of any data level. Indeed, the ordinary least squares estimate (8) is the MLE under the perfectly balanced design [2] that we are considering in this paper.

Based on equations (3) and (8), it can easily be shown that the MLE $\widehat{\delta}$ is unbiased, i.e., $E\left(\widehat{\delta}\right)=E({\widehat{\eta}}_{1}-{\widehat{\eta}}_{0})=(\tau +\delta )-\tau =\delta $. The variance of a slope MLE ${\widehat{\eta}}_{g}$ can be obtained based on equation (4) as follows (see Appendix for a proof):

$$\mathit{Var}({\widehat{\eta}}_{g}) = \frac{{\sigma}_{e}^{2}}{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)} = \frac{(1-{\rho}_{1}){\sigma}^{2}}{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)} \qquad (9)$$

Therefore, the variance of $\widehat{\delta}$ is

$$\mathit{Var}(\widehat{\delta}) = \mathit{Var}({\widehat{\eta}}_{1}) + \mathit{Var}({\widehat{\eta}}_{0}) = \frac{2(1-{\rho}_{1}){\sigma}^{2}}{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)} \qquad (10)$$

Observe that ${\widehat{\eta}}_{1}$ and ${\widehat{\eta}}_{0}$ are independent of each other. It is notable, however, that the variance of $\widehat{\delta}$ depends only on the residual variance ${\sigma}_{e}^{2}$, and on none of ${\sigma}_{3}^{2}$, ${\sigma}_{2}^{2}$, or ρ_{2}. Therefore, *for a given total variance* σ^{2}, it decreases with decreasing ${\sigma}_{e}^{2}$ or, equivalently, with increasing ρ_{1}, the correlation among the first level data.
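The behavior of this variance can be checked numerically. The sketch below assumes the form Var(δ̂) = 2(1 - ρ_{1})σ² / (N_3 N_2 N_1 Var_p(T)), with time points 0, 1, …, N_1 - 1 as in Section 2; the function names are illustrative only.

```python
def var_p_time(n1):
    """Population variance of the equally spaced time points 0..n1-1."""
    times = range(n1)
    mean_t = sum(times) / n1
    return sum((t - mean_t) ** 2 for t in times) / n1


def var_delta_hat(n3, n2, n1, rho1, sigma2=1.0):
    """Variance of the slope-difference estimate for n3 clusters per group."""
    residual_var = (1 - rho1) * sigma2  # sigma_e^2
    return 2 * residual_var / (n3 * n2 * n1 * var_p_time(n1))
```

Note that ρ_{2} is not an argument of `var_delta_hat`, which mirrors the observation that the variance is free of the level three variance component, and that swapping the values of `n3` and `n2` leaves the result unchanged.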

## 4. Power and sample size

The following test statistic *D*, based on (7) and (10), can be used to test the null hypothesis (2):

$$D = \frac{\widehat{\delta}}{\mathit{se}(\widehat{\delta})} = \frac{\sqrt{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)}\;\widehat{\delta}}{\sigma \sqrt{2(1-{\rho}_{1})}} \qquad (11)$$

If the three variance components, ${\sigma}_{2}^{2}$, ${\sigma}_{3}^{2}$ and ${\sigma}_{e}^{2}$, are known, then the test statistic *D* is normally distributed with mean $\delta /\mathit{se}\left(\widehat{\delta}\right)$ and variance 1. When those three variance components are unknown and replaced by their MLEs, the test statistic *D* becomes a Wald test statistic and its *asymptotic* distribution is normal based on large sample theory [7]. Thus, under the null hypothesis (2), *D* ~ *N*(0, 1) and under an alternative hypothesis of δ ≠ 0, $D\sim N(\delta /\mathit{se}\left(\widehat{\delta}\right),1)$.

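The power of the test statistic *D*, denoted by φ, can therefore be written as follows:

$$\phi = 1-\beta = \Phi\left\{\frac{\delta}{\sigma}\sqrt{\frac{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)}{2(1-{\rho}_{1})}} - {\Phi}^{-1}(1-\alpha/2)\right\} \qquad (12)$$

where α is a two-sided significance level; β represents the probability of type II error; Φ is the cumulative distribution function (CDF) of a standard normal distribution and Φ^{-1} is its inverse. From now on, it is understood that: 1) δ = |δ| > 0; and 2) under the alternative hypothesis, the probability in the opposite tail, below the critical value Φ^{-1}(α/2), is negligible and thus assumed to be 0. When the slope difference is expressed in pooled within-group standard deviation (SD) units, i.e., when expressed in terms of a standardized effect size

$${\Delta}_{\delta} = \frac{\delta}{\sigma}$$

the power function can be expressed as follows:

$$\phi = \Phi\left\{{\Delta}_{\delta}\sqrt{\frac{N_3 N_2 N_1 {\mathit{Var}}_{p}(T)}{2(1-{\rho}_{1})}} - {\Phi}^{-1}(1-\alpha/2)\right\} \qquad (13)$$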
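A power function of this kind is straightforward to evaluate. The sketch below assumes the form φ = Φ{Δ_δ √(N_3 N_2 N_1 Var_p(T) / (2(1 - ρ_1))) - Φ^{-1}(1 - α/2)} with time points 0, 1, …, N_1 - 1; `power_interaction` and `var_p_time` are hypothetical names, and the normal CDF comes from Python's standard `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def var_p_time(n1):
    """Population variance of the time points 0, 1, ..., n1 - 1."""
    times = range(n1)
    mean_t = sum(times) / n1
    return sum((t - mean_t) ** 2 for t in times) / n1


def power_interaction(effect_size, n3, n2, n1, rho1, alpha=0.05):
    """Approximate power to detect a standardized slope difference.

    effect_size is Delta_delta (slope difference per unit time, in SD units);
    n3 is the number of clusters per group.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    scale = sqrt(n3 * n2 * n1 * var_p_time(n1) / (2 * (1 - rho1)))
    return std_normal.cdf(effect_size * scale - z_crit)
```

With `effect_size=0.15`, `n1=3` and `rho1=0.4`, the calls `power_interaction(0.15, 42, 5, 3, 0.4)` and `power_interaction(0.15, 21, 10, 3, 0.4)` return the same value, close to the 0.801 reported in Section 6, because only the product of `n3` and `n2` enters the formula.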

It follows that when the hypothesis testing is based on *D* with a two-sided significance level of α, the third level unit sample size *N*_{3} per group for a desired statistical power φ = 1 - β can be calculated from equation (12) as:

$$N_3 = \frac{2(1-{\rho}_{1}){\sigma}^{2}{\left\{{\Phi}^{-1}(1-\alpha/2)+{\Phi}^{-1}(1-\beta)\right\}}^{2}}{N_2 N_1 {\mathit{Var}}_{p}(T)\,{\delta}^{2}} \qquad (14)$$

or equivalently in terms of the standardized effect size Δ_{δ} from equation (13)

$$N_3 = \frac{2(1-{\rho}_{1}){\left\{{\Phi}^{-1}(1-\alpha/2)+{\Phi}^{-1}(1-\beta)\right\}}^{2}}{N_2 N_1 {\mathit{Var}}_{p}(T)\,{\Delta}_{\delta}^{2}} \qquad (15)$$

More precisely, *N*_{3} is the smallest integer greater than the right hand side of equation (14) or (15). It can be observed that the level 3 sample size is a decreasing function of ρ_{1} and of *Var*_{p}(*T*) in particular. Stated differently, more follow-up with more consistent (as opposed to erratic) observations within subjects over time will increase the power (13) and at the same time will reduce the required sample size *N*_{3} or *N*_{2} for the same anticipated power.
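The sample size calculation can be sketched as follows, assuming the formula N_3 = 2(1 - ρ_1){Φ^{-1}(1 - α/2) + Φ^{-1}(1 - β)}² / (N_2 N_1 Var_p(T) Δ_δ²) rounded up to the next integer, with time points 0, 1, …, N_1 - 1. The function name `n3_per_group` is an assumption made for this illustration.

```python
import math
from statistics import NormalDist


def var_p_time(n1):
    """Population variance of the time points 0, 1, ..., n1 - 1."""
    times = range(n1)
    mean_t = sum(times) / n1
    return sum((t - mean_t) ** 2 for t in times) / n1


def n3_per_group(effect_size, n2, n1, rho1, alpha=0.05, power=0.80):
    """Number of level three units (clusters) needed per intervention group."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    raw = 2 * (1 - rho1) * z_sum ** 2 / (
        n2 * n1 * var_p_time(n1) * effect_size ** 2
    )
    return math.ceil(raw)  # round up to the next whole cluster
```

As a check against the worked figures in Section 6, `n3_per_group(0.06, 5, 6, rho1)` gives 30, 25 and 20 for `rho1` equal to 0.4, 0.5 and 0.6, illustrating that fewer clusters are needed as the within-subject correlation grows.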

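The sample size *N*_{2} has a reciprocal relationship with *N*_{3} in the sense that the power depends on them only through the product *N*_{2}*N*_{3}, because the two are free of each other and of the other parameters. Therefore, the sample size *N*_{2} for the level two data can immediately be determined from equation (15) as follows:

$$N_2 = \frac{2(1-{\rho}_{1}){\left\{{\Phi}^{-1}(1-\alpha/2)+{\Phi}^{-1}(1-\beta)\right\}}^{2}}{N_3 N_1 {\mathit{Var}}_{p}(T)\,{\Delta}_{\delta}^{2}} \qquad (16)$$

The sample size *N*_{1} for the level one data should, however, be determined in an iterative manner because *Var*_{p}(*T*) is a function of *N*_{1}. Specifically, an iterative solution for *N*_{1} must satisfy the following equation:

$$N_1 = \frac{2(1-{\rho}_{1}){\left\{{\Phi}^{-1}(1-\alpha/2)+{\Phi}^{-1}(1-\beta)\right\}}^{2}}{N_3 N_2 {\mathit{Var}}_{p}(T)\,{\Delta}_{\delta}^{2}} \qquad (17)$$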
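One simple way to carry out such an iteration is a direct search over integer values of *N*_{1}. The sketch below is an assumption about how the search could be organized, not the authors' procedure: it holds the per-unit-time effect size Δ_δ fixed, uses time points 0, 1, …, N_1 - 1, and returns the smallest *N*_{1} for which N_1 Var_p(T) reaches 2(1 - ρ_1){Φ^{-1}(1 - α/2) + Φ^{-1}(1 - β)}² / (N_3 N_2 Δ_δ²).

```python
from statistics import NormalDist


def var_p_time(n1):
    """Population variance of the time points 0, 1, ..., n1 - 1."""
    times = range(n1)
    mean_t = sum(times) / n1
    return sum((t - mean_t) ** 2 for t in times) / n1


def n1_required(effect_size, n3, n2, rho1, alpha=0.05, power=0.80, max_n1=500):
    """Smallest number of repeated assessments giving the desired power."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    target = 2 * (1 - rho1) * z_sum ** 2 / (n3 * n2 * effect_size ** 2)
    # n1 * Var_p(T) grows with n1, so the first n1 reaching the target is minimal.
    for n1 in range(2, max_n1 + 1):
        if n1 * var_p_time(n1) >= target:
            return n1
    raise ValueError("no n1 <= max_n1 achieves the requested power")
```

Because N_1 Var_p(T) grows roughly with the cube of *N*_{1} under unit time steps, the loop typically terminates after a handful of iterations.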

## 5. Simulation study specification

We conducted simulation studies to verify the sample size *N*_{3} (15) and the power function (13) using SAS PROC MIXED, which is suitable for fitting the three-level mixed-effects linear model (1). For a two-sided significance level α = 0.05 and a desired power φ = 0.8, the following combinations of the simulation parameters were prespecified: Δ_{δ}*T*_{end} = Δ_{δ}(*N*_{1} - 1) = 0.3, 0.4, 0.5; *N*_{2} = 5, 10, 20, 30; *N*_{1} = 3, 6, 12; ρ_{1} = 0.4, 0.5, 0.6, while without loss of generality σ = 1, ρ_{2} = 0.05, β_{0} = ξ = 0, and τ = -1 (in model (1)) remained fixed. This 3×4×3×3 factorial design scheme yielded a total of 108 combinations of those parameters. In particular, the effect size of the interaction, or the between-group slope difference Δ_{δ}, is specified in such a way that it would yield a standardized between-group mean difference Δ_{δ}*T*_{end} at the end of the trial, i.e., when *T* = *T*_{end} = *N*_{1} - 1.

To generate simulated data, we first estimated *N*_{3} using equation (15) for a given combination (see step 2 below). Specifically, for each combination we used the following simulation steps:

1. Calculate the variance of time, *Var*_{p}(*T*), for given *N*_{1};
2. Estimate *N*_{3} using equation (15) for the given combination of Δ_{δ}, *N*_{2}, *N*_{1} and ρ_{1};
3. Calculate the variance components ${\sigma}_{2}^{2}$ and ${\sigma}_{3}^{2}$ based on equations (5) and (6) for given ρ_{1}, ρ_{2} and σ^{2}; specifically, ${\sigma}_{2}^{2}=({\rho}_{1}-{\rho}_{2}){\sigma}^{2}$ and ${\sigma}_{3}^{2}={\rho}_{2}{\sigma}^{2}$;
4. Calculate ${\sigma}_{e}^{2}={\sigma}^{2}-({\sigma}_{3}^{2}+{\sigma}_{2}^{2})$;
5. Calculate δ = σΔ_{δ} for the given σ^{2} and Δ_{δ};
6. Generate the random intervention assignment indicator *X*_{i} = 0 or 1 for each *i* = 1, 2, …, 2*N*_{3} in a balanced manner so that Σ_{i} *X*_{i} = *N*_{3};
7. Generate *u*_{i} from $N(0,{\sigma}_{3}^{2})$ independently for each *i* = 1, 2, …, 2*N*_{3} (unconditional independence assumption);
8. For each *u*_{i}, generate *u*_{j(i)} from $N(0,{\sigma}_{2}^{2})$ independently for *j* = 1, 2, …, *N*_{2} (conditional independence assumption);
9. For each combination of *u*_{i} and *u*_{j(i)}, generate *e*_{ijk} from $N(0,{\sigma}_{e}^{2})$ independently for *k* = 1, 2, …, *N*_{1} (conditional independence assumption);
10. Generate the outcome data set *Y*_{ijk} = β_{0} + ξ*X*_{i} + τ*T*_{k} + δ*X*_{i}*T*_{k} + *u*_{i} + *u*_{j(i)} + *e*_{ijk} according to model (1);
11. Fit the data set with the three-level linear mixed-effects model (1);
12. Retain a *p*-value, denoted by *p*_{s}(δ) for the *s*-th simulated data set, obtained from testing the null hypothesis (2);
13. Repeat steps 6-12 1000 times (i.e., *s* = 1, 2, …, 1000) for each combination of the simulation parameters.
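The data-generating steps above can be sketched with NumPy. This is a simplified stand-in rather than a reproduction of the study: instead of fitting the mixed model with SAS PROC MIXED, it computes the closed-form slope difference and a z statistic with *known* variance components, so it only illustrates the simulation logic. It assumes the level three random intercept has variance ρ_2σ² and the level two random intercept has variance (ρ_1 - ρ_2)σ²; the function name and arguments are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def simulate_power(effect_size_end, n3, n2, n1, rho1, rho2=0.05, sigma2=1.0,
                   tau=-1.0, alpha=0.05, n_sims=1000, seed=0):
    """Empirical power for the slope difference, using a known-variance z test."""
    rng = np.random.default_rng(seed)
    times = np.arange(n1, dtype=float)            # T_k = 0, 1, ..., n1 - 1
    w = times - times.mean()                      # centered time points
    var_t = times.var()                           # population variance of T
    delta = np.sqrt(sigma2) * effect_size_end / (n1 - 1)
    s3 = rho2 * sigma2                            # level three variance
    s2 = (rho1 - rho2) * sigma2                   # level two variance
    se2 = (1 - rho1) * sigma2                     # residual variance
    denom = n3 * n2 * n1 * var_t
    se_delta = np.sqrt(2 * se2 / denom)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        x = rng.permutation(np.repeat([0, 1], n3))        # balanced assignment
        u3 = rng.normal(0.0, np.sqrt(s3), size=(2 * n3, 1, 1))
        u2 = rng.normal(0.0, np.sqrt(s2), size=(2 * n3, n2, 1))
        e = rng.normal(0.0, np.sqrt(se2), size=(2 * n3, n2, n1))
        # beta_0 = xi = 0, as in the simulation specification
        y = tau * times + delta * x[:, None, None] * times + u3 + u2 + e
        cluster_sums = (y * w).sum(axis=(1, 2))           # sum_jk W_k * Y_ijk
        eta1 = cluster_sums[x == 1].sum() / denom
        eta0 = cluster_sums[x == 0].sum() / denom
        rejections += int(abs((eta1 - eta0) / se_delta) > z_crit)
    return rejections / n_sims
```

For example, `simulate_power(0.3, 42, 5, 3, 0.4)` should land near 0.80, up to Monte Carlo error. A faithful replication of the study would replace the z test with the Wald test from a fitted three-level mixed model.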

Let us denote by $\tilde{\phi}$ the empirical power that is obtained from the 1000 simulations as follows:

$$\tilde{\phi} = \frac{1}{1000}\sum_{s=1}^{1000} 1\left\{p_s(\delta) < \alpha\right\} \qquad (18)$$

This empirical power is compared with the theoretical power φ that is computed based on the *N*_{3} obtained in step 2 above, but not with the prespecified power of 0.8. It should be noted that the theoretical power obtained in that way is never less than the prespecified power of 0.8 since *N*_{3} is the smallest integer greater than the right hand side of equation (15).

## 6. Simulation study results

Table 1 summarizes the specified (*N*_{2} and *N*_{1}) and estimated (*N*_{3}) sample sizes, the empirical power $\tilde{\phi}$ (18) and the theoretical power φ (13) based on the estimated *N*_{3}. Although the empirical power falls negligibly below the theoretical power, as reflected in the mean differences in the last row of Table 1, the two are virtually identical. For instance, among the 108 combinations (Table 1), the maximum absolute difference $\mid \phi -\tilde{\phi}\mid $ was 0.027, which is tolerable given that the 95% confidence limits for a simulation estimate are $\pm 1.96\sqrt{0.8\times 0.2/1000}=\pm 0.025$. Thus, the derived formulae for sample size and power are very accurate under the conditions that were examined. In each case, the theoretical power is no less than 0.8, since the power calculations were based on “integer” values of *N*_{3}.
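The Monte Carlo margin quoted above follows from the binomial standard error of a proportion estimated from independent replicates. A minimal sketch, with a hypothetical helper name:

```python
import math


def monte_carlo_margin(true_power, n_sims, z=1.96):
    """Half-width of an approximate 95% interval for a simulated proportion."""
    return z * math.sqrt(true_power * (1 - true_power) / n_sims)
```

With a true power of 0.8 and 1000 replicates the margin rounds to 0.025, which puts the observed maximum discrepancy of 0.027 just outside a single-comparison margin, a modest excess when 108 combinations are examined.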

Table 1. *N*_{3}, theoretical power φ and empirical power $\tilde{\phi}$ for testing the intervention group by time interaction effect in a three level mixed-effects linear regression analysis, based on 1000 simulations.

As expected, the sample size *N*_{3} for the identical power decreases with increasing correlation ρ_{1} when the other design parameters are held the same. For example, when *N*_{2} = 5, *N*_{1} = 6, and Δ_{δ}*T*_{end} = 0.3 (or Δ_{δ} = 0.3/5 = 0.06), the respective sample size requirements for 80% power for the level three data (*N*_{3}) were 30, 25, and 20 for ρ_{1} = 0.4, 0.5, and 0.6. Furthermore, the theoretical power is identical for various combinations of *N*_{2} and *N*_{3} that yield an equivalent product, assuming the other design parameters are held constant. For instance, as shown in Table 1, each of the following pairs of *N*_{2} and *N*_{3} with a product of 210 yielded an identical power of 0.801 when *N*_{1} = 3, ρ_{1} = 0.4, Δ_{δ}*T*_{end} = 0.3 (or Δ_{δ} = 0.3/2 = 0.15): *N*_{2} = 5 and *N*_{3} = 42; *N*_{2} = 10 and *N*_{3} = 21; *N*_{2} = 30 and *N*_{3} = 7.

## 7. Application

The results in Table 1 can be applied to designing a longitudinal cluster-RCT. Consider, for instance, a longitudinal cluster-RCT that compares an innovative primary care level intervention with usual primary care practice on the depression outcome of subjects, as conducted in the PROSPECT [8,9] and the RESPECT [10] trials. To test whether the course of depressive symptoms over time depends on the care that the subjects receive, it is anticipated that primary care clinics can each accommodate 20 subjects (*N*_{2}) for the research purpose and that each patient would be assessed 6 times (*N*_{1}). The results presented in Table 1 can be applied to estimating the number of primary care clinics, i.e., level 3 units (*N*_{3}), for 80% power. If ρ_{1} = 0.5, then four clinics (*N*_{3}) for each of the two intervention groups, or a total of 160 subjects, would be needed to detect an effect size Δ_{δ}*T*_{end} = 5Δ_{δ} = 0.4 (or Δ_{δ} = 0.4/5 = 0.08) with at least 80% statistical power (Table 1). Sample size requirements for other design parameters can be obtained from Table 1. For other combinations of design specifications that were not presented in Table 1, the sample size formula (15) can be applied.
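As a rough check of this example, the arithmetic can be worked through by hand, assuming a sample size formula of the form $N_3 = 2(1-\rho_1)\{\Phi^{-1}(1-\alpha/2)+\Phi^{-1}(1-\beta)\}^2 / (N_2 N_1 \mathit{Var}_p(T)\Delta_\delta^2)$ with time points 0, 1, …, 5 and rounded normal quantiles:

$$
\begin{aligned}
\mathit{Var}_p(T) &= \frac{1}{6}\sum_{k=0}^{5}(k-2.5)^2 = \frac{17.5}{6} \approx 2.917,\\
\Phi^{-1}(0.975)+\Phi^{-1}(0.80) &\approx 1.960 + 0.842 = 2.802,\\
N_3 &\ge \frac{2\,(1-0.5)\,(2.802)^2}{20 \times 6 \times 2.917 \times 0.08^2} \approx \frac{7.85}{2.24} \approx 3.5 .
\end{aligned}
$$

Rounding up gives *N*_{3} = 4 clinics per group, that is, 2 × 4 × 20 = 160 subjects in total, in agreement with the figure quoted above.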

## 8. Discussion

The derived power function (13) and level 3 unit sample size formula (15) for detecting an intervention by time interaction are shown to be accurate compared to empirical estimates based on a simulation study. Therefore, the sample size formulae (16, 17) for the number of level 2 and level 1 data units are also accurate because they are different expressions of equation (15). Importantly, the sample size does not depend on correlations among second level data units, and the statistical power function depends on the number of second and third level data units through their product. Furthermore, when either *N*_{3} or *N*_{2} is equal to one, the three level data structure reduces to a two level data structure with the number of second level data units being *N*_{2} or *N*_{3}, correspondingly. In either case, the variance ${\sigma}_{3}^{2}$ of the level three random intercept can be considered to be 0 and thus ρ_{2} can be assumed to be 0. This reduces the sample size formula (14) to equation (2.4.1) on page 29 of Diggle et al [6], as it should. In Diggle et al's formula too, the power function is increasing in ρ_{1}.

Collectively, therefore, as far as testing the intervention by time interaction is concerned, the design can be very flexible for the same statistical power depending on feasibility. For example, when *N*_{3}*N*_{2} = 200 subjects per group are needed for 80% power, the sample sizes *N*_{3} and *N*_{2} can be determined depending on the availability of level two and level three units for recruitment, regardless of the anticipated ρ_{2}. To this end, if recruitment of 10 subjects (*N*_{2}) per clinic was feasible, then the investigators could try to enlist 20 clinics (*N*_{3}) per intervention group. On the other hand, if only 5 clinics (*N*_{3}) were available per intervention group, then recruitment of 40 subjects (*N*_{2}) per clinic would be required. In an extreme case where only one clinic (*N*_{3} = 1) is available, one could recruit 200 subjects (*N*_{2}) from the single clinic.
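This trade-off amounts to dividing the required number of subjects per group by the number of clinics that can be enlisted. A minimal sketch, with a hypothetical helper name, that rounds up so the product never falls below the requirement:

```python
import math


def subjects_per_clinic(required_per_group, clinics_per_group):
    """Subjects needed per clinic so that clinics * subjects >= requirement."""
    if clinics_per_group < 1:
        raise ValueError("at least one clinic per group is required")
    return math.ceil(required_per_group / clinics_per_group)
```

For a requirement of 200 subjects per group, 20, 5 and 1 clinics per group translate into 10, 40 and 200 subjects per clinic, matching the scenarios described above.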

Although the empirical power was based on unknown variance components of random effects, it was virtually identical to the theoretical power derived with known variance components in the test statistic *D* (11). Therefore, derivation of power function with unknown variances may not be necessary even for small *N*_{3}, although it might be possible through application of CDFs of central and non-central *t* distributions [11] replacing the standard normal CDF Φ and its inverse Φ^{-1} in equation (14) or (15).

It should be noted that the sample size formula is intended to detect a slope difference *per se* but not an expected between-group difference at *T*_{end}, the end of a study. In other words, the sample size formula (15) derived herein is not appropriate for detecting an intervention effect at a prespecified time point such as the end of a trial. This is because the variance of this effect is not equal to ${T}_{\mathit{end}}^{2}\mathit{Var}({\widehat{\eta}}_{1}-{\widehat{\eta}}_{0})$, even if the estimated quantities are the same. Thus, this intervention effect, Δ_{δ}*T*_{end}, served only as the basis for specifying a hypothesized slope difference Δ_{δ}.

Other sample size formulae are available. For instance, Liu et al [12] derived sample size formulas for the slope difference using generalized estimating equations. Murray et al [13] presented detectable effect sizes based on expected mean square errors using random coefficients analysis for the nested cohort design. Roy et al [14] derived general-form sample size determinations using a mixed-effects linear model, taking into account potential attrition rates and more general correlation structures. Heo and Leon [15] derived an algorithm for sample size requirements to detect a main effect of group using a linear mixed effects model for three level data. Although comparisons of sample sizes assuming different modeling approaches would provide better insight in designing a cluster-RCT, the sample size equations presented above (15, 16, 17) are more readily implemented.

The sample size determinations derived here have limitations. First, the formulae were derived assuming fixed numbers of units at all levels, although the number of subjects per clinic will likely vary, i.e., *j* = 1, 2, …, *n*_{i}, depending on the *i*-th clinic. Furthermore, the number of assessments per subject will also vary (i.e., *k* = 1, 2, …, *n*_{ij}, depending on both clinics and subjects) because attrition of subjects during a trial is in reality the norm rather than the exception [16,17]. Nevertheless, our derivation based on non-varying cluster sizes provides a useful approximation and, further, can serve as a basis for deriving a sample size algorithm for varying cluster sizes. For instance, if the variation in the cluster sizes is completely at random in the missing data analysis framework [18], a replacement of the varying cluster sizes with an average cluster size has been shown to be effective for sample size and statistical power with varying cluster sizes under two level binary outcome data [19]. Second, for pragmatic reasons the covariance structure (4) considered here was based on the conditional independence assumption. Therefore, the robustness of the derived formulae under alternative covariance structures, such as an autocorrelated or unstructured covariance matrix, is unknown.

In conclusion, the derived formulae for sample sizes (15, 16, 17) and power functions (12, 13) can be useful in designing community based longitudinal cluster-randomized clinical trials that compare slopes of outcomes over time between two intervention groups in a three level data structure.

## Acknowledgement

We are grateful to Donald Hedeker Ph.D., two anonymous referees and an Associate Editor for their valuable suggestions. This study was supported in part by NIMH grants, P30MH068638 and R01MH060447.

## Appendix

Proof of equation (9), $\mathit{Var}\left({\widehat{\eta}}_{g}\right)=\frac{{\sigma}_{e}^{2}}{{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)}=\frac{(1-{\rho}_{1}){\sigma}^{2}}{{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)}$. Let ${W}_{k}=({T}_{k}-\bar{T})$. Then we have: ${\Sigma}_{k=1}^{{N}_{1}}{W}_{k}^{2}={N}_{1}{\mathit{Var}}_{p}\left(T\right)$; ${\Sigma}_{k=1}^{{N}_{1}}{W}_{k}=0$; ${\Sigma}_{k\prime \ne k}^{{N}_{1}}{W}_{k\prime}=-{W}_{k}$; and ${\widehat{\eta}}_{g}={\Sigma}_{i=1}^{{N}_{3}}{\Sigma}_{j=1}^{{N}_{2}}{\Sigma}_{k=1}^{{N}_{1}}{W}_{k}({Y}_{\mathit{ijk}}-{\bar{Y}}_{g})/\{{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)\}={\Sigma}_{i=1}^{{N}_{3}}{\Sigma}_{j=1}^{{N}_{2}}{\Sigma}_{k=1}^{{N}_{1}}{W}_{k}{Y}_{\mathit{ijk}}/\{{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)\}$. Observing that *Y* is independent over *i*, we decompose the variance of the numerator of ${\widehat{\eta}}_{g}$ as follows:

$$\mathit{Var}\left(\sum_{i=1}^{N_3}\sum_{j=1}^{N_2}\sum_{k=1}^{N_1} W_k Y_{ijk}\right) = \sum_{i=1}^{N_3}\mathit{Var}\left(\sum_{j=1}^{N_2}\sum_{k=1}^{N_1} W_k Y_{ijk}\right) = A + B + C,$$

where

$$A = \sum_{i=1}^{N_3}\sum_{j=1}^{N_2}\sum_{k=1}^{N_1} W_k^2\,\mathit{Var}(Y_{ijk}),\quad B = \sum_{i=1}^{N_3}\sum_{j=1}^{N_2}\sum_{k=1}^{N_1}\sum_{k'\ne k}^{N_1} W_k W_{k'}\,\mathit{Cov}(Y_{ijk},Y_{ijk'}),\quad C = \sum_{i=1}^{N_3}\sum_{j=1}^{N_2}\sum_{j'\ne j}^{N_2}\sum_{k=1}^{N_1}\sum_{k'=1}^{N_1} W_k W_{k'}\,\mathit{Cov}(Y_{ijk},Y_{ij'k'}).$$

Now, recall equation (4), that is,

$$\mathit{Cov}(Y_{ijk}, Y_{i'j'k'}) = 1(i=i')\,\sigma_3^2 + 1(i=i')\,1(j=j')\,\sigma_2^2 + 1(i=i')\,1(j=j')\,1(k=k')\,\sigma_e^2 .$$

It follows that *A* = σ^{2}*N*_{3}*N*_{2}*N*_{1}*Var*_{p}(*T*) since $\mathit{Var}\left({Y}_{\mathit{ijk}}\right)={\sigma}^{2}={\sigma}_{e}^{2}+{\sigma}_{2}^{2}+{\sigma}_{3}^{2}$. Further, ${\Sigma}_{k=1}^{{N}_{1}}{\Sigma}_{k\prime \ne k}^{{N}_{1}}{W}_{k}{W}_{k\prime}\mathit{Cov}({Y}_{\mathit{ijk}},{Y}_{\mathit{ijk}\prime})=-({\sigma}_{2}^{2}+{\sigma}_{3}^{2}){\Sigma}_{k=1}^{{N}_{1}}{W}_{k}^{2}$ since ${\Sigma}_{k\prime \ne k}^{{N}_{1}}{W}_{k\prime}=-{W}_{k}$. Therefore, $B=-({\sigma}_{2}^{2}+{\sigma}_{3}^{2}){N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)$. It is easy to see that *C* = 0 since ${\Sigma}_{k=1}^{{N}_{1}}{W}_{k}=0$. Hence, we have $\mathit{Var}\left({\Sigma}_{i=1}^{{N}_{3}}{\Sigma}_{j=1}^{{N}_{2}}{\Sigma}_{k=1}^{{N}_{1}}{W}_{k}{Y}_{\mathit{ijk}}\right)=A+B={\sigma}_{e}^{2}{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)$. Dividing by the squared denominator ${\{{N}_{3}{N}_{2}{N}_{1}{\mathit{Var}}_{p}\left(T\right)\}}^{2}$ of ${\widehat{\eta}}_{g}$, it follows that equation (9) above holds.
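The result of this proof can also be checked numerically by building the covariance matrix of one cluster explicitly and evaluating the quadratic form of the centered time weights. The sketch below assumes the compound covariance structure implied by model (1) (a shared clinic variance, a shared subject variance, and independent residuals) and time points 0, 1, …, N_1 - 1; the function names are illustrative.

```python
import numpy as np


def var_slope_numeric(n3, n2, n1, sigma_e2, sigma_2_2, sigma_3_2):
    """Var of the slope estimate via the explicit per-cluster covariance matrix."""
    times = np.arange(n1, dtype=float)
    w = times - times.mean()                       # W_k = T_k - mean(T)
    var_t = times.var()                            # population variance of T
    eye2, ones2 = np.eye(n2), np.ones((n2, n2))
    eye1, ones1 = np.eye(n1), np.ones((n1, n1))
    # Observations in a cluster are ordered subject by subject (index j*n1 + k).
    cov = (sigma_3_2 * np.kron(ones2, ones1)       # shared clinic effect
           + sigma_2_2 * np.kron(eye2, ones1)      # shared subject effect
           + sigma_e2 * np.kron(eye2, eye1))       # independent residuals
    weights = np.tile(w, n2)                       # W_k repeated for each subject
    var_numerator = n3 * (weights @ cov @ weights) # clusters are independent
    return var_numerator / (n3 * n2 * n1 * var_t) ** 2


def var_slope_closed_form(n3, n2, n1, sigma_e2):
    """Closed form: residual variance over N3 * N2 * N1 * Var_p(T)."""
    var_t = np.arange(n1, dtype=float).var()
    return sigma_e2 / (n3 * n2 * n1 * var_t)
```

The two functions agree for any choice of the level two and level three variances, which is the numerical counterpart of the cancellation *A* + *B* and of *C* = 0 in the proof.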

## Reference

Sample size requirements to detect an intervention by time interaction in longitudinal cluster randomized clinical trials. NIHPA Author Manuscripts. Mar 15, 2009; 28(6): 1017.
