NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Hong H, Carlin BP, Chu H, et al. A Bayesian Missing Data Framework for Multiple Continuous Outcome Mixed Treatment Comparisons [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Jan.

## A Bayesian Missing Data Framework for Multiple Continuous Outcome Mixed Treatment Comparisons [Internet].

## Results for OA Data

Table 2 compares the fit of six models to our OA data. We assume homogeneous variance across arms in LARE and homogeneous covariance matrices in CBRE2 and ABRE2; that is, $\boldsymbol{\Sigma}_{k}^{\text{Out}}$ and $\boldsymbol{\Lambda}_{k}^{\text{Out}}$ are each the same for all *k*. All CB and AB models account for the missingness, and only the CBRE2hom and ABRE2hom models allow a correlation structure between outcomes. The fixed effects model gives the largest mean deviance score $\bar{D}$ when applied to the OA data, and an unacceptably large DIC score. ABRE1 fits the data best, with the smallest $\bar{D}$, but the random effects models do not differ significantly in fit. AB models give slightly higher pD than CB models because they are less constrained and more parameters need to be estimated. Because our data are sparse, the heterogeneous variance assumption, a feature of CBRE1 and ABRE1, is not a good choice here. Considering both goodness of fit and complexity, CBRE2hom gives the smallest DIC, though again the DIC differences between this model and ABRE2hom or LAREhom are not of practical importance (less than five units). The estimated variability on the standard deviation scale is always between 1 and 1.5, with associated 95% credible interval widths of around 0.4, based on the posterior medians in the LAREhom, CBRE2hom, and ABRE2hom models. The posterior median correlations between the two outcomes are 0.494 (95% credible interval 0.18 to 0.71) and 0.377 (0.06 to 0.61) for the CBRE2hom and ABRE2hom models, respectively, revealing the two outcomes to be positively but weakly correlated (data not shown).
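For reference, and assuming the report uses the standard definition of the deviance information criterion, the three fit summaries discussed here are related by

$$\text{DIC} = \bar{D} + p_D,$$

where $\bar{D}$ is the posterior mean deviance (goodness of fit) and $p_D$ is the effective number of parameters (complexity). Smaller DIC is preferred, and two models whose DIC values differ by less than about five units are treated here as practically equivalent.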

Table 3 displays the results from four models (LAFE, LAREhom, CBRE2hom, and ABRE2hom) for the pain outcome. Here, smaller values of *d*_{k1} and *μ*_{k1} indicate a better condition, and the “best” treatment according to the Best12 probability is shown in bold. In the LAREhom model, the top-ranked treatment is essentially tied with aquatic and proprioception exercises for first place. Our CBRE2hom and ABRE2hom models suggest that proprioception exercise is the best treatment, followed by strength exercise, but the Best12 probability of proprioception exercise from ABRE2hom is much larger than that from CBRE2hom. However, because the standard deviations are somewhat large, there is no significant difference between these two treatments. The Best12 probabilities differ considerably across the three random effects models. This may reflect not only the different model assumptions and settings but also the network structure of the data.

Table 4 shows similar information for the disability outcome. Aerobic exercise performs best based on Best12 probabilities from the LAREhom model. Proprioception and aerobic exercises are tied for first place in the CBRE2hom model, and proprioception exercise is the best treatment, followed by strength exercise, in ABRE2hom. Proprioception and aerobic exercises appear to help reduce disability across all models, but there is still no strong evidence of significant differences among the treatments.

Figure 3 presents the findings above graphically in terms of the mean difference between each therapy and no treatment (*d*_{kl}), with 95% credible intervals, across the four models. We mark the best treatment for each outcome in each model with a triangle, and the worst treatment with a square. For the pain outcome, strength and proprioception exercises perform significantly better than no active treatment across all models, whereas for the disability outcome, only aerobic exercise is significantly different from no active treatment under the three random effects models. Compared with the pain outcome, the 95% credible sets for disability are wider because only about half as many studies reported this outcome.
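The notion of "significantly different from no active treatment" used here can be sketched as checking whether a 95% credible interval for a treatment difference excludes zero. The snippet below is a minimal illustration, assuming equal-tailed intervals computed from posterior draws; the function names and the simulated draws are hypothetical and are not the OA results.

```python
import random

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws, using simple
    empirical quantiles (nearest order statistic, no interpolation)."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(round(tail * (n - 1)))]
    hi = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lo, hi

def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lo, hi = interval
    return lo > 0.0 or hi < 0.0

rng = random.Random(1)
# Made-up posterior draws of a treatment difference for two hypothetical therapies.
clear_benefit = [rng.gauss(-1.0, 0.3) for _ in range(4000)]
uncertain = [rng.gauss(-0.3, 0.6) for _ in range(4000)]

for name, draws in [("clear_benefit", clear_benefit), ("uncertain", uncertain)]:
    interval = credible_interval(draws)
    print(name, interval, excludes_zero(interval))
```

A wider posterior, such as the one produced when fewer studies report an outcome, makes it harder for the interval to exclude zero even when the point estimate favors the therapy.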

Figures 4 and 5 exhibit the posterior probabilities of each treatment taking each possible ranking from 1 (best) to 11 (worst) for both the pain reduction and disability improvement outcomes.^{25} Although these graphs cannot reveal significant differences in rankings among treatments or the magnitudes of any treatment differences, they do still give a sense of the uncertainty in the rank for each treatment. Note that in both figures the positive correlation between the two outcomes leads to generally similar treatment ranking probabilities for both outcomes. In Figure 5, proprioception exercise's probability of being the best treatment for pain is roughly 0.8, leaving the remaining 10 treatments to share the remaining 0.2 probability of being the best; this treatment also has the single largest probability of being best for disability improvement (about 0.4). By contrast, the LA model rankings in Figure 4 do not suggest a dominant treatment for either outcome, though aerobic exercise has a nearly 0.4 chance of being best for disability improvement, and placebo is unequivocally worst for pain reduction.
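Rank probabilities of this kind are typically computed by ranking the treatments within each posterior draw and tabulating how often each treatment lands in each position. The sketch below assumes that smaller values are better and that Best12 denotes the probability of ranking first or second (as the name suggests; its definition lies outside this section). The treatment means and draws are invented for illustration.

```python
import random

def rank_probabilities(draws):
    """Return probs[k][r]: share of draws in which treatment k has rank r + 1,
    where rank 1 is best (smallest value)."""
    n_treat = len(draws[0])
    counts = [[0] * n_treat for _ in range(n_treat)]
    for draw in draws:
        # Treatment indices ordered from best (smallest) to worst (largest).
        order = sorted(range(n_treat), key=lambda k: draw[k])
        for rank, k in enumerate(order):
            counts[k][rank] += 1
    n = len(draws)
    return [[c / n for c in row] for row in counts]

rng = random.Random(0)
means = [0.0, -0.2, -0.9, -0.4]  # hypothetical posterior means for 4 treatments
draws = [[rng.gauss(m, 0.3) for m in means] for _ in range(5000)]

probs = rank_probabilities(draws)
# Assumed definition: Best12 = Pr(rank 1) + Pr(rank 2).
best12 = [row[0] + row[1] for row in probs]
```

Each row of `probs` sums to one (every treatment has some rank in every draw), and so does each column (every rank is occupied by exactly one treatment), which is why one dominant treatment leaves little first-place probability for the others.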

To obtain Best12 probabilities with the combined score in Equation (8), we investigate three sets of weights: (*w*_{1},*w*_{2}) = (0.5, 0.5), (0.8, 0.2), and (0.2, 0.8). Our CBRE2hom and ABRE2hom models give proprioception exercise as the global winner for all three sets of weights. Aerobic exercise is the overall winner in the LAREhom model (results not shown). The weights have little effect here because some treatment effects are so large in one outcome that they dominate the effects from the other outcome, even when we put low weight on the former (e.g., the Best12 probability of aerobic exercise in the disability outcome is much larger than that of low intensity diathermy in the pain outcome for LAREhom).
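The dominance effect described here can be illustrated with a toy example. The sketch assumes, purely for illustration, that the combined score is a weighted sum of the two outcome-specific effects; the exact form of Equation (8) is not reproduced in this section and may differ. It also tracks the probability of being best rather than Best12, and all numbers are made up.

```python
import random

def prob_best(scores_by_draw):
    """Share of draws in which each treatment has the smallest (best) score."""
    n_treat = len(scores_by_draw[0])
    wins = [0] * n_treat
    for scores in scores_by_draw:
        wins[min(range(n_treat), key=lambda k: scores[k])] += 1
    return [w / len(scores_by_draw) for w in wins]

def combined_scores(pain, disability, w1, w2):
    """Assumed combined score: w1 * pain effect + w2 * disability effect."""
    return [[w1 * p + w2 * d for p, d in zip(p_draw, d_draw)]
            for p_draw, d_draw in zip(pain, disability)]

rng = random.Random(0)
pain_means = [-1.0, -1.2, 0.0]    # treatment 1 is slightly better on pain
disab_means = [-3.0, -0.5, 0.0]   # treatment 0 is far better on disability
pain = [[rng.gauss(m, 0.3) for m in pain_means] for _ in range(5000)]
disability = [[rng.gauss(m, 0.3) for m in disab_means] for _ in range(5000)]

winners = {}
for w1, w2 in [(0.5, 0.5), (0.8, 0.2), (0.2, 0.8)]:
    probs = prob_best(combined_scores(pain, disability, w1, w2))
    winners[(w1, w2)] = probs.index(max(probs))
```

In this toy setup, treatment 0 remains the overall winner under all three weightings, even with weight 0.8 on pain, because its disability effect is large enough to outweigh its small disadvantage on pain.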

## Sensitivity Analysis

Our CBRE2hom and ABRE2hom models yield a weakly positive correlation between the two outcomes under a noninformative Wishart prior on the covariance matrix of the random effects, which assumes zero correlation between outcomes with γ = 2 degrees of freedom. As a sensitivity analysis, we consider three more informative Wishart priors: 0.5 between-outcome correlation with γ = 2 and γ = 4, and 0.9 between-outcome correlation with γ = 4. Note that a Wishart prior becomes less informative as γ decreases toward 0.
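The link between the degrees of freedom and informativeness can be seen in a small simulation. The sketch below draws 2×2 Wishart matrices with integer degrees of freedom as sums of outer products of correlated normal vectors, then measures how widely the implied correlation varies. It does not reproduce the parametrization used in the report's software (where the Wishart may be placed on a precision matrix), so it only illustrates the qualitative point; all function names are hypothetical.

```python
import math
import random

def wishart_correlation(rho0, df, rng):
    """Draw one 2x2 Wishart(df, S) matrix with S = [[1, rho0], [rho0, 1]]
    as a sum of df outer products, and return its implied correlation."""
    s11 = s22 = s12 = 0.0
    for _ in range(df):
        z1 = rng.gauss(0.0, 1.0)
        # z2 has unit variance and correlation rho0 with z1.
        z2 = rho0 * z1 + math.sqrt(1.0 - rho0 ** 2) * rng.gauss(0.0, 1.0)
        s11 += z1 * z1
        s22 += z2 * z2
        s12 += z1 * z2
    return s12 / math.sqrt(s11 * s22)

def correlation_spread(rho0, df, n_draws=5000, seed=0):
    """Standard deviation of the implied correlation across prior draws."""
    rng = random.Random(seed)
    draws = [wishart_correlation(rho0, df, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    return math.sqrt(sum((r - mean) ** 2 for r in draws) / n_draws)

# Spread of the implied correlation for a prior centered on correlation 0.5.
spreads = {df: correlation_spread(0.5, df) for df in (2, 4, 40)}
```

The spread shrinks as the degrees of freedom grow, so a prior with γ = 4 pulls the posterior correlation toward its prior guess more strongly than one with γ = 2.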

Table 5 displays the results of our sensitivity analysis in terms of model fit (pD, $\bar{D}$, and DIC) and posterior estimates of the correlation between the two outcomes ($\widehat{\rho}$). Here, the degree of informativeness in the Wishart hyperprior increases from left to right. The $\widehat{\rho}$ in CBRE2hom is more sensitive to the choice of Wishart prior, with $\widehat{\rho}$ close to 0.9 when *ρ*_{0} = 0.9 and γ = 4, whereas ABRE2hom gives a somewhat more robust $\widehat{\rho}$ of around 0.5 across the three informative priors. In CBRE2hom, pD decreases as we use a more informative prior, whereas ABRE2hom gives almost the same pD values across all informative priors. Regarding the treatment effect parameters, informative priors do not change the treatment ranking dramatically (proprioception exercise is the best treatment for both outcomes under both CBRE2hom and ABRE2hom across all informative priors), but they yield smaller standard deviations for those parameters.

## Results for Simulation Study

Tables 6 and 7 present the results of our simulation under $\rho_{\text{AB}}^{\ast} = 0.6$ and $0.0$, respectively. For the CBRE2 and ABRE2 models, we used two different Wishart priors for the covariance matrices: a noninformative $\text{Wishart}\left(\left(\begin{array}{cc}10 & 0\\ 0 & 10\end{array}\right), 2\right)$ and a weakly informative $\text{Wishart}(4\mathbf{R}^{\ast}, 4)$, where $\mathbf{R}^{\ast}$ is the true covariance matrix. We report $\text{Pr}(\widehat{\mu_{11}} > \widehat{\mu_{21}})$ in parentheses, along with the simulated Type I error and power; this probability is interpreted as the probability of an incorrect decision when $d_{21}^{\ast} = 1$ or $2$, and should be around 0.5 when $d_{21}^{\ast} = 0$. Using the true covariance matrix in the prior distribution could be overly optimistic, but we adopt the truth to investigate how much power could be gained with informative priors.

In Table 6, all models work fairly well when there are no missing data (“complete”). For Type I error, the LAREhom model performs poorly under the MAR and MNAR mechanisms, with very extreme $\text{Pr}(\widehat{\mu_{11}} > \widehat{\mu_{21}})$ values, very close to 0 (MAR) or 1 (MNAR). Power1 decreases under the MCAR mechanism, as expected given the loss of data, but our CBRE2 and ABRE2 models give slightly higher power than LAREhom. The LAREhom model gives extremely high Power1 under MAR, but too low under MNAR. Here, under MNAR the probability of an incorrect decision is 0.377 using LAREhom, while it is only 0.080 using CBRE2 and ABRE2. All models yield very high power when $d_{21}^{\ast} = 2$, except the LAREhom model under the MNAR mechanism. The fifth and sixth columns show that adopting weakly informative Wishart priors can improve power without severely damaging Type I error.
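The operating characteristics reported in these tables can be sketched as simple proportions over simulation replicates. The example below assumes that a replicate "rejects" when its 95% interval for *d*_{21} excludes zero, that *d*_{21} = *μ*_{21} − *μ*_{11}, and that Power1 refers to the case where the true difference is 1. The replicate generator is a stand-in using normal noise, not the report's actual simulation design.

```python
import random

def simulate_replicates(d_true, n_rep=4000, se=0.4, seed=0):
    """Toy replicates: (mu11_hat, mu21_hat, interval_lo, interval_hi)."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_rep):
        # Each arm estimate gets variance se^2 / 2, so d_hat has variance se^2.
        mu11_hat = rng.gauss(0.0, se / 2 ** 0.5)
        mu21_hat = rng.gauss(d_true, se / 2 ** 0.5)
        d_hat = mu21_hat - mu11_hat  # assumed: d_21 = mu_21 - mu_11
        reps.append((mu11_hat, mu21_hat, d_hat - 1.96 * se, d_hat + 1.96 * se))
    return reps

def summarize(reps):
    """Return (rejection rate, share of replicates with mu11_hat > mu21_hat)."""
    n = len(reps)
    reject = sum(1 for _, _, lo, hi in reps if lo > 0.0 or hi < 0.0) / n
    wrong_order = sum(1 for m11, m21, _, _ in reps if m11 > m21) / n
    return reject, wrong_order

# True difference 0: the rejection rate estimates the Type I error.
type1_error, pr_null = summarize(simulate_replicates(d_true=0.0))
# True difference 1: the rejection rate estimates power.
power1, pr_wrong = summarize(simulate_replicates(d_true=1.0))
```

With unbiased estimates, the rejection rate under a zero true difference stays near the nominal 0.05 and the ordering probability stays near 0.5; biased estimates push both away from those targets, which is the pattern described for LAREhom under MAR and MNAR.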

Table 7 shows that our methods have less benefit when the two outcomes are independent. In this case, the LAREhom model does not suffer as much on Type I error under the MAR and MNAR mechanisms, and its Power1 values are not extreme; it also gives slightly smaller $\text{Pr}(\widehat{\mu_{11}} > \widehat{\mu_{21}})$ values when $d_{21}^{\ast} = 1$ under MNAR than our CBRE2 and ABRE2 models. This is because our methods do not borrow much strength across outcomes when the correlation is close to zero. Compared with Table 6, CBRE2 and ABRE2 produce somewhat smaller power under severe missingness mechanisms than when the two outcomes were correlated.

Figure 6 shows density plots of the posterior medians of *d*_{21} from 1,000 simulated partially missing data sets under each of the three models with noninformative Wishart priors, when $\rho_{\text{AB}}^{\ast} = 0.6$ and $d_{21}^{\ast}$ is 0, 1, and 2 under the MCAR, MAR, and MNAR mechanisms. When the missingness does not depend on the data (MCAR), the posterior medians of *d*_{21} are unbiased across all three models, though ABRE2 gives slightly smaller estimator variances, suggesting smaller mean squared error (MSE). On the other hand, the MAR and MNAR mechanisms lead to huge positive or negative biases with the LAREhom model, resulting in large Type I error and extreme Power1 values. This bias depends on the choice of coefficients in Equation (9); for example, if we alter (9) to $\text{logit}(p_{i,\text{mis}}) = -4 - 2\bar{y}_{i12} + \bar{y}_{i22}$ for MAR, LAREhom gives 0.087 Power1 while CBRE2 and ABRE2 give 0.37 and 0.311, respectively. No matter which rule drives the missingness, it is clear that LAREhom models produce larger bias than our models when the missingness is not random and the two outcomes are correlated.
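To make the altered MAR rule concrete, the sketch below evaluates the missingness probability implied by the logistic equation quoted above. Only the coefficients (−4, −2, +1) come from the text; the function name, argument names, and example input values are hypothetical.

```python
import math

def missing_probability(ybar_12, ybar_22, intercept=-4.0, b1=-2.0, b2=1.0):
    """Missingness probability from
    logit(p) = intercept + b1 * ybar_12 + b2 * ybar_22."""
    eta = intercept + b1 * ybar_12 + b2 * ybar_22
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit

# Illustrative inputs only: with both means at 0 the intercept keeps p small;
# a more negative ybar_12 raises the chance of missingness.
p_baseline = missing_probability(0.0, 0.0)
p_low_y12 = missing_probability(-2.0, 0.0)
print(round(p_baseline, 4), round(p_low_y12, 4))
```

Because the probability depends on observed study means, studies with particular outcome values are more likely to have missing entries, which is what biases a model that ignores the missingness mechanism.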

Figure 7 displays the same density plots as Figure 6, but under ${\rho}_{\mathit{\text{AB}}}^{\ast}=0.0$. All three models deliver unbiased estimates under MCAR and MAR, but give somewhat biased estimates under MNAR, although the magnitudes of the bias are similar across models. Our CBRE2 and ABRE2 models tend to give slightly larger estimator variances. Here, with two uncorrelated outcomes, the missingness has little effect on the bias of the LAREhom estimators. Although our methods are not strikingly better than the existing LAREhom model in this idealized case, they do not surrender much in terms of Type I error and power, justifying their use across both dependent and independent scenarios.
