Effects of ignoring clustered data structure in confirmatory factor analysis of ordered polytomous items: a simulation study based on PANSS

Jan Stochl; Peter B Jones; Jesus Perez; Golam M Khandaker; Jan R Böhnke; Tim J Croudace

doi:10.1002/mpr.1474

Int J Methods Psychiatr Res. 2016 Sep;25(3):205-19. doi: 10.1002/mpr.1474. Epub 2015 Jun 20.

Authors

Jan Stochl¹ ² ³, Peter B Jones⁴, Jesus Perez⁴, Golam M Khandaker⁴, Jan R Böhnke⁵ ⁶, Tim J Croudace⁴ ⁷

Affiliations

¹ Department of Psychiatry, University of Cambridge, Cambridge, UK. jan.stochl@york.ac.uk.
² Mental Health and Addiction Research Group (MHARG), Department of Health Sciences, University of York, York, UK. jan.stochl@york.ac.uk.
³ Department of Kinanthropology, Charles University in Prague, Prague, Czech Republic. jan.stochl@york.ac.uk.
⁴ Department of Psychiatry, University of Cambridge, Cambridge, UK.
⁵ Mental Health and Addiction Research Group (MHARG), Department of Health Sciences, University of York, York, UK.
⁶ Hull York Medical School (HYMS), University of York, York, UK.
⁷ Social Dimensions of Health Institute and School of Nursing and Midwifery, University of Dundee, Dundee, UK.

Abstract

Statistical theory indicates that hierarchical clustering by interviewers or raters needs to be considered to avoid incorrect inferences when performing any analyses including regression, factor analysis (FA) or item response theory (IRT) modelling of binary or ordinal data. We use simulated Positive and Negative Syndrome Scale (PANSS) data to show the consequences (in terms of bias, variance and mean square error) of using an analysis ignoring clustering on confirmatory factor analysis (CFA) estimates. Our investigation includes the performance of different estimators, such as maximum likelihood, weighted least squares and Markov Chain Monte Carlo (MCMC). Our simulation results suggest that ignoring clustering may lead to serious bias of the estimated factor loadings, item thresholds, and corresponding standard errors in CFAs for ordinal item response data typical of that commonly encountered in psychiatric research. In addition, fit indices tend to show a poor fit for the hypothesized structural model. MCMC estimation may be more robust against clustering than maximum likelihood and weighted least squares approaches but further investigation of these issues is warranted in future simulation studies of other datasets. Copyright © 2015 John Wiley & Sons, Ltd.
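The evaluation criteria named in the abstract (bias, variance and mean square error) can be illustrated with a toy Monte Carlo sketch. This is not the authors' simulation: it swaps the CFA of ordinal PANSS items for a much simpler problem, estimating a mean from continuous scores that share a rater effect. The design values (`n_raters`, `n_per_rater`, `rater_sd`) and the helper names are hypothetical and chosen only for illustration.

```python
import random
import statistics


def summarize(estimates, true_value):
    """Return (bias, variance, mse) of replicated estimates of one parameter."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - true_value
    # Population variance across replications, so that mse == bias**2 + variance.
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return bias, variance, mse


def simulate_naive_se(n_raters=10, n_per_rater=20, rater_sd=0.5, reps=1000, seed=1):
    """Toy design (assumed): score = shared rater effect + independent patient noise."""
    rng = random.Random(seed)
    means, naive_ses = [], []
    for _ in range(reps):
        scores = []
        for _ in range(n_raters):
            rater_effect = rng.gauss(0.0, rater_sd)  # shared by all of this rater's patients
            scores.extend(rater_effect + rng.gauss(0.0, 1.0) for _ in range(n_per_rater))
        means.append(statistics.fmean(scores))
        # Naive standard error: treats all scores as independent, ignoring raters.
        naive_ses.append(statistics.stdev(scores) / len(scores) ** 0.5)
    bias, variance, mse = summarize(means, true_value=0.0)
    return bias, variance, mse, statistics.fmean(naive_ses)


bias, variance, mse, mean_naive_se = simulate_naive_se()
empirical_se = variance ** 0.5  # actual spread of the estimate across replications
print(f"bias={bias:.3f} variance={variance:.4f} mse={mse:.4f}")
print(f"naive SE={mean_naive_se:.3f} empirical SE={empirical_se:.3f}")
```

Under these assumed settings the naive standard error should come out clearly smaller than the empirical spread of the estimate, which is the kind of distortion in standard errors the abstract reports for clustered data. The sketch says nothing about factor loadings, thresholds or fit indices, which require fitting an actual CFA model.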

Keywords: PANSS; factor analysis; hierarchical modelling; simulation.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computer Simulation
Data Interpretation, Statistical*
Factor Analysis, Statistical*
Humans
Psychiatric Status Rating Scales / statistics & numerical data*
