Stat Med. 2003 Dec 15;22(23):3671-85.

Comparison of multiple regression to two latent variable techniques for estimation and prediction.

Author information

Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA.


In epidemiology, psychology, sociology, and other social and behavioural sciences, researchers often encounter situations where many variables contribute to a particular phenomenon and where many of the predictor variables of interest are also strongly related to one another. Applying traditional multiple regression to all the predictor variables can then lead to problems with interpretation and multicollinearity. As an alternative to multiple regression, we explore the use of a latent variable model that can address the relationships among the predictor variables. We consider two methods of estimation and prediction for this model: multiple regression on factor score estimates, and structural equation modelling. The first method applies multiple regression to a set of predicted underlying factors (i.e. factor scores); the second is a full-information maximum-likelihood technique that incorporates the complete covariance structure of the data. In this tutorial, we explain the model and each estimation method, including how to carry out prediction. A data example is used for demonstration, in which respiratory disease death rates by county in Minnesota are predicted from five county-level census variables. A simulation study is performed to evaluate the efficiency of prediction using the two latent variable modelling techniques compared with multiple regression.
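
To make the first method concrete, the following is a minimal sketch of multiple regression on factor score estimates. It is not the authors' implementation. It assumes a principal-component extraction of the loadings from the predictor correlation matrix and regression-method factor scores, whereas the paper may use a different extraction such as maximum likelihood. The function name `factor_score_regression`, the simulated data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def factor_score_regression(X, y, n_factors):
    """Regress y on estimated factor scores of X (illustrative sketch)."""
    # Standardize the predictors so the analysis works on the correlation scale.
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    R = np.corrcoef(Z, rowvar=False)

    # Assumption: loadings come from the leading eigenpairs of R
    # (principal-component extraction), not from maximum likelihood.
    eigvals, eigvecs = np.linalg.eigh(R)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

    # Regression-method factor scores: F = Z R^{-1} Lambda.
    weights = np.linalg.solve(R, loadings)
    scores = Z @ weights

    # Ordinary least squares of the outcome on the factor scores.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(X_new):
        # Score new observations with the fitted weights, then apply the regression.
        Z_new = (np.asarray(X_new, dtype=float) - mu) / sd
        S_new = Z_new @ weights
        return np.column_stack([np.ones(len(S_new)), S_new]) @ coef

    return coef, predict

# Hypothetical simulated data: two latent factors driving five correlated predictors.
rng = np.random.default_rng(0)
n = 200
f = rng.normal(size=(n, 2))
Lambda = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0], [0.0, 0.8], [0.0, 0.9]])
X = f @ Lambda.T + 0.4 * rng.normal(size=(n, 5))
y = 1.0 + 2.0 * f[:, 0] - 1.0 * f[:, 1] + 0.5 * rng.normal(size=n)

coef, predict = factor_score_regression(X, y, n_factors=2)
pred = predict(X)
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print("intercept and factor coefficients:", coef)
print("in-sample R^2:", r2)
```

The sign and ordering of the estimated factors are arbitrary under this extraction, so the fitted coefficients need not match the simulation's values in sign; the predictions are unaffected. The second method, structural equation modelling, fits the measurement and regression parts jointly and is not shown here.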

[Indexed for MEDLINE]