NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
J Gerontol A Biol Sci Med Sci. Author manuscript; available in PMC Feb 20, 2009.
Published in final edited form as:
PMCID: PMC2645650

Methodology, Design, and Analytic Techniques to Address Measurement of Comorbid Disease



Measurement of comorbidity affects all variable axes that are considered in health care research: confounding, modifying, independent, and dependent variable. Comorbidity measurement particularly affects research involving older adults because they bear the disproportionate share of the comorbidity burden.


We examine how well researchers can expect to segregate study participants into those who are healthier and those who are less healthy, given the variable axis for which they are measuring comorbidity, the comorbidity measure they select, and the analytic method they choose. We also examine the impact of poor measurement of comorbidity.


Available comorbidity measures make use of medical records, self-report, physician assessments, and administrative databases. Analyses using these scales introduce uncertainties that can be framed as measurement error or misclassification problems, and can be addressed by extant analytic methods. Newer analytic methods make efficient use of multiple sources of comorbidity information.


Consideration of the comorbidity measure, its role in the analysis, and analogous measurement error problems will yield an analytic solution and an appreciation for the likely direction and magnitude of the biases introduced.

All comorbidity measures segregate study participants into those who are healthier and those who are less healthy. The objective of this review is to examine how well researchers can expect to have achieved this goal, given the variable axis for which they are measuring comorbidity, the comorbidity measure they select, and the analytic method they choose.

Analytic Strategies for Measuring Comorbidity

We first describe strategies for constructing indices of comorbidity or multiple morbidities, focusing on their advantages and disadvantages (Table 1). A sum of the number of diagnoses from among a list of candidate diagnoses provides an ordinal score. This method has the advantage of conceptual simplicity and ease of data ascertainment, but there are disadvantages. First, all diagnoses are scored equivalently. For many analytic relations, different diseases as well as their severity will affect outcomes differently. This disadvantage can be addressed by weighting the contributions of different diseases, depending on their role in the analytic relationship. Weighting schemes can also take into account treatment. If, however, weightings are overly customized to specific diseases, the index will be less useful when applied in other settings. Second, analytic strategies often force a linear relationship with the ordinal scale across its entire range. A participant moving from the zero score to one comorbid disease could realize the majority of the comorbidity effect, with additional unit increases having a diminishing impact. Categorizing summed scores, rather than treating the index as an ordinal variable, addresses this disadvantage. The third disadvantage of a summed measure is that it ignores potentially important relationships between diseases that might differ from their simple sum. The interaction between chronic obstructive pulmonary disease and congestive heart failure might exceed the simple sum, whereas cardiovascular disease related to diabetes might be overweighted in an index that counts both independently. These problems may be reduced if a condition must reach a clinical threshold before being counted. In addition, individual diseases counted separately in an index might arise from a common cause, whether that cause is exogenous (e.g., tobacco smoke) or endogenous (e.g., inflammatory response). 
The impact of the clustering of individual diseases with a common pathologic mechanism may be lesser or greater than the simple sum. Furthermore, all of the extant measures—regardless of whether they are based on simple counts or a sum of individually weighted conditions—assume an additive relationship for the included diseases.
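
The scoring strategies described above (a simple count, a weighted sum, and a categorized score) can be sketched in a few lines of Python. The diagnosis names and weights below are hypothetical placeholders chosen for illustration; they are not the weights of the Charlson Index or of any other published measure.

```python
from typing import Dict, Iterable

# Hypothetical weights for illustration only; not taken from any published index.
ILLUSTRATIVE_WEIGHTS: Dict[str, int] = {
    "diabetes": 1,
    "copd": 1,
    "congestive_heart_failure": 2,
    "metastatic_cancer": 6,
}


def count_score(diagnoses: Iterable[str]) -> int:
    """Simple count: every candidate diagnosis contributes one point."""
    return sum(1 for d in set(diagnoses) if d in ILLUSTRATIVE_WEIGHTS)


def weighted_score(diagnoses: Iterable[str]) -> int:
    """Weighted sum: each diagnosis contributes its (assumed) weight."""
    return sum(ILLUSTRATIVE_WEIGHTS.get(d, 0) for d in set(diagnoses))


def categorize(score: int) -> str:
    """Collapse a summed score into categories so that an analysis
    is not forced to assume a linear effect across the whole range."""
    if score == 0:
        return "none"
    if score == 1:
        return "one"
    return "two_or_more"


patient = ["diabetes", "congestive_heart_failure"]
print(count_score(patient), weighted_score(patient), categorize(count_score(patient)))
```

Both scores remain additive: neither captures an interaction between, say, chronic obstructive pulmonary disease and congestive heart failure, which is the third disadvantage noted above.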

Table 1
Merits of Analytic Strategies and Data Sources for Comorbidity Measurement

Data Sources Available to Measure Comorbidity

All comorbidity indices rely on medical record information, patient self-report, clinical judgment, or large administrative databases. Regardless of the data source, errors can be introduced into the construction of the index during data collection. In addition, patients with cognitive impairment may underreport symptoms of other conditions (1) and may be seen less frequently by their physicians, resulting in an underrecognition or undertreatment of other conditions (2). Both of these biases will affect the accuracy of measurement of comorbidity, particularly in older populations, regardless of the data source.

Medical Records

Many comorbidity indices rely on medical records [e.g., Charlson Index (3), Cumulative Illness Rating Scale (4), Index of Coexistent Diseases (5), and Kaplan–Feinstein Index (6)]. An important advantage of medical records is the good correspondence between diagnosed conditions and the disease entities ordinarily included in comorbidity indices. Another important advantage is that medical records reflect the information available to clinicians treating the patient/study participant. As comorbidity indices are often meant to reflect the impact of coexistent diseases on medical decision making, this close correspondence assures that the index reflects the target concept.

Using medical records as a data source also has important disadvantages. First, review of records is a resource-intensive method of data collection. Second, medical record review requires patient consent and consent of the health care provider and/or the health care organization. Third, it usually requires validation substudies to assure sufficient intra-rater and inter-rater reliability. Fourth, medical records must be available over a sufficient period of time to assure that the comorbidity index can be accurately constructed. Fifth, the quality of the medical record data may be different for inpatient than for outpatient care.

Patient Self-Report

Some comorbidity indices have been developed to collect information directly from patients [e.g., the Comorbidity Symptom Scale (7), Geriatric Index of Comorbidity (8), or Total Illness Burden Index (9)]. Other comorbidity indices originally intended to use medical records as the data source have been adapted for patient self-report [e.g., the Katz adaptation of the Charlson Index to patient interview (10)]. Self-report of comorbidity correlates well with data collected by medical record review (10). Two advantages of self-report compared with medical record review are its more limited resource requirements and its potential for being more complete. Patients can be asked to recall their entire medical history, whereas medical records may be limited to a time period that does not include all relevant history. In contrast, in studies of large populations, or when patients are in hospitals or nursing homes, the cost of interviewing may be prohibitive. Furthermore, cognitive impairment can adversely affect recall accuracy. The availability of electronic medical records is increasingly affecting the balance of resource requirements between medical record review and patient recruitment and interview.

Despite the adequate correlation between comorbidity scores obtained from patient interview and from medical record review, the most important disadvantage of patient recall is its potential for measurement error. The greatest concern is that errors in recalling or reporting comorbidity data will correlate with errors in recall or reporting of other study variables. These dependent errors can substantially bias estimates of effect.

Clinical Judgment

A simple and efficient method of collecting comorbidity information is to collect an overall rating of patients’ health status from their physicians, such as the American Society of Anesthesiologists Index. The advantages of these ratings are: (a) simplicity, which translates to low resource requirements, (b) independence, which precludes dependent errors associated with errors in interview data, and (c) good correspondence with physician impressions that affect medical decision making.

The major disadvantage of these ratings is also their simplicity, because the simplicity masks the true complexity of comorbidity. A second disadvantage is that patient consent will be required to obtain the rating from the physician or to review medical records. If the rating is the only information obtained from physicians directly or from medical record review, then the efficiency of data collection will be poor.

Administrative Databases

Administrative databases, such as claims databases [e.g., the Diagnostic Cost Group/Hierarchical Condition Categories (11)] and pharmacy databases [e.g., the Chronic Disease Score (12)], have been used to construct comorbidity indices. These indices translate information gathered for an unrelated primary purpose to a secondary purpose of scaling comorbidity. The translation will inevitably be imperfect. For example, claims databases contain information intended to maximize reimbursement, so this information may inflate the burden of comorbid diseases. The quality of the claims data may also be better for inpatient services than for outpatient services. Use of pharmacy databases requires that all participants had uniform access to reimbursement for relevant pharmaceuticals and that all participants used only the pharmacy housing the database to obtain prescriptions. The advantages of using administrative data are: (a) fewer resources required to collect comorbidity data than medical record review and patient interview, particularly given very large study populations, and (b) independence of errors in measurement of comorbidity from errors in measurement of other study variables collected by other methods. An important disadvantage of administrative databases is that specialized expertise is required to obtain administrative data and to manipulate them. Certain scales derived from one data source can be adapted to administrative databases [e.g., the Deyo adaptation of the Charlson Index (13)].

The Minimum Data Set has been used to assess comorbidity in nursing home residents (14). This work combines the medical record, clinical rating, and administrative data perspectives. These data are used to assess comorbidity in the population of postacute patients leaving the hospital and returning home via a nursing home.

Variable Axes

There are four variable axes for which measurement of comorbidity might be required: confounder, modifier, exposure, and outcome. All measures of comorbidity are imperfect surrogates because of: (a) poor correspondence between the measure and the target conceptual entity, (b) the difficulty in determining whether comorbid conditions precede, are concurrent with, or postdate outcomes, and (c) data limitations. As a result, study participants can be misclassified. Below we review the expected effect of misclassification of comorbidity when it is used in each variable axis (see also Table 2).

Table 2
Implications of Nondifferential and Nondependent Comorbidity Mismeasurement by Variable Axis

Comorbidity as Confounder

When comorbidity is a confounder, and because there is no perfect measure, there will be imperfect control for confounding. The relative risk due to confounding measures the direction and magnitude of confounding:

RRconfounding = RRcrude / RRadjusted


When misclassification of comorbidity is nondifferential (i.e., rates of misclassification of comorbidity are expected to be homogeneous across strata of the dependent and independent variables) and nondependent (i.e., errors in classification of comorbidity are not expected to correlate with errors in classification of the dependent or independent variables), the relative risk due to confounding (RRconfounding) is expected to be biased toward the null. The result is that the adjusted estimate of effect (e.g., RRadjusted) is biased toward the crude estimate of effect (e.g., RRcrude). Under these circumstances, estimates of effect adjusted for the imperfect measure of comorbidity by modeling, pooling, or standardization will lie between the true adjusted effect and the unadjusted estimate of effect. Were comorbidity misclassified differentially or dependently, the error in the relative risk due to confounding would still be correctable, but the direction of the bias before correction would not be predictable.
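
The partial-adjustment behavior described above can be illustrated with an expected-value calculation. The sketch below takes the relative risk due to confounding to be the ratio of the crude to the adjusted relative risk (our reading of the definition), uses a Mantel-Haenszel risk ratio as one possible pooling method, and relies on entirely hypothetical inputs: a binary comorbidity with 50% prevalence, a true exposure relative risk of 2, and 80% sensitivity and specificity for the imperfect comorbidity measure.

```python
from itertools import product

# Hypothetical population parameters, chosen only for illustration.
P_C = 0.5                               # prevalence of comorbidity (the confounder)
P_E_GIVEN_C = {1: 0.6, 0: 0.2}          # probability of exposure in each true stratum
BASE_RISK, RR_E, RR_C = 0.05, 2.0, 4.0  # the true exposure effect is RR_E = 2


def expected_strata(sensitivity, specificity):
    """Expected [people, cases] proportions keyed by (observed comorbidity, exposure)."""
    strata = {(c_obs, e): [0.0, 0.0] for c_obs in (0, 1) for e in (0, 1)}
    for c, e in product((0, 1), repeat=2):
        p_c = P_C if c else 1 - P_C
        p_e = P_E_GIVEN_C[c] if e else 1 - P_E_GIVEN_C[c]
        risk = BASE_RISK * RR_E ** e * RR_C ** c
        # Nondifferential: classification depends only on true comorbidity status.
        p_pos = sensitivity if c else 1 - specificity
        for c_obs, p_obs in ((1, p_pos), (0, 1 - p_pos)):
            mass = p_c * p_e * p_obs
            strata[(c_obs, e)][0] += mass
            strata[(c_obs, e)][1] += mass * risk
    return strata


def crude_rr(strata):
    """Risk ratio ignoring comorbidity altogether."""
    n1 = strata[(0, 1)][0] + strata[(1, 1)][0]
    a = strata[(0, 1)][1] + strata[(1, 1)][1]
    n0 = strata[(0, 0)][0] + strata[(1, 0)][0]
    b = strata[(0, 0)][1] + strata[(1, 0)][1]
    return (a / n1) / (b / n0)


def mantel_haenszel_rr(strata):
    """Risk ratio pooled over strata of the observed comorbidity measure."""
    num = den = 0.0
    for c_obs in (0, 1):
        n1, a = strata[(c_obs, 1)]
        n0, b = strata[(c_obs, 0)]
        total = n1 + n0
        num += a * n0 / total
        den += b * n1 / total
    return num / den


perfect = expected_strata(1.0, 1.0)
imperfect = expected_strata(0.8, 0.8)
rr_crude = crude_rr(perfect)
rr_adjusted_true = mantel_haenszel_rr(perfect)
rr_adjusted_observed = mantel_haenszel_rr(imperfect)
# Roughly 3.25, 2.00, and 2.75 under these assumed inputs.
print(rr_crude, rr_adjusted_true, rr_adjusted_observed)
print("RRconfounding (true, observed):",
      rr_crude / rr_adjusted_true, rr_crude / rr_adjusted_observed)
```

Under these assumptions the estimate adjusted for the imperfect measure falls between the true adjusted value and the crude value, and the observed relative risk due to confounding is closer to 1 than the true one.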

In general, when comorbidity is prevalent, its relative risk due to confounding is non-negligible (15). There is opportunity for substantial bias even with nondifferential and nondependent misclassification because the relative risk due to confounding may be far from the null. When comorbidity is uncommon, however, its relative risk due to confounding will be negligible. The contrasting impact of comorbidity and its misclassification when comorbidity is prevalent, versus when it is rare, illustrates the importance of accurate measurement of comorbidity in older populations in whom comorbid disease is prevalent.

Comorbidity as Modifier

Researchers may wish to investigate whether the relationship between an independent and dependent variable differs depending on the level of comorbidity. For example, one might investigate the relationship between therapy and survival, and then stratify the analysis by levels of comorbidity to examine whether the effect is similar across levels. These analyses examine effect measure modification, interdependence, or interaction. Although these concepts have subtle differences, they share the potential for misleading results when either the exposure variable or the modifying variable is misclassified. When comorbidity is the modifying variable and is subject to misclassification, effect measure modification (or interdependence or interaction) may appear when in truth there is none (16). Conversely, true effect measure modification may be masked by the misclassification of the modifying variable.
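
The masking of true effect measure modification can be shown with a small expected-value sketch. It assumes, purely for illustration, that exposure is unrelated to comorbidity and that the baseline risk is the same in both true comorbidity strata, so the relative risk in each observed stratum is a weighted average of the true stratum-specific relative risks. All numeric inputs are hypothetical.

```python
def true_share_by_observed_level(prevalence, sensitivity, specificity):
    """Fraction of each observed stratum that truly has comorbidity."""
    pos_true = prevalence * sensitivity
    pos_false = (1 - prevalence) * (1 - specificity)
    neg_false = prevalence * (1 - sensitivity)
    neg_true = (1 - prevalence) * specificity
    return {
        1: pos_true / (pos_true + pos_false),
        0: neg_false / (neg_false + neg_true),
    }


def observed_stratum_rrs(prevalence, rr_without, rr_with, sensitivity, specificity):
    """Expected relative risk within each observed comorbidity stratum,
    under the simplifying assumptions stated in the text above."""
    share = true_share_by_observed_level(prevalence, sensitivity, specificity)
    return {
        level: share[level] * rr_with + (1 - share[level]) * rr_without
        for level in (0, 1)
    }


# True relative risks: 1.0 without comorbidity, 4.0 with comorbidity.
perfectly_classified = observed_stratum_rrs(0.5, 1.0, 4.0, 1.0, 1.0)
misclassified = observed_stratum_rrs(0.5, 1.0, 4.0, 0.7, 0.7)
print(perfectly_classified, misclassified)
```

With 70% sensitivity and specificity, the stratum-specific estimates move toward each other, so a fourfold contrast between strata appears much smaller than it is.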

Comorbidity as Exposure

Nondifferential, nondependent misclassification of a dichotomized measure of comorbidity as the exposure variable usually biases the estimate of effect toward the null. When comorbidity is analyzed as a continuous or ordinal variable, however, its mismeasurement or misclassification can have unpredictable consequences on the estimate of effect (17–19). If errors in classification of the independent variable are correlated with errors in classification of the dependent variable, the estimate of effect can be substantially biased, even with only small error rates (20), particularly if the exposure and outcome are rare. Dependent errors can be avoided by assuring that the exposure (comorbidity) and outcome are ascertained by different data collection methods. When both are ascertained by the same method, such as patient interview, dependent errors can easily account for all of an apparently important estimate of effect (21,22).
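
The bias toward the null for a dichotomized exposure can be checked with expected counts. The cohort sizes, risks, sensitivity, and specificity below are hypothetical values chosen for illustration, and the function name is ours.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio after nondifferential misclassification of a
    dichotomous exposure (here, comorbid versus not comorbid)."""
    # People classified as exposed: true positives plus false positives.
    obs_exp_n = sensitivity * n_exposed + (1 - specificity) * n_unexposed
    obs_exp_cases = (sensitivity * n_exposed * risk_exposed
                     + (1 - specificity) * n_unexposed * risk_unexposed)
    # People classified as unexposed: false negatives plus true negatives.
    obs_unexp_n = (1 - sensitivity) * n_exposed + specificity * n_unexposed
    obs_unexp_cases = ((1 - sensitivity) * n_exposed * risk_exposed
                       + specificity * n_unexposed * risk_unexposed)
    return (obs_exp_cases / obs_exp_n) / (obs_unexp_cases / obs_unexp_n)


true_rr = 0.20 / 0.10
rr_observed = observed_risk_ratio(1000, 1000, 0.20, 0.10,
                                  sensitivity=0.8, specificity=0.9)
print(true_rr, rr_observed)  # the observed value is about 1.6 versus a true value of 2.0
```

This calculation covers only the nondifferential, nondependent case; it says nothing about ordinal or continuous scores, or about dependent errors, for which the direction of bias is not guaranteed.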

Comorbidity as Outcome

Nondifferential and nondependent misclassification of comorbid disease status, when it is dichotomized as the outcome variable, biases difference estimates of effect toward the null. Ratio measures of effect will be unbiased, however, when specificity is perfect (23). Thus, investigators who wish to study comorbidity as a dependent variable ought to set a threshold for comorbid disease status and operationalize it as a dichotomous outcome. The threshold above which participants are classified as having comorbid disease ought to be high. A high threshold assures few false-positive cases, which is a requirement to achieve the advantage of the unbiased estimate of effect. In addition, the data collection methods selected should assure independence of comorbid disease scoring from other study variables.
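
A short calculation shows why perfect specificity protects ratio measures but not difference measures. The risks, sensitivities, and specificities are hypothetical, and the helper names are ours.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion classified as having the outcome."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def ratio_and_difference(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Observed risk ratio and risk difference under nondifferential
    misclassification of a dichotomous outcome."""
    r1 = observed_risk(risk_exposed, sensitivity, specificity)
    r0 = observed_risk(risk_unexposed, sensitivity, specificity)
    return r1 / r0, r1 - r0


# True risks of 0.20 and 0.10: true ratio 2.0, true difference 0.10.
rr_perfect_spec, rd_perfect_spec = ratio_and_difference(0.20, 0.10, 0.6, 1.0)
rr_imperfect_spec, rd_imperfect_spec = ratio_and_difference(0.20, 0.10, 0.6, 0.95)
print(rr_perfect_spec, rd_perfect_spec)      # ratio stays near 2.0, difference shrinks to about 0.06
print(rr_imperfect_spec, rd_imperfect_spec)  # both are biased toward the null
```

When specificity is perfect, sensitivity scales both observed risks by the same factor, which cancels in the ratio. Even a small false-positive rate breaks that cancellation, which is the rationale for a high classification threshold.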

New Directions in Measurement of Comorbidity

Sensitivity Analysis

All comorbidity measures are surrogates for an unmeasurable gold standard. For example, all comorbidity measures rely on the quality of diagnostic testing, which may vary with patient age, sex, system of care, health insurance type, race/ethnicity, or socioeconomic status. Differences in comorbidity related to these factors reflect the combination of real differences and differences attributable to variation in the quality of diagnostic testing. The deviation of the analysis from its ideal is a function of measurement error. Other sources of measurement error include errors in reporting, recording, or abstracting comorbid conditions. Quantitative sensitivity analysis is a solution to adjust estimates of effect for errors in classification (24) and to expand intervals so that they reflect total error, rather than only random error (25–28). Quantitative sensitivity analysis proves especially effective when errors in classification can be modeled as a function of well-measured variables, such as age or gender, in the analysis (29).
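
A minimal form of quantitative sensitivity analysis back-calculates the expected true counts from the observed counts under assumed values of sensitivity and specificity, then repeats the calculation over a range of plausible values. The sketch below uses a common back-calculation for a dichotomous variable; the observed counts and the scenarios are hypothetical, and it does not reproduce any specific procedure from the cited references.

```python
def corrected_exposed(observed_exposed, group_total, sensitivity, specificity):
    """Expected true number classified as comorbid in a group, given the
    observed number and assumed classification probabilities."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_exposed - (1 - specificity) * group_total) / denom


def corrected_risk_ratio(exp_cases, total_cases, exp_noncases, total_noncases,
                         sensitivity, specificity):
    """Risk ratio after correcting comorbidity status separately among
    cases and noncases (nondifferential misclassification assumed)."""
    a = corrected_exposed(exp_cases, total_cases, sensitivity, specificity)
    c = corrected_exposed(exp_noncases, total_noncases, sensitivity, specificity)
    b = total_cases - a
    d = total_noncases - c
    return (a / (a + c)) / (b / (b + d))


# Hypothetical observed data: 170 of 300 cases and 730 of 1700 noncases
# were classified as having comorbid disease.
observed_rr = (170 / (170 + 730)) / (130 / (130 + 970))
scenarios = [(0.8, 0.9), (0.9, 0.9), (0.7, 0.95)]
corrected = {
    (se, sp): corrected_risk_ratio(170, 300, 730, 1700, se, sp)
    for se, sp in scenarios
}
print(observed_rr)
for (se, sp), rr in corrected.items():
    print(f"sensitivity={se}, specificity={sp}: corrected RR = {rr:.2f}")
```

Reporting the corrected estimate across scenarios, rather than a single value, conveys how strongly the conclusion depends on the assumed classification probabilities. Drawing sensitivity and specificity from probability distributions extends the same idea to interval estimates that reflect systematic as well as random error.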

Multiple Informants

A second approach to assessing comorbidity takes advantage of the circumstance in which several measures of comorbidity are available in the same data set. Comparisons have been made between information sources, examining medical record-based sources versus interview sources (30–32) or administrative claims-based sources (33,34), and between different measures of comorbidity (31,34,35). Often all measures are marginally correlated with one another (30,31,33,34) and add little control for confounding when a second index supplements the first (31,33,35). Rather than searching for the best source of information or the best index, we have applied a multiple informants regression approach to measure the effect of comorbid disease on health outcomes and to control for its confounding impact (36). The multiple informants approach merges parallel streams of information into the regression equations from all independent sources or indices to yield a single measure of the effect of comorbid disease, and simultaneously allows an examination of whether the individual indices have different effects. The multiple informants approach is ideally suited for this context, because all indices measure the same fundamental concept and because no index of comorbid diseases or method of data collection has consistently proved superior (37).

Solutions From Other Fields

Research on comorbid disease has focused on developing measures appropriate for particular research questions, data collection methods, and patient populations. This focus has generated a proliferation of ever more specific measures. Two other observational sciences, health risk assessment and economics, have confronted similar problems and have responded by developing methods that quantify uncertainty. While this effort has not excluded research to improve methods and to improve measurement, the parallel development of methods to quantify uncertainty has yielded more defensible estimates of associations. Epidemiology has also recently re-emphasized uncertainty assessment, using techniques such as marginal structural models, multiple bias modeling, data augmentation, and multiple imputation. As many of the challenges confronted by researchers who wish to improve assessment of comorbidity can be framed as measurement error problems or classification error problems, these techniques might be used to address uncertainty arising from the imperfect measurement of comorbid disease.


The work was supported, in part, by an award from the National Institute on Aging.

We thank Dr. Brian D. Bradbury for completing and summarizing a thorough review of the literature related to measuring comorbidity.

This work was presented, in part, at a meeting of the National Institute on Aging’s Task Force on Comorbidity (Bethesda, Maryland, July 2004) and at the National Institute on Aging-funded Comorbidity Conference (Atlanta, Georgia, March 2005). A more complete white paper on this topic is available by request from the corresponding author.


1. McCormick WC, Kukull WA, van Belle G, Bowen JD, Teri L, Larson EB. Symptom patterns and comorbidity in the early stages of Alzheimer’s disease. J Am Geriatr Soc. 1994;42:517–521.
2. McCormick WC, Kukull WA, van Belle G, Bowen JD, Teri L, Larson EB. The effect of diagnosing Alzheimer’s disease on frequency of physician visits: a case-control study. J Gen Intern Med. 1995;10:187–193.
3. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–383.
4. Linn BS, Linn MW, Gurel L. Cumulative illness rating scale. J Am Geriatr Soc. 1968;16:622–626.
5. Greenfield S, Blanco DM, Elashoff RM, Ganz PA. Patterns of care related to age of breast cancer patients. JAMA. 1987;257:2766–2770.
6. Kaplan MH, Feinstein AR. The importance of classifying initial comorbidity in evaluating the outcome of diabetes mellitus. J Chronic Dis. 1974;27:387–404.
7. Crabtree HL, Gray CS, Hildreth AJ, O’Connell JE, Brown J. The Comorbidity Symptom Scale: a combined disease inventory and assessment of symptom severity. J Am Geriatr Soc. 2000;48:1674–1678.
8. Rozzini R, Frisoni GB, Ferrucci L, et al. Geriatric index of comorbidity validation and comparison with other measures of comorbidity. Age Ageing. 2002;31:277–285.
9. Greenfield S, Sullivan L, Dukes KA, et al. Development and testing of a new measure of case mix for use in office practice. Med Care. 1995;33(Suppl):AS47–AS55.
10. Katz JN, Chang LC, Sangha O, Fossel AH, Bates DW. Can comorbidity be measured by questionnaire rather than medical record review? Med Care. 1996;34:73–84.
11. Ash A, Porell F, Gruenberg L, Sawitz E, Beiser A. Adjusting Medicare capitation payments using prior hospitalization data. Health Care Financ Rev. 1989;10:17–29.
12. Von Korff M, Wagner EH, Saunders K. A chronic disease score from automated pharmacy data. J Clin Epidemiol. 1992;45:197–203.
13. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992;45:613–619.
14. Mor V. A comprehensive clinical assessment tool to inform policy and practice. Applications to the minimum data set. Med Care. 2004;42:50–59.
15. Flanders WD, Khoury MJ. Indirect assessment of confounding: graphic description and limits on effect of adjusting for covariates. Epidemiology. 1990;1:239–246.
16. Greenland S. The effect of misclassification in the presence of covariates. Am J Epidemiol. 1980;112:564–569.
17. Wacholder S. When measurement errors correlate with truth: surprising effects of non-differential misclassification. Epidemiology. 1995;6:157–161.
18. Sorahan T, Gilthorpe MS. Non-differential misclassification of exposure always leads to an underestimate of risk: an incorrect conclusion. Occup Environ Med. 1994;51:839–840.
19. Weinberg CR, Umbach DM, Greenland S. When will nondifferential misclassification of an exposure preserve the direction of a trend? Am J Epidemiol. 1994;140:565–571.
20. Kristensen P. Bias from nondifferential but dependent misclassification of exposure and outcome. Epidemiology. 1992;3:210–215.
21. Balfour JL, Kaplan GA. Neighborhood environment and loss of physical function in older adults: evidence from the Alameda County study. Am J Epidemiol. 2002;155:507–515.
22. Lash TL, Fink AK. Re: Neighborhood environment and loss of physical function in older adults: evidence from the Alameda County study. Letter to the editor. Am J Epidemiol. 2003;157:472–473.
23. Brenner H, Savitz DA. The effects of sensitivity and specificity of case selection on validity, sample size, precision, and power in hospital-based case-control studies. Am J Epidemiol. 1990;132:181–192.
24. Greenland S. Basic methods for sensitivity analysis and external adjustment. In: Rothman KJ, Greenland S, editors. Modern Epidemiology. Philadelphia, PA: Lippincott-Raven Publishers; 1998. pp. 343–358.
25. Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in observational epidemiologic data. Epidemiology. 2003;14:451–458.
26. Phillips CV. Quantifying and reporting uncertainty from systematic errors. Epidemiology. 2003;14:459–466.
27. Greenland S. Sensitivity analysis, Monte Carlo risk analysis, and Bayesian uncertainty assessment. Risk Analysis. 2001;21:579–583.
28. Greenland S. Multiple-bias modeling for analysis of observational data. J Royal Stat Soc A. 2005;168:267–306.
29. Lash TL, Silliman RA. A sensitivity analysis to separate bias due to confounding from bias due to predicting misclassification by a variable that does both. Epidemiology. 2000;11:544–549.
30. Mandelblatt JS, Bierman AS, Gold K, et al. Constructs of burden of illness in older patients with breast cancer: a comparison of measurement methods. Health Serv Res. 2001;36:1085–1107.
31. Silliman RA, Lash TL. Comparison of interview-based and medical-record based indices of comorbidity among breast cancer patients. Med Care. 1999;37:339–349.
32. Simpson CF, Boyd CM, Carlson MC, Griswold ME, Guralnik J, Fried LP. Agreement between self-report of disease diagnoses and medical record validation in disabled older women: factors that modify agreement. J Am Geriatr Soc. 2004;52:123–127.
33. Malenka DJ, McLerran D, Roos N, Fisher ES, Wennberg JE. Using administrative data to describe casemix: a comparison with the medical record. J Clin Epidemiol. 1994;47:1027–1032.
34. Newschaffer CJ, Bush TL, Penberthy LT. Comorbidity measurement in elderly female breast cancer patients with administrative and medical records data. J Clin Epidemiol. 1997;6:725–733.
35. Roos LL, Sharp SM, Cohen MM, Wajda A. Risk adjustment in claims-based research: the search for efficient approaches. J Clin Epidemiol. 1989;42:1193–1206.
36. Lash TL, Thwin S, Horton NJ, Guadagnoli E, Silliman RA. Multiple informants: a new method to assess breast cancer patients’ comorbidity. Am J Epidemiol. 2003;157:249–257.
37. Yancik R, Ganz PA, Varricchio CG, Conley B. Perspectives on comorbidity and cancer in older patients: approaches to expand the knowledge base. J Clin Oncol. 2001;19:1147–1151.