NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Alzheimers Dement. Author manuscript; available in PMC Jan 1, 2012.
Published in final edited form as:
PMCID: PMC3044596

Reducing case ascertainment costs in US population studies of Alzheimer's disease, dementia, and cognitive impairment—Part 1*


Establishing methods for ascertainment of dementia and cognitive impairment that are accurate and also cost effective is a challenging enterprise. Large population-based studies, often using administrative data sets, offer relatively inexpensive but reliable estimates of severe conditions, including moderate to advanced dementia, that are useful for public health planning, but they can miss less severe cognitive impairment, which may be the most effective point for intervention. Clinical and epidemiological cohorts, intensively assessed, provide more sensitive detection of less severe cognitive impairment but are often costly. Here, several approaches to ascertainment are evaluated for validity, reliability, and cost. In particular, the methods of ascertainment from the Health and Retirement Study (HRS) are described briefly, along with those of the Aging, Demographics, and Memory Study (ADAMS). ADAMS, a resource-intensive sub-study of the HRS, was designed to provide diagnostic accuracy among persons with more advanced dementia. A proposal to streamline future ADAMS assessments is offered. Also considered are decision tree, algorithmic, and web-based approaches to diagnosis that can reduce the expense of clinical expertise and, in some contexts, can reduce the extent of data collection. These approaches are intended for intensively assessed epidemiological cohorts. The goal is valid and reliable detection with efficient and cost-effective tools.

1. Introduction

Alzheimer's disease (AD) is an enormous public health problem that is expected to markedly increase in the coming decades with the aging of the post-World War II generation in the United States and many other countries. Developing strategies to delay the onset of the signs and symptoms of AD is critical for disease prevention [1,2]. While longitudinal cohort studies and other population surveys have contributed much to the current knowledge, very large clinical trials will likely be needed in the future. In fact, there will be upward cost pressures on the budgets of the longitudinal cohort studies, the population surveys, and the clinical trials, particularly ones emphasizing preclinical and mild impairment, as more biomarker technology is employed to assist in the diagnosis of AD and other age-related causes of cognitive impairment, and as sample sizes increase to accommodate the need to study subjects under age 65 years in addition to those age 65 years or older. Another major contributor to the costs of cohort studies and clinical trials is the cost of clinical diagnosis. Several groups across the country have investigated a variety of approaches to rein in such costs without sacrificing reliability or validity.

This article and a companion one [3] pertain to reducing case ascertainment costs for large cohort studies, and are applicable to primary and secondary prevention trials. Sections 2 and 3 pertain to the Health and Retirement Study (HRS), a nationally representative cohort study of health, retirement, and aging. In Section 2, Weir draws on his experience with the HRS to emphasize the need for cost-effective methods of case ascertainment in population-based studies of prevalence and burden of disease. In Section 3, Wallace, Langa, and Plassman present a possible alternative to the current methods of dementia assessment within the HRS, one that relies more fully on data collection by telephone and other electronic means and on more intensive acquisition of available clinical records. Sections 4, 5, and 6 are more general and focus on aspects of the diagnostic process itself. In Section 4, Wilson and Bennett illustrate the utility of a cost-efficient decision tree approach that combines computer-based summaries of neuropsychological performance tests with expert clinical judgment to generate diagnoses of cognitive impairment, dementia, and AD, using data from the Rush Religious Orders Study (ROS) and the Rush Memory and Aging Project (MAP). In Section 5, Duara and Loewenstein demonstrate how a valid consensus diagnosis of dementia and pre-dementia states can be achieved with greater reliability and considerably reduced effort and cost by avoiding the traditional consensus deliberation and using instead a simple algorithm that combines an independent neuropsychological diagnosis and a functional assessment by a clinician. In Section 6, Ganguli offers a novel, web-based approach to diagnosis and diagnostic consensus. This approach avoids the logistic costs of, and saves time over, the typical live “consensus conference” of experts; it adds standardization, allows reliability to be monitored, and provides opportunities for finer-grained analyses of the components of expert diagnosis. Finally, in Section 7, the Discussion, Sano provides a synthesis of the material laid out in the earlier sections and offers some perspective of her own.

2. Need for cost-effective, case-ascertainment methods in surveys of dementia: Experience of the Health and Retirement Study (HRS)

Population surveys play an important role in improving scientific understanding of the causes, consequences, and morbidity levels of AD, dementia, and cognitive impairment. These surveys are particularly valuable for understanding the context in which the disease occurs and progresses, including its interaction with other conditions and its impact on the families of those affected. Survey planners, however, must be concerned about the participation of the cognitively impaired if they aim to represent adequately that part of the older population. More than any other condition or characteristic, cognitive decline directly affects the ability of a potential subject to participate in a survey because—by their nature—surveys are cognitively demanding conversations.

An important issue in finding cost-effective, case-ascertainment solutions is deciding what should be measured. AD, dementia, and cognitive impairment have multiple etiologies and are complex diagnostic categories involving social functioning as well as cognitive ability. Identifying persons with outright dementia and sub-categorizing those attributable to AD would lead to one sort of study design. Seeking to identify milder cognitive impairments and/or to potentially identify all cases of AD (mild to severe) requires a different design. Because of the much larger number of persons with milder forms of cognitive impairment, designs to capture them are necessarily more expensive unless the assessment itself can be made much less costly.

In many surveys, clinical diagnosis involves a combination of cognitive testing, informant reporting, and clinical judgment. Survey planners must decide how much weight to place on each of those. Clinical judgment is the most difficult and costly element to incorporate into surveys. Informant reports raise concerns about inter-rater reliability (i.e., some informants may report differently or have different knowledge of the subject). Cognitive testing at a single point in time also faces issues of reliability, especially if variability in cognitive performance increases with cognitive decline. Moreover, some subjects may not wish to be tested precisely because they know their abilities have declined.

The HRS began in 1992 with a cohort of individuals age 51–61 years and was supplemented in 1993 with a cohort of individuals age 70 years or older, then referred to as the AHEAD Study. Both studies included a cognitive battery [4]. New cohorts were added in 1998 to make the combined study representative of the US population born before 1948, and in 2004 another new cohort made it representative of the US population born before 1954. In this section, the HRS will be used to comment on three critical case-ascertainment issues faced in surveys of AD, dementia, and cognitive impairment in the United States. The issues include: nursing home coverage, selective non-participation because of cognitive difficulties, and determination of diagnostic status. Each issue relates to potential bias of survey results. For the third issue, cost implications are also briefly considered.

In some community-based studies, especially cross-sectional studies, the exclusion of nursing home residents is done for practical reasons but at the risk of bias because that group may include a substantial part of the cognitively impaired population. The HRS at baseline did exclude nursing home residents. Currently, however, the HRS is a longitudinal study, following all participants through the end of life. Sample members are followed even if they subsequently enter nursing homes. As a result, the HRS now fully represents nursing home residents.

A second major contributor to bias is selective non-participation by cognitively impaired elderly. The HRS interview is cognitively demanding both for its length (75 minutes or more) and the complexity of its multi-disciplinary content. In order to maintain coverage of the cognitively impaired, the HRS makes use of proxy interviews. If a sample member is unable or unwilling to participate in the HRS interview, a proxy is sought who can answer on his or her behalf. Most often this is a spouse or, in the absence of a spouse, another family member. The importance of this can be seen in Fig. 1 which maps response rate in the first follow-up wave against cognitive score in the starting (baseline) wave. The cognitive score is a measure of total recall (sum of words recalled at immediate plus delayed recall from a list of ten). Including the proxy interviews in HRS, there is little relationship between cognitive score and continued participation. The English Longitudinal Study of Aging (ELSA) is, by design, quite similar to the HRS in its content and age coverage. However, proxies are generally not used. In ELSA, the rates of non-response are very much related to prior cognitive measures. Finally, if the proxy interviews from HRS were excluded (as if there had been no response), the HRS and ELSA curves look very much alike.

Fig. 1
The role of proxy interviewing in maintaining participation of the cognitively impaired: The Health and Retirement Study (HRS) versus the English Longitudinal Study of Aging (ELSA). 'Total Recall at Starting Wave' refers to the sum of words recalled at ...

The third case-ascertainment issue is the difficulty of determining diagnostic status. Many chronic conditions are commonly diagnosed in the healthcare system, and survey participants can and do report them accurately. However, cognitive impairment is not systematically diagnosed, and many people with diagnosable impairment either do not know or do not report it. The advantage of proxy interviews is that persons with cognitive impairment do not disappear from the study. Many aspects of their lives, including their other health conditions, use of medical services and informal care, and economic circumstances, can be reported by proxies. What is lost, however, is the direct measurement of their cognitive abilities. The HRS uses several proxy-reported measures, including the Jorm Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE), to assess the cognitive status of a sample member via a proxy interview [5].

Both the HRS cognitive score for self-respondents, and the IQCODE for proxied cases, provide a useful ranking of participants by cognitive status that can be used for important questions like the costs to the family of formal and informal care for persons with cognitive impairment [6,7]. The survey measures are, however, far short of a medical diagnosis of AD or dementia. To bridge that gap, an HRS supplemental study known as the Aging, Demographics, and Memory Study (ADAMS) was undertaken. In ADAMS, the HRS cognitive and proxy measures were used to stratify the sample of subjects above age 70 years by cognitive status and then participants were selected within each stratum to receive a comprehensive in-home evaluation leading to a research diagnosis of dementia, cognitive impairment not dementia (CIND), or no cognitive impairment [8]. ADAMS found a prevalence of 13.7% demented and 22.2% CIND in the population age 71 years or older [9,10]. AD was identified as a subtype of dementia at a prevalence of 9.7%.
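
The way a stratified subsample of this kind can be turned into a population prevalence estimate may be easier to see with a small worked example. The following is a minimal sketch, not the ADAMS estimation procedure: the strata, population counts, and case counts are all hypothetical, and real survey estimation would also handle non-response and sampling error. Each stratum's observed prevalence is weighted by that stratum's share of the source population.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    population: int  # persons in the source cohort in this stratum (hypothetical)
    assessed: int    # persons who received the full evaluation (hypothetical)
    cases: int       # assessed persons diagnosed with dementia (hypothetical)


def weighted_prevalence(strata):
    """Weight each stratum's observed prevalence by its population share."""
    total = sum(s.population for s in strata)
    return sum((s.population / total) * (s.cases / s.assessed) for s in strata)


def unweighted_prevalence(strata):
    """Naive pooled prevalence, which ignores the oversampling of impaired strata."""
    return sum(s.cases for s in strata) / sum(s.assessed for s in strata)


# Hypothetical strata defined by prior survey cognitive status.
strata = [
    Stratum("low cognitive score", population=1000, assessed=200, cases=120),
    Stratum("middle cognitive score", population=3000, assessed=200, cases=30),
    Stratum("high cognitive score", population=6000, assessed=200, cases=4),
]

prevalence = weighted_prevalence(strata)
naive = unweighted_prevalence(strata)
print(f"weighted: {prevalence:.3f}, unweighted: {naive:.3f}")
```

In this made-up example the pooled figure is more than double the weighted one, because the low-scoring stratum is deliberately oversampled.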

The ADAMS diagnostic categories correspond to distinct levels of morbidity and this can be seen in outcomes following the ADAMS assessment. At the next HRS wave in 2004, daily care hours were 0.7 for normal cognition, 2.8 for CIND, and 7.3 for the demented. Mortality risk was 2.6 times higher for CIND versus normals, and 5.7 times higher for demented versus normals. The CIND group, which is roughly twice as large as the demented group, is thus very different from both the normal cognition group and those with diagnosed dementia.

ADAMS provides useful insight into the strengths and limitations of self-report in the HRS. Of those diagnosed with dementia in ADAMS, only 40% had previously reported in HRS that they had been diagnosed with a memory-related disease (and only 4% of those with CIND). During the ADAMS assessment, only 47% of individuals with dementia and 7% of individuals with CIND reported (via proxy respondents) that they had seen a doctor for memory problems. This may reflect reporting bias at least as much as it does under-diagnosis or under-treatment. Another study found 85% of ADAMS cases diagnosed as demented had a Medicare claim with a dementia-related diagnosis [11]. The percentage was much lower for individuals with CIND.

At the present time, with limited medical treatment options available for cognitive impairment, the use of self-report survey measures is clearly inadequate. Survey-based cognitive measurements, on the other hand, can be valuable. The HRS-based cognitive status strata used to draw the ADAMS sample had a strong correlation with the eventual ADAMS diagnosis (ADAMS assessments were blinded to the sample strata), accounting for half of the variance in diagnosis. There is an enormous difference in cost between adding a few cognitive measures to a survey like HRS, and conducting a rigorous diagnostic protocol like ADAMS. The challenge ahead is to find more cost-effective designs (see Section 3).

It seems likely that the way forward for case ascertainment in surveys will involve a multi-stage screening process, or what might also be termed an adaptive testing process. Many persons can be determined to be unimpaired with relatively minimal testing. Many definite dementia cases may be identifiable through Medicare records or relatively inexpensive follow-up interviews with informants. The difficult cases are those at the borderlines between normal and mild impairment and between mild impairment and dementia. Those are the cases for which the most extensive testing will be needed for definitive ascertainment.
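
The multi-stage idea can be sketched as a simple triage rule. This is an illustration only: the screening scale, the two cut points, the borderline margin, and the function name `triage` are all hypothetical and are not taken from the HRS or ADAMS. The point is that the costly full assessment is reserved for scores near a boundary.

```python
def triage(screen_score, dementia_claim, low_cut=8, high_cut=14, margin=2):
    """Return the next step for one participant.

    screen_score: brief cognitive screen, higher is better (hypothetical scale).
    dementia_claim: True if available records carry a dementia diagnosis.
    low_cut / high_cut: hypothetical boundaries between dementia and mild
        impairment, and between mild impairment and normal.
    margin: scores this close to a boundary count as borderline.
    """
    near_border = (
        abs(screen_score - low_cut) <= margin
        or abs(screen_score - high_cut) <= margin
    )
    if near_border:
        return "full assessment"       # the difficult, borderline cases
    if screen_score > high_cut:
        return "classify unimpaired"   # minimal testing is enough
    if screen_score < low_cut and dementia_claim:
        return "classify dementia"     # records corroborate the low score
    return "informant follow-up"       # inexpensive second stage


for score, claim in [(20, False), (13, False), (3, True), (3, False)]:
    print(score, claim, "->", triage(score, claim))
```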

3. An alternative protocol to study dementia occurrence in the Health and Retirement Study (HRS)

Over the past decade, the HRS has included a population sub-study of AD, dementia, and cognitive impairment among persons age 71 years or older [12]. The sub-study, ADAMS (see Section 2), has produced important findings on the prevalence and incidence of AD, vascular dementia, dementia, and cognitive impairment in the US population [9,10]. In addition, data are emerging from ADAMS on the antecedents and outcomes of the dementia syndrome [13–16]. ADAMS, with its national coverage and comprehensive case-finding approach, is unique in the United States. Other US population-based studies with comparable aims and credible case-finding strategies are confined to local communities.

The ADAMS approach to case-finding includes home visits by a nurse and a neuropsychological technician, who obtain, among other things, a clinical and cognitive history from both participants and proxy respondents and perform extensive neuropsychological and neurological testing. The methods have been previously described [8], and the data from ADAMS are available on the HRS web site (www.hrsonline.isr.umich.edu). The data collected at the home visits allow an in-depth assessment of cognitive function and illnesses, albeit without neuro-imaging or a standard battery of laboratory tests.

Because ADAMS is a subsample of the HRS, it has the great advantage of having a nationally-referent study population, as well as possessing a great deal of prior participant information, including dementia screening instrument findings. However, despite these important strengths, the methodology of ADAMS is resource-intensive, requiring a substantial amount of time to arrange the home visits in advance. There are usually attendant airfare, hotel, and staff costs to conduct the home-based study evaluations. In addition, perhaps not unexpectedly given the advanced age and frequent cognitive impairment of participants, the ADAMS home visit participation rate of 56% [9] was lower than hoped. Thus, despite the detailed information collected in ADAMS, more efficient methods for widely distributed populations would allow a lower cost alternative when resources are scarce.

In this section, one such possibility is described: a plan to do more assessment by telephone. The plan, which involves six steps, would require future validation of the proposed methods. The steps are given below and are also depicted in Fig. 2. They attempt to address special issues that begin with the HRS source population. Essentially, designated ADAMS participants would be treated like other participants in the general HRS protocol. That is, except for possible validation visits, no additional home visits would be conducted.

Fig. 2
Subject progress through the alternative protocol for the Aging, Demographics, and Memory Study.

Step 1. Sample participants would be reached first by telephone. Consents would be obtained from the primary participant and one proxy respondent. A preliminary interview would then be conducted with both the primary and proxy respondents, in order to assist in gathering ancillary information and available community medical records. These records might include primary and referral physician records, and insurance claims data, including from Medicare if available.

Step 2. The current physical and cognitive health of the primary respondent would be ascertained using screening items that cover such matters as being in bed more than half of each day, use of analgesics and other psychotropic medications, and general adequacy of food intake.

Step 3. Further evaluation of the primary respondent for delirium, mental illness, and other impediments to interviewability would be performed using standardized instruments. Two of the most important conditions that may mimic dementia are depression and delirium. Both conditions can be reliably screened for using proxy respondents and established instruments [17,18], particularly emphasizing levels of alertness and consciousness. Determination of, or at least screening for, major concurrent psychological and psychiatric conditions is paramount, since they may be causes of “pseudo-dementia” as well as results of the dementing processes [19,20].

If for any reason the primary respondent is deemed unsuitable for cognitive testing or more standardized data collection, the remainder of the information would be collected from the proxy respondent(s), as is done in the full ADAMS protocol. Assembled preliminary information would then be evaluated to decide the remaining steps.

Step 4. If needed, the proxy respondent would be fully interviewed to obtain a clinical and cognitive history of the primary participant, and all additional relevant clinical records would be sought. A proxy cognition evaluation would be conducted if indicated, and possible additional proxy respondents would be identified for interview if additional observations would be informative. Data from prior HRS waves could be particularly important here to provide context to the new data collection. For example, the trajectory of decline in instrumental activities of daily living (IADLs) could help validate new information collected on the emergence of dementia [21].

Step 5. If feasible, the primary respondent would be interviewed, covering as much of the full ADAMS protocol as can be adapted for telephone administration. Additionally, if the respondent has a computer available, cognitive testing via the Internet could be explored and could extend testing to tasks that require visualization. Even without visual testing, many cognitive tests have been adapted for telephone use, including assessment of executive function. It may be possible to hone the testing protocol by evaluating various versions or possibly seeking local professional psychological testing services. In addition, a personal history of neurological as well as cognitive symptoms and impairment would be sought, including conditions and exposures relevant to the onset or progression of dementia, such as hypertension, stroke, alcoholism, head trauma, Parkinson's disease, cognitive impairment acquired early in life, and certain hazardous occupational or other environmental exposures.

Step 6. In the final phase of the evaluation, an adjudication panel would review collected information, at first blind to medical records and reports of community diagnoses, and then using all available information, as in the parent ADAMS protocol, yielding the research cognitive diagnosis for analytical purposes.
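
The branching in the six steps above can be summarized in a short routing sketch. It is a simplification under stated assumptions: the function name `plan_collection`, its three boolean inputs, and the step labels are hypothetical, and the real decisions (for example, what makes a respondent "unsuitable") would rest on the standardized instruments and staff judgment described in the text.

```python
def plan_collection(primary_suitable, proxy_needed, has_computer):
    """List the data-collection steps for one participant.

    primary_suitable: outcome of the Step 3 screen (hypothetical flag).
    proxy_needed: True if a full proxy interview is judged necessary.
    has_computer: True if web-based testing can be offered.
    """
    steps = [
        "Step 1: telephone contact, consent, preliminary interviews",
        "Step 2: physical and cognitive health screening items",
        "Step 3: screen for delirium, depression, other impediments",
    ]
    # Step 4 applies when the primary respondent cannot be tested,
    # or whenever fuller proxy information is judged necessary.
    if proxy_needed or not primary_suitable:
        steps.append("Step 4: full proxy interview and clinical records")
    # Step 5 applies only if the primary respondent can be interviewed.
    if primary_suitable:
        steps.append("Step 5: telephone interview of primary respondent")
        if has_computer:
            steps.append("Step 5 (optional): web-based cognitive testing")
    steps.append("Step 6: adjudication, first blinded, then with all records")
    return steps


for line in plan_collection(primary_suitable=True, proxy_needed=False, has_computer=True):
    print(line)
```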

It is likely that this 6-step alternative approach to data collection in ADAMS will yield substantial data collection cost savings for each target respondent studied. While a formal cost evaluation was not conducted, the savings in travel-related staff and logistical expenses would likely decrease the overall cost by about 20%. It is possible that staff time related to conducting telephone interviews would increase, so the actual cost savings would have to be evaluated through a pilot study.

Despite the efficiencies that are likely to be realized by the above proposal, there are potential important limitations that should be considered. (1) In most instances there will be no direct professional observation of the participant's level of illness or responsiveness to test items or clinical condition. (For example, there would be no direct observation of participant use of external cues during testing—such as writing down items.) (2) No new biomarker specimen could be directly obtained, although clinical specimens of participants could be obtained by other means. (3) Testing may be more difficult than in person if the respondent has auditory disability or has other difficulty using the telephone. (4) There may be a few desirable cognitive tests that are impractical to administer, such as those requiring direct and visual participation (writing or drawing), and for which computer-aided or proxy-administered testing is not feasible. (How much diagnostic information would be lost must be evaluated in validation studies.) (5) No neurological observation or examination could be performed. This will likely make diagnosis of some dementia sub-types more difficult.

Clearly there are some potentially important limitations associated with these proposed survey methods, but how different population prevalence estimates would be vis-à-vis the full ADAMS methodology is unclear, and a well-designed validation/pilot study is indicated. A hybrid protocol is also possible: when dementia status is uncertain, a home visit with an elaborated protocol could be conducted. Alternatively, in this circumstance, participants could be referred to the nearest appropriate medical center for a fuller evaluation, given that early diagnosis of previously undiagnosed dementia may be of value to the participant and to the family. In the future, new biomarkers related to dementing illnesses may aid in the remote diagnosis of this important, disabling and fatal disorder. In that regard, the availability in the HRS cohort of genome-wide SNP determinations may allow additional diagnostic power.

4. Cost-efficient approach to dementia diagnosis in epidemiologic cohort studies

Longitudinal studies of AD are expensive due in part to the costs associated with clinical classification of AD, dementia, and mild cognitive impairment (MCI). The standard model adopted in most studies is comparable to evaluations performed in tertiary care clinical settings. The core elements of the clinical evaluation—history, neurological examination, and cognitive testing—are supplemented with other procedures including blood work, brain scan, informant interview, and diagnostic case conference. However, the needs of longitudinal research differ from those of clinical practice. In particular, it is imperative to maintain uniformity in procedures across clinicians and time to minimize random variability in clinical classification and to allow for examination of possible change in disease occurrence over time [22]. This is essential to ensure that the relation between an exposure and outcomes, in either observational studies or intervention trials, is not the result of variability in the diagnostic process. Further, procedures should be transparent to facilitate investigation of the effect of specific criteria on diagnostic classification and comparison of findings across studies performed in different cohorts. Clinical decision making systems with these properties have been developed to diagnose psychiatric conditions [23,24]. In the early 1990s at the Rush Alzheimer's Disease Center (Chicago, Illinois), a clinical classification system was developed to enhance uniformity, transparency, and efficiency in the diagnosis of AD, dementia, and MCI. Over more than 15 years, this clinical classification system has been implemented in several ongoing longitudinal cohort studies involving thousands of participants and tens of thousands of clinical evaluations.

The standard clinical evaluation used in most longitudinal studies at Rush consists of a structured medical history, neurological examination, and cognitive performance testing. Data are collected on laptop computers with forms programmed in Blaise, a Pascal-based data entry system. The medical history focuses on conditions with the potential to impair cognitive function such as depression, cardiovascular disease, and head injury. A complete neurological examination is administered by specially trained nurses [25]. A battery of cognitive tests is administered in an approximately 45-minute session by a research assistant. The evaluation does not include an informant interview, blood work, or brain scan, procedures often included in dementia evaluations conducted at tertiary care centers, nor are routine case conferences employed.

A key issue in clinical classification of dementia by the criteria of the National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA criteria) [26] is determining whether functioning in different cognitive domains is impaired. To enhance uniformity in these determinations across clinical decision makers and time, an algorithm was developed for rating impairment in five cognitive domains: orientation, attention, memory, language, and visuospatial ability [27,28]. The algorithm was designed to mimic the expert clinical judgments of an experienced neurologist (D.A.B.) and neuropsychologist (R.S.W.). Cutoff scores were selected to identify impairment on 11 widely used cognitive measures at four educational levels (0 to 7 years, 8 to 11 years, 12 to 15 years, 16+ years), and rules were developed for converting the test impairment data to ratings of impairment in the five cognitive domains. The algorithm was then pilot tested against expert judgment. Because the agreement was far from perfect, the cutoff scores and algorithm rules were adjusted, its agreement with clinical judgment was retested, and the process was repeated until there was adequate agreement between the algorithm and clinicians.
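
A minimal sketch of education-adjusted cutoffs feeding a domain rating follows. Only the four education bands come from the text; the test names, the cutoff values, the mapping of tests to the memory domain, and the "at least one impaired test" rule are hypothetical placeholders, since the actual cutoffs and conversion rules are not given here.

```python
import bisect

# Lower bounds of the upper three education bands described in the text:
# 0 to 7, 8 to 11, 12 to 15, and 16+ years.
BAND_STARTS = [8, 12, 16]


def education_band(years):
    """Map years of education to a band index 0..3."""
    return bisect.bisect_right(BAND_STARTS, years)


# Hypothetical cutoffs per band: a score below the cutoff counts as impaired.
CUTOFFS = {
    "word_list_recall": [3, 4, 5, 6],
    "story_recall": [4, 5, 6, 7],
}

# Hypothetical assignment of tests to one cognitive domain.
DOMAIN_TESTS = {"memory": ["word_list_recall", "story_recall"]}


def domain_impaired(domain, scores, years, min_impaired_tests=1):
    """Rate a domain impaired if enough of its tests fall below cutoff."""
    band = education_band(years)
    impaired = [
        test for test in DOMAIN_TESTS[domain]
        if scores[test] < CUTOFFS[test][band]
    ]
    return len(impaired) >= min_impaired_tests


scores = {"word_list_recall": 4, "story_recall": 6}
print(domain_impaired("memory", scores, years=10))  # band 1 cutoffs
print(domain_impaired("memory", scores, years=16))  # band 3 cutoffs
```

The same raw scores can be rated differently at different education levels, which is the purpose of banded cutoffs.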

After a clinical evaluation is completed, the neuropsychologist reviews the cognitive test data plus information on years of education and sensory or motor impairment and then either agrees or disagrees with the algorithm's impairment ratings for each of the 5 cognitive domains. In the event of disagreement, the neuropsychologist provides a new rating and the reasons for disagreement (e.g., visually impaired). Upon completion of this process, the cognitive test results are considered consistent with dementia if at least 3 of the 5 domains are impaired (and with AD if 1 of those domains is memory) and inconsistent if 0 or 1 domain is impaired. If 2 domains are impaired, the neuropsychologist rates dementia and AD as present (probable or highly probable) or absent (possible or not present). Subsequently, an experienced physician or nurse clinician reviews all clinical data and briefly examines the participant before determining whether the person has experienced a meaningful decline from a previously higher level of cognitive functioning. This clinical determination is based on several factors, particularly self-report of memory impairment, previously shown to be related to AD pathology in the Religious Orders Study [29], and performance on cognitive testing that is lower than expected for a given educational level, as summarized by the educationally adjusted impairment ratings for cognitive tests and domains. If necessary, the system can be implemented in the absence of the brief clinical evaluation.
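
The domain-count rule in this paragraph can be written out as a short sketch. The function name and return labels are illustrative, not part of the Rush system, and the handling of AD when memory is not among three or more impaired domains ("not indicated") is an assumption, since the text states only the positive case.

```python
DOMAINS = ("orientation", "attention", "memory", "language", "visuospatial")


def summarize_testing(impaired_domains):
    """Apply the count rule to the (reviewed) set of impaired domains."""
    impaired = set(impaired_domains)
    unknown = impaired - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")

    n = len(impaired)
    if n >= 3:
        # Assumption: AD is treated as not indicated when memory is spared.
        ad = "consistent" if "memory" in impaired else "not indicated"
        return {"dementia": "consistent", "ad": ad}
    if n == 2:
        # The text leaves this case to the neuropsychologist's judgment.
        return {"dementia": "neuropsychologist rates", "ad": "neuropsychologist rates"}
    return {"dementia": "inconsistent", "ad": "inconsistent"}


print(summarize_testing({"memory", "language", "attention"}))
print(summarize_testing({"memory", "language"}))
print(summarize_testing({"orientation"}))
```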

A second algorithm classifies subjects with respect to dementia (history of cognitive decline, at least 2 impaired cognitive domains on testing), AD (dementia that includes memory impairment), and MCI (no dementia, at least one impaired cognitive domain). The clinician agrees or disagrees and, in the latter case, provides a new rating and the reasons for disagreement.
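
The three definitions in the preceding paragraph translate directly into a small classification function. This is a sketch of the stated rules only; the function name, the label for dementia without memory impairment, and the "no cognitive impairment" fallback are illustrative assumptions, and in the system described the clinician can override the result.

```python
def classify(cognitive_decline, impaired_domains):
    """Provisional class from decline history and impaired domains.

    cognitive_decline: True if there is a history of cognitive decline.
    impaired_domains: iterable of impaired domain names, e.g. {"memory"}.
    """
    impaired = set(impaired_domains)
    # Dementia: history of decline plus at least 2 impaired domains.
    dementia = cognitive_decline and len(impaired) >= 2
    if dementia:
        # AD: dementia that includes memory impairment.
        return "AD" if "memory" in impaired else "dementia, not AD"
    if impaired:
        # MCI: no dementia, at least one impaired domain.
        return "MCI"
    return "no cognitive impairment"


print(classify(True, {"memory", "language"}))
print(classify(True, {"attention", "language"}))
print(classify(False, {"memory", "language"}))
print(classify(False, set()))
```

Note that, under the rules as stated, a person with two impaired domains but no history of decline falls into MCI rather than dementia.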

Although diagnoses in this system are guided by algorithms, clinicians make the final diagnostic decisions. Because the neuropsychologist and physician/nurse practitioner make their decisions sequentially, case conferences are not routinely needed. Further, because the algorithms are based on expert judgment, disagreement is relatively uncommon and clinical effort focuses more on cases with missing data or other specific problems. As a result, substantially less clinician time is expended per case.

In the remainder of the section, data are presented from the Rush Religious Orders Study [30] and Rush Memory and Aging Project [31] on the agreement of clinical diagnosis of AD from this system with pathologic diagnosis [28]. Additional data are then presented on the relation of genetic and experiential risk factors to AD to illustrate the similarity of findings using this system with studies using other more labor intensive diagnostic systems.

As reported in more detail elsewhere [28], of 452 completed autopsies in ROS and MAP, 141 were clinically diagnosed with probable AD proximate to death and 128 (91%) met the National Institute on Aging-Reagan pathologic criteria for a high or intermediate likelihood of AD. By way of comparison, data were examined from the Rush Alzheimer's Disease Center's memory clinic where the standard dementia evaluation includes an informant interview, blood work, brain scan, and case conference. Of 428 completed autopsies from the memory clinic, 306 met clinical criteria for probable AD and 286 (93%) had a high or intermediate likelihood of AD on autopsy.

There were 37 completed autopsies from the two epidemiological studies of individuals with a clinical diagnosis of possible AD (i.e., met criteria for AD and had another condition contributing to cognitive impairment). On autopsy, 23 (62%) had a high or intermediate likelihood of AD. Of 54 completed autopsies from the memory clinic of persons diagnosed with possible AD, 48 (89%) met AD pathologic criteria.

Overall, the algorithmic clinical diagnosis of AD showed good agreement with the pathologic diagnosis. The level of clinical-pathologic agreement in the epidemiologic cases was roughly comparable to the level observed in persons from a memory clinic whose AD was diagnosed after a more extensive clinical evaluation.
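The agreement percentages quoted above follow directly from the autopsy counts given in the text. The short sketch below recomputes them; the dictionary labels and function name are illustrative only.

```python
# (pathologically confirmed, clinically diagnosed) counts taken from the text.
counts = {
    "probable AD, ROS/MAP": (128, 141),
    "probable AD, memory clinic": (286, 306),
    "possible AD, ROS/MAP": (23, 37),
    "possible AD, memory clinic": (48, 54),
}


def percent_agreement(confirmed: int, diagnosed: int) -> float:
    """Percentage of clinical diagnoses that met pathologic criteria."""
    return 100.0 * confirmed / diagnosed


for label, (confirmed, diagnosed) in counts.items():
    print(f"{label}: {percent_agreement(confirmed, diagnosed):.0f}%")
```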

Inheritance of at least one copy of the apolipoprotein E ε4 allele has been associated with incidence of AD in studies employing a comprehensive dementia evaluation [32,33]. In ROS, the association of ε4 with incident MCI [34] and AD [35] was consistent with previous AD research and with its association with accelerated cognitive decline in the same cohort.

Research based on comprehensive dementia evaluations has found that persons with higher levels of depressive symptoms are more likely to develop AD [36,37]. The same relationship has been observed in ROS using the algorithmically guided diagnostic system, further supported by the association of depressive symptoms with cognitive decline [38].

Type 2 diabetes mellitus has been linked to an increased risk of dementia in studies using intensive diagnostic systems [39,40]. In ROS, persons with diabetes were more likely to develop AD than persons without diabetes and they experienced more rapid cognitive decline [41].

Engagement in cognitively stimulating activities has been linked to risk of AD in several studies [42,43]. In ROS [44] and MAP [45], persons with higher levels of engagement in cognitively stimulating activities were less likely to develop incident AD or MCI, and they experienced a slower rate of cognitive decline.

These examples support the validity of the algorithmically guided diagnoses of AD and dementia. That is, the association of risk factors with these diagnoses is consistent with external data employing more labor intensive diagnostic systems and with internal data using cognitive decline as a complementary outcome.

Diagnostic classification is a complex decision-making process. The use of algorithms to guide diagnostic classification has several advantages. The most important one is that structuring the decision-making process reduces random variability, drift, and bias across time and clinicians. Further, an algorithm takes advantage of more information. Without such structure, individual clinicians often employ heuristics, making consensus diagnostic conferences essential. Thus, a practical advantage is that the system requires far less clinician time, markedly reducing the direct and opportunity costs of the most expensive data collectors in longitudinal cohort studies of aging and AD. The algorithmically guided classification system presented here has been used in several epidemiologic studies of aging and AD. Research to date suggests that it has adequate performance properties.

5. Assessing the reliability and validity of an algorithm for the diagnosis of dementia and MCI

AD has a prodromal phase that involves progressive impairments in cognitive and functional abilities, from a cognitively normal stage to MCI, and eventually, dementia [46–50]. This has led to proposals for making an accurate diagnosis of AD well before criteria for dementia are fulfilled [47–49]. The diagnosis of AD during the predementia stage, including the amnestic MCI stage [51], has considerable potential for enabling pharmacological and non-pharmacological interventions to be introduced when they are likely to be more effective. However, as new criteria [52] have been introduced for diagnosing these pre-dementia states it is becoming evident that there is considerable variability among clinicians and research teams for determining thresholds for making these diagnoses [53–58]. These differences in thresholds are likely to translate into considerable variability in predictive potential for a given diagnosis, such as aMCI.

Traditionally, the diagnosis of normal cognition, MCI, and dementia is based upon a combination of a physician's diagnosis, which itself is based upon informant-reports of cognitive and functional impairment and a clinical evaluation, reconciled with a neuropsychological diagnosis (NPDx) rendered by a psychologist. The typical consensus diagnosis (ConsDx) is labor-intensive and influenced by the philosophy, personality, discipline and inherent biases of the individual clinicians involved [59,60]. Methods that may ensure high inter-individual and inter-site reliability in diagnosing predementia states and mild dementia should require fewer subjects with greater power to obtain reliable results for clinical, epidemiological, and especially longitudinal studies. Algorithmic approaches to making consensus diagnoses have been used successfully, showing high concordance rates to physician reviews, even in population-based studies [61].

Duara et al. developed a diagnostic algorithm to identify individuals as cognitively normal, with MCI, or with dementia [62]. This algorithmic approach may be expanded to provide reliable diagnoses of cognitive states that precede MCI. The remainder of this section is devoted to a description of the testing to validate the algorithm.

A total of 532 English- and Spanish-speaking, elderly, community-dwelling subjects, 52 to 92 years of age, were recruited by advertisement for a memory-screening study and from a memory disorders clinic. All subjects were assigned a Physician's Cognitive Diagnosis (PhysDx)—no cognitive impairment (NCI), MCI, or Dementia—by the examining physician (who was skilled in diagnosing dementia and MCI) based on the subject's entire clinical history (obtained in English or Spanish), including his/her functional status, Mini-Mental State Examination (MMSE) score [63] and sub-scores. Factors that could influence the physician's impression about the subject's cognitive and functional abilities were taken into consideration, such as educational and cultural background, visual and hearing deficits, language and speech disorders, general medical, neurological and psychiatric conditions, and the perceived reliability of the informant.

Each subject in the study was administered a neuropsychological test battery in his/her native language (English or Spanish). To assess memory, the 3-trial Fuld Object Memory Evaluation [64] and Delay Recall of the Wechsler Memory Scale-R [65] were employed. Tests of non-memory function included category fluency (language function) [66], letter fluency (executive and language function) [67], Block Design-WAIS-III (visuospatial skills) [68], Trails B (executive function) [69], and Similarities-WAIS-R (executive function) [68]. Neuropsychological classification (NPDx) was achieved employing methods developed by Loewenstein et al. [70]. The nomenclature used for NPDx was as follows: NCI, non-amnestic MCI (naMCI, single or multi-domain), amnestic MCI (aMCI, single or multi-domain) and dementia.

The threshold for MCI was a test score of 1.5 SD or greater below expected normative values, accounting for age, educational level, and language of administration and based on a large co-normed normative database employed in previous studies [70–72]. The 1.5 SD cut-score is typically the cut-off used for MCI [51]. The threshold for dementia was a score that was 2.0 SD or greater below expected normative values in at least one memory and at least one non-memory test. This corresponds to a confirmation of a dementia syndrome at or below the 5th percentile specified by NINCDS-ADRDA criteria [26,51,73].
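The two thresholds can be illustrated with a short sketch. It assumes that each raw score has already been converted to a z-score against the appropriate age, education, and language norms, and that scores are oriented so that lower values indicate worse performance (a timed test such as Trails B would need its sign reversed first). The function names and dictionary keys are hypothetical, and the sketch covers only the cut-scores, not the full NPDx method of Loewenstein et al.

```python
from typing import Dict, Set

MCI_Z = -1.5       # 1.5 SD or more below expected normative values
DEMENTIA_Z = -2.0  # 2.0 SD or more below expected normative values


def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a raw score against the matching normative group."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return (raw - norm_mean) / norm_sd


def apply_thresholds(z_by_test: Dict[str, float],
                     memory_tests: Set[str]) -> Dict[str, object]:
    """Flag tests at the MCI cut-score and check the dementia threshold."""
    at_mci = sorted(t for t, z in z_by_test.items() if z <= MCI_Z)
    memory_hit = any(z <= DEMENTIA_Z for t, z in z_by_test.items()
                     if t in memory_tests)
    non_memory_hit = any(z <= DEMENTIA_Z for t, z in z_by_test.items()
                         if t not in memory_tests)
    return {
        "tests_at_mci_threshold": at_mci,
        # Requires at least one memory AND at least one non-memory test.
        "meets_dementia_threshold": memory_hit and non_memory_hit,
    }
```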

The National Alzheimer's Coordinating Center/Uniform Data Set (NACC/UDS) D1 diagnosis nomenclature [52] was used for making the final cognitive diagnosis, using, (a) the traditional consensus diagnosis (ConsDx), and (b), the algorithmic diagnosis (AlgDx).

(a) The ConsDx was derived by discussions between the physician and neuropsychologist in a consensus meeting. These two individuals reviewed the subject's clinical history, CDR scores [73,74], and results of neuropsychological evaluation, taking into account any factors that may have influenced the testing.

(b) The AlgDx was derived by combining the PhysDx with the NPDx, using a computational algorithm that provided the final cognitive diagnosis, as defined by NACC/UDS nomenclature (Table 1). The validity of the AlgDx was assessed by its concordance with the ConsDx, and by its correspondence to two biomarkers closely associated with the presence of AD, namely, medial temporal atrophy (MTA) [75–78] scores from brain MRI scans, and ApoE-ε4 genotype [79]. Biomarkers of AD were used because AD, alone or in combination with other causes, is by far the most common cause of progressive aMCI or dementia [80].

Table 1
Algorithm for combining the physician's diagnosis (PhysDx) with the neuropsychological diagnosis (NPDx) to derive the algorithmic diagnosis (AlgDx)

The inter-rater reliability of PhysDx and NPDx was assessed for two separate physician/neuropsychologist teams, who independently assessed the same 30 subjects (10 with NCI, 10 with MCI, and 10 with dementia). The inter-rater reliability for PhysDx, measured by Cohen's weighted kappa [81], was 0.69 (SE=.11) (agreement was 70% for NCI, 70% for MCI, and 80% for dementia) and for the NPDx was 0.88 (SE=.07) (agreement was 89% for NCI, 91% for MCI, and 88% for dementia). Finally, the inter-rater reliability for ConsDx was 0.78 (SE=.07) (agreement was 90% for NCI, 70% for MCI, and 80% for dementia).

The concordance of the AlgDx to the same ConsDx categories ranged from 85% to 92%. AlgDx and ConsDx were the same for 88.2% of NCI, 85.1% of aMCI and 90.9% of dementia cases. Cohen's weighted kappa for agreement was 0.84 (SE=.02), a high concordance between the two approaches.
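For readers unfamiliar with the statistic, a weighted kappa can be computed from a square cross-classification table as sketched below. The text does not state which weighting scheme was used, so linear weights are an assumption here, and the toy table is invented for illustration; it is not the study's data.

```python
from typing import Sequence


def weighted_kappa(table: Sequence[Sequence[float]]) -> float:
    """Cohen's weighted kappa with linear disagreement weights.

    table[i][j] is the number of subjects placed in ordered category i by
    one method and category j by the other (e.g., NCI < MCI < dementia).
    """
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least 2 categories")
    total = sum(sum(row) for row in table)
    if total <= 0:
        raise ValueError("table must contain at least one subject")
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed += weight * table[i][j] / total
            expected += weight * row_tot[i] * col_tot[j] / total ** 2
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1.0 - observed / expected


# Invented 3 x 3 table (rows: one method, columns: the other).
toy_table = [[9, 1, 0], [2, 7, 1], [0, 2, 8]]
print(round(weighted_kappa(toy_table), 2))
```

Perfect agreement yields 1.0 and chance-level agreement yields 0, which is the scale on which the reported values of 0.69 to 0.88 should be read.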

The majority of this subject sample had MTA data (427 cases) and ApoE-ε4 genotyping (314 subjects). Using the AlgDx classification, post-hoc Sidak tests of means indicated subjects diagnosed with aMCI had higher mean MTA scores (0.944; SD 0.76) in comparison to NCI subjects (0.632; SD 0.69), and dementia subjects had higher MTA scores (1.78; SD 1.10) in comparison to subjects in the other two diagnostic groups [F(3,423)=41.02; p<.001]. With regard to ε4 frequencies, there were statistically significant differences between groups [χ² (df = 3) = 31.33; p<.001]. Subjects diagnosed as dementia had the highest ε4 frequency (34.9%), followed by aMCI (28.3%) and NCI (12.9%) subjects.

The AlgDx was developed with the goal of having a unified method of incorporating elements from the medical and neuropsychological examinations, with clear decision-making rules employed across varying diagnostic teams. The data suggest that the AlgDx provides a simple, reliable, and valid alternative to the classical consensus diagnosis of cognitive impairment. As such, AlgDx may have particular utility for longitudinal, multi-site clinical trials and population-based studies of MCI and dementia. The increased reliability, brevity, and convenience of the AlgDx, together with its equivalent validity, make it an appropriate and potentially cost-effective approach for diagnosing MCI and other pre-dementia states, as well as early dementia.

The AlgDx of normal cognition, MCI, and dementia is a valid alternative that reduces the time, effort, and biases associated with the ConsDx. Given the inherent reliability of a fixed algorithm, its user-friendly nature, its demonstrated efficiency, and its avoidance of individual bias, the application of the AlgDx in clinical and epidemiological research is worthy of further study.

6. A web-based approach to diagnostic consensus: Experience of the Monongahela-Youghiogheny Healthy Aging Team (MYHAT) Project

For research diagnosis in conditions such as AD, where there is no single definitive diagnostic test, many clinical research centers rely on a process of data review, adjudication, and consensus by a multidisciplinary panel of expert clinicians [82–84]. The panel meets in real time to review detailed information on various aspects of the clinical and laboratory assessment of a given patient, discuss the findings, and render a consensus diagnosis using standardized criteria. This process allows each study participant's data to be individually considered in detail, bringing to bear a wealth of collective clinical expertise and judgment. However, it involves the cost of the time spent by experts in the meeting, the inefficiency of scheduling meetings at a time and location that all experts can attend, and the near impossibility of including experts at different sites.

Research diagnosis poses different challenges in population studies. Participants are often interviewed and examined in their homes or other locations remote from the academic center; assessments are often conducted by raters who are well-trained research personnel but not expert clinicians. When these assessments follow highly standardized protocols, the resulting data can be reviewed by experts who have not personally examined the participants [85,86]. Nevertheless, for multiple experts to review the same data and come to consensus, it is still the norm for a live consensus conference or a teleconference to be implemented, with the same constraints of cost and scheduling.

In this section, an initiative to establish a process of web-based diagnostic consensus within a population study is reported. The objective was to involve clinical experts in reviewing and rating standardized assessment data in a manner that would eliminate scheduling constraints, minimize cost once the infrastructure was established, and also yield data for analysis beyond the diagnosis itself.

6.1. Study site, sample, assessment

The Monongahela-Youghiogheny Healthy Aging Team (MYHAT) Project is a population-based study of the epidemiology of MCI, conducted in a group of small-town communities near Pittsburgh, in southwestern Pennsylvania. All study procedures including the web-based process are approved annually by the University of Pittsburgh Institutional Review Board. Sampling, recruitment, and assessment of the study cohort have been described previously [87,88].

6.1.1. Selection of variables for online consensus website

The assessment protocol was reviewed to select variables that were judged to contribute to the desired ratings and diagnostic impressions. These data points were categorized into 4 groups: (1) demographic and background characteristics: age, gender, education, primary occupation, reading level and estimated IQ, hearing, and vision; (2) variables relevant to Clinical Dementia Rating [73]: subjective complaints, ADLs, IADLs, depressive symptoms, medication management, social engagement, judgment; (3) variables relevant to cognitive classification: participant's scores on all tests in the neuropsychological battery; (4) variables relevant to etiological diagnosis: health history, medications, physical and neurological examination, neuroimaging reports if any.

6.1.2. Structure and sequence of web pages

Designated raters log into a secure website and select, from a drop-down menu, one subject at a time for rating. The subject is identified by Study ID and the annual cycle during which the posted data were collected. Clicking on the Study ID leads the rater to a series of pages containing data from that participant. A schematic diagram of the sequence is shown in Fig. 3.

Fig. 3
Schematic diagram of online diagnostic consensus process used in the Monongahela-Youghiogheny Healthy Aging Team Project.

The first page viewed displays variables from the first two categories above. The MMSE [63] total score is visible on this page; clicking on the legend “MMSE” opens a table where item-by-item MMSE scores are provided. The same option is available for ADL and other impairment scales. At the bottom of this page, a link is provided to “make your Clinical Dementia Rating.” Clicking on this link opens a page where a link is provided to the Washington University CDR scoring algorithm [89] to calculate the CDR summary score based on individual “box scores.” Below this is a menu to choose a selection for a required CDR rating of 0, 0.5, 1, 2, or 3; an optional free text field for comments, if any; and a menu to choose a required certainty rating ranging from 1 (not at all certain) to 5 (absolutely certain). Clicking on an icon to save these ratings leads the rater to the next page.

The following page is devoted to cognitive classification, containing the neuropsychological information. Note that the rater has already completed the CDR, based on everyday functioning, before viewing the neuropsychological data. On the cognitive page, tests are categorized by the corresponding principal cognitive domain, i.e., attention/processing speed, executive function, memory, language, and visuospatial function. For each test, the table provides the participant's score alongside the mean, standard deviation, and 7th percentile score (equivalent to 1.5 SD below the mean) for the participant's age-gender-education group. If desired, the rater can click on the name of the test to view a table showing MYHAT cohort norms on that test. Clicking on “Clock Drawing” displays a scanned image of that participant's clock drawing. At the bottom of this page is a link to click to “Make your Cognitive Classification.” The choices provided include normal, focal amnestic MCI, multi-domain amnestic MCI, focal non-amnestic MCI, multi-domain non-amnestic MCI, and moderate to severe cognitive impairment (aka dementia). Again, the rater selects the cognitive classification and certainty level; comments are optional.
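The stated equivalence between the 7th percentile and 1.5 SD below the mean can be checked with a short calculation, assuming approximately normally distributed test scores (an assumption; the text does not say how the cohort percentiles were derived). The `cut_score` helper and its example values are hypothetical.

```python
from statistics import NormalDist

# Proportion of a standard normal distribution lying at or below -1.5 SD.
percentile_at_cut = NormalDist().cdf(-1.5) * 100
print(f"{percentile_at_cut:.1f}")  # about 6.7, i.e., roughly the 7th percentile


def cut_score(group_mean: float, group_sd: float, z: float = -1.5) -> float:
    """Score lying z SDs from the mean of an age-gender-education group."""
    return group_mean + z * group_sd
```

For instance, in a hypothetical normative group with mean 20 and SD 4, `cut_score(20.0, 4.0)` gives 14.0.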

The final page is devoted to etiologically relevant information and includes the fourth category of variables listed earlier; by this time the rater has already viewed the CDR and cognitive data and is able to take all information into account when rendering the etiological diagnosis. The rater clicks on “Make your Etiological Diagnosis.” Again, the etiological choice and the certainty ratings are required; additional comments, including etiological options not listed in the menu, are optional (Fig. 4).

Fig. 4
Screenshot of web page for etiologic diagnosis: The Monongahela-Youghiogheny Healthy Aging Team Project.

When the rater has completed all 3 ratings, clicking on a link to return to the home page saves all these ratings to the database. An asterisk now appears next to the study ID, indicating that this rater has already rated that case. Now, clicking on the ID allows the rater to view ratings that other raters have completed on the same case, and to revise his or her own ratings if so desired. The original ratings are also saved.
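The rating rules described in this section (a required CDR of 0, 0.5, 1, 2, or 3; a required certainty of 1 to 5; optional comments; original ratings retained after revision) can be pictured with a small data-structure sketch. The MYHAT site was built on Microsoft SharePoint, not on this code; the class and field names below are hypothetical and serve only to make the rules concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_CDR = (0, 0.5, 1, 2, 3)


@dataclass(frozen=True)
class CDRRating:
    rater_id: str
    study_id: str
    cdr: float
    certainty: int  # 1 (not at all certain) to 5 (absolutely certain)
    comment: str = ""  # optional free text

    def __post_init__(self) -> None:
        if self.cdr not in ALLOWED_CDR:
            raise ValueError("CDR must be 0, 0.5, 1, 2, or 3")
        if not 1 <= self.certainty <= 5:
            raise ValueError("certainty must be between 1 and 5")


@dataclass
class RatingLog:
    """Append-only log, so a revision never overwrites the original rating."""
    history: List[CDRRating] = field(default_factory=list)

    def submit(self, rating: CDRRating) -> None:
        self.history.append(rating)

    def _for(self, rater_id: str, study_id: str) -> List[CDRRating]:
        return [r for r in self.history
                if r.rater_id == rater_id and r.study_id == study_id]

    def original(self, rater_id: str, study_id: str) -> Optional[CDRRating]:
        matches = self._for(rater_id, study_id)
        return matches[0] if matches else None

    def current(self, rater_id: str, study_id: str) -> Optional[CDRRating]:
        matches = self._for(rater_id, study_id)
        return matches[-1] if matches else None
```

Keeping the log append-only is one simple way to preserve both the original and the revised rating for later analysis.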

6.1.3. Web development details

The website was developed using Microsoft SharePoint with web parts developed in Visual Studio .NET. It is estimated that building the website required 320 hours (8 weeks) of programmer time, and, subsequently, one hour per week of systems administrator time for server upkeep, security, and website modifications. The website requires 128-bit SSL (Secure Sockets Layer) encryption as well as full authentication using domain accounts and passwords. Raters are identified automatically within the database and all ratings, including any changes in ratings, confidence, or comments, are logged. Research subjects are identified only by ID number. No cached data or cookies are stored on the rater's workstation.

6.2. Conclusions

The familiar, widely established, live expert consensus process is not usually well-described or highly standardized, with rare published exceptions [85,86]. Previous studies have reported inter-rater agreement on diagnoses by individual clinicians [84] and stability or validity of diagnosis after examination of neuroimaging and neuropathological data [85,86]. Few if any reports describe the consensus diagnosis process itself or the relative contribution of its various components. The authors are aware of two previous efforts at web-based diagnosis. One was designed to examine inter-rater agreement in pathology diagnosis, and was accomplished by adding annotations to an existing website where the pathology images were stored [90]. Another describes a clinical web environment for the diagnosis of Alzheimer's and other dementias [91].

As this section is a descriptive report of a web-based process, there are no empirical results to report. Raters include neurologists, psychiatrists, neuropsychologists, and geriatricians. They have uniformly described the process as interesting and user-friendly, and, after an initial learning curve, reported that cases take an average of 10 minutes to complete. Since raters can log into the secure website from any location at their convenience, scheduling constraints are eliminated and time is efficiently used. Apart from identifying the diagnoses on which all raters agree and selecting the modal rating or diagnosis where there is less than perfect agreement, the data can be used in multiple ways. For example, it is possible to calculate inter-rater agreement across raters and within/across specialties, and to identify rating/diagnostic categories where there is greater and lesser agreement. Researchers can also examine change in etiological diagnosis after additional (e.g., brain MRI) data are presented. They can attempt to develop diagnostic algorithms based on ratings provided by experts. By empirically identifying the components that predict these ratings, they can even attempt to “deconstruct” to some extent the process of expert clinical judgment.
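A minimal sketch of how the stored ratings for one case might be summarized is given below. The function names are hypothetical, and because the text does not say how ties for the modal rating are resolved, the sketch simply returns every tied category.

```python
from collections import Counter
from itertools import combinations
from typing import List, Tuple


def modal_rating(ratings: List[str]) -> Tuple[List[str], bool]:
    """Return the modal categories (all tied modes) and whether raters were unanimous."""
    if not ratings:
        raise ValueError("at least one rating is required")
    counts = Counter(ratings)
    highest = max(counts.values())
    modes = sorted(label for label, n in counts.items() if n == highest)
    return modes, len(counts) == 1


def pairwise_agreement(ratings: List[str]) -> float:
    """Proportion of rater pairs that chose the same category."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("at least two ratings are required")
    return sum(a == b for a, b in pairs) / len(pairs)
```

With three raters choosing amnestic MCI, amnestic MCI, and normal, the modal rating is amnestic MCI and one of the three rater pairs agrees.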

The web-based process described here combines the advantages of assessment by non-experts, expert judgment and diagnosis, and the convenience of online rating. It offers some alternatives which have potential advantages of standardization, empirical analysis, and efficiency in the use of expert clinicians' time, which likely enhances cost-effectiveness.

7. Discussion

The public health burden of AD is well documented for the most advanced stages of disease, and identifying cases is critical to establishing both prevalence and incidence of disease and to one day conducting clinical trials in disease prevention. In the various sections of this article, different methods of case ascertainment are highlighted through examples of several studies. In each study the focus is on aspects of the diagnostic process in which cost savings might be realized by means of some methodological modification. However, before savings can be realized there must be a transparent model of cost which acknowledges the resources required for each element of ascertainment. These elements include, among others, sample identification and retention, proxy input, clinical, demographic and social data collection, cognitive assessment, functional assessment, and the cost of diagnostic expertise. As we move to earlier stages of detection there may even be an expectation of specific laboratory data for detection which will also incur an as yet unknown cost.

One of the challenges of ascertainment of dementia, acknowledged by many, is that the disease of interest impedes the ability to participate in study activities. The ability to complete assessments and to assess one's own performance is impaired. This leads to the need to use proxies to remove the “self-as-observer” bias for measuring current level of function and to provide comparison to historical performance. Proxies, often family or friends, are a major focus for assessment in surveys such as HRS, but they can have a varying ability to make historical comparisons. This ability to describe a change from previous function can mitigate the lack of previous assessments, which reflects a cost saving, but the frequency of contact and the intimacy of knowledge will impact on the ability to contribute information of value. The identification of a second individual as well as an evaluation of the quality of information they can provide must be factored into the cost of ascertainment. The ADAMS cohort, a resource-intensive sub-study of HRS, demonstrates the under-detection of survey methods, but provides an opportunity to select and validate survey items that might provide greater specificity. A proposal to reduce the cost of the ADAMS evaluation focuses on the use of telephone assessment rather than in-person evaluations. This multi-step process, proposed by Wallace, Langa, and Plassman (Section 3), is as yet unvalidated; it begins with intensive clinical data collection and includes screening for depression and delirium. The efficiency of this approach includes selective use of informants in the early steps, telephone-based assessment of cognition and intensive medical record review. These distance-based assessments limit the ability to collect biological samples but cost reduction is realized through minimizing staff travel, though the assessment time and effort for record review must be considered. Of note, Wilson and Bennett (Section 4) describe longitudinal assessment, with algorithmic diagnosis, that does not depend on an informant. Such an approach may be most efficient when the population has high retention and follow-up, as in ROS or MAP.

Several studies focus on detailed data collected by non-professionals, but reviewed by professionals to achieve the diagnosis (Sections 4–6). This requires one or more professionals to give input to some or all aspects of the clinical picture but saves the cost of professional assessment of individual cases. The consensus diagnosis typically describes an interdisciplinary team of experts conducting ascertainment by agreement. A feature of the consensus diagnosis is that judgment is used to weigh variables that may have a non-specific contribution or be part of a complex interaction. For example, a consensus diagnosis may weigh poor cognitive performance in the presence of other medical, physical, or social factors. Technologies are available to permit this activity to be done without the cost or burden of joint meetings but through review of iterative opinions as described in the MYHAT Project (Section 6). Web-based approaches to data review and decision collection provide a practical approach to examining each step of the diagnostic process. Further cost savings may be achieved by the use of an algorithm that weighs the input from different assessment domains. Weightings would ideally be built on the actual experience of clinical expertise. For example, examination of how specific medical, physical or social factors were used to arrive at a consensus diagnosis can inform a diagnostic algorithm. The inclusion of demographic and clinical data can provide additional refinements, and laboratory results may suggest etiology and levels of certainty. ROS and MAP include longitudinal follow-up with autopsy-confirmed diagnosis, which provides unique validation of the diagnostic algorithm (Section 4). Transparency requires acknowledging the initial cost of clinical expertise and neuropathologic validation in formulating the algorithm, with savings achieved by eliminating clinical expertise for case review.

The work of Duara and Loewenstein (Section 5) and Ganguli (Section 6) describes categorization within mild forms of impairment, describing amnestic and nonamnestic forms of MCI, conditions defined primarily by normative data. These categories illustrate the ability to detect subtle deficits but will remain in diagnostic infancy until we have sufficient longitudinal data to understand the predictive value and the full impact of such entities within the healthcare system. Nevertheless, the systematic collection of extensive data can provide a diagnostic algorithm for the most important conditions, and may identify the minimum data set required to make the diagnosis. The simultaneous collection of simple survey tools within these elaborate assessments may provide validation for methods to be used in larger populations.

While diagnosis by consensus or algorithm may save the cost of the professional assessment of each individual, the cost of collecting assessments remains high. Cognitive testing requires training and takes time and resources. Functional assessment and other clinical and demographic data are rarely available in a systematic way. Of note, neither administrative data sets nor general medical records are likely to contain information about subtle cognitive deficits or mild functional impairment (unless cognitive impairment is a presenting complaint). Thus, specific assessment above and beyond the medical record is required for ascertainment of cognitive loss or dementia. The work of Wilson and Bennett (Section 4), Duara and Loewenstein (Section 5), and Ganguli (Section 6) describes collection of extensive assessments with highly trained staff. Maintaining such staff who can provide reliable standardized assessment is a very large part of the ascertainment expense. Weir (Section 2) and Wallace, Langa, and Plassman (Section 3) describe ways to reduce this expense, including the use of survey methods and telephone-based assessments. The sensitivity of these approaches for ascertainment of dementia is well established, although not without limitations. However, the effectiveness in detecting MCI is not well established. Among the challenges of these methods is the capture of information from an aging population, often with sensory deficit. About 40% of people over age 65 years have hearing loss, impeding telephone interactions; visual impairment can interfere with completion of written materials; and general caution about interacting with unknown surveyors can reduce the likelihood of participation among the elderly. Estimates of participation of targeted populations can be as low as 50 to 65% and while increased sample size can address this for establishing prevalence, true incidence and randomized trials are particularly challenged. Thus, the cost savings of these approaches need to be weighed against any need to increase sample size.
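The trade-off between participation and sample size can be made concrete with simple arithmetic. The sketch below uses the 50% and 65% participation figures from the text; the target of 1,000 participants and the function name are hypothetical, and the calculation addresses only the number of people who must be approached, not any bias introduced by non-participation.

```python
import math


def invitations_needed(target_participants: int, participation_rate: float) -> int:
    """Number of people to approach to reach the target at a given participation rate."""
    if not 0 < participation_rate <= 1:
        raise ValueError("participation rate must be in (0, 1]")
    return math.ceil(target_participants / participation_rate)


for rate in (0.50, 0.65):
    print(rate, invitations_needed(1000, rate))
```

Under these assumptions, a target of 1,000 participants requires approaching 2,000 people at 50% participation and 1,539 at 65%.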

True incidence requires longitudinal surveillance and re-assessment over time. Even among those with measurable cognitive impairment, incidence is as low as 5% to 15% annually. Prevention studies of non-impaired individuals require a doubling of observation time to achieve the needed conversion rates. The cost of maintaining the cohort can be considerable, particularly when it may be a transient population. While impairment may reduce mobility, relocation for proximity to family or for higher level care is likely. Of note, little has been described of the cost of recruitment or retention. The resources and cost of obtaining consent, and of re-evaluation to ensure continued capacity to consent in a population at risk for cognitive change, are also unrecognized. Yet these are necessary costs when methodologies include performance-based evaluation, acquiring medical records, and using a proxy informant. Many cost-effective approaches to ascertainment focus on reducing manpower for data collection. These include removing the expert clinician from the assessment, and using technologies to collect data, all of which reduce the human contact between participant and researcher. While these may be cost saving and even reduce the burden on the participant, they may also inadvertently reduce the visibility of the project and undermine the importance of the effort contributed by participants. Of particular importance is building the partnership with participants to create shared commitment for the outcomes of these studies. When the motivation for follow-up is shared by the participant and project staff, the effort to track and follow cases may be reduced and retention may be higher.

Great progress has been made in the willingness and ability to detect cognitive impairment and dementia in aging populations. Factors such as the aging of the baby boomer generation, proposed increases in retirement age, and improved longevity highlight the healthcare imperative to understand and address these diseases. Our growing experience with evaluation of cognition and function in diverse and real-world populations is leading to efficient methods for characterizing cognitive impairment and diagnosing dementia. Efficient and effective measures of performance have been developed and are widely used. Moving forward, a transparent and informed approach is needed to evaluate both the savings and potential unintended costs of ascertainment methods. Efforts to maximize these efficiencies can reduce the cost of diagnosis but must be balanced against the cost of underestimates of disease. These methods will be critical to conducting disease prevention trials and in fact have already proven effective in such studies. While the most novel approaches to prevention postulate a pre-symptomatic stage of disease that would be defined by a biomarker, it is not clear what economy this will provide. To date, no biomarker predicts progression or incident dementia better than memory impairment in the otherwise asymptomatic individual. While the development of precise diagnostic laboratory tests may reduce uncertainty of diagnosis, it is unclear whether it will reduce the cost of ascertainment.


This combined effort was supported by National Institute on Aging grants U01AG009740, P30AG010161, R01AG015819, R01AG017917, P50AG025711, R01AG030561, R01AG023651, K24AG022035, P50AG005138.

This work was also supported by the State of Florida Department of Elder Affairs, Tallahassee, Florida; the Johnnie Byrd, Sr., Alzheimer's Center and Research Institute and the School of Medicine, University of South Florida, Tampa, Florida.

Drs. Wallace, Langa, and Plassman thank Michelle Birt Leeds for her excellent assistance in assembling and editing parts of this manuscript. Dr. Ganguli thanks Lynda Rose, Cathy Kanczes, Jack Doman, Joni Vander Bilt, and Katherine McMichael at the University of Pittsburgh.



Disclosure statement for authors: The authors have no conflicts to disclose. The sponsors had no role in the analysis or interpretation of these data or in the content of the paper. Appropriate approval procedures were used concerning human subjects.


[1] Brookmeyer R, Gray S, Kawas C. Projections of Alzheimer's disease in the United States and the public health impact of delaying disease onset. Am J Pub Health. 1998;88:1337–42. [PMC free article] [PubMed]
[2] Sloane PD, Zimmerman S, Suchindran C, et al. The public health impact of Alzheimer's disease, 2000–2050: potential implication of treatment advances. Ann Rev Public Health. 2002;23:213–31. [PubMed]
[3] Evans DA, Grodstein F, Loewenstein D, Kaye J, Weintraub S. Reducing case ascertainment costs in US population studies of Alzheimer's disease, dementia, and cognitive impairment—Part 2. Alzheimers Dement. 2011. In press. [PMC free article] [PubMed]
[4] Ofstedal MB, Fisher GG, Herzog AR. Documentation of Cognitive Functioning Measures in the Health and Retirement Study. HRS Documentation Report DR-006. Mar, 2005. http://hrsonline.isr.umich.edu/sitedocs/userg/dr-006.pdf.
[5] Jorm AF, Jacomb PA. The Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): socio-demographic correlates, reliability, validity and some norms. Psychol Med. 1989;19:1015–22. [PubMed]
[6] Langa KM, Chernew ME, Kabeto MU, Herzog AR, Ofstedal MB, Willis RJ, et al. National estimates of the quantity and cost of informal caregiving for the elderly with dementia. J Gen Intern Med. 2001;16:770–8. [PMC free article] [PubMed]
[7] Langa KM, Larson EB, Wallace RB, Fendrick AM, Foster NL, Kabeto MU, et al. Out-of-pocket health care expenditures among older Americans with dementia. Alzheimer Dis Assoc Disord. 2004;18:90–8. [PubMed]
[8] Langa KM, Plassman BL, Wallace RB, Herzog AR, Heeringa SG, Ofstedal MB, et al. The Aging, Demographics, and Memory Study: Study design and methods. Neuroepidemiology. 2005;25:181–91. [PubMed]
[9] Plassman BL, Langa KM, Fisher GG, Heeringa SG, Weir DR, Ofstedal MB, et al. Prevalence of cognitive impairment without dementia in the United States. Ann Intern Med. 2008;148:427–34. [PMC free article] [PubMed]
[10] Plassman BL, Langa KM, Fisher GG, Heeringa SG, Weir DR, Ofstedal MB, et al. Prevalence of dementia in the United States: The Aging, Demographics, and Memory Study. Neuroepidemiology. 2007;29:125–32. [PMC free article] [PubMed]
[11] Taylor DH, Østbye T, Langa KM, Weir D, Plassman BL. The accuracy of Medicare claims as an epidemiological tool: The case of dementia revisited. J Alzheimers Dis. 2009;17:807–15. [PMC free article] [PubMed]
[12] Juster F, Suzman R. An overview of the Health and Retirement Study. J Hum Resour. 1995;30:S7–S56.
[13] Gure T, Kabeto M, Plassman B, Piette J, Langa K. Differences in functional impairment across subtypes of dementia. J Gerontol A Biol Sci Med Sci. 2010;65:434–41. [PMC free article] [PubMed]
[14] Okura T, Plassman B, Steffens D, Llewellyn D, Potter G, Langa K. Prevalence of neuropsychiatric symptoms and their association with functional limitations in older adults in the United States: the aging, demographics, and memory study. J Am Geriatr Soc. 2010;58:330–7. [PMC free article] [PubMed]
[15] Rogers MA, Plassman BL, Kabeto M, Fisher GG, McArdle JJ, Llewellyn DJ, Potter GG, Langa KM. Parental education and late-life dementia in the United States. J Geriatr Psychiatry Neurol. 2009;22:71–80. [PMC free article] [PubMed]
[16] Mehta K, Stewart A, Langa K, Yaffe K, Moody-Ayers S, Williams B, et al. “Below average” self-assessed school performance and Alzheimer's disease in the Aging, Demographics, and Memory Study. Alzheimers Dement. 2009;5:380–7. [PMC free article] [PubMed]
[17] Schuurmans M, Deschamps P, Markham S, Shortridge-Baggett L, Duursma S. The measurement of delirium: review of scales. Res Theory Nurs Pract. 2003;17:207–24. [PubMed]
[18] Chen C, Liu C, Liang H. Comparison of patient and caregiver assessments of depressive symptoms in elderly patients with depression. Psychiatry Res. 2009;166:69–75. [PubMed]
[19] Steffens DC, Fisher GG, Langa KM, Potter GG, Plassman BL. Prevalence of depression among older Americans: The Aging, Demographics and Memory Study. Int Psychogeriatr. 2009;21:879–88. [PMC free article] [PubMed]
[20] Panza F, Frisardi V, Capurso C, D'Introno A, Colacicco A, Imbimbo B, et al. Late-life depression, mild cognitive impairment, and dementia: possible continuum? Am J Geriatr Psychiatry. 2010;18:98–116. [PubMed]
[21] Barberger-Gateau P, Fabrigoule C, Helmer C, Rouch I, Dartigues J. Functional impairment in instrumental activities of daily living: an early clinical sign of dementia? J Am Geriatr Soc. 1999;47:456–62. [PubMed]
[22] Hebert LE, Bienias JL, Aggarwal NT, Wilson RS, Bennett DA, Shah RC, et al. Change in risk of Alzheimer's disease over time. Neurology. 2010;75:786–91. [PMC free article] [PubMed]
[23] Wing JK. The use of the Present State Examination in general population surveys. Acta Psychiatr Scand. 1980;62:230–40.
[24] Copeland JR, Dewey ME, Griffiths-Jones HM. A computerized psychiatric diagnostic system and case nomenclature for elderly subjects: GMS and AGECAT. Psychol Med. 1986;16:89–99. [PubMed]
[25] Bennett DA, Shannon KM, Beckett LA, Goetz CG, Wilson RS. Metric properties of nurses' ratings of parkinsonian signs with a modified Unified Parkinson's Disease Rating Scale. Neurology. 1997;49:1580–7. [PubMed]
[26] McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer's disease: Report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer's Disease. Neurology. 1984;34:939–44. [PubMed]
[27] Bennett DA, Wilson RS, Schneider JA, Evans DA, Beckett LA, Aggarwal NT, et al. Natural history of mild cognitive impairment in older persons. Neurology. 2002;59:198–205. [PubMed]
[28] Bennett DA, Schneider JA, Aggarwal NT, Arvanitakis Z, Shah RC, Kelly JF, et al. Decision rules guiding the clinical diagnosis of Alzheimer's disease in two community-based cohort studies compared to standard practice in a clinic-based cohort study. Neuroepidemiology. 2006;27:169–76. [PubMed]
[29] Barnes LL, Schneider JA, Boyle PA, Bienias JL, Bennett DA. Memory complaints are related to Alzheimer disease pathology in older persons. Neurology. 2006;67:1581–5. [PMC free article] [PubMed]
[30] Wilson RS, Bienias JL, Evans DA, Bennett DA. Religious Orders Study: overview and change in cognitive and motor speed. Aging Neuropsychol Cogn. 2004;11:280–303.
[31] Bennett DA, Schneider JA, Buchman AS, Mendes de Leon CF, Bienias JL, Wilson RS. The Rush Memory and Aging Project: Study design and baseline characteristics of the study cohort. Neuroepidemiology. 2005;25:163–75. [PubMed]
[32] Tang MX, Stern Y, Marder J, Bell K, Gurland B, Lantigua R, et al. The APOE ε4 allele and the risk of Alzheimer disease among African Americans, Whites, and Hispanics. JAMA. 1998;279:751–5. [PubMed]
[33] Khachaturian AS, Corcoran CD, Mayer LS, Zandi PP, Breitner JC, Cache County Investigators. Apolipoprotein E ε4 count affects age at onset of Alzheimer disease, but not lifetime susceptibility: The Cache County Study. Arch Gen Psychiatry. 2004;61:518–24. [PubMed]
[34] Boyle PA, Buchman AS, Wilson RS, Kelly JF, Bennett DA. The ApoE ε4 allele is associated with incident mild cognitive impairment among community-dwelling older persons. Neuroepidemiology. 2009;34:43–9. [PMC free article] [PubMed]
[35] Wilson RS, Schneider JA, Barnes LL, Beckett LA, Aggarwal NT, Cochran EJ, et al. The apolipoprotein E ε4 allele and decline in different cognitive systems during a 6-year period. Arch Neurol. 2002;59:1154–60. [PubMed]
[36] Berger AK, Fratiglioni L, Forsell Y, Winblad B, Backman L. The occurrence of depressive symptoms in the preclinical phase of AD: a population-based study. Neurology. 1999;53:1998–2002. [PubMed]
[37] Devanand DP, Sano M, Tang MX, Taylor S, Gurland BJ, Wilder D, et al. Depressed mood and the incidence of Alzheimer's disease in the elderly living in the community. Arch Gen Psychiatry. 1996;53:175–82. [PubMed]
[38] Wilson RS, Barnes LL, Mendes de Leon CF, Aggarwal NT, Schneider JS, Bach J, et al. Depressive symptoms, cognitive decline, and risk of AD in older persons. Neurology. 2002;59:364–70. [PubMed]
[39] Luchsinger JA, Tang MX, Stern Y, Shea S, Mayeux R. Diabetes mellitus and risk of Alzheimer's disease and dementia with stroke in a multiethnic cohort. Am J Epidemiol. 2001;154:635–41. [PubMed]
[40] Peila R, Rodriguez BL, Launer LJ, Honolulu-Asia Aging Study. Type 2 diabetes, APOE gene, and the risk for dementia and related pathologies: The Honolulu-Asia Aging Study. Diabetes. 2002;51:1256–62. [PubMed]
[41] Arvanitakis Z, Wilson RS, Bienias JL, Evans DA, Bennett DA. Diabetes mellitus and risk of Alzheimer disease and decline in cognitive function. Arch Neurol. 2004;61:661–6. [PubMed]
[42] Scarmeas N, Levy G, Tang MX, Manly J, Stern Y. Influence of leisure activity on the incidence of Alzheimer's disease. Neurology. 2001;57:2236–42. [PMC free article] [PubMed]
[43] Verghese J, Lipton RB, Katz MJ, Hall CB, Derby CA, Kuslansky G, et al. Leisure activities and the risk of dementia in the elderly. N Engl J Med. 2003;348:2508–16. [PubMed]
[44] Wilson RS, Mendes de Leon CF, Barnes LL, et al. Participation in cognitively stimulating activities and risk of incident Alzheimer's disease. JAMA. 2002;287:742–8. [PubMed]
[45] Wilson RS, Scherr PA, Schneider JA, Tang Y, Bennett DA. Relation of cognitive activity to risk of developing Alzheimer disease. Neurology. 2007;69:1911–20. [PubMed]
[46] Bowen J, Teri L, Kukull W, McCormick W, McCurry SM, Larson EB. Progression to dementia in patients with isolated memory loss. Lancet. 1997;349:763–5. [PubMed]
[47] Grober E, Hall CB, Lipton RB, Zonderman AB, Resnick SM, Kawas C. Memory impairment on free and cued selective reminding predicts dementia. Neurology. 2000;54:827–32. [PubMed]
[48] Howieson DB, Dame A, Camicioli R, Sexton G, Payami H, Kaye JA. Cognitive markers preceding Alzheimer's dementia in the healthy oldest old. J Am Geriatr Soc. 1997;45:584–9. [PubMed]
[49] Fabrigoule C, Rouch I, Taberly A, Letenneur L, Commenges D, Mazaux JM, et al. Cognitive process in preclinical phase of dementia. Brain. 1998;121:135–41. [PubMed]
[50] Tabert MH, Albert SM, Borukhova-Milov L, Camacho Y, Pelton G, Liu X, et al. Functional deficits in patients with mild cognitive impairment: Prediction of Alzheimer's disease. Neurology. 2002;58:758–64. [PubMed]
[51] Petersen RC. Mild cognitive impairment as a diagnostic entity. J Intern Med. 2004;256:183–94. [PubMed]
[52] Beekly DL, Ramos EM, Lee WW, Deitrich WD, Jacka ME, Wu J, et al. The National Alzheimer's Coordinating Center (NACC) database: The Uniform Data Set. Alzheimer Dis Assoc Disord. 2007;21:249–58. [PubMed]
[53] Graham JE, Rockwood K, Beattie BL, McDowell I, Eastwood R, Gauthier S. Standardization of the diagnosis of dementia in the Canadian Study of Health and Aging. Neuroepidemiology. 1996;15:246–56. [PubMed]
[54] Luis CA, Barker WW, Loewenstein DA, Crum TA, Rogaeva E, Kawarai T, St George-Hyslop P, Duara R. Conversion to Dementia among Two Groups with Cognitive Impairment. A Preliminary Report. Dement Geriatr Cogn Disord. 2004;18:307–13. [PubMed]
[55] Bennett DA, Schneider JA, Bienias JL, Evans DA, Wilson RS. Mild cognitive impairment is related to Alzheimer disease pathology and cerebral infarctions. Neurology. 2005;64:834–41. [PubMed]
[56] Fillenbaum GG, Peterson B, Morris JC. Estimating the validity of the clinical Dementia Rating Scale: the CERAD experience. Consortium to Establish a Registry for Alzheimer's Disease. Aging (Milano) 1996;8:379–85. [PubMed]
[57] Rockwood K, Strang D, MacKnight C, Downer R, Morris JC. Interrater reliability of the Clinical Dementia Rating in a multicenter trial. J Am Geriatr Soc. 2000;48:558–9. [PubMed]
[58] Schafer KA, Tractenberg RE, Sano M, Mackell JA, Thomas RG, Gamst A, et al. Reliability of monitoring the clinical dementia rating in multicenter clinical trials. Alzheimer Dis Assoc Disord. 2004;18:219–22. [PubMed]
[59] Hogervorst E, Bandelow S, Combrinck M, Irani SR, Smith AD. The validity and reliability of 6 sets of clinical criteria to classify Alzheimer's disease and vascular dementia in cases confirmed post-mortem: added value of a decision tree approach. Dement Geriatr Cogn Disord. 2003;16:170–80. [PubMed]
[60] Jobst KA, Barnetson LP, Shepstone BJ. Accurate prediction of histologically confirmed Alzheimer's disease and the differential diagnosis of dementia: the use of NINCDS-ADRDA and DSM-III-R criteria, SPECT, X-ray CT, and APO E4 in medial temporal lobe dementias. The Oxford Project to Investigate Memory and Aging. Int Psychogeriatr. 1998;10:271–302. [PubMed]
[61] Liu H, Harker JO, Wong AL, Maclean CH, Bulpitt KJ, Mittman BS, et al. Case finding for population-based studies of rheumatoid arthritis: Comparison of patient self-reported ACR criteria-based algorithms to physician-implicit review for diagnosis of rheumatoid arthritis. Semin Arthritis Rheum. 2004;33:302–10. [PubMed]
[62] Duara R, Loewenstein DA, Greig M, Acevedo A, Potter E, Appel J, et al. Reliability and validity of an algorithm for the diagnosis of normal cognition, mild cognitive impairment, and dementia: Implications for multicenter research studies. Am J Geriatr Psychiatry. 2010;18:363–70. [PMC free article] [PubMed]
[63] Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the physician. J Psychiatr Res. 1975;12:189–98. [PubMed]
[64] Fuld PA. Fuld Object-Memory Evaluation. Illinois; Stoelting Co: 1981.
[65] Wechsler D. The Wechsler Memory Scale-Revised. The Psychological Corporation; San Antonio, TX: 1987.
[66] Spreen O, Strauss E. A compendium of neuropsychological tests: Administration, norms, and commentary. 2nd Ed. Oxford University Press; New York: 1998.
[67] Acevedo A, Loewenstein DA, Barker W, Harwood DF, Luis C, Bravo M. Category fluency test: Normative data for English- and Spanish-speaking elderly. J Int Neuropsychol Soc. 2000;6:760–9. [PubMed]
[68] Wechsler D. The Wechsler Adult Intelligence Scale-Revised. Psychological Corporation; New York: 1981.
[69] Army Individual Test Battery. Manual of directions and scoring. War Department, Adjutant General's Office; Washington: 1944.
[70] Loewenstein DA, Acevedo A, Agron J, Martinez G, Duara R. The use of amnestic and nonamnestic composite measures at different thresholds in the neuropsychological diagnosis of MCI. J Clin Exp Neuropsychol. 2007;29:300–7. [PubMed]
[71] Acevedo A, Loewenstein DA, Agrón J, Duara R. Influence of socio-demographic variables on neuropsychological test performance in Spanish-speaking older adults. J Clin Exp Neuropsychol. 2007;29:530–44. [PubMed]
[72] Loewenstein DA, Acevedo A, Ownby R, Agrón J, Barker WW, Isaacson RS, et al. Using different memory cut-offs to assess MCI. Am J Geriatr Psychiatry. 2006;14:911–9. [PubMed]
[73] Morris JC. The Clinical Dementia Rating (CDR): Current version and scoring rules. Neurology. 1993;43:2412–4. [PubMed]
[74] Morris JC. Clinical dementia rating: a reliable and valid diagnostic and staging measure for dementia of the Alzheimer type. Int Psychogeriatr. 1997;9(Suppl 1):173–6. [PubMed]
[75] Bobinski M, Wegiel J, Wisniewski HM, Tarnawski M, Bobinski M, Reisberg B, et al. Neurofibrillary pathology - correlation with hippocampal formation atrophy in Alzheimer disease. Neurobiol Aging. 1996;17:909–19. [PubMed]
[76] Jack CR, Jr, Dickson DW, Parisi JE, Xu YC, Cha RH, O'Brien PC, et al. Antemortem MRI findings correlate with hippocampal neuropathology in typical aging and dementia. Neurology. 2002;58:750–7. [PMC free article] [PubMed]
[77] Järvenpää T, Laakso MP, Rossi R, Koskenvuo M, Kaprio J, Räihä I, et al. Hippocampal MRI volumetry in cognitively discordant monozygotic twin pairs. J Neurol Neurosurg Psychiatry. 2004;75:116–20. [PMC free article] [PubMed]
[78] Duara R, Loewenstein DA, Potter E, Appel J, Greig MT, Urs R, et al. Medial Temporal Lobe Atrophy on MRI Scans and the Diagnosis of Alzheimer's Disease. Neurology. 2008;71:1986–92. [PMC free article] [PubMed]
[79] Ye S, Huang Y, Müllendorff K, Dong L, Giedt G, Meng EC, et al. Apolipoprotein (apo) E4 enhances amyloid beta peptide production in cultured neuronal cells: apoE structure as a potential therapeutic target. Proc Natl Acad Sci USA. 2005;102:18700–5. [PMC free article] [PubMed]
[80] Barker WW, Luis CA, Kashuba A, Luis M, Harwood DG, Loewenstein D, et al. Relative frequencies of Alzheimer disease, Lewy body, vascular and frontotemporal dementia, and hippocampal sclerosis in the state of Florida brain bank. Alzheimer Dis Assoc Disord. 2002;16:203–12. [PubMed]
[81] Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70:213–20. [PubMed]
[82] Blacker D, Albert MS, Bassett SS, Go RC, Harrell LE, Folstein MF. Reliability and validity of NINCDS-ADRDA criteria for Alzheimer's disease. The National Institute of Mental Health Genetics Initiative. Arch Neurol. 1994;51:1198–204. [PubMed]
[83] Lopez OL, Becker JT, Klunk WE, Saxton JA, Hamilton RL, Kaufer DI, et al. Research evaluation and diagnosis of Probable Alzheimer's disease over the last two decades. Neurology. 2000;55:1854–62. [PubMed]
[84] Lopez OL, Litvan I, Catt KE, Stowe R, Klunk WE, Kaufer DI, et al. Accuracy of four clinical diagnostic criteria for the diagnosis of neurodegenerative dementias. Neurology. 1999;53:1292–9. [PubMed]
[85] Lopez OL, Kuller LH, Fitzpatrick A, Ives D, Becker JT, Beauchamp N. Evaluation of dementia in the Cardiovascular Health Cognition Study. Neuroepidemiology. 2003;22:1–12. [PubMed]
[86] Shumaker SA, Legault C, Rapp SR, Thal L, Wallace RB, Ockene JK, et al. Estrogen plus progestin and the incidence of dementia and mild cognitive impairment in postmenopausal women: the Women's Health Initiative Memory Study: a randomized controlled trial. JAMA. 2003;289:2651–62. [PubMed]
[87] Ganguli M, Snitz B, Vander Bilt J, Chang CC-H. How much do depressive symptoms affect cognition at the population level? The Monongahela-Youghiogheny Healthy Aging Team (MYHAT) study. Int J Geriatr Psychiatry. 2009;24:1277–84. [PMC free article] [PubMed]
[88] Ganguli M, Snitz BE, Lee CW, Vander Bilt J, Saxton JA, Chang CCH. Age and education effects and norms on a cognitive test battery from a population-based cohort: The Monongahela-Youghiogheny Healthy Aging Team. Aging Ment Health. 2010;14:100–7. [PMC free article] [PubMed]
[90] Zapletal E, Le Bozec C, Degoulet P, Guinebretiere J-M, Jaulent M-C. Specifications and implementation of a new exchange format to support computerized consensus in pathology. In: Fieschi M, editor. MEDINFO 2004. IOS Press; Amsterdam: 2004. pp. 693–7. [PubMed]
[91] Araujo CPS, Del Pino MAP, Baez PG, Lopez PF. Clinical web environment to assist the diagnosis of Alzheimer's disease and other dementias. WSEAS Transactions on Computers; 4th WSEAS Conference on Applied Informatics and Communications. 2004. pp. 2083–8.