
Panel on New Directions in Social Demography, Social Epidemiology, and the Sociology of Aging; Committee on Population; Division on Behavioral and Social Sciences and Education; National Research Council; Waite LJ, Plewes TJ, editors. New Directions in the Sociology of Aging. Washington (DC): National Academies Press (US); 2013 Dec 26.


3 Data Needs and Opportunities

The aging of individuals and the aging of populations are two interrelated phenomena. These phenomena, along with the joint aging of institutions and of bodies, have been the subject of multitiered scientific investigations that have brought together many disciplines, sociology among them. The investigations have led to an established knowledge base and a rich portfolio of datasets about older people and the process of aging. The growing understanding of ongoing changes in the social contexts of aging as well as refinements in the conceptualization of aging, particularly the appreciation of dynamic and reciprocal relationships between individuals and social contexts, have combined to create new needs—and opportunities—for data gathering and analysis.

This chapter identifies several key areas in need of further development to enhance measurement and data quality. For example, it is becoming increasingly important to measure interhousehold and intergenerational transmission of resources, but current measures are inadequate to portray those transmissions. Emerging living arrangements make it difficult to distinguish the varying types of parent figures. Similarly, while actual transfers are measured, latent support information is not generally gathered. The new context of intergenerational transmission needs entirely new measures, such as space-time linked measures of neighborhood environments.

This chapter also points to the wedding of biological and sociological measures that is already underway and suggests that the growing study of biomarkers also generates a growing need for new measures, particularly to investigate the mediation of the social environment over the life course. Finally, this chapter identifies opportunities to gather data to take advantage of ongoing studies, linkages to existing databases, and new social media and technologies. Given the increasing prominence of transdisciplinary research, data-gathering efforts should attend to harmonization across different disciplines, as well as across studies, surveys, or countries.


The call for new data by no means dismisses the currently available portfolio of data. Indeed, current data include a wealth of material and are increasingly valuable, as they are repositories of longitudinal records and perspectives, can be augmented with additional research protocols, and can be utilized to examine entirely new questions not anticipated when they were designed.

Among longitudinal data sources, the Health and Retirement Study (HRS) is considered a national gem (see Hauser and Weir, 2010, and Hauser and Willis, 2005). Since its inception in the early 1990s, the HRS has become the largest, most representative longitudinal study of the U.S. population aged 50 and older. Funded by a cooperative agreement between the National Institute on Aging (NIA) and the University of Michigan, the HRS addresses the health and economic well-being of Americans during the latter part of life. It is cross-sectionally representative, with new cohorts added every six years. Because the HRS started with a probability sample, it can be used to generalize to subpopulations or cohorts and for framing the further study of subpopulations.

Scientists from a variety of disciplines, including economics, demography, sociology, psychology, medicine, epidemiology, health services, and survey methodology, have been involved in the design and ongoing revisions of the HRS. Its content now incorporates economic, social, psychological, and physiological measures. Economic content has expanded to include measures of wealth from all sources, consumption, and time use over the life course. Psychosocial content addresses well-being, life style, family ties, social relationships, personality, work life, and self-related beliefs. Physiological data collected by the HRS now include a range of biomarkers on cardiovascular health, performance and anthropometric measures, and biological risk measures, as well as DNA samples for large-scale genotyping.

The HRS itself is of immense research value, permitting the study of relationships between health and cognitive outcomes, genetic profiles, economic status and behaviors, and social relationships over time and across cohorts and ages. It further features potential linkages to other data sources, including Medicare claims, the National Death Index, and administrative earnings and benefits data from Social Security. The HRS also meshes well with other datasets because a number of studies in Europe, Asia, and Latin America were deliberately designed to include measures that overlapped with it.1 The value of the HRS has thus been heightened as it has influenced the development of comparable large-scale longitudinal studies of aging in many other nations.

The HRS is reported to have yielded more than 1,400 papers and publications, the outcome of work by more than 1,000 authors and co-authors and analysis by more than 10,000 registered data users (Hauser and Weir, 2010, p. 113). It has been described as “the model for a network of harmonized international longitudinal studies that monitors work, health, social, psychological, family and economic status, and assesses critical life transitions and trajectories related to retirement, economic security, health and function, social and behavioral function and support systems” (Weir et al., 2011, p. 905).

While HRS is a superb resource, the current portfolio includes other important datasets addressing many aspects of aging. The Wisconsin Longitudinal Study (WLS) began as a 1957 survey of Wisconsin high school seniors to assess the demand for postsecondary schooling. In subsequent data collection cycles, the survey has gradually expanded and shifted focus, moving from an earlier concern with education, career, and family to giving greater attention to health and aging. Siblings, spouses, and survivors of the original cohort of graduates have been added to the sample, and a range of biomarkers and DNA sampling have been added to the data collection protocols. Although no one involved in the original survey could have anticipated this, over several decades the WLS has become a foundation for a major study of the life course and aging (Hauser, 2009; Sewell et al., 2004).

Other datasets crucial to the study of aging include both large data monitoring efforts and in-depth studies of selected topics. Among these valuable datasets are

  • the National Health Interview Survey;
  • the Current Population Survey;
  • the American Community Survey;
  • the National Longitudinal Mortality Study;
  • the Panel Study of Income Dynamics (PSID), a long-running panel with unique multigenerational coverage, recently expanded health content, and a Disability and Time Use “daily diary” module focused on older adults;
  • Midlife in the United States (MIDUS), which includes biosocial data collection and psychological measures;
  • the National Social Life, Health, and Aging Project;
  • the National Longitudinal Surveys of Youth (NLSY), particularly the NLSY79—a nationally representative sample of 12,686 young men and women who were 14-22 years old when they were first surveyed in 1979 and are now in their late 40s and in their 50s; and
  • the National Longitudinal Study of Adolescent Health.

To call for new data on aging is thus not to disparage the excellent foundation for research that has been carefully built. Indeed, the current portfolio of datasets has already permitted significant research beyond what it was originally designed to address. Hauser and Willis (2005), key figures in establishing this portfolio, acknowledge and celebrate this (p. 229):

The contingencies of economic and social change, interacting with those of the life course, imply that no longitudinal survey design will ever anticipate all uses of a study—or guarantee that planned uses of data will pay off. Indeed, we are impressed that the best research based on a given survey often explores questions or tests theories that were not contemplated by the designers of the survey. What we can do is to give our studies enough strength to hold up across time and to keep our eyes open for new opportunities to create data and add to knowledge.


In the process of scientific inquiry, the need for data is necessarily ever-changing. As Seltzer (2011b, p. 1) observes:

Statements about what we know are very useful, but to move forward we must also determine what we still need to know and how we can know it. We collect data to address important research questions, but existing data then constrain our ability to address the new questions we discover.

The changing demographics of the aging population require changing data collections. An increased focus on the “oldest old” and on growing groups such as immigrants and foreign-born citizens is putting pressure on existing data sources. Information on changing causes of death, and on human capital (education and work experience) and labor force participation among older adults, is also in short supply.

The important subject of social engagement of the elderly and their well-being is also undercovered in current collections. Some of these topics require especially innovative approaches to data collection, such as those employed in the study on social volunteering and health (Fried et al., 2004).

Many of the new questions regarding aging have to do with the connections between social contexts and the mechanisms that mediate or translate between the micro (perhaps as small as a code of DNA) and the macro (perhaps as large as the global economy), and between different domains that influence and constitute the social phenomenon of aging. Angel and Settersten (see Chapter 6) affirm the need to study “the friction of human lives in action.” Among the many instances of that friction that call for further data and new measures are intergenerational and interhousehold dynamics and the biological correlates of diverse social contexts over the life course. Transdisciplinary approaches may be particularly fruitful in these research endeavors.

Sometimes new data to shed light on new and emerging trends can be obtained from innovative use of current data collections. An example of an innovative design to capture understudied and “novel” populations might be the Non-Normative Child Supplement to the WLS. In this survey, parents who reported in earlier waves of data collection that at least one child has a major mental health or developmental disability (major depression, ADHD, etc.) are selected to participate in a supplement that captures the distinctive aging experiences of parents whose grown children may have characteristics and traits of special interest.

Measurement of Interhousehold and Intergenerational Transmission of Resources

Without doubt, the demography of family life in the United States has changed, in ways that include increased life expectancy, lower fertility, lengthening of generations, divorce, nonmarital childbearing, multipartner fertility, social parents when one biological parent cohabits, and grandparents who themselves separate and repartner. “These demographic changes,” asserts Seltzer (2011b, p. 1), “alter what a family is, and may change what a family does.” With the aging of cohorts whose lives have been affected by these demographic changes, new kinds of data are needed to understand the ramifications of these changes not only for individual welfare across the life course, but also for the transmission of health and resources across generations, the heterogeneity of risk, and the maintenance and exacerbation of disparities in the larger society.

When, why, and how do adult children and parents help each other? Will older adults exchange aid (financial or in-kind transfers) with adult stepchildren to the same degree as with biological children, or as with biological children raised in other households? How does the quality of the relationship with each child in a family structure affect the dynamic of intertransfers? What levels of care will adult children provide biological but absent parents, stepparents, or parent figures acquired through cohabitation? On whom can individuals potentially depend for support? How secure is the family safety net when it includes new types of relationships? Who will provide care for those without children?

Current measures are often inadequate to these questions. The proliferation of parent figures can be challenging for surveys to capture or even code. Many people experience living arrangements that are difficult to describe using standard household enumeration methods. The terminology—such as “former stepchildren”—can be difficult to parse. Intergenerational transfers of resources require data collection across households, tracking adult children who live elsewhere. Many of these issues were not concerns when primary datasets were originally designed. For example, the HRS relies on one spouse/partner to report on his or her own and the partner's children—an increasingly difficult task. The intensity of help is also not measured. The PSID does not follow stepchildren. Most major national surveys address actual transfers but not latent support, making it harder to study intergenerational impact on risk. (On survey shortcomings regarding intergenerational transfers, see Bianchi, 2011.)

Concerns about intergenerational transfers encompass not only internal family structures, but also local environments, such as neighborhoods. Recent studies show that the neighborhood environment in one generation may have a lingering impact on the next generation, with results confirming a powerful link between neighborhoods and cognitive ability that extends across generations (Sharkey and Elwert, 2011). Thus, not only the private resources in a family, however defined, but also the impact of context (in this instance, the neighborhood of residence) are passed across generations. This intergenerational transmission of context clearly calls for new measures and data, such as space-time linked measures of neighborhood environments, to understand the maintenance and reproduction of inequality, the need for care among aging parents, and the capacity to provide care by adult children.

Contexts other than neighborhoods are also relevant to “the friction of human lives in action.” In Networks, Neighborhoods, and Institutions: An Integrated “Activity Space” Approach for Research on Aging (see Chapter 8), Cagney et al. encourage attention to activity space, or those places in which people engage in routine activities. This may include not only residence and neighborhood, but also workplaces, recreational sites, stores, clinics, streets, both formal and informal institutional settings, and any site of unstructured yet patterned interaction. The emphasis in this approach is on integrated, simultaneous, and interactive effects of multiple geographic and institutional settings. This too calls attention to the inadequacy of current measures. Cagney et al. point out, for example, that neighborhood is often defined by convenience, according to predefined administrative boundaries or census tract, rather than by a spatial demarcation relevant to the inhabitants' lives. As aging changes and the conceptualization of aging changes, a common concern across researchers is that investigation of aging populations has focused too much on individual-level characteristics and now needs to acquire more data on the friction and flows among individuals, households, generations, and other social contexts and networks.

Measurement of Biomarkers and Biosocial Interaction

The need for new measures, already being addressed but nonetheless ever-growing, is also evident in the area of biomarkers, particularly to investigate the mediation of social environment on physiology over the life course. The wedding of biological and sociological measures has been under way and is changing the landscape in basic ways. As Butz and Torrey (2006, p. 1,899) observe, “No field outside the social sciences is having as much of an impact within the field as biology.” Datasets that survey economic, social, and health phenomena increasingly incorporate biomarkers collected from survey respondents. This has included measures of the cardiovascular system, metabolic activity, immune systems, and anthropometrics, as well as the collection of samples for genetic analysis.

A primary instigation for new measurements is to gather the biological correlates of socioeconomic status (SES), or, as Gruenewald describes it in Opportunities and Challenges in the Study of Biosocial Dynamics in Healthy Aging (see Chapter 10), the pathways by which social environment and experience get “under the skin.” Bachrach and Abeles (2004, p. 22) note that “scientists have extensively documented the relationship of socioeconomic status to health but are barely beginning to understand the processes generating the relationship.” Indeed, some lament that the exponential increase in biomarker collection has outpaced theorizing about biosocial interactions (see Weinstein, Glei, and Goldman, Chapter 11; Schnittker, Chapter 13).

Research on biomarkers to identify the biological processes that underlie links between social factors and health is under way but much remains to be done.2 A number of questions relate biomarkers and the aging process. Some of these concern the impact of SES over the life course, particularly the timeframe when SES becomes embodied. Several different life course frameworks have been proposed to explain the role of biosocial processes in healthy aging. The impact of SES may be greatest at a sensitive period, such that events and environments at certain life course phases may have a stronger impact than those at other life phases. Or perhaps a critical period is the most essential, if experiences and exposures during a specific narrow window of development permanently “program” bodily systems, which will be relatively unaffected by subsequent exposure to risk—or to security and abundance. A third framework posits an accumulation of risk, or “chain of insults,” accruing over time. And the social context, especially SES, may moderate the relationship between genes and various outcomes (Caspi et al., 2003). Adjudicating between these different models, or perhaps constructing a hybrid model, calls for much new data. A serious shortcoming of current investigation is the lack of biomarkers from earlier life periods in samples of older persons.
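The competing life course frameworks described above can be contrasted in a simple stylized form. The following is only a sketch under stated assumptions: the linear specifications and all symbols are hypothetical notation introduced here for illustration and do not come from the chapters cited. Let H_i be a later-life health outcome for person i, s_{it} a measure of SES at life phase t, \bar{s}_i a summary SES measure, and g_i a genotype indicator.

```latex
% Hypothetical notation (not from the report):
%   H_i      later-life health outcome for person i
%   s_{it}   SES exposure of person i at life phase t = 1, ..., T
%   \bar{s}_i  summary SES measure; g_i genotype indicator
%   t^*      the sensitive or critical life phase
\begin{align*}
\text{Accumulation of risk:}\quad
  & H_i = \alpha + \beta \sum_{t=1}^{T} s_{it} + \varepsilon_i \\
\text{Sensitive period:}\quad
  & H_i = \alpha + \sum_{t=1}^{T} \beta_t\, s_{it} + \varepsilon_i,
    \qquad |\beta_{t^*}| > |\beta_t| \ \text{for } t \neq t^* \\
\text{Critical period:}\quad
  & H_i = \alpha + \beta_{t^*}\, s_{it^*} + \varepsilon_i
    \qquad (\beta_t = 0 \ \text{for } t \neq t^*) \\
\text{Gene--SES moderation:}\quad
  & H_i = \alpha + \beta\, \bar{s}_i + \gamma\, g_i
    + \delta\, \bar{s}_i\, g_i + \varepsilon_i
\end{align*}
```

In this stylized form the accumulation model is the special case in which every \beta_t equals a common \beta, and the critical period model is the special case in which only \beta_{t^*} is nonzero. Distinguishing among them therefore requires s_{it} (and, ideally, biomarkers) to be observed at each life phase, which is why the absence of early-life measures in samples of older persons is so limiting.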

In the view of Gruenewald (see Chapter 10), “Exciting opportunities are on the horizon for identifying how SES experiences at different phases of the life course get under the skin to affect a wide array of biological processes and how SES disparities in biological functioning might track across the life time.” Gruenewald goes on to explain [italics in original]:

These biological imprints of social adversity in childhood may provide clues as to the trajectories of healthy or unhealthy aging that lie ahead. Longitudinal, life course investigations that concurrently measure social and biological factors from childhood to adulthood will be particularly fruitful for understanding when social adversity is embodied for different systems, the permanency of biological imprints, and the genetic, psychosocial and behavioral modifiers of these links.

Some researchers go further, citing the need for collection of biomarkers not only across an individual's lifetime but also across generations in order to understand and appreciate the impact of SES across the life course (see Kuzawa, 2005). To Shanahan (see Chapter 12), the next big opportunity in biosocial investigation may be population-based social genomics, particularly gene expression association with stress response (including defensive or inflammatory responses). Social genomics studies the mechanisms by which social experiences regulate genetic activity, specifically the rate at which information contained in DNA is transcribed into mRNA. Shanahan (see Chapter 12) sees social genomics as a most “promising subfield” of research on aging because:

By establishing these mediational links between social experiences and transcriptional activity, scientists can begin to understand how social experiences, like those associated with socioeconomic status, affect physical and mental health. Thus, a major payoff of social genomics for demography and social epidemiology is the strengthening of causal claims by specifying the mechanisms that link social experience with behavior and health.

Population-based social genomics necessarily involves expanding the collection of biomarkers to include samples for genetic analysis in many different phases of the life course. Data collection from before, during, and after periods that might be sensitive or critical would be necessary for adjudicating different frameworks regarding the impact of SES and social relationships across the life course. Different phases of the life course may also be sensitive with respect to different aspects of the regulatory system. Evidence also suggests intergenerational transmission of gene expression patterns in response to social experiences over longer time scales, such as decades or generations, in which physiological changes adjust to more gradual changes in the environment (Kuzawa, 2005).

Initiatives in biosocial investigation involve not only enhancing the range of available biomarkers but also refining measures of SES and social risk that may correlate with these biological measures and moderate their relationship with health or other outcomes later in life. This particularly includes measures of chronic stressors, those features of social context “that are graded by SES and heighten a sense of vigilance, mistrust, and threat and that diminish self-regulatory capacities” (see Shanahan, Chapter 12). For example, work on specific aspects of the social environment shows that social isolation or loneliness is a trigger for proinflammatory gene expression in certain individuals. Discerning a relationship among environment, physiology, and behavior, as well as the implications of that interaction for healthy aging, requires careful measurement in all these domains—although, as Shanahan (Chapter 12) underscores, “it may be that stressor-inflammatory symptom associations are characterized by multifinality (the same causal agents leading to different outcomes), equifinality (diverse causal agents leading to the same outcome), or both—types of complexity that are often not considered in empirical research.”

The process of defining and implementing these new measures will winnow them. Some debates concern which measures are most appropriate to particular phenomena. Others challenge the mechanisms they purport to mediate. Weinstein, Glei, and Goldman (see Chapter 11) note the enduring doubts over whether the relationship between SES and health is as universal as suggested; other doubts concern what counts as evidence for stress or allostatic load. Another set of concerns addresses the strength of the evidence regarding the extent to which biomarker data are useful in mediating the relationship between SES and health status—which Weinstein, Glei, and Goldman consider an open question. These issues do not foreclose this line of research; rather, they call for careful measures, the development of and attention to theory, and a full assessment of null and inconsistent results.

Other concerns raised by Schnittker (see Chapter 13) suggest that while social genomic research demonstrates the relevance of SES for gene expression, this research reveals little regarding what it is about SES that matters. This too calls not only for further development of conceptualization and measures of SES, but also for further thinking about causation and correlation. Tracing biological pathways is thus part of the whole picture of research on aging, but not a final answer. In the view of Schnittker (Chapter 13):

One risk in demanding a biological mechanism lies in prematurely foreclosing on a generative sense of uncertainty, especially when mediating pathways are dynamic.… [I]n general, socioeconomic status is linked to health through a variety of proximate mechanisms that change over time and, therefore, will involve different biological pathways at different times.… Provisionality, in this sense, is part of the effect itself, not a reflection of scientific naiveté.

As biosocial investigations of aging continue, new measures of both SES and physiology are needed. Several reviews (Seeman and McEwen, 1996; Uchino et al., 1996) have concluded that social support is reliably related to beneficial effects on cardiovascular, endocrine, and immune systems. Most studies, however, looked only at the positive aspects of social relationships, neglecting the negative impact of much social interaction or the burden of some social ties. In many studies, physiological processes were unidimensional, with cardiovascular reactivity, for example, measured as high or low, although it is a joint function of sympathetic and vagal activity.

Insights from sociology and social epidemiology of aging could contribute to medical research and practice, and to research in biology. For example, a current major focus of research in inpatient care is understanding how to improve functional outcomes. Biomarkers for poor functional outcomes are being actively sought for severe sepsis, acute lung injury, and stroke. The key outcome of trials of therapeutic hypothermia and early interventions for ischemic stroke is good functional outcome (Iwashyna and Netzer, 2012, p. 329). Biological models of these diseases could be enriched and improved by the inclusion of recent approaches to physical frailty (Freedman, 2011) and physical functioning as reflected in the International Classification of Functioning. Recent work on social environments during childhood, cited in this report, also offers the possibility of improving biological models at this important stage of development. Indeed, scholars in the sociology of aging have much to offer medical interventionalists and clinical epidemiologists on conceptualizing and measuring these key outcomes.

Recommendation 3. The National Institute on Aging should manage its research program in a manner that promotes implementation of models and metrics in the areas of:

  • Encouraging conceptualization of family relationships and exchanges, including behavior, expectations, plans, and attitudes, and development of measures of key family concepts, because the family is a fundamentally important social context for aging.
  • Encouraging conceptualization of characteristics of contexts and local areas through development of new measures, expanded links to existing data, and ongoing incorporation of new technologies, such as global positioning systems, smartphones, and area mapping.
  • Building on recent advances in the use of biomeasures and linking them to social and behavioral processes through conceptualization of the potential links, attention to measurement tools and analysis techniques, identification of “best practices” in developing the linkages, and the development of protocols for cross-training of researchers in sociology with other relevant disciplines, such as genetics, economics, medicine, and biology.
  • Illuminating the mechanisms that mediate between social contexts and health, both across the life course and over generations.


The need for new data can be met in a variety of ways. Many different study designs are available. Some approaches simply augment existing studies, adding scope or depth to those ongoing surveys. Other opportunities include subsampling for populations of interest and linking the survey data to auxiliary data within the contexts of the ongoing surveys. Still other opportunities involve sophisticated sample design strategies for hypothesis-driven samples, exploring new data collection techniques such as Ecological Momentary Assessments (EMAs), and agent-based models. The most challenging new frontier is found in mining the resources of the Internet and exploring the use of search technologies to reveal human behaviors for populations of interest. Transdisciplinary efforts (see Chapter 4) are also promising avenues.

Augmenting Existing Surveys

One way to address new data needs is to add new instruments, measures, and respondents to existing surveys. As an example of overlaying new collection instruments onto existing collections, several NIA-funded surveys in recent years have added the collection of daily diary data, which capture momentary reports of physical and mental health (e.g., National Study of Daily Experiences in MIDUS, Disability and Use of Time in PSID).

As noted above, many current surveys were designed before intergenerational issues came to the forefront of aging research. More survey questions on intergenerational transfers of resources could be added to the next wave of ongoing longitudinal surveys. These might address new topics, e.g., types and intensity of nonfinancial assistance, or perceived potential sources of (or responsibilities to provide) care. They might also address an expanded range of relationships, for example, adult children of cohabiting partners, or former stepchildren. (Such information about family relationships might be successfully collected with use of retrospective questions.) They might also include much more attention to the negative aspects of social relationships, including those characterized by strain or conflict, or those fostering unhealthy behaviors.

Existing social surveys can also be augmented with the addition of biomarkers. The development of new technologies, such as dried blood spots and a growing number of point-of-service meters, makes it more feasible to collect a range of useful biomarkers. These may be less invasive for respondents, easier for staff to collect, and require less sophisticated handling of samples.

The important question of intergenerational transfers could be explicated by additional information collected in the process of conducting existing surveys. This might include, for example, intergenerational relationships, such as the children and grandchildren of respondents in the WLS, or the stepchildren of respondents in the PSID, or the parents of respondents in the National Longitudinal Survey of Adult Health as part of the HRS. It might even include relationships within many other types of social networks—for example, the neighbors or colleagues or teammates of established respondents, to explore other aspects of the quality and structures of social networks and their impact on healthy aging.

Surveys could also be combined in various ways with other methods of obtaining data and evaluating evidence. Large, nationally representative survey samples might be combined with randomized clinical trials or interventions, for example. Each approach has strengths and weaknesses that the other could offset. Clinical trials have the strength of random assignment, which, in theory, equalizes all the unmeasured characteristics of subjects across treatment and control groups. Surveys generally have the strength that respondents represent the population at large, which means that findings from survey data characterize the population as a whole. Clinical trials, however, typically represent not the general population but more restricted groups, such as those who volunteer or those who visit particular clinics or health-care providers. And survey respondents who have had some treatment or experience have not been randomly assigned to it. Randomly assigning treatments to subjects within a survey data collection would yield samples that represent a population that can be characterized and results that can be generalized to that population.
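The logic of embedding random assignment in a representative survey can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than a procedure described in this report: the record fields (`id`, `weight`, `outcome`), the function names, and the simple half-and-half assignment are all assumptions. Random assignment supplies comparable treatment and control groups, and the survey weights carry the comparison back to the population the sample represents.

```python
import random


def assign_treatment(respondent_ids, seed=0):
    """Randomly assign half of the survey respondents to treatment.

    Hypothetical helper: returns {respondent_id: True if treated}.
    """
    rng = random.Random(seed)      # fixed seed so the assignment is reproducible
    ids = list(respondent_ids)
    rng.shuffle(ids)
    treated = set(ids[: len(ids) // 2])
    return {rid: (rid in treated) for rid in respondent_ids}


def weighted_mean(values, weights):
    """Survey-weighted mean of a list of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_effect(records, assignment):
    """Difference in survey-weighted mean outcomes, treated minus control.

    Each record is a dict with assumed keys 'id', 'weight', and 'outcome'.
    """
    groups = {True: ([], []), False: ([], [])}
    for rec in records:
        values, weights = groups[assignment[rec["id"]]]
        values.append(rec["outcome"])
        weights.append(rec["weight"])
    return weighted_mean(*groups[True]) - weighted_mean(*groups[False])
```

A real design would also need to handle stratification, clustering, nonresponse, and variance estimation, none of which this sketch attempts.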

Clinical trials could also be enriched by the addition of key social concepts and measures to the data collected about patients. For example, the efficacy of interventions to improve functional status, or to avoid rehospitalization, may depend on the social support and social environment into which patients are released. Measures of the social support available to patients, characteristics of the households and neighborhoods in which they will recover (Cagney et al., see Chapter 8), and demands or conflicts they may face could add a great deal, bringing the voice of the sociological study of aging into those conversations to inform this research.


A different way to take advantage of existing population surveys is to use them to define representative subsamples on which to conduct much deeper, more detailed investigation. The broad aims of many population-based surveys require data collection across a great number of domains. That scope necessarily competes with depth of coverage. By defining subsamples within the surveyed population, these supplemental substudies could devote resources to more detailed investigation and more thorough social, psychological, and physiological assessments.

The sociological study of aging also has much to contribute to designing and interpreting key outcomes in clinical trials and patient-level interventions and to their evaluation. It can and should contribute appropriate phenotypic characterization of patient outcomes for understanding disease processes and nonmortality outcomes (Iwashyna and Netzer, 2012).

Necessary resources might include more sophisticated training and greater time for staff to conduct in-person interviews, as well as the use of an array of other techniques for assessing social and psychological factors. Fuller attention could be given, in smaller studies, to mapping social networks and exploring both negative and positive aspects of social interactions. They could also investigate intergenerational transfers more extensively.

Subsamples could also be defined for further physiological or clinical research. While many useful protocols are community-based and can be conducted in a broad population survey, others must necessarily be confined to a clinic or in-home clinical assessment. These permit significantly more detailed and in-depth physiological and neuropsychological assessments and the collection of a greater range of biomarkers from respondents than generally done in surveys.

A prominent example of such a substudy is the Aging, Demographics, and Memory Study (ADAMS), a population-based study of dementia funded by NIA. In ADAMS, a selected subsample of HRS respondents received extensive in-home clinical and neuropsychological assessments of their cognitive status. While community-based studies have focused on particular geographical areas or on nationally distributed samples that are not representative of the population as a whole, ADAMS is the first study of its kind to undertake in-person assessments of dementia in a national sample that is representative of the U.S. elderly population.3

In addition to the wealth of information about the members of representative subsamples, substudies also offer opportunities to validate and explore the significance of findings from the broad samples from which they are drawn. For example, challenge tests performed in clinics may provide perspective on the seated, resting blood pressure measurements performed in the larger survey, and detailed assessments of insulin resistance may help in the interpretation of data on glycosylated hemoglobin from the larger sample population.

Moreover, by sampling from the larger initial population-based sample in known ways, more informed generalizations can be made about the larger cohort and about the population from which it was originally drawn.
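One common way to generalize from a subsample drawn "in known ways" is inverse-probability weighting. The sketch below is illustrative only: the record layout, the selection probabilities, and the outcome values are hypothetical and are not taken from ADAMS or any other substudy.

```python
from dataclasses import dataclass


@dataclass
class SubstudyRecord:
    outcome: float         # e.g., 1.0 if the respondent meets a clinical criterion
    selection_prob: float  # known probability of selection into the substudy


def weighted_mean(records):
    """Estimate the parent-sample mean using inverse-probability weights."""
    weights = [1.0 / r.selection_prob for r in records]
    total = sum(w * r.outcome for w, r in zip(weights, records))
    return total / sum(weights)


# Hypothetical substudy: one stratum oversampled (prob 0.5), another sampled at 0.1.
records = [
    SubstudyRecord(1.0, 0.5), SubstudyRecord(1.0, 0.5), SubstudyRecord(0.0, 0.5),
    SubstudyRecord(0.0, 0.1), SubstudyRecord(0.0, 0.1), SubstudyRecord(1.0, 0.1),
]

estimate = weighted_mean(records)
unweighted = sum(r.outcome for r in records) / len(records)
print(f"weighted: {estimate:.3f}, unweighted: {unweighted:.3f}")
```

Because the oversampled stratum counts for less in the weighted estimate, the two figures differ. The gap is the bias that would arise from ignoring the sampling design.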

Linkages with Auxiliary Data

A further way to extract yet more value from the current portfolio of population-based datasets is to link them with administrative data. Such linkages can vastly enhance the value of existing observations without increasing the burden on either respondents or survey staff. Cross-linking of data can make any dataset much more valuable to investigators by expanding the range of factors that can be analyzed or controlled for. Cross-linking may also increase risks of disclosure, and such risks must be addressed. Possible useful linkages include linking survey data to the administrative records of Medicare (as pioneered with HRS data),4 to Social Security administrative data, to private employer pension plans, and to the National Death Index. The forthcoming availability of the 1940 Census of Population records affords a new avenue for studying ancestry and the impact of locality on social phenomena. The explosive growth of electronic medical records offers another source of potential record linkages, though, as with administrative data from Medicare and Social Security records, there are significant privacy concerns to take into account.

Exploring New and Emerging Data Sources

In addition to extending the usefulness of survey data by increasing cross-linkages with existing administrative data sources, data providers and other organizations are profitably experimenting with techniques that mine the resources of the Internet through Web searches, Twitter messages, Facebook, and blog posts. These sources are used to generate massive databases of big data that, if appropriately queried, could reveal patterns that illuminate sociological laws of human behavior. There is a growing variety of applications of big data techniques. Government agencies from the Intelligence Advanced Research Projects Activity to the Census Bureau have taken an interest in using Web search queries, Internet commerce postings, Internet traffic flow, financial market indicators, traffic Webcams, changes in Wikipedia entries, and the like to detect patterns in communication, consumption, and population movement.

Other Web-enabled techniques are gaining traction. For example, crowd-sourcing techniques distribute tasks and problems to groups of people via the Web and assemble their input. These techniques have promise to increase understanding of human dynamics in an environment that is conducive to providing quick insights, though the anonymity of the Web currently limits the reliability of the observations. Like big data applications, however, crowd-sourcing techniques require a great deal more research to validate and generalize the data that are thus developed.

Alternative Sampling Strategies for Hypothesis-Driven Studies

Another design approach is to devise specific sampling strategies for hypothesis-driven studies. For example, workplace-based samples might be chosen so as to investigate the impact of the intersection of work characteristics and workplace policies on many aspects of the experience of aging. Questions to be addressed might include whether retirement is feasible, when it occurs, and whether it leads to subsequent employment or other commitments. Other questions in hypothesis-driven studies might concern issues such as postemployment health, income security, social satisfaction, capacity to assist others, or the strength and content of social networks.

Ecological Momentary Assessments

Ecological momentary assessments (EMAs) and other data collection techniques that gather geographic information can also be used to obtain the new measures needed in research on aging. EMAs involve repeatedly sampling respondents' current experience and behaviors in real time, in their natural environments. Respondents provide data entries at periodic intervals or random time samples, using a range of technologies, including cellphones, global positioning system devices, electronic diaries, and physiological sensors, such as ActiGraph accelerometer activity monitoring devices. Data on respondents' affective, behavioral, and contextual experiences, as well as physiological status and reactions, can be recorded as close in time to these experiences as possible. Because they permit real-time data capture, EMAs eliminate the recall bias that can occur in retrospective summary self-reports. Because they occur in real-world environments, EMAs maximize ecological validity. EMAs also facilitate measurements requiring temporal and spatial precision.
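The "random time samples" mentioned above can be generated in several ways. The following is a minimal sketch of one possibility, a stratified random schedule that draws one prompt uniformly at random within each equal-length block of the respondent's waking day. The function name, the 9:00 to 21:00 window, and the six prompts per day are assumptions chosen for illustration, not features of any particular EMA protocol.

```python
import random
from datetime import datetime, timedelta


def random_prompt_times(day_start, day_end, n_prompts, rng):
    """Draw one random prompt time inside each of n_prompts equal-length blocks."""
    block = (day_end - day_start) / n_prompts
    times = []
    for i in range(n_prompts):
        # Uniform offset within block i, so prompts spread across the whole day.
        offset = rng.random() * block.total_seconds()
        times.append(day_start + i * block + timedelta(seconds=offset))
    return times


rng = random.Random(2013)            # fixed seed so the schedule is reproducible
start = datetime(2013, 4, 1, 9, 0)   # hypothetical waking window: 09:00 to 21:00
end = datetime(2013, 4, 1, 21, 0)
prompts = random_prompt_times(start, end, 6, rng)
for t in prompts:
    print(t.strftime("%H:%M"))
```

Stratifying by block keeps prompts unpredictable to the respondent while preventing them from clustering in one part of the day.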

In addition, EMAs enable tracking relations between individuals. For example, members of a family may record reactions to an event in parallel, or they may record reactions to each other. EMAs also allow the study of microprocesses and contextual associations that influence behavior in real-world contexts, including health behaviors or response to stressors. Other advantages of EMAs are improvement in compliance via electronic data capture, elimination of recall problems among the cognitively impaired, ease of implementation, and validation of global measures. (For more on EMAs, see Cain et al., 2009; Shiffman et al., 2008.)

Furthermore, EMAs may be an important part of study designs that investigate activity space. While standard survey-based approaches often include abstract categories of institutional participation (such as church attendance, club membership), they lack the means to observe actual social and spatial exposures. Similarly, assessments of neighborhood characteristics may note the existence of facilities or institutions and estimate the impact of institutional density on aging-related outcomes, but they cannot assess actual participation in local institutions. EMAs and other means of measuring activity space make it more possible to study exposure to a range of institutions and settings. Cagney et al. (see Chapter 8) note the critical role of “overlapping routines and casual contact among neighborhood residents in developing public trust. In the absence of activity space information, the potential for such contact characterizing a given residential area is difficult to observe.”5

Agent-Based Models

Agent-based models (ABMs) may also be useful in research on aging. These models combine tools from artificial intelligence with theories of social process and behavior. They construct “agents” (or avatars) that are autonomous, interdependent, adaptive rule followers. Such agents are capable of strategic goal-oriented actions across time and are able to adapt by, for example, adjusting the timing or sequencing of key transitions. The models also generate settings of social time and space. In computerized thought experiments, agents are placed within these settings and become socially integrated through networks of interactions. ABMs thus allow a test of possible mechanisms that produce different pattern outcomes. They can illuminate such fundamental processes as the emergence of social structure, unintended consequences of policy interventions, or the social ramifications of rapid technological or environmental change. With due regard for the limitations of experimental designs (including questions of external validity), ABMs are one tool for examining how the micro level is linked to the macro level in ways far more complex and dynamic than aggregation. (For more on ABMs, see Hardy and Skirbekk, Chapter 7, and Macy and Miller, 2002.)
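To make the idea concrete, the toy model below places agents on a ring-shaped network and lets each one repeatedly adjust a planned retirement age toward the plans of its two neighbors. It is a deliberately minimal sketch, not a model from the literature cited here: the ring structure, the adaptation rule, the 55 to 75 age range, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


class Agent:
    """An adaptive rule follower with a planned age for one key transition."""

    def __init__(self, planned_age):
        self.planned_age = planned_age
        self.neighbors = []

    def next_plan(self, adapt_rate):
        # Move part of the way toward the average plan of network neighbors.
        peer_mean = statistics.fmean([n.planned_age for n in self.neighbors])
        return (1 - adapt_rate) * self.planned_age + adapt_rate * peer_mean


def build_ring(n_agents, rng):
    agents = [Agent(rng.uniform(55, 75)) for _ in range(n_agents)]
    for i, agent in enumerate(agents):
        agent.neighbors = [agents[i - 1], agents[(i + 1) % n_agents]]
    return agents


def step(agents, adapt_rate):
    # Synchronous update: all agents react to the same snapshot of plans.
    new_plans = [a.next_plan(adapt_rate) for a in agents]
    for agent, plan in zip(agents, new_plans):
        agent.planned_age = plan


def spread(agents):
    return statistics.pstdev([a.planned_age for a in agents])


rng = random.Random(7)
agents = build_ring(50, rng)
initial_spread = spread(agents)
initial_mean = statistics.fmean([a.planned_age for a in agents])
for _ in range(200):
    step(agents, adapt_rate=0.5)
final_spread = spread(agents)
final_mean = statistics.fmean([a.planned_age for a in agents])
print(f"spread of planned ages: {initial_spread:.2f} -> {final_spread:.2f}")
```

No agent aims at a shared norm, yet the spread of planned ages narrows over time. The narrowing is a macro-level pattern produced only by local interaction, which is the kind of micro-to-macro link that ABMs are used to examine.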

Recommendation 4. The National Institute on Aging should seek opportunities to:

  • Supplement ongoing surveys by addition of respondents related to members of current samples.
  • Identify subpopulations of interest and ensure their representation in sponsored surveys.
  • Explore the use of the Internet and other electronic media to gather information on patterns of human interaction, consumption, and behavior.


With the number and range of studies being conducted on aging, the need for international comparisons, and the involvement of researchers from many different disciplines, harmonization of efforts is crucial. Comparability enhances replicability, aggregation, validation, data linkages, and tests of universality. It also makes collaboration across disciplines and genuinely transdisciplinary research more feasible. There are reasonable limits to the goal of harmonization. As Hauser and Weir note (2010, p. S125), congruence and overlap in studies are very helpful, but measures and designs should “not be so similar that the same gaps or problems could occur in every study.” Furthermore, the most significant variables seem to retain their significance across the variety of measures used to capture them. This appears to be the case, for example, for social class and health. As Syme and King note (see Chapter 14), “Social class has been measured in many different ways in different population groups, but the same result emerges in virtually all of these studies—those from reduced social circumstances have higher rates of virtually all diseases.” These admonitions do not lessen the value of all that can be gained by harmonization across surveys and settings and among researchers.

Cross-Survey Comparability

Cross-survey comparability of measures is of obvious importance. Comparability of measures may be particularly challenging when those measures are meant to capture something that is itself in flux. The concept of parent, for example, has gained some fluidity during recent cultural and demographic change. The term may be deemed to apply to biological parents, stepparents, social parents acquired through a biological parent's cohabitation with another adult, and perhaps other parent figures. Household resident is another term open to different interpretations by both data collectors and respondents. Surveys have used a variety of rules to determine who is a household resident, with some resulting confusion. The 2000 decennial census, for example, had an overcount of children around age 10, which analysts attribute to the different definitions and interpretations of household residents in situations of joint custody (Seltzer, 2011a, p. 3).

Other measures would seem less susceptible to variable interpretation or cultural change. Geographic proximity might seem straightforward to measure, and it matters in aging research for studying, among other things, intergenerational transfers of resources, such as the ability of a grandparent to provide child care to a grandchild or of an adult child to help a parent with errands and housekeeping. Yet there is great variation across surveys in how proximity is measured, with some asking for distance in miles and others asking for travel time.

Response categories can also vary, making it problematic to compare measures even when the same metric is used. Comparisons and aggregations are compromised by such variation. Comparisons can also be affected by subtle aspects of survey administration such as mode of interview, order of questions, and interviewer-respondent interaction that are above and beyond the precise wording of a particular measure.

Other issues may arise in the comparability of biomarkers. Here, not only is the definition of the measure a concern, but also the cross-study reliability of correct implementation of complex protocols and the cross-study validity of assessments, particularly when complex assays of samples are involved.

International Standardization

Comparable national data are essential for comparing findings on aging around the globe. Although the aging of populations is occurring throughout the world, it is occurring at different tempos and exhibits very different demographic profiles (see Hardy and Skirbekk, Chapter 7). Data analyses that make national, regional, and international trends more evident, and projections more valid, require comparable data.

Internationally comparable data are also critical for distinguishing between local and universal phenomena. The connection between SES and health, for example, may differ substantially depending on social policies, safety nets, degrees of social inequality, and life course exposures. International investigations of the SES-health nexus require comparable measures as they observe patterns and attempt to trace causal connections.

Some progress has been made in recent years in harmonizing international data on aging through extending HRS-type data collections in several countries, but there are additional opportunities in this area. The initiatives in Asia along these lines are discussed in a recent National Research Council report (National Research Council, 2012a).

Assessments of progress on international harmonization vary. Hauser and Weir (2010, p. S111) report considerable progress: “There is now a domestic and international cohort of longitudinal studies of aging that one might better describe in terms of a network of investigators and samples than as a collection of discrete projects. Collaboration, cooperation, and harmonization are guiding principles.” They cite a roster of international studies, all influenced by and comparable to HRS.

Despite this overall progress, there are areas in which comparisons are less well developed. Weinstein, Glei, and Goldman (see Chapter 11) are more cautious in their assessment, particularly with regard to the new measures that integrate biological and sociological data. They remark on the imperative of international comparisons of social gradients in health that integrate biological mechanisms, but they cite researchers in this field who lament that “such undertakings are generally unable to establish whether divergent findings reflect true variability in the physiological pathways linking SES to health across countries, regions, and time periods; differences across datasets in measurement error or definitions of biomarkers, SES and health outcomes; differences in analytic strategies; or differences in sample size.”6

Crossing Disciplines

Transdisciplinary research on aging seems to be both an already-established enterprise and a serious ongoing challenge (see Chapter 4). Major longitudinal studies have been designed and conducted with the participation of researchers with expertise in a great variety of disciplines, and they are overseen by interdisciplinary groups. Regarding the development of the HRS, Hauser and Willis (2005, p. 214) recall that when selection of measures required adjudication at a higher level, “these decisions were based on the potential of a given measure to contribute to the goals and theoretical frameworks of the HRS rather than their contribution to particular disciplinary concerns.”

Again, the relationship between SES and health points to the importance of harmonization. As Hauser and Willis (2005, p. 217) admit:

In the past, surveys with good health measures have had poor economic measures, and vice versa, and surveys with both measures have been cross-sectional. This has hampered serious investigation. The availability of longitudinal observations with extensive and detailed data on both health and socioeconomic status has stimulated analysis and discussion by leaders in several disciplines of the interactions between health and social and economic standing across the life course.

The demand for more high-quality data on both social and health dimensions, including biomarkers, is growing at a great rate, as is the need to expand the range of feasible data collection protocols and assays. As discussed above, despite considerable advances, large population surveys continue to suffer from a relative dearth of biomarkers. Social genomics and other areas of investigation require such data, but the data alone are not sufficient. Advances in the use of biomarkers require transdisciplinary teams to design studies and analyze findings. Intellectual exchange and cooperation within such teams involve many opportunities as well as challenges. As Shanahan (see Chapter 12) asserts, “Population-based studies of health have traditionally had an admirably interdisciplinary quality. As the models that describe connections between social, psychological, and biological levels of analysis becomes increasingly complex, greater attention should be paid to how such teams are organized and encouraged.”


Emerging needs for new data and measures are being identified. Opportunities are available for adding to the database on aging. Trends in harmonization facilitate gains from research. Yet there are obstacles to be addressed. These include some of the unique challenges of collecting data from older adults, especially the frail elderly. There are concerns over response burden, challenges in doing telephone interviews with the hearing impaired and those with cognitive functioning difficulties, problems of institutionalization, issues with concerned children who serve as “gate keepers” in turning away potential interviewers, and vigilance among institutional review board (IRB) panels who view elderly populations as vulnerable. Other obstacles are also important. Two of them, managing the costs of new data-gathering initiatives and re-evaluating the restrictions on data usage, are discussed at greater length here.


Although it is clear that important scientific questions are motivating new strategies for data collection and measurement, resources for research are finite and must be allocated carefully. The agenda for research on aging is broad, and the need for new data can appear daunting. But there are also ways of limiting and managing costs.

One way to limit the costs of new data collection is to augment ongoing efforts. Discrete measures could be added to current population-based surveys, or subsamples could be drawn from existing samples for in-depth, hypothesis-driven studies. Such additions could take full advantage of the existing infrastructure of ongoing biosocial surveys, increasing their value for research without incurring the costs of launching an entirely new research endeavor.

Other possible data collection efforts are unavoidably costly. These include in-depth social and physiological assessments. The comprehensive measurement of social factors (such as life experience interviews or daily assessments of experiences and behaviors) can be quite invasive and can require detailed training and instruction to collect accurately. Biomarkers can be very expensive to gather and may entail conducting in-person interviews to acquire informed consent for obtaining biospecimens; providing equipment, training, and appropriate settings for collection; validating proper implementation; and ensuring subsequent secure storage and processing. Although the cost of these measures could be contained by limiting their implementation to selected subsamples of population-based surveys, as discussed above, it would nonetheless be considerable.

Of course, nothing is as costly as doing a job poorly. Regarding the vexed efforts to ascertain the exact relationship between SES and health over the life course, Gruenewald (see Chapter 10) remarks that “the biggest challenge is concurrent high-quality measurement of both social and biological characteristics within studies. Early large-scale biosocial investigations tended to add fairly crude measures of one or the other domain depending on the original study framework with the result that failure to observe significant or strong biosocial associations led to aspersions on the whole enterprise.” The selection of inexpensive measures, if they are not truly adequate to the research questions being addressed, provides little saving.

A further point to keep in mind about the costs of data collection is that these accrue not only to those collecting the data but also to those providing them: the respondents or subjects. The burden can include time, effort, discomfort, and the risk of disclosure of private information. These can discourage response and compliance, especially with more invasive, detailed, or frequent forms of data collection. Attrition in longitudinal surveys is a particular concern if data gathering becomes too onerous for respondents. As Hauser and Willis (2005, p. 214) observe, “surveys are successful only when members of the population are willing to join with funding organizations and researchers in adding to the society's knowledge.” Attention to the costs of new measures must therefore also consider the costs to respondents.

Restrictions on Use

No matter the quantity and quality of the data collected, the database on aging will be of little value if researchers cannot access it fully. At the same time that the technology and methodology for accessing data are expanding exponentially, a growing number of restrictions encumber full access to data. Whether well-intentioned or misguided, a variety of regulations and practices inhibit full data usage.

Making data available to researchers does entail risks, and, to a large extent, concerns about privacy and data access are warranted. Possible violations of privacy, confidentiality, and security of respondents loom large in a social context that accords ever greater attention to the erosion of privacy, hidden forms of surveillance, the commercial exploitation of personal data, and security threats from hacking of databases. Concerns may be heightened when collected research data seem particularly private, such as biomarkers revealing health problems or the potential for health problems, socially stigmatized or illegal behavior, or personal financial information.

The increased feasibility of identifying and tracking individuals accompanying improvements in data processing and search capabilities also raises grave concerns. Earlier means of deidentifying data provide little reassurance when date of birth, gender, and zip code alone are sufficient to personally identify 87 percent of the population (King, 2011, p. 720).
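The re-identification risk described here can be checked on any dataset by counting how many records are unique on a given combination of quasi-identifiers. The sketch below uses a handful of invented records. The field names and values are hypothetical, and the resulting shares illustrate only the method, not the 87 percent figure cited above.

```python
from collections import Counter


def share_unique(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination appears exactly once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Invented records for illustration only.
records = [
    {"birth_date": "1940-03-02", "gender": "F", "zip": "60637"},
    {"birth_date": "1940-03-02", "gender": "F", "zip": "60637"},
    {"birth_date": "1940-03-02", "gender": "M", "zip": "60637"},
    {"birth_date": "1938-11-15", "gender": "F", "zip": "53706"},
    {"birth_date": "1938-11-15", "gender": "F", "zip": "48104"},
]

full = share_unique(records, ["birth_date", "gender", "zip"])
coarse = share_unique(records, ["birth_date", "gender"])
print(f"unique on all three fields: {full:.0%}; without zip code: {coarse:.0%}")
```

Dropping or coarsening a field lowers the share of unique records. This is the logic behind many deidentification rules, and it is also why such rules reduce the information left for analysis.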

Butz and Torrey (2006, p. 1,898) discuss the dilemma with particular reference to longitudinal data in the social sciences:

There is a debilitating tradeoff between the power of richly detailed data on individuals (and firms and other organizations) to test important hypotheses, on the one hand, and the possibility that such detailed information will threaten the privacy and confidentiality of the persons and organizations described in the data, on the other hand. The masking of subject identity arises in many fields of science but is particularly vexing in social sciences where available masking techniques necessarily reduce the amount of information in the data. Exact-matched data files from multiple surveys and administrative record sources are particularly valuable analytically, but also particularly at risk for disclosure. Consider, for example, matched files that contain demographic and attitudinal information from persons in surveys, economic information on these same persons from government records, and medical information from providers. When longitudinal data are so matched, the analytical payoff can be unprecedented at relatively little cost in time and money.

In another instance, aware of the growing problem of identity theft, the Social Security Administration recently restricted access to a large part of the Social Security Death Master File, an index of 90 million deaths that are reported to the agency and that include names, Social Security numbers, and dates of death. In 2012, the agency declared that the records that had been contributed by state agencies were exempted from public disclosure. By this decision, records that had been used for social epidemiological research were foreclosed for those purposes (Sack, 2012).

Addressing these risks of disclosure is a valid aim, and methods, institutional arrangements, and systems for protecting the data have emerged to mitigate the risks. Advances in methodology for ensuring data access while protecting privacy and confidentiality have opened new possibilities (National Research Council, 2007, 2010). Among the techniques used with some success are eliminating direct identifiers in records and foreclosing the possibility of identifying unique individuals in small cells by replacement or masking techniques. These techniques have been employed, for example, by the Inter-University Consortium for Political and Social Research, which is a major repository of datasets (National Research Council, 2009, p. 61). Other protection systems tightly control access, limiting it to authorized (and sworn) researchers using such facilities as the NORC data enclave.
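As a rough illustration of masking small cells, the sketch below tabulates respondents by two characteristics and suppresses any cell whose count falls below a threshold. The categories, the threshold of 3, and the function name are hypothetical. Practices at actual repositories are more elaborate and typically also guard against recovering a suppressed cell from row or column totals.

```python
from collections import Counter


def suppress_small_cells(counts, threshold):
    """Replace counts below the threshold with None so rare combinations are hidden."""
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Invented (region, age group) pairs for illustration only.
respondents = [
    ("Midwest", "65-74"), ("Midwest", "65-74"), ("Midwest", "65-74"),
    ("Midwest", "85+"),
    ("South", "65-74"), ("South", "65-74"), ("South", "65-74"), ("South", "65-74"),
]

table = Counter(respondents)
masked = suppress_small_cells(table, threshold=3)
for cell, n in masked.items():
    print(cell, "suppressed" if n is None else n)
```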

Nevertheless, current procedures and practices for ensuring access and protecting confidentiality may be unduly burdensome or even dysfunctional. Often access to survey data involves cumbersome administrative and legal arrangements or licensing agreements. Sometimes this entails traveling to secure enclaves or submitting statistical software to be run in a special protected facility. Some data sharing protocols require researchers to do their work in locked rooms without access to the Internet, other data sources, electronic communication with other researchers, or their customary software and hardware tools. Such restrictions reduce or preclude the use that can be made of exceptionally valuable samples.

Other restrictions limit who in the research community can aspire to access. Geographic area measures (such as poverty rates or crime rates in a given region, state, county, metropolitan area, census tract, or neighborhood) are often closely controlled. Those wishing to use these data must meet stringent requirements, such as having an active grant that can stand surety against misuse of data (as the HRS does). One effect of such requirements is that younger scholars, those not at major research institutions, those beginning work in a new area, and those defining new areas of inquiry are seriously handicapped—limiting the scientific community and damaging the research enterprise.

The restrictions currently in place do not necessarily provide enhanced security to respondents, or at least not in a way that balances the costs to researchers. A case in point is the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, implemented in 2003. Although intended to establish equilibrium between protecting the privacy of individual health information and preserving the legitimate use of this information by researchers, the rule is often seen as adversely affecting biomedical research. In 2007, the Institute of Medicine (IOM) commissioned a national survey about the HIPAA Privacy Rule among epidemiologists (see Ness, 2007). The survey addressed questions of the scope, degree, type, and variability of influence of the rule on research conducted by epidemiologists on human subjects. Epidemiologists ranked their perceptions and offered further comments on barriers, delays, and added costs of doing research because of the rule. On a 5-point Likert scale, 67.8 percent reported that the rule makes research more difficult, ranking this at 4 to 5; almost 40 percent gave a similar ranking when asked whether the rule had increased research costs; and 51 percent indicated that the additional time added by the rule to complete research projects was high. Other concerns among epidemiologists were the variability and confusion of implementation, with different IRBs being seen as having different interpretations of the rule, sometimes preventing linkages to public records or private-sector maintained data files. Overall, as one respondent reflected, “In the main, HIPAA has not prevented any research that I have desired to pursue. What is has done is slow the research enterprise through its training and compliance elements. I and my staff spend more time doing compliance related things and less and less time doing actual research” (Ness, 2007, p. 2,167).

Has the HIPAA Privacy Rule actually enhanced the protection of respondents? The surveyed epidemiologists are skeptical. Only one-quarter perceived that the rule enhanced the privacy and confidentiality of study participants. And even that gain comes at a cost. Researchers expressed frustration that, while offering little substantive privacy protection beyond what was already available in the Common Rule, the new restrictions had increased the burden on respondents, who are now faced with far more cumbersome consent forms. Not only is the HIPAA consent process time-consuming for subjects, but, as one epidemiologist observed, it “detracts from the informed consent process pertaining to the more critical issue: the actual medical risks and benefits of participating” (Institute of Medicine, 2009; Ness, 2007, p. 2,167). In reaction to these issues with HIPAA, the IOM panel recommended a new approach to protecting privacy and ensuring access involving data security, increased transparency, and greater accountability. Among the panel's recommendations was that the U.S. Department of Health and Human Services (HHS) regularly convene consensus development conferences in collaboration with health research stakeholders to collect and evaluate current practices in privacy protection in order to identify and disseminate best practices for responsible research. As a practical matter, the panel recommended that HHS provide reasonable protection against civil suits brought pursuant to federal or state law for members of IRBs and Privacy Boards for decisions made within the scope of their responsibilities under the HIPAA Privacy Rule and the Common Rule (Institute of Medicine, 2009). The need for consideration of these and other measures to foster research uses of data while protecting privacy is a continuing issue.



These include the Mexican Health and Aging Study; the English Longitudinal Study of Ageing; the Korean Longitudinal Study of Ageing; the Japanese Study of Aging and Retirement; the China Health and Retirement Longitudinal Study; the Longitudinal Aging Study in India; and the Survey of Health, Ageing, and Retirement in Europe (now comprising 16 nations). Other national studies with some elements comparable to HRS include the Irish Longitudinal Study on Ageing, the Canadian Longitudinal Study of Aging, and the Costa Rican Study of Longevity and Healthy Aging (see Hauser and Weir, 2010, p. 113).


In this report, biomarkers are defined to include not only chemical or metabolomic levels that can be detected in a tissue sample (including blood and saliva) but also physiological measures (e.g., blood pressure, peak expiratory flow), anthropometric measures (e.g., height, weight), and functional measures (e.g., grip strength).


HRS obtains information about health-care costs and diagnoses from Medicare records maintained by the Centers for Medicare & Medicaid Services. The match is possible because HRS asks all respondents who are eligible for Medicare to provide their identification numbers; more than 80 percent of them consent to do so. The data are held in a restricted database. Available: http://hrsonline.isr [April 2013].


EMAs may also make it possible for local stakeholders to gather data relevant to advocating for or designing public policies or interventions. For example, electronic audit tools, adapted to lower literacy skills and physical limitations of an older population, can be used by residents to gather photo, audio, and location data documenting the walkability of a neighborhood (see Syme and King, Chapter 14).


Citing Goldman et al. (2011, p. 311).

Copyright 2013 by the National Academy of Sciences. All rights reserved.
Bookshelf ID: NBK184357

