NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Research Council (US) Committee to Examine the Methodology for the Assessment of Research-Doctorate Programs; Ostriker JP, Kuh CV, editors. Assessing Research-Doctorate Programs: A Methodology Study. Washington (DC): National Academies Press (US); 2003.

Assessing Research-Doctorate Programs: A Methodology Study.


Assessments of the quality of research-doctorate programs and their faculty are rooted in the desire of programs to improve quality through comparisons with other similar programs. Such comparisons help programs achieve their ultimate objective more effectively: to serve society through the education of students and the production of research. Accompanying this desire to improve is a complementary goal to enhance the effectiveness of doctoral education and, more recently, to provide objective information that would assist potential students and their advisors in comparing programs. The first two goals emerged as graduate education began to grow before World War II and as higher education in the United States was transformed from a predominantly elite enterprise to the widespread and diverse enterprise that it is today. The final goal became especially prominent during the past two decades as doctoral training expanded beyond training for the professoriate.

As we begin a study of methodology for the next assessment of research-doctorate programs, we have stepped back to ask some fundamental questions: Why are we doing these rankings? Whom do they serve? How can we improve them? This introduction will also serve to provide a brief history of the assessment of doctoral programs and report on more recent movements to improve doctoral education.


The assessment of doctorate programs in the United States has a history of at least 75 years. Its origins may date to 1925, a year in which 1,206 Ph.D. degrees were granted by 61 doctoral institutions in the United States. About two-thirds of these degrees were in the sciences, including the social sciences, and most of the remaining third were in the humanities. Yet, Raymond M. Hughes, president of Miami University of Ohio and president of the Association of American Colleges, said in his 1925 annual report:

At the present time every college president in the country is spending a large portion of his time in seeking men to fill vacancies on the staff of his institution, and every man [president] is confronted with the question of where he can hope to get the best prepared man of the particular type he desires.1

Hughes conducted a study of 20 to 60 faculty members in each field and asked them to rank about 38 institutions according to “esteem at the present time for graduate work in your subject.”

Graduate education continued to expand, and from time to time, reputational studies of graduate programs were carried out. These studies limited themselves to “the best” programs and, increasingly, those programs that were excluded complained about sampling bias.

In the 1960s, Allan Cartter, vice president of the American Council on Education, pioneered the modern approach for assessing reputation, which was used in the 1982 and 1993 NRC assessments. He sought to include all major universities and, instead of asking raters about the “esteem” in which graduate programs were held, he asked for qualitative judgments of three kinds: 1) the quality of the graduate faculty, 2) the effectiveness of the doctoral program, and 3) the expected change in relative position of a program in the next 5 to 10 years.2 In 1966, when Cartter's first study appeared, slightly over 19,000 Ph.D.s were being produced annually in over 150 institutions.

Ten years later, following a replication of the Cartter study by Roose and Anderson in 1970, another look at the methodology to assess doctoral programs was undertaken under the auspices of the Conference Board of Associated Research Councils.3 A conference on assessing doctoral programs concluded that raters should be given the names of faculty in departments they rate and that “objective measures” of the characteristics of programs should be collected in addition to the reputational measures. These recommendations were followed in the 1982 assessment that was conducted by the National Research Council (NRC).4 By this time, over 31,000 doctorates were being produced by over 300 institutions, of which 228 participated in the NRC study.

The most recent NRC assessment of doctorates, conducted in 1993 and published in 1995, was even more comprehensive. The 1995 Study design tried to maintain continuity with the 1982 measures, but it added and refined quantitative measures. With the help of citation and publication data gathered by the Institute for Scientific Information (ISI), it expanded the measures of publications and citations. It also included measures of awards and honors for the humanities. It covered 41 fields in 274 institutions, and data were presented for 3,634 doctoral programs.

This expansion, however, did not produce a non-controversial set of rankings. It is widely asserted that “halo” effects give high rankings to programs on the basis of recognizable names—star faculty—without considering average program quality. Similarly, there is evidence to support the contention that programs within well-known, larger universities may have been rated higher than equivalent programs in lesser-known, smaller institutions. It is further argued that the reputational rankings favor already prestigious departments, which may be, to put it gently, “past their prime,” while de-emphasizing striving programs that are investing in achieving excellence. Another criticism involves the inability of the study to recognize the excellence of “niche” and smaller programs. It is also asserted that, although reputational measures seek to address scholarly achievement as something separate from educational effectiveness, they do not succeed. The high correlation between these two measures supports this assertion.

Finally, and most telling, there is criticism of the entire ranking business. Much of this criticism, directed against rankings published by a national news magazine, attacked those annual rankings as derived from capricious criteria constructed from varying weights of changing variables. Fundamentally, the incentives created by any system of rankings were said to induce an emphasis on research productivity and scholarly ranking of faculty to the detriment of another important objective of doctoral education—the training of the next generation of scholars and researchers. Rankings were said to create a “horse race” mentality in which every doctoral program, regardless of its mission, was encouraged to emulate programs in the nation's leading research universities with their emphasis on research and the production of faculty who focused primarily on research. At the same time, a growing share of Ph.D.s were setting off for careers outside research universities and, even when they did take on academic positions, taught in institutions that were not research universities. As Ph.D. destinations changed, the question arose whether the research universities were providing appropriate training.

Calls for Reforms in Graduate Education

Although rankings may be under fire from some quarters, this report comes at a time when such an effort can be highly useful for U.S. doctoral education generally. Recently, there have been numerous calls for reform in graduate education. Although based on solid research about selected programs and their graduates, these calls lack a general knowledge base that can inform recommendations about, for example, attrition from doctoral study, time to degree, and completion. Further, individual programs find it difficult to compare themselves with similar programs. Some description of the suggested graduate education reforms can help to explain why a database, constructed on uniform definitions and collected in the same year, could be helpful both as a baseline from which reform can be measured and as a support for data-based discussions of whether reforms are needed.

In the late 1940s, the federal government was concerned with the need for educating a large number of college-bound World War II veterans and created the National Science Foundation to support basic science research at universities and to fund those students interested in pursuing advanced training and education. Competition with the Russians, the battle to win the Cold War, and the sense that greater expertise in science and engineering was key to America's interests jumpstarted a new wave of investments in the 1960s, resulting in a tripling of Ph.D.s in science and engineering during that decade. Therefore, for nearly a quarter of a century those calling for change asked universities to expand offerings and capacity in areas of national need, especially in scientific fields.5

By the mid-1970s, a tale of two realities had emerged. The demand for students pursuing doctoral degrees in the sciences and engineering continued unabated. At the same time, the number of students earning doctoral degrees in the humanities and social sciences started a decade-long drop, often encouraged by professional associations worried by gloomy job prospects and life decisions based on reactions to the Vietnam War (for a period, graduate school ensured deferment from military service). Thus, a presumed crisis for doctorates in the humanities and humanistic social sciences was appearing as early as the 1970s. Nonetheless, the overall number of doctoral recipients quadrupled between 1960 and 1990.6

By the 1990s a kind of convergence of perspectives emerged. Rapid change in technologies, broad geopolitical factors, and intense competition for the best minds led scientific organizations and bodies to call for the dramatic overhaul of doctoral education in science and engineering. For the first time, we questioned whether we had overproduced Ph.D.s in certain scientific fields. Meanwhile, worry about lengthening times to degree, incomplete information on completion rates, and less-than-desirable job outcomes led to plans to reform practices in the humanities, the arts, and the social sciences.

A number of these reform efforts have implications for the present NRC study and should be briefly highlighted. The most significant statement in the area of science and engineering policy came from the Committee on Science, Engineering and Public Policy (COSEPUP), formed by the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. Cognizant of the career options that students follow (more than half in non-university settings), the COSEPUP report, Reshaping the Graduate Education of Scientists and Engineers (1995), called for graduate programs to offer more versatile training, recognizing that only a fraction of the doctoral recipients become faculty members. The committee encouraged more training programs to emphasize more and better mentoring relationships. The report called for programs to continue emphasizing quality in the educational experience, monitor time to degree, attract a more diverse domestic pool of students, and make expectations as transparent as possible.

The COSEPUP report took on the additional task of segmenting the graduate pathways. It acknowledged that some students would stop after a master's degree, others would complete a doctorate, and others would complete a doctorate and have significant research careers. The committee suggested different graduate expectations and outcomes for students, depending upon the pathway chosen. To assist this endeavor the committee called for the systematic collection of pertinent data and the establishment of a national policy conversation that included representatives from relevant sectors of society—industry, the Academy, government, and research units, among others. The committee signaled the need to pay attention to the plight of postdoctoral fellows, employment opportunities in a variety of fields, and the importance of attracting talented international students.7

Three years later the Pew Charitable Trusts funded the first of three examinations of graduate education. Re-envisioning the Ph.D., a project headed by Professor Jody Nyquist and housed at the University of Washington, began by canvassing stakeholders: students, faculty, employers, funders, and higher education associations. More than 300 were interviewed, five focus groups were created, e-mail surveys went to six samples, and a mail survey was distributed. Nyquist and her team brought together representatives of this group for a two-day conference in 2000. Since that meeting the project has continued as an active website for the sharing of best practices.

The project began with the question, “How can we re-envision the Ph.D. to meet the societal needs of the 21st century?” It found that representatives from different sectors had different emphases. On the whole, however, there was the sense that, while the American-style Ph.D. has great value, attention is needed in several areas. First, time to degree must be shortened. For scientists this means incorporating years as a postdoctoral fellow into an assessment of time to degree.8 Second, the pool of students seeking doctorates needs to be more diverse, especially through the inclusion of more students of color. Third, doctoral students need greater exposure to information technology during their careers. Fourth, students must have a more varied and flexible curriculum. Fifth, interdisciplinary research should be emphasized. And sixth, the graduate curriculum should include a broader sense of the global economy and the environment. The project and call for reforms built on Woodrow Wilson National Fellowship Foundation President Robert Weisbuch's assessment that “when it comes to doctoral education, nobody is in charge, and that may be the secret of its success. But laissez-faire is less than fair to students and to the social realms that graduate education can benefit.” The project concluded with the recommendation that a more self-directed process take place. Or in the words of Weisbuch, “Re-envisioning isn't about tearing down the successfully loose structure but about making it stronger, more particularly asking it to see and understand itself.”9

The Pew Charitable Trusts also sponsored research that assessed students as well as their concerns and views of doctoral education as another way of spotlighting the need to reform doctoral education. Chris Golde and Timothy Dore surveyed doctoral students in 11 fields at 27 universities, with a response rate of 42.5 percent, yielding nearly 4,200 respondents. The Golde and Dore study (2001), At Cross Purposes, concluded that “the training doctoral students receive is not what they want, nor does it prepare them for the jobs they take.” They also found that “many students do not clearly understand what doctoral study entails, how the process works and how to navigate it effectively.”10

A Web-based survey conducted by the National Association of Graduate and Professional Students (NAGPS) produced similar findings. Students expressed tremendous satisfaction with individual mentoring but some pointed to a mismatch between their graduate school education and the jobs they took after completing their dissertation. Responses, of course, varied from field to field. Most notably, students called for more transparency about the process of earning a doctorate, more focus on individual student assessments, and greater help for students who sought nontraditional jobs.11 Both the Golde and Dore study and the NAGPS survey asked various constituent groups to reassess their approaches in training doctoral students.

Pew concluded its interest in the reform of the research doctorate with support to the Woodrow Wilson National Fellowship Foundation. The Foundation was asked to provide a summary of reforms recommended to date and offer an assessment of what does and could work. The Woodrow Wilson Foundation extended this initial mandate in two significant ways.

First, it worked with 14 universities in launching the Responsive Ph.D. project.12 All 14 institutions agreed to explore best practices in graduate education. To frame the project, participating schools agreed to look at partnerships between graduate schools and other sectors, to diversify the pool of students enrolled in doctoral education, to examine the paradigms for doctoral training, and to revise practices wherever appropriate. Specifically, the project highlighted professional development and pedagogical training as new key practices. The architects of the effort believed that improved professional development would better match student interests and their opportunities. They sensed an inattentiveness to pedagogical training in many programs and believed more attention here would benefit all students. Concerned with the insularity or narrowing decried by many interviewed by the Re-envisioning the Ph.D. project, the Responsive Ph.D. project invited participants concerned with new paradigms to address matters of interdisciplinarity and public engagement. They were encouraged to hire new people to help remedy the relative underrepresentation of students of color in most fields besides education. The project wanted to underscore the problem and encourage imaginative, replicable experiments to improve the recruitment, retention, and graduation of domestic minorities. Graduate programs were encouraged to work more closely with representatives of the K-12 sectors, community colleges, four-year institutions other than research universities, foundations, governmental agencies, and others who hire doctoral students.13

Second, the Responsive Ph.D. project advertised the success of various projects through publications and a call for a fuller assessment of what works and what does not. Former Council of Graduate Schools (CGS) President Jules LaPidus observed, “Universities exist in a fine balance between being responsive to ‘the needs of the time’ and being responsible for preserving some vision of learning that transcends time.”14 To find that proper balance the project proposed national studies and projects.

By contrast, the Carnegie Initiative, building on the same body of evidence that fueled the directions championed by the Responsive Ph.D. project, centered the possibilities for reform in departments. After a couple of years of review, the initiative settled on a multiyear project at a select number of universities in a select number of disciplines. Project heads Lee Shulman, George Walker, and Chris Golde argue that cultural change, so critical to reform, occurs in most research universities in departments. Through a competitive process, departments in chemistry, mathematics, English, and education were selected. Departments of history and neurosciences will be selected to participate in both research and action projects.

Focused attempts to expand the professoriate and enrich the doctoral experience, by exposing more doctoral students to teaching opportunities beyond their own campuses, have paralleled these two projects. Guided by leadership at the CGS and the Association of American Colleges and Universities (AAC&U), the Preparing Future Faculty initiative involved hundreds of students and several dozen schools. The program assumed that “for too many individuals, developing the capacity for teaching and learning about fundamental professional concepts and principles remain accidental occurrences. We can—and should—do a better job of building the faculty the nation's colleges and universities need.”15 In light of recent surveys and studies, the Preparing Future Faculty program is quickly becoming the Preparing Future Professionals program, modeled on programs started at Arizona State University, Virginia Tech, University of Texas, and other universities.

Mention should also be made of the Graduate Education Initiative funded by the Andrew W. Mellon Foundation. Between 1990 and 2000, this program gave “approximately $80 million to assist students in 52 departments at 10 leading research universities. These departments were encouraged to review their curricula, examinations, advising, official timetables, and dissertation requirements to facilitate timely degree completion and to reduce attrition, while maintaining or increasing the quality of doctoral training they provided.”16 Although this project will be carefully evaluated, the evaluation has yet to be completed since some of the students have yet to graduate.


The calls for reform in doctoral education, although confirmed by testimony, surveys of graduate deans, and student surveys, do not have a strong underpinning in systematic data collection. With the exception of a study by Golde and Dore, which covered 4,000 students in a limited number of fields and institutions, and another by Cerny and Nerad, who investigated outcomes in 5 fields and 71 institutions, there has been little study at the national level of what doctoral programs provide for their students or of what outcomes they experience after graduation. National data gathering, which must, of necessity, be conducted as part of an assessment of doctoral programs, provides an opportunity for just such an investigation.

To date, the calls for reform agree that doctoral education in the United States remains robust, that it is valued at home and abroad, but that it must change if we are to remain an international leader. There is no commonly held view of what should and can be reformed. At the moment there is a variety of both research and action projects. Where agreement exists, it centers on the need for versatile doctoral programs; on a greater sense of what students expect, receive, and value; on the need to know, publicize, and control time to degree and degree completion rates; and on the conclusion that a student's assessment of a program should play a role in the evaluation of that program.

This conclusion points to the possibility that a national assessment of doctoral education can contribute to an understanding of practices and outcomes that goes well beyond the attempts to assess the effectiveness of doctoral education undertaken in past NRC studies. The exploration of this possibility provided a major challenge to this Committee and presented the promise that, given a solid methodology, the next study could provide an empirical basis for the understanding of reforms in doctoral education.


The previous sections present a picture of the broader context in which the Committee to Examine the Methodology of Assessing Research-Doctorate Programs approached its work. The rest of the report describes how the Committee went about its task and what conclusions it reached concerning fields to be included in the next study, quantitative measures of the correlates of quality, measures of student educational processes and outcomes, the measurement of scholarly reputation and how to present data about it, and the general conclusion about whether a new study should be undertaken.



3. Consisting of the Social Science Research Council, the American Council of Learned Societies, the American Council on Education, and the National Research Council.


8. A study by Joseph Cerny and Maresi Nerad replaced time to degree with time to first tenure and found remarkable overlap between science and non-science graduates of UC Berkeley 10 years after completion of the doctorate.


12. The 14 participating universities were: University of Colorado, Boulder; University of California, Irvine; University of Michigan; University of Pennsylvania; University of Washington; University of Wisconsin, Madison; University of Texas, Austin; Arizona State University; Duke University; Howard University; Indiana University; Princeton University; Washington University, St. Louis; and Yale University.

Copyright © 2003, National Academy of Sciences.
Bookshelf ID: NBK43466

