BMJ. Apr 8, 2000; 320(7240): 976–980.

Use of consensus development to establish national research priorities in critical care

Keryn Vella, research fellow, Caroline Goldfrad, statistician, Kathy Rowan, scientific director, Julian Bion, reader in intensive care medicine, and Nick Black, professor



To test the feasibility of using a nominal group technique to establish clinical and health services research priorities in critical care and to test the representativeness of the group's views.


Generation of topics by means of a national survey; a nominal group technique to establish the level of consensus; a survey to test the representativeness of the results.


United Kingdom and Republic of Ireland.


Nominal group composed of 10 doctors (8 consultants, 2 trainees) and 2 nurses.

Main outcome measure

Level of support (median) and level of agreement (mean absolute deviation from the median) derived from a 9 point Likert scale.


Of the 325 intensive care units approached, 187 (58%) responded, providing about 1000 suggestions for research. Of the 106 topics considered by the nominal group (the 100 most frequently suggested, six of which were split in two), 37 attracted strong support, 48 moderate support and 21 weak support. There was more agreement after the group had met: the overall mean of the mean absolute deviations from the median fell from 1.41 to 1.26. The group's views represented the views of the wider community of critical care staff (r=0.73, P<0.01). There was no significant difference in the views of staff from teaching or from non-teaching hospitals. Of the 37 topics that attracted the strongest support, 24 were concerned with organisational aspects of critical care and only 13 with technology assessment or clinical research.


A nominal group technique is feasible and reliable for determining research priorities among clinicians. This approach is more democratic and transparent than the traditional methods used by research funding bodies. The results suggest that clinicians perceive research into the best ways of delivering and organising services as a high priority.


The need to involve as many legitimate stakeholders as possible in the identification and prioritisation of research topics is increasingly being recognised. Not only might such a strategy ensure that the interests of all relevant people are considered, it might also increase ownership of the ensuing research and, perhaps, the likelihood of the results influencing clinical practice and policy. The more groups and individuals involved, however, the greater the potential difficulty in prioritising suggestions. Informal methods, such as committees, risk being dominated by the more powerful members. In contrast, formal methods of consensus development provide a means of managing group decision making so that all participants have the same influence on the outcome.1 These methods have been used to prioritise research, but, apart from occupational medicine2,3 and haematology,4 their use has been confined to nursing5-10 and chiropractic.11

Our primary objectives were to test the feasibility of using a nominal group technique and to establish priorities for clinical and health services research in critical (intensive and high dependency) care based on the views of a small selected group of the principal clinicians involved—doctors and nurses. The secondary objectives were to determine the extent to which priorities differ between staff from units based in teaching and non-teaching hospitals, to investigate the impact of the nominal group technique on participants' initial views, and to assess whether the views of such a small selected panel are representative of practising clinicians in general. This last issue has been investigated only once before in the health field—in the context of guidelines for coronary angiography.12


Generation and categorisation of topics

We sought potential research topics from all 325 intensive care units in the United Kingdom and Republic of Ireland in July 1998. We asked the clinical director (or lead consultant) and nurse manager of each adult unit to suggest up to 10 research topics (on intensive care organisation, clinical practice, and outcomes) that they considered the most important. They were encouraged to discuss ideas with their unit colleagues, particularly more junior ones. As respondents could remain anonymous, reminders could not be sent to non-respondents. After exclusion of suggestions not containing a hypothesis (for example, “how many units have more than six beds?”), an experienced clinician (JB) categorised the rest according to 15 domains using the predominant theme of the topic. The 100 most frequently suggested topics were selected—the maximum deemed possible for the nominal group to consider in a single day.

Composition of nominal group

The members of the nominal group were selected from people that we knew to be interested in research in intensive care in the United Kingdom and Ireland. The composition of the group was intended to reflect the diversity of clinician involvement in critical care and the level of influence of each category on critical care policy and practice. Of the 12 people invited, only one declined to participate; he was replaced. For the 325 UK and Irish intensive care units, the 12 participants reflected the professions (10 doctors, 2 nurses); grade of doctor (8 consultants, 2 trainees); geographic distribution (5 from London and the south east, 2 from the south west and Wales, 4 from the Midlands and East Anglia, 1 from northern England and Scotland); and hospital status (5 teaching, 7 non-teaching).

Nominal group process

We sent participants a first round questionnaire about the 100 suggested research topics, asking them to indicate their personal level of support for each topic on a Likert scale of 1 to 9 (1=no support, 5=moderate support, 9=strong support). Replies from the 12 participants were collated, and the distribution of ratings for each topic was displayed on the line below the Likert scale in the second round questionnaire. These questionnaires, personalised such that each participant also had their own first round ratings indicated, were distributed to participants when they attended a one day group meeting in October 1998.
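The collation step described above can be sketched in a few lines of Python. This is only an illustration of how a feedback line might be built for one topic, not the procedure actually used in the study; the function names and the example ratings are hypothetical.

```python
from collections import Counter

SCALE = list(range(1, 10))  # the 9 point Likert scale, 1 to 9


def rating_distribution(ratings):
    """Number of participants who chose each scale point, in scale order."""
    counts = Counter(ratings)
    return [counts[point] for point in SCALE]


def feedback_lines(ratings, own_rating):
    """Three text lines: the scale, the group's distribution, and a marker
    under the participant's own first round rating."""
    header = " ".join(f"{point:>2}" for point in SCALE)
    counts = " ".join(f"{c:>2}" for c in rating_distribution(ratings))
    marker = " ".join(f"{'*' if point == own_rating else '':>2}" for point in SCALE)
    return [header, counts, marker]


# Hypothetical first round ratings from 12 participants (not study data).
first_round = [7, 8, 8, 9, 7, 6, 8, 9, 7, 8, 5, 8]
for line in feedback_lines(first_round, own_rating=6):
    print(line)
```

Each participant would receive the same distribution line for a topic but a different marker line, which matches the personalised second round questionnaire described in the text.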

The meeting was facilitated by NB, who had experience of nominal group techniques. The group had to explore the reasons for any differences in ratings and re-rate all 100 topics. This was an opportunity to reconsider their initial rating in the light of other participants' views. They were under no pressure to achieve consensus, and all ratings were made privately. The facilitator tried to ensure that all participants had an opportunity to contribute. Two observers (JB, KR) kept a non-attributable written record of the main points of the discussion. When differences in first round ratings seemed to have resulted partly from ambiguity in the wording of the topic, the group agreed a revised wording before making their second round rating.

For each topic, the level of group support for a topic hypothesis was indicated by the median and the level of agreement within the group by the mean absolute deviation from the median. Topics were ranked according to the medians. Medians of 7-9 were defined as strong support, 4-6.5 as moderate, and 1-3.5 as weak. The level of agreement was categorised according to thirds of the mean absolute deviation from the median (low >1.41, moderate 1.08-1.41, high <1.08). Any change between the first and the second round indicated the impact of the nominal group meeting in promoting consensus. The significance of any change in rank order was tested with Wilcoxon's signed ranks test, and any association between the level of support and the degree of consensus for topics was tested with the χ2 test.
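As a minimal sketch of the two summary measures and the category bands defined above, the following Python computes the median and the mean absolute deviation from the median for one topic. The function names and the example ratings are hypothetical; only the cut-offs come from the text.

```python
from statistics import median


def mad_from_median(ratings):
    """Mean absolute deviation of the ratings from their median."""
    m = median(ratings)
    return sum(abs(r - m) for r in ratings) / len(ratings)


def support_category(med):
    # Bands from the text: 7-9 strong, 4-6.5 moderate, 1-3.5 weak.
    # With integer ratings and 12 raters the median is a whole or half number,
    # so no value can fall between the bands.
    if med >= 7:
        return "strong"
    if med >= 4:
        return "moderate"
    return "weak"


def agreement_category(mad):
    # Cut-offs from the text: low >1.41, moderate 1.08-1.41, high <1.08.
    if mad > 1.41:
        return "low"
    if mad >= 1.08:
        return "moderate"
    return "high"


def summarise(ratings):
    """Return (median, MAD, support category, agreement category)."""
    med = median(ratings)
    mad = mad_from_median(ratings)
    return med, mad, support_category(med), agreement_category(mad)


# Hypothetical ratings from 12 participants for one topic (not study data).
example_ratings = [7, 8, 8, 9, 7, 6, 8, 9, 7, 8, 5, 8]
print(summarise(example_ratings))
```

A smaller mean absolute deviation means the ratings cluster more tightly around the median, which is why low values are labelled as high agreement.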

Assessment of representativeness

To assess the representativeness of the group's views, we sent a questionnaire to the 313 intensive care units that were not represented by members of the nominal group. The questionnaire included 30 of the topics that the nominal group had considered, 10 of which had attracted strong support, 10 moderate support, and 10 weak support. The topics were mixed up, and the recipients were not told of the basis of the topic selection. The questionnaire layout and the rating scale were similar to those used with the nominal group. As before, the level of support (median) and of agreement (mean absolute deviation from the median) for each topic was calculated. The representativeness of the group's view was assessed by the level of association with the survey finding (Pearson correlation coefficient) and the level of agreement (κ statistic). Finally, responses from the staff of teaching (university or university affiliated) hospitals were compared with those from non-teaching hospitals.
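The two comparison statistics named above can be illustrated with a short, dependency-free Python sketch. It assumes that the correlation is computed on the topic medians and that κ is computed on the support categories (strong, moderate, weak); the text does not specify the inputs, so both choices are assumptions, and all values below are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two sets of labels beyond chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical medians for six topics (not study data).
group_medians = [8, 5, 2, 7, 4, 3]
survey_medians = [8, 7, 5, 8, 6, 6]
print(pearson_r(group_medians, survey_medians))

# Hypothetical support categories for four topics (not study data).
group_cats = ["strong", "strong", "moderate", "weak"]
survey_cats = ["strong", "moderate", "moderate", "moderate"]
print(cohen_kappa(group_cats, survey_cats))
```

The two measures answer different questions: a correlation can be high when one set of ratings is consistently shifted upwards, whereas κ falls whenever the shift moves topics into a different category.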


Generation and categorisation of topics

Of the 325 intensive care units approached, 187 (58%) responded, providing about 1000 suggestions for research. Many topics recurred, which facilitated the identification of the most frequently cited ones. The 15 categories each contained four to six topics, apart from the “organ system support and treatment” category, which included 28 topics. Table 1 shows some examples.

Table 1
 Categories and examples of research questions most frequently suggested by staff of 187 intensive care units

Nominal group's level of support

At the group meeting, discussion of the 100 most frequently suggested topics led to several changes to the topics. It was apparent that the wording of six topics was ambiguous because each contained two independent issues (for example, “Does skill mix and staff/patient ratio affect sickness rates amongst intensive care unit nursing staff?”); such topics were split into two, resulting in a total of 106 topics. In three topics, terms were altered to clarify or broaden the meaning (“antidepressants” became “psychotropics,” “inotropic” became “vasoactive,” and “staffed beds” became “available beds”). As a result, direct comparisons of the group's initial ratings and their meeting ratings had to be confined to the 91 topics that were neither split nor reworded (100 minus six minus three).

Of the 106 topics, 37 attracted strong support (final median 7-9), 48 moderate support (median 4-6.5), and 21 weak support (median 1-3.5). Table 2 shows examples. The level of agreement within the group varied by topic, as indicated by the mean absolute deviation from the median. The level of support for a topic was significantly positively associated with the level of group agreement (χ2=13.4, P=0.01). Of the 37 topics attracting strong support, 24 related to identifying the organisational features of critical care that are likely to improve patient outcomes.

Table 2
Examples of topics with strong (median 7-9), moderate (median 4-6.5), or weak (median 1-3.5) support in final rating. Values are medians (mean absolute deviation from median)

Effect of nominal group technique on rank order and level of consensus

The effect of the group meeting was to polarise views more—the number of topics with moderate support declined (66 to 39) while the number with strong or weak support increased (19 to 32 and 6 to 20 respectively). Overall, the category of level of support did not alter for 62 topics, decreased for 15, and increased for 14. Although the rank order of the 91 topics changed, the change was not significant (Wilcoxon signed ranks test, z=−0.27; P=0.98).

There was more agreement after the meeting than before. The overall mean of mean absolute deviation from the median for all 91 topics fell from 1.41 to 1.26. A high level of agreement was achieved for 25 topics at the meeting (compared with 18 beforehand), and the number of topics with low agreement fell from 51 to 30. While 48 topics did not change much, 21 shifted from low to moderate agreement and 13 from low or moderate to high agreement. In contrast, the level of agreement over 9 topics fell as a result of the meeting.
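The shifts reported above amount to comparing each topic's agreement category before and after the meeting. A minimal Python sketch of such a tally follows; the function name and the four-topic example are hypothetical, while the counts in `reported` are taken from the paragraph above.

```python
from collections import Counter

# Agreement categories ordered from least to most agreement.
ORDER = {"low": 0, "moderate": 1, "high": 2}


def tally_shifts(before, after):
    """Count topics whose agreement category rose, fell, or stayed the same."""
    tally = Counter()
    for b, a in zip(before, after):
        if ORDER[a] > ORDER[b]:
            tally["rose"] += 1
        elif ORDER[a] < ORDER[b]:
            tally["fell"] += 1
        else:
            tally["unchanged"] += 1
    return tally


# Hypothetical categories for four topics (not study data).
before = ["low", "low", "moderate", "high"]
after = ["moderate", "low", "high", "moderate"]
print(tally_shifts(before, after))

# Counts reported in the text: 48 unchanged, 21 + 13 rose, 9 fell.
reported = {"unchanged": 48, "rose": 21 + 13, "fell": 9}
print(sum(reported.values()))  # the 91 unmodified topics
```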

Representativeness of nominal group's judgment

A 78% response rate (244/313) was achieved in the survey to assess the representativeness of the group's judgment. Although the rank order of the level of support for topics was similar (Mann-Whitney U test, z=337, P=0.09) and the level of association of ratings was highly significant (r=0.73, P<0.01), the actual ratings were generally much higher among the survey respondents (table 3). This was reflected in the low level of agreement between the group and the survey ratings (κ=0.15). Lower levels of consensus existed among the 244 survey respondents than in the nominal group. In the survey, high agreement was achieved for only one of the 30 topics compared with nine by the group, and, conversely, for 26 of the 30 topics there was only low agreement among the survey respondents. The principal reason for the low level of agreement between the group and the survey respondents was that the latter had assumed that all 30 topics had considerable support from the group (probably because when we were originally seeking topic suggestions we wrote, “we will be circulating the most ‘popular’ research questions for you to help us prioritise”).

Table 3
Comparison of nominal group's views (n=12) with views obtained by national survey (n=244). Values are medians (mean absolute deviation from median)

Views of teaching and non-teaching hospital staff

Of the 244 respondents to the survey, 58 were based in teaching and 186 in non-teaching hospitals. There was no difference in the median of the median scores (6.0) for the 30 topics between these two groups of staff, and the rank order of topics was similar (Mann-Whitney U test, z=382.5; P=0.302).


Feasibility of formal consensus development

We have demonstrated the feasibility of using a formal consensus development method for establishing clinical and health services research priorities in a specific clinical area. Participation by the relevant clinical community was good (58% without use of reminders), suggesting a high level of interest in identifying research topics. Although the high response meant that a large number of suggestions were received (over 1000), there was sufficient commonality to allow us to identify 100 key issues. This commitment among staff was also reflected in the high acceptance rate for participating in the nominal group (11 out of the original 12 invited) and in the response to the final survey (78%).

Clinicians' views of research priorities

We have established what clinicians' views of clinical and health services research priorities are. Topics related to research into the organisation and delivery of critical care dominated, with less support for evaluation of specific healthcare technologies such as investigations and treatments. Most of the topics that attracted strong support related to organisational features of critical care likely to improve patient outcomes. This may explain why we found little difference in the views of staff from teaching and non-teaching hospitals.

Value of nominal group meeting

In terms of the rank order of support for suggested topics, the meeting had little impact. But it did serve to increase the level of agreement between group members. It also tended to polarise the topics—27 of the 66 topics that had moderate support in the initial ratings shifted to strong or weak support following discussion. Associated with this phenomenon was the observation that the greater the level of support for a topic, the more agreement there was in the group. The meeting also provided insights into the reasons for a lack of agreement where this arose.

We also have shown that the views of a small nominal group can represent those of the wider community from which they are drawn. This is consistent with the only other evidence from the health sector, which showed that in the United States the views of nine family physicians, cardiologists, and cardiac surgeons on the appropriate use of coronary angiography were consistent with the views of 1058 colleagues.12 We too found high levels of association and a similar ranking of topics.

Shortcomings of study design

Firstly, some of the initial lack of agreement between group members arose because of ambiguity in the wording of topics. This highlights the need for great care in the preparation of questionnaires, including a pilot phase to check for face validity. This will not guarantee the avoidance of all problems but would reduce the likelihood.

Secondly, we created some confusion in the minds of the respondents to the final survey by inadvertently indicating in our earlier communication that we would be sending them the “most ‘popular’ research questions” to rank. With this expectation, some respondents seemed reluctant to assign low scores to topics; a quarter (62/244) rated over 90% of topics with a score of 5 or more. Despite this influence, the ranking of topics was similar to that of the nominal group.

Thirdly, we confined this exercise to the two principal clinical groups with a strong, clear interest in critical care—namely, doctors and nurses. We ignored the views of other stakeholders, such as therapists, technicians, patients, relatives, and staff from other medical specialties. The results might have been different if the views of these other groups had been considered. It is also important to recognise that this study has identified the most commonly perceived priorities for research. These may not be the most important for improving the quality of critical care.

What is already known on this topic

Formal consensus development methods have rarely been used to establish national research priorities in medicine, partly because their feasibility and reliability are uncertain

What this study adds

In critical care, clinicians can generate and then rate the importance of research topics using a nominal group technique

The group's views represented the views of the wider community of critical care staff, suggesting that the approach could be used to improve the transparency and democracy of decision making by research funding bodies


This study has implications both for the use of consensus development methods and for research in critical care. We encourage the approach described here in other areas of health care, not only as a means of identifying research priorities in a structured and transparent way but also to establish whether the method is equally robust when tackling very different issues, such as long term care or community services. Although there have been some previous applications of consensus development methods, they have mostly used Delphi surveys2,3,5-7,9,10 or informal mechanisms for deriving group judgments.

We encourage researchers in critical care to focus on the topics that have the widest support and to use that fact when approaching research funding bodies. We also encourage funders to use the results of exercises such as the one described here to shape and influence their commissioning of research. Studies that have the widespread support of the key clinical groups are more likely to gain cooperation and participation in their execution and, maybe, increase the likelihood of any research findings being taken up in clinical practice and policy. The approach described here also has the advantage of being systematic and transparent, unlike the usual means used by funding bodies to prioritise their needs. However, greater involvement of stakeholders and the application of clear method have a cost. We estimate that the whole process of organising, running, and analysing the nominal group cost about £10 000 (including about £5000 for the time of all the clinical participants). We believe that the clear benefits of the approach make this highly cost effective.


We thank all staff from intensive care units who responded to the initial request for topics and the final survey, and members of the nominal group (Drs Geoff Bellingan, Ruth Endacott, Chris Garrard, Cameron Howie, Andy Padkin, Saxon Ridley, Alasdair Short, Sue Sinclair, Mervyn Singer, Carl Waldmann, and David Watson, and Mrs Sue Baker).


Funding: Research and development directorate of the West Midlands regional office of the NHS Executive.

Competing interests: None declared.


