Health Serv Res. Aug 2006; 41(4 Pt 2): 1677–1689.
PMCID: PMC1955343

Making Noncatastrophic Health Care Processes Reliable: Learning to Walk before Running in Creating High-Reliability Organizations


Health care clinicians successfully apply proven medical evidence in common acute, chronic, or preventive care processes less than 80 percent of the time. This low level of reliability at the basic process level means that health care's efforts to improve reliability start from a different baseline than those of most other industries, and therefore may require a different approach. This paper describes the Institute for Healthcare Improvement's (IHI) current approach to improving health care reliability, including a useful nomenclature for levels of reliability, and a focus on improving the reliability of basic health care processes before moving on to more sophisticated high-reliability organization concepts. Early IHI work with a community of health care reliability innovators has identified four themes in health care settings that help to explain at least a portion of the gap in process reliability between health care and other industries. These include extreme dependence on hard work and personal vigilance, a focus on mediocre benchmark outcomes rather than process, great tolerance of provider autonomy, and failure to create systems that are specifically designed to reach articulated reliability goals. This paper describes our recommendations for the initial steps health care organizations might take, based on these four themes, as they begin to move toward higher reliability.

Keywords: Reliability, reliable design, human factors

Current pay-for-performance programs and public reporting of clinical process measures have drawn attention to the generally poor process performance health care achieves in the application of clinical evidence. Recent studies show widespread inconsistency within organizations and across all provider groups in the reliable delivery of high-quality care. In particular, two studies by the RAND Corporation report that, for many clinical conditions for which evidence-based care is clearly established, only about 50 percent of patients receive the recommended care (McGlynn et al. 2003; Kerr et al. 2004). Given the demands by the Centers for Medicare and Medicaid Services (CMS) and the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) for process measurement and subsequent reporting, health care organizations have recently shown heightened interest in how to make key clinical processes more reliable. The Institute for Healthcare Improvement (IHI) has responded to this interest by exploring high-reliability organizations in industry, and transposing practical lessons to health care.

Weick and Sutcliffe (2001) have described the hallmarks of high-reliability organizations in other industries, including concepts such as “collective mindfulness” and “preoccupation with failure.” Not clearly stated, but implied in their work, is the concept that in these non-health care high-reliability organizations, common processes are already considerably more reliable than those in health care. It's one thing to be preoccupied with failure, and to analyze every defect, with a goal of achieving high reliability, in an industry where core processes operate correctly 98 or 99 percent of the time as the baseline. It's quite another in health care, where we start from a baseline of core processes that are defective 50 percent of the time!

As health care application of clinical best practices is less reliable than many industrial processes, the IHI Innovation Team believes that we should get the basics in place first, and focus on process reliability before we go on to more advanced concepts such as preoccupation with failure. This focus on process is meant to be specifically applied to routine processes where the immediate result of a defect is not catastrophic, such as hand washing between patients, and not to those processes where a defect would be immediately catastrophic, such as accurate blood typing for transfusions. This paper presents the IHI's strategy for improving routine noncatastrophic processes in health care whose baseline performance reliability is less than 80 percent (the vast majority of clinical care).


The IHI Innovation Team defines reliability as “failure-free operation over time” (a definition by David Garvin at the Harvard Business School). We have adopted a nomenclature using failure rate (calculated as 1 minus reliability, or “unreliability”) as an index, expressed as an order of magnitude. Thus, 10⁻¹ means approximately one defect per 10 process opportunities, 10⁻² is approximately one defect per 100 opportunities, and so on. In our work with health care processes we have found it useful to avoid strict mathematical interpretations, and to quantify defect rates more broadly using the framework noted in Table 1. For example, 10⁻² is defined as five or fewer defects per 100 opportunities rather than the strictly mathematical one defect per 100 opportunities.

Table 1
Reliability Labels

As a general rule, we have found that 10⁻¹ or worse performance indicates the absence of any articulated common process. In other words, if one were to ask five frontline staff participants in the process to describe it, one would get five different answers. At a higher level of reliability, 10⁻² performance usually indicates the presence of a clearly articulated process (although by industry standards significant variation would still be present; Resar et al. 2004).

The first goal for the process approach devised by the IHI Innovation Team is to achieve 10⁻² level performance, a substantial improvement from baseline “chaotic” or “10⁻¹” performance. For example, if currently 56 percent of surgery patients are getting their prophylactic antibiotics within 1 hour before the surgical incision (Bratzler et al. 2005), that is, a “chaotic” level of reliability, we establish an initial design aim that at least 95 percent of patients would get the prophylactic antibiotics within that hour (10⁻² reliability).
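
The labels described above can be expressed as a small classification rule. The following Python sketch is illustrative only and is not part of the IHI framework itself: the function name `reliability_label` is hypothetical, the 95 percent cutoff follows the “five or fewer defects per 100 opportunities” definition of 10⁻², and the 80 percent cutoff separating “chaotic” from 10⁻¹ is an assumption based on the 80 or 90 percent success figure this paper associates with 10⁻¹ performance. Levels beyond 10⁻² are not modeled.

```python
def reliability_label(successes: int, opportunities: int) -> str:
    """Map an observed success count to a broad reliability label.

    Thresholds are assumptions taken from the text, not from Table 1:
      - 5 or fewer defects per 100 opportunities  -> "10^-2"
      - up to 20 defects per 100 opportunities    -> "10^-1"
      - anything worse                            -> "chaotic"
    """
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= successes <= opportunities:
        raise ValueError("successes must be between 0 and opportunities")

    defects = opportunities - successes
    # Integer comparison avoids floating-point rounding at the boundaries.
    if defects * 100 <= 5 * opportunities:
        return "10^-2"
    if defects * 100 <= 20 * opportunities:
        return "10^-1"
    return "chaotic"


# The antibiotic prophylaxis example: 56 percent today, 95 percent as the aim.
baseline = reliability_label(56, 100)
design_aim = reliability_label(95, 100)
```

Under these assumed cutoffs the 56 percent baseline is labeled “chaotic” and the 95 percent design aim is labeled 10⁻².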


Given all the good intentions and talent available in medicine, and the solidity of the medical evidence, why are clinical processes carried out at such low levels of reliability? Few clinicians come to work with the intention of performing poorly. The generally accepted answer to this question, “it's the system, not the people,” may be true, but is not particularly helpful in specifically detailing how to improve the clinical processes. The IHI experience with 40 organizations working to achieve higher levels of reliability for CMS Core Measures has provided some insights into the reasons for low reliability in health care. After reviewing the historical struggles of these organizations to become more reliable, we noted four common themes as possible explanations for the reliability gap:

  • Current improvement methods in health care are excessively dependent on vigilance and hard work.
  • The current practice of benchmarking to mediocre outcomes in health care gives clinicians and leaders a false sense of process reliability.
  • A permissive attitude toward clinical autonomy creates and allows for wide, and unjustifiable, performance variation.
  • Processes are rarely designed to meet specific, articulated reliability goals.

Each of these will be discussed in detail.


An analysis of improvement projects in the organizations in the IHI's reliability community, although subjective, demonstrated that the majority of the historical improvement work on conditions such as community-acquired pneumonia or congestive heart failure (CHF) involved a lengthy process of development of complicated protocols, followed by an attempt at full-scale implementation, without small-scale testing (Fran Griffin, IHI, 2005 personal communication). Implementation depended primarily on education of the staff and feedback of information on compliance. The implied message for those whose compliance was less than ideal was: “Work harder next time using the protocols” (into which they had little development input).

Strategies such as these have been classified as “Model 10⁻¹ change concepts,” and consist primarily of intent (attitude), vigilance, standardization, and hard work. We have observed that these “Model 10⁻¹ concepts” achieve at best 80 or 90 percent success. Occasionally, they achieve higher levels of reliability, but prove impossible to sustain over time. “Model 10⁻¹ change concepts” attempt to prevent failure without including human factors or principles of reliability science. In our view, the “intent, vigilance, and hard work” model is a good start, but is not sufficient to achieve 10⁻² performance. Clearly, although there is value in these traditional approaches, they are by themselves demonstrably weak in creating reliable processes (Bero et al. 1998).

The addition of human factors and reliability science principles to the “intent and vigilance” model creates what we term “Model 10⁻² change concepts.” These involve sophisticated designs to prevent failure as a first step, then adding failure identification and mitigation as a second step. These 10⁻² change concepts are currently used in many health care processes. A radiology example of the use of human factors and reliability science is presented in Table 2.

Table 2
Example of Using Human Factors and Reliability Science

It is unreasonable to expect anything better than 10⁻¹ performance from any process that does not use these more sophisticated concepts of design, failure detection, and failure mitigation. Leaders who review reliability improvement projects, and who wish to achieve higher levels of reliability, should demand a mix of Model 10⁻¹ and Model 10⁻² change concepts in the design.


There are two ways in which benchmarking to outcomes makes health care leaders complacent about their processes. First, bad outcomes in health care seem relatively rare to those who are at the front lines. Risk of death, for example, for patients who pass through the health care industry is at the 10⁻⁴ level (Leape 1994). Fatality from anesthesia in American Society of Anesthesiologists (ASA) Level 1 patients is 10⁻⁵. So any individual physician will experience these adverse events as rare occurrences. This phenomenon can best be described as the tyranny of small numbers. Furthermore, when organizations benchmark themselves against others' outcomes (in an industry whose process reliability is generally dismal), they are comforted by learning that they are “in the middle of the pack” on outcomes such as infection rates and mortality rates, so they assume that their processes must be “OK” too.
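
A rough calculation shows why such events feel rare to an individual clinician. The sketch below is an illustration only: it assumes each case carries the same independent risk, and the caseload figures (1,000 anesthetics, 500 patients) are hypothetical numbers chosen for the example, not data from this paper. The function names are likewise hypothetical.

```python
def expected_events(risk_per_case: float, cases: int) -> float:
    """Expected number of adverse events over a given number of cases."""
    return risk_per_case * cases


def prob_no_events(risk_per_case: float, cases: int) -> float:
    """Probability of seeing zero adverse events, assuming independent cases."""
    return (1.0 - risk_per_case) ** cases


# Hypothetical caseloads; the risk levels are the orders of magnitude cited above.
anesthesia_none = prob_no_events(1e-5, 1000)   # about 0.99
hospital_expected = expected_events(1e-4, 500)  # about 0.05 events
```

Under these assumptions, a clinician giving 1,000 anesthetics to ASA Level 1 patients has roughly a 99 percent chance of seeing no fatality at all, which is consistent with the point that personal experience offers almost no signal about process reliability.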

The second way in which benchmarking gives health care a false sense of the reliability of its processes arises from the “loose coupling” between many health care processes and adverse outcomes. For example, compliance with hand hygiene protocols is generally about 50 percent, but nosocomial infection rates are much smaller, perhaps 1 percent. A single defect in the hand hygiene process has a low probability of actually causing an infection, and if an infection is caused it will be separated by days from the defect. In this regard, the contrast with much of industry is stark. For example, if a piece of steel needs to be a certain length and, by definition, cutting it too short is a defect, the outcome is always the same: too short, bad outcome. Moreover, the concrete, visual nature of the outcome immediately tells the worker who cuts the steel too short that there's a problem.

The failure to observe a direct relationship between a process defect and a poor outcome in health care allows a natural lessening of vigilance regarding the process. It is easy, then, for those of us in health care to draw conclusions like: “The hand hygiene process can't actually be too bad; otherwise we would have more infections.” This dissociation between the process and the eventual outcomes, due in part to the biological resilience of patients, shields health care workers from noticing the level of unreliability of their health care processes. If leadership is going to drive overall health care outcomes such as adverse events or mortality well below their current rates, leaders will need to demand great improvement in the very unreliable processes that currently lead to those “pretty good” outcomes.
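
The looseness of this coupling can be made concrete with the hand hygiene figures cited above. The following is a deliberately crude illustration, not an epidemiological estimate: it assumes that every nosocomial infection arises from a hand hygiene defect and that each patient is exposed to a single hand hygiene opportunity.

```latex
\begin{align*}
P(\text{infection}) &\approx P(\text{defect}) \times P(\text{infection} \mid \text{defect}) \\
P(\text{infection} \mid \text{defect}) &\approx \frac{0.01}{0.50} = 0.02
\end{align*}
```

Even under these simplifying assumptions, about 98 of every 100 defects would produce no visible consequence, which is the kind of feedback that lets an unreliable process appear acceptable.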


Variability in how a process is performed leads to unreliability and confusion; it directly affects our ability to learn from our defects (Ohno 1988). For most clinical situations for which there is solid medical evidence, there are actually many possible approaches, each of which falls within the margins of the evidence. These margins define the “Standard of Care,” a sort of “safety zone” based on medical evidence. In other words, everyone within the zone is considered to be delivering good, science-based care, even though each of them is using a different process. And this variation isn't seen as a problem, as each of them is regarded as using the “right” science.

Unfortunately, the resulting variability in the process of delivering care forces the organization in which these autonomous providers work to develop a supporting infrastructure that is at best marginally effective. For example, training for new employees and testing current employees for competence are both extremely difficult to establish and maintain when the processes on which training and testing are based are highly variable. The difficulties such variability creates in real practice can be illustrated by the use of anticoagulant therapy within a group of six internists. In most office practices, nurses are the communication and execution link between physicians and patients for anticoagulant testing and dosing. If each physician has different methods for anticoagulant dosing and laboratory evaluations, the practice would need to develop six separate training and testing processes. No wonder testing and training are so rare. Hence, a potentially catastrophic defect in administration of anticoagulants, such as a marked decrease in blood clotting ability, is difficult to trace back to a defect in training or competency, and is just accepted. Typically, there is no well-defined, common process for anticoagulation across the entire practice, and no one in the practice is responsible for the entire anticoagulation process. The ability of the clinicians or staff to learn from any particular defect in the process has been lost, swamped in a sea of variability.

An ideal practice's key processes would have little variability across the six physicians, allowing an infrastructure with much greater potential for training and testing. Defects in the process could be traced back through the infrastructure relatively easily, and design changes that achieved higher levels of process reliability could be introduced. Responsibility for the process could be more realistically assigned. Health care leaders should be able to expect this level of standardization and be willing to set expectations, and assign ownership. If there is variation in noncatastrophic processes, it should be driven by patient preference, not provider autonomy.


The IHI uses a three-step model for applying principles of reliability to health care processes (Espinosa and Nolan 2000; Nolan 2000). Figure 1 gives a schematic description of the model.

Figure 1
IHI 3 step reliability design model

The initial step is to prevent the most common cause of failure, which is the lack of any defined process. Logically, standardization of processes is the most crucial and immediate tactic. It is more important that the process be standard than that it be perfect. In other words, those processes being standardized should be expected to reach a higher level of reliability, but not necessarily perfection. Allowing a team to design for a less than perfect result is somewhat foreign to health care improvement teams; experience has shown, however, that attempts to design for perfection, particularly early on, commonly lead to overly complex protocols that plan for every possible contingency. A more realistic first-step goal in redesigning any of health care's typically complex processes is to aim for an 80 or 90 percent success rate, that is, a 10⁻¹ level of reliability.

The second step is to identify defects from the first step, and then to mitigate those defects. Whenever possible, “first step” defects should be identified as they occur, which makes it possible to intercede before they affect the overall outcome significantly, and allows for mitigation of defects that are detected and intercepted in timely fashion.

The third step is to understand clearly the reasons for failure (the “critical failure modes”) in either of the two preceding steps, and to use that understanding in redesigning the overall process. In most health care systems, critical process failure modes are seldom prioritized, and even less often used in process redesign, especially with an articulated reliability goal in mind. Leadership should expect improvement teams to use an approach that goes beyond a one-shot attempt at hitting a home run with a complex process design produced by experts in a closed room. Our three-step design encourages small-scale testing with multiple frontline inputs, and is far more likely to produce real improvement in reliability.
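
One way a team might prioritize critical failure modes for the third step is a simple tally of the defects caught in the second step. The Python sketch below is an assumption about how such a tally could be kept, not a description of any IHI tool; the function name `rank_failure_modes` and the example defect log are hypothetical.

```python
from collections import Counter


def rank_failure_modes(defect_log):
    """Rank failure modes by frequency.

    defect_log: iterable of strings, one failure-mode label per detected defect.
    Returns a list of (mode, count, cumulative_share) tuples, most frequent first.
    """
    counts = Counter(defect_log)
    total = sum(counts.values())
    ranked = []
    cumulative = 0
    for mode, count in counts.most_common():
        cumulative += count
        ranked.append((mode, count, cumulative / total))
    return ranked


# Hypothetical log of defects intercepted during step 2 of a discharge process.
example_log = (
    ["diagnosis identified late"] * 6
    + ["order sheet not used"] * 3
    + ["instructions not documented"] * 1
)
ranked = rank_failure_modes(example_log)
```

In this made-up log, the most frequent mode accounts for 60 percent of intercepted defects, so it would be the first candidate for redesign.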


Figure 2 visually illustrates the steps in design for identification and subsequent administration of discharge instructions to patients with CHF in an attempt to achieve 10⁻² performance in this CMS core measure. Initial efforts specified standardization (step 1) using vigilance, education, and preprinted order sheets. A critical failure mode (step 3) turned out to be “missing the diagnosis of CHF until it was too late,” and so the process was redesigned to include early identification of CHF using BNP testing. This allowed “real time” mitigation with subsequent discharge instructions (step 2), and achieved a 10⁻² level of reliability for the discharge instruction measure.

Figure 2
Three tier design strategy for CHF discharge instructions


A recent study of the “ventilator bundle” has shown a strong relationship between process improvement to the 10⁻² level of reliability and decreased episodes of ventilator-associated pneumonia (Resar et al. 2005). The ventilator bundle consists of four processes in the care of patients on respirators: prophylaxis of deep vein thrombosis, prophylaxis against peptic ulceration, elevation of the head of the bed, and sedation vacation. When all four of these processes are delivered at a 10⁻² level of reliability for each ventilated patient, rates of ventilator-associated pneumonia improve dramatically (Figure 3).

Figure 3
Association of adherence to ventilator bundle and reduction of VAP
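
Measuring the bundle per patient, rather than element by element, matters because small gaps in each element compound. The following Python sketch is one possible way to compute such an all-or-none measure; the function names, the record layout, and the four example patients are hypothetical, and the independence calculation is a simplifying assumption rather than a finding of the study.

```python
def bundle_compliance(patients):
    """Share of patients who received every bundle element (all-or-none)."""
    if not patients:
        raise ValueError("at least one patient record is required")
    complete = sum(1 for elements in patients if all(elements.values()))
    return complete / len(patients)


def element_compliance(patients, element):
    """Share of patients who received one named bundle element."""
    if not patients:
        raise ValueError("at least one patient record is required")
    return sum(1 for elements in patients if elements[element]) / len(patients)


def independent_all_or_none(per_element_rate, n_elements):
    """All-or-none rate if every element were delivered independently."""
    return per_element_rate ** n_elements


# Hypothetical records: one dict per ventilated patient, True if delivered.
example_patients = [
    {"dvt_prophylaxis": True, "ulcer_prophylaxis": True,
     "head_of_bed": True, "sedation_vacation": True},
    {"dvt_prophylaxis": True, "ulcer_prophylaxis": True,
     "head_of_bed": False, "sedation_vacation": True},
    {"dvt_prophylaxis": True, "ulcer_prophylaxis": True,
     "head_of_bed": True, "sedation_vacation": True},
    {"dvt_prophylaxis": True, "ulcer_prophylaxis": True,
     "head_of_bed": True, "sedation_vacation": False},
]
overall = bundle_compliance(example_patients)
head_of_bed = element_compliance(example_patients, "head_of_bed")
```

If each of the four elements were delivered independently at 95 percent, `independent_all_or_none(0.95, 4)` gives roughly 0.81, so a unit can look acceptable on every individual element while still delivering the full bundle to only about four in five patients.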

Intensive care units that successfully reduced ventilator-associated pneumonia used Model 10⁻² change concepts and the three-step design method. For example, head of bed elevation was standardized by “making the desired action the default” as step 1. Step 2 involved the ward clerk checking the room on an hourly basis to see whether the head of the bed was elevated; if it was not, the defect was mitigated by a predetermined process, which varied by institution. Defects requiring mitigation were studied and the process redesigned as needed, resulting in dramatic improvement in the reliability with which the ventilator bundle was implemented (process) and a marked reduction in ventilator-associated pneumonia (outcome). As a result of the improved reliability in this process, many hospitals have experienced order-of-magnitude reductions in ventilator-associated pneumonia rates, and some have gone more than a year without any of these devastating complications of intensive care.


Under the complex and demanding circumstances that characterize care in a busy hospital or outpatient clinic, what should leaders do to improve reliability of common processes? The following recommendations can be considered a starting point for literally any noncatastrophic process whether related to inpatient or outpatient settings:

  • Leaders should look for, ask about, and require that Model 10⁻² (human factors and reliability science) concepts be included in the design of any improvements (Theme I). An example of a human factors concept is the use of a red sticky placed on the wheelchair of patients brought down to radiology if the transporter has had to help that patient into the chair. The red sticky acts as a reminder that the patient might not be capable of standing or transferring independently, and that two people will probably be needed to help the patient up onto the table. This simple signal has been shown to dramatically decrease serious falls and injuries in radiology.
  • Focus initially on key processes, rather than on benchmarked outcomes (Theme II). For example, rather than focusing primarily on reduction of ventilator-associated pneumonia, the initial focus might be on how reliably the team accomplishes all elements of the ventilator bundle, as the science now shows a strong relationship between reliably implementing the elements of the ventilator bundle and the reduction of ventilator-associated pneumonia. If the outcome does not improve after reliably implementing the process, either the science is wrong or the process has not been truly made more reliable. Most improvement teams will not discover new science.
  • Delineate clear performance variability limits by standardization of the process (Theme III). These limits should be narrow enough to permit the creation of good infrastructure for teaching and testing. An example might be to allow only one process for the anticoagulation of patients in an office, with one dosing formula, one laboratory scheduling method, and a single way to handle INR abnormalities.
  • Demand the three-step design be used on more complex processes and that processes are designed to meet specific, articulated reliability goals (Theme IV).
  • Lastly, a learning organization should select one or two key clinical processes, such as the ventilator bundle or community-acquired pneumonia, and deliberately aim for the articulated 10⁻² goal using the lessons from the four themes. The deliberate design should both improve a key process and act as a learning model for reliability.


Improvement in the reliability of clinical processes in health care will require a focus on a set of principles that are carefully designed for improvement of organizations that start from a baseline of low process reliability. The initial emphasis needs to be placed on process rather than outcomes. Because it is impossible to design and implement a perfect process from scratch, the first step for health care improvers is to articulate just how good a noncatastrophic process needs to be. The design for that less than perfect result can then follow.

Noncatastrophic processes in health care can clearly operate at a lower level of reliability than the level required of processes in a “high-reliability organization.” It is the IHI Innovation Team's consensus that health care leaders must first develop a strategy to move from grossly unreliable, chaotic processes to a minimum 10⁻² level of reliability before taking on the next challenge: moving to the level of the high-reliability organization. Health care must walk before it can run!


Based on work of the Institute for Healthcare Improvement (IHI) Innovation Team. The author thanks Frank Davidoff, M.D., and the IHI Publication Team for help in preparing this manuscript.


  • Bero LA, et al. “Getting Research Findings into Practice” British Medical Journal. 1998;317:465–8.
  • Bratzler DW, Steele L, Dellinger E, Fry D, Wright C, Ma A, Carr K, Reo L. “Use of Antimicrobial Prophylaxis for Surgery” Archives of Surgery. 2005;140:174–82.
  • Espinosa JA, Nolan TW. “Reducing Errors by Emergency Physicians in Interpreting Radiographs: Longitudinal Study” British Medical Journal. 2000;320(7237):737–40.
  • Griffin F. May 2005, Reliability Community Director, IHI, Personal Communication.
  • Kerr EA, McGlynn EA, Adams J, Keesey J, Asch SM. “Profiling the Quality of Care in Twelve Communities: Results from the CQI Study” Health Affairs. 2004;23(3):247–56.
  • Leape LL. “Error in Medicine” Journal of the American Medical Association. 1994;272(23):1851–7.
  • McGlynn EA, Asch SM, Adams J, Keesey J, Hicks J, De Cristofaro A, Kerr E. “The Quality of Health Care Delivered to Adults in the United States” New England Journal of Medicine. 2003;348(26):2635–45.
  • Nolan TW. “System Changes to Improve Patient Safety” British Medical Journal. 2000;320(7237):771–3.
  • Ohno T. Toyota Production System: Beyond Large-Scale Production. Portland, OR: Productivity Press; 1988.
  • Resar RK, Griffin F, Haraden C, Nolan T. “IHI Whitepaper on Reliability” 2004. [accessed April 10, 2006]. Available at http://www.ihi.org.
  • Resar RK, Pronovost P, Haraden C, Simmonds T, Rainey T, Nolan T. “Using a Bundle Approach to Improve Ventilator Care Process and Reduce Ventilator-Associated Pneumonia” Joint Commission Journal on Quality and Patient Safety. 2005;31(5):243–8.
  • Weick KE, Sutcliffe KM. Managing the Unexpected. University of Michigan Business School Management Series. New York: John Wiley & Sons; 2001.

Articles from Health Services Research are provided here courtesy of Health Research & Educational Trust