Interactive Intervention Analysis
David Gotz and Krist Wongsuphasawat
Abstract
Disease progression is often complex and seemingly unpredictable. Moreover, patients often respond in dramatically different ways to various treatments, and determining the appropriate intervention for a patient can sometimes be difficult. In this paper, we describe an interactive visualization-based system for intervention analysis and apply it to patients at risk of developing congestive heart failure (CHF). Text analysis techniques are used to extract Framingham criteria data from clinical notes. We then correlate the progression of these criteria with intervention data. A visualization-based user interface is provided to allow interactive exploration. We present an overview of the system and share clinician feedback regarding the prototype implementation.
1. Introduction
Disease progression is a complex process. Two patients exhibiting similar symptoms may respond very differently to the same interventions. The differences can depend on a large number of factors, including co-morbidities and specifics of the disease manifestation. These variations can sometimes make treatment decisions challenging.
To ensure quality care, disease-specific guidelines are often produced by medical organizations to codify generally accepted best practices. These guidelines provide a critically important framework within which clinicians can choose proper treatments. However, guidelines have limitations. They typically distinguish only broad categories of patients (e.g., males over 65 with high cholesterol). They generally do not personalize treatment plans for patients with multiple co-morbidities or other unique combinations of factors despite these types of patients being common in the general population [1]. In addition, guidelines often list several alternative treatments without specifying which might be most effective.
In this paper, we demonstrate how text analytics applied to electronic medical data can be combined with interactive visualization tools to enable personalized interactive intervention analysis. Our approach has two key phases: a preprocess phase and a runtime phase.
In the preprocess phase, we first analyze unstructured clinical notes for a population of patients to extract disease progression data. Our prototype targets Congestive Heart Failure (CHF), so we extract timestamped Framingham criteria [2]. We then correlate this data with structured information about interventions (e.g., medications and procedures). At runtime, we visualize this information to let users interactively explore variations in disease progression paths. The visualization shows each path’s associated outcome and correlated interventions. Screenshots from our system are shown in Figures 1 and 5.

2. Related Work
The work described in this paper builds upon three distinct areas of related work: population-based analysis (both for CHF and other conditions), text analysis, and visual analysis.
Population-Based Analysis
Longitudinal population studies are a commonly used tool in healthcare. By observing groups of patients over time, these studies can provide insights that are statistically significant. One such study is the Framingham Study [2] which identified 18 criteria which have been shown to be predictive of eventual CHF diagnosis. Our system uses these criteria as the features that define disease progression paths.
While the Framingham Study analyzed data from a controlled population of patients, similarity-based cohorts obtained from electronic data have been gaining wider attention in recent years as a means for developing personalized evidence for making complex treatment decisions [3, 4]. Our work is aligned with this approach.
Text Analysis
Modern electronic medical systems capture a large amount of structured data such as diagnosis and procedure codes, medication orders, and patient demographic data. However, a significant and extremely valuable set of data resides in unstructured clinical notes. For this reason, text analysis techniques are often employed to extract structured timestamped medical event data from these text-based notes. Most relevant to our work are text analysis techniques that can recognize Framingham criteria [5].
Visual Analysis
Given the complexity of medical data, many research projects have explored ways in which visualization can be used within the healthcare domain. Some have focused on making it easier for practitioners to navigate and understand patient data. For example, Lifelines [6] uses an interactive timeline metaphor to summarize a single patient’s electronic medical record. Other systems focus on cohorts of patients and visually summarize aggregate information. Recent work in this area includes LifeFlow [7] which visualizes groups of temporal event sequences, and Outflow [8] (the technique adopted in this paper) which visualizes event sequences using graph-based models that can represent the many ways in which patients’ conditions can evolve over time.
3. System Overview
We have developed a visual analytics system to support interactive intervention analysis and applied our prototype to patients at risk for developing CHF. The process diagram shown in Figure 2 illustrates our system’s design. Beginning with a source electronic medical record (EMR) database, we first extract Framingham events from unstructured clinical notes using text analysis techniques. For a given target patient, we then visualize data from a cohort of similar patients using our visualization tool.

Text Analysis for Framingham Criteria Identification
Despite the rich structured data found in many EMR systems, many important observations (such as Framingham criteria events) are typically captured in the form of unstructured clinical notes. To structure this data into a form that can be used for subsequent analysis, we use text analysis algorithms built on Apache’s open-source UIMA framework [9] to analyze all clinical notes in our database. The extracted information (consisting of timestamped Framingham criteria events for each patient) is stored together with the original structured EMR data.
Because our prototype focuses on CHF, we use a number of UIMA-based annotators to detect mentions of the 18 Framingham criteria. These were developed in consultation with, and evaluated by, clinicians who are experts in treating CHF [5]. The annotators, developed using a population of over 6000 incident heart failure patients (including over 500,000 clinical notes), detect which of the 18 criteria are mentioned as well as when they appeared (based on the date of the clinical note in which a mention was found). The annotators also recognize negation (e.g., “no sign of ankle edema”) and classify these occurrences as distinct from non-negated mentions.
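To make the idea of negation-aware criteria detection concrete, the following is a minimal sketch in Python. It is not the UIMA-based annotators described above: the two-criterion lexicon, the list of negation cues, and the function name extract_events are all hypothetical, and the rule (a negation cue appearing earlier in the same sentence) is a deliberately simple stand-in for the real annotators.

```python
import re
from datetime import date

# Hypothetical lexicon: two criteria with a few surface forms each.
CRITERIA_TERMS = {
    "ankle_edema": ["ankle edema", "ankle swelling"],
    "rales": ["rales", "crackles"],
}

# Hypothetical negation cues, searched for in the text that precedes
# a mention within the same sentence.
NEGATION_CUES = re.compile(r"\b(no|not|denies|without|negative for)\b")


def extract_events(patient_id, note_date, note_text):
    """Return (patient_id, note_date, criterion, negated) tuples for one note.

    The event timestamp is the date of the note, mirroring the paper's
    description; each criterion is reported at most once per sentence.
    """
    events = []
    for sentence in re.split(r"[.;\n]+", note_text):
        lowered = sentence.lower()
        for criterion, terms in CRITERIA_TERMS.items():
            for term in terms:
                pos = lowered.find(term)
                if pos == -1:
                    continue
                # Negated if a cue occurs before the mention in this sentence.
                negated = bool(NEGATION_CUES.search(lowered[:pos]))
                events.append((patient_id, note_date, criterion, negated))
                break  # one event per criterion per sentence
    return events


note = "No sign of ankle edema. Bilateral rales noted."
events = extract_events("p1", date(2010, 1, 5), note)
```

Negated and non-negated mentions are kept as separate events so that downstream steps can choose to use only the non-negated ones.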
Similar Patient Query
At runtime, our system allows clinicians to select an individual target patient to serve as the focus of the analysis. The target patient ID is passed to a similarity-based query component which retrieves a cohort of clinically similar patients from the database. While a variety of advanced similarity metrics can be used (e.g., [10]), we employ a rule-based approach which retrieves all patients who have, at some point in the past, shared the same set of Framingham criteria as the target patient has currently. For each similar patient, we retrieve both outcome measures (e.g. diagnosis and mortality data) and temporal data for Framingham criteria and interventions.
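The rule-based query can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the system's actual query component: we assume each patient's history is a list of (day, criterion) pairs with integer days, that criteria accumulate once observed, and that the target's current state is the set of all criteria seen so far. The names criteria_states and find_similar are hypothetical.

```python
def criteria_states(events):
    """Yield the cumulative set of criteria after each event day, in time order.

    Assumption: a criterion remains part of the patient's state once observed.
    """
    present = set()
    for day in sorted({d for d, _ in events}):
        present |= {c for d, c in events if d == day}
        yield frozenset(present)


def find_similar(target_id, histories):
    """Return IDs of patients who, at some point, had exactly the target's
    current set of criteria. histories maps patient ID -> [(day, criterion)]."""
    target_now = frozenset(c for _, c in histories[target_id])
    return sorted(
        pid
        for pid, events in histories.items()
        if pid != target_id and target_now in set(criteria_states(events))
    )


histories = {
    "target": [(1, "A"), (2, "B")],
    "p1": [(1, "B"), (3, "A"), (5, "C")],  # passes through {A, B}
    "p2": [(1, "A"), (1, "C")],            # never equals {A, B}
    "p3": [(2, "A"), (2, "B")],            # reaches {A, B} in one step
}
similar = find_similar("target", histories)
```

Note that p1 matches even though it later acquires a third criterion; this is what lets the cohort show how similar patients evolved after reaching the target's condition.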
Visual Outcome Analysis
Following the similar patient query step outlined above, the system visualizes the temporal data using an enhanced version of Outflow [8], a visualization technique for analyzing temporal event data. We first describe the core Outflow approach to representing patient data. We then describe the essential extensions made to the base Outflow platform that enable interactive intervention analysis.
Outflow Overview
Given a set of patients Psimilar = {pi} similar to the target patient ptarget, Outflow first builds a graph-based data representation by aggregating the Framingham event sequences {s0, ..., sn} for each pi. As illustrated in Figure 3, this representation captures all observed symptom evolution paths. Each path represents the progression for a different subset of Psimilar. Statistics (such as the average time between symptoms, average outcome, and the number of patients traversing a path) are computed for each node and edge. Outcome can be any patient measure (e.g., hospitalization rate or lab tests) that can be normalized to the [0, 1] range. We use mortality rate in our prototype. Next, an alignment point is defined based on the full set of Framingham criteria present for ptarget. As a result, the graph captures both (a) how similar patients reached ptarget’s condition and (b) how similar patients evolved thereafter.
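The aggregation step can be illustrated with a short Python sketch. It is a simplified reading of the description above, not the Outflow implementation: we assume nodes are cumulative sets of observed criteria, edges connect consecutive states of the same patient, days are integers, and outcome is a single value in [0, 1] per patient. The alignment point is omitted, and the names state_sequence and build_pathway_graph are hypothetical.

```python
from collections import defaultdict


def state_sequence(events):
    """Turn [(day, criterion)] into [(day, cumulative criteria set)],
    keeping only the days on which the set actually changes."""
    seq, present = [], frozenset()
    for day in sorted({d for d, _ in events}):
        new = present | {c for d, c in events if d == day}
        if new != present:
            seq.append((day, new))
            present = new
    return seq


def build_pathway_graph(patients):
    """patients maps ID -> {"events": [(day, criterion)], "outcome": float}.

    Returns (nodes, edges) with per-node and per-edge statistics: patient
    count, average outcome, and (for edges) average days between states.
    """
    node_pids = defaultdict(set)
    edge_gaps = defaultdict(dict)  # (src, dst) -> {patient ID: gap in days}
    for pid, rec in patients.items():
        seq = state_sequence(rec["events"])
        for _, state in seq:
            node_pids[state].add(pid)
        for (d0, s0), (d1, s1) in zip(seq, seq[1:]):
            edge_gaps[(s0, s1)][pid] = d1 - d0

    def mean_outcome(pids):
        return sum(patients[p]["outcome"] for p in pids) / len(pids)

    nodes = {
        s: {"patients": len(pids), "avg_outcome": mean_outcome(pids)}
        for s, pids in node_pids.items()
    }
    edges = {
        e: {
            "patients": len(gaps),
            "avg_days": sum(gaps.values()) / len(gaps),
            "avg_outcome": mean_outcome(gaps),
        }
        for e, gaps in edge_gaps.items()
    }
    return nodes, edges


patients = {
    "p1": {"events": [(0, "A"), (10, "B")], "outcome": 1.0},
    "p2": {"events": [(0, "A"), (20, "B")], "outcome": 0.0},
    "p3": {"events": [(0, "A"), (5, "C")], "outcome": 0.0},
}
nodes, edges = build_pathway_graph(patients)
```

In this toy cohort all three patients share the node {A}, two of them traverse the edge to {A, B} with an average gap of 15 days, and one branches to {A, C}.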

Outflow then visually encodes the graph as illustrated in Figure 4. Nodes are represented with vertical rectangles whose height represents the number of patients. Edges are represented using two components: a time edge that captures the average time between symptoms, and a link edge that encodes sequentiality. Color-coding is used to indicate average outcome values, with red edges signifying progression paths that have poor outcomes and green edges representing paths with desired outcomes. Interactive features such as panning, zooming, and brushing are provided to enable exploration as illustrated in Figure 1.
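As a small illustration of the color encoding, the Python sketch below maps an average outcome in [0, 1] to a color between green and red. The linear interpolation and the function name outcome_color are assumptions made for illustration; the text above does not specify the exact color scale used by Outflow. We also assume the outcome is a poor-outcome rate such as mortality, so that higher values are shown in red.

```python
def outcome_color(avg_outcome):
    """Map a poor-outcome rate in [0, 1] to an RGB hex string.

    0.0 maps to pure green, 1.0 to pure red; values outside the range
    are clamped. The linear blend is an assumed scale, not Outflow's.
    """
    t = min(max(avg_outcome, 0.0), 1.0)
    red = round(255 * t)
    green = round(255 * (1 - t))
    return f"#{red:02x}{green:02x}00"


best = outcome_color(0.0)
worst = outcome_color(1.0)
```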
Extensions for Intervention Analysis
Motivated by clinician feedback to our initial Outflow prototype [8], we added a number of extensions to our system that enable more clinically meaningful use cases. Our extensions focus on two key areas: outcomes and interventions.
The early feedback we gathered suggested that correlating outcomes with clinical paths was very useful. However, defining an outcome is complex, and clinicians must often balance several, possibly competing, measures. To address this challenge, the Outflow system has been extended to support multiple outcome measures. For example, our prototype for CHF intervention analysis supports both mortality rate and diagnosis rate. The generic design of the system means that additional measures, such as lab test results or treatment costs, can be easily incorporated if such data is made available.
An even more critical limitation of our initial prototype was that the visualization did not include intervention data. Users pointed out that while the pathways were interesting, knowing that one path had good outcomes while others had bad outcomes was not on its own especially actionable. The clinicians instead wanted to know “how do we move a patient from the red path to the green one?” This required not only a visualization of the outcome, but also what interventions—medications or procedures—could be ordered by the physician to influence which path was followed.
To address this challenge, we computed additional statistics for every node and edge in the pathway graph shown in Figure 3. Each structure in our representation (nodes and edges) represents a subset of Psimilar. For each subset of patients, we analyze every medication and procedure associated with the group and compare the frequency of those interventions with the frequency observed in the overall cohort Psimilar. Unusually frequent or rare interventions are flagged and a correlation score is computed. The interventions with the highest scores are then made visible through the user interface as shown in Figure 5. When users mouse over specific nodes or edges, histograms are shown which highlight the interventions that occur unusually frequently (or rarely) for the corresponding subgroup of similar patients. This allows a user to, for example, mouse over a large red path to see which medications, if any, were unique to the patients that traveled that path.
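The frequency comparison can be sketched in Python as follows. The text above does not define the correlation score, so this sketch assumes a simple one: the difference between an intervention's frequency in the subgroup and its frequency in the full cohort, flagged when its magnitude reaches a threshold. The function name intervention_scores and the default threshold of 0.25 are hypothetical.

```python
def intervention_scores(subgroup, cohort, interventions, threshold=0.25):
    """Flag interventions that are unusually frequent or rare in a subgroup.

    subgroup, cohort: lists of patient IDs (subgroup is a subset of cohort).
    interventions: maps patient ID -> set of intervention names.
    Returns [(name, score)] sorted by strongest deviation first; a positive
    score means more frequent in the subgroup, a negative one means rarer.
    """
    def freq(pids, name):
        return sum(name in interventions[p] for p in pids) / len(pids)

    names = set().union(*(interventions[p] for p in cohort))
    scored = []
    for name in names:
        # Assumed score: subgroup frequency minus cohort frequency.
        diff = freq(subgroup, name) - freq(cohort, name)
        if abs(diff) >= threshold:
            scored.append((name, round(diff, 3)))
    return sorted(scored, key=lambda item: (-abs(item[1]), item[0]))


interventions = {
    "p1": {"drugA", "drugB"},
    "p2": {"drugA"},
    "p3": {"drugB"},
    "p4": {"drugB"},
}
cohort = ["p1", "p2", "p3", "p4"]
red_path = ["p1", "p2"]  # patients traversing one edge of the graph
flagged = intervention_scores(red_path, cohort, interventions)
```

Here drugA is given to every patient on the path but only half of the cohort, while drugB is rarer on the path than in the cohort, so both are flagged with opposite signs. Such scores describe association within the cohort only and do not by themselves establish that an intervention caused a path.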
4. Physician Feedback
Our system is designed to let users analyze two types of information. First, the system visualizes how alternative disease progression paths correlate with outcomes. Second, users can explore which interventions correlate most strongly with the various paths. In this section we report on the feedback provided to us by clinical evaluators. Results from a non-medical (sporting event statistics) 12-person user study focusing strictly on general usability of the Outflow visualization for temporal analysis are published elsewhere [11].
We gathered clinical feedback on our CHF system in two ways. First, we conducted an in-depth review with a medical doctor who was given extended time to work with our prototype system. Second, we provided a number of brief demonstrations to clinical professionals (including both doctors and medical administrators) and solicited general feedback.
Pathways
The medical doctor who spent the most time with our system felt that the visualization of pathways, based on actual patient data, was very useful. “Inertia is very important” he stated, suggesting that the color coded paths can provide clinicians with concrete evidence regarding which way a patient is trending. He stated that while you try as a physician to help patients “turn the corner ... generally speaking, those on a better trajectory will do better.”
Figure 1 shows an example of how our system conveys this type of trajectory information. As the patient cohort illustrated in Figure 1(a) shows, there can be a strong correlation between pathway and outcome. This is conveyed via the preponderance of green and red paths. Figure 1(b) shows trajectories that led to especially poor outcomes.
Interventions
While visualizing disease progression paths along with their associated outcomes is valuable, it is not sufficient if the goal is to help those challenging patients “turn the corner.” Critical to the utility of our system is the integration of intervention data. This was the feedback we received most often from the clinicians reviewing our system. “These [interventions] are going to be extremely important” said one physician when talking about how this tool might help save patients on a bad trajectory.
The correlated intervention data informs users about how the things that they can control—such as medications and procedures—might impact outcome. By brushing a red path, users can see what medications were especially common for the poorly performing patients. These may be things to consider avoiding. Similarly, the list of excessively rare interventions might be things to consider adding to a patient’s treatment regimen in an attempt to deflect them to a better path.
At times (especially for larger cohorts) the list of correlated medications could grow long, making it harder for users to scan. To help ease this task, the sidebar shows interventions sorted alphabetically (to allow a clinician to look for a specific item) instead of by rank. A histogram conveys rank information and enables quick identification of the strongest correlations.
The notion of changing a patient’s trajectory was the primary concept discussed by the doctor participating in our in-depth review. He pointed out that patients often respond as expected to various interventions. However, sometimes there are patients who are “circling the drain”, meaning they enter a downward spiral despite a clinician’s best intentions. In these cases, he believed that clinicians will invest the time to research options for interventions using a system like ours, which can highlight what has worked in the past for similar cases.
Requested Enhancements
Despite the overall receptive response to our system, some requests were made for enhancements that would make the tool even more powerful. Perhaps most critical, users pointed to the complexity. “You might get some push back at first because it is different, but once you get the hang of it there is a lot of information.” Reducing the complexity of the interface is important, and while the system does provide some simplification controls more work is required. However, as our user suggested, we believe that a significant part of this reaction is due to unfamiliarity.
Users also asked for the ability to increase customization of the analysis to meet specific clinical needs. For example, users suggested the ability to filter the patient cohort returned by the similar patient query. Our system does not yet support this capability, but it is an interesting topic for future work. Clinicians also suggested expansion of the patient events used in the analysis. This is possible through either (1) the development of new text annotators to extract more information from clinical notes, or (2) the inclusion of event data from structured medical records. However, incorporating additional event types leads directly to more variation, and therefore complexity, in the resulting pathways. This trade-off must be managed to ensure that key clinical features are included without making the pathway structure too complex to interpret.
5. Conclusion
We introduced an interactive environment for intervention analysis and a prototype system applied to a population of CHF patients. The system extracts Framingham criteria from unstructured clinical notes and correlates them with structured intervention data. The data is visualized using an extended version of Outflow, a temporal pathway visualization technique. New Outflow features include support for multiple outcome measures and the addition of correlated intervention statistics. While early clinician feedback shows the potential impact of our approach, much remains for future work. This includes a pilot deployment with a larger group of clinicians and a more formal evaluation. We also plan to apply our approach to other diseases where extensions to our prototype—such as new text annotators—will likely be required.







