4 Screening Technologies II: Toxicogenomics


In recent years, toxicogenomics has started to become fully integrated into drug safety assessment and into the efforts of the U.S. Food and Drug Administration (FDA) to build more safety into drugs, noted Federico Goodsaid, Senior Staff Scientist in Genomics at the FDA’s Center for Drug Evaluation and Research. Toxicogenomics has also increasingly become a major tool in the development of new biomarkers for drug safety assessment. Four speakers from three different companies—Iconix, Bristol-Myers Squibb, and Abbott Laboratories—explained how their firms are developing and applying toxicogenomic tools for drug safety. The goal was to describe the range of drug safety data now being supplied by toxicogenomics, from preclinical safety assessments to the clinic. The following summaries address data derived from studies conducted in rats; these results have not yet been shown to be clinically relevant, and establishing that relevance is a key next step for these technologies and databases.


Dr. Halbert discussed how current work in toxicogenomics is an important part of the FDA’s Critical Path Initiative. Specifically, Item 20 on the Critical Path Opportunities List calls for “modernizing predictive toxicology,” described as follows:

Identifying preclinical biomarkers that predict human liver or kidney toxicity would speed innovation for many different types of therapeutics. Activities to develop genomic biomarkers for the mechanistic interpretation of toxicological observations—complementary to but independent of these classic toxicological observations—could begin to create the data foundation for qualification of new safety biomarkers. Collaborations among sponsors to share what is known about existing safety assays could be a first step toward the goal of safer medical products. (FDA, 2006:9)

Halbert explained that the fundamental underlying principle of toxicogenomics is that compounds with similar mechanisms of toxicity and efficacy will have similar gene expression profiles. Thus information about how various compounds affect gene expression—in the context of other knowledge about those compounds—can lead to a better understanding of both the compounds’ mechanisms of action and their toxicity. One of the goals of toxicogenomics is to identify biomarkers—generally sets of genes or RNA—from data collected on known drugs and toxicants, and use these biomarkers to predict mechanisms of action or toxicity in new compounds.

To be effective, toxicogenomics requires the collection and analysis of large amounts of data. These data must be highly diverse, in terms of not only the types of drugs and compounds that should be represented in the database, but also the types of data collected. For example, gene expression data should be collected in addition to traditional toxicology end points such as clinical chemistry and histopathology. The data should be organized in a well-curated database, and their interpretation requires novel methods of analyzing patterns and predicting outcomes.

Toxicogenomics is currently being used in a variety of ways in drug discovery and development. It is being applied

  • to rescue at-risk programs at the preclinical or early clinical stages by gaining additional insight into a compound’s mechanism of action and how it is causing toxicity;
  • to screen and evaluate leads at different stages proactively by predicting toxicities and mechanisms of action so that candidate compounds can be eliminated from the development pipeline as early as possible; and
  • to develop preclinical biomarkers of drug response and toxicity.

Toxicogenomics offers a number of advantages. Gene expression can be predictive and can be more sensitive than traditional approaches. It is high-content: many things are measured at the same time, and in particular, biomarkers from multiple end points can be measured in a single experiment if one understands what those biomarkers are. Toxicogenomics also supports an understanding of both toxicity and safety. Moreover, it can quickly provide a great deal of additional mechanistic understanding when a problem with a compound arises. The hope is that all of these capabilities will lead to better decision making, removal of candidate compounds from the development pipeline earlier in development, and increased confidence in moving compounds forward.


Halbert described how his company, Iconix, uses toxicogenomics in drug discovery, in biomarker identification and validation, and in preclinical safety assessment.

The DrugMatrix Reference Database

At Iconix, toxicogenomics is grounded in a large database called the DrugMatrix Reference Database. It allows researchers to identify the mechanisms of toxicity of novel compounds through comparison with the database’s reference set of compounds, to benchmark the effects of unknown compounds against these reference compounds, and to identify potential biomarkers that can be used to predict both toxicological and pharmacological end points in rats.

The DrugMatrix database was assembled by accumulating standardized information on more than 640 compounds across nine different tissues in male Sprague-Dawley rats. In choosing the compounds to include in the database, researchers ensured that there were at least three molecules in the database for each structure–activity class of compounds.

The maximum tolerated dose (MTD) and fully effective dose (FED) were estimated from the literature, and a preliminary range-finding study of dose versus toxicity was performed to determine the dose levels to use for each compound. Full studies were then carried out on at least 24 rats for each compound: two dose levels (MTD and FED), four or five time points (0.25, 1, 3, 5, and 14 days), and three rats per group.
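As a sanity check, the minimum group size implied by this design can be computed directly (a trivial sketch using only the design parameters quoted above):

```python
# Study design described for the DrugMatrix protocol:
# 2 dose levels x 4 time points (minimum; some compounds had 5) x 3 rats per group.
doses = 2
min_time_points = 4
rats_per_group = 3

min_rats = doses * min_time_points * rats_per_group
print(min_rats)  # minimum number of rats per compound
```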

When the rats were sacrificed, all of the tissues and blood were harvested and stored in freezers so they would be available later for gene expression or histopathological studies. Gene expression studies were carried out for the more than 640 compounds, and a full set of toxicological information was generated for each, including histopathology, clinical chemistry, hematology, and body and organ weights. Furthermore, full pharmacological profiling was carried out on 870 compounds, including the 640 that were chosen for gene expression profiling. This pharmacological profiling consisted of 127 assays, including receptor binding, enzyme, and drug-metabolizing enzyme (DME) assays. The result was a highly comprehensive set of information on how these compounds affect rats.


Once the information described above had been accumulated and organized into a database, it was used to identify “RNA-based biomarkers.” The purpose of identifying biomarkers is to be able to predict various end points from the gene expression data—that is, to know in advance what outcomes can be expected by detecting certain patterns in how a compound affects gene expression.

Iconix attempted to relate four types of phenotypic end points to patterns of gene expression—histopathology, pharmacology, clinical chemistry, and hematology measurements—and sought literature annotations as to how the compounds affect laboratory animals. To develop a biomarker, researchers select a phenotypic end point of interest and create a training set. The training set consists of a positive set of treatments that lead to the desired end point and a negative set of treatments that do not. This training set, along with the gene expression data for each treatment, serves as the input to a Support Vector Machine (SVM) classifier algorithm, which in turn identifies a biomarker—a pattern of gene expression correlated with the end point of interest. The biomarker can then be validated internally within the data set, and possibly in a forward-validated way.
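The biomarker-derivation workflow described above can be sketched with a linear SVM on synthetic data. All gene counts, sample sizes, and effect sizes here are illustrative assumptions, not Iconix's actual parameters:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical training set: expression of 100 genes per treatment.
# "Positive" treatments (end point observed) shift a subset of genes;
# "negative" treatments do not. All values are synthetic.
n_genes = 100
pos = rng.normal(0.0, 1.0, size=(30, n_genes))
pos[:, :10] += 2.0                       # 10 "biomarker" genes up-regulated
neg = rng.normal(0.0, 1.0, size=(30, n_genes))

X = np.vstack([pos, neg])
y = np.array([1] * 30 + [0] * 30)        # 1 = end point observed

clf = SVC(kernel="linear").fit(X, y)

# Genes with the largest absolute weights form the candidate multigene
# biomarker: the pattern of expression correlated with the end point.
weights = np.abs(clf.coef_.ravel())
biomarker_genes = np.argsort(weights)[::-1][:10]

# A new compound's profile is then scored against the trained classifier.
test_profile = rng.normal(0.0, 1.0, size=(1, n_genes))
test_profile[:, :10] += 2.0              # mimics the positive pattern
print(clf.predict(test_profile))
```

Internal validation would hold out part of the training set; forward validation, as the text notes, requires compounds not used in training at all.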

To use these biomarkers, a gene expression profile is generated for a test compound. If the test compound matches any biomarker for a particular end point, this indicates that the test compound causes gene expression changes in that tissue, changes similar to those caused by compounds in the class used to build that biomarker. This in turn means that the test compound is similar to that particular class of compound and can be expected to generate a similar end point.

In addition to helping to identify biomarkers, the information in the DrugMatrix database provides rich insight into what is occurring at the transcription level when animal organs are perturbed with these compounds at a variety of different doses and time points. This information can help achieve an understanding of the mechanisms of activity for compounds at a much deeper level.

Example: Developing a Kidney Biomarker

One of the fundamental questions regarding gene expression is whether it can be used to identify changes in an animal that are predictive of something that has not yet occurred. Halbert described the development of a kidney biomarker that Iconix hoped would be indicative of the development of kidney injury. Researchers hypothesized that the biomarker would be seen in advance of the detection of any pathological changes in the animal, thus ultimately predicting injury prior to its actual incidence.

Researchers designed an experiment to produce latent renal tubular injury in rats. As is well known, the renal tubules are a major site of toxicity and can be damaged by a variety of drugs. Working with a set of 119 compounds that caused the kind of delayed kidney damage of interest—that is, no histopathological injury at day 5 but measurable injury at day 28—the researchers identified a multigene biomarker that could predict the kidney damage from gene expression patterns that were apparent on day 5. The biomarker was validated by being tested on 32 compounds that had not been used in the original training set.

The researchers then checked the literature for individual gene biomarkers identified as predicting this sort of damage and compared them with the new multigene biomarker. The highest-performing individual gene was Tsc-22. It had a sensitivity of about 63 percent but had a very high false positive rate, so that its specificity was only about 44 percent. By contrast, the multigene biomarker had a sensitivity of 83 percent, a specificity of 79 percent, and an overall accuracy of about 75 percent.

An additional benefit of this biomarker was that it contained a number of genes that could be related to the types of injury that were occurring in the kidney. Thus the gene expression changes served to highlight the various early mechanisms and pathways that contributed to the eventual nephrotoxicity (see Figure 4-1). Despite this biomarker’s success, however, a number of questions remain, such as whether it can be validated in other laboratories; whether its level of accuracy is sufficient to make it useful for drug discovery and development and allow it to replace more costly and time-consuming assays; and how relevant it will be for humans, given that it was derived in rats.

FIGURE 4-1 Example of gene expression changes highlighting early mechanisms and pathways that contribute to nephrotoxicity. The signature shown was developed to predict nephrotoxicity.

The Future of Toxicogenomics

Generally speaking, then, genomic biomarkers have great potential, holding promise for increased predictivity, sensitivity, and specificity, although in every case it is necessary to do an independent forward validation. Furthermore, these biomarkers are relatively easy to apply. Gene expression can be measured easily in target organs by using microarrays and RT-PCR (reverse transcription polymerase chain reaction), and special models or treatment conditions are unnecessary—the gene expression studies can be piggybacked on other studies already under way. As a result, expression profiling is increasingly being incorporated into standard lead optimization and preclinical studies, and is being used as part of the general evaluation performed when one is deciding whether to take a particular compound forward.

More specifically, toxicogenomics is being applied in a variety of ways at the interface of drug discovery and development. It is being used for the prospective prediction of toxicology and the retrospective understanding of why a compound is causing a particular problem. At the drug discovery stage, it is being used to help rank and select compounds or chemical platforms in vivo, and researchers are beginning to develop the technology to the point where it can be used in vitro. Later in the process, toxicogenomics is proving useful in the benchmarking of compounds relative to competitor molecules. This technology is also being used to verify safety at the gene expression level. Researchers establish confidence in a compound by studying it in a variety of tissues and at various doses and time points, and determining that it does not cause the same sorts of changes in gene expression that are signals of problems in the reference compounds that have been studied.

Halbert predicted that the utility and impact of toxicogenomics will drive its acceptance. Already the FDA is learning how to apply this technology. The agency has had a copy of the DrugMatrix database for 2–3 years and has been applying the technology and trying to understand how gene expression can be useful in assessing compounds. Validation of and improvements in the technology will make it possible to move it upstream in the drug discovery process. Working with Abbott Laboratories, Iconix has done a great deal of work with in vitro primary rat hepatocytes that can be used to predict particular end points. Moving from in vivo to in vitro applications will make it possible to increase the sample throughput and begin to look at molecules at a much earlier stage, gaining some understanding of the safety of the molecules at the transcriptional profiling level very early in the process. This capability will lead in turn to reduced costs, and it will also be possible to automate much of this work.

The question remaining is how to transition from prediction of toxicity in rats to prediction of toxicity in humans. Data from rats are very important for addressing regulatory questions, and there have been some efforts to link drug responses in rats with clinical outcomes (toxicological responses in humans). However, a reliable method for doing so has not been developed, and a correlation between rats and humans has yet to be established.


As a second example of the use of toxicogenomics in safety science, Dr. Cockett described how Bristol-Myers Squibb (BMS) uses gene chips and gene expression analysis in the late discovery stage, as well as in the early candidate assessment stage. The experimental design for this technology includes treatment of animals or cell lines with a compound at varied doses and time points. Following treatment, RNA is extracted from the animal tissues or cells and analyzed with a gene chip, such as the Affymetrix Human Genome U133, which contains 44,000 probe sets corresponding to about 32,000 genes. The resulting data are displayed with a heat map that indicates those genes whose expression levels have changed significantly as a result of exposure to the compound, as well as the degree to which the expression level has changed.

A Simple Example

To provide an idea of how this technology might be applied, Cockett described an experiment aimed at creating a disease-like state and then identifying an optimal treatment to cure that state. THP-2 monocyte cells were exposed to tumor necrosis factor (TNF) to create a “surrogate disease phenotype,” and gene chips were used to measure the cells’ changes in gene expression. The TNF-exposed cells were then treated with a variety of drugs, and the response was measured again. The goal was to find drugs that would reverse this phenotype, thereby “curing” the surrogate disease. The gene expression results were displayed in the form of a principal component analysis (PCA) of 515 response markers (see Figure 4-2).

FIGURE 4-2 Use of a principal component analysis (PCA) to identify compounds that reverse surrogate disease phenotypes. In this experiment, a surrogate disease phenotype was created by exposing THP-2 monocyte cells to tumor necrosis factor (TNF).

In the graph of that PCA, control treatments can be seen clustered near the bottom, all in the same color; the TNF-treated cells can be seen clustered near the top of the graph, again in the same color; and scattered around the graph are clusters of other colored dots representing the outcomes of various treatments. The multiple dots of each color that cluster together are experimental replicates and serve to demonstrate the reproducibility of these measurements. As can be seen in the graph, some of the drugs—represented by dots that lie near the controls—completely reversed the TNF response. These were the drugs with the desired response against the surrogate disease. Other drugs not only reversed the TNF stimulation but also created other effects in the cells, as measured by increases in the expression of various genes. The dots representing those drugs are scattered around the graph.
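The clustering behavior described above can be illustrated with a minimal PCA on synthetic profiles. The marker count matches the 515 in the example, but all expression values, group sizes, and the SVD-based projection are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_markers = 515  # number of response markers in the BMS example

# Synthetic stand-ins: control replicates, TNF-treated replicates, and a
# drug that reverses the TNF signature (all profiles are illustrative).
control = rng.normal(0, 0.1, size=(3, n_markers))
tnf_shift = rng.normal(0, 1, size=n_markers)
tnf = control + tnf_shift + rng.normal(0, 0.1, size=(3, n_markers))
reversing_drug = control + 0.05 * tnf_shift + rng.normal(0, 0.1, (3, n_markers))

X = np.vstack([control, tnf, reversing_drug])

# PCA via SVD on the mean-centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T   # project onto the first two principal components

# On PC1 the reversing drug should land near the controls, far from TNF.
ctrl_pc1 = scores[0:3, 0].mean()
tnf_pc1 = scores[3:6, 0].mean()
drug_pc1 = scores[6:9, 0].mean()
print(abs(drug_pc1 - ctrl_pc1) < abs(drug_pc1 - tnf_pc1))
```

Replicates of each synthetic treatment cluster tightly, mirroring the reproducibility the speaker highlighted in the real data.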

Using this technique, it is also possible to examine what happened in the cells by looking at the effects on individual genes. The TNF treatment caused the expression of MCP-1 to increase sharply, for instance, as would be expected. A number of drugs reversed this effect, bringing the expression of MCP-1 down to control levels. But some of these drugs caused an increase in the expression of other genes, representing an off-target activity. These are the drugs seen to the left in the PCA plot.

In sum, this type of analysis allows one to look holistically at how various drugs are acting. This capability can provide insights into the mechanisms behind different off-target activities and help in deciding whether these activities are desirable or not, information that in turn can be used to help guide drug selection.

How BMS Uses Toxicogenomics

One key to revolutionizing the current drug development paradigm is for organizations to commit to providing the training and technology upgrades necessary to enable applications of toxicogenomics. Toxicogenomics has been integrated into much of the drug discovery and development work being done at BMS. The company has trained toxicologists and pathologists in how to understand, analyze, integrate, and communicate transcriptional profiling data. Furthermore, the company enhanced its informatics infrastructure to enable its scientists to use the technology. Scientists at BMS use a number of tools, including the Rosetta Resolver, an analysis system for gene expression data from Rosetta Inpharmatics, and the Iconix DrugMatrix database, as well as a number of tools developed in house. Having a variety of tools with which to analyze the data is useful because it enables scientists to look at the experimental system from multiple angles. The various tools also have different abilities to distinguish the signals from the noise.

With the goal of learning how toxicogenomic data compare with results achieved through standard toxicology assessments, BMS now includes transcriptional profiling as part of routine toxicology assessments and prior to conducting GLP (good laboratory practice) safety studies. Roughly 40 percent of the company’s nonclinical toxicology studies employ this additional assessment tool, and approximately 60 percent of those toxicogenomic studies have been aimed at investigating the mechanisms of toxicity in molecules in which toxicity has already been observed.

To date, toxicogenomics has proved valuable to BMS in a number of ways. It has been useful in identifying pharmacological markers, and in studying on-target versus off-target activities and the tissues in which these activities occur. Sometimes pharmacological events involving tissues in rodents are very different from what occurs in humans; these differences can be investigated by looking at gene expression data from human tissues. Toxicogenomic studies have aided in the understanding of mechanisms of toxicity, particularly in those cases in which a compound can be classified as similar to a control compound class in the Iconix DrugMatrix database. When such a match occurs, researchers can expect that the pharmacological and toxicological activity of the new compound will be similar to that of the compound class in the database, making it possible to decide how to proceed without the need for more experiments.

Potential pharmacological biomarkers are identified in about 30 percent of the studies. These biomarkers are in a target pathway, they can be modulated across the toxicity target tissues and other tissues, and they are often known markers of effect. Furthermore, the response correlates with expected efficacious concentrations, and often the story one can tell is biologically compelling. In short, about a third of the time, BMS researchers find that they can gain a biological understanding related to the known literature, understand the biology of a toxicogenomic experiment in a rat, and proceed rapidly to the next stage of development.

Global Transcription as a Marker for Effect

Cockett described a technique for looking at global effects on the transcriptome that involves graphing all the genes in an experiment on a single plot showing how much they have been changed, either repressed or induced (see Figure 4-3). In the resulting diagram, the marks at the bottom in green represent genes that were repressed, while those on the top in red represent genes that were induced.

FIGURE 4-3 Observation of global changes within the transcriptome. By graphing all the genes in an experiment on a single plot, one is able to visualize globally the extent to which genes were repressed or induced.

In this particular experiment, the researchers found that 221 genes, or 1.4 percent of the transcriptome, had changed. This is actually not a large percentage: with significance set at p < 0.01, 1 percent of the genes will meet the cutoff as a result of random chance alone. To determine whether this percentage, only slightly greater than chance, was meaningful, a second experiment was conducted involving two dose levels—a low dose of 10 mg/kg and a high dose of 50 mg/kg. The low dose yielded a 1.3 percent change in gene expression (slightly greater than random), while the high dose yielded an 8.7 percent change—that is, 8.7 percent of all the genes on the chip had been induced or repressed as a result of the treatment. The low dose in this experiment correlated very closely with the NOAEL (no observed adverse effect level) for the drug: the 10 mg/kg dose showed no adverse effects in the rat, while the 50 mg/kg dose clearly did. Thus the researchers concluded that analyzing gene expression at a global level in this way can provide types of information similar to those gleaned during standard toxicology assessments.
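The logic of the chance-level baseline can be illustrated with simulated p-values; the chip size and the number of truly regulated genes below are illustrative assumptions, not the study's actual numbers:

```python
import numpy as np

rng = np.random.default_rng(2)
n_genes = 15800   # illustrative chip size (221 changed genes ~ 1.4 percent)

# Under the null hypothesis p-values are uniform, so about 1 percent of
# genes pass p < 0.01 by chance alone. Add a block of genuinely regulated
# genes on top of that background.
null_p = rng.uniform(0, 1, size=n_genes - 100)
true_p = rng.uniform(0, 0.001, size=100)       # 100 truly changed genes
p_values = np.concatenate([null_p, true_p])

changed = p_values < 0.01
pct_changed = 100 * changed.mean()
print(f"{changed.sum()} genes ({pct_changed:.1f}% of transcriptome) at p<0.01")
```

The observed fraction lands just above the 1 percent chance floor, which is why a 1.3–1.4 percent change on its own is hard to distinguish from noise without a dose comparison.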

Different compounds can display very different patterns when viewed from this global perspective (see Figure 4-4), and this information can be used to make decisions on whether to continue developing a drug. The researchers tested four compounds in an attempt to draw further conclusions from this technology. The first compound resulted in expression changes in 11.4 percent of the measured genes. Such a high percentage of change was indicative of a nonspecific compound that was hitting multiple targets and causing a great deal of transcriptome change.

FIGURE 4-4 A global transcriptional profile as a biomarker for NOAEL (no observed adverse effect level). In an attempt to draw further conclusions from this technology, four compounds were tested.

The second compound resulted in a 2.9 percent gene expression change, which was associated with potent pharmacology as well as myopathy. Although this was not a large percentage change, certain specific effects of the compound needed to be explored.

The third compound led to only a 1.4 percent change. This was a highly selective compound with no obvious off-target effects: there was no observed toxicity in rodents dosed with the compound, and very little change in the transcriptome. The molecule subsequently failed, however, because it had a cardiac liability involving an ion channel. The lesson here is that toxicogenomics cannot detect everything. In this case, for instance, the assay may not have been looking at the right type of tissue to discern the ion-channel effect.

Finally, the fourth compound resulted in an 8.8 percent gene expression change. A large number of genes were changing, but in this case it turned out that they were associated with an acute phase response at the site of injection, which was largely irrelevant to the pharmacology of the drug and absent when the drug was delivered via a different route. Thus it is possible to be misled in other ways besides missing a toxicity that exists, and it is important to be careful in interpreting these sorts of experiments.

After combining all of its toxicology experiments, BMS found that whenever there was a greater than 3 percent transcriptome change, there was also a clear pathology present; furthermore, most of the compounds with no pathology caused much less than a 3 percent change. Therefore, a broad rule of thumb emerged that a 3 percent transcriptome change represented a crude cutoff point for where one could expect to observe pathology.

Thus BMS has learned to use global transcriptional profiling in its drug safety work. Generally speaking, increases in transcriptional change correlate with increasing pathology and increasing dose, and a level of transcriptional change greater than 3 percent suggests drug-related pathology. Profound transcriptional change—at a level of 7 percent or greater—is usually associated with multiple toxicities, and it is often problematic to interpret these data because it is difficult to disentangle the many phenomena involved. On the other hand, minimal global change—less than 3 percent—is not an assurance of drug safety, but it is suggestive of at least a pharmacological specificity, and one must look at the specific genes and the pathways that are modulated to understand the response in greater detail. Finally, transcriptional changes are distinct from histopathology. They may be less sensitive, or they may arise from pathology elsewhere, as in the case of a liver transcriptional readout of an acute phase reaction in skin. In such situations, one can be misled by a set of gene changes in one tissue responding to changes in another.
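The rule of thumb summarized above might be encoded as a simple triage function. The thresholds come from the talk; the wording of the returned labels is my own paraphrase:

```python
def interpret_transcriptome_change(pct_changed):
    """BMS rule of thumb, as summarized in the talk. A crude heuristic:
    a low percentage is NOT an assurance of safety (see the ion-channel
    example), and high percentages can reflect irrelevant responses."""
    if pct_changed >= 7.0:
        return "profound change: multiple toxicities likely, hard to interpret"
    if pct_changed > 3.0:
        return "drug-related pathology expected"
    return "suggestive of pharmacological specificity; inspect specific pathways"

print(interpret_transcriptome_change(1.4))
print(interpret_transcriptome_change(8.7))
```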


At Abbott Laboratories, toxicogenomics is increasingly being integrated into the drug discovery and development process. Brian Spear, the company’s Director of Genomic and Proteomic Technologies, explained that information gained from studying changes in gene expression can be of value at four different levels of discovery and development:

  • Identifying toxicological issues early prior to large financial investments
  • Selecting compounds least likely to fail because of toxicological issues
  • Understanding mechanisms of toxicity
  • Bridging the preclinical and clinical data by understanding the mechanisms and commonalities in responses

Selecting Compounds

Abbott uses gene expression to try to predict whether a compound is likely to be a hepatotoxicant. Gene expression assays are used to determine not whether a compound is safe, but whether it has a high enough chance of being hepatotoxic that discontinuing its development would be advisable. In short, Abbott uses this technology for lead optimization rather than for safety assessments.

Before the researchers could begin using this technology, they had to develop a hepatotoxicity reference set by administering various doses of multiple known hepatotoxicants and multiple known nonhepatotoxicants to rats, and then observing gene expression changes in a set of 40 genes. The researchers established that different patterns of gene expression are elicited by acute hepatotoxicants, moderate hepatotoxicants, and nonhepatotoxicants (see Figure 4-5). Therefore, when a new compound is administered to rats, the resulting gene expression profile can be compared with those of the reference compounds to determine whether it matches that of a severe, moderate, or mild hepatotoxicant or a nonhepatotoxicant. Specifically, the similarity between the new compound’s gene expression pattern and those of the reference set is analyzed with a pattern-recognition algorithm based on a neural network, and the degree of similarity is reduced to a numerical score. In the example described by Spear, the scores were on a scale of 1 to 4, where 1 indicated severe hepatotoxicity, 2 moderate hepatotoxicity, 3 mild hepatotoxicity, and 4 no evidence of liver injury.
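The reference-set scoring idea can be sketched as follows. The snippet uses a nearest-reference-pattern match by Euclidean distance as a simplified stand-in for the neural-network classifier described, with entirely hypothetical reference profiles:

```python
import numpy as np

rng = np.random.default_rng(3)
n_genes = 40   # size of the hepatotoxicity gene set described in the talk

# Hypothetical mean expression-change profiles for each reference class.
# Scores follow the talk's scale: 1 severe ... 4 no evidence of injury.
reference = {
    1: rng.normal(3.0, 0.2, n_genes),   # severe hepatotoxicant pattern
    2: rng.normal(1.5, 0.2, n_genes),   # moderate
    3: rng.normal(0.7, 0.2, n_genes),   # mild
    4: rng.normal(0.0, 0.2, n_genes),   # nonhepatotoxicant
}

def score_compound(profile):
    """Assign the score of the nearest reference pattern (Euclidean
    distance here; the actual method was a neural-network classifier)."""
    return min(reference, key=lambda s: np.linalg.norm(profile - reference[s]))

new_compound = rng.normal(1.4, 0.3, n_genes)  # resembles the "moderate" class
print(score_compound(new_compound))
```

A simple yes/no call, as used in the trials described below, then reduces to comparing the score against a cutoff such as 2.5.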

FIGURE 4-5 Gene expression profile of a hepatotoxicity reference set. This figure exhibits gene expression patterns elicited by acute hepatotoxicants, moderate hepatotoxicants, and nonhepatotoxicants.

After establishing this predictive reference set, the researchers tested it to see whether it was actually predictive for rats. Using 278 different expression profiles spanning multiple drugs, time points, and doses, they compared the predictions for these various treatments with what was already known about the compounds in the literature. (All the test compounds in this case were ones for which information existed in the literature.) Among the 278 expression profiles, the predictive assay yielded the same numerical score as the literature in 246 cases. In another 21 cases, the assay’s score was within 1 of the score in the literature—for example, 3 instead of 2. Thus for 267 of the 278 expression profiles, the assay agreed closely—and often exactly—with the results reported in the literature on the degree of hepatotoxicity to be expected.

The next step was to test new compounds with the assay to see whether it could be used in a predictive way. The first trials compared how well the gene expression patterns at 2 weeks correlated with the results of histopathology and clinical chemistry at 2 weeks. For these trials, the researchers used a cutoff score of 2.5 to rate the gene expression patterns, so that any compound scoring below 2.5 was said to be positive for hepatotoxicity and any compound scoring above that level was negative—a simple yes/no score. The assay was 100 percent accurate: it correctly predicted six of six negatives and two of two positives. But since the gene expression patterns, like the histopathology and clinical chemistry studies, were from the 2-week time point, the results were a bit like predicting the present. The real question was whether gene expression patterns observed earlier in the process could predict hepatotoxicity prior to its physical manifestation.

To answer this question, the researchers compared the results of short-term gene expression assays—performed 3 or 5 days after exposure—with the results of 2-week toxicology studies. Again using yes/no scoring, the short-term assays correctly predicted 50 of 52 hepatotoxicity results, or 8 of 9 positive outcomes and 42 of 43 negative outcomes. This added up to a specificity of 97.7 percent, a sensitivity of 88.9 percent, and an overall accuracy of 96.2 percent, which meets Abbott’s needs. Spear explained that the accuracy need not be 100 percent, just high enough to provide sufficient confidence that work on a compound should be discontinued based on 3- or 5-day exposure data, without the need for expensive and time-consuming animal studies.
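The reported performance figures follow directly from the stated counts, as a quick check:

```python
def hepatotox_call(score, cutoff=2.5):
    """Yes/no call from the 1-4 score: below the cutoff = positive."""
    return score < cutoff

# Counts from the short-term (3- or 5-day) assay described in the talk:
tp, fn = 8, 1     # 8 of 9 positive outcomes predicted correctly
tn, fp = 42, 1    # 42 of 43 negative outcomes predicted correctly

sensitivity = tp / (tp + fn)                 # 8/9   ~ 88.9%
specificity = tn / (tn + fp)                 # 42/43 ~ 97.7%
accuracy = (tp + tn) / (tp + fn + tn + fp)   # 50/52 ~ 96.2%
print(f"{sensitivity:.1%} {specificity:.1%} {accuracy:.1%}")
print(hepatotox_call(1.8), hepatotox_call(3.2))
```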

Abbott now uses this assay regularly for screening new compounds. In one case, for instance, three compounds were examined for a project looking at kinase inhibitors. One of the three compounds scored very low on the assay, implying severe hepatotoxicity, and work on that compound was discontinued. The other two had scores in the mild or nonhepatotoxic range, and work on them moved forward.

Spear offered several conclusions and lessons learned from Abbott’s experience with these gene expression assays:

  • The accuracy of the assay must be established, and while it need not be 100 percent, it must be good enough to establish sufficient confidence for decision-making purposes. When multiple compounds are being considered, about 96 percent accuracy is sufficient.
  • The validity of Abbott’s gene assay applies only to rats; the assay has not been validated in humans.
  • Conventional toxicology and pathology remain the gold standard, and it is necessary to compare what happens in the animals with what happens in the assay. If the two sets of results conflict, the toxicology is considered correct, and the assay must be reworked.
  • The value of such an assay is greatest during lead optimization, prior to candidate selection. The assay is most useful when there are multiple compounds involved and the project team needs help in making decisions about which ones to pursue.

Understanding Mechanisms of Toxicity5

A second application of toxicogenomics is to help elucidate the mechanism of toxicity once a compound has shown toxicity in rats or in some other preclinical model. It is important to understand the mechanism involved and to know whether it can be screened for. Toxicogenomics can be useful for this purpose because various gene expression patterns have been associated with specific mechanisms. An important caveat is that toxicogenomics should be relied upon not as a way of identifying mechanisms of toxicity, but as a way of generating a hypothesis that can be tested with more conventional approaches. Thus the technology’s value lies in its ability to help obtain an answer more quickly.

An example is a drug that showed cardiotoxicity in rats in a 2-week study at a high dose—200 mg/kg—with myocardial degeneration and necrosis. At lower doses of 30 or 80 mg/kg, however, there was no evidence of cardiotoxicity. Since the researchers had no serum protein or other biomarker with which to monitor the toxicity, they turned to gene expression patterns in an attempt to understand the mechanism of toxicity. Among rats in the low-dose treatment groups, there were very few gene expression changes in the heart, while rats given 200 mg/kg for 5 days showed striking gene expression changes. What was most interesting was that in rats given 200 mg/kg for 1 day, there were no physical signs or symptoms of cardiotoxicity, but the gene expression pattern was quite similar to that seen in the rats given the high dose for 5 days. Thus the gene expression pattern was an early indicator of impending cardiotoxicity.

To explore these results further, the researchers looked at the particular genes that were up- and down-regulated, and were able to determine that a number of the genes were related to mitochondrial impairment. Some were mitochondrial function genes; these appeared to be down-regulated. Others were genes related to oxidative stress; these were up-regulated. Accordingly, the researchers hypothesized that the compound was inhibiting mitochondrial function, and designed experiments to test this. The first test was to treat the animals with the compound for 4 days and then remove their mitochondria and determine the mitochondria’s oxygen consumption. The mitochondrial oxygen consumption in the treated rats had been reduced to a degree that was comparable to that seen with doxorubicin, another cardiac toxin. In an in vitro experiment in which mitochondria were isolated from cardiac tissue and then treated with different compounds, the test compound was also seen to result in mitochondrial inhibition. Thus the researchers concluded that the compound was likely to be a mitochondrial toxin.

Spear pointed out that the gene expression assay was used to generate a hypothesis—that the mechanism of toxicity was inhibition of mitochondrial function—but other tests were then used to test this hypothesis. The test for toxicology is still clinical chemistry and histopathology, and gene expression studies are not going to replace in vivo toxicological studies. While gene expression studies may shorten the path to an answer, conventional studies will still be necessary.

Gene Expression Profiling in Early Discovery Studies6

Dr. Blomme elaborated on the toxicogenomics work being done at Abbott, describing the added value of rat exploratory toxicology studies.

During the lead optimization process, molecules are characterized with a battery of in vitro and in vivo assays to evaluate various physical, chemical, pharmacological, metabolic, pharmacokinetic, and toxicological properties. If assays are to be used to help make go/no go decisions early in the development process, they must have two important characteristics: they must utilize limited quantities of compound (milligram to gram range) since at this stage compound availability is a major limitation, and results need to be delivered rapidly.

Generally speaking, efforts to reliably evaluate physical, chemical, pharmacological, metabolic, and pharmacokinetic properties during the lead optimization process have been successful, but the same cannot be said for toxicology. Consequently, Abbott came up with the concept of using short-term rat studies to evaluate toxicology during the lead optimization process.

Ideally, for these studies to be useful in the lead optimization process, they should last no more than a few days, use limited numbers of animals, and be performed with 2–4 grams of compound, a quantity Abbott researchers have found to be sufficient. Traditional dose range–finding studies involve five animals per group, dosing for 7 or more days, and more than 10 grams of compound. Requiring less compound can translate into studies being completed earlier in the development process.

Traditional toxicology end points include clinical pathology and histopathology; after a short period of dosing, however, these analyses cannot predict toxic events consistently. Furthermore, when only a small number of animals are used, the pathological results are often difficult to interpret. Predictive toxicogenomics is a valuable technology because it has greater sensitivity than traditional methods. As explained earlier in this chapter, gene expression changes typically occur before the functional and morphological changes that are detected by histopathology or clinical pathology.

The increased sensitivity of predictive toxicogenomics implies that gene expression studies should theoretically make it possible to dose animals for shorter durations, and the literature suggests that in some cases, gene expression changes can be observed within hours or perhaps 1 day of administering a toxicant. However, this is the exception rather than the rule. For many compounds, a steady state in tissue kinetics must be achieved before gene expression changes can be measured reliably. In particular, after only 1 day of exposure, gene expression changes can vary greatly among individuals, making interpretation challenging. Further dosing generally leads to less variability and more reliable interpretation.

To illustrate, Blomme described work on developing a signature for bile duct hyperplasia, done in collaboration with Iconix Pharmaceuticals. The goal was to predict bile duct hyperplasia in the liver of rats using liver gene expression profiles from either 1 or 5 days of dosing. Typically it takes several days for bile duct hyperplasia to occur and to be visible morphologically to pathologists.

The signature was derived from a training set of DrugMatrix reference profiles with the same procedures described earlier by Dr. Halbert. The researchers then evaluated the ability of that signature to detect bile duct hyperplasia in rats by exposing rats to a set of 10 compounds not included in the training set. Rats were treated for 1, 5, and 28 days with daily doses of one of the 10 compounds. Gene expression profiles were created from the livers of the rats treated for 1 or 5 days, while the livers were examined histopathologically for the rats treated for 28 days to determine whether treatment with a particular compound had led to bile duct hyperplasia. Then the researchers looked at how often the signature correctly predicted the presence or absence of bile duct hyperplasia after 28 days of dosing.

Using the expression profiles generated after 5 days of dosing, the signature correctly predicted the occurrence of bile duct hyperplasia after 28 days in all cases—seven of seven positive compounds and three of three negative compounds. Using the 1-day gene expression profiles, however, the signature was much less successful, predicting only three of seven positive and three of three negative compounds. Thus it can be concluded that prediction in a single-dose study is not reliable, and that 3 to 5 days of dosing will generally be necessary to enable gene expression profiles to predict outcome reliably.

Another advantage of gene expression profiling is that, in general, fewer animals are necessary. Abbott researchers have found that gene expression profiles show less animal-to-animal variability than many traditional end points, such as histopathology or clinical pathology. Thus for their short-term toxicology studies, the researchers are confident in making decisions using only three animals per group. A significant reduction in numbers of animals corresponds to a significant reduction in the amount of compound required for the studies.

A third advantage of gene expression profiling in the context of short-term studies is the ability to generate mechanistic data that are useful in understanding changes detected by other means. Since short-term exploratory studies use limited numbers of rats per group, limited numbers of groups, and doses that are not always optimized, the data are often quite challenging to interpret. By gaining a better understanding of changes, it becomes possible to make better predictions about their progression and significance.

An example is a compound that was given at four different doses up to 300 mg/kg/day, which after 5 days of dosing led to a dose-dependent increase in liver weight. Because many drugs on the market cause an increase in liver weight, particularly at high doses, the researchers sought to determine the toxicological significance of this finding.

Using a gene expression–based artificial neural network algorithm to predict the hepatotoxicity potential, the researchers found that after prolonged dosing at 200 or 300 mg/kg/day, the compound would likely become toxic in the rats. Therefore, to perform 2- or 4-week studies, they would have to use doses lower than 200 mg/kg/day. They then used the DrugMatrix database to evaluate the gene expression profiles at the various doses. At doses greater than 200 mg/kg/day, there was a significant correlation with several gene expression profiles in the database that were induced by hepatic toxicants, such as dipyrone and econazole. Next the researchers tried to determine which pathways were being affected by the compound. The data indicated that when doses greater than 200 mg/kg/day were administered, several toxicologically relevant pathways were affected, including oxidative stress, cholesterol biosynthesis, and aryl hydrocarbon receptor signaling. This finding provided additional evidence that at doses greater than 200 mg/kg/day, the compound would result in rat hepatotoxicity. Thus a dose of 100 mg/kg/day was selected for the subsequent 2-week rat toxicology study.

According to Blomme, these examples demonstrate that gene expression profiling can be a valuable addition to early discovery studies. It is a sensitive and specific indicator of toxicity. It is associated with less interindividual variability, making it possible to use fewer animals and thus to conduct studies with less compound. And it adds a level of mechanistic information that is quite useful in improving the interpretation of findings of short-term studies. Abbott researchers are using this technology to assess the toxicity of compounds and make go/no go decisions about their advancement.


There are various uses of and methods for conducting gene expression analyses to help predict the toxic effects of compounds and provide insights into the mechanisms of toxicity. Gene expression can be predictive and, in particular, can be more sensitive than traditional approaches; it was suggested that a level of transcriptional change greater than 3 percent indicates drug-related pathology. Information gained from studying gene expression changes can be used to

  • identify toxicological issues early prior to large financial investments;
  • select the compounds that are least likely to fail because of toxicological issues;
  • understand mechanisms of toxicity; and
  • bridge preclinical and clinical data by understanding the mechanisms and commonalities in responses.

These gene expression assays, however, apply only to rat models, and the next challenge is to transition from prediction of toxicity in rats to prediction of toxicity in humans.



This section is based on the presentation of Don Halbert, Executive Vice President for Research and Development, Iconix Pharmaceuticals.


This section is based on the presentation of Dr. Halbert.


This section is based on the presentation of Mark Cockett, Vice President, Applied Genomics, Bristol-Myers Squibb.


This section is based on the presentation of Dr. Spear.


This section is based on the presentation of Dr. Spear.


This section is based on the presentation of Eric Blomme, Project Leader in Cellular, Molecular, and Exploratory Technology, Abbott Laboratories.