Summit on Translat Bioinforma. 2009; 2009: 116–120.
Published online Mar 1, 2009.
PMCID: PMC3041577

Toward an Ontological Treatment of Disease and Diagnosis

Richard H. Scheuermann, PhD,*,1 Werner Ceusters, MD,2,4 and Barry Smith, PhD*,3,4


Many existing biomedical vocabulary standards rest on incomplete, inconsistent or confused accounts of basic terms pertaining to diseases, diagnoses, and clinical phenotypes. Here we outline what we believe to be a logically and biologically coherent framework for the representation of such entities and of the relations between them. We defend a view of disease as involving in every case some physical basis within the organism that bears a disposition toward the execution of pathological processes. We present our view in the form of a list of terms and definitions designed to provide a consistent starting point for the representation of both disease and diagnosis in information systems in the future.


The goal of this communication is to outline a terminological framework that encompasses diseases, their causes and manifestations, and diagnostic acts and other entities pertaining to the ways diseases are recognized and interpreted in the clinic. Inspection reveals that such entities have thus far not been adequately treated in standard vocabulary resources. The National Cancer Institute Thesaurus (NCIT), for example, identifies ‘Chronic Phase of Disease’ as a subtype of ‘Finding’, which it defines as:

Objective evidence of disease perceptible to the examining physician (sign) and subjective evidence of disease perceived by the patient (symptom) [1].

This definition implies, however, that a disease does not exist except as one or other form of evidence. It thus illustrates a common conflation between processes on the side of the organism and the evidence for the existence of such processes. That this conflation is problematic is revealed when we need to link observable clinical phenomena to hypothesized unobservable biological causes.

A misplaced focus on observables is reflected also in the traditional practice of classifying diseases on the basis of patterns of similarities in signs and symptoms. This practice creates problems in the face of the wide variations in clinical presentations of many diseases [2] and of the increasing importance of understanding the ways disease correlates with genetic and environmental variables [3]. The effective study of such correlations requires clinical research to be applied to ever larger pools of subjects drawn from geographically separated populations in multi-institution studies, which in turn requires that the healthcare institutions involved embrace common standardized terminologies in capturing and sharing their data. The definitions presented here are designed to provide the resources in terminology and disease classification to support such standardization.

The approach we recommend rests on an account of diseases as dispositions rooted in physical disorders in the organism and realized in pathological processes. This approach helps us to do justice (1) to the existence of pre-clinical manifestations of disease (disorders can exist before they are realized in overt pathological processes); (2) to the combinations of disease and predispositions to disease which can exist within a single patient (as when an instance of disease of type A in a given patient is a risk factor for a second disease of type B); and (3) to the fact that the disease course and the clinical picture may vary widely between patients who have the same disease.

Materials and methods

We reviewed the current definitions of terms pertaining to disease and diagnosis in standard terminology resources and found them to capture inadequately the logical relationships between the terms defined, thus providing an inadequate foundation for information integration and reasoning. We created our definitions drawing on best practices in ontology development as promulgated within the OBO Foundry [4]. These definitions apply to the terms as used in the context of this paper. Thus we do not claim that ‘disease’ as here defined denotes what clinicians in every case refer to when they use the term ‘disease’. Rather, our definitions are designed to make clear that such clinical use is often ambiguous.


While it is generally good practice to provide precise definitions for the terms assembled in a terminology, some terms must remain undefined in order to avoid circularity or infinite regress. The undefined terms are of three sorts: either (i) they are non-technical terms derived from ordinary English; (ii) they are technical terms derived from basic science; or (iii) they are primitive terms specific to our domain of interest. Some terms in group (iii) require special attention. While, ex hypothesi, we cannot provide definitions for these terms, we can provide some elucidation and illustrative examples.

Informal Elucidations of Primitive Terms

Physical components are anatomical structures and other physical entities within or on the surface of the body, including organs, cells, portions of blood, body flora, pathogens, toxins, and their combinations. Bodily qualities are for example the color or mass of a physical component. Bodily processes are processes unfolding in or on the body in which physical components serve as participants. We use ‘bodily feature’ as an abbreviation for a physical component, a bodily quality, or a bodily process. (Disjunctive terms of this sort fall short of ontological best practice; they are employed here in order to simplify our treatment of established disjunctive terms such as ‘sign’ and ‘phenotype’.)

A disposition is an attribute of an organism in virtue of which it will initiate certain specific sorts of processes when certain conditions are satisfied. Examples are: our disposition to crave liquid following dehydration; the disposition of an epithelial cell in the G2 phase of the cell cycle to become diploid following mitosis. In any organism there is a wide variety of dispositions, some associated with health, others with disease. We use ‘realization’ to refer to the process through which a disposition is realized, and we shall identify diseases as dispositions realized in pathological processes.

Each disposition in the organism has a physical basis. The physical basis of a disease is some combination of physical components within the organism, typically at multiple levels of granularity.

When we say that some bodily feature of an organism is clinically abnormal, this signifies that it: (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy), (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and (3) is such that the elevated risk exceeds a certain threshold level [5]. This treatment of ‘abnormal’ is distinct from those statistical treatments which do not take account of the overlap in the distribution of test results between normal and abnormal populations or of normal distribution extremes. What are standardly called ‘normal variants’ (for example a left lung with three lobes) do not satisfy criteria (2) and (3).
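The three criteria for clinical abnormality can be read as a conjunction of conditions. The following is a minimal sketch of that reading, not a representation proposed in this paper: the record `BodilyFeature`, its fields, and the numeric risk values are hypothetical, and the threshold is left as a parameter because no specific value is fixed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BodilyFeature:
    """Hypothetical record for a physical component, bodily quality, or bodily process."""
    name: str
    part_of_life_plan: bool  # criterion (1): True for features such as aging or pregnancy
    elevated_risk: float     # criterion (2): causally linked excess risk; 0.0 if none


def is_clinically_abnormal(feature: BodilyFeature, risk_threshold: float) -> bool:
    """Return True only if all three criteria hold."""
    not_in_life_plan = not feature.part_of_life_plan          # criterion (1)
    has_elevated_risk = feature.elevated_risk > 0.0           # criterion (2)
    exceeds_threshold = feature.elevated_risk > risk_threshold  # criterion (3)
    return not_in_life_plan and has_elevated_risk and exceeds_threshold


# A normal variant: not part of the life plan, but no elevated risk,
# so criteria (2) and (3) fail.
three_lobed_left_lung = BodilyFeature("left lung with three lobes", False, 0.0)
```

Under these assumptions, `is_clinically_abnormal(three_lobed_left_lung, 0.1)` evaluates to `False`, matching the treatment of normal variants above.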

We use ‘homeostasis’ to designate a disposition of the whole organism (or of some causally relatively isolated part of the organism, such as a single cell) to regulate its bodily processes in such a way as (1) to maintain bodily qualities within a certain range or profile and (2) to respond successfully to departures from this range caused by internal influences or environmental influences such as poisoning. When bodily processes yield qualities outside the homeostatic range, the organism initiates processes designed to return the qualities to a value within this range. In some cases, homeostasis can be lost and then re-gained at a level that is clinically abnormal, for example in the case of adaptation to major injury. In other cases the organism will pass a point where it falls irreversibly outside the realm of homeostasis.

Definitions of Terms Referring to Entities on the Side of the Organism

We pursue a view of disease as resting in every case on some (perhaps as yet unknown) physical basis [6]. When, for example, there is a persistent elevated level of glucose in the blood, this is because (1) some physical structure or substance in the organism is disordered (e.g. loss of beta cells in pancreatic islets) as a result of which (2) there exists a disposition (diabetes) for the organism to act in a certain abnormal way. The disposition in question is realized by pathological processes (diabetic nephropathy) including manifestations that can be recognized as signs of the disorder (proteinuria).
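The chain described in the diabetes example (disorder, disposition, pathological process, sign) can be sketched as linked records. This is only an illustration under simplifying assumptions: the class names, the field names, and the helper `trace_sign` are hypothetical, and signs are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Disorder:
    """The physical basis: a disordered structure or substance."""
    description: str


@dataclass
class Disease:
    """A disposition that exists because of one or more disorders."""
    name: str
    physical_basis: List[Disorder]


@dataclass
class PathologicalProcess:
    """A process that realizes the disease and may manifest as signs."""
    name: str
    realizes: Disease
    signs: List[str] = field(default_factory=list)


def trace_sign(process: PathologicalProcess, sign: str) -> List[str]:
    """Walk from an observed sign back to the underlying disorder(s)."""
    if sign not in process.signs:
        raise ValueError(f"{sign!r} is not a recorded sign of {process.name!r}")
    disease = process.realizes
    return [sign, process.name, disease.name] + [
        d.description for d in disease.physical_basis
    ]


beta_cell_loss = Disorder("loss of beta cells in pancreatic islets")
diabetes = Disease("diabetes", [beta_cell_loss])
nephropathy = PathologicalProcess("diabetic nephropathy", diabetes, ["proteinuria"])
```

Calling `trace_sign(nephropathy, "proteinuria")` returns the path from the sign through the process and the disposition to the physical basis, which keeps the evidence (the sign) distinct from the entities it is evidence for.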

There is a range of values for a set of bodily feature types whose maintenance is continuously sought by an organism in homeostasis (e.g. 65 – 110 mg glucose/dL serum). This range for each given organism will vary in light of environmental and behavioral changes. For example it will reflect raised heart beat frequency while taking exercise.

Abnormal Homeostasis =def. – Homeostasis that is clinically abnormal for an organism of a given type and age in a given environment.

Normal Homeostasis =def. – Homeostasis of a type that is not clinically abnormal.

Disorder =def. – A causally relatively isolated combination of physical components that is (a) clinically abnormal and (b) maximal, in the sense that it is not a part of some larger such combination.

Although each single cell within a tumor is disordered in its own right, for us the disorder is the tumor as a whole; it is the maximal collection of all disordered cells. Other examples of disorders are: mutated genomic DNA, portions of endotoxin in blood, blood with reduced cortisol levels causing adrenal crisis. Such disorders are the physical basis of disease. A disease comes into existence because some physical component becomes malformed. The disorder might be a malformation, it might involve a virus or toxin coming in from the outside, or it might arise because the absence of a normal bodily component leads to abnormal functioning.
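The maximality clause (b) in the definition of disorder can be illustrated with sets. In this sketch, which rests on strong simplifying assumptions, each candidate combination is a frozenset of hypothetical component identifiers, every candidate is assumed to be already clinically abnormal and causally relatively isolated, and parthood is approximated by the proper-subset relation.

```python
from typing import FrozenSet, Iterable, Set


def maximal_combinations(
    candidates: Iterable[FrozenSet[str]],
) -> Set[FrozenSet[str]]:
    """Keep only candidates that are not a proper part of another candidate."""
    unique = set(candidates)
    # For frozensets, `a < b` tests whether a is a proper subset of b.
    return {c for c in unique if not any(c < other for other in unique)}


candidates = [
    frozenset({"cell1"}),                    # a single disordered tumor cell
    frozenset({"cell2", "cell3"}),           # a partial collection of tumor cells
    frozenset({"cell1", "cell2", "cell3"}),  # the tumor as a whole
    frozenset({"endotoxin portion"}),        # an unrelated disorder
]
disorders = maximal_combinations(candidates)
```

With these inputs only the whole tumor and the endotoxin portion remain, mirroring the point that the disorder is the tumor as a whole rather than each disordered cell.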

Pathological Process =def. – A bodily process that is a manifestation of a disorder.

Some pathological processes are changes in the way a normal physiological function is realized (e.g. hyperventilation); some have no normal physiological counterpart (e.g. acute inflammation).

Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.

Examples: epilepsy as a disease that disposes to the occurrence of seizures (pathological processes) due to an underlying abnormality in the neuronal circuitry of the brain (physical basis); AIDS as a disease that disposes to non-HIV pathogen persistence and duplication (pathological processes) following opportunistic infections that take advantage of a weakened immune system (physical basis).

Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X.

A predisposition is a disposition to acquire a further disposition. Some diseases, for example AIDS, are predispositions to further diseases. The case of moderate genetic risk factors tells us that not all predispositions to disease are themselves diseases.

Etiological Process =def. – A process in an organism that leads to a subsequent disorder.

Example: toxic chemical exposure resulting in a mutation in the genomic DNA of a cell.

The etiological process creates the physical basis of that disposition to pathological processes which is the disease. With some diseases it may be possible to associate specific etiological determinants – processes which must take place if a disease is to exist. Some etiological processes, in contrast, will be causes of clinical phenotypes, such as inflammation, which are common to many diseases. They will be comparable to the environmental processes that modify the presentation and course of the disease.

Etiological processes do not form a natural kind. To be etiological is to be such as to have brought about an outcome of a certain sort: pathological processes realizing one disease may lead to dysfunction that gives rise to the further disease of depression.

Disease Course =def. – The totality of all processes through which a given disease instance is realized.

Transient Disease Course =def. – A disease course that terminates in a return to normal homeostasis.

Chronic Disease Course =def. – A disease course that (a) does not terminate in a return to normal homeostasis and (b) would, absent intervention, fall within an abnormal homeostatic range.

Examples: acquired deafness; intermittent seizures in a person suffering from epilepsy.

Progressive Disease Course =def. – A disease course that (a) does not terminate in a return to homeostasis and (b) would, absent intervention, involve an increasing deviation from homeostasis.

Example: malignant cancer.

Note that for any given patient it may at any given point in time be difficult to determine which type of disease course is involved. A single episode of transient paralysis may be insufficient to arrive at a diagnosis of multiple sclerosis until a second episode occurs. Although the disposition was present at the time of the initial episode, our ability to diagnose the underlying disorder is limited by the manifestations that have been observed up to that point in time.
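The three course types can be sketched as a simple decision over two facts about a disease course. This is an illustrative assumption rather than part of the definitions: the enum `CourseType`, the function `classify_course`, and its two boolean inputs are hypothetical, and, as noted above, these facts may not be knowable for a given patient at a given point in time.

```python
from enum import Enum


class CourseType(Enum):
    TRANSIENT = "transient"
    CHRONIC = "chronic"
    PROGRESSIVE = "progressive"


def classify_course(
    returns_to_normal_homeostasis: bool,
    deviation_increasing: bool,
) -> CourseType:
    """Classify a disease course, absent intervention, from two hindsight facts."""
    if returns_to_normal_homeostasis:
        # Terminates in a return to normal homeostasis.
        return CourseType.TRANSIENT
    if deviation_increasing:
        # No return, and the deviation from homeostasis keeps growing.
        return CourseType.PROGRESSIVE
    # No return, but the course settles within an abnormal homeostatic range.
    return CourseType.CHRONIC
```

The sketch gives precedence to a return to normal homeostasis, so the second argument is ignored for transient courses.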

Definitions of Terms Referring to Genetic Disorders

Genetic Disorder =def. – A disorder whose etiology involves an abnormality in the nucleotide sequence of an organism’s genome.

Constitutional Genetic Disorder =def. – A genetic disorder inherited during conception that is borne by all cells in the organism.

Example: mutation in the hexosaminidase gene leading to Tay-Sachs disease.

Acquired Genetic Disorder =def. – A genetic disorder acquired by a single cell in an organism that leads to a population of cells within the organism bearing the disorder.

Example: a point mutation acquired in the H-ras gene in colorectal adenoma cells.

Constitutional Genetic Disease =def. – A disease whose physical basis is a constitutional genetic disorder.

Examples: chronic: color blindness, polydactyly; progressive: Down syndrome, Tay-Sachs disease.

Acquired Genetic Disease =def. – A disease whose physical basis is an acquired genetic disorder.

Examples: chronic: benign colonic neoplasia (here the physical basis is an APC mutation); progressive: malignant colon cancer (here the physical basis is a combination of APC, ras and p53 mutations).

Genetic Predisposition to Disease of Type X =def. – A predisposition to disease of type X whose physical basis is a constitutional abnormality in an organism’s genome.

This abnormality is the physical basis for the increased risk of acquiring disease X. Examples: p53 mutation in Li-Fraumeni Syndrome predisposing to cancer; ApoE alleles predisposing to Alzheimer’s.

Definitions of Terms Referring to Infections

Infectious Disorder =def. – A disorder whose etiology includes the presence of a pathogenic organism within a host organism or an abnormal imbalance in the normal resident organismal flora.

Infectious Disease =def. – A disease whose physical basis is an infectious disorder.

Examples: transient: seasonal flu; chronic: genital herpes; progressive: Ebola hemorrhagic fever.

Secondary Infection =def. – A disorder consisting in the presence of a pathogenic organism within a host organism that occurs due to the disposition established by a prior infection with a pathogenic organism of a different kind (e.g. cryptosporidiosis in a patient suffering from AIDS).

Definitions of Terms Relating to Clinical Evaluations

In many cases, organisms harbor disorders before the associated dispositions are realized in changes that are observable. Once observable, these changes are usually first recognized by patients (symptoms) and subsequently observed by clinicians (signs). Although the terms ‘sign’ and ‘symptom’ are frequently used in this way to distinguish sources of evidence, the distinction may be of limited utility. We believe that a more rigorous treatment of the distinction would be through the explicit representation of the agents (clinician, patient, family member, lab technician) involved in different sorts of observations. However, because the distinction of ‘sign’ and ‘symptom’ is routinely drawn by clinicians in the conduct of patient care, we include definitions for these terms as follows.

Sign =def. – A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance.

We can distinguish a further use of ‘sign’ in the context ‘sign of’. Two clinicians may observe the same clinically abnormal bodily feature, e.g. a hand tremor, in a single patient but interpret it differently: either each takes it to be a ‘sign of’ a distinct disorder (where the patient has two disorders), or both take it to be a sign of one disorder while differing in opinion about the relevant disease type (e.g. hyperthyroidism or Parkinson’s).

Vital Sign =def. – A physical sign in which a non-zero value is standardly considered to be an indication that the organism is alive.

The relative values for vital signs are often used as measures that can indicate the presence of disease.

Symptom =def. – A bodily feature of a patient that is observed by the patient and is hypothesized by the patient to be a realization of a disease.

Again we can distinguish the special usage ‘symptom of’: a clinician may attribute a symptom as being a symptom of some specific disease. On some readings of the term, ‘symptom’ refers paradigmatically to pains and other feelings and sensations which are such that they can be observed only by the patient.

Neither signs nor symptoms form a natural kind, but are rather composite classes – fiat collections of bodily features delineated by certain socially established cognitive practices on the parts of clinicians and patients.

Clinical History =def. – A series of statements representing health-relevant features of a patient.

The term ‘clinical history’ is also sometimes used to refer to the collection of disease courses in a given patient. Even a patient who never went to the doctor may have a clinical history on this reading.

Clinical History Taking =def. – An interview in which a clinician elicits a clinical history from a patient or from a third party who is reporting on behalf of the patient.

Physical Examination =def. – A sequence of acts of observing and measuring bodily features of a patient performed by a clinician; measurements may occur with and without elicitation.

Laboratory Test =def. – A measurement assay that has as input a patient-derived specimen, and as output a result representing a quality of the specimen.

Laboratory Finding =def. – A representation of a quality of a specimen that is the output of a laboratory test and that can support an inference to an assertion about some quality of the patient.

Normal Value =def. – A value for a quality reported in a lab report and asserted by the testing lab or the kit manufacturer to be normal based on a statistical treatment of values from a reference population.
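The relation between a laboratory finding and a normal value can be sketched as two separate records plus a comparison. All names here are hypothetical, and the glucose range is borrowed from the homeostasis example given earlier purely as illustrative numbers; a real normal value would be whatever the testing lab or kit manufacturer asserts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalValue:
    """A range asserted to be normal, based on a reference population."""
    low: float
    high: float
    unit: str
    asserted_by: str  # e.g. the testing lab or the kit manufacturer


@dataclass(frozen=True)
class LaboratoryFinding:
    """A representation of a quality of a specimen, output by a laboratory test."""
    quality: str
    value: float
    unit: str


def is_within_normal_value(finding: LaboratoryFinding, normal: NormalValue) -> bool:
    """Compare a finding against an asserted normal range in the same unit."""
    if finding.unit != normal.unit:
        raise ValueError("finding and normal value use different units")
    return normal.low <= finding.value <= normal.high


# Illustrative numbers only.
glucose_normal = NormalValue(65.0, 110.0, "mg/dL", "testing lab")
finding = LaboratoryFinding("serum glucose concentration", 180.0, "mg/dL")
```

Keeping the statistical notion of a normal value in its own record avoids conflating it with clinical abnormality, which is defined above by risk rather than by position in a reference distribution.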

Manifestation of a Disease =def. – A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable.

Observability includes observability through elicitation of response or through the use of special instruments.

Preclinical Manifestation of a Disease =def. – A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination.

Clinical Manifestation of a Disease =def. – A manifestation of a disease that is detectable in a clinical history taking or physical examination.

Phenotype =def. – A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment.

Clinical Phenotype =def. – A clinically abnormal phenotype.

Disease Phenotype =def. – A clinical phenotype that is characteristic of a single disease.

Note that, according to our definition, a disease phenotype can exist without being observed. Indeed, as technology advances, our ability to detect the underlying components of a disease phenotype will expand. The full disease phenotype would incorporate the abnormal phenotypes realized at each stage of the disease course. We can also distinguish a less and a more inclusive reading of ‘disease phenotype’. Under the former, a disease phenotype may be a single type of abnormality characteristic of a given disease; under the latter a disease phenotype is a maximal combination of such single phenotypes, ordered in a temporal sequence characteristic of one or more typical courses for the given disease.

Clinical Picture =def. – A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient.

Diagnosis =def. – A conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type.

A diagnosis is a continuant entity that, once made, will survive through time, and is often supplanted by further diagnoses. The diagnostic process is thus iterative: the clinician is forming hypotheses during history taking, testing these during physical exam, forming new hypotheses as a result, and so on.
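The idea that a diagnosis persists once made, yet may be supplanted by a later one, can be sketched as an append-only history. This is a hypothetical illustration, not a proposed record format: the classes `Diagnosis` and `DiagnosticHistory` and their fields are assumptions, and the clinical picture is reduced to a free-text summary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Diagnosis:
    """An assertion that the patient has a disease of a given type."""
    disease_type: str
    clinical_picture: str                      # summary of the input to the interpretive process
    supplanted_by: Optional["Diagnosis"] = None


class DiagnosticHistory:
    """Append-only record: earlier diagnoses survive after being supplanted."""

    def __init__(self) -> None:
        self._diagnoses: List[Diagnosis] = []

    def assert_diagnosis(self, disease_type: str, clinical_picture: str) -> Diagnosis:
        new = Diagnosis(disease_type, clinical_picture)
        if self._diagnoses:
            # Link the previous diagnosis to its successor instead of deleting it.
            self._diagnoses[-1].supplanted_by = new
        self._diagnoses.append(new)
        return new

    def current(self) -> Optional[Diagnosis]:
        return self._diagnoses[-1] if self._diagnoses else None

    def all_diagnoses(self) -> List[Diagnosis]:
        return list(self._diagnoses)
```

For example, a history might first record a diagnosis of one disease type based on an early clinical picture and later a diagnosis of a different type once further findings are available; both entries remain retrievable, with the first pointing to the second through `supplanted_by`.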


The following figure summarizes the view of disease and diagnosis presented here. As a result of an etiological process, a physical change occurs in the healthy individual, giving rise to a disorder whose realizations are initially undetectable (preclinical manifestations) and then become detectable as symptoms and signs (clinical manifestations). The latter constitute in their totality the phenotype for the given disease as instantiated in this specific patient. They can be observed through physical examination and laboratory testing of specimens derived from the patient, the results of which can be recorded in the medical record as a clinical picture. The clinical picture is interpreted by the physician in arriving at a diagnosis, which serves in turn as the foundation for the development of a patient management plan.

[Figure: amia-s2009-116f1.jpg]


For helpful discussions of this proposal we thank L. Cowell, W. Hogan, A. James, J. Loscalzo, B. Peters, N. Williams, and the attendees of the 2008 ‘Signs, Symptoms and Findings: First Steps Toward an Ontology of Clinical Phenotypes’ and ‘Infectious Disease Ontology’ workshops. This work is supported by the NIH – N01AI40076, N01AI40041, U54RR023468 and U54HG004928, and by the Oishei Foundation.


2. Loscalzo J, Kohane I, Barabasi A-L. Human disease classification in the postgenomic era: A complex systems approach to human pathobiology. Mol Syst Biol. 2007;3:124.
3. Butte AJ, Kohane IS. Creation and implications of a phenome-genome network. Nat Biotechnol. 2006;24(1):55–62.
4. Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, Leontis N, Rocca-Serra P, Ruttenberg A, Sansone SA, Scheuermann RH, Shah N, Whetzel PL, Lewis S. The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007;25(11):1251–1255.
5. Schulz S, Johansson I. Continua in biological systems. The Monist. 2007;90(4):499–522.
6. Williams N. The factory model of disease. The Monist. 2007;90(4):555–584.

Articles from Summit on Translational Bioinformatics are provided here courtesy of American Medical Informatics Association

