Appendix B: Glossary

Cycle Threshold (Ct, CT)

During an RT-PCR reaction, the relative proportions of template, products, and reagents change. At the beginning of the process, reagents are in excess, and template and products are at low concentrations, so that product renaturation does not compete with primer binding and amplification proceeds at a constant, exponential rate. After this initial phase, the process enters a linear phase of amplification, because product renaturation begins to compete with primer binding. In late reaction cycles, amplification reaches a plateau phase and no more product accumulates. To achieve accuracy and precision, quantitative data must be collected during the exponential phase, since amplification in this phase is highly reproducible. In real-time RT-PCR, this process is automated and measurements are made at each cycle. The ‘cycle threshold’ is the cycle of the RT-PCR reaction at which the fluorescence signal first exceeds a set threshold above background, which corresponds to the beginning of the detectable exponential phase of amplification.
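As a rough illustration of how a cycle threshold can be read from per-cycle measurements, the following Python sketch returns the (fractional) cycle at which a fluorescence curve first crosses a fixed threshold. The function name, the linear interpolation between cycles, the threshold value, and the readings are all assumptions made for illustration; instrument software generally applies its own baseline correction and threshold selection.

```python
def cycle_threshold(fluorescence, threshold):
    """Return the fractional cycle at which the signal first crosses `threshold`.

    `fluorescence[i]` is the reading taken after cycle i + 1.
    Returns None if the signal never rises through the threshold.
    """
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # `prev` was measured after cycle i, `curr` after cycle i + 1;
            # interpolate linearly between the two readings.
            return i + (threshold - prev) / (curr - prev)
    return None


# Hypothetical readings: flat background, then doubling at every cycle.
readings = [1.0, 1.0, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
print(cycle_threshold(readings, threshold=6.0))  # 5.5
```

A sample with more starting template crosses the threshold earlier, so a lower value indicates a higher initial amount of target.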

DNA Microarray

A DNA microarray (also commonly referred to as a “gene chip” or “DNA chip”) is a collection of microscopic DNA spots (called “features”), commonly representing single genes or transcripts, arrayed on a solid surface by covalent attachment to chemically suitable matrices, or directly synthesized on them. DNA microarrays use DNA as part of their detection system. Qualitative or quantitative measurements with DNA microarrays rely on the selective nature of DNA-DNA or DNA-RNA hybridization under high-stringency conditions and on fluorophore-based detection. DNA arrays are commonly used for gene expression profiling, i.e., monitoring expression levels of thousands of genes simultaneously, or for comparative genomic hybridization.

Gene Annotation

Gene annotation is the body of information that is associated with genes, as well as the process involved with the generation and maintenance of such information. Molecular biology and bioinformatics have faced the need for DNA annotation since the 1980s. Today a number of genomic and proteomic annotation projects have made this information publicly available.

Gene Expression

Gene expression refers to the transcription of the information encoded in a gene into an RNA transcript. Expressed transcripts include messenger RNAs (mRNA), which are translated into proteins, as well as other types of RNA, such as transfer RNA (tRNA), ribosomal RNA (rRNA), micro RNA (miRNA), and other non-coding RNA (ncRNA), that are not translated into protein. Gene expression is a highly specific process by which cells switch genes on and off in a timely manner, according to their state. The study of mRNA expression in a cell is an indirect way to study the corresponding proteins.

Gene Expression Classifier

The term classifier is derived from the field of machine learning. The goal of classification is to assign items that have similar feature values to the same group. Usually, in the context of gene expression analysis, a classifier is a composite algorithm that classifies patients by using gene expression measurements.
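To make the idea concrete, the sketch below implements one very simple kind of classifier, a nearest-centroid rule, in Python. It is only one of many possible approaches and is not taken from any specific published classifier; the function names, group labels, and expression values are hypothetical.

```python
import math


def train_centroids(samples, labels):
    """Average the expression values of the training samples in each group."""
    sums, counts = {}, {}
    for values, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(values)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], values)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(values, centroids):
    """Assign a sample to the group whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(values, centroids[label]))


# Hypothetical expression values for two genes in four training patients.
training = [[8.0, 2.0], [9.0, 1.0], [2.0, 7.0], [1.0, 9.0]]
groups = ["A", "A", "B", "B"]
centroids = train_centroids(training, groups)
print(classify([7.5, 2.5], centroids))  # A
```

Here each gene is a feature, and a new patient is assigned to the group whose average expression pattern it most resembles.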

Gene Expression Profiling

This term refers to any genomic technique that measures the fraction of genes expressed in a specific sample. This definition covers techniques that allow the assessment of more than one gene at a time, especially microarrays and real-time RT-PCR.

Gene expression profile: This is any set of genes for which the expression in a specific sample is known. A gene expression profile may account for a variable number of genes, and the corresponding expression values may be obtained by different techniques. Gene expression profiles can be associated, by various techniques, with phenotypes.

Gene expression pattern: This is an equivalent term currently in use to refer to “gene expression profile.”

Gene expression signature: This is an equivalent term currently in use to refer to a specific “gene expression profile,” usually associated with a specific phenotype.


Genome

In biology, the genome of an organism is its whole hereditary information and is encoded in DNA (for some viruses, RNA). This includes both the genes encoding proteins and the non-coding sequences of the DNA. The term, coined in 1920 by Hans Winkler, is a fusion of the words gene and chromosome. The study of the global properties of genomes is usually referred to as ‘genomics’, which distinguishes it from genetics, which generally studies the properties of single genes or groups of genes.

Laser Capture Microdissection

Laser Capture Microdissection (LCM) is a method for isolating pure populations of cells of interest from specific regions of tissue sections. In this procedure a special film is applied to tissue sections, which are examined under the microscope. When the cells of choice are identified, the operator can use a laser to dissect the cells and transfer them onto the film, leaving all unwanted cells behind in the tissue section. LCM does not alter or damage the morphology and chemistry of the collected sample, from which it is possible to prepare DNA, RNA, and/or protein. LCM can be performed on a variety of tissue samples, including blood smears, cytologic preparations, cell cultures, and frozen and paraffin-embedded archival tissue.


MIAME

MIAME (Minimum Information About a Microarray Experiment) is a standard for reporting microarray experiments. It is intended to specify all the information necessary to interpret the results of the experiment unambiguously and to reproduce the experiment. While the standard defines the content desired for reports, it does not specify the format in which this data should be presented. There are a number of file formats for representing this data, and both public and subscription-based repositories for such experiments.


Normalization

In an experimental context, normalization is used to standardize data so that real (biological) variation can be distinguished from variation due to the measurement process. In gene expression analysis (by DNA microarray or RT-PCR), normalization refers to the process of identifying and removing systematic effects, bringing the data from different samples onto a common scale. Several alternative methods and approaches to normalization exist for both RT-PCR and DNA microarrays.
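As a minimal sketch of what bringing samples onto a common scale can look like, the Python example below applies median centering to log-scale expression values. Median centering is only one of the alternative methods mentioned above and is chosen here for brevity; the function name and the numbers are hypothetical.

```python
from statistics import median


def median_center(log_values):
    """Subtract the sample median so that every sample is centered at zero."""
    m = median(log_values)
    return [v - m for v in log_values]


# Hypothetical log-scale measurements of the same three genes in two samples.
# Sample B reads one unit higher throughout, a purely systematic offset.
sample_a = [5.0, 7.0, 9.0]
sample_b = [6.0, 8.0, 10.0]
print(median_center(sample_a))  # [-2.0, 0.0, 2.0]
print(median_center(sample_b))  # [-2.0, 0.0, 2.0]
```

After centering, the systematic offset between the two samples disappears, and any remaining differences would be candidates for real biological variation.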


Oligonucleotide

Oligonucleotides are short sequences of nucleotides (RNA or DNA), typically with twenty or fewer bases, although automated synthesizers allow the synthesis of oligonucleotides of up to 200 bases. The length of an oligonucleotide is usually denoted by the suffix ‘mer’: for example, a fragment of 25 bases would be called a 25-mer. Oligonucleotides are used as probes to detect complementary DNA or RNA molecules. Specific DNA oligonucleotides are used in PCR, and in this instance they are referred to as “primers,” since they provide a starting point from which the DNA polymerase extends the primers themselves by adding nucleotides to make a copy of the target sequence. Oligonucleotides may also be referred to as “oligos.”


Platform

In the context of gene expression profiling analysis, the term “platform” is often used to refer to the technology, instruments, and protocols used to measure gene expression. In this sense real-time RT-PCR, cDNA microarrays, and oligonucleotide microarrays represent different platforms.

Polymerase Chain Reaction (PCR)

PCR is a molecular biology technique for isolating and exponentially amplifying a DNA sequence of interest in vitro via enzymatic replication. This technique has been extensively modified to perform a wide array of tasks, and it is now a common tool in medical and biological research. PCR is used to obtain the sequence of genes, diagnose hereditary diseases, identify genetic fingerprints (forensic medicine), detect infectious diseases, and create transgenic organisms. Coupled with “reverse transcription,” it is used to amplify RNA molecules.
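The exponential character of the amplification can be illustrated with an idealized model in which the number of copies after n cycles is N0 × (1 + E)^n, where N0 is the starting number of copies and E is the per-cycle efficiency (1.0 means perfect doubling). The Python sketch below evaluates that formula; the function and parameter names are hypothetical, and the model ignores the later linear and plateau phases of a real reaction.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized number of copies after `cycles` rounds of amplification.

    `efficiency` is the fraction of molecules copied in each cycle;
    1.0 corresponds to perfect doubling.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting molecules, thirty cycles, perfect doubling.
print(copies_after(10, 30))  # 10737418240.0
```

Even a small drop in efficiency compounds over many cycles, which is one reason quantitative measurements rely on the reproducible exponential phase.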


Primer

A primer is a nucleic acid strand or a related molecule that serves as a starting point for DNA replication. A primer is required because most DNA polymerases cannot begin synthesizing a new DNA strand from scratch, but can only add to an existing strand of nucleotides. In most natural DNA replication, the ultimate primer for DNA synthesis is a short strand of RNA. This RNA is produced by “primase,” and is later removed and replaced with DNA by a DNA polymerase. Many laboratory techniques of biochemistry and molecular biology that involve DNA polymerases, such as DNA sequencing and polymerase chain reaction, require primers. The primers used for these techniques are usually short, chemically synthesized DNA molecules with a length of about twenty bases.


Probe

In molecular biology, a hybridization probe is a fragment of DNA of variable length that is used to detect the presence of nucleotide sequences complementary to the sequence of the probe. The complementary sequences are referred to as “targets.” The hybridization probe is usually labeled radioactively, or with immunological or fluorescent markers. The labeled probe is then denatured (by heating) into single DNA strands and hybridized to target DNA (Southern blotting) or RNA (Northern blotting) immobilized on a membrane or in situ. In a DNA microarray the hybridization scheme is reversed: the probes are attached to a solid surface, while the labeled targets are in the reaction solution. Similarly, in real-time RT-PCR, probes are fragments of DNA that fluoresce when hybridized to the complementary molecule under investigation.
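Complementarity between probe and target can be sketched in a few lines of Python. The example below searches a DNA target for stretches that are perfectly complementary to a probe, with both sequences written 5' to 3'. It is a simplification under stated assumptions: real hybridization tolerates some mismatches depending on stringency, and the function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_binding_sites(probe, target):
    """Return 0-based start positions in `target` that pair perfectly with `probe`."""
    site = reverse_complement(probe)
    return [i for i in range(len(target) - len(site) + 1)
            if target.startswith(site, i)]


# Hypothetical probe and target sequences.
print(reverse_complement("GCAT"))               # ATGC
print(probe_binding_sites("GCAT", "TTATGCGG"))  # [2]
```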


Proteome

The term proteome was coined by Mark Wilkins in 1994 as a fusion of the words protein and genome. This term refers to the entire set of proteins expressed by a genome, cell, tissue, or organism at a given time under defined conditions. The proteome is larger and more complex than the genome, especially in eukaryotes, in the sense that there are more proteins than genes. This is due to alternative splicing of genes and post-translational modifications such as glycosylation or phosphorylation.

Real Time Reverse Transcriptase Polymerase Chain Reaction (RT-PCR)

Real-time RT-PCR is a molecular biology technique that allows the amplification and the quantification in real time of defined RNA molecules from specific specimens. This technology has been used for several years in research and clinical settings to measure RNA molecules. In the first step, DNA copies of the investigated RNA molecules present in the template are obtained by a reaction called reverse transcription. DNA amplification is then obtained using PCR, while the quantification of the accumulating DNA product is accomplished by the use of specific fluorescent reagents. The quantification of the target RNA molecule is based on the analysis of the accumulation curve of the complementary DNA, as measured by the fluorescence detected at each cycle of the reaction.

Reverse Transcription

In biochemistry, reverse transcription is the enzymatic reaction catalyzed by the RNA-dependent DNA polymerase. This enzyme, also known as reverse transcriptase, is a DNA polymerase that copies single-stranded RNA into DNA. This process is the reverse of normal transcription, which involves the synthesis of RNA from DNA.
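At the sequence level, the product of reverse transcription is a DNA strand that is complementary and antiparallel to the RNA template. The short Python sketch below derives that first-strand cDNA sequence from an RNA sequence; it models only base pairing, not the enzyme, and the function name and example sequence are hypothetical.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA (5' to 3') for an RNA written 5' to 3'."""
    # The cDNA is antiparallel to its template, so walk the RNA backwards
    # and pair each base with its DNA complement.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


print(reverse_transcribe("AUGGCC"))  # GGCCAT
```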


Ribonuclease

This type of enzyme, commonly abbreviated as RNase, is a nuclease that catalyzes the hydrolysis of RNA molecules into smaller components. Ribonucleases are divided into endonucleases (which cut RNA molecules internally) and exonucleases (which degrade RNA from the ends of the molecule).


Target

In gene expression profiling analysis, a target is the RNA transcript that is under investigation using its complementary counterpart, the probe.

Tissue Microarrays

Tissue microarrays (TMA) consist of paraffin blocks in which up to 1000 separate tissue cores can be embedded, assembled in array fashion to allow simultaneous histological analysis.


Transcription

Transcription is the process by which DNA sequences are copied into complementary RNA molecules by the enzyme RNA polymerase. This reaction represents the transfer of genetic information from DNA into RNA, that is, from “storage” to “function.” The RNA molecule produced by transcribing a DNA sequence is called a “transcript.”


Transcriptome

The transcriptome is the set of all RNA molecules, or “transcripts,” produced in one cell or a population of cells. The term can be applied to the total set of transcripts in a given organism, or to the specific subset of transcripts present in a particular cell type. Unlike the genome, which is roughly fixed for a given cell line (excluding mutations), the transcriptome can vary from cell to cell and with external environmental conditions. Because it includes all RNA transcripts in the cell, the transcriptome reflects the genes that are being actively expressed at any given time. The study of the transcriptome examines the expression level of RNAs in a given cell population, often using high-throughput techniques based on DNA microarray technology or RT-PCR.