Chapter 1. Historical Background and Overview

Varki A, Sharon N.

This chapter provides historical background to the field of glycobiology and an overview of this book. General terms found throughout the volume are also considered. The common monosaccharide units of glycoconjugates are mentioned and a uniform symbol nomenclature used for structural depictions throughout the book is presented. The major glycan classes to be discussed in the book are described, and an overview of the general pathways for their biosynthesis is provided. Topological issues relevant to biosynthesis and functions of glycoconjugates are also briefly considered, and the growing role of these molecules in medicine and biotechnology is briefly surveyed.

WHAT IS GLYCOBIOLOGY?

The central paradigm driving research in molecular biology has been that biological information flows from DNA to RNA to protein. The power of this concept lies in its template-based precision, the ability to manipulate one class of molecules based on knowledge of another, and the patterns of sequence homology and relatedness that predict function and reveal evolutionary relationships. A variety of additional roles for RNA have also recently emerged. With the sequencing of the genomes of humans and many other commonly studied organisms, even more spectacular gains in understanding the biology of nucleic acids and proteins are anticipated.

However, the tendency is to assume that a conventional molecular biology approach encompassing just these molecules will explain the makeup of cells, tissues, organs, physiological systems, and intact organisms. In fact, making a cell requires two other major classes of molecules: lipids and carbohydrates. These molecules can serve as intermediates in generating energy and as signaling effectors, recognition markers, and structural components. Taken together with the fact that they encompass some of the major posttranslational modifications of proteins themselves, lipids and carbohydrates help to explain how the relatively small number of genes in the typical genome can generate the enormous biological complexities inherent in the development, growth, and functioning of intact organisms.

The biological roles of carbohydrates are particularly important in the assembly of complex multicellular organs and organisms, which requires interactions between cells and the surrounding matrix. All cells and numerous macromolecules in nature carry an array of covalently attached sugars (monosaccharides) or sugar chains (oligosaccharides), which are generically referred to as “glycans” in this book. Sometimes, these glycans can also be freestanding entities. Because many glycans are on the outer surface of cellular and secreted macromolecules, they are in a position to modulate or mediate a wide variety of events in cell–cell, cell–matrix, and cell–molecule interactions critical to the development and function of a complex multicellular organism. They can also act as mediators in the interactions between different organisms (e.g., between host and a parasite or a symbiont). In addition, simple, rapidly turning over, protein-bound glycans are abundant within the nucleus and cytoplasm, where they can serve as regulatory switches. A more complete paradigm of molecular biology must therefore include glycans, often in covalent combination with other macromolecules, that is, glycoconjugates, such as glycoproteins and glycolipids.

The chemistry and metabolism of carbohydrates were prominent matters of interest in the first part of the 20th century. Although these topics engendered much scientific attention, carbohydrates were primarily considered as a source of energy or as structural materials and were believed to lack other biological activities. Furthermore, during the initial phase of the molecular biology revolution of the 1960s and 1970s, studies of glycans lagged far behind those of other major classes of molecules. This was in large part due to their inherent structural complexity, the great difficulty in determining their sequences, and the fact that their biosynthesis could not be directly predicted from a DNA template. The development of many new technologies for exploring the structures and functions of glycans has since opened a new frontier of molecular biology that has been called “glycobiology”—a word first coined in the late 1980s to recognize the coming together of the traditional disciplines of carbohydrate chemistry and biochemistry with a modern understanding of the cell and molecular biology of glycans and, in particular, their conjugates with proteins and lipids.

From a strictly technical point of view, glycobiology can almost be viewed as an anomaly in the history of the biological sciences. If the development of methodologies for glycan analysis had kept pace with those for other macromolecules in the 1960s and 1970s, glycans would have been an integral part of the initial phase of the molecular and cell biology revolution, and there might have been no need to single them out later for study as a distinct discipline. Regardless, the term glycobiology has gained wide acceptance, with a number of textbooks, a major biomedical journal, a growing scientific society, and many research conferences now bearing this name. Defined in the broadest sense, glycobiology is the study of the structure, biosynthesis, biology, and evolution of saccharides (sugar chains or glycans) that are widely distributed in nature, and the proteins that recognize them. (The Oxford English Dictionary definition is “the branch of science concerned with the role of sugars in biological systems.”) Glycobiology is one of the more rapidly growing fields in the natural sciences, with broad relevance to many areas of basic research, biomedicine, and biotechnology. The field includes the chemistry of carbohydrates, the enzymology of glycan formation and degradation, the recognition of glycans by specific proteins (lectins and glycosaminoglycan-binding proteins), glycan roles in complex biological systems, and their analysis or manipulation by a variety of techniques. Research in glycobiology thus requires a foundation not only in the nomenclature, biosynthesis, structure, chemical synthesis, and functions of glycans, but also in the general disciplines of molecular genetics, protein chemistry, cell biology, developmental biology, physiology, and medicine. This volume provides an overview of the field, with some emphasis on the glycans of animal systems. It is assumed that the reader has a basic background in advanced undergraduate-level chemistry, biochemistry, and cell biology.

HISTORICAL ORIGINS OF GLYCOBIOLOGY

As mentioned above, glycobiology had its early origins in the fields of carbohydrate chemistry and biochemistry, and has only recently emerged as a major aspect of molecular and cellular biology and physiology. Some of the major investigators and important discoveries that influenced the development of the field are presented in Figure 1.1 and Table 1.1. As with any such attempt, a comprehensive list is impossible, but details regarding some of these discoveries can be found in other chapters of this volume. A summary of the general principles gained from this research is presented in Table 1.2.

FIGURE 1.1. Nobel laureates in fields related to the early history of glycobiology. Listed are the Laureates and their original Nobel citations: Hermann Emil Fischer (Chemistry, 1902), “in recognition of the extraordinary services he has rendered by his work …

TABLE 1.1. Important discoveries in the history of glycobiology

TABLE 1.2. “Universal” principles of glycobiology

MONOSACCHARIDES ARE THE BASIC STRUCTURAL UNITS OF GLYCANS

Carbohydrates are defined as polyhydroxyaldehydes, polyhydroxyketones and their simple derivatives, or larger compounds that can be hydrolyzed into such units. A monosaccharide is a carbohydrate that cannot be hydrolyzed into a simpler form. It has a potential carbonyl group at the end of the carbon chain (an aldehyde group) or at an inner carbon (a ketone group). These two types of monosaccharides are therefore named aldoses and ketoses, respectively (for examples, see below and for more details, see Chapter 2). Free monosaccharides can exist in open-chain or ring forms (Figure 1.2). Ring forms of the monosaccharides are the rule in oligosaccharides, which are linear or branched chains of monosaccharides attached to one another via glycosidic linkages (the term “polysaccharide” is typically reserved for large glycans composed of repeating oligosaccharide motifs). The generic term “glycan” is used throughout this book to refer to any form of mono-, oligo-, or polysaccharide, either free or covalently attached to another molecule.

FIGURE 1.2. Open-chain and ring forms of glucose. Changes in the orientation of hydroxyl groups around specific carbon atoms generate new molecules that have a distinct biology and biochemistry (e.g., galactose is the C-4 epimer of glucose).

The ring form of a monosaccharide generates a chiral anomeric center at C-1 for aldo sugars or at C-2 for keto sugars (for details, see Chapter 2). A glycosidic linkage involves the attachment of a monosaccharide to another residue, typically via the hydroxyl group of this anomeric center, generating α linkages or β linkages that are defined based on the relationship of the glycosidic oxygen to the anomeric carbon and ring (see Chapter 2). It is important to realize that these two linkage types confer very different structural properties and biological functions upon sequences that are otherwise identical in composition, as classically illustrated by the marked differences between starch and cellulose (both homopolymers of glucose), the former largely α1-4 linked and the latter β1-4 linked throughout. A glycoconjugate is a compound in which one or more monosaccharide or oligosaccharide units (the glycone) are covalently linked to a noncarbohydrate moiety (the aglycone). An oligosaccharide that is not attached to an aglycone possesses the reducing power of the aldehyde or ketone in its terminal monosaccharide component. This end of a sugar chain is therefore often called the reducing terminus or reducing end, terms that tend to be used even when the sugar chain is attached to an aglycone and has thus lost its reducing power. Correspondingly, the outer end of the chain tends to be called the nonreducing end (note the analogy to the 5′ and 3′ ends of nucleotide chains or the amino and carboxyl termini of polypeptides).

GLYCANS CAN CONSTITUTE A MAJOR PORTION OF THE MASS OF A GLYCOCONJUGATE

In naturally occurring glycoconjugates, the portion of the molecule comprising the glycans can vary greatly in its contribution to the overall size, from being very minor in amount to being the dominant component or even almost the exclusive one. In many cases, the glycans comprise a substantial portion of the mass of glycoconjugates (for a typical example, see Figure 1.3). For this reason, the surfaces of all types of cells in nature (which are heavily decorated with different kinds of glycoconjugates) are effectively covered with a dense array of sugars, giving rise to the so-called glycocalyx. This cell-surface structure was first observed many years ago by electron microscopists as an anionic layer external to the cell surface membrane, which could be decorated with polycationic reagents such as cationized ferritin (for a historical example, see Figure 1.4).

FIGURE 1.3. Schematic representation of the Thy-1 glycoprotein including the three N-glycans (blue) and a glycosylphosphatidylinositol (GPI-glycan, green) lipid anchor whose acyl chains (yellow) would normally be embedded in the membrane bilayer. Note that the polypeptide …

FIGURE 1.4. Historical electron micrograph of endothelial cells from a blood capillary in the diaphragm muscle of a rat, showing the lumenal cell membrane of the cells (facing the blood) decorated with particles of cationized ferritin (arrowheads). These particles …

MONOSACCHARIDES CAN BE LINKED TOGETHER IN MANY MORE WAYS THAN AMINO ACIDS OR NUCLEOTIDES

Nucleic acids and proteins are linear polymers that can each contain only one basic type of linkage between monomers. In contrast, each monosaccharide can theoretically generate either an α or a β linkage to any one of several positions on another monosaccharide in a chain or to another type of molecule. Thus, it has been pointed out that although three different nucleotides or amino acids can only generate six trimers, three different hexoses could produce (depending on which of their forms are considered) anywhere from 1,056 to 27,648 unique trisaccharides. This difference in complexity becomes even greater as the number of monosaccharide units in the glycan increases. For example, a hexasaccharide with six different hexoses could have more than 1 trillion possible combinations. Thus, an almost unimaginable number of possible saccharide units could be theoretically present in biological systems. Fortunately for the student of glycobiology, naturally occurring biological macromolecules are so far known to contain relatively few of the possible monosaccharide units, in a limited number of combinations. However, the great majority of glycans in nature have yet to be discovered and structurally defined.
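
The lower of these figures can be reproduced with a short enumeration. The Python sketch below is a counting model under stated assumptions, and not necessarily the derivation behind the published numbers: every residue is a hexopyranose offering four acceptor hydroxyls (positions 2, 3, 4, and 6), each glycosidic linkage is α or β, the chain may be linear or branched, and the free reducing end is counted in both anomeric forms. The function names and the way structures are encoded are illustrative only.

```python
from itertools import permutations, product

ANOMERS = ("alpha", "beta")
# Assumption: hexopyranose acceptors offer free hydroxyls at C-2, C-3, C-4, and C-6.
ACCEPTOR_POSITIONS = (2, 3, 4, 6)


def linear_trimers(monomers):
    """Trimers of a template-type polymer: one linkage type, each monomer used once."""
    return set(permutations(monomers, 3))


def trisaccharides(hexoses):
    """Enumerate trisaccharides in which three different hexoses each appear once."""
    linkages = list(product(ANOMERS, ACCEPTOR_POSITIONS))  # 8 ways to attach a residue
    structures = set()
    for reducing_anomer in ANOMERS:  # free reducing end counted as alpha and as beta
        # Linear chains: outer residue -> middle residue -> reducing-end residue.
        for reducing, middle, outer in permutations(hexoses, 3):
            for inner_link, outer_link in product(linkages, repeat=2):
                structures.add(
                    ("linear", reducing_anomer, reducing,
                     (middle, inner_link), (outer, outer_link))
                )
        # Branched: both other residues sit on different hydroxyls of one residue.
        for reducing in hexoses:
            first, second = [h for h in hexoses if h != reducing]
            for link_a, link_b in product(linkages, repeat=2):
                if link_a[1] == link_b[1]:
                    continue  # one hydroxyl cannot carry two residues
                structures.add(
                    ("branched", reducing_anomer, reducing,
                     frozenset({(first, link_a), (second, link_b)}))
                )
    return structures


print(len(linear_trimers(("A", "B", "C"))))        # 6
print(len(trisaccharides(("Glc", "Gal", "Man"))))  # 1056
```

Under this model there are 384 linear and 144 branched arrangements, and the total is doubled by the two anomeric forms of the reducing end. Admitting further forms of each residue enlarges the count, which is consistent with the text giving a range rather than a single number.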

COMMON MONOSACCHARIDE UNITS OF GLYCOCONJUGATES

Although several hundred distinct monosaccharides are known to occur in nature, only a small number of these are commonly found in animal glycans. They are listed below, along with their standard abbreviations (for details regarding their structures, see Chapter 2).

  • Pentoses: Five-carbon neutral sugars, e.g., D-xylose (Xyl).
  • Hexoses: Six-carbon neutral sugars, e.g., D-glucose (Glc), D-galactose (Gal), and D-mannose (Man).
  • Hexosamines: Hexoses with an amino group at the 2-position, which can be either free or, more commonly, N-acetylated, e.g., N-acetyl-D-glucosamine (GlcNAc) and N-acetyl-D-galactosamine (GalNAc).
  • Deoxyhexoses: Six-carbon neutral sugars without the hydroxyl group at the 6-position, e.g., L-fucose (Fuc).
  • Uronic acids: Hexoses with a negatively charged carboxylate at the 6-position, e.g., D-glucuronic acid (GlcA) and L-iduronic acid (IdoA).
  • Sialic acids: Family of nine-carbon acidic sugars (generic abbreviation is Sia), of which the most common is N-acetylneuraminic acid (Neu5Ac, also sometimes called NeuAc or historically, NANA) (for details, see Chapter 14).

For the sake of simplicity, the symbol D- is omitted from the full names of monosaccharides from here on, and only the symbol L- is used where appropriate (e.g., L-fucose or L-iduronic acid).

This limited set of monosaccharides dominates the glycobiology of more recently evolved (so-called “higher”) animals, but several others have been found in “lower” animals (e.g., tyvelose; see Chapters 23 and 24), bacteria (e.g., keto-deoxyoctulosonic acid, rhamnose, L-arabinose, and muramic acid; see Chapter 20), and plants (e.g., arabinose, apiose, and galacturonic acid; see Chapter 22). A variety of modifications of glycans enhance their diversity in nature and often serve to mediate specific biological functions. Thus, the hydroxyl groups of different monosaccharides can be subject to phosphorylation, sulfation, methylation, O-acetylation, or fatty acylation. Although amino groups are commonly N-acetylated, they can be N-sulfated or remain unsubstituted. Carboxyl groups are occasionally subject to lactonization to nearby hydroxyl groups or even lactamization to nearby amino groups.

Details regarding the structural depiction of monosaccharides, linkages, and oligosaccharides are discussed in Chapter 2. Many figures in this volume use a simplified style of depiction of sugar chains (see Figure 1.5). This figure (also reproduced on the inside front cover) uses a monosaccharide symbol set modified from the first edition of Essentials of Glycobiology, which has also been adopted by several other groups interested in presenting databases of structures (e.g., the Consortium for Functional Glycomics).

FIGURE 1.5. Recommended symbols and conventions for drawing glycan structures. (Top panel) The monosaccharide symbol set from the first edition of Essentials of Glycobiology is modified to avoid using the same shape or color, but with different orientation to represent …

MAJOR CLASSES OF GLYCOCONJUGATES AND GLYCANS

The common classes of glycans found in or on eukaryotic cells are primarily defined according to the nature of the linkage to the aglycone (protein or lipid) (see Figure 1.6 and Figure 1.7). A glycoprotein is a glycoconjugate in which a protein carries one or more glycans covalently attached to a polypeptide backbone, usually via N or O linkages. An N-glycan (N-linked oligosaccharide, N-(Asn)-linked oligosaccharide) is a sugar chain covalently linked to an asparagine residue of a polypeptide chain, commonly involving a GlcNAc residue and the consensus peptide sequence: Asn-X-Ser/Thr. N-Glycans share a common pentasaccharide core region and can be generally divided into three main classes: oligomannose (or high-mannose) type, complex type, and hybrid type (see Chapter 8). An O-glycan (O-linked oligosaccharide) is frequently linked to the polypeptide via N-acetylgalactosamine (GalNAc) to a hydroxyl group of a serine or threonine residue and can be extended into a variety of different structural core classes (see Chapter 9). A mucin is a large glycoprotein that carries many O-glycans that are clustered (closely spaced). Several other types of O-glycans also exist (e.g., attached to proteins via O-linked mannose). A proteoglycan is a glycoconjugate that has one or more glycosaminoglycan (GAG) chains (see definition below) attached to a “core protein” through a typical core region ending in a xylose residue that is linked to the hydroxyl group of a serine residue. The distinction between a proteoglycan and a glycoprotein is otherwise arbitrary, because some proteoglycan polypeptides can carry both glycosaminoglycan chains and different O- and N-glycans (see Chapter 16). Figure 1.7 provides a listing of known glycan-protein linkages in nature.
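
The consensus sequence lends itself to a simple scan of a polypeptide sequence. The following is a minimal Python sketch: the function name and the example peptide are hypothetical, and the exclusion of proline at the X position is a widely cited refinement that is not stated above. A match marks only a potential site; whether a glycan is actually attached there depends on the protein and on the cell that makes it.

```python
import re

# Asn-X-Ser/Thr, with X taken as any residue except proline (assumption noted above).
# The zero-width lookahead lets overlapping sequons (e.g., N-N-T-S) each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein_sequence):
    """Return 1-based positions of Asn residues in potential N-glycosylation sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein_sequence.upper())]


# Hypothetical peptide in one-letter amino acid code.
print(find_sequons("ANGTNPSANNTS"))  # [2, 9, 10]
```

In the example, the Asn at position 5 is skipped because it is followed by proline, and the adjacent Asn residues at positions 9 and 10 are both reported because each begins its own Asn-X-Ser/Thr triplet.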

FIGURE 1.6. Common classes of animal glycans. (Modified from Varki A. 1997. FASEB J. 11: 248–255; Fuster M. and Esko J.D. 2005. Nat. Rev. Can. 5: 526–542.)

FIGURE 1.7. Glycan-protein linkages reported in nature. (Updated and redrawn, with permission of Oxford University Press, from Spiro R.G. 2002. Glycobiology. 12: 43R–56R.) Diagrammatic representation of the five distinct types of sugar-peptide bonds that …

A glycosylphosphatidylinositol (GPI) anchor is a glycan bridge between phosphatidylinositol and a phosphoethanolamine that is in amide linkage to the carboxyl terminus of a protein. This structure typically constitutes the only anchor to the lipid bilayer membrane for such proteins (see Chapter 11). A glycosphingolipid (often called a glycolipid) consists of a glycan usually attached via glucose or galactose to the terminal primary hydroxyl group of the lipid moiety ceramide, which is composed of a long chain base (sphingosine) and a fatty acid (see Chapter 10). Glycolipids can be neutral or anionic. A ganglioside is an anionic glycolipid containing one or more residues of sialic acid. It should be noted that these represent only the most common classes of glycans reported in eukaryotic cells. There are several other less common types found on one or the other side of the cell membrane in animal cells (see Chapters 12 and 17).

Although different glycan classes have unique core regions by which they are distinguished, certain outer structural sequences are often shared among different classes of glycans. For example, N- and O-glycans and glycosphingolipids frequently carry the subterminal disaccharide Galβ1-4GlcNAcβ1- (N-acetyllactosamine or LacNAc) or, less commonly, GalNAcβ1-4GlcNAcβ1- (LacdiNAc) units. The LacNAc units can sometimes be repeated, giving extended poly-N-acetyllactosamines (sometimes incorrectly called “poly-lactosamines”). Less commonly, the LacdiNAc motif can also be repeated (termed polyLacdiNAc). Outer LacNAc units can be modified by fucosylation or by branching and are typically capped by sialic acids or, less commonly, by sulfate, Fuc, α-Gal, β-GalNAc, or β-GlcA units (see Chapters 13 and 14). In contrast, glycosaminoglycans are linear copolymers of acidic disaccharide repeating units, each containing a hexosamine (GlcN or GalN) and a hexose (Gal) or hexuronic acid (GlcA or IdoA) (see Chapter 16). The type of disaccharide unit defines the glycosaminoglycan as chondroitin or dermatan sulfate (GalNAcβ1-4GlcA/IdoA), heparin or heparan sulfate (GlcNAcα1-4GlcA/IdoA), or keratan sulfate (Galβ1-4GlcNAc). Keratan sulfate is actually a 6-O-sulfated form of poly-N-acetyllactosamine attached to an N- or O-glycan core, rather than to a typical Xyl-Ser-containing proteoglycan linkage region. Another type of glycosaminoglycan, hyaluronan (a polymer of GlcNAcβ1-4GlcA), appears to exist primarily as a free sugar chain unattached to any aglycone (Chapter 15). The glycosaminoglycans (except for hyaluronan) also typically have sulfate esters substituting either amino or hydroxyl groups (i.e., N- or O-sulfate groups). Another anionic polysaccharide that can be extended from LacNAc units is polysialic acid, a homopolymer of sialic acid that is selectively expressed only on a few proteins in vertebrates. Polysialic acids are also found as the capsular polysaccharides of certain pathogenic bacteria (Chapter 14).
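
The linkage shorthand used above (for example, Galβ1-4GlcNAc) can be read mechanically. The following minimal Python sketch splits such a string into residues and linkages, then looks up the glycosaminoglycan families named in this section by their repeating disaccharide. The parser, its names, and the lookup table are illustrative assumptions that cover only the simple linear notation used here; this is not a general glycan-format reader.

```python
import re
from typing import NamedTuple, Optional


class Residue(NamedTuple):
    sugar: str
    anomer: Optional[str]             # "α" or "β"; None if no linkage is written
    anomeric_carbon: Optional[int]
    acceptor_position: Optional[int]  # None for an open linkage such as "β1-"


# One residue name, then either a linkage (anomer, carbon, "-", optional position)
# or the end of the string.
RESIDUE = re.compile(r"([A-Z][A-Za-z0-9]*?)(?:([αβ])(\d)-(\d)?|$)")


def parse_glycan(text):
    """Split shorthand such as 'Galβ1-4GlcNAcβ1-' into residues, nonreducing end first."""
    residues, pos = [], 0
    while pos < len(text):
        match = RESIDUE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text!r} at offset {pos}")
        sugar, anomer, carbon, position = match.groups()
        residues.append(Residue(sugar, anomer,
                                int(carbon) if carbon else None,
                                int(position) if position else None))
        pos = match.end()
    return residues


# Repeating disaccharides as listed in the text, with GlcA/IdoA written out.
GAG_FAMILIES = {
    ("GalNAc", "β", 4, "GlcA"): "chondroitin or dermatan sulfate",
    ("GalNAc", "β", 4, "IdoA"): "chondroitin or dermatan sulfate",
    ("GlcNAc", "α", 4, "GlcA"): "heparin or heparan sulfate",
    ("GlcNAc", "α", 4, "IdoA"): "heparin or heparan sulfate",
    ("Gal", "β", 4, "GlcNAc"): "keratan sulfate",
    ("GlcNAc", "β", 4, "GlcA"): "hyaluronan",
}


def gag_family(disaccharide):
    """Name the glycosaminoglycan family for a two-residue repeat unit, if listed."""
    donor, acceptor = parse_glycan(disaccharide)
    key = (donor.sugar, donor.anomer, donor.acceptor_position, acceptor.sugar)
    return GAG_FAMILIES.get(key, "not a repeat unit listed here")


print(parse_glycan("Galβ1-4GlcNAcβ1-"))
print(gag_family("GlcNAcα1-4IdoA"))  # heparin or heparan sulfate
```

The lookup makes one point from the text explicit: the heparin/heparan sulfate and hyaluronan repeat units differ only in the anomeric configuration (α versus β) of the same GlcNAc1-4GlcA pair.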

GLYCAN STRUCTURES ARE NOT ENCODED DIRECTLY IN THE GENOME

It is important to reemphasize that unlike protein sequences, which are primary gene products, glycan chain structures are not encoded directly in the genome and are secondary gene products. A few percent of known genes in the human genome are dedicated to producing the enzymes and transporters responsible for the biosynthesis and assembly of glycan chains (see Chapter 7), typically as posttranslational modifications of proteins or by glycosylation of core lipids. The glycan chains themselves represent numerous combinatorial possibilities, generated by a variety of competing and sequentially acting glycosidases and glycosyltransferases (see Chapter 5) and the subcompartmentalized “assembly-line” mechanisms of glycan biosynthesis in the Golgi apparatus (see Chapter 3). Thus, even with full knowledge of the expression levels of all relevant gene products, we do not understand enough about the structures and pathways to predict the precise structures of glycans elaborated by a given cell type. Furthermore, small changes in environmental cues can cause dramatic changes in glycans produced by a given cell. It is this variable and dynamic nature of glycosylation that makes it a powerful way to generate biological diversity and complexity. Of course, it also makes glycans more difficult to study than nucleic acids and proteins.

SITE-SPECIFIC STRUCTURAL DIVERSITY IN PROTEIN GLYCOSYLATION

One of the most fascinating and yet frustrating aspects of protein glycosylation is the phenomenon of microheterogeneity. This term indicates that at any given glycan attachment site on a given protein synthesized by a particular cell type, a range of variations can be found in the structures of the attached glycan chain. The extent of this microheterogeneity can vary considerably from one glycosylation site to another, from glycoprotein to glycoprotein, and from cell type to cell type. Thus, a given protein originally encoded by a single gene can exist in numerous “glycoforms,” each effectively a distinct molecular species. Mechanistically, microheterogeneity might be explained by the rapidity with which multiple, sequential, partially competitive glycosylation and deglycosylation reactions must take place in the endoplasmic reticulum (ER) and Golgi apparatus, through which a newly synthesized glycoprotein passes (see Chapter 3). An alternate possibility is that each individual cell or cell type is in fact exquisitely specific in the details of the glycosylation that it produces, but that intercellular variations result in the observed microheterogeneity of samples from natural multicellular sources. Whatever the origin of microheterogeneity, it accounts for the anomalous behavior of glycoproteins in various analytical/separation techniques (such as sodium dodecyl sulfate–polyacrylamide gel electrophoresis [SDS-PAGE], in which multiple or diffuse bands are observed) and makes complete structural analysis of a glycoprotein a difficult task. From a functional point of view, the biological significance of microheterogeneity remains unclear. It is possible that this is a type of diversity generator, intended for diversifying endogenous recognition functions and/or for evading microbes and parasites, each of which can bind with high specificity only to certain glycan structures (see Chapters 34 and 39).
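
The combinatorial consequence of microheterogeneity can be made concrete with a small calculation. In the Python sketch below, which uses invented site names and glycan labels purely for illustration, each glycosylation site carries one of several possible glycans, so the number of distinct glycoforms is the product of the number of variants at each site. Treating the sites as independent is itself a simplifying assumption.

```python
from itertools import product
from math import prod

# Hypothetical glycoprotein with three glycosylation sites (all labels are invented).
site_variants = {
    "site_1": ["oligomannose", "hybrid", "complex"],
    "site_2": ["complex", "complex+Sia"],
    "site_3": ["oligomannose", "complex", "complex+Fuc", "complex+Sia"],
}


def count_glycoforms(variants_by_site):
    """Number of glycoforms if every site varies independently of the others."""
    return prod(len(variants) for variants in variants_by_site.values())


def enumerate_glycoforms(variants_by_site):
    """Yield each glycoform as a mapping from site to the glycan it carries."""
    sites = list(variants_by_site)
    for combination in product(*(variants_by_site[site] for site in sites)):
        yield dict(zip(sites, combination))


print(count_glycoforms(site_variants))  # 24
```

Even this small example, with two to four variants per site, yields 24 molecular species from a single polypeptide, which helps explain the multiple or diffuse bands seen on SDS-PAGE.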

CELL BIOLOGY OF GLYCOSYLATION

Most well-characterized pathways for the biosynthesis of major classes of glycans are confined within the ER and Golgi compartments (see Chapter 3). Thus, for example, newly synthesized proteins originating from the ER are either cotranslationally or posttranslationally modified with sugar chains at various stages in their itinerary toward their final destinations. The glycosylation reactions usually use activated forms of monosaccharides (nucleotide sugars; see Chapter 4) as donors for reactions that are catalyzed by glycosyltransferases (for details about their biochemistry, molecular genetics, and cell biology, see Chapters 3, 5, and 7). In almost all cases, these nucleotide donors are synthesized within the cytosolic or nuclear compartment from monosaccharide precursors of endogenous or exogenous origin (see Chapter 4). To be available to perform the glycosylation reactions, the donors must be actively transported across a membrane bilayer into the lumen of the ER and Golgi compartments. Much effort has gone into understanding the mechanisms of glycosylation within the ER and the Golgi apparatus, and it is clear that a variety of factors determine the final outcome of glycosylation reactions. Some bulky sugar chains are made on the cytoplasmic face of these intracellular organelles and are flipped across their membranes to the other side, but most are synthesized by adding one monosaccharide at a time to the growing glycan chain on the inside of the ER or the Golgi. Regardless, the portion of a glycoconjugate that faces the inside of these compartments will ultimately face the inside of a secretory granule or lysosome and will be topologically unexposed to the cytosol. The biosynthetic enzymes (glycosyltransferases, sulfotransferases, etc.) responsible for catalyzing these reactions are well studied (see Chapter 5), and their location has helped to define various functional compartments of the ER-Golgi pathway. 
A classical model envisioned that these enzymes are physically lined up along this pathway in the precise sequence in which they actually work. This appears to be an oversimplified view, because there is considerable overlap in the distribution of these enzymes, and the actual distribution of a given enzyme seems to depend on the cell type.

All of the topological considerations mentioned above are reversed with regard to nuclear and cytoplasmic glycosylation, because the active sites of the relevant glycosyltransferases face the cytosol, which is in direct communication with the interior of the nucleus. Until the mid-1980s, the accepted dogma was that glycoconjugates, such as glycoproteins and glycolipids, occurred exclusively on the outer surface of cells, on the internal (luminal) surface of intracellular organelles, and on secreted molecules. As discussed above, this was consistent with knowledge of the topology of the biosynthesis of the classes of glycans known at the time, which took place within the lumen of the ER-Golgi pathway. Thus, despite some clues to the contrary, the cytosol and nucleus were assumed to be devoid of glycosylation capacity. However, it is now clear that certain distinct types of glycoconjugates are synthesized and reside within the cytosol and nucleus (see Chapter 17). Indeed, one of them, called O-linked GlcNAc (see Chapter 18), may well be numerically the most common type of glycoconjugate in many cell types. The fact that this major form of glycosylation was missed by so many investigators for so long serves to emphasize the relatively unexplored state of the whole field of glycobiology.

Like all components of living cells, glycans are constantly being degraded, and the enzymes that catalyze this process cleave sugar chains either at the outer (nonreducing) terminal end (exoglycosidases) or internally (endoglycosidases) (see Chapters 3 and 41). Some terminal monosaccharide units such as sialic acids are sometimes removed and new units reattached during endosomal recycling, without degradation of the underlying chain. The final complete degradation of most glycans is generally performed by multiple glycosidases in the lysosome. Once the glycans are broken down, their individual unit monosaccharides are then typically exported from the lysosome into the cytosol so that they can be reused (see Figure 1.8). In contrast to the relatively slow turnover of glycans derived from the ER-Golgi pathway, the O-GlcNAc monosaccharide modifications of the nucleus and cytoplasm may be more dynamic and rapidly turned over (see Chapter 18).

FIGURE 1.8. Biosynthesis, use, and turnover of a common monosaccharide. This schematic shows the biosynthesis, fate, and turnover of galactose, a common monosaccharide constituent of animal glycans. Although small amounts of galactose can be taken up from the outside …

TOOLS USED TO STUDY GLYCOSYLATION

Unlike oligonucleotides and proteins, glycans are not commonly expressed in a linear, unbranched fashion. Even when they are found as linear macromolecules (e.g., GAGs), they often contain a variety of substituents, such as sulfate groups. Thus, the complete sequencing of glycans is practically impossible to accomplish by a single method and requires iterative combinations of physical, chemical, and enzymatic approaches that together yield the details of the structure under study (for a discussion of the various forms of low- and high-resolution separation and analysis, including mass spectrometry and NMR, see Chapter 47). Less detailed information on structure may be sufficient to explore the biology of some glycans and can be obtained by simple techniques, such as the use of enzymes (endoglycosidases and exoglycosidases), lectins, and other glycan-binding proteins (see Chapters 45 and 47), chemical modification or cleavage, metabolic radioactive labeling, antibodies, or cloned glycosyltransferases (Chapter 49). Glycosylation can also be perturbed in a variety of ways, for example, by glycosylation inhibitors and primers (Chapter 50) and by genetic manipulation of glycosylation in intact cells and organisms (Chapter 46). The directed in vitro synthesis of glycans by chemical and enzymatic methods has also taken great strides in recent years, providing many new tools for exploring glycobiology (Chapters 49 and 51). The generation of complex glycan libraries by a variety of routes has further enhanced this interface of chemistry and biology (Chapter 49).

GLYCOMICS

Analogous to genomics and proteomics, glycomics represents the systematic methodological elucidation of the “glycome” (the totality of glycan structures) of a given cell type or organism (see Chapter 48). In reality, the glycome is far more complex than the genome or proteome. In addition to the vastly greater structural diversity in glycans, one is faced with the complexities of glycosylation microheterogeneity (see above) and the dynamic changes that occur in the course of development, differentiation, metabolic changes, malignancy, inflammation, or infection. Added diversity arises from intraspecies and interspecies variations in glycosylation. Thus, a given cell type in a given species can manifest a large number of possible glycome states. Glycomic analysis today generally consists of extracting entire cell types, organs, or organisms; releasing all the glycan chains from their linkages; and cataloging them via approaches such as mass spectrometry. In a variation called glycoproteomics, the glycans are analyzed while still attached to protease-generated fragments of glycoproteins. The results obtained represent a spectacular improvement over what was possible a few decades ago, but they still constitute an effort analogous to cutting down all the trees in a forest and cataloging them, without attention to the layout of the forest and the landscape. This type of glycomic analysis needs to be complemented by classical methods such as tissue-section staining or flow cytometry, using lectins or glycan-specific antibodies that aid in understanding the glycome by taking into account the heterogeneity of glycosylation at the level of the different cell types and subcellular domains in the tissue under study. This is even more important because of the common observation that removing cells from their normal milieu and placing them into tissue culture can result in major changes in the glycosylation machinery of the cell. However, such classical approaches suffer from poor quantitation and relative insensitivity to structural details. A combination of the two approaches is now potentially feasible via laser-capture microdissection of specific cell types directly from tissue sections, with the resulting samples being studied by mass spectrometry.
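One way to get a feel for the claim about structural diversity is a back-of-the-envelope count. The sketch below is only an illustration under heavily simplified assumptions that are not taken from this chapter: chains are linear, each glycosidic bond can adopt one of two anomeric configurations, and each acceptor unit offers four possible attachment positions. Branching, ring size, and substituents such as sulfate are ignored, so the numbers are not estimates of real glycome size. The function names are hypothetical.

```python
def linear_peptide_count(n_units: int, n_types: int) -> int:
    """Linear sequences when units are joined by one kind of bond."""
    return n_types ** n_units


def linear_glycan_count(n_units: int, n_types: int,
                        anomers: int = 2, positions: int = 4) -> int:
    """Linear glycans when every bond also has an anomer and a position.

    anomers and positions are illustrative assumptions, not measured values.
    """
    if n_units < 1:
        raise ValueError("need at least one unit")
    bonds = n_units - 1
    return (n_types ** n_units) * ((anomers * positions) ** bonds)


# Three units drawn from three unit types (illustrative numbers).
peptides = linear_peptide_count(3, 3)   # 3**3
glycans = linear_glycan_count(3, 3)     # 3**3 * 8**2
ratio = glycans // peptides             # 8**2
```

Even in this restricted linear case, every additional bond multiplies the number of possible glycans by the product of the assumed anomer and position counts, whereas the count for a template-encoded linear polymer grows only with the number of unit types.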

Because most of the genes involved in glycan biosynthetic pathways have been cloned from multiple organisms, it is possible today to obtain an indirect genomic and transcriptomic view of the glycome in a specific cell type (see Chapter 7). However, given the relatively poor correlation between mRNA and protein levels, and the complex assembly line and competitive nature of the cellular Golgi glycosylation pathways, even complete knowledge of the mRNA expression patterns of all relevant genes in a given cell cannot allow accurate prediction of the distribution and structures of glycans in that cell type. In other words, there is as yet no reliable indirect route toward elucidating the glycome, other than by actual structural analysis using an array of methods.

GLYCOSYLATION DEFECTS IN ORGANISMS AND CULTURED CELLS

Many mutant variants of cultured cell lines with altered glycan structures and specific glycan biosynthetic defects have been described, the most common of which are those that are lectin resistant (see Chapter 46). Indeed, with few exceptions, mutants with specific defects at most steps of the major pathways of glycan biosynthesis have been found in cultured animal cells. The use of such cell lines has been of great value in elucidating the details of glycan biosynthetic pathways. Their existence implies that many types of glycans are not crucial to the optimal growth of single cells living in the sheltered and relatively unchanging environment of the culture dish. Rather, most glycan structures must be more important in mediating cell–cell and cell–matrix interactions in intact multicellular organisms and/or interactions between organisms. In keeping with this supposition, genetic defects completely eliminating major glycan classes in intact animals all cause embryonic lethality (see Chapter 42 and Table 6.1). As might be expected, naturally occurring viable animal mutants of this type tend to have disease phenotypes of intermediate severity and show complex phenotypes involving multiple systems. Less severe genetic alterations of outer chain components of glycans tend to give viable organisms with more specific phenotypes (see Chapter 42 and Table 6.1). Overall, there is much to be learned by studying the consequences of natural or induced genetic defects in intact multicellular organisms (see Chapter 42). It is interesting to note that, in the short time since the first edition of this book, we have gone from asking “What is it that glycans do anyway?” to having to explain a large number of complex and sometimes nonviable glycosylation-modified phenotypes in humans, mice, flies, and other organisms.

THE BIOLOGICAL ROLES OF GLYCANS ARE DIVERSE

A major theme of this volume is the exploration and elucidation of the biological roles of glycans. As with any biological system, the optimal approach carefully considers the relationship of structure and biosynthesis to function (see Chapter 6). As might be imagined from their ubiquitous and complex nature, the biological roles of glycans are quite varied. Indeed, asking what these roles are is akin to asking the same question about proteins. Thus, all of the proposed theories regarding glycan function turn out to be partly correct, and exceptions to each can also be found. Not surprisingly for such a diverse group of molecules, the biological roles of glycans span the spectrum from those that are subtle to those that are crucial for the development, growth, function, or survival of an organism (for further discussion, see Chapter 6). The diverse functions ascribed to glycans can be more simply divided into two general categories: (i) structural and modulatory functions (involving the glycans themselves or their modulation of the molecules to which they are attached) and (ii) specific recognition of glycans by glycan-binding proteins. Of course, any given glycan can mediate one or both types of functions. The binding proteins in turn fall into two broad groups: lectins and sulfated GAG-binding proteins (see Chapter 26). Such molecules can be either intrinsic to the organism that synthesized the cognate glycans (e.g., see Chapters 28, 29, 30, 31, 32, 33, and 35) or extrinsic (see Chapters 34 and 39 for information concerning microbial proteins that bind to specific glycans on host cells). The atomic details of these glycan-protein interactions have been elucidated in many instances (see Chapter 27). Although there are exceptions to this notion, the following general theme has emerged regarding lectins: Monovalent binding tends to be of relatively low affinity, and such systems typically achieve their specificity and function through high avidity, via interactions of multivalent arrays of glycans with cognate lectin-binding sites (see Chapters 30 and 40).

GLYCOSYLATION CHANGES IN DEVELOPMENT, DIFFERENTIATION, AND MALIGNANCY

Whenever a new tool (e.g., an antibody or lectin) specific for a particular glycan is developed and used to probe its expression in intact organisms, it is common to find exquisitely specific temporal and spatial patterns of expression of that glycan in relation to cellular activation, embryonic development, organogenesis, and differentiation (see Chapter 38). Certain relatively specific changes in expression of glycans are also often found in the course of transformation and progression to malignancy (see Chapter 44), as well as other pathological situations such as inflammation. These spatially and temporally controlled patterns of glycan expression imply the involvement of glycans in many normal and pathological processes, the precise mechanisms of which are understood in only a few cases.

EVOLUTIONARY CONSIDERATIONS IN GLYCOBIOLOGY

Remarkably little is known about the evolution of glycosylation. There are clearly shared and unique features of glycosylation in different kingdoms and taxa. Among animals, there may be a trend toward increasing complexity of N- and O-glycans in more recently evolved (“higher”) taxa. Intraspecies and interspecies variations in glycosylation are also relatively common. It has been suggested that the more specific biological roles of glycans are often mediated by uncommon structures, unusual presentations of common structures, or further modifications of the commonly occurring saccharides themselves. Such unusual structures likely result from unique expression patterns of the relevant glycosyltransferases or other glycan-modifying enzymes. On the other hand, such uncommon glycans can be targets for specific recognition by infectious microorganisms and various toxins. Thus, at least a portion of the diversity in glycan expression in nature must be related to the evolutionary selection pressures generated by interspecies interactions (e.g., of host with pathogen or symbiont). In other words, the two different classes of glycan recognition mentioned above (mediated by intrinsic and extrinsic glycan-binding proteins) are in constant competition with each other, with regard to a particular glycan target. The specialized glycans expressed by parasites and microbes that are of great interest from the biomedical point of view (see Chapters 20, 21, and 40) are themselves presumably subject to evolutionary selection pressures. The evolutionary issues presented above are further considered in Chapter 19, which also discusses the limited information concerning how various glycan biosynthetic pathways appear to have evolved and diverged in different life forms.

GLYCANS IN MEDICINE AND BIOTECHNOLOGY

Numerous natural bioactive molecules are glycoconjugates, and the attached glycans can have dramatic effects on the biosynthesis, stability, action, and turnover of these molecules in intact organisms. For example, heparin, a sulfated glycosaminoglycan, and its derivatives are among the most commonly used drugs in the world. For this and many other reasons, glycobiology and carbohydrate chemistry have become increasingly important in modern biotechnology. Patenting a glycoprotein drug, obtaining FDA approval for its use, and monitoring its production all require knowledge of the structure of its glycans. Moreover, glycoproteins, which include monoclonal antibodies, enzymes, and hormones, are by now the major products of the biotechnology industry, with sales in the tens of billions of dollars annually that continue to grow at an increasing rate. In addition, several human disease states are characterized by changes in glycan biosynthesis that can be of diagnostic and/or therapeutic significance. The emerging importance of glycobiology in medicine and biotechnology is further considered in Chapters 43 and 51.
