NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Varki A, Cummings RD, Esko JD, et al., editors. Essentials of Glycobiology. 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2009.

Bookshelf ID: NBK1935. PMID: 20301257

Chapter 29. L-type Lectins

Marilynn E Etzler, Avadhesha Surolia, and Richard D Cummings.

The L-type lectins were first discovered in the seeds of leguminous plants, and they were found to have structural motifs that are now known to be present in a variety of glycan-binding proteins from other eukaryotic organisms. The structures of many of these lectins have been thoroughly characterized, and many L-type lectins are employed in a wide range of biomedical and analytical procedures. This chapter discusses the structure–function relationships of these lectins and the various biological roles they have in different organisms.

HISTORICAL BACKGROUND

The L-type lectins have a rich history that goes back to the end of the 19th century when it was found that extracts from the seeds of leguminous plants could agglutinate red blood cells. These agglutinins (later named lectins) were found to be soluble proteins that are very abundant in the seeds of leguminous plants, and differences in hemagglutination specificity were found among agglutinins from different species of legumes. A considerable amount of work on these proteins was done in the early part of the 20th century, including the crystallization of concanavalin A (ConA; the hemagglutinin from jack beans) and the finding that the hemagglutinating properties of these proteins are due to their ability to bind glycans on the cell surface.

The abundance of these proteins in the soluble extracts of legume seeds enabled a large number of these lectins to be isolated and characterized. The seed lectins were found to have considerable amino acid sequence homology, and the variety of carbohydrate-binding specificities found among these lectins enabled them to be employed as useful tools in a wide variety of analytical and biomedical procedures.

The crystal structures of a number of legume seed lectins were obtained, and the locations of the carbohydrate-binding sites were determined. Structural similarities were soon noticed among these lectins and some other lectins, including the galectins (see Chapter 33) and a variety of other lectins discussed in this chapter. For this reason, the term “L-lectins” has recently been designated as a classification for all lectins with this legume seed lectin-like structure.

COMMON FEATURES OF L-TYPE LECTINS

The L-type lectins are distinguished from other lectins primarily on the basis of tertiary structure. In general, either the entire lectin monomer or the carbohydrate-recognition domains (CRDs) of the more complex lectins are composed of antiparallel β-sheets connected by short loops and β-bends, and they are usually devoid of any α-helical structure. These sheets form a dome-like structure related to the “jelly-roll fold,” and it is often called a “lectin fold.” The carbohydrate-binding site is generally localized toward the apex of this dome. The tertiary structure of the monomer of ConA, the lectin from the seeds of the legume Canavalia ensiformis, is shown in Figure 29.1. The crystal structures of at least 20 other legume L-type lectin monomers have been determined by high-resolution X-ray crystallography and are almost superimposable on this structure. Thus, it is not surprising that the primary amino acid sequences of these legume lectins show remarkable homology with one another as well as with the sequences of many other legume seed lectins sequenced but not yet crystallized. Fewer, but significant, homologies in primary structure have been found between legume L-type lectins and some L-type lectins from far distant sources, such as ERGIC-53 and VIP36. Yet, in other L-type lectins, no homology is found with the seed lectins, although they contain similar lectin folds. For example, a comparison of the tertiary structure of the legume soybean lectin with the structure of human galectin-3 shows that both proteins contain the typical L-type lectin fold, but no amino acid sequence homology exists between these two lectins (Figure 29.2).

FIGURE 29.1. Structure of concanavalin A (ConA), a legume seed lectin. (a) The tertiary structure of the monomer is best described as a “jelly-roll fold.” (b) This fold consists of a flat six-stranded antiparallel “back” β-sheet …

FIGURE 29.2. Comparison of the subunit structures of soybean agglutinin (left) complexed with a pentasaccharide containing Galβ1–4GlcNAc-R (Olsen L.R., Dessen A., Gupta D., et al. 1997. Biochemistry 36: 15073–15080) and human galectin-3 (right) …

It is clear from comparisons of sequences of L-type lectins from legumes and their relationship to the phylogeny of the various species within the Leguminosae family of plants that the variety found among these lectins most probably arose from divergent evolution. It remains an open possibility that the tertiary structures of some of the other members of the L-type lectin family arose by convergent evolution. It must also be noted that a protein cannot be firmly placed in the L-type lectin category simply on the basis of its tertiary structure. The protein also must be found to have the glycan-binding activity that would classify it as a lectin.

All soluble L-type lectins found to date are multimeric proteins, although all do not have the same quaternary structure. Thus, these lectins are multivalent with more than one glycan-binding site per lectin molecule. The same multivalent principle applies to the membrane bound L-type lectins because the presence of two or more molecules on a membrane surface essentially presents a multivalent situation. In addition to increasing the avidity of the lectins for branched and/or cell-surface glycans, it is becoming increasingly apparent that this multivalence can have great biological significance. Binding of the lectins to the cell surface can lead to aggregation of specific glycan receptors, which can promote a variety of biological responses such as mitogenesis and various signal transduction processes.

PLANT L-TYPE LECTINS

Distribution and Localization

Plant L-type lectins are primarily found in the seeds of leguminous plants, where they constitute about 10% of the total soluble protein of the seed extracts. They are synthesized during seed development several weeks after flowering and transported to the vacuole, where they become condensed into specialized vesicles called protein bodies. They are stable during desiccation of the seeds and can remain in that state indefinitely until the seeds germinate. They represent one of several classes of proteins stored in high concentrations in the seeds and are often called storage proteins. During seed germination, the storage bodies become the vacuoles of the cotyledons, which appear as the first leafy appendages of the plant. During the first week of development, the cotyledons provide food for the plant and eventually shrivel up and disappear. L-type lectins have also been found in the bark of some leguminous trees, and very low amounts of these lectins are also found in other vegetative tissues of legumes. In some cases, these latter lectins have been found to be encoded by separate but very similar genes. About 100 of the legume seed L-type lectins have been characterized, and they are the most extensively studied proteins of this class.

Structure

A common feature of the legume L-type lectins is their oligomeric structure. The structures of the monomers consist of three antiparallel β-sheets: a flat six-stranded “back” sheet, a concave seven-stranded “front” sheet, and a short “top” sheet that keeps the two major sheets together (Figure 29.1a,b). All of these lectins require Ca++ and a transition metal ion (usually Mn++) for their carbohydrate-binding activity. The glycan-binding and metal-binding sites are localized in close proximity to each other at the top of the “front” sheet.

The glycan-binding site is composed of four loops: A, B, C, and D (Figure 29.3, top). These loops contain four invariant amino acids that are essential for carbohydrate binding (Figure 29.3, bottom). Loop A contains an invariant aspartate, which forms hydrogen bonds between its side chain and the glycan ligand. This amino acid is linked to its preceding amino acid (usually alanine) by a rare cis-peptide bond, which is stabilized by the metal ions and is necessary for the proper orientation of the aspartate in the combining site. Loop B contains an invariant glycine, which also forms hydrogen bonds with the ligand. An exception to this case is found in two lectins (ConA and the closely related Dioclea grandiflora lectin) where the glycine is replaced with an arginine. Both the glycine and arginine form hydrogen bonds with the ligand via their main-chain amides. Loop C contains an invariant asparagine, which forms a hydrogen bond with the ligand via its side chain. This loop also contains an invariant hydrophobic amino acid.

FIGURE 29.3. (Top) Three-dimensional structure of a legume lectin (PNA) monomer showing the four loops involved in sugar binding: loops A, B, C, and D. The bound sugar (lactose) is shown as a ball-and-stick model. Calcium and manganese ions are required for ligand …

The legume L-type lectins are generally classified into groups based on their carbohydrate specificities. These differences in specificities are brought about by variability in the conformation and size of the D loop and to some extent by the C loop. Although the main specificity regions of the legume lectins are determined by the loops, it is now clear that there are sites other than these that contribute to lectin specificity. There also exist several additional modes of refining these specificities, such as interaction with water, posttranslational modifications, and state of oligomerization.

Within these oligomeric proteins, the back β-sheet is involved in formation of the oligomers, which are mostly dimeric or tetrameric in nature. A variety of quaternary structures are found among the lectins. The dimeric and tetrameric structures of ConA are shown in Figure 29.1c,d. Although some of the other lectins occur as dimeric and tetrameric structures, several other different orientations of the β-sheets have been found to account for the variability in dimeric and tetrameric structures of other lectins in this class. It is of interest that some legume lectins have been found to have a hydrophobic binding site that binds adenine and adenine-derived plant hormones with micromolar affinity; this is two to three orders of magnitude higher than their affinity for monosaccharides. Three of these lectins (the soybean agglutinin, phytohemagglutinin L [PHAL], and Dolichos biflorus lectin) have been crystallized and found to have a unique tetrameric structure, in that the dimer–dimer interface creates a channel running through the center of the tetramer. Two identical adenine-binding sites are found at opposite ends of this channel.

Another common feature of the legume lectins is that they are secretory proteins and undergo cotranslational signal peptide removal, which accompanies their entry into the secretory system. All but the peanut agglutinin are N-glycosylated; the N-glycans undergo the normal posttranslational modifications that occur as they transit the Golgi apparatus. The lectins vary from one another as to whether the mature proteins contain oligomannose-type, complex-type, or a mixture of both types of N-glycans. The lectins may also undergo a variety of proteolytic modifications as they transit through the secretory system. Some of the lectins are cleaved to generate a β-chain, corresponding to the amino terminus, and an α-chain, corresponding to the carboxyl terminus. For example, the pea lectin and favin (the lectin from Vicia faba) are tetrameric glycoproteins that contain two types of subunits, α and β, which have molecular weights of about 5,000 and 21,000, respectively. These two lectins are each synthesized as single polypeptide precursors that contain the sequences of both chains in the following orientation: β-chain–α-chain. The chains associate to form dimers; they are then proteolytically processed in the protein bodies to form tetramers containing two separate α- and β-chains. Other lectins may undergo carboxy-terminal trimming of only some of their subunits. For example, the soybean agglutinin, phytohemagglutinin E (PHAE), and Dolichos biflorus lectins are tetramers of equimolar mixtures of intact and trimmed subunits. The most intriguing proteolytic modification occurs in the case of ConA: A small segment is removed from the interior of the protein and the original amino terminus is ligated with the original carboxyl terminus. This forms what is termed a circularly permuted protein. The glycosylated segment of the protein is removed during this transpeptidation process; thus, the mature ConA is not a glycoprotein. This finding provided an explanation for the fact that the protein sequence of isolated ConA aligns with other seed lectins, whereas the alignment of the DNA encoding the protein with other lectin genes suggested that the gene is circularly permuted.
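The ConA rearrangement described above can be pictured with a toy string model. The sketch below is only an illustration: the precursor string, the cut positions, and the function name circularly_permute are hypothetical placeholders, not real ConA residues or coordinates. It removes an internal segment and then joins the original carboxyl terminus to the original amino terminus, so the former carboxy-terminal half comes first in the mature chain.

```python
def circularly_permute(precursor: str, cut_start: int, cut_end: int) -> str:
    """Toy model of transpeptidation in a circularly permuted protein.

    precursor[cut_start:cut_end] is the internal segment that is excised.
    The old carboxyl terminus is then ligated to the old amino terminus.
    All sequences and positions here are illustrative assumptions.
    """
    if not 0 < cut_start <= cut_end < len(precursor):
        raise ValueError("the excised segment must lie in the interior")
    n_part = precursor[:cut_start]   # original amino-terminal half
    c_part = precursor[cut_end:]     # original carboxy-terminal half
    # Ligation of the old C-terminus to the old N-terminus places
    # the carboxy-terminal half in front of the amino-terminal half.
    return c_part + n_part


# Hypothetical precursor: two halves separated by an internal segment "xxx".
precursor = "ABCDE" + "xxx" + "FGHIJ"
mature = circularly_permute(precursor, 5, 8)
print(mature)  # FGHIJABCDE
```

In this picture the mature chain is a rotation of the precursor with the internal segment deleted, which is why the two orderings align differently against the same reference sequence.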

Function

Despite the extensive amount of information available on the plant L-type lectins, the biological role of these proteins remains a mystery. Many hypotheses have been proposed over the years. One hypothesis is that the lectins simply serve as other storage proteins for the plant. However, it is a puzzle as to why proteins with such exquisite carbohydrate specificity would evolve merely to feed the plant. Another hypothesis was that the L-type lectins are involved in the recognition of glycan signals produced by Rhizobia during the initiation of the nitrogen-fixing Rhizobium-legume symbiosis. Recently, it has been shown that this hypothesis is not correct. A different type of lectin (an LNP) was found that may have this function (see Chapters 22 and 37). In fact, none of the L-type lectins has been shown to recognize any of the plant glycan signals identified to date (see Chapters 22 and 37); searches in plants for ligands for these lectins have been unsuccessful so far.

The most attractive hypothesis to date for plant L-type lectin function is that these proteins may have a role in plant defense, which may occur in one or more ways. Some of these lectins have been found to be toxic to insects, thus raising the possibility that they may serve as a deterrent to these pests and other pathogens. They also may have a role as pattern-recognition receptors within the plant innate immune system. Interestingly, some animal lectins, as discussed elsewhere, have been found to function in innate immunity. A recent study has shown that the Dolichos biflorus seed lectin is also a lipoxygenase. It will be of interest to see how many other L-type lectins have this activity, which is necessary to initiate the wound-induced defense pathway in plants.

L-TYPE LECTINS IN PROTEIN QUALITY CONTROL AND SORTING

Calnexin/Calreticulin

Calnexin (CNX) and calreticulin (CRT) are homologous protein chaperones that mediate quality control of proteins in the endoplasmic reticulum (ER) (see Chapter 36). CNX is membrane-bound and is perhaps closely associated with the protein-translocating channel that is involved in importing nascent proteins into the ER. CRT is a soluble ER luminal component. As discussed in Chapter 36, these two proteins bind to monoglucosylated, high-mannose-type glycans on newly synthesized glycoproteins and prevent the glycoproteins from exiting the ER until they are properly folded and assembled into correct quaternary structures. During the binding and dissociation from CRT or CNX, if the glycoprotein folds correctly, then glucose removal by glucosidase-II allows its passage out of the ER. In the event that a glycoprotein misfolds or aggregates, it is reglucosylated by UDP-Glc:glycoprotein glucosyltransferase (UGGT); this enzyme only recognizes misfolded or aggregated glycoproteins. Following reglucosylation, the monoglucosylated protein binds again to CRT or CNX. Thus, there is a cycle of glucose removal and addition by the alternating actions of glucosidase-II and UGGT and interactions with CNX/CRT.
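The cycle described above can be written as a small loop, which makes the order of events explicit. This is a minimal sketch under stated assumptions: the function name run_cnx_crt_cycle and the parameters attempts_needed and max_rounds are hypothetical, and folding is modeled as succeeding deterministically on a chosen round, which real folding does not do.

```python
def run_cnx_crt_cycle(attempts_needed: int, max_rounds: int = 10) -> list:
    """Return the ordered events for one glycoprotein in the CNX/CRT cycle.

    attempts_needed: hypothetical round on which folding succeeds.
    max_rounds: hypothetical cap so the toy loop always terminates.
    """
    events = []
    glucose = 1  # the glycan starts monoglucosylated
    for round_no in range(1, max_rounds + 1):
        # A monoglucosylated glycan is bound by CNX or CRT.
        events.append("bind CNX/CRT")
        folded = round_no >= attempts_needed
        # Glucosidase II removes the remaining glucose.
        glucose = 0
        events.append("glucosidase II removes glucose")
        if folded:
            # Correctly folded and deglucosylated: free to leave the ER.
            events.append("exit ER")
            return events
        # UGGT acts only on misfolded or aggregated glycoproteins.
        glucose = 1
        events.append("UGGT reglucosylates")
    events.append("still retained in ER")
    return events


for event in run_cnx_crt_cycle(attempts_needed=2):
    print(event)
```

With attempts_needed=2 the protein passes through one round of reglucosylation and rebinding before it exits; a protein that never folds within max_rounds simply stays in the loop.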

Both CRT and CNX are Ca++-binding proteins, and their carbohydrate-binding activity is sensitive to changes in Ca++ concentration. CNX is a type I membrane protein with its carboxy-terminal end in the cytoplasm. The luminal portion of the protein is divided into three domains: a Ca++-binding domain (which is adjacent to the transmembrane domain), a proline-rich long hairpin loop called the P domain, and the amino-terminal L-type lectin domain. CRT has a similar structure, but it is missing the cytoplasmic and transmembrane regions; it is retained in the ER through its KDEL-retrieval signal at the carboxyl terminus (Figure 29.4).

FIGURE 29.4. Schematic representation of calnexin showing the lectin domain, the P domain (containing the proline repeats), and the calcium-binding domain (a). Structure of calnexin based on crystallographic data (b). (Adapted, with permission of Elsevier, from Schrag …

ERGIC-53 and VIP36

ERGIC-53 and VIP36 are type I membrane proteins that have been found to participate in vesicular protein transport in the secretory system (see Chapter 36). ERGIC-53 has been found in the ER-Golgi intermediate compartment (ERGIC) and its cytoplasmic carboxyl terminus contains the dilysine KKFF retention/retrieval motif. The dilysine part of this motif is recognized by the COPI coatomer complex; this binding enables the coated vesicles to be recycled from the ERGIC back to the ER. The diphenylalanine helps to direct the COPII-coated vesicles to ER export sites by binding to the COPII coatomer. The location of VIP36 is uncertain; overexpressed protein has been found in both the ER and ERGIC.
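The two halves of the KKFF motif described above can be separated with a few lines of code. This is a minimal sketch, assuming the motif occupies the last four residues of the cytoplasmic tail; the function name split_kkff_motif and the example tail sequences are hypothetical and are not taken from a real protein record.

```python
def split_kkff_motif(cytoplasmic_tail: str) -> dict:
    """Report which halves of a KKFF-style motif end a cytoplasmic tail.

    Assumes the motif is the final four residues (one-letter codes).
    """
    last_four = cytoplasmic_tail.upper()[-4:]
    return {
        # Dilysine half: recognized by COPI for recycling back to the ER.
        "dilysine_copi": len(last_four) == 4 and last_four[:2] == "KK",
        # Diphenylalanine half: binds COPII and helps direct ER export.
        "diphenylalanine_copii": len(last_four) == 4 and last_four[2:] == "FF",
    }


print(split_kkff_motif("AAAAKKFF"))  # hypothetical tail with both halves
print(split_kkff_motif("AAAAKKAA"))  # hypothetical tail with dilysine only
```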

Both ERGIC-53 and VIP36 bind to oligomannose-type glycans and require Ca++ for their carbohydrate-binding activity. These two proteins were the first animal lectins that were found to share some sequence and structural homology with the legume seed lectins. Although the overall sequence identity of these proteins to the seed lectins is only about 19–24%, those amino acids important for metal and carbohydrate binding in the seed lectins are conserved, including the invariant aspartate, glycine, and asparagine. The invariant aspartate also participates in a cis-peptide bond with its preceding amino acid; this is similar to the case of the legume seed lectins discussed in the above section. The crystal structure of the CRD of ERGIC-53 has been determined and confirms the structural similarity of these lectins.

OTHER L-TYPE LECTINS

A variety of other proteins have been described that have carbohydrate-binding domains with tertiary structures similar to the lectin fold and may be considered as members of the L-type lectin family. Members of the galectin family of lectins fit into this category and are the subject of a separate chapter in this volume (see Chapter 33). Other carbohydrate-binding proteins that may fit into this category are briefly discussed below.

Pentraxins and Related Proteins

The pentraxins are a superfamily of plasma proteins that are involved in innate immunity in invertebrates and vertebrates. They contain L-type lectin folds and require Ca++ ions for ligand binding. Their name is based on the pentameric arrangement of their subunits. The short pentraxins, C-reactive protein (which binds phosphocholine) and the serum amyloid P component (which binds carbohydrate), are acute-phase proteins in humans and mice, respectively. This family also contains long pentraxins that have an unrelated long amino-terminal domain coupled to the pentraxin domain. PTX3 is one of these long pentraxins; in addition to its role in innate immunity, it may help in the assembly of a hyaluronan-rich extracellular matrix.

Laminin G domain-like (LG) modules are made of 180–200 amino acid residues and were first identified in proteins like laminins. These modules were predicted to have a fold similar to the folds of pentraxins. Some LG modules share binding properties for cellular receptors and carbohydrate ligands, indicating that the LG fold may have evolved from the L-type lectin fold for participation in related functions.

vp4 Sialic Acid–binding Domain

vp4 is a monomeric sialic acid–binding domain with an L-type lectin fold. This domain is required for infectivity of most animal rotaviruses. vp4 forms the viral spikes and mediates the recognition and attachment of the virus to the surface of animal cells. Unlike other L-type lectins, which bind to neutral glycans, vp4 binds to a charged glycan (sialic acid) through electrostatic interactions. Trypsin cleavage of vp4 gives rise to two fragments: vp8, which binds sialic acid, and vp5, which contains a hydrophobic region that permeabilizes the membrane for entry.

OTHER PROTEINS WITH JELLY-ROLL MOTIFS

Clostridium Neurotoxins, the Second-last Domain

Tetanus and botulism are caused by the toxic effect of neurotoxins produced by Clostridium tetani and Clostridium botulinum, respectively. Botulinum neurotoxins (BoNTs) block the release of acetylcholine at the neuromuscular junction; in contrast, tetanus neurotoxin blocks the release of neurotransmitters such as glycine and γ-aminobutyric acid in the inhibitory interneurons of the spinal cord. Their entry into cells is mediated by binding to polysialogangliosides and other acidic lipids on the presynaptic membrane. These neurotoxins usually contain an activating domain (A) and a binding domain (B). The crystal structures of BoNT/A and BoNT/B reveal features of their receptor binding and activation. In the case of BoNT/B, the amino-terminal end of the binding domain consists of two seven-stranded antiparallel β-sheets that form a 14-stranded β-barrel in a jelly-roll motif. The structure of the binding domain is very similar to that of the C-fragment of tetanus toxin and the binding domain of BoNT/A. The study of binding of these neurotoxins to the cell-surface gangliosides is a necessary step toward designing effective inhibitors to combat their adverse effects.

Exotoxin A, Amino-terminal Domain

Pseudomonas aeruginosa exotoxin A is extremely toxic to eukaryotic cells because of its ability to inhibit protein synthesis. This effect is brought about by the ADP ribosylation of a specific posttranslationally modified histidine of eEF-2 (diphthamide). The crystal structure of the exotoxin reveals a tertiary fold, consisting of three distinct structural domains. These domains are individually responsible for amino-terminal receptor binding (domain I), transmembrane targeting (domain II), and ADP ribosyltransferase (domain III) activities. Domain Ia displays a complex 13-stranded jelly-roll structure.

Vibrio cholerae Sialidase, Amino-terminal and Insertion Domains

Bacterial sialidases have been implicated in the pathogenesis of a number of diseases. Vibrio cholerae neuraminidase aids in the pathogenesis of cholera by removing sialic acid from larger gangliosides to expose GM1, the receptor for cholera toxin. The crystal structure of the neuraminidase reveals the presence of a three-domain protein with a six-bladed β-propeller neuraminidase domain flanked by two lectin domains. One of these domains is at the amino terminus and the other is between the second and third blade of the propeller.

Leech Intramolecular trans-Sialidase, Amino-terminal Domain

The intramolecular trans-sialidase from the leech Macrobdella decora catalyzes an intramolecular trans-sialosyl reaction; this is specific for the cleavage of the terminal Neu5Acα2–3Gal linkage in sialoglycoconjugates and releases 2,7-anhydro-Neu5Ac instead of Neu5Ac. This enzyme displays a multidomain architecture with a lectin-like domain II and an irregular β-stranded domain III, which is built around a canonical catalytic domain C. Domain II has a cluster of two histidine and two tyrosine residues on the curved surface of the same side as the active site crater. This domain II may be involved in carbohydrate recognition through sugar ring and aromatic side-chain interactions, as observed in many lectins.

FURTHER READING

  1. Goldstein IJ, Poretz RD. Isolation, physicochemical characterization, and carbohydrate-binding specificity of lectins. In: Liener IE, Sharon N, Goldstein IJ, editors. The lectins: Properties, functions, and applications in biology and medicine. Academic Press; Orlando, Florida: 1986. pp. 33–247.
  2. Fiedler K, Simons K. A putative novel class of animal lectins in the secretory pathway homologous to leguminous lectins. Cell. 1994;77:625–626. [PubMed: 8205612]
  3. Taylor G. Sialidases: Structures, biological significance and therapeutic potential. Curr Opin Struct Biol. 1996;6:830–837. [PubMed: 8994884]
  4. Sharma V, Surolia A. Analyses of carbohydrate recognition by legume lectins: Size of the combining site loops and their primary specificity. J Mol Biol. 1997;267:433–445. [PubMed: 9096236]
  5. Lis H, Sharon N. Lectins: Carbohydrate-specific proteins that mediate cellular recognition. Chem Rev. 1998;98:637–674. [PubMed: 11848911]
  6. Loris R, Bouckaert J, Hamelryck T, Wyns L. Legume lectin structure. Biochim Biophys Acta. 1998;1383:9–36. [PubMed: 9546043]
  7. Hamelryck TW, Loris R, Bouckaert J, Dao-Thi M-H, Strecker G, Imberty A, Fernandez E, Wyns L, Etzler ME. Carbohydrate binding, quaternary structure and a novel hydrophobic binding site in two legume lectin oligomers from Dolichos biflorus. J Mol Biol. 1999;286:1161–1177.
  8. Vijayan M, Chandra N. Lectins. Curr Opin Struct Biol. 1999;9:707–714. [PubMed: 10607664]
  9. Rudenko G, Hohenester E, Muller YA. LG/LNS domains: Multiple functions—One business end. Trends Biochem Sci. 2001;26:363–368. [PubMed: 11406409]
  10. Schrag JD, Procopio DO, Cygler M, Thomas DY, Bergeron JJM. Lectin control of protein folding and sorting in the secretory pathway. Trends Biochem Sci. 2003;28:49–57. [PubMed: 12517452]
  11. Bottazzi B, Garlanda C, Salvatori G, Jeannin P, Manfredi A, Mantovani A. Pentraxins as a key component of innate immunity. Curr Opin Immunol. 2006;18:10–15. [PubMed: 16343883]
  12. Roopashree S, Singh SA, Gowda LR, Rao AA. Dual-function protein in plant defence: Seed lectin from Dolichos biflorus (horse gram) exhibits lipoxygenase activity. Biochem J. 2006;395:629–639. [PMC free article: PMC1462680] [PubMed: 16441240]
  13. Dam TK, Gerken TA, Cavada BS, Nascimento KS, Moura TR, Brewer CF. Binding studies of α-GalNAc-specific lectins to the α-GalNAc (Tn-antigen) form of porcine submaxillary mucin and its smaller fragments. J Biol Chem. 2007;282:28256–28263. [PubMed: 17652089]

Copyright © 2009, The Consortium of Glycobiology Editors, La Jolla, California.
