NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Lodish H, Berk A, Zipursky SL, et al. Molecular Cell Biology. 4th edition. New York: W. H. Freeman; 2000.

  • By agreement with the publisher, this book is accessible by the search feature, but cannot be browsed.

Section 22.3 Collagen: The Fibrous Proteins of the Matrix

Collagen is the major insoluble fibrous protein in the extracellular matrix and in connective tissue. In fact, it is the single most abundant protein in the animal kingdom. There are at least 16 types of collagen, but 80 – 90 percent of the collagen in the body consists of types I, II, and III (Table 22-3). These collagen molecules pack together to form long thin fibrils of similar structure (see Figure 5-20). Type IV, in contrast, forms a two-dimensional reticulum; several other types associate with fibril-type collagens, linking them to each other or to other matrix components. At one time it was thought that all collagens were secreted by fibroblasts in connective tissue, but we now know that numerous epithelial cells make certain types of collagens. The various collagens and the structures they form all serve the same purpose, to help tissues withstand stretching.

Table 22-3. Major Collagen Molecules.

The Basic Structural Unit of Collagen Is a Triple Helix

Because its abundance in tendon-rich tissue such as rat tail makes the fibrous type I collagen easy to isolate, it was the first to be characterized. Its fundamental structural unit is a long (300-nm), thin (1.5-nm-diameter) protein that consists of three coiled subunits: two α1(I) chains and one α2(I).* Each chain contains precisely 1050 amino acids wound around one another in a characteristic right-handed triple helix (Figure 22-11a). All collagens were eventually shown to contain three-stranded helical segments of similar structure; the unique properties of each type of collagen are due mainly to segments that interrupt the triple helix and that fold into other kinds of three-dimensional structures.

Figure 22-11. The structure of collagen. (a) The basic structural unit is a triple-stranded helical molecule. Each triple-stranded collagen molecule is 300 nm long. (b) In fibrous collagen, collagen molecules pack together side by side. Adjacent molecules are displaced (more...)

The triple-helical structure of collagen arises from an unusual abundance of three amino acids: glycine, proline, and hydroxyproline. These amino acids make up the characteristic repeating motif Gly-Pro-X, where X can be any amino acid. Each amino acid has a precise function. The side chain of glycine, an H atom, is the only one that can fit into the crowded center of a three-stranded helix. Hydrogen bonds linking the peptide bond NH of a glycine residue with a peptide carbonyl (C═O) group in an adjacent polypeptide help hold the three chains together. The fixed angle of the C–N peptidyl-proline or peptidyl-hydroxyproline bond enables each polypeptide chain to fold into a helix with a geometry such that three polypeptide chains can twist together to form a three-stranded helix. Interestingly, although the rigid peptidyl-proline linkages disrupt the packing of amino acids in an α helix, they stabilize the rigid three-stranded collagen helix.
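The requirement for glycine at every third position can be checked mechanically. The following Python sketch is illustrative only: the function name `glycine_repeat_violations`, the test sequences, and the assumption that the repeat is in register at the first residue are hypothetical, and the letter `O` is used as a stand-in for hydroxyproline.

```python
def glycine_repeat_violations(seq, start=0):
    """Return 0-based positions where a glycine (G) is expected but absent.

    Assumes the Gly-Pro-X repeat is in register at index `start`,
    so glycine is expected at start, start + 3, start + 6, ...
    """
    return [i for i in range(start, len(seq), 3) if seq[i] != "G"]


# Hypothetical one-letter sequences, not real collagen sequences.
ideal = "GPO" * 5                      # glycine at every third position
mutant = ideal[:6] + "C" + ideal[7:]   # Gly -> Cys substitution at index 6

print(glycine_repeat_violations(ideal))   # []
print(glycine_repeat_violations(mutant))  # [6]
```

An empty list means every expected position holds glycine; any reported index marks a residue whose side chain would not fit the crowded center of the helix.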

Collagen Fibrils Form by Lateral Interactions of Triple Helices

Many three-stranded type I collagen molecules pack together side-by-side, forming fibrils with a diameter of 50 – 200 nm. In fibrils, adjacent collagen molecules are displaced from one another by 67 nm, about one-quarter of their length (Figure 22-11b). This staggered array produces a striated effect that can be seen in electron micrographs of stained collagen fibrils; the characteristic pattern of bands is repeated about every 67 nm (Figure 22-11c). The unique properties of the fibrous collagens — types I, II, III, and V — are due to the ability of the rodlike triple helices to form such side-by-side interactions.
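The stagger arithmetic can be made explicit with a short Python sketch. The two lengths come from the text; the function name `stagger_offsets` and the choice of five molecules are assumptions made only for illustration.

```python
MOLECULE_LENGTH_NM = 300  # length of one triple-helical molecule (from the text)
STAGGER_NM = 67           # displacement between adjacent molecules (from the text)


def stagger_offsets(n_molecules, stagger_nm=STAGGER_NM):
    """Axial start position (nm) of each of n side-by-side molecules."""
    return [i * stagger_nm for i in range(n_molecules)]


fraction = STAGGER_NM / MOLECULE_LENGTH_NM
print(round(fraction, 3))   # 0.223, i.e. roughly one-quarter of the length
print(stagger_offsets(5))   # [0, 67, 134, 201, 268]
```

Because each neighbor starts 67 nm further along the fibril axis, the same structural features recur every 67 nm, which is consistent with the banding period described above.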

Short segments at either end of the collagen chains are of particular importance in the formation of collagen fibrils (see Figure 22-11). These segments do not assume the triple-helical conformation and contain the unusual amino acid hydroxylysine (see Figure 3-16). Covalent aldol cross-links form between two lysine or hydroxylysine residues at the C-terminus of one collagen molecule with two similar residues at the N-terminus of an adjacent molecule (Figure 22-12). These cross-links stabilize the side-by-side packing of collagen molecules and generate a strong fibril.

Figure 22-12. The side-by-side interactions of collagen helices are stabilized by an aldol cross-link between two lysine (or hydroxylysine) side chains. The extracellular enzyme lysyl oxidase catalyzes formation of the aldehyde groups.

Type I collagen fibrils have enormous tensile strength; that is, such collagen can be stretched without being broken. These fibrils, roughly 50 nm in diameter and several micrometers long, are packed side-by-side in parallel bundles, called collagen fibers, in tendons, where they connect muscles with bones and must withstand enormous forces (Figure 22-13). Gram for gram, type I collagen is stronger than steel.

Figure 22-13. Electron micrograph of the dense connective tissue of a chick tendon. Most of the tissue is occupied by parallel type I collagen fibrils, about 50 nm in diameter, seen here in cross section. The cellular content of the tissue is very low. [From D. A. (more...)

Assembly of Collagen Fibers Begins in the ER and Is Completed outside the Cell

Collagen biosynthesis and assembly follow the normal pathway for a secreted protein (see Figure 17-13). The collagen chains are synthesized as longer precursors called procollagens; the growing peptide chains are co-translationally transported into the lumen of the rough endoplasmic reticulum (ER). In the ER, the procollagen chain undergoes a series of processing reactions (Figure 22-14). First, as with other secreted proteins, glycosylation of procollagen occurs in the rough ER and Golgi complex. Galactose and glucose residues are added to hydroxylysine residues, and long oligosaccharides are added to certain asparagine residues in the C-terminal propeptide, a segment at the C-terminus of a procollagen molecule that is absent from mature collagen. (The N-terminal end also has a propeptide.) In addition, specific proline and lysine residues in the middle of the chains are hydroxylated by membrane-bound hydroxylases. Lastly, intrachain disulfide bonds between the N- and C-terminal propeptide sequences align the three chains before the triple helix forms in the ER. The central portions of the chains zipper from C- to N-terminus to form the triple helix.

Figure 22-14. Major events in the biosynthesis of fibrous collagens. Modifications of the procollagen polypeptide in the endoplasmic reticulum include hydroxylation, glycosylation, and disulfide-bond formation. Interchain disulfide bonds between the C-terminal propeptides (more...)

After processing and assembly of type I procollagen is completed, it is secreted into the extracellular space. During or following exocytosis, extracellular enzymes, the procollagen peptidases, remove the N-terminal and C-terminal propeptides. The resulting protein, often called tropocollagen (or simply collagen), consists almost entirely of a triple-stranded helix. Excision of both propeptides allows the collagen molecules to polymerize into normal fibrils in the extracellular space (see Figure 22-14). The potentially catastrophic assembly of fibrils within the cell does not occur both because the propeptides inhibit fibril formation and because lysyl oxidase, which catalyzes formation of reactive aldehydes, is an extracellular enzyme (see Figure 22-12). As noted above, these aldehydes spontaneously form specific covalent cross-links between two triple-helical molecules, which stabilizes the staggered array characteristic of collagen molecules and contributes to fibril strength.

Post-translational modification of procollagen is crucial for the formation of mature collagen molecules and their assembly into fibrils. Defects in this process have serious consequences, as ancient mariners frequently experienced. For example, the activity of both prolyl hydroxylases requires an essential cofactor, ascorbic acid (vitamin C). In cells deprived of ascorbate, as in the disease scurvy, the procollagen chains are not hydroxylated sufficiently to form stable triple helices at normal body temperature (Figure 22-15), nor can they form normal fibrils. Consequently, nonhydroxylated procollagen chains are degraded within the cell. Without the structural support of collagen, blood vessels, tendons, and skin become fragile. A supply of fresh fruit provides sufficient vitamin C to process procollagen properly.

Figure 22-15. Denaturation of collagen containing a normal content of hydroxyproline and of abnormal collagen containing no hydroxyproline. Without hydrogen bonds between hydroxyproline residues, the collagen helix is unstable and loses most of its helical content (more...)

Mutations in Collagen Reveal Aspects of Its Structure and Biosynthesis

Type I collagen fibrils are used as the reinforcing rods in construction of bone. Certain mutations in the α1(I) or α2(I) genes lead to osteogenesis imperfecta, or brittle-bone disease. The most severe type is an autosomal dominant, lethal disease resulting in death in utero or shortly after birth. Milder forms generate a severe crippling disease. As might be expected, many cases of osteogenesis imperfecta are due to deletions of all or part of the very long α1(I) gene. However, a single amino acid change is sufficient to cause certain forms of this disease. As we have seen, a glycine must be at every third position for the collagen triple helix to form; mutations of glycine to almost any other amino acid are deleterious, producing poorly formed and unstable helices. Since the triple helix forms from the C- to the N-terminus, mutations of glycine near the C-terminus of the α1(I) chain are usually more deleterious than those near the N-terminus; the latter permit substantial regions of triple helix to form. Mutant unfolded collagen chains do not leave the rough ER of fibroblasts (the cells that make most of type I collagen), or they leave it slowly. As the ER becomes dilated and expanded, the secretion of other proteins (e.g., type III collagen) by these cells also is slowed down.
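The positional argument can be illustrated with a deliberately crude toy model in Python. It assumes, purely for illustration, that zippering proceeds residue by residue from the C-terminus and stalls completely at a mutated glycine; real folding is more nuanced. The function name `helix_fraction_before_block` and the two mutation positions are hypothetical; only the chain length of 1050 residues comes from the text.

```python
def helix_fraction_before_block(chain_length, mutation_index):
    """Toy model of C-to-N zippering.

    Index 0 is the N-terminal residue and chain_length - 1 the C-terminal one.
    The helix is assumed to zip from the C-terminus and to stall at the
    mutated glycine; returns the fraction of residues zipped before the stall.
    """
    if not 0 <= mutation_index < chain_length:
        raise ValueError("mutation_index must lie within the chain")
    zipped = chain_length - 1 - mutation_index  # residues C-terminal to the mutation
    return zipped / chain_length


CHAIN_LENGTH = 1050  # residues per chain, from the text

# Hypothetical mutation positions chosen only to contrast the two ends.
near_n = helix_fraction_before_block(CHAIN_LENGTH, 30)
near_c = helix_fraction_before_block(CHAIN_LENGTH, 1020)
print(f"mutation near N-terminus: {near_n:.2f} of the chain zipped")
print(f"mutation near C-terminus: {near_c:.2f} of the chain zipped")
```

Under these assumptions a mutation near the N-terminus leaves most of the helix formed, whereas one near the C-terminus blocks almost all of it, which mirrors the qualitative trend stated above.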

Because each type I collagen molecule contains two α1(I) and one α2(I) chains, mutations in the α2(I) chains are much less damaging. To understand this point, consider that in a heterozygote expressing one wild-type and one mutant α2(I) protein, 50 percent of the collagen molecules will have the abnormal α2(I) chain. In contrast, if the mutation is in the α1(I) chain, 75 percent of the collagen molecules will have one or two mutant α1(I) chains. In fact, even low expression of a mutant α1(I) gene can be deleterious, because the mutant chains can disrupt the function of wild-type α1(I) chains when combined with them. To study such mutations, experimenters constructed a mutant α1(I) collagen gene with a glycine-to-cysteine substitution near the C-terminus. This mutant gene was used to create lines of transgenic mice with otherwise normal collagen genes. High-level expression of the mutant transgene was lethal, and expression at a rate 10 percent that of the normal α1(I) genes caused severe growth abnormalities.
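The 50 percent and 75 percent figures follow from simple probability, sketched below in Python. The sketch assumes that chains are drawn at random and independently during assembly and that the share of mutant chains equals the share of mutant expression; the function name `fraction_with_mutant_chain` is hypothetical. The last calculation extends the same assumption to the 10 percent transgene case and is an illustration, not a result reported in the text.

```python
from fractions import Fraction


def fraction_with_mutant_chain(copies_per_molecule, mutant_share=Fraction(1, 2)):
    """Fraction of molecules containing at least one mutant chain.

    Assumes each of the `copies_per_molecule` positions is filled
    independently, with probability `mutant_share` of being mutant.
    """
    return 1 - (1 - mutant_share) ** copies_per_molecule


# Heterozygote: half of the chains of the affected type are mutant.
print(fraction_with_mutant_chain(1))  # 1/2 for α2(I), one copy per molecule
print(fraction_with_mutant_chain(2))  # 3/4 for α1(I), two copies per molecule

# Transgene expressed at 10 percent of the normal α1(I) level:
# mutant share = 0.1 / (1 + 0.1) = 1/11 under the stated assumption.
low_share = Fraction(1, 11)
print(fraction_with_mutant_chain(2, low_share))  # 21/121, roughly 17 percent
```

Even at a low mutant share, a noticeable fraction of molecules carries at least one mutant α1(I) chain, which is in line with the observation that low expression of the mutant gene can be deleterious.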

Collagens Form Diverse Structures

Collagens differ in their ability to form fibers and to organize the fibers into networks. For example, type II is the major collagen in cartilage. Its fibrils are smaller in diameter than those of type I and are oriented randomly in the viscous proteoglycan matrix. Such rigid macromolecules impart strength and compressibility to the matrix and allow it to resist large deformations in shape. This property allows joints to absorb shocks.

Type II fibrils are cross-linked to proteoglycans in the matrix by type IX, a collagen of a different structure (Figure 22-16a). Type IX collagen consists of two long triple helices connected by a flexible kink. The globular N-terminal domain extends from the composite fibrils, as does a heparan sulfate molecule, a type of large, highly charged polysaccharide (discussed later) that is linked to the α2(IX) chain at the flexible kink. These protruding nonhelical domains are thought to anchor the fibril to proteoglycans and other components of the matrix. The interrupted triple-helical structure of type IX collagen prevents it from assembling into fibrils; instead, these three collagens associate with fibrils formed from other collagen types and thus are called fibril-associated collagens (see Table 22-3).

Figure 22-16. Interactions of fibrous and nonfibrous collagens. (a) Association of types II and IX collagen in a cartilage matrix. Type II forms fibrils similar in structure to type I, with a similar 67-nm periodicity, though smaller in diameter. Type IX contains two long (more...)

Figure 22-24. Structures of various glycosaminoglycans, the polysaccharide components of proteoglycans. Each of the four classes of glycosaminoglycans is formed by polymerization of a specific disaccharide and subsequent modifications including addition of sulfate (more...)

In many connective tissues, type VI collagen is bound to the sides of type I fibrils and may bind them together to form thicker collagen fibers (Figure 22-16b). Type VI collagen is unusual in that the molecule consists of relatively short triple-helical regions about 60 nm long separated by globular domains about 40 nm long. Fibrils of pure type VI collagen thus give the impression of beads on a string.

In some places, several ECM components are organized into a basal lamina, a thin sheetlike structure. Type IV collagen forms the basic fibrous two-dimensional network of all basal laminae. Three type IV collagen chains form a 400-nm-long triple helix with large globular domains at the C-termini and smaller ones of unknown structure at the N-termini. The helical segment is unusual in that the Gly-X-Y sequences are interrupted about 24 times with segments that cannot form a triple helix; these nonhelical regions introduce flexibility into the molecule (Figure 22-17a). Lateral association of the N-terminal regions of four type IV molecules yields a characteristic tetrameric unit that can be observed in the electron microscope (Figure 22-17b). Triple-helical regions from several molecules then associate laterally, in a manner similar to fibril formation among fibrous collagens, to form branching strands of variable but thin diameters. These interactions, together with those between the C-terminal globular domains and the triple helices in adjacent type IV molecules, generate an irregular two-dimensional fibrous network (Figure 22-17b). We will discuss the other components of the basal lamina and the functions of this specialized matrix structure in the next section.

Figure 22-17. Structure and assembly of type IV collagen. (a) Schematic diagram of 400-nm-long triple-helical molecule of type IV collagen. This molecule has a noncollagen domain at the N-terminus and a large globular domain at the C-terminus; the triple helix is interrupted (more...)


  •  All 16 types of collagen contain a repeating Gly-Pro-X sequence and fold into a characteristic triple-helical structure.
  •  The various collagens are distinguished by the ability of their helical and nonhelical regions to associate into fibrils, to form sheets, or to cross-link different collagen types.
  •  Most collagen is fibrillar and composed of type I molecules. A two-dimensional network of type IV collagen is unique to the basal lamina.
  •  Fibrous collagen molecules (e.g., types I, II, and III) assemble into fibrils that are stabilized by covalent aldol cross-links (see Figure 22-11).
  •  Procollagen chains are modified and assembled into a triple helix in the ER (see Figure 22-14). Helix formation is aided by disulfide bonds between N- and C-terminal propeptides, which align the polypeptide chains in register. Generally, the propeptides are removed after secretion, and then collagen fibrils form in the extracellular space.
  •  Fibrous collagen has specific structural requirements and is very susceptible to mutation, especially in glycine residues. Because mutant collagen chains can affect the function of wild-type ones, such mutations have a dominant phenotype.



* In collagen nomenclature, the collagen type is in roman numerals and is enclosed in parentheses.

Copyright © 2000, W. H. Freeman and Company.
Bookshelf ID: NBK21582

