Molecular Biology of the Cell
3rd edition
Bruce Alberts,1 Dennis Bray,2 Julian Lewis,3 Martin Raff,4 Keith Roberts,5 and James D Watson6
1University of California, San Francisco, USA
2Department of Zoology, University of Cambridge, Cambridge, England
3Imperial Cancer Research Fund Developmental Biology Unit, University of Oxford, England
4MRC Laboratory for Molecular Cell Biology and Biology Department, University College London, England
5Department of Cell Biology, John Innes Institute, Norwich, England
6Cold Spring Harbor Laboratory, USA
Garland Publishing, Inc., 1994. ISBN 0-8153-1619-4
cell biology; molecular biology

 Chapter 3:  Macromolecules: Structure, Shape, and Information


Introduction

In moving from the small molecules of the cell to the giant macromolecules, we encounter a transition of more than size alone. Even though proteins, nucleic acids, and polysaccharides are made from a limited repertoire of amino acids, nucleotides, and sugars, respectively, they can have unique and truly astounding properties that bear little resemblance to those of their simple chemical precursors. Biological macromolecules are composed of many thousands, sometimes millions, of atoms linked together in precisely defined spatial arrangements. Each of these macromolecules carries specific information. Incorporated in its structure is a series of biological messages that can be "read" in its interactions with other molecules, enabling it to perform a precise function.

In this chapter we examine the structures of macromolecules, emphasizing proteins and nucleic acids, and explain how they have adapted in the course of evolution to perform specific functions. We consider the principles by which these molecules catalyze chemical transformations, build complex multimolecular structures, generate movement, and, most fundamental of all, store and transmit hereditary information.

Molecular Recognition Processes 1

Introduction

Figure 3-1. The size of protein molecules compared to some other cell components.

The ribosome is an important macromolecular assembly composed of about 60 protein and RNA molecules.

Table 3-1

Approximate Chemical Compositions of a Typical Bacterium and a Typical Mammalian Cell

                                                    Percent of Total Cell Weight
Component                                           E. coli Bacterium    Mammalian Cell
H2O                                                 70                   70
Inorganic ions (Na+, K+, Mg2+, Ca2+, Cl-, etc.)     1                    1
Miscellaneous small metabolites                     3                    3
Proteins                                            15                   18
RNA                                                 6                    1.1
DNA                                                 1                    0.25
Phospholipids                                       2                    3
Other lipids                                        -                    2
Polysaccharides                                     2                    2

Total cell volume:                                  2 x 10^-12 cm^3      4 x 10^-9 cm^3
Relative cell volume:                               1                    2000

Proteins, polysaccharides, DNA, and RNA are macromolecules. Lipids are not generally classed as macromolecules even though they share some of their features; for example, most are synthesized as linear polymers of a smaller molecule (the acetyl group on acetyl CoA), and they self-assemble into larger structures (membranes). Note that water and protein comprise most of the mass of both mammalian and bacterial cells.

Macromolecules typically have molecular weights between about 10,000 and 1 million and are intermediate in size between the organic molecules of the cell discussed in Chapter 2 and the large macromolecular assemblies and organelles that will be discussed in subsequent chapters (Figure 3-1). One small molecule, water, constitutes 70% of the total mass of a cell; nearly all of the remaining cell mass is due to macromolecules (Table 3-1).

As described in Chapter 2, a macromolecule is assembled from low-molecular-weight subunits that are repeatedly added to one end to form a long, chainlike polymer. Usually only one family of subunits is used to construct each chain: amino acids are linked to other amino acids to form proteins, nucleotides are linked to other nucleotides to form nucleic acids, and sugars are linked to other sugars to form polysaccharides. Because the precise sequence of subunits is crucial to the function of a macromolecule, its biosynthesis requires mechanisms to ensure that the correct subunit goes into the polymer at each position in the chain.

The Specific Interactions of a Macromolecule Depend on Weak, Noncovalent Bonds 2

A macromolecular chain is held together by covalent bonds, which are strong enough to preserve the sequence of subunits for long periods of time. Although the sequence of subunits determines the information content of a macromolecule, utilizing that information depends largely on much weaker, noncovalent bonds. These weak bonds form between different parts of the same macromolecule and between different macromolecules. They therefore play a major part in determining both the three-dimensional structure of macromolecular chains and how these structures interact with one another.

The noncovalent bonds encountered in biological molecules are usually classified into three types: ionic bonds, hydrogen bonds, and van der Waals attractions. Another important weak force is created by the three-dimensional structure of water, which forces exposed hydrophobic groups together in order to minimize their disruptive effect on the hydrogen-bonded network of water molecules (see Panel 2-1, pp. 48-49). This expulsion from the aqueous solution generates what is sometimes thought of as a fourth kind of weak, noncovalent bond. These four types of weak attractive forces are the subject of Panel 3-1, pages 92-93.

Table 3-2

Covalent and Noncovalent Chemical Bonds

                                                        Strength (kcal/mole)*
Bond Type                             Length (nm)       In Vacuum    In Water
Covalent                              0.15              90           90
Ionic                                 0.25              80           3
Hydrogen                              0.30              4            1
van der Waals attraction (per atom)   0.35              0.1          0.1

* The strength of a bond can be measured by the energy required to break it, here given in kilocalories per mole (kcal/mole). (One kilocalorie is the quantity of energy needed to raise the temperature of 1000 g of water by 1°C. An alternative unit in wide use is the kilojoule, kJ, equal to 0.24 kcal.) Individual bonds vary a great deal in strength, depending on the atoms involved and their precise environment, so that the above values are only a rough guide. Note that the aqueous environment in a cell will greatly weaken both the ionic and the hydrogen bonds between nonwater molecules (Panel 3-1, pp. 92-93). The bond length is the center-to-center distance between the two interacting atoms; the length given here for a hydrogen bond is that between its two nonhydrogen atoms.

Figure 3-2. Comparative energies of some important molecular events in cells.

Note that energy is displayed on a logarithmic scale.

Figure 3-3. Noncovalent bonds.

How weak bonds mediate recognition between macromolecules.

In an aqueous environment each noncovalent bond is 30 to 300 times weaker than the typical covalent bonds that hold biological molecules together (Table 3-2) and only slightly stronger than the average energy of thermal collisions at 37°C (Figure 3-2). A single noncovalent bond - unlike a single covalent bond - is therefore too weak to withstand the thermal motions that tend to pull molecules apart. Large numbers of noncovalent bonds are needed to hold two molecular surfaces together, and these can form between two surfaces only when large numbers of atoms on the surfaces are precisely matched to each other (Figure 3-3). The exacting requirements for matching account for the specificity of biological recognition, such as occurs between an enzyme and its substrate.
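
To put the in-water values of Table 3-2 on the same scale as thermal motion, the following Python sketch expresses each bond strength as a multiple of RT at 37°C. The gas constant and the choice of RT (about 0.6 kcal/mole) as a rough yardstick for thermal energy are assumptions introduced here, not figures from the text, and the function names are purely illustrative.

```python
# Gas constant in kcal/(mole*K); a standard value, not given in the text.
R_KCAL = 1.987e-3
T_BODY = 310.15  # 37 degrees C expressed in kelvin

# In-water bond strengths (kcal/mole) copied from Table 3-2.
BOND_STRENGTH_IN_WATER = {
    "covalent": 90.0,
    "ionic": 3.0,
    "hydrogen": 1.0,
    "van der Waals (per atom)": 0.1,
}


def thermal_energy(temp_k=T_BODY):
    """Return RT in kcal/mole, used here as a rough thermal energy scale."""
    return R_KCAL * temp_k


def in_thermal_units(strength_kcal, temp_k=T_BODY):
    """Express a bond strength as a multiple of RT."""
    return strength_kcal / thermal_energy(temp_k)


if __name__ == "__main__":
    for name, strength in BOND_STRENGTH_IN_WATER.items():
        print(f"{name:26s} {in_thermal_units(strength):7.1f} x RT")
```

On this scale a covalent bond is well over a hundred times RT, whereas a single hydrogen bond in water is only a little above RT, which is why many weak bonds must act together to hold two surfaces in contact.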

Figure 3-4. Steric limitations on the bond angles in a polypeptide chain.

(A) Each amino acid contributes three bonds (colored red) to its polypeptide chain. The peptide bond is planar (gray shading) and does not permit rotation. By contrast, rotation can occur about the Cα-C bond, whose angle of rotation is called psi (ψ), and about the N-Cα bond, whose angle of rotation is called phi (φ). The R group denotes an amino acid side chain. (B) The conformation of the main-chain atoms in a protein is determined by one pair of phi and psi angles for each amino acid; because of steric collisions within each amino acid, most pairs of phi and psi angles do not occur. In this so-called Ramachandran plot, each dot represents an observed pair of angles in a protein. (B, from J. Richardson, Adv. Prot. Chem. 34:174-175, 1981.)

As explained at the top of Panel 3-1, atoms behave almost as if they were hard spheres with a definite radius (their van der Waals radius). The requirement that no two atoms overlap limits the possible bond angles in a polypeptide chain (Figure 3-4). These and other steric interactions severely constrain the number of three-dimensional arrangements of atoms (or conformations) that are possible. Nevertheless, a long flexible chain such as a protein can still fold in an enormous number of ways. Each conformation will have a different set of weak intrachain interactions, and it is the total strength of these interactions that determines which conformations will form.

Most proteins in a cell fold stably in only one way: during the course of evolution the sequence of amino acid subunits in each protein has been selected so that one conformation is able to form many more favorable intrachain interactions than any other.

A Helix Is a Common Structural Motif in Biological Structures Made from Repeated Subunits 3

Figure 3-5. A helix will form when a series of subunits bind to each other in a regular way.

In the foreground the interaction between two subunits is shown; behind it are the helices that result. These helices have two (A), three (B), and six (C and D) subunits per turn. At the top, the arrangement of subunits has been photographed from directly above the helix. Note that the helix in (D) has a wider path than that in (C).

Figure 3-6. Comparison of a left-handed and a right-handed helix.

As a reference, it is useful to remember that standard screws, which insert when turned clockwise, are right-handed. Note that a helix preserves the same handedness when it is turned upside down.

Biological structures are often formed by linking subunits that are very similar to each other - such as amino acids or nucleotides - into a long, repetitive chain. If all the subunits are identical, neighboring subunits in the chain will often fit together in only one way, adjusting their relative positions so as to minimize the free energy of the contact between them. In this case, each subunit will be positioned in exactly the same way in relation to its neighboring subunits, so that subunit 3 will fit onto subunit 2 in the same way that subunit 2 fits onto subunit 1, and so on. Because it is very rare for subunits to join up in a straight line, this arrangement will generally result in a helix - a regular structure that resembles a spiral staircase, as illustrated in Figure 3-5. Depending on the twist of the staircase, a helix is said to be either right-handed or left-handed (Figure 3-6). Handedness is not affected by turning the helix upside down, but it is reversed if the helix is reflected in a mirror.

Helices occur commonly in biological structures, whether the subunits are small molecules that are covalently linked together (as in DNA) or large protein molecules that are linked by noncovalent forces (as in actin filaments). This is not surprising. A helix is an unexceptional structure, generated simply by placing many similar subunits next to each other, each in the same strictly repeated relationship to the one before.

Diffusion Is the First Step to Molecular Recognition 4

Figure 3-7. A random walk.

Molecules in solution move in a random fashion due to the continual buffeting they receive in collisions with other molecules. This movement allows small molecules to diffuse from one part of the cell to another in a surprisingly short time: such molecules will generally diffuse across a typical animal cell in less than a second.

Before two molecules can bind to each other, they must come into close contact. This is achieved by the thermal motions that cause molecules to wander, or diffuse, from their starting positions. As the molecules in a liquid rapidly collide and bounce off one another, an individual molecule moves first one way and then another, its path constituting a "random walk" (Figure 3-7). The average distance that each type of molecule travels from its starting point is proportional to the square root of the time involved: that is, if it takes a particular molecule 1 second on average to go 1 µm, it will go 2 µm in 4 seconds, 10 µm in 100 seconds, and so on. Diffusion is therefore an efficient way for molecules to move limited distances but an inefficient way for molecules to move long distances.
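
The square-root relationship can be checked numerically. The Python sketch below simulates many one-dimensional random walks with unit steps, one step per time unit, and reports the root-mean-square distance from the start. The one-dimensional model, the use of the root-mean-square as the measure of "average distance", and all names are simplifying assumptions made for illustration.

```python
import math
import random


def rms_displacement(n_steps, n_walkers=2000, seed=0):
    """Root-mean-square end displacement of 1-D unit-step random walks."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    total_sq = 0
    for _ in range(n_walkers):
        position = 0
        for _ in range(n_steps):
            position += rng.choice((-1, 1))  # step left or right at random
        total_sq += position * position
    return math.sqrt(total_sq / n_walkers)


if __name__ == "__main__":
    for n in (25, 100, 400):
        print(f"{n:4d} steps: rms distance {rms_displacement(n):6.2f}, "
              f"sqrt(steps) = {math.sqrt(n):5.1f}")
```

Quadrupling the number of steps should roughly double the distance covered, in line with the 1 µm in 1 second, 2 µm in 4 seconds pattern described above.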

Figure 3-8. Macromolecules in the cell cytoplasm.

The drawing is approximately to scale and emphasizes the crowding in the cytoplasm. Only the macromolecules are shown: RNAs are shown in blue, ribosomes in green, and proteins in red. Macromolecules diffuse relatively slowly in the cytoplasm because they interact with many other macromolecules; small molecules, by contrast, diffuse nearly as rapidly as they do in water. (Adapted from D.S. Goodsell, Trends in Biochem. Sci. 16:203-206, 1991.)

Experiments performed by injecting fluorescent dyes and other labeled molecules into cells show that the diffusion of small molecules through the cytoplasm is nearly as rapid as it is in water. A molecule the size of ATP, for example, requires only about 0.2 second to diffuse an average distance of 10 µm - the diameter of a small animal cell. Large macromolecules, however, move much more slowly. Not only is their diffusion rate intrinsically slower, but their movement is retarded by frequent collisions with many other macromolecules that are held in place by molecular associations in the cytoplasm (Figure 3-8).
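
Because distance grows only as the square root of time, the time needed grows as the square of the distance. The Python sketch below scales from the ATP figure quoted above (about 0.2 second for 10 µm) to other distances. It assumes that the same scaling holds unchanged over every distance considered, which is an idealization, and the function name is hypothetical.

```python
def diffusion_time(distance_um, ref_distance_um=10.0, ref_time_s=0.2):
    """Estimate the time (seconds) to diffuse a given average distance.

    Scales from a reference point (by default the ATP figure quoted in the
    text) using time proportional to distance squared.
    """
    return ref_time_s * (distance_um / ref_distance_um) ** 2


if __name__ == "__main__":
    for d in (1, 10, 100, 1000):
        print(f"{d:5d} um -> {diffusion_time(d):10.3f} s")
```

Under these assumptions a 1 mm journey would take about 2000 seconds, roughly half an hour, which illustrates why diffusion is inefficient over long distances.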

Thermal Motions Bring Molecules Together and Then Pull Them Apart 4

Encounters between two macromolecules or between a macromolecule and a small molecule occur randomly through simple diffusion. An encounter may lead immediately to the formation of a complex between the two molecules, in which case the rate of complex formation is said to be diffusion-limited. Alternatively, the rate of complex formation may be slower, requiring some adjustment of the structure of one or both molecules before the interacting surfaces can fit together, so that most often the two colliding molecules will bounce off each other without sticking. In either case, once the two interacting surfaces have come sufficiently close together, they will form multiple weak bonds with each other that persist until random thermal motions cause the molecules to dissociate again (see Figure 3-3).

In general, the stronger the binding of the molecules in the complex, the slower their rate of dissociation. At one extreme the total energy of the bonds formed is negligible compared with that of thermal motion, and the two molecules dissociate as rapidly as they came together. At the other extreme the total bond energy is so high that dissociation rarely occurs. Strong interactions occur in cells whenever a biological function requires that two macromolecules remain tightly associated for a long time - for example, when a gene regulatory protein binds to DNA to turn off a gene. Weaker interactions occur when the function demands a rapid change in the structure of a complex - for example, when two interacting proteins change partners during the movements of a protein machine.

The Equilibrium Constant Is a Measure of the Strength of an Interaction Between Two Molecules 5

Figure 3-9. The principle of equilibrium.

The equilibrium between molecules A and B and the complex AB is maintained by a balance between the two opposing reactions shown in (1) and (2). As shown in (3), the ratio of the rate constants for the association and the dissociation reactions is equal to the equilibrium constant (K) for the reaction. Molecules A and B must collide in order to react, and the rate in reaction (2) is therefore proportional to the product of their individual concentrations. As a result, the product [A] x [B] appears in the final expression for K, where [ ] indicates concentration.

As traditionally defined, the concentrations of products appear in the numerator and the concentrations of reactants appear in the denominator of the equation for an equilibrium constant. Thus the equilibrium constant in (3) is that for the association reaction A + B → AB. For simple binding interactions this constant is called the affinity constant or association constant (in units of liters per mole); the larger the value of the association constant (Ka), the stronger is the binding between A and B. The reciprocal of Ka is the dissociation constant (in units of moles per liter); the smaller the value of the dissociation constant (Kd), the stronger is the binding between A and B.

The precise strength of the bonding between two molecules is a useful index of the specificity of their interaction. To illustrate how the binding strength is measured, let us consider a reaction in which molecule A binds to molecule B. The reaction will proceed until it reaches an equilibrium point, at which the rates of formation and dissociation are equal (Figure 3-9). The concentrations of A, B, and the complex AB at this point can be used to determine an equilibrium constant (K) for the reaction, as explained in Figure 3-9. This constant is sometimes termed the affinity constant and is commonly employed as a measure of the strength of binding between two molecules: the stronger the binding, the larger is the value of the affinity constant.
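
As a worked illustration of these definitions, the Python sketch below computes the association and dissociation constants from equilibrium concentrations, and the equilibrium constant from a pair of rate constants. The concentrations and rate constants used are invented for the example and do not come from the text; the function names are likewise hypothetical.

```python
def association_constant(conc_a, conc_b, conc_ab):
    """Ka = [AB] / ([A] x [B]), in liters/mole for molar concentrations."""
    return conc_ab / (conc_a * conc_b)


def dissociation_constant(conc_a, conc_b, conc_ab):
    """Kd = [A] x [B] / [AB], the reciprocal of Ka, in moles/liter."""
    return (conc_a * conc_b) / conc_ab


def k_from_rates(k_association, k_dissociation):
    """Equilibrium constant as the ratio of the two rate constants."""
    return k_association / k_dissociation


if __name__ == "__main__":
    # Hypothetical equilibrium concentrations in moles/liter.
    a, b, ab = 1e-6, 1e-6, 1e-4
    print("Ka =", association_constant(a, b, ab), "liters/mole")
    print("Kd =", dissociation_constant(a, b, ab), "moles/liter")
    # Hypothetical rate constants chosen to give the same Ka.
    print("K from rates =", k_from_rates(1e7, 0.1))
```

A large Ka and a small Kd describe the same tight interaction from opposite directions, so either can be quoted as long as the units make clear which one is meant.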

Table 3-3

The Relationship Between Free-Energy Differences and Equilibrium Constants

The equilibrium constant of a reaction in which two molecules bind to each other is related directly to the standard free-energy change for the binding (ΔG°) by the equation described in Table 3-3. The table also lists the ΔG° values corresponding to a range of K values. Affinity constants for simple binding interactions in biological systems often range between 10^3 and 10^12 liters/mole; this corresponds to binding energies in the range 4-17 kcal/mole, which could arise from 4 to 17 average hydrogen bonds.
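
The range quoted above can be reproduced with the standard thermodynamic relation ΔG° = -RT ln K, which is presumably the equation that Table 3-3 presents. The Python sketch below evaluates it at 37°C; the gas constant value and the temperature are assumptions supplied here, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mole*K); standard value


def standard_free_energy(k_assoc, temp_k=310.15):
    """Standard free-energy change (kcal/mole) from delta G = -RT ln K."""
    return -R_KCAL * temp_k * math.log(k_assoc)


def equilibrium_constant(delta_g, temp_k=310.15):
    """Inverse relation: K = exp(-delta G / RT)."""
    return math.exp(-delta_g / (R_KCAL * temp_k))


if __name__ == "__main__":
    for exponent in (3, 6, 9, 12):
        k = 10.0 ** exponent
        print(f"K = 10^{exponent:<2d} liters/mole -> "
              f"delta G = {standard_free_energy(k):6.1f} kcal/mole")
```

Each tenfold increase in K corresponds to about 1.4 kcal/mole of additional binding energy at this temperature, so the span from 10^3 to 10^12 liters/mole covers roughly 4 to 17 kcal/mole.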

Atoms and Molecules Move Very Rapidly 6

The chemical reactions in a cell occur at amazingly fast rates. A typical enzyme molecule, for example, will catalyze on the order of 1000 reactions per second, and rates of more than 10^6 reactions per second are achieved by some enzymes. Since each reaction requires a separate encounter between an enzyme and a substrate molecule, such rates are possible only because the molecules are moving so rapidly. Molecular motions can be classified broadly into three kinds: (1) the movement of a molecule from one place to another (translational motion), (2) the rapid back-and-forth movement of covalently linked atoms with respect to one another (vibrations), and (3) rotations. All of these motions are important in bringing the surfaces of interacting molecules together.

The rates of molecular motions can be measured by a variety of spectroscopic techniques. These indicate that a large globular protein is constantly tumbling, rotating about its axis about a million times per second. The rates of diffusional encounters due to translational movements are proportional to the concentration of the diffusing molecule. If ATP is present at its typical intracellular concentration of about 1 mM, for instance, each site on a protein molecule will be bombarded by about 10^6 random collisions with ATP molecules per second; for an ATP concentration tenfold lower, the number of collisions would drop to 10^5 per second, and so on.

Once two molecules have collided and are in the correct relative orientation, a chemical reaction can occur between them extremely rapidly. When one appreciates how quickly molecules move and react, the observed rates of enzymatic catalysis do not seem so amazing.

Molecular Recognition Processes Can Never Be Perfect 7

All molecules possess energy - the kinetic energy of their translational movements, vibrations, and rotations and the potential energy stored in their electron distributions. Through molecular collisions this energy is randomly distributed to all of the atoms present, so that most atoms will have energy levels close to the average, with only a small proportion possessing very high energy. Although the favored conformations or states for a molecule will be those of lowest free energy (see p. 75), states of higher energy occur through unusually violent collisions. Given the temperature, it is possible to calculate the probability that an atom or a molecule will be in a particular energy state (see Table 3-3). The probability of a high-energy state becomes smaller relative to a low-energy state as the difference in free energy between the two increases. It reaches zero, however, only when this energy difference becomes infinite.
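
The dependence of probability on energy difference can be sketched with the Boltzmann relation, in which the probability of the higher-energy state relative to the lower one is exp(-ΔG/RT). This is assumed to be the calculation the paragraph alludes to; the constants and the function name below are supplied for illustration only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mole*K); standard value


def relative_probability(delta_g, temp_k=310.15):
    """Probability of the higher-energy state relative to the lower one.

    delta_g is the free-energy difference in kcal/mole (higher minus lower).
    """
    return math.exp(-delta_g / (R_KCAL * temp_k))


if __name__ == "__main__":
    for dg in (0.0, 1.0, 4.0, 17.0, 90.0):
        print(f"{dg:5.1f} kcal/mole -> {relative_probability(dg):.3e}")
```

The ratio falls steeply as the energy difference grows but remains greater than zero for any finite difference, which is the quantitative basis for saying that no recognition process can be perfect.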

Because of the random factor in molecular interactions, minor "side reactions" are bound to occur occasionally. As a consequence, a cell continually makes errors. Even reactions that are very energetically unfavorable will take place occasionally. Two atoms joined to each other by a covalent bond, for example, will eventually be subjected to an especially energetic collision and fall apart. Similarly, the specificity of an enzyme for its substrate cannot be absolute because the recognition of one molecule as distinct from another can never be perfect. Mistakes could be avoided completely only if the cell could evolve mechanisms with infinite energy differences between alternatives. Since this is not possible, cells are forced to tolerate a certain level of failure and have instead evolved a variety of repair reactions to correct those errors that are the most damaging.

On the other hand, errors are essential to life as we know it. If it were not for occasional mistakes in the maintenance of DNA sequences, evolution could not occur.

Summary

The sequence of subunits in a macromolecule contains information that determines the three-dimensional contours of its surface. These contours in turn govern the recognition between one molecule and another, or between different parts of the same molecule, by means of weak, noncovalent bonds. The attractive forces are of four types: ionic bonds, van der Waals attractions, hydrogen bonds, and an interaction between nonpolar groups caused by their hydrophobic expulsion from water. Two molecules will recognize each other by a process in which they meet by random diffusion, stick together for a while, and then dissociate. The strength of this interaction is generally expressed in terms of an equilibrium constant. Since the only way to make recognition infallible is to make the energy of binding infinitely large, living cells constantly make errors; those that are intolerable are corrected by specific repair processes.

Nucleic Acids 8

Genes Are Made of DNA 9

It has been obvious for as long as humans have sown crops or raised animals that each seed or fertilized egg must contain a hidden plan, or design, for the development of the organism. In modern times the science of genetics grew up around the premise of invisible information-containing elements, called genes, that are distributed to each daughter cell when a cell divides. Therefore, before dividing, a cell has to make a copy of its genes in order to give a complete set to each daughter cell. The genes in the sperm and egg cells carry the hereditary information from one generation to the next.

The inheritance of biological characteristics must involve patterns of atoms that follow the laws of physics and chemistry: in other words, genes must be formed from molecules. At first the nature of these molecules was hard to imagine. What kind of molecule could be stored in a cell and direct the activities of a developing organism and also be capable of accurate and almost unlimited replication?

By the end of the nineteenth century biologists had recognized that the carriers of inherited information were the chromosomes that become visible in the nucleus as a cell begins to divide. But the evidence that the deoxyribonucleic acid (DNA) in these chromosomes is the substance of which genes are made came only much later, from studies on bacteria. In 1944 it was shown that adding purified DNA from one strain of bacteria to a second, slightly different bacterial strain conferred heritable properties characteristic of the first strain upon the second. Because it had been commonly believed that only proteins have enough conformational complexity to carry genetic information, this discovery came as a surprise, and it was not generally accepted until the early 1950s. Today the idea that DNA carries genetic information in its long chain of nucleotides is so fundamental to biological thought that it is sometimes difficult to realize the enormous intellectual gap that it filled.

DNA Molecules Consist of Two Long Chains Held Together by Complementary Base Pairs 10

The difficulty that geneticists had in accepting DNA as the substance of genes is understandable, considering the simplicity of its chemistry. A DNA chain is a long, unbranched polymer composed of only four types of subunits. These are the deoxyribonucleotides containing the bases adenine (A), cytosine (C), guanine (G), and thymine (T). The nucleotides are linked together by covalent phosphodiester bonds that join the 5' carbon of one deoxyribose group to the 3' carbon of the next (see Panel 2-6, pp. 58-59). The four kinds of bases are attached to this repetitive sugar-phosphate chain almost like four kinds of beads strung on a necklace.

How can a long chain of nucleotides encode the instructions for an organism or even a cell? And how can these messages be copied from one generation of cells to the next? The answers lie in the structure of the DNA molecule.

Early in the 1950s x-ray diffraction analyses of specimens of DNA pulled into fibers suggested that the DNA molecule is a helical polymer composed of two strands. The helical structure of DNA was not surprising since, as we have seen, a helix will often form if each of the neighboring subunits in a polymer is regularly oriented. But the finding that DNA is two-stranded was of crucial significance. It provided the clue that led, in 1953, to the construction of a model that fitted the observed x-ray diffraction pattern and thereby solved the puzzle of DNA structure and function.

Figure 3-10. The DNA double helix.

(A) A short section of the helix viewed from its side. Four complementary base pairs are shown. The bases are shown in green, while the deoxyribose sugars are blue. (B) The helix viewed from an end. Note that the two DNA strands run in opposite directions and that each base pair is held together by either two or three hydrogen bonds (see also Panel 3-2, pp. 100-101).

An essential feature of the model was that all of the bases of the DNA molecule are on the inside of the double helix, with the sugar phosphates on the outside. This demands that the bases on one strand be extremely close to those on the other, and the fit proposed required specific base-pairing between a large purine base (A or G, each of which has a double ring) on one chain and a smaller pyrimidine base (T or C, each of which has a single ring) on the other chain (Figure 3-10).

Both evidence from earlier biochemical experiments and conclusions derived from model building suggested that complementary base pairs (also called Watson-Crick base pairs) form between A and T and between G and C. Biochemical analyses of DNA preparations from different species had shown that, although the nucleotide composition of DNA varies a great deal (for example, from 13% A residues to 36% A residues in the DNA of different types of bacteria), there is a general rule that quantitatively [G] = [C] and [A] = [T]. Model building revealed that the numbers of effective hydrogen bonds that could be formed between G and C or between A and T were greater than for any other combinations (see Panel 3-2, pp. 100-101). The double-helical model for DNA thus neatly explained the quantitative biochemistry.
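
The way complementary pairing accounts for the [G] = [C] and [A] = [T] rule can be demonstrated with a few lines of Python. The sketch below builds the partner strand for an arbitrary, made-up sequence and counts the bases over both strands of the duplex; the sequence and the function names are hypothetical.

```python
from collections import Counter

# Watson-Crick partners: A pairs with T, and G pairs with C.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the complementary strand, written in the opposite direction."""
    return "".join(PARTNER[base] for base in reversed(strand))


def duplex_composition(strand):
    """Count each base over both strands of the double helix."""
    return Counter(strand) + Counter(partner_strand(strand))


if __name__ == "__main__":
    strand = "ATGCGGATTTAC"  # arbitrary illustrative sequence
    counts = duplex_composition(strand)
    print(dict(counts))
    print("A == T:", counts["A"] == counts["T"])
    print("G == C:", counts["G"] == counts["C"])
```

Whatever the composition of a single strand, every A in one strand is matched by a T in the other and every G by a C, so the totals for the duplex are always equal in pairs.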

The Structure of DNA Provides an Explanation for Heredity 11

A gene carries biological information in a form that must be precisely copied and transmitted from each cell to all of its progeny. The implications of the discovery of the DNA double helix were profound because the structure immediately suggested how information transfer could be accomplished. Since each strand contains a nucleotide sequence that is exactly complementary to the nucleotide sequence of its partner strand, both strands actually carry the same genetic information. If we designate the two strands A and A', strand A can serve as a mold or template for making a new strand A', while strand A' can serve in the same way to make a new strand A. Thus genetic information can be copied by a process in which strand A separates from strand A' and each separated strand then serves as a template for the production of a new complementary partner strand.

As a direct consequence of the base-pairing mechanism, it becomes evident that DNA carries information by means of the linear sequence of its nucleotides. Each nucleotide - A, C, T, or G - can be considered a letter in a four-letter alphabet that is used to write out biological messages in a linear "ticker-tape" form. Organisms differ because their respective DNA molecules carry different nucleotide sequences and therefore different biological messages.

Figure 3-11. The DNA sequence of the human β-globin gene.

The gene encodes one of the two subunits of the hemoglobin molecule, which carries oxygen in the blood. Only one of the two DNA strands is shown (the "coding strand"), since the other strand has a precisely complementary sequence. The sequence should be read from left to right in successive lines down the page, as if it were normal English text.

Since the number of possible sequences in a DNA chain n nucleotides long is 4^n, the biological variety that could in principle be generated using even a modest length of DNA is enormous. A typical animal cell contains a meter of DNA (3 x 10^9 nucleotides). Written in a linear alphabet of four letters, an unusually small human gene would occupy a quarter of a page of text (Figure 3-11), while the genetic information carried in a human cell would fill a book of more than 500,000 pages.
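
The arithmetic behind these figures is easy to reproduce. The Python sketch below counts the possible sequences of a given length and estimates the number of printed pages needed for 3 x 10^9 nucleotides. The value of 5000 letters per page is an assumption chosen for illustration, not a figure from the text, and the function names are hypothetical.

```python
def possible_sequences(n):
    """Number of distinct DNA sequences n nucleotides long (4 choices each)."""
    return 4 ** n


def pages_needed(n_nucleotides, letters_per_page=5000):
    """Pages required to print a sequence; letters_per_page is an assumed value."""
    return -(-n_nucleotides // letters_per_page)  # ceiling division


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"n = {n:3d}: {possible_sequences(n):.3e} possible sequences")
    print("pages for 3 x 10^9 nucleotides:", pages_needed(3 * 10**9))
```

With the assumed page size the total comes to 600,000 pages, consistent with the estimate of more than 500,000 given above.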

Figure 3-12. DNA synthesis

The addition of a deoxyribonucleotide to the 3' end of a polynucleotide chain is the fundamental reaction by which DNA is synthesized. As shown, base-pairing between this incoming deoxyribonucleotide and an existing strand of DNA (the template strand) guides the formation of a new strand of DNA with a complementary nucleotide sequence.

Although the principle underlying gene replication is both elegant and simple, the actual machinery by which this copying is carried out in the cell is complicated and involves a complex of proteins that form a "replication machine." The fundamental reaction is that shown in Figure 3-12, in which the enzyme DNA polymerase catalyzes the addition of a deoxyribonucleotide to the 3' end of a DNA chain. Each nucleotide added to the chain is a deoxyribonucleoside triphosphate; the release of pyrophosphate from this activated nucleotide and its subsequent hydrolysis provide the energy for the DNA replication reaction and make it effectively irreversible.

Figure 3-13. The semiconservative replication of DNA

In each round of replication each of the two strands of DNA is used as a template for the formation of a complementary DNA strand. The original strands therefore remain intact through many cell generations.

Replication of the DNA helix begins with the local separation of its two complementary DNA strands. Each strand then acts as a template for the formation of a new DNA molecule by the sequential addition of deoxyribonucleoside triphosphates. The nucleotide to be added at each step is selected by a process that requires it to form a complementary base pair with the next nucleotide in the parental template strand, thereby generating a new DNA strand that is complementary in sequence to the template strand (see Figure 3-12). Eventually, the genetic information is duplicated in its entirety, so that two complete DNA double helices are formed, each identical in nucleotide sequence to the parental DNA helix that served as the template. Since each daughter DNA molecule ends up with one of the original strands plus one newly synthesized strand, the mechanism of DNA replication is said to be semiconservative (Figure 3-13).
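
Semiconservative replication can be followed through several generations with a toy simulation. This is a sketch under simplifying assumptions, not a model of the replication machinery: the sequence is made up, whole strands are copied in one step, and the `Strand` class and function names are hypothetical.

```python
from dataclasses import dataclass

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class Strand:
    sequence: str   # written 5'->3'
    made_in: int    # replication round that produced it (0 = original)


def copy_from(template: Strand, round_number: int) -> Strand:
    """Synthesize the complementary partner of a template strand."""
    seq = "".join(PAIR[b] for b in reversed(template.sequence))
    return Strand(seq, round_number)


def replicate(helix: tuple[Strand, Strand], round_number: int):
    """Separate the two strands and pair each with a newly made partner."""
    a, a_prime = helix
    return [(a, copy_from(a, round_number)),
            (copy_from(a_prime, round_number), a_prime)]


original_a = Strand("ATGGTGCACCTG", 0)              # hypothetical sequence
helices = [(original_a, copy_from(original_a, 0))]  # the parental helix

for round_number in range(1, 4):                    # three rounds of replication
    helices = [d for h in helices for d in replicate(h, round_number)]

# Helices that still contain one of the two original strands.
with_original = [h for h in helices if any(s.made_in == 0 for s in h)]
print(len(helices), len(with_original))
```

However many rounds are run, exactly two helices carry an original strand, and every helix has the same pair of sequences as the parent.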

Errors in DNA Replication Cause Mutations 12

One of the most impressive features of DNA replication is its accuracy. Several proofreading mechanisms are used to eliminate incorrectly positioned nucleotides; as a result, the sequence of nucleotides in a DNA molecule is copied with fewer than one mistake in 10^9 nucleotides added. Very rarely, however, the replication machinery skips or adds a few nucleotides, or puts a T where it should have put a C, or an A instead of a G. Any change of this kind in the DNA sequence constitutes a genetic mistake, called a mutation, which will be copied in all future cell generations since "wrong" DNA sequences are copied as faithfully as "correct" ones. The consequence of such an error can be great, for even a single nucleotide change can have important effects on the cell, depending on where the mutation has occurred.
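
To get a feel for this error rate, it can be combined with the genome size quoted earlier for a typical animal cell (3 × 10^9 nucleotides). The sketch below is a back-of-the-envelope bound only; it assumes the rate applies uniformly across the whole genome, which the text does not state.

```python
from fractions import Fraction

GENOME_NUCLEOTIDES = 3 * 10**9        # nucleotides copied per replication (from the text)
MAX_ERROR_RATE = Fraction(1, 10**9)   # "fewer than one mistake in 10^9 nucleotides"

# Upper bound on the expected number of mistakes each time the genome is copied.
max_expected_mistakes = GENOME_NUCLEOTIDES * MAX_ERROR_RATE
print(max_expected_mistakes)
```

Exact fractions are used instead of floating-point numbers so that the bound comes out as a whole number.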

Geneticists demonstrated conclusively in the early 1940s that genes specify the structure of individual proteins. Thus a mutation in a gene, caused by an alteration in its DNA sequence, may lead to the inactivation of a crucial protein and result in cell death, in which case the mutation will be lost. On the other hand, a mutation may be silent and not affect the function of the protein. Very rarely, a mutation will create a gene with an improved or novel useful function. In this case organisms carrying the mutation will have an advantage, and the mutated gene may eventually replace the original gene in the population through natural selection.

The Nucleotide Sequence of a Gene Determines the Amino Acid Sequence of a Protein 13

DNA is relatively inert chemically. The information it contains is expressed indirectly via other molecules: DNA directs the synthesis of specific RNA and protein molecules, which in turn determine the cell's chemical and physical properties.

Figure 3-14. The amino acid sequence of bovine insulin

Insulin is a very small protein that consists of two polypeptide chains, one 21 and the other 30 amino acid residues long. Each chain has a unique, genetically determined sequence of amino acids. The one-letter symbols used to specify amino acids are those listed in Panel 2-5, pages 56-57; the S-S bonds shown in red are disulfide bonds between cysteine residues. The protein is made initially as a single long polypeptide chain (encoded by a single gene) that is subsequently cleaved to give the two chains.

At about the time that biophysicists were analyzing the three-dimensional structure of DNA by x-ray diffraction, biochemists were intensively studying the chemical structure of proteins. It was already known that proteins are chains of amino acids joined together by sequential peptide linkages; but it was only in the early 1950s, when the small protein insulin was sequenced (Figure 3-14), that it was discovered that each type of protein consists of a unique sequence of amino acids. Just as solving the structure of DNA was seminal in understanding the molecular basis of genetics and heredity, so sequencing insulin provided a key to understanding the structure and function of proteins. If insulin had a definite, genetically determined sequence, then presumably so did every other protein. It seemed reasonable to suppose, moreover, that the properties of a protein would depend on the precise order in which its constituent amino acids are arranged.

Both DNA and protein are composed of a linear sequence of subunits; eventually, the analysis of the proteins made by mutant genes demonstrated that the two sequences are co-linear - that is, the nucleotides in DNA are arranged in an order corresponding to the order of the amino acids in the protein they specify. It became evident that the DNA sequence contains a coded specification of the protein sequence. The central question in molecular biology then became how a cell translates a nucleotide sequence in DNA into an amino acid sequence in a protein.

Portions of DNA Sequence Are Copied into RNA Molecules That Guide Protein Synthesis 14

The synthesis of proteins involves copying specific regions of DNA (the genes) into polynucleotides of a chemically and functionally different type known as ribonucleic acid, or RNA. RNA, like DNA, is composed of a linear sequence of nucleotides, but it has two small chemical differences: (1) the sugar-phosphate backbone of RNA contains ribose instead of a deoxyribose sugar and (2) the base thymine (T) is replaced by uracil (U), a very closely related base that likewise pairs with A (see Panel 3-2, pp. 100-101).

RNA retains all of the information of the DNA sequence from which it was copied, as well as the base-pairing properties of DNA. Molecules of RNA are synthesized by a process known as DNA transcription, which is similar to DNA replication in that one of the two strands of DNA acts as a template on which the base-pairing abilities of incoming nucleotides are tested. When a good match is achieved with the DNA template, a ribonucleotide is incorporated as a covalently bonded unit. In this way the growing RNA chain is elongated one nucleotide at a time.
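
Transcription can be sketched in the same way as replication, with U taking the place of T in the product. The code is an illustration only: the sequences are made up, the function name is hypothetical, and all strands are written 5'-to-3'.

```python
# Pairing rules used when an RNA chain is built on a DNA template:
# a template A selects U (not T) in the RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the RNA (5'->3') made on a DNA template strand (5'->3')."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_strand))


coding_strand = "ATGGTGCACCTG"     # hypothetical DNA strand that is not copied
template_strand = "CAGGTGCACCAT"   # its complementary partner, used as the template

rna = transcribe(template_strand)
print(rna)
# The RNA has the same sequence as the coding strand, with U in place of T.
print(rna == coding_strand.replace("T", "U"))
```

This is why the RNA retains the information of the DNA from which it was copied: its sequence matches the non-template strand letter for letter, apart from the T-to-U substitution.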

Figure 3-15. The transfer of information from DNA to protein

The transfer proceeds by means of an RNA intermediate called messenger RNA (mRNA). In procaryotic cells the process is simpler than in eucaryotic cells. In eucaryotes the coding regions of the DNA (in the exons, shown in color) are separated by noncoding regions (the introns). As indicated, these introns must be removed by an enzymatically catalyzed RNA-splicing reaction to form the mRNA.

DNA transcription differs from DNA replication in a number of ways. The RNA product, for example, does not remain as a strand annealed to DNA. Just behind the region where the ribonucleotides are being added, the original DNA helix re-forms and releases the RNA chain. Thus RNA molecules are single-stranded. Moreover, RNA molecules are relatively short compared to DNA molecules since they are copied from a limited region of the DNA - enough to make one or a few proteins (Figure 3-15). RNA transcripts that direct the synthesis of protein molecules are called messenger RNA (mRNA) molecules, while other RNA transcripts serve as transfer RNAs (tRNAs) or form the RNA components of ribosomes (rRNA) or smaller ribonucleoprotein particles.

The amount of RNA made from a particular region of DNA is controlled by gene regulatory proteins that bind to specific sites on DNA close to the coding sequences of a gene. In any cell at any given time, some genes are used to make RNA in very large quantities while other genes are not transcribed at all. For an active gene, thousands of RNA transcripts can be made from the same DNA segment in each cell generation. Because each mRNA molecule can be translated into many thousands of copies of a polypeptide chain, the information contained in a small region of DNA can direct the synthesis of millions of copies of a specific protein. The protein fibroin, for example, is the major component of silk. In each silk gland cell a single fibroin gene makes 10^4 copies of mRNA, each of which directs the synthesis of 10^5 molecules of fibroin - producing a total of 10^9 molecules of fibroin in just 4 days.
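
The two-stage amplification in the fibroin example is a simple product, worked through below. The figures are those quoted in the paragraph; the average rate per second is derived from them and assumes, purely for illustration, that synthesis is spread evenly over the 4 days.

```python
MRNA_PER_GENE = 10**4      # mRNA copies made from one fibroin gene (from the text)
PROTEIN_PER_MRNA = 10**5   # fibroin molecules made from each mRNA (from the text)
DAYS = 4

total_fibroin = MRNA_PER_GENE * PROTEIN_PER_MRNA

# Average output per second from a single gene, assuming a steady rate.
seconds = DAYS * 24 * 60 * 60
fibroin_per_second = total_fibroin / seconds

print(total_fibroin)
print(round(fibroin_per_second))
```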

Eucaryotic RNA Molecules Are Spliced to Remove Intron Sequences 15

In bacterial cells most proteins are encoded by a single uninterrupted stretch of DNA sequence that is copied without alteration to produce an mRNA molecule. In 1977 molecular biologists were astonished by the discovery that most eucaryotic genes have their coding sequences (called exons) interrupted by noncoding sequences (called introns). To produce a protein, the entire length of the gene, including both its introns and its exons, is first transcribed into a very large RNA molecule - the primary transcript. Before this RNA molecule leaves the nucleus, a complex of RNA-processing enzymes removes all of the intron sequences, thereby producing a much shorter RNA molecule. After this RNA-processing step, called RNA splicing, has been completed, the RNA molecule moves to the cytoplasm as an mRNA molecule that directs the synthesis of a particular protein (see Figure 3-15).

This seemingly wasteful mode of information transfer in eucaryotes is presumed to have evolved because it makes protein synthesis much more versatile. The primary RNA transcripts of some genes, for example, can be spliced in various ways to produce different mRNAs, depending on the cell type or stage of development. This allows different proteins to be produced from the same gene. Moreover, because the presence of numerous introns facilitates genetic recombination events between exons, this type of gene arrangement is likely to have been profoundly important in the early evolutionary history of genes, speeding up the process whereby organisms evolve new proteins from parts of preexisting ones instead of evolving totally new amino acid sequences.
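
RNA splicing, including the production of different mRNAs from one primary transcript, can be sketched as joining selected segments of a string. Everything in this example is hypothetical: the segment sequences are invented, the exon boundaries are given in advance instead of being recognized by splicing enzymes, and the function names are illustrative.

```python
# (kind, sequence) segments of a made-up primary transcript.
SEGMENTS = [
    ("exon", "AUGGUG"),
    ("intron", "GUAAGUCCAG"),
    ("exon", "CACCUG"),
    ("intron", "GUGAGUUUAG"),
    ("exon", "UAA"),
]

primary_transcript = "".join(seq for _, seq in SEGMENTS)


def exon_spans(segments):
    """Return the (start, end) position of each exon within the transcript."""
    spans, position = [], 0
    for kind, seq in segments:
        if kind == "exon":
            spans.append((position, position + len(seq)))
        position += len(seq)
    return spans


def splice(transcript, spans):
    """Join the selected exons in order, discarding everything between them."""
    return "".join(transcript[start:end] for start, end in spans)


spans = exon_spans(SEGMENTS)
mrna = splice(primary_transcript, spans)

# An alternative splice that skips the middle exon gives a different mRNA.
alternative_mrna = splice(primary_transcript, [spans[0], spans[2]])

print(len(primary_transcript), len(mrna))
print(mrna, alternative_mrna)
```

The spliced product is much shorter than the primary transcript, and choosing a different set of exons yields a different message from the same gene.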

Sequences of Nucleotides in mRNA Are "Read" in Sets of Three and Translated into Amino Acids 16

Figure 3-16. The genetic code

Sets of three nucleotides (codons) in an mRNA molecule are translated into amino acids in the course of protein synthesis according to the rules shown. The codons GUG and GAG, for example, are translated into valine and glutamic acid, respectively. Note that those codons with U or C as the second nucleotide tend to specify the more hydrophobic amino acids (compare with Panel 2-5, pp. 56-57).

The rules by which the nucleotide sequence of a gene is translated into the amino acid sequence of a protein, the so-called genetic code, were deciphered in the early 1960s. The sequence of nucleotides in the mRNA molecule that acts as an intermediate was found to be read in serial order in groups of three. Each triplet of nucleotides, called a codon, specifies one amino acid. Since RNA is a linear polymer of four different nucleotides, there are 4^3 = 64 possible codon triplets (remember that it is the sequence of nucleotides in the triplet that is important). However, only 20 different amino acids are commonly found in proteins, so that most amino acids are specified by several codons; that is, the genetic code is degenerate. The code (shown in Figure 3-16) has been highly conserved during evolution: with a few minor exceptions, it is the same in organisms as diverse as bacteria, plants, and humans.
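
The counting argument, and the degeneracy of the code, can be checked by building the codon table explicitly. The sketch below writes out the standard genetic code in one-letter amino acid symbols, recalled from general knowledge and not copied from Figure 3-16; `*` marks the three codons that specify no amino acid and signal the end of the message.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"

# Standard genetic code, one row per first base (U, C, A, G); within a row
# the second base, then the third, cycle through U, C, A, G.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_residue = Counter(CODON_TABLE.values())
amino_acids = [residue for residue in codons_per_residue if residue != "*"]

print(len(CODON_TABLE), len(amino_acids))
print(CODON_TABLE["GUG"], CODON_TABLE["GAG"])   # valine (V) and glutamic acid (E)
print(codons_per_residue["L"], codons_per_residue["W"])
```

Sixty-four codons shared among 20 amino acids means most residues have several codons, although the distribution is uneven: in this table leucine has six and tryptophan only one.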

Figure 3-17. The three possible reading frames in protein synthesis

In the process of translating a nucleotide sequence (blue) into an amino acid sequence (green), the sequence of nucleotides in an mRNA molecule is read from the 5' to the 3' end in sequential sets of three nucleotides. In principle, therefore, the same RNA sequence can specify three completely different amino acid sequences, depending on the "reading frame."

In principle, each RNA sequence can be translated in any one of three different reading frames depending on where the decoding process begins (Figure 3-17). In almost every case only one of these reading frames will produce a functional protein. Since there are no punctuation signals except at the beginning and end of the RNA message, the reading frame is set at the initiation of the translation process and is maintained thereafter.
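
The effect of the starting position on how a message is divided into codons can be seen in a few lines. The sequence and the function name below are hypothetical; the sketch only groups nucleotides into triplets and does not translate them.

```python
def codons_in_frame(rna: str, frame: int) -> list[str]:
    """Split an RNA sequence into complete triplets, starting at offset 0, 1, or 2."""
    usable = rna[frame:]
    # Ignore any leftover nucleotides that do not fill a whole codon.
    end = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, end, 3)]


rna = "AUGGUGCACCUG"   # hypothetical message
frames = [codons_in_frame(rna, frame) for frame in range(3)]
for frame, codons in enumerate(frames):
    print(frame, codons)
```

Shifting the start by a single nucleotide changes every codon downstream, which is why the frame must be fixed at initiation and kept throughout.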

tRNA Molecules Match Amino Acids to Groups of Nucleotides 17

The codons in an mRNA molecule do not directly recognize the amino acids they specify in the way that an enzyme recognizes a substrate. The translation of mRNA into protein depends on "adaptor" molecules that recognize both an amino acid and a group of three nucleotides. These adaptors consist of a set of small RNA molecules known as transfer RNAs (tRNAs), each about 80 nucleotides in length.

Figure 3-18. Phenylalanine tRNA of yeast

(A) The molecule is drawn with a cloverleaf shape to show the complementary base-pairing (short gray bars) that occurs in the helical regions of the molecule. (B) The actual shape of the molecule, based on x-ray diffraction analysis, is shown schematically. Complementary base pairs are indicated as long gray bars. In addition, the nucleotides involved in unusual base-pair interactions that hold different parts of the molecule together are colored red and are connected by a red line in both (A) and (B). The pairs are numbered in (B). (C) One of the unusual base-pair interactions. Here one base forms hydrogen-bond interactions with two others; several such "base triples" help fold up this tRNA molecule.

A tRNA molecule has a folded three-dimensional conformation that is held together in part by noncovalent base-pairing interactions like those that hold together the two strands of the DNA helix. In the single-stranded tRNA molecule, however, the complementary base pairs form between nucleotide residues in the same chain, which causes the tRNA molecule to fold up in a unique way that is important for its function as an adaptor. Four short segments of the molecule contain a double-helical structure, producing a molecule that looks like a "cloverleaf" in two dimensions. This cloverleaf is in turn further compacted into a highly folded, L-shaped conformation that is held together by more complex hydrogen-bonding interactions (Figure 3-18). Two sets of unpaired nucleotide residues at either end of the "L" are especially important for the function of the tRNA molecule in protein synthesis: one forms the anticodon that base-pairs to a complementary triplet in an mRNA molecule (the codon), while the CCA sequence at the 3' end of the molecule is attached covalently to a specific amino acid (see Figure 3-18A).
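
Codon-anticodon matching follows the same complementarity rule as the other base-pairing steps. The sketch below is deliberately simplified: it assumes strict A-U and G-C pairing across all three positions and ignores the modified bases and relaxed third-position pairing found in real tRNAs. Both triplets are written 5'-to-3', and the function names are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Anticodon (5'->3') that pairs with an mRNA codon (5'->3')."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


def pairs_with(codon: str, anticodon: str) -> bool:
    """True if the anticodon is exactly complementary to the codon."""
    return anticodon_for(codon) == anticodon


print(anticodon_for("UUC"))        # UUC is a codon for phenylalanine
print(pairs_with("UUC", "GAA"))
print(pairs_with("UUC", "CAC"))
```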

The RNA Message Is Read from One End to the Other by a Ribosome 18

Figure 3-19. Information flow in protein synthesis

(A) The nucleotides in an mRNA molecule are joined together to form a complementary copy of a segment of one strand of DNA. (B) They are then matched three at a time to complementary sets of three nucleotides in the anticodon regions of tRNA molecules. At the other end of each type of tRNA molecule, a specific amino acid is held in a high-energy linkage, and when matching occurs, this amino acid is added to the end of the growing polypeptide chain. Thus translation of the mRNA nucleotide sequence into an amino acid sequence depends on complementary base-pairing between codons in the mRNA and corresponding tRNA anticodons. The molecular basis of information transfer in translation is therefore very similar to that in DNA replication and transcription. Note that the mRNA is both synthesized and translated starting from its 5' end.

Figure 3-20. Synthesis of a protein by ribosomes attached to an mRNA molecule

Ribosomes become attached to a start signal near the 5' end of the mRNA molecule and then move toward the 3' end, synthesizing protein as they go. A single mRNA will usually have a number of ribosomes traveling along it at the same time, each making a separate but identical polypeptide chain; the entire structure is known as a polyribosome.

The codon recognition process by which genetic information is transferred from mRNA via tRNA to protein depends on the same type of base-pair interactions that mediate the transfer of genetic information from DNA to DNA and from DNA to RNA (Figure 3-19). But the mechanics of ordering the tRNA molecules on the mRNA are complicated and require a ribosome, a complex of more than 50 different proteins associated with several structural RNA molecules (rRNAs). Each ribosome is a large protein-synthesizing machine on which tRNA molecules position themselves so as to read the genetic message encoded in an mRNA molecule. The ribosome first finds a specific start site on the mRNA that sets the reading frame and determines the amino-terminal end of the protein. Then, as the ribosome moves along the mRNA molecule, it translates the nucleotide sequence into an amino acid sequence one codon at a time, using tRNA molecules to add amino acids to the growing end of the polypeptide chain (Figure 3-20). When a ribosome reaches the end of the message, both it and the freshly made carboxyl end of the protein are released from the 3' end of the mRNA molecule into the cytoplasm.

Ribosomes operate with remarkable efficiency: in one second a single bacterial ribosome adds about 20 amino acids to a growing polypeptide chain. Ribosome structure and the mechanism of protein synthesis are discussed in Chapter 6.
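
The overall logic of reading a message (find the start, read codon by codon, stop at the end signal) can be sketched as follows. This is a toy model under stated assumptions, not a description of the ribosome: the start site is taken to be the first AUG in the sequence, the codon table is the standard genetic code recalled from general knowledge with `*` marking stop codons, the mRNA is made up, and the timing estimate simply applies the bacterial rate quoted above.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

START_CODON = "AUG"   # assumed start signal for this sketch


def translate(mrna: str) -> str:
    """Read codons 5'->3' from the first AUG until a stop codon is reached."""
    start = mrna.find(START_CODON)
    if start == -1:
        return ""                       # no start site, so nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":              # end of the message
            break
        protein.append(residue)
    return "".join(protein)


def seconds_to_synthesize(n_residues: int, residues_per_second: float = 20.0) -> float:
    """Time for one bacterial ribosome to make a chain of n_residues."""
    return n_residues / residues_per_second


mrna = "GGAUGGUGCACCUGACUCCUGAGUAAGG"   # hypothetical message
protein = translate(mrna)
print(protein)
print(seconds_to_synthesize(300))        # a 300-residue chain
```

At 20 residues per second, a chain of a few hundred amino acids is finished in well under a minute.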

Some RNA Molecules Function as Catalysts 19

Figure 3-21. A self-splicing RNA molecule

The diagram shows the self-splicing reaction in which an intron sequence catalyzes its own excision from a Tetrahymena ribosomal RNA molecule. As shown, the reaction is initiated when a G nucleotide is added to the intron sequence, cleaving the RNA chain in the process; the newly created 3' end of the RNA chain then attacks the other side of the intron to complete the reaction.

Figure 3-22. An enzymelike reaction catalyzed by the purified Tetrahymena intron sequence

In this reaction, which corresponds to the first step in Figure 3-21, both a specific substrate RNA molecule and a G nucleotide become tightly bound to the surface of the catalytic RNA molecule. The nucleotide is then covalently attached to the substrate RNA molecule, cleaving it at a specific site. The release of the resulting two RNA chains frees the intron sequence for further cycles of reaction.

RNA molecules have commonly been viewed as strings of nucleotides with a relatively uninteresting chemistry. In 1981 this view was shattered by the discovery of a catalytic RNA molecule with the type of sophisticated chemical reactivity that biochemists had previously associated only with proteins. The ribosomal RNA molecules of the ciliated protozoan Tetrahymena are initially synthesized as a large precursor from which one of the rRNAs is produced by an RNA-splicing reaction. The surprise came with the discovery that this splicing can occur in vitro in the absence of protein. It was subsequently shown that the intron sequence itself has an enzymelike catalytic activity that carries out the two-step reaction illustrated in Figure 3-21. The 400-nucleotide-long intron sequence was then synthesized in a test tube and shown to fold up to form a complex surface that can function like an enzyme in reactions with other RNA molecules. For example, it can bind two specific substrates tightly - a guanine nucleotide and an RNA chain - and catalyze their covalent attachment so as to sever the RNA chain at a specific site (Figure 3-22).

In this model reaction, which mimics the first step in Figure 3-21, the same intron sequence acts repeatedly to cut many RNA chains. Although RNA splicing is most commonly achieved by means that are not autocatalytic (discussed in Chapter 8), self-splicing RNAs with intron sequences related to that in Tetrahymena have been discovered in other types of cells, including fungi and bacteria. This suggests that these RNA sequences may have arisen before the eucaryotic and procaryotic lineages diverged about 1.5 billion years ago.

Figure 3-23. A peptidyl transferase reaction catalyzed by a deproteinized ribosomal RNA molecule

The puromycin molecule mimics a tRNA charged with the amino acid tyrosine, and it acts as a powerful inhibitor of protein synthesis in cells by adding to the growing end of a polypeptide chain on a ribosome. In this model reaction the growing polypeptide chain end is mimicked by a hexanucleotide (red, representing a tRNA) that is covalently linked to N-formyl methionine (representing the polypeptide). A highly purified large rRNA molecule catalyzes the addition of the puromycin to the N-formyl methionine, forming a new peptide bond and releasing the hexanucleotide.

Several other families of catalytic RNAs have recently been discovered. Most tRNAs, for example, are initially synthesized as a larger precursor RNA, and an RNA molecule has been shown to play the major catalytic role in an RNA-protein complex that recognizes these precursors and cleaves them at specific sites. A catalytic RNA sequence also plays an important part in the life cycle of many plant viroids. Most remarkably, ribosomes are now suspected to function largely by RNA-based catalysis, with the ribosomal proteins playing a supporting role to the ribosomal RNAs (rRNAs), which make up more than half the mass of the ribosome. The large rRNA by itself, for example, has peptidyl transferase activity and will catalyze the formation of new peptide bonds (Figure 3-23).

Figure 3-24. A three-dimensional view of the catalytic core of the type of intron RNA sequence illustrated in Figures 3-21 and 3-22

(A) The folded molecule, with hydrogen-bond interactions shown in red. This molecule, which is about 240 nucleotides long, is shown immediately after the initial cut at the 5' side of the intron (yellow). (B) Schematic of the molecule in (A) in its unfolded form. (Adapted from L. Jaeger, E. Westhof, and F. Michel, J. Mol. Biol. 221:1153-1164, 1991.)

How is it possible for an RNA molecule to act like an enzyme? The example of tRNA indicates that RNA molecules can fold up in highly specific ways. A proposed three-dimensional structure for the core of the self-splicing Tetrahymena intron sequence is shown in Figure 3-24. Interactions between different parts of this RNA molecule (analogous to the unusual hydrogen bonds in tRNA molecules - see Figure 3-18) are responsible for folding it to create a complex three-dimensional surface with catalytic activity. An unusual juxtaposition of atoms presumably strains covalent bonds and thereby makes selected atoms in the folded RNA chain unusually reactive.

As explained in Chapter 1, the discovery of catalytic RNA molecules has profoundly changed our views of how the first living cells arose.

Summary

Genetic information is carried in the linear sequence of nucleotides in DNA. Each molecule of DNA is a double helix formed from two complementary strands of nucleotides held together by hydrogen bonds between G-C and A-T base pairs. Duplication of the genetic information occurs by the polymerization of a new complementary strand onto each of the old strands of the double helix during DNA replication.

The expression of the genetic information stored in DNA involves the translation of a linear sequence of nucleotides into a co-linear sequence of amino acids in proteins. A limited segment of DNA is first copied into a complementary strand of RNA. This primary RNA transcript is spliced to remove intron sequences, producing an mRNA molecule. Finally, the mRNA is translated into protein in a complex set of reactions that occur on a ribosome. The amino acids used for protein synthesis are first attached to a family of tRNA molecules, each of which recognizes, by complementary base-pairing interactions, particular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in the mRNA is then read from one end to the other in sets of three, according to a universal genetic code.

Other RNA molecules in cells function as enzymelike catalysts. These RNA molecules fold up to create a surface containing nucleotides that have become unusually reactive. One of these catalysts is the large rRNA of the ribosome, which catalyzes the formation of peptide bonds during protein synthesis.

Protein Structure 20

Introduction

To a large extent, cells are made of protein, which constitutes more than half of their dry weight (see Table 3-1). Proteins determine the shape and structure of the cell and also serve as the main instruments of molecular recognition and catalysis. Although DNA stores the information required to make a cell, it has little direct influence on cellular processes. The gene for hemoglobin, for example, cannot carry oxygen; that is a property of the protein specified by the gene.

DNA and RNA are chains of nucleotides that are chemically very similar to one another. In contrast, proteins are made from an assortment of 20 very different amino acids, each with a distinct chemical personality (see Panel 2-5, pp. 56-57). This variety allows for enormous versatility in the chemical properties of different proteins, and it presumably explains why evolution eventually selected proteins rather than RNA molecules to catalyze most cellular reactions.

The Shape of a Protein Molecule Is Determined by Its Amino Acid Sequence 21

Many of the bonds in a long polypeptide chain allow free rotation of the atoms they join, giving the protein backbone great flexibility. In principle, then, any protein molecule could adopt an almost unlimited number of shapes (conformations). Most polypeptide chains, however, fold into only one particular conformation determined by their amino acid sequence. This is because the backbones and side chains of the amino acids associate with one another and with water to form various weak noncovalent bonds (see Panel 3-1, pp. 92-93). Provided that the appropriate side chains are present at crucial positions in the chain, large forces are developed that make one particular conformation especially stable.

Most proteins can fold spontaneously into their correct shape. By treatment with certain solvents, a protein can be unfolded, or denatured, to give a flexible polypeptide chain that has lost its native conformation. When the denaturing solvent is removed, the protein will usually refold spontaneously into its original conformation, indicating that all the information necessary to specify the shape of a protein is contained in the amino acid sequence itself.

Figure 3-25. How a protein folds into a globular conformation

The polar amino acid side chains tend to gather on the outside of the protein, where they can interact with water. The nonpolar amino acid side chains are buried on the inside to form a hydrophobic core that is "hidden" from water.

Figure 3-26. Hydrogen bonding

Some of the hydrogen bonds (shown in color) that can form between the amino acids in a protein. The peptide bonds are shaded in gray.

Figure 3-27. Details of intra-molecular hydrogen bonds in a protein

In this region of the enzyme lysozyme, hydrogen bonds form between two side chains (blue), between a side chain and an atom in a peptide bond (yellow), or between atoms in two peptide bonds (red). For reference, see Figure 3-26. (After C.K. Mathews and K.E. van Holde, Biochemistry. Redwood City, CA: Benjamin/Cummings, 1990.)

One of the most important factors governing the folding of a protein is the distribution of its polar and nonpolar side chains. The many hydrophobic side chains in a protein tend to be pushed together in the interior of the molecule, which enables them to avoid contact with the aqueous environment (just as oil droplets coalesce after being mechanically dispersed in water). By contrast, the polar side chains tend to arrange themselves near the outside of the protein molecule, where they can interact with water and with other polar molecules (Figure 3-25). Since the peptide bonds are themselves polar, they tend to interact both with one another and with polar side chains to form hydrogen bonds (Figure 3-26); nearly all polar residues buried within the protein are paired in this way (Figure 3-27). Hydrogen bonds thus play a major part in holding together different regions of polypeptide chain in a folded protein molecule. They are also crucially important for many of the binding interactions that occur on protein surfaces.
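
As a rough illustration of this distribution, the residues of a sequence can be sorted into nonpolar and polar groups. The sketch makes several assumptions: the grouping of the 20 amino acids into nonpolar and polar sets is one common scheme (others place glycine and cysteine differently), the peptide is made up, and the split indicates only a tendency, not the actual position of any residue in a folded protein.

```python
# One common grouping of side chains, in one-letter symbols (an assumption;
# classification schemes differ for a few residues such as G and C).
NONPOLAR = set("AGVLIPFMWC")


def partition_side_chains(sequence: str):
    """Split residues into (nonpolar, polar) lists of (position, residue) pairs."""
    nonpolar, polar = [], []
    for position, residue in enumerate(sequence, start=1):
        (nonpolar if residue in NONPOLAR else polar).append((position, residue))
    return nonpolar, polar


peptide = "MKVLAEDSGF"   # hypothetical sequence
nonpolar, polar = partition_side_chains(peptide)

print("tend toward the interior:", nonpolar)
print("tend toward the surface: ", polar)
print(len(nonpolar) / len(peptide))   # fraction of nonpolar residues
```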

Figure 3-28. Disulfide-bond formation

The drawing illustrates the formation of a covalent disulfide bond between the side chains of neighboring cysteine residues in a protein.

Secreted or cell-surface proteins often form additional covalent intrachain bonds. Most notably, the formation of disulfide bonds (also called S-S bonds) between the two -SH groups of neighboring cysteine residues in a folded polypeptide chain (Figure 3-28) often serves to stabilize the three-dimensional structure of extracellular proteins. These bonds are not required for the specific folding of proteins, since folding occurs normally in the presence of reducing agents that prevent S-S bond formation. In fact, S-S bonds are rarely, if ever, formed in protein molecules in the cytosol because the high cytosolic concentration of -SH reducing agents breaks such bonds.

The net result of all the individual amino acid interactions is that most protein molecules fold up spontaneously into precisely defined conformations. Those that are compact and globular have an inner core composed of clustered hydrophobic side chains - packed into a tight, nearly crystalline arrangement - while a very complex and irregular exterior surface is formed by the more polar side chains. The positioning and chemistry of the different atoms on this intricate surface make each protein unique and enable it to bind specifically to other macromolecular surfaces and to certain small molecules (discussed below). From both a chemical and a structural standpoint, proteins are the most sophisticated molecules known.

Common Folding Patterns Recur in Different Protein Chains 22

Although all the information required for the folding of a protein chain is contained in its amino acid sequence, we have not yet learned how to "read" this information so as to predict the detailed three-dimensional structure of a protein whose sequence is known. Consequently, the folded conformation can be determined only by an elaborate x-ray diffraction analysis performed on crystals of the protein or, if the protein is very small, by nuclear magnetic resonance techniques (see Chapter 4). So far, more than 100 types of protein folds have been discovered by these methods. Each protein has a specific conformation so intricate and irregular that it would require a chapter to describe it in full three-dimensional detail.

When the three-dimensional structures of different protein molecules are compared, it becomes clear that, although the overall conformation of each protein is unique, several structural patterns recur in parts of these macromolecules. Two patterns are particularly common because they result from regular hydrogen-bonding interactions between the peptide bonds themselves rather than between the side chains of particular amino acids. Both patterns were correctly predicted in 1951 from model-building studies based on the different x-ray diffraction patterns of silk and hair. The two regular patterns discovered are now known as the β sheet, which occurs in the protein fibroin, found in silk, and the α helix, which occurs in the protein α-keratin, found in skin and its appendages, such as hair, nails, and feathers.

Figure 3-29. A β sheet is a common structure formed by parts of the polypeptide chain in globular proteins

At the top, a domain of 115 amino acids from an immunoglobulin molecule is shown; it consists of a sandwich-like structure of two β sheets, one of which is drawn in color. At the bottom, a perfect antiparallel β sheet is shown in detail, with the amino acid side chains denoted R. Note that every peptide bond is hydrogen-bonded to a neighboring peptide bond. The actual sheet structures in globular proteins are usually less regular than the β sheet shown here, and most sheets are slightly twisted (see Figure 3-31).

The core of most (but not all) globular proteins contains extensive regions of β sheet. In the example illustrated in Figure 3-29, which shows part of an antibody molecule, an antiparallel β sheet is formed when an extended polypeptide chain folds back and forth upon itself, with each section of the chain running in the direction opposite to that of its immediate neighbors. This gives a very rigid structure held together by hydrogen bonds that connect the peptide bonds in neighboring chains. The antiparallel β sheet and the closely related parallel β sheet (which is formed by regions of polypeptide chain that run in the same direction) frequently serve as the framework around which globular proteins are constructed.

Figure 3-30. An α helix is another common structure formed by parts of the polypeptide chain in proteins

(A) The oxygen-carrying molecule myoglobin (153 amino acids long) is shown, with one region of α helix outlined in color. (B) A perfect α helix is shown in outline. (C) As in the β sheet, every peptide bond in an α helix is hydrogen-bonded to a neighboring peptide bond. Note that for clarity in (B) both the side chains [which protrude radially along the outside of the helix and are denoted by R in (C)] and the hydrogen atom are omitted on the α-carbon atom of each amino acid (see also Figure 3-31).

An α helix is generated when a single polypeptide chain turns regularly about itself to make a rigid cylinder in which each peptide bond is regularly hydrogen-bonded to other peptide bonds nearby in the chain. Many globular proteins contain short regions of such α helices (Figure 3-30), and those portions of a transmembrane protein that cross the lipid bilayer are usually α helices because of the constraints imposed by the hydrophobic lipid environment (discussed in Chapter 10).

In aqueous environments an isolated α helix is usually not stable on its own. Two identical α helices that have a repeating arrangement of nonpolar side chains, however, will twist around each other gradually to form a particularly stable structure known as a coiled-coil (see p. 125). Long rodlike coiled-coils are found in many fibrous proteins, such as the intracellular α-keratin fibers that reinforce skin and its appendages.

Figure 3-31. Space-filling models of an α helix and a β sheet with (right) and without (left) their amino acid side chains

(A) An α helix (part of the structure of myoglobin). (B) A region of β sheet (part of the structure of an immunoglobulin domain). In the photographs on the left, each side chain is represented by a single darkly shaded atom (the R groups in Figures 3-29 and 3-30), while the entire side chain is shown on the right. (Courtesy of Richard J. Feldmann.)

Space-filling representations of an α helix and a β sheet from actual proteins are shown with and without their side chains in Figure 3-31.

Proteins Are Amazingly Versatile Molecules 23

Figure 3-32. Contrast between collagen and elastin

(A) Collagen is a triple helix formed by three extended protein chains that wrap around each other. Many rodlike collagen molecules are cross-linked together in the extracellular space to form inextensible collagen fibrils (top) that have the tensile strength of steel. (B) Elastin polypeptide chains are cross-linked together to form elastic fibers. Each elastin molecule uncoils into a more extended conformation when the fiber is stretched. The striking contrast between the physical properties of elastin and collagen is due entirely to their very different amino acid sequences.

Because of the variety of their amino acid side chains, proteins are remarkably versatile with respect to the types of structures they can form. Contrast, for example, two abundant proteins secreted by cells in connective tissue - collagen and elastin - both present in the extracellular matrix. In collagen molecules three separate polypeptide chains, each rich in the amino acid proline and containing the amino acid glycine at every third residue, are wound around one another to generate a regular triple helix. These collagen molecules are packed together into fibrils in which adjacent molecules are tied together by covalent cross-links between neighboring lysine residues, giving the fibril enormous tensile strength (Figure 3-32).

Elastin is at the opposite extreme. Its relatively loose and unstructured polypeptide chains are cross-linked covalently to generate a rubberlike elastic meshwork that enables tissues such as arteries and lungs to deform and stretch without damage. As illustrated in Figure 3-32, the elasticity is due to the ability of individual protein molecules to uncoil reversibly whenever a stretching force is applied.

Figure 3-33. Some possible sizes and shapes of a protein molecule 300 amino acid residues long

The structure formed is determined by the amino acid sequence. (Adapted from D.E. Metzler, Biochemistry. New York: Academic Press, 1977.)

It is remarkable that the same basic chemical structure - a chain of amino acids - can form so many different structures: a rubberlike elastic meshwork (elastin), an inextensible cable with the tensile strength of steel (collagen), or any of the wide variety of catalytic surfaces on the globular proteins that function as enzymes. Figure 3-33 illustrates and compares the range of shapes that could, in theory, be adopted by a polypeptide chain 300 amino acids long. As we have already emphasized, the conformation actually adopted depends on the amino acid sequence.

Proteins Have Different Levels of Structural Organization 24

In describing the structure of a protein, it is helpful to distinguish various levels of organization. The amino acid sequence is called the primary structure of the protein. Regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to α helices and β sheets, which constitute the protein's secondary structure. Certain combinations of α helices and β sheets pack together to form compactly folded globular units, each of which is called a protein domain. Domains are usually constructed from a section of polypeptide chain that contains between 50 and 350 amino acids, and they seem to be the modular units from which proteins are constructed (see below). While small proteins may contain only a single domain, larger proteins contain a number of domains, which are often connected by relatively open lengths of polypeptide chain. Finally, individual polypeptides often serve as subunits for the formation of larger molecules, sometimes called protein assemblies or protein complexes, in which the subunits are bound to one another by a large number of weak, noncovalent interactions; in extracellular proteins these interactions are often stabilized by disulfide bonds.

Figure 3-34. Basic pancreatic trypsin inhibitor (BPTI)

The three-dimensional conformation of this small protein is shown in five commonly used representations. (A) A stereo pair illustrating the positions of all nonhydrogen atoms. The main chain is shown with heavy lines and the side chains with thin lines. (B) Space-filling model showing the van der Waals radii of all atoms (see Panel 3-1). (C) Backbone wire model composed of lines that connect each α carbon along the polypeptide backbone. (D) "Ribbon model," which represents all regions of regular hydrogen-bonded interactions as either helices (α helices) or sets of arrows (β sheets) pointing toward the carboxyl-terminal end of the chain. (E) "Sausage model," which shows the course of the polypeptide chain but omits all detail. In the bottom three panels the hairpin β motif is colored green; this motif is also found in many other proteins (see text). Note that the core of all globular proteins is densely packed with atoms. Thus the impression of an open structure produced by models (C), (D), and (E) is misleading. (B and C, courtesy of Richard J. Feldmann; A and D, courtesy of Jane Richardson.)

The three-dimensional structure of a protein can be illustrated in various ways. Consider the unusually small protein basic pancreatic trypsin inhibitor (BPTI), which contains 58 amino acid residues folded into one domain. BPTI can be shown as a stereo pair displaying all of its nonhydrogen atoms (Figure 3-34A) or as an accurate space-filling model, where most of the details are obscured (Figure 3-34B). Alternatively, it can be shown more schematically, with all of the side chains and actual atoms omitted so that it is easier to follow the course of the main polypeptide chain (Figures 3-34C, D, and E). An average-size protein contains about six times more amino acid residues than BPTI, and many proteins are more than 20 times its size. Schematic drawings are essential for displaying the structure of these larger proteins, and we use them throughout this text.

Figure 3-35. Three levels of organization of a protein

The three-dimensional structure of a protein can be described in terms of different levels of folding, each of which is constructed from the preceding one in hierarchical fashion. These levels are illustrated here using the catabolite activator protein (CAP), a bacterial gene regulatory protein with two domains. When the large domain binds cyclic AMP, it causes a conformational change in the protein that enables the small domain to bind to a specific DNA sequence. The amino acid sequence is termed the primary structure and the first folding level the secondary structure. As indicated under the brackets at the bottom of this figure, the combination of the second and third folding levels shown here is commonly termed the tertiary structure, and the fourth level (the assembly of subunits) the quaternary structure of a protein. (Modified from a drawing by Jane Richardson.)

Figure 3-35 shows how the structure of a large protein can be resolved into several levels of organization, each level constructed from the one below it in a hierarchical fashion. These levels of increased organizational complexity may correspond to the steps by which a newly synthesized protein folds into its final native structure inside the cell.

Domains Are Formed from a Polypeptide Chain That Winds Back and Forth, Making Sharp Turns at the Protein Surface 24

A protein domain can be viewed as the basic structural unit of a protein structure. The core of each domain is largely composed of a set of interconnected β sheets or α helices or both. These regular secondary structures are favored because they permit extensive hydrogen bonding between the backbone atoms, which is essential for stabilizing the interior of the domain, where water is not available to form hydrogen bonds with the polar carbonyl oxygen or amide hydrogen of the peptide bond.

Figure 3-36. Example of a common protein motif

In the beta-alpha-beta motif two adjacent parallel strands that form a β sheet structure are connected by an α helix. Like the hairpin β motif highlighted in Figure 3-34, this motif is found in many different proteins.

Because there are only a limited number of ways of combining α helices and β sheets to make a globular structure, certain combinations of these elements, called motifs, occur repeatedly in the core of many unrelated proteins. One example is the hairpin beta motif found in BPTI (colored green in Figure 3-34D), which consists of two antiparallel β strands joined by a sharp turn formed by a loop of polypeptide chain. Another example is the beta-alpha-beta motif, in which two adjacent parallel β strands are connected by a length of α helix (Figure 3-36). Several other common motifs are discussed in Chapter 9, where we consider the various DNA-binding motifs found in several families of gene regulatory proteins.

Figure 3-37. Ribbon models of the three-dimensional structure of several differently organized protein domains

(A) Cytochrome b562, a single-domain protein composed almost entirely of α helices. (B) The NAD-binding domain of lactic dehydrogenase, composed of a mixture of α helices and β sheets. (C) The variable domain of an immunoglobulin light chain, composed of a sandwich of two β sheets. In these examples the α helices are shown in green, while strands organized as β sheets are denoted by red arrows. Note that the polypeptide chain generally traverses back and forth across the entire domain, making sharp turns only at the protein surface. The protruding loop regions (yellow) often form the binding sites for other molecules. (Drawings courtesy of Jane Richardson.)

Various combinations of motifs form the protein domain itself, in which the polypeptide chain tends to wind its way back and forth across the entire structure, either as a β sheet or an α helix, reversing direction suddenly by making a tight turn when it reaches the surface of the domain. As a result, a typical domain is a compact structure whose surface is covered by protruding loops of polypeptide chain (Figure 3-37). The loop regions, which vary in length and have an irregular shape, often form the binding sites for other molecules. Because the loop regions are exposed to water, they are rich in hydrophilic amino acids, and on this basis their positions can frequently be predicted from a careful examination of the amino acid sequence of a protein.
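The idea that hydrophilic stretches of sequence hint at surface loops can be illustrated with a simple sliding-window calculation. The sketch below is only an illustration, not a method described in this chapter: it assumes the Kyte-Doolittle hydropathy scale, and the window size, the threshold, and the function names are arbitrary choices made for the example. Windows with a strongly negative average are candidates for water-exposed regions.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
# The choice of this particular scale is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Average hydropathy of every window of the given size along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=-1.0):
    """Start positions (0-based) of windows whose average falls below threshold."""
    averages = window_hydropathy(sequence, window)
    return [i for i, avg in enumerate(averages) if avg < threshold]


# Hypothetical toy sequence: a hydrophobic stretch followed by a hydrophilic one.
toy_sequence = "VVVVVVVDDDDDDD"
print(hydrophilic_windows(toy_sequence))
```

A prediction of this kind only flags candidate regions; whether a flagged stretch really forms a surface loop still has to be checked against a determined structure.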

Relatively Few of the Many Possible Polypeptide Chains Would Be Useful

Since each of the 20 amino acids is chemically distinct and each can, in principle, occur at any position in a protein chain, there are 20 × 20 × 20 × 20 = 160,000 different possible polypeptide chains 4 amino acids long, or 20^n different possible polypeptide chains n amino acids long. For a typical protein length of about 300 amino acids, more than 10^390 different proteins can be made.
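The arithmetic behind these numbers can be checked directly. The short sketch below (function names are chosen here for illustration) counts the possible chains of a given length and expresses the result as a power of ten, since the full number is far too large to read comfortably.

```python
import math

AMINO_ACID_TYPES = 20  # the 20 amino acids considered in the text


def count_chains(length):
    """Number of distinct polypeptide chains of the given length: 20^length."""
    return AMINO_ACID_TYPES ** length


def log10_count(length):
    """Base-10 logarithm of the count, so the value reads as a power of ten."""
    return length * math.log10(AMINO_ACID_TYPES)


print(count_chains(4))              # 160000
print(round(log10_count(300), 1))   # 390.3, i.e. slightly more than 10^390
```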

We know, however, that only a very small fraction of these possible proteins would adopt a stable three-dimensional conformation. The vast majority would have many different conformations of roughly equal energy, each with different chemical properties. Proteins with such variable properties would not be useful and would therefore be eliminated by natural selection in the course of evolution. Present-day proteins have an amazingly sophisticated structure and chemistry because of their unique folding properties. Not only is the amino acid sequence such that a single conformation is extremely stable, but this conformation has the precise chemical properties that enable the protein to perform a specific catalytic or structural function in the cell. Proteins are so precisely built that the change of even a few atoms in one amino acid can sometimes disrupt the structure and cause a catastrophic change in function.

New Proteins Usually Evolve by Alterations of Old Ones 25

Cells have genetic mechanisms that allow genes to be duplicated, modified, and recombined in the course of evolution. Consequently, once a protein with useful surface properties has evolved, its basic structure can be incorporated in many other proteins. Proteins of different but related function in present-day organisms often have similar amino acid sequences. Such families of proteins are believed to have evolved from a single ancestral gene that duplicated in the course of evolution to give rise to other genes in which mutations gradually accumulated to produce related proteins with new functions.

Figure 3-38. (A) Comparison of the amino acid sequences of two members of the serine protease family of enzymes

The carboxyl-terminal portions of the two proteins are shown (amino acids 149 to 245). Identical amino acids are connected by colored bars, and the serine residue in the active site at position 195 is highlighted. In the yellow boxed sections of the polypeptide chains, each amino acid occupies a closely equivalent position in the three-dimensional structures of the two enzymes (see Figure 3-39). (B) The standard one-letter and three-letter codes for amino acids. (Modified from J. Greer, Proc. Natl. Acad. Sci. USA 77:3393-3397, 1980.)

Figure 3-39. Comparison of the conformations of the two serine proteases shown in Figure 3-38

Elastase is shown in (A) and chymotrypsin in (B). Although only those amino acid residues in the polypeptide chain shaded in green are the same in the two proteins, their conformations are very similar everywhere. The active site, which is circled in red, contains an activated serine residue (see Figure 3-57). Chymotrypsin contains more than two chain termini because it is formed by the proteolytic cleavage of chymotrypsinogen, an inactive precursor.

Consider the serine proteases, a family of protein-cleaving (proteolytic) enzymes that includes the digestive enzymes chymotrypsin, trypsin, and elastase and some of the proteases in the blood-clotting and complement enzymatic cascades. When two of these enzymes are compared, about 40% of the positions in their amino acid sequences are found to be occupied by the same amino acid (Figure 3-38). The similarity of their three-dimensional conformations as determined by x-ray crystallography is even more striking: most of the detailed twists and turns in their polypeptide chains, which are several hundred amino acids long, are identical (Figure 3-39).

Figure 3-40. Comparison of DNA-binding homeodomains from two organisms separated by more than a billion years of evolution

(A) Schematic of structure. (B) Trace of the α-carbon positions. The three-dimensional structures shown were determined by x-ray crystallography for the yeast α2 protein (green) and the Drosophila engrailed protein (red). (C) Comparison of amino acid sequences for the region of the proteins shown in (A) and (B). Orange dots demark the position of a three amino acid insert in the α2 protein. (Adapted from C. Wolberger, et al., Cell 67:517-528, 1991.)

The story that we have told for the serine proteases could be repeated for hundreds of other protein families. In many cases the amino acid sequences have diverged much further than for the serine proteases, so that one cannot be sure of a family relationship between two proteins without determining their three-dimensional structures. The yeast α2 protein and the Drosophila engrailed protein, for example, are both gene regulatory proteins in the homeodomain family. Because they are identical in only 17 of their 60 amino acid residues, their relationship became certain only when their three-dimensional structures were compared (Figure 3-40).
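Figures such as "about 40%" for the serine proteases or "17 of 60" for the homeodomains are percent identities computed over aligned positions. The sketch below shows one simple way such a figure might be calculated, assuming the two sequences have already been aligned and that gaps are written as "-". The toy sequences are invented for illustration and are not the real protein sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap positions occupied by the same amino acid."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # positions opposite an insertion are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments (one-letter code); 5 of 7 compared positions match.
print(round(percent_identity("MKTAYIAK", "MRTA-IGK"), 1))

# The homeodomain comparison in the text: 17 identical residues out of 60.
print(round(100.0 * 17 / 60, 1))
```

Different conventions exist for the denominator (for example, counting gapped positions or using the shorter sequence length), so reported identities are comparable only when the convention is the same.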

The various members of a large protein family will often have distinct functions. Some of the amino acid changes that make these proteins different were no doubt selected in the course of evolution because they resulted in changes in biological activity, giving the individual family members the different functional properties that they have today. Other amino acid changes are likely to be "neutral," having neither a beneficial nor a damaging effect on the basic structure and function of the protein. Since mutation is a random process, there must also have been many deleterious changes that altered the three-dimensional structure of these proteins sufficiently to inactivate them. Such inactive proteins would have been lost whenever the individual organisms making them were at enough of a disadvantage to be eliminated by natural selection. It is not surprising, then, that cells contain whole sets of structurally related polypeptide chains that have a common ancestry but different functions.

New Proteins Can Evolve by Recombining Preexisting Polypeptide Domains 26

Once a number of stable protein surfaces have been made in a cell, new surfaces with different binding properties can be generated by joining two or more proteins together by noncovalent interactions between them, producing a protein complex. This combining of proteins to make larger, functional protein assemblies is common. Many protein complexes have molecular weights of a million or more, even though an average polypeptide chain has a molecular weight of 40,000 (about 300 to 400 amino acids), and relatively few polypeptide chains are more than three times this size.
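The relationship between molecular weight and chain length quoted here can be checked with a rough calculation. The sketch below assumes an average mass of about 110 daltons per amino acid residue, a common approximation that is not stated in the text, and the function names are illustrative.

```python
import math

AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed average mass of one residue, in daltons


def estimated_residue_count(molecular_weight_da):
    """Rough number of residues in a chain of the given molecular weight."""
    return molecular_weight_da / AVERAGE_RESIDUE_MASS_DA


def chains_needed(complex_weight_da, chain_weight_da):
    """Minimum number of chains of a given size needed to reach a complex weight."""
    return math.ceil(complex_weight_da / chain_weight_da)


print(round(estimated_residue_count(40_000)))   # about 364 residues
print(chains_needed(1_000_000, 40_000))         # 25 average-size chains
```

On these assumptions, a complex with a molecular weight of a million would contain on the order of 25 average-size polypeptide chains.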

Figure 3-41. The evolution of new ligand-binding sites

The general principle by which the juxtaposition of separate protein surfaces in the course of evolution has given rise to proteins that contain new binding sites for other molecules (ligands; see p. 129). As indicated here, the ligand-binding sites often lie at the interface between two protein domains and are formed from loop regions on the protein surface (see also Figure 3-42).

Figure 3-42. The structure of the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase

The protein is composed of two domains, each shown in a different color, with regions of α helix represented by cylinders and regions of β sheet represented by arrows. The details of the reaction catalyzed by the enzyme are shown in Figure 2-22. Note that the three bound substrates lie at an interface between the two domains. (Courtesy of Alan J. Wonacott.)

An alternative way of making a new protein from existing chains is to join the corresponding DNA sequences to make a gene that encodes a single large polypeptide chain. Proteins in which different parts of the polypeptide chain fold independently into separate globular domains are believed to have evolved in this way, perhaps after existing for a prolonged period as a protein complex formed from separate polypeptides. Many proteins have such "multidomain" structures, and, as might be expected from the evolutionary considerations discussed above, the binding sites for substrate molecules frequently lie where the separate domains are juxtaposed (Figure 3-41). Thus, for the multidomain protein whose three-dimensional structure is shown in Figure 3-42, a protein surface on one domain that binds NAD+ was apparently combined with a surface on a second domain that binds a sugar, as part of the process of evolving an active site that uses the NAD+ to catalyze sugar oxidation.

Another way of reutilizing an amino acid sequence is especially widespread among long fibrous proteins such as collagen (see Figure 3-32). In these cases a structure is formed from multiple internal repeats of an ancestral amino acid sequence. Putting together amino acid sequences by joining preexisting coding DNA sequences is clearly a much more efficient strategy for a cell than the alternative of deriving new protein sequences from scratch by random DNA mutation.

Structural Homologies Can Help Assign Functions to Newly Discovered Proteins 27

Figure 3-43. Domain shuffling

An extensive shuffling of blocks of protein sequence (protein modules) has occurred during the evolution of proteins. Those portions of a protein denoted by the same shape and color are evolutionarily related but not identical. (A) The bacterial catabolite gene activator protein (CAP) contains one domain (blue triangle) that binds a specific DNA sequence and a second domain (red rectangle) that binds cyclic AMP (see Figure 3-35). The DNA-binding domain here is related to the DNA-binding domains of many other gene regulatory proteins, including the lac repressor and cro repressor proteins. In addition, two copies of the cyclic-AMP-binding domain are found in eucaryotic protein kinases regulated by the binding of cyclic nucleotides. (B) Serine proteases like chymotrypsin are formed from two domains (brown). In some related proteases that are highly regulated and more specialized, the two protease domains are connected to one or more domains homologous to domains found in epidermal growth factor (green hexagon), to a calcium-binding protein (yellow triangle), or to a "kringle" domain (blue square) that contains three internal disulfide bridges.

The development of techniques for rapidly sequencing DNA molecules has made it possible to determine the amino acid sequences of many thousands of proteins from the nucleotide sequences of their genes. A rapidly enlarging protein data base is therefore available that biologists routinely scan by computer to search for possible sequence homologies between a newly sequenced protein and previously studied ones. Although sequences have so far been determined for only a few percent of the proteins in eucaryotic organisms, it is common to find that a newly sequenced protein is homologous to some other, known protein over part of its length, indicating that most proteins may have descended from relatively few ancestral types. As expected, the sequences of many large proteins often show signs of having evolved by the joining of preexisting domains in new combinations - a process called domain shuffling (Figure 3-43).

These protein comparisons are important because related structures often imply related functions. Many years of experimentation can be saved by discovering an amino acid sequence homology with a protein of known function. Such sequence homologies, for example, first indicated that certain cell-cycle regulatory genes in yeast cells and certain genes that cause mammalian cells to become cancerous are protein kinases. In the same way many of the proteins that control pattern formation in the fruit fly Drosophila were recognized to be gene regulatory proteins, while another protein involved in pattern formation was identified as a serine protease.

The discovery of domain homologies can also be useful in another way. It is much more difficult to determine the three-dimensional structure of a protein than to determine its amino acid sequence. But the conformation of a newly sequenced protein domain can be guessed if it is homologous to a domain of a protein whose conformation has already been determined by x-ray diffraction analysis. By assuming that the twists and turns of the polypeptide chain will be conserved in the two proteins despite the presence of discrepancies in amino acid sequence, one can often sketch the structure of the new protein with reasonable accuracy (see Figure 3-40).

Many new protein sequences are being added to the data base each year, each one increasing the chance of finding useful homologies. Protein-sequence comparisons have therefore become a very important tool in cell biology.
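The kind of computer scan described above, in which a new protein matches a known one over only part of its length, can be illustrated with a local alignment score. The following is a minimal sketch of the Smith-Waterman recurrence, not the method used by any particular data base search program. The scoring values (match, mismatch, gap) and the two short sequences are hypothetical and chosen only for illustration; real searches typically use amino acid substitution matrices and faster heuristics.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between sequences a and b.

    The scoring parameters are illustrative assumptions, not values taken
    from the text.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, prev[j - 1] + pair, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two hypothetical sequences that share only the internal stretch "DEFGH".
new_protein = "ACDEFGHIK"
known_protein = "XXDEFGHYY"
print(local_alignment_score(new_protein, known_protein))
```

A high score relative to what unrelated sequences give suggests a shared region worth inspecting; deciding whether that similarity reflects true homology requires statistical criteria that this sketch does not attempt.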

Protein Subunits Can Assemble into Large Structures 28

The same principles that enable several protein domains to associate to form binding sites for small molecules operate to generate much larger structures in the cell. Supramolecular structures such as enzyme complexes, ribosomes, protein filaments, viruses, and membranes are not made as single, giant, covalently linked molecules; instead they are formed by the noncovalent assembly of many preformed molecules, which are called subunits of the final structure.

There are several advantages to the use of smaller subunits to build larger structures: (1) building a large structure from one or a few repeating smaller subunits reduces the amount of genetic information required; (2) both assembly and disassembly can be readily controlled, since the subunits associate through multiple bonds of relatively low energy; and (3) errors in the synthesis of the structure can be more easily avoided, since correction mechanisms can operate during the course of assembly to exclude malformed subunits.

A Single Type of Protein Subunit Can Interact with Itself to Form Geometrically Regular Assemblies 29

Figure 3-44. The formation of a dimer from a single type of protein subunit

A protein with a binding site that recognizes itself will often form symmetrical dimers. These may then pair with other subunits to form tetramers and larger assemblies (not shown).

Figure 3-45. Ribbon model of a dimer formed from two identical protein subunits (monomers)

The protein shown is the bacterial catabolite gene activator protein (CAP) illustrated previously in Figure 3-35. (Courtesy of Jane Richardson.)

If a protein has a binding site that is complementary to a region of its own surface, it will assemble spontaneously to form a larger structure. In the simplest case, a binding site recognizes itself and forms a symmetrical dimer. Many enzymes and other proteins form dimers of this kind, which frequently act as subunits in the formation of larger assemblies (Figures 3-44 and 3-45).

Figure 3-46. Rings or helices can form if a single type of protein subunit interacts with itself repeatedly

The formation of a helix was illustrated in Figure 3-5; a ring forms instead of a helix if the subunits run into one another, stopping further growth of the chain.

Figure 3-47. An actin filament

There are about two globular protein subunits per turn in this important filament, which is discussed in detail in Chapter 16.

Figure 3-49. Hexagonally packed globular protein subunits can form either a flat sheet or a tube

If the binding site of a protein is complementary to a region of its surface that does not include the binding site itself, a chain of subunits will be formed. For certain special orientations of the two binding sites, the chain will soon run into itself and terminate, forming a closed ring of subunits (Figure 3-46). More commonly, an extended polymer of subunits will result, and provided that each subunit is bound to its neighbor in an identical way, the subunits in the polymer will be arranged in a helix that can be extended indefinitely (see Figure 3-5). An actin filament, for example, is a helical structure formed from a single globular protein subunit called actin; actin filaments are major components in the cytosol of most eucaryotic cells (Figure 3-47). As we discuss below, globular proteins may also associate with like neighbors to form extended sheets or tubes (see Figure 3-49).
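The geometric point made above, that identical neighbor-to-neighbor contacts produce either a helix or a closed ring, can be sketched numerically. In this simplified model each subunit is reduced to a point, and every subunit is placed by applying the same rotation about an axis and the same rise along it to the previous one. The radius, angle, and rise values are hypothetical and do not describe any real protein.

```python
import math


def subunit_positions(n: int, radius: float, turn_deg: float, rise: float):
    """Centers of n subunits related by one repeated rotation-plus-rise step."""
    positions = []
    for i in range(n):
        angle = math.radians(i * turn_deg)
        positions.append((radius * math.cos(angle),
                          radius * math.sin(angle),
                          i * rise))
    return positions


# Hypothetical helix: 100 degrees of rotation and 1.5 units of rise per subunit.
helix = subunit_positions(10, radius=2.0, turn_deg=100.0, rise=1.5)
spacings = [math.dist(helix[i], helix[i + 1]) for i in range(len(helix) - 1)]
print(spacings)  # every neighbor pair is the same distance apart

# Hypothetical ring: no rise and 60 degrees per subunit, so a seventh subunit
# would land on top of the first and growth stops at six.
ring = subunit_positions(7, radius=2.0, turn_deg=60.0, rise=0.0)
print(math.dist(ring[0], ring[6]))  # effectively zero
```

With any nonzero rise the chain never returns to its starting point, which is why such a polymer can in principle be extended indefinitely.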

Coiled-Coil Proteins Help Build Many Elongated Structures in Cells 30

Figure 3-48. The structure of a coiled-coil

In (A) a single α helix is shown, with successive amino acid side chains labeled in a sevenfold sequence "abcdefg" (from bottom to top). Amino acids "a" and "d" in such a sequence lie close together on the cylinder surface, forming a "stripe" (shaded in red) that winds slowly around the α helix. Proteins that form coiled-coils typically have hydrophobic amino acids at positions "a" and "d." Consequently, as shown in (B), the two α helices can wrap around each other with the hydrophobic side chains of one α helix interacting with the hydrophobic side chains of the other, while the more hydrophilic amino acid side chains are left exposed to the aqueous environment. (C) The atomic structure of a coiled-coil determined by x-ray crystallography. The red side chains are hydrophobic. (C, from T. Alber, Curr. Opin. Genet. Devel. 2:205-210, 1992. © Current Science.)

Where mechanical strength is of major importance, supramolecular assemblies are usually made from fibrous rather than globular subunits. Such assemblies can be stabilized by extensive regions of protein-protein contact when the subunits are wound around one another as a multistranded helix. A particularly stable structural unit that is used repeatedly for this purpose is known as the coiled-coil. It forms by the pairing of two α-helical subunits that have a repeating arrangement of nonpolar side chains. The two α-helical subunits are usually identical and run in parallel (that is, in the same direction from amino to carboxyl terminal). They coil gradually around each other to produce a stiff filament with a diameter of about 2 nm (Figure 3-48). Whereas short coiled-coils serve as dimerization domains in several families of gene regulatory proteins, more commonly a coiled-coil will extend for more than 100 nm and serve as a building block for a large fibrous structure, such as the thick filaments in a muscle cell.
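The sevenfold "abcdefg" labeling described for Figure 3-48 lends itself to a simple check of whether a sequence places hydrophobic residues at positions "a" and "d." The sketch below assumes one-letter amino acid codes, an illustrative choice of which residues count as hydrophobic, and a made-up peptide; none of these come from the text, and a real coiled-coil prediction would weigh many more features.

```python
# Assumption: an illustrative set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVLIMFW")
HEPTAD = "abcdefg"


def heptad_register(sequence: str, start: str = "a"):
    """Pair each residue with its heptad position, beginning at `start`."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_hydrophobic_fraction(sequence: str, start: str = "a") -> float:
    """Fraction of 'a' and 'd' positions occupied by hydrophobic residues."""
    core = [res for res, pos in heptad_register(sequence, start) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


# Hypothetical peptide: three repeats with leucine at positions a and d.
peptide = "LKELEDK" * 3
print(core_hydrophobic_fraction(peptide))             # register starting at "a"
print(core_hydrophobic_fraction(peptide, start="b"))  # shifted register
```

Comparing the fraction across the seven possible starting registers shows which one, if any, lines up a hydrophobic stripe of the kind drawn in Figure 3-48A.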

Proteins Can Assemble into Sheets, Tubes, or Spheres 31

Some protein subunits assemble into flat sheets in which the subunits are arranged in hexagonal arrays. Specialized membrane proteins are sometimes arranged in this way in lipid bilayers. With a slight change in the geometry of the individual subunits, a hexagonal sheet can be converted into a tube (Figure 3-49) or, with more changes, into a hollow sphere. Protein tubes and spheres that bind specific RNA and DNA molecules form the coats of viruses.

Figure 3-50. The structure of a spherical virus

In many viruses, identical protein subunits pack together to create a spherical shell (a capsid) that encloses the viral genome, composed of either RNA or DNA (see Figure 6-72). For geometric reasons, no more than 60 identical subunits can pack together in a precisely symmetrical way. If slight irregularities are allowed, however, more subunits can be used to produce a larger capsid. The tomato bushy stunt virus (TBSV) shown here, for example, is a spherical virus about 33 nm in diameter that is formed from 180 identical copies of a 386 amino acid capsid protein plus an RNA genome of 4500 nucleotides. To form such a large capsid, the protein must be able to fit into three somewhat different environments, each of which is differently colored in the particle shown here. The postulated pathway of assembly is shown; the precise three-dimensional structure has been determined by x-ray diffraction. (Courtesy of Steve Harrison.)

The formation of closed structures, such as rings, tubes, or spheres, provides additional stability because it increases the number of bonds that can form between the protein subunits. Moreover, because such a structure is formed by mutually dependent, cooperative interactions between subunits, it can be driven to assemble or disassemble by a relatively small change that affects the subunits individually. These principles are dramatically illustrated in the protein capsid of many simple viruses, which takes the form of a hollow sphere. These coats are often made of hundreds of identical protein subunits that enclose and protect the viral nucleic acid (Figure 3-50). The protein in such a capsid must have a particularly adaptable structure, since it must make several different kinds of contacts and also change its arrangement to let the nucleic acid out to initiate viral replication once the virus has entered a cell.

Many Structures in Cells Are Capable of Self-assembly 32

Figure 3-51. The structure of tobacco mosaic virus (TMV)

(A) Electron micrograph of a tobacco mosaic virus (TMV), which consists of a single long RNA molecule enclosed in a cylindrical protein coat composed of a tight helical array of identical protein subunits. (B) A model showing part of the structure of TMV. A single-stranded RNA molecule of 6000 nucleotides is packaged in a helical coat constructed from 2130 copies of a coat protein 158 amino acids long. Fully infective virus particles can self-assemble in a test tube from purified RNA and protein molecules. (A, courtesy of Robley Williams; B, courtesy of Richard J. Feldmann.)

The information for forming many of the complex assemblies of macromolecules in cells must be contained in the subunits themselves, since under appropriate conditions the isolated subunits can spontaneously assemble in a test tube into the final structure. The first large macromolecular aggregate shown to be capable of self-assembly from its component parts was tobacco mosaic virus (TMV). This virus is a long rod in which a cylinder of protein is arranged around a helical RNA core (Figure 3-51). If the dissociated RNA and protein subunits are mixed together in solution, they recombine to form fully active virus particles. The assembly process is unexpectedly complex and includes the formation of double rings of protein, which serve as intermediates that add to the growing virus coat.
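The TMV figures quoted in the legend to Figure 3-51 make it possible to put rough numbers on the first advantage of subunit construction listed earlier, the saving in genetic information. The arithmetic below is a back-of-the-envelope sketch: it assumes three nucleotides per amino acid and ignores start and stop signals and any other noncoding sequence.

```python
# Figures quoted in the legend to Figure 3-51.
RNA_NUCLEOTIDES = 6000
COAT_SUBUNITS = 2130
RESIDUES_PER_SUBUNIT = 158

NUCLEOTIDES_PER_CODON = 3  # assumption: triplet code, no noncoding sequence


def coding_nucleotides(residues: int) -> int:
    """Nucleotides needed to encode a polypeptide of the given length."""
    return residues * NUCLEOTIDES_PER_CODON


# One coat protein gene, reused for every subunit in the coat.
one_subunit_gene = coding_nucleotides(RESIDUES_PER_SUBUNIT)

# Hypothetical alternative: the whole coat encoded as one giant polypeptide.
single_chain_coat = coding_nucleotides(RESIDUES_PER_SUBUNIT * COAT_SUBUNITS)

# Average length of RNA covered by each coat subunit.
rna_per_subunit = RNA_NUCLEOTIDES / COAT_SUBUNITS

print(one_subunit_gene)           # nucleotides for the repeated subunit
print(single_chain_coat)          # nucleotides for a hypothetical one-piece coat
print(round(rna_per_subunit, 2))  # nucleotides of RNA per subunit
```

On these assumptions a one-piece coat would need far more coding sequence than the entire viral RNA contains, whereas the repeated subunit takes up only a small fraction of it.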

Another complex macromolecular aggregate that can reassemble from its component parts is the bacterial ribosome. These ribosomes are composed of about 55 different protein molecules and 3 different rRNA molecules. If the individual components are incubated under appropriate conditions in a test tube, they spontaneously re-form the original structure. Most important, such reconstituted ribosomes are able to carry out protein synthesis. As might be expected, the reassembly of ribosomes follows a specific pathway: certain proteins first bind to the RNA, and this complex is then recognized by other proteins, and so on until the structure is complete.

Figure 3-52. Three ways in which a large protein assembly can be made to a fixed length

(A) Coassembly along an elongated core protein or other macromolecule that acts as a measuring device. (B) Termination of assembly because of strain that accumulates in the polymeric structure as additional subunits are added, so that beyond a certain length the energy required to fit another subunit onto the chain becomes excessively large. (C) A vernier type of assembly, in which two sets of rodlike molecules differing in length form a staggered complex that grows until their ends exactly match.

Figure 3-53. Electron micrograph of bacteriophage lambda

The tip of the virus tail attaches to a specific protein on the surface of a bacterial cell, following which the tightly packaged DNA in the head is injected through the tail into the cell. The tail has a precise length, which is determined by the mechanism shown in Figure 3-52A.

It is still not clear how some of the more elaborate self-assembly processes are regulated. Many structures in the cell, for example, appear to have a precisely defined length that is many times greater than that of their component macromolecules. How such length determination is achieved is in most cases a mystery. Three possible mechanisms are illustrated in Figure 3-52. In the simplest case a long core protein or other macromolecule provides a scaffold that determines the extent of the final assembly. This is the mechanism that determines the length of the TMV particle, where the RNA chain provides the core. Similarly, a core protein is thought to determine the length of the thin filaments in muscle, as well as the long tails of some bacterial viruses (Figure 3-53).
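The vernier mechanism of Figure 3-52C has a simple arithmetic core. If two kinds of rod are laid end to end in parallel rows starting from a common point, their ends first coincide at the least common multiple of the two rod lengths. The sketch below is an idealization that assumes rigid rods with whole-number lengths in arbitrary units; the lengths used are hypothetical.

```python
from math import gcd


def vernier_match(length_a: int, length_b: int) -> tuple[int, int, int]:
    """Return (assembly_length, rods_of_a, rods_of_b) at the first exact match.

    Idealized model: rods are rigid and lengths are positive integers.
    """
    if length_a <= 0 or length_b <= 0:
        raise ValueError("rod lengths must be positive")
    total = length_a * length_b // gcd(length_a, length_b)  # least common multiple
    return total, total // length_a, total // length_b


# Hypothetical rods of 12 and 15 units.
print(vernier_match(12, 15))
```

The final structure is several times longer than either component, which is the property the text highlights: a defined length that is many times greater than that of the component macromolecules.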

Not All Biological Structures Form by Self-assembly 33

Figure 3-54. The polypeptide hormone insulin cannot spontaneously re-form if its disulfide bonds are disrupted

It is synthesized as a larger protein (proinsulin) that is cleaved by a proteolytic enzyme after the protein chain has folded into a specific shape. Excision of part of the proinsulin polypeptide chain causes an irretrievable loss of the information needed for the protein to fold spontaneously into its normal conformation.

Some cellular structures held together by noncovalent bonds are not capable of self-assembly. A mitochondrion, a cilium, or a myofibril, for example, cannot form spontaneously from a solution of their component macromolecules because part of the information for their assembly is provided by special enzymes and other cellular proteins that perform the function of jigs or templates but do not appear in the final assembled structure. Even small structures may lack some of the ingredients necessary for their own assembly. In the formation of some bacterial viruses, for example, the head structure, which is composed of a single protein subunit, is assembled on a temporary scaffold composed of a second protein. The second protein is absent from the final virus particle, and so the head structure cannot spontaneously reassemble once it is taken apart. Other examples are known in which proteolytic cleavage is an essential and irreversible step in the assembly process. This is the case for the coats of some bacterial viruses and even for some simple protein assemblies, including the structural protein collagen and the hormone insulin (Figure 3-54). From these relatively simple examples, it seems very likely that the assembly of a structure as complex as a mitochondrion or a cilium will involve both temporal and spatial ordering imparted by other cellular components, as well as irreversible processing steps catalyzed by degradative enzymes.

Summary

The three-dimensional conformation of a protein molecule is determined by its amino acid sequence. The folded structure is stabilized by noncovalent interactions between different parts of the polypeptide chain. The amino acids with hydrophobic side chains tend to cluster in the interior of the molecule, and local hydrogen-bond interactions between neighboring peptide bonds give rise to α helices and β sheets. Globular regions known as domains are the modular units from which many proteins are constructed; small proteins typically contain only a single domain, while large proteins contain several domains linked together by short lengths of polypeptide chain. As proteins evolved, domains were modified and combined with other domains to construct new proteins.

Proteins are brought together into larger structures by the same noncovalent forces that determine protein folding. Proteins with binding sites for their own surface can assemble into dimers, closed rings, spherical shells, or helical polymers. Although mixtures of proteins and nucleic acids can assemble spontaneously into complex structures in the test tube, many assembly processes involve irreversible steps. Consequently, not all structures in the cell are capable of spontaneous reassembly after they are dissociated into their component parts.

Proteins as Catalysts 34

Introduction

The chemical properties of a protein molecule depend almost entirely on its exposed surface residues, which are able to form weak, noncovalent bonds with other molecules. When a protein molecule binds to another molecule, the second molecule is commonly referred to as a ligand. Because an effective interaction between a protein molecule and a ligand requires that many weak bonds be formed simultaneously between them, the only ligands that can bind tightly to a protein are those that fit precisely onto its surface.

Figure 3-55. The ligand-binding site of the catabolite gene activator protein (CAP)

Hydrogen bonding between CAP and its ligand, cyclic AMP (green), was determined by x-ray crystallographic analysis of the complex. As indicated, the two identical subunits of the dimer cooperate to form this binding site (see also Figure 3-45). (Courtesy of Tom Steitz.)

The region of a protein that associates with a ligand, known as its binding site, usually consists of a cavity formed by a specific arrangement of amino acids on the protein surface. These amino acids often belong to widely separated regions of the polypeptide chain (Figure 3-55), and they represent only a minor fraction of the total amino acids present. The rest of the protein molecule is presumably necessary to maintain the polypeptide chain in the correct position and to provide additional binding sites for regulatory purposes; the interior of the protein is often important only insofar as it gives the surface of the molecule the appropriate shape and rigidity.

A Protein's Conformation Determines Its Chemistry 20

Neighboring surface residues on a protein often interact in a way that alters the chemical reactivity of selected amino acid side chains. These interactions are of several types.

Figure 3-56. Competition for hydrogen bonding

The ability of water molecules to make favorable hydrogen bonds with groups on the protein surface greatly reduces the tendency of these groups to pair with each other.

First, neighboring parts of the polypeptide chain may interact in a way that restricts the access of water molecules to other parts of the protein surface. Because water molecules tend to form hydrogen bonds, they compete with ligands for selected side chains on the protein surface (Figure 3-56). The tightness of hydrogen bonds (and ionic interactions) between proteins and their ligands is therefore greatly increased if water molecules are excluded. At first sight it is hard to imagine a mechanism that would exclude a molecule as small as water from a protein surface without affecting the access of the ligand itself. Because of their strong tendency for hydrogen bonding, however, water molecules exist in a large hydrogen-bonded network (see Panel 2-1, pp. 48-49), and it is often energetically unfavorable for individual molecules to break away from this network to reach into a crevice on the protein surface.

Figure 3-57. An unusually reactive amino acid at the active site of an enzyme

The example shown is the "catalytic triad" found in chymotrypsin, elastase, and other serine proteases (see Figure 3-39). The aspartic acid side chain induces the histidine to remove the proton from serine 195; this activates the serine to form a covalent bond with the enzyme substrate, hydrolyzing a peptide bond as illustrated later in Figure 3-64.

Second, the clustering of neighboring polar amino acid side chains can alter their reactivity. If a number of negatively charged side chains are forced together against their mutual repulsion by the way the protein folds, for example, the affinity of the site for a positively charged ion is greatly increased. Selected amino acid side chains can also interact with one another through hydrogen bonds, which can activate normally unreactive side groups (such as the CH2OH on the serine shown in Figure 3-57) so that they are able to enter into reactions that make or break selected covalent bonds.

The surface of each protein molecule therefore has a unique chemical reactivity that depends not only on which amino acid side chains are exposed, but also on their exact orientation relative to one another. For this reason even two slightly different conformations of the same protein molecule may differ greatly in their chemistry.

Figure 3-58. Coenzymes

Coenzymes, such as thiamine pyrophosphate (TPP), shown here in gray, are small molecules that bind to an enzyme's surface and enable it to catalyze specific reactions. The reactivity of TPP depends on its "acidic" carbon atom, which readily exchanges its hydrogen atom for a carbon atom of a substrate molecule. Other regions of the TPP molecule act as "handles" by which the enzyme holds the coenzyme in the correct position. Coenzymes presumably evolved first in an "RNA world," where they were bound to RNA molecules to help with catalysis (discussed in Chapter 1).

Figure 3-59. Computer-generated space-filling models of two enzymes

In (A) cytochrome c is shown with its bound heme coenzyme. In (B) egg-white lysozyme is shown with a bound oligosaccharide substrate. In both cases the bound ligand is red. (Courtesy of Richard J. Feldmann.)

Where side-chain reactivities are insufficient, proteins often enlist the help of selected nonpolypeptide molecules that the proteins bind to their surface. These ligands serve as coenzymes in enzyme-catalyzed reactions, and they may be so tightly bound to the protein that they are effectively part of the protein itself. Examples are the iron-containing hemes in hemoglobin and cytochromes, thiamine pyrophosphate in enzymes involved in aldehyde-group transfers, and biotin in enzymes involved in carboxyl-group transfers. Most coenzymes are very complex organic molecules that have been selected for the unique chemical reactivity they acquire when bound to a protein surface. Besides its reactive center such a coenzyme has other residues designed to bind it to its host protein (Figure 3-58). A space-filling model of an enzyme bound to a coenzyme is shown in Figure 3-59A.

Substrate Binding Is the First Step in Enzyme Catalysis 35

Figure 3-60. Enzyme kinetics

The rate of an enzyme reaction (V) increases as the substrate concentration increases until a maximum value (Vmax) is reached. At this point all substrate-binding sites on the enzyme molecules are fully occupied, and the rate of reaction is limited by the rate of the catalytic process on the enzyme surface. For most enzymes the concentration of substrate at which the reaction rate is half-maximal (KM) is a measure of how tightly the substrate is bound, with a large value of KM corresponding to weak binding.

One of the most important functions of proteins is to act as enzymes that catalyze specific chemical reactions. The ligand in this case is called a substrate molecule, and the binding of the substrate to the enzyme is an essential prelude to the chemical reaction (see Figure 3-59B). If we denote the enzyme by E, the substrate by S, and the product by P, the basic reaction path is E + S ⇌ ES ⇌ EP ⇌ E + P. From this simple outline of an enzyme-catalyzed reaction, we see that there is a limit to the amount of substrate that a single enzyme molecule can process in a given time. If the concentration of substrate is increased, the rate at which product is formed also increases, up to a maximum value (Figure 3-60). At that point the enzyme molecule is saturated with substrate and the rate of reaction (denoted Vmax) depends only on how rapidly the substrate molecule can be processed. This rate divided by the enzyme concentration is called the turnover number. The turnover number is often about 1000 substrate molecules processed per second per enzyme molecule, but it can be much greater in extreme cases.

The other kinetic parameter frequently used to characterize an enzyme is its KM, which is the substrate concentration that allows the reaction to proceed at one-half its maximum rate (see Figure 3-60). A low KM value means that the enzyme reaches its maximum catalytic rate at a low concentration of substrate and generally indicates that the enzyme binds its substrate very tightly.
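The saturation behavior described above is commonly summarized by the Michaelis-Menten rate law, V = Vmax[S]/(KM + [S]). The text does not state this equation, so it is introduced here as a standard model consistent with the definitions of Vmax and KM given above. The following sketch evaluates it together with the turnover number; all numerical values are hypothetical.

```python
def reaction_rate(substrate: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate at a given substrate concentration.

    substrate and km share one concentration unit; the result has the
    units of vmax.
    """
    return vmax * substrate / (km + substrate)


def turnover_number(vmax: float, enzyme_conc: float) -> float:
    """Maximum rate divided by the enzyme concentration, as defined in the text."""
    return vmax / enzyme_conc


# Hypothetical enzyme: Vmax of 10 (arbitrary rate units) and KM of 5 (arbitrary
# concentration units).
for s in (0.5, 5.0, 50.0, 5000.0):
    print(s, reaction_rate(s, vmax=10.0, km=5.0))

# Hypothetical: Vmax of 1e-3 mol per liter per second with 1e-6 mol per liter
# of enzyme gives about 1000 substrate molecules per second per enzyme molecule.
print(turnover_number(1.0e-3, 1.0e-6))
```

When the substrate concentration equals KM the computed rate is exactly half of Vmax, and at concentrations far above KM the rate approaches Vmax without exceeding it, matching the curve in Figure 3-60.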

Enzymes Speed Reactions by Selectively Stabilizing Transition States 36

Figure 3-61. Enzymes accelerate chemical reactions by decreasing the activation energy

Often both the uncatalyzed reaction (A) and the enzyme-catalyzed reaction (B) go through several transition states. It is the transition state with the highest energy (Sᵀ and ESᵀ) that determines the activation energy and limits the rate of the reaction. (S = substrate; P = product of the reaction.)

Extremely high rates of chemical reaction are achieved by enzymes - far higher than for any synthetic catalysts. This efficiency is attributable to several factors. The enzyme serves, first, to increase the local concentration of substrate molecules at the catalytic site and to hold all of the appropriate atoms in the correct orientation for the reaction that is to follow. More important, however, some of the binding energy contributes directly to the catalysis. Substrate molecules pass through a series of intermediate forms of altered geometry and electron distribution before they form the ultimate products of the reaction, and the free energies of these intermediate forms - especially of those in the most unstable transition states - are the major determinants of the rate of reaction. Enzymes have a much greater affinity for these transition states of the substrate than they have for the stable forms. Because this binding interaction lowers the energies of crucial transition states, the enzyme greatly accelerates one particular reaction (Figure 3-61).

Figure 3-62. Catalytic antibodies

The stabilization of a transition state by an antibody creates an enzyme. (A) The reaction path for hydrolysis of an amide bond goes through a tetrahedral intermediate, which is the high-energy transition state for the reaction. (B) The molecule shown on the left was covalently linked to a protein and used as an antigen to generate an antibody that binds tightly to the region of the molecule shown in yellow. Because this antibody also bound tightly to the transition state in (A), it was found to function as an enzyme that efficiently catalyzed the hydrolysis of the amide bond in the molecule shown on the right.

A dramatic demonstration of how stabilizing a transition state can greatly increase reaction rates is provided by the intentional production of antibodies that act like enzymes. Consider, for example, the hydrolysis of an amide bond, which is similar to the peptide bond that joins adjacent amino acids in a protein. In an aqueous solution an amide bond hydrolyzes very slowly by the mechanism illustrated in Figure 3-62A. In the central intermediate, or transition state, the carbonyl carbon is bonded to four atoms that are arranged at the corners of a tetrahedron. By generating monoclonal antibodies that bind tightly to a stable analogue of this very unstable tetrahedral intermediate, as illustrated in Figure 3-62B, an antibody that functions like an enzyme can be obtained. This catalytic antibody binds to and stabilizes the tetrahedral intermediate and thereby increases the spontaneous rate of amide-bond hydrolysis more than 10,000-fold.
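To get a feeling for how much transition-state stabilization a 10,000-fold acceleration represents, one can use the standard approximation that a reaction rate is proportional to exp(-ΔG‡/RT), where ΔG‡ is the activation energy. This relation is not derived in the text; it is assumed here, along with a temperature of 310 K and the simplification that the catalyst changes nothing except the height of the energy barrier.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def rate_enhancement(stabilization_kj_per_mol: float,
                     temperature_k: float = 310.0) -> float:
    """Fold increase in rate when the barrier is lowered by the given amount.

    Assumes rate is proportional to exp(-activation_energy / RT).
    """
    return math.exp(stabilization_kj_per_mol * 1000.0 / (R * temperature_k))


def stabilization_for(fold: float, temperature_k: float = 310.0) -> float:
    """Barrier lowering (kJ/mol) that corresponds to a given fold increase."""
    return R * temperature_k * math.log(fold) / 1000.0


# The 10,000-fold figure quoted for the catalytic antibody.
print(round(stabilization_for(1.0e4), 1))  # kJ/mol, under the stated assumptions
```

Under these assumptions the required stabilization is on the order of a few tens of kilojoules per mole, which is comparable to the energy of a small number of the weak noncovalent bonds discussed earlier in the chapter.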

Enzymes Can Promote the Making and Breaking of Covalent Bonds Through Simultaneous Acid and Base Catalysis 37

Figure 3-63. Acid catalysis and base catalysis

(A) The start of the uncatalyzed reaction shown in Figure 3-62A is diagrammed, with blue shading as a schematic indicator of electron distribution in the water and carbonyl bonds. (B) An acid likes to donate a proton (H+) to other atoms. By pairing with the carbonyl oxygen, an acid causes electrons to move away from the carbonyl carbon, making this atom much more attractive to the electronegative oxygen of an attacking water molecule. (C) A base likes to take up H+; by pairing with a hydrogen of the attacking water molecule, a base causes electrons to move toward the water oxygen, making it a better attacking group for the carbonyl carbon. (D) By having appropriately positioned atoms on its surface, an enzyme can carry out both acid catalysis and base catalysis at the same time.

Enzymes are better catalysts than catalytic antibodies. In addition to binding tightly to the transition state, the active site of an enzyme contains precisely positioned atoms that speed up the reaction by altering the distribution of electrons in those atoms involved in the making and breaking of covalent bonds. Peptide bonds, for example, can be hydrolyzed in the absence of an enzyme by exposing a polypeptide to either a strong acid or a strong base, as explained in Figure 3-63B and C. Enzymes are unique, however, in being able to use acid and base catalysis simultaneously, since the acidic and basic residues required are prevented from combining with each other (as they would do in solution) by being tied to the rigid framework of the protein itself (Figure 3-63D).

The fit between an enzyme and its substrate needs to be precise. A small change introduced by genetic engineering in the active site of an enzyme can have a profound effect. Replacing a glutamic acid with an aspartic acid in one enzyme, for example, shifts the position of the catalytic carboxylate ion by only 1 Å (about the radius of a hydrogen atom), and yet this is enough to reduce the activity of the enzyme a thousandfold.

Enzymes Can Further Increase Reaction Rates by Forming Covalent Intermediates with Their Substrates 38

In addition to the above roles, many enzymes further speed the reaction they catalyze by interacting covalently with one of their substrates, thereby temporarily attaching the substrate to an amino acid or to a coenzyme molecule. Generally, one substrate enters the binding site, becomes covalently bound, and then reacts with a second molecule on the enzyme surface that breaks the covalent attachment just made. At the end of each reaction cycle, the free enzyme is regenerated.

Figure 3-64. Some enzymes form covalent bonds with their substrates.

In the example shown here, a carbonyl group in a polypeptide chain (shown in green) forms a covalent bond with a specially activated serine residue (see Figure 3-57) of a serine protease (shown in gray), which cleaves the polypeptide chain. When the unbound portion of the polypeptide chain has diffused away, a second step occurs in which a water molecule hydrolyzes the newly formed covalent bond, thereby releasing the portion of the polypeptide bound to the enzyme surface and freeing the serine for another cycle of reaction. Note that two unstable tetrahedral intermediates (shaded in yellow) serve as transition states in this reaction and both are stabilized by the enzyme.

Consider, for example, the mechanism of action of the serine proteases. The reaction they catalyze, the hydrolysis of a peptide bond, is greatly accelerated by the enzymes' affinity for the tetrahedral intermediate of the reaction. But a serine protease does more than a typical catalytic antibody: instead of waiting for an oxygen from a water molecule to attack the carbonyl carbon, it makes the reaction go much more quickly by first using a precisely positioned amino acid side chain for this purpose (the activated serine in Figure 3-57). This step breaks the peptide bond, but it leaves the enzyme covalently linked to the carboxyl group. Then, in a rapid second step, this covalent intermediate is destroyed by the enzyme-catalyzed addition of water, completing the reaction and regenerating the free enzyme (Figure 3-64). Even though this two-step reaction is less direct than a one-step reaction (in which water is added to the peptide bond), it is faster because each step has a relatively low activation energy.
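Why two low barriers can beat one high barrier is easy to see with a rough numerical sketch. The activation energies below are hypothetical values chosen only for illustration (they are not measurements for any serine protease), and the sketch assumes a simple Arrhenius dependence with the same prefactor for every step.

```python
import math

RT = 0.616  # kcal/mol at about 37 degrees C (R = 1.987e-3 kcal/(mol K), T = 310 K)


def relative_rate(activation_energy_kcal):
    """Arrhenius factor exp(-Ea/RT); the prefactor is assumed equal for all steps."""
    return math.exp(-activation_energy_kcal / RT)


# Hypothetical barriers, for illustration only.
EA_ONE_STEP = 20.0           # direct attack by water on the peptide bond
EA_TWO_STEP = (14.0, 13.0)   # attack by the serine, then hydrolysis of the intermediate

# The time to complete a sequence of steps is the sum of the step times,
# and each step time is inversely proportional to its rate.
time_one_step = 1.0 / relative_rate(EA_ONE_STEP)
time_two_step = sum(1.0 / relative_rate(ea) for ea in EA_TWO_STEP)

speedup = time_one_step / time_two_step
print(f"two-step route is faster by a factor of about {speedup:.3g}")
```

Because the rate falls off exponentially with barrier height, even a modest reduction in each barrier outweighs the cost of taking two steps instead of one.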

Enzymes Accelerate Chemical Reactions but Cannot Make Them Energetically More Favorable

No matter how sophisticated an enzyme is, it cannot make the chemical reaction it catalyzes either more or less energetically favorable. It cannot alter the free-energy difference between the initial substrates and the final products of the reaction. Like the simple binding interactions already discussed, any given chemical reaction has an equilibrium point, at which the backward and forward reaction fluxes are equal, so that no net change occurs (see Figure 3-9). If an enzyme speeds up the rate of the forward reaction, A + B → AB, by a factor of 10^8, it must speed up the rate of the backward reaction, AB → A + B, by a factor of 10^8 as well. The ratio of the forward to the backward rates of reaction depends only on the concentrations of A, B, and AB. The equilibrium point remains precisely the same whether or not the reaction is catalyzed by an enzyme.
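This argument can be checked with a small numerical sketch. The rate constants are illustrative assumptions rather than measured values, and the function name is hypothetical; only the 10^8 speed-up factor comes from the text.

```python
def equilibrium_constant(k_forward, k_backward):
    """Equilibrium constant K = [AB] / ([A][B]) for A + B <-> AB.

    At equilibrium the forward flux k_forward*[A]*[B] equals the
    backward flux k_backward*[AB], so K = k_forward / k_backward.
    """
    return k_forward / k_backward


# Assumed rate constants for the uncatalyzed reaction (illustrative only).
k_f = 2.0e-3   # forward, M^-1 s^-1
k_r = 5.0e-6   # backward, s^-1

SPEEDUP = 1e8  # the enzyme accelerates both directions by the same factor

K_uncatalyzed = equilibrium_constant(k_f, k_r)
K_catalyzed = equilibrium_constant(k_f * SPEEDUP, k_r * SPEEDUP)

print(K_uncatalyzed, K_catalyzed)
```

The speed-up factor cancels in the ratio, so the two equilibrium constants agree: the enzyme changes how quickly equilibrium is reached, not where it lies.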

Enzymes Determine Reaction Paths by Coupling Selected Reactions to ATP Hydrolysis 39

The living cell is a chemical system that is far from equilibrium. The product of each enzyme usually serves as a substrate for another enzyme in the metabolic pathway and is rapidly consumed. More important, by means of a reaction pathway that is determined by enzymes, many reactions are driven in one direction by being coupled to the energetically favorable hydrolysis of ATP to ADP and inorganic phosphate, as previously described in Chapter 2. To make this strategy effective, the ATP pool is itself maintained at a level far from its equilibrium point, with a high ratio of ATP to its hydrolysis products (discussed in Chapter 14). This ATP pool thereby serves as a "storage battery" that keeps energy and atoms continually passing through the cell directed along pathways determined by the enzymes present. For a living system, approaching chemical equilibrium means decay and death.
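The effect of coupling can be illustrated with a brief calculation, since the free-energy changes of coupled reactions add. In this sketch the +4.0 kcal/mole reaction is hypothetical, and the value of -7.3 kcal/mole for ATP hydrolysis is the commonly quoted standard figure; because the ATP pool is held far from equilibrium, the value inside a cell is more negative still.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal / (mol K)
T_KELVIN = 310.0    # about 37 degrees C


def equilibrium_ratio(delta_g_kcal):
    """Products/reactants ratio at equilibrium for a standard free-energy change."""
    return math.exp(-delta_g_kcal / (R_KCAL * T_KELVIN))


DG_REACTION = 4.0          # hypothetical unfavorable reaction, kcal/mol
DG_ATP_HYDROLYSIS = -7.3   # commonly quoted standard value, kcal/mol

# Free-energy changes of coupled reactions are additive.
dg_coupled = DG_REACTION + DG_ATP_HYDROLYSIS

ratio_alone = equilibrium_ratio(DG_REACTION)
ratio_coupled = equilibrium_ratio(dg_coupled)

print(f"uncoupled: dG = {DG_REACTION:+.1f}, products/reactants = {ratio_alone:.2g}")
print(f"coupled:   dG = {dg_coupled:+.1f}, products/reactants = {ratio_coupled:.2g}")
```

On its own the hypothetical reaction favors its reactants; once coupled to ATP hydrolysis the overall free-energy change is negative and the products are favored.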

Multienzyme Complexes Help to Increase the Rate of Cell Metabolism 40

The efficiency of enzymes in accelerating chemical reactions is crucial to the maintenance of life. Cells, in effect, must race against the unavoidable processes of decay, which run downhill toward chemical equilibrium. If the rates of desirable reactions were not greater than the rates of competing side reactions, a cell would soon die. Some idea of the rate at which cellular metabolism proceeds can be obtained by measuring the rate of ATP utilization. A typical mammalian cell turns over (that is, completely degrades and replaces) its entire ATP pool once every 1 or 2 minutes. For each cell this turnover represents the utilization of roughly 10^7 molecules of ATP per second (or, for the human body, about a gram of ATP every minute).
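The two figures just quoted also fix the approximate size of the ATP pool, since the pool size equals the consumption rate multiplied by the turnover time. The short calculation below is only a back-of-envelope sketch that uses the numbers given in the text; the function name is hypothetical.

```python
ATP_USED_PER_SECOND = 1e7          # molecules per cell per second (from the text)
TURNOVER_TIMES_S = (60.0, 120.0)   # the pool is replaced every 1 to 2 minutes (from the text)


def implied_pool_size(rate_per_s, turnover_s):
    """Molecules in the pool if it is completely replaced once per turnover time."""
    return rate_per_s * turnover_s


pool_sizes = [implied_pool_size(ATP_USED_PER_SECOND, t) for t in TURNOVER_TIMES_S]

for t, n in zip(TURNOVER_TIMES_S, pool_sizes):
    print(f"turnover every {t:.0f} s -> about {n:.1e} ATP molecules per cell")
```

On these figures a single cell holds on the order of 10^9 ATP molecules at any instant.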

The rates of cellular reactions are rapid because of the effectiveness of enzyme catalysis. Many important enzymes have become so efficient that there is no possibility of further useful improvement: the factor limiting the reaction rate is no longer the intrinsic speed of action of the enzyme, rather it is the frequency with which the enzyme collides with its substrate. Such a reaction is said to be diffusion-limited.

If a reaction is diffusion-limited, its rate will depend on the concentration of both the enzyme and its substrate. For a sequence of reactions to occur very rapidly, each metabolic intermediate and enzyme involved must therefore be present in high concentration. Given the enormous number of different reactions carried out by a cell, there are limits to the concentrations of substrates that can be achieved. In fact, most metabolites are present in micromolar (10^-6 M) concentrations, and most enzyme concentrations are much lower. How is it possible, therefore, to maintain very fast metabolic rates?

The answer lies in the spatial organization of cell components. Reaction rates can be increased without raising substrate concentrations by bringing the various enzymes involved in a reaction sequence together to form a large protein assembly known as a multienzyme complex. In this way the product of enzyme A is passed directly to enzyme B and so on to the final product, and diffusion rates need not be limiting even when the concentration of substrate in the cell as a whole is very low. Such enzyme complexes are very common (the structure of one, pyruvate dehydrogenase, was shown in Figure 2-41), and they are involved in nearly all aspects of metabolism, including the central genetic processes of DNA, RNA, and protein synthesis. In fact, it may be that few enzymes in eucaryotic cells diffuse freely in solution; instead, most may have evolved binding sites that concentrate them with other proteins of related function in particular regions of the cell, thereby increasing the rate and efficiency of the reactions that they catalyze.

Figure 3-65. Compartmentalization.

A large increase in the concentration of interacting molecules can be achieved by confining them to the same membrane-bounded compartment in a eucaryotic cell.

Cells have another way of increasing the rate of metabolic reactions. It depends on the extensive intracellular membrane systems of eucaryotic cells. These membranes can segregate certain substrates and the enzymes that act on them into the same membrane-bounded compartment, such as the endoplasmic reticulum or the cell nucleus. If, for example, the compartment occupies a total of 10% of the volume of the cell, the concentration of reactants in the compartment can be 10 times greater than in a similar cell with no compartmentalization (Figure 3-65). Reactions that would otherwise be limited by the speed of diffusion can thereby be speeded up by the same factor.
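The arithmetic behind this estimate can be sketched as follows. The sketch assumes a simple collision model in which the frequency of substrate encounters per enzyme molecule is proportional to the local substrate concentration; the encounter rate constant is an assumed order-of-magnitude value, and the function names are hypothetical.

```python
def concentration_in_compartment(cell_average_conc, volume_fraction):
    """Concentration reached when molecules that would otherwise be spread
    through the whole cell are confined to a fraction of its volume."""
    if not 0.0 < volume_fraction <= 1.0:
        raise ValueError("volume_fraction must be in (0, 1]")
    return cell_average_conc / volume_fraction


def encounters_per_enzyme(k_encounter, substrate_conc):
    """Substrate encounters per enzyme molecule per second (diffusion-limited case)."""
    return k_encounter * substrate_conc


K_ENCOUNTER = 1e8    # assumed encounter rate constant, M^-1 s^-1
S_AVERAGE = 1e-6     # micromolar substrate, as quoted for most metabolites
VOLUME_FRACTION = 0.10   # compartment occupies 10% of the cell volume

s_confined = concentration_in_compartment(S_AVERAGE, VOLUME_FRACTION)
gain = (encounters_per_enzyme(K_ENCOUNTER, s_confined)
        / encounters_per_enzyme(K_ENCOUNTER, S_AVERAGE))

print(f"confined concentration: {s_confined:.1e} M, rate gain per enzyme: {gain:.1f}x")
```

The gain is independent of the assumed rate constant, which cancels in the ratio; it depends only on the fraction of the cell volume that the compartment occupies.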

Further details of protein structure and function will be presented in Chapter 5, where we discuss how cells construct tiny machines out of proteins.

Summary

The biological function of a protein depends on the detailed chemical properties of its surface. Binding sites for ligands are formed as surface cavities in which precisely positioned amino acid side chains are brought together by protein folding. In this way, normally unreactive amino acid side chains can be activated. Enzymes greatly speed up reaction rates by binding the high-energy transition states in a reaction especially tightly; they also carry out acid catalysis and base catalysis simultaneously. The rates of enzyme reactions are often so fast that they are limited only by diffusion; rates can be further increased if enzymes that act sequentially on a substrate are joined into a single multienzyme complex or if the enzymes and their substrates are confined to the same compartment of the cell.

References
General
Branden, C.; Tooze, J. Introduction to Protein Structure. New York: Garland, 1991.
Gesteland, R.F.; Atkins, J.F., eds. The RNA World. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1993.
Judson, H.F. The Eighth Day of Creation: Makers of the Revolution in Biology. New York: Simon & Schuster, 1979.
Lehninger, A.L.; Nelson, D.L.; Cox, M.M. Principles of Biochemistry, 2nd ed. New York: Worth, 1993.
Mathews, C.K.; van Holde, K.E. Biochemistry. Menlo Park, CA: Benjamin/Cummings, 1990.
Schulz, G.E.; Schirmer, R.H. Principles of Protein Structure. New York: Springer, 1979.
Stryer, L. Biochemistry, 3rd ed. New York: W.H. Freeman, 1988.
Voet, D.; Voet, J.G. Biochemistry. New York: Wiley, 1990.
Watson, J.D. Molecular Biology of the Gene, 3rd ed. Menlo Park, CA: Benjamin-Cummings, 1987.
Cited
1.
Cantor, C.R.; Schimmel, P.R. Biophysical Chemistry, Part I and Part III. New York: W.H. Freeman, 1980.
Eisenberg, D.; Crothers, D. Physical Chemistry with Applications to the Life Sciences. Menlo Park, CA: Benjamin-Cummings, 1979.
Pauling, L. The Nature of the Chemical Bond, 3rd ed. Ithaca, NY: Cornell University Press, 1960.
Whitesides, G.M.; Mathias, J.P.; Seto, C.T. Molecular self-assembly and nanochemistry: a chemical strategy for the synthesis of nanostructures. Science. 1991; 254: 1312-1319. [PubMed]
2.
Abeles, R.H.; Frey, P.A.; Jencks, W.P. Biochemistry. Boston: Jones and Bartlett, 1992.
Burley, S.K.; Petsko, G.A. Weakly polar interactions in proteins. Adv. Prot. Chem. 1988; 39: 125-189.
Fersht, A.R. The hydrogen bond in molecular recognition. Trends Biochem. Sci. 1987; 12: 301-304.
3.
Cohen, C.; Parry, D.A.D. α-helical coiled coils: a widespread motif in proteins. Trends Biochem. Sci. 1986; 11: 245-248.
Dickerson, R.E. The DNA helix and how it is read. Sci. Am. 1983; 249(6): 94-111.
4.
Berg, H.C. Random Walks in Biology. Princeton, NJ: Princeton University Press, 1983.
Einstein, A. Investigations on the Theory of Brownian Movement. New York: Dover, 1956.
Lavenda, B.H. Brownian motion. Sci. Am. 1985; 252(2): 70-85.
5.
Lehninger, A.L. Bioenergetics: The Molecular Basis of Biological Energy Transformations, 2nd ed. Menlo Park, CA: Benjamin-Cummings, 1971.
6.
Karplus, M.; McCammon, J.A. The dynamics of proteins. Sci. Am. 1986; 254(4): 42-51. [PubMed]
Karplus, M.; Petsko, G.A. Molecular dynamics simulations in biology. Nature. 1990; 347: 631-639. [PubMed]
McCammon, J.A.; Harvey, S.C. Dynamics of Proteins and Nucleic Acids. Cambridge, UK: Cambridge University Press, 1987.
7.
Kirkwood, T.B.; Rosenberger, R.F.; Galas, D.J., eds. Accuracy in Molecular Processes: Its Control and Relevance to Living Systems. London: Chapman and Hall, 1986.
8.
Berg, P.; Singer, M. Dealing with Genes. The Language of Heredity. Mill Valley, CA: University Science Books, 1992.
Rosenfield, I.; Ziff, E.; Van Loon, B. DNA for Beginners. London: Writers and Readers Publishing Cooperative. New York: Distributed in the USA by Norton, 1983.
Saenger, W. Principles of Nucleic Acid Structure. Berlin: Springer, 1984.
9.
Moore, J. Heredity and Development, 2nd ed. New York: Oxford University Press, 1992.
Olby, R. The Path to the Double Helix. Seattle: University of Washington Press, 1974.
Stent, G.S.; Calendar, A.Z. Molecular Genetics: An Introductory Narrative, 2nd ed. San Francisco: W.H. Freeman, 1978.
10.
Watson, J.D.; Crick, F.H.C. Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. Nature. 1953; 171: 737-738. [PubMed]
11.
Felsenfeld, G. DNA. Sci. Am. 1985; 253(4): 58-66. [PubMed]
Meselson, M.; Stahl, F.W. The replication of DNA in E. coli. Proc. Natl. Acad. Sci. USA. 1958; 44: 671-682.
Watson, J.D.; Crick, F.H.C. Genetic implications of the structure of deoxyribonucleic acid. Nature. 1953; 171: 964-967. [PubMed]
12.
Drake, J.W. Spontaneous mutation. Annu. Rev. Genet. 1991; 25: 125-146. [PubMed]
Lindahl, T. Instability and decay of the primary structure of DNA. Nature. 1993; 362: 709-715. [PubMed]
Wilson, A.C. Molecular basis of evolution. Sci. Am. 1985; 253(4): 164-173. [PubMed]
13.
Sanger, F. Sequences, sequences, and sequences. Annu. Rev. Biochem. 1988; 57: 1-28. [PubMed]
Thompson, E.O.P. The insulin molecule. Sci. Am. 1955; 192(5): 36-41.
Yanofsky, C. Gene structure and protein structure. Sci. Am. 1967; 216(5): 80-94. [PubMed]
14.
Brenner, S.; Jacob, F.; Meselson, M. An unstable intermediate carrying information from genes to ribosomes for protein synthesis. Nature. 1961; 190: 576-581.
Darnell, J.E., Jr. RNA. Sci. Am. 1985; 253(4): 68-78. [PubMed]
15.
Chambon, P. Split genes. Sci. Am. 1981; 244(5): 60-71. [PubMed]
Steitz, J.A. Snurps. Sci. Am. 1988; 258(6): 58-63. [PubMed]
Witkowski, J.A. The discovery of "split" genes: a scientific revolution. Trends Biochem. Sci. 1988; 13: 110-113. [PubMed]
16.
Crick, F.H.C. The genetic code: III. Sci. Am. 1966; 215(4): 55-62. [PubMed]
The Genetic Code. Cold Spring Harbor Symp. Quant. Biol. 1965; 31.
17.
Rich, A.; Kim, S.H. The three-dimensional structure of transfer RNA. Sci. Am. 1978; 238(1): 52-62. [PubMed]
18.
Lake, J.A. The ribosome. Sci. Am. 1981; 245(2): 84-97. [PubMed]
Watson, J.D. Involvement of RNA in the synthesis of proteins. Science. 1963; 140: 17-26. [PubMed]
Zamecnik, P. The machinery of protein synthesis. Trends Biochem. Sci. 1984; 9: 464-466.
19.
Altman, S.; Baer, M.; Guerrier-Takada, C.; Viogue, A. Enzymatic cleavage of RNA by RNA. Trends Biochem. Sci. 1986; 11: 515-518.
Cech, T. RNA as an enzyme. Sci. Am. 1986; 255(5): 64-75. [PubMed]
Cech, T. Fishing for fresh catalysts. Nature. 1993; 365: 204-205. [PubMed]
Michel, F.; Westhof, E. Modelling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J. Mol. Biol. 1990; 216: 585-610. [PubMed]
Noller, H.F. Ribosomal RNA and translation. Annu. Rev. Biochem. 1991; 60: 191-227. [PubMed]
Noller, H.F.; Hoffarth, V.; Zimniak, L. Unusual resistance of peptidyl transferase to protein extraction procedures. Science. 1992; 256: 1416-1419. [PubMed]
20.
Branden, C.; Tooze, J. Introduction to Protein Structure. New York: Garland, 1991.
Creighton, T.E. Proteins: Structure and Molecular Properties, 2nd ed. New York: W.H. Freeman, 1993.
Dickerson, R.E.; Geis, I. The Structure and Action of Proteins. New York: Harper & Row, 1969.
Schulz, G.E.; Schirmer, R.H. Principles of Protein Structure. New York: Springer, 1990.
21.
Anfinsen, C.B. Principles that govern the folding of protein chains. Science. 1973; 181: 223-230. [PubMed]
Baldwin, R.L. Seeding protein folding. Trends Biochem. Sci. 1986; 11: 6-9.
Creighton, T.E. Disulphide bonds and protein stability. Bioessays. 1988; 8: 57-63. [PubMed]
Richards, F.M. The protein folding problem. Sci. Am. 1991; 264(1): 54-63.
Rupley, J.A.; Gratton, E.; Careri, G. Water and globular proteins. Trends Biochem. Sci. 1983; 8: 18-22.
22.
Doolittle, R.F. Proteins. Sci. Am. 1985; 253(4): 88-99. [PubMed]
Milner-White, E.J.; Poet, R. Loops, bulges, turns and hairpins in proteins. Trends Biochem. Sci. 1987; 12: 189-192.
Pauling, L.; Corey, R.B. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA. 1951; 37: 729-740. [PubMed]
Pauling, L.; Corey, R.B.; Branson, H.R. The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA. 1951; 37: 205-211.
Richardson, J.S. The anatomy and taxonomy of protein structure. Adv. Protein Chem. 1981; 34: 167-339. [PubMed]
23.
Scott, J.E. Molecules for strength and shape. Trends Biochem. Sci. 1987; 12: 318-321.
24.
Branden, C.; Tooze, J. Introduction to Protein Structure. New York: Garland, 1991.
Hardie, D.G.; Coggins, J.R. Multidomain Proteins: Structure and Evolution. Amsterdam: Elsevier, 1986.
25.
Doolittle, R.F. The genealogy of some recently evolved vertebrate proteins. Trends Biochem. Sci. 1985; 10: 233-237.
Neurath, H. Evolution of proteolytic enzymes. Science. 1984; 224: 350-357. [PubMed]
26.
Blake, C. Exons and the evolution of proteins. Trends Biochem. Sci. 1983; 8: 11-13.
Gilbert, W. Genes-in-pieces revisited. Science. 1985; 228: 823-824. [PubMed]
McCarthy, A.D.; Hardie, D.G. Fatty acid synthase: an example of protein evolution by gene fusion. Trends Biochem. Sci. 1984; 9: 60-63.
Rossmann, M.G.; Argos, P. Protein folding. Annu. Rev. Biochem. 1981; 50: 497-532. [PubMed]
27.
Gehring, W.J. On the homeobox and its significance. Bioessays. 1986; 5: 3-4. [PubMed]
Hanks, S.K.; Quinn, A.M.; Hunter, T. The protein kinase family: conserved features and deduced phylogeny of the catalytic domains. Science. 1988; 241: 42-52. [PubMed]
Shabb, J.B.; Corbin, J.D. Cyclic nucleotide-binding domains in proteins having diverse functions. J. Biol. Chem. 1992; 267: 5723-5726. [PubMed]
Thornton, J.M.; Gardner, S.P. Protein motifs and data-base searching. Trends Biochem. Sci. 1989; 14: 300-304. [PubMed]
28.
Bajaj, M.; Blundell, T. Evolution and the tertiary structure of proteins. Annu. Rev. Biophys. Bioeng. 1984; 13: 453-492. [PubMed]
Klug, A. From macromolecules to biological assemblies. Biosci. Rep. 1983; 3: 395-430. [PubMed]
Metzler, D.E. Biochemistry. New York: Academic Press, 1977. (Chapter 4 describes how macromolecules pack together into large assemblies.).
29.
Caspar, D.L.D.; Klug, A. Physical principles in the construction of regular viruses. Cold Spring Harbor Symp. Quant. Biol. 1962; 27: 1-24. [PubMed]
Goodsell, D.S. Inside a living cell. Trends Biochem. Sci. 1991; 16: 203-206. [PubMed]
30.
Cohen, C.; Parry, D.A.D. α-helical coiled coils: a widespread motif in proteins. Trends Biochem. Sci. 1986; 11: 245-248.
31.
Harrison, S.C. Multiple modes of subunit association in the structures of simple spherical viruses. Trends Biochem. Sci. 1984; 9: 345-351.
Harrison, S.C. Viruses. Curr. Opin. Struc. Biol. 1992; 2: 293-299.
Hogle, J.M.; Chow, M.; Filman, D.J. The structure of polio virus. Sci. Am. 1987; 256(3): 42-49. [PubMed]
Rossmann, M.G.; Johnson, J.E. Icosahedral RNA virus structure. Annu. Rev. Biochem. 1989; 58: 533-573. [PubMed]
32.
Fraenkel-Conrat, H.; Williams, R.C. Reconstitution of active tobacco mosaic virus from its inactive protein and nucleic acid components. Proc. Natl. Acad. Sci. USA. 1955; 41: 690-698. [PubMed]
Hendrix, R.W. Tail length determination in double-stranded DNA bacteriophages. Curr. Top. Microbiol. Immunol. 1988; 136: 21-29. [PubMed]
Namba, K.; Caspar, D.L.D.; Stubbs, G.J. Computer graphics representation of levels of organization in tobacco mosaic virus structure. Science. 1985; 227: 773-776. [PubMed]
Nomura, M. Assembly of bacterial ribosomes. Science. 1973; 179: 864-873. [PubMed]
Trinick, J. Understanding the functions of titin and nebulin. FEBS Lett. 1992; 307: 44-48. [PubMed]
33.
Mathews, C.K.; Kutter, E.M.; Mosig, G.; Berget, P.B. Bacteriophage T4, Chaps. 1 and 4. Washington, DC: American Society of Microbiologists, 1983.
Steiner, D.F.; Kemmler, W.; Tager, H.S.; Peterson, J.D. Proteolytic processing in the biosynthesis of insulin and other proteins. Fed. Proc. 1974; 33: 2105-2115. [PubMed]
34.
Dressler, D.; Potter, H. Discovering Enzymes. New York: Scientific American Library, 1991.
Fersht, A. Enzyme Structure and Mechanism, 2nd ed. New York: W.H. Freeman, 1985.
35.
Fersht, A.R.; Leatherbarrow, R.J.; Wells, T.N.C. Binding energy and catalysis. Trends Biochem. Sci. 1986; 11: 321-325.
Hansen, D.E.; Raines, R.T. Binding energy and enzymatic catalysis. J. Chem. Educ. 1990; 67: 483-489.
36.
Lerner, R.A.; Benkovic, S.J.; Schultz, P.G. At the crossroads of chemistry and immunology: catalytic antibodies. Science. 1991; 252: 659-667. [PubMed]
Lerner, R.A.; Tramontano, A. Catalytic antibodies. Sci. Am. 1988; 258(3): 58-70. [PubMed]
Wolfenden, R. Analog approaches to the structure of the transition state in enzyme reactions. Accounts Chem. Res. 1972; 5: 10-18.
37.
Knowles, J.R. Tinkering with enzymes: what are we learning? Science. 1987; 236: 1252-1258. [PubMed]
Knowles, J.R. Enzyme catalysis: not different, just better. Nature. 1991; 350: 121-124. [PubMed]
38.
Stroud, R.M. A family of protein-cutting proteins. Sci. Am. 1974; 231(1): 74-88. [PubMed]
39.
Wood, W.B.; Wilson, J.H.; Benbow, R.M.; Hood, L.E. Biochemistry, A Problems Approach, 2nd ed. Menlo Park, CA: Benjamin-Cummings, 1981. (Chapters 9 and 15 and associated problems.).
40.
Barnes, S.J.; Weitzman, P.D.J. Organization of citric acid cycle enzymes into a multienzyme cluster. FEBS Lett. 1986; 201: 267-270. [PubMed]
Berg, O.G.; von Hippel, P.H. Diffusion-controlled macromolecular interactions. Annu. Rev. Biophys. Biophys. Biochem. 1985; 14: 131-160.
Reed, L.J.; Cox, D.J. Multienzyme complexes. In The Enzymes, 3rd ed. (P.D. Boyer, ed.), Vol. 1, pp. 213-240. New York: Academic Press, 1970.