
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Griffiths AJF, Miller JH, Suzuki DT, et al. An Introduction to Genetic Analysis. 7th edition. New York: W. H. Freeman; 2000.

## An Introduction to Genetic Analysis. 7th edition.

In Chapter 5, we learned that the basic genetic
method of measuring map distance is based on recombinant frequency (RF). A genetic map unit (m.u.) was defined as a recombinant frequency of 1 percent. This is a useful fundamental unit
that has stood the test of time and is still used in genetics. However, the larger the
recombinant frequency, the less accurate it is as a measure of map distance. In fact, map units
calculated from larger recombinant frequencies are smaller than map units calculated from
smaller recombinant frequencies. We encountered this effect in examples in Chapter 5. Typically, when measuring recombination
between three linked loci, the sum of the two internal recombinant frequencies is greater than
the recombinant frequency between the outside loci. With the use of such data, what is the most
accurate estimate that can be made of map distance between the two outside loci
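*A* and *C* in the following diagram?

[Diagram: linked loci *A*, *B*, and *C* in that order; the *A*–*B* interval measures *x* m.u. and the *B*–*C* interval measures *y* m.u.]

The answer is that *x* + *y* is the best estimate, and more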
accurate than the smaller overall *A*–*C* value. This gives us the
following useful mapping principle:

### MESSAGE

The best estimates of map distance are obtained from the sum of the distances calculated for shorter subintervals.

However, what if we have no intervening marker loci available to measure recombination in
shorter intervals? Such a situation would be commonly encountered when beginning to map a new
experimental organism or in cases in which the genome is huge, as it is in human beings. For
example, in the preceding diagram, what if there were no known *B* locus? Would
we have to make do with the map distance value obtained directly from the
*A*–*C* recombinant frequency? Furthermore, what about the
shorter intervals themselves? If there were other loci between *A* and
*B* and between *B* and *C*, then we might obtain
even better estimates of the *A* to *C* distance. Luckily, there
is a way of taking any recombinant frequency and performing a calculation to make it a more
accurate measure of map distance, without studying shorter and shorter intervals.

Before we consider the calculation, let’s think about the reason why larger RF values are less accurate measures of map distance. We have already encountered the culprit: multiple crossovers. In Chapter 5, we learned that double crossovers often lead to a parental arrangement of alleles and therefore the resulting meiotic products are not counted when measuring recombinant frequency. The same is true for other types of multiple crossovers: triples, quadruples, and so forth. So it is easy to see that multiple crossovers automatically lead to an underestimate of map distance, and, because the multiples are expected to be relatively more common over longer regions, we can see why the problem is worse for larger recombinant frequencies.

How can we take these multiple crossovers into account when calculating the map distances? What we need is a mathematical function that accurately relates recombination to map distance. In other words, what we need is a mapping function.

### MESSAGE

A mapping function is a formula for using recombinant frequencies to calculate map distances corrected for multiple crossover products.

## Poisson distribution

To derive a mapping function, we need a mathematical tool widely used in genetic analysis
because it is useful in describing many different types of genetic processes. This mathematical
tool is the Poisson distribution. A distribution
is merely a description of the frequencies of the different types of classes that arise from
sampling. The Poisson distribution describes the frequency of classes containing 0, 1, 2, 3,
4, . . . , *i* items when the average number of items per sample is known. The
Poisson distribution is particularly useful when the average is small in relation to the total
number of items possible. For example, the possible number of tadpoles obtainable in a single
dip of a net in a pond is quite large, but most dips yield only one or two or none. The number
of dead birds on the side of a highway is potentially very large, but in a sample mile the
number is usually small. Such samplings are described well by the Poisson distribution.

Let’s consider a numerical example. Suppose that we randomly distribute 100 one-dollar bills to 100 students in a lecture room, perhaps by scattering them over the class from some point near the ceiling. The average (or mean) number of bills per student is 1.0, but common sense tells us that it is very unlikely that each of the 100 students will capture one bill. We would expect a few lucky students to grab three or four bills each and quite a few students to come up with two bills each. However, we would expect most students to get either one bill or none. The Poisson distribution provides a quantitative prediction of the results.

In this example, the item being considered is the capture of a bill by a student. We want to
divide the students into classes according to the number of bills each captures and then find
the frequency of each class. Let *m* represent the mean number of items (here,
*m* = 1.0 bill per student). Let *i* represent the number for a
particular class (say, *i* = 3 for those students who get three bills each). Let
*f*(*i*) represent the frequency of the *i*
class—that is, the proportion of the 100 students who each capture *i* bills.
The general expression for the Poisson distribution states that

$$f(i) = \frac{e^{-m}\, m^{i}}{i!}$$

where *e* is the base of natural logarithms (*e* is
approximately 2.718) and ! is the factorial symbol. As examples, 3! = 3 × 2 × 1 = 6 and
4! = 4 × 3 × 2 × 1 = 24. By definition, 0! = 1. When computing *f*(0), recall
that any number raised to the power of 0 is defined as 1. Table 6-1 gives values of $e^{-m}$ for *m* values from 0.000 to 1.000. Values for *m* greater than 1 can be obtained by calculation.

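In our example, *m* = 1.0. Using Table
6-1, we compute the frequencies of the classes of students capturing 0, 1, 2, 3, and 4
bills as follows:

$$
\begin{aligned}
f(0) &= \frac{e^{-1}\,1^{0}}{0!} = \frac{e^{-1}}{1} = 0.368 \\
f(1) &= \frac{e^{-1}\,1^{1}}{1!} = \frac{e^{-1}}{1} = 0.368 \\
f(2) &= \frac{e^{-1}\,1^{2}}{2!} = \frac{e^{-1}}{2} = 0.184 \\
f(3) &= \frac{e^{-1}\,1^{3}}{3!} = \frac{e^{-1}}{6} = 0.061 \\
f(4) &= \frac{e^{-1}\,1^{4}}{4!} = \frac{e^{-1}}{24} = 0.015
\end{aligned}
$$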

Figure 6-1 is a histogram of this distribution. We
predict that about 37 students will capture no bills, about 37 will capture one bill, about 18
will capture two bills, about 6 will capture three bills, and about 2 will capture four bills.
This accounts for all 100 students; in fact, you can verify that the Poisson distribution
yields *f*(5) = 0.003, which makes it likely that no student in this sample of
100 will capture five bills.
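
The class frequencies quoted above can be checked numerically. The following is a minimal sketch, assuming the standard Poisson expression f(i) = e^(−m) m^i / i!; the function name `poisson_frequency` is chosen here for illustration and does not come from the text.

```python
import math


def poisson_frequency(i: int, m: float) -> float:
    """Frequency of the class containing exactly i items when the mean is m."""
    return math.exp(-m) * m**i / math.factorial(i)


m = 1.0           # mean number of bills per student
n_students = 100  # size of the lecture class in the example

for i in range(6):
    f = poisson_frequency(i, m)
    # Multiply the frequency by the class size to get an expected head count.
    print(f"f({i}) = {f:.3f}  ->  about {f * n_students:.1f} students")
```

Rounded to whole students, the first five classes give the 37, 37, 18, 6, and 2 quoted in the text.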

Similar distributions may be developed for other *m* values. Some are shown in
Figure 6-2 as curves instead of bar histograms.

## Derivation of a mapping function

The Poisson distribution can also describe the distribution of crossovers along a chromosome in meiosis. In any chromosomal region, the actual number of crossovers is probably small in relation to the total number of possible crossovers in that region. If crossovers are distributed randomly (that is, there is no interference), then, if we knew the mean number of crossovers in the region per meiosis, we could calculate the distribution of meioses with zero, one, two, three, four, and more multiple crossovers. This calculation is unnecessary in the present context because, as we shall see, the only class that is really crucial is the zero class. We want to correlate map distances with observable RF values. Meioses in which there are one, two, three, four, or any finite number of crossovers per meiosis all behave similarly in that they produce an RF of 50 percent among the products of those meioses, whereas the meioses with no crossovers produce an RF of 0 percent. To see how this can be so, consider a series of meioses in which nonsister chromatids do not cross over, cross over once, and cross over twice, as shown in Figure 6-3. We obtain recombinant products only from meioses with at least one crossover in the region, and always precisely half the products of such meioses are recombinant. We see then that the real determinant of the RF value is the size of the zero crossover class in relation to the rest.

As noted in Figure 6-3, we consider only crossovers between nonsister chromatids; sister-chromatid exchange is thought to be rare at meiosis. If it occurs, it can be shown to have no net effect in most meiotic analyses.
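
The claim that any nonzero number of crossovers yields 50 percent recombinants can be checked by brute-force enumeration. The sketch below rests on two assumptions that are labeled as such: each crossover involves one randomly chosen chromatid from each homolog (no chromatid interference), and a chromatid is recombinant for the flanking markers only if it takes part in an odd number of exchanges. The function name and chromatid numbering are illustrative.

```python
from itertools import product


def recombinant_fraction(n_crossovers: int) -> float:
    """Average fraction of recombinant chromatids over all equally likely
    ways of placing n_crossovers between nonsister chromatids."""
    # Assumption: chromatids 0 and 1 are sisters on one homolog,
    # chromatids 2 and 3 are sisters on the other homolog.
    nonsister_pairs = [(0, 2), (0, 3), (1, 2), (1, 3)]
    total_recombinant = 0
    total_chromatids = 0
    for choice in product(nonsister_pairs, repeat=n_crossovers):
        exchanges = [0, 0, 0, 0]
        for a, b in choice:
            exchanges[a] += 1
            exchanges[b] += 1
        # An odd number of exchanges switches the flanking markers;
        # an even number restores the parental arrangement.
        total_recombinant += sum(1 for n in exchanges if n % 2 == 1)
        total_chromatids += 4
    return total_recombinant / total_chromatids


for k in range(5):
    print(f"{k} crossover(s): RF among products = {recombinant_fraction(k):.2f}")
```

Under these assumptions the zero-crossover class gives 0 and every other class gives 0.5, which is why only the size of the zero class matters.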

At last, we can derive the mapping function. Recombinants make up half the products of those meioses having at least one crossover in the region. The proportion of meioses with at least one crossover is 1 minus the fraction with zero crossovers. The zero-class frequency will be

$$f(0) = \frac{e^{-m}\, m^{0}}{0!}$$

which equals

$$e^{-m}$$

So the mapping function can be stated as

$$\mathrm{RF} = \tfrac{1}{2}\left(1 - e^{-m}\right)$$

This formula relates recombinant frequency to *m*, the mean number of
crossovers. Because the whole concept of genetic mapping is based on the occurrence of
crossovers, as well as proportionality between crossover frequency and the physical size of a
chromosomal region, you can see that *m* is probably the most fundamental
variable in the whole process. In fact, *m* could be considered to be the
ultimate genetic mapping unit.
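
As a numerical check on the derivation, the mapping function RF = ½(1 − e^(−m)) can be compared with the longer route of summing every Poisson class with at least one crossover and halving the total. This is a sketch only: the function names and the cutoff of 60 classes are assumptions made for illustration.

```python
import math


def rf_from_classes(m: float, max_class: int = 60) -> float:
    """Half the summed Poisson frequency of classes with 1..max_class crossovers."""
    at_least_one = sum(
        math.exp(-m) * m**i / math.factorial(i) for i in range(1, max_class + 1)
    )
    return 0.5 * at_least_one


def rf_closed_form(m: float) -> float:
    """Mapping function: half of (1 minus the zero-crossover class)."""
    return 0.5 * (1.0 - math.exp(-m))


for m in (0.1, 0.5, 1.0, 3.0):
    print(f"m = {m}: summed = {rf_from_classes(m):.6f}, "
          f"closed form = {rf_closed_form(m):.6f}")
```

The two columns agree, which illustrates why the full distribution of multiple-crossover classes never needs to be calculated.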

If we know an RF value, we can calculate *m* by solving the equation. After
obtaining many values of *m*, we can plot the function as a graph, as in Figure 6-4. Viewing the function plotted as a graph should
help us see how it works. First, notice that the function is linear for a certain range
corresponding to very small *m* values. (Remember that *m* is our
best measure of genetic distance.) Therefore, RF is a good measure of distance where the dashed
line coincides with the function in Figure 6-4. In this
region, the map unit defined as 1 percent RF has real meaning. Therefore, let’s use this region
of the curve to define corrected map units by considering some small values of
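*m*:

$$
\begin{aligned}
m = 0.05:&\quad e^{-m} \approx 0.95, \quad \mathrm{RF} = \tfrac{1}{2}(1 - 0.95) = \tfrac{1}{2}(0.05) = \tfrac{1}{2}m \\
m = 0.10:&\quad e^{-m} \approx 0.90, \quad \mathrm{RF} = \tfrac{1}{2}(1 - 0.90) = \tfrac{1}{2}(0.10) = \tfrac{1}{2}m
\end{aligned}
$$

We see that RF = *m*/2, and this relation defines the dashed line in Figure 6-4. It allows us to translate *m* values into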
corrected map units. Expressing *m* as a percentage, we see that an
*m* of 100 percent (=1) is the equivalent of 50 corrected map units. Because an
*m* value of 1 is the equivalent of 50 corrected map units, we can express the
horizontal axis of Figure 6-4 in our new map units. Now
we can see from the graph that two loci separated by 150 corrected map units show an RF of only
about 50 percent (47.5 percent from the equation). We can use the graph of the function to convert any RF into map distance simply by
drawing a horizontal line from the RF value to the curve and dropping a perpendicular to the
map unit axis, a process equivalent to using the equation $\mathrm{RF} = \tfrac{1}{2}(1 - e^{-m})$ to solve
for *m*.

Let’s consider a numerical example of the use of the mapping function. Suppose that we get an RF of 27.5 percent. How many corrected map units does this represent? From the function,

$$0.275 = \tfrac{1}{2}\left(1 - e^{-m}\right)$$

Therefore

$$e^{-m} = 1 - (2 \times 0.275) = 0.45$$

From $e^{-m}$ tables (or by using a calculator), we find that *m* = 0.8, which is the equivalent of 40 corrected map units. If we had been happy to accept 27.5 percent RF as representing 27.5 map units, we would have considerably underestimated the distance between the loci.
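
The same conversion can be done by solving the mapping function for *m*, which gives m = −ln(1 − 2·RF), and then multiplying by 50 to obtain corrected map units. The sketch below assumes that rearrangement; the function name `corrected_map_units` is illustrative.

```python
import math


def corrected_map_units(rf: float) -> float:
    """Convert a recombinant frequency (as a fraction) to corrected map units."""
    if not 0.0 <= rf < 0.5:
        # The function is undefined at or above the 50 percent ceiling.
        raise ValueError("RF must be at least 0 and less than 0.5")
    m = -math.log(1.0 - 2.0 * rf)  # mean number of crossovers per meiosis
    return 50.0 * m                # m = 1 corresponds to 50 corrected map units


rf = 0.275
print(f"RF = {rf:.3f} -> m = {-math.log(1 - 2 * rf):.2f}, "
      f"{corrected_map_units(rf):.1f} corrected map units")
```

Without rounding *m* to 0.8, the result is about 39.9 corrected map units, in line with the 40 obtained from the table.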

### MESSAGE

To estimate map distances most accurately, put RF values through the mapping function. Alternatively, add distances that are each short enough to be in the region where the mapping function is linear.

A corollary of the second statement of this message is that for organisms for which the
chromosomes are already well mapped, such as *Drosophila,* a geneticist seldom
needs to calculate from the map function to place newly discovered genes on the map. This is
because the map is already divided into small, marked regions by known loci. However, when the
process of mapping has just begun in a new organism or when the available genetic markers are
sparsely distributed, the corrections provided by the function are needed.
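
A small numerical illustration shows why summing short intervals works and where it falls short. The crossover means used here (0.2 for *A*–*B* and 0.3 for *B*–*C*) are hypothetical values chosen for the sketch, and crossovers are assumed to occur without interference, as in the derivation of the mapping function.

```python
import math


def recombinant_frequency(m: float) -> float:
    """Mapping function: expected RF for a mean of m crossovers per meiosis."""
    return 0.5 * (1.0 - math.exp(-m))


# Hypothetical mean crossover numbers for two adjacent intervals.
m_ab, m_bc = 0.2, 0.3

true_distance = 50.0 * (m_ab + m_bc)                       # corrected map units
direct = 100.0 * recombinant_frequency(m_ab + m_bc)        # A-C RF in percent
summed = 100.0 * (recombinant_frequency(m_ab) + recombinant_frequency(m_bc))

print(f"true distance        : {true_distance:.2f} m.u.")
print(f"sum of subintervals  : {summed:.2f} m.u.")
print(f"direct A-C estimate  : {direct:.2f} m.u.")
```

With these numbers the direct *A*–*C* value is the smallest, the sum of the subintervals is closer to the true distance, and only the mapping function applied to each RF recovers it fully.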

Notice that no matter how far apart two loci are on a chromosome, we never observe an RF
value of greater than 50 percent. Consequently, an RF value of 50 percent would leave us in
doubt about whether two loci are linked or are on separate chromosomes. Stated another way, as
*m* gets larger, $e^{-m}$ gets smaller and RF approaches $\tfrac{1}{2}(1 - 0) = \tfrac{1}{2} \times 1 = 0.5$, or 50 percent. This is an
important point: RF values of 100 percent are not observed, no matter how far apart the loci
are.
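
The 50 percent ceiling is easy to see by tabulating the mapping function for increasing *m*. This is a brief sketch; the particular *m* values are arbitrary choices for illustration.

```python
import math


def recombinant_frequency(m: float) -> float:
    """Mapping function RF = 1/2 (1 - e^-m), returned as a fraction."""
    return 0.5 * (1.0 - math.exp(-m))


m_values = [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0]
rf_values = [recombinant_frequency(m) for m in m_values]

for m, rf in zip(m_values, rf_values):
    # 50 * m is the distance in corrected map units.
    print(f"m = {m:5.1f} ({50 * m:6.1f} corrected m.u.) -> RF = {100 * rf:.2f}%")
```

RF rises steadily with *m* but stays below 50 percent for every value in the list, so widely separated linked loci become indistinguishable from unlinked ones.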
