# Maximum likelihood estimation of a migration matrix and effective
population sizes in *n* subpopulations by using a
coalescent approach

^{*}To whom reprint requests should be addressed. E-mail: ude.notgnihsaw.sciteneg@ilreeb.

## Abstract

A maximum likelihood estimator based on the coalescent for
unequal migration rates and different subpopulation sizes is developed.
The method uses a Markov chain Monte Carlo approach to investigate
possible genealogies with branch lengths and with migration events.
Properties of the new method are shown by using simulated data from a
four-population *n*-island model and a source–sink
population model. Our estimation method as coded in migrate
is tested against genetree; both programs deliver a very
similar likelihood surface. The algorithm converges to the estimates
fairly quickly, even when the Markov chain is started from unfavorable
parameters. The method was used to estimate gene flow in the Nile
valley by using mtDNA data from three human populations.

**Keywords:**coalescence theory, population genetics

Estimation of migration rates
from genetic data has a long history. As soon as the first analyses of
population samples by using enzyme electrophoresis were available,
migration rates estimated from Wright's *F _{ST}* (1)
were used to infer patterns of gene flow. With the advent of other
types of genetic data, such as restriction fragment length polymorphism
data, DNA sequences, and microsatellite loci, migration rates have been
routinely estimated by using modified versions of

*F*(2–6). Translation of these

_{ST}*F*equivalents into migration rate estimates most often assumes that all subpopulations have the same size, or that there are infinitely many subpopulations, and that the migration rates are all symmetric. If the true migration pattern has been asymmetric, or the subpopulation sizes are unequal,

_{ST}*F*-based methods will deliver wrong estimates (7).

_{ST}Recently, it became possible to estimate migration rates and population sizes without the assumption that subpopulation sizes are all equal (8) or that the migration rates between the subpopulations are symmetric (9–12).

We describe here an extension of our two-population method (10)
that calculates maximum likelihood estimates of migration rates and
subpopulation sizes by using coalescence theory (13, 14). Our method
allows us to analyze more than two subpopulations, to specify arbitrary
migration scenarios, and to test a hierarchy of different migration
scenarios. Success of estimating migration pattern is assessed with
simulated sequence data in an *n*-island population model and
a source–sink population scenario. Convergence to the correct result
is investigated through simulation. The performance of our method,
implemented in the program migrate, is compared
with that of genetree (11). Finally, we analyze a
human mtDNA hypervariable region I data set from three populations in
the Nile valley, a total of 225 individuals.

## Materials and Methods

### Model.

We infer the population parameters by using a maximum likelihood approach based on coalescence theory (13, 14). This likelihood is the probability of the data given the parameters

where Prob(|) is the probability of a genealogy given the population parameters , such as population size (15), exponential population growth rate (16), migration rates (10), and recombination rate (17, 18).

Prob(|) is the likelihood of the data given
the genealogy; this quantity is widely used in phylogenetic inference
(19, 20). In this paper, we focus on estimation of migration rates and
population sizes while assuming a molecular clock at each locus. For a
system with *n* populations, we use the following set of
parameters:

where _{ji} is
*m _{ji}*/μ, where

*m*is the immigration rate from population

_{ji}*j*into

*i*, and

*μ*is the mutation rate per generation. For sequence data,

*μ*is the mutation rate per site and for allelic data, such as allozyme or microsatellite markers, and

*μ*is the mutation rate per locus. Θ

_{i}is 4

*N*μ, where

*N*is the effective population size of population

*i*in a Wright–Fisher population model. Sometimes we use

*γ*, which is Θ

_{ji}_{i}

_{ji}= 4

*N*

*m*.

_{ji}Kingman's coalescent can be extended to include migration (21).
Instead of just one type of event, the coalescence of lineages, we need
to record *n*^{2} different events:
coalescences in different subpopulations and migration events that
switch lineages from one population to another. This
migration–coalescence prior is a product over all time intervals
*T* on the genealogy (21). Going backwards in time and using a
time scale in which the units of time is the ratio of the generation
time and the expectation of the mutation rate, we have

The exponential term is the probability that in the
*j*th time interval with length *u _{j}*
neither a migration nor a coalescent event happens;

*u*is scaled by generations and mutation rate. The remaining term is the point probability density of the actual event that happens. The events in genealogy are either migrations from subpopulation

_{j}*w*to

_{j}*v*or coalescences in subpopulation

_{j}*v*. The indicator variable

_{j}*δ*is 1 when the event at the bottom of interval

_{j}*j*is a migration event and is 0 otherwise, and

*k*is the number of lineages in subpopulation

_{ji}*i*during time interval

*j*. No modification of the genealogy likelihood Prob(|) is necessary to accommodate migration events, as they occur independently of mutation. The modified coalescent probability (3) and the genealogy likelihood are rather easy to calculate.

Unfortunately, the sum of the probabilities of all possible genealogies
with different topologies and branch lengths cannot be calculated
because there are infinitely many of them and no analytical solution
for the integral over branch lengths or topologies is available. But it
can be approximated by using the Metropolis–Hastings Markov chain
Monte Carlo approach (22, 23). This approach (which we denote MH)
concentrates the sampling of genealogies in those regions of
the genealogy space that contribute most to the final likelihood. With
MH importance sampling, we compute instead of the likelihood
*L*() the ratio
*L*()/*L*(_{0}),
where _{0} are the parameters that were
used to sample the genealogies. This is

where *g* is the number of sampled genealogies. Our
derivation of formula **4** from formula **1** was
described earlier (10) and is essentially a standard MH scheme (24).
The denominator *L*(_{0}) is
unknown, so that we cannot estimate the absolute likelihood, but this
likelihood ratio is proportional to the absolute likelihood function.
The parameter estimates obtained by maximizing this approximate
likelihood ratio will approach the maximum likelihood estimates as the
number of sampled genealogies *g* becomes infinite (22, 23).
We find that very good estimates are achieved with a moderate number of
genealogies (10, 15–17).

Our MH approach needs a set of starting parameters
_{0} and an initial genealogy. These
starting parameters can often be found by using a simpler method, such
as methods based on *F _{ST}* (25). The first
genealogy is created by using a UPGMA (Unweighted Pair Group Method
using arithmetic averages) method (see ref. 19) to construct the
topology, and by using Sankoff's parsimony method (26) to reconstruct
the minimal number of migrations on that topology. The branch lengths
of this initial genealogy are drawn randomly from a coalescent density
for that topology given

_{0}. Our MH method (10) moves through genealogy space by making small rearrangements of branches of the current genealogy. A new genealogy is accepted with probability equal to the ratio of the likelihood of the old genealogy and the likelihood of the new [(for details, see ref. 10)

*r*= min(1, Prob(

*D*|

*G*

_{new})/Prob(

*D*|

*G*

_{old})].

With a random or otherwise inappropriate starting genealogy and an
inappropriate _{0}, the program can spend
much time in regions with highly improbable genealogies. To overcome
these starting conditions, we use an adaptive scheme that samples 10
short chains, in each of which several thousand genealogies are
sampled, followed by two or three long chains, each of which samples
many tens or hundreds of thousands of genealogies. After each chain, we
reestimate the parameters by maximizing the likelihood ratio
(4). These are taken as the _{0}
of the next chain. The last long chain is used for the final estimates
of ; the earlier chains are used only to obtain good starting
parameters.

One would like to know not only the maximum likelihood values of
the population parameters, but also confidence intervals for these
parameters. Approximate confidence intervals can be generated in a
maximum likelihood framework by using either the curvature of the
likelihood at its maximum or profile likelihoods (27). The latter are
more appropriate for our purposes, as curvature-based estimates can be
unreliable with many parameters unless there are many loci. For an
approximative confidence interval for a single parameter, we compare
twice the logarithm of the ratio of its profile likelihoods to the
quantiles of the *χ*^{2}-distribution
with one degree of freedom (27) for the desired level of confidence. If
a researcher needs multiparameter confidence intervals, she would need
to use a Bonferroni correction or a likelihood ratio test with the
correct number of degrees of freedom. The latter method is implemented
in migrate. This likelihood-ratio-based approach
may be inappropriate for data that in theory cannot be extended, such
as mtDNA, because the *χ*^{2}
approximation becomes exact only as we add a large number of loci.

## Results

### Simulation Study.

Data sets were created by using an approach first described by
Hudson (28). For some given set of true parameters
Θ_{i} and
4*N ^{(i)}m_{ji}*, a coalescent
genealogy is created. This genealogy is then used to evolve sites
according to the Kimura two-parameter substitution model (29), starting
at the root of that genealogy. The sites resulting at the tips are
taken as the data. We used 500 sites for each locus in all simulations,
except for the comparison of our own results with those from
genetree (11).

#### n*-island model.*

We simulated a 4-island model and were generating 100 10-locus data sets with 25 individuals sampled from each of 4 subpopulations (Fig. (Fig.11).

The values for the Θ_{i} were taken to be equal
and set to 0.01, a value that is moderately close to the estimate of
Θ = 0.039 from mtDNA control region domain I sequences from the
Nuu-Chah-Nulth people (15). The migration parameters 4*Nm*
were all set to 1.0. Data sets were analyzed twice, once under the
assumption that this is a symmetric *n*-island model with
two parameters Θ and 4*Nm*, which are the same in all
populations, and once under the assumption that we have
*n*^{2} parameters, *n* different
Θ_{i} and *n*(*n*
− 1) different 4*N _{i}m_{ji}*.

The averages for the two parameters of the *n*-island
model were = 0.00999 with an SE of 0.00007 and
= 0.96327 with an SE of 0.01351. Averages
of the limits of the one-parameter 95% profile confidence intervals
were (0.00888, 0.01089) and (0.86656, 1.12981), respectively. The
estimates for the full migration matrix model with all 16 parameters is
shown in Table Table1.1. The estimates for the
Θ_{i} are surprisingly precise given the
parameter-rich model, but most of the
4*N _{i}m_{ji}* are lower than the true value
parameter, although the averages of the individual 95% profile
confidence intervals include the true parameter values.

#### Source–sink model.

The full model with *n* population sizes and
*n*(*n* − 1) migration rates is able to detect
asymmetric gene flow and differences in population sizes. But for some
analyses, this freedom is undesirable because the researcher already
has some idea of the pattern of gene flow or the population sizes and
may want to fix population sizes, force migration rates to be
symmetric, use equal migration rates, or set some migration rates to
zero. This goal can be achieved in migrate by
using a migration connection matrix
(http://evolution.genetics.washington.edu/lamarc/migratedoc/migratedoc.html),
such as the one shown in Fig. Fig.22.

*A*) Arrows mark directions of migration, and disk sizes are proportional to population sizes. (

*B*) The corresponding population connection matrix used in migrate. The matrix contains specifications for Θ

**...**

One hundred simulated 10-locus data sets from the populations shown in Fig. Fig.22 were analyzed by using the full set of 16 parameters, and 50 data sets were analyzed by using only the 7 parameters implied by the migration-connection matrix.

Some of the simulations in Table Table2 do2 do not recover the true values of the migration parameters very well: all parameters with true parameter values of 0.0 are overestimated. Most disturbing are the values for migrations from population 2 to 1, from 3 to 2, and from 4 to 1. But this fact is not surprising, as all parameters are bounded by zero but have no upper limit. Our estimates must deliver a value greater than zero, so that the result must be an upwards bias. In addition, if we do not know the directionality of gene flow, finding the same haplotype in two or more populations will force the program to estimate at least a small migration rate in the wrong direction. Only a few mutations will arise in the small population and be visible in the sample, and only those unique mutations would contribute to inferring that that gene flow from the small population to the big population is very small. Thus, we would need many loci to establish this directional pattern.

If we know the migration model and need to estimate only 7 instead of
16 parameters, the maximum likelihood estimates are almost identical to
those for the parameter-rich model shown in Table Table2,2, but the profile
confidence intervals are slightly smaller: the coefficients of
variation (CV) of the parameters are about 25% smaller than with the
full model. For example, with 7 parameters, the CV for
*γ*_{12} is 0.299, whereas for the full
model, the CV for *γ*_{12} is 0.392.

### Convergence of Our Metropolis–Hastings Sampling Method.

Our Metropolis–Hastings Markov chain algorithm is irreducible, as it
can reach any possible genealogy from any other. However, it may take a
very long time until the proper regions of the parameter space are
found, because the algorithm is sensitive to the start parameters. This
problem is specific not to our algorithm but to any MH algorithms that
draw correlated samples. There is no simple criterion to judge when the
program has converged to the best possible answer. Several convergence
measures have been suggested, but there is no guarantee that
convergence occurs in a given run (30). We have used a simple graphical
method to explore convergence of the two-parameter model (Fig.
(Fig.3).3). A single-locus data set of DNA
sequences with 500 bp with 100 individuals sampled from a symmetric
model with 4 populations was generated with Θ = 0.01 and
= 100. We ran four cases each with starting parameters
(Θ_{0}, _{0}) equal to
(0.0001, 2), (0.001, 200), (0.1, 200), and (0.1, 2) (see Fig. Fig.3).3). Each
run had 10 short chains, each with a total of 20,000 genealogies, and 3
long chains each with a total of 110,000 genealogies. The first 10,000
genealogies in each chain were discarded. At the end of each chain, the
parameters were estimated and recorded, and the next chain was then
started with these new parameters. These four cases were then compared
with a very long run (five times longer) that started from estimates
Θ_{0} and _{0}, which
were based on *F _{ST}* estimates.

*A*) Θ

_{0}= 0.1 and

_{0}= 2, (

*B*) 0.001 and 2, (

*C*) 0.1 and 200, (

*D*) 0.001 and 200.

*E*shows run that was started with values from

**...**

The starting points _{0} chosen for
this convergence study are fairly far from the true parameter values
(see ref. 31), but the adaptive improvement of the
_{0} moves gradually toward parameters of
highest likelihood for this data set, as can be seen in Fig. Fig.3:3: in
*A*, *B*, *D*, and *F*, the
trajectories are moving toward values close to the true parameters,
namely toward the maximum likelihood estimates for this specific data
set, _{data}. In Fig. Fig.33*C*, the
trajectory first moves toward high values (256.4, outside of
the shown frame) while staying at low Θ (0.0066) and then returns
toward _{data}.

### Comparison with genetree.

genetree (11) can use only sequence data that evolve
according to an infinitely-many-sites model. To approximate this model,
we simulated data according to the Kimura two-parameter model, but when
more than one mutation occurred on the genealogy for a given site, we
split the site into two or more new sites, so that each of these new
sites would have mutated only once or not at all. We simulated sequence
data for 100 loci with 500 bp each that evolved according to this
infinite-sites model with 2 populations with 2 sampled individuals
each. The subpopulations had the same size (Θ_{i}
= 0.002) and had a symmetric migration model with rate
4*Nm* = 0.1. This data set was analyzed with
genetree after removal of the invariant sites and
also analyzed with migrate for the full set of
sites. We chose to evaluate a data set with few individuals and these
parameters so that we can compare the outcomes of both programs
independent of their ability to search the genealogy space. With
genetree, we sampled 1,000 genealogies per locus
and used the true parameters to run the sampling process. Our true
parameters defined the driving parameters for
genetree to be θ_{0}
= 4, and 4*Nm*_{0} = 0.2. The
θ_{0} and
4*Nm*_{0} used in
genetree are computed from the mutation rate per
locus, and *N* is the size of the whole population.
migrate was run at its default values, except for
the following settings: the starting _{0}
were set to the true values, and the lengths of sampling were set to 10
short chains each of 400 genealogies, of which the first 100 were
discarded and then every third genealogy used for the parameter
estimation, and then two long chains with 1,600 genealogies each, of
which the first 100 were discarded and then every third one was used
for parameter estimation. This was a total of 7,200 genealogies visited
per locus.

With a large number of loci, one expects that the results should converge to the values used to generate the data. With 100 loci and only 4 sampled individuals, there is much uncertainty about the parameters, but both programs include the true parameter values in their 50% confidence regions (Fig. (Fig.44).

**...**

migrate spent 45 min for the whole 100-locus data set on a computer with a 166-MHz Pentium processor, running linux. It was sampling ≈271 genealogies with 4 tips per second; genetree spent roughly 350 min on the same computer, and it evaluated about 5 genealogies per second. However, migrate's genealogies are autocorrelated, whereas genetree's are independently sampled.

### Example Data Set.

For a real-world example, we used aligned mtDNA control region
domain I sequences of populations from the Nile valley, described by
Krings *et al.* (32). The aligned sequences were taken from
the compilation of Handt *et al.* (33). We chose the
following three groups: Egypt (79 sequences), Nubia (69
sequences), and Sudan (79 sequences). These 225 sequences
certainly violate several of the assumptions
migrate is based on: the “populations” are
assemblages of local populations, some or all of the population sizes
were not constant, and migration rates between the populations were
most likely not constant over time.

In our analysis, we ignored the existence of unsampled populations but
took into account that mutation rates of the mtDNA control region
Domain I data are heterogenous among sites by using a Hidden Markov
Model with F84 mutation model with 4 rates (0.025, 0.239, 0.787, 2.354)
with equal probability (34). This is an approximation to a Gamma
distribution with *α* = 0.3 (see ref. 35). We used a
transition–transversion ratio of 15 and the empirical base frequencies
(A: 0.3302, C: 0.3313, G: 0.1161, T: 0.2225). We analyzed the data by
using starting Θ_{i} of 0.5 and
_{ji} of 5.0. Because of the size of the
problem, we used a Metropolis-coupled MH algorithm (36) with four
independent chains that accept at different rates. The chain that was
used for the estimates uses an unmodified acceptance ratio, whereas the
others accept more often. Switching between neighboring chains followed
the approach of Kuhner and Felsenstein (37).

The results of the analysis are shown in Table Table3.3. The gene flow among the populations seems to be moderate, except that there is considerable gene flow from Egypt into Nubia and from Sudan into Nubia.

## Discussion

The present method allows us to analyze a wide range of
different population models. It allows us to estimate as many as
*n* population sizes Θ_{i} and
*n*(*n* − 1) immigration rates
_{ij} or as few as two parameters, a Θ
that is equal for all subpopulations and an that is the same
between all pairs of subpopulation. With the migration connection
matrix, one can analyze arbitrary migration models where some migration
routes are not allowed. This versatility allows us to consider
biologically relevant migration scenarios and to put some of the
complication under the control of the user.

The MH technique for this method is identical to that described in ref.
10. Our method wanders through the sample space by proposing local
changes on a genealogy and rejecting or accepting such a changed
genealogy according to their likelihood. These changes in genealogy are
reversible: we showed (10) that this branch insertion and removal
process allows us to connect any two genealogies with a modest number
of rearrangements, and that genealogies are sampled in proportion to
Prob(|_{0})
Prob(|). This MH sampler will converge to the
correct answer when run for an infinitely long time, but of course our
hope is that convergence will be achieved much earlier. For simple
population scenarios, such as in our simulations of the
*n*-island model, we can get similar parameter estimates even
from bad starting parameters (Fig. (Fig.3).3). Our adaptive scheme using many
short chains and a few long chains helps move the sampler into regions
with genealogies that have high probabilities. As a result, estimates
are more accurate than if they came from a single long chain run at an
arbitrary _{0}. For practical purposes,
this self-targeting process for finding the appropriate distribution is
important, as it seems less desirable to rely on the user to find good
starting parameters or to ask the user to restart the estimation
process many times.

For large data sets, the researcher may need to run migrate several times with different chain lengths to see whether the length and the number of the chains are influencing the result. Alternative strategies are the use of Metropolis coupling (36) (e.g., our real-world example) or summarizing over different chains (15, 38) or even over different runs. These extensions will improve results but will increase the time for analysis considerably. In genetree (11), which is currently the only competing coalescent likelihood program, no such adaptive scheme is used, and the researcher needs to find appropriate starting parameters by doing the iterations by hand. Our and Bahlo and Griffiths' schemes will produce strange estimates when the starting parameters are far from the truth (Fig. (Fig.3;3; ref. 31). Bayesian approaches, although they vary the parameters of interest during the sampling process, will have similar problems if they are based on MH: with increasing numbers of parameters, the search space gets larger and much more sampling needs to be done to produce a proper posterior distribution. So far, there is no Bayesian method for analyzing migration models by using the coalescent, except for the two-population method developed by Nielsen and Slatkin (39).

The comparison with genetree shows that for the cases chosen, both methods deliver similarly shaped likelihood surfaces, as they should, because both are approximating the same likelihood criterion (17). For this data set, genetree has slightly wider confidence intervals in the Θ direction than migrate (Fig. (Fig.4).4). When using only a single population, both programs deliver almost identical likelihood curves and therefore confidence intervals (data not shown). The small differences might be caused by the different assumptions about the mutation model or by the different distributions from which the programs sample their genealogies.

Advantages of migrate over the current version of genetree are that the researcher can take into account different mutation models, such as the infinite allele model, a stepwise mutation model for microsatellite data, and sequence evolution models with rate heterogeneity among sites (34). All models can be combined with a model of rate heterogeneity among loci (10).

In the simulations tests of an *n*-island migration model, the
averages of the two-parameter model are rather close to the values used
to generate the data sets but are most often slightly smaller. This is
in stark contrast to theoretical results showing that expectations of
parameters over all simulations do not exist or at least are highly
biased upwards (41). There exists a very small but nonzero probability
that the data are compatible with a genealogy of infinite length. If
such a data set is encountered in a simulation study, the program will
return very large parameter estimates. The distributions of the 100
10-locus estimates from the simulation have heavy right tails (skewness
*S* for Θ is 0.063 and for 4*Nm* is 0.11). The
skewness is more pronounced if we look at the distribution of the 1,000
single locus estimates (*S*_{Θ} = 0.195,
*S*_{4Nm} = 0.896). In fact, there is an
upwards bias that is reflected in the skewness but not much in the
averages because in these 100 simulation runs, we have not yet
encountered a very high parameter value. On the other hand, there is a
“fatal attraction” to zero: once a parameter becomes very small,
it is unlikely that our adaptive MH procedure will succeed in reaching
higher parameter values, because the events that are under the control
of that parameter are not proposed and therefore subsequent parameter
estimates tend to stay small.

For some population structures, such as a hidden source–sink scenario with large gene flow from a large population to a small population, results are not very enlightening without additional information about the migration structure. It remains to be shown that other methods are superior for these kinds of data. We expect that when we do not know the migration structure, many loci and many individuals will be needed to detect the few new mutations in the small sink population. Only then would we be able to see whether any of these rare mutations migrate back into the source population.

The difficulty of retrieving the true parameters increases with the number of parameters. Estimates for two parameters show smaller profile confidence intervals than the estimates with 16 parameters, but if the true population structure is complicated, the two-parameter model also delivers much less information than a more parameter-rich model.

Krings *et al.* (32) infer gene flow in the Nile valley by
using analyses of molecular variance (AMOVA) (40). Their findings
coincide with ours, in that Nubia seems to have received a considerable
number of genes from Egypt and Sudan. The population sizes are most
likely inflated, because we did not take into account that the
individual populations are substructured, and because all populations
exchange migrants with their other neighbors, who contribute genetic
variation which we do not account for in our analysis.

The rather large range of possible values for the migration parameters in the example of possible migration directions in the Nile valley makes it evident that single locus data, even if it is highly variable, does not help much in clarifying current discussions in anthropology. Additional unlinked loci, each with its own coalescent history, can reduce this uncertainty greatly (10, 17).

## Conclusion

We have presented three lines of evidence that our method works even for rather complicated migration models: derivation of the MH sampling strategy (10, 15, 16), simulation to test convergence to true parameters, and simulations to assess biases and to see whether the method achieves results that are reproducible and can be found in a reasonable amount of time.

Maximum likelihood methods for the estimation of population parameters,
as implemented in migrate or genetree, will
make the current *F _{ST}*-based estimators obsolete,
because these do not take into account the genealogical relationship of
the sample and the possibility of asymmetry in gene flow. However, the
practical use of these new programs is limited to few subpopulations or
few parameters because of current lack of computation power.
Incorporation of the machinery of migrate into a program
that can handle additional forces such as recombination and
population growth is under way in our laboratory. The program
migrate is available at
http://evolution.genetics.washington.edu/lamarc.html.

## Acknowledgments

We thank M. K. Kuhner and J. Yamato for many discussions and comments on the manuscript, and R. C. Griffiths and J. Wakeley for helpful comments on the manuscript. This work was funded by National Science Foundation grant no. BIR 9527687 and National Institutes of Health grants nos. R01 GM 51929, R01 GM 01989, all to J.F.

## Abbreviation

- MH
- Metropolis–Hastings Markov chain Monte Carlo approach

## References

*Theor. Popul. Biol.*, in press.

**33,**163–185.

**National Academy of Sciences**

## Formats:

- Article |
- PubReader |
- ePub (beta) |
- PDF (135K) |
- Citation

- Effect of unsampled populations on the estimation of population sizes and migration rates between sampled populations.[Mol Ecol. 2004]
*Beerli P.**Mol Ecol. 2004 Apr; 13(4):827-36.* - Exact computation of coalescent likelihood for panmictic and subdivided populations under the infinite sites model.[IEEE/ACM Trans Comput Biol Bioinform. 2010]
*Wu Y.**IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec; 7(4):611-8.* - Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling.[Genetics. 1995]
*Kuhner MK, Yamato J, Felsenstein J.**Genetics. 1995 Aug; 140(4):1421-30.* - Estimating genealogies from unlinked marker data: a Bayesian approach.[Theor Popul Biol. 2007]
*Gasbarra D, Pirinen M, Sillanpää MJ, Salmela E, Arjas E.**Theor Popul Biol. 2007 Nov; 72(3):305-22. Epub 2007 Jun 22.* - Metropolis sampling in pedigree analysis.[Stat Methods Med Res. 1993]
*Sobel E, Lange K.**Stat Methods Med Res. 1993; 2(3):263-82.*

- Escaping paradise: Larval export from Hawaii in an Indo-Pacific reef fish, the Yellow Tang (Zebrasoma flavescens)[Marine ecology progress series. 2011]
*Eble JA, Toonen RJ, Sorenson L, Basch LV, Papastamatiou YP, Bowen BW.**Marine ecology progress series. 2011; 428245-258* - Population Genetics of Seaside Sparrow (Ammodramus maritimus) Subspecies along the Gulf of Mexico[PLoS ONE. ]
*Woltmann S, Stouffer PC, Bergeon Burns CM, Woodrey MS, Cashner MF, Taylor SS.**PLoS ONE. 9(11)e112739* - Conservation genetics of a threatened butterfly: comparison of allozymes, RAPDs and microsatellites[BMC Genetics. ]
*Turlure C, Vandewoestijne S, Baguette M.**BMC Genetics. 15(1)114* - Genetic structure of fragmented southern populations of African Cape buffalo (Syncerus caffer caffer)[BMC Evolutionary Biology. ]
*Smitz N, Cornélis D, Chardonnet P, Caron A, de Garine-Wichatitsky M, Jori F, Mouton A, Latinne A, Pigneur LM, Melletti M, Kanapeckas KL, Marescaux J, Pereira CL, Michaux J.**BMC Evolutionary Biology. 14(1)203* - Prioritizing Tiger Conservation through Landscape Genetics and Habitat Linkages[PLoS ONE. ]
*Yumnam B, Jhala YV, Qureshi Q, Maldonado JE, Gopal R, Saini S, Srinivas Y, Fleischer RC.**PLoS ONE. 9(11)e111207*

- PubMedPubMedPubMed citations for these articles

- Maximum likelihood estimation of a migration matrix and effective
population siz...Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approachProceedings of the National Academy of Sciences of the United States of America. Apr 10, 2001; 98(8)4563

Your browsing activity is empty.

Activity recording is turned off.

See more...