
###### From the Cover

# Protein sequence signatures support the African clade of mammals

Ole Madsen,^{*} François Catzeflis,^{†} Michael J. Stanhope,^{‡} Wilfried W. de Jong,^{*§¶} and Mark Pagel^{¶‖}

^{*}Department of Biochemistry, University of Nijmegen, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands;

^{†}Institut des Sciences de l'Évolution, Université Montpellier 2, 34095 Montpellier, France;

^{‡}Queen's University of Belfast, Biology and Biochemistry, Belfast BT9 7BL, United Kingdom;

^{§}Institute for Systematics and Population Biology, University of Amsterdam, 1090 GT Amsterdam, The Netherlands; and

^{‖}School of Animal and Microbial Sciences, University of Reading, Whiteknights, Reading RG6 6AJ, United Kingdom

^{¶}To whom reprint requests should be addressed. E-mail: w.dejong@bioch.kun.nl or m.pagel@reading.ac.uk.

## Abstract

DNA sequence evidence supports a superordinal clade of mammals that
comprises elephants, sea cows, hyraxes, aardvarks, elephant shrews,
golden moles, and tenrecs, which all have their origins in Africa, and
therefore are dubbed Afrotheria. Morphologically, this appears an
unlikely assemblage, which challenges—by including golden moles and
tenrecs—the monophyly of the order Lipotyphla (Insectivora). We here
identify in three proteins unique combinations of apomorphous amino
acid replacements that support this clade. The statistical support for
such “sequence signatures” as unambiguous synapomorphic evidence
for the naturalness of the Afrotherian clade is reported. Using
likelihood, combinatorial, and Bayesian methods we show that the
posterior probability of the mammalian tree containing the Afrotherian
clade is effectively 1.0, based on conservative assumptions. Presenting
sequence data for another African insectivore, the otter shrew
*Micropotamogale lamottei*, we demonstrate that such
signatures are diagnostic for including newly investigated species in
the Afrotheria. Sequence signatures provide
“protein-morphological” synapomorphies that may aid in
visualizing monophyletic groupings.

Molecular sequence data are increasingly used in mammalian phylogeny and recently have led to a number of unorthodox proposals (1–3). These proposals range from the claim that the guinea pig is not a rodent (4) to making whales and hippos sister groups (5). One of the most remarkable propositions is that of an “African clade” in which species as diverse as elephant shrews (Macroscelidea), golden moles (Chrysochloridae), and tenrecs (Tenrecidae) are grouped with aardvarks (Tubulidentata) and paenungulates (elephants, sea cows, and hyraxes; refs. 6 and 7). All of the African clade species find their fossil roots in Africa, and most are still confined to this continent, hence the name Afrotheria (7). The sequence evidence for Afrotheria is unanimous and strong, deriving from various nuclear and mitochondrial genes (6–10). Morphologically, however, there is no evidence whatsoever for a natural grouping of these taxa (11–14), prompting us to subject the molecular evidence to further scrutiny.

If Afrotheria is a real clade, it might be possible to find specific combinations of amino acid replacements in the proteins that support them. These replacements would represent synapomorphous character states, as remnants of mutational events during the last common ancestry of a clade. Several authors have used the concept of such “sequence signatures” qualitatively in molecular phylogeny (e.g., refs. 15–19), but thorough statistical interpretations are lacking.

We here search for the presence of unique Afrotherian sequence signatures in nine protein data sets (eight nuclear and one mitochondrial) that include at least four Afrotherian orders. Putative Afrotherian signatures were traced in αA-crystallin (CRYAA), aquaporin-2 (AQP2), and interphotoreceptor retinol-binding protein (IRBP). To demonstrate the diagnostic value of the signatures we seek their presence in CRYAA and AQP2 of other potential members of the African clade, including the otter shrew, which represents the only tenrecid subfamily living outside of Madagascar. To assess the significance of the candidate signatures, we use likelihood methods (20) to reconstruct their most probable ancestral states at the basal node of the Afrotherian clade. These calculations use a phylogeny reconstructed independently of the protein under investigation. We further use likelihood and combinatorial methods to estimate the probability of the signatures on three alternative morphology-based trees that are incompatible with an African clade. We then combine the evidence from CRYAA, AQP2, and IRBP by using Bayesian techniques to yield a posterior probability for the Afrotherian clade. Demonstrating the statistical improbability of such events in the course of biological evolution (21) may help to escape from the current stalemate in the molecules-versus-morphology debate on vertebrate phylogeny (3).

## Materials and Methods

### Searching for Afrotherian Signatures.

Databases were searched for sets of protein sequences that included
representatives of at least four Afrotherian orders, i.e., Proboscidea
(elephants), Sirenia (sea cows), Hyracoidea (hyraxes), Tubulidentata,
Macroscelidea, and Afrosoricida (golden moles and tenrecs; ref. 7).
This yielded data sets of CRYAA, AQP2, IRBP, von Willebrand factor,
α-2B adrenergic receptor, γ-fibrinogen, hemoglobin-α and -β,
and cytochrome *b*. The AQP2 data set was complemented with
newly determined sequences of pig, fin whale, and sperm whale (see
below). From these data sets, one or, if available, two representatives
of all included eutherian orders were taken. When more than two species
were available for an order, only the two most divergent sequences were
retained. This increases the homoplastic background, and thus the
significance of retrieved signatures. Retaining all sequences would
make the taxon representation unbalanced and hamper the signature
searches. The selected sequences were aligned, using
pileup, and manually edited. Where available, two
divergent Marsupialia were included as outgroups. For full species
names and accession numbers, and for protein alignments of CRYAA, AQP2,
and IRBP, see Table 4 and Figs. 3–5, which are published as
supplemental data on the PNAS web site, www.pnas.org.
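
The reduction to the two most divergent sequences per order can be sketched as follows. This is a minimal illustration only: the source does not say how divergence was measured, so the uncorrected p-distance used here, the function names, and the toy sequences are all assumptions.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def most_divergent_pair(seqs: dict) -> tuple:
    """Return the names of the two most divergent aligned sequences of one order."""
    if len(seqs) <= 2:
        return tuple(sorted(seqs))
    best = max(
        combinations(seqs, 2),
        key=lambda pair: p_distance(seqs[pair[0]], seqs[pair[1]]),
    )
    return tuple(sorted(best))


# Hypothetical aligned fragments for three species of a single order.
toy_order = {
    "species_a": "MDVTIQHPWF",
    "species_b": "MDVTIQHPWY",
    "species_c": "MDITLQHPWY",
}
selected = most_divergent_pair(toy_order)
print(selected)
```

Any other distance measure could be substituted for `p_distance` without changing the selection logic.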

Candidate sequence signatures were retrieved from the alignments by
using the spreadsheet signwin (available from the authors).
No phylogenetic information is included in this search;
signwin solely selects positions at which a designated
number of in-group species have the same putatively apomorphous
replacement, considering the out-group residue(s) as plesiomorphous
condition. The selection window is set to be appropriate for the number
of species for which the monophyly is investigated. Thus, when
searching for positions that might support the monophyly of the five
Afrotheria among the 26 selected eutherian CRYAA sequences (Figs.
1*A* and 3), the window is set
at 5 ± 1. This allows for 20% back or otherwise superimposed
replacements within a five-species clade, and the same absolute number
of parallel replacements in the other 21 in-group sequences. Positions
at which 4–6 species share the same apomorphy are then candidates for
any Afrotherian sequence signature. Using a wider or narrower criterion
would change our candidate signatures, but as seen in
*Results* the candidate sites for a potential signature emerge
clearly from the data.
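
The column-wise search described above can be approximated in a few lines. The sketch below is not the signwin spreadsheet itself; the function name, the data layout (dictionaries of aligned sequences), and the toy alignment are hypothetical. It selects positions at which the number of in-group sequences sharing a residue absent from the out-group falls inside the window.

```python
from collections import Counter


def candidate_positions(ingroup, outgroup, clade_size, tolerance=1):
    """Find columns where clade_size +/- tolerance in-group sequences share
    one residue that no out-group sequence shows at that column.

    Returns a list of (1-based position, residue, carrier names).
    """
    names = list(ingroup)
    length = len(next(iter(ingroup.values())))
    hits = []
    for i in range(length):
        # Out-group residues are treated as the plesiomorphous condition.
        plesiomorphic = {seq[i] for seq in outgroup.values()}
        counts = Counter(ingroup[name][i] for name in names)
        for residue, k in counts.items():
            if residue == "-" or residue in plesiomorphic:
                continue
            if clade_size - tolerance <= k <= clade_size + tolerance:
                carriers = [n for n in names if ingroup[n][i] == residue]
                hits.append((i + 1, residue, carriers))
    return hits


# Hypothetical toy alignment: three "t" taxa form the clade of interest.
toy_outgroup = {"out1": "AKEV", "out2": "AKEV"}
toy_ingroup = {
    "t1": "QKLV", "t2": "QKLV", "t3": "QKEV",
    "u1": "AKEV", "u2": "AREV", "u3": "AKEV", "u4": "AKEI",
}
hits = candidate_positions(toy_ingroup, toy_outgroup, clade_size=3)
for position, residue, carriers in hits:
    print(position, residue, carriers)
```

For the CRYAA search in the text, `clade_size` would be 5 and `tolerance` 1, giving the 4–6 species window. As in the source, no tree is consulted at this stage.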

### Sequence Determination of CRYAA and AQP2.

CRYAA genomic sequences, coding for amino acid residues 64–94, were
determined for Indian elephant (*Elephas maximus*), dugong
(*Dugong dugon*), tail-less tenrec (*Tenrec
ecaudatus*), small Madagascar hedgehog (*Echinops
telfairi*), otter shrew (*Micropotamogale lamottei*), and
golden mole (*Amblysomus hottentotus*). Otter shrew DNA was
extracted from ethanol-preserved liver (voucher specimen IZEA-7083);
sources of other DNA were as before (6, 7). Amplification was performed
by using a forward primer hybridizing to exon 1 and a reverse primer
complementary to the 3′ end of exon 2 (22). AQP2 was sequenced (23) for
pig (*Sus scrofa*), sperm whale (*Physeter
macrocephalus*), manatee (*Trichechus manatus*), fin whale
(*Balaenoptera physalus*), tree hyrax (*Dendrohyrax
dorsalis*), tail-less tenrec, small Madagascar hedgehog, and otter
shrew.

### Phylogenetic Tree Reconstruction.

To study the evolution of the candidate Afrotherian signatures found in
CRYAA, AQP2, and IRBP (see *Results*), phylogenetic trees were
constructed from a 5,708-bp data set of concatenated α-2B adrenergic
receptor, von Willebrand factor, IRBP, and 12S rRNA-tRNA valine-16S
rRNA sequences (10), taking those entries that corresponded most
closely with the species in our CRYAA, AQP2, and IRBP data sets (see
Table 5, which is published as supplemental data on the PNAS web site).
In the case of the IRBP signature, phylogeny was constructed with
exclusion of the IRBP sequences. Topologies and branch lengths of the
obtained trees are thus independent of the protein sequences whose
signatures we investigate. It also avoids the problem that covarion
processes might influence our tree building (24).

We used a two-step procedure to derive the maximum likelihood (ML or
maximum average likelihood *sensu* Steel and Penny, ref. 25)
phylogeny from our sequence data. The size of our phylogenies precluded
an exhaustive search of all possible topologies to find the global ML
tree. We therefore first calculated the likelihood of the sequence data
on starter topologies obtained from a simple neighbor-joining (minimum
evolution) analysis. Likelihood calculations were done by using
paup-ml (26) assuming the HKY85 model of
evolution with gamma rate heterogeneity to allow for the possibility of
unequal rates of evolution across sites. We estimated the shape
parameter (α) of the gamma distribution, and the
transition/transversion ratio, from the data. This yielded a
candidate topology with branch lengths based on the ML distance
calculation. Subsequently, we searched for better topologies in the
region of the initial neighbor-joining topology by using the
tree-bisection-reconnection branch swapping algorithm in
paup-ml. We repeated this analysis procedure with
random sequence input orders and always found the same ML tree.
“Constrained trees” were constructed to conform with alternative
morphology-based hypotheses for eutherian relationships (see Fig. 2).
For these trees we supplied the
topologies and reconstructed branch lengths by ML.

### Computation of Ancestral States.

We conducted two sorts of likelihood computation to investigate which trees best described the CRYAA, AQP2, and IRBP sequence evolution. In one we calculated the overall likelihood of observing the protein sequence signature separately for CRYAA, AQP2, and IRBP, using as our model the empirical JTT substitution rate matrix (27). A separate likelihood was calculated on the unconstrained ML tree and on the three morphology-constrained topologies. The second set of computations involved the likelihood of the most probable ancestral character states of the candidate sequence signatures. These calculations used the same model of evolution, and followed established procedures of which the details have been described (20, 28, 29). These procedures calculate the likelihood of observing the protein sequence data given a topology and a specified amino acid at some node. A likelihood is calculated for each possible amino acid, with the largest corresponding to the ML estimate. The ratio of the largest likelihood to the sum over all amino acids (the total likelihood), each weighted by the prior probabilities of occurrence, is a measure of the posterior probability of that amino acid at that node. As is customary in such analyses, we assume equal prior probabilities for each amino acid, although basing our calculations on priors equal to the proportion of a given amino acid in the sequence does not alter our conclusions. The product of the probabilities over the separate amino acids that comprise a signature measures the probability of the entire signature at that node. By comparing probabilities at a pair of ancestral and descendant nodes it can be inferred whether the signature arose in the branch leading to the descendant node.
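
The last step of this procedure, turning per-residue likelihoods at a node into posterior probabilities and multiplying across the sites of a signature, can be sketched as below. The likelihood values are invented for illustration, only two residues are shown instead of twenty, and the tree-based likelihood calculation under the JTT model is assumed to have been done elsewhere.

```python
from math import prod


def posteriors(likelihoods, priors=None):
    """Posterior probability of each residue at a node.

    likelihoods maps residue -> likelihood of the data with that residue
    fixed at the node. Equal priors are assumed when none are supplied.
    """
    if priors is None:
        priors = {aa: 1.0 / len(likelihoods) for aa in likelihoods}
    weighted = {aa: likelihoods[aa] * priors[aa] for aa in likelihoods}
    total = sum(weighted.values())
    return {aa: w / total for aa, w in weighted.items()}


def signature_probability(site_posteriors):
    """Probability of the whole signature: product over its sites."""
    return prod(site_posteriors)


# Hypothetical likelihoods at one node for one signature position.
site_likelihoods = {"Q": 9e-5, "E": 1e-5}
post = posteriors(site_likelihoods)
ml_residue = max(site_likelihoods, key=site_likelihoods.get)
print(ml_residue, post[ml_residue])

# Hypothetical per-site posteriors for a two-residue signature.
print(signature_probability([0.9, 0.8]))
```

Passing residue frequencies as `priors` corresponds to the alternative prior mentioned in the text.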

## Results

### Candidate Sequence Signatures in CRYAA, AQP2, and IRBP.

In the alignment of 28 mammalian CRYAA sequences, six positions were
found to be relevant for distinguishing any possible five-species clade
(Fig. 1*A*). The only group of five species set apart
by a combination of two or more putative apomorphies, namely 70Q, 74L,
and 142C, is formed by elephant, manatee, hyrax, aardvark, and elephant
shrew. The combination QLC at positions 70, 74, and 142 thus is a
unique feature for the Afrotheria in this CRYAA data set. All
Afrotheria, apart from African elephant, share in addition the
apomorphy 72L. We therefore investigated the phylogenetic value of 70Q,
72L, 74L, 142C as a putative Afrotherian signature in CRYAA. Among 20
aligned AQP2 sequences, we traced four positions at which putative
apomorphies might be diagnostic for a six-species clade (Fig.
1*B*). The combination 10S, 55T and 104L perfectly set apart
the Afrotheria, with exception of dugong, which only shares the 10S
apomorphy. The signature STL was studied as an Afrotherian marker in
AQP2. In the alignment of 28 IRBP sequences, 47 positions passed the
search for a five-species grouping (Fig. 1*C*). There are 17
putative apomorphies in support of at least four of the five
Afrotheria. The combination 18M, 19S, 76S, 147G, 226I, 272D, and 328E
is even perfectly unique for all five Afrotheria. At the 10 other
positions the signature is affected by homoplasy, within the limits
allowed by our search procedure (see *Materials and Methods*).
The “degenerate” 17-residue signature is used in our further
analyses. No signatures were detected in the other six proteins.

### CRYAA and AQP2 Signatures in Other Afrotheria.

To perform meaningful likelihood calculations for the retrieved
signatures it was desirable to broaden the Afrotherian representation
by sequencing CRYAA and IRBP in golden mole and tenrec, and AQP2 in
tenrec. This would also be a test for the diagnostic value of these
signatures; if they are genuine synapomorphies for Afrotheria, one
expects to find them, completely or partially, in CRYAA, AQP2, and IRBP
from other members of this clade. We sequenced exon 2 of the CRYAA
gene, which encodes the signature positions 70, 72, and 74, in Indian
elephant, dugong, golden mole, and three Tenrecidae, including the
otter shrew. All new CRYAA sequences were found to code for 70Q, 72L,
and 74L, including that of Indian elephant, suggesting that 72V in
African elephant is a back mutation (Fig. 1*D*). For AQP2,
additional sequences were obtained for manatee, tree hyrax, and three
tenrecs, again including otter shrew. All of these species have the STL
signature, apart from manatee, which like dugong AQP2 misses 55T and
104L (Fig. 1*E*). Unfortunately, sequences for golden
mole and tenrec IRBP could not be obtained.

These new sequences illustrate that signatures, even in short proteins like CRYAA and AQP2, have the potential to identify newly investigated species as belonging to a specific clade. These data confirm that golden moles and tenrecs associate with Afrotheria, and indicate that the otter shrew joins this clade.

### Likelihoods of the CRYAA, AQP2 and IRBP Signatures.

To calculate the likelihood of the signatures in the Afrotherian
species we needed topologies representing alternative hypotheses about
their relationships. To construct these alternative topologies we used
a 5,708-bp concatenation of six genes (10) that is the only extensive
sequence data set available for most taxa that are relevant for our
calculations (indicated by asterisks in Fig. 1). It allowed us to make
trees with topologies and branch lengths independent of the particular
signature under investigation. Fig. 2*A* shows the
topology of the ML tree used for calculating the likelihoods of the
CRYAA signature. In this tree the African clade receives bootstrap
support of 100%. The principal morphologically favored alternatives
are to group Afrosoricida with hedgehog in a monophyletic Lipotyphla,
and elephant shrew with Glires (rabbits and rodents; refs. 11–14). The
trees in Fig. 2 *B*–*D* are constrained to comply
with these morphology-based hypotheses. Similar sets of alternative
trees were constructed for the AQP2 and IRBP data sets (see Fig. 6,
which is published as supplemental data on the PNAS web site).

The log-likelihoods of the CRYAA, AQP2, and IRBP signatures were separately calculated on the corresponding ML and constrained trees (Table 1). The signatures fit in every case the unconstrained ML tree substantially better than any of the constrained trees, providing independent support in three proteins for the Afrotherian clade.

### Likelihoods of Ancestral State Reconstructions.

If the signatures in CRYAA, AQP2, and IRBP are synapomorphies of
Afrotheria they should have evolved in the branch leading to the basal
node of the Afrotherian clade. The estimated posterior probabilities of
observing the signature QLLC at nodes 1 and 2 in Fig. 2*A*
are 3.0 × 10^{-9} and
0.984, respectively. For the AQP2 and IRBP ML trees the corresponding
probabilities are 2.0 × 10^{-6} and 0.987, and
7.8 × 10^{-32} and 0.391, respectively. The
sequence signatures of all three proteins thus have a high probability
of evolving in the branch leading to the basal node of the Afrotheria.
Probabilities this high for the CRYAA and AQP2 signatures imply that
each amino acid replacement in the signatures has a near 1.0
probability of having evolved in that branch. Even for the IRBP
signature, which requires 17 separate events in a specified branch, the
combined probability is 0.391. Removing just two of the more variable
sites (e.g., 59S and 326E, each of which has an approximately 0.65
probability of having evolved in the branch), the combined probability
rises to 0.94.
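
As a rough consistency check (a back-of-the-envelope sketch, assuming the signature probability is simply the product of the per-site probabilities), dividing out two sites of about 0.65 each gives

$$\frac{0.391}{0.65^{2}} = \frac{0.391}{0.4225} \approx 0.93,$$

which is in line with the reported 0.94 given that the per-site values are only quoted as approximately 0.65.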

These results confirm that the absence of 72L in African elephant
CRYAA must be a loss of L at that site. Similarly, the absence of 55T
and 104L in dugong AQP2 is reconstructed as a loss in the branch
leading to that species. We also infer that elephant shrew IRBP has
lost 59S and 326E, and other instances of homoplasy arise (compare Fig.
1*C*). However, none constitutes an alternative to the
signatures we investigate here.

The morphology-constrained trees each require that the signatures
evolved more than once or have evolved and been lost again.
Reconstructions of ancestral states similar to those for the ML tree
support this interpretation, as shown for CRYAA by the probabilities at
the nodes numbered in Fig. 2 *B*–*D*, and given in
the legends. Comparable values were found for the constrained AQP2 and
IRBP trees (Fig. 6). However, to reject the constrained trees solely on
the basis that they require more than one gain or loss of the
signatures requires a framework within which to consider the
probability of a signature event occurring more than once on a tree. If
this probability is high, then the alternative topologies are not ruled
out by our data.

### Phylogenetic Value of the Afrotherian Signatures.

Is it unlikely that the signatures we observed have evolved more than
once? To answer this question we developed a methodology that takes
account of all possible ways a signature could have arisen given the
number of elements (i.e., amino acid replacements in the signature) and
the length of the protein. This removes the possibility that we have
capitalized on chance. First, we calculate the probability of a given
class of signature events arising once. Let *r* be the number
of apomorphic elements in a signature. The class of *r*-events
(i.e., all of the possible signatures of size *r*) need not be
unlikely itself, but for the signature to be an unambiguous marker of a
clade the probability must be low that the same (identical) member of
the class arises twice.

Given *V* variable sites in a sequence, and a signature of size *r*, there are $\binom{V}{r}$ possible signatures of size *r*. Each signature has probability $p^{r}q^{V-r}$ of occurring in any given branch, where *p* is the probability of an amino acid replacement at a given site in a branch, and *q* = 1 − *p*. We assume that *p* is constant across sites. The product

$$\binom{V}{r}\,p^{r}q^{V-r} \tag{1}$$

gives the probability of an *r*-event. Summing this product over *r*, allowing *r* to range from *r* to *V*, gives the probability of a signature of length *r* or greater. Call this probability $p_{s}$, where *s* denotes signature.

The probability that a signature of length *r* or greater will arise at least once in a given tree is calculated as follows. Let there be $N_{b}$ branches in the tree. Then

$$\binom{N_{b}}{b}\,p_{s}^{\,b}\,(1-p_{s})^{N_{b}-b} \tag{2}$$

gives the probability of observing a signature of length *r* or greater in *b* branches of the tree. Summing this product over *b* ranging from 1 to $N_{b}$ gives the probability of observing on the tree at least one signature of length *r* or greater. Call this quantity $p_{t}$, where the *t* denotes the tree.
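
The two sums described above can be evaluated directly. The sketch below assumes both are binomial sums: over the number of replaced sites for the per-branch probability of a signature of length *r* or greater, and over the number of branches for the per-tree probability. The function names and the numerical inputs are hypothetical and are not the values behind Table 2.

```python
from math import comb


def signature_class_prob(n_sites, r, p):
    """p_s: probability that one branch carries r or more replacements
    among n_sites variable sites, each replaced with probability p."""
    q = 1.0 - p
    return sum(
        comb(n_sites, k) * p**k * q ** (n_sites - k)
        for k in range(r, n_sites + 1)
    )


def tree_prob(p_s, n_branches):
    """p_t: probability that at least one of n_branches branches
    carries a signature of length r or greater."""
    return sum(
        comb(n_branches, b) * p_s**b * (1.0 - p_s) ** (n_branches - b)
        for b in range(1, n_branches + 1)
    )


# Hypothetical inputs: 50 variable sites, signature size 3,
# replacement probability 0.02 per site per branch, 40 branches.
p_s = signature_class_prob(50, 3, 0.02)
p_t = tree_prob(p_s, 40)
print(p_s, p_t)

# The sum over b is the complement of "no branch carries a signature".
print(1.0 - (1.0 - p_s) ** 40)
```

The closed form in the last line is a convenient cross-check on the explicit sum.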

We estimated *p* for each protein, from the number of sites in the sequence, the total number of changes reconstructed on the ML tree, and the number of branches in the tree. We then applied this estimate of *p* to all sites to calculate $p_{s}$ and $p_{t}$ (Table 2). The results show that none of our signature classes alone is improbable. Thus, given as many variable sites as we observe in each protein, signatures of the sort we have detected, or longer, are expected somewhere on each tree.

For the identical signature to arise twice in a tree of $N_{b}$ branches, any of the *r*-length events can happen first and anywhere on the tree, but the second *r*-length event can only be one of the $\binom{V}{r}$ possible signatures of size *r*. Each of these occurs in any given branch with probability $p^{r}q^{V-r}$; call this probability $p_{b}$, where *b* denotes branch. Then, the probability of the identical signature arising twice is given by the product of $p_{t}$ and all possible ways of the second signature arising in the $N_{b}-1$ remaining branches. (In fact the number of branches in which the second signature can arise typically will be less than $N_{b}-1$ because the first signature will usually be present in more than one branch of the tree, owing to identity by descent. This makes our calculations conservative.) This product is written as

$$p_{t}\binom{N_{b}-1}{I}\,p_{b}^{\,I}\,(1-p_{b})^{N_{b}-1-I} \tag{3}$$

and the symbol *I* takes the value 1 to account for one additional signature arising. Using the same logic, Eq. 3 can be used to calculate the probability of the same signature arising three or four times by allowing *I* = 2 or *I* = 3. Table 2 reports the resulting probability for two, three, and four identical *r*-events. In these calculations we have replaced the $p_{b}$ of Eq. 3 with $p_{b}$ summed over all signatures of length *r* or greater. The calculations reported in Table 2 reveal that, although the class of *r*-events ($p_{t}$) is not improbable, the probability of the identical *r*- or greater-length event occurring twice or more by chance is always small and often negligible.
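
A minimal sketch of the repeated-signature calculation follows. It rests on two assumptions: Eq. 3 has the binomial form implied by the text (the tree-level probability times the ways of placing *I* further identical signatures among the remaining branches), and the per-site replacement probability is estimated as the average number of changes per site per branch, an estimator the source does not spell out. Function names and all numbers are hypothetical.

```python
from math import comb


def estimate_p(total_changes, n_sites, n_branches):
    """Assumed estimator: mean replacements per site per branch."""
    return total_changes / (n_sites * n_branches)


def repeat_prob(p_t, p_b, n_branches, extra=1):
    """Probability that a signature present somewhere on the tree (p_t)
    arises identically 'extra' more times in the other n_branches - 1
    branches, each with per-branch probability p_b."""
    remaining = n_branches - 1
    return (
        p_t
        * comb(remaining, extra)
        * p_b**extra
        * (1.0 - p_b) ** (remaining - extra)
    )


# Hypothetical inputs, not taken from Table 2.
p = estimate_p(total_changes=120, n_sites=150, n_branches=40)
once_more = repeat_prob(p_t=0.9, p_b=1e-6, n_branches=40, extra=1)
twice_more = repeat_prob(p_t=0.9, p_b=1e-6, n_branches=40, extra=2)
print(p, once_more, twice_more)
```

Because the per-branch probability of one specific signature is tiny, each additional occurrence lowers the result by several orders of magnitude, which is the qualitative pattern the text reports.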

### Combining Results from the Three Proteins.

How do these results alter our view about the likelihood of Afrotherian monophyly? Using Bayes' rule (30) we can combine the signature probabilities from Table 2 to arrive at a posterior probability for the Afrotherian hypothesis. From Bayes' rule

$$P(\text{Afrotheria}) = \frac{w(\text{Afrotheria})\,P(\text{signature} \mid \text{Afrotheria})}{P(\text{signatures})} \tag{4}$$

where $P(\text{Afrotheria})$ is the posterior probability of the Afrotheria signature, $w(\text{Afrotheria})$ is our prior belief in the Afrotherian hypothesis, $P(\text{signature} \mid \text{Afrotheria})$ is the probability of the Afrotherian signature given the unconstrained ML tree, and $P(\text{signatures})$ is the combined probabilities of the signatures summed over all four trees, weighted by their prior probabilities. $P(\text{signature} \mid \text{Afrotheria})$ is obtained from the $p_{t}$ column in Table 2, and $P(\text{signatures})$ by combining the Afrotherian results with those from the appropriate column of Table 2, corresponding to the number of times a signature has appeared in the three alternative trees.

Let our prior belief be skeptical to adopt a conservative view against
the Afrotherian hypothesis. Let *w(Afrotheria)* be 0.001. Let
our prior belief in the morphology-based hypotheses represented by the
other trees be higher, at 0.333 each (0.4995 for IRBP). These weights
then sum to 1.0 as they must. Table 3
reports that for all three proteins the posterior belief in Afrotheria
is strong and substantially altered from the prior. Calculating the
combined posterior support of the three proteins for the Afrotherian
hypothesis yields *P* ~1.0, even when a prior weight of
only 0.0001 is used. Thus, the combined data effectively rule out
support for polyphyly of the Afrotherian species.
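
The effect of the skeptical prior can be illustrated with a short calculation for a single protein. The priors below are the ones stated in the text; the per-tree signature probabilities are invented placeholders, not the Table 2 values, and the hypothesis labels are hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' rule over competing trees.

    prior and likelihood are dicts keyed by hypothesis; likelihood holds
    the probability of the observed signature pattern under each tree.
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


# Priors from the text: 0.001 for Afrotheria, 0.333 for each alternative.
priors = {
    "afrotheria": 0.001,
    "morphology_B": 0.333,
    "morphology_C": 0.333,
    "morphology_D": 0.333,
}
# Hypothetical signature probabilities under each tree.
signature_prob = {
    "afrotheria": 0.5,
    "morphology_B": 1e-6,
    "morphology_C": 1e-6,
    "morphology_D": 1e-8,
}
post = posterior(priors, signature_prob)
print(post["afrotheria"])
```

One way to combine several proteins, assuming they are independent, would be to use the posterior from one protein as the prior for the next; the source does not state which combination scheme was used.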

## Discussion

The sequence signatures that we identified in CRYAA, AQP2, and IRBP (Fig. 1), without resorting to prior phylogenetic analyses, provide independent evidence for the Afrotherian clade. The signatures are specific to Afrotheria, they arose with high probability at the basal node of the Afrotherian clade, and it is highly improbable that they would have arisen more than once as is required by the morphologically favored tree hypotheses. We demonstrated their predictive value by finding them in several species for which sequence data on the CRYAA and AQP2 proteins did not previously exist. Notably, the finding of the Afrotherian signatures in the otter shrew (for which no other sequence data have yet been published) supports the inclusion of this African insectivore in the Afrotheria.

Can the Afrotherian signatures be dismissed as homoplasy? The parallel appearance of signatures in a data set could be caused by the admixture of paralogous sequences, convergence, covarion processes, lineage sorting, or even by bias in base composition or differences in mutational mechanisms or repair systems (31). However, it seems highly implausible that such evolutionary mechanisms would cause similarly misleading signatures in three functionally independent proteins in precisely the same set of species.

At a methodological level our assumption that sites evolve independently may be questioned. An extensive literature deals with the correlated evolution of amino acid residues in a protein (e.g., refs. 32 and 33). Such mutual dependence makes it understandable that two or more replacements can originate or disappear in concert. To the extent that the amino acid replacements we have identified do change in a correlated manner, our calculations may underestimate the true probabilities of the signatures arising twice. Similarly, we have used a single estimate of the probability of a substitution to characterize every site and every branch. To the extent that the true probability varies our estimates may be affected. However, we reiterate that we have found similar highly improbable signature patterns in three independent proteins and always in the same set of species. Even using our simplifying assumptions, the results are congruent across trees and proteins. Further, our approach uses a statistical methodology that controls for the problem of capitalizing on chance that arises when searching for signatures of unknown length and composition.

The phylogenetic signal contained in sequence signatures, if present in a protein, contributes in any conventional phylogenetic analysis to the topology that is eventually reconstructed. What then is added by identifying and analyzing signatures on their own? It appears that the quantitative approach of analyzing ever longer sequences is not in all instances the panacea of molecular phylogeny, as in the case of deeper level analyses of mitochondrial protein sequences (e.g., refs. 34–38). If one accepts that synapomorphies are the cornerstones of phylogeny reconstruction, it is logical then to additionally search for mutational events that act as qualitative sequence characteristics for a specific clade. Such can be retropositions (39), specific insertions or deletions (e.g., refs. 10, 19, and 22), and the sequence signatures as discussed here. These molecular character-state data may allow a better discrimination between homoplasy and homology, a prerequisite for finding “true” trees (31). Where conventional analyses combine all of the site-by-site information into a single result, the signature approach highlights a concrete set of events whose most plausible evolutionary explanation can help to choose among competing phylogenetic hypotheses.

The “protein morphological” evidence provided by the signatures in CRYAA, AQP2, and IRBP may give an impetus to reevaluate the apparent absence of any morphological synapomorphies for the African clade against the backdrop of the various scenarios for lipotyphlan phylogeny (14, 40).

## Acknowledgments

We thank Collin van Asten for writing signwin, Marcel Sweers for technical help, and Peter Vogel for otter shrew tissue. This work was supported by grants from the Netherlands Foundation for Life Sciences (to W.W.d.J.), the European Commission (to W.W.d.J., F.C. and M.J.S.), and the Leverhulme Trust and the Natural Environment Research Council (to M.P.).

## Abbreviations

- CRYAA
- αA-crystallin
- AQP2
- aquaporin-2
- IRBP
- interphotoreceptor retinol-binding protein
- ML
- maximum likelihood

## Footnotes

This paper was submitted directly (Track II) to the PNAS office.

See commentary on page 1.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AJ251100–AJ251106, AJ277647, and AJ270463–AJ270468).

Article published online before print: *Proc. Natl. Acad. Sci. USA*,
10.1073/pnas.250216797.

Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.250216797

## References

*Nature (London)*, in press.

**National Academy of Sciences**

From the Cover: Protein sequence signatures support the African clade of mammals. *Proceedings of the National Academy of Sciences of the United States of America*. Jan 2, 2001; 98(1): 188.
