Molecular Phylogenetics and Evolution Vol. 19, No. 1, April, pp. 57–66, 2001 doi:10.1006/mpev.2001.0899, available online at http://www.idealibrary.com
Nuclear and mtDNA Phylogenies of the Trimeresurus Complex: Implications for the Gene versus Species Tree Debate
Nicholas Giannasi, Anita Malhotra,1 and Roger S. Thorpe
School of Biological Sciences, University College of North Wales, Bangor, Gwynedd, LL57 2UW, Wales, United Kingdom
Received March 14, 2000; revised October 19, 2000
Phylogenies based on mitochondrial DNA (mtDNA) represent gene trees that may not be congruent with the equivalent species tree. One solution to this problem is to include additional, independent loci from the nuclear genome. Sequence data from the seventh intron of the β-fibrinogen gene were generated for 25 specimens of vipers, including 8 nominal species of the Trimeresurus complex of Asian pit vipers. Phylogenetic trees were generated using maximum-parsimony and maximum-likelihood methods. The taxonomic level at which the intron provided significant phylogenetic information was examined and the trees were compared to those produced from previously obtained mtDNA cytochrome b sequences. A variety of different approaches (separate analyses, conditional data combination, and consensus) were used in an attempt to provide a sound organismal phylogeny based on both nuclear and mtDNA data sets. We discuss the implications for the gene tree–species tree debate and its particular relevance to medically important organisms. © 2001 Academic Press
Key Words: gene tree; species tree; mitochondrial DNA; nuclear DNA; Trimeresurus; pitviper; phylogeny.
INTRODUCTION
The past decade has seen a huge number of phylogenetic studies on a wide range of taxa that have provided valuable insight into many aspects of evolutionary biology. The vast majority of these studies have been based on mitochondrial DNA (mtDNA) because of the rapid rate of sequence evolution (Brown et al., 1979), the comparability of mtDNA genes across broad taxonomic boundaries (Brown, 1985), and the availability of universal primers (Kocher et al., 1989). However, due to the lack of recombination, the 37 different genes of the animal mitochondrial genome are inherited as a single unit and hence phylogenies derived from several mtDNA genes are not independent estimates of organismal phylogeny (Moore, 1995; Page, 2000). Furthermore, a phylogeny produced from mtDNA represents a gene tree that may not be congruent with the species tree because of lineage sorting (Moore, 1995). Given that the major goal of phylogenetic studies is to infer the evolutionary history of species and populations, rather than genes, this poses a significant problem. The only apparent solution is to include additional, independent loci from the nuclear genome in phylogenetic studies (Wu, 1991). Recently, there have been several investigations of the utility of a range of intron sequences as a source of such loci. Exon-primed intron-crossing (EPIC) conserved primers have been used in studies of actin in cetaceans (Palumbi and Baker, 1994) and of fibrinogen (Prychitko and Moore, 1997) and aldolase, G3PD, α-enolase, and lamin in birds (Friesen et al., 1997).
For venomous snakes, attempting to determine the species tree is of more than esoteric interest for a number of reasons. First, snakebite is very common in tropical developing countries, and antivenom remains the only specific treatment for envenomation by snakes (Theakston, 1997). If venom composition differs between snake taxa, then a monovalent antivenom against the venom of one taxon is unlikely to be effective in neutralizing the venom of another taxon. Hence, knowing which species are present in which geographic locations provides a sound basis for the production and clinical application of antivenoms. Second, venom variation is of fundamental importance for biomedical and toxinological research (Chippaux et al., 1991; Thorpe et al., 1997; Theakston, 1997). However, any work on venom based on an incorrect taxonomic framework may be rendered valueless if the wrong name is attached to the model or if comparisons are based on incorrect material (Warrell, 1986).
This study focuses on the pit vipers of the Trimeresurus complex that are widely distributed across southern Asia and the Indo-Malayan archipelago. This complex represents a major evolutionary radiation and consists of over 40 species. Between three and five genera [Trimeresurus sensu stricto (s.s.), Tropidolaemus, Ovophis, Protobothrops, and Ermia] are generally recognized within this complex (McDiarmid et al., 1999; David and Ineich,
1 To whom correspondence should be addressed. Fax: (01248) 371644. E-mail: [email protected]
1055-7903/01 $35.00 Copyright © 2001 by Academic Press All rights of reproduction in any form reserved.
GIANNASI, MALHOTRA, AND THORPE
TABLE 1
List of Species Included in This Study, Their Origins, GenBank Accession Numbers, and Catalogue Numbers (Author’s Personal Collection)
Genus          Species          Origin               Code     Catalogue No.  GenBank Accession No. (β-fibrinogen)  GenBank Accession No. (Cyt b)a
Calloselasma   rhodostoma       West Malaysia        CAL1     A54            AF200622                              AF171918
Echis          ocellatus        Garoua, Cameroon     ECHO567  WW567          AF200598                              AF191579
Ovophis        okinavensis      Ryu-kyu Islands      B1       B1             AF200620                              AF171915
Protobothrops  mucrosquamatus   Taiwan               TMUC1    A211           AF200621                              AF171897
Trimeresurus   albolabris       Nepal                TAN2     A100           AF200614                              AF171909
Trimeresurus   albolabris       North Thailand       T?2      A226           AF200612                              AF171910
Trimeresurus   albolabris       Northeast Thailand   TAT4     A135           AF200618                              AF171893
Trimeresurus   albolabris       South Thailand       TAT8     A134           AF200617                              AF171894
Trimeresurus   albolabris       Hong Kong            TAHK1    A157           AF200615                              AF171884
Trimeresurus   albolabris       West Java (1)        TAWJ1    A125           AF200613                              AF171886
Trimeresurus   albolabris       West Java (2)        TAWJ3    A126           AF200616                              AF171891
Trimeresurus   albolabris       East Java            TAEJ1    A115           AF200610                              AF171887
Trimeresurus   cantori          Nicobar Islands      TCANT    A85            AF200606                              AF171899
Trimeresurus   erythrurus       Rangoon, Myanmar     TERY1    A209           AF200607                              AF171900
Trimeresurus   gracilis         Taiwan               TGRAC    A86            AF200619                              AF171913
Trimeresurus   gramineus        South India          TG1      A219           AF200604                              AF171905
Trimeresurus   malabaricus      South India          TMAL2    A217           AF200602                              AF171901
Trimeresurus   popeorum         North Thailand       TPN1     A204           AF200600                              AF171902
Trimeresurus   popeorum         South Thailand       TPS1     A202           AF200601                              AF171904
Trimeresurus   popeorum         West Malaysia        TPM1     A196           AF200605                              AF171888
Trimeresurus   stejnegeri       Northeast Thailand   TSL18    A181           AF200609                              AF171898
Trimeresurus   stejnegeri       Taiwan (1)           TST4     A160           AF200608                              AF171896
Trimeresurus   stejnegeri       Taiwan (2)           TST60    A161           AF200611                              AF171880
Trimeresurus   trigonocephalus  Sri Lanka            TT1      A58            AF200603                              AF171890
Tropidolaemus  wagleri          West Malaysia        TWAG1    A66            AF200599                              AF171917
a From Malhotra and Thorpe (2000).
1999). Certain species, such as Trimeresurus albolabris, are common and represent a significant problem to local people and an occupational hazard to agricultural workers. A recent phylogenetic study, based on cytochrome b mtDNA data, of 21 species in the complex (Malhotra and Thorpe, 2000) presented significant differences in species relationships that impinge on the previously accepted taxonomy. The main aims of this study were, first, to determine whether the previously identified seventh intron from the β-fibrinogen gene (Prychitko and Moore, 1997) was easily amplified across a range of medically important snake species for which mtDNA data exist; second, to determine at which taxonomic levels this intron provides significant phylogenetic information; and, third, to examine the phylogenetic resolution of different approaches (separate analyses, conditional data combination, consensus) when there is more than one data set. Finally, we hope to provide a sound organismal phylogeny for this medically and evolutionarily significant group of venomous snakes.
MATERIALS AND METHODS
Sampling
Twenty-three samples representing eight nominal species and four genera (Tropidolaemus, Trimeresurus s.s., Ovophis, and Protobothrops) of the Trimeresurus complex were used in this study (see Table 1 for details). In addition, one other Asian pit viper (Calloselasma rhodostoma) was included since previous studies suggest that the Trimeresurus complex may not be monophyletic (Malhotra and Thorpe, 2000). Finally, a true viper (Echis ocellatus) was used as the outgroup. Samples were in the form of tail-tip biopsies or liver tissue in 80% ethanol or 100–200 µl of blood taken from the caudal vein, placed in 1 ml of 5% EDTA, and stored in 2 ml SDS–Tris buffer (100 mM Tris, 3% SDS).
DNA Preparations, PCR Amplifications, and Sequencing
Whole genomic DNA was extracted using the protocol of Sambrook et al. (1989). The seventh intron from the β-fibrinogen gene was then amplified using the exon-designed primers FIB-B17U and FIB-B17L (Prychitko and Moore, 1997). The cycling parameters used were as follows: denaturation at 94°C for 45 s, annealing at 55°C for 30 s, and extension at 72°C for 1 min 30 s. A negative (blank) control was always included to monitor for possible contamination. PCR products were purified using the Wizard PCR purification system following the manufacturer’s instructions. The
NUCLEAR AND mtDNA PHYLOGENIES OF THE Trimeresurus SPECIES COMPLEX
double-stranded purified PCR segment was then sequenced from both ends using BigDye Terminator cycle sequencing, and sequence was determined with an automated sequencer (Applied Biosystems 377) following the manufacturer’s protocols.
Sequence Analysis
The nuclear DNA (nDNA) sequences were aligned with ClustalX (Thompson et al., 1997), using default settings. Base composition was assessed using PAUP* Version 4.0b2a (Swofford, 1998). DnaSP Version 3 (Rozas and Rozas, 1999) was used to determine the number of monomorphic, polymorphic, and parsimony-informative sites and detect the presence of stop codons. The hypothesis that all mutations are selectively neutral (Kimura, 1983) was tested using Tajima’s test as implemented in DnaSP Version 3 (Rozas and Rozas, 1999) based on the total number of mutations and including sites with gaps. A likelihood-ratio test (Felsenstein, 1988) was used to determine whether nucleotide substitutions are clock-like over the evolution of all sequences. Nucleotide saturation was assessed at each codon position by plotting numbers of transitions and transversions against the Tamura–Nei distance (plots not shown). We used DAMBE Version 3.5.14 (Xia, 2000) to further investigate saturation, since simulation studies show that phylogenetic information is essentially lost when the observed saturation is equal to, or larger than, half of full substitution saturation (Xia, 2000).
The alignment of the 25 sequences was tested for adequacy of phylogenetic signal using PAUP* Version 4.0b2a (Swofford, 1998) by plotting tree lengths of 100,000 random trees, with calculation of the g1 statistic for the skewness of tree length distributions (Hillis and Huelsenbeck, 1992). The critical values of g1 are obtained from the table published in Hillis and Huelsenbeck (1992), and a significant result indicates that the length of the actual tree is significantly shorter than expected from random data (i.e., without any phylogenetic structure).
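As a rough illustration of the g1 statistic described above, the following Python sketch computes the skewness of a list of tree lengths. The function name `g1_skewness` and the toy tree lengths are hypothetical; in the study the lengths came from 100,000 random trees evaluated in PAUP*, and significance was judged against the critical values tabulated by Hillis and Huelsenbeck (1992), which are not reproduced here.

```python
def g1_skewness(tree_lengths):
    """Return the g1 skewness (third central moment over m2**1.5).

    This uses the population (uncorrected) form; that choice is an
    assumption, since software may apply a small-sample correction.
    """
    n = len(tree_lengths)
    if n < 3:
        raise ValueError("need at least three tree lengths")
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n  # variance
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n  # third moment
    if m2 == 0:
        raise ValueError("all tree lengths are identical")
    return m3 / m2 ** 1.5


# Toy data (hypothetical): most random trees are long and a few are much
# shorter, the left-skewed pattern expected when signal is present.
lengths = [200, 199, 201, 200, 198, 200, 185, 170]
print(round(g1_skewness(lengths), 3))  # a negative value means left skew
```

A strongly negative g1 is the pattern interpreted as phylogenetic signal; a value near zero would be expected for a symmetric distribution of random tree lengths.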
We also used g1 in an iterative process to determine whether phylogenetic signal is evenly distributed throughout the branches of a tree, i.e., to assess the phylogenetic levels at which the data set has a significant signal-to-noise ratio. This was done by first defining a constraint tree in which only the well-supported tip clades found in the phylogenetic analysis were defined and recalculating the g1 statistic. In subsequent tests, clades defined by successive levels of the tree were defined, and the point at which g1 ceases to be significant indicates the limit of resolution of the data set (Hillis, 1991). All phylogenetic analyses were executed using PAUP* Version 4.0b2a (Swofford, 1998). Different reconstruction methods were used to derive phylogenies as this allows the consistency of phylogenetic estimation to be evaluated (Avise, 1994). Phylogenetic trees
were reconstructed using maximum-parsimony (MP) (Swofford and Olsen, 1990) and maximum-likelihood (ML) (Felsenstein, 1981) methods. In the MP analysis, the heuristic search algorithm was employed with 10 random additions of taxa and tree bisection–reconnection (TBR) branch swapping. Random addition of sequences increases the effectiveness of heuristic searches because it decreases the chance that a search will find only suboptimal trees in “tree islands” other than those containing the most-parsimonious trees (Maddison, 1991). All other settings were left at default values. Maximum-likelihood analyses were performed using heuristic searches with the HKY model (Hasegawa et al., 1985), with empirical base frequencies. The reliability of the trees produced by all phylogenetic reconstructions was tested by bootstrap analysis (Felsenstein, 1985) with 1000 replications. The Wilcoxon signed-ranks test (Templeton, 1983) was applied to test whether the best trees produced by the different reconstruction methods differed significantly from one another. MtDNA cytochrome b sequence data (660 bp) were also available for the 25 specimens used in the intron analysis (a subset of the specimens used in Malhotra and Thorpe, 2000). This allowed an assessment of the utility of the seventh intron from the β-fibrinogen gene at the between-genera taxonomic level. MP and ML trees were constructed for the mtDNA data to allow such a comparison. Again, the Wilcoxon signed-ranks test (Templeton, 1983) was applied to compare trees. In an attempt to assess different methods of combining independent data sets to derive a sound organismal phylogeny, we reconstructed phylogenies from concatenated nDNA and mtDNA sequences in a total evidence approach (Kluge, 1989). We then assessed the congruence of the underlying data by a partition homogeneity test, based on comparison of separate and combined tree lengths, with data randomization (Farris et al., 1994).
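The partition homogeneity test just described reduces to a simple permutation P value once tree lengths are in hand. The sketch below assumes that the summed lengths of the most-parsimonious trees for the observed partitions, and for each random repartition of the characters, have already been obtained from a tree-search program. The function name `partition_homogeneity_p` and the toy numbers are hypothetical, and counting the observed partition as one of the replicates is an assumption about the convention used.

```python
def partition_homogeneity_p(observed_sum, random_sums):
    """Permutation P value for the partition homogeneity test.

    observed_sum: summed length of the shortest trees found separately
                  for the real partitions (e.g., intron and mtDNA).
    random_sums:  the same sum for each random repartition of the sites.

    Incongruent partitions give an observed sum shorter than most random
    ones, so P is the fraction of replicates (observed included, by
    assumption) that are as short as or shorter than the observed sum.
    """
    as_short = sum(1 for s in random_sums if s <= observed_sum)
    return (as_short + 1) / (len(random_sums) + 1)


# Toy example (hypothetical lengths): 999 random repartitions, all longer
# than the observed sum, give the smallest P value attainable.
random_sums = [1120 + (i % 15) for i in range(999)]
p = partition_homogeneity_p(1111, random_sums)
print(p)  # 0.001
```

Under this convention, 1000 replicates cannot yield a P value below 0.001, so a reported P of 0.001 means that no random repartition was as short as the observed one.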
In this test, nDNA and mtDNA sequences from all 25 specimens were concatenated and designated as separate partitions, and congruence was assessed using 1000 heuristic search replicates. Subsequently, a variety of consensus approaches were explored [reviewed by Swofford (1991)]. The simplest of these is the strict consensus, which includes only the groups that appear on all of the competing trees (Sokal and Rohlf, 1981). Semistrict consensus trees (Bremer, 1990) are equivalent to strict consensus trees if all competing trees are fully dichotomous. However, they allow groups that are never contradicted, but may not appear in all the trees (e.g., if they are less than fully resolved), to be retained in the consensus tree. Majority-rule consensus trees (Margush and McMorris, 1981) are the most familiar and are less strict in that they allow groups that occur in more than a given proportion of all trees (typically a 50% cut-off is used) to be included. The Adams consensus tree (Adams,
1972, 1986) is also less strict in that it includes any groups shared by the competing trees regardless of whether they constitute completely uncontradicted components (Mickevich and Platnick, 1989). It has been suggested to be useful when one or more taxa have very different positions on different trees, but when there is also a subset of sequences on whose relationships the different trees agree. However, it may produce consensus trees that contain groups not in fact found in any of the competing trees, which complicates its interpretation.
RESULTS
We confirmed that we had amplified and sequenced the seventh intron from the β-fibrinogen gene by comparing the overlap of our putative exons with stretches of published chicken exon sequence (Weissbach et al., 1991) that flanked the 5′ and 3′ ends of the intron. We determined the beginning and end of the intron sequence by locating the base pairs that form the 5′ and 3′ consensus splice sites that are highly conserved across introns (Breathnach et al., 1978). On average we found the seventh intron from the β-fibrinogen gene to be highly A-T rich (A = 29%, C = 21%, G = 16%, and T = 34%), which is in agreement with previous work on this intron by Prychitko and Moore (1997). The intron was found to be 927 bp in length. Over all sequences, there were 799 monomorphic sites, 108 polymorphic sites, and 47 parsimony-informative sites. Deletions were present in a few sequences but were short (maximum 3 bp in length) and were comparatively rare, with one present in Trimeresurus cantori and several different populations of Trimeresurus albolabris and two present in Ovophis okinavensis. Sequences have been deposited with GenBank (accession numbers are given in Table 1). Tajima’s test could not reject the hypothesis that all mutations are selectively neutral (D = −1.62, not significant, 0.10 > P > 0.05). Furthermore, a likelihood-ratio test could not reject the action of a molecular clock, further indicating that the amplified intron conforms to neutral theory. The observed substitution saturation differed significantly (T = 37.23, df = 29.8817, P < 0.0001) from half of full substitution saturation, indicating that no significant saturation was present in the data set. A linear plot of transitions and transversions versus the Tamura–Nei distance (not shown) confirmed this. The presence of significant phylogenetic signal was indicated by the skewness parameter g1 = −1.23, corresponding to P < 0.001 for the number of characters and taxa involved (Hillis and Huelsenbeck, 1992). Furthermore, the iterative g1 analysis demonstrated that phylogenetic signal in the data set is evenly distributed throughout the branches of the most parsimonious tree (P < 0.001 for all g1 calculations).
Phylogenetic Relationships from β-Fibrinogen
A total of 2068 most parsimonious trees (length = 164) were produced with a consistency index (CI) of 0.896, a retention index (RI) of 0.915, and a homoplasy index (HI) of 0.104 (Fig. 1a). The ML analysis recovers an almost identical topology (Fig. 1b) with approximately the same degree of resolution. Indeed, using Templeton’s test no significant differences were found between the two topologies (z = −1.41, P = 0.157). However, the ML analysis does not find Tropidolaemus wagleri and C. rhodostoma to be sister taxa.
Phylogenetic Relationships from Cytochrome b
A single most-parsimonious tree (length = 947) was produced with a CI of 0.466, a RI of 0.541, and a HI of 0.534 (Fig. 2a). The ML analysis (Fig. 2b) produced a topologically almost identical tree, which is not significantly different (z = −2.00, P = 0.055) from the MP tree. However, both mtDNA trees show greater resolution than the intron trees and are significantly different from the intron trees (intron MP vs mtDNA MP, z = −5.11, P < 0.001; intron MP vs mtDNA ML, z = −9.19, P < 0.001; intron ML vs mtDNA MP, z = −5.11, P < 0.001; intron ML vs mtDNA ML, z = −5.10, P < 0.001).
Phylogenetic Relationships from Total Evidence
Using a partition homogeneity test, the concatenated data set was found to be significantly incongruent (P = 0.001). Nevertheless, a total evidence approach advocates combining data sets regardless (Kluge, 1989). Three equally parsimonious trees (length = 1145) were produced from concatenated sequences, with a CI of 0.514, a RI of 0.573, and a HI of 0.486. The MP and ML analyses resulted in almost identical topologies (Fig. 3). The degree of significant differences between all topologies is summarized in Table 2.
Phylogenetic Relationships from Consensus Approaches
Consensus trees resulting from different approaches are given in Fig. 4. The strict consensus tree retains virtually no phylogenetic information (Fig. 4a). The semistrict consensus approach (Fig. 4b) provides a greater degree of resolution, recognizing (among others) groups corresponding to Trimeresurus stejnegeri and Trimeresurus popeorum and suggesting that T. albolabris may contain more than one species. The majority-rule tree (Fig. 4c) provides a level of resolution similar to that of the semistrict tree, recognizing most of the same groups and also suggesting the paraphyly of T. albolabris. However, the sister relationship between Calloselasma and Tropidolaemus is lost, as is the grouping of Protobothrops with the O. okinavensis/
Trimeresurus gracilis cluster. Finally, the Adams consensus tree (Fig. 4d) also provides similar information concerning the taxonomy of Trimeresurus.
FIG. 1. Phylogenetic relationships based on sequence data from the seventh intron of the β-fibrinogen gene. Bootstrap percentages (above 50%) are indicated below the nodes to which they refer, where appropriate. (a) Maximum-parsimony tree. (b) Maximum-likelihood tree.
FIG. 2. Phylogenetic relationships based on a subset of cytochrome b mtDNA sequence data from Malhotra and Thorpe (2000). Bootstrap percentages (above 50%) are indicated where appropriate. (a) Maximum-parsimony tree. (b) Maximum-likelihood tree.
FIG. 3. Phylogenetic relationships based on a total evidence analysis of the concatenated intron and cytochrome b mtDNA sequence data. Bootstrap percentages (above 50%) are indicated below the nodes, where appropriate. (a) Maximum-parsimony tree. (b) Maximum-likelihood tree.
DISCUSSION
Using EPIC PCR amplification, the seventh intron from the β-fibrinogen gene appears to be relatively simple to amplify across a broad taxonomic range. Furthermore, the g1 analysis demonstrates the presence of significant phylogenetic signal in the intron and the
iterative g1 analysis indicates that this signal is evenly distributed throughout the tree. However, the distribution of this signal is somewhat surprising, as the intron has a rather slow rate of sequence evolution (see also Prychitko and Moore, 1997). Consequently, one might predict that the intron would have significant phylogenetic signal at deep taxonomic levels, i.e., between genera, but would have less phylogenetic signal at the finer taxonomic levels, i.e., between closely related species or different populations of the same species. Phylogenetic analysis demonstrates that a well-supported, but not particularly well-resolved tree can be reconstructed from this data set. The availability of mtDNA sequence data (Malhotra and Thorpe, 2000) for the 25 specimens facilitates a direct comparison of the phylogenetic utility of the intron. The mtDNA sequences also contain significant phylogenetic signal across the mtDNA tree and both the MP and the ML analyses produce a more resolved tree than the intron data. Furthermore, the mtDNA and intron trees are significantly different. These topological differences raise the inevitable question of whether to combine or keep separate the two data sets. At one extreme the data sets could be kept separate and a consensus tree constructed from the intron and mtDNA trees. The strict consensus approach has the advantage of being very conservative, a particularly relevant point when attempting to determine the species tree for medically important species such as venomous snakes. However, it is apparent that this approach is so conservative that it is in fact very difficult to reach any phylogenetic conclusions. The semistrict consensus approach, majority-rule, and Adams consensus trees all provide a much greater degree of resolution and appear to provide some valuable information concerning the taxonomy of Trimeresurus.
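The difference between the strict, semistrict, and majority-rule rules discussed above can be made concrete with a small Python sketch. It assumes each rooted tree is represented simply as a set of clades, with each clade a frozenset of taxon labels; the helper names and the four-taxon toy trees are hypothetical, and the Adams consensus is not implemented here.

```python
from collections import Counter


def clade(*taxa):
    """A clade is just the set of taxa it contains."""
    return frozenset(taxa)


def strict_consensus(trees):
    """Clades that appear in every tree."""
    return set.intersection(*(set(t) for t in trees))


def majority_rule(trees, cutoff=0.5):
    """Clades that appear in more than `cutoff` of the trees."""
    counts = Counter(c for t in trees for c in set(t))
    return {c for c, k in counts.items() if k / len(trees) > cutoff}


def compatible(a, b):
    """Two rooted clades can coexist if they are nested or disjoint."""
    return a <= b or b <= a or not (a & b)


def semistrict_consensus(trees):
    """Clades found in at least one tree and contradicted by none."""
    all_clades = set().union(*(set(t) for t in trees))
    return {c for c in all_clades
            if all(compatible(c, d) for d in all_clades)}


AB, CD, ABC = clade("A", "B"), clade("C", "D"), clade("A", "B", "C")

# Case 1: the second tree is merely unresolved for C and D.
unresolved = [{AB, CD}, {AB}]
# Case 2: the third tree actively contradicts the C+D group.
conflicting = [{AB, CD}, {AB, CD}, {AB, ABC}]

print(strict_consensus(unresolved) == {AB})          # True
print(semistrict_consensus(unresolved) == {AB, CD})  # True
print(majority_rule(conflicting) == {AB, CD})        # True
print(semistrict_consensus(conflicting) == {AB})     # True
```

In the first case the semistrict rule keeps the C+D group because no tree contradicts it, whereas the strict rule drops it. In the second case the majority-rule tree keeps C+D on a two-to-one vote even though one tree conflicts with it.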
The only disagreement among these three consensus trees lies in the groups identified among specimens from the relatively closely related T. albolabris species group [as defined in Malhotra and Thorpe (2000)]. A drawback of using consensus approaches is that they do not use all the available data; rather, they use only data that are consistent between trees and nearly always result in a less than fully resolved tree. It has been suggested that although resolution is lost, this is countered by the certainty that all the groups in the final consensus tree are supported by all the data. However, as Barrett et al. (1991) demonstrated, this is not necessarily true. An intermediate approach to the data combination question is the conditional data combination approach (Bull et al., 1993; de Queiroz et al., 1995), in which data are combined unless there is evidence for significant conflict among them. This approach raises another problem of how to statistically test for the degree of heterogeneity between data partitions. A variety of statistical tests have been suggested as appropriate.
TABLE 2
Summary of Topological Differences between All Trees Assessed by the Wilcoxon Signed-Ranks Test (Templeton, 1983)
             nMP        nML        mtMP       mtML
nML          −1.41 NS
mtMP         −5.11 NS   −5.11***
mtML         −9.19***   −5.11***   −2.01 NS
Combined MP  −3.37***   −3.14***   −1.15 NS   −2.18*
Combined ML  −3.37***   −3.14***   0.00 NS    −0.89 NS
Note. n represents a nuclear DNA tree, mt represents a mtDNA tree, combined represents a combined nDNA/mtDNA tree, MP represents maximum-parsimony, and ML represents maximum-likelihood. ***P < 0.0001; **P < 0.001; *P < 0.05; NS, not significant.
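The z values in Table 2 come from Wilcoxon signed-ranks comparisons of two topologies. A minimal sketch of the normal-approximation form of such a test follows, assuming that the number of steps each character requires on each tree is already available (for example, exported from a parsimony program). The function names and toy step counts are hypothetical, no tie or continuity correction is applied to the variance, and the P values it returns need not match those reported by PAUP*.

```python
import math


def templeton_z(steps_tree_a, steps_tree_b):
    """Wilcoxon signed-ranks z for per-character step differences.

    Characters requiring the same number of steps on both trees are
    dropped. A positive z means tree A tends to need more steps.
    """
    diffs = [a - b for a, b in zip(steps_tree_a, steps_tree_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0
    # Rank the absolute differences, giving tied values their mean rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


def two_tailed_p(z):
    """Two-tailed P value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy step counts (hypothetical) for six characters on two trees.
steps_a = [1, 2, 1, 3, 1, 2]
steps_b = [1, 1, 1, 1, 2, 2]
z = templeton_z(steps_a, steps_b)
print(round(z, 3), round(two_tailed_p(z), 3))
```

With so few differing characters the normal approximation is crude; it is shown only to make the origin of a z score and its two-tailed P value explicit.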
These include the partition homogeneity test (Farris et al., 1994), the Wilcoxon signed-ranks test (Templeton, 1983), the Kishino–Hasegawa test (Kishino and Hasegawa, 1989), level of nodal support (Flynn and Nedbal, 1998), and the likelihood heterogeneity test (Huelsenbeck and Bull, 1996). In this study, we use the partition homogeneity test and find the concatenated intron and mtDNA data sets to be significantly incongruent (P = 0.001). Hence, following the conditional data combination approach, it is not appropriate to combine the two data sets, and we are still no closer to resolving a species tree from our two gene trees. A third approach is to unconditionally combine all data sets in a “total evidence” analysis as advocated by Kluge (1989). The main attraction of combining all data into a single analysis is that it makes use of all the available data. Indeed, MP and ML analyses on the combined intron and mtDNA data result in almost identical topologies and exhibit the highest degree of resolution and support of all the trees considered (i.e., MP and ML intron, MP and ML mtDNA). This is despite the significant result of the partition homogeneity test, which argues for keeping the two data sets separate. This observation provides support for Sullivan’s (1996) suggestion that while tests of homogeneity provide evidence to keep data partitions separate in phylogenetic analyses, combining certain heterogeneous partitions can lead to a more robust phylogeny in some details. The resulting total evidence trees seem to provide “sensible” taxonomic information that corresponds to morphology (Malhotra and Thorpe, 2000). Among the species represented by multiple samples, T. stejnegeri and T. popeorum are recognized as monophyletic groups, but T. albolabris s.l. is seen to be paraphyletic.
Hence, at first glance it appears that the total evidence approach may provide the best method of producing a species tree from more than one gene tree and therefore allow us to make taxonomic suggestions concerning these medically important snakes. However, it has been suggested that some data sets can be “positively misleading” (Bull et al., 1993) and incorporation into a total evidence analysis can produce incorrect estimates of phylogeny (e.g., Flynn and Nedbal, 1998). As a result we advocate caution and agree with Flynn and Nedbal (1998) that it would prove advantageous to have a more accurate measure of heterogeneity between data partitions that can differentiate between compromising the integrity of existing phylogenetic signal and sustaining and/or increasing phylogenetic signal. A further weakness of the total evidence approach is that it assumes that all the data reflect the same evolutionary history. This may not be the case since in many species (e.g., humpback whales) one may expect mtDNA and nuclear genes to have different, conflicting histories that may represent gender-biased migration (Palumbi and Baker, 1994). For example, if males move more than females and breed more successfully than females, then nuclear genes may be mixed more thoroughly than the maternally inherited mtDNA sequences. Moreover, different genes may have different histories for a variety of other reasons, such as gene duplication (resulting in paralogous genes), lineage sorting, and horizontal transfer (Page, 2000). The reconstruction of the history of a gene family is interesting in its own right and has benefited from the application of reconciled trees (Goodman et al., 1979; Page, 2000). However, the original problem of inferring species trees from one or more gene trees also would benefit substantially from the application of reconciled tree concepts. The optimal species tree is simply the tree in which all the gene trees can be embedded with the least cost. The program GeneTree (Page, 1998) is currently available to do such an analysis. Unfortunately, we were not able to implement this method, as it requires fully resolved gene trees. Furthermore, weakly supported and highly supported nodes are given equal weighting.
These limitations are currently being investigated (Page, 2000) and may provide a novel and practical way to infer organismal phylogenies from one or more gene trees. This study suggests that these developments are likely to be very important in order to produce organismal phylogenies from multiple gene trees and thus facilitate the practical implementation
FIG. 4. Consensus trees of all six constructed trees (MP and ML trees from intron, mtDNA, and combined sequences). (a) Strict consensus. (b) Semistrict consensus. (c) 50% majority-rule consensus. (d) Adams consensus.
of molecular phylogenetics in providing a stable taxonomy for medically significant organisms.
ACKNOWLEDGMENTS
We are grateful to the large numbers of people who have assisted us in the field or supplied us with tissue samples for analysis. These include Dr. Jennifer Daltry (Flora and Fauna International), Dr. Wolfgang Wüster and Nicholas Cockayne (University of Wales, Bangor), Professor David Warrell (University of Oxford), Romulus Whitaker, Dr. Indraneil Das, and Gerry Martin (Madras Crocodile Bank), Mr. S. S. Ramachandra Raja (Wildlife Association of Ramnad District, India), Dr. Michihisa Toriba (Japan Snake Institute), Anslem de Silva (Peridinaya University, Sri Lanka), Professor Sangkot Marzuki (Eijkmann Institute, Indonesia), Dr. Aucky Hinting, Rick Hodges, Vincen Khartono, Pak Harwono, and his staff (Surabaya Zoo), Dr. Wen-hao Chou (National Museum of Natural Science, Taiwan), Jean-Jay Mao (Taipei City Zoo, Taiwan), B. Bhetwal (Nepal Red Cross Society), Tanya Chan-ard and Jarujin Nabhitabhata (National Science Museum of Thailand), Dr. Lawan Chanhome (Queen Savoabha Memorial Institute, Thailand), Dr. Kumthorn Thirakhupt and Dr. Peter Paul van Dijk (Chulalongkorn University, Thailand), Dr. Cheelaprabha Rangsiyanon (Chiang Mai University, Thailand), Merel J. Cox, Jonathan Murray, Galen Valle, and Dr. James D. Lazell (the Conservation Agency, USA), Dr. Bob Murphy (Royal Ontario Museum, Canada), and Dr. Trinh Xuan Kiem (Cho-Ray Hospital, Vietnam). We also acknowledge the National Research Council of Thailand, the National Science Council of Taiwan, and LIPI (Indonesia) for permission to carry out fieldwork. This study was funded by grants to R.S.T. and A.M. from the Leverhulme Trust (F174/I), Darwin Initiative (162/6/65), NERC (GR9/2688), and the Wellcome Trust (057257/Z/99/Z). The Royal Society, the Percy Sladen Trust (A.M.), the Bonhote Trust, and the Carnegie Trust (R.S.T.) provided additional support for fieldwork.
REFERENCES
Adams, E. N., III (1972). Consensus techniques and the comparison of taxonomic trees. Syst. Zool. 21: 390–397.
Adams, E. N., III (1986). N-trees as nestings: Complexity, similarity and consensus. J. Classif. 3: 299–317.
Avise, J. C. (1994). “Molecular Markers, Natural History and Evolution.” Chapman and Hall, New York.
Barrett, M., Donoghue, M. J., and Sober, E. (1991). Against consensus. Syst. Zool. 40: 486–493.
Bremer, K. (1990). Combinable component consensus. Cladistics 6: 369–372.
Breathnach, R., Benoist, C., O’Hare, K., Gannon, F., and Chambon, P. (1978). Ovalbumin gene: Evidence for a leader sequence in mRNA and DNA sequences at the exon–intron boundaries. Proc. Natl. Acad. Sci. USA 75: 4853–4857.
Brown, W. M. (1985). The mitochondrial genome of animals. In “Evolutionary Genetics” (R. J. MacIntyre, Ed.), pp. 95–130. Plenum, New York.
Brown, W. M., George, M., Jr., and Wilson, A. C. (1979). Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76: 1967–1971.
Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. (1993). Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42: 384–397.
Chippaux, J-P., Williams, V., and White, J. (1991). Snake venom variability: Methods of study, results and interpretation. Toxicon 29: 1279–1303.
David, P., and Ineich, I. (1999). Les serpents venimeux du monde: Systématique et répartition. Dumerilia 3: 3–499.
De Queiroz, A., Donoghue, M. J., and Kim, J. (1995). Separate versus combined analysis of phylogenetic evidence. Annu. Rev. Ecol. Syst. 26: 657–681.
Farris, J. S., Källersjö, M., Kluge, A. G., and Bult, C. (1994). Testing significance of incongruence. Cladistics 10: 315–319.
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17: 368–376.
Felsenstein, J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39: 783–791.
Felsenstein, J. (1988). Phylogenies from molecular sequences: Inference and reliability. Annu. Rev. Genet. 22: 521–565.
Flynn, J. J., and Nedbal, M. A. (1998). Phylogeny of the Carnivora (Mammalia): Congruence vs incompatibility among multiple data sets. Mol. Phylogenet. Evol. 9: 414–426.
Friesen, V. L., Congdon, B. C., Walsh, H. E., and Birt, T. P. (1997). Intron variation in marbled murrelets detected using analyses of single-stranded conformational polymorphisms. Mol. Ecol. 6: 1047–1058.
Goodman, M., Czelusniak, J., Moore, G. W., Romero-Herrera, A. E., and Matsuda, G. (1979). Fitting the gene lineage into its species lineage: A parsimony strategy illustrated by cladograms constructed from globin sequences. Syst. Zool. 28: 132–168.
Hasegawa, M., Kishino, H., and Yano, T. (1985). Dating the human–ape split by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22: 160–174.
Hillis, D. M. (1991). Discriminating between phylogenetic signal and random noise in DNA sequences. In “Phylogenetic Analysis of DNA Sequences” (M. M. Miyamoto and J. Cracraft, Eds.), pp. 278–294. Oxford Univ. Press, New York.
Hillis, D. M., and Huelsenbeck, J. P. (1992). Signal, noise, and reliability in molecular phylogenetic analyses. J. Hered. 83: 189–195.
Huelsenbeck, J. P., and Bull, J. J. (1996). A likelihood ratio test to detect conflicting phylogenetic signal. Syst. Biol. 45: 92–98.
Kimura, M. (1983). “The Neutral Theory of Molecular Evolution.” Cambridge Univ. Press, Cambridge.
Kishino, H., and Hasegawa, M. (1989). Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order of the Hominoidea. J. Mol. Evol. 29: 170–179.
Kluge, A. G. (1989). A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38: 7–25.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. (1989). Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86: 6196–6200.
Maddison, D. R. (1991). The discovery and importance of multiple islands of most-parsimonious trees. Syst. Zool. 40: 315–328.
Malhotra, A., and Thorpe, R. S. (2000). A phylogeny of the Trimeresurus group of pit vipers: New evidence from a mitochondrial gene tree. Mol. Phylogenet. Evol. 16: 199–211.
Margush, T., and McMorris, F. R. (1981). Consensus n-trees. Bull. Math. Biol. 43: 239–244.
McDiarmid, R. W., Campbell, J. A., and Touré, T. A. (1999). “Snake Species of the World. A Taxonomic and Geographic Reference,” Vol. 1. The Herpetologists’ League, Washington, DC.
Mickevich, M. F., and Platnick, N. I. (1989). On the information content of classifications. Cladistics 5: 33–47.
Moore, W. S. (1995). Inferring phylogenies from mtDNA variation: Mitochondrial gene trees versus nuclear gene trees. Evolution 49: 718–726.
Page, R. D. M. (1998). GeneTree: Comparing gene and species phylogenies using reconciled trees. Bioinformatics 14: 819–820.
Page, R. D. M. (2000). Extracting species trees from complex gene trees: Reconciled trees and vertebrate phylogeny. Mol. Phylogenet. Evol. 14: 89–106.
Palumbi, S. R., and Baker, C. S. (1994). Contrasting population structure from nuclear intron sequences and mtDNA of humpback whales. Mol. Biol. Evol. 11: 426–435.
Prychitko, T. M., and Moore, W. S. (1997). The utility of DNA sequences of an intron from the β-fibrinogen gene in phylogenetic analysis of woodpeckers (Aves: Picidae). Mol. Phylogenet. Evol. 8: 193–204.
Rozas, J., and Rozas, R. (1999). DnaSP version 3: An integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15: 174–175.
Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Sokal, R. R., and Rohlf, F. J. (1981). Taxonomic congruence in the Leptopodomorpha reexamined. Syst. Zool. 30: 309–325.
Sullivan, J. (1996). Combining data with different distributions of among-site variation. Syst. Biol. 45: 375–380.
Swofford, D. L. (1991). When are phylogeny estimates from molecular and morphological data incongruent? In “Phylogenetic Analysis of DNA Sequences” (M. M. Miyamoto and J. Cracraft, Eds.), pp. 295–333. Oxford Univ. Press, New York.
Swofford, D. L. (1998). “PAUP*. Phylogenetic Analysis Using Parsimony (* and other Methods),” Version 4. Sinauer, Sunderland, MA.
Swofford, D. L., and Olsen, G. J. (1990). Phylogeny reconstruction. In “Molecular Systematics” (D. M. Hillis and C. Moritz, Eds.), pp. 411–501. Sinauer, Sunderland, MA.
Templeton, A. R. (1983). Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37: 221–244.
Theakston, R. D. G. (1997). The kinetics of snakebite envenoming and therapy. In “Venomous Snakes: Ecology, Evolution and Snakebite” (R. S. Thorpe, W. Wüster, and A. Malhotra, Eds.), pp. 251–257. Oxford Univ. Press, Oxford.
Thorpe, R. S., Wüster, W., and Malhotra, A. (Eds.). (1997). “Venomous Snakes: Ecology, Evolution and Snakebite.” Oxford Univ. Press, Oxford.
Warrell, D. A. (1986). Tropical snake bite: Clinical studies in South-East Asia. In “Natural Toxins—Animal, Plant and Microbial” (J. B. Harris, Ed.), pp. 25–45. Clarendon, Oxford.
Weissbach, L., Oddoux, C., Procyk, R., and Grieninger, G. (1991). The beta chain of chicken fibrinogen contains an atypical thrombin cleavage site. Biochemistry 30: 3290–3294.
Wu, C-I. (1991). Inferences of species phylogeny in relation to segregation of ancient polymorphisms. Genetics 127: 429–435.
Xia, X. (2000). “DAMBE: Data Analysis in Molecular Biology and Evolution.” Kluwer Academic, Boston.