Gene 366 (2006) 292 – 302 www.elsevier.com/locate/gene

Evolutionary analysis of a large mtDNA translocation (numt) into the nuclear genome of the Panthera genus species

Jae-Heup Kim a,1,2, Agostinho Antunes a,b,2, Shu-Jin Luo a, Joan Menninger c, William G. Nash d, Stephen J. O'Brien a, Warren E. Johnson a,⁎

a Laboratory of Genomic Diversity, National Cancer Institute–Frederick Cancer Research and Development Center (NCI–FCRDC), Frederick, MD 21702-1201, USA
b REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal
c Basic Research Program, SAIC-Frederick, Inc., NCI Frederick, Frederick, MD, USA
d H&W Cytogenetic Services, Inc., Lovettsville, VA, USA

Received 29 April 2005; received in revised form 3 August 2005; accepted 30 August 2005
Available online 27 December 2005
Received by C. Saccone

Abstract

Translocation of cymtDNA into the nuclear genome, also referred to as numt, has been reported in many species, including several closely related to the domestic cat (Felis catus). We describe the recent transposition of 12,536 bp of the 17 kb mitochondrial genome into the nucleus of the common ancestor of the five Panthera genus species: tiger, P. tigris; snow leopard, P. uncia; jaguar, P. onca; leopard, P. pardus; and lion, P. leo. This nuclear integration, representing 74% of the mitochondrial genome, is one of the largest to be reported in eukaryotes. The Panthera genus numt differs from the numt previously described in the Felis genus in: (1) chromosomal location (F2, telomeric region vs. D2, centromeric region), (2) gene makeup (from the ND5 to the ATP8 vs. from the CR to the COII), (3) size (12.5 vs. 7.9 kb), and (4) structure (single monomer vs. tandemly repeated in Felis). These distinctions indicate that the origin of this large numt fragment in the nuclear genome of the Panthera species is an independent insertion from that of the domestic cat lineage, which has been further supported by phylogenetic analyses. The tiger cymtDNA shared around 90% sequence identity with the homologous numt sequence, suggesting an origin for the Panthera numt at around 3.5 million years ago, prior to the radiation of the five extant Panthera species.

Published by Elsevier B.V.

Keywords: Big cats; Mitochondrial DNA; Nuclear insertion; Numt; Panthera genus; Pseudogene; Tiger

Abbreviations: ATP8, ATP synthase subunit 8; bp, base pairs; Cyt b, cytochrome b; COI, cytochrome c oxidase subunit I; COII, cytochrome c oxidase subunit II; cymtDNA, cytoplasmic mitochondrial DNA; CR, control region; kb, kilobase(s); FISH, fluorescence in situ hybridization; MYA, million years ago; mtDNA, mitochondrial DNA; ND1, NADH dehydrogenase subunit 1; ND2, NADH dehydrogenase subunit 2; ND5, NADH dehydrogenase subunit 5; ND6, NADH dehydrogenase subunit 6; PCR, polymerase chain reaction; RFLP, restriction fragment length polymorphism; 16S, 16S ribosomal RNA; 12S, 12S ribosomal RNA.
⁎ Corresponding author. Tel.: +1 301 846 7483; fax: +1 301 846 6327. E-mail address: [email protected] (W.E. Johnson).
1 Present address: Bio Lab, Samsung Advanced Institute of Technology, P.O. Box 111, Suwon, Korea 440-600.
2 These authors contributed equally to this work.
0378-1119/$ - see front matter. Published by Elsevier B.V. doi:10.1016/j.gene.2005.08.023

1. Introduction

Nuclear DNA sequences that are homologous to the mitochondrial genome, often referred to as numts (pronounced “new-mights,” Lopez et al., 1994), have been reported in numerous organisms, including more than 60 animal and plant species (reviewed in Bensasson et al., 2001). Most of the described instances of numt are short fragments of less than 600 bp with varying degrees of similarity to cymtDNA (Zhang and Hewitt, 1996a; Herrnstadt et al., 1999), and the process of integration has often been associated with nonhomologous recombination (e.g., Roth et al., 1985; Henze and Martin, 2001).

In humans, the genome sequence database has provided a broad view of the extent of mtDNA transfer, has facilitated the identification of transfer mechanisms, and has illuminated the evolutionary dynamics of numts (Mourier et al., 2001; Tourmen et al., 2002; Woischnik and Moraes, 2002; Hazkani-Covo et al., 2003; Mishmar et al., 2004; Ricchetti et al., 2004). The incorporation of mtDNA sequences into the human nuclear genome has probably been a continuous evolutionary process, with, by some estimates, at least 612 integrations (Woischnik and Moraes, 2002). However, the incidence of novel numt insertions may be lower, since mtDNA-like sequences may also result from duplication after insertion into the nucleus (Tourmen et al., 2002; Bensasson et al., 2003; Hazkani-Covo et al., 2003). Most human numt segments encompass less than 5% of the mtDNA, and in only three instances exceed 70% of mtDNA.

Whole genome sequences of other mammals will continue to elucidate the evolutionary dynamics of numts outside of humans (Pereira and Baker, 2004; Richly and Leister, 2004). However, full genome drafts of other mammals will be limited primarily to model organisms of biomedical, taxonomic or phylogenetic interest (O'Brien et al., 2001). Detailed characterizations of numts among closely related species will therefore be necessary to provide additional insights into the characteristics of mitochondrial pseudogenes, including the study of their evolutionary histories and their distribution and abundance across species (Bensasson et al., 2001; Pons and Vogler, 2005).

Two cases of numt have been documented in the Felidae family. The first consisted of the translocation of 7.9 kb of the mitochondrial genome into the domestic cat (Felis catus) nuclear genome (Lopez et al., 1994). This large segment is tandemly repeated 38–76 times on cat chromosome D2. The second case of numt in the Felidae family was first described in Panthera genus species based on mtDNA RFLP data (Johnson et al., 1996) and later by sequence analysis (Cracraft et al., 1998).
Here we characterize the structure and evolutionary history of the Panthera numt fragment by (i) determining its chromosomal location in all the Panthera genus species (tiger, P. tigris; snow leopard, P. uncia; jaguar, P. onca; leopard, P. pardus; and lion, P. leo), (ii) comparing large portions of the numt and cymt sequences in one Panthera species (the tiger), and (iii) employing phylogenetic and coalescence analyses to assess the evolutionary history of these numt and cymt segments in species of the genus Panthera.

2. Materials and methods

2.1. DNA isolation, amplification, cloning and sequencing

To facilitate the characterization of the Panthera numt, three distinct DNA fractions [total (t), nuclear (n), and cytoplasmic mitochondrial (cymt)] were purified from 1.5 g of frozen liver from tiger (Pti065), snow leopard (Pun086), jaguar (Pon011), leopard (Ppa021), and lion (Ple181). The tDNA (mixture of nDNA and cymtDNA) was extracted from tissue according to standard procedures (Sambrook et al., 1989; Lopez et al., 1994). The nDNA fraction was purified using sucrose gradient DNA extraction methods (Bernatchez and Dodson, 1990) and the cymtDNA was purified using the Wizard Miniprep kit (Promega; Beckman et al., 1993).

Four regions of the mtDNA genome were amplified in each of the fractions: (i) a portion between the ND5 gene and the CR (primers ND5F-U/CRR-U), (ii) the CR segment (primers CRF-U/CRR-U), (iii) a portion from 16S to ND2 (primers 16SF-U/ND2R-U), and (iv) the segment from ND2 to ATP8 (primers ND2F-U/ATP8R-U) (Fig. 1; Table 1). RFLP analysis was performed on these segments using several restriction enzymes (BamHI, HindIII, EcoRI, XhoI, etc.) to test for differences in banding patterns between cymt and numt. The PCR products for the CR and 16S-ND2 segments, which differed in length between the nDNA and cymtDNA fractions, were cloned and sequenced to unambiguously distinguish the cymt and numt products. PCR products were purified using Microcon PCR (Amicon). Cloning was carried out using the Zero Blunt TOPO PCR Cloning kit (Invitrogen). The smallest CR PCR products were purified after agarose gel electrophoresis and subcloned using the pCR-Blunt II-TOPO cloning vector (Invitrogen). Positive clones of cymt and numt were confirmed by comparison with the RFLP patterns. The clones were sequenced using BigDye Terminator Cycle Sequencing Kits (PE Applied Biosystems) and run on an ABI377 automated sequencer.

Based on cymt/numt mismatches, a series of numt- and cymt-specific primers were designed for long-range PCR, allowing more extensive sequencing and analysis of the tiger cymt and numt (Table 1). Numt and cymt strand-specific primers were designed in highly variable sections or for variable sites using the virtual PCR program Amplify-2.53 (Engels, 1997). Additionally, cymt and numt portions of the 16S, ND1 and ND2 genes were amplified, cloned and sequenced for all five Panthera species.

2.2. Cytogenetic inference of the Panthera numt location: FISH mapping

The location of numt in the nuclear genome of all the Panthera species was determined by FISH. A 2.6 kb mtDNA PCR probe (Fig. 1), generated from the purified cymtDNA fraction, was labeled with biotin-11 dUTP (Sigma) by nick translation (Brigati et al., 1983) in the five Panthera species, as well as the domestic cat. The final probe size was verified on a 1.2% gel with appropriate markers. Metaphase spreads were prepared by standard cytogenetic techniques (Modi et al., 1987). FISH was performed as described by Lichter et al. (1990). Briefly, the metaphase spreads were denatured in 70% formamide/2× SSC in an 80 °C oven for 90 s and dehydrated in a cold ethanol series (70–90–100%) for 3 to 5 min in each step. 400 ng of labeled probe and 10 μg of salmon sperm carrier DNA were resuspended in 50% formamide/10% dextran sulfate/2× SSC and denatured for 10 min at 75 °C. The denatured probe cocktail was layered on the denatured metaphase chromosomes. Following 48 h of incubation at 37 °C, post-hybridization washes, and treatment with blocking solution, the hybridized biotin-labeled probe was detected by fluorescein isothiocyanate (FITC)-conjugated avidin DCS (5 mg/ml; Vector Labs). Fluorescence signals were captured as grayscale images using a Zeiss Axioskop epi-fluorescence microscope equipped with a cooled CCD (charge-coupled device) camera

Fig. 1. Schematic diagram of the relative positions of Panthera numt. The scale bar (in kilobases) corresponds to the domestic cat (Fca, Felis catus) mtDNA complete sequence (Lopez et al., 1996) aligned with the Panthera numt described in this study. The Fca numt is represented for comparison. Protein-coding genes and rRNAs are indicated in boxes. Individual capital letters correspond to the 17 tRNAs. The arrows and numbers over the CR and 16S represent gaps between the cymt and numt sequences in the tiger. Fragments amplified from cymt or numt portions are represented by lines and arrow lines with primer names labeled at the 5′ and 3′ ends and primer sequences (Table 1). (A) A 2.6 kb mtDNA probe was generated by PCR and used for FISH mapping to locate the numt in the Panthera species. (B) Four segments were amplified using universal primers from three DNA fractions (tDNA, nDNA, and cymtDNA) of Panthera species and examined by RFLP (see Fig. 2). Two segments (CR, 16S-ND2) were subsequently cloned and sequenced to separate cymt and numt. (C) Cymt and numt tiger sequences were obtained separately using a combination of universal and strand-specific primers designed based upon cymt/numt gaps in the CR and 16S regions (Table 1).

Table 1. Primers used to amplify the Panthera cymt and numt portions surveyed in this study

Primer name   Sequence                                      Specificity
ND5F-U        5′-GTGCAACTCCAAATAAAAG-3′                     Panthera sp.
CytBR-U       5′-ATTAATAATTTTGATAAGGGGGTGCGAT-3′            Panthera sp.
CRF-U         5′-TCAAAGCTTACACCAGTCTTGTAAACC-3′             Universal
CRR-U         5′-TAACTGCAGAAGGCTAGGACCAAACCT-3′             Universal
16SF-U        5′-ACGACGGCCAGTGTGCAAAGGTAGCATAATCA-3′        Panthera sp.
ND2R-U        5′-CAACCCGTTAACCTCGGGTACTCAGAAGT-3′           Panthera sp.
ND2F-U        5′-ACTTCTGAGTACCCGAGGTTAACGGGTTG-3′           Panthera sp.
ATP8R-U       5′-GCTATGACCGGCGAATAGATTTTCGTTCA-3′           Universal
CRF-N         5′-ACTCCCACAACACAGACGCACAGT-3′                P. tigris N
CRF-C         5′-CGTTAATACAGAACACACAACACG-3′                P. tigris C
CRR-N         5′-CATTGTGCGTTTGTGTTATGGG-3′                  P. tigris N
CRR-C         5′-CGTGTTGTGTGTTCTGTAT-3′                     P. tigris C
16SF-N        5′-CGTTTGTTCAACGACTACCGG-3′                   P. tigris N
16SF-C        5′-CAAAGTCCTACGTGATCTG-3′                     P. tigris C
16SR-N        5′-CGTGGACTACTCCGGTAATCG-3′                   P. tigris N
16SR-C        5′-CAGAACTCAGATCACGTAG-3′                     P. tigris C

*Abbreviations: U, Panthera species-specific or universal primer; N, numt specific; C, cymt specific; F, forward; R, reverse. Universal primers are from Kocher et al. (1989) or Johnson et al. (1998), or were designed in this study; N and C primers were designed for this study using clones from the CR and 16S-ND2 gene regions.
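As a quick consistency check on the primer sequences printed in Table 1, one primer can be reverse-complemented and compared with its partner on the opposite strand. The sketch below is illustrative only and is not part of the study's workflow; the helper name `reverse_complement` is hypothetical. Applied to the sequences as printed, it suggests that ND2R-U and ND2F-U are exact reverse complements of each other, which would be consistent with the 16S-ND2 and ND2-ATP8 segments meeting at the same site in ND2.

```python
# Illustrative check of two Table 1 primers; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences copied from Table 1, written 5' to 3'.
ND2R_U = "CAACCCGTTAACCTCGGGTACTCAGAAGT"
ND2F_U = "ACTTCTGAGTACCCGAGGTTAACGGGTTG"

# True when the two primers anneal to the same site on opposite strands.
same_site = reverse_complement(ND2R_U) == ND2F_U
print(same_site)
```

The same helper can be reused to look for partial overlaps between the strand-specific N and C primer pairs, although that comparison is not something the text itself reports.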

(Photometrics CE 200 A) and the Oncor imaging system. Grayscale images were computer enhanced, pseudocolored, and merged using Oncor Image software. Images of reverse DAPI-banded chromosomes were merged with the FITC-detected signals, allowing direct visualization of localization, chromosome identification and cytogenetic locus assignment.

2.3. Sequence analyses

Sequences were inspected using SEQUENCHER (Gene Codes Co.), aligned using Clustal-X (Thompson et al., 1997), and further checked by eye. Initial sequence comparisons and measures of variability were performed using MEGA (Kumar et al., 2001). Transition/transversion ratios (Ts/Tv) and the gamma-distribution parameter of rate variation among sites (method of Yang and Kumar, 1996) were estimated using PAMP (included in the package PAML 2.0; Yang, 1997). tRNA structure was predicted using the mfold web server (Zuker, 2003).

Phylogenetic analyses of the Panthera cymt and numt sequences were performed in PAUP* 4.0b2a (Swofford, 2001) using three approaches: (i) minimum evolution (ME) heuristic search, using a Kimura two-parameter model and the neighbor-joining tree-building algorithm (Saitou and Nei, 1987) followed by branch-swapping; (ii) maximum parsimony (MP), with an exhaustive search; and (iii) maximum likelihood (ML), incorporating a gamma-corrected HKY85 model with parameters estimated from the data set. Reliability of nodes defined by the phylogenetic trees was assessed using 100 bootstrap replications (Felsenstein, 1985; Hillis and Bull, 1993) in the ME and MP analyses, and with the quartet puzzling method in the ML analysis (PUZZLE 4.0; Strimmer and von Haeseler, 1996).

The date of origin of the Panthera numt was estimated from the overall genetic distance between tiger numt and cymt, applying the equation of Li et al. (1981), whereby the fraction of sequence divergence is δ = (μ1 + μ2)t, where μ1 = 2.5 × 10⁻⁸ substitutions/site/year for cymtDNA (Hasegawa et al., 1985; Lopez et al., 1997), μ2 = 4.7 × 10⁻⁹ substitutions/site/year for nuclear pseudogene distance (Li et al., 1981; Lopez et al., 1997), and t is the time elapsed.

3. Results

3.1. Recognition of the genes involved in the Panthera numt

A detection strategy was devised to identify and isolate potential numt fragments based on differences in banding patterns from four distinct PCR products [(ND5-CR), (CR), (16S-ND2) and (ND2-ATP8); (Fig. 1)] and RFLP banding patterns from three DNA fractions (tDNA, nDNA, and cymtDNA isolated from liver tissue; see Materials and methods) (Fig. 2). The CR-PCR products from the tDNA fraction in all

Fig. 2. Differences in the banding patterns from PCR products amplified from total (t), nuclear (n), and cytoplasmic mitochondrial (cymt) Panthera DNA fractions from two of the four segments surveyed that showed presence of numt copies. The two segments represented in this figure were chosen for depiction due to the clear distinction of numt sequences caused by the large deletions in CR and 16S. The banding patterns observed were similar in all Panthera species and thus only a single species profile is represented. (A) Control region fragment. (B) Region between 16S and ND2 genes followed by restriction enzyme digestion of BamHI. Lane L represents DNA size ladder 250 (BRL, i.e. the brightest band is 1.0 kb and each step represents 250 bp).

the Panthera species showed two codominant bands of around 1.7 kb and 1.5 kb, compared with a single band from each of the purified cymtDNA and nDNA fractions (1.7 and 1.5 kb, respectively) (Fig. 2A). We determined by band pattern and sequence analysis that the 1.7 kb fragment was cymt and that the 1.5 kb fragment was the numt copy. Numt PCR products were also identified from the three other regions, (ND5-CR), (16S-ND2; Fig. 2B) and (ND2-ATP8), based on different RFLP patterns of HindIII and BamHI digestion among the three DNA fractions. These combined results suggested that the Panthera numt encompasses a region from the ND5 to the ATP8 gene, including eight protein-coding genes, two rRNA genes, 17 tRNA genes, and the non-coding CR (Fig. 1).

3.2. Chromosomal location of the Panthera numt

A 2.6 kb mtDNA probe including ND5, ND6, and CytB regions (Fig. 1) was hybridized to metaphase spreads of the five Panthera genus species and the domestic cat. Strong fluorescent hybridization signals were observed on chromosome F2 at q1.1 in all the Panthera species (Fig. 3A to E), but on chromosome D2 at the centromere of the domestic cat (Fig. 3F), as previously described by Lopez et al. (1994).

3.3. Comparative sequence analyses of tiger numt and cymt

Using large deletions in CR (25 bp) and 16S (23 bp) of the Panthera numt, we designed strand-specific primers for numt and cymt for long-range PCR amplification and sequencing in tiger (Fig. 1; Table 1). Sequences from clones and PCR products were concatenated into a fragment of 12,898 bp for cymt and 12,536 bp for numt (GenBank accession numbers DQ151550 and DQ151551) (Fig. 1). The size difference between numt and cymt was caused mostly by the 340 bp gap in the RS3 region, a 23 bp gap in the HVS-1 region of CR, and a 25 bp gap of the 16S gene in numt (Fig. S1). The numt sequence started in the middle of the ND5 gene (corresponding to position Fca 12,918 in the domestic cat; Lopez et al., 1996) and almost reached the end of the ATP8 gene (position Fca 8840). This 12,536 bp (∼12.5 kb) of tiger numt included approximately 75% of the 17 kb mitochondrial genome, as described in the domestic cat (Lopez et al., 1996). The tiger numt contains a truncated ND5 gene (1533 bp), complete ND6 (527 bp), Cyt b (1143 bp), 12S (960 bp), 16S (1545 bp), ND1 (958 bp), ND2 (1044 bp), COI (1550 bp), and COII (684 bp) genes, a truncated ATP8 gene (183 bp), a CR sequence (1181 bp) with a large deletion (340 bp) removing most of the RS-3 with the d(CA)-rich 8-bp

Fig. 3. Images of fluorescence in situ hybridization (FISH) of the metaphase chromosomes for each of the five Panthera species and the domestic cat using the probe including the partial sequences from ND5 and Cytb regions (2.6 kb). (A) Tiger, P. tigris. (B) Lion, P. leo. (C) Jaguar, P. onca. (D) Leopard, P. pardus. (E) Snow leopard, P. uncia. (F) Domestic cat, F. catus. Signals were observed on the telomeric region of chromosome F2 (F2q12) in all the Panthera species (A–E) and on the centromeric region of chromosome D2 (D2p11) in the domestic cat (F).

Table 2. Characterization of the size, similarity, and nucleotide substitution patterns from pairwise comparison of tiger cymt (12.8 kb) and numt (12.5 kb) sequences. Stop codons within numt were determined after frame shift or indels.

(a) Sizes, changes between cymt and numt, and pattern of substitutions. Sizes and changes are in bp. A-G and T-C are transitions (Ts); G-C, A-T, T-G and A-C are transversions (Tv). Stops: number of stop codons within numt. % diff.: percent differences of nucleotides.

Segment          cymt     numt     Subst.  Gaps  A-G  T-C  G-C  A-T  T-G  A-C  Ts/Tv  Stops  % diff.

Protein coding genes
ND5              1530     1533     124     11    35   76   4    3    1    5    8.5    1      8.8
ND6              528      527      38      5     14   19   –    1    3    1    6.6    –      8.1
ND1              957      958      71      5     15   52   1    1    1    1    16.8   3      7.0
ND2              1044     1044     84      4     26   46   4    4    –    4    6.0    –      8.4
CytB             1140     1143     94      15    29   59   1    2    1    2    14.7   –      10.5
COI              1545     1550     124     19    48   65   1    2    6    2    10.3   23     9.3
COII             684      684      62      8     27   33   –    –    1    1    30.0   –      10.2
ATP8             182      183      23      9     10   9    1    2    1    1    3.8    5      17.6
Total            7610     7622     620     76    204  359  12   15   14   17   9.7    32     9.1

rRNAs
12S              957      956      19      9     7    9    –    –    –    3    5.3    –      2.9
16S              1575     1545     42      44    13   22   1    4    –    1    5.8    –      5.5
Total            2532     2501     61      53    20   31   1    4    –    4    5.7    –      4.5

tRNAs
tRNA-Glu         71       70       –       1     –    –    –    –    –    –    –      –      1.4
tRNA-Thr         70       70       4       –     1    2    –    1    –    –    3      –      5.7
tRNA-Pro         66       66       –       –     –    –    –    –    –    –    –      –      0
tRNA-Phe         71       75       6       6     3    1    1    –    1    –    2      –      16.9
tRNA-Val         68       68       –       –     –    –    –    –    –    –    –      –      0
tRNA-Leu         75       75       4       –     –    2    –    1    1    –    1      –      5.3
tRNA-Ile         69       69       2       –     2    –    –    –    –    –    2      –      2.9
tRNA-Gln         74       74       –       –     –    –    –    –    –    –    –      –      0
tRNA-Met         69       69       1       –     –    1    –    –    –    1    –      –      1.4
tRNA-Trp         69       69       3       4     –    1    –    1    1    –    0.5    –      10.1
tRNA-Ala         69       69       3       –     1    2    –    –    –    –    3      –      4.3
tRNA-Asn         73       73       3       –     1    1    –    –    1    –    2      –      4.1
tRNA-Cys         66       66       2       –     1    –    –    1    –    –    1      –      3.0
tRNA-Tyr         68       66       4       4     –    2    1    –    –    1    1      –      11.8
tRNA-Ser         70       70       3       –     1    2    –    –    –    –    3      –      4.3
tRNA-Asp         69       70       4       1     2    2    –    –    –    –    4      –      7.2
tRNA-Lys         69       69       4       –     1    3    –    –    –    –    4      –      5.8
Total            1186     1188     43      16    13   19   2    4    4    1    2.9    –      5.0

Control region   1539     1181     79      378   25   39   2    6    2    5    4.3    –      30
Total            12,867   12,492   803     523   262  448  17   29   20   27   7.6    32     10.3

(b) Pattern of gaps in numt (bp). Segments not listed individually have no insertions or deletions.

Segment          Insertions (bp)                  Deletions (bp)

Protein coding genes
ND5              7 (1 bp × 3 + 2 bp × 2)          4 (1 bp × 4)
ND6              2 (1 bp × 2)                     3 (1 bp × 1 + 2 bp × 2)
ND1              3 (1 bp × 4)                     2 (1 bp × 2)
ND2              2 (1 bp × 2)                     2 (1 bp × 2)
CytB             9 (1 bp × 9)                     6 (1 bp × 6)
COI              12 (1 bp × 9 + 3 bp × 1)         7 (1 bp × 7)
COII             4 (1 bp × 2 + 2 bp × 1)          4 (1 bp × 4)
ATP8             5 (1 bp × 3 + 2 bp × 1)          4 (1 bp × 3 + 2 bp × 1)
Total            44                               32

rRNAs
12S              4 (1 bp × 4)                     5 (1 bp × 5)
16S              7 (1 bp × 7)                     37 (6 bp × 1 + 1 bp × 6 + 25 bp × 1)
Total            11                               42

tRNAs
tRNA-Glu         –                                1 (1 bp × 1)
tRNA-Phe         5 (2 bp × 1 + 1 bp × 3)          1 (1 bp × 1)
tRNA-Trp         2 (1 bp × 2)                     2 (1 bp × 2)
tRNA-Tyr         1 (1 bp × 1)                     3 (1 bp × 3)
tRNA-Asp         1 (1 bp × 1)                     –
Total            9                                7

Control region   10 (1 bp × 7 + 3 bp × 1)         368 (1 bp × 5 + 23 bp × 1 + 340 bp × 1)
Total            74                               449
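Two figures quoted in the text can be recomputed from the overall Total row of Table 2: the share of substitutions that are transitions, and the divergence time implied by the Li et al. (1981) relation δ = (μ1 + μ2)t with the rates given in Section 2.3. The snippet below is a minimal worked example rather than the authors' own calculation; the variable names are ours, and the 10.3% overall difference is taken directly from the table.

```python
# Worked arithmetic from the overall "Total" row of Table 2 (tiger cymt vs. numt).
transitions = {"A-G": 262, "T-C": 448}
transversions = {"G-C": 17, "A-T": 29, "T-G": 20, "A-C": 27}

ts = sum(transitions.values())
tv = sum(transversions.values())
substitutions = ts + tv

ts_share = ts / substitutions   # fraction of substitutions that are transitions
ts_tv_ratio = ts / tv           # overall Ts/Tv ratio

# Divergence time from delta = (mu_cymt + mu_numt) * t  (Li et al., 1981).
delta = 0.103      # overall cymt/numt difference from Table 2 (10.3%)
mu_cymt = 2.5e-8   # substitutions/site/year, cymtDNA rate quoted in Section 2.3
mu_numt = 4.7e-9   # substitutions/site/year, nuclear pseudogene rate (Section 2.3)
t_years = delta / (mu_cymt + mu_numt)

print(f"Ts = {ts}, Tv = {tv}, Ts share = {ts_share:.1%}, Ts/Tv = {ts_tv_ratio:.1f}")
print(f"Estimated divergence time: {t_years / 1e6:.2f} million years")
```

With these inputs the estimate comes out at roughly 3.5 million years, in line with the date of around 3.45 MYA quoted in the Discussion; the small difference presumably reflects rounding of the distance.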

[ACACACGT] motif, and full sequences for 17 interspersed tRNAs (Fig. S1).

3.4. Numt and cymt sequence characterization in tiger

The nucleotide compositions of tiger numt and cymt sequences were similar: 32.31% A, 26.04% C, 15.13% G, and 26.38% T in numt compared with 32.34% A, 26.19% C, 15.10% G, and 26.32% T in cymt. Numt and cymt shared three different types of genes (rRNA, tRNA, and protein coding) plus the CR (Fig. S1). Markedly different patterns of sequence variation were observed between different numt and cymt genes, with sequence similarities ranging from 82% in ATP8 to 100% in three tRNAs (Table 2). Sequence variation between numt and cymt was due to both base-pair substitutions (n = 803) and indels (n = 523 bp). Most of the mutational changes between numt and cymt were transitions (710 / 803 = 88%), with the highest proportion of transitional changes occurring in the protein-coding genes (563 / 611 = 92%) and the lowest in RNAs (83 / 103 = 81%). Transitions from T to C were more common than from A to G.

To infer whether these genes retained function, sequences from the protein-coding genes of cymt and numt were translated into amino acids using the mitochondrial and universal genetic codes, respectively. All cymt protein-coding gene sequences could be translated into amino acid sequences, but in the numt sequences 32 extra stop codons were observed (Figs. S1 and S2). The variable sites between cymt and numt in protein-coding genes were not distributed evenly (Fig. S1A), suggesting that conserved segments may lie within the functional domains of the mtDNA proteins, which are subject to stronger evolutionary constraints. Likewise, in 12S there were 26 variable sites in the first half, from positions 1 to 530 bp, and no variable sites from positions 531 to 1027. In the 1575 bp fragment of 16S, 74 of 82 (90%) variable sites occurred in the first 520 bp (1–520 bp) and the third 500 bp (1040–1575 bp), compared with only 8 variable sites (less than 10%) in the middle (from 521 to 1039 bp) (Fig. S1B).

Seventeen tRNA genes were sequenced in both cymt and numt (Fig. S1C). Three tRNA genes (tRNA-Gln, -Pro, and -Val) had identical sequences in both cymt and numt. The number of variable sites in the other tRNA genes ranged from one in tRNA-Met to 12 in tRNA-Phe. Average percentage sequence similarity between cymt and numt was 95% in tRNA genes and 95.5% in rRNA genes (Table 2). Lower sequence similarity was observed for the protein-coding genes (90.9%) and the CR (91%; excluding the 186 bp gap of RS3 region).

3.5. Phylogenetic relationships of the Panthera numts

The phylogenetic relationships of the cymt and numt sequences in the five Panthera species were investigated using concatenated sequences (1206 bp) from three mitochondrial

Fig. 4. (A) Phylogenetic minimum evolution tree (Kimura two-parameter) of the five Panthera species cymts and numts (1206 bp concatenated sequences of the 16S, ND1, and ND2). The taxon abbreviation is as follows: Pti—tiger, Pun—snow leopard, Pon—jaguar, Ppa—leopard, Ple—lion, and Nne—clouded leopard (Neofelis nebulosa). The rooting of the tree was obtained with the slowest evolving mtDNA fragment (16S) to avoid long-branch attraction caused by the high rate of divergence of mtDNA. (B) Phylogenetic minimum evolution tree (Kimura two-parameter) illustrating the relationship between the domestic cat numt (Lopez et al., 1996) and the tiger numt (this study) (7683 bp alignment). The taxon abbreviation is as follows: Fca—domestic cat, Aju—cheetah (Acinonyx jubatus), Pti—tiger, and Cfa—dog (Canis familiaris). GenBank accession numbers are as follows: Fca-cymt (U20753); Fca-numt (U20754); Aju-cymt (NC_005212); and Cfa-cymt (NC_002008). Bootstrap values are placed at each branchpoint for the minimum evolution/maximum parsimony/maximum likelihood phylogenetic analyses, respectively (ME/MP/ ML).
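The minimum evolution trees in Fig. 4 rest on Kimura two-parameter distances. As a hedged illustration of what that distance measures (a minimal sketch, not the PAUP* implementation used in the study; the function name and the toy sequences are hypothetical), the distance between two aligned sequences can be computed from the proportions of transitional (P) and transversional (Q) differences as d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q):

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing anything other than A/C/G/T (e.g. gaps) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # transition: change within purines or within pyrimidines
        else:
            tv += 1  # transversion: change between a purine and a pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = ts / sites, tv / sites
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy alignment (made up): one transition (A/G) and one transversion (A/T) in 10 sites.
toy_a = "ACGTACGTAA"
toy_b = "GCGTACGTAT"
print(round(k2p_distance(toy_a, toy_b), 4))
```

Because transitions and transversions are corrected separately, this distance is a natural fit for mtDNA, where transitions dominate. For very divergent sequences the logarithm arguments can become non-positive, in which case the distance is undefined and `math.log` raises an error.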

genes, 16S (403 bp), ND1 (502 bp), and ND2 (301 bp) (Fig. 4A). The cymt/numt-specific amplification of these genes was facilitated by the 23 bp deletion of the 16S Panthera numt. Two distinct monophyletic clusters, with very strong bootstrap support, defined cymt and numt sequences (results were identical for the ME, MP and ML analyses, and for each of the single-gene analyses). Little internal structure among Panthera species was observed in either cymt or numt sequences. Cymt sequences showed a fivefold faster rate of divergence (average pairwise distance = 0.066 ± 0.006) compared to numts (0.013 ± 0.002) (see also Fig. 4A), similar to the pattern observed in Felis numt (Lopez et al., 1994). Additionally, the phylogenetic relationships between the domestic cat numt (Lopez et al., 1997) and the tiger numt (this study) clearly suggest that the two classes of numts within Felidae are distinct synapomorphies (Fig. 4B).

4. Discussion

4.1. Origin of the Panthera numt

An independent origin of the Panthera numt from that of the domestic cat (Lopez et al., 1994) is strongly supported by its distinct chromosomal location, size, contents, and structure. The numt location in all the Panthera species was mapped by FISH to chromosome F2 (Fig. 3A to E). However, the same probe produced a signal on chromosome D2 in the domestic cat (Fig. 3F), as previously described (Lopez et al., 1994). The tiger numt is considerably larger than the domestic cat's, with a single unit of 12.5 kb that includes genes from the middle of ND5 to part of the ATP8 subunit (Fig. 1). By contrast, the domestic cat numt has a unit of 7.9 kb (with genes from the middle of CR to COII) that is tandemly repeated in 38 to 76 copies, giving an overall integrated size of 300 to 600 kb (Lopez et al., 1994). To test for a tandem arrangement in tiger numt, we performed inverse PCR with several different primer sets. However, we did not observe any PCR products, which suggests that the Panthera numt is not tandemly repeated and is most likely a single segment on chromosome F2.

The phylogenetic analysis performed on cymt and numt sequences from the five extant Panthera species strongly supports a single origin for all these numts along the branch leading to the most recent common ancestor of the genus (Fig. 4A) and indicates that the domestic cat numt and the tiger numt lineages are distinct synapomorphies within the cat family (Fig. 4B). Using an overall genetic distance of 10.3% between tiger numt and cymt (Table 2), we estimate that numt and cymt began to diverge around 3.45 MYA, which would be consistent with the known evolutionary history of the Panthera lineage. Analyses of nuclear and mtDNA sequences across all felid species suggest that a common ancestor of the five species of roaring cats diverged from the clouded leopard 5.96 MYA and began to speciate into unique evolutionary lineages 3.47 MYA (O'Brien, 1996; Johnson and O'Brien, 1997; Johnson et al., in press). Overall, our results support the occurrence within the Felidae family of two independent translocations of cytoplasmic

mtDNA into the nuclear genome: one in the Panthera genus (around 3 MYA) and the other in the domestic cat lineage (around 1.8 MYA; Lopez et al., 1994). 4.2. Numt as a pseudogene: evolution and functional implications Once mtDNA fragments become incorporated into the nuclear genome, they immediately are exposed to different modes of evolution, which will influence the divergence patterns between the two sequences (Lopez et al., 1994, 1996, 1997). These include lower mutation rates due to nuclear DNA repair, a distinct genetic code, and the possibility of recombination. In addition, numts apparently evolve without the functional selective constraints as their mitochondrial counterparts (Gellissen et al., 1983; Perna and Kocher, 1996). The tiger cymt showed a high bias in transitions over transversions, a well-recognized characteristic of mtDNA (Brown et al., 1982) that was not observed for the numt sequence (nDNA). The phylogenetic analyses depict the more-rapid rate of cymt divergence among Panthera. This is caused by the higher mutation rate of mtDNA, particularly for protein-coding genes (Lopez et al., 1997). Genes within the tiger numt fragment have several characteristics that would preclude these sequences from producing functional gene products. First, in the protein coding genes of numt, there are often several termination codons or frame shift mutations in all possible open reading frames (Table 2; Fig. S1A), many of which were caused by differences in the genetic codes between the nucleus and mitochondria (Anderson et al., 1981; Brown, 1985). Second, the numt 16S has a large deletion (23 bp), which would appear to disrupt the normal secondary structure (Fig. S1B). Third, two regulatory elements (CSB 2 and 3) of the CR that are involved in transcriptional promotion catalyzed by mitochondrial RNA polymerase and trans-activating factors do not function in nuclear genes (Schinkel and Tabak, 1989). 
The numt CR also lacks most of the repetitive segment three (RS-3), which is involved in mtDNA replication and transcription (Fig. S1D). The importance of mtDNA CR in the nuclear genome is at least in part dependent on the presence of promoter regions and functional sequences, because as far as is known, the CR is only functional with its promoter and several protein-binding sites (Chang and Clayton, 1985). Due to the large deletions of the hypervariable segment one (HVS-1) and RS-3, the numt CR sequence is presumably not functional. Fourth, all of the cymt tRNA sequences formed typical cloverleaf shapes of class 1 tRNAs (Lewin, 1994). However, some numt tRNAs, for example tRNA-Thr and tRNA-Tyr, formed imperfect shapes due to several unpaired free bases that likely cause loss of function (Fig. 5).

The differing degrees of similarity among tiger cymt and numt genes, specifically the highly conserved rRNAs or invariant tRNA genes contrasted with the more-divergent protein-coding genes and the CR (Table 2; Fig. S1C and D), highlight the differential rates of nucleotide substitution among mitochondrial genes relative to their homologous numt molecular “fossils.” In the mammalian mitochondria, the

average nucleotide divergence is much lower in rRNA genes relative to protein-coding genes or the CR (Lopez et al., 1997). The translocation of genes from organelle to nucleus with maintenance of their function has occurred numerous times in evolutionary history, contributing to the compact and economical mitochondrial genomes observed today (Perna and Kocher, 1996). The mammalian mitochondrial genome of 15,000–17,000 bp and 37 coding genes contrasts with the hundreds of nuclear genes that function in the mitochondria, such as nuclear-encoded members of the citric acid cycle, cytochrome chain, and oxidative phosphorylation pathways. As with numt, these nuclear genes, following the Serial Endosymbiosis Theory (Margulis, 1970; Yang et al., 1985), are thought to have originated from the transfer of mtDNA genes to the nucleus, with subsequent duplication and divergence. A reduction in the accumulation of deleterious mutations is a prime benefit for cymt genes that are subsequently relocated to the nuclear genome (where DNA repair is more efficient). However, functional gene transfers have been documented almost exclusively in plants (e.g., Adams et al., 2002) and green algae (e.g., Perez-Martinez et al., 2000; Funes et al., 2002), suggesting that in animals, where the mitochondrial genetic code differs from the standard

code (Wolstenholme, 1992), most numts are nonfunctional upon arrival.

4.3. The mtDNA as a reliable molecular marker

The maternal inheritance, cellular abundance, and lack of recombination of the mtDNA have allowed biologists to study the phylogeny of many metazoan animals. However, the presence of mitochondrial-like DNA sequences in the nuclear genome of many organisms, and their amplification or coamplification during PCR, is a recognized complication (Perna and Kocher, 1996; Zhang and Hewitt, 1996a). Because nuclear insertions are paralogs of the authentic mitochondrial sequences, they will confound phylogenetic and population genetic analyses when inadvertently included, especially when using more slowly evolving segments (Arctander, 1995; Collura and Stewart, 1995; Vanderkuyl et al., 1995; Zhang and Hewitt, 1996b). Mitochondrial-like sequences in the nuclear genome can negate the advantages of mtDNA as a molecular marker in population studies. The occurrence of numt, as with sequence heteroplasmy, necessitates more-complicated data collection and analysis and in some species, like gorillas that have a large variety of numt

Fig. 5. Proposed secondary structure for tRNA-Thr and tRNA-Tyr based on DNA sequence data from the tiger cymt and numt. Dots represent Watson–Crick bonds. Red circles indicate that the nucleotide is variable between cymt and numt. Numbers represent the direction of the sequences from 5′ to 3′.


sequences bearing high similarity to cymtDNA, can make analysis of mtDNA impractical (Thalmann et al., 2004). One implication is that explicit measures need to be taken to authenticate the mtDNA sequences generated. Previously reported mtDNA tiger sequences have been incorrectly labeled (Fig. S3). In some cases, the reported gene sequences were mixed sequences of cymt and numt (Masuda et al., 1994; Ledje and Arnason, 1996). In another case, sequences were preferentially collected from nuclear copies (Johnson and O'Brien, 1997). The full sequence for both tiger cymt and numt is presented here, providing a valuable contribution for research in felids. Such data have greatly facilitated the validation of the matrilineal genealogy of current tiger subspecies (Luo et al., 2004) and certainly will be highly useful for research on the other closely related Panthera species. Refined, accurate population genetic inferences will contribute effectively to the conservation and management of these endangered cat species.

The relative scarcity of numts described in Felidae species to date contrasts with the high frequency of numts observed in primates, particularly in humans, as revealed by the human genome database (e.g., Mourier et al., 2001; Tourmen et al., 2002; Woischnik and Moraes, 2002). The prevalence of reported numts varies widely among metazoans (reviewed in Bensasson et al., 2001), with human and plant genomes harboring the largest numt repertoires (Richly and Leister, 2004). The cat genome project, which was recently included in the Large-Scale Sequencing Research Network, will facilitate more detailed evaluation of the dynamics and extent of numt insertions in this Felidae species.

Acknowledgment

A. Antunes was supported in part by a Postdoctoral grant (SFRH/BPD/5700/2001) from the Portuguese Foundation for Science and Technology (Fundação para a Ciência e a Tecnologia).
This publication has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. NO1-CO12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. Comments made by two anonymous referees improved a previous version of this manuscript.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.gene.2005.08.023.

References

Adams, K.L., Qiu, Y.L., Stoutemyer, M., Palmer, J.D., 2002. Punctuated evolution of mitochondrial gene content: high and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution. Proc. Natl. Acad. Sci. U. S. A. 99 (15), 9905–9912.


Anderson, S., et al., 1981. Sequence and organization of the human mitochondrial genome. Nature 290 (5806), 457–465.
Arctander, P., 1995. Comparison of a mitochondrial gene and a corresponding nuclear pseudogene. Proc. R. Soc. Lond., B 262 (1363), 13–19.
Beckman, K., Smith, M., Orrego, C., 1993. Purification of mitochondrial DNA with Wizard Minipreps DNA Purification System. Promega Note 43, 10–13.
Bensasson, D., Zhang, D.X., Hartl, D.L., Hewitt, G.M., 2001. Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends Ecol. Evol. 16 (6), 314–321.
Bensasson, D., Feldman, M.W., Petrov, D.A., 2003. Rates of DNA duplication and mitochondrial DNA insertion in the human genome. J. Mol. Evol. 57 (3), 343–354.
Bernatchez, L., Dodson, J.J., 1990. Allopatric origin of sympatric populations of lake whitefish (Coregonus-Clupeaformis) as revealed by mitochondrial-DNA restriction analysis. Evolution 44 (5), 1263–1271.
Brigati, D.J., et al., 1983. Detection of viral genomes in cultured-cells and paraffin-embedded tissue-sections using biotin-labeled hybridization probes. Virology 126 (1), 32–50.
Brown, W., 1985. The Mitochondrial Genome of Animals. In: Macintyre, R.J. (Ed.), Molecular Evolutionary Genetics. Plenum Press, New York, pp. 95–130.
Brown, W.M., Prager, E.M., Wang, A., Wilson, A.C., 1982. Mitochondrial-DNA sequences of primates: tempo and mode of evolution. J. Mol. Evol. 18 (4), 225–239.
Chang, D.D., Clayton, D.A., 1985. Priming of human mitochondrial-DNA replication occurs at the light-strand promoter. Proc. Natl. Acad. Sci. U. S. A. 82 (2), 351–355.
Collura, R.V., Stewart, C.B., 1995. Insertions and duplications of mtDNA in the nuclear genomes of Old-World monkeys and hominoids. Nature 378 (6556), 485–489.
Cracraft, J., Felsenstein, J., Vaughn, J., Helm-Bychowski, K., 1998. Sorting out tigers (Panthera tigris): mitochondrial sequences, nuclear inserts, systematics, and conservation genetics. Anim. Conserv. 1, 139–150.
Engels, B., 1997. Amplify: Software for PCR, version 2.53B. University of Wisconsin.
Felsenstein, J., 1985. Confidence-limits on phylogenies: an approach using the bootstrap. Evolution 39 (4), 783–791.
Funes, S., et al., 2002. The typically mitochondrial DNA-encoded ATP6 subunit of the F1F0-ATPase is encoded by a nuclear gene in Chlamydomonas reinhardtii. J. Biol. Chem. 277 (8), 6051–6058.
Gellissen, G., Bradfield, J.Y., White, B.N., Wyatt, G.R., 1983. Mitochondrial-DNA sequences in the nuclear genome of a locust. Nature 301 (5901), 631–634.
Hasegawa, M., Kishino, H., Yano, T., 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22 (2), 160–174.
Hazkani-Covo, E., Sorek, R., Graur, D., 2003. Evolutionary dynamics of large numts in the human genome: rarity of independent insertions and abundance of post-insertion duplications. J. Mol. Evol. 56 (2), 169–174.
Henze, K., Martin, W., 2001. How do mitochondrial genes get into the nucleus? Trends Genet. 17 (7), 383–387.
Herrnstadt, C., et al., 1999. A novel mitochondrial DNA-like sequence in the human nuclear genome. Genomics 60 (1), 67–77.
Hillis, D.M., Bull, J.J., 1993. An empirical-test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42 (2), 182–192.
Johnson, W.E., O'Brien, S.J., 1997. Phylogenetic reconstruction of the Felidae using 16S rRNA and NADH-5 mitochondrial genes. J. Mol. Evol. 44 (Suppl 1), S98–S116.
Johnson, W.E., Dratch, P.A., Martenson, J.S., O'Brien, S.J., 1996. Resolution of recent radiations within three evolutionary lineages of Felidae using mitochondrial restriction fragment length polymorphism variation. J. Mamm. Evol. 3 (2), 97–120.
Johnson, W.E., Culver, M., Iriarte, J.A., Eizirik, E., Seymour, K., O'Brien, S.J., 1998. Tracking the elusive Andean mountain cat (Oreailurus jacobita) from mitochondrial DNA. J. Heredity 89, 227–232.
Johnson, W.E., Eizirik, E., Murphy, W.J., Pecon-Slattery, J., Antunes, A., Teeling, E., O'Brien, S.J., in press. The Late Miocene radiation of modern Felidae: a genetic assessment. Science.


Kocher, T.D., et al., 1989. Dynamics of mitochondrial-DNA evolution in animals: amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. U. S. A. 86 (16), 6196–6200.
Kumar, S., Tamura, K., Jakobsen, I.B., Nei, M., 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17 (12), 1244–1245.
Ledje, C., Arnason, U., 1996. Phylogenetic relationships within caniform carnivores based on analyses of the mitochondrial 12S rRNA gene. J. Mol. Evol. 43 (6), 641–649.
Lewin, B., 1994. Genes V. Oxford University Press Inc., New York.
Li, W.H., Gojobori, T., Nei, M., 1981. Pseudogenes as a paradigm of neutral evolution. Nature 292 (5820), 237–239.
Lichter, P., et al., 1990. High-resolution mapping of human chromosome-11 by in situ hybridization with cosmid clones. Science 247 (4938), 64–69.
Lopez, J.V., Cevario, S., O'Brien, S.J., 1996. Complete nucleotide sequences of the domestic cat (Felis catus) mitochondrial genome and a transposed mtDNA tandem repeat (Numt) in the nuclear genome. Genomics 33 (2), 229–246.
Lopez, J.V., Culver, M., Stephens, J.C., Johnson, W.E., O'Brien, S.J., 1997. Rates of nuclear and cytoplasmic mitochondrial DNA sequence divergence in mammals. Mol. Biol. Evol. 14 (3), 277–286.
Lopez, J.V., Yuhki, N., Masuda, R., Modi, W., O'Brien, S.J., 1994. Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. J. Mol. Evol. 39 (2), 174–190.
Luo, S.-J., et al., 2004. Phylogeography and genetic ancestry of tigers (Panthera tigris). PLoS Biol. 2 (12), e442.
Margulis, L., 1970. Origin of Eukaryotic Cells. Yale University Press, New Haven, CT.
Masuda, R., Yoshida, M.C., Shinyashiki, F., Bando, G., 1994. Molecular phylogenetic status of the Iriomote cat Felis iriomotensis, inferred from mitochondrial DNA sequence analysis. Zool. Sci. 11 (4), 597–604.
Mishmar, D., Ruiz-Pesini, E., Brandon, M., Wallace, D.C., 2004. Mitochondrial DNA-like sequences in the nucleus (NUMTs): insights into our African origins and the mechanism of foreign DNA integration. Human Mutat. 23 (2), 125–133.
Modi, W.S., Nash, W.G., Ferrari, A.N., O'Brien, S.J., 1987. Cytogenetic methodologies for gene mapping and comparative analyses in mammalian cell culture systems. Gene Anal. Tech. 4 (4), 75–85.
Mourier, T., Hansen, A.J., Willerslev, E., Arctander, P., 2001. The human genome project reveals a continuous transfer of large mitochondrial fragments to the nucleus. Mol. Biol. Evol. 18 (9), 1833–1837.
O'Brien, S.J., 1996. Molecular Genetics and Phylogenetics of the Felidae. In: Nowell, K., Jackson, P. (Eds.), Status Survey and Conservation Action Plan: Wild Cats. IUCN, p. XXIII–XXIV.
O'Brien, S.J., Eizirik, E., Murphy, W.J., 2001. Genomics: on choosing mammalian genomes for sequencing. Science 292 (5525), 2264–2266.
Pereira, S.L., Baker, A.J., 2004. Low number of mitochondrial pseudogenes in the chicken (Gallus gallus) nuclear genome: implications for molecular inference of population history and phylogenetics. BMC Evol. Biol. 4 (1), 17.
Perez-Martinez, X., et al., 2000. Unusual location of a mitochondrial gene: subunit III of cytochrome c oxidase is encoded in the nucleus of chlamydomonad algae. J. Biol. Chem. 275 (39), 30144–30152.
Perna, N.T., Kocher, T.D., 1996. Mitochondrial DNA: molecular fossils in the nucleus. Curr. Biol. 6 (2), 128–129.
Pons, J., Vogler, A.P., 2005. Complex pattern of coalescence and fast evolution of a mitochondrial rRNA pseudogene in a recent radiation of tiger beetles. Mol. Biol. Evol. 22 (4), 991–1000.
Ricchetti, M., Tekaia, F., Dujon, B., 2004. Continued colonization of the human genome by mitochondrial DNA. PLoS Biol. 2 (9), 1313–1324.
Richly, E., Leister, D., 2004. NUMTs in sequenced eukaryotic genomes. Mol. Biol. Evol. 21 (6), 1081–1084.
Roth, D.B., Porter, T.N., Wilson, J.H., 1985. Mechanisms of nonhomologous recombination in mammalian-cells. Mol. Cell. Biol. 5 (10), 2599–2607.
Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
Sambrook, J., Fritsch, E., Maniatis, T., 1989. Molecular Cloning: a Laboratory Manual. Cold Spring Harbor Laboratory Press, New York, NY.
Schinkel, A.H., Tabak, H.F., 1989. Mitochondrial RNA-polymerase: dual role in transcription and replication. Trends Genet. 5 (5), 149–154.
Strimmer, K., von Haeseler, A., 1996. Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13 (7), 964–969.
Swofford, D.L., 2001. 'PAUP* Phylogenetic Analysis Using Parsimony and Other Methods' Computer Program. Sinauer, Sunderland, MA.
Thalmann, O., Hebler, J., Poinar, H.N., Paabo, S., Vigilant, L., 2004. Unreliable mtDNA data due to nuclear insertions: a cautionary tale from analysis of humans and other great apes. Mol. Ecol. 13 (2), 321–335.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL-X Windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Tourmen, Y., et al., 2002. Structure and chromosomal distribution of human mitochondrial pseudogenes. Genomics 80 (1), 71–77.
Vanderkuyl, A.C., Kuiken, C.L., Dekker, J.T., Perizonius, W.R.K., Goudsmit, J., 1995. Nuclear counterparts of the cytoplasmic mitochondrial 12S ribosomal-RNA gene: a problem of ancient DNA and molecular phylogenies. J. Mol. Evol. 40 (6), 652–657.
Woischnik, M., Moraes, C.T., 2002. Pattern of organization of human mitochondrial pseudogenes in the nuclear genome. Genome Res. 12 (6), 885–893.
Wolstenholme, D.R., 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr. Opin. Genet. Dev. 2 (6), 918–925.
Yang, Z., 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comp. Appl. Biosci. 13, 555–556.
Yang, Z.H., Kumar, S., 1996. Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites. Mol. Biol. Evol. 13 (5), 650–659.
Yang, D., Oyaizu, Y., Oyaizu, H., Olsen, G.J., Woese, C.R., 1985. Mitochondrial origins. Proc. Natl. Acad. Sci. U. S. A. 82 (13), 4443–4447.
Zhang, D.-X., Hewitt, G.M., 1996a. Nuclear integrations: challenges for mitochondrial DNA markers. Trends Ecol. Evol. 11 (6), 247–251.
Zhang, D.X., Hewitt, G.M., 1996b. Highly conserved nuclear copies of the mitochondrial control region in the desert locust Schistocerca gregaria: some implications for population studies. Mol. Ecol. 5 (2), 295–300.
Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.