Molecular Ecology (2007) 16, 2627–2637
doi: 10.1111/j.1365-294X.2007.03303.x
Molecular evidence of host-associated genetic divergence in the holly leafminer Phytomyza glabricola (Diptera: Agromyzidae): apparent discordance among marker systems
SONJA J. SCHEFFER* and DAVID J. HAWTHORNE†
*Systematic Entomology Laboratory, Agricultural Research Service, US Department of Agriculture, Bldg. 005, Rm. 137, BARC-W, 10300 Baltimore Av., Beltsville MD, 20705, USA, †Department of Entomology, University of Maryland, 4112 Plant Sciences Building, College Park, MD 20742, USA
Abstract
Host races play a central part in understanding the role of host-plant-mediated divergence and speciation of phytophagous insects. Of greatest interest are host-associated populations that have recently diverged; however, finding genetic evidence for very recent divergences is difficult because initially only a few loci are expected to evolve diagnostic differences. The holly leafminer Phytomyza glabricola feeds on two hollies, Ilex glabra and I. coriacea, that are broadly sympatric throughout most of their ranges. The leafminer is often present on both host plants and exhibits a dramatic life history difference on the two hosts, suggesting that host races may be present. We collected 1393 bp of mitochondrial cytochrome oxidase I (COI) sequence and amplified fragment length polymorphism (AFLP) data (45 polymorphic bands) from sympatric populations of flies reared from the two hosts. Phylogenetic and frequency analysis of mitochondrial COI sequence data uncovered considerable variation but no structuring by host plant, and only limited differentiation among geographical locations. In contrast, analysis of AFLP frequency data found a significant effect of host plant, and a much smaller effect of geographical location. Likewise, neighbour-joining analysis of AFLP data resulted in clustering by host plant. The AFLP data indicate that P. glabricola most likely comprises two host races. Because there were no fixed differences in mitochondrial or AFLP data, this host-associated divergence is likely to have occurred very recently. P. glabricola therefore provides a new sympatric system for exploring the role of geography and ecological specialization in the speciation of phytophagous insects.
Keywords: AFLP, host race formation, mitochondrial data, nuclear data, speciation, sympatry
Received 29 September 2006; revision accepted 22 January 2007
Correspondence: Sonja J. Scheffer, Fax: (301) 504-6482; E-mail: [email protected]
© 2007 The Authors Journal compilation © 2007 Blackwell Publishing Ltd

Introduction
Plant-feeding insects are routinely heralded as an extraordinarily diverse group whose diversity is associated with the phytophagous habit (Ehrlich & Raven 1964; Strong et al. 1984; Mitter et al. 1988; Farrell 1998). Many phytophagous insects are highly specialized in their feeding and mate in association with their host plant; these observations led to the suggestion that sympatric speciation mediated by changes in host use may contribute to the generation of diversity in some groups (Bush 1969, 1975; Berlocher & Feder 2002). The key stage under a sympatric speciation scenario is the origin of host races while in sympatry. However, before the question of the origin of host races is addressed, the initial step in the study of host races is the documentation that what appears to be a single oligophagous species is actually composed of host-plant-specialized subpopulations or races (Dres & Mallet 2002). Once identified, analysis of recently diverged host races allows focused investigation of the behavioural, genetic, and ecological features that are associated with divergence and could potentially lead to speciation (e.g. Craig et al. 1993, 2001; Feder et al. 1994, 1997; Hawthorne & Via 2001).
Host races reflect an intermediate level of divergence and reduction in gene flow, mediated through resource use. They occupy a position along a trajectory between a single oligophagous and panmictic population and reproductively isolated, host-specific species (Dres & Mallet 2002). As such, they provide an excellent opportunity to study the processes by which populations traverse that same spectrum during divergence and speciation. By focusing on the processes that operate during speciation (which may or may not be ultimately attained), this approach provides an important complement to comparative analysis of sister taxa, which may carry differences accumulated both during and subsequent to the divergence.
One challenge to the study of host races is documenting the existence of genetically diverged subpopulations within what otherwise appears to be a single species. During the initial stages of ecologically-based diversification, host races will differ primarily at nuclear loci that directly contribute to the divergence or that are genetically linked to those loci (Ting et al. 2000). Other loci, neutral with respect to host-use-related traits, will remain undifferentiated with respect to host race, with the exception of those loci linked to the selected loci (reviewed by Storz 2005). In fact, these neutral loci may exhibit ancestral polymorphisms that are shared by the host races, potentially complicating their detection. It is only after relatively long periods of time have passed that lineage sorting, genetic drift, and selection on additional traits will cause loci that are not associated with divergence to become fixed or diagnostic for each host race.
Gene flow between the diverging populations further complicates the use of molecular markers to detect host races, by homogenizing frequencies of neutral alleles and contributing to the erosion of linkage disequilibrium between selected and linked neutral loci (Maynard Smith & Haigh 1974; Barton 2000; Miller & Hawthorne 2005).
Mitochondrial genes are widely used to study population differentiation at and below the species level (Avise 2000). Mitochondrial genes have a smaller effective population size than do nuclear genes and, when comparing neutral loci, are expected to track differentiating populations more rapidly (Moore 1995; Palumbi et al. 2001). Most commonly, monophyly of genetically-based relationships among biologically or ecologically divergent populations is viewed as strong evidence for: (i) genetic population divergence; and (ii) reproductive isolation. Although some authors believe that mitochondrial monophyly is a necessary criterion for the genetic identification of biologically/taxonomically ‘significant units’ (e.g. Zink 2004), there is no reason to believe that monophyly of any particular marker will unmistakably mirror the development of biologically significant divergences, especially during the early stages of divergence. For example, a number of processes such as introgression, lineage sorting, and selective sweeps can cause variation in mitochondrial data to inaccurately reflect general population differentiation (Avise 2000; Ballard et al. 2002; Funk & Omland 2003). Even populations experiencing complete reproductive isolation can exhibit periods of molecular polyphyly and paraphyly as they move towards reciprocal monophyly (Neigel & Avise 1986; Avise et al. 1990; Rosenberg 2003; Omland et al. 2006).
In this study, we set out to explore patterns of molecular variation at mitochondrial and nuclear loci in a leafmining insect suspected of harbouring recently diverged host races (Scheffer 2002). Phytomyza is a large genus (> 500 species) composed mostly of monophagous leafminers (Spencer & Steyskal 1986; Spencer 1990). Phytomyza glabricola Kulp belongs to a radiation of 14 closely-related species, most of which feed on hollies in the genus Ilex (Aquifoliaceae) (Kulp 1968; Scheffer & Wiegmann 2000; S. J. Scheffer, unpublished). Most of these holly leafminer species feed on only a single species of Ilex, even though various combinations of up to six holly hosts (with their holly leafminers) regularly grow syntopically (S. J. Scheffer, personal observation).
In contrast to the single host-use exhibited by most of the holly leafminers, P. glabricola feeds on two hollies, Ilex glabra (Linnaeus) Gray and I. coriacea (Pursh) Chapman, throughout the southeastern coastal plain of the United States (Scheffer 2002). These two native plant species are almost entirely sympatric (and often grow interspersed in the same fields or forests), with the exception that the range of I. glabra extends further south into southern Florida and further north along the northeastern coast to Canada (Fig. 1). Although P. glabricola and its two host plants, I. glabra and I. coriacea, are sympatric with several other holly (Ilex) species, P. glabricola has never been recorded for any additional plant species.
Phytomyza glabricola adults associated with the two plant hosts do not appear to differ in either external or genitalic morphological characters (cursory inspection by Scheffer). In contrast, the larvae feeding on each of the two hosts exhibit dramatic differences in the duration of the larval developmental stage (Scheffer 2002). When feeding on I. glabra, P. glabricola has a larval development time of approximately 2–4 weeks and is multivoltine (Kulp 1968; Al-Siyabi & Shetlar 1998). When feeding on I. coriacea, P. glabricola is univoltine with a larval development time of approximately 9–10 months (Scheffer 2002; personal observation). Although it is possible that the nine-month difference in larval development time on the two plant species is due to host-associated phenotypic plasticity, this difference could also be indicative of the presence of sympatric host races (Scheffer 2002). In fact, none of the other 13 holly leafmining species exhibit both life history patterns (S. J. Scheffer, unpublished), nor has intraspecific variation of this sort been reported for other agromyzid species.
The purpose of this study was to determine where P. glabricola from I. glabra and I. coriacea is positioned along the continuum between a single panmictic species that feeds on two hosts and genetically differentiated host races, or possibly even species. To accomplish this we sampled flies from sympatric populations of both hosts in each of two locations and looked for host-plant associated genetic structure. We used two independent molecular marker systems to assess patterns of variation: mitochondrial DNA (mtDNA) sequence data and amplified fragment length polymorphism (AFLP) data (Vos et al. 1995). In addition, to more fully explore the patterns of variation that we found, we collected mitochondrial sequence data from flies obtained from I. glabra at additional sites located outside of the range of I. coriacea.

Fig. 1 Geographical ranges of host plants I. glabra and I. coriacea and collection sites of P. glabricola leafminers.

Materials and methods
Fly samples were obtained from Carolina Beach, North Carolina (NC), and from the Francis Marion National Forest in South Carolina (SC). These locations are approximately 300 miles apart (Fig. 1). P. glabricola is readily found at each location on both I. glabra and I. coriacea, which grow interspersed throughout the collecting areas. Several hundred leaves containing well-developed mines were removed from dozens of the two host plants growing interspersed within each of the ∼10 ha collecting sites. Within each collecting site, mined leaves were pooled across host plant individuals and were placed in plastic bags according to host plant species. Pupae were dissected from the leafmines and placed individually in 0.5 mL Eppendorf vials and stored in a moist chamber until the emergence of adults. Following eclosion from puparia, adult flies were placed in 95% ethanol and stored at −80°C. Sixteen adults from each host plant at each location were used in the study for a total of 64 flies from sympatric populations (32 from NC and 32 from SC). A total of 13 specimens were also sampled from natural allopatric populations of P. glabricola on I. glabra in New York (NY; Swan Pond, Long Island), Maryland (MD; near Annapolis) and central Florida (FL; Archbold Biological Station) (Fig. 1). The NY and MD samples were from extremely small populations and might represent siblings, while the FL samples (like the NC and SC samples) were from a large population and probably did not represent siblings. Total nucleic acids were extracted from individual fly specimens following the animal tissue protocol from the Qiagen DNeasy kit (Qiagen, Valencia, California).

Mitochondrial COI sequences
DNA sequence data from the mitochondrial cytochrome oxidase I region (COI) were obtained using the methods described by Scheffer & Wiegmann (2000). Briefly, a 1478 bp fragment of COI was amplified using the primers C1-J-1535 (ATTGGAACTTTATATTTTATATTTGG) (Scheffer & Wiegmann 2000) and TL2-N-3014 (TCCATTGCACTAATCTGCCATATTA) (Simon et al. 1994). The polymerase chain reaction (PCR) was carried out using a Mastercycler Gradient (Eppendorf Scientific, Westbury, New York) following a standard amplification program with an annealing temperature of 48°C. Sequencing reactions were carried out using Big Dye Terminator sequencing kits (Perkin-Elmer Applied Biosystems, Foster City, California) and electrophoresed on an ABI 377 automated DNA sequencer. All sequences were confirmed by sequencing complementary strands [see Scheffer & Wiegmann (2000) for the full list of primer sequences]. Sequencher (Gene Codes Corp., Ann Arbor, Michigan) was used to assemble contigs and to align the final consensus sequences.
Data analysis. The final dataset of aligned COI sequences was analysed in two ways. Arlequin (Schneider et al. 2000) was used to conduct an analysis of molecular variance (AMOVA) to test for significant population structure by host plant or by location. Parsimony analysis of this dataset was conducted using the heuristic search feature of PAUP* 4.0b10 (Swofford 2001) with 100 random addition replicates and the MaxTrees option set to 1000.
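As a rough illustration of the variance partition that an analysis of molecular variance reports, the sketch below converts sums of squares into variance components for a balanced design with one grouping factor. It is not the Arlequin implementation: the function name `variance_components` and its arguments are hypothetical, equal group sizes are assumed, and the input numbers are the mtDNA among-host and within-host rows of Table 2.

```python
def variance_components(ss_among, df_among, ss_within, df_within, n_per_group):
    """Variance components for a balanced one-factor design (hypothetical helper).

    Assumes every group contains n_per_group individuals, so the among-group
    component is (MS_among - MS_within) / n_per_group.
    """
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    var_within = ms_within
    var_among = (ms_among - ms_within) / n_per_group
    total = var_among + var_within
    return {
        "var_among": var_among,
        "var_within": var_within,
        "pct_among": 100.0 * var_among / total,
        "pct_within": 100.0 * var_within / total,
    }


# mtDNA, among vs. within hosts (Table 2): 2 hosts x 32 flies each
mtdna_hosts = variance_components(7.547, 1, 259.219, 62, n_per_group=32)
for name, value in mtdna_hosts.items():
    print(f"{name}: {value:.5f}")
```

With these inputs the among-host component comes out close to the tabulated 0.10519 (about 2.45% of the total). Significance in an AMOVA is assessed by permutation, which this sketch does not attempt.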
AFLPs
AFLPs were constructed from fragments created by two six-cutters, PstI and EcoRI. AFLP restriction-ligation and amplification were done as described in Hawthorne (2001). Following electrophoresis on a vertical slab gel, the amplified fragments were silver stained, visualized on a light box, recorded as present or absent, and digitally archived by scanning the gel directly.
Data analysis. To uncover large-scale patterns of host-related differentiation, the individual AFLP loci were analysed in aggregate in an AMOVA framework and following construction of a phenogram. To uncover possible signatures of divergent selection, AFLP loci were next analysed individually to determine their contributions to the larger patterns.
Host-related differentiation. AMOVA was performed in Arlequin (Schneider et al. 2000) to test for genetic structure due to host plant or collection site location. The data for each individual were analysed in the framework of a multilocus genotype composed of restriction site polymorphisms. Genetic variation was partitioned within and among host plants and also among collection sites. Significance of that variation was tested using empirically derived sampling distributions in Arlequin. To observe patterns of host use in the context of genetic similarity, a distance tree was constructed for the leafminers using the neighbour-joining algorithm of PAUP*.
Signature of divergent selection. The logistic procedure of SAS (SAS 1999) was used to test for significant differences in frequencies of bands across hosts and across locations. This procedure adds effects to a logistic model in a stepwise fashion until no additional effects meet the α = 0.05 level for entry into the model. G-tests were then performed on individual AFLP loci to test the hypothesis that band frequencies did not differ among host plants or locations. Significance levels were adjusted to correct for multiple tests (Sokal & Rohlf 1981).
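A test of independence on a single band can be sketched as follows, using the counts listed for band 1-Pag/Ect in Table 3 (17 present/15 absent on I. glabra, 0 present/32 absent on I. coriacea). This is a minimal sketch, not the authors' procedure: the function names are hypothetical, no continuity correction is applied, and a simple Bonferroni division of α by the number of tests is assumed as the multiple-test adjustment. Both the log-likelihood G statistic and the Pearson χ2 statistic are computed, since the table column is headed χ2.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence for a list-of-rows table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return [[rs * cs / total for cs in col_sums] for rs in row_sums]


def g_statistic(table):
    """Log-likelihood ratio statistic G = 2 * sum(O * ln(O / E)); empty cells add 0."""
    g = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            if o > 0:
                g += o * math.log(o / e)
    return 2.0 * g


def pearson_chi2(table):
    """Pearson statistic sum((O - E)^2 / E); cells with E == 0 are skipped."""
    stat = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            if e > 0:
                stat += (o - e) ** 2 / e
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variate with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


# Rows: host plant; columns: band present, band absent (band 1-Pag/Ect, Table 3)
band_1 = [[17, 15], [0, 32]]
g_value = g_statistic(band_1)
chi2_value = pearson_chi2(band_1)

n_tests = 45
alpha_per_test = 0.05 / n_tests  # Bonferroni-style threshold (assumed here)
significant = p_value_df1(g_value) < alpha_per_test

print(f"G = {g_value:.2f}, Pearson chi2 = {chi2_value:.2f}, "
      f"per-test alpha = {alpha_per_test:.5f}, significant = {significant}")
```

For this band the Pearson form gives roughly 23.15, the value tabulated for band 1, while the log-likelihood G is somewhat larger; either statistic falls well beyond the per-test threshold.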
In a second locus-specific analysis, we compared the degree of genetic divergence at each locus among samples from different locations or host plants to identify highly differentiated outlier loci (Beaumont & Nichols 1996). The frequency of the band-absent allele was estimated from the band-presence/absence matrix as q = (c/n)^0.5, where n = sample size (16 or 32 in this experiment) and c = the number of individuals with the band-absent genotype. This estimation assumes that the populations are in Hardy–Weinberg genotypic frequencies and that the band-present allele is dominant to the band-absent allele. FST was calculated from the AFLP data using the methods of Nei & Chesser (1983). Using the approach of Bowcock et al. (1991) and Beaumont & Nichols (1996) we simulated an expected distribution of genetic differentiation (measured by FST) across loci given the average divergence estimated with our data. We used the software FDIST2 (Beaumont & Nichols 1996) to simulate the neutral divergence of many loci among a series of 100 demes. FDIST2 calculates θ, Weir & Cockerham’s (1984) estimator of genetic differentiation among populations (FST), for many loci. Coalescent simulations are performed to generate data sets with a distribution of θ centred on the empirical estimates. The simulation generates a distribution of divergences for the n samples across many loci due to drift, migration and mutation. We then determined the quantiles of the simulated FST distribution within which the observed FST values fell. Because the simulated divergence is due to drift, migration and mutation alone, an observed FST above the 95th percentile of the distribution indicates a significantly greater degree of divergence than expected under neutrality, most simply explained by divergent selection. Beaumont & Nichols (1996) and Beaumont & Balding (2004) have shown that the simulated distributions of θ are highly robust to variation in parameters of the simulation model. We compared samples of insects from the two host plants at each of the two locations to determine if loci were divergent due to host-plant-mediated effects. Next, we considered genetic differentiation between samples from different sites but on a shared host, to determine if loci showed location-associated differentiation.
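The allele-frequency step described above can be sketched in a few lines. The code below estimates q = (c/n)^0.5 for a dominant marker and then computes a basic per-locus FST as (HT − HS)/HT from expected heterozygosities. This is an illustration under stated assumptions only: the function names are hypothetical, the FST shown is the simple uncorrected form rather than the sample-size-corrected estimator of Nei & Chesser (1983), and none of the FDIST2 coalescent simulation is reproduced. The example counts are the band-absent counts for band 1-Pag/Ect in Table 3.

```python
import math


def absent_allele_freq(n_absent, n_total):
    """q = sqrt(c / n): band-absent allele frequency, assuming Hardy-Weinberg
    proportions and dominance of the band-present allele."""
    return math.sqrt(n_absent / n_total)


def simple_fst(q_values):
    """Uncorrected per-locus FST = (HT - HS) / HT for a biallelic locus.

    HS is the mean expected heterozygosity within samples and HT the expected
    heterozygosity at the mean allele frequency (equal weighting of samples).
    """
    hs = sum(2.0 * q * (1.0 - q) for q in q_values) / len(q_values)
    q_bar = sum(q_values) / len(q_values)
    ht = 2.0 * q_bar * (1.0 - q_bar)
    if ht == 0.0:
        return 0.0  # locus is monomorphic across all samples
    return (ht - hs) / ht


# Band 1-Pag/Ect: 15 of 32 flies lack the band on I. glabra, 32 of 32 on I. coriacea
q_glabra = absent_allele_freq(15, 32)
q_coriacea = absent_allele_freq(32, 32)
fst_band_1 = simple_fst([q_glabra, q_coriacea])

print(f"q (I. glabra) = {q_glabra:.4f}, q (I. coriacea) = {q_coriacea:.4f}, "
      f"FST = {fst_band_1:.4f}")
```

In an outlier analysis, an observed value such as `fst_band_1` would be compared with the upper quantile of FST values simulated under neutrality at a similar heterozygosity.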
Results
Mitochondrial COI sequences
The final aligned mitochondrial COI dataset consisted of 1393 bp collected from 64 individuals from NC/SC sympatric populations and 13 individuals from allopatric I. glabra populations. Of the data collected from NC/SC individuals, 69 sites were variable and 31 were parsimony informative. Third positions accounted for 93% of the variable sites and 97% of the parsimony informative sites. Uncorrected pairwise differences across the P. glabricola samples ranged from 0 to 1.29% (Table 1).
There was considerable genetic diversity among the individuals sampled. Across the entire NC/SC dataset of 64 individuals, 43 different COI haplotypes were recovered; 35 of these were unique while the remaining 8 haplotypes were carried by multiple individuals. Of these shared haplotypes, the two most common were carried by individuals from both host plants, while the remaining six shared haplotypes were host-restricted, carried by flies from only one or other of the host plants. In addition to these six host-restricted haplotypes, private haplotypes found within only a single population were present in flies from each of the populations. Of the flies sampled from three allopatric I. glabra populations, two related haplotypes were recovered from the flies sampled from NY (n = 2), two haplotypes were recovered from MD (n = 4), one being identical to shared haplotype H1 (Fig. 2), and six haplotypes were recovered from FL (n = 6), including one that was identical to shared haplotype H8 (Fig. 2).

Table 1 The number of specimens, mtDNA haplotypes, private alleles and uncorrected pairwise distances within and among NC/SC populations of P. glabricola reared from I. glabra (Pgl) and I. coriacea (Pco)

                                  Pairwise distances (%)
Population   N    # hap   # priv   Pgl, NC   Pgl, SC   Pco, NC   Pco, SC
Pgl, NC      16   10      7        0–0.93
Pgl, SC      16   12      9        0–1.01    0–0.93
Pco, NC      16   12      10       0–1.08    0–1.15    0–1.29
Pco, SC      16   15      14       0–1.22    0–1.29    0–1.29    0–1.15

AMOVA of the NC/SC dataset detected marginally significant structuring by geographical location (4.4% of variation), but not by host plant (Table 2). In contrast, using all of the data (the NC/SC dataset combined with the allopatric samples from I. glabra), a Mantel test for analysis of isolation by distance (IBD 1.53, Bohonak 2002) revealed no significant relationship between geographical and genetic distances (r² < 0.01, P ≥ 0.32).
Parsimony analysis of the entire dataset resulted in many equally parsimonious trees, one of which is presented as an unrooted network with scaled branches (Fig. 2). There was no evidence of phylogenetic structuring by either host or location; both were well-interspersed throughout the trees. There was also no deep structure indicative of distinct cryptic lineages in the mitochondrial COI data; the phylogenetic structuring is very shallow with many recently derived ‘tip’ lineages (Fig. 2). Of the eight shared haplotypes, the two most common, which are shared across host plant species, both occupy internal node positions in the equally parsimonious trees. The remaining six host-restricted shared haplotypes occupy either tip (n = 4) or internal (n = 2) node positions (Fig. 2).

Fig. 2 Representative phylogram from parsimony analysis of 1393 bp of mitochondrial COI. Size of circles is representative of number of individuals carrying a given haplotype. Branches connecting haplotypes are proportional to the number of nucleotide changes along each branch. Shared haplotypes from NC/SC collections are named H1 … H8, and the collection locations for these haplotypes are given in Table 2.

Table 2 AMOVA results for (A) mtDNA and (B) AFLP analysis of leafminers collected from two host plants (I. glabra and I. coriacea) in two locations (NC and SC)

Source               d.f.   SS        Var       % Var   P
A. mtDNA
  Among Hosts        1      7.547     0.10519   2.45    > 0.095
  Within Hosts       62     259.219   4.18095   97.55
  Among Locations    1      10.203    0.18953   4.38    0.032
  Within Locations   62     256.562   4.13810   95.62
B. AFLP
  Among Hosts        1      71.016    2.057     28.33   < 0.0001
  Within Hosts       62     322.531   5.202     71.67
  Among Locations    1      10.641    0.140     2.21    0.04
  Within Locations   62     382.906   6.176     97.79

Fig. 3 Unrooted neighbour-joining tree from distance analysis of 45 AFLP bands from NC/SC collections.
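The uncorrected pairwise differences and haplotype counts reported for the COI data can be illustrated with a short sketch. The sequences below are invented 10 bp toy strings, not data from this study, and the function names are hypothetical; the code only shows how a percentage p-distance and a tally of unique versus shared haplotypes might be computed from aligned sequences of equal length.

```python
from collections import Counter
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected pairwise difference, as a percentage of aligned sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * diffs / len(seq_a)


def haplotype_summary(sequences):
    """Return (number of haplotypes, unique haplotypes, shared haplotypes).

    A haplotype is 'unique' if carried by one individual and 'shared' if
    carried by more than one.
    """
    counts = Counter(sequences)
    unique = sum(1 for c in counts.values() if c == 1)
    shared = sum(1 for c in counts.values() if c > 1)
    return len(counts), unique, shared


# Toy alignment (hypothetical sequences, one per individual)
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]

distances = [p_distance(a, b) for a, b in combinations(toy_alignment, 2)]
n_hap, n_unique, n_shared = haplotype_summary(toy_alignment)

print(f"distance range: {min(distances):.1f}-{max(distances):.1f}%")
print(f"haplotypes: {n_hap} ({n_unique} unique, {n_shared} shared)")
```

Applied to a real alignment, the same tally per population would give the kind of summary shown in Table 1, although sites with gaps or ambiguity codes would need explicit handling that this sketch omits.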
AFLPs
AFLP procedures resulted in 49 polymorphic bands from the 64 NC/SC specimens. Four of these bands were present in only male specimens and were removed from subsequent analyses, leaving 45 bands for analysis. As observed for the mtDNA analysis, genetic diversity was abundant in the AFLP assay.
Unlike the mtDNA analysis, all analyses of the AFLP data found substantial structuring of the genotypes by host plant. Analysis of molecular variance using Arlequin found much greater structuring by host (28.3% of genetic variation) than by location (2.2% of variation) (Table 2). Likewise, the unrooted neighbour-joining phylogram of AFLP presence/absence data indicates a high degree of clustering by host plant species (Fig. 3). Flies from the two host plants form two discrete clusters, with the sole exception of a single branch between the clusters uniting one fly from each host. There is no evidence of structuring by location; collection location is interspersed across the tree and within each host cluster. Bootstrap support is not shown, as only a few branches uniting terminal specimens exhibited high support levels.
Logistic analysis of the AFLP dataset revealed a highly significant effect of host on band frequencies (χ² = 118.29, d.f. = 44, P < 0.0001), but no significant effect of location (PROC LOGISTIC, SAS). Because there was no significant effect of location, sample data from each host plant were pooled across locations for G-tests on individual band frequencies across hosts. To maintain an experiment-wise error rate of α = 0.05 for 45 tests across hosts, a test was determined to be significant only if P < 0.001. Using this criterion, 12 of the 45 bands were found to exhibit significant differences on the two host plants (Table 3). Although no reciprocally fixed differences across host plants were observed, there were 16 bands that were fixed present or absent on one host, but polymorphic on the other host. In seven of these cases the frequency differences were significant (Table 3).

Table 3 Results of G-tests of independence on individual AFLP bands. Samples from each host were pooled across the NC and SC locations. Significance column indicates which bands exhibit significance at an experiment-wise error rate of α = 0.05

Band         I. glabra (+/–)   I. coriacea (+/–)   χ²      P         Signif.
1-Pag/Ect    17/15             0/32                23.15   < 0.001   *
2-Pag/Ect    28/4              21/11               4.27    < 0.05    ns
3-Pag/Ect    1/31              1/31                0       ns        ns
4-Pag/Ect    19/13             19/13               0       ns        ns
5-Pag/Ect    2/31              18/14               18.62   < 0.001   *
6-Pag/Ect    13/19             13/19               0       ns        ns
7-Pag/Ect    4/28              7/25                0.99    ns        ns
8-Pag/Ect    21/11             0/32                31.26   < 0.001   *
9-Pag/Ect    16/16             32/0                21.33   < 0.001   *
10-Pag/Ect   28/4              3/29                39.10   < 0.001   *
13-Pag/Ect   1/31              1/31                0       ns        ns
15-Pct/Ecg   17/15             22/10               1.64    ns        ns
16-Pct/Ecg   0/32              5/29                5.42    < 0.05    ns
18-Pct/Ecg   17/15             1/31                19.79   < 0.001   *
19-Pct/Ecg   3/29              0/32                3.14    ns        ns
20-Pct/Ecg   6/26              0/32                6.62    < 0.025   ns
21-Pct/Ecg   21/11             7/25                12.44   < 0.001   *
22-Pct/Ecg   1/31              0/32                1.02    ns        ns
23-Pct/Ecg   2/30              2/30                0       ns        ns
24-Pct/Ecg   13/19             26/6                11.09   < 0.001   *
25-Pct/Ecg   1/31              7/25                5.14    < 0.025   ns
26-Pct/Ecg   3/29              0/32                3.15    ns        ns
27-Pct/Ecg   2/30              0/32                2.06    ns        ns
28-Pct/Ecg   2/30              7/25                3.23    ns        ns
29-Pct/Ecg   16/16             21/11               1.60    ns        ns
30-Pct/Ecg   0/32              1/31                1.02    ns        ns
31-Pat/Egt   8/24              1/31                6.34    < 0.025   ns
32-Pat/Egt   9/23              32/0                35.90   < 0.001   *
33-Pat/Egt   0/32              1/31                0.99    ns        ns
34-Pat/Egt   0/32              21/11               31.26   < 0.001   *
35-Pat/Egt   28/4              25/7                0.99    ns        ns
36-Pat/Egt   3/29              5/27                0.57    ns        ns
37-Pat/Egt   3/29              5/27                0.57    ns        ns
38-Pat/Egt   16/16             5/27                8.58    < 0.01    ns
39-Pat/Egt   8/24              1/31                6.33    < 0.025   ns
40-Pat/Egt   24/8              32/0                9.14    < 0.01    ns
41-Pat/Egt   4/28              1/31                1.95    ns        ns
42-Pat/Egt   14/18             7/25                3.47    < 0.05    ns
43-Pat/Egt   26/6              31/1                4.01    < 0.05    ns
44-Pat/Egt   3/29              3/29                0       ns        ns
45-Pat/Egt   1/31              7/25                5.14    < 0.025   ns
46-Pat/Egt   30/2              28/4                0.74    ns        ns
47-Pat/Egt   12/20             0/32                14.77   < 0.001   *
48-Pat/Egt   3/29              0/32                3.15    ns        ns
50-Pat/Egt   16/16             0/32                21.33   < 0.001   *

In the FDIST2 analysis of genetic divergence, eight loci had FST higher than the 95th percentile of the simulation results in at least one comparison (Fig. 4). Comparisons of samples taken from different host plants but at the same location revealed seven (NC) and five (SC) loci with FST exceeding the 95% threshold simulated for neutral loci, indicating divergent selection acting on diversity at these loci. Four of these high-FST loci were shared by the NC and SC comparisons, and the relative magnitude of the FST for the shared loci was the same in both locations. This strongly suggests that for these four loci at least, the pattern of extreme divergence is not a local artefact. Only one locus had a significantly high FST in a within-host plant comparison (Fig. 4). The mean FST of the among-host plant comparisons fell from 0.11 to 0.048 following removal of the seven high-FST loci in NC, and from 0.095 to 0.046 following removal of five high-FST loci from the SC comparisons. The mean FST for the within-host plant, between-location comparisons was smaller: 0.025 among samples from I. coriacea and 0.013 from I. glabra.
Fig. 4 FDIST2 analysis results. FST vs. heterozygosity for each AFLP locus in comparisons of P. glabricola from different host plants at one location (A, B) or from the same host plant from different locations (C, D). Closed symbols indicate FST measured for each of 45 loci (some points are hidden). Solid line is the upper 95th percentile of simulated θ for neutral loci from samples of 32 individuals from each of 100 demes. See text for details.

Discussion
The leafminers studied here demonstrate host-plant-based genetic structure, strongly suggesting the presence of diverged populations such as host races or sibling species. Interestingly, our conclusions appear to depend critically on the marker system analysed. We found no evidence of host-associated mitochondrial variation in P. glabricola collected from its two hosts, I. glabra and I. coriacea. In the phylogenetic analysis, mitochondrial haplotypes did not cluster by either host plant or location. The most common mitochondrial haplotype (H1) was shared equally by flies from the two hosts (Fig. 2). AMOVA found no evidence of frequency differences of mitochondrial haplotypes by host plant and only weak structuring by geographical location. As there were 43 distinct haplotypes, the lack of structure is not an artefact of insufficient genetic diversity. The maximum uncorrected pairwise divergence across the entire dataset was 1.29%, a value not much higher than intraspecific mitochondrial variation observed in other agromyzids (Scheffer & Wiegmann 2000; S. J. Scheffer, unpublished) and substantially less than the 2% cut-off proposed by COI barcoding proponents for delineating intraspecific from interspecific variation (Hebert et al. 2003; Barrett & Hebert 2005).
In contrast to the mitochondrial data, AFLP frequencies exhibited significant differences across the two hosts, indicating that genotypes of pupal P. glabricola are not distributed randomly with respect to host plant. Clustering of the AFLP data into nearly monophyletic host-associated clusters irrespective of location supports this finding. These AFLP results strongly suggest that P. glabricola feeding on I. glabra and I. coriacea represents two sympatric host races that have diverged so recently that reciprocally diagnostic differences have not yet evolved in either mitochondrial or sampled nuclear markers.
Indeed, the FDIST2 analysis indicates that directional selection has likely played an important role in the divergence among these host races. Our results were similar to those of Wilding et al. (2001), in which outlier loci were much more abundant in comparisons of nearby snails from different environments than in comparisons of more distant snails from similar environments. We are aware of the limitations of our analysis: the number of individuals genotyped in each host or location was small (32) and the markers are dominant. However, despite those limitations, several loci repeatedly signal differential adaptation to the host plants I. glabra and I. coriacea. These loci would be good candidates for linkage analysis to identify markers closely linked to quantitative trait loci (QTL) causing host-plant adaptation in these flies. Removal of the seven highest FST loci did not eliminate the host-plant-based topology of the AFLP phenogram (not shown), suggesting that there are a number of additional loci with smaller but non-negligible FST contributing to that phylogenetic structure. Whether those additional loci are differentiated by genetic drift or selection remains unknown. The FDIST2 analysis was relatively conservative: only five loci were detected as significant outliers in the overall comparison of flies from the two host plants, whereas the G-test identified 12 significantly differentiated loci (including the five also identified by FDIST2).
Host races, particularly in sympatry, imply the presence of feeding and/or oviposition preferences. Although the role of host plant preference has not yet been fully explored in P. glabricola, several lines of evidence suggest female host preferences may exist. Field observations suggest that P. glabricola from I. coriacea prefer that host. During the mass emergence of leafminers from I. coriacea in February, adults can be readily observed on I. coriacea foliage but not on the interspersed I. glabra foliage (S. J. Scheffer, personal observation). Additionally, during laboratory no-choice tests, flies from I. glabra made significantly more oviposition punctures resulting in leafmines on I. glabra than they did on I. coriacea (χ2 = 8.43, d.f. = 1, P < 0.01; S. J. Scheffer, unpublished).
An alternative explanation for the frequency differences observed in the AFLP data is that, rather than comprising host races, P. glabricola is a single panmictic population that experiences strong host-associated divergent selection every generation. Under this alternative hypothesis, the mortality levels that would be required to account for the observed differences in AFLP frequencies are very large and involve more AFLP loci than one might reasonably expect to find in a fairly limited survey of the genome. While a finding of strong single-generation, host-associated selection in sympatry would be an exciting contribution to the sympatric speciation debate, we suggest that it is more likely that P. glabricola on I. glabra and I. coriacea represents two host races.
Accepting that host races are present within P. glabricola, how do we explain the apparently disparate findings of the mitochondrial and nuclear datasets? Under drift-dominated scenarios, the smaller effective population size of mitochondrial genes will result in mitochondrial
markers tracking recent population divergences on average more quickly than will typical nuclear loci (Moore 1995; Palumbi et al. 2001; Rosenberg 2003). However, the rate of fixation of individual neutral loci will vary stochastically. An approach such as AFLP, which extensively surveys variation throughout the nuclear genome, should sample primarily neutral gene regions exhibiting a range of rates of approach to fixation; such sampled gene regions may include some that happen to approach fixation faster than a mitochondrial locus, though most should not. Additionally, given a sufficient density of markers, an approach such as AFLP may sample genomic regions linked to loci experiencing selection (e.g. for early host race formation and maintenance). These selected loci, and their genetically linked neighbours, will change in frequency much more rapidly than will neutral loci that are primarily under the influence of genetic drift. Regions associated with selected loci will therefore show host-associated frequency differences followed by monophyly much sooner than other random regions of the genome (Wang et al. 1999; Ting et al. 2000). In this study, we found significant frequency differences in some AFLP bands, but no reciprocally fixed differences, an overall pattern that would be predicted to occur in the early stages of diversification. We have identified several AFLP bands that show significant frequency differences most simply explained by directional selection, and several other significantly differentiated loci with unknown relative contributions of drift and selection (via hitchhiking) to the observed patterns. In other studies, AFLP data have been shown to delineate species even in the face of incomplete lineage sorting of mtDNA or hybridization leading to introgression of mitochondrial markers (Parsons & Shaw 2001; Kai et al. 2002).
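The drift argument above rests on mitochondrial genes having a smaller effective population size than nuclear genes. As a rough illustration only, the sketch below compares how quickly expected heterozygosity is lost under pure drift for the two marker classes. It assumes an idealized Wright-Fisher population with an equal sex ratio, strictly maternal mtDNA inheritance, and no mutation, migration, or selection; none of these assumptions is asserted for P. glabricola. Under them, the per-generation loss is 1/(2N) at a diploid nuclear locus and 1/N_f at a mitochondrial locus, where N_f = N/2 is the number of females. The function names and the value N = 1000 are hypothetical.

```python
import math

def drift_half_life(loss_per_generation: float) -> float:
    """Generations until expected heterozygosity halves under pure drift."""
    return math.log(0.5) / math.log(1.0 - loss_per_generation)

def compare_marker_classes(n_individuals: int) -> dict:
    """Compare drift half-lives of a nuclear and a mitochondrial locus.

    Assumes equal sex ratio and maternal mtDNA inheritance (illustrative only).
    """
    n_females = n_individuals / 2
    nuclear = drift_half_life(1.0 / (2 * n_individuals))  # diploid, both sexes
    mito = drift_half_life(1.0 / n_females)               # haploid, females only
    return {"nuclear": nuclear, "mitochondrial": mito, "ratio": nuclear / mito}

# Hypothetical population size, chosen only for illustration
result = compare_marker_classes(1000)
```

Under these assumptions the mitochondrial locus loses variation roughly four times faster on average. These are expectations only; as the text notes, individual neutral loci vary stochastically around them, and loci linked to selected regions can change much faster.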
Often, cases of hybridization are detected when individuals from one species are found to carry mitochondrial haplotypes indicative of a related species. This type of polyphyly is common (Funk & Omland 2003) and is most convincingly attributed to hybridization rather than lineage sorting when phylogeographical structure is present in the tree and the shared haplotypes occur primarily in regions of sympatry between the species (e.g. Shaw 1996, 2002; Masta et al. 2002). In P. glabricola, the mitochondrial data show no obvious evidence of structure either inside or outside the region of sympatry; the placements of haplotypes found in allopatric populations in NY, MD, and FL appear to be as unstructured as those of haplotypes from regions of sympatry in NC and SC (Fig. 2). Most strikingly, shared haplotype H8 was recovered from I. glabra flies in NC, SC, and FL. This lack of structure in the mitochondrial tree(s), the amount of total divergence observed (< 1.3%), and the lack of significant isolation by distance are most easily explained by postulating the presence of two host races (based on evidence from AFLP data) that have diverged so recently that lineage sorting of
mitochondrial variation is only just beginning; diagnostic mitochondrial variation has not yet accumulated in host-associated populations. Consistent with this hypothesis is the observation that the two haplotypes that are shared across hosts (H1 and H2) occupy internal (and likely older) rather than more recent tip locations in the phylogram (Fig. 2).
Phytomyza glabricola belongs to a radiation of 14 primarily host-specific holly leafminers, most of which feed on plants in the genus Ilex. Of these species, P. glabricola is the only holly leafminer known to exhibit both a multivoltine (on I. glabra) and a univoltine (on I. coriacea) larval development pattern; all others exhibit one pattern or the other (eight multivoltine, three univoltine, two unknown; Kulp 1968; Scheffer & Wiegmann 2000). The AFLP data presented here provide the first clues that the difference in larval development time observed within P. glabricola may be due to genetic differences in the flies rather than to host-associated phenotypic plasticity. The AFLP loci identified here will contribute towards characterizing the genetic architecture of the life history differences observed in these host races. That the voltinism difference is due to genetic differences in the flies is consistent with the observation from other holly leafminers that species-specific differences in larval development time are maintained even on the same host; the closely related P. ilicicola Loew (univoltine) and P. opacae Kulp (multivoltine) exhibit the dramatic difference in larval development time even when co-occurring on their common host I. opaca (Kulp 1968; S. J. Scheffer, personal observation).
Interestingly, the dramatic difference in voltinism exhibited by the host races within P. glabricola does not result in allochronic separation of adult flight periods. Leafminers from I. glabra and I. coriacea emerge synchronously in late winter with overlapping eclosion dates (S. J. Scheffer, unpublished).
Because the hosts grow interspersed, there is ample opportunity for individuals from host-associated populations to encounter each other. Data from many other host race (or sibling species) systems suggest that host-mediated phenological differences play an important, and possibly critical, role in host-associated divergence and speciation (e.g. Wood 1980; Wood & Keese 1990; Wood et al. 1990; Craig et al. 1993; Feder et al. 1993; Feder & Filchak 1999; Groman & Pellmyr 2000; Thomas et al. 2003). Future studies on P. glabricola will investigate possible causes of host-associated divergence despite phenological opportunities for intermating of host races.
Finally, the molecular evidence offered here for the presence of currently sympatric host races within P. glabricola does not firmly establish the geographical context of the origin of the host races. The lack of both mitochondrial differentiation and fixed differences in AFLP bands suggests a fairly recent origin of the host races, but the initial divergence could have taken place in either sympatry or
allopatry. In fact, the two hosts of P. glabricola, I. glabra and I. coriacea, are believed to be closely related (Galle 1997; Manon, personal communication), raising the possibility that host divergence, presumably in allopatry, may have played a role in leafminer divergence. Additional work on the behaviour of these flies as well as the genetics of both the flies and their hosts will be necessary to determine to what extent the host races uncovered by this study have diverged, and the likely scenarios for such divergence. Currently, these host races are broadly sympatric and syntopic and provide a new system for exploring the role of geography and ecological specialization in speciation of phytophagous insects.
Acknowledgements
We thank Leslie Iskenderian for assistance with field collections. Matt Lewis and Andrea Badgely collected the sequence data and AFLP data, respectively. Matt Kramer assisted with statistical analyses. Valuable comments on the manuscript were provided by Kevin Omland, Dan Funk, Chris Thompson, Douglas Miller, Matt Hare and several anonymous reviewers. For permission to collect leafminers, we gratefully acknowledge Carolina Beach State Park (Carolina Beach, North Carolina), Francis Marion National Forest (Columbia, South Carolina), and the Archbold Biological Station (Lake Placid, Florida). The Holly Society of America provided funds for collecting leafminers. DJH is supported by a grant from the USDA-NRI (2002-35302-12478).
References
Al-Siyabi AAK, Shetlar DJ (1998) Inkberry leaf miner, Phytomyza glabricola Kulp (Diptera: Agromyzidae): Life cycle in Ohio. Ohio State Extension Research Special Circular, 165–199.
Avise JC (2000) Phylogeography, the History and Formation of Species. Harvard University Press, Cambridge, Massachusetts.
Avise JC, Ankney CD, Nelson WS (1990) Mitochondrial gene trees and the evolutionary relationship of mallard and black ducks. Evolution, 44, 1109–1119.
Ballard JWO, Chernoff B, James AC (2002) Divergence of mitochondrial DNA is not corroborated by nuclear DNA, morphology, or behavior in Drosophila simulans. Evolution, 56, 527–545.
Barrett RDH, Hebert PDN (2005) Identifying spiders through DNA barcodes. Canadian Journal of Zoology, 83, 481–491.
Barton NH (2000) Genetic hitch-hiking. Philosophical Transactions of the Royal Society, 355, 1553–1562.
Beaumont MA, Balding DJ (2004) Identifying adaptive genetic divergence among populations from genome scans. Molecular Ecology, 13, 969–980.
Beaumont MA, Nichols JM (1996) Evaluating loci for use in the genetic analysis of population structure. Proceedings of the Royal Society of London (B), 263, 1619–1626.
Berlocher SH, Feder JL (2002) Sympatric speciation in phytophagous insects: moving beyond controversy? Annual Review of Entomology, 47, 773–815.
Bohonak AJ (2002) Ibd (Isolation by Distance): a program for analysis of isolation by distance. Journal of Heredity, 93, 153–154.
Bowcock AM, Kidd JR, Mountain JL et al. (1991) Drift, admixture, and selection in human evolution: a study with DNA polymorphisms. Proceedings of the National Academy of Sciences USA, 88, 839–843.
Bush GL (1969) Sympatric host race formation and speciation in frugivorous flies of the genus Rhagoletis (Diptera: Tephritidae). Evolution, 23, 237–251.
Bush GL (1975) Modes of animal speciation. Annual Review of Ecology and Systematics, 6, 339–364.
Craig TP, Horner JD, Itami JK (2001) Genetics, experience, and host-plant preference in Eurosta solidaginis: implications for host shifts and speciation. Evolution, 55, 773–782.
Craig TP, Itami JK, Horner JD, Abrahamson WG (1993) Behavioral evidence for host-race formation in Eurosta solidaginis. Evolution, 47, 1696–1710.
Dres M, Mallet J (2002) Host races in plant-feeding insects and their importance in sympatric speciation. Philosophical Transactions of the Royal Society of London (B), 357, 471–492.
Ehrlich P, Raven P (1964) Butterflies and plants: a study in coevolution. Evolution, 18, 586–608.
Farrell BD (1998) ‘Inordinate fondness’ explained: why are there so many beetles? Science, 281, 555–559.
Feder JL, Filchak KE (1999) It’s about time: the evidence for host-mediated selection in the apple maggot fly, Rhagoletis pomonella, and its implications for fitness trade-offs in phytophagous insects. Entomologia Experimentalis et Applicata, 91, 211–225.
Feder JL, Hunt TA, Bush GL (1993) The effects of climate, host plant phenology and host fidelity on the genetics of apple and hawthorn infesting races of Rhagoletis pomonella. Entomologia Experimentalis et Applicata, 69, 117–135.
Feder JL, Opp S, Wlazlo B, Reynolds K, Go W, Spisak S (1994) Host fidelity is an effective pre-mating barrier between sympatric races of the apple maggot fly. Proceedings of the National Academy of Sciences USA, 91, 7990–7994.
Feder JL, Stolz U, Lewis KM, Perry W, Roethele JB, Rogers A (1997) The effects of winter length on the genetics of apple and hawthorn races of Rhagoletis pomonella (Diptera: Tephritidae). Evolution, 51, 1862–1876.
Funk DJ, Omland KE (2003) Species-level paraphyly and polyphyly: frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annual Review of Ecology and Systematics, 34, 397–423.
Galle FC (1997) Hollies: the Genus Ilex. Timber Press, Portland, Oregon.
Groman JD, Pellmyr O (2000) Rapid evolution and specialization following host colonization in a yucca moth. Journal of Evolutionary Biology, 13, 223–236.
Hawthorne DJ (2001) Amplified fragment length polymorphism-based genetic linkage map of the Colorado potato beetle Leptinotarsa decemlineata: sex chromosomes and a pyrethroid-resistance candidate gene. Genetics, 158, 695–700.
Hawthorne DJ, Via S (2001) Genetic linkage of ecological specialization and reproductive isolation in pea aphids. Nature, 412, 904–907.
Hebert PDN, Ratnasingham S, deWaard JR (2003) Barcoding animal life: cytochrome c oxidase 1 divergences among closely related species. Proceedings of the Royal Society of London (B), 270, S596–S599.
Kai Y, Nakayama K, Nakabo T (2002) Genetic differences among three colour morphotypes of the black rockfish, Sebastes inermis, inferred from mtDNA and AFLP analyses. Molecular Ecology, 11, 2591–2598.
Kulp LA (1968) The taxonomic status of dipterous holly leaf miners (Diptera: Agromyzidae). University of Maryland Agriculture Experiment Station Bulletin, A-155, 1–42.
Masta SE, Sullivan BK, Lamb T, Routman EJ (2002) Molecular systematics, hybridization, and phylogeography of the Bufo americanus complex in Eastern North America. Molecular Ecology, 24, 302–314.
Maynard Smith J, Haigh J (1974) The hitch-hiking effect of a favorable gene. Genetical Research Cambridge, 23, 23–35.
Miller JR, Hawthorne D (2005) Durability of marker-quantitative trait loci haplotypes in structured populations. Genetics, 171, 1353–1364.
Mitter C, Farrell BD, Wiegmann B (1988) Phylogenetic study of adaptive zones: has phytophagy promoted insect diversification? American Naturalist, 132, 107–128.
Moore WS (1995) Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees. Evolution, 49, 718–726.
Nei M, Chesser RK (1983) Estimation of fixation indices and gene diversities. Annals of Human Genetics, 47, 253–259.
Neigel JE, Avise JC (1986) Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation. In: Evolutionary Processes and Theory (eds Nevo E, Karlin S), pp. 515–534. Academic Press, New York.
Omland KE, Baker JM, Peters JL (2006) Genetic signatures of intermediate divergence: population history of Old and New World Holarctic ravens. Molecular Ecology, 15, 795–808.
Palumbi SR, Cipriano F, Hare MP (2001) Predicting nuclear gene coalescence from mitochondrial data: the three-times rule. Evolution, 55, 859–868.
Parsons YM, Shaw KL (2001) Species boundaries and genetic diversity among Hawaiian crickets of the genus Laupala identified using amplified fragment length polymorphism. Molecular Ecology, 10, 1765–1772.
Rosenberg NA (2003) The shapes of neutral gene genealogies in two species: probabilities of monophyly, paraphyly, and polyphyly in a coalescent model. Evolution, 57, 1465–1477.
SAS Institute (1999) SAS/STAT Software User’s Guide. Release 8.0. SAS Institute, Inc., Cary, North Carolina.
Scheffer SJ (2002) New host record, new range information, and a new pattern of voltinism: possible host races within the holly leafminer Phytomyza glabricola Kulp (Diptera: Agromyzidae). Proceedings of the Entomological Society of Washington, 104, 571–575.
Scheffer SJ, Wiegmann BM (2000) Molecular phylogenetics of the holly leafminers (Diptera: Agromyzidae: Phytomyza): Species limits, speciation, and dietary specialization. Molecular Phylogenetics and Evolution, 17, 244–255.
Schneider S, Roessli D, Excoffier L (2000) ARLEQUIN ver. 2.000: a Software for Population Genetic Data Analysis. Genetics and Biometry Laboratory, University Geneva, Switzerland.
Shaw KL (1996) Sequential radiations and patterns of speciation in the Hawaiian cricket genus Laupala inferred from DNA sequences. Evolution, 50, 237–255.
Shaw KL (2002) Conflict between mitochondrial and nuclear DNA phylogenies of a recent species radiation: what mitochondrial reveals and conceals about modes of speciation in Hawaiian crickets. Proceedings of the National Academy of Sciences USA, 99, 16122–16127.
Simon C, Frati F, Beckenbach A, Crespi B, Liu H, Flook P (1994) Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved PCR primers. Annals of the Entomological Society of America, 87, 651–701.
Sokal RR, Rohlf FJ (1981) Biometry, 2nd edn. W.H. Freeman and Co. NY, New York.
Spencer KA (1990) Host Specialization in the World Agromyzidae (Diptera). Kluwer Academic Publishers, Dordrecht, The Netherlands.
Spencer KA, Steyskal GC (1986) Manual of the Agromyzidae (Diptera) of the United States. U.S. Department of Agriculture Handbook No. 638, Washington DC.
Storz JF (2005) Using genome scans of DNA polymorphism to infer adaptive population divergence. Molecular Ecology, 14, 671–688.
Strong DR, Lawton JH, Southwood R (1984) Insects on Plants, Community Patterns and Mechanisms. Harvard University Press, Cambridge, Massachusetts.
Swofford DL (2001) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4. Sinauer Associates, Sunderland, Massachusetts.
Thomas Y, Bethenod MT, Pelozuelo L, Frerot B, Bourguet D (2003) Genetic isolation between two sympatric host-plant races of the European corn borer, Ostrinia nubilalis Hubner. I. sex pheromone, moth emergence timing, and parasitism. Evolution, 57, 261–273.
Ting C-T, Tsaur S-C, Wu C-I (2000) The phylogeny of closely related species as revealed by the genealogy of a speciation gene, Odysseus. Proceedings of the National Academy of Sciences USA, 97, 5313–5316.
Vos P, Hogers R, Bleeker M et al. (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research, 23, 4407–4414.
Wang R-L, Stec A, Hey J, Lukens L, Doebley J (1999) The limits of selection during maize domestication. Nature, 398, 236–239.
Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution, 38, 1358–1370.
Wilding CS, Butlin RK, Grahame J (2001) Differential gene exchange between parapatric morphs of Littorina saxatilis detected using AFLP markers. Journal of Evolutionary Biology, 14, 611–619.
Wood TK (1980) Divergence in the Enchenopa binotata Say complex (Homoptera: Membracidae) effected by host plant adaptation. Evolution, 34, 147–160.
Wood TK, Keese MC (1990) Host-plant-induced assortative mating in Enchenopa treehoppers. Evolution, 44, 619–628.
Wood TK, Olmstead KL, Guttman SI (1990) Insect phenology mediated by host-plant water relations. Evolution, 44, 629–636.
Zink RM (2004) The role of subspecies in obscuring biological diversity and misleading conservation policy. Proceedings of the Royal Society of London (B), 271, 561–564.
Sonja J. Scheffer is a molecular systematist interested in the evolution of host-use in plant-feeding insects. David J. Hawthorne studies the genetic basis of insect adaptation in natural and agricultural landscapes.