Accepted Article

Received Date: 31-May-2015
Revised Date: 01-Nov-2015
Accepted Date: 16-Nov-2015
Article type: Original Article

Title: Genetic differentiation and reduced genetic diversity at the northern range edge of two species with different dispersal modes

Author: Abigail E. Cahill a,b and Jeffrey S. Levinton a

Address: a Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794-5245 USA

Keywords: range limits, genotyping-by-sequencing, genetic variation, gastropods

Corresponding author: Abigail E. Cahill b

Institut Méditerranéen de Biodiversité et d’Ecologie marine et continentale (IMBE)

Aix Marseille Université, CNRS, IRD, Avignon Université Station Marine d’Endoume, Chemin de la Batterie des Lions, 13007 Marseille, France

Email: [email protected]
Fax: (33) 4 91 04 16 35

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/mec.13497. This article is protected by copyright. All rights reserved.

Running title: Patterns of genetic diversity at range edges

Abstract

Theory predicts that genetic variation should be reduced at range margins, but empirical support is equivocal. Here, we used genotyping-by-sequencing technology to investigate genetic variation in central and marginal populations of two species in the marine gastropod genus Crepidula. These two species have different development and dispersal types, and might therefore show different spatial patterns of genetic variation. Both allelic richness and the proportion of private alleles were highest in the most central populations of both species, and lower at the margin. The species with low dispersal, C. convexa, showed high degrees of structure throughout the range that conform to the pattern found in previous studies using other molecular markers. The northernmost populations of the high-dispersing species, C. fornicata, are distinct from more central populations, though this species has been previously observed to have little genetic structure over much of its range. Although genetic diversity was significantly lower at the range margin, the absolute reduction in diversity observed with these genome-wide markers was slight, and it is not yet known if there are functional consequences for these marginal populations.

Introduction

Understanding the causes of species’ geographic ranges is a subject of both theoretical and applied concern in evolutionary ecology (Bridle & Vines 2007; Kawecki 2008; Sexton et al. 2009). Identifying the mechanisms determining these range limits, as well as their underlying dynamics, is of current interest given that many species are experiencing range shifts in response to changing climate (e.g. Parmesan & Yohe 2003; Perry et al. 2005; Parmesan 2006; Thomas 2010; Angert et al. 2011; Cahill et al. 2014).

The role of genetics in limiting species’ range shifts is a mix of too much versus too little gene flow or genetic variation. Large amounts of gene flow from a central population to a marginal one may cause the migration of alleles that are adaptive within the range but maladaptive at the margins (reviewed in Lenormand 2002; Kawecki 2008). This swamping effect may prevent marginal populations from adapting to local conditions, which might slow range expansion (Kirkpatrick & Barton 1997). If adaptation to environments at the margin of a species’ distribution is possible, it may allow for adaptation to conditions beyond the margins (e.g. Bridle et al. 2014). However, very low amounts of gene flow to the margins could result in small, isolated populations that are prone to genetic drift (Bridle & Vines 2007). The balance between gene flow and selection and the relative importance of these processes in determining species’ ranges has been modeled (e.g. Holt & Gomulkiewicz 1997; Kirkpatrick & Barton 1997; Case & Taper 2000; Price & Kirkpatrick 2009; Bridle et al. 2010; Bourne et al. 2014; Polechová & Barton 2015) and investigated empirically (e.g. Sexton et al. 2011).

Marginal populations are expected to show reduced genetic diversity relative to central populations due to fragmentation, small population size and a corresponding increase of genetic drift, and potentially strong adaptation to local conditions (Kawecki 2008). This prediction assumes that a species has an abundant-center distribution, with high population densities in the center of its range and low densities at the margins (Lira-Noriega & Manthey 2014). Genetic variation is indeed lower at range margins in many cases (Gaston 2003; Eckert et al. 2008; Kawecki 2008; Lira-Noriega & Manthey 2014; Micheletti & Storfer 2015), but the pattern is far from universal (e.g. Moeller et al. 2011; Halbritter et al. 2015).

Coastal marine species frequently exist in a linear series of populations arrayed across large latitudinal ranges. This makes the identification of populations at margins relatively easy, and also means that environmental conditions are often quite variable across a range. For many marine invertebrates, adults are sessile, and dispersal among populations occurs mostly in the larval or juvenile stage (Cowen & Sponaugle 2009). This has led to the prediction that species with planktotrophic stages, which can live in the plankton for hours to years, will have lower levels of between-site genetic differentiation than those with direct-developing larvae that do not live in the plankton (reviewed in Shanks 2009). Species with planktotrophic larvae frequently have less genetic structure in space than closely-related direct developers (e.g. among congeneric gastropod species: Collin 2001; Lee & Boulding 2009; Cahill & Viard 2014). This general pattern indicates that larval type may be related to the degree of gene flow, including to marginal populations, which may affect both the amount of genetic diversity in marginal relative to core habitats and the degree of potential gene swamping.

Crepidula fornicata and C. convexa (Gastropoda: Calyptraeidae) can be used to compare diversity in populations at range margins based on dispersal ability. These two species are native to the east coast of North America (Collin 2001; Fig. 1). Crepidula fornicata is a planktotrophic developer, with a 2-4 week larval period (Collin 2003) that allows for high dispersal potential. Populations of C. fornicata display no genetic differentiation from New Brunswick (Canada) to the Atlantic coast of Florida (USA) when measured with allozymes (Hoagland 1985), the mitochondrial marker COI (Collin 2001), and microsatellites and AFLPs (Riquet et al. 2013), although populations of C. fornicata in the Gulf of Mexico are slightly different from those in the Atlantic (Riquet et al. 2013). In contrast, C. convexa is a direct-developing snail with no planktonic larval stage. This lack of larval dispersal has led to relatively strong genetic differentiation among populations, as measured with COI (Collin 2001) and microsatellites (Cahill & Viard 2014).

The native range of C. fornicata extends from the Yucatan Peninsula (Mexico) in the south (Collin 2001) to Newfoundland (Canada) in the north (Rawlings et al. 2011). The range of C. convexa is smaller, as predicted for a direct developer (Scheltema 1986; Johannesson 1988), stretching along the east coast of the United States (Collin 2001; Fig. 1). The northern range edge of C. fornicata may have expanded over the past decades, apparently tracking warming water temperature towards the north (Rawlings et al. 2011), and expansion is predicted to continue (Saupe et al. 2014). The different amounts of dispersal in these two species may lead to different patterns of genetic diversity at the range margins in these species, but this is currently unknown.

We used high-throughput DNA sequencing (genotyping-by-sequencing, or GBS, Elshire et al. 2011) to investigate genetic diversity and structure in populations in the range center and at the northern margin of two Crepidula species with differing dispersal ability. We particularly tested the hypothesis of reduced genetic diversity in marginal populations relative to central ones. Although southern range edges (low-latitude edge) are of interest due to potential climate-related range contractions (Cahill et al. 2014), the current range expansions in C. fornicata mean that the northern edge is very relevant to range shifts. Additionally, C. convexa has a cryptic sister species with a parapatric range to the south (C. ustulatulina; Collin, 2001, 2002), making it difficult to identify the southern range edge in this species. We therefore chose to study the northern range edge of C. convexa and C. fornicata. We expected that neutral genetic diversity would be reduced at the margin in both species, but that C. convexa would show a greater reduction in diversity due to its lower dispersal capacity. We also expected that the higher dispersal capacity in C. fornicata would correspond to low genetic structure in this species, but that C. convexa would show high genetic structure, corresponding to work with other molecular markers.

Materials and Methods

Sample collection

Samples of C. fornicata and C. convexa were collected from four populations each along the Atlantic coast of North America (Table 1, Fig. 1). Two central and two marginal populations were sampled for each species. Snails were collected from many different substrates at each site (rocks, bottles, clams, etc.), and multiple individuals were sometimes collected from the same substrate. Based on prior studies (Dupont et al. 2006; Le Cam et al. 2014), adults on a single substrate are not highly related, so collecting multiple individuals from a single substrate is not expected to affect results. Collections were made by a single person at each site. With the exception of the Newfoundland population of C. fornicata, animals were transported live to the laboratory in Stony Brook, New York, then removed from their substrates and placed directly in a freezer at -80°C. Samples from Newfoundland were removed from their substrates and placed in 95% ethanol before transport to New York.

DNA extraction and quality control

DNA was extracted from 0.2 g of tissue using a DNeasy® Blood & Tissue Kit (Qiagen®) according to the manufacturer’s instructions. Cephalic tissue was used for extractions of C. fornicata and large C. convexa, and whole animals were used for small C. convexa. Following this extraction, contaminants were removed using a DNeasy® column and reagents. First, unpurified samples were added to lysis buffer that had been chilled and centrifuged, and the solution was placed on a DNeasy® column. Wash buffer was then added (Qiagen® buffer AW1), and the reaction was centrifuged for 6 min at 5,000 rpm. A second wash buffer (Qiagen® buffer AW2) was added, followed by 10 min of centrifugation at 5,000 rpm. Finally, elution buffer was added and centrifuged for 4 min at 5,000 rpm.

The concentrations of the cleaned DNA samples were measured using photometric dye (Quant-iT™ PicoGreen®, Life Technologies) and a Mini-Fluorometer (TBS-380, Turner Biosystems) according to the manufacturers’ protocols. A subset of the samples were digested with the restriction enzyme Sau3AI (New England Biolabs®), then run on a 1% agarose gel (95V for 50 minutes) and visually checked to verify DNA quality. Samples were concentrated to ≥ 20 ng/µl for sequencing.

Genotyping

Samples were shipped to the Institute for Genomic Diversity (IGD) at Cornell University, where they were analyzed using GBS (Elshire et al. 2011). Libraries were prepared for 20-26 individuals per population. The IGD tested several common restriction enzymes on a subset of samples to see which produced a library with a large number of fragments that were small enough to sequence (< 500 bp). Following this enzyme optimization, DNA from each individual was digested separately using PstI, a restriction enzyme with a six-base recognition site. DNA fragments were then ligated to a barcoded adaptor (using a separate barcode for each individual) and a common adaptor. Following quality control, libraries from each species were run on a separate 96-well plate with 95 wells each containing DNA from a different individual and one well serving as a blank. The libraries were sequenced on an Illumina HiSeq 2500 with 96 samples sequenced per lane using single-end sequencing, and the reads generated were 100 bp long. The size of the genomes of these species is unknown, but that of the congener C. unguiformis has been estimated at approximately 6.2 Gb (Libertini et al. 2009).

Bioinformatics and quality control

The reads were quality checked using FastQC version 0.10.1 (Andrews 2010). Data were then analyzed using the non-reference pipeline Universal Network-Enabled Analysis Kit (UNEAK; http://www.maizegenetics.net/gbs-bioinformatics; Lu et al. 2013). The pipeline de-multiplexed the data, trimmed the reads to 64 bp to remove the error-prone tail of the sequence, removed the barcoded adaptor, and then classified identical reads as tags. A network analysis was used to find tags that differed by a single base pair (i.e. candidate SNPs; Lu et al. 2013). The pipeline was run with the default error tolerance rate (0.03, designed to minimize the chance that real tags are discarded as sequencing errors) and the default minimum minor allele frequency (0.01). One of the processing steps in the UNEAK pipeline removed potential paralogs. The two species were analyzed in the same run of the pipeline. FastQC was conducted using the platform provided by the iPlant Collaborative (Goff et al. 2011), and the UNEAK analyses were conducted by the IGD.

Following the SNP discovery phase, the data were separated by species, and all further analyses were conducted separately for each species. The dataset was filtered to the loci that were sampled at 75% or more of individuals (i.e. 72 individuals out of 95 total), as well as loci that were sampled at a mean coverage of ≥ 10X per individual (i.e. on average, each individual had 10 or more sequenced copies of the locus). We also removed all failed individuals, defined as those that were sequenced with less than 10% of the mean reads per sample for the particular species (eight individuals in C. fornicata and 10 individuals in C. convexa; see Table 2 for final sample sizes in each population). All subsequent analyses were done on this reduced dataset.

Identifying FST outliers

To identify SNPs with extreme FST values (outliers) that may not follow expectations for neutral loci, we used the program Bayescan version 2.1 (Foll & Gaggiotti 2008), which uses a Bayesian framework to calculate the posterior odds that any given locus is under selection. We chose to use Bayescan due to its relatively low rate of false positives (Lotterhos & Whitlock 2014). We used BLAST (NCBI) to search for close matches (i.e. highly similar sequences) to tags containing outlier loci. For each outlier, we conducted multiple searches: the first against the nucleotide collection of NCBI, and then against transcriptomes of Crepidula fornicata (Henry et al. 2010; Romiguier et al. 2014) and Crepidula plana (Romiguier et al. 2014) from the Sequence Read Archive (NCBI). Outliers were removed from all subsequent analyses.

Population genetic diversity analyses

In order to investigate the change in genetic diversity across the species’ ranges, we calculated the expected heterozygosity (HE) and observed heterozygosity (HO) values, allelic richness as calculated using rarefied allelic counts (Ar), and FIS for each locus and population using the Hierfstat package, version 0.04-10 (Goudet 2005) in R 3.0.1. We calculated the number of private alleles in each population using GenAlEx (Peakall & Smouse 2006, 2012). To assess differences among populations, we calculated 95% confidence intervals for each of these statistics. Two populations were considered different from each other at the α = 0.05 level if their confidence intervals did not overlap.

We calculated the overall population structure (measured with FST) in the dataset using Hierfstat. Given the low number of SNPs recovered relative to the genome size, most loci should be independent, though we did not calculate linkage disequilibrium (D) between all pairs of loci. We used Adegenet, version 1.4-2 (Jombart et al. 2008) to calculate the pairwise FST values between all pairs of populations. Significance of pairwise FST values was determined using permutation tests with the program Arlequin, version 3.5 (Excoffier & Lischer 2010).

Population structure

To assess the amount of population structure present in the two species, we used a discriminant analysis of principal components (DAPC; Jombart et al. 2008, 2010). This is a clustering analysis that first performs a principal components analysis on the multilocus genotypes of the samples, then a discriminant analysis on the PC scores. The analysis finds the number of clusters that minimizes within-group variation and maximizes between-group variation (Jombart et al. 2010). We chose this program to analyze population structure because it does not assume that the population subgroups have any particular underlying structure (e.g. Hardy-Weinberg equilibrium) and therefore is more accurate than the program STRUCTURE (Pritchard et al. 2000) when the underlying structure departs from an island model (Jombart et al. 2010). DAPC uses k-means clustering and the Bayesian Information Criterion (BIC) to identify the optimal number of clusters in the data.

For each species, two clustering analyses were run, one using the population of origin (collecting site) as a prior in the model and one without this prior. The analysis using collecting site as a prior allowed us to visualize which geographic locations grouped together. The analysis run without collecting site as a prior relies on BIC to determine the best number of clusters in the data independent of geographic information: the investigator selects the number of clusters that generates the lowest BIC. DAPC was conducted with the Adegenet package, version 1.4-2 (Jombart et al. 2008) in R 3.0.1 (R Core Development Team 2013).
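The cluster-selection step described above (k-means on principal component scores, followed by choosing the number of clusters with the lowest BIC) can be illustrated with a minimal Python sketch. This is not the Adegenet implementation and was not part of the analyses reported here. It assumes that the input is already a set of PC scores, it uses a deterministic farthest-first initialisation instead of random starts, and it assumes a BIC of the form n ln(WSS/n) + k ln(n), where WSS is the within-cluster sum of squares. All function names and the toy data are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initial centres: start at the first point, then
    repeatedly add the point farthest from its nearest chosen centre.
    (An assumption of this sketch; random starts are more usual.)"""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_wss(points, k, max_iter=100):
    """Lloyd's k-means; returns (within-cluster sum of squares, labels)."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return wss, labels


def bic(n, k, wss):
    """Assumed BIC form: n * ln(WSS / n) + k * ln(n)."""
    return n * math.log(wss / n) + k * math.log(n)


def choose_k(points, k_max):
    """Run k-means for k = 1..k_max and return the k with the lowest BIC."""
    n = len(points)
    scores = {k: bic(n, k, kmeans_wss(points, k)[0]) for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores


# Hypothetical toy "PC scores": four tight groups of five individuals each.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
group_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
toy_scores = [(cx + dx, cy + dy) for cx, cy in group_centres for dx, dy in offsets]

best_k, bic_by_k = choose_k(toy_scores, k_max=6)
print(best_k, bic_by_k)
```

With real genotype data the BIC curve can be much flatter than in this toy case, so the lowest value may differ only slightly from its neighbours.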

Results

Quality control and SNP calls

The FastQC results showed a mean per-base quality score (Phred score) in C. convexa of 35, and a mean per-sequence quality score of 37. In C. fornicata, the mean per-base quality score was 35.10, and the mean per-sequence quality score was 38. After filtering the data as described above, there were 1903 loci in the C. fornicata dataset and 309 loci in the C. convexa dataset. The difference in the number of SNPs is due to low cutting rates with the restriction enzyme in C. convexa, resulting in a lower number of tags. All subsequent analyses were conducted with these reduced datasets. No orthologous loci were recovered between the two species.

FST Outliers

In C. convexa, the Bayescan analysis identified four loci as outliers (Fig. 2A). Two of these loci had posterior odds of 0.95-0.99 of being under selection, while two had posterior odds > 0.99 of being under selection. Three of the loci had no significant matches in any of the BLAST searches conducted. One locus, with an FST = 0.1956 (posterior odds = 0.9952), matched an unannotated sequence in the C. fornicata transcriptomes from Romiguier et al. (2014). These outlier loci were differentiated between the two marginal and the two central populations: the average pairwise FST between a northern and a southern population was 0.3334, whereas the average pairwise FST within those regions was 0.0384.

In C. fornicata, one locus was identified as an outlier (Fig. 2B), with posterior odds of 0.9589 of being under selection. There were no close matches for the sequence tag containing this SNP in any of the BLAST searches. Again, differentiation at this locus was much higher when comparing a marginal to a central population (average pairwise FST = 0.4302) than when comparing within regions (average pairwise FST = 0.0084).

Population genetic diversity analyses

The expected heterozygosity (HE) in the overall C. convexa dataset was 0.184, with an overall observed heterozygosity (HO) of 0.224. Each population also showed a higher observed heterozygosity than expected, though in many cases this difference was not significant. Mean HE values ranged from 0.176 to 0.205, and mean HO ranged from 0.187 to 0.271 (Table 2). This overall excess in heterozygotes corresponded to an overall FIS value of -0.220, calculated across all populations and loci. Within populations, the mean FIS ranged from -0.012 to -0.169 (Table 2). The proportion of polymorphic loci ranged from 68.8% (BA) to 86.7% (NJ). Mean allelic richness within populations ranged from 1.591 (BA) to 1.703 (NJ). Although marginal populations did not generally have lower allelic richness than central ones, the most central population, NJ, showed significantly higher allelic richness than all other populations in the study (Table 2, Fig. 3A). The same pattern was observed in the private alleles: the NJ population had a higher proportion of private alleles than the other three, which were not different from each other based on their 95% confidence intervals (Table 2, Fig. 3C).
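The relationship among the diversity statistics reported above can be made concrete with a short Python sketch. It is illustrative only and is not the Hierfstat code used in this study: it assumes biallelic SNPs coded as counts of one allele (0, 1 or 2, with None for missing data), uses the uncorrected expected heterozygosity 2pq, takes FIS = 1 - HO/HE, and uses the standard rarefaction expectation for allelic richness (see Kalinowski 2004). Dedicated packages typically apply sample-size corrections and average across loci differently, so their values will not match exactly. The function names and the toy genotypes are hypothetical.

```python
from math import comb


def locus_stats(genotypes):
    """Observed and expected heterozygosity at one biallelic locus.

    genotypes: list of 0/1/2 counts of one allele per individual,
    with None for individuals that were not genotyped.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)               # frequency of the counted allele
    h_obs = sum(1 for g in called if g == 1) / n
    h_exp = 2 * p * (1 - p)                 # 1 - p^2 - q^2, no sample-size correction
    return h_obs, h_exp


def f_is(h_obs, h_exp):
    """Inbreeding coefficient; negative values indicate heterozygote excess."""
    return 1 - h_obs / h_exp


def allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    total = sum(allele_counts)
    return sum(1 - comb(total - c, g) / comb(total, g) for c in allele_counts)


# Hypothetical genotypes for ten individuals at one SNP (one missing call).
toy = [0, 1, 1, 2, 1, None, 0, 1, 1, 1]
h_obs, h_exp = locus_stats(toy)
print(h_obs, h_exp, f_is(h_obs, h_exp))

# Allele counts for the same toy locus (10 and 8 copies), rarefied to 4 copies.
print(allelic_richness([10, 8], g=4))

# Applying the simple ratio to the overall C. convexa values reported above.
print(f_is(0.224, 0.184))
```

Applied to the overall C. convexa values (HO = 0.224, HE = 0.184), the simple ratio gives approximately -0.217, close to but not identical to the reported FIS of -0.220, as expected given the different estimators. The rarefaction function also shows why allelic richness at biallelic SNPs is bounded between 1 (monomorphic) and 2.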

The overall FST among all populations of C. convexa was equal to 0.0308, significantly different from zero (p = 0.024). Pairwise FST values ranged from 0.0198 (NA to BA) to 0.0388 (NA to NY) (Table 3). The average pairwise FST was 0.031. Only populations NA and BA were not significantly different from each other based on FST values (Table 3). Comparisons within the pair of marginal populations and within the pair of central populations (i.e. BA-NA and NY-NJ) yielded lower FST values than comparisons of central to marginal populations (Table 3).

In C. fornicata the overall HE was 0.121, and the overall HO was 0.130, calculated across all populations and loci. Within-population values of HE were again slightly lower than HO in all populations but NA, with mean HE ranging from 0.114 to 0.128, and mean HO ranging from 0.115 to 0.152 (Table 2). The overall FIS in the dataset, calculated across all populations and loci, was -0.067, with mean FIS within populations ranging from -0.085 to 0.070 (Table 2). Mean allelic richness within populations ranged from 1.452 (NS) to 1.575 (NY) (Table 2). Allelic richness generally increased moving from the margin to the center of the range, and NY (the most central population in this study) had a significantly higher allelic richness than the two marginal populations (Fig. 3B). The proportion of private alleles clearly declined from the center to the margins, with NY having the highest number of private alleles and NL the lowest (though NL and NS were not significantly different from each other based on 95% confidence intervals; Table 2, Fig. 3D).

The overall FST among all populations of C. fornicata was 0.040, significantly different from zero (p = 0.041), and the average pairwise FST was 0.040. Pairwise FST values ranged from 0.013 (NY to NA) to 0.061 (NA to NS). All populations were significantly distinct from each other based on the pairwise FST values (Table 3). As with C. convexa, comparisons between the two marginal populations (NS-NL) and two central populations (NY-NA) were lower than any comparisons between a central and a marginal population (Table 3).

Population structure

When sampling site was used as a prior in the DAPC clustering analysis of C. convexa, the analysis separated central and marginal populations along the first principal component axis (Fig. 4A). Populations NJ and NY were mostly overlapping, and NA and BA overlapped slightly. The second principal component axis separated NA from BA. The three principal eigenvectors of the discriminant analysis were of lengths 125.16, 34.17, and 18.83. When conducting the DAPC without using population of origin as a prior, the BIC with the lowest value corresponded to k = 4 genetic clusters. Each geographic population had a different proportion of individuals that belonged to each cluster (Fig. 5A), and in all cases the probability of membership to the assigned clusters was quite high (> 99% in 75 cases, > 75% in 84 cases out of 85 total individuals).

The DAPC analysis of C. fornicata when sampling site was used as a prior separated central from marginal populations. The NA and NY populations overlapped slightly on the first two principal component axes, with the populations from NS and NL distinct both from each other and from the more southern populations (Fig. 4B). The three principal eigenvectors of the discriminant analysis were of lengths 216.91, 106.72, and 36.17. When using DAPC to find the best-fit number of genetic clusters in the data, the lowest BIC value again corresponded to k = 4 clusters. The population from NL was distinct from any other population, and there was a clear difference between central populations (NA and NY) and northern, marginal ones (NL and NS). In particular, there were genetic clusters that clearly corresponded to northern and southern

Fairbairn DJ (1981) Biochemical genetic analysis of population differentiation in Greenland halibut (Reinhardtius hippoglossoides) from the northwest Atlantic, Gulf of St. Lawrence, and Bering Sea. Canadian Journal of Fisheries and Aquatic Sciences, 38, 669-677.
Foll M, Gaggiotti OE (2008) A genome scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics, 180, 977-993.
Gaston KJ (2003) The structure and dynamics of geographic ranges. Oxford University Press, Oxford.
Goff SA, Vaughn M, McKay S, Lyons E, Stapleton AE, Gessler D, et al. (2011) The iPlant collaborative: cyberinfrastructure for plant biology. Frontiers in Plant Science, 2, 1-16.
Goudet J (2005) Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5, 184-186.
Halbritter AH, Billeter R, Edwards PJ, Alexander JM (2015) Local adaptation at range edges: comparing elevation and latitudinal gradients. Journal of Evolutionary Biology, 28, 1849-1860.
Hamlin JAP, Arnold ML (2014) Determining population structure and hybridization for two iris species. Ecology and Evolution, 4, 743-755.
Henry JJ, Perry KJ, Fukui L, Alvi N (2010) Differential localization of mRNAs during early development in the mollusc, Crepidula fornicata. Integrative and Comparative Biology, 50, 720-733.
Hoagland KE (1985) Genetic relationships between one British and several North American populations of Crepidula fornicata based on allozyme studies (Gastropoda: Calyptraeidae). Journal of Molluscan Studies, 51, 177-182.
Holt RD, Gomulkiewicz R (1997) How does immigration influence local adaptation? A reexamination of a familiar paradigm. American Naturalist, 149, 563.
Johannesson K (1988) The paradox of Rockall: why is a brooding gastropod (Littorina saxatilis) more widespread than one having a planktonic larval dispersal stage (L. littorea)? Marine Biology, 99, 507-513.
Jombart T (2008) Adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics, 24, 1403-1405.
Jombart T, Devillard S, Balloux F (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics, 11, 94.
Kalinowski ST (2004) Counting alleles with rarefaction: private alleles and hierarchical sampling designs. Conservation Genetics, 5, 539-543.
Kawecki TJ (2008) Adaptation to marginal habitats. Annual Review of Ecology Evolution Systematics, 39, 321-342.
Kirkpatrick M, Barton NH (1997) Evolution of a species’ range. American Naturalist, 150, 1-23.
Le Cam S, Riquet F, Pechenik JA, Viard F (2014) Paternity and gregariousness in the sex-changing sessile marine gastropod Crepidula convexa: comparison with other protandrous Crepidula species. Journal of Heredity, 105, 397-406.
Lee HJ, Boulding EG (2009) Spatial and temporal population genetic structure of four northeastern Pacific littorinid gastropods: the effect of mode of larval development on variation at one mitochondrial and two nuclear DNA markers. Molecular Ecology, 18, 2165-2184.
Lenormand T (2002) Gene flow and the limits to natural selection. Trends in Ecology and Evolution, 17, 183-189.

Rawlings TA, Aker JM, Brunel P (2011) Clarifying the northern distributional limits of the slipper limpet Crepidula fornicata in the northwestern Atlantic. American Malacological Bulletin, 29, 105-119.
Riquet F, Daguin-Thiébaut C, Ballenghien M, Bierne N, Viard F (2013) Contrasting patterns of genome-wide polymorphism in the native and invasive range of the marine mollusc Crepidula fornicata. Molecular Ecology, 22, 1003-1018.
Romiguier J, Gayral P, Ballenghien M, Bernard A, Cahais V, Chenuil A, Chiari Y, Dernat R, Duret L, Faivre N, Loire E, Lourenco JM, Nabholz B, Roux C, Tsagkogeorga A, Weber AA-T, Weinert LA, Belkhir K, Bierne N, Glémin S, Galtier N (2014) Comparing population genomics in animals uncovers the determinants of genetic diversity. Nature, 515, 261-263.
Saupe EE, Hendricks JR, Peterson AT, Lieberman BS (2014) Climate change and marine molluscs of the western North Atlantic: future prospects and perils. Journal of Biogeography, 41, 1352-1366.
Scheltema RS (1986) On dispersal and planktonic larvae of benthic invertebrates: an eclectic overview and summary of problems. Bulletin of Marine Science, 39, 290-322.
Sexton JP, McIntyre PJ, Angert AL, Rice KJ (2009) Evolution and ecology of species range limits. Annual Review of Ecology Evolution Systematics, 40, 415-436.
Sexton JP, Strauss SY, Rice KJ (2011) Gene flow increases fitness at the warm edge of a species’ range. Proceedings of the National Academy of Sciences of the United States of America, 108, 11704-11709.
Shanks AL (2009) Pelagic larval duration and dispersal distance revisited. Biological Bulletin, 216, 373-385.
Thomas CD (2010) Climate, climate change, and range boundaries. Diversity and Distributions, 16, 488-495.
Wares JP (2002) Community genetics in the northwest Atlantic intertidal. Molecular Ecology, 11, 1131-1144.
Wares JP, Cunningham CW (2001) Phylogeography and historical ecology of the North Atlantic intertidal. Evolution, 55, 2455-2469.
Wu YS, Tang CL (2011) Atlas of ocean currents in eastern Canadian waters. Canadian Technical Report of Hydrography and Ocean Sciences 271. Fisheries and Oceans Canada.

Data Accessibility

Demultiplexed Illumina reads can be found in the Sequence Read Archive at NCBI (SRP study accession number SRP058970). SNP calls for the whole dataset, the filtered set of SNPs, input and output files for Bayescan, input files for DAPC, estimates of pairwise genetic and geographic differentiation, and GPS coordinates for collection localities have been deposited in DataDryad (doi:10.5061/dryad.ct5q3).

This article is protected by copyright. All rights reserved.

Accepted Article

identified in C. convexa. This is consistent with the lack of reduction in diversity relative to other populations found in NA using microsatellite markers (Cahill & Viard 2014). However, both allelic richness and the proportion of private alleles were significantly higher in the NJ population than the other populations in the study (Fig. 3A,C). This population was also the most centrally located relative to the species’ geographic range (Fig. 1). No other markers have been used to assess the genetic variation in the NJ population, so we could not compare the high variation here to other data. Since NJ was the southernmost population in the current dataset, we also could not determine if high genetic variation is only found at this site, or if variation is as high or higher in other central or more southern populations.

The levels of genetic variation in C. fornicata revealed a similar pattern. Although the northernmost population (NL) showed high levels of genetic diversity when considering HE, HO, or proportion of polymorphic loci, there was a significant reduction in allelic richness in marginal populations relative to NY (Fig. 3B). Allelic richness takes different sample sizes into account, and therefore is a more sensitive measure of changes in genetic diversity, especially when considering rare alleles (Kalinowski 2004). This difference was relatively small, but significant and detectable with the number of SNPs recovered in this study. The proportion of private alleles also decreased from the center to the margin of the range (Fig. 3D). The NY population is the most centrally located in this study relative to the species’ geographic range.

Population structure in Crepidula species

The DAPC analysis identified four distinct genetic clusters in C. convexa, and separated the four collecting sites used in this study (Fig. 5A). This corresponds to previous analyses of genetic structure in C. convexa (Collin 2001; Cahill & Viard 2014), and is expected given the direct development of this species and its correspondingly low dispersal distance. However, the snails collected in NY and NJ had genotypes that largely overlap on the first two principal component axes (Fig. 4A). PC1 separates the populations across Cape Cod (Fig. 1). Cape Cod is known to be a biogeographic barrier for many marine invertebrates due to oceanic current structure, changes in temperature, and differences in available habitat (Wares 2002; Pappalardo et al. 2015), but a microsatellite analysis did not show that it is a dispersal barrier for C. convexa (Cahill & Viard 2014). Although PC1 in Fig. 4A appears to separate populations that are on either side of Cape Cod, this pattern is also consistent with populations that are closer together being more tightly clustered than those that are farther apart. Untangling how much of this difference is due to selection for environmental variation on either side of the Cape versus a pattern of isolation-by-distance or other factors will require further study.

The pattern observed in C. fornicata was quite different. The DAPC analysis identified four distinct genetic clusters within the data (Fig. 5B). The central populations (NY and NA) overlapped slightly (Fig. 4B) and shared two genetic clusters (Fig. 5B). This matched previous studies of C. fornicata with other marker sets (COI, Collin 2001; microsatellites and AFLPs, Riquet et al. 2013), which showed low levels of differentiation between these two sites (e.g. FST = 0.013 with 17 microsatellites or 0.012 with 327 AFLPs, Riquet et al. 2013). However, the populations sampled near the range margin (NS and NL) were very different from the central ones (Fig. 4B, 5B).

The NS population of C. fornicata, though it shared some similarities with those further south, was also distinct from them (Fig. 5B). The differences may be largely explained by the geographic distance between populations. The NS collecting site is located approximately 1100 km from the next site south (NA). This distance is the same as that between NA and Chesapeake, Virginia (USA), populations which displayed an FST of 0.022 when analyzed with 17 microsatellite markers (Riquet et al. 2013). Additionally, the oceanography and water movement between NA and NS, particularly circulation patterns in the Gulf of Maine (Fig. 1) and the Bay of Fundy (Miller et al. 1998), may enhance isolation, making the degree of isolation larger than just the effect of the absolute geographic distance between them. Further analyses and finer-scale sampling between NA and NS are required to assess the amount of genetic differentiation that can be explained by geographic distance versus other mechanisms such as selection.

The NL population was distinct from all other populations of C. fornicata (Fig. 4B), though it did share genetic clusters with the other marginal population, NS (Fig. 5B). Although NL and NS are only 300 km apart, a distance over which C. fornicata normally displays very little genetic differentiation (Collin 2001; Riquet et al. 2013), the water circulation patterns in this area likely enhance isolation. The NS population is found to the east and south of the Cabot Strait, on the Atlantic coast of Nova Scotia. The NL population, from southwestern Newfoundland, is located to the northwest of this strait, inside the Gulf of St. Lawrence (Fig. 1). The Cabot Strait is a deep channel (> 200 m) of fast-moving water at all depths. Water moves from west to east, out of the Gulf of St. Lawrence (Wu & Tang 2011). It is very likely that the planktonic larvae of C. fornicata are unable to move from east to west across this current, and that gene flow would therefore be nearly unidirectional. This would account for both the distinctness of the NL population and the shared genetic clusters with NS, as there is probably some dispersal from NL to NS and points south.


The circulation within the Gulf of St. Lawrence might allow for genetic exchange between the NL population defined in this study and other populations within the Gulf. Crepidula fornicata populations are known from several locations in the Gulf (Rawlings et al. 2011). Further sampling should include individuals from these areas to understand genetic exchange within the Gulf of St. Lawrence. Other marine species have shown similar patterns of genetic structure in the Gulf of St. Lawrence (i.e. striking differences between populations within the Gulf and on the Atlantic coast: halibut, Fairbairn 1981; hard clams, Dillon & Manzi 1992; calanoid copepods, Bucklin et al. 1996; silverside fish, Mach et al. 2011).

Further work in these species should also take historical gene flow and various demographic scenarios into account. Many marine invertebrates show a pattern of expansion from southern refugia following the Pleistocene glaciation (e.g. Wares & Cunningham 2001). A phylogenetic analysis of COI in C. fornicata did not show a pattern consistent with post-Pleistocene expansion (i.e. northern haplotypes were not nested within more southern haplotypes; Collin 2001). Both Crepidula species can exist subtidally, and therefore may show different historical patterns from species that only exist intertidally. However, evidence for current gene flow does not rule out a role for past gene flow in structuring these populations, and this possibility should be investigated further.

Conclusions

Genotyping-by-sequencing technology generated a large library of polymorphic SNPs that could be used to analyze genetic variation and structure across populations of C. convexa and C. fornicata. Both species showed a decrease in genetic variation at the range margin relative to the most central populations, despite the different dispersal types of the species. Additionally, we did not find strong evidence for selection acting on these markers. However, we did find that in both species, marginal populations are differentiated from central ones. Although the causes of the northern range limits in these Crepidula species are undoubtedly complex, future work should investigate the scale over which genetic variation declines in these species, and if this decline has functional consequences.

Acknowledgements

We thank D. Padilla, D. Futuyma, W. Eanes, and F. Viard for help during project development and writing, E. Rollinson for help with figures, and M.E. Lauterbur, M. Lim, J. Rollins, and three anonymous reviewers for helpful comments on previous versions of the manuscript. The Eanes and Collier labs, as well as O. Warsi, provided help in the lab. Thanks to the Institute for Genomic Diversity at Cornell University, especially S. Mitchell. We received collection help from P. Sargent and T. Rawlings. This project was funded by a National Science Foundation Doctoral Dissertation Improvement Grant (award number # 1209531). This is contribution # 1242 of the Department of Ecology and Evolution, Stony Brook University.

References

Andrews S (2010) FastQC: a quality control tool for high throughput sequence data. Available at http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Angert AL, Crozier LG, Rissler LJ, Gilman SE, Tewksbury JJ, Chunco AJ (2011) Do species’ traits predict recent shifts at expanding range edges? Ecology Letters, 14, 677-689.
Bierne N, Roze D, Welch JJ (2013) Pervasive selection or is it…? Why are FST outliers sometimes so frequent? Molecular Ecology, 22, 2061-2064.
Bourne EC, Bocedi G, Travis JMJ, Pakeman RJ, Brooker RW, Schiffers K (2014) Between migration load and evolutionary rescue: dispersal, adaptation and the response of spatially structured populations to environmental change. Proceedings of the Royal Society B, 281, 20132795.
Bridle JR, Vines TH (2007) Limits to evolution at range margins: when and why does adaptation fail? Trends in Ecology and Evolution, 22, 140-147.
Bridle JR, Polechová J, Kawata M, Butlin RK (2010) Why is adaptation prevented at ecological margins? New insights from individual-based simulations. Ecology Letters, 13, 485-494.
Bridle JR, Buckley J, Bodsworth EJ, Thomas CD (2014) Evolution on the move: specialization on widespread resources associated with rapid range expansion in response to climate change. Proceedings of the Royal Society B, 281, 20131800.
Bucklin A, Sundt RC, Dahle G (1996) The population genetics of Calanus finmarchicus in the North Atlantic. Ophelia, 44, 29-45.
Cahill AE, Aiello-Lammens ME, Fisher-Reid MC, Hua X, Karanewsky CJ, Ryu HY, Sbeglia GC, Spagnolo F, Waldron JB, Wiens JJ (2014) Causes of warm-edge range limits: systematic review, proximate factors, and implications for climate change. Journal of Biogeography, 41, 429-442.
Cahill AE, Viard F (2014) Genetic structure in native and non-native populations of the direct-developing gastropod Crepidula convexa. Marine Biology, 161, 2433-2443.
Case TJ, Taper ML (2000) Interspecific competition, environmental gradients, gene flow, and the coevolution of species’ borders. American Naturalist, 155, 583-605.
Collin R (2001) The effects of mode of development on phylogeography and population structure of North Atlantic Crepidula (Gastropoda: Calyptraeidae). Molecular Ecology, 10, 2249-2262.
Collin R (2002) Another last word on Crepidula convexa with a description of C. ustulatulina n. sp. (Gastropoda: Calyptraeidae) from the Gulf of Mexico and southern Florida. Bulletin of Marine Science, 70, 177-184.
Collin R (2003) Worldwide patterns in mode of development in calyptraeid gastropods. Marine Ecology Progress Series, 247, 103-122.
Cowen RK, Sponaugle S (2009) Larval dispersal and marine population connectivity. Annual Review of Marine Science, 1, 443-466.
Dillon RT, Manzi JJ (1992) Population genetics of the hard clam, Mercenaria mercenaria, at the northern limit of its range. Canadian Journal of Fisheries and Aquatic Sciences, 49, 2574-2578.
Dupont L, Richard J, Paulet Y-M, Thouzeau G, Viard F (2006) Gregariousness and protandry promote reproductive insurance in the invasive gastropod Crepidula fornicata: evidence from assignments of larval paternity. Molecular Ecology, 15, 3009-3021.
Eckert CG, Samis KE, Lougheed SC (2008) Genetic variation across species’ geographical ranges: the central-marginal hypothesis and beyond. Molecular Ecology, 17, 1170-1188.
Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE, 6, e19379.
Excoffier L, Lischer HE (2010) Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567.


Fairbairn DJ (1981) Biochemical genetic analysis of population differentiation in Greenland halibut (Reinhardtius hippoglossoides) from the northwest Atlantic, Gulf of St. Lawrence, and Bering Sea. Canadian Journal of Fisheries and Aquatic Sciences, 38, 669-677.
Foll M, Gaggiotti OE (2008) A genome scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics, 180, 977-993.
Gaston KJ (2003) The structure and dynamics of geographic ranges. Oxford University Press, Oxford.
Goff SA, Vaughn M, McKay S, Lyons E, Stapleton AE, Gessler D, et al. (2011) The iPlant collaborative: cyberinfrastructure for plant biology. Frontiers in Plant Science, 2, 1-16.
Goudet J (2005) Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5, 184-186.
Halbritter AH, Billeter R, Edwards PJ, Alexander JM (2015) Local adaptation at range edges: comparing elevation and latitudinal gradients. Journal of Evolutionary Biology, 28, 1849-1860.
Hamlin JAP, Arnold ML (2014) Determining population structure and hybridization for two iris species. Ecology and Evolution, 4, 743-755.
Henry JJ, Perry KJ, Fukui L, Alvi N (2010) Differential localization of mRNAs during early development in the mollusc, Crepidula fornicata. Integrative and Comparative Biology, 50, 720-733.
Hoagland KE (1985) Genetic relationships between one British and several North American populations of Crepidula fornicata based on allozyme studies (Gastropoda: Calyptraeidae). Journal of Molluscan Studies, 51, 177-182.
Holt RD, Gomulkiewicz R (1997) How does immigration influence local adaptation? A reexamination of a familiar paradigm. American Naturalist, 149, 563.
Johannesson K (1988) The paradox of Rockall: why is a brooding gastropod (Littorina saxatilis) more widespread than one having a planktonic larval dispersal stage (L. littorea)? Marine Biology, 99, 507-513.
Jombart T (2008) Adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics, 24, 1403-1405.
Jombart T, Devillard S, Balloux F (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics, 11, 94.
Kalinowski ST (2004) Counting alleles with rarefaction: private alleles and hierarchical sampling designs. Conservation Genetics, 5, 539-543.
Kawecki TJ (2008) Adaptation to marginal habitats. Annual Review of Ecology, Evolution, and Systematics, 39, 321-342.
Kirkpatrick M, Barton NH (1997) Evolution of a species’ range. American Naturalist, 150, 1-23.
Le Cam S, Riquet F, Pechenik JA, Viard F (2014) Paternity and gregariousness in the sex-changing sessile marine gastropod Crepidula convexa: comparison with other protandrous Crepidula species. Journal of Heredity, 105, 397-406.
Lee HJ, Boulding EG (2009) Spatial and temporal population genetic structure of four northeastern Pacific littorinid gastropods: the effect of mode of larval development on variation at one mitochondrial and two nuclear DNA markers. Molecular Ecology, 18, 2165-2184.
Lenormand T (2002) Gene flow and the limits to natural selection. Trends in Ecology and Evolution, 17, 183-189.


Libertini A, Vitturi R, Gregorini A, Colomba M (2009) Karyotypes, banding patterns, and nuclear DNA content in Crepidula unguiformis Lamarck, 1822, and Naticarius stercusmuscarum (Gmelin, 1791) (Mollusca, Caenogastropoda). Malacologia, 51, 111-118.
Lira-Noriega A, Manthey JD (2014) Relationship of genetic diversity and niche centrality: a survey and analysis. Evolution, 68, 1082-1093.
Lotterhos KE, Whitlock MC (2014) Evaluation of demographic history and neutral parameterization on the performance of Fst outlier tests. Molecular Ecology, 23, 2178-2192.
Lu F, Lipka AE, Glaubitz J, Elshire R, Cherney JH, Casler MD, Buckler ES, Costich DE (2013) Switchgrass genomic diversity, ploidy, and evolution: novel insights from a network-based SNP discovery protocol. PLoS Genetics, 9, e1003215.
Mach ME, Sbrocco EJ, Hice LA, Duffy TA, Conover DO, Barber PH (2011) Regional differentiation and post-glacial expansion of the Atlantic silverside, Menidia menidia, an annual fish with high dispersal potential. Marine Biology, 158, 515-530.
Micheletti SJ, Storfer A (2015) A test of the central-marginal hypothesis using population genetics and ecological niche modelling in an endemic salamander (Ambystoma barbouri). Molecular Ecology, 24, 967-979.
Miller CB, Lynch DR, Carlotti F, Gentleman W, Lewis CVW (1998) Coupling of an individual-based population dynamic model of Calanus finmarchicus to a circulation model for the Georges Bank region. Fisheries Oceanography, 7, 219-234.
Moeller DA, Geber MA, Tiffin P (2011) Population genetics and the evolution of range limits in an annual plant. American Naturalist, 178, S44-S61.
Pappalardo P, Pringle JM, Wares JP, Byers JE (2015) The location, strength, and mechanisms behind marine biogeographic boundaries of the east coast of North America. Ecography, 38, 722-731.
Parmesan C (2006) Ecological and evolutionary responses to recent climate change. Annual Review of Ecology, Evolution, and Systematics, 37, 637-669.
Parmesan C, Yohe G (2003) A globally coherent fingerprint of climate change impacts across natural systems. Nature, 421, 37-42.
Peakall R, Smouse PE (2006) GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288-295.
Peakall R, Smouse PE (2012) GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research – an update. Bioinformatics, 28, 2537-2539.
Perry AL, Low PJ, Ellis JR, Reynolds JD (2005) Climate change and distribution shifts in marine species. Science, 308, 1912-1915.
Polechová J, Barton NH (2015) Limits to adaptation along environmental gradients. Proceedings of the National Academy of Sciences of the United States of America, 112, 6401-6406.
Price TD, Kirkpatrick M (2009) Evolutionarily stable range limits set by interspecific competition. Proceedings of the Royal Society B, 276, 1429-1434.
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945-959.
R Core Team (2013) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/


Rawlings TA, Aker JM, Brunel P (2011) Clarifying the northern distributional limits of the slipper limpet Crepidula fornicata in the northwestern Atlantic. American Malacological Bulletin, 29, 105-119.
Riquet F, Daguin-Thiébaut C, Ballenghien M, Bierne N, Viard F (2013) Contrasting patterns of genome-wide polymorphism in the native and invasive range of the marine mollusc Crepidula fornicata. Molecular Ecology, 22, 1003-1018.
Romiguier J, Gayral P, Ballenghien M, Bernard A, Cahais V, Chenuil A, Chiari Y, Dernat R, Duret L, Faivre N, Loire E, Lourenco JM, Nabholz B, Roux C, Tsagkogeorga A, Weber AA-T, Weinert LA, Belkhir K, Bierne N, Glémin S, Galtier N (2014) Comparing population genomics in animals uncovers the determinants of genetic diversity. Nature, 515, 261-263.
Saupe EE, Hendricks JR, Peterson AT, Lieberman BS (2014) Climate change and marine molluscs of the western North Atlantic: future prospects and perils. Journal of Biogeography, 41, 1352-1366.
Scheltema RS (1986) On dispersal and planktonic larvae of benthic invertebrates: an eclectic overview and summary of problems. Bulletin of Marine Science, 39, 290-322.
Sexton JP, McIntyre PJ, Angert AL, Rice KJ (2009) Evolution and ecology of species range limits. Annual Review of Ecology, Evolution, and Systematics, 40, 415-436.
Sexton JP, Strauss SY, Rice KJ (2011) Gene flow increases fitness at the warm edge of a species’ range. Proceedings of the National Academy of Sciences of the United States of America, 108, 11704-11709.
Shanks AL (2009) Pelagic larval duration and dispersal distance revisited. Biological Bulletin, 216, 373-385.
Thomas CD (2010) Climate, climate change, and range boundaries. Diversity and Distributions, 16, 488-495.
Wares JP (2002) Community genetics in the northwest Atlantic intertidal. Molecular Ecology, 11, 1131-1144.
Wares JP, Cunningham CW (2001) Phylogeography and historical ecology of the North Atlantic intertidal. Evolution, 55, 2455-2469.
Wu YS, Tang CL (2011) Atlas of ocean currents in eastern Canadian waters. Canadian Technical Report of Hydrography and Ocean Sciences 271. Fisheries and Oceans Canada.

Data Accessibility

Demultiplexed Illumina reads can be found in the Sequence Read Archive at NCBI (SRP study accession number SRP058970). SNP calls for the whole dataset, the filtered set of SNPs, input and output files for Bayescan, input files for DAPC, estimates of pairwise genetic and geographic differentiation, and GPS coordinates for collection localities have been deposited in DataDryad (doi:10.5061/dryad.ct5q3).


Author Contributions

AEC and JSL designed the research. AEC performed the research and analyzed the data. AEC and JSL wrote the paper.

Table 1. Collection information. Collection sites, dates, and sample sizes for four populations each of Crepidula convexa and C. fornicata, used for high-throughput sequencing.

Species       Site  Location                     Center or Margin?  Coordinates                     Collection date
C. convexa    NA    Nahant, Massachusetts        M                  42°26'11"N 70°56'20"W           September 2012
C. convexa    BA    Barnstable, Massachusetts    M                  41°42'33"N 70°17'54"W           July 2013
C. convexa    NY    Northport, New York          C                  40°56'13"N 73°08'44"W           August 2013
C. convexa    NJ    Sandy Hook, New Jersey       C                  40°26'50"N 73°59'44"W           October 2012
C. fornicata  NL    Port Saunders, Newfoundland  M                  50°38'49.35"N 57°16'34.40"W     September 2013
C. fornicata  NS    Main-à-Dieu, Nova Scotia     M                  46°00'20"N 59°50'14"W           July 2013
C. fornicata  NA    Nahant, Massachusetts        C                  42°26'11"N 70°56'20"W           July 2013
C. fornicata  NY    Northport, New York          C                  40°56'13"N 73°08'44"W           August 2013

Table 2. Population genetic statistics. Number of successfully sequenced individuals after data filtering (N), average expected heterozygosity (HE), observed heterozygosity (HO), allelic richness (Ar), FIS values, the proportion of polymorphic SNP loci, and the proportion of loci with private alleles for each population of Crepidula convexa and C. fornicata, presented with 95% confidence intervals. Analyses were conducted separately for each species using SNPs that were sampled at more than 75% of individuals (i.e. 72 or more) and with an average of 10x coverage per individual, and after outliers were removed. This equated to 305 SNPs in C. convexa and 1902 SNPs in C. fornicata. HE, HO, Ar, and FIS were averaged across all loci.

Species       Population  Center or Margin?  N   HE             HO             Ar             FIS              Proportion polymorphic loci  Proportion of loci with private alleles
C. convexa    NA          M                  23  0.184 ± 0.020  0.229 ± 0.030  1.601 ± 0.020  -0.151 ± 0.031   0.746 ± 0.0485               0.036 ± 0.021
C. convexa    BA          M                  19  0.178 ± 0.021  0.212 ± 0.028  1.591 ± 0.020  -0.110 ± 0.030   0.688 ± 0.0517               0.026 ± 0.018
C. convexa    NY          C                  18  0.176 ± 0.020  0.187 ± 0.025  1.606 ± 0.020  -0.012 ± 0.033   0.698 ± 0.0512               0.030 ± 0.019
C. convexa    NJ          C                  25  0.205 ± 0.020  0.271 ± 0.032  1.703 ± 0.016  -0.169 ± 0.032   0.867 ± 0.0379               0.118 ± 0.036
C. fornicata  NL          M                  21  0.128 ± 0.007  0.152 ± 0.010  1.513 ± 0.020  -0.085 ± 0.011   0.789 ± 0.0183               0.026 ± 0.007
C. fornicata  NS          M                  24  0.114 ± 0.007  0.120 ± 0.008  1.452 ± 0.020  0.00 ± 0.012     0.613 ± 0.0219               0.030 ± 0.008
C. fornicata  NA          C                  21  0.120 ± 0.006  0.115 ± 0.008  1.549 ± 0.019  0.070 ± 0.014    0.700 ± 0.0206               0.067 ± 0.011
C. fornicata  NY          C                  21  0.123 ± 0.007  0.131 ± 0.009  1.575 ± 0.017  0.011 ± 0.013    0.825 ± 0.0171               0.131 ± 0.015
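To illustrate how rarefaction makes allelic richness (Ar) comparable across populations with different sample sizes (Kalinowski 2004), the following is a minimal Python sketch rather than the software used in this study. The function names, the example allele counts, and the subsample size g are hypothetical and chosen only for illustration. For a biallelic SNP the rarefied value lies between 1 and 2, which is the scale of the Ar column in Table 2.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts: observed copies of each allele at one locus in one population.
    g: common subsample size (gene copies), no larger than the total count.
    """
    n = sum(allele_counts)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the number of sampled gene copies")
    total = comb(n, g)
    # For each allele, 1 - P(allele is absent from a subsample of size g).
    return sum(1 - comb(n - c, g) / total for c in allele_counts if c > 0)


def mean_richness(loci, g):
    """Average rarefied richness across loci (one tuple of allele counts per locus)."""
    return sum(rarefied_allelic_richness(counts, g) for counts in loci) / len(loci)


# Hypothetical data: two SNPs scored in a smaller and a larger sample.
pop_small = [(30, 6), (36, 0)]  # 18 diploid individuals = 36 gene copies
pop_large = [(42, 8), (49, 1)]  # 25 diploid individuals = 50 gene copies
g = 36                          # rarefy both populations to the smaller sample

print(mean_richness(pop_small, g), mean_richness(pop_large, g))
```

Because both populations are evaluated at the same g, a rare allele that is detected only because more individuals were sequenced contributes less than one full allele to the larger sample's richness.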

Table 3. Pairwise FST values. Pairwise FST values for four populations of Crepidula convexa (top) and C. fornicata (bottom) calculated using a set of 305 SNPs (C. convexa) or 1902 SNPs (C. fornicata). Sites are arranged from north to south. Comparisons that are significant at p < 0.05 are in bold.

C. convexa
      NA      BA      NY      NJ
NA
BA    0.0198
NY    0.0388  0.0318
NJ    0.0348  0.0349  0.0240

C. fornicata
      NL      NS      NA      NY
NL
NS    0.0171
NA    0.0457  0.0609
NY    0.0477  0.0539  0.0129
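As a rough illustration of what a pairwise FST such as those in Table 3 measures, the sketch below computes a simple heterozygosity-based value, (HT - HS) / HT, for biallelic SNPs from allele frequencies in two populations, summing the numerator and denominator across loci. This is a simplification under stated assumptions: it weights the two populations equally, applies no sample-size correction, and is not necessarily the estimator implemented in the software used for this study. The function name and the example frequencies are hypothetical.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Heterozygosity-based FST between two populations for biallelic loci.

    freqs_a, freqs_b: frequency of the same reference allele at each locus
    in population A and population B (equal-length sequences).
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need a frequency for every locus")
    numerator = 0.0
    denominator = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        # Mean expected heterozygosity within the two populations.
        hs = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2
        # Expected heterozygosity of the pooled (equally weighted) populations.
        p_bar = (pa + pb) / 2
        ht = 2 * p_bar * (1 - p_bar)
        numerator += ht - hs
        denominator += ht
    if denominator == 0:
        raise ValueError("no polymorphic loci: FST is undefined")
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs in two populations.
central = [0.40, 0.10, 0.75]
marginal = [0.60, 0.05, 0.70]
print(pairwise_fst(central, marginal))
```

Values near 0 indicate that nearly all variation is shared between the two populations, and a value of 1 indicates fixed differences at every locus.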


Fig. 1 Ranges of Crepidula spp. on the Atlantic coast of North America. Left: the latitudinal ranges of Crepidula convexa (gray line) and C. fornicata (black line). Range data from Collin (2001) and Rawlings et al. (2011). Right (inset): sampling sites and oceanographic features in this study. Black dots represent collecting locations of C. fornicata and white dots represent collecting locations of C. convexa. Site abbreviations correspond to those in Table 1. Letters indicate important oceanographic features: A = Gulf of St. Lawrence, B = Cabot Strait, C = Gulf of Maine, D = Cape Cod. Scale bar indicates 330 km.

Fig. 2 Outlier loci in Crepidula species. Results of Bayescan analysis to identify SNPs with high FST values in (A) C. convexa and (B) C. fornicata. The x-axis represents the log of the posterior odds (q value). The dashed line and the solid line in both plots represent the thresholds for posterior odds for α ≠ 0 of 0.95 and 0.99, respectively.

Fig. 3 Genetic variation in Crepidula populations. Panels A and B: allelic richness in central and marginal populations of Crepidula convexa (A) and C. fornicata (B). Bars indicate averages and error bars indicate 95% confidence intervals. Panels C and D: proportion of private alleles in central and marginal populations of C. convexa (C) and C. fornicata (D). Bars indicate proportion of private alleles and error bars indicate 95% confidence intervals. Letters indicate significant differences at the p < 0.05 level, as assessed using the 95% confidence intervals. Sites are arranged from north to south along the x-axis, and abbreviations correspond to those in Table 1.

Fig. 4 Discriminant analysis of principal components. Scatter plot of a discriminant analysis of principal components for (A) Crepidula convexa and (B) C. fornicata when sampling sites were used as priors in the analysis. Each color represents a different genetic cluster, and site abbreviations that correspond to those in Table 1 are provided for each cluster. The 67% inertial ellipses around each cluster represent the variance of the two PCs depicted. The insets represent the relative magnitude of the eigenvalues of the first three principal components.

Fig. 5 Clustering of individuals. Proportions of individuals from four populations of (A) Crepidula convexa and (B) C. fornicata assigned to different genetic groups (k = 4 in both panels) in a discriminant analysis of principal components when the analysis did not use sampling site as a prior. Each of the genetic clusters is indicated by a different shade of grey, and each species was analyzed separately (i.e. the black cluster in panel A is unrelated to the black cluster in panel B). Sites are arranged from north to south along the x-axis, and abbreviations correspond to those in Table 1.
