Biochemical Systematics and Ecology 57 (2014) 357–362


Biochemical Systematics and Ecology journal homepage: www.elsevier.com/locate/biochemsyseco

Isolation and characterization of polymorphic microsatellite loci in Indian major carp, Catla catla using next-generation sequencing platform

B.P. Sahu a, L. Sahoo a, C.G. Joshi b, P. Mohanty a, J.K. Sundaray a, P. Jayasankar a, P. Das a,*

a Fish Genetics and Biotechnology Division, Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar 751002, Odisha, India
b Anand Agricultural University, Anand 388110, Gujarat, India

Article info

Article history: Received 6 June 2014; Accepted 20 September 2014; Available online

Keywords: Catla catla; Microsatellite; Next generation sequencing; Ion Torrent

Abstract

Catla catla, the second most important Indian major carp, is gaining popularity among Indian fish farmers due to its high growth rate and consumer preference. Simple sequence repeats (SSRs) are rapidly evolving, versatile, co-dominant and highly informative molecular markers used in genetic research. However, the time and cost involved in developing such resources have limited their extensive use. The advent of massively parallel sequencing technology has considerably eased these limitations. In the present investigation, we used the Ion Torrent sequencing platform to identify potentially amplifiable microsatellite loci for catla. A modest sequencing volume generated approximately 5.7 MB of sequence data. Out of 29,794 sequences generated, 21,477 contained simple sequence repeats. Only 81 sequences had enough flanking sequence for primer design. Out of 81 loci, 51 were successfully PCR amplified in a panel of five unrelated individuals. Out of 15 loci randomly checked for polymorphism, 13 loci were polymorphic, with allele numbers ranging from 3 to 6, and two loci were monomorphic. The observed and expected heterozygosities ranged from 0.565 to 0.870 and from 0.483 to 0.804, respectively. These markers will be useful for studying the genetics of wild populations and for breeding programs of C. catla and closely related species. © 2014 Published by Elsevier Ltd.

1. Introduction

Freshwater aquaculture in India is principally based on Indian major carps (Labeo rohita, Catla catla and Cirrhinus mrigala) with an annual total production of 3.2 million tons, contributing more than 90% of the total national aquaculture production (FAO, 2012). C. catla, known as catla, is widely distributed across India, Pakistan, Bangladesh, Nepal and Myanmar (Reddy, 2005). The natural population of catla is declining due to overfishing, pollution and habitat alteration (Reddy, 2005). To protect the genetic resources of this species and to carry out genetic improvement schemes, stock characterization and delineation of population structure are necessary. However, very few genetic resources are currently available for this species to address these problems.

* Corresponding author. Tel.: +91 674 2465446; fax: +91 674 2465407. E-mail address: [email protected] (P. Das). http://dx.doi.org/10.1016/j.bse.2014.09.010 0305-1978/© 2014 Published by Elsevier Ltd.


Microsatellites have emerged as one of the most popular genetic markers for a wide range of applications in population genetics, conservation biology and evolutionary biology. However, the major drawback of microsatellite markers in the past has been the high cost of developing species-specific markers (Castoe et al., 2009). This drawback has been alleviated, in terms of both cost and time, by the advent of next-generation sequencing, which makes the detection and characterization of SSR loci easily achievable with simple bioinformatics approaches (Abdelkrim et al., 2009). The random sequencing-based approach to identifying microsatellites is rapid and cost-effective, and can identify a large set of useful microsatellite loci in non-model species (Sahu et al., 2012; Abdelkrim et al., 2009). In the present study, we used the high-throughput Ion Torrent PGM™ sequencing platform to develop polymorphic microsatellite loci in catla. Additionally, the cross utility of these markers was tested in five related species.

2. Materials and methods

2.1. Sample collection and DNA extraction

A total of 32 specimens of catla were collected during 2010–12 from the Mahanadi (Latitude 20°28′0″ N; Longitude 85°42′0″ E) riverine system. Fin clips were obtained from each individual fish, preserved in 95% ethanol and stored at −20 °C until DNA extraction. Total DNA was isolated from fin tissue by proteinase K digestion followed by standard phenol and chloroform extraction (Sambrook et al., 1989). For high-throughput DNA sequencing, the total genomic DNA from a randomly selected C. catla individual was isolated using the QIAGEN DNeasy Blood & Tissue Kit (cat #74192), following the manufacturer's recommended procedure. The concentration and purity of the isolated DNA were estimated from absorbance at 260 and 280 nm using a UV spectrophotometer.

2.2. DNA sequencing and microsatellite loci identification

An Ion Torrent adapter-ligated library was prepared following the manufacturer's Ion Fragment Library Kit (Life Technologies, Invitrogen Division, Darmstadt, Germany) protocol (Part #4467320 Rev. A). Briefly, 50 ng of genomic DNA from one individual was end-repaired, and Ion Torrent adapters P1 and A were ligated using DNA ligase. Following AMPure bead (Beckman Coulter, Brea, CA, USA) purification, adapter-ligated products were nick-translated and PCR-amplified for a total of five cycles. The genomic DNA library was purified using AMPure beads (Beckman Coulter), and its concentration and size were evaluated with the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA).

Sample emulsion PCR, emulsion breaking, and enrichment were performed using the Ion Xpress Template Kit (Part #4467389 Rev. B), according to the manufacturer's instructions. Briefly, an input concentration of one DNA template copy per Ion Sphere Particle (ISP) was added to the emulsion PCR master mix, and the emulsion was generated using an IKA DT-20 mixer (Life Technologies, Invitrogen Division, Darmstadt, Germany). Next, ISPs were recovered and template-positive ISPs were enriched using Dynabeads MyOne Streptavidin C1 beads (Life Technologies, Invitrogen Division, Darmstadt, Germany). ISP enrichment was confirmed using the Qubit 2.0 fluorometer (Life Technologies, Invitrogen Division, Darmstadt, Germany), and the sample was prepared for sequencing using the Ion Sequencing Kit protocol (Part #4467391 Rev. B). The complete sample was loaded on an Ion 316 chip and sequenced on the PGM™ for 260 cycles.

The bioinformatics pipeline QDD (Meglecz et al., 2010) was used to select microsatellite-containing sequences from the genomic dataset. Adapters were removed from the sequences. Sequences longer than 100 bp and containing a perfect microsatellite of at least six repeats of any 2–6 bp motif were selected for further analyses. Sequence similarities were computed along with the previous sequences present in the database through an 'all against all' BLAST (Altschul et al., 1997) in which microsatellite motifs were soft-masked. QDD selected 81 sequence reads that were longer than 100 bp and contained enough flanking sequence for primer design. Primers were designed for the microsatellite loci using PRIMER3 v.0.4.0 (Rozen and Skaletsky, 2000), with the following criteria: (i) GC content 40–60%; (ii) product size 100–250 bp; (iii) primer length 18–25 bp; and (iv) melting temperature 55–60 °C with a maximum 5 °C difference between paired primers.

2.3. SSR marker validation

All 81 primer pairs were synthesized and tested for amplification in five unrelated individuals of catla for primary screening. Further, a panel of thirty-two wild C. catla individuals was used for the polymorphism study. PCR was carried out in a 25 µl reaction volume containing 25 ng of template DNA, 5 pmol of each primer (Table 1), 1.5 mM MgCl2, 200 µM of each dNTP, 1× Taq buffer and 0.25 units of Taq DNA polymerase (BANGALORE GENEI, India). PCR was performed in a GeneAmp 9700 thermocycler (APPLIED BIOSYSTEMS, USA) with the following cycling conditions: initial denaturation at 94 °C for 5 min, followed by 30 cycles of 94 °C for 30 s, annealing at the locus-specific temperature (Table 1) for 30 s and elongation at 72 °C for 45 s, and a final extension at 72 °C for 5 min. The PCR products were dried in a vacuum concentrator (DNA plus, HETO HOLTEN, USA), re-dissolved in 10 µl formamide (AMERESCO, Ohio, USA, cat # 0606-100ML), heat denatured at 95 °C for 5 min and separated on a 6% denaturing polyacrylamide gel.
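As a brief aside on the read-selection rule of Section 2.2 (reads longer than 100 bp carrying a perfect repeat of at least six copies of a 2–6 bp motif), the following Python sketch shows one way such a rule could be expressed. It is not the QDD implementation, which also performs adapter removal and BLAST-based similarity checks; the names `SSR_PATTERN`, `find_perfect_ssrs`, `select_reads` and the example reads are hypothetical and introduced only for illustration.

```python
import re

# A 2-6 bp motif followed by at least five more tandem copies (six or more in total).
# The lazy quantifier makes the regex try the shortest motif first.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{5,}")

MIN_READ_LENGTH = 100  # reads must be longer than this many bases


def find_perfect_ssrs(read):
    """Return (motif, repeat_count, start_index) for each perfect SSR in a read."""
    hits = []
    for match in SSR_PATTERN.finditer(read.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip mononucleotide runs such as AAAA..., which are not 2-6 bp motifs.
            continue
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((motif, repeat_count, match.start()))
    return hits


def select_reads(reads):
    """Keep reads longer than MIN_READ_LENGTH that contain at least one perfect SSR."""
    selected = {}
    for name, sequence in reads.items():
        hits = find_perfect_ssrs(sequence)
        if len(sequence) > MIN_READ_LENGTH and hits:
            selected[name] = hits
    return selected


# Hypothetical example reads (not data from this study).
FLANK = "ACGTTCGATCCA" * 5  # 60 bp of filler without short tandem repeats
EXAMPLE_READS = {
    "read_1": FLANK + "TG" * 9 + FLANK,  # long read carrying a (TG)9 repeat
    "read_2": "ACGTTC" + "TG" * 9,       # repeat present, but the read is too short
    "read_3": FLANK + "TG" * 5 + FLANK,  # only five repeats, below the cut-off
}

print(select_reads(EXAMPLE_READS))
```

Under these assumptions only `read_1` is retained. A production pipeline would additionally need to check that enough flanking sequence remains on both sides of the repeat for primer design, which this sketch does not attempt.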
Separation was performed in a vertical gel electrophoresis system (Sequi-Gen® GT Sequencing Cell, BIORAD, Denmark) at a constant power of 50 W and a temperature of 55 °C for 2 h, using ΦX174/HindIII (BANGALORE GENEI, India) as the size standard. Gels were visualized by silver staining and documented by counting alleles. Genetic analyses, comprising the number of alleles, observed heterozygosity (HO) and expected heterozygosity (HE), the exact test for Hardy–Weinberg equilibrium (HWE) and the test for genotypic linkage disequilibrium, were performed using GDA software (Lewis and Zaykin, 2001). Significance values were adjusted for multiple comparisons using Bonferroni corrections, wherever necessary (Rice, 1989). Finally, all loci were assessed using MICROCHECKER to check for null alleles and scoring errors (Van Oosterhout et al., 2004).

2.4. Cross species amplification

To evaluate cross-species utility, we tested all the polymorphic markers described here in five related species of Cyprinidae: L. rohita, Labeo fimbriatus, Labeo bata, C. mrigala and Cirrhinus reba. Two individuals from each species were used for cross-species amplification. The PCR products were checked on a 1.5% agarose gel for specific amplification.

3. Results

3.1. DNA sequencing and microsatellite loci discovery

PGM™ sequencing with a 316 chip yielded a total of 5.7 MB of data and 29,794 quality reads in the present study. Read lengths were concentrated in the range of 100–250 bp. Using the QDD pipeline, reads were assembled into 21,477 microsatellite-containing sequences. The most frequently encountered repeat motifs were di-nucleotides (50%), followed by tetra-nucleotides (33%), tri-nucleotides (14%), penta-nucleotides (2%) and hexa-nucleotides (1%). QDD selected 81 sequence reads that were longer than 100 bp and contained enough flanking sequence for primer design.

3.2. SSR validation

Primers for all 81 loci were synthesized and PCR amplified in a panel of five unrelated individuals. A total of 51 loci amplified and produced clear bands. To check polymorphism, 15 microsatellite markers were chosen randomly and characterized in a panel of thirty-two unrelated wild C. catla individuals. Thirteen of the 15 randomly chosen loci were found to be polymorphic. The sequences of the 13 polymorphic loci were deposited in GenBank (KF913007–KF913019) (Table 1).

Table 1. Characterization of thirteen polymorphic microsatellite loci for Catla catla.

| Locus name | GenBank accession no. | Primer sequences (5′–3′) | Motif | No. of alleles | Allele range (bp) | Annealing temp (°C) | HO | HE | p value for HWE |
|---|---|---|---|---|---|---|---|---|---|
| Cc-1 | KF913007 | F: CCTGGAACACTTTTTCTCTTGATG; R: CAAGATCACACACACCACACAC | (TG)9 | 5 | 128–148 | 60 | 0.800 | 0.771 | 0.177 |
| Cc-9 | KF913008 | F: GGACCATAGGTTTGGGTTGATT; R: TGACTCCAAATAGGACAAGTGG | (TTATT)6 | 4 | 105–120 | 62 | 0.718 | 0.718 | 0.009* |
| Cc-13 | KF913009 | F: TAGACACGGCATTAGAGACACC; R: CTAGCTGCATATCACATTCTTCAC | (TG)12 | 4 | 136–160 | 59 | 0.709 | 0.721 | 0.383 |
| Cc-15 | KF913010 | F: GGGTTGCTCTCTAAAACTCTGG; R: CTCCTTCTGCTCTCTCTGCTCT | (CA)6 | 3 | 152–170 | 57 | 0.565 | 0.511 | 0.704 |
| Cc-19 | KF913011 | F: CATGTGTATGCTTTGTACTGTGAG; R: CAATTCACCACCGATTCTTTTG | (TG)6 | 4 | 140–170 | 54 | 0.838 | 0.723 | 0.210 |
| Cc-24 | KF913012 | F: ATTAAGGAAGAAGGCTGGAAGG; R: GTGCGAAGGAAGTGACAAGAGT | (AC)6 | 5 | 120–144 | 58 | 0.741 | 0.749 | 0.377 |
| Cc-31 | KF913013 | F: TGTCTAGGTGTGTTTCTCTGTGG; R: GAACATGAGCGGGAAAACTG | (GT)14 | 5 | 142–172 | 59 | 0.607 | 0.753 | 0.139 |
| Cc-40 | KF913014 | F: TGCAATACGAAGAAGACAGTGG; R: GCAAAAATACCATGCTCACAGA | (AG)6 | 6 | 154–168 | 60 | 0.750 | 0.804 | 0.157 |
| Cc-42 | KF913015 | F: CTGGCCTGTATCTCGCTCTG; R: TACACTTGACTAACCCGGACCT | (TG)13 | 5 | 130–156 | 61 | 0.843 | 0.727 | 0.026* |
| Cc-46 | KF913016 | F: CTCTCCCTCTACCAGGCATTTT; R: GTCAGGTGTTGAAGCTCTTTCC | (TC)6 | 4 | 114–134 | 60 | 0.814 | 0.673 | 0.043* |
| Cc-57 | KF913017 | F: CCACTCTTTCTTTTACTCCCCATT; R: TGTAACAGCTTGTCTGGTGATAG | (AG)13 | 5 | 130–160 | 59 | 0.870 | 0.719 | 0.037* |
| Cc-62 | KF913018 | F: TCCAACCATCCATATCAGCTAC; R: TGACGACGCTATCTTCTCTCTTT | (GA)7 | 4 | 180–210 | 60 | 0.838 | 0.652 | 0.000* |
| Cc-70 | KF913019 | F: CGCTCAGGTTACCCAGCATT; R: CACACACACACGCAACAGATAC | (TG)6 | 3 | 160–186 | 60 | 0.656 | 0.483 | 0.704 |

HO = Observed heterozygosity, HE = Expected heterozygosity, * = Significant deviation from HWE (p < 0.05).
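Table 1 flags departures from HWE at the nominal p < 0.05 level, while Section 2.3 notes that Bonferroni corrections were applied wherever necessary. The text does not state whether the HWE tests were among the corrected comparisons, so the Python sketch below is only an illustration of the arithmetic. It assumes a simple (non-sequential) Bonferroni threshold of 0.05 divided by the 13 tests and uses the p values exactly as printed in Table 1, so 0.000 stands for a value rounded to three decimals. The names `HWE_P` and `loci_below` are hypothetical.

```python
# Exact-test p values for HWE, transcribed from Table 1 (three decimals as printed).
HWE_P = {
    "Cc-1": 0.177, "Cc-9": 0.009, "Cc-13": 0.383, "Cc-15": 0.704,
    "Cc-19": 0.210, "Cc-24": 0.377, "Cc-31": 0.139, "Cc-40": 0.157,
    "Cc-42": 0.026, "Cc-46": 0.043, "Cc-57": 0.037, "Cc-62": 0.000,
    "Cc-70": 0.704,
}

ALPHA = 0.05  # nominal significance level used in Table 1


def loci_below(p_values, threshold):
    """Return the loci whose p value is below the threshold, in table order."""
    return [locus for locus, p in p_values.items() if p < threshold]


# Simple Bonferroni adjustment (an assumption): divide alpha by the number of tests.
bonferroni_threshold = ALPHA / len(HWE_P)

nominal = loci_below(HWE_P, ALPHA)
adjusted = loci_below(HWE_P, bonferroni_threshold)

print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
print("Loci with p < 0.05:", nominal)
print("Loci below the adjusted threshold:", adjusted)
```

Under these assumptions the five starred loci fall below 0.05, whereas only Cc-62 falls below the adjusted threshold of about 0.0038. A sequential Bonferroni procedure, as described by Rice (1989), could give a different picture and is not implemented here.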
The number of alleles per locus, observed heterozygosity and expected heterozygosity ranged from 3 to 6, 0.565–0.870 and 0.483–0.804, respectively. Eight loci were in agreement (p > 0.05) with Hardy–Weinberg equilibrium (HWE) and five loci deviated from it. No significant pairwise linkage disequilibrium was observed among the loci.

3.3. Cross species amplification

Out of 13 loci, 12 loci in each of L. rohita, L. fimbriatus and L. bata, 11 loci in C. mrigala and 10 loci in C. reba were amplified successfully (Table 2).

Table 2. Cross-amplification of 13 microsatellite loci isolated from Catla catla across Labeo rohita, Labeo fimbriatus, Labeo bata, Cirrhinus mrigala and Cirrhinus reba.

| Locus name | Labeo rohita | Labeo fimbriatus | Labeo bata | Cirrhinus mrigala | Cirrhinus reba |
|---|---|---|---|---|---|
| Cc-1 | − | − | − | − | − |
| Cc-9 | + | + | + | + | − |
| Cc-13 | + | + | + | − | + |
| Cc-15 | + | + | + | + | + |
| Cc-19 | + | + | + | + | + |
| Cc-24 | + | + | + | + | + |
| Cc-31 | + | + | + | + | + |
| Cc-40 | + | + | + | + | − |
| Cc-42 | + | + | + | + | + |
| Cc-46 | + | + | + | + | + |
| Cc-57 | + | + | + | + | + |
| Cc-62 | + | + | + | + | + |
| Cc-70 | + | + | + | + | + |

'+' indicates successful amplification; '−' indicates unsuccessful amplification.

4. Discussion

Our results suggest that a random genome sequencing approach using next-generation sequencing technology is a rapid and cost-effective means of identifying a large number of polymorphic microsatellite loci in fish species. By taking advantage of the Ion Torrent sequencing platform, 21,477 SSR-containing sequences were generated within a very short span of time. A total of 81 microsatellite-containing sequences had enough flanking sequence for primer design. These represent almost three times the total number of microsatellite loci developed for C. catla in the last few years. This partly explains why NGS technology has been so widely used for microsatellite loci isolation and characterization in many different taxa.

In addition to marker development, the NGS method also provided important information about the genomic organization of repeat sequences in C. catla. It is not surprising that dinucleotide repeats were the most frequent motif type (50.00%), as is often observed in many other eukaryotes (Chistiakov et al., 2006; Zhu et al., 2012). Interestingly, tetranucleotides are the second most frequent motifs in C. catla, as opposed to what is observed for many plant and animal species, where trinucleotides are more frequent (Tuskan et al., 2004; Chistiakov et al., 2006). However, more tetranucleotides than trinucleotides have been observed in the genomes of the American cranberry Vaccinium macrocarpon (Zhu et al., 2012) and the Japanese pufferfish, Fugu rubripes (Edwards et al., 1998). Therefore, the relative abundance of tetranucleotide versus trinucleotide motifs may vary according to species, as observed earlier. Information about the genomic organization of repeat sequences can be useful for the selection of SSR loci and primer design.

Thirteen of the 15 randomly chosen loci were found to be polymorphic among the 32 wild individuals from the Mahanadi river. The percentage of polymorphic markers was 86.6% in this study, which was higher than those observed in Schizothorax biddulphi (56.7%) (Luo et al., 2012) and Galeorhinus galeus (40.6%) (Chabot and Nigenda, 2011) using next-generation sequencing platforms. The mean number of alleles per locus, HO and HE were 4.38, 0.753 and 0.692, respectively, demonstrating moderate genetic diversity within C. catla individuals, which is consistent with what has been reported in other studies (Naish and Skibinski, 1998; McConnell et al., 2001). No pairs of loci were found to be in linkage disequilibrium.
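The summary figures quoted in this paragraph can be approximately re-derived from the per-locus values printed in Table 1. The Python sketch below is only a cross-check under that assumption: it works from the table entries as printed (rounded to three decimals), not from the genotype data, and the names `TABLE1` and `LOCI_SCREENED` are hypothetical.

```python
from statistics import mean

# (locus, number of alleles, HO, HE) transcribed from Table 1.
TABLE1 = [
    ("Cc-1", 5, 0.800, 0.771),
    ("Cc-9", 4, 0.718, 0.718),
    ("Cc-13", 4, 0.709, 0.721),
    ("Cc-15", 3, 0.565, 0.511),
    ("Cc-19", 4, 0.838, 0.723),
    ("Cc-24", 5, 0.741, 0.749),
    ("Cc-31", 5, 0.607, 0.753),
    ("Cc-40", 6, 0.750, 0.804),
    ("Cc-42", 5, 0.843, 0.727),
    ("Cc-46", 4, 0.814, 0.673),
    ("Cc-57", 5, 0.870, 0.719),
    ("Cc-62", 4, 0.838, 0.652),
    ("Cc-70", 3, 0.656, 0.483),
]

LOCI_SCREENED = 15  # loci randomly chosen for the polymorphism check

mean_alleles = mean([row[1] for row in TABLE1])
mean_ho = mean([row[2] for row in TABLE1])
mean_he = mean([row[3] for row in TABLE1])
polymorphic_pct = 100 * len(TABLE1) / LOCI_SCREENED

print(f"Mean alleles per locus: {mean_alleles:.2f}")
print(f"Mean HO: {mean_ho:.3f}")
print(f"Mean HE: {mean_he:.3f}")
print(f"Polymorphic loci: {polymorphic_pct:.1f}%")
```

Recomputed this way, the means come to roughly 4.38 alleles per locus, 0.750 for HO and 0.693 for HE, close to the reported 4.38, 0.753 and 0.692; the sketch makes no attempt to explain the small differences in the third decimal.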
Five microsatellite loci deviated from Hardy–Weinberg equilibrium (p < 0.05) (Table 1). Possible explanations for the Hardy–Weinberg disequilibrium at these SSR loci include null alleles, substructure among the specimens, extensive inbreeding or segregation distortion (Selkoe and Toonen, 2006).

Cross-species amplification is an alternative strategy to extend the utilization of microsatellites. Microsatellite loci generally show considerable evolutionary conservation, suggesting that primers developed for any one species may often be useful across a wide range of taxa. The success rate of cross-species amplification was about 90% in the related cyprinid species studied in the present investigation. This is in accordance with our previous reports in the family Cyprinidae (Patel et al., 2010; Das et al., 2005) and with a report in Sparidae (Liu et al., 2007). Therefore, these polymorphic markers would be useful for population genetic characterization and for breeding programs of C. catla and closely related species.

Acknowledgments

Financial help by the Indian Council of Agricultural Research, Govt. of India (E-96) is acknowledged. We are thankful to the Director, Central Institute of Freshwater Aquaculture (ICAR) for providing laboratory facilities. The authors are also thankful to all the team members of the outreach activity on fish genetic stocks for providing wild samples of catla.

References

Abdelkrim, J., Robertson, B.C., Stanton, J.A.L., Gemmell, N.J., 2009. Fast, cost-effective development of species-specific microsatellite markers by genomic sequencing. Biotechniques 46, 185–192.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Castoe, T.A., Poole, A.W., Gu, W., Koning, A.P.J.D., Daza, J.M., Smith, E.N., Pollock, D.D., 2009. Rapid identification of thousands of copperhead snake (Agkistrodon contortrix) microsatellite loci from modest amounts of 454 shotgun genome sequence. Mol. Ecol. Resour. 10, 341–347.
Chabot, C.L., Nigenda, S., 2011. Characterization of 13 microsatellite loci for the tope shark, Galeorhinus galeus, discovered with next-generation sequencing and their utility for eastern Pacific smooth-hound sharks (Mustelus). Conserv. Genet. Resour. 3, 553–555.
Chistiakov, D.A., Hellemans, B., Volckaert, F.A.M., 2006. Microsatellites and their genomic distribution, evolution, function and applications: a review with special reference to fish genetics. Aquaculture 255, 1–29.
Das, P., Barat, A., Meher, P.K., Ray, P.P., Majumdar, D., 2005. Isolation and characterization of polymorphic microsatellites in Labeo rohita and their cross-species amplification in related species. Mol. Ecol. Notes 5, 231–233.
Edwards, Y.J.K., Elgar, G., Clark, M.S., Bishop, M.J., 1998. The identification and characterization of microsatellites in the compact genome of the Japanese pufferfish, Fugu rubripes: perspectives in functional and comparative genomic analyses. J. Mol. Biol. 278, 843–854.
FAO, 2012. FAO; Fisheries and Aquaculture Department, Data and Statistical Unit. http://www.fao.org/fishery/en.
Lewis, P.O., Zaykin, D., 2001. Genetic Data Analysis: Computer Program for Analysis of Allelic Data. Ver 1.0. Free program distributed by the authors over the internet, available at: http://hydrodyction.eeb.uconn.edu/people/plewis/software.php.
Liu, Y.G., Liu, L.X., Wu, Z.X., Lin, H., Li, B.F., Sun, X.Q., 2007. Isolation and characterization of polymorphic microsatellite loci in black sea bream (Acanthopagrus schlegeli) by cross-species amplification with six species of the Sparidae family. Aquat. Living Resour. 20, 257–262.
Luo, W., Nie, Z.L., Zhan, F.B., Wei, J., Wang, W., Gao, Z., 2012. Rapid development of microsatellite markers for the endangered fish Schizothorax biddulphi (Günther) using next generation sequencing and cross-species amplification. Int. J. Mol. Sci. 13, 14946–14955.
McConnell, S.K.J., Leamon, J., Skibinski, D.O.F., Mair, G.C., 2001. Microsatellite markers from the Indian major carp species, Catla catla. Mol. Ecol. Notes 1, 115–116.
Meglecz, E., Costedoat, C., Dubut, V., Gilles, A., Malausa, T., Pech, N., Martin, J.F., 2010. QDD: a user-friendly program to select microsatellite markers and design primers from large sequencing projects. Bioinformatics 26, 403–404.
Naish, K.A., Skibinski, D.O.F., 1998. Tetranucleotide microsatellite loci for Indian major carp. J. Fish Biol. 53, 886–889.
Patel, A., Das, P., Barat, A., Meher, P.K., Pallipuram, J., 2010. Utility of cross-species amplification of 34 rohu microsatellite loci in Labeo bata, and their transferability in six other species of the cyprinidae family. Aquacult. Res. 41, 590–593.
Reddy, P.V.G.K., 2005. Carp Genetic Resources of India, Carp Genetic Resources for Aquaculture in Asia. World Fish Center Publishing Inc., Malaysia, pp. 39–53.
Rice, W.R., 1989. Analyzing tables of statistical tests. Evolution 43, 223–225.
Rozen, S., Skaletsky, H., 2000. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132, 365–386.
Sahu, B.P., Patel, A., Sahoo, L., Das, P., Meher, P., Jayasankar, P., 2012. Rapid and cost effective development of SSR markers using next generation sequencing in Indian major carp, Labeo rohita (Hamilton, 1822). Indian J. Fish. 59, 21–24.
Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning: a Laboratory Manual, second ed. Cold Spring Harbor Laboratory Press, New York, USA.
Selkoe, K.A., Toonen, R.J., 2006. Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecol. Lett. 9, 615–629.
Tuskan, G.A., Gunter, L.E., Yang, Z.K., Yin, T.M., Sewell, M.M., DiFazio, S.P., 2004. Characterization of microsatellites revealed by genomic sequencing of Populus trichocarpa. Can. J. For. Res. 34, 85–93.
Van Oosterhout, C., Hutchinson, W.F., Wills, D.P.M., Shipley, P., 2004. MICRO-CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Mol. Ecol. Notes 4, 535–538.
Zhu, H., Senalik, D., McCown, B.H., Zeldin, E.L., Speers, J., Hyman, J., Bassil, N., Hummer, K., Simon, P.W., Zalapa, J.E., 2012. Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.). Theor. Appl. Genet. 124, 87–96.