
Vol. 15(52), pp. 2824-2847, 28 December 2016 DOI: 10.5897/AJB2016.15464 Article Number: 528FF4562176 ISSN 1684-5315 Copyright © 2016 Author(s) retain the copyright of this article http://www.academicjournals.org/AJB

African Journal of Biotechnology

Full Length Research Paper

Genetic diversity and population structure of common bean (Phaseolus vulgaris L.) germplasm of Ethiopia as revealed by microsatellite markers

Zelalem Fisseha1,2, Kassahun Tesfaye1, Kifle Dagne1, Matthew W. Blair3, Jagger Harvey4, Martina Kyallo4, Paul Gepts5*

1 Addis Ababa University, Department of Microbial, Cellular, and Molecular Biology, P.O.Box 1176, Addis Ababa, Ethiopia.
2 Department of Dryland Crops, College of Dryland Agriculture, Jigjiga University, Crop Research Directorate, Somali Region Pastoral and Agro-pastoral Research Institute (SoRPARI), P.O.Box 398, 1020, Jijiga, Ethiopia.
3 Department of Agricultural and Environmental Sciences, Tennessee State University, Nashville, TN 37209, United States of America.
4 Biosciences Eastern and Central Africa (BecA), ILRI, Nairobi, Kenya.
5 Department of Plant Sciences / MS1, University of California, 1 Shields Avenue, Davis, CA 95616, USA.

Received 13 May, 2016; Accepted 9 December, 2016

The Ethiopian genetic center is considered to be one of the secondary centers of diversity for the common bean. This study was conducted to characterize the distribution of genetic diversity between and within ecological/geographical regions of Ethiopia. A germplasm sample of 116 landrace accessions was assembled to represent the different common bean production ecologies and seed types found in the country. This sample was analyzed with 24 simple sequence repeat (SSR) markers to assess the genetic diversity within and between common bean landraces, to classify them based on SSR clustering, and to determine relationships between genetic and agroecological diversity. Representatives of both the Andean and Mesoamerican gene pools were identified by STRUCTURE software analysis, as well as a high proportion of hybrid accessions, as evidenced by a STRUCTURE K = 2 preset. At the optimum K = 5 preset value, mixed membership of Andean and Mesoamerican genotypes in some of the clusters was also seen, which supported previous findings. Cluster analyses, principal coordinate analysis, and analysis of molecular variance all indicated clustering of accessions from different collection sites, accompanied by high levels of gene flow, highlighting the significant exchange of planting materials among farmers in different growing regions of the country. Values of allelic diversity were comparable to those reported in previous similar studies, showing the high genetic diversity in the landrace germplasm studied. Moreover, the distribution of genetic diversity across bean-growing population groups from contrasting geographical/ecological regions suggests a high but underutilized potential of Ethiopian germplasm in common bean breeding. In summary, this study demonstrated the geographical as well as gene pool diversity in the common bean germplasm of Ethiopia. This substantial diversity, in turn, should be utilized in future common bean breeding and conservation endeavors in the nation.

Key words: Hybridity, simple sequence repeat, microsatellite, structure, seed exchange, gene flow.

Fisseha et al.

INTRODUCTION

Common bean is the most widely consumed legume species of the genus Phaseolus (Freytag and Debouck, 2002). It is a pulse crop used since pre-Columbian times in the Americas and, since the 16th century, in other regions of the world (Gepts et al., 2008). It is a true diploid (2n = 2x = 22) with a small genome (580 Mbp; Broughton et al., 2003). Originating in the Neotropics, common bean was domesticated in Mesoamerica and the Andes (Gepts and Bliss, 1986; Gepts, 1988). The crop has high diversity that is broadly classified into six or seven domesticated races distributed into two gene pools (Singh et al., 1991a, b, c; Blair et al., 2007, 2010b; Pallottini et al., 2004; Kwak and Gepts, 2009; Kwak et al., 2012). The crop is a major legume in Eastern and Southern Africa, occupying more than 4 million ha annually and providing food for at least 100 million people (Wortmann et al., 1998; Fisseha, 2015). Of the total production in sub-Saharan Africa of over 3.5 million MT, 62% is in Eastern and Central African countries (Wortmann et al., 1998; Fisseha, 2015). Common bean became established in Africa through African-European trade, even before the era of widespread colonization (Allen and Edje, 1990; Asfaw et al., 2009). Historical accounts show that common bean was introduced to Ethiopia in the 16th century by Portuguese traders and rapidly became an important component of the diet there (Assefa, 1985; Fisseha, 2015). Ever since the introduction of common bean into Ethiopia, farmers have developed farming practices adapted to local conditions by preserving and exploiting useful alleles, which has resulted in a range of morphologically diverse landraces (Sperling, 2001). Moreover, recent efforts of the national bean breeding program in Ethiopia have targeted improvement of on-farm common bean productivity and have benefited since the 1980s from the continuous introduction of new germplasm from different parts of the world (Fisseha, 2015).

Today, Ethiopia is among the major bean producers in sub-Saharan Africa (Wortmann et al., 1998). However, the national bean yield still lags behind the global average (Fisseha, 2015). This can be attributed partially to the low yielding capacity of the cultivars in use (Assefa, 1990; Fisseha, 2015). It is therefore essential to tap the potential of landrace genetic resources in order to introgress novel genes for adaptation, resistance to diseases and pests, and tolerance to abiotic stresses. According to Hornakova et al. (2003), landraces grown by small farmers are rich sources of valuable genes. East Africa is a secondary center of diversity for common beans, due to the wide range of landraces there (Martin and Adams, 1987; Asfaw et al., 2009; Blair et al., 2010b). Understanding the patterns and levels of genetic diversity of bean landraces and cultivars can shed light on potential adaptation and on the direction and level of gene flow, and eventually help bean breeding and conservation in Ethiopia. Hence, this research project was undertaken with the principal goals of evaluating the genetic diversity within and between common bean landraces, classifying genotypes based on clustering, and understanding the distribution of genetic diversity between and within ecological/geographical regions of Ethiopia.

MATERIALS AND METHODS

Plant materials

A total of 116 landrace accessions collected from a range of common bean production agro-ecologies in Ethiopia, four Ethiopian cultivars, three Kenyan cultivars, and two other cultivars, used as control genotypes for the Andean and Mesoamerican gene pools, respectively, were grown in August, 2012, in a greenhouse in the Biosciences Eastern and Central Africa (BecA-ILRI) hub in Nairobi, Kenya, for DNA extraction and analysis. The Ethiopian accessions were sampled from potential bean growing areas in the country (Supplementary Table 1 and Figure 1). The seeds of the control and commercial cultivars were acquired from the Ethiopian National Bean Research Project, based at Melkassa Agricultural Research Center, Adama, Ethiopia. The landrace accessions were provided by the Gene Bank of the Ethiopian Biodiversity Institute (EIB). Ten plants per accession were planted in a single row in the screen house of the BecA-ILRI hub.

Genomic DNA extraction

For the molecular diversity assessment, total genomic DNA for each accession was isolated from a bulked leaf tissue sample of five randomly selected one-week-old plants per accession using the cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle, 1987) with some minor modifications, as described in Supplementary Part 1. However, 47 accessions did not yield enough genomic DNA, probably because of poor leaf sample quality. DNA extraction was therefore repeated for these accessions using the Zymoplant seed DNA extraction kit (the protocol is described in Supplementary Part 2).

Microsatellite amplification

Twenty-four (24) microsatellite markers from all 11 linkage groups were selected based on their polymorphic information content (PIC) values and dispersed map locations (Yu et al., 2000; Pedrosa-Harand et al., 2008; Kwak and Gepts, 2009). Out of the 24 SSR markers, 15 were genomic, and the remaining nine were non-genomic (genic) markers (Supplementary Table 2). Markers were PCR amplified with 6-FAM, NED, PET or VIC 5'-labeled forward primers and unlabeled reverse primers. The primers were run in multiplexes, based on their fluorescence dye and allele size using

*Corresponding author. E-mail: [email protected] . Tel: +1-530-752-7743. Author(s) agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License


Afr. J. Biotechnol.

Figure 1. Map showing the collection sites. Key: 1 = Assosa; 2 = Metekel; 3 = Gojam; 4 = North Shewa and South Wello; 5 = Wellega Gojam; 6 = Jimma and Illubabor; 7 = Bench Maji; 8 = North Omo; 9 = South Omo; 10 = Sidama and surrounding areas; 11 = Bale and Arsi; 12 = East Hararghe; 13 = West Hararghe. The size of the bubbles does not correspond to the number of genotypes sampled in each location.

BIONEER ACCUPOWER® Multiplex PCR Premix Kits (Supplementary part 3). Out of the 24 SSR markers, seven were dropped after preliminary evaluation, because they either produced no amplification (BM172 and BMd1) or were monomorphic (BM188, BM183, BMd16, PV-AG001, and PV-AT001). PCR products were run on an ABI PRISM 3730xl fragment analyzer (Applied Biosystems, Foster City, CA, USA) at the BecA-ILRI hub (Sequencing, genotyping, and Oligo unit, SegoLip), and allele sizes were determined by comparing with Genescan LIZ500 size standard using GeneMapper v. 3.7.3.7 software. The observed allele sizes were then adjusted for the discrete allele size using the AlleloBin software (http://test1.icrisat.org/gtbt/download_allelobin.htm).
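As a rough illustration of what adjusting observed allele sizes to discrete alleles involves, the following Python sketch snaps each fragment size to the nearest step on a repeat-unit ladder. It is not AlleloBin's actual algorithm. The function name, the trinucleotide repeat length, the anchor size, and the fragment sizes are all hypothetical values chosen for illustration.

```python
def bin_allele_sizes(observed_sizes, repeat_length, anchor_size):
    """Snap observed fragment sizes (bp) to the ladder anchor_size + k * repeat_length.

    Simplified nearest-step binning for illustration only; all inputs
    used below are hypothetical and not taken from the study.
    """
    binned = []
    for size in observed_sizes:
        # Number of whole repeat units separating this fragment from the anchor
        steps = round((size - anchor_size) / repeat_length)
        binned.append(anchor_size + steps * repeat_length)
    return binned


# Hypothetical raw sizes for a trinucleotide SSR anchored at 150 bp
observed = [149.8, 152.1, 155.7, 158.2]
print(bin_allele_sizes(observed, repeat_length=3, anchor_size=150))
# prints [150, 153, 156, 159]
```

Binning of this kind matters because fragment analyzers report fractional sizes, whereas true SSR alleles differ by whole repeat units.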

SSR genetic diversity analysis

GenAlEx 6.5b3 (Peakall and Smouse, 2012; http://biology.anu.edu.au/GenAlEx/) was used to calculate genetic diversity parameters, such as genetic distance, number of alleles (Na), number of effective alleles (Ne), number of private alleles (Npa), observed heterozygosity (Ho), expected heterozygosity (He), and Shannon's information index (I), and to perform analysis of molecular variance (AMoVA) and principal coordinate analysis (PCoA). Genetic associations were determined using the neighbor-joining method with Darwin V. 5.0 (http://darwin.cirad.fr/darwin). Genepop V.4 (Rousset, 2008) and Popgene32 (Yeh et al., 1999) were also used to determine genetic diversity, polymorphic loci, gene flow, levels of heterozygosity, fixation index, and F-values. Finally, PowerMarker v. 3.25 (Liu and Muse, 2005) was used to estimate the number of alleles, polymorphic information content (PIC) values, genetic distance matrices, observed heterozygosity (Ho), and expected heterozygosity (He) for each marker across all genotypes and then across genotypes within and between gene pools.
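The per-locus parameters named above follow standard definitions, and a minimal Python sketch, assuming diploid genotypes coded as pairs of allele sizes, may help make them concrete. The function name and the toy genotypes are hypothetical. The formulas are the usual textbook ones (Ne = 1/Σp², He = 1 − Σp², I = −Σp ln p, F = 1 − Ho/He, and the standard PIC formula) and may differ in small-sample corrections from the values reported by GenAlEx or PowerMarker.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Return basic diversity statistics for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Uses uncorrected textbook formulas; an illustration, not a re-implementation
    of any of the packages cited in the text.
    """
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    freqs = [n / len(alleles) for n in counts.values()]

    sum_sq = sum(p * p for p in freqs)
    na = len(freqs)                      # number of alleles
    ne = 1.0 / sum_sq                    # effective number of alleles
    he = 1.0 - sum_sq                    # expected heterozygosity
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    shannon = -sum(p * math.log(p) for p in freqs)

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na)
        for j in range(i + 1, na)
    )
    pic = 1.0 - sum_sq - cross

    # Fixation index; undefined when the locus is monomorphic (He = 0)
    fixation = 1.0 - ho / he if he > 0 else float("nan")

    return {"Na": na, "Ne": ne, "Ho": ho, "He": he,
            "I": shannon, "PIC": pic, "F": fixation}


# Hypothetical genotypes at one locus (allele sizes in bp)
toy = [(150, 150), (150, 153), (153, 153), (150, 153)]
for name, value in locus_diversity(toy).items():
    print(name, round(value, 4))
```

Averaging such per-locus values over all retained markers gives population-level summaries of the kind compared between groups in the Results.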

Analysis of population structure

The software program STRUCTURE was run for K values ranging from 2 to 8. Each run used the admixture model with 5,000 replicates for burn-in and 50,000 replicates during the analysis (Pritchard et al., 2000). The Evanno et al. (2005) test was performed after 10 simulations per K value. These repeated simulations were conducted for every subpopulation number from K = 2 to K = 8 with the same burn-in and run lengths, according to previous suggestions (Rosenberg et al., 2002; Evanno et al., 2005; Ehrich, 2006). The ΔK statistic showed that K = 5 was the optimal number of subpopulations in this analysis (Supplementary Figure 1). This ideal K value presented the highest peak for change in value from and to the previous and subsequent numbers of


Figure 2. Population structure for 120 common bean accessions from different growing regions of Ethiopia and 3 Kenyan cultivars compared to Andean and Mesoamerican control genotypes at K = 2 to K = 5. Predetermined group names indicated below figure are: Amhara = Genotypes from Amhara Regional State; andectrl = Andean control genotypes; Bgumuz = Genotypes from Benishangul Regional State; Debub = Genotypes from Southern Nations and Nationalities Regional State; Kenyan = Kenyan accessions; MACTRL = Mesoamerican control genotypes; Oromiya = Genotypes from Oromiya Regional State; and Std. Var. = Standard Varieties.

subpopulations, respectively. This showed a gain in precision from subdividing the genotypes into five subpopulations versus any lower or higher number of subpopulations.

The K = 2 analysis was done with the particular aim of distinguishing between Andean and Mesoamerican accessions (Koenig and Gepts, 1989; Kwak and Gepts, 2009). To this end, five independent runs were performed with the admixture model and 5,000 replicates for burn-in and 50,000 replicates during analysis. The clustering in different runs was almost identical (similarity coefficient 0.9914). The run with the lowest likelihood value was selected among the five runs, and the accessions with more than 80% posterior assignment probability in the Mesoamerican cluster were assigned to the Mesoamerican gene pool (and vice versa for the Andean gene pool) (Supplementary Table 3). Lower posterior assignment probability values (that is, between 50 and 80%) may actually indicate hybrid or admixed accessions rather than "pure" accessions (Kwak and Gepts, 2009). Nonetheless, such accessions were included in the K = 2 analysis, as they are important for future studies of the population structure of the common bean in Ethiopia and as baseline information for breeding/improvement programs.
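The two decision rules described in this section, the ΔK criterion and the 80% membership cutoff, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually run by the authors. ΔK is computed here from the mean log-likelihood over replicate runs, as |L(K+1) − 2L(K) + L(K−1)| divided by the standard deviation of L(K), so it is only defined for K values with a tested neighbor on each side. The function names, the log-likelihood values, and the membership coefficients are all hypothetical.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Evanno-style Delta K from replicate STRUCTURE runs.

    log_likelihoods: dict mapping K -> list of ln P(D) values, one per run.
    Returns a dict K -> Delta K for every K that has both K-1 and K+1 tested.
    Assumes the replicate runs at each K are not all identical (stdev > 0).
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(means):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference is undefined at the ends
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / stdev(log_likelihoods[k])
    return result


def assign_gene_pool(q_mesoamerican, cutoff=0.8):
    """Classify one accession from its K = 2 membership coefficient.

    q_mesoamerican: membership in the Mesoamerican cluster (0 to 1).
    Accessions above the cutoff in either cluster are assigned to that
    gene pool; the rest are flagged as possible hybrids/admixed.
    """
    if q_mesoamerican > cutoff:
        return "Mesoamerican"
    if 1.0 - q_mesoamerican > cutoff:
        return "Andean"
    return "admixed"


# Hypothetical ln P(D) values for three replicate runs at each K
runs = {
    2: [-5000.0, -5002.0, -4998.0],
    3: [-4800.0, -4801.0, -4799.0],
    4: [-4790.0, -4795.0, -4785.0],
    5: [-4788.0, -4798.0, -4778.0],
}
scores = delta_k(runs)
print(scores, "best K:", max(scores, key=scores.get))

# Hypothetical membership coefficients for three accessions
for q in (0.93, 0.10, 0.65):
    print(q, assign_gene_pool(q))
```

Keeping the cutoff as a parameter makes it easy to see how the count of accessions treated as admixed changes between a strict 0.8 threshold and the looser 0.5 threshold also mentioned in the text.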

RESULTS

Population structure into the Andean and Mesoamerican gene pools in the common bean germplasm

The population subdivision (as determined by STRUCTURE) (Figure 2), the NJ tree (Figure 3), and the PCoA (Figure 4) showed significant Andean–Mesoamerican gene pool divergence as well as racial differentiation within gene pools. The accessions were assigned to their respective gene pools of origin, following the methods explained in the "Materials and Methods" for K = 2. Consequently, 78 accessions out of the total 125 fell into the Mesoamerican group, whereas the remaining 47 were classified into the Andean group. This classification was based on posterior assignment probabilities p > 0.5. This split was generally maintained from K = 2 to 3, but broke down for K = 4 and 5 (Figure 2; Supplementary Table 3). The analysis for K = 2 populations showed individual genotypes distributed between the two gene pools, which was congruent with the neighbor-joining and PCoA analyses, which clearly separated the Mesoamerican and Andean gene pools. At K = 3, looking jointly at the bar graphs produced and the membership coefficient values, the Mesoamerican gene pool genotypes further separated into two sub-groups, but no meaningful interpretation of population structure could be made, while the Andean gene pool genotypes did not show any separation. At K = 4, the Mesoamerican accessions further subdivided into two groups with a mild level of admixture, but no meaningful interpretation of population structure could be made. At K = 5, the Andean accessions further subdivided into three groups with


Figure 3. Neighbor-joining dendrogram depicting genetic relationships between common bean accessions from different bean growing populations in Ethiopia with respect to Andean and Mesoamerican control genotypes. Red: Andean Cluster1 (K4); Blue: Andean Cluster2 (K5); Yellow arrows: Andean Control (K1); Green: MA Cluster1 (K2); Purple arrows: MA Control (K3).

Figure 4. Principal coordinates analysis (PCoA) graph for the 53 accessions from different growing populations in Ethiopia. Axes: Coordinate 1 and Coordinate 2. Legend: Andean Cluster1 (K4); Andean Cluster2 (K5); Andean Control (K1); MA Cluster1 (K2); MA Control (K3). Regions labeled in the plot: ANDEAN and MESOAMERICAN.


Table 1. Fst values among five populations identified by STRUCTURE.

Population (K = 5)               Fst
Andean Cluster 1 (K4)            0.239
Andean Cluster 2 (K5)            0.356
Andean Control (K1)              0.547
Mesoamerican Cluster 1 (K2)      0.135
Mesoamerican Control (K3)        0.264

Table 2. Proportion of non-hybrid accessions in K = 5 groups identified by STRUCTURE.

Groups                           Total number     0.8 cutoff
                                 of accessions    Number of accessions    % of total
Total                            125              53                      42.4
Mesoamerican                     66               23                      34.9
  Mesoamerican Cluster1 (K2)     41               16                      39.0
  Mesoamerican control (K3)      25               7                       28.0
Andean                           59               30                      50.9
  Andean Cluster1 (K4)           27               11                      40.7
  Andean Cluster2 (K5)           26               14                      53.9
  Andean Control (K1)            6                5                       83.3

some admixture level, whereas the Mesoamerican accessions did not subdivide further. In the following section, we describe in further detail the five groups of K = 5.

Genetic diversity among accessions and cluster groups in STRUCTURE preset K=5

For K = 5, the groups were identified as Andean Cluster 1 (K4); Andean Cluster 2 (K5); Andean control (K1); Mesoamerican Cluster 1 (K2); and Mesoamerican control (K3). On average, Fst values for Andean populations (K1, K4, and K5) were lower (0.213) compared to those of Mesoamerican populations (K2 and K3) (0.451) (Table 1). We also quantified population admixture for each accession (Figure 2; Supplementary Table 3). The Andean gene pool had a higher proportion of non-hybrid accessions than the Mesoamerican gene pool (51 and 35% at the 0.8 cutoff, respectively; Table 2). The proportion of non-hybrid accessions in each K group ranged from 28% (Mesoamerican Controls K3) to 54% (Andean Cluster 2 K5) at the 0.8 cutoff value (Table 2). The proportions of polymorphic loci were 100% in the Andean Cluster 1 (K4) genotypes; 94% in the Andean Cluster 2 (K5), Andean control (K1), and Mesoamerican Cluster 1 (K2); and 76% in the Mesoamerican control (K3) (Table 3). On average, the Andean groups had a higher number of alleles (Na), number of effective alleles (Ne), Shannon index (I), observed heterozygosity, expected heterozygosity, fixation index, percent of polymorphic loci, genetic distance, and number of private alleles. On the other hand, the Mesoamerican groups had higher hybridity rates than the Andean groups. The highest number of alleles, genetic distance (GD), observed heterozygosity (Ho), hybridity rate (t), and percent of polymorphic loci were recorded for the Andean Cluster 1 (K4). The Andean control cluster had the highest Shannon index (I), fixation index (F), number of private alleles (Npa), and number of effective alleles (Ne).

Analysis of Molecular Variance (AMoVA) among accessions and cluster groups in STRUCTURE preset K=5

The AMOVA results showed that 50% of allelic diversity was attributed to individuals within gene pool (P