Genet Resour Crop Evol (2015) 62:67–84 DOI 10.1007/s10722-014-0136-z

RESEARCH ARTICLE

Genetic diversity and population structure of anciently introduced Cuban cacao Theobroma cacao plants

Igor Bidot Martínez • Manuel Riera Nelson • Marie-Christine Flamand • Pierre Bertin



Received: 18 November 2013 / Accepted: 26 May 2014 / Published online: 5 July 2014
© Springer Science+Business Media Dordrecht 2014

Abstract Cacao is widespread in eastern Cuba, but rarely grown in its central region. The origin, location and date of earliest introduction are under debate, with cacao hypothesized to have arrived in central Cuba from Mexico in 1540 or in the eastern part of the island from Haiti between 1791 and 1803. Controlled introductions have taken place during recent decades, but the genetic diversity of earlier introductions has never been investigated. A representative sample of 537 Cuban cacao plants of ancient origin and 107 plants representing 10 genetic reference groups and Trinitario genotypes were fingerprinted using 15 international standard microsatellite markers. Overall, 139 alleles (9.267 alleles per locus on average) were amplified. Mean polymorphism information content, observed heterozygosity and expected heterozygosity were 0.379, 0.370 and 0.419, respectively. The Garza–Williamson index was 0.384, indicating a bottleneck in the history of Cuban cacao. Cuban plants exhibited low coefficients of membership in the 10 reference groups, with Amelonado (61.64 %) and Criollo (27.37 %) predominating, followed by Marañón (5.40 %), Iquitos (2.23 %), Contamana (1.49 %), Nanay (1.12 %) and Nacional (0.75 %). Additionally, 48.23 % of plants were an admixture of Amelonado and Criollo, i.e., of the Trinitario type. The Cuban plants were separated into two clusters, one comprising plants mainly from eastern Cuba from groups Criollo, Amelonado and Marañón, and the other mainly from central Cuba represented by plants from groups Criollo, Amelonado, Marañón, Iquitos, Contamana, Nanay and Nacional. These results should aid the design of rational conservation and utilization strategies for ancient Cuban cacao.

Keywords Cuba · Genetic diversity · Population structure · SSR · Theobroma cacao

I. Bidot Martínez · M. Riera Nelson
Facultad Agroforestal de Montaña, Universidad de Guantánamo, Carretera a El Salvador km 6 ½, Guantánamo, Cuba
e-mail: [email protected]

M. Riera Nelson
e-mail: [email protected]

M.-C. Flamand
Institut des Sciences de la Vie, Université catholique de Louvain, Croix du Sud 4-5, L7.07.14, 1348 Louvain-la-Neuve, Belgium
e-mail: [email protected]

P. Bertin (✉)
Earth and Life Institute – Agronomy (ELI-a), Université catholique de Louvain, Croix du Sud 2, L7.05.11, 1348 Louvain-la-Neuve, Belgium
e-mail: [email protected]

Introduction

Cacao (Theobroma cacao L.) is a cultivated tree native to the humid tropics of central and northwestern regions of South America. The plant is the source of chocolate and other products, such as cacao butter,


cacao powder and cacao liquor (Bhattacharjee and Kumar 2007). During the pre-Columbian period, cacao populations were distributed among two geographical regions: circum-Caribbean and Amazonian. The Amazonian region, specifically the Upper Amazon region, is the center of origin and area of greatest genetic diversity of the species (Motamayor et al. 2008). In the circum-Caribbean region, the plant was domesticated and cultivated by the Mayans since at least 600 BC in Central America, mainly in Mexico, where the Criollo type was found most frequently (Hurst et al. 2002; Motamayor et al. 2002).

Secondary areas of cacao diversification are the result of the intentional transfer of propagated materials from their principal regions of origin (Bartley 2005). The first stage of expansion of cacao cultivation began a few years after the discovery of America. The foundation of this expansion was plant material derived from the Criollo group that was transported outside of the pre-Columbian distribution area to satisfy the European demand for chocolate. Another aspect of the development of modern cacao populations is the subsequent alteration of the original genetic base of cultivated cacao (fundamentally Criollo) through the introduction of germplasm from the Amazonian region beginning in the seventeenth century. This introduced germplasm was from different sources, increasing the genetic diversity of cacao in its production areas. Selection of more productive and resistant plants resulted in the progressive replacement of traditional populations by new varieties and hybrids (Bartley 2005; Bhattacharjee and Kumar 2007).

Plant genetic resources of Cuba, including cacao, have been reviewed thoroughly (Esquivel and Hammer 1992). Cuba is a region of cacao cultivation, with a production of 1,510 t in 2011 (FAO 2013). A total of 6,800 ha are cultivated, of which 5,153 ha are in production, but with a yield of only 0.29 t ha⁻¹.
Cacao is mainly cultivated in the mountains of the eastern provinces of the country (Esquivel and Hammer 1992), particularly in the Baracoa region of Guantánamo, the easternmost province of the country, where more than 70 % of the national cacao production takes place (Oficina Nacional de Estadísticas e Información 2012). This concentration of cacao production is due to the particular weather conditions of Baracoa, which receives more than 2,000 mm of precipitation per year


owing to the combination of humid winds from the north and the Nipe-Sagua-Baracoa mountain range (Márquez Rivero and Aguirre Gómez 2008). Baracoa has a strong tradition of cacao cultivation going back several decades, with familial transmission of cultivation techniques, grain production and processing facilities (Márquez Rivero and Aguirre Gómez 2010). In Cuba, cacao is cultivated organically without chemical fertilizers or pesticides. Cacao plantations use a mixed crop system, with plants grown under various plant canopy shade conditions together with other tree crops (Núñez González 2010; Márquez Rivero and Aguirre Gómez 2008). Cacao plants are not genetically uniform on most farms. They are instead composed of mixtures of different genotypes that include commercial varieties as well as plants of ancient and unknown origin. This genetic diversity, combined with the presence of other plant species on plantations, contributes to natural pest and disease control, thereby increasing farmers' incomes and minimizing financial risk (Núñez González 2010).

Several types of cacao are recognized in Cuba on the basis of their origin and mode of reproduction. The first type comprises clonal varieties produced and grafted in research centers such as the Instituto de Investigaciones Agroforestales UCTB Baracoa (IIAB) and generally planted as a mixture of a few clones. These clones are hybrids introduced from Costa Rica in 1955 by the United Fruit Company (UF) and widely distributed from the late 1970s to the early 1990s. Representing 53 % of Cuban cacao, plants of this type are located in the major production regions of Baracoa, Imías and Maisí in the province of Guantánamo. The second type of cacao consists of hybrids. Introduced into Cuba in 1991, plants of this type are now propagated by seeds produced by hand pollination of selected Trinitario and Forastero clones in research centers.
Cultivated in most cacao-producing areas of eastern and central Cuba, the type represents 37 % of Cuban cacao. The third type corresponds to the progeny of Trinidad Selected Hybrid (TSH) cacao, which was introduced via seeds into Cuba in the 1970s. This type is grown in all of the eastern cacao-producing provinces of Cuba: Granma, Santiago de Cuba, Holguín and Guantánamo. This type constitutes 4 % of the Cuban cacao.


Finally, the fourth type represents traditional or ancient cacao of unknown origin, and is the main focus of the present study. Propagation and selection of this type have always been entirely managed by farmers. It is still present on plantations, and some isolated plants remain, including plants more than 50 years old and those propagated from their seeds. These plants possess some original and interesting features, such as adaptation to the local environment and, on some plants, seeds with white cotyledons. They represent 6 % of Cuban cacao (Márquez Rivero and Aguirre Gómez 2008, 2010).

The origin of ancient Cuban cacao is under debate, with differing opinions expressed by Cuban experts. According to some authors, cacao was introduced into Cuba from Mexico in 1540 by the Spanish, with the first plants cultivated on the "Mi Cuba" farm in Cabaiguán at the center of the island (Núñez González 2010). According to other experts, cacao was introduced between 1791 and 1803 at Ti Arriba, Santiago de Cuba, by French cacao growers who emigrated from Haiti during the Haitian Revolution (Núñez González 2010; Hernández Castillo 1978).

An ongoing program aims to increase cacao cultivation using plants corresponding to the first and second types above: UF clones (essentially UF-650, UF-654, UF-667, UF-668, UF-676 and UF-677) and hybrids with high productivity potential. Plant material used in this program consists of grafts and cuttings (Esquivel et al. 1992), principally from the IIAB (Núñez González 2010). Also found in Cuba are very old cacao plants corresponding to the fourth type above, which are estimated by their owners to be 60 or even 80 years old. These old plants are likely the individuals genetically closest to the original plants introduced into Cuba.
Genetic characterization of these ancient Cuban cacao plants has never been undertaken, but such an analysis is of great importance for their preservation and to justify their current and future use. This ancient genetic material is endangered because of the progressive replacement of old plantations by new, highly productive plants. Given this situation, the objectives of this study were to determine the genetic diversity and classification of ancient Cuban cacao based on genetic groups described by Motamayor et al. (2008). The results of genetic characterization of these ancient cacao plants should serve as a useful resource for future sustainable


utilization and genetic improvement of Cuban cacao. This characterization should also allow elucidation of the relationship of ancient Cuban cacao to other cacao populations around the world.

Materials and methods

Plant material

A systematic survey of cacao plantations and remnants with isolated plants was conducted throughout historical and current cacao-producing regions of Cuba. A total of 537 cacao plants were sampled from local farms between May 2009 and April 2012. Collections were made throughout the country wherever old plantations and isolated trees could be found, and included the most substantial cacao-growing regions and both putative regions of Cuban cacao introduction (Fig. 1). Only the oldest plants, presumably corresponding to individuals genetically closest to the originally introduced plants, were collected. We bypassed new plantations comprising clonal varieties, hybrid cacao and/or progeny of TSH, which are already maintained in gene banks and have been genetically described (Centre de Coopération Internationale en Recherche Agronomique pour le Développement 2013; USDA ARS National Genetic Resources Program 2013; Turnbull and Hadley 2011). The leaves were collected, conserved on silica gel and finally dried in an oven at 60 °C.

The study included a reference group of 107 accessions from different germplasm collections: 77 plants from the 10 genetic groups of Motamayor et al. (2008) and 30 Trinitario genotypes (Table 1). This classification into 10 genetic groups more accurately reflects genetic diversity of the species than does traditional classification into Criollo, Forastero and Trinitario. The categorization of Motamayor et al. (2008) was based on genotyping of 1,241 accessions from a large geographical sampling in Central and South America using 106 microsatellite markers, with the number of genetic groups determined using STRUCTURE software. Our reference group of 107 accessions was obtained from the following sources: 42 from the International Cacao Germplasm Database (ICGD 2011), University of Reading, United Kingdom; 18 from the Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), France; 18 from the Centro Agronómico Tropical de Investigación y Enseñanza (CATIE), Costa Rica; and 29 from the IIAB, Cuba. A total of 644 plants (537 sampled plants and 107 reference accessions) were analyzed.

Fig. 1 Sampling sites on cacao farms in central and eastern Cuba

DNA isolation, polymerase chain reaction (PCR) and SSR analysis

Cacao genomic DNA was extracted from 100 mg of dry leaves of each plant using a FastDNA kit (QBIOgene, Carlsbad, CA, USA) and a FastPrep FP 120 cell disrupter (Savant Instrument, Holbrook, NY, USA) according to the recommended kit protocol for plant tissues. DNA quantification was performed using an ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA).


DNA amplification was performed using 15 microsatellite markers (Lanaud et al. 1999; Saunders et al. 2004) recommended as international standards for cacao genetic characterization because of their high levels of polymorphism, reproducibility and distribution throughout the genome. These 15 microsatellites are considered sufficient to characterize the genetic diversity of cacao germplasm and have been widely used to genotype native populations (Sereno et al. 2006; Zhang et al. 2006), plantations (Boza et al. 2013; Lachenaud and Zhang 2008; Trognitz et al. 2011; Zhang et al. 2008, 2011, 2012) and diverse germplasm collections (Aikpokpodion et al. 2009; Efombagn et al. 2008; Susilo et al. 2011; Irish et al. 2010; Turnbull and Hadley 2011; Zhang et al. 2009a, b). A subset consisting of 11 of these 15 microsatellites has also been found to be sufficient for this purpose (Swanson et al. 2003). Forward primers were 5′-labeled with the fluorescent dyes 6-carboxyfluorescein (6-FAM), 4,7,2′,4′,5′,7′-hexachloro-6-carboxyfluorescein (HEX) and 7′,8′-benzo-5′-fluoro-2′,4,7-trichloro-3-carboxyfluorescein (NED). PCRs were performed using GoTaq Flexi DNA polymerase (Promega) in a PTC-100™ thermal cycler (MJ Research, Watertown, MA, USA). PCR amplifications were carried out in 15-μL total volumes containing 0.2 ng μL⁻¹ genomic DNA, 0.04 U μL⁻¹ Taq polymerase, 1.50 mM MgCl2, 0.10 mM dNTPs, 5× PCR buffer and forward and reverse primer pairs of three different microsatellite markers of different size ranges and dye labels for each multiplex reaction. Thermal cycling conditions consisted of initial denaturation at 95 °C for 10 min, followed by 38 cycles of denaturation at 95 °C for 30 s, annealing at 54 °C for 45 s, and extension at 72 °C for 2 min, with a final extension at 72 °C for 5 min. Amplification products were detected using an ABI PRISM 3100 genetic analyzer (Applied Biosystems, Carlsbad, CA, USA) and a DS-32 (dye set F) matrix standard kit. Product sizes were scaled using a GeneScan 500 ROX standard (Applied Biosystems). Amplified fragment sizes and intensities were visualized using the free software program Peak Scanner v1.0 (Applied Biosystems 2006).

Table 1 Name, source and genetic group of the reference accessions

Accession number    Source  Genetic group
SIAL 70             IIAB    Amelonado
Amelonado 13(14)    CATIE   Amelonado
Amelonado 9(11)     CATIE   Amelonado
BE 3                CIRAD   Amelonado
SIAL 407            CIRAD   Amelonado
LCTEEN 302          ICGD    Amelonado
MA 12 [BRA]         ICGD    Amelonado
SPEC 41/6-18        ICGD    Amelonado
FSC 13              ICGD    Amelonado
SCA 6               IIAB    Contamana
SCA 12              IIAB    Contamana
SCA 9               IIAB    Contamana
SCA 6               CIRAD   Contamana
SCA 9               ICGD    Contamana
SCA 19              ICGD    Contamana
SCA 11              ICGD    Contamana
SCA 6               ICGD    Contamana
U 45 [PER]          ICGD    Contamana
U 70 [PER]          ICGD    Contamana
Criollo 216         CATIE   Criollo
Criollo 11          CATIE   Criollo
Criollo 19          CATIE   Criollo
Criollo 215         CATIE   Criollo
Criollo 22          CATIE   Criollo
Criollo 23          CATIE   Criollo
Criollo 33          CATIE   Criollo
Criollo 34          CATIE   Criollo
Criollo 8           CATIE   Criollo
Criollo 27          CATIE   Criollo
CRIOLLO 22 [CRI]    ICGD    Criollo
LCTEEN 241          CIRAD   Curaray
LCTEEN 37/I         CIRAD   Curaray
LCTEEN 261/S-4      ICGD    Curaray
LCTEEN 163/A        ICGD    Curaray
LCTEEN 401          ICGD    Curaray
SILECIA 5 [ECU]     ICGD    Curaray
LCTEEN 241          ICGD    Curaray
ELP 20 A            CATIE   Guiana
ELP 16 A            CATIE   Guiana
B7 B3               CATIE   Guiana
B7 A2               CATIE   Guiana
KER 3               CIRAD   Guiana
GU 285A             CIRAD   Guiana
KER 3               ICGD    Guiana
KER 6               ICGD    Guiana
PINA 1              ICGD    Guiana
POUND 12            IIAB    Iquitos
IMC 68              CIRAD   Iquitos
IMC 103             ICGD    Iquitos
COCA 3370/5 [CHA]   ICGD    Iquitos
SPEC 54/1           ICGD    Iquitos
AMAZ 12 [CHA]       ICGD    Iquitos
POUND 12/A [POU]    ICGD    Iquitos
PA 120              CIRAD   Marañón
PA 121              CIRAD   Marañón
PA 13 [PER]         ICGD    Marañón
PA 120 [PER]        ICGD    Marañón
PA 4 [PER]          ICGD    Marañón
PA 121 [PER]        ICGD    Marañón
PA 169 [PER]        ICGD    Marañón
PA 136 [PER]        ICGD    Marañón
LCTEEN 46           CATIE   Nacional
MO 20               ICGD    Nacional
U 26 [PER]          ICGD    Nacional
POUND 7             IIAB    Nanay
NA 127              CIRAD   Nanay
NA 79               CIRAD   Nanay
NA 149              ICGD    Nanay
NA 232              ICGD    Nanay
NA 26               ICGD    Nanay
POUND 10/B [POU]    ICGD    Nanay
POUND 16/B [POU]    ICGD    Nanay
LCTEEN 368          CATIE   Purús
EBC 148             ICGD    Purús
RB 46 [BRA]         ICGD    Purús
RB 47 [BRA]         ICGD    Purús
LCTEEN 412          ICGD    Purús
RIM 2               IIAB    Trinitario
UF 650              IIAB    Trinitario
UF 613              IIAB    Trinitario
UF 296              IIAB    Trinitario
UF 29               IIAB    Trinitario
UF 221              IIAB    Trinitario
UF 12               IIAB    Trinitario
UF 667              IIAB    Trinitario
UF 668              IIAB    Trinitario
UF 676              IIAB    Trinitario
SGU 54              IIAB    Trinitario
UF 654              IIAB    Trinitario
ICS 6               IIAB    Trinitario
UF 677              IIAB    Trinitario
OC 61               IIAB    Trinitario
ICS 95              IIAB    Trinitario
ICS 8               IIAB    Trinitario
GS 29               IIAB    Trinitario
ICS 16              IIAB    Trinitario
GS 67               IIAB    Trinitario
GS 57               IIAB    Trinitario
GS 46               IIAB    Trinitario
GS 36               IIAB    Trinitario
ICS 39              CIRAD   Trinitario
ICS 95              CIRAD   Trinitario
RIM 39              CIRAD   Trinitario
UF 667              CIRAD   Trinitario
CC 231              CIRAD   Trinitario
ICS 6               CIRAD   Trinitario
ICS 1               ICGD    Trinitario

Analysis of genetic diversity

Summarized statistical data collected for all plants included the number of alleles per locus in the entire collection of cacao plants of ancient origin, allele frequencies for each locus, observed (Ho) and expected (He) heterozygosities and polymorphism


information content (PIC; Botstein et al. 1980). To detect reductions in effective population size, the Garza–Williamson index (M-statistic; Garza and Williamson 2001) was calculated according to the formula M = k/r, where k is the total number of alleles and r is the overall range in allele size. A test for departure from Hardy–Weinberg equilibrium (Guo and Thompson 1992) was performed on each locus with 10,000 Monte Carlo permutations using the adegenet software package (Jombart 2008; Jombart and Ahmed 2011).

Mislabeling detection

To eliminate mislabeled or misclassified genotypes from further consideration, a preliminary analysis was performed on the 77 reference genotypes from the 10 cacao genetic groups described by Motamayor et al. (2008). Identifications were made according to the methodology of Motamayor et al. (2008) using a model-based Bayesian clustering method as implemented in STRUCTURE v2.3.4 (Pritchard et al. 2000). Ten clusters (K = 10) were used to detect samples mislabeled with respect to the genotypes described by Motamayor et al. (2008). On the basis of allele size, individual samples were assigned to clusters using the admixture model with correlated allele frequencies. We performed 10 independent runs of 200,000 iterations with an initial burn-in of 100,000 iterations, and selected the run with the highest estimated Ln Pr(X|K). Reference plants that grouped according to their clone names into a cluster different from the one described by Motamayor et al. (2008) or that displayed a coefficient of membership (Q) below 0.9 were eliminated from


subsequent analyses. By selecting plants with Q ≥ 0.7, Motamayor et al. (2008) obtained strong differentiation (Fst = 0.46) between cacao genetic groups. In our case, a threshold value of 0.9 was required to obtain such a high Fst (0.466). The difference observed between these two studies may be due to the number of microsatellite markers used, i.e., 15 in the present study vs. 106 in the other (Lanaud et al. 1999; Saunders et al. 2004). This selection procedure allowed us to retain 39 reference plants.

Genetic distance (1 - proportion of shared alleles) (Bowcock et al. 1994) was calculated among all 39 reference plants to obtain a distance matrix using the adegenet software package (Jombart 2008; Jombart and Ahmed 2011). To visualize genetic structure, a dendrogram was generated from this distance matrix using the unweighted pair-group method with arithmetic averages (UPGMA) clustering technique as implemented in the phangorn software package (Schliep 2011), with 10,000 bootstrap estimations.
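The distance used above (1 - proportion of shared alleles) can be illustrated with a minimal Python sketch. It is not the adegenet implementation: the function name, the genotype layout (one pair of allele sizes per locus) and the handling of missing data are assumptions made for illustration only.

```python
from collections import Counter


def shared_allele_distance(ind_a, ind_b):
    """Return 1 - proportion of shared alleles between two diploid individuals.

    Each individual is a list with one (allele1, allele2) tuple per locus,
    or None where the genotype is missing (assumed convention).
    """
    shared = 0
    compared = 0
    for geno_a, geno_b in zip(ind_a, ind_b):
        if geno_a is None or geno_b is None:
            continue  # skip loci that cannot be compared
        # Multiset intersection counts alleles in common (0, 1 or 2 per locus).
        shared += sum((Counter(geno_a) & Counter(geno_b)).values())
        compared += 2
    if compared == 0:
        raise ValueError("no locus is genotyped in both individuals")
    return 1.0 - shared / compared


# Hypothetical two-locus genotypes (allele sizes in bp).
plant_1 = [(129, 129), (227, 231)]
plant_2 = [(129, 131), (227, 231)]
print(shared_allele_distance(plant_1, plant_2))  # 0.25
```

Applying the function to every pair of plants yields the symmetric distance matrix that a clustering method such as UPGMA takes as input.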


UPGMA and principal coordinate analyses

A UPGMA dendrogram including all 537 Cuban, 39 reference and 30 Trinitario plants was generated as previously described for mislabeling detection. The three most frequently represented reference groups in the set of Cuban plants were determined according to membership coefficients of the latter in the 10 reference groups. A principal coordinate analysis was performed on the 537 Cuban plants and reference plants in these three groups only (for the sake of clarity) using the ape package (Paradis et al. 2004). A two-dimensional plot was generated based on the first two components.

Statistical analyses and software

All packages used for genetic diversity analysis, distance calculation, UPGMA dendrogram construction and principal coordinate analysis were from the R statistical language v2.15.0 (R Core Team 2012).
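For readers unfamiliar with UPGMA, the clustering step can be sketched in a few lines of Python. This is a plain textbook version written for illustration, not the phangorn code used in the study; the function name, the nested-tuple tree representation and the example distances are assumptions, and bootstrapping is not shown.

```python
def upgma(labels, dist):
    """Cluster items by UPGMA.

    labels: list of item names; dist: square symmetric distance matrix.
    Returns (tree, merges), where tree is a nested tuple and merges lists
    (left_subtree, right_subtree, node_height) in merge order.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    pair_dist = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), d_ab = min(pair_dist.items(), key=lambda item: item[1])
        del pair_dist[(a, b)]
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d_ac = pair_dist.pop((min(a, c), max(a, c)))
            d_bc = pair_dist.pop((min(b, c), max(b, c)))
            pair_dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, d_ab / 2))  # node height = half the distance
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree, merges


# Hypothetical distances among three plants.
tree, merges = upgma(["A", "B", "C"], [[0.0, 0.2, 0.8], [0.2, 0.0, 0.6], [0.8, 0.6, 0.0]])
print(tree)  # ('C', ('A', 'B'))
```

In this toy example A and B are joined first because they are closest, and C attaches at the average of its distances to A and B.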

Analysis of molecular variance

An analysis of molecular variance was performed to assess the genetic structure of the 39 reference plants selected in the mislabeling detection step. Arlequin v3.5.1.3 (Excoffier et al. 2005) was used on the individual genotypes, with 1,000 bootstrap replicates.

STRUCTURE analysis

The population structure of the cacao plants was determined independently in two steps using STRUCTURE v2.3.4 (Pritchard et al. 2000). The same parameters were used as those employed for detection of mislabeled material. First, only the 537 Cuban collected plants were analyzed. K was varied from 1 to 12 to infer the best number of population clusters. The most probable K was then determined using the graphical method of Evanno et al. (2005) and Bayes' Rule (Pritchard et al. 2010). Second, the collection was reanalyzed together with the 39 selected reference samples from the 10 previously described cacao genetic groups (Motamayor et al. 2008) using the prior population information option (Hubisz et al. 2009). To assign each of the studied genotypes to one of the 10 genetic groups, K was set to 10; each individual was then assigned to the cluster for which its coefficient of membership (Q value) was the highest.
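The ΔK statistic behind the graphical method of Evanno et al. (2005) can be sketched as follows. This is an illustrative version under stated assumptions: it uses the common formulation based on the mean Ln P(X|K) across runs divided by the standard deviation at K, and the function name and the log-likelihood values are hypothetical, not output from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for each K that has neighbours K-1 and K+1.

    lnp_by_k maps K to the list of Ln P(X|K) values from replicate runs
    (at least two runs per K are needed for the standard deviation).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined at the ends of the K range
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # identical runs: the ratio is undefined
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical replicate log-likelihoods for K = 1..4.
runs = {1: [-100.0, -102.0], 2: [-50.0, -52.0], 3: [-48.0, -50.0], 4: [-47.0, -49.0]}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(best_k)  # 2
```

The K with the largest ΔK marks the sharpest break in the slope of the likelihood curve, which is why the first and last tested K values can never be selected by this criterion alone.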

Results

Analysis of genetic diversity

A total of 139 alleles from the 15 loci, corresponding to an average of 9.267 alleles per locus, were detected in the 537 Cuban cacao plants (Table 2). The number of alleles per locus varied from 6 for microsatellites mTcCIR1, mTcCIR7 and mTcCIR24 to 17 for the microsatellite mTcCIR33. Ho varied from 0.181 for mTcCIR24 to 0.660 for mTcCIR60, with a mean of 0.370; He varied from 0.182 for mTcCIR24 to 0.644 for mTcCIR40, with a mean of 0.419. The test for departure from Hardy–Weinberg equilibrium revealed significant deviations (p < 0.05) in 8 of the 15 loci. All of these loci exhibited a deficit of heterozygotes (Table 2). PIC values ranged from 0.169 for mTcCIR24 to 0.578 for mTcCIR40, with a mean of 0.379. Two loci (mTcCIR40 and mTcCIR60) were highly polymorphic, with PIC values higher than 0.5. One locus (mTcCIR24) exhibited little polymorphism, with a PIC less than 0.25. The Garza–Williamson index was low, ranging from 0.133 for locus mTcCIR12 to 0.500 for loci mTcCIR1, mTcCIR7, mTcCIR8 and mTcCIR40, with a mean value of 0.384.
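As a worked illustration of the per-locus statistics reported here, the sketch below computes Ho, He, PIC and the Garza–Williamson index in Python. It is not the adegenet or Arlequin code: the function names and the toy genotypes are hypothetical, He is taken as 1 minus the sum of squared allele frequencies without a small-sample correction (an assumption), PIC follows the formula of Botstein et al. (1980), and M = k/r uses the allele size range in bp as defined in the Methods.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Ho, He, PIC) for one locus from diploid genotypes.

    genotypes: list of (allele1, allele2) tuples, one per plant.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    freqs = [count / len(alleles) for count in Counter(alleles).values()]
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    he = 1.0 - sum(p * p for p in freqs)
    # PIC = He minus the expected share of uninformative heterozygote matings.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return ho, he, he - cross


def garza_williamson(n_alleles, size_min, size_max):
    """M = k / r, with r taken as the allele size range in bp."""
    return n_alleles / (size_max - size_min)


# Toy locus with two equally frequent alleles (hypothetical data).
print(locus_stats([(129, 129), (129, 131), (131, 131), (129, 131)]))  # (0.5, 0.5, 0.375)
# mTcCIR1 in Table 2: 6 alleles spanning 129-141 bp.
print(garza_williamson(6, 129, 141))  # 0.5
```

A locus whose alleles are spread thinly over a wide size range, such as mTcCIR12, gives a small M, which is the pattern interpreted in the text as a signature of a past bottleneck.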


Table 2 Summary statistics of 15 SSR loci genotyped in 537 ancient Cuban cacao plants collected in the field in all of the regions of cacao cultivation in Cuba

Locus     Hobs   Hexp   No. alleles  Size range (bp)  PIC value  M      HWE (p)
mTcCIR1   0.404  0.418  6            129–141          0.337      0.500  0.031
mTcCIR6   0.335  0.363  9            227–251          0.341      0.375  0.369
mTcCIR7   0.238  0.300  6            149–161          0.279      0.500  0.076
mTcCIR8   0.270  0.292  11           284–306          0.263      0.500  0.096
mTcCIR11  0.364  0.399  8            290–316          0.361      0.346  0.000
mTcCIR12  0.440  0.425  10           177–252          0.375      0.133  1.000
mTcCIR15  0.449  0.526  10           234–258          0.463      0.417  0.003
mTcCIR18  0.387  0.441  7            332–356          0.392      0.292  0.015
mTcCIR22  0.318  0.376  9            275–296          0.330      0.429  0.017
mTcCIR24  0.181  0.182  6            186–202          0.169      0.375  1.000
mTcCIR26  0.248  0.346  7            285–305          0.319      0.400  0.003
mTcCIR33  0.380  0.486  17           273–345          0.467      0.250  0.001
mTcCIR37  0.394  0.495  12           135–175          0.470      0.325  0.045
mTcCIR40  0.481  0.644  11           262–286          0.578      0.500  0.283
mTcCIR60  0.660  0.597  10           188–214          0.535      0.423  0.194
Mean      0.370  0.419  9.267                         0.379      0.384

Hobs observed heterozygosity, Hexp expected heterozygosity, PIC polymorphism information content, M Garza–Williamson index, HWE (p) p value of the test for departure from Hardy–Weinberg equilibrium

Fig. 2 Population structure inferred by Bayesian clustering in STRUCTURE v2.3.4 to detect labeling errors for (a) 77 reference plants representing the 10 cacao genetic groups of Motamayor et al. (2008); (b) 51 reference plants selected after elimination of those placed into different clusters of Motamayor et al. (2008) than suggested by their clone names; and (c) 39 reference plants selected after elimination of those with a genetic group membership coefficient (Q) less than 0.9


Fig. 3 Dendrogram based on unweighted pair-group method with arithmetic averages (UPGMA) clustering of the 39 Bayesian-selected reference accessions

Mislabeling detection

The first selection from the initial 77 reference plants was based on their correspondence to the genetic groups of Motamayor et al. (2008). Among these plants, 26 (33.77 %) were classified into a different genetic group than the ones proposed by Motamayor et al. (2008) and were therefore discarded. A second selection was then made from the 51 remaining plants. Thirty-nine plants with Q > 0.9 were retained as the plants most representative of the 10 cacao genetic groups (Fig. 2). In the UPGMA dendrogram of the 39 selected reference plants (Fig. 3), all plants in a given genetic group of Motamayor et al. (2008), with the exception of Nacional, were clustered together. The 39 selected samples were thus characterized by a strong population structure corresponding to the 10 clearly differentiated genetic groups of Motamayor et al. (2008). These plants were consequently well suited for use as reference plants for DNA fingerprinting of the ancient Cuban cacao plants.

Analysis of molecular variance

Analysis of molecular variance of the 39 selected reference plants revealed that 47.46 % of the total observed molecular variance was found between cacao genetic groups, 9.00 % between individuals within genetic groups and 43.54 % within individuals.


Fig. 4 Selection of the most probable number of clusters for Bayesian clustering (STRUCTURE) of 537 ancient Cuban cacao plants using the graphical method of Evanno et al. (2005). The most probable number of clusters was determined to be K = 2

All of the components of variance were significant at p < 0.005.

STRUCTURE analysis

Evaluation of STRUCTURE analysis results using the graphical method (Evanno et al. 2005) and Bayes' rule (Pritchard et al. 2000) indicated that the 537 analyzed Cuban plants of ancient origin were best grouped into two clusters (K = 2), with 99.81 % of plants always placed into the same cluster across the 10 performed runs (Fig. 4). Clusters 1 and 2 contained 419 (78.03 %) and 118 (21.97 %) plants, respectively (Fig. 5). The mean Q-value for each individual in a cluster was 0.952, with a clear differentiation between the two clusters. A total of 518 (96.46 %) and 463 (86.22 %) plants had Q values >0.7 and >0.9, respectively. A total of 125 alleles were classified into cluster 2, and 87 into cluster 1. A similar result was obtained for the private alleles, with 51 in cluster 2 and 14 in cluster 1. Observed and expected heterozygosities were also higher in cluster 2 (Ho = 0.599; He = 0.625) than in cluster 1 (Ho = 0.306; He = 0.339). Although smaller, cluster 2 thus exhibited higher genetic diversity than cluster 1 (Table 3). A relationship was found between clusters and plant geographical locations. In particular, 81.27 % of the plants in cluster 1 were collected from eastern Cuba, while 68.57 % of the plants in cluster 2 were obtained from central Cuba.

K was then set to 10 and membership in the 10 cacao genetic groups defined by Motamayor et al. (2008) was assessed for each ancient Cuban cacao plant. Among the Cuban cacao plants, 61.64 % (331 plants) corresponded predominantly to Amelonado and 27.37 % (147 plants) to Criollo. The 59 remaining plants were predominantly associated with one of the following groups: Marañón (5.40 %), Iquitos (2.23 %), Contamana (1.49 %), Nanay (1.12 %) and Nacional (0.75 %) (Fig. 6). However, membership coefficients of Cuban ancient cacao plants in the 10 cacao genetic groups were low: the mean value was 0.381 with a maximum of 0.621, indicating an admixture of different genetic groups. In 259 plants (48.23 %), in fact, more than


Fig. 5 Population structure of 537 ancient Cuban cacao plants inferred using Bayesian clustering in STRUCTURE with K = 2

Table 3 Allelic richness, private alleles and observed and expected heterozygosity in each of the two clusters of ancient Cuban cacao plants obtained using Bayesian clustering (STRUCTURE version 2.3.4)

| SSR | Allelic richness, Cluster 1 | Allelic richness, Cluster 2 | Private alleles, Cluster 1 | Private alleles, Cluster 2 | Ho, Cluster 1 | Ho, Cluster 2 | He, Cluster 1 | He, Cluster 2 |
|---|---|---|---|---|---|---|---|---|
| mTcCIR1 | 4 | 5 | 1 | 2 | 0.358 | 0.568 | 0.395 | 0.482 |
| mTcCIR6 | 6 | 7 | 2 | 3 | 0.243 | 0.661 | 0.269 | 0.614 |
| mTcCIR7 | 2 | 6 | 0 | 4 | 0.162 | 0.509 | 0.196 | 0.561 |
| mTcCIR8 | 4 | 11 | 0 | 7 | 0.203 | 0.509 | 0.225 | 0.491 |
| mTcCIR11 | 6 | 8 | 0 | 2 | 0.313 | 0.547 | 0.306 | 0.660 |
| mTcCIR12 | 5 | 9 | 1 | 5 | 0.348 | 0.763 | 0.347 | 0.644 |
| mTcCIR15 | 6 | 10 | 0 | 4 | 0.363 | 0.754 | 0.417 | 0.753 |
| mTcCIR18 | 4 | 7 | 0 | 3 | 0.315 | 0.644 | 0.341 | 0.696 |
| mTcCIR22 | 7 | 7 | 2 | 2 | 0.282 | 0.449 | 0.308 | 0.572 |
| mTcCIR24 | 3 | 6 | 0 | 3 | 0.129 | 0.364 | 0.129 | 0.344 |
| mTcCIR26 | 6 | 6 | 1 | 1 | 0.206 | 0.431 | 0.286 | 0.554 |
| mTcCIR33 | 9 | 15 | 2 | 7 | 0.276 | 0.746 | 0.351 | 0.795 |
| mTcCIR37 | 9 | 10 | 2 | 3 | 0.327 | 0.616 | 0.383 | 0.753 |
| mTcCIR40 | 9 | 9 | 2 | 2 | 0.462 | 0.552 | 0.605 | 0.713 |
| mTcCIR60 | 7 | 9 | 1 | 3 | 0.600 | 0.873 | 0.533 | 0.735 |
| Total | 87 | 125 | 14 | 51 | | | | |
| Mean | 5.800 | 8.333 | 0.933 | 3.400 | 0.306 | 0.599 | 0.339 | 0.625 |
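The per-locus allele counts and private alleles reported in Table 3 can be illustrated with simple set operations. The sketch below assumes that a private allele is one observed in one cluster and absent from the other; the function name and the allele sizes are hypothetical and are not data from this study.

```python
def allele_summary(cluster_1, cluster_2):
    """Count distinct and private alleles at one locus for two clusters.

    Each argument is an iterable of allele sizes (bp) observed in that cluster.
    """
    a1, a2 = set(cluster_1), set(cluster_2)
    return {
        # Number of distinct alleles detected in each cluster.
        "n_alleles": (len(a1), len(a2)),
        # Alleles found in one cluster but not in the other.
        "private": (sorted(a1 - a2), sorted(a2 - a1)),
    }


# Hypothetical allele sizes, for illustration only (not data from this study).
summary = allele_summary([128, 130, 136, 142], [128, 130, 132, 136, 144])
print(summary)
```

With these made-up sizes the counts happen to mirror the pattern of the mTcCIR1 row (4 and 5 alleles, of which 1 and 2 are private).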

Fig. 6 Population structure of ancient Cuban cacao plants inferred using STRUCTURE based on the 10 cacao genetic groups described by Motamayor et al. (2008). Each vertical line represents a genotype. Admixed individuals are denoted by multiple colors. Membership of the 537 ancient Cuban plants in clusters 1 or 2 was determined solely on the basis of the 537 plants. (Color figure online)
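The way membership coefficients translate into the group percentages reported in the text can be sketched as follows. This is a minimal illustration, assuming that each plant is attributed to the group with its largest membership coefficient; the function names, plant labels and coefficients are hypothetical, and the actual coefficients were estimated with STRUCTURE.

```python
def predominant_group(q_row):
    """Return (group, coefficient) for the highest membership coefficient."""
    group = max(q_row, key=q_row.get)
    return group, q_row[group]


def summarize(q_matrix):
    """Percentage of plants per predominant group and the mean top coefficient."""
    counts = {}
    top = []
    for q_row in q_matrix.values():
        group, q = predominant_group(q_row)
        counts[group] = counts.get(group, 0) + 1
        top.append(q)
    n = len(q_matrix)
    shares = {g: round(100 * c / n, 2) for g, c in counts.items()}
    return shares, sum(top) / n


# Hypothetical membership coefficients for four plants (illustration only).
q_matrix = {
    "plant_a": {"Amelonado": 0.55, "Criollo": 0.35, "Iquitos": 0.10},
    "plant_b": {"Amelonado": 0.30, "Criollo": 0.60, "Iquitos": 0.10},
    "plant_c": {"Amelonado": 0.50, "Criollo": 0.20, "Iquitos": 0.30},
    "plant_d": {"Amelonado": 0.25, "Criollo": 0.25, "Iquitos": 0.50},
}
shares, mean_top = summarize(q_matrix)
print(shares, mean_top)

# The same percentage arithmetic applied to the counts reported in the text.
amelonado_share = round(100 * 331 / 537, 2)
criollo_share = round(100 * 147 / 537, 2)
```

The last two lines reproduce the reported 61.64 % and 27.37 % from the plant counts given in the text (331 and 147 of 537).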


Fig. 7 UPGMA dendrogram of 537 ancient Cuban cacao plants, 39 reference plants from 10 genetic groups described by Motamayor et al. (2008) and 30 Trinitario reference plants. Ancient Cuban and Trinitario plants were similarly distributed between Criollo and Amelonado reference plants. Ancient Cuban cacao plants were also clustered with Marañón reference plants

40 % of the 15 markers were an admixture of Amelonado and Criollo, the two cacao groups from which the Trinitario group originated (Motamayor et al. 2003). In addition, 389 plants (72.44 %) were detected in which more than 55 % of the 15 markers consisted of an admixture of the three cacao genetic groups Amelonado, Criollo and Marañón (Fig. 6).

The genetic composition of clusters 1 and 2 was then analyzed on the basis of their correspondence to the 10 genetic groups of Motamayor et al. (2008). The ancient Cuban cacao plants from cluster 1 were assigned to only three genetic groups: Amelonado (71.84 %), Criollo (25.06 %) or Marañón (3.10 %). Cluster 2 was more diverse, with plants assigned to seven different genetic groups. Among them, the most predominant groups were the same as in cluster 1: Criollo (35.59 %), Amelonado (25.42 %) and Marañón (13.56 %). The other genetic groups detected in lesser proportions were Iquitos (10.17 %), Contamana (6.78 %), Nanay (5.09 %) and Nacional (3.39 %) (Fig. 6). The more diverse origin of cluster 2 is consistent with the higher number of detected alleles and private alleles in this cluster compared with cluster 1.

UPGMA and principal coordinate analyses

According to the results of the STRUCTURE analysis, the principal genetic groups of ancient Cuban cacao plants were Amelonado and Criollo. This result suggested a possible Trinitario origin. For this reason, a complementary analysis using UPGMA clustering was performed to study genetic relationships among the ancient Cuban plants, the 30 Trinitario reference plants and the 10 genetic groups. The UPGMA dendrogram largely agreed with the results of the STRUCTURE analysis (Fig. 7). Cuban plants of ancient origin constituted a large cluster comprising three smaller clusters corresponding to Amelonado, Criollo and Marañón; these three clusters were separated from one another, with the Marañón cluster being the most distant.
This result was consistent with that of the cluster analysis, in which the ancient Cuban plants were found to consist of an admixture of Criollo, Amelonado and Marañón. The Trinitario reference plants were distributed among the Cuban plants of ancient origin and the clusters corresponding to Amelonado and Criollo. Principal coordinate analysis revealed that the ancient Cuban plants were scattered among reference plants corresponding to Amelonado, Criollo and Marañón genetic groups, and were distinct from the seven other reference groups. Although only a small percentage of the total variance was captured in the principal coordinate analysis, this result agreed with the conclusion inferred from STRUCTURE and UPGMA cluster analyses: ancient Cuban cacao plants are fundamentally a mixture of Criollo, Amelonado and Marañón genetic groups (Fig. 8).
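The UPGMA clustering referred to above joins, at each step, the two clusters with the smallest average pairwise distance. The sketch below is a minimal pure-Python illustration of that rule and is not the software used in the study; the function name, the item labels and the distance values are hypothetical.

```python
from itertools import combinations


def upgma(labels, dist):
    """Agglomerate items by average linkage (UPGMA).

    labels: list of item names.
    dist: dict mapping frozenset({a, b}) to the distance between items a and b.
    Returns the merge history as (cluster_1, cluster_2, distance) tuples,
    where clusters are tuples of item names.
    """
    clusters = [(name,) for name in labels]

    def average_distance(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge them.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        distance = average_distance(c1, c2)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2, distance))
    return merges


# Hypothetical genetic distances, for illustration only.
labels = ["Cuban_1", "Cuban_2", "Amelonado_ref", "Criollo_ref"]
dist = {
    frozenset(("Cuban_1", "Cuban_2")): 0.2,
    frozenset(("Cuban_1", "Amelonado_ref")): 0.3,
    frozenset(("Cuban_2", "Amelonado_ref")): 0.4,
    frozenset(("Cuban_1", "Criollo_ref")): 0.6,
    frozenset(("Cuban_2", "Criollo_ref")): 0.6,
    frozenset(("Amelonado_ref", "Criollo_ref")): 0.8,
}
merges = upgma(labels, dist)
for step in merges:
    print(step)
```

In this toy example the two Cuban plants join first, then the Amelonado reference, and the Criollo reference joins last. The returned values are linkage distances; dendrogram node heights are conventionally drawn at half these values.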

Fig. 8 Plot of the first two main axes from principal coordinate analysis of ancient Cuban cacao plants and reference plants from Amelonado, Criollo and Marañón genetic groups described by Motamayor et al. (2008). Colored ellipses indicate 90 % confidence levels for the 6 Criollo (red), 3 Amelonado (blue) and 7 Marañón (orange) reference plants and 537 ancient Cuban cacao plants (black). Axes indicate the percentage of total variance captured by the first two principal coordinates. (Color figure online)

Discussion

Analysis of genetic diversity

As noted by Saunders et al. (2004), loci with the highest (mTcCIR37 and mTcCIR33) and lowest (mTcCIR1, mTcCIR7 and mTcCIR24) numbers of alleles were among the most and least polymorphic, respectively. Large variation was observed in microsatellite allele frequency, with the most frequent allele accounting for more than 60 % at every locus except mTcCIR40 and mTcCIR60. In addition, the summed frequency of the two most common alleles was greater than 70 % at each locus. These results are similar to those reported by other authors for Trinitario plants, in which only two predominant alleles were found for each locus, with the other alleles exhibiting much lower frequencies (N’Goran et al. 2000; Clement et al. 2003). The presence of two predominant alleles per locus accounted for the low PIC values calculated for the anciently introduced Cuban plants and may be related to the Trinitario derivation of these plants. Approximately 90 % of modern Criollo and Trinitario varieties are apparently products of hybridization and subsequent introgression between two genetically uniform types from the island of Trinidad: (1) homozygous ancient Criollo, a surviving remnant of the “blast” that almost destroyed cacao plantations in 1727 and (2) a reduced number of homozygous Lower Amazon Forastero genotypes introduced from South America. This parental background is responsible, in most cases, for the presence of the same two alleles at each locus in modern Criollo/Trinitario hybrid groups. Genotypes belonging to these groups would therefore represent different levels of admixture of Criollo and Lower Amazon Forastero parental genomes (Motamayor et al. 2003; Lanaud et al. 2001; Bartley 2005). The Trinitario hybrids were more productive and disease resistant than the Criollo and Amelonado groups, the cacao types planted during the period in which Trinitario originated. For this reason, Trinitario became the predominant type, and began to replace Criollo in Central and South America in 1825 and Nacional in Ecuador in 1890.
It was also introduced into Sri Lanka in 1834 and 1880, and later into Singapore, Fiji, Samoa, Tanzania and Madagascar. In Brazil and Africa, Trinitario was also crossed with Amelonado. Trinitario is currently associated with a high proportion of the world’s cacao production (Motamayor et al. 2003; Reis Monteiro et al. 2009).

The allelic diversity found in our sampling of Cuban cacao of ancient origin (139 total alleles with a mean of 9.27 alleles per locus) was higher than that of most reported cacao studies. For example, the total number and mean number of alleles per locus reported in natural populations from Brazil, Peru and Bolivia ranged from 49 to 110 and 3.7 to 7.3, respectively (Zhang et al. 2006, 2012); these values were also lower in plantations from Ecuador, Nicaragua and the Dominican Republic: 63–116 and 4.2–8.1, respectively (Boza et al. 2013; Trognitz et al. 2011; Zhang et al. 2008). An exception to this trend was uncovered on plantations in Peru, where values in the same range as in Cuba (150 total alleles with a mean of 10 alleles per locus) have been reported (Zhang et al. 2011). Finally, most values in germplasm collections were also lower than in Cuba, being in the range of 114–132 total alleles and 7.5–8.8 alleles per locus in germplasm from Ghana, Java and the Dominican Republic and in the USDA-ARS TARS collection (Boza et al. 2013; Irish et al. 2010; Opoku et al. 2007; Susilo et al. 2011). In contrast, total alleles and alleles per locus from Cameroon were similar to those in Cuba (125 and 9.4, respectively; Efombagn et al. 2008), and several germplasm collections from broad geographic areas such as the Upper Amazon, West Africa and the CATIE collection exhibited higher values than in Cuba, i.e., 31–180 and 12–14.2, respectively (Aikpokpodion et al. 2009; Efombagn et al. 2008; Zhang et al. 2009b).

Nevertheless, the relatively high number of alleles uncovered in our study may be related to our larger sample size. Ho and He are independent of sample size, with He also independent of the mode of reproduction. These parameters are thus suitable for comparing genetic diversity among samples of different sizes. Both He and Ho were generally found to be lower in the ancient Cuban cacao plants than in other samples detailed above, Ecuadorian “Refractarios” plantations, and germplasm collections in Cameroon and West Africa lacking Ho and He information (Aikpokpodion et al. 2009; Efombagn et al. 2008; Zhang et al. 2008).
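The diversity statistics compared in this section (Ho, He and PIC) can be illustrated on a single locus. The sketch below assumes the standard definitions: Ho is the proportion of heterozygous individuals, He is one minus the sum of squared allele frequencies (without the small-sample correction that some packages apply), and PIC follows Botstein et al. (1980). The function names and genotypes are hypothetical, chosen so that two alleles predominate.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies at one locus from diploid genotypes (a1, a2)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Ho: proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(freqs):
    """He: 1 - sum(p_i^2), without small-sample correction."""
    return 1 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content (Botstein et al. 1980)."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross = sum(2 * (x * x) * (y * y) for x, y in combinations(p, 2))
    return 1 - homozygosity - cross


# Hypothetical genotypes (allele sizes in bp), for illustration only.
genotypes = ([(150, 150)] * 5 + [(150, 152)] * 2
             + [(152, 152)] * 2 + [(150, 160)])
freqs = allele_frequencies(genotypes)
ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(freqs)
pic_value = pic(freqs)
print(freqs, ho, he, pic_value)
```

In this toy locus the two most common alleles account for 95 % of all copies, and Ho falls below He, the same direction of difference as reported for the ancient Cuban plants.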
For the ancient Cuban plants, mean He across all loci was 0.419; this value was lower than He values of the samples detailed above, which ranged from 0.476 to 0.765. The low He of ancient Cuban cacao may be due to the presence of two highly frequent alleles at each locus, a situation arising from a bottleneck event that occurred during or after the introduction of cacao onto the island. Mean Ho across all loci was 0.37 in the ancient Cuban plants, which was lower than the Ho values of most samples detailed above (0.4–0.717) with the exception of natural Brazilian and Bolivian populations (0.334–0.37) (Sereno et al. 2006; Zhang et al. 2012). This low Ho value may be a consequence of the evolutionary trajectory of the original cacao populations, in which genetic drift in the Trinitario genetic group led to fixation of alleles (Motamayor et al. 2003).

Eight of the 15 loci were not in Hardy–Weinberg equilibrium, as evidenced by lower Ho values compared with He. This deviation from Hardy–Weinberg equilibrium may reflect crossing between genetically close individuals (a regularly occurring process in small populations) or even a degree of autogamy. These 15 microsatellite markers have been mapped in several cacao populations, with no apparent null alleles reported (Brown et al. 2008). Our results are similar to those observed in “Refractario” cacao (Zhang et al. 2008), populations from 19 Amazon river basins in Brazil (Sereno et al. 2006), Amazon Peruvian valleys (Zhang et al. 2006) and Ecuador (Zhang et al. 2008) and a germplasm collection in Ghana (Opoku et al. 2007). On the other hand, populations with most loci in Hardy–Weinberg equilibrium have been reported from a Peruvian valley forest (Kuhn et al. 2008) and a Hawaiian cacao plantation (Schnell et al. 2005).

Calculated Garza–Williamson index values were consistent with the known history of cacao cultivation in Cuba. The mean value among all loci was 0.384, far below the 0.68 threshold under which a reduction in population size is inferred (Garza and Williamson 2001). According to Cuban cacao experts, cacao cultivation expanded after each of its two hypothesized introductions into Cuba and then diminished, with a subsequent reduction in population size. This diminution was more pronounced in the central region of the country (Núñez González 2010). Few farmers in central Cuba currently dedicate themselves to cacao cultivation, and most plants are isolated, very old and essentially used for home consumption.
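The Garza–Williamson index discussed above relates the number of alleles at a microsatellite locus to the span of allele sizes. The sketch below assumes the common formulation M = k / (R + 1), where k is the number of distinct alleles and R is the allele size range expressed in repeat units; the function name, the repeat length and the allele sizes are hypothetical and serve only as an illustration.

```python
def garza_williamson(allele_sizes, repeat_length):
    """M = k / (R + 1): k distinct alleles, R allele size range in repeat units.

    Assumes allele sizes (bp) differ by whole multiples of repeat_length.
    """
    sizes = set(allele_sizes)
    k = len(sizes)
    allele_range = (max(sizes) - min(sizes)) // repeat_length
    return k / (allele_range + 1)


# Hypothetical dinucleotide locus (sizes in bp); illustration only.
m = garza_williamson([140, 142, 150, 160], repeat_length=2)
print(round(m, 3), m < 0.68)
```

Gaps in the allele size series, as in this toy locus (four alleles spread over eleven possible size classes), pull the index down; values below 0.68 are the ones interpreted as a sign of a past reduction in population size.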
Mislabeling detection

The percentage of plants misclassified according to the genetic groups proposed by Motamayor et al. (2008) (33.77 %) was consistent with other studies of germplasm collections, which detected labeling error rates of approximately 40 % (Boza et al. 2013; Saunders et al. 2001; Sounigo et al. 2001; Motilal and Butler 2003; Turnbull et al. 2004). The percentage of variation among the ten groups within the 39 selected cacao plants (47.46 %) was higher than the 38.1 % obtained by Motamayor et al. (2008), most likely a consequence of selecting only plants with a membership coefficient (Q) higher than 0.9. Use of this high Q value criterion revealed a strong population structure, with the 10 reference cacao genetic groups clearly separated; the 39 plants were thus highly suitable as reference genotypes to analyze ancient Cuban cacao diversity.

Population structure

Two clusters corresponding to the two regions of Cuban cacao cultivation were revealed by Bayesian methods using STRUCTURE software. The presence of these clusters is consistent with the occurrence of both hypothesized ancient introductions into Cuba. These introductions into the two regions would have taken place independently, being separated in time by more than two and a half centuries. Sharing among cultivators of fruits from plants with interesting agronomical characteristics, a traditional practice in Cuba, may have produced a mixture of these two clusters that weakened the genetic differentiation between them and, hence, between the geographic regions.

The central region of Cuba, associated with cluster 2, has a greater number of alleles, perhaps indicating higher genetic diversity of cacao plants introduced into this region. According to Cuban cacao experts, this region was the first area of cacao cultivation in the country. After the second introduction into Santiago de Cuba, however, cacao cultivation declined in the central region and nearly disappeared (Núñez González 2010). Ancient central Cuban cacao plants may actually be a mixture of progenies from the first introduction and plants brought from the eastern region, the new center of cultivation.
The high predominance of Amelonado and Criollo groups (89.01 %) and the mixed nature of Cuban germplasm revealed by STRUCTURE and UPGMA analyses indicate that ancient Cuban cacao originated as a hybrid of Amelonado and Criollo. Thus, ancient Cuban cacao may be a Trinitario hybrid (Motamayor et al. 2003).

Cuban cacao genetic groups were the same as those of selected farm samples from the nearby Dominican Republic (Boza et al. 2013). The Dominican Republic and Haiti constitute the island of Hispaniola, with the two countries being united until 1697 under Spanish domination. In both Cuba and the Dominican Republic, the Amelonado group was predominant, with more than 60 % of all plants corresponding to this group. Many of these plants were Trinitario hybrids. The second most important genetic group in both countries was Criollo, although it was more highly represented in Cuba than in the Dominican Republic. Furthermore, the other represented genetic groups (together accounting for less than 20 % of plants) were the same in both countries: Marañón, Iquitos, Contamana, Nanay and Nacional. This similarity in genetic groups of cacao constituting the germplasm of the two nearby countries supports the second proposed introduction from Haiti into Cuba around 1800. Cacao was most likely introduced to Hispaniola by the Spanish soon after its discovery; this origin would apply to cacao populations on the entire island comprising the present-day Dominican Republic and Haiti (Bartley 2005).

Conclusions

As revealed by our study, the allelic composition of anciently derived Cuban cacao populations is consistent with the introduction of one or several populations originating from the genetic groups described by Motamayor et al. (2008). According to our results, the original Cuban cacao population was an admixture of Brazilian Amelonado, Central American Criollo and Peruvian Marañón. Ancient Cuban cacao plants are likely Trinitario hybrids of Amelonado and Criollo genetic groups. Other genetic groups (Contamana, Nanay, Nacional and Iquitos) are represented in lesser proportions, but are only found in cluster 2 associated with central Cuba. The presence of these groups is compatible with the hypothesis of two independent introductions into Cuba. The similarity of cacao germplasm between Cuba and the Dominican Republic is also compatible with historical data. At least one introduction has been recorded from Haiti, the western portion of Hispaniola that along with the Dominican Republic constitutes that island.

Our genetic characterization of ancient Cuban cacao germplasm has provided information regarding its genetic diversity. In Cuba, as in several other cacao-producing countries, traditional cacao varieties are endangered; this situation is due to their replacement by modern, more productive commercial varieties that are usually genetically more uniform and sometimes of inferior quality. The information uncovered by our study can serve as a basis for conservation of ancient Cuban cacao and help reduce the loss of its genetic diversity. Ancient Cuban cacao plants, which are genetically highly diverse with a background from seven genetic groups, also show interesting characteristics that may justify their use in breeding and selection programs. These plants are cultivated without any crop protection and are often better adapted to Cuban ecological conditions, being resistant to the most common diseases present in the country. Furthermore, their Trinitario origin, along with Criollo characteristics such as white cotyledons found in seeds of some plants, are economically very attractive. The development of varieties based on local genetic resources could guarantee their long-term conservation and utilization.

Acknowledgments We thank the Coopération Universitaire au Développement (CUD) of Belgium for financial support and the International Cacao Germplasm Database (ICGD), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Centro Agronómico Tropical de Investigación y Enseñanza (CATIE) and Instituto de Investigaciones Agroforestales UCTB Baracoa (IIAB) gene banks for providing leaf samples. We are also grateful to farmers, researchers and technicians from the IIAB and personnel from Cuban provincial agricultural delegations for their cooperation, especially during cacao plantation field sampling.

References

Aikpokpodion PO, Motamayor JC, Adetimirin VO, Adu-Ampomah Y, Ingelbrecht I, Eskes AB, Schnell RJ, Kolesnikova-Allen M (2009) Genetic diversity assessment of sub-samples of cacao, Theobroma cacao L. collections in West Africa using simple sequence repeats marker. Tree Genet Genomes 5:699–711
Applied Biosystems (2006) Peak Scanner™ Software v1.0. Reference guide, p 68
Bartley BGD (2005) The genetic diversity of cacao and its utilization. CABI Publishing, Wallingford
Bhattacharjee R, Kumar PL (2007) Cacao. In: Kole C (ed) Genome mapping and molecular breeding in plants, vol 6. Technical Crops. Springer, Berlin, pp 127–142
Botstein D, White RL, Skolnick M, Davis RW (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32(3):314–331
Bowcock AM, Ruiz-Linares A, Tomfohrde J, Minch E, Kidd JR, Cavalli-Sforza LL (1994) High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455–457. doi:10.1038/368455a0
Boza EJ, Irish BM, Meerow AW, Tondo CL, Rodríguez OA, Ventura-López M, Gómez JA, Moore JM, Zhang D, Motamayor JC, Schnell RJ (2013) Genetic diversity, conservation, and utilization of Theobroma cacao L.: genetic resources in the Dominican Republic. Genet Resour Crop Evol 60:605–619
Brown JS, Sautter RT, Olano CT, Borrone JW, Kuhn DN, Motamayor JC, Schnell RJ (2008) A composite linkage map from three crosses between commercial clones of cacao, Theobroma cacao L. Trop Plant Biol 1:120–130
Centre de Coopération Internationale en Recherche Agronomique pour le Développement (2013) CocoaGenDB. http://cocoagendb.cirad.fr/index.html. Accessed 18 April 2013
Clement D, Risterucci AM, Motamayor JC, N’Goran J, Lanaud C (2003) Mapping quantitative trait loci for bean traits and ovule number in Theobroma cacao L. Genome 46:103–111. doi:10.1139/G02-118
Efombagn IBM, Motamayor JC, Sounigo O, Eskes AB, Nyassé S, Cilas C, Schnell R, Manzanares-Dauleux MJ, Kolesnikova-Allen M (2008) Genetic diversity and structure of farm and GenBank accessions of cacao (Theobroma cacao L.) in Cameroon revealed by microsatellite markers. Tree Genet Genomes 4:821–831
Esquivel M, Hammer K (1992) Native food plants and the American influence in Cuban Agriculture. In: Hammer K, Esquivel M, Knüpffer H (eds) “… y tienen faxones y fabas muy diversos de los nuestros…” Origin, Evolution and Diversity of Cuban Plant Genetic Resources. IPGRI, Gatersleben, pp 46–74
Esquivel M, Knüpffer H, Hammer K (1992) Inventory of cultivated plants. In: Hammer K, Esquivel M, Knüpffer H (eds) “… y tienen faxones y fabas muy diversos de los nuestros…” Origin, Evolution and Diversity of Cuban Plant Genetic Resources. IPGRI, Gatersleben, pp 213–406
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14:2611–2620
Excoffier L, Laval G, Schneider S (2005) Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50
FAO (2013) FAOSTAT. FAO. http://faostat.fao.org/. Accessed 29 May 2013
Garza JC, Williamson EG (2001) Detection of reduction in population size using data from microsatellite loci. Mol Ecol 10:305–318
Guo SW, Thompson EA (1992) Performing the exact test of Hardy–Weinberg proportions for multiple alleles. Biometrics 48:361–372
Hernández Castillo J (1978) Fitotecnia del cacao. Editorial Pueblo y Educación, La Habana, Cuba
Hubisz MJ, Falush D, Stephens M, Pritchard JK (2009) Inferring weak population structure with the assistance of sample group information. Mol Ecol Resour 9:1322–1332
Hurst WJ, Tarka SMJ, Powis TG, Valdez F Jr, Hester TR (2002) Cacao usage by the earliest Maya civilization. Nature 418:289–290
International Cocoa Germplasm Database (ICGD) (2011) NYSE Liffe/CRA Ltd./University of Reading, UK. http://www.icgd.reading.ac.uk. Accessed 12 July 2013
Irish BM, Goenaga R, Zhang D, Schnell R, Brown JS, Motamayor JC (2010) Microsatellite fingerprinting of the USDA-ARS tropical agriculture research station cacao (Theobroma cacao L.) germplasm collection. Crop Sci 50:656–667
Jombart T (2008) adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24:1403–1405. doi:10.1093/bioinformatics/btn129
Jombart T, Ahmed I (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi:10.1093/bioinformatics/btr521
Kuhn DN, Motamayor JC, Meerow AW, Borrone JW, Schnell RJ (2008) SSCP markers provide a useful alternative to microsatellites in genotyping and estimating genetic diversity in populations and germplasm collections of plant specialty crops. Electrophoresis 29:4096–4108
Lachenaud P, Zhang D (2008) Genetic diversity and population structure in wild stands of cacao trees (Theobroma cacao L.) in French Guiana. Ann For Sci 65:310. doi:10.1051/forest:2008011
Lanaud C, Risterucci AM, Pieretti I, Falque M, Bouet A (1999) Isolation and characterization of microsatellites in Theobroma cacao L. Mol Ecol 8:2141–2145
Lanaud C, Motamayor J-C, Risterucci A-M (2001) Implications of new insight into the genetic structure of Theobroma cacao L. for breeding strategies. In: Bekele F (ed) Proceedings of the international workshop on new technologies and cocoa breeding, Kota Kinabalu, Sabah, Malaysia, 16th–17th October 2000, INGENIC
Márquez Rivero JJ, Aguirre Gómez MB (2008) Manual técnico de manejo agrotécnico de las plantaciones de cacao. Ciudad de La Habana
Márquez Rivero JJ, Aguirre Gómez MB (2010) Cacao con denominación de origen. Metodología para su obtención en el Consejo Popular de Sabanilla del municipio Baracoa. Editora Agroecológica, La Habana
Motamayor JC, Risterucci AM, Lopez PA, Ortiz CF, Moreno A, Lanaud C (2002) Cacao domestication I: the origin of the cacao cultivated by the Mayas. Heredity 89:380–386
Motamayor JC, Risterucci AM, Heath M, Lanaud C (2003) Cacao domestication II: progenitor germplasm of the Trinitario cacao cultivar. Heredity 91:322–330
Motamayor JC, Lachenaud P, da Silva e Mota JW, Loor R, Kuhn DN, Brown JS, Schnell RJ (2008) Geographic and genetic population differentiation of the amazonian chocolate tree (Theobroma cacao L). PLoS ONE 3(10):e3311
Motilal L, Butler D (2003) Verification of identities in global cacao germplasm collections. Genet Resour Crop Evol 50:799–807
N’Goran JAK, Laurent V, Risterucci AM, Lanaud C (2000) The genetic structure of cocoa populations (Theobroma cacao L.) revealed by RFLP analysis. Euphytica 115:83–90
Núñez González N (2010) El cacao y el chocolate en Cuba, 2nd edn. Fundación Fernando Ortíz, La Habana
Oficina Nacional de Estadísticas e Información (2012) Agricultura, ganadería, silvicultura y pesca. In: Anuario Estadístico de Cuba 2011. La Habana, pp 217–244
Opoku SY, Bhattacharjee R, Kolesnikova-Allen M, Motamayor JC, Schnell R, Ingelbrecht I, Enu-Kwesi L, Adu-Ampomah Y (2007) Genetic diversity in cocoa (Theobroma cacao L.) germplasm collection from Ghana. J Crop Improv 20:73–87
Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20(2):289–290. doi:10.1093/bioinformatics/btg412
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959
Pritchard JK, Wen X, Falush D (2010) Documentation for structure software: Version 2.3. University of Chicago. University of Oxford, Oxford
Reis Monteiro W, Vanderlei Lopes U, Clement D (2009) Genetic improvement in cacao. In: Jain SM, Priyadarshan PM (eds) Breeding plantation tree crops: tropical species. Springer Science + Business Media, pp 589–626
Saunders JA, Hemeida AA, Mischke S (2001) USDA DNA fingerprinting programme for identification of Theobroma cacao accessions. In: Bekele F (ed) Proceedings of the international workshop on new technologies and cocoa breeding, Kota Kinabalu, Sabah, Malaysia, 16–17 October 2000. International Network for the Improvement of Cacao, pp 112–118
Saunders JA, Mischke S, Leamy EA, Hemeida AA (2004) Selection of international molecular standards for DNA fingerprinting of Theobroma cacao. Theor Appl Genet 110:41–47
Schliep KP (2011) phangorn: phylogenetic analysis in R. Bioinformatics 27(4):592–593
Schnell R, Olano CT, Brown JS, Meerow AW, Cervantes-Martinez C, Nagai C, Motamayor JC (2005) Retrospective determination of the parental population of superior cacao (Theobroma cacao L.) seedlings and association of microsatellite alleles with productivity. J Am Soc Hortic Sci 130(2):181–190
Sereno ML, Albuquerque PSB, Vencovsky R, Figueira A (2006) Genetic diversity and natural population structure of cacao (Theobroma cacao L.) from the Brazilian Amazon evaluated by microsatellite markers. Conserv Genet 7:13–24
Sounigo O, Christopher Y, Bekele F, Mooleedhar V, Hosein F (2001) The detection of mislabelled trees in the International Cocoa Genebank, Trinidad (ICG,T) and options for a global strategy for identification of accessions. In: Bekele F (ed) Proceedings of the international workshop on new technologies and Cocoa breeding, Kota Kinabalu, Sabah, Malaysia, 2001. International Network for the Improvement of Cacao, pp 34–39
Susilo AW, Zhang D, Motilal LA, Mischke S, Meinhardt LW (2011) Assessing genetic diversity in Java-Flavor cocoa (Theobroma cacao L.) germplasm by using simple sequence repeat (SSR) markers. Trop Agric Dev 55(2):84–92
Swanson J-D, Lee AC, Guiltinan MJ (2003) USDA cacao DNA fingerprinting ring test: results from Penn State University. Ingenic Newsl 8:22–24
R Core Team (2012) R: a language and environment for statistical computing. 2.15.0 edn. R Foundation for Statistical Computing, Vienna, Austria
Trognitz B, Scheldeman X, Hansel-Hohl K, Kuant A, Grebe H, Hermann M (2011) Genetic population structure of cacao plantings within a young production area in Nicaragua. PLoS ONE 6(1):e16056. doi:10.1371/journal.pone.0016056
Turnbull CJ, Hadley P (2011) International Cocoa Germplasm Database (ICGD). NYSE Liffe/CRA Ltd./University of Reading, UK. http://www.icgd.reading.ac.uk. Accessed 12 July 2013
Turnbull CJ, Butler DR, Cryer NC, Zhang D, Lanaud C, Daymond AJ, Ford CS, Wilkinson MJ, Hadley P (2004) Tackling mislabelling in cocoa germplasm collections. Ingenic Newsl 9:8–11
USDA ARS National Genetic Resources Program (2013) Germplasm Resources Information Network (GRIN). National Germplasm Resources Laboratory. http://www.ars-grin.gov/. Accessed 18 April 2013
Zhang D, Arevalo-Gardini E, Mischke S, Zúñiga-Cernades L, Barreto-Chavez A, Adriozola Del Águila J (2006) Genetic diversity and structure of managed and semi-natural populations of cocoa (Theobroma cacao) in the Huallaga and Ucayali Valleys of Peru. Ann Bot 98:647–655
Zhang D, Boccara M, Motilal L, Butler DR, Umaharan P, Mischke S, Meinhardt L (2008) Microsatellite variation and population structure in the “Refractario” cacao of Ecuador. Conserv Genet 9:327–337
Zhang D, Boccara M, Motilal L, Mischke S, Johnson ES, Butler DR, Bailey B, Meinhardt L (2009a) Molecular characterization of an earliest cacao (Theobroma cacao L.) collection from Upper Amazon using microsatellite DNA markers. Tree Genet Genomes 5:595–607
Zhang D, Mischke S, Johnson ES, Phillips-Mora W, Meinhardt L (2009b) Molecular characterization of an international cacao collection using microsatellite markers. Tree Genet Genomes 5:1–10
Zhang D, Arevalo Gardini E, Motilal LA, Baligar V, Bailey B, Zuñiga-Cernades L, Arevalo-Arevalo CE, Meinhardt L (2011) Dissecting genetic structure in farmer selections of Theobroma cacao in the Peruvian Amazon: implications for on farm conservation and rehabilitation. Trop Plant Biol 4:106–116
Zhang D, Martínez WJ, Johnson ES, Somarriba E, Phillips-Mora W, Astorga C, Mischke S, Meinhardt LW (2012) Genetic diversity and spatial structure in a new distinct Theobroma cacao L. population in Bolivia. Genet Resour Crop Evol 59:239–252. doi:10.1007/s10722-011-9680-y