Computational clustering integration of metabolomics ...

Computational clustering integration of metabolomics, transcriptomics and agronomical data for germplasm selection in a highly diverse tomato landrace collection

Cernadas RA1, Conte M1, Pividori M2, Insani M1, López MG1, Asís R3, D'Angelo M4, Zanor M4, Boggio S4, Valle E4, Asprelli P5, Peralta I6, Milone D2, Stegmayer G2, Carrari F1

ABSTRACT
Tomato (S. lycopersicum) is a major vegetable crop consumed worldwide that provides a valuable source of vitamins and antioxidants for the human diet. Because of the variability constraints associated with breeding programs, the phenotypic and genetic diversity of heirloom cultivars (landraces) emerges as a key resource from which to recover desired agronomic traits for crop improvement. Here, we surveyed a germplasm collection of 68 Andean tomato landraces maintained and cultivated by family farmers. Distinct sets of these accessions were cultivated in the Cuyo region (Mendoza) during several seasons (i.e. 2005-06, 2006-07, 2008-09, 2009-10, 2010-11 and 2011-12) and characterized by morpho-agronomic traits as well as by biochemical characters of the mature fruits. Our analyses combined i) GC-MS, NMR and HPLC to identify soluble and volatile fruit metabolites, ii) transcriptomics, and iii) computational biology to integrate the whole dataset. Preliminary results allow us to define genotypic clusters according to agronomical traits, including metabolite profiles, antioxidant properties and vitamin accumulation. We also explored the organoleptic properties of the different accessions to establish inter-cluster correlations between volatile content and fruit taste. Finally, a multi-focus clustering analysis based on accession diversity and environmental variation across the experimental seasons provides a method to infer which traits are most likely to be stably inherited.

METHODS
Sensory analyses of mature fruits were conducted as described previously, with at least 10 experienced panelists (Baldwin et al., 1998).
Antioxidant metabolites were measured by HPLC–DAD–MS/MS in mature fruits, and in vitro antioxidant capacities were determined by the TEAC and FRAP methods as described by Di Paola Naranjo et al. (2016). Volatile organic compounds (VOCs) of mature fruits from Andean tomato landraces were analyzed by gas chromatography-mass spectrometry (GC-MS) (Cortina et al., 2016), and metabolite profiles of soluble compounds were obtained by GC-TOF-MS as described before (Lisec et al., 2006). Data from a combination of years and locations are presented, from field trials performed in the Cuyo region (Mendoza, Argentina, 33°50′S, 68°52′W, 900 masl). Morpho-agronomic traits were recorded from the same experiments.

RESULTS
A total of one hundred tomato accessions, including heirloom varieties collected in the Argentinean Andean valleys (Figure 1) and commercial varieties of distinct origins, were phenotyped in detail. In particular, the heirloom varieties have been subjected to minimal selection during breeding by local farmers. Therefore, this germplasm constitutes a very valuable genetic resource.

All plants were cultivated under open field conditions in the Cuyo region of Mendoza, Argentina, during the spring-summer season (October-March).

Fresh fruits of 16, 18 and 22 accessions, harvested at the end of the 2010, 2011 and 2012 summers, respectively, were assayed for their organoleptic properties by sensory panels of partially trained personnel, based on trait descriptors.

Figure 1. Geo-references and phenotype of some tomato varieties used in this study.
Accession labels shown in the figure: 565, 564, 563, 562, 566, 569, 571, 570, 553, 568, 550, 560, 559, 557, 575, 554, 572, 571, 552, 551, 574, 3827, 558, 555, 573, 548, 561, 556, 549, 3833, 3842, 3819, 3832, 3815, 3831, 3812, 3811, 3825, 3808, 3836, 3829, 3840, 3816, 3824, 3837, 3837, 3834, 3820, 3822.

Flow chart labels (Figure 2):

Data sources: HPLC and GC-MS data; morpho-agronomic data; sensory panels data; gene expression data.
Metabolite classes: volatiles, antioxidants, vitamins, amino acids, other.

Accession/year 2006 2007 2010 2012
Accession/year 2010 2011 2012
Accession/year 2009 2010 2012; Data 80 95 140
Accession/year 2009 2010 2011; 15 36 28
64 68 39 19
Gene expression data: Accession/year 2010

100 tomato accessions analyzed
Clustermatch (BioDataFusion, Version 1.0)
Clustering variables (≠ data sources/harvest seasons)
Genotype ensemble (k=10)
Computational biology
Genotype selection

Heirloom varieties
552, 553, 558, 559, 561, 564, 567, 568, 569, 715
551, 554, 555, 556, 557, 560, 563, 565, 3809, 3810, 3811
550, 570, 571, 572, 573, 574, 3812, 3815, 3823
575, 3814, 3816, 3820, 3832, 3840, 3841, 3842, 3843, 3844
548, 549, 566, 3806, 3808, 3813, 4735, 4736, 4739, 4171
562, 3805, 4618, 4740, 4741, 4742, 4743, 4748, 4749, 4751

Commercial varieties
Franco, 4623, CheAmRed, 4750, Elpida, Franco*, Biguá, CZB, CheRedRoj

2523, 2535, 2637, 2677, 2724, 2767, 2777, 2790, 4745
3817, 3819, 3821, 3822, 3827, 3828, 3829, 3835, 3838, 3839
3818, 3824, 3825, 3826, 3830, 3831, 3833, 3834, 3836, 3837

Figure 2. Flow chart of data collection, integration and analysis.

1- Instituto de Biotecnología – INTA, Castelar, Buenos Aires, Argentina
2- Laboratorio de Inv. en Señales e Inteligencia Computacional, Facultad de Ingeniería y Ciencias Hídricas, Universidad Nacional del Litoral, Santa Fe, Argentina
3- CIBICI-CONICET, Universidad de Córdoba, Córdoba, Argentina
4- Instituto de Biología Molecular y Celular de Rosario, Rosario, Argentina
5- Estación Experimental Agropecuaria La Consulta – INTA, San Carlos, Mendoza
6- Facultad de Ciencias Agrarias, Universidad Nacional de Cuyo, Mendoza, Argentina
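
The ensemble step named in the Figure 2 flow chart (clustering variables from different data sources and harvest seasons, followed by a genotype ensemble) can be illustrated with a generic consensus-clustering sketch. This is not the Clustermatch implementation, whose internals are not described here. It assumes a common textbook approach: each data source is clustered separately, a co-association matrix records how often two accessions fall in the same cluster, and that matrix is clustered again. The function names, the synthetic data and the choice of Ward and average linkage are all illustrative assumptions, and k=3 is used instead of the poster's k=10 only to keep the toy example small.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def partition(features, k):
    """Cluster one data source into k groups (Ward linkage on z-scored traits)."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    return fcluster(linkage(z, method="ward"), t=k, criterion="maxclust")


def coassociation(partitions):
    """Fraction of partitions in which each pair of accessions shares a cluster."""
    n = len(partitions[0])
    co = np.zeros((n, n))
    for labels in partitions:
        co += labels[:, None] == labels[None, :]
    return co / len(partitions)


def consensus(partitions, k):
    """Average-linkage clustering on the distance 1 - co-association."""
    dist = 1.0 - coassociation(partitions)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    return fcluster(linkage(condensed, method="average"), t=k, criterion="maxclust")


# Synthetic stand-in data (NOT the poster's measurements): 30 hypothetical
# accessions in 3 latent groups, measured by three hypothetical data sources
# with different numbers of traits.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1, 2], 10)
sources = [
    truth[:, None] * 8.0 + rng.normal(size=(30, n_traits))
    for n_traits in (4, 6, 3)
]

partitions = [partition(s, k=3) for s in sources]
ensemble = consensus(partitions, k=3)
print(ensemble)
```

Working on the co-association matrix rather than on the raw measurements means that sources with different units or numbers of traits contribute equally, which is one plausible reason to prefer an ensemble when metabolite, agronomic and sensory data are combined.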

Gene expression analyses showed that nearly 2,600 probesets (FDR
