Journal of Integrative Plant Biology 2011, 53 (3): 193–211

Research Article

Expression Profiling of Cassava Storage Roots Reveals an Active Process of Glycolysis/Gluconeogenesis F



Jun Yang1, Dong An2 and Peng Zhang1,2,3

1 Shanghai Center for Cassava Biotechnology, National Laboratory of Plant Molecular Genetics, Institute of Plant Physiology & Ecology, Shanghai Institutes for Biological Sciences, the Chinese Academy of Sciences, Shanghai 200032, China
2 Key Laboratory of Synthetic Biology, Institute of Plant Physiology & Ecology, Shanghai Institutes for Biological Sciences, the Chinese Academy of Sciences, Shanghai 200032, China
3 Shanghai Chenshan Plant Science Research Center, the Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai 201602, China
∗ Corresponding author. Tel: +86 21 5492 4096; Fax: +86 21 5492 4318; E-mail: [email protected]
F Articles can be viewed online without a subscription.
Available online on 7 December 2010 at www.jipb.net and www.onlinelibrary.wiley.com
doi: 10.1111/j.1744-7909.2010.01018.x

Abstract

Mechanisms related to the development of cassava storage roots and starch accumulation remain largely unknown. To evaluate genome-wide expression patterns during tuberization, a 60 mer oligonucleotide microarray representing 20 840 cassava genes was designed to identify differentially expressed transcripts in fibrous roots, developing storage roots and mature storage roots. Using a random variance model and the traditional twofold change method for statistical analysis, 912 and 3 386 differentially expressed (upregulated or downregulated) genes, respectively, were identified across the three developmental phases. Among 25 significantly changed pathways identified, glycolysis/gluconeogenesis was the most evident one. Rate-limiting enzymes were identified from each individual pathway, for example, enolase, L-lactate dehydrogenase and aldehyde dehydrogenase for glycolysis/gluconeogenesis, and ADP-glucose pyrophosphorylase, starch branching enzyme and glucan phosphorylase for sucrose and starch metabolism. This study revealed dynamic changes in at least 16% of the total transcripts, including transcription factors, oxidoreductases/transferases/hydrolases, hormone-related genes, and effectors of homeostasis. The reliability of these differentially expressed genes was verified by quantitative real-time reverse transcription-polymerase chain reaction. These studies should facilitate our understanding of storage root formation and cassava improvement.

Yang J, An D, Zhang P (2011) Expression profiling of cassava storage roots reveals an active process of glycolysis/gluconeogenesis. J. Integr. Plant Biol. 53(3), 193–211.

© 2011 Institute of Botany, Chinese Academy of Sciences

Introduction

Cassava (Manihot esculenta Crantz) provides the major source of dietary carbohydrates for almost 750 million people throughout the tropics (Cock 1982; Nassar and Ortiz 2010). The majority of the cassava that is produced is used for human foods, livestock feeds and starches in small-scale industries (Lopez et al. 2005). The storage root is a key organ for the direct production of cassava. In many circumstances, its yield reflects the productivity of the entire plant (Alves and Cameira 2002). The physiological significance of the storage roots is belied by their relative structural simplicity compared with other plant organs: roots largely lack some of the major metabolic pathways, such as photosynthesis, and they have a stereotypical morphology that is conserved throughout the stages of cassava root development and throughout the life cycle of individual plants. This combination of physiological relevance and structural simplicity has made storage roots excellent targets for functional genomic analyses (Jiang and Deyholos 2006). It is possible to reveal the dynamic mechanisms of cassava root development using high-throughput expression profiling technologies, such as microarray and large-scale sequencing.

Despite its economic importance, especially in developing countries, this orphan crop has received little attention in the scientific community compared to other major crops, such as rice or maize (Aerni and Bernauer 2006). Numerous tools are currently available for performing functional genomic analyses on cassava (genetic map, bacterial artificial chromosome (BAC) library, expressed sequence tags (EST) library). A transformation system has also been developed (Taylor et al. 2004), and a draft sequence of the cassava genome was recently released (http://www.phytozome.net/cassava). Global gene expression profiling provides another useful tool to improve our ability to study biological processes in cassava.

In plants, cDNA microarrays have been used to study responses to various stresses (Thimm et al. 2001; Seki et al. 2002; Oono et al. 2003; Rabbani et al. 2003), identify genes related to metabolic pathways (Guterman et al. 2002) and analyze gene expression during adventitious root development (Brinker et al. 2004). Oligonucleotide microarrays have been used to investigate lateral root development after nitrate stimulation (Liu et al. 2008). In cassava, differentially expressed genes involved in the incompatible interaction between cassava and Xanthomonas axonopodis pv. manihotis (Lopez et al. 2005), as well as the process of post-harvest physiological deterioration (Reilly et al. 2007), have been studied using cDNA microarrays.

Although both cDNA and oligonucleotide microarrays can be used to analyze gene expression patterns, fundamental differences exist between these methods; for example, oligonucleotide microarrays apparently have high precision, whereas cDNA arrays show poor concordance (Woo et al. 2004). Short and long oligonucleotide arrays have several advantages over cDNA arrays in terms of specificity, sensitivity and reproducibility. Long oligonucleotides can provide increased signal intensity compared to short ones (Relogio et al. 2002; Shippy et al. 2004).

To increase our understanding of storage root development, a transcriptome analysis of gene expression during the process is needed. Recently, Sojikul et al. (2010) reported the characterization of differentially expressed genes in fibrous and storage roots using cDNA-amplified fragment length polymorphism (AFLP). In this study, we investigated gene expression changes in cassava root at three developmental stages, namely fibrous roots, developing storage roots and mature storage roots, using a cassava 60 mer oligonucleotide microarray. Our study provides new insights into the molecular nature of cassava tuberous root development.


Results

Development of a custom cassava microarray

For the RIKEN cDNA library (Sakurai et al. 2007), 35 400 sequences were assembled into 13 063 unique consensus sequences (unigenes), comprising 6 998 tentative contigs from 29 138 sequences and 6 065 singletons. Only 197 sequences were screened out during sequence analysis (phrap with minmatch 17 and minscore 40). Sequence comparison with assemblies and singletons from The Institute for Genomic Research (TIGR) indicated that 7 629 sequences had high similarity with a subset of the 13 063 unigenes (E value < 1e-20), and 7 774 TIGR sequences showed no significant similarity with the unigenes. A total of 20 837 sequences composed of the 13 063 unigenes and the 7 774 TIGR sequences were generated for further analyses.

The largest EST dataset, 71 520 ESTs from the National Center for Biotechnology Information (NCBI) including both TIGR and RIKEN results, was reduced to 11 536 contigs and 8 222 singletons with the same program and parameters mentioned above (556 sequences were screened out). Only 37 sequences showing low similarity (E value ≥ 1e-20) were found after sequence comparison with the 20 837 sequences that were generated by the previous step. This finding indicates that the combination of the TIGR and RIKEN libraries provides nearly equivalent information to the NCBI public database. Therefore, a dataset of 20 874 unigenes was used to design the oligonucleotide probes for the microarray using Agilent (Palo Alto, CA, USA) technology.

Finally, a 4×44K custom oligonucleotide array was designed for the present study containing in situ-synthesized 60 mer oligonucleotides representing 20 840 unique cassava genes; there was one replicate for each probe located at another position on the slide (Agilent Technologies). A total of 20 840×2 in situ-synthesized 60 mer oligonucleotide probes plus 1 264 positive control features and 153 negative control features were designed on each microarray. Four housekeeping genes (Table 1) were used as internal control genes.
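The sequence and feature counts quoted in this section can be cross-checked with simple arithmetic. The sketch below only re-adds numbers stated in the text; the variable names are illustrative and not taken from the original study.

```python
# Cross-check of the tallies quoted in the text (all inputs come from the paragraph above).
riken_contigs, riken_singletons = 6_998, 6_065
riken_unigenes = riken_contigs + riken_singletons   # RIKEN unigenes
riken_kept = 35_400 - 197                           # RIKEN sequences left after screening

tigr_novel = 7_774                                  # TIGR sequences without a RIKEN match
combined = riken_unigenes + tigr_novel              # RIKEN + TIGR set
design_set = combined + 37                          # plus the 37 low-similarity NCBI sequences

probes = 20_840
features = 2 * probes + 1_264 + 153                 # duplicated probes plus control features

print(riken_unigenes, combined, design_set, features)
```

Under these numbers the duplicated probes and controls add up to 43 097 features, which fits within one array of the 4×44K format.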

Table 1. Internal control genes used on cassava microarray

| Probe name | Sequence name∗ | Description |
|---|---|---|
| CUST_18705 | Contig350 | Manihot esculenta actin |
| CUST_7767 | Contig3493 | Manihot esculenta cytochrome P450 protein CYP71E (c15) mRNA, complete cds |
| CUST_10481 | Contig6723 | AT5G12250.1 TUB6 (BETA-6 TUBULIN) |
| CUST_9480 | Contig6971 | AT1G07930.1 EF-1-alpha |

∗ Contig represents riken_Mes_flcDNA.fa.Contig.

Gene Expression Profiling of Developing Cassava Storage Roots

The signal-to-noise ratio (SNR) for each spot was calculated as described by Leiske et al. (2006). The average percentages of acceptable spots (SNR > 2.6) and high-quality spots (SNR > 10) were 94.16 ± 1.23% and 73.6 ± 2.51%, respectively, across all 11 arrays. To evaluate the microarray quality, principal component analysis (PCA) of all 11 cassava arrays was conducted using the GeneSpring GX Software (Agilent Technologies), as shown in Supporting Figure S1. Technical replicates (TR-1 and TR-2) grouped together, indicating that the entire experiment from RNA extraction to data extraction was reliable and reproducible. Obvious separation was detected among the three different types of materials analyzed, fibrous root (FR), mature storage root (MR) and technical repeat (TR). FR and MR showed a distinct boundary, while developing storage root (DR) was expected to be an intermediate phase between FR and MR (Figure 1) from a developmental point of view. Gene expression profiling may be more informative than morphological observation for the identification of different root development stages. Correlation coefficient (Supporting Figure S2) and hierarchical cluster (Supporting Figure S3) analyses also revealed that among the three biological replicates of developing storage root, DR-1 was similar to the FR group, while DR-2 and DR-3 were related to the MR group. Expression patterns of four housekeeping genes further confirmed the reliable quality of the microarray (Supporting Figure S4). When the total 20 840 sequences were compared with 27 217 arabidopsis proteins, 20 450 sequences had hits with 11 995 arabidopsis proteins, and 14 813 sequences had hits with 9 158 arabidopsis proteins at E value ≤ 1e-5. After searching the descriptions of the 912 differentially expressed genes in the NCBI non-redundant protein database, the descriptions of 847 hits were identical to The Arabidopsis Information Resource (TAIR) result (836 similar descriptions) from arabidopsis (Supporting File S3).
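The spot-quality percentages above follow from applying the two SNR thresholds to each array. A minimal sketch is given here, assuming per-spot SNR values have already been computed (the SNR formula itself is not reproduced in this section); the function name and the example values are hypothetical.

```python
def spot_quality(snr_values, acceptable=2.6, high=10.0):
    """Return (% acceptable spots, % high-quality spots) for one array.

    Thresholds default to those quoted in the text (SNR > 2.6 and SNR > 10).
    """
    n = len(snr_values)
    if n == 0:
        raise ValueError("no spots supplied")
    pct_acceptable = 100.0 * sum(s > acceptable for s in snr_values) / n
    pct_high = 100.0 * sum(s > high for s in snr_values) / n
    return pct_acceptable, pct_high


# Hypothetical SNR values for eight spots (not data from the study).
example_snr = [0.8, 3.1, 12.5, 45.0, 2.6, 9.9, 150.0, 11.0]
print(spot_quality(example_snr))
```

Averaging the two percentages over all arrays, with their standard deviations, would give summary figures of the kind reported above.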

Gene expression profiling of developing cassava storage roots

Before normalization, the signal intensities of each feature were filtered against negative controls on the array. It was found that 72.13 ± 2.63%, 72.14 ± 2.48%, 68.91 ± 2.24% and 71.61 ± 2.10% of the genes on the chip were expressed in FR, DR, MR and TR, respectively. A comparative study was conducted by comparing gene expression profiles between each of the three selected stages (FR, DR and MR) using a twofold change cutoff (FCC). There were 777, 1 410 and 48 significantly upregulated genes (Figure 2A) and 623, 1 349 and 409 significantly downregulated genes (Figure 2B) in DR/FR, MR/FR and MR/DR, respectively, at the cut-off SNR > 2.6 and P-value < 0.05. In total, 3 386 differentially expressed genes were defined (Figure 2C, Supporting File S1). When using the random variance model (RVM) method, 912 differentially expressed genes could be identified at a cut-off P-value < 0.05 and false discovery rate (FDR) < 0.1 (Supporting File S2). In total, 742 common genes were detected by the FCC and RVM methods (Figure 2D).

Figure 1. Different stages of tuberous root development. DR, developing storage root in cross-section; FR, fibrous roots; MR, mature storage root in cross-section.
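The two selection criteria can be sketched as follows. This is an illustration only: it assumes that "twofold" means |log2 ratio| ≥ 1 and that the FDR is obtained with the Benjamini-Hochberg step-up adjustment; the random variance model itself is not implemented, and all gene names and values are hypothetical.

```python
def twofold_hits(log2_ratio, p_value, snr, p_cut=0.05, snr_cut=2.6):
    """Genes passing |log2 ratio| >= 1 (twofold), P < p_cut and SNR > snr_cut."""
    return {
        g for g in log2_ratio
        if abs(log2_ratio[g]) >= 1.0 and p_value[g] < p_cut and snr[g] > snr_cut
    }


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data for four genes (not values from the study).
genes = ["g1", "g2", "g3", "g4"]
log2_ratio = {"g1": 1.5, "g2": -2.0, "g3": 0.4, "g4": 3.0}
p_value = {"g1": 0.01, "g2": 0.04, "g3": 0.03, "g4": 0.20}
snr = {"g1": 30.0, "g2": 5.0, "g3": 50.0, "g4": 1.0}

fcc = twofold_hits(log2_ratio, p_value, snr)
q = benjamini_hochberg([p_value[g] for g in genes])
fdr_hits = {g for g, qv in zip(genes, q) if p_value[g] < 0.05 and qv < 0.1}
common = fcc & fdr_hits   # analogous to the overlap shown in Figure 2D
print(sorted(fcc), sorted(fdr_hits), sorted(common))
```

In this toy example g3 passes the P-value and FDR filters but not the fold-change filter, which mirrors how the two methods can select overlapping but different gene sets.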

Figure 2. The number of differentially expressed genes during cassava storage root development. Upregulated (A) and downregulated (B) genes in paired comparison. DR/FR, developing storage roots/fibrous roots; MR/FR, mature storage roots/fibrous roots; MR/DR, mature storage roots/developing storage roots. (C) Non-redundant up- and downregulated genes. (D) Comparison of two methods. FCC, twofold change cutoff; RVM, random variance model. Red and green numbers indicate the up- and downregulated genes, respectively, validated by real time reverse transcription-polymerase chain reaction (RT-PCR) in each subset.

Clustering of and pathways represented by differentially expressed genes

Among the 912 differentially expressed genes associated with Gene IDs in the Kyoto Encyclopedia of Genes and Genomes (KEGG), Biocarta and Reactome, 54 related pathway maps were selected for further analysis. Among these pathways, 25 pathways were considered significant at a cut-off P-value < 0.05 and FDR < 0.05 (Supporting File S4). The enrichment of each pathway was calculated according to the given equation and is shown in Figure 3. The FDR of most pathways was not higher than 0.01, with only one exception (the FDR of the fatty acid metabolism pathway was 0.0125). These pathways are suggested to be important during the initiation and formation of the storage root in cassava, and some of them belong to glucide metabolism, which includes N-glycan degradation, the pentose phosphate pathway, fructose and mannose metabolism, glycan structure degradation, glycolysis/gluconeogenesis, and starch and sucrose metabolism. A global pathway net was constructed to illustrate the key pathways in the process of root development (Supporting File S5). Twenty out of 25 significant pathways were included in the pathway net (Figure 4), and the five pathways that were omitted are arachidonic acid metabolism, glycan structure degradation, limonene and pinene degradation, N-glycan degradation, and the phosphatidylinositol signaling system. Glycolysis/gluconeogenesis was considered to be the most important node in the net because the component exchanges with other pathways were strongly dependent on its existence.

The top hitting locus identifiers of the 11 995 non-redundant arabidopsis genes, together with the corresponding MR/FR ratios (Supporting File S6), were mapped to KEGG arabidopsis pathways using the KegArray tool (Version 1.2.1) for a detailed view of the pathways of interest (Figure 5). A subset of 2 086 identifiers corresponding to the differentially expressed cassava genes (SNR > 2.6 and P-value < 0.05) was also mapped. There were 4 158 and 691 hits, respectively, corresponding to all identified genes and differentially expressed genes. Changes in enrichment were calculated according to the equation $R_e = (n_f/n)/(N_f/N)$, where $N_f$ and $N$ were set as 691 and 4 158, respectively (Supporting File S7). Among the 25 significant pathways mentioned above, 17 were confirmed by the 2 086 identifiers and 14 pathways exceeded the average enrichment, which is consistent with the results of the previous pathway analysis.

Short time-series clustering revealed that six significant expression patterns were involved in root development (Figure 6). There were 144, 195, 92, 38, 39 and 36 genes (Supporting File S8) clustered in profiles 0 (0, –2, –3), 4 (0, –1, –1), 11 (0, 1, 1), 3 (0, –1, –2), 15 (0, 2, 3) and 2 (0, –1, –3), respectively. Two pairs of opposite profiles were identified: profiles 0 and 15, and profiles 4 and 11. The latter pair of profiles (Figure 6B) contained more genes and was selected for further gene ontology analysis. Functional category enrichment evaluation based on the gene ontology (GO) was performed on the upregulated or downregulated genes assigned to profile 11 and 4, respectively. Significant GO terms were found at a cut-off P-value < 0.05,

and the enrichment of each GO term was calculated (Supporting File S9). Significantly enriched GO terms in profile 11 and profile 4 are illustrated in Figure 7. Some reliable GO terms are presented for profile 11 (amylopectin biosynthetic process, basipetal auxin transport, response to wounding, response to cytokinin stimulus and the starch metabolic process) and profile 4 (carbohydrate biosynthetic process, response to auxin stimulus, carbohydrate transport, carbohydrate metabolic process and plant-type cell wall loosening). There were three common GO terms considered as basic functions in profiles 11 and 4 (oligopeptide transport, regulation of transcription and its subset, DNA-dependent regulation of transcription). Some interesting relationships were found between profile 11 and 4. For example, basipetal auxin transport was present in the former, and response to auxin stimulus appeared in the latter. Among the three components described by GO annotation (cellular component, molecular function and biological process), the biological processes might be the more relevant aspect of GO with respect to root development. Therefore, only functional clusters belonging to this component are presented in the selected profiles.

To verify the results of the GO analysis, 11 995 and 2 086 non-redundant arabidopsis gene locus identifiers were annotated with GO terms in TAIR (http://www.arabidopsis.org/tools/bulk/go/index.jsp), in which 3 448 and 1 507 non-redundant GO IDs were hit, respectively. Changes in enrichment were calculated according to the equation $R_e = (n_f/n)/(N_f/N)$, where $N_f$ and $N$ were set as 2 086 and 11 995, respectively (Supporting File S10). Among the 64 significant GO terms in profiles 11 and 4 mentioned above, 62 were found in 3 448 GO IDs and 56 were found in 1 057 GO IDs. Among the 56 GO terms, 42 exceeded the average enrichment. When arabidopsis gene locus identifiers were mapped with KEGG, a BRITE (Biomolecular Relations in Information Transmission and Expression) (functional hierarchies and ontologies) view of all hits was constructed. Similarly, a functional categorization was presented after annotation in TAIR (Supporting File S11).

Figure 3. Significantly enriched pathways of the differentially expressed genes. [Bar chart; x-axis: enrichment of significant pathway, 0 to 35. Pathways in plotted order: N-Glycan degradation; Taurine and hypotaurine metabolism; Pentose phosphate pathway; Cyanoamino acid metabolism; Fructose and mannose metabolism; Glycan structures - degradation; Carbon fixation; Gamma-Hexachlorocyclohexane degradation; Riboflavin metabolism; Glycolysis / Gluconeogenesis; Limonene and pinene degradation; Naphthalene and anthracene degradation; Arachidonic acid metabolism; Selenoamino acid metabolism; Phenylpropanoid biosynthesis; Starch and sucrose metabolism; Tryptophan metabolism; Zeatin biosynthesis; Benzoate degradation via CoA ligation; Biosynthesis of steroids; Sulfur metabolism; Terpenoid biosynthesis; Inositol phosphate metabolism; Phosphatidylinositol signaling system; Fatty acid metabolism.]

The expression of many transcription factors was significantly changed during storage root development. A total of 138 genes (182 distinct gene models) were considered to be transcription factors when the 2 086 identifiers were annotated in TAIR. The number of upregulated transcription factors was nearly equal to the number of downregulated ones. The arabidopsis genome has at least 1 922 predicted transcription factors in the Database of Arabidopsis Transcription Factors (DATF, http://datf.cbi.pku.edu.cn/index.php) and more than 2 192 in the Plant Transcription Factor Database (PlnTFDB, http://plntfdb.bio.uni-potsdam.de/v3.0/index.php?sp_id=ATH). These transcription factors have been classified into 65 or 83 families in DATF and PlnTFDB, respectively. The enrichment of each family was calculated by classification of non-redundant identifiers according to the two databases (Supporting File S12). Several enriched families were identified, such as ARF, C2C2-Dof, CCAAT, CPP, E2F-DP, G2-like and GRF.
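The enrichment ratio used for pathways, GO terms and transcription factor families can be sketched as a small function. The interpretation of n_f and n as the numbers of differentially expressed and of all mapped genes within one category is an assumption, since the text only states the values of N_f and N; the example category counts are hypothetical.

```python
def enrichment(n_f, n, N_f, N):
    """Re = (n_f / n) / (N_f / N).

    Assumed meaning: n_f of the n genes in a category are differentially
    expressed, against N_f differentially expressed genes among N in total.
    """
    if n <= 0 or N_f <= 0 or N <= 0:
        raise ValueError("n, N_f and N must be positive")
    return (n_f / n) / (N_f / N)


# Hypothetical pathway: 20 of 40 mapped genes are differentially expressed,
# evaluated against the background quoted in the text (691 of 4 158 hits).
re_pathway = enrichment(20, 40, 691, 4158)
print(round(re_pathway, 2))
```

A category whose share of differentially expressed genes equals the background share has Re = 1, so values above 1 indicate over-representation.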


Figure 4. Significant and interacting pathways during storage root development in cassava. The node size was correlated with the number of interacting pathways (Supporting File S5); Red nodes represent 20 significant pathways and blue nodes represent non-significant ones; five isolated significant pathways were absent.

Verification of gene expression patterns by quantitative real-time PCR

To validate the microarray data and evaluate the methods used to select differentially expressed genes, real time RT-PCR was performed using the RNA extracted from the three sample replicates at different developmental stages that were used in the microarray experiment. A total of 55 genes were selected for verification, including 42 differentially expressed genes identified by the FCC and RVM methods, as indicated in Figure 2D, and 13 genes involved in starch and sucrose metabolism.


Figure 5. Regulatory changes in the pathway of glycolysis/gluconeogenesis (A) and starch and sucrose metabolism (B). Colored rectangles correspond with the cassava genes detected on microarray. Green, downregulated; yellow, nonregulated; red, upregulated.


Figure 6. Cluster analyses of differentially expressed genes. (A) Profiles in color are the significant ones. Green, upregulated; red, downregulated; the profile number (upper left), trend (line) and P-value (bottom left) are indicated in each profile. (B) The profiles of 92 upregulated (profile 11) and 195 downregulated (profile 4) genes.
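How a gene is assigned to one of the model profiles can be illustrated with a short sketch. It assumes assignment by the highest Pearson correlation between a gene's log2 time series (FR, DR, MR, with FR set to 0) and each model profile, which is a common approach in short time-series clustering; the source does not specify the software or distance measure, and the example series are hypothetical.

```python
import math

# The six significant model profiles quoted in the text (FR, DR, MR).
PROFILES = {
    0: (0, -2, -3),
    2: (0, -1, -3),
    3: (0, -1, -2),
    4: (0, -1, -1),
    11: (0, 1, 1),
    15: (0, 2, 3),
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("constant series has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def assign_profile(series):
    """Return the profile number whose shape correlates best with the series."""
    return max(PROFILES, key=lambda k: pearson(series, PROFILES[k]))


print(assign_profile((0, 1.1, 1.0)))     # rises early, then plateaus
print(assign_profile((0, -2.1, -2.9)))   # declines steadily
```

Correlation compares the shape of the trend rather than its magnitude, so a gene that rises and then plateaus is matched to profile 11 even if its fold change is not exactly 1.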

Cassava actin was used as the normalization standard. The fold changes (log2 ratio) in DR and MR compared with FR are presented in Table 2. In total, 94.55% of the tested genes were consistent with the microarray analysis.
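The log2 ratios from real time RT-PCR can be illustrated with the widely used comparative Ct calculation. This is an assumption: the section states only that cassava actin was the normalization standard and that values are log2 ratios relative to FR, not the exact quantification formula. The sketch also assumes 100% amplification efficiency, and the Ct values are hypothetical.

```python
def log2_ratio(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """log2 fold change of a target gene in a sample versus a control tissue.

    Each Ct is first normalized to the reference gene (here, actin);
    the log2 ratio is then the negative delta-delta-Ct.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return -(delta_sample - delta_control)


# Hypothetical Ct values: target and actin in DR, then target and actin in FR.
ratio = log2_ratio(22.0, 18.0, 25.0, 18.5)
print(ratio, 2 ** ratio)   # log2 ratio and the corresponding fold change
```

A positive value means the gene is more abundant in DR or MR than in FR, matching the sign convention of Table 2.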

Discussion

Development of cassava storage root has drawn increasing attention from the cassava research community due to the use of cassava as a staple food crop in the tropics and also as feedstock for bio-ethanol production in many countries (Nguyen et al. 2007a, 2007b; Jansson et al. 2009). Recently, a study related to storage root formation was conducted using AFLP-based transcript profiling (Sojikul et al. 2010). In our study, the custom-designed 4×44K long oligonucleotide (60 mer) microarray was developed and used to investigate the genome-wide gene expression profile related to storage root development in cassava cultivar TMS60444. Differentially expressed

Figure 7. Significantly enriched gene ontology (GO) terms in profile 11 and profile 4. (A) Enrichment of 27 significant GO terms in profile 11. (B) Enrichment of 40 significant GO terms in profile 4.

[Bar charts; x-axis: enrichment of significant GO, 0 to 60 in (A) and 0 to 20 in (B).
GO terms in (A), profile 11, in plotted order: Potassium ion import; Blue light signaling pathway; Aging; Amylopectin biosynthetic process; Basipetal auxin transport; Dipeptide transport; Response to abiotic stimulus; Meristem maintenance; Cotyledon vascular tissue pattern formation; Inflorescence development; Phyllome development; Stomatal movement; Copper ion transport; Starch metabolic process; Response to light stimulus; Leaf vascular tissue pattern formation; Metal ion transport; Oligopeptide transport; Abscisic acid mediated signaling pathway; Response to cytokinin stimulus; Flower development; Response to heat; Response to wounding; Response to abscisic acid stimulus; Response to cadmium ion; Regulation of transcription, DNA-dependent; Regulation of transcription.
GO terms in (B), profile 4, in plotted order: Tripeptide transport; DNA catabolic process; Alkaloid biosynthetic process; Very-long-chain fatty acid metabolic process; Histidine catabolic process; Carbohydrate biosynthetic process; Response to other organism; Brassinosteroid metabolic process; Seed coat development; Regulation of hydrogen peroxide metabolic process; RNA modification; Oligopeptide transport; Flavonol biosynthetic process; L-serine metabolic process; Phenylpropanoid metabolic process; Response to fungus; Cuticle development; Glycine metabolic process; Response to biotic stimulus; Response to salicylic acid stimulus; Plant-type cell wall loosening; Carbohydrate transport; Lipid transport; Fatty acid biosynthetic process; Response to auxin stimulus; Amino acid transport; Response to jasmonic acid stimulus; Response to nematode; Response to ethylene stimulus; Defense response; Regulation of transcription; Regulation of transcription, DNA-dependent; Carbohydrate metabolic process; Response to water deprivation; Photosynthesis; Response to oxidative stress; Transport; Response to chitin; Transmembrane receptor protein tyrosine kinase; Protein folding.]

Table 2. Validation of microarray-based gene expression by real time reverse transcription-polymerase chain reaction (RT-PCR) (55 selected genes in developing storage root (DR) and mature storage root (MR) in comparison with fibrous root (FR))

| Sequence name∗ | Tair locus | Expect | DR-FR Array | DR-FR qRT-PCR | MR-FR Array | MR-FR qRT-PCR |
|---|---|---|---|---|---|---|
| TA5570_3983 | AT2G36390 | 0 | 3.21 | 3.09 | 3.44 | 2.86 |
| Contig6973 | AT2G36530 | 0 | −0.45 | 0.10 | −0.53 | −0.63 |
| TA9083_3983 | AT3G29320 | 5.00E-142 | 4.11 | 3.65 | 4.72 | 2.77 |
| Contig3548 | AT3G25230 | 4.00E-116 | 3.61 | 2.77 | 5.63 | −0.43 |
| Contig6596 | AT3G48000 | 2.00E-107 | −0.02 | −0.62 | −0.42 | −1.15 |
| Contig6514 | AT2G38380 | 2.60E-95 | −4.42 | −5.50 | −8.33 | −6.90 |
| DV457304 | AT5G22810 | 1.60E-89 | −8.03 | −5.39 | −8.10 | −8.83 |
| Contig6989 | AT4G38970 | 3.30E-83 | 3.42 | 3.59 | 7.16 | 3.22 |
| CAS01_007_P13.f | AT4G17260 | 1.00E-79 | −1.00 | −0.72 | −1.86 | −1.24 |
| DV449591 | AT2G26560 | 1.50E-78 | −4.95 | −1.86 | −4.48 | −1.85 |
| CK647376 | AT1G61800 | 1.40E-76 | 2.05 | 1.05 | 4.24 | 0.41 |
| Contig5864 | AT2G47180 | 5.20E-76 | 2.07 | 4.57 | 3.41 | 3.84 |
| Contig5970 | AT5G08290 | 1.40E-73 | 4.05 | 3.17 | 5.46 | 3.83 |
| DV444052 | AT5G39150 | 4.10E-73 | −7.65 | −10.21 | −9.32 | −13.71 |
| Contig6535 | AT1G23800 | 3.40E-71 | −0.82 | −0.60 | −1.80 | −1.83 |
| Contig1114 | AT1G78380 | 1.90E-70 | −7.36 | −7.59 | −8.56 | −8.72 |
| Contig5561 | AT5G48300 | 5.40E-68 | 2.65 | 2.59 | 3.28 | 1.94 |
| TA7864_3983 | AT1G17500 | 1.90E-64 | −7.41 | −7.23 | −7.50 | −9.04 |
| CK643761 | AT1G11545 | 3.20E-64 | 1.86 | 3.55 | 3.91 | 3.72 |
| Contig4784 | AT1G54100 | 2.10E-63 | −0.22 | −0.44 | −0.39 | −1.70 |
| CK647353 | AT3G29320 | 5.70E-63 | 4.55 | 4.58 | 5.60 | 2.85 |
| DV453892 | AT1G74030 | 8.10E-63 | −3.10 | −1.90 | −2.64 | −2.54 |
| TA10158_3983 | AT3G26330 | 8.90E-59 | 2.58 | 1.68 | 3.49 | 2.01 |
| DV441664 | AT3G05950 | 4.70E-58 | −7.96 | −14.21 | −9.10 | −14.04 |
| DV452155 | AT5G37820 | 1.40E-57 | −6.86 | −7.79 | −8.34 | −8.93 |
| DV447432 | AT5G23960 | 7.30E-57 | −6.03 | −3.90 | −8.37 | −8.25 |
| TA5610_3983 | AT4G10250 | 8.40E-55 | 3.44 | 4.78 | 4.40 | 4.19 |
| TA9482_3983 | AT5G01300 | 1.10E-51 | 4.63 | 4.04 | 5.40 | 5.04 |
| TA8522_3983 | AT5G03650 | 1.40E-49 | 2.90 | 2.27 | 2.97 | 2.06 |
| DV457213 | AT5G24910 | 4.60E-47 | −7.44 | −6.39 | −8.06 | −8.26 |
| Contig6028 | AT1G56300 | 1.50E-38 | 1.06 | 2.78 | 3.59 | −1.10 |
| DV442707 | AT4G25410 | 2.40E-37 | −7.31 | −9.08 | −7.71 | −12.22 |
| Contig4441 | AT2G45220 | 3.60E-36 | −0.09 | 1.32 | 0.17 | 1.30 |
| Contig2017 | AT4G39210 | 3.20E-32 | 4.49 | 3.20 | 5.01 | 1.91 |
| DV457474 | AT4G38620 | 1.90E-31 | −3.89 | −5.41 | −4.17 | −7.87 |
| CAS01_031_N01.f | AT5G48810 | 8.30E-29 | −3.36 | −3.18 | −4.09 | −4.98 |
| DV441796 | AT1G13710 | 9.20E-28 | −2.62 | −4.30 | −3.42 | −6.27 |
| TA9112_3983 | AT4G25040 | 2.70E-25 | −7.93 | −6.45 | −8.68 | −10.18 |
| TA6163_3983 | AT4G23180 | 1.40E-24 | 6.36 | 7.90 | 9.90 | 9.48 |
| Contig5004 | AT1G17870 | 8.70E-24 | 1.58 | 1.19 | 3.98 | −0.13 |
| CK643753 | AT2G16980 | 4.10E-22 | −2.43 | −2.75 | −3.57 | −5.36 |
| TA8676_3983 | AT2G40200 | 5.80E-22 | 2.22 | 3.91 | 4.31 | 4.16 |
| DV457216 | AT4G13420 | 1.90E-18 | −8.39 | −13.89 | −9.79 | −11.87 |
| CK645724 | AT5G24580 | 6.70E-16 | 1.18 | 2.81 | 4.32 | 3.52 |
| BM259732 | AT1G53830 | 3.40E-15 | −1.72 | −2.40 | −2.39 | −5.77 |
| DV446429 | AT1G49320 | 4.90E-14 | −5.48 | −6.07 | −9.47 | −8.26 |
| CAS01_015_G22.f | AT4G27900 | 1.80E-11 | 2.77 | 2.89 | 3.89 | 2.34 |
| TA8688_3983 | AT2G18680 | 3.00E-08 | −5.49 | −4.59 | −7.94 | −7.66 |
| TA9886_3983 | AT4G24000 | 7.40E-07 | −5.48 | −5.83 | −7.74 | −10.63 |
| DV454822 | AT1G66410 | 4.50E-06 | −4.80 | −2.91 | −8.90 | −4.21 |
| DV458137 | AT1G24560 | 3.10E-05 | −3.55 | −2.26 | −3.85 | −2.55 |
| Contig6893 | AT2G01021 | 0.113 | 6.12 | 1.10 | 5.95 | −0.18 |
| DV446763 | AT3G16800 | 0.361 | −3.55 | −14.45 | −2.79 | −14.82 |
| DR086621 | AT4G13070 | 0.365 | −4.96 | −7.77 | −5.77 | −9.25 |
| DV449336 | AT5G40780 | 0.808 | 3.43 | −0.59 | 2.80 | −2.20 |

Note: All data are shown as log2 ratios; positive and negative values indicate up- and downregulation, respectively, in DR or MR compared with FR. ∗ Contig represents riken_Mes_flcDNA.fa.Contig. Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) results shown in bold font (15 values) are those that were dramatically higher or lower (log2 ratio difference > 4) compared with array results, and bold italic font indicates inconsistent results (six values) between microarray and qRT-PCR.
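The "log2 ratio difference > 4" criterion in the note to Table 2 can be applied programmatically. The sketch below uses four rows copied from Table 2 and flags every comparison in which the array and qRT-PCR values differ by more than 4 log2 units; the function and variable names are illustrative, and the criterion used to call a result "inconsistent" is not reproduced because the text does not define it.

```python
# (sequence name, DR-FR array, DR-FR qRT-PCR, MR-FR array, MR-FR qRT-PCR)
ROWS = [
    ("TA5570_3983", 3.21, 3.09, 3.44, 2.86),
    ("Contig3548", 3.61, 2.77, 5.63, -0.43),
    ("DV441664", -7.96, -14.21, -9.10, -14.04),
    ("Contig6893", 6.12, 1.10, 5.95, -0.18),
]


def large_differences(rows, threshold=4.0):
    """Return (name, comparison) pairs where |array - qRT-PCR| exceeds threshold."""
    flagged = []
    for name, dr_array, dr_qpcr, mr_array, mr_qpcr in rows:
        if abs(dr_array - dr_qpcr) > threshold:
            flagged.append((name, "DR-FR"))
        if abs(mr_array - mr_qpcr) > threshold:
            flagged.append((name, "MR-FR"))
    return flagged


flagged = large_differences(ROWS)
for name, comparison in flagged:
    print(name, comparison)
```

Run over all 55 rows of the table, the same threshold yields the 15 values mentioned in the note.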

genes at different developmental stages were identified, and their potentially relevant functions were studied using pathway and GO analyses, in which 25 pathways were identified as significantly regulated. Important genes related to each individual pathway were identified. The results of real-time RT-PCR also confirmed the validity of the microarray, as well as the storage root-specific expression patterns. Therefore, our study sheds new light on storage root development in cassava by using the long oligonucleotide (60-mer) cassava microarray.

High-quality microarrays are a prerequisite for reliable analysis of different biological processes. Different types of microarray, including cDNA (long strands of amplified cDNA sequences), short oligonucleotide (25–30 nt), and long oligonucleotide (50–80 nt), used in transcriptome studies may have different feature performances (Petersen et al. 2005; de Reynies et al. 2006; Tsai et al. 2006; Fan 2009; McHale et al. 2009). The custom long oligonucleotide array described here was generated by the Agilent SurePrint ink-jet technology, which provides a flexible platform for revising and updating oligonucleotide probes in the array without additional cost (Hughes et al. 2001; Wolber et al. 2006; Li et al. 2008). Most probes used on the current array are represented in the most recent draft of the cassava genome (http://www.phytozome.net/cassava). In addition, the 4×44K platform used for the array design contains four independent arrays in one slide; this arrangement is cost-effective and can reduce the variation among the arrays within a slide. High background levels in an array might obscure the signal from low-expressed genes and impede accurate quantification. The average SNR of the current microarray was 602.30, which was much higher than that of most cDNA array platforms (35.1 to 38.3). A high SNR promotes sufficient signal generation for the detection of even low-copy genes.

Two statistical criteria were applied in the current analysis. Several thousand differentially expressed genes were identified in a pair-comparison at P < 0.05. Because more than 20 000 genes were analyzed in this microarray experiment, it is important to control the proportion of false positives (Tsai et al. 2003). The FDR based on P-values is the expected proportion of true null hypotheses that will be rejected in relation to the total number of null hypotheses that will be rejected (Benjamini and Hochberg 1995). The FDR is a more convenient and natural scale than the P-value scale, and it can provide the probability that a gene value is a false positive (Pawitan et al. 2005). In this study, the false discovery rates of differentially expressed genes selected by the RVM method were controlled at less than 10%, which supports the reliability of the current microarray results. The FDR was also calculated in the pathway analyses, and it was used in the GO analyses for correcting P-values (Dupuy et al. 2007). Quantitative real-time PCR has become the gold standard for measuring gene expression, and it is generally used to validate microarray results (Dallas et al. 2005). With a criterion of P < 0.05 in the microarray analysis, the false positives could be effectively controlled (94.55% of qRT-PCR results were consistent with microarray data). The results indicate that microarray analyses in the present study are statistically reliable and accurate.

Gene expression profiling could provide valuable information related to the biological processes during starchy storage root development, and these processes are expected to be conserved in storage root-bearing species, e.g., sweet potato and yams. Starch accumulation is obviously the main theme in the process, but how this is achieved is still unclear. In the present study, there were 25 significantly enriched pathways, and many of them are related to glucide metabolism, which is strongly correlated with starch accumulation and storage root bulking. In parallel, zeatin biosynthesis was highlighted because it is not only related to the promotion of root cell propagation, elongation and enlargement, resulting in sufficient room for starch granule accumulation and tuberization (Melis and Vanstaden 1985; Gibson 2004), but it is also needed for amyloplast formation and starch accumulation (Miyazawa et al. 2002; Bishopp et al. 2009). Lipid biosynthesis is required to support cell propagation and jasmonic acid biosynthesis. The regulation of tuberization in potato and yam by jasmonates has been reported (Koda and Kikuta 1991; Koda et al. 1991; Vandenberg and Ewing 1991; Ovono et al. 2010). For signal transduction, the inositol phosphate-calcium signaling system may play a comprehensive role in the tuberization process (Cenzano et al. 2008). Furthermore, several secondary metabolic processes, such as flavonol biosynthesis, were also found to be important for plant growth and development (Taylor et al. 2004; Besseau et al. 2007).

Based on our study, a molecular model of storage root development was constructed (Figure 8). In the two opposite development-associated expression patterns (profile 11 and profile 4), three common GO terms were identified (oligopeptide transport, regulation of transcription and DNA-dependent regulation of transcription). The array data resulted in fewer differentially expressed genes (Figure 2A, B) between DR and MR, which suggests that profile 11 and profile 4 are root development-associated. The results of the GO analysis also support this hypothesis. For example, basipetal auxin transport appeared in profile 11 (upregulated pattern), whereas response to auxin stimulus was found in profile 4 (downregulated pattern). Interestingly, metal ion transport and copper ion transport were also highlighted in profile 11, which is highly consistent with the physiological requirement of cassava for copper (Chew et al. 1978). Furthermore, defense response, response to abiotic stimulus, and response to wounding were also noticeable, possibly due to stress responses that occurred during sample collection.

When pathways of interest (glycolysis/gluconeogenesis, starch and sucrose metabolism) were examined (Figure 5), several genes that were either upregulated or downregulated based on the array were subsequently validated by qRT-PCR (Table 2). These genes included six genes involved in glycolysis/gluconeogenesis (Contig4784, AtALDH7B4; Contig6596, AtALDH2B4; Contig6973, AtLOS2; Contig6535, AtALDH2B7; CAS01_007_P13.f, Arabidopsis L-lactate dehydrogenase; DV453892, Arabidopsis Enolase) and seven genes involved in starch and sucrose metabolism (Contig2017, AtAPL3; TA9083_3983, Arabidopsis Glucan phosphorylase; TA5570_3983, AtSBE2.1; Contig5561, AtADG1; TA8522_3983, AtSBE2.2; Contig4441, Arabidopsis Pectinesterase; BM259732, AtATPME2). When we mapped the qRT-PCR ratios (DR/FR and MR/FR) of these 13 genes to KEGG pathways, upregulation of ADP-glucose pyrophosphorylase, glucan phosphorylase and starch branching enzyme was observed, indicating that these are the key enzymes required for starch accumulation in the storage root. ADP-glucose pyrophosphorylase (AGPase), which catalyzes the rate-limiting step in starch biosynthesis in plants, is strongly associated with yield in both grains and root crops (reviewed by Smith 2008 and Zeeman et al. 2010). The downregulation of pectinesterase was also considered to be important because it blocks the carbon flowing to pectate. In glycolysis/gluconeogenesis, downregulation of enolase, L-lactate dehydrogenase and aldehyde dehydrogenase (NAD+) could slow down the entry of carbon into the citrate cycle, pyruvate metabolism and propanoate metabolism, leading to less α-D-glucose-6P being converted to glycerate-3P and α-D-glucose-1P. This would result in most of the α-D-glucose-6P being transported into the amyloplast for starch and sucrose metabolism. This result may also suggest that these enzymes are rate-limiting in the two important pathways in starchy root formation (Figure 8). These findings may facilitate the engineering of cassava for enhanced starch accumulation and a deeper understanding of the biological process of interest, starchy root formation (Figure 8).

Figure 8. Schematic illustration representing microarray analysis of starchy storage root development related to major biological events in cassava. Downregulated reactions are marked by dashed arrows; upregulated ones are indicated by bold arrows. Patterns of differentially expressed genes (green for downregulated and red for upregulated) in carbon flux are shown in boxes. The regulation and developmental processes of tuberization are indicated under the structural scenario. AGPase, ADP-glucose pyrophosphorylase; GP, glucan phosphorylase; SBE, starch branching enzyme.

By comparison with the report by Sojikul et al. (2010), in which only 157 transcript-derived fragments were identified between leaves and roots using cDNA-AFLP, the present study was able to distinguish a large number of differentially expressed genes among the three types of roots, giving more interesting findings related to storage root tuberization. Several enriched transcription factor families, such as ARF, C2C2-Dof, CCAAT, CPP, E2F-DP, G2-like and GRF, were identified in the present study. C2C2-Dof (Yanagisawa 2002, 2004), G2-like (Bravo-Garcia et al. 2009) and GRF (Kim et al. 2003) are of particular interest. Although their roles related to starchy storage root development have not been determined, it is possible to narrow down the gene candidates to study the comprehensive biological process by developing a hypothetical model. Recently, a transcription factor called RSR1, belonging to the AP2/EREBP family of transcription factors, was found to regulate starch biosynthesis in rice endosperm (Fu and Xue 2010), and another transcription factor, MADS1, was involved in tuberous root initiation in sweet potato (Ku et al. 2008).

In conclusion, gene expression during the process of cassava starchy root formation was characterized using the newly developed cassava microarray, and putative rate-limiting enzymes in key pathways have been highlighted, which provides potential targets for cassava genetic engineering. The platform will provide a valuable resource for the scientific community to study the developmental biology, stress response, virus resistance, genetics and genomics in cassava.

Materials and Methods

Plant materials

The cultivar TMS60444, which is frequently used for genetic engineering, was used in the present study. The in vitro plants were transferred to pots for one month of growth in a greenhouse and then planted in a field. Fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR, Supplemental Figure S1) were collected from three independent healthy 4-month-old cassava plants in the field and submerged in liquid nitrogen immediately. All samples were maintained in liquid nitrogen during transportation, and they were stored in an ultra-freezer (−80 °C) until RNA extraction.

Generation of a cassava oligonucleotide microarray

The microarray design was based on the sequence information from a large collection of cassava ESTs from NCBI (71 520 ESTs of cassava, released 28 March 2008) and TIGR (Manihot_esculenta_release_5, released 1 June 2007; 5 189 assemblies, 10 214 singletons) as well as a 35 400 full-length cDNA RIKEN library (Sakurai et al. 2007). Phrap (http://www.phrap.org/phredphrapconsed.html) and BLAT, the BLAST-like alignment tool (Kent 2002), were employed in sequence analyses. The 60-mer oligonucleotide probes and the microarray were designed using Agilent technology.

Hybridization and data extraction

Total RNA was extracted from root samples, including FR, DR and MR, using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA). Two RNA samples extracted from stored storage root slices were used as technical repeats (TR) for quality control. RNA quality was checked on a 1% agarose gel using an RNase-free electrophoresis system. RNA labeling and hybridization were conducted by the Shanghai Biochip Corporation (Shanghai, China) following the manufacturer’s protocols. Arrays were incubated at 65 °C for 17 h in Agilent hybridization chambers (G2545A) and then washed according to the protocol at room temperature. Hybridized microarray slides were scanned at 5 µm resolution with an Agilent Technologies Scanner (G2505B), and images were saved in JPG format. Both 10% and 100% photomultiplier tube (PMT) settings were selected, and combined images were exported. The signal intensities of all spots on each image were quantified by Agilent Feature Extraction software, and data were saved as .txt files for further analysis.

Normalization and differential gene definition

Two types of statistical analyses were used. First, the signal intensity of each gene was globally normalized with the GeneSpring GX Software (Agilent Technologies) following the workflow guide (Bolstad et al. 2003). The signal-to-noise ratio (SNR) was calculated as the median signal minus the median background signal, divided by the background standard deviation (Leiske et al. 2006). Pair comparison was used to analyze the normalized and averaged data from the three types of samples (FR, DR and MR). P-values from t-tests and fold changes between each comparison for each gene were calculated. Genes induced or suppressed by more than twofold (twofold change cutoff, FCC) were taken as differentially expressed when SNR > 2.6 and P-value < 0.05. Second, raw data were normalized using LOWESS within the R statistics package. The RVM (random variance model) corrective ANOVA was used to analyze the normalized data from the three different samples (FR, DR and MR). The P-values and FDR were calculated by the R program according to Benjamini and Hochberg’s method (Benjamini and Hochberg 1995). Genes were taken as differentially expressed when both P-value < 0.05 and FDR < 0.1 (Wright and Simon 2003).
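The two selection criteria described in this section can be illustrated with a short Python sketch. It is not the GeneSpring or R code used in the study: the function names, the use of the sample standard deviation for the background, the strict inequalities, and the example numbers are all assumptions made for illustration. The RVM ANOVA itself is not reimplemented; its P-values are taken as given.

```python
import math
import statistics


def snr(signal_pixels, background_pixels):
    """(median signal - median background) / background standard deviation.

    Assumption: the sample standard deviation is used for the background.
    """
    noise = statistics.stdev(background_pixels)
    return (statistics.median(signal_pixels)
            - statistics.median(background_pixels)) / noise


def passes_fcc(log2_ratio, p_value, spot_snr,
               fold=2.0, p_cut=0.05, snr_cut=2.6):
    """First criterion: more than twofold change, SNR > 2.6 and P < 0.05."""
    return (abs(log2_ratio) > math.log2(fold)
            and spot_snr > snr_cut
            and p_value < p_cut)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_rvm(p_value, fdr, p_cut=0.05, fdr_cut=0.1):
    """Second criterion: P < 0.05 and FDR < 0.1."""
    return p_value < p_cut and fdr < fdr_cut


# Hypothetical P-values for five genes, not data from the study.
p = [0.001, 0.008, 0.039, 0.041, 0.20]
q = benjamini_hochberg(p)
print([passes_rvm(pi, qi) for pi, qi in zip(p, q)])
```

In this toy example the first four genes pass the second criterion, while the fifth fails on both its P-value and its adjusted value.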

Bioinformatics

All non-redundant sequences that were considered to be unique cassava genes were locally blasted against the TAIR protein database (27 217 Arabidopsis protein sequences), which was downloaded from ftp://ftp.arabidopsis.org/home/TAIR, using the blastx program in the blastall package (version 2.2.9). The top hits were used for gene annotation, and the corresponding Arabidopsis gene locus identifiers were mapped to the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the KegArray tool (Version 1.2.1). Differential genes identified using the random variance model (RVM) method were locally blasted against the NCBI non-redundant protein database using the blastx program in the blastall package. All the hits were recorded to confirm the gene annotations from the Arabidopsis protein database, and accession numbers were assigned and used to search Entrez IDs using gene2accession in NCBI. Entrez IDs were recorded and used to search Gene IDs in the Gene Ontology database (http://www.geneontology.org/) for functional categorization of differentially expressed genes. In addition, the Entrez IDs were converted to Gene IDs in the KEGG database (http://www.genome.jp/kegg/), Biocarta (http://www.biocarta.com/genes/index.asp) and Reactome (http://www.reactome.org/). The significant pathways were identified based on KEGG, Biocarta and Reactome. Fisher’s exact test (Yi 2006) was used to select the significant pathways, and the threshold of significance was defined by P-value < 0.05 and FDR < 0.05. The enrichment $R_e$ was given by $R_e = (n_f/n)/(N_f/N)$, where $n_f$ is the number of differential genes within a particular pathway, $n$ is the total number of genes within the same pathway, $N_f$ is the number of differential genes on the entire microarray, and $N$ is the total number of genes on the microarray. Pathway-net was built according to the direct or systemic interactions between pathways in the KEGG database.

Using a strategy for clustering short time-series (STC) gene expression data (Ernst et al. 2005), some unique profiles were defined. Significant profiles that had a higher probability than expected were identified using Fisher’s exact test and multiple comparisons (Miller et al. 2002; Ramoni et al. 2002). GO analysis was applied to the genes belonging to specific profiles (Ashburner et al. 2000; Harris et al. 2006). Generally, Fisher’s exact test and the $\chi^2$ test were used to classify the GO category, and the FDR (Dupuy et al. 2007) was calculated to correct the P-value. The FDR was defined as $\mathrm{FDR} = 1 - N_k/T$, where $N_k$ refers to the number of Fisher’s tests with P-values less than those calculated using the $\chi^2$ test. Within the significant category, the enrichment $R_e$ was given by $R_e = (n_f/n)/(N_f/N)$, where $n_f$ is the number of differential genes within the particular category, $n$ is the total number of genes within the same category, $N_f$ is the number of differential genes on the entire microarray, and $N$ is the total number of genes on the microarray (Schlitt et al. 2003).
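As an illustration of the enrichment ratio and of Fisher's exact test for a single pathway or GO category, the following Python sketch uses only the standard library. It is a minimal example under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the test is taken to be one-sided (over-representation), and the counts in the example are invented.

```python
from math import comb


def enrichment(n_f, n, N_f, N):
    """Enrichment ratio: (n_f / n) / (N_f / N)."""
    return (n_f / n) / (N_f / N)


def fisher_enrichment_p(n_f, n, N_f, N):
    """One-sided Fisher's exact test for over-representation.

    Probability of seeing at least n_f differential genes in a category
    of n genes, when N_f of the N genes on the array are differential
    (upper tail of the hypergeometric distribution).
    """
    total = comb(N, n)
    upper = min(n, N_f)
    tail = sum(comb(N_f, k) * comb(N - N_f, n - k)
               for k in range(n_f, upper + 1))
    return tail / total


# Invented counts: 20 genes on the array, 5 differential,
# a category of 4 genes containing 3 differential ones.
print(enrichment(3, 4, 5, 20))           # 3.0
print(fisher_enrichment_p(3, 4, 5, 20))  # 155/4845, about 0.032
```

A category would then be reported as significant only if this P-value and the corresponding FDR both fall below the thresholds given in the text.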

Quantitative real-time PCR

To validate the array data, the expression of 55 genes of interest was confirmed using quantitative real-time PCR (qRT-PCR) with the cassava RNA samples extracted using Plant RNA Reagent (Invitrogen, Cat. No. 12322-012). DNA was removed from the samples using DNase I (TaKaRa, Cat. No. D2215) treatment according to the manufacturer’s protocol. The RNA quantity and purity were determined using a NanoDrop ND-1000 spectrophotometer (Nano Drop Technologies, Wilmington, DE, USA). A 2 µg aliquot of total RNA was used to synthesize first-strand cDNA using the ReverTra Ace (TOYOBO, Code: TRT101) in a 20 µL reaction volume. The qRT-PCR primers were designed with Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). The PCR reactions were performed in a 20 µL volume containing a 2×SYBR Green Master Mix (TOYOBO, Code: QPK-201), 50 ng cDNA, 400 nM of forward primer, and 400 nM of reverse primer in a Bio-Rad CFX96 thermocycler. The amplification conditions were 95 °C for 1 min, followed by 40–50 cycles of 95 °C for 15 s and 61 °C for 30 s. Beta-actin was used as the internal control. All of the samples were measured in triplicate. The comparative Ct method was used to calculate the relative gene expression levels across the samples. The relative expression level of each gene in one sample ($\Delta C_t$) was calculated as follows: $\Delta C_t = C_t(\text{target gene}) - C_t(\text{beta-actin})$. The relative expression of each gene in two different samples ($\Delta\Delta C_t$) was calculated as follows: $\Delta\Delta C_t = \Delta C_t(\text{sample 1}) - \Delta C_t(\text{sample 2})$.
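The comparative Ct calculation can be sketched in a few lines of Python. This is an illustrative example only: the function names and the triplicate Ct values are hypothetical, and the conversion of the between-sample difference to a log2 ratio assumes the usual approximation of about 100% amplification efficiency, which the text does not state explicitly.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(ct_target) - mean(ct_reference)


def delta_delta_ct(dct_sample1, dct_sample2):
    """Difference of the normalized Ct values of two samples."""
    return dct_sample1 - dct_sample2


def log2_ratio(ddct):
    """log2(sample 1 / sample 2), assuming ~100% amplification efficiency."""
    return -ddct


# Hypothetical triplicate Ct values, not data from the study.
dr = delta_ct([22.0, 22.2, 21.8], [18.0, 18.1, 17.9])  # DR: target vs. beta-actin
fr = delta_ct([25.0, 25.1, 24.9], [18.0, 18.1, 17.9])  # FR: target vs. beta-actin
ddct = delta_delta_ct(dr, fr)
print(log2_ratio(ddct), 2 ** log2_ratio(ddct))
```

With these invented values the target gene has a DR/FR log2 ratio of about 3, i.e. roughly eightfold higher expression in DR, which is the form in which the qRT-PCR results are compared with the array data.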

Acknowledgements

This work was supported by grants from the National Basic Research Program (2010CB126605), the National High Technology Research and Development Program of China (2009AA10Z102), the Earmarked Fund for Modern Agroindustry Technology Research System (nycytx-17), the Chinese Academy of Sciences (KSCX2-EW-J-12) and Shanghai Municipal Afforestation & City Appearance and Environmental Sanitation Administration (G102410). Wenzhi Zhou is acknowledged for downloading and checking the sequences of the latest cassava genome draft.

Utilization of the microarray

MIAME information about the cassava transcriptome microarray used here has been deposited in the Gene Expression Omnibus (GEO) of NCBI (Edgar and Barrett 2006; Barrett et al. 2007). The accession numbers are: Platform, GPL11271; Series, GSE25813; Samples, GSM634255-GSM634265.

Received 25 Nov. 2010
Accepted 1 Dec. 2010

References

Aerni P, Bernauer T (2006) Stakeholder attitudes toward GMOs in the Philippines, Mexico, and South Africa: The issue of public trust. World Dev. 34, 557–575.

Alves I, Cameira MD (2002) Evapotranspiration estimation performance of root zone water quality model: evaluation and improvement. Agr. Water Manage. 57, 61–73.

Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29.

Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R (2007) NCBI GEO: Mining tens of millions of expression profiles – database and tools update. Nucleic Acids Res. D760–D765.

Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate – a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300.

Besseau S, Hoffmann L, Geoffroy P, Lapierre C, Pollet B, Legrand M (2007) Flavonoid accumulation in Arabidopsis repressed in lignin synthesis affects auxin transport and plant growth. Plant Cell 19, 148–162.

Bishopp A, Help H, Helariutta Y (2009) Cytokinin Signaling during Root Development. Elsevier Academic Press Inc, San Diego. pp. 1.

Bolstad BM, Irizarry RA, Astrand M, Speed TP (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185–193.

Bravo-Garcia A, Yasumura Y, Langdale JA (2009) Specialization of the Golden2-like regulatory pathway during land plant evolution. New Phytol. 183, 133–141.

Brinker M, van Zyl L, Liu W, Craig D, Sederoff RR, Clapham DH, von Arnold S (2004) Microarray analyses of gene expression during adventitious root development in Pinus contorta. Plant Physiol. 135, 1526–1539.

Cenzano A, Cantoro R, Racagni G, De Los Santos-Briones C, Hernandez-Sotomayor T, Abdala G (2008) Phospholipid and phospholipase changes by jasmonic acid during stolon to tuber transition of potato. Plant Growth Regul. 56, 307–316.

Chew WY, Joseph KT, Ramli K (1978) Influence of soil-applied micronutrients on cassava (Manihot esculenta) in Malaysian tropical oligotrophic peat. Exp. Agr. 14, 105.

Cock JH (1982) Cassava – A basic energy-source in the tropics. Science 218, 755–762.

Dallas PB, Gottardo NG, Firth MJ, Beesley AH, Hoffmann K, Terry PA, Freitas JR, Boag JM, Cummings AJ, Kees UR (2005) Gene expression levels assessed by oligonucleotide microarray analysis and quantitative real-time RT-PCR – how well do they correlate? BMC Genomics 6, 59.

de Reynies A, Geromin D, Cayuela JM, Petel F, Dessen P, Sigaux F, Rickman DS (2006) Comparison of the latest commercial short and long oligonucleotide microarray technologies. BMC Genomics 7, 51.

Dupuy D, Bertin N, Hidalgo CA, Venkatesan K, Tu D, Lee D, Rosenberg J, Svrzikapa N, Blanc A, Carnec A, Carvunis A, Pulak R, Shingles J, Reece-Hoyes J, Hunt-Newbury R, Viveiros R, Mohler WA, Tasan M, Roth FP, Le Peuch C, Hope IA, Johnsen R, Moerman DG, Barabasi A, Baillie D, Vidal M (2007) Genome-scale analysis of in vivo spatiotemporal promoter activity in Caenorhabditis elegans. Nat. Biotechnol. 25, 663–668.

Edgar R, Barrett T (2006) NCBI GEO standards and services for microarray data. Nat. Biotechnol. 24, 1471–1472.

Ernst J, Nau GJ, Bar-Joseph Z (2005) Clustering short time series gene expression data. Bioinformatics 21(Suppl. 1), i159–i168.

Fan X (2009) Consistency of predictive signature genes and classifiers generated using different microarray platforms. Mol. Cell. Toxicol. 5, 42.

Fu FF, Xue HW (2010) Co-expression analysis identifies Rice Starch Regulator1 (RSR1), a rice AP2/EREBP family transcription factor, as a novel rice starch biosynthesis regulator. Plant Physiol. DOI: 10.1104/pp.110.159517.

Gibson SI (2004) Sugar and phytohormone response pathways: navigating a signalling network. J. Exp. Bot. 55, 253–264.

Guterman I, Shalit M, Menda N, Piestun D, Dafny-Yelin M, Shalev G, Bar E, Davydov O, Ovadis M, Emanuel M, Wang J, Adam Z, Pichersky E, Lewinsohn E, Zamir D, Vainstein A, Weiss D (2002) Rose scent: Genomics approach to discovering novel floral fragrance-related genes. Plant Cell 14, 2325–2338.

Harris MA, Clark JI, Ireland A, Lomax J, Ashburner M, Collins R, Eilbeck K, Lewis S, Mungall C, Richter J, Rubin GM, Shu S, Blake JA, Bult CJ, Diehl AD, Dolan ME, Drabkin HJ, Eppig JT, Hill DP, Ni L, Ringwald M, Balakrishnan R, Binkley G, Cherry JM, Christie KR, Costanzo MC, Dong Q, Engel SR, Fisk DG, Hirschman JE, Hitz BC, Hong EL, Lane C, Miyasato S, Nash R, Sethuraman A, Skrzypek M, Theesfeld CL, Weng S, Botstein D, Dolinski K, Oughtred R, Berardini T, Mundodi S, Rhee SY, Apweiler R, Barrell D, Camon E, Dimmer E, Mulder N, Chisholm R, Fey P, Gaudet P, Kibbe W, Pilcher K, Bastiani CA, Kishore R, Schwarz EM, Sternberg P, Van Auken K, Gwinn M, Hannick L, Wortman J, Aslett M, Berriman M, Wood V, Bromberg S, Foote C, Jacob H, Pasko D, Petri V, Reilly D, Seiler K, Shimoyama M, Smith J, Twigger S, Jaiswal P, Seigfried T, Collmer C, Howe D, Westerfield M (2006) The Gene Ontology (GO) project in 2006. Nucleic Acids Res. D322–D326.

Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, Kobayashi S, Davis C, Dai H, He YD, Stephaniants SB, Cavet G, Walker WL, West A, Coffey E, Shoemaker DD, Stoughton R, Blanchard AP, Friend SH, Linsley PS (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 19, 342–347.

Jansson C, Westerbergh A, Zhang JM, Hu XW, Sun CX (2009) Cassava, a potential biofuel crop in the People’s Republic of China. Appl. Energ. 86(Suppl. 1), S95–S99.

Jiang Y, Deyholos MK (2006) Comprehensive transcriptional profiling of NaCl-stressed Arabidopsis roots reveals novel classes of responsive genes. BMC Plant Biol. 6, 25.

Kent WJ (2002) BLAT: the BLAST-like alignment tool. Genome Res. 12, 656–664.

Kim JH, Choi D, Kende H (2003) The AtGRF family of putative transcription factors is involved in leaf and cotyledon growth in Arabidopsis. Plant J. 36, 94–104.

Koda Y, Kikuta Y (1991) Possible involvement of jasmonic acid in tuberization of yam plants. Plant Cell Physiol. 32, 629–633.

Koda Y, Kikuta Y, Tazaki H, Tsujino Y, Sakamura S, Yoshihara T (1991) Potato tuber-inducing activities of jasmonic acid and related compounds. Phytochemistry 30, 1435–1438.

Ku AT, Huang YS, Wang YS, Ma DF, Yeh KW (2008) IbMADS1 (Ipomoea batatas MADS-box 1 gene) is involved in tuberous root initiation in sweet potato (Ipomoea batatas). Ann. Bot. 102, 57–67.

Leiske DL, Karimpour-Fard A, Hume PS, Fairbanks BD, Gill RT (2006) A comparison of alternative 60-mer probe designs in an in situ synthesized oligonucleotide microarray. BMC Genomics 7, 72.

Li X, Chiang HI, Zhu J, Dowd SE, Zhou H (2008) Characterization of a newly developed chicken 44K Agilent microarray. BMC Genomics 9, 60.

Liu J, Han L, Chen F, Bao J, Zhang F, Mi G (2008) Microarray analysis reveals early responsive genes possibly involved in localized nitrate stimulation of lateral root development in maize (Zea mays L.). Plant Sci. 175, 272–282.

López C, Soto M, Restrepo S, Piegu B, Cooke R, Delseny M, Tohme J, Verdier V (2005) Gene expression profile in response to Xanthomonas axonopodis pv. manihotis infection in cassava using a cDNA microarray. Plant Mol. Biol. 57, 393–410.

McHale CM, Zhang LP, Lan Q, Li GL, Hubbard AE, Forrest MS, Vermeulen R, Chen J, Shen M, Rappaport SM, Yin SN, Smith MT, Rothman N (2009) Changes in the peripheral blood transcriptome associated with occupational benzene exposure identified by cross-comparison on two microarray platforms. Genomics 93, 343–349.

Melis R, Vanstaden J (1985) Tuberization in cassava (Manihot esculenta) – cytokinin and abscisic-acid activity in tuberous roots. J. Plant Physiol. 118, 357–366.

Miller LD, Long PM, Wong L, Mukherjee S, McShane LM, Liu ET (2002) Optimal gene expression analysis by microarrays. Cancer Cell 2, 353–361.

Miyazawa Y, Kato H, Muranaka T, Yoshida S (2002) Amyloplast formation in cultured tobacco BY-2 cells requires a high cytokinin content. Plant Cell Physiol. 43, 1534–1541.

Nassar N, Ortiz R (2010) Breeding cassava to feed the poor. Sci. Am. 302, 78–84.

Nguyen T, Gheewala SH, Garivait S (2007a) Energy balance and GHG-abatement cost of cassava utilization for fuel ethanol in Thailand. Energ. Policy 35, 4585–4596.

Nguyen T, Gheewala SH, Garivait S (2007b) Full chain energy analysis of fuel ethanol from cassava in Thailand. Environ. Sci. Technol. 41, 4135–4142.

Oono Y, Seki M, Nanjo T, Narusaka M, Fujita M, Satoh R, Satou M, Sakurai T, Ishida J, Akiyama K, Lida K, Maruyama K, Satoh S, Yamaguchi-Shinozaki K, Shinozaki K (2003) Monitoring expression profiles of Arabidopsis gene expression during rehydration process after dehydration using ca. 7000 full-length cDNA microarray. Plant J. 34, 868–887.

Ovono PO, Kevers C, Dommes J (2010) Tuber formation and growth of Dioscorea cayenensis-D. rotundata complex: interactions between exogenous and endogenous jasmonic acid and polyamines. Plant Growth Regul. 60, 247–253.

Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A (2005) False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics 21, 3017–3024.

Petersen D, Chandramouli GV, Geoghegan J, Hilburn J, Paarlberg J, Kim CH, Munroe D, Gangi L, Han J, Puri R, Staudt L, Weinstein J, Barrett JC, Green J, Kawasaki ES (2005) Three microarray platforms: an analysis of their concordance in profiling gene expression. BMC Genomics 6, 63.

Rabbani MA, Maruyama K, Abe H, Khan MA, Katsura K, Ito Y, Yoshiwara K, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2003) Monitoring expression profiles of rice genes under cold, drought, and high-salinity stresses and abscisic acid application using cDNA microarray and RNA gel-blot analyses. Plant Physiol. 133, 1755–1767.

Ramoni MF, Sebastiani P, Kohane IS (2002) Cluster analysis of gene expression dynamics. Proc. Natl. Acad. Sci. USA 99, 9121–9126.

Reilly K, Bernal D, Cortes DF, Gomez-Vasquez R, Tohme J, Beeching JR (2007) Towards identifying the full set of genes expressed during cassava post-harvest physiological deterioration. Plant Mol. Biol. 64, 187–203.

Relogio A, Schwager C, Richter A, Ansorge W, Valcárcel J (2002) Optimization of oligonucleotide-based DNA microarrays. Nucl. Acids Res. 30, e51.

Sakurai T, Plata G, Rodriguez-Zapata F, Seki M, Salcedo A, Toyoda A, Ishiwata A, Tohme J, Sakaki Y, Shinozaki K, Ishitani M (2007) Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response. BMC Plant Biol. 7, 66.

Schlitt T, Palin K, Rung J, Dietmann S, Lappe M, Ukkonen E, Brazma A (2003) From gene networks to gene function. Genome Res. 13, 2568–2576.

Seki M, Narusaka M, Ishida J, Nanjo T, Fujita M, Oono Y, Kamiya A, Nakajima M, Enju A, Sakurai T, Satou M, Akiyama K, Taji T, Yamaguchi-Shinozaki K, Carninci P, Kawai J, Hayashizaki Y, Shinozaki K (2002) Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. Plant J. 31, 279–292.

Shippy R, Sendera T, Lockner R, Palaniappan C, Kaysser-Kranich T, Watts G, Alsobrook J (2004) Performance evaluation of commercial short-oligonucleotide microarrays and the impact of noise in making cross-platform correlations. BMC Genomics 5, 61.

Smith AM (2008) Prospects for increasing starch and sucrose yields for bioethanol production. Plant J. 54, 546–558.

Sojikul P, Kongsawadworakul P, Viboonjun U, Thaiprasit J, Intawong B, Narangajavana J, Svasti MR (2010) AFLP-based transcript profiling for cassava genome-wide expression analysis in the onset of storage root formation. Physiol. Plant. 140, 189–198.

Taylor N, Chavarriaga P, Raemakers K, Siritunga D, Zhang P (2004) Development and application of transgenic technologies in cassava. Plant Mol. Biol. 56, 671–688.

Thimm O, Essigmann B, Kloska S, Altmann T, Buckhout TJ (2001) Response of Arabidopsis to iron deficiency stress as revealed by microarray analysis. Plant Physiol. 127, 1030–1043.

Tsai CA, Hsueh HM, Chen JJ (2003) Estimation of false discovery rates in multiple testing: application to gene microarray data. Biometrics 59, 1071–1081.

Tsai S, Mir B, Martin AC, Estrada JL, Bischoff SR, Hsieh W, Cassady JP, Freking BA, Nonneman DJ, Rohrer GA, Piedrahita JA (2006) Detection of transcriptional difference of porcine imprinted genes using different microarray platforms. BMC Genomics 7, 328.

Vandenberg JH, Ewing EE (1991) Jasmonates and their role in plant growth and development, with special reference to the control of potato tuberization – A review. Amer. Potato J. 68, 781–794.

Wolber PK, Collins PJ, Lucas AB, De Witte A, Shannon KW (2006) The Agilent in situ-synthesized microarray platform. Methods Enzymol. 410, 28–57.

Woo Y, Affourtit J, Daigle S, Viale A, Johnson K, Naggert J, Churchill G (2004) A comparison of cDNA, oligonucleotide, and Affymetrix GeneChip gene expression microarray platforms. J. Biomol. Tech. 15, 276–284.

Wright GW, Simon RM (2003) A random variance model for detection of differential gene expression in small microarray experiments. Bioinformatics 19, 2448–2455.

Yanagisawa S (2002) The Dof family of plant transcription factors. Trends Plant Sci. 7, 555–560.

Yanagisawa S (2004) Dof domain proteins: plant-specific transcription factors associated with diverse phenomena unique to plants. Plant Cell Physiol. 45, 386–391.

Yi M (2006) Whole pathway scope a comprehensive pathway based analysis. BMC Bioinformatics 7, 30.

Zeeman SC, Kossmann J, Smith AM (2010) Starch: its metabolism, evolution, and biotechnological modification in plants. Annu. Rev. Plant Biol. 61, 209–234.

(Co-Editor: Hai-Chun Jing)

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Figure S1 Principal component analysis of all 11 cassava arrays. DR, developing storage roots (orange squares); FR, fibrous roots (red triangles); MR, mature storage roots (blue diamonds); TR, technical repeat (green circles).
Figure S2 High correlation coefficients among samples from fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR).
Figure S3 Hierarchical clustering of samples from fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR). DR-1 was clustered into the FR group; DR-2 and DR-3 were clustered into the MR group.
Figure S4 Stable expression of control genes in different samples. MeCYP71E, Manihot esculenta cytochrome P450 protein CYP71E (c15); MeBeta-6 Tubulin, Manihot esculenta TUB6 (Beta-6 Tubulin), putative; MeEF-1-alpha, Manihot esculenta elongation factor 1-alpha; MeACT, Manihot esculenta actin.
Supporting File S1 Upregulated and downregulated genes in pairwise comparisons of different samples.
Supporting File S2 Differentially expressed genes selected by the random variance model (RVM) method.
Supporting File S3 Gene annotations of differentially expressed genes based on the National Center for Biotechnology Information (NCBI) and TAIR protein databases.
Supporting File S4 Significant pathways and contributing genes.
Supporting File S5 Interactions between significant pathways and related ones according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Supporting File S6 Non-redundant Arabidopsis gene lists representing the total or differentially expressed cassava genes on the microarray.
Supporting File S7 Present and absent significant pathways according to KegArray results.
Supporting File S8 Differentially expressed genes assigned to each significant expression profile.
Supporting File S9 Significant gene ontology (GO) terms and contributing genes in profile 11 and profile 4.
Supporting File S10 Enrichment of each significant gene ontology (GO) term calculated based on The Arabidopsis Information Resource (TAIR) GO annotation results.
Supporting File S11 The 138 predicted differentially expressed transcription factors identified by Kyoto Encyclopedia of Genes and Genomes (KEGG) Biomolecular Relations in Information Transmission and Expression (BRITE) and The Arabidopsis Information Resource (TAIR) Functional Categorization.
Supporting File S12 Transcription factor family classifications and enrichments based on the Database of Arabidopsis Transcription Factors (DATF) and the Plant Transcription Factor Database (PlnTFDB).

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.