Published online January 15, 2018
RESEARCH
Selection of Expression Reference Genes with Demonstrated Stability in Barley among a Diverse Set of Tissues and Cultivars Michael Gines, Thomas Baldwin, Abdur Rashid, Phil Bregitzer, Peter J. Maughan, Eric N. Jellen, and Kathy Esvelt Klos*
ABSTRACT
Reference genes are selected based on the assumption of temporal and spatial expression stability and on their widespread use in model species. They are often used without validation in the studied species. For barley (Hordeum vulgare L.), previous reference gene validation studies have been limited by the number of tissues, cultivars, and/or reference gene candidates assayed. Publicly available bioinformatic resources now enable efficient assessment of much larger numbers of candidate genes than is possible with empirical approaches. Expressed sequence tag profile viewer data from the UniGene library were used to predict the expression stability of 655 barley genes. The expression profiles of 20 gene candidates predicted to be stable across tissue types were evaluated in eight tissues of the cultivar ‘Conlon’ by real-time quantitative polymerase chain reaction. The five most stable genes were then tested in the barley cultivars ‘Golden Promise’ and ‘Harrington’ and (to test potential applicability to other Poaceae species) in the oat (Avena sativa L.) cultivar ‘HiFi’. The traditional reference gene actin and the candidate genes elongation factor EF-2 and malate dehydrogenase demonstrated the most stable expression. The predictive capacity of bioinformatics to identify suitable reference genes was demonstrated. Four genes with better stability than all except one traditionally used reference gene were discovered.
M. Gines, P.J. Maughan, and E.N. Jellen, Dep. of Plant and Wildlife Sciences, Brigham Young Univ., Provo, UT; T. Baldwin, P. Bregitzer, and K. Esvelt Klos, National Small Grains Germplasm Research Facility, USDA-ARS, Aberdeen, ID; A. Rashid, Dep. of Entomology Plant Pathology and Weed Science, New Mexico State Univ., Las Cruces, NM. Received 24 July 2017. Accepted 24 Oct. 2017. *Corresponding author ([email protected]). Assigned to Associate Editor Candice Hirsch.
Abbreviations: cDNA, complementary DNA; Cq, quantification cycle; Ct, threshold cycle; DPA, days postanthesis; EST, expressed sequence tag; GAPDH, glyceraldehyde 3-phosphate dehydrogenase; PCR, polymerase chain reaction; RT-qPCR, real-time quantitative polymerase chain reaction; TPM, transcripts per million.
Published in Crop Sci. 58:332–341 (2018). doi: 10.2135/cropsci2017.07.0443 © Crop Science Society of America | 5585 Guilford Rd., Madison, WI 53711 USA This is an open access article distributed under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
The advent of large-scale gene expression analysis by microarrays and, increasingly, by RNA sequencing is changing the landscape of molecular biology research. These methods produce large datasets that are informative and reliable, but time consuming to analyze and costly to produce. For these reasons, classic techniques for gene expression analysis like northern blots and real-time quantitative polymerase chain reaction (RT-qPCR) are still effective and widely used for targeted small-scale investigations. These methods can provide insights into gene regulation, protein interactions, novel gene discovery, and biosynthesis steps (Oliver, 2000; Lee et al., 2004; Wolfe et al., 2005). However, they require normalization of target-gene expression based on the constitutive expression of reference genes in the tissues being investigated. Reference genes used for expression studies in barley (Hordeum vulgare L.) and other cereals have generally been chosen based on their use in studies of other species (Ferdous et al., 2015; Hua et al., 2015), often without further validation. This is a problem for two reasons. First, such reference genes may be differentially expressed among species and in tissues within a species,
and second, restriction of reference genes to a pool of previously used genes may overlook superior candidates.
Although previous studies designed to identify or validate appropriate reference genes for barley have been conducted, each was limited in the numbers of genes, tissues, and/or cultivars. Using microarray-based expression data, Janská et al. (2013) surveyed 13 genes in only two tissues (seedling leaf and crown) subjected to abiotic stress, whereas Ovesna et al. (2012) evaluated 10 genes at five developmental stages of developing barley caryopses. Zmienko et al. (2015) surveyed a large set of 181 genes in senescing leaf tissue only. In addition, many have relied on traditional candidates. For example, Hua et al. (2015) and Ferdous et al. (2015) identified the most stable reference genes from among pools of traditionally used genes including glyceraldehyde 3-phosphate dehydrogenase (GAPDH) and actin. None of these studies examined gene expression in germinating grain, which is the most critical aspect of development to the malting and brewing industry.
The current study provides a broader analysis of traditional and nontraditional reference gene candidates across spatially and temporally diverse tissues and diverse cultivars. The objective of this study was to identify a set of highly reliable and stably expressed reference genes across genetically diverse barley cultivars, and across diverse plant tissues. We used publicly available bioinformatic resources for data mining to predict the stability of hundreds of genes across multiple tissues, time points, and conditions. Comparison across these publicly available datasets was used to narrow the selection of adequate reference genes and was then followed by experimental validation of the top candidates. Three barley cultivars were selected based on their importance in genetic transformation studies, commercial utility, and genetic diversity.
For additional comparison and to test the potential for extending results to other members of the Poaceae, top reference gene candidates were also evaluated in an oat (Avena sativa L.) cultivar.
MATERIALS AND METHODS
Plant Materials
Plant tissues were obtained from ‘Conlon’, ‘Golden Promise’, and ‘Harrington’ barley cultivars and from the oat cultivar ‘HiFi’. The barley cultivars were chosen based on their importance for malting (Harrington and Golden Promise) and their use in transformation studies (Conlon and Golden Promise). Based on pedigrees, they are relatively genetically diverse. Harrington (Harvey and Rossnagel, 1984) and Conlon (PI 597789, PVP 9700243; USDA, 1997) both have a common parent (Klages), but whereas Klages represented 50% of the initial genetic variation in Harrington (Klages/3/Gazelle/Betzes//Centennial), the pedigree of Conlon is complex and the contribution of Klages is considerably smaller. Furthermore, the areas of adaptation for Conlon and Harrington differ, and their morphology and patterns of growth and development are distinctive (Bregitzer, personal observation). Golden Promise (PI 343079) is a dwarf mutant induced by γ radiation of ‘Maythorpe’ from the United Kingdom and is not closely related to either Conlon or Harrington. None of the cultivars used in empirical analyses were represented in the UniGene expressed sequence tag (EST) libraries used for data mining. For this study, lines derived from single-plant selections of each cultivar were made to reduce or eliminate potential within-cultivar heterogeneity.
Eight tissues were chosen from each cultivar to represent important barley developmental stages: 12-d-old seedling roots and shoots; stems and tiller leaves at the six-tiller stage; caryopses at 0 to 1, 5 to 7, and 12 to 15 d postanthesis (DPA); and seed tissue 4 d postgermination. The latter tissue is unique to this study and highly relevant to studies of gene expression during malting. For each cultivar and tissue type, three biological replicates (separate plants) were assayed. All plants were grown in a greenhouse. The initial experiments were done with Conlon plants, grown at a different time than the Harrington and Golden Promise plants used for subsequent experiments. Samples were harvested and immediately used for extraction or frozen in liquid N and stored at −80°C.
Bioinformatic Analysis
Candidate reference genes were discovered by estimating expression stability across publicly available expression data. A UniGene (UniGene, 2014) search was performed for barley EST clusters. These data came from studies of a number of cultivars, with ‘Barke’, ‘Haruna Nijo’, ‘Morex’, and ‘Optic’ represented by more than two of the 74 EST libraries used. Haruna Nijo, a Japanese cultivar, and Morex, a six-rowed cultivar adapted to the Midwestern United States, represent germplasm distinct from the two-rowed North American or European germplasm (Munoz-Amatrain et al., 2014). Barke and Optic are two-rowed European malting cultivars, like Golden Promise, but were both released after Golden Promise, and neither has Golden Promise in its pedigree (Russell et al., 2000; Gottwald et al., 2009). Robust, well-represented candidate reference genes were selected by using the search term “Hordeum vulgare 100:20000[ESTC]” to exclude clusters with fewer than 100 ESTs, resulting in a pool of 689 candidate reference genes. Of these, 665 had EST profiles for a broad range of tissues relevant to this study. These EST profiles were used for estimating gene expression stability. For each cluster’s EST profile, the CV was derived from the transcripts per million (TPM) values across leaf, pericarp, pistil, root, seed, spike, and stem tissues. The TPM CV as a measurement of stability of the EST profile could be skewed by outliers or underrepresented data. No outliers were found outside of the calculated threshold of 0 to 140,164 TPM. Underrepresented tissue datasets (those below the first quartile of 16,105 TPM) were excluded, which reduced the candidate pool from 665 to 430 profiles that contained TPM values for all seven tissues (Supplemental Table S1). The clusters were ranked by CV, and the 20 clusters with the lowest CV were selected as candidates for the empirical expression study.
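The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the cluster names and TPM values are hypothetical, the use of the sample standard deviation for the CV is an assumption (the text does not say which estimator was used), and the quartile-based exclusion of underrepresented tissue datasets is reduced here to a simple completeness check across the seven tissues.

```python
from statistics import mean, stdev

# The seven tissues named in the text.
TISSUES = ("leaf", "pericarp", "pistil", "root", "seed", "spike", "stem")


def tpm_cv(profile):
    """Coefficient of variation (sample SD / mean) of TPM across TISSUES."""
    values = [profile[t] for t in TISSUES]
    return stdev(values) / mean(values)


def rank_by_cv(profiles, top_n=20):
    """Keep clusters with a TPM value for every tissue; rank by ascending CV."""
    complete = {
        cluster: prof
        for cluster, prof in profiles.items()
        if all(prof.get(t) is not None for t in TISSUES)
    }
    ranked = sorted(complete, key=lambda c: tpm_cv(complete[c]))
    return [(c, round(tpm_cv(complete[c]), 3)) for c in ranked[:top_n]]


# Hypothetical TPM values, for illustration only (not UniGene data).
profiles = {
    "cluster_A": dict(zip(TISSUES, (100, 110, 90, 105, 95, 100, 100))),
    "cluster_B": dict(zip(TISSUES, (50, 400, 20, 300, 10, 150, 70))),
    "cluster_C": {"leaf": 120, "root": 80},  # incomplete profile, excluded
}
ranking = rank_by_cv(profiles)
```

A lower CV places a cluster earlier in the list, mirroring the selection of the 20 lowest-CV clusters.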
Primer Design and Validation
Barley sequences were acquired from NCBI (NCBI, 2014) accessions linked to UniGene clusters. Oat sequences were predicted by aligning CORE (Oliver et al., 2013) EST libraries from 22 oat lines to the barley NCBI accessions using Sequencher version 5.2.4 (Gene Codes Corporation, 2014) with 85% minimum match, minimum 20 bp overlap, and under the “dirty data” setting. Where possible, primers were designed using sequences homologous between barley and oat. Primer sets for each candidate were first validated using barley complementary DNA (cDNA), and successful assays were evaluated for their performance on oat cDNA. All primer pairs were validated through end-point PCR and gel electrophoresis for specific amplification of single products. Satisfactory primer pairs were then validated for real-time PCR applications by performing a standard curve analysis. Primer pairs were accepted if they produced products of 100 to 400 bp, amplification efficiencies between 90 and 110%, r² above 0.98, and single melt peaks (Appendices 1 and 2 in the supplemental material). Primer sets that did not perform optimally using oat cDNA were redesigned specifically for oat sequences. Some primer sets were observed to amplify genomic DNA. To ensure no genomic carryover from RNA extraction interfered in real-time quantification, all RNA samples were run parallel with their cDNA counterparts using primer set 13 (elongation factor 1-α).
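The acceptance criteria above can be illustrated with a short Python sketch. The text does not give the efficiency formula; the widely used relation E = 10^(−1/slope) − 1, with the slope taken from a linear fit of Cq against log10 of template amount, is assumed here. The dilution series and Cq values are hypothetical, and the function names are illustrative only.

```python
def standard_curve(log10_input, cq):
    """Least-squares fit of Cq against log10(template amount).

    Returns (slope, r_squared).
    """
    n = len(cq)
    mx = sum(log10_input) / n
    my = sum(cq) / n
    sxx = sum((x - mx) ** 2 for x in log10_input)
    syy = sum((y - my) ** 2 for y in cq)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_input, cq))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


def efficiency_percent(slope):
    """Amplification efficiency in percent, assuming E = 10^(-1/slope) - 1."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def primer_pair_accepted(slope, r_squared, product_bp, single_melt_peak):
    """Apply the thresholds stated in the text to one primer pair."""
    eff = efficiency_percent(slope)
    return (
        100 <= product_bp <= 400
        and 90.0 <= eff <= 110.0
        and r_squared > 0.98
        and single_melt_peak
    )


# Hypothetical 10-fold dilution series (log10 relative input) and Cq values.
log10_input = [0.0, -1.0, -2.0, -3.0, -4.0]
cq = [18.0, 21.3, 24.7, 28.0, 31.4]
slope, r2 = standard_curve(log10_input, cq)
efficiency = efficiency_percent(slope)
accepted = primer_pair_accepted(slope, r2, product_bp=146, single_melt_peak=True)
```

With these made-up values the fitted slope is close to −3.35, which corresponds to an efficiency just under 100%, so the pair would pass all four checks.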
RNA Extraction
RNA extraction was performed using the GeneJET Plant Purification Minikit (Thermo Fisher Scientific). Samples were then treated with DNase I and DNase 5´ buffer (New England Biolabs). The RNA was then isolated through LiCl precipitation. Samples with high starch levels (all seeds 4 d postgermination, and oat leaf and shoot tissues) clogged the GeneJET spin columns and were therefore isolated through a Trizol (Thermo Fisher Scientific) extraction method (Rashid et al., 2016) with further purification through LiCl precipitation. RNA quality was assessed using gel electrophoresis and optical density readings. Samples showing defined 28S and 18S bands and with values for optical density 260/280 and 260/230 >1.7 were selected for use. The RNA samples were checked for genomic DNA contamination by running them in parallel with cDNA through real-time PCR. Sample cDNA was generated using the iScript kit (Bio-Rad).
Quantitative Reverse Transcriptase PCR
Quantification of transcript levels for each sample replicate (biological replicate) was performed in triplicate using RT-qPCR. All reference candidate gene expression measurements were performed using the SsoFast EvaGreen Supermix (Bio-Rad) with the SYBR Green fluorophore. The standard protocol recommended by the manufacturer was followed using 0.5 μL of each primer pair for each primer set, and 2.5 ng of template cDNA for each tissue. The thermal cycler program was 95°C activation for 30 s, 40 cycles of a melting step at 95°C for 5 s and an annealing and extension step of 60°C, followed by a melt curve step from 65 to 95°C in increments of 0.5°C every 5 s. A CFX thermal cycler and CFX Manager version 3.21 software (Bio-Rad, 2014) were used to generate and analyze the data. Transcript abundance was measured from the cycle in which amplification of the template primer product exhibited fluorescence above the threshold detection level for each target, reported as a quantification cycle (Cq) value.
Statistical Comparison of Reference Genes
The RT-qPCR Cq values for candidate genes were evaluated for equality of variances using Levene’s test and for equality of means using Welch’s variance-weighted ANOVA in the SAS Enterprise 7.1 software (SAS Institute, 2014). To empirically assess the expression stability across tissues of each candidate gene, the data retrieved from the real-time experiments were uploaded to an online data analysis tool, RefFinder (Xie et al., 2012; Baohong Zhang, personal communication, 2016). RefFinder returned the results from several statistical models used to calculate gene expression stability. Stability rank was assigned to candidate genes using each model independently, and then the ranks were used to derive an overall measure of reference gene stability, the geomean value, for each candidate gene. Low geomean values indicate high stability. The statistical models used were BestKeeper, comparative Δ threshold cycle (ΔCt), geNorm, and Normfinder (Vandesompele et al., 2002; Andersen et al., 2004; Pfaffl et al., 2004; Silver et al., 2006). Candidate gene expression stability ranking, estimated using CV of the UniGene TPM data, was compared with the empirical rankings, based on geomeans, using a Pearson’s correlation test.
According to the results of the empirical analysis of all 20 candidates in Conlon, the top five novel candidates and two traditional reference genes (actin and GAPDH) were evaluated in Golden Promise, Harrington, and HiFi tissues. The Cq values for these seven genes in eight tissues were used to calculate geomeans for each cultivar individually, for all three barley cultivars averaged over all tissues and for each tissue independently, and for all three barley cultivars plus oat.
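As a worked illustration of the geomean value, the sketch below takes the geometric mean of the four method-specific ranks that the Results report for two candidates, Hv. 3082 and Hv. 12544. It assumes the overall value is a plain geometric mean of the per-method ranks, as the description above implies; RefFinder's internal weighting may differ, and the dictionary keys are illustrative names only.

```python
from statistics import geometric_mean

# Per-method stability ranks as reported in the Results for two candidates.
method_ranks = {
    "Hv. 3082": {"delta_ct": 11, "bestkeeper": 17, "normfinder": 12, "genorm": 1},
    "Hv. 12544": {"delta_ct": 8, "bestkeeper": 18, "normfinder": 8, "genorm": 1},
}

# Assumed combination rule: geometric mean of the four ranks.
geomeans = {
    gene: geometric_mean(list(ranks.values()))
    for gene, ranks in method_ranks.items()
}

# Lower geomean indicates higher stability, so sort ascending.
order = sorted(geomeans, key=geomeans.get)
```

Under this assumption Hv. 12544 obtains the lower geomean (about 5.8 versus about 6.9), which agrees with its better overall placement (sixth versus ninth) among the 20 candidates.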
RESULTS
Expression Stability Based on UniGene EST Profile
UniGene EST profiles were used to identify 430 clusters defining reference candidate genes with publicly available expression data across seven barley tissues. The CV values from TPM reported for barley leaf, pericarp, pistil, root, seed, spike, and stem tissues were normally distributed, ranging from 0.15 to 2.03 (not shown). The CVs of the 20 most stable candidate genes are shown in Table 1.
Real-Time Quantitative PCR in Conlon
Expression of these top 20 candidate genes was measured in Conlon using RT-qPCR (Supplemental Table S2). Within Conlon, candidate genes differed for both the variance (p < 0.005) and mean value (p < 0.0001) of Cq, but Cq of tissues relative to one another was fairly consistent, with Cq values for 4 d postgermination comprising at least two of the three lowest values (Fig. 1). Based on the geomeans, the five most stably expressed candidates in Conlon were (in descending order of stability): Hv. 3271 (superoxide dismutase), Hv. 22901 (malate dehydrogenase), Hv. 9509 (elongation factor EF-2), Hv. 2973 (3-ketoacyl-CoA thiolase-like), and Hv. 20658 (S-adenosylmethionine synthetase 4; Fig. 2, Appendix 3 in the supplemental material).
Table 1. Stability ranking based on CV of transcripts per million in the UniGene database for 20 candidate reference genes and their validated primer sequences. Genes that were also included in cited studies are underlined.
Stability rank | CV | UniGene cluster | NCBI accession | Gene | Forward primer | Reverse primer | Size (bp)
1 | 0.154 | Hv. 3082 | AK362890 | NADH dehydrogenase | AGATCGCATTTTCACGAACC | CCAGTCTGCCCCTTTAATCA | 112
2 | 0.205 | Hv. 674 | AK251074 | Phosphoglycerate mutase | GCAGCAGTGTTGGTTTCGTA | CCTTGGGCTCTGCTCTTAG | 135
3 | 0.205 | Hv. 2613 | AK251152 | Proteasome subunit beta type-7-B | AGTTGCTTGTGAAAGGTGCC | CCAAGGTCATGTCAGTGCTG | 145
4 | 0.221 | Hv. 22848 | X60343 | GAPDH | GAGGGTCTGATGACCACTGT | AGATCAACAACTGACACATCCAC | 221
5 | 0.231 | Hv. 9509 | AK250157 | Elongation factor EF-2 | AACTGGCATGAAGGTCCGTA | GGCAGGCATCAACTTCCTTC | 218
6 | 0.240 | Hv. 22901 | AK358926 | Malate dehydrogenase | GCACTGGTGTGAATGTTGC | CTTCTCAGGGATAGATGGAGC | 218
7 | 0.242 | Hv. 19033 | AK248694 | Heat shock protein 70 | CCTTTCTTCCACTGCCCAAA | AGCTGCTGAACTCTGGGAAT | 228
8 | 0.244 | Hv. 735 | Y09741 | Beta-tubulin 1 | CAGTTGGAGCGTGTCAATGT | GCACCAGATTGCCCAAAGACA | 175
9 | 0.254 | Hv. 22789 | AF230786 | Translationally-controlled tumor protein homolog (HTP) | GTGCTCTGGGAAGTCGATG | CAACAATGTCAACCACCTTCAC | 135
10 | 0.255 | Hv. 2973 | AK251228 | 3-ketoacyl-CoA thiolase-like protein | AAGTGAGTGATGGTGCTGGA | GATCCACTCCAACAGCAACG | 111
11 | 0.259 | Hv. 3271 | AK252295 | Superoxide dismutase [Cu-Zn] | GCACCATCTTCTTCACCCAG | GTTAGCAACACCATCCGCTC | 239
12 | 0.261 | Hv. 22823 | AK359072 | Mitochondrial ATP synthase beta subunit | GCCACAGAGCAGCAAATTCT | CACCAGCGAACACAGAGAAA | 178
13 | 0.264 | Hv. 22783 | Z50789 | Elongation factor 1-α | TTGGTGGCATTGGAACTGTG | CAAACCCACGCTTGAGATCC | 207
14 | 0.276 | Hv. 20658 | AM039896 | S-adenosylmethionine synthetase 4 | CAACATTGAGCAGCAGTCCC | GCAGCAATCTCATCGTTGGT | 342
15 | 0.287 | Hv. 22798 | AY325266 | Cytosolic heat shock protein 90 | AGATCAACCAGCTCCTCTCG | CACAAGGTAGGCGGAGTAGA | 359
16 | 0.289 | Hv. 23243 | AK250311 | Zinc finger A20 & AN1 domain-containing stress-associated | TGAACATGTGCTCCAAGTGC | GTAGTCGAACTGGCAGTCATG | 371
17 | 0.292 | Hv. 23088 | AY145451 | Actin | GAATGGTCAAGGCTGGTTTCG | TCCATGTCATCCCAGTTGCT | 205
18 | 0.296 | Hv. 3750 | AK371415 | Peroxiredoxin-5, mitochondrial | GGTCATCCTCTTCGGCGTC | CCTACAGTGCCTTGAGGATCTC | 380
19 | 0.299 | Hv. 3381 | AK252614 | Translation initiation factor eIF-5A | CACCACTTCGAGTCCAAGGC | CCATGCTTCCCAGTCTTGGA | 146
20 | 0.304 | Hv. 12544 | AK367709 | Proteasome subunit alpha type-5-A | TCACGAGGACTGAGTACGAC | CTCAACAGCCAGGACAACAC | 146
Quantitative Reverse Transcriptase PCR in Golden Promise, Harrington, and HiFi
The five candidate genes with the most stable expression in Conlon as determined by RT-qPCR, as well as the commonly used reference genes GAPDH and actin, were evaluated in Golden Promise and Harrington barley and HiFi oat using RT-qPCR (Supplemental Table S3). The means and ranges for Cq values are shown in Fig. 3. The geomean rankings of the top five candidate genes plus actin (Hv. 23088) and GAPDH (Hv. 22848) within cultivar (across all tissues) are shown in Table 2. These data showed that actin was ranked sixth, third, third, and first for Conlon, Golden Promise, Harrington, and HiFi, respectively.
Expression Stability Comparisons
Expression stability within Conlon, as expressed by geomean ranking based on Cq values from three biological replicates of seven tissues, was compared with predicted stability from the data mining, as expressed by ranking the TPM CVs. Within this subset of candidate reference genes (which had been selected based on low TPM CV), there was no correlation between the TPM CV rank and the geomean rank based on expression analysis of these top 20 candidates in Conlon (r = −0.0511, p = 0.8306). For example, the candidate predicted to be the most stably expressed based on UniGene EST profile, NADH dehydrogenase (Hv. 3082), was ninth in overall geomean ranking from RefFinder (and was ranked 11th by the ΔCt method, 17th by BestKeeper, 12th by Normfinder, and first by the geNorm method). The candidate gene predicted to be least stable based on UniGene EST profile, proteasome subunit alpha type-5-A (Hv. 12544), was sixth in the overall geomean ranking (and eighth by the ΔCt method, 18th by BestKeeper, eighth by Normfinder, and first by the geNorm method; Appendix 3 in the supplemental material). However, none of the candidate reference genes displayed high levels of variability relative to GAPDH and actin (Fig. 2).
Expression stability across barley cultivars was evaluated by calculating geomeans using the Cq data from all barley cultivars. Among those geomeans, actin was ranked first. The other traditionally used reference gene, GAPDH, was less stable and was ranked seventh for Conlon, Golden Promise, and for all barley cultivars. Superoxide dismutase [Cu-Zn] (Hv. 3271), which was the most stably expressed reference candidate gene based on the data mining and in the empirical RT-qPCR analyses of Conlon, was ranked fifth in stability when all barley cultivars were combined (Table 2).
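The rank comparison above can be sketched as follows. The first function computes a Pearson coefficient between two rank lists; the second converts a coefficient into the usual t statistic, t = r·sqrt(n − 2)/sqrt(1 − r²). Treating the reported value of −0.0511 as a Pearson r over n = 20 candidates is an assumption made for this illustration, and the five-item rank lists are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson coefficient."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical ranks for five genes: predicted (TPM CV) versus empirical.
predicted_rank = [1, 2, 3, 4, 5]
empirical_rank = [3, 1, 5, 2, 4]
r_example = pearson_r(predicted_rank, empirical_rank)

# Assumption: the reported coefficient is a Pearson r across 20 candidates.
t_reported = t_statistic(-0.0511, 20)
```

Under these assumptions the t statistic is about −0.22 on 18 degrees of freedom, far from any conventional significance threshold and in line with the large reported p value.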
Fig. 1. The quantification cycle (Cq) values from real-time quantitative polymerase chain reaction of Conlon for 20 candidate genes. Three biological replicates of eight tissues are indicated by symbols. The x-axis is labeled by UniGene cluster annotations in geomean rank order from most stable on the left to least stable on the right. DPA, days postanthesis; DPG, days postgermination.
Fig. 2. A bar chart of the calculated stability values of 20 candidate genes across eight Conlon tissues. The y-axis measures their geomean stability value calculated through RefFinder, and the x-axis shows the UniGene cluster annotations in geomean rank order from most stable on the left to least stable on the right.
To test the potential for extending results to other members of the Poaceae, the Cq data from RT-qPCR experiments in HiFi oat were included with the barley Cq data to generate geomeans. Using those geomeans, actin was ranked first and GAPDH sixth (Table 2). Variability in the relative stabilities of reference genes was observed when all available Cq data for each tissue type were used to estimate geomeans (Fig. 4). For instance, the stability of all seven genes was similar for 5-DPA caryopses, but for other tissues there were differences among the genes.
DISCUSSION
Fig. 3. Box-and-whisker plots of the quantification cycle (Cq) data for candidates GAPDH (Hv. 22848), elongation factor EF-2 (Hv. 9509), malate dehydrogenase (Hv. 22901), 3-ketoacyl-CoA thiolase-like protein (Hv. 2973), superoxide dismutase (Hv. 3271), S-adenosylmethionine synthetase 4 (Hv. 20658), and actin (Hv. 23088), from real-time quantitative polymerase chain reaction of three biological replicates of seven tissues each of Conlon, Golden Promise, Harrington, and HiFi.
Table 2. The RefFinder geomean stability ranking (from left to right) of candidates GAPDH (Hv. 22848), elongation factor EF-2 (Hv. 9509), malate dehydrogenase (Hv. 22901), 3-ketoacyl-CoA thiolase-like protein (Hv. 2973), superoxide dismutase (Hv. 3271), S-adenosylmethionine synthetase 4 (Hv. 20658), and actin (Hv. 23088). Geomeans were calculated using three biological replicates of seven tissues of each cultivar or combination of cultivars.
Cultivar | Rank 1 (most stable) | Rank 2 | Rank 3 | Rank 4 | Rank 5 | Rank 6 | Rank 7 (least stable)
Conlon | Hv. 3271 | Hv. 9509 | Hv. 20658 | Hv. 2973 | Hv. 22901 | Hv. 23088 | Hv. 22848
Golden Promise | Hv. 22901 | Hv. 9509 | Hv. 23088 | Hv. 20658 | Hv. 2973 | Hv. 3271 | Hv. 22848
Harrington | Hv. 22848 | Hv. 2973 | Hv. 23088 | Hv. 9509 | Hv. 20658 | Hv. 22901 | Hv. 3271
HiFi | Hv. 23088 | Hv. 9509 | Hv. 22901 | Hv. 3271 | Hv. 2973 | Hv. 22848 | Hv. 20658
Combined Barley | Hv. 23088 | Hv. 9509 | Hv. 22901 | Hv. 20658 | Hv. 3271 | Hv. 2973 | Hv. 22848
Combined Poaceae | Hv. 23088 | Hv. 22901 | Hv. 9509 | Hv. 20658 | Hv. 3271 | Hv. 22848 | Hv. 2973
This study used publicly available gene expression data to justify the selection of reference genes expected to be expressed at relatively uniform levels throughout the barley life cycle and across many plant tissues, regardless of genotype. The inclusion in this study of multiple tissues sampled at different growth stages, and multiple cultivars that represent significant diversity within elite malting barley germplasm, was intended as a means to identify and eliminate as reference genes those that may have tissue- or cultivar-specific expression patterns. Similarly, the pool of reference gene candidates tested was based on publicly available expression data derived from multiple tissues and cultivars. The results, therefore, may be generally applicable to studies of other barley cultivars and tissues, but validation of the utility of these candidate reference genes for cultivars, tissue, or growth conditions not evaluated in this study may be necessary.
Empirical evaluation of the 20 reference candidate genes expected to be most stably expressed on the basis of UniGene EST profile identified some that were more uniformly expressed in Conlon than others. The relative stability ranking among the five most stable genes from Conlon within other cultivars was not uniform. This suggests that all were quite stable, and that the level of variation in TPM reported for experiments uploaded to the UniGene library is a good predictor of variation in expression estimates obtained using RT-qPCR analysis of experiments that compare barley cultivars grown under uniform conditions. The analysis of 655 genes based on expression data in the UniGene database was effective in identifying genes with stable expression relative to the commonly used actin and GAPDH genes. However, within the top 20 candidates, there was little correlation between predicted stability ranking and ranking based on the empirical analyses. This may indicate the limitations of predictions made from publicly available data. It is possible that data mining within RNA sequence datasets would produce more precise estimates of relative expression stability among stable genes that would be more accurately reflected in empirical measures. Alternatively, the low correlation between rankings may be due to fairly stable expression of all genes in the set, which would suggest that in relative terms, all of these genes may be suitable as reference genes, performing equal to or better than actin or GAPDH in some cultivars or tissues. In Conlon, actin (Hv. 23088) and the top novel candidate genes had similarly low geomean values, reflecting the relatively low levels of Cq variation observed among tissues. Indeed, these genes also had very low coefficients of variation in the UniGene data (CV < 0.3).
Data mining of the UniGene library identified both traditional reference genes and novel candidate reference genes among the 20 genes predicted to be most stably expressed in diverse barley tissues. This coincides with the expectation that, although traditional reference genes have been validated as stable in other studies, there may be untested genes that would better serve as expression references. In addition, the empirically tested genes performed similarly in HiFi oat compared with the barley cultivars. For crops with little published data on gene expression variation, selection of reference genes through data mining using model species data may prove as effective as selection on the basis of traditional usage.
It must be noted that this study looked at reference gene stability under normal plant growth conditions only.
Many gene expression experiments are aimed at identifying changes that occur with different environmental exposures. Some reference genes may have expression stabilities that are relative to the treatments applied. For example, Ferdous et al. (2015) found actin to be more stable than GAPDH through five stress treatments, but less stable under nitrate and salt treatments. Janská et al. (2013) recommended actin as a reference gene for contrasting gene expression under variable temperatures, and GAPDH under variable water treatments. The novel reference genes suggested by these experiments may prove to be more stable under some environmental treatments than others. Thus, it may be advisable to conduct empirical assessments to identify the most stable reference genes for a particular set of experimental conditions.
Similarly, the observed variability in the stability of reference genes in different tissues suggests that the choice of reference genes may vary depending on the tissues in which assays are conducted. In particular, the 0- and 5-DPA tissues exhibited both generally higher geomeans and greater Cq variation among samples. This implies a need for candidate gene selection targeted towards caryopses. Within the other tissues, the observed geomeans covered the range and Cq variation was less uniform. Discounting the 0- and 5-DPA tissues, EF-2 and actin had least variation in Cq values among samples within tissue.
Using the combined barley expression data and extending this to HiFi oat, three candidate reference genes emerged as the most uniformly stable. Actin (Hv. 23088) was the most stable candidate, using both types of combined data. Actin has been used as an expression reference gene in numerous plant and animal species, originally because of its relatively high level of expression in all cells and, perhaps, because of historical use as a reference in nonquantitative assays such as northern blots (reviewed in Huggett et al., 2005). Recent evaluations of reference genes have questioned the selection of traditional housekeeping genes, such as actin, and examined them for evidence of fluctuations in expression. For example, a detailed evaluation of relative expression during the Arabidopsis Heynh. lifecycle suggested that actin (and GAPDH) expression may be lower than the norm in pollen and seed samples (Czechowski et al., 2005). Actin expression in barley and oat was stable across the tissues used in this study, but reports of experimental conditions under which actin has proven unsuitable (Janská et al., 2013; Ferdous et al., 2015) argue the need for additional reference genes to include as controls.
Fig. 4. Box-and-whisker plots (top) of the quantification cycle (Cq) data from real-time quantitative polymerase chain reaction and calculated geomeans (bottom bars) for seven tissues (averaged over three biological replicates each of Conlon, Golden Promise, and Harrington) for candidates GAPDH (Hv. 22848), elongation factor EF-2 (Hv. 9509), malate dehydrogenase (Hv. 22901), 3-ketoacyl-CoA thiolase-like protein (Hv. 2973), superoxide dismutase (Hv. 3271), S-adenosylmethionine synthetase 4 (Hv. 20658), and actin (Hv. 23088). DPA, days postanthesis; DPG, days postgermination.
Elongation factor EF-2 (Hv. 9509) ranked second in stability when all barley data were combined and fell to third when oat data were added, but expression of this gene was among the top four most stable within all cultivars. Both elongation factors EF-1a and EF-2 act during translation elongation on the ribosome, but unlike EF-1a, EF-2 has seldom been used as a reference gene in plant crop science, vol. 58, january– february 2018
studies. EF-2 was validated as a reference gene for RTqPCR studies in Lycium barbarum L., where it was among the top five most stable (out of eight genes) based on analysis using the NormFinder algorithm (Wang et al., 2013). It was also among the 127 gene sequences evaluated as expression controls in tomato (Solanum lycopersicum L.) by Coker and Davies (2003) but was found to have a
greater than threefold range of expression in the available EST libraries. Differential expression of EF-2 between drought-susceptible and drought-tolerant rapeseed (Brassica napus L.) lines (Mohammadi et al., 2012) and after cotton (Gossypium hirsutum L.) infection with Fusarium oxysporum Schlecht. emend. Snyder & Hansen (Dowd et al., 2004) suggests that this gene may not be suitable as a reference under some treatments.
Finally, malate dehydrogenase (Hv. 22901) ranked the third most stably expressed when barley data were combined and second when oat data were added, largely because of less optimal performance in Harrington. This is not the first appearance of malate dehydrogenase in the plant gene expression literature. Malate dehydrogenase has been validated as a reference gene in peanut (Arachis hypogaea L.), where it was ranked third most stable out of eight genes by the geNorm algorithm (Jiang et al., 2011); in grapevine (Vitis vinifera L.), where it was ranked fourth to fifth using the same algorithm (Reid et al., 2006); and in cacao (Theobroma cacao L.), where it ranked first out of 10 genes in stability across various tissues based on NormFinder (Pinheiro et al., 2011). A malate dehydrogenase homologue was nominated by Faccioli et al. (2007) as a reference candidate gene for barley using a data-mining process aimed at identifying constitutively expressed genes based on their representation in >32 of 135 cDNA libraries in the TIGR database, but it was not empirically tested for expression stability in that study. Based on TPM CVs, cytosolic malate dehydrogenase was ranked as the 16th most stably expressed wheat (Triticum aestivum L.) UniGene cluster by Paolacci et al. (2009) but was not ranked within the top 10 most stable genes in any empirical study.
In conclusion, these analyses support the use of the traditional reference gene actin (Hv. 23088) as a barley reference gene in many applications. 
They also identify two additional reference genes, elongation factor EF-2 (Hv. 9509) and malate dehydrogenase (Hv. 22901), that are likely to perform well as expression reference genes in future studies for most tissues.
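The EST-based screens referred to in this discussion (a fold range of expression across libraries, and a coefficient of variation of transcripts per million, TPM) can be illustrated with a minimal sketch. The helper names `tpm_cv` and `fold_range`, the choice of the population standard deviation for the CV, and the TPM values are all assumptions for illustration; they do not reproduce the calculations of Coker and Davies (2003), Paolacci et al. (2009), or this study.

```python
from statistics import mean, pstdev


def tpm_cv(tpm_values):
    """Coefficient of variation (population SD / mean) of TPM across libraries."""
    return pstdev(tpm_values) / mean(tpm_values)


def fold_range(tpm_values):
    """Ratio of the highest to the lowest TPM; all values must be positive."""
    lowest = min(tpm_values)
    if lowest <= 0:
        raise ValueError("fold range is undefined when a library has TPM <= 0")
    return max(tpm_values) / lowest


# Hypothetical TPM values per EST library; NOT data from any cited study.
example_tpm = {
    "gene_a": [100.0, 110.0, 90.0, 100.0],
    "gene_b": [50.0, 200.0, 80.0, 150.0],
}

# Candidates ordered from most to least stable by TPM CV.
ranked = sorted(example_tpm, key=lambda gene: tpm_cv(example_tpm[gene]))

for gene in ranked:
    values = example_tpm[gene]
    print(f"{gene}: CV = {tpm_cv(values):.3f}, fold range = {fold_range(values):.2f}")
```

A low CV and a small fold range only predict stability; as the results above show, empirical validation in the tissues and cultivars of interest is still needed.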
Conflict of Interest
The authors declare that there is no conflict of interest.
Supplemental Material Available
Supplemental material for this article is available online.
Acknowledgments
Funding for this research was provided by the USDA-ARS, Project 5366-21000-031-00D, and by General Mills. The USDA is an equal opportunity employer.
References
Andersen, C.L., J.L. Jensen, and T.F. Ørntoft. 2004. Normalization of real-time quantitative reverse transcription-PCR data: A model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res. 64:5245–5250. doi:10.1158/0008-5472.CAN-04-0496
Bio-Rad. 2014. CFX Manager software version 3.21. Bio-Rad, Hercules, CA.
Coker, J.S., and E. Davies. 2003. Selection of candidate housekeeping controls in tomato plants using EST data. Biotechniques 35:740–748.
Czechowski, T., M. Stitt, T. Altmann, M.K. Udvardi, and W.R. Scheible. 2005. Genome-wide identification and testing of superior reference genes for transcript normalization in Arabidopsis. Plant Physiol. 139:5–17. doi:10.1104/pp.105.063743
Dowd, C., I.W. Wilson, and H. McFadden. 2004. Gene expression profile changes in cotton root and hypocotyl tissues in response to infection with Fusarium oxysporum f. sp. vasinfectum. Mol. Plant Microbe Interact. 17:654–667. doi:10.1094/MPMI.2004.17.6.654
Faccioli, P., G.P. Ciceri, P. Provero, A.M. Stanca, C. Morcia, and V. Terzi. 2007. A combined strategy of “in silico” transcriptome analysis and web search engine optimization allows an agile identification of reference genes suitable for normalization in gene expression studies. Plant Mol. Biol. 63:679–688. doi:10.1007/s11103-006-9116-9
Ferdous, J., Y. Li, N. Reid, P. Langridge, B.J. Shi, and P.J. Tricker. 2015. Identification of reference genes for quantitative expression analysis of microRNAs and mRNAs in barley under various stress conditions. PLoS One 10:e0118503.
Gene Codes Corporation. 2014. Sequencher version 5.2.4. Gene Codes Corp., Ann Arbor, MI.
Gottwald, S., P. Bauer, T. Komatsuda, U. Lundqvist, and N. Stein. 2009. TILLING in the two-rowed cultivar ‘Barke’ reveals preferred sites of functional diversity in the gene HvHox1. BMC Res. Notes 2:258. doi:10.1186/1756-0500-2-258
Harvey, B.L., and B.G. Rossnagel. 1984. Harrington barley. Can. J. Plant Sci. 64:193–194. doi:10.4141/cjps84-024
Hua, W., J. Zhu, Y. Shang, J. Wang, Q. Jia, and J. Yang. 2015. Identification of suitable reference genes for barley gene expression under abiotic stresses and hormonal treatments. Plant Mol. Biol. Report. 33:1002–1012. doi:10.1007/s11105-014-0807-0
Huggett, J., K. Dheda, S. Bustin, and A. Zumla. 2005. Real-time RT-PCR normalisation; strategies and considerations. Genes Immun. 6:279–284. doi:10.1038/sj.gene.6364190
Janská, A., J. Hodek, P. Svoboda, J. Zámečník, L.T. Prášil, E. Vlasáková, et al. 2013. The choice of reference gene set for assessing gene expression in barley (Hordeum vulgare L.) under low temperature and drought stress. Mol. Genet. Genomics 288:639–649. doi:10.1007/s00438-013-0774-4
Jiang, S., Y. Sun, and S. Wang. 2011. Selection of reference genes in peanut seed by real-time quantitative polymerase chain reaction. Int. J. Food Sci. Technol. 46:2191–2196. doi:10.1111/j.1365-2621.2011.02735.x
Lee, H.K., A.K. Hsu, J. Sajdak, J. Qin, and P. Pavlidis. 2004. Coexpression analysis of human genes across many microarray data sets. Genome Res. 14:1085–1094. doi:10.1101/gr.1910904
Mohammadi, P.P., A. Moieni, and S. Komatsu. 2012. Comparative proteome analysis of drought-sensitive and drought-tolerant rapeseed roots and their hybrid F1 line under drought stress. Amino Acids 43:2137–2152. doi:10.1007/s00726-012-1299-6
Munoz-Amatrain, M., A. Cuesto-Marcos, J.B. Endelman, J. Comadran, J.M. Bonman, H.E. Bockelman, et al. 2014. The USDA Barley CORE Collection: Genetic diversity, population structure, and potential for genome-wide association studies. PLoS One 9:e94688.
NCBI. 2014. National Center for Biotechnology Information. U.S. Natl. Library Med. http://www.ncbi.nlm.nih.gov (accessed 8 Oct. 2014).
Oliver, S. 2000. Proteomics: Guilt-by-association goes global. Nature 403:601–603. doi:10.1038/35001165
Oliver, R.E., N.A. Tinker, G.R. Lazo, S. Chao, E.N. Jellen, M.L. Carson, et al. 2013. SNP discovery and chromosome anchoring provide the first physically-anchored hexaploid oat map and reveal synteny with model species. PLoS One 8:e58068.
Ovesna, J., L. Kučera, K. Vaculová, K. Štrymplová, I. Svobodova, and L. Milella. 2012. Validation of the β-amy1 transcription profiling assay and selection of reference genes suited for a RT-qPCR assay in developing barley caryopsis. PLoS One 7:e41886. doi:10.1371/journal.pone.0041886
Paolacci, A.R., O.A. Tanzarella, E. Porceddu, and M. Ciaffi. 2009. Identification and validation of reference genes for quantitative RT-PCR normalization in wheat. BMC Mol. Biol. 10:11. doi:10.1186/1471-2199-10-11
Pfaffl, M.W., A. Tichopad, C. Prgomet, and T.P. Neuvians. 2004. Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper–Excel-based tool using pair-wise correlations. Biotechnol. Lett. 26:509–515. doi:10.1023/B:BILE.0000019559.84305.47
Pinheiro, T.T., C.G. Litholdo, Jr., M.L. Sereno, G.A. Leal, Jr., P.S.B. Albuquerque, and A. Figueira. 2011. Establishing references for gene expression analyses by RT-qPCR in Theobroma cacao tissues. Genet. Mol. Res. 10:3291–3305. doi:10.4238/2011.November.17.4
Rashid, A., T. Baldwin, M. Gines, P. Bregitzer, and K. Esvelt Klos. 2016. A high-throughput RNA extraction for sprouted single-seed barley (Hordeum vulgare L.) rich in polysaccharides. Plants 6:1. doi:10.3390/plants6010001
Reid, K.E., N. Olsson, J. Schlosser, F. Peng, and S.T. Lund. 2006. An optimized grapevine RNA isolation procedure and statistical determination of reference genes for real-time RT-PCR during berry development. BMC Plant Biol. 6:27. doi:10.1186/1471-2229-6-27
Russell, J.R., R.P. Ellis, W.T.B. Thomas, R. Waugh, J. Provan, A. Booth, et al. 2000. A retrospective analysis of spring barley germplasm development from ‘foundation genotypes’ to currently successful cultivars. Mol. Breed. 6:553–568. doi:10.1023/A:1011372312962
SAS Institute. 2014. The SAS Enterprise guide. Version 7.1. SAS Inst., Cary, NC.
Silver, N., S. Best, J. Jiang, and S.L. Thein. 2006. Selection of housekeeping genes for gene expression studies in human reticulocytes using real-time PCR. BMC Mol. Biol. 7:33. doi:10.1186/1471-2199-7-33
UniGene. 2014. National Center for Biotechnology UniGene resource. U.S. Natl. Library Med. http://www.ncbi.nlm.nih.gov/unigene (accessed 25 June 2014).
USDA. 1997. Plant variety protection application for barley ‘Conlon’. USDA. https://apps.ams.usda.gov/CMS/AdobeImages/009700243.pdf (accessed 25 Oct. 2017).
Vandesompele, J., K. DePreter, F. Pattyn, B. Poppe, N. Van Roy, A. De Paepe, and F. Speleman. 2002. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 3:RESEARCH0034.
Wang, L., Y. Wang, and P. Zhou. 2013. Validation of reference genes for quantitative real-time PCR during Chinese wolfberry fruit development. Plant Physiol. Biochem. 70:304–310. doi:10.1016/j.plaphy.2013.05.038
Wolfe, C.J., I.S. Kohane, and A.J. Butte. 2005. Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks. BMC Bioinformatics 6:227. doi:10.1186/1471-2105-6-227
Xie, F., P. Xiao, D. Chen, L. Xu, and B. Zhang. 2012. miRDeepFinder: A miRNA analysis tool for deep sequencing of plant small RNAs. Plant Mol. Biol. 80:75–84. doi:10.1007/s11103-012-9885-2
Zmienko, A., A. Samelak-Czajka, M. Goralski, E. Sobieszczuk-Nowicka, P. Kozlowski, and M. Figlerowicz. 2015. Selection of reference genes for qPCR- and ddPCR-based analyses of gene expression in senescing barley leaves. PLoS One 10:e0118226. doi:10.1371/journal.pone.0118226