Plant Molecular Biology (2005) 59:533–551 DOI 10.1007/s11103-005-2498-2
Springer 2005
Structural, functional, and phylogenetic characterization of a large CBF gene family in barley
Jeffrey S. Skinner1,2, Jarislav von Zitzewitz2, Péter Szűcs2,3, Luis Marquez-Cedillo2, Tanya Filichkin2, Keenan Amundsen4,6, Eric J. Stockinger5, Michael F. Thomashow4, Tony H.H. Chen1 and Patrick M. Hayes2,*
1Department of Horticulture, College of Agriculture, Oregon State University, Corvallis, OR 97331, USA; 2Department of Crop and Soil Science, College of Agriculture, Oregon State University, Corvallis, OR 97331, USA (*author for correspondence; e-mail [email protected]); 3Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462, Martonvásár, Hungary; 4Department of Crop and Soil Sciences, Michigan State University, East Lansing, MI 48824, USA; 5Department of Horticulture and Crop Science, The Ohio State University/OARDC, Wooster, OH 44691, USA; 6United States National Arboretum, USDA-ARS, Washington, DC 20002, USA
Received 1 July 2005; accepted in revised form 27 August 2005
Key words: barley, CBF, cereal, HvCBF, low temperature tolerance, Triticeae
Abstract
CBFs are key regulators in the Arabidopsis cold signaling pathway. We used Hordeum vulgare (barley), an important crop and a diploid Triticeae model, to characterize the CBF family from a low temperature tolerant cereal. We report that barley contains a large CBF family consisting of at least 20 genes (HvCBFs) comprising three multigene phylogenetic groupings designated the HvCBF1-, HvCBF3-, and HvCBF4-subgroups. For the HvCBF1- and HvCBF3-subgroups, there are comparable levels of phylogenetic diversity among rice, a cold-sensitive cereal, and the cold-hardy Triticeae. For the HvCBF4-subgroup, while similar diversity levels are observed in the Triticeae, only a single ancestral rice member was identified. The barley CBFs share many functional characteristics with dicot CBFs, including a general primary domain structure, transcript accumulation in response to cold, specific binding to the CRT motif, and the capacity to induce cor gene expression when ectopically expressed in Arabidopsis. Individual HvCBF genes differed in response to abiotic stress types and in the response time frame, suggesting different sets of HvCBF genes are employed relative to particular stresses. HvCBFs specifically bound monocot and dicot cor gene CRT elements in vitro under both warm and cold conditions; however, binding of HvCBF4-subgroup members was cold dependent. The temperature-independent HvCBFs activated cor gene expression at warm temperatures in transgenic Arabidopsis, while the cold-dependent HvCBF4-subgroup members of three Triticeae species did not. These results suggest that in the Triticeae – as in Arabidopsis – members of the CBF gene family function as fundamental components of the winter hardiness regulon.
Abbreviations: CRT, C-repeat; DRE, dehydration response element; EST, expressed sequence tag; gDNA, genomic DNA; LT, low temperature; PCR, polymerase chain reaction; QTL, quantitative trait locus; UTR, untranslated region
Introduction
Plants show a broad range of ability to acclimate to and survive exposure to low temperature (LT) and freezing conditions, traits that are of great ecological and economic importance. As a consequence, the physiology and genetics of LT tolerance are areas of intense study (Thomashow, 1999; Cattivelli et al., 2002). Recently, key genes involved in the regulation of LT tolerance have been isolated and characterized in Arabidopsis thaliana (Stockinger et al., 1997; Gilmour et al., 1998; Liu et al., 1998; Haake et al., 2002; Chinnusamy et al., 2003). This model plant system has been used to establish that the cold acclimation process induces expression of a suite of LT-responsive cor (cold-regulated) genes, whose products are thought to collectively impart the necessary physiological and biochemical alterations that increase a plant’s LT tolerance capacity (Thomashow, 1999). Many of these cor genes contain in their promoters one or more CRT (C-repeat) elements, which have the core motif CCGAC and are responsible for the genes’ LT-responsiveness; CRT and DRE (dehydration response element) elements (Liu et al., 1998) have the same CCGAC core motif and are considered analogous. The CBF (C-repeat Binding Factor) family of genes was first characterized in Arabidopsis (Stockinger et al., 1997; Gilmour et al., 1998; Liu et al., 1998) and encodes LT-induced transcription factors that bind the CRT element and induce cor expression (Stockinger et al., 1997; Gilmour et al., 1998; Jaglo et al., 2001). Other abiotic stresses – e.g., drought and salt – can also induce expression of some CBF genes, and therefore subsequent cor gene expression (Haake et al., 2002; Dubouzet et al., 2003). CBFs are members of the AP2/EREBP superfamily and are distinguished from other members by the presence of CBF signature sequence motifs (PKK/RPAGRxKFxETRHP and DSAWR) directly flanking the AP2 domain (Jaglo et al., 2001).
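The two sequence patterns named in this paragraph, the CCGAC core of CRT/DRE elements and the CBF signature motifs, lend themselves to a simple pattern search. The following is a minimal sketch, not the method used in this study: the function names and the toy sequences are hypothetical, the signature motifs are matched exactly as written (x = any residue), and real CBFs, monocot ones in particular, may deviate from this consensus.

```python
import re

CRT_CORE = "CCGAC"  # core motif shared by CRT and DRE elements


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_crt_cores(promoter: str):
    """List (0-based position, strand) for each CCGAC core on either strand."""
    seq = promoter.upper()
    rc_core = reverse_complement(CRT_CORE)  # GTCGG marks a minus-strand core
    hits = []
    for i in range(len(seq) - len(CRT_CORE) + 1):
        window = seq[i:i + len(CRT_CORE)]
        if window == CRT_CORE:
            hits.append((i, "+"))
        elif window == rc_core:
            hits.append((i, "-"))
    return hits


# Signature motifs as written in the text: PKK/RPAGRxKFxETRHP and DSAWR.
# "K/R" becomes the character class [KR]; "x" becomes "." (any residue).
SIG_UPSTREAM = re.compile(r"PK[KR]PAGR.KF.ETRHP")
SIG_DOWNSTREAM = re.compile(r"DSAWR")


def has_cbf_signatures(protein: str) -> bool:
    """True if both motifs occur, the first one upstream of the second."""
    seq = protein.upper()
    first = SIG_UPSTREAM.search(seq)
    if first is None:
        return False
    # The AP2 domain lies between the motifs, so search after the first hit.
    return SIG_DOWNSTREAM.search(seq, first.end()) is not None


# Toy inputs, invented for illustration only (not real gene sequences).
toy_promoter = "TTACCGACATGGTCGGAA"
toy_protein = "MAPKKPAGRTKFRETRHPVYRGVAADSAWRLL"
print(find_crt_cores(toy_promoter))     # [(3, '+'), (11, '-')]
print(has_cbf_signatures(toy_protein))  # True
```

Scanning both strands matters because a promoter element can sit in either orientation; the sketch reports the strand so that hits can be interpreted relative to the gene.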
The Arabidopsis genome encodes six CBF genes (AtCBFs), and ectopic transgenic expression of Arabidopsis AtCBF1-AtCBF4 both improves plant LT tolerance and activates components of the CBF regulon under non-acclimating conditions, including cor genes that harbor CRT elements within their promoters (Stockinger et al., 1997; Gilmour et al., 1998; Jaglo-Ottosen et al., 1998; Liu et al., 1998; Gilmour et al., 2000; Jaglo et al., 2001; Haake et al., 2002; Sakuma et al., 2002).
Extension of results from model plant systems to economically important crops is an important component of identifying genes conferring key agronomic traits. The grasses, or Poaceae, contain the economically most important crop plant families, and include the cereal crops rice (Oryza sativa), corn (Zea mays), and members of the Triticeae such as wheat (Triticum aestivum), barley (Hordeum vulgare), and rye (Secale cereale). Within the cereals, a broad LT tolerance range, from completely sensitive to extremely hardy, is observed. Cereals of subtropical origin, like rice and maize, are sensitive to cold and do not survive freezing temperatures, whereas a wide range of phenotypic variation for LT tolerance occurs among temperate cereals. The Triticeae form a homogeneous genetic system with a high degree of synteny, and comparative genetic studies confirm that the genetic determinants of winter hardiness traits are conserved between members that possess the trait, making results from one species frequently applicable to other members of the cereal tribe (Dubcovsky et al., 1998; Mahfoozi et al., 2000). In cereals, LT tolerance shows complex inheritance and has been analyzed using quantitative trait locus (QTL) analysis tools (Hayes et al., 1997; Vágújfalvi et al., 2003; Francia et al., 2004). Determination of the genes controlling LT QTLs is a subject of intensive research, and one class of candidates is the CBF genes (Vágújfalvi et al., 2003; Francia et al., 2004). CBF-like genes are represented in most dicot and monocot EST collections and have been cloned from a variety of plant backgrounds, including monocots (Jaglo et al., 2001; Choi et al., 2002; Xue, 2002, 2003).
Many cold-responsive monocot cDNAs and genes have been cloned (summarized in Cattivelli et al., 2002) and the promoter regions of the barley cor genes HVA1 (Straub et al., 1994), cor14b (Dal Bosco et al., 2003), and Dhn5 and Dhn8 (Choi et al., 1999), as well as wcs120 (the wheat Dhn5 ortholog) (Vazquez-Tello et al., 1998) contain one or more CRT motifs each, suggesting CBF factors may regulate their LT-responsiveness. Together, these results imply that CBF response pathways are likely present in most higher plants and could be a component of cereal LT tolerance. It is of interest, therefore, to determine if CBF genes are also involved in the LT tolerance regulon of important crop plants such as the cereals, and whether they are a source of genetic variation for differences in LT capacity.
Hordeum vulgare subsp. vulgare (barley) is both an economically important crop and a model system for the study and dissection of molecular, genetic, and physiological components of LT tolerance in cereals, and the Triticeae in particular. Barley is a self-pollinated diploid, abundant genetic variation for LT tolerance occurs in the primary gene pool, and an ever-expanding set of tools exists for genetic and molecular analysis (reviewed in Hayes et al., 2003), including multiple mapping populations, arrayed BAC clones, a large EST database, and a microarray chip. The H. vulgare cv. Dicktoo genotype is a well-characterized ‘facultative winter’ barley variety with a high degree of LT tolerance relative to the cultivated barley germplasm range (Kolar et al., 1991; Hayes et al., 1997; von Zitzewitz et al., 2005). Dicktoo has been utilized as a parent for generation of three mapping populations to analyze the winter hardiness traits of LT tolerance, vernalization requirement, and photoperiod sensitivity: Dicktoo × Morex (Hayes et al., 1997), Dicktoo × Kompoti korai (Karsai et al., 2005), and Dicktoo × Plaisant (Karsai et al., 1997). The Dicktoo × Morex population in particular is a barley research community standard for mapping traits associated with the winter growth habit (Pan et al., 1994; Hayes et al., 1997; Mahfoozi et al., 2000; Fowler et al., 2001; Karsai et al., 2001) as well as candidate genes for stress-related traits (e.g., Dehydrins) (van Zee et al., 1995; Choi et al., 2000).
Accordingly, we have focused on the Dicktoo genotype in the current study to isolate and characterize the barley CBF gene family and have determined (i) the number and complexity of CBF genes in barley, (ii) their phylogenetic context relative to other LT-tolerant and LT-sensitive cereals, (iii) the expression characteristics of these genes in response to abiotic stress, and (iv) whether key functional characteristics of Arabidopsis CBFs extend to the barley homologs. Our findings expand the current understanding of the CBF cold response pathway to LT-tolerant monocots and demonstrate the importance of extending results from model non-crop systems into the economically important crops.
Materials and methods
Plant material
Hordeum vulgare subsp. vulgare genotypes Dicktoo, Morex, and 88Ab536 were used. Plants for genomic DNA (gDNA) extraction and expression studies were grown under greenhouse conditions and supplemented with artificial light (16 h photoperiod). For expression studies, plants were grown to the visible emergence of the third leaf, then transferred to treatment conditions. For cold treatment, plants were transferred to a cold room (2 °C; 8/16 h light/dark) and bulk aerial tissue collected after 1, 2, 4, 8, 24, and 96 h. For NaCl treatment, pots were transferred to a growth room (20 °C), saturated with 400 mM NaCl, and bulk aerial tissue collected after 6 h (plants were visibly wilted following treatment). For desiccation treatment, intact aerial tissue was harvested and left at room temperature on a bench top for 1 and 6 h prior to collection; Dicktoo and Morex tissue had respective final relative water contents of 63% and 73% of starting values after 6 h desiccation. For ABA treatment, plants were sprayed with 100 µM ABA solution and collected after 3 h; control plants were sprayed with water.
Gene cloning and nomenclature
Sequence data from this article have been deposited with the EMBL/GenBank data libraries and respective accession numbers are listed in Tables 1 and S1. Primer sequences used for gene cloning are listed in Table S3. Wheat and sorghum CBF sequences were determined via direct sequencing of EST clones (Table 1). Rice CBF (designated OsDREB1s after Dubouzet et al., 2003; see below) sequences determined via the reported rice genome for OsDREB1A-OsDREB1F were verified by amplifying, cloning, and sequencing each gene from O. sativa (cv. Nipponbare) genomic DNA (gDNA); OsDREB1B.1 was also determined via sequencing of three independent ESTs due to discrepancies with a previously reported OsDREB1B version (Tables 1 and S2; see supplemental data, Section S2).
Rice CBF sequences for OsDREB1G-OsDREB1J were derived directly from the respective reported BAC or scaffold clone sequences listed for each gene in Table S2. All OsDREB1 gene sequences utilized for comparisons are from the O. sativa L. ssp. japonica
Table 1. Cloned monocot CBF genes, alleles, and polypeptide characteristics.

CBF gene        Accession number  HvCBF-subgroup  Isolation method(a)  Protein(b): aa   kD     pI     AD pI
Barley
HvCBF1-Dt       AY785837          1               2                    217              23.1   5.27   3.42
HvCBF2A-Dt      AY785841          4               2                    221              24.6   4.97   3.83
HvCBF2B-Dt      DQ097684          4               4                    221              24.4   5.12   3.85
HvCBF3-Dt       AY785845          3               2                    249              26.3   5.21   3.96
HvCBF4A-Dt      AY785849          4               3                    225              24.5   8.40   4.07
HvCBF4B-Dt      AY785850          4               3                    225              24.5   8.40   4.07
HvCBF4D-Ab(c)   AY785852          4               2                    225              24.4   6.96   3.97
HvCBF5-Dt       AY785855          1               2                    214              22.8   6.51   3.93
HvCBF6-Dt       AY785860          3               2                    244              26.0   5.07   3.82
HvCBF7-Dt       AY785864          1               2                    219              23.0   5.74   3.73
HvCBF8A-Dt      AY785868          3               2                    –                –      –      –
HvCBF8B-Dt      AY785871          3               2                    –                –      –      –
HvCBF8C-Dt      AY785875          3               2                    –                –      –      –
HvCBF9-Dt       AY785878          4               2,3(d)               291              31.2   8.76   4.51
HvCBF10A-Dt     AY785882          3               2                    241              25.6   4.89   3.79
HvCBF10B-Dt     AY785885          3               2                    227              24.4   5.25   3.97
HvCBF11-Dt      AY785890          1               2                    218              23.7   5.64   3.98
HvCBF12-Dt      DQ095157          3               4                    244              25.9   5.97   4.24
HvCBF13-Dt      DQ095158          3               4                    252              27.4   6.53   4.36
HvCBF14-Dt      DQ095159          4               2                    214              23.4   6.97   4.34
Rice(e)
OsDREB1A        AF300970          3               1,2                  238              25.4   5.05   3.71
OsDREB1B.1      AY785894          4               1,2                  218              23.2   5.22   3.51
OsDREB1C        AP001168          1               1,2                  214              23.1   5.01   3.48
OsDREB1D        AY785895          3               1,2                  253              27.7   10.18  9.08
OsDREB1E        AY785896          1               1,2                  219              23.9   5.44   3.85
OsDREB1F        AY785897          1               1,2                  219              23.8   5.81   4.08
OsDREB1G        AP005775          1               1                    224              23.9   5.14   3.64
OsDREB1H        AAAA01001957      3               1                    246              25.5   5.29   3.66
OsDREB1I        AP004632          3               1                    251              26.8   4.64   3.31
OsDREB1J        AP004632          3               1                    236              24.5   5.54   3.86
Sorghum
SbCBF5          AY785898          1               1                    249              26.2   4.75   3.64
SbCBF6          AY785899          3               1                    235              25.1   4.96   3.50
Rye
ScCBF22         AF370730          4               5                    270              29.2   9.00   4.92
ScCBF24         AF370729          4               5                    268              28.9   9.47   4.90
ScCBF31         AF370728          4               5                    212              23.3   8.51   4.29
Wheat
TaCBF1          AF376136          4               5                    212              23.3   7.78   4.18
TaCBF2          AY785900          4               1                    225              25.1   5.11   3.91
TaCBF5          AY785902          1               1                    219              23.1   5.06   3.66
TaCBF6          AY785903          3               1                    242              25.9   4.95   3.75
TmCBF7          AY785904          1               1                    275              28.9   5.27   3.60
TaCBF9          AY785905          4               1                    269              28.7   8.73   4.61
TaCBF11         AY785906          1               1                    218              23.7   6.18   4.05
TaCBF14         AY785901          4               1                    214              23.6   6.53   4.33
Maize
ZmCBF2          AF450481          3               5                    267              27.7   5.97   3.84

(a) Isolation/identification method codes: 1: EST, BAC, and/or PAC clone; 2: PCR amplified from gDNA; 3: cDNA library screen; 4: gDNA library screen; 5: gene reported in GenBank.
(b) Predicted protein features: length in amino acids (aa), molecular mass (kD), total protein isoelectric point (pI), acidic C-terminal domain isoelectric point (AD pI); no values (–) given for HvCBF8 pseudogenes.
(c) HvCBF4D-Ab protein characteristics based on inclusion of primer-based terminal residues.
(d) The 5′ truncated HvCBF9-Dt cDNA insert is reported under accession AY785908.
(e) For rice genes, a single representative clone is given; a detailed clone list is available in Table S2.
cv. Nipponbare genotype, except for OsDREB1H, which is from the O. sativa L. ssp. indica cv. 93–11 genotype.
Barley CBFs were cloned via a combination of EST analyses, cDNA and gDNA library screens, and PCR-based amplification. Total RNA for cDNA library construction was purified as in Chang et al. (1993) from Dicktoo 4 and 24 h cold-treated tissue. Poly(A+) RNA was purified using a Poly(A)Purist Kit (Ambion, Austin, TX) and used for cDNA library construction with a Lambda ZAP-cDNA Synthesis kit according to the manufacturer’s instructions (Stratagene, San Diego, CA). A Dicktoo gDNA library was constructed using the Lambda FIX II kit (Stratagene). The isolation method (EST, PCR, etc.) of each barley HvCBF gene and allele is indicated in Tables 1 and S1. All barley HvCBF genes isolated via PCR were amplified from gDNA, and clones of two independent PCR reactions were sequenced and verified to contain identical sequences; only the non-primer portion of each sequenced amplicon was reported to GenBank. HvCBF gene alleles are designated by a dash following the HvCBF# designation and a two-letter abbreviation for the respective barley genotype allele (Ab: cv. 88Ab536, Bk: cv. Barke, Dt: cv. Dicktoo, Hn: cv. Halcyon, Mx: cv. Morex, Op: cv. Optic). As an example, HvCBF9-Dt is the barley HvCBF9 gene allele from the Dicktoo genotype. Calculations of protein molecular mass and pI values were done using the program Compute pI/Mw (http://ca.expasy.org/tools/pi_tool.html). For acidic C-terminal domain pI calculations, the region from the first acidic amino acid occurring after the second CBF signature motif through to the last amino acid was utilized.
Monocot CBF alignments and phylogenetic analysis
Monocot CBFs and accession numbers used for alignments are in Table 1.
Arabidopsis AtCBF1 (U77378) and AtCBF4 (NM_124578) were used as representative dicot CBFs and OsDREB2A (AF300971), a closely related monocot AP2 domain-containing protein lacking the flanking CBF signature sequences, was utilized as a CBF-related outlier (see Dubouzet et al., 2003). Protein sequences were aligned using ClustalW, and refined by hand using GeneDoc Version 2.6 software (http://www.psc.edu/biomed/genedoc). Phylogenetic analyses on refined alignments were conducted using MEGA Version 2.1 software (http://www.megasoftware.net/). Phylogenetic trees were generated using the MEGA Version 2.1 Neighbor Joining and Minimum Evolution methods with default settings and 1000 bootstrap replications.
DNA and RNA isolation and gel blot analysis
Barley and Oryza sativa L. ssp. japonica cv. Nipponbare gDNA was isolated using Qiagen DNeasy kits (Qiagen, Valencia, CA). RNA isolation, gel electrophoresis, and blotting to Nytran nylon membranes were performed as previously described (Skinner and Timko, 1998). RNA blots were probed using Ultrahyb (Ambion) and washed following the manufacturer’s guidelines. HvCBF probes excluded the AP2 domain and consisted of only the C-terminal domain and 3′ UTR to minimize HvCBF cross hybridization. Barley Dhn5 (Choi et al., 1999), cor14b (Dal Bosco et al., 2003), and 18S rRNA (EST BF620814) probes were also utilized for northern analysis. Labeled probes were generated as directed using a High Prime Labeling Kit (Roche Biochemicals, Indianapolis, IN).
Gel shift assays
The coding regions of Arabidopsis AtCBF1 and Dicktoo HvCBF2A, HvCBF3, HvCBF4A, and HvCBF7 alleles were amplified using Pfx polymerase (Invitrogen, Carlsbad, CA) with the following respective primer sets: AtCBF1.003/AtCBF1.005, HvCBF2.006/HvCBF2.008, HvCBF3.008/HvCBF3.007, HvCBF4.009/HvCBF4.011, and HvCBF7.007/HvCBF7.008 (see supplemental data, Section S4, for primer sequences); HvCBF4A and HvCBF4B encode identical polypeptides, therefore HvCBF4A results are representative of both gene products. Products were directionally cloned into pET101/D-TOPO (Invitrogen) in frame with the 6×His tag and sequence verified. Purification of IPTG-induced E. coli whole cell extracts was done using a modified protocol based on Foster et al. (1992); see supplemental data (Section S4) for protocol details.
Complementary oligos encoding wildtype or mutant versions of a CRT present as a direct repeat (supplemental data, Section S4) from three cor gene promoters (COR15a, Dhn5, and cor14b) were designed to form XbaI and BamHI compatible overhangs when annealed and cloned into XbaI/BamHI-digested pBluescript-KS. Labeled probes and unlabeled (cold) competitor DNAs were prepared from these constructs via a PCR-based protocol (Schowalter and Sommer, 1989) and gel purified using a QIAEX II Gel Extraction Kit (Qiagen). DNA binding assays contained total E. coli protein extract harboring one of the above CBF proteins or LacZ, 20 mM HEPES (pH 7.5), 40 mM KCl, 1 mM EDTA, 0.5 mM DTT, 1 µg/µl sonicated Salmon Sperm DNA, 10% glycerol, and 25 fmol labeled probe with or without 50× (1250 fmol) cold competitor DNA. For binding assays, 0.1 µg (AtCBF1), 0.5 µg (HvCBF3), or 3.0 µg (HvCBF2, HvCBF4, HvCBF7 and LacZ) of total protein was utilized. Reactions were assembled minus labeled probe, incubated five minutes, labeled probe added, and reactions incubated 30 min before complex fractionation on a 5% acrylamide/0.5× TBE PAGE gel at 10 V/cm. Gels were dried onto ZetaBind membrane (BioRad, Hercules, CA) and exposed and scanned using an MD-SI PhosphorImager system (Amersham Biosciences, Piscataway, NJ). All steps (reaction assembly through fractionation) were conducted at room temperature (20 °C) for warm assays and at 2 °C for cold assays.
Transgenic Arabidopsis studies
The coding regions of HvCBF3-Dt, HvCBF4-Dt, HvCBF6-Dt, ScCBF22, ScCBF24, ScCBF31, and TaCBF1 were individually subcloned into the plant binary transformation vector pGA643 under control of the constitutive CaMV 35S promoter and used to transform A. thaliana ecotype Wassilewskija (WS-2) via the floral dip method of Clough and Bent (1998). Homozygous seeds from second-generation transgenic lines were utilized for studies. Tissue was collected from two nonacclimated transgenic Arabidopsis lines per construct grown in a controlled growth chamber at 20 °C for 15 days; a vector-only control line subjected to 24 h of cold (4 °C) following day 15 was also collected.
RNA was extracted using an RNA Isolation Kit (Qiagen), blotted, and probed as described above, except full-length gene probes to each transgene were utilized. The ScCBF31 probe was used to evaluate both the ScCBF31 and TaCBF1 lines as the closely related genes readily cross-hybridize. Probes to COR6.6, COR15a, COR47, COR78, and AtCBF1 were prepared as previously described (Jaglo-Ottosen et al., 1998). Whole plant freeze tests (Section S5) were performed as in Vlachonasios et al. (2003).
Results
Identification and isolation of barley CBF genes
BLAST searches were conducted against the barley EST database collection (www.ncbi.nlm.nih.gov) using the reported CBF sequences of Arabidopsis, wheat, and rye (Stockinger et al., 1997; Gilmour et al., 1998; Jaglo et al., 2001). Analysis of all barley ESTs displaying strong similarity to the queried genes and/or containing sequences similar to the CBF signature motifs revealed that nine distinct barley CBF genes were present. A representative cDNA clone corresponding to each of the nine genes was sequenced. These EST cDNAs originated from different barley varieties and correspond to HvCBF1-Mx, HvCBF4B-Mx, HvCBF5-Op, HvCBF6-Mx, HvCBF7-Mx, HvCBF8B-Bk, HvCBF9-Mx, HvCBF10B-Op, and HvCBF11-Op (Table S1). To isolate Dicktoo alleles, we utilized a combination of cDNA and genomic DNA library screens and reduced-stringency PCR. During the course of this work, three barley CBFs designated HvCBF1, HvCBF2, and HvCBF3 were independently reported (Xue, 2002, 2003; Choi et al., 2002), two of which – HvCBF2 and HvCBF3 – were not present in the EST set and were also targeted for Dicktoo allele cloning (Table S1). During the Dicktoo allele isolations, eight additional CBF genes (HvCBF2A, HvCBF4A, HvCBF8A, HvCBF8C, HvCBF10A, HvCBF12, HvCBF13, and HvCBF14) were cloned, and the Dicktoo genotype therefore harbors at least 19 CBF genes (Table 1). We isolated a 20th barley gene, HvCBF4D, from the facultative winter genotype 88Ab536, which is absent from Dicktoo based on current data (supplemental data, Section S1). Thus, a total of 20 distinct CBF genes have been isolated from barley to date (Table 1). The 20 barley CBF genes fall into 14 distinct HvCBF gene families (Figure 1; supplemental data, Section S1).
To be consistent with the established HvCBF1 to HvCBF3 nomenclature, we assigned consecutive numbers to each novel barley CBF form in order of isolation, starting with HvCBF4, thus designating the families HvCBF1 through HvCBF14. Similar but distinct genes that display high similarity at the nucleotide level along their entire length were assigned as members of a gene subfamily and given the same number and sequential letters to reflect this close relationship. These consist of the HvCBF2 (2A, 2B), HvCBF4 (4A, 4B, 4D), HvCBF8 (8A, 8B, 8C), and HvCBF10 (10A, 10B) families. Pseudogenes also account for part of the barley CBF family size, as the three HvCBF8 family members encode frame-shifted pseudogenes (supplemental data, Section S1, Figure S4); additional pseudogenes consisting of partial HvCBF gene fragments are present in barley (Stockinger et al., 2005). The monocot CBF genes, method of isolation, and corresponding GenBank accession numbers are summarized in Table 1; 19 CBF genes from monocots other than barley were also identified during the course of this work (below).
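The naming scheme described above (family number, optional subfamily letter, and a two-letter genotype code after the dash) is regular enough to parse mechanically. The sketch below is an illustration only: the regular expression and the function names are assumptions rather than part of the published nomenclature, while the gene names are those listed for barley in Table 1. Grouping by family number recovers the 20 genes, the 14 families, and the four multi-member subfamilies stated in the text.

```python
import re
from collections import defaultdict

# Genotype abbreviations as defined in Materials and methods.
GENOTYPES = {"Ab": "88Ab536", "Bk": "Barke", "Dt": "Dicktoo",
             "Hn": "Halcyon", "Mx": "Morex", "Op": "Optic"}

# HvCBF + family number + optional subfamily letter + optional -genotype code.
# This pattern is an assumption made for illustration.
NAME = re.compile(r"^HvCBF(\d+)([A-Z]?)(?:-([A-Z][a-z]))?$")


def parse_hvcbf(name):
    """Split a name such as 'HvCBF4D-Ab' into family, member, and genotype."""
    match = NAME.match(name)
    if match is None:
        raise ValueError(f"not an HvCBF designation: {name!r}")
    number, letter, code = match.groups()
    return {
        "family": int(number),
        "member": letter or None,                       # e.g. 'A' in HvCBF2A
        "genotype": GENOTYPES[code] if code else None,  # KeyError if unknown
    }


def group_by_family(names):
    """Map each family number to the list of names that belong to it."""
    families = defaultdict(list)
    for name in names:
        families[parse_hvcbf(name)["family"]].append(name)
    return dict(families)


# Barley entries of Table 1.
TABLE1_BARLEY = [
    "HvCBF1-Dt", "HvCBF2A-Dt", "HvCBF2B-Dt", "HvCBF3-Dt", "HvCBF4A-Dt",
    "HvCBF4B-Dt", "HvCBF4D-Ab", "HvCBF5-Dt", "HvCBF6-Dt", "HvCBF7-Dt",
    "HvCBF8A-Dt", "HvCBF8B-Dt", "HvCBF8C-Dt", "HvCBF9-Dt", "HvCBF10A-Dt",
    "HvCBF10B-Dt", "HvCBF11-Dt", "HvCBF12-Dt", "HvCBF13-Dt", "HvCBF14-Dt",
]

families = group_by_family(TABLE1_BARLEY)
multi_member = sorted(f for f, genes in families.items() if len(genes) > 1)
print(len(TABLE1_BARLEY), len(families), multi_member)  # 20 14 [2, 4, 8, 10]
```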
The increased CBF gene family size in barley is characteristic of other cereals
The Arabidopsis and poplar genomes encode six CBF genes each (Haake et al., 2002; Benedict et al., 2005). To determine whether the larger barley CBF family size was representative of cereals, we analyzed the two reported rice genome draft sequences (Goff et al., 2002; Yu et al., 2002) and the extensive wheat EST collection (www.ncbi.nlm.nih.gov) for CBF genes, using the same criteria as the barley EST database searches. Ten full-length rice CBFs were identified through analysis of the genomic and EST sequence data sets (Table 1); three pseudogenes and a partial novel CBF gene terminating at the end of a scaffold clone were also present in rice (supplemental data, Section S2), indicating the total rice CBF family size is at least 14 members. Four of the rice CBFs were identical to those reported by Dubouzet et al. (2003), who used the alternate CBF designations OsDREB1A through OsDREB1D; we utilize the OsDREB1B sequence we report (AY785894) here, termed OsDREB1B.1, due to sequence discrepancies with the original OsDREB1B report (supplemental data, Section S2). For consistency, the six additional full-length rice CBFs were designated OsDREB1E through OsDREB1J (Table 1). A summary of the rice CBFs and their occurrence in the indica and japonica sequence collections is provided in Table S2.
Analysis of the extensive wheat EST collection present in GenBank demonstrated that wheat homologs to the majority of the barley CBF families are present (data not shown). We obtained and sequenced EST clones harboring full-length T. aestivum or T. monococcum cDNA homologs to about half the barley CBF gene families; these are designated TaCBF2, TaCBF5, TaCBF6, TmCBF7, TaCBF9, TaCBF11, and TaCBF14 (Table 1) to reflect their relationship to their closest barley homolog. An additional wheat CBF (TaCBF1), reported by Jaglo et al. (2001), is closely related to the TaCBF14 and HvCBF14 genes. Based on the EST survey, wheat contains a similar number of non-homoeologous CBFs relative to barley, and is predicted to encode additional members not represented in the EST collection, as is the case for barley. Two Sorghum bicolor CBF full-length ESTs were also identified and sequenced; these genes are designated SbCBF5 and SbCBF6, reflecting their relationships to HvCBF5 and HvCBF6, respectively (Figure 1). In sum, these data indicate that the increased size of the CBF family is a characteristic of cereals, irrespective of LT tolerance potentials.
Barley CBF phylogenetic complexity is representative of monocots
Phylogenetic analysis of the barley CBFs revealed that three major phylogenetic subgroups are present and that all the non-barley monocot CBFs fell within one of the three subgroups (Figure 1). We designated the three subgroups the HvCBF1-, HvCBF3-, and HvCBF4-subgroups, each after a representative barley member. The 20 barley CBFs are relatively well distributed between the three groups and the barley CBF family complexity is representative of cereals in general. The HvCBF3- and HvCBF4-subgroups are clearly defined phylogenetically with a single defined root branch each.
The HvCBF1-subgroup is ancestral to the HvCBF3- and HvCBF4-subgroups, and while collectively assigned as a single subgroup for convenience, is composed of three distinct gene family clusters – the HvCBF5, the HvCBF7, and the HvCBF1/HvCBF11 families (Figure 1). The three HvCBF8 subfamily members encode frame-shifted pseudogenes and the polypeptides used for the phylogenetic tree (Figure 1) and the AP2 domain alignment (Figure 2C) are based on hypothetical reading
Figure 2. Major monocot CBF features. (A) Basic monocot CBF domain structure. The variable leader, AP2, and acidic C-terminal domains are noted. The AP2 domain-flanking CBF signature motif positions (Sig) are indicated as black blocks. Four additional regions of note are: (1) a six amino acid block conserved between the HvCBF4- and a subset of the HvCBF1-subgroup, (2) a nine amino acid block shared between dicot CBFs and all monocot CBFs except those of the HvCBF5 and HvCBF7 clades, (3) an acidic cluster, and (4) the LWS(Y) motif typically found near the CBF C-term. (B) General consensus sequence of a conserved C-terminal domain feature (#2 of panel 2A) for each indicated group; the maximum conserved block length is shown per group. (C) Alignment of monocot CBF AP2 domain and flanking signature motifs. All currently reported monocot CBFs (Table 1), grouped by HvCBF-subgroup membership, were utilized. The 45 amino acid insertion event of TmCBF7 disrupting the signature motif is denoted as ‘‘>insert