POINT-OF-VIEW
RNA Biology 9:2, 143–147; February 2012; © 2012 Landes Bioscience
Transcription beyond borders has downstream consequences
Cagla Sonmez and Caroline Dean*
Department of Cell and Developmental Biology; John Innes Centre; Norwich, UK
The realization that non-coding RNAs and antisense transcription are pervasive in many genomes has emphasized our relatively poor understanding of what limits transcription and how initiation and termination are linked to processing and turnover of the RNA. In genomes where the density of genes is high it is clearly important to efficiently terminate transcription to prevent read-through into adjacent genes. In a recent paper published in PNAS, we showed that two RNA binding proteins in Arabidopsis thaliana, FCA and FPA, play important roles in limiting intergenic transcription in the A. thaliana genome. Their absence leads to transcriptional read-through over many kilobases (kb), which influences expression, and in some cases chromatin modifications, of associated genes.
Keywords: Arabidopsis FCA/FPA, read-through transcription, termination, RNA processing, chromatin modifications
Submitted: 09/26/11 Revised: 11/03/11 Accepted: 11/05/11
http://dx.doi.org/10.4161/rna.18668
*Correspondence to: Caroline Dean; Email: [email protected]
www.landesbioscience.com
© 2012 Landes Bioscience. Do not distribute.

The low number of predicted protein-coding genes1-3 and the complexity of the non-coding transcriptome were major surprises from the human genome project. Many intergenic regions have been found to be transcribed4-7 and novel functions for non-coding RNAs have been elucidated.8-11 Whether the majority of non-coding RNA reflects “pervasive transcription that underpins human complexity” or technical artifact and noise has been recently debated in a series of papers by Van Bakel et al. and John Mattick’s group.12-14 Whatever the function of most of the so-called “dark matter” of the transcriptome, it will be important to identify its origin.15 RNA Polymerase II (PolII), together with RNA PolIV and V in plants,16 is the major polymerase involved in the generation of non-coding RNA. PolII activity is tightly regulated at the different stages of the transcription cycle, initiation, elongation and termination, through phosphorylation of its carboxy terminal domain (CTD) (reviewed in refs. 17 and 18). These different phases are in turn functionally coupled to the RNA processing steps, 5' capping, splicing and cleavage/polyadenylation, which occur co-transcriptionally while the RNA is being transcribed (reviewed in ref. 19). Chromatin state appears to be central to this functional coupling, providing feedback between transcription and processing steps. Changes in co-transcriptional interactions are thus likely to be at the heart of the complexity of the non-coding RNA in different organisms.

In our work studying regulators of developmental timing in Arabidopsis thaliana we have elucidated co-transcriptional mechanisms linking chromatin regulation and RNA processing. Our recent work has shown that perturbation of these mechanisms through loss of two RNA binding proteins leads to extensive intergenic transcription and increased non-coding RNA in the Arabidopsis genome.20

Previous analysis had implicated FCA in the regulation of alternative polyadenylation.21,22 FCA auto-regulates its own expression by enhancing proximal polyadenylation in an early intron, preventing the formation of a full-length transcript (Fig. 1A). This limits accumulation of the functional full-length transcript so that the FCA protein is expressed mainly in meristematic regions only after a certain stage of development.21 FCA autoregulation thus has important functional consequences in plant development by preventing premature flowering.23 FCA is also involved in alternative polyadenylation of the major floral regulator gene
FLC.22 Together with another RNA binding protein FPA, FCA downregulates FLC through alternative 3' processing of the FLC antisense RNA.24 The antisense RNA originates immediately downstream of the polyA site of the FLC sense transcript and spans the entire FLC gene locus (Fig. 1B). FCA and FPA promote proximal polyadenylation of this transcript to generate a short FLC antisense RNA. Loss of the RNA-binding proteins results in read-through transcription of the antisense strand and polyadenylation near the FLC sense promoter. Thorough genetic studies demonstrated that FCA functions through 3' end processing factors and chromatin regulators to modulate FLC expression.22,25-27 Use of the proximal polyadenylation site promotes activity of a histone demethylase (specific for H3K4me1/2) and transcriptional downregulation. Thus, FCA activity provides
a co-transcriptional link between 3' end processing and chromatin regulation in gene silencing. FCA and FPA were originally identified through their effect on flowering time,28 so for more than a decade, they were solely studied for their regulation of FLC. A genetic screen aimed at elucidating systemic RNA silencing in plants then revealed that FCA and FPA are also involved in transgene-induced chromatin silencing of an endogenous gene and transcriptional repression of low-copy transposable elements and repetitive sequences.29 This surprising finding stimulated the use of Arabidopsis whole genome tiling arrays to identify the range of sequences regulated by FCA and FPA.20 At least 2% of the ORFs in the Arabidopsis genome were mis-expressed in fcafpa with a striking bias toward misregulation of 3' ends of genes compared with 5' ends. Analysis of a smaller subset of
upregulated genomic segments, which had previously been classified as unannotated (UA) (www.arabidopsis.org), demonstrated extensive read-through transcription from upstream genes. Fifteen of the UA read-through transcripts were characterized in detail and shown to extend into intergenic regions, through adjacent genes transcribed in the same direction, into transposon-rich regions or into convergently transcribed genes, potentially causing double-stranded RNA formation. The read-through transcripts were also detected in wild-type cells using quantitative RT-PCR (qRT-PCR), although at very low levels. Analysis of single fca and fpa mutants revealed complex differential effects of FCA and FPA at the different loci, but chromatin immunoprecipitation (ChIP) experiments indicated that these were direct effects. Canonical 3' end processing factors had previously been implicated in gene silencing pathways by
Figure 1. FCA-dependent alternative polyadenylation of (A) FCA transcript, (B) FLC antisense transcripts. Open boxes represent exons, straight lines represent introns. (A) FCA promotes proximal polyadenylation of its own transcript in intron 3 thus providing a negative feedback regulation. Distal polyadenylation is favored in meristem cells at a certain stage of development leading to formation of the functional full length transcript.21 (B) FCA and FPA promote proximal polyadenylation of the FLC antisense transcript, which leads to transcriptional downregulation of the locus in a process requiring FLD, a histone demethylase activity.22 In their absence there is antisense transcriptional read-through, distal polyadenylation and no transcriptional downregulation.
Herr et al.; their mutation led to enhanced silencing in Arabidopsis due to increased read-through transcription of a transgene.30

We considered what could be causing the downregulation of the associated gene. Changed polyA usage and increased length of 3' UTRs could lead to inclusion of microRNA (miRNA) sites and thus regulation of the transcript by an alternative pathway. However, comparison of the UA targets with genomic segments mis-expressed in a serrate mutant, deficient in miRNA regulation, showed no matches and so provided no supporting evidence for this. Alternatively, long 3' regions can cause transcripts to become more sensitive to nonsense-mediated decay (NMD) pathways.31-33 Our preliminary analysis did not reveal a higher accumulation of the fcafpa upregulated genomic segments in an Arabidopsis NMD mutant, upf1. In many of the examples we studied, the majority of the polyadenylation and 3' processing still occurred at the wild-type site; the fcafpa mutations simply resulted in leakier termination rather than alternative polyA site use. Thus, reduction of the expression of the gene may be a consequence of secondary effects.

The chromatin surrounding the 15 UA segments is associated with various histone modifications, DNA methylation patterns or small RNAs in wild-type plants (neomorph.salk.edu/epigenome/epigenome.html). We wondered, therefore, whether a close link between RNA processing, transcription rate and chromatin regulation may be the cause of the expression differences. Penheiter et al. and Sheldon et al. have shown that loss of components of the transcription elongation complex Paf1C leads to defective 3' end formation of mRNAs and small nucleolar RNAs (snoRNAs), respectively.34,35 Yeast Paf1C integrates many aspects of transcription through interactions with transcriptional activators, elongation and 3' processing factors, and histone modifications (reviewed in ref. 36).
Similarly, loss of the Arabidopsis histone deacetylase HDA6 leads to accumulation of longer ribosomal RNA (rRNA) transcript variants normally repressed in wild-type cells, triggering overproduction of small interfering RNAs (siRNAs) and changed DNA methylation and histone modifications.37 We therefore explored whether there were any connections of the UA segments
with chromatin regulators. The fcafpa upregulated antisense transcript of a Helitron DNA was also found to be upregulated in dcl1, mutant for one of the Arabidopsis dicer-like homologs; nrpd1a, defective in a plant-specific RNA polymerase (PolIV); and ddm1 (decrease in DNA methylation), defective in a chromatin remodelling factor. The Helitron DNA is heavily methylated in wild-type plants, and the DNA methylation is reduced by almost half in fcafpa in all three sequence contexts (i.e., CG, CNG and CHH). This UA segment is also upregulated in seedlings defective in the histone H3K4me2 demethylase FLD. Another example of a gene with an altered DNA methylation
pattern in the read-through transcript is illustrated in Figure 2A. These examples support a link between changed 3' processing and chromatin regulation as a major factor affecting gene expression, but more extensive analysis of all 15 genomic segments will be required to establish the generality of this link. Some of the UA transcripts are read-through products of genes mapping several kb upstream and they contain some of the largest introns in the Arabidopsis genome (Fig. 2B). The 5' splice donor sites of these introns fall close to, and sometimes immediately upstream of, the stop codon of the associated gene; therefore, there is both a 3' UTR sequence change and a
Figure 2. Representative examples of two genes, (A) At1g55805 and (B) At1g28140, which are alternatively polyadenylated at proximal and distal sites in an FCA/FPA-dependent fashion. The open boxes are exons, thick gray lines indicate introns, the thin black line is the intergenic region, and gray boxes represent 5' and 3' UTRs. The horizontal arrows indicate direction of transcription. The tilted arrows indicate polyadenylation choices with/without FCA/FPA. The thickness of the arrows represents relative efficiency of polyadenylation in wild type vs. fcafpa mutants. The dotted lines represent (A) the region with changed DNA methylation pattern in fcafpa vs. wild type and (B) the large intron splicing event.
frameshift in the protein sequence. This link between alternative splicing and 3' processing is similar to the previously reported example of alternative polyadenylation in the mammalian IgH gene.38 There a splice site choice influences polyadenylation site usage, which in turn determines which form of IgH is made, secreted or membrane-bound. Suppression of a 5' splice site and thus use of the proximal weak polyA site is promoted by the transcription elongation factor ELL2, through binding of the polyadenylation factor CstF64 to PolII and potentially also changes in polymerase elongation rate. In the absence of ELL2, the intron is spliced out removing the weak proximal polyA site, and a terminal exon is included to form the membrane-bound form. We view
this dynamic interplay between 3' end processing, splicing and transcription as very similar to what happens at FCA/FPA targets. Formation of the spliced UA transcripts through loss of FCA and FPA also appears to involve differential choice of splice sites and polyA sites. There is still much to be understood about functional coupling between splicing and transcription in a chromatin context.39

Our study reveals the extent of the genomic targets of the RNA binding proteins FCA and FPA in the A. thaliana genome. It sheds light on how defects in 3' processing lead to different consequences and reveals an interplay between chromatin modifications, 3' end processing, splicing and transcription. The effects of aberrant 3' end formation and read-through transcription extend to chromosomal regions beyond the associated gene and so would contribute to the catalog of intergenic, non-coding RNAs. Continued analysis of changes in co-transcriptional regulation through genetic or environmental variation in activities such as Arabidopsis FCA and FPA should shed light on the consequences of intergenic transcription.
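As an illustration only, the categories of read-through described earlier (into intergenic sequence, through a gene transcribed in the same direction, into transposon-rich regions, or into a convergently transcribed gene) could be assigned from annotation coordinates roughly as sketched below. This is not the analysis used in the study: the `Feature` record, the `classify_readthrough` function, the label strings and all coordinates are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    """Hypothetical annotation record (not a real annotation format)."""
    name: str
    start: int           # 1-based, inclusive
    end: int             # inclusive
    strand: str          # "+" or "-"
    kind: str = "gene"   # "gene" or "transposon"


def classify_readthrough(source: Feature, extent_bp: int,
                         features: List[Feature]) -> List[str]:
    """Label the features reached when transcription of `source`
    continues `extent_bp` bases beyond its annotated 3' end."""
    # The read-through window lies downstream of the 3' end, which is to
    # the right of a "+" strand gene and to the left of a "-" strand gene.
    if source.strand == "+":
        lo, hi = source.end + 1, source.end + extent_bp
    else:
        lo, hi = source.start - extent_bp, source.start - 1

    labels = []
    for f in features:
        # Skip the source gene and anything outside the window.
        if f.name == source.name or f.end < lo or f.start > hi:
            continue
        if f.kind == "transposon":
            labels.append(f"transposon:{f.name}")
        elif f.strand == source.strand:
            labels.append(f"tandem_gene:{f.name}")
        else:
            # Opposite strand downstream of the 3' end: the two genes are
            # transcribed toward each other (convergent).
            labels.append(f"convergent_gene:{f.name}")
    return labels or ["intergenic_only"]


# Made-up coordinates for demonstration.
gene_a = Feature("geneA", 1000, 3000, "+")
annotation = [
    gene_a,
    Feature("geneB", 3500, 5000, "+"),
    Feature("geneC", 6000, 8000, "-"),
    Feature("TE1", 9000, 9500, "+", kind="transposon"),
]

print(classify_readthrough(gene_a, 400, annotation))
print(classify_readthrough(gene_a, 4000, annotation))
```

With these invented coordinates, a short 400 bp extension stays within intergenic sequence, whereas a 4 kb extension runs through a same-strand neighbor and into a convergently transcribed gene, the situation noted in the text as potentially producing double-stranded RNA.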
Acknowledgments
We thank Dean Lab members for critical reading of the manuscript. This work was supported by the European Community FP7 project AENEAS (Acquired environmental epigenetic advances: from Arabidopsis to maize), Contract SCP/226477, to C.D.

References
1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature 2001; 409:860-921; PMID:11237011; http://dx.doi.org/10.1038/35057062
2. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. The sequence of the human genome. Science 2001; 291:1304-51; PMID:11181995; http://dx.doi.org/10.1126/science.1058040
3. Finishing the euchromatic sequence of the human genome. Nature 2004; 431:931-45; PMID:15496913; http://dx.doi.org/10.1038/nature03001
4. Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, et al. Global identification of human transcribed sequences with genome tiling arrays. Science 2004; 306:2242-6; PMID:15539566; http://dx.doi.org/10.1126/science.1103388
5. Katayama S, Tomaru Y, Kasukawa T, Waki K, Nakanishi M, Nakamura M, et al. Antisense transcription in the mammalian transcriptome. Science 2005; 309:1564-6; PMID:16141073; http://dx.doi.org/10.1126/science.1112009
6. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007; 447:799-816; PMID:17571346; http://dx.doi.org/10.1038/nature05874
7. Chekanova JA, Gregory BD, Reverdatto SV, Chen H, Kumar R, Hooker T, et al. Genome-wide high-resolution mapping of exosome substrates reveals hidden features in the Arabidopsis transcriptome. Cell 2007; 131:1340-53; PMID:18160042; http://dx.doi.org/10.1016/j.cell.2007.10.056
8. Brown CJ, Hendrich BD, Rupert JL, Lafreniere RG, Xing Y, Lawrence J, et al. The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell 1992; 71:527-42; PMID:1423611; http://dx.doi.org/10.1016/0092-8674(92)90520-M
9. Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, et al. Functional Demarcation of Active and Silent Chromatin Domains in Human HOX Loci by Noncoding RNAs. Cell 2007; 129:1311-23; PMID:17604720; http://dx.doi.org/10.1016/j.cell.2007.05.022
10. Khalil AM, Guttman M, Huarte M, Garber M, Raj A, Rivea Morales D, et al. Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci USA 2009; 106:11667-72; PMID:19571010; http://dx.doi.org/10.1073/pnas.0904715106
11. Gupta RA, Shah N, Wang KC, Kim J, Horlings HM, Wong DJ, et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature 2010; 464:1071-6; PMID:20393566; http://dx.doi.org/10.1038/nature08975
12. van Bakel H, Nislow C, Blencowe BJ, Hughes TR. Most “dark matter” transcripts are associated with known genes. PLoS Biol 2010; 8:e1000371; PMID:20502517; http://dx.doi.org/10.1371/journal.pbio.1000371
13. Clark MB, Amaral PP, Schlesinger FJ, Dinger ME, Taft RJ, Rinn JL, et al. The reality of pervasive transcription. PLoS Biol 2011; 9:e1000625; PMID:21765801; http://dx.doi.org/10.1371/journal.pbio.1000625
14. van Bakel H, Nislow C, Blencowe BJ, Hughes TR. Response to “the reality of pervasive transcription”. PLoS Biol 2011; 9:e1001102; http://dx.doi.org/10.1371/journal.pbio.1001102
15. Jarvis K, Robertson M. The noncoding universe. BMC Biol 2011; 9:52; PMID:21798102; http://dx.doi.org/10.1186/1741-7007-9-52
16. Zheng B, Wang Z, Li S, Yu B, Liu J-Y, Chen X. Intergenic transcription by RNA Polymerase II coordinates Pol IV and Pol V in siRNA-directed transcriptional gene silencing in Arabidopsis. Genes Dev 2009; 23:2850-60; PMID:19948763; http://dx.doi.org/10.1101/gad.1868009
17. Hahn S. Structure and mechanism of the RNA polymerase II transcription machinery. Nat Struct Mol Biol 2004; 11:394-403; PMID:15114340; http://dx.doi.org/10.1038/nsmb763
18. Kuehner JN, Pearson EL, Moore C. Unravelling the means to an end: RNA polymerase II transcription termination. Nat Rev Mol Cell Biol 2011; 12:283-94; PMID:21487437; http://dx.doi.org/10.1038/nrm3098
19. Perales R, Bentley D. “Cotranscriptionality”: the transcription elongation complex as a nexus for nuclear transactions. Mol Cell 2009; 36:178-91; PMID:19854129; http://dx.doi.org/10.1016/j.molcel.2009.09.018
20. Sonmez C, Baurle I, Magusin A, Dreos R, Laubinger S, Weigel D, et al. RNA 3' processing functions of Arabidopsis FCA and FPA limit intergenic transcription. Proc Natl Acad Sci USA 2011; 108:8508-13; PMID:21536901; http://dx.doi.org/10.1073/pnas.1105334108
21. Macknight R, Duroux M, Laurie R, Dijkwel P, Simpson G, Dean C. Functional Significance of the Alternative Transcript Processing of the Arabidopsis Floral Promoter FCA. Plant Cell 2002; 14:877-88; PMID:11971142; http://dx.doi.org/10.1105/tpc.010456
22. Liu F, Marquardt S, Lister C, Swiezewski S, Dean C. Targeted 3' Processing of Antisense Transcripts Triggers Arabidopsis FLC Chromatin Silencing. Science 2010; 327:94-7; PMID:19965720; http://dx.doi.org/10.1126/science.1180278
23. Quesada V, Macknight R, Dean C, Simpson GG. Autoregulation of FCA pre-mRNA processing controls Arabidopsis flowering time. EMBO J 2003; 22:3142-52; PMID:12805228; http://dx.doi.org/10.1093/emboj/cdg305
24. Hornyik C, Terzi L, Simpson G. The Spen family protein FPA controls alternative cleavage and polyadenylation of RNA. Dev Cell 2010; 18:172-4; PMID:20159589; http://dx.doi.org/10.1016/j.devcel.2009.12.009
25. Manzano D, Marquardt S, Jones AME, Bäurle I, Liu F, Dean C. Altered interactions within FY/AtCPSF complexes required for Arabidopsis FCA-mediated chromatin silencing. Proc Natl Acad Sci USA 2009; 106:8772-7; PMID:19439664; http://dx.doi.org/10.1073/pnas.0903444106
26. Liu F, Quesada V, Crevillen P, Bäurle I, Swiezewski S, Dean C. The Arabidopsis RNA-binding protein FCA requires a lysine-specific demethylase 1 homolog to downregulate FLC. Mol Cell 2007; 28:398-407; PMID:17996704; http://dx.doi.org/10.1016/j.molcel.2007.10.018
27. Sarnowski TJ, Swiezewski S, Pawlikowska K, Kaczanowski S, Jerzmanowski A. AtSWI3B, an Arabidopsis homolog of SWI3, a core subunit of yeast Swi/Snf chromatin remodeling complex, interacts with FCA, a regulator of flowering time. Nucleic Acids Res 2002; 30:3412-21; PMID:12140326; http://dx.doi.org/10.1093/nar/gkf458
28. Koornneef M, Hanhart CJ, Van der Veen JH. A genetic and physiological analysis of late flowering mutants in Arabidopsis thaliana. Mol Gen Genet 1991; 229:57-66; PMID:1896021; http://dx.doi.org/10.1007/BF00264213
29. Bäurle I, Dean C. The timing of developmental transitions in plants. Cell 2006; 125:655-64; PMID:16713560; http://dx.doi.org/10.1016/j.cell.2006.05.005
30. Herr AJ, Molnar A, Jones A, Baulcombe DC. Inaugural Article: Defective RNA processing enhances RNA silencing and influences flowering of Arabidopsis. Proc Natl Acad Sci USA 2006; 103:14994-5001; PMID:17008405; http://dx.doi.org/10.1073/pnas.0606536103
31. Hogg JR, Goff SP. Upf1 senses 3'UTR length to potentiate mRNA decay. Cell 2010; 143:379-89; PMID:21029861; http://dx.doi.org/10.1016/j.cell.2010.10.005
32. Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells express mRNAs with shortened 3' untranslated regions and fewer microRNA target sites. Science 2008; 320:1643-7; PMID:18566288; http://dx.doi.org/10.1126/science.1155390
33. Mayr C, Bartel DP. Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 2009; 138:673-84; PMID:19703394; http://dx.doi.org/10.1016/j.cell.2009.06.016
34. Penheiter KL, Washburn TM, Porter S, Hoffmann MG, Jaehning JA. A posttranscriptional role for the yeast Paf1-RNA polymerase II complex is revealed by identification of primary targets. Mol Cell 2005; 20:213-23; PMID:16246724; http://dx.doi.org/10.1016/j.molcel.2005.08.023
35. Sheldon KE, Mauger DM, Arndt KM. A Requirement for the Saccharomyces cerevisiae Paf1 Complex in snoRNA 3' End Formation. Mol Cell 2005; 20:225-36; PMID:16246725; http://dx.doi.org/10.1016/j.molcel.2005.08.026
36. Jaehning JA. The Paf1 complex: platform or player in RNA polymerase II transcription? Biochim Biophys Acta 2010; 1799:379-88; PMID:20060942
37. Earley KW, Pontvianne F, Wierzbicki AT, Blevins T, Tucker S, Costa-Nunes P, et al. Mechanisms of HDA6-mediated rRNA gene silencing: suppression of intergenic Pol II transcription and differential effects on maintenance versus siRNA-directed cytosine methylation. Genes Dev 2010; 24:1119-32; PMID:20516197; http://dx.doi.org/10.1101/gad.1914110
38. Martincic K, Alkan SA, Cheatle A, Borghesi L, Milcarek C. Transcription elongation factor ELL2 directs immunoglobulin secretion in plasma cells by stimulating altered RNA processing. Nat Immunol 2009; 10:1102-9; PMID:19749764; http://dx.doi.org/10.1038/ni.1786
39. Oesterreich FC, Bieberstein N, Neugebauer KM. Pause locally, splice globally. Trends Cell Biol 2011; 21:328-35; PMID:21530266; http://dx.doi.org/10.1016/j.tcb.2011.03.002