Biochemical Society Transactions (2013) Volume 41, part 6
Understanding non-coding DNA regions in yeast
Margarita Schlackow* and Monika Gullerova†1
*Mathematical Institute, University of Oxford, Oxford OX1 3LB, U.K., and †Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, U.K.
Abstract
Non-coding transcripts play an important role in gene expression regulation in all species, including budding and fission yeast. Such regulatory transcripts include intergenic ncRNA (non-coding RNA), 5′ and 3′ UTRs, introns and antisense transcripts. In the present review, we discuss advantages and limitations of recently developed sequencing techniques, such as ESTs, DNA microarrays, RNA-Seq (RNA sequencing), DRS (direct RNA sequencing) and TIF-Seq (transcript isoform sequencing). We provide an overview of methods applied in yeast and how each of them has contributed to our knowledge of gene expression regulation and transcription.
Introduction
Three decades ago, studies of gene expression regulation were based mostly on biological and biochemical analyses of a specific gene. Technological advances in experimental strategies have enabled the rapid development of genome-wide approaches involving quantitative transcriptome sequencing and subsequent bioinformatic and statistical analyses. These offer a much deeper and more general understanding of transcriptional regulation. Regulatory elements include transcribed, but not translated, nucleic acid regions such as intergenic ncRNA (non-coding RNA), 5′ and 3′ UTRs of a gene, introns and antisense transcripts. In the present review, we focus on RNA polymerase II-transcribed regulatory elements in the fission yeast Schizosaccharomyces pombe and the budding yeast Saccharomyces cerevisiae. These two commonly used model organisms have contributed substantially to the comprehension of eukaryotic genome function. Techniques for investigating the yeast genome have undergone a dramatic evolution. Starting from the analysis of individual genes [1], followed by DNA microarrays [2], ESTs [3], RNA-Seq (RNA sequencing) [4,5], DRS (direct RNA sequencing) [6] and TIF-Seq (transcript isoform sequencing) [7], they have brought successive technical advances and consequently yield larger and more detailed datasets.
Key words: budding yeast, fission yeast, polyadenylation, sequencing, untranslated region (UTR).
Abbreviations used: CUT, cryptic unstable transcript; DRS, direct RNA sequencing; ncRNA, non-coding RNA; NUE, near-upstream element; PAS, polyadenylation signal; priRNA, primary small RNA; RNA-Seq, RNA sequencing; SAGE, serial analysis of gene expression; snoRNA, small nucleolar RNA; SUT, stable unannotated transcript; TIF-Seq, transcript isoform sequencing; TSS, transcription start site; uORF, upstream ORF; VT, virtual terminator.
1 To whom correspondence should be addressed (email [email protected]).
© The Authors Journal compilation © 2013 Biochemical Society
Experimental approaches employed to determine transcribed genomic regions
Expressed sequence tags
ESTs are partial reads of cDNA sequences. They provide the possibility of evaluating gene expression dependent on the cell cycle stage and cell type [8]. Isolated eukaryotic mRNA is reverse-transcribed into cDNA, which is inserted into a suitable vector (Figure 1A). Inserted fragments are sequenced by priming their ends in randomly selected clones. Bioinformatic analysis removes low-quality sequence reads and contaminating vector sequence. Individual EST reads can vary greatly in length, up to 800 nt [8]. Although ESTs have proved useful in studying gene expression, several disadvantages have pushed them to the sidelines of currently used techniques. In particular, low numbers of full-length ESTs, chimaeric sequences due to cDNA template switching [9], internal cDNA priming events and low-quality sequences at the EST ends [10] (reviewed in [11]) led to the development of further approaches to study gene expression.
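The bioinformatic clean-up step mentioned for ESTs (removal of low-quality reads and contaminating vector sequence) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the cited protocols: the function names `trim_vector` and `clean_ests`, the use of exact flanking vector strings, the proxy for "low quality" (fraction of ambiguous `N` bases) and the thresholds `min_length` and `max_n_fraction` are all hypothetical.

```python
from typing import List


def trim_vector(read: str, vector_prefix: str, vector_suffix: str) -> str:
    """Strip a known vector sequence from the start and/or end of a read.

    Assumption: the contaminating vector appears as an exact flanking string.
    Real pipelines use alignment-based screening instead of exact matching.
    """
    if vector_prefix and read.startswith(vector_prefix):
        read = read[len(vector_prefix):]
    if vector_suffix and read.endswith(vector_suffix):
        read = read[:-len(vector_suffix)]
    return read


def clean_ests(
    reads: List[str],
    vector_prefix: str,
    vector_suffix: str,
    min_length: int = 50,          # hypothetical threshold
    max_n_fraction: float = 0.05,  # hypothetical threshold
) -> List[str]:
    """Return vector-trimmed reads that pass simple length and ambiguity filters."""
    cleaned = []
    for read in reads:
        insert = trim_vector(read.upper(), vector_prefix, vector_suffix)
        if len(insert) < min_length:
            continue  # too short to be informative after trimming
        if insert.count("N") / len(insert) > max_n_fraction:
            continue  # too many ambiguous base calls
        cleaned.append(insert)
    return cleaned


if __name__ == "__main__":
    # Toy reads invented for this example, not real EST data.
    toy_reads = [
        "GGATCC" + "ACGT" * 20 + "GAATTC",  # vector-flanked, kept after trimming
        "GGATCC" + "ACGT" * 5,              # too short after trimming
        "N" * 30 + "ACGT" * 10,             # too many ambiguous bases
    ]
    print(len(clean_ests(toy_reads, "GGATCC", "GAATTC")))
```

Exact string matching is used only to keep the sketch short; it would miss vector fragments containing sequencing errors, which is one reason dedicated screening tools exist.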
DNA microarrays
DNA microarrays became one of the most popular tools to measure transcriptional activity and genotype in multiple genomic regions. Fluorescently labelled cDNA or cRNA (complementary RNA) can hybridize to microscopic spots on a surface, each containing one specific fluorescent DNA probe (Figure 1B). A laser-scanning microscope detects the fluorescent signal corresponding to sample–probe hybridization [12,13]. Microarrays enable parallelization of data acquisition, allowing many genes to be compared in a single experiment. A major challenge of these microarrays is the elimination of cross-hybridization. Additionally, microarrays depend on specific probes to detect the expression of particular genes. ESTs, in this case, are helpful for probe design. The more recent tiling arrays are independent of gene annotation and can probe for any genomic sequence. This allows the representation of non-repetitive DNA at various sequence resolutions, thus enabling the discovery of novel transcripts and regulatory elements [14]. High-density oligonucleotide tiling arrays allowed the re-annotation of gene boundaries and the estimation of levels of coding and non-coding transcripts in S. cerevisiae [15] and S. pombe [16].
Biochem. Soc. Trans. (2013) 41, 1654–1659; doi:10.1042/BST20130144
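For the two-colour microarray design described above, in which a sample of interest and a control are labelled with different fluorophores, a common way to summarize each probe is the log2 ratio of the two channel intensities. The following Python sketch shows that calculation under stated assumptions: the function name `log2_ratios`, the flat `background` subtraction and the `floor` value used to avoid taking the logarithm of zero are hypothetical choices for illustration, and the review itself does not prescribe a particular normalization.

```python
import math
from typing import Dict


def log2_ratios(
    sample: Dict[str, float],
    control: Dict[str, float],
    background: float = 0.0,  # hypothetical uniform background intensity
    floor: float = 1.0,       # hypothetical minimum intensity after correction
) -> Dict[str, float]:
    """Compute log2(sample / control) for each probe present in both channels."""
    ratios = {}
    for probe, sample_intensity in sample.items():
        if probe not in control:
            continue  # a ratio needs a measurement in both channels
        s = max(sample_intensity - background, floor)
        c = max(control[probe] - background, floor)
        ratios[probe] = math.log2(s / c)
    return ratios


if __name__ == "__main__":
    # Toy intensities invented for this example.
    sample = {"probe_a": 800.0, "probe_b": 100.0}
    control = {"probe_a": 200.0, "probe_b": 400.0}
    for probe, value in log2_ratios(sample, control).items():
        print(probe, value)
```

A positive value indicates a stronger signal in the sample channel and a negative value a stronger signal in the control channel. Cross-hybridization, noted above as a major challenge, is not addressed by this calculation.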
The 7th International Fission Yeast Meeting: Pombe2013
Figure 1 Schematic illustration of experimental techniques
RNA is extracted from a population of cells. (A) ESTs. The extracted RNA is converted into cDNA and inserted into a plasmid. Random clones are sequenced by 5′ and 3′ end priming. (B) DNA microarray. The RNA from a population of interest and a control population is extracted and reverse-transcribed into cDNA. Each population is labelled with a different fluorophore. The samples are spotted on to a microarray slide, whose spots contain different probes to which the samples can hybridize by complementarity. Different levels and colours of fluorescence indicate where and how strongly the probed region is expressed. (C) RNA-Seq. After reverse transcription and shearing of the cDNA, the fragments undergo adapter ligation. Size-selected fragments are added to a flow cell, which is covered with oligonucleotides binding to the adapters. Bound sequences are locally amplified and sequenced via addition of reversibly chemically blocked nucleotides. Both strand-specific and non-strand-specific RNA-Seq protocols can be employed. (D) DRS. Poly(dT) oligonucleotides cover a surface which binds the polyadenylated RNA. A ‘fill and lock’ step fills in the overhanging poly(A) tail and locks the first non-A nucleotide with a complementary fluorescently labelled VT–nucleotide. The dye–nucleotide is then chemically cleaved and the next VT–nucleotide can be incorporated and imaged. This step is then repeated. (E) TIF-Seq. The 5′ end is capped. Full-length cDNA is generated and the 5′ end is biotin-tagged. The molecules are then circularized and sheared. The sequence is obtained by paired-end sequencing.
RNA-Seq
DNA tiling arrays could fail to identify short exons and precise UTR boundaries. They may give high false-positive rates of transcribed regions and cannot detect post-transcriptionally modified sequences [17]. RNA-Seq is a more recent quantitative method allowing estimation of gene expression levels, where cDNA reads are mapped back to genomic regions [18]. Unique DNA adapters are attached to both ends of sheared DNA fragments (60% of budding yeast genes [20]. Additionally, co-ordination between sense and antisense polyadenylated transcript levels plays a role in gene expression regulation. Genes that are expressed at low levels in budding yeast show a positive correlation between sense and antisense transcripts, whereas highly expressed genes demonstrate a negative correlation [20]. In contrast, in fission yeast, tiling array analysis suggests that most antisense transcripts are not polyadenylated [16]. This method also indicated that highly expressed genes and histone-depleted regions tend to show elevated antisense transcription [16]. Antisense transcripts generally occur more frequently in the 3′ UTRs than in the 5′ UTRs, which implies that they may also arise from overlapping 3′ UTRs, as shown by RNA-Seq [17]. However, the identification of antisense transcription using any method that involves reverse transcription of RNA to cDNA requires caution, as reverse transcription can cause secondary mispriming. Generally, it seems that antisense transcription is lower than sense transcription [15,19]. Also, some antisense transcripts are part of lncRNAs (long ncRNAs). These long antisense transcripts, as well as some long intergenic transcripts, are present at less than one copy per cell, which could reflect their tight repression at transcriptional, post-transcriptional or chromatin levels [5].
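The relationship between sense and antisense transcript levels described above (a positive correlation for lowly expressed genes and a negative one for highly expressed genes in budding yeast [20]) can be examined with a simple stratified correlation. The Python sketch below is illustrative only: the function names, the use of the Pearson coefficient and the single expression `threshold` separating "low" from "high" genes are assumptions, and the toy numbers are invented to exercise the code rather than drawn from any dataset.

```python
import math
from typing import Dict, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_by_expression(
    levels: Dict[str, Tuple[float, float]],  # gene -> (sense, antisense) level
    threshold: float,                        # hypothetical low/high cut-off
) -> Dict[str, float]:
    """Correlate sense and antisense levels separately for low and high genes."""
    groups = {"low": [], "high": []}
    for sense, antisense in levels.values():
        key = "low" if sense < threshold else "high"
        groups[key].append((sense, antisense))
    result = {}
    for name, pairs in groups.items():
        if len(pairs) < 2:
            continue  # too few genes in this group to correlate
        result[name] = pearson([s for s, _ in pairs], [a for _, a in pairs])
    return result


if __name__ == "__main__":
    # Toy values invented for this example, not measured data.
    toy = {
        "gene1": (1.0, 1.0), "gene2": (2.0, 2.0), "gene3": (3.0, 3.0),
        "gene4": (10.0, 3.0), "gene5": (20.0, 2.0), "gene6": (30.0, 1.0),
    }
    print(correlation_by_expression(toy, threshold=5.0))
```

In a real analysis the cut-off would have to be justified from the expression distribution, and the caution raised above about mispriming artefacts applies to any antisense counts used as input.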
An important difference between S. cerevisiae and S. pombe is the lack of an RNAi pathway in budding yeast. The RNAi pathway in fission yeast and in higher eukaryotes is induced by dsRNA formation. dsRNA is processed into short RNA molecules (siRNAs), which target nascent mRNA via complementarity or induce heterochromatin formation and consequent gene silencing at the transcriptional level. Multiple protein complexes, including Dicer [35], are involved in the RNAi pathway. Small RNA libraries are prepared by size-selecting molecules, followed by RNA-Seq, which determines sequence content [36]. It is apparent that RNAi in S. pombe acts jointly with the exosome to repress developmentally regulated genes and retrotransposons. Furthermore, a similar analysis in Dicer-knockout cells [37] revealed Dicer-independent priRNAs (primary small RNAs). It appears that priRNAs originate from degradation of abundant transcripts, and that they target antisense transcripts arising from bidirectional transcription of DNA repeats.
Conclusion
Over the years, increasingly sophisticated experimental and analytical tools have given scientists the opportunity to gain a deeper insight into gene regulatory elements. These range from the production of regulatory ncRNAs, which cause a gradient of genetic repression and even gene silencing, to coding transcript variation. Transcript diversity and functionality may depend on the protein sequence [17,19], translational activity [32] or deletion/retention of binding sites for RNA-binding proteins [7]. Sequencing resources have not yet reached their limits. Their depth and precision grow with newly developing techniques. In the present review, we have summarized a few significant experimental techniques and regulatory factors in several regions of the genome. Moreover, we have provided a simple overview of RNA polymerase II-transcribed regulatory elements. Many other regulatory elements such as rRNA, tRNA, spliceosomal RNA, snoRNA, telomerase RNA, signal recognition particle RNA, RNA components of RNase P and RNase MRP and mitochondrial RNA [38] can also be detected by the techniques described. To date, many effects in gene expression are not fully classified, and possibly only vaguely predicted. We anticipate that the rapid development and reduction in the costs of high-throughput sequencing, which dramatically increases the available genomic and transcriptome sequence resources in many organisms, will lead to the discovery of more regulatory RNAs. These resources, together with the powerful tools of comparative genomics, have the potential to bring new insights into the functional dynamics of gene expression regulation.
Acknowledgement
We thank Eleanor White for valuable comments on our paper.
Funding
This work was supported by the Engineering and Physical Sciences Research Council (to M.S.), a L’Oreal/UNESCO Woman in Science UK and Ireland award (to M.G.) and a Medical Research Council Career Development Award (to M.G.).
References
1 Humphrey, T., Sadhale, P., Platt, T. and Proudfoot, N.J. (1991) Homologous mRNA 3′ end formation in fission and budding yeast. EMBO J. 10, 3505–3511
2 David, L., Clauder-Münster, S. and Steinmetz, L. (2011) Genome-wide transcriptome analysis in yeast using high-density tiling arrays. In Yeast Systems Biology (Castrillo, J.I. and Oliver, S.G., eds), pp. 107–123, Humana Press, Totowa
3 Graber, J.H., Cantor, C.R., Mohr, S.C. and Smith, T.F. (1999) Genomic detection of new yeast pre-mRNA 3′-end-processing signals. Nucleic Acids Res. 27, 888–894
4 Wilhelm, B.T., Marguerat, S., Watt, S., Schubert, F., Wood, V., Goodhead, I., Penkett, C.J., Rogers, J. and Bähler, J. (2008) Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243
5 Marguerat, S., Schmidt, A., Codlin, S., Chen, W., Aebersold, R. and Bähler, J. (2012) Quantitative analysis of fission yeast transcriptomes and proteomes in proliferating and quiescent cells. Cell 151, 671–683
6 Ozsolak, F., Platt, A.R., Jones, D.R., Reifenberger, J.G., Sass, L.E., McInerney, P., Thompson, J.F., Bowers, J., Jarosz, M. and Milos, P.M. (2009) Direct RNA sequencing. Nature 461, 814–818
7 Pelechano, V., Wei, W. and Steinmetz, L.M. (2013) Extensive transcriptional heterogeneity revealed by isoform profiling. Nature 497, 127–131
8 Parkinson, J. and Blaxter, M. (2009) Expressed sequence tags: an overview. In Expressed Sequence Tags (ESTs) (Parkinson, J., ed.), pp. 1–12, Humana Press, Totowa
9 Cocquet, J., Chong, A., Zhang, G. and Veitia, R.A. (2006) Reverse transcriptase template switching and false alternative transcripts. Genomics 88, 127–131
10 Aaronson, J.S., Eckman, B., Blevins, R.A., Borkowski, J.A., Myerson, J., Imran, S. and Elliston, K.O. (1996) Toward the development of a gene index to the human genome: an assessment of the nature of high-throughput EST sequence data. Genome Res. 6, 829–845
11 Nagaraj, S.H., Gasser, R.B. and Ranganathan, S. (2007) A hitchhiker’s guide to expressed sequence tag (EST) analysis. Briefings Bioinf. 8, 6–21
12 Shalon, D., Smith, S.J. and Brown, P.O. (1996) A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res. 6, 639–645
13 Schena, M., Heller, R.A., Theriault, T.P., Konrad, K., Lachenmeier, E. and Davis, R.W. (1998) Microarrays: biotechnology’s discovery platform for functional genomics. Trends Biotechnol. 16, 301–306
14 Bertone, P. and Snyder, M. (2005) Advances in functional protein microarray technology. FEBS J. 272, 5400–5411
15 David, L., Huber, W., Granovskaia, M., Toedling, J., Palm, C.J., Bofkin, L., Jones, T., Davis, R.W. and Steinmetz, L.M. (2006) A high-resolution map of transcription in the yeast genome. Proc. Natl. Acad. Sci. U.S.A. 103, 5320–5325
16 Dutrow, N., Nix, D.A., Holt, D., Milash, B., Dalley, B., Westbroek, E., Parnell, T.J. and Cairns, B.R. (2008) Dynamic transcriptome of Schizosaccharomyces pombe shown by RNA–DNA hybrid mapping. Nat. Genet. 40, 977–986
17 Nagalakshmi, U., Wang, Z., Waern, K., Shou, C., Raha, D., Gerstein, M. and Snyder, M. (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349
18 Marguerat, S. and Bähler, J. (2010) RNA-seq: from technology to biology. Cell. Mol. Life Sci. 67, 569–579
19 Rhind, N., Chen, Z., Yassour, M., Thompson, D.A., Haas, B.J., Habib, N., Wapinski, I., Roy, S., Lin, M.F., Heiman, D.I. et al. (2011) Comparative functional genomics of the fission yeasts. Science 332, 930–936
20 Ozsolak, F., Kapranov, P., Foissac, S., Kim, S.W., Fishilevich, E., Monaghan, A.P., John, B. and Milos, P.M. (2010) Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation. Cell 143, 1018–1029
21 Samanta, M.P., Tongprasit, W., Sethi, H., Chin, C.-S. and Stolc, V. (2006) Global identification of noncoding RNAs in Saccharomyces cerevisiae by modulating an essential RNA processing pathway. Proc. Natl. Acad. Sci. U.S.A. 103, 4192–4197
22 Neil, H., Malabat, C., d’Aubenton-Carafa, Y., Xu, Z., Steinmetz, L.M. and Jacquier, A. (2009) Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457, 1038–1042
23 Xu, Z., Wei, W., Gagneur, J., Perocchi, F., Clauder-Münster, S., Camblong, J., Guffanti, E., Stutz, F., Huber, W. and Steinmetz, L.M. (2009) Bidirectional promoters generate pervasive transcription in yeast. Nature 457, 1033–1037
24 Thompson, D.M. and Parker, R. (2007) Cytoplasmic decay of intergenic transcripts in Saccharomyces cerevisiae. Mol. Cell. Biol. 27, 92–101
25 Zhao, J., Hyman, L. and Moore, C. (1999) Formation of mRNA 3′ ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol. Mol. Biol. Rev. 63, 405–445
26 Graber, J.H., McAllister, G.D. and Smith, T.F. (2002) Probabilistic prediction of Saccharomyces cerevisiae mRNA 3′-processing sites. Nucleic Acids Res. 30, 1851–1858
27 Yamanishi, M., Ito, Y., Kintaka, R., Imamura, C., Katahira, S., Ikeuchi, A., Moriya, H. and Matsuyama, T. (2013) A genome-wide activity assessment of terminator regions in Saccharomyces cerevisiae provides a “terminatome” toolbox. ACS Synth. Biol. 2, 337–347
28 Shiraki, T., Kondo, S., Katayama, S., Waki, K., Kasukawa, T., Kawaji, H., Kodzius, R., Watahiki, A., Nakamura, M., Arakawa, T. et al. (2003) Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc. Natl. Acad. Sci. U.S.A. 100, 15776–15781
29 Zhang, Z. and Dietrich, F.S. (2005) Mapping of transcription start sites in Saccharomyces cerevisiae using 5′ SAGE. Nucleic Acids Res. 33, 2838–2851
30 Choi, W.S., Yan, M., Nusinow, D. and Gralla, J.D. (2002) In vitro transcription and start site selection in Schizosaccharomyces pombe. J. Mol. Biol. 319, 1005–1013
31 Miura, F., Kawaguchi, N., Sese, J., Toyoda, A., Hattori, M., Morishita, S. and Ito, T. (2006) A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proc. Natl. Acad. Sci. U.S.A. 103, 17846–17851
32 Rojas-Duran, M.F. and Gilbert, W.V. (2012) Alternative transcription start site selection leads to large differences in translation activity in yeast. RNA 18, 2299–2305
33 Sehgal, A., Hughes, B.T. and Espenshade, P.J. (2008) Oxygen-dependent, alternative promoter controls translation of tco1+ in fission yeast. Nucleic Acids Res. 36, 2024–2031
34 Yassour, M., Kaplan, T., Fraser, H.B., Levin, J.Z., Pfiffner, J., Adiconis, X., Schroth, G., Luo, S., Khrebtukova, I., Gnirke, A. et al. (2009) Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing. Proc. Natl. Acad. Sci. U.S.A. 106, 3264–3269
35 Hannon, G.J. (2002) RNA interference. Nature 418, 244–251
36 Yamanaka, S., Mehta, S., Reyes-Turcu, F.E., Zhuang, F., Fuchs, R.T., Rong, Y., Robb, G.B. and Grewal, S.I.S. (2013) RNAi triggered by specialized machinery silences developmental genes and retrotransposons. Nature 493, 557–560
37 Halic, M. and Moazed, D. (2010) Dicer-independent primal RNAs trigger RNAi and heterochromatin formation. Cell 140, 504–516
38 Peng, W.-T., Robinson, M.D., Mnaimneh, S., Krogan, N.J., Cagney, G., Morris, Q., Davierwala, A.P., Grigull, J., Yang, X., Zhang, W. et al. (2003) A panoramic view of yeast noncoding RNA processing. Cell 113, 919–933
Received 8 July 2013 doi:10.1042/BST20130144