Aug 5, 2004
Mechanisms and functions of RNA-guided RNA modification

Yi-Tao Yu, Rebecca M. Terns, and Michael P. Terns

Abstract

RNA-guided 2'-O-methylations and pseudouridylations occur in several different types of RNAs and in a wide range of organisms. Hundreds of the RNAs that guide these modifications have been identified, leading to breakthroughs in our understanding of the mechanisms of RNA-guided RNA modifications and, to some extent, the functions of 2'-O-methylated residues and pseudouridines. There are two classes of guide RNAs, namely box C/D and box H/ACA RNAs, which direct 2'-O-methylations and pseudouridylations, respectively. The guide RNAs function primarily by binding to complementary regions in the target RNAs. Cellular guide RNAs exist in RNA-protein complexes composed of one guide RNA and a set of proteins that includes the modifying enzyme (2'-O-methylase or pseudouridylase). We are beginning to understand the basis for the importance of the RNA-guided modifications, which are well conserved and clustered in functionally important regions of RNAs. Recent reports indicate that modified nucleotides in rRNAs and spliceosomal snRNAs contribute to protein synthesis and pre-mRNA splicing, respectively.

1 Introduction

Post-transcriptional modifications occur in a large number of cellular RNAs and are an important component of RNA maturation. Modifications can occur within the base, sugar ring (ribose), or both, and thereby increase the diversity and functional potential of RNAs. In fact, a large collection of naturally occurring modified nucleotides has been identified (Motorin and Grosjean 1998). Importantly, modified nucleotides are, in most cases, conserved from species to species and are often clustered in regions of functional importance within RNAs (Massenet et al. 1998; Ofengand and Fournier 1998; Decatur and Fournier 2002). The fact that modified RNA nucleotides are widespread, conserved and located in strategic locations within RNAs leaves little doubt about their functional relevance. Yet despite intense work over many years, the question of what roles the modified nucleotides play in cellular processes remains largely unanswered.

Pseudouridylation and 2'-O-methylation are the most abundant internal modifications found in stable RNAs, namely tRNAs (Bjork 1995; Grosjean et al. 1995; Auffinger and Westhof 1998; Sprinzl et al. 1998; Hopper and Phizicky 2003), rRNAs (Maden 1990; Bachellerie and Cavaille 1998; Ofengand and Fournier 1998) and spliceosomal snRNAs (some snoRNAs as well) (Reddy and Busch 1988; Massenet et al. 1998). In fact, these are the predominant modifications found in rRNAs and spliceosomal snRNAs. The mammalian rRNAs contain ~100 pseudouridines (Ψ) and ~100 2'-O-methylated residues (Maden 1990; Bachellerie and Cavaille 1998; Ofengand and Fournier 1998), and a total of 30 2'-O-methylated residues and 24 pseudouridines have been reported in the major vertebrate spliceosomal snRNAs (including U1, U2, U4, U5, and U6 snRNAs) (Reddy and Busch 1988; Massenet et al. 1998). 2'-O-methylation and pseudouridylation are also the predominant modifications in U3, a well-characterized snoRNA (Reddy and Busch 1988). While pseudouridylations and 2'-O-methylations may not be the most prevalent modifications in tRNA, they are present in all tRNAs (Bjork 1995; Grosjean et al. 1995; Auffinger and Westhof 1998).

RNA modifications can be categorized as either RNA-dependent or RNA-independent, based on the mechanism by which they are generated. RNA-dependent modifications are introduced by RNA-protein complexes (for example small nucleolar ribonucleoprotein complexes or snoRNPs), in which the RNA component serves as a guide that base-pairs with the target RNA to direct modification at a specific site(s) (Kiss 2001; Bachellerie et al. 2002; Filipowicz and Pogacic 2002; Kiss 2002; Terns and Terns 2002; Decatur and Fournier 2003). RNA-independent modifications are catalyzed by a single protein or a protein complex that recognizes and binds to a specific RNA sequence or structure (Bjork 1995; Alexandrov et al. 2002; Ofengand 2002; Ferre-D'Amare 2003; Ma et al. 2003).
While most RNA base modifications are catalyzed by the RNA-independent (protein only) mechanism, 2'-O-methylations and pseudouridylations are introduced by both RNA-independent and RNA-dependent mechanisms depending on the RNA type and organism. Computational and experimental evidence indicates that 2'-O-methylation and pseudouridylation of eukaryotic and archaeal rRNAs and higher eukaryotic spliceosomal snRNAs are almost exclusively catalyzed by the RNA-dependent mechanism (Dennis et al. 2001; Kiss 2001, 2002; Bachellerie et al. 2002; Terns and Terns 2002; Decatur and Fournier 2003; Omer et al. 2003). 2'-O-methylation of archaeal tRNA is also catalyzed by an RNA-dependent mechanism (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Recent reports suggest that RNA-guided modifications may occur in certain mRNAs as well (Cavaille et al. 2000; Liang et al. 2002). In this review, we focus on RNA-dependent RNA modifications, including RNA-guided pseudouridylation and 2'-O-methylation in various organisms.

2 Discovery of eukaryotic snoRNAs that guide rRNA modifications

The nucleolus of eukaryotic cells harbors, in addition to precursor and mature rRNAs and ribosomal proteins, a huge number of small RNAs (termed small nucleolar or snoRNAs) ranging from ~60 to ~300 nucleotides in length in metazoans and from ~60 to ~600 nucleotides in unicellular organisms. In recent years, scores of snoRNAs have been identified and characterized, allowing the elucidation of common features and mechanisms of action. We now understand that most of the identified snoRNAs function as guides that direct site-specific 2'-O-methylations and pseudouridylations in rRNA (and perhaps other RNA substrates that pass transiently through the nucleolus). The first snoRNA was discovered nearly four decades ago. We have come a long way in our appreciation of the snoRNAs and our understanding of the mechanisms underlying rRNA modifications.

2.1 Early studies of snoRNAs

Research on snoRNAs started in the late 1960s and early 1970s. The first snoRNAs discovered and characterized were a subset that are involved in pre-rRNA processing rather than modification. Characterization of these first snoRNAs provided a foundation for understanding the closely related modification guide RNAs. In 1968, U3 became the first snoRNA to be identified in mammalian cells (Hodnett and Busch 1968). Because of its unusually high abundance (~2 × 10⁵ copies per cell), U3 is readily detectable by denaturing gel electrophoresis of total nuclear RNA. U3 is a relatively large snoRNA (~200 nt in length) that possesses a 5' trimethylguanosine (TMG) cap structure (like the nucleoplasmic spliceosomal snRNAs), associates with the abundant nucleolar protein fibrillarin (Nop1p), a target of autoantibodies, and is essential in yeast (Reddy and Busch 1988; Tollervey et al. 1991). Careful inspection of U3 sequences from various species revealed short conserved sequence elements, two of which [termed boxes C' (UGAUGA/U) and D (CUGA)] are critical for U3 RNA processing, transport, protein association and function (Terns and Terns 2002).
UV cross-linking analysis indicates that U3 binds to the 5' ETS (external transcribed spacer) region of pre-rRNA and is involved in ETS primary cleavage, an early step during pre-rRNA processing (Maser and Calvet 1989; Stroke and Weiner 1989; Kass et al. 1990; Maxwell and Fournier 1995). U3 also contributes to pre-rRNA processing at the ITS1 (the first internal transcribed spacer)-5.8S boundary (Gerbi et al. 1990; Maxwell and Fournier 1995).

It was suspected that there might be additional snoRNAs in the nucleolus that participated in pre-rRNA processing but were not initially detected due to lower abundance. Several techniques were developed to isolate or enrich nucleolar RNAs. For instance, small nucleolar RNAs were separated from other cellular RNAs by nucleolar fractionation, sucrose gradient fractionation, cross-linking or hybridization to rRNAs, and immunoprecipitation using antibodies (or autoantibodies) against the TMG cap structure or fibrillarin (Maxwell and Martin 1986; Trinh-Rohlik and Maxwell 1988; Tyc and Steitz 1989; Ruff et al. 1993). Various approaches, employed independently or in combination and coupled with denaturing gel electrophoresis and sequencing, yielded fruitful results. By 1995, many new snoRNAs had been identified in mammals, including U8, U13, U14–U24, MRP-7.2 RNA, E2 and E3 (Maxwell and Fournier 1995). Indeed, these snoRNAs are relatively low in abundance, with U8 and U13 (both 5'-TMG capped) being present at ~4 × 10⁴ copies per cell and the others at roughly 10³–10⁴ copies per cell (Maxwell and Fournier 1995; Yu et al. 1999). Biochemical approaches were also used to identify small nucleolar RNAs in yeast (Wise et al. 1983; Zagorski et al. 1988; Li et al. 1990; Balakin et al. 1993). For instance, Wise et al. (1983) used semi-denaturing two-dimensional gel electrophoresis followed by 5' cap labeling (decapping followed by recapping with [α-³²P]GTP) to identify 5' TMG-capped snoRNAs in yeast.

It quickly became clear that the small RNAs found in nucleoli were a disparate group. Only a few of the newly identified snoRNAs (U8, U14, U22, snR30, and 7.2/MRP) were shown to be involved in pre-rRNA processing; most of the RNAs were not essential and their functions were unknown (Maxwell and Fournier 1995). Unlike U3, most mammalian snoRNAs, including U14–U24, MRP-7.2 RNA, E1, E2, and E3, do not possess a 5' TMG cap. In addition, while some of the new RNAs shared conserved sequence elements box C and box D with U3, others (mammalian TMG-minus snoRNAs U17/E1, E2, E3, U19 and U23, and a large number of yeast snoRNAs, both TMG-capped and TMG-minus) do not contain a box C/D motif and do not associate with fibrillarin (or Nop1p in yeast) (Maxwell and Fournier 1995).

2.2 Two classes of snoRNAs: box C/D and box H/ACA snoRNAs

In an effort to classify the snoRNAs, the Fournier group conducted a comparative analysis that identified conserved sequences and structural elements common to snoRNAs lacking boxes C and D (Balakin et al. 1996). They showed that, except for MRP-7.2 RNA, all human and yeast non-C/D snoRNAs possess a common predicted secondary structure of two or more stem-loops separated and flanked by single-stranded regions (Balakin et al. 1996; Ni et al. 1997). The structure of this class of RNAs was later refined to the current “hairpin-hinge-hairpin-tail” model (Ganot et al. 1997a) (Fig. 1).
Another defining feature, the trinucleotide sequence ACA, was found in the 3' single-stranded region, three nucleotides away from the 3' terminus (Balakin et al. 1996; Ganot et al. 1997a; Ni et al. 1997). In addition, there was a variant ACA sequence (ANANNA), termed the H box, located in the hinge region between the two hairpins. The H and ACA boxes were found to be essential for RNA stability and for binding of the nucleolar protein Gar1p (Balakin et al. 1996; Ganot et al. 1997a, 1997b). Thus, with the exception of the MRP-7.2 RNA, all known snoRNAs were classified into two major families: box C/D snoRNAs that associate with fibrillarin and box H/ACA snoRNAs that bind Gar1p (Balakin et al. 1996) (Fig. 1).

Fig. 1. RNA-guided RNA 2'-O-methylation and pseudouridylation. Box C/D and box H/ACA snoRNAs guide 2'-O-methylation and pseudouridylation, respectively, by binding to complementary regions in target RNAs. Boxes C, D, C', D', H, and ACA are shown. 2'OMe represents the target 2'-O-methylation site that is always the fifth nucleotide from box D or D', and Ψ is the target pseudouridylation site that is always left unpaired in the pseudouridylation pocket. N, any nucleotide.

2.3 Discovery that box C/D snoRNAs guide rRNA 2'-O-methylation

A great deal of effort was devoted to determining the function of the rapidly increasing number of newly identified snoRNAs. Sequence inspection revealed significant complementarity between box C/D snoRNAs and rRNAs, suggesting that snoRNAs might function as chaperones for ribosome biogenesis (Bachellerie et al. 1995; Steitz and Tycowski 1995). A correlation between the locations of 2'-O-methylated residues in rRNAs and regions of snoRNA-rRNA complementarity was also noted (Bachellerie et al. 1995). Moreover, it was found that mutations in Nop1p, the yeast fibrillarin homologue, greatly reduced 2'-O-methylation of rRNA (Tollervey et al. 1993). These observations led to the hypothesis that box C/D snoRNAs function as guides that direct 2'-O-methylation of rRNA (Bachellerie et al. 1995).

The experimental evidence arrived soon after the hypothesis was proposed. In 1996, it was discovered that rRNA 2'-O-methylation always occurs in the residue base-paired to the nucleotide in snoRNA precisely 5 nucleotides upstream from box D (or D') (Fig. 1) (Cavaille et al. 1996; Kiss-Laszlo et al. 1996). In addition, it was shown that deletion of a particular box C/D snoRNA resulted in loss of 2'-O-methylation at the target site in rRNA in yeast. Importantly, site-specific methylation could be restored upon reintroduction of the box C/D snoRNA into the deletion strain, and modification of novel target sites could be directed by introduction of snoRNAs with appropriate guide sequences (Cavaille et al. 1996). Thus, the “Box D +5 rule” for prediction of the site of 2'-O-methylation guided by snoRNAs was established in 1996, and has since been confirmed in various organisms including Xenopus and human, suggesting that RNA-guided 2'-O-methylation of rRNA is universal among eukaryotes (Peculis 1997; Smith and Steitz 1997; Kiss 2001, 2002).
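As a purely illustrative sketch (not taken from any of the studies cited above), the Box D +5 rule can be written as a short Python function. The function names, the fixed 10-nt guide length, the assumption that the guide-target duplex extends right up to box D, the restriction to perfect Watson-Crick pairing, and the toy sequences are all assumptions made here for illustration; real guide duplexes vary in length and tolerate G-U pairs and mismatches.

```python
# Illustrative sketch of the "Box D +5 rule"; names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def predict_methylation_site(sno_rna, box_d_start, target_rna, guide_len=10):
    """Predict the 2'-O-methylated nucleotide in target_rna.

    Assumes the guide is the guide_len nucleotides immediately upstream of
    box D (0-based index box_d_start) and pairs perfectly with the target.
    Returns (1-based position, nucleotide), or None if no duplex is found.
    """
    if guide_len < 5 or box_d_start < guide_len:
        return None
    guide = sno_rna[box_d_start - guide_len:box_d_start]
    # The target strand runs antiparallel to the guide, so the paired
    # target segment is the reverse complement of the guide.
    hit = target_rna.find(reverse_complement(guide))
    if hit == -1:
        return None
    # target_rna[hit] pairs with the snoRNA nucleotide 1 nt upstream of box D,
    # so the nucleotide paired 5 nt upstream lies 4 positions further 3'.
    site = hit + 4
    return site + 1, target_rna[site]


# Toy sequences (not real snoRNA or rRNA sequences).
TOY_SNORNA = "GCAUGAUGAUUAAGCUAUCGGCUGAGC"  # box C ... guide ... box D (CUGA)
TOY_TARGET = "GGAUCCGAUAGCUUAACGG"

result = predict_methylation_site(TOY_SNORNA, TOY_SNORNA.rfind("CUGA"), TOY_TARGET)
print(result)  # (9, 'U')
```

In this toy case the snoRNA nucleotide five positions upstream of box D is an A, and it pairs with the U at position 9 of the target, which is therefore the predicted methylation site.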

2.4 Discovery that box H/ACA snoRNAs guide rRNA pseudouridylation

The determination of the function of the box H/ACA snoRNAs was more challenging (due to the lack of extensive contiguous stretches of complementarity to rRNAs). However, psoralen cross-linking (which favors the detection of base-pairing interactions) had generated cross-links between box H/ACA snoRNAs and rRNAs (Rimoldi et al. 1993), suggesting a base-pairing interaction between rRNA and box H/ACA snoRNAs. Inspired by the work linking box C/D snoRNAs to rRNA 2'-O-methylation, the Fournier group and the Kiss group soon demonstrated that box H/ACA snoRNAs function as guides that direct rRNA pseudouridylation, the other major type of rRNA modification for which no mechanism had been ascribed (Ganot et al. 1997a; Ni et al. 1997). The guide sequences in box H/ACA RNAs are found in two segments in the linear RNA sequence that are brought together in internal loops within the hairpins (Fig. 1). Base-pairing between the bipartite guide sequence and the rRNA positions the target uridine at the base of the upper stem of the hairpin, leaving it unpaired within the so-called "pseudouridylation pocket" and located about 14–16 nucleotides upstream of box H or box ACA (Fig. 1). The snoRNA-guided pseudouridylation mechanism has been tested and verified in various systems (Ganot et al. 1997a; Ni et al. 1997; Jady and Kiss 2001; Zhao et al. 2002).

2.5 Toward identification of complete sets of rRNA modification guide snoRNAs

The recognition that snoRNAs serve as guides for 2'-O-methylation and pseudouridylation constituted a major step in understanding the rRNA modification puzzle. Still, the number of snoRNAs identified at that time could not account for the number of modified nucleotides known to exist in rRNAs. This fact has prompted large-scale searches for box C/D and box H/ACA snoRNAs, and development of powerful new approaches for identifying small non-coding RNAs.
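The sequence landmarks that define box H/ACA RNAs (section 2.2) give a sense of what such searches look for. The following minimal Python sketch checks for an ACA triplet three nucleotides from the 3' end and lists candidate H boxes matching ANANNA. The function names and the toy sequence are hypothetical, and the sketch ignores the hairpin structures and bipartite guide sequences that a realistic search would also have to model.

```python
import re

# H box consensus ANANNA; the lookahead lets overlapping candidates be reported.
H_BOX = re.compile(r"(?=(A[ACGU]A[ACGU][ACGU]A))")


def has_aca_box(rna):
    """True if ACA sits exactly three nucleotides upstream of the 3' end."""
    return len(rna) >= 6 and rna[-6:-3] == "ACA"


def h_box_candidates(rna):
    """Return (0-based start, sequence) for every ANANNA match."""
    return [(m.start(1), m.group(1)) for m in H_BOX.finditer(rna)]


# Toy sequence (not a real snoRNA): hairpin, hinge with an H box,
# hairpin, ACA box, 3-nt tail.
TOY_HACA = "GGCUUCGGCC" + "AGAGCA" + "GGCUUCGGCC" + "ACA" + "UUU"

print(has_aca_box(TOY_HACA))       # True
print(h_box_candidates(TOY_HACA))  # [(10, 'AGAGCA')]
```

Because ANANNA is so short and degenerate, a pattern match of this kind produces many false positives on real sequences; it illustrates the motif definition only.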
Bioinformatic approaches have been particularly productive in the identification of new box C/D snoRNAs, and more recently also box H/ACA RNAs. The finding that box C/D snoRNAs guide rRNA 2'-O-methylation allowed the development of computer algorithms that were designed to search genome sequences for patterns of conserved sequence elements (e.g. box C and box D) and complementarities to rRNA (Lowe and Eddy 1999; Samarsky and Fournier 1999; Brown et al. 2003b). Computer-assisted searches have identified a large number of putative 2'-O-methylation guides in various eukaryotic organisms (Lowe and Eddy 1999; Qu et al. 1999; Samarsky and Fournier 1999; Barneche et al. 2001; Brown et al. 2001, 2003b; Accardo et al. 2004). In yeast in particular, almost all the box C/D snoRNAs necessary to account for the known rRNA 2'-O-methylations (51/55) have been identified and confirmed experimentally (Lowe and Eddy 1999; Samarsky and Fournier 1999; Brown et al. 2003b). The computational identification of box H/ACA snoRNAs is more difficult (due to the bipartite nature of the guide sequence and shorter conserved sequence elements), but was very recently accomplished (Schattner et al. 2004; Huang et al. 2004) and brought the number of yeast rRNA pseudouridines with corresponding H/ACA guide RNAs up to 41 (of 44 known) (Schattner et al. 2004).

Powerful experimental approaches have also been applied to snoRNA identification, including yeast genome analysis followed by Northern blotting (Olivas et al. 1997) and construction and analysis of cDNA libraries of selected cellular RNAs, an approach used in identification of some of the original snoRNAs and now referred to as RNomics (Bachellerie and Cavaille 1997; Huttenhofer et al. 2001, 2004; Kiss and Jady 2004). The “RNomics” approach has produced huge numbers of new box C/D and box H/ACA snoRNAs in various organisms (Dunbar et al. 2000; Gaspin et al. 2000; Omer et al. 2000; Huttenhofer et al. 2001; Qu et al. 2001; Kiss 2002; Marker et al. 2002; Tang et al. 2002; Vitali et al. 2003; Yuan et al. 2003; Kiss et al. 2004). Most of the identified snoRNAs are predicted to guide rRNA modification, though the experimental approach (more so than the computational approach, which has largely selected for complementarity to rRNA) also produces snoRNAs that are thought to target other cellular RNAs (see below).

The combined approaches appear to have yielded nearly complete sets of rRNA modification guide snoRNAs not only in yeast but also in human, where at least 100 of ~105–107 known 2'-O-methylation sites and 79 of ~97 pseudouridylation sites in rRNA are accounted for with snoRNAs (Bachellerie et al. 2002; Huttenhofer et al. 2002; Vitali et al. 2003; Kiss et al. 2004). While it is possible that some of the remaining modifications might be catalyzed by RNA-independent mechanisms, the fact that all but one (see Chapter 9 in this volume) of the eukaryotic rRNA modifications characterized thus far are guided by snoRNAs argues that there are a few additional snoRNAs to be identified.
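To make the logic of the computer-assisted searches for box C/D guides more concrete, the following Python sketch scans a sequence for a box C motif followed by a box D motif and keeps only candidates whose upstream guide is complementary to a supplied rRNA. It is a deliberately naive illustration and not the algorithm of any program cited above: the exact-match motifs (UGAUGA and CUGA, as quoted in section 2.1), the 10-nt guide, the length limits, the function name and the toy sequences are all assumptions made here.

```python
import re

BOX_C = re.compile(r"(?=UGAUGA)")  # assumed exact-match box C motif
BOX_D = "CUGA"                     # box D motif
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def scan_box_cd(seq, rrna, min_len=20, max_len=300, guide_len=10):
    """Return (box C start, box D start, guide) for naive box C/D candidates.

    A candidate needs box C, then box D within max_len nucleotides, and a
    guide_len stretch directly upstream of box D whose reverse complement
    occurs in rrna (perfect Watson-Crick pairing only).
    """
    hits = []
    for match in BOX_C.finditer(seq):
        c_start = match.start()
        c_end = c_start + 6
        d_start = seq.find(BOX_D, c_end)
        while d_start != -1 and d_start + 4 - c_start <= max_len:
            long_enough = d_start + 4 - c_start >= min_len
            room_for_guide = d_start - guide_len >= c_end
            if long_enough and room_for_guide:
                guide = seq[d_start - guide_len:d_start]
                paired = guide.translate(COMPLEMENT)[::-1]
                if paired in rrna:
                    hits.append((c_start, d_start, guide))
            d_start = seq.find(BOX_D, d_start + 1)
    return hits


# Toy sequences (not real genomic or rRNA sequences).
TOY_LOCUS = "GCAUGAUGAUUAAGCUAUCGGCUGAGC"
TOY_RRNA = "GGAUCCGAUAGCUUAACGG"

print(scan_box_cd(TOY_LOCUS, TOY_RRNA))  # [(3, 21, 'AAGCUAUCGG')]
```

The default `min_len` is set low only so that the toy example passes; the published search programs use more tolerant motif models and scoring rather than exact string matches.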

3 RNAs also guide the pseudouridylation and 2'-O-methylation of snRNAs

It has long been known that the spliceosomal snRNAs contain a large number of modified nucleotides (Fig. 2), yet research on the modification mechanisms did not begin until shortly after the discovery of snoRNA-guided rRNA modifications. Given that both snRNAs and rRNAs are extensively modified by 2'-O-methylation and pseudouridylation, it was reasonable to suspect that, like rRNA modification, spliceosomal snRNA modifications are catalyzed by a snoRNA-guided mechanism as well.

In 1998, U6 snRNA became the first spliceosomal snRNA for which a snoRNA-guided modification mechanism was reported (Tycowski et al. 1998). Taking advantage of conserved elements identified in RNAs that guide rRNA 2'-O-methylation (i.e., the C/D boxes and the guide sequence complementary to the target RNA across the modified residue), Tycowski et al. (1998) searched available databases and identified two possible box C/D snoRNA guides (mgU6-47 and mgU6-77) in various organisms that might be responsible for U6 snRNA 2'-O-methylation. Using the Xenopus oocyte reconstitution system, they showed that depletion of the putative guide RNAs completely abolished U6 2'-O-methylation at the predicted sites. Moreover, site-specific 2'-O-methylation was rescued upon injection of the in vitro transcribed guide RNA into depleted oocytes. Interestingly, one of the box C/D snoRNAs (mgU6-77) exhibited dual substrate specificity, guiding 2'-O-methylation at position 2970 in 28S rRNA as well as position 77 in U6 (Tycowski et al. 1998). Later, another U6 2'-O-methylation guide (mgU6-53) was identified in human, and snoRNA-guided modification was further demonstrated for mammalian U6 (Ganot et al. 1999).

Fig. 2. Pseudouridines and 2'-O-methylated residues in human spliceosomal snRNAs. Primary and secondary structures of human spliceosomal snRNAs are shown. 2'-O-methylated nucleotides are indicated by circles, and pseudouridines (Ψ) are indicated by rectangles. The thick lines denote the nucleotides involved in RNA-RNA interactions or implicated in catalysis during pre-mRNA splicing. The gray boxes indicate the Sm-binding sites. The 5' caps (2,2,7-trimethylguanosine cap for U1, U2, U4, and U5, and γ-methylated guanosine cap for U6) are also shown.

It was unclear whether the guide mechanism for U6 2'-O-methylation applied to the other spliceosomal snRNA modifications as well, since U6 differs from all the other major spliceosomal snRNAs (U1, U2, U4, and U5) in many ways (Yu et al. 1999). For instance, U6 is transcribed by RNA polymerase III (Pol III), whereas the other spliceosomal snRNAs are RNA polymerase II (Pol II) transcripts. U6 contains a γ-methyl cap and does not bind to Sm core proteins, whereas the other snRNAs possess a TMG cap and bind tightly to Sm proteins. Finally, whereas U6 probably never leaves the nucleus, the other snRNAs travel through the cytoplasm during their biogenesis.

Indications that modification of the Pol II–transcribed spliceosomal snRNAs might also be catalyzed by an RNA-guided mechanism came from the identification of a novel RNA (U85) in human and fruit fly that contains both box C/D and box H/ACA motifs (Jady and Kiss 2001). Careful inspection of the RNA revealed sequences complementary to U5 snRNA, which could position C45 and U46 in U5 for 2'-O-methylation and pseudouridylation, respectively. Indeed, this prediction was confirmed by modification experiments both in vivo and in vitro (Jady and Kiss 2001).
Soon after this report, three more such "hybrid" guide RNAs (U87, U88, and U89) were identified and predicted to guide 2'-O-methylation of U4 and U5 as well as pseudouridylation of U5 (Darzacq et al. 2002). Interestingly, some of these RNAs appeared to have overlapping guide functions that could target the same nucleotide for modification, suggesting redundant modes of modification at certain sites. However, as more spliceosomal snRNA–specific box C/D and box H/ACA guide RNAs were discovered, not all exhibited the hybrid composition of U85, U87, U88, and U89. Instead, most of them fell into either the box C/D or box H/ACA category (Kiss et al. 2004). One exception was U93, which appeared to combine two box H/ACA domains, resulting in four hairpins instead of two (Kiss et al. 2002). One of the hairpins was predicted to guide the formation of Ψ54 in U2.

While one H/ACA guide RNA that directs U2 pseudouridylation at two different sites in the branch site recognition region appears to reside within the nucleoplasm of Xenopus oocytes (Zhao et al. 2002), most if not all the other guide RNAs directing spliceosomal snRNA modifications are localized to Cajal bodies (Darzacq et al. 2002). These guide RNAs are therefore designated scaRNAs, for small Cajal body-specific RNAs (Darzacq et al. 2002).

Data accumulated thus far from several labs (Tycowski et al. 1998; Ganot et al. 1999; Huttenhofer et al. 2001; Jady and Kiss 2001; Darzacq et al. 2002; Zhao et al. 2002; Vitali et al. 2003; Kiss et al. 2002, 2004) suggest that at least 28 spliceosomal snRNA–specific guide RNAs have been identified in various organisms (e.g. human, mouse, Xenopus, fruit fly), although some of these guide RNAs constitute homologs among the different organisms. These RNAs have been proven (Tycowski et al. 1998; Jady and Kiss 2001; Zhao et al. 2002) or predicted (Ganot et al. 1999; Huttenhofer et al. 2001; Darzacq et al. 2002; Vitali et al. 2003; Kiss et al. 2002, 2004) to guide either 2'-O-methylation at 13 of 30 known sites or pseudouridylation at 12 of 24 known sites in the five spliceosomal snRNAs (U1, U2, U4, U5, and U6). In contrast to the data accumulated for rRNA modifications, only a fraction of spliceosomal snRNA modification sites have been accounted for. Thus, the challenge remains to identify the guide RNAs for the remaining sites. However, it is possible that some of the modifications may be catalyzed by an RNA-independent mechanism. In this regard, at least two (Ψ35 and Ψ44) of the three pseudouridylation sites in yeast U2 are generated by single-polypeptide enzymes, Pus7p and Pus1p, respectively (Massenet et al. 1999; Ma et al. 2003). The continued search for small guide RNAs will undoubtedly clarify this issue and further our understanding of the general mechanisms underlying spliceosomal snRNA modifications.

4 sno/scaRNAs may also guide mRNA modifications

Experimental RNomics of mouse brain cells identified four brain-specific snoRNAs, including three box C/D and one box H/ACA snoRNAs (Cavaille et al. 2000; Filipowicz 2000). Interestingly, the three box C/D snoRNA genes, two of which are tandemly repeated multiple times, are located within a chromosomal region implicated in Prader-Willi syndrome (PWS), a neurogenetic disease caused by deficient paternal gene expression. Because these snoRNAs have not been detected in PWS patients or in a PWS mouse model, the expression of the brain-specific snoRNAs is believed to be paternally imprinted (Cavaille et al. 2000). Although sequence inspection revealed no target sites in rRNAs or in the spliceosomal snRNAs, an 18-nucleotide region in one of the box C/D snoRNAs is complementary to the mRNA for the brain-specific serotonin 2C receptor, suggesting a role for this snoRNA in mRNA 2'-O-methylation and processing/function (Cavaille et al. 2000). However, before the connection can be established, it is important to determine whether the predicted site in the mRNA is indeed 2'-O-methylated and where within the cell the box C/D snoRNA is localized. Given that the other three brain-specific snoRNAs exhibit no complementarity to rRNAs or spliceosomal snRNAs, they may also target mRNAs that are yet to be determined. In this regard, many snoRNAs identified by experimental RNomics lack target sites in rRNAs and snRNAs as well, and thus, it is of great interest to determine whether they have target sites in mRNAs (Bachellerie et al. 2002).

SLA-1 was originally identified as the spliced leader (SL)-associated RNA in trypanosomes (Watkins et al. 1994). However, it has been reported that SLA-1 in fact shares characteristics of eukaryotic box H/ACA snoRNAs (Liang et al. 2002). The presumed bipartite guide sequences in the pseudouridylation pocket exhibit complementarity to the SL RNA, perfectly positioning a conserved SL uridine in the pseudouridylation pocket for pseudouridylation. Pseudouridylation mapping using CMC modification followed by primer-extension confirmed that the predicted target residue is indeed a pseudouridine. Mutagenesis analysis further demonstrated that the formation of pseudouridine at this position in SL RNA is dependent on the intact guide sequences in the box H/ACA-like RNA (Liang et al. 2002). Because the SL sequence is donated to the 3' exon during trans-splicing of pre-mRNA, the newly generated mature mRNAs in trypanosomes inherit a pseudouridine. In this sense, the box H/ACA RNA indirectly targets mRNA for pseudouridylation. Modified nucleotide(s) in mRNAs could play an important role in mRNA processing, transport, stability, or protein translation.

5 Small RNA–guided RNA modification of rRNA and tRNA in archaea

The presence of homologs of the proteins that function with box C/D and box H/ACA RNAs in archaea predicted the presence of homologous RNAs and RNA-guided rRNA modification in this domain of life (Lafontaine and Tollervey 1998; Watanabe and Gray 2000). In 2000, Omer et al. co-precipitated a number of box C/D RNAs using antibodies against the Sulfolobus solfataricus homologs of fibrillarin and Nop56/58 (another box C/D snoRNA–specific protein) (Omer et al. 2000). The computational algorithm developed for identification of yeast box C/D RNAs (Lowe and Eddy 1999) was re-trained to recognize the more compact archaeal RNAs and used to search archaeal genomes for candidate box C/D RNAs (Omer et al. 2000). In the end, over 200 potential RNAs were identified in 7 archaeal genomes (Omer et al. 2000). At the same time, the Bachellerie laboratory also identified and characterized predicted box C/D RNAs in the three Pyrococcus genomes (Gaspin et al. 2000). The existence of the computationally predicted RNAs and of modifications at predicted target sites has been confirmed in many cases (Gaspin et al. 2000; Omer et al. 2000). These results suggest that RNA-guided rRNA 2'-O-methylation is an ancient mechanism that functions not only in eukaryotes but also in archaea, which lack a nucleus. When archaeal box C/D RNAs are injected into a eukaryotic nucleus (Xenopus oocyte), the RNAs localize to the eukaryotic nucleolus, interact with the eukaryotic box C/D RNP proteins and guide rRNA 2'-O-methylation in a site-specific manner, indicating the conservation of the essential features of this class of RNAs over 2000 million years of divergent evolution (Speckmann et al. 2002). Interestingly, the number of 2'-O-methylations and sRNAs identified in archaeal species increases with increasing optimum growth temperature (Noon et al. 1998; Dennis et al. 2001; Omer et al. 2003). This correlation may reflect the function of rRNA 2'-O-methylation (see section 9).

With the identification of the box C/D RNAs in archaea, another target for RNA-guided RNA modification was uncovered. Some of the newly identified archaeal box C/D RNAs were found to contain complementarity to tRNA (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Currently, as many as 21 predicted box C/D RNAs are thought to target a total of 23 different sites in tRNAs (or pre-tRNAs) (Omer et al. 2000; Clouet d'Orval et al. 2001; Dennis et al. 2001). Some of the RNAs have the potential to modify a site that is common to multiple (up to 19) tRNAs (http://rna.wustl.edu/snoRNAdb/). The most frequently targeted site is the wobble position within the anticodon loop of tRNAs (position 34) (http://rna.wustl.edu/snoRNAdb/).

Box H/ACA RNAs that guide pseudouridylation of rRNA were identified in archaea by RNomics as well as computational approaches (Klein et al. 2002; Tang et al. 2002; Rozhdestvensky et al. 2003). Unlike most eukaryotic box H/ACA snoRNAs that have two hairpin structures, archaeal box H/ACA RNAs can have one, two, or three hairpin structures, each containing a pseudouridylation pocket (Tang et al. 2002; Rozhdestvensky et al. 2003). Five of six predicted rRNA target sites in Archaeoglobus fulgidus have been experimentally confirmed (Tang et al. 2002; Rozhdestvensky et al. 2003). Thus, it appears that rRNA pseudouridylation is also carried out by an RNA-guided modification system in this domain of life.

6 Gene organization and biosynthesis of snoRNAs

As more vertebrate snoRNAs have been identified, it has become clear that only a small fraction is transcribed from independent promoters (and contains a 5' TMG cap). Most of the vertebrate snoRNAs contain a 5' monophosphate group (Maxwell and Fournier 1995; Yu et al. 1999) and are encoded within introns of protein coding genes (Maxwell and Fournier 1995; Tycowski and Steitz 2001). Mouse U14 was the first such intron-encoded snoRNA to be identified (Liu and Maxwell 1990). U14 is positioned within an intron of the mouse hsc70 heat shock gene, and its production is coupled with hsc70 pre-mRNA processing (Leverette et al. 1992). Soon after this finding, many snoRNAs were identified within the introns of various host genes, thus accounting for a large number of the cap-minus snoRNAs (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). In general, in vertebrates a single snoRNA gene is found within an intron; however, in other organisms, including Drosophila and rice, clusters of snoRNA genes have been reported within introns (Chen et al. 2003; Huang et al. 2004). Two pathways for processing intron-encoded snoRNAs have been reported in vertebrates (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). In the major pathway, the intron is first spliced out of the host gene pre-mRNA in the form of a lariat. Following debranching of the lariat intron, exonucleolytic trimming of the linear RNA produces a snoRNA with mature 5' and 3' termini. In the minor pathway (e.g. for U16 and U18 in Xenopus and also for U18 in yeast), prototypical splicing does not occur. Instead, endonucleolytic activities initiate cleavages at sites upstream and downstream of snoRNA regions, generating linear snoRNA-containing fragments that are then trimmed from both ends to generate mature snoRNAs (Caffarelli et al. 1994, 1996; Villa et al. 1998).
In yeast and Arabidopsis, while a few snoRNAs are intron-encoded, most snoRNAs are independently transcribed as mono-, di-, or polycistronic precursors

that then mature through a series of endo- and exonucleolytic cleavages (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002; Brown et al. 2003a). In Arabidopsis, most snoRNA genes, including both box C/D and box H/ACA snoRNA genes, are organized into gene clusters scattered over the chromosomes (Brown et al. 2003a). Interestingly, most snoRNA host genes encode proteins involved in nucleolar function, ribosome biogenesis or structure, or protein translation (Tycowski and Steitz 2001; Filipowicz and Pogacic 2002). This gene organization may imply that a coordinated regulatory mechanism is imposed on the synthesis of rRNA, ribosomal proteins and proteins of the translational apparatus. Remarkably, in vertebrates and fruit flies, some snoRNA host genes appear to serve purely as a source of snoRNAs (Tycowski and Steitz 2001; Terns and Terns 2002). In these cases, the ligated exons do not appear to have protein coding potential (Tycowski et al. 1996; Bortolin and Kiss 1998; Pelczar and Filipowicz 1998; Smith and Steitz 1998). Another interesting observation is that most if not all vertebrate and fly snoRNA host genes, including protein-coding and non-protein-coding genes, belong to the TOP (terminal oligopyrimidine) family, which represents a group of housekeeping genes that are constitutively transcribed (Pelczar and Filipowicz 1998; Smith and Steitz 1998). The fact that all snoRNAs reside in introns of TOP genes suggests that snoRNAs are produced in a coordinated manner, perhaps at the level of RNA transcription. Much less is known about the biogenesis of box C/D and box H/ACA RNAs in archaea, though it does appear likely that the biogenic pathways will be novel in these organisms. In archaea, sRNA genes are generally not clustered and are typically positioned in the regions between protein coding genes, sometimes overlapping the 5’ or 3’ end of a predicted ORF (Gaspin et al. 2000; Omer et al. 2000). 
One box C/D RNA is unique in its location within the intron of the tRNATrp gene in several organisms, and is likely processed out via the bulge-helix-bulge processing pathway that removes tRNA introns in archaea (Nieuwlandt et al. 1993). Interestingly, this box C/D RNA directs 2'-O-methylation at positions 34 and 39 of its host tRNA either via a cis (Clouet d'Orval et al. 2001) or trans mechanism (Singh et al. 2004). In P. furiosus, it appears that most (or all) box C/D RNAs exist in circular as well as linear forms (Starostina et al. 2004). Since none of the eukaryotic snoRNAs are known to exist as circular RNAs at any point in biogenesis (and indeed very few single-stranded circular RNAs are known to exist in biological systems), these findings imply the existence of a novel pathway for the biogenesis of box C/D RNAs in archaea.

7 Modification guide RNAs function as RNA-protein complexes

The box C/D and box H/ACA guide RNAs establish sites for RNA modification by base-pairing with target RNAs as described above. However, the modifications are executed by protein enzymes that are part of a core set of proteins specifically

associated with each family of guide RNAs. The box C/D and box H/ACA RNPs are comprised of three or four core proteins and a guide RNA. A focus of current research is to understand the roles of the proteins within the RNPs. The structure and function of the RNPs are being investigated in archaea as well as eukaryotes, where the RNPs are fundamentally similar and the distinctions that exist broaden our understanding of the modification guide RNP system. Differences in the protein composition of the RNPs found in archaea and eukaryotes seem to reflect gene duplications that allowed an increase in the complexity and specialization of the RNPs in eukaryotes.

7.1 Protein components of methylation guide RNPs

Individual box C/D RNAs direct a set of common proteins to sites of modification. In eukaryotes, the set is comprised of four proteins: fibrillarin, 15.5 kDa, Nop56, and Nop58 (Fig. 3). In archaea, there is a homologous set of three core proteins: fibrillarin, L7Ae and Nop56/58 (Fig. 3). Fibrillarin (Nop1p in yeast) is the methyltransferase, found in both eukaryotic and archaeal box C/D RNPs (Ochs et al. 1985; Galardi et al. 2002; Omer et al. 2002). The second core component of the box C/D RNP is an RNA binding protein: 15.5 kDa in higher eukaryotes, Snu13p in yeast, or L7Ae in archaea (Watkins et al. 2000; Kuhn et al. 2002). This protein interacts directly with the kink-turn (K-turn), an RNA motif formed by the signature sequences of the box C/D RNAs, box C and box D (Watkins et al. 2000; Klein et al. 2001; Kuhn et al. 2002). The core box C/D RNP is completed by Nop56/58 in archaea, or two paralogues, Nop56 and Nop58 (the latter also called Nop5p in yeast), in eukaryotes (Gautier et al. 1997; Lafontaine and Tollervey 1999, 2000).

7.2 Protein components of pseudouridylation guide RNPs

The core box H/ACA RNP is comprised of a set of four proteins: Cbf5, Gar1, Nhp2 (L7Ae in archaea), and Nop10 (Henras et al. 1998; Watkins et al. 1998; Dragon et al. 2000; Pogacic et al.
2000; Watanabe and Gray 2000; Rozhdestvensky et al. 2003; Wang and Meier 2004) (Fig. 3). The conversion of uridine to pseudouridine by box H/ACA RNPs is very likely catalyzed by Cbf5, a core component that strongly resembles other pseudouridine synthases (Koonin 1996; Zebarjadian et al. 1999). In humans, this protein is called dyskerin, and mutation of the dyskerin gene results in X-linked dyskeratosis congenita (DC) (Heiss et al. 1998). DC patients typically have abnormal skin pigmentation and nail dystrophy, and often develop life-threatening bone marrow failure and epithelial cancers (Marrone and Mason 2003; Meier 2003). In vertebrates, the RNA component of telomerase, the enzyme involved in telomere length maintenance, also contains an H/ACA motif (Mitchell et al. 1999a; Chen et al. 2000; Lukowiak et al. 2001; Jady et al. 2004; Zhu et al. 2004) and the four H/ACA RNP proteins are also core components of telomerase (Mitchell et al. 1999b; Dragon et al. 2000; Pogacic et al. 2000; Wang and Meier 2004).

Fig. 3. Protein components of RNA-guided RNA modification RNPs in Eukarya and Archaea. Homologous protein components of box C/D and box H/ACA RNPs in eukaryotes and archaea are shown. In eukaryotes, box C/D RNAs are associated with a set of four common proteins: fibrillarin (or Nop1p), 15.5 kD (or Snu13p), Nop56 and Nop58 (or Nop5p). The eukaryotic box H/ACA RNP also contains four common proteins: Cbf5 (or dyskerin or Nap57), Nhp2, Nop10, and Gar1. Archaeal homologs of the same name exist for fibrillarin, Cbf5, Nop10, and Gar1. Nop56 and Nop58 are related eukaryotic proteins with a single homolog in archaea that is equally similar to the two eukaryotic proteins and is thus called Nop56/58. L7Ae is homologous to three related eukaryotic proteins, 15.5 kD and Nhp2 as well as L7a, and is thought to be a component of box C/D and box H/ACA RNPs as well as the ribosome in archaea. In eubacteria, individual (or small numbers of) nucleotide modifications are catalyzed by dedicated protein enzymes.
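
The homology relationships summarized in Fig. 3 can also be written down as a small lookup table. The Python sketch below records the core protein sets and the eukaryote-to-archaea correspondences as stated in the text and the figure legend; the variable and function names are illustrative assumptions, and the table is a reading aid rather than a curated database.

```python
# Core protein sets as listed in the text and the legend of Fig. 3.
CORE_PROTEINS = {
    "box C/D": {
        "eukaryotes": ["fibrillarin", "15.5 kDa", "Nop56", "Nop58"],
        "archaea": ["fibrillarin", "L7Ae", "Nop56/58"],
    },
    "box H/ACA": {
        "eukaryotes": ["Cbf5", "Nhp2", "Nop10", "Gar1"],
        "archaea": ["Cbf5", "L7Ae", "Nop10", "Gar1"],
    },
}

# Eukaryotic protein -> archaeal homolog (L7a is the ribosomal relative of L7Ae).
ARCHAEAL_HOMOLOG = {
    "fibrillarin": "fibrillarin",
    "15.5 kDa": "L7Ae",
    "Nop56": "Nop56/58",
    "Nop58": "Nop56/58",
    "Cbf5": "Cbf5",
    "Nhp2": "L7Ae",
    "Nop10": "Nop10",
    "Gar1": "Gar1",
    "L7a": "L7Ae",
}


def shared_archaeal_proteins():
    """Return the archaeal proteins found in both classes of guide RNP."""
    cd = set(CORE_PROTEINS["box C/D"]["archaea"])
    haca = set(CORE_PROTEINS["box H/ACA"]["archaea"])
    return cd & haca


def eukaryotic_relatives(archaeal_protein):
    """Return the sorted eukaryotic proteins related to one archaeal protein."""
    return sorted(e for e, a in ARCHAEAL_HOMOLOG.items() if a == archaeal_protein)


print(shared_archaeal_proteins())    # the single protein common to both RNPs
print(eukaryotic_relatives("L7Ae"))  # its three eukaryotic relatives
```

Written this way, the table makes the asymmetry between the two domains explicit: a single archaeal protein, L7Ae, stands in for three distinct eukaryotic proteins.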

7.3 Evolutionary relationships between archaeal and eukaryotic modification guide RNPs

The discovery that box C/D and box H/ACA RNPs exist in archaea as well as eukaryotes indicates the ancient evolutionary origin of these RNA-guided RNA modification systems (Terns and Terns 2002; Omer et al. 2003; Tran et al. 2004). Examination of the protein components of the box C/D and box H/ACA RNPs in archaea and eukaryotes suggests that more complex and specialized RNPs have arisen in eukaryotes following gene duplications that occurred after the divergence

of archaea and eukaryotes approximately 2000 million years ago. For example, the Nop56 and Nop58 proteins that are each essential components of box C/D RNPs in eukaryotes appear to be paralogues derived from a single gene found in contemporary archaea (and presumably in the last common ancestor of archaea and eukaryotes). Perhaps the most interesting instance of divergence and specialization, however, involves a set of proteins that bind directly to box C/D and box H/ACA RNAs. In archaea, the L7Ae protein is a core component of box C/D RNPs and box H/ACA RNPs, and also of the ribosome (Ban et al. 2000; Kuhn et al. 2002; Rozhdestvensky et al. 2003). L7Ae binds K-turns present in the RNA components of each of these RNPs in archaea (Ban et al. 2000; Klein et al. 2001; Kuhn et al. 2002; Rozhdestvensky et al. 2003). Duplications of an ancestral L7Ae gene apparently allowed the evolution of three related but distinct RNA binding proteins in eukaryotes – 15.5 kDa (component of box C/D RNP and U4/U6 snRNP), Nhp2 (component of box H/ACA RNP), and L7a (component of ribosome). In eukaryotes, these proteins are not redundant or interchangeable. Each of the three proteins is essential for viability and is specifically associated with particular families of RNAs. Two of the eukaryotic proteins have acquired additional domains. Both Nhp2 and L7a have N terminal extensions not found in the archaeal protein L7Ae (or in 15.5 kDa). Eukaryotic L7a contains an additional C terminal extension. The L7Ae gene duplications may also have allowed greater divergence of the RNA families in eukaryotes than is observed in archaea. In particular, Nhp2 and the eukaryotic H/ACA RNAs appear to have co-evolved significantly from the ancestral L7Ae / K-turn RNA pair. Eukaryotic box H/ACA RNAs do not contain recognizable K-turns and the Nhp2 protein appears to have reduced RNA binding specificity (Henras et al. 2001; Wang and Meier 2004). 
The divergence of the Nhp2 / box H/ACA pair in eukaryotes presumably would have required covariation of the multiple individual RNAs with the protein, a considerable evolutionary constraint. This suggests that the divergence may have occurred at an early point in the evolution of the box H/ACA RNAs, perhaps before significant expansion in the number of individual H/ACA RNAs. Far fewer box H/ACA RNAs are known to exist in archaea, consistent with the possibility that a common ancestor had few box H/ACA RNAs. On the other hand, the box C/D RNAs, which are more numerous in archaea, appear to have diverged less from the L7Ae / K-turn model in eukaryotes. Eukaryotic box C/D RNAs retain the K-turn. In addition, we have found that the eukaryotic 15.5 kDa protein and all other eukaryotic box C/D RNP proteins essential for function recognize archaeal box C/D RNAs (Speckmann et al. 2002). Archaeal box C/D RNAs associate with eukaryotic proteins, localize to the nucleolus and guide rRNA modification when injected into the nucleus of a eukaryotic cell (Speckmann et al. 2002). The gene duplications may also have allowed the evolution of a key spliceosomal RNP in eukaryotes, the U4/U6 snRNP. snRNP-mediated splicing is not known to exist in archaea. However, at least two protein components of the U4/U6 snRNP appear to be directly related to archaeal box C/D RNP proteins. The eukaryotic 15.5 kDa protein (related to archaeal L7Ae) is a component of the U4/U6 snRNP as well as box C/D RNPs in eukaryotes (Watkins et al. 2000). 15.5 kDa

recognizes very similar K-turn motifs in U4 snRNA and box C/D RNAs. In addition, like the L7Ae gene, the Nop56/58 gene appears to have undergone two gene duplications giving rise to three related genes in eukaryotes: Nop56, Nop58, and Prp31 (Gautier et al. 1997; Watkins et al. 2000; Terns and Terns 2002). The Prp31 protein is an essential component of the U4/U6 snRNP and required for mRNA splicing in eukaryotes (Weidenhammer et al. 1997; Makarova et al. 2002). Thus duplications in box C/D RNP protein genes appear to have contributed to the development of snRNP-mediated mRNA processing in eukaryotes.

8 Assembly and structural organization of modification guide RNPs

Very recently, significant advances have been made toward detailing the global architecture of box C/D and box H/ACA modification guide RNPs and dissecting the molecular interactions that underlie their mechanisms of action. Progress has been accelerated by both development of cell-free modification guide systems and determination of the structures of components of the RNPs. Studies with cell-free 2’-O-methylation (Galardi et al. 2002; Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003) and pseudouridylation (Wang et al. 2002; Wang and Meier 2004) systems have confirmed that the known core proteins are sufficient to support RNA-guided, site-specific modification. Furthermore, the cell-free systems have enabled detailed analysis of the roles of RNA/protein and protein/protein interactions in the assembly and function of the modification guide RNPs. In addition, guide RNP research has recently reached the “atomic age” primarily thanks to the availability of crystallization-friendly proteins from archaea. Great insight has been gained from the details revealed in recent X-ray structures of individual guide RNP proteins and co-crystallized RNA/protein and protein/protein complexes.

8.1 Methylation guide RNP structure

The methylation guide RNPs in archaea and eukaryotes are fundamentally similar in structure and function, and a common general model can be visualized (Fig. 4). As described above, the methylation guide RNAs contain one or two functional box C/D units. The homologous proteins L7Ae (in archaea) and 15.5 kDa (in eukaryotes) interact directly with K-turns formed in the RNAs by boxes C and D, and initiate complex formation. Two molecules of Nop56/58 (in archaea) or one each of Nop56 and Nop58 (in eukaryotes) are associated with each RNP and likely bridge between the two box C/D units.
Structural and mutational analysis indicates that fibrillarin catalyzes the 2’-O-methylation of the target RNA, and a fibrillarin molecule presumably resides in proximity to each of the two potential guide sequences. The target RNA is positioned for 2’-O-methylation by base-pairing with the guide sequence.

Fig. 4. Assembly of modification guide RNP proteins on box C/D and box H/ACA RNAs. A) Assembly of eukaryotic (top half) and archaeal (bottom half) box C/D proteins on a box C/D RNA. The homologs 15.5 kD (in eukaryotes) and L7Ae (in archaea) interact with box C/D RNAs in the absence of the other proteins, which is thought to nucleate assembly of the RNPs. Archaeal Nop56/58 can then interact with the L7Ae/RNA complex and fibrillarin can then join the complex. The eukaryotic assembly is shown to be similar in this model. B) Assembly of eukaryotic box H/ACA proteins on a box H/ACA RNA. In this case, the proteins pre-assemble into a protein complex that interacts specifically with box H/ACA RNAs. The stoichiometry and organization of the proteins within the complex are not known.
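
The stepwise assembly of the archaeal box C/D RNP shown in Fig. 4A can be expressed as a simple dependency check. The Python sketch below assumes the in vitro requirements described for the archaeal proteins: L7Ae binds the RNA alone, Nop56/58 requires L7Ae, and fibrillarin requires both. The names `REQUIRES` and `assemble` are illustrative assumptions, and the model says nothing about whether some of the proteins pre-assemble before meeting the RNA in vivo.

```python
# Proteins that must already be on the RNA before each protein can associate,
# according to the in vitro order described for the archaeal box C/D RNP.
REQUIRES = {
    "L7Ae": set(),
    "Nop56/58": {"L7Ae"},
    "fibrillarin": {"L7Ae", "Nop56/58"},
}


def assemble(order):
    """Offer proteins to the RNA in the given order; return those that bind."""
    bound = []
    for protein in order:
        # A protein binds only if everything it depends on is already bound.
        if REQUIRES[protein] <= set(bound):
            bound.append(protein)
    return bound


# The productive order yields the complete RNP.
print(assemble(["L7Ae", "Nop56/58", "fibrillarin"]))
# Offering the proteins in reverse leaves only L7Ae on the RNA.
print(assemble(["fibrillarin", "Nop56/58", "L7Ae"]))
```

In this toy model a protein offered too early is simply lost, which mimics a single-pass order-of-addition experiment rather than an equilibrium mixture in which all components remain available.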

One question that has arisen with regard to box C/D RNP organization concerns the degree of symmetry between the box C/D and box C’/D’ units in an individual RNP. As is detailed below, there is necessarily greater asymmetry in the eukaryotic box C/D RNPs, where the box C’/D’ unit is often degenerate and non-functional (Kiss-László et al. 1998), and where distinct Nop56 and Nop58 proteins are found at the box C’/D’ and box C/D units respectively (Cahill et al. 2002). Even in a model

eukaryotic double guide RNA, 15.5 kDa was only detected at the box C/D unit, not at the box C’/D’ unit (Szewczak et al. 2002). The archaeal RNPs are more symmetric, with two functional box C/D units found in nearly all of the RNAs, and homodimers of the Nop56/58 protein in place of Nop56 and Nop58. However, while it is clear that L7Ae (the 15.5 kDa homologue) can bind to either a box C/D or box C’/D’ unit, binding of L7Ae at only one unit is sufficient for activity at both units (Rashid et al. 2003; Tran et al. 2003). Thus, it is possible that box C/D RNPs may be asymmetric with regard to L7Ae (i.e. that L7Ae may be present at only one C/D unit) in both archaea and eukaryotes. The recent studies that have illuminated the organization of the box C/D RNPs in eukaryotes and archaea are described herein.

8.1.1 Structure of eukaryotic methylation guide RNPs

The organization of the eukaryotic methylation guide RNP has been studied in vivo and in vitro using a model eukaryotic double guide RNA (Cahill et al. 2002; Szewczak et al. 2002). The contacts of individual RNP proteins with the conserved box C, C’, D, and D’ elements in vivo were mapped by site-specific UV crosslinking experiments performed following injection of U25 box C/D RNA into Xenopus oocytes (Cahill et al. 2002). These studies clearly demonstrated interaction of the Nop58 protein with the box C element (of the terminal C/D motif) and of the Nop56 protein with the box C’ element (of the internal C’/D’ motif). The terminal box C/D motif is known to be particularly important in protecting box C/D RNAs against degradation, and consistent with the mapping results, Nop58 (but not Nop56) is required for the stability of all box C/D snoRNAs in yeast (Lafontaine and Tollervey 1999, 2000). Interaction of the Xenopus 15.5 kDa protein with U25 was not detected by UV crosslinking, but recombinant 15.5 kDa protein assembled into active RNP complexes following injection into Xenopus oocytes (Cahill et al. 2002).
In a separate study, interference mapping both in vivo and in vitro showed that the 15.5 kDa protein interacts with the C/D but not C’/D’ motif (Szewczak et al. 2002). Interactions of fibrillarin with both the terminal C/D and internal C’/D’ motifs (predominantly at box D and box D’) were observed by crosslinking (Cahill et al. 2002). Taken together, these studies indicate that the box C/D and box C’/D’ units do not act as simple structural mirror images of one another and do not each bind all of the four core protein components. Instead, the core protein components are asymmetrically organized with respect to these two functional motifs.

8.1.2 Structure of archaeal methylation guide RNPs

The recent availability of in vitro systems capable of reconstituting enzymatically active archaeal methylation guide RNPs is greatly facilitating investigation of box C/D RNP structure/function relationships (Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003). Site-specific 2’-O-methylation can be reconstituted by incubating a box C/D guide RNA with a target rRNA sequence (both synthesized in vitro), three recombinant box C/D RNP proteins (i.e. L7Ae,

Nop56/58, and fibrillarin) and the methyl donor AdoMet. Order-of-addition experiments have been used to assess the requirements for assembly, and the data indicate that L7Ae can bind the RNA in the absence of other components. Nop56/58 can interact with the RNA only in the presence of L7Ae, and association of fibrillarin (the methyltransferase) with the RNA requires both L7Ae and Nop56/58. These results suggest an ordered assembly of proteins on the RNA to form a functional RNP complex: L7Ae, Nop56/58, fibrillarin. It should be noted, however, that while the box C/D RNP proteins are capable of interaction with a box C/D RNA in a specific sequential order, it is not known whether some or all of the protein components (e.g. Nop56/58 and fibrillarin) pre-assemble prior to interaction with the RNA in vivo. High resolution X-ray structures of fibrillarin proteins from Methanococcus jannaschii (Wang et al. 2000), Archaeoglobus fulgidus (Aittaleb et al. 2003), and Pyrococcus furiosus (Deng et al. 2004) provide compelling evidence that fibrillarin is the RNP component responsible for catalyzing site-specific, RNA-guided 2’-O-methylation. The archaeal fibrillarins exhibit the hallmark topology (a seven stranded β-sheet flanked by three α-helices on each side), conserved motifs, AdoMet binding site, and invariant amino acid residues present in the catalytic domains of all known AdoMet-dependent methyltransferases (Martin and McMillan 2002; Schubert et al. 2003). The recent functional studies with reconstituted box C/D RNPs have also provided direct evidence that fibrillarin is the methyltransferase (Omer et al. 2002; Bortolin et al. 2003; Rashid et al. 2003; Tran et al. 2003). Solution of the structure of co-crystals of Archaeoglobus fulgidus fibrillarin and Nop56/58 revealed the existence of a tetramer consisting of two molecules of each protein in the configuration fibrillarin-Nop56/58-Nop56/58-fibrillarin (Aittaleb et al. 2003).
The interaction of Nop56/58 with fibrillarin is primarily brought about by extensive surface complementarity between the concave N-terminal domain of the Nop56/58 protein and a convex central domain of fibrillarin. Interestingly, Nop56/58 interaction with fibrillarin appears to be a requirement for stable binding of the methyl donor (AdoMet) to fibrillarin, indicating the functional importance of the interaction. Extensive coiled-coil interactions between two Nop56/58 proteins result in the formation of a four-helix bundle that mediates the dimerization of the fibrillarin-Nop56/58 heterodimers. These findings suggest a model in which the tetramer bridges the two box C/D units, positioning a fibrillarin-Nop56/58 heterodimer at each unit (see Fig. 4). The overall features of this model likely hold true for eukaryotic box C/D RNP organization as well, though in this case a heterodimer of Nop56 and Nop58 (rather than a Nop56/58 homodimer) would bridge the units. Finally, the atomic details of the interaction of L7Ae with short, model box C/D motif RNAs have been described (Hamma and Ferre-D'Amare 2004; Moore et al. 2004) and have illuminated the initial step of box C/D RNP assembly. Like 15.5 kDa in eukaryotes, L7Ae is an RNA binding protein that interacts specifically with K-turns (Kuhn et al. 2002; Rozhdestvensky et al. 2003). The K-turn is an RNA motif originally identified in the U4 spliceosomal RNA (Vidovic et al. 2000) and later recognized to occur in other RNAs including ribosomal RNAs, box C/D

RNAs, and archaeal (but apparently not eukaryotic) box H/ACA RNAs (Ban et al. 2000; Kuhn et al. 2002; Rozhdestvensky et al. 2003). A canonical K-turn consists of a short, asymmetric internal loop in which the phosphodiester backbone undergoes a major bend (or “kink”) of ~120°, and which is flanked by a canonical (Watson-Crick base-paired) stem structure on one side and a non-canonical stem structure that contains tandem sheared G-A base-pairs typically followed by a U-U or G-U base-pair on the other side (Klein et al. 2001). A K-turn is thought to form at the box C/D motif and be recognized by L7Ae (in archaea) and 15.5 kDa (in eukaryotes). The ability of L7Ae to recognize relaxed (non-canonical) K-turn structures likely accounts for its binding to box C’/D’ motifs and also H/ACA RNAs in archaea, which lack the canonical Watson-Crick base-paired stem (Hamma and Ferre-D'Amare 2004). The eukaryotic 15.5 kDa protein demonstrates stricter binding requirements (Kuhn et al. 2002; Hamma and Ferre-D'Amare 2004). With regard to the question of symmetry between the two box C/D units of the archaeal box C/D RNP, in vitro studies performed with recombinant proteins and mutated or truncated RNAs indicate that all three proteins can interact with either the box C/D or box C’/D’ unit (Rashid et al. 2003; Tran et al. 2003). However, there is evidence that binding of L7Ae at both units is not necessary for assembly of the other proteins (Nop56/58 and fibrillarin) at both units, or for function (as is the case for 15.5 kDa in eukaryotes). Mutations that disrupt L7Ae interaction at either the box C/D or box C’/D’ unit do not prevent binding of L7Ae at the other unit, or recruitment of the fibrillarin-Nop56/58-Nop56/58-fibrillarin tetramer to the complex (Rashid et al. 2003). In addition, RNPs formed with reduced concentrations of L7Ae protein (where it is likely that there is only one molecule of L7Ae per RNP) are functional in methylation assays (Rashid et al. 2003).
Thus, it is possible that the archaeal RNPs, like the eukaryotic RNPs, are asymmetric with regard to L7Ae protein distribution between the two units. Examination of whether box C/D RNPs possess one or two molecules of L7Ae in vivo is required to resolve this issue. In addition, it was recently recognized that box C/D RNAs exist as circular as well as linear RNAs, at least in the hyperthermophilic archaeon P. furiosus (Starostina et al. 2004). The circular box C/D RNAs are found in complexes with box C/D RNP proteins in extracts from P. furiosus. It remains to be determined whether and how this fundamental difference in RNA structure will affect the organization of box C/D RNPs. We look forward to a more complete understanding of the molecular interactions underlying methylation guide RNP function that we expect to come from high resolution structures of the entire guide RNP complex including the guide RNA, target RNA and all three protein components.

8.2 Pseudouridylation guide RNP structure

Relative to the box C/D RNPs, less is known about how box H/ACA RNPs are organized and function. While there is solid evidence that Cbf5 is the enzyme that catalyzes uridine isomerization (Koonin 1996; Zebarjadian et al. 1999), the roles

of Gar1, Nop10, and Nhp2 (or L7Ae in archaea) in box H/ACA RNP function are less clear. The organization of the proteins and RNAs within the RNP is also not known. Interestingly, recent evidence indicates that specific interaction with box H/ACA RNAs requires pre-assembly of a complex of most or all of the four core proteins in eukaryotes (Wang and Meier 2004), suggesting that assembly of the proteins onto the guide RNA may occur in a single step. At the same time, recently published studies and unpublished work in our laboratory indicate that two of the box H/ACA RNP proteins (L7Ae and Cbf5) can interact specifically and independently with box H/ACA RNAs in archaea (Rozhdestvensky et al. 2003; Baker, Youssef, Terns and Terns, unpublished data). The availability of H/ACA pseudouridylation guide RNPs from archaea, plus the development of an in vitro system to study site-specific pseudouridylation (Wang et al. 2002; Wang and Meier 2004), portend rapid progress in understanding this class of modification guide RNPs in the near future.

8.2.1 Structure of eukaryotic pseudouridylation guide RNPs

A fundamental question that remains to be answered with regard to eukaryotic box H/ACA RNPs is the mechanism of specific recognition of the box H/ACA RNAs by the core proteins. UV crosslinking studies of in vitro reconstituted mammalian H/ACA RNPs indicate that all four core proteins contact the RNA in assembled complexes (Dragon et al. 2000), but it does not appear that any single protein interacts directly with the RNA in a sequence-specific fashion in eukaryotes. Nhp2 might be expected to interact specifically with box H/ACA RNAs given the demonstrated roles of the related proteins 15.5 kDa and L7Ae in binding specific motifs in box C/D and other RNAs in eukaryotes and archaea. However, while recombinant Nhp2 does exhibit general RNA binding capacity in vitro, there is no indication that the protein specifically recognizes H/ACA RNAs (Henras et al. 2001; Wang and Meier 2004).
Gar1 has also been reported to bind RNA in vitro (Bagni and Lapeyre 1998), but because the studies used proteins expressed in translation lysates that contain additional proteins and RNAs, it is not clear that Gar1 interacts directly with box H/ACA RNAs. Moreover, depletion of Gar1 in vivo does not affect H/ACA snoRNA stability (Bousquet-Antonelli et al. 1997), and immunodepletion of Gar1 from extracts does not prevent assembly of complexes in vitro (Dragon et al. 2000). These observations suggest that Gar1 does not play a role as a primary H/ACA RNA binding protein. Recent work indicates that formation of a complex of the core proteins precedes RNA binding in the assembly of box H/ACA RNPs in eukaryotes (Wang and Meier 2004). This mode of assembly contrasts with the proposed stepwise box C/D RNP assembly pathway, which is thought to be initiated by binding of 15.5 kDa (or L7Ae in archaea) to the box C/D RNA (see above). Data with mammalian components suggest that Cbf5, Nhp2, and Nop10 (and perhaps also Gar1) form a stable, multi-protein complex in the absence of the H/ACA RNA, and that this protein complex (rather than any individual protein) recognizes H/ACA RNAs in a sequence-specific manner. In yeast, there is evidence for a Cbf5p/Gar1p/Nop10p complex that forms in vivo independent of an association with either Nhp2p or

H/ACA snoRNAs (Henras et al. 2004). In the assembled RNP, both Cbf5 (the pseudouridine synthase) and Gar1 contact the target uridine, suggesting that both of these proteins play important functional roles in target RNA modification (Wang and Meier 2004). It should be noted that the H/ACA RNPs assembled from proteins produced by in vitro transcription and translation were not active in pseudouridylation of target RNAs. However, cell-free, specific pseudouridylation of target RNA was obtained with H/ACA complexes reconstituted from mammalian cytosolic extracts and immunopurified complexes that appear to contain just the four core proteins (Wang et al. 2002; Wang and Meier 2004). Another interesting issue is the overall architecture of the eukaryotic box H/ACA RNP. On one hand, there are indications that each H/ACA RNA guide unit (i.e. hairpin) may bind one complete set of the four core proteins (and thus that the typical double guide RNP would contain two sets of the four proteins). The bipartite nature of the RNAs in most eukaryotes (two hairpins each containing a guide sequence) suggests two independent, symmetrical protein binding domains. Consistent with this view, electron micrographs of purified yeast H/ACA RNPs exhibit a V-like structure that has been interpreted to reflect two sets of the four core proteins, each interacting with one hairpin (Watkins et al. 1998). The estimated molecular weight of the complexes is also in general agreement with this possibility (Lubben et al. 1995; Watkins et al. 1998). In addition, the existence of functional single hairpin H/ACA RNAs in trypanosomes (early diverging unicellular eukaryotes) indicates that a single hairpin is sufficient for the binding of the four core proteins (Uliel et al. 2004). On the other hand, emerging evidence indicates that, like box C/D RNPs, the core H/ACA proteins may be asymmetrically arranged in the RNPs despite the symmetric structure of the double hairpin RNAs.
First, mutation of either box H or box ACA prevents function at both pseudouridylation pockets of an H/ACA RNA (rather than just at the adjacent hairpin) (Bortolin et al. 1999). Second, the pre-assembled box H/ACA protein complex appears to contain sub-stoichiometric amounts (~1/2 the level) of Nhp2 protein relative to Cbf5 and Nop10 (Wang and Meier 2004). Third, in some cases, in vitro reconstitution of H/ACA RNPs in mammalian extract systems has been found to require two hairpins (e.g. U19 and U64). In the case of two RNAs that contain H/ACA motifs but are not known to function as pseudouridylation guides (U17 and telomerase RNA), a single hairpin appears to be sufficient for RNP formation (Pogacic et al. 2000), indicating that the structural organization of the four core proteins may differ in box H/ACA RNPs that are not involved in RNA modification.

8.2.2 Structure of archaeal pseudouridylation guide RNPs

Evidence for the existence of box H/ACA RNPs in archaea has only recently emerged. Archaeal genomes contain putative homologs of the four core eukaryotic box H/ACA proteins (Watanabe and Gray 2000) and box H/ACA RNAs (Tang et al. 2002) (Figs. 3 and 4). The presence of modifications at predicted rRNA target sites provides support for the function of putative box H/ACA RNAs as pseudouridylation guide RNAs (Tang et al. 2002). However, the roles of the protein homologs in RNA-guided pseudouridylation, and even the fundamental presence of the protein homologs in H/ACA RNP complexes in archaea, remain to be established.

Studies to investigate the structural organization and mechanism of action of the archaeal H/ACA RNPs are already providing key information as well as surprises. In contrast to eukaryotic H/ACA RNAs, which generally have two hairpins, predicted archaeal H/ACA RNAs contain one, two, or three hairpins (and corresponding box H or box ACA elements) (Tang et al. 2002). In addition, the archaeal H/ACA RNAs contain non-canonical K-turns located near the terminal loops of the hairpins that have been shown to serve as specific L7Ae binding sites (Rozhdestvensky et al. 2003; Hamma and Ferre-D'Amare 2004). (As described above, the archaeal homolog of Nhp2 is L7Ae, which is also a component of box C/D RNPs and ribosomes in archaea.) Unpublished studies from our laboratory indicate that Cbf5 (the likely pseudouridine synthase) also interacts directly and specifically with H/ACA RNAs in the absence of other proteins (Baker, Youssef, Terns and Terns, unpublished data). The independent and specific interaction of two of the core proteins, Cbf5 and L7Ae, with box H/ACA RNAs in archaea (Rozhdestvensky et al. 2003) contrasts with the binding of a pre-assembled protein complex proposed for eukaryotic H/ACA RNPs. The archaeal and eukaryotic box H/ACA RNPs appear to have diverged to a greater extent than the box C/D RNPs (see above), suggesting that more enlightening differences between RNA-guided RNA modification systems in these two domains of life await discovery.

9 Function of pseudouridylation and 2'-O-methylation

The positions of the extensive modifications found in both rRNAs and spliceosomal snRNAs are similar in various organisms. For instance, the 2'-O-methylated residues and pseudouridines found in the five spliceosomal snRNAs from various species are virtually all concentrated in the 5' half of each RNA molecule, and are all clustered in regions known to be important for pre-mRNA splicing (Fig. 2) (Yu et al. 1999). Although not located in identical positions, the modified nucleotides of rRNAs from different organisms are virtually all distributed in conserved regions known to be functionally important for protein synthesis (Decatur and Fournier 2002; Omer et al. 2003). The conservation in the location of these modified nucleotides in critical regions within each RNA strongly suggests that both 2'-O-methylated residues and pseudouridines are functionally relevant. In fact, globally blocking yeast rRNA 2'-O-methylation by deleting nop1 (Tollervey et al. 1991) or pseudouridylation by mutating cbf5 (Zebarjadian et al. 1999) causes a severe growth defect phenotype. Likewise, mutation of the human homologue of Cbf5, dyskerin, results in loss of rRNA pseudouridylation, and in dyskeratosis congenita, a disease characterized by bone marrow failure (Meier 2003; Ruggero et al. 2003). Recent work is beginning to shed light on the functional roles of these modifications in both rRNA and spliceosomal snRNA, as described in the following sections.

9.1 rRNA modifications occur primarily in functionally important regions of the ribosome

Taking advantage of the recently acquired high-resolution crystal structures of ribosomes, the Fournier group modeled the positions of the known 2'-O-methylated residues and pseudouridines in the context of the ribosome structure and deduced three-dimensional modification maps for E. coli and yeast cytoplasmic ribosomes (Decatur and Fournier 2002). They found that modified nucleotides, which are not distributed randomly in rRNA at the secondary structural level, remain highly concentrated in important sites at the three-dimensional level, including the peptidyl transferase center and the sites where ribosomal subunits interact. Interestingly, modified nucleotides are concentrated in areas free of ribosomal proteins, suggesting that rRNA modifications might not be directly involved in the binding of proteins to rRNA. Likewise, Omer and colleagues mapped predicted 2'-O-methylations on the archaeal rRNA crystal structures and deduced the three-dimensional distribution of the 2'-O-methylated nucleotides in the archaeal ribosome structure (Omer et al. 2003). Their results indicated that 2'-O-methylated nucleotides in archaeal rRNA are likewise located in regions known or expected to be important for ribosome function.

It is noteworthy that, as discussed earlier, the number of box C/D small RNA guides (and perhaps the number of modified nucleotides in rRNAs) is higher in archaeal organisms that grow at high temperatures compared with those that grow at low temperatures (Noon et al. 1998; Dennis et al. 2001). This correlation suggests the possibility that the 2'-O-methylated residues in rRNAs contribute directly to the thermostability of the ribosome. In this regard, it has been demonstrated that 2'-O-methylated RNA-RNA structures are more stable than those involving RNA-RNA interactions alone (Davis 1998).
However, it is also possible that higher temperatures require more sRNAs to act as chaperones to direct rRNA folding processes.

9.2 rRNA modifications in the peptidyl transferase center contribute to ribosome function and cell growth

The fact that most 2'-O-methylated residues and pseudouridines correlate with functional sites in the ribosome led to the plausible hypothesis that rRNA modifications might contribute directly to ribosome function and protein synthesis, although it is also possible that they affect the biogenesis of rRNA and ribosomes (Decatur and Fournier 2002). To experimentally test the functional relevance of the pseudouridines, the Fournier group (King et al. 2003) mutated/deleted five yeast box H/ACA snoRNAs predicted to pseudouridylate the large subunit (LSU) rRNAs at positions 2822, 2861, 2876, 2919, 2940, and 2971, all of which are located at the peptidyl transferase center of the ribosome. They subsequently assessed the effects of the mutations/deletions on pseudouridine synthesis, rRNA processing, ribosome function and cell growth. They found that a point mutation in the guide region in snR10 and the deletion of the other four snoRNAs essentially had no effect on rRNA processing, but specifically abolished pseudouridylation at expected site(s). Importantly, a blockade of pseudouridylation at a single site by mutating/deleting a single box H/ACA snoRNA caused a slight defect in polypeptide synthesis and cell growth, suggesting that an individual pseudouridine in the peptidyl transferase center in rRNA contributes only modestly to healthy growth. Strikingly, however, the simultaneous mutation/deletion of all five snoRNAs eliminated pseudouridylation at all six sites and consequently reduced the rate of protein synthesis, thereby causing a more severe growth defect phenotype. These results suggest that these pseudouridines may contribute to ribosome function in a synergistic manner.

Consistent with these results, the Jacquier group reported that the removal of a single box H/ACA snoRNA responsible for the formation of the two most highly conserved pseudouridines in the yeast LSU rRNA (Ψ2258 and Ψ2260) also caused a slight but consistent cell growth defect phenotype, suggesting a functional role of the two pseudouridines in translation (Badis et al. 2003). However, it should be noted that it is still possible that the snoRNAs have additional roles in ribosome function aside from directing pseudouridylation. In this regard, a previous report by the Ofengand group showed that the deletion of RluD, a pseudouridylase responsible for the formation of Ψ1911, Ψ1915 and Ψ1917 in E. coli LSU rRNA, caused a severe growth defect (Raychaudhuri et al. 1998). However, when a point mutation was introduced into the catalytic center of the enzyme, pseudouridylation was completely abolished and yet the mutant cells grew as well as wild-type E. coli, suggesting that the enzyme has a second function (unrelated to pseudouridylase activity) that is crucial for cell growth (Gutgsell et al. 2001).
9.3 Spliceosomal snRNA modifications are required for pre-mRNA splicing

Clues that spliceosomal snRNA modifications might be important for pre-mRNA splicing come from observations in several reconstitution systems in which U2 snRNA function can be assessed. While cellularly derived U2 (fully modified) is fully competent for splicing (Yu et al. 1998), in vitro transcribed U2, which contains no modifications, does not reconstitute splicing in U2-depleted Xenopus oocytes (Pan et al. 1989; Yu et al. 1998) or in U2-depleted HeLa nuclear extract (Segault et al. 1995). Upon prolonged reconstitution, in vitro transcribed U2 is modified at the expected positions and the splicing activity in Xenopus oocytes is regenerated (Yu et al. 1998). Moreover, in vitro transcribed yeast U2 is modified in yeast splicing extracts (Ma and Yu, unpublished data) and reconstitutes pre-mRNA splicing in yeast cell extracts depleted of endogenous U2 snRNA (McPheeters et al. 1989; McPheeters and Abelson 1992).

By creating chimeric U2 snRNA molecules in which some of the sequences are from cellularly derived U2 whereas others are from in vitro transcribed U2, Yu et al. (1998) further dissected the modification of U2 and demonstrated that the functionally important modified nucleotides reside within the 5'-most 27 nucleotides, including three pseudouridines and six 2'-O-methylated residues (see Fig. 2). Native gel analysis indicated that the U2 snRNA (containing no modifications within the 5'-most 27 nucleotides) did not participate in spliceosome assembly, suggesting that the effect of these modified nucleotides on pre-mRNA splicing may be at an earlier stage. Subsequent analyses using anti-Sm immunoprecipitation, oligonucleotide affinity chromatography, and glycerol gradient centrifugation argued that U2 snRNA modification may directly contribute to the full assembly of the functional U2 snRNP that is essential for spliceosome assembly and splicing. Very recently, the Luhrmann group performed similar experiments in HeLa nuclear extracts (Donmez et al. 2004). Consistent with the results above, they found that most modified nucleotides (both 2'-O-methylated residues and pseudouridines) within the 5' end region of U2 are necessary for splicing. Furthermore, their data suggest that the effect of these modified nucleotides on splicing occurs at an early stage of pre-mRNA splicing, namely during complex E assembly (Donmez et al. 2004).

Further dissection of U2 modifications indicates that the pseudouridines in the branch site recognition region of U2 are also required for pre-mRNA splicing in Xenopus oocytes (Zhao and Yu 2004b). Using the Xenopus microinjection system, Zhao and Yu (2004) observed that pseudouridylation occurs so fast in the U2 branch site recognition region that it is already complete before the splicing assay is performed. This rapid modification precludes the possibility of analyzing the modified nucleotides in this region using the conventional Xenopus oocyte reconstitution system described above (Yu et al. 1998). In order to analyze these pseudouridines, Zhao and Yu (2004) took advantage of the fact that injection of oocytes with synthetic U2 snRNA containing 5-fluorouridines only in the branch site recognition region specifically inhibits pseudouridylation in the same region of in vitro transcribed U2 snRNA injected at a later time.
The reconstitution results indicate that prior injection of 5-fluorouridine-containing U2 into U2-depleted oocytes almost completely abrogates the ability of in vitro transcribed U2 to rescue splicing whereas full rescue is achieved with either cellular U2 or U2 containing pseudouridines in the branch site recognition region. Further analyses using glycerol-gradient and native gel electrophoresis indicate that U2 RNAs lacking pseudouridines in the branch site recognition region do not participate in the assembly of the fully functional U2 snRNP and the spliceosome. However, because pseudouridylation at all six pseudouridine positions (Fig. 2) is inhibited, it remains to be determined whether the pseudouridines act synergistically or if individual pseudouridines in this region are critical for function. In this regard, the change of a single uridine in the branch site recognition region (U34) to pseudouridine (Ψ34) greatly enhances the production of X-RNA, a product generated by a splicing-related branching reaction in a cell- and protein-free system (Valadkhan and Manley 2003), suggesting that at least one pseudouridine in the branch site recognition region plays a critical role in splicing. This notion is supported by published NMR structural data that indicate that Ψ34 is important both for stabilizing the RNA-RNA duplex between the branch site recognition region in U2 and the branch site sequence in pre-mRNA and for maintaining the bulge of the branch point nucleotide (adenosine) for nucleophilic attack during splicing (Newby and Greenbaum 2001; Newby and Greenbaum 2002; see chapter 8, this volume).

9.4 How do modified nucleotides contribute to RNA function?

Although some important modified nucleotides have been identified, the question of how these modifications contribute to function remains unclear. One possibility is that the modifications alter RNA structure either locally or globally, thereby altering function. Accordingly, it has been reported that pseudouridines are important for RNA folding, perhaps acting by stabilizing local base stacking (Davis 1995, 1998). The crystal structure of tRNA(Gln) also revealed the specific binding of a pseudouridine to a water molecule, forming a local structure that may be critical for stabilizing the tRNA (Arnez and Steitz 1994). 2'-O-methylated residues can also contribute to the stabilization of RNA secondary structure (Davis 1998).

Another possibility of how RNA modifications contribute to function is that the modified nucleotides are directly recognized by another RNA(s) or protein(s) and that the resulting complex is essential for function. It is clear that both 2'-O-methylation and pseudouridylation can change the chemical properties of nucleotide residues. Specifically, the conversion of uridine into pseudouridine creates an extra hydrogen bond donor, whereas 2'-O-methylation certainly makes a nucleotide residue more hydrophobic (Davis 1998). These property changes may allow the modified nucleotides to interact differentially with other proteins or RNAs. Alternatively, changes in nucleotide properties may directly contribute to catalysis. In this regard, it is especially relevant that many modified nucleotides in 28S rRNA are located in the peptidyl transferase center (Decatur and Fournier 2002; Omer et al. 2003) and that the U2 branch site recognition region, which contains a number of pseudouridines (Fig. 2), is believed to contribute to the catalytic center of the spliceosome that mediates the splicing reaction (Yu et al. 1999).

9.5 Are RNA modifications reversible?

Another important question regarding RNA modifications is whether they are reversible. Molecular modifications are reversible in many instances, and this is especially well-appreciated for post-translational modifications of proteins (phosphorylation, methylation, acetylation, etc.). Cells utilize the reversible protein modification strategy to regulate gene expression in response to changes in the intra- or extracellular environment. However, to date there is no evidence for the reversibility of RNA modifications. On the other hand, this lack of data does not disprove the possible existence of such a process, and therefore further scrutiny is necessary to clarify this issue. As a first step toward solving this problem, it is important to quantitate RNA modification at naturally occurring sites. Although it is widely assumed that RNA modification at a given site is complete (100%) in the cell, this assumption clearly needs further experimental examination. With recently developed methods for detecting and quantitating RNA modifications (Bakin and Ofengand 1993; Yu et al. 1997; Grosjean et al. 2004; Zhao and Yu 2004a), it is anticipated that this issue will soon be addressed.

10 Concluding remarks

RNA modification provides for an increase in the diversity and complexity of the RNA products encoded by a genome. RNA-guided RNA modification is a particularly flexible system, allowing modification of multiple distinct sites by a single enzyme with modular RNA guides. Expansion in the extent of modification and evolution of new modification sites is possible via changes to short linear guide sequences (as opposed to evolution of novel RNA recognition domains in dedicated modification proteins). In addition, the extent and possible reversibility of modifications provide additional potential for regulation and fine-tuning of the function of substrate RNAs.

Significant progress has been made in recent years in understanding the mechanisms of RNA-guided RNA modification in eukaryotes and archaea. While many important questions remain about the mechanisms of modification, the more fundamental questions now seem to center on the roles of the modifications in the target RNAs. Why are so many nucleotides within rRNAs and spliceosomal snRNAs 2'-O-methylated and pseudouridylated? What are the functional consequences of the modifications in protein translation and pre-mRNA splicing? Are the modified nucleotides present in all spliceosomal snRNAs (U1, U4, U5, and U6 as well as U2) and in snoRNAs (U3) functionally important? How do the modified nucleotides contribute to function? To what extent are other classes of RNAs, including mRNAs and tRNAs, modified by the RNA-guided modification system? Is every target site fully modified? Is modification reversible? Does it function as a means to regulate the function of the target RNAs?

With regard to the functions of the modifications introduced by the RNA-guided system, there are clearly more questions than answers. The answers will require a combination of experimental approaches such as functional reconstitution of spliceosomal snRNPs in Xenopus oocytes (Yu et al. 1998; Zhao and Yu 2004b) or in HeLa nuclear extracts (Donmez et al. 2004), yeast genetics targeting rRNA modifications (Badis et al. 2003; King et al. 2003), and high resolution NMR (Newby and Greenbaum 2001; Newby and Greenbaum 2002) and X-ray crystallography that will provide insight into the detailed macromolecular changes induced by discrete post-transcriptional nucleotide modifications.

Acknowledgments

We thank Henri Grosjean for extremely helpful discussions and valuable comments on the manuscript. We also thank our colleagues in the Yu lab and in the Terns lab for discussions and inspiration. Our work was supported by grant GM62937 (to Y.-T. Yu) and grant GM54682 (to M.P. and R.M. Terns) from the National Institutes of Health.

References

Accardo MC, Giordano E, Riccardo S, Digilio FA, Iazzetti G, Calogero RA, Furia M (2004) A computational search for box C/D snoRNA genes in the D. melanogaster genome. Bioinformatics, Advance Access published online on August 5, 2004 Aittaleb M, Rashid R, Chen Q, Palmer JR, Daniels CJ, Li H (2003) Structure and function of archaeal box C/D sRNP core proteins. Nat Struct Biol 10:256-263 Alexandrov A, Martzen MR, Phizicky EM (2002) Two proteins that form a complex are required for 7-methylguanosine modification of yeast tRNA. RNA 8:1253-1266 Arnez JG, Steitz TA (1994) Crystal structure of unmodified tRNA(Gln) complexed with glutaminyl-tRNA synthetase and ATP suggests a possible role for pseudo-uridines in stabilization of RNA structure. Biochemistry 33:7560-7567 Auffinger P, Westhof E (1998) Location and distribution of modified nucleotides in tRNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 569-576 Bachellerie J-P, Cavaille J (1998) Small nucleolar RNAs guide the ribose methylations of eukaryotic rRNAs. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 255-272 Bachellerie JP, Cavaille J (1997) Guiding ribose methylation of rRNA. Trends Biochem Sci 22:257-261 Bachellerie JP, Cavaille J, Huttenhofer A (2002) The expanding snoRNA world. Biochimie 84:775-790 Bachellerie JP, Michot B, Nicoloso M, Balakin A, Ni J, Fournier MJ (1995) Antisense snoRNAs: a family of nucleolar RNAs with long complementarities to rRNA. Trends Biochem Sci 20:261-264 Badis G, Fromont-Racine M, Jacquier A (2003) A snoRNA that guides the two most conserved pseudouridine modifications within rRNA confers a growth advantage in yeast. RNA 9:771-779 Bagni C, Lapeyre B (1998) Gar1p binds to the small nucleolar RNAs snR10 and snR30 in vitro through a nontypical RNA binding element. 
J Biol Chem 273:10868-10873 Bakin A, Ofengand J (1993) Four newly located pseudouridylate residues in Escherichia coli 23S ribosomal RNA are all at the peptidyltransferase center: analysis by the application of a new sequencing technique. Biochemistry 32:9754-9762 Balakin AG, Schneider GS, Corbett MS, Ni J, Fournier MJ (1993) SnR31, snR32, and snR33: three novel, non-essential snRNAs from Saccharomyces cerevisiae. Nucleic Acids Res 21:5391-5397 Balakin AG, Smith L, Fournier MJ (1996) The RNA world of the nucleolus: two major families of small RNAs defined by different box elements with related functions. Cell 86:823-834 Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 A resolution. Science 289:905-920 Barneche F, Gaspin C, Guyot R, Echeverria M (2001) Identification of 66 box C/D snoRNAs in Arabidopsis thaliana: extensive gene duplications generated multiple isoforms predicting new ribosomal RNA 2'-O-methylation sites. J Mol Biol 311:57-73 Bjork GR (1995) Biosynthesis and function of modified nucleotides. In: Soll D, RajBhandary U (eds) tRNA: Structure, biosynthesis, and function. ASM Press, Washington, DC, pp 165-205

Bortolin ML, Bachellerie JP, Clouet-d'Orval B (2003) In vitro RNP assembly and methylation guide activity of an unusual box C/D RNA, cis-acting archaeal pre-tRNA(Trp). Nucleic Acids Res 31:6524-6535 Bortolin ML, Ganot P, Kiss T (1999) Elements essential for accumulation and function of small nucleolar RNAs directing site-specific pseudouridylation of ribosomal RNAs. EMBO J 18:457-469 Bortolin ML, Kiss T (1998) Human U19 intron-encoded snoRNA is processed from a long primary transcript that possesses little potential for protein coding. RNA 4:445-454 Bousquet-Antonelli C, Henry Y, Gélugne JP, Caizergues-Ferrer M, Kiss T (1997) A small nucleolar RNP protein is required for pseudouridylation of eukaryotic ribosomal RNAs. EMBO J 16:4770-4776 Brown JW, Clark GP, Leader DJ, Simpson CG, Lowe T (2001) Multiple snoRNA gene clusters from Arabidopsis. RNA 7:1817-1832 Brown JW, Echeverria M, Qu LH (2003a) Plant snoRNAs: functional evolution and new modes of gene expression. Trends Plant Sci 8:42-49 Brown JW, Echeverria M, Qu LH, Lowe TM, Bachellerie JP, Huttenhofer A, Kastenmayer JP, Green PJ, Shaw P, Marshall DF (2003b) Plant snoRNA database. Nucleic Acids Res 31:432-435 Caffarelli E, Arese M, Santoro B, Fragapane P, Bozzoni I (1994) In vitro study of processing of the intron-encoded U16 small nucleolar RNA in Xenopus laevis. Mol Cell Biol 14:2966-2974 Caffarelli E, Fatica A, Prislei S, De Gregorio E, Fragapane P, Bozzoni I (1996) Processing of the intron-encoded U16 and U18 snoRNAs: the conserved C and D boxes control both the processing reaction and the stability of the mature snoRNA. EMBO J 15:1121-1131 Cahill NM, Friend K, Speckmann W, Li ZH, Terns RM, Terns MP, Steitz JA (2002) Site-specific cross-linking analyses reveal an asymmetric protein distribution for a box C/D snoRNP. 
EMBO J 21:3816-3828 Cavaille J, Buiting K, Kiefmann M, Lalande M, Brannan CI, Horsthemke B, Bachellerie JP, Brosius J, Huttenhofer A (2000) Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization. Proc Natl Acad Sci USA 97:14311-14316 Cavaille J, Nicoloso M, Bachellerie JP (1996) Targeted ribose methylation of RNA in vivo directed by tailored antisense RNA guides. Nature 383:732-735 Chen CL, Liang D, Zhou H, Zhuo M, Chen YQ, Qu LH (2003) The high diversity of snoRNAs in plants: identification and comparative study of 120 snoRNA genes from Oryza sativa. Nucleic Acids Res 31:2601-2613 Chen JL, Blasco MA, Greider CW (2000) Secondary structure of vertebrate telomerase RNA. Cell 100:503-514 Clouet d'Orval B, Bortolin ML, Gaspin C, Bachellerie JP (2001) Box C/D RNA guides for the ribose methylation of archaeal tRNAs. The tRNATrp intron guides the formation of two ribose-methylated nucleosides in the mature tRNATrp. Nucleic Acids Res 29:4518-4529 Darzacq X, Jady BE, Verheggen C, Kiss AM, Bertrand E, Kiss T (2002) Cajal body-specific small nuclear RNAs: a novel class of 2'-O-methylation and pseudouridylation guide RNAs. EMBO J 21:2746-2756 Davis DR (1995) Stabilization of RNA stacking by pseudouridine. Nucleic Acids Res 23:5020-5026

Davis DR (1998) Biophysical and conformational properties of modified nucleotides in RNA. In: Grosjean H, Benne R (eds) Modification and editing of RNA. ASM Press, Washington, DC, pp 85-102 Decatur WA, Fournier MJ (2002) rRNA modifications and ribosome function. Trends Biochem Sci 27:344-351 Decatur WA, Fournier MJ (2003) RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278:695-698 Deng L, Starostina NG, Liu ZJ, Rose JP, Terns RM, Terns MP, Wang BC (2004) Structure determination of fibrillarin from the hyperthermophilic archaeon Pyrococcus furiosus. Biochem Biophys Res Commun 315:726-732 Dennis PP, Omer A, Lowe T (2001) A guided tour: small RNA function in Archaea. Mol Microbiol 40:509-519 Donmez G, Hartmuth K, Luhrmann R (2004) Modified nucleotides in the 5' end of the human U2 snRNA are required for early spliceosome (E complex) formation in vitro. The 2004 RNA meeting abstract:92 Dragon F, Pogacic V, Filipowicz W (2000) In vitro assembly of human H/ACA small nucleolar RNPs reveals unique features of U17 and telomerase RNAs. Mol Cell Biol 20:3037-3048 Dunbar DA, Wormsley S, Lowe TM, Baserga SJ (2000) Fibrillarin-associated box C/D small nucleolar RNAs in Trypanosoma brucei. Sequence conservation and implications for 2'-O-ribose methylation of rRNA. J Biol Chem 275:14767-14776 Ferre-D'Amare AR (2003) RNA-modifying enzymes. Curr Opin Struct Biol 13:49-55 Filipowicz W (2000) Imprinted expression of small nucleolar RNAs in brain: time for RNomics. Proc Natl Acad Sci USA 97:14035-14037 Filipowicz W, Pogacic V (2002) Biogenesis of small nucleolar ribonucleoproteins. Curr Opin Cell Biol 14:319-327 Galardi S, Fatica A, Bachi A, Scaloni A, Presutti C, Bozzoni I (2002) Purified box C/D snoRNPs are able to reproduce site-specific 2'-O-methylation of target RNA in vitro. 
Mol Cell Biol 22:6663-6668 Ganot P, Bortolin ML, Kiss T (1997a) Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNAs. Cell 89:799-809 Ganot P, Caizergues-Ferrer M, Kiss T (1997b) The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation. Genes Dev 11:941-956 Ganot P, Jady BE, Bortolin ML, Darzacq X, Kiss T (1999) Nucleolar factors direct the 2'-O-ribose methylation and pseudouridylation of U6 spliceosomal RNA. Mol Cell Biol 19:6906-6917 Gaspin C, Cavaille J, Erauso G, Bachellerie JP (2000) Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes. J Mol Biol 297:895-906 Gautier T, Berges T, Tollervey D, Hurt E (1997) Nucleolar KKE/D repeat proteins Nop56p and Nop58p interact with Nop1p and are required for ribosome biogenesis. Mol Cell Biol 17:7088-7098 Gerbi SA, Savino R, Stebbins-Boaz B, Jeppesen C, Rivera-Leon R (1990) A role for U3 small nuclear ribonucleoprotein in the nucleolus? In: Dahlberg A, Garrett RA, Moore PB, Schlessinger D, Warner JR (eds) The ribosome–structure, function and evolution. ASM Press, Washington, DC, pp 452-469

Grosjean H, Keith G, Droogmans L (2004) Detection and quantification of modified nucleotides in RNA using thin-layer chromatography. Methods Mol Biol 265:357-391 Grosjean H, Sprinzl M, Steinberg S (1995) Posttranscriptionally modified nucleosides in transfer RNA: their locations and frequencies. Biochimie 77:139-141 Gutgsell NS, Del Campo MD, Raychaudhuri S, Ofengand J (2001) A second function for pseudouridine synthases: A point mutant of RluD unable to form pseudouridines 1911, 1915, and 1917 in Escherichia coli 23S ribosomal RNA restores normal growth to an RluD-minus strain. RNA 7:990-998 Hamma T, Ferre-D'Amare AR (2004) Structure of protein L7Ae bound to a K-turn derived from an archaeal box H/ACA sRNA at 1.8 A resolution. Structure (Camb) 12:893-903 Heiss NS, Knight SW, Vulliamy TJ, Klauck SM, Wiemann S, Mason PJ, Poustka A, Dokal I (1998) X-linked dyskeratosis congenita is caused by mutations in a highly conserved gene with putative nucleolar functions. Nat Genet 19:32-38 Henras A, Dez C, Noaillac-Depeyre J, Henry Y, Caizergues-Ferrer M (2001) Accumulation of H/ACA snoRNPs depends on the integrity of the conserved central domain of the RNA-binding protein Nhp2p. Nucleic Acids Res 29:2733-2746 Henras A, Henry Y, Bousquet-Antonelli C, Noaillac-Depeyre J, Gélugne JP, Caizergues-Ferrer M (1998) Nhp2p and Nop10p are essential for the function of H/ACA snoRNPs. EMBO J 17:7078-7090 Henras AK, Capeyrou R, Henry Y, Caizergues-Ferrer M (2004) Cbf5p, the putative pseudouridine synthase of H/ACA-type snoRNPs, can form a complex with Gar1p and Nop10p in absence of Nhp2p and box H/ACA snoRNAs. RNA 10:1704-1712 Hodnett JL, Busch H (1968) Isolation and characterization of uridylic acid-rich 7 S ribonucleic acid of rat liver nuclei. J Biol Chem 243:6334-6342 Hopper AK, Phizicky EM (2003) tRNA transfers to the limelight. 
Genes Dev 17:162-180 Huang ZP, Zhou H, Liang D, Qu LH (2004) Different expression strategy: multiple intronic gene clusters of box H/ACA snoRNA in Drosophila melanogaster. J Mol Biol 341:669-683 Huttenhofer A, Brosius J, Bachellerie JP (2002) RNomics: identification and function of small, non-messenger RNAs. Curr Opin Chem Biol 6:835-843 Huttenhofer A, Cavaille J, Bachellerie JP (2004) Experimental RNomics: a global approach to identifying small nuclear RNAs and their targets in different model organisms. Methods Mol Biol 265:409-428 Huttenhofer A, Kiefmann M, Meier-Ewert S, O'Brien J, Lehrach H, Bachellerie JP, Brosius J (2001) RNomics: an experimental approach that identifies 201 candidates for novel, small, non-messenger RNAs in mouse. EMBO J 20:2943-2953 Jady BE, Bertrand E, Kiss T (2004) Human telomerase RNA and box H/ACA scaRNAs share a common Cajal body-specific localization signal. J Cell Biol 164:647-652 Jady BE, Kiss T (2001) A small nucleolar guide RNA functions both in 2'-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J 20:541-551 Kass S, Tyc K, Steitz JA, Sollner-Webb B (1990) The U3 small nucleolar ribonucleoprotein functions in the first step of preribosomal RNA processing. Cell 60:897-908 King TH, Liu B, McCully RR, Fournier MJ (2003) Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell 11:425-435 Kiss AM, Jady BE, Bertrand E, Kiss T (2004) Human box H/ACA pseudouridylation guide RNA machinery. Mol Cell Biol 24:5797-5807

Kiss AM, Jady BE, Darzacq X, Verheggen C, Bertrand E, Kiss T (2002) A Cajal body-specific pseudouridylation guide RNA is composed of two box H/ACA snoRNA-like domains. Nucleic Acids Res 30:4643-4649 Kiss T (2001) Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs. EMBO J 20:3617-3622 Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109:145-148 Kiss T, Jady BE (2004) Functional characterization of 2'-O-methylation and pseudouridylation guide RNAs. Methods Mol Biol 265:393-408 Kiss-Laszlo Z, Henry Y, Bachellerie JP, Caizergues-Ferrer M, Kiss T (1996) Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell 85:1077-1088 Kiss-László Z, Henry Y, Kiss T (1998) Sequence and structural elements of methylation guide snoRNAs essential for site-specific ribose methylation of pre-rRNA. EMBO J 17:797-807 Klein DJ, Schmeing TM, Moore PB, Steitz TA (2001) The kink-turn: a new RNA secondary structure motif. EMBO J 20:4214-4221 Klein RJ, Misulovin Z, Eddy SR (2002) Noncoding RNA genes identified in AT-rich hyperthermophiles. Proc Natl Acad Sci USA 99:7542-7547 Koonin EV (1996) Pseudouridine synthases: four families of enzymes containing a putative uridine-binding motif also conserved in dUTPases and dCTP deaminases. Nucleic Acids Res 24:2411-2415 Kuhn JF, Tran EJ, Maxwell ES (2002) Archaeal ribosomal protein L7 is a functional homolog of the eukaryotic 15.5kD/Snu13p snoRNP core protein. Nucleic Acids Res 30:931-941 Lafontaine DL, Tollervey D (1998) Birth of the snoRNPs: the evolution of the modification-guide snoRNAs. Trends Biochem Sci 23:383-388 Lafontaine DL, Tollervey D (2000) Synthesis and assembly of the box C+D small nucleolar RNPs. Mol Cell Biol 20:2650-2659 Lafontaine DLJ, Tollervey D (1999) Nop58p is a common component of the box C+D snoRNPs that is required for snoRNA stability. 
RNA 5:455-467 Leverette RD, Andrews MT, Maxwell ES (1992) Mouse U14 snRNA is a processed intron of the cognate hsc70 heat shock pre-messenger RNA. Cell 71:1215-1221 Li HD, Zagorski J, Fournier MJ (1990) Depletion of U14 small nuclear RNA (snR128) disrupts production of 18S rRNA in Saccharomyces cerevisiae. Mol Cell Biol 10:1145-1152 Liang XH, Xu YX, Michaeli S (2002) The spliced leader-associated RNA is a trypanosome-specific sn(o) RNA that has the potential to guide pseudouridine formation on the SL RNA. RNA 8:237-246 Liu J, Maxwell ES (1990) Mouse U14 snRNA is encoded in an intron of the mouse cognate hsc70 heat shock gene. Nucleic Acids Res 18:6565-6571 Lowe TM, Eddy SR (1999) A computational screen for methylation guide snoRNAs in yeast. Science 283:1168-1171 Lubben B, Fabrizio P, Kastner B, Luhrmann R (1995) Isolation and characterization of the small nucleolar ribonucleoprotein particle snR30 from Saccharomyces cerevisiae. J Biol Chem 270:11549-11554

Lukowiak AA, Narayanan A, Li ZH, Terns RM, Terns MP (2001) The snoRNA domain of vertebrate telomerase RNA functions to localize the RNA within the nucleus. RNA 7:1833-1844 Ma X, Zhao X, Yu YT (2003) Pseudouridylation (Psi) of U2 snRNA in S. cerevisiae is catalyzed by an RNA-independent mechanism. EMBO J 22:1889-1897 Maden BE (1990) The numerous modified nucleotides in eukaryotic ribosomal RNA. Prog Nucleic Acid Res Mol Biol 39:241-303 Makarova OV, Makarov EM, Liu S, Vornlocher HP, Luhrmann R (2002) Protein 61K, encoded by a gene (PRPF31) linked to autosomal dominant retinitis pigmentosa, is required for U4/U6*U5 tri-snRNP formation and pre-mRNA splicing. EMBO J 21:1148-1157 Marker C, Zemann A, Terhorst T, Kiefmann M, Kastenmayer JP, Green P, Bachellerie JP, Brosius J, Huttenhofer A (2002) Experimental RNomics: identification of 140 candidates for small non-messenger RNAs in the plant Arabidopsis thaliana. Curr Biol 12:2002-2013 Marrone A, Mason PJ (2003) Dyskeratosis congenita. Cell Mol Life Sci 60:507-517 Martin JL, McMillan FM (2002) SAM (dependent) I AM: the S-adenosylmethionine-dependent methyltransferase fold. Curr Opin Struct Biol 12:783-793 Maser RL, Calvet JP (1989) U3 small nuclear RNA can be psoralen-cross-linked in vivo to the 5' external transcribed spacer of pre-ribosomal-RNA. Proc Natl Acad Sci USA 86:6523-6527 Massenet S, Motorin Y, Lafontaine DL, Hurt EC, Grosjean H, Branlant C (1999) Pseudouridine mapping in the Saccharomyces cerevisiae spliceosomal U small nuclear RNAs (snRNAs) reveals that pseudouridine synthase pus1p exhibits a dual substrate specificity for U2 snRNA and tRNA. Mol Cell Biol 19:2142-2154 Massenet S, Mougin A, Branlant C (1998) Posttranscriptional modifications in the U small nuclear RNAs. In: Grosjean H (ed) Modification and Editing of RNA. ASM Press, Washington, DC, pp 201-228 Maxwell ES, Fournier MJ (1995) The small nucleolar RNAs. 
Annu Rev Biochem 64:897-934 Maxwell ES, Martin TE (1986) A low-molecular-weight RNA from mouse ascites cells that hybridizes to both 18S rRNA and mRNA sequences. Proc Natl Acad Sci USA 83:7261-7265 McPheeters DS, Abelson J (1992) Mutational analysis of the yeast U2 snRNA suggests a structural similarity to the catalytic core of group I introns. Cell 71:819-831 McPheeters DS, Fabrizio P, Abelson J (1989) In vitro reconstitution of functional yeast U2 snRNPs. Genes Dev 3:2124-2136 Meier UT (2003) Dissecting dyskeratosis. Nat Genet 33:116-117 Mitchell JR, Cheng J, Collins K (1999a) A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Mol Cell Biol 19:567-576 Mitchell JR, Wood E, Collins K (1999b) A telomerase component is defective in the human disease dyskeratosis congenita. Nature 402:551-555 Moore T, Zhang Y, Fenley MO, Li H (2004) Molecular basis of box C/D RNA-protein interactions; cocrystal structure of archaeal L7Ae and a box C/D RNA. Structure (Camb) 12:807-818 Motorin Y, Grosjean H (1998) Chemical structures and classification of posttranscriptionally modified nucleotides in RNA. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 543-549

Newby MI, Greenbaum NL (2001) A conserved pseudouridine modification in eukaryotic U2 snRNA induces a change in branch-site architecture. RNA 7:833-845 Newby MI, Greenbaum NL (2002) Sculpting of the spliceosomal branch site recognition motif by a conserved pseudouridine. Nat Struct Biol 9:958-965 Ni J, Tien AL, Fournier MJ (1997) Small nucleolar RNAs direct site-specific synthesis of pseudouridine in ribosomal RNA. Cell 89:565-573 Nieuwlandt DT, Carr MB, Daniels CJ (1993) In vivo processing of an intron-containing archaeal tRNA. Mol Microbiol 8:93-99 Noon KR, Bruenger E, McCloskey JA (1998) Posttranscriptional modifications in 16S and 23S rRNAs of the archaeal hyperthermophile Sulfolobus solfataricus. J Bacteriol 180:2883-2888 Ochs RL, Lischwe MA, Spohn WH, Busch H (1985) Fibrillarin: a new protein of the nucleolus identified by autoimmune sera. Biol Cell 54:123-133 Ofengand J (2002) Ribosomal RNA pseudouridines and pseudouridine synthases. FEBS Lett 514:17-25 Ofengand J, Fournier M (1998) The pseudouridine residues of rRNA: number, location, biosynthesis, and function. In: Grosjean H, Benne R (eds) Modification and Editing of RNA. ASM Press, Washington, DC, pp 229-253 Olivas WM, Muhlrad D, Parker R (1997) Analysis of the yeast genome: identification of new non-coding and small ORF-containing RNAs. Nucleic Acids Res 25:4619-4625 Omer AD, Lowe TM, Russell AG, Ebhardt H, Eddy SR, Dennis PP (2000) Homologs of small nucleolar RNAs in Archaea. Science 288:517-522 Omer AD, Ziesche S, Decatur WA, Fournier MJ, Dennis PP (2003) RNA-modifying machines in archaea. Mol Microbiol 48:617-629 Omer AD, Ziesche S, Ebhardt H, Dennis PP (2002) In vitro reconstitution and activity of a C/D box methylation guide ribonucleoprotein complex. 
Proc Natl Acad Sci USA 99:5289-5294 Pan ZQ, Ge H, Fu XY, Manley JL, Prives C (1989) Oligonucleotide-targeted degradation of U1 and U2 snRNAs reveals differential interactions of simian virus 40 pre-mRNAs with snRNPs. Nucleic Acids Res 17:6553-6568 Peculis B (1997) RNA processing: pocket guides to ribosomal RNA. Curr Biol 7:R480-482 Pelczar P, Filipowicz W (1998) The host gene for intronic U17 small nucleolar RNAs in mammals has no protein-coding potential and is a member of the 5'-terminal oligopyrimidine gene family. Mol Cell Biol 18:4509-4518 Pogacic V, Dragon F, Filipowicz W (2000) Human H/ACA small nucleolar RNPs and telomerase share evolutionarily conserved proteins NHP2 and NOP10. Mol Cell Biol 20:9028-9040 Qu LH, Henras A, Lu YJ, Zhou H, Zhou WX, Zhu YQ, Zhao J, Henry Y, Caizergues-Ferrer M, Bachellerie JP (1999) Seven novel methylation guide small nucleolar RNAs are processed from a common polycistronic transcript by Rat1p and RNase III in yeast. Mol Cell Biol 19:1144-1158 Qu LH, Meng Q, Zhou H, Chen YQ (2001) Identification of 10 novel snoRNA gene clusters from Arabidopsis thaliana. Nucleic Acids Res 29:1623-1630 Rashid R, Aittaleb M, Chen Q, Spiegel K, Demeler B, Li H (2003) Functional requirement for symmetric assembly of archaeal box C/D small ribonucleoprotein particles. J Mol Biol 333:295-306

Raychaudhuri S, Conrad J, Hall BG, Ofengand J (1998) A pseudouridine synthase required for the formation of two universally conserved pseudouridines in ribosomal RNA is essential for normal growth of Escherichia coli. RNA 4:1407-1417 Reddy R, Busch H (1988) Small nuclear RNAs: RNA sequences, structure, and modifications. In: Birnstiel ML (ed) Structure and function of major and minor small nuclear ribonucleoprotein particles. Springer-Verlag Press, Heidelberg, pp 1-37 Rimoldi OJ, Raghu B, Nag MK, Eliceiri GL (1993) Three new small nucleolar RNAs that are psoralen cross-linked in vivo to unique regions of pre-rRNA. Mol Cell Biol 13:4382-4390 Rozhdestvensky TS, Tang TH, Tchirkova IV, Brosius J, Bachellerie JP, Huttenhofer A (2003) Binding of L7Ae protein to the K-turn of archaeal snoRNAs: a shared RNA binding motif for C/D and H/ACA box snoRNAs in Archaea. Nucleic Acids Res 31:869-877 Ruff EA, Rimoldi OJ, Raghu B, Eliceiri GL (1993) Three small nucleolar RNAs of unique nucleotide sequences. Proc Natl Acad Sci USA 90:635-638 Ruggero D, Grisendi S, Piazza F, Rego E, Mari F, Rao PH, Cordon-Cardo C, Pandolfi PP (2003) Dyskeratosis congenita and cancer in mice deficient in ribosomal RNA modification. Science 299:259-262 Samarsky DA, Fournier MJ (1999) A comprehensive database for the small nucleolar RNAs from Saccharomyces cerevisiae. Nucleic Acids Res 27:161-164 Schattner P, Decatur WA, Davis CA, Ares M Jr, Fournier MJ, Lowe TM (2004) Genomewide searching for pseudouridylation guide snoRNAs: analysis of the Saccharomyces cerevisiae genome. Nucleic Acids Res 32:4281-4296 Schubert HL, Blumenthal RM, Cheng X (2003) Many paths to methyltransfer: a chronicle of convergence. 
Trends Biochem Sci 28:329-335 Segault V, Will CL, Sproat BS, Luhrmann R (1995) In vitro reconstitution of mammalian U2 and U5 snRNPs active in splicing: Sm proteins are functionally interchangeable and are essential for the formation of functional U2 and U5 snRNPs. EMBO J 14:4010-4021 Singh SK, Gurha P, Tran EJ, Maxwell ES, Gupta R (2004) A trans mechanism for archaeal tRNAtrp nucleotide 2'-O-methylation guided by the pre-tRNATrp intron-encoded box C/D RNPs. The 2004 RNA meeting abstract:744 Smith CM, Steitz JA (1997) Sno storm in the nucleolus: new roles for myriad small RNPs. Cell 89:669-672 Smith CM, Steitz JA (1998) Classification of gas5 as a multi-small-nucleolar-RNA (snoRNA) host gene and a member of the 5'-terminal oligopyrimidine gene family reveals common features of snoRNA host genes. Mol Cell Biol 18:6897-6909 Speckmann WA, Li ZH, Lowe TM, Eddy SR, Terns RM, Terns MP (2002) Archaeal guide RNAs function in rRNA modification in the eukaryotic nucleus. Curr Biol 12:199-203 Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 26:148-153 Starostina NG, Marshburn S, Johnson LS, Eddy SR, Terns RM, Terns MP (2004) Circular box C/D RNAs in Pyrococcus furiosus. Proc Natl Acad Sci USA 101:14097-14101 Steitz JA, Tycowski KT (1995) Small RNA chaperones for ribosome biogenesis. Science 270:1626-1627 Stroke IL, Weiner AM (1989) The 5' end of U3 snRNA can be crosslinked in vivo to the external transcribed spacer of rat ribosomal RNA precursors. J Mol Biol 210:497-512

Szewczak LB, DeGregorio SJ, Strobel SA, Steitz JA (2002) Exclusive interaction of the 15.5 kD protein with the terminal box C/D motif of a methylation guide snoRNP. Chem Biol 9:1095-1107 Tang TH, Bachellerie JP, Rozhdestvensky T, Bortolin ML, Huber H, Drungowski M, Elge T, Brosius J, Huttenhofer A (2002) Identification of 86 candidates for small non-messenger RNAs from the archaeon Archaeoglobus fulgidus. Proc Natl Acad Sci USA 99:7536-7541 Terns MP, Terns RM (2002) Small nucleolar RNAs: versatile trans-acting molecules of ancient evolutionary origin. Gene Expr 10:17-39 Tollervey D, Lehtonen H, Carmo-Fonseca M, Hurt EC (1991) The small nucleolar RNP protein NOP1 (fibrillarin) is required for pre-rRNA processing in yeast. EMBO J 10:573-583 Tollervey D, Lehtonen H, Jansen R, Kern H, Hurt EC (1993) Temperature-sensitive mutations demonstrate roles for yeast fibrillarin in pre-rRNA processing, pre-rRNA methylation, and ribosome assembly. Cell 72:443-457 Tran E, Brown J, Maxwell ES (2004) Evolutionary origins of the RNA-guided nucleotide-modification complexes: from the primitive translation apparatus? Trends Biochem Sci 29:343-350 Tran EJ, Zhang X, Maxwell ES (2003) Efficient RNA 2'-O-methylation requires juxtaposed and symmetrically assembled archaeal box C/D and C'/D' RNPs. EMBO J 22:3930-3940 Trinh-Rohlik Q, Maxwell ES (1988) Homologous genes for mouse 4.5S hybRNA are found in all eukaryotes and their low molecular weight RNA transcripts intermolecularly hybridize with eukaryotic 18S ribosomal RNAs. Nucleic Acids Res 16:6041-6056 Tyc K, Steitz JA (1989) U3, U8 and U13 comprise a new class of mammalian snRNPs localized in the cell nucleolus. EMBO J 8:3113-3119 Tycowski KT, Shu MD, Steitz JA (1996) A mammalian gene with introns instead of exons generating stable RNA products. 
Nature 379:464-466 Tycowski KT, Steitz JA (2001) Non-coding snoRNA host genes in Drosophila: expression strategies for modification guide snoRNAs. Eur J Cell Biol 80:119-125 Tycowski KT, You ZH, Graham PJ, Steitz JA (1998) Modification of U6 spliceosomal RNA is guided by other small RNAs. Mol Cell 2:629-638 Uliel S, Liang XH, Unger R, Michaeli S (2004) Small nucleolar RNAs that guide modification in trypanosomatids: repertoire, targets, genome organisation, and unique functions. Int J Parasitol 34:445-454 Valadkhan S, Manley JL (2003) Characterization of the catalytic activity of U2 and U6 snRNAs. RNA 9:892-904 Vidovic I, Nottrott S, Hartmuth K, Luhrmann R, Ficner R (2000) Crystal structure of the spliceosomal 15.5kD protein bound to a U4 snRNA fragment. Mol Cell 6:1331-1342 Villa T, Ceradini F, Presutti C, Bozzoni I (1998) Processing of the intron-encoded U18 small nucleolar RNA in the yeast Saccharomyces cerevisiae relies on both exo- and endonucleolytic activities. Mol Cell Biol 18:3376-3383 Vitali P, Royo H, Seitz H, Bachellerie JP, Huttenhofer A, Cavaille J (2003) Identification of 13 novel human modification guide RNAs. Nucleic Acids Res 31:6543-6551 Wang C, Meier UT (2004) Architecture and assembly of mammalian H/ACA small nucleolar and telomerase ribonucleoproteins. EMBO J 23:1857-1867

Wang C, Query CC, Meier UT (2002) Immunopurified small nucleolar ribonucleoprotein particles pseudouridylate rRNA independently of their association with phosphorylated Nopp140. Mol Cell Biol 22:8457-8466 Wang H, Boisvert D, Kim KK, Kim R, Kim SH (2000) Crystal structure of a fibrillarin homologue from Methanococcus jannaschii, a hyperthermophile, at 1.6 Å resolution. EMBO J 19:317-323 Watanabe Y, Gray MW (2000) Evolutionary appearance of genes encoding proteins associated with box H/ACA snoRNAs: cbf5p in Euglena gracilis, an early diverging eukaryote, and candidate Gar1p and Nop10p homologs in archaebacteria. Nucleic Acids Res 28:2342-2352 Watkins KP, Dungan JM, Agabian N (1994) Identification of a small RNA that interacts with the 5' splice site of the Trypanosoma brucei spliced leader RNA in vivo. Cell 76:171-182 Watkins NJ, Gottschalk A, Neubauer G, Kastner B, Fabrizio P, Mann M, Lührmann R (1998) Cbf5p, a potential pseudouridine synthase, and Nhp2p, a putative RNA-binding protein, are present together with Gar1p in all H BOX/ACA-motif snoRNPs and constitute a common bipartite structure. RNA 4:1549-1568 Watkins NJ, Segault V, Charpentier B, Nottrott S, Fabrizio P, Bachi A, Wilm M, Rosbash M, Branlant C, Luhrmann R (2000) A common core RNP structure shared between the small nucleolar box C/D RNPs and the spliceosomal U4 snRNP. Cell 103:457-466 Weidenhammer EM, Ruiz-Noriega M, Woolford JL Jr (1997) Prp31p promotes the association of the U4/U6 x U5 tri-snRNP with prespliceosomes to form spliceosomes in Saccharomyces cerevisiae. Mol Cell Biol 17:3580-3588 Wise JA, Tollervey D, Maloney D, Swerdlow H, Dunn EJ, Guthrie C (1983) Yeast contains small nuclear RNAs encoded by single copy genes. Cell 35:743-751 Yu YT, Scharl EC, Smith CM, Steitz JA (1999) The growing world of small nuclear ribonucleoproteins. In: Gesteland RF, Cech TR, Atkins JF (eds) The RNA world, 2nd edn. 
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, pp 487-524 Yu YT, Shu MD, Steitz JA (1997) A new method for detecting sites of 2'-O-methylation in RNA molecules. RNA 3:324-331 Yu YT, Shu MD, Steitz JA (1998) Modifications of U2 snRNA are required for snRNP assembly and pre-mRNA splicing. EMBO J 17:5783-5795 Yuan G, Klambt C, Bachellerie JP, Brosius J, Huttenhofer A (2003) RNomics in Drosophila melanogaster: identification of 66 candidates for novel non-messenger RNAs. Nucleic Acids Res 31:2495-2507 Zagorski J, Tollervey D, Fournier MJ (1988) Characterization of an SNR gene locus in Saccharomyces cerevisiae that specifies both dispensable and essential small nuclear RNAs. Mol Cell Biol 8:3282-3290 Zebarjadian Y, King T, Fournier MJ, Clarke L, Carbon J (1999) Point mutations in yeast CBF5 can abolish in vivo pseudouridylation of rRNA. Mol Cell Biol 19:7461-7472 Zhao X, Li ZH, Terns RM, Terns MP, Yu YT (2002) An H/ACA guide RNA directs U2 pseudouridylation at two different sites in the branchpoint recognition region in Xenopus oocytes. RNA 8:1515-1525 Zhao X, Yu YT (2004a) Detection and quantitation of RNA base modifications. RNA 10:996-1002 Zhao X, Yu YT (2004b) Pseudouridines in and near the branch site recognition region of U2 snRNA are required for snRNP biogenesis and pre-mRNA splicing in Xenopus oocytes. RNA 10:681-690

Zhu Y, Tomlinson RL, Lukowiak AA, Terns RM, Terns MP (2004) Telomerase RNA accumulates in Cajal bodies in human cancer cells. Mol Biol Cell 15:81-90

Terns, Michael P. Department of Biochemistry and Molecular Biology, University of Georgia, Life Sciences Building, Athens, Georgia 30602, USA
Terns, Rebecca M. Department of Biochemistry and Molecular Biology, University of Georgia, Life Sciences Building, Athens, Georgia 30602, USA
Yu, Yi-Tao Department of Biochemistry and Biophysics, University of Rochester Medical Center, 601 Elmwood Avenue, Rochester, NY 14642, USA