RNA Tertiary Structure - IBMC

RNA Tertiary Structure, Eric Westhof and Pascal Auffinger, in Encyclopedia of Analytical Chemistry, R.A. Meyers (Ed.), pp. 5222–5232, © John Wiley & Sons Ltd, Chichester, 2000

Eric Westhof and Pascal Auffinger
Institute of Molecular and Cellular Biology, Strasbourg, France

1 Introduction
2 Chemical Structure of RNA
3 Definitions of Secondary and Tertiary Structures
4 RNA Tertiary Motifs
5 Instrumental Methods
5.1 Physical Methods
5.2 Phylogenetic Approach
5.3 Computer Modeling
6 Conclusions
Abbreviations and Acronyms
Related Articles
References

1 INTRODUCTION

Ribonucleic acids are negatively charged polymers assembled from four different types of monomers. Each monomer is made of an invariant phosphorylated sugar to which is attached one of the four standard nucleic acid bases: the pyrimidines uracil and cytosine, and the purines guanine and adenine. The first level of organization is thus the sequence of bases attached to the sugar–phosphate backbone. In salty water, RNA molecules fold back on themselves via Watson–Crick base pairing between the bases (A with U, G with C or U), leading to double-stranded helices interrupted by single-stranded regions in internal loops or hairpin loops. The enumeration of the base-paired regions or helices constitutes a description of the second level of organization, the secondary structure. The methods available to deduce the secondary structure of an RNA molecule are mainly of three types: the phylogenetic approach, theoretical prediction, and chemical/enzymatic methods. The secondary structure of an RNA molecule is experimentally accessible and its content measurable. Under appropriate conditions, structured RNA molecules undergo a transition to a three-dimensional (3D) fold in which the helices and the unpaired regions are precisely organized in space. This folding process usually depends on the presence of divalent ions, such as magnesium ions, and on the temperature. The tertiary structure is the level of organization relevant for the biological function of structured RNA molecules.

Sections 2–4 discuss these structural properties of RNA. The instrumental methods leading to the elucidation of the tertiary structure are then described. The first method relies on X-ray crystallography of single crystals of purified RNA molecules. The second most important structural method is nuclear magnetic resonance (NMR). Finally, the phylogenetic method coupled to computer modeling and experimental approaches is described.
An overview of the RNA motifs underlying the assembly of RNA molecules into complex tertiary folds is given. These include loop–loop interactions, some of which constitute pseudoknots, tetraloops and their receptors, as well as structured internal loops such as the 5S loop E motif. The role of non-Watson–Crick base pairing in tertiary structure is emphasized throughout.

2 CHEMICAL STRUCTURE OF RNA

Nucleic acid biopolymers comprise the DNA and RNA molecules. These two types of molecules possess very different functional roles. In brief, DNA molecules (deoxyribonucleic acids) contain the genetic code, whereas the more versatile RNA molecules (ribonucleic acids) are

Figure 1 Chemical structure of a ···AUGC··· RNA sequence drawn from the 5′ to the 3′ end; the nucleotide unit and the chain direction are indicated. The RNA bases are frequently modified, especially in transfer RNAs (tRNAs), where thymine (5-methyl U, or T) or pseudouridine (Ψ, in which there is a C–C glycosyl bond between the base C5 and the sugar C1′ atoms) is often found.

Figure 2 Possible interaction sites around the standard Watson–Crick base pairs G–C and A–U as well as the wobble G°U pair and the sugar–phosphate backbone atoms. The edges facing the deep and shallow grooves of RNA helices are indicated. Hydrogen bond acceptor sites are marked by inward-pointing bold arrows, hydrogen bond donor sites by outward-pointing bold arrows, and weak hydrogen bond donor sites (C–H groups) by thin arrows. The RNA 2′-hydroxyl group (see inset) simultaneously displays hydrogen bond donor and acceptor potential.

involved in almost all crucial life processes, and especially in the translation of the genetic code into proteins. The four ribonucleosides that incorporate the purine bases, adenine (A) and guanine (G), and the pyrimidine bases, uracil (U) and cytosine (C), constitute the basic building blocks of an RNA polymeric chain (Figure 1). These nucleosides comprise a ribose sugar ring and a purine (A, G) or a pyrimidine (C, U) base. They are connected together by phosphodiester linkages. A nucleoside together with its phosphodiester unit is called a nucleotide. The bases of an RNA polymeric chain associate with the complementary bases of the same chain or of another RNA chain by forming purine–pyrimidine A–U or G–C Watson–Crick base pairs (see Figure 2). The association of two self-complementary strands results in the formation of a right-handed double helical structure (Figure 3).

Although DNA and RNA molecules possess very distinct biological functions, chemically RNA differs from DNA in only two respects: (a) the absence of a methyl group at position 5 of the uridine (U), and (b) the presence of a 2′-hydroxyl group on the RNA ribose sugar (Figure 1). These two small chemical modifications account for the profound functional and structural differences observed between DNA and RNA molecules. In contrast to DNA but similar to proteins, single-stranded polynucleotide RNA chains can fold into a variety of complex 3D structures (1). This ability to form complex folds is exemplified by tRNA molecules, which consist of a single chain of about 70 nucleotides (Figure 3) (2). The analogy with proteins also includes the fact that some RNA molecules, called ribozymes, are able to perform biologically crucial catalytic reactions.

In nucleic acids, there are six torsion angles in the sugar–phosphate backbone, with an additional one between the base and the sugar (see Figure 5 in Berman's article X-ray Structures of Nucleic Acids). Sundaralingam severely reduced the variations in two torsion angles and constrained the other five to preferred domains of variation (3). Thus, the torsion angles about the C–O bonds remain at about 180°, whereas the sugar rings adopt either the C3′-endo or the C2′-endo pucker and the base is in either an anti or a syn orientation with respect to the sugar. In proteins, the torsion angles on either side of the peptide bond constitute the main flexible links, whereas in nucleic acids the phosphodiester bonds themselves direct chain re-orientations.

RNA molecules in helical form (Figure 3) are not much affected by changes in their environment and adopt essentially the A-form in the helical parts of their structures under nearly all conditions (4). However, many RNA molecules need specific divalent cations in order to fold properly into biologically active conformations (5,6).
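The backbone and glycosidic torsion angles mentioned above are ordinary dihedral angles defined by four consecutive atoms. As an illustration only, the following minimal Python sketch computes such an angle from Cartesian coordinates. The function name `torsion_angle` and the toy coordinates are assumptions made for this example and are not taken from the article; the sign convention is the usual one in which a clockwise rotation, viewed along the central bond, is positive.

```python
import math


def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle (degrees) about the p2-p3 bond for four 3D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1 = cross(b1, b2)  # normal to the plane through p1, p2, p3
    n2 = cross(b2, b3)  # normal to the plane through p2, p3, p4
    x = dot(n1, n2)
    y = dot(cross(n1, n2), b2) / math.sqrt(dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Toy coordinates (hypothetical, not real atomic positions):
# a trans arrangement, as quoted for the C-O bonds, gives about 180 degrees.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
# A quarter turn gives 90 degrees.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the coordinates of four consecutive backbone atoms from a structure file, the same function would return the corresponding backbone torsion angle; reading such files is outside the scope of this sketch.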

Figure 3 (a) Side and top views of the A-form adopted by RNA double helical structures. Notice the deep groove (corresponding to the major groove of B-DNA) and the shallow groove (corresponding to the minor groove of B-DNA). At the right, the 3D structure of a tRNA molecule resulting from the complex folding of a single polymer chain comprising 75 nucleotides is represented. (b) Schematic representation of the secondary and tertiary structures of yeast tRNA for aspartic acid, with the acceptor stem, the thymine, dihydrouridine, and anticodon hairpins, and the variable loop labeled. The dotted lines represent the tertiary pairs, which, except for one (G19–C56), are of non-Watson–Crick nature. Notice the co-axial stacking between the acceptor and thymine helical stems as well as between the dihydrouridine (a uridine residue saturated at the C5=C6 bond) and the anticodon helical stems. At the interface between the dihydrouridine and anticodon stems there is usually a non-Watson–Crick pair. The thymine and dihydrouridine loops are connected via two important base pairs, a Watson–Crick G19–C56 pair and a bifurcated G18•Ψ55 pair. This spatial relationship between the secondary elements is further stabilized by triples between the single-stranded junctions (residues 8–9 and variable junction) and the deep groove of the dihydrouridine helix.

The absence of such ions may induce profound structural changes, such as loss of 3D interactions (7). Non-Watson–Crick base associations (8) are often found in RNA structures, where they serve as specific recognition elements for proteins, nucleic acids, and ligands, or as ion-binding sites. Such noncanonical base pairs are linked by at least one interbase hydrogen bond and occasionally involve water-mediated base–base interactions. Wobble base pairs are typical of RNA molecules and can be inserted without great distortions into regular Watson–Crick helices (9). The wobble base pairs display a characteristic shift of their Watson–Crick interaction sites. Two of these sites point, respectively, toward the shallow and deep grooves and are, thus, completely accessible to the solvent. Wobble base pairs comprise essentially the G°U base pairs. For these base pairs, the guanine amino group protrudes into the shallow groove and the O4 atom of the uracil

Figure 4 Representation of the secondary and tertiary structure of an artificial group I intron based on that of Tetrahymena thermophila. Some tertiary motifs are highlighted (labeled I–V). Motifs I and II are loop–loop interactions leading to a helical structure maintained via standard Watson–Crick pairings. Motif III consists of a GUGA tetraloop interacting with the shallow groove of two base pairs embedded within a helix. Motif IV is made of a GAAA tetraloop interacting with its receptor, the 11nt-motif, a complex structure maintained by an AA-platform (two consecutive As in the same plane (26)) and an A•U Hoogsteen base pair. In motif V, part of an A-rich bulge interacts with C–G pairs in the shallow groove of a helix.

base points toward the deep groove. Furthermore, G°U base pairs display a deep groove side that is lined with three hydrophilic acceptor atoms (N7, O6, O4) and no hydrophilic donor atoms, whereas the shallow groove side presents a cavity wide enough to trap a water molecule. Both grooves therefore present unique recognition patterns (10).


3 DEFINITIONS OF SECONDARY AND TERTIARY STRUCTURES

Probably the most straightforward definition of secondary structure would include any nucleotide such that both itself and at least one of its immediate neighbors in the 5′ or 3′ direction are involved in classical (Watson–Crick and G°U) base pairing with a stretch of nucleotides in antiparallel orientation (Figures 3 and 4). Ideally, one would like an RNA secondary structure to be a planar graph which can be represented as a tree, i.e. the lines connecting the paired bases do not intersect (11).

The secondary structure defines various elements. A hairpin loop is formed when an RNA strand folds back on itself. In an internal loop, at least one base is unpaired on each strand of the loop separating two paired regions. A mismatch is a special type of internal loop in which only one nucleotide is unpaired on each strand. A bulge has unpaired nucleotides on only one strand; the other strand has uninterrupted base pairing. A multibranched loop occurs where three or more double-stranded regions, separated by any number of unpaired nucleotides, come together.

In the next level of organization, the tertiary structure, the secondary structure elements are associated through numerous van der Waals contacts and specific hydrogen bonds, via the formation of a small number of additional Watson–Crick pairs and/or unusual pairs involving hairpin loops or internal bulges. The separation of energy levels between secondary and tertiary structures is reasonable in large RNAs, considering the relative energies and the clear identification of the secondary structure elements. In some cases, it is even possible to cut RNA molecules into modular domains which can re-associate only through tertiary contacts.
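The loop definitions given above can be made concrete with a short program. The following is a minimal Python sketch, not a method from the article: it assumes that the secondary structure is written in dot-bracket notation (matching parentheses for paired positions, dots for unpaired ones), a common convention that the text itself does not use, and the function names `parse_pairs` and `classify_loops` are hypothetical. Each loop is reported with the base pair that closes it.

```python
def parse_pairs(structure):
    """Map every paired position to its partner (0-based indices)."""
    stack, partner = [], {}
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            opener = stack.pop()
            partner[opener], partner[pos] = pos, opener
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return partner


def classify_loops(structure):
    """List (i, j, kind) for each loop closed by the base pair (i, j)."""
    partner = parse_pairs(structure)
    loops = []
    for i in sorted(partner):
        j = partner[i]
        if j < i:
            continue  # visit each pair once, from its 5' partner
        # Collect the base pairs lying directly inside (i, j).
        inner, k = [], i + 1
        while k < j:
            if k in partner:
                inner.append((k, partner[k]))
                k = partner[k] + 1  # jump over the enclosed helix
            else:
                k += 1
        if not inner:
            kind = "hairpin loop"
        elif len(inner) > 1:
            kind = "multibranched loop"
        else:
            p, q = inner[0]
            unpaired_5p, unpaired_3p = p - i - 1, j - q - 1
            if unpaired_5p == 0 and unpaired_3p == 0:
                continue  # stacked pairs: the helix simply continues
            if unpaired_5p == 0 or unpaired_3p == 0:
                kind = "bulge"
            elif unpaired_5p == 1 and unpaired_3p == 1:
                kind = "mismatch"
            else:
                kind = "internal loop"
        loops.append((i, j, kind))
    return loops


print(classify_loops("((.((...))..))"))
# [(1, 12, 'internal loop'), (4, 8, 'hairpin loop')]
```

Because the parser uses a single stack, it accepts only structures whose pairing lines do not intersect, which corresponds to the tree-like planar graph described above; pseudoknots cannot be written in this simple notation.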

Figure 5 Two representations of a pseudoknot with co-axial stacking of the two generated helices (stems 1 and 2, linked by loops 1–3). Some consecutive bases (usually 6–8) of a hairpin loop are complementary, in the Watson–Crick sense, to bases in either the 5′- or the 3′-strand leading to the hairpin. A second helix can thus be formed once one hairpin is folded. In the classical type of pseudoknot represented (with segment L2 absent or with only one or two residues), the two helices can stack upon each other, leading to a stabilization of the tertiary motif. The single-stranded segments L1 and L3, which link stems 1 and 2, cross, respectively, the deep and shallow grooves of the contiguous helical stack, with which they can make further base–base or base–sugar–phosphate backbone interactions.

4 RNA TERTIARY MOTIFS

RNA tertiary structure comprises those interactions involving (a) two helices, (b) two unpaired regions, or (c) one unpaired region and a double-stranded helix. The interactions between two helices are basically of two types: either two helices with a contiguous strand stack on each other, or two distant helices position themselves so that their shallow grooves fit. An unpaired region belongs to either a single-stranded stretch (forming an internal loop or a bulge) or a hairpin loop closing a helix. Interactions between two unpaired regions lead to pseudoknots if a single loop is involved and to loop–loop motifs otherwise (Figure 5). Interactions between an unpaired region and a double-stranded helix can lead to various types of motifs. Pairing of a single-stranded stretch in either the deep or the shallow groove of a double helix yields a triple helix. One motif is known where the unpaired region constitutes a terminal loop, in which GNRA tetraloops bind the shallow groove of the helix (see Figure 6). Those motifs involving single-stranded stretches are especially rich in potential to form tertiary structure because they can lead to co-axiality between helices. Depending on their sequence, internal loops or bulges could constitute 3D motifs, but such motifs have not yet been characterized.

The co-axial stacking of helices, together with specific helix–helix contacts or helix–loop interactions, leads to compact RNA assemblies, generally in the presence of divalent ions or polyamines (Figure 7).
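In terms of base-pair indices, a pseudoknot such as the one in Figure 5 shows up as two pairs whose connecting lines intersect, so that the structure is no longer a tree. The following minimal Python sketch detects such crossings; the function name `crossing_pairs` and the example coordinates are assumptions made for illustration and do not describe a real molecule.

```python
def crossing_pairs(pairs):
    """Return all couples of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted((min(a, b), max(a, b)) for a, b in pairs)
    crossings = []
    for first in range(len(ordered)):
        i, j = ordered[first]
        for second in range(first + 1, len(ordered)):
            k, l = ordered[second]
            if i < k < j < l:  # the two pairing lines intersect
                crossings.append(((i, j), (k, l)))
    return crossings


# Hypothetical 18-nucleotide example: stem 1 closes a hairpin, and three
# bases of its loop (positions 5-7) pair with the 3' strand to form stem 2.
stem1 = [(0, 12), (1, 11), (2, 10)]
stem2 = [(5, 17), (6, 16), (7, 15)]

print(crossing_pairs(stem1))               # [] : a plain hairpin is nested
print(len(crossing_pairs(stem1 + stem2)))  # 9  : every stem 1 pair crosses every stem 2 pair
```

The quadratic scan is adequate for a sketch; it flags crossings only and says nothing about whether the two helices stack co-axially.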

5 INSTRUMENTAL METHODS

The experimental observations used for deriving a 3D structure, and thus the tertiary motifs, can be of quite different nature depending on the techniques employed.

Figure 6 The three main tertiary motifs mediated by the tetraloops of the GNRA family. In the first two, (a) and (b) (loops labeled GYGA and GYAA), the third and fourth residues of the GYRA loop form triples in the shallow grooves of helices so that G binds a U–A pair and A a C–G pair. In the third type of contact (c), a GAAA tetraloop binds a complex 11nt-receptor motif with a central U•A pair in the Hoogsteen configuration. Each A of the loop binds principally a different base pair in the receptor.

Figure 7 Stereoview of the modeled architecture of the intron of Tetrahymena thermophila (27). Some of the tertiary motifs, shown in black, are numbered as in Figure 4.

These range from biophysical methods (X-ray diffraction data, NMR couplings or nuclear Overhauser effects, and other spectroscopic methods such as ultraviolet, Raman, or circular dichroism), to biochemical approaches (chemical probing or enzymatic attack), and biological data (sequences, phylogenies, and in vitro selection).

High-resolution X-ray crystallographic analysis (diffraction data at 1.5–0.9 Å resolution) yields a wealth of unequalled 3D information. Nucleic acids are difficult to crystallize because they are highly charged macromolecules which, in the case of RNA molecules, can undergo spontaneous cleavage. In addition, large nucleic acids, and especially RNAs, often exchange between various base pairings and foldings. Recently, several new RNA crystal structures have appeared, extending our structural knowledge enormously since the days of the tRNAs (1,12). Those developments were possible, first, following the introduction of methods for preparing RNA molecules on a large scale by either chemical synthesis or transcription with the T7 DNA-dependent RNA polymerase and, second, following progress in crystallographic techniques, especially cryocrystallography and the availability of synchrotron sources (see the article X-ray Structures of Nucleic Acids). NMR has proved useful in this area (reviewed in the chapter by Eriksson, Nuclear Magnetic Resonance and Nucleic Acid Structures).

Chemical and enzymatic probing of nucleic acids in solution yields important information on the stability of the structures and on those bases protected from chemical or enzymatic attack. However, such experimental approaches do not reveal the nature of the interacting partners. Cross-linking experiments have the potential to give that information, but the cross-linking reactions take place in an assembly of molecules that are generally not all in the same state, and it is difficult to prove that the reactions occurred solely on functional molecules.

Sequence data are extremely rich in potential 3D information, as they result from adaptive evolution over millions of years. Thus, if the function is identical and the sequences are sufficiently diverse, the noise level (or covariations resulting from contingencies) will be decreased by sequence comparisons.

5.1 Physical Methods

The physical methods used for determining RNA tertiary structure are described in other articles. The most accurate method relies on X-ray diffraction of single crystals (Figure 8), as reviewed in the article X-ray Structures of Nucleic Acids. NMR spectroscopy has progressed and has allowed the determination of several solution structures of free RNAs as well as of RNA complexes with peptides, proteins, or antibiotics. This approach is reviewed in the chapter by Eriksson, Nuclear Magnetic Resonance and Nucleic Acid Structures. Most published structures are accessible and can be retrieved from the Nucleic Acid Database (http://ndbserver.rutgers.edu/). The RNA tertiary structure is subtended by the secondary structure and this

Figure 8 The cumulative number of RNA crystal structures alone (light shading) and in complexes (dark shading) accessible in the Nucleic Acid Database (http://ndbserver.rutgers.edu/). Vertical axis: number of structures in the database (0–100); horizontal axis: year (1975–2000), with data to the end of September 1999. Milestones marked on the plot: first tRNA structures, first tRNA/synthetase complex, first oligomer, chemical synthesis, in vitro transcription, sparse-matrix, cryocrystallography, synchrotron radiation, crystal design, hammerhead ribozyme, P4-P6 domain of group I intron, HDV ribozyme, complete group I intron (5 Å), and ribosomal 30S fragment (5.5 Å).

needs to be established before large-scale synthesis is envisaged for crystallogenesis or NMR experiments. The best way to derive secondary structures is by using sequence comparisons.

5.2 Phylogenetic Approach

When a set of homologous sequences (sequences with common ancestry and function) is available, one can search, with the help of a rough alignment, for compensatory base changes that maintain base-paired helices and, therefore, derive the secondary structure. When only one sequence is available, or when RNAs are not conserved among a sufficiently diverse set of organisms, theoretical prediction models have to be combined with experiments. The related knowledge is based on a set of constraints, the thermodynamic model, and the available experimental data on the molecule.

Comparative analysis of nucleic acid sequences has been widely used for the detection and evaluation of similarities and evolutionary relationships. With RNA molecules, sequence alignments and RNA 2D prediction are intimately related. Comparative analysis is based on the biological paradigm that macromolecules are the product of their historical evolution and that functionally homologous sequences will adopt similar structures.

The sequences are first aligned and then searched for compensatory base pair changes. If, during evolution, a base has been modified in one strand of a potential helix (mutation), then this modification must have been compensated on the complementary strand in order to maintain the structure. The presence of several compensatory changes (two or more) in a potential helix allows one to assert the existence of the helix in the structure. Several secondary structure models have been generated by using comparative analysis: tRNA (13), 16S rRNA (14), and group I and group II self-splicing introns (15). The method requires that the molecules compared are sufficiently different to provide numerous instances of sequence variation with which to test pairing possibilities, but that the molecules do not differ so much that homologous residues cannot be aligned with confidence (16).

In an alignment, the objective is to juxtapose related sequences so that homologous residues in each sequence occupy the same column. Standard alignment programs dedicated to molecular biology can be used for that purpose. More recent programs, dedicated to the alignment of RNA sequences, allow the user to manipulate the proposed alignment interactively (such as DCSE (17) and ALIGNOS (18)). They offer functions dedicated to secondary structures as well as an interactive environment for manipulating the alignment. Other recent programs automatically reconsider the alignment by taking into account new sequences and the pre-existing knowledge of the secondary structure (19). Indeed, with the growing number of sequences, specific RNA databases are being created, and new sequences have to be added quickly to structured databases of homologous RNA molecules. In such databases, it is very desirable that sequences are aligned in accordance with the preservation of secondary structural features. Because, in an alignment, optimal structural elements can be misaligned, the program RNAlign (19) makes it possible to align a group of aligned sequences with a new sequence, using positions of high sequence conservation as a guide for the alignment and the common secondary structure of the group as a guide for determining the secondary structure of the new sequence. Thus, RNAlign does not assume that the related sequences are correctly aligned but instead reconsiders the alignment.
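The search for compensatory base changes described in this section can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions, not the procedure of any program cited here: the alignment is a toy set of three invented, gap-free sequences, the names `is_compensatory` and `helix_support` are hypothetical, and a column pair is counted only when both columns vary while every sequence keeps a classical (Watson–Crick or G°U) pair.

```python
# Classical pairs: Watson-Crick plus the G-U wobble, in both orientations.
CLASSICAL_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                   ("C", "G"), ("G", "U"), ("U", "G")}


def is_compensatory(alignment, i, j):
    """True if columns i and j both vary while pairing is kept in every sequence."""
    observed = {(seq[i], seq[j]) for seq in alignment}
    always_paired = observed <= CLASSICAL_PAIRS
    left_varies = len({pair[0] for pair in observed}) > 1
    right_varies = len({pair[1] for pair in observed}) > 1
    return always_paired and left_varies and right_varies


def helix_support(alignment, helix):
    """Count the column pairs of a proposed helix that show compensatory changes."""
    return sum(is_compensatory(alignment, i, j) for i, j in helix)


# Toy alignment (invented sequences) with a proposed 3-base-pair helix
# closing a 4-nucleotide loop.
alignment = ["GCAGAAAUGC",
             "GGAGAAAUCC",
             "GCGGAAACGC"]
helix = [(0, 9), (1, 8), (2, 7)]

print(helix_support(alignment, helix))  # 2
```

With two supported column pairs, this toy helix would meet the criterion of two or more compensatory changes quoted above. A change on one strand only, such as G–C to G°U, preserves pairing but is not counted by this deliberately strict test.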

5.3 Computer Modeling

Molecular modeling attempts to assemble the 3D structure of a macromolecule on the basis of a mixture of theoretical and experimental data. Hence, prediction methods range from mathematically oriented approaches, relying solely on computer algorithms, to pragmatic and operational approaches in which insight comes from both theory and experiment. Modeling can be viewed as a heuristic tool which should help in the rationalization of experimental observations but should also suggest new relationships between the various components of the modeled molecule. Modeling relies on an existing body of knowledge and, therefore, necessitates an integration of that accumulated knowledge in a form which can be exploited; hence the importance of organized and annotated data banks. The power of visualizing spatial relations is such that models need not always be detailed. In the end, the validity and the accuracy of the model obtained will depend on the nature of the experimental observations collected. However, a mathematical proof guaranteeing the correctness of the derived model is only possible with crystallographic methods (the Fourier theorem). Otherwise, the best that can be achieved is a network of evidence converging on the spatial contacts and relations embodied by a model.

Because of the strong base-pairing constraints and the Watson–Crick rules of complementarity, modeling has had some success in the field of nucleic acids, starting from the Watson–Crick double helix. In the RNA field, modeling first proposed the structure of a GNRA tetraloop closed by a sheared G•A pair with stacking of the remaining purine bases (20). Also, the relative arrangements of helices in group I introns were correctly assembled in a 3D model (21) long before X-ray crystallography led to supportive evidence (22). Although such models lack correctness in atomic detail, they allow the deduction of new spatial relationships. Those, together with sequence analysis, permit the localization of recurrent motifs serving as RNA–RNA anchors responsible for recognition and stabilization of folding in structured RNAs. In this way, the recognition motif between the GNRA tetraloops and the shallow grooves of helices was correctly identified (21) before being observed in intermolecular crystal packing contacts (23). Similarly, the recognition motif between the GAAA tetraloop and its 11nt-receptor was identified by in vitro experiments using selection methods (24) before being observed by crystallography (25,26).

6 CONCLUSIONS

RNA molecules are often restricted to their role as intermediates between the genomic DNA and the active proteins. This view has changed following the discovery of catalytic RNA by S. Altman and T. Cech (27,28): RNA is nowadays the only molecule known to combine the two properties of being a depository of genomic information and having catalytic potential. Chemical catalysis requires a precise positioning of atoms in space and, therefore, RNA must achieve complex tertiary folds in order to reach transition states. Because RNA molecules carry one negative charge per residue, the compact assembly of large RNAs presents a formidable physical problem. The architecture of large structured RNAs is dominated by helix formation and co-axial stacking of helical stems, with loop–loop interactions and GNRA-tetraloop motifs prevalent as domain anchors (29). Although the recurrent tertiary motifs can be identified by sequence analysis coupled with chemical probing and 3D modeling, the precise and subtle atomic details require X-ray crystallography or NMR techniques in order to be unveiled. The developments in chemical and molecular biology techniques have allowed the circumvention of some of the difficulties in the crystallization of RNA molecules, and one can hope to see a continuous progression in the number of crystal structures of RNA molecules. Similarly, NMR spectroscopy has become a decisive tool for the unravelling of RNA motifs and of complexes between RNA fragments and other ligands, such as antibiotics.

Not every RNA molecule is as highly structured as a self-splicing group I intron. However, it is common to find structured regions in functionally important domains of eukaryotic messenger RNAs, such as the 5′- or 3′-untranslated regions, in eubacterial promoters or Shine–Dalgarno regions, and naturally in control regions of viral genomes. In addition, new structured RNAs are still being discovered, like the tmRNAs or the RNA part of the ribonucleoprotein particle telomerase. Structured RNAs are often part of complexes with proteins, sometimes with a single one, as in RNase P, and sometimes with many, such as the 21 different proteins bound to 16S rRNA in the 30S eubacterial ribosomal particle.

3.

4. 5.

ABBREVIATIONS AND ACRONYMS NMR tRNA 3D

Nuclear Magnetic Resonance Transfer RNA Three-dimensional

6. 7.

8.

RELATED ARTICLES 9.

Biomolecules Analysis (Volume 1) Biomolecules Analysis: Introduction ž Circular Dichroism in Analysis of Biomolecules ž Infrared Spectroscopy of Biological Applications ž Mass Spectrometry in Structural Biology ž Nuclear Magnetic Resonance of Biomolecules ž Raman Spectroscopy in Analysis of Biomolecules ž Single Biomolecule Detection and Characterization Nucleic Acids Structure and Mapping (Volume 6) Nucleic Acids Structure and Mapping: Introduction ž Aptamers ž Mass Spectrometry of Nucleic Acids ž Nuclear Magnetic Resonance and Nucleic Acid Structures ž Nucleic Acid Structural Energetics ž Sequencing Strategies and Tactics in DNA and RNA Analysis ž Structural Analysis of Ribozymes ž X-ray Structures of Nucleic Acids Nuclear Magnetic Resonance and Electron Spin Resonance Spectroscopy (Volume 14) Two-, Three- and Four-dimensional Nuclear Magnetic Resonance of Biomolecules Radiochemical Methods (Volume 14) g-Spectrometry, High-resolution, for Radionuclide Determination


REFERENCES

1. R.T. Batey, R.P. Rambo, J.A. Doudna, ‘Tertiary Motifs in RNA Structure and Folding’, Angew. Chem., Int. Ed. Engl., 38, 2326–2343 (1999).
2. T. Hermann, D. Patel, ‘Stitching Together RNA Tertiary Architecture’, J. Mol. Biol., 294, 829–849 (1999).
3. M. Sundaralingam, ‘Stereochemistry of Nucleic Acids and their Constituents. IV. Allowed and Preferred Conformations of Nucleosides, Nucleoside Mono-, Di-, Tri-, Tetraphosphates, Nucleic Acids and Polynucleotides’, Biopolymers, 7, 821–860 (1969).
4. W. Saenger, Principles of Nucleic Acid Structure, Springer-Verlag, New York, 1984.
5. J.H. Cate, R.L. Hanna, J.A. Doudna, ‘A Magnesium Ion Core at the Heart of a Ribozyme Domain’, Nat. Struct. Biol., 4, 553–558 (1997).
6. A.M. Pyle, ‘Ribozymes: A Distinct Class of Metalloenzymes’, Science, 261, 709–714 (1993).
7. P. Brion, E. Westhof, ‘Hierarchy and Dynamics of RNA Folding’, Annu. Rev. Biophys. Biomol. Struct., 26, 113–137 (1997).
8. N.B. Leontis, E. Westhof, ‘Conserved Geometrical Base Pairing Patterns in RNA’, Quart. Rev. Biophysics, 31, 399–455 (1998).
9. F.H.C. Crick, ‘Codon-anticodon Pairing: The Wobble Hypothesis’, J. Mol. Biol., 19, 548–555 (1966).
10. P. Auffinger, E. Westhof, ‘RNA Base Pair Hydration’, J. Biomol. Struct. Dyn., 16, 693–707 (1998).
11. E. Westhof, F. Michel, ‘Prediction and Experimental Investigation of RNA Secondary and Tertiary Foldings’, in RNA–Protein Interactions, eds. K. Nagai, I.W. Mattaj, IRL Press, Oxford, 25–51, 1994.
12. B. Masquida, E. Westhof, ‘Crystallographic Structures of RNA Oligoribonucleotides and Ribozymes’, in Oxford Handbook of Nucleic Acid Structure, ed. S. Neidle, London, 533–565, 1998.
13. R.W. Holley, J. Apgar, G.A. Everett, J.T. Madison, M. Marquisee, S.H. Merrill, J.R. Penswick, A. Zamir, ‘Structure of a Ribonucleic Acid’, Science, 147, 1462–1465 (1965).
14. C.R. Woese, L.J. Magrum, R. Gupta, R.B. Siegel, D.A. Stahl, ‘Secondary Structure Model for Bacterial 16S Ribosomal RNA: Phylogenetic, Enzymatic and Chemical Evidence’, Nucleic Acids Res., 8, 2275–2293 (1980).
15. F. Michel, A. Jacquier, B. Dujon, ‘Comparison of Fungal Mitochondrial Introns Reveals Extensive Homologies in RNA Secondary Structures’, Biochimie, 64, 867–881 (1982).
16. F. Michel, M. Costa, ‘Inferring RNA Structure by Phylogenetic and Genetic Analysis’, in RNA Structure and Function, eds. R.W. Simons, M. Grunberg-Manago, Cold Spring Harbor Laboratory Press, 175–202, 1996.
17. P. DeRijk, R. DeWachter, ‘DCSE, an Interactive Tool for Sequence Alignment and Secondary Structure Research’, CABIOS, 9, 735–740 (1993).
18. H. Neurath, J. Wolters, ‘The Alignment Editor ALIGNOS – Integrating Tool for a Cooperative Databanks for Structural RNAs’, Bioinformatics, 1, 22–25 (1992).
19. F. Corpet, B. Michot, ‘RNAlign Program: Alignment of RNA Sequences Using both Primary and Secondary Structures’, Comput. Appl. Biosci., 10, 389–395 (1994).
20. E. Westhof, P. Romby, P. Romaniuk, J.-P. Ebel, C. Ehresmann, B. Ehresmann, ‘Computer Modeling from Solution Data of Spinach Chloroplast and of Xenopus laevis Somatic and Oocyte 5S rRNAs’, J. Mol. Biol., 207, 417–431 (1989).
21. F. Michel, E. Westhof, ‘Modelling of the Three-dimensional Architecture of Group I Catalytic Introns based on Comparative Sequence Analysis’, J. Mol. Biol., 216, 585–610 (1990).
22. B.L. Golden, A.R. Gooding, E.R. Podell, T.R. Cech, ‘A Preorganized Active Site in the Crystal Structure of the Tetrahymena Ribozyme’, Science, 282, 259–264 (1998).
23. H.W. Pley, K.M. Flaherty, D.B. McKay, ‘Model for a Tertiary Interaction from the Structure of an Intermolecular Complex Between a GAAA Tetraloop and an RNA Helix’, Nature, 372, 111–113 (1994).
24. M. Costa, F. Michel, ‘Frequent Use of the Same Tertiary Motif by Self-folding RNAs’, EMBO J., 14, 1276–1285 (1995).
25. J.H. Cate, A.R. Gooding, E. Podell, K.H. Zhou, B.L. Golden, C.E. Kundrot, T.R. Cech, J.A. Doudna, ‘Crystal Structure of a Group I Ribozyme Domain – Principles of RNA Packing’, Science, 273, 1678–1685 (1996).
26. J.H. Cate, A.R. Gooding, E. Podell, K. Zhou, B.L. Golden, A.A. Szewczak, C.E. Kundrot, T.R. Cech, J.A. Doudna, ‘RNA Tertiary Structure Mediation by Adenosine Platforms’, Science, 273, 1696–1699 (1996).
27. K. Kruger, P.J. Grabowski, A.J. Zaug, J. Sands, D.E. Gottschling, T.R. Cech, ‘Self-splicing RNA: Autoexcision and Autocyclization of the Ribosomal RNA Intervening Sequence of Tetrahymena’, Cell, 31, 147–157 (1982).
28. C. Guerrier-Takada, K. Gardiner, T. Marsh, N. Pace, S. Altman, ‘The RNA Moiety of Ribonuclease P is the Catalytic Subunit of the Enzyme’, Cell, 35, 849–857 (1983).
29. V. Lehnert, L. Jaeger, F. Michel, E. Westhof, ‘New Loop–Loop Interactions in Self-splicing Introns of Subgroup IC and ID: A Complete 3D Model of the Tetrahymena thermophila Ribozyme’, Chem. Biol., 3, 993–1009 (1996).