Nucleic Acids Research, 1993, Vol. 21, No. 7 1667

Nucleotide sequence and secondary structure of the chloroplast group I intron Cr.psbA-2: novel features of this self-splicing ribozyme

Yijia Bao and David L. Herrin*

Botany Department, University of Texas at Austin, Austin, TX 78713, USA

Received February 22, 1993; Accepted March 5, 1993

The chloroplast psbA gene in Chlamydomonas reinhardtii contains self-splicing introns (1). Intron-2 was particularly efficient at self-splicing (1). Thus it was of interest to determine its sequence. Cr.psbA-2 shows two unique features: (1) two free-standing open reading frames (ORFs), and (2) a stem-loop (and additional sequences) between helices P8 and P7.

The sequence of Cr.psbA-2 was determined from two independent clones, pGEMR14.2 (1), and P-66 (Chlamydomonas Genetics Center, Duke University). The intron is 1410 nt, and is A/T rich (~67%). The ORFs are located at residues 206 to 706, and 949 to 1098, respectively (Figure 1A). ORF-1 potentially encodes a protein of 167 amino acids (18.6 kDa). The P1/P2 peptides, which occur in some group I intron ORFs (2), are not present in ORF-1. However, there is a perfect Shine-Dalgarno sequence 7 nt 5' to the start codon (see Figure 1B). A search of databases revealed no sequence similarity with other group I ORFs, nor any other proteins. ORF-2 is 50 amino acids. However, chloroplast genes can be very small (~4 kDa) and some lack a Shine-Dalgarno sequence. There is little similarity of ORF-1 with other known proteins.

Figure 1B shows a proposed secondary structure. Cr.psbA-2 contains P1-P10 helices similar to other group I introns, and both ORFs are in loop 6, out of the core structure. Although Cr.psbA-2 contains large peripheral structures (e.g. P5a, b, c), the essential helices P3, P4, P5, P6, P7, P8, and P9 are small and give a compact structure to the ribozyme core. Figure 1B (inset) shows how the 5'- and 3'-splice sites can be aligned by the internal guide sequence (IGS) (2).

We have attempted to classify Cr.psbA-2 (3). The presence of the P5 extension (P5a, P5b and P5c) is characteristic of group IC introns. However, Cr.psbA-2 also contains P7.1 (Figure 1B), which is typical among subgroup IA introns. Thus, Cr.psbA-2 appears to be an intermediate between the IC and IA subgroups. Alternatively, because of the additional sequences and stem-loop (P8.1) between P8 and P7, which have not been observed previously, Cr.psbA-2 may represent the first case of a new subclass of group I introns.
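The ORF sizes quoted in the text can be cross-checked against the stated coordinates with simple interval arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes the coordinates are 1-based and inclusive, and it uses a rule-of-thumb average of about 110 Da per residue, which is an assumption introduced here rather than a value from the paper. All function and variable names are hypothetical.

```python
# Values quoted in the text; coordinates are treated as 1-based and inclusive
# (an assumption, since the text does not state the convention).
INTRON_LENGTH_NT = 1410
ORF_COORDS = {"ORF-1": (206, 706), "ORF-2": (949, 1098)}

# Assumption (not from the paper): rule-of-thumb average residue mass in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the interval; rejects out-of-frame lengths."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"{length} nt is not a multiple of 3")
    return length // 3


def approx_mass_kda(residues: int) -> float:
    """Rough protein mass from a residue count, using the assumed average."""
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


def summarize() -> dict:
    """Per-ORF length, codon count and rough mass estimate."""
    summary = {}
    for name, (start, end) in ORF_COORDS.items():
        # Every ORF must lie inside the intron.
        if not (1 <= start <= end <= INTRON_LENGTH_NT):
            raise ValueError(f"{name} lies outside the intron")
        codons = codon_count(start, end)
        summary[name] = {
            "length_nt": span_length(start, end),
            "codons": codons,
            "approx_mass_kda": approx_mass_kda(codons),
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize().items():
        print(name, info)
```

Under these assumptions, ORF-1 spans 501 nt (167 codons) and ORF-2 spans 150 nt (50 codons), which agrees with the 167 and 50 amino acids reported. The rough mass estimate for ORF-1 is about 18.4 kDa, close to the 18.6 kDa given in the text. The text does not say whether the quoted ranges include the stop codons, so the sketch counts codons only.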

ACKNOWLEDGEMENTS

This research was supported by grants from the NSF (DMB89-05303), USDA (92-37301-7682) and the Welch Foundation (F-1164) to D.L.H.

* To whom correspondence should be addressed

EMBL accession no. Z19597

REFERENCES

1. Herrin, D.L., Bao, Y., Thompson, A.J. and Chen, Y.-F. (1991) Plant Cell 3, 1095-1107.
2. Davies, R.W., Waring, R.B., Ray, J.A., Brown, T.A. and Scazzocchio, C. (1982) Nature 300, 719-724.
3. Michel, F. and Westhof, E. (1990) J. Mol. Biol. 216, 585-610.

[Figure 1 graphic not recoverable from the extracted text. Panel A labels: EXON-2, ORF-1, ORF-2, EXON-3, INTRON-2; scale bar, 200 nt. Panel B labels include helices P1, P5, P5b, P5c, P7.1, P9 and P10, the G-site, and the 5' and 3' ends.]

Figure 1. A. Map of the group I intron Cr.psbA-2. Exons, filled boxes; ORFs, open boxes; intron, solid lines. B. Proposed secondary structure of the Cr.psbA-2 ribozyme. Intron sequences are in upper case letters and exon sequences in lower case letters; the 5' and 3' splice-sites are indicated by arrows. P1 to P10 refer to helices numbered by convention. The base-pairs in boxes are proposed to have at least one base participating in a tertiary interaction (3). Circled residues are 100% (or nearly) conserved among group I introns. The ORFs are located in loop 6 and the guanosine-binding site (G-site) in P7 is indicated. S-D, Shine-Dalgarno sequence. Inset: Arrows indicate 5' and 3' splice-sites; the IGS is boxed.