Protein Engineering vol.2 no.3 pp.239-246, 1988
Selenocysteine's mechanism of incorporation and evolution revealed in cDNAs of three glutathione peroxidases
Guy T.Mullenbach1, Azita Tabrizi, Bruce D.Irvine, Graeme I.Bell2, John A.Tainer1 and Robert A.Hallewell
Chiron Research Laboratories, Chiron Corporation, 4560 Horton Street, Emeryville, CA 94608 and 1Department of Molecular Biology, Research Institute of Scripps Clinic, La Jolla, CA 92037, USA
2Present address: Howard Hughes Medical Institute, University of Chicago, Chicago, IL 60637, USA
'To whom correspondence should be addressed
The nonsense codon, UGA, has recently been shown for the first time to encode selenocysteine in two proteins, mouse glutathione peroxidase (GSH-Px) (EC 1.11.1.9) and bacterial formate dehydrogenase. A co-translational rather than post-translational selenium-incorporation mechanism has been implicated. Furthermore, high expression levels of GSH-Px have suggested that suppression of termination is efficient and specific. We have isolated and characterized pituitary, kidney and placenta cDNAs for bovine, human and mouse GSH-Px respectively. It is demonstrated that this novel suppression event occurs in diverse tissues, in at least three mammalian species and at the translational step. Surprisingly, GSH-Px is shown to be extramitochondrially encoded, indicating a cytosolic suppression event rather than one utilizing the mitochondria's well-documented extended codon-reading ability. Sequence analysis reveals that a simple proximal contextual pattern responsible for readthrough does not exist. Analysis of predicted secondary structures of mRNAs, however, has revealed a conformation which may be unique to selenocysteine proteins and may prove useful as a tool for artificial incorporation of selenocysteines. A human intron for GSH-Px from an unspliced mRNA has been isolated whose position indicates an ancient, divergent evolutionary relationship with thioredoxin-S2, rather than an independent convergent one.
Key words: codon usage/glutathione peroxidase/isomorphous replacement/selenocysteine/suppression
Introduction
Because the selenocysteine residue (-CH2SeH) possesses a redox potential and various other chemical properties which differ from its analogs, i.e. cysteine (-CH2SH) and serine (-CH2OH), its artificial conservative placement in selected enzymes and redox proteins should provide a technique for elucidating molecular mechanisms and possibly for extending or improving a protein's function.
A selenium atom, so introduced, might also provide a highly conserved, isomorphic reference atom for X-ray crystallographic analysis. A better understanding of this residue's mechanism of incorporation into glutathione peroxidase should offer an essential first step in achieving these replacements.
Glutathione peroxidase (GSH-Px) (EC 1.11.1.9) is the most extensively characterized and thoroughly reviewed of the selenoproteins (Chiu et al., 1982; Flohe, 1982; Mannervik, 1985). A partial amino acid sequence determination of rat GSH-Px revealed an active site selenocysteine residue within the polypeptide chain (Zakowski et al., 1978) which was later also demonstrated in the bovine enzyme by X-ray diffraction analysis (Ladenstein et al., 1979; Epp et al., 1983) and by complete sequencing of the 198 amino acid protein (Gunzler et al., 1984). Interestingly, it has recently been revealed from the genomic sequence of mouse GSH-Px (Chambers et al., 1986) that the active site selenocysteine residue is encoded by the opal nonsense codon, UGA, which usually signals chain termination. Similarly, UGA has recently been shown to encode this residue in formate dehydrogenase from Escherichia coli (Zinoni et al., 1986). Surprisingly, evidence is accumulating which indicates that the selenium atom is incorporated into the protein in a co-translational process rather than in a post-translational modification step (Hawke and Tappel, 1983), thus suggesting a 21st amino acid. These are the only selenium-containing proteins whose genes have been so characterized.
© IRL Press Limited, Oxford, England
GSH-Px is an enzyme of mammals and birds which protects against the damaging effects of various endogenously formed hydroperoxides and hydrogen peroxide as follows:
H2O2 + 2 GSH → 2 H2O + GSSG
and
ROOH + 2 GSH → GSSG + ROH + H2O
where ROOH represents lipid hydroperoxides, membrane-associated phospholipid hydroperoxides (Ursini et al., 1985), peroxidized DNA and various alkyl and aryl hydroperoxides. GSH-Px is found at high levels in the liver and at moderate levels in heart, lung and brain (see review, Halliwell and Gutteridge, 1985). At the intracellular level GSH-Px is detected in both cytosol and mitochondria (Chiu et al., 1976; Timcenko-Youssef et al., 1985), the resident ratio apparently varying with cellular function.
In this communication we have confirmed the existence of this unique coding phenomenon and have analyzed the role played by bases flanking and more distant from the suppressed UGA.
In order to determine those invariant flanking regions exhibited among different species, we have isolated and characterized cDNA sequences from human, bovine and mouse. In this work cDNAs have been isolated from several tissues and species to investigate the universality of this coding phenomenon. In an effort to understand better the evolution of selenocysteine's occurrence and function, we have also considered features of the predicted protein sequences.
Materials and methods
Strains
Plasmids were propagated in E.coli strain HB101. Bacteriophage λgt10 was propagated in E.coli strain C600hfl.
DNA manipulations
Standard DNA manipulations were carried out as described elsewhere (Maniatis et al., 1983). Both strands of all cDNAs were sequenced by the dideoxynucleotide chain termination method in bacteriophage M13 (Sanger et al., 1977) and 7-deazaguanosine was utilized when required to remove band compressions sometimes observed in GC-rich regions (Barr et al., 1986).
DNA synthesis
Automated oligomer synthesis on a glass support was performed on a Gene-O-Matic DNA synthesizer (Warner et al., 1984) wherein nucleoside N,N-diisopropyl phosphoramidites were utilized.
cDNA isolations
Double-stranded cDNA was prepared essentially as described (Gubler and Hoffman, 1983) using mouse (BALB/c) placenta, bovine pituitary and adult human kidney poly(A)+ RNAs. After methylation of internal EcoRI sites and the addition of EcoRI linkers, the cDNA was ligated into the EcoRI site of λgt10 (Huynh et al., 1985). The phage were packaged and recombinants selected by plating on E.coli strain BNN102. Bovine-derived viral DNA lifted onto duplicate nitrocellulose filters (0.45 µm pore size) was probed (Maniatis et al., 1983) with the following 32P-terminally-labeled synthetic oligomers: 5'-GCCTTCTCCCCATTCACCTCACACTTCTCAAACAGCATAAAGTTAGGCTCAAA-3' (53-mer) and 5'-ACGTTGGTCAAACCGGTGGTCCITTTGCGGTTTTTGCTTCTTTA-3' (44-mer). These were designed from the bovine amino acid sequence so as to anneal to two regions of the coding strand which possess low degeneracy in their code, namely regions coding amino acids 108-125 and 81-95 respectively as numbered (Figure 3). Where degeneracies were encountered, the following prioritized principles were exercised in order to select a nucleotide: (i) consistency with the codon usage bias observed in bovine DNA was maintained, (ii) in cases of potential mismatches which might arise, GT pairs rather than other mismatches were favored and (iii) the sequence 5'-CG-3', seldom observed in the coding strand of eukaryotes, was also presumed here to occur infrequently, and was thus avoided.
Of ~5300 plaques screened, 10 of the ~24 which gave strong signals with both probes after washing (4 x SSC, 0.1% SDS, 50°C, 60 min) were replated for plaque purification and reprobed under these conditions.
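The codon-selection principles used for probe design can be illustrated with a short Python sketch. It is a simplified reading and not the authors' procedure: the `PREFERRED` table holds hypothetical preference ranks rather than measured bovine codon frequencies, the function names are invented for illustration, the CG rule is applied as a filter on each candidate codon, and the G·T mismatch rule (ii) is not modelled.

```python
# Hypothetical preference ranks (most preferred first).
# These are placeholders, NOT measured bovine codon-usage data.
PREFERRED = {
    "F": ["TTC", "TTT"],
    "K": ["AAG", "AAA"],
    "N": ["AAC", "AAT"],
    "R": ["CGC", "AGG", "AGA"],
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def back_translate(peptide, preferred=PREFERRED):
    """Choose one codon per residue for the coding strand.

    For each residue, take the best-ranked codon that does not create
    a 5'-CG-3' dinucleotide, either inside the codon or across the
    junction with the previously chosen codon. If every candidate
    creates CG, fall back to the top-ranked codon.
    """
    coding = ""
    for residue in peptide:
        candidates = preferred[residue]
        choice = candidates[0]
        for codon in candidates:
            if "CG" not in coding[-1:] + codon:
                choice = codon
                break
        coding += choice
    return coding


def reverse_complement(seq):
    """Return the antisense strand, i.e. the probe that anneals to the coding strand."""
    return seq.translate(COMPLEMENT)[::-1]
```

With this toy table, `back_translate("FR")` skips the top-ranked arginine codon CGC because it contains CG and selects AGG instead, and `reverse_complement(back_translate("FKN"))` gives the corresponding antisense probe segment.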
EcoRI treatment of DNA prepared from putative clones yielded an 830-bp insert which was sequenced and also cloned into the EcoRI site of pBR322 as an additional source of cDNA. A radiolabeled probe prepared from this fragment by nick translation (Maniatis et al., 1983) was utilized as above to probe a mouse placenta library and a human kidney library prepared in λgt10. After washing the filters (0.1 x SSC, 0.1% SDS, 25°C, 60 min), 10 putative clones from each species were plaque purified. EcoRI fragments from viral DNA preparations were then subcloned into pBR322 as above, re-excised with EcoRI and sequenced on both strands.
Northern blot hybridizations
RNA blot analysis was carried out as described (Thomas, 1983). Ten micrograms of poly(A)+ RNAs were denatured with glyoxal and, after electrophoresis through a 1.0% agarose gel, transferred to a nitrocellulose filter. They were then probed with their respective nick-translated GSH-Px cDNAs: 830 bp (bovine, full length) and ~550 bp (mouse, 3' fragment). 32P-labeled and glyoxal-denatured fragments of a HindIII digest of λ DNA were included as size standards.
Results
A cDNA library in λgt10 carrying EcoRI inserts derived from bovine pituitary was screened with two synthetic DNA probes (see Materials and methods). Approximately 24 of 5300 plaques hybridized with both probes. Of these, the clone which possessed the largest insert was sequenced as shown in Figure 1. Human kidney and mouse placenta GSH-Px cDNA clones were similarly isolated from λgt10 cDNA libraries using the full-length bovine cDNA as a probe. DNA prepared from these plaques, when treated with EcoRI, yielded fragments of ~550 bp and ~280 bp for human and ~550 bp and ~300 bp for mouse. Also, a 560-bp fragment isolated from a third human λgt10 clone was found to contain a 280-bp intron:
5'-GTGCGCCGGGCGGAGCGGGGCGGGGCGGGGGCGG ACGTGCAGTAGTGGCTGGGGGCGCCGGCGGTGTGCTG GTGGGTGCCGTCGGCTCCATGCGCGGAGAGTCTGGCT ACTCTCTCGTTTCCTTTCTGTTGCTCGTAGCTGCTGAA ATTCCTCTCCGCCCTTGGGATTGCGCATGGAGGGCAA AATCCCGGTGACTCATAGAAAATCTCCCTTGTTTGTG GTTAGAACGTTTCTCTCCTCCTCTTGACCCCGGGTTC TAGCTGCCCTTCTCTCCTGTAG-3'
Nucleotide sequences derived from both strands of the cDNAs were obtained, although we have not sequenced across the internal EcoRI site residing in human and mouse cDNAs. It may be noted, however, that homology with the bovine sequence is high and contiguous across these restriction sites and their flanking regions, both in nucleotide and amino acid sequence (Figures 1 and 3), suggesting continuity through the restriction site. Our mouse cDNA sequence is in agreement with recent genomic work (Chambers et al., 1986).
The GC content of these cDNAs is high, i.e. bovine, mouse and human cDNAs possess 59, 62 and 62% GC pairs respectively. In those codons where G or C can wobble with an A or T in the third position, G or C is favored in that position in 82, 73 and 81% of cases respectively. Within the coding region ~83% nucleic acid homology is observed between all three species although bovine carries a 15-bp insert (Figure 1). The 3'-untranslated region reflects more variation: 71% (bovine/human), 69% (mouse/human), 67% (bovine/mouse). Homology in the 5'-untranslated region of bovine and mouse is ~35%. The mouse and human cDNAs possess a canonical polyadenylation signal (AATAAA) at positions 814 and 799 respectively.
Our bovine cDNA clone, however, is presumably somewhat too short to reveal this feature.
In order to deduce the cellular location(s) of the GSH-Px gene, a search of the complete bovine and human mitochondrial genome libraries was performed for sequences homologous to the bovine and human GSH-Px cDNAs. No appreciable homology was revealed, however, indicating that this mammalian protein is encoded only within the nucleus. Surprisingly, then, the translation process is a cytosolic one.
Northern blot analysis of bovine and mouse GSH-Px indicates full-length mRNAs of ~1060 bases and ~1020 bases respectively (data not shown). Our bovine and mouse cDNA sequences, which reveal just a portion of a poly(A) tail, thus correspond to ~80% of these full-length messages. No multiplicity of bands in these Northern blots was apparent. Should related transcripts be present, they must be of low abundance or of similar length.
Interestingly, we find a single opal stop codon (UGA) in frame within the coding region of the bovine and human cDNAs and confirm such a codon in mouse, as highlighted in Figure 1. In bovine GSH-Px, whose amino acid sequence has been established by chemical methods, this UGA codes the active site selenocysteine residue. In an effort to understand the role codon context plays in the anomalous reading of this UGA, those proximal flanking bases conserved in each species have been compared with genes wherein UGA is utilized as a 'true' termination signal
[Figure 1: nucleotide sequences of the bovine, mouse and human GSH-Px cDNAs, aligned; labels in the figure mark Met 1, SeCys and IVS.]
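The sequence statistics reported in the Results (overall GC content, G/C preference in the third codon position, and the presence of an in-frame UGA) can be computed along the following lines. This is a minimal sketch under stated assumptions: the function names are invented for illustration, `TOY_CDS` is a made-up sequence and not one of the GSH-Px cDNAs, and `gc3_fraction` counts every codon, whereas the paper restricts the count to codons in which the third-position choice is synonymous.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA equivalents of UAA, UAG, UGA


def codons(cds):
    """Split a coding sequence into complete codons, reading from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C (all codons counted)."""
    triplets = codons(cds)
    return sum(codon[2] in "GC" for codon in triplets) / len(triplets)


def internal_stops(cds):
    """List (codon number, codon) for stop codons before the final codon.

    Codon numbers are 1-based, so the initiating ATG is codon 1.
    """
    triplets = codons(cds)
    return [
        (number, codon)
        for number, codon in enumerate(triplets[:-1], start=1)
        if codon in STOP_CODONS
    ]


# Made-up example: ATG GCC TGA AAG TAA (an internal TGA, then a terminal TAA).
TOY_CDS = "ATGGCCTGAAAGTAA"
```

For the toy sequence, `internal_stops(TOY_CDS)` reports the TGA at codon 3 and ignores the terminal TAA; applied to a GSH-Px reading frame, a single reported TGA would correspond to the selenocysteine codon described in the text.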