Nucleic Acids Research, 1993, Vol. 21, No. 15 3587-3588
Nucleotide sequence of cytochrome oxidase (subunit III) from the mitochondrion of the tunicate Pyura stolonifera: evidence that AGR encodes glycine
G.A.Durrheim, V.A.Corfield, E.H.Harley¹ and M.H.Ricketts*
US/MRC Centre for Molecular and Cellular Biology, PO Box 19063, Tygerberg 7505 and ¹Department of Chemical Pathology, University of Cape Town Medical School, Observatory 7925, South Africa
Received April 30, 1993; Revised and Accepted June 7, 1993
Tunicates are filter-feeding marine organisms which are classed as protochordates because of chordate characteristics in the free-swimming larva (1). Nucleic acid sequence information could provide greater insight into the relationship of tunicates to the vertebrate lineage. The mitochondrial gene encoding cytochrome oxidase subunit III (CO III) of the tunicate Pyura stolonifera was cloned on two overlapping fragments (a 3 kb BamHI fragment and a 2.3 kb EcoRI/XhoI fragment) of mitochondrial DNA and identified by homology to previously reported CO III sequences of other species. The sequence is 52% and 51% identical to human CO III at the nucleotide and amino-acid levels, respectively.
Codon usage in mitochondria deviates from the 'universal' genetic code of nuclear genes. In particular, AGA and AGG (AGR), which code for arginine in nuclear genes, are known to code for serine in non-vertebrate metazoans (2-4), and, when present, are termination codons in vertebrates (5-7). The deduced amino acid sequence of P.stolonifera CO III was aligned with that of 13 other species, representing a wide range of taxa (Figure 1). From this alignment, a high degree of conservation of glycine at positions corresponding to AGR codons of P.stolonifera is apparent. At the sites corresponding to the 12 AGR codons, pooled over the 13 other species shown, 61.5% are glycine, 14.7% are alanine and only 2.6% are serine. Furthermore, the sites with complete conservation of glycine tend to be within areas of conserved residues in the alignment, indicating domains with an important functional role in the CO III proteins. This is particularly apparent at positions 82, 141, 202 and 234, while, in contrast, positions 154 and 159, where glycine is poorly conserved, fall within an area of very low sequence conservation. This presents strong evidence that AGR codons in tunicate mitochondria specify glycine in the encoded protein.
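The pooled residue tally behind these percentages can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' procedure: the function name `residue_percentages`, the toy alignment and the column indices are all hypothetical. The real input would be the Figure 1 alignment and the 12 columns that align with Pyura AGR codons (12 columns across 13 species gives at most 156 pooled sites).

```python
from collections import Counter


def residue_percentages(aligned, columns):
    """Percentage of each residue over the given alignment columns,
    pooled across all sequences in `aligned` (name -> aligned string)."""
    counts = Counter(seq[c] for seq in aligned.values() for c in columns)
    total = sum(counts.values())
    return {res: round(100.0 * n / total, 1) for res, n in counts.items()}


# Hypothetical toy alignment (NOT the Figure 1 data): three non-tunicate
# sequences, with columns 1 and 3 standing in for AGR-aligned sites.
toy_alignment = {"sp1": "MGAG", "sp2": "MGSG", "sp3": "MAAG"}
agr_columns = [1, 3]

percentages = residue_percentages(toy_alignment, agr_columns)
print(percentages)
```

Pooling over species and columns matches how the percentages are quoted in the text; a per-column tally would instead show which individual sites are fully conserved.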
GenBank accession no. Z17208
*To whom correspondence should be addressed at: Department of Surgery, UMDNJ-Robert Wood Johnson Medical School, 675 Hoes Lane, Piscataway, NJ 08854, USA
A similar observation, based on a partial gene sequence of cytochrome oxidase (subunit I) from the mitochondrion of the tunicate Halocynthia roretzi, was recently reported (8). We have investigated the possibility that the conservation of glycines at sites corresponding to AGR codons of tunicates is due to RNA editing by looking at the occurrence of adenines in P.stolonifera CO III at positions aligning with conserved guanines of other species (other than at AGR codons). At the 79 aligned positions where guanine is more than 69% conserved in the 13 other species, the corresponding nucleotide in P.stolonifera is G at 68 positions, T at 5 positions, and A and C at 3 positions each. There are only 8 positions where conservation of guanine is greater than 50% in the non-tunicates and adenine is present in the tunicate CO III sequence; at none of these positions would A to G editing result in a better conserved amino acid in the tunicate protein. The presence of AGR codons at sites of conserved glycines can therefore not be ascribed to RNA editing. The novel codon usage in the mitochondrion of this unique taxon suggests that its phylogenetic affinities may need to be carefully reappraised.
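The RNA-editing check described above amounts to a column-wise tally: find the aligned nucleotide positions where guanine is conserved in the other species, then count which base the tunicate carries there. The Python sketch below illustrates that logic under assumptions of my own: the function name, the toy sequences and the handling of the threshold as a strict "greater than" are hypothetical, and AGR codon positions are assumed to have been excluded beforehand. Only the `reported` counts come from the text.

```python
from collections import Counter


def tunicate_bases_at_conserved_g(tunicate, others, threshold=0.69):
    """Tally the tunicate base at every aligned position where the
    fraction of G among the other sequences exceeds `threshold`."""
    tally = Counter()
    for i, base in enumerate(tunicate):
        column = [seq[i] for seq in others]
        if column.count("G") / len(column) > threshold:
            tally[base] += 1
    return tally


# Hypothetical toy data (NOT the CO III sequences).
toy_tunicate = "GAGT"
toy_others = ["GGGA", "GGGT", "GGCT"]
toy_tally = tunicate_bases_at_conserved_g(toy_tunicate, toy_others)
print(dict(toy_tally))

# Counts reported in the text for the 79 conserved-G positions.
reported = {"G": 68, "T": 5, "A": 3, "C": 3}
print(sum(reported.values()))
```

An excess of A at conserved-G positions would be the signature expected of A to G editing; the reported tally shows A at only 3 of the 79 positions.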
REFERENCES
1. Arms,K. and Camp,P.S. (1987) Biology. Saunders College Publishing, Philadelphia.
2. de Bruijn,M.H.L. (1983) Nature 304, 234-241.
3. Clary,D.O. and Wolstenholme,D.R. (1985) J. Mol. Evol. 22, 252-271.
4. Okimoto,R., McFarlane,J.L., Clary,D.O. and Wolstenholme,D.R. (1992) Genetics 130, 471-498.
5. Anderson,S., Bankier,A.T., Barrell,B.G., de Bruijn,M.H.L., Coulson,A.R., Drouin,J., Eperon,I.C., Nierlich,D.P., Roe,B.A., Sanger,F., Schreier,P.H., Smith,A.J.H., Staden,R. and Young,I.G. (1981) Nature 290, 457-465.
6. Anderson,S., de Bruijn,M.H.L., Coulson,A.R., Eperon,I.C., Sanger,F. and Young,I.G. (1982) J. Mol. Biol. 156, 683-717.
7. Roe,B.A., Ma,D.-P., Wilson,R.K. and Wong,J.F.-H. (1985) J. Biol. Chem. 260, 9759-9774.
8. Yokobori,S., Ueda,T. and Watanabe,K. (1993) J. Mol. Evol. 36, 1-8.
[Figure 1 alignment panel. Rows, in order: Pyura, Human, Finwhale, Chicken, Toad, Carp, Sea urchin, Drosophila, Ascaris, C.elegans, Neurospora, Yeast, Oenothera, Maize. Pyura AGA and AGG codons, and a TAG codon, are labelled above the columns where they occur. The residue rows did not survive text extraction.]
Figure 1. Alignment of CO III protein sequences from a range of eukaryotic taxa. Alignments are presented from the initiating methionine of vertebrate and echinoderm CO III sequences to the carboxyl termini of the proteins. Dots indicate identity to the reference (Pyura) sequence. AGA and AGG codons in the Pyura sequence are translated in the vertebrate manner (* = termination), and the Pyura codon at these positions is indicated directly above. The numbers refer to the Pyura sequence. Amino acid sequences of the proteins were aligned using the DAPSA program (E.H.Harley, University of Cape Town), which uses an algorithm to maximize the score of aligned amino acid sequences, taking into account conservative substitutions, and with penalties for the number and length of gaps. Multiple alignments at various stringencies were performed, with manual adjustments to maximize the alignment according to the criteria of the algorithm. The scientific names of the species shown, with the GenBank or EMBL accession number of each in parentheses, are as follows: Pyura, Pyura stolonifera (Z17208); Human, Homo sapiens (J01415); Finwhale, Balaenoptera physalus (X61145); Chicken, Gallus gallus (X52392); Toad, Xenopus laevis (M10217); Carp, Cyprinus carpio (X52392); Sea urchin, Strongylocentrotus purpuratus (X12631); Drosophila, Drosophila yakuba (X03240); Ascaris, Ascaris suum (X54253); C.elegans, Caenorhabditis elegans (X54252); Neurospora, Neurospora crassa (V00668); Yeast, Schizosaccharomyces pombe (X16868); Oenothera, Oenothera berteriana (X04764); Maize, Zea mays (X12728).
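The caption's convention of reading Pyura AGA and AGG "in the vertebrate manner" can be made concrete with a short Python sketch that translates the same toy reading frame under each interpretation of AGR discussed in the text. The sequence, the function name and the reduced codon table are hypothetical and serve only as illustration; the glycine reading is the one the paper argues for, labelled here as proposed.

```python
# How AGA/AGG (AGR) is read under each code discussed in the text.
AGR_MEANING = {
    "universal": "R",               # arginine in nuclear genes
    "invertebrate_mito": "S",       # serine in non-vertebrate metazoans
    "vertebrate_mito": "*",         # termination
    "tunicate_mito_proposed": "G",  # glycine, the reading argued for here
}

# Only the codons needed for the toy example; NOT a complete code table.
TOY_CODONS = {"ATG": "M", "TTT": "F", "GGT": "G", "TGG": "W"}


def translate(dna, code):
    """Translate `dna` codon by codon, reading AGR according to `code`.
    Codons missing from the toy table are shown as 'X'."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3].upper()
        if codon in ("AGA", "AGG"):
            protein.append(AGR_MEANING[code])
        else:
            protein.append(TOY_CODONS.get(codon, "X"))
    return "".join(protein)


# Hypothetical reading frame: ATG TTT AGA GGT AGG TGG
toy_orf = "ATGTTTAGAGGTAGGTGG"
for code in AGR_MEANING:
    print(code, translate(toy_orf, code))
```

Under the vertebrate reading the AGR positions appear as `*`, which is how they are displayed in the Pyura row of the figure; under the proposed tunicate reading the same positions become glycine.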