Isolation and Characterization of cDNA Clones for

0 downloads 0 Views 3MB Size Report
Mar 15, 2016 - a Gigapack Plus kit (Stratagene) using the manufacturer's protocols. The library was put through .... compiled and analyzed using DNA Strider 1.0. Data base ...... Cloning: A Laborato'ry Manual, Cold Spring Harbor Laboratory,.
THEJOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 266, No. 8, Issue of March 15, pp. 4965-4973, 1991

Printed in U.S.A.

Isolation and Characterization ofcDNA Clones for Rat Liver 10-Formyltetrahydrofolate Dehydrogenase* (Received for publication, August 22, 1990)

Robert J. CookSg, R.Stephen LloydST, and Conrad WagnerSII From the $Departmentof Biochemistry and the W e n t e r i nMolecular Toxicology, Vanderbilt University Schoolof Medicine, Nashville, Tennessee 37232 and the )[Veterans Administration Medical Center, Nashville, Tennessee 37212

We have isolated and characterized cDNA clones encoding rat liver cytosol 10-formyltetrahydrofolate dehydrogenase (EC 1.5.1.6). An open reading frame of 2706 base pairs encodes for 902 amino acids of M, 99,015. The deduced amino acid sequence contains exact matches to the NH2-terminal sequence (28 residues) and the sequences of five peptides derived from cyanogen bromide cleavage of the purified protein. The amino acid sequence of 10-formyltetrahydrofolatedehydrogenase has three putative domains. The NH2terminal sequence (residues 1-203) is 24-30% identical to phosphoribosylglycinamide formyltransferase (EC 2.1.2.2) from Bacillus subtilis (30%),Escherichia coli (24%),Drosophila melanogaster (24%),and human hepatoma HepG2(27%).Residues 204-416 show noextensive homologyto any known protein sequence.Sequence 417-900 is 46% (mean) identical to the sequences of a series of aldehyde dehydrogenase (NADP+)(EC 1.2.1.3). Intact 10-formyltetrahydrofolate dehydrogenaseexhibits NADP-dependent aldehyde dehydrogenase activity. The sequence identity to phosphoribosylglycinamide formyltransferase is discussed, and a binding region for 10-formyltetrahydrofolate is proposed.

The enzyme 10-formyltetrahydrofolate dehydrogenase (EC 1.5.1.6) catalyzes theNADP+-dependentoxidation of 10HCO-H,PteGlu’ to H4PteGlu andCOz. The enzyme was first described in detail by Kutzbach and Stokstad(1)in pig liver, and it has subsequently been purified to apparent homogeneity frompig liver (2) and ratliver (3). In eachcase, purified 10-HCO-P4PteGlu dehydrogenase hasalsobeenshownto performanNADP+-independent hydrolysis of 10-HCOH4PteGluto yield H4PteGluandformate, leading to the assumptionthatthis enzyme,like otherfolate-dependent * This work was supported by United States Public HealthService Grants DK 15289 and ES00267 and by the Medical Research Service of the Department of Veterans Affairs. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. T h e nucleotide sequence(s)reported in this paper has been submitted totheGenBankTM/EMBLDataBankwith accession number(s) M59861. § T o whom correspondence should be addressed. Tel.: 615-3226367 or 615-327-4751 (ext. 5474). The abbreviations used are: 10-HCO-H4PteGluand10-HCOTHF, 10-formyltetrahydrofolate; H4PteGlu and THF, 5,6,7,8-tetrahydrofolate; H4PteGlu,, tetrahydrofolate pentaglutanate; HPLC,high pressure liquid chromatography; SDS, sodium dodecyl sulfate; bp, basepair(s);kb, kilobase pair(s);FBP-CI,folate-bindingproteincytosol I; GAR, glycinamide ribonucleotide 5’-phosphoribosylglycinamide).

enzymes, has more than one activity (1-3). Cook and Wagner (4) have purified and described arat liver cytosol folate-binding protein denoted FBP-CI. This protein has a subunit M , of 100,000 andcontainstightly bound H4PteGlu5. No enzymatic function requiring H4PteGlu could be ascribed to this protein.Recently, it was shown by Min et al. (5) that FBP-CI is 10-HCO-H4PteGludehydrogenase, although these authors did not purify the protein to homogeneity. We have since confirmed this observationby purifying FBP-CI from rat liver cytosol to apparent homogeneity and showing concomitant purificationof H4PteGlus rebinding and 10-HCO-H4PteGlu dehydrogenase and hydrolaseactivities (52).* Case etal. (6) recently reported that the dehydrogenase and hydrolase activities reported above (1-3) actually reside in two separate proteins that are compartmentalized. Their experiments using ratliver showed that 10-HCO-H4PteGlu dehydrogenase is presentexclusively in cytosol, whereas the 10HCO-H4PteGlu hydrolase is exclusive to the mitochondria. The original procedure for purification of 10-HCO-H4PteGlu dehydrogenase used liver that had been homogenized under conditions which did not protect the integrity of the mitochondria (1-3), whereas Case et al. (6) used liver that had been homogenized with a Teflon pestle in isotonic buffered mannitol/sucrose/EDTA. Case et al. (6) claim that the dehydrogenase and hydrolase enzymes are so closely related that they will copurify under conditions where mitochondria are damaged by the homogenization procedure. These results are clearly contrary to those of Wagner et al. (52)’ and those of Barlowe and Appling ( 7 ) , who showed clearlythat there is 10HCO-H4PteGludehydrogenase activity in rat liver mitochondria and that this activity is predominantly confined to the mitochondrial matrix. Another issue concerning 10-HCO-H4PteGludehydrogenase is its physiological function. Johlin et al. (8) suggest that it functions in themetabolism of formate, which involves the conversion of formate to 10-HCO-H4PteGluby the action of 10-formyltetrahydrofolate synthetase (EC 6.3.4.3) andthe subsequent oxidation of this product by 10-HCO-H4PteGlu dehydrogenase to C 0 2 and H4PteGlu. In primates, this is the exclusive pathway for metabolizing formate (9), and the investigation of 10-HCO-H,PteGlu dehydrogenase is of prime importanceinunderstandingthemechanism of methanol poisoning in humans (8). Krebs et al. (10) proposed that 10HCO-H,PteGlu dehydrogenase is the major route for the disposal of excess one-carbon units produced in folate-mediated amino acid metabolism. The physiological significance of the hydrolase activity is not clear. It can recycle 10-HCOH4PteGlu back to H4PteGlu, but the formate produced would C. Wagner, W. T. Briggs, and R. J. Cook, manuscript in preparation.

4965

4966

Primary Structure of Rat 10-HCO-H,PteGlu Dehydrogenase

be available for reincorporation into 10-HCO-H4PteGlu in a potentially "futile cycle" (2). With the pig liver enzyme, both dehydrogenase and hydrolase activities have been shown t o occur simultaneously, suggesting two active sites (2). 10-HCO-H4PteGludehydrogenase is an abundant enzyme in rat liver cytosol, where it is present at 0.5-1.0% of the cytosolic protein and binds a substantial proportion of the H4PteGlu5 pool (4). Thisobservationandthefactthat H4PteGlu, theproduct of both thedehydrogenase and hydrolase reactions, is a very potent inhibitor of these reactions with a K,value approximately equal to theK,,, value for the substrate, 10-HCO-H4PteGlu(1-3), suggest that this enzyme may have another as yet, unknown function. As an initial step in resolving the questionsof bifunctionality and physiological function of 10-HCO-H4PteGludehydrogenase, we have isolated and characterized cDNA clones of this enzyme. The translated 902-amino acid open reading frame reveals a protein of three distinct domains. Two of these domains show extensive identity tophosphoribosylglycinamide formyltransferase(EC 2.1.2.2) and aldehydedehydrogenase (NADP+) (EC 1.2.1.3), whereas the third domainshows no homology to any known protein sequence. EXPERIMENTALPROCEDURES

Materials-All common chemicals were of the highest grade available. Restriction enzymes and T, polynucleotide kinase were obtained from New England BioLabs, Inc. HPLC solvents and water were supplied by Burdick & Jackson Laboratories Inc. Trifluoroacetic acid was purchased from Pierce Chemical Co. CNBr was obtained from Fisher. Nitrocellulose and NA45 membrane were obtained from Schleicher & Schull. Custom synthetic oligonucleotides for cDNA library screening and nucleotide sequencing were supplied by Research Genetics (Huntsville, AL). Bacterial Strains-Escherichia coli strains BB4 and XL1-Blue were supplied by Stratagene and were grown and used as described by the supplier. Purification of Rat Liver Cytosol IO-HCO-H4PteGluDehydrogenase-10-HCO-H4PteGlu dehydrogenase was purified as described by Cook and Wagner (4, 11) for FBP-CI, with the exception that dehydrogenase and hydrolase activities were assayed by the method of Kutzbach and Stokstad(l),in addition to themeasurement of binding of [3H]H4PteGI~5. The purification involved the isolation of rat liver cytosol by centrifugation (100,000 X g for 60 min) after gentle homogenization in 0.25 M sucrose, 10 mM potassium phosphate (pH 7.0), 10 mM 2-mercaptoethanol, 1 mM sodium azide, and 0.1 mM phenylmethylsulfonyl fluoride. This was followedby Sephadex G-150 gel chromatography, DEAE-cellulose chromatography, affinity chromatography using 5-HCO-H4PteGlulinked to aminohexyl-Sepharose 4B (4), and finally Mono Q (cation-exchange; Pharmacia LKB Biotechnology Inc.) column chromatography. The protein was judged pure by virtue of the presence of a single band on SDS-polyacrylamide gel electrophoresis, nondenaturing gel electrophoresis, and chromatofocusing. The specific activity of pure dehydrogenase was 0.19 pmol of NADPH min" mg" measured at room temperature (22 "C). Cyanogen Bromide Cleavage of 10-HCO-H4PteGluDehydrogenasePure protein (1 mg) was lyophilized and dissolved in 100 p1 of 70% (v/v) formic acid, Cyanogen bromide (1mg) was added; and thetube was flushed with nitrogen, sealed, and incubated at room temperature for 24 h in the dark.The reaction mixture was evaporated to dryness under a streamof nitrogen, dissolved in 0.1% trifluoroacetic acid, and stored at -20 "C. Purification of CNBr Peptides-Peptides were separated and purified by reverse-phase HPLC using a SynChropak RP-P column (4.1 X 250 mm; SynChrom, Linden, IN) as described by Cook et al. (12). Digests dissolved in 0.1% trifluoroacetic acid were applied to the column, and peptides were eluted either with acetonitri1e:water(80:20, v/v) containing 0.1% trifluoroacetic acid or with isopropyl alcoho1:water (80:20, v/v) containing 0.1% trifluoroacetic acid. Peptides were detected at 214 nm. Peaks were collected and lyophilized. Amino Acid Sequence Analysis-Automated amino acid sequence analysis of CNBr peptides or complete protein was performed on either an Applied Biosystems Model 470A Gas-Phase Sequencer or a Model 475 Pulse-Liquid Sequencer using the manufacturer's proto-

cols. Samples contained a minimum of 25 pmol. Construction of Rat Liver cDNA Library-Rat liver poly(A)+RNA (250-4200 bp) was obtained from Clontech. The cDNA library was generated using a Pharmacia LKB Biotechnology cDNA synthesis kit following the manufacturer's protocols. The cDNA with cohesive EcoRI linkers was ligated into XZAP (Stratagene) andpackaged using a Gigapack Plus kit (Stratagene) using the manufacturer's protocols. The library was put through one round of amplification using BB4 as the host strain according to Stratagene protocol^.^ Isolation of Rat 10-HCO-HJ'teGlu Dehydrogenase cDNA ClonesThe rat liver cDNA library in XZAP was grown in the E. coli host strain BB4 according to Stratagene protocols. The library was screened using a double-lift procedure, where plaques were transferred to nitrocellulose filters (80 mm) in the first lift for 1 min and in the second lift for 5 min. The nitrocellulose filters were processed for hybridization as described by Maniatis et al. (13). The filters were prehybridized for 2 h at 25 "C in 20% formamide, 5 X SSPE (1 X SSPE = 0.15 M NaCI, 10 mM NaH2P04,1.3 mM EDTA (pH 7.4)), 2.5 X Denhardt's solution, 0.1% SDS, and 62.5 pg/ml denatured calf thymus DNA. Positive plaques were identified by hybridization with 5'-32P-labeledbest-guess oligonucleotidesadded at 5 X IO5cpm/ml of hybridization fluid. Hybridization was performed at 25 "C for 16 h. The filters were washed for 15 min at 25 "C in 4 X SSPE. Filters screened with oligonucleotide 5 were then washed in 4 X SSPE at 30 "C for 30min and then at 34 "C in fresh 4 X SSPE. Filtersscreened with oligonucleotide 6 were washed under the same conditions and then at 38 and 42 "C in 2 X SSPE. The filters were monitored at all stages of the washing procedure for background radioactivity. The filters were then exposed for 16 h at -70 "C with intensifying screens. Selected double positive clones were cored into sterile SM media (13) and rescreened through two more rounds until plaque-pure. Southern Analysis-Southern analysis combined with restriction enzyme digestion was used to characterize positive clones. Digested DNA was subjected to agarose gel electrophoresis and then transferred to nitrocellulose bidirectionally using the method of Smith and Summers (14). The filters were hybridized to either 5'-end-labeled oligonucleotideor nick-translated cDNA. The hybridization fluid was as described above, whereas hybridization temperatures and washing conditions are given in the appropriate figure legends. Northern Analysis-Total RNA from rat liver was extracted by the method of Chirgwin et al. (15) and was kindly donated by Dr. K. E. Hill (Division of Gastroenterology, Vanderbilt University School of Medicine). RNA was separated by electrophoresis on 1.25% formaldehyde-agarose gels as described by Maniatis et al. (13) and then transferred to Genescreen Plus using the manufacturer's protocol. Prehyhridization was carried out at 42 'C for 2 h in 50% formamide, 10% dextran sulfate, 1%SDS, 1 M NaCl, 5 X Denhardt's solution, and 100 pg/ml denatured calf thymus DNA. Hybridization was performed for 16 h at 42 "C with denatured nick-translated3.2-kb cDNA insert (5 X lo5 cpm/ml). After hybridization, the filter was washed in 2 X S s c (1 X SSC = 0.15 M NaCl, 15 mM sodium citrate) at room temperature for 5 min and then at 65 "C for 30 min in 2 X SSC 0.1% SDS and was finally rinsed in 0.2 X SSC 0.1% SDS prior to exposure for 2 h at room temperature. Purification of cDNA Fragments from Agarose Gels-DNA fragments were digested with EcoRI and subjected to agarose gel electrophoresis. After visualization of the DNA with ethidium bromide, the gel was cut on either side of the fragment to be isolated. A strip of NA45 membrane, cut to the size of the loading well, wasinserted into the incision ahead of the DNA fragment; and electrophoresis was continued until all of the DNA fragment was bound to themembrane. The NA45 membrane was removedfrom the gel, and the bound DNA was then eluted from the membrane and purified by ethanol precipitation. Production of 32P-LabeledDNA Probes-Purified cDNA fragments (50-500 ng) were nick-translated using a DuPont-New England Nuclear nick translation kit according to the supplied protocol. Labeled fragments were separated from unincorporated [cx-~'P]~CTP by the use of a Sephadex G-50 spin column procedure (13). Oligonucleotides (100-500 ng) were 5'-end-labeled with [-y-32P]ATPusing T4 polynucleotide kinase and purified on a Sephadex G-25 spin column as described by Maniatis et al. (13). DNA Sequencing-Plaque-pure positive clones for 10-HCOH,PteGlu dehydrogenase in XZAPwere excised and recircularized into pBluescript using the helper phage R408 as described by Stra-

+

+

Questions concerning the rat liver cDNA library should be addressed to R. J. C.

P r iSmt raur cy t u r e

of Rat IO-HCO-H,PteGlu Dehydrogenase

tagene protocols. The rescued phagemid was then selected by growth in E. coli XL1-Blue on LB-ampicillin plates. Rescued XL1-Blue colonies were maintained on LB-ampicillin, and large amounts of phagemid DNA were produced fromthese colonies by growth of XL1Blue in liquid culture. Restriction mapping showed the presence of many of the unique Bluescript restriction sites within the cDNA inserts, which eliminated the use of nested deletions as a method of sequencing. All sequences were derived from dideoxy sequencing (16) using [n-'"PIdATP with Sequenase (U.S. Biochemical Corp.) and hase-denatured double-stranded template following the manufacturer's protocol. Both strands were sequenced using specifically synthesized oligonucleotide primers. Sequence Compilation and Analysis-Nucleotide sequences were compiled and analyzed using DNA Strider 1.0. Data base searches and sequence alignments were performed either by FASTA (17) or by FASTDB (copyrighted software product of IntelliGenetics Inc.). Multiple sequence alignments were performed by the NeedlemanWunsch algorithm (18)GENALIGN (acopyrighted software product of IntelliGenetics Inc.; theprogramwas developed by Dr. Hugo Martinez (University of California, San Francisco)). Aldehyde Dehydrogenase Assay-Aldehyde dehydrogenase activity was assayed according to the methodof Lindahl and Evces (19).The substrates (50 mM) used were formaldehyde, acetaldehyde, propionaldehyde, and benzaldehyde. NAD or NADP was present a t a final concentration of 10 mM.

A I 2 3 4 5 6 7 8 9 1 0

4967 B

C

I2345678910 12345678910

FIG. 1. Agarosegel electrophoresis and Southern blot analysis of 10-HCO-H,PteGlu dehydrogenase clones. A, agarose gel electrophoresis of cDNA from clones in pRluescript. DNA (2 pg)wasdigestedwith EcoRI for 16 h at 37 "C and run ona 1% agarose gel. Lanes 1 and 10, DNA standards of X digested with Hind11 and +X174 digested with HaeIII; lane 2, clone 13C; lane 3, clone 15C; lane 4 , clone 17C; lane 5 , clone 5-1C; lane 6, clone 14-1C; lane 7, clone RESULTS 14-2C; lane 8, clone 16C; lane 9, clone 1%. R, bidirectional Southern Amino Acid Sequencing and Best-guess Oligonucleotide De- blot of A probed with nick-translated 3.2-kb insert of clone 15C (see sign-Pure 10-HCO-H,PteGlu dehydrogenase from two dif- "Experimental Procedures"). The filter was prehybridized for 3 h a t ferent preparations was subjected to automated Edman deg- 42 "C andhybridized for 16 h a t 42 "C with 5 X 10"cpm/ml "'P-probe described under "Experimental Procedures." The filter was rinsed radation using both a Gas-PhaseSequencerand a Pulse- as in 4 X SSPE atroom temperature and then washed in 4 X SSPE for LiquidSequencer. The derivedsequence (MKIAVIGQS- 30 min each at 42,50,65, and70 "C and finally rinsed in 0.1 X SSPE LFGQEVYXQLRKEGHEVVG) agrees with published a a t room temperature. The filter was then exposed for 3 h a t room NHp-terminalsequence for10-HCO-H,PteGlu dehydrogenase temperature. C, other Southern blot of A probed with 5'-:"P-oligoa t 30 "C for 3 h and (8),with the exception that the first residue is identified as nucleotide 3. The filterwasprehybridized methionine and that arginine confirmed is at position 20. The hybridizedfor 16 h a t 30 "C with5 X lo5 cpm/ml "'P-probe as under"Experimental Procedures." The filter was then reason for the failure of Johlin et al. (8) to identify theNH,- described rinsed in 4 X SSPE at room temperature and then washed in 4 X terminal methionineresidue is not apparent,except that their SSPE for 30 min a t 35 and 40 "C and finally rinsed in 0.1% SSPE a t purification procedure results in very low yields of pure 10- room temperature. The filter was exposed for 3 h at, -70 "C with HCO-H,PteGlu dehydrogenase. intensifying screens.

Peptides generated by CNBr digestion of 10-HCOH,PteGlu dehydrogenase were separated and purified to ap- other four clones contained inserts of 2.7 or 2.2 kb. The gel parent homogeneityby reverse-phase HPLC (see "Experi- was bidirectionally blottedonto nitrocellulose and probed mental Procedures"). Five peptides, 9, 15, 17A, 17B, and 23, either witha nick-translated 3.2-kb insert purified from clone were selected and sequenced. (These sequences are underlined 15 (Fig. 1B) or with 5'-"'PP-labeled oligonucleotide 3, a best in Fig. 3.) Best-guess sense oligonucleotides were designed to guess antisense oligonucleotide to the NHn-terminal amino peptide 15 (MYIAKEE), ATGTACATCGCCAAGGAGGAG acidsequence (Fig. IC).Thenick-translated 3.2-kb probe (oligonucleotide 5), and peptide 23 (MKIGNPLE), ATGAA- hybridized to the 3.2-, 2.7-, and 2.2-kb fragments (Fig. 1B) GATCGGCAACCCCCTGGAG (oligonucleotide 6), using under very stringent washing conditions, indicating that these mammalian codon usage data (20, 21). A third, best-guess sequences were highly related. The end-labeled NHp-terminal antisense oligonucleotide was designed to the NH2-terminal best-guess oligonucleotide only hybridized to the 3.2-kb insequence (MKIAVIGQSLFG), CCGAACAGGGACTGGCC- serts (Fig. l C ) , which suggests that these inserts containeda complete open reading framefor 10-HCO-H,PteGlu dehydroGATCAGGGCGATCTTCAT (oligonucleotide 3). Isolation, Purification, and Characterization of cDNA Clones genase, whereas the other shorter clones were truncated a t of IO-HCO-H,PteGlu Dehydrogenase-An initial screen of 5 the 5'-end. sequencing of cDNA Clones-Three clones were chosen for x 10" plaques gave 171 positive clones to oligonucleotide 5 and 81 positive clones to oligonucleotide 6. Comparison of the sequencing, 17C (3.2-kb insert), 14-1C (2.7-kb insert), and 5autoradiographs showed that 23 were double positive clones, 1C (2.2-kb insert).Thesethree clones contained only one of which 12 were cored. A secondround of plaque purification insert (Fig. lA)and were suitable for sequencing a doublereduced the number of double positive clones toeight. These stranded template in both directions. Initially,clone 17C (3.2 were screened through a thirdandfinalround of plaque kb) was sequencedwith two primers, 7pBS and M13 universal purification. Southernanalysis of EcoRI-or ClaI-digested primer, which read from the pBluescriptvector into both ends DNA from these clones with '"P-labeled best-guess oligonu- of the insert, andwith the two best-guess oligonucleotides, 5 cleotide 5 revealed insert sizes which ranged from 2.0 to 4.5 and 6,used to isolate the clones. The primer 7pBS showed an kb(datanotshown).Theeightplaque-pure clones were ATG startcodon 163bp downstreamof the left EcoRI linkage site. Translationof the nucleotide sequencerevealed an amino excised from XZAP and recircularized intotheBluescript phagemid (see "Experimental Procedures"). DNA was puri- acid sequence identical to thatof the NH2 terminusobtained from the M13 fied from the clones, digestedwith EcoRI, and electrophoresedby automated Edman degradation. The results on an agarose gel (Fig. L4). Four clones contained 3.2-kb universal primer (which primes in the reverse direction) and inserts; three of these also contained smaller fragments. The oligonucleotides 5 and 6 revealed three overlapping sequences

4968

Primary Structureof Rat IO-HCO-HZteGlu Dehydrogenase

FIG. 2. Sequencing strategy and physical map of full-length clone 17C (3109 bp) and truncated clones 14-1Cand 5-1C.The thin lines to the left and right are the 5'- and 3'-untranslated regions of clone 17C, respectively. The heauy line represents the open reading frame (2706 bp). Thebase pairs shown in parentheses for clones 14-1C and 5-1C are the 5"starting points relative to the full-length clone above. The arrows indicate the direction and extent of DNA sequence determined. The notations on the arrows denote which primer was used 3, 5, and 6 are best-guess oligonucleotides 3, 5, and 6; 7pBS is a primer for pBluescript (CGGTGGCGGCCGCTCT); and M13 UP is M13 universal primer. The positions of the peptide sequences are shown. Restriction endonuclease site abbreviations are: A, AccI; B, BamHI; Bg, BglI; E, EcoRI; ERV, EcoRV; H, HindIII; K, KpnI; Sc, ScaI; and Sm,SmaI.

700 bp long that contained a 430-bp open reading frame that encoded the sequence (M)YIAKEE derived from peptide 15; a putative TGA stop codon; an AATAAA polyadenylation splicing signal; and, 23 bp downstream, a short poly(A) tail. These results clearly show that clone 17C encodes the entire open reading frame for 10-HCO-H'PteGlu dehydrogenase. The sequencing strategy is shown in Fig. 2. Both strandsof each of thethree clones, 17C, 14-1C, and 5-1C,were sequenced. The complete nucleotide sequence of clone 17C (3109 bp) and the deduced amino acid sequence are shown in Fig.3.Only one extended open reading frame of2706 bp coding for 902 amino acids was found. This deduced amino acid sequence contained the NHp-terminal sequence and all five of the sequences derived from the CNBr peptides. The deduced amino acid sequence has a calculated M, of 99,015, whichagrees with the SDS-polyacrylamide electrophoresis estimate of 100,000 (4). All three clones had identical sequences, with the exception of the 3'-untranslated region. Clone 14-1C ended at bp 3088 (Fig. 3), whereas clone 5-1C did not contain bases 3093-3103 (Fig. 3); but it did have an extended 29-bp poly(A) tail. Northern Analysis-Northern analysis of rat liver total RNA with a nick-translatedfull-length cDNA insert revealed two bands, one minor band just below the 28 S RNA marker at -3.2 kb (equivalent in size to clone 17C and sufficient for the open reading frame) and a larger, major band just above the 28 S marker at -3.7 kb (Fig. 4). The abundance of the 3.7-kb message suggests that it is the mature transcript and that the3.2-kb band may be a degradation product. DISCUSSION

We have isolated a full-length cDNA clone for rat liver 10HCO-H'PteGlu dehydrogenase. The open reading frame codes for 902 amino acids of M , 99,015, whichis in agreement with the predicted subunit size of 100,000 (3, 4). The NHzterminal amino acid sequence and sequences of CNBr peptides derived from pure 10-HCO-H'PteGlu dehydrogenase are present in the deduced amino acid sequence. A search of the two majorprotein sequence data bases (PIR, release 23; and SWISS-PROT, release 13) using FASTA (17) or FASTDB (IntelliGenetics Inc.) revealed that 10-HCOH'PteGlu dehydrogenase has three putativedomains (Fig. 5). The first domain extends from the NH, terminus to amino acid 203 and shows identity to GAR transformylase from Bacillus subtilis (22), E. coli (23), Drosophila(24, 25), and human hepatoma HepG2 (26) (Fig. 5). No identity to the yeast GAR transformylase (27) was seen on the initial data

base search. Subsequently, an alignment was revealed using the Needleman-Wunsch algorithm (18)(data notshown). The extent of sequence identity increases from B. subtilis (99 amino acids) to thehuman enzyme, where it covers the entire length of the GAR transformylase sequence. The GAR transformylases from B. subtilis (22) and E. coli (23) are single polypeptides, whereas those from Drosophila (24, 25), human (26),and yeast (27) are the COOH-terminal domain in a trifunctional enzyme. The other two activities associated with this purine biosynthetic complex are 5'-phosphoribosylglycinamide synthetase (EC 6.3.4.13) and 5'-phosphoribosylaminoimidazole synthetase (EC6.3.3.1), which formthe sequence HzN-GAR synthetase-5'-phosphoribosylaminoimidazole synthetase-GAR transformylase-COOH (25). The role ofGAR transformylase is to transfer aone-carbon unit from 10-HCOH'PteGlu to 5'-phosphoribosylglycinamide, the product of GAR synthetase, to yield 5'-phosphoribosyl-N-formylglycinamide (28). These observations suggest that the NHp-terminal domain of 10-HCO-H'PteGlu dehydrogenase may bind 10-HCO-H'PteGlu. The sequence HPSLLP (residues 106-111) and a glycine (residue 115) 4 residues downstream (Fig. 6) arestrictly conserved in enzymes that use 10-HCO-H'PteGlu as a substrate, except in the yeast GAR transformylase (27), which hasa low sequence identity to GAR transformylase from other species. This strictlyconserved sequence does not occur in the trifunctional C1-THF synthase' which synthesizes 10HCO-H'PteGlu (29-32). However, visual inspection of the 10-HCO-THF synthetase domain (residues 305-935) (31) of human CI-THF synthase reveals a similar sequence (QPSQGPTFG). Thissequence is conserved in otherC1-THF synthases (29-32) and in 10-HCO-THF synthetasefrom Clostridium acidiurici (33) and Clostridium thermoaceticum (34). By aligning all 12 sequences and introducing a gap beforethe final glycine in the 10-HCO-THF synthetase sequences, there is a strict conservation of 4 amino acids in this 10-residue sequence (Fig. 6), with the obvious exception of yeast GAR transformylase. To facilitate easy discussion of these 12 sequences, the residues have been numbered 1-10 in Fig. 6. The differences at positions 1(histidine to glutamine or glutamate) and 5 (leucine to glycine) are split between the top six en-

'Cl-THF synthase is the trifunctional enzyme lo-formyltetrahydrofolate synthetase, 5,lO-methenyltetrahydrofolatecyclohydrolase (EC 3.5.4.9), and 5,lO-methylenetetrahydrofolate dehydrogenase (NADP+) (EC 1.5.1.5). The use of lO-HCO-H,PteGlu for the dehydrogenase and THF and10-HCO-THF for the C1-THF synthaseand its 10-HCO-THFsynthetase domain, respectively, is intended to clearly distinguish these two enzymes.

Primary Structure

10-HCO-H4PteGlu of Rat 60

4969 480

470

CCCGTCTGGGAGCTCGCGGTTCCTGTGCTGACTCTCAAAGGGCTGACTGACCCTCGCA~~

TAGCAGGTACTTCTGGATTGCRAGTCCAGGGTTCTGGGCCGGTGACCCTGTTTTCTCTA~

Dehydrogenase

AlaAlaLy~GluAlaPheGluAsnGlyLeuTrpGlyLe~T~pGlyLy~IleAsnAlaA~gA~pA~~Gly GCAGCAAAGGRAGCCTTCGAGRA~~GGACTGTGGGGAAAGTGCTCGTGACCGGGGC

1620 490 500 A r g L e u L e u T y r A r g L e u A l a A s p V a l M e t G l u G l n H i s G l n G l ~ G l u L ~ ~ Al~Th~Il~ " CGGCTCCTGTACAGGTTGGCGGATGTCATGGAGCAGCACCAGGRAGAGCTAGCCACCATT TTCCTGTCTTTGACCTTGGGTACCTTCCRACCCACTGTTGCCATGRAGATTGCAGTRATC 1680 180 510 520 10 20 120

1

GluAlaLeuASpArgGlyAlaValTyrThrLeuAlaLeuLy~Th~HisV~lGlyMetSe~ GGACAGAGCCTGTTTGGCCAGGAGGTTTACTGCCAGCTGAGGRAGGAGGGCCACGAGGTG

GAGGCCCTGGATCGAGGTGCCGTSTACACACGCTGGCCCTGRAGACGCACGTGGGCATGTCC

240

1740

30

530

~ValPheThrIleProAspLysAspGlyLy~A~pGlyLy~Al~A~pP~~L~~GlyL~uGl~Al~ GTGGGTGTGTTCACAATCCCAGACRAGGATGGG~GCAGACCCCCTGGGCCTGGAGGCT

300 60 50 GluLy~A~pGlyArgAlaValPh~LysPheProArgT~pA~gAl~A~gGlyGl~Al~L~u GAGRAGGATGGCCGTGCCGTGTTCAAGTTTCCTCGGTGGCGTGCTCGGGGACAGGCTCTG

360

._.

17n

e

r

1860 570

~~

90 S

1800 550 560 IleA~nGlnAlaArgPr~ASnA~gAsnL~uTh~Le~Th~Ly~Ly~Gl~P~oValGlyGly ATCAACCAGGCTAGACCCAACCGCRACCTGACCTTGACCRAGRAGGRACCTGTTGGGGGT

8580 0

7. .0

ProGluValValAlaLySTy~Gl~AlaLeUGlyAlaGluL~~A~~V~lLeUP~OPheCyS CCCGAGGTGGTGGCCAAATACCA(;GCTCTGGGTGCAGAGCTC~TGTGCTTCCCTTCTGC

G

l

n

P

h

e

I

100 l e

P

r

o

O

AGCCAGTTCATACCCATGGAGGTCATTRATGCCCCCAGGCACGGCTCCATCATCTACCAT

480 110

L~uTrpHi~CySHi~Pr~L~uGluLeUSeTLeuA5nASpAl~IleLevGl~A~pCysSe~ CTGTGGCATTGTCATCCCCTGGRACTATCCCTTRATGATGCTATCCTGGRAGACTGCAGC

1920 600 590 ProValAlaAlaGlyA~nThrValValIleLy~P~oAlaGl~V~lTh~P~~L~~Th~Ala CCTGTGGCGGCTGGGAACACCGTGGTGATCRAGCCTGCCCAGGTGACCCCACTCACAGCC

1980

120

~ S ~ = A l a I l e A ~ " T = p T h = L e " I l e H i ~ G l ~ CCATCCCTGCTACCCAGGCATCGAGGGGCTTCGGCCATCAACTGGACCCTCATTCATGGA

540

540

IleGlnThrPheArgTy~PheAlaGlyTrpCysAspLy~IleGl~GlyAlaTh~IlePro ATCCAGACCTTCCGGTACTTTGCTGGCTGGTGTGATRAGATCCAGGGTGCCACCATCCCC

610

620

LeuLy~PheAlaGluLeuThrLeuLysAlaGlyIl~P~~Ly~Gl~v~lValAsnIl~Leu TTGRAGTTTGCCGAGCTGACACTGRAGGCTGGCATTCCCRAGGGTGTGGTCAACATCCTC

2040

130 AspLysLysGlyGlyPheThrllePheTrpRlaAspAl~A~pA~pGlyLe~A~pTh~GlyA~pL~u GATAAGAAAGGAGGCTTCACCATCTTCTGGGCTGATGATGGCCTGGACACTGGTGACCTT

630

640

650

660

600 1 =,n ".

170

TTCCTCTTCCCAGAGGGCATCAAI\GGGATGGTTCAGGCTGTGCGACTGATCGCAGAGGGT

2160

720

670 ValLySLySValSerLeuGluLeilGlyGlyLysSerProL~~IleIl~Ph~Al~AspCy~ GTGRAGRAAGTCTCCCTGGAGCTGGGTGGAAAGTCACCCCTTATCATATTTGCTGACTGC

2220

700

rGlYGluGlvAlaThrTvrGluGlyIleGlnLysLys ACAGCCCCCCGCTGCCCCCAGTCTGAGGAAGGAGCCACCTATGAGGGCATTCAG~GAAG

780 710

GlyPheThrGlySerThrGluValGlyLysHislleMetLy~S~~Cy~Al~Le~S~~A~n GGGTTCACAGGCTCCACGGAGGTGGGG~CACATCATGAAAATCGGCAAGCTGTGCTCTGAGCRAC

180 680

P h e L e u P h e P r o G l u G l y I l e L y s G l y M e t V a l G l y ~

190

2100

160 ".

LeuLeuGlnLysGluCysGluValLeuProRspAspThrV~lS~~Th~Le~Ty~A~~A~g CTGCTGCAGRAGGAGTGTGAGGTGCTTCCAGATGACACCGTGAGCACGCTGTAC~TCGT 660

690 ASpLeuASnLySAlaValG1nMetGlyMetSerSerValPh~PheA~~Ly~GlyGl~A~~ GACCTCRACAAGGCCGTGCAGATGGGCATGAGCTCCGTTTTCTTCAACAAAGGGGAGAAC

2280

220

720

710 CySIleAlaAlaGlyArgLeuPheValGluGluSerIleHi~A~~Gl~PheV~lGlnLy~

840 230

240

ASnASpLySVslPrOGlyAl~T~~~ThrGluAlaCysGlyGl~Ly~L~~Th~Ph~PheA~n AATGACRAGGTGCCAGGTGCCTGGACAGAGGCTTGTGGGCAGRAGCTGACATTTTTCAAC

900 250 260 SerThrLeuASnThrSerGlyLeuSerThrGlnGlnGlyGluAl~L~uP~oIleP~~GlyAla TCGACACTCRACACTTCAGGACTGTCRACCCAGGGAGRAGCTTTACCCATCCCAGGAGCC 960 270 280 780 H i S A r g P r o G l y V a l V a l T h r L y s A l a G l y L e u I l e L e v P h e G l y A s n G l v H i ~ A t?EEl!X ~g~ CATCGTCCAGGCGTGGTCACCRAAGCAGGACTCAGGACTCATCCTCTTTGGGRATGAGCATAGRATG 1020 290 300 800 LeYLeuYalLIlSASnIleGInLeuC.luAsoGlvLvsMetMetProAlaSerGlnPhePhe TTGCTGGTGRAGRACATCCAGCTGGAAGATGGCAAGATGGCRAGATGATGCCAGCCTCCCAGTTCTTC 1080 310 320 8 2s0p L e u G l u L e u T h r G l u A l a G l a L y S G l y S e ~ A l a S e ~ S e ~ A RAGGGGTCCGCTAGCAGTGACCT(;GAGCTGACAGACAGAGGCGGAGTTGGCCACGGCGGAGGCT 1140 330 340 ValArgSerSerTrpMetArgIleLeuPrORsnValProGluV~lGluA~pSe~Th~A~p GTGCGGAGCTCTTGGATGCGAATCCTGCCGRACGTCCCAGAGGTGGRAGACTCTACAGAT 1200 350 PhePheLy~S~rG1~Al~AlaSerValAspValValAIgL~uV~lGl~GluV~lLy~G~~ TTCTTCRAGTCAGGAGCTGCTTCTGTAGATGTTGTGAGGCTGGTGGAGGRAGTGRAGGAG 860 1260 370 380 LeuCySAspGlyLeuGluLeuGluAsnGluAspValTyrMetAl~Th~Th~PheA~gGlv CTGTGTGACGGGCTGGAGTTAGA~~~VLTGRAGACGTTTACATGGCCACCACCTTTAGGGAG 880 1320 390 400 PheIleGlnL~uLeuValArgLysL~uArgGlyGlyGluAspA~pGluSe~GluCysvalIle TTTATCCAGCTCCTAGTGAGGAAGCTGAGAGGGGAGGACGACGAGAGCGAGTGTGTCATT 1380 410 420 ASnTyrValGluArgAlaVa1AsnLysLeuThrLeuGlnMetProTyrGlnLeuPheIle AACTACGTGGRAAGGGCAGTGRA~RAGCTGACTCTCCRAATGCCCTACCAGCTCTTCATA 1440 430 440 GlyGlyGluPheValAspAlaGluGlySerLysThrTyrA~"Th~I1eA~"P~~Th~A~p GGCGGCGAGTTCGTGGACGCTGAGGGCTCTRAGACCTACRATACCATCAACCC~CGGAT 1500 450 460 GlySerValIleCySGlnValSerLeuAlaGlnValSerAspv~~AspLy~Al~v~lAla GGRAGTGTCATCTGCCAGGTGTCCCTGGCTCAGGTGAGCGATGTTGACRAGGCAGTGGCA 1560

FIG. 3. Nucleotide sequence of rat 10-formyltetrahydrofolate dehydrogenase cDNA and derived amino acid sequence. The nucleotides and amino acids are numbered sequentially from their appropriate starting positions. The peptide sequences derived from CNBr digestion are underlined. The putativestop codon is indicated by asterisks. The polyadenylation splice site is shown in boldface letters. The nucleotide sequence missing from the 3'-end in clone 5-1C is underlined (bp 3093-3103), and the 3"termination of clone 14-1C is indicated by a uertical line (bp 3088).

TGCATTGCGGCAGGCCGGCTCTTTGTGGAGGAGTCCATCCACRACCAGTTTGTGCAGRAA 2340 730 740 V

a

l

V

a

l

G

l

u

G

l

u

V

a

l

G

l

u

L

y

~

~

t

~

' E e n R L x i

GTGGTGGAGGAGGTGGAGAAGATGAAAATCGGCAATCGGCAATCCCCTGGAGAGGGACACCRATCAT 2400 750 760 G1YPTOGlnRSnHiSCI1YAlaHiJLeuRIgLysLeuv~lGl~T~~Cy~GlnA~gGlyVal GGCCCGCAGRACCATGAGGCCCACCTGRGGRAGGAAGCTAGTGGAGTATTGCCAACGTGGTGTG 2460 770 Ly5GluGlyAlaThrLeuValCyaGlyGlyGlyA~~Gl~V~lP~oA~gP~~GlyPh~Ph~Ph~ AAGGRAGGGGCCACACTGGTCTGTGGTGGGRACCRAGTCCCRAGGCCAGGATTCTTCTTT

2520 790 G l n P r O T h r V a l P h e T h r A S p V a l G l u A s p H i s ~ S e r P h e

CAGCCRACCGTCTTCACAGACGTAGAGGACCACATGTACATCGCTRAGGAGGAGTCCT~T

2580 810

GlyProIleMetIleIleSerArgPheAlaAspGlyA8pValA8pAlaValLeuSerArg GGGCCCATCATGATCATCTCTCGGTTTGCTGATGGGGATGTGGATGCTGTGCTGTCCCGG 2640 830

840

AlaA8nAlaThrGluPheGlyLeuAlaSerGl~ValPheThrArqAspIleAsnLy8Ala GCCAATGCCACAGRATTTGGCCTGGCCTCTGGGGTCTTCACCCGCGATATCRACAAGGCC 2700 850

LeuTyrValSerAspLysLeuGlnAlaGlyThrValPheIleA8nThrTyrAsnLy8Thr CTGTATGTCAGTGACAAACTGCAGGCAGGCACTGTGTTCATCAACACGTATAACAAGACT 2760 870

AspValAlaAlaProPheG1yGlyPheLy8GlnSerGlyPheGlyLy8A8pLeuGlyGlu GATGTGGCCGCACCTTTTGGAGGATTCRAGCRATCTGGATTTGGCAAAGACCTGGGAGAG 2820

900 AlaAlaLeuA8nGluTyrLeuArgIleLy8ThrValThrPheGluTyr*** GCA~CCTGRATGAGTACCTGCGGATCRAGACTGTGACTTTTGAGTACTGAAGCCAAGTC 890

2880

TGTGATATTTCTCCTGTACCCTGTTGCCTCCCCTCAGGGAGCCCTGGCCCCTGTCTGGTG 2940

TCTTAGCACCCTCCTGTCCCCRACACTGTTCCCTTCAGCTGCTGGAGCTATCAGTGTGGA 3000 CCCCTGCTGGTGACXGGACACCTCCGGACAATTGGACAATTGGRAGTGGTTCCAAGTGGAGTGAGCAG

3060 TCATGTCCCGGATGGTTGTGAGCAGA-AAAAAA

I

3109

14-1C

FIG. 3"continued

zymes, which utilize 10-HCO-H,PteGlu, and the lower six, which synthesize 10-HCO-H,PteGlu, whereas the change at position 4 is limited to a conservative change in only the rat and human C1-THF synthases. Another interesting feature of

Primary Structure

4970

lO-HCO-H,PteClu of Rat Dehydrogenaye

this sequence is that i t occurs in the same relative position in each of the proteins. In the GAR transformylase sequences and rat 10-HCO-H'PteGlu dehydrogenase, the histidine occurs at positions 106-116, with the exception of the yeast sequence, which is displaced to residue 125. The clostridial sequences start with a glutamate a t position 94 or 98, whereas the Cl-THF synthase sequences start witha glutamine a t positions 100-108, relative to the beginning of the 10-HCOT H F synthetase domain. The beginning of the 10-HCO-THF synthetasedomain is setat Ile""' in thehumanCl-THF synthase (31),which is conserved in all the reportedsequences (29-32). Thestrictconservation of prolineandserine at positions 2 and 3, respectively, proline at position 6 (except for yeast GAR transformylase), and glycine a t position 10 may represent the boundaries of a secondary structural feature that is involved in the binding of 10-HCO-H'PteGlu by enzymes that both synthesize andutilize 10-HCO-H,,PteGlu. The extentof this domain may now be extended. Inglese et al. (35) recently reported thatE. coli GAR transformylase can be inactivated by using the affinity label N"'-(bromoacety1)5,8-dideazafolate. It was subsequently shown by Inglese et al. (36) that the dideazafolate label is covalently attached by an ester linkage to Asp'.'". This aspartate residue is conserved in all known GAR transformylase sequences includingyeast (Asp'"'). Site-directed mutagenesis of Asp'"' to Asn"' in E.

FIG.4. Northern blot analysis of rat liver RNA. Rat liver RNA ( 2 5 pg) was electrophoresed, transferred to GeneScreen Plus, and hyhridized as described under "Experimental Procedures." The positions of the 28 S and 18 S RNAs are indicated.

. .,.

too

0

200 203

I

COOH

'IN HUMAN(P7V.IDENTITY I

21

500

400

300

H2N

ALDEHYDE DEHYDROGENASE -NAD'IQl.IdDENTITYI

HzN'

1%

E .Ol

1

418

IPQWDFNTITYI

HzN

69

168 COOH

H2N B.S"BTILfS

800

700

900

cwn

458

DROSOPHILA I 2 Q h l D E N T I T Y l

u

600

417

193

H~N C -OOH

C "H l0H f

coli GAR transformylase (36) results ina catalytically inactive enzyme, which implies that Asp'"' may function within the catalytic center of the enzyme. Alignment of GAR transformylasesequences from R. subtilis, E. coli, Drosophila, and humanwiththerat 10-HCO-H'PteGludehydrogenasesequence from His""' to Gln1'9 is shown in Fig. 7. Asp"" of E. coli GAR transformylase is strictly conserved in all sequences and is 36 residues downstream of the initial histidineresidue of these sequences. The yeast GAR transformylase sequence has an aspartateconserved 42 residues downstream (data not shown). Inglese et al. (36) also reported that theE.coli Asn'.',' mutant could still bind substrates and inhibitors and can also be covalently labeled with N"'-(bromoacetyl)-5,8-dideazafolate. The site modification in the Asn'" mutant was His'"' (Fig. 7). This histidine residue is strictly conserved in Drosophila (His'2i) and yeast The human GAR transformylase has a histidine at position 121,2residues downstream. R. subtilis has no histidineresidue in this region, whereas rat liver lO-HCO-H,PteGlu dehydrogenase has a histidine at position 113, 4 residues upstream. The proximity of the reactive His"" in E. coli GAR transformylase to the conserved sequence in Fig. 6 supports the hypothesis that this domain is involved in the binding of 10-HCO-H'PteGlu. The spatial arrangement of this region of E. coli GAR transformylase (His""-Gln'") and its role in folate binding will have to await the elucidation of the crystal structure currently in progress (37). Coincidentally, the 10-HCO-THF synthetase domains of the four CI-THF synthases and the two10-HCO-THFsynthetases show highly conserved sequences within thisregion. However, there is no identity with the GAR transformylase sequences, except for a strictly conserved aspartatethat is 34 residues downstream.Thisis illustrated in Fig. 7 by the inclusion of the relevantsequences of the 10-HCO-THF synthetase domains of the yeast and human cytosol CI-THF synthaseenzymes. Currently, the inhibitionof purine synthesis, andin particular the inhibition of GAR transformylase,has become a popular target for the development of antineoplastic agents (38). The region of 10-HCO-H'PteGludehydrogenase at H i ~ ' ~ - G l nshows "~ a highidentity to the same region of GAR transformylase (Fig. 7). This suggests that lO-HCO-H,PteGlu dehydrogenase may also interact with folate analogs designed toinhibit GAR transformylase.Inhibition of 10-HCO-

1

COOH ALDEHYDEDEHYDROGENASE-NADP+

(237. I O E N T I T Y l

736 I

COOH

DELTA-I-PYRROLINE-5-CARBOXYLATE 125%lDENTITYI DEHYDROGENASE

(30-..IDEN&YI

GAR TRANSFORMYLASE

FIG. 5. Alignment regionsof rat liver 10-formyltetrahydrofolate dehydrogenase. Searches of the PIR (release 23) and SWISS-PROT (release 13) protein data bases by FASTA or FASTDR(see "Experimental Procedures") using the entire 902-residue sequence of 10-formyltetrahydrofolate dehydrogenase revealed alignments with other known sequences, with the exception of human GAR transformylase (see below). These results are shown graphically to illustrate the domains thatoccur in 10-formyltetrahydrofolate dehydrogenase. The entire sequence of each protein is shown. The hemv lines represent the areas of sequence identity to the proteins, and the numbers represent the corresponding residue in the 10-formyltetrahydrofolate dehydrogenase sequence. The sequences aligned to the NH,-terminal region are all examples of GAR transformylase. The human enzyme was not in the data base, and alignmentwas achieved by entering the amino acid sequence (26) and comparing it with 10-formyltetrahydrofolate dehydrogenase (residues 1-203) using GENALIGN (IntelliGenetics Inc.; see "Experimental Procedures"). The sequences aligned to the COOH-terminal region are composed of a series of NAD'dependent aldehyde dehydrogenases, (Table I), an NADP+-dependenttumor-specific aldehyde dehydrogenase, and ~-1-pyrroline-5-carboxylate dehydrogenase.

Primary Structure

of 10-HCO-HPteGlu Rat 460

aEsLeyE Yeast Rat

GART

BEE

&

A

27

1

2

1

0

10-FTHFDH

1

2

5

3

H

P

4 A

1

I

:

6

H

P

I

I

I P

5

6

L

P

L

L

I

I

I

1

0

8

R

I H

.I

.I

.I

.I

.I

116

P

S

L

L

P

E . coli GART

23

1

0

8.s u b t i l i s

22

1

0

mitoYeast

GART

GART C1-SYNTH

30

L

M

C1-SYNTH

2 9

U

Q

I

I

I

8

H

P

Cyto

I

I

H

P

I

l

C1-SYNTH

32

U

P

I Human

31

C1-SYNTH

C.acidivrici

A

Q

i

' 33

10-FTHFS

9

4

10-FTHFS

34

9

8

Q

P

I

I P

I E

P

G G

.

'

I

: G

S

I

I

I

I

P

G

K

I

Y

I A

F

'

.

P

T

L

I

I I

L

I

'

T

F

I

I

P

P

G

I -

G

-

G

-

G

-

G

-

G

-

G

I

T

G

I

G

P

P

Q

'

L

P

I

I

F

1 P

I

I

K

: Y

II L

G

S

I

K P

G

I Q

I

E

I C . therroaceticum

S

I

D R

:

1

I.

L

I

F H

L

L L

I

I

Q

I

S

C R

T

I I I

F

I

'

I

S

L

G

P

S

F

I

I

I

I

I

I

S

L

G

P

S

F

I

I

FIG. 6. Putative 10-HCO-H4PteGlu-binding site. The sequences were taken from the references indicated. The number next to the first residue of each sequence is the position of that residue in the protein sequence. The exceptions to this are underlined and represent the position of the residue within the 10-HCO-THF synthetase (IO-FTHFS) domain of the C1-THF synthase (Cl-SYNTH) sequence. The reference point for the start of the 10-HCO-THF synthetase domain is Ile305 ofthe human C,-THF synthase (31).This residue is conserved in all four C,-THF synthase sequences listed. 410 460 Identical residues are indicated by vertical lines,whereas conservative changes are indicated bycolons.GART, glycinamide ribonucleotide transformylase; IO-FTHFDH, 10-HCO-H4PteGludehydrogenase.

10-FTHFDH PAT 106

HPSLLPRHRGASAINWTLIHGDKKGGFTIFWADDGLDTGDLLLC 149 IIIIII I I 1 1 1 1 1 l I I 1 I 109 H P S L L P A F P G I D A V G Q A F R A G V K V A G I T V H Y V D E G M D T G P I I A Q 152 B . SUBTILIS GART I I I I I I I I I I1 I1 I I I l l I I I I 1 1 I 108 HUMAN GART HPSLLPSFKGSNAHEQALETGVTVTGCTVHFVAEDVDAGCIILC . . 151 IIII I IlIIlI1 I 1 1 I I 1 I IIIIII I DROSOPHILA GART 116 HPSLLPKYPGLHVQKQALEAGEKESGCTVHFVDEGVDTGAIIVC 159 IIIIIIIIIIII 1 1 1 1 I I I 1 1 1 1 I 1 I I €.COLI GART I o n HPSLLPKYPGLHTHRQALENGDEEHGTSVHFVTDEWGGPVILQ 151 ~~

YEAST CYTO C1-SYNTH HUMAN CYTC C1-SYNTH

I l l I I Le8 QPSLGPTL-GVKGGAAGGGYSCVIPMDEFNLHLTG-DIHAIGAA 149 I l l I l l I I I I l l 1 1 1 1 1 l I l I I I I I I I I I I I1111111 QPSQGPTF-GIKGGAAGGGYSQVIPMEEFNLHLTG-DIHAIGAA

450 RAT 10-FTHFDH

4971

4 1430 0 420 CVINYVERAVNKLTLQMPY-QLFIGGEFVDAEGSKTYNTINPTDGSVICQVSLAQVSDVD I:::/ ::11::1: I: ::/:: ::/I:::: : I l l : :: : I l l HUMAN CYTO ALDH SSSGTPDLPVLLTDLKIQYTKIFINNEWHDSVSGKKFPVFNPATEEELCQVEEGDKEDVD 30 20 10 40 50 60 470 480 520 490 510 500 KAVAAAKEAFENG-LWGKINARDRGRLLYRLADYMECHQEELATIEALDRGAVYTLALKT RAT 10-FTHFDH I11 1 1 : : 1 1 : I I :::l::IIlllI:lll::l::: II1:I::: I :I: I : HUMAN CYTC ALDH KAVKAARQAFQIGSPWRTMDASERGRLLYKLADLIERDRLLLATMESMNGGKLYSNAYLN 70 80 90 100 120 110 540 580 530 510 560 550 RAT 10-FTHFDH HVGMSIQTFRYFAGWCDKIQGATIPINQARPNRNLTLTKKEPVGGLWHCHPLELSLNDAI . . . . :. I : l : I l 1 1 1 I I I I I 1 1 1 1 : :: :I 1 : : 1 1 : 1 : I :::I I HUMAN CYTC ALDH 3LAGCIKTLKYCAGWADKIQGRTIPID----GNFFTYTRHEPIGVCGCIIPWNFPLVMLI 130 140 150 160 170 590 600 610 620 640 630 RAT 10-FTHFDH LEDCSPVRAGNTVVIKPACVTPLTALKFAELTLKAGIPKGVVNILPGSGSLVGCRLSDHP .. .. .. .. .. 1 1 1 1 1 : 1 1 1 : I I I I I I : 1 : 1 : : 1 1 : 1 / I / l l : l l 1 : : I : : 1 : 1 HUMAN CYTO ALDH WKIGPALSCGNTWVKPAEQTPLTALHVASLIKEAGFPPGVVNIVPGYGPTAGAAISSHM 1 8 0200 220 210 190 230 650 660 670 690 680 700 DVRKIGFTGSTEVGKHIMKSCALSNVKKVSLELGGKSPLIIFADCDLNKAVQMGMSSVFF PAT 10-FTHFDH 1: l::llllllIll 1::: : ll:l:l:llllllll 1::Il l1::lH::: :11: HUMAN CYTO ALDH DIDKVAFTGSTEVGKLIKEAAGKSNLKRVTLELGGKSPCIVLADADLDNAVEFAHHGVFY 240 260 250 210 280 290 720 710 130 140 150 760 NKGENCIAAGRLFVEESIHNCFVQKVVEEVEKMKIGNPLERDTNHGPCNHEAHLRKLVEY RAT 10-FTHFDH ::I: Illl:l:IlllII:::lI:: 1 1 ::1 :lIII::::::lIl :::: 1::: HUMAN CYTO ALDH HCGCCCIAASRIFVEESIYDEFVRRSVERAKKYILGNPLTPGVTQGPCIDKEQYDKILDL 330 300 320 310 340 350 710 180 820 790 810 800 PAT 10-FTHFDH CQRGVKEGATLVCGGNQVPRPGFFFCPTVFTDVEDHMYIRKEESFGPIMIISRFADGDVD :I:III 1 : I 111:: : I:I 1 1 1 1 1 : : 1 : 1 : 1 IIIII 1 1 1 : I : : :I I HUMAN CYTO ALDH IESGKKEGAKLECGGGPWGNKGYFVCPTVFSNVTDEMRIAKEEIFGPVCCIMKFK--SLD 360 3 1390 0 380 410 400 810 880 850 830 840 860 AVLSRRNATEFGLASGVFTRDINKALYVSDKLQAGTVFINTYNKTDVAAPFGGFKCSGFG PAT 10-FTHFDH :1::1ll:I :ll::llll:ll:ll: :I: I1lIII::I 1 : :::: I I I I I I I 1 I HUMAN CYTO ALDH D V I K R A N N T F Y G L S A G V F T K D I D K A I T I S S A L Q A G T V W V N S G N G 450 440 430 420

0

G

P

l S

I

I

1

P

L

L

S

P

P

I Rat

I

9 Q

L

S

I

I Yeast

S

.

9

I

24,25

GART

Drosophila

8

:

S

26

Human

7

Dehydrogenase 440

890 900 RAT 10-FTHFDH IDLGEAALNEYLBIKTVTFEY : : I l l :::I1 : / I l l HUMAN CYTC ALDH RELGEYGFHEYTEVKTVTVKI 480 490

FIG. 8. Alignment between 10-HCO-H4PteGludehydrogenase and human liver cytosol aldehyde dehydrogenase. The sequences were aligned by FASTA (17). Identical residues are identified by vertical lines, whereas conservative changes are identified by colons. The conserved active-site Cys residue is in boldface type. Argg' is underlined. IO-FTHFDH, 10-HCO-H4PteGlu dehydrogenase; CYTO ALDH, human cytosol aldehyde dehydrogenase (39). RAT 10-FTHFDH 591

GNTVVIKPAQVTPLTALKFAELTLKAGIPKGVVNILPGSGSLVGQRLSDHPDV

HUMANCYTOALDH185

lIlll:Ill: 111111::1 I :: H : I lllll:ll 1: :I :I I I: GNTVVVKPAECTPLTALHVASLIKEAGFPPGVVNIVPGYGPTAGAAISSHMDI

146

IIIII 11:: ::I: : ::111:1 1I:I:::I: : : :::::I: FIG. 7. Alignment of putative 10-HCO-H4PteGlu-binding AlP5C-DH 228 GNTVVWKPSQTAALSNYLLMTVLEEAGLPKGVINFILGDPVQVTDQVLADKDF site with active-site Asp'44 of E. coZi GAR transformylase. YEAST The GAR transformylase (GART) (22-26) and ratlO-HCO-H,PteGlu 10-FTHFDH 644 RKIGFTGST-----EVGK-HI-MKSCALSNVKKVSLELGGKSPLIIFADCDLN dehydrogenase (IO-FTHFDH) sequences were aligned by GEN- RAT I::1lIII 1 1 1 1 :I :: ll:l:l:llllllll 1::11 1 1 : DKVAFTGST-----EVGK-LIKEAAGK SNLKRVTLELGGKSPCIVLArWLD ALIGN (IntelliGenetics Inc.; see "Experimental Procedures") using HUMANCYTOALDH239 : : I I I II I/ :: :::I1 :: I:: I I l l : :I:: 1:: the Needleman-Wunsch algorithm (18).The Cl-THF synthase (Cl- YEAST AlP5C-DH 281 GALHFTGSTNVFKSLYGKIQSGWEGKYRDYPRIIGETGGKNFHLVHPSANIS SYNTH) sequences (29, 31) are included to show the putative 10HCO-H4PteGlu-bindingregion (QPSLGPTL-G) and theposition of RAT 10-FTHFDH 700 KAVQMGMSSVFFNKGENCIRGRLFVEESIHNCF 123 : 1 1 : : 1I:::I: l 1 1 1 : 1 : 1 1 1 1 1 1 : : : 1 the conserved aspartate (boldface). Residues of interest are shown in HUMANCYTOALDH295 NAVEFAHHGVFYHCGCCCIAASRIFVEESIYDEF 318 boldface type. The number next to the first residue of each sequence : 1 1 :: : I 1 1 1 1 I:l1II:lI 1l:::Il HAVLSTIRGTFEFCGQKCSAASRLYLPESKSEEF 367 is the position of that residue in the protein sequence. The exceptions YEASTAlPSC-DH244 are underlined and represent the position of the residue within the FIG. 9. Alignment of rat liver 10-HCO-H4PteGludehydro10-HCO-THF synthetase domain of the Cl-THF synthase as de- genase, human cytosol aldehyde dehydrogenase, and yeast Ascribed for Fig. 6. 1-pyrroline-5-carboxylatedehydrogenase. The alignment was achieved with GENALIGN (IntelliGenetics Inc.) using the Needleman-Wunsch algorithm (18).Residues which are identical are indicated by vertical lines. Spaces (indicated by dashes) were introduced TABLEI to improve the alignment. 10-FTHFDH, 10-HCO-H,PteGlu dehydroSequence identity of IO-HCO-HZteGlu dehydrogenase t o various genase; CYTO ALDH, human cytosol aldehyde dehydrogenase (39); aldehyde dehydrogenases AlP5C-DH, yeast A-1-pyrroline-5-carboxylate dehydrogenase (51).

Enzyme

Identity"

Extentb

Ref.

%

NAD+-dependent aldehyde dehydrogenase Human cytosol 47 Human mitochondria 46 Horse cytosol 45 Horse mitochondria 46 Rat mitochondria 47 Aspergillus nidulans 45 P. oleovorans 28 NADP+-dependentaldehyde dehydrogenase Rat tumor-specific Va1458-Thr897 48 25

Thr4I7-Thr8% 39,40 Gln423-Thrs99 41,42 Le~~~~-Thr~ 43" Gln423-Phe9" 44 Gln423-Thrs99 45 P r ~ ~ * ~ - P h e46~ ~ Thr441-Gly885 47

Percent of residues which are exactly matched. bExtent refers tothe region of identityin 10-HCO-H,PteGlu dehydrogenase.

H4PteGlu dehydrogenase by GAR transformylase inhibitors could compromise hepatic formate metabolism in primates and humans and may produce acidosis typical of methanol poisoning (8,9). Itwould also beinteresting to investigate the action of these inhibitors on the 10-HCO-THF synthetase activity of trifunctional C1-THF synthase. The region from amino acids 200 to 420 shows no extensive identity to any known sequencein eithernucleotide or protein data bases. The sequence from Thr4I7 to the COOH terminus (residue 902) shows a remarkable 45-47% identity toa series of mammalian and one fungalNAD+-dependentaldehyde dehydrogenase (Table I and Fig. 5) (39-46). The alignment to the sequence of human cytosolic NAD+-dependent and aldehyde dehydrogenase is shown in Fig. 8. The sequences

4972

Primary Structureof Rat 10-HCO-HZteGluDehydrogenase

show that out of 484 amino acid residues, 226 were identical (47% identity), whereas 165 residues (34%) were judged to be conservative changes, which would give an 81%overall identity. A lower identity (28%) was shown to anNAD+-dependent aldehyde dehydrogenase from Pseudomonas oleavoram (47) and to a rat tumor-specific NADP+-dependent aldehyde dehydrogenase (25% identity) (48). The extent of identity is essentially along the entire length of the aldehyde dehydrogenase sequences. The degree of conservation between 10-HCO-H4PteGlu dehydrogenase and NAD+-dependent aldehyde dehydrogenase prompted us to check for this activity. Using the method of Lindahl and Evces (19), which is designed for the assay of rat liver aldehyde dehydrogenase, no activity was found when NAD' was used. However, with added NADP+, activity was seen for formaldehyde (0.008 pmol of NADPH min" mg"), propionacetaldehyde (0.008 pmol of NADPH min" mg"), aldehyde (0.049 pmol of NADPH min-' mg-'), and benzaldehyde (0.005 pmol of NADPH min-' mg"). Rat liver aldehyde dehydrogenase has not been purified, so a direct comparison between these rates is not possible.However, the specific activity of the pure human cytosol NAD+-dependent aldehyde dehydrogenase assayed under similar conditions (1.3 pmol of NADH min-' mg-') (49) is -25-fold higher than the NADP+-dependent aldehyde dehydrogenase activity of 10HCO-H4PteGlu dehydrogenase with propionaldehyde as the substrate. An oriental variant of human mitochondrial aldehyde dehydrogenase, where glutamate 502, 13 residues from the COOH terminus (equivalent toin cytosolic enzyme) (Fig. 7), is changed to lysine, results in a drasticreduction in enzyme activity (50). The corresponding residue in 10-HCOH4PteGlu dehydrogenase is arginine 894, a basic residue, which may also affect the rate of aldehyde dehydrogenase activity. The active site of human liver cytosol aldehyde dehydrogenase contains a disulfiram-sensitive thiol that has been identified as cysteine 302 (39). This cysteine residue is conserved in all the enzymes listed in Table I and in 10-HCOH4PteGlu dehydrogenase (Cys707) (Fig.8). The effect of disulfiram on 10-HCO-H4PteGludehydrogenase has not been determined. Another enzyme that shows high identity to a significant region of the COOH terminus of 10-HCO-H4PteGludehydrogenase is the yeast mitochondrial enzyme A-l-pyrroline-5carboxylate dehydrogenase (EC 1.5.1.12), which converts A1-pyrroline 5-carboxylate to glutamate in an NAD+-dependent pyrrole ring-opening oxidation (51). Alignments of 10HCO-H4PteGlu dehydrogenase, human cytosol aldehyde dedehydrogenase hydrogenase, and A-l-pyrroline-5-carboxylate reveal a high levelof identity (29%)among all three sequences in a region close to the putative active-site cysteine 302of aldehyde dehydrogenase (Fig. 9). The revelation that 10-HCO-H,PteGlu dehydrogenase has three putative domains suggests that it may have evolved from a triple-gene fusion. The presence of the GAR transformylase domain and the aldehyde dehydrogenase domain now allows us to specifically inhibit or inactivate these domains and toinvestigate the folate-binding region(s), the mechanism of action, and the physiological role of 10-HCO-H4PteGlu dehydrogenase. Acknowledgments-We wish to thankWilliam T. Briggs for excellent technical assistance. We also would like to thank T. Porter for performing the automated peptide sequence analysis and Dr. Marion L. Dodson for assistance in the implementation of the FASTA and IntelliGenetics software.

REFERENCES 1. Kutzbach, C., and Stokstad, E. L. R. (1968). Biochem. Biophys. Res. Commun. 30,111-117 2. Rios-Orlandi, E. M., Zarkadas, C.G., and MacKenzie, R. E. (1986) Biochim. Biophys. Acta 8 7 1 , 24-35 3. Scrutton, M. C., and Beis, I. (1979) Biochem. J . 177, 833-846 4. Cook, R. J., and Wagner, C. (1982) Biochemistry 21,4427-4434 5. Min, H., Shane, B., and Stokstad, E. L. R. (1988) Biochim. Biophys. Acta 967,348-358 6. Case, G. L., Kaisaki, P. J., andSteele, R. D. (1988) J. Biol. Chem. 263,10204-10207 7. Barlowe, C. K., and Appling, D. R. (1988) Biofactors 1, 171-176 8. Johlin, F. C., Swain, E., Smith, C., and Tephly, T.R. (1989) Mol. Pharmacol. 35, 745-750 9. McMartin, K.E., Martin-Amat, G. C., Makar, A. B., and Tephly, T.R. (1977) J. Pharmacol. Exp. Ther. 2 0 1 , 564-572 10. Krebs, H. A., Hems, R., and Tyler, B. (1976) Biochem. J. 158, 341-353 11. Cook, R. J., and Wagner, C. (1986) Methods Enzymol. 122,251255 12. Cook, R. J., Misono. K. S., and Wagner. . C. (1984) J. Biol. Chem. 259,12475-12480 13. Maniatis. T.. Fritsch. E. F.. and Sambrook. J. (1982) Molecular Cloning:A Laborato'ry Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 14. Smith, G. E., and Summers, M. D. (1980) Anal. Biochem. 109, 123-129 15. Chirgwin, J . M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299 16. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 7 4 , 5463-5467 17. Pearson, W. R., and Lipman, D. J. (1988) Proc. Natl. Acad. Sci. U. S. A. 8 5 , 2444-2448 18. Needleman, S., and Wunsch, C. (1979) J. Mol. Biol. 48,444-453 19. Lindahl, R., and Evces, S. (1984) Biochem. Pharrnacol. 33,33833389 20. Lathe, R. (1985) J. Mol. Biol. 1 8 3 , 1-12 21. Grantham, R., Gautier, C., Gouy, M.,Jacobzone, M., and Mercier, R. (1981) Nucleic Acids Res. 9, r43459 22. Ebbole, D. J., and Zalkin, H. (1987) J. Bid. Chem. 2 6 2 , 82748287 23. Smith, J. M., and Daum, H. A., 111 (1987) J. Biol. Chem. 2 6 2 , 10565-10569 24. Henikoff, S., and Furlong, C. E. (1983) Nucleic Acids Res. 1 1 , 789-800 25. Henikoff, S., Sloan, J. S., and Kelley, J. D. (1983) Cell 3 4 , 405414 26. Schild, D., Brake, A. J., Kiefer, M. C., Young, D., and Barr, P. J. (1990) Proc. Natl. Acad. Sci. U. S. A. 8 7 , 2916-2920 27. White, J. H., Lusnak, K., and Fogel, S. (1985) Nature 315,350352 28. Smith, G. K., Benkovic, P. A., and Benkovic, S . J. (1981) Biochemistry 20,4034-4036 29. Staben, C., and Rabinowitz, J. C. (1985) J. Biol. Chem. 2 6 1 , 4629-4637 30. Shannon. K. W.. and Rabinowitz, J. C. (1988) J. Bid. Chem. 263, 7717-7725 31. Hum, D. W., Bell, A. W., Rozen, R., and MacKenzie, R. E. (1988) J . Biol. Chem. 263.15946-15950 32. Thigpen, H. E., West, M. G., and Appling, D. R. (1990) J . Biol. Chem. 265,7907-7913 33. Whitehead, T. R., and Rabinowitz, J. C. (1988) J. Bacteriol 1 7 0 , 3255-3261 34. Lovell, C. R., Przybyla, A., and Ljungdahl, L. G. (1990) Biochemistry 29,5687-5694 35. Inglese, J., Johnson, D. L., Shiau, A., Smith, J. M., and Benkovic, S. J. (1990) Biochemistry 29, 1436-1443 36. Inglese, J., Smith, J. M., and Benkovic, S. J. (1990) Biochemistry 29,6678-6687 37. Stura, E. A., Johnson, D. L., Inglese, J., Smith, J. M., Benkovic, S. J., and Wilson, I. A. (1989) J. Biol. Chem. 264,9703-9706 38. Chabner, B. A,, Allegra,C. J., and Baram, J. (1986) in Proceedings Chemistry and of the 8th International Symposiumonthe Biology of Pterdines, Montreal, Canada (Cooper, B.A., and Whitehead, V. M., eds) pp. 945-951, Walter de Gruyter, Berlin 39. Hempel, J., Bahr-Lindstrom, H., and Jornvall, M. (1984) Eur. J . Biochem. 141,21-35

Primary Structure

of

Rat lO-HCO-H,PteGlu Dehydrogenase

40. Hsu, L. C., Tani, J., Jujiyoshi, T., Kurachi, K., and Yoshida, A. (1985) Proc. Natl. Acad. Sci. U. S. A . 82,3771-3775 41. Hsu, L.C., Bendel, R. E., and Yoshida, A. (1988) Genomics 2 , 57-65 42. Hempel, J., Kaiser, R., and Jornvall, H. (1985) Eur. J . Biochem. 153,13-28 43. Von Bahr-Lindstrom, H.,Hempel, J., and Jornvall, H. (1984) Eur. J. Biochem. 141,37-42 44. Johansson, J., Von Bahr-Lindstrom, H., Jeck, R., Woenckhaus, C., and Jornvall, H. (1988) Eur. J . Biochem. 1 7 2 , 527-533 45. Farres, J., Guan, K.-L., and Weiner, H. (1989) Eur. J. Biochern. 180,67-74 46. Pickett, M., Gwynne, D. I., Buxton, F. P., Elliot, R., Davies, R. W., Lockington, R. A,, Scazzacchio, C., and Sealy-Lewis, H.M.

4973

(1987) Gene (Arnst.) 5 1 , 217-220 47. Kok, M., Oldenhuis, R., van der Linden, M. P. G., Meulenberg, C. H. C., Kingma, J., and Witholt, B. (1989) J. Biol. Chem. 264,5442-5451 48. Jones, D. E., Brennan, M. D., Hempel, J., and Lindahl, R. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 1782-1786 49. Pietruszko, R., and Yonetani, T. (1981) Methods Enzyrnol. 7 1 , 772-781 50. Yoshida, A., Huang, I.-Y., and Ikawa, M.(1984) Proc. Natl. Acad. Sci. U. S. A. 8 1 , 258-261 51. Krzywicki, K. A,, and Brandriss, M.C. (1984) Mol. Cell. Biol. 4 , 2837-2842 52. Wagner, C., Briggs, W. T., and Cook, R. J . (1989) FASEB J. 3 , A1059 (abstr.)