© 1991 Oxford University Press

Nucleic Acids Research, Vol. 19, No. 23 6457-6463

Cloning, characterization and evolution of the BsuFI restriction endonuclease gene of Bacillus subtilis and purification of the enzyme

Wolfgang Kapfer, Jörn Walter and Thomas A. Trautner

Max-Planck-Institut für Molekulare Genetik, Ihnestraße 73, D-1000 Berlin 33, FRG

Received September 4, 1991; Revised and Accepted November 6, 1991

ABSTRACT
The restriction endonuclease (R.BsuFI) of Bacillus subtilis recognizes the target DNA sequence 5'CCGG. The R.BsuFI gene was found in close proximity to the cognate M.BsuFI gene, which had previously been characterized (1). Cloning of the R.BsuFI gene in E. coli was only possible with the M.BsuFI Mtase gene present on a compatible plasmid. The cloned R.BsuFI gene was expressed in E. coli and restriction activity was observed in vivo and in vitro. The R.BsuFI gene consists of 1185 bp, coding for a protein of 395 amino acids with a calculated molecular weight of 45.6 kD. The R.BsuFI enzyme was purified to homogeneity following overexpression. It presumably works as a dimer and cleaves the 5'CCGG target sequence between the two cytosines to produce sticky ends with 5'CG overhangs, like the isoschizomers R.MspI and R.HpaII. The relatedness between R.BsuFI and R.MspI is reflected by significant similarities of the amino acid sequences of both enzymes. This is the first case where such similarities have been observed between isoschizomeric restriction endonucleases which belong to 5mC-specific R/M systems. This observation suggests that the R.BsuFI and R.MspI genes derive from a common ancestor. In spite of such functional and evolutionary relatedness, the R/M systems differ in the arrangement of their R and M genes. In the BsuFI system transcription of the two genes is convergent, whereas divergent transcription occurs in the MspI system.

EMBL accession no. X62104

INTRODUCTION
Type II restriction endonucleases (ENases) occur in association with DNA methyltransferases (Mtases) in restriction/modification (R/M) systems. Their primary function is to inactivate foreign DNA invading bacteria (2). In the last decade a number of genes encoding ENases have been cloned and characterized (3). Comparative studies on sequenced R/M systems led to some general conclusions: (i) the two genes of R/M systems are always located in close proximity to each other in bacterial chromosomes; (ii) no significant similarities were found at the amino acid sequence level between restriction enzymes of different R/M systems, with the exception of the isoschizomeric pair R.EcoRI and R.RsrI (4); (iii) in contrast, Mtases (5, 6, 7) sometimes show extensive amino acid similarities to one another; (iv) the ENase and Mtase of the same R/M system have no significant homology, although both act on the same DNA target.

The ubiquitous distribution of R/M systems in the bacterial kingdom, the presence of sequence-specific Mtases in association with bacterial viruses and in eukaryotic cells, and the occurrence of many R/M systems with identical specificity in phylogenetically separate bacteria make ENases and Mtases interesting paradigms for studying protein evolution.

Recently, the M.BsuFI gene has been isolated and sequenced (1). Sequence comparison between M.BsuFI and the isomethylomer M.MspI (8) revealed a high degree of similarity at both the nucleotide and amino acid sequence levels (60% and 56%, respectively). This is particularly marked in the putative target-recognizing domains (TRDs), which are the regions most likely responsible for sequence-specific interaction with DNA (9). The similarity suggests that a common ancestor of the M.BsuFI and M.MspI genes, which derive from gram-positive (10) and gram-negative (8) bacteria, respectively, was transferred horizontally among the phylogenetically distant organisms. This observation raised the question of whether the corresponding ENases R.BsuFI and R.MspI are also similar in their amino acid sequences. This would indicate that they, too, can be traced back to a common ancestor. With the sequence of R.MspI available (8), such an amino acid sequence comparison of the two systems required cloning and sequencing of the R.BsuFI gene and characterization of the ENase, which are reported in this communication.

MATERIAL AND METHODS
Bacterial strains and media
The Bacillus subtilis strains used were ISF18 (hsrM-, hsmM+, hsrF+, hsmF+) (10) and SB1207 (hsrM-, hsmM+, hsrF, hsmF-) (11). The E. coli strains NM675 (e14°, Δ(mrr-hsdRMS-mcrBC)) (12), K803 (e14°, mcrB-1, hsdS-3) (13) and DH5αMcr (e14°, recA-, Δ(mrr-hsdRMS-mcrBC)) (Life Technologies, Gaithersburg, MD, USA) were used for maintenance of plasmids encoding the BsuFI ENase and Mtase. JM103 (14) was used as host for cloning procedures with M13 mp18/mp19 (15). B. subtilis strains were grown in TY medium (16) with the addition of 5 µg chloramphenicol/ml, if required. E. coli strains were grown in L-medium (17) with the addition of 50 to 100 µg ampicillin/ml, 70 µg spectinomycin/ml or 13 µg tetracycline/ml, if required. Competent B. subtilis cells were prepared and transformed as described (18).

Plasmids and phages
Plasmid pBW201 (1) and pOU71-5, a derivative of pOU71 (19), each encoding the BsuFI Mtase, were used in the two-step cloning procedure to protect the host genome against R.BsuFI cleavage. The low copy number plasmid pGB2 (20), which carries the spectinomycin resistance gene (SpR), served as a vector for subcloning of the R.BsuFI gene, and pMS119 H/E (a gift of E. Lanka), a derivative of pJF119 (21), was used as an expression vector for the R.BsuFI gene. The chloramphenicol resistance gene (CatR) of pC194 (22) was used for construction of pBRCat18. Phage stocks of λvir were prepared from plate lysates (23). Phage stocks of modified and non-modified SPP1 phages were prepared as described (24).

Determination of nucleotide sequences
DNA restriction fragments and deletion derivatives of the 1.7 kb HindIII/HpaI fragment of pBW55 were subcloned in M13 mp18 or mp19 vectors. The nucleotide sequences of overlapping fragments of both strands were determined by the dideoxy chain termination method (25).

Enzymes and chemicals
Restriction enzymes, calf intestinal phosphatase, Klenow fragment of DNA polymerase and T4 DNA ligase were obtained from Boehringer Mannheim, FRG, and used under the conditions recommended by the manufacturer. [α-35S]dATP was purchased from Amersham (UK).

Construction of expression vector
In order to clone the R.BsuFI gene, host DNA must be premodified by M.BsuFI. Therefore, the genes of the BsuFI R/M system were separated on two compatible plasmids. The Mtase gene, encoded on a 2.5 kb EcoRI fragment, was inserted into the plasmid pOU71, a ColE1 replicon. This plasmid, pOU71-5, contains the AmpR gene and the temperature-sensitive λ cI857 repressor, which regulates the expression of the replication control genes. At 37°C the copy number of this vector reached nearly that of pBR322 (26). The R.BsuFI gene, encoded on a 2.1 kb HindIII-BamHI DNA fragment of pBW55, was cloned on a compatible and selectable expression vector. The plasmid, designated pBW57 (Fig. 4), was maintained in DH5αMcr [pOU71-5].

Overexpression and purification of R.BsuFI endonuclease
A 200 ml culture of DH5αMcr [pOU71-5/pBW57] grown to A600 = 0.8 was induced with 1 mM IPTG. After 4 h cells were harvested by centrifugation and washed in 20 mM Tris (pH 8.0). Cells were collected by centrifugation and resuspended in 5 ml 20 mM Tris (pH 7.6), 500 mM KCl. Cells were disrupted twice by French press (pressure 850 psi) and cell debris was removed by centrifugation (15,000 rpm in a Sorvall SS34 rotor, 20 min). The supernatant was mixed slowly with a 3-fold volume of an ice-cold saturated ammonium sulfate solution and kept at 4°C for 30 min. After centrifugation (40 min, 17,000 rpm in a Sorvall SS34 rotor at 4°C) the pellet was resuspended in 5 ml buffer 1 (10 mM Tris (pH 7.6), 0.1 mM EDTA, 1 mM DTE (dithioerythritol), 20 mM KCl) and dialysed against the same buffer overnight. The suspension was loaded on a 2.5 × 4.0 cm DEAE Sepharose column (Pharmacia) equilibrated in buffer 1. A linear gradient from 20 mM to 1.0 M KCl in buffer 1 was used to elute the active R.BsuFI at a flow rate of 60 ml/h. Active fractions were pooled and dialysed against ten volumes of buffer 1 overnight. The suspension was loaded on a 2.5 × 3.0 cm S Sepharose column (Pharmacia) equilibrated with buffer 1. Proteins were eluted with a linear gradient of 20 mM to 1.0 M KCl in buffer 1. Fractions exhibiting R.BsuFI ENase activity were pooled and dialysed against buffer 1. The protein pool was further purified on a 1.0 × 2.5 cm DNA cellulose column (Pharmacia) and eluted with a linear gradient of 20 mM to 1.0 M KCl in buffer 1 (flow rate 10 ml/h). Finally, R.BsuFI active fractions were concentrated to a final volume of 2 ml by dialysis against buffer 2 (50 mM Hepes-KOH (pH 7.6), 0.5 M KCl, 10 mM EDTA, 1 mM DTE) + 40% glycerol, and gel filtration on a 1.5 × 75 cm G-100 Sephadex column (Pharmacia) was performed. Samples were eluted with buffer 2 + 10% glycerol (flow rate 5 ml/h). The R.BsuFI active fraction was stored at -20°C without detectable loss of activity over a period of six months.

Fig. 1. Schematic presentation of the strategy used for cloning of the R.BsuFI gene. (A) Integration of pBRCat18 into the chromosome of ISF18. Chromosomal and plasmid DNA are shown as double and single lines, respectively. Homologous DNA sequences between plasmid and the ISF18 chromosome are filled. Hatched boxes indicate direct repeats flanking the inserted pBRCat18 at the integration point of mutant ISFCat18. (B) Map of pBW5 excised from the chromosome of the ISFCat18 mutant. Arrows indicate the position and orientation of genes of pBRCat18 and of the M.BsuFI (M) and R.BsuFI (R) genes; ori indicates the origin of replication of pBRCat18 in E. coli. (C) Reconstitution of the entire BsuFI R/M system. DNA fragments of pBW5 were joined together on the low copy number plasmid pGB2, creating pBW55. Several subclones of pBW55 (pBW53, pBW43, pBW54) were constructed to determine the confines of the R.BsuFI gene. The in vivo restriction and modification activities of ISF18 wt, ISFCat18 and its subcloned DNA fragments in E. coli plasmids are shown in the right panel (- = deficient; + = proficient).

Determination of the N-terminal sequence of R.BsuFI
To determine the translational start point of the R.BsuFI gene, crude extracts of DH5αMcr [pOU71-5] additionally encoding the R.BsuFI gene on pBW57 were separated by SDS-PAGE. The proteins were electroblotted onto PVDF membrane (Millipore) and stained in Ponceau S solution (Sigma). To avoid contamination, only the middle parts of the 40 kD R.BsuFI protein bands in each track were carefully excised. Sequence determination was carried out on an Applied Biosystems pulsed-liquid phase sequencer, model 477A. Phenylthiohydantoin amino acids were separated on-line in an Applied Biosystems model 120A analyser and identified by manual interpretation of the data.

Endonuclease assay
After each purification step, 1-4 µl of selected fractions were analysed for R.BsuFI cleavage activity with phage λ DNA in 10 mM Tris/HCl (pH 7.6), 10 mM MgCl2 and 1 mM DTE. Samples were incubated for 1 h at 37°C and cleavage products were analysed on 1% agarose gels after staining with ethidium bromide. One unit of activity was defined as the amount of enzyme required for complete digestion of 1 µg λ DNA in one hour.

In vivo restriction activity
The R.BsuFI in vivo restriction activity of B. subtilis ISF18 wt and ISFCat18 mutants was measured according to Trautner et al. (24) using phage SPP1. M.BsuFI-modified λvir phages were prepared on K803 encoding M.BsuFI on the plasmid pBW201. The R.BsuFI in vivo restriction activity was monitored by determination of the relative e.o.p. of non-modified and M.BsuFI-modified λvir phages grown on E. coli strains harbouring plasmids coding for the R.BsuFI gene according to standard procedures (27).

Other methods
Protein concentrations were measured according to the method of Bradford (28) using a kit from Bio-Rad (FRG) with BSA as a protein standard. The protein samples were analysed on 0.1% SDS-15% polyacrylamide gels and visualized by Coomassie brilliant blue staining. Computer-aided sequence comparisons were performed either on a VAX using the UWGCG software (29) or on a PC (IBM-compatible) using the program 'Motif' (30).

RESULTS
Cloning of the R.BsuFI gene
The BsuFI Mtase (M.BsuFI) gene has been previously isolated and characterized (1). Sequencing of a chromosomal DNA fragment containing the M.BsuFI gene revealed the 3' end of an open reading frame (ORF) downstream of the Mtase gene.

Fig. 2. Nucleotide and deduced amino acid sequence of the R.BsuFI restriction ENase. The proposed promoter region (-10 and -35) and the Shine-Dalgarno region (S/D) are underlined. The predicted terminator structure of two inverted repeats is indicated by arrows. The 3' end of the M.BsuFI gene (nt 1500 to 1518) and the deduced amino acid sequence are shown. [Figure: sequence listing, EMBL accession no. X62104]

Fig. 3. Amino acid alignment between the R.BsuFI and R.MspI (8) ENases. A. Four regions (I-IV) with a high degree of identical amino acids were aligned together with neighbouring sequences; the numbers represent the position of the regions within each enzyme. B. Relative position of each conserved region (filled boxes) within R.BsuFI and R.MspI.

Tab. 1. Purification of the R.BsuFI ENase

Purification step            Total units (U)   Total protein (mg)   Specific activity (U/mg)   Purification factor   Yield (%)
Crude extract                ---               23                   ---                        ---                   ---
Ammonium sulfate             ---               16.8                 ---                        ---                   ---
DEAE Sepharose (Fast flow)   105780            13.4                 7894                       1                     100a
S Sepharose (Fast flow)      89087             4.9                  18181                      2                     84
DNA cellulose                12666             0.38                 33334                      4                     12
Gel filtration (G-100)       6316              0.04                 157894                     20                    6

a) R.BsuFI cleavage activity of crude extracts and of the fraction after (NH4)2SO4 precipitation could not be determined due to DNase activity.
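The derived columns of Tab. 1 follow from the two measured ones: specific activity is total units divided by total protein, and purification factor and yield are expressed relative to the DEAE Sepharose step, the first step at which activity could be measured. The following is a minimal sketch of that arithmetic, not the authors' procedure; the names `STEPS` and `purification_summary` are illustrative. Recomputed specific activities for the last two steps differ slightly from the printed values, presumably because the printed protein amounts are rounded.

```python
# Values transcribed from Tab. 1. Crude extract and ammonium sulfate rows are
# left out because their activity could not be determined (footnote a).
STEPS = [
    ("DEAE Sepharose", 105780, 13.4),
    ("S Sepharose", 89087, 4.9),
    ("DNA cellulose", 12666, 0.38),
    ("Gel filtration (G-100)", 6316, 0.04),
]


def purification_summary(steps):
    """Return (step, specific activity, purification factor, yield %) rows.

    The first row is taken as the reference (factor 1, yield 100 %).
    """
    ref_units, ref_protein = steps[0][1], steps[0][2]
    ref_specific = ref_units / ref_protein
    rows = []
    for name, units, protein_mg in steps:
        specific = units / protein_mg              # U/mg
        factor = specific / ref_specific           # fold purification
        yield_pct = 100 * units / ref_units        # recovered activity
        rows.append((name, specific, factor, yield_pct))
    return rows


if __name__ == "__main__":
    for name, specific, factor, yield_pct in purification_summary(STEPS):
        print(f"{name:24s} {specific:10.0f} U/mg  x{factor:5.1f}  {yield_pct:5.1f} %")
```

Rounded to whole numbers, the factors and yields computed this way agree with the table.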

Fig. 4. Circular map of the expression vector pBW57. A 2.1 kb fragment of pBW55 (black) coding for the R.BsuFI gene was inserted into the polylinker of pMS119-2, constructed from pMS119 H/E and the TcR gene of pBR322 (dark grey).

Fig. 5. 0.1% SDS-15% polyacrylamide gel stained with Coomassie brilliant blue of crude cell extracts and fractions of R.BsuFI enzyme during purification. MWS: molecular weight standard: BSA (64 kD), ovalbumin (43 kD), carbonic anhydrase (30 kD). Lane 1: induced crude extract of DH5αMcr [pOU71-5/pMS119-2] (control); Lane 2: crude extract of induced DH5αMcr [pOU71-5] cells coding for the R.BsuFI gene on pBW57; following lanes: fractions of the latter extract after different purification steps: (3) ammonium sulfate precipitation, (4) DEAE Sepharose, (5) S Sepharose, (6) DNA cellulose and (7) gel filtration.

The deduced amino acid sequence of this ORF had some similarities to the C-terminal sequence of R.MspI (8), which recognizes the same DNA target sequence (5'CCGG) as R.BsuFI. Considering that genes of R/M systems are generally closely linked on the chromosome, we speculated that the partial ORF identified indeed represented the C-terminus of R.BsuFI. Therefore the cloning strategy for the R.BsuFI gene was based on the integration of a marker gene into the predicted R.BsuFI gene in the chromosome of the donor strain B. subtilis ISF18, followed by cloning of sequences neighbouring the insert (31) (Fig. 1). For this purpose a 0.6 kb DNA fragment from the internal region of the putative R.BsuFI gene was inserted into a derivative of pBR322 harbouring the CatR gene of pC194. The resulting pBRCat18 is only able to replicate in E. coli, not in B. subtilis. Therefore stable CmR transformants were only expected in B. subtilis if integration into the chromosome had occurred via homologous recombination (Fig. 1A). pBRCat18 was transformed into ISF18, which encodes the genes of the BsuFI system. Two stable CmR transformants were found. Both clones had lost the capacity to restrict the B. subtilis phage SPP1, but retained the ability for M.BsuFI-specific modification. This suggested that the putative R.BsuFI gene had been disrupted following integration of the plasmid. Further work was performed with one of the insertional mutants (ISFCat18).

In order to excise pBRCat18 together with flanking sequences of the chromosome, R.MluI was used to cleave chromosomal DNA of ISFCat18. The resulting DNA fragments were religated under appropriate conditions and transformed into E. coli. Twenty-two AmpR transformants were found. One of them contained a plasmid, designated pBW5, carrying about 18 kb of chromosomal DNA (Fig. 1B). The restriction analysis of this plasmid agreed well with the restriction map of the BsuFI locus (data not shown) and showed that the inserted pBRCat18 was flanked by two direct DNA repeats, as would be expected if integration of pBRCat18 into ISF18 had occurred by Campbell-type recombination (32). Plasmid pBW5 was resistant to R.MspI cleavage, indicating that the M.BsuFI gene was also part of the cloned chromosomal DNA.

The R.BsuFI gene, which had been disrupted by pBRCat18 insertion, was reconstituted from subfragments contained in pBW5 (Fig. 1B). The fragments were inserted into the low copy number plasmid pGB2, creating pBW55 (Fig. 1C). We observed that the intact BsuFI R/M system on pBW55 cannot become established in E. coli without premodification of the host genome by M.BsuFI. Subclones of the R.BsuFI gene must therefore routinely be transferred into E. coli strains which already carry the M.BsuFI gene on an additional compatible multicopy plasmid (pBW201). In order to identify the confines of the R.BsuFI gene within the cloned segment, several subclones were constructed and tested for R.BsuFI restriction activity in vivo. The R.BsuFI gene was detected on a 1.7 kb HindIII/HpaI DNA fragment (Fig. 1C). Several E. coli strains such as NM675, K803 or DH5αMcr carrying plasmids with the cloned R.BsuFI gene exhibit a strong restriction activity in vivo. Unmodified λvir phage grew on K803 harbouring pBW55, for example, with an e.o.p. of about 1 × 10⁻⁴ compared with that of the M.BsuFI-modified phage (e.o.p.: 1). In ISF18, the restriction frequency of non-modified SPP1 phage was 10⁻⁶.

Fig. 6. Estimation of the subunit structure of the native R.BsuFI by gel filtration on a Sephadex G-100 column (see MATERIAL AND METHODS). The column was calibrated with the proteins: 1. ribonuclease A (13 kD), 2. chymotrypsinogen A (25 kD), 3. ovalbumin (43 kD), 4. albumin (67 kD).

Fig. 7. Effect of methylation by M.HpaII on restriction activity of R.BsuFI enzyme. About 1 µg of plasmid DNA of pBW201 and pMHpaII, coding for M.BsuFI (mCCGG) and M.HpaII (CmCGG), respectively, was treated with about 10 U of R.BsuFI, R.MspI and R.HpaII for one hour and analysed on a 1% agarose gel; 'undig.' refers to untreated plasmid DNA.

Fig. 8. Intramolecular sequence comparison of the primary structure of the R.BsuFI enzyme. A. Blocks with several identical amino acids are aligned and framed; numbers indicate the position of each block within the enzyme. B. Schematic primary structure of R.BsuFI with the relative position of the repeated blocks corresponding to A.

Nucleotide and amino acid sequence of R.BsuFI
Sequencing the 1.7 kb HindIII-HpaI DNA fragment revealed one ORF of 1185 bp with the translational start point at nucleotide (nt) 159 (Fig. 2). This ORF codes for a protein of 395 amino acids with a calculated Mr of 45.6 kD. The translational start point was verified by protein sequencing (see below). The ORF is preceded by a putative -35 box at nt 58-63 and a -10 box at nt 78-83; each box deviates at only one position from the consensus sequence of B. subtilis vegetative promoters (33). It is difficult to predict a Shine-Dalgarno (S/D) sequence for the R.BsuFI gene. Because of its similarity with the B. subtilis consensus sequence we favour the sequence 5'GGGGG, though we realize its unusual distance from the start codon.
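The relation between the ORF length and the protein length, and the agreement between the start of the ORF and the N-terminal protein sequence reported later in RESULTS (M N K D N Q I K N ? ? G K), can be checked with a short script. This is an illustrative sketch only: the 39-nt string is the beginning of the coding sequence as printed in Fig. 2, the codon table is the standard genetic code, and the function names `translate` and `matches_edman` are hypothetical helpers introduced here.

```python
# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ORF_LENGTH_BP = 1185                      # from the text (stop codon not counted)
N_TERMINAL_CODING = "ATGAATAAAGACAATCAAATCAAAAATGAATCTGGTAAA"  # first 13 codons, Fig. 2
EDMAN_RESULT = "MNKDNQIKN??GK"            # '?' marks unidentified residues


def translate(dna):
    """Translate a coding sequence codon by codon (length must be a multiple of 3)."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def matches_edman(predicted, observed):
    """True if every identified residue in `observed` equals the predicted one."""
    return len(predicted) == len(observed) and all(
        o == "?" or o == p for p, o in zip(predicted, observed)
    )


if __name__ == "__main__":
    print(ORF_LENGTH_BP // 3, "codons")            # 395
    print(translate(N_TERMINAL_CODING))            # MNKDNQIKNESGK
    print(matches_edman(translate(N_TERMINAL_CODING), EDMAN_RESULT))
```

Under these assumptions the two residues left undetermined by protein sequencing correspond to E and S in the deduced sequence.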

Sequence comparison
Intermolecular amino acid sequence comparisons were performed between R.BsuFI and all ENases studied to date. Only the isoschizomer R.MspI showed significant similarity to R.BsuFI, with 45% overall similarity in the amino acid sequence. Four regions with a high degree of amino acid identity (Fig. 3A) were identified. These regions are arranged in the same order in each enzyme, but are separated by different numbers of amino acids (Fig. 3B). Regions I-III comprise 9-11 amino acids each, with 56-63% identity between the sequences. The longest region of similarity, IV, comprises 42 amino acid residues and is located at the C-terminus. The 45% identity between R.BsuFI and R.MspI in this region is also reflected in the nucleotide sequences of the corresponding genes (60%). Sequence comparisons between R.BsuFI and M.BsuFI revealed no significant homology, a feature commonly observed in all R/M systems published so far (34).

Fig. 9. Gene arrangements of the BsuFI and MspI R/M systems. Arrows indicate transcriptional orientations of the ENase and Mtase genes.

Purification and characterization of R.BsuFI enzyme
E. coli DH5αMcr cells harbouring the two plasmids pOU71-5, encoding M.BsuFI, and the expression vector pBW57, carrying the R.BsuFI gene under control of the tac promoter (Fig. 4), were induced with IPTG and extracts were analysed by PAGE. In comparison to the control extract (Fig. 5, lane 1), one additional band, regarded as the R.BsuFI enzyme, with an estimated Mr of about 40 kD appeared in a DH5αMcr [pOU71-5/pBW57] cell extract (Fig. 5, lane 2). The amount of this protein increased significantly within four hours after IPTG addition. This result correlates with an approximately tenfold higher in vivo restriction activity of induced cells compared with non-induced ones (data not shown). The R.BsuFI enzyme was purified to homogeneity in 4 steps (Tab. 1), yielding about 40 µg of pure R.BsuFI from about 1 g (wet weight) of cells.

The calculated Mr of R.BsuFI (45.6 kD) differs from the apparent Mr of about 40 kD estimated by PAGE (Fig. 5). The N-terminal amino acids of R.BsuFI were, therefore, verified by protein sequencing. The determined amino acid sequence M N K D N Q I K N ? ? G K matches that derived from the nucleotide sequence of the R.BsuFI gene (see Fig. 2).

R.BsuFI, with 395 amino acids, is larger than most ENases, which average 280 amino acids (2). Restriction enzymes of this size usually work as dimers. Only R.BsuRI, with 576 amino acids the largest type II ENase (35, 36), and R.Sau96I, which comprises 430 amino acids (37), are supposed to act as monomers. To determine the subunit composition of the R.BsuFI enzyme under native conditions we performed gel filtration on a G-100 Sephadex column. The active R.BsuFI enzyme elutes at a volume which corresponds to a size of about 85 kD, compatible with the assumption that the enzyme acts as a dimer (Fig. 6).

Initial evidence for the R.BsuFI cutting position within the 5'CCGG target sequence was provided by ligating R.BsuFI-cleaved fragments with either DNA fragments containing blunt ends (e.g. as produced by R.EcoRV) or fragments which had 5'CG overhangs (e.g. as produced by R.NarI). Ligation was only observed with fragments containing 5'CG overhangs (data not shown). We concluded that R.BsuFI cleaves the sequence 5'CCGG between the two cytosines, producing a 5'CG overhang as R.MspI does. This conclusion was later substantiated by the direct determination of the R.BsuFI cleavage position in a sequencing reaction (Xu, unpublished results) according to a method described by Brown et al. (38).

The effect of site-specific methylation within the 5'CCGG target of R.BsuFI was examined using different DNA substrates methylated in vivo (Fig. 7). Neither R.BsuFI nor R.MspI is inhibited by M.HpaII methylation (39). The enzymes differ only in their sensitivity to salt: whereas R.BsuFI is fully active at 100 mM NaCl, R.MspI is only partially active under these conditions (data not shown).
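The cleavage pattern described above (a cut between the two cytosines of 5'CCGG, leaving a two-nucleotide 5'CG overhang) can be illustrated with a small in silico digest. This is a hedged sketch rather than a tool used in the study: the function names and the example sequence are invented for illustration, and only the top strand of a linear molecule is modelled.

```python
def cut_positions(seq, site="CCGG", offset=1):
    """0-based top-strand cut indices: the cut falls `offset` bases into each site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, site="CCGG", offset=1):
    """Top-strand fragments of a linear molecule after complete digestion."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def overhang(site="CCGG", offset=1):
    """Single-stranded 5' overhang left by symmetric cuts in a palindromic site."""
    return site[offset:len(site) - offset]


if __name__ == "__main__":
    example = "AACCGGTTACCGGA"        # hypothetical sequence with two CCGG sites
    print(cut_positions(example))     # [3, 10]
    print(digest(example))            # ['AAC', 'CGGTTAC', 'CGGA']
    print(overhang())                 # CG
```

Because the recognition site is palindromic, a cut one base into the site on each strand leaves the central CG unpaired, which is why R.BsuFI fragments ligate to other 5'CG ends but not to blunt ends.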

DISCUSSION The BsuFI R/M system is one of six R/M systems that have been identified in B. subtilis (10, 40). Initial attempts to clone the two genes of the BsuFI R/M system together, in one step, according to the method of Kiss et al. (35) were unsuccessful. Since the M.BsuFI gene alone was readily clonable, we attributed this failure to a deleterious effect of expression of the R.BsuFI gene. Therefore, an alternative approach was used to isolate the R.BsuFI gene, in which the gene was disrupted by insertional mutagenesis prior to cloning in E. coli. Subsequently the R.BsuFI gene was reconstituted and could be cloned in E. coli, provided the cells had previously received the corresponding Mtase gene. The presence of the Mtase gene was also required for cloning of the entire BsuFI R/M system. The need for this 'two step' cloning procedure was attributed to uncoordinated expression of the genes comprising the R/M system in the heterologous E. coli host (41). This phenomenon has been reported before (42). In contrast, the genes of the entire MspI R/M system can be introduced into non-premodified E. coli cells, possibly pointing to different rules governing the expression of these genes in E. coli (8).

The genes of the BsuFI R/M system are convergently transcribed (see Fig. 10) and each is preceded by putative transcription/translation signals which are presumed to be recognized not only in B. subtilis but also in E. coli, where the R/M system is also active in vivo. The 71 bp region separating the two genes includes a 16 bp inverted repeat flanked by T-rich sequences. This structure probably serves as a bidirectional rho-independent transcriptional terminator signal for both genes (Fig. 2).

Amino acid sequence comparisons between Mtases have shown that the 5mC (5) and the 4mC/6mA (6, 7) enzymes represent distinct classes of Mtases. In contrast, amino acid sequence comparisons among ENases have in most cases not revealed any significant similarity (43, 44). Exceptions are the two pairs of isoschizomers R.EcoRI/R.RsrI (4) and R.BsuBI/R.PstI (G. Xu, unpublished results) with 50% and 45% overall amino acid identity, respectively, both of which are components of 6mA-specific R/M systems. No significant similarity has been found between Mtases and ENases. These observations suggest that the genes of Mtases and ENases of known R/M systems have different evolutionary origins. Within their evolutionary pathways, Mtases in general have diverged less from each other than ENases (34, 43, 44). In both 6mA-specific R/M systems mentioned above a common origin of the ENases can be proposed.

Among the 5mC-specific R/M systems, R.BsuFI and R.MspI are the first pair of isoschizomeric ENases which have been found to share significant amino acid sequence similarity, suggesting that the enzymes derived from a common ancestor. The conservation of amino acids is pronounced towards the C-termini, probably reflecting the significance of these regions for the basic functional activity of these enzymes. The N-terminal regions, on the other hand, are highly diverse. In fact, the difference in Mr of the two enzymes can mostly be attributed to differences in length of the N-termini. Based on the identification of intramolecular repeats in some ENases, Lauster (43) proposed that their evolution involved gene duplication. In R.BsuFI, short regions with identical amino acid composition could be found (Fig. 8), whereas such repeats are absent in R.MspI. The presence of the repeated motifs 1a and 2a in the non-conserved N-terminal portion of the R.BsuFI enzyme (Fig. 8) suggests that duplication events of genetic material have occurred during the evolution of the R.BsuFI gene, but not of the R.MspI gene. Interestingly, the repeated motifs of R.BsuFI are not among the sequences conserved in both ENases (Fig. 3).

Assuming that R.BsuFI and R.MspI derive from a common ancestor, extensive parts of the proteins have followed divergent evolutionary avenues accompanied by drastic alterations of the primary structures. Yet both have retained the capacity to cleave the same DNA target sequence in the same manner and both exhibit identical sensitivity to different types of methylation. These observations imply a high flexibility of these enzymes for changes in the length and composition of their amino acid sequences while maintaining identical enzymatic properties. If such a structural flexibility represents a more general phenomenon, the evolutionary relationship between type II ENases could be obscured. This might explain the apparent unrelatedness among this class of enzymes (43, 44).

In the case of the BsuFI and MspI R/M systems both the ENase and the Mtase genes apparently derive from common ancestral genes. Important aspects of the evolutionary history of these R/M systems concern the arrangements of the genes and their directions of transcription. In the BsuFI R/M system the genes are transcribed convergently, whereas in the MspI R/M system (8) they are transcribed divergently (Fig. 9). The orientational differences support the notion that the genes of R/M systems assemble in a stepwise process, in which a cell first acquires a Mtase gene and then the corresponding ENase gene. It is an open question what selective pressure is responsible for bringing the genes into close proximity, a situation found in most R/M systems studied thus far.
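The overall amino acid identity values quoted above for the isoschizomer pairs (50% and 45%) depend on how identity is counted over an alignment. The following is a minimal Python sketch of one common convention, not the procedure used in this work: it assumes the two sequences have already been aligned (gaps written as '-') and divides the number of identical columns by the number of columns containing at least one residue. The function name and the toy sequences are hypothetical and are not taken from R.BsuFI, R.MspI or any other enzyme.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the paper): the denominator is the number of
    alignment columns in which at least one sequence has a residue.
    Other conventions, such as dividing by the length of the shorter
    sequence, give different values for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = 0   # columns with a residue in at least one sequence
    matches = 0   # columns with the same residue in both sequences
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap-only column carries no information
        columns += 1
        if a == b:
            matches += 1

    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy_a = "MKT-LLGAVE"
    toy_b = "MKSALL-AVD"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because gapped columns count as mismatches under this convention, enzymes that differ mainly in the length of one terminus, as described for the N-termini here, will show a lower overall identity than a comparison restricted to the conserved region would.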

ACKNOWLEDGEMENTS We thank J. Willert for advice and support during protein purification, G. Xu for experimental help and M. Noyer-Weidner for useful discussions and critical reading of the manuscript. Special thanks to V. Kruft for performance of the protein sequencing. J. Walter is supported by the Deutsche Forschungsgemeinschaft (SFB 344).

REFERENCES
1. Walter,J., Noyer-Weidner,M. and Trautner,T.A. (1990) EMBO J., 9, 1007-1013.
2. Wilson,G.G. (1988) TIG, 4, 314-318.
3. Kessler,C. and Manta,V. (1990) Gene, 92, 1-248.
4. Stephenson,F.H., Ballard,B.T., Boyer,H.W., Rosenberg,J.M. and Greene,P.J. (1989) Gene, 85, 1-13.
5. Lauster,R., Trautner,T.A. and Noyer-Weidner,M. (1989) J. Mol. Biol., 206, 305-312.
6. Posfai,J., Bhagwat,A.S., Posfai,G., Roberts,R.J. (1989) Nucl. Acids Res., 17, 2421-2435.
7. Klimasauskas,S., Timninskas,A., Menkevicius,S., Butkus,V. and Janulaitis,A. (1989) Nucl. Acids Res., 17, 9823-9832.
8. Lin,P.M., Lee,C.H. and Roberts,R.J. (1989) Nucl. Acids Res., 17, 3001-3011.
9. Wilke,K., Rauhut,E., Noyer-Weidner,M., Lauster,R., Pawlek,B., Behrens,B., Trautner,T.A. (1988) EMBO J., 7, 2601-2609.
10. Ikawa,S., Shibata,T., Ando,T. and Saito,H. (1980) Mol. Gen. Genet., 177, 359-368.
11. Zahler,S.R., Korman,R.Z., Rosenthal,R., Hemphill,H.E. (1977) J. Bacteriol., 123, 556-558.
12. Woodcock,D.M., Crowther,P.J., Doherty,J., DeCruz,E., Noyer-Weidner,M., Smith,S.S., Michael,M.Z. and Graham,M.W. (1989) Nucl. Acids Res., 17, 3469-3478.
13. Wood,W.B. (1966) J. Mol. Biol., 16, 118-133.
14. Messing,J. and Vieira,J. (1982) Gene, 19, 269-276.
15. Yanisch-Perron,C., Vieira,J. and Messing,J. (1985) Gene, 33, 103-119.
16. Biswal,N., Kleinschmidt,A.K., Spatz,H.Ch. and Trautner,T.A. (1967) Mol. Gen. Genet., 100, 39-55.
17. Luria,J. and Burrous,J.W. (1957) J. Bacteriol., 74, 461-476.
18. Rottländer,E. and Trautner,T.A. (1970) Mol. Gen. Genet., 108, 47-60.
19. Larsen,J.E.L., Gerdes,K., Light,J. and Molin,S. (1984) Gene, 28, 45-54.
20. Churchward,G., Belin,D. and Nagamine,Y. (1984) Gene, 31, 165-171.
21. Fürste,J.P., Pansegrau,W., Frank,R., Blöcker,H., Scholz,P., Bagdasarian,M. and Lanka,E. (1986) Gene, 48, 119-131.
22. Gryczan,T.J., Contente,S., Dubnau,D. (1978) J. Bacteriol., 134, 318-329.
23. Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
24. Trautner,T.A., Pawlek,B., Bron,S., Anagnostopoulos,C. (1974) Mol. Gen. Genet., 131, 181-191.
25. Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463-5467.
26. Bolivar,F., Rodriguez,R., Greene,P.J., Betlach,M.C., Heyneker,H.L., Boyer,H.W., Crosa,J.H. and Falkow,S. (1977) Gene, 2, 95-113.
27. Davies,R.W., Botstein,D. and Roth,J.R. (1980) in Advanced Bacterial Genetics. A Manual for Genetic Engineering. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
28. Bradford,M. (1976) Anal. Biochem., 72, 248-254.
29. Devereux,J., Haeberli,P. and Smithies,O. (1984) Nucl. Acids Res., 12, 387-395.
30. Smith,H.O., Annau,T.M. and Chandrasegaran,S. (1990) Proc. Natl. Acad. Sci. USA, 87, 826-830.
31. Niaudet,B., Goze,A., Ehrlich,S.D. (1982) Gene, 19, 277-284.
32. Vosman,B., Kooistra,J., Olijve,J. and Venema,G. (1986) Mol. Gen. Genet., 204, 524-531.
33. Moran Jr.,C.P., Lang,N., LeGrice,S.F.J., Lee,G., Stephens,M., Sonenshein,A.L., Pero,J. and Losick,R. (1982) Mol. Gen. Genet., 186, 339-346.
34. Wilson,G.G. (1991) Nucl. Acids Res., 19, 2539-2566.
35. Kiss,A., Posfai,G., Keller,C.C. and Roberts,R.J. (1985) Nucl. Acids Res., 13, 6403-6421.
36. Bron,S., Murray,K. and Trautner,T.A. (1975) Mol. Gen. Genet., 143, 13-18.
37. Szilak,L., Venetianer,P. and Kiss,A. (1990) Nucl. Acids Res., 18, 4659-4664.
38. Brown,N.L. and Smith,M. (1980) Methods Enzymol., 65, 391-404.
39. Walder,R.Y., Langtimm,C.J., Chatterjee,R. and Walder,J.A. (1983) J. Biol. Chem., 258, 1235-1241.
40. Jentsch,S. (1983) J. Bacteriol., 156, 800-808.
41. Bocklage,H., Heeger,K. and Müller-Hill,B. (1991) Nucl. Acids Res., 19, 1007-1013.
42. Howard,K.A., Card,C., Benner,J.S., Callahan,H.L., Maunus,R., Silber,K., Wilson,G.G. and Brooks,J.E. (1986) Nucl. Acids Res., 14, 7939-7951.
43. Lauster,R. (1989) J. Mol. Biol., 206, 313-321.
44. Chandrasegaran,S. and Smith,H.O. (1988) in Structure and Expression: From Proteins to Ribosomes (Sarma,R.H. & Sarma,M.H., Eds) Adenine Press, NY., Vol. 1, pp 149-156.