Virus Genes 24:2, 163–171, 2002. © 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

Evolution of Lambdoid Replication Modules

BORYS WRÓBEL 1* & GRZEGORZ WĘGRZYN 2
1 Institute of Oceanology, Polish Academy of Sciences, Sw. Wojciecha 5, 81-847 Gdynia, Poland
2 Department of Molecular Biology, University of Gdansk, Kladki 24, 80-822 Gdansk, Poland
Received February 18, 2001; Accepted December 3, 2001

Abstract. Comparison of the putative iteron-binding proteins of lambdoid phages allows us to propose that in the case of lambdoid replication modules, the units on which natural selection acts do not coincide with the open reading frames. Rather, the first replication gene is split into two segments, and its 3′ part (corresponding to the C-terminal domain of the iteron-binding protein) forms one unit with the second gene. We also propose, from the phylogenetic analysis of phage-encoded homologs of E. coli DnaB and DnaC, that recombination with the host sequences is not frequent. Accessory ATPases for helicase loading (E. coli DnaC homologs) may not be universal replication proteins. Our analysis suggests that the bacterial helicase loaders might be of phage origin. The comparison of DnaC homologs of enterobacteria and enterobacterial phages supports the experimental data on residues important in interaction with DnaB. We propose that construction of plasmids carrying the replication origins of lambdoid prophages could be useful not only in further research on DNA replication but also in research on the role of these prophages in shuttling genes for bacterial virulence. The phage replication sequences could also be useful for identification of clinical enterobacterial isolates.

Key words: bacterial virulence, DnaB homologs, DnaC homologs, DNA replication, helicase loading, modular genome organization

*Author for all correspondence: E-mail: [email protected]

Introduction

Lambdoid phages share a similar temperate life cycle. In lysogenic bacteria the phage genome is integrated in the host chromosome, and this prophage confers immunity to superinfection by the same phage through the production of a repressor. For any pair of lambdoid genomes, there are regions with substantial similarity interspersed with regions that apparently show none [1,2]. However, the arrangement of both the major gene modules and the regulatory signals within them is preserved. It seems that selection for preservation of regulatory mechanisms conserves the genome architecture. This view is supported by the existence of phages with no sequence similarity and a different host group which nonetheless have a similar genomic structure, such as mycobacteriophage L5 [3]. One of the definitions of lambdoid phages is that they can recombine with each other to form viable hybrids. Indeed, the ability to form viable recombinants may be an additional selective advantage of preserving the general genome organization [4]. It seems that these genomes are made of clusters of genes which are interchangeable, and which may be nonhomologous but still have the same function (such as integration/excision, recombination, immunity, replication, lysis, formation of the capsid tail and head). Thus, the lambdoid genome seems to be mosaic in nature, with fragments derived from separate sources: the host chromosomes, other phages, and also plasmids and transposons present in the hosts. Recombination may occur in coinfected cells, in infected lysogens, and in infected defective lysogens. All three events are probably quite frequent,


considering the fact that the number of phage particles seems to be at least one order of magnitude higher than the number of cells in the environment [5]. In addition, natural enterobacteria are often lysogenic for several lambdoid prophages [6]. Some prophage genes might confer a selective advantage on the host, but most are not expressed as long as the repressor is present. When they are expressed and induction occurs, this is selectively disadvantageous to the host. The prophage DNA is, therefore, slowly eliminated through random drift and partial deletion. The defective lambdoid prophages of the K12 strain of Escherichia coli (e14, Rac, DLP12 (qsr′) and Qin (Kim)) can be viewed as intermediates in this elimination process. All of them preserve the genome architecture characteristic of the lambdoid group. Only one of them, Rac, possesses a functional replication origin (oriJ). This origin, like the origin of phage λ (oriλ), allows for autonomous replication when present on a plasmid [7,8]. Bacteriophage λ, the type species of the lambdoid group, requires two phage gene products, O and P, for replication (for review, see [9]). Both genes are under control of the pR promoter (Fig. 1). The activity of this promoter is also necessary for transcriptional activation of the origin (oriλ), which lies within the O gene. The O protein specifically binds to iterons within the origin. The DNA-binding domain is located in the amino-terminal portion of O [10]. The C-terminal part of O seems to bind to P, the second replication protein. P in turn interacts with the host helicase, DnaB. Replication modules of other lambdoid phages also contain two genes. One of these genes probably

encodes an initiator protein, possibly interacting, as was shown for O, both with the origin and with the second replication protein. This second protein can either be a helicase itself, or may interact with the host helicase DnaB to deliver it to the origin. So far, then, three families of second replication proteins have emerged: homologs of host DnaB, homologs of host DnaC (which delivers DnaB to the bacterial origin), and homologs of P, a protein which interacts with DnaB but shows no significant similarity to DnaC [11]. In this work we aimed to compare several lambdoid replication modules, including modules of prophages present in the newly sequenced bacterial genomes, and to present ideas as to how these modules evolve.

Materials and Methods

Smith and Waterman [12] searches were done with default values (blosum62 matrix, gap penalty 11, and 1 for extension) against the NonRedundant database (dated 4 November, 2001) at the Department of Molecular Biology of the University of Oslo site (http://dna.uio.no/search/). Multiple alignments were done using the ClustalX program [13] with default values. The same program was used to create neighbour-joining trees with bootstrap. The ClustalX alignment was corrected by hand, the penalty for deletions was arbitrarily changed to equal two amino acid (aa) substitutions, and the maximum-likelihood analysis was done using the PUZZLE 4.0.2 program [14] with the blosum62 matrix and otherwise default settings.

Results

Search for the Lambdoid Replication Modules

Fig. 1. Genes and promoters neighbouring the origin of replication of bacteriophage λ and the putative replication origin (oriJ) of Rac.

Searches of the NonRedundant database with the replication sequences of known lambdoid phages and prophages (such as Rac) yielded a multitude of results. The hits with the lowest expectancy values (E values) usually came from replication modules of phages (which can be assumed to be functional and related) and prophages. In our analysis, we considered the modules of prophages only when they showed a characteristic organization, similar to that in phage λ (see Fig. 1). As an example, Fig. 1 shows the organization of genes near the putative replication


origin of prophage Rac of E. coli K12 [15,16]. The modules which did not show a similar organization were judged defective and were not considered in the analysis (for instance, those of E. coli O157:H7 prophages CP-933H, M and X) [17].

Phylogenetic Analysis of Phage Homologs of DnaC and DnaB

To analyse the relationships between the DnaC homologs of lambdoid phages, we have made a


ClustalX multiple alignment of a number of helicase loaders of phages and bacteria, including the homologs of Bacillus subtilis DnaI protein. High similarity of DnaI to the r1t accessory replication protein supports the view that this protein is a helicase loader [18]. The proteins analysed are listed in the Fig. 2 legend. All of them share the conserved A and B parts of the NTP-binding motif (Walker A and B boxes) [19,20]. The sequences from Gram-negative bacteria and their prophages, but not the sequences taken from Gram-positive bacteria and their phages, share

Fig. 2. Maximum-likelihood trees for phage and bacterial homologs of E. coli helicase loader DnaC (panel A) and helicase DnaB (panel B). The corrected ClustalX alignment was used as input for the PUZZLE program. The numbers at the nodes show support for the internal branches in percent. Sequences used (name: Accession Number): (A) 7201 (a Streptococcus thermophilus phage): AAF26604, bIL309 (a Lactococcus lactis phage): AAK08363, Bacillus subtilis: CAB14858, Buchnera sp. APS: BAB12748, CP-933R: AAG56437, E. coli: AAA23700, Gifsy-1: AAC26072, Lactococcus lactis subsp. lactis: AAK04850, phiETA (a Staphylococcus aureus phage): BAA97609, phiPV83 (a Staphylococcus aureus prophage): BAA97828, r1t (a Lactococcus lactis phage): AAB18687, S. typhi: NP_458958, Staphylococcus aureus subsp. aureus N315: BAB42774, Streptococcus pneumoniae TIGR4: AAK75789, Streptococcus pyogenes M1 GAS: AAK33392, Rac: AAC74442; (B) Bacillus subtilis: P37469, Buchnera sp.: P57611, D3: NP_061570, E. coli: P03005, HK022: NP_037693, HK97: NP_037740, Lactococcus lactis: AAK04844, P1: CAA09719, P22: NP_059611, Pseudomonas aeruginosa PAO1: E83029, Pseudomonas putida: AAK00231, S. typhimurium: P10338, Staphylococcus aureus: BAB21112, SPP1 (a Bacillus subtilis phage): CAA66496, Streptococcus pneumoniae TIGR4: AAK76254, Streptococcus pyogenes M1 GAS: AAK34812, Vibrio cholerae: AAF93544, X. fastidiosa 9a5c: H82814.


also a conserved cysteine corresponding to Cys69 in E. coli. Neither Cys78 nor Cys120 of E. coli DnaC was conserved, strengthening the conclusion of Nakayama et al. [21] that Cys69 is the cysteine involved in interaction with DnaB. Two other amino acids (aa) in E. coli DnaC, Phe23 and Trp32, whose substitution might abolish binding to DnaB, as recently suggested by Ludlam et al. [20], are absolutely conserved between enterobacteria and their phages. Other residues whose substitution is expected to abolish DnaB binding [20] are not absolutely conserved. For instance, in the alignment we have obtained, E. coli Leu11 and Ser41 align with Ile and Cys, respectively, in the Gifsy-1 and Rac sequences, and Leu29 aligns with Trp in the phage sequences. Figure 2A shows the maximum-likelihood tree of phage and bacterial helicase loaders, and Fig. 2B a tree for phage and bacterial homologs of the E. coli DnaB helicase. The maximum-likelihood trees presented in Fig. 2 both have a topology very similar to that of the neighbour-joining trees produced by ClustalX (not shown).

Analysis of Lambdoid Major Replication Proteins

Figure 3 shows some of the results of Smith–Waterman searches with major lambdoid replication proteins as queries, with the expectancy (E) values for their alignments. In general, the analysis is complicated by the presence of many newly sequenced enterobacterial and phage genomes in the database, many of which contain prophages with replication modules identical or almost identical to those present in other genomes. To simplify Fig. 3 and the discussion of the results, out of each cluster of sequences over 90% identical, only one is shown and discussed. For instance, only the λ sequence (but not those of 933W, H19-B, CP-933V, VT1-Sakai), only the HK022 sequence (and not HK620 or VT2-Sakai), or the Gifsy-1 sequence (and not Gifsy-2 and the almost identical sequence in a S. typhi CT18 prophage) is shown. After examining the pairwise alignments (Fig. 3), it can be noticed that, for the most part, significant similarities can be found in either the N or the C part of the sequences. Similarities in the C parts of the iteron-binding proteins of phages of Gram-negative bacteria analysed here (Fig. 3) reflect the way the helicase activity is brought to the origin. For instance, the Gifsy-1 iteron-binding protein shows similarity in its

C-terminal part to proteins of Rac and CP-933R, both of which are coded by genes followed by a dnaC-like gene. Also, the similarities between major proteins interacting with a DnaB homolog (HK022, P22, HK97) are most significant at the C-termini. While D3 also uses a DnaB homolog as a second replication protein, this protein is not very similar to the DnaB homologs of the enterobacterial group, as can be seen in Fig. 2B. This is probably why no known protein is similar to the C-terminal part of the D3 protein. It was noted before for the major replication protein of phage λ that the N-terminal domain probably interacts with DNA [10]. High similarity of the Gifsy-1, λ, φ80, HK97, P22, and D3 (Fig. 3) proteins in this region indicates that other proteins of this group also bind to DNA with their N-terminal domain. This is also indicated by the apparent similarity of the φ80 protein to SirR, a Mycobacterium tuberculosis iron transcription repressor, which might contain a helix-turn-helix structure within the region of similarity according to the prediction tool at the Pole BioInformatique Lyonnais site (http://npsa-pbil.ibcp.fr/) [22]. The D3 sequence also shows small similarities to transcriptional regulators in its N-terminal part (not shown). No other group of similarities in the N-terminal part emerges. The iteron-binding proteins of phage HK022 and CP-933R show high similarity in this putative DNA-binding domain to different replication proteins of phages of Gram-positive bacteria. The N-terminal part of YdaU, the hypothetical Rac major replication protein, shows similarity to the N-terminal parts of two hypothetical products of Xylella fastidiosa open reading frames, XF1560 (301 aa) and XF1665 (362 aa). There is also XF1664, a frame upstream of XF1665, coding for 108 aa with similarity to the N-terminal part of YdaU. Only the region of similarity to XF1560 is shown in Fig. 3 for simplicity. Both XF1560 and XF1665 are followed by stretches of sequence which are probably noncoding. The three genes do not appear to lie in a region showing the organization of a lambdoid replication/immunity module. Still, they lie very near regions in the genome identified as prophage remnants, XfP3 (XF1560) and XfP4 [23]. Thus, these reading frames might be remnants of genes for iteron-binding proteins. When the Gifsy-1 orf8 DNA sequence (coding for the putative iteron-binding protein) is aligned using the Smith–Waterman algorithm with related genes, there


[Fig. 3 graphic: one panel per query sequence (CP-933P, CP-933R, D3, Gifsy-1, φ80, HK022 and Rac), each listing the similar proteins and the E values of the pairwise alignments. Line shading legend, percentage of similar residues in the pairwise alignment: above 85%, 65%–84%, 45%–64%, below 45%.]

Fig. 3. The regions of best local similarity of several major lambdoid replication proteins to related phage and bacterial proteins. The CP-933P (Accession Number AAK16983), CP-933R (AAG56438), D3 (NP_061569), Gifsy-1 (T03010), HK022 (NP_037692), φ80 (P14815), and Rac (AAC74441) sequences were used as queries for Smith–Waterman database searches at the Department of Molecular Biology of the University of Oslo site (http://dna.uio.no/search/) against the NonRedundant database. In each panel, the top line symbolizes the query sequence, while the lines underneath show the regions of Smith–Waterman local similarity of that sequence to other major replication proteins of lambdoid phages or prophages listed to the right. The regions of local similarity of certain queries to several bacterial DNA-binding proteins are also shown. The expectancy (E) value is listed for each pairwise alignment. The E value corresponds to the number of different alignments with equal or higher scores that are expected to occur by chance in the database search (the more significant the score, the lower the E value). As indicated, the higher the frequency of similar (`positive') residues in the pairwise alignments, the darker the lines are.
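The Smith–Waterman local similarity referred to in this caption can be illustrated with a minimal sketch. It is not the search program used at the Oslo site: it assumes a simple match/mismatch score in place of the blosum62 matrix, assumes that a gap of length k costs gap_open + k × gap_extend (one common reading of "gap penalty 11, and 1 for extension"), and returns only the best local score rather than the alignment or an E value. The function name and parameters are hypothetical.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap_open=11, gap_extend=1):
    """Best local alignment score of strings a and b with affine gap costs.

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    Assumption: match/mismatch scores stand in for a substitution matrix.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # H[i][j]: best score of a local alignment ending at a[i-1], b[j-1].
    # E[i][j]: best score ending with a gap in a (b[j-1] is unpaired).
    # F[i][j]: best score ending with a gap in b (a[i-1] is unpaired).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_extend,
                          H[i][j - 1] - gap_open - gap_extend)
            F[i][j] = max(F[i - 1][j] - gap_extend,
                          H[i - 1][j] - gap_open - gap_extend)
            # Local alignment: the score never drops below zero.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

With these placeholder scores, two sequences that differ by a single deleted residue still align across the gap when the flanking matches outweigh the gap cost, which is the behaviour that lets a local search recover similarity confined to one part of a protein.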


is a short overlap of similarity: the local alignment of orf8 to ydaU starts at about nt 600, while the alignment of orf8 to sequences closely related to λ ends at the 3′ end at about nt 640. The boundaries of this short overlap (nt 600–640 in orf8) might approximate the boundaries of a region in which recombination leads to functional hybrids. They also approximate the region of similarity overlap at the protein level of the Rac and λ sequences to Gifsy-1 (see Fig. 3), corresponding to aa 209–217 in the Gifsy-1 major replication protein. Although there are only two identical residues between Rac and λ in this 8-aa part of the alignment, short similarities around this region might explain why the polyclonal anti-O serum recognizes YdaU in immunoblot experiments [15] although there is no discernible similarity between the two proteins overall. Interestingly, there appears to be an internal repeat at the protein level within the major replication protein sequence of Gifsy-1: there is a similarity between residues 144–182 and 179–212 (E = 0.064). This region might be the site of the recombination events which gave rise to the Gifsy-1 orf8. Furthermore, it might explain why the Gifsy-1 major replication protein is somewhat larger (325 aa) than the other similar proteins (most are less than 300 aa). This analysis allows us to extend to all the major lambdoid replication proteins the conclusions of previous observations [10,24], in which some of these proteins have been shown to bind DNA with their N-terminal part and to interact with their C-terminal part with the second replication protein, which directly or indirectly provides the helicase activity to the origin. The only exception is CP-933P. This module codes for a DnaC homolog, but the C-terminal part of the major protein shows high similarity to iteron-binding proteins of the λ group (Fig. 3). Only at the very end is there similarity to the DnaT protein. It would be very interesting to find out if this module is functional in replication. If it is, then indeed this very short N-terminal fragment might be responsible for interaction with the DnaC-like protein. It is possible, however, that helper functions are necessary for replication from the CP-933P origin, and still more possible that this module is a completely nonfunctional product of recombination. It may be noted that the highest scoring similarities of the putative CP-933P major replication protein are to products of the E. coli K12 open reading frames yfdO and yfmN, which are both probably nonfunctional remnants of recombination of phage sequences.
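The redundancy reduction described in this section (showing one sequence from each cluster of sequences over 90% identical) can be sketched as a greedy filter. This is an illustration only, not the procedure the authors report using: it assumes the sequences are already aligned to equal length with '-' for gaps, computes identity over columns where neither sequence has a gap, and keeps the first sequence seen from each cluster. The function names and the toy sequence names are hypothetical.

```python
def percent_identity(x, y):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not pairs:
        return 0.0
    same = sum(1 for p, q in pairs if p == q)
    return 100.0 * same / len(pairs)


def pick_representatives(aligned, threshold=90.0):
    """Greedily keep names whose sequence is not over `threshold` percent
    identical to any representative kept so far (input order decides ties)."""
    reps = []
    for name, seq in aligned.items():
        if all(percent_identity(seq, aligned[r]) <= threshold for r in reps):
            reps.append(name)
    return reps


# Toy data (hypothetical): seq_b differs from seq_a at one of 20 positions.
toy = {
    "seq_a": "ACDEFGHIKLMNPQRSTVWY",
    "seq_b": "ACDEFGHIKLMNPQRSTVWA",
    "seq_c": "YWVTSRQPNMLKIHGFEDCA",
}
representatives = pick_representatives(toy)
```

Because the filter is greedy, the choice of representative depends on input order; listing the best-characterized sequence first keeps it as the one shown.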

Discussion

The phylogenetic analysis of DnaC- and DnaB-like proteins (Fig. 2) shows that the helicase loader proteins of enterobacteria and their phages cluster separately. The helicase proteins of lambdoid phages of Gram-negative bacteria also cluster separately from the host proteins. This suggests that recombination of lambdoid replication modules with the host sequences is not frequent. Such separate clustering of phage and bacterial sequences was also observed for the ClpP-like proteins of Lactobacillus phages [2,25]. Interestingly, the P1 helicase seems to be more closely related to the host proteins than the helicases of lambdoid phages. It is also interesting that the helicase loaders of Gram-positive and Gram-negative bacteria cluster separately, with proteins from phages of Gram-positive bacteria in between. It has been noted before [26], and we have noted during our searches, that DnaC homologs are not readily found in bacteria, and perhaps some bacteria indeed do not utilize accessory ATPases for helicase loading [26]. If so, then the genes for helicase loading proteins in genomes that do carry them might originally be of phage origin. In bacteriophage λ replication, the P protein, which binds the E. coli DnaB helicase more strongly than the bacterial DnaC loader does, redirects the helicase to the phage origin of replication [27,28]. If bacteria originally did not possess a helicase loader, a gene for such a protein might have been recruited by their genomes from a prophage after erroneous activation during the process of elimination of phage sequences. It might then allow for competition for helicase loading with phage origins. It may be noted that in both Gram-negative and Gram-positive bacterial genomes which carry a gene for a helicase loader, this gene follows a gene for a DNA-binding protein involved in DNA replication (dnaT in Gram-negative bacteria, and dnaB in Gram-positive bacteria; this gene is not related to the E. coli dnaB). Such an arrangement is similar to that present in the lambdoid replication modules. If the hypothesis of the phage origin of helicase loaders were true, it would put the time of dnaC gene acquisition in enterobacteria before the divergence of the E. coli and Buchnera lines, that is, more than 100 million years ago [29]. High similarity of the Rac, Gifsy-1 and CP-933R sequences to dnaT at the 3′ end (Fig. 3) may indicate that perhaps this region is involved in


interaction with DnaC or its homologs, or else is a remnant of a recombination event between the host and phage sequences. The phylogenetic analysis of the DnaC homologs strengthens the first possibility. We have noted that some residues in E. coli DnaC whose substitution abolishes binding to DnaB [20] are not absolutely conserved between enterobacteria and their phages. In fact, it may be suggested that the substitutions in phage sequences might be responsible for more efficient binding, if `molecular thievery' similar to that occurring in the initiation of λ replication also occurs in the case of these phages. Further research should show if the phage DnaC homologs form a more stable complex with the host helicase than the host DnaC does, as was shown for the P–DnaB complex [27,28]. The proximity of genes involved in pathogenicity to phage genes and attachment sites in S. typhimurium has been previously observed ([30–33] and references therein). This suggests that phages are involved in the horizontal spreading of virulence genes in enterobacteria. Two recently described [30,31] lambdoid prophages in S. typhimurium LT2, at 57 Cs (Gifsy-1) and 24 Cs (Gifsy-2), have been shown to contribute to virulence ([32,33], and references therein). Salmonella strains seem to carry many such prophages [33], at least some of which appear not to be defective: they are inducible, and form small plaques on susceptible strains [32,33]. The Rac prophage of E. coli K12 seems to be defective due to the lack of capsid genes, but it is competent in induction, excision and replication [7,8,34]. Our analysis of sequences neighbouring replication modules similar to CP-933R and Gifsy-1 in the S. typhi genome has shown that both prophages contain sequences similar to the recE/recT-like system. A probe containing a similar region from Gifsy-1 was used when identifying Gifsy elements in virulent S. typhimurium strains [30,31]. Considering the modular structure of lambdoid phages, however, it is possible that these strains carry other elements which merely share very similar modules. Conversely, Bakshi et al. [35] have recently shown that the sopE2 gene, involved in virulence and apparently carried in S. typhi near a prophage replication module almost identical to that of E. coli O157:H7 prophage CP-933R [36], is also present in other S. enterica strains. Perhaps these strains are lysogenic for similar prophage(s).


The presence of a regulation/replication module in a given prophage might suggest that it has not resided in the host genome for long, which may be important considering the hypothesis that more virulent bacterial strains could emerge through recent acquisition of virulence factors [37]. Indeed, Rac seems to be the only lambdoid prophage in the E. coli K12 (λ−) genome which is to some extent autonomous, and we might speculate that this suggests its more recent integration. Since the phage replication proteins are fairly large, their sequences seem to be a good choice in searches for possibly still autonomous, and thus likely quite recently acquired, prophages. Testing for phage replication sequences might be useful for identifying strains of virulent bacteria. Indeed, one such phage sequence, in which similarity to the Rac replication module can be found, and which is suitable for identification of S. typhimurium phage types DT104 and U302, has recently been identified by chance [38]. Testing whether such prophages contribute to virulence could then be possible by inducing excision in appropriate strains through the introduction of a plasmid carrying their regulation/replication module with the origin, as is the case with Rac [8]. Construction of such plasmids would also be useful for further research on DNA replication, as discussed above for CP-933P. We have previously chosen the Rac oriJ as a model to propose that the phenomenon of the inheritance of the replication complex is general amongst the lambdoid group [15]. In this previous work we have shown that not only in λ, but also in Rac replication, the complex built from the replication proteins remains bound to one of the daughter DNA copies after each replication round. Such inheritance allows for replication of one copy even in conditions when new replication proteins cannot be synthesized [15,39,40]. The results presented here do not show any significant similarity between the λ and Rac replication proteins. It appears that the N-terminal part of YdaU is similar to the N-terminal parts of putative proteins of prophage remnants of X. fastidiosa, while its C-terminal part, and the gene downstream, show homology to their counterparts in modules whose iteron-binding proteins interact with a DnaC homolog. However, it is still possible that YdaU and O have common epitopes, allowing for recognition by some of the polyclonal antibodies in the immunoblot experiments [15] despite the lack of significant similarity overall. A small local similarity may be present between the two proteins, perhaps in the


hypothetical linker region (stemming from a recombination remnant), or else a similar fold of the two proteins leads to the creation of a shared epitope. However, the generally low similarity of YdaU to the O protein shown here supports the conclusion of the previous report: if two proteins which are not highly similar within the protein group nonetheless both apparently form a heritable replication complex, it is likely that the other proteins within the family do so as well. Our analysis of lambdoid replication proteins seems to indicate that in the evolution of the lambdoid replication modules the 3′-terminal part of the first gene and the second gene are treated as one unit by natural selection, while the 5′-terminal part of the first gene and its iteron-containing region are selected for independently from the 3′ part. The linker between the functional domains of the major replication protein might be coded by the region of the gene which harbours the iterons, as was proposed previously by Moore et al. [10] for λ. In the same work, laboratory-derived intergenomic hybrids within the O gene between λ and φ80 and φ82 have been described. The selective pressure on the two domains is likely to be different. While the C-terminal domains need to be conserved enough to allow drawing of the host replication machinery to the phage origin, the pressure on the N-terminal region leads to different specificities of binding to the iterons, which are present in the DNA fragment which perhaps in turn codes for the linker. It is perhaps noteworthy that the arrangement of genes coding for the tail proteins of λ also resembles the order in which they interact. One may wonder if, at least in some cases where the parts of the proteins that interact are coded by DNA regions which lie close to each other, two such parts could behave as units of natural selection whose boundaries do not correspond to the boundaries of open reading frames.

Acknowledgements

This work was supported in part by the Polish State Committee for Scientific Research (projects no. 6 P04A 016 16 and 127/E-335/SPUB-M-5PR-UE/DZ 177/2000) and the European Commission (grant no. QLK3-CT-1999-00533). GW acknowledges the financial support from the Foundation for Polish Science (subsidy 14/2000). We are also grateful to Katarzyna Potrykus for critically reading the manuscript.

References

1. Hendrix R.W., Smith M.C., Burns R.N., Ford M.E., and Hatfull G.F., Proc Natl Acad Sci USA 96, 2192–2197, 1999.
2. Brussow H. and Desiere F., Mol Microbiol 39, 213–222, 2001.
3. Hatfull G.F. and Sarkis G.J., Mol Microbiol 7, 395–405, 1993.
4. Campbell A., Annu Rev Microbiol 48, 193–222, 1994.
5. Bergh O., Borsheim Y., Bratbak G., and Heldal M., Nature 340, 467–468, 1989.
6. Anilionis A. and Riley M., J Bacteriol 143, 355–365, 1980.
7. Diaz R. and Pritchard R.H., Nature 275, 561–564, 1978.
8. Diaz R., Barnsley P., and Pritchard R.H., Mol Gen Genet 175, 151–157, 1979.
9. Taylor K. and Węgrzyn G., FEMS Microbiol Rev 17, 109–119, 1995.
10. Moore D.D., Denniston K.J., and Blattner F.R., Gene 14, 91–101, 1981.
11. Odegrip R., Shoen S., Haggard-Ljunngquist E., Park K., and Chattoraj D.K., J Virol 74, 4057–4063, 2000.
12. Smith T.F. and Waterman M.S., J Mol Biol 147, 195–197, 1981.
13. Thompson J.D., Higgins D.G., and Gibson T.J., Nucleic Acids Res 22, 4673–4680, 1994.
14. Strimmer K. and von Haeseler A., Mol Biol Evol 13, 964–969, 1996.
15. Potrykus K., Wróbel B., Węgrzyn A., and Węgrzyn G., Plasmid 44, 111–126, 2000.
16. Blattner F.R., Plunkett G. III, Bloch C.A., Perna N.T., Burland V., Riley M., Collado-Vides J., Glasner J.D., Rode C.K., Mayhew G.F., Gregor J., Davis N.W., Kirkpatrick H.A., Goeden M.A., Rose D.J., Mau B., and Shao Y., Science 277, 1453–1474, 1997.
17. Perna N.T., Plunkett G. III, Burland V., Mau B., Glasner J.D., Rose D.J., Mayhew G.F., Evans P.S., Gregor J., Kirkpatrick H.A., Posfai G., Hackett J., Klink S., Boutin A., Shao Y., Miller L., Grotbeck E.J., Davis N.W., Lim A., Dimalanta E., Potamousis K., Apodaca J., Anantharaman T.S., Lin J., Yen G., Schwartz D.C., Welch R.A., and Blattner F.R., Nature 409, 529–533, 2001.
18. Imai Y., Ogosawara N., Ishigo-oka D., Kadoya R., Daito T., and Moriya S., Mol Microbiol 36, 1037–1048, 2001.
19. Koonin E.V., Nucleic Acids Res 20, 1997, 1992.
20. Ludlam A.V., McNatt M.W., Carr K.M., and Kaguni J.M., J Biol Chem 276, 27345–27353.
21. Nakayama N., Bond M.W., Miyajima A., Kobori J., and Arai K., J Biol Chem 262, 10475–10480, 1987.
22. Dodd I.B. and Egan J.B., Nucleic Acids Res 18, 5019–5026, 1990.
23. Simpson A.J.G. et al., Nature 406, 151–157, 2000.
24. Backhaus H. and Petri J.B., Gene 32, 289–303, 1984.
25. Desiere F., Pridmore R.D., and Brussow H., Virology 275, 294–305, 2000.
26. Caspi R., Pacek M., Consiglieri G., Helinski D.R., Toukdarian A., and Konieczny I., EMBO J 20, 3262–3271, 2001.
27. Mallory J.B., Alfano C., and McMacken R., J Biol Chem 265, 13297–13307, 1990.
28. Konieczny I. and Marszalek J., J Biol Chem 270, 9792–9799, 1995.
29. Moran N.A., Munson M.A., Baumann P., and Ishikawa H., Proc R Soc Lond Biol Sci 253, 167–171.
30. Figueroa-Bossi N.F., Coissac E., Netter P., and Bossi L., Mol Microbiol 25, 161–173, 1997.
31. Figueroa-Bossi N.F. and Bossi L., Mol Microbiol 28, 1040–1041, 1998.
32. Figueroa-Bossi N.F. and Bossi L., Mol Microbiol 33, 167–176, 1999.
33. Figueroa-Bossi N.F., Uzzau S., Maloriol D., and Bossi L., Mol Microbiol 39, 260–271, 2001.
34. Diaz R. and Kaiser K., Mol Gen Genet 183, 483–489, 1981.
35. Bakshi C.S., Singh V.P., Wood M.W., Jones P.W., Wallis T.S., and Galyov E.E., J Bacteriol 182, 2341–2344, 2000.
36. Parkhill J., Dougan G., James K.D., Thomson N.R., Pickard D., Wain J., Churcher C., Mungall K.L., Bentley S.D., Holden M.T.G., Sebaihia M., Baker S., Basham D., Brooks K., Chillingworth T., Connerton P., Cronin A., Davis P., Davies R.M., Dowd L., White N., Farrar J., Feltwell T., Hamlin N., Haque A., Hien T.T., Holroyd S., Jagels K., Krogh A., Larsen T.S., Leather S., Moule S., O'Gaora P., Parry C., Quail M., Rutherford K., Simmonds M., Skelton J., Stevens K., Whitehead S., and Barrell B.G., Nature 413, 848–852, 2001.
37. Miao E.A. and Miller S.I., Proc Natl Acad Sci USA 96, 9452–9454, 1999.
38. Pritchett L.C., Konkel M.E., Gay J.M., and Besser T.E., J Clin Microbiol 38, 3484–3488, 2000.
39. Węgrzyn A., Węgrzyn G., Herman A., and Taylor K., Genes Cells 1, 953–963, 1996.
40. Węgrzyn G. and Taylor K., J Mol Biol 226, 681–688, 1992.