
JOURNAL OF VIROLOGY, Dec. 2001, p. 11811–11820
0022-538X/01/$04.00+0 DOI: 10.1128/JVI.75.23.11811–11820.2001
Copyright © 2001, American Society for Microbiology. All Rights Reserved.

Vol. 75, No. 23

Complete Genome Sequence of the Shrimp White Spot Bacilliform Virus

FENG YANG, JUN HE, XIONGHUI LIN, QIN LI, DENG PAN, XIAOBO ZHANG, AND XUN XU*

The Third Institute of Oceanography, Xiamen 361005, People’s Republic of China

Received 11 June 2001/Accepted 1 August 2001

We report the first complete genome sequence of a marine invertebrate virus. White spot bacilliform virus (WSBV; or white spot syndrome virus) is a major shrimp pathogen with a high mortality rate and a wide host range. Its double-stranded circular DNA genome of 305,107 bp contains 181 open reading frames (ORFs). Nine homologous regions containing 47 repeated minifragments that include direct repeats, atypical inverted repeat sequences, and imperfect palindromes were identified. This is the largest animal virus genome that has been completely sequenced. Although WSBV is morphologically similar to insect baculovirus, the two viruses are not detectably related at the amino acid level. Rather, some WSBV genes are more homologous to eukaryotic genes than to viral genes. In fact, sequence analysis indicates that WSBV differs from all known viruses, although a few genes display weak homology to herpesvirus genes. Most of the ORFs encode proteins that bear no homology to any known proteins, suggesting either that WSBV represents a novel class of viruses or that there is a significant evolutionary distance between marine and terrestrial viruses. The most distinctive feature of WSBV is the presence of an intact collagen gene, a gene encoding an extracellular matrix protein of animal cells that has never been found in any virus. Determination of the genome of WSBV will facilitate a better understanding of the molecular mechanism underlying the pathogenesis of WSBV and will also provide useful information concerning the evolution and divergence of marine and terrestrial animal viruses at the molecular level.

White spot bacilliform virus (WSBV), or white spot syndrome virus (WSSV), is a major shrimp pathogen that is highly virulent in penaeid shrimp, the most important species used in aquaculture, and can also infect most species of crustacean (15, 32). Infection of penaeid shrimp by WSBV can result in mortality of up to 90 to 100% within 3 to 7 days (57).
A major outbreak of WSBV infection in 1993 resulted in a 70% reduction in shrimp production in China (14, 57) and has raised major concerns in aquaculture around the world. Prevention and inhibition of infection by this virus can be difficult, largely because WSBV can survive for a long time in the environment (2 years in a shrimp pond) and because the virus is poorly understood at the molecular level. WSBV was originally classified as an unassigned member of the Baculoviridae because of its rod-shaped, enveloped morphology (20). However, it was recently excluded from the baculovirus family and is temporarily unclassified due to the lack of molecular information (53). The virus is known generally as white spot syndrome virus (WSSV) (31), and a new genus name, Whispovirus, was proposed by Vlak et al. (48). Sequence analysis of individual genes and proteins later showed that most WSBV proteins bear poor sequence homology to baculovirus proteins but have repeated regions similar to those of some baculoviruses. To understand the molecular basis of viral replication and infection, we decided to sequence the whole genome of WSBV.

* Corresponding author. Mailing address: The Third Institute of Oceanography, Xiamen 361005, People’s Republic of China. Phone: 86-592-2195296. Fax: 86-592-2085376. E-mail: [email protected].

MATERIALS AND METHODS

Isolation and sequencing of WSBV genomic DNA. Intact WSBV genomic DNA was isolated from dead and moribund WSBV-infected Penaeus japonicus shrimp, which were collected from shrimp ponds in Tongan, Xiamen, east China, in October 1996 as previously described (56). A whole-genome random sequencing method (19) was used to obtain the complete genome sequence for WSBV. Genomic DNA was cloned by the shotgun method into SmaI-linearized pUC18 vector, amplified, and sequenced using ABI BigDye Terminator chemistry on ABI 377 and ABI 3700 capillary sequencers. Large DNA fragments of 5 to 10 kb were also obtained by partial digestion with Sau3AI and cloned into the pBluescript vectors (41). These large-insert clones were used to form a genome scaffold and to verify the orientation and integrity of the contigs formed from the shotgun library. A total of 5,770 sequences for sevenfold coverage were assembled using the InnerPeace software by Charles Lawrence based on the Phred, Phrap, and Consed programs originally developed at the University of Washington. The WSBV genome sequence was confirmed by comparison of the observed restriction fragments from seven restriction enzymes (BamHI, EcoRI, HindIII, KpnI, PstI, SalI, and XbaI) to those predicted from the sequence data and was also confirmed by the genome scaffold produced by sequence pairs from 1,495 large-insert clones, which covered 90% of the main genome. Gaps were closed by a combination of sequence-walking of shotgun and PCR large-fragment libraries.

DNA sequence analysis. Genome DNA composition, structure, repeats, restriction enzyme patterns, and translation were analyzed with the DNAMAN software (Lynnon BioSoft, Vaudreuil, Canada). Open reading frames (ORFs) were defined as stretches of more than 60 codons initiated by a methionine codon. For detection of potential protein-coding regions, the codon usage bias and positional base preference were evaluated by determining the codon frequency of known WSBV genes or cDNA cloned from the WSBV cDNA library. Homology searches were performed with the FASTA (38) and BLAST (3) programs.
Protein motifs were analyzed by using the PROSITE database, release 16 (25). Transmembrane domains and signal peptides were predicted with ANTHEPROT (23).

Preparation and screening of a WSBV cDNA library. Poly(A) mRNA was isolated from WSBV-infected shrimp tissue using the PolyATtract System 1000 kit (Promega). Double-stranded cDNAs were synthesized using the SUPERSCRIPT plasmid system for cDNA synthesis and plasmid cloning (GIBCO BRL). WSBV cDNA clones were selected by hybridization with the digoxigenin (DIG)-labeled WSBV genomic DNA probe (DIG labeling kit; Boehringer Mannheim) and sequenced. The transcription of some ORFs was also verified by PCR on a cDNA cocktail using ORF-specific primers.

Nucleotide sequence accession number. The complete WSBV sequence can be obtained from the GenBank database (accession no. AF332093).
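The ORF criterion given under "DNA sequence analysis" (more than 60 codons, initiated with a methionine codon) can be illustrated with a minimal sketch. This is not the DNAMAN procedure used in the study. It assumes the standard stop codons, takes the first ATG after a stop in each frame as the start, and treats the sequence as linear, so an ORF that spans base 1 of the circular genome would be missed. All function names are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int = 60) -> List[Tuple[int, int]]:
    """Find ATG..stop ORFs in the three frames of one strand.

    Returns 0-based half-open (start, end) spans, stop codon included.
    An ORF is kept only if it has MORE than min_codons codons before the stop.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first methionine codon in this stretch
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 > min_codons:
                    found.append((start, i + 3))
                start = None                   # begin searching for the next ORF
    return found


def find_orfs(seq: str, min_codons: int = 60) -> List[Tuple[str, int, int]]:
    """Scan both strands; coordinates are always given on the forward strand."""
    seq = seq.upper()
    n = len(seq)
    hits = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    hits += [("-", n - e, n - s)
             for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return hits


# Toy sequences (made up for illustration): 61 codons passes, 60 codons does not.
long_orf = "ATG" + "GCT" * 60 + "TAA"
short_orf = "ATG" + "GCT" * 59 + "TAA"
print(find_orfs(long_orf))
print(find_orfs(short_orf))
```

A real annotation pass would add the codon usage and positional base preference checks described above, which this sketch leaves out.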


YANG ET AL.

J. VIROL.

FIG. 1. Circular representation of the WSBV genome. Arrows, positions (outer ring) of the 181 ORFs (red and blue indicate the different directions of transcription); green rectangles, 9 hrs. B, sites of BamHI restriction enzymes (inner ring; their positions are in parentheses).

RESULTS AND DISCUSSION

General features of the WSBV genome. We previously developed a method that yields highly purified WSBV from infected shrimp tissue (56). A random shotgun method was employed to sequence the entire genome of WSBV; the sequence was subsequently confirmed by the genome scaffold formed by sequencing a large-fragment DNA library. The complete WSBV genome is a double-stranded circular DNA of 305,107 bp, similar to a previous estimate of 290 kb (56). Since the origin of replication was unknown, the start of the largest BamHI fragment was chosen to be base 1 (Fig. 1). Three percent of the WSBV genome is made up of nine homologous regions (hrs), while the remaining 97% of the sequences are unique (see description below). The genome has a total G+C content of 41%. A total of 531 putative ORFs were identified by sequence analysis, among which 181 ORFs are likely to encode functional proteins (Table 1). This corresponds to an average gene density of one gene per 1.7 kb. Thirty-six of the 181 ORFs annotated here either have been identified by screening and sequencing a WSBV cDNA library (Table 1) or have been reported previously to encode functional proteins (45, 46, 48, 49, 50). Transcription of another 52 ORFs was confirmed by reverse transcription-PCR (RT-PCR; see Materials and Methods) (Table 1). The relative positions of the ORFs and hrs in the genome are shown in Fig. 1. For 80% of the 181 putative ORFs there is a potential polyadenylation site (AATAAA) downstream of the ORF (Table 1).

WSBV ORFs encode gene products homologous to known proteins. Table 1 contains a list of the 181 predicted WSBV ORFs. Among the 181 ORFs, the proteins encoded by 18 ORFs show 40 to 68% identity to known proteins from other viruses or organisms or contain an identifiable functional domain. These proteins include enzymes involved in nucleic acid metabolism and DNA replication, a collagen-like protein, and three viral structural proteins (for details, see below). Thirty

VOL. 75, 2001

WSBV GENOME


TABLE 1. Listing of potentially expressed ORFs in WSBV

ORF        Product position (length [aa]^a)
WSV001     300501–445 (1,684)
WSV002     1118–495 (208)
WSV004     1511–1200 (104)
WSV006     2425–1541 (295)
WSV008     1749–2360 (204)
WSV009     2672–2388 (95)
WSV011     3051–6953 (1,301)
WSV013     3955–3716 (80)
WSV020     6604–6254 (117)
WSV021     7645–7046 (200)
WSV022     7250–7432 (61)
WSV023     8502–7645 (286)
WSV025     9248–8556 (231)
WSV026     13936–9332 (1,535)
WSV035     16983–14068 (972)
WSV037     17000–20839 (1,280)
WSV045     20784–23726 (981)
WSV047     21688–22047 (120)
WSV049     22759–22145 (205)
WSV051     23710–24297 (196)
WSV053     24906–24664 (81)
WSV055     25153–24965 (63)
WSV056     25878–25201 (226)
WSV059     26631–27254 (208)
WSV063     29077–28334 (248)
WSV064     30861–29080 (594)
WSV067     31092–31958 (289)
WSV069     32125–32796 (224)
WSV073     32948–34213 (422)
WSV076     34218–35045 (276)
WSV077     35074–35964 (297)
WSV078     37245–36052 (398)
WSV079     38917–37385 (511)
WSV083     40718–38976 (581)
WSV091     42054–45488 (1,145)
WSV097     45175–45471 (99)
WSV100     45951–47822 (624)
WSV107     48635–48943 (103)
WSV108     50300–49083 (406)
WSV112     51809–50427 (461)
WSV115     52007–54910 (968)
WSV119     55055–58186 (1,044)
WSV128     58948–60057 (370)
WSV129     58956–60026 (357)
WSV130     60581–60132 (150)
WSV131     62127–60676 (484)
WSV133     62204–63016 (271)
WSV134     62991–63656 (222)
WSV136     63666–64049 (128)
WSV137     65042–64014 (337)
WSV139     68659–65036 (1,208)
WSV142     69118–68708 (137)
WSV143     69265–76203 (2,313)
(missing)  75119–74922 (66)
(missing)  77653–76277 (459)
(missing)  78365–77451 (305)
(missing)  79065–83372 (1,436)
(missing)  85707–83431 (759)
WSV166     88980–85765 (1,072)
WSV172     91607–89064 (848)
WSV177     92964–92647 (106)
WSV178     93229–94134 (302)
WSV181     94624–95739 (372)

Best matchb [source]

L23982, collagen type VII [Homo sapiens]

U07025, chitinase [Janthinobacterium lividum]

AF128951, flagellin [Escherichia coli]

BlastP Identity Length score (%) (aa)

930

42

63

54

41

18

Predicted structure and/or functionc (position)d

1,336 Collagen, TM Nucleocapsid protein VP24, TM, SP 53 Glycosyl hydrolase, Pro-rich cluster (147–175), TM SP Ser/Gly-rich region (4–80), TM 496 TM, SP,

TM

AE003491, sno gene product [Drosophila melanogaster] X77514, pupal cuticule protein [Galleria mellonella]

71

25

46

26

Ser/Glu-rich region (36–114) and basic region (130–200), TM 185 Acidic region (1406–1445), TM 111 TM, SP Glu-rich cluster (578–636) Acidic region (284–360), ATP/GTP binding motif, TM Basic region (24–59) TM, SP

SP Cys2/His2-type zinc finger AF078683, Ring-H2 finger protein RHA1a [Arabidopsis thaliana]

44

40

52 Cys2/Cys2-type zinc finger

NP-001062, thymidylate synthetase [Homo sapiens]

392

67

AE003485, CG11122 gene product [Drosophila melanogaster] AL022223, putative protein [Arabidopsis thaliana]

45

26

238

43

33

AC018363, putative protein kinase [Arabidopsis thaliana]

42

27

69 EF-hand calcium-binding motif; Ring finger protein-like; Ser/Asp-rich region (204–330) 102 Protein kinase, Ser/Thr protein kinase activesite signature, TM Glu/Ser-rich region (626–737), TM

AC024128, putative CBP [Arabidopsis thaliana]

47

41

NP-012284, cell surface flocculin [Saccharomyces cerevisiae] Q89662, dUTP pyrophosphatase [fowl adenovirus type 1]

55 90

TM 287 Thymidylate synthase Cys2/His2-type zinc finger TM, SP TM

X73481, mst101(2) [Drosophila hydei]

80

AE003593, CG10523 gene product [Drosophila melanogaster] P21524, ribonucleoside-diphosphate reductase large chain [Saccharomyces cerevisiae] T29757, protein UNC-89 [Caenorhabditis elegans]

+* + +* + - - + - - + -* + - + - +** + + +* +* + + - +** +* - +* +* +* +

24

60 CBP; Cys2/Cys2-type zinc finger, TM TM, SP 237 Membrane-associated protein, TM

37

161 dUTPase

-

30

TM Acidic region (122–174) Repeat region (23–325) 302 Repeat region (20–322)

- +** + + +** + - + + +** -* - +**

TM, SP TM

48

+* +* +** -

+* + +* + +*

TM

AB037755, KIAA1334 protein [Homo sapiens]

Poly(A) signalc

18

554 Repeat region (325–427); Asn-rich cluster (990–1094), TM

TM Glu-rich cluster (97–134), Asn/pro-rich region (467–637) 69 Cys2/Cys2-type zinc finger, TM

+ +* - +* +* +*

49

34

728

48

790 Ribonucleotide reductase large subunit, TM

+*

46

22

216 Repeat region (82–302), TM, SP

+ +** -


TABLE 1 (continued)

ORF        Product position (length [aa])
WSV184     95744–97366 (541)
WSV188     97548–98786 (413)
WSV191     98854–99786 (311)
WSV192     102885–99829 (1,019)
WSV195     103071–103841 (257)
WSV198     103844–104677 (278)
WSV199     104760–107327 (856)      AF156271, Ring finger protein terf [Homo sapiens]
WSV206     108550–109161 (204)
WSV207     109261–110085 (275)
WSV209     114953–110136 (1,606)
WSV214     115053–115292 (80)       L41834, nuclear protein [Ensis minor]
WSV216     118987–115406 (1,194)
WSV220     119057–121078 (674)
WSV222     121100–123631 (844)      AK016037, putative [Mus musculus]
WSV226     123758–126547 (930)
WSV230     126755–127000 (82)
WSV231     129006–127162 (615)
WSV234     130290–129409 (294)
WSV235     129611–129811 (67)
WSV236     130076–130306 (77)
WSV237     130566–131441 (292)
WSV238     131481–132938 (486)
WSV242     132994–133893 (300)
WSV244     133969–136341 (791)
WSV249     137589–139937 (783)
WSV252     140111–141613 (501)
WSV254     141696–142538 (281)
WSV256     142545–143696 (384)
WSV259     143760–144686 (309)
WSV260     147517–144752 (922)
WSV267     148612–147770 (281)
WSV269     150145–148679 (489)
WSV270     150675–150166 (170)
WSV271     150688–154341 (1,218)    S59310, probable membrane protein YMR317w [Saccharomyces cerevisiae]
WSV277     154557–156929 (791)      D86346, crystal protein [Bacillus thuringiensis]
WSV282     159352–161253 (634)
WSV284     161263–161562 (100)
WSV285     161718–165017 (1,100)    AE003824, CG13185 gene product [Drosophila melanogaster]
WSV289     169814–165120 (1,565)    NP-011856, serine/threonine protein kinase [Saccharomyces cerevisiae]
WSV291     167278–167532 (85)
WSV294     170113–170730 (206)
WSV295     170832–171458 (209)
WSV299     172439–171513 (309)
WSV302     173075–172509 (189)
WSV303     173178–175850 (891)      NP-069209, transcription initiation factor IID [Archaeoglobus fulgidus]
(missing)  175840–177096 (419)
(missing)  177124–178521 (466)
(missing)  178530–179345 (272)
(missing)  180036–179425 (204)
(missing)  183817–180279 (1,180)
WSV321     184132–184482 (117)
WSV322     184499–185179 (227)
WSV323     185082–184819 (88)
WSV324     185434–185189 (82)
WSV325     185433–186827 (465)
WSV327     190743–188176 (856)
WSV331     190094–190306 (71)
WSV332     190876–193233 (786)
WSV333     191135–190932 (68)
WSV338     194629–193331 (433)
WSV339     195503–194655 (283)
WSV340     196292–195510 (261)
WSV342     196697–196398 (100)
WSV343     209342–196803 (4,180)
WSV344     197221–197517 (99)
WSV349     199510–199779 (90)
WSV360     209616–227846 (6,077)    U96166, srpA [Streptococcus cristatus]
WSV386     228196–227993 (68)

AF117061, ribonucleotide reductase R2 subunit [Aedes albopictus] AJ133437, deoxyribonuclease I [Penaeus japonicus]


364

59

313

54

32

149

43

30

72

59

46

73

44

32

58


Cys2/Cys2-type zinc finger, TM Ribonucleotide reductase small subunit, TM

+ +*

Nuclease, TM, SP TM TM, SP

+* +* + - + +* +** + +** +* + +**

Ring-H2 finger motif, TM Proline rich, TM TM DNA-binding protein Protein-splicing signature, TM Ring-H2 finger motif, ATP/GTP binding motif, TM TM TM

TM

AC024760, contains similarity to TR [Caenorhabditis elegans]

38

30

139

72

27

202

Poly(A) signalc

Gly-rich cluster (50–138), TM, SP TM TM Ring-H2 finger motif, repeat region (454–633)

+ +** +* +* + + -* +* - +** +*

46

20

369

Lys/Ser-rich region (455–526), TM

+** +* +* +** +** +* + + -

41

23

249

48

25

183

TM Ser-rich cluster (13–129), SP TM, SP ATP/GTP binding motif, TM

+* +* - +**

46

25

245

Protein kinase, TM, SP

+*

TM, SP

+ - +** +** - +

TM, SP Asp/Glu/Ser-rich region (344–485), TM TM

TM TM, SP 44

23

140

TBP Cys2/Cys2-type zinc finger, TM TM TM Nucleocapsid protein VP26, TM, SP Glu-rich region (37–358) and Pro-rich cluster (462–492), TM TM, SP TM

TM, SP TM

TM, SP TM, SP

46

23

222

TM TM TM, SP Leucine-zipper motif, TM TM, SP

+* +* + +** + - - +** + + +** + +** + +** - - + +** + + +** +


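TABLE 1 (continued)

ORF        Product position (length [aa])
(missing)  228375–230561 (729)
(missing)  230617–231579 (321)
(missing)  231422–231724 (101)
(missing)  231603–232796 (398)
(missing)  232819–233331 (171)
(missing)  233383–233763 (127)
(missing)  234330–233782 (183)
(missing)  236679–238601 (641)
WSV406     238659–239435 (259)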



T41553, thymidylate kinase [Schizosaccharomyces pombe]

157

41


200 Thymidylate kinase; ATP/GTP binding motif

TM

T27927, hypothetical protein ZK593.8 [Caenorhabditis elegans]

WSV407     240139–239459 (227)
WSV412     240713–241189 (159)
WSV414     241637–241275 (121)
WSV415     241775–243406 (544)
WSV419     243217–243795 (193)
WSV421     244242–244853 (204)
WSV423     247143–244954 (730)
WSV427     249230–247362 (623)
WSV432     249151–249456 (102)
WSV433     249426–253208 (1,261)
WSV440     253297–255117 (607)
WSV442     255075–257474 (800)
WSV446     257552–259129 (526)
WSV447     264975–259168 (1,936)    Z70204, similarity to yeast hypothetical helicase [Caenorhabditis elegans]
WSV455     265079–265597 (173)
WSV457     265606–266400 (265)
WSV459     266838–266446 (131)
WSV461     267400–266930 (157)
WSV462     267399–267647 (83)
WSV464     268584–267721 (288)
WSV465     272423–268695 (1,243)
WSV477     274527–275150 (208)
(missing)  276736–275210 (509)
(missing)  277035–277571 (179)
(missing)  277705–278076 (124)
(missing)  278423–277776 (216)
(missing)  278637–280973 (779)
WSV489     281865–281131 (245)
WSV492     282176–282583 (136)
WSV493     283360–282677 (228)
WSV495     283754–284011 (86)
WSV497     285773–284079 (565)
WSV500     286706–286080 (209)

50

26

Ring-H2 finger motif, repeat region (435–494), SP 175 TM, SP

Asp-rich cluster (63–86), TM, SP TM

T22255, hypothetical protein F45H7.4 [Caenorhabditis elegans]

45

26

Envelope protein VP28, TM, SP 165 Protein kinase, TM EF-hand calcium-binding motif, TM Pro-rich cluster (29–71), TM

42

40

ATP/GTP binding motif, TM ATP/GTP binding motif, TM, SP 52 Helicase, ATP/GTP binding motif, Asp-protease motif, TM TM, SP TM, SP TM

Cys2/Cys2-type zinc finger, TM Cys2/Cys2-type zinc finger, ATP/GTP binding motif Glu-rich cluster (467–485), TM TM

AF154037, surface protein PspC [Streptococcus pneumoniae]

46

24

TM 194 Lysine-rich, TM

Acidic region (59–103)

WSV502     286606–289632 (1,009)    AL352992, ariadne-like protein [Leishmania major]

51

WSV508     291298–289685 (538)
WSV513     291720–292202 (161)
WSV514     292190–298774 (2,195)    X61920, DNA polymerase III catalytic subunit [Saccharomyces cerevisiae]
WSV518     293724–293275 (150)
WSV524     298729–298526 (68)
WSV525     299033–298821 (71)
WSV526     300432–299089 (448)

52

51

24

TM Cys2/Cys2-type zinc finger, ATP/GTP binding motif 33 Cys2/His2, Cys2/Cys2-type zinc finger, ATP/ GTP binding motif, TM, SP TM 201 DNA polymerase, TM SP TM, SP TM

Poly(A) signalc

- + + + - + + +* +* + + +** - - +** +* - + + - +* +* +* - +* + - + + +* +* +** +** + +* - +* + +** + +** +* +* +** - +** + + + +*

a aa, amino acid.
b Accession numbers are from the GenBank or SwissProt database.
c Function was deduced from the degree of amino acid similarity to either products of known genes or Prosite signatures. TM, transmembrane domains; SP, signal peptides.
d Positions of amino acid residues.
e +, polyadenylation signal present; -, signal absent. *, the transcription of the ORF was also verified by RT-PCR; **, the ORF was confirmed by cDNA sequencing.

ORFs encode predicted proteins that show partial homology (20 to 39% identity) to known proteins or contain one or two sequence motifs (rather than a complete functional domain). The remaining 133 ORFs encode proteins with no homology to any known proteins or motifs.
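The counts quoted in this section can be checked against one another with a few lines of arithmetic. The sketch below uses only figures stated in the text (genome length, the three ORF classes, and the two groups of ORFs with transcription evidence); the variable and function names are hypothetical.

```python
GENOME_BP = 305_107  # genome length reported in the text

# ORF classes as described in the text
ORF_CLASSES = {
    "40-68% identity or functional domain": 18,
    "20-39% identity or 1-2 motifs": 30,
    "no homology to known proteins": 133,
}

CDNA_OR_PREVIOUSLY_REPORTED = 36  # ORFs found in the cDNA library or reported earlier
RT_PCR_CONFIRMED = 52             # additional ORFs confirmed by RT-PCR


def gene_density_kb(genome_bp: int, n_genes: int) -> float:
    """Average genome length per gene, in kilobases."""
    return genome_bp / n_genes / 1000.0


total_orfs = sum(ORF_CLASSES.values())
density = gene_density_kb(GENOME_BP, total_orfs)
with_transcript_evidence = CDNA_OR_PREVIOUSLY_REPORTED + RT_PCR_CONFIRMED

print(f"annotated ORFs: {total_orfs}")
print(f"gene density: one gene per {density:.2f} kb")
print(f"ORFs with transcription evidence: {with_transcript_evidence}")
```

The three classes add up to the 181 annotated ORFs, and the density rounds to the 1.7 kb per gene given above.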

FIG. 2. Multiple amino acid sequence alignment of products of WSV303 and WSV100. The homology regions are shaded (black, 100%; pink, >75%; blue, >50%). The positions of the amino acid sequence are indicated on both ends. (A) Alignment of product of WSV303 with a known TBP. Human, Homo sapiens, accession no. XP_004534; yeast, Saccharomyces cerevisiae, accession no. M26403; fly, Drosophila melanogaster, accession no. A35615; At, Arabidopsis thaliana, accession no. AC005223; Metha, Methanothermobacter thermautotrophicus, accession no. AE000921; Archa, Archaeoglobus fulgidus, accession no. AE001078; Halob, Halobacterium sp. strain NRC-1, accession no. AE005110. (B) Alignment of product of WSV100 with the CBP. Human, Homo sapiens, accession no. U47741; mouse, Mus musculus, accession no. S39161; fly, D. melanogaster, accession no. U88570; At, A. thaliana, accession no. AC024128.

Enzymes involved in nucleotide metabolism. Among the 18 ORFs encoding proteins that show extensive homologies with previously identified proteins, WSV067, WSV112, WSV172, WSV188, and WSV395 may encode the WSBV homologues of enzymes involved in nucleic acid metabolism (Table 1). The highest degree of homology (67% identity over 287 amino acids) was detected between the product of WSV067 and the human thymidylate synthase. The 29-amino-acid thymidylate synthase PROSITE motif (PS00091), which contains the catalytic cysteine residue, is 100% conserved in the product of WSV067. In addition, WSV112 may encode a WSBV homologue of dUTPase (37% identity over 161 amino acids) since the five conserved regions of dUTPase, especially the highly conserved substrate-binding residues, were identified in the product of WSV112 (13, 35). dUTPase has been shown to be essential for the replication of DNA viruses (5). Consistent with the previous reports by van Hulten et al. (48) and Tsai et al. (45, 46), WSBV contains ribonucleotide reductases (products of WSV172 and WSV188) and also thymidylate/thymidine kinase (product of WSV395). Among these enzymes, thymidylate synthase catalyzes the methylation of dUMP to yield the nucleotide precursor dTMP. This is an important step in the de novo pathway of pyrimidine biosynthesis (12). Despite its ubiquitous distribution in nature, a viral thymidylate synthase has been found only in a few herpesviruses (2, 10, 26, 39), Melanoplus sanguinipes entomopoxvirus (MsEPV) (1), Chilo iridescent virus (CIV) (36), and bacteriophages (9). Most viruses do not contain thymidylate synthase, as they depend mostly on the host enzymatic machinery for the replication of their genomes so as to keep the viral genome small (36). WSBV and other thymidylate synthase-containing viruses may therefore exhibit considerable independence from host deoxyribonucleotide synthesis. This may represent a significant advantage for viral genome replication that may ultimately lead to persistence of infection and a broad host range (36). It is possible that WSBV acquired these replication-related genes from its host and/or from a coinfecting virus at an earlier period in evolution. However, since the shrimp homologues of these genes have not been cloned, we are not able to test this hypothesis.
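Figures such as "67% identity over 287 amino acids" are ratios of identical aligned positions to alignment length. The sketch below shows one way to compute such a value from two pre-aligned sequences. It assumes the BLAST-style convention in which gap columns count toward the length but never as matches; the function name and the toy sequences are made up for illustration and are not WSBV data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Gap columns add to the alignment length but are never counted as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignments (made up): 5 of 6 columns match, then 4 of 5 with one gap.
print(percent_identity("ACDEFG", "ACDQFG"))
print(percent_identity("AC-EF", "ACDEF"))

# Reading the WSV067 figure the other way round: 67% of 287 aligned
# positions is roughly 192 identical residues.
print(round(0.67 * 287))
```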

Proteins involved in DNA replication and transcription. WSBV contains genes encoding proteins involved in DNA replication such as DNA polymerase (product of WSV514). The WSBV DNA polymerase was putatively identified by the presence of three highly conserved motifs, YGDTDSVFC (DNA polymerase family B signature PS00116), KLGMNSMYG, and DMTSLYP (conserved amino acid residues are underlined), that are found in most eukaryotic DNA polymerases (4) as well as in some viral polymerases (18, 29, 43). However, since the degree of amino acid similarity between the product of WSV514 and known DNA polymerases is low (24% identity over 201 amino acids), its putative activity as a DNA polymerase still awaits experimental verification. Interestingly, this putative WSBV polymerase (2,195 amino acids) is much larger than the polymerases found in other organisms. Products of ORFs that show weak similarity (BlastP score, <100; identity, 20 to 39%) to known proteins include a putative TATA-box binding protein (TBP) (product of WSV303, containing partial conservation with transcription initiation factor IID repeat signature PS00351) (Fig. 2A), the putative CREB-binding protein (CBP) (product of WSV100) (Fig. 2B), a nuclease (product of WSV191, containing most residues of DNA/RNA nonspecific endonuclease active site PS01070), the putative helicase (product of WSV447), and protein kinases (products of WSV083, WSV289, and WSV423). Most of them play important roles in the regulation of gene transcription. TBP and CBP, which have never been reported in a virus genome, deserve special attention since they are critical basal transcription regulators in eukaryotic cells (21, 51). However, their functions in the virus are yet to be determined.

FIG. 3. Multiple amino acid sequence alignment of the product of WSV001 with human (Homo sapiens) type VII collagen, accession no. L23982; fruit fly (Drosophila melanogaster) collagen, accession no. P08120; sea urchin (Strongylocentrotus purpuratus) collagen, accession no. A43426; brown alga virus (BAV; Ectocarpus siliculosus virus) collagen-like protein, accession no. NP_077542; herpesvirus saimiri (HVS strain 484-77) collagen-like protein, accession no. P25050; and bacteriophage PRD1 (PRD1) coat protein, which contains a short collagen-like region, accession no. P22536. The homology regions are shaded (black, 100%; pink, >75%; blue, >50%). Repeat sequence density is shown as a ratio of a/b, in which a indicates the length of the typical repeat sequence and b indicates the full length of the protein.

Structure proteins. A unique feature of WSBV is that it contains a collagen-like gene, WSV001, which encodes a predicted 1,684-amino-acid protein and whose transcription has been confirmed by RT-PCR. The product of this ORF displays the highest degree of homology to human collagen type VII (42% identity over 1,336 amino acids) (Fig. 3). This is the first time that an intact collagen gene has been reported in a virus genome. The collagen-like protein of WSBV contains a typical repeat of Gly-X-Y (X is mostly proline, and Y can be any amino acid) that can form the triple-helical structure characteristic of animal collagen fibers. The presence of this collagen-like protein may help to protect WSBV from environmental factors and may contribute to its ability to survive for a long time in a shrimp pond. Previously, only a short segment of collagen-homologous sequence had been found in the structural proteins of Ectocarpus siliculosus virus 1 (EsV-1) (16), herpesvirus saimiri (HVS) (2, 22), and bacteriophage PRD1 (6, 7) (Fig. 3). In EsV-1, the collagen-like sequence was found in the N-terminal half of both vp55 and vp74 (16), which are encoded by the EsV-1 genome and are likely to be components of the viral core structure. In HVS, the Gly-X-Y motif is repeated 18 times and is located in the central region of saimiri transformation-associated protein (STP). These collagen-like repeats may serve as a hinge to extend the active domain of STP to its site of action (2). Finally, in bacteriophage PRD1, a minor capsid protein was found to contain a short collagen-like region, (Gly-X-Y)6 (7). All of the collagen-like segments present in these proteins are short and may play only a supplementary role in protein function. In addition, WSV002 and WSV311 encode nucleocapsid proteins, and the product of WSV421 shows characteristics of an envelope protein. These proteins have recently been purified from the nucleocapsid and envelope of WSBV (49, 50). WSV214 encodes a polypeptide with 44.2% basic amino acid residues (Arg/Lys) and 24.6% Ser residues. This amino acid composition is similar to that of the DNA-binding protein of insect baculoviruses (34, 40, 55).
Homologs of these DNA-binding proteins have also been found in granulosis virus (47). The basic residues of these DNA-binding proteins have a high affinity for the phosphate backbone of DNA, enabling the generation of a highly compact form of viral genomic DNA. Upon entry into a host cell, the DNA-binding protein may become phosphorylated by a protein kinase, resulting in the unpacking of the viral DNA (54).

Protein motifs. ORFs containing zinc finger and leucine zipper motifs have been found in WSBV (Table 1). These motifs have been shown to be involved in DNA-protein interaction and in regulation of transcriptional activation. Ring-H2 finger motifs, a variation of the Ring finger motif (30, 44) found in proteins critical for virus survival and replication (11, 42), are also detected. Products of WSV079 and WSV427 contain an EF-hand calcium-binding motif (PS00018). Proteins with these motifs are found in some prokaryotic and all eukaryotic organisms and play important roles in the regulation and control of normal cellular functions. The detection of these motifs in proteins of a marine virus suggests that some of these basic regulatory activities are well conserved throughout evolution. The remaining 133 ORFs encode novel proteins of unknown function. These novel genes will provide ample opportunities for future research and for exploration of the molecular mechanisms by which a virus and its host interact to survive in the marine environment. Among the 181 ORFs examined, the products of 96 have potential transmembrane domains and 32 proteins contain both signal peptide sequences and substantial hydrophobic domains, suggesting that they may be membrane-associated proteins and that they may play an important role in the WSBV-host cell interaction and host range determination. Other than the putative signal sequences and hydrophobic domains, these proteins are not obviously related to other known proteins.

Repetitive regions. Three percent of the WSBV genome is composed of highly repetitive sequences, and the repeats are distributed throughout the genome.
We found nine hrs with a total of 47 repeated minifragments encompassing direct repeats, atypical inverted repeat sequences, and imperfect palindromic sequences. The nine hrs vary in size from 0.76 to 3.62

11818

YANG ET AL.

J. VIROL.

TABLE 2. Positions and identities of hrs in WSBV genome hr

Position

hr1 hr2 hr3 hr4 hr5 hr6 hr7 hr8 hr9

24528–28184 77591–78859 91832–92592 107335–108339 136540–137301 157231–159211 186876–188141 234231–236419 272510–274432

Minifragment

a, a, a, a, a, a, a, a, a,

b, b, b, b, b, b, b, b, b,

c, c, c, c, c, c, c, c, c,

d, e, f, g d, e d d, d, d, d,

e, f, g e, e, f, g e, f

Identity (%)

Identity between hrs (%)

73.87 87.98 88.85 87.26 91.41 74.35 89.65 79.77 80.86

61.62

kb, and hr1 to hr9 are separated in the WSBV genome by about 49, 13, 15, 28, 20, 28, 46, 36, and 55 kb of DNA, respectively. Each hr contains several repeated minifragments, each with a size around 300 bp. These minifragments are referred here as

a, b, c, d, e, f, etc. (Table 2). The percentage of homology among the consensus sequences within the same homologous region is over 73%, while the identity among the hrs is 61.6% (Table 2). A few sequence motifs were found to be present at very high copy numbers. For example, sequences CCAGAAA or TTTCTGG, AGNGGTCCACC, and AACTTGACAT are repeated 219, 88, and 47 times, respectively. As an example of such repetitive region, the homology among the b minifragments of the nine hrs is shown in Fig. 4. Both GC-rich sequences and AT-rich sequences are found in the repeats. In the imperfect palindromic sequences, there are 2- or 3-bp mismatches that always exist in the same location within every palindrome (Fig. 4), suggesting a functional significance for the mismatch. Atypical inverted repeat sequences that can form one or two hairpin loops are also found within the repeat segments. The AT-rich elements, inverted repeat sequences, and loop structures are reminiscent of the origin of

FIG. 4. Alignment of partial consensus sequences within each hr. The consensus minifragments b are shown in order: hr1 to hr9. The hrs are shaded (black, 100%; pink, >75%; blue, >50%), and the numbers on both ends refer to the positions of consensus sequences in the WSBV genome. The direct repeat region, the atypical inverted repeat sequence that may contribute to the hairpin loop, the imperfect palindrome, and GC-rich and AT-rich regions are shown.


replication in eukaryotic cells and also in some viruses (17, 37). The presence of hrs is a feature of many baculovirus genomes. The hrs may serve as transcription enhancers and origins of DNA replication and play a fundamental role in the viral life cycle (24, 27, 28, 33). The presence of nine hrs suggests that WSBV may contain multiple replication origins. This may account for the fast replication and the growth rate of WSBV. Furthermore, although the organization of WSBV hrs is similar to that of baculovirus, no homology among most of their ORFs is detected. Thus, future investigations are needed to determine whether WSBV is a seawater baculovirus and whether the ancestors of WSBV and insect baculoviruses evolved by separate routes, acquiring genes independently in different environments.

In summary, we have obtained the complete genome sequence of WSBV. This is the first complete genome sequence from a marine invertebrate virus. It is also the largest animal virus genome sequenced (8, 52). As the genomic data demonstrate, more than 80% of WSBV proteins bear no homology to previously identified proteins. This leads us to consider a separate evolutionary origin for this virus. Among the proteins that show homology with known proteins, most seem to be related to eukaryotic proteins and relatively few seem to be related to viral proteins (Table 1). Although a few genes show weak similarities to genes of herpesvirus (data not shown), the morphology and the double-stranded circular WSBV genome differ significantly from those of herpesvirus, which contains an icosahedral capsid and a linear double-stranded DNA molecule. On the other hand, WSBV shares some complex morphological traits with the insect baculovirus, and a pattern of interspersed repetitive regions in WSBV is similar to that found in some of the insect baculoviruses, but sequence comparison indicates that they are not detectably related at the amino acid level.
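The interspersed arrangement of the nine hrs can be checked directly from the coordinates in Table 2. The sketch below recomputes the size of each hr and the distance to the next one around the circular 305,107-bp genome. The variable and function names are illustrative, and two conventions are assumptions: coordinates are treated as inclusive, and each gap is measured from the end of one hr to the start of the next.

```python
GENOME_LENGTH = 305_107  # bp, circular genome size stated in the paper

# (start, end) coordinates of hr1..hr9, copied from Table 2.
HR_POSITIONS = [
    (24528, 28184), (77591, 78859), (91832, 92592),
    (107335, 108339), (136540, 137301), (157231, 159211),
    (186876, 188141), (234231, 236419), (272510, 274432),
]


def hr_sizes(positions):
    """Length of each hr in bp, assuming inclusive coordinates."""
    return [end - start + 1 for start, end in positions]


def hr_gaps(positions, genome_length):
    """Bases between each hr and the next, wrapping around the circle."""
    gaps = []
    count = len(positions)
    for i, (_, end) in enumerate(positions):
        next_start = positions[(i + 1) % count][0]
        # The modulo handles the last gap, which crosses the origin.
        gaps.append((next_start - end - 1) % genome_length)
    return gaps


sizes = hr_sizes(HR_POSITIONS)
gaps_kb = [round(gap / 1000) for gap in hr_gaps(HR_POSITIONS, GENOME_LENGTH)]

if __name__ == "__main__":
    print(gaps_kb)  # → [49, 13, 15, 28, 20, 28, 46, 36, 55]
```

Under these assumptions the rounded gaps reproduce the spacing of about 49, 13, 15, 28, 20, 28, 46, 36, and 55 kb quoted for hr1 to hr9, with the final value being the distance from hr9 back to hr1 across the origin.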
Unfortunately, until now there were no genome sequence data available for the nonoccluded baculovirus. Based on genetic analysis, WSBV clearly should not be included in any of the currently recognizable baculovirus subfamilies and perhaps should be classified in a new virus family. It is possible that other WSBV-like viruses that can infect other organisms may exist. As the sequence of a representative of a marine DNA virus, the complete WSBV genome sequence should provide valuable information to serve as the genetic basis for future studies. Future work may shed more light on the evolution of these viruses.

ACKNOWLEDGMENTS

We thank Mei He and Yun Ye for their assistance, and we acknowledge the support of Mingwei Wang, Lin Zao, and Yan Shen. We thank Mark Yandell, Jennifer R. Wortman, Chinnappa Kodira, P. W. Li, and Z. Deng of Celera Genomics for coordinating the project at Celera. We thank Kunxin Luo of Lawrence Berkeley National Laboratory and UC Berkeley for data analysis and critical reading of the manuscript. This work is funded by the Chinese High Tech “863” Program (Z19-02-05-01), Fujian Science Fund (C97053), and Science Foundation of the State Oceanic Administration.

REFERENCES

1. Afonso, C. L., E. R. Tulman, Z. Lu, E. Oma, G. F. Kutish, and D. L. Rock. 1999. The genome of Melanoplus sanguinipes Entomopoxvirus. J. Virol. 73:533–552.
2. Albrecht, J. C., J. Nicholas, D. Biller, K. R. Cameron, B. Biesinger, C. Newman, S. Wittmann, M. A. Craxton, H. Coleman, B. Fleckenstein, and R. W. Honess. 1992. Primary structure of the herpesvirus saimiri genome. J. Virol. 66:5047–5058.


3. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
4. Arif, P. 1988. A sequence motif in many polymerases. Nucleic Acids Res. 16:9909–9916.
5. Baldo, A. M., and M. A. McClure. 1999. Evolution and horizontal transfer of dUTPase-encoding genes in viruses and their hosts. J. Virol. 73:7710–7721.
6. Bamford, D. H., and J. K. Bamford. 1990. Collagenous proteins multiply. Nature 344:497.
7. Bamford, J. K., and D. H. Bamford. 1990. Capsomer proteins of bacteriophage PRD1, a bacterial virus with a membrane. Virology 177:445–451.
8. Bankier, A. T., S. Beck, R. Bohni, C. M. Brown, R. Cerny, M. S. Chee, C. A. Hutchinson, T. Kouzarides, J. A. Martignetti, E. Preddie, S. C. Satchwell, P. Tomlinson, K. M. Weston, and B. G. Barrell. 1991. The DNA sequence of the human cytomegalovirus genome. DNA Seq. 2:1–12.
9. Belfort, M., A. Moelleken, G. F. Maley, and F. Maley. 1983. Purification and properties of T4 phage thymidylate synthase produced by the cloned gene in an amplification vector. J. Biol. Chem. 258:2045–2051.
10. Bodemer, W., H. H. Niller, N. Nitsche, B. Scholz, and B. Fleckenstein. 1986. Organization of the thymidylate synthase gene of herpesvirus saimiri. J. Virol. 60:114–123.
11. Borden, K. L. 2000. RING domains: master builders of molecular scaffolds? J. Mol. Biol. 295:1103–1112.
12. Carreras, C. W., and D. V. Santi. 1995. The catalytic mechanism and structure of thymidylate synthase. Annu. Rev. Biochem. 64:721–762.
13. Cedergren-Zeppezauer, E. S., G. Larsson, P. O. Nyman, Z. Dauter, and K. S. Wilson. 1992. Crystal structure of a dUTPase. Nature 355:740–743.
14. Cen, F. 1998. The existing condition and development strategy of shrimp culture industry in China, p. 32–38. In Y. Q. Su (ed.), The health culture of shrimps. China Ocean Press, Beijing, People’s Republic of China.
15. Chen, X. F., C. Chen, D. H. Wu, H. Huai, and X. C. Chi. 1997. A new baculovirus of cultured shrimp. Sci. China Ser. C 40:630–635.
16. Delaroque, N., S. Wolf, D. G. Muller, and R. Knippers. 2000. Characterization and immunolocalization of major structural proteins in the brown algal virus EsV-1. Virology 269:148–155.
17. DePamphilis, M. L. 1993. Origins of DNA replication that function in eukaryotic cells. Curr. Opin. Cell Biol. 5:434–441.
18. Earl, P. L., E. V. Jones, and B. Moss. 1986. Homology between DNA polymerases of poxviruses, herpesviruses, and adenoviruses: nucleotide sequence of the vaccinia virus DNA polymerase gene. Proc. Natl. Acad. Sci. USA 83:3659–3663.
19. Fleischmann, R. D., M. D. Adams, O. White, R. A. Clayton, E. F. Kirkness, A. R. Kerlavage, C. J. Bult, J. F. Tomb, B. A. Dougherty, J. M. Merrick, et al. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512.
20. Francki, R. I. B., C. M. Fauquet, D. L. Knudson, and F. Brown. 1991. Classification and nomenclature of viruses. Fifth report of the International Committee on Taxonomy of Viruses. Arch. Virol. 1991(Suppl. 2):1–450.
21. Furukawa, T., and N. Tanese. 2000. Assembly of partial TFIID complexes in mammalian cells reveals distinct activities associated with individual TATA box-binding protein-associated factors. J. Biol. Chem. 275:29847–29856.
22. Geck, P., S. A. Whitaker, M. M. Medveczky, and P. G. Medveczky. 1990. Expression of collagenlike sequences by a tumor virus, herpesvirus saimiri. J. Virol. 64:3509–3515.
23. Geourjon, C., and G. Deleage. 1995. ANTHEPROT 2.0: a three-dimensional module fully coupled with protein sequence analysis methods. J. Mol. Graph. 13:209–212.
24. Guarino, L. A., and W. Dong. 1991. Expression of an enhancer-binding protein in insect cells transfected with the Autographa californica nuclear polyhedrosis virus IE1 gene. J. Virol. 65:3676–3680.
25. Hofmann, K., P. Bucher, L. Falquet, and A. Bairoch. 1999. The PROSITE database, its status in 1999. Nucleic Acids Res. 27:215–219.
26. Honess, R. W., W. Bodemer, K. R. Cameron, H. H. Niller, B. Fleckenstein, and R. E. Randall. 1986. The A+T-rich genome of Herpesvirus saimiri contains a highly conserved gene for thymidylate synthase. Proc. Natl. Acad. Sci. USA 83:3604–3608.
27. Kool, M., P. M. Van Den Berg, J. Tramper, R. W. Goldbach, and J. M. Vlak. 1993. Location of two putative origins of DNA replication of Autographa californica nuclear polyhedrosis virus. Virology 192:94–101.
28. Kool, M., J. T. Voeten, R. W. Goldbach, J. Tramper, and J. M. Vlak. 1993. Identification of seven putative origins of Autographa californica multiple nucleocapsid nuclear polyhedrosis virus DNA replication. J. Gen. Virol. 74:2661–2668.
29. Larder, B. A., S. D. Kemp, and G. Darby. 1987. Related functional domains in virus DNA polymerases. EMBO J. 6:169–175.
30. Leverson, J. D., C. A. Joazeiro, A. M. Page, H. K. Huang, P. Hieter, and T. Hunter. 2000. The APC11 RING-H2 finger mediates E2-dependent ubiquitination. Mol. Biol. Cell 11:2315–2325.
31. Lightner, D. V. 1996. A handbook of pathology and diagnostic procedures for diseases of penaeid shrimp. World Aquaculture Society, Baton Rouge, La.
32. Lo, C. F., C. H. Ho, S. E. Peng, C. H. Chen, H. C. Hsu, Y. L. Chiu, C. F. Chang, K. F. Liu, M. S. Su, C. H. Wang, and G. H. Kou. 1996. White spot syndrome baculovirus (WSBV) detected in cultured and captured shrimp, crabs and other arthropods. Dis. Aquat. Org. 27:215–226.
33. Lu, A., P. Krell, J. M. Vlak, and G. F. Rohrmann. 1997. Baculovirus DNA replication. In L. K. Miller (ed.), The baculoviruses. Plenum Press, New York, N.Y.
34. Maeda, S., S. G. Kamita, and H. Kataska. 1991. The basic DNA-binding protein of Bombyx mori nuclear polyhedrosis virus: the existence of an additional arginine repeat. Virology 180:807–810.
35. McGeoch, D. J. 1990. Protein sequence comparisons show that the ’pseudoproteases’ encoded by poxviruses and certain retroviruses belong to the deoxyuridine triphosphatase family. Nucleic Acids Res. 18:4105–4110.
36. Muller, K., C. A. Tidona, U. Bahr, and G. Darai. 1998. Identification of a thymidylate synthase gene within the genome of Chilo iridescent virus. Virus Genes 17:243–258.
37. Pearson, M., R. Bjornson, G. Pearson, and G. Rohrmann. 1992. The Autographa californica baculovirus genome: evidence for multiple replication origins. Science 257:1382–1384.
38. Pearson, W. R. 1990. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183:2444–2448.
39. Richter, J., I. Puchtler, and B. Fleckenstein. 1988. Thymidylate synthase gene of herpesvirus ateles. J. Virol. 62:3530–3535.
40. Russell, R. L., and G. F. Rohrmann. 1990. The p6.5 gene region of a nuclear polyhedrosis virus of Orgyia pseudotsugata: DNA sequence and transcriptional analysis of four late genes. J. Gen. Virol. 71:551–560.
41. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
42. Saurin, A. J., K. L. Borden, M. N. Boddy, and P. S. Freemont. 1996. Does this have a familiar RING? Trends Biochem. Sci. 21:208–214.
43. Tomalski, M. D., J. G. Wu, and L. K. Miller. 1988. The location, sequence, transcription, and regulation of a baculovirus DNA polymerase gene. Virology 167:591–600.
44. Torii, K. U., C. D. Stoop-Myer, H. Okamoto, J. E. Coleman, M. Matsui, and X. W. Deng. 1999. The ring finger of photomorphogenic repressor COP1 specifically interacts with the RING-H2 motif of a novel Arabidopsis protein. J. Biol. Chem. 274:27674–27681.
45. Tsai, M. F., C. F. Lo, M. C. van Hulten, H. F. Tzeng, C. M. Chou, C. J. Huang, C. H. Wang, J. Y. Lin, J. M. Vlak, and G. H. Kou. 2000. Transcriptional analysis of the ribonucleotide reductase genes of shrimp white spot syndrome virus. Virology 277:92–99.
46. Tsai, M. F., H. T. Yu, H. F. Tzeng, J. H. Leu, C. M. Chou, C. J. Huang, C. H. Wang, J. Y. Lin, G. H. Kou, and C. F. Lo. 2000. Identification and characterization of a shrimp white spot syndrome virus (WSSV) gene that encodes a novel chimeric polypeptide of cellular-type thymidine kinase and thymidylate kinase. Virology 277:100–110.
47. Tween, K. A., L. A. Bulla, and R. A. Consigli. 1980. Characterization of an extremely basic protein derived from granulosis virus nucleocapsid. J. Virol. 33:866–876.
48. van Hulten, M. C., M. F. Tsai, C. A. Schipper, C. F. Lo, G. H. Kou, and J. M. Vlak. 2000. Analysis of a genomic segment of white spot syndrome virus of shrimp containing ribonucleotide reductase genes, and repeat regions. J. Gen. Virol. 81:307–316.
49. van Hulten, M. C., M. Westenberg, S. D. Goodall, and J. M. Vlak. 2000. Identification of two major virion protein genes of white spot syndrome virus of shrimp. Virology 266:227–236.
50. van Hulten, M. C., R. W. Goldbach, and J. M. Vlak. 2000. Three functionally diverged major structural proteins of white spot syndrome virus evolved by gene duplication. J. Gen. Virol. 81:2525–2529.
51. Van Orden, K., and J. K. Nyborg. 2000. Insight into the tumor suppressor function of CBP through the viral oncoprotein tax. Gene Expr. 9:29–36.
52. Vink, C., E. Beuken, and C. A. Bruggeman. 2000. Complete DNA sequence of the rat cytomegalovirus genome. J. Virol. 74:7656–7665.
53. Volkman, L. E. 1995. Baculoviridae, p. 104–113. In F. A. Murphy and C. M. Fauquet (ed.), Virus taxonomy. Sixth report of the International Committee on Taxonomy of Viruses. Springer-Verlag, New York, N.Y.
54. Wilson, M. E., and R. A. Consigli. 1985. Functions of a protein kinase activity associated with purified capsids of the granulosis virus infecting Plodia interpunctella. Virology 143:526–535.
55. Wilson, M. E., T. H. Mainprize, P. D. Friesen, and L. K. Miller. 1987. Location, transcription, and sequence of a baculovirus gene encoding a small arginine-rich polypeptide. J. Virol. 61:661–666.
56. Yang, F., W. Wang, and X. Xu. 1997. A simple and efficient method for purification of prawn baculovirus DNA. J. Virol. Methods 67:1–4.
57. Zhan, W. B., and Y. H. Wang. 1998. White spot syndrome virus infection of cultured shrimp in China. J. Aquat. Anim. Health 10:405–410.