Journal of General Virology (2008), 89, 312–326
DOI 10.1099/vir.0.83236-0
Molecular characterization of begomoviruses and DNA satellites from Vietnam: additional evidence that the New World geminiviruses were present in the Old World prior to continental separation
Cuong Ha,1,2 Steven Coombs,1 Peter Revill,1,3 Rob Harding,1 Man Vu2 and James Dale1
Correspondence: Rob Harding, [email protected]
1 Tropical Crops and Biocommodities Domain, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane 4001, Australia
2 Department of Plant Pathology, Hanoi Agriculture University, Gialam, Hanoi, Vietnam
Received 14 June 2007; accepted 13 September 2007
Sixteen viruses, belonging to 16 species of begomovirus, that infect crops and weeds in Vietnam were identified. Sequence analysis of the complete genomes showed that nine of the viruses (six monopartite and three bipartite) belong to novel species and five of them were identified in Vietnam for the first time. Additionally, eight DNA-b and three nanovirus-like DNA-1 molecules were also found associated with some of the monopartite viruses. Five of the DNA-b molecules were novel. Importantly, a second bipartite begomovirus, Corchorus golden mosaic virus, shared several features with the previously characterized virus Corchorus yellow vein virus and with other bipartite begomoviruses from the New World, supporting the hypothesis that New World-like viruses were present in the Old World. This, together with a high degree of virus diversity that included putative recombinant viruses, satellite molecules and viruses with previously undescribed variability in the putative stem–loop sequences, suggested that South-East Asia, and Vietnam in particular, is one of the origins of begomovirus diversity.
3 Present address: Victorian Infectious Diseases Reference Laboratory, 10 Wreckyn St, North Melbourne, Victoria 3051, Australia.
The GenBank/EMBL/DDBJ accession numbers for the sequences reported in this paper are DQ641688–DQ641719 (see Table 1).
A supplementary table showing the full names and abbreviations of all reference sequences used in the DNA-A analysis is available with the online version of this paper.
0008-3236 © 2008 SGM. Printed in Great Britain

INTRODUCTION
The family Geminiviridae is one of the largest plant virus families; its members have a circular, single-stranded DNA (ssDNA) genome of approximately 2.7–5.2 kb encapsidated within twinned (geminate) icosahedral virions. Based on their genome arrangement and biological properties, geminiviruses are classified into one of four genera: Mastrevirus, Curtovirus, Topocuvirus and Begomovirus (Stanley et al., 2005). The largest genus, Begomovirus, currently contains 132 species (Fauquet & Stanley, 2005) that have either bipartite genomes (DNA-A and DNA-B) or monopartite genomes resembling DNA-A (hereafter called DNA-A). DNA-A typically has six open reading frames (ORFs): AV1/V1 (coat protein, CP) and AV2/V2 (AV2/V2 protein) on the virion-sense strand, and AC1/C1 (replication initiation protein, Rep), AC2/C2 (transcriptional activator, TrAP), AC3/C3 (replication enhancer, REn) and AC4/C4 (AC4/C4 protein) on the complementary-sense strand. DNA-B has two ORFs encoding movement proteins: BV1 (nuclear shuttle protein, NSP) on the virion-sense strand and BC1 (movement protein, MP) on the complementary-sense strand (Rojas et al., 2005; Seal et al., 2006).

The opposing transcription units of begomovirus DNA-A and -B molecules are separated by an intergenic region (IR) that generally shares a highly conserved region of approximately 200 nt, named the common region (CR) (Lazarowitz, 1992). The CR contains an origin of replication (ori) that includes a stem–loop structure containing an invariant nonanucleotide (TAATATTAC) sequence whose T7–A8 site is required for cleaving and joining viral DNA during replication, and conserved iterated sequences (iterons) required for specific recognition and binding by Rep during replication (Arguello-Astorga et al., 1994; Fontes et al., 1994a, b).

Based on phylogenetic studies and genome arrangement, begomoviruses have been divided broadly into two groups: the Old World (OW) viruses (eastern hemisphere, Europe, Africa, Asia) and the New World (NW) viruses (western hemisphere, the Americas) (Padidam et al., 1999; Paximadis et al., 1999; Rybicki, 1994). Begomovirus genomes have a number of characteristics that distinguish Old World and New World viruses. All indigenous New World begomoviruses are bipartite, whereas both bipartite and monopartite begomoviruses are present in the Old World. In addition, DNA-A of bipartite begomoviruses from the New World lacks an AV2 ORF (Rybicki, 1994; Stanley et al., 2005). New World begomoviruses also have an N-terminal PWRsMaGT motif in the CP that is absent from Old World viruses (Harrison et al., 2002). Until recently, it was thought that New World viruses arose more recently than Old World viruses, evolving after continental separation of the Americas from Gondwana (Rybicki, 1994). However, we recently identified a virus indigenous to Vietnam, Corchorus yellow vein virus (CoYVV), that resembles New World viruses more closely than Old World viruses (Ha et al., 2006). This conclusion was based primarily on phylogenetic analysis, the absence of an AV2 ORF and the presence of an N-terminal PWRLMAGT motif in the CP. The presence of CoYVV in Vietnam suggests that New World-like viruses were probably present in the Old World prior to the Gondwana separation although, to date, CoYVV is the only known New World geminivirus that is indigenous to the Old World.

Recently, two additional circular ssDNA molecules, known as DNA-b and nanovirus-like DNA-1 (DNA-1), have been identified in association with some Old World monopartite begomoviruses (Briddon & Stanley, 2006). These ‘satellite’ molecules are approximately half the size of the ‘helper’ begomovirus. DNA-b molecules contain one major ORF (bC1) on their complementary strand, depend on the helper virus for replication and are responsible for symptom induction (Briddon & Stanley, 2006). Begomovirus-associated DNA-1 molecules are similar to nanovirus-encoded DNA-1 molecules in that they contain one major ORF that encodes a Rep protein and they replicate autonomously. The role of the begomovirus-associated DNA-1-like molecules in disease progression is unclear (Briddon et al., 2004).
To date, the only begomoviruses reported in Vietnam have been CoYVV (Ha et al., 2006), tomato leaf curl Vietnam virus (ToLCVV) (Green et al., 2001), tomato yellow leaf curl Kanchanaburi virus (TYLCKaV) (GenBank accession no. DQ169054) and two viruses infecting cucurbits, namely squash leaf curl China virus (SLCCNV) and Luffa yellow mosaic virus (LYMV) (Revill et al., 2003, 2004). In this paper, we have identified and characterized numerous additional geminiviruses and associated DNA molecules infecting crop and weed species throughout Vietnam, and provide further evidence that indigenous ‘New World’ viruses were present in the Old World.
METHODS
Plant samples and DNA extraction. Samples were collected from a range of crop and weed plants exhibiting characteristic geminivirus symptoms (vein yellowing, leaf curling, chlorosis and stunting) and were dried on silica gel and stored at room temperature until use. Total DNA was extracted from the dried plant samples by using a DNeasy Plant Mini kit (Qiagen) according to the manufacturer’s instructions.

Virus taxonomy. New virus species names were assigned by using the rules proposed by the ICTV Geminiviridae Study Group (Fauquet et al., 2003; Fauquet & Stanley, 2005). Demarcation of new viral species was based on DNA-A sequence comparisons, using a threshold of 89 % nucleotide sequence identity (Fauquet et al., 2003). The sequence and arrangement of iterons, Rep N-terminal iteron-related domains (IRD) (Arguello-Astorga et al., 1994; Arguello-Astorga & Ruiz-Medrano, 2001) and recombinant regions were also considered in the taxonomic decisions. As a sanctioned taxonomic system was not available for naming the begomovirus-associated satellite molecules (DNA-b and DNA-1), names were assigned by using the protocol adopted by Zhou et al. (2003).

PCR
DNA-A. The sequences of degenerate primers, BegoAFor1 and BegoARev1, and the PCR conditions used to detect DNA-A molecules have been described previously (Ha et al., 2006). These primers were used to detect DNA-A from all viruses except Corchorus golden mosaic virus (CoGMV). From the sequences amplified by degenerate primers, adjacent, outwardly extending, specific primers were designed to amplify the complete DNA-A molecules, using the Expand Long Template PCR system (Roche). For CoGMV, DNA-A was initially amplified from three different plants by using outwardly extending, CoYVV-specific primers 201For/201Rev1 (Ha et al., 2006). As these primers were derived from the CoYVV sequence, the sequence at the CoYVV priming sites was obtained by reamplification with the degenerate primers BegoAFor1 and BegoARev1 (Ha et al., 2006).

DNA-B. With the exception of CoGMV, an antisense, specific primer located in the DNA-A CR was used in combination with a degenerate primer specific for DNA-B (BegoBFor, 5′-CCIDHIGCRTTRAWIGGIACYTG-3′) to amplify a product of approximately 0.7 kbp. The PCR conditions for DNA-B amplification were as described previously (Ha et al., 2006). For CoGMV, DNA-B was initially obtained by using two primers specific for the BV1 gene of CoYVV (201BV1For, 5′-CGCTGATGATAAGATGACCAAACA-3′; 201BV1Rev, 5′-ACGCCCCATTACAACTATCAACAT-3′). The complete DNA-B molecules were subsequently amplified by using adjacent, outwardly extending, specific primers and the Expand Long Template PCR system.

DNA-b. To detect DNA-b molecules, two consensus outwardly extending primers were designed in the satellite conserved region (SCR) (BetaFor2, 5′-TAGCTACGCCGGAGCTTAGCTCG-3′; BetaRev2, 5′-AAGGCTGCTGCGTAGCGTAGTGG-3′). The PCR conditions for DNA-b amplification were as described previously (Ha et al., 2006) except that an annealing temperature of 55 °C was used, due to a high G+C content. From the sequences amplified by using BetaFor2 and BetaRev2, adjacent, outwardly extending, specific primers were designed and used in the Expand Long Template PCR system to amplify the complete DNA-b molecules.

Nanovirus-like DNA-1. To detect the nanovirus-like DNA-1 molecules, degenerate, outwardly extending primers (NLDNA1For, 5′-TGGTTYTATWCACGTGGHGG-3′; NLDNA1Rev, 5′-ARAWGATAGTKCKRTCATCTG-3′) were designed from the conserved region of the DNA-1 Rep gene. The PCR conditions for DNA-1 amplification were as described previously (Ha et al., 2006) except that the annealing temperature used was 46 °C. Adjacent, outwardly extending, specific primers were subsequently designed from the NLDNA1For/NLDNA1Rev-primed amplicon sequences, and were used in the Expand Long Template PCR system to amplify the entire DNA-1 molecules.
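The degenerate primers listed above use IUPAC ambiguity codes, so each primer name stands for a pool of oligonucleotides. The following minimal sketch, which is not part of the published protocol, shows one way to count the variants in such a pool and to test whether a given target sequence is covered. The function names are hypothetical, and treating inosine (I) as a single universal base that matches any nucleotide is an assumption made for illustration.

```python
import re

# IUPAC nucleotide ambiguity codes. Treating inosine ("I") as a wildcard that
# matches any base is an assumption made for this illustration.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def degeneracy(primer):
    """Count the distinct oligonucleotides in a degenerate primer pool.

    Inosine is one physical base, so it contributes a factor of 1.
    """
    count = 1
    for base in primer.upper():
        if base != "I":
            count *= len(IUPAC[base])
    return count


def primer_pattern(primer):
    """Compile a regular expression matching every sequence the primer covers."""
    classes = []
    for base in primer.upper():
        options = IUPAC[base]
        classes.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(classes))


# Primer sequences as given in the Methods (5' to 3').
NLDNA1_FOR = "TGGTTYTATWCACGTGGHGG"
BEGO_B_FOR = "CCIDHIGCRTTRAWIGGIACYTG"

print(degeneracy(NLDNA1_FOR))
print(degeneracy(BEGO_B_FOR))
print(primer_pattern(NLDNA1_FOR).fullmatch("TGGTTCTATACACGTGGAGG") is not None)
```

Counting inosine as one variant reflects that it is a single synthesized base rather than a mixture; a different convention would change the reported pool size but not the matching behaviour.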
C. Ha and others
Cloning and sequencing. Amplicons were purified from agarose gels by using standard protocols, ligated to the plasmid vector pGEM-T Easy (Promega) and transformed into Escherichia coli XL1-Blue competent cells (Stratagene). Cloned plasmids were purified by using a Wizard Miniprep kit (Promega) and three clones for each sample were sequenced in both orientations by using an ABI PRISM BigDye Terminator kit (PE Applied Biosystems) at the Australian Genomic Research Facility (University of Queensland).

Sequence analysis. The genomic sequences of DNA-A, DNA-B, DNA-b and DNA-1 molecules were assembled from contiguous sequences by using the SEQMAN program (DNASTAR). ORFs were identified by using the Vector NTI Suite 7 program. To determine whether the molecules were similar to known viral/satellite sequences, they were initially analysed by using the BLAST program available at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/BLAST/).
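As an illustration of the ORF identification step, the sketch below scans both strands of a circular genome for ATG-initiated reading frames. It is a simplified stand-in and does not reproduce the Vector NTI algorithm; the function names, the `min_codons` cut-off and the choice to keep only the longest ORF per stop codon are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome, min_codons=100):
    """List ORFs on a circular genome as (strand, start, length_nt) tuples.

    start is 0-based on the scanned strand; length_nt includes the stop codon.
    Only the longest ORF ending at each stop codon is reported.
    """
    n = len(genome)
    longest = {}
    strands = (("+", genome.upper()), ("-", reverse_complement(genome.upper())))
    for strand, seq in strands:
        doubled = seq + seq  # lets a reading frame run across the origin
        for start in range(n):
            if doubled[start:start + 3] != "ATG":
                continue
            pos = start
            stop_found = False
            # Walk codon by codon, but never further than one turn of the circle.
            while pos + 3 <= start + n:
                if doubled[pos:pos + 3] in STOP_CODONS:
                    stop_found = True
                    break
                pos += 3
            if not stop_found:
                continue
            if (pos - start) // 3 < min_codons:
                continue
            key = (strand, pos % n)
            length = pos + 3 - start
            if key not in longest or length > longest[key][2]:
                longest[key] = (strand, start, length)
    return sorted(longest.values())


# Toy circular sequence with one short ORF on the "+" strand.
toy = "CC" + "ATG" + "GCA" * 4 + "TGA" + "CC"
print(find_orfs(toy, min_codons=3))
```

Doubling the sequence is a simple way to respect circularity: an ORF that starts near the end of the linearized sequence and finishes after the origin is found without special-casing.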
Sequences were aligned with the CLUSTAL_X program (Thompson et al., 1997) and phylogenetic trees were constructed by using the neighbour-joining algorithm and viewed with the TreeView program (Page, 1996). Nucleotide identities were determined by using the MEGALIGN program (DNASTAR). Potential gene-recombination events were analysed by using the Recombination Detection Program version 2.0 (RDP2) (Martin et al., 2005), also available online (http://darwin.uvigo.es/rdp/rdp.html). The recombination analyses were implemented by using six automated methods: RDP (Martin & Rybicki, 2000; Martin et al., 2005), GENECONV (Padidam et al., 1999), BootScan (Salminen et al., 1995), SiScan (Gibbs et al., 2000; Salminen et al., 1995), MaxChi and CHIMAERA (Posada & Crandall, 2001; Smith, 1992), using default parameters except that the option ‘Reference sequence selection’ was set at ‘internal references only’.
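The species demarcation rule described under Virus taxonomy can be expressed as a short calculation on a pairwise alignment. The sketch below is illustrative only and is not the MEGALIGN procedure: the handling of gaps (columns gapped in both sequences are skipped, single gaps count as mismatches) and the treatment of a value exactly at the threshold are assumptions.

```python
SPECIES_THRESHOLD = 89.0  # % DNA-A nucleotide identity (Fauquet et al., 2003)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical positions between two aligned sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch (an assumption, other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def same_species(identity, threshold=SPECIES_THRESHOLD):
    """True when DNA-A identity reaches the demarcation threshold."""
    return identity >= threshold


print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(same_species(84.7))
```

Under this rule, a DNA-A identity such as the 84.7 % reported for SiYVVNV and SiYMCNV falls below the threshold, so the two would be treated as distinct species on sequence identity alone.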
RESULTS
Identification of novel begomoviruses
Nine novel begomoviruses infecting three crop species and six weeds were identified in this study (Table 1). These viruses were named Corchorus golden mosaic virus (CoGMV), infecting jute mallow (Corchorus capsularis); kudzu mosaic virus (KuMV), infecting kudzu (Pueraria montana); Clerodendrum golden mosaic virus (ClGMV), infecting glory bower (Clerodendrum philippinum); Spilanthes yellow vein virus (SpYVV), infecting paracress (Spilanthes paniculata); Mimosa yellow leaf curl virus (MiYLCV), infecting mimosa (Mimosa sp.); Sida yellow vein Vietnam virus (SiYVVNV), infecting sida (Sida rhombifolia); tomato yellow leaf curl Vietnam virus (TYLCVNV), infecting tomato; Erectites yellow mosaic virus (ErYMV), infecting fireweed (Erectites valerianifolia); and Ludwigia yellow vein Vietnam virus (LuYVVNV), infecting willow primrose (Ludwigia octovalvis). Three of the viruses (CoGMV, KuMV and ClGMV) were bipartite, whilst the other six viruses were apparently monopartite. A DNA-b molecule was found associated with four of the monopartite viruses (MiYLCV, SiYVVNV, TYLCVNV and ErYMV) and a nanovirus-like DNA-1 molecule was detected in association with two of these viruses (MiYLCV and SiYVVNV).

Five previously characterized, monopartite begomovirus species were also identified in Vietnamese crop and weed species for the first time (Table 1). These were papaya leaf curl China virus (PaLCuCNV), infecting tobacco; Lindernia anagallis yellow vein virus (LaYVV), infecting false pimpernel (Lindernia procumbens); Alternanthera yellow vein virus (AlYVV), infecting zinnia (Zinnia elegans) and eclipta (Eclipta prostrata); Sida leaf curl virus (SiLCV), infecting abutilon (Abutilon indicum); and Ludwigia yellow vein virus (LuYVV), infecting willow primrose. A DNA-b molecule was identified in plants infected with PaLCuCNV, LaYVV, AlYVV and SiLCV, whilst plants infected with SiLCV also harboured a nanovirus-like DNA-1 molecule. In addition, two virus species previously reported in Vietnam were identified infecting tomato and eggplant, namely tomato leaf curl Vietnam virus (ToLCVV) and tomato yellow leaf curl Kanchanaburi virus (TYLCKaV), respectively (Table 1).

Sequence and recombination analysis
DNA-A and DNA-B. The genomes of CoGMV, KuMV and ClGMV were bipartite. All DNA-A molecules contained four complementary-sense ORFs (AC1, AC2, AC3 and AC4) and two virion-sense ORFs (AV1 and AV2), with the exception of CoGMV, which did not contain an AV2 ORF. The stem–loop sequence in the putative intergenic region of all DNA-A molecules contained the nonanucleotide sequence TAATATTAC, with the exception of CoGMV, which contained the sequence TATTATTAC. The CoGMV, KuMV and ClGMV DNA-B molecules each possessed two major ORFs: BC1 on the complementary-sense strand and BV1 on the virion-sense strand, separated by an IR that contained a CR shared with DNA-A. The CR sequences between the cognate DNA components of CoGMV, KuMV and ClGMV shared 66.2, 58.7 and 87.4 % identity, respectively. The unusually low identities of the CoGMV and KuMV CR sequences were due to differences in the region between the TATA box and stem–loop, as well as almost 50 % variability in the putative stem sequences that spanned the conserved TAATATTAC sequence in KuMV. For all three viruses, however, the sequence and arrangement of the iterons in the CRs of both components were identical (Fig. 1).

Nucleotide-sequence comparisons of the entire genome showed that CoGMV was related more closely to CoYVV and New World viruses than to other Old World viruses. However, the CoGMV DNA-A and DNA-B molecules were only 71.3 and 50.9 % identical to the respective CoYVV sequences (Table 2). In addition, the sequence and arrangement of the iterons and IRD differed for the two viruses (Fig. 1), providing further evidence that they were two distinct species. The N-terminal region of the CoGMV CP encoded a putative PWRsMaGT motif, identical to that encoded by CoYVV and other New World begomoviruses (Table 3). In addition, the CP N-terminal region of all Old World begomoviruses, with the exception of CoGMV and CoYVV, encoded either two or three basic domains, typically KR, KVRRR and K/RRRR, that formed part of
Table 1. Identities and sizes of begomoviruses and associated satellites isolated from Vietnam in this study

| Name | Abbreviation | Isolate | GenBank accession numbers* | Natural host | Locality | DNA-A† (nt) | DNA-B (nt) | DNA-b (nt) | DNA-1 (nt) |
|---|---|---|---|---|---|---|---|---|---|
| Proposed novel viruses | | | | | | | | | |
| Corchorus golden mosaic virus | CoGMV | [Vietnam : Hanoi : Jute : 2004] | DQ641688, (DQ641689) | Jute mallow, crop | Hanoi | 2677 | 2649 | | |
| Kudzu mosaic virus | KuMV | [Vietnam : Hoabinh : Kudzu : 2004] | DQ641690, (DQ641691) | Kudzu, crop | Hoabinh | 2731 | 2672 | | |
| Clerodendrum golden mosaic virus | ClGMV | [Vietnam : Sonla : Clerodendrum : 2000] | DQ641692, (DQ641693) | Glory bower, weed | Sonla | 2767 | 2757 | | |
| Spilanthes yellow vein virus | SpYVV | [Vietnam : Dalat : Spilanthes : 2004] | DQ641694 | Paracress, weed | Dalat | 2761 | | | |
| Mimosa yellow leaf curl virus | MiYLCV | [Vietnam : Binhduong : Mimosa : 2004] | DQ641695, DQ641710, DQ641719 | Mimosa, weed | Binhduong | 2757 | | 1358 | 1378 |
| Sida yellow vein Vietnam virus | SiYVVNV | [Vietnam : Hanoi : Sida : 2000] | DQ641696, DQ641712, DQ641718 | Sida, weed | Hanoi | 2753 | | 1340 | 1372 |
| Tomato yellow leaf curl Vietnam virus | TYLCVNV | [Vietnam : Hanoi : Tomato : 2004] | DQ641697, DQ641714 | Tomato, crop | Hanoi | 2745(a) | | 1356 | |
| Erectites yellow mosaic virus | ErYMV | [Vietnam : Hoabinh : Erectites : 2004] | DQ641698, DQ641713 | Fireweed, weed | Hoabinh | 2751 | | 1342 | |
| Ludwigia yellow vein Vietnam virus | LuYVVNV | [Vietnam : Hochiminh : Ludwigia : 2000] | DQ641699 | Primrose willow, weed | Hochiminh | 2751(b) | | | |
| Previously characterized viruses | | | | | | | | | |
| Papaya leaf curl China virus | PaLCuCNV | [Vietnam : Hatay : Tobacco : 2004] | DQ641700, DQ641709 | Tobacco, crop | Hatay | 2754 | | 1322 | |
| Lindernia anagallis yellow vein virus | LaYVV | [Vietnam : Hanoi-Lindernia : 2004] | DQ641701, DQ641715 | False pimpernel, weed | Hanoi | 2740 | | 1346 | |
| Tomato yellow leaf curl Kanchanaburi virus | TYLCKaV | [Vietnam : Binhduong : Eggplant : 2004] | DQ641702 | Eggplant, crop | Binhduong | 2752 | | | |
| Alternanthera yellow vein virus | AlYVV | [Vietnam : Hue : Zinnia : 2004] | DQ641703, DQ641716 | Zinnia, flower | Hue | 2744 | | 1344 | |
| Alternanthera yellow vein virus | AlYVV | [Vietnam : Hanoi : Eclipta : 2004] | DQ641704 | Eclipta, weed | Hanoi | 2745 | | | |
| Tomato leaf curl Vietnam virus | ToLCVV | [Vietnam : Hanoi : Tomato : 2004] | DQ641705 | Tomato, crop | Hanoi | 2775(a) | | | |
| Sida leaf curl virus | SiLCV | [Vietnam : Thanhhoa : Abutilon : 2000 : 61] | DQ641706, DQ641711, DQ641717 | Abutilon, weed | Thanhhoa | 2760(c) | | 1367 | 1379 |
| Sida leaf curl virus | SiLCV | [Vietnam : Thanhhoa : Abutilon : 2000 : 62] | DQ641707 | Abutilon, weed | Thanhhoa | 2762(c) | | | |
| Ludwigia yellow vein virus | LuYVV | [Vietnam : Hochiminh : Ludwigia : 2000] | DQ641708 | Primrose willow, weed | Hochiminh | 2757(b) | | | |

*GenBank accession numbers for DNA-B are in parentheses; the remaining accession numbers in each row are listed in the order DNA-A, DNA-b, DNA-1.
†Molecules assigned the same lower-case letter in parentheses were isolated from the same plant.
the nuclear-localization signal (NLS). In contrast, the CP N-terminal sequence of CoGMV, CoYVV and all New World begomoviruses encoded only the first basic domain, KR (Table 3).

All other viruses identified in this study were related more closely to Old World begomoviruses than to viruses from the New World. The sequences of the bipartite viruses KuMV and ClGMV were distinct from any sequences present in GenBank. KuMV was related most closely to horsegram yellow mosaic virus (HgYMV), although their DNA-A and DNA-B molecules shared only 65 and 45 % identity, respectively (Table 2). DNA-A of ClGMV was most similar to the genome of a putative novel monopartite virus identified in the current study (TYLCVNV, 68.8 %), whereas ClGMV DNA-B showed most similarity
Fig. 1. Iteron sequences and corresponding iteron-related domain (IRD) sequences in the N-terminal regions of Rep (Arguello-Astorga & Ruiz-Medrano, 2001) of selected begomoviruses from this study and closely related viruses. Iteron sequences are underlined; complementary iteron sequences (Arguello-Astorga et al., 1994) are shown in italics; TATA motifs are boxed; IRD sequences are shown in bold.
to DNA-B from tomato yellow leaf curl Thailand virus (TYLCTHV, 38.7 %), a bipartite virus (Table 2).

Most of the new monopartite viruses identified in this study showed significant sequence identity to known virus sequences (Table 2), with the exception of SpYVV, which shared only 65.7 % identity with the most closely related virus, tobacco leaf curl Yunnan virus (TbLCYNV). Over the entire nucleotide sequence, MiYLCV was most similar (82.3 %) to TYLCVNV, whilst SiYVVNV was most similar (84.7 %) to Sida yellow mosaic China virus (SiYMCNV). Although this identity is close to the 89 % threshold for classification as the same virus species, marked differences in both the sequences and arrangement of the iterons and the IRD of SiYVVNV and SiYMCNV (Fig. 1) suggested that they warrant classification as distinct species.

Although the complete nucleotide sequence of TYLCVNV was most similar (85.2 %) to that of PaLCuCNV, the nucleotide sequence of the TYLCVNV V1 ORF was almost identical to that of ToLCVV (97.7 %), whilst the C1 nucleotide sequence showed most similarity to that of Ageratum yellow vein virus (AYVV, 88.5 %) (Table 2). Recombination analysis of the TYLCVNV genome (Fig. 2) identified a 929 nt region in V1 that shared 97.4 % identity with the cognate region of ToLCVV. This putative recombinant fragment encompassed 31 nt of the IR 3′ terminus, the complete V2 ORF and most of V1, with the exception of 33 nt at the V1 3′ terminus. The overall nucleotide identity of TYLCVNV and ToLCVV was 83.2 % and the TYLCVNV IR was most similar to that of ToLCVV (82.5 %) (Table 2). However, the arrangement and sequence of the TYLCVNV iterons and IRD were almost identical to those of AYVV (Fig. 1), suggesting that TYLCVNV may have emerged through two recombination events.

The nucleotide sequence of ErYMV was most similar to that of pepper leaf curl virus (PepLCV) from Malaysia, with identities of 87.5 and 93.4 % for the complete genome and the V1 sequences, respectively (Table 2). However, the nucleotide sequences of the ErYMV C1 ORF and IR were most similar to those of tomato yellow leaf curl China virus (TYLCCNV; 88.9 and 77.9 %, respectively) (Table 2). Recombination analysis (Fig. 2) identified a 451 nt region in the C1 ORF that shared 96.9 % identity with the cognate region in TYLCCNV, although the sequence identity of the complete genomes was only 81.4 %. This region spanned 364 nt downstream of the 5′ end of the C1 ORF (covering 208 nt of the 5′ end of the C4 ORF), to position 2680 (22 nt upstream of the TATA box, 71 nt downstream of the nicking site). The nucleotide sequence identity of ErYMV and PepLCV in this region was only 82.5 %. This putative recombinant region contained the iterons and the IRD and, whereas the virion-sense iteron sequences of ErYMV, TYLCCNV and PepLCV were all identical, the ErYMV IRD and complementary iteron sequences were more similar to those of TYLCCNV than to those of PepLCV (Fig. 1).

The nucleotide sequence of LuYVVNV was most similar to that of LuYVV from China, with nucleotide identities of 87.1 and 97 % for the complete genome and V1 ORF, respectively (Table 2). The high level of sequence identity
http://vir.sgmjournals.org
Table 2. Percentage nucleotide sequence identities of the DNA-A and DNA-B complete genomes, intergenic regions and selected ORFs for viruses identified in this study Full virus names are provided in Fig. 3. Asterisks indicate viruses identified in this study. ID, Identity. Virus
DNA-A overall Virus (accession no.)
DNA-A IR
AV1/V1
ID Virus ID (%) (accession no.) (%)
Virus (accession no.)
48.9
CoYVV (AY727903) HgYMV (AJ627904) ToLCVV*
Proposed novel viruses CoGMV CoYVV (AY727903) KuMV HgYMV (AJ62790) ClGMV TYLCVNV*
68.8
CoYVV (AY727903) TbLCzV (AF350330) ToLCVV*
SpYVV
65.7
TYLCVNV*
42.3
82.3
ToLCJV (AB100304) StaLCuV (AJ810157) ToLCVV (AF264063) TYLCCNV*
68.5
LuYVV (AJ965539)
78.2
MiYLCV SiYVVNV
TbLCYNV (AJ566744) TYLCVNV*
65.0
84.7 85.2 87.5 87.1
50.7 37.1
78.6 82.5 77.9
TbLCYNV (AJ566744) PaLCuCNV (AJ704604) SiYMCNV (AJ810096) ToLCVV (AF264063) PepLCV (AF214287) LuYVV (AJ965539)
DNA-B overall
ID Virus ID (%) (accession no.) (%)
Virus (accession no.)
CoYVV (AY727903) HgYMV (AJ627904) ToLCJV (AB100304) MiYLCV*
CoYVV (AY727904) HgYMV (AJ627905) TYLCKaV (AF511529)
81.0 75.4 74.7 73.3 91.4 93.5 97.7 93.4 97.0
74.5 68.8 76.9
BV1
BC1
ID Virus ID Virus (%) (accession no.) (%) (accession no.) 50.9 45.0 38.7
CoYVV (AY727904) MYMIV (AY271894) TYLCKaV (DQ169054)
56.0 51.0 41.0
CoYVV (AY727904) MYMV (AJ132574) TYLCKaV (DQ169054)
ID (%) 73.6 65.2 60.8
70.0
LaYVV (AY795900) ToLCMV (AF327436) AYVV (X74516) TYLCCNV*
84.7
ToLCLV (AF195782)
83.8
86.4 88.5 88.9
95.8 95.2 99.3 96.2 95.7
98.9 93.3
93.2
95.7
SiYMCNV (AJ810096) TYLCVNV PaLCuCNV (AJ704604) ErYMV PepLCV (AF214287) LuYVVNV LuYVV (AJ965539) Previously characterized viruses PaLCuCNV PaLCuCNV (AJ558116) LaYVV LaYVV (AY795900) TYLCKaV TYLCKaV (DQ169054) AlYVV-[Vietnam : AlYVV Hue : Zinnia : 2004] (AJ965540) AlYVV-[Vietnam : AlYVV Hanoi : (AJ965540) Eclipta : 2004] ToLCVV ToLCVV (AF264063) SiLCV-[Vietnam : SiLCV Thanhhoa : A (AM050730) butilon : 2000 : 61 SiLCV-[Vietnam : SiLCV Thanhhoa : (AM050730) Abutilon : 2000 : 62] LuYVV LuYVV (AJ965539)
71.3
AC1/C1
over the entire genome was very close to the 89 % threshold for delineation of different viral species, suggesting that LuYVVNV and LuYVV may be two distal isolates of the same virus. However, sequence comparisons between LuYVVNV and LuYVV identified a highly disparate region of 586 nt in the LuYVVNV C1 ORF that extended to the 5′ end of the stem–loop in the IR; the nucleotide identity of LuYVVNV and LuYVV in this region was only 55.6 %. Although no putative recombination events were detected by the computational analysis, the LuYVVNV sequence in this region had higher identity with the corresponding fragment of tomato leaf curl Laos virus (ToLCLV; 86 %) than with that of LuYVV, suggesting that this region was a putative recombinant fragment (Fig. 2). Indeed, the iterons and IRD of LuYVVNV were more similar to those of ToLCLV than to those of LuYVV (Fig. 1).

The isolates of the seven previously characterized begomoviruses identified in this study (TYLCKaV, ToLCVV, PaLCuCNV, LaYVV, AlYVV, SiLCV and LuYVV) all shared >93 % identity over the entire nucleotide sequence with the corresponding virus sequences in databases (Table 2). A putative recombinant region of 431 nt, located near the 5′ end of the genome, was detected between SiLCV-[Vietnam : Thanhhoa : Abutilon : 2000 : 61] and Stachytarphyta leaf curl virus (StaLCuV) (Fig. 2).

DNA-b and DNA-1. The DNA-b molecules associated with AlYVV, ErYMV, LaYVV, MiYLCV, PaLCuCNV, SiYVVNV, SiLCV and TYLCVNV ranged in size from 1322 to 1367 nt (Table 1) and each encoded one major complementary-sense ORF (bC1). Each DNA-b molecule contained (i) a putative stem–loop structure with the loop sequence, TAATATTAC, (ii) an SCR immediately upstream of the putative stem–loop sequence, and (iii) an adenosine (A)-rich region upstream of the SCR. Sequence comparisons of the complete DNA-b genomes (Table 4) revealed that they could be divided into two distinct groups. One group consisted of DNA-b molecules associated with MiYLCV, SiLCV and PaLCuCNV; these all had high sequence identities with the respective DNA-b sequences in GenBank (80.3 % for MiYLCVb and GenBank accession no. AJ54249180; 92.5 % for SiLCVb and AM050732; and 96 % for PaLCCNVb and AJ971257). The second group, containing DNA-b molecules associated with AlYVV, ErYMV, LaYVV, SiYVVNV and TYLCVNV, shared <70 % sequence identity with known DNA-b molecules and each other, suggesting that they were all novel satellite molecules.

Nanovirus-like DNA-1 molecules associated with MiYLCV, SiLCV and SiYVVNV comprised 1378, 1379 and 1372 nt, respectively (Table 1). Each DNA contained one large ORF that encoded a putative protein of 315 aa; analyses of these sequences revealed high levels of similarity to the Rep proteins of nanoviruses. Each molecule also contained a putative stem–loop structure, containing the loop sequence TAGTATTAC and an A-rich region immediately downstream of the ORF. The DNA-1 molecules associated with MiYLCV, SiLCV and SiYVVNV had highest sequence identities with DNA-1 molecules associated with tobacco curly shoot virus (83.8 %), SiLCV (84.8 %) and TYLCCNV (69.9 %), respectively (Table 4).

Phylogenetic analyses
DNA-A and DNA-B. Phylogenetic analyses based on the complete nucleotide sequences of DNA-A (Fig. 3a) showed that CoGMV and CoYVV formed a distinct clade that was related more closely to New World begomoviruses than to viruses from the Old World. KuMV grouped tightly with three Old World legume-infecting bipartite begomoviruses: HgYMV, mungbean yellow mosaic virus (MYMV) and mungbean yellow mosaic India virus (MYMIV), to form an intermediate clade. The remaining viruses isolated from Vietnam grouped into one major clade that included both bipartite and monopartite Old World begomoviruses. Nearly identical topologies were observed in phylogenetic trees constructed by using both the amino acid and nucleotide sequences of REn, TrAP and CP (data not shown). Similarly, phylogenetic analysis of DNA-B molecules using either the complete sequence or the BV1 and BC1 nucleotide and amino acid sequences showed that CoGMV and CoYVV clustered closer to New World viruses than to viruses from the Old World. Unlike the tree topology obtained by using DNA-A sequences, however, the legume-infecting viruses clustered more closely with cassava-infecting viruses [African cassava mosaic virus (ACMV), East African cassava mosaic virus (EACMV) and South African cassava mosaic virus (SACMV)] from the Old World (data not shown). The separation of New World and Old World begomoviruses was less distinct using the AC1 sequence or the IR, and there was no separation between Old World and New World viruses using the AC4 sequences (data not shown).

DNA-b and DNA-1. Analysis of the complete DNA-b sequences revealed two major clades, Malvaceae and non-Malvaceae (Fig. 3b). All of the DNA-b molecules isolated from Vietnam fell within the non-Malvaceae clade. With the exception of LaYVV DNA-b, which was related more closely to sequences isolated from Japan and the UK, all DNA-b molecules from Vietnam grouped closely with molecules previously isolated from China or Laos. Interestingly, the tree topologies for many of the DNA-b molecules differed from that of their cognate DNA-A molecules, with the exception of viruses/DNA-b molecules isolated from sida in Vietnam and China and abutilon from Vietnam. In general, the DNA-b molecules isolated from Vietnam showed high sequence diversity, as can be seen from their distal positions on the trees.

Phylogenetic analyses of the nanovirus-like DNA-1 molecules also identified a high degree of sequence diversity (Fig. 3c). Similar to DNA-b, the DNA-1 molecule isolated from abutilon was related most closely to a DNA-1 molecule isolated from sida in China. The cognate
Table 3. Comparison of begomovirus CP sequences showing the deduced N-terminal sequences of viruses identified in the current study compared with a number of begomovirus sequences from the Old and New Worlds

Virus*    N-terminal region† sequence                      Amino acid position  Origin‡
MYMV      MPKRNYDTAFSTPMSNVRRRLTFDTPLSLPATAGSVPAS-A-KRRRW  45                   OW
MYMIV     MPKRTYDTAFSTPISNARRRLNFDTPLMLPASAGGVPTN-M-KRRRW  45                   OW
HgYMV     MPKRNYDTAFSTPGSSVRRRLTYDTPLALPASAGSAPAS-V-RRRRW  45                   OW
TYLCV     MSKRPGDIIISTPVSKVRRRLNFDSPYSSRAAVPIVQGT--NKRRSW  45                   OW
ACMV      MSKRPGDIIISTPGSKVRRRLNFDSPYRNRATAPTVHVT--NRKRAW  45                   OW
ICMV      MSKRPADIIISTPGSKVRRRLNFDSPYSSRAAVPTVRVT---KRQSW  44                   OW
LYMV      MSKRPADIIISTPASKVRRRLNFDSPYVSRAVVPIARVT---KGKAW  44                   Vietnam
SLCCNV    MSKRPADIIISTPASKVRRRLNFDSPYVSRAVVPIARVT---KGKAW  44                   Vietnam
TYLCVNV   MSKRPADIVISTPASKVRRRLNFDSPYTNRAVAPTVLVT--NKRRSW  45                   This study
ToLCVV    MSKRPADIVISTPASKVRRRLNFDSPYVNRAVAPTVLVT--NKRRSW  45                   This study
ErYMV     MSKRPADIVISTPASKVRRRLNFDSPYVSRAAAPTVLVT--NKRRSW  45                   This study
PaLCuCNV  MSKRPADIVISTPASKVRRRLNFDSPYVSRAAAPTVLVT--NKRRSW  45                   This study
MiYLCV    MSKRPADIVISTPASKVRRRSNFDSPYASRAAAPTVLVT--NKRRSW  45                   This study
LaYVV     MSKRPADIVISTPSSKVRRRLNFDSPYASRAAAPTVLVT--SKKRSW  45                   This study
LuYVVNV   MSKRPADIVISTPVSKVRRRLNFDSPGVSRVAARTVLGI--TRKNAW  45                   This study
LuYVV     MSKRPADIVISTPVSKVRRRLNFDSPGVSRAAARTVLGI--TRKNAW  45                   This study
SiYVVNV   MSKRPADIVISTPASKVRRRLNFDSPGMSRAAAPTVLVT--NRKRSW  45                   This study
SpYVV     MSKRPASIVISTPSSKVRRRLNFDSPYANRASAPIVRVT---KGQVW  44                   This study
AlYVV     MSKRAADMIISSSGSRVRRRLNFDSPMARRATAPIVRAT---RKQQW  44                   This study
ClGMV     MAKRAGDIIISTPASKVRRRLNFDSPYQNRVPVLTARGT---RKQLW  44                   This study
KuMV      MTKRNFETAFSSPISSARRRLSYGTPLALPAPAASAQGT--RRRRSW  45                   This study
SiLCV     MSKRPASMAYSSPISSARRRLNFDSPRPSVAAALTAPG--I-RRRRW  44                   This study
TYLCKaV   MPKRSIDTVTSLPMSITRRRLNFGSQYSLPASAPTAPGMSY-KRRAW  46                   This study
CoGMV     M-KREAPWRTNAGTSKVRRALNF-SPRSG-------LGPK---ASAW  36                   This study
CoYVV     MPKRDAPWRLMAGTSKVSRSSNY-SPRGGVSDSGSYLPRRFSRASLW  46                   Vietnam
TGMV      MPKRDAPWRLMAGTSKVSRSANY-SPRGS-------LPKR----DAW  35                   NW
AbMV      MPKRDLPWRSMPGTSKTSRNANY-SPRAR-------IGPRVDKASEW  39                   NW
SiGMV     MPKRELPWRSMAGTSKVSRNANY-SPRAG-------SGPRVHKASEW  39                   NW
ToMoTV    MPKRDRTWRSIAGTSKVSRNANY-SPRTG-------SGPIGNKASEW  39                   NW
BDMV      MPKRDAPWRSMAGTTKVSRNANY-SPRGG-------IGPKMTRAAEW  39                   NW
SiGMCRV   MPKRDVPWRNIAGTSKVSRSSND-SPRAG-------SGPKFYKAARW  39                   NW
SiMoV     MPKRDPSWRQMAGTSKVSRSSNF-SPRGG-------IGPKFNKASEW  39                   NW
CaLCuV    MPKRDAPWRSMAGTSKVSRNANY-SPRAG-------MIHKFDKAAAW  39                   NW
SLCV      MVKRDAPWRLMAGTSKVSRSANF-SPREG-------MGPKFNKAAAW  39                   NW
ToGMoV    MPKRDAPWRLMGGTSKVSRSFNQ-VSRTG-------TGPKFDKAHAW  39                   NW
MaMPRV    MPKRDAPWRSSAGTSKVSRNLNY-SPGG---------GPKSNRANAW  38                   NW
RhGMV     MPKRDAPWRLSAGTSKVSRSANY-SPGGG-------MGPKSNRANAW  39                   NW
DiYMoV    MSKRDAPWRMMVGPSKVRRTLNF-SPGGG-------MGSKSNRASSW  39                   NW

*Full virus names and GenBank accession numbers are provided in Table 1 and Fig. 3.
†Basic domains of the nuclear-localization signals (Guerra-Peraza et al., 2005; Kunik et al., 1998; Unseld et al., 2001) are shown in bold. The 7-PWRsMaGT motifs (Harrison et al., 2002) are underlined. The W residue conserved among all CP sequences at position 39–45 was used to separate the N- and C-terminal regions.
‡OW, Old World; NW, New World.
DNA-A components associated with these satellite molecules from Vietnam and China were both associated with isolates of SiLCV. The DNA-1 molecule from mimosa formed a distinct branch between DNA-1 sequences originating from China. Although the DNA-1 sequence from sida formed a separate branch, its exact position in the tree was not well supported by bootstrap values. Similar to that observed for DNA-b, the DNA-1 sequences were positioned distally on the tree, indicating a high level of sequence diversity in Vietnam and suggesting that the sequences had been present in the region for a considerable period.
Fig. 2. Schematic representation of the recombinant regions relating to SiLCV, TYLCVNV, ErYMV and LuYVVNV. Break points, sizes, original detection methods and multiple comparison-corrected P values (Martin et al., 2005) of the recombinant regions are indicated on the linear map of each virus. Viral origins (shaded parts) of the recombinant regions are also indicated.
DISCUSSION We have identified numerous begomoviruses and DNA satellites infecting crops and weed species in Vietnam. Importantly, we identified a second bipartite virus infecting jute, CoGMV, that was related more closely to New World viruses than to viruses from the Old World. Although CoGMV had some sequence similarity to another jute-infecting virus (CoYVV), it was below the 89 % taxonomic threshold for inclusion in the same species (Fauquet et al., 2003). In addition, differences in the CoGMV and CoYVV iteron sequences and the results of phylogenetic analyses showed that CoGMV and CoYVV are distinct viruses. The
genomes of CoGMV, CoYVV and other New World viruses shared many common features: (i) all are bipartite, (ii) all have a CP N-terminal 7-PWRsMaGT motif, (iii) all lack the second and third basic domains in the CP N-terminal region, and (iv) all lack an AV2 ORF. In addition, the identification of SiLCV infecting abutilon, PaLCuCNV infecting tobacco and AlYVV infecting zinnia and eclipta suggested that these viruses had a wider natural host range than reported previously (Guo & Zhou, 2005, 2006; Wang et al., 2004). The basic domains in the CP N-terminal region form an essential part of the NLS in Old World viruses and their
Table 4. Percentage nucleotide sequence identities of satellite molecules identified in this study, compared with the most closely related sequences

Satellite       ID* (%)  Most closely related sequence in databases: virus isolate (GenBank accession no.)
DNA-b
PaLCuCNVb       96.0     Ageratum yellow vein China virus-associated DNA-b, isolate G66 (AJ971257)
MiYLCVb         80.3     Tomato leaf curl virus-associated DNA-b, Laos isolate (AJ542491)
SiLCVb          92.5     Sida leaf curl virus-associated DNA-b, isolate Hn57 (AM050732)
SiYVVNVb        69.9     Sida yellow mosaic virus-[China]-associated DNA-b, isolate Hn8 (AJ810093)
ErYMVb          65.7     Tomato yellow leaf curl China virus satellite DNA-b, isolate Y146-31 (AJ536623)
TYLCVNVb        66.1     Tomato yellow leaf curl China virus-associated DNA-b, isolate G102 (AM050556)
LaYVVb          39.0     Honeysuckle yellow vein mosaic virus-associated DNA-b (AJ316040)
AlYVVb          65.3     Tobacco curly shoot virus-associated DNA-b, isolate Y115 (AJ457822)
DNA-1
SiLCV DNA-1     84.8     Sida leaf curl virus-associated DNA-1, isolate Hn57 (AM050734)
SiYVVNV DNA-1   69.9     Tomato yellow leaf curl China virus-associated DNA-1, isolate Y87 (AJ579357)
MiYLCV DNA-1    83.9     Tobacco curly shoot virus-associated DNA-1, isolate Y290 (AJ888453)

*ID, Identity.
role in nuclear targeting has been demonstrated for TYLCV (Kunik et al., 1998), ACMV (Unseld et al., 2001) and MYMV (Guerra-Peraza et al., 2005). The reason for the absence of these domains in New World viruses is unclear. Bipartite viruses were thought to have evolved from monopartite viruses by gene duplication and/or DNA acquisition, with gene products encoded by DNA-B providing enhanced viral movement within the host (Rojas et al., 2005). The evolution of bipartite viruses is thought to have occurred before continental separation, due to the presence of bipartite viruses in both the Old and New Worlds (Rojas et al., 2005). All New World viruses lack the AV2 ORF, and it has been proposed that they evolved from a common ancestor that had lost the AV2 ORF after the Gondwana continental separation (Rybicki, 1994). However, the occurrence of both CoYVV (Ha et al., 2006) and CoGMV, bearing features similar to New World viruses, in Vietnam suggests that viruses with characteristics of New World viruses were present in the Old World prior to continental separation. The mechanisms by which bipartite viruses evolved into distinct Old World and New World populations are unclear, although it is possible that this process involves the region encoding the AV2 ORF and the CP N-terminal region. Harrison et al. (2002) reported variability in the N-terminal 50 residues from 27 begomovirus CP sequences originating from six continents. Similarly, in a comparison of the CP N-terminal region of six Old World and four New World CP sequences, Sharma et al. (2005) observed that the first 50 aa were highly variable when comparing all viruses together, but were much more conserved when the two groups were compared separately (approx. 62 and 68 % identity, respectively). In the current study, comparison of the deduced CP sequences from a large number of New World and Old World viruses showed that their CPs were clearly divided into distinct N-terminal and C-terminal regions.
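The within-group versus between-group comparison described here can be illustrated with a minimal Python sketch. It uses six of the aligned N-terminal sequences listed in Table 3 (three Old World, three New World) and is not the authors' analysis: the function names are hypothetical, and identity is assumed to be counted only over alignment columns in which both sequences carry a residue. Because it uses a small subset of sequences and an assumed gap convention, it is not expected to reproduce the values reported in Table 5; it only shows the kind of calculation involved.

```python
from itertools import combinations, product

# Aligned N-terminal CP sequences copied from Table 3 ('-' marks an alignment gap).
OLD_WORLD = {
    "TYLCV": "MSKRPGDIIISTPVSKVRRRLNFDSPYSSRAAVPIVQGT--NKRRSW",
    "ACMV": "MSKRPGDIIISTPGSKVRRRLNFDSPYRNRATAPTVHVT--NRKRAW",
    "ICMV": "MSKRPADIIISTPGSKVRRRLNFDSPYSSRAAVPTVRVT---KRQSW",
}
NEW_WORLD = {
    "TGMV": "MPKRDAPWRLMAGTSKVSRSANY-SPRGS-------LPKR----DAW",
    "AbMV": "MPKRDLPWRSMPGTSKTSRNANY-SPRAR-------IGPRVDKASEW",
    "SiGMV": "MPKRELPWRSMAGTSKVSRNANY-SPRAG-------SGPRVHKASEW",
}


def percent_identity(a, b):
    """Percentage identity of two aligned sequences.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def mean_identity(sequence_pairs):
    """Mean pairwise identity over an iterable of (seq_a, seq_b) pairs."""
    values = [percent_identity(a, b) for a, b in sequence_pairs]
    return sum(values) / len(values)


within_ow = mean_identity(combinations(OLD_WORLD.values(), 2))
within_nw = mean_identity(combinations(NEW_WORLD.values(), 2))
between = mean_identity(product(OLD_WORLD.values(), NEW_WORLD.values()))

print(f"Mean identity within OW: {within_ow:.1f} %")
print(f"Mean identity within NW: {within_nw:.1f} %")
print(f"Mean identity between OW and NW: {between:.1f} %")
```

For this subset, the between-group mean is lower than either within-group mean, which is the pattern the text describes for the N-terminal region.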
The N-terminal region consisted of approximately 39 aa for the New World viruses and approximately 45 aa for the Old World viruses (Table 3). This region was relatively conserved within the two groups (mean 62.9 and 66.8 % identity, respectively), but differed markedly between them (mean 28.8 % identity) (Table 5). In contrast, the C-terminal CP region was conserved in all begomoviruses, irrespective of whether they were from the Old or New World (mean 80.5 % identity between the two groups; Table 5). The higher conservation (mean 92.7 % identity) in the C-terminal region of the New World viruses compared with the Old World viruses (mean 81.6 % identity) (Table 5) supports the hypothesis that New World viruses emerged more recently (Rybicki, 1994). The AV2 ORF (also known as V1 ORF) is involved in symptom development, efficient viral movement and viral DNA accumulation (Padidam et al., 1996; Rigden et al., 1993). The AV2 ORF (approx. 115 aa), which is lacking in New World viruses, completely overlaps the abovementioned CP N-terminal region present in Old World viruses, suggesting that functions normally attributed to
this CP region and AV2 are encoded by the DNA-B component present in all New World viruses. The presence of the N-terminal CP region and the AV2 gene in Old World viruses may explain why many of them are monopartite and also why DNA-B is not required for symptom expression in viruses such as TYLCTHV (Rochester et al., 1990) or Sri Lankan cassava mosaic virus (SLCMV) (Saunders et al., 2002). Among the nine newly identified viruses in this current study, we had difficulties determining the taxonomic status of ErYMV and LuYVVNV. Both viruses had overall nucleotide identities close to the 89 % threshold (87.5 and 87.1 % identity with PepLCV and LuYVV, respectively). Fauquet (2002) reported that most viruses with approximately 87 % identity may be recombinants. Indeed, computer analysis detected a putative recombinant fragment in ErYMV that covered the Rep N-terminal region and Rep-binding site, which contains species-specific factors essential for replication (Arguello-Astorga et al., 1994; Arguello-Astorga & Ruiz-Medrano, 2001; Fontes et al., 1994a, b; Hanley-Bowdoin et al., 2000; Jupin et al., 1995). A putative recombinant fragment, at a similar position, was also detected in LuYVVNV by sequence comparison. Sequence analyses of these fragments suggested that ErYMV and LuYVVNV would have higher affinity, in terms of trans-replication, with TYLCCNV and ToLCLV, respectively, than with their most closely related viruses based on overall nucleotide sequence. This needs to be demonstrated experimentally, however, as the ability of viruses to trans-replicate is considered an important criterion in begomovirus classification (Fauquet et al., 2003). Strict application of the 89 % taxonomy rule and phylogenetic analysis also supported the classification of ErYMV and LuYVVNV as distinct virus species. 
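The species-demarcation reasoning above can be written out as a short Python sketch. The 89 % threshold and the two identity values are taken from the text; the function name, the data layout and the treatment of a value exactly equal to the threshold are assumptions made only for illustration.

```python
# Species demarcation threshold for begomovirus DNA-A nucleotide identity,
# as cited in the text (Fauquet et al., 2003).
SPECIES_THRESHOLD = 89.0

# Closest relative and overall nucleotide identity (%) reported in the text.
closest_relatives = {
    "ErYMV": ("PepLCV", 87.5),
    "LuYVVNV": ("LuYVV", 87.1),
}


def is_distinct_species(identity, threshold=SPECIES_THRESHOLD):
    """Return True when identity falls below the threshold.

    Assumption: a value exactly equal to the threshold is treated as
    belonging to the same species.
    """
    return identity < threshold


for virus, (relative, identity) in closest_relatives.items():
    margin = round(SPECIES_THRESHOLD - identity, 1)
    verdict = "distinct species" if is_distinct_species(identity) else "same species"
    print(f"{virus} vs {relative}: {identity} % ({margin} points below threshold) -> {verdict}")
```

Both viruses fall below the threshold by less than two percentage points, which is why the text treats their status as borderline and looks to recombination analysis and phylogeny for support.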
Although CoYVV and CoGMV share many features in common with New World viruses, their position on phylogenetic trees basal to the New World viruses suggests that they were distinct from New World viruses, but may share a common ancestor. This was further supported by the observations that they originated from the Old World and they shared very low overall sequence identity with the New World viruses [maximum 60.2 % for CoYVV (Ha et al., 2006); 58.4 % for CoGMV]. This scenario is very similar to that of the sweet potato viruses [sweet potato leaf curl virus (SPLCV), sweet potato leaf curl Georgia virus (SPLCGV) and Ipomoea yellow vein virus (IYVV)], which (i) are present in both the New World (Lotrakul et al., 1998) and the Old World (Briddon et al., 2006), (ii) have genome organizations typical of Old World monopartite viruses (Lotrakul & Valverde, 1999), and (iii) formed an evolutionary lineage independent of both Old World and New World viruses (Fauquet & Stanley, 2003). Our phylogenetic analysis, based on the complete DNA-A sequence, identified two geographically defined major clusters (Old World and New World viruses) and three other distinct clusters that were distinguished on the basis of host (legume, sweet potato and jute). The relatively
Fig. 3. Phylogenetic trees based on the complete nucleotide sequences of (a) begomovirus DNA-A, (b) DNA-b and (c) nanovirus-like DNA-1 isolated in this study (bold and shaded) compared with reference sequences available on GenBank. The previously published GenBank sequences originating from Vietnam are shown in bold and underlined. Only bootstrap percentages >50 % (1000 replicates) are shown. (a) ‘B’ after some GenBank accession numbers indicates ‘bipartite’ viruses. See Supplementary Table S1 in JGV Online for the full names and abbreviations of all reference sequences used in the DNA-A analysis. (b) The reference DNA-b sequences represent the different groups reported by Briddon & Stanley (2006). (c) All nanovirus-like DNA-1 sequences available in GenBank were included in the analysis.
Table 5. Comparison of begomovirus CP sequences, showing percentage identities of the N-terminal (N) and C-terminal (C) regions from a number of Old and New World begomoviruses
The highly conserved nature of the C-terminal sequences, irrespective of their Old World (OW) or New World (NW) origins, is highlighted in bold type.

Comparison                               n*   Maximum         Minimum         Mean
                                              N       C       N       C       N       C
Within NW (including CoYVV and CoGMV)    15   87.1    97.1    47.5    87.7    62.9    92.7
Within OW                                57   100.0   100.0   31.1    69.1    66.8    81.6
Between NW and OW                             43.1    88.2    13.3    71.1    28.8    80.5

*n, Number of sequences. Sequences used in the analysis are shown in Fig. 3a (excluding three sweet potato viruses and SiLCV, GenBank accession no. AM050730).
intermediate positions of the sweet potato and jute viruses between the Old World and New World populations suggest that (i) geographical separation plays a less important role in the evolution of the genus Begomovirus than proposed previously, and (ii) host factors are the major selection force driving the evolution of the sweet potato and jute viruses.
Two unexpected observations from this study were (i) that the nonanucleotide sequence of CoGMV was TATTATTAC rather than TAATATTAC and (ii) that the putative stem sequences of DNA-A and DNA-B differed in KuMV. Although the third residue of the nonanucleotide is variable among nanoviruses (TAT/GTATTAC) and animal circoviruses (T/C/AAT/GTATTAC) (Hattermann et al., 2003), this is the first report of such variability in geminiviruses. Similarly, KuMV is the first geminivirus identified with variability in the putative stem sequence. In all other bipartite begomoviruses characterized to date, the putative stem sequence is almost identical in both components, even in viruses exhibiting low identities in the remainder of the CR sequences (Chakraborty et al., 2003; Ha et al., 2006; Hill et al., 1998; Idris & Brown, 2004). These differences are unlikely to affect replication, as the presence of the stem–loop structure, and not the sequence of the stem itself, is important for DNA replication (Orozco & Hanley-Bowdoin, 1996); however, this remains to be demonstrated experimentally.
The majority of the viruses characterized during the current study were isolated from weeds that can serve as important virus reservoirs (Gilbertson et al., 1993; Stonor et al., 2003). Viruses in mixed infections may recombine, resulting in a novel virus that can cross host barriers. Indeed, we were able to detect a number of putative recombination events between SiLCV and StaLCuV, ErYMV and TYLCCNV, and TYLCVNV and ToLCVV, with the TYLCVNV iterons probably originating from an AYVV-like weed-infecting virus.
In addition, the conserved nature of iteron sequences observed between distinct viruses such as ErYMV/TYLCCNV, LuYVVNV/ToLCLV and TYLCVNV/AYVV suggests that they could replicate in a trans-acting manner (Fontes et al., 1994a); this may facilitate gene transfer through recombination-dependent replication (Jeske et al., 2001). The identification of CoGMV in this study, together with our previous characterization of CoYVV (Ha et al., 2006), provides strong evidence of a New World virus present in the Old World prior to Gondwana separation. This, together with a high degree of virus diversity that includes putative recombinant viruses, satellite molecules and viruses with previously undescribed variability in the putative stem–loop sequences, suggests that South-East Asia, and Vietnam in particular, is one of the origins of begomovirus diversity.
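The nonanucleotide observation made earlier in the Discussion (TATTATTAC in CoGMV rather than the usual TAATATTAC) can be checked with a few lines of Python. This is an illustrative sketch only; the function and constant names are hypothetical, and the two sequences are those quoted in the text.

```python
# Nonanucleotide sequences quoted in the Discussion.
CANONICAL_NONANUCLEOTIDE = "TAATATTAC"
COGMV_NONANUCLEOTIDE = "TATTATTAC"


def differing_positions(observed, reference=CANONICAL_NONANUCLEOTIDE):
    """Return the 1-based positions at which observed differs from reference."""
    if len(observed) != len(reference):
        raise ValueError("sequences must be the same length")
    return [
        position
        for position, (obs, ref) in enumerate(zip(observed, reference), start=1)
        if obs != ref
    ]


print(differing_positions(COGMV_NONANUCLEOTIDE))  # prints [3]
```

The only mismatch is at the third residue, the same position that the text notes is variable among nanoviruses and animal circoviruses.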
ACKNOWLEDGEMENTS The authors thank the Australian Centre for International Agricultural Research (ACIAR) for funding this research. C. H. was supported by a QUT International Postgraduate Research Scholarship.
REFERENCES
Arguello-Astorga, G. R. & Ruiz-Medrano, R. (2001). An iteron-related domain is associated to motif 1 in the replication proteins of geminiviruses: identification of potential interacting amino acid-base pairs by a comparative approach. Arch Virol 146, 1465–1485.
Arguello-Astorga, G. R., Guevara-Gonzalez, R. G., Herrera-Estrella, L. R. & Rivera-Bustamante, R. F. (1994). Geminivirus replication origins have a group-specific organization of iterative elements – a model for replication. Virology 203, 90–100.
Briddon, R. W. & Stanley, J. (2006). Subviral agents associated with plant single-stranded DNA viruses. Virology 344, 198–210.
Briddon, R. W., Bull, S. E., Amin, I., Mansoor, S., Bedford, I. D., Rishi, N., Siwatch, S. S., Zafar, Y., Abdel-Salam, A. M. & Markham, P. G. (2004). Diversity of DNA 1: a satellite-like molecule associated with monopartite begomovirus-DNA β complexes. Virology 324, 462–474.
Briddon, R. W., Bull, S. E. & Bedford, I. D. (2006). Occurrence of Sweet potato leaf curl virus in Sicily. Plant Pathol 55, 286.
Chakraborty, S., Pandey, P. K., Banerjee, M. K., Kalloo, G. & Fauquet, C. M. (2003). Tomato leaf curl Gujarat virus, a new begomovirus species causing a severe leaf curl disease of tomato in Varanasi, India. Phytopathology 93, 1485–1495.
Fauquet, C. (2002). Geminivirus species demarcation criteria study case. Webpage: Geminiviridae. http://www.danforthcenter.org/iltab/geminiviridae/
Fauquet, C. M. & Stanley, J. (2003). Geminivirus classification and nomenclature: progress and problems. Ann Appl Biol 142, 165–189.
Fauquet, C. M. & Stanley, J. (2005). Revising the way we conceive and name viruses below the species level: a review of geminivirus taxonomy calls for new standardized isolate descriptors. Arch Virol 150, 2151–2179.
Fauquet, C. M., Bisaro, D. M., Briddon, R. W., Brown, J. K., Harrison, B. D., Rybicki, E. P., Stenger, D. C. & Stanley, J. (2003). Revision of taxonomic criteria for species demarcation in the family Geminiviridae, and an updated list of begomovirus species. Arch Virol 148, 405–421.
Fontes, E. P., Eagle, P. A., Sipe, P. S., Luckow, V. A. & Hanley-Bowdoin, L. (1994a). Interaction between a geminivirus replication protein and origin DNA is essential for viral replication. J Biol Chem 269, 8459–8465.
Fontes, E. P., Gladfelter, H. J., Schaffer, R. L., Petty, I. T. & Hanley-Bowdoin, L. (1994b). Geminivirus replication origins have a modular organization. Plant Cell 6, 405–416.
Gibbs, M. J., Armstrong, J. S. & Gibbs, A. J. (2000). Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573–582.
Gilbertson, R. L., Hidayat, S. H., Paplomatas, E. J., Rojas, M. R., Hou, Y. M. & Maxwell, D. P. (1993). Pseudorecombination between infectious cloned DNA-components of tomato mottle and bean dwarf mosaic geminiviruses. J Gen Virol 74, 23–31.
Green, S. K., Tsai, W. S., Shih, S. L., Black, L. L., Rezaian, A., Rashid, M. H., Roff, M. M. N., Myint, Y. Y. & Hong, L. T. A. (2001). Molecular characterization of begomoviruses associated with leafcurl diseases of tomato in Bangladesh, Laos, Malaysia, Myanmar, and Vietnam. Plant Dis 85, 1286.
Guerra-Peraza, O., Kirk, D., Seltzer, V., Veluthambi, K., Schmit, A. C., Hohn, T. & Herzog, E. (2005). Coat proteins of Rice tungro bacilliform virus and Mungbean yellow mosaic virus contain multiple nuclear-localization signals and interact with importin α. J Gen Virol 86, 1815–1826.
Guo, X. & Zhou, X. (2005). Molecular characterization of Alternanthera yellow vein virus: a new begomovirus species infecting Alternanthera philoxeroides. J Phytopathol 153, 694–696.
Guo, X. J. & Zhou, X. P. (2006). Molecular characterization of a new begomovirus infecting Sida cordifolia and its associated satellite DNA molecules. Virus Genes 33, 279–285.
Ha, C., Coombs, S., Revill, P., Harding, R., Vu, M. & Dale, J. (2006). Corchorus yellow vein virus, a New World geminivirus from the Old World. J Gen Virol 87, 997–1003.
Hanley-Bowdoin, L., Settlage, S. B., Orozco, B. M., Nagar, S. & Robertson, D. (2000). Geminiviruses: models for plant DNA replication, transcription, and cell cycle regulation. Crit Rev Biochem Mol Biol 35, 105–140.
Harrison, B. D., Swanson, M. M. & Fargette, D. (2002). Begomovirus coat protein: serology, variation and functions. Physiol Mol Plant Pathol 60, 257–271.
Hattermann, K., Schmitt, C., Soike, D. & Mankertz, A. (2003). Cloning and sequencing of Duck circovirus (DuCV). Arch Virol 148, 2471–2480.
Hill, J. E., Strandberg, J. O., Hiebert, E. & Lazarowitz, S. G. (1998). Asymmetric infectivity of pseudorecombinants of Cabbage leaf curl virus and Squash leaf curl virus: implications for bipartite geminivirus evolution and movement. Virology 250, 283–292.
Idris, A. M. & Brown, J. K. (2004). Cotton leaf crumple virus is a distinct Western Hemisphere begomovirus species with complex evolutionary relationships indicative of recombination and reassortment. Phytopathology 94, 1068–1074.
Jeske, H., Lutgemeier, M. & Preiss, W. (2001). DNA forms indicate rolling circle and recombination-dependent replication of Abutilon mosaic virus. EMBO J 20, 6158–6167.
Jupin, I., Hericourt, F., Benz, B. & Gronenborn, B. (1995). DNA replication specificity of TYLCV geminivirus is mediated by the amino-terminal 116 amino acids of the Rep protein. FEBS Lett 362, 116–120.
Kunik, T., Palanichelvam, K., Czosnek, H., Citovsky, V. & Gafni, Y. (1998). Nuclear import of the capsid protein of Tomato yellow leaf curl virus (TYLCV) in plant and insect cells. Plant J 13, 393–399.
Lazarowitz, S. G. (1992). Geminiviruses: genome structure and gene function. Crit Rev Plant Sci 11, 327–349.
Lotrakul, P. & Valverde, R. A. (1999). Cloning of a DNA-A-like genomic component of sweet potato leaf curl virus: nucleotide sequence and phylogenetic relationships. Mol Plant Pathol On-Line. http://www.bspp.org.uk/mppol/1999/0422lotrakul/index.htm
Lotrakul, P., Valverde, R. A., Clark, C. A., Sim, J. & De La Torre, R. (1998). Detection of a geminivirus infecting sweet potato in the United States. Plant Dis 82, 1253–1257.
Martin, D. & Rybicki, E. (2000). RDP: detection of recombination amongst aligned sequences. Bioinformatics 16, 562–563.
Martin, D. P., Williamson, C. & Posada, D. (2005). RDP2: recombination detection and analysis from sequence alignments. Bioinformatics 21, 260–262.
Orozco, B. M. & Hanley-Bowdoin, L. (1996). A DNA structure is required for geminivirus replication origin function. J Virol 70, 148–158.
Padidam, M., Beachy, R. N. & Fauquet, C. M. (1996). The role of AV2 (“precoat”) and coat protein in viral replication and movement in tomato leaf curl geminivirus. Virology 224, 390–404.
Padidam, M., Sawyer, S. & Fauquet, C. M. (1999). Possible emergence of new geminiviruses by frequent recombination. Virology 265, 218–225.
Page, R. D. M. (1996). TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12, 357–358.
Paximadis, M., Idris, A. M., Torres-Jerez, I., Villarreal, A., Rey, M. E. C. & Brown, J. K. (1999). Characterization of tobacco geminiviruses in the Old and New World. Arch Virol 144, 703–717.
Posada, D. & Crandall, K. A. (2001). Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci U S A 98, 13757–13762.
Revill, P. A., Ha, C. V., Porchun, S. C., Vu, M. T. & Dale, J. L. (2003). The complete nucleotide sequence of two distinct geminiviruses infecting cucurbits in Vietnam. Arch Virol 148, 1523–1541.
Revill, P. A., Ha, C. V., Lines, R. E., Bell, K. E., Vu, M. T. & Dale, J. L. (2004). PCR and ELISA-based virus surveys of banana, papaya and cucurbit crops in Vietnam. Asia Pac J Mol Biol Biotechnol 12, 27–32.
Rigden, J. E., Dry, I. B., Mullineaux, P. M. & Rezaian, M. A. (1993). Mutagenesis of the virion-sense open reading frames of tomato leaf curl geminivirus. Virology 193, 1001–1005.
Rochester, D. E., Kositratana, W. & Beachy, R. N. (1990). Systemic movement and symptom production following agroinoculation with a single DNA of tomato yellow leaf curl geminivirus (Thailand). Virology 178, 520–526.
Rojas, M. R., Hagen, C., Lucas, W. J. & Gilbertson, R. L. (2005). Exploiting chinks in the plant’s armor: evolution and emergence of geminiviruses. Annu Rev Phytopathol 43, 361–394.
Rybicki, E. P. (1994). A phylogenetic and evolutionary justification for 3 genera of Geminiviridae. Arch Virol 139, 49–77.
Salminen, M. O., Carr, J. K., Burke, D. S. & McCutchan, F. E. (1995). Identification of breakpoints in intergenotypic recombinants of HIV type-1 by bootscanning. AIDS Res Hum Retroviruses 11, 1423–1425.
Saunders, K., Salim, N., Mali, V. R., Malathi, V. G., Briddon, R., Markham, P. G. & Stanley, J. (2002). Characterisation of Sri Lankan cassava mosaic virus and Indian cassava mosaic virus: evidence for acquisition of a DNA B component by a monopartite begomovirus. Virology 293, 63–74.
Seal, S. E., vandenBosch, F. & Jeger, M. J. (2006). Factors influencing begomovirus evolution and their increasing global significance: implications for sustainable control. Crit Rev Plant Sci 25, 23–46.
Sharma, P., Rishi, N. & Malathi, V. G. (2005). Molecular cloning of coat protein gene of an Indian cotton leaf curl virus (CLCuV-HS2) isolate and its phylogenetic relationship with others members of Geminiviridae. Virus Genes 30, 85–91.
Smith, J. M. (1992). Analyzing the mosaic structure of genes. J Mol Evol 34, 126–129.
Stanley, J., Bisaro, D. M., Briddon, R. W., Brown, J. K., Fauquet, C. M., Harrison, B. D., Rybicki, E. P. & Stenger, D. C. (2005). Family Geminiviridae. In Virus Taxonomy: Eighth Report of the International Committee on Taxonomy of Viruses, pp. 301–326. Edited by C. M. Fauquet, M. A. Mayo, J. Maniloff, U. Desselberger & L. A. Ball. London: Elsevier Academic Press.
Stonor, J., Hart, P., Gunther, M., DeBarro, P. & Rezaian, M. A. (2003). Tomato leaf curl geminivirus in Australia: occurrence, detection, sequence diversity and host range. Plant Pathol 52, 379–388.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.
Unseld, S., Hohnle, M., Ringel, M. & Frischmuth, T. (2001). Subcellular targeting of the coat protein of African cassava mosaic geminivirus. Virology 286, 373–383.
Wang, X.-Y., Xie, Y. & Zhou, X.-P. (2004). Molecular characterization of two distinct begomoviruses from papaya in China. Virus Genes 29, 303–309.
Zhou, X.-P., Xie, Y., Tao, X.-R., Zhang, Z.-K., Li, Z.-H. & Fauquet, C. M. (2003). Characterization of DNA beta associated with begomoviruses in China and evidence for co-evolution with their cognate viral DNA-A. J Gen Virol 84, 237–247.