Journal of General Virology (2005), 86, 511–520
DOI 10.1099/vir.0.80261-0
Banana contains a diverse array of endogenous badnaviruses
Andrew D. W. Geering,1 Neil E. Olszewski,2 Glyn Harper,3 Benham E. L. Lockhart,4 Roger Hull3 and John E. Thomas1
1 Department of Primary Industries and Fisheries, 80 Meiers Road, Indooroopilly, Queensland 4068, Australia
2,4 Departments of Plant Biology2 and Plant Pathology4, University of Minnesota, St Paul, MN 55108, USA
3 John Innes Centre, Colney Lane, Norwich NR4 7UH, UK
Correspondence: Andrew D. W. Geering ([email protected])
Received 7 May 2004; accepted 5 November 2004
Banana streak disease is caused by several distinct badnavirus species, one of which is Banana streak Obino l’Ewai virus. Banana streak Obino l’Ewai virus has severely hindered international banana (Musa spp.) breeding programmes, as new hybrids are frequently infected with this virus, curtailing any further exploitation. This infection is thought to arise from viral DNA integrated in the nuclear genome of Musa balbisiana (B genome), one of the wild species contributing to many of the banana cultivars currently grown. In order to determine whether the DNA of other badnavirus species is integrated in the Musa genome, PCR-amplified DNA fragments from Musa acuminata, M. balbisiana and Musa schizocarpa, as well as cultivars ‘Obino l’Ewai’ and ‘Klue Tiparot’, were cloned. In total, 103 clones were sequenced and all had similarity to open reading frame III in the badnavirus genome, although there was remarkable variation, with 36 distinct sequences being recognized with less than 85 % nucleotide identity to each other. There was no commonality in the sequences amplified from M. acuminata and M. balbisiana, suggesting that integration occurred following the separation of these species. Analysis of rates of non-synonymous and synonymous substitution suggested that the integrated sequences evolved under a high degree of selective constraint as might be expected for a living badnavirus, and that each distinct sequence resulted from an independent integration event.
The GenBank/EMBL/DDBJ accession numbers reported in this paper are AY189378–AY189383, AY189384–AY189392, AY189444–AY189453, AY189393–AY189400, AY189401–AY189419, AY189420–AY189435 and AY189436–AY189443.
0008-0261 © 2005 SGM Printed in Great Britain
INTRODUCTION
Integrated pararetrovirus sequences have now been found in the nuclear genomes of several plant species including banana, tobacco, petunia and rice (Gregor et al., 2004; Harper et al., 1999b; Jakowitsch et al., 1999; Kunii et al., 2004; Lockhart et al., 2000; Ndowora et al., 1999; Richert-Pöggeler et al., 2003). Integration is not an essential step in the replication cycle of a pararetrovirus, as it is for a retrovirus, but during at least one stage of the replication cycle viral DNA does occur in the nucleus (Hull, 2002), and DNA has probably become integrated in the host chromosomes through a process of illegitimate recombination (Ndowora et al., 1999). These integrated sequences appear to be relics of quite ancient infection events, and their presence is not necessarily associated with infection. It is postulated that the integrated sequences confer a selective advantage upon the plant by contributing towards plant virus resistance through induction of transcriptional or post-transcriptional gene silencing of homologous sequences (Hull et al., 2000). Supporting this hypothesis, Mette et al. (2002) showed that a reporter gene linked to a tobacco endogenous pararetrovirus enhancer-promoter sequence was expressed in stably transformed Arabidopsis, but silenced in allotetraploid tobacco containing integrated sequences.
In some instances, integrated sequences can be activated to give rise to infection. The most economically significant example of this phenomenon is Banana streak virus (BSV; genus Badnavirus) in banana (Harper et al., 1999b; Ndowora et al., 1999).
The banana is one of the oldest domesticated crops in the world and the majority of cultivars (cvs) have arisen by a process of traditional selection. The wild progenitors of the domesticated banana are Musa acuminata (A genome), Musa balbisiana (B genome), and to a much lesser extent, Musa schizocarpa (S genome) and Musa textilis/Musa maclayi (T genome) (Carreel et al., 2002; Daniells et al., 2001). M. acuminata is morphologically very variable and up to nine different subspecies are recognized (Daniells et al., 2001). Recent evidence suggests that parthenocarpy
first arose in M. acuminata subsp. banksii and/or subsp. errans, two subspecies occurring in New Guinea and the Philippines, respectively (Carreel et al., 2002). From there, primitive cultivated types were probably taken by humans to other parts of South-east Asia where other subspecies of M. acuminata occur, providing opportunities for intraspecific hybridization. The sweetness of dessert bananas is a trait of M. acuminata subsp. malaccensis and zebrina, two subspecies found in the Indonesian archipelago and Malay Peninsula (Bakry et al., 2001; Carreel et al., 2002). Finally, banana cultivation expanded north and westwards into the region stretching from southern China and northern Indochina to northern India, where M. balbisiana naturally occurs, resulting in interspecific hybridization and selection of an even greater diversity of cvs with improved drought and cold tolerance and disease resistance (Bakry et al., 2001; Jones, 1999). Banana breeding programmes began in the 1920s, but despite several decades of research only a small number of cvs have been successfully commercialised (Bakry et al., 2001). One of the major constraints to international banana breeding efforts has been the Obino l’Ewai isolate of BSV, for which the name Banana streak Obino l’Ewai virus (BSOEV) is proposed (G. Harper & R. Hull, personal communication). Despite an apparent absence of virus particles in the parent lines used in the breeding programmes, hybrid progeny resulting from crosses between these lines are frequently infected with BSOEV, curtailing any further exploitation (Dahal et al., 1999; Harper et al., 1999a; Ndowora et al., 1999). An integrated sequence with 99 % identity to BSOEV has been identified in one parent line, Musa AAB group cv. ‘Obino l’Ewai’ (Harper et al., 1999b; Ndowora et al., 1999). 
This integrant consists of two segments of viral sequence, which together comprise the full complement of the virus genome, but these two segments are separated by a 6 kb ‘scrambled region’ containing non-contiguous and inverted viral sequences. It is thought that infection in the hybrids arises following activation of this integrant. The current model for activation involves two homologous recombinations, leading to excision of the ‘scrambled region’ and the joining of either end of the integrant to give rise to a circular molecule, the equivalent of the virus mini-chromosome (Ndowora et al., 1999). A stimulus for these recombinations is tissue culture (Dallot et al., 2001), a propagation practice used to multiply planting stock once a hybrid is made.
PCR using degenerate badnavirus-group primers amplifies badnavirus-related sequences from a broad range of banana cvs, suggesting that most cvs contain integrated badnavirus sequences (A. D. W. Geering, unpublished results). However, BSV infection in Cavendish bananas (Musa AAA group, Cavendish subgroup), the most important bananas in world trade, is rare, despite propagation by tissue culture in large quantities for many years. In providing an explanation for this conundrum, Geering et al. (2001) showed that integrated BSOEV DNA is linked to M. balbisiana and is therefore absent from Cavendish bananas, which have a pure M. acuminata genetic background. However, Geering et al. (2001) did show that Cavendish bananas contain a non-functional integrated badnavirus sequence which, in contrast to BSOEV, was linked to M. acuminata.
In this study, we have investigated the diversity of integrated badnavirus sequences in banana. As cultivated bananas are parthenocarpic and frequently sterile, plants are vegetatively propagated and their genotypes have largely been preserved since the initial hybridization events. Consequently, in order to understand the diversity of integrated badnavirus sequences that occur in the at least 500 different banana cvs that exist today (Jones, 1999), we have examined accessions of M. acuminata, M. balbisiana and M. schizocarpa, as well as two landraces, cvs ‘Klue Tiparot’ (Musa BAB group) and ‘Obino l’Ewai’ (Musa AAB group).
METHODS
Plants. Sources of Musa are described in Table 1. Unless otherwise indicated, plants were grown in a glasshouse in Brisbane at ca. 20–30 °C, monitored for banana streak symptoms for at least 12 months and indexed for BSV infection by immunosorbent electron microscopy (ISEM) using a crude virus preparation (‘virus miniprep’) as described by Geering et al. (2000).
Extraction of genomic DNA from banana. Genomic DNA was extracted from either lyophilized or fresh banana leaf tissue by using either the method of Gawel & Jarret (1991) or a DNeasy Plant maxi kit (Qiagen).
PCR and cloning. Badnavirus DNA was amplified using Badna-1A (5′-CTNTAYGARTGGYTNGTNATGCCNTTYGG-3′) and Badna 4 (5′-TCCAYTTRCANAYNSCNCCCCANCC-3′) primers, designed to conserved sequences in the reverse transcriptase/RNase H region of the badnavirus open reading frame (ORF) III. A 50 µl PCR mix contained 2 µl Elongase enzyme mix (Gibco-BRL), 1× Elongase reaction buffer, 400 nM each primer, 100 µM each dNTP and 1.6 mM MgSO4. Thermal cycling conditions were one cycle at 94 °C for 2 min, 30 cycles at 94 °C for 15 s, 52 °C for 30 s and 72 °C for 1 min, followed by one cycle at 72 °C for 5 min. The PCR product was gel-purified using a QIAEX II gel extraction kit, A-tailed with Taq DNA polymerase (Gibco-BRL) and then cloned into pCR-Script II using a TOPO TA cloning kit (Invitrogen) according to the manufacturer’s instructions.
Restriction enzyme analysis of PCR clones and sequencing. To screen for sequence variants, purified plasmid DNA was digested with a mixture of DdeI and EcoRI and the digests electrophoresed in a 2 % agarose gel in 0.5× Tris-borate-EDTA. Representatives from each restriction pattern group were then selected for sequencing. Sequencing was done using fluorescent dye terminators on an Applied Biosystems automatic sequencer.
Sequence analyses. Searches of GenBank were done using BLASTN (Altschul et al., 1990). DNA sequences were aligned using CLUSTALX (Thompson et al., 1997) and the alignment manually adjusted by using an amino acid sequence alignment as a guide. All phylogenetic and molecular evolutionary analyses were done using MEGA version 2.1 (Kumar et al., 2001). A distance matrix was created using the Kimura two-parameter model and relationships deduced using the neighbour-joining method. A bootstrap consensus tree was generated from 1000 replicates. Rates of synonymous and non-synonymous
Table 1. Musa genotypes used in the study and number of PCR clones analysed Plant species M. acuminata subsp. banksii M. acuminata subsp. burmannicoides ‘Calcutta 4’ M. acuminata subsp. malaccensis M. acuminata subsp. zebrina ‘Monyet’ M. balbisiana ‘Pisang Batu’ M. balbisiana ‘Pisang Klutuk Wulung’ cv. ‘Obino l’Ewai’ cv. ‘Klue Tiparot’ Musa schizocarpa
Genotype
Source*
Accession
No. PCR clones analysed
AA AA AA AA BB BB AAB BAB SS
INIBAP DPI FHIA† DPI & F DPI & F CIRAD† CIRAD† IITA† DPI & F INIBAP
ITC 0853 BAN 1059 BAN 852 BAN 646 – – – BAN 1070 ITC 0926
48 75 48 71 36 36 63 52 30
*Abbreviations are: IITA – International Institute for Tropical Agriculture, Ibadan, Nigeria; INIBAP – International Network for the Improvement of Banana and Plantain Transit Centre, Leuven, Belgium; DPI & F – Department of Primary Industries and Fisheries, Maroochy Horticultural Research Station, Queensland; CIRAD – Centre de coopération internationale en recherche agronomique pour le développement, Neufchateau, Sainte Marie, French West Indies.
†Freeze-dried leaf tissue imported into Australia (AQIS permit no. 200307748).
substitution were estimated using the method of Nei & Gojobori (1986). All new sequences have been lodged in GenBank: Shiz2, Shiz3, Shiz25, Shiz14, Shiz23 and Shiz24 under accession numbers AY189378 to AY189383, respectively; Bank1, Bank10, Bank11, Bank13, Bank14, Bank17, Bank19, Bank6 and Bank8 under accession numbers AY189384 to AY189392, respectively; Cal12, Cal13, Cal1, Cal22, Cal27, Cal30, Cal34, Cal6, Cal8 and Cal22t under accession numbers AY189444 to AY189453, respectively; Mal10, Mal11, Mal15, Mal22, Mal26, Mal3, Mal6 and Mal8 under accession numbers AY189393 to AY189400, respectively; OBLE15, OBLE17, OBLE1, OBLE21, OBLE24, OBLE2, OBLE32, OBLE34, OBLE35, OBLE36, OBLE37, OBLE3, OBLE4, OBLE5, OBLE7, OBLE8, OBLE13t, OBLE1t and OBLE22t under accession numbers AY189401 to AY189419, respectively; Batu10, Batu19, Batu20, Batu21, Batu24, Batu25, Batu27, Batu2, Batu31, Batu34, Batu36, Batu4, Batu5, Batu6, Batu8 and Batu9 under accession numbers AY189420 to AY189435, respectively; PKW12, PKW16, PKW18, PKW23, PKW32, PKW36, PKW8 and PKW9 under accession numbers AY189436 to AY189443, respectively. Additional sequences used in analyses were (abbreviations and GenBank accession numbers in parentheses): Banana streak Cavendish virus (BSCavV, A.D.W. Geering, unpublished data); Banana streak Mysore virus (BSMysV, AF214005); Banana streak Goldfinger virus (BSGFV, A.D.W. Geering, unpublished data); Banana streak Imove virus (BSImV, A.D.W. Geering, unpublished data); BSOEV (AJ002234); Banana streak Uganda A virus (BSUgAV, G. Harper, unpublished data), Banana streak Uganda I virus (BSUgIV, G. Harper, unpublished data), Banana streak Uganda J virus (BSUgJV, G. Harper, unpublished data), Banana streak Uganda K virus (BSUgKV, G. Harper, unpublished data), Banana streak Uganda L virus (BSUgLV, G. 
Harper, unpublished data); Cacao swollen shoot virus (CSSV, L14546); Cassava vein mosaic virus (CsVMV, U20341); Citrus mosaic virus (CMBV, AF347695); Commelina yellow mottle virus (ComYMV, X52938); Dioscorea bacilliform virus (DBV, X94576 and X94581); Kalanchoe top-spotting virus (KTSV, AY180137); Rice tungro bacilliform virus isolate Chainat (RTBV-Ch, AF220561); RTBV isolate Ic (RTBV-Ic, AF113832); RTBV isolate G1 (RTBV-G1, AF113830); RTBV isolate Philippines (RTBV-Phil, X57924); RTBV isolate Serdang (RTBV-Serdang, AF076470); RTBV isolate West Bengal (RTBV-WB, AJ314596); Sugarcane bacilliform virus isolate Ireng Maleng (SCBV-IM, AJ277091); SCBV isolate Morocco (SCBV-M, M89923); Taro bacilliform virus (TaBV, AF357836).
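The Kimura two-parameter distance named in the sequence analyses can be illustrated with a short calculation. The following is a minimal sketch only, not the MEGA implementation: the function name `k2p_distance` and the choice to skip gapped or ambiguous positions (pairwise deletion) are assumptions made for illustration. It uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; positions where either sequence
    has a gap or an ambiguous base are skipped (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the kind of input a neighbour-joining procedure works from; the tree-building step itself is not sketched here.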
RESULTS
BSV indexing
Leaf tissue of plants used in the Musa breeding programmes was free of both banana streak disease symptoms and BSV infection in their country of origin (Lheureux et al., 2002; Ndowora et al., 1999). The remaining genotypes, when grown in a glasshouse for a period of at least 12 months, never exhibited symptoms of banana streak disease, nor were they infected with BSV, as determined by ISEM on a ‘virus miniprep’.
Rapid screening of PCR clones by restriction enzyme analysis
A method was developed to rapidly screen PCR clones for sequence variation by analysing restriction patterns following digestion with a combination of EcoRI, an enzyme that symmetrically excises the PCR fragment out of the vector, and DdeI, a 5-base cutter. An example of the restriction patterns obtained is shown in Fig. 1. This technique proved to be both informative and reliable in grouping clones with nearly identical sequences and differentiating significantly divergent sequences.
Characterization of the PCR clones
In total, 103 clones were sequenced and all had similarity to the RT/RNase H region of the badnavirus ORF III (Table 2). Based on the frequency distribution of pairwise nucleotide sequence identities (Fig. 2), a threshold of 85 % was chosen to differentiate distinct groups of sequences, of which there were 36. Sequences with >99 % identity to BSOEV, BSMysV and BSImV were identified. The remaining sequences have been named Banana endogenous virus (BEV) 1–33. Clone Wil1 from cv. Williams, previously reported by Geering et al. (2001), was 98.6 % identical to the
Fig. 1. Analysis of sequence variation by digestion of clones with DdeI and EcoRI. Order of loading: lanes 1 and 15, 100 bp marker; lanes 2–14, clones OBLE1, 2, 3, 5, 7, 8, 15, 17, 21, 24, 32, 35 and 36, respectively. Arrows point to DNA fragments generated by digestion of the vector backbone.
clone Mal1, and therefore, based on the proposed system of classification, can be considered a variant of BEV 1.
Fourteen distinct badnavirus sequences were amplified from each of M. acuminata and M. balbisiana (Table 2). In many cases, the same sequence was amplified from more than one subspecies of M. acuminata and in the case of BEV 1, BEV 4 and BEV 8, from every M. acuminata subspecies examined in this study. In two instances, the presence of a sequence crossed species boundaries: BEV 10 and BEV 13 were amplified from both M. acuminata and M. schizocarpa. However, there was no commonality in the sequences amplified from M. acuminata and M. balbisiana. As expected, sequences found in both M. acuminata and M. balbisiana were amplified from cv. ‘Obino l’Ewai’ (Musa AAB group). However, only sequences common to M. balbisiana but not M. acuminata were amplified from cv. ‘Klue Tiparot’ (Musa BAB group). Eight sequences were amplified from cv. ‘Obino l’Ewai’ and ‘Klue Tiparot’ for which a linkage with either M. acuminata or M. balbisiana could not be established.
To examine whether each of the sequences contained functional protein coding sequence, amino acid sequence translations were done (Table 2). Ninety-two clones were 544 bp long (excluding primer sequences), the expected size for a badnavirus, and 60 had an uninterrupted reading frame starting at nt 2. Seven clones (Bank11, Bank17, Cal27, KT32, OBLE34, PKW12 and Shiz23) had additions or deletions of 1 or 2 nt, leading to translational frameshifts. The clone Zeb2t, which was 588 bp long, contained two segments of sequence (nt 1–246 and 292–588) with at least 95 % identity to other variants of BEV 10 (e.g. Bank11, Mal8, Zeb3). However, the 45 bp of sequence separating these two segments had little similarity to any part of the badnavirus genome and was presumably Musa sequence.
The clone Batu5, when compared with other variants of BEV 11, had deleted blocks of 14 and 6 nt, the first resulting in a translational frameshift. The additional length of the clone OBLE2 was due to the Badna 4 primer binding at a site 4 nt downstream of its normal position.
Remaining clones with an interrupted reading frame had nucleotide substitutions leading to insertion of stop codons. Interestingly, sequences of the clones Batu27 and PKW32 were identical to that of BSImV in all but 1 nt position; both had an identical substitution at nt 442 (TGG to TGA), leading to insertion of a stop codon. This substitution was not present in the clone KT32, the other variant of BSImV isolated, but for reasons stated above, this clone also had an interrupted reading frame.
Phylogenetic analyses
Deduced relationships of the BEVs to a range of badnaviruses and closely related pararetroviruses are shown in Fig. 3. Strong bootstrap support was obtained for eight major clades of badnaviruses, of which the BEVs grouped in three (1, 3 and 5). Clades 1 and 5 were the largest and were further divided into six and eight subclades, respectively. Twelve of the 14 BEVs amplified from M. acuminata clustered in clade 1, and 11 of these specifically in subclades 1A and 1B. In comparison, all but one of the BEVs amplified from M. balbisiana were evenly distributed across subclades 1C to 1F and clades 3 and 5 (Table 2). BSOEV, BSMysV and BSImV all clustered in clade 5 along with BSCavV, BSGfV and KTSV.
To examine the basis of variation shown in Fig. 3, the mean number of synonymous differences per synonymous site (ks) and non-synonymous differences per non-synonymous site (kn) were calculated for each clade (Table 3). For all clades, the estimated rate of synonymous substitution was far greater than the rate of non-synonymous substitution, leading to kn/ks ratios significantly less than one, a sign of purifying selection whereby most molecules with a non-synonymous mutation are eliminated from a population because of the deleterious or lethal nature of the mutation.
Interestingly, clade 1, the largest and most diverse of all of the clades, had the smallest kn/ks ratio and this ratio was about half that of clade 2 containing two badnaviruses from different hosts, CMBV and CSSV.
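The kn/ks comparison above can be checked with simple arithmetic on the counts reported in Table 3. The sketch below is illustrative only: the function name `substitution_rates` is hypothetical, and it assumes that kn and ks are the plain proportions of differences per site, which is what the tabulated values appear to be consistent with. It does not reproduce the codon-by-codon site counting of the Nei & Gojobori (1986) method. The input numbers are the Table 3 entries for clade 1 (overall) and clade 2.

```python
def substitution_rates(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (kn, ks, kn/ks) from mean difference and site counts.

    Hypothetical helper: assumes uncorrected proportions per site.
    """
    kn = nonsyn_diffs / nonsyn_sites
    ks = syn_diffs / syn_sites
    return kn, ks, kn / ks


# Counts taken from Table 3 (non-synonymous differences, non-synonymous
# sites, synonymous differences, synonymous sites).
clade1_kn, clade1_ks, clade1_ratio = substitution_rates(41.1, 380, 87.4, 109)
clade2_kn, clade2_ks, clade2_ratio = substitution_rates(88.9, 380, 89.1, 110)

print(f"clade 1: kn={clade1_kn:.3f} ks={clade1_ks:.3f} kn/ks={clade1_ratio:.3f}")
print(f"clade 2: kn={clade2_kn:.3f} ks={clade2_ks:.3f} kn/ks={clade2_ratio:.3f}")
print(f"clade 1 ratio relative to clade 2: {clade1_ratio / clade2_ratio:.2f}")
```

Under these assumptions the recomputed ratios agree with the tabulated values (0.135 and 0.288) to within rounding, and the clade 1 ratio is a little under half that of clade 2, as stated in the text.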
DISCUSSION
Our study shows that the genomes of M. acuminata, M. balbisiana and M. schizocarpa contain a diverse array of endogenous badnaviruses. Although the PCR approach used to obtain the sequences would not have discriminated between episomal and integrated viral DNA, several observations suggest the latter origin. Firstly, DNA was isolated from plants that had remained free of banana streak symptoms for at least a year and had tested negative for BSV by ISEM. Secondly, for nearly half of the sequences, the ORF terminated prematurely due to nucleotide substitutions, deletions or additions, generating premature stop codons or causing translational frameshifts. Misincorporation of nucleotides during PCR is an unlikely explanation for this number of mutations, especially as a
http://vir.sgmjournals.org
Table 2. Classification and distribution of the banana endogenous viruses (BEVs) in the different Musa genomes Clones that are in bold font have an interrupted reading frame; symbols in parentheses indicate the cause of the interruption, where ‘s’ is a premature stop codon and ‘f’ is a frameshift mutation. Clade* Name
Musa acuminata
subsp. banksii 1A
BEV 1 Bank10(s) BEV 2 Bank19(s) BEV 3 Bank8 BEV 4 Bank13 BEV BEV BEV BEV
1B
1C
1E 1F 515
3
Cal12
Musa BAB group
cv. ‘Obino l’Ewai’
cv. ‘Klue Tiparot’
Cal27(f) Cal22t
Mal22(s), Zeb22 OBLE35 Mal26(s) Mal6(s), Zeb33(s) Zeb16 OBLE8 Mal11 Zeb15, OBLE1t Zeb36 Mal3 Zeb5 Mal10
Cal13(s)
Mal15
Zeb20, Zeb43
Mal8
Zeb3(s), OBLE22t(s) Zeb2t(f) OBLE2(s,f)
Cal6, Cal34
BEV 9 BEV 10 Bank17(f), Bank11(f) BEV 11
Zeb5t
BEV 13 Bank1(s), Bank14(s) BEV 14 BEV 15 BEV 16 BEV 17 BEV 18 BEV 19 BEV 20 BEV 21 BEV 22 BEV 23 BEV 24 Cal30(s,f) BEV 25 BEV 26
Zeb1
‘Pisang Batu’
Musa schizocarpa
Sequence variation between clones (%)
‘Pisang Klutuk Wulung’ 1.2–13.4† (8.9)‡ 2.9–4.8 (4.1) 1.9–3.5 (2.8) 1.5–9.4 (5.8) 4.7–7.8 (6.5) 2.1 0.4–4.5 (3.1)
KT38(s)
BEV 12
Musa balbisiana
Batu6, Batu20
PKW36(s) Shiz25
KT11(s)
Batu5(f), Batu8(s), Batu31
PKW8
OBLE3, OBLE34(s,f) OBLE13t(s)
2.3–4.1 (3.5) 1.4–4.8 (3.1) 0.2–4.1 (2.5) 4.5–5.4 (4.8)
Shiz2(s), Shiz3(s), Shiz14, Shiz23(s,f), Shiz24(s)
0.4–6.8 (4.4)
PKW12(s,f) Batu25(s) OBLE15 OBLE17, OBLE37
OBLE24
KT3 KT51 KT6 KT23
0–1.5 (1.0)
Batu34 Batu19(s)
PKW23(s)
0–2.3 (1.6) 0
KT31 Batu36 KT9(s) KT36(s)
Batu10(s) PKW9(s)
0.4 0.6
1D
5 6 7 Bank9 8 Bank6(s)
subsp. subsp. subsp. burmannicoides malaccensis zebrina
Musa AAB group
Sequence variation between clones (%)
0.8
0–1.0 (0.6) 0.1–12.8 (8.7)
PKW32(s)
PKW16
Batu27(s)
Batu21, Batu9
Fig. 2. Frequency distribution of pairwise nucleotide sequence identities for the reverse transcriptase/RNase H gene of the banana endogenous viruses.
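The pairwise identities summarized in Fig. 2, and the 85 % threshold used to separate distinct sequences, can be illustrated with a small sketch. This is an assumption-laden example, not the authors' procedure: the paper states only the threshold, so the single-linkage grouping used here, the function names `percent_identity` and `group_by_identity`, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical bases over aligned, ungapped positions."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def group_by_identity(seqs: dict, threshold: float = 85.0) -> list:
    """Group sequences by single linkage: two sequences share a group if
    a chain of pairwise identities >= threshold connects them."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for n1, n2 in combinations(seqs, 2):
        if percent_identity(seqs[n1], seqs[n2]) >= threshold:
            parent[find(n1)] = find(n2)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy aligned sequences (hypothetical, for illustration only).
toy = {
    "clone_a": "ACGTACGTACGTACGTACGT",
    "clone_b": "ACGTACGTACGTACGTACGA",  # one difference from clone_a
    "clone_c": "TTTTTTTTTTTTTTTTTTTT",  # highly divergent
}
toy_groups = group_by_identity(toy)
```

With real data, the input would be the aligned clone sequences, and the list of all pairwise identities is what a histogram such as Fig. 2 would be drawn from.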
high fidelity DNA polymerase mixture was used to amplify the fragments.
BEV 33 5H
5G
5F
BSMysV BEV 27 BEV 28 BEV 29 BEV 30 BSImV BEV 31 BSOEV BEV 32 5A 5B
5C 5D
Name
*Clades are as provided in Fig. 3. †,‡Range and mean, respectively.
subsp. burmannicoides
Cal1, Cal22(s), Cal8
subsp. malaccensis
Zeb1t
OBLE5
OBLE7 OBLE36 OBLE21
OBLE32, OBLE4
KT30
KT37
KT32(f)
PKW18(s) Batu4 Batu2(s), Batu24(s) OBLE1
KT42
‘Pisang Klutuk Wulung’ ‘Pisang Batu’ cv. ‘Klue Tiparot’ cv. ‘Obino l’Ewai’
[Fig. 2 axes: x, nucleotide sequence identity (%), 56–100 in 2 % intervals; y, frequency, 0–1000]
subsp. banksii
Musa acuminata
Clade*
Table 2. cont.
subsp. zebrina
Musa AAB group
Musa BAB group
Musa balbisiana
Musa schizocarpa
0–0.2 (0.1)
0.2–0.4 (0.3) 0–0.2 (0.2)
To examine the basis of the diversity of the BEVs, rates of synonymous versus non-synonymous substitution were calculated. For each clade and subclade of BEVs, even those containing non-functional sequences, the kn/ks ratio was substantially less than one, suggesting that during the evolution of the sequences, the majority of non-synonymous mutations had been selectively removed, probably because they were highly deleterious to the functioning of the encoded protein, an essential protein in the replication cycle of a badnavirus. This result is consistent with the majority of nucleotide polymorphism arising when the BEVs were still ‘free-living’, and each of the 36 distinct BEVs probably represents an independent integration event. Although the presence of non-functional sequences suggests some sequence decay following integration, the amount of decay has been insufficient to remove the strong underlying signal of purifying selection. Also supporting the conclusion of minimal sequence decay, the length of the amplified DNA fragments was very uniform. The tendency for genes, once function has been lost, is for frameshift mutations to accumulate, as well as for the rates of synonymous and non-synonymous substitution to reach equivalence (Bensasson et al., 2001; Page & Holmes, 1998). Providing a contrast to the BEVs in Musa, the 21 geminivirus-related DNA (GRD) sequences integrated in the genomes of several Nicotiana spp. appear to stem from two independent integration events and the ancestral sequences have subsequently diverged as a result of genetic drift (Murad et al., 2004).
In previous Southern hybridization studies, BSOEV and BSMysV have been linked to the B genome, and BEV 1 to the A genome (Geering et al., 2001, 2005). Here, it has not been possible to make such linkages with the same certainty, as the absence of a BEV in a particular Musa genotype may
[Fig. 3 tree; bootstrap values and branching order are not recoverable from the text. Taxa shown, with the representative clone for each endogenous sequence in parentheses: BEV 1 (Bank10), BEV 2 (Bank19), BEV 3 (Bank8), BEV 4 (Bank13), BEV 5 (Zeb5), BEV 6 (Cal22t), BEV 7 (Bank9), BEV 8 (Bank6), BEV 9 (KT38), BEV 10 (Mal8), BEV 11 (KT11), BEV 12 (OBLE3), BEV 13 (Bank1), BEV 14 (PKW12), BEV 15 (Batu25), BEV 16 (OBLE15), BEV 17 (OBLE17), BEV 18 (KT51), BEV 19 (KT6), BEV 20 (OBLE24), BEV 21 (Batu19), BEV 22 (KT31), BEV 23 (Batu36), BEV 24 (Cal30), BEV 25 (Batu10), CMBV, CSSV, BEV 26 (KT36), DBV, BSMysV (OBLE1), BEV 27 (Batu2), BEV 28 (OBLE7), BSGfV, BEV 29 (OBLE36), BSImV (Batu27), BEV 30 (OBLE21), KTSV, BSCavV, BEV 31 (Zeb1t), BSOEV (Batu21), BEV 32 (Cal1), BEV 33 (OBLE5), BSUgAV, ComYMV, TaBV, SCBV-IM, SCBV-M, BSUgMV, BSUgJV, BSUgIV, BSUgLV, BSUgKV, RTBV-Serdang, RTBV-WB, RTBV-Ch, RTBV-Phil, RTBV-G1, RTBV-Ic, CsVMV. Clade labels: clade 1 (subclades 1A–1F), clades 2–4, clade 5 (subclades 5A–5H), clades 6–9.]
simply reflect insufficient sampling of the PCR clones. Whilst acknowledging this caveat, some interesting trends were observed. No individual BEV was found in both M. acuminata and M. balbisiana, suggesting that integration occurred after separation of these species. M. acuminata is susceptible to infection by a broad range of badnaviruses, including BSOEV, BSMysV and BSImV (G. Harper, unpublished results; Geering et al., 2005), precluding this as a reason for the absence of integrated DNA of these viruses in this species. Two BEVs were found in common between M. acuminata and M. schizocarpa. At this point of
Fig. 3. Cladogram depicting the relationships of the banana endogenous viruses (BEVs) to a range of badnaviruses and related pararetroviruses within the reverse transcriptase/RNase H gene. Abbreviations are: Banana streak Cavendish virus (BSCavV); Banana streak Mysore virus (BSMysV); Banana streak Goldfinger virus (BSGFV); Banana streak Imove virus (BSImV); Banana streak Obino l’Ewai virus (BSOEV); Banana streak Uganda A virus (BSUgAV), Banana streak Uganda I virus (BSUgIV), Banana streak Uganda J virus (BSUgJV), Banana streak Uganda K virus (BSUgKV), Banana streak Uganda L virus (BSUgLV); Cacao swollen shoot virus (CSSV); Cassava vein mosaic virus (CsVMV); Citrus mosaic virus (CMBV); Commelina yellow mottle virus (ComYMV); Dioscorea bacilliform virus (DBV); Kalanchoe top-spotting virus (KTSV); Rice tungro bacilliform virus (RTBV); Sugarcane bacilliform virus (SCBV); Taro bacilliform virus (TaBV). BEVs linked to the A genome are in bold font, those linked to the B genome are in italic font, and those of unknown linkage are underlined. Bootstrap values are provided in the nodes of the branches; branches with less than 50 % bootstrap support have been collapsed.
time, it is unclear whether this reflects integration of these BEVs prior to the separation of these species, introgression of the BEVs into M. schizocarpa by plant hybridization, or independent integration of the BEVs into the different Musa genomes. It should be noted that M. acuminata subsp. banksii and M. schizocarpa are sympatric in New Guinea and natural hybrids do occur (Carreel et al., 2002).
Some BEVs were amplified from all subspecies of M. acuminata examined while others from only one or two
Table 3. Rates of synonymous and non-synonymous substitution amongst the banana endogenous viruses, other badnaviruses and Rice tungro bacilliform virus
Clade* | Subclade | Mean nucleotide divergence (%) | Maximum nucleotide divergence (%) | No. non-synonymous differences | No. non-synonymous sites | k(n)† | No. synonymous differences | No. synonymous sites | k(s)‡ | k(n)/k(s)
1 | 1A | 22.3 | 26.5 | 26.9 | 380 | 0.0709 | 82.5 | 109 | 0.755 | 0.0939
1 | 1B | 20.1 | 23.8 | 29.5 | 381 | 0.0774 | 70.8 | 108 | 0.656 | 0.118
1 | 1C | 21.9 | 26.5 | 30.1 | 379 | 0.0793 | 77.8 | 110 | 0.709 | 0.112
1 | 1D | 15.4 | 15.4 | 12.0 | 378 | 0.0318 | 61.0 | 111 | 0.550 | 0.0578
1 | 1F | 23.2 | 23.2 | 33.3 | 380 | 0.0877 | 78.7 | 109 | 0.723 | 0.121
1 | Overall | 26.0 | 41.3 | 41.1 | 380 | 0.108 | 87.4 | 109 | 0.801 | 0.135
2 | | 35.5 | 35.5 | 88.9 | 380 | 0.234 | 89.1 | 110 | 0.814 | 0.288
5 | 5C | 26.9 | 28.7 | 42.1 | 378 | 0.111 | 90.6 | 111 | 0.818 | 0.136
5 | 5E | 23.3 | 23.3 | 32.0 | 381 | 0.0841 | 80.0 | 108 | 0.738 | 0.114
5 | 5F | 20.7 | 23.9 | 27.7 | 384 | 0.0722 | 72.3 | 105 | 0.686 | 0.105
5 | 5H | 21.6 | 21.6 | 36.0 | 383 | 0.0941 | 75.0 | 106 | 0.706 | 0.133
5 | Overall | 29.9 | 36.7 | 59.7 | 381 | 0.157 | 88.8 | 108 | 0.825 | 0.190
8 | | 25.8 | 30.9 | 49.1 | 382 | 0.129 | 79.5 | 107 | 0.745 | 0.173
9 | | 11.3 | 17.4 | 11.9 | 385 | 0.0308 | 44.8 | 104 | 0.433 | 0.0711
*Clades are as described in Fig. 3.
†Number of non-synonymous differences per non-synonymous site.
‡Number of synonymous differences per synonymous site.
subspecies. Using a Southern hybridization assay, Geering et al. (2001) were unable to detect BEV 1 in M. acuminata subsp. burmannicoides, apparently contradicting the results presented in this study. However, the variant of BEV 1 amplified from M. acuminata subsp. burmannicoides in this study (clone Cal12) had only 85.9 % nucleotide identity to the variant of BEV 1 amplified from cv. Williams (clone Wil1) by Geering et al. (2001), and the two sequences would not be expected to cross-hybridize under high stringency conditions.
As expected, A- and B-type BEVs were amplified from the AAB hybrid cv. ‘Obino l’Ewai’. However, surprisingly, no A-type BEVs were amplified from cv. ‘Klue Tiparot’, a BAB hybrid (Jenny et al., 1997; Carreel et al., 2002). Further examination of the BEV integration patterns in cv. ‘Klue Tiparot’ may provide additional insights into the genetic background of this cultivar.
If it is assumed that the BEVs are ‘molecular fossils’ of viruses that existed in wild plant populations before plants were domesticated and traded, some predictions can be made about the prehistoric distributions of badnaviruses. BEVs from M. balbisiana showed the greatest diversity, suggesting a major radiation of badnaviruses in the region where M. balbisiana originated, which includes the Philippines and a broad zone stretching from southern China and northern Indochina to northern India (Jones, 1999). It is of significance that two of the three BEVs from M. acuminata grouping outside subclades 1A and 1B were from subsp. burmannicoides, a subspecies naturally occurring in
Myanmar near India (Jones, 1999). Otherwise, the diversity of BEVs in M. acuminata was very low, suggesting that badnaviruses were either uncommon or not very diverse in places like Indonesia and the Malay Peninsula, where M. acuminata originated (Jones, 1999).

What evolutionary implications do the BEVs have for banana plants? The BEVs may confer a selective advantage upon the plant by contributing towards plant virus resistance through induction of transcriptional or post-transcriptional gene silencing of homologous sequences (Hull et al., 2000; Mette et al., 2002). If this is so, it would provide a selection pressure for the maintenance of sequence integrity. The presence of multiple copies of a particular BEV may facilitate duplication of endogenous genes by homologous, unequal recombination, as is thought to be the case when a retrotransposon inserts on either side of an endogenous gene (White et al., 1994). The presence of closely related endogenous badnavirus sequences at distant loci may also mediate larger-scale chromosomal rearrangements through homologous recombination (Hughes & Coffin, 2001). Insertion of a badnavirus promoter next to an endogenous plant gene may not only change its transcription levels, but also alter the tissue specificity of expression (Matzke et al., 2004).

Of importance to banana breeding programmes, endogenous virus sequences other than those of BSOEV may also be capable of being activated to give rise to infection. Of particular note, sequences nearly identical to BSImV and BSMysV have been found. Evidence from Nicotiana and Oryza suggests that endogenous pararetrovirus sequences are transcriptionally silenced through
methylation, and it is possible that interspecific hybridization or the creation of polyploids during breeding may lead to the release of this silencing mechanism (Kunii et al., 2004; Matzke et al., 2004). Further work is required to characterize the BEVs to determine if they are capable of giving rise to an episomal virus genome.
ACKNOWLEDGEMENTS
We thank Ingrid Jakobsen for helpful discussions. Funding from the Queensland Banana Industry Protection Board and Horticulture Australia Limited is gratefully acknowledged.

REFERENCES
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.
Bakry, F., Carreel, F., Caruana, M.-L., Côte, F.-X., Jenny, C. & Tézenas du Montcel, H. (2001). Banana. In Tropical Plant Breeding, pp. 1–29. Edited by A. Charrier, M. Jacquot, S. Hamon & D. Nicolas. Enfield, NH, USA: Science Publishers.
Bensasson, D., Zhang, D.-X., Hartl, D. L. & Hewitt, G. M. (2001). Mitochondrial pseudogenes: evolution’s misplaced witnesses. Trends Ecol Evol 16, 314–321.
Carreel, F., Gonzalez de Leon, D., Lagoda, P., Lanaud, C., Jenny, C., Horry, J. P. & Tezenas du Montcel, H. (2002). Ascertaining maternal and paternal lineage within Musa by chloroplast and mitochondrial DNA RFLP analyses. Genome 45, 679–692.
Dahal, G., Gauhl, F., Pasberg-Gauhl, C., Hughes, J. d. A., Thottapilly, G. & Lockhart, B. E. L. (1999). Evaluation of micropropagated plantain and banana (Musa spp.) for banana streak badnavirus incidence under field and screenhouse conditions in Nigeria. Ann Appl Biol 134, 181–191.
Dallot, S., Acuna, P., Rivera, C., Ramirez, P., Cote, F., Lockhart, B. E. L. & Caruana, M. L. (2001). Evidence that the proliferation stage of micropropagation procedure is determinant in the expression of Banana streak virus integrated into the genome of the FHIA 21 hybrid (Musa AAAB). Arch Virol 146, 2179–2190.
Daniells, J., Jenny, C., Karamura, D. & Tomekpe, K. (2001). Musalogue: a catalogue of Musa germplasm. Diversity in the genus Musa. Compiled by E. Arnaud & S. Sharrock. International Network for the Improvement of Banana and Plantain.
Gawel, N. J. & Jarret, R. L. (1991). A modified CTAB DNA extraction procedure for Musa and Ipomoea. Plant Mol Biol Rep 9, 262–266.
Geering, A. D. W., McMichael, L. A., Dietzgen, R. G. & Thomas, J. E. (2000). Genetic diversity among Banana streak virus isolates from Australia. Phytopathology 90, 921–927.
Geering, A. D. W., Olszewski, N. E., Dahal, G., Thomas, J. E. & Lockhart, B. E. L. (2001). Analysis of the distribution and structure of integrated Banana streak virus DNA in a range of Musa cultivars. Mol Plant Pathol 2, 207–213.
Geering, A. D. W., Pooggin, M. M., Olszewski, N. E., Lockhart, B. E. L. & Thomas, J. E. (2005). Characterisation of Banana streak Mysore virus and evidence that its DNA is integrated in the B genome of cultivated Musa. Arch Virol (in press).
Gregor, W., Mette, M. F., Staginnus, C., Matzke, M. A. & Matzke, A. J. M. (2004). A distinct endogenous pararetrovirus family in Nicotiana tomentosiformis, a diploid progenitor of polyploid tobacco. Plant Physiol 134, 1191–1199.
Harper, G., Dahal, G., Thottappilly, G. & Hull, R. (1999a). Detection of episomal banana streak badnavirus by IC-PCR. J Virol Methods 79, 1–8.
Harper, G., Osuji, J. O., Heslop-Harrison, J. S. P. & Hull, R. (1999b). Integration of banana streak badnavirus into the Musa genome: molecular and cytogenetic evidence. Virology 255, 207–213.
Hughes, J. F. & Coffin, J. M. (2001). Evidence for genomic rearrangements mediated by human endogenous retroviruses during primate evolution. Nat Genet 29, 487–489.
Hull, R. (2002). Matthews’ Plant Virology, 4th edn. San Diego: Academic Press.
Hull, R., Harper, G. & Lockhart, B. (2000). Viral sequences integrated into plant genomes. Trends Plant Sci 5, 362–365.
Jakowitsch, J., Mette, M. F., van der Winden, J. & Matzke, M. A. (1999). Integrated pararetroviral sequences define a unique class of dispersed repetitive DNA in plants. Proc Natl Acad Sci U S A 96, 13241–13246.
Jenny, C., Carreel, F. & Bakry, F. (1997). Revision on banana taxonomy: ‘Klue Tiparot’ (Musa sp) reclassified as a triploid. Fruits 52, 83–91.
Jones, D. R. (1999). Introduction to banana, abacá and enset. In Diseases of Banana, Abacá and Enset, pp. 1–36. Edited by D. R. Jones. Wallingford, UK: CABI Publishing.
Kumar, S., Tamura, K., Jakobsen, I. B. & Nei, M. (2001). MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17, 1244–1245.
Kunii, M., Kanda, M., Nagano, H., Uyeda, I., Kishima, Y. & Sano, Y. (2004). Reconstruction of putative DNA virus from endogenous rice tungro bacilliform virus-like sequences in the rice genome: implications for evolution and integration. BMC Genomics 5, 80.
Lheureux, F., Carreel, F., Jenny, C., Lockhart, B. E. L. & Iskra-Caruana, M. L. (2003). Identification of genetic markers linked to banana streak disease expression in inter-specific Musa hybrids. Theor Appl Genet 106, 594–598.
Lockhart, B. E., Menke, J., Dahal, G. & Olszewski, N. E. (2000). Characterization and genomic analysis of tobacco vein clearing virus, a plant pararetrovirus that is transmitted vertically and related to sequences integrated in the host genome. J Gen Virol 81, 1579–1585.
Matzke, M., Gregor, W., Mette, M. F., Aufsatz, W., Kanno, T., Jakowitsch, J. & Matzke, A. J. M. (2004). Endogenous pararetroviruses of allotetraploid Nicotiana tabacum and its diploid progenitors, N. sylvestris and N. tomentosiformis. Biol J Linn Soc 82, 627–638.
Mette, M. F., Kanno, T., Aufsatz, W., Jakowitsch, J., van der Winden, J., Matzke, M. A. & Matzke, A. J. M. (2002). Endogenous viral sequences and their potential contribution to heritable virus resistance in plants. EMBO J 21, 461–469.
Murad, L., Bielawski, J. P., Matyasek, R., Kovarík, A., Nichols, R. A., Leitch, A. R. & Lichtenstein, C. P. (2004). The origin and evolution of geminivirus-related DNA sequences in Nicotiana. Heredity 92, 352–358.
Ndowora, T., Dahal, G., LaFleur, D., Harper, G., Hull, R., Olszewski, N. E. & Lockhart, B. (1999). Evidence that badnavirus infection in Musa can originate from integrated sequences. Virology 255, 214–220.
Nei, M. & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3, 418–426.
Page, R. D. M. & Holmes, E. C. (1998). Molecular Evolution: a phylogenetic approach. Oxford: Blackwell Science.
Richert-Pöggeler, K. R., Noreen, F., Schwarzacher, T., Harper, G. & Hohn, T. (2003). Induction of infectious petunia vein clearing (pararetro) virus from endogenous provirus in petunia. EMBO J 22, 4836–4845.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTALX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.
White, S. E., Habera, L. F. & Wessler, S. R. (1994). Retrotransposons in the flanking regions of normal plant genes: a role for copia-like elements in the evolution of gene structure and expression. Proc Natl Acad Sci U S A 91, 11792–11796.