Molecular and Biochemical Parasitology, 54 (1992) 165-174
© 1992 Elsevier Science Publishers B.V. All rights reserved. / 0166-6851/92/$05.00 MOLBIO 01784
Genetic variants within the genus Echinococcus identified by mitochondrial DNA sequencing

Josephine Bowles a, David Blair b and Donald P. McManus a

a Tropical Health Program, Queensland Institute of Medical Research, Brisbane, Australia; and b Department of Zoology, James Cook University, Townsville, Australia

(Received 2 January 1992; accepted 16 April 1992)
The pattern of species and strain variation within the genus Echinococcus is complex and controversial. In an attempt to characterise objectively the various species and strains, the sequence of a region of the mitochondrial cytochrome c oxidase subunit I (CO1) gene was determined for 56 Echinococcus isolates. Eleven different genotypes were detected, including 7 within Echinococcus granulosus, and these were used to categorise the isolates. The 4 generally accepted Echinococcus species were clearly distinguishable using this approach. In addition, the consensus view of the strain pattern within E. granulosus, based on a variety of criteria of differentiation, was broadly upheld. Very little variation was detected within Echinococcus multilocularis. Remarkable intra-strain homogeneity was found at the DNA sequence level. This region of the rapidly evolving mitochondrial genome is useful as a marker of species and strain identity and as a preliminary indication of evolutionary divergence within the genus Echinococcus.

Key words: Echinococcus; Intra-specific variation; Mitochondrial DNA; Cytochrome c oxidase I; Direct sequencing; Polymerase chain reaction
Correspondence address: D.P. McManus, Tropical Health Program, Queensland Institute of Medical Research, 300 Herston Rd, Brisbane, Queensland 4029, Australia. Tel.: +61-7-3620401; Fax: +61-7-3620104.
Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with the accession numbers M84661 to M84671.
Abbreviations: PCR, polymerase chain reaction; CO1, cytochrome c oxidase subunit 1; G, E. granulosus; M, E. multilocularis; V, E. vogeli; O, E. oligarthrus; RFLP, restriction fragment length polymorphism; rDNA, ribosomal DNA; mtDNA, mitochondrial DNA.

Introduction
Of the 16 species which have been described within the genus Echinococcus (causative agent of hydatid disease) only 4, Echinococcus granulosus, Echinococcus multilocularis, Echinococcus vogeli and Echinococcus oligarthrus, are now recognised as taxonomically valid [1].
Considerable intra-specific or 'strain' [2] variation has been observed, particularly within E. granulosus and to a lesser extent in E. multilocularis [3]. The term 'strain' refers, in the case of Echinococcus, to an intra-specific group of genetically distinguishable isolates which also share biological features of actual or potential significance in the control of hydatid disease [1]. E. granulosus strains vary in features such as morphology, biochemistry, physiology, pathogenicity, developmental patterns and infectivity to humans and domestic animals [1] with important implications for the epidemiology of hydatid disease. Such differences necessitate some reliable means of identifying to the level of strain, isolates collected in the field or at surgery. Identification of isolates of E. granulosus has depended heavily upon a limited number of morphological features and there has been much disagreement as to the value of this
approach [4]. Furthermore, it has been demonstrated that morphological features, including those commonly studied, can vary as a result of environmental influences [4,5]. Other methods of identification, such as those based on immunological and biochemical analysis, are also limited in their usefulness because of the potential for host- and environmentally induced variation.

The problem of grouping isolates into biologically relevant sub-specific categories is basically one of determining genetic identity. Approaches which investigate the genetic composition of an organism directly, by examining its DNA, overcome problems of life-cycle stage variability and external influences on phenotype. It is not known to what extent epidemiologically significant differences (such as host range, infectivity, virulence and drug susceptibility) reflect genetic heterogeneity within Echinococcus populations. This reservation is particularly valid in the light of recent findings [4] which suggest that host factors can significantly influence the parasite phenotype. However, if genetically homogeneous sub-groups can be shown to be linked to features of biological and epidemiological importance, then the usefulness of these methods for differentiation and characterisation cannot be disputed.

In this study, we have attempted to distinguish and group isolates of Echinococcus genetically, and have included representative isolates from many of the proposed intraspecific groups. Isolates were analysed for sequence variation within a region of the mitochondrial cytochrome c oxidase subunit I (CO1) gene. We chose to study mitochondrial DNA (mtDNA) since it generally evolves more rapidly than nuclear DNA and, because it is haploid, the allele haplotypes can be determined unambiguously. It is also advantageous that mtDNA does not recombine, a feature which simplifies analysis. The functional significance of variation in the CO1 gene was not considered in this study.
Rather, it was anticipated that variation in important biological characteristics could be linked to particular mitochondrial genotypes.
Materials and Methods
Total genomic DNA was prepared from vertebrate tissues and from fresh, frozen or alcohol preserved isolates of Echinococcus species using standard extraction techniques [6]. For the purposes of this study, an E. granulosus 'isolate' refers to the protoscoleces obtained from a single hydatid cyst or an individual adult worm. Isolates of E. multilocularis from China, Alaska and North America originated from alveolar cyst material surgically resected from infected human livers. The European isolate of E. multilocularis was obtained from a naturally infected rodent in Germany. All isolates were passaged by intraperitoneal injection in jirds (Meriones unguiculatus) or cotton rats (Sigmodon hispidus). The E. oligarthrus and E. vogeli isolates were similarly laboratory maintained, following original field isolation in Central America (Panama) and South America, respectively. In all cases, DNA was isolated from protoscoleces obtained several months following passage. It was not necessary to isolate mitochondrial DNA from total DNA, and the quality and quantity of DNA extracted from all samples was suitable for analysis by polymerase chain reaction (PCR).

Primers suitable for PCR and sequencing were designed, based on evolutionarily conserved regions of the CO1 sequence published for Fasciola hepatica [7]. The sequences of the primers used in the study were as follows: Forward 5'-TTTTTTGGGCATCCTGAGGTTTAT-3' (2575) and Reverse 5'-TAAAGAAAGAACATAATGAAAATG-3' (3021). The numbers in brackets refer to the position of the 5' end of the primer in the published F. hepatica sequence [7]. Double-stranded fragments were produced by standard PCR methods [8] except that one primer was previously phosphorylated at the 5' end [6]. Single-stranded sequencing templates were prepared by lambda-exonuclease digestion [9] and sequenced using Sequenase (USB). Sequences were verified as being CO1 by inference from similarity with mouse [10] and F. hepatica [7] sequences.

The partial mitochondrial CO1 sequences (366 bp) obtained were aligned by eye (Fig. 1) and isolates with identical CO1 genotypes grouped together (Table I). Amino acid sequences were predicted using known mitochondrial modifications to the universal genetic code [7] (Fig. 2). The extent of variation amongst the detected mitochondrial genotypes was estimated by pairwise comparison of nucleotide and amino acid sequences (Table II).
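The pairwise comparison described above amounts to counting mismatched positions between aligned sequences of equal length. The following is a minimal Python sketch of that count, not the procedure actually used in the study. The function names and the short toy sequences are hypothetical and are not taken from Fig. 1.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_differences(aligned):
    """Return {(name1, name2): differences} for every pair of sequences."""
    return {
        (n1, n2): count_differences(s1, s2)
        for (n1, s1), (n2, s2) in combinations(aligned.items(), 2)
    }


# Hypothetical toy alignment, for illustration only (NOT the CO1 data).
toy = {
    "ref": "GGATTTGGTATA",
    "var1": "GGATTTGGTATG",
    "var2": "GGGTTTGGTATG",
}
diffs = pairwise_differences(toy)
for pair, n in diffs.items():
    print(pair, n)
```

Applied to the 11 aligned 366-bp genotype sequences, the same count would give the nucleotide half of a matrix such as Table II.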
Fig. 1. Nucleotide sequences of a 366-bp fragment of the mitochondrial CO1 gene, for a number of species and strains within the Echinococcus genus. G1 to G7 represent the 7 variants identified within E. granulosus, M1 and M2 are E. multilocularis variants, and V and O represent E. vogeli and E. oligarthrus respectively. Dots denote homology with the G1 sequence. The first and last nucleotides correspond to nucleotides 2611 and 2979, respectively, of the F. hepatica CO1 sequence [7]. [Sequence alignment not reproduced.]

TABLE I
Host and geographical origins of Echinococcus isolates examined

Genotype  Host (number of isolates examined)  Origin
G1        Sheep (19)                          UK, Spain, China, New South Wales, Kenya, Uruguay, Turkey, Jordan, Lebanon, Italy, Tasmania
          Human (4)                           Queensland, Tasmania, China
          Kangaroo (2)                        Queensland
          Dingo (2)a                          Queensland
          Cattle (5)                          UK, China, Spain
          Camel (1)                           China
          Pig (1)                             China
          Goat (2)                            China, Masailand (Kenya)
G2        Sheep (2)                           Tasmania
G3        Buffalo (2)                         India
G4        Horse (2)                           UK, Spain
          Donkey (1)                          Ireland
G5        Cattle (1)                          Holland
G6        Camel (2)                           Somalia, Sudan
          Goat (1)                            Turkana (Kenya)
G7        Pig (2)                             Poland
M1        (3)                                 China, Alaska, North America
M2        (1)                                 Europe
V         (2)                                 South America
O         (1)                                 Panama

G1-G7, E. granulosus; M1, M2, E. multilocularis; V, E. vogeli; O, E. oligarthrus.
a Adult worms.
Isolates are arranged in the genotypic groups suggested by mitochondrial cytochrome c oxidase 1 gene sequence data.

Results

For each Echinococcus isolate examined, a single double-stranded PCR product of approximately the expected size (446 bp) was visualised on an ethidium-bromide stained agarose gel. Unambiguous sequences of at least 366 bp were obtained for each sample. Vertebrate tissue DNA was not amplified under the PCR conditions used for the Echinococcus isolates. Eleven distinct partial CO1 sequences were detected amongst the 56 Echinococcus isolates examined (Fig. 1). Out of the 366 surveyed
nucleotide positions, 76 variant nucleotide positions were found. G1 to G7 represent variants of E. granulosus, M1 and M2 are variants of E. multilocularis, and E. vogeli and E. oligarthrus are denoted by V and O, respectively. The 4 Echinococcus species could be clearly distinguished. A significant amount of CO1 sequence variation was found within E. granulosus. The 49 E. granulosus isolates examined could be divided into 7 discrete groups (G1-G7) (Fig. 1). The CO1 sequences of some of these groups are very similar. The geographical and host origins of the isolates examined are shown in Table I, along with their genotypic grouping as designated by virtue of their CO1 sequence. All the sheep isolates, with the exception of a
sample (pool of 2 isolates taken from different sheep on the same farm) from Tasmania, were found to have the G1 genotype, which was used as a reference sequence. All human, cattle (apart from an isolate from Holland), Australian sylvatic and Chinese isolates tested also fell into this category as did a single goat isolate from Kenya. The sequence obtained with the Tasmanian sheep sample (G2) differed from the standard sheep sequence at 3 of the 366 nucleotide sites examined, and 2 of these variants should correspond to an amino acid change (Fig. 2). A common sequence (G4) was found in E. granulosus isolates of horse and donkey origin. A unique CO1 sequence (G5) was found with the single cattle isolate examined from Holland. Camel isolates from
Somalia and the Sudan and a goat isolate originating from the Turkana region of northwest Kenya shared the same sequence (G6), which differed from the sequence found in pig isolates from Poland (G7) at only 1 nucleotide position. Minor variation was detected in the CO1 sequences of the E. multilocularis isolates, which fell into two groups (M1 and M2). Sequence for the European isolate examined (M2) varied from the others at 2 nucleotide sites, both of which should cause amino acid changes. The 2 E. vogeli isolate sequences (V) were identical, and distinct from all other isolates. The CO1 sequence determined for E. oligarthrus (O) was unique.

The derived amino acid sequences for this region of the mitochondrial CO1 protein are shown in Fig. 2. By considering the third base changes that have occurred in the sequences of some of these isolates, it is probable that AGA and AGG code for serine (rather than arginine) as they do in F. hepatica [7] and nematodes [11]. TGA is not a stop codon, and so probably codes for tryptophan, as has been documented in other metazoan mitochondrial DNA (mtDNA) [7]. ATA specifies isoleucine in the universal code but in mammals and Drosophila it codes for methionine. By maximising the conservation of amino acid sequence it is more likely that ATA codes for isoleucine than methionine in the Echinococcus organisms but both alternatives have been included in the predicted amino acid sequences shown.

Pair-wise nucleotide and amino acid sequence comparisons (Table II) provide some quantitative information about the relationships amongst the genotypic groups. The similarity of the G1/G2/G3, G6/G7 and M1/M2 groups is evident. In addition, G5 is quite similar to G6 and G7. The levels of difference between the recognised species of Echinococcus are not appreciably greater than those between genetically-defined subspecific groups of E. granulosus. This is the case for both nucleotide and amino acid sequence comparisons.

Fig. 2. Predicted partial amino acid sequences of mitochondrial cytochrome c oxidase I for each of the genetically distinct groups of Echinococcus. A dot indicates an amino acid that is conserved relative to the G1 sequence. Modifications of the universal genetic code were used, based on knowledge of the mitochondrial genetic code in other organisms [7]. G1-G7 represent the 7 variants identified within E. granulosus, M1 and M2 are E. multilocularis variants, and V and O represent E. vogeli and E. oligarthrus respectively. [Sequence alignment not reproduced.]

TABLE II
Levels of pairwise sequence differences among the 11 mitochondrial CO1 genotypes found within the genus Echinococcus

Genotype  G1   G2   G3   G4   G5   G6   G7   M1   M2   V    O
G1        -    3    2    30   32   33   34   34   36   31   42
G2        2    -    1    28   30   30   31   34   34   29   40
G3        1    1    -    29   31   31   32   33   35   30   41
G4        5    4    5    -    30   30   31   32   32   21   33
G5        7    6    7    3    -    17   18   31   31   33   32
G6        8    6    7    4    2    -    1    35   35   37   35
G7        8    6    7    4    2    0    -    36   36   38   36
M1        5/6  6/7  5/6  6/7  7/8  7/8  7/8  -    2    35   39
M2        7/8  6/7  7/8  6/7  7/8  7/8  7/8  2    -    35   39
V         6/7  5/6  6/7  3/4  6/7  7/8  7/8  7/8  7/8  -    34
O         6/7  5/6  6/7  2/3  3/4  4/5  4/5  4/6  5/7  5    -

G, E. granulosus; M, E. multilocularis; V, E. vogeli; O, E. oligarthrus. For each pairwise comparison, the number of nucleotide sequence differences is shown above the diagonal and the number of amino acid sequence differences below the diagonal. Where 2 alternative values are given for the number of amino acid sequence differences, the second value refers to the number of amino acid sequence differences if 'ATA' codes for methionine rather than isoleucine (see Fig. 2).
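The codon reassignments inferred above (AGA and AGG as serine, TGA as tryptophan, ATA as either isoleucine or methionine) explain why some amino acid comparisons in Table II carry two values. The sketch below is illustrative only. It starts from the universal code and overrides just the codons discussed in the text, so it should not be read as a complete flatworm mitochondrial code. The function names and the toy sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Universal genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_code(ata="I"):
    """Universal code with only the reassignments discussed in the text."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    code = dict(zip(codons, STANDARD_AA))
    # AGA/AGG -> Ser, TGA -> Trp; ATA is left as a choice between Ile and Met.
    code.update({"AGA": "S", "AGG": "S", "TGA": "W", "ATA": ata})
    return code


def translate(seq, ata="I"):
    """Translate complete codons of an in-frame nucleotide sequence."""
    code = build_code(ata)
    usable = len(seq) - len(seq) % 3
    return "".join(code[seq[i:i + 3]] for i in range(0, usable, 3))


def aa_differences(seq_a, seq_b, ata="I"):
    """Count amino acid differences between two aligned coding sequences."""
    return sum(a != b for a, b in zip(translate(seq_a, ata), translate(seq_b, ata)))


# Hypothetical toy sequences (NOT the CO1 data): 3 nucleotide differences.
toy_a = "ATAAGATGA"
toy_b = "ATTAGGTGG"
print(aa_differences(toy_a, toy_b, ata="I"), "/", aa_differences(toy_a, toy_b, ata="M"))
```

For this toy pair the three nucleotide changes are silent if ATA is read as isoleucine but cost one amino acid difference if it is read as methionine, which is the situation the paired values in Table II record.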
Discussion
Previous studies have shown that E. granulosus is made up of a number of genetically distinct subspecific forms [1,12]. In this study we have obtained part of the mitochondrial CO1 sequence for a number of isolates of E. granulosus and have been able to use this information to group them into 7 distinct intra-specific categories (Table I). Prior conclusions of strain patterns, based on a wide variety of intrinsic and extrinsic criteria [1] are broadly upheld by our findings. The G1 group represents the widespread and well recognised 'sheep' strain of E. granulosus. The remarkable uniformity of this strain is confirmed at the DNA sequence level, with isolates from sheep and a number of other hosts from geographically diverse regions
sharing an identical mitochondrial CO1 genotype. In biological and epidemiological features, the sheep strain is considered to be homogeneous except in Tasmania where morphological distinctiveness and a significantly shortened prepatency period have been reported [1,13]. Two (group G2) out of a total of 9 Tasmanian sheep isolates examined in this study differed slightly in CO1 sequence from all the other sheep isolates examined. These same 2 Tasmanian isolates could not be distinguished from UK and Australian mainland sheep isolates by restriction fragment length polymorphism (RFLP) analysis of the ribosomal DNA (rDNA) repeat region [14] although the well-established strain groups can be clearly differentiated using this approach [12]. Even if the characteristic features of the Tasmanian form of E. granulosus are the result of some fundamental genetic difference, knowledge of the general genetic similarity indicates that it has only very recently diverged from the common sheep strain. Gill and Rao [15] and Rao [16] have proposed that a morphologically and developmentally distinct E. granulosus strain exists in India, producing large fertile cysts in the lungs of buffaloes. It has been suggested [1] that this form of E. granulosus may be the same as the bovine-adapted strain reported in Switzerland, South Africa and various other
areas. The Indian buffalo isolates examined in this study (G3) are clearly quite distinct from the Dutch cattle isolate (G5), and only slightly different from the common sheep strain (G1). Our genetic investigations suggest that, like the Tasmanian sheep isolates, these Indian buffalo isolates are either variants of the common sheep strain or represent a genetically distinct but very closely related group. All of the hydatid material analysed from Chinese hosts (human, sheep, cattle, camel and pig) is representative of the common 'sheep' strain. These samples were collected from the Northern part of Xinjiang province in North West China. Other strains of E. granulosus may occur in this and other areas but, from the public health point of view, it is of significance that a range of animal hosts in this part of China harbour the sheep strain which is known to be infective to man. Evidence of morphological, biochemical and developmental differences between isolates of E. granulosus of domestic and sylvatic origin on mainland Australia led to their proposed designation as distinct strains [17]. However, this hypothesis has been questioned following new morphological studies [4,18] and RFLP analysis [14,19]. The current study found no genetic evidence to support the theory that a distinct Australian sylvatic strain of E. granulosus exists, at least in Queensland. Based on their host and geographical origins, groups G4, G5, G6 and G7 probably represent the well-characterised and distinct 'UK horse' [20], 'Swiss cattle' [21], 'African camel' [22,23] and 'European pig' [24] strains, respectively. Unlike G2 and G3, these groups are quite different genetically from the sheep strain (G1). This result complements earlier reports which have documented the biological distinctiveness of these strains. The G5 genotype ('Swiss cattle') is reasonably similar to the G6 ('African camel') and G7 ('European pig') group genotypes, but is still clearly identifiable. 
The G4 genotype ('UK horse') does not appear particularly close to any other genotype. There is some evidence of intra-specific variation in E. multilocularis. As reviewed by
Thompson and Lymbery [1], there have been reports of variability in morphology, pathogenicity, developmental characteristics and host specificity. Furthermore, RFLPs have been detected among E. multilocularis isolates originating from different endemic areas [25]. Three geographically distinct subspecies of E. multilocularis have been described. The nominate subspecies (E. m. multilocularis) is found in central Europe whilst E. m. sibiricensis is reported in Alaska and E. m. kazakhensis in Kazakhstan [26]. The E. multilocularis isolates examined here from Alaska, China and North America were genetically identical in the CO1 region and were very slightly different from the European isolate. Further sequence information will be necessary to determine whether stable genetic distinctions, which parallel biological and morphological dissimilarities, can be found.

The amount of between-group nucleotide sequence change we have detected is very slight in some cases. G1, G2 and G3 are obviously very closely related, and may simply represent polymorphism within the common 'sheep' strain. If so, the genotype G1 is probably the most common within that strain. It may be significant that variation, albeit slight, is only found in isolates which, for other reasons, are thought to be distinct from the common sheep strain of E. granulosus. It is because of this correlation with epidemiological differences that we decided to assign even minor variants to distinct groups. Similarly, the 'camel' (G6) and 'pig' (G7) strains, and the 2 E. multilocularis groups (M1 and M2) have been kept separate because other evidence [12,26] exists to support their distinctiveness. Further investigations may uncover more clear-cut genetic differences between these groups.

We have chosen to examine and compare Echinococcus isolates using the most objective and unambiguous method available, that of direct DNA sequence analysis.
The mitochondrial CO1 gene is sufficiently variable in the Echinococcus organisms to allow discrimination at the intraspecific level. This is despite the fact that this has been shown in other organisms to be one of the least variable of
the mitochondrial protein-encoding genes [7]. Additional sequence information obtained from the more variable genes may help clarify whether the minor sequence variants detected in this study represent polymorphisms within strain groups or genetically definable strains in their own right. Perhaps by chance, we have not constructed biologically irrelevant groups using this approach as may have been anticipated [27]. Although, as expected, the E. granulosus strain pattern has been shown to be complex, the variation is not continuous. The mtDNA types found in the G1/G2/G3, G4, G5 and G6/G7 clusters appear to be not only distinct but also discrete, suggesting that the categorisation of isolates into a number of well-defined intra-specific groups is not an oversimplification. This study reinforces the usefulness of direct genetic approaches in the study of variation. We have no data to suggest that we will be unable to identify unambiguously strains and species of Echinococcus using the DNA sequencing approach, as long as the DNA regions targeted are carefully selected. The gene examined need not have any direct relevance to the nature of the infection. An advantage of the approach is that sequence data provides information about the types and magnitude of evolutionary changes which have become fixed, and supplies a large number of characters suitable for evolutionary and phylogenetic analysis. We are currently using the CO1 and other sequence data in phylogenetic studies aimed at determining the actual interrelationships and taxonomic status of species and strains of Echinococcus.
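The discreteness of the G1/G2/G3, G4, G5 and G6/G7 clusters can be illustrated by linking any two genotypes whose nucleotide difference in Table II falls at or below a cut-off. The sketch below is a simple single-linkage grouping, not an analysis performed in the study. The cut-off values of 3 and 18 are assumptions chosen for illustration, and the function name is hypothetical. The difference counts are the above-diagonal values of Table II.

```python
GENOTYPES = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "M1", "M2", "V", "O"]

# Nucleotide differences from Table II (above the diagonal); row i lists
# genotype i against every genotype that follows it in GENOTYPES.
UPPER = [
    [3, 2, 30, 32, 33, 34, 34, 36, 31, 42],  # G1
    [1, 28, 30, 30, 31, 34, 34, 29, 40],     # G2
    [29, 31, 31, 32, 33, 35, 30, 41],        # G3
    [30, 30, 31, 32, 32, 21, 33],            # G4
    [17, 18, 31, 31, 33, 32],                # G5
    [1, 35, 35, 37, 35],                     # G6
    [36, 36, 38, 36],                        # G7
    [2, 35, 39],                             # M1
    [35, 39],                                # M2
    [34],                                    # V
]


def group_genotypes(threshold):
    """Single-linkage groups: link pairs differing at <= threshold sites."""
    parent = list(range(len(GENOTYPES)))

    def find(i):
        # Follow parent pointers to the group representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, row in enumerate(UPPER):
        for offset, diff in enumerate(row):
            j = i + 1 + offset
            if diff <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for idx, name in enumerate(GENOTYPES):
        groups.setdefault(find(idx), []).append(name)
    return list(groups.values())


print(group_genotypes(3))
print(group_genotypes(18))
```

With the assumed cut-off of 3 the grouping keeps G4 and G5 as singletons and joins only G1/G2/G3, G6/G7 and M1/M2. Raising it to 18 additionally merges G5 with G6/G7, in line with the observation that G5 is reasonably similar to those two genotypes.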
Acknowledgements

We would like to thank the National Health and Medical Research Council of Australia, the European Economic Community, the Tropical Health Program and the Australian Meat and Livestock Development Corporation (Junior Research Fellowship awarded to J.B.) for financial support. We would also like to thank the following colleagues who collected or assisted in the collection of the various Echinococcus isolates analysed in the study: F. van Knapen, Z. Pawlowski, C. Macpherson, C. Cuesta-Bandera, G. Macchioni, T. Wikerhauser, A. Nieto, I. Reisin, R. Matossian, M. Abo-Shehada, R. Lorenzini, S. Gulen, C. Hatch, H. Ross, P. Schantz, D. Obendorf, J. Goldsmid, O. Sousa, B. Gottstein, Z.X. Ding, W.G. Yang.
References
1 Thompson, R.C.A. and Lymbery, A.J. (1988) The nature, extent and significance of variation within the genus Echinococcus. Adv. Parasitol. 27, 209-258.
2 Rausch, R.L. (1967) A consideration of infraspecific categories in the genus Echinococcus Rudolphi 1801 (Cestoda: Taeniidae). J. Parasitol. 53, 484-491.
3 McManus, D.P. (1990) Characterisation of taeniid cestodes by DNA analysis. Rev. Sci. Tech. Off. Int. Epiz. 9, 489-510.
4 Hobbs, R.P., Lymbery, A.J. and Thompson, R.C.A. (1990) Rostellar hook morphology of Echinococcus granulosus (Batsch, 1786) from natural and experimental Australian hosts, and its implications for strain recognition. Parasitology 101, 273-281.
5 Schantz, P.M., Colli, C., Cruz-Reyes, A. and Prezioso, U. (1976) Sylvatic echinococcosis in Argentina, II. Susceptibility of wild carnivores to Echinococcus granulosus (Batsch, 1786) and host-induced morphological variation. Tropenmed. Parasitol. 27, 70-78.
6 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
7 Garey, J.R. and Wolstenholme, D.R. (1989) Platyhelminth mitochondrial DNA: evidence for early evolutionary origin of a tRNASerAGN that contains a dihydrouridine arm replacement loop, and of serine-specifying AGA and AGG codons. J. Mol. Evol. 28, 374-387.
8 Saiki, R.K., Scharf, S., Faloona, F., Mullis, K.B., Horn, G.T., Erlich, H.A. and Arnheim, N. (1985) Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anaemia. Science 230, 1350-1354.
9 Higuchi, R. and Ochman, H. (1989) Production of single-stranded DNA templates by exonuclease digestion following the polymerase chain reaction. Nucleic Acids Res. 17, 5856.
10 Bibb, M.J., Van Etten, R.A., Wright, C.T., Walberg, M.W. and Clayton, D.A. (1981) Sequence and gene organization of mouse mitochondrial DNA. Cell 26, 167-180.
11 Wolstenholme, D.R., Macfarlane, J.L., Okimoto, R., Clary, D.O. and Wahleithner, J.A. (1987) Bizarre tRNAs inferred from DNA sequences of mitochondrial genomes of nematode worms. Proc. Natl. Acad. Sci. USA 84, 1324-1328.
12 McManus, D.P. and Rishi, A.K. (1989) Genetic heterogeneity within Echinococcus granulosus: isolates from different hosts and geographical areas characterised with DNA probes. Parasitology 99, 17-29.
13 Thompson, R.C.A. (1986) Biology and systematics of Echinococcus. In: The Biology of Echinococcus and Hydatid Disease (Thompson, R.C.A., ed.), pp. 5-43. George Allen and Unwin, London.
14 Hope, M., Bowles, J. and McManus, D.P. (1991) A reconsideration of the Echinococcus granulosus strain situation in Australia following RFLP analysis of cystic material. Int. J. Parasitol. 21, 471-475.
15 Gill, H.S. and Rao, B.V. (1967) On the biology and morphology of Echinococcus granulosus (Batsch 1786) of buffalo-dog origin. Parasitology 57, 695-704.
16 Rao, B.V. (1968) Experimental transmission of Echinococcus of buffalo origin to foxes (Vulpes bengalensis). Vet. Rec. 83, 56-57.
17 Thompson, R.C.A. and Kumaratilake, L.M. (1985) Comparative development of Echinococcus granulosus in dingoes (Canis familiaris dingo) and domestic dogs (Canis familiaris familiaris) with further evidence for the origin of the Australian sylvatic strain. Int. J. Parasitol. 15, 535-542.
18 Lymbery, A.J., Thompson, R.C.A. and Hobbs, R.P. (1990) Genetic diversity and genetic differentiation in Echinococcus granulosus (Batsch, 1786) from domestic and sylvatic hosts on the mainland of Australia. Parasitology 101, 283-289.
19 Hope, M., Bowles, J., Prociv, P. and McManus, D.P. (1992) A genetic comparison of human and wildlife isolates of Echinococcus granulosus in Queensland and the public health implications. Med. J. Aust. 156, 27-30.
20 Thompson, R.C.A. and Smyth, J.D. (1975) Equine hydatidosis: a review of the current status in Great Britain and the results of an epidemiological survey. Vet. Parasitol. 1, 107-127.
21 Thompson, R.C.A., Kumaratilake, L.M. and Eckert, J. (1984) Observations on Echinococcus granulosus of cattle origin in Switzerland. Int. J. Parasitol. 14, 283-291.
22 McManus, D.P. A biochemical study of adult and cystic stages of Echinococcus granulosus of human and animal origin from Kenya. J. Helminthol. 55, 21-27.
23 Macpherson, C.N.L. and McManus, D.P. (1982) A comparative study of Echinococcus granulosus from human and animal hosts in Kenya using isoelectric focusing and isoenzyme analysis. Int. J. Parasitol. 12, 515-521.
24 Eckert, J. and Thompson, R.C.A. (1988) Echinococcus strains in Europe. Trop. Med. Parasitol. 39, 1-8.
25 Vogel, M., Müller, N., Gottstein, B., Flury, K., Eckert, J. and Seebeck, T. (1991) Echinococcus multilocularis: characterization of a DNA probe. Acta Trop. 48, 109-116.
26 Kumaratilake, L.M. and Thompson, R.C.A. (1982) A review of the taxonomy and speciation of the genus Echinococcus Rudolphi 1801. Z. Parasitenkd. 68, 121-146.
27 Thompson, R.C.A. and Lymbery, A.J. (1990) Intraspecific variation in parasites - what is a strain? Parasitol. Today 6, 345-348.