August 12, 1996
10:34
Annual Reviews
TIBATEXT.TRA
AR15-13
Annu. Rev. Microbiol. 1996. 50:401–29 Copyright © 1996 by Annual Reviews Inc. All rights reserved
TOWARDS A UNIFIED EVOLUTIONARY GENETICS OF MICROORGANISMS
M. Tibayrenc
UMR CNRS/ORSTOM 9926, Génétique Moléculaire des Parasites et des Vecteurs, ORSTOM, BP 5045, 34032 Montpellier Cedex 01, France
KEY WORDS: comparative approach, clonality, linkage disequilibrium, phylogenetic subdivision, epidemiology
ABSTRACT I propose here that evolutionary genetics, apart from improving our basic knowledge of the taxonomy and evolution of microbes (either eukaryotes or prokaryotes), can also greatly contribute to applied research in microbiology. Evolutionary genetics provides convenient guidelines for better interpreting genetic and molecular data dealing with microorganisms. The three main potential applications of evolutionary genetics in microbiology are (a) epidemiological follow-up (with the necessity of evaluating the stability of microbial genotypes over space and time); (b) taxonomy in the broad sense (better definition and sharper delimitation of presently described taxa, research of hidden genetic subdivisions); and (c) evaluation of the impact of the genetic diversity of microbes on their relevant properties (pathogenicity, resistance to drugs, etc). At present, two main kinds of population structure can be distinguished in natural microbial populations: (a) species that are not subdivided into discrete phylogenetic lineages (panmictic species or basically sexual species with occasional bouts of short-term clonality fall into this category); (b) species that are strongly subdivided by either cryptic speciation or clonal evolution. Improvements in available statistical methods are required to refine these distinctions and to better quantify the actual impact of gene exchange in natural microbial populations. Moreover, a codified selection of markers with appropriate molecular clocks (in other words: adapted levels of resolution) is sorely needed to answer distinct questions that address different scales of time and space: experimental, epidemic, and evolutionary. The problems raised by natural genetic diversity are very similar for all microbial species, in terms of both basic and applied science. Despite this fact, a regrettable compartmentalization among specialists has hampered progress in this field. I propose a synthetic approach,
401 0066-4227/96/1001-0401$08.00
relying on the statistical improvements and technical standardizations called for above, to settle a unified evolutionary genetics of microorganisms, valid whatever the species studied, whether eukaryotic (parasitic protozoa and fungi) or prokaryotic (bacteria). Apart from benefits for basic evolutionary research, the anticipated payoff from this synthetic approach is to render routine and commonplace the use of microbial evolutionary genetics in the fields of epidemiology, medicine, and agronomy.
CONTENTS
INTRODUCTION
GOALS AND TOOLS OF EVOLUTIONARY GENETICS IN MICROBIOLOGY
  Principles and Methods of Microbial Population Genetics
  Phylogenetic Analysis
MAIN RESULTS OBTAINED IN EVOLUTIONARY GENETICS OF EUKARYOTIC MICROORGANISMS
  A Paradigm of the Clonal Model: Trypanosoma cruzi
  Other Parasitic Protozoa
  Yeasts and Fungi
A TELLING COMPARISON: DATA FROM BACTERIA
PRESENT DEBATES ON POPULATION STRUCTURE OF MICROBIAL SPECIES
  Possible Impact of Selection; The Telling Model of HIV/AIDS Opportunistic Infections
  Linkage Disequilibrium Does Not Amount to Clonality
  Proposals for Additional Approaches: Data Weighting and Avoidance of a Type II Statistical Error
ADDITIONAL RECOMMENDATIONS
  Better Defining the Question Under Study
  Setting the Molecular Clock: Different Classes of Genetic Markers for Different Uses
  Resettling the Debate: Two Main Kinds of Population Structure
CONCLUDING REMARKS
INTRODUCTION
During the past few years, a considerable amount of work has been undertaken on the genetic characterization of microorganisms. This has permitted substantial progress in the areas of both strain identification and taxonomy. Nevertheless, the overall result has been somewhat disappointing. Strain typing and molecular taxonomy are seen as narrow fields in microbiology, and the studies currently performed in these fields run the risk of appearing repetitive and uninformative to scientists who work on clinical science, epidemiology, or vaccine/drug development. Currently, three major factors appear responsible for preventing this line of research from becoming a vast, autonomous scientific field, as it deserves to be. Thus, this review represents an urgent call to correct these handicaps. I take each of these major points in turn:
1. Overemphasis on strain typing. Although strain typing is a relevant payoff of evolutionary genetics, the discipline can provide much more, in terms of both basic and applied science.
2. Empiricism. Owing mainly to the overemphasis given to strain typing, many researchers do not attempt to understand the biological mechanisms that generate microbial genetic diversity. They confine themselves instead to scoring bands on gels, which they use, at best, to compute dendrograms. Although this approach can yield some information, in many cases it is totally misleading.
3. Compartmentalization (possibly the most important factor, and perhaps the most difficult to correct). For example, researchers specializing in different categories of microorganisms tend not to interact, even if they work in the same field of genetic characterization of microbes. This is all the more regrettable because, in terms of applied research, the problems raised by, and the potential payoffs of, evolutionary genetics are strikingly similar, whatever the microorganism under study, be it a bacterium, a parasitic protozoan, or a fungus (see Table 1).

In other publications (62, 64), I have advocated a comparative approach making use of common evolutionary genetic methods valid for any microorganism, whether of agronomic or medical interest. This review is not a general review on microbial evolutionary genetics. Rather, it has the same specific goal as the preceding ones (62, 64): breaking down the existing compartmentalization in three sets of relationships, across different scientific fields that currently suffer from a lack of interaction and that have much to learn from each other. The three relationships are those (a) between basic and applied science; (b) among parasitologists, mycologists, and bacteriologists; and (c) between specialists studying microorganisms of medical, veterinary, agronomic, or industrial interest.
Although special attention is paid in this paper to the case of parasitic protozoa, particularly to the illustrative example of Trypanosoma cruzi, the agent of Chagas’ disease, telling comparisons are drawn with other kinds of microbes to illustrate the urgent need for this unified evolutionary genetics of microorganisms.
Table 1 The applied and basic aspects of evolutionary genetics in microorganisms [after (62), slightly modified]

Main applications of evolutionary genetics in microbiology | Translation in terms of basic science
Epidemiological follow-up (strain typing): checking for the stability of microbial genotypes over space and time; short-term epidemiology (a); long-term epidemiology (b) | Structure and dynamics of microbial populations; impact of genetic recombination on population structure; evolutionary role of sex
Improvement of taxonomy (must be compared with phylogenetic data (c)): exploring the relationships between genetic diversity and the commonly accepted taxonomical nomenclature; looking for hidden discrete genetic subdivisions within presently identified species | Molecular phylogeny; evolutionary role of sex
Studies downstream from genetics: impact of genetic diversity and phylogenetic divergence on the relevant properties of microorganisms (pathogenicity; resistance to drugs; immunological patterns; susceptibility to potential vaccines; vector and host specificity) | Adaptive significance of microbial genetic diversity; vector/host/parasite co-evolution

Time and space scales: (a) days-months, hospital or village based; (b) months-years, country- or continent-wide, up to the whole geographical range of the species; (c) millions of years, country- or continent-wide, up to the whole geographical range of the species.

GOALS AND TOOLS OF EVOLUTIONARY GENETICS IN MICROBIOLOGY
Studying the evolution of microbes is a goal in itself and can be considered a poorly explored and promising field in terms of basic science. In comparison to evolution in such star organisms as Homo sapiens, Mus musculus, Drosophila
melanogaster, and Caenorhabditis elegans, evolution in microorganisms has received relatively limited attention, except perhaps in the case of Escherichia coli. Microbes provide a fascinating and original material in which to study general evolutionary phenomena (see Table 1). We are using our rudimentary concepts of species, subspecies, clonality*, sex, and others for microbes, concepts that have been developed largely for higher organisms. Now, although this step is indispensable to the construction of eminently falsifiable hypotheses, it appears possible that in the evolutionary genetics of microbes we may discover totally new concepts. In turn, such new concepts will help us refurbish our ideas on the evolutionary genetics and systematics of higher organisms.

* For definitions of all terms with asterisks, see the Glossary at the end of this chapter, before the Literature Cited section.
In terms of applied microbiology, I propose that evolutionary studies, from the comparative viewpoint outlined above, are an indispensable detour for microbiologists working on any species and involved in the tasks of (a) strain typing, (b) improving microbial taxonomy, and (c) taking into account the crucial parameter of microbial genetic diversity in all biological studies dealing with microorganisms (Table 1).

Strain typing has been and still is a classical and widely used application of genetic methods in applied microbiology. Many studies amount to empirical checks of the identity or nonidentity of stocks* based on a given set of markers and rely on the simple assumption that two identical stocks belong to the same clonal* lineage. This approach can be deeply misleading for two reasons. First, taking a more discriminative marker is bound to show additional variability within each of the formerly individualized clones* (63). The notion of a molecular clock* is crucial to handle this difficult problem. Second, an empirical approach totally neglects the possibly important role played by genetic exchange in interfering with the stability of the genotypes that are the target of strain typing.

When taxonomical problems are considered, the initial working hypothesis is of course the presently accepted taxonomy. Microbial taxonomy has very often stemmed from medically relevant biological features (for example, pathogenicity), which is evidently acceptable and indeed desirable. Nevertheless, evolutionary genetics has much to tell us about the biological nature of these taxa. Evolutionary genetics considers two problems. First, are the taxa presently described phylogenetically meaningful? Second, are the taxa presently described actually subdivided into discrete phylogenetic subdivisions?

Lastly, many microbial species exhibit considerable genetic diversity.
It is reasonable to expect this genetic variability to have a profound impact on the biological properties of microorganisms, especially those properties that are medically relevant (virulence, resistance to drugs, etc). I propose that the crucial parameter of genetic variability no longer be neglected by scientists working on applied research, and that evolutionary genetics provides a rigorous framework for downstream studies (immunology, clinical research, and vaccine and drug design). A first minimal step would be, for example, that when a new drug or a new vaccine is developed against a given strain, its efficiency is checked on a convenient set of stocks, selected to be representative of the whole variability of the species. A central point of the approach proposed here is that the two sides (basic and applied research, the right and left sides of Table 1) of microbial evolutionary genetics should by no means be treated separately and are indispensable to each other. Basic genetics is a mandatory detour on the way to properly interpreting data usable for applied studies. In turn, data generated by epidemiology,
immunology, clinical surveys, etc enrich the basic approach and permit the working hypotheses elaborated by it to be falsified. Two complementary tools are offered by evolutionary genetics: population genetics and phylogenetics. The first, population genetics, is more extensively discussed in this review, for it has been applied on a larger scale than has phylogenetics to the study of microbes; it presents important specificities through comparison with the population genetics of higher organisms. It gives an ‘evolutionary snapshot’ of the population under study, allowing exploration of structure in space and time, the importance of gene∗ flow, the existence of discrete genetic lines, etc. The second tool, phylogenetics, addresses a much greater time scale and aims to reconstruct past evolutionary events and evaluate the level of phylogenetic divergence among organisms. An important point to emphasize is that microevolutionary studies (population genetics and phylogenetics at limited levels of divergence) appear more relevant for applied microbiology than macroevolutionary studies, and are therefore the only ones treated in the present review. Strain typing for epidemiological follow-ups obviously requires that evolution be studied at a microlevel, as does taxonomy. Indeed, when identifying species, the task of distinguishing Trypanosoma cruzi from Toxoplasma gondii, even without the aid of genetics, is not difficult; however, it is trickier to decide whether Leishmania panamensis and L. guyanensis (two parasitic protozoan taxa that are extremely close phylogenetically) are ‘good’ species.
Principles and Methods of Microbial Population Genetics
I focus here on those applications of population genetics that appear to be more specific to microorganisms (Table 1). Classical applications, such as the study of geographical differences among populations, migrations, and natural selection, are, nevertheless, also usable in microbiology. However, in the approaches summarized hereafter they are considered as possible biases that should be eliminated from the analysis. Many studies in microbial population genetics have been centered on the clonality/sexuality debate, which has important implications in terms of applied research (35, 49, 67). If a species is sexual (in the broad sense: if it undergoes frequent genetic exchange, by whatever mechanism), each multilocus genotype will be unstable owing to frequent genetic recombination and can thus be equated to an ephemeral variant. However, if a species is clonal, each genotype will reproduce itself in the manner of a “genetic photocopy.” If clonality remains preponderant on an evolutionary scale, clonal lines will tend to accumulate more and more divergent mutations, including those mutations that govern medically relevant characters (virulence, resistance to drugs, etc). In basically clonal species, a statistical correlation is therefore expected between genetic and
biological differences among strains; this is not the case in predominantly sexual species. When the notion of species is considered, the classical biological concept of species (36) is valid for sexual species but not for asexual ones. Mating experiments in the laboratory have been proposed (26, 72), an approach that is quite informative for our knowledge of the basic biology of these organisms. Nevertheless, even when successful, these experiments show only that the potentiality for gene exchange is present in the species under study and are poorly informative about the actual frequency∗ and impact of this phenomenon in natural populations (62). Surveys of natural populations through the methods of population genetics are therefore indispensable. A similar approach has been taken for bacteria (see 20, 50), parasitic protozoa (see 62), and yeasts (see 7, 44). Panmixia∗ (where gene exchanges occur randomly in the population under survey) is taken as the null hypothesis, for it is the only situation for which theoretical assumptions are well codified. Significant statistical departures from panmictic∗ expectations are then taken as circumstantial evidence that the population is clonal. An interesting feature of this indirect approach is its ‘blind’ aspect. Whatever the biology, the genome size and structure, and the mating system of the organism under study, it is always possible to score departures from panmictic expectations in its natural populations and to estimate statistically their level of significance and, therefore, their importance. Statistics used to provide evidence for departures from panmixia consider either the lack of segregation∗ or the lack of recombination. As examples, Table 2 shows the tests commonly used in my group. Other approaches, which rely on the same basic principle, are of course possible (see 24, 35, 53). 
Although all tests rely on the same basic principle (departures from panmictic expectations), they could have different levels of resolution and therefore could lead to divergent conclusions even when analyzing the same samples. An effort should be made to standardize the tests and to scale their respective levels of resolution so that different sets of data can be compared in an informative manner. Segregation refers to random reassortment of different alleles∗ at a given locus∗ . Tests based on a lack of segregation are based on the Hardy-Weinberg∗ statistics, the use of which requires the fulfillment of certain criteria: Alleles must be identified; the ploidy level of the organism must be known; and this level must not equal 1. Segregation tests are therefore not usable for bacteria and for eukaryotic microorganisms such as Plasmodium falciparum (the agent of malaria, which, when found in humans, is haploid∗ ). For parasitic protozoa such as Trypanosoma and Leishmania, diploidy∗ (28, 65) is parsimoniously hypothesized, but is not proven. Alleles are not always easily identified with the markers used for evolutionary genetics.
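The segregation criterion can be illustrated with a minimal sketch of a chi-square test of Hardy-Weinberg proportions. It assumes the conditions listed above are met: one diploid locus, two identified alleles (here called A and B), and expected class sizes large enough for a chi-square approximation. The function name and the genotype counts are hypothetical and chosen only for illustration; they are not data from the studies cited.

```python
from math import erfc, sqrt


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at one diploid locus
    with two alleles, A and B (1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("empty sample")
    p = (2 * n_aa + n_ab) / (2 * n)  # observed frequency of allele A
    q = 1.0 - p                      # observed frequency of allele B
    # Genotype counts expected under random mating: p^2, 2pq, q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return expected, chi2, p_value


# Made-up sample in which every stock is heterozygous ("fixed heterozygosity")
expected, chi2, p_value = hardy_weinberg_chi2(0, 40, 0)
print(expected, chi2, p_value)
```

With small expected class sizes, an exact (combinatorial) calculation would be preferable to the chi-square approximation used in this sketch.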
Recombination tests are more robust, for they can be performed without any of the requirements that are mandatory for segregation tests. Recombination tests may be performed not on individual loci but rather on groups of loci, even if these groups are not precisely identified. The only requirement is that the loci or groups of loci be independent from each other (61, 69). Due to their flexibility, recombination tests are therefore especially suitable for population genetics of microorganisms. The tests all explore different facets of a single biological phenomenon: linkage disequilibrium*. Among them, the g test (correlation between independent sets of genetic markers; see Table 2 and Figure 1) appears to be especially powerful (see 54, 61). Departures from panmixia can be interpreted as the result not specifically of clonality but of any obstacle to gene shuffling. These obstacles can be organized into two main categories: (a) physical obstacles and (b) biological obstacles. The first class refers to genetic isolation by space, time, or both. The second class refers to natural selection, physical linkage of different genes on the same chromosome, cryptic speciation, and clonality. Some means to identify potential causes of departures from panmixia other than clonality have been detailed elsewhere (62, 67).
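A recombination test of the kind listed as criterion e in Table 2 (deficit of recombinant genotypes) can be sketched as a Monte Carlo simulation. The sketch below assumes that panmixia can be mimicked by shuffling the single-locus genotypes independently at each locus, which keeps the frequencies at each locus but breaks any association between loci. That randomization scheme, the function name, and the toy data are assumptions made for illustration and may differ from the procedure used in the cited studies.

```python
import random


def recombinant_deficit_test(samples, iterations=10_000, seed=0):
    """Estimate the probability of observing as few or fewer distinct
    multilocus genotypes than seen in `samples` under random association.

    samples: list of tuples, one single-locus genotype per locus.
    """
    rng = random.Random(seed)
    observed = len(set(samples))
    # One list per locus, holding the genotype of every individual there
    loci = [list(column) for column in zip(*samples)]
    hits = 0
    for _ in range(iterations):
        shuffled = []
        for column in loci:
            copy = column[:]
            rng.shuffle(copy)  # break associations between loci
            shuffled.append(copy)
        simulated = len(set(zip(*shuffled)))
        if simulated <= observed:
            hits += 1
    return observed, hits / iterations


# Toy data: two multilocus genotypes only, each sampled ten times
clonal_like = [("A", "A", "A")] * 10 + [("B", "B", "B")] * 10
print(recombinant_deficit_test(clonal_like, iterations=2000))
```

A small estimated probability indicates fewer multilocus genotypes than random association would produce. As stated above, such a departure points to some obstacle to gene shuffling, not specifically to clonality.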
Table 2 Statistical tests used to evidence departures from panmictic expectations [after (67), slightly modified]

Criterion | Description
Segregation (within locus) |
a | fixed heterozygosity
b | absence of segregation genotypes
c | deviation from Hardy-Weinberg expectations
Recombination (between loci) |
d | overrepresented, widespread identical genotypes (statistical tests d1 [#, *] and d2 [§, **])
e | deficit of recombinant genotypes [@, **]
f | classical linkage disequilibrium analysis [¶, **]
g | correlation between two independent sets of genetic markers [**]

# Probability of observing as many individuals of a given genotype as (or more individuals than) actually observed in the sample.
§ Probability of observing any genotype as often as, or more often than, the most common genotype in the sample.
@ Probability of observing as few or fewer different genotypes than actually observed.
¶ Probability of finding a linkage disequilibrium as high as, or higher than, actually observed in the sample.
* Performed either by a chi-square test (when expected sizes are sufficient) or by a combinatorial analysis.
** Performed by a Monte Carlo simulation with 10^4 iterations.

Figure 1 Two dendrograms derived from genetic distances* obtained by MLEE (right) and RAPD (left) on 24 stocks of Trypanosoma cruzi, the agent of Chagas’ disease. Fair agreement between the two dendrograms is evidence of linkage disequilibrium. The symbols on the left-hand dendrogram correspond to RAPD fragments that have a synapomorphic value. They specifically mark all the genotypes that pertain to a given phylogenetic cluster (69). Genotypes 19, 20, 32, and 39 have been characterized for various biological parameters (30, 45), as illustrated by Figure 4.

Phylogenetic Analysis
As stated above, phylogenetic studies that address higher levels of divergence (for example, between different genera) are outside the scope of this review,
which instead focuses on questions at the subspecific and specific levels. Phylogenetic criteria are especially useful to better define microbial taxa, in which the biological concept of species (36) is often difficult or impossible to use. Phylogenetic analysis for microorganisms presents fewer specificities than population genetics. Many textbooks are available for the reader who wants complete information on these methods (see 10, 23, 27, 74), and only a few general remarks, especially relevant for microorganisms, are made here. Methods for analyzing phylogenetic diversity can be consigned to two main categories: (a) numerical methods, and (b) “true” phylogenetic methods.

NUMERICAL METHODS
The first class (52) contains quantifying methods that take into account and give an equal weight to all available characters. Dissimilarities among individuals may be calculated by various kinds of genetic distances*. Overall distances among the individuals that compose the population can be conveniently visualized by a dendrogram (see Figure 1). Such dendrograms cannot be regarded directly as phylogenetic trees. They would be equivalent to such trees only if all the individuals surveyed represented discrete phylogenetic lineages (for example, natural clones, or distinct biological species) and the rate of evolution were the same in all lines.

PHYLOGENETIC METHODS
Phylogenetic methods aim to discard the bias due to different rates of evolution among phylogenetic lines. The Fitch-Margoliash method (18) and the Wagner method (16, 17) elaborate phylogenies based on genetic distances, and the cladistic* method (23) relies on character states. Cladistics has been developed largely on the basis of morphological characters, and electrophoretic characters definitely convey a different kind of phylogenetic information. Consequently, their use in cladistics is much in debate. In general, the phylogenetic value of electrophoretic data is still a subject of intense debate (46).
Electrophoretic information [isoenzymes*, Random Amplification of Polymorphic DNA (RAPD)*, Restriction Fragment Length Polymorphism (RFLP)*] carries a notable risk of homoplasy*. For example, isoenzyme bands that comigrate do not necessarily correspond to identical proteins; moreover, the loss of a RAPD fragment is not an improbable event and can happen independently in different phylogenetic lines. Whatever the actual nature of the genetic variation under study, a computer is always able to generate a dendrogram, even if the data have no phylogenetic value. For example, if a human population is surveyed for some set of isoenzyme loci, it is possible to compute genetic distances among the individuals of this population and to draw a dendrogram from these distances. This dendrogram is definitely not a phylogenetic tree; it is only a reflection of the individual variability of the population under survey.
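The numerical approach described above can be sketched in a few lines: a distance is computed from band presence/absence profiles, and the stocks are then clustered into a dendrogram. The sketch assumes a Jaccard distance and UPGMA (average-linkage) clustering, two common choices that are not necessarily those used for Figure 1. The stock names and band sets are invented for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two stocks scored as sets of bands present."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def upgma(profiles):
    """Average-linkage clustering. Returns a nested tuple
    (left_subtree, right_subtree, height); leaves are stock names."""
    names = sorted(profiles)
    trees = {i: name for i, name in enumerate(names)}
    sizes = {i: 1 for i in trees}
    dist = {
        (i, j): jaccard_distance(profiles[names[i]], profiles[names[j]])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(trees) > 1:
        # Closest pair of clusters; ties are broken by cluster id
        (i, j), d_ij = min(dist.items(), key=lambda kv: (kv[1], kv[0]))
        merged = (trees.pop(i), trees.pop(j), d_ij / 2)
        new_dist = {}
        for k in trees:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted average distance to the merged cluster
            new_dist[(k, next_id)] = (
                sizes[i] * d_ik + sizes[j] * d_jk
            ) / (sizes[i] + sizes[j])
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        trees[next_id] = merged
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values()))


# Invented band profiles for four hypothetical stocks
profiles = {
    "s1": {1, 2, 3, 4},
    "s2": {1, 2, 3, 5},
    "s3": {6, 7, 8, 9},
    "s4": {6, 7, 8, 10},
}
print(upgma(profiles))
```

The routine returns a tree for any input, including data with no phylogenetic signal, which is exactly the caveat raised above.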
Even when the trees generated from genetic data represent true phylogenetic trees, the resolution of the more widely used markers [i.e. multilocus enzyme electrophoresis (MLEE)* and RAPD] is fair at the intermediate levels of divergence only. They lack resolution at the upper levels of divergence, for two reasons. First, at these upper levels, the genetic distances based on, for example, isoenzymes become less meaningful. Second, when generating trees, clustering becomes less reliable for the upper levels of divergence (46). At the lower levels of divergence, RAPD and to a greater degree MLEE are probably unable to mark all genetic variability. Despite these restrictions, phylogenetic analysis brings valuable information to the study of microbial genetic variability. As underscored above, phylogenetic analysis is complementary to population genetics, for it addresses a longer scale of time. This allows informative results about the actual impact of gene exchange on microbial evolution to be obtained. As an example, phylogenetic structuring can appear weak in some populations that show drastic departures from panmixia [the case for some populations of Trypanosoma brucei, the agent of human African trypanosomiasis (32)]. This suggests that, for the species considered, recombination is probably rare at a limited time scale (years or hundreds of years) but plays a notable role at an evolutionary level. Another elegant use of phylogenetics, which actually amounts to a linkage disequilibrium analysis, consists of comparing sequence phylogenies obtained from different genes in a given species. Discordant phylogenies are taken as evidence for gene transfer within the species under study (12, 13). Apart from its contribution in assessing population structure, phylogenetic analysis, as already noted, is indispensable to the better definition of species in the field of microbiology, where the biological concept of species (36) is generally difficult or impossible to apply.
Figure 2 Isoenzyme analysis for the glucose phosphate isomerase system, showing genetic variability in various microorganisms. From left to right: (a) four stocks of Trypanosoma cruzi, the agent of Chagas’ disease; (b) Leishmania infantum (the agent of visceral leishmaniasis), L. chagasi (the agent of New World visceral leishmaniasis), and L. braziliensis and L. amazonensis (two agents of New World mucocutaneous leishmaniasis); and (c) four stocks of Escherichia coli. This picture illustrates the polyvalence of MLEE. The two first T. cruzi samples exhibit typical heterozygous* patterns that are found unchanged on vast geographical areas (“fixed heterozygosity,” a classical indication of clonality) (photograph by C Barnabé).

MAIN RESULTS OBTAINED IN EVOLUTIONARY GENETICS OF EUKARYOTIC MICROORGANISMS
A Paradigm of the Clonal Model: Trypanosoma cruzi
The main results obtained from evolutionary genetics studies of T. cruzi are used below to illustrate the three main applications of evolutionary genetics presented above.

POPULATION STRUCTURE AND STRAIN TYPING
T. cruzi constantly exhibits all the classical marks of circumstantial evidence for clonality. When the whole species is considered, both important departures from Hardy-Weinberg equilibrium (see Figure 2) and a considerable linkage disequilibrium are recognized
(63, 66, 69). To illustrate the strength of this linkage, Figure 1 shows a close agreement between genetic trees elaborated from isoenzymes and RAPD. Under the null hypothesis that recombination* in this set of stocks is random, the probability of this correlation, as measured by a nonparametric Mantel test (31), is p < 10^-4 (Test g; see 61). A similar concordance has been noted between isoenzymes and pulsed-field gel electrophoresis (PFGE)* (48). A concrete manifestation of this pattern is that prior knowledge of a given MLEE T. cruzi genotype makes it possible to predict its RAPD genotype with a high probability. In T. cruzi, Hardy-Weinberg and linkage disequilibria persist in close sympatric* conditions and when lower genetic subdivisions of the species are treated separately (62). The population genetic results obtained in the case of
the agent of Chagas’ disease do not rule out the possibility of occasional bouts of recombination. They do suggest that recombination is severely restricted in natural populations of this species considered as a whole, as well as within its genetic subdivisions. PHYLOGENETIC ANALYSIS AND TAXONOMICAL INFERENCES Several comments can be made on the double tree formed by two dendrograms of Trypanosoma cruzi, illustrated in Figure 1. First, it shows that T. cruzi is subdivided into two major phylogenetic lineages, each characterized by several RAPD synapomorphic∗ characters. Corresponding isoenzyme synapomorphies are also easy to find (C Barnab´e & M Tibayrenc, in preparation). Pioneering work by Miles et al (38) had already recognized the existence of principal zymodemes∗ , which remain entirely valid as polar phylogenetic types in T. cruzi. Second, the genetic divergence between the two main subdivisions amounts to that found between Leishmania braziliensis and Leishmania guyanensis, as shown by direct comparison relying on exactly the same RAPD technique (S Brisse & AL Ba˜nuls in my laboratory, unpublished data; see Figure 3). Third, each of the two
Figure 3 RAPD analysis, showing genetic variability in various microorganisms. From left to right: (a) five stocks of Trypanosoma cruzi, the agent of Chagas’ disease; (b) Leishmania infantum, L. chagasi, and L. donovani (three agents of visceral leishmaniasis); and L. braziliensis and L. mexicana (two agents of New World mucocutaneous leishmaniasis); (c) four stocks of Candida albicans; (d) three stocks of Mycobacterium tuberculosis; and (e) M. bovis. Samples at either end are molecular weight markers. This picture illustrates the polyvalence of RAPD. (photograph by S Brisse).
Figure 4 Principal component analysis of 26 stocks of Trypanosoma cruzi, the agent of Chagas’ disease. The stocks fall into three main genetic groups (genotypes 19/20, 32, and 39; see Figure 1) evidenced by MLEE. Various biological parameters have been studied, including in vitro culture growth and in vitro drug resistance. Biological parameters make it possible to distinguish among the three genetic groups, which is evidence for linkage between biological and genetic parameters (after 45).
main subdivisions remains considerably heterogeneous. Fourth, the robustness of the clustering subdivisions evidenced in Figure 1 is corroborated by strong agreement between isoenzyme and RAPD data.
LINKS BETWEEN GENETIC AND BIOLOGICAL DIVERSITY
We have tested the working hypothesis that the phylogenetic divergence accumulated among the clonal lineages subdividing T. cruzi has a statistical impact on this parasite’s biological diversity (30, 45). Several parameters, including in vitro culture growth, virulence in mice, and in vitro sensitivity to major antichagasic drugs, have been quantified on a panel of 26 stocks representative of the whole genetic variability of the species. As illustrated by Figure 4, an overall highly significant correlation has been noted between genetic distances on the one hand and biological differences on the other hand, in agreement with the working hypothesis.
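A correlation between two distance matrices, such as genetic versus biological distances here or isoenzyme versus RAPD distances earlier, can be assessed with a Mantel-type permutation test. The following is only a minimal sketch of that idea, not the procedure used in the cited studies: the function names, the use of a Pearson correlation as the statistic, and the toy distance matrices are all assumptions made for illustration.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test between two square distance matrices.

    Stock labels of d2 are shuffled to build the null distribution of
    the correlation. Returns (observed correlation, permutation p-value).
    """
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    observed = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


def line_distances(coords):
    """Toy distance matrix: absolute differences between 1-D coordinates."""
    return [[abs(a - b) for b in coords] for a in coords]


# Hypothetical data: six stocks falling into two groups on both criteria.
genetic = line_distances([0, 1, 2, 10, 11, 12])
biological = line_distances([0, 2, 3, 20, 21, 25])
r, p = mantel(genetic, biological)
```

With so few stocks the permutation p-value cannot be very small; real analyses would use the full panel and more permutations.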
Other Parasitic Protozoa
Evolutionary genetic approaches developed for T. cruzi have been applied to several major species of parasitic protozoa (68). The main results are summarized below. Classical circumstantial evidence for a clonal population structure (strong linkage disequilibrium) has been found for the following species: Entamoeba histolytica, Giardia duodenalis, Leishmania spp., Trypanosoma brucei, and Toxoplasma gondii. In Leishmania spp., despite clear indications of clonality (67, 68), convergent evidence suggests that some genotypes result from hybridization events between different species (15). The case of Toxoplasma is illustrative, for a sexual cycle is well known to occur in this parasite. The clonal model proposed for this species (51, 64, 67) suggests that the impact of the sexual cycle in this parasite’s natural populations is limited, a conclusion that could be reached only by population genetic analysis. With the data now available, it is difficult to decide whether Toxoplasma gondii has a clonal structure or is composed of several cryptic species (59). The case of African trypanosomes is informative too because in their case the possibility of mating, first inferred from field data (58), has been amply confirmed by laboratory experiments (26). Nevertheless, the hypothesis of panmixia (58) has not been confirmed. Cibulskis (8) proposed that Trypanosoma brucei populations were subdivided into discrete lines, while we postulated a clonal population structure for the species (33, 67, 68). Nevertheless, the long-term stability of T. brucei natural clones is still under debate (9, 32, 35, 54, 55). In the cases of Leishmania spp., Toxoplasma gondii, and Trypanosoma brucei, the coexistence of indications for both clonality and genetic exchange illustrates the fact that the clonal model is compatible with a certain level of gene transfer (63, 68).
The situation appears to be different for Plasmodium falciparum, the agent of the most malignant form of malaria. Random mating (6, 71) has been postulated in the case of this parasite, which undergoes obligatory meiosis at each transmission cycle. Evidence for linkage disequilibrium using data from the literature led us to suspect the existence either of some kind of uniparental propagation or of cryptic speciation in certain populations of this parasite (68). More recent studies have relied on polymerase chain reaction (PCR) analysis of oocysts isolated from mosquito vectors. Notable rates of cross-fertilization have been evidenced in Tanzania (1) whereas high levels of “selfing” have been recorded in Papua New Guinea (43). Despite this last result and the fact that statistically significant linkage has been evidenced in my laboratory by MLEE analysis (2), P. falciparum populations are clearly far less structured than populations of the other parasites cited above. Current debate (43, 60)
considers the possibility that this situation is compatible with the sympatric coexistence of stable, discrete strains (multilocus phenotypic associations) in P. falciparum. When phylogenetic studies are considered, cladistic approaches have been widely applied (29) in studies of isoenzyme variability of Leishmania to such an extent that a notable refinement of the taxonomic nomenclature of this genus was possible. Leishmania species are now mainly described according to phylogenetic or at least genetic characters. Nevertheless, it is probable that too many different species of Leishmania are now being described with little if any phylogenetic and epidemiological relevance.
Yeasts and Fungi
In comparison to those of parasitic protozoa, few results have been obtained on the evolutionary genetics of fungal organisms, although considerable efforts have been spent on their molecular characterization. The most studied fungus is Candida albicans, a frequent agent of opportunistic infections in HIV+ patients. Contradictory results, ranging from absent or limited linkage disequilibrium (7, 67) to very clear indices of clonality (44), have been produced, suggesting that levels of recombination may vary among different populations of this yeast, in which no sexual cycle has until now been described. Preliminary analyses using data from the literature (4, 47) suggest that Cryptococcus neoformans, another frequent opportunistic agent, is clonal (62, 67); evidently, such a result requires confirmation by more extensive studies. A sexual origin has been postulated for British isolates of Aspergillus nidulans (19). For the fungus Pneumocystis carinii, few data concerning its population structure are available, as it is presently impossible to cultivate. Nevertheless, both molecular (56) and isoenzyme analyses (37) suggest that P. carinii strains are probably host specific. Such an epidemiologically relevant result indicates that rats, mice, and rabbits would not therefore be reservoirs for human pneumocystosis.
A TELLING COMPARISON: DATA FROM BACTERIA
Striking similarities have been observed between the population structures of parasitic protozoa and bacteria (20, 63, 64, 68). Classical population genetic approaches, relying mainly on MLEE analysis and in the first instance focusing on the case of Escherichia coli (11, 22, 49), have elevated the clone concept (42) to the rank of paradigm in many bacterial species, including Legionella pneumophila, Haemophilus influenzae, Neisseria meningitidis, and Yersinia enterocolitica (see 20, 50). Direct comparisons have been facilitated in bacteria
by the fact that many studies relied on comparable techniques (starch gel MLEE) and statistics [very broad use of the same index of linkage disequilibrium (5, 35, 57)]. Despite the broad generalization of the clone concept in bacteria, departures from this model have been recognized even by classical population genetic approaches. For example, an absence of detectable linkage disequilibrium has been observed in Neisseria gonorrhoeae and Pseudomonas aeruginosa, which have therefore been considered as nonclonal species (50). Lack of linkage disequilibrium has also been recorded within each of the main subdivisions of Rhizobium meliloti (14). In Rhizobium leguminosarum, the different components of linkage disequilibrium have been explored in depth, leading to the conclusion that, although some genetic isolation was apparent in close sympatry, a notable part of the overall linkage was due to geographical distance and genetic drift rather than to biological obstacles to gene flow (53). Moreover, it has been shown that some bacterial genes have a complex mosaic structure due to gene transfer between different clonal lineages (12, 34, 39). In spite of the clonal paradigm, it has become increasingly apparent that gene exchange (sex in a broad sense) has played a major role in bacterial evolution that probably varied from one species to another. This led Maynard Smith et al (35) to propose several distinct models of population structure capable of explaining linkage disequilibrium in bacterial populations. These proposals are discussed in the following section. Population genetic studies are more abundant for bacteria than for eukaryotic microorganisms. On the other hand, within-species phylogenetic analyses have been applied less often to bacteria than to parasitic protozoa. 
In bacteria, intraspecific genetic variability has been generally visualized either by phenetic Unweighted Pair–Group Method with Arithmetic Averages (UPGMA)∗ clustering or by principal component analysis (see 50, 52). Nevertheless, phylogenetic techniques should be useful in determining whether bacterial species are subdivided into discrete, durable phylogenetic lines (see following section).
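To make the phenetic clustering mentioned above concrete, here is a small sketch of UPGMA applied to Jaccard distances computed from band presence/absence profiles. UPGMA repeatedly merges the two closest clusters and averages their distances to the remaining clusters, weighted by cluster size. The stock names, band labels, and function names are hypothetical, and the returned merge distances are not converted to branch lengths.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of bands (1 - shared / total)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def upgma(labels, dist):
    """UPGMA clustering from a square distance matrix.

    Returns (tree, merge_distances): the tree is a nested tuple of labels,
    and merge_distances lists the inter-cluster distance at each merge.
    """
    clusters = {i: (label, 1) for i, label in enumerate(labels)}  # id -> (subtree, size)
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    merge_distances = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        merge_distances.append(d[frozenset((i, j))])
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Size-weighted average distance from the new cluster to the others.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_i * d[frozenset((i, k))] + n_j * d[frozenset((j, k))]
            ) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree, merge_distances


# Hypothetical band profiles for four stocks (band labels are arbitrary).
profiles = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {6, 7, 8, 9},
    "D": {6, 7, 8, 10},
}
names = sorted(profiles)
matrix = [[jaccard_distance(profiles[x], profiles[y]) for y in names] for x in names]
tree, merge_distances = upgma(names, matrix)
```

A dendrogram of this kind is phenetic: it groups stocks by overall similarity and makes no distinction between ancestral and derived characters.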
PRESENT DEBATES ON POPULATION STRUCTURE OF MICROBIAL SPECIES
Possible Impact of Selection; The Telling Model of HIV/AIDS Opportunistic Infections
Selection can be suspected to greatly interfere with the results obtained in microbial evolutionary genetics. For example, the culture step probably leads to selection of given genotypes, a bias that is not easy to discard. PCR techniques could provide a solution to this problem (see below). An important potential
bias is also the selection due to immune responses of the host. From this point of view, apart from their obvious epidemiological interest, the study of HIV/microbial associations provides a convenient model to falsify the working hypothesis that many microbial genotypes are systematically eliminated by immune defenses of the host.
Linkage Disequilibrium Does Not Amount to Clonality
Four different models of population structure have been proposed by Maynard Smith et al (35): (a) true clonality (long-term clonal evolution), (b) cryptic speciation (the species under study is actually subdivided into two or more biological species, each panmictic), (c) epidemic clonality (in a basically sexual species, occasional bouts of clonality lead to the propagation of clones that have a short lifetime of at best a few years), and (d) panmixia (gene exchange occurs at random). Examples cited by these workers for the four population models are as follows: (a) Salmonella sp. and Trypanosoma cruzi; (b) Rhizobium meliloti (no example in parasites); (c) Neisseria meningitidis and Trypanosoma brucei; and (d) Neisseria gonorrhoeae and Plasmodium falciparum. The only case in which no linkage disequilibrium is observed is d. It is relevant to distinguish between the four cases for the following reasons: The only case in which individual multilocus genotypes (or strains) have a long-term stability is a. In models b, c, and d, multilocus genotypes actually amount to individual variants and have little if any temporal stability. The only instances in which a given microbial species is subdivided into durable phylogenetic lineages that have a taxonomical relevance are a and b.
MEANS FOR DISTINGUISHING THE FOUR MODELS (35)
Distinguishing long-term clonal evolution from cryptic speciation
Linkage disequilibrium analysis [the Ia test in (35)] is applied separately within each phylogenetic subdivision of a given species rather than on the whole species. If linkage disappears, it favors cryptic speciation rather than long-term clonal evolution. Istock et al (24) used a similar approach and postulated that recombination occurs randomly in Bacillus subtilis, at least within the major phylogenetic subdivisions of this species.
Distinguishing long-term clonal evolution from epidemic clonality
The analysis unit taken for linkage disequilibrium analysis is the electrophoretic type∗ (i.e. zymodeme) rather than the stock. If linkage disappears, it is evidence for epidemic clonality. The working hypothesis taken here is that identical genotypes (electrophoretic types or zymodemes) are the result of short-term clonal propagation, and that the presence of many identical genotypes resulting from epidemic clonality biases the estimates of linkage.
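The second procedure above can be illustrated with a short sketch that computes an index of association once on all stocks and once on electrophoretic types only. The formula used here, Ia = V_O / V_E - 1 (with V_O the variance of pairwise locus mismatches and V_E the sum of h(1 - h) over loci, h being the gene diversity at a locus), is the form in which the index is commonly stated; it is given as an assumption rather than as a transcription of (35), and it omits any significance test. The data and function names are hypothetical.

```python
from itertools import combinations


def index_of_association(genotypes):
    """Index of association for a list of multilocus genotypes (tuples)."""
    n = len(genotypes)
    n_loci = len(genotypes[0])
    # Observed variance of the number of loci at which two stocks differ.
    k = [sum(a != b for a, b in zip(g1, g2))
         for g1, g2 in combinations(genotypes, 2)]
    mean_k = sum(k) / len(k)
    v_obs = sum((x - mean_k) ** 2 for x in k) / len(k)
    # Variance expected if loci were independent.
    v_exp = 0.0
    for locus in range(n_loci):
        counts = {}
        for g in genotypes:
            counts[g[locus]] = counts.get(g[locus], 0) + 1
        h = 1.0 - sum((c / n) ** 2 for c in counts.values())
        v_exp += h * (1.0 - h)
    if v_exp == 0:
        raise ValueError("no genetic variation in the sample: test impossible")
    return v_obs / v_exp - 1.0


def electrophoretic_types(genotypes):
    """Keep one representative per distinct multilocus genotype."""
    return sorted(set(genotypes))


# Hypothetical sample: every combination of three biallelic loci occurs once,
# plus twelve extra copies of one genotype (an "epidemic" clone).
background = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
stocks = background + [(0, 0, 0)] * 12

ia_stocks = index_of_association(stocks)
ia_types = index_of_association(electrophoretic_types(stocks))
```

In this toy sample the index is clearly positive when every stock is counted and drops once repeated genotypes are collapsed, which is the pattern the procedure interprets as epidemic clonality.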
Proposals for Additional Approaches: Data Weighting and Avoidance of a Type II Statistical Error
Population genetic statistics, like any statistics, are heavily dependent upon the richness of the sample under study. When the sample becomes less abundant, the risk of a type II error (i.e. failing to reject the null hypothesis when it is in fact false, and thereby wrongly discarding a true working hypothesis) increases. In the ultimate case (for example, no genetic variation in the sample), all tests become impossible (59). It is hence critical to compare samples of equivalent richness and to weight data from different samples when necessary. Parameters to take into account (62) are (a) population size (number of stocks), (b) number of variable loci, (c) average genetic distance and its standard deviation (zero distances being eliminated), (d) number of different genotypes at each locus, (e) relative frequency of the different genotypes observed, and (f) multilocus genotype diversity (number of different genotypes divided by the number of individuals). The means of distinguishing clonal evolution from cryptic speciation (35) leads to a risk of a type II error [in (24) and (35), the impossibility of rejecting the null hypothesis of panmixia is equated to panmixia]. Indeed, when considering the subdivisions of a given species separately rather than the entire species as a whole, both the number of stocks and the richness of the genetic information are lowered. Another possible source of type II error in the analysis of Bacillus subtilis (24) is that linkage is estimated on pairs of loci, which provides useful detailed information but lacks resolution by comparison with linkage analysis performed on the whole set of loci. The way proposed by Maynard Smith et al (35) for distinguishing clonal evolution from epidemic clonality also leads to the risk of a type II error.
Indeed, in this case, the number of stocks analyzed can be considerably lowered by comparison with the actual sample. An additional bias in this last approach is the working hypothesis that identical genotypes are the result of short-term clonal propagation, which could be untrue in some cases. The notion of identical genotypes is extremely difficult to assess experimentally. The only way of ascertaining that two microbial strains have an identical genotype is to sequence the whole genome of these strains (63). So-called identical genotypes characterized by a given marker are expected to be further subdivided when a more discriminative marker is used [see the notion of “clonet”∗ proposed by Tibayrenc & Ayala (64)]. As an example of this possible bias, a recent study by Stevens & Tibayrenc (54) has shown that identical MLEE genotypes of Trypanosoma brucei are actually composed of different RAPD genotypes. When the MLEE genotype is taken as a unit, no linkage is apparent, while linkage is restored when the unit is the RAPD genotype. A sufficiently discriminative technique is therefore desirable when trying
to distinguish between clonal evolution and epidemic clonality (35). If this condition is not fulfilled, there is a risk that the approach proposed by Maynard Smith et al (35) is not adapted to specifically detect epidemic clonality; rather, it may amount to a nonspecific quantification of the overall level of linkage in the population under survey.
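The sample-richness parameters (a) to (f) listed earlier in this section can be tabulated before any linkage test is run, so that samples of comparable richness are compared. The sketch below is one possible way to do so; the choice of the proportion of mismatched loci as the genetic distance, the dictionary keys, and the toy sample are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def sample_richness(genotypes):
    """Summarize the richness of a sample of multilocus genotypes (tuples)."""
    n = len(genotypes)
    # (d) number of different genotypes observed at each locus
    per_locus = [len(set(column)) for column in zip(*genotypes)]
    n_loci = len(per_locus)
    # Simple distance: proportion of loci at which two stocks differ.
    distances = [sum(a != b for a, b in zip(g1, g2)) / n_loci
                 for g1, g2 in combinations(genotypes, 2)]
    nonzero = [d for d in distances if d > 0]  # zero distances are eliminated
    counts = Counter(genotypes)
    return {
        "n_stocks": n,                                             # (a)
        "n_variable_loci": sum(1 for k in per_locus if k > 1),     # (b)
        "mean_distance": mean(nonzero) if nonzero else 0.0,        # (c)
        "sd_distance": pstdev(nonzero) if nonzero else 0.0,        # (c)
        "genotypes_per_locus": per_locus,                          # (d)
        "genotype_frequencies": {g: c / n for g, c in counts.items()},  # (e)
        "genotypic_diversity": len(counts) / n,                    # (f)
    }


# Hypothetical sample of four stocks typed at three loci.
summary = sample_richness([(1, 1, 1), (1, 1, 1), (1, 2, 1), (2, 2, 1)])
```

Two samples with very different summaries of this kind would call for caution, or for weighting, before their linkage statistics are compared.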
ADDITIONAL RECOMMENDATIONS
Better Defining the Question Under Study
Genetic means for analyzing microbial populations can be better selected when the goal of a given study is accurately defined. We have distinguished (62) among four different situations (see Table 1):
1. Experimental study: The only problem here is to characterize a limited number of stocks that are involved in an experiment (for example, a drug trial). Obviously no population genetic approach is required here; the only necessity is to use a marker able to distinguish between the stocks under study. The marker should have sufficient stability (i.e. a sufficiently slow molecular clock; see next paragraph) so that genotypic characterization is not upset in the course of certain long-term experiments.
2. Short-term epidemiology: When, for example, surveying the spread of a methicillin-resistant clone of Staphylococcus aureus in a resuscitation unit or of a pathogenic clone of Trypanosoma brucei in an African village, the critical point is to identify identical genotypes. I have already stressed that certainty is impossible to reach in this case. Only strong presumptions of identity can be obtained by selecting a marker that has a sufficient resolution (i.e. a sufficiently fast molecular clock). Neither the level of resolution required nor the strictness of genotype identity necessary to conclude that two strains belong to the same clone is presently determined. When two strains are “almost” identical, must they be considered as representing different clones? These difficult questions presently have no answer. At this limited temporal and spatial scale, the risk that apparently identical genotypes originate from reversion∗ rather than from a common clonal origin is limited, as is the risk that genetic recombination interferes with genotype stability.
3. Long-term epidemiology: When the goal is to follow, for example, the spread of a virulent clone of Escherichia coli over continents for several years, the double risk that reversion will generate identical genotypes that do not belong to the same genetic clone and that gene exchange will interfere with genotype stability becomes higher. A marker that has too fast a
molecular clock (i.e. too high a level of resolution) could therefore be ill suited to this kind of study. Moreover, the population structure of the organism considered should be clarified by careful population genetic analyses to evaluate the risk that gene flow interferes with genotype distribution.
4. Phylogenetic analysis: As stated earlier, the goal here is to reconstruct the evolutionary past of the species rather than to explore its present population structure. According to the level of phylogenetic divergence considered, different classes of genetic markers with different levels of resolution should be selected. As emphasized earlier, no single genetic marker gives a satisfactory level of resolution both at the bottom and at the top of the levels of divergence that are of interest in applied microbiology.
Setting the Molecular Clock: Different Classes of Genetic Markers for Different Uses
The notions of “level of resolution” and “molecular clock∗” are linked. With the broad definition used here (see Glossary), the molecular clocks of different markers can be simply compared by estimating the numbers of different genotypes they can identify for a given number of stocks. Although it is apparent that markers with different levels of resolution will have to be selected to address different problems (see above), the respective molecular clocks of the different markers are presently not well known. Extensive empirical comparisons will be necessary for these to be established. From this point of view, and also for addressing the between-species comparative approach advocated in this paper, it is useful to classify genetic markers in two categories (62):
1. “Generalist” markers are usable for any kind of organism. MLEE and RAPD are typical of this category (see Figures 2 and 3). These markers could therefore be used to compare directly different kinds of microorganisms, for example, to compare the level of phylogenetic divergence within a species of trypanosome and within a species of Leishmania. These comparisons are of course especially informative when done in the same laboratory with the same techniques (see Figures 2 and 3). Nevertheless, some caution is necessary because the molecular clocks of isoenzymes may differ between different organisms, and RAPD variability may be modulated by peculiarities in the genome structure. Still the fact remains that these between-species comparisons are technically possible.
2. “Specialist” markers are usable only for one species, or for one group of species. For example, RFLP typing of Mycobacterium tuberculosis with the specific IS6110 probe (70) is not possible for other species. Direct PCR-based methods, relying on primers that are specific to the species under study,
fall into the category of specialist markers. Their main advantage is to avoid the bias due to strain culturing. However, in the present state of the art, an important drawback of this approach is that it cannot reliably characterize multilocus genotypes, an indispensable requirement for population genetic studies.
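The comparison of molecular clocks described in this section, namely counting how many different genotypes each marker identifies for the same set of stocks, can be sketched as follows. The genotype labels and function names are hypothetical; the second function simply shows whether a finer marker splits the types defined by a coarser one.

```python
from collections import defaultdict


def count_genotypes(profiles):
    """Number of distinct genotypes a marker identifies among the stocks."""
    return len(set(profiles))


def split_by_finer_marker(coarse, fine):
    """For each genotype of the coarse marker, list the fine-marker genotypes
    found among the stocks that share it."""
    groups = defaultdict(set)
    for coarse_type, fine_type in zip(coarse, fine):
        groups[coarse_type].add(fine_type)
    return {coarse_type: sorted(types) for coarse_type, types in groups.items()}


# Hypothetical typing of the same six stocks with two markers.
mlee = ["Z1", "Z1", "Z1", "Z2", "Z2", "Z3"]
rapd = ["R1", "R2", "R2", "R3", "R4", "R5"]

n_mlee = count_genotypes(mlee)
n_rapd = count_genotypes(rapd)
subdivision = split_by_finer_marker(mlee, rapd)
```

In this invented example the second marker distinguishes more genotypes among the same stocks, so under the broad definition used here it would be said to have the faster clock.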
Resettling the Debate: Two Main Kinds of Population Structure
Tibayrenc has proposed (62) that the most relevant border in different models of population structure is between species that are subdivided into durable phylogenetic lines (either cryptic speciation or clonal evolution) on the one hand and species that show no such subdivisions on the other. In the first case, phylogenetic subdivisions represent meaningful taxonomic information. Moreover, divergent mutations, including those that govern medically relevant characteristics, accumulate between different phylogenetic lines, and statistical linkage is expected between genetic and biologic diversity (30, 40, 45; see Figure 4). In species that are not subdivided, a distinction between species that are panmictic and species that undergo epidemic clonality remains useful, for in the latter case, epidemic clones could be used for epidemiological tracking if their lifetimes are long enough. Although cryptic speciation and clonal evolution have some common epidemiological and evolutionary properties, it is also important to distinguish between them. Indeed, in the former, individual genotypes have no stability and cannot be used for epidemiological follow-up. It is worth noting that in some cases, clonality has been inferred from data that are only able to show that the species is subdivided into distinct phylogenetic lineages (11, 13). Last, it can be hypothesized that intermediate situations may exist among the four polar models proposed by Maynard Smith et al (35). We have proposed some refinements of the presently existing evolutionary statistics to discriminate these different cases (62). These tests will have to be used with the data weighting called for above.
CONCLUDING REMARKS
In recent years, the study of the evolutionary genetics of microorganisms has benefited from substantial progress and has been enriched by fruitful debates. Nevertheless, the use of evolutionary genetic theory in applied microbiology is far from routine and, as illustrated in this review, many aspects remain obscure. It is proposed that the systematically comparative approach called for in this and preceding articles (62, 64) is the best approach to codifying a presently disparate field of research. It is probable that the statistics now available, even
with the suggested improvements, will be insufficient, and that totally new avenues beyond them will have to be explored. Nevertheless, systematically comparing the results obtained from different kinds of microbes will permit the general laws that govern these organisms’ evolution to be delineated, while underlining the specificities of each species. The working hypothesis underlying the comparative approach proposed here is that both (a) considerable genetic diversity and (b) the play between sexual and clonal reproduction represent major evolutionary strategies of many microbial species. This hypothesis probably has profound implications for applied research. Apart from the benefits of basic knowledge, substantial progress in agronomic and medical applied microbiology is expected from this unified evolutionary genetics of microorganisms.
GLOSSARY
Allele: Different molecular forms of the same gene.
Allelic Frequency: The ratio of the number of a given allele to the total number of alleles in the population under survey.
Apomorphy, Apomorphic: See Cladistics.
Cladistics: A specific phylogenetic method of analysis proposed by the German entomologist Willy Hennig (23) that relies on the distinction between ancestral (i.e. plesiomorphic) and derived (i.e. apomorphic) character states. According to the cladistic approach, only those derived characters that are specifically shared by several distinct phylogenetic lines (i.e. synapomorphic characters) convey reliable phylogenetic information, unlike symplesiomorphic characters (ancestral characters shared in common by several phylogenetic lines) and autapomorphic characters (derived characters owned specifically by a unique phylogenetic line). See also Homoplasy.
Clone, Clonal, Clonality: “Clonal” propagation is not limited to “mitotic” propagation. In population genetics, this term is used in all cases where the individuals of the progeny are genetically identical to one another and to the reproducing individual (64). Apart from mitotic reproduction, this includes several cases of parthenogenesis, as well as self-fertilization in haploid organisms. A clonal population structure can therefore be observed in animals exhibiting apparent meiosis and even mating. From a population genetics point of view, the term clonality does not refer to the mating behavior but rather to the population structure.
Clonet: All the individuals of a clonal species that share a similar profile for a particular set of genetic markers (64).
Diploid: The situation in which there are two copies of each chromosome and therefore of each gene (diploid is frequently indicated as 2N).
Electrophoretic Type (ET): A set of individuals or stocks that share the same profile for a given set of isoenzyme markers. This term is more commonly used in publications dealing with bacteria. The corresponding term in the parasitic protozoa literature is zymodeme.
ET: See above entry.
Gene: A DNA sequence coding for a given polypeptide; more broadly, any given DNA sequence. This broad sense is adopted in the population genetic tests described in this review.
Genetic Distance: Various statistical parameters inferred from genetic data, estimating the genetic dissimilarities among individuals or populations. The most widely used are Nei’s standard genetic distance (41) and the Jaccard distance (25). Although the statistics differ, most genetic distances are derived from an estimation of the percentage of band mismatch on electrophoresis gels.
Haploid: The situation in which there is only one copy of each chromosome and therefore of each gene (haploidy is frequently indicated as 1N or N).
Hardy-Weinberg Equilibrium: See Segregation.
Heterozygote, Heterozygous: In a diploid organism, the two copies of a given gene in one individual have a different molecular structure; this individual harbors two different alleles of the same gene (opposite: homozygote).
Homoplasy: Possession in common by several phylogenetic lines of the same characters that are not due to common ancestry. The origins of homoplasic characters include the following: (a) convergence (possession of the same characters derived from different ancestral characters, due to convergent evolutionary pressure), (b) parallelism (possession of the same characters derived from the same ancestral character, and generated independently in different phylogenetic lines), and (c) reversion (restoration of an ancestral character from a derived character; for example, restoration of an ancestral isoenzyme character from a derived one by reverse mutation). See also Cladistics.
Isoenzymes: Protein extracts from given samples, for example, various microbe stocks, are separated by electrophoresis. The gel is then subjected to a histochemical reaction involving the specific substrate of a given enzyme, and this enzyme’s zone of activity is specifically stained. The same enzyme from different samples may migrate at different rates (see Figure 2). These different electrophoretic forms of the same enzyme are referred to as isoenzymes or isozymes. When given isoenzymes are governed by different alleles of the same gene, they are referred to as alloenzymes or allozymes. For detailed information about isoenzyme characterization of parasitic protozoa, see (3).
Linkage Disequilibrium: Nonrandom reassortment of genotypes occurring at different loci (see Recombination).
Locus: The physical location of a given gene on the chromosome. By extension, in population genetic jargon, the gene itself (plural: loci).
MLEE: See Multilocus Enzyme Electrophoresis.
Molecular Clock: The speed of evolution of a given molecular marker.
Multilocus Enzyme Electrophoresis (MLEE): A technique that relies on the use of a notable number (usually at least 10) of isoenzyme loci (see Figure 2).
Panmixia, Panmictic: A situation in which gene exchanges occur randomly in the population under survey.
PFGE: See Pulse Field Gel Electrophoresis.
Pulse Field Gel Electrophoresis (PFGE): Separation of large DNA fragments by a particular electrophoresis technique using alternately pulsed, perpendicularly-oriented electrical fields.
Random Amplification of Polymorphic DNA (RAPD): A method of genetic characterization simultaneously proposed by Williams et al (75) and Welsh & McClelland (73). In the classical Polymerase Chain Reaction (PCR) method, the primers used are known DNA sequences, but the RAPD technique relies on primers whose sequence is arbitrarily determined (see Figure 3).
RAPD: See above.
Recombination: Free recombination implies that the expected probability of a given multilocus genotype is the product of the observed probabilities of the single genotypes of which it is composed. For example, in a panmictic human population, if the observed frequency of the AB blood group is 0.5, and the observed frequency of the Rh (+) blood group is 0.5, the expected frequency of individuals who are both AB and Rh (+) is $0.5 \times 0.5 = 0.25$. Inhibition of recombination leads to linkage disequilibrium or to nonrandom association among loci (when the predictions of expected probabilities for multilocus genotypes are no longer satisfied). For example, if the observed frequency of the individuals who are both AB and Rh (+) was statistically greater than 0.25, this would evidence that the two loci are linked (not transmitted independently). If this frequency were 0.5, it would indicate total linkage between AB and Rh (the two characters transmitted as a unit). See Table 2 for specialized tests of linkage disequilibrium.
Restriction Fragment Length Polymorphism (RFLP): Variability shown on gels by cutting a given DNA by restriction endonucleases.
Reversion: See Homoplasy.
Segregation, Hardy-Weinberg Equilibrium: In a panmictic population of a diploid organism, let us consider a gene for which there are two possible alleles, a and b. The frequency of a is $p$, and the frequency of b is $q = 1 - p$ (see Allelic Frequency above). The Hardy-Weinberg law predicts that the frequency of each of the three possible genotypes a/a, a/b, and b/b, will be $p^2$, $2pq$, and $q^2$, respectively. If the observed frequencies are statistically different
from the expected ones, then gene flow is restricted in the population under survey.
Stock: A microbial line kept in a laboratory, taken originally from a host (often improperly called a strain).
Sympatry: Living in the same geographical location (antonym: allopatry).
Synapomorphy, Synapomorphic: See Cladistics.
Unweighted Pair–Group Method with Arithmetic Averages (UPGMA): A widely used statistical method proposed by Sneath & Sokal (52) for creating dendrograms in numerical taxonomy.
Zymodeme: See Electrophoretic Type.
ACKNOWLEDGMENTS
I thank Jamie Stevens for valuable linguistic help.
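The arithmetic behind the Recombination and Segregation (Hardy-Weinberg Equilibrium) glossary entries can be sketched as follows. This is a minimal illustration that is not part of the original text: the function names are hypothetical, and the allele frequency p = 0.7 is an assumed value chosen only for the example, whereas the blood-group frequencies are those of the glossary entry. The sketch computes expected frequencies only; judging whether an observed departure is statistically significant requires a dedicated test (see Table 2), which is not implemented here.

```python
from math import prod


def hardy_weinberg_expected(p):
    """Expected frequencies of genotypes a/a, a/b, b/b when allele a has frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p  # frequency of allele b
    return p * p, 2.0 * p * q, q * q


def expected_multilocus_frequency(single_locus_freqs):
    """Expected multilocus genotype frequency under free recombination:
    the product of the observed single-locus genotype frequencies."""
    return prod(single_locus_freqs)


def association_excess(observed, single_locus_freqs):
    """Observed minus expected multilocus frequency (0 means no association)."""
    return observed - expected_multilocus_frequency(single_locus_freqs)


# Blood-group example from the Recombination entry: AB = 0.5, Rh (+) = 0.5.
expected_ab_rh = expected_multilocus_frequency([0.5, 0.5])

# Total linkage in the same example: observed frequency of AB and Rh (+) is 0.5.
excess_total_linkage = association_excess(0.5, [0.5, 0.5])

# Hardy-Weinberg expectations for an assumed (hypothetical) allele frequency.
freq_aa, freq_ab, freq_bb = hardy_weinberg_expected(0.7)

print(expected_ab_rh, excess_total_linkage)
print(freq_aa, freq_ab, freq_bb)
```

An excess of zero corresponds to the free-recombination expectation. A positive excess, as in the total-linkage case, is the kind of nonrandom association among loci that the glossary describes, subject to the statistical testing noted above.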
Literature Cited
1. Babiker HA, Ranford-Cartwright LC, Currie D, Charlwood JD, Billingsley P, et al. 1994. Random mating in a natural population of the malaria parasite Plasmodium falciparum. Parasitology 109:413–21
2. Ben Abderrazak S. 1993. Variabilité génétique des populations de Plasmodium falciparum. PhD thesis, Univ. Montpellier, France
3. Ben Abderrazak S, Guerrini F, Mathieu-Daudé F, Truc P, Neubauer K, et al. 1993. Isozyme electrophoresis for parasite characterization. In Protocols in Molecular Parasitology, Vol. 21, Methods in Molecular Biology, ed. Hyde JE, Walker JM. pp. 361–82. Totowa, New Jersey: Humana
4. Brandt ME, Bragg SL, Pinner RW. 1993. Multilocus enzyme typing of Cryptococcus neoformans. J. Clin. Microbiol. 31:2819–23
5. Brown AHD, Feldman MW. 1981. Population structure of multilocus associations. Proc. Natl. Acad. Sci. USA 78:5913–16
6. Carter R, Voller A. 1975. The distribution of enzyme variation in populations of Plasmodium falciparum in Africa. Trans. R. Soc. Trop. Med. Hyg. 69:371–76
7. Caugant DA, Sandven P. 1993. Epidemiological analysis of Candida albicans strains by multilocus enzyme electrophoresis. J. Clin. Microbiol. 31:215–20
8. Cibulskis RE. 1988. Origins and organization of genetic diversity in natural populations of Trypanosoma brucei. Parasitology 96:303–22
9. Cibulskis RE. 1992. Genetic variation in Trypanosoma brucei and the epidemiology of sleeping sickness in the Lambwe Valley, Kenya. Parasitology 104:99–109
10. Darlu P, Tassy P. 1993. La reconstruction phylogénétique. Concepts et méthodes. Paris/Milan/Barcelone: Masson
11. Desjardins P, Picard B, Kaltenbock B, Elion J, Denamur E. 1995. Sex in Escherichia coli does not disrupt the clonal structure of the population: evidence from random amplified polymorphic DNA and restriction-fragment-length polymorphism. J. Mol. Evol. 41(4):440–48
12. Dykhuizen DE, Green L. 1991. Recombination in Escherichia coli and the definition of biological species. J. Bacteriol. 173:7257–68
13. Dykhuizen DE, Polin DS, Dunn JJ, Wilske B, Preac-Mursic V, et al. 1993. Borrelia burgdorferi is clonal: implications for taxonomy and vaccine development. Proc. Natl. Acad. Sci. USA 90:10163–67
14. Eardly BD, Materon LA, Smith NH, Johnson DA, Rumbaugh MD, Selander RK. 1990. Genetic structure of natural populations of the nitrogen-fixing bacterium Rhizobium meliloti. Appl. Environ. Microbiol. 56:187–94
15. Evans DA, Kennedy WPK, Elbihari S, Chapman CJ, Smith V, Peters W. 1987. Hybrid formation within the genus Leishmania? Parasitologia 29:165–73
16. Farris JS. 1970. Methods for computing Wagner trees. Syst. Zool. 19:83–92
17. Felsenstein J. 1982. Numerical methods for inferring evolutionary trees. Q. Rev. Biol. 57:379–404
18. Fitch W, Margoliash E. 1967. Construction of phylogenetic trees. Science 155:279–84
19. Geiser DM, Arnold ML, Timberlake WE. 1992. Sexual origins of British Aspergillus nidulans isolates. Proc. Natl. Acad. Sci. USA 91:2349–52
20. Hartl D. 1992. Population genetics of microbial organisms. Curr. Biol. 2:937–42
21. Deleted in proof
22. Hartl DL, Dykhuizen DE. 1984. The population genetics of Escherichia coli. Annu. Rev. Genet. 18:31–68
23. Hennig W. 1966. Phylogenetic Systematics. Urbana, Illinois: Univ. Illinois Press
24. Istock CA, Duncan KE, Ferguson N, Zhou X. 1992. Sexuality in a natural population of bacteria: Bacillus subtilis challenges the clonal paradigm. Mol. Ecol. 1:95–103
25. Jaccard P. 1908. Nouvelles recherches sur la distribution florale. Bull. Soc. Vaudoise Sci. Nat. 44:223–70
26. Jenni L, Marti S, Schweizer J, Betschart B, Le Page RWF, et al. 1986. Hybrid formation between African trypanosomes during cyclical transmission. Nature 322:173–75
27. Joysey KA, Friday AF. 1982. Problems of phylogenetic reconstruction. London: Academic
28. Lanar DE, Levy LS, Manning JE. 1981. Complexity and content of the DNA and RNA in Trypanosoma cruzi. Mol. Biochem. Parasitol. 3:327–41
29. Lanotte G, Rioux JA, Serres E. 1986. Approche cladistique du genre Leishmania Ross, 1903. A propos de 192 souches de l’Ancien Monde. Analyse numérique de 50 zymodèmes identifiés par 15 enzymes et 96 isoenzymes. Leishmania. Taxonomie et phylogénèse. Applications éco-épidémiologiques. Coll. Int. CNRS/INSERM, 1984. IMEEE, Montpellier, 1986:27–40
30. Laurent JP. 1994. Comparaison des propriétés biologiques de différents clones naturels de Trypanosoma (schizotrypanum) cruzi (Chagas, 1909), agent de la maladie de Chagas. PhD thesis, Univ. Montpellier, France
31. Mantel N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Res. 27:209–20
32. Mathieu-Daudé F, Stevens J, Welsh J, Tibayrenc M, McClelland M. 1995. Genetic diversity and population structure of Trypanosoma brucei: clonality versus sexuality. Mol. Biochem. Parasitol. 72:89–101
33. Mathieu-Daudé F, Tibayrenc M. 1994. Isozyme variability of Trypanosoma brucei s.l.: genetical, taxonomical and epidemiological significance. Exp. Parasitol. 78:1–19
34. Maynard Smith J, Dowson CG, Spratt BG. 1992. Localized sex in bacteria. Nature 349:29–31
35. Maynard Smith J, Smith NH, O’Rourke M, Spratt BG. 1993. How clonal are bacteria? Proc. Natl. Acad. Sci. USA 90:4384–88
36. Mayr E. 1940. Speciation phenomena in birds. Am. Nat. 74:249–78
37. Mazars E, Ödberg-Ferragut C, Durand I, Tibayrenc M, Dei-Cas E, Camus D. 1994. Genomic and isoenzymatic markers of Pneumocystis from different species. J. Eukaryot. Microbiol. 41:S104
38. Miles MA, Souza A, Povoa M, Shaw JJ, Lainson R, Toyé PJ. 1978. Isozymic heterogeneity of Trypanosoma cruzi in the first autochthonous patients with Chagas’ disease in Amazonian Brazil. Nature 272:819–21
39. Milkman R, Bridges MM. 1993. Molecular evolution of the Escherichia coli chromosome. 4. Sequence comparisons. Genetics 133:455–68
40. Miller RD, Hartl DL. 1986. Biotyping confirms a nearly clonal population structure in Escherichia coli. Evolution 40:1–12
41. Nei M. 1972. Genetic distance between populations. Am. Nat. 106:283–92
42. Ørskov F, Ørskov I. 1983. Summary of a workshop on the clone concept in the epidemiology, taxonomy, and evolution of the Enterobacteriaceae and other bacteria. J. Infect. Diseases 148:346–57
43. Paul REL, Packer MJ, Walmsley M, Lagog M, Ranford-Cartwright LC, Paru R, Day KP. 1995. Mating patterns in malaria parasite populations of Papua New Guinea. Science 269:1709–11
44. Pujol C, Reynes J, Renaud F, Raymond M, Tibayrenc M, et al. 1993. The yeast Candida albicans has a clonal mode of reproduction in a population of infected HIV+ patients. Proc. Natl. Acad. Sci. USA 90:9456–59
45. Revollo S. 1995. Impact de l’évolution clonale de Trypanosoma cruzi, agent de la maladie de Chagas, sur certaines propriétés biologiques médicalement importantes du parasite. PhD thesis, Univ. Montpellier, France
46. Richardson BJ, Baverstock PR, Adams M. 1986. Allozyme electrophoresis. A Handbook for Animal Systematics and Population Studies. London/New York: Academic
47. Safrin RE, Lancaster LA, Davis CE, Braude AI. 1986. Differentiation of Cryptococcus neoformans serotypes by isoenzyme electrophoresis. Am. J. Clin. Pathol. 86:204–8
48. Sanchez G, Wallace A, Muñoz S, Venegas J, Solari A. 1993. Characterization of Trypanosoma cruzi populations by several molecular markers supports a clonal mode of reproduction. Biol. Res. 26:167–76
49. Selander RK, Levin BR. 1980. Genetic diversity and structure in Escherichia coli populations. Science 210:245–47
50. Selander RK, Musser JM, Caugant DA, Gilmour MN, Whittam TS. 1987. Population genetics of pathogenic bacteria. Microb. Pathog. 3:1–7
51. Sibley LD, Boothroyd JC. 1992. Virulent strains of Toxoplasma gondii comprise a single clonal lineage. Nature 359:82–85
52. Sneath PHA, Sokal RR. 1973. Numerical Taxonomy. The Principle and Practice of Numerical Classification. ed. D Kennedy, RB Park, 537 pp. San Francisco: Freeman
53. Souza V, Nguyen TT, Hudson RR, Piñero D, Lenski RE. 1992. Hierarchical analysis of linkage disequilibrium in Rhizobium populations: evidence for sex? Proc. Natl. Acad. Sci. USA 89:8389–93
54. Stevens JR, Tibayrenc M. 1995. Detection of linkage disequilibrium in Trypanosoma brucei isolated from tsetse flies and characterized by RAPD analysis and isoenzymes. Parasitology 110:181–86
55. Stevens JR, Tibayrenc M. 1996. Trypanosoma brucei s.l.: evolution, linkage and the clonality debate. Parasitology. In press
56. Stringer JR, Stringer SL, Zhang JX, Baughman R, Smulian AG, Cushion MT. 1993. Molecular genetic distinction of Pneumocystis carinii from rats and humans. J. Eukaryot. Microbiol. 40:733–41
57. Sved JA. 1968. The stability of linked systems with a small population size. Genetics 59:543–63
58. Tait A. 1980. Evidence for diploidy and mating in trypanosomes. Nature 237:536–38
59. Tibayrenc M. 1993. Entamoeba, Giardia, and Toxoplasma: clones or cryptic species? Parasitol. Today 9:102–5
60. Tibayrenc M. 1994. Antigenic diversity and the transmission dynamics of Plasmodium falciparum: the clonality/sexuality debate revisited. Parasitol. Today 10:456–57
61. Tibayrenc M. 1995. Population genetics and strain typing of microorganisms: How to detect departures from panmixia without individualizing alleles and loci. C. R. Acad. Sci. III Paris 318:135–39
62. Tibayrenc M. 1995. Population genetics of parasitic protozoa and other microorganisms. Adv. Parasitol. 36:47–115
63. Tibayrenc M, Ayala FJ. 1988. Isozyme variability of Trypanosoma cruzi, the agent of Chagas’ disease: genetical, taxonomical, and epidemiological significance. Evolution 42:277–92
64. Tibayrenc M, Ayala FJ. 1991. Towards a population genetics of microorganisms: the clonal theory of parasitic protozoa. Parasitol. Today 7:228–32
65. Tibayrenc M, Cariou ML, Solignac M. 1981. Interprétation génétique des zymogrammes de flagellés des genres Trypanosoma et Leishmania. C. R. Acad. Sci. III Paris 292:623–25
66. Tibayrenc M, Cariou ML, Solignac M, Carlier Y. 1981. Arguments génétiques contre l’existence d’une sexualité actuelle chez Trypanosoma cruzi; implications taxinomiques. C. R. Acad. Sci. III Paris 293:207–9
67. Tibayrenc M, Kjellberg F, Arnaud J, Oury B, Brenière SF, et al. 1991. Are eucaryotic microorganisms clonal or sexual? A population genetics vantage. Proc. Natl. Acad. Sci. USA 88:5129–33
68. Tibayrenc M, Kjellberg F, Ayala FJ. 1990. A clonal theory of parasitic protozoa: the population structure of Entamoeba, Giardia, Leishmania, Naegleria, Plasmodium, Trichomonas and Trypanosoma, and its medical and taxonomical consequences. Proc. Natl. Acad. Sci. USA 87:2414–18
69. Tibayrenc M, Neubauer K, Barnabé C, Guerrini F, Skarecky D, Ayala FJ. 1993. Genetic characterization of six parasitic protozoa: parity between random-primer DNA typing and multilocus enzyme electrophoresis. Proc. Natl. Acad. Sci. USA 90:1335–39
70. Van Embden JDA, Cave MD, Crawford JT, Dale JW, Eisenach KD, et al. 1993. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J. Clin. Microbiol. 31:406–9
71. Walliker D. 1985. Characterization of Plasmodium falciparum of different countries. Ann. Soc. Belge Méd. Trop. 65:69–77
72. Walliker D, Quakyi IA, Wellems TE, McCutchan TF, Szarfman A, et al. 1987. Genetic analysis of the human malaria parasite Plasmodium falciparum. Science 236:1661–66
73. Welsh J, McClelland M. 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18:7213–18
74. Wiley EO. 1981. Phylogenetics: The Theory and Practice of Phylogenetic Systematics. New York: Wiley & Sons
75. Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18:6531–35