Duplication and Diversifying Selection Among Termite Antifungal Peptides
Mark S. Bulmer and Ross H. Crozier
School of Tropical Biology, James Cook University, Townsville, Queensland, Australia
We have identified and analyzed the mRNA sequence of 20 new defensin-like peptides from 11 Australian termite species of Nasutitermes and from an outgroup, Drepanotermes rubriceps. The sequence was amplified by reverse transcriptase PCR with a degenerate primer designed from termicin, an antifungal peptide previously characterized from the termite Pseudacanthotermes spiniger. All 20 genes show high sequence identity with P. spiniger termicin and have duplicated repeatedly during the radiation of Nasutitermes. Comparison of the relative fixation rates of synonymous (silent) and nonsynonymous (amino acid altering) mutations indicates that the Nasutitermes termicins are positively selected. This positive selection appears to drive a decrease in termicin charge. In termites with two genes, the decrease in charge is predominantly restricted to one termicin. Furthermore, the spread of charge is significantly greater within species than across species among amino acid sites that appear to be under strong positive selection, and this spread is attributable to only three sites. Our results suggest that after termicin duplication, certain critical sites have maintained a positive charge in one duplicate and evolved towards neutrality in the other and that positive selection has directed these changes repeatedly and independently. This diversification among duplicated genes may be a counter-response to the evolution of fungal resistance in social insects that are particularly vulnerable to fungal epidemics.
Introduction Host immune systems are at a disadvantage in their evolutionary response to pathogenic microorganisms because pathogens generally have shorter generation times than do the hosts. The innate immune system is the last line of defense against pathogens in plants and invertebrates but is also critical for vertebrate immunity before the acquired immune system can generate a specific response. The innate immune system exploits molecular and physiological elements of microorganisms that are evolutionarily constrained (Girardin, Sansonetti, and Philpott 2002; Loker et al. 2004). For example, the innate immune system detects pathogen cell wall components such as lipopolysaccharides because there are strong constraints on how much these molecules can be changed by selection without the disruption of their biochemical properties. The conserved nature of the innate immune system (Girardin, Sansonetti, and Philpott 2002; Hultmark 2003), which shares common elements from invertebrates to mammals, suggests that this strategy has successfully limited the advantage of rapid evolution in certain pathogenic microorganisms. Nevertheless, evidence is mounting that there are points of molecular conflict between pathogens and components of the innate immunity humoral response, which is an inducible system that generates antimicrobial peptides. Positive selection has been found in both Drosophila Relish (Begun and Whitley 2000) and mammalian defensins (Hughes and Yeager 1997; Boniotto et al. 2003; Morrison et al. 2003; Semple, Rolfe, and Dorin 2003; Lynn et al. 2004). These findings suggest that selection has favored pathogen mechanisms that disrupt the signaling apparatus of the humoral immune response or evade the effect of toxic peptides. Selection on elements of the innate immune system may be particularly strong in social insects. 
Key words: immune genes, termites, diversifying selection, gene duplication, termicins, defensins, protein evolution.
Mol. Biol. Evol. 21(12):2256–2264. 2004 doi:10.1093/molbev/msh236 Advance Access publication August 18, 2004
Group living facilitates pathogen transmission especially when the group
is composed of closely related individuals (Schmid-Hempel 1998), and insects lack an immune system that can generate somatic diversity, such as the acquired immune system in vertebrates. However, group living in both termites and ants has been found to confer resistance to disease (Rosengaus et al. 1998; Hughes, Eilenberg, and Boomsma 2002; Traniello, Rosengaus, and Savoie 2002). Grooming, which may help remove pathogens from the insect cuticle, and transmission of antibiotics among nestmates appear to be important for this resistance. Most ants have metapleural glands that produce antibiotic secretions (Maschwitz 1974; Beattie et al. 1986; Poulsen et al. 2002), and in one group of termites, frontal gland secretions have also been shown to have antibiotic properties (Rosengaus, Lefebvre, and Traniello 2000). In addition, there is evidence suggesting that selection by pathogens may influence the relatedness of group members. Common social insect breeding systems such as polyandry and polygyny increase colony genetic diversity, and both an experimental study (Liersch and Schmid-Hempel 1998) and comparative analysis (Schmid-Hempel and Crozier 1999) indicate that genetic variation within social insect colonies reduces parasite load. It is unknown whether diversity in components of the innate immune system, including antimicrobials, could contribute to this effect. Recently, termicin, an antifungal peptide expressed in both salivary glands and hemocytes, was isolated and characterized from the termite Pseudacanthotermes spiniger (Lamberty et al. 2001). Termicin is a cysteine-rich peptide with close structural similarities to other invertebrate defensins of the humoral immune response (Da Silva et al. 2003). However, termicin is most active against filamentous fungi, whereas the majority of defensins are active against gram-positive bacteria (Dimarcq et al. 1998). 
Its expression in salivary glands suggests it can be passed among nestmates by trophallaxis or spread on the insect cuticle during grooming (Lamberty et al. 2001). It also may be directly applied to fungal mycelia or spores in an attempt to inhibit the growth of fungus on the wood, grass, or other cellulosic material that termites eat. To investigate molecular evolution in termicins, we isolated and characterized termicins among Nasutitermes
Molecular Biology and Evolution vol. 21 no. 12 © Society for Molecular Biology and Evolution 2004; all rights reserved.
Diversification Between Termite Antifungal Genes 2257
have been deposited in the Australian National Insect Collection, CSIRO, Canberra.
Table 1 Sources of Termites Used
Species          Location          GPS Coordinates               Collection Code
N. comatus       Townsville        S 19°20.024′ E 146°45.313′    ACQO
N. dixoni        Sydney district   S 34°09.753′ E 151°02.033′    ACQP
N. exitiosus     Sydney district   S 34°03.916′ E 151°03.324′    ACQQ
N. fumigatus     Sydney district   S 33°41.687′ E 151°07.721′    ACQR
N. graveolus     Townsville        S 19°20.053′ E 146°45.654′    ACQS
N. longipennis   Woodstock         S 19°35.730′ E 146°50.008′    ACQT
N. magnus        Woodstock         S 19°35.730′ E 146°50.008′    ACQU
N. pluvialis     Lake Eacham       S 17°16.892′ E 145°37.620′    ACQV
N. triodiae      Mareeba           S 16°59.927′ E 145°27.594′    ACQW
N. walkeri       Atherton          S 17°20.181′ E 145°29.977′    ACQX
T. pastinator    Woodstock         S 19°35.689′ E 146°50.048′    ACQY
D. rubriceps     Townsville        S 19°19.784′ E 146°45.754′    ACQZ
species in Australia (table 1). Although these termites are closely related, they construct strikingly different nest structures in a range of different habitats and they feed on different forms of cellulosic material. They are therefore expected to be in contact with different microbial fauna. Using a maximum-likelihood approach utilizing codon models, we compared nonsynonymous substitution rates to synonymous substitution rates among termicins from 11 Nasutitermes species and the outgroup Drepanotermes rubriceps. Our results show that termicins have duplicated repeatedly and nonsynonymous substitution rates are considerably higher than synonymous substitution rates. Moreover, selection following duplication appears to drive the diversification of termicins.
Methods Termites The nasute termites used in this study belong to a single genus, Nasutitermes, with the exception of Tumulitermes pastinator. Both the results presented here and a phylogeny of Nasutitermitinae (Miller 1997) based on morphological characters support the placement of Tumulitermes pastinator within Nasutitermes. We also included as an outgroup Drepanotermes rubriceps, which is a mandibulate termite in the Termitinae, a sister taxon of the Nasutitermitinae (Miura et al. 1998). All termites were collected in Australia (for locations see table 1). Both workers and soldiers were collected from small breaches in nests, wood, or foraging trails. Termites were frozen in liquid nitrogen within 24 h of collection and stored at −80°C. Termites were identified using standard texts (Hill 1942; Watson and Abbey 1993) and voucher specimens
Isolation and Characterization of Termicins Termicin had originally been identified in the termitid Pseudacanthotermes spiniger. We designed a degenerate primer (GCN TGY AAY TTY CAR WSN TGY TGG) from the N-terminal end of the P. spiniger protein sequence. The degenerate primer and a poly-T primer (T25VN) were used for one-step reverse transcriptase PCR (Invitrogen, San Diego, Calif.) of mRNA. The mRNA was prepared from 20 untreated large N. graveolus workers, ground in liquid nitrogen, homogenized by centrifugation through a Qiagen shredder for 2 min at 16,000 × g and purified with a Quick Prep mRNA kit (Amersham Biosciences, Castle Hill, Australia). PCR cycle conditions were 50°C, 20 min; 94°C, 2 min; 40 × (94°C, 15 s; 45°C, 30 s; 72°C, 30 s); 72°C, 7 min. Two PCR products of approximately 250 and 350 base pairs were gel purified and amplified a second time: 94°C, 2 min; 40 × (94°C, 30 s; 50°C, 30 s; 72°C, 30 s); 72°C, 7 min. The two bands were gel purified again and directly sequenced with a poly-T (T23V) primer. The larger band had sequence resembling P. spiniger termicin. Two specific primers were designed from this sequence for 5′ RACE (Term2: CCT TTG CAC TTG CCA TTT CT, and Term3: GGC AGG TGT CAA AGA GTT AAA CAC T). First strand cDNA was synthesized from N. graveolus mRNA as described in the Clontech SMART™ PCR cDNA Synthesis Kit, except that Term2 was used in place of the CDS primer. The first strand cDNA was cleaned with a Qiagen spin column and used for PCR with primers Term3 and the SMART™ kit’s “PCR primer.” PCR cycle conditions were 94°C, 2 min; 40 × (94°C, 30 s; 52°C, 30 s; 72°C, 30 s); 72°C, 7 min. The PCR product was gel purified and cloned with the pGEM-T Easy Vector system (Promega, Annandale, Australia). A BLAST search revealed that the sequences of PCR-amplified clones had significant homology with P. spiniger termicin (Expect = 10⁻⁵, Identities = 21/36). 
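As an illustration of what the degenerate primer above implies, the following sketch expands its IUPAC ambiguity codes and counts the distinct oligonucleotides in the primer pool. It is not part of the original protocol; the function names `degeneracy` and `expand` are hypothetical, and the lookup table is the standard IUPAC nucleotide code.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Degenerate primer as given in the text (spaces separate codons).
PRIMER = "GCN TGY AAY TTY CAR WSN TGY TGG"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.replace(" ", ""))


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "")]
    return ["".join(bases) for bases in product(*pools)]


print(degeneracy(PRIMER))
```

Under these codes the primer is 24 bases long and represents 2,048 distinct sequences, most of the degeneracy coming from the WSN codon and the fourfold-degenerate GCN codon.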
The full mRNA transcripts for termicins from different nasutes were amplified with a primer (GAA CAA GTC AAC GGC TAC TAC AAT G) designed from a short stretch of sequence just upstream of the termicin start codon and a poly-T primer (T25VN). One-step RT-PCR cycle conditions were 50°C, 15 min; 94°C, 2 min; 35 × (94°C, 15 s; 50°C, 30 s; 72°C, 45 s); 72°C, 7 min. PCR products were cloned into the pGEM-T Easy Vector system and at least five clones with the correct insert were sequenced for each species of termite. Statistical Analysis Termicin mRNA transcripts were aligned with ClustalX, version 1.81 (Higgins, Thompson, and Gibson 1996) and the codon structure was inspected with Se-Al, version 2.0a11 (Rambaut 2002). The phylogenetic trees were constructed by maximum-likelihood (ML) analysis with the program PAUP* version 4.0b10 (Swofford 1998), and optimum substitution models for the analysis were determined with the program Modeltest, version 3.06 (Posada and Crandall 1998). The trees were rooted with mRNA sequence from D. rubriceps.
2258 Bulmer and Crozier
We compared different ML models of selection with the program package PAML (Yang 1997) to determine whether termicins are positively selected. PAML’s ML models rely on the phylogenetic relationship among genes. Termicins have duplicated repeatedly, and given their small size, convergence and divergence among different termicins could easily mask their true phylogeny. Selection on a short peptide (P. spiniger termicin is only 36 amino acids long) may induce phylogenetic error because distantly related termicins group according to functional similarity rather than similarity due to descent. The termicin mRNA transcript continues for an additional 229–285 bases after the stop codon. Multiple stop codons in this region indicate that it is not translated and selective pressure is likely to be relaxed at this 3′ untranslated region (3′ UTR). We used a randomization scheme (described below) to test whether the 3′ UTR provided a better phylogenetic signal of the relationship between the nasute termicins than the expressed region. The topology of a tree constructed with the 3′ UTR was then used in the PAML program CODEML. Using the likelihood ratio test (LRT), we compared models of evolution that allow for nonsynonymous to synonymous substitution rates, dN/dS, to be greater than one with neutral models that restrict dN/dS between zero and one (Yang et al. 2000). This included comparing the discrete model M3 to M1 and the beta distribution model M8 to M7 to determine whether models that allow for positive selection (dN/dS > 1) were significantly better than those that only allow for neutral (dN/dS = 1) and/or purifying selection (dN/dS < 1). Model M2 was not included in our analysis as it frequently fails to detect positive selection identified with other models (Yang et al. 2000). We also compared the free ratio model (Mb), in which dN/dS values are free to vary among termite lineages, to a model with one fixed value of dN/dS (M0). 
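The likelihood ratio test described above can be sketched in a few lines. This is an illustrative calculation rather than PAML's own code: the test statistic is twice the difference in log-likelihood between the nested models, compared against a chi-square distribution. The helper names are hypothetical, and the closed-form tail probability used here holds only for even degrees of freedom, which is an assumption made to keep the example free of external libraries. The log-likelihoods are those reported in table 2 for M3 with three versus four categories.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid for even df only.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# M3 (K3) versus M3 (K4) for the full termicin, values from table 2.
stat = lrt_statistic(-1209.46, -1204.94)
p_value = chi2_sf_even_df(stat, df=2)
print(round(stat, 2), round(p_value, 3))
```

With these inputs the statistic is 9.04 and the tail probability rounds to .011, in line with the corresponding row of table 2. A comparison with odd degrees of freedom, such as M0 versus Mb, would need a general chi-square routine instead.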
FIG. 1.—Protein alignments of mature termicin. The species affiliation for each termicin sequence ID is given in table 4, except Ps, which corresponds to the original termicin isolated from P. spiniger (Lamberty et al. 2001). Among positively selected sites, three sites identified with asterisks account for the charge differences. The Nl, Nm, and Tp1 mature termicin sequences are identical.
The Bayesian approach was used to identify sites in termicin that were potentially under positive selection (Yang et al. 2000). In addition, we used three methods for detecting positively selected sites (SLAC, ALRS, and Full likelihood) that were implemented with the HYPHY program (Pond, Muse, and Frost 2004) with the topology of the 3′ UTR tree. All three methods allow for variation in synonymous substitution rates across sites, and SLAC uses a “counting method” similar to that of the Suzuki-Gojobori method (Suzuki and Gojobori 1999). The extent to which the trees from the expressed termicin and from the 3′ UTR reflected history was gauged by the tendency of conspecific sequences to assort together. Nasutitermes sequence names were randomly assigned 1,000 times without replacement to each row and corresponding column of a matrix of maximum-likelihood distances (derived using the HKY + G model), and each time the mean distance between conspecific sequences was determined. The observed mean distance between conspecific sequences was then compared with the distribution of 1,000 randomly derived means to determine whether it is significantly smaller or larger than chance expectation. The proportions of radical nonsynonymous and conservative nonsynonymous substitutions that resulted in a change in molecular charge, polarity, or polarity and volume were calculated with the method of Zhang (2000) using the HON-NEW program. The transition-to-transversion ratio, κ, from PAML’s M8 model of selection for the mature peptide was used in the HON-NEW program. In addition, we determined whether there was a correlation between the difference between the rates of nonsynonymous and synonymous substitutions (dN–dS) and estimates of the change in termicin molecular charge along branches of the 3′ UTR tree. Termicins at ancestral nodes were reconstructed with the PAML program CODEML. The amino acids of termicin or subsets of the peptide were used to estimate charge at pH 7. The amino acids contributing to the isoelectric point in this calculation were: C, D, E, H, K, R, Y and the amino- and carboxy-terminus of the peptide (Nelson, Lehninger, and Cox 2000). We also examined whether there was a correlation between dN–dS and genetic distance along branches of the 3′ UTR tree. Genetic distances were estimated under maximum likelihood using the substitution model (HKY + G), selected using Modeltest (Posada and Crandall 1998). A randomization test was used to determine whether mature termicins of the same species have charges at positively selected sites that are more or less similar to those expected by chance. The observed nasutitermine charges were randomly assigned to sequence names and the mean absolute difference in charges of conspecific sequences was determined, weighting each polymorphic species equally. The observed intraspecific mean charge difference was then compared to the distribution of 1,000 randomly derived means. Results Termicin Structure Termicins from Nasutitermes and D. rubriceps show high sequence identity with the original termicin peptide (fig. 1), which was isolated from P. spiniger and shown to have antifungal activity (Lamberty et al. 2001). The arrangement of cysteines is typical of insect defensins
FIG. 2.—(a) Maximum-likelihood tree with bootstrap values for the 3′ UTR of the termicin mRNA transcript. The K80 + G substitution model was used for constructing the tree. (b) Maximum-likelihood tree for mature termicin with bootstrap values. The HKY + G substitution model was used for constructing the tree. The species affiliation for each termicin sequence ID is given in table 4. Scale bars are in nucleotide substitutions per site.
(Dimarcq et al. 1998) and the structure of P. spiniger termicin has been shown to have the typical CSαβ motif associated with antimicrobial defensins (Da Silva et al. 2003). The mRNA sequence also appears to encode a signal peptide of 24 amino acids, which is absent in the active termicin isolated from P. spiniger and is probably cleaved off after translation. In addition, there is an extra glycine at the C-terminal end of most Nasutitermes and D. rubriceps termicins (fig. 1). N. triodiae termicins have become truncated after N. triodiae branched off from the other nasutes (fig. 1). Termicin Duplication and Divergence The screen of clones revealed two termicins in four species of Nasutitermes and D. rubriceps and three termicins in N. walkeri (fig. 1). Mature termicins in N. longipennis, N. magnus, and one of the two termicins in T. pastinator are identical. These species cannot be resolved by the phylogeny presented here (fig. 2a) and were collected from the same locale in Queensland (table 1), but they can be easily distinguished by morphological characters (Miller 1997) and the structure of their mounds. Several conspecific termicins do not group together in the 3′ UTR tree (fig. 2a), which suggests that they have duplicated prior to speciation. Exceptions are the two termicins in N. triodiae and those in N. exitiosus. These termicins differ at a substantial number of sites and it is
unlikely that they are allelic variants. Moreover, in a screen of 10 clones from a single individual of N. triodiae, we found three distinct sequences, two assignable to Nt2 and the other to Nt1. Termicins therefore appear to have duplicated and diverged after N. triodiae and N. exitiosus split off from the other nasutes. Within species, the expressed mature termicins appear to have diverged more rapidly than the 3′ UTR, and this has resulted in very different tree topologies for the mature termicin and the 3′ UTR (figs. 2a and b). The observed mean distance between 3′ UTRs from the same species fell at position 19 in the distribution of 1,000 randomizations, showing that this region has a significantly above-chance tendency to group conspecific sequences. The corresponding mean distance for the mature termicin fell at position 978 in the corresponding distribution of randomization means, showing that this region performed significantly worse than chance in grouping conspecific sequences. We concluded that the most consistent phylogenetic signal is given by the 3′ UTR. Furthermore, conspecific termicin sequences appear to have diverged significantly more than expected by chance. PAML and HYPHY Results The free ratio model in which dN/dS varies among termite lineages is not significantly better than a model with a fixed value of dN/dS (Mb vs. M0; table 2). There is
Table 2 Likelihood Ratio Test (LRT) to Detect Positive Selection
Models             2Δℓ                       χ² value   df   P value    Global dN/dS for best model
Full termicin
M0 vs. Mb          2(−1015.83, −1006.55)     18.56      37   P = .995   1.62
M1 vs. M3(K3)      2(−1288.04, −1209.46)     157.16     4    P < .001   2.17
M3, K3 vs. K4      2(−1209.46, −1204.94)     9.04       2    P = .011   2.30
M7 vs. M8          2(−1276.33, −1212.74)     127.18     2    P < .001   1.76
Signal peptide
M1 vs. M3(K3)      2(−256.74, −255.81)       1.86       4    P = .761   0.53
M7 vs. M8          2(−256.76, −255.81)       1.90       2    P = .387   0.50
Mature termicin
M1 vs. M3(K3)      2(−971.69, −912.80)       117.78     4    P < .001   3.02
M7 vs. M8          2(−955.85, −917.97)       75.76      2    P < .001   1.76
also no relationship between dN–dS and genetic distance along branches of the 3′ UTR tree (excluding the outgroup, R² = .002). We therefore employed PAML models that assume that selective pressure varies among sites in the molecule but not among lineages. Model M3 with three different unrestricted categories (K3) of dN/dS equal to 0.10, 1.86, and 11.19 is significantly better than model M1 with two restricted categories of dN/dS equal to one and zero (table 2). Furthermore, model M3 with four unrestricted categories (K4) of dN/dS equal to 0.06, 1.09, 4.08, and 13.85 is significantly better than model M3 with three categories (table 2). A comparison between model M7, which uses a beta distribution between the dN/dS interval from zero to one, and M8, which uses the same beta distribution plus an extra category allowing for positive selection (dN/dS > 1), also reveals that termicins are under significant positive selection (table 2). The global dN/dS is greater than one for all the best models described above. Sites under positive selection appear to be restricted to the mature molecule and the signal peptide appears to be under purifying selection (dN/dS < 1), and we therefore repeated the analysis for these segments separately. The discrete (M3) and beta distribution (M8) models for the signal peptide do not fit the data better than the neutral models (M1 and M7; table 2). However, the discrete (M3) and beta distribution (M8) models indicate that the mature peptide is under positive selection (table 2). The Bayesian approach was used to identify amino acid sites under positive selection with models M3 (K4) and M8 (posterior probabilities, P > .95; table 3). In the M3 (K4) model, sites identified with dN/dS equal to 1.09 were not included as positively selected, as the value is very close to expectations under neutral evolution (dN/dS = 1). A number of the bootstrap values in the 3′ UTR tree are low (fig. 2a). 
We therefore ran the PAML analysis with a tree in which branches were collapsed according to the 50% consensus rule. The M8 model identified the same positively selected sites with the exception of site 58, which did not appear to be under positive selection. We also employed three tests for detecting positively selected sites with HYPHY (Pond, Muse, and Frost 2004), and they are shown in table 3. The P value cutoff was 0.05 for
Table 3 Positively Selected Sites Identified with Different Models of Evolution
Model                    Global dN/dS   Positively Selected Sites
CODEML, M3               2.30           34, 37, 40, 45, 50, 52, 54, 58, 59, 60
CODEML, M8               1.76           34, 37, 45, 50, 52, 54, 58, 59, 60
HYPHY, SLAC              1.44           37, 40, 50, 52, 54, 59
HYPHY, ALRS              1.44           34, 37, 40, 45, 50, 54, 58, 59
HYPHY, Full likelihood   1.38           34, 37, 40, 50, 52, 54, 58, 59
identifying positively selected sites with the SLAC and ALRS methods and a posterior probability of greater than 0.95 for the Full-likelihood method. The global dN/dS was greater than one for all three HYPHY tests. Duplication, Selection, and Molecular Charge The protein alignments of mature termicin reveal that many of the substitutions result in the replacement of a neutral amino acid with a charged amino acid or vice versa. We calculated the average proportions of radical versus conservative amino acid substitutions in termicins that resulted in a change in charge, polarity, or polarity and volume under the Miyata-Yasunaga amino acid classification (Miyata, Miyazawa, and Yasunaga 1979). The ratio of average radical nonsynonymous substitutions to average conservative nonsynonymous substitutions was 1.29 for charge, 1.17 for polarity, and 0.59 for the Miyata-Yasunaga classification. These results suggest that a substantial component of selection is directed at sites that shift molecular charge. Furthermore, among nasute termicins there is a negative correlation (r = −.70, P < .001) between dN–dS and estimates of the change in termicin molecular charge along branches of the 3′ UTR tree (fig. 3). The negative correlation is strengthened when only positively selected amino acids are used to calculate charge. For example, the correlation between dN–dS and charge is −.81 (y = −0.042x + 0.009, P < 0.001), with charge estimates
FIG. 3.—Selection and the change in molecular charge along different termicin lineages. The strength of selection (dN–dS) and the shift in termicin charge (calculated with reconstructed termicins for internal nodes) were calculated along each branch of the 3′ UTR tree. The correlation is significant (P < .001).
Table 4 Termicin GenBank Accession Numbers, the Net Charge of the Mature Molecule (Column A), the Net Charge for the Positively Selected Sites Identified by the M3, M8, SLAC, and Full-Likelihood Models (Column B), and Positively Selected Sites Identified by the ALRS Model (Column C)
Species          Termicin ID   Accession No.   A       B      C
Species with two or three termicins
N. comatus       Nc1           AY642891        3.72    4.00   3.00
N. comatus       Nc2           AY642892        0.73    1.00   1.00
N. exitiosus     Ne1           AY642894        1.73    3.00   2.00
N. exitiosus     Ne2           AY642895        1.91    1.00   1.00
N. pluvialis     Np1           AY642900        2.73    4.00   3.00
N. pluvialis     Np2           AY642901        −0.13   1.00   1.00
N. triodiae      Nt1           AY642902        −0.18   2.00   2.00
N. triodiae      Nt2           AY642903        2.82    4.00   3.00
N. walkeri       Nw1           AY642904        2.73    4.00   3.00
N. walkeri       Nw2           AY642905        0.91    1.00   1.00
N. walkeri       Nw3           AY642906        1.91    1.00   1.00
T. pastinator    Tp1           AY642907        0.82    3.00   2.00
T. pastinator    Tp2           AY642908        0.91    0.00   0.00
Species with one termicin
N. dixoni        Nd            AY642893        1.82    3.09   2.09
N. fumigatus     Nf            AY642896        1.82    3.09   2.09
N. graveolus     Ng            AY642897        2.73    4.00   3.00
N. longipennis   Nl            AY642898        0.82    3.00   2.00
N. magnus        Nm            AY642899        0.82    3.00   2.00
Outgroups
D. rubriceps     Dp1           AY642909        3.82    4.00   3.00
D. rubriceps     Dp2           AY642910        6.82    4.00   3.00
P. spiniger      termicin      P82321          3.82    3.00   4.00
for the nine sites identified by the M8 model. This result indicates that selection is largely directed at reducing the positive charge at critical sites in the termicin molecule. Up to 66% (r² = .656) of the variation in the strength of selection along branches is explained by charge reduction at sites under positive selection, and only three sites (indicated with an asterisk in fig. 1) are responsible for the estimated charge differences. Among termicins within nasute species, the net charge at positively selected sites appears to be spread among duplicated termicins (table 4). The randomization test did not reveal a significant difference in the mean spread of termicin charge, estimated with all 34 or 37 sites in the mature molecule, compared to the simulated mean, in which charges were randomly assigned to sequence names. The mean difference between charges of conspecific sequences fell at position 899 along the distribution of randomization means. Only three positions among positively selected sites contribute to differences in molecular charge among termicins (indicated with an asterisk in fig. 1; sites 37, 50, and 52 in table 3). All the models of evolution identified these sites as positively selected with the exception of ALRS, which only identified sites 37 and 50 (table 3). Our results indicate that conspecific sequences differ very significantly compared to random expectation when charge was estimated from positively selected amino acids, either for the sites identified by the M3, M8, SLAC, or Full-likelihood models (position 998) or for the sites identified by the ALRS model (position 982).
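The randomization test on conspecific charge differences can be sketched as follows. This is a minimal illustration, not the authors' program: the function names, the random seed, and the decision to shuffle charges across all nasute sequences (including species with a single termicin) are assumptions. The charges are the column B values of table 4, grouped by species according to the sequence IDs.

```python
import random
from itertools import combinations

# Net charge at positively selected sites (table 4, column B), by species.
CHARGES = {
    "N. comatus": [4.00, 1.00],
    "N. exitiosus": [3.00, 1.00],
    "N. pluvialis": [4.00, 1.00],
    "N. triodiae": [2.00, 4.00],
    "N. walkeri": [4.00, 1.00, 1.00],
    "T. pastinator": [3.00, 0.00],
    "N. dixoni": [3.09],
    "N. fumigatus": [3.09],
    "N. graveolus": [4.00],
    "N. longipennis": [3.00],
    "N. magnus": [3.00],
}


def mean_conspecific_difference(groups):
    """Mean absolute charge difference within species.

    Each species with two or more sequences contributes one value (the
    mean over its sequence pairs), so polymorphic species weigh equally.
    """
    per_species = []
    for values in groups:
        if len(values) < 2:
            continue
        diffs = [abs(a - b) for a, b in combinations(values, 2)]
        per_species.append(sum(diffs) / len(diffs))
    return sum(per_species) / len(per_species)


def randomization_test(charges, n_perm=1000, seed=0):
    """Return the observed statistic and how many shuffles fall below it."""
    rng = random.Random(seed)  # assumed seed, for repeatability only
    sizes = [len(values) for values in charges.values()]
    pooled = [c for values in charges.values() for c in values]
    observed = mean_conspecific_difference(charges.values())
    n_below = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if mean_conspecific_difference(shuffled) < observed:
            n_below += 1
    return observed, n_below


observed, n_below = randomization_test(CHARGES)
print(observed, n_below)
```

The count of shuffled means falling below the observed value plays the role of the "position" quoted in the text; the exact count depends on the seed and on the shuffling scheme assumed here, so it need not reproduce the published positions.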
The tree in figure 4 uses the topology of the 3′ UTR tree and branch lengths corresponding to the number of nucleotide substitutions per codon in mature termicin. At positively selected sites, charge has been reduced in only one termicin in nasute species with two copies of the locus. In N. walkeri, which has three copies of the locus, charge has been reduced in two termicins. The reduction in charge occurs along lineages with relatively high substitution rates, in which the number of nonsynonymous substitutions is substantially higher than expected under no selection. Discussion We have characterized 20 new termite antifungal defensins that show high sequence identity to termicin from P. spiniger. Termicins in Nasutitermes have duplicated repeatedly and appear to be under strong positive selection. There is a strong correlation between the strength of selection (dN–dS) and estimates of the change in molecular charge along different termicin lineages, particularly at sites under positive selection. When selection is strong (relatively high values of dN–dS) it appears to cause a reduction in the positive charge of termicin, and 66% of the variation in the strength of selection is explained by the loss of positive charge at three sites. The reduction in charge at these sites results from the replacement of positively charged lysine and arginine with neutral amino acids (alanine, glycine, threonine, asparagine, serine, glutamine, and histidine) and a negatively charged glutamate. Our results suggest that ancestral termicins in Nasutitermes had a relatively high positive charge (P. spiniger termicin as well as the Drepanotermes termicins all have substantially higher charge values than the nasutes; table 4) and that this charge has become reduced along specific lineages. 
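To make the effect of such replacements concrete, the sketch below estimates the net side-chain charge of a set of residues at pH 7 with the Henderson-Hasselbalch relation. It is an illustration under stated assumptions: the pKa values are approximate textbook figures chosen for this example (the values actually used in the analysis are not given in the text), the terminal amino and carboxyl groups are left out, and the residue strings are hypothetical rather than real termicin sites.

```python
# Assumed side-chain pKa values (approximate textbook figures).
PKA_POSITIVE = {"K": 10.53, "R": 12.48, "H": 6.00}
PKA_NEGATIVE = {"D": 3.65, "E": 4.25, "C": 8.18, "Y": 10.07}


def side_chain_charge(residues, ph=7.0):
    """Net charge contributed by ionizable side chains at a given pH.

    Basic groups carry +1 when protonated; acidic groups carry -1 when
    deprotonated. Terminal groups are ignored in this sketch.
    """
    charge = 0.0
    for aa in residues:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


# Hypothetical residue sets: replacing lysine with alanine or glutamate.
print(round(side_chain_charge("KKR"), 2))
print(round(side_chain_charge("KAR"), 2))
print(round(side_chain_charge("KER"), 2))
```

Under these assumed pKa values each lysine or arginine contributes close to +1, a histidine contributes only about +0.09 at pH 7, and a glutamate contributes close to −1, so a lysine-to-glutamate replacement lowers the estimate by roughly two charge units.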
In termites with two termicins, or in one case three, differences in selection pressure appear to have driven a charge reduction at specific sites in one termicin while maintaining positive charge at the same sites in the other (fig. 4). This in turn has resulted in a greater spread of charge within species than across species at sites under positive selection (table 4). The repeated and parallel reduction in charge among positively selected sites in termicin duplicates and the maintained expression of both termicins in extant nasutes suggest that there is a selective advantage to having two termicins with different charge properties at specific sites. Several positively selected sites appeared to be responsible for changes in polarity. However, we did not find patterns of substitution among polar and nonpolar amino acids (data not shown) that would suggest a selection process for conspecific variation in polarity, although this pattern may have simply been beyond detection with our sample size. The mechanism behind termicin’s antifungal activity is not yet known. However, cationic insect defensins disrupt bacterial plasma membranes by forming voltage-dependent ion channels (Cociancich et al. 1993). The addition of divalent cations inhibits defensin antibacterial activity, which suggests that positive charge is critical to defensin function. Interestingly, the bacterium Staphylococcus aureus achieves resistance to defensins by the addition of positively charged lysine to a membrane
FIG. 4. Maximum-likelihood tree for mature termicin constrained with the topology of the 3′UTR tree. The scale bar is in nucleotide substitutions per codon. Numbers along branches are dN–dS values. The first value in parentheses is the charge estimate for the mature termicin and the second is that for positively selected sites identified by the M3, M8, SLAC, and Full-likelihood models.
phospholipid (Staubitz et al. 2004). P. spiniger termicin appears to behave like “morphogenic” defensins in plants (Lamberty et al. 2001), and there is evidence that some plant defensins act through a receptor-mediated mechanism that is cation resistant (Thevissen et al. 1996; Thevissen, Terras, and Broekaert 1999). This raises the possibility that there is a molecular arms race between termicin and a specific fungal membrane receptor. In termicins, selection appears to be largely directed at reducing the charge of three sites (indicated with an asterisk in fig. 1). These sites are externally exposed in the 3D structure of the protein (Da Silva et al. 2003) and may interact with a fungal binding site that has shifted charge to achieve resistance. The repeated duplication and divergence of termicin may be a response to Nasutitermes species encountering novel fungal pathogens as they radiated. Termite species were collected from habitats ranging from dry savannah to wet tropical rainforest and vary in their foraging ecology, nest type, and location. Nasutitermes walkeri and N. graveolus feed on wood and construct arboreal carton nests. Nasutitermes comatus, N. pluvialis, N. fumigatus, and N. dixoni also feed on wood but construct subterranean nests. Nasutitermes triodiae, N. longipennis, N. magnus, T.
pastinator, and D. rubriceps are mound builders that forage on grass, which they store in the nest, and N. exitiosus is a mound builder that forages on wood. Therefore, both the abundance and the type of microbial competitors and pathogens faced by these different termites are expected to vary. Our original expectation was that this variability would facilitate the diversification of termicins and lead to stronger selection and greater change along lineages associated with high predictors of pathogen load. For example, the strength of selection may be higher among rainforest species than dry savannah species because the former face more abundant and more diverse pathogens. However, the parallel reduction in termicin charge among duplicates occurs in the arboreal, mound-building, and subterranean termites (both rainforest and dry sclerophyll species) and suggests an alternative hypothesis: duplication and parallel divergence among nasute termicins represent the repeated evolution of a strategy of coping with one or more pathogens. Group living in termites potentially increases their vulnerability to disease. Closely related nestmates frequently exchange food and symbionts, facilitating the spread of pathogens (Schmid-Hempel 1998). The transfer of antimicrobials among nestmates by grooming, allogrooming, and trophallaxis as well as other social behaviors
may help compensate for this vulnerability (Rosengaus et al. 1998; Hughes, Eilenberg, and Boomsma 2002; Traniello, Rosengaus, and Savoie 2002). In Nasutitermes, both the terpenoid frontal gland secretion of soldiers and allogrooming by workers appear to help prevent fungal infections (Rosengaus et al. 1998; Rosengaus, Lefebvre, and Traniello 2000). The effectiveness of allogrooming in reducing fungal infections may be due in part to termicins. Termicin is produced constitutively in P. spiniger worker salivary glands, as well as by hemocytes, and inhibits spore germination in several fungi (Lamberty et al. 2001). Therefore, a pathogen that evolves resistance to termicin may overcome an important component of social immunity, allowing the pathogen to quickly access a large number of interdependent individuals. Furthermore, the pathogen is likely to have an important evolutionary advantage over its host because it has a short generation time. Duplication and the repeated diversification of charge at critical sites for termicin function may help compensate for this apparent advantage because different termicins acting in concert restrict the evolution of fungal resistance.

Acknowledgments

This work was supported by an Australian Research Council grant to R.H.C. We thank A. J. Shuetrim and J. A. Holt for assistance and advice with insect collections and Y. C. Crozier for assistance and guidance in the laboratory.

Literature Cited

Beattie, A., C. Turnbull, T. Hough, and B. Knox. 1986. Antibiotic production: a possible function for the metapleural gland of ants (Hymenoptera: Formicidae). Ann. Entomol. Soc. Am. 79:448–450.
Begun, D. J., and P. Whitley. 2000. Adaptive evolution of Relish, a Drosophila NF-κB/IκB protein. Genetics 154:1231–1238.
Boniotto, M., A. Tossi, M. DelPero, S. Sgubin, N. Antcheva, D. Santon, J. Masters, and S. Crovella. 2003. Evolution of the beta defensin 2 gene in primates. Genes Immun. 4:251–257.
Cociancich, S., A. Ghazi, C. Hetru, J. A. Hoffmann, and L. Letellier. 1993. Insect defensin, an inducible antibacterial peptide, forms voltage-dependent channels in Micrococcus luteus. J. Biol. Chem. 268:19239–19245.
Da Silva, P., L. Jouvensal, M. Lamberty, P. Bulet, A. Caille, and F. Vovelle. 2003. Solution structure of termicin, an antimicrobial peptide from the termite Pseudacanthotermes spiniger. Protein Sci. 12:438–446.
Dimarcq, J. L., P. Bulet, C. Hetru, and J. Hoffmann. 1998. Cysteine-rich antimicrobial peptides in invertebrates. Biopolymers 47:465–477.
Girardin, S. E., P. J. Sansonetti, and D. J. Philpott. 2002. Intracellular vs. extracellular recognition of pathogens: common concepts in mammals and flies. Trends Microbiol. 10:193–199.
Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using Clustal for multiple sequence alignments. Pp. 383–402 in R. F. Doolittle, ed. Computer methods for macromolecular sequence analysis. Academic Press, Inc., San Diego, Calif.
Hill, G. F. 1942. Termites (Isoptera) from the Australian region. Commonwealth Scientific and Industrial Research Organization, Melbourne, Australia.
Hughes, A. L., and M. Yeager. 1997. Coordinated amino acid changes in the evolution of mammalian defensins. J. Mol. Evol. 44:675–682.
Hughes, W. O. H., J. Eilenberg, and J. J. Boomsma. 2002. Tradeoffs in group living: transmission and disease resistance in leaf-cutting ants. Proc. R. Soc. Lond. Ser. B Biol. Sci. 269:1811–1819.
Hultmark, D. 2003. Drosophila immunity: paths and patterns. Curr. Opin. Immunol. 15:12–19.
Lamberty, M., D. Zachary, R. Lanot, C. Bordereau, A. Robert, J. A. Hoffmann, and P. Bulet. 2001. Insect immunity: constitutive expression of a cysteine-rich antifungal and a linear antibacterial peptide in a termite insect. J. Biol. Chem. 276:4085–4092.
Liersch, S., and P. Schmid-Hempel. 1998. Genetic variation within social insect colonies reduces parasite load. Proc. R. Soc. Lond. B 265:221–225.
Loker, E. S., C. M. Adema, S.-M. Zhang, and T. B. Kepler. 2004. Invertebrate immune systems: not homogeneous, not simple, not well understood. Immunol. Rev. 198:10–24.
Lynn, D. J., A. T. Lloyd, M. A. Fares, and C. O’Farrelly. 2004. Evidence of positively selected sites in mammalian α-defensins. Mol. Biol. Evol. 21:819–827.
Maschwitz, U. 1974. Vergleichende Untersuchungen zur Funktion der Ameisenmetathorakaldrüse. Oecologia 16:303–310.
Miller, L. R. 1997. Systematics of the Australian Nasutitermitinae with reference to evolution within the Termitidae (Isoptera). Australian National University, Canberra, Australia.
Miura, T., K. Maekawa, O. Kitade, T. Abe, and T. Matsumoto. 1998. Phylogenetic relationships among subfamilies in higher termites (Isoptera, Termitidae) based on mitochondrial COII gene sequences. Ann. Entomol. Soc. Am. 91:515–523.
Miyata, T., S. Miyazawa, and T. Yasunaga. 1979. Two types of amino acid substitutions in protein evolution. J. Mol. Evol. 12:219–236.
Morrison, G. M., C. A. M. Semple, F. M. Kilanowski, R. E. Hill, and J. R. Dorin. 2003. Signal sequence conservation and mature peptide divergence within subgroups of the murine beta-defensin gene family. Mol. Biol. Evol. 20:460–470.
Nelson, D. L., A. L. Lehninger, and M. M. Cox. 2000. Lehninger principles of biochemistry. Worth Publishers, New York.
Pond, S. L. K., S. V. Muse, and S. D. W. Frost. 2004. HYPHY: hypothesis testing using phylogenies (www.datamonkey.org). University of California, San Diego.
Posada, D., and K. A. Crandall. 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14:817–818.
Poulsen, M., A. N. M. Bot, M. G. Nielsen, and J. J. Boomsma. 2002. Experimental evidence for the costs and hygienic significance of the antibiotic metapleural gland secretion in leaf-cutting ants. Behav. Ecol. Sociobiol. 52:151–157.
Rambaut, A. 2002. Sequence alignment editor (http://evolve.zoo.ox.ac.uk). Department of Zoology, University of Oxford, UK.
Rosengaus, R. B., M. L. Lefebvre, and J. F. A. Traniello. 2000. Inhibition of fungal spore germination by Nasutitermes: evidence for a possible antiseptic role of soldier defensive secretions. J. Chem. Ecol. 26:21–39.
Rosengaus, R. B., A. B. Maxmen, L. E. Coates, and J. F. A. Traniello. 1998. Disease resistance: a benefit of sociality in the dampwood termite Zootermopsis angusticollis (Isoptera: Termopsidae). Behav. Ecol. Sociobiol. 44:125–134.
Schmid-Hempel, P. 1998. Parasites in social insects. Princeton University Press, Princeton, New Jersey.
Schmid-Hempel, P., and R. H. Crozier. 1999. Polyandry vs. polygyny vs. parasites. Phil. Trans. R. Soc. B 354:507–515.
Semple, C. A., M. Rolfe, and J. R. Dorin. 2003. Duplication and selection in the evolution of primate β-defensin genes. Genome Biol. 4:R31.
Staubitz, P., H. Neumann, T. Schneider, I. Wiedemann, and A. Peschel. 2004. MprF-mediated biosynthesis of lysylphosphatidylglycerol, an important determinant in staphylococcal defensin resistance. FEMS Microbiol. Lett. 231:67–71.
Suzuki, Y., and T. Gojobori. 1999. A method for detecting selection at single amino acid sites. Mol. Biol. Evol. 16:1315–1328.
Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sinauer Associates, Sunderland, Mass.
Thevissen, K., A. Alenzandre, G. W. De Samblanx, C. Brownlee, R. W. Osborn, and W. F. Broekaert. 1996. Fungal membrane responses induced by plant defensins and thionins. J. Biol. Chem. 271:15018–15025.
Thevissen, K., F. R. G. Terras, and W. F. Broekaert. 1999. Permeabilization of fungal membranes by plant defensins inhibits fungal growth. Appl. Environ. Microbiol. 65:5451–5458.
Traniello, J. F. A., R. B. Rosengaus, and K. Savoie. 2002. The development of immunity in a social insect: evidence for the group facilitation of disease resistance. Proc. Natl. Acad. Sci. USA 99:6838–6842.
Watson, J. A. L., and H. M. Abbey. 1993. Atlas of Australian termites. CSIRO, Australia.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.
Yang, Z. H., R. Nielsen, N. Goldman, and A. M. K. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
Zhang, J. Z. 2000. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J. Mol. Evol. 50:56–68.
Edward Holmes, Associate Editor
Accepted August 12, 2004