MEASURING SELECTION COEFFICIENTS AFFECTING THE ALCOHOL DEHYDROGENASE POLYMORPHISM IN DROSOPHILA MELANOGASTER

S. R. WILSON
Department of Statistics, Research School of Social Sciences, Australian National University, P.O. Box 4, Canberra, A.C.T. 2600, Australia

J. G. OAKESHOTT, J. B. GIBSON AND P. R. ANDERSON
Department of Population Biology, Research School of Biological Sciences, Australian National University, P.O. Box 475, Canberra City, A.C.T. 2601, Australia

Manuscript received June 24, 1981
Revised copy accepted October 15, 1981

ABSTRACT
This paper describes a perturbation experiment on the frequency of the F and S Alcohol dehydrogenase (Adh) alleles of D. melanogaster. Fifty-four isofemale lines set up from three wild populations and with initial F frequencies of either 0.25, 0.50 or 0.75 were maintained on standard laboratory food medium at 22°. At generations 4, 12 and 20 the lines were again scored for Adh gene frequencies. Maximum likelihood procedures were used to estimate selection coefficients for the Adh genotypes. An analysis of deviance was used to compare the coefficients against expectations under the hypotheses of neutrality and of constant values for the three base populations, and for the three initial gene frequency classes. Highly significant departures from neutrality were observed; over all 54 lines, the set of relative fitnesses for S/S:F/S:F/F was estimated as 1.00:1.08:1.08. In addition, there were significant differences between lines in the outcome of selection which were not attributable to differences between base populations or initial F frequencies. These residual between-line differences, as well as some between-generation, within-line differences, are discussed in terms of linkage disequilibria with background genes and electrophoretically cryptic variation at the Adh locus.
Over the last decade numerous laboratory Drosophila experiments have been reported in which changes in allozyme gene frequencies have been monitored over several generations following experimental changes to the environment and/or the gene frequencies (BERGER 1971; AYALA and ANDERSON 1973; VAN DELDEN, BOEREMA and KAMPING 1978). Various types of result have been obtained, including directional gene frequency change leading to fixation, movement toward a polymorphic equilibrium (at either the pre-perturbation or a new frequency) or no systematic change from the perturbed frequency. Selective processes have often been inferred to underlie pronounced systematic changes; but a number of problems have hindered precise interpretation, particularly where no large systematic changes have occurred.

Genetics 100: 113-126 January, 1982.
For example, the absence of systematic change may reflect a real lack of selective effects, but it may also imply that the selective equilibrium under the experimental conditions does not differ significantly from the starting gene frequencies. In theory, this problem can be avoided by using several different starting gene frequencies but, in practice, this strategy has not been generally used. However, when it has been used, selection has been detected on Esterase-5, Octanol dehydrogenase and Malate dehydrogenase-2 gene frequencies in D. pseudoobscura (FONTDEVILA et al. 1975) and Esterase-6, Glucose-6-phosphate dehydrogenase, 6-Phosphogluconate dehydrogenase and Phosphoglucomutase gene frequencies in D. melanogaster (MACINTYRE and WRIGHT 1966; YARBROUGH and KOJIMA 1967; BIJLSMA and VAN DELDEN 1977; CARFAGNA et al. 1980).

Many of the selective effects inferred from such experiments may not, however, have acted directly on the enzyme loci monitored. Often only a few related inbred lines have been studied and large linkage disequilibria will have been established during the construction of the experimental populations (JONES and YAMAZAKI 1974; FONTDEVILA et al. 1975; CAVENER and CLEGG 1981 and references therein). Furthermore, such changes in gene frequency as have been consistently observed over a variety of stocks are often still dependent on the presence of perhaps unrealistically severe environmental stresses (GIBSON 1970; WILLS, PHELPS and FERGUSON 1975, and references therein).

Another substantial confounding effect on observed changes in gene frequency is random drift due to the unknown but finite sizes of the experimental populations. However, SCHAFFER, YARDLEY and ANDERSON (1977), GIBSON et al. (1979), and WILSON (1980) developed hierarchical model fitting and analysis of deviance techniques which make allowance for the effects of conservatively estimated finite population sizes.
These authors have demonstrated the action of weak selection on the Malate dehydrogenase-2 locus in D. pseudoobscura populations kept under standard laboratory conditions, and the absence of detectable selection on the Alcohol dehydrogenase locus in D. melanogaster populations selected for alcohol tolerance.

In the present experiment, the multiple starting gene frequency approach was used, with 54 isofemale lines from three base populations forming the experimental material. Selection coefficients were estimated using model fitting and maximum likelihood procedures and hypotheses concerning the values of the coefficients were evaluated using analyses of deviance. The power of this method was demonstrated by detecting weak selective differences between the electrophoretic F and S Alcohol dehydrogenase (Adh) alleles in laboratory populations of D. melanogaster kept at about 22° on standard laboratory food.

MATERIALS AND METHODS
The 54 isofemale lines were founded from gravid females collected from three populations in the Mudgee district of New South Wales in 1978. One population, Montrose, was captured from two breeding sites in wine seepages in the Montrose winery cellar; the second, Craigmoor, was obtained from numerous separate wine seepages in the Craigmoor winery cellar; the third, Linden, was captured around several piles of rotting fruit in a mixed fruit orchard. Each collecting locality was about 6 km away from the others.
Genotypes at the Adh locus (map position 2-50.1; GRELL, JACOBSON and MURPHY 1965) were determined in the flies collected after electrophoresis on cellulose acetate strips following the methods of LEWIS and GIBSON (1978); overall the frequency of F was 0.67 ± 0.03 in the 135 flies collected from Montrose, 0.64 ± 0.03 in the 108 from Craigmoor, and 0.63 ± 0.06 in the 34 from Linden. F frequencies in the collections were not significantly different from one another (χ² = 2.63, P > 0.05).
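The standard errors quoted for the three collections are consistent with simple binomial sampling of two Adh genes per fly. The sketch below assumes that reading (the text does not state how the errors were computed) and uses a hypothetical helper, `binomial_se`, to reproduce them from the reported frequencies and sample sizes.

```python
import math

def binomial_se(freq, n_flies):
    """Binomial standard error of an allele frequency scored in n_flies diploid flies."""
    n_genes = 2 * n_flies  # assumption: two Adh genes scored per fly
    return math.sqrt(freq * (1.0 - freq) / n_genes)

# (F frequency, number of flies) as reported for the three collections
collections = {"Montrose": (0.67, 135), "Craigmoor": (0.64, 108), "Linden": (0.63, 34)}

for name, (freq, n_flies) in collections.items():
    print(f"{name}: F = {freq:.2f}, SE = {binomial_se(freq, n_flies):.3f}")
```

Rounded to two decimals, the three values agree with the 0.03, 0.03 and 0.06 given in the text.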
Each isofemale line selected for study was initially polymorphic for F and S and apparently had arisen from a single mating. Thus only three gene frequency classes occurred among the progeny of the 54 parent females: 0.25, 0.50 and 0.75 F. The 0.50 F class could also be divided according to genotypic frequencies, one type containing S/S, F/S and F/F progeny in the ratio 1:2:1 and a second containing only F/S progeny. The number of lines of each type founded from each base population (together with the presumed founder mating in parentheses) was:

              0.25 F         0.50 F(i)      0.50 F(ii)     0.75 F
              (S/S × F/S)    (F/S × F/S)    (S/S × F/F)    (F/S × F/F)
Montrose           5              6              1             13
Craigmoor          2              7              1             14
Linden             1              2              1              1
The important difference between the 0.50 F(i) and 0.50 F(ii) lines is that selection could not act between Adh genotypes at generation 1 in the 0.50 F(ii) lines as only one Adh genotype was present (see also Statistical procedures).

The 54 lines were maintained at 22° ± 2° for 20 three-week generations by mass transfer in 300 ml vials containing 50 ml of food medium. The recipe for the medium was: 10 g agar, 26 g sucrose, 50 g glucose, 22 g wheat germ, 50 g maize meal, 6 g dead brewers' yeast, 5 ml propionic acid and 1 liter water. At generations 4, 12 and 20 after founding, each line was scored for F and S gene frequencies. The average numbers of Adh genes scored were 46 ± 4 at g = 4, 64 ± 6 at g = 12, and 84 ± 3 at g = 20.
It was important for the analyses to estimate effective population size, N/2, since this would influence the extent to which random drift might contribute to gene frequency changes. Accordingly, eight Craigmoor lines were taken at random; in two successive generations their population sizes under the experimental conditions were estimated in terms of the numbers of adults of each sex produced in three weeks and the proportions of these adults that were fertile. Fertility among the females was estimated by removing a sample of 20 females from each line after the three weeks and putting them individually in vials with standard food. After three more weeks, the proportions of the vials containing progeny were recorded. Fertility among the males was scored in a similar way, except that each of the sampled males was put into a vial of standard food with three virgin females from another wild-type stock. The proportions of the vials containing progeny were recorded after three weeks. There were no significant differences among the eight lines in any of the four components of population size measured. The means and standard errors over the eight lines and two test generations for the four measures were:

                    Males        Females
Total number       143 ± 9      139 ± 11
Percent fertile     82 ± 9       82 ± 4
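As a rough check on these figures, the expected number of fertile adults per vial can be obtained by weighting each sex's mean count by its fertility. The sketch below assumes that the values are listed in the order males, then females, and uses a hypothetical helper, `fertile_individuals`; it is illustrative arithmetic, not a procedure described by the authors.

```python
# Mean adult counts and percent fertile from the fertility assay
# (assumed order of the reported values: males, then females).
total_adults = {"males": 143, "females": 139}
percent_fertile = {"males": 82, "females": 82}

def fertile_individuals(totals, percents):
    """Expected number of fertile adults per vial, summed over both sexes."""
    return sum(totals[sex] * percents[sex] / 100.0 for sex in totals)

fertile = fertile_individuals(total_adults, percent_fertile)
print(f"Estimated fertile adults per vial: {fertile:.0f}")
```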
This suggests that there were an average of about 230 fertile individuals per vial. However, since this estimate was obtained under noncompetitive conditions, it probably overestimated effective population size, N/2; under the experimental conditions not all fertile individuals would have contributed to successful matings. Therefore, the following analyses were worked twice, one calculation assuming N/2 = 200, the second more conservatively assuming N/2 = 50.

During the experiment seven of the 54 lines were terminated when the number of adults available for transfer at the end of the three weeks fell below 200. The first loss occurred between g = 7 and g = 8 in a Linden 0.50 F(i) line. The other losses occurred between g = 14 and g = 15 and involved one Craigmoor 0.25 F line and two Craigmoor and three Montrose 0.75 F lines. No obvious reasons for the sudden drops in population in these lines were detected, but because many dead pupae were found in the bottles, disease was suspected. The cause of the deaths was not investigated.

Statistical procedures
Model fitting: The technique for fitting models for various selective modes to the data followed WILSON (1980), and used similar notation. The data comprised a sequence of gene frequencies $\{p_{g,icl}\}$, where $p_{g,icl}$ was the gene frequency in the $g$th generation ($g$ = 4, 12, 20) in isofemale line $l$ of base population $c$ ($c$ = L, M, C) in initial gene frequency class $i$ ($i$ = 0.25 F, 0.50 F(i), 0.50 F(ii), 0.75 F). The gene frequencies were subject to angular (arcsin) transformation such that $Y_{g,icl} = 2\sin^{-1}\sqrt{p_{g,icl}}$. For each line $l$ in category $c$ and initial frequency class $i$, the transformed observations are represented by a vector $\mathbf{Y}_{icl}$ whose three elements are the generation values. The variance-covariance matrix for the three transformed gene frequencies is the same for each line in category $c$ and initial frequency class $i$, and is represented by the 3 × 3 symmetric matrix, $\mathbf{W}_{ic}$. The elements of $\mathbf{W}_{ic}$ are given by

$$(\mathbf{W}_{ic})_{g,g} = \frac{1}{n_{g,ic}} + \left(1 - \frac{1}{n_{g,ic}}\right)\left[1 - \left(1 - \frac{1}{N}\right)^{g-1}\right] \quad \text{for } i = 0.25\,F,\ 0.50\,F(\mathrm{i}),\ 0.75\,F \text{ and all } c,$$

$$(\mathbf{W}_{ic})_{g,g} = \frac{1}{n_{g,ic}} + \left(1 - \frac{1}{n_{g,ic}}\right)\left[1 - \left(1 - \frac{1}{N}\right)^{g-2}\right] \quad \text{for } i = 0.50\,F(\mathrm{ii}) \text{ and all } c,$$

and, for $g < g'$,

$$(\mathbf{W}_{ic})_{g,g'} = 1 - \left(1 - \frac{1}{N}\right)^{g-1} \quad \text{for } i = 0.25\,F,\ 0.50\,F(\mathrm{i}),\ 0.75\,F \text{ and all } c,$$

$$(\mathbf{W}_{ic})_{g,g'} = 1 - \left(1 - \frac{1}{N}\right)^{g-2} \quad \text{for } i = 0.50\,F(\mathrm{ii}) \text{ and all } c,$$

where $N/2$ is the effective population size and $n_{g,ic}/2$ is the number of individuals scored for gene frequency at generation $g$.

Each model, $m$, gives rise to an expected transformed gene frequency $\varepsilon_{g,icl}(m)$ at generation $g$ for each line in initial frequency class $i$ and category $c$. From these, the generation vectors $\boldsymbol{\varepsilon}_{icl}(m)$ are formed. For each line, the difference vector is determined,

$$\boldsymbol{\gamma}_{icl}(m) = \mathbf{Y}_{icl} - \boldsymbol{\varepsilon}_{icl}(m).$$
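The transformation and the variance-covariance matrix described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the off-diagonal element is taken to be the drift term at the earlier of the two generations, and the inputs (N = 400 genes, i.e. N/2 = 200, and 46, 64 and 84 genes scored) are simply the averages quoted in MATERIALS AND METHODS.

```python
import math

def angular(p):
    """Angular transformation Y = 2 * arcsin(sqrt(p)), in radians."""
    return 2.0 * math.asin(math.sqrt(p))

def drift_variance(g, n_genes, class_ii=False):
    """Drift term 1 - (1 - 1/N)^(g - k); k = 2 for 0.50 F(ii) lines, otherwise 1."""
    lag = 2 if class_ii else 1
    return 1.0 - (1.0 - 1.0 / n_genes) ** (g - lag)

def covariance_matrix(generations, n_scored, n_genes, class_ii=False):
    """Build W for one (i, c) class from genes scored and population gene number N."""
    size = len(generations)
    w = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            # Assumption: covariance between generations equals the drift
            # accumulated up to the earlier generation.
            drift = drift_variance(min(generations[a], generations[b]), n_genes, class_ii)
            if a == b:
                w[a][b] = 1.0 / n_scored[a] + (1.0 - 1.0 / n_scored[a]) * drift
            else:
                w[a][b] = drift
    return w

w = covariance_matrix((4, 12, 20), (46, 64, 84), n_genes=400)
for row in w:
    print(["%.4f" % x for x in row])
```

With these inputs the sampling term 1/n dominates the diagonal at generation 4, while drift contributes a growing share by generation 20, which is why the assumed value of N matters more for the later scores.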
Then we have the well known statistical result (based on the likelihood ratio) that for the model, $m$, for a particular set, $S$, of lines in particular category/categories and initial frequency class/classes, the deviance

$$\chi^2_S(m) = \sum_{icl \in S} [\boldsymbol{\gamma}_{icl}(m)]^T \mathbf{W}_{ic}^{-1} [\boldsymbol{\gamma}_{icl}(m)] \qquad (1)$$

is distributed as a chi-square variate with degrees of freedom (df) in the range $(d_l(m), d_u(m))$. The upper bound, $d_u(m)$, is given by

$$d_u(m) = \sum_{icl \in S} G_{icl} - \nu(m),$$

where $G_{icl}$ is the number of observations in the vector $\mathbf{Y}_{icl}$ (here either 3, 2 or 1) and $\nu(m)$ is the total number of separate parameters fitted under model $m$. The lower bound, $d_l(m)$, is

$$d_l(m) = \max(0,\ n_S - \nu(m)),$$

where $n_S$ is the number of independent vectors $\mathbf{Y}_{icl}$ that belong to $S$. (The df cannot be determined exactly, since the generation values, $Y_{g,icl}$, are not independent as $g$ varies.) To determine whether model $j$ (involving more parameters) is a significantly better fit to the observed data, $S$, than model $m$ (involving fewer parameters) we calculated $\chi^2_S(m) - \chi^2_S(j)$ which, under the null hypothesis of no improvement, has a central chi-square distribution with exactly $\nu(j) - \nu(m)$ df. Thus, one may build up a table of deviances for sequences of nested models similar to the tables of sums of squares determined in an analysis of variance.

Estimation procedure: For all models except that of no selection, the relevant $\nu(m)$ parameters need to be estimated so that $\boldsymbol{\varepsilon}_{icl}(m)$ can be determined. The most general selection model is that each line in each base population $c$ and initial gene frequency class $i$ has a different set of fitness parameters which are assumed to be constant over time. The ratios of fitness for genotypes {S/S, F/S, F/F} are taken to be of the general form $\{1 : 1 + t_{icl} : 1 + s_{icl}\}$ respectively. This is a more general and biologically meaningful model than that in SCHAFFER et al. (1977) and WILSON (1980), where appropriate changes in the transformed gene frequencies, rather than in the selection coefficients, were set as constant over time. The expected gene frequency at generation $g + 1$, $\theta_{(g+1),icl}$, is related to that at generation $g$, $\theta_{g,icl}$, by

$$\theta_{(g+1),icl} = \frac{\theta_{g,icl} + \theta_{g,icl}^2\, s_{icl} + \theta_{g,icl}(1 - \theta_{g,icl})\, t_{icl}}{1 + \theta_{g,icl}^2\, s_{icl} + 2\,\theta_{g,icl}(1 - \theta_{g,icl})\, t_{icl}}$$

and to starting gene frequencies by

$$\theta_{1,icl} = \frac{1 + t_{icl}}{4 + 2\,t_{icl}} \quad \text{for } i = 0.25\,F \text{ and all } c,$$

with corresponding expressions for the other initial frequency classes.
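The recursion and the starting frequencies can be illustrated as follows. This is a sketch under stated assumptions: the generation-1 frequencies are derived here from Mendelian progeny proportions of the presumed founder matings followed by one round of selection, which reproduces the expression given for the 0.25 F class; the expressions used for the other classes are derived on that assumption rather than quoted, and all function names are hypothetical.

```python
def next_frequency(theta, s, t):
    """One generation of selection with fitnesses S/S : F/S : F/F = 1 : 1+t : 1+s."""
    numerator = theta + theta**2 * s + theta * (1.0 - theta) * t
    mean_fitness = 1.0 + theta**2 * s + 2.0 * theta * (1.0 - theta) * t
    return numerator / mean_fitness

# Mendelian progeny proportions (S/S, F/S, F/F) for each presumed founder mating.
PROGENY = {
    "0.25 F": (0.5, 0.5, 0.0),      # S/S x F/S
    "0.50 F(i)": (0.25, 0.5, 0.25), # F/S x F/S
    "0.50 F(ii)": (0.0, 1.0, 0.0),  # S/S x F/F: only F/S progeny, so no selection
    "0.75 F": (0.0, 0.5, 0.5),      # F/S x F/F
}

def starting_frequency(freq_class, s, t):
    """Assumed generation-1 F frequency after selection among the founders' progeny."""
    ss, fs, ff = PROGENY[freq_class]
    w_ss, w_fs, w_ff = ss, fs * (1.0 + t), ff * (1.0 + s)
    return (w_ff + 0.5 * w_fs) / (w_ss + w_fs + w_ff)

def expected_frequencies(freq_class, s, t, generations=(4, 12, 20)):
    """Expected F frequency at each scored generation, keyed by generation."""
    theta = starting_frequency(freq_class, s, t)
    out = {}
    for g in range(1, max(generations) + 1):
        if g in generations:
            out[g] = theta
        theta = next_frequency(theta, s, t)
    return out

for freq_class in PROGENY:
    traj = expected_frequencies(freq_class, s=0.08, t=0.08)
    print(freq_class, {g: round(v, 2) for g, v in traj.items()})
```

Setting s = t = 0 leaves every frequency unchanged, which is a quick way to confirm that the recursion reduces to the neutral case.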
Thus one obtains the vector $\boldsymbol{\varepsilon}_{icl}(m)$ and hence $\boldsymbol{\gamma}_{icl}(m)$ as a function of $t_{icl}$ and $s_{icl}$. For each line, $l$, the parameters $t_{icl}$ and $s_{icl}$ can be estimated by the method of maximum likelihood, which is equivalent to maximizing the function

$$L_{icl} = -[\boldsymbol{\gamma}_{icl}(m)]^T \mathbf{W}_{ic}^{-1} [\boldsymbol{\gamma}_{icl}(m)] \qquad (2)$$

(and this is the same as minimizing $-L_{icl}$). The NAG subroutine E04JAF was used. To estimate the parameters under appropriate subhypotheses concerning various subsets, $S_b$, of the data, we maximize $\sum_{icl \in S_b} L_{icl}$, where $L_{icl}$ is given by (2). For example, if the model, $m_{ic.}$, is fit so that the selection parameters are identical for every line in each $(i, c)$ class, then for each $(i, c)$ class the estimates of $(t_{ic.}, s_{ic.})$ are determined by maximizing $\sum_{l} L_{icl}$, since for each $(i, c)$ class, the subset $S_b$ consists of all lines in that class. These determine the values of $\boldsymbol{\varepsilon}_{icl}(m_{ic.})$ and hence $\boldsymbol{\gamma}_{icl}(m_{ic.})$, and the deviance value is obtained by substituting into (1). As another example, if the model, $m_{...}$, is fit so that the selection parameters are identical for every line in every population and every initial gene frequency class, the estimates of $(t_{...}, s_{...})$ can be determined by maximizing $\sum_{icl} L_{icl}$. (In this special case $S_b = S$.) Then, having determined the values of $\boldsymbol{\gamma}_{icl}(m_{...})$, the deviance value is obtained by substituting into (1).
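The estimation step amounts to minimizing the summed quadratic form over (s, t). The sketch below substitutes a plain grid search for the NAG routine, so it illustrates the idea rather than the authors' implementation. It handles only 0.25 F lines, assumes the off-diagonal covariance is the drift term at the earlier generation, and uses synthetic data generated at s = t = 0.08; all function names are hypothetical.

```python
import math

GENERATIONS = (4, 12, 20)

def angular(p):
    return 2.0 * math.asin(math.sqrt(p))

def expected(s, t):
    """Transformed expected frequencies for a 0.25 F line at the scored generations."""
    theta, out = (1.0 + t) / (4.0 + 2.0 * t), []
    for g in range(1, max(GENERATIONS) + 1):
        if g in GENERATIONS:
            out.append(angular(theta))
        mean_w = 1.0 + theta**2 * s + 2.0 * theta * (1.0 - theta) * t
        theta = (theta + theta**2 * s + theta * (1.0 - theta) * t) / mean_w
    return out

def covariance(n_scored, n_genes):
    """W for a non-F(ii) class; off-diagonal = drift at the earlier generation (assumed)."""
    drift = [1.0 - (1.0 - 1.0 / n_genes) ** (g - 1) for g in GENERATIONS]
    k = len(GENERATIONS)
    return [[1.0 / n_scored[a] + (1.0 - 1.0 / n_scored[a]) * drift[a] if a == b
             else drift[min(a, b)] for b in range(k)] for a in range(k)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def deviance(lines, s, t, w):
    """Sum over lines of gamma^T W^-1 gamma, as in equation (1)."""
    e = expected(s, t)
    total = 0.0
    for observed in lines:
        gamma = [angular(p) - x for p, x in zip(observed, e)]
        total += sum(g * v for g, v in zip(gamma, solve(w, gamma)))
    return total

def grid_fit(lines, w, lo=-20, hi=40):
    """Grid search over s, t in steps of 0.01 (a stand-in for a proper optimizer)."""
    best = None
    for ks in range(lo, hi + 1):
        for kt in range(lo, hi + 1):
            d = deviance(lines, ks / 100.0, kt / 100.0, w)
            if best is None or d < best[0]:
                best = (d, ks / 100.0, kt / 100.0)
    return best

# Synthetic line generated exactly at s = t = 0.08 (illustrative, not real data).
w = covariance((46, 64, 84), n_genes=400)
synthetic = [math.sin(y / 2.0) ** 2 for y in expected(0.08, 0.08)]
d_min, s_hat, t_hat = grid_fit([synthetic], w)
print(f"s = {s_hat:.2f}, t = {t_hat:.2f}, deviance = {d_min:.3g}")
```

Fitting a pooled model corresponds to passing several lines to `grid_fit` at once; fitting each line separately and summing the minima gives the deviance of the fully parameterized model, and the difference between the two is the quantity compared in the analysis of deviance.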
RESULTS
Table 1 shows the average gene frequency changes observed over all three base populations for lines in each initial frequency class. (It also gives expected changes for a particular mode of selection; see below.) Regular, monotonic rises in F frequency are apparent for each initial frequency class and the average frequencies in the different classes converge towards relatively high values. Thus, by g = 20, average F frequency in 0.75 F lines rises by only 0.04 to a value of 0.79, but in 0.50 F lines it rises by 0.12 to 0.62 and in 0.25 F lines it rises by 0.24 to 0.49. Figure 1 shows that with few exceptions these trends also occurred in the initial frequency classes within each of the three base populations. In three classes, Craigmoor 0.25 F, Linden 0.50 F and Linden 0.25 F, there were temporary drops in F frequency at particular scored generations; however, each of these classes contained only one or two lines, so their temporary reversals may have been due to sampling. The only consistent exception to the general trends in Table 1 was the Montrose 0.75 F class, which showed a slight but sustained drop in F frequency throughout the experiment. This class contained 13 lines, of which ten survived until g = 20; but, as Figure 2 shows, the slight decline observed overall in this class was due to pronounced falls in F frequency in two of the lines. These two were the most deviant of all 54 lines investigated.

In the analyses, the design of the experiment was regarded as having two crossed factors, initial frequency class i and base population c, as well as two nested factors, line l within each (i,c) combination and generation g within each line l. A hierarchy of models was constructed around this structure; the parameterization for these models, with deviances for N/2 = 50 and 200, is given in Table 2. The differences in deviance between appropriate models were calculated to embrace hypotheses of interest concerning selection parameters s and t. These deviance differences formed the basis of the analysis of deviance shown in Table 3. Six terms were tabulated, of which (i) was for the improvement over a neutral model of a selection model assuming single maximum likelihood s and t estimates for all 54 lines. Terms (ii), (iii) and (iv) represent the improvement in fit over the selection model in (i) of selection models assuming different s and t estimates for each initial frequency class i, base population c, and (i,c) combination respectively. Term (v) represents the improved fit over the model in (iv) of a fully parameterized selection model assuming different s and t estimates for all 54 lines, and term (vi) represents the actual fit of this fully parameterized model in (v).

FIGURE 1.-Mean F frequency observed at each scored generation for each initial F frequency class. Values are shown within each base population and pooled across all three. The numbers of lines with complete data are given in parentheses.

TABLE 1
Means and standard errors of observed F frequencies*

Initial                               Generation
F frequency            4                    12                    20
0.25           0.31 ± 0.06 (0.30)   0.41 ± 0.10 (0.39)   0.49 ± 0.11 (0.48)
0.50           0.53 ± 0.03 (0.54)   0.56 ± 0.03 (0.60)   0.62 ± 0.01 (0.66)
0.75           0.76 ± 0.02 (0.76)   0.78 ± 0.03 (0.78)   0.79 ± 0.04 (0.80)

* Frequencies predicted by a relative fitness set of 1.00:1.08:1.08 for S/S:F/S:F/F in parentheses.
The analyses assuming N/2 = 50 and 200 give essentially similar results. In both cases term (i) is highly significant, indicating that s and/or t are not zero. Terms (ii), (iii) and (iv) are not significant, indicating that the s and t estimates are essentially homogeneous over initial frequencies, base populations and their combinations. As is appropriate in a nested analysis, term (v), the variability between lines within (i,c) combinations, is used as a denominator mean deviance to derive the F ratios in each of these tests of significance. The variability between lines within (i,c) combinations (v) is significantly greater than the variability between generations within lines (vi), which in turn is significantly greater than expectations under binomial sampling (e.g. $\chi^2_{47}$ = 84.3, P < 0.001 for N/2 = 50). In both of these tests df is uncertain, but the number assumed in the analysis gives the most conservative test. The uncertainty about the df and some other numerical problems in the analysis are explained in the APPENDIX.

The between-generation within-line term (vi) is significantly larger than is expected under binomial sampling; this reflects large reversals in the direction of gene frequency change. The largest reversals occurred in two Craigmoor 0.75 F lines in which the F frequencies at generations 4, 12 and 20 were (0.55, 0.83, 0.94) and (0.71, 0.69, 1.00). If these two lines are omitted from the analysis, the within-line deviance for N/2 = 50 is reduced from 84.3 [...]

FIGURE 2.-F frequency observed at each scored generation for each Montrose line initiated at 0.75 F. Only lines for which there were complete data are shown.
FIGURE 3.-The 95% and 99% confidence contours for the estimates of s = t = 0.08 obtained from the pooled data for all 54 lines.
cated by the fact that the selection coefficients, while relatively small (s = t = 0.08), were still significantly different from zero. The result also proved robust to variation in the effective population size assumed, with the selection still highly significant even when the population size was conservatively set at 50 individuals.

The selection coefficients did not differ significantly among the three base populations. Nor was there evidence for frequency-dependent selection (MORGAN 1976; YOSHIMARU and MUKAI 1979); not only were selection coefficients consistent across initial frequency classes (Table 3), but also the mean gene frequencies observed over time agree with those expected under the hypothesis of constant selection coefficients (Table 1).

There were residual differences in the coefficients among isofemale lines within base populations and initial frequency classes, as well as differences among generations within lines. Both these sources of heterogeneity probably resulted from founder effects during the production of the isofemale lines; these effects would have established large differences among lines in background genotype as well as large linkage disequilibria between Adh and background genes within lines. Together these two properties could account for variation in selection coefficients between lines, while the progressive decay of the linkage disequilibria over generations might also account for the within-line variation observed.

One aspect of the genetic background which could affect the selection observed is the inversion In(2L)t. This is on the same chromosome arm as the Adh locus and in heterokaryotypes suppresses recombination throughout the arm. In(2L)t is polymorphic in most natural populations and generally in linkage disequilibrium with Adh, with In(2L)t,S gametes in excess, and In(2L)t,F gametes less frequent than expected under random association (VOELKER et al. 1978 and references therein). Although not typed in the lines used for this study, In(2L)t has been found at a frequency of 0.07 ± 0.04 in a subsequent Montrose collection, and all three In(2L)t chromosomes identified bore S (KNIBB, OAKESHOTT and GIBSON 1981). INOUE (1979) found that In(2L)t is gradually eliminated from polymorphic populations introduced into the laboratory. Therefore, some of the variation in selection coefficients on the chromosome region marked by Adh observed within and between lines in the present study could be due to the occurrence of some In(2L)t,S chromosomes in a minority of the lines and the selection against these chromosomes. Likewise, In(2L)t might have biased the overall estimates of selection coefficients on the chromosome region marked by Adh, with the fitnesses of S/S and, to a lesser extent, F/S flies underestimated. Any such bias was probably small, since the overall frequency of the inversion was probably low. Nevertheless, the direction of any correction for the bias would imply slight heterozygote advantage, rather than the complete dominance for fitness otherwise indicated by the analysis.

Another factor which could have affected the outcome of selection is the electrophoretically cryptic thermostability variant ADH-FCh.D., which is allelic to the two common electrophoretic variants (WILKS et al. 1980; GIBSON, WILKS and CHAMBERS 1981). The allele FCh.D. has been found in collections from the Mudgee region at frequencies around 0.05 ± 0.01 (range 0.00-0.20) and it could have occurred at substantial frequencies (i.e. 0.25 or greater) in a small number of lines studied, contributing to within- and between-line variation in selection coefficients and possibly biasing the overall coefficients estimated.

The strong chance of general genetic background effects, as well as specific sources of heterogeneity such as In(2L)t and FCh.D., preclude positive identification of the site of action of the selection observed in the chromosome region marked by Adh. However, if the Adh locus was the site for the selection, the selection detected is consistent with the general observation that F is at much higher frequencies than S in laboratory stocks or populations (e.g. VAN DELDEN, BOEREMA and KAMPING 1978; CAVENER and CLEGG 1978; GIBSON et al. 1979; OAKESHOTT 1979). The computed selection coefficients predict that F would eventually become fixed in the experimental populations, which is not in accord with the persistence of the polymorphism in the wild populations from which the experimental lines were extracted (see MATERIALS AND METHODS). This would suggest differences in the selective regimes imposed by the wild and laboratory environments.

The experimental design and analysis we used has the power to detect selection coefficients as small as those which might be expected to operate on enzyme polymorphisms in unstressful environments. Our conclusions have also been robust to the assumption of effective population size, although this might not be so in other experiments. Finally, the analysis is potentially versatile for a variety of hypotheses, depending on the experimental design; our design was suitable for six hypotheses but still more complex and informative designs are obviously possible.

We thank Prof. J. M. THODAY, Dr. M. A. ADENA and Dr. N. G. MARTIN for valuable discussions; Mrs. W. EDDEY, Mrs. V. PARTRIDGE, Mr. D. A. WILLCOCKS and Miss A. V. WILKS for technical help, and Miss Y. PITTELKOW for assistance with the computing.
LITERATURE CITED

AYALA, F. J. and W. W. ANDERSON, 1973 Evidence of natural selection in molecular evolution. Nature 241: 274-276.
BERGER, E. M., 1971 A temporal survey of allelic variation in natural and laboratory populations of Drosophila melanogaster. Genetics 67: 121-136.
BIJLSMA, R. and W. VAN DELDEN, 1977 Polymorphism at the G6PD and 6PGD loci in Drosophila melanogaster. I. Evidence for selection in experimental populations. Genet. Res. 30: 221-236.
CAVENER, D. R. and M. T. CLEGG, 1981 Multigenic response to ethanol in Drosophila melanogaster. Evolution 35: 1-10.
CARFAGNA, M., L. LUCCI, L. GAUDIO, G. PONTECORVO and R. RUBINO, 1980 Adaptive value of PGM polymorphism in laboratory populations of Drosophila melanogaster. Genet. Res. 36: 265-276.
FONTDEVILA, A., J. MENDEZ, F. J. AYALA and J. MCDONALD, 1975 Maintenance of allozyme polymorphisms in experimental populations of Drosophila. Nature 255: 149-151.
GIBSON, J. B., 1970 Enzyme flexibility in Drosophila melanogaster. Nature 227: 959-960.
GIBSON, J. B., N. LEWIS, M. A. ADENA and S. R. WILSON, 1979 Selection for ethanol tolerance in two populations of Drosophila melanogaster segregating ADH allozymes. Aust. J. Biol. Sci. 32: 387-398.
GIBSON, J. B., A. V. WILKS and G. K. CHAMBERS, 1981 Population variation in functional properties of alcohol dehydrogenase in Drosophila melanogaster. In: Genetic Studies of Drosophila Populations. Edited by J. B. GIBSON and J. G. OAKESHOTT, The Australian National University, Canberra.
GRELL, E. H., K. B. JACOBSON and J. B. MURPHY, 1965 Alcohol dehydrogenase in Drosophila melanogaster: isozymes and genetic variants. Science 149: 80-82.
INOUE, Y., 1979 The fate of polymorphic inversions of Drosophila melanogaster transferred to laboratory conditions. Jap. J. Genet. 54: 83-96.
JONES, J. S. and T. YAMAZAKI, 1974 Genetic background and the fitness of allozymes. Genetics 78: 1185-1189.
KNIBB, W. R., J. G. OAKESHOTT and J. B. GIBSON, 1981 Chromosome inversion polymorphisms in Drosophila melanogaster. I. Latitudinal clines and associations between inversions in Australasian populations. Genetics 98: 833-847.
LEWIS, N. and J. B. GIBSON, 1978 Enzyme protein amount variation in natural populations. Biochem. Genet. 16: 159-170.
MACINTYRE, R. J. and T. R. F. WRIGHT, 1966 Responses of Esterase-6 alleles of Drosophila melanogaster and D. simulans to selection in experimental populations. Genetics 53: 371-387.
MORGAN, P., 1976 Frequency dependent selection at two enzyme loci in Drosophila melanogaster. Nature 263: 765-767.
OAKESHOTT, J. G., 1979 Selection affecting enzyme polymorphisms in laboratory populations of Drosophila melanogaster. Oecologia 43: 341-354.
SCHAFFER, H. E., D. YARDLEY and W. W. ANDERSON, 1977 Drift or selection: a statistical test of gene frequency variation over generations. Genetics 87: 371-379.
VOELKER, R. A., C. C. COCKERHAM, F. M. JOHNSON, H. E. SCHAFFER, T. MUKAI and L. E. METTLER, 1978 Inversions fail to account for allozyme clines. Genetics 88: 515-527.
WILLS, C., J. PHELPS and R. FERGUSON, 1975 Further evidence for selective differences between isoalleles in Drosophila. Genetics 79: 127-141.
WILKS, A. V., J. B. GIBSON, J. G. OAKESHOTT and G. K. CHAMBERS, 1980 An electrophoretically cryptic alcohol dehydrogenase variant in Drosophila melanogaster. II. Post electrophoresis heat-treatment screening of natural populations. Aust. J. Biol. Sci. 33: 575-585.
WILSON, S. R., 1980 Analyzing gene frequency data when the effective population size is finite. Genetics 95: 489-502.
VAN DELDEN, W., A. C. BOEREMA and A. KAMPING, 1978 The alcohol dehydrogenase polymorphism in populations of Drosophila melanogaster. I. Selection in different environments. Genetics 90: 161-191.
YARBROUGH, K. and K. I. KOJIMA, 1967 The mode of selection at the polymorphic Esterase-6 locus in cage populations of Drosophila melanogaster. Genetics 57: 677-686.
YOSHIMARU, H. and T. MUKAI, 1979 Lack of experimental evidence for frequency-dependent selection at the alcohol dehydrogenase locus in Drosophila melanogaster. Proc. Nat. Acad. Sci. U.S. 76: 876-878.

Corresponding editor: M. NEI
APPENDIX
The uncertainty about the degrees of freedom in some of the statistical tests used in this article arises largely from the loss of an unknown number of df due to the correlation between the successive gene frequency scores within each line. This problem is compounded by the loss of g = 20 data in 6 lines and of both g = 12 and g = 20 data from another one. In total there are 154 gene frequency scores from the 54 lines (excluding g = 0 data), comprising:

47 lines each contributing 2 + (≤ 1) df
6 lines each contributing 1 + (≤ 1) df
1 line contributing 1 df,

giving a total of 101 + (≤ 53) df. Eighteen df are used by the interaction model in term (iv) of Table 2, which is the most complex of the models in the first four terms. Of the remaining 83 + (≤ 53) df, 0 + (≤ 47) are attached to the within-line term (vi) and the rest, 83 + (≤ 6), are attached to term (v) for the residual between-line variation.

Apart from the problem with df, numerical problems were also occasionally encountered in application of the minimization procedure to estimating $s_{icl}$ and $t_{icl}$ and the corresponding deviance in individual lines. These problems arose because the procedure only determined local minima; such irregularities could be detected by running the routine several times, varying the starting position for the search for the minima as well as the value of N, the particular installation (runs being done on a UNIVAC 1100 and DEC 10/20) and whether single or double precision was used. Also, crude contour plots [evaluating the deviance at grid points on the (t, s) plane] were done to examine how deviance values changed with changes in t and s (see e.g. Figure 3). For individual lines in good agreement with the hypothesis of constant t and s values over generations, the minimization routine generally gave low deviance values and the contour plots indicated a (t, s) region supporting the hypothesis which was approximately wedge-shaped, fanning out from a curvilinear ridge line.
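The df bookkeeping earlier in this appendix can be checked with simple arithmetic. The sketch below reads each entry as a firm count plus an uncertain allowance of at most the second number; that reading is an interpretation of the notation, adopted because it makes the totals sum to the 154 scores, and the variable names are hypothetical.

```python
# Each entry: (number of lines, scores per line, firm df per line, possible extra df per line)
groups = [(47, 3, 2, 1), (6, 2, 1, 1), (1, 1, 1, 0)]

total_scores = sum(n * scores for n, scores, _, _ in groups)
firm_df = sum(n * firm for n, _, firm, _ in groups)
extra_df = sum(n * extra for n, _, _, extra in groups)

interaction_df = 18                      # used by the model in term (iv)
remaining_firm = firm_df - interaction_df

within_line = (0, 47)                    # term (vi): firm, possible extra
between_line = (remaining_firm - within_line[0], extra_df - within_line[1])  # term (v)

print("scores:", total_scores)
print("total df:", firm_df, "+ up to", extra_df)
print("between-line df:", between_line[0], "+ up to", between_line[1])
```

On this reading the upper limit of 47 for the within-line term is the df assumed in the chi-square test quoted in RESULTS.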
Note however line (0.50 F(ii), C, 1), with observed values (0.50, 0.53, 0.41), for which the minimization routine always gave an error exit indicating convergence difficulties but still yielded reasonable deviance values (i.e. 3.0). In this case the contour plots indicated ridging along the line s = 0, but difficulties near the boundary t = -1.0. A second irregularity is exemplified by line (0.25 F, L, 1) with observation vector (0.13, 0.21, 0.67). On the UNIVAC, the minimization routine gave an error message indicating that convergence requirements had not been met at the point t = -0.53, s = 2.27 with expected values