Experimental Designs for Estimating Transient Responses to Management Disturbances

Carl J. Walters and Jeremy S. Collie
Resource Ecology, University of British Columbia, Vancouver, B.C. V6T 1W5

and Timothy Webb
Environmental and Social Systems Analysts Ltd., 905-808 Nelson Street, Vancouver, B.C. V6Z 2H2
Walters, C. J., J. S. Collie, and T. Webb. 1988. Experimental designs for estimating transient responses to management disturbances. Can. J. Fish. Aquat. Sci. 45: 530-538.

Simple experimental designs involving treated and control areas, with all treatments initiated at the same time, should not be used to assess transient responses to management actions or environmental disturbances. Such designs will not properly control for "time-treatment" interactions, involving differential responses of treated areas to nonrandom trends in the experimental environment. For example, survival trends for hatchery salmon stocks cannot be simply compared with survival trends in wild stocks, since the hatchery stocks may be more susceptible to changes in environmental factors such as ocean temperature. To control for such time-treatment interactions, it is suggested to use a "staircase" experimental design in which treatment is initiated at different times on the treated areas. Computation of the average transient and interaction response parameters for such designs, while correcting for temporal trends and inherent differences among areas, can be done using general linear models. The optimum design configuration (number of treated versus control areas, number of pretreatment control times, etc.) involves spreading the treatment starts widely over time and using relatively few control areas.
Received March 2, 1987. Accepted November 2, 1987.
Many fisheries and other renewable resource management actions result in transient responses that may be confused with the effects of other management activities and/or natural "environmental" changes that have nonrandom temporal structure. Often it is possible to test actions on local "pilot" areas or spatial replicates, through management experiments that involve pretreatment monitoring and monitoring of untreated control areas. However, such experiments are normally quite expensive, because of both field logistics and costs of waiting before full-scale implementation; thus it is important that experimental designs be developed carefully. The purpose of this paper is to point out some basic design requirements for distinguishing between transient effects due to deliberate treatment versus transient effects due to other causes. Simple experimental designs such as pre- versus posttreatment monitoring and monitoring both treated and control areas, as discussed, for example, by Green (1979) and Millard (1986), do not allow the effects to be clearly separated, and we recommend the use of a design that we call the "staircase."

The development of the staircase design was motivated by difficulties that have been encountered in interpretation of results from one of the largest fisheries management experiments in Canadian history, the Pacific Salmonid Enhancement Program. The first phase of this program (1977-86) was to test a variety of enhancement technologies and options, while the second phase was to implement the best of these on a wider scale. A very visible element of phase I was hatchery production of chinook (Oncorhynchus tshawytscha) and coho salmon (O. kisutch), with success being measured through an extensive coded-wire tagging (CWT) program. The tagging data from some hatcheries, and also CWT data from some American hatcheries, have shown substantial declines in survival rates, while wild ("control") stocks have apparently not shown comparable declines (Walters and Riddell 1986). Thus an obvious hypothesis is that the hatchery declines represent a transient movement from initial high success to a more moderate sustainable production pattern following adjustment of diseases, predation, etc. However, the declines are correlated with a warming trend in ocean temperature (T. Perry, Department of Fisheries and Oceans, Vancouver, B.C., pers. comm.), and this correlation is used as an argument to continue hatchery production because the natural trend will reverse itself. In response to questions about why the wild stocks have not also responded to the ocean temperature change, proponents of the temperature hypothesis simply argue that there is a "time-treatment" interaction: the treated (hatchery) populations may be more sensitive to environmental changes than are wild stocks.

Time-treatment interaction effects cannot be measured with simple experimental designs that involve simultaneous initiation of treatment on all treated replicates, yet such effects can be critical in policy formulation: people with personal conviction or vested interest in a treatment that does not seem to be working as planned can always argue that the treated replicates are undergoing a temporary setback that will soon be reversed. The only direct, empirical way to convincingly counter such arguments is to demonstrate that treatments initiated later in time result in the same transient pattern as for earlier treatments. The essence of the staircase design is to provide such a staggered schedule of treatment initiation.

In the hatchery survival example mentioned above, it is perhaps obvious that treated hatchery stocks may respond differently over time to possible environmental changes than do natural populations.
However, the problem of time-treatment interaction is much more general, and the following paragraphs present three more examples in which the failure to use a staircase design, with different treatment starting times, would invite misinterpretation or criticism of the experimental results.

Suppose an experiment is conducted to measure the effect of fertilization on algal growth in streams. Treatment and control stream reaches are defined, and algal growth is measured as a transient biomass response on artificial substrates. Treatment is initiated in midsummer, and a modest increase in growth is found in treated reaches relative to the control(s). An obvious objection to this experiment is that the midsummer environment for algal growth may be less favorable than spring or fall environments, so the experiment underestimates long-term (annual) typical responses to fertilization; such a time-treatment interaction is in fact likely, as demonstrated by the experiments of Perrin et al. (1987).

A researcher on high mountain lakes attempts to improve growth of "stunted" brook trout (Salvelinus fontinalis) by reducing densities in several lakes over one summer, using gill nets. Improved growth rates relative to control lakes are seen for the next several years. A critic of this experiment could argue that the improved growth rates were seen for much longer than would "normally" be expected, since abnormally heavy snowfalls and short growing seasons (the reader may substitute his own favorite environmental factors at this point) prevented the sharp recruitment response that would normally occur in response to reduced density, bringing biomasses more quickly back to normal and reducing growth rates sooner.

Suppose fishing rates are deliberately reduced on several salmon stocks to test whether recruitment (smolt production) can be increased by better "seeding" of natal streams. Suppose that no increase in smolt production is seen, relative to heavily fished control stocks. The obvious conclusion would be that spawning stock is not a limiting factor for production. In this example, a believer in the importance of stock-recruitment relationships might argue that the experiment did not provide representative results, since lower than normal stream flows (or pick another possible environmental limiting factor) resulted in unusually limited space for juvenile rearing, thus preventing the increased production response that would have been seen under more "typical" flow conditions.

A classical statistical approach to time-treatment interaction is to use so-called "cross-over" or "changeover" designs in which treatment is reversed or discontinued on each replicate after a time, while earlier controls receive treatment. (For a good discussion of such designs, see Chapter 8, especially p. 203ff, in Gill 1978.) We shall assume in the following discussion that (1) the treatment effects are irreversible, strongly persistent, or might be exhibited only after an unknown and perhaps long time delay after treatment is initiated and (2) there is direct management interest in estimating the full transient pattern of response. Under these assumptions, a crossover design would be either impractical or undesirable.

Another possible approach would be to assume a specific dynamic model for the transient response, and then seek a design that would provide good parameter estimates for this model in the presence of correlated model errors (the time-treatment interaction effects). This is an unnecessarily risky approach, since it may result in a very bad design if the correct model structure is not known in advance.
We take the more conservative view that the response at each point in time should be estimated separately, and the estimates then examined after the experiment for agreement with various dynamic models.

In the following sections, we develop alternative experimental design models for estimating transient effects and time-treatment interactions, comment on the efficient computation of the design model parameter estimates and covariances (not a trivial task, since the designs will generally be unbalanced), and suggest tactics for finding an optimum combination of design parameters (number of replicates, duration of monitoring, etc.). The development makes extensive use of the theory of linear statistical models; for introductions to this theory, we recommend the texts by Searle (1966, 1971) and Graybill (1961).
Design Models for Transient Responses

Suppose that a set of $n$ replicate units (areas, streams, etc.) have been identified and a response index $y_{it}$ is to be measured on each of them ($i = 1,\ldots,n$) at a series of $t = 1,\ldots,T$ measurement times (Fig. 1). A subset $i = 1,\ldots,m$ of the units are to be treated after pretreatment times $t_1,\ldots,t_m$, and these treatment units are chosen at random from the set of $n$ experimental units. The remaining $n - m$ units serve as untreated experimental controls for the entire period $T$. A general linear model of the observed responses is

(1) $y_{it} = \mu_i + \tau_t + R_{it} + e_{it}$

where $\mu_i$ is the mean response that unit $i$ would show in the absence of treatment ($\mu_i$ measures locally unique, time-invariant features of unit $i$), $\tau_t$ is a time effect shared by all units independent of whether they are treated, $R_{it}$ is the effect of treatment on unit $i$ in year $t$ ($R_{it} = 0$ for $t \le t_i$ or $i > m$), and $e_{it}$ is a locally unique response, independent of treatment, of unit $i$ at time $t$. If the experimental units have been chosen with care and are really "good replicates," all nonrandom time trends should be shared ($\tau_t$ effect) and the $e_{it}$ should be independent random variables with equal variances. According to the above definitions, the $\mu_i$ and $e_{it}$ are "random effects" by design, while the $\tau_t$ and $R_{it}$ are "fixed effects", which may display arbitrary patterns.

FIG. 1. To estimate interaction effects that are assumed independent of time since treatment, when it is feasible to monitor only three experimental units, it is necessary to use a "staircase" design where treatment is started in two successive times. Treated unit-time combinations are indicated by shading.
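The shaded layout described for Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `staircase_schedule`, the convention that treated units are listed first, and the choice of $T = 4$ measurement times are hypothetical and are not taken from the paper.

```python
def staircase_schedule(n, T, starts):
    """Return an n x T grid of 0/1 flags (hypothetical helper).

    n      : total number of units; treated units come first (assumed convention)
    T      : number of measurement times, indexed 1..T
    starts : pretreatment lengths t_i for the treated units
    A cell is 1 when unit i is treated at time t, i.e. when t > t_i.
    """
    grid = []
    for i in range(n):
        # Control units are never treated, so give them t_i = T.
        t_i = starts[i] if i < len(starts) else T
        grid.append([1 if t > t_i else 0 for t in range(1, T + 1)])
    return grid


# Three units, two treated in successive times (t_1 = 1, t_2 = 2), one control.
# T = 4 is an arbitrary choice made for this illustration.
for row in staircase_schedule(3, 4, [1, 2]):
    print("".join("#" if cell else "." for cell in row))
```

Each printed row is one unit and each column one measurement time; `#` marks a treated unit-time combination, so the "stairs" appear as the left edge of the `#` region stepping one column to the right per unit.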
Estimation of Local Responses

Model (1) is not of full rank even if $R_{it} = 0$, and it is computationally convenient to work with the reduced model

(2) $y_{it} = \mu^*_i + \tau^*_t + R_{it} + e_{it}$

where the $\mu^*_i = \mu_i + \tau_1$ and $\tau^*_t = \tau_t - \tau_1$ (for $t > 1$) are estimable functions of the original parameters. Model (2) is of full rank provided all the $t_i$, $i = 1,\ldots,m$, are greater than zero (at least one control observation for each unit), $t_m$ for the last unit treated is less than $T$, and $m < n$ (at least one control unit). However, the locally unique time-unit responses $R_{it}$ cannot be distinguished from the post-treatment residuals $e_{it}$ (least squares estimation will choose the $R_{it}$ so that $e_{it} = 0$).

The fundamental assumption of additivity of effects in models (1) and (2) may seem unnecessarily restrictive. It may be more realistic in some cases to assume multiplicative effects involving relative rather than absolute deviations from the mean response. This possibility is easily dealt with by applying the model with the logarithms of the original data. Most statistics texts discuss other linearizing transformations.

While model (2) can provide a general picture of response patterns from an experiment, correcting for basic differences among units and for shared time effects, the $R_{it}$ estimates do not necessarily provide much information about average responses for various times after treatment, or about time-treatment interactions. In particular, more restrictive design criteria are necessary to estimate the interactions. The restrictions depend on how complicated the interaction effects are assumed to be. Here, we consider two cases: (1) interaction effects independent of time since treatment and (2) interaction effects varying with time since treatment.

Interaction Effects Independent of Time Since Treatment

In the simplest case, the $R_{it}$ are assumed to consist of a sum of two effects, $R_k$ and $I_t$, where $R_k$ is the mean response at the $k$th time after treatment begins ($k = t - t_i > 0$) and $I_t$ is a time-treatment interaction effect that applies only to treated units ($t > t_i$) but does not depend on $k$. As for the $\tau_t$ in model (1), the $I_t$ can only be measured relative to the $R_k$, and it is computationally convenient to estimate the reduced parameters $R^*_k = R_k + I_{t_1+1}$ (where $t_1$ is the number of control times for the first unit treated) and $I^*_t = I_t - I_{t_1+1}$ for $t > t_1 + 1$. These definitions lead to the linear model

(3) $y_{it} = \mu^*_i + \tau^*_t + \delta_{it} R^*_{t-t_i} + \delta'_{it} I^*_t + e_{it}$

where $\delta_{it} = 0$ for $t \le t_i$ and 1 for $t > t_i$, and $\delta'_{it} = 0$ for $t \le t_1 + 1$ and 1 for $t > t_1 + 1$ (on treated unit-times only). Note here that the $e_{it}$ for treated time-unit observations now contain (nonestimable) components of variation $R_{it} - R_k - I_t$ as well as local effects that would have occurred even in the absence of treatment. We shall assume that these components of variation are small enough to ignore, so that all observations $y_{it}$ will be assumed to have $e_{it}$ with the same variance. (This assumption can be relaxed by using two-stage regression procedures, but the extra computational effort is unlikely to be worthwhile unless treatment causes a gross and obvious increase in variability.)

Under what conditions can $R^*_k$ and $I^*_t$ be estimated uniquely by least squares, i.e. when is model (3) of full rank? One obvious condition is $m > 1$; if only one unit is treated, then only the sums $R^*_k + I^*_t$ for that unit are observed. For $m = 2$ (two treated units), the conditions are (1) $t_1 > 0$, (2) $t_2 = t_1 + 1$, (3) $T > t_2$, and (4) $n > m$. That is, for $m = 2$ there must be at least one control year for the first unit treated, and the second unit must be treated at the very next time. The second condition is what led us to use the term "staircase"; one must step evenly from control to a single treated unit to a second treated unit (Fig. 1). While we discovered this condition by numerically testing the rank of various design matrix configurations, the basic reason for it can be seen algebraically. Note that the $R_{it}$ are estimable under conditions (1)-(4), and suppose that $t_2 = t_1 + 2$. Then the following response combinations will be observed over time:

$t = t_1 + 1$: $R^*_1$
$t = t_1 + 2$: $R^*_2 + I^*_{t_1+2}$
$t = t_1 + 3$: $R^*_3 + I^*_{t_1+3}$, $R^*_1 + I^*_{t_1+3}$
$t = t_1 + 4$: $R^*_4 + I^*_{t_1+4}$, $R^*_2 + I^*_{t_1+4}$

Note here that $I^*_{t_1+2}$ can be chosen arbitrarily, and the resulting estimate of $R^*_2$ will in turn determine $I^*_{t_1+4}$, $R^*_4$, etc., for every even year. At each time, there is always one less new observation of $R_{it}$ than there are unknown parameters at that time. In contrast, when $t_2 = t_1 + 1$ the following $R_{it}$ combinations will be measured over time:

$t = t_1 + 1$: $R^*_1$
$t = t_1 + 2$: $R^*_2 + I^*_{t_1+2}$, $R^*_1 + I^*_{t_1+2}$
$t = t_1 + 3$: $R^*_3 + I^*_{t_1+3}$, $R^*_2 + I^*_{t_1+3}$

This system has a unique solution for all $t > t_1 + 1$ (use $R^*_1$ to get $I^*_{t_1+2}$ to get $R^*_2$ ...), although there is no replication to permit separation of the $e_{it}$ effects from the $R_{it}$.

For $m > 2$, more complex control period patterns $t_1,\ldots,t_m$ can be used. It is still of course necessary that $n > m$ and $T \ge t_m + 1$. Provided $t_2 = t_1 + 1$ and $t_3 = t_2 + 1$ (i.e. an even staircase of starting times), then it is not necessary that $t_1 > 0$, i.e. treatment can be started on the first replicate without any control period for it, since other estimates of $R^*_k$ allow separation of these effects from $\mu^*_1$. Provided $t_1 > 0$ (control period for all treated units), then it is not necessary that the times be one step apart, but there must be both even and odd starting times. For example, the sequence [$t_1 = 1$, $t_2 = 3$, $t_3 = 5$] does not lead to unique estimates (if all starting times are odd, at least one $R^*_k$ for an even $k$ can be chosen arbitrarily), while the sequence [$t_1 = 1$, $t_2 = 3$, $t_3 = 6$] does lead to unique estimates. More vividly, for $m > 2$ the stairs must either be one step wide or else must be uneven in width (as in the 1, 3, 6 example, Fig. 2).

The starting time requirements for $m > 2$ remain valid when more than one replicate unit is treated at each starting time; replication does not eliminate the need to have a staircase of starting times. Replication permits analysis of variation among the $R_{it}$ that represent the same $R^*_k$ and $I^*_t$ combinations, and this variation can be compared with $e_{it}$ variation from untreated time-unit measurements to see whether treatment causes increased "random" variation.

FIG. 2. Interaction effects that are independent of time since treatment can be estimated when more than three experimental units are used, provided treatment starts in successive times or else at uneven times as in this example where $t_1 = 1$, $t_2 = 3$, $t_3 = 6$, $T = 10$, and $n = 4$. Treated unit-time combinations are indicated by shading.

Interaction Effects Dependent on Time Since Treatment

It may be unrealistic to assume that the $I_t$ time-treatment interaction effects are independent of $k$, the time since treatment. For example, survival rates of hatchery salmon may be more sensitive to ocean temperature variation for hatcheries that have been in operation longer ($k$ larger), due to accumulation of temperature-sensitive disease organisms or selection for less viable hatchery brood stock. To model this, the $R_{it}$ effects can be expressed as the sum of $R_k$ as before, plus a $k$-specific interaction $I_{kt}$. The $I_{kt}$ can only be measured relative to $R_k$, and a convenient reparameterization is to set $R^*_k = R_k + I_{k,t_k}$ and $I^*_{kt} = I_{kt} - I_{k,t_k}$ for all $k$ and $t > t_k$, where $t_k$ is the first time that the $k$th response is present. This results in the linear model

(4) $y_{it} = \mu^*_i + \tau^*_t + \delta_{it} R^*_{t-t_i} + \delta'_{it} I^*_{t-t_i,\,t} + e_{it}$

where the $\delta_{it}$ and $\delta'_{it}$ are 0 or 1 as for model (3). Note here that observations containing $I^*_{kt}$ are made only for a progressively shorter set of times as $k$ increases: $I^*_{1t}$ is present for $t > t_1 + 1$, $I^*_{2t}$ for $t > t_1 + 2$, etc. Also, the $I^*_{kt}$ are present only at those times for which at least one unit was treated $k$ times earlier; to keep measuring new values containing $I^*_{kt}$, it is necessary to keep adding new treated units at each time.

The basic design requirements for model (4) are $t_1 > 0$ (at least one control year for all units), $m \ge 2$ treated units, $n > m$ (at least one untreated control), $T > t_m + 1$, and the even staircase pattern $t_i = t_{i-1} + 1$ for $i = 2,\ldots,m$. Under these requirements, $m - 1$ interaction terms $I^*_{kt}$ are estimable for $k < T - t_m$ and progressively fewer terms for larger $k$ until none are estimable for $k = T - t_1$. In order that all the initial response ($k = 1$) interaction terms $I^*_{1t}$ for $t = t_1 + 2$ to $t = T$ be estimable, treatment must be initiated on at least one new unit every time from $t_1 + 1$ through $T$; in that case, all $I^*_{kt}$ for $t = t_1 + k + 1$ to $t = T$ and $k < T - t_1$ will also be estimable. This "full" design case requires $m \ge T - t_1$, which may be a very formidable requirement if the time horizon $T$ of interest is long.

Suppose that there is a fixed duration $K$ beyond which the transient responses $R_k$ are not of interest. Then a modified staircase design can be used, in which each treatment unit is monitored for a control period $t_c$ and treatment period $K$, and then dropped from the design; again at least one new treatment unit must be added at each time (Fig. 3). This design would provide a shifting window of estimates for the time-treatment interaction, with the same number of estimates for each $k$ but for progressively later time intervals.

Various reduced designs (e.g. intervals between starting times greater than one or variable, monitoring durations variable) could be used to provide a "patchwork" of $I^*_{kt}$ estimates. However, information would be lost about at least some of the $R^*_k$ (those confounded with $I^*_{kt}$ that represent "holes" in the patchwork) and all the parameter estimates would be very difficult to correctly interpret without imposing further unrealistic assumptions (such as randomness in the $I_{kt}$ over time).

FIG. 3. Modified staircase design with each treated unit monitored for a pretreatment period $t_c$ and treatment period $K$, and then dropped.

Computation of the Response Estimates

The design models (2)-(4) will generally involve unbalanced numbers of observations of the various effects, so that minimum variance parameter estimates cannot be obtained from simple averaging of the data as in balanced analysis of variance designs. Instead it is necessary to use the general linear model framework, which involves solving a fairly large system of linear equations. Package computer programs are widely available for the general linear model, but these may not make efficient use of the problem structure as outlined above. Also, in searching for an optimum or at least acceptable design pattern ($m$, $n$, $T$, $t_i$) for a particular problem, it is useful to be able to quickly estimate the relative precision (variances of parameter estimates) to be expected from various choices. Some computational shortcuts for obtaining parameter estimates and their variances are given in the Appendix.
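The rank conditions for model (3) and the idea of comparing relative precision across candidate designs can be explored numerically. The following NumPy sketch is a brute-force illustration, not the Appendix shortcuts: it builds the full design matrix and inverts $X'X$ directly. The function names, the column ordering, the convention that treated units are listed first with the earliest-treated unit in position 1, and the reading of $\delta'_{it}$ as "treated and $t > t_1 + 1$" are all assumptions made for this example.

```python
import numpy as np


def staircase_design_matrix(n, T, starts):
    """Design matrix for model (3) (illustrative layout, not from the paper).

    n      : total number of units; treated units first, then controls
    T      : number of measurement times, indexed 1..T
    starts : pretreatment lengths t_i of treated units, earliest (t_1) first
    Columns: mu*_1..mu*_n | tau*_2..tau*_T | R*_1..R*_K | I*_{t1+2}..I*_T
    """
    m, t1 = len(starts), starts[0]
    K = T - t1                          # longest transient that is observed
    n_I = max(T - t1 - 1, 0)            # I*_t is defined only for t > t1 + 1
    off_tau, off_R = n, n + T - 1
    off_I = off_R + K
    X = np.zeros((n * T, off_I + n_I))
    for i in range(n):
        for t in range(1, T + 1):
            row = i * T + (t - 1)
            X[row, i] = 1.0                              # unit effect mu*_i
            if t > 1:
                X[row, off_tau + t - 2] = 1.0            # time effect tau*_t
            if i < m and t > starts[i]:                  # treated unit-time
                X[row, off_R + (t - starts[i]) - 1] = 1.0    # R*_{t - t_i}
                if t > t1 + 1:
                    X[row, off_I + (t - t1 - 2)] = 1.0       # I*_t
    return X


def is_full_rank(X):
    """True when every parameter of the design has a unique estimate."""
    return np.linalg.matrix_rank(X) == X.shape[1]


def response_relative_variances(n, T, starts):
    """Diagonal of (X'X)^-1 for the R*_k columns, in units of sigma^2."""
    X = staircase_design_matrix(n, T, starts)
    if not is_full_rank(X):
        raise ValueError("design is not of full rank")
    cov = np.linalg.inv(X.T @ X)
    first_R = n + T - 1
    return np.diag(cov)[first_R:first_R + (T - starts[0])]


if __name__ == "__main__":
    for n, T, starts in [(3, 4, [1, 2]), (3, 5, [1, 3]),
                         (4, 10, [1, 3, 5]), (4, 10, [1, 3, 6])]:
        X = staircase_design_matrix(n, T, starts)
        print(starts, "full rank:", is_full_rank(X))
    print(response_relative_variances(4, 10, [1, 3, 6]))
```

Under these assumptions the even staircase (1, 2) and the uneven staircase (1, 3, 6) come out as full rank, while (1, 3) and the all-odd sequence (1, 3, 5) each fall one short of full rank, in line with the conditions stated in the text. For large designs, direct inversion of $X'X$ is wasteful, which is the motivation for the shortcuts the authors defer to the Appendix.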
The staircase designs suggested above for models (3) and (4) do not produce parameter estimates with equal variances, due to imbalance in the number of times each effect is observed and in the combinations of effects that are observed together. The general pattern of estimation precision is as follows: (1) the $\mu^*_i$ are least precise for the units treated earliest, and most precise for the untreated (control) units; (2) the earliest $\tau^*_t$ are most precise, with the precision decreasing ($\sigma^2$ increasing) each time another unit is treated; (3) as for the $\tau^*_t$, the earliest (smallest $k$, $t$) $R^*_k$ and $I^*_t$ or $I^*_{kt}$ are estimated most precisely, and the variances increase progressively with time since treatment ($k$). A somewhat more complex pattern is seen if the parameters are transformed to $R_k$ and $I_t$ or $I_{kt}$ by using the constraint $\sum_t I_t = 0$ (or $\sum_t I_{kt} = 0$) so that $I_{t_1+1}$ can be estimated as $-\sum_t I^*_t / T$ and removed from $R^*_k = R_k + I_{t_1+1}$. Under this transformation, both the early and late $R_k$ and $I_t$ may have higher variance, with the intermediate transient responses being estimated more precisely.
Measures of Design Performance

Since the parameter estimates have varying precision over time, it is difficult to define a single, overall performance measure for comparing designs with different numbers of replicates, pretreatment periods, etc. (Box and Draper 1975; Fedorov 1972; Kiefer 1975). Presumably the central concern is with the $R^*_k$ (or $R_k$) estimates, so any performance measure should be related to the variances and covariances of these estimates (and might ignore uncertainty in the $\mu$, $\tau$, and $I$ estimates except in so far as this uncertainty impacts on the estimation of $R_k$). Three simple performance measures are (1) $\sigma^2_1$, the variance of the first response in the transient; (2) $\sigma^2_{K/2}$, the variance of a response midway in the maximum $K$-year transient seen in any unit; and (3) $S^2 = \frac{1}{K}\sum_k \sigma^2_k$, the average over all times $k$ since treatment of the $\sigma^2_k$ (a crude measure of precision in estimating the overall average response). These measures can be calculated quickly for any proposed design, given an estimate of $\sigma^2$ (the variance of the $e_{it}$) and the first $K$ diagonal elements of the $H = (E - C'S^{-1}C)^{-1}$ submatrix of $(X'X)^{-1}$ (see Appendix); $\sigma^2_1$ is $\sigma^2 H_{11}$, $\sigma^2_{K/2}$ is $\sigma^2 H_{K/2,K/2}$, and $S^2$ is $\sigma^2/K$ times the sum of the first $K$ diagonal elements of $H$. A more elaborate performance measure is the determinant of the $H$ submatrix (see Appendix). This determinant or