Optimal Designs with Minimal Aliasing

Bradley Jones SAS Institute Cary, NC 27513 ([email protected])

Christopher J. Nachtsheim Carlson School of Management University of Minnesota Minneapolis, MN 55455 ([email protected])

Abstract For some experimenters, a disadvantage of the standard optimal design approach is that it does not consider explicitly the aliasing of specified model terms with terms that are potentially important but are not included in the model. For example, when constructing an optimal design for a first-order model, aliasing of main effects and interactions is not considered. This can lead to designs that are optimal for estimation of the primary effects of interest, yet have undesirable aliasing structures. Using a Bayesian formulation of the design problem, we construct exact designs that minimize expected squared bias subject to constraints on design efficiency. We demonstrate use of the method for the construction of screening and response surface designs.

KEY WORDS: Alias matrix, Bayesian design, Constrained design, D-optimality, I-optimality, Minimum bias design, regular designs, non-regular designs, Box-Behnken designs

1 Introduction

In recent years, the use of optimal designs in industrial experimentation has grown rapidly, due, in part, to the fact that the methodology is now being introduced in standard DOE textbooks (see, e.g., Montgomery, 2008; Kutner et al., 2005) and also because facilities for constructing optimal designs have become generally available. A common concern that many experimenters share when using optimal designs has to do with the potential over-reliance on a single model. Model robust designs (see, e.g., Läuter, 1974; Cook and Nachtsheim, 1982; Li and Nachtsheim, 2000), model discriminating designs (see, e.g., Box and Meyer, 1996; Bingham and Chipman, 2007; Jones et al., 2007), minimum bias designs (see, e.g., Draper and Lawrence, 1965; Karson, 1970; Karson and Spruill, 1975; Evans and Manson, 1978), and Bayesian D-optimal designs (DuMouchel and Jones, 1994; Jones et al., 2008) represent alternative approaches to reducing the dependence on a single model.

One specific criticism of the basic optimal design approach is that standard criteria do not consider in any way the aliasing of specified model terms with terms that are potentially important but are not included in the model. Consider, for example, the following common design scenario. The researcher wishes to obtain an 8-run optimal screening design for four factors, each at two levels. Using optimal design software, she specifies a screening model comprising the four main effects and an intercept term, and an optimal design for n = 8 runs is constructed using an optimal design algorithm. Any orthogonal design will be D-optimal for this design problem, and the design algorithm will find one quite reliably. The researcher might then ask: Are any of the four main effect terms aliased with two-factor interactions, and if so, in what way? For example, the D-optimal design in Table 1 was constructed using a standard optimal design software package.

Table 1: Eight-run D-optimal screening design for four factors

Run (i)   x1   x2   x3   x4      Run (i)   x1   x2   x3   x4
   1       1   -1   -1   -1         5      -1    1    1   -1
   2       1    1   -1    1         6       1   -1    1   -1
   3      -1   -1    1    1         7       1    1    1    1
   4      -1    1   -1   -1         8      -1   -1   -1    1

It is readily apparent that the defining relation for this half fraction is (1) = 124 and its resolution is three. Moreover, we observe that three of the effects of interest are directly aliased with two-factor interactions: 1 = 24, 2 = 14, and 4 = 12. For most experimenters, even though this design is D-optimal, it would not be the first choice. An alternative fraction, also D-optimal, based on the defining relation (1) = 1234 has resolution four, and for this design no main effects are confounded with two-factor interactions. The methodology to be described in this paper avoids the limitation of the standard optimal design approach by considering the alias relationships explicitly. In the current example, our methodology will produce the resolution IV design.

Following DuMouchel and Jones (1994), we will refer to the terms in our model of interest as primary model terms. Model terms that correspond to effects of secondary interest will be referred to as potential terms. In the above example, the primary model terms are $x_1$, $x_2$, $x_3$, and $x_4$, and the potential terms are the six two-factor interactions $x_1x_2$, $x_1x_3$, $x_1x_4$, $x_2x_3$, $x_2x_4$, and $x_3x_4$. In general, let $X_1$ denote the $n \times p_1$ model matrix for the $p_1$ primary terms, including the intercept column, and let $X_2$ denote the $n \times p_2$ model matrix for the $p_2$ potential terms. We will assume throughout that $X_1$ is full rank. The $i$th rows of $X_1$ and $X_2$ are denoted $f_1'(x_i)$ and $f_2'(x_i)$, where $x_i$ is the vector of factor settings for the $i$th run. It is assumed that the standard normal theory model for the response vector $Y$ is applicable. The full model, containing both primary and potential terms, is:

$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon \qquad (1)$$

The experimenter intends to estimate terms in the primary model:

$$Y = X_1\beta_1 + \varepsilon \qquad (2)$$

where $\mathrm{Var}(\varepsilon) = \sigma^2 I$. It is well known that the expected value of the least squares estimator of $\beta_1$ in (2) is:

$$E(\hat\beta_1) = \beta_1 + A\beta_2 \qquad (3)$$

where the $p_1 \times p_2$ alias matrix, $A$, is given by $(X_1'X_1)^{-1}X_1'X_2$. As a simple illustration, consider again the D-optimal design in Table 1. Letting $\beta_i$ denote the main effect of the $i$th factor, and $\beta_{ij}$ represent the interaction between factors $i$ and $j$, we have:

$$
\begin{pmatrix} E(\hat\beta_0) \\ E(\hat\beta_1) \\ E(\hat\beta_2) \\ E(\hat\beta_3) \\ E(\hat\beta_4) \end{pmatrix}
=
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{pmatrix}
+
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} \beta_{12} \\ \beta_{13} \\ \beta_{14} \\ \beta_{23} \\ \beta_{24} \\ \beta_{34} \end{pmatrix}
\qquad (4)
$$

From this expression, it is easy to see that $E(\hat\beta_1) = \beta_1 + \beta_{24}$, $E(\hat\beta_2) = \beta_2 + \beta_{14}$, and $E(\hat\beta_4) = \beta_4 + \beta_{12}$. The terms $\beta_0$ and $\beta_3$ are not confounded with two-factor interactions.

In the simplest terms, the strategy we propose here is to choose a design to minimize the sum of squares of the elements of $A$, subject to lower bound constraints on the efficiency of the design for estimation of the primary regression coefficients. We will demonstrate by example that it is often possible to realize substantial reductions in the norm of $A$, and hence the level of aliasing, with little or no loss in efficiency.

In the next section, we give the new design criterion and its Bayesian rationale, and discuss related work in the literature. In Section 3, we describe the algorithmic implementation, and, in Sections 4 and 5, we explore the use of our methodology in a variety of standard design scenarios. Conclusions and suggestions for further work are provided in Section 6.
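The alias matrix in (4) can be reproduced directly from its definition. The following is a minimal sketch, not the authors' software: it uses only the Python standard library with exact rational arithmetic, and the helper names (`primary_row`, `potential_row`, `cross`, `solve`, `alias_matrix`) are illustrative assumptions introduced here.

```python
from fractions import Fraction
from itertools import combinations

# Table 1 design (runs 1-8), columns x1..x4.
DESIGN = [
    (1, -1, -1, -1), (1, 1, -1, 1), (-1, -1, 1, 1), (-1, 1, -1, -1),
    (-1, 1, 1, -1), (1, -1, 1, -1), (1, 1, 1, 1), (-1, -1, -1, 1),
]
PAIRS = list(combinations(range(4), 2))  # (0, 1) -> x1x2, ..., (2, 3) -> x3x4


def primary_row(x):
    """f1(x): intercept followed by the four main effects."""
    return [1, *x]


def potential_row(x):
    """f2(x): the six two-factor interactions."""
    return [x[i] * x[j] for i, j in PAIRS]


def cross(rows_a, rows_b):
    """Return A'B for two matrices stored as lists of rows."""
    return [[sum(Fraction(ra[i] * rb[j]) for ra, rb in zip(rows_a, rows_b))
             for j in range(len(rows_b[0]))] for i in range(len(rows_a[0]))]


def solve(m, rhs):
    """Solve m @ sol = rhs by Gauss-Jordan elimination (m assumed nonsingular)."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def alias_matrix(design):
    """A = (X1'X1)^{-1} X1'X2 for the main-effects primary model."""
    x1 = [primary_row(x) for x in design]
    x2 = [potential_row(x) for x in design]
    return solve(cross(x1, x1), cross(x1, x2))


A = alias_matrix(DESIGN)
for label, row in zip(["b0", "b1", "b2", "b3", "b4"], A):
    aliased = [f"b{i + 1}{j + 1}" for (i, j), a in zip(PAIRS, row) if a != 0]
    print(label, "aliased with", aliased or "nothing")
```

Run as written, the loop should report that b1 is aliased with b24, b2 with b14, and b4 with b12, while b0 and b3 are aliased with nothing, in agreement with (4).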

2 Optimality Criterion and Related Work

Efforts here are most closely related to and motivated by Montepiedra and Fedorov (1997), who, building on the work of Cook and Fedorov (1995), develop methods for constructing optimal approximate designs subject to constraints. They consider two strategies:

1. Minimize a function of the variance, subject to constraints on a function of the bias
2. Minimize a function of the bias, subject to constraints on a function of the variance

As Montepiedra and Fedorov (1997) note: “Suppose there is a greater need to control precision in estimation than there is to minimize bias in estimation. Then a logical step is to find designs which minimize [bias], but at the same time ensure that [the variance] will not be too large.” This is the perspective we take in this article.

Assume that $\beta_2 \sim N(0, \sigma_{\beta_2}^2 I)$. Then from (3), the bias of $\hat\beta_1$, given $\beta_2$, is:

$$E(\hat\beta_1) - \beta_1 = A\beta_2$$

One easy summary measure is the sum of squares of the bias vector, $\mathrm{SSB} \mid \beta_2 = \beta_2'A'A\beta_2$. Taking the expectation over the prior distribution of $\beta_2$, we obtain the expected sum of squared bias components (ESSB):

$$\mathrm{ESSB} = E\{\beta_2'A'A\beta_2\} = E\{\mathrm{Trace}[A'A\beta_2\beta_2']\} = \mathrm{Trace}[A'A\,E(\beta_2\beta_2')] = \sigma_{\beta_2}^2\,\mathrm{Trace}[A'A] \qquad (5)$$

We search for designs that minimize ESSB subject to a lower-bound constraint on the D efficiency of the design for the primary model.
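As a worked instance of (5), consider the alias matrix displayed in (4) for the Table 1 design. It has exactly three nonzero entries, each equal to one, so the criterion value follows by inspection. For the resolution IV fraction with defining relation (1) = 1234, no two-factor interaction is aliased with the intercept or with a main effect, so every entry of $A$ is zero. The short calculation below is an illustration added here, not a result quoted from elsewhere.

```latex
\mathrm{Trace}(A'A) = \sum_{i}\sum_{j} A_{ij}^2 = 1^2 + 1^2 + 1^2 = 3
\quad\Longrightarrow\quad
\mathrm{ESSB} = 3\,\sigma_{\beta_2}^2
\qquad \text{(Table 1 design)}

A = 0
\quad\Longrightarrow\quad
\mathrm{ESSB} = 0
\qquad \text{(resolution IV fraction, } (1) = 1234\text{)}
```

Both designs are D-optimal for the main-effects model, so the ESSB criterion is what separates them.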

We use $d$ to denote a design, $X(d)$ to denote the model matrix corresponding to design $d$, $A(d)$ to denote the alias matrix for design $d$, and $d^*$ to denote a D-optimal design for the primary model. The D-efficiency of a design $d$ is:

$$D_e(d) = \left[\frac{|X_1(d)'X_1(d)|}{|X_1(d^*)'X_1(d^*)|}\right]^{1/p_1}$$

With this notation, the criterion employed may be written:

$$\min_{d}\ \mathrm{Trace}[A(d)'A(d)] \quad \text{subject to} \quad D_e(d) \ge l_D \qquad (6)$$

where $l_D$ denotes the experimenter’s lower bound for D-efficiency and $0 < l_D \le 1$.

The approach taken here is closely related to, but distinct from, a number of previous contributions to the literature. As noted, our work is motivated in part by Montepiedra and Fedorov (1997), who constructed approximate optimal designs for fixed values of $\beta_2$. Our focus is on exact designs and we incorporate $\beta_2$ through its prior distribution. Draper and Guttmann (1992) also incorporate uncertainty about $\beta_2$ through a prior distribution. They show that if $\beta_2 \sim N(\beta^*, \sigma_{\beta_2}^2 I)$, then:

$$\mathrm{Var}(\hat\beta_1 \mid d) \propto (X_1(d)'X_1(d))^{-1} + \gamma A(d)A(d)'$$

where $\gamma = \sigma_{\beta_2}^2/\sigma^2$. They considered construction of response surface designs to minimize the integrated variance, $\mathrm{Trace}[\mathrm{Var}(\hat\beta_1 \mid d)M]$, for varying $\gamma$, where

$$M = \int_R f_1(x)f_1'(x)\,dx \Big/ \int_R dx$$

and $R$ is the region of interest. Welch (1983) developed an algorithm for constructing optimal approximate and exact designs subject to an upper bound on the maximum of the absolute values of $X_2\beta_2$. Steinberg (1985) developed a Bayesian approach assuming that the $X_2\beta_2$ consist of orthogonal polynomials. Bursztyn and Steinberg (2006) use (5) to evaluate designs for computer experiments.

3 Computational Details

In this section, we discuss the numerical approaches taken to solve the exact design construction problem (6), which involves a nonlinear objective function subject to nonlinear constraints. Building on the work of Cook and Fedorov (1995), Montepiedra and Fedorov (1997) provided necessary and sufficient conditions for the global optimality of (6) with fixed $\beta_2$ in the approximate design case. They use the Lagrangian technique of Cook and Fedorov (1995) in combination with a first-order approximate design exchange algorithm (Fedorov, 1969) to obtain optimal constrained designs. In essence, we mirror this approach in the exact design space. The Lagrangian of (6) is:

$$g_1(d, \lambda) = \mathrm{Trace}[A(d)'A(d)] + \lambda\,(l_D - D_e(d)) \qquad (7)$$

For ease of interpretation, we convert objective function (7) to a maximization by multiplying by −1 and rescaling as follows. Let $A_e(d)$ denote the fraction reduction in the trace criterion brought about by moving from design $d^*$ to $d$:

$$A_e(d) = 1 - \frac{\mathrm{Trace}[A(d)'A(d)]}{\mathrm{Trace}[A(d^*)'A(d^*)]}$$

and let $w$, $0 < w < 1$, denote the relative weight assigned to $D_e(d)$. Objective function (7) becomes, apart from a constant:

$$g_2(d, w) = wD_e(d) + (1 - w)A_e(d) \qquad (8)$$

Let $d_w$ denote $\arg\max_d g_2(d, w)$. We first obtain $d_1 = d^*$, any D-optimal design for the model $f_1$, since $d^*$ is required for computing $D_e(d)$ and $A_e(d)$. To do so, we use the coordinate exchange algorithm (Meyer and Nachtsheim, 1995), although any exact design algorithm could be employed. We then obtain $d_w$ for increasing $w$ until either $A_e(d_w) = 0$ or the D-efficiency bound $l_D = D_e(d)$ is met, using a standard line search. For a fixed weight $w > 0$, optimization of (8) is (again) accomplished using the coordinate exchange algorithm with multiple random starting designs. We report the best of the resulting designs. To speed convergence, we use the rank-2 update formulas for the determinant and information matrices as given in Meyer and Nachtsheim (1995). An update formula for the alias matrix is given in the appendix to this paper. Construction of the optimal constrained designs in this fashion is fast, typically requiring computing time of the same order of magnitude as for the unconstrained case. We now turn to the construction of designs in screening and response surface settings.
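The optimization of (8) for a fixed weight can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: factors are restricted to the two levels ±1, the objective is recomputed from scratch after every exchange instead of using the rank-2 updates, the weight is fixed at w = 0.5 with no line search, and the Table 1 design plays the role of $d^*$ (so the reference determinant is $8^5$ and the reference trace is 3). All function and constant names are hypothetical.

```python
import random
from itertools import combinations

N_RUNS, N_FACTORS, W = 8, 4, 0.5
PAIRS = list(combinations(range(N_FACTORS), 2))
DET_OPT = 8.0 ** 5   # |X1'X1| for an orthogonal 8-run design (assumed d*)
TRACE_REF = 3.0      # Trace(A'A) of the Table 1 design (assumed d*)


def det_and_trace(design):
    """Return (|X1'X1|, Trace(A'A)); (0.0, inf) if X1 is rank deficient."""
    x1 = [[1.0, *x] for x in design]
    x2 = [[x[i] * x[j] for i, j in PAIRS] for x in design]
    p, q = len(x1[0]), len(x2[0])
    # Augmented matrix [X1'X1 | X1'X2]; Gauss-Jordan turns the right block into A.
    aug = [[sum(r[i] * r[j] for r in x1) for j in range(p)] +
           [sum(r1[i] * r2[j] for r1, r2 in zip(x1, x2)) for j in range(q)]
           for i in range(p)]
    det = 1.0
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[piv][c]) < 1e-9:
            return 0.0, float("inf")
        if piv != c:
            aug[c], aug[piv] = aug[piv], aug[c]
            det = -det
        det *= aug[c][c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[c])]
    return det, sum(v * v for row in aug for v in row[p:])


def g2(design, w=W):
    """Weighted objective (8): w * D-efficiency + (1 - w) * alias-efficiency."""
    det, trace = det_and_trace(design)
    if det <= 0.0:
        return float("-inf")
    d_eff = (det / DET_OPT) ** (1.0 / (N_FACTORS + 1))
    a_eff = 1.0 - trace / TRACE_REF
    return w * d_eff + (1.0 - w) * a_eff


def coordinate_exchange(rng, w=W):
    """One random start: flip single coordinates while the objective improves."""
    design = [[rng.choice((-1, 1)) for _ in range(N_FACTORS)]
              for _ in range(N_RUNS)]
    best = g2(design, w)
    improved = True
    while improved:
        improved = False
        for i in range(N_RUNS):
            for j in range(N_FACTORS):
                design[i][j] = -design[i][j]       # try the other level
                value = g2(design, w)
                if value > best + 1e-12:
                    best, improved = value, True   # keep the exchange
                else:
                    design[i][j] = -design[i][j]   # undo it
    return best, design


rng = random.Random(0)
best_value, best_design = max((coordinate_exchange(rng) for _ in range(20)),
                              key=lambda t: t[0])
print(round(best_value, 4))
for row in best_design:
    print(row)
```

Under these assumptions the objective equals 0.5 for the Table 1 design and 1.0 for the resolution IV fraction, so a successful search should return a value close to 1. A local search of this kind can stall, which is why several random starts are used and the best result is kept.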

4 Screening Designs with Protection Against Two-Factor Interactions

In this section we explore the construction of two-level screening designs where the primary model is the main effects model and the potential terms are comprised of all two-factor interactions. We first consider design settings in which the standard approaches lead to resolution III and resolution IV factorial arrangements.

4.1 Choice of Screening Designs for n = 16

Two-level Resolution III fractional-factorial designs are popular choices for screening many factors with a minimum of runs while maintaining orthogonality. However, for designs having a number of runs that is a power of two, there may be many nonisomorphic designs that are orthogonal for the main effects model. Sun et al. (2002) catalogued 16-run designs for three to fifteen factors. Table 2 summarizes their results for $6 \le p \le 14$ factors, showing the total number of orthogonal designs and their breakdown into regular and nonregular designs.

Table 2: Summary of 16-run orthogonal designs for 6 ≤ p ≤ 14 factors

Number of     Number of      Number of   Number of     Resolution of
Factors, p    Orthogonal     Regular     Nonregular    Minimum Aberration
              Designs        Designs     Designs       Design
     6           27             5           22              4
     7           55             6           49              4
     8           80             6           74              4
     9           87             5           82              3
    10           78             4           74              3
    11           58             3           55              3
    12           36             2           34              3
    13           18             1           17              3
    14           10             1            9              3

Here we consider the construction of designs for $9 \le p \le 14$ factors, where the maximum resolution possible is III. In every case, with $l_D = .95$, our design algorithm produced an orthogonal design ($D_e = 1.0$) having the minimum value of $\mathrm{Trace}(A'A)$. For example, with $p = 9$, the diagnostic measure, $\mathrm{Trace}(A'A)$, ranges from 12 to 24 over the 87 nonisomorphic orthogonal designs, with $\mathrm{Trace}(A'A) = 12$ for any of the five resolution III fractional-factorial designs. However, there is a nonregular orthogonal design that also has $\mathrm{Trace}(A'A) = 12$. Sun et al. (2002) generated this design based on columns 4-12 of Hall’s (1961) design II. Using our design approach with $l_D = .95$, we generated the design in Table 3. This design is isomorphic to the aforementioned design in Sun et al. (2002) and is globally D-optimal for estimating the main effects model. For this design scenario, $\mathrm{Trace}(A'A)$ can be further reduced only by sacrificing orthogonality (and therefore D-efficiency).

Table 3: Optimal Design for Screening Model with Protection against Two-Factor Interactions, n = 16

Run   X1   X2   X3   X4   X5   X6   X7   X8   X9
  1   -1   -1   -1    1   -1   -1    1    1   -1
  2   -1   -1    1   -1   -1   -1   -1   -1   -1
  3   -1   -1    1   -1   -1    1    1    1    1
  4   -1   -1    1    1    1    1   -1    1   -1
  5   -1    1   -1   -1    1   -1   -1    1    1
  6   -1    1   -1   -1    1    1    1   -1   -1
  7   -1    1   -1    1   -1    1   -1   -1    1
  8   -1    1    1    1    1   -1    1   -1    1
  9    1   -1   -1   -1   -1   -1    1   -1    1
 10    1   -1   -1    1    1   -1   -1   -1   -1
 11    1   -1   -1    1    1    1    1    1    1
 12    1   -1    1   -1    1    1   -1   -1    1
 13    1    1   -1   -1   -1    1   -1    1   -1
 14    1    1    1   -1    1   -1    1    1   -1
 15    1    1    1    1   -1    1    1   -1   -1
 16    1    1    1    1   -1   -1   -1    1    1

In summary, the proposed approach augments the standard optimal design search through its use of a secondary criterion. In standard design situations such as those just discussed, the secondary criterion leads to an ordering of orthogonal designs in much the way the minimum aberration criterion does, and thus to the identification of a “best” orthogonal design.

4.2 Obtaining Resolution IV Designs

Resolution IV fractional-factorial designs are popular alternatives in two-level screening studies. For a modest number of extra runs, they avoid the risk of ambiguity caused by the direct confounding of main effects and two-factor interactions that is characteristic of Resolution III designs. As noted in the introduction, the standard optimal design approach does not distinguish among alternative D-optimal designs. Therefore, an optimal design algorithm will frequently produce resolution III designs, even when designs of higher resolution are possible. As shown in Table 2, with n = 16, resolution IV designs are possible for 6-8 factors. For these cases, the resolution IV designs also uniquely minimize $\mathrm{Trace}(A'A)$. We applied our approach to these design problems. Letting the primary model include the intercept and main effects and the potential terms correspond to all two-factor interactions, the resulting design with $l_D = .9$ was the standard resolution IV fractional-factorial design. Thus, use of the constrained approach in these and similar situations leads to designs that give up nothing in terms of D-efficiency while protecting all main effects from second-order bias.

4.3 Screening Designs for n Not a Power of 2

Using the same primary and aliasing model, consider now lowering the number of runs to 12 while requiring the D-efficiency to be greater than 90%. The resulting design is presented in Table 4a. Note that each even-numbered row is the mirror image of the row above it. Here the main effects are not aliased with any two-factor interactions.

Selecting six of the eleven columns from the 12-run Plackett-Burman design results in an orthogonal design that is also D-optimal. For this design, the variance of any estimated regression coefficient is $\sigma^2/12$. However, each main effect is aliased with 10 two-factor interactions with aliasing coefficient equal to ±1/3. Balanced against the desirable characteristic that all of the main effects are independent of two-factor interactions, the design in Table 4a is not orthogonal for the main effects model. Its D-efficiency is 92%. For this design, $\mathrm{Var}(\hat\beta_i) = \sigma^2/10$ for $i = 1, \ldots, p$, and $\mathrm{Var}(\hat\beta_0) = \sigma^2/12$. If the investigator wishes to avoid any bias in the main effects due to a large two-factor interaction, this slight loss of efficiency might be acceptable. The only nonzero entries of the alias matrix for the design in Table 4a involve two-factor interaction terms aliased with the intercept.

By accepting a lower bound of 85% (instead of the previous 90%) for the D-efficiency of the primary model, we obtain the design in Table 4b. Note that this design also has the property that the rows form mirror-image pairs. Interestingly, for this design, the alias matrix $A = 0$, so that its alias-efficiency, $A_e$, is 100%. Its D-efficiency with respect to the primary model is 85.5%. By adding two factor settings of zero to each column of factor settings, the resulting design is now orthogonal for the primary model. Moreover, the relative variance of all the model coefficients is the same as for the design in Table 4a. That is, the A-efficiencies of the designs in Tables 4a and 4b are the same.
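The properties claimed for the design in Table 4b can be checked numerically. The sketch below is an illustration added here, using only the standard library; the variable names are assumptions. It takes the reference determinant to be $12^7$, the value for an orthogonal 12-run two-level design, which the text identifies as D-optimal.

```python
from itertools import combinations

# Table 4b, runs 1-12, columns X1..X6.
TABLE_4B = [
    (0, 1, -1, -1, -1, -1), (0, -1, 1, 1, 1, 1),
    (1, 0, -1, 1, 1, -1), (-1, 0, 1, -1, -1, 1),
    (-1, -1, 0, 1, -1, -1), (1, 1, 0, -1, 1, 1),
    (-1, 1, 1, 0, 1, -1), (1, -1, -1, 0, -1, 1),
    (1, -1, 1, -1, 0, -1), (-1, 1, -1, 1, 0, 1),
    (1, 1, 1, 1, -1, 0), (-1, -1, -1, -1, 1, 0),
]
PAIRS = list(combinations(range(6), 2))


def cross(rows_a, rows_b):
    """Return A'B for matrices stored as lists of rows."""
    return [[sum(ra[i] * rb[j] for ra, rb in zip(rows_a, rows_b))
             for j in range(len(rows_b[0]))] for i in range(len(rows_a[0]))]


x1 = [[1, *x] for x in TABLE_4B]                          # intercept + main effects
x2 = [[x[i] * x[j] for i, j in PAIRS] for x in TABLE_4B]  # two-factor interactions

# Each even-numbered run is the mirror image of the run above it.
is_foldover = all(tuple(-v for v in TABLE_4B[k]) == TABLE_4B[k + 1]
                  for k in range(0, 12, 2))

m = cross(x1, x1)      # X1'X1
m12 = cross(x1, x2)    # X1'X2
is_diagonal = all(m[i][j] == 0 for i in range(7) for j in range(7) if i != j)
alias_is_zero = all(v == 0 for row in m12 for v in row)   # then A = M^{-1} X1'X2 = 0

# D-efficiency relative to an orthogonal 12-run design, |X1'X1| = 12^7.
det = 1
for i in range(7):
    det *= m[i][i]     # valid because X1'X1 is diagonal here
d_eff = (det / 12 ** 7) ** (1 / 7)

print(is_foldover, is_diagonal, alias_is_zero, round(d_eff, 3))
```

The script prints `True True True 0.855`. Because $X_1'X_1$ is diagonal with entries 12, 10, ..., 10, the main-effect variances are $\sigma^2/10$ and the D-efficiency is $(10/12)^{6/7} \approx 0.855$, consistent with the 85.5% quoted above.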
An unanticipated advantage of the design in Table 4b is that it is capable of detecting nonlinearity due to any of the factors. If one were to add a center run to this design, it would be able to fit a saturated model including the intercept, all the main effects, and all the pure quadratic effects of each factor. There are 15 two-dimensional projections of this design. With the added center run, each of these projections is a 3 × 3 full factorial design. In this case, accepting a slightly lower D-efficiency has resulted in a dramatic improvement in many other design properties.

Table 4: Optimal Designs for Screening Model with Protection against Two-Factor Interactions, n = 12

(a) lD = .9

Run   X1   X2   X3   X4   X5   X6
  1    1    1   -1    1    1    1
  2   -1   -1    1   -1   -1   -1
  3    1    1    1   -1    1   -1
  4   -1   -1   -1    1   -1    1
  5    1    1   -1    1   -1   -1
  6   -1   -1    1   -1    1    1
  7    1   -1   -1    1    1   -1
  8   -1    1    1   -1   -1    1
  9    1   -1    1    1   -1    1
 10   -1    1   -1   -1    1   -1
 11    1   -1   -1   -1   -1    1
 12   -1    1    1    1    1   -1

(b) lD = .8

Run   X1   X2   X3   X4   X5   X6
  1    0    1   -1   -1   -1   -1
  2    0   -1    1    1    1    1
  3    1    0   -1    1    1   -1
  4   -1    0    1   -1   -1    1
  5   -1   -1    0    1   -1   -1
  6    1    1    0   -1    1    1
  7   -1    1    1    0    1   -1
  8    1   -1   -1    0   -1    1
  9    1   -1    1   -1    0   -1
 10   -1    1   -1    1    0    1
 11    1    1    1    1   -1    0
 12   -1   -1   -1   -1    1    0

Suppose now that the investigator was willing to add two runs to a 16-run design, so that n = 18. The D-optimal design and the optimal constrained design, with $l_D = .95$, are shown in Tables 5a and 5b. The constrained design again has the property that the runs form mirror-image pairs. One can also think of it as a nine-run design together with its foldover pair, where the foldover is with respect to all the factors. Only the intercept term is aliased with two-factor interactions. The minimum alias design seems clearly preferable to the D-optimal design in this setting. It is 99.3% D-efficient, and all of the estimated main effects have variance equal to $0.058\sigma^2$, little changed from the D-optimal values of $0.057\sigma^2$. However, the alias matrix of the D-optimal design, for which $\mathrm{Trace}(A'A) = 3.49$, is far more complex. For the constrained design, $\mathrm{Trace}(A'A) = 15/9^2 = 0.185$, leading to an alias-efficiency of 95%. The absolute value of the maximum entry in the alias matrix of the D-optimal design is 0.583 ($X_2$ by $X_1 \times X_4$), whereas, for the constrained design, each two-factor interaction is partially aliased with only the intercept, with alias coefficient equal to ±1/9.

Table 5: Optimal Design for Main Effects Model with Protection Against Bias Due to Two-Factor Interactions, n = 18

(a) Unconstrained D-Optimal Design

Run   X1   X2   X3   X4   X5   X6
  1    1   -1   -1   -1   -1    1
  2    1    1   -1    1   -1    1
  3   -1   -1   -1    1   -1   -1
  4   -1    1   -1   -1    1   -1
  5   -1    1    1   -1   -1    1
  6    1   -1    1   -1   -1    1
  7    1    1    1   -1    1    1
  8   -1   -1    1   -1    1   -1
  9   -1    1    1    1    1    1
 10   -1   -1   -1    1    1    1
 11    1    1    1    1   -1   -1
 12   -1   -1    1    1   -1    1
 13    1    1   -1    1   -1   -1
 14    1    1    1    1    1    1
 15   -1    1   -1   -1    1    1
 16   -1    1    1   -1   -1   -1
 17    1   -1   -1   -1    1   -1
 18    1   -1    1    1    1   -1

(b) Constrained Design

Run   X1   X2   X3   X4   X5   X6
  1    1    1    1   -1   -1    1
  2   -1   -1   -1    1    1   -1
  3    1   -1   -1    1   -1    1
  4   -1    1    1   -1    1   -1
  5    1   -1    1    1    1   -1
  6   -1    1   -1   -1   -1    1
  7    1   -1   -1   -1   -1   -1
  8   -1    1    1    1    1    1
  9    1   -1    1   -1    1    1
 10   -1    1   -1    1   -1   -1
 11    1   -1    1    1   -1    1
 12   -1    1   -1   -1    1   -1
 13    1    1    1    1   -1   -1
 14   -1   -1   -1   -1    1    1
 15    1    1   -1    1    1    1
 16   -1   -1    1   -1   -1   -1
 17    1    1   -1   -1    1   -1
 18   -1   -1    1    1   -1    1

5 Protecting Against Quadratic and Cubic Bias

5.1 Interaction Models with Protection Against Quadratic Bias

Adding a center-point in two-level screening applications with continuous factors is a common practice. This requires the performance of at least one run at a setting other than the two end-points of the range of each factor. Consider the case of two quantitative factors and assume that the primary terms are the intercept, main effects and two-way interaction. Let the potential terms be the two quadratic effects. Suppose the number of runs allocated to the experiment is five. The D-optimal design is the 2×2 full-factorial design with an additional arbitrary replicate of one of the four points. When $l_D \ge .889$, this design is also optimal by our criterion. However, for smaller values of $l_D$, the replicated factorial point jumps to the center of the design region. We observe the same behavior in any case where we add one run to a screening design that is saturated for its primary model. Examples are the $2^{3-1}$ resolution III fractional factorial design, the $2^{7-4}$ resolution III fractional factorial design, and the $2^{5-1}$ resolution V fractional factorial design.

If the primary model is not saturated, our approach may not result in multiply replicated center runs. In the two-factor case mentioned previously but excluding the interaction term from the primary model and adding a sixth run, we obtain the design shown in Table 6 and plotted in Figure 1 when $l_D = 0.95$. Note that, for the primary model, the D-efficiency of this design is 95% and it is capable of fitting both quadratic effects. For this design, $\mathrm{Trace}(A'A)$ is 1.627. The 2×2 design with two center-point runs is 90% efficient for the primary model and its $\mathrm{Trace}(A'A)$ is also 1.627. Our design is more efficient with the same protection against quadratic bias. On the other hand, replicating the center point has the advantage of providing one degree of freedom for pure error.
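The threshold of .889 quoted for the five-run example can be recovered by hand. The calculation below is an illustration added here; it assumes the primary model vector $f_1(x) = (1, x_1, x_2, x_1x_2)'$ with $p_1 = 4$, writes $f$ for the model vector of the fifth run, and uses the matrix determinant lemma.

```latex
X_1'X_1 = 4I_4 + ff', \qquad
|X_1'X_1| = 4^4\left(1 + \tfrac{1}{4}f'f\right)

\text{replicated factorial point: } f'f = 4
\;\Longrightarrow\; |X_1'X_1| = 256 \cdot 2 = 512

\text{center point: } f = (1, 0, 0, 0)',\; f'f = 1
\;\Longrightarrow\; |X_1'X_1| = 256 \cdot 1.25 = 320

D_e = \left(\frac{320}{512}\right)^{1/4} = 0.625^{1/4} \approx 0.889
```

The design with the fifth run at the center therefore satisfies the efficiency constraint only when $l_D$ does not exceed about .889, which is where the replicated point moves to the center.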

Table 6: Optimal Design for Interaction Model with Protection against Quadratic Bias, n=6

Run    X1      X2
  1     1       1
  2    -1       1
  3    -1       1
  4    -1      -1
  5    0.65    -1
  6     1     -0.65

Figure 1: Optimal Design for Interaction Model with Protection against Quadratic Bias, n=6 (plot of Factor 2 against Factor 1, both axes running from -1 to 1)

5.2 Response Surface Models with Protection Against Cubic Bias

Many practitioners employ the three-factor Box-Behnken design in response surface studies because it can fit a full quadratic model, it has only three levels for each factor, and it does not have any runs at the extremes of all three factors simultaneously. We used the minimum aliasing criterion to create a design for three factors and 13 runs. The primary model was the full quadratic model containing 10 terms, and the potential terms comprised the 10 third-order terms. Interestingly, when the D-efficiency bound is $l_D = 0.8$, the algorithm generates the Box-Behnken design with one center point.

To explore the effect of changing the D-efficiency bound, consider a design for two factors with nine runs. The D-optimal design for the full quadratic model is the 3×3 full-factorial design. Let the primary model be the full quadratic model and the aliasing model contain all third-order terms. As we reduce $l_D$ from 1.0, we find that the points at the vertices of the square design region shrink toward the center while remaining on a circle of constant radius. Table 7 shows the design obtained with $l_D = .81$ (D-efficiency is 81%). For the D-optimal design, $\mathrm{Trace}(A'A) = 2.89$ compared to 2.05 for the minimum aliasing design. Further reductions in the D-efficiency bound force the axial points to shrink towards the origin while remaining on a circle of constant radius. The ratio of the radius of the larger circle to the radius of the smaller circle remains at 1.27.

Table 7: Optimal Design for Primary Second-Order Model with Potential Cubic Terms, n=9

Run      X1        X2      Radius
  1    0.8994    0.8994    1.275
  2    0.8994   -0.8994    1.275
  3   -0.8994    0.8994    1.275
  4   -0.8994   -0.8994    1.275
  5    1         0         1
  6   -1         0         1
  7    0         1         1
  8    0        -1         1
  9    0         0         0

Both these examples illustrate that to reduce third-order bias in a response surface design, it is advantageous to move away from a cubic region of interest towards a spherical one.

6 Discussion

In this paper we have developed a new, constrained-optimization-based approach for obtaining optimal experimental designs, and we have explored its use in standard design settings. This approach has a number of advantages relative to classical optimal design. Firstly, our design approach provides a way to choose among competing non-unique D-optimal designs. In screening situations when $n$ is a power of 2 and resolution IV designs are not possible, it can be used to produce orthogonal designs having minimal aliasing (as measured by $\mathrm{Trace}(A'A)$) between main effects and two-factor interactions. In screening situations where resolution IV designs exist, the method will reliably produce them. If the experimenter is concerned about the potential presence of quadratic effects, the method favors the addition of center points and, for small values of the bound on D-efficiency, will tend to shrink D-optimal boundary points toward the center, favoring spherical arrangements. In general, our approach leads to designs that are more robust to misspecification of the assumed model. In one unexpected result, the approach led to the construction of two-level designs where factor settings of pairs of runs mirror each other. Such designs succeed in eliminating any bias of main effects with two-factor interactions and do so with negligible loss of D-efficiency.

The constrained-optimization approach advocated in this article can be extended in a number of directions. Firstly, minor modifications to the optimization criterion can lead to useful alternative designs. For example, in our approach, we minimize the sum of squared entries, $\sum_i\sum_j A_{ij}^2$, in the alias matrix. One might consider minimization of the $L_k$ norm, $\sum_i\sum_j |A_{ij}|^k$, for $k \ge 1$. In screening situations with $k = 1$, such a criterion will favor minimum aberration designs. With $k = 2$ as implemented here, the criterion does not distinguish between the minimum aberration resolution III designs and the best nonregular orthogonal alternatives. For $k > 2$, such a criterion will favor nonregular orthogonal designs. Further, for $k \to \infty$, this criterion will minimize the maximum squared entry in the alias matrix, also favoring nonregular designs in screening situations. Secondly, in response surface settings, where the primary objective is prediction, the integrated variance criterion (V- or I-optimality) is often preferred. Minimization of the integrated bias, subject to constraints on the integrated variance, could be implemented in similar fashion. These ideas are currently under investigation.
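The effect of the exponent $k$ in the proposed $L_k$ criterion can be seen on a toy example. The two alias rows below are hypothetical and are not taken from any design in this paper: one mimics full aliasing with a single interaction (as in a regular design), the other mimics partial aliasing of size 1/3 with nine interactions (as in a nonregular design). They are chosen so that their sums of squares coincide. The function name `lk_criterion` is an assumption introduced here.

```python
from fractions import Fraction


def lk_criterion(alias_rows, k):
    """Sum of |A_ij|^k over all entries of an alias matrix (list of rows)."""
    return sum(abs(a) ** k for row in alias_rows for a in row)


# Two hypothetical one-row alias patterns with equal sum of squares.
regular_like = [[Fraction(1)] + [Fraction(0)] * 8]   # full aliasing with one term
nonregular_like = [[Fraction(1, 3)] * 9]             # partial aliasing with nine terms

for k in (1, 2, 4):
    print(k, lk_criterion(regular_like, k), lk_criterion(nonregular_like, k))
```

With $k = 1$ the regular-like pattern scores lower (1 against 3), with $k = 2$ the two tie at 1, and with $k = 4$ the nonregular-like pattern scores lower (1/9 against 1), which mirrors the ordering described in the text.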

7 References

Bingham, D. R. and Chipman, H. A. (2007). “Incorporating Prior Information in Optimal Design for Model Selection”, Technometrics, 49, 155–163.

Bursztyn, D. and Steinberg, D. M. (2006). “Comparison of designs for computer experiments”, Journal of Statistical Planning and Inference, 136, 1103–1119.

Cook, R. D. and Fedorov, V. V. (1995). “Constrained optimization of experimental design”, Statistics, 26, 129–148.

Cook, R. D. and Nachtsheim, C. J. (1982). “Model Robust, Linear-Optimal Designs”, Technometrics, 24, 49–54.

Draper, N. R. and Lawrence, W. E. (1965). “Designs which minimize model inadequacies: Cuboidal regions of interest”, Biometrika, 52, 111–118.

Draper, N. R. and Guttmann, I. (1992). “Treating Bias as Variance for Experimental Design Purposes”, Ann. Inst. Statist. Math., 44, 659–671.

DuMouchel, W. and Jones, B. (1994). “A Simple Bayesian Modification of D-Optimal Designs to Reduce Dependence on an Assumed Model”, Technometrics, 36, 37–47.

Evans, James W. and Manson, A. R. (1978). “Optimal experimental designs in two dimensions using minimum bias estimation”, Journal of the American Statistical Association, 73, 171–176.

Fedorov, V. V. (1972). Theory of Optimal Experiments. New York: Academic Press.

Fries, A. and Hunter, W. G. (1980). “Minimum Aberration $2^{k-p}$ Designs”, Technometrics, 22, 601–608.

Hall, M. Jr. (1961). Hadamard matrix of order 16. Jet Propulsion Laboratory Research Summary, 1, 21–26.

Jones, B., Lin, D., and Nachtsheim, C. J. (2008). “Bayesian D-optimal Supersaturated Designs”, Journal of Statistical Planning and Inference, 138, 86–92.

Jones, B., Li, W., Nachtsheim, C. J. and Ye, K. (2007). “Model Discrimination: Another Perspective on Model-Robust Designs”, Journal of Statistical Planning and Inference, 137, 1577–1583.

Karson, M. J. (1970). “Design criterion for minimum bias estimation of response surfaces”, Journal of the American Statistical Association, 65, 1565–1572.

Karson, M. J. and Spruill, M. Lynn (1975). “Design criteria and minimum bias estimation”, Communications in Statistics, 4, 339–356.

Kutner, M., Nachtsheim, C. J., Neter, J., and Li, W. (2005). Applied Linear Statistical Models, Burr Ridge, IL: McGraw-Hill/Irwin.

Li, W. and Nachtsheim, C. J. (2000). “Model-robust factorial designs”, Technometrics, 42, 379–396.

Läuter, E. (1974). “Experimental Design in a Class of Models”, Mathematische Operationsforschung und Statistik, 5, 379–396.

Meyer, R. D., Steinberg, D. M., and Box, G. E. P. (1996). “Follow-Up Designs to Resolve Confounding in Multifactor Experiments”, Technometrics, 38, 303–313.

Montepiedra, G. and Fedorov, V. V. (1997). “Minimum bias designs with constraints”, Journal of Statistical Planning and Inference, 63, 97–111.

Montgomery, D. C. (2008). Design and Analysis of Experiments, New York: J. Wiley & Sons.

Steinberg, D. M. (1985). “Model Robust Response Surface Designs: Scaling Two-Level Factorials”, Biometrika, 72, 513–526.

Sun, D. X., Li, W., and Ye, K. Q. (2002). “An Algorithm for Sequentially Constructing NonIsomorphic Orthogonal Designs and Its Applications”, Technical Report SUNYSBAMS-02-13, State University of New York at Stony Brook, Dept. of Applied Mathematics and Statistics.

Welch, W. J. (1983). “A Mean Squared Error Criterion for the Design of Experiments”, Biometrika, 70, 205–213.

8 Appendix: Update Formulas

Define the following terms:

$$f(x) = (f_1'(x), f_2'(x))'$$

$$M = X_1'X_1 = \sum_i f_1(x_i)f_1'(x_i)$$

$$X_1'X_2 = \sum_i f_1(x_i)f_2'(x_i)$$

$$A = M^{-1}X_1'X_2 \qquad (9)$$

$$v(x) = f_1'(x)(X_1'X_1)^{-1}f_1(x)$$

$$v(x, x_j) = f_1'(x)(X_1'X_1)^{-1}f_1(x_j)$$

Using “up” to denote the resulting value obtained from deleting a point $x_j$ from the design and adding a new point $x$, we have:

$$X_1'X_2^{up} = X_1'X_2 - f_1(x_j)f_2'(x_j) + f_1(x)f_2'(x) = X_1'X_2 + A_1B_2' \qquad (10)$$

where $A_1 = [f_1(x), -f_1(x_j)]$ and $B_i = [f_i(x), f_i(x_j)]$, for $i = 1, 2$. Letting:

$$Q_2 = \begin{pmatrix} 1 + v(x) & -v(x, x_j) \\ v(x, x_j) & 1 - v(x_j) \end{pmatrix}$$

and employing standard rank-2 updating formulae (see, e.g., Meyer and Nachtsheim, 1995), the updated dispersion matrix is:

$$M_{up}^{-1} = [I_p - M^{-1}A_1Q_2^{-1}B_1']M^{-1} \qquad (11)$$

Substitution of (10) and (11) into (9) gives the required expression:

$$A_{up} = [I_p - M^{-1}A_1Q_2^{-1}B_1']M^{-1}[X_1'X_2 + A_1B_2'] = [I_p - M^{-1}A_1Q_2^{-1}B_1'][A + M^{-1}A_1B_2']$$
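The update formula for the alias matrix can be checked against a direct recomputation. The sketch below is an illustration added here, not the authors' code: it uses exact rational arithmetic from the standard library, takes the Table 1 design with a main-effects primary model and two-factor-interaction potential terms, and exchanges run 1 for the arbitrarily chosen new point (1, 1, 1, -1). Helper names are assumptions.

```python
from fractions import Fraction
from itertools import combinations

PAIRS = list(combinations(range(4), 2))


def f1(x):
    """Primary model vector: intercept and main effects."""
    return [Fraction(1)] + [Fraction(v) for v in x]


def f2(x):
    """Potential model vector: two-factor interactions."""
    return [Fraction(x[i] * x[j]) for i, j in PAIRS]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(m):
    """Gauss-Jordan inverse in exact arithmetic (m assumed nonsingular)."""
    n = len(m)
    aug = [list(m[i]) + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def alias(design):
    """Return (M^{-1}, A) computed directly from the design."""
    x1 = [f1(x) for x in design]
    x2 = [f2(x) for x in design]
    m_inv = inverse(matmul(transpose(x1), x1))
    return m_inv, matmul(m_inv, matmul(transpose(x1), x2))


design = [(1, -1, -1, -1), (1, 1, -1, 1), (-1, -1, 1, 1), (-1, 1, -1, -1),
          (-1, 1, 1, -1), (1, -1, 1, -1), (1, 1, 1, 1), (-1, -1, -1, 1)]
x_old, x_new = design[0], (1, 1, 1, -1)   # exchange run 1 for a new point

m_inv, a = alias(design)
a1 = transpose([f1(x_new), [-v for v in f1(x_old)]])   # A1 = [f1(x), -f1(xj)]
b1 = transpose([f1(x_new), f1(x_old)])                 # B1 = [f1(x), f1(xj)]
b2 = transpose([f2(x_new), f2(x_old)])                 # B2 = [f2(x), f2(xj)]

# Q2 = I_2 + B1' M^{-1} A1, whose entries are the v(.) terms defined above.
q2 = matmul(transpose(b1), matmul(m_inv, a1))
q2 = [[q2[i][j] + int(i == j) for j in range(2)] for i in range(2)]

# A_up = [I - M^{-1} A1 Q2^{-1} B1'] [A + M^{-1} A1 B2']
p = len(m_inv)
corr = matmul(matmul(m_inv, a1), matmul(inverse(q2), transpose(b1)))
left = [[int(i == j) - corr[i][j] for j in range(p)] for i in range(p)]
inner = matmul(m_inv, matmul(a1, transpose(b2)))
right = [[a[i][j] + inner[i][j] for j in range(len(a[0]))] for i in range(p)]
a_up = matmul(left, right)

_, a_direct = alias([x_new] + design[1:])
print(a_up == a_direct)
```

The script prints `True`: the rank-2 update reproduces the alias matrix obtained by rebuilding $X_1$ and $X_2$ for the exchanged design, while inverting only the 2 × 2 matrix $Q_2$.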
