This article was downloaded by: [Pukyong National University] On: 05 April 2013, At: 01:39 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Communications in Statistics - Simulation and Computation Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lssp20

Graphical methods for evaluating ridge regression estimator in mixture experiments

Dae Heung Jang & Min Yoon

Department of Applied Mathematics, Pukyong National University, Pusan, 608-737, Republic of Korea

Version of record first published: 27 Jun 2007.

To cite this article: Dae Heung Jang & Min Yoon (1997): Graphical methods for evaluating ridge regression estimator in mixture experiments, Communications in Statistics - Simulation and Computation, 26:3, 1049-1061 To link to this article: http://dx.doi.org/10.1080/03610919708813425


COMMUN. STATIST.-SIMULA., 26(3), 1049-1061 (1997)

Graphical Methods for Evaluating Ridge Regression Estimator in Mixture Experiments

Dae-Heung Jang and Min Yoon


Department of Applied Mathematics Pukyong National University Pusan, 608-737 Republic of Korea

Key Words and Phrases: multicollinearity; ridge regression estimator; ridge trace; the response trace; the prediction variance trace.

ABSTRACT

When the component proportions in mixture experiments are restricted by lower and upper bounds, multicollinearity appears all too frequently. We therefore suggest the use of ridge regression as a means of stabilizing the coefficient estimates in the fitted model. We propose graphical methods for evaluating the effect of the ridge regression estimator on the predicted response value and the prediction variance.

Copyright © 1997 by Marcel Dekker, Inc.

1. INTRODUCTION

In mixture experiments, the measured response is assumed to depend only on the relative proportions of the components present in the mixture. If we let $x_i$ represent the proportion of the $i$th component in a mixture of $q$ components, then

$$\sum_{i=1}^{q} x_i = 1,$$

where $0 \le x_i \le 1$, $i = 1, 2, \ldots, q$. The experimental region is a regular $(q-1)$-dimensional simplex. When additional constraints are imposed on the proportions in the form of lower and upper bounds

$$0 \le L_i \le x_i \le U_i \le 1, \quad i = 1, 2, \ldots, q, \tag{1}$$

the experimental region becomes a subregion of the simplex. Typically, mixture models are of the Scheffé type, where the first order model is

$$y = \sum_{i=1}^{q} \beta_i x_i + \varepsilon$$


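and the second order model is

$$y = \sum_{i=1}^{q} \beta_i x_i + \sum_{i=1}^{q-1} \sum_{j=i+1}^{q} \beta_{ij} x_i x_j + \varepsilon,$$

where $y$ is the observed response and $\varepsilon$ is the random error. The Scheffé model can be expressed in matrix notation as

$$\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$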

where $\mathbf{y} = (y_1, y_2, \ldots, y_n)'$ is the vector of observed responses, $X$ is the $n \times p$ ($p < n$) matrix of the component proportions and, depending on the model, the cross-products between the proportions, $\boldsymbol{\beta}$ is the $p \times 1$ vector of parameters which appear in the chosen model, and $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is the vector of random errors associated with $\mathbf{y}$. Here $p$ is the number of parameters in the model, and the response at any location $\mathbf{x}$ in the region of interest is

$$y(\mathbf{x}) = \mathbf{z}_x'\boldsymbol{\beta} + \varepsilon,$$

where $\mathbf{z}_x' = (x_1, x_2, \ldots, x_q)$ for a first order model and $\mathbf{z}_x' = (x_1, x_2, \ldots, x_q, x_1 x_2, \ldots, x_{q-1} x_q)$ for a second order model. The vector of unknown parameters is estimated by ordinary least squares as

$$\mathbf{b} = (X'X)^{-1} X'\mathbf{y},$$

where $\mathbf{b}' = (b_1, b_2, \ldots, b_q)$ for a first order model and $\mathbf{b}' = (b_1, b_2, \ldots, b_q, b_{12}, \ldots, b_{1q}, \ldots, b_{q-1,q})$ for a second order model. The variance-covariance matrix of the estimated coefficients, under the assumption that $\boldsymbol{\varepsilon} \sim (\mathbf{0}, \sigma^2 I)$, is

$$\mathrm{Var}(\mathbf{b}) = \sigma^2 (X'X)^{-1}.$$
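As an illustration of the least squares quantities above, the following is a minimal sketch in Python with NumPy. The helper names `scheffe_terms` and `ols_fit`, the three-component simplex-centroid design, and the coefficients used to generate responses are all hypothetical and are not taken from the paper; the sketch only shows how the vector of model terms, the estimate b and the matrix (X'X)^{-1} fit together.

```python
import itertools
import numpy as np


def scheffe_terms(x, order=2):
    """Build z_x: the proportions, plus pairwise cross-products when order == 2."""
    x = np.asarray(x, dtype=float)
    terms = list(x)
    if order == 2:
        terms += [x[i] * x[j] for i, j in itertools.combinations(range(len(x)), 2)]
    return np.array(terms)


def ols_fit(X, y):
    """Return b = (X'X)^{-1} X'y and (X'X)^{-1}, i.e. Var(b) / sigma^2."""
    xtx_inv = np.linalg.inv(X.T @ X)
    return xtx_inv @ X.T @ y, xtx_inv


# Hypothetical 3-component simplex-centroid design (not from the paper).
design = np.array([
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
])
X = np.array([scheffe_terms(x) for x in design])

# Made-up coefficients and noise, used only to generate example responses.
rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, 3.0, 4.0, -2.0, 1.0])
y = X @ beta + rng.normal(scale=0.1, size=len(X))

b, xtx_inv = ols_fit(X, y)
z0 = scheffe_terms([0.2, 0.3, 0.5])
y_hat0 = z0 @ b            # predicted response z_x' b at one mixture
v0 = z0 @ xtx_inv @ z0     # prediction variance at that mixture, apart from sigma^2
print(y_hat0, v0)
```

Forming the explicit inverse mirrors the formulas in the text; for a badly conditioned X a solver such as `numpy.linalg.lstsq` would usually be preferred numerically.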

The estimated values of the response at the design points are

$$\hat{\mathbf{y}} = X\mathbf{b},$$

and the estimated value of the response at any location $\mathbf{x}$ in the region of interest is

$$\hat{y}(\mathbf{x}) = \mathbf{z}_x'\mathbf{b}. \tag{2}$$

Thus, the variance of the predicted value of the response at a point $\mathbf{x}$ is given by

$$\mathrm{Var}[\hat{y}(\mathbf{x})] = \sigma^2 \mathbf{z}_x'(X'X)^{-1}\mathbf{z}_x. \tag{3}$$

$\mathrm{Var}[\hat{y}(\mathbf{x})]$ depends on the particular values of the explanatory variables through the vector $\mathbf{z}_x$. It also depends on the design through the matrix $(X'X)^{-1}$. Giovannitti-Jensen and Myers (1989) and Vining, Cornell, and Myers (1993) proposed graphical methods for comparing and evaluating experimental designs based on $\mathrm{Var}[\hat{y}(\mathbf{x})]$.

Unstable coefficient estimates arise from what is known as multicollinearity. Multicollinearity is a condition among the set of $p$ regressor variables $x_1, x_2, \ldots, x_p$ in the model. If there exists an approximate linear dependence between the columns of $X$, then we have the condition usually identified as ill-conditioning or multicollinearity. Various techniques for detecting and identifying multicollinearity have been proposed. A prevailing technique for dealing with it is ridge regression, although there are some criticisms (for example, see Draper and Smith (1981)). Hoerl and Kennard (1970a, b) and Marquardt (1970) have discussed problems associated with a ridge regression estimator. Research on ridge regression has continued until quite recently (for example, see Crouse, Jin, and Hanumara (1995)). The ridge regression estimators for the parameters in the first order and second order polynomial models are calculated using the formula

$$\mathbf{b}(k) = (X'X + kI)^{-1} X'\mathbf{y}, \tag{4}$$

where $k$ is a constant, usually $0 < k < 1$.

The purpose of this paper is to suggest graphical methods for evaluating the effect of the ridge regression estimator on the predicted response value and the prediction variance. In Section 2, we propose these graphical methods. In Section 3, we give numerical examples. In Section 4, we draw conclusions.

2. GRAPHICAL METHODS FOR EVALUATING RIDGE REGRESSION ESTIMATOR

Multicollinearity is introduced into the model-fitting exercise when trying to fit a model containing too many terms. St. John (1984) found that when fitting the first-degree Scheffé-type model, ill-conditioning of the $X'X$ matrix appears all too frequently if the component proportions have nonzero lower bounds, so he suggested the use of ridge regression as a means of stabilizing the coefficient estimates in the fitted model. Therefore, we can use the ridge regression estimator to overcome multicollinearity. From (4), the variance-covariance matrix of the ridge regression estimator $\mathbf{b}(k)$ is

$$\mathrm{Var}[\mathbf{b}(k)] = \sigma^2 (X'X + kI)^{-1} X'X (X'X + kI)^{-1}.$$

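And the predicted response value at $\mathbf{x}$ is

$$\hat{y}_k(\mathbf{x}) = \mathbf{z}_x'\mathbf{b}(k). \tag{5}$$

Therefore, the prediction variance at $\mathbf{x}$ is

$$\mathrm{Var}[\hat{y}_k(\mathbf{x})] = \sigma^2 \mathbf{z}_x'(X'X + kI)^{-1} X'X (X'X + kI)^{-1}\mathbf{z}_x. \tag{6}$$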
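The ridge quantities can be sketched in the same way. In the Python sketch below, `ridge_fit` and `prediction_variance` are hypothetical helper names, and the data are made up: a first order model whose component proportions vary over narrow ranges, so that the columns of X are nearly linearly dependent. The ridge constant is applied to the unscaled X'X exactly as in (4); whether the regressors should be standardized first is not addressed here.

```python
import numpy as np


def ridge_fit(X, y, k):
    """Return b(k) = (X'X + kI)^{-1} X'y and Var[b(k)] / sigma^2."""
    xtx = X.T @ X
    a_inv = np.linalg.inv(xtx + k * np.eye(X.shape[1]))
    return a_inv @ X.T @ y, a_inv @ xtx @ a_inv


def prediction_variance(z, cov):
    """z' cov z, the prediction variance at z apart from sigma^2."""
    return float(z @ cov @ z)


# Hypothetical first order mixture data with narrow component ranges,
# which makes the columns of X nearly linearly dependent.
rng = np.random.default_rng(1)
x1 = rng.uniform(0.40, 0.45, size=12)
x2 = rng.uniform(0.30, 0.35, size=12)
X = np.column_stack([x1, x2, 1.0 - x1 - x2])
y = X @ np.array([10.0, 20.0, 30.0]) + rng.normal(scale=0.5, size=12)

z = np.array([0.42, 0.33, 0.25])   # one mixture inside the region
b0, cov0 = ridge_fit(X, y, 0.0)    # k = 0 reproduces least squares
v_ols = prediction_variance(z, cov0)
for k in (0.001, 0.01, 0.1):
    b_k, cov_k = ridge_fit(X, y, k)
    # Columns: k, ridge prediction, ridge variance, least squares variance.
    print(k, z @ b_k, prediction_variance(z, cov_k), v_ols)
```

For every positive k in the loop, the printed ridge prediction variance should be smaller than the least squares value, which is what Fact 1 below asserts.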

By Hoerl and Kennard (1970a), we can obtain the following facts.

Fact 1. For all $k > 0$, $\mathrm{Var}[\hat{y}(\mathbf{x})] > \mathrm{Var}[\hat{y}_k(\mathbf{x})]$.

Fact 2. There exists $k > 0$ such that $\mathrm{MSE}[\hat{y}(\mathbf{x})] > \mathrm{MSE}[\hat{y}_k(\mathbf{x})]$.

Through Fact 1 and Fact 2, we know that the ridge regression estimator is superior to the least squares estimator from the standpoint of the prediction variance when multicollinearity exists.

In recent years, many statisticians have recognized the value of graphical methods in data analysis. Since the performance of an experimental design so obviously presents a multidimensional problem, graphical techniques for comparing and evaluating designs with ridge regression would seem an obvious approach.

When the mixture component proportions are restricted by lower and upper bounds of the form (1), these restrictions make the reference mixture (or overall centroid) the centroid of the constrained simplex. When measuring the effect of a component on the predicted response value and the prediction variance, and a reference mixture other than the centroid of the simplex is to be used, the Cox direction is generally appropriate. The Cox direction of component $i$ is an imaginary line projected from the reference mixture to the vertex $x_i = 1$. The Cox direction was introduced by Cox (1971). Let us denote the proportions of the $q$ components at the reference mixture by $\mathbf{c}' = (c_1, c_2, \ldots, c_q)$, where $\sum_{i=1}^{q} c_i = 1$. When the proportion $c_i$ of component $i$ is changed by an amount $\Delta_i$ in the Cox direction, so that the new proportion becomes

$$x_i = c_i + \Delta_i, \tag{7}$$


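the proportions of the remaining $q - 1$ components, resulting from the change of $c_i$ in the $i$th component, are

$$x_j = c_j - \frac{\Delta_i c_j}{1 - c_i}, \quad j \ne i. \tag{8}$$

Note that the ratio of the proportions for components $j$ and $k$, where $x_j$ and $x_k$ are defined by (8), is the same as the ratio of components $j$ and $k$ at the reference mixture. When the proportion $c_i$ of component $i$ is changed by an amount $\Delta_i$ in the Cox direction, $\Delta_i = x_i - c_i$ and

$$x_j = \frac{c_j (1 - x_i)}{1 - c_i}, \quad j \ne i, \tag{9}$$

from (7) and (8). Let

$$V(\mathbf{x}) = \mathbf{z}_x'(X'X)^{-1}\mathbf{z}_x \tag{10}$$

and

$$V_k(\mathbf{x}) = \mathbf{z}_x'(X'X + kI)^{-1} X'X (X'X + kI)^{-1}\mathbf{z}_x. \tag{11}$$

For a first order model, substituting (9) for $x_j$ into (2), the predicted response value along the Cox direction of the $i$th component becomes

$$\hat{y}(\mathbf{x}) = b_i x_i + \frac{1 - x_i}{1 - c_i} \sum_{j \ne i} b_j c_j.$$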


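And, for the second order model, the predicted response value along the Cox direction of the $i$th component becomes

$$\hat{y}(\mathbf{x}) = b_i x_i + \frac{1 - x_i}{1 - c_i} \sum_{j \ne i} b_j c_j + \frac{x_i (1 - x_i)}{1 - c_i} \sum_{j \ne i} b_{ij} c_j + \left(\frac{1 - x_i}{1 - c_i}\right)^2 \sum_{\substack{j < l \\ j, l \ne i}} b_{jl} c_j c_l.$$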


Here, subscript $ij$ must be replaced by subscript $ji$ when $i > j$. Also, the predicted response value along the Cox direction of the $i$th component under ridge regression, $\hat{y}_k(\mathbf{x})$, has a form similar to that of $\hat{y}(\mathbf{x})$. Substituting (9) for $x_j$ into (10) and (11), the prediction variance along the Cox direction of the $i$th component (apart from $\sigma^2$) becomes a second or fourth degree polynomial in $x_i$, according to whether the model is first or second order.

Using the idea of Cornell (1990) and Vining, Cornell, and Myers (1993), we propose the response trace and the prediction variance trace as tools for evaluating the effect of the ridge regression estimator. The plots of $\hat{y}(\mathbf{x})$ and $\hat{y}_k(\mathbf{x})$ along the Cox direction of each component for some $k$, the response traces, give a comprehensive picture of how the predicted response value changes due to ridge regression along the Cox direction of each component within the constrained region. Therefore, through the response trace, we can find the change in the maximum and minimum points of the predicted response value in each component due to ridge regression. The plots of $V(\mathbf{x})$ and $V_k(\mathbf{x})$ along the Cox direction of each component for some $k$, the prediction variance traces, give a comprehensive picture of how the prediction variance changes due to ridge regression along the Cox direction of each component within the constrained region. These graphs can be used to examine the effect of the ridge regression estimator on mixture designs with respect to the predicted response value and the prediction variance, respectively.

There are several techniques to estimate $k$. First, Hoerl and Kennard (1970b), Hoerl, Kennard and Baldwin (1975), McDonald and Galarneau (1975), Hocking, Speed and Lynn (1976), Lawless and Wang (1976), Wahba, Golub and Heath (1979), and Myers (1986) proposed stochastic methods.


These stochastic methods all exploit the response data. Therefore, $k$ is a random variable. Second, Tripp (1983; see Myers (1986)) and St. John (1984) proposed nonstochastic methods, i.e. the DF trace and the VIF technique. These nonstochastic methods do not exploit the response data. Therefore, $k$ is not a random variable.

Through the response trace and the prediction variance trace, we can assess the value of $k$ determined by the above-mentioned techniques. Furthermore, we can use the prediction variance trace, together with the ridge trace, as a graphical criterion for choosing $k$. As $k$ increases, the prediction variance values decrease gradually. When this decreasing trend becomes very weak, we can settle on the value of $k$. This method chooses $k$ as a function of the regressor data only. Therefore, the choice of $k$ is determined by the nature of the multicollinearity itself, and $k$ is not a random variable.
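The response trace and the prediction variance trace described in this section can be sketched numerically as follows. This is a minimal Python sketch for a first order model, not the authors' implementation: the helper names `cox_point`, `ridge_fit` and `traces`, the six-point constrained design, the response values, and the use of the mean of the design points as the reference mixture are all assumptions made for illustration. The sketch only tabulates the traces; drawing them, and judging where the decrease in prediction variance becomes weak, is left to the reader.

```python
import numpy as np


def cox_point(c, i, x_i):
    """Mixture on the Cox direction of component i through reference c.

    Component i is set to x_i; the others are rescaled as
    x_j = c_j (1 - x_i) / (1 - c_i), so their mutual ratios stay fixed.
    """
    c = np.asarray(c, dtype=float)
    x = c * (1.0 - x_i) / (1.0 - c[i])
    x[i] = x_i
    return x


def ridge_fit(X, y, k):
    """Return b(k) = (X'X + kI)^{-1} X'y and Var[b(k)] / sigma^2."""
    xtx = X.T @ X
    a_inv = np.linalg.inv(xtx + k * np.eye(X.shape[1]))
    return a_inv @ X.T @ y, a_inv @ xtx @ a_inv


def traces(X, y, c, i, grid, k):
    """Response and prediction variance traces for a first order model."""
    b_k, cov_k = ridge_fit(X, y, k)
    points = np.array([cox_point(c, i, x_i) for x_i in grid])
    response = points @ b_k                                     # z_x' b(k)
    variance = np.einsum("ij,jk,ik->i", points, cov_k, points)  # z_x' cov z_x
    return response, variance


# Hypothetical constrained 3-component design and responses (not from the paper).
X = np.array([
    [0.40, 0.40, 0.20],
    [0.40, 0.30, 0.30],
    [0.50, 0.30, 0.20],
    [0.50, 0.20, 0.30],
    [0.45, 0.30, 0.25],
    [0.45, 0.35, 0.20],
])
y = np.array([12.1, 14.8, 13.0, 15.9, 14.2, 12.7])
c = X.mean(axis=0)                  # assumed reference mixture
grid = np.linspace(0.40, 0.50, 11)  # range of component 1 along its Cox direction

var_by_k = {}
for k in (0.0, 0.001, 0.005, 0.02):
    response, variance = traces(X, y, c, 0, grid, k)
    var_by_k[k] = variance
    # The largest variance on the trace shrinks as k grows.
    print(k, variance.max())
```

Repeating the loop over every component index gives one pair of traces per component, which is the form in which the traces are compared across several values of k in the numerical examples.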


3. NUMERICAL EXAMPLES

Our first example is taken from McLean and Anderson (1966). The purpose of the experiment was to find the combination of the proportions of magnesium ($x_1$), sodium nitrate ($x_2$), strontium nitrate ($x_3$), and binder ($x_4$) that produces flares with maximum illumination. McLean and Anderson (1966) suggested the 15-point extreme vertices design consisting of the eight extreme vertices, the centroids of the six faces, and the overall centroid of the region, along with the flare illumination data. A second order polynomial was fit to the data. The component ranges are $0.40 \le x_1 \le 0.60$, $0.10 \le x_2 \le 0.50$, $0.10 \le x_3 \le 0.50$, and $0.03 \le x_4 \le 0.08$. We can use ridge regression because of the multicollinearity in this example. St. John (1984) showed that $k = 0.005$ is appropriate by means of the VIFs and the ridge trace. Also, we can calculate $k = 0.004$ by the method proposed by Hoerl, Kennard, and Baldwin (1975).

Using (2) and (5), we can draw Figure 1. Figure 1 compares the response traces for the McLean-Anderson data for several values of $k$. From Figure 1, we find that when $k = 0.000$ the illumination of the flares appears to be very sensitive to the concentration of $x_4$, and the maximum illumination occurs at the upper boundary value ($x_4 = 0.08$) for this component. As $k$ increases, the response trace changes gradually from a curve to a line, and the maximum or minimum response value along the Cox direction of each component gradually moves to the boundary of that component.

Using (10) and (11), we can draw Figure 2. Figure 2 compares the prediction variance traces for the McLean-Anderson design for several values of $k$. We can ascertain that as $k$ increases sequentially,
