Assessing the Accuracy of Mixture Model Regression Calculations

RONALD D. SNEE
Engineering Department, E. I. du Pont de Nemours and Company, Wilmington, Delaware 19898

ARTHUR A. RAYNER
Department of Statistics and Biometry, University of Natal, Pietermaritzburg, 3200 South Africa

The Scheffe and Becker models for mixture systems present special computing problems because neither of these models contains a constant term. Scientists may find that the associated least-squares calculations are inaccurate because of computer roundoff errors or, worse yet, that the available regression program does not have the capability of fitting a zero-intercept model. It is shown that these two problems can be circumvented by dropping one of the linear terms from the model and replacing it with a constant term. The resulting intercept model produces all the coefficients, predictions and significance tests appropriate to the Scheffe or Becker models. Methods for detecting roundoff error are discussed and recommendations concerning preferred computing procedures are included. Examples illustrating the effectiveness of the proposed methodology are presented.

KEY WORDS: Blending, Computer Roundoff Error, Mixture Models, Multicollinearity, Numerical Accuracy, Regression Calculations, Variance Inflation Factor

Dr. Snee is the Consultant Supervisor in the Applied Statistics Group. He is a Senior Member of ASQC. Dr. Rayner is Department Chairman and Professor of Statistics and Biometry.

Journal of Quality Technology, Vol. 14, No. 2, April 1982

Introduction

The use of statistical designs and models in the solution of mixture and product formulation problems has grown rapidly since the publication of the pioneering papers of Claringbold (1955), Scheffe (1958 and 1963), and McLean and Anderson (1966). These studies have been in a wide variety of applications, such as rocket propellants, Kurotori (1966); gasoline blends, Gorman and Hinman (1962); lubricants, Snee (1975); detergents, Narcy and Renaud (1972); foods, Hare (1974); drugs, Claringbold (1955); and entomology, Cornell and Gorman (1978). Additional background information on the statistical approach to mixture experimentation can be found in Snee (1971) and Cornell (1973, 1979 and 1981).

A mixture system has the characteristic that the response or property of interest, $y$, is a function of the proportions of the $q$ components or ingredients in the mixture, $x_i$, $i = 1, \ldots, q$, and not of the total amount of the mixture. This results in the mathematical constraints that the proportions $x_i$ of the components in the mixture must lie between zero and one and the sum of the proportions in the mixture must be one:

$$0 \le x_i \le 1, \qquad x_1 + x_2 + \cdots + x_q = 1 \tag{1}$$

The canonical polynomial model proposed by Scheffe (1958 and 1963) has been used widely in describing the responses of mixture systems. This model does not contain a constant term, that is, it has a zero intercept (see Appendix). For example, the quadratic model has the following form.

$$E(y) = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \cdots + \beta_{q-1,q} x_{q-1} x_q$$

The homogeneous models of degree one proposed by Becker (1968) do not contain a constant term either. This will be discussed in a later section.

Some users may encounter inaccurate least-squares calculations when fitting Scheffe and Becker models to mixture data because of the computer roundoff errors that can result from the absence of a constant term in these models. Some persons using these models for the first time may even find themselves without a program to do the job if the available regression program does not contain a zero-intercept option. Procedures for handling these problems are described in this paper.

We represent the general linear model by

$$y = X\beta + e \tag{2}$$

where $y$ is an $n \times 1$ vector of observations on the response variable and $e$ is an $n \times 1$ vector of variables such that $E(e) = 0$ and $V(e) = \sigma^2 I_n$. In the case of the zero-intercept model $X$ is an $n \times p$ matrix of corresponding observations on the $p$ predictor variables and $\beta = (\beta_1, \ldots, \beta_p)'$ is a vector of parameters. For the intercept model $\beta$ equals $(\beta_0, \beta_1, \ldots, \beta_p)'$ where $\beta_0$ is the constant term, and $X$ is the same matrix as before augmented by an initial column of 1's associated with $\beta_0$. The normal equations corresponding to (2) are

$$X'Xb = X'y \tag{3}$$

where $b$ is the least-squares (best linear unbiased) estimator of $\beta$, and $b_k$ is the $k$th regression coefficient, associated with the $k$th predictor variable.

It is well known [see Draper and Smith (1981), pp. 259-262] that, for a model with a constant term, expression of the predictor variables as deviations from their means (centering) produces orthogonality between the reparametrized constant term and the regression coefficients and results in an $X'X$ matrix consisting of corrected sums of squares and sums of products. "Centering" increases the accuracy of the calculation of $b$ but does not change the predictions of the model. It is, unfortunately, not possible to center the predictor variables when the model does not contain a constant term because to do so adds a constant term to the model, thereby changing the model form. The elements of the matrix $X'X$ of (3) are therefore the crude sums of squares and sums of products (i.e., not adjusted for means) of the predictor variables. Such a matrix is often poorly conditioned, leading to roundoff errors in the solution of (3) unless highly accurate computational procedures are used.

In the special case of polynomial regression models of order two or higher in which the $p$ predictor variables are powers and products of powers of $q$ factors ($x_1, \ldots, x_q$), high correlations among the predictor variables are likely, also resulting in a poorly conditioned $X'X$ matrix. Marquardt and Snee (1975) and Snee (1973b) have shown that for the usual case where the model contains a constant term, centering the factors (i.e., expressing the levels of each factor as deviations from their means before the quadratic and higher-order terms are generated) will increase computational accuracy by reducing the correlations among the predictor variables. Bradley and Srivastava (1979) made a detailed examination of the single-factor case. Chun (1968) showed how the factors could be centered so that each factor would be uncorrelated with all crossproduct terms involving the factor, thereby reducing roundoff error. Part of the effect of centering the factors is due to the orthogonality produced between the reparametrized constant term and the regression coefficients of the linear terms, as described previously. This orthogonality can be extended to the curvilinear terms, and still higher accuracy achieved, by the further step of expressing the centered powers and products of degree higher than one as deviations from their respective means. Once again neither of these recourses is available for zero-intercept polynomial regression models such as the Scheffe canonical model. Table 1 shows the effects of different types of centering on the construction of a two-factor quadratic model for a set of response surface data discussed by Myers (1971, pp. 34-35).

In order to ensure accuracy of computer calculations in the face of a poorly conditioned $X'X$ matrix (multicollinearity), it is essential to have a good algorithm for the solution of linear equations. Several authors [e.g., Freund (1963), Longley (1967), Wampler (1970) and Beaton et al. (1976)] have pointed out the effects of severe ill conditioning on the outputs of various programs and packages. In this connection the Cholesky (or square root) method is well regarded [see Graybill (1976), p. 231], though any method will show increasing inaccuracy with increasing severity of ill conditioning. The effects of severe ill conditioning can be reduced, however, by invoking the higher degrees of precision of which large computers are capable. If only single-precision arithmetic is used, the threshold of unacceptable inaccuracy with a particular algorithm may be reached at a relatively moderate level of ill conditioning. The same will be true if the computer has a small word-length, as with minicomputers and microcomputers, which have 16-bit words compared with the 32-bit and larger words of larger machines. Further, the extent of the inaccuracy may go unrecognized (see the next section).
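To make the role of the solution algorithm concrete, here is a minimal sketch in Python with NumPy (an assumption; the paper names no software). It fits the three-component Scheffe quadratic model to the vegetable oil data of Table 2 in two ways: by solving the normal equations (3) directly from the crude sums of squares and products, and with NumPy's SVD-based `lstsq` routine. The helper names `scheffe_quadratic` and `normal_equations` are hypothetical.

```python
import numpy as np

# Vegetable oil blending data of Table 2 (corn oil, stearine, palm oil; response SCI).
X_MIX = np.array([
    [0.50, 0.05, 0.45], [0.50, 0.10, 0.40], [0.52, 0.07, 0.41],
    [0.51, 0.09, 0.40], [0.55, 0.05, 0.40], [0.57, 0.07, 0.36],
    [0.55, 0.10, 0.35], [0.51, 0.12, 0.37], [0.50, 0.15, 0.35],
    [0.60, 0.05, 0.35],
])
Y = np.array([22.8, 21.1, 21.7, 20.7, 19.6, 17.9, 18.4, 18.9, 17.4, 16.4])


def scheffe_quadratic(x):
    """Columns x1, x2, x3, x1*x2, x1*x3, x2*x3 (no column of ones)."""
    x1, x2, x3 = x.T
    return np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])


def normal_equations(X, y):
    """Solve X'X b = X'y directly, as in equation (3)."""
    return np.linalg.solve(X.T @ X, X.T @ y)


X = scheffe_quadratic(X_MIX)
b_normal = normal_equations(X, Y)
b_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]

# Largest disagreement between the two routes, relative to the largest coefficient.
rel_gap = np.max(np.abs(b_normal - b_lstsq)) / np.max(np.abs(b_lstsq))
print("coefficients (lstsq):", b_lstsq)
print("relative gap, normal equations vs lstsq:", rel_gap)
```

In double precision the two routes should agree closely for these data; the size of `rel_gap` gives a rough indication of how many digits the normal-equations route has given up.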


TABLE 1. Effects of Different Types of Centering on Regression Coefficients and Variance Inflation Factors (VIF): Myers Boron Data

              Uncentered(1)           Model Centered(2)       Factors Centered(3)     Factors Centered and Model Centered(4)
Term          Coefficient    VIF      Coefficient    VIF      Coefficient    VIF      Coefficient    VIF
Constant        7.3713      900552      .5045         1*        .3172        9.68       .5045         1*
1               -.3645      197707     -.3645        3017      -.1451       19.51      -.1451        19.51
2               -.8479     2828226     -.8479        5453       .1216        5.69       .1216         5.69
12              -.0606      190521     -.0606        2547      -.0589        5.76      -.0589         5.33
11               .0112       42615      .0112        2795       .1628       27.53       .1628        18.26
22               .2759      430802      .2759        3404       .0180        4.91       .0180         3.32

(1) Crossproduct and squared terms generated from raw x's.
(2) As in (1) with all five model terms centered.
(3) x1 and x2 centered before crossproduct and squared terms are generated; hence, the two linear terms are orthogonal to the constant term but the second-order terms are not.
(4) As in (3) with all five model terms centered. Note that the centering of the model in this case reduces the VIF of the constant, crossproduct and squared terms, which results because the correlation between the constant term and the second-order terms is reduced.
(*) Centering of the model results in the constant term being orthogonal to the other model terms (VIF = 1) and the estimated constant (b0) being equal to the mean of the response y.
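The effect of centering the factors, as contrasted in columns (1) and (3) of Table 1, can be sketched numerically. The Python fragment below (NumPy assumed) uses a hypothetical 3 x 3 factorial grid located far from the origin, not the Myers boron data, and compares the condition number of the unit-scaled $X'X$ matrix of a two-factor quadratic model built from raw factors with that of the same model built from centered factors. The function names are illustrative only.

```python
import numpy as np

# Hypothetical 3 x 3 factorial grid, far from the origin (not data from the paper).
g1, g2 = np.meshgrid([10.0, 15.0, 20.0], [100.0, 150.0, 200.0])
x1, x2 = g1.ravel(), g2.ravel()


def quadratic_terms(u, v):
    """Constant, linear, crossproduct and squared terms of a two-factor quadratic."""
    return np.column_stack([np.ones_like(u), u, v, u * v, u ** 2, v ** 2])


def scaled_condition_number(X):
    """Condition number of X'X after scaling every column of X to unit length."""
    Z = X / np.linalg.norm(X, axis=0)
    return np.linalg.cond(Z.T @ Z)


# Terms generated from the raw factors.
cond_raw = scaled_condition_number(quadratic_terms(x1, x2))
# Factors centered before the crossproduct and squared terms are generated.
cond_centered = scaled_condition_number(
    quadratic_terms(x1 - x1.mean(), x2 - x2.mean())
)
print("raw factors:     ", cond_raw)
print("centered factors:", cond_centered)
```

The condition number is used here only as a convenient single summary of ill conditioning; the paper itself works with variance inflation factors.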

With the Scheffe and Becker models, severe ill conditioning can arise if the experimental region is constrained by lower or upper bounds on the levels of the individual components or on linear combinations of the components, i.e., if

$$0 \le L_i \le x_i \le U_i \le 1 \quad \text{with one or more } L_i \neq 0 \text{ or } U_i \neq 1$$

or

$$C \le A_1 x_1 + A_2 x_2 + \cdots + A_q x_q \le D$$

for some set of coefficients $A_i$ not all equal to one. Component constraints, which are encountered in a large number of mixture problems [McLean and Anderson (1966), Snee and Marquardt (1974), Snee (1975 and 1979)], increase the correlations among the terms in the mixture model. Gorman (1970) showed that, in the presence of such constraints, roundoff errors can be reduced by the pseudocomponent transformation, discussed later in this paper. However, Gorman also showed that this transformation may not be effective if one of the component ranges is small and he recommended an additional transformation for this situation.

Computer programs that do not have the capability of fitting a multiple regression model without a constant term continue to exist [see Burchfield (1971) and Nie et al. (1975)]. These models have been discussed very little in the statistical literature. Moreover, new generations of computers will continue to appear for which regression programs will very likely be written without the guidance of those who have knowledge of and experience with the special characteristics of these models. It is, therefore, conceivable that many of these programs will not be able to fit models with no constant term.

The purpose of this paper is to show that all the necessary calculations for fitting mixture models, including tests of significance for the overall model and for the curvilinear terms (i.e., deviations from linear blending), can be done by fitting models with a constant term, thus permitting reduction of roundoff errors by centering the factors. Watson (1969) made the same suggestion to improve the accuracy of the calculation of the residual sum of squares in fitting a linear mixture model. We are not necessarily recommending that the proposed technique for fitting mixture models be adopted exclusively. The methodology should be used, however, when fitting a model with no constant term either produces inaccurate calculations or is not possible with the available regression program.

Using Variance Inflation Factors to Detect Computer Roundoff Error

In general, we recommend that all regression calculations, including those for mixture models, be done in double-precision arithmetic and that the ($p \times p$) matrix $X'X$ be scaled to give a matrix $R$ with unit diagonal elements [Draper and Smith (1981), pp. 262-265]. This reduces the magnitude of the elements of the matrix of the normal equations because the elements of $R$ all lie between -1 and 1. When the model contains a constant term, $R$ should be taken as the ($p \times p$) correlation matrix.

A lookout should always be kept for computer roundoff errors when fitting regression models to data even when the computational procedures mentioned above are employed. We believe that the maximum variance inflation factor (VIF), proposed by Marquardt (1970), is the best single statistic for detecting situations in which roundoff error might be a problem. This statistic is already available in good regression programs. The VIF for the $k$th regression coefficient is

$$\mathrm{VIF}(b_k) = (1 - R_k^2)^{-1}$$

where $R_k$ is the multiple correlation coefficient of the regression of the $k$th predictor variable on all other predictor variables, whether or not the model contains a constant term [Searle (1971), p. 95]. Gorman (1970) concluded that roundoff error would probably not be a problem if $R_k^2 \le 0.99$; hence, the accuracy of regression calculations should be acceptable if the maximum VIF $\le 100$. The VIF($b_k$) is also the $k$th diagonal element of $R^{-1}$ and measures the collective effects of the correlations among the predictor variables on the variance of the regression coefficient $b_k$.

We feel that the VIF's should be printed out routinely with all regression computations because they provide a very useful measure of the degree of multicollinearity [see Marquardt (1970), Snee (1973b) and Mason et al. (1975)]. The VIF's are also related mathematically to the eigenvalues and eigenvectors of the correlation matrix $R$ and provide much of the same information [see the Appendix of Snee (1973b)]. Berk (1977) has shown that the maximum VIF is a lower bound on the condition number of $R$, which is used by some workers to detect multicollinearity [see Belsley et al. (1980)].

In polynomial regression models with a constant term, the VIF's are affected in different ways by the two types of centering: centering of the factors in the model and centering of all the predictor variables. In a first order model, the two types of centering coincide, resulting in a VIF equal to one for the constant term (i.e., the reparametrized constant term is orthogonal to the linear terms) and reduced VIF's for the linear terms. In polynomial models of order two or higher, centering the factors usually reduces the VIF's drastically [see Marquardt and Snee (1975) and Snee (1973b)]. Further centering of all predictor variables in these models results in a VIF equal to one for the constant term and reduced VIF's for the terms of order two and higher (see Table 1). Centering of all the predictor variables in the model is good statistical practice and is used in most computer programs. This is easily accomplished by the conversion of $X'X$ to the correlation matrix $R$ before solving the normal equations.

Other criteria may also be of use: (i) the maximum element of $RR^{-1} - I$, which may, however, be difficult to compute accurately in the circumstances when the criterion is needed; (ii) the difference $(y'y - b'X'y) - (y - \hat{y})'(y - \hat{y})$ between the residual sum of squares as calculated by subtraction in the regression ANOVA table (with the correction for the mean excluded if the model has a constant term) and as calculated directly from the residuals $y - \hat{y}$, where $\hat{y}$ is the $n \times 1$ vector of fitted values; and (iii) Mullet and Murray's (1971) rounding error check, which, however, requires $mp$ (where $m$ is an integer $\ge 1$) additional runs of the regression program.

The mixture models considered in the sections that follow are mathematically equivalent alternatives to the Scheffe model, and so comparisons of the outputs of programs for fitting the different models are of interest when testing programs for computational accuracy. The residual sums of squares and the fitted coefficients (after detransformation of the model where appropriate) of the models should agree, subject to rounding error. It must be noted, however, that since the equivalence of the models is based on the relationship (1), these comparisons are meaningful only if $x_{1j} + x_{2j} + \cdots + x_{qj}$ equals one exactly for all $j$ ($j = 1, \ldots, n$), where $x_{ij}$ represents the proportion of the $i$th component for the $j$th observation. Any rounding error in $x_{ij}$ will produce erroneous regression statistics.
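Criterion (ii) above is simple to compute. The sketch below, in Python with NumPy (assumed; the function names are hypothetical), evaluates the residual sum of squares of the Scheffe quadratic model for the vegetable oil data of Table 2 both by subtraction and directly from the residuals, once with the arrays stored in double precision and once in single precision. The coefficients are obtained in double precision first, so only the two ways of forming the residual sum of squares are being compared.

```python
import numpy as np

# Vegetable oil blending data of Table 2.
X_MIX = np.array([
    [0.50, 0.05, 0.45], [0.50, 0.10, 0.40], [0.52, 0.07, 0.41],
    [0.51, 0.09, 0.40], [0.55, 0.05, 0.40], [0.57, 0.07, 0.36],
    [0.55, 0.10, 0.35], [0.51, 0.12, 0.37], [0.50, 0.15, 0.35],
    [0.60, 0.05, 0.35],
])
Y = np.array([22.8, 21.1, 21.7, 20.7, 19.6, 17.9, 18.4, 18.9, 17.4, 16.4])


def rss_by_subtraction(X, y, b):
    """y'y - b'X'y, as formed in the regression ANOVA table."""
    return y @ y - b @ (X.T @ y)


def rss_direct(X, y, b):
    """(y - yhat)'(y - yhat), formed from the residuals."""
    resid = y - X @ b
    return resid @ resid


x1, x2, x3 = X_MIX.T
X = np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])
b = np.linalg.lstsq(X, Y, rcond=None)[0]

gaps = {}
for dtype in (np.float64, np.float32):
    Xd, yd, bd = X.astype(dtype), Y.astype(dtype), b.astype(dtype)
    gaps[dtype.__name__] = float(rss_by_subtraction(Xd, yd, bd) - rss_direct(Xd, yd, bd))
    print(dtype.__name__, "difference between the two RSS values:", gaps[dtype.__name__])
```

A difference that is not negligible relative to the residual sum of squares itself is a warning that roundoff may have affected the calculations.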
Daniel and Wood (1980, chapter 9) recommend that if $\sum_{i=1}^{q} x_{ij}$ does not equal one for any blends (e.g., due to rounding or analytical errors) then the $x_{ij}$ be scaled to add to one by using $x_{ij}^{*} = x_{ij} / \sum_{i=1}^{q} x_{ij}$ in the regression calculations. Although this is a reasonable suggestion, we are not aware of any studies of its properties.

Different computers provide different levels of accuracy. We recommend that in routine fitting of linear models the VIF's be used as guidelines for detecting those situations in which roundoff errors may be a problem. This will produce an awareness of the roundoff problems associated with the regression calculations and provide a method for determining what degree of ill conditioning a given combination of regression program and computer can tolerate. Accordingly, all comparisons among the models discussed in this paper are based on their associated VIF's, those for the models with a constant term always being calculated with centering of the predictor variables (i.e., VIF equal to one for the constant term).
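The VIF calculations recommended in this section can be sketched as follows in Python with NumPy (assumed). For a model without a constant term the VIF's are taken as the diagonal of the inverse of $X'X$ scaled to unit diagonal; for a model with a constant term they are taken as the diagonal of the inverse correlation matrix. The function names are hypothetical, and `vif_by_regression` is included only as a cross-check against the definition $(1 - R_k^2)^{-1}$. The component proportions are those of the poultry feed study of Table 3.

```python
import numpy as np

# Poultry feed blending data of Table 3 (maize, fish, soybean); only the x's are needed.
X_MIX = np.array([
    [0.80, 0.20, 0.00], [0.70, 0.30, 0.00], [0.30, 0.20, 0.50],
    [0.30, 0.30, 0.40], [0.50, 0.00, 0.50], [0.80, 0.00, 0.20],
    [0.80, 0.10, 0.10], [0.65, 0.15, 0.20], [0.50, 0.15, 0.35],
    [0.50, 0.30, 0.20],
])


def vif_zero_intercept(X):
    """Diagonal of R^-1, where R is X'X scaled to unit diagonal (no constant term)."""
    Z = X / np.linalg.norm(X, axis=0)
    return np.diag(np.linalg.inv(Z.T @ Z))


def vif_with_intercept(X):
    """Diagonal of the inverse correlation matrix; X holds the predictors only."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))


def vif_by_regression(X, k):
    """(1 - R_k^2)^-1 from regressing column k on the others, zero-intercept case."""
    others = np.delete(X, k, axis=1)
    coef = np.linalg.lstsq(others, X[:, k], rcond=None)[0]
    resid = X[:, k] - others @ coef
    r2 = 1.0 - (resid @ resid) / (X[:, k] @ X[:, k])
    return 1.0 / (1.0 - r2)


x1, x2, x3 = X_MIX.T
# Scheffe quadratic model: no constant term.
scheffe = np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])
# Intercept quadratic model with x3 deleted: the column of ones is implied.
intercept_terms = np.column_stack([x1, x2, x1 * x2, x1 * x3, x2 * x3])

vif_scheffe = vif_zero_intercept(scheffe)
vif_intercept = vif_with_intercept(intercept_terms)
print("Scheffe model VIFs:  ", np.round(vif_scheffe, 2))
print("Intercept model VIFs:", np.round(vif_intercept, 2))
```

The printed values can be compared with the component-model columns of Table 5; a maximum VIF above 100 would, by the guideline given above, call for one of the better-conditioned model forms.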

Intercept Mixture Models Reduce Roundoff Errors and Provide Appropriate Coefficients and Significance Tests

It is shown in the Appendix that the Scheffe canonical model may be regarded as a full-rank reparametrization of the corresponding full polynomial model in the presence of the constraint (1). For the example of three components and a second degree polynomial, the Scheffe canonical model, the Three-Component Scheffe Quadratic Model,

$$E(y) = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 \tag{4}$$

is derivable from the corresponding ten-term full quadratic polynomial by the deletion of the constant term and the three squared terms. It is also shown that this is only one of a series of models that can be produced by deleting any one of the terms of degree less than two and a suitable set of three of the second degree terms. For example, deletion of the linear term in $x_3$ and the three squared terms produces the alternative model, the Three-Component Intercept Quadratic Model, as follows.

$$E(y) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_{12} x_1 x_2 + \alpha_{13} x_1 x_3 + \alpha_{23} x_2 x_3 \tag{5}$$

Relative to (4) this amounts to the replacement of the term in $x_3$ by a constant term. We shall refer to this as the intercept model for this case, with $x_3$ deleted even though $x_3$ still appears in the product terms. This replacement of a linear term in the Scheffe model by a constant term is permissible whatever the number of components or the degree of the model. It is also applicable to the Becker models. These intercept models are mathematically equivalent to the corresponding Scheffe or Becker models and provide all the necessary regression coefficients, predicted values and regression test statistics: $F$, $R^2$, $R_A^2$ (see below). However, choice of the linear term to be deleted can have some effect on the accuracy of the calculations (see Table 6 and later discussion).

It is also seen that (4) may be derived from (5) by multiplying $\alpha_0$ by $x_1 + x_2 + x_3$ (i.e., by 1) and making a notational change, from which the relationship between the parameters of the two models is seen to be the following.

$$\begin{aligned}
\beta_1 &= \alpha_0 + \alpha_1, & \beta_{12} &= \alpha_{12}, \\
\beta_2 &= \alpha_0 + \alpha_2, & \beta_{13} &= \alpha_{13}, \\
\beta_3 &= \alpha_0, & \beta_{23} &= \alpha_{23}
\end{aligned} \tag{6}$$
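The equivalence of models (4) and (5) and the relationships (6) can be checked numerically. The following sketch in Python with NumPy (assumed; the function names are hypothetical) fits both models to the vegetable oil data of Table 2 and converts the intercept-model coefficients back to Scheffe coefficients by means of (6).

```python
import numpy as np

# Vegetable oil blending data of Table 2.
X_MIX = np.array([
    [0.50, 0.05, 0.45], [0.50, 0.10, 0.40], [0.52, 0.07, 0.41],
    [0.51, 0.09, 0.40], [0.55, 0.05, 0.40], [0.57, 0.07, 0.36],
    [0.55, 0.10, 0.35], [0.51, 0.12, 0.37], [0.50, 0.15, 0.35],
    [0.60, 0.05, 0.35],
])
Y = np.array([22.8, 21.1, 21.7, 20.7, 19.6, 17.9, 18.4, 18.9, 17.4, 16.4])


def scheffe_design(x):
    """Model (4): x1, x2, x3, x1*x2, x1*x3, x2*x3."""
    x1, x2, x3 = x.T
    return np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])


def intercept_design(x):
    """Model (5): constant, x1, x2, x1*x2, x1*x3, x2*x3 (linear term in x3 deleted)."""
    x1, x2, x3 = x.T
    return np.column_stack([np.ones(len(x)), x1, x2, x1 * x2, x1 * x3, x2 * x3])


XS, XI = scheffe_design(X_MIX), intercept_design(X_MIX)
beta = np.linalg.lstsq(XS, Y, rcond=None)[0]
alpha = np.linalg.lstsq(XI, Y, rcond=None)[0]

# Relationships (6): recover the Scheffe coefficients from the intercept model.
a0, a1, a2, a12, a13, a23 = alpha
beta_from_alpha = np.array([a0 + a1, a0 + a2, a0, a12, a13, a23])

print("Scheffe fit:       ", beta)
print("via intercept fit: ", beta_from_alpha)
print("max difference in fitted values:", np.max(np.abs(XS @ beta - XI @ alpha)))
```

Agreement holds only because every row of the data sums to one, in line with the earlier remark that the comparisons are meaningful only when the proportions add exactly to one.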

The relationships (6) show that tests of hypotheses involving $\alpha_{12}$, $\alpha_{13}$ and $\alpha_{23}$ are equivalent to tests of hypotheses involving the nonlinear blending coefficients ($\beta_{12}$, $\beta_{13}$, $\beta_{23}$) in the Scheffe model. Tests of hypotheses involving the linear coefficients $\beta_i$ are of less interest in mixture models because it does not make any physical sense to delete a component from the mixture. The exception is the use of the linear mixture model in screening mixture components [see Snee and Marquardt (1976)], where computations beyond the usual regression calculations are needed regardless of the model used. In this paper we will concentrate on higher-order models (as exemplified by the quadratic in particular) because it is the curvilinear terms that produce the majority of the roundoff problems.

Use of an intercept model may result in some reduction in the VIF's as compared with the Scheffe model, but it may well remain subject to roundoff problems. The main benefit of this approach will only be achieved if the factors (proportions of the components) are centered, as shown below for the case of model (5) with the Centered Three-Component Intercept Quadratic Model,

$$\begin{aligned}
E(y) = {} & \alpha_0^c + \alpha_1^c (x_1 - \bar{x}_1) + \alpha_2^c (x_2 - \bar{x}_2) + \alpha_{12}^c (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) \\
& + \alpha_{13}^c (x_1 - \bar{x}_1)(x_3 - \bar{x}_3) + \alpha_{23}^c (x_2 - \bar{x}_2)(x_3 - \bar{x}_3)
\end{aligned} \tag{7}$$

where $\bar{x}_i = n^{-1} \sum_{j=1}^{n} x_{ij}$, the mean of $x_i$ in the data. The relationships between the $\alpha_k^c$'s and the $\beta_k$'s of (4) are given by

$$\begin{aligned}
\beta_1 &= A + \alpha_1^c - \alpha_{12}^c \bar{x}_2 - \alpha_{13}^c \bar{x}_3 \\
\beta_2 &= A + \alpha_2^c - \alpha_{12}^c \bar{x}_1 - \alpha_{23}^c \bar{x}_3 \\
\beta_3 &= A - \alpha_{13}^c \bar{x}_1 - \alpha_{23}^c \bar{x}_2
\end{aligned} \tag{8}$$

where

$$A = \alpha_0^c - \alpha_1^c \bar{x}_1 - \alpha_2^c \bar{x}_2 + \alpha_{12}^c \bar{x}_1 \bar{x}_2 + \alpha_{13}^c \bar{x}_1 \bar{x}_3 + \alpha_{23}^c \bar{x}_2 \bar{x}_3.$$

The slack-variable model presented by Marquardt and Snee (1974) is one of the class of intercept models. For the three-component quadratic case with $x_3$ slack it is the following.

$$E(y) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_{12} x_1 x_2 + \alpha_{11} x_1^2 + \alpha_{22} x_2^2 \tag{9}$$

This model is used to describe the response of a mixture system when one component (the so-called slack variable) makes up the majority of the mixture with the other components occurring in trace amounts, and the main interest is in studying the effects of increasing the trace components at the expense of decreasing the level of the slack component. In comparison with (5) this amounts to the choice of terms in $x_1^2$ and $x_2^2$ in preference to terms in $x_1 x_3$ and $x_2 x_3$, with the result that the slack variable does not appear explicitly in the model.

Marquardt and Snee (1974) pointed out that since the slack-variable model is equivalent to the Scheffe model it can be used to compute the appropriate $F$, $R^2$ and $R_A^2$ statistics, which differ from those of zero-intercept models whose factors $x_i$ are not subject to the constraints of (1). However, the expressions for the quadratic coefficients in the Scheffe model in terms of the $\alpha_k$'s of (9) are $\beta_{12} = \alpha_{12} - \alpha_{11} - \alpha_{22}$, $\beta_{13} = -\alpha_{11}$ and $\beta_{23} = -\alpha_{22}$. Therefore, with the exception of the test for nonlinear blending ($H_0$: $\beta_{12} = \beta_{13} = \beta_{23} = 0$) for which $H_0$: $\alpha_{12} = \alpha_{11} = \alpha_{22} = 0$ is equivalent, tests of hypotheses involving $\beta_{12}$, $\beta_{13}$ and $\beta_{23}$ cannot be conducted as simply as with models (5) and (7). Since these models also give the correct $F$, $R^2$ and $R_A^2$ statistics, they are preferable for our purposes to the slack-variable form except in the physical situation for which the latter is appropriate.

Pseudocomponents Reduce Roundoff Errors by Estimating Coefficients that Describe the Response in the Region of the Data

If one or more of the components of a mixture system have nonzero lower bounds, transformation to pseudocomponents will reduce roundoff error. This is achieved through reductions in the correlations between the predictor variables in a way similar to centering the components. The terms in the pseudocomponent model are less correlated because its coefficients describe the response changes in the region of the data. The pseudocomponent transformation is

$$x_i' = \frac{x_i - L_i}{1 - L}$$

where $L = L_1 + L_2 + \cdots + L_q$ and $L_i \ge 0$ is the lower bound for the $i$th component ($\sum_{i=1}^{q} x_i' = 1$). The pseudocomponent variables describe the smallest simplex that will include all the observations. The effect of the transformation in the case of three components with $L_1 = 0.15$, $L_2 = 0.20$ and $L_3 = 0.35$ is shown graphically in Figure 1. A pseudocomponent is a mixture of two or more components. In Figure 1 we see that pseudocomponent $x_1'$ consists of 45 percent component 1, 20 percent component 2 and 35 percent component 3.

FIGURE 1. Pseudocomponent Simplex Design Region (constraints: $x_1 \ge 0.15$, $x_2 \ge 0.20$, $x_3 \ge 0.35$)

Corresponding to the model in terms of the pure components (4), the response model (Scheffe form) in terms of pseudocomponents may be denoted by the following.

$$E(y) = \gamma_1 x_1' + \gamma_2 x_2' + \gamma_3 x_3' + \gamma_{12} x_1' x_2' + \gamma_{13} x_1' x_3' + \gamma_{23} x_2' x_3' \tag{10}$$

In terms of this model $\gamma_1$, $\gamma_2$ and $\gamma_3$ represent the responses for pseudocomponents $x_1'$, $x_2'$ and $x_3'$, and $\gamma_{12}$, $\gamma_{13}$ and $\gamma_{23}$ describe the effect of nonlinear blending of pseudocomponents $x_1'$ and $x_2'$, $x_1'$ and $x_3'$, and $x_2'$ and $x_3'$, respectively. The corresponding coefficients of (4) describe similar effects for the pure components $x_1$, $x_2$ and $x_3$, and are applicable to an unconstrained simplex. When the region of experimentation is constrained, the estimated values of $\beta_i$, $i = 1, \ldots, q$, give predicted responses for mixtures that may not be physically meaningful. Such extrapolations increase the correlations among the terms in the model and, if these are sufficiently large, roundoff errors will result.
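The pseudocomponent transformation and its inverse can be written in a few lines. The sketch below is in Python with NumPy (assumed), the function names are hypothetical, and the lower bounds are those used for Figure 1. It recovers the composition of the pseudocomponent $x_1' = 1$ quoted in the text.

```python
import numpy as np


def to_pseudocomponents(x, lower):
    """x_i' = (x_i - L_i) / (1 - L), with L the sum of the lower bounds."""
    x = np.asarray(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    return (x - lower) / (1.0 - lower.sum())


def from_pseudocomponents(xp, lower):
    """Inverse transformation: x_i = L_i + (1 - L) * x_i'."""
    xp = np.asarray(xp, dtype=float)
    lower = np.asarray(lower, dtype=float)
    return lower + (1.0 - lower.sum()) * xp


LOWER = np.array([0.15, 0.20, 0.35])  # lower bounds used for Figure 1

# Composition of the vertex x1' = 1 in terms of the original components:
# approximately [0.45, 0.20, 0.35], as stated in the text.
vertex = from_pseudocomponents([1.0, 0.0, 0.0], LOWER)
print(vertex)

# Round trip for an arbitrary admissible blend.
blend = np.array([0.25, 0.30, 0.45])
print(to_pseudocomponents(blend, LOWER))
```

Because the transformation is linear and preserves the unit sum, a model fitted in pseudocomponents gives the same predictions as the corresponding model in the original components.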

Replacement of one of the linear terms in (10) by a constant produces an intercept form of the pseudocomponent model, which, with $x_3'$ deleted, may be denoted by the following.

$$E(y) = \delta_0 + \delta_1 x_1' + \delta_2 x_2' + \delta_{12} x_1' x_2' + \delta_{13} x_1' x_3' + \delta_{23} x_2' x_3' \tag{11}$$

The corresponding centered pseudocomponent form is denoted by the following.

$$\begin{aligned}
E(y) = {} & \delta_0^c + \delta_1^c (x_1' - \bar{x}_1') + \delta_2^c (x_2' - \bar{x}_2') + \delta_{12}^c (x_1' - \bar{x}_1')(x_2' - \bar{x}_2') \\
& + \delta_{13}^c (x_1' - \bar{x}_1')(x_3' - \bar{x}_3') + \delta_{23}^c (x_2' - \bar{x}_2')(x_3' - \bar{x}_3')
\end{aligned} \tag{12}$$

The relationships between the $\delta_k$'s of (11) and the $\gamma_k$'s of (10), and between the $\delta_k^c$'s of (12) and the $\gamma_k$'s, are in exact correspondence with the relationships (6) and (8) for the corresponding component models.

Illustrative Examples

The Scheffe, intercept and centered intercept models, with and without the pseudocomponent transformation, comprise six potential models whose VIF's in a number of examples will be compared.

                                    Intercept Model
x-Scale            Scheffe Model    Not Centered    Centered
Component              (4)              (5)            (7)
Pseudocomponent        (10)             (11)           (12)

However, since

$$x_i' - \bar{x}_i' = \frac{x_i - \bar{x}_i}{1 - L}$$

the parameters of models (7) and (12) are related by $\delta_k^c = (1 - L)^u \alpha_k^c$, where $u$ is the degree of the term concerned. This means that the VIF's for these two models are identical.

The first example is a vegetable oil blending study considering ten blends (Table 2) whose components have the ranges: $0.50 \le x_1 \le 0.60$, $0.05 \le x_2 \le 0.15$ and $0.35 \le x_3 \le 0.45$. In Figure 2 we see that these ranges form a simplex when expressed in terms of pseudocomponents.

TABLE 2. Vegetable Oil Blending Study

           x1          x2          x3          y
Blend      Corn Oil    Stearine    Palm Oil    SCI*
  1        0.50        0.05        0.45        22.8
  2        0.50        0.10        0.40        21.1
  3        0.52        0.07        0.41        21.7
  4        0.51        0.09        0.40        20.7
  5        0.55        0.05        0.40        19.6
  6        0.57        0.07        0.36        17.9
  7        0.55        0.10        0.35        18.4
  8        0.51        0.12        0.37        18.9
  9        0.50        0.15        0.35        17.4
 10        0.60        0.05        0.35        16.4

*Solids Content Index

FIGURE 2. Vegetable Oil Blending Study Experimental Region (axes: x1 corn oil, x2 stearine, x3 palm oil)

The data in the second example (Table 3) were obtained from a poultry feed blending study of ten blends that utilized an extreme vertices design (Figure 3). The components in this study have the ranges: $0.3 \le x_1 \le 0.8$, $0 \le x_2 \le 0.3$ and $0 \le x_3 \le 0.5$.

The VIF's associated with the Scheffe and intercept models for these two examples are summarized in Tables 4 through 6. In Table 4 we see that in the vegetable oil blending study the VIF's for the Scheffe component model are extremely high (maximum VIF = 88858.80), so that accurate fitting of this model
would be very difficult. As indicated earlier [see the discussion of model (10)], this happens because the design covers a small portion of the simplex and the coefficients in the Scheffe model are extrapolations well beyond the region of the data. On the other hand the VIF's for the Scheffe pseudocomponent model are very low, which is to be expected in this example because the experimental region is a simplex when expressed in terms of pseudocomponents (Figure 2). The centered intercept models with $x_3$ deleted give VIF's higher than those of the Scheffe pseudocomponent model, but by only a margin small enough to have virtually no effect on computational accuracy.

TABLE 3. Poultry Feed Blending Study

           x1       x2      x3         y
Blend      Maize    Fish    Soybean    Mass Gain*
  1        0.80     0.20    0          323.28
  2        0.70     0.30    0          345.69
  3        0.30     0.20    0.50       347.97
  4        0.30     0.30    0.40       328.19
  5        0.50     0       0.50       343.91
  6        0.80     0       0.20       290.00
  7        0.80     0.10    0.10       313.75
  8        0.65     0.15    0.20       347.06
  9        0.50     0.15    0.35       335.91
 10        0.50     0.30    0.20       339.84

* Average gain (g) of 32 chicks

FIGURE 3. Experimental Region for Poultry Feed Blending Study

In the poultry feed blending study (Table 5) we see that the maximum VIF for the Scheffe component model is very much lower than in Table 4, though still well above the acceptable maximum of 100. The centered intercept models again produce low VIF's, but in this example the Scheffe pseudocomponent model is not nearly as good, though it does have a maximum VIF under 100. The pseudocomponent transformation applied to the Scheffe model was not very effective in this example because the experimental region did not cover the complete pseudocomponent simplex (Figure 3).

The effect of deleting different linear terms in forming intercept models for the poultry feed blending example is shown in Table 6. Only the VIF's for the linear terms are affected because the values of the quadratic coefficients do not depend on which linear term is deleted; see equation (6). Deletion of $x_2$ to produce the uncentered model is not much help, but the VIF's of the models with $x_1$ deleted and with $x_3$ deleted are lower and very similar to one another. Some comment on the question of which linear term to delete is made at the end of this section.

Another useful test case for studying the computational accuracy of a regression program is the hypothetical, but realistic, three-component example published by Gorman (1970). The region is very small and the terms in the Scheffe model are highly correlated; however, the six points form a pseudocomponent simplex design. The VIF's for the Scheffe component model are 70410.80 for the three linear terms and 74001.61 for the three quadratic terms. If a centered intercept model with $x_3$ deleted
TABLE 4. Variance Inflation Factors for Vegetable Oil Blending Models

Model Term 1 2 3 12 13· 23

Scheff~ Model PseudoComponent(4)* Component(1o) 19918.67 6043.57 34451.85 4645.22 88858.80 1876.47

1.89 2.02 2.10 1.89 1.81 2.23

Uncentered Intercept Model PseudoComponent(11) Component(5) 67.57 1164. 14 deleted 484.97 435.82 189.66

2.02 1.82 deleted 1.28 1.25 1.36

Centered Intercept (7)or( 12) 3.29 2.95 deleted 2.04 2.09 1.74

*Model equation numbers are shown in parentheses.



TABLE 5. Variance Inflation Factors for Poultry Feed Blending Models

         Scheffé Model                   Uncentered Intercept Model      Centered
Model    Component    Pseudocomponent    Component    Pseudocomponent    Intercept
Term     (4)*         (10)               (5)          (11)               (7) or (12)

1         16.54       10.34              12.98         5.19              1.27
2        197.45       65.23              58.30        18.81              1.28
3         42.33        9.60              deleted      deleted            deleted
12       143.01       38.94              45.98        19.54              3.52
13        43.64        7.83              12.29         3.56              1.21
23        33.68       33.68              18.96        18.96              3.63

*Model equation numbers are shown in parentheses.

TABLE 6. Poultry Feed Blending Intercept Models: Effect of Choice of Deleted Variable on VIF's

                                     Deleted     VIF for Model Term+
Model                                Variable    1          2          3

Uncentered Component (5)*            x3          12.98      58.30      deleted
                                     x2          155.38     deleted    156.92
                                     x1          deleted    52.97      11.91

Uncentered Pseudocomponent (11)      x3          5.19       18.81      deleted
                                     x2          50.53      deleted    50.62
                                     x1          deleted    17.23      4.77

Centered (7) or (12)                 x3          1.27       1.28       deleted
                                     x2          3.47       deleted    3.44
                                     x1          deleted    1.18       1.17

+The choice of the deleted variable has no effect on the VIF's of the quadratic coefficients.
*Model equation numbers are shown in parentheses.

is used, the maximum VIF is 3.15. The VIF's for the Scheffé pseudocomponent model are all equal to 1.50, and so in this example an intercept form is unnecessary for further improvement in accuracy.

The flare experiment in McLean and Anderson (1966), a four-component extreme vertices design of fifteen blends, is another well-known mixture study that illustrates several important points. In brief we found that: (i) the maximum VIF of the ten-term Scheffé component model is 41042 (for be); (ii) the uncentered intercept model with $x_1$, $x_2$ or $x_3$ deleted has a maximum VIF of about 5000 (for $b_4$), but with $x_4$ deleted, because of the small range of $x_4$, a problem also noted by Gorman (1970), the maximum VIF rises to 158533 (for bz and bs); (iii) the Scheffé pseudocomponent model gives a maximum VIF of 1612 (for $b_4$), i.e., the pseudocomponent transformation has not been very successful in this example; (iv) the intercept pseudocomponent forms with $x_1$, $x_2$ or $x_3$ deleted are moderately effective, with maximum VIF in the region of 650, but this rises to 20408 with $x_4$ deleted; (v) irrespective of choice of deleted term, the centered intercept models are superior to all others, with maximum VIF = 102 (for $b_{24}$ and $b_{34}$).

We have not made an extensive study of which term to delete to form the intercept model. It appears, however, that deletion of the linear term for the component with the largest range is a good procedure. This approach does the most to break up the multicollinearity caused by the fact that the x's add to one. Deletion of a component that varies over a small range results in near multicollinearity because there is little variation in the sum of the remaining x's. In the poultry feed example, $x_2$, whose deletion produced the largest VIF (Table 6), has the smallest range ($0 \le x_2 \le 0.3$), and so $x_1 + x_3$ varies only between 0.7 and 1. The flare experiment is a more revealing example since $0.03 \le x_4 \le 0.08$ implies $0.92 \le x_1 + x_2 + x_3 \le 0.97$, i.e., $x_1 + x_2 + x_3$ is nearly constant, with the result that deletion of $x_4$ produces a very high VIF (158533). As noted above, the centered intercept model works very well in this example irrespective of which term is deleted.

Fitting Becker's Models

Becker's (1968) models are of the form

$$
\begin{aligned}
\text{H1:}\quad E(y) &= \sum_{i=1}^{q} \beta_i x_i + \sum_{i<j} \beta_{ij} \min(x_i, x_j) \\
\text{H2:}\quad E(y) &= \sum_{i=1}^{q} \beta_i x_i + \sum_{i<j} \beta_{ij}\, x_i x_j/(x_i + x_j) \\
\text{H3:}\quad E(y) &= \sum_{i=1}^{q} \beta_i x_i + \sum_{i<j} \beta_{ij} (x_i x_j)^{1/2}
\end{aligned}
$$

where $x_i x_j/(x_i + x_j) = 0$ when $x_i + x_j = 0$. These models do not contain a constant term, hence the same device of replacing one of the linear terms by a constant term as in model (4) can be used to produce an intercept model to reduce roundoff errors. Because of the nature of the curvilinear terms in Becker's models, the intercept model cannot be centered without changing the mathematical form of the model. The effect is the same as that of the pseudocomponent transformation, which can alter the fit of Becker's models [Snee (1973a)] and can reduce the magnitudes of the VIF's. This is not necessarily a problem because the fit of the model (i.e., $R_A^2$) can increase or decrease [Snee (1973a)].

The examples in Tables 2 and 3 provide some useful illustrations of what can happen when fitting Becker's models in intercept form. In the vegetable oil blending study (Table 2) the maximum VIF for Becker's H3 model was reduced from 371456 to between 494 and 798, depending on which linear term was deleted. Similarly, in the poultry feed blending study (Table 3) the maximum VIF for Becker's H2 model was reduced from 90.94 to between 22.88 and 86.91. As noted above, centering and the pseudocomponent transformation could be used to reduce the VIF's further, at the risk of altering the fit of Becker's models [see Snee (1973a)]. We will not pursue this point further here.
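The intercept device for Becker's models can be sketched numerically. The following is a minimal illustration, not the authors' program: the function names, the simplex-centroid design and the response values are all hypothetical, and NumPy's least-squares routine stands in for whatever regression program is available. It builds the blending terms of H1, H2 and H3 (with the H2 term set to zero when $x_i + x_j = 0$) and fits H2 both in its original form and with the linear term in $x_3$ replaced by a constant.

```python
import numpy as np
from itertools import combinations


def becker_terms(X, kind):
    """Pairwise nonlinear blending columns for Becker's H1, H2 or H3 model."""
    X = np.asarray(X, dtype=float)
    cols = []
    for i, j in combinations(range(X.shape[1]), 2):
        xi, xj = X[:, i], X[:, j]
        if kind == "H1":
            cols.append(np.minimum(xi, xj))
        elif kind == "H2":
            s = xi + xj
            # x_i x_j / (x_i + x_j), taken as 0 when x_i + x_j = 0
            denom = np.where(s > 0, s, 1.0)
            cols.append(np.where(s > 0, xi * xj / denom, 0.0))
        elif kind == "H3":
            cols.append(np.sqrt(xi * xj))
        else:
            raise ValueError("kind must be 'H1', 'H2' or 'H3'")
    return np.column_stack(cols)


def zero_intercept_design(X, kind):
    """Columns of the original Becker model (no constant term)."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([X, becker_terms(X, kind)])


def intercept_design(X, kind, drop):
    """Same model with linear term `drop` (0-based) replaced by a constant."""
    X = np.asarray(X, dtype=float)
    ones = np.ones(X.shape[0])
    return np.column_stack([ones, np.delete(X, drop, axis=1),
                            becker_terms(X, kind)])


# Hypothetical three-component simplex-centroid design and responses.
X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
              [1 / 3, 1 / 3, 1 / 3]])
y = np.array([10.0, 12.0, 9.0, 14.0, 11.0, 13.0, 12.5])

fits = {}
for form, D in [("zero-intercept", zero_intercept_design(X, "H2")),
                ("intercept", intercept_design(X, "H2", drop=2))]:
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    fits[form] = D @ coef  # fitted values for this form of the model
```

Because the constant column equals the sum of the three linear columns, the two forms span the same column space and the fitted values in `fits` agree; only the parametrization, and hence the VIF's, differ.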

Concluding Remarks

We have pointed out that numerical accuracy of mixture model regression calculations is a critical problem and have shown how variance inflation factors can be used to determine whether the numerical accuracy of a given combination of regression program and computer is adequate. We do not recommend that both the Scheffé and intercept models be fit routinely. A comparison of the results of these two models in a few instances should give one a feel for the numerical accuracy capabilities of a given program and suggest which model should be fit on a routine basis.

Another point that should be taken into account when deciding on which model form to use routinely in the analysis of mixture data is that a regression program capable of fitting a zero-intercept model may not compute the correct $F$ and $R_A^2$ statistics [see Marquardt and Snee (1974)]. This can be resolved by comparing the results from fitting the Scheffé and intercept models, since, if roundoff errors are not a problem, the calculated $F$ and $R_A^2$ statistics should agree. If they do not, the zero-intercept program must be computing the wrong statistics, which will be too large. In that case the correct values may be calculated from

$$
R_A^2 = 1 - s_e^2/s_y^2
$$

$$
F = 1 + [R_A^2/(1 - R_A^2)][(n - 1)/(p - 1)]
$$

where $s_e^2$ = residual variance and $s_y^2$ = total variance (i.e., the variance of $y$ about $\bar{y}$) in the regression analysis of variance.

When the experimental region is an unconstrained simplex, because of the moderate VIF's to be expected [see Table 2 in Snee (1973b)], there should be no problem in fitting the Scheffé model directly and no problem in interpreting its coefficients. The Scheffé model is the preferred final form of the fitted model because of the interpretational value of its coefficients; hence, if an intercept model is used because of the absence of a zero-intercept regression program, conversion to the Scheffé coefficients by means of (6) or (8) will have to be made or incorporated into the program. On the other hand, if the interpretation is to be based on contour plots of the fitted response surface or predicted values for specific mixtures, no such conversion is necessary.

In the case of the constrained simplex the preferred final form of the fitted model for purposes of coefficient interpretation is the pseudocomponent Scheffé model (10). This model can be fitted directly with adequate computational accuracy in many instances. If direct fitting does give high VIF's, or if a zero-intercept program is unavailable, the alternative is one of the centered intercept models (7) or (12), which have identical VIF's. Model (7) might be preferred if the available intercept regression program does not permit the preliminary transformation of the $x_i$ to pseudocomponents. No conversion to the pseudocomponent Scheffé form is necessary in either case unless interpretation of the Scheffé coefficients is of interest. In that case the conversions of (11) and (12) to the Scheffé pseudocomponent form (10) correspond exactly to the conversions of (5) and (7) to the Scheffé form (4). The conversion of the fitted coefficients of (7) to those of (10) requires the very similar but slightly more complicated formulae

$$
\begin{aligned}
c_1 &= m + (1 - L)\,[a_1^c - (\bar{x}_2 - L_2)\,a_{12}^c - (\bar{x}_3 - L_3)\,a_{13}^c] \\
c_2 &= m + (1 - L)\,[a_2^c - (\bar{x}_1 - L_1)\,a_{12}^c - (\bar{x}_3 - L_3)\,a_{23}^c] \\
c_3 &= m - (1 - L)\,[(\bar{x}_1 - L_1)\,a_{13}^c + (\bar{x}_2 - L_2)\,a_{23}^c] \\
c_{12} &= (1 - L)^2 a_{12}^c \\
c_{13} &= (1 - L)^2 a_{13}^c \\
c_{23} &= (1 - L)^2 a_{23}^c
\end{aligned}
$$

with

$$
\begin{aligned}
m = a_0^c &- (\bar{x}_1 - L_1)\,a_1^c - (\bar{x}_2 - L_2)\,a_2^c + (\bar{x}_1 - L_1)(\bar{x}_2 - L_2)\,a_{12}^c \\
&+ (\bar{x}_1 - L_1)(\bar{x}_3 - L_3)\,a_{13}^c + (\bar{x}_2 - L_2)(\bar{x}_3 - L_3)\,a_{23}^c
\end{aligned}
$$

where $a_i^c$ and $c_i$ are the estimates of $\alpha_i^c$ and $\gamma_i$, respectively.
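The conversion from the centered intercept model to the pseudocomponent Scheffé model can be checked numerically. The sketch below assumes that model (7) is the centered quadratic intercept model with $x_3$ deleted, with terms $(x_i - \bar{x}_i)$ and their products, and that model (10) is the quadratic Scheffé model in the pseudocomponents $(x_i - L_i)/(1 - L)$, where $L$ is the sum of the lower bounds $L_i$. The lower bounds, means and coefficients are hypothetical values chosen only for the check, and the function names are not from the paper.

```python
import numpy as np


def centered_to_pseudocomponent(a, xbar, low):
    """Convert centered intercept coefficients (a0, a1, a2, a12, a13, a23),
    x3 deleted, to pseudocomponent Scheffe coefficients
    (c1, c2, c3, c12, c13, c23)."""
    a0, a1, a2, a12, a13, a23 = a
    d1, d2, d3 = np.asarray(xbar, dtype=float) - np.asarray(low, dtype=float)
    s = 1.0 - float(np.sum(low))  # 1 - L, with L the sum of the lower bounds
    m = (a0 - d1 * a1 - d2 * a2
         + d1 * d2 * a12 + d1 * d3 * a13 + d2 * d3 * a23)
    c1 = m + s * (a1 - d2 * a12 - d3 * a13)
    c2 = m + s * (a2 - d1 * a12 - d3 * a23)
    c3 = m - s * (d1 * a13 + d2 * a23)
    return np.array([c1, c2, c3, s**2 * a12, s**2 * a13, s**2 * a23])


def predict_centered(a, x, xbar):
    """Prediction from the centered intercept model (x3 linear term deleted)."""
    a0, a1, a2, a12, a13, a23 = a
    u1, u2, u3 = np.asarray(x, dtype=float) - np.asarray(xbar, dtype=float)
    return a0 + a1 * u1 + a2 * u2 + a12 * u1 * u2 + a13 * u1 * u3 + a23 * u2 * u3


def predict_pseudocomponent(c, x, low):
    """Prediction from the Scheffe model in pseudocomponents."""
    low = np.asarray(low, dtype=float)
    z1, z2, z3 = (np.asarray(x, dtype=float) - low) / (1.0 - low.sum())
    c1, c2, c3, c12, c13, c23 = c
    return (c1 * z1 + c2 * z2 + c3 * z3
            + c12 * z1 * z2 + c13 * z1 * z3 + c23 * z2 * z3)


# Hypothetical lower bounds, design means and centered coefficients.
low = [0.2, 0.1, 0.3]
xbar = [0.35, 0.25, 0.40]
a = [5.0, 2.0, -1.5, 8.0, -3.0, 4.0]

c = centered_to_pseudocomponent(a, xbar, low)

# Mixtures (proportions sum to one) at which the two forms are compared.
mixtures = [[0.3, 0.3, 0.4], [0.25, 0.15, 0.6], [0.6, 0.1, 0.3]]
gaps = [predict_centered(a, x, xbar) - predict_pseudocomponent(c, x, low)
        for x in mixtures]
```

If the formulae are applied correctly, every entry of `gaps` is zero to within roundoff, since the two parametrizations describe the same response surface on the mixture space.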

An important further consideration is that, regardless of the model used to achieve computational accuracy, it is the VIF's of the Scheffé models (4) and (10) which are important for purposes of interpretation. While a VIF less than 100 implies acceptable numerical accuracy, a VIF greater than 10 suggests that the associated coefficient may be poorly estimated [see Marquardt (1970)]. In such a situation it may be wise to base the interpretation of the experiment on contour plots and model predictions. One should, of course, always be careful when predicting outside the region of the experiment, particularly when some of the VIF's are large. Coefficient interpretation is too broad a subject to discuss here. We refer the reader to Marquardt (1970 and 1980), Marquardt and Snee (1975) and Snee (1973b) for guidance on the use of the VIF in evaluating regression coefficients.

Finally, when designing blending studies the VIF's for proposed models can be ascertained before the experiment is conducted, so that any computational problems can be identified and resolved beforehand. Changes to design points may even be a possibility to consider. The procedures discussed by Snee (1975 and 1979) and Snee and Marquardt (1974) have been shown to produce designs with good statistical properties. Large VIF's are not uncommon when the region of experimentation is a constrained mixture space (e.g., Figure 3). In some instances changes in the number or positions of the design points will have little effect on the VIF's.
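Because the VIF's depend only on the design and the model, they can be computed before any response is measured. The following minimal sketch assumes the usual definition for a model with a constant term, namely the diagonal of the inverse of the correlation matrix of the predictors. It is restricted to a first-degree intercept model so that the result can be verified by hand, and the six-point design is hypothetical.

```python
import numpy as np


def vif(Z):
    """VIFs for the predictor columns of Z in a model that also contains a
    constant: the diagonal of the inverse predictor correlation matrix."""
    R = np.corrcoef(np.asarray(Z, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(R))


# Hypothetical design: x1 at three levels crossed with a narrow-range x3;
# x2 makes up the remainder so that each row sums to one.
design = np.array([[x1, 1.0 - x1 - x3, x3]
                   for x3 in (0.03, 0.08)
                   for x1 in (0.2, 0.5, 0.8)])

# Maximum VIF of the linear intercept model for each choice of deleted term.
max_vif = {}
for drop in range(3):
    Z = np.delete(design, drop, axis=1)
    max_vif[f"x{drop + 1} deleted"] = float(vif(Z).max())
```

For this design the maximum VIF is 97 when the narrow-range component $x_3$ is deleted and close to 1 when $x_1$ or $x_2$ is deleted, in line with the observation that deleting a component with a small range leaves the remaining x's nearly collinear.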

Appendix

Application of Linear Model Theory to Mixture Models

The Scheffé mixture model and a class of equivalent intercept models have been discussed in this paper. Through the use of linear model theory, these models and the associated statistics can be conveniently represented as special cases of the full polynomial model. Scheffé (1958) considered the full polynomial of degree $d$ as a response model for mixture experiments with $q$ components. For $q = 3$ and $d = 2$ this model is as follows.

$$
E(y) = \beta_0^* + \beta_1^* x_1 + \beta_2^* x_2 + \beta_3^* x_3 + \beta_{12}^* x_1 x_2 + \beta_{13}^* x_1 x_3 + \beta_{23}^* x_2 x_3 + \beta_{11}^* x_1^2 + \beta_{22}^* x_2^2 + \beta_{33}^* x_3^2 \tag{13}
$$

In the presence of the constraints (1) on the $x_i$'s, the observation matrix $X^*$ for (13) has rank six instead of ten, i.e., (13) contains four surplus parameters. Because of this singularity Scheffé proposed his canonical form for the model, which for the above case is given by (4). In addition to being full rank, the Scheffé model contains parameters with simple and meaningful interpretations. The use of asterisks in (13) rather than in (4) to distinguish the parameters is therefore not undeserved, even though (4) is derived from (13). Actually (4) is a full-rank reparametrization [see Pringle and Rayner (1971), pp. 88-90] of (13), the relationship between the parameters of (4) and those of (13) being given by $\beta = T\beta^*$ where, for $q = 3$ and $d = 2$, $T$ is shown below. The rows of $T$ correspond to $\beta_1, \beta_2, \beta_3, \beta_{12}, \beta_{13}, \beta_{23}$ and the columns to the parameters of (13) in the order in which they appear there.

$$
T = \left[\begin{array}{rrrr|rrrrrr}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ \cline{1-4}
0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & -1
\end{array}\right]
$$

The leading submatrix of $T$ (the upper-left $3 \times 4$ block, set off by the partition lines) is the corresponding matrix for $q = 3$, $d = 1$. Since (4) is a full-rank model, $\beta$ is estimable, as is also apparent from the fact that the columns of $T$ obey the same four constraints as the columns of $X^*$. Hence, if $b^*$ is any solution to the normal equations

$$
(X^*)'X^* b^* = (X^*)'y \tag{14}
$$

the best linear unbiased (least squares) estimator of $\beta$ is given by $Tb^*$:

$$
b = Tb^* \tag{15}
$$
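The relationship (15) can be illustrated numerically. In the sketch below, $T$ is written out from the relations $\beta_i = \beta_0^* + \beta_i^* + \beta_{ii}^*$ and $\beta_{ij} = \beta_{ij}^* - \beta_{ii}^* - \beta_{jj}^*$, which follow from substituting the mixture constraint into (13). The design and response values are hypothetical, and NumPy's least-squares routine, which returns the minimum-norm solution for a rank-deficient matrix, is used as one convenient way of obtaining a solution of (14).

```python
import numpy as np

# Rows: beta_1, beta_2, beta_3, beta_12, beta_13, beta_23.
# Columns: beta*_0, beta*_1, beta*_2, beta*_3, beta*_12, beta*_13, beta*_23,
#          beta*_11, beta*_22, beta*_33.
T = np.array([
    [1, 1, 0, 0, 0, 0, 0,  1,  0,  0],
    [1, 0, 1, 0, 0, 0, 0,  0,  1,  0],
    [1, 0, 0, 1, 0, 0, 0,  0,  0,  1],
    [0, 0, 0, 0, 1, 0, 0, -1, -1,  0],
    [0, 0, 0, 0, 0, 1, 0, -1,  0, -1],
    [0, 0, 0, 0, 0, 0, 1,  0, -1, -1],
], dtype=float)


def scheffe_columns(X):
    """Columns of the quadratic Scheffe model for three components."""
    x1, x2, x3 = np.asarray(X, dtype=float).T
    return np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])


def full_polynomial_columns(X):
    """Columns of the full second-degree polynomial (ten terms)."""
    x1, x2, x3 = np.asarray(X, dtype=float).T
    return np.column_stack([np.ones_like(x1), x1, x2, x3,
                            x1 * x2, x1 * x3, x2 * x3,
                            x1**2, x2**2, x3**2])


# Hypothetical simplex-centroid design and responses.
X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
              [1 / 3, 1 / 3, 1 / 3]])
y = np.array([10.0, 12.0, 9.0, 14.0, 11.0, 13.0, 12.5])

X_star = full_polynomial_columns(X)
rank = np.linalg.matrix_rank(X_star)  # six, not ten

# One solution of the normal equations (the minimum-norm one).
b_star, *_ = np.linalg.lstsq(X_star, y, rcond=1e-10)

# Another solution: add a multiple of a null vector of X_star
# (the constant column equals x1 + x2 + x3 on the mixture space).
null_vec = np.array([1, -1, -1, -1, 0, 0, 0, 0, 0, 0], dtype=float)
b_star_alt = b_star + 7.0 * null_vec

# Direct fit of the Scheffe model for comparison.
b_direct, *_ = np.linalg.lstsq(scheffe_columns(X), y, rcond=None)
```

Both `T @ b_star` and `T @ b_star_alt` reproduce `b_direct`, which is the sense in which any solution of (14) leads to the same Scheffé estimates.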

A convenient way of solving (14) is by imposing four linearly independent linear constraints on $b^*$, $Cb^* = c$, where $C$ is complementary to $(X^*)'X^*$ and $c$ is a combination of the columns of $C$ [see Rayner and Pringle (1976)]. Particularly convenient constraints are ones in which four of the elements of $b^*$ are taken to be zero [see Searle (1971), pp. 213-215]. This may be done by striking out the corresponding rows and columns of $(X^*)'X^*$ and elements of $(X^*)'y$. It can equivalently be achieved by deleting four of the surplus parameters of (13) and estimating the parameters of the reduced model. Deletion of $\beta_3^*, \beta_{11}^*, \beta_{22}^*, \beta_{33}^*$ produces the intercept model (5) with $x_3$ deleted. Deletion of $\beta_0^*, \beta_{11}^*, \beta_{22}^*, \beta_{33}^*$ produces the Scheffé model (4). Other variations on this theme are deletion of: $\beta_2^*, \beta_{11}^*, \beta_{22}^*, \beta_{33}^*$ ($x_2$ deleted); $\beta_1^*, \beta_{11}^*, \beta_{22}^*, \beta_{33}^*$ ($x_1$ deleted); $\beta_3^*, \beta_{13}^*, \beta_{23}^*, \beta_{33}^*$ (slack variable model (9) with $x_3$ as the slack variable); or even $\beta_3^*, \beta_{12}^*, \beta_{13}^*, \beta_{23}^*$ (retention of squared terms instead of product terms in (5)). When centered, this latter model gave lower VIF's than (7) for the poultry feed data.

Rayner and Pringle (1976) present a general formula for obtaining a solution subject to one set of constraints from the solution subject to a second set. This would permit the Scheffé coefficients to be calculated from the intercept solution, but the simplest method here is to use (15). For example, for the poultry feed data the least-squares estimates of the parameters of (5) are the following.

$$
a = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_{12} \\ a_{13} \\ a_{23} \end{bmatrix}
  = \begin{bmatrix} 380.70 \\ -135.89 \\ -433.27 \\ 895.17 \\ 121.49 \\ 195.55 \end{bmatrix}
$$

The corresponding $(b^*)'$ is $[a_0 \;\; a_1 \;\; a_2 \;\; 0 \;\; a_{12} \;\; a_{13} \;\; a_{23} \;\; 0 \;\; 0 \;\; 0]$, from which $Tb^*$ may be calculated, but it is obviously easier to delete the columns of $T$ corresponding to the zeros in $b^*$ to produce a reduced matrix $T^*$ and to calculate $b = T^*a$. This produces the conversion formulae

$$
\begin{aligned}
b_1 &= a_0 + a_1 = 244.81 \\
b_2 &= a_0 + a_2 = -52.57 \\
b_3 &= a_0 = 380.70
\end{aligned}
$$

plus $b_{12} = a_{12}$, $b_{13} = a_{13}$ and $b_{23} = a_{23}$, in correspondence with (6).

The centered and pseudocomponent models, (7) and (10), may be handled similarly. However, in deriving conversion formulae corresponding to (8), it will be found that, when reduced to ordinary multiple regression form, (7) contains a term in $x_3$, even though $x_3$ is deleted. Therefore, the fourth column of $T$ may not be deleted. The same applies for the pseudocomponent model (10).

Acknowledgments

The authors wish to thank Dr. R. M. Gous, for his permission to use the poultry feed blending data in this study, and Dr. L. B. Hare and R. J. Leofsky, for their assistance in obtaining the vegetable oil blending data. The authors express their appreciation to D. W. Marquardt for helpful comments, to A. M. Lloyd and M. P. Rayner, for their assistance with some of the computations reported in this paper, and to the referees, whose comments aided in improving the presentation of the paper.

References

BEATON, A. E., RUBIN, D. B. and BARONE, J. L. (1976), "The Acceptability of Regression Solutions," Journal of the American Statistical Association, Vol. 71, No. 353, pp. 158-168.

BECKER, N. G. (1968), "Models for the Response of a Mixture," Journal of the Royal Statistical Society, Series B, Vol. 30, pp. 349-358.

BELSLEY, D. A., KUH, E. and WELSCH, R. E. (1980), Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, Wiley-Interscience, John Wiley & Sons, Inc., New York, New York.

BERK, K. N. (1977), "Tolerance and Condition in Regression Computations," Journal of the American Statistical Association, Vol. 72, No. 360, pp. 863-866.

BRADLEY, R. A. and SRIVASTAVA, S. S. (1979), "Correlation in Polynomial Regression," The American Statistician, Vol. 33, No. 1, pp. 11-14.

BURCHFIELD, P. B. (1971), "Multiple Linear Regression," Journal of Quality Technology, Vol. 3, No. 4, pp. 184-189.

CHUN, D. (1968), "A Note on a Regression Transformation for Smaller Roundoff Error," Technometrics, Vol. 10, No. 2, pp. 393-396.

CLARINGBOLD, P. J. (1955), "Use of the Simplex Design in the Study of the Joint Action of Related Hormones," Biometrics, Vol. 11, pp. 174-185.

CORNELL, J. A. (1973), "Experiments with Mixtures: A Review," Technometrics, Vol. 15, No. 2, pp. 437-455.

CORNELL, J. A. (1979), "Experiments with Mixtures: An Update and Bibliography," Technometrics, Vol. 21, No. 1, pp. 95-106.

CORNELL, J. A. (1981), Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data, Wiley-Interscience, John Wiley & Sons, Inc., New York, New York.


CORNELL, J. A. and GORMAN, J. W. (1978), "On the Detection of an Additive Blending Component in Multicomponent Mixtures," Biometrics, Vol. 34, pp. 251-263.

DANIEL, C. and WOOD, F. S. (1980), Fitting Equations to Data, 2nd ed., Wiley-Interscience, John Wiley & Sons, New York, New York.

DRAPER, N. R. and SMITH, H. (1981), Applied Regression Analysis, 2nd ed., John Wiley & Sons, New York, New York.

FREUND, R. J. (1963), "A Warning of Roundoff Errors in Regression," The American Statistician, Vol. 17, No. 5, pp. 13-15.

GORMAN, J. W. (1970), "Fitting Equations to Mixture Data with Restraints on Compositions," Journal of Quality Technology, Vol. 2, No. 4, pp. 186-194.

GORMAN, J. W. and HINMAN, J. E. (1962), "Simplex Lattice Designs for Multicomponent Systems," Technometrics, Vol. 4, No. 4, pp. 463-487.

GRAYBILL, F. A. (1976), Theory and Application of the Linear Model, Duxbury Press, North Scituate, Massachusetts.

HARE, L. B. (1974), "Mixture Designs Applied to Food Formulation," Food Technology, Vol. 28, pp. 50-56 and 62.

KUROTORI, I. S. (1966), "Experiments with Mixtures of Components Having Lower Bounds," Industrial Quality Control, Vol. 22, No. 11, pp. 592-596.

LONGLEY, J. W. (1967), "An Appraisal of Least Squares Programs for the Electronic Computer from the Point of View of the User," Journal of the American Statistical Association, Vol. 62, No. 319, pp. 819-841.

MARQUARDT, D. W. (1970), "Generalized Inverses, Ridge Regression, Biased Linear Estimation and Nonlinear Estimation," Technometrics, Vol. 12, No. 3, pp. 591-612.

MARQUARDT, D. W. (1980), "You Should Standardize the Predictor Variables in Your Regression Models," Discussion of "A Critique of Some Ridge Regression Methods" by G. Smith and F. Campbell, Journal of the American Statistical Association, Vol. 75, No. 369, pp. 87-91.

MARQUARDT, D. W. and SNEE, R. D. (1974), "Test Statistics for Mixture Models," Technometrics, Vol. 16, No. 4, pp. 533-537.

MARQUARDT, D. W. and SNEE, R. D. (1975), "Ridge Regression in Practice," The American Statistician, Vol. 29, No. 1, pp. 3-20.

MASON, R. L., GUNST, R. F. and WEBSTER, J. T. (1975), "Regression Analysis and Problems of Multicollinearity," Communications in Statistics, Vol. 4, pp. 277-292.

McLEAN, R. A. and ANDERSON, V. L. (1966), "Extreme Vertices Design of Mixture Experiments," Technometrics, Vol. 8, No. 3, pp. 447-454.

MYERS, R. H. (1971), Response Surface Methodology, Allyn and Bacon, Inc., Boston, Massachusetts.

MULLET, G. M. and MURRAY, T. W. (1971), "New Method for Examining Rounding Error in Least-Squares Regression Computer Programs," Journal of the American Statistical Association, Vol. 66, No. 335, pp. 496-498.

NARCY, J. P. and RENAUD, J. (1972), "Use of Simplex Experimental Designs in Detergent Formulation," Journal of the American Oil Chemists' Society, Vol. 49, pp. 498-608.

NIE, N. H., HULL, C. H., JENKINS, J. G., STEINBRENNER, K. and BRENT, D. H. (1975), SPSS Statistical Package for the Social Sciences, 2nd ed., McGraw-Hill Book Company, New York, New York.

PRINGLE, R. M. and RAYNER, A. A. (1971), Generalized Inverse Matrices with Applications to Statistics, Griffin's Statistical Monographs and Courses, Vol. 28, Hafner, New York, New York.

RAYNER, A. A. and PRINGLE, R. M. (1976), "Some Aspects of the Solution of Singular Normal Equations with the Use of Linear Restrictions," SIAM Journal of Applied Mathematics, Vol. 31, pp. 449-460.

SCHEFFÉ, H. (1958), "Experiments with Mixtures," Journal of the Royal Statistical Society, Series B, Vol. 20, pp. 344-360.

SCHEFFÉ, H. (1963), "The Simplex-Centroid Design for Experiments with Mixtures," Journal of the Royal Statistical Society, Series B, Vol. 25, pp. 235-263.

SEARLE, S. R. (1971), Linear Models, John Wiley & Sons, New York, New York.

SNEE, R. D. (1971), "Design and Analysis of Mixture Experiments," Journal of Quality Technology, Vol. 3, No. 4, pp. 159-169.

SNEE, R. D. (1973a), "Techniques for the Analysis of Mixture Data," Technometrics, Vol. 15, No. 3, pp. 517-528.

SNEE, R. D. (1973b), "Some Aspects of Nonorthogonal Data Analysis, Part I. Developing Prediction Equations," Journal of Quality Technology, Vol. 5, No. 2, pp. 67-79.

SNEE, R. D. (1975), "Experimental Designs for Quadratic Models in Constrained Mixture Spaces," Technometrics, Vol. 17, No. 2, pp. 149-159.

SNEE, R. D. (1979), "Experimental Designs for Mixture Systems with Multicomponent Constraints," Communications in Statistics, Vol. A8(4), pp. 303-326.

SNEE, R. D. and MARQUARDT, D. W. (1974), "Extreme Vertices Designs for Linear Mixture Models," Technometrics, Vol. 16, No. 3, pp. 399-408.

SNEE, R. D. and MARQUARDT, D. W. (1976), "Screening Concepts and Designs for Experiments with Mixtures," Technometrics, Vol. 18, No. 1, pp. 19-29.

WAMPLER, R. H. (1970), "A Report on the Accuracy of Some Widely Used Least Squares Computer Programs," Journal of the American Statistical Association, Vol. 65, No. 330, pp. 549-565.

WATSON, G. S. (1969), "Linear Regression on Proportions," Biometrics, Vol. 25, pp. 585-588.

