HIGH LEVERAGE POINT: ANOTHER SOURCE OF MULTICOLLINEARITY

Md. Kamruzzaman & A.H.M. Rahmatullah Imon
Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh

ABSTRACT

Multicollinearity often causes a serious interpretative problem in linear regression analysis. There is a large body of literature on the sources of multicollinearity. In our work we identify another important and almost inevitable source of multicollinearity: the existence of high leverage points in a linear model. Leverage values play a very important role in regression analysis. They often form the basis of regression diagnostics as measures of influential observations in the explanatory variables. It is generally believed that high leverage points are responsible for masking outliers in linear regression. But we observe that high leverage points may also be responsible for causing multicollinearity. We present a few examples and figures which draw attention to this problem. We then investigate the nature and extent of the multicollinearity caused by high leverage points through Monte Carlo simulation experiments.

KEYWORDS

Multicollinearity, high leverage points, masking, swamping, Monte Carlo simulation.

1. INTRODUCTION

In linear regression analysis we often face the problem of multicollinearity. It causes major interpretive problems, such as the wrong sign problem, unstable and inconsistent estimates of the parameters, and declaring a regression insignificant when in fact it is significant. A variety of sources of multicollinearity are discussed in the literature, but so far as we know another important source of multicollinearity, the existence of high leverage points, has not received much attention. In regression diagnostics we are usually concerned with the identification of high leverage points together with outliers and influential observations. But in our work we mainly deal with the cases where high leverage points are responsible for causing multicollinearity.

In Section 2 we introduce the concept of multicollinearity in linear regression. We briefly discuss the nature and sources of multicollinearity. The unfortunate consequences of the presence of multicollinearity are also mentioned. In Section 3 we focus on high leverage points in regression diagnostics because they are responsible for causing masking/swamping of outliers. We introduce the term leverage and briefly discuss some of the commonly used measures of leverage. We suspect high leverage points as a source of multicollinearity in Section 4. We present a few examples and figures which show how high leverage points become responsible for causing multicollinearity. Then we report a Monte Carlo simulation study designed to investigate how high leverage points behave as a source of multicollinearity. We observe how a single high leverage point causes multicollinearity. We also investigate whether the existing methods can successfully detect


high leverage points, and the behaviour of multicollinearity when the high leverage points thus identified are omitted from the regression model. We also extend this experiment to the cases where a group of high leverage points is present.

2. CONCEPT OF MULTICOLLINEARITY

In a linear regression model, if there is no relationship between the regressors, they are said to be orthogonal. When the regressors are orthogonal, inferences such as identifying the relative effects of the regressors, prediction or estimation, and selection of an appropriate set of variables for the model can be made relatively easily. Unfortunately, in most applications of regression the regressors are not orthogonal. Sometimes the lack of orthogonality is not serious. However, in some situations the regressors are nearly perfectly linearly related, and in such cases the resulting inferences can be erroneous. When there are near linear dependencies between the regressors, the problem of multicollinearity is said to exist.

We write the multiple regression model as

$$Y = X\beta + \epsilon \qquad (2.1)$$

where Y is an n × 1 vector of responses (dependent variables), X is an n × k (n > k) matrix of predictors (explanatory variables) possibly including one constant predictor, β is a k × 1 vector of unknown finite parameters to be estimated, and ε is an n × 1 vector of random disturbances. When the OLS method is employed to estimate the regression parameters we obtain

$$\hat{\beta} = (X^T X)^{-1} X^T Y.$$

Let the j-th column of the X matrix be denoted Xj, so that X = [X1, X2, ..., Xk]. We formally define multicollinearity in terms of the linear dependence of the columns of X: the vectors X1, X2, ..., Xk are linearly dependent if there is a set of constants t1, t2, ..., tk, not all zero, such that

$$\sum_{j=1}^{k} t_j X_j = 0 \qquad (2.2)$$

If (2.2) holds exactly, we face the problem of perfect multicollinearity. The problem of near multicollinearity is said to exist when (2.2) holds approximately. We generally use the Ordinary Least Squares (OLS) technique to estimate the regression parameters β because of tradition and ease of computation. But the presence of multicollinearity has a number of serious effects on the OLS estimates of the regression coefficients. If there is strong multicollinearity between the explanatory variables Xi and Xj, then the correlation coefficient rij will be large, and this is responsible for the large variances and covariances of the OLS estimates of the parameters βi and βj. Multicollinearity also tends to produce OLS estimates that are too large in absolute value. Because of these two problems the partial t-test may declare a significant regression insignificant


or vice versa. When using multiple regression we occasionally experience an apparent contradiction of intuition or theory when one or more of the regression coefficients seem to have the wrong sign. This creates a serious interpretive problem, as it is really difficult to explain a negative estimate (say) of a parameter to the model user when that user believes that the coefficient should be positive. Mullet (1976) pointed out that multicollinearity is mainly responsible for this wrong sign problem.

There are several sources of multicollinearity. As Montgomery and Peck (1992) note, multicollinearity may be due to the following factors:

(a) The data collection method employed, for example, sampling over a limited range of the values taken by the regressors in the population.
(b) Constraints on the model or in the population being sampled. For example, in the regression of electricity consumption on income and house size there is a physical constraint in the population, since families with higher incomes generally have larger homes than families with lower incomes.
(c) Model specification, for example, adding polynomial terms to a regression model, especially when the range of the X variable is small.
(d) An overdetermined model. This happens when the model has more explanatory variables than the number of observations. This can occur in medical research, where there may be a small number of patients about whom information is collected on a large number of variables.

In Section 4 we introduce another source of multicollinearity which, so far as we know, has not been referred to in the literature: the existence of high leverage points in a data set.
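The variance inflation described above can be illustrated with a small numerical sketch. The code below is ours, not part of the paper; the variable names and the seed are illustrative. It compares the diagonal of (X^T X)^{-1}, which scales the variances of the OLS estimates, for an independent pair of regressors and for a nearly collinear pair.

```python
import numpy as np

# Minimal sketch (ours, not from the paper) of the variance inflation described
# above: the diagonal of (X^T X)^{-1} scales Var(beta_hat), and it blows up when
# two regressors are nearly linearly dependent.
rng = np.random.default_rng(0)
n = 50

x1 = rng.uniform(0, 1, n)
x2_indep = rng.uniform(0, 1, n)             # unrelated to x1
x2_collin = x1 + rng.normal(0, 0.01, n)     # almost a linear copy of x1

def variance_factors(x1, x2):
    """Diagonal of (X^T X)^{-1}; Var(beta_hat_j) = sigma^2 * diag_j."""
    X = np.column_stack([np.ones_like(x1), x1, x2])
    return np.diag(np.linalg.inv(X.T @ X))

print(variance_factors(x1, x2_indep))       # modest values
print(variance_factors(x1, x2_collin))      # very large values for the two slopes
```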

3. HIGH LEVERAGE POINTS AND THEIR MEASURES

Sometimes it is observed that a few observations belonging to the explanatory variables exert too much influence on the fitting of a regression model. We can re-express the general linear model (2.1) as

$$y_i = x_i^T \beta + \epsilon_i, \qquad i = 1, 2, \ldots, n \qquad (3.1)$$

where yi is the i-th observed response, xi is a k × 1 vector of predictors, β is a k × 1 vector of unknown finite parameters and the εi's are uncorrelated random disturbances. In regression analysis, since the random disturbances are unobserved, they are traditionally estimated by the OLS residuals, which are the differences between the observed and estimated responses. When the OLS method is used to fit the model, the i-th residual is given by

$$\hat{\epsilon}_i = y_i - \hat{y}_i = y_i - x_i^T \hat{\beta}, \qquad i = 1, 2, \ldots, n \qquad (3.2)$$

In matrix notation

$$\hat{\epsilon} = Y - X\hat{\beta} \qquad (3.3)$$

which can also be expressed as


$$\hat{\epsilon} = (I - W)\epsilon \qquad (3.4)$$

where W = X(X^T X)^{-1} X^T, which is generally known as the weight matrix or leverage matrix.

Influential observations in the X-space are generally termed high leverage points. According to Hocking and Pendleton (1983), high leverage points "are those observations for which the input vector xi is, in some sense, far from the rest of the data". The diagonal elements of W, denoted wii and defined by

$$w_{ii} = x_i^T (X^T X)^{-1} x_i, \qquad i = 1, 2, \ldots, n \qquad (3.5)$$

are called the leverage values; they measure how far the input vector xi is from the rest of the data. A high leverage point is an observation with a large wii in comparison with the other observations in the data set. Observations which are isolated in the X-space will have high leverage, and points with high leverage may also be regarded as outliers in the X-space. As we observe from (3.4), residuals are functions of leverages and disturbances, so high leverage points together with large disturbances (outliers) may pull the fitted least squares line in such a way that the fitted residuals corresponding to those outliers become too small. This may cause masking of outliers [see Peña and Yohai (1995)], and that is why the identification of high leverage points is really necessary. The opposite effect of masking is known as swamping [Barnett and Lewis (1994)], in which good observations appear to be outliers.

Much work has been done on the identification of high leverage points, and a good number of diagnostic measures are now available in the literature. Here we briefly discuss some commonly used measures of leverage. We have already mentioned that the i-th diagonal element wii of the weight matrix W is the traditionally used measure of the leverage of the response value yi on the corresponding fitted value ŷi. We know that the average value of wii is k/n, and data points having large wii values are generally considered high leverage points. But the immediate question is: how large is large? Hoaglin and Welsch (1978) considered observations as high leverage points when wii exceeds 2k/n; this is known as the "twice the mean rule". Velleman and Welsch (1981) consider wii large when it exceeds 3k/n, which is known as the "thrice the mean rule". We also know that for all i, wii lies between 0 and 1. For a definition of when wii is large, Huber (1981) suggested breaking this range into three intervals: observations with wii less than 0.2 are safe, those between 0.2 and 0.5 are risky, and those above 0.5 should be avoided.

Hadi (1992) pointed out that the traditionally used measures of leverage are not sensitive enough to high leverage points. In the presence of a high leverage point the weight matrix W may break down easily. In this situation none of the above methods may be effective in assessing the true leverages, and consequently the identification of high leverage points becomes complicated. He introduced a new type of measure, named potentials,


where the leverage of the i-th point is based on a fit to the data with the i-th case deleted, and is therefore more sensitive to high leverage points. Every possible subset of n − 1 observations is used to form a weight matrix, and the weight of each deleted observation in turn is generated externally; these externally generated weights are known as potentials. Although it seems that the calculation of potentials requires the construction of n weight matrices, it is possible to compute them from the wii's in a very simple way. We define the i-th potential as

$$p_{ii} = x_i^T \left( X_{(i)}^T X_{(i)} \right)^{-1} x_i \qquad (3.6)$$

where X(i) is the data matrix X with the i-th row deleted. Using the following result of Rao (1965), i.e.,

$$\left( X_{(i)}^T X_{(i)} \right)^{-1} = \left( X^T X - x_i x_i^T \right)^{-1} = \left( X^T X \right)^{-1} + \frac{\left( X^T X \right)^{-1} x_i x_i^T \left( X^T X \right)^{-1}}{1 - x_i^T \left( X^T X \right)^{-1} x_i}$$

it is easy to obtain a simple relationship between wii and pii as

$$p_{ii} = x_i^T (X^T X)^{-1} x_i + \frac{\left( x_i^T (X^T X)^{-1} x_i \right)^2}{1 - x_i^T (X^T X)^{-1} x_i} = \frac{w_{ii}}{1 - w_{ii}} \qquad (3.7)$$

Observations corresponding to excessively large potential values are considered high leverage points. Hadi (1992) proposed a cut-off point for pii based on Mean(pii) + c · St.dev.(pii), where c is an appropriately chosen constant such as 2 or 3. This implies that observations are declared high leverage points when

$$p_{ii} > \text{Mean}(p_{ii}) + c \cdot \text{St.dev.}(p_{ii}) \qquad (3.8)$$

This form is analogous to a confidence bound for a location parameter. The problem with a cut-off point like (3.8) is that both the mean and the variance of the pii may be non-robust in the presence of a single extreme value, yielding an excessively high cut-off point. To avoid this problem, the alternative suggestion of Hadi (1992) is to consider Median(pii) + c · MAD(pii), where the Median Absolute Deviation (MAD) is computed as

$$\text{MAD}(p_{ii}) = \text{Median}\{|p_{ii} - \text{Median}(p_{ii})|\} / 0.6745$$


Hence the observations declared high leverage points are those with

$$p_{ii} > \text{Median}(p_{ii}) + c \cdot \text{MAD}(p_{ii}) \qquad (3.9)$$
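The leverage rules described above (2M, 3M, Huber) and the potential cut-off (3.9) are straightforward to compute. The following Python sketch is ours, not the authors' code; it simply restates the formulas of this section, and the function name and the default c = 3 are illustrative choices.

```python
import numpy as np

def leverage_diagnostics(X, c=3.0):
    """Leverage measures of Section 3 (a sketch, not the authors' code).

    Computes the leverages w_ii of eq. (3.5), flags them by the twice/thrice-
    the-mean rules and by Huber's 0.5 bound, and computes Hadi's potentials
    p_ii = w_ii / (1 - w_ii), eq. (3.7), with the median/MAD cut-off (3.9).
    """
    n, k = X.shape
    W = X @ np.linalg.inv(X.T @ X) @ X.T           # weight (leverage) matrix W
    w = np.diag(W)                                 # leverage values w_ii

    flag_2m = w > 2 * k / n                        # twice-the-mean rule (2M)
    flag_3m = w > 3 * k / n                        # thrice-the-mean rule (3M)
    flag_huber = w > 0.5                           # Huber: w_ii > 0.5 "to be avoided"

    p = w / (1.0 - w)                              # potentials, eq. (3.7)
    mad = np.median(np.abs(p - np.median(p))) / 0.6745
    flag_potential = p > np.median(p) + c * mad    # cut-off (3.9)

    return w, p, flag_2m, flag_3m, flag_huber, flag_potential
```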

Throughout the experiment we use (3.9) as the detection criterion and call it the Potential method.

4. HIGH LEVERAGE POINTS AND MULTICOLLINEARITY

Here we anticipate that high leverage points may cause multicollinearity in linear regression. First we give a few examples in favour of our proposition. We consider an artificial three-predictor data set in which the X's are generated independently as Uniform(0,1), so that the pairwise correlation coefficients are very low (near 0). We then deliberately change or set some values of the X's so that they become points of high leverage. Several techniques have been proposed for detecting multicollinearity, among them (a) examination of the correlation matrix, (b) the variance inflation factor (VIF), (c) the condition number, and (d) eigenvalue decomposition. The correlation method is certainly not the best way of detecting multicollinearity, but we use it because it is very simple and easy to compute, and a high correlation always guarantees multicollinearity [Belsley (1991)].

Table 1: Artificial data set for different types of leverages

Index    X1       X2       X3     |  Index    X1       X2       X3
  1    0.9917   0.7067   0.9820   |   12    0.0859   0.3367   0.3921
  2    0.7006   0.8301   0.6167   |   13    0.9283   0.6090   0.4289
  3    0.3949   0.7586   0.8862   |   14    0.4926   0.8648   0.6607
  4    0.9618   0.5460   0.6400   |   15    0.4069   0.2590   0.2028
  5    0.0042   0.0504   0.7311   |   16    0.0227   0.1896   0.6677
  6    0.3044   0.3952   0.5180   |   17    0.4129   0.6420   0.3723
  7    0.3521   0.3140   0.8409   |   18    0.8919   0.8459   0.0634
  8    0.0993   0.9953   0.8676   |  19(a)  0.1397   0.5453   0.3354
  9    0.4072   0.4604   0.5702   |  20(a)  0.6295   0.5354   0.5630
 10    0.1105   0.8435   0.8864   |  19(b)  0.1397   0.5453   0.3354
 11    0.8282   0.9434   0.2216   |  20(b)  10.00    10.00    10.00

Table 1 presents three data sets, each containing 20 observations. In our first example, which we call example (a), observations for each of the three explanatory variables X1, X2 and X3 are generated independently as Uniform(0,1) using the RAND command in MINITAB 12, so this example does not contain any high leverage point. For this data set we obtain a maximum leverage value of wii = 0.410777, which is only moderate. We compute the correlation coefficients for the different pairs of X's and obtain r12 = 0.423, r13 = −0.217 and r23 = 0.043. None of these three correlation coefficients is high, which clearly indicates that multicollinearity does not arise when


data are generated independently with no high leverage points. For example (b) we set the value corresponding to observation no. 20 at 10 for each of the three X's, so that it becomes an example of a single high leverage point in a sample of size 20, with the highest leverage wii = 0.9940. Thus the 20th observation produces a very high leverage. Using the data given in this table we obtain r12 = 0.989, r13 = 0.977 and r23 = 0.985, which clearly indicates that the model is severely affected by multicollinearity. Data set (c) presents an example where 10% of the observations are points of high leverage. As the sample size is 20, we generate this data set with 2 high leverage points. The first 18 observations for each of the three explanatory variables are generated as U(0,1) and the last two observations for each of them are set at 10. Thus the 19th and 20th observations produce very high leverage, with wii = 0.498 each. For this data set we obtain r12 = 0.994, r13 = 0.989 and r23 = 0.992, which indicates that these high leverage points generate strong multicollinearity.
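Example (b) above can be reproduced, up to the random draws, with a few lines of code. The sketch below is ours (the paper used MINITAB); the seed is arbitrary, and we treat X as the three predictor columns without a constant column, so the numbers will be close to, but not exactly, those quoted.

```python
import numpy as np

# A sketch of example (b) above (our code, not the authors' MINITAB session).
# Because the uniform draws differ, the correlations will be close to, but not
# exactly, the values reported in the text.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(20, 3))        # 20 observations on X1, X2, X3
X[19] = [10.0, 10.0, 10.0]                 # observation 20 made a high leverage point

r = np.corrcoef(X, rowvar=False)           # pairwise correlations
print(r[0, 1], r[0, 2], r[1, 2])           # r12, r13, r23: all close to 1

w = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverage values w_ii, eq. (3.5)
print(w[19])                               # close to 1: a very high leverage
```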

[Figures 1, 2 and 3 about here: 3D scatter plots of X1, X2 and X3.]

Figures 1-3: 3D plots of the explanatory variables with no high leverage points, with a single high leverage point, and with 10% high leverage points.

Figures 1 to 3 give a further indication of how high leverage points become a source of multicollinearity. In these figures, 3D plots of the explanatory variables X1, X2 and X3 of Table 1 are presented. Figure 1 presents a 3D plot of the X's when no high leverage points are present in the data set. Since the X's are generated independently, this plot shows no sign of multicollinearity. But we observe from Figure 2 that there is a strong indication of multicollinearity when only one of the 20 observations is a point of high leverage. We observe from Figure 3 that the collinearity is even stronger when 2 observations in a sample of size 20 are points of high leverage.


Here we report a Monte Carlo simulation study designed to investigate how high leverage points behave as a source of multicollinearity. First we demonstrate how a single high leverage point causes multicollinearity. We also investigate whether the existing methods can successfully detect high leverage points and, if so, the behaviour of multicollinearity when the high leverage points thus identified are omitted from the regression model. We then extend this experiment to the cases where multiple high leverage points are present in the data.

As in the examples considered earlier, we generate a three-predictor artificial data set for the single high leverage and the multiple (10%) high leverage cases. For both designs we consider four different sample sizes (n = 20, 50, 100 and 200), and the pairwise correlation coefficients for these three sets of variables are computed. Throughout our simulation experiment we use four detection techniques, namely the twice the mean rule (2M), the thrice the mean rule (3M), the Huber method and Hadi's potential method, to identify high leverage points. The correlation coefficients r12, r13 and r23 are computed after omitting the observations identified by these four detection techniques. The results of these experiments are presented in Tables 2 and 3, each of which is based on 10000 simulations.

First we report the simulation results where the X variables contain a single high leverage point. The first (n−1) observations for each of the three explanatory variables are simulated as Uniform(0,1). The n-th observation for each of the X's is kept fixed at the same high leverage value; for all four sample sizes it is set at 10. The correlation coefficients of the X's, together with the results after excluding the suspected high leverage cases, are presented in Table 2. We observe from this table that for every n a single high leverage point can generate strong multicollinearity. When n = 20 the 2M, 3M and Potential methods correctly identify the high leverage case, and when the identified cases are omitted the correlation coefficients become very small. 2M swamps on average 2 good cases and produces slightly higher correlations. But Huber's method proves to be very sensitive: it identifies about 7 good cases as high leverage points, and the exclusion of those good points does not reduce the correlation coefficients as much as the other methods do. For n = 50, 100 and 200 we also observe from Table 2 that a single high leverage point can produce strong multicollinearity, but the magnitude of the correlation coefficients tends to decrease as the sample size increases, since the proportion of high leverage points becomes smaller for larger samples. The 2M, 3M, Huber and Potential methods correctly identify the high leverage case, and when the identified cases are omitted the correlation coefficients become very small. The 2M and Potential methods swamp on average 1 good case and produce slightly higher correlations than the other two methods.

We reuse the same design for the cases where 10% of the X's are points of equal high leverage. For each case the first 90% of the observations are simulated as Uniform(0,1), and the last 10% of the observations of X1, X2 and X3 are set at 10, so that these points are equally high leverage points.
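For concreteness, one replication of the single high leverage design described above, with detection by the twice-the-mean rule, can be sketched as follows. This is our illustration, not the authors' simulation code; the function name and seed are arbitrary, X is taken as the three predictor columns without a constant column, and the full study repeats this 10000 times for each sample size and detection rule (and analogously for the 10% case and the other rules).

```python
import numpy as np

# One replication of the single high leverage point design (a sketch, ours).
def one_replication(n, rng):
    X = rng.uniform(0, 1, size=(n, 3))
    X[-1] = 10.0                                    # the single high leverage point

    w = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverage values w_ii
    keep = w <= 2 * X.shape[1] / n                  # twice-the-mean rule (2M)

    r_full = np.corrcoef(X, rowvar=False)[0, 1]         # r12 on the full data
    r_clean = np.corrcoef(X[keep], rowvar=False)[0, 1]  # r12 after deleting flagged cases
    return int((~keep).sum()), r_full, r_clean      # identified cases, r12 before/after

rng = np.random.default_rng(2)
print(one_replication(50, rng))                     # compare with the n = 50 rows of Table 2
```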


Table 2: Correlation coefficients of the X's with a single high leverage point
("-----" rows give the correlations for the full data, before any suspected cases are deleted.)

Sample size   Measure      Identified cases     r12        r13        r23
n=20          -----             -----          0.9830     0.9827     0.9828
              2M                2.5580         0.0897     0.0652     0.0909
              3M                1.3210         0.0275     0.0091     0.0262
              Huber             8.3980         0.3642     0.3473     0.3746
              Potential         1.6720         0.0428     0.0271     0.0444
n=50          -----             -----          0.9565     0.9568     0.9565
              2M                2.0940         0.0202     0.0253     0.0181
              3M                1.0350        -0.0022     0.0033    -0.0043
              Huber             1.1930         0.0013     0.0069     0.0002
              Potential         1.7110         0.0127     0.0177     0.0108
n=100         -----             -----          0.9168     0.9157     0.9161
              2M                1.9230         0.0145     0.0062     0.0084
              3M                1.0000         0.0043    -0.0031    -0.0007
              Huber             1.0000         0.0043    -0.0031    -0.0007
              Potential         1.7410         0.0121     0.0050     0.0066
n=200         -----             -----          0.8444     0.8443     0.8445
              2M                1.9350         0.0026     0.0013     0.0011
              3M                1.0000        -0.0022    -0.0033    -0.0033
              Huber             1.0000        -0.0022    -0.0033    -0.0033
              Potential         1.9330         0.0023     0.0017     0.0011

The correlation coefficients for the four different sample sizes are computed. These results, together with the correlation values computed after deleting the cases identified by the different detection methods, are presented in Table 3.


Table 3: Correlation coefficients of the X's with 10% equal high leverage points
("-----" rows give the correlations for the full data, before any suspected cases are deleted.)

Sample size   Measure      Identified cases     r12        r13        r23
n=20          -----             -----          0.9913     0.9912     0.9913
              2M                3.8290         0.1081     0.0794     0.0918
              3M                2.4050         0.0346     0.0106     0.0178
              Huber             9.7080         0.3975     0.3920     0.3914
              Potential         2.4260         0.0370     0.0171     0.0227
n=50          -----             -----          0.9911     0.9911     0.9911
              2M                6.8050         0.0449     0.0410     0.0429
              3M                5.0960         0.0179     0.0076     0.0025
              Huber             5.4040         0.0123     0.0402     0.0118
              Potential         4.5280         0.1560     0.1571     0.1538
n=100         -----             -----          0.9910     0.9910     0.9909
              2M               12.0630         0.0264     0.0182     0.0213
              3M                0.0140         0.9910     0.9910     0.9909
              Huber             0.0000         0.9910     0.9909     0.9909
              Potential         6.6090         0.3567     0.3500     0.3528
n=200         -----             -----          0.9909     0.9909     0.9909
              2M               22.4255         0.0152     0.0137     0.0098
              3M                0.0020         0.9909     0.9909     0.9909
              Huber             0.0000         0.9909     0.9909     0.9909
              Potential         8.0970         0.5950     0.5934     0.5918

We observe from this table that the presence of multiple high leverage points causes strong multicollinearity. It is interesting to note that, unlike in the previous experiment, the correlations among the X's do not decrease appreciably with increasing sample size, since the proportion of high leverage points is the same for every sample. We also observe from Table 3 that for n = 20 all four detection methods correctly identify the high leverage cases, and the omission of those identified cases can remove the multicollinearity effect from the


data. 2M swamps on average 2 good cases and produces slightly higher correlations. But Huber's method is more sensitive: it identifies on average 7 good cases as high leverage points, and the exclusion of those good points does not reduce the correlation coefficients as much as it should. For n = 50 we observe that the 2M, 3M and Huber methods correctly identify the high leverage cases, and when those identified cases are omitted the correlation coefficients become very small. 2M swamps on average 2 good cases and produces slightly higher correlations, while the Potential method masks on average 0.47 observations and hence produces the highest correlations. We also observe from Table 3 that for n = 100 only the 2M method correctly identifies the high leverage cases, and the omission of those identified cases produces the best set of results. The Potential method masks on average 3.31 cases and produces higher correlations, while the 3M and Huber methods fail to identify even a single high leverage point and consequently cannot play any role in removing the multicollinearity effect from the data. For n = 200 we observe that the 2M method correctly identifies all of the high leverage points, and when those identified cases are omitted the correlation coefficients become very small. The Potential method detects only 8 of the 20 high leverage points, which is why its performance in reducing multicollinearity is not satisfactory. The 3M and Huber methods again fail entirely to identify the high leverage cases and produce very high correlation coefficients.

5. CONCLUDING REMARKS

The nature and extent of the multicollinearity generated by high leverage points in a linear regression model are investigated in this paper. From the examples, figures and simulation results we observe that even a single high leverage point can generate severe multicollinearity. Multicollinearity increases with the magnitude and the proportion of high leverage points. That is, for a fixed sample size, when the X values corresponding to a high leverage point increase, multicollinearity also tends to increase. On the other hand, if we fix the X values corresponding to a high leverage point, then multicollinearity is stronger for small samples and weaker for larger samples; that is, the extent of multicollinearity depends on the proportion of high leverage points in the data. We also observe that multicollinearity increases with the number of high leverage points, but remains almost the same when the proportion of high leverage points is fixed.

Fitting a regression line without the high leverage points is a remedy for this problem, but it is not easy to identify them correctly. When a single high leverage point is responsible for causing multicollinearity the problem is simple, because all of the commonly used detection techniques can successfully identify it. But in the presence of multiple high leverage points, masking/swamping may occur; that is, the commonly used detection methods may sometimes fail to identify all of the high leverage points. In that case multicollinearity is reduced, but the problem remains. In this situation the performance of the twice-the-mean rule is quite satisfactory. Hadi's Potential method is the next choice, but its performance tends to deteriorate as the sample size increases. We also observe that the thrice-the-mean rule and the Huber method prove to be less effective in the identification of multiple high leverage points.


REFERENCES

1. Barnett, V. and Lewis, T. (1994), Outliers in Statistical Data, 3rd Ed., Wiley, New York.
2. Belsley, D.A. (1991), Conditioning Diagnostics: Collinearity and Weak Data in Regression, Wiley, New York.
3. Hadi, A.S. (1992), A new measure of overall potential influence in linear regression, Comp. Stat. Data Anal., 14, 1-27.
4. Hoaglin, D.C. and Welsch, R.E. (1978), The hat matrix in regression and ANOVA, Amer. Statist., 32, 17-22.
5. Huber, P.J. (1981), Robust Statistics, Wiley, New York.
6. Montgomery, D.C. and Peck, E.A. (1992), Introduction to Linear Regression Analysis, 2nd Ed., Wiley, New York.
7. Mullet, G.M. (1976), Why regression coefficients have the wrong sign, J. Qual. Technol., 8, 121-126.
8. Peña, D. and Yohai, V.J. (1995), The detection of influential subsets in linear regression by using an influence matrix, J. Roy. Stat. Soc., Ser. B, 57, 145-156.
9. Rao, C.R. (1965), Linear Statistical Inference, Wiley, New York.
10. Velleman, P.F. and Welsch, R.E. (1981), Efficient computing of regression diagnostics, Amer. Statist., 35, 234-242.