Chapter 6: Multiple Linear Regression

Example: Cheddar Cheese

Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics.

Description: As cheese ages, various chemical processes take place that determine the taste of the final product. This dataset contains concentrations of various chemicals in 30 samples of mature cheddar cheese, and a subjective measure of taste for each sample. The variables "Acetic" and "H2S" are the natural logarithm of the concentration of acetic acid and hydrogen sulfide respectively. The variable "Lactic" has not been transformed.

Variable Names:
1. Case: Sample number
2. Taste: Subjective taste test score, obtained by combining the scores of several tasters
3. Acetic: Natural log of concentration of acetic acid
4. H2S: Natural log of concentration of hydrogen sulfide
5. Lactic: Concentration of lactic acid

Our goal is to predict Taste.
Tastei = beta0 + beta1*Acetici + beta2*H2Si + beta3*Lactici + ei

Call: lm(formula = taste ~ Acetic + H2S + Lactic, data = cheese)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -28.8768    19.7354  -1.463  0.15540
Acetic        0.3277     4.4598   0.073  0.94198
H2S           3.9118     1.2484   3.133  0.00425 **
Lactic       19.6705     8.6291   2.280  0.03108 *

Residual standard error: 10.13 on 26 degrees of freedom
Multiple R-Squared: 0.6518, Adjusted R-squared: 0.6116
F-statistic: 16.22 on 3 and 26 DF, p-value: 3.81e-06

            Df  Sum Sq Mean Sq
Regression   3 4994.48 1664.83
Residuals   26 2668.41  102.63
Test results in multiple regression output:

1) t-statistics for each beta in the model.
Example: t = 3.133 = b2/se(b2) = 3.9118/1.2484. The p-value = 0.00425 is for the test of beta2 = 0 vs beta2 not 0. Since the p-value is less than 0.05, reject H0 and conclude that beta2 is statistically different from zero.
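The t-statistic is just the estimate divided by its standard error, so the arithmetic in the output table above can be checked directly (the two numbers below are copied from the H2S row of the coefficient table):

```python
# Recompute the t-statistic for H2S from the regression output above.
b2 = 3.9118      # estimated coefficient for H2S
se_b2 = 1.2484   # its standard error
t = b2 / se_b2
print(round(t, 3))  # matches the 3.133 in the output
```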
2) F-statistic for the overall test of the regression: MSR/MSE = 16.22. The p-value = 3.81e-06 is for the test of beta1 = beta2 = beta3 = 0. The test applies to all three parameters at the same time. (State the test as comparing models: the intercept-only model vs the full model.)
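The F statistic can likewise be recovered from the ANOVA table in the output above (sums of squares 4994.48 and 2668.41 on 3 and 26 df):

```python
# Recompute the overall F statistic from the ANOVA table above.
msr = 4994.48 / 3    # mean square for regression: SSR / 3 df
mse = 2668.41 / 26   # mean square error: SSE / 26 df
f = msr / mse
print(round(f, 2))  # matches the 16.22 in the output
```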
Other regression outputs are qualitatively similar to what we had for regression with one predictor (aka "simple linear regression"):

R-squared is now called the coefficient of multiple determination. It's still SSR / SStotal, and it still means the fraction of y's variability explained by the X's.

beta_i is now the "partial derivative" of the mean of y with respect to x_i (i.e., the change in the mean of y that is associated with a one-unit change in x_i with all other x_j's held constant).
Residuals are now

  ei = yi - b0 - b1*x1i - ... - b(p-1)*x(p-1)i

where p-1 is the number of covariates for each observation (p is the number of betas, including the intercept), or, using matrices and vectors,

  e = Y - Xb, where b = (b0, ..., b(p-1))' = (X'X)^(-1) X'Y and X = (1 X1 ... X(p-1)).

MSE = (e1^2 + ... + en^2) / (n-p) is still an estimate of the variance of the errors.
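A minimal sketch of b = (X'X)^(-1) X'Y in the one-covariate case (p = 2), where X'X is 2x2 and can be inverted by hand. The data here are made up for illustration, not the cheese data:

```python
# Sketch: least squares via b = (X'X)^(-1) X'Y with one covariate (p = 2).
# Tiny made-up dataset, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.1, 8.0]
n, p = len(x), 2

# X'X = [[n, sum x], [sum x, sum x^2]] and X'Y = (sum y, sum xy)
sx = sum(x)
sxx = sum(v * v for v in x)
sy = sum(y)
sxy = sum(v * w for v, w in zip(x, y))

# Invert the 2x2 matrix X'X directly, then multiply by X'Y.
det = n * sxx - sx * sx
b0 = (sxx * sy - sx * sxy) / det
b1 = (n * sxy - sx * sy) / det

# Residuals e_i = y_i - b0 - b1*x_i, and MSE = sum(e_i^2) / (n - p)
e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
mse = sum(ei * ei for ei in e) / (n - p)
print(round(b0, 4), round(b1, 4), round(mse, 4))
```

With more covariates the same formula applies, but the inverse is computed numerically rather than by hand.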
Adjusted R^2, or the adjusted coefficient of multiple determination, is

  1 - ((n-1)/(n-p)) * SSE/SSTO.

It's smaller than R-squared, and the gap between R^2 and adjusted R^2 gets larger as more terms are added to the model. (Roughly: higher is better...) See the caution in the book about extrapolation!
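Both quantities can be recovered from the cheese ANOVA table above (SSR = 4994.48, SSE = 2668.41, n = 30 samples, p = 4 betas including the intercept):

```python
# Recompute R^2 and adjusted R^2 from the cheese ANOVA table above.
ssr, sse = 4994.48, 2668.41
ssto = ssr + sse
n, p = 30, 4
r2 = ssr / ssto                                   # coefficient of multiple determination
adj_r2 = 1 - ((n - 1) / (n - p)) * (sse / ssto)   # adjusted version
print(round(r2, 4), round(adj_r2, 4))  # matches the 0.6518 and 0.6116 in the output
```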
Chapter 7: More about multiple regression

When you have a multiple regression, a natural question is: which covariate matters the most if we want to explain Y? Can we order the covariates by how much they help explain Y?

Extra sums of squares measure the reduction in the error sum of squares that is caused by including another covariate in a model, i.e., the additional explanatory power of another covariate.
Notation:

  SSR(X2 | X1) = SSE(X1) - SSE(X1,X2)   (X1 is already in the model and X2 is added)
  SSR(X1 | X2) = SSE(X2) - SSE(X1,X2)   (X2 is already in the model and X1 is added)

Some algebra: Since SST doesn't depend on any X's, and SST = SSR + SSE,

  SSR(X2|X1) = (SST - SSR(X1)) - (SST - SSR(X1,X2)) = SSR(X1,X2) - SSR(X1).

And since SST = SSR(X1,X2) + SSE(X1,X2),

  SST = SSR(X2|X1) + SSR(X1) + SSE(X1,X2).
Example: When H2S is already in the model, what's Lactic's additional explanatory power? Fit 2 models:

1. Taste = alpha0 + alpha1*H2S + error
2. Taste = beta0 + beta1*H2S + beta2*Lactic + error

Model: Taste = alpha0 + alpha1*H2S + error
            Df  Sum Sq
Regression   1  4376.7
Residuals   28  3286.2

Model: Taste = beta0 + beta1*H2S + beta2*Lactic + error
            Df  Sum Sq
Regression   2  4993.9
Residuals   27  2669.0

SSR(Lactic|H2S) = 3286.2 - 2669.0 = 617.2
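The identity SSR(Lactic|H2S) = SSE(H2S) - SSE(H2S,Lactic) = SSR(H2S,Lactic) - SSR(H2S) can be checked directly with the sums of squares from the two ANOVA tables above:

```python
# Extra sum of squares for Lactic given H2S, computed both ways.
sse_h2s, ssr_h2s = 3286.2, 4376.7      # model 1: H2S only
sse_both, ssr_both = 2669.0, 4993.9    # model 2: H2S + Lactic
via_sse = sse_h2s - sse_both           # SSE(X1) - SSE(X1,X2)
via_ssr = ssr_both - ssr_h2s           # SSR(X1,X2) - SSR(X1)
print(round(via_sse, 1), round(via_ssr, 1))  # the two routes agree
```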
How does that compare to H2S's additional explanatory power when Lactic is already in the model?

Model: Taste = beta0 + beta1*H2S + beta2*Lactic + error
            Df  Sum Sq
Regression   2  4993.9
Residuals   27  2669.0

Model: Taste = gamma0 + gamma1*Lactic + error
            Df  Sum Sq
Regression   1  3800.4
Residuals   28  3862.5

SSR(H2S|Lactic) = 3862.5 - 2669.0 = 1193.5
Can also look at the explanatory usefulness of a variable when more than 1 other covariate is in the model:

  SSR(X3 | X1,X2) = SSE(X1,X2) - SSE(X1,X2,X3)

Or the explanatory usefulness of more than 1 covariate when more than 1 other covariate is in the model: SSR(X3,X4 | X1,X2), SSR(X2,X3 | X1), etc.
Ex: SSR(Acetic | H2S,Lactic)

Model: Taste = beta0 + beta1*H2S + beta2*Lactic + error
            Df  Sum Sq
Regression   2  4993.9
Residuals   27  2669.0

Model: Taste = beta0 + beta1*H2S + beta2*Lactic + beta3*Acetic + e
            Df  Sum Sq
Regression   3  4994.5
Residuals   26  2668.4

SSR(Acetic | H2S,Lactic) = 2669.0 - 2668.4 = 0.6

Similarly, SSR(Lactic | H2S,Acetic) = 3201.7 - 2668.4 = 533.3
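The same subtraction works with two covariates already in the model. Using the SSEs from the fits above (SSE(H2S,Acetic) = 3201.7 is stated in the text but its ANOVA table is not shown):

```python
# Extra sums of squares with two covariates already in the model.
sse_h2s_lactic = 2669.0   # SSE(H2S, Lactic)
sse_h2s_acetic = 3201.7   # SSE(H2S, Acetic), quoted in the text above
sse_full = 2668.4         # SSE(H2S, Lactic, Acetic)

ssr_acetic_given = sse_h2s_lactic - sse_full   # SSR(Acetic | H2S,Lactic)
ssr_lactic_given = sse_h2s_acetic - sse_full   # SSR(Lactic | H2S,Acetic)
print(round(ssr_acetic_given, 1), round(ssr_lactic_given, 1))
```

Note how little Acetic adds (0.6) once H2S and Lactic are in the model, consistent with its large p-value in the full fit.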
F-tests: The F-tests we've seen before have been of the form

  H0: beta1 = ... = beta(p-1) = 0
  vs
  HA: they're not all zero.

Reject if MSR/MSE > F(0.95, p-1, n-p). We can use partial sums of squares to test subsets of the parameters.
Let's compare two models:

  H0: E(taste) = beta0 + beta1*H2S
  HA: E(taste) = beta0 + beta1*H2S + beta2*Lactic + beta3*Acetic

  F = [SSR(Lactic,Acetic|H2S)/2] / [SSE(Lactic,Acetic,H2S)/(n-4)]

  SSR(Lactic,Acetic|H2S) = SSE(H2S) - SSE(Lactic,Acetic,H2S) = 3286.2 - 2668.4 = 617.8

  F = (617.8/2)/(2668.4/26) = 3.01

F(.95,2,26) = 3.37, so there's not enough evidence to pick HA over H0.
Let's compare two models:

  H0: E(taste) = beta0 + beta1*H2S
  HA: E(taste) = beta0 + beta1*H2S + beta2*Lactic

  F = [SSR(Lactic|H2S)/1] / [SSE(Lactic,H2S)/(n-3)]

  SSR(Lactic|H2S) = SSE(H2S) - SSE(Lactic,H2S) = 3286.2 - 2669.0 = 617.2

  F = (617.2/1)/(2669.0/27) = 6.2

F(.95,1,27) = 4.21, so there is enough evidence to pick HA over H0.
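Both partial F statistics can be reproduced from the SSEs of the fitted models; the critical values F(.95,2,26) = 3.37 and F(.95,1,27) = 4.21 come from the text above:

```python
# Partial F tests comparing the H2S-only model against larger models,
# using the SSEs reported in the ANOVA tables above.
n = 30
sse_h2s = 3286.2          # H0 model: H2S only
sse_full = 2668.4         # HA model: H2S + Lactic + Acetic (p = 4)
sse_h2s_lactic = 2669.0   # HA model: H2S + Lactic (p = 3)

# Test 1: add Lactic and Acetic together (2 extra parameters, n-4 error df).
f1 = ((sse_h2s - sse_full) / 2) / (sse_full / (n - 4))
# Test 2: add Lactic alone (1 extra parameter, n-3 error df).
f2 = ((sse_h2s - sse_h2s_lactic) / 1) / (sse_h2s_lactic / (n - 3))

print(round(f1, 2), round(f2, 2))  # compare to 3.37 and 4.21 respectively
```

Together they show Lactic alone earns its place in the model, while Lactic and Acetic jointly do not clear the 2-df hurdle.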