Design of Experiments: One Factor and Randomized Block ...

Design of Experiments: One Factor and Randomized Block Experiments Prof. Daniel A. Menasce Dept. of Computer Science George Mason University 1 © 2001. D. A. Menascé. All Rights Reserved.

Basic Notions in Design of Experiments

• Response: what you want to measure.
• Factor: what affects the response.
• Level: value of a factor.

The first three columns are factors, each row is one combination of levels, and the last column is the response.

CPU Clock Frequency (MHz)   Number of CPUs   Main Memory (MB)   Benchmark Execution Time (sec)
 550                        1                128                25.0
 750                        1                128                32.0
1000                        1                128                48.0
 550                        2                128                19.0
 750                        2                128                13.5
1000                        2                128                10.0
 550                        1                256                23.0
 750                        1                256                29.0
1000                        1                256                45.0
 550                        2                256                16.5
 750                        2                256                11.8
1000                        2                256                 8.8
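The table covers every combination of the factor levels. As a minimal sketch (the variable names are illustrative, not from the slides), the list of runs can be enumerated as follows:

```python
from itertools import product

# Factor levels as they appear in the table.
clock_mhz = [550, 750, 1000]
num_cpus = [1, 2]
memory_mb = [128, 256]

# Memory is the outermost loop and clock frequency the innermost,
# which reproduces the row order of the table.
design = [(f, n, m) for m, n, f in product(memory_mb, num_cpus, clock_mhz)]

for clock, cpus, memory in design:
    print(clock, cpus, memory)
```

With 3 x 2 x 2 levels this gives the 12 runs listed in the table.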

Comparing Means of Various Groups

• ANOVA: Analysis of Variance.
• Consider c groups (each group is a level of a factor).
• Subdivide the total variation in the response into variation attributable to differences among the c groups and variation within the c groups (experimental error).

• A, B, C, and D are different page replacement algorithms.
• Factor: page replacement algorithm.
• Levels: A, B, C, and D.
• Number in each column: running times of programs under each replacement algorithm.

Page Replacement Algorithm
 A    B    C    D
11   12   18   11
13   14   16   12
17   17   18   16
17   19   20   15
15   21   22   14
16   18   15   17
14   19   17   13
10   18   21   16
12   16   16   17
14   18   20   18

How about the Influence of Uncontrolled and Unforeseen Factors?

• The running time of a program depends on many other factors. For example, its locality of reference affects how effective a page replacement algorithm is.
• Randomization: consider a large set of programs and randomly assign programs to each group.

Randomization

[Figure: a set of computer programs passes through a randomizer, which assigns each program to one of the groups A, B, C, or D.]
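A minimal sketch of the randomizer, assuming equal-size groups and hypothetical program names (neither the function nor the names come from the slides):

```python
import random

def randomize(programs, groups, seed=None):
    """Randomly assign programs to groups of equal size.

    Programs left over when the count is not a multiple of the
    number of groups are not assigned.
    """
    rng = random.Random(seed)
    shuffled = list(programs)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    return {g: shuffled[k * size:(k + 1) * size] for k, g in enumerate(groups)}

# Hypothetical pool of 40 programs, 10 per page replacement algorithm.
programs = [f"prog{k:02d}" for k in range(40)]
assignment = randomize(programs, ["A", "B", "C", "D"], seed=1)
```

Fixing the seed only makes the assignment repeatable; any seed gives a valid random assignment.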

ANOVA Model

Total Variation (SST), df = n - 1, is split into:
• Among-Group Variation (SSA), df = c - 1: attributable to differences among groups.
• Within-Group Variation (SSW), df = n - c: attributable to experimental error.

ANOVA

• Assumptions:
  – The c groups or levels of the factor being examined represent populations whose outcome measurements are randomly and independently drawn, follow a normal distribution, and have equal variances.
• Hypotheses:

$$H_0: \mu_1 = \mu_2 = \dots = \mu_c$$

$$H_1: \text{not all } \mu_j \text{ are equal } (j = 1, \dots, c)$$


SST (Sum of Squares Total)

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X} \right)^2$$

where

$$\bar{X} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n}$$ : overall or grand mean.

$X_{ij}$: i-th observation in group or level j.
$n_j$: number of observations in group or level j.
$n$: total number of observations, $n = \sum_{j=1}^{c} n_j$.

SST Example

For the page replacement running times given earlier (10 observations for each of the algorithms A, B, C, and D), the grand mean is 16.075.

SST = (11-16.075)^2 + (13-16.075)^2 + … + (18-16.075)^2 = 336.775


SSA (Sum of Squares Among Groups)

$$SSA = \sum_{j=1}^{c} n_j \left( \bar{X}_j - \bar{X} \right)^2$$

where

$$\bar{X} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n}$$ : overall or grand mean.

$\bar{X}_j$: sample mean corresponding to group or level j.
$n_j$: number of observations in group or level j.

SSA Example

For the page replacement running times given earlier:

              A      B      C      D
Mean          13.9   17.2   18.3   14.9
Grand Mean    16.075

SSA = 10 (13.9-16.075)^2 + 10 (17.2-16.075)^2 + 10 (18.3-16.075)^2 + 10 (14.9-16.075)^2 = 123.275


SSW (Sum of Squares Within Groups)

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2$$

where

$X_{ij}$: i-th observation in group or level j.
$\bar{X}_j$: sample mean corresponding to group or level j.

SSW Example

For the page replacement running times given earlier:

              A      B      C      D
Mean          13.9   17.2   18.3   14.9

SSW = (11-13.9)^2 + … + (14-13.9)^2 + (12-17.2)^2 + … + (18-17.2)^2 + (18-18.3)^2 + … + (20-18.3)^2 + (11-14.9)^2 + … + (18-14.9)^2 = 213.5

ANOVA Model

Total Variation (SST), df = n - 1
• Among-Group Variation (SSA), df = c - 1
• Within-Group Variation (SSW), df = n - c

SST = SSA + SSW
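The decomposition can be checked numerically on the page replacement data. The following is a minimal sketch in plain Python; the function name is illustrative and not taken from the slides:

```python
# Running times from the page replacement example, one list per group.
data = {
    "A": [11, 13, 17, 17, 15, 16, 14, 10, 12, 14],
    "B": [12, 14, 17, 19, 21, 18, 19, 18, 16, 18],
    "C": [18, 16, 18, 20, 22, 15, 17, 21, 16, 20],
    "D": [11, 12, 16, 15, 14, 17, 13, 16, 17, 18],
}

def one_way_sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a dict mapping group name to observations."""
    values = [x for g in groups.values() for x in g]
    grand = sum(values) / len(values)
    means = {name: sum(g) / len(g) for name, g in groups.items()}
    sst = sum((x - grand) ** 2 for x in values)
    ssa = sum(len(g) * (means[name] - grand) ** 2 for name, g in groups.items())
    ssw = sum((x - means[name]) ** 2 for name, g in groups.items() for x in g)
    return sst, ssa, ssw

sst, ssa, ssw = one_way_sums_of_squares(data)
print(round(sst, 3), round(ssa, 3), round(ssw, 3))
```

Up to floating-point rounding, the three values agree with the examples (336.775, 123.275, and 213.5) and SST equals SSA + SSW.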

ANOVA Model: Mean Squares

$$MSA = \frac{SSA}{c-1} \qquad MSW = \frac{SSW}{n-c} \qquad MST = \frac{SST}{n-1}$$

The mean squares are variances! If there are no real differences among the c groups, MSA, MSW, and MST provide estimates for the variance inherent in the data.

The One-Way ANOVA F Test Statistic

$$F = \frac{MSA}{MSW}$$

• The F-test statistic follows an F distribution with c-1 degrees of freedom in the numerator, corresponding to MSA, and n-c degrees of freedom in the denominator, corresponding to MSW.
• Null hypothesis: $H_0: \mu_1 = \mu_2 = \dots = \mu_c$
• Alternative hypothesis: $H_1$: not all $\mu_j$ are equal (j = 1, …, c)

Reject Ho if F > Fu, where Fu is the critical value.

[Figure: F distribution. The region of non-rejection (area 1 − α) lies to the left of the critical value Fu and the region of rejection (area α) lies to its right.]

ANOVA Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F
Among groups          c-1                  SSA              MSA = SSA/(c-1)          F = MSA/MSW
Within groups         n-c                  SSW              MSW = SSW/(n-c)
Total                 n-1                  SST

ANOVA Example

For the page replacement running times given earlier (c = 4 groups, n = 40 observations):

                                 A          B          C          D
Mean                             13.9       17.2       18.3       14.9
n_j (Mean - Grand Mean)^2        47.30625   12.65625   49.50625   13.80625

Grand Mean    16.075
SSA           123.275
SSW           213.5
SST           336.775
MSA           41.091667
MSW           5.9305556
F             6.93         (F = MSA/MSW)
df numer.     3            (c-1 = 4-1)
df denom.     36           (n-c = 40-4)
Fu            2.87         (from table, α = 0.05)

F > Fu, so reject Ho. Algorithms A, B, C, and D differ significantly at the 0.05 level of significance.
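The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch that takes the sums of squares and the critical value 2.87 from the example as given; it does not compute the critical value itself:

```python
# Values from the ANOVA example.
ssa, ssw = 123.275, 213.5
c, n = 4, 40              # number of groups, total observations

msa = ssa / (c - 1)       # among-group mean square
msw = ssw / (n - c)       # within-group mean square
f_stat = msa / msw

f_crit = 2.87             # Fu from the F table for (3, 36) df, as in the example
reject_h0 = f_stat > f_crit

print(f"MSA={msa:.4f} MSW={msw:.4f} F={f_stat:.2f} reject={reject_h0}")
```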

ANOVA With Excel

Anova: Single Factor

SUMMARY
Groups      Count   Sum   Average   Variance
Column 1    10      139   13.9      5.877778
Column 2    10      172   17.2      6.844444
Column 3    10      183   18.3      5.566667
Column 4    10      149   14.9      5.433333

ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Between Groups        123.275   3    41.09167   6.928806   0.000844   2.866265
Within Groups         213.5     36   5.930556
Total                 336.775   39

Since the p-value is less than α = 0.05, reject Ho.

Multiple Comparisons: The Tukey-Kramer Procedure

• If Ho is rejected, then the question is "Which groups are different?"
• Use the Tukey-Kramer procedure to compare all pairs of groups simultaneously.
• Must compute the differences $\bar{X}_j - \bar{X}_{j'}$ for $j \neq j'$ among all c(c-1)/2 pairs of means.

Multiple Comparisons: The Tukey-Kramer Procedure

• Obtain the critical range:

$$\text{critical range} = q_u \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}$$

where $q_u$ is the upper-tail critical value from a Studentized range distribution with c degrees of freedom in the numerator and (n-c) degrees of freedom in the denominator (see statistical table).

• A pair is considered significantly different if the absolute difference between the sample means exceeds the critical range.

Multiple Comparisons: The Tukey-Kramer Procedure (Example)

qu = 3.81 (from table), MSW = 5.930556, critical range = 2.9341

Comparison   Absolute Difference   vs. Critical Range   Result
|XA-XB|      3.3                   > 2.9341             A significantly different from B
|XA-XC|      4.4                   > 2.9341             A significantly different from C
|XA-XD|      1.0                   < 2.9341             A not significantly different from D
|XB-XC|      1.1                   < 2.9341             B not significantly different from C
|XB-XD|      2.3                   < 2.9341             B not significantly different from D
|XC-XD|      3.4                   > 2.9341             C significantly different from D
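The comparisons above can be sketched in code as follows. The group means, MSW, and qu = 3.81 are taken from the example; the function name and return format are illustrative assumptions:

```python
from itertools import combinations
from math import sqrt

means = {"A": 13.9, "B": 17.2, "C": 18.3, "D": 14.9}
sizes = {"A": 10, "B": 10, "C": 10, "D": 10}
msw = 5.930556
q_u = 3.81  # Studentized range critical value, read from a table

def tukey_kramer(means, sizes, msw, q_u):
    """Return {(g1, g2): (abs_difference, critical_range, is_different)}."""
    results = {}
    for a, b in combinations(means, 2):
        critical = q_u * sqrt(msw / 2 * (1 / sizes[a] + 1 / sizes[b]))
        diff = abs(means[a] - means[b])
        results[(a, b)] = (diff, critical, diff > critical)
    return results

results = tukey_kramer(means, sizes, msw, q_u)
for pair, (diff, critical, different) in results.items():
    print(pair, round(diff, 1), round(critical, 4), different)
```

Because the critical range is computed per pair, the same function also covers groups of unequal size.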

Reviewing ANOVA Assumptions

• Randomness and independence: must always be met.
• Normality: the ANOVA F test is robust as long as the distributions are not extremely different from a normal distribution, particularly for large samples.
• Homogeneity of variance: $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_c^2$
  – If sample sizes differ between groups, unequal variances are a problem.
  – Should try to use same-size groups.

Randomized Block Model

           Level 1   Level 2   ...   Level c
block 1
...
block r

• Each block contains the response of the same item to the c levels of the factor being analyzed.
• Purpose: reduce experimental error by removing as much block or subject variability as possible.

Randomized Block Model

[Figure: computer programs, a randomizer, and the groups A, B, C, and D.]

Page Replacement Algorithm
                        A      B      C      D
block 1 (program 1)     11.0   12.0   18.0   11.0
block 2 (program 2)     13.0   14.0   19.0   12.0
block 3 (program 3)     17.0   18.4   23.4   16.5
block 4 (program 4)     14.0   14.9   20.0   12.5
block 5 (program 5)     15.0   16.0   21.0   13.5

r = 5 blocks, c = 4 groups

ANOVA Model for Randomized Block Design

Total Variation (SST), df = n - 1 (note: n = r x c), is split into:
• Among-Group Variation (SSA), df = c - 1
• Among-Block Variation (SSBL), df = r - 1
• Random Variation (SSE), df = (r-1)(c-1)

SST = SSA + SSBL + SSE


SST (Sum of Squares Total)

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{r} \left( X_{ij} - \bar{X} \right)^2$$

where

$$\bar{X} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{r} X_{ij}}{rc}$$ : overall or grand mean.

$X_{ij}$: observation in i-th block and level j.
$r$: number of blocks.

SST Example

For the randomized block data given earlier (5 programs x 4 page replacement algorithms):

Grand Mean = 15.61

SST = (11.0-15.61)^2 + (13.0-15.61)^2 + … + (15.0-15.61)^2 + … + (11.0-15.61)^2 + (12.0-15.61)^2 + … + (13.5-15.61)^2 = 232.44


SSA (Sum of Squares Among Groups)

$$SSA = r \sum_{j=1}^{c} \left( \bar{X}_{.j} - \bar{X} \right)^2$$

where

$$\bar{X}_{.j} = \frac{\sum_{i=1}^{r} X_{ij}}{r}$$ : group mean.

SSA Example

For the randomized block data given earlier:

              A      B      C      D
Mean          14.0   15.1   20.3   13.1
Grand Mean    15.61

SSA = 5 * [ (14.0-15.61)^2 + (15.1-15.61)^2 + … + (13.1-15.61)^2 ] = 155.018

The means are shown rounded to one decimal; the result 155.018 is obtained with the unrounded group means.


SSBL (Sum of Squares Among Blocks)

$$SSBL = c \sum_{i=1}^{r} \left( \bar{X}_{i.} - \bar{X} \right)^2$$

where

$$\bar{X}_{i.} = \frac{\sum_{j=1}^{c} X_{ij}}{c}$$ : block mean.

SSBL Example

For the randomized block data given earlier:

              block 1   block 2   block 3   block 4   block 5
Block Mean    13.0      14.5      18.8      15.4      16.4
Grand Mean    15.61

SSBL = 4 * [ (13.0-15.61)^2 + (14.5-15.61)^2 + … + (16.4-15.61)^2 ] = 76.133

The means are shown rounded to one decimal; the result 76.133 is obtained with the unrounded block means.


SSE (Random Error)

$$SSE = \sum_{j=1}^{c} \sum_{i=1}^{r} \left( X_{ij} - \bar{X}_{.j} - \bar{X}_{i.} + \bar{X} \right)^2$$

SSE Example

For the randomized block data given earlier, each term subtracts the group mean and the block mean from the observation and adds the grand mean:

Group Mean    A: 14.0   B: 15.1   C: 20.3   D: 13.1
Block Mean    block 1: 13.0   block 2: 14.5   block 3: 18.8   block 4: 15.4   block 5: 16.4
Grand Mean    15.61

SSE = (11.0-14.0-13.0+15.61)^2 + (13.0-14.0-14.5+15.61)^2 + … + (21.0-20.3-16.4+15.61)^2 + (13.5-13.1-16.4+15.61)^2 = 1.287
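All four sums of squares for the randomized block example can be computed together. This is a minimal sketch in plain Python; the function name and the dictionary keys are illustrative:

```python
# Rows are blocks (programs 1..5); columns are algorithms A, B, C, D.
blocks = [
    [11.0, 12.0, 18.0, 11.0],
    [13.0, 14.0, 19.0, 12.0],
    [17.0, 18.4, 23.4, 16.5],
    [14.0, 14.9, 20.0, 12.5],
    [15.0, 16.0, 21.0, 13.5],
]

def randomized_block_ss(x):
    """Return SST, SSA, SSBL, and SSE for an r x c table of observations."""
    r, c = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (r * c)
    block_means = [sum(row) / c for row in x]
    group_means = [sum(row[j] for row in x) / r for j in range(c)]
    sst = sum((v - grand) ** 2 for row in x for v in row)
    ssa = r * sum((m - grand) ** 2 for m in group_means)
    ssbl = c * sum((m - grand) ** 2 for m in block_means)
    sse = sum(
        (x[i][j] - group_means[j] - block_means[i] + grand) ** 2
        for i in range(r)
        for j in range(c)
    )
    return {"SST": sst, "SSA": ssa, "SSBL": ssbl, "SSE": sse}

ss = randomized_block_ss(blocks)
print({name: round(value, 3) for name, value in ss.items()})
```

The rounded results match the examples, and SST = SSA + SSBL + SSE holds up to floating-point rounding.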

ANOVA Model: Mean Squares

$$MSA = \frac{SSA}{c-1} \qquad MSBL = \frac{SSBL}{r-1} \qquad MSE = \frac{SSE}{(r-1)(c-1)}$$

The mean squares are variances! If there are no real differences among the c groups, MSA, MSBL, and MSE provide estimates for the variance inherent in the data.

ANOVA Hypothesis Testing

$$H_0: \mu_{.1} = \mu_{.2} = \dots = \mu_{.c}$$

$$H_1: \text{not all } \mu_{.j} \ (j = 1, \dots, c) \text{ are equal}$$

F-test statistic:

$$F = \frac{MSA}{MSE}$$

The F-test statistic follows an F distribution with (c-1) degrees of freedom in the numerator and (r-1)(c-1) degrees of freedom in the denominator.

Reject $H_0$ if F > Fu.

Randomized Block ANOVA Example

For the randomized block data given earlier (r = 5, c = 4):

Grand Mean    15.61
SSA           155.018
SSE           1.287
SSBL          76.133
SST           232.438
MSA           51.67267
MSBL          19.03325
MSE           0.10725
F             481.80       (F = MSA/MSE)
df numer.     3
df denom.     12
Fu            7.23         (from table)

F > Fu, so reject Ho.
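The mean squares and the F statistic of this example follow directly from the sums of squares. A minimal sketch, taking the sums of squares and the critical value 7.23 from the example as given:

```python
# Values from the randomized block example.
ssa, ssbl, sse = 155.018, 76.133, 1.287
r, c = 5, 4                       # blocks, groups

msa = ssa / (c - 1)               # among-group mean square
msbl = ssbl / (r - 1)             # among-block mean square
mse = sse / ((r - 1) * (c - 1))   # error mean square
f_stat = msa / mse

f_crit = 7.23                     # Fu as read from the table in the example
reject_h0 = f_stat > f_crit

print(f"MSA={msa:.5f} MSBL={msbl:.5f} MSE={mse:.5f} F={f_stat:.2f}")
```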

Anova: Two-Factor Without Replication

SUMMARY     Count   Sum     Average   Variance
Row 1       4       52      13        11.333
Row 2       4       58      14.5      9.667
Row 3       4       75.3    18.825    9.949
Row 4       4       61.4    15.35     10.590
Row 5       4       65.5    16.375    10.563

Column 1    5       70      14        5
Column 2    5       75.3    15.06     5.638
Column 3    5       101.4   20.28     4.292
Column 4    5       65.5    13.1      4.425

ANOVA
Source of Variation   SS        df   MS         F           P-value    F crit
Rows (blocks)         76.133    4    19.03325   177.4662    1.46E-10   3.25916
Columns (groups)      155.018   3    51.67267   481.79643   9.11E-13   3.4903
Error                 1.287     12   0.10725
Total                 232.438   19

In this output the MS of the Columns row is MSA, the MS of the Error row is MSE, and the F of the Columns row is MSA/MSE.

Since the p-value is less than α = 0.05, reject Ho. Since F > F critical, reject Ho.

Estimated Relative Efficiency (RE)

$$RE = \frac{(r-1)\,MSBL + r(c-1)\,MSE}{(rc-1)\,MSE}$$

Note that (r-1) MSBL = SSBL and rc - 1 = n - 1.

• Used to assess if blocking results in an increase in precision in comparing the different groups.

MSA     51.67267
MSBL    19.03325
MSE     0.10725
RE      38.2

• If blocking is not used, we would need 38.2 times as many observations to obtain the same precision in comparing the groups.
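As a quick check of the RE value, the formula can be evaluated with the mean squares from the example. This is a minimal sketch; the variable names are illustrative:

```python
# Mean squares from the randomized block example.
msbl, mse = 19.03325, 0.10725
r, c = 5, 4   # blocks, groups

# Estimated relative efficiency of blocking.
re = ((r - 1) * msbl + r * (c - 1) * mse) / ((r * c - 1) * mse)

print(round(re, 1))
```

The result rounds to the 38.2 quoted above.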

Multiple Comparisons: The Tukey-Kramer Procedure

• Obtain the critical range:

$$\text{critical range} = q_u \sqrt{\frac{MSE}{r}}$$

where $q_u$ is the upper-tail critical value from a Studentized range distribution with c degrees of freedom in the numerator and (r-1)(c-1) degrees of freedom in the denominator (see statistical tables).

Multiple Comparisons: The Tukey-Kramer Procedure (Example)

Group   Sample Mean
1       14
2       15.06
3       20.28
4       13.1

Intermediate calculations: MSE = 0.10725, r = 5, c = 4

Comparison           Absolute Difference   Critical Range   Result
Group 1 to Group 2   1.06                  0.615124         Means are different
Group 1 to Group 3   6.28                  0.615124         Means are different
Group 1 to Group 4   0.9                   0.615124         Means are different
Group 2 to Group 3   5.22                  0.615124         Means are different
Group 2 to Group 4   1.96                  0.615124         Means are different
Group 3 to Group 4   7.18                  0.615124         Means are different
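The table above can be reproduced with a short sketch. The slides do not state the value of qu for this example; qu = 4.20 is an assumption here, chosen because it reproduces the critical range 0.615124 shown in the table. In practice qu should be read from a Studentized range table.

```python
from itertools import combinations
from math import sqrt

means = {1: 14.0, 2: 15.06, 3: 20.28, 4: 13.1}   # group sample means
mse, r = 0.10725, 5
q_u = 4.20  # ASSUMPTION: not given in the slides; inferred from the critical range

# In the randomized block design every group has r observations,
# so one critical range applies to all pairs.
critical_range = q_u * sqrt(mse / r)

comparisons = {
    (a, b): abs(means[a] - means[b]) > critical_range
    for a, b in combinations(means, 2)
}

print(round(critical_range, 6))
for pair, different in comparisons.items():
    print(pair, "different" if different else "not different")
```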