1. Design of Experiments: One. Factor and Randomized Block. Experiments. Prof.
Daniel A. Menasce. Dept. of Computer Science. George Mason University.
Design of Experiments: One Factor and Randomized Block Experiments Prof. Daniel A. Menasce Dept. of Computer Science George Mason University 1 © 2001. D. A. Menascé. All Rights Reserved.
• Response: what you want to measure. • Factor: what affects the response. • Level: value of a factor.
Levels
Basic Notions in Design of Experiments Factors
Response
CPU Clock Frequency Main Memory (MHz) Number of CPUs (MB) 550 1 128 750 1 128 1000 1 128 550 2 128 750 2 128 1000 2 128 550 1 256 750 1 256 1000 1 256 550 2 256 750 2 256 1000 2 256
Benchmark Execution Time (sec) 25.0 32.0 48.0 19.0 13.5 10.0 23.0 29.0 45.0 16.5 11.8 8.8
2 © 2001. D. A. Menascé. All Rights Reserved.
1
Comparing Means of Various Groups • ANOVA: Analysis of Variance. • Consider c groups (each group is a level of a factor). • Subdivide total variation in the response into variations attributable to differences among the c groups and differences within the c groups (experimental error). 3 © 2001. D. A. Menascé. All Rights Reserved.
• A, B, C, and D are different page replacement algorithms. • Factor: page replacement algorithm. • Levels: A, B, C, and D. • Number in each column: running times of programs under each replacement algorithm.
A 11 13 17 17 15 16 14 10 12 14
Page Replacement Algorithm B C D 12 18 11 14 16 12 17 18 16 19 20 15 21 22 14 18 15 17 19 17 13 18 21 16 16 16 17 18 20 18 4
© 2001. D. A. Menascé. All Rights Reserved.
2
How about the Influence of Uncontrolled and Unforeseen Factors? • The running time of a program depends on many other factors. Its locality of reference plays a role in the effectiveness of a page replacement algorithm. • Randomization: consider a large set of programs and randomly assign programs to each group. 5 © 2001. D. A. Menascé. All Rights Reserved.
Randomization A
B Randomizer C
computer programs
D 6
© 2001. D. A. Menascé. All Rights Reserved.
3
Randomization A
B Randomizer C
D
computer programs
7 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model Total Variation (SST) df = n - 1
Among Group Variation (SSA) df = c - 1 Attributable to difference in groups
Within-Group Variation (SSW) df = n - c Attributable to experimental error
8
© 2001. D. A. Menascé. All Rights Reserved.
4
ANOVA • Assumptions: – c groups or levels of the factor being examined represent populations whose outcome measurements are randomly and independently drawn and follow a normal distribution and have equal variances.
• Hypotheses: H 0 : µ1 = µ 2 = ... = µ c H1 : not all µ j are equal (j = 1,..., c) 9 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model Total Variation (SST) df = n - 1
Among Group Variation (SSA) df = c - 1
Within-Group Variation (SSW) df = n - c
10 © 2001. D. A. Menascé. All Rights Reserved.
5
SST (Sum of Squares Total) c
nj
(
SST = ∑ ∑ X ij − X j =1 i =1
where
c
X =
)
2
nj
∑∑ X j =1 i =1
n
ij
: overall or grand mean.
X ij : i-th observation in group or level j. nj :
number of observations in group or level j. c
n: total number of observations:
∑n j =1
j 11
© 2001. D. A. Menascé. All Rights Reserved.
SST Example A 11 13 17 17 15 16 14 10 12 14 Grand Mean
Page Replacement Algorithm B C D 12 18 11 14 16 12 17 18 16 19 20 15 21 22 14 18 15 17 19 17 13 18 21 16 16 16 17 18 20 18
16.075
SST = (11-16.075)2 + (13-16.075)2 + … + (18-16.075)2 = 336.75 12 © 2001. D. A. Menascé. All Rights Reserved.
6
ANOVA Model Total Variation (SST) df = n - 1
Within-Group Variation (SSW) df = n - c
Among Group Variation (SSA) df = c - 1
13 © 2001. D. A. Menascé. All Rights Reserved.
SSA (Sum of Squares Among Groups) c
(
SSA = ∑ n j X j − X j =1
where
c
X =
)
2
nj
∑∑ X j =1 i =1
n
ij
: overall or grand mean.
X j : sample mean corresponding to group or level j. nj :
number of observations in group or level j.
14 © 2001. D. A. Menascé. All Rights Reserved.
7
SSA Example Page Replacement Algorithm A B C D 11 12 18 11 13 14 16 12 17 17 18 16 17 19 20 15 15 21 22 14 16 18 15 17 14 19 17 13 10 18 21 16 12 16 16 17 14 18 20 18 13.9 17.2 18.3 14.9
Mean Grand Mean
16.075
SSA = 10 (13.9-16.075)2 + 10 (17.2-16.075)2 + 10 (18.3- 16.075)2 + 10 (14.9- 16.075)2 = 123.275 15 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model Total Variation (SST) df = n - 1
Among Group Variation (SSA) df = c - 1
Within-Group Variation (SSW) df = n - c
16 © 2001. D. A. Menascé. All Rights Reserved.
8
SSW (Sum of Squares Within Groups) 2 c nj
(
SSW = ∑ ∑ X ij − X j j =1i =1
)
where
X ij : i-th observation in group or level j. X j : sample mean corresponding to group or level j.
17 © 2001. D. A. Menascé. All Rights Reserved.
SSW Example
Mean
Page Replacement Algorithm A B C D 11 12 18 11 13 14 16 12 17 17 18 16 17 19 20 15 15 21 22 14 16 18 15 17 14 19 17 13 10 18 21 16 12 16 16 17 14 18 20 18 13.9 17.2 18.3 14.9
SSW = (11-13.9)2 + …+(14-13.9)2 + (12-17.2)2 + … + (18-17.2)2 + (18-18.3)2 + … + (20-18.3)2 + (11-14.9)2 + … + (18-14.9)2 = 213.5 18 © 2001. D. A. Menascé. All Rights Reserved.
9
ANOVA Model Total Variation (SST) df = n - 1
Among Group Variation (SSA) df = c - 1
Within-Group Variation (SSW) df = n - c
SST = SSA + SSW 19 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model: Mean Squares SSA c −1 SSW MSW = n−c SST MST = n −1 MSA =
The mean squares are variances! If there are no real differences among the c groups, MSA, MSW, and MST provide estimates for the variance inherent in the data. 20 © 2001. D. A. Menascé. All Rights Reserved.
10
The one-way ANOVA F Test Static F=
MSA MSW
• The F-test statistic follows an F distribution with c-1 degrees of freedom in the numerator corresponding to MSA and n-c degrees of freedom in the denominator corresponding to MSW. • Null hypothesis: H0 : µ1=µ2=…=µc •Alternative hypothesis: H1 :Not all µj are equal (j=1,…c) 21 © 2001. D. A. Menascé. All Rights Reserved.
Reject Ho if F > Fu Region of non-rejection
Region of rejection
1−α α Fu: Critical Value
22
© 2001. D. A. Menascé. All Rights Reserved.
11
ANOVA Summary Table Source of Degrees of Variation Freedom Among groups c-1 Within Groups n-c Total n-1
Sum of Mean Square Squares (Variance) SSA MSA= SSA/(c-1) SSW MSW=SSW/(n-c) SST
F F=MSA/MSW
23 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Example
Mean Grand Mean SSA SSW SST MSA MSW F df numer. df denom. Fu
Page Replacement Algorithm A B C D 11 12 18 11 13 14 16 12 17 17 18 16 17 19 20 15 15 21 22 14 16 18 15 17 14 19 17 13 10 18 21 16 12 16 16 17 14 18 20 18 13.9 17.2 18.3 14.9 16.075 47.30625 12.65625 49.50625 13.80625 213.5 336.775 41.091667 5.9305556 6.93 3 36 2.87 (from table)
F > Fu reject Ho. Algorithms A, B, C, and D have a significant difference at 0.05 level of significance.
123.275
F=MSA/MSW
α
c-1 = 4-1 n-c = 40-4 24
© 2001. D. A. Menascé. All Rights Reserved.
12
ANOVA With Excel Anova: Single Factor SUMMARY Groups Column 1 Column 2 Column 3 Column 4
Count 10 10 10 10
ANOVA Source of Variation Between Groups Within Groups
SS 123.275 213.5
Total
336.775
Sum Average 139 13.9 172 17.2 183 18.3 149 14.9
df
Variance 5.877778 6.844444 5.566667 5.433333
MS F P-value F crit 3 41.09167 6.928806 0.000844 2.866265 36 5.930556 39
Since the p-value is less than α = 0.05, reject Ho. 25 © 2001. D. A. Menascé. All Rights Reserved.
Multiple Comparisons: The Tukey-Kramer Procedure • If Ho is rejected, then the question is “Which groups are different?” • Use the Tukey-Kramer procedure to compare all pairs of groups simultaneously. • Must compute the differences X j − X j for j ≠ j ' among all c(c-1)/2 pairs of means. '
26 © 2001. D. A. Menascé. All Rights Reserved.
13
Multiple Comparisons: The Tukey-Kramer Procedure • Obtain the critical range: critical range = qu
MSW 1 1 + 2 n j n j'
where qu is the upper-tail critical value from a Studentized range* distribution with c degrees of freedom in the numerator and (n-c) degrees of freedom in the denominator. * See statistical table. 27 © 2001. D. A. Menascé. All Rights Reserved.
Multiple Comparisons: The Tukey-Kramer Procedure • A pair is considered significantly different if the absolute difference between the sample means exceeds the critical range.
28 © 2001. D. A. Menascé. All Rights Reserved.
14
Multiple Comparisons: The Tukey-Kramer Procedure |XA-XB| |XA-XC| |XA-XD| |XB-XC| |XB-XD| |XC-XD| qu MSW
3.3 4.4 1 1.1 2.3 3.4
critical range > 2.9341 A significantly different than B > 2.9341 A significantly different than C < 2.9341 A not significantly different than D < 2.9341 B not significantly different than C < 2.9341 B not significantly different than D > 2.9341 C significantly different than D
3.81 (from table) 5.930556
29 © 2001. D. A. Menascé. All Rights Reserved.
Reviewing ANOVA Assumptions • Randomness and independence: must always be met. • Normality: ANOVA F test is robust as long as distributions are not extremely different from a normal distribution particularly for large samples. • Homogeneity of variance: s 12 = s 22 = ... = s 2c – If unequal sample sizes between groups, different variances is a problem. – Should try to use same-size groups.
30 © 2001. D. A. Menascé. All Rights Reserved.
15
Randomized Block Model Level 1
Level 2
...
Level c
block 1
...
block r
• each block contains the response of the same item to the c levels of the factor being analyzed. • Purpose: remove as much block or subject variability as possible by reducing experimental error. 31 © 2001. D. A. Menascé. All Rights Reserved.
Randomized Block Model A
B Randomizer C
computer programs
D 32
© 2001. D. A. Menascé. All Rights Reserved.
16
Randomized Block Model A
B Randomizer C
computer programs
D 33
© 2001. D. A. Menascé. All Rights Reserved.
block 1 block 2 block 3 block 4 block 5
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5
program 1 program 2 program 3 program 4 program 5
r=5 c=4 34 © 2001. D. A. Menascé. All Rights Reserved.
17
ANOVA Model for Randomized Block Design Total Variation (SST) df = n - 1
Note: n = r x c
Among Group Variation (SSA) df = c - 1 Among-Block Variation (SSBL) df = r-1
Random Variation (SSE) df = (r-1)(c-1)
SST = SSA+SSBL+SSE
35
© 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model for Randomized Block Design Total Variation (SST) df = n - 1
Note: n = r x c
Among Group Variation (SSA) df = c - 1 Among-Block Variation (SSBL) df = r-1
Random Variation (SSE) df = (r-1)(c-1)
SST = SSA+SSBL+SSE
36
© 2001. D. A. Menascé. All Rights Reserved.
18
SST (Sum of Squares Total) c
r
(
SST = ∑ ∑ X ij − X j =1i =1
wherec
2
)
r
∑∑ X ij
X =
j =1i =1
rc
: overall or grand mean.
X ij : observation in i-th block and level j. r:
number of blocks.
37 © 2001. D. A. Menascé. All Rights Reserved.
block 1 block 2 block 3 block 4 block 5
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5
program 1 program 2 program 3 program 4 program 5
Grand Mean = 15.61 SST = (11.0-15.61)2 +(13.0-15.61)2 +…+(15.0-15.61)2 + … (11.0-15.61)2 +(12.0-15.61)2 +…+(13.5-15.61)2 = 232.44 38 © 2001. D. A. Menascé. All Rights Reserved.
19
ANOVA Model for Randomized Block Design Total Variation (SST) df = n - 1
Note: n = r x c
Among Group Variation (SSA) df = c - 1 Among-Block Variation (SSBL) df = r-1
Random Variation (SSE) df = (r-1)(c-1)
SST = SSA+SSBL+SSE
39
© 2001. D. A. Menascé. All Rights Reserved.
SSA (Sum of Squares Among Group) c
(
SSA = r ∑ X . j − X j =1
2
)
where r
∑ X ij
X . j = i =1 r
: group mean.
40 © 2001. D. A. Menascé. All Rights Reserved.
20
block 1 block 2 block 3 block 4 block 5 Mean Grand Mean
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5 14.0 15.1 20.3 13.1
program 1 program 2 program 3 program 4 program 5
15.61
SSA = 5 * [ (14.0-15.61)2 +(15.1-15.61)2 +…+(13.1-15.61)2 ] = 155.018
41 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model for Randomized Block Design Total Variation (SST) df = n - 1
Note: n = r x c
Among Group Variation (SSA) df = c - 1 Among-Block Variation (SSBL) df = r-1
Random Variation (SSE) df = (r-1)(c-1)
SST = SSA+SSBL+SSE
42
© 2001. D. A. Menascé. All Rights Reserved.
21
SSBL (Sum of Squares Among Blocks) r
(
SSBL = c ∑ X i. − X i =1
2
)
where c
∑ X ij X i. =
j =1
c
: block mean.
43 © 2001. D. A. Menascé. All Rights Reserved.
block 1 block 2 block 3 block 4 block 5 Mean Grand Mean
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5 14.0 15.1 20.3 13.1
program 1 program 2 program 3 program 4 program 5
Mean 13.0 14.5 18.8 15.4 16.4
15.61
SSBL = 4 * [(13.0-15.61)2 +(14.5-15.61)2 +…+(16.4-15.61)2 ] = 76.133
44 © 2001. D. A. Menascé. All Rights Reserved.
22
ANOVA Model for Randomized Block Design Total Variation (SST) df = n - 1
Note: n = r x c
Among Group Variation (SSA) df = c - 1 Among-Block Variation (SSBL) df = r-1
Random Variation (SSE) df = (r-1)(c-1)
SST = SSA+SSBL+SSE
45
© 2001. D. A. Menascé. All Rights Reserved.
SSE (Random Error)
SSE =
2
∑ ∑ (X ij − X . j − X i. + X ) c
r
j =1i =1
46 © 2001. D. A. Menascé. All Rights Reserved.
23
block 1 block 2 block 3 block 4 block 5 Mean
+
Grand Mean
-
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5 14.0 15.1 20.3 13.1
program program program program program
1 2 3 4 5
Mean 13.0 14.5 18.8 15.4 16.4
15.61
SSE = (11.0-14.0-13.0-15.61)2 +(13.0-14.0-14.5+15.61)2 + …. (21.0-20.3-16.4+15.61)2 +(13.5-13.1-16.4+15.61)2 = 1.287
47 © 2001. D. A. Menascé. All Rights Reserved.
ANOVA Model: Mean Squares SSA c −1 SSBL MSBL = r −1 SSE MSE = (r −1)(c − 1) MSA =
The mean squares are variances! If there are no real differences among the c groups, MSA, MSBL, and MSE provide estimates for the variance inherent in the data. 48 © 2001. D. A. Menascé. All Rights Reserved.
24
ANOVA Hypothesis Testing H 0 : µ.1 = µ.2 = ... = µ.c
H1 : Not allµ. j ( j = 1,..., c) are equal. F-Test statistic:
F=
MSA MSE
The F-test statistic follows an F distribution with (c-1) degrees Of freedom in the numerator and (r-1)(c-1) in the denominator.
Reject H 0 if F > Fu 49 © 2001. D. A. Menascé. All Rights Reserved.
block 1 block 2 block 3 block 4 block 5 Mean Grand Mean SSA SSE SSBL SST MSA MSBL MSE F df numer. df denom. Fu
Page Replacement Algorithm A B C D 11.0 12.0 18.0 11.0 13.0 14.0 19.0 12.0 17.0 18.4 23.4 16.5 14.0 14.9 20.0 12.5 15.0 16.0 21.0 13.5 14.0 15.1 20.3 13.1
program 1 program 2 program 3 program 4 program 5
Mean 13.0 14.5 18.8 15.4 16.4
15.61 155.018 1.287 76.133 232.438 51.67267 19.03325 0.10725 481.80
F=MSA/MSE
3 12 7.23 (from table)
F > Fu
reject Ho 50
© 2001. D. A. Menascé. All Rights Reserved.
25
Anova: Two-Factor Without Replication SUMMARY Row Row Row Row Row
Count
1 2 3 4 5
Column 1 Column 2 Column 3 Column 4
Sum 4 4 4 4 4
52 58 75.3 61.4 65.5
5 5 5 5
70 75.3 101.4 65.5
Average Variance 13 11.333 14.5 9.667 18.825 9.949 15.35 10.590 16.375 10.563 14 15.06 20.28 13.1
5 5.638 4.292 4.425
MSA
ANOVA Source of Variation Rows (blocks) Columns (groups) Error
SS 76.133 155.018 1.287
Total
232.438
df
MS F P-value 4 19.03325 177.4662 1.46E-10 3 51.67267 481.79643 9.11E-13 12 0.10725
MSE
19
F crit 3.25916 3.4903
=MSA/MSE
Since the p-value is less than α = 0.05, reject Ho. Since F > F critical, reject Ho. 51 © 2001. D. A. Menascé. All Rights Reserved.
Estimated Relative Efficiency (RE) SSBL RE =
(r −1) MSBL + r ( c − 1)MSE ( rc −1) MSE
n-1 • Used to assess if blocking results in an increase in precision in comparing the different groups. MSA MSBL MSE
51.67267 19.03325 0.10725
RE
=
38.2
• If blocking is not used, we would need 38.2 times as many observations to obtain the same precision in comparing the groups.
52
© 2001. D. A. Menascé. All Rights Reserved.
26
Multiple Comparisons: The Tukey-Kramer Procedure • Obtain the critical range: critical range = qu
MSE r
where qu is the upper-tail critical value from a Studentized range* distribution with c degrees of freedom in the numerator and (r-1)(c-1) degrees of freedom in the denominator. (See Statistical Tables). 53 © 2001. D. A. Menascé. All Rights Reserved.
Multiple Comparisons: The TukeyKramer Procedure Tukey Kramer Multiple Comparisons
Group 1 2 3 4 Intermediate Calculations MSE r c
Sample Mean 14 15.06 20.28 13.1
Comparison Group 1 to Group 2 Group 1 to Group 3 Group 1 to Group 4 Group 2 to Group 3 Group 2 to Group 4 Group 3 to Group 4
Absolute Difference 1.06 6.28 0.9 5.22 1.96 7.18
Critical Range Result 0.615124 Means are different 0.615124 Means are different 0.615124 Means are different 0.615124 Means are different 0.615124 Means are different 0.615124 Means are different
0.10725 5 4
54 © 2001. D. A. Menascé. All Rights Reserved.
27