Using MS Excel in teaching Design of Experiment

International Journal of Education and Learning Systems http://iaras.org/iaras/journals/ijels

Mohammad Salehi

Mathematics, Statistics, and Physics, Qatar University, Doha, PO Box 2713, QATAR
[email protected]

Abstract: - Choosing the right software to teach an applied statistics course is crucial. When teaching applied statistics courses to statistics students, the instructor needs interactive software: instructors should communicate with the software rather than obtain all the statistical results by clicking tabs. The main reason is that the instructor can then teach the techniques and the data analysis simultaneously. Perhaps the most straightforward option is the free software R, but R is not easy for average students to use. In this paper, I share my experience of using MS Excel in teaching Design of Experiment. I consider this experience a success.

Key-Words: - Interactive teaching approach; Statistical package; Problem-oriented; Technique-oriented.

1 Introduction

The first challenge of teaching an applied statistics course is choosing appropriate statistical packages. Instructors should solve both "problem-oriented" and "technique-oriented" exercises [1]. For "problem-oriented" exercises, software with clickable interfaces such as SPSS and Minitab is very useful, because students do not need extensive learning and training to use it. The software provides the required output with a few clicks, and students can focus on solving the problem and interpreting the output. However, when teaching a statistics course to statistics students, the instructor ought to present "technique-oriented" exercises in order to teach the technique and review some important concepts. Although using artificial data sets is discouraged for many reasons, for first-year courses one may prepare a simple artificial data set and use a calculator to teach and clarify the technique [5]. However, it is very difficult, if not impossible, to teach techniques with only a calculator in more advanced courses such as Design of Experiment.

An option is to use R [6], which is reasonably good software for teaching data analysis and techniques simultaneously. Moreover, it is free software and designed for teaching. Personally, I like the software very much for its many advantages, particularly its interactive nature. However, it requires extensive learning and training and some programming skills [3]. R is taught in a course called Statistical Packages (STAT371) in the Statistics program at Qatar University. However, not all students have passed this course when they take Design of Experiment (STAT332). Even those students who have passed the course are reluctant to use R. In Spring 2016, I was supposed to teach STAT381 (Categorical data analysis), and I posted its syllabus on Blackboard the night before the first class. In the syllabus, I mentioned that we were going to use R. Right after I posted it, all students but one dropped the course! My conjecture was that R is not their favorite software. Like many universities, we should have one extra contact hour for each applied statistics course in which a TA works on R, so that students feel comfortable using it. But this opportunity is not available in our program.

However, our students are usually familiar with Excel, either from high school or from the Foundation program of Qatar University, so they are very comfortable using it. This motivated me to explore the possibility of using Excel in this course. As a first step, I did a literature survey on the subject. As I expected, there are quite a number of papers that discuss the advantages and disadvantages of using Excel in first-year statistics courses. Goos and Leemans (2004) [2] discussed using Excel to teach optimal design of experiments; they presented an interactive teaching approach that uses matrix operations in Excel. I also searched Google to find out whether some colleagues use Excel in teaching Design of Experiment, but I could not find any. I decided to use Excel alongside SPSS and Minitab in teaching. I will present an example solved in Excel to demonstrate how feasible it is to use Excel in a Design of Experiment course.

ISSN: 2367-8933


Volume 1, 2016


2 Used Features of MS Excel

An important advantage of MS Excel is that it can be used interactively in class and avoids the "black box" approaches offered by some statistical software. Using MS Excel helps students to understand the problem and its solution. After teaching the techniques and methods with interactive software, we may use software such as SPSS or Minitab to speed up the teaching and to emphasize the data analysis part of the course. Our text book is Montgomery (2013) [4], and I solved most of its examples in detail using MS Excel during class time. Very few Excel features were used. The most frequently used features are presented below.

2.1 PivotTable

PivotTable creates multi-way tables of functions, including frequencies, summations, means, variances, etc. PivotTable is on the Insert tab in Excel. The data set files provided by the text book are in a stacked format, so PivotTable is very useful for getting sums over the factor levels as well as over the replications. Figure 1 shows PivotTable and the Insert tab.

Figure 1. PivotTable in the Insert tab, shown by red boxes.

2.2 Simple functions

To compute formulas in experimental design problems one needs the four basic arithmetic operations, the square and square root operations, and summation, SUM(), over a row, a column or a matrix, all of which can easily be done in a nice interactive way in Excel. I also used F.DIST.RT for computing P-values in the ANOVA tables.

3 Examples

As mentioned, Excel was used for almost all methods taught in the course. I will present an example to show that a practical example can be presented in class, with the details of the methods and the computation of the formulas, in a reasonable time.

We present Example 5.3 of the text book [4]. As an example of a factorial design involving two factors, an engineer is designing a battery for use in a device that will be subjected to some extreme variations in temperature. The only design parameter that he can select at this point is the plate material for the battery, and he has three possible choices. When the device is manufactured and is shipped to the field, the engineer has no control over the temperature extremes that the device will encounter, and he knows from experience that temperature will probably affect the effective battery life. However, temperature can be controlled in the product development laboratory for the purposes of a test. The engineer decides to test all three plate materials at three temperature levels, 15, 70, and 125°F, because these temperature levels are consistent with the product end-use environment. Because there are two factors at three levels, this design is sometimes called a $3^2$ factorial design. Four batteries are tested at each combination of plate material and temperature, and all 36 tests are run in random order. The experiment and the resulting observed battery life data are given in Figure 1 format. The general model is

$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}, \qquad i = 1, \dots, a, \quad j = 1, \dots, b, \quad k = 1, \dots, n,$$

where $\tau_i$ and $\beta_j$ are the row and column treatment effects, $(\tau\beta)_{ij}$ is their interaction, and $\varepsilon_{ijk}$ is the random error.

We are interested in testing hypotheses about the equality of row treatment effects, say

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \quad \text{vs.} \quad H_1: \text{at least one } \tau_i \neq 0, \tag{1}$$

and the equality of column treatment effects, say

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0 \quad \text{vs.} \quad H_1: \text{at least one } \beta_j \neq 0. \tag{2}$$

We are also interested in determining whether row and column treatments interact:

$$H_0: (\tau\beta)_{ij} = 0 \text{ for all } i, j \quad \text{vs.} \quad H_1: \text{at least one } (\tau\beta)_{ij} \neq 0. \tag{3}$$

To test (1), (2) and (3) we should compute the summations over the rows, the columns and the replications, which can be done quickly using PivotTable in Excel. The result is as follows.

Table 1: Using PivotTable to compute summations over the row factor, the column factor and the replications.

Temperature |            15             |            70             |            125            | Grand
Replicate   |    1    2    3    4  Total |    1    2    3    4  Total |    1    2    3    4  Total | Total
Row 1       |  130   74  155  180    539 |   34   80   40   75    229 |   20   82   70   58    230 |   998
Row 2       |  150  159  188  126    623 |  136  106  122  115    479 |   25   58   70   45    198 |  1300
Row 3       |  138  168  110  160    576 |  174  150  120  139    583 |   96   82  104   60    342 |  1501
Grand Total |  418  401  453  466   1738 |  344  336  282  329   1291 |  141  222  244  163    770 |  3799

This is a reproduction of Table 5.4 of the text book. At this stage, we copy and paste Table 1 into two tables in other cells, one of the observations and one of the row totals (the values in the gray cells): Table 2 (a) and (b). It is better to paste them as Values to avoid unwanted changes.

Figure 1: Values (V) Paste option.

Table 2: a) Copy and paste of the observations from Table 1.

Temperature |         15         |         70         |         125        |
Replicate   |    1    2    3    4 |    1    2    3    4 |    1    2    3    4 | Total
Row 1       |  130   74  155  180 |   34   80   40   75 |   20   82   70   58 |   998
Row 2       |  150  159  188  126 |  136  106  122  115 |   25   58   70   45 |  1300
Row 3       |  138  168  110  160 |  174  150  120  139 |   96   82  104   60 |  1501
Totals      |  418  401  453  466 |  344  336  282  329 |  141  222  244  163 |  3799

b) Copy and paste of the row totals from Table 1.

Row | 15 Total | 70 Total | 125 Total | Grand Total
1   |      539 |      229 |       230 |         998
2   |      623 |      479 |       198 |        1300
3   |      576 |      583 |       342 |        1501

Then, using the square operation, ^2, and the drag method, we produce the squares of Table 2 (a) and (b). We can now calculate the sums of squares required in the ANOVA table from Table 2.
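For reference, the sums of squares of a balanced two-factor factorial design with $a$ row levels, $b$ column levels and $n$ replicates per cell are usually written as below. This is a sketch in the dot notation of the text book [4], where a dot replaces the subscript that is summed over; it is stated here as an assumption about the form used, not as a transcript of the worksheet.

```latex
\begin{align*}
SS_T    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^{2} - \frac{y_{...}^{2}}{abn}, \\
SS_A    &= \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^{2} - \frac{y_{...}^{2}}{abn}, \\
SS_B    &= \frac{1}{an}\sum_{j=1}^{b} y_{.j.}^{2} - \frac{y_{...}^{2}}{abn}, \\
SS_{AB} &= \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^{2} - \frac{y_{...}^{2}}{abn} - SS_A - SS_B, \\
SS_E    &= SS_T - SS_A - SS_B - SS_{AB}.
\end{align*}
```

Each sum on the right-hand side is a sum of squared entries of a table of totals divided by the number of observations behind each total, which is why squaring the pasted tables and applying SUM() is enough.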

They are computed using only the row and column SUM() function and the arithmetic operations. We are now able to create the ANOVA table, Table 3. The first column is just the sums of squares, the second is the degrees of freedom, the third column is produced by dividing the first column by the second, and the fourth column is obtained by dividing the third column by MSE, the mean square of the error row. The last column gives the P-values, for which the function "=F.DIST.RT(F0,Df,DFE)" is used. When one uses software with clickable interfaces, the P-values appear out of the dark, so students gradually forget the concept and only know that H0 is rejected if the P-value is small. Using Excel, we can emphasize that the P-value is the probability, under H0, of observing data at least as extreme as the data actually observed. We can also have a flashback on the structure of the F distribution.
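The ANOVA computations described above can be cross-checked with a short Python sketch. It is an illustration under stated assumptions, not the Excel workbook used in class: all variable names are hypothetical, and `f_right_tail` is a hand-written stand-in for Excel's F.DIST.RT that integrates a Beta density with Simpson's rule and is only intended for degrees of freedom of at least 2.

```python
import math

# cells[i][j]: the n = 4 battery lives for row level i and
# column level j (temperatures 15, 70, 125), as in the example.
cells = [
    [[130, 74, 155, 180], [34, 80, 40, 75], [20, 82, 70, 58]],
    [[150, 159, 188, 126], [136, 106, 122, 115], [25, 58, 70, 45]],
    [[138, 168, 110, 160], [174, 150, 120, 139], [96, 82, 104, 60]],
]
a, b, n = len(cells), len(cells[0]), len(cells[0][0])

obs = [y for row in cells for cell in row for y in cell]
cell_tot = [[sum(cell) for cell in row] for row in cells]
row_tot = [sum(row) for row in cell_tot]
col_tot = [sum(cell_tot[i][j] for i in range(a)) for j in range(b)]
cf = sum(obs) ** 2 / (a * b * n)  # correction term: grand total squared / (abn)

# Sums of squares from squared totals, as done with ^2 and SUM() in Excel.
ss = {
    "T": sum(y * y for y in obs) - cf,
    "A": sum(t * t for t in row_tot) / (b * n) - cf,
    "B": sum(t * t for t in col_tot) / (a * n) - cf,
}
ss["AB"] = sum(t * t for row in cell_tot for t in row) / n - cf - ss["A"] - ss["B"]
ss["E"] = ss["T"] - ss["A"] - ss["B"] - ss["AB"]

df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
ms = {k: ss[k] / df[k] for k in df}


def f_right_tail(f0, d1, d2, steps=2000):
    """P(F > f0) for an F(d1, d2) variable; assumes d1 >= 2 and d2 >= 2.

    Uses P(F > f0) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1*f0), where I_x
    is the regularized incomplete beta function, integrated numerically
    with the composite Simpson rule (steps must be even).
    """
    p, q = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f0)
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

    def g(t):
        return t ** (p - 1) * (1 - t) ** (q - 1)

    h = x / steps
    total = g(0.0) + g(x)
    total += sum((4 if k % 2 else 2) * g(k * h) for k in range(1, steps))
    return total * h / 3 / math.exp(log_beta)


f0 = {k: ms[k] / ms["E"] for k in ("A", "B", "AB")}
p_value = {k: f_right_tail(f0[k], df[k], df["E"]) for k in f0}

for k in ("A", "B", "AB"):
    print(f"{k:>2}: SS={ss[k]:.2f} DF={df[k]} MS={ms[k]:.2f} "
          f"F0={f0[k]:.3f} P={p_value[k]:.3g}")
```

Writing the right-tail probability as an explicit integral is a deliberate choice here: it makes visible that the P-value is an area under the F density beyond the observed F0, which is the point the Excel function is used to emphasize. In practice a statistics library would normally replace the hand-written integration.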

Table 3: The ANOVA table.

Source | SS      | DF | MS        | F0    | P-value
SSA    | 10683.7 |  2 | 5341.8611 | 7.911 | 0.001976
SSB    | 39118.7 |  2 | 19559.361 | 28.97 | 1.91E-07
SSAB   | 9613.78 |  4 | 2403.4444 | 3.56  | 0.018611
SSE    | 18230.8 | 27 | 675.21296 |       |
SST    | 77647   | 35 | 2218.4849 |       |

Solving this example took less than 30 minutes in class, and I could review all the computations and concepts. More importantly, students followed the steps very well.

4 Students feedback

Students were quite comfortable with Excel and expressed their appreciation for using it. To test their level of learning through Excel, I gave the following question in the final exam and restricted them to using Excel only. The question is an Analysis of Covariance problem from Chapter 15 of the text book.

Question 4
An engineer is studying the effect of cutting speed on the rate of metal removal in a machining operation. However, the rate of metal removal is also related to the hardness of the test specimen. Five observations are taken at each cutting speed. The amount of metal removed (y) and the hardness of the specimen (x) are shown in the following table. Analyze the data using an analysis of covariance. Use α = 0.05.

Cutting Speed (rpm)
    1000    |    1200    |    1400
   y     x  |   y     x  |   y     x
   8    60  |  52   105  |  58   115
  30    80  |  34    80  |  22    72
  38    90  |   5    60  |  13    64
  17    65  |  14    65  |  32    81
  28    76  |  25    73  |  20    70

a) Write an appropriate model indicating the response, the factor, and the covariate variables.
b) Compute all necessary S.., T.. and E..'s, and test the hypothesis that there is no treatment effect at level 0.05.
c) Test the hypothesis that the covariate variable (the hardness) has no effect at level 0.05.

I had 23 students in this course. The mean mark for the final exam was 23.12 out of 35, while the mean score for Question 4 was 7.1 out of 10. That is, the mean percentage was 71% for Question 4 and 66% for the final exam as a whole, so students did slightly better on Question 4. An example of a student's Excel sheet in the final exam is given in Figure 3.

5 Conclusion

I used Excel effectively in teaching Design of Experiment to statistics students and found it useful for teaching "technique-oriented" exercises in class. Many examples were taught using Excel, which was mainly used as a calculator. Our students were comfortable using it and expressed their preference for Excel. I do not claim that Excel is the best software for this purpose, but we spent significantly less time teaching it than we would have spent on a statistical package like R, which requires extensive learning and training and some programming skills. It can therefore be a good solution when no TA is available to assist the instructor in teaching a statistical package and students have difficulty using programmable software.

References:
[1] Chatfield C., Teaching a Course in Applied Statistics, Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 31, No. 3, 1982, pp. 272-289.
[2] Goos P. and Leemans H., Journal of Statistics Education, Vol. 12, No. 3, 2004, www.amstat.org/publications/jse/v12n3/goos.html.
[3] Paura L. and Arhipova I., Advantages and Disadvantages of Professional and Free Software for Teaching Statistics, Information Technology and Management Science, Vol. 15, 2012, pp. 9-14.
[4] Montgomery D. C., Design and Analysis of Experiments, 8th edition, John Wiley & Sons, New York, 2013.
[5] Singer J. D. and Willett J. B., Improving the Teaching of Applied Statistics: Putting the Data Back into Data Analysis, The American Statistician, Vol. 44, No. 3, 1990, pp. 224-230.
[6] The R Project for Statistical Computing. [Online]. Available: http://www.R-project.org/

Figure 2: Excel sheet of results for the Example.


Figure 3: Student's example in final exam.
