Will appear in the 6th Annual Conference on the Teaching of Computing, 18th - 21 August, 1998, Dublin, Ireland


A survey of methods used to evaluate computer science teaching

Angela Carbone

Jens J. Kaasbøll

School of Computer Science and Software Engineering Monash University Wellington Road, Clayton, Victoria 3168, Australia +61 3 9905 5200

Department of Informatics, University of Oslo P.O.Box 1080 Blindern N - 0316 Oslo, Norway +47 2285 2429

[email protected]

7 April, 1998

ABSTRACT
A literature survey shows that the teacher’s own impression of the teaching and the students is the most common way of evaluating novelties in teaching. However, low-cost methods with better validity and reliability were also found. These either drew on data from several sources or comprised several learning cycles in an iterative development.

Keywords: Literature survey, research methods, evaluation, learning

1. INTRODUCTION
Like many other teachers, we employ many ideas when trying to improve our teaching, e.g., role plays, games or real-world projects. We gain useful experience, but unfortunately, we have not always documented the teaching and the students’ learning. That makes it difficult to convince other teachers of the soundness of our approach. When no well-documented teaching procedure and evaluation of its effects are given, it is often impossible to determine whether a reported success depends on the method of teaching, on the teacher’s personality, or on other factors. Dale [6] raises similar concerns in demanding that research on computer science education should apply the same principles as research in other areas of computer science. Since we aim at finding and substantiating knowledge on teaching, we surveyed evaluation methods previously used. This paper presents the results of the survey and discusses the qualities of the evaluation methods, taking into account the limited time teachers and researchers have for evaluation. To get an overview of current research, we scanned the 115 short articles in the Bulletin of the Special Interest Group for Computer Science Education (SIGCSE Bulletin), volume 28, 1996. We found 26 relevant papers which included indications of empirical evaluation. In addition, we picked relevant papers from Communications of the ACM from 1992 to 1997, and also included some research that is often quoted.

2. Approaches to Evaluation
Gilmore [10] distinguishes between controlled experiments and real-world observations in his exposition of research traditions in the study of programming. We find a similar distinction useful also in studies of teaching computer science.

2.1 Experimental Research
Changes of teaching and evaluation of the change fall into the category of experiments in educational research [9]. A normal experimental set-up, which may include pre- and post-tests and control groups, is recommended. Gall et al. [9] suggest that both quantitative and qualitative methods are useful in evaluating the experiment. Almstrum et al. [1] outline some typical experiments in the teaching of computer science. For example, to evaluate the use of multimedia in a beginners’ course, they suggest a control group taught by traditional methods, and collecting rich evidence through participant observation, measures of time spent, and a questionnaire to the students.

2.2 Studies of Teaching and Learning
Teaching can be studied like any other professional practice [16], [14]. Many teachers have developed useful theories of how their students learn, and ways of teaching that correspond to the learning paths. Through documenting these ideas and practices, other teachers can recognize the teaching situation. Subsequently they can judge whether and how they should adapt the way of teaching in their own situation. Studies of teaching require comprehensive documentation of the essential parts of the teaching and of the students’ behavior and thoughts. This documentation would include the teacher’s model or theory of student learning, the teacher’s strategy of teaching, detailed accounts of how she or he teaches according to the strategy, and the students’ responses and actions during the period of teaching. In studies of teaching, describing the processes of teaching and learning is more important than in the experimental setting, where studies of the situation before and after the teaching constitute the basis for evaluation.

3. Methods Used
In this section, we outline the techniques that have been used in recent research for evaluating the teaching of difficult areas. We regard evaluation as a necessary activity in each research cycle, which may consist of formulation of theory, planning of an empirical study, the empirical study itself, evaluation, and a return to a new cycle where the theory is revised. We separate the techniques by whether they aim at evaluation in experiments or in studies of teaching.

3.1 Evaluations of Experiments
3.1.1 General questionnaires
Many universities routinely ask students about their impression of a course after the course is finished. Such questionnaires are normally of a general nature. The


responses that may be useful for evaluating a specific area or unit of teaching are the demographic data and possibly some comments made to open-ended questions. Goelman [11] reports on a practical way of teaching database theory. He received specific comments on his teaching through a general questionnaire, confirming that the students did not complain about unnecessary theoretical work, which had been the normal situation in previous courses. Such comments are valuable for experimental research, but without questions targeted at the teaching innovation, the responses will not address it systematically.

3.1.2 Examination Marks
Assuming that examination marks portray the students’ competence, the marks should be indicators of the effects of teaching. To measure the success of teaching a difficult topic, the marks on this topic have to be separable from those of other topics in a course. In order to compare a change of teaching from one population of students to another, the populations also have to be matched, and the procedure of marking has to be consistent and independent. These requirements may explain why so few changes of teaching have been measured by the very method we prefer for measuring our students’ achievements. McLoughlin and Hely [13] mention that the examination marks of students who were taught programming by means of developing formal specifications into programs improved significantly. Since this change of teaching concerned the whole course, the marks may actually reflect the subject to be learnt. However, no details concerning the population and the reliability of the measure are given. Barrett [2] reports in a similar way on an attempt to teach design. Erickson [7] shows the development of student grades over several semesters throughout the introduction of a laboratory for teaching operating systems. His discussion of changes in the student population and of the small changes in grades illustrates the difficulty of longitudinal evaluation based on grades.
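As an illustration of the kind of comparison this method requires, the following sketch (our own construction, with invented cohort sizes and marks, not data from any of the cited papers) compares the marks on one separable topic between two cohorts using Welch’s t statistic:

```python
# Minimal sketch (invented data): comparing marks on one separable
# topic between two cohorts of students.
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples of marks."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Marks on the difficult topic only, separated from the rest of the exam.
before = [52, 61, 47, 58, 55, 49, 63, 51]  # cohort taught the old way
after = [60, 66, 57, 71, 59, 64, 68, 62]   # cohort taught with the innovation

print(f"means: {mean(before):.1f} -> {mean(after):.1f}, "
      f"t = {welch_t(after, before):.2f}")
```

As the section stresses, a higher mean alone says little unless the cohorts are matched and the marking procedure is consistent and independent across them.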

3.1.3 Experiments
Current research in computer science education does not seem to include many laboratory experiments. Vessey and Conger’s [17] experiment on the relative difficulty of learning object-oriented modeling versus process modeling and data modeling constitutes an exception. However, they evaluated learning independently of teaching. This is also a common research method in human-computer interaction, where the learnability of programs is measured. Studies of learning show only limited aspects of how people learn, and not how various forms of teaching can affect learning. When considering issues that are more complex than finding a way to get food out of a box, the teaching and learning process often becomes too complex to set up as a controlled experiment. Nevertheless, Linn and Clancy [15] demonstrated that including expert programmers’ comments in teaching was superior both to using only expert code and to making students create programs on their own without the comments. They came to this conclusion after a controlled experiment with the three mentioned teaching strategies as the independent variable.
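To make the shape of such a controlled experiment concrete, here is a hedged sketch (our invention, not Linn and Clancy’s actual procedure): students are randomly assigned to one of three teaching strategies, the strategy being the independent variable and a post-test score the dependent one. The strategy names paraphrase the three conditions above; `post_test` is a placeholder for whatever instrument would measure learning.

```python
# Sketch of a controlled experiment with teaching strategy as the
# independent variable. Students and scores are invented placeholders.
import random
from statistics import mean

random.seed(1)  # reproducible assignment for the example

strategies = ["expert code with comments", "expert code only", "own programs"]
students = [f"student-{i:02d}" for i in range(30)]
random.shuffle(students)

# Balanced random assignment: every third shuffled student per condition.
groups = {s: students[i::3] for i, s in enumerate(strategies)}

def post_test(student):
    # Placeholder: in a real experiment this would be an assessed task.
    return random.gauss(60, 10)

for strategy, members in groups.items():
    scores = [post_test(m) for m in members]
    print(f"{strategy}: n={len(members)}, mean score={mean(scores):.1f}")
```

The point of the sketch is the design, not the numbers: random, balanced assignment is what lets differences in the outcome measure be attributed to the teaching strategy rather than to the composition of the groups.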

3.2 Studies of teaching and learning
3.2.1 Phenomenography
Booth [3] interviewed students beginning to learn programming, and followed them up through a series of interviews over six months. Her study provided new insights into students’ conceptions of programming,


computers, and programming languages, and it demonstrates the complexity of the conceptions of programming. Her research is frequently cited by others. Social science studies like Booth’s are powerful ways of understanding how students think and how our teaching succeeds or misses out. Training as a social scientist is necessary to carry out such studies, and the studies are time consuming, so they are beyond the capabilities of most computer science teachers. However, computer science and information systems teachers can also learn some principles from social science for improving the quality of their studies. Detailed observations and literal citations of what students do and say, and subsequent interpretation of why the students say this, give the reader the opportunity to form an informed opinion of whether to agree with the author’s interpretations. Also, asking questions to reveal the students’ understanding at various stages during a course, e.g., “What do the concepts instantiation and specialization mean to you?”, can be useful for finding out differences between the students’ and the teachers’ conceptualizations.

3.2.2 Description of One’s Own Teaching and of One’s Students’ Learning
We found 19 papers where teachers describe their teaching together with their experience of their students’ learning; see Table 1. The documentation resembles that of the case discussion method in educational science [5]. The idea of the case method is to discuss one’s teaching with colleagues, and one’s personal description of the teaching and the student response constitutes the basis for the discussion. The case method may be part of an action research approach, and the collegial discussion can contribute the interpersonal interpretation which characterizes good research. The report of the research should therefore also include the case discussion. Iterations of theory refinement, planning, action and evaluation constitute the core learning cycle of experimental and action research. Feldman and Zelenski [8] describe several iterations of their development of an assignment for learning recursion. Even if their evaluation does not follow a rigorous method, their reports of the students’ work give convincing reasons for their removal of unnecessary detail and their focus on aspects of recursion in the assignment. Hutchens and Katz [12] report in a similar way on their teaching of iterative development.

3.3 Combined Approaches
Carroll and Rosson [4] have carried out seven cycles of development and evaluation of a computer tutorial for learning object-oriented programming. They carried out formative and summative evaluations, evaluated experiments and observed the learning process. In the end, programmers learning through their tutorial mastered the object-oriented paradigm after 6 hours, while programmers learning by means of a comparable commercial tutorial did not master the concepts after 12 hours. This example strongly indicates that iterative design combined with different evaluations is an effective way of developing teaching techniques. Since Carroll and Rosson’s mode of instruction was a computer tutorial, they could exercise more control over the teaching than is possible in, e.g., classroom or group work teaching. Improved control eases the set-up of laboratory tests and evaluations. The researchers’ background in human-computer interaction suggests that this field of study has a rich repertoire of evaluation methods that is worth considering also when evaluating teaching.


4. Discussion
The literature review indicates that evaluations of teaching are commonly restricted to the teachers’ personal experience of the students’ learning. More elaborate ways of evaluation have also been found, and these seem to group into a high-road and a low-road solution. The high-road studies, e.g., Booth [3] and Linn and Clancy [15], seem to provide knowledge that others find useful. However, they are time consuming and require a sound knowledge of educational, psychological or social science research methods. In projects that are supported by research grants, comprehensive evaluations like these would be feasible. Even though resources spent on evaluation imply that some research problems must be omitted from a project, we believe that convincing answers to a few problems are usually better than anecdotal evidence of some students’ learning. Most computer science and information systems teachers do not have substantial additional resources to spend on evaluating their teaching improvements. Some low-road solutions to evaluation in these cases have been indicated above. Data from sources like questionnaires (possibly with targeted questions), examination marks, literal quotations from students, and material from student work are easy to collect. Gathering data from more than one source usually strengthens the validity of the findings. Carrying out several learning cycles in an iterative development also strengthens the credibility of the report. Having worked in the low-road situation for a while, we have also used a couple of other methods for evaluating teaching innovations. Fellow academics will often be willing to sit in on occasional teaching sessions and subsequently discuss the session. Peer observers can offer feedback on content, organisation and communication issues.
This type of review often benefits both the teacher, who is afforded another perspective on his or her teaching performance, and the observers, who have the chance to quietly compare their own approaches to teaching with another’s, without any performance pressure. When teaching groups of 15-30 students, some of the students often participate actively in discussions. Concentrating on the teaching, the teacher may not be aware of which student is saying what. A few active voices who demonstrate their competence may therefore create the impression that every student understands the subject of the session, while this understanding is actually limited to the few active students. We have experienced that this misconception can be avoided by having an observer record a “class interaction diagram.” A class interaction diagram consists of reports on how much and what type of student discussion occurred, i.e., were the questions asked by the academic open or closed, were student responses forced or unforced, were the responses short or long, which students asked questions, and what type of questions did they ask. Recording the diagrams requires little training. We have also used the Web for feedback from tutors and distribution of teaching ideas. For example, a tutor has entered the following, for the other tutors also to see:

3

Topic: Pointers

Teaching example: I gave the students an analogy which they understood. I explained that memory was like a road, then drew a road on the board (and called it "Memory Lane"). Then I showed that the memory itself were empty blocks of land on the road, and that each block of land has its own address. The contents of the blocks of land (the data) were houses. I asked each student to name what sort of house they'd like on their block of land; this helped me learn the students' names too. I explained that pointers were sort of like forwarding addresses.

Concluding remarks: On the whole, the students found that the Memory Lane analogy worked quite well. Even though it took away from the time the students had to do the practice session, it allowed them to finish it more quickly since they had a greater understanding of it.

General teacher evaluation: It was a good class and I think everyone enjoyed it.

This mechanism provides a low-cost distribution of teaching ideas, although the evaluation is based on the tutor’s subjective experience.

5. Conclusion
In research projects supported by grants, resources should be spent on evaluation according to established research methods. The computer science teachers who carry out the majority of teaching innovations do not usually have time for thorough research procedures. Even so, the validity of evaluations can be increased by gathering and analysing data from several sources, and the data collection is often easy to carry out. Iterative development and evaluation of a teaching innovation also seems to improve the validity of the findings. We therefore recommend that when evaluating teaching innovations, computer science teachers find ways that are simple yet more powerful than only stating their subjective impression. Some of the examples described may serve as prototypes for evaluation.

6. References
[1] Almstrum, V.L.; Dale, N.; Berglund, A. et al. (1996) Evaluation: turning technology from toy to tool. SIGCSE Bulletin 28, special issue on integrating technology into computer science education, 201-217
[2] Barrett, M.L. (1996) Emphasizing design in CS1. SIGCSE Bulletin 28, 1, 315-318
[3] Booth, S. (1992) Learning to program: a phenomenographic perspective. ACTA Univ., Gothenburg studies in educational science 89
[4] Carroll, J.M. and Rosson, M.B. (1995) Managing evaluation goals for training. Communications of the ACM, 38, 7, 40-48
[5] Colbert, J.A.; Desberg, P.; Trimble, K. (1996) The Case for Education: Contemporary Approaches for Using Case Methods. Allyn and Bacon, Boston
[6] Dale, N. (1996) Research in computer science education: five case studies. SIGCSE Bulletin 28, 1, 1-2
[7] Erickson, C. (1996) The EOS laboratory environment for a course in operating systems. SIGCSE Bulletin 28, 1, 353-357
[8] Feldman, T.J. and Zelenski, J.D. (1996) The quest for excellence in designing CS1/CS2 assignments. SIGCSE Bulletin 28, 1, 319-323
[9] Gall, M.D.; Borg, W.R.; Gall, J. (1996) Educational Research: An Introduction. Longman Publishers, USA
[10] Gilmore, D.J. (1990) Methodological issues in the study of programming. In J.-M. Hoc et al. (eds.) Psychology of Programming. Academic Press, London, 83-98



[11] Goelman, D. (1996) The Ingres tutorial as a tool in teaching database theory. SIGCSE Bulletin 28, 1, 117-119

[12] Hutchens, D.H. and Katz, E.E. (1996) Using iterative enhancement in undergraduate software engineering courses. SIGCSE Bulletin 28, 1, 266-270

[13] McLoughlin, H. and Hely, K. (1996) Teaching formal programming to first year computer science students. SIGCSE Bulletin 28, 1, 155-159

[14] Leinhardt, G. (1990) Capturing craft knowledge in teaching. Educational Researcher 19, 2, 18-25

[15] Linn, M.C. and Clancy, M.J. (1992) The case for case studies of programming problems. Communications of the ACM, 35, 3, 121-132

[16] Shulman, L.S. (1987) Knowledge and Teaching: Foundations of the New Reform. Harvard Educational Review, 57, 1, 1-22

[17] Vessey, I. and Conger, S.A. (1994) Requirements specification: learning object, process, and data methodologies. Communications of the ACM, 37, 4, 102-113

Table 1. Papers written by teachers describing their teaching and students' learning

Concurrent programming: Berk, Toby S. (1996) A simple student environment for lightweight process concurrent programming under SunOS. SIGCSE Bulletin 28, 1, 165-169

Inheritance: Biddle, Robert and Tempero, Ewan (1996) Explaining inheritance: a code reusability perspective. SIGCSE Bulletin 28, 1, 217-221

Analysis of algorithms: Bradley, Michael J. (1996) Analyzing multi-phase searching algorithms. SIGCSE Bulletin 28, 3, 5-8

Synchronization: Bynum, Bill and Camp, Tracy (1996) After you, Alfonse: a mutual exclusion toolkit. SIGCSE Bulletin 28, 1, 170-174

Analysis of algorithms: Chavey, Darrah (1996) Songs and the analysis of algorithms. SIGCSE Bulletin 28, 1, 4-8

Recursion: Denman (1996) Derivation of recursive algorithms for CS2. SIGCSE Bulletin 28, 1, 9-13

Database application development: Dietrich, Suzanne W. and Urban, Susan D. (1996) Database theory in practice: learning from cooperative group projects. SIGCSE Bulletin 28, 1, 112-116

Parallel algorithms: Elbogen, Bruce S. (1996) Parallel and distributed algorithms laboratory assignments in Joyce/Linda. SIGCSE Bulletin 28, 1, 14-18

Analysis of algorithms: Ginat, David (1996) Efficiency of algorithms for programming beginners. SIGCSE Bulletin 28, 1, 256-260

Object-oriented patterns: Goldfedder, B. and Rising, L. (1996) A training experience with patterns. Communications of the ACM, 39, 10, 60-64

Inspection of formal specification: Hilburn, Thomas B. (1996) Inspections of formal specifications. SIGCSE Bulletin 28, 1, 150-155

Analysis of algorithms: Krone, Joan (1996) Using symbolic computation for teaching data structures and algorithm analysis. SIGCSE Bulletin 28, 4, 19-24, 32

Shifting between programming paradigms: Leska, Chuck; Barr, John; King, L.A. Smith (1996) Multiple paradigms in CS1. SIGCSE Bulletin 28, 1, 343-347

Remote file access through event-driven network implementations: McDonald, Chris (1996) User-level distributed file systems projects. SIGCSE Bulletin 28, 1, 333-337

Inheritance and inclusion: Reek, Kenneth A. (1996) Teaching inheritance versus inclusion to first year computer science students. SIGCSE Bulletin 28, 1, 24-26

Data hiding and modularization: Resler, Dan (1996) The prisoner’s dilemma tournament revisited. SIGCSE Bulletin 28, 2, 31-36

Object-oriented design: Rosson, Mary Beth and Carroll, John M. (1996) Scaffolded examples for learning object-oriented design. Communications of the ACM, 39, 4, 46-47

Ethics: Schulze, Kay G. and Grodzinsky, Frances G. (1996) Teaching ethical issues in computer science: what worked and what didn’t. SIGCSE Bulletin 28, 1, 98-101

Parallel computing: Smith, Harry F.; Plusnick, Patrick; Sarojak, Mark; Seitz, William (1996) Image processing as an exemplar of parallelism applied to graphics. SIGCSE Bulletin 28, 1, 363-367

Formal methods: Sobel, Ann E. Kelley (1996) Experience integrating a formal method into a software engineering course. SIGCSE Bulletin 28, 1, 271-274

Real-life software engineering projects: Song, Ki-Sang (1996) Teaching software engineering through real-life projects to bridge school and industry. SIGCSE Bulletin 28, 4, 59-64
