Running Head: Web-Based Self and Peer Assessment
Web-Based Self and Peer Assessment
Genevieve Gauthier
Department of Educational and Counselling Psychology
McGill University, Montreal
August 2004

A thesis submitted to McGill University in partial fulfillment of the requirements for the degree of Master of Arts in Educational Psychology

© Genevieve Gauthier, 2004
ABSTRACT

This study aims at measuring the impact of web-based self and peer assessment activities in a university-level physics course. These assessment activities were integrated into a first-year experimental physics course. Fifty-seven students participated in the asynchronous self and peer assessment activities. Dependent variables included measures of students' monitoring abilities, the use of peer assessment to reinforce self assessment, and the degree to which students endorsed the activities. Due to missing data, the repeated-measures design had to be replaced by descriptive statistics. The data collected do not confirm a significant decrease in the number of errors made by students, but they suggest a decrease in specific types of errors. The study suggests that self and peer assessments, even when they are similar in form, complement each other and provide users with useful information. However, students' endorsement of the activity was not conclusive, and key changes need to be discussed to improve the activity.
RÉSUMÉ

[Translated from the French.] This study aims to measure the impact of a self and peer assessment activity in a first-year university physics class. This web-based formative assessment activity was integrated into the regular teaching of a first-year experimental physics course. Fifty-seven students participated in the asynchronous web-based self and peer assessment exercises. The dependent variables include several measures: students' revision abilities, the usefulness and complementarity of peer assessment, and the degree to which students endorsed the activity. Because of missing data, the repeated-measures design was replaced by descriptive statistics. The data do not allow us to confirm that the activities had a significant impact on students' number of errors and performance, but they indicate a positive effect on certain types of errors. The study also suggests that self assessment and peer assessment complement each other. However, students' endorsement of the activity is not conclusive, and certain changes should be considered to improve it.
ACKNOWLEDGMENTS

Thank you to Susanne P. Lajoie and Laura Winer, my co-supervisors, for believing in me and supporting me along the way. Special thanks to Sonia for support, revisions and comments. Thanks to Anne and my family, who probably never thought I would finish one day. Thank you to Tom, Andrew, Lucy and other colleagues from the research lab for their support and opinions. I wish to acknowledge and thank Mrs. Edith Engelberg and Professor Michael Hilke, fascinating and passionate educators who opened their door for exchanges and collaboration. The present research was made possible through funding from the Social Sciences and Humanities Research Council of Canada (SSHRC).
TABLE OF CONTENTS

ABSTRACT
RÉSUMÉ
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

CHAPTER 1: INTRODUCTION
    Original Contribution to Knowledge

CHAPTER 2: REVIEW OF LITERATURE
    Constructivist Views of Learning
        Role of Metacognition and Self-Regulation
        Group Dynamics in the Classroom; Nurturing a Community of Practice
        Acknowledging the Role of Assessment
    Self and Peer Assessment
        Self Assessment in Higher Education
        Peer Assessment in Higher Education
        Self, Peer and Co-Assessment
        Integrating Self and Peer Assessment to Improve Learning
    From Theory to Practice: Building a Bi-Directional Bridge between Practitioners and Researchers
        Challenging Research Settings
        Need for Applied Research to Build a Bi-Directional Bridge
        Technology as an Enabling Medium
    Description of Study
        Nature of the Problem Space
        Support of Interactive Patterns
        Process of Creating a Shared Social Context
        Application of Concepts in Current Study
        Research Questions

CHAPTER 3: METHODOLOGY
    Participants
    Procedure
    Design
        Grouping
    Materials
        Design Process of Assessment Forms
        Assumptions about Errors
    Environments
        Online Course Space on WebCT
        Task on WebCT
        Online Communication, Web Course Usage and Statistics
    Data Analysis
        Participation and Task Completion
        Self and Peer Assessment Forms
        Grades and Comments on Lab Reports
        Interviews and Post Activity Questionnaire

CHAPTER 4: RESULTS
    Participation
    Measures Related to the First Hypothesis
        Number of Errors
        Types of Errors
        Grades
        Grades' Estimation by Students
    Measures Related to the Second Hypothesis
        Types of Errors in Self vs. Peer
        Type of Error Recognition in Self vs. Peer
        Participation in Self vs. Peer
        Students' Preference for Self or Peer Assessment
    Measures Related to the Third Hypothesis
        Task Completion and Participation
        Web Course Usage and Statistics
        Students' Exchanges
            Online Communication Amongst Teams
            Endorsement of the Activities
            Group Exchanges and Responsibilities

CHAPTER 5: DISCUSSION
    Error Monitoring, Gauging Performance and Improvement of Lab Reports
        Number of Errors
        Types of Errors
        Improvement of Lab Reports
        Estimation of Performance
        Summary
    Relationship Between Self and Peer Assessment
        Number of Errors in Self vs. Peer
        Types of Errors in Self vs. Peer
        Recognition of Errors in Self vs. Peer
        Participation Rate in Self and Peer Assessment
        Students' Preference
        Summary
    Community of Practice
        Students' Endorsement of the Activities
            Participation and Task Completion
            Students' Opinion about Repeating the Activities
        Collaboration
            Task Distribution and Group Dynamics
            Use of WebCT Course and Communication amongst Students
        Summary
    Limitations of the Study
        Participation
        Timing
        Task Requirements
        Technology
    Educational Implications and Future Directions
        Active Use of Assessment Criteria
        Monitoring Comments given by Peers
        Dynamic Self and Peer Assessment Form
    Conclusion
        Future Studies

REFERENCES
LIST OF TABLES

Table 1. Previous Experience using WebCT
Table 2. Semi-structured Interview Protocol
Table 3. Self and Peer Assessment Criteria
Table 4. Assumptions about Errors
Table 5. Overall Participation
Table 6. Number of Errors Identified in Lab Reports 10 and 11
Table 7. Types of Errors in Lab 10 and Lab 11
Table 8. Grades for Self, Peer and Instructor's Grading
Table 9. Number of Errors in Self and Peer Assessment
Table 10. Types of Errors in Self and Peer Assessment
Table 11. Participation Rate in Self and Peer
Table 12. Task Completion for Lab 10 and 11
Table 13. Number and Types of Messages per Team
LIST OF FIGURES

Figure 1. WebCT Course Homepage with Online Course Components
Figure 2. Average of Types of Errors per Assessment Form
Figure 3. Average of Types of Errors in Self and Peer Assessment
Figure 4. Average of Errors Made and Not Identified vs. Self Identified in Lab 10
Figure 5. Average of Errors Made and Not Identified vs. Self Identified in Lab 11
LIST OF APPENDICES

Appendix A. Consent Form and Ethics Form
Appendix B. Instructions and Timeline
Appendix C. Post Activity Questionnaire
Appendix D. Details on WebCT Course Space Components
Appendix E. Self and Peer Assessment Forms
Appendix F. Summary of Interviews and Post Activity Questionnaire
CHAPTER 1
INTRODUCTION

In the last decade, changes in primary and secondary science education have been led by the educational reform movement (American Association for the Advancement of Science, 1997; National Science Foundation, 1996). Similar trends in post-secondary education have been encouraged by government institutions to emphasize higher-order thinking, collaboration and the use of technology (Council of Ministers of Education, 1997; Halpern, 1998; National Academy of Sciences - National Research Council Center for Science, Mathematics and Engineering Education, 1997). A lifelong learning paradigm is promoted in both high school and higher education. This paradigm supports higher-order thinking that consists of reflection and self-monitored activity that is domain specific (Halpern, 1998). This shift in teaching needs to reflect advances in learning theories (Brown A. & Campione, 1996). Learning is now seen as an interactive process where learners have to construct their understanding instead of simply memorizing definitions given to them. Considering the significant influence that assessment has over students' learning behaviour (Boud, Cohen, & Sampson, 1999; McDowell, 1995), any attempt to implement constructivist learning and collaboration in the classroom needs to address assessment procedures in order to send a coherent message to students. Consequently, any changes in teaching need to integrate coherent assessment procedures to be successful (Shepard, 2000).
Assessment needs to be carefully integrated into the learning process. Considering the impact of assessment in our educational system, new assessment methods cannot simply be parachuted into classrooms: they need to be introduced progressively (Anderson, Reder, & Simon, 1998). Rather than imposing new assessment methods, researchers need to go into classrooms to observe the progressive implementation of new methods. Going into the classroom will give researchers opportunities to test theories in real contexts as well as help to build a bi-directional bridge with practitioners.

One way to align assessment with the learning process is to involve students in their own assessment. This study aims to measure the impact of self and peer assessments in the classroom. Self-assessment involves learners in making judgments and taking more responsibility for their own learning (Dochy, Segers, & Sluijsmans, 1999a). Peer assessment can give learners an opportunity to dynamically practice applying criteria, giving and receiving feedback, and comparing their work with others'. Self and peer assessments provide a framework for clearer goals and expectations. Combined with shared self-assessment, peer assessment can be a concrete implementation of socio-constructivist learning that can enhance students' learning by providing scaffolding and feedback to others. This study will observe whether the implementation of self and peer assessment, in the context of an introductory physics laboratory course, can improve students' ability to monitor errors, improve their work, and create a sense of community where students can exchange ideas.
Original Contribution to Knowledge

This thesis investigates the impact of web-based self and peer assessment in a university science classroom. Self and peer assessment are not new tools in the classroom, but few studies have used technology to foster reflection through asynchronous assessments and discussions. Involving students in the assessment process promotes higher-order thinking and gives a purpose for collaboration.
CHAPTER 2
REVIEW OF LITERATURE

Current views of learning emphasize the active role learners need to play in their learning, as opposed to their passive assimilation of content. To situate collaborative learning in the context of shared self and peer assessment, the literature on socio-constructivist learning is examined. Afterwards, a discussion briefly reviews concepts of group learning and the role of assessment in bringing change to the classroom. Self and peer assessment are not new classroom practices, and reviewing previous findings can help orient the implementation of these practices for the design of the current study.

Constructivist Views of Learning
Role of Metacognition and Self-Regulation

Learning, as suggested by research in the cognitive sciences, "is an active process of mental construction and sense making" and it "involves self-monitoring and awareness about when and how to use skills" (Shepard, 2000). The cognitivist view emphasizes the need for students to play an active role in the learning process and the importance of metacognitive abilities. These metacognitive abilities refer to thinking about one's own thinking processes while performing, rather than focusing solely on performance outcomes (Greeno, 1998). Instead of focusing on what behaviour can be measured and observed, cognitivists try to include what is going on in the head of the learner. Some researchers in the field try to uncover how experts' behaviour and thinking processes differ from those of novices in a given field (Anderson et al., 1998). They demonstrated that experts differ in their use of strategies and self-regulating behaviours. Anderson et al. (1998) point out some of the differences between novices and experts even in skills that can be automated with experience, like typing or driving. The difference between an average person and a Formula 1 driver is that the average person has stopped learning and achieved a level of automaticity, whereas the Formula 1 driver is constantly looking at how to improve their reflexes and skills. Expert performers have the ability to avoid the arrested development associated with automaticity and continue to improve their knowledge and skills.

Metacognitive abilities refer to knowledge and regulation of the learner's own cognitive processes (Hacker, 1998). Metacognition includes two major components: knowledge of cognition and regulation of cognition (Schraw & Moshman, 1995). In self-regulated learning studies, attention is given to "how learners regulate their use of cognitive tactics and strategies in tasks" (Winne, 1996). Theoreticians agree that "the most effective learners are self-regulating" (Butler & Winne, 1995). Even if there is no agreement on the exact self-regulation mechanism, theoreticians all identify three core aspects to explain how learners "are metacognitively, motivationally, and behaviourally active participants in their own learning process" (Zimmerman B. J., 1986). In all definitions, students are assumed to be conscious of the potential of the self-regulating process, to make use of a "self-oriented feedback loop during learning" (Zimmerman B. J., 2000)
and to have reasons to explain their adaptive response (Zimmerman B. J. & Schunk, 2001).

In an academic context, students acquire varying forms of self-regulated learning, but some forms are more successful than others (Winne, 1997). According to some authors (Graham & Harris, 1994; Winne, 1997), successful self-regulating abilities can be taught. In his self-regulated learning model, Winne (1997) suggests that feedback, one of the important aspects of the self-monitoring mechanism, could be modelled. In Winne's model, feedback both internal and external to the learner plays multiple and multifaceted functions. Internal feedback is information produced by the learner's own monitoring processes that evaluate their performance, whereas external feedback is given to the learner by other people or events. Research confirms that "learners are more effective when they attend to externally provided feedback" (Winne, 1997). External feedback can inform the learner about domain knowledge errors, but more importantly it can enhance calibration and therefore a learner's effective engagement in tasks (reported in Winne, 1995, from Balzer et al., 1989). External feedback, whether it comes from the instructor or from classmates, adds another dimension to our understanding of the learning phenomenon. External feedback can lead to student participation in socially organized activities of learning, such as classroom discourse (Anderson, Greeno, & Simon, 2000).
The interaction amongst learners is impossible to isolate and must be taken into account if one hopes to understand classroom dynamics or individual learning (Brown A., 1992). In his analogy of learning in the classroom, Anderson (1998) compares learning dynamics in the classroom to learning in team sports like basketball. This analogy does not really take into consideration the individualistic nature of assessment in the classroom (Anderson et al., 1998). In team sports, learning needs to take place in group situations since performance is ultimately evaluated in a group situation. This is not the case for assessment in school settings. A more salient analogy would be comparing learning in the classroom with individual sports like running, where the evaluation of performance is individual yet training still takes place amongst others. For example, runners train by themselves, but they also realise the need to train in groups. Group training provides a social context that enhances motivation and opportunities to challenge and be challenged by others' performance. A runner training with others will be pushed to a faster pace than he or she would normally reach alone. Even though individuals in classroom situations are generally assessed on their individual performance, they can benefit from and be challenged by their peers. In training situations, peer feedback and questioning are more frequent than coach feedback, and both play important roles. Learning situations in the classroom are often treated as if the only interactions happening are the ones between the teacher and one student; the student-student relationships are often ignored for many of our learning objectives. Peer interaction is an important element of the classroom that should be acknowledged. It should also be better understood within the context of the
educational system so that it can be used to leverage the learning dynamics in the classroom.

Group Dynamics in the Classroom; Nurturing a Community of Practice

Structuring and managing group work in the classroom is a subject of interest in educational research that is found in studies of collaboration, cooperation and communities of practice. The term collaborative learning is used in this paper because it incorporates valuable aspects of cooperative learning and insightful ideas from the community of practice literature, but keeps an individual learning perspective within the group dynamics. Yet the differences between cooperative and collaborative learning are subtle. The characteristics attributed to each evolve with time from one author to the other (e.g., see Johnson D. W. & Johnson, 1994; Johnson D. W., Johnson, & Smith, 1998). By comparing and contrasting these overlapping concepts, the goal is to better delineate the key components of group dynamics in the classroom.

Collaborative learning involves instructional strategies in which learners work together to complete a task (Hassard, 1992). The completion of a task, or the aspect of a purposeful end product, is an essential characteristic of collaborative learning. It differs from cooperative learning, which is defined primarily as a classroom technique or way of structuring activities in which learners work in groups to perform these activities (Slavin, 1980). The differences between the two terms remain subtle: the characteristics proposed by pioneers of cooperative learning like Johnson and Johnson continue to evolve, and authors do not agree on the exact differences between the two approaches. Cooperative learning proposes
group arrangements or roles for learners independent of the content or the task (Panitz, 1997), and it tends to reject the use of competition amongst groups (Bruffee, 1995).

A community of learners, a phrase used by Lave and Wenger (1991), refers to group dynamics where newcomers or learners learn and integrate themselves into a community of practice. The idea of a community of practice pushes the idea of group learning that develops dynamically over time, where learners share a practice and develop their identity as part of the group. It not only recognises the importance of collaboration for motivation and the development of social skills, but places social norms and group interaction at the root of knowledge and meaning construction (Lewis, 2001). This learning model relies on the integration of a few novice learners into a larger community of more advanced or expert-like learners. It might not be appropriate for all types of learning going on in the classroom because of the difference in settings, but some elements could be imported into the classroom to teach students about professional skills and attitudes.

The notion of responsibility for learning is key; Panitz (1997) places collaborative and cooperative learning on a continuum. At one end, collaborative learning emphasizes students' responsibility for their own learning; at the other end, cooperative learning emphasizes teacher control of the activity and responsibility for the learning taking place. Collaborative learning tends to favour social negotiation of knowledge (Bruffee, 1995). In the case of the community of learners' model, a structure is imposed by the group onto the newcomers. This
situation is the reverse of the classroom, where one expert-like teacher has the responsibility for structuring a large number of novice learners. In the design of a collaborative activity, the instructor has to make sure that learners not only understand the structure and rules, but also agree with them. Ideally, authors suggest that structure, rules and assessment should be discussed and negotiated to attain a better endorsement from students (Brown S. & Knight, 1994). Structure is linked to the learners' recognition of work, which tends to be based on the group's performance and not on individual performance within the group, as in cooperative learning (Slavin, 1980). Without individual accountability within the group, the basis for the assessment system in our schools is challenged. Collaboration involves a shift in responsibility for learning that has implications for the type of assessment going on in the classroom.
Acknowledging the Role of Assessment

The idea that assessment heavily influences learners' behaviour and learning in the classroom is not new (Boud et al., 1999; McDowell, 1995); the typical student's questions have often been heard: "Will this be on the exam?" or "How much is it worth?" The nature of the assessment tasks influences students' approach to learning (Beckwith, 1991). Modifying the assessment procedure or reward structure can certainly facilitate or enable change in the classroom, but Shepard (2000) reminds us that this change needs to be compatible with the other purposes of assessment in our current education system. Shepard (2000) argues that changes in assessment traditions need to be incremental and "consistent with and support social-constructivist pedagogy". On the other hand, administrative authorities need to be reassured that the implementation of a new assessment system will be compatible with the resources and needs of the system as a whole. Pellegrino (2000) suggests that assessment procedures need to be refined in order to facilitate collaborative learning. These procedures also need to be practically implementable in our school systems (Pellegrino J. W. et al., 2000).

As mentioned by the National Research Council (2001), assessment in our educational system serves multiple purposes: assisting learning, measuring individual achievement, and evaluating programs. Pellegrino and Chudowsky (2003) question whether assessments can simultaneously support and measure students' learning. They discuss the challenges of moving from the current conservative testing culture (Birenbaum & Dochy, 1996; Gipps, 1999), based on the psychometric-quantitative paradigm, to a more constructivist approach based on more contextual-qualitative paradigms. Unfortunately, attempts to successfully implement these kinds of interrelated changes in classroom environments have not been conclusive (Pellegrino J. W. et al., 2000). For example, in California and Connecticut there have been attempts to change the classroom dynamic by imposing new assessment tasks and techniques, but teachers did not respond as expected. The assumption was that if teachers teach to the test, it might be possible to change the way they teach by changing the tests. Many reasons might explain this failure. However, high-stakes testing leads teachers to teach to the test even if the test is not based on the principle of promoting learners' understanding. Considering the consequences involved
in the measurement aspect of assessment, Sadler (1998) suggests that formative assessment, which he defines as "specifically intended to provide feedback on performance and accelerate learning", can be implemented gradually without involving a complete, immediate readjustment of the educational grading system. Gielen, Dochy, and Dierick (2003) advocate using this transition for developing quality criteria for new modes of assessment.

Self and Peer Assessment

Self and peer assessment are not new in Higher Education (Falchikov & Goldfinch, 2000); they have been gaining in popularity in the last ten years as the system has re-evaluated the relationship between learning and assessment. Student-centred activities can be used as a tool for learning when they include opportunities for reflection, feedback and integration of learning and assessment (Dochy F. J. R. C. & McDowell, L., 1997). Self and peer assessment activities can be combined to scaffold self-regulation and monitoring through feedback. These innovative assessment methods can provide a purpose for student exchanges and collaboration, leading to a better understanding of the instructor's goals and expectations. The definitions of self, peer and co-assessment will be examined in the Higher Education context. Involving students in the assessment process can promote self-regulated learning and collaboration without jeopardizing existing assessment structures in the system. To set a baseline for comparison with more conventional assessment methods, many studies of self and peer assessment have compared grades given by students to those given by instructors.
Self Assessment in Higher Education

Self-assessment is defined as a process during which learners have to make a judgement about the amount, level, value or worth of their own performance (Topping, 2003). Self assessment can be implemented in different ways (formative, summative) and take different forms (grading, rating, commenting), but it is generally used for formative purposes to foster reflection in learners (Sluijsmans, Dochy, & Moerkerke, 1998).

In their meta-analysis of self-assessment studies, Falchikov and Boud (1989) reported on 57 quantitative studies from 1933 to 1988 focussing on the comparison of self and instructor grades. Studies varied in both their reporting and their methodologies. To account for the various methodologies, the researchers included the quality of design as a factor, with the main variable being grade correspondence between instructor and students, along with other factors like the course level and the subject area. They reported a closer correspondence in grades between instructors and students in better quality designs, in higher course levels and in the broader area of science (engineering, math, physics, chemistry, etc.). They also suggest that weaker students have a tendency to overrate themselves whereas stronger students often underrate themselves (Falchikov & Boud, 1989; Lajoie, Lavigne, Munsie, & Wilkie, 1998). Results about the accuracy of self assessment are generally optimistic but not conclusive. Only one study found a discrepancy between the students' and teacher's grades in science, explained by referring to the omnipresent testing culture in the science areas, which impedes learners' understanding of the assessment process (Zoller & Ben-Chaim, 1998). Using an electronic interactive advice system for self-assessment, Gentle (1994) demonstrated that students could assess themselves to within five percentage points of the instructor's grade on long-term projects. Another study, by Longhurst and Norton (1997), reported a positive correlation between students' and tutor's grades (r = .43). Ross (1998) reviewed self assessment in the area of language learning and suggests that the discrepancy is reduced by giving the learner more experience with the self assessment process.

Peer Assessment in Higher Education

Peer assessment involves a judgement about the amount, level, value or worth of somebody else's performance, usually that of a same-status classmate or co-worker (Falchikov & Boud, 1989; Topping & Ehly, 1998). Both external feedback, from an instructor or peers, and development over time improve students' ability to assess themselves (Birenbaum & Dochy, 1996; Falchikov & Boud, 1989). Results for peer assessment regarding reliability are more consistent: in their meta-analyses, Falchikov and Goldfinch (2000) and Topping (1998) reported that peer assessment in Higher Education produces reliable grades compared to those given by instructors. These results are promising but should not be interpreted as a means to relieve instructors of their grading responsibilities. Reliability is not outstanding in all studies; students' grades vary in non-systematic ways. For example, in Orsmond (1996) and in Topping and Ehly (1998), 18 of the 31 studies that were reviewed questioned the reliability of the grades given by students.
Contradictory findings are often explained by authors through confounding variables such as the course level, whether a product or a performance is being evaluated, the level of detail of the assessment, the clarity and understanding of criteria, and the training and support provided to the assessor. Reliability is better in advanced-level courses and when academic products such as written assignments, rather than performance-based assessments (i.e., oral presentations), are being evaluated. Other variables increasing reliability are providing global rather than detailed assessments, discussing and negotiating criteria, and supporting the process with checklists, feedback and training (Topping, 2003).
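The grade correspondence reported throughout these studies is usually summarized with a Pearson product-moment correlation between student-assigned and instructor-assigned grades. The formula below is standard statistical background rather than a computation taken from this thesis. For student grades x_1, ..., x_n and instructor grades y_1, ..., y_n,

    r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}

where \bar{x} and \bar{y} are the mean student and instructor grades. On this scale, the r = .43 reported by Longhurst and Norton (1997) corresponds to a moderate positive agreement: students who receive higher tutor grades tend, on average, to rate themselves higher as well.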
Self, Peer and Co-Assessment

Closer to traditional educational practice, co-assessment involves a combination of self, peer and instructor assessment. The participation of students in the assessment process allows them to assess themselves while instructors maintain control over the final assessment (Dochy, Segers, & Sluijsmans, 1999b). Synonyms for co-assessment are collaborative assessment and cooperative assessment. This type of assessment can take different forms, and the relative weight of summative versus formative components varies. Studies of co-assessment generally report positive experiences. For example, Stefani (1992) carried out a practical experiment in a first-year undergraduate biochemistry laboratory involving self, peer and co-assessment. Students had to define the grading schedule for their lab report. Results show that students have realistic perceptions of their abilities and are able to make rational judgments about their peers' achievements. Other studies in computer
science (Rushton, Ramsey, & Rada, 1993) and physiology (Orsmond, 1996) report positive impacts and good reliability of students' judgement.

Integrating Self and Peer Assessment to Improve Learning

Self and peer assessment are not strictly assessment procedures; they can also be part of the instructional process in which learning skills are developed. Self and peer assessment can be used as instructional strategies to promote collaborative learning activities. The purposes of the activities meet both individual learning objectives and group objectives. Individuals are accountable for specific tasks, and the group depends on each member's work to succeed. When groups are paired up in teams for self and peer assessment, they depend on each other's feedback; this corresponds to the criterion of positive interdependence identified by Slavin (1980) to foster the development of successful group dynamics. The argument presented by Edward (2003) in his use of self and peer assessment in engineering is that they "are not only an aid to learning the topic being assessed but that the skills of evaluation are themselves vital". Studies show that the exercise in itself is a valuable learning activity (Falchikov & Boud, 1989) that can improve the quality of student learning (Prosser & Trigwell, 1999). These findings are supported by earlier descriptions of learning whereby students' learning processes and development of metacognitive abilities must be coordinated. Sluijsmans, Dochy and Moerkerke (1998) concluded that peer assessment could inform self assessment. As research in self-regulated learning suggests, the cyclical or recursive process of trial and error in which self-
monitoring occurs could be accelerated by observing others (Winne, 1997). Peer assessment plays a complementary role in improving students' ability to do better self-assessment, to question their work, and to help them improve their performance. Collaboration and monitoring by peers often lead to questioning, which has been shown to improve learners' understanding (Karabenick, 1996). In the context of writing and reading comprehension, learners tend to fail to detect misunderstandings and to ignore incorrect information, and guided writing from other learners can improve self-editing procedures (Brown A. & Campione, 1990). Instructional scaffolding has also been shown to improve students' loop of diagnosis and self evaluation (Bereiter & Scardamalia, 1987). Peer interaction and monitoring force students to externalize their thinking, and in the process of collaboration peers often tend to identify errors and help correct them (Resnick, Salmon, & Zeitz, 1993).

Even if the grading criteria for students' self and peer assessment are not exactly the same as the ones used by the instructor, these activities bring students to reflect on and monitor their own learning, and they provide a better understanding of the standards expected of them (Ballantyne, Hughes, & Mylonas, 2002). Setting up these activities involves an overt presentation of the assessment criteria, and it implies a dialogue or negotiation between the students and the instructor (Brown S. & Knight, 1994). Stating expectations and clear procedures for self and peer assessment demystifies the assessment process (Ballantyne et al., 2002) and gives students more responsibility for their learning (Divaharan & Atputhasamy, 2002). This sharing of responsibility
emphasizes learners' participation in and ownership of their learning. Increasing learners' personal accountability for their learning has positive impacts on their motivation and self-efficacy (Schunk, 2001). In a meta-analysis of group learning in undergraduate science courses, Springer et al. (1999) found that students demonstrated greater academic achievement and more favourable attitudes towards learning, and that they persisted more in their programs than students in a traditionally taught environment. Collaboration amongst students brings some of the focus of learning back onto students' thinking processes instead of being solely on the teacher's presentation (Smith & Macgregor, 1992). Collaboration challenges students to discuss and explain their thoughts (Slavin, 1995), which usually brings them forward in their understanding of the material. Peer assessment on its own has been successfully used to enhance participation in collaborative learning (Divaharan & Atputhasamy, 2002). Peer assessment imposes a time frame and a structure in which students have to give each other feedback. Students tend to perform better for peer assessment (Eschenbach, 2001), and it has been shown to improve their social skills and self-esteem (Topping, 1998). Peer assessment can improve students' ability to evaluate themselves since it helps them to
become more critical of others' and their own work (Falchikov, 1995).

From Theory to Practice: Building a Bi-Directional Bridge between Practitioners and Researchers

In the context of a design experiment (Brown A., 1992; Greeno, 1998), the goal is to measure the repercussions of implementing socio-constructivist practice in the classroom. Clear findings from this type of research are lacking because of the "combined challenge of complexity, statistical significance testing and the ever present factor of change" (Pellegrino J. W. et al., 2000). The challenge of this type of research is to incorporate and test learning theories and concepts in a real classroom context, which is ultimately where they will have to be effective. It is important to situate research in the classroom since it helps validate the practicability of positive and negative findings for instructors. This type of research can be regarded as a bi-directional bridge that can influence classroom practices and improve our understanding of the learning phenomenon, which could be the initial step leading to the deeper level of collaboration described by Brown and Campione (1996) in their fostering Communities of Learners (COL) model of classroom environments.

Challenging Research Settings

Many components of classroom settings make the classic scientific hypothesis testing model challenging (Cobb, Confrey, Disessa, Lehrer, & Schauble, 2003). The number of variables taken into consideration can be overwhelming. Some of these variables, like "teacher" or "group", represent potential confounding variables that cannot be controlled. Two teachers are inherently different in their teaching styles and in the impact they have on a group of students, even when teaching the same material in supposedly the same way. Group dynamics are also unique; some groups are harder than others in terms of discipline or learning potential. The complexity of classroom settings does not exclude research from being performed in these settings, but it certainly
influences the type of methodologies used under these conditions. Brown (1992) stresses the need for new and complex methodologies to capture these multifaceted environments. Her design experiment approach to research goes beyond the quantitative versus qualitative debate; it is a mixture of methodologies that are dictated not by her beliefs but by which methodologies best fit a given situation. In this case-based approach, the context and the research interest/hypothesis have to be examined together to come up with a realistic design.

Need for Applied Research to Build a Bi-Directional Bridge

Even if the concepts of cooperation and collaboration have been around since the 1920's (Slavin, 1980) and the benefits of group learning have been repeatedly conveyed to instructors, their implementation in the school system at all levels is not yet widespread. There are probably many explanations. As Anderson et al. (1998) report, the National Research Council (NRC), in its review of collaborative learning research, has found numerous studies reporting on the benefits associated with it but relatively few studies that address the potential detrimental effects and difficulties associated with implementation. Anderson et al. (2000) attribute this lack of implementation of group learning in classrooms to the costs outweighing the benefits when implementation is not carefully planned. This brings up the issue that, in the context of educational research, the same scientific rigor should be given to the dissemination of problematic implementations as to successful ones. Errors or unsuccessful implementations
can guide practice and help teachers to set realistic expectations for the implementation of new approaches. Brown (1992) stresses the synergy of the classroom environment and the challenges of conducting research while teaching and dealing with all the related issues, but she urges researchers to move beyond their well-controlled environments. The classroom has multiple inputs (curriculum requirements, students, classroom ethos), and some outputs are required at the end of the day or the week (assessment of learning, accountability). Learning in these social settings is inherently different from what researchers can produce in one-on-one tutoring sessions. Without playing the double role of teacher and researcher, researchers can play a key role in classrooms; they can build a bi-directional bridge between the school system and the research institutions. In one direction, data collection in real classroom settings can help confirm or inform more fundamental and applied learning theories; in the other direction, researchers can be agents of change by explaining and bringing new ideas and theories into the classroom. The exchange can be beneficial for both: teachers can learn about the different developments and changes, while researchers can gain a better understanding of learning in classroom settings.

Technology as an Enabling Medium

Technology can help bridge the gap between researchers and practitioners as it facilitates access to information and communication within and outside of the classroom. Outside of the classroom, the use of email, discussion boards, and websites can facilitate communication amongst researchers and
practitioners. In the classroom, the use of technologies opens up new possibilities for design and data collection for both practitioners and researchers. Digital assignments, logs or journals can be shared and analysed without delay, and these data can be collected without class intrusion. These data about the classroom can be accessed in more breadth or depth since researchers are not limited by scheduling or the cost of collecting data. Flecknoe (2002) suggests that teachers could also gain from having access to more data about their students. Over time they might be able to better assist students in their learning.

The use of information technologies in face-to-face classrooms has been seen as a way of "pushing back the walls of the classroom", as it allows discussions and interaction to extend outside of the realm of the classroom (Bender, 2003). Technology is a tool that enables us to structure interaction in space and time in ways that would not be possible relying only on paper or in-class interaction. For example, peer assessment is a time-consuming activity, and students need to reflect before giving their feedback. The use of online mediated communication to structure the process allows students to share their work, complete the peer evaluation and communicate feedback to their peers. Computer-based peer assessment is slowly emerging and various forms have been described. However, little outcome data are available to help orient future research and practices (Topping, 1998).

Description of Study

In the context of a first year physics course where the primary goals are to provide students with the opportunity to develop experimental skills and to teach
them the preparation of a clear and concise report, assessment activities were implemented to provide continuous monitoring, to improve the quality of the lab reports, and to increase collaboration amongst students. Web-based self and peer assessment were implemented to address practical aspects related to the exchange of work and group communication. This research is part of a larger scale program of research conducted by Winer; it entails studying collaboration and the use of ICT in Higher Education classrooms (Winer, 2002; Winer & Cooperstock, 2002). Prior to this research, an exploratory design in the same course during the winter 2002 semester had suggested pursuing collaboration and online discussion about lab reports. To build on these findings, the researchers met with the instructor and the senior demonstrator. Senior demonstrators in this department are the people supervising and teaching the laboratory section of the course in collaboration with the main course instructor. In a course context where students complete one experiment and attend one lecture per week, a collaborative activity was designed in consultation with the two instructors. The design of this study addressed constructivist learning principles and the three main features identified by Palincsar and Herrenkohl (2002) for the design of collaborative learning contexts: the nature of the problem space, the support of interactive patterns, and the process of creating a shared social context.

Nature of the Problem Space

In first year physics the problem space is somewhat challenging. The main goal of experimentation at this level is not geared towards a deep understanding
of the phenomena, but rather to familiarize students with the equipment and the rudiments of how to write a lab report. As reported by Vázquez-Abad, Winer, and Derome (1997), first year physics students attach a lot of importance to laboratory experimentation, as the process itself can foster the team nature of scientific work. Even if students do not have the knowledge to understand the concepts behind the lab experiments, when experimentation is not part of the program the rate of attrition increases. If deep understanding of the experimentation is not the primary goal behind the writing of the lab report, it can be used as an opportunity to emphasize the importance of communication and writing skills in the context of science. Emphasis on collaborative skills in the process of experimentation and report writing in science courses helps promote and better represent the "collaborative nature of scientific and technological work" (Springer et al., 1999). This context can also be used as an opportunity to work on students' writing skills. As Eschenbach (2001) argues, writing skills should be taught in the context of science rather than in separate or stand-alone courses in communication.

Support of Interactive Patterns

Monitoring and communicating success is an important goal of
assessment (Lajoie et al., 1998). What and how to monitor learning or progress is task and content dependent. Astolfi (1999) advocates that errors in most contexts can be a great tool for learning for both students and instructors. He suggests that students can benefit from observing and discussing errors and why they were made. He proposes that errors are revealing of misunderstandings and can even show progress in students' learning process. Monitoring students' progress, or the evolution of errors, can give useful feedback to the instructor for improving teaching as well as to students for improving performance (Lajoie et al., 1998). In his typology of errors, Astolfi identifies six different reasons responsible for students' errors. They are errors due to: misunderstanding of instructions, inconsistent work habits, overload of intellectual operations, a disorganized approach, lack of transfer from other disciplines, and inherent complexity of content. The assessment activities were designed to address two of those: students' misunderstanding of instructions and students' work habits. By providing students with a clear assessment grid, this study aimed at improving their understanding of the instructions and assessment expectations.
The activities were also designed to improve learning strategies and actively involve students in the revision of their own work and their peers' work.

Process of Creating a Shared Social Context

Students need to have clear and meaningful goals for interactions; they could not simply be told to share and discuss their lab reports. Collaborative learning environments give students goals, purpose and structure for interactions. Students need to be active and to do something that requires collaboration. Introducing shared self and peer assessment of lab reports gives a structure to students' interaction. Students' ability to revise and evaluate themselves is crucial (Hayes & Flower, 1986), and they are expected to learn from each other's errors (Astolfi, 1999). To develop a common ground on why to build knowledge, referred to as an "intersubjective attitude" in Palincsar and Herrenkohl (2002), the study was designed to ensure that peer assessment would be reciprocal amongst pairs. The self assessment was aimed at structuring intra-team collaboration, whereas the peer assessment aimed at structuring inter-team collaboration. Comments and exchanges related to this reciprocal assessment would also need to be private, to promote trust and responsibility (Divaharan & Atputhasamy, 2002).

Application of Learning Principles in Current Study

Assessment forms were designed in collaboration with the senior demonstrator. The forms are a structured list of all the assessment criteria, in both paper and web versions (see Appendix E for more details). Having to fill in the form forces students to assimilate or better understand the criteria, since they have to revise and check whether or not each criterion applies to their lab and whether they have met it. Most criteria in the first part of the form have a clear answer and represent students' common errors, but as they reach the last questions students need to take more initiative and write substantive comments.
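The actual paper and web forms are reproduced in Appendix E. Purely as an illustration of the structure just described (clear-cut checklist items followed by open-ended comments), a minimal Python sketch follows; all class and field names here are invented for this sketch and are not taken from the study, whose web version was an online WebCT form rather than code.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class CriterionResponse:
        criterion: str              # e.g. a common-error checklist item
        applies: bool               # does this criterion apply to this lab?
        met: Optional[bool] = None  # was it met? None when it does not apply
        comment: str = ""           # free-text remark; used mostly for the
                                    # open-ended items at the end of the form

    @dataclass
    class AssessmentForm:
        assessor: str               # student filling in the form
        target: str                 # "self" or the identifier of the peer team
        lab_report: int             # lab report number, e.g. 10 or 11
        responses: List[CriterionResponse] = field(default_factory=list)

        def errors_identified(self) -> int:
            # Count criteria that apply to this report but were judged not met.
            return sum(1 for r in self.responses if r.applies and r.met is False)

Counting the criteria that apply but are judged unmet is one simple way such a form can yield the per-report error counts examined in the research questions below.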
Adding a peer assessment gives students more practice with the assessment process, allows them to compare themselves with others, and teaches them how to give and receive constructive feedback. The immediacy of feedback, as well as the questioning that might be induced by peer comments, can accelerate students' ability to identify errors and hopefully improve the level of writing in their lab reports. The main goal in comparing students' self and peer grading with the instructor's grading in this research is not to validate the accuracy of self and peer assessment, but to see how well students can learn and improve at assessing their performance and predicting their own and their teammates' grades. The senior demonstrator's grades were used to judge students' improvement in lab reports but were not compared to students' grading since the format was different.

Research Questions

In the context of a first year physics laboratory class, this study was designed to explore the impact of implementing formative self and peer assessment activities. The research design focuses on three questions: whether web-based shared self and peer assessment activities can help students improve their learning strategies and products; whether peers can facilitate the learning of individuals; and whether or not the activities will foster a community of practice within the class that goes beyond the task requirements and brings students to discuss and exchange even after the experiment.

To measure whether or not students' learning strategies and products improve, the number of errors from the first to the last lab report will be examined. Will the number of errors decrease? Do certain types of errors decrease while others remain constant?
Do the grades of participants improve from the first to the last lab report? Do students improve at estimating their grades and the grades of their peers?
To measure whether peer assessment can facilitate self assessment, this study looks at whether students can find more errors in their peers' work. Do they find the same types of errors in others? How does this pattern of errors evolve over time? Are there errors that students identify in others' work but do not recognize making in their own? Will the participation rate in peer assessment be higher than in self assessment? Do students prefer self or peer assessment? Providing students with the opportunity to play an active role in the assessment process and creating a situation of interdependence was expected to foster a community of practice within the class that would go outside of the task requirement and bring students to discuss and exchange even after our experiment. The observations will examine whether students visit the website after the required tasks are done.
Do students continue to do self and peer assessment? Do students extend the task above requirements? How much and what type of discussion do they have online?
CHAPTER 3
METHODOLOGY
Participants
A group of undergraduate students in an introductory Physics Laboratory course at a Montreal university participated in the study. All students enrolled in the course took part in the self and peer assessment activities at least once, as these were integrated in the normal course instructional activities, but 55 of the 67 students (82%) agreed to give the researchers access to their work for the course (online and offline) and to fill in pre- and post-questionnaires (see Appendix A for consent form and ethics certificate). Participating students were mostly male (46 male, 9 female) in their first year (U1) of a university program (1 freshman, 47 U1, 6 U2 and 1 U3). This laboratory course was a prerequisite for almost all students, as 52 of the 55 students had physics as a major, honours or joint major. The first language was English for 24 students, French for 22 students and other for 9 students. However, 14 students were studying in English for the first time. Students' experience using the online course management tool WebCT varied, but most students (90%) had some experience working with it in previous courses.
Table 1
Previous Experience using WebCT

WebCT used in previous courses    Number (n=55)    Percentage
Never                             7                13%
1 or 2                            16               30%
3 or 4                            9                16%
More than 4 courses               23               41%
Students were grouped in pairs or trios for the execution and the write-up of their lab report. These lab groups were self-selected by students during the previous semester. For the assessment activities, lab groups were kept intact and paired randomly with another lab group to do the peer assessment. Course co-instructors also participated in the study. The lead instructor was a recently hired professor who was mainly responsible for class lectures. The senior demonstrator was an experienced support staff member who had been teaching and grading the laboratory experiment sections of the introductory physics course for 25 years. Both were willing to try a new instructional method to help students produce good lab reports and encourage student collaboration.
Procedure
As part of their course assignments in the winter 2003 semester, the instructor required students to participate in filling in the self and peer assessment forms. The activities were designed in collaboration with the instructor and the senior demonstrator between October 2002 and January 2003. The instructor was fully committed to the importance of increasing students' collaboration and exchange in this course, but the person ultimately in charge of the assessment of lab reports was the senior demonstrator. The senior demonstrator was highly qualified; she had been teaching first year physics laboratory courses for more than 25 years and had an excellent reputation. The assessment forms were developed with her and were approved by the instructor before their introduction to students. Students were introduced to the research study at the beginning of their second semester of a year-long introductory physics laboratory course. They had to perform and write up a lab report almost every week during this course. The activities they would participate in for the study consisted of filling in a self assessment, sharing their lab report with another team, and doing a peer assessment for four of their lab reports. The researchers visited the class on their lecture day twice prior to the beginning of the study to explain the goals and procedures. The first presentation introduced the research goals, and the consent forms and questionnaires were also distributed. During the second presentation, a week prior to the beginning of the activities, the oral and written instructions were reviewed. The basic technical information required to make use of the WebCT tools was also reviewed. Students were provided with examples and guidelines on how to give and receive feedback. The instructions also provided details about when and what was required from the students week by week for the two lab experiment classes, the Monday and the Wednesday classes (see Appendix B for Instructions and Timeline).
Before or immediately after submitting their lab report to the instructor on paper, groups had to fill in an online self-assessment on WebCT. Lab report groups also had to post an electronic version of their lab report on a private discussion space on WebCT. This step was a prerequisite for the peer assessment to take place. During the week following the submission of their lab report, the groups had to evaluate and fill in the peer assessment form on WebCT. Students could then post the results of the assessment in the private discussion space to share the information with the other team. For each lab report students had to complete three tasks: fill in the self assessment, post a copy of their lab report on WebCT, and complete the peer assessment the following week. Peer feedback and team exchanges were not done anonymously.
The activities were conducted for laboratory experiments 10, 11, 12 and 13. Prior to laboratory 10, classes were dedicated to special projects where each group was working on their own chosen topic. During the first two weeks of the activities the researchers were present in the lab classes on Mondays and Wednesdays to answer questions from students. Support was given when students had technical difficulties. The researchers also met with the senior demonstrator every week to photocopy lab reports, record grades, and for additional informal exchanges and planning. At the end of the course, a post-activity questionnaire was posted on WebCT and semi-structured interviews were conducted with students. The questionnaire and interviews addressed time requirements, group process, perceived usefulness of the activities, ways students gave and received feedback, and suggestions for future use. Interviews were conducted with students between the third week of March 2003 and the third week of April 2003. The online questionnaire was open to students from the third week of March to the end of April. The researchers contacted students from different teams that had previously expressed interest. Students were contacted by phone or email and they could choose to meet for 20 minutes with the researchers at a convenient time. The interviews took place in a lounge on the ground floor of the university's Physics building. The room was not well suited for interviews as it was adjacent to the professors' lounge, but the location was convenient for students.
The interviews were done one-on-one around a table; they were digitally recorded with a laptop using a sound recording program. At the beginning of each interview students were given the questions that would be asked, in English, but the researchers gave them the choice of proceeding in English or French. Six of the seven interviews were conducted in French and one in English. Four students were male and three were female. On average interviews lasted 22 minutes; the shortest lasted 13 minutes and the longest 34 minutes. One of the seven interviews did not record properly and one had heavy background noise. Only one student filled in the online questionnaire. Questions from the online questionnaire and semi-structured interviews were similar.
The questions aimed at finding out how students did the activities, what they liked and did not like, and what changes they would suggest for the future. The online post-questionnaire addressed similar topics (see Appendix C for a paper copy of the questionnaire).

Table 2
Semi-Structured Interview Protocol

A- Group Process
1. In your lab group, how did you split up the responsibilities?
2. Did you meet outside of class to discuss the evaluation and the feedback you needed to give the other team?
B- Preference for Self or Peer?
3. Which evaluation (self or peer) did you feel was more useful? Why? Do you find it easier to find errors in your own or in somebody else's lab report?
C- Format, Grouping and Timing
4. Would you have participated more or felt better if your comments and grading feedback had been anonymous? What about receiving anonymous feedback?
5. How did you feel about the private group setting as opposed to a whole class discussion setting?
6. Given the time constraint, would you have preferred to do the exercise before handing in your lab report to the instructor?
7. Did you like the grid or format in which you had to give feedback? Did the numerical values in each section help or constrain your feedback?
D- Suggestions for Future Use
8. Do you think this exercise should be repeated next year?
9. Would you prefer to do the evaluation on paper?
10. What changes would you suggest to make it more efficient or meaningful for students?
Design
The study was designed as a repeated-measures analysis, with data collected from a series of four lab reports. For each lab report students had to fill in a self and a peer assessment. Various measures were used to assess the impact of the self and peer assessment on students' performance. However, the participation rate for the last two reports was too low and did not provide enough data to enable a four-time repeated-measures analysis (see chapter 4 for details). Therefore the observations and analysis were limited to descriptive and non-parametric statistics on results from lab reports 10 and 11.
Grouping
Students worked in self-selected pairs or triads for the activities and the write-up of the lab reports. Groups were randomly paired in teams for sharing their lab reports and completing the peer assessment; however, due to an odd number of groups, two teams were composed of three groups. Each group was responsible for evaluating and giving feedback to the other team. Nobody outside the team could read their lab reports or see the comments they made to each other.
Materials
Design Process of Assessment Forms
The online self and peer assessment forms were first designed on paper with the senior demonstrator in charge of the lab experiments during the fall semester. The researchers met with the senior demonstrator, who was ultimately grading the lab reports, six times prior to the beginning of the winter semester. These meetings were aimed at designing a formative self and peer assessment that would be aligned with the summative assessment. Alignment of formative and summative assessment is key to ensure that the proposed task does not lose credibility and become an added burden for students. In a context where the senior demonstrator had been grading lab reports for first year physics students for over 25 years, the goal was to collaborate with her, learn from her experience and try to understand how she was assessing her students. Integrating her criteria and discussing her vision of assessment with her was a prerequisite to planning the experiment. The senior demonstrator started by creating an enumeration of criteria that would apply to most lab reports. These criteria correspond to the internal criteria she had been using for grading lab reports throughout the first semester. As shown in table 3, these criteria were then structured into six main sections: general, graphs, data, sample calculations, least squares fit and conclusion. The first five categories (from A to E) have a checklist of criteria. All criteria are either present or absent, and the weighting of categories is not related to the number of criteria. For example, in the first category, "general", there is a list of six criteria and students have to indicate which have been met, give a score out of 10 and justify it briefly. The main task for students in this first section is basically to monitor their errors and their teammates' errors. The remaining sections consist of more open-ended questions. For example, in the conclusion section the characteristics of a good conclusion are listed but no checklist is given; students have to give a grade out of 10 and give justifications. The summary section of the forms differs for the self and the peer assessment. In the self assessment students only have to identify the stronger and weaker points of their lab report. In the peer assessment students have to identify the stronger and weaker points of their teammates' lab report, but they also have to formulate two questions for the other team. This last section aims at challenging students to communicate directly and openly with the team they are evaluating.
The scores attributed to each section reflected the grading scheme used by the grader. Even if studies on self and peer assessment suggest using a more holistic and qualitative framework for assessment, this form was chosen to ensure coherence with the summative assessment process (Dochy et al., 1999b). Individual scoring of categories was used to help students gauge the importance of each section, and qualitative feedback was added for each category. A summary section, not related to the summative assessment, was also added to help students reflect and synthesize their critique. Attempts to provide students with samples of good lab reports and feedback failed, as it was impossible to obtain work and consent forms from previous students.

Table 3
Self and Peer Assessment Criteria

Sections (types of error) and criteria:
A- GENERAL
1. Title page with name and number of experiment, names of students, students' I.D.s, date experiment was performed, apparatus number and numbered pages.
2. Original data at the end (after the conclusions).
3. Standard notation used, not computer symbols such as * and ^.
4. Equation editor used for equations, formulae, and symbols; i.e. Δy, not delta y.
5. One conclusion per student, with name at the top.
6. Horizontal pages all face towards the right.
B- GRAPHS
1. Graphs are full-page.
2. Units indicated.
3. Thin lines, extended back to y-axis or (0,0) if relevant.
4. White background.
5. Axes are labeled.
6. Grid lines in both x- and y-directions.
7. Title.
8. Points are represented by small, round dots.
9. Error bars (if there is no Least Squares Fit and a quantitative error analysis is asked for). No error bars are required if the error analysis is to be qualitative.
C- DATA
1. Recopy original data exactly as taken in the lab.
2. Data that is altered (different units, etc.) should be in a separate column or in a separate table.
3. Always show % difference from an accepted value.
4. Set up your data in tabular form with appropriate units.
5. Results may be added to the data tables if convenient. If not, put them in a separate heading under RESULTS.
D- SAMPLE CALCULATIONS (and ERROR ANALYSIS CALCULATIONS)
1. For each calculation, the equation, one line of substitution, and the results are shown.
2. Data that is altered (different units, etc.) should be in a separate column or in a separate table.
3. Results are rounded off to the appropriate number of significant figures.
E- LEAST SQUARES FIT for Linear Functions
1. Line of best fit.
2. Error lines.
3. Sample calculations or spreadsheet for the Least Squares Fit.
F- CONCLUSION
1. Do they quote their results with the calculated experimental errors?
2. Do they discuss whether they consider the experiment a success? Do they discuss their results, their errors or what they have learned from this experiment?
3. Do they discuss whether the accepted values lie within this range?
4. Do they suggest improvements, within reason, to the experiment? Verify that their ideas are realistic and that they explain how they could be implemented.
5. Do they include any other relevant comments? E.g., if they were to repeat the experiment, what would they do differently?
SUMMARY
Strong and weak points; two questions*

Each scored section carried a numerical value (10, 20 or 30 points, depending on the section) and two lines of justification; the summary section was not scored (N/A).
*Only in the peer assessment.
All experiments involve careful observation of a physical phenomenon; quantitative measurements of the variables which describe the phenomenon; discovery of experimental relationships or verification of theoretical relationships between variables; and determination of the values of the parameters which describe the relationships. Most lab reports require calculations, which can differ depending on the nature of the experiment. In lab 10 students worked on the ratio of specific heats of a gas. The goal of this experiment is to determine the value of alpha (Cp/Cv), the ratio of the specific heat at constant pressure to the specific heat at constant volume, for a monatomic gas (argon), a diatomic gas (air), and a polyatomic gas (carbon dioxide), using the method of Clément and Desormes. In lab 11 students experimented with the variation of the vapour pressure of water. The goal is for students to experiment with the variation of the boiling point of water under different pressures. The analyses of the data are not based on any exact theory, and calculations should give students slightly different values. Lab 12 requires students to test the Stefan-Boltzmann law of radiation through calculation of voltage and current in an electric circuit. In lab 13 the experiment shows the relationships between mechanics and heat, and between electricity and heat.
Assumptions about Errors
The nature of the assessment done in the first five sections refers to basic and fundamental knowledge. The checklist of criteria for each section is straightforward; not much subjectivity was involved in deciding whether all the elements were present or not. This led to some assumptions about the reliability of errors identified by students in the context of the self and peer assessment. Table 4 identifies the assumptions made about errors for analysis purposes. In case A, errors identified in self-assessments reflect errors made and seen by students in their lab report. In case B, errors identified by students in peer assessments reflect errors made by peers. In case C, errors identified by peers minus the ones self-identified reflect errors made but not seen by students. In case D, the intention was to use data from the grader's comments. However, the grader did not follow the same assessment form and gave very few comments, so the errors missed by both the self and the peer assessment were not identified. It was also impossible to triangulate data for case C to verify that errors identified by peers were accurate and extensive.
Table 4
Assumptions about Errors

                   Errors Identified                           Errors Not Identified
Errors in Self     A. Correctly identified errors (self)       C. Missed errors (peer - self)
Errors in Peers    B. Correctly identified errors in others    D. Missed errors (grader)
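To make the logic of Table 4 concrete, the sketch below shows how case C could be computed from the assessment data. It is an illustration only, not the study's actual analysis scripts; the criterion codes and variable names are hypothetical, and each assessment is assumed to be stored as the set of criteria marked as not met.

    # Hypothetical sketch: the error cases of Table 4, seen from the
    # perspective of one group's lab report. Criterion codes are invented.
    self_identified = {"A1", "B2", "B5", "E1"}              # case A: errors the group saw in its own report
    peer_identified = {"A1", "B2", "B5", "C3", "E1", "E2"}  # errors the paired group reported (their case B)

    # Case C: errors made but not seen by the group (peer - self).
    missed_by_self = peer_identified - self_identified
    print(sorted(missed_by_self))  # -> ['C3', 'E2']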
Environments
Online Course Space on WebCT
An online course space component was added to the regular face-to-face class. The lead instructor had some experience with the online course management tool WebCT; he wanted to use it in combination with an independent web page for the submission of assignments. In collaboration with the senior demonstrator, who was responsible for conducting the lab experiments and grading the lab reports, new elements were added to the WebCT course. The WebCT homepage for the course had eight icons (see Figure 1), but they were regrouped into six different uses: 1- dissemination of general class information, 2- submission of assignments, 3- private and public class exchanges, 4- questionnaires, 5- communicating specific results from experiments, 6- providing help for related software uses (see Appendix D for details about each tool).
Figure 1. WebCT course Homepage with Online Course Components
Course tools under icons one, two and six were managed by the instructor. They were related to assignments and class lectures, whereas the other sections were related to the lab experiments and the self and peer assessment activities. The assessment activities revolved mainly around sections under icons three and four. The tools in the communication section were designed to support private and semi-private communication within and between groups. Each team had a private discussion board where they could post messages only to people in their own lab report group and their paired lab report group. Students also had access to a public discussion board addressing more general issues and technical problems. To ensure that students could exchange with other members of their team privately, we added the internal email tool. Students could choose to use it or not, but we made it available. Under the questionnaire section students could access the self and peer assessment forms and tips on how to give and receive feedback.
Task on WebCT
The self and peer assessment forms were adapted to the quiz tool in WebCT (see paper versions in Appendix E). To complete the self and peer assessment forms students had to go online, log in to WebCT and enter the questionnaire section. In each assessment form (or quiz) students had to select which criteria had been successfully completed, give a grade and justify it by giving constructive and specific feedback on the lines below each category. Each form had 18 or 19 questions to fill in, and students could do them in more than one sitting as long as they saved their answers.
Upon completion students could post the results of the assessment on the discussion board for their peers. Self assessment forms for each lab report were only available the week following the lab experimentation, prior to handing in the report on paper. The peer assessment was available two weeks following the lab experimentation.
Online Communication, Web Course Usage and Statistics
Sharing of lab reports combined with the peer assessment was designed to promote exchanges and communication within and among teams. Depending on the length and requirements of each lab experiment, students had some time or no time to work on and discuss their lab reports in class. Online communication tools and class related information were posted on WebCT to encourage students to extend their collaboration outside of classroom time. The discussions and types of use students would make of the tools would have been examined, but the lack of depth in the discussions limited our analysis to the general use of messages. Observations related to general statistics about students visiting the website, reading and writing messages were used to examine the impact of these tools on students' collaboration.
Data Analysis
Participation and Task Completion
For each lab report, groups had to complete three tasks: fill out the self-assessment, post their lab report and complete the peer-assessment. If any group did not post their lab report, the group paired with them could not do the peer-assessment. Students' participation in the assessment activities was strongly encouraged by the instructor for lab reports 10 and 11 but was left to students' discretion for labs 12 and 13 due to the heavy workload at the end of the semester. Students' participation was a variable measured to confirm part of the third hypothesis. The participation and the percentage of task completion were used to estimate the relevance of the assessment activities from students' perspectives.
Self and Peer Assessment Forms
Numerical data coming from the self and peer assessment forms (as listed in table 3, sections A to E) were used to observe the evolution of the number of errors made by lab report groups. The number, types of errors (or categories) and provenance were monitored over time to address the hypotheses.
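To illustrate the kind of descriptive tallying this analysis involved, here is a minimal sketch. It is not the study's actual analysis code; the data layout and all names are assumptions made for the example.

    # Minimal sketch: tally identified errors per lab report and per error
    # category (the sections of the assessment form). Values are invented.
    from collections import Counter

    # Each record: (lab number, error category, errors identified on one form).
    records = [
        (10, "graph", 3), (10, "general", 1), (10, "least_squares", 2),
        (11, "graph", 2), (11, "data", 1), (11, "least_squares", 2),
    ]

    errors_per_lab = Counter()
    errors_per_type = Counter()
    for lab, category, n in records:
        errors_per_lab[lab] += n
        errors_per_type[(lab, category)] += n

    print(errors_per_lab)   # total errors identified per lab report
    print(errors_per_type)  # each category's evolution from lab 10 to lab 11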
To confirm the first hypothesis, the overall number of errors identified per lab report was calculated to observe whether a decrease in errors would happen over time. Then, the types of errors were taken into consideration to verify whether some types of error would reveal contrasting trends. To investigate the second hypothesis, that errors in others are easier to detect, the average number of errors identified in self assessment was compared to the average number of errors identified in peer assessment. This comparison was done over time and over the types of errors found in both the self and peer assessment. The students' ability to recognise these types of errors in their self assessment was investigated. In other words, this measure aimed at determining whether some types of errors would be harder than others for students to identify. The third hypothesis explores to what extent students played an active role in the assessment activities and whether they participated and discussed beyond the minimum requirements. The observations examine the overall task completion and participation, the web course usage, and the amount and types of exchanges students had amongst each other.
Grades and Comments on Lab Reports
Even if the main goal of this study is not to compare the validity of the self and peer assessment grades to the instructor's grade, part of hypothesis one is to verify the accuracy of students at predicting their grades and their teammates' grades. Lab reports graded by the senior demonstrator were photocopied before being handed back to students two weeks following the experimentation.
Data about the grades and comments coming from the corrected lab reports were manually entered and combined with the corresponding self and peer assessment data. Grades varied from alpha to delta, and very bad lab reports can receive an "F" for failed. The instructor's grading scale of alpha, beta, gamma or delta was converted to corresponding numerical values (alpha = 90%, beta = 80%, gamma = 70%, delta = 59%).
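As a small illustration of this conversion, a sketch follows; the mapping values come from the text above, while the function name and the numerical value for "F" are our own assumptions.

    # Sketch of the letter-to-percent conversion used for analysis.
    GRADE_SCALE = {"alpha": 90, "beta": 80, "gamma": 70, "delta": 59}

    def to_percent(letter_grade: str) -> int:
        # "F" (failed) had no stated numerical value; 0 is assumed here.
        return GRADE_SCALE.get(letter_grade, 0)

    print(to_percent("beta"))   # -> 80
    print(to_percent("delta"))  # -> 59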
Interviews and Post Activity Questionnaire
All answers to questions from the interviews were transcribed and compiled into a table corresponding to the sections listed in table 2. Answers from the post-activity questionnaire (see Appendix C) were combined with the interviews, and relevant quotes and summaries were translated into English (see Appendix F for a detailed summary).
CHAPTER 4
RESULTS
In this study, three main questions aimed at observing and measuring the impact of the self and peer assessment activities on students' performance and collaboration. Before addressing the results for these questions, we report on the participation rate in the activities, as it had a major impact on the design of the study.
Participation
For each of the four lab reports, lab groups had to complete three tasks: fill in the self-assessment, post their lab report and complete the peer-assessment. As reported in table 5, the overall student participation in all twelve tasks was approximately 44%. However, participation was not evenly distributed, and the discrepancy in the participation rate between the first two lab reports and the last two lab reports is high. For the first two lab reports the participation rate is around 79%, whereas for the last two lab reports it is around 9%. This very low rate of participation in lab reports 12 and 13 had an important impact on the analysis since no peer assessment data was available and only two self assessment forms had been submitted. Due to this low response rate, only data from lab reports 10 and 11 were considered for analysis.
Table 5
Overall Participation

          Self    Lab posted    Peer completed    Overall
Lab 10    70%     78%           94%               80.6%
Lab 11    88%     71%           75%               78%
Lab 12    17%     17%           0%                11%
Lab 13    8%      13%           0%                7%
Overall   46%     45%           42%               44%
Measures Related to the First Hypothesis
As part of the first hypothesis, we had predicted that the number of errors would decrease over time. We also expected that certain types of errors would decrease while others remained constant.
Number of Errors
Errors were identified by types and categories for each form filled in by students. For the experimentation period related to labs 10 and 11, 52 assessment forms out of a possible 66 were filled in on WebCT. The numbers of assessment forms submitted for labs 10 and 11 were not too different: 28 assessment forms were submitted for lab report 10 and 24 for lab report 11. As reported in table 6, a total of 233 errors were identified, which makes a global average of 4.5 errors identified per assessment.

Table 6
Number of Errors Identified in Lab Reports 10 and 11

Lab Report    Number of Errors    Number of Assessment Forms    Average Number of Errors per Assessment Form
Lab 10        137                 28                            4.89
Lab 11        96                  24                            4
Total         233                 52                            4.5
The number of errors in lab 11 is lower than in lab 10, but this difference is not statistically significant.
Types of Errors
The assessment forms were structured to help students verify lab reports systematically for different types of error. Each type of error had a list of criteria. The categories of errors were used for analysis purposes, rather than each distinct criterion, to report on the types of errors. As reported in table 7 below, out of 233 errors, 30 were general types of errors, 86 occurred on graphs, 30 related to data, 17 related to sample calculations and 70 had to do with least squares fits. The most commonly identified errors are related to the graphs (37%) and to the least squares fit (30%).

Table 7
Types of Errors in Lab 10 and Lab 11

Lab Report    General    Graph    Data    Sample Calculation    Least Squares    Total Number of Errors
Lab 10        19         52       15      11                    40               137
Lab 11        11         34       15      6                     30               96
Total         30         86       30      17                    70               233
The number of errors decreased from lab 10 to lab 11 in some but not all categories, and not uniformly. Errors that apply to general details and graphs decreased in lab report 11 compared to lab report 10. As shown in figure 2, general, graph and sample calculation errors decreased while the other two categories of errors remained more stable. Errors associated with the least squares fit and data seem to be more resilient.
Figure 2. Average of Types of Errors per Assessment Form

Grades
Grades given by the senior demonstrator were recorded and compared to both the grades students assigned to themselves and the grades they assigned to peers. Descriptive statistics are reported in table 8.

Table 8
Grades for Self, Peer and Instructor

          Self grade (n)    Peer grade (n)    Instructor's grade (n)
Lab 10    92 (16)           94 (12)           75 (23)
Lab 11    94 (19)           92 (9)            77 (23)
Overall   93                93                76
Grades' Estimation by Students
Descriptive statistics in table 8 show that the average grade given by the instructor is 75% for lab 10 and 77% for lab 11. A paired t-test on self and peer predictions, with the instructor's grade as reference, was not significant: students are not very good at predicting their own or their peers' grades. Grades do not improve significantly from lab 10 to lab 11, nor does students' ability to predict their grade or their peers' grade. Even the comparison of students' self- and peer-attributed grades did not correlate.
Measures Related to the Second Hypothesis
In the second hypothesis, measures were aimed at finding out whether the peer assessment could facilitate self assessment. Part of the hypothesis was that students could find more, and different types of, errors in their peers' work than in their own. Related to error identification, we hypothesised that a difference could be found between the types of errors students made and recognized making and errors they made but could not recognize making in themselves. To assess students' preference for the self or peer assessment activities, the participation rates of the two were compared. Students were then asked in an interview which of the two they preferred.
Number of Errors Found in Self vs. Peer Assessment
Out of the 52 assessment forms submitted by students, 33 were self assessment forms and 19 were peer assessment forms.
On average, students identified more errors in peer assessment than in self assessment. In other words, they see more errors in the lab reports of others than in their own.

Table 9
Number of Errors in Self and Peer Assessment

Type of Assessment    Average Errors per Form    Number of Errors    Number of Assessment Forms
Self                  4                          132                 33
Peer                  5.32                       101                 19
Total                 4.5                        233                 52
We cannot confirm a significant trend, due in part to the reciprocity of the design: we had only 15 pairs of completed peer and self assessments for a given lab. Out of the 15 pairs, 9 identified more errors in the peer assessment and 6 identified more errors in themselves. We computed a paired t-test on the number of errors found in self and peer assessment for each lab, but results did not reach significance.
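The sketch below illustrates the shape of this paired comparison; the error counts are invented for the example, and only the structure of the test matches the analysis described above.

    # Hedged sketch of the paired t-test on errors found in self vs. peer
    # assessment. Counts are invented; scipy's ttest_rel pairs observations
    # by position, matching each group's self form with its peer form.
    from scipy import stats

    errors_in_self = [4, 3, 6, 2, 5, 4, 3, 7, 4, 5, 2, 6, 3, 4, 5]
    errors_in_peer = [6, 4, 9, 0, 6, 3, 5, 4, 5, 3, 5, 7, 2, 6, 3]

    t_stat, p_value = stats.ttest_rel(errors_in_peer, errors_in_self)
    print(f"t = {t_stat:.2f}, p = {p_value:.2f}")  # no significant difference here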
Types of Errors in Self vs. Peer
When we look at the data in more detail and take into consideration the types of errors identified in self and peer assessment forms, the picture is more complex. Looking at the raw data can be misleading since different numbers of self and peer assessment forms were received: 33 self assessment forms and 19 peer assessment forms. In figure 3, where types of errors per assessment form are represented, students seem to be better at identifying errors related to graphs, data and sample calculations in others than in themselves.
Table 10
Types of Errors in Self and Peer Assessment

Type of Assessment    General    Graph    Data    Sample Calculation    Least Squares    Total Number of Errors
Self                  19         47       12      8                     46               132
Peer                  11         39       18      9                     24               101
Figure 3. Average of Types of Errors in Self and Peer Assessment

Type of Error Recognition in Self vs. Peer
Errors reported in peer assessments were considered to accurately reflect errors made by the students. These errors were compared to the self-assessment for the same lab. Errors made and identified were compiled and added to the errors made and not identified, to observe whether students had a harder time identifying some types of errors in themselves. These manipulations were done for each lab report but, considering the low number of entries (there were only nine entries for lab 10 and six for lab 11), no test of significance was computed. The data still suggest interesting exploration paths. As reported in figure 4, errors related to graphs and least squares fits seem to be easier for students to self-identify in lab report 10.
Figure 4. Average of Actual Errors Not Identified vs. Self Identified in Lab 10
In figure 4 the dotted columns correspond to the actual errors, whereas the lined columns correspond to the self-identified errors (errors students identified but were still making). Some types of errors seem to be mainly identified by peers, like general errors and errors concerning data and sample calculations. Errors related to graphs also seem to be more obvious when doing peer assessment. Looking at the data for lab 11 in figure 5 can help us predict what could happen over time. In lab 11 students seem to be better at identifying errors they have made concerning graphs and least squares fits, but they tend not to identify other types of errors they have made.
Figure 5. Average of Actual Errors Not Identified vs. Self Identified in Lab 11

Participation in Self vs. Peer
As reported in table 11, students' participation rate in peer assessment is higher than the participation rate in the self assessment activities for lab 10 and overall, but not for lab 11.

Table 11
Participation Rate in Self and Peer

          Lab 10    Lab 11    Overall
Self      70%       88%       79%
Peer      94%       75%       84.5%
Overall   80.6%     78%       79.3%
Students' Preference for Self or Peer Assessment
In section B of the interviews and the post-questionnaire, we asked students what they liked and disliked about doing the self and the peer assessment, and which they would choose to keep if they could only keep one. Students' answers varied: two preferred the self assessment, three preferred the peer assessment, and two liked both equally. Students' preferences are not as revealing as the reasons they gave to justify them. Students liked to see how others do their lab reports. S1 explained why he thought both the self and the peer assessment were good:
"benles deux etaient petfinents parce que ce que j'ai aim6 dans 1'6vaiuation de I'autre lab cgtait que tu pouvais voirla methode des autres parce que toi t'as toujours fa propre mbthode. " Translation: "Oh well, both were appropriate because what I liked about evaluating the other's lab is that you could see their methods since yourself tend to have your own way of doing it. " Another student S2 who preferred the peer gave a similar justification:
"Le peer c'est plus intéressant, j'aime bien recevoir les commentaires des autres, c'est bien aussi de voir comment les autres font les choses. On se rendait compte de nos erreurs en regardant le lab des autres."
Translation: "The peer [assessment] is more interesting, I really like receiving comments from others; it is also useful to see how others do things. We realized our errors as we went over the others' lab report."
Students who did not appreciate doing the peer assessment were quite ambivalent in their justifications; they liked looking at the other group's lab report but they did not like having to grade it.
"J'aimemieux I'auto-Bvaluation, Fa te permet de te rendre compte de tes erreurs,,8valuer les autres c'eqt plus dificile. Je pense que le self est plus utile. Mais j'aime lire I'autre pas I'~valuer,mais j'aime recevoir les commentaires des autres. " (S4) Translation: "Ilike the self evaluation better, it allows you to revise and find out about your errors, evaluating others is harder. I think the self is more useful. However I like to read rather than evaluate others' lab reports, but I like receiving comments from others."
Measures Related to the Third Hypothesis
Task Completion and Participation
As stated in table 11, the overall participation rate for labs 10 and 11 was relatively high (79%); however, many groups completed only the self or only the peer assessment for any given lab report. For each lab report, groups had to complete three tasks, and the reciprocal nature of the peer review excluded many participants when their matching group either did not post their lab report or did not complete the peer review. Some groups only had the opportunity to complete 4 or 5 of the 6 required tasks. In table 12, the participation rate was calculated per task for labs 10 and 11, based on the opportunities groups had to complete tasks and how many of them they actually completed. All groups had the opportunity to complete a minimum of four tasks: fill in the self-assessment for lab 10, post lab report 10, fill in the self-assessment for lab 11, and post lab report 11. When the corresponding group had posted their lab report for any given lab, then the group had the opportunity to complete the peer assessment for that lab report. For example, group M1a (Monday group 1a) did the self assessment for labs 10 and 11 and posted both lab reports, but could only do one of the peer assessments, as their matched group M1b did not post their lab report for lab 11. The number of messages exchanged within a team is an indicator of the level of participation, as a minimum of four postings was required to complete the tasks.
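As an illustration of how these per-group rates were derived (the group labels and counts below are taken from table 12; the code itself is only a sketch):

    # Participation = tasks completed / tasks the group could actually complete.
    tasks = {"M1a": (5, 5), "M1b": (5, 6), "M2a": (3, 6)}  # group: (done, possible)

    for group, (done, possible) in tasks.items():
        print(f"{group}: {done}/{possible} = {100 * done / possible:.0f}%")
    # M1a: 5/5 = 100%, M1b: 5/6 = 83%, M2a: 3/6 = 50%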
Table 12
Task Completion for Labs 10 and 11

Group    Tasks done / possible    Participation (%)    Messages (team)
M1a      5/5                      100                  7
M1b      5/6                      83
M2a      3/6                      50                   9
M2b      6/6                      100
M3a      5/6                      83                   17
M3b      5/6                      83
M4a      3/4                      75                   1
M4b      0/5                      0
M5a      2/4                      50                   0
M5b      -                        -
M5c      -                        -
M6a      4/4                      100                  4
M6b      6/6                      100
W1a      -                        -                    16
W1b      5/6                      83
W2a      6/6                      100                  0
W2b      4/6                      67
W3a      5/6                      83                   4
W3b      5/5                      100
W4a      4/4                      100                  9
W4b      4/6                      67
W5a      2/4                      50                   2
W5b      4/4                      100
W6a      2/6                      33                   2
W6b      6/6                      100
W7a      4/6                      67                   13
W7b      -                        -

Note. Message counts are totals per team, shown on the team's first row.
For lab 10, 27 lab groups were clustered in 13 teams: six teams on Mondays (one composed of three groups) and seven teams of two groups on Wednesdays. Consent forms were obtained for 23 of these groups (in table 12, dashes mark the groups who did not agree to give access to their data). Out of these 23 groups, 16 did their self-assessment, which represents a participation rate of 70%. The lab reports were posted on WebCT by 17 groups and shared on paper by one team, which resulted in a 78% participation rate for this task. Only one of these 18 groups did not complete the third task, the peer-assessment (94% participation). For lab 11, 28 lab groups were grouped in 13 teams: six teams on Mondays and seven teams on Wednesdays, with a team of three groups in each class. Signed consent forms were obtained for 24 of the 28 groups, matched in 10 working teams. Out of these 24 groups, 21 did their self-assessment, for a participation rate of 88%. In the following week only 16 groups posted their lab reports on WebCT and one shared it on paper (71% participation). Of the 17 groups that then had the opportunity to do the peer-assessment, 75% did so.
Web Course Usage and Statistics
The WebCT site for the course was available to students from January 3rd until the end of the semester, April 24th. All students first visited the site between January 6th and February 9th. The last visits to the site were made on April 14th. More than fifty percent of students visited the site in April, when all course requirements were over and no exam-related material was required. The number of hits (or clicks) per student varied from 7 to 209, for an average of 50 hits per student for the semester. Fifty students read between 1 and 21 messages, and thirty of them posted at least one message, for an average of three messages per student. The time students took to complete and submit their self and peer assessment questionnaires was recorded. Half of the students who completed the self assessment for lab 10 did it in one sitting. On average they took approximately 10 minutes to complete the form. For the peer assessment questionnaire for lab 10 we have similar results: a third of the students completed the questionnaire in one sitting, with an average of 16 minutes per form. For lab 11, two thirds of the students took approximately 10 minutes to complete the questionnaire in one sitting. For the peer assessment, half of the students who did it in one sitting took approximately 12 minutes.
Students' Exchanges
Online Communication Amongst Teams
Students could only read and answer messages in their team; nobody except the instructor had access to more than one private discussion board. All students had access to the main and technical topics for the class. In these two sections any student could post and reply to messages. In the 13 private team discussion boards, students posted a total of 84 messages. In the main discussion board, open to all, students posted 4 messages. Messages were categorized according to their goals; conversations could not be analyzed since only one question-and-answer exchange occurred. In general students did not reply to each other online.
Messages were categorized as functional (posting a lab report, sharing a self or peer assessment), questions (when questions were posted on their own), notices or reminders (when students told the other group to post their lab or to clarify missing information), or problems (technical problems with the upload). Approximately 80 percent of the messages were functional; the others were notices and problems. In three teams only one group posted messages, and two teams were affected by technical problems with the upload on the discussion board.

Table 13
Number and Types of Messages per Team

Team    Functional    Questions and Answers    Reminder or Notice    Notes
M1      6             -                        1                     -
M2      7             1                        1                     -
M3      14            2                        1                     -
M4      1             -                        -                     only one group
M5      -             -                        -                     technical problems
M6      3             -                        1                     -
W1      14            2                        -                     -
W2      -             -                        -                     only one group
W3      4             -                        -                     -
W4      2             -                        8                     -
W5      2             -                        -                     -
W6      2             -                        -                     only one group
W7      7             2                        4                     technical problems
Endorsement of the Activities
In section D of the interviews and questionnaire, we asked students for their opinions on whether or not they would recommend repeating the activities next year, to indicate whether doing the activities online was suitable for them, and to suggest improvements. Six of the seven students said it would be worth repeating the exercise. In general students said it would be more useful to do it during the first semester for a couple of lab experiments and maybe have the questions evolve afterwards. They would incorporate it into the course load and make it mandatory, to ensure that everyone would have a chance to do the peer assessment. All students liked the online assessment, but they suggested that the lab reports should be exchanged on paper on the day of submission. It would be easier to evaluate, as you would not need to toggle back and forth between programs and windows. To improve the activities, students suggested more intervention from the instructor, and making participation mandatory.
Group Exchanges and Responsibilities
To try to understand how groups shared the work and collaborated, we asked students about the grouping, the timing and the task sharing for the activities within their lab group. We also asked them how they usually split the work for their lab report, as we thought the two would be related. Answers from two male students, S7 and S4, were clear: we divide the work in two, work on our own, email the work to each other before submitting, and we do not meet to do anything. In both cases they did both the peer and self assessment on their own. In contrast to these two students, two female students said they would meet and do everything together in the computer lab. In their case they did the self assessment as a group as well. Only the self assessment was done in a group by S2 (female), because her group did not have the opportunity to do any peer assessment. The two other students' groups (one male, one female) were in the middle; they would work on their own but met to check everything together before submitting. In this case one group did the assessments as a group, and the other student said she did them by herself. One student mentioned that she felt it had been a useful tool for helping the group revise their lab report. We also asked their opinion about the pairing of groups for the peer assessment. Students said they liked exchanging in small groups; it gave them some privacy and the opportunity to develop more familiarity with each other. They liked having a restricted audience and knowing who would read their comments.
"No, I wouldn't like anonymity; I could not go and talk to the person." (S5)
"Oui, j'aime bien avoir un groupe avec lequel je me sens plus à l'aise, le ton devient plus familier. Je pense que je ne veux pas que les autres élèves voient les commentaires que je fais ou que je reçois à propos de mon lab." (S2)
Translation: "Yes, I like small groups better, I feel more at ease, and the tone gets friendlier. I think I would not like other classmates to see the comments I make or receive concerning my lab report."
CHAPTER 5
DISCUSSION
In this study of a classroom implementation of formative assessment activities using technology, we had three main questions. First, will the activities help students monitor their errors, gauge their performance and eventually produce better lab reports? Second, can peer assessment facilitate self assessment? And third, will the activities be endorsed by students and foster collaboration and exchanges amongst them? After discussing the results related to these questions, the limitations of this study and educational implications for future research are also discussed.
Error Monitoring, Gauging Performance and Improvement of Lab Reports
The self and peer assessment activities were designed to provide monitoring and address problems raised by Astolfi (1999): students' misunderstanding of instructions and students' work habits. By providing students with a clear assessment grid, their understanding of the instructor's instructions and expectations was expected to improve. These activities were also designed to improve learning strategies and actively involve students in the revision of their work and their peers' work. The goal was to measure the impact of these activities on students' error production by observing the number and types of errors they detected. The instructor's grades were used to measure the improvement of lab reports and also to evaluate students' ability to predict their grades.
Number of Errors
To measure the effectiveness of the assessment activities, the number of errors detected by students was examined. It was hypothesized that the number of errors would decrease from the first to the last lab report. On average students identified 4.9 errors per lab report in lab 10 and 4 errors per lab report in lab 11. This decrease in the average number of errors found per lab report between lab reports 10 and 11 was not statistically significant. However, it may suggest a trend that could be confirmed if the study were conducted over a longer period of time.
Types of Errors
For each of the five distinct categories of errors in the assessment grid, a general decrease in the number of errors was expected over time. The rate of decrease for the different types of errors was also of interest. As shown in figure 2, errors pertaining to general, graph and sample calculations decreased, while the other two categories of errors remained more stable. The experiment should be repeated for more than two lab reports to be able to see reliable patterns of errors over time. The exact nature of the cognitive tasks involved in each lab is a potentially confounding variable that should be considered. Categories of errors like general, graph or data might not be sensitive to the nature of the tasks involved in each lab. However, the categories of sample calculation and least squares fit have to be interpreted in light of the nature of a specific lab. For example, lab 10 was a demonstration where sample calculations were central to the task. Lab 11 did require sample calculations, but they were less extensive.
This might explain the decrease of errors for the sample calculation category from lab 10 to lab 11. The decrease in the number of errors in the general and graph categories is probably more revealing, as the tasks for both lab reports in these categories were similar.
Improvement of Lab Reports
The instructor's grades were used as a measure of improvement in the quality of lab reports from lab 10 to lab 11. The alpha, beta, gamma, and delta values were converted into numerical values to match those on the assessment grid. The grades did not improve significantly. The senior demonstrator confirmed that she did not find any significant improvement in the quality of the lab reports.
Estimation of Performance
One of the main goals of the study was to find out whether students' ability to evaluate themselves and understand the assessment process improves. One way to measure this is to see whether their self and peer assessments can predict the grade given by the instructor. The assessment grid is assumed to be aligned with the summative assessment since it was designed with the instructor. As shown in table 8, the correlation between grades predicted by students and grades given by the instructor is non-existent, and it does not improve for lab 11. This lack of correlation might be explained by the different scales used by students and instructor. It could also be due to students needing more training in using the assessment grid. However, considering students' comments about grading, as they expressed surprise about their lab report being shorter yet getting the same grade as a partner's longer one, their uncertainty about the instructor's expectations and their misconceptions about what a good lab report is may deserve more attention. Some monitoring and concrete examples might be needed to better show students what the instructor expects.
Summary
The results of error monitoring and grade prediction cannot firmly confirm the hypothesis about the positive effect of these assessment activities on the improvement of the lab reports and students' ability to gauge their performance. Results suggest a positive effect, but the limited amount of data obtained limits the analysis.
Relationship Between Self and Peer Assessment
As mentioned by Johnson and Johnson (1994), peer collaboration gives students an outside perspective on a topic or task. By having to complete the peer assessment, students need to actively use the assessment criteria on someone else's work. The peer assessment was expected to help bridge the recognition-production gap, as it is easier to see errors in others' work. To verify this hypothesis, the following were examined: whether students could find more errors in others, whether they found the same types of errors in others as in themselves, and whether they improved at identifying errors in themselves. It was also hypothesized that peer assessment would be more popular with students, as it is more interactive and allows students to compare themselves to others.
measure this motivational aspect of the peer assessment, the participation rates for the peer versus self assessment were examined, and during their interviews students were asked what their preferences were and why.
Number of Errors in Self vs. Peer
For the task where students had to monitor their own errors and those of their peers, it was hypothesized that they would have an easier time identifying errors in others than in themselves. Therefore, students were expected to identify more errors in the peer assessment than in the self-assessment. On average, students tended to find more errors in others (5.32 errors per lab) than in themselves (4 errors per lab), but the difference was not statistically significant. This lack of significance is probably due to the small sample size (only 15 pairs of observations).
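A paired comparison of this kind could be run as sketched below. The error counts are invented for illustration; with a sample this small and this noisy, a mean difference of roughly one error per lab can easily fail to reach significance.

# Hypothetical paired error counts for 15 students: errors each student
# identified in a peer's report versus in their own report.
from scipy import stats

peer_errors = [6, 3, 5, 7, 2, 6, 4, 3, 8, 5, 4, 6, 5, 7, 4]
self_errors = [5, 4, 3, 6, 4, 5, 2, 5, 6, 4, 5, 4, 3, 5, 5]

t, p = stats.ttest_rel(peer_errors, self_errors)  # paired t-test
print(f"t = {t:.2f}, p = {p:.3f}")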
Types of Errors in Self vs. Peer
When looking at the data in more depth, a trend was found in the types of errors students identified in the self assessment compared to the peer assessment. Students seemed to have an easier time identifying errors regarding graphs, data and sample calculations in others than in themselves. This finding suggests that the peer and self assessment, although identical in form (they contained the same criteria and questions), were not simply redundant but possibly complementary. Further investigation would be needed to confirm this difference, as well as to show whether self and peer assessment could be optimized by asking students different types of questions in each.
Recognition of Errors in Self vs. Peer
The ability to identify errors might not automatically lead to error correction. Therefore, it was hypothesized that students who were able to identify errors in themselves would nevertheless continue to make similar types of errors. One would hope that self-monitoring of errors would lead to error correction, and that a decrease, or at least a different pattern, in the types of errors made would be seen. Figure 4 on page 53 showed that in lab 10 students could identify the errors they made mainly in their graphs and in least squares fit calculations, but not in the other categories. Figure 5 on page 54 showed that in lab 11 the number of errors decreases for graphs, which could indicate that students caught on to their errors from lab 10. However, the number of errors concerning the least squares fit does not decrease, even though half of the errors in lab 10 were self-identified. This resilience could be explained by the need for feedback from peers or the instructor on where students make certain errors and how to fix them. Part of the resilience could also be explained by the more complex calculations required in lab 11 as compared to lab 10.
In general, the number of errors declines from lab 10 to lab 11 in all but one category, the least squares fit calculations. Students also became more aware of the general types of errors they made from lab 10 to lab 11. For the other types of errors, the ability to self-identify decreases. This might be due to the number of errors decreasing in general or to some types of errors being more resilient. It could also reinforce the idea that some types of errors are difficult to
self-identify and require external feedback for revision. Once again, more lab report data would be needed to identify clear trends.
Participation Rate in Self and Peer Assessment
Considering the overall participation rates for the different tasks, the initial hypothesis concerning the popularity of peer assessment compared to self-assessment appears to be confirmed. Students' participation rate was higher for the completion of peer assessments (84%) than for self-assessments (79%). However, the progression over time shows that students' initial response rate for the self-assessment in lab 10 was the lowest. This might be due to some confusion, or simply to students needing time to understand and familiarize themselves with what they had to do and how to do it. In lab 11, students' participation increased for the self-assessment whereas the rate of participation for the peer assessment decreased. This behaviour might be linked to the schedule in which students had to complete the activities. The second peer assessment had to be done during or after Spring break, when students had other exams and term papers to do. In terms of timing, week 2 was the best week for participation: students filled in their peer assessment for lab 10 at 94% and their self-assessment at 88%.
Students' Preference
Students did not express a clear preference for peer assessment. Some said they preferred the peer assessment, some said they preferred the self-assessment, some said they liked them equally, and others could not give their opinion since they did not have the opportunity to experience the peer
assessment. The justifications behind their answers lead us to think that they found both useful but particularly liked the peer assessment because it allowed them to see others' lab reports. Students found the task of evaluating others challenging and hard, yet they appreciated receiving feedback from others. Peer assessment is not an easy task for students; it has been reported as difficult, or even as a threatening experience, by students in another study on triadic assessment (Gale, Martin, & McQueen, 2002). Students in the current study said they liked the self-assessment because it helped them revise their lab reports, and they liked the peer assessment because they could see how other people were doing their lab reports.
Summary
Students' preference for the peer assessment was confirmed to some extent. Participation rates were a bit higher and, considering the extra amount of work involved in carrying out the peer assessment, this may indicate a positive opinion of it. Students' comments about the peer assessment suggest that they believe self and peer assessments are complementary. The error monitoring results go in the same direction as the hypothesis stated earlier but, considering the lack of significance, it cannot be confirmed that peer assessment leads to more error detection. However, looking at the types of errors detected in peer versus self assessment supports the hypothesis of the complementarity of
these two assessment tools.
Community of Practice
In its report on science education, the National Science Foundation (1996) recommended shifting the emphasis from teaching to learning. They confirmed that students are influenced by the conditions under which learning takes place, and that many students learn better when they are actively involved in collaborative groups inside and outside of the classroom (Springer et al., 1999). In their meta-analysis of undergraduates in science, Springer et al. (1999) confirmed that "various forms of small-group learning are effective in promoting greater academic achievement". In this study, students were provided with the opportunity to play an active role in the assessment process. A situation of group interdependence was created in order to foster a community of practice within the class. It was hypothesized that such a community would develop if students endorsed the activities, collaborated within and amongst groups, used the online facilities, and communicated about their lab reports above and beyond the suggested task. To verify the degree to which students endorsed the activities, students' participation, their task completion rates, and their opinions about the repetition of the activities were examined. To measure collaboration amongst and between groups, students were directly asked in interviews about task distribution and grouping. Students' use of the online facility, and the amount and types of conversations and exchanges they had in their small groups, were also examined.
Students' Endorsement of the Activities
Participation and Task Completion
Overall participation in the activities for the four lab reports could be considered acceptable (44%), but participation was not evenly distributed. There was a discrepancy between the first two and the last two lab reports. Participation and task completion for the first two lab reports was high (79%). Nine of the 23 groups did all possible tasks, and nine other teams completed more than 60% of their tasks. This represents more than 3/4 of the participants. Following the instructor's advice to concentrate on exams, very few students continued to participate in the activities after the Spring break. For lab reports 12 and 13, only four groups continued to do their self assessment. However, none of them were paired together, so they did not have the opportunity to do peer assessment. The discrepancy between the first and last two lab reports might be due to the instructor's overt support for the activities before the exam period. Time constraints might also have discouraged students from continuing the activities at the end of the year, as their workloads and stress levels were high.
Students' Opinions about Repeating the Activities
Students' opinions were positive; six out of the seven said they thought the activities had been useful and should be repeated next year. They also had suggestions on how to improve the logistics of the activities. All students agreed that it should be mandatory for everyone to do peer assessment and that it should be implemented early in the first semester. Considering the amount of extra work involved in the activities, the students' involvement and opinions were encouraging. Students might have wanted to please the interviewer, but if that were the case they would probably not have given as many suggestions for improving the activities. Their reactions and suggestions are considered positive feedback towards the endorsement of the activities.
Collaboration
Task Distribution and Group Dynamics
Students were responsible for working together as a group to produce their lab report. How they distributed the work and when they did it were their own responsibility. One of the goals of the activities was to give students the opportunity to see and discuss each other's work. Their group processes and dynamics in writing their lab report and doing the activities were of interest in this study. The results suggest that the activities did increase discussions and
collaboration between the two lab groups that were already doing their work physically together. However, the activities did not change the work habits of the two groups who were doing everything individually. Two other groups said they did the self-assessment together after having each done their part individually. These comments suggest that the activities might have amplified already existing group dynamics. That is, if students were already actively collaborating, the activities were done in groups, but if they were not, the activities were done individually.
Use of WebCT Course and Communication amongst Students
Students continued to access the online course facility after the required activities were completed. More than 50 percent of them visited the site in April
when all assignments and course requirements had been met. Students did not really engage in online conversation. The number of messages is misleading; some teams had up to 17 message postings, but the content of these messages did not suggest that any real discussion occurred. Very few teams had exchanges about their lab reports; 80 percent of the exchanges were one-way postings for functional reasons (e.g., requesting the posting of a lab report, posting a lab report, posting the assessment). One group tried to start a discussion by posting open-ended questions, but the corresponding group answered that they did not have time to spend on these questions. The absence of discussion can be attributed to the lack of reward and of monitoring of the questions. The grouping itself was probably not detrimental to this part of the activities; more likely, the absence of monitoring and reward discouraged students from investing time in the process.
Summary
Overall, results about the instantiation of a community of practice are not conclusive. Students showed a good level of endorsement of the activities, but their involvement in the discussions and the collaboration was limited. This limitation might be due to the classroom setting, which reinforced gradable activities. It may be important to implement some of the students' suggestions to reach a better level of involvement.
Limitations of the Study
Participation
Data collection was planned for four lab reports in order to observe students' monitoring of errors and interactions. However, student attrition resulted in data for only two lab reports. This attrition had a major impact on the power of the design; instead of four sets of data from four points in time, only two were available. Further, the overall participation rate of students for all the activities was approximately 46%. As shown in Table 5, students participated very well in labs 10 and 11, but the participation rate drops quickly for labs 12 and 13.
Participation is directly linked to reward; unless students see short-term benefits, a majority of them will not invest the time and effort. Even though a small group of students remained involved in the activities after the required period, the design of the peer assessment required both lab groups in a team to participate. This interdependency left most of them on their own, unable to communicate and continue the activities with others.
Timing
The activities were designed during the fall semester and implemented in the winter semester. In the winter, the first third of the semester is dedicated to special projects, and formal lab experimentation only starts in mid-February.
Mid-February was chosen to introduce the activities because the senior demonstrator believed students needed time to get accustomed to their environment. This late onset also aimed at testing the new procedure with more stable groups, as the rate of attrition and group change is a lot higher during the first semester. This decision might have had a negative impact on the credibility of the activities, as students wondered why they did not have to do this in the first semester. Another potential problem with implementing the activities in mid-February is the impact of mid-term exams and students' high workload during that period. The participation rate across the weeks shows that the week corresponding to mid-term exams had a lower participation rate. The implementation of the activities might be more successful earlier in the year, but more experimentation is required to test this hypothesis.
Task Requirements
As mentioned by Vazquez-Abad, Winer, and Derome (1997), the task requirements for lab experimentation in first year physics are challenging.
Students do not have the theory-based knowledge to understand the experimentation they are doing. This gap leads to a significant disconnection between the theory-based classes and the practice-based lab experiments. In this context, students cannot be expected to demonstrate much depth in their understanding. Open-ended questions might not have triggered discussion since not much can be discussed in terms of students' understanding of the theory behind the experiments. If the main goals of this first year course are to teach students procedures and manipulation, the activities could have been more appealing if the instructor had posted some questions about the lab, with bonus grades attached, to trigger discussion.
Technology
The advantages of using WebCT for the introduction of these activities outweigh the disadvantages, but the way some of the tools were used had a negative impact on the study. Using the university-wide course platform for this experimentation was beneficial because students were already familiar with the technology; 75% of them had already used the facility and knew how to use the basic features. Nevertheless, the fit of the quiz and discussion modules to the activities was not transparent; they added an extra burden and made the activities more cumbersome. The quiz module in WebCT does not allow for peer grading or commenting, so students had to cut and paste their answers into the discussion board. As students pointed out, it would be more stimulating to have an assessment form that could be sent directly to peers, commented on by them, and returned. Designing a quiz in WebCT does not give a lot of flexibility or options for formatting. Every section required the design of 3 different questions, and the quiz ended up having 18 questions instead of 7. The quiz module also makes purely formative assessment impossible, since it requires grading for each question. Even though students were informed before and during the activities to ignore the grading, some could not understand why they kept getting low grades. Some browsers were not supported by the WebCT discussion board, and students complained about the situation. Two teams also had serious difficulties with the uploading procedure, and these problems remained even after they contacted the support group.
Educational Implications and Future Directions
Active Use of Assessment Criteria
The instructor had given the grading criteria to students at the beginning of the year, but when asked about the criteria in the interviews, some students said they did not remember them. Students said that they had not really paid attention at the beginning of the year. They appreciated receiving precise criteria and guidelines for doing their labs, and some admitted that they would probably not have read them had they not had to do self and peer assessment. Giving students a list of criteria and asking them to do something with it should produce different results. Students not only have to be able to understand the criteria, they also have to develop the skills of judging and evaluating a piece of work. This ability to judge and evaluate a piece of work requires higher order thinking (Bender, 2003). The exercise implies that students need to read carefully and identify problematic aspects in the work of others. Unlike when they apply the criteria to themselves, they need to explicitly verbalize, or in this case give written feedback on, why some criteria have not been met.
Monitoring Comments Given by Peers
The instructor could benefit from monitoring the comments given by peers. Monitoring could improve students' view of the activities in terms of validity, and thus the investment they are willing to make. When students are required to evaluate someone else's work and justify their views and judgments, their answers can reveal successful and unsuccessful understanding of the material. Depending on the content and context of the assignment, the instructor might be able to get insightful feedback on students' learning. Simultaneously, the monitoring of comments can ensure that the feedback students give remains valid and constructive. Although there were no problems with the constructive approach taken by students in this study, the situation could be different if these activities were implemented in a more competitive context.
Dynamic Self and Peer Assessment Form
Students gave both positive and negative feedback about the activities. Some said the forms were too long and others said they were a bit repetitive. As shown by the comparison of the types of errors found in self and peer assessment, it may be important to explore dynamic and differentiated self and peer assessment forms. Some types of errors might be more appropriate for self or peer assessment, and the length could be optimized by not asking the same questions in both.
Another way to optimize the length would be to have an incremental list of questions that evolves with students' performance and the type of lab experiment. As students stop making certain errors, the requirements for self and peer assessment could change. This implementation would not necessarily require the instructor to keep track of students' errors; students could keep their own error logs and justify which questions are appropriate for their level.
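One possible shape for such an incremental form is sketched below. Everything here is hypothetical: the error categories echo those used in this study, but the question texts and the filtering rule are illustrative assumptions.

# Hypothetical sketch of an incremental self/peer assessment form: a question
# stays on a student's form only while that student's error log still shows
# errors of the matching type.
QUESTIONS = {
    "general": "Does the report follow the required overall structure?",
    "graph": "Are all graphs labelled, with units on both axes?",
    "sample_calculation": "Is at least one sample calculation shown in full?",
    "least_squares_fit": "Is the least squares fit computed and reported correctly?",
}

def next_form(error_log):
    """Return the questions still relevant given a student's error counts."""
    return [text for category, text in QUESTIONS.items()
            if error_log.get(category, 0) > 0]

# A student who no longer makes general or graph errors gets a shorter form.
print(next_form({"general": 0, "graph": 0,
                 "sample_calculation": 2, "least_squares_fit": 1}))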
Conclusion
The impact of the activities on students' learning is difficult to measure; the exploratory data obtained are conclusive neither on the improvement of error monitoring nor on grade improvement, but they suggest a possible effect. This lack of significant impact is not unusual considering the time frame in which the data were collected; any new instructional strategy requires training and time for students to assimilate it (Dochy & McDowell, 1997). The fact that students' grade predictions do not correlate with the instructor's grades might imply a mismatch in the understanding of the criteria. It is impossible to identify where the misunderstanding lies; it could be due to implicit criteria used by the instructor that were not conveyed to students, or to students' misunderstanding of the actual criteria. This result confirms the importance of either training students to use the criteria or even designing the criteria with them (Topping, 2003). Even if instructions appear to be self-explanatory, with detailed criteria and grading points, instructors cannot take for granted that their standards are transparent to students. Frederiksen and Collins (1989) stress the importance of giving students a series of exemplars of work, with grading comments tied to the assessment criteria, to help them understand the criteria for themselves. Documenting the assessment process and making it more public can reveal a misunderstanding shared by the group, which will probably lead the instructor to improve the explanation of the list of criteria for the following year.
Results regarding the monitoring of errors do not confirm a statistically significant decrease in the number of errors, but they do raise questions about the types of errors that students can easily detect in themselves and in others. These preliminary results about error recognition in others suggest a positive impact of peer feedback, similar to findings in the area of writing (Bereiter & Scardamalia, 1987). Participation in the peer assessment was slightly higher than in the self assessment. Students said they found it difficult to evaluate someone else's work, yet they liked seeing and comparing themselves with others. In a context where the level of competition is low, this comparison amongst learners is positive, as reported by one student: "When looking at the other group's lab I tried to see what they were doing well to try to do the same but I also learned not to reproduce what they were doing wrong". This statement confirms some of Astolfi's (1999) ideas about the usefulness of errors. Combined with students' shared preference for both assessments, the results about the usefulness of seeing and monitoring errors in others suggest that the self and the peer assessment both add to students' learning. The concept of a community of practice in a classroom setting is difficult to implement since students' behaviours in classroom settings are heavily influenced by assessment (Boud et al., 1999; McDowell, 1995). Students are used to taking tests and writing papers to fulfill the requirements for a grade. Transferring part of the responsibility for assessment might be a good way to involve students in their learning process, but results show that substantial changes need to be made to the task to meet students' needs. Students did appreciate the task and showed endorsement, but to improve collaboration and discussion, they suggested that the task be introduced earlier in the year. They also proposed a time frame during which the activities would start as a requirement and become optional as students' performance improves.
Future Studies
The present study suggests several possible paths of exploration. One of the characteristics of design experiments is their iterative nature (Cobb et al., 2003), which requires a cycle of revisions and improvements. Findings, negative or positive, need to be tested in different contexts to gain more credibility and robustness. In future implementations, a longer period of time would be required to test the effectiveness of the activities and to see more solid and revealing
patterns in students' errors. The integration of technology could be improved to make it more transparent to both students and instructor. Such improvement would allow more direct interactions amongst users and increase the instructor's involvement in the online component of the activities. Finally, this model of self and peer assessment activities could be adapted and implemented in different contexts to provide guidelines for future use in higher education.
REFERENCES
American Association for the Advancement of Science. (1997). Project 2061: Science literacy for a changing future. Washington, DC.
Anderson, J., Greeno, J. G., Reder, L., & Simon, H. A. (2000). Perspectives on learning, thinking and activity. Educational Researcher, 29(4).
Anderson, J., Reder, L., & Simon, H. (1998). Radical constructivism and cognitive psychology. 227-278.
Astolfi, J.-P. (1999). L'erreur, un outil pour enseigner. Paris: ESF éditeur.
Ballantyne, R., Hughes, K., & Mylonas, A. (2002). Implementing peer assessment in large classes: Procedures to facilitate student learning. Assessment & Evaluation in Higher Education, 27(5), 427-439.
Beckwith, J. B. (1991). Approaches to learning, their context and relationship to assessment performance. Higher Education, 22, 17-30.
Bender, T. (2003). Discussion-based online teaching to enhance student learning: Theory, practice and assessment. Sterling, VA: Stylus Publishing.
Bereiter, C., & Scardamalia, M. (1987). Knowledge telling and knowledge transforming in written composition. In S. Rosenberg (Ed.), Advances in applied psycholinguistics (Vol. 2, pp. 142-175). New York: Cambridge University Press.
Birenbaum, M., & Dochy, F. (1996). Alternatives in assessment of achievement, learning processes and prior knowledge.
Boud, D., Cohen, R., & Sampson, J. (1999). Peer learning and assessment. Assessment & Evaluation in Higher Education, 24(4), 413-426.
Brown, A. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. The Journal of the Learning Sciences, 2, 141-178.
Brown, A., & Campione, J. C. (1990). Communities of learning and thinking, or a context by any other name. Human Development, 21, 108-125.
Brown, A., & Campione, J. C. (1996). Psychological theory and the design of innovative learning environments: On procedures, principles, and systems. In L. Schauble & R. Glaser (Eds.), Innovations in learning: New environments for education (pp. 289-325). Mahwah, NJ: Lawrence Erlbaum Associates.
Brown, S., & Knight, P. (1994). Assessing learners in higher education.
Bruffee, K. A. (1995). Sharing our toys: Cooperative learning versus collaborative learning. Change, 27(1), 12-18.
Butler, D. L., & Winne, P. H. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65, 245-281.
Cobb, P., Confrey, J., diSessa, A., Lehrer, R., & Schauble, L. (2003). Design experiments in educational research. Educational Researcher, 32(1), 9-13.
Council of Ministers of Education. (1997). Common framework of science learning outcomes. Toronto: CMEC Secretariat.
Divaharan, S., & Atputhasamy, L. (2002). An attempt to enhance the quality of cooperative learning through peer assessment. Journal of Educational Enquiry, 3(2), 72-82.
Dochy, F., & McDowell, L. (1997). Assessment as a tool for learning. Studies in Educational Evaluation, 23(4), 279-298.
Dochy, F., & Moerkerke, G. (1997). The present, the past and the future of achievement testing and performance assessment. International Journal of Educational Research, 27, 415-432.
Dochy, F., Segers, M., & Sluijsmans, D. (1999a). The use of self, peer and co-assessment in higher education: A review. Studies in Higher Education, 24(3), 332.
Dochy, F., Segers, M., & Sluijsmans, D. (1999b). The use of self-, peer and co-assessment in higher education: A review. Studies in Higher Education, 24(3), 331-350.
Dochy, F. J. R. C., & McDowell, L. (1997). Introduction: Assessment as a tool for learning. Studies in Educational Evaluation, 23(4), 279-298.
Edward, N. S. (2003). Mark my words: Self and peer assessment as an aid to learning. European Journal of Engineering Education, 28(1), 103-116.
Eschenbach, E. A. (2001). Improving technical writing via web-based peer review of final reports. Paper presented at the 31st ASEE/IEEE Frontiers in Education Conference, Reno, NV.
Falchikov, N. (1995). Peer feedback marking: Developing peer assessment. Innovations in Education and Training International, 32(2), 175-187.
Falchikov, N., & Boud, D. (1989). Student self-assessment in higher education: A meta-analysis. Review of Educational Research, 59(4), 395-430.
Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287-322.
Flecknoe, M. (2002). How can ICT help us to improve education? Innovations in Education & Teaching International, 39(4), 271-279.
Frederiksen, J. R., & Collins, A. (1989). A systems approach to educational testing. Educational Researcher, 18(9), 27-32.
Gale, K., Martin, K., & McQueen, G. (2002). Triadic assessment. Assessment & Evaluation in Higher Education, 27(6), 557-567.
Gentle, C. R. (1994). Thesys: An expert system for assessing undergraduate projects. Paper presented at Deciding Our Future: Technological Imperatives for Education, the Eleventh International Conference on Technology and Education, University of London, London, England.
Gielen, S., Dochy, F., & Dierick, S. (2003). Evaluating the consequential validity of new modes of assessment: The influence of assessment on learning, including pre-, post-, and true assessment effects. In M. Segers, F. Dochy & E. Cascallar (Eds.), Optimising new modes of assessment: In search of qualities and standards (Vol. 1, pp. 37-54). Dordrecht: Kluwer Academic Publisher.
Gipps, C. V. (1999). Socio-cultural aspects of assessment. Review of Research in Education, 24, 355-392.
Graham, S., & Harris, K. R. (1994). Implications of constructivism for teaching writing to students with special needs. Journal of Special Education, 28(3), 275-289.
Greeno, J. (1998). The situativity of knowing, learning, and research. American Psychologist, 53(1).
Hacker, D. J. (1998). Definitions and empirical foundations. In D. Hacker, J. Dunlosky & A. Graesser (Eds.), Metacognition in educational theory and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Halpern, D. F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American Psychologist, 53(4), 449-455.
Hassard, J. (1992). Minds on science. New York: Harper Collins Publishers.
Hayes, J. R., & Flower, L. S. (1986). Writing research and the writer. American Psychologist, 41(10), 1106-1113.
Johnson, D. W., & Johnson, R. (1994). Learning together and alone: Cooperative, competitive, and individualistic learning (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Johnson, D. W., Johnson, R. T., & Smith, K. A. (1998). Cooperative learning returns to college: What evidence is there that it works? Change, 30(4), 26-35.
Karabenick, S. A. (1996). Social influences on metacognition: Effects of colearner questioning on comprehension monitoring. Journal of Educational Psychology, 88(4), 689-703.
Lajoie, S. P., Lavigne, N. C., Munsie, S. D., & Wilkie, T. V. (1998). Monitoring student progress in statistics. In S. P. Lajoie (Ed.), Reflections on statistics: Learning, teaching, and assessment in grades K-12 (pp. 199-231). Mahwah, NJ: Lawrence Erlbaum Associates.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
Lewis, R. (2001). Learning together in virtual communities. Paper presented at the FREREF ICT.
Longhurst, N., & Norton, L. S. (1997). Self-assessment in coursework essays. Studies in Educational Evaluation, 23(4), 319-330.
McDowell, L. (1995). The impact of innovative assessment on student learning. Innovations in Education and Training International, 32(4), 302-313.
National Academy of Sciences - National Research Council Center for Science, Mathematics and Engineering Education. (1997). Introducing the National Science Education Standards. Washington, DC.
National Research Council. (2001). Classroom assessment and the National Science Education Standards. Washington, DC: National Academy Press.
National Science Foundation. (1996). Shaping the future: Strategies for revitalizing undergraduate education. Proceedings from the National Working Conference (No. NSF-98-73). Washington, DC.
Orsmond, P. (1996). The importance of marking criteria in the use of peer assessment. Assessment & Evaluation in Higher Education, 27(3), 239-250.
Palincsar, A. S., & Herrenkohl, L. R. (2002). Designing collaborative learning contexts. Theory Into Practice, 41(1), 26-32.
Panitz, T. (1997). Collaborative versus cooperative learning: Comparing the two definitions helps understand the nature of interactive learning. Cooperative Learning and College Teaching, 8(2), 6-10.
Pellegrino, J., & Chudowsky, N. (2003). Large-scale assessments that support learning: What will it take? Theory Into Practice, 42(1), 75-83.
Pellegrino, J. W., Baxter, G. P., & Glaser, R. (2000). Addressing the "two disciplines" problem: Linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education, 24.
Prosser, M., & Trigwell, K. (1999). Relational perspectives on higher education teaching and learning in the sciences. Studies in Science Education, 33, 31-60.
Resnick, L., Salmon, M., & Zeitz, C. M. (1993). Reasoning in conversation. Cognition and Instruction, 17, 347-364.
Ross, S. (1998). Self-assessment in second language testing: A meta-analysis and analysis of experiential factors. Language Testing, 15, 1-20.
Rushton, C., Ramsey, P., & Rada, R. (1993). Peer assessment in a collaborative hypermedia environment: A case study. Journal of Computer-Based Instruction, 20(3), 75.
Sadler, D. R. (1998). Formative assessment: Revisiting the territory. Assessment in Education, 5(1), 77-84.
Schraw, G., & Moshman, D. (1995). Metacognitive theories. Educational Psychology Review, 7, 351-373.
Schunk, D. H. (2001). Social-cognitive theory and self-regulated learning. In B. J. Zimmerman & D. H. Schunk (Eds.), Self-regulated learning and academic achievement: Theoretical perspectives (pp. 125-152). Hillsdale, NJ: Erlbaum.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7).
Slavin, R. E. (1980). Cooperative learning. Review of Educational Research, 50(2), 315-342.
Slavin, R. E. (1995). Cooperative learning: Theory, research, and practice (2nd ed.). Boston: Allyn and Bacon.
Sluijsmans, D., Dochy, F., & Moerkerke, G. (1998). Creating a learning environment by using self-, peer-, and co-assessment. Learning Environments Research, 1(3), 293-319.
Smith, B., & MacGregor, J. (1992). What is collaborative learning? In A. Goodsell et al. (Eds.), Collaborative learning: A sourcebook for higher education. University Park, PA: National Center on Postsecondary Teaching and Learning Assessment.
Springer, L., Stanne, M. E., & Donovan, S. S. (1999). Effects of small-group learning on undergraduates in science, mathematics, engineering, and technology: A meta-analysis. Review of Educational Research, 69(1), 21-51.
Stefani, A. J. (1992). Comparison of collaborative, self, peer and tutor assessment in a biochemistry practical. Biochemical Education, 20, 148-151.
Topping, K. (1998). Peer assessment between students in colleges and universities. Review of Educational Research, 68(3), 249-276.
Topping, K. (2003). Self and peer assessment in school and university. In M. Segers, F. Dochy & E. Cascallar (Eds.), Optimising new modes of assessment: In search of qualities and standards (Vol. 1, pp. 55-87). Dordrecht: Kluwer Academic Publisher.
Topping, K., & Ehly, S. (1998). Introduction to peer-assisted learning. In K. Topping & S. Ehly (Eds.), Peer-assisted learning (pp. 1-23). Mahwah, NJ: Erlbaum.
Vazquez-Abad, J., Winer, L. R., & Derome, J. (1997). Why some stay: A study of factors contributing to persistence in undergraduate physics. McGill Journal of Education, 32(3), 209-229.
Winer, L. R. (2002). Computer-enhanced collaborative drafting in legal education. Journal of Legal Education, 52(1-2), 278-286.
Winer, L. R., & Cooperstock, J. (2002). The "intelligent classroom": Changing teaching and learning with an evolving technological environment. Computers & Education, 38(1-3), 253-266.
Winne, P. H. (1996). A metacognitive view of individual differences in self-regulated learning. Learning and Individual Differences, 8(4), 327-353.
Winne, P. H. (1997). Experimenting to bootstrap self-regulated learning. Journal of Educational Psychology, 89, 397-410.
Zimmerman, B. J. (1986). Development of self-regulated learning: Which are the key sub-processes? Contemporary Educational Psychology, 16, 307-313.
Zimmerman, B. J. (2000). Attainment of self-regulation: A social cognitive perspective. In M. Boekaerts, P. R. Pintrich, & M. Zeidner (Eds.), Self-regulation: Theory, research and applications (pp. 13-39). Orlando, FL: Academic Press.
Zimmerman, B. J., & Schunk, D. H. (2001). Self-regulated learning and academic achievement: Theoretical perspectives (2nd ed.). Hillsdale, NJ: Erlbaum.
Zoller, Z., & Ben-Chaim, D. (1998). Student self-assessment in HOCS science examinations: Is there a problem? Journal of Science Education and Technology, 7(2), 135-147.
Appendix A
Consent Form and Ethics Form
[Scanned McGill University Faculty of Education ethics review form for funded and non-funded research involving humans; the original image is not reproduced here.]