Journal of Behavioral Education, Vol. 11, No. 2, June 2002 (© 2002), pp. 89–104
Item-by-Item Versus End-of-Test Feedback in a Computer-Based PSI Course

Jay Buzhardt1,2 and George B. Semb1

1 Department of Human Development, 4001 Dole Human Development Center, University of Kansas, Lawrence, Kansas.
2 Correspondence should be directed to Jay Buzhardt, MA, Department of Human Development, 4001 Dole Human Development Center, University of Kansas, Lawrence, Kansas 66045-2133; e-mail: [email protected].
This study examined the effects of three different types of computer feedback on the following variables in a Personalized System of Instruction (PSI) course: unit quiz and final exam performance, the amount of time tutors and other teaching staff spent answering student questions, and students' preference for each type of feedback. The feedback conditions were (a) end-of-test, (b) item-by-item with the option to skip questions during the test, and (c) item-by-item without the option to skip questions during the test. Students who received item-by-item feedback with the skip option performed the same as students who received end-of-test feedback on the unit quizzes and final exam. However, the teaching staff spent significantly less time answering questions when students received item-by-item feedback with the skip option than when they received end-of-test feedback. Finally, 65% of the students preferred item-by-item feedback with the skip option. The authors concluded that this type of item-by-item feedback decreases the workload on teaching staff in a PSI course without sacrificing performance, and that students like it more than the other types of feedback.

KEY WORDS: feedback; personalized system of instruction; computer-based instruction; cost; individualized instruction.
INTRODUCTION

Research suggests that feedback helps students learn more effectively and that it promotes higher levels of long-term retention (Ellis & Semb, 1998; Semb & Ellis, 1994). Skinner (1953, 1958) used this principle as the basis for programmed instruction, a system that presented material in an invariant, linear sequence.
In programmed instruction, students progress through a series of instructional "frames." They respond to fill-in-the-blank questions within each frame and see immediately whether their answer was correct by uncovering the correct response. The assumption is that seeing the correct answer reinforces learning. This assumption provided the theoretical basis for "immediate feedback."

Academic Performance

Several studies have reported that immediate feedback improves academic performance (Beeson, 1973; Kulik & Kulik, 1988; Leeds, 1970). However, other studies found that delayed feedback improves student performance on retention tests more than immediate feedback (O'Neill, Rasor, & Bartz, 1976; Sassenrath & Yonge, 1969; Strang & Rust, 1973; Sturges, 1978; Webb, Stock, & McCarthy, 1994). Kulhavy and Anderson (1972) argued that immediate feedback hinders learning because it prevents students from "processing information" provided by the feedback; delayed feedback, they argued, gives students time to process that information. Conflicting results also exist in studies of the acquisition of performance skills in areas such as job training (O'Reilly, Renzaglia, & Lee, 1994; Schmidt & Bjork, 1992). The overall ambiguity of this body of research suggests that something other than the timing of feedback may affect learning.

In a course that uses Keller's (1968) Personalized System of Instruction (PSI), such as the one used in the current study (described in more detail in the Methods section), students continuously take tests and receive immediate feedback from tutors. Several studies have reported that this type of immediate feedback is a key variable in producing superior academic performance in PSI classes. For instance, Austin and Gilbert (1973) concluded that the superiority of their PSI class over a lecture class (based on a comparison of final exam scores) was due to the immediate feedback provided by tutors. Other studies (Farmer, Lachter, Blaustein, & Cole, 1972; Johnson & Sulzer-Azaroff, 1975) have shown that students who receive tutoring progress through courses at a significantly higher rate, perform better on final exams, and require fewer remedial quizzes to achieve mastery.

To add to the ambiguity of this literature, another variable confounds the issue: whether or not students control the sequence of items on a test. The historical precedent for this question originated with Pressey's (1926) "teaching machine" (p. 373), which provided students with feedback after each response, or item-by-item (IBI) feedback. This type of testing usually uses an invariant sequence of items. That is, all students proceed through the test in the same order; they do not have the opportunity to skip among items. Thus, there are two issues: (a) immediate versus delayed feedback, and (b) what controls the sequence of items, the student or the test.
Student Preference

One of the most commonly reported problems with IBI feedback is that students do not like it. Strang and Rust (1973) found that students reported "worrying" more when confronted with an important test that had feedback after each item. Hail (1984) found that students reported more "anxious" or "nervous" feelings when exposed to feedback after every item than when they received feedback at the end of the test. Gaynor (1981) found that "high ability" students reported more frustration with feedback after every item than other students. Cohen (1985) suggested that Gaynor's findings might be due to the fact that this type of feedback slows down the pace of testing, regardless of whether students respond correctly or incorrectly.

Allowing students to view questions in any sequence allows them to return to difficult questions after answering easier ones. This is a common test-taking strategy promoted by educators (Loulou, 1995) and the government (Office of Educational Research and Improvement, 1992). Preventing students from altering the sequence may elicit anxiety and/or anger, which may explain why some of them dislike feedback after every item. None of the studies cited here explicitly stated whether students had the option to alter the sequence of questions. However, given their instructional methods, some studies (Beeson, 1973; Clariana, 1990; Clariana, Ross, & Morrison, 1991; Sassenrath & Yonge, 1969) must have used invariant sequences of items.

Cost Effectiveness

Aside from eliciting anxiety, another potential problem with providing feedback after every item is its high cost. Some researchers (Koen, Wissler, Lamb, & Hoberock, 1975; Sherman, 1992) have cited the cost of a PSI course as a contributing factor in its decrease in popularity over the past twenty years. Specifically, Koen and his colleagues assert that using tutors can significantly increase the cost of a PSI course. Outside of PSI courses, Lockhart, Sturges, Zachai, Dubois, and Groves (1981) reported poor cost efficiency in the development and implementation of IBI feedback, referring to the enormous amount of time and effort required for teachers to provide this type of feedback. Pressey (1926, 1950), on the other hand, suggested that when delivered via mechanical devices, such as punchboards, feedback after every item is more efficient than delivering feedback at the end of a test.

Computers allow educators to deliver more complex feedback than punchboards without a significant increase in costs. Levin, Leitner, and Meister (1987) reported that personnel costs accounted for the majority of the expenses in computer-assisted instruction (CAI). Computer-delivered feedback that reduces the need for personnel without a loss in student performance may be a valuable element in the development of efficient CAI systems. Prior to the use of computers in classrooms,
it was difficult to deliver feedback after every item and allow students to answer items in any sequence. Perhaps this is why researchers have not analyzed the effects of allowing students to control the sequence of items when feedback is delivered after every item. Lewis, Dalgaard, and Boyer (1985) state, "The most consistent finding in the CAI literature is its ability to reduce instructional time" (p. 92). Melmed's (1995) research supported this assertion. He reported that integrating CAI into military training reduced the time it took trainees to reach a criterion performance by an average of 30%. Other studies (Fletcher, Hawley, & Piele, 1990; Levin, Glass, & Meister, 1987) further confirm the cost effectiveness of CAI over other instructional methods. However, one of the goals of the current study was to determine how IBI feedback affects the cost of CAI in terms of the amount of time instructors spend with students.

The current study analyzed the effects of three types of feedback: (a) feedback after each item in an invariant sequence (IBI/i), (b) feedback after each item in any sequence determined by the student (IBI/a), and (c) feedback provided at the end of the test (EOT). The dependent measures were student performance on two unit quizzes in a six-unit sequence, performance on a retention test (the final exam), the amount of time the teaching staff spent helping students during and after testing, and which type of feedback students preferred. We hypothesized that students who receive feedback after each item in either the invariant or student-controlled sequence would perform better and require less time from the teaching staff than students who receive feedback at the end of the test. We also hypothesized that students would prefer feedback after each item over feedback delivered at the end of the test.

METHODS

Participants

One hundred fifty students enrolled in six sections of Introduction to Child Development and Behavior at the University of Kansas participated in this study. All participants signed a letter of informed consent indicating their voluntary participation in the study. The demographics of the students were as follows: 70% female, 30% male; M age = 20 years; M GPA = 2.91; 59% majored in liberal arts, 18% in a health-related major, 15% in a natural science (e.g., biology, chemistry), and 8% in a business-related field.

Procedure and Design

Instructional Method

The course was taught using Keller's (1968) Personalized System of Instruction (PSI), a teaching method based on the principles of behavior analysis. In 1968,
Keller established the five components of a PSI course: (1) the course should be self-paced (i.e., no compulsory testing schedules); (2) the student should master each unit of material before continuing to the next unit; (3) lectures should be used as motivational tools rather than as a source of information; (4) students' written responses (e.g., quizzes) should be stressed as a method of assessing their progress; and (5) proctors should be used to grade tests and strengthen the social interactions between student and teacher. Perhaps the most frequently used component of modern PSI courses, and one used in the course described in the current study, is the use of frequent quizzes to assess student progress. In fact, recent studies (Marek, Griggs, & Christopher, 1999; Weiten, Deguara, Rehmke, & Sewell, 1999) have shown that frequent quizzes improve performance, and that students prefer taking frequent quizzes to taking only two or three tests throughout the semester.

Students in the course used in the present study took 15-item multiple-choice/true-false quizzes on each of the six units from the course text. They took all tests and quizzes on PCs arranged around the perimeter of the computer laboratory. Students could take a quiz on each unit a maximum of three times, but only their highest score counted toward their final grade. Although each quiz on a given unit covered the same material, the actual quiz items differed on each attempt. Thus, students never took the same quiz twice.

Similar to many PSI courses, students had access to two tutors and the teaching assistant (TA). Tutors were undergraduates who had earned an 'A' in the course during the previous semester and who were asked to tutor in the class. The tutors checked students' study guides and offered tutoring after students completed their quizzes. University regulations forbade tutors from interacting with students while they took tests. The TA, however, could answer any questions while students tested.

The quiz material came from the course text. Each unit had a study guide that students could use to assess their knowledge of the material. Students had to complete the study guide if they did not correctly answer at least 12 out of 15 questions on their first quiz attempt. This contingency discouraged students from taking their first attempt without studying. Students could take the unit quizzes as soon as they wanted, but they could not fall behind a schedule provided to them at the beginning of the semester.

Feedback Conditions

In the end-of-test feedback condition (EOT), students could go through the questions in any sequence. Once they finished the test, the computer asked if they were sure they wanted their test graded. When they entered 'OK,' the computer graded the test. At this point, students could review all the items on the test, including each question, their responses, and the correct answers. This was an option; the computer did not require them to review their answers.
Buzhardt and Semb Table I. Experimental Conditionsa by Unit and Class
Class 1 Class 2 Class 3 Class 4 Class 5 Class 6
Unit 1
Unit 2
Unit 3
Unit 4
Unit 5
Unit 6
IBI/a IBI/i IBI/a IBI/i EOT EOT
IBI/a IBI/i IBI/a IBI/i EOT EOT
IBI/i IBI/a EOT EOT IBI/i IBI/a
IBI/i IBI/a EOT EOT IBI/i IBI/a
EOT EOT IBI/i IBI/a IBI/a IBI/i
EOT EOT IBI/i IBI/a IBI/a IBI/i
= IBI with a “skip” option, IBI/i = IBI with no “skip” option, EOT = end-of-test feedback with a “skip” option.
a IBI/a
There were two item-by-item feedback conditions. In the item-by-item/any sequence condition (IBI/a), students received feedback (the question, their answer, and the correct answer) immediately after responding to each item. However, they could skip items before answering them and return to them before the end of the test. In the item-by-item/invariant sequence condition (IBI/i), students could not skip items, which meant they had to answer questions in the order the computer presented them. As in the EOT condition, students in both IBI conditions could also review the test immediately after completing it.

Throughout the semester, students took quizzes over six units, allowing exposure to each condition on two consecutive units. This resulted in six different orders of the three feedback systems divided among the six classes; thus, each class received the three types of feedback in a different order. The researchers counterbalanced the three feedback systems (experimental conditions) across the six classes as shown in Table I.

Dependent Measures

Performance on the First Two Unit Quizzes

Performance on the first two unit quizzes was evaluated to compare the effects of the three feedback conditions on immediate retention. The first two units were assessed because, unlike later units, they were unaffected by carryover effects (Keppel, 1991): they represented the first condition for all students, eliminating the possibility that previous conditions would affect their performance. Students' first, last, and highest scores on unit quizzes one and two were analyzed.

Retention

At the end of the semester, students took a final exam that consisted of 30 multiple-choice items. All students received EOT feedback on the final exam. Furthermore, they could take it again using a different form of the test with different questions. The experimenters used students' highest score to assess retention. The final exam consisted of five questions from each unit (ten from each condition, because each condition covered two units), allowing for the assessment of
long-term retention for each condition. Thus, retention for each condition was determined by computing each student's percent correct for each of the three conditions represented on the final exam. The three retention scores for each student, one for each condition, allowed for a repeated measures ANOVA. A disadvantage of the repeated measures design, however, is the potential for order effects (Keppel, 1991). Cohen (1996) suggests that the only way to "average out" (p. 605) order effects is to counterbalance the conditions across several groups. Thus, the three conditions were counterbalanced across the six units and six classes as shown in Table I so that all three potential orders of the three conditions were used.
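To make this scoring concrete, the sketch below shows one way per-condition retention scores could be computed from a student's final exam responses. It is a minimal illustration, not the authors' software; the item-to-condition mapping and the response record are hypothetical.

```python
# Hypothetical sketch of per-condition retention scoring (not the authors' code).
# Each of the 30 final-exam items is tagged with the feedback condition under
# which its unit was studied (10 items per condition: 2 units x 5 questions).
item_condition = ["IBI/a"] * 10 + ["IBI/i"] * 10 + ["EOT"] * 10

def retention_scores(correct_flags):
    """Return percent correct per feedback condition for one student.

    correct_flags: list of 30 booleans, one per final-exam item, in the same
    order as item_condition.
    """
    totals = {"IBI/a": 0, "IBI/i": 0, "EOT": 0}
    correct = {"IBI/a": 0, "IBI/i": 0, "EOT": 0}
    for condition, is_correct in zip(item_condition, correct_flags):
        totals[condition] += 1
        correct[condition] += int(is_correct)
    return {c: 100.0 * correct[c] / totals[c] for c in totals}

# Example: a student who misses items 3, 17, and 25 (0-indexed).
flags = [i not in (3, 17, 25) for i in range(30)]
print(retention_scores(flags))  # {'IBI/a': 90.0, 'IBI/i': 90.0, 'EOT': 90.0}
```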
Students' Reported Preferences

Students completed a computer-administered survey (see Appendix) after they completed the final unit quiz. This survey assessed what type of feedback students preferred. Responses to the survey did not affect students' grades. However, the computer did not allow students to progress to the final exam without completing it.
Time Instructors Spent With Students

This measure refers to the amount of time that tutors, TAs, or both spent with students who had a question about a quiz item either during or after a unit quiz. It did not include interactions about students' grades or questions about how the computer worked. Both the TAs and the tutors maintained logs that tracked, to the nearest minute, how much time they spent with individual students during testing and while students reviewed their completed test. The computer displayed a real-time clock in the bottom corner of the screen. Thus, when a student had a question, the tutor or TA recorded the time he/she arrived at the workstation and the time he/she left. The difference between these two recorded times defined how long the interaction lasted.

Two reliability observers maintained logs identical to the ones described above. When a student raised his/her hand, a reliability observer approached the student with the TA or tutor and recorded the times the interaction started and stopped. The reliability observers made these recordings on 22.2% of the total TA/tutor interactions, at random times during the semester, in all conditions. The reliability logs were compared to the TA/tutor logs. An agreement was counted when the two records differed by no more than 1 min. The experimenters calculated reliability by dividing each reliability observer's agreements by his/her total number of observations (agreements + disagreements) and multiplying by 100.
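As an illustration of this agreement calculation, the following sketch (assumed, not the study's actual software) counts an agreement whenever the two duration records differ by no more than 1 minute and converts the result to a percentage.

```python
# Hypothetical sketch of the interobserver-agreement calculation described above.
def percent_agreement(staff_minutes, observer_minutes, tolerance=1):
    """Percent agreement between TA/tutor logs and reliability-observer logs.

    Both arguments are lists of interaction durations (in minutes) for the same
    jointly observed interactions, in the same order.
    """
    pairs = list(zip(staff_minutes, observer_minutes))
    agreements = sum(1 for staff, obs in pairs if abs(staff - obs) <= tolerance)
    return 100.0 * agreements / len(pairs)

# Illustrative data: five jointly observed interactions.
print(percent_agreement([2, 5, 1, 3, 4], [2, 6, 1, 5, 4]))  # 80.0
```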
RESULTS

Immediate Retention (Unit Quiz Performance)

The experimenters conducted a one-way multivariate analysis of variance (MANOVA) to determine the effect of the three types of feedback (IBI/a, IBI/i, and EOT) on three aspects of students' quiz performance: first attempt, highest score, and last attempt. The MANOVA revealed significant differences between the three feedback conditions, F(3, 152) = 3.31, p < .05. Based on Cohen's (1988) classification of effect sizes, the multivariate η² indicated a medium overall effect size of .06. The experimenters then conducted one-way analyses of variance (ANOVAs) on each dependent variable as follow-up tests to the MANOVA. Using the Bonferroni correction to control for Type I error, alpha was set at .02 (.05/3) for each ANOVA. The only comparison that reached significance was for highest score, F(2, 152) = 4.47, p < .02, which had a medium effect size (η² = .06). Post hoc analyses to the one-way ANOVAs for highest scores consisted of pairwise comparisons to determine whether one or more of the feedback conditions had a greater effect on this variable than the others. Again using a Bonferroni correction, alpha was set at .006 (.025/3). None of the comparisons yielded significant differences at this alpha level.
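For readers who want to see the shape of such a follow-up test, the sketch below runs a one-way ANOVA on highest quiz scores for three groups against a Bonferroni-corrected alpha. The score lists are invented for illustration, and SciPy's f_oneway stands in for whatever statistical package the authors actually used.

```python
# Hypothetical sketch of a follow-up one-way ANOVA with a Bonferroni-corrected
# alpha (.05 divided across three follow-up tests). The data below are invented.
from scipy import stats

ibi_a = [93, 87, 100, 80, 93, 87, 73, 93]   # highest quiz scores, IBI/a group
ibi_i = [87, 80, 93, 73, 87, 80, 67, 80]    # highest quiz scores, IBI/i group
eot   = [80, 87, 73, 93, 80, 87, 73, 80]    # highest quiz scores, EOT group

alpha = 0.05 / 3  # Bonferroni correction across the three follow-up ANOVAs
f_statistic, p_value = stats.f_oneway(ibi_a, ibi_i, eot)
print(f"F = {f_statistic:.2f}, p = {p_value:.3f}, "
      f"significant at corrected alpha: {p_value < alpha}")
```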
Long-Term Retention (Final Exam Performance)

Mean performance scores on the retention test for IBI/a, IBI/i, and EOT were 73.47%, 73.53%, and 71.33%, respectively. A repeated measures ANOVA revealed no significant within-subject differences between the three conditions, F(2, 298) = 1.37, p = .26.

Students' Reported Preferences

Table II shows the percentage of students who preferred each type of feedback. The experimenters conducted a chi-square test (χ²) on the total percentages of students who preferred each type of feedback. This test revealed that the differences between these percentages were significant, χ²(2, N = 150) = 69.78, p < .001. The majority of students (60.8%) preferred IBI/a feedback. The table also shows the percentage of students who indicated how much they preferred their choice. Table III is similar to Table II except that it only shows the percentages for students who earned an 'A' in the class. A chi-square test revealed that these differences were also significant, χ²(2, N = 42) = 30.99, p < .001. None of these students reported "No Preference."
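The preference analysis can be illustrated with a chi-square goodness-of-fit test of observed preference counts against an equal split. The counts below are approximate, rounded from the Table II percentages (N = 150), and the equal-expected-frequencies null is an assumption about how the original test was specified, so the resulting statistic will not match the reported value exactly.

```python
# Hypothetical sketch of the preference analysis: a chi-square goodness-of-fit
# test of whether preference counts depart from an equal split across the
# three feedback types. Counts are rounded from the Table II percentages.
from scipy import stats

observed = [91, 34, 17]  # approximate counts preferring IBI/a, EOT, IBI/i
chi2, p = stats.chisquare(observed)  # expected frequencies default to the mean
print(f"chi-square(df={len(observed) - 1}) = {chi2:.2f}, p = {p:.5f}")
```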
Table II. Percentage of Students That Indicated a Preference for One Type of Feedback

                         Degree of preference
Condition    Slight     Moderate    Great      Absolute    Total
IBI/a         8.32       14.72      24.96      12.8        60.8
EOT           3.22        7.13       8.28       3.91       22.54
IBI/i         0.6         2.52       7.56       0.6        11.28
Total        12.14       24.37      40.8       17.31       94.62

Note. The percentages do not sum to 100% because about 5% of the students indicated "No Preference."
Table III. Percentage of 'A' Students That Indicated a Preference for One Type of Feedback

                         Degree of preference
Condition    Slight     Moderate    Great      Absolute    Total
IBI/a        17.02       25.9       21.46       9.62       74
EOT           2.4         7.2        2.4        0          12
IBI/i         2.38        7          4.62       0          14
Total        21.8        40.1       28.48       9.62      100

Time Required by Instructors

Two independent observers also recorded TA/tutor time interactions. One observer agreed with the TA/tutor 89% of the time and the other agreed 100% of the time. Of the students who requested verbal feedback during any one condition (N = 43), the tutors and TAs spent an average of 0.81, 1.26, and 2.29 min with each student in the IBI/a, IBI/i, and EOT conditions, respectively. A repeated measures ANOVA revealed that these means were significantly different, F(2, 84) = 3.88, p < .05, with an overall medium effect size (η² = .09). Follow-up paired-sample t-tests using an alpha of .02, based on Holm's sequential Bonferroni correction, were conducted to determine whether one feedback condition affected instructor time more than the others. These comparisons revealed that only IBI/a and EOT differed significantly, F(1, 42) = 5.89, p < .02, with a medium-high effect size (η² = .12).

DISCUSSION

Pressey (1926) asserted that presenting "each question before [the student] until he finds the answer" (p. 374) was an advantage of his original testing machine. The current study investigated the effectiveness of this type of testing. The researchers hypothesized that (a) students who receive either IBI/a or IBI/i feedback
would perform better on immediate retention tests than students who receive EOT feedback, (b) students who receive either IBI/a or IBI/i feedback would perform better on long-term retention tests than students who receive EOT feedback, (c) students who receive either IBI/i or IBI/a feedback would require less time from instructors than students who receive EOT feedback, and (d) students would prefer IBI/a over EOT and IBI/i. The data supported the third and fourth hypotheses, but not the first and second. The following sections address each of these hypotheses.
Immediate Retention Tests (Hypothesis 1)

Because students had a week in which to take the quizzes for each unit, the researchers viewed the unit quiz data as a measure of immediate retention. An ANOVA suggested that students' highest scores on the unit quizzes differed significantly between feedback conditions. However, post hoc analyses did not reveal significant differences between any two conditions. Thus, the data suggest that neither the timing of feedback nor the ability to skip items during IBI feedback affected students' performance on the unit quizzes.

As indicated by the preference data discussed below, students prefer to control the sequence of items on a test (e.g., to skip the most difficult questions). In fact, answering the easiest questions first is a common test-taking strategy recommended by the government (Office of Educational Research and Improvement, 1992). However, one limitation of this study was the lack of an EOT condition in which students could not control the sequence of items. Adding this condition in future research would help determine whether an invariant sequence of items alone affects performance, or whether any effect depends on the combination of IBI feedback and an invariant sequence. These results bring into question the conclusions of other studies that investigated the effectiveness of IBI feedback. It is difficult to determine which studies, if any, allowed students to control the sequence of items, because they did not specifically state it in their methods. Thus, we do not know whether previous studies showed poor performance during IBI feedback because of invariant sequences or because of some other aspect of IBI feedback.
Long-Term Retention (Hypothesis 2)

Consistent with the conclusions drawn from the unit quiz data, the type of feedback received on unit quizzes had no effect on final exam performance. This is in accord with other similar studies (Beck & Lindsey, 1979). However, because students' highest score on the final exam (they could take it twice) was used in the analysis, there may have been a test-retest confound in the current study. In other words, one group may have actually performed worse than the others on their initial take of
the final exam but improved their score on the retake, thus masking their inferior initial performance. However, based on the experimenters' 30+ years of experience with the course, almost all students take the final exam twice (with different questions on each take), because only the highest score counts toward their final grade. Thus, any potential test-retest effects were likely spread across all three conditions.

Time Required by Teaching Staff (Hypothesis 3)

The critical finding of this study is that the teaching staff (TAs and tutors) spent significantly less time with students during the IBI/a condition without a loss in performance. This suggests that IBI feedback in which students control the item sequence is more cost efficient than EOT feedback because it requires less staff time without sacrificing performance. The experimenters believe that rather than simply reducing the number of instructor-to-student interactions, IBI feedback made these interactions more efficient relative to EOT feedback. For example, when an instructor helps a student only at the end of a test, part of the instructor's time is spent waiting for the student to find the item(s) that he/she missed. When students receive IBI feedback, that time is reduced because the student is already looking at the question. In other words, with IBI feedback, the instructor does not have to wait while students navigate through the test looking for the questions they missed.

It is unclear why there was not also a significant difference between the amount of instructor time required in the IBI/i and EOT conditions. Perhaps students wanted to see all of the questions before asking for help. For instance, in the IBI/a condition, students could answer the easy questions and skip the difficult questions. Then, when they asked for help, they knew exactly where they had difficulty. In the IBI/i condition, by contrast, students could not look at the entire test before answering all of the questions, forcing them either to ask questions as they came up or to wait until the end of the test. Finally, students in the IBI/a condition may have occasionally answered their own questions by the time they returned to the difficult items they initially skipped.

Paying staff and instructors is probably the single largest cost of any course, and in a PSI course requiring several tutors, the financial burden can sometimes lead administrators to discourage the use of this teaching method. Koen, Wissler, Lamb, and Hoberock (1975) assert, "the major supporting cost of a PSI course lies in salaries for the proctors" (p. 333). Instructors in the present study spent 2.8 times as much time with students in the EOT condition as in the IBI/a condition, without an improvement in learning. Decreasing the amount of time tutors have to spend with students would allow a higher student-tutor ratio. Not only would this directly affect the monetary cost of the course, but it would also decrease the response cost for professors who use PSI because they would have fewer tutors to manage each semester.
Students' Reported Preferences (Hypothesis 4)

The results showed that students preferred IBI/a feedback over IBI/i and EOT. This is interesting because previous studies have shown that students often report higher levels of "nervousness" and/or "anxiety" when receiving IBI feedback than with EOT feedback (Hail, 1984). The current study suggests that preventing students from skipping questions during IBI feedback could have contributed to the "anxious" and "nervous" feelings reported by students in previous studies, because 61% of students preferred IBI/a but only 12% preferred IBI/i. Some researchers have suggested that "high ability" students have even more difficulty with IBI feedback than other students (Cohen, 1985; Gaynor, 1981) because it slows down the pace of the test or instruction. However, the preference data for students who earned an 'A' in the class revealed that a significant majority of these students reported a preference for IBI/a feedback over EOT feedback.

As most teachers know, students like to know how they are progressing in a course. This study suggests that the same may be true on individual tests. Although it did not appear to affect their long-term retention, students liked knowing immediately whether or not they answered a question correctly. However, students disliked not having the ability to control the sequence of test items.

Limitations

One limitation of the current study is that the researchers did not know how often students actually skipped items when the option was available. In other words, even though they preferred having the ability to skip questions, did students actually take advantage of it? And if they did skip questions, did it actually improve their performance, reduce anxiety, or neither? Current software, which can run in the background of most operating systems, can track every keystroke and mouse movement a user makes. This could help experimenters track how often students skip questions and which questions they skip.

Another potential limitation is that students were not required to review their answers in the EOT condition. In the IBI conditions, the computer automatically gave students feedback after they responded to each question. Thus, it is difficult to determine whether students would have performed better in the EOT condition if they had been required to review the correct answers at the end of the test. However, based on anecdotal observations, the researchers speculate that most students in the EOT condition did review their answers at the conclusion of the test. Again, software that tracks user behaviors could help answer this question.

Finally, the groups for the unit quiz performance analyses were not matched on demographic variables such as sex, age, year in school, or GPA. Thus, the researchers cannot be sure that the differences seen between the groups for
these analyses were not affected by potential demographic differences between the groups. However, this could not have affected the within-subject, repeated measures analyses (instructor time, retention, and preference).

Implications

Today, online instruction has become the most popular form of CAI. Green (2001) reported that 83.3% of all four-year public universities have courses taught completely online. Unfortunately, very little research has been conducted on the effectiveness of online instruction (Duchastel, 1997; Gavriel & Perkins, 1996; Smith, 1999; Windschitl, 1998). However, several corporations have reported that online training has several advantages over traditional training: cost savings (Hewlett-Packard reported saving $5.5 million when it switched from on-site to online training; Picard, 1996), improved learning (Merrill Lynch reported that employees who trained via networked computers had significantly higher test scores than classroom trainees; Kruse, 1999), and less travel (Aetna reported saving $5 million by switching to online training, primarily due to travel savings; Kroll, 1999). These reports cite significant cost savings, but most of the savings were due to reduced travel and/or reduced overhead because online training requires less classroom space. The current study showed how costs could be further decreased while maintaining a high level of learning. When students receive IBI/a feedback on quizzes and tests, they require less one-to-one help from live instructors. Thus, on-site testing situations (required by most online training programs) should provide IBI/a feedback to allow an increase in the student-to-instructor ratio.

Costs have also become a major concern within higher education. The National Center for Education Statistics (NCES) (1997) reports that between 1977 and 1997, without adjusting for inflation, tuition and fees at four-year public universities increased 375% compared to a 250% increase in average family income. Several studies (Davis, 1997; US General Accounting Office, 1998; McPherson & Schapiro, 1998) have suggested that the primary cause of increased tuition is the 12% decrease in federal funding to public four-year institutions from 1980 to 1995. Regardless of the cause, an overlooked potential solution is to deliver more cost-effective instruction. PSI has often been cited as one of the more costly methods of instruction within higher education (Koen, Wissler, Lamb, & Hoberock, 1975; Sherman, 1992), which is one reason its use has not been maintained at the levels it reached in the 1970s and 80s. One way to reinforce professors' use of PSI is to decrease its response cost. Whether this means teachers spend less time providing feedback to students while testing (perhaps allowing them to interact with students in other ways), or simply that professors have fewer assistants to hire and manage, providing IBI/a feedback should negatively reinforce the use of PSI by reducing its costs.
Future Research

The above limitations reveal some variables that require further investigation. In particular, researchers need to determine more precisely the function of invariant sequences during testing. The present study only manipulated what controlled the sequence of questions (student or computer) in the IBI conditions. Future research could analyze whether the same effects would occur if these sequences were manipulated within the EOT condition. Also, because the present study used EOT feedback for the retention test, it is not clear whether IBI/i feedback during the retention test would affect performance on that test.

Future studies should also investigate further the cost effectiveness of IBI feedback. One weakness of educational research is the lack of cost-effectiveness analyses. The current study found that IBI/a feedback required less student-to-instructor interaction. There are several other variables to consider in this type of analysis, such as time spent outside of class, development of course materials, and administrative costs. Each of these variables should be assessed to determine how they affect learning and the costs of delivering instruction. Otherwise the cost of a quality education will continue to increase.

IBI feedback can be difficult to deliver without computers, and IBI/a feedback presents even more problems. Computers make this method of delivering feedback much easier, while also opening the possibility of delivering varying degrees of feedback. For instance, rather than simply providing the correct answer, feedback could also include an explanation of the correct answer, references to the text from which the question came, or even links to specific areas that relate to the question. This would be an interesting area to examine, particularly in the context of online instruction.

REFERENCES

Austin, S.M., & Gilbert, K.E. (1973). Student performance in Keller Plan course in introductory electricity and magnetism. American Journal of Physics, 41, 12–18.
Beeson, R.O. (1973). Immediate knowledge of results and test performance. The Journal of Educational Research, 66, 224–226.
Beck, F., & Lindsey, J. (1979). Effects of immediate information feedback and delayed information feedback on delayed retention. Journal of Educational Research, 12, 283–284.
Clariana, R.B. (1990). A comparison of answer until correct feedback and knowledge of correct response feedback under two conditions of contextualization. Journal of Computer-Based Instruction, 17, 125–129.
Clariana, R.B., Ross, S.M., & Morrison, G.R. (1991). The effects of different feedback strategies using computer administered multiple-choice questions as instruction. Educational Technology, Research, & Development, 39, 5–17.
Cohen, B. (1996). Explaining Psychological Statistics. Pacific Grove, CA: Brooks/Cole Publishing Company.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: L. Erlbaum Associates.
Cohen, V.B. (1985). A reexamination of feedback in computer-based instruction: Implications for instructional design. Educational Technology, 25, 33–37.
Davis, J. (1997). College Affordability: A Closer Look at the Crisis. Washington, DC: Sallie Mae Education Institute.
Duchastel, P. (1997). A web-based model for university instruction. Journal of Educational Technology Systems, 25, 221–228.
Ellis, J., & Semb, G. (1998). Very-long term memory for information learned in an introductory college course. Contemporary Educational Psychology, 17, 243–261.
Farmer, J., Lachter, G.D., Blaustein, J.J., & Cole, D.K. (1972). The role of proctoring in personalized instruction. Journal of Applied Behavior Analysis, 5, 401–404.
Fletcher, J., Hawley, D., & Piele, P. (1990). Costs, effects and utility of microcomputer assisted instruction in the classroom. American Educational Research Journal, 27, 783–806.
Gavriel, S., & Perkins, D. (1996). Learning in wonderland: What do computers really offer education? In Stephen T. Kerr (Ed.), Technology and the Future of Schooling (pp. 111–130). Chicago, IL: The University of Chicago Press.
Gaynor, P. (1981). The effect of feedback delay on retention of computer-based mathematical material. Journal of Computer Based Instruction, 8, 28–34.
Green, K. (2001). The 2001 National Survey of Information Technology in Higher Education. Available: http://www.campuscomputing.net
Hail, J.G. (1984). The effects of immediate feedback on retention in an academic setting. Community/Junior College Quarterly, 8, 225–232.
Johnson, K.R., & Sulzer-Azaroff, B. (1975). The effects of different proctoring systems upon student examination performance and preference. In J.M. Johnston (Ed.), Behavior Research and Technology in Higher Education 2. Springfield, IL: Charles C. Thomas.
Keller, F.S. (1968). "Goodbye, teacher . . ." Journal of Applied Behavior Analysis, 1, 79–89.
Keppel, G. (1991). Design and Analysis. Englewood Cliffs, NJ: Prentice Hall, Inc.
Koen, B., Wissler, E., Lamb, J., & Hoberock, L. (1982). PSI management: Down the administrative chain. In J.G. Sherman, R.S. Ruskin, & G.B. Semb (Eds.), The Personalized System of Instruction: 48 Seminal Papers. Lawrence, KS: TRI Publications.
Kroll, L. (1999). Good morning, HAL. Forbes Magazine, 8.
Kruse, K. (1999). Real world WBT: Lessons learned at the Fortune 500. ASTD International Conference and Exposition. Atlanta, GA: The American Society for Training and Development.
Kulik, J., & Kulik, C-L. (1988). Timing of feedback and verbal learning. Review of Educational Research, 58, 79–97.
Kulhavy, R.W., & Anderson, R.C. (1972). Delay-retention effect with multiple-choice tests. Journal of Educational Psychology, 63, 505–512.
Leeds, R.D. (1974). The effects of immediate and delayed knowledge of results on immediate and delayed retention. Dissertation Abstracts International, 31, 3343A. (University Microfilms No. 7017924)
Levin, H., Glass, G., & Meister, G. (1987). Cost-effectiveness of computer assisted instruction. Evaluation Review, 11, 50–71.
Lewis, D., Dalgaard, B., & Boyer, C. (1985). Cost effectiveness of computer assisted economics instruction. The American Economic Review, 75, 91–96.
Lockhart, K.A., Sturges, P.T., Zachai, J., Dubois, B., & Groves, D. (1981). The effects of the timing of feedback on long-term knowledge retention in PSI courses (Report No. NPRDC TR 83-13). San Diego, CA: Navy Personnel Research and Development Center. (ERIC Document Reproduction Service No. ED 230 183)
Loulou, D. (1995). Making the A: How to study for tests. Washington, DC: ERIC Clearinghouse on Assessment & Evaluation.
Marek, P., Griggs, R., & Christopher, A. (1999). Pedagogical aids in textbooks: Do college students' perceptions justify their prevalence? Teaching of Psychology, 26, 11–19.
McPherson, M., & Schapiro, M. (1998). The Student Aid Game. Princeton, NJ: Princeton University Press.
Melmed, A. (1995). The Costs and Effectiveness of Educational Technology: Proceedings of a Workshop [Online]. Available: http://www.ed.gov/Technology/Plan/RAND/Costs/
National Center for Education Statistics (NCES). (1997). Digest of Education Statistics. Washington, DC: US Government Printing Office.
Office of Educational Research and Improvement. (1992). Help your child improve in test taking [Brochure]. Washington, DC: U.S. Department of Education.
O'Neill, M., Rasor, R.A., & Bartz, W.R. (1976). Immediate retention of objective test answers as a function of feedback complexity. Journal of Educational Research, 70, 72–75.
O'Reilly, M.F., Renzaglia, A., & Lee, S. (1994). Analysis of acquisition, generalization and maintenance of systematic instruction competencies by preservice teachers using behavioral supervision techniques. Education and Training in Mental Retardation and Developmental Disabilities, 29, 22–34.
Picard, D. (November, 1996). The future in distance training. Training, 5–10.
Pressey, S.L. (1926). A simple apparatus which gives tests and scores-and teaches. School and Society, 23, 373–376.
Pressey, S.L. (1950). Development and appraisal of devices providing immediate automatic scoring of objective tests and concomitant self-instruction. The Journal of Psychology, 29, 417–447.
Sassenrath, J.M., & Yonge, G.D. (1969). The effects of delayed information feedback and feedback cues in learning retention. Journal of Educational Psychology, 60, 174–177.
Semb, G.B., & Ellis, J.A. (1994). Knowledge taught in school: What is remembered? Review of Educational Research, 64, 253–286.
Sherman, J.G. (1992). Reflections on PSI: Good News and Bad. Journal of Applied Behavior Analysis, 25, 59–64.
Schmidt, R.A., & Bjork, R.A. (1992). New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science, 3, 207–217.
Skinner, B.F. (1953). Science and Human Behavior. New York, NY: Macmillan.
Skinner, B.F. (1958). Teaching machines. Science, 128, 969–977.
Smith, S. (1999). The effectiveness of traditional instructional methods in an online learning environment. Dissertation Abstracts International, 60(09A), 3330.
Strang, H.R., & Rust, J.O. (1973). The effects of immediate knowledge of results and task definition on multiple-choice answering. Journal of Experimental Education, 42, 77–80.
Sturges, P.T. (1978). Immediate vs. delayed feedback in a computer-managed test: Effects on long-term retention (Report No. NPRDC TR 78-15). San Diego, CA: Navy Personnel Research and Development Center. (ERIC Document Reproduction Service No. ED 160 635)
US General Accounting Office. (1998). Tuition Increases and Colleges' Efforts to Contain Costs (DHHS Publication No. HEHS-98-227). Washington, DC: US Government Printing Office.
Webb, J., Stock, W., & McCarthy, M. (1994). The effects of feedback timing on learning facts: The role of response confidence. Contemporary Educational Psychology, 19, 251–265.
Weiten, W., Deguara, D., Rehmke, E., & Sewell, L. (1999). University, community college, and high school students' evaluations of textbook pedagogical aids. Teaching of Psychology, 26, 19–21.
Windschitl, M. (1998). The WWW and classroom research: What path should we take? Educational Researcher, 27, 28–33.
APPENDIX
Survey Questions

Now that you have completed Unit 6, you have had three types of feedback since the course began. Overall, which type of feedback did you prefer?
a. feedback after each question, with the option of skipping around the test
b. feedback after each question, without the option of skipping around the test
c. feedback at the end of the test

Your choice indicates a preference for one type of feedback over the others. How strong is this preference?
a. I really don't have a preference
b. I only slightly prefer the type I chose
c. I moderately prefer the type I chose
d. I greatly prefer the type I chose
e. I absolutely, without a doubt prefer the type I chose