

Student evaluations of teaching: perceptions and biasing factors

Ahmad Al-Issa and Hana Sulieman
American University of Sharjah, Sharjah, United Arab Emirates

Abstract

Purpose – The purpose of this study is to examine students' perception of end-of-semester Student Evaluations of Teaching (SET), and to explore the extent to which SET are biased by non-instructional factors.

Design/methodology/approach – A survey questionnaire about the end-of-semester SET was designed and administered to 819 students selected from a random list of summer classes at the American University of Sharjah in the United Arab Emirates. Appropriate statistical analysis methods were applied to the resulting data.

Findings – The results of this study show that significant differences exist among the various demographic groups with respect to both students' perceptions of the evaluation process and their tendency to be biased by a number of non-instructional factors. The study presents evidence on how students' cultural and linguistic backgrounds affect their responses to SET.

Practical implications – This paper provides useful information for the academic community concerning the validity and reliability of SET rating scales used in US universities abroad, and whether the data obtained from such rating scales should be used for administrative and personnel decisions. In addition, teachers should examine SET assessments with care before undertaking modifications to their teaching practices.

Originality/value – This paper is the first to examine SET in a US university overseas where the majority of students are non-native speakers of English and of Arab origin. The findings illuminate the importance of understanding the cultural and linguistic contexts of the institution in which SET are conducted.

Keywords: Students, Perception, Bias, Teaching, United Arab Emirates
Paper type: Research paper

Quality Assurance in Education, Vol. 15 No. 3, 2007, pp. 302-317. © Emerald Group Publishing Limited, 0968-4883. DOI 10.1108/09684880710773183

Introduction

In recent years an increasing number of American universities have opened abroad, including several in the Arabian Gulf region. In the United Arab Emirates (UAE) alone, six new American-style universities have been established since 1997; in addition, several colleges have adopted an American-style curriculum. In accordance with the American educational model, these institutions have implemented student evaluations of teaching (SET) as a means of assessing the effectiveness of their course offerings and evaluating their academic staff. Unfortunately, the American rating scales that are used to conduct these evaluations are often adopted without appropriate modification to take account of the cultural and linguistic backgrounds of the students being assessed. This casts doubt on the reliability of the assessments. As Pennington and Young (1989) have noted, the approaches to SET in American educational institutions might not even be appropriate for students in the United States of America (USA) for whom English is a second language (ESL). According to these authors (1989, p. 629), SET rating scales are of:

. . . questionable validity when employed by those whose exposure to English and to the American educational and cultural context is limited.

If this is true of ESL students studying in the USA, it is likely to be the case for students studying in American universities in the Arabian Gulf region. Owing to their cultural and educational upbringing, these students might perceive the evaluation process differently from those studying in the USA. For example, students born and raised in the Arabian Gulf region are not accustomed to "passing judgment" on their teachers, and therefore find themselves in an unusual position if they are suddenly called upon to "judge" their instructors. No serious attempt has been made to assess students' perceptions of the evaluation process in such educational contexts. The existing literature on SET is largely restricted to evaluations conducted in the USA, and even in that setting there has been little attempt to establish the reliability and validity of specific SET instruments in an ESL context (Pennington and Young, 1989). There has been even less research on this topic in an international context. The purposes of the present paper are thus: firstly, to examine students' perceptions of SET in an American university in the Arabian Gulf region; and secondly, to explore the extent to which SET is influenced by various potentially biasing factors.

Literature review

The literature on SET is immense, including comprehensive reviews of research on the subject by Aleamoni (1987), Cashin (1988), Germain and Scandura (2005), Kolitch and Dean (1999), Marsh (1984, 1987), and Theall and Franklin (2001). In a comprehensive review, Cashin (1988) found no fewer than 1,300 articles and books dealing with research on the subject of SET, and the present authors found 2,988 articles on the subject published in scholarly journals between 1990 and 2005. With the huge amount of data available, it is difficult for university administrators and researchers to reach agreement on the utility and effectiveness of SET. As Felder (1990, p. 1) has noted, there is a plethora of literature to " . . . read, digest and argue for and against".

There are several arguments in favor of students rating their instructors. Proponents of ratings, such as Arreola (1994) and Theall and Franklin (1990, 2001), have argued that students spend long periods of time observing and interacting with their instructors, and that they are therefore qualified to make assessments of the teaching they receive. Scriven (1995) has argued that students are in a good position to judge such matters as whether a test covers all the material in a course and whether a course meets its stated objectives. Aleamoni (1981, 1987) has claimed that students are logical evaluators of the quality and effectiveness of course content, methods of instruction, textbooks, homework, and student interest; indeed, this author (1987) provided evidence that student evaluations are consistent in the long term, and he specifically rejected the suggestion that student ratings are merely a "popularity contest".

Opponents of this view have maintained that students cannot judge all aspects of teaching performance (Crumbley et al., 2001; Fish, 2005; Wallace and Wallace, 1989; Trout, 2000). Indeed, Trout (1997) concluded that students, especially freshmen, cannot judge any aspect of teaching. According to these critics, students do not have the knowledge or the experience to evaluate the multidimensionality of teaching. In addition, it has been argued that student ratings of teachers are often influenced by


non-instructional factors, and that data obtained from student evaluations should not therefore be used by university administrators in making decisions about personnel (Crumbley et al., 2001; Emery et al., 2003). In this regard, Armstrong (1998, p. 3) has contended that:

. . . students' ratings of teachers are intended to change the behavior of teachers . . . [and there is] no evidence that these changes are likely to contribute to learning.

Some studies have examined students' perceptions of SET (Ahmadi et al., 2001; Crumbley et al., 2001; Dwinell and Higbee, 1993; Penfield, 1976). These studies found that students believed that their assessments were an effective means of voicing their opinions about teaching, but that they were not fully aware of the implications of their evaluations for university administrators and teachers. This raises the question of whether students were motivated to take the evaluation seriously. As Sojka et al. (2002, p. 45) noted:

If . . . students believe faculty do not take [SET] seriously, then students will be less likely to take them seriously either.

If this is the case, then, as Chen and Hoshower (2003) have argued, the quality and utility of students' input are questionable.

The most common criticism of student evaluations is that they are biased. In this regard, several non-instructional factors that can potentially affect an instructor's evaluation have been identified. These include:

- the level of the class being taught (Marsh, 1987);
- students' interest in the subject matter before enrolling in the class (Marsh and Dunkin, 1992);
- the size of the class (Greenwald and Gillmore, 1997);
- gender (Hobson and Talbot, 2001);
- grades (Greenwald, 1997; Tang, 1999);
- the rigor of the course (Best and Addison, 2000); and
- an instructor's "warmth-inducing" behaviour (Best and Addison, 2000).

Similar views were expressed by Martin (1998), who reviewed the literature on the subject and noted that assessments by students can be influenced by:

- student characteristics (motivation for taking the course, disposition toward instructor and course);
- instructor characteristics (rank, experience, personality traits);
- course difficulty (grading leniency); and
- other environmental characteristics (physical attributes and the ambience of the classroom).

The fact that many non-instructional factors can affect assessments by students has raised concerns about how some university teachers might react to SET. Armstrong (1998, pp. 3-4) has expressed concern that teachers might:

. . . try to appeal to the least common denominator. Teachers may make their classes less challenging and decide it risky to work on skill development . . . teacher ratings may reduce experimentation by teachers.

Similarly, Crumbley (1995) has noted that instructors might inflate grades, reduce course content, and simplify examinations. Martin (1998) has likewise referred to attempts by teachers to influence student ratings in their favor by using practices that detract from learning rather than enhance it.

Most of the research reviewed above was conducted in the USA, where the majority of students are native speakers of English and familiar with the culture of the American educational system. In contrast, the majority of students in American universities in the UAE are of Arab origin, and their cultural views of education in general (and classroom practices in particular) might be quite different from those held by mainstream American students. If a fair and accurate evaluation of SET data is to be achieved, the present authors believe that it is important to understand the cultural and linguistic context of the institution in which assessments by students are conducted.

Methodology

Context
Data for the present study were collected from the American University of Sharjah (AUS) in the United Arab Emirates (UAE). AUS has a student body of approximately 5,500 students and 250 staff members of mixed cultural and linguistic backgrounds. The students represent more than 70 nationalities, and staff members come from more than 65 countries. As such, the university reflects the UAE itself, which has more expatriates in residence than locals. Although the majority of the students are from Arab and Islamic countries, most of the academic staff consists of: firstly, expatriates from Western countries (the USA, Canada, the UK, and western Europe); and secondly, Arab-Americans who were either born and raised in the United States or obtained their degrees from the USA or Europe. The language of instruction at AUS is English, and students must obtain a score of 510 on the Test of English as a Foreign Language (TOEFL) in order to matriculate.

Sample
A total of 819 students (482 males and 337 females) who were enrolled in summer courses in the 2004-2005 academic year completed a questionnaire about the end-of-semester student evaluations of teaching (SET). Participants in the study included students enrolled in arts, science, business, management, architecture, design, and engineering. The students came from:

- the Gulf region (UAE, Kuwait, Bahrain, Oman, Qatar, Yemen, and Saudi Arabia);
- the Levant (Syria, Jordan, Palestine, and Iraq);
- Africa (Egypt, Sudan, Morocco, Algeria, Tunisia, and Nigeria); and
- the Indian sub-continent (India and Pakistan).

The sample reflected the cultural and linguistic diversity of the AUS student population. Table I presents a summary of the demographic data on the subjects.


Table I. Respondents' demographics

                                            n      %
Country of origin
  Gulf                                    396     48
  Levant                                  243     30
  Africa                                   64      8
  Indian sub-continent                    116     14
Gender
  Males                                   482     59
  Females                                 337     41
Student status
  Freshman                                214     26
  Sophomore                               231     28
  Junior                                  201     25
  Senior                                  124     15
  Graduate                                 49      6
Language of instruction at high school
  Arabic                                  434     53
  English                                 350     43
  Asian                                    35      4
School/college
  College of Arts and Science             122     15
  School of Architecture and Design        79     10
  School of Business and Management       261     32
  School of Engineering                   357     43
GPA
  3.50-4.00                               120     15
  3.00-3.49                               223     27
  2.50-2.99                               223     27
  Below 2.50                              253     31
Total                                     819

Instrument and procedure
The questionnaire was designed to address the two main objectives of this study: firstly, to investigate students' perceptions of SET (items 1-8 on the questionnaire); and secondly, to identify factors that might potentially bias SET (items 9-15). Respondents were asked to indicate the extent to which they agreed with each item on a 5-point Likert-type scale (5 = "strongly agree"; 4 = "agree"; 3 = "not sure"; 2 = "disagree"; 1 = "strongly disagree").

A pilot study was conducted to ensure that adequate time was allowed for completion of the questionnaire and that all students were capable of comprehending the items used in the questionnaire. Comprehension was a concern because the questionnaire was in English, and although all AUS students had met the university admission requirements with respect to English proficiency, there was concern as to whether the respondents would completely understand the evaluation form. A total of 43 students, representing various levels of English proficiency, volunteered to participate in the pilot study. The majority of these students (62 percent) identified at least one question that was difficult to understand because of an unfamiliar English word. As a result of the pilot study, the language of all problematic items was simplified in consultation with the students. In addition, it was decided that participants who did not understand the meaning of a word during the administration of the final questionnaire would be allowed to seek clarification from the person administering the questionnaire. The final version of the questionnaire is presented in the Appendix.

The final version of the survey was administered in class by the instructors teaching the selected courses. The questionnaires were distributed at the beginning of the classes and collected after respondents had been allowed 15 minutes in which to complete the questionnaire. Uniformity of administration was ensured by distributing copies of the questionnaire and instructions for administration in envelopes, which were personally handed to each instructor. Envelopes containing the completed questionnaires were then collected from the instructors at the end of each class.
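The item summaries reported below (Tables II and IV) follow directly from this coding: each response is mapped to an integer score, the five categories are collapsed into three bands, and items are ranked by mean. The authors do not state what software they used; the following is a minimal illustrative sketch in Python with pandas, with hypothetical column names:

```python
import pandas as pd

# 5-point Likert coding used in the questionnaire:
# 5 = strongly agree, 4 = agree, 3 = not sure, 2 = disagree, 1 = strongly disagree

def summarize_items(responses: pd.DataFrame) -> pd.DataFrame:
    """Per-item summary in the style of Tables II and IV.

    `responses` is hypothetical: one row per student, one integer column
    (scored 1-5) per questionnaire item, e.g. "item1" ... "item15".
    """
    rows = []
    for item in responses.columns:
        scores = responses[item].dropna()
        n = len(scores)
        rows.append({
            "item": item,
            # collapse the five categories into the three reported bands
            "disagree_pct": round(100 * (scores <= 2).sum() / n),
            "not_sure_pct": round(100 * (scores == 3).sum() / n),
            "agree_pct": round(100 * (scores >= 4).sum() / n),
            "mean": round(scores.mean(), 2),
        })
    summary = pd.DataFrame(rows).set_index("item")
    # rank items from the highest mean (rank 1) downwards
    summary["rank"] = summary["mean"].rank(ascending=False,
                                           method="first").astype(int)
    return summary
```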


Data analysis and results

Students' perceptions of SET
Table II shows the percentages, mean responses, and rankings for the perception questions (items 1-8). As can be seen in the table, 68 percent of the respondents "strongly agreed" or "agreed" with item 1, namely that, by evaluating teachers, students are helping them to improve their teaching effectiveness. The majority (79 percent) also believed that AUS should continue having students evaluate their teachers (item 8).

Table II. Perceptions of SET

Items (students' perception of end-of-semester faculty evaluation):
1. By evaluating my professors, I am actually helping them improve their teaching effectiveness.
2. Professors change their teaching methods as a result of students' evaluations.
3. Faculty evaluations should be used by the administration for the purpose of promotion, contract renewal, and salary decisions.
4. The Course Evaluation Form we use at AUS provides me with an effective means of evaluating my professors.
5. From my experience, I think AUS students take faculty evaluations seriously.
6. I usually fill out the questions on the second part of the Evaluation Form where I am asked to write comments and suggestions.
7. My professors really care about what I think about their teaching methods.
8. AUS should continue having students evaluate their professors.

Item  Strongly disagree/disagree (%)  Not sure (%)  Strongly agree/agree (%)  Mean  Rank
 1               22                       10                 68              3.65    2
 2               39                       29                 32              2.87    8
 3               23                       22                 55              3.45    3
 4               21                       21                 58              3.42    4
 5               30                       36                 34              3.04    6
 6               31                       15                 54              3.29    5
 7               34                       27                 39              2.99    7
 8                9                       12                 79              4.15    1


In contrast, only 32 percent "strongly agreed" or "agreed" that teachers change their teaching methods as a result of SET (item 2), and only 39 percent thought that teachers really care about what students think about their teaching methods (item 7). Opinion was divided on whether AUS students take SET seriously (item 5), with 34 percent "agreeing", 30 percent "disagreeing", and 36 percent "not sure".

Table III shows the mean responses for each perception item, and the mean responses for overall perception, according to various demographic factors. Statistically significant differences are highlighted in italics in the table and discussed below.

Table III. Perceptions of SET by groups

                                                Item
Factor                       1     2     3     4     5     6     7     8   Overall
Origin
  Gulf                     3.68  2.90  3.41  3.48  3.09  3.26  3.06  4.10   3.37
  Levant                   3.54  2.76  3.44  3.37  3.08  3.35  2.92  4.10   3.32
  Africa                   3.84  3.15  3.51  3.31  2.97  3.39  2.95  4.31   3.43
  Indian sub-continent     3.69  2.83  3.53  3.39  2.81  3.21  2.94  4.28   3.33
Gender
  Male                     3.34  2.61  3.53  3.41  3.09  3.29  2.95  4.14   3.41
  Female                   3.82  2.97  3.42  3.42  2.98  3.29  3.05  4.15   3.39
Language of inst. in high school
  Arabic                   3.67  2.82  3.49  3.40  3.09  3.31  2.99  4.13   3.36
  English                  3.63  2.95  3.37  3.44  3.00  3.26  3.00  4.14   3.35
  Asian                    3.70  2.60  3.66  3.43  2.86  3.23  2.96  4.34   3.34
School/college
  Arts & Sciences          3.68  2.93  3.37  3.49  2.99  3.19  3.08  4.08   3.35
  Architecture             3.89  3.07  3.29  3.46  3.00  3.21  3.16  4.19   3.41
  Business and Management  3.58  2.84  3.65  3.50  3.12  3.29  3.05  4.16   3.96
  Engineering              3.61  2.84  3.41  3.35  3.02  3.26  2.99  4.17   3.34
GPA
  Above 3.50               3.62  3.01  3.33  3.48  2.61  3.46  3.04  4.42   3.37
  3.00-3.49                3.59  2.79  3.56  3.32  3.08  3.33  2.94  4.07   3.33
  2.50-2.99                3.60  2.89  3.51  3.45  3.00  3.35  2.99  4.21   3.37
  Below 2.50               3.63  2.86  3.35  3.41  3.16  3.12  3.03  4.01   3.32
Academic status
  Freshman                 3.73  2.95  3.39  3.59  3.22  3.26  3.15  4.19   3.43
  Sophomore                3.73  2.88  3.42  3.40  3.10  3.27  2.92  4.07   3.35
  Junior                   3.65  2.79  3.37  3.41  2.88  3.30  2.92  4.16   3.32
  Senior                   3.15  2.66  3.61  3.08  2.81  3.33  2.86  4.15   3.22
  Graduate                 4.22  3.37  3.37  3.69  3.20  3.22  3.20  4.21   3.58

Note: Numbers in italics are statistically significant
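The paper does not name the statistical test behind these significance flags. For comparing mean item scores across the levels of a demographic factor, a one-way ANOVA is the conventional choice, so a plausible reconstruction (an assumption, not the authors' documented procedure) might look like this, again with hypothetical column names:

```python
import pandas as pd
from scipy import stats

def flag_significant_items(df: pd.DataFrame, group_col: str,
                           item_cols: list[str], alpha: float = 0.05) -> dict:
    """Test, for each item, whether mean responses differ across the levels
    of `group_col` (e.g. "origin", "gender", "academic_status") using a
    one-way ANOVA. Returns {item: (F statistic, p-value, significant?)}.
    """
    results = {}
    for item in item_cols:
        # one sample of scores per demographic group, missing values dropped
        samples = [grp[item].dropna().to_numpy()
                   for _, grp in df.groupby(group_col)]
        f_stat, p_value = stats.f_oneway(*samples)
        results[item] = (f_stat, p_value, p_value < alpha)
    return results
```

For a two-level factor such as gender, this one-way ANOVA is equivalent to a two-sample t-test.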

It is apparent that students' academic status was the only demographic factor to have a significant effect on the overall mean perception. Graduate students (i.e. students enrolled in a master's degree), who constituted only 6 percent of the sample, showed the most positive perception of SET (3.58), whereas senior students gave the lowest mean value (3.22). These results were mirrored in the responses to items 1, 2, 4, and 5, in that senior students showed the most negative perceptions of SET whereas graduate students showed the most positive attitudes. For example, senior students were least likely to "strongly agree" or "agree" that by evaluating their teachers they are actually helping them to improve their teaching effectiveness (item 1, mean = 3.15) or that teachers change their teaching methods as a result of student evaluations (item 2, mean = 2.66). In addition, senior students, who had relatively longer experience than other students with respect to SET practices at AUS, were also the least likely to agree that the course-evaluation form used at the university provided them with an effective means of evaluating their teachers (item 4, mean = 3.08), and that AUS students take faculty evaluations seriously (item 5, mean = 2.81).

Gender had a significant influence on the responses to two items. Female students responded more positively than male students to item 1 (regarding SET helping teacher effectiveness) and item 2 (regarding SET changing teaching methods).

Students' grade point average (GPA) standing was associated with significant differences in the responses to items 5, 6, and 8. For example, students with a GPA of 3.5 or above were least likely to agree with item 5 (regarding whether AUS students take evaluations seriously). This group had a mean value of 2.61 for this item (the lowest of any GPA group), indicating that students in this GPA category had the greatest tendency to perceive other students as being flippant towards SET. The largest group of students in this GPA category were the seniors (about 44 percent of all students with a GPA of 3.5 or above); as previously noted, seniors demonstrated the most negative attitudes with respect to this item. For item 6 (regarding the propensity to write comments and suggestions on an evaluation form) and item 8 (AUS should continue having students evaluate their faculty), students with a GPA of 3.5 or above responded more favorably than students from other GPA groups. In particular, students whose GPA was 2.5 or less were the least likely to fill out the questions on the comments and suggestions part of the SET form (item 6, mean = 3.12).

Biasing factors
Table IV shows the findings with regard to factors that might exert potential bias in evaluations of teachers by students. It is apparent that the largest number of respondents (32 percent) agreed that their evaluations were influenced by their expected grade in a course (item 9), and 25 percent agreed that, if they had a good relationship with a teacher, they would rank him or her higher on teaching effectiveness even if that teacher were not effective (item 14). Students' evaluations were also affected by their teacher's gender (item 10, 18 percent), age (item 11, 15 percent), nationality (item 12, 16 percent), and personality (item 13, 23 percent), and by their views of the teacher's "immediate" knowledge (item 15, 15 percent).
Table IV. Potential bias factors in evaluations by students

Items (factors biasing students' evaluation objectivity):
9. My rating of my professors is affected by my expected grade in the course.
10. The gender of my professor (male-female) affects my evaluation.
11. The age of my professor (old-young) affects my evaluation.
12. The nationality of my professor (Arab-non-Arab) affects my evaluation.
13. When evaluating my professors, I usually pay more attention to their personality (i.e. friendliness, leniency, looks, dress, etc.) than to their teaching methods or course content.
14. If I have a good relationship with my professor, I will rank him/her high on teaching effectiveness even if he/she is not an effective teacher.
15. If I ask my professor a question that is related to the subject being taught and my professor responds by saying, "I am not really sure, but I will check that and get back to you," I will still not consider him/her knowledgeable.

Item  Strongly disagree/disagree (%)  Not sure (%)  Strongly agree/agree (%)  Mean  Rank
 9               48                       20                 32              2.79    1
10               73                        9                 18              2.06    5
11               77                        8                 15              1.98    7
12               76                        8                 16              2.01    6
13               54                       23                 23              2.55    3
14               53                       22                 25              2.61    2
15               69                       16                 15              2.28    4
Table V shows the mean response values for the individual items of potential bias, and the overall attitude of bias, according to various demographic factors. Statistically significant differences are highlighted in italics and discussed below.

As can be seen in Table V, place of origin affected some responses. Students from the Gulf region had the highest overall mean response (2.40), suggesting that these students were the most likely to "strongly agree" or "agree" with the biasing factors compared with the other origin groups. The Indian sub-continent students showed the lowest overall mean value (2.15), representing the strongest level of disagreement with the biasing factors. Further investigation of the responses revealed that 61 percent of Gulf students "strongly disagreed" or "disagreed" with the biasing factors, 16 percent were "not sure", and 23 percent "strongly agreed" or "agreed". The corresponding figures for the Indian sub-continent students were 70 percent, 14 percent, and 16 percent respectively. Gulf students were also the least likely to disagree with the propositions that their evaluations were influenced by expected grade (item 9), teacher's age (item 11), and teacher's personality (item 13), whereas the Indian sub-continent students were the most likely to disagree with these biasing factors. In interpreting these findings with respect to origin, it is significant that Gulf students constituted the largest group in the sample (48 percent); their biased attitudes thus have the potential to invalidate the SET results at AUS.

With regard to gender, female students had a smaller overall mean value (2.18) for the potentially biasing factors than male students (2.40). Approximately 70 percent of all responses by female students to questions about biasing factors were "strongly disagree" or "disagree", whereas 60 percent of male students' responses were in these categories. In addition, male students were more influenced than female students by expected grade (item 9), teacher's age (item 11), teacher's nationality (item 12), and teacher's personality (item 13). It should be noted that 60 percent of the AUS student body consists of males.
Table V. Potential bias factors by groups

                                                Item
Factor                       9     10    11    12    13    14    15   Overall
Origin
  Gulf                     2.87  2.15  2.15  2.07  2.62  2.66  2.32   2.40
  Levant                   2.84  2.08  1.87  1.96  2.56  2.61  2.25   2.31
  Africa                   2.52  1.98  1.91  1.87  2.63  2.73  2.06   2.24
  Indian sub-continent     2.48  1.95  1.62  2.00  2.12  2.57  2.36   2.15
Gender
  Male                     2.89  2.10  2.08  2.08  2.67  2.64  2.34   2.40
  Female                   2.51  2.01  1.81  1.89  2.31  2.58  2.20   2.18
Language
  Arabic                   2.97  2.28  2.10  2.25  2.64  2.59  2.34   2.45
  English                  2.52  1.86  1.83  1.83  2.48  2.64  2.22   2.19
  Asian                    2.48  1.66  1.82  1.74  2.37  2.43  2.15   2.08
School/college
  Arts & Sciences          2.65  2.07  1.86  1.95  2.44  2.58  2.09   2.23
  Architecture             2.72  1.87  1.88  1.84  2.38  2.63  2.22   2.17
  Business and Management  2.78  2.04  2.01  2.02  2.52  2.67  2.24   2.35
  Engineering              2.83  2.10  2.09  2.06  2.51  2.57  2.30   2.35
GPA
  Above 3.50               2.74  1.95  1.83  1.88  2.43  2.52  2.12   2.21
  3.00-3.49                2.64  2.05  1.91  1.97  2.43  2.60  2.23   2.26
  2.50-2.99                2.94  2.11  2.03  2.03  2.61  2.63  2.29   2.37
  Below 2.50               2.81  2.08  2.03  2.07  2.64  2.64  2.33   2.37
Academic standing
  Freshman                 2.87  2.19  2.16  1.18  2.72  2.64  2.39   2.32
  Sophomore                2.85  2.20  2.09  2.03  2.56  2.65  2.45   2.40
  Junior                   2.78  2.00  1.89  1.97  2.49  2.59  2.17   2.27
  Senior                   2.87  1.94  1.85  1.98  2.51  2.66  2.16   2.28
  Graduate                 2.01  1.37  1.26  1.26  2.06  2.42  1.67   1.72

Note: Numbers in italics are statistically significant
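The response-distribution breakdowns quoted around Table V (for example, 61 percent of Gulf students disagreeing with the biasing items overall, versus 70 percent of Indian sub-continent students) pool all bias items for each group and collapse the five categories into three bands. A sketch of that pooling, under the same hypothetical data layout as above:

```python
import pandas as pd

def pooled_distribution(df: pd.DataFrame, group_col: str,
                        item_cols: list[str]) -> pd.DataFrame:
    """Percentage of pooled item responses in each collapsed band, by group."""
    # stack the bias items into one long table of (group, score) pairs
    long = df.melt(id_vars=[group_col], value_vars=item_cols,
                   value_name="score").dropna(subset=["score"])
    # scores 1-2 = disagree band, 3 = not sure, 4-5 = agree band
    band = pd.cut(long["score"], bins=[0, 2, 3, 5],
                  labels=["disagree", "not sure", "agree"])
    table = pd.crosstab(long[group_col], band, normalize="index")
    return (100 * table).round(0)
```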

With regard to language of instruction at high school, students who had been instructed in Arabic showed the least tendency to disagree with the biasing factors, with an overall mean response of 2.45; this was the highest value among the three language groups. More detailed analysis of the Arabic-instructed students' responses revealed that 60 percent were "strongly disagree" or "disagree", 14 percent were "not sure", and 26 percent were "strongly agree" or "agree". For the English-instructed students, 69 percent of the responses were "strongly disagree" or "disagree", 15 percent were "not sure", and 16 percent were "strongly agree" or "agree". Among the Asian-instructed students, the corresponding figures were 71 percent, 18 percent, and 11 percent. These findings indicate that Arabic-instructed students had the greatest tendency to provide biased evaluations of their teachers. Analysis of the responses to the individual biasing items revealed that Arabic-instructed students also had the greatest tendency to be influenced by expected grade (item 9), teacher's gender (item 10), teacher's age (item 11), and teacher's nationality (item 12). In interpreting these results, it is noteworthy that more than half of the students at AUS graduated from Arabic-instructed high schools.

With regard to academic status, the overall bias scores of freshmen (mean 2.32) and sophomores (mean 2.40) were the highest among the various academic-status groups. The graduate students were the most likely to disagree with the biasing factors (overall mean value of 1.72). Further analysis revealed that 86 percent of responses from graduate students were "strongly disagree" or "disagree", compared with about 60 percent of responses from freshmen and sophomores and about 68 percent of responses from juniors and seniors. Conversely, about 9 percent of responses from graduate students were "strongly agree" or "agree", compared with about 19 percent from juniors and seniors and about 24 percent from freshmen and sophomores. Table V shows similar significant differences in the students' attitudes to the biasing factors when responses to individual items were analyzed. Responses to item 14 (relationship with teacher) were the only ones that did not show significant differences among the academic-status groups.

Discussion and summary

The present study examined students' perceptions of student evaluations of teaching (SET) at the American University of Sharjah (AUS) in the United Arab Emirates, and the extent to which SET might be biased in this institution by various non-instructional factors. The results show that the majority of students felt that the university should continue to have students evaluate their instructors, and that SET enables students to voice their opinions about teaching. However, they did not think that teachers take students' evaluations seriously, and this attitude probably explains why 30 percent of the respondents felt that AUS students do not take SET seriously. If this is the case, such an attitude would have an adverse effect on the quality of SET (Sojka et al., 2002).

Students' perceptions of SET were influenced by their gender, academic status, and grade point average (GPA). Senior students were the least likely to have positive perceptions of SET, whereas freshmen and graduate students had the most positive attitudes. It is possible that the attitudes of senior students were influenced by the fact that they had been required to complete SET forms for a number of years, whereas graduate students and freshmen (many of whom were in their first or second semester at the university) had probably never evaluated a teacher before; for these students, the "novel" task of evaluation might therefore have been viewed more positively. It would be interesting to ascertain whether the relatively positive perceptions of the graduate students and freshmen are sustained in subsequent years.

The findings of this study also showed that SET assessments at AUS are potentially biased. Assessments are potentially influenced by the student's expected grade, the teacher's gender, age, nationality, and personality, and the students' views of what constitutes "knowledge". The extent to which SET assessments are influenced by these factors was also found to depend on some of the students' demographic factors (origin, gender, language of instruction in high school, and academic status). These findings cast doubt on the validity of SET at AUS and on whether the data obtained from SET should be used for administrative and personnel decisions. It is also apparent that teachers should not be misled by data obtained from SET; rather, they should examine SET assessments with care before undertaking modifications to their teaching practices.

One of the major findings of this study concerns students' origin and their language of instruction in high school. The evaluations of Gulf students were found to be the most likely to be influenced by biasing factors. Furthermore, students who had been instructed in Arabic in high school (a group that includes most of the Gulf students, in addition to most of the other Arab students who participated in this study) were more influenced by the various biasing factors than students whose language of instruction in high school was English or an Asian language. They were more influenced than other groups not only by the expected grade but also by the teacher's age, gender, nationality, and personality. Arabs' perceptions of age and personality traits (especially leniency, which usually connotes "caring" in this part of the world) are culturally rooted. As Meleis (1982, p. 443) has observed, Arab students:

. . . have learned that somebody who is more qualified, more educated, and more expert than they in matters of education should be responsible for decisions relating to education.

According to this view, "good" teachers are those who are highly educated and caring, who know the answer to every question, and who are formal and highly skilled in classroom management. According to Meleis (1982, p. 444), the teacher is the " . . . epitome of wisdom inculcated by years of teaching, researching, and plain living" (see also Al-Issa, 2003, 2005). Cultural values of this type are more likely to be held in Arabic-instructed schools than in English or Asian schools in this region. Students attending Arabic schools are therefore more likely to apply these values when evaluating their teachers at university.

The fact that Arabic-instructed students were more likely to be influenced by their relationship with their teacher can also be ascribed to their cultural upbringing. In a collectivist society such as Arab culture, the concept of friendship is inseparable from social obligations. As pointed out by Al-Issa (2003, p. 587), part of a healthy friendship among Arabs is that a friend is obliged:

. . . to fulfil certain obligations such as offering help and doing everything he/she can to comfort a friend. Furthermore, [a friend] is always expected to show admiration for his/her friends, and praise their goodness, preferably in their presence.

In conclusion, this study has presented evidence that casts doubt on the quality of the data obtained from SET in certain linguistic and cultural settings. It has also shed light on how students' cultural and linguistic contexts affect their responses to SET. More research is required to enhance understanding of how culturally and linguistically diverse students in this region respond to SET and how their input is interpreted by university administrations. It is hoped that the present investigation will broaden the scope of research on SET and motivate further studies pertaining to American institutions abroad.

References

Ahmadi, M., Helms, M. and Raiszadeh, F. (2001), "Business students' perceptions of faculty evaluations", International Journal of Educational Management, Vol. 15 No. 1, pp. 12-22.

Aleamoni, L.M. (1981), "Student ratings of instruction", in Millman, J. (Ed.), Handbook of Teacher Evaluation, Sage, Beverly Hills, CA, pp. 110-45.

Aleamoni, L.M. (1987), "Student rating: myths versus research facts", Journal of Personnel Evaluation in Education, Vol. 1, pp. 111-19.

Al-Issa, A. (2003), "Sociocultural transfer in L2 speech behaviors: evidence and motivating factors", International Journal of Intercultural Relations, Vol. 27, pp. 581-601.

Al-Issa, A. (2005), "When the west teaches the east: analyzing intercultural conflict in the classroom", Intercultural Communication Studies, Vol. 4 No. 3, pp. 129-48.

Armstrong, J.S. (1998), "Are student ratings of instruction useful?", American Psychologist, Vol. 53, pp. 1123-4.

Arreola, R.A. (1994), Developing a Comprehensive Faculty Evaluation System, Anker, Boston, MA.

Best, J.B. and Addison, W.E. (2000), "A preliminary study of perceived warmth of professor and student evaluations", Teaching of Psychology, Vol. 27, pp. 60-2.

Cashin, W.E. (1988), Student Ratings of Teaching: A Summary of the Research, Kansas State University Center for Faculty Evaluation and Development, Manhattan, KS.

Chen, Y. and Hoshower, L.B. (2003), "Student evaluation of teaching effectiveness: an assessment of student perception and motivation", Assessment and Evaluation in Higher Education, Vol. 28 No. 1, pp. 71-88.

Crumbley, L. (1995), "The dysfunctional atmosphere of higher education: games professors play", Accounting Perspectives, Spring, pp. 67-76.

Crumbley, L., Henry, B. and Kratchman, S. (2001), "Students' perceptions of the evaluation of college teaching", Quality Assurance in Education, Vol. 9 No. 4, pp. 197-207.

Dwinell, P.L. and Higbee, J.L. (1993), "Students' perceptions of the value of teaching evaluations", Perceptual and Motor Skills, Vol. 76, pp. 995-1000.

Emery, C., Kramer, T. and Tian, R. (2003), "Return to academic standards: a critique of student evaluations of teaching effectiveness", Quality Assurance in Education, Vol. 11 No. 1, pp. 37-46.

Felder, R.M. (1990), "What do they know, anyway?", Chemical Engineering Education, Vol. 26 No. 3, pp. 134-5.

Fish, S. (2005), "Who is in charge here?", The Chronicle of Higher Education, Vol. 51 No. 22, p. C2.

Germain, M. and Scandura, T. (2005), "Grade inflation and student individual differences as systematic bias in faculty evaluations", Journal of Instructional Psychology, Vol. 32 No. 1, pp. 58-67.

Greenwald, A.G. (1997), "Validity concerns and usefulness of student ratings of instruction", American Psychologist, Vol. 52, pp. 1182-6.

Greenwald, A.G. and Gillmore, G.M. (1997), "No pain, no gain? The importance of measuring course workload in student ratings of instruction", Journal of Educational Psychology, Vol. 89, pp. 743-51.

Hobson, S.M. and Talbot, D.M. (2001), "Understanding student evaluations", College Teaching, Vol. 49 No. 1, pp. 26-31.

Kolitch, E. and Dean, A. (1999), "Student ratings of instruction in the USA: hidden assumptions and missing conceptions about 'good' teaching", Studies in Higher Education, Vol. 24 No. 1, pp. 27-42.

Marsh, H. (1984), "Students' evaluations of university teaching: dimensionality, reliability, validity, potential biases, and utility", Journal of Educational Psychology, Vol. 76, pp. 707-54.

Marsh, H. (1987), "Students' evaluations of university teaching: research findings, methodological issues and directions for future research", International Journal of Educational Research, Vol. 11, pp. 253-388.

Marsh, H. and Dunkin, M. (1992), Students' Evaluations of University Teaching: Handbook on Theory and Research, Vol. 8, Agathon Press, New York, NY.

Martin, J.R. (1998), "Evaluating faculty based on student opinions: problems, implications, and recommendations from Deming's theory of management perspective", Issues in Accounting Education, Vol. 13 No. 4, pp. 1079-94.

Meleis, A. (1982), "Arab students in western universities", Journal of Higher Education, Vol. 53 No. 4, pp. 439-47.

Penfield, D.A. (1976), "Student ratings of college teaching: rating the utility of rating forms", Journal of Educational Research, Vol. 76 No. 1, pp. 19-22.

Pennington, M.C. and Young, A.L. (1989), "Approaches to faculty evaluation for ESL", TESOL Quarterly, Vol. 23 No. 4, pp. 619-46.

Scriven, M. (1995), "Student ratings offer useful input to teacher evaluations", Practical Assessment, Research and Evaluation, Vol. 4 No. 7, p. 1.

Sojka, J., Gupta, A. and Deeter-Schmelz, D. (2002), "Student and faculty perceptions of student evaluations of teaching: a study of similarities and differences", College Teaching, Vol. 50 No. 2, pp. 44-9.

Tang, S. (1999), "Student evaluation of teachers: effects of grading at college level", Journal of Research and Development in Education, Vol. 32, pp. 83-8.

Theall, M. and Franklin, J. (1990), "Student ratings of instruction: issues for improving practice", in Theall, M. and Franklin, J. (Eds), New Directions for Teaching and Learning, No. 43, Jossey-Bass, San Francisco, CA.

Theall, M. and Franklin, J. (2001), "Looking for bias in all the wrong places: a search for truth or a witch hunt in student ratings of instruction?", New Directions for Institutional Research, Vol. 109, pp. 45-56.

Trout, P. (1997), "What the numbers mean: providing a context for numerical student evaluations of courses", Change, Vol. 29 No. 5, pp. 13-25.

Trout, P. (2000), "Teacher evaluations: some numbers don't add up", Commonweal, Vol. 127 No. 8, pp. 10-11.

Wallace, J.J. and Wallace, W.A. (1989), "Why the costs of student evaluations have long since exceeded their value", Issues in Accounting Education, Vol. 2, pp. 224-48.


Appendix

Figure A1. Survey (the questionnaire is reproduced as an image in the original article)

Corresponding author
Ahmad Al-Issa can be contacted at: [email protected]
