2009 2nd. Conference on Data Mining and Optimization 27-28 October 2009, Selangor, Malaysia
Classification of Students’ Performance in Computer Programming Course According to Learning Style
N. M. Norwawi1, S. F. Abdusalam2, C. F. Hibadullah3, B. M. Shuaibu4
1,4 Faculty of Science and Technology, Universiti Sains Islam Malaysia, Malaysia.
2,3 College of Arts and Science, Universiti Utara Malaysia.

Abstract-- In Information Technology related programs, computer programming courses are compulsory subjects in most institutions of learning. However, there are many reports of poor performance among students in such courses. Previous studies examined some of the variables influencing students’ performance using statistical data analysis. The key contribution of this study is the use of a classification algorithm to extract patterns, which are examined with respect to one cognitive factor, namely learning style. The findings show that students who perform well in programming courses have a visual, active and sequential learning style.
Keywords- Classification, Proficiency in Programming.

I. INTRODUCTION

Computer science students are expected to be well versed in programming skills. In fact, most science, mathematics, engineering, and technology programs in higher academic institutions require students to acquire programming skills as a part of their education [1]. Previous studies have found that learning style can affect an individual’s skill in information processing and also students’ performance, particularly in introductory computer science courses. To address these matters, information on student learning style could be used as a guide to instructional delivery approaches and study habits in the teaching and learning of computer science courses [2].

Students’ learning style indicates how the students respond to a wide range of intellectual activities and their preference in approaching new material. For example, some students may prefer to discuss new concepts in small groups, while others may prefer solitary study of those concepts. Some students may learn better by participating in active classroom activities, while others may learn better through reflection on the material.

The most striking issue is that students in Information Technology related programs end up not improving their programming skills despite taking several programming courses. This eventually creates problems not only in their employment, but even in the higher-level Computer Science courses [3]. To support this, an interview conducted at Universiti Utara Malaysia (UUM) on TIA 5013 Principles and Techniques in Programming, a first-semester postgraduate course, in 2006/2007 showed unsatisfactory performance. Detailed grade statistics are shown in Table I.

Table I TIA5013 Course
Grade     Percentage
A/A-      19%
B+/B      27%
B-/C+     54%
Table I indicates that almost half of the students got B or C+. This is not a very encouraging performance given that these students were already familiar with programming concepts from their bachelor’s degree. Whereas an earlier study examined technical factors and academic background and their influence on performance in programming courses, this research investigates the relationship between students’ learning style, which is a cognitive factor, and performance in programming courses, extracted using a data mining technique.

This paper is organized as follows: Section 2.0 presents related work, Section 3.0 describes the study setting, and Section 4.0 discusses the findings.

II. FACTORS INFLUENCING PERFORMANCE IN PROGRAMMING COURSES

Some researchers have identified a number of personality variables, including computer self-efficacy, student attitude, comfort level, cognitive profile, and learning style [4] – [8], that have varying degrees of impact on student performance. Other researchers have studied common variables and agreed on the effect of the following variables on student performance: previous computer experience [9], mathematics and science
978-1-4244-4944-6/09/$25.00 ©2009 IEEE
Authorized licensed use limited to: UNIVERSITI UTARA MALAYSIA. Downloaded on December 16, 2009 at 11:32 from IEEE Xplore. Restrictions apply.
background [10], reading comprehension [11] and logical reasoning [12], which have varying degrees of influence on student success. Reference [13] found that mathematical ability and exposure to mathematical courses are important predictors of performance on introductory computer science modules. Similarly, experience in science subjects is also important, even though fewer studies have indicated this. In addition, programming and non-programming computer experience (for example, experience of computer applications, emailing, game playing and surfing the web) are important predictors of programming performance. Besides the predictor variables, some studies have collected demographic variables on their respondents, including age, gender, ethnicity and marital status [6], [10].

This study looks at cognitive factors, with learning style chosen as the contributing factor to students’ performance in programming courses. It extends the earlier work in [7] and [8], which used undergraduate students as the unit of analysis and Holland’s classification of personality style as the cognitive factor.

A. LEARNING STYLE

Learning style is defined as the characteristic strengths and preferences in the way people take in and process information. Each student has a unique way of learning. Reference [13] defined a student’s learning style as the way the student approaches learning and masters new material, which can affect the student’s performance not only in introductory computer science courses but also across other courses in the computer science curriculum. Reference [14] stated that there are two dominant learning style assessment tools used in science and engineering education: Kolb’s Learning Styles Inventory (LSI) and the Felder-Silverman learning styles model. The latter model, known as the Soloman-Felder Index of Learning Styles (ILS), was developed by Richard Felder and Barbara Soloman in 1991 and helps to identify students’ learning preferences [15].
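As a rough illustration of how an ILS answer sheet might be turned into the per-pole scores analysed in this study, the sketch below tallies 44 two-option items into four dimensions of 11 items each. The item ordering, the "a"/"b" answer coding and the function name `score_ils` are assumptions made for this example; the paper does not describe the scoring procedure.

```python
# Pole pairs for the four ILS dimensions named in this section.
DIMENSIONS = [("ACT", "REF"), ("SNS", "INT"), ("VIS", "VRB"), ("SEQ", "GLO")]


def score_ils(answers):
    """Tally a 44-item answer sheet into one score (0 to 11) per pole.

    Assumptions (not stated in the paper): answers are "a" or "b", option
    "a" counts toward the first pole of a pair, and items cycle through
    the four dimensions in the order listed in DIMENSIONS.
    """
    if len(answers) != 44:
        raise ValueError("expected 44 answers")
    scores = {pole: 0 for pair in DIMENSIONS for pole in pair}
    for index, answer in enumerate(answers):
        if answer not in ("a", "b"):
            raise ValueError(f"item {index + 1}: answer must be 'a' or 'b'")
        first, second = DIMENSIONS[index % 4]
        scores[first if answer == "a" else second] += 1
    return scores


# Hypothetical sheet: option "a" on the first 28 items, "b" on the rest.
example = score_ils(["a"] * 28 + ["b"] * 16)
print(example)
```

Under this tally the two poles of a dimension always sum to 11, which is consistent with the reported visual and verbal means (6.9296 and 4.0704) adding up to 11.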
This instrument is made up of a set of 44 questions, 11 for each of the four dimensions. Each dimension classifies learning dispositions based on student opinion surveys. Felder’s Index of Learning Styles (ILS) measures four different dimensions of an individual’s learning style [13]: active/reflective, sensing/intuitive, visual/verbal, and sequential/global. Active learners prefer an environment that enables them to learn by using the knowledge, such as writing programs or discussing material with their peers. Reflective learners prefer an environment that enables them to cogitate over the material. Sensing learners prefer learning facts and concepts, whereas intuitive learners prefer learning possibilities, applications, and relationships. Visual learners prefer learning from material they can see, for example charts, figures, and demonstrations. Verbal learners prefer words, either spoken or written. Sequential learners follow material in a step-by-step sequence. Global learners tend to learn by putting material into a global context and seeing how the pieces of material relate to each other.

In relation to the Felder instrument, [14] found that students with reflective and verbal learning styles achieved top grades compared to the other dimensions. The Soloman-Felder Index of Learning Styles instrument was adopted in this study due to its simplicity and recentness compared to Kolb’s Learning Styles Inventory instrument (as cited in [14]). Further, [16] applied a data mining algorithm for selecting students for remedial classes based on their achievement in O-level subjects. Any student who does not score higher than the cut-off marks is recommended for the remedial classes. From the findings of this study and an understanding of the cognitive factors, improvements to the pedagogy for programming courses can be designed to suit students who do not perform well.

III. STUDY SETTING

The study considers active Masters students at the Faculty of Information Technology (FTM), Universiti Utara Malaysia (UUM), in the second semester of session 2006/2007. Although the Faculty offers four Masters programs, namely Information and Communication Technology (MSc. ICT), Information Technology (MSc. IT), Intelligent System (MSc. Intel. Sys) and Technopreneurship (MSc. Techno), the MSc. Techno program is not included because its curriculum has no programming course. Students in the programs under study have either an IT or a non-IT background. The students with a non-IT background are those who previously obtained a qualification in other professional fields such as Accounting, Linguistics and Management. These students are enrolled in the first category of program, the MSc. ICT program, which is offered as a conversion program for holders of non-IT related qualifications.
In such a case, it is compulsory for the students to register for a basic programming course such as Principles and Techniques in Programming (TIA 5013), which uses the Java programming language. On the other hand, students with an IT background are enrolled in the MSc. IT program and are required to register for an advanced programming course such as Advanced Programming (TIW 5023), which also uses Java. In the third category, MSc. Intel. Sys., students are required to take Artificial Intelligence Programming Languages (TIN 5023), which uses Prolog.

Data was collected using the Soloman-Felder Index of Learning Styles test instrument, which was randomly distributed to the postgraduate students in FTM. The list of active FTM postgraduate students in the current semester, obtained from the Graduate Center, contains 91 students. Thus the test was
distributed to 75 students. However, only 71 tests were returned.

IV. RESULTS AND DISCUSSIONS

A descriptive analysis of the demographic factors is presented in Table II.

Table II Age of Respondent
Age         Frequency   Percent
20-25       25          35.20
26-30       34          47.90
31-40       12          16.90
Above 40    0           0.00
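The percentages in Table II follow directly from the age-group frequencies and the number of returned tests. A minimal sketch of that arithmetic is given below; the frequencies and the counts of 75 distributed and 71 returned tests come from the text, while the variable names and the response-rate figure are additions made for illustration.

```python
# Age-group frequencies as reported in Table II.
frequencies = {"20-25": 25, "26-30": 34, "31-40": 12, "Above 40": 0}


def percentages(counts):
    """Return each group's share of the total, in percent, to one decimal."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


returned = sum(frequencies.values())  # number of returned tests
distributed = 75                      # number of tests handed out

shares = percentages(frequencies)
response_rate = round(100 * returned / distributed, 1)

print(shares)
print(response_rate)
```

The computed shares agree with the Percent column of Table II to one decimal place.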
The largest group of respondents (47.90%) are between 26 and 30 years old, followed by those between 20 and 25 years old (35.20%). The remaining 16.90% are aged between 31 and 40. Thus, most of the respondents are young or fresh graduates who are still in their twenties. In addition, most of the postgraduate students (80.30%) who took the programming courses in FTM have a background in IT, whereas 19.70% have no formal education in IT. Fig. 1 portrays the grades achieved by the respondents in their programming course. The largest share of the students (29.00%) scored B, while the fewest (5.60%) scored B- and C+. On the whole, about half of the students (55.00%) scored grade B or B+.
Fig. 1. The distribution of grades in programming courses

Table III shows the mean value for learning style dimensions.

Table III Respondents’ ILS Dimension
Dimension   Min     Max      Mean
VIS         2.00    11.00    6.9296
VRB         0.00    9.00     4.0704
SEQ         2.00    11.00    5.8873
GLO         0.00    9.00     5.1127
SNS         1.00    10.00    6.2958
INT         1.00    10.00    4.7042
ACT         3.00    11.00    6.3662
REF         0.00    8.00     4.6338
Visual = VIS, Verbal = VRB, Sequential = SEQ, Global = GLO, Sensing = SNS, Intuitive = INT, Active = ACT, Reflective = REF

The results show that the most frequently occurring learning style is visual, with a mean of 6.9296. In contrast, verbal is the least frequently occurring learning style, with a mean score of 4.0704. In addition, more students have a sensing learning style than an intuitive one. Fig. 2 shows that, in general, the respondents are mostly visual, sensing, and active learners. However, the non-IT students tend to be more verbal and reflective.

Fig. 2: The Trend between Education Background and ILS Dimension (bars for IT and Non-IT students over VIS, VRB, SEQ, GLO, SNS, INT, ACT and REF)

A cross-tabulation of the grades and learning style is shown in Table V.

Table V Summary of Results for each ILS Dimension with Grade
ILS dimension         Grade        Percentage
Visual                A            50%
Verbal                B            50%
Active                A-           100%
Reflective            B            100%
Intuitive             B            100%
Sensing               A            42%
Global, Sequential    C+, B-       60%, 50%
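One way a summary such as Table V could be produced is to group students by a learning-style label and report the most frequent grade in each group with its share. The sketch below assumes each student carries a single dominant-style label, which the paper does not confirm, and the records are hypothetical placeholders rather than the study data.

```python
from collections import Counter, defaultdict


def modal_grade_by_style(records):
    """Map each style to (most frequent grade, its share in whole percent).

    `records` is an iterable of (style, grade) pairs. Ties are resolved in
    favour of the grade encountered first.
    """
    grades_by_style = defaultdict(Counter)
    for style, grade in records:
        grades_by_style[style][grade] += 1
    summary = {}
    for style, counts in grades_by_style.items():
        grade, n = counts.most_common(1)[0]
        summary[style] = (grade, round(100 * n / sum(counts.values())))
    return summary


# Hypothetical (style, grade) pairs, invented for illustration only.
records = [
    ("Visual", "A"), ("Visual", "A"), ("Visual", "B+"), ("Visual", "B"),
    ("Active", "A-"), ("Active", "A-"),
]
summary = modal_grade_by_style(records)
print(summary)
```

Reporting only the modal grade hides the rest of each group's distribution, so a full grade-by-style count table would be the more informative output when the groups are small.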
Table V shows that most of the students who scored grade A are those with a visual (50.00%) or sensing (42.00%) learning style. However, all students who are active learners scored A- (100.00%). Those who tend to be more verbal, intuitive and reflective learners can be regarded as average students, as they attained grade B. Finally, those who are global and sequential learners scored grades C+ and B-. Thus, it can be concluded that those who achieved good grades are students categorized as more visual, sensing and active. On the other hand, the unsatisfactory performers in programming courses are those with global and sequential learning styles.

Next, the data analysis was carried out using a tool called Weka, an acronym for Waikato Environment for Knowledge Analysis. It is a data mining tool for data exploration, consisting of a collection of machine learning algorithms for data classification, clustering and evaluation. In this study, the data set was applied to the J48 decision tree algorithm (the Java implementation of the C4.5 decision tree learner) using 4-fold cross-validation. A decision tree represents a supervised approach to data classification and is used to predict unknown values based on known data. It is a simple structure in which non-terminal nodes represent tests on one or more attributes and terminal nodes reflect decision outcomes. The decision tree was run ten times, with seeds from 1 to 10, for training, test and cross-validation in order to reduce error. The training set was the same for all seeds from 1 to 10. It was found that cross-validation provides better classification and lower error rates. A decision tree was generated by Weka 3.4 based on seed 1. This decision tree has 17 leaves and the size of the tree is 33. From this tree, 21 rules were extracted, which covered all attributes and instances. Table VI shows the extracted patterns for grade A. In general, students with higher visual, active, sensing and sequential scores obtained grade A.
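To make the relationship between a decision tree and its extracted rules concrete, the sketch below represents non-terminal nodes as threshold tests and terminal nodes as grades, then walks every root-to-leaf path to produce one rule per leaf. The tree, its thresholds and its grades are hypothetical and are not the J48 tree obtained in this study; the class and function names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    grade: str  # decision outcome at a terminal node


@dataclass
class Node:
    attribute: str               # ILS score tested at this node
    threshold: int
    low: Union["Node", Leaf]     # branch taken when value <= threshold
    high: Union["Node", Leaf]    # branch taken when value > threshold


def classify(tree, record):
    """Follow the tests from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.high if record[tree.attribute] > tree.threshold else tree.low
    return tree.grade


def extract_rules(tree, conditions=()):
    """Return one (conditions, grade) rule per root-to-leaf path."""
    if isinstance(tree, Leaf):
        return [(list(conditions), tree.grade)]
    low_test = f"{tree.attribute} score <= {tree.threshold}"
    high_test = f"{tree.attribute} score > {tree.threshold}"
    return (extract_rules(tree.low, conditions + (low_test,))
            + extract_rules(tree.high, conditions + (high_test,)))


# Hypothetical toy tree: thresholds and grades are invented for illustration.
TOY_TREE = Node("visual", 6,
                low=Leaf("B"),
                high=Node("active", 5, low=Leaf("B+"), high=Leaf("A")))

rules = extract_rules(TOY_TREE)
for conditions, grade in rules:
    print("If " + " and ".join(conditions) + " then grade " + grade)
print(classify(TOY_TREE, {"visual": 9, "active": 7}))
```

In this representation the number of extracted rules always equals the number of leaves, because each rule is the conjunction of the tests along one path.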
No. 1
2
3
4
5
Table VI Extracted Pattern for grade A Pattern If sequential score > 2 and sequential score 4 and sequential score 6 and sensing score > 6 age 2 and sequential score 4 and sensing >4 and visual score > 9 and age > 1 If sequential score > 2 and sequential score 2 and sequential score 1 sequential score > 2 and sequential score