Automatised Examination of Programming Courses – Design, Development and Evaluation of the AutoExam Online System

Stefan Göransson, Department of Computer and Systems Sciences, Stockholm University, Sweden
Fredrik Sköld, Department of Computer and Systems Sciences, Stockholm University, Sweden
Peter Mozelius, Department of Computer and Systems Sciences, Stockholm University, Sweden

Abstract: Traditional hand-written exams are still the main assessment method in programming courses at Swedish universities. Several research studies indicate that computer-based examination is a more natural environment for student examination and that the most authentic form of examination for programming courses is to solve problems at a computer. The aim of this study is to describe the development of the web-based AutoExam system and to discuss how the software might contribute to modern programming education. AutoExam is designed and deployed as an online system, and the artefact is constructed in the Python programming language with the Django framework, using a three-fold Model-View-Controller division. The main approach of this study is the use of methods defined in Design Science, in order to cover all the phases in the design, implementation, evaluation and communication of the AutoExam software artefact. The implemented automatised assessment system was evaluated in test examinations by students with basic skills in the Python programming language. Semi-structured interviews were conducted with two programming teachers at the Department of Computer and Systems Sciences at Stockholm University. Answers from the students' tests were analysed using the one-way ANOVA method. The differences were not statistically significant, but answers generated in the AutoExam environment scored slightly better. Both interviewed teachers posit that there are positive aspects of computer-generated examination answers from both the teachers' and the students' point of view. Our recommendation is that the kind of automatised examination of programming skills described in this article ought to be tested and evaluated in larger student groups and over a longer time period than was possible in this limited study.

Keywords: Automated examination, Technology Enhanced Learning, Programming education, Design Science, Computer Science, AutoExam

1. Introduction

Amongst universities and technical institutes in Sweden, it is still common practice to examine programming courses at the introductory level with traditional hand-written exams as the main assessment. This is normally accompanied by several assignments with more practical programming exercises that are submitted separately during the course, before the final written exam. This dual-headed setup is motivated by the fact that programming courses should assess two different things: programming skills and theoretical knowledge. Assessing the students' knowledge of Computer Science and programming theory has seldom met any objection, but the fact that skills in code construction are tested by paper programming in written exams has been questioned several times and in several countries. "Having to produce working code during the final exam exhibits the student's actual programming ability" (Medley, 1998)

"In general we can still see that programming courses have examination on paper instead of computer-aided exams. The laboratory work is done at the computer. Why not the exam? The most authentic form of examination for programming courses should be at the computer" (Jonsson et al., 2002)

Despite the suggestions from researchers, who have posited that a change of environment between teaching and practicing on the one hand and examination on the other creates unfair conditions for the examined students (English, 2002), there are still few signs of change at Swedish universities.

1.1 Paperless Examination

A study by Medley (1998) indicates that examinations performed in an online environment tend to mitigate the unnatural setting of hand-written exams, even if there were no statistical differences in results between the compared environments. Cassidy & Gridley (2005) state that an examination performed in an online environment generates less examination-related anxiety and lower levels of perceived threat in the examination situation, compared to an examination written by hand. This was especially the case for the top 25% of the studied student group. In other parts of the world, such as China, the use of web-based examinations is widespread and has been so for some time. One reason seems to be the campaign for education in basic computer skills that was launched in the country (Zhemming et al., 2003). In China, Paperless Examination is an accepted and widely used term that can be applied to any kind of examination or assessment taking place in a computerised, and often online, environment. In this approach, paperless and automatised examination with auto-grading has been developed as an effective solution for mass education (Zhemming et al., 2003). This study explores whether the same basic approach can be used to design and develop a tool for automatised assessment, and finally evaluates whether this might contribute to improving the quality of modern technology enhanced assessment.

1.2 Aim

The aim of this study is to describe the design, implementation and evaluation of the web-based AutoExam system and to discuss how the software might be useful in programming education.

1.3 Constraints

This study focuses on the development and evaluation of the AutoExam tool. No analysis is done on web security issues or the online authentication process, both of which must be taken care of before the system can be used in real course examination contexts.

2. The AutoExam System

AutoExam is designed, developed and deployed as an online system (AutoExam, 2012). The artefact is built in the Python programming language using the Django framework, with a three-fold Model-View-Controller approach (Django Project, 2013), and the description of the system below is divided into these three parts or layers.

2.1 Model

The model layer of the AutoExam system consists of two separate model classes. A separate Question class holds the questions of the actual examination, which facilitates the scalability of the artefact. Furthermore, this enables the simultaneous use of examinations in other programming languages, since each instantiation of the Question class can be reached and manipulated independently. The questions used in the examination are stored as UTF-8 encoded strings in a Django model text field.
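A minimal sketch of what such a model class could look like in Django is given below; the field names are our assumptions for illustration and are not taken from the actual AutoExam source.

from django.db import models

class Question(models.Model):
    # The question, stored as a UTF-8 encoded string in a
    # Django model text field.
    text = models.TextField()
    # Hypothetical field: tagging each question with a programming
    # language lets examinations in several languages coexist.
    language = models.CharField(max_length=30, default='python')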

The other part of the model layer is the examination class, named Tester. An object of this class is instantiated each time a volunteer uses the randomly created disposable password to log in to the system. This keeps the workload of the database to a minimum, since no unused data is stored, and the workload of the facilitators is kept to a minimum as well.
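A corresponding sketch of the Tester class, again with assumed field names, could be:

from django.db import models

class Tester(models.Model):
    # Address to which the randomly created disposable password is sent.
    email = models.EmailField()
    # One-time password used to log in; an object is only created
    # when a volunteer actually logs in, so no unused data is stored.
    password = models.CharField(max_length=20)
    # Login time, the start of the 30-minute examination time frame.
    started_at = models.DateTimeField(auto_now_add=True)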

2.2 View

In the view layer of a Django application, HTML templates display results from the system's Python sections. The AutoExam artefact uses these templates to display the questions and the data saved from the test examinations, stored in a database. Every submitted answer is stored as a simple text string in the database. This string is fetched as a Python variable, which is loaded as the value of the textarea intended for editing the examination answers. If examinees wish to save their changes, the save button triggers a function that converts the entered input into a UTF-8 encoded string and simply replaces the previously saved submission.
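As an illustration of this round-trip, a Django view could look roughly as follows; the Answer model, the template name and the form field name are hypothetical, not the actual AutoExam code.

from django.shortcuts import render
from .models import Answer  # hypothetical model holding one answer string

def edit_answer(request, question_no):
    answer = Answer.objects.get(question_no=question_no)
    if request.method == 'POST':
        # The save button posts the textarea content; the new string
        # simply replaces the previously saved submission.
        answer.text = request.POST['source']
        answer.save()
    # The stored string is handed to the template, where it is loaded
    # as the initial value of the editing textarea.
    return render(request, 'question.html', {'answer': answer.text})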

2.3 Controller

There are a total of six functions in the controller layer, of which two are used only to handle user authentication and logout. Two other functions handle the cookie-controlled time frame. The startScreen function creates the HTML form used to submit the email address to which the authentication credentials are sent. This function also instantiates new user objects in the database and assigns randomly created passwords to them. The q function handles all the interactions between the examinee, the AutoExam software and the database holding the stored data. After writing and editing the code, the examinee can use the button labelled Save to store the changes to the database.
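A simplified sketch of how a function like startScreen could create and mail a disposable password using Django's standard utilities is shown below; the actual AutoExam implementation may differ.

from django.core.mail import send_mail
from django.utils.crypto import get_random_string
from .models import Tester  # hypothetical model, see section 2.1

def start_screen(email):
    # Instantiate a new user object and assign a randomly
    # created disposable password to it.
    password = get_random_string(8)
    Tester.objects.create(email=email, password=password)
    # Send the authentication credentials to the submitted address
    # (sender address here is a placeholder).
    send_mail('AutoExam login', 'Your one-time password: ' + password,
              'noreply@autoexam.example', [email])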

2.4 JavaScript

The AutoExam system uses several JavaScript functions that need to cooperate in order to make the artefact perform satisfactorily. The most central of these are the functions controlling the syntax highlighting used to simulate the features of Python's integrated development environment IDLE. In the early AutoExam prototype this was handled in the controller layer, and the highlighting was only re-rendered each time the source code was saved at database level, with a subsequent page load or reload. To be able to update the syntax highlighting instantly, it has to be carried out as close to the user's view as possible, which makes a JavaScript solution the natural choice. To connect the syntax highlighting functions to the source code entered by the examinee, the browser's built-in key release event is connected to the text area designated for entering code. This arrangement calls the syntax highlighting functions every time a key on the examinee's keyboard is released.

3. Methodology

The main approach in this study has been the use of methods defined in Design Science, since the AutoExam project concerns the development and evaluation of a software artefact. There is nothing contradictory between the choice of Design Science and the use of social science research methods: interviews, observations and surveys can be used to produce a specification of requirements and to support the evaluation of artefacts (Johannesson & Perjons, 2012).

Design Science is a scientific research method based on the idea that problems can be solved using artefacts, where artefacts may be physical items or entities such as blueprints or digital systems (Johannesson & Perjons, 2012). Design Science should try to enhance the possibilities of humans and organisations by designing and developing new and innovative artefacts (Hevner et al., 2004). This is an approach that originates from engineering (Simon, 1996) and contains methods for developing and evaluating artefacts iteratively and incrementally (Johannesson & Perjons, 2012).

A set of guidelines for how Design Science should be applied to develop artefacts has been created by Peffers et al. (2007). The main idea is to use a process model consisting of six firmly defined activities:

• Problem identification and motivation
• Define objectives of a solution
• Design & development
• Demonstration
• Evaluation
• Communication

To collect data for the first two steps, semi-structured interviews were conducted with two programming teachers at the Department of Computer and Systems Sciences at Stockholm University. Based on the initial data collection, a software prototype was designed, developed and demonstrated during October – December 2012. For the evaluation, a total of 30 examinations were performed during December 2012 and January 2013, of which 18 were executed in the AutoExam system and the remaining 12 were conducted as traditional pen and paper exams. Informants and software testers in the evaluation sessions were students from Stockholm University, Umeå University and Linköping University in Sweden. The sixth and final step, communication, will be implemented as a series of articles, a thesis and written feedback to the informants.

Figure 1: The steps in the iterative Design Science Research Method model by Peffers et al.

4. Questions for the Test Examinations

The test examinations consist of three separate questions at a basic level of difficulty, suited to students taking their first introductory programming course at university level. The questions should be answered either by editing the existing Python source code or by creating new source code. Our aim was to cover as wide a scope of knowledge as possible within the constraint of only using three questions. Both the early prototype and the fully implemented AutoExam artefact only use questions in Swedish; the translation to English has been made to increase the readability of this article.

Question 1: Extend the list below with two more items of any type. Print out the length of the list.

list = ['one', 2]

Question 2: Print all of the given variables, var1, var2, var3, in one print statement.

var1 = 'string'
var2 = 123
var3 = 345.7

Question 3: Iterate over the given tuple and print the values, one value per line.

tup = (1, 2, 3, 4, 5)
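For illustration, one possible set of correct answers is sketched below. This is our own sketch, assuming Python 3, and is not part of the original exam material.

# Question 1: extend the list and print its length
list = ['one', 2]
list.extend(['three', 4.0])  # two more items of any type
print(len(list))             # prints 4

# Question 2: all three variables in one print statement
var1 = 'string'
var2 = 123
var3 = 345.7
print(var1, var2, var3)

# Question 3: one tuple value per line
tup = (1, 2, 3, 4, 5)
for value in tup:
    print(value)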

5. Test Examinations

To evaluate the AutoExam system as an assessment tool, test examinations based on the questions in chapter 4 were conducted with selected students. Exactly the same questions were given in both test environments.

5.1 Paper based examinations

The primary aim when designing the paper-based examination and its surrounding environment was to recreate the setting of a traditional pen and paper examination. A secluded room was arranged so that disturbing surrounding elements were kept to a minimum. The students were presented with the general outline of the examination before they randomly picked one of the sealed envelopes containing the exam and a questionnaire to assess their previous programming experience. A time limit was set to 30 minutes in total, starting when the students opened their envelopes. One of the authors was present as a supervisor to monitor the session and to answer any questions that might arise.

5.2 Examinations conducted in AutoExam

The students who had volunteered to take the examination in the AutoExam environment were given a randomly generated identification number with an accompanying password that gave access to the system. Once a student had used the password to access the system, a timer counting down from 30 minutes started. After the time had expired, students were still able to navigate within the system, but could no longer save any changes to their answers. The main reason to deploy the AutoExam prototype online was to make it easy for students to take the examination at any time and from anywhere.

5.3 Test Examination Results

Answers from the test examinations were analysed using the one-way ANOVA method, as sketched below. The results were not statistically significant, but question one and question two showed a slightly better average score for examinations taken in AutoExam. Question three, on the other hand, showed a slight advantage for examinations conducted by hand with pen and paper.
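Such an analysis can be reproduced with standard statistical tooling. The sketch below is not the original analysis script; it assumes that the scores for one question are collected in one list per examination environment.

from scipy.stats import f_oneway

def compare_environments(autoexam_scores, paper_scores):
    # One-way ANOVA comparing the per-question scores from the two
    # examination environments. Returns the F statistic and the
    # p value; p values above 0.05 correspond to the non-significant
    # differences reported in this study.
    return f_oneway(autoexam_scores, paper_scores)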

6. Findings, Conclusions and Recommendations

The findings show that there is no significant difference between automatised examination and traditional pen and paper assessment of the tested programming skills. Since several studies indicate that computer-based examination is a more natural environment for students (Medley, 1998; Jonsson et al., 2002; English, 2002), our recommendation is that the assessment of programming skills should be conducted as close as possible to the programming situation used in the programming courses.

Both interviewed teachers mentioned that it sometimes is hard to read the handwritten answers when the exams are corrected, and that it happens that sentences or paragraphs are classified as unreadable. The male teacher mentioned the halo effect (Sutherland, 1992; Michelioudakis, 2011) and that there might be a risk that bad handwriting creates a general negative attitude towards the examinee. The female teacher thinks that computer-generated answers could save time for the correcting teacher and avoid misunderstandings. Neither of the interviewed teachers finds automatised computer assessment to be an issue that started with Generation Y (Mozelius, 2012) or the Nintendo Generation (Guzdial & Soloway, 2002), since the matter has been discussed at the Department of Computer and Systems Sciences at Stockholm University since the 1990s.

Before the introduction of a system like AutoExam, authentication and multi-user security problems must be investigated, but these issues might be solved in the same way as online bank systems and web shops have been made secure. There will initially be some costs for the implementation of systems for automatised examination, but in the longer term they will be cost effective, especially if the software is enhanced with auto-correction and auto-grading. Our recommendation is that automatised examination of programming skills should be tested and evaluated in programming courses with larger groups and over a longer time than has been the case in this limited study.

7. Future Work

An interesting next step would be to implement a fully automatised grading feature in the system. Other fields to investigate are the examination-related web security and the online authentication of AutoExam users.

References

1. AutoExam (2012) "AutoExam, an online platform for automatised assessment of programming courses", http://bsg.alwaysdata.net/ (retrieved 01/02/2013)

2. Cassidy, J.C. & Gridley, B.E. (2005) "The effects of online formative and summative assessment on test anxiety and performance", Journal of Technology, Learning and Assessment, 4(1)

3. Django Project (2013) "Django documentation, General; Django appears to be a MVC framework, but you call the Controller the 'view', and the View the 'template'. How come you don't use the standard names?", https://docs.djangoproject.com/en/dev/faq/general/#faq-mtv (retrieved 14/12/2012)

4. English, J. (2002) "Experience with a Computer-Assisted Formal Programming Examination", ITiCSE'02, June 24-26, 2002, Aarhus, Denmark

5. Guzdial, M. & Soloway, E. (2002) "Log on education: teaching the Nintendo generation to program", Communications of the ACM, 45(4), 17-21

6. Hevner, A.R., March, S.T., Park, J. & Ram, S. (2004) "Design Science in Information Systems Research", MIS Quarterly, 28(1), March 2004

7. Johannesson, P. & Perjons, E. (2012) "A Design Science Primer", 1st edition, CreateSpace, ISBN: 1477593942

8. Jonsson, T., Loghmani, P. & Nadjm-Tehrani, S. (2002) "Evaluation of an Authentic Examination System (AES) for Programming Courses", http://www.ida.liu.se/~snt/teaching/HGUR-sept02.pdf (retrieved 04/01/2013)

9. Medley, M.D. (1998) "Online Finals for CS1 and CS2", ITiCSE 1998, Dublin, Ireland

10. Michelioudakis, N. (2011) "Social Psychology and ELT – The HALO Effect", ETNI Blog, http://asketni.blogspot.se/2011/07/nick-michelioudakis-b.html (retrieved 01/02/2013)

11. Mozelius, P. (2012) "The Gap between Generation Y and Lifelong Learners in Programming Courses – how to Bridge Between Different Learning Styles?", EDEN 2012, Porto, Portugal

12. Peffers, K., Tuunanen, T., Rothenberger, M.A. & Chatterjee, S. (2007) "A Design Science Research Methodology for Information Systems Research", Journal of Management Information Systems, 24(3), 45-77

13. Simon, H.A. (1996) "The Sciences of the Artificial" (3rd ed.), MIT Press, Cambridge, MA

14. Sutherland, S. (1992) "Irrationality: the Enemy Within", Constable and Company, London

15. Zhemming, Y., Liang, Z. & Guohua, Z. (2003) "A Novel Web-based Online Examination System For Computer Science Education", ASEE/IEEE Frontiers in Education Conference, ISBN: 0-7803-7961-6