
2011 International Symposium on Humanities, Science and Engineering Research

Measuring Usability of Educational Computer Games Based on The User Success Rate

Marina Ismail, Norizan Mat Diah, Suzana Ahmad, Nor Ashikin Mohamad Kamal and Mohd Khairulnizam Md Dahari
Faculty of Computer and Mathematical Sciences
Universiti Teknologi Mara
Shah Alam, Malaysia
marina@tmsk.uitm.edu.my, norizan@tmsk.uitm.edu.my, suzana@tmsk.uitm.edu.my, nor_ashikin@tmsk.uitm.edu.my

Abstract- Children learn through play [1]. According to Crawford [2], an educational game is often a child’s first contact with the computer. It is therefore important that computer games for children are well designed and usable. Usability testing can also be used to assess a learner’s motivation towards learning. Assessing usability with children requires special consideration: children must be allowed to play freely and express their opinions while being observed. This study explores the measurement considerations for the usability of an educational game called Jelajah. Jelajah was designed to teach pre-school children Malay words and was developed on the basis of solid research on educational approaches. Five pre-school children were selected for this study. During the usability testing session, the children were observed and their behaviors were recorded in an observation checklist. The session was videotaped for further observation. A post-test questionnaire was administered after the session to gauge the children’s satisfaction in using the game for learning. The data collected were both quantitative and qualitative, and were later quantified to measure the effectiveness, efficiency and satisfaction of the software. Based on the analysis, the children showed interest and were highly motivated to learn when using an educational game for learning.

Keywords- measuring; usability; educational computer game

978-1-61284-4577-0265-5/11/$26.00 ©2011 IEEE

I. INTRODUCTION

Nowadays, computers are no longer used only by domain experts such as engineers and programmers. They are used by almost everyone, including children, who are exposed to computers and information technology at an early age [1]. For most children, the first contact with the computer is through some sort of educational game [3]. According to Crawford [2], play is essential to children’s learning. It is therefore important that computer games for children are well designed and have good usability. The continual exchange of information between user and tool is very important in an interactive educational game, and any difficulty arising during use of the tool can hinder effective learning. Proper usability testing can also reveal the learner’s motivation towards learning [4]. This paper discusses a usability measurement method that can be applied to children through observation.

II. USABILITY TESTING

One of the best-known definitions of usability is by Jakob Nielsen, who stated that “usability is a quality attribute that assesses how easy user interfaces are to use”, making it possible for users to carry out tasks in a clear, transparent, agile and useful way [5]. Nielsen [10] considers that the usability of a system has five quality components:

1) Learnability: How easy is it for users to accomplish basic tasks the first time they encounter the design?
2) Efficiency: Once users have learned the design, how quickly can they perform tasks?
3) Memorability: When users return to the design after a period of not using it, how easily can they reestablish proficiency?
4) Errors: How many errors do users make, how severe are these errors, and how easily can they recover from the errors?
5) Satisfaction: How pleasant is it to use the design?

Bleken et al. [6] state that usability evaluation methods fall into two categories: analytical methods and empirical methods. In analytical methods, the evaluation is conducted by usability experts who put themselves in the user’s position. Empirical methods consist of usability tests and questionnaires with real users. According to Adikari and McDonald [7], a key reason for poor usability in products is that usability perspectives are insufficiently specified in the product requirements specification.

III. SAMPLE OF USABILITY TESTING

The representative sample for this study comprises five kindergarten children aged five to six years. Conducting a usability test with children as the sample is not an easy task, and many ethical issues must be taken into consideration. However, usability evaluation with children has proven useful, as children too can deliver useful usability findings [8]. A survey collected basic demographic information from the respondents to ensure the sample’s homogeneity. Of the five children selected, two were boys and three were girls. Two were five years old and three were six years old. All of them began using the computer at the age of four. Based on interviews with the parents, most of the children use the computer to play games, and only two of them also browse an encyclopedia. They all use computers both at home and at school, and they are frequent users, using the computer every week. The sample is homogeneous, as the children are in the same age group and have the same computer and Malay-language literacy.

IV. MEASURING USABILITY TESTING

One of the research objectives of this study is to measure the level of effectiveness, efficiency and satisfaction of an educational game. ISO 9241-11 [9] defines usability as “The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”.

The three prominent components in this definition are:

1) Effectiveness: The accuracy and completeness with which users achieve specified goals.
2) Efficiency: The accuracy and completeness of goals achieved in relation to resources.
3) Satisfaction: Freedom from discomfort and positive attitudes toward the use of the system.

Based on this definition, effectiveness and efficiency are the two components with an objective character, while satisfaction is subjective. Effectiveness and efficiency can therefore be measured using the usability measurement introduced by Nielsen [10], which analyzes the user’s success rate, the simplest usability metric. Nielsen [10] defines success rate as the percentage of tasks that users complete correctly. This is an admittedly coarse metric: it does not tell us why users fail or how well they perform the tasks they complete. Success rates are, however, easy to collect and measure. If users cannot accomplish their target task, it is deemed a failure. In this study, user success is derived from the observation checklist used by the researcher while performing the usability test, and the measurement of satisfaction is taken from a post-test questionnaire with the children.

A. Effectiveness

Effectiveness measures the capability of the user to complete a task within the application. For this study, effectiveness is measured for every screen, and all tasks that need to be completed are listed.

If a task is completed successfully, it is marked ‘Yes’ and given full credit (100%). A task that is not completed successfully is marked ‘No’ and given zero credit (0%). Unsuccessful tasks include cases where the child gives up, is unable to complete a task on the first attempt, or completes the task incorrectly. Partial credit is also available in the form of a ‘Partial’ mark, which carries 50% credit. It is left to the discretion of the researcher to decide whether a mistake merits a ‘Partial’ mark rather than a ‘No’ mark. Table I lists the tasks that were evaluated for effectiveness. The collected data were then summarized and analyzed for effectiveness using success rate evaluation.
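The scoring scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `CREDIT` and `success_rate` and the example marks are assumptions made for this example, not part of the original study.

```python
# Credit per checklist mark, as described in the text:
# 'Yes' = 100%, 'Partial' = 50%, 'No' = 0%.
CREDIT = {"Yes": 1.0, "Partial": 0.5, "No": 0.0}


def success_rate(marks):
    """Return the success rate, in percent, for a list of checklist marks."""
    if not marks:
        raise ValueError("at least one mark is required")
    total_credit = sum(CREDIT[mark] for mark in marks)
    return 100.0 * total_credit / len(marks)


# Hypothetical marks for one task across five children (illustration only).
example_marks = ["Yes", "Partial", "No", "Yes", "Yes"]
print(f"{success_rate(example_marks):.1f}%")
```

Keeping the credit values in a single mapping makes the researcher's discretionary 'Partial' weighting explicit and easy to change.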

TABLE I. LIST OF TASKS FOR THE EFFECTIVENESS EVALUATION

| Screen / Evaluation element | Child 1 | Child 2 | Child 3 | Child 4 | Child 5 |
|---|---|---|---|---|---|
| Start Screen | | | | | |
| The child has no trouble navigating with the keyboard | Yes | No | No | Yes | No |
| The child has no trouble finding the start menu | Yes | No | Partial | Yes | Yes |
| Graphics used on the page attract the child | Yes | Yes | Yes | Yes | Yes |
| Music used attracts the child | Yes | Yes | Yes | Yes | Yes |
| Size of the game window opened is good enough | Partial | Partial | Yes | Yes | Yes |
| Character Selection Screen | | | | | |
| The child succeeded in selecting the desired character | Yes | Yes | Yes | Yes | Yes |
| The child has no trouble navigating with the keyboard when choosing a character | Yes | No | No | Yes | No |
| Exploration Game Screen | | | | | |
| The child has no trouble moving the character around with the keyboard | Yes | Partial | No | Yes | No |
| The child knows what to do during game play | Yes | Yes | No | Yes | No |
| The child is focused on the game | Yes | Yes | Yes | Yes | Yes |
| The child shows a positive reaction during game play | Yes | Yes | Partial | Yes | Yes |
| The child used the jump key only where it should be used | Yes | Yes | Partial | Yes | No |
| The child clearly knows the obstacles that should be avoided | Yes | Yes | No | Yes | No |
| The child clearly knows what should be collected | Yes | No | Yes | Yes | No |
| The child understood the overall character exploration concept | Yes | Yes | No | Yes | No |
| Word Puzzle Game Screen | | | | | |
| The child has no problem completing the word puzzle | Yes | Yes | Yes | Yes | Partial |
| The child knows how to move the cursor using the keyboard arrows | Yes | Yes | Partial | Yes | No |
| The child knows how to confirm the selection by pressing the Enter button | Yes | Yes | Yes | Yes | Yes |
| The child knows whether the answer is correct or not | No | Yes | Partial | Partial | No |
| The child understood the overall word puzzle concept | Partial | Yes | Yes | Yes | No |
| The child is aware of which letter is currently pointed to for matching | Yes | Yes | Yes | Yes | Yes |
| The child succeeded in matching all letters correctly | No | Yes | Yes | Yes | No |

The summarized data are shown in Table II.

TABLE II. SUMMARY OF EFFECTIVENESS ANALYSIS TABLE

| Answer | Child 1 | Child 2 | Child 3 | Child 4 | Child 5 | Subtotal |
|---|---|---|---|---|---|---|
| Yes | 18 | 16 | 11 | 21 | 9 | 75 |
| Partial | 2 | 2 | 5 | 1 | 1 | 11 |
| No | 2 | 4 | 6 | 0 | 12 | 24 |
| TOTAL: | | | | | | 110 |

Table I shows 22 tasks with 5 attempts per task (one per child), totaling 110 task attempts. Of these, 75 attempts were successful and 11 were partially successful. The 24 unsuccessful attempts contribute nothing to the score, since 24 x 0% = 0. The overall effectiveness rating for this set of tasks is therefore obtained from the following equation:

Effectiveness (%) = (Yes + (Partial x 0.5)) / Total x 100%
= (75 + (11 x 0.5)) / 110 x 100%
= 73.18%

From the above equation, the usability testing with children indicates that the effectiveness rating for Jelajah is approximately 73%.

B. Efficiency

Efficiency measures the smoothness with which a task is completed. The method and algorithm used to measure efficiency are the same as those used for effectiveness. If a task is completed smoothly, it is marked ‘Yes’ and given full credit (100%). A task that is not completed smoothly is marked ‘No’ and given zero credit (0%). Unsuccessful tasks include cases where the child requires assistance from the researcher or completes a task only after several attempts.

TABLE III. ANALYSIS OF THE EFFICIENCY USING SUCCESS RATE EVALUATION

| Screen / Evaluation element | Child 1 | Child 2 | Child 3 | Child 4 | Child 5 |
|---|---|---|---|---|---|
| Starting Screen | | | | | |
| The child selected the correct menu to start the game on the first try | Yes | Yes | Yes | Yes | Yes |
| The child can easily recover from errors or mistakes that happened | Yes | Partial | Partial | Yes | Partial |
| Character Selection Screen | | | | | |
| A bigger picture of each character helps the child decide which character to choose | Yes | Yes | Yes | Yes | Yes |
| Exploration Game Screen | | | | | |
| The child completed the first level on the first try | Yes | No | No | Yes | No |
| Errors and mistakes made by the child were minimal | Partial | No | No | Yes | No |
| The child knows how to recover from errors and mistakes | Partial | Yes | No | Yes | No |
| The child succeeded in collecting all letters before the end of each stage | Yes | Yes | Partial | Yes | No |
| Level of interaction between researcher and the child was minimal | Yes | Partial | No | Yes | No |
| Guidance and help from the researcher was minimal | Partial | Partial | No | Yes | No |
| Word Puzzle Game Screen | | | | | |
| The child completed the word puzzle on the first try | Yes | Yes | No | Yes | No |
| Errors and mistakes made by the child were minimal | Yes | Yes | Partial | Yes | No |
| The child knows how to recover from errors and mistakes | Yes | Yes | Yes | Yes | Partial |

From Table III, all results can be summarized in a simpler efficiency analysis table, shown in Table IV.

TABLE IV. SUMMARY OF EFFICIENCY ANALYSIS TABLE

| Answer | Child 1 | Child 2 | Child 3 | Child 4 | Child 5 | Subtotal |
|---|---|---|---|---|---|---|
| Yes | 9 | 7 | 3 | 12 | 2 | 33 |
| Partial | 3 | 3 | 3 | 0 | 2 | 11 |
| No | 0 | 2 | 6 | 0 | 8 | 16 |
| TOTAL: | | | | | | 60 |

Table III shows 12 tasks with 5 attempts per task, totaling 60 task attempts. Of these, 33 attempts were successful and 11 were partially successful. The 16 unsuccessful attempts contribute nothing to the score, since 16 x 0% = 0. The overall efficiency rating for this set of tasks is therefore obtained from the following equation:

Efficiency (%) = (Yes + (Partial x 0.5)) / Total x 100%
= (33 + (11 x 0.5)) / 60 x 100%
= 64.17%

From the above equation, the usability testing with children indicates that the efficiency rating for Jelajah is approximately 64%.

C. Satisfaction

Satisfaction is measured using a post-test questionnaire with the children. In other words, the children are asked to answer a few questions after completing all test scenarios. The questions and answers are structured using a 5-point Likert scale, with points ranging from 1 to 5 as listed in Table V.

On this scale, 1 is the most negative response and 5 the most positive. Each question answered by 5 children therefore offers a maximum of 25 points, and the 7 questions together offer a total of 175 points, which corresponds to 100% satisfaction. The answer points given by the children total 145 (Table VI). The satisfaction rating for the overall game is obtained from the following equation:

Satisfaction (%) = Answer Point / Total Point x 100%
= 145 / 175 x 100%
= 82.86%

From the above equation, the usability testing with children indicates that the satisfaction rating for Jelajah is approximately 83%.
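The three ratings above can be re-derived from the reported subtotals. The following Python sketch is illustrative only: the function names are assumptions made for this example, and the inputs are the subtotals reported in Tables II, IV and VI.

```python
def success_rate_from_counts(yes, partial, no):
    """Success rate in percent: 'Yes' counts fully, 'Partial' counts half."""
    total = yes + partial + no
    if total == 0:
        raise ValueError("no task attempts recorded")
    return 100.0 * (yes + 0.5 * partial) / total


def satisfaction_rate(question_subtotals, respondents, max_point=5):
    """Satisfaction in percent: points given over the maximum possible points."""
    max_total = len(question_subtotals) * respondents * max_point
    return 100.0 * sum(question_subtotals) / max_total


# Subtotals as reported in the paper.
effectiveness = success_rate_from_counts(yes=75, partial=11, no=24)  # Table II
efficiency = success_rate_from_counts(yes=33, partial=11, no=16)     # Table IV
satisfaction = satisfaction_rate([23, 20, 22, 17, 21, 20, 22], respondents=5)  # Table VI

print(f"Effectiveness: {effectiveness:.2f}%")
print(f"Efficiency:    {efficiency:.2f}%")
print(f"Satisfaction:  {satisfaction:.2f}%")
```

Because 'No' marks carry zero credit, they affect the result only through the denominator, which is why the text can ignore them in the numerator.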

TABLE V. LIKERT SCALE POINTS TABLE USED IN POST QUESTIONNAIRE

| Answer Options | Likert Scale |
|---|---|
| Yes, very much | 5 points |
| Yes | 4 points |
| Moderate | 3 points |
| Not really | 2 points |
| Not at all | 1 point |

All questions were designed to obtain input from the children on how they feel about the game, whether they like it and whether it was easy to play. Table VI shows the post-test questionnaire given to the children together with their answers in Likert scale points.

TABLE VI. LIKERT SCALE POINTS TABLE USED IN POST QUESTIONNAIRE WITH CHILDREN

| Post Test Questionnaire | Child 1 | Child 2 | Child 3 | Child 4 | Child 5 | Subtotal |
|---|---|---|---|---|---|---|
| The game was fun | 5 | 5 | 4 | 5 | 4 | 23 |
| The game was easy to play | 5 | 5 | 1 | 5 | 4 | 20 |
| I like the game characters | 5 | 4 | 4 | 5 | 4 | 22 |
| I can control the character movement easily | 5 | 4 | 2 | 4 | 2 | 17 |
| The word puzzle was easy | 5 | 5 | 4 | 5 | 2 | 21 |
| I want to play this game again | 5 | 4 | 2 | 5 | 4 | 20 |
| I would like to play this game at home | 4 | 5 | 4 | 5 | 4 | 22 |
| TOTAL: | | | | | | 145 |

D. Single Metric for Usability

To measure the overall usability (effectiveness, efficiency and satisfaction) of the application, each usability component is expressed as a percentage. By averaging these three scores, the usability of a product can be expressed as a single number between 0 and 100. Because the three components are already percentages, no further multiplication by 100 is needed. The single metric for the usability of Jelajah is therefore derived from the following equation:

Usability (%) = (Effectiveness + Efficiency + Satisfaction) / 3
= (73.18 + 64.17 + 82.86) / 3
= 73.40%

From the above equation, the usability testing with children indicates that the usability level for Jelajah is approximately 73%.

V. CONCLUSION

A usability test using the observation method was successfully carried out on Jelajah with children as the test users. From the observation checklist, ratings for effectiveness and efficiency can be measured by analyzing the user’s success rate. Based on the analysis of the success rate, the level of effectiveness of Jelajah is 73% and the level of efficiency is 64%. Analysis of the post-test questionnaire shows that the level of satisfaction is 83%.

By averaging the rates of effectiveness, efficiency and satisfaction, the usability level for Jelajah is approximately 73%. This score is the single metric that can be used to answer some fundamental questions about Jelajah, such as:

• How usable is the game?
• How do we rate against the competition?
• How much more usable must the game be?
• How will we know if the game is more usable?

The purpose of this study is to understand children’s acceptance of a product and how children evaluate whether a product is usable, fun and user friendly. The metric is well suited to assessing long-term progress on a project and gives us a way to measure progress toward better, more usable designs in the future.
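As a minimal sketch of the averaging step described in Section IV-D, the single usability metric can be computed as the arithmetic mean of the three component percentages. The function name `usability_score` and the range check are assumptions made for this example; the input values are the ratings reported in the paper.

```python
from statistics import mean


def usability_score(effectiveness, efficiency, satisfaction):
    """Average three component scores (each already in percent) into one metric."""
    scores = (effectiveness, efficiency, satisfaction)
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range: {score}")
    return mean(scores)


# Component ratings as reported for Jelajah.
score = usability_score(73.18, 64.17, 82.86)
print(f"Usability: {score:.2f}%")
```

An unweighted mean treats the three components as equally important; a different weighting would be a design decision that the paper does not make.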

REFERENCES

[1] N.M. Diah, M. Ismail, S. Ahmad and M.K.M. Dahari, “Usability testing for educational computer game using observation method”, Proceedings of International Conference on Information Retrieval & Knowledge Management (CAMP), 2010, pp. 157-161.
[2] C. Crawford, Chris Crawford on game design. Indianapolis: New Riders, 2003.
[3] W. Barendregt, M. M. Bekker, and M. Speerstra, “Empirical evaluation of usability and fun in computer games for children”, in Proceedings of Human-Computer Interaction INTERACT-03', IOS Press, Zürich, Switzerland, 2003, pp. 705-708.
[4] P. Zaharias, “Developing a Usability Evaluation Method for e-Learning Applications: Beyond Functional Usability”, International Journal of Human-Computer Interaction, 2009, Vol. 25 (1), pp. 75-98.
[5] J. Nielsen (2003, August 25). Usability 101: Introduction to Usability. Alertbox.
[6] A. Bleken, D. Bruggeman, and W. Marx, “Usability Evaluation of a Learning Management System”, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010, pp. 1-9.
[7] S. Adikari and C. McDonald, “User and Usability Modeling for HCI/HMI: A Research Design”, Proceedings of International Conference on Information and Automation ICIA 2006, 2006, pp. 151-154.
[8] M. Ismail, A multimedia courseware (BACA) for learning Bahasa Melayu for preschool using the Vygotsky’s approach. PhD thesis, UKM, Bangi, Selangor, 2009.
[9] ISO 9241-11, Ergonomic requirements for office work with visual display terminals (VDTs) - Part 11: Guidance on usability, 1998.
[10] J. Nielsen (2001, August 5). First Rule of Usability? Don't Listen to Users. Alertbox.