
Automatic Evaluation of Learning Objects based on Cross-Entropy of Eye Fixations Minimization

Carlos Lara-Alvarez

Maria Alvarado-Hernandez

Hugo Mitre-Hernandez

CONACYT Research Fellow – CIMAT Zacatecas Av. Universidad 222, La Loma Zacatecas, Mexico, 98068

HCI Lab, CIMAT Zacatecas Av. Universidad 222, La Loma Zacatecas, Mexico, 98068

HCI Lab, CIMAT Zacatecas Av. Universidad 222, La Loma Zacatecas, Mexico, 98068

[email protected]

[email protected]

[email protected]

ABSTRACT
Learning objects (LOs) are important information resources that support traditional learning methods. Evaluating the impact, effectiveness, and usefulness of learning objects requires a theoretically sound, reliable, and valid evaluation tool. This paper presents a cross-entropy metric for comparing LO designs; it uses the information provided by visual fixations measured from a small focus group. The cross-entropy is measured on the test set to assess how accurate the entropy rate constancy principle is in predicting the test data. We conducted an experiment with elementary school children (n=23). Results show that images with lower values of the proposed metric can be read more easily (Mean = 0.746 min/image) than LOs composed of random images (Mean = 0.977 min/image). Hence, the metric is useful for optimizing fluency. This is an important step toward the design of a fully automated tool to evaluate LOs.

CCS Concepts
• Human-centered computing → Heuristic evaluations; Interaction design theory, concepts and paradigms; • Information systems → Multimedia content creation; • Applied computing → Computer-assisted instruction;

Keywords
Eye tracking, learning object, cross entropy.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). Interaccion'17, September 2017, Cancun, Quintana Roo, MEXICO © 2017 Copyright held by the owner/author(s). 978-1-4503-5229-1 / 25/09. $15.00

1. INTRODUCTION
There are many definitions of Learning Objects (LOs) in the literature [1]. For the purposes of this paper the following definition is adopted: “Learning objects are information resources or interactive software used in online learning” [1]. A single image, a page of text, an interactive simulation, or an entire course could all be examples of learning objects. Thousands of LOs are currently available through the web [1]; hence, an automatic tool for evaluating their impact, effectiveness, and usefulness is necessary.

In this paper we are interested in evaluating LOs composed of a sequence of images. The order and content of these images are designed to ensure a given learning objective. When designing such a learning object, every person on the design team could have different ideas of what the student needs or wants. Let us suppose that the content and sequence of a LO are already defined but, for every position in the sequence, there are several options with different graphical designs. The problem of selecting the best image for each position in the LO sequence can be stated as follows: given a set of candidate images $G = \{I_i \mid i = 1, \dots, N\}$, where each image $I_i \in G$ contains the same text information but a different layout design, and possibly different decorative pictures, select the image $I \in G$ that optimizes certain predefined evaluation criteria.

An image of a LO is a semiotic space; that is, meaning is created through an interplay of visual and verbal units [2]. The interdependence between the different units of a semiotic space can be hard to model because attention devoted to a particular element promotes or detracts attention from other elements [3]. To avoid these complexities, the proposed approach focuses on analyzing textual information.

The main contribution of this paper is an entropy-based metric able to compare candidate images. The algorithm requires a set of images and their corresponding areas of interest (AOI); for comparison, equivalent areas must show the same information. The proposed metric can be used to select the best design by comparing the information provided by visual fixations measured from a small focus group against prior knowledge of the approximate time needed to study each area. Hence, it does not require the intervention of experts, and the information is more accurate because it originates directly from students. As shown in the empirical evaluation section, the proposed metric is related to the time spent observing each image (trial duration), in such a way that images with lower values of the proposed metric can be read more easily.

The rest of this paper is organized as follows: section 2 discusses previous approaches to evaluate LOs, section 3 introduces the proposed entropy-based metric, section 4 presents an experiment with students from primary school, section 5 presents the results and discusses the pros and cons of the proposed technique; finally, section 6 concludes this paper.

2. RELATED WORK
In general, LO evaluation approaches can be categorized as:

Indirect. These approaches consider that experts, aided with guidelines or other evaluation instruments, can improve the design and content of LOs [4], [1].

Direct. These approaches obtain information about the quality of a LO from the focus group. A simple strategy is using questionnaires or surveys.

Technological advances have opened the possibility of automating the evaluation of LOs. A widely used sensor for this purpose is the eye tracker, a device capable of recording when and what the user views on a screen. A previous study by the authors of this paper [5] proposes an entropy measurement calculated from the complete image; i.e., a predefined grid divides the image, and the average fixation duration on each cell is used to estimate the entropy. Although this measurement gives good results, it has the disadvantage of imposing a rigid division of the image. Then, in [6], a more flexible scheme that considers different positions and sizes of cells was used. In both cases, the entropy measurement assumes that the probability of seeing an area is equal for any sort of object. In contrast, the approach proposed here considers that different elements (text, pictures) attract students’ attention in different ways; hence, the predefined fixation duration should be different.

For texts in general, a common question is ‘how easily can a reader understand a written text?’ Gray and Leary [7] categorize the variables that affect reading ease into Content, Style, Format, and Organization; they found no way to measure content, format, or organization. In contrast, there are plenty of approaches to measure style; e.g., the Flesch Reading Ease Formula [8] or the Dale-Chall readability formula [9]. The proposed approach is an effort towards a direct evaluation of the format and organization of text-based images.

3. CROSS ENTROPY MODEL TO EVALUATE LEARNING OBJECTS
Let $G$ be a set of images, such that every image $I_i \in G$ is divided into $m$ areas of interest. To evaluate text-based learning objects, those composed only of text and decorative pictures, the proposed approach focuses on the overall fixation duration on text areas. Let us denote by $T_j(I)$ the $j$-th text area of image $I$ and by $T_j$ the $j$-th text area of any image in $G$. Any two images in $G$ share the same information in equivalent text areas; that is, $T_j(I) \sim T_j(I')$ for all $I, I' \in G$. By using the eye data collected from subjects viewing an image $I$, the cross entropy of fixation duration on text is calculated as

$$H^{T}(I, Q) = -\sum_{j=1}^{m} p_j(I) \log q_j, \qquad (1)$$

where $Q = \{q_j \mid j = 1, \dots, m\}$ is the a priori distribution, with $q_j$ being the probability of observing text $T_j$, and the empirical probability $p_j(I)$ is estimated as

$$p_j(I) = \frac{t_j(I)}{\sum_{k=1}^{m} t_k(I)}, \qquad (2)$$

where $t_j(I)$ is the fixation duration, defined as the sum of all fixations on $A_j(I)$. Our hypothesis is that the image $I \in G$ that minimizes $H^{T}(I, Q)$ has a better design; i.e., it increases the reading ease.

To estimate the probability of observing an AOI that only contains text, the entropy rate constancy principle [10] can be used. It states that speakers produce language whose entropy rate is on average constant. This principle also holds for connected text; i.e., sentences shown in context are equally easy to process. Another important prediction can be derived from the entropy rate principle: sentences with higher entropy should have higher reading times [11]. In this paper, the a priori probability $q_j$ for text area $T_j$ is estimated as

$$q_j = \frac{w(T_j)}{\sum_{k=1}^{m} w(T_k)}, \qquad (3)$$

where $m$ is the number of areas containing just text, and $w(T)$ is the number of words contained in $T$.
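The three equations above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the example durations and word counts, and the logarithm base (10 here) are all assumptions, since the text does not state which base is used.

```python
import math


def empirical_distribution(durations):
    """p_j(I) of Eq. (2): share of the total text fixation time spent on each text area."""
    total = sum(durations)
    if total <= 0:
        raise ValueError("total fixation duration must be positive")
    return [t / total for t in durations]


def prior_from_word_counts(word_counts):
    """q_j of Eq. (3): share of all words that text area T_j contains."""
    total = sum(word_counts)
    if total <= 0:
        raise ValueError("total word count must be positive")
    return [w / total for w in word_counts]


def cross_entropy(p, q, base=10):
    """Eq. (1): -sum_j p_j log q_j. The base is an assumption; terms with p_j = 0 add 0."""
    return -sum(pj * math.log(qj, base) for pj, qj in zip(p, q) if pj > 0)


def select_image(candidates, word_counts, base=10):
    """Return the candidate with minimum cross entropy, plus all scores.

    candidates maps a (hypothetical) image name to its accumulated fixation
    durations per text area, listed in the same order as word_counts.
    """
    q = prior_from_word_counts(word_counts)
    scores = {
        name: cross_entropy(empirical_distribution(durations), q, base)
        for name, durations in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: two text areas with 30 and 15 words, two candidate designs.
word_counts = [30, 15]
candidates = {
    "design_A": [20.0, 10.0],  # seconds of fixation on T_1 and T_2
    "design_B": [12.0, 18.0],
}
best, scores = select_image(candidates, word_counts)
print(best, scores)
```

With these made-up numbers, design_A distributes fixation time in the same proportion as the word counts and obtains the lower score, so it is the one selected.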

4. EXPERIMENTS
This study addresses the following research question: Is the aforementioned cross entropy related to the median observing speed for a given image?

Materials. An “EyeTribe” eye-tracking device, model ET1000, with a 60 Hz sampling frequency was used with a screen of 1440 × 960 pixels resolution. The eye tracker was located at a distance of 50 cm from the student’s face. The device was calibrated with OGAMA [12], using twelve calibration points. In addition, this tool was used for generating the image sequences for evaluation.

Participants. Thirty-three fifth grade children from the “Pedro Coronel” Elementary School in Zacatecas, Mexico participated in this study.

Procedure. The experiments consist of two parts: the first one, selection, addresses the problem of identifying the image with the minimum cross entropy for each position in a given sequence. The second one, testing, compares the minimum cross entropy sequence against a random sequence. These parts are outlined in the following paragraphs:

(1) SELECTION. As an effort to implement arts education in primary grades, a group of Mexican teachers wrote a text devoted to explaining basic theater concepts; e.g., actors, director, play, costume design, stage, etc. The text was divided into seven sections, each section explaining a single concept. A graphic designer composed three images for each section, each image having the same text and the same decorative images, but a different graphical background. Finally, equivalent text areas for each position were defined. Nine children of the focus group participated in this part of the study. Each child observed a random sequence; i.e., for each position, an image was selected randomly among the three possible choices, in such a way that each image was observed exactly three times. By using the eye tracking data, the cross entropy of fixation duration on text was calculated by (1).
For each position in the sequence, the image with the minimum value was selected to form the sequence $H_{\min} = [I_1, \dots, I_7]$, and one of the two remaining images was selected randomly to form the sequence $H_{\mathrm{ran}} = [I'_1, \dots, I'_7]$.

(2) TESTING. Twenty-three participants were randomly assigned to one of the following two groups:

Group 1. Twelve children studied a learning object composed of images from $H_{\min}$ at even positions, and images from $H_{\mathrm{ran}}$ at odd positions.

Group 2. Eleven children studied a learning object composed of images from $H_{\min}$ at odd positions, and images from $H_{\mathrm{ran}}$ at even positions.

Each session took approximately 30 minutes per participant, until each participant completed the entire LO presentation.

Metrics. In order to answer the research question, the following metrics were used:

PICTURE DURATION RATIO. This metric compares the accumulated fixation duration between picture and text areas, and is defined as

$$r_p = \frac{\text{accumulated fixation duration in picture areas}}{\text{accumulated fixation duration in text areas}},$$

and it is calculated for each trial (a participant observing a single image).

CROSS ENTROPY. To select elements in the $H_{\min}$ sequence, the cross entropy of fixation duration on text (1) is used.

MTD. The trial duration is the total time that a student spent viewing a given image. The Median Trial Duration MTD($I$) is calculated from the trial durations of the participants who observed image $I$.

Statistical analysis. Data are represented as mean ± SD, and the significance was assessed by Student’s t-test for paired data.
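The picture duration ratio and the MTD can be computed as in the following sketch. The data layout (a list of `(area_kind, duration)` pairs per trial) and the function names are assumptions made for illustration; the eye-tracking software used in the study may export fixations in a different form.

```python
from statistics import median


def picture_duration_ratio(fixations):
    """Ratio of accumulated fixation time on picture areas to that on text areas.

    fixations is a list of (area_kind, duration) pairs for one trial, where
    area_kind is "picture" or "text" (a hypothetical encoding).
    """
    picture = sum(d for kind, d in fixations if kind == "picture")
    text = sum(d for kind, d in fixations if kind == "text")
    if text == 0:
        raise ValueError("no fixation time recorded on text areas")
    return picture / text


def median_trial_duration(trial_durations):
    """MTD(I): median of the trial durations of all participants who saw image I."""
    return median(trial_durations)


# Hypothetical trial: durations in seconds.
trial = [("text", 12.0), ("picture", 1.0), ("text", 8.0), ("picture", 0.5)]
ratio = picture_duration_ratio(trial)

# Hypothetical trial durations (minutes) of three participants for one image.
mtd = median_trial_duration([0.70, 0.50, 0.90])
print(ratio, mtd)
```

In this made-up trial the participant spends 1.5 s on pictures and 20 s on text, so the ratio is well below 0.1.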

5. RESULTS
There was no significant difference between the sequences with regard to the picture duration ratio: (0.082 ± 0.1) for $H_{\min}$ and (0.054 ± 0.063) for $H_{\mathrm{ran}}$. Results of the median observing speed are described in Table 1, where $H(Q)$ is the entropy of the predefined distribution (the ideal minimum cross entropy value), and $H^{T}(I_i, Q)$ and $H^{T}(I'_i, Q)$ are the cross entropies for images from $H_{\min}$ and $H_{\mathrm{ran}}$, respectively.

There was an extremely significant difference in the median trial duration between images in the $H_{\min}$ sequence (0.746 ± 0.209 min/image) and images in the $H_{\mathrm{ran}}$ sequence (0.977 ± 0.245 min/image); t(6) = 8.144, p < 0.0002. In other words, the observing speed is faster for images that minimize the proposed cross entropy. As expected, the observing time is also correlated with the number of words: a linear relation was observed between reading duration and number of words, Pearson correlation = .285, p = 0.01 (2-sided).

Table 1. Comparing $H_{\min}$ and $H_{\mathrm{ran}}$ sequences. Values for MTD($I$) are in minutes per image.

Image              1        2        3        4        5        6        7
H(Q)             0.2958   0.2551   0.2716   0.6552   0.3110   0.7711   0.2930
H^T(I_i, Q)      0.2959   0.2559   0.2724   0.6682   0.3185   0.7824   0.2980
H^T(I'_i, Q)     0.2961   0.2885   0.2818   0.6706   0.3187   0.8618   0.3024
# Words          45       62       88       78       62       141      65
MTD(I_i)         0.498    0.723    0.747    0.698    0.607    1.163    0.782
MTD(I'_i)        0.772    0.827    0.976    0.988    0.791    1.495    0.988
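As a rough check of the reported paired t-test, the statistic can be recomputed from the per-image MTD values of Table 1. The sketch below uses only the Python standard library. Because the table values are rounded to three decimals, the result should be close to, though not identical with, the reported t(6) = 8.144. The variable and function names are illustrative.

```python
import math
from statistics import mean, stdev

# MTD values (minutes per image) for positions 1..7, taken from Table 1.
mtd_min = [0.498, 0.723, 0.747, 0.698, 0.607, 1.163, 0.782]  # images from H_min
mtd_ran = [0.772, 0.827, 0.976, 0.988, 0.791, 1.495, 0.988]  # images from H_ran


def paired_t(a, b):
    """Paired Student's t statistic for b - a, with its degrees of freedom."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


t_stat, dof = paired_t(mtd_min, mtd_ran)
print(f"mean H_min = {mean(mtd_min):.3f}, mean H_ran = {mean(mtd_ran):.3f}")
print(f"t({dof}) = {t_stat:.3f}")
```

Every per-image difference is positive, which is why the statistic is large even with only seven pairs.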

As stated earlier, an image of a LO can be seen as a semiotic space containing visual and verbal units. Visual units (pictures) normally do not receive extensive attention. Levin et al. [13] describe five functions of visualizations as text adjuncts: decorative, representational, organizational, interpretational, and transformational; a similar classification is presented in [14]. These classifications can be associated with the degree of attention that a picture should theoretically receive from a student [15]. Decorative pictures introduce colors and may provide certain aesthetics, but lack an informational function [14]; the reader observes them mainly in an early phase of exploration [16]. An integrated format with spatial contiguity between text and illustrations facilitates integration [17]; for example, the visual design is important for capturing the attention of readers in advertising magazines [3].

A low picture duration ratio was found for both sequences ($r_p < 0.1$), which means that students observed decorative images for shorter periods than text. Nevertheless, images can also detract attention from text. In our study we observed three patterns: in Fig. 1a, the image detracts attention from $T_1$ multiple times; in Fig. 1b, the interplay between the text and the picture hinders reading; in Fig. 1c, text $T_2$ seems unessential. Because of these different interactions, a simple metric such as the picture duration ratio cannot help to select good designs; in the presented experiments, there is no statistically significant difference for $r_p$. Although several metrics can be used to detect these undesirable patterns (e.g., the number of gazes on a given text area, or the time of the first fixation in relation to the overall gaze time), the proposed cross entropy model can cope with these undesirable effects. Moreover, some of these effects can be natural, such as the one shown in Fig. 1a, and do not always affect fluency.

In conclusion, different design issues can be detected by using a single metric. The proposed cross entropy metric stated by (1) only evaluates the image in terms of the gaze time over text areas, but it is good enough to identify good LO designs. A complete metric must include gaze time over every area of interest; i.e., also including the picture areas. Some works have studied the relationship between gaze time on pictures and their attributes; for instance, Da Silva et al. [18] state that attention is positively correlated with image complexity, and Pieters and Wedel [3] suggest that gaze time over a picture depends on many factors, such as its relationship to other elements (which causes attention transfer), picture size, and personal factors. These studies cannot be used here because they either do not estimate the gaze time or pursue purposes other than evaluating LOs.

Figure 1. Patterns that detract attention from text areas: a red circle represents the beginning of a gaze; an encircled red circle, the end of a gaze; and a dotted arrow, a saccadic eye movement. (a) Text area $T_1$ has multiple detractions. (b) The end of the paragraph in $T_1$ and the salient regions of the picture are very close, which causes the image to attract attention; but the closeness of $T_1$ and $T_2$ makes it hard for the user's eyes to relocate to the correct position, and consequently text $T_2$ is not completely read. (c) Students do not read text area $T_2$ because it seems unimportant.

6. CONCLUSION AND FUTURE WORK
This paper presents a metric to compare learning objects with eye tracking. The proposed metric uses the information provided by visual fixations measured from a small focus group. The characteristics of the images selected by minimizing the proposed metric agree with guidelines for authors of learning objects; these images also maximize fluency. Results show that the aesthetic content can be evaluated by the proposed metric.

7. ACKNOWLEDGMENTS
This research was partially supported by Grant CATEDRAS 3163 from the National Council of Science and Technology (CONACYT) of Mexico. We thank all of the children who participated in the study.

8. REFERENCES
[1] Vargo, John, Nesbit, John C., Belfer, Karen, and Archambault, Anne. Learning object evaluation: computer-mediated collaboration and inter-rater reliability. International Journal of Computers and Applications, 25 (2003), 198-205.
[2] Holsanova, Jana, Rahm, Henrik, and Holmqvist, Kenneth. Entry points and reading paths on newspaper spreads: comparing a semiotic analysis with eye-tracking measurements. Visual Communication, 5 (2006), 65-93.
[3] Pieters, Rik and Wedel, Michel. Attention capture and transfer in advertising: Brand, pictorial, and text-size effects. Journal of Marketing, 68 (2004), 36-50.
[4] Kay, Robin H. and Knaack, Liesel. Assessing learning, quality and engagement in learning objects: the Learning Object Evaluation Scale for Students (LOES-S). Educational Technology Research and Development, 57 (2009), 147-168.
[5] Mitre-Hernandez, H., Alvarado-Hernandez, M., and Lara-Alvarez, C. Evaluation of learning objects through eye tracking. In 2016 International Conference on Software Process Improvement (CIMPS) (Oct 2016), 1-8.
[6] Lara-Alvarez, Carlos, Mitre-Hernandez, Hugo, and Alvarado-Hernandez, Maria. Entropy of Eye Fixations: a Tool for Evaluation of Learning Objects. Research in Computing Science (2016), 89-98.
[7] Gray, William Scott and Leary, Bernice Elizabeth. What makes a book readable. (1935).
[8] Flesch, Rudolph. A new readability yardstick. Journal of Applied Psychology, 32 (1948), 221.
[9] Chall, Jeanne Sternlicht and Dale, Edgar. Readability revisited: The new Dale-Chall readability formula. Brookline Books, 1995.
[10] Genzel, Dmitriy and Charniak, Eugene. Entropy rate constancy in text. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (2002), 199-206.
[11] Keller, Frank. The entropy rate principle as a predictor of processing effort: An evaluation against eye-tracking data. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (2004), 324.
[12] Voßkühler, Adrian. OGAMA description (for Version 2.5). Berlin, Germany: Freie Universit.
[13] Levin, Joel R., Anglin, Gary J., and Carney, Russel N. On empirically validating functions of pictures in prose. In Willows, Dale M. and Houghton, Harvey A., eds., The psychology of illustration. Springer, New, 1987.
[14] Leivas Pozzer, Lilian and Roth, Wolff-Michael. Prevalence, function, and structure of photographs in high school biology textbooks. Journal of Research in Science Teaching, 40 (2003), 1089-1114.
[15] Slykhuis, David A., Wiebe, Eric N., and Annetta, Len A. Eye-tracking students' attention to PowerPoint photographs in a science education setting. Journal of Science Education and Technology, 14 (2005), 509-520.
[16] Bucher, Hans-Jürgen and Schumacher, Peter. The relevance of attention for selecting news content. An eye-tracking study on attention patterns in the reception of print and online media. Communications, 31 (2006), 347-368.
[17] Holsanova, Jana, Holmberg, Nils, and Holmqvist, Kenneth. Reading information graphics: The role of spatial contiguity and dual attentional guidance. Applied Cognitive Psychology, 23 (2009), 1215-1226.
[18] Perreira Da Silva, Matthieu, Courboulay, Vincent, and Estraillier, Pascal. Image complexity measure based on visual attention. In Image Processing (ICIP), 2011 18th IEEE International Conference on (2011), 3281-3284.
