
Not Breaking the Narrative: Individualized Competence Assessment in Educational Games

Michael D. Kickmeier-Rust, Dietrich Albert, Cord Hockemeyer, Thomas Augustin
University of Graz, Austria

Abstract: Most existing educational games cannot compete with their non-educational counterparts in terms of visual and narrative quality, gameplay, or adaptability. Amongst the most advanced approaches is ELEKTRA, a European project aiming to produce a 3D adventure game that teaches physics. The project developed a scientifically sound framework for intelligent and adaptive tutoring, enabling the game to adapt learning and gaming activities to individual learning progress and pedagogical strategies. A crucial aspect, and a weak spot of present educational games, is the individualized assessment of knowledge. Existing approaches frequently rely on typical quiz-like methods, which fail to adapt to individual learners and most likely break the game’s narrative. This in turn weakens the “natural” advantages of educational games by compromising immersion and the motivation to play and learn. In ELEKTRA, assessment occurs in integrated and individualized game situations within which learners have to accomplish adapted and tailored physics-related tasks, for example hitting a light sensor with a narrow beam of light, created with different optical devices, in order to open a door. ELEKTRA’s methodology allows individualized game situations to be provided on the basis of the same pool of game assets. For example, a high performer will be provided with fewer but more complex situations than an underachiever. The set of possible actions and action sequences is modeled in terms of problem spaces. Problem solution states are determined and linked with a skill structure established by prerequisite relations between skills.
An ontology holds both kinds of information, enabling a “learning engine” to reason about the learner’s skills and to increase or decrease their probabilities, approaching the true skill state. On this basis, the skills and therefore the learning progress can be assessed without compromising the learner’s immersion in the game and, furthermore, subsequent learning and assessment situations can be adapted to the learner’s needs.

Keywords: Adventure game, micro-adaptivity, competence assessment, non-invasive interventions

1. What do you want to play/learn today?

The majority of current approaches to technology-enhanced learning are based on traditional, unexciting 2D user interfaces. This impression is compounded by the proliferation of immersive recreational computer games. In addition, traditional interfaces for educational applications have distinct weaknesses from the perspectives of learning psychology and didactics. For example, they are not intrinsically motivational, and it is difficult to retain a learner’s interest, to provide a meaningful context throughout learning episodes, or to activate prior knowledge as a basis for learning. Moreover, it is not always possible to provide real-world problems for practicing new knowledge, and a purposeful application of new knowledge is difficult without a meaningful and engaging context. Immersive digital educational games (DEGs) offer a highly promising approach to make learning more engaging, satisfying, inspiring, and probably more effective. Thus, it is not surprising that there is currently significant hype over game-based learning (cf. Kickmeier-Rust et al. 2006). Many of the potential advantages of DEGs (e.g., interactivity, feedback, problem solving) are considered to be important for successful and effective learning (Merrill 2002). Moreover, DEGs serve the needs of the “Nintendo generation” or the “digital natives” who grew up on “twitch speed” computer games, MTV, action movies, and the Internet.
Marc Prensky (2001) argues that the exposure to such media has emphasized certain cognitive aspects and de-emphasized others; thus, the demands on education have changed. Still, DEGs have major disadvantages, such as difficulties in providing an appropriate balance between gaming and learning activities or between challenge and ability, difficulties in aligning the game with national curricula, and the extensive costs of developing high-quality games (Van Eck 2006). Thus, DEGs most often cannot compete with commercial counterparts in terms of gaming experience, immersive and interactive environments, narrative, or motivation to play. Moreover, most educational games do not rely on sound instructional models, leading to a separation of learning from gaming;


often they provide gaming actions only as a reward for learning. Existing DEGs do not differ significantly from other multimedia learning objects and applications, and there is considerable debate regarding the power of games for educational purposes and their advantages, disadvantages, costs, and risks.

At the same time, computer games are tremendously successful, and the game industry constantly increases its sales to several billion Euros. A significant number of young people spend many hours a week playing computer games, and most often these games are their preferred form of play. Adventure games like Myst sold 6 million copies; simulations like The Sims 2 sold 1 million copies in the first ten days after publication; the new generation of game consoles (i.e., Microsoft Xbox 360, Nintendo Wii, and Sony Playstation 3) sold approximately 17 million units by February 2007. In conclusion, the attempt to utilize at least parts of gaming activities for educational purposes, and thereby the educational potential of computer games, is a highly promising approach to facilitate learning and to make it a more pleasant task.

The very rationale for utilizing (computer) games for learning is that playing games is one of the most natural forms of learning. Children start learning to talk by playing with noises, and they learn collaboration and strategic thinking when playing Cowboys and Indians. Since the 1990s, research and development has increasingly addressed learning aspects of playing recreational games and also the realization of computer games for primarily educational purposes. Kickmeier-Rust et al. (2006) and Mitchell & Savill-Smith (2004) provide overviews of existing DEGs. From a psycho-pedagogical viewpoint, the state of the art in game-based learning is at an early stage.
Most existing DEGs are rather small and often simple games, focusing on insight into processes and complex issues (e.g., the Palestine conflict) or addressing particular sets of skills (e.g., job application training). They generally do not relate to school curricula and do not attempt to enable learning of school-related subject matter. More importantly, existing games do not provide sound assessment methods, and generally there is an imbalance between learning and gaming. Finally, while game intelligence is well developed, educational games do not include adaptation to the learner in terms of knowledge, learning progress, motivation, or individual preferences. Thus, they cannot compete with their commercial counterparts and they cannot utilize the full potential of immersive digital games with respect to learning efficacy and learning experience.

2. The ELEKTRA project

The ELEKTRA project (www.elektra-project.org), funded by the European Commission, has the ambitious and visionary goal of fully utilizing the advantages of computer games and their design fundamentals for educational purposes and of addressing and eliminating the disadvantages of game-based learning as far as possible. Nine interdisciplinary European partners contribute to the development of a sound methodology for designing educational games and to the development of a comprehensive game demonstrator based on a state-of-the-art 3D adventure game teaching physics according to national curricula. Furthermore, ELEKTRA will address important research questions concerning game design, didactic design, and adaptive interventions. The linchpin of a successful DEG is the motivation to play and therefore to learn, which requires an appropriate balance between the challenges posed by the game and the learner’s abilities. Thus, from the perspective of cognitive science and computer science, the focus is on an adaptive and individualized approach to DEG technology.
This is true for pure gaming activities but in particular for learning activities. As attempted by conventional adaptive and personalized approaches to technology-enhanced education (Brusilovsky 1999, De Bra 1997), a learner must not be overburdened by subject matter, in order to avoid frustration, but at the same time the learner must not be under-challenged, in order to avoid boredom. Only if such a balance is achieved can some sort of flow experience arise, enthralling and captivating the learner. In contrast to conventional adaptive tutoring and knowledge testing, adaptive assessment and interventions within a DEG are restricted by the game’s narrative and the game flow. Existing approaches to assessment frequently rely on typical quiz-like methods, which fail to adapt to individual learners and most likely break the game’s narrative; this in turn weakens the “natural” advantages of educational games by compromising immersion and the motivation to play and learn. On the other hand, within an educationally adaptive game such as ELEKTRA the learning tasks are so integrated with the game’s narrative that reordering learning tasks in order to personalize the learning experience would result in a corresponding reordering of narrative plot elements. With a

linear narrative this would result in a nonsensical, implausible narrative. The challenge of creating dynamic yet plausible adaptive narratives is considerable and requires arduous manual editing of branching narratives. Experimental systems such as Façade (Mateas & Stern 2007) exemplify the difficulties of creating adaptive narratives. Within the field of adaptive hypermedia, adaptation is limited by the presentation medium and is therefore manifested through intermittent curriculum ordering and adaptive presentation. Due to the nature of 3D immersive games, adaptation needs to be continuous rather than periodic; it needs to occur at a greater frequency than on a task-by-task level. Given this requirement and the existing difficulties associated with generating adaptive narratives, the ELEKTRA game provides micro-adaptivity, that is, assessment by interpreting the learner’s behavior and adaptation within learning situations (LeS) as opposed to around them.

3. Micro-adaptivity

The very basis of micro-adaptive skill assessment and non-invasive interventions is a formal model for interpreting a learner’s (problem solution) behavior within learning and assessment situations in an educational game. As an example, a learner might be confronted with a torch, a number of blinds, and a screen. The learner’s task might be to reduce the cone of the torch’s light to a narrow beam of light using the blinds (contributing to the understanding that light propagates in a straight line). To obtain a formal model, we describe such a game situation and its current status at a certain point of time by a set of props (e.g., torch, blinds, and screen) and their current properties (e.g., location or alignment). For each status of a situation, a number of admissible actions can be performed by the learner (e.g., turning on the torch or positioning a blind).
Each action, in turn, is interpreted regarding its correctness or appropriateness for accomplishing the task (e.g., narrowing the light cone). These interpretations of behavior enable conclusions (in a probabilistic sense) about the presence of certain skills and, in some cases, also the absence of certain skills. The probabilistic assessment of skills is the very basis for micro-adaptive interventions, either in terms of generating LeS tailored to an individual learner or in terms of providing a learner with non-invasive educational interventions within a LeS, for example giving the learner hints. To realize non-invasive assessment of skills and adaptive educational interventions, ELEKTRA relies on the formal framework of Competence-based Knowledge Space Theory (CbKST). Originating from conventional adaptive and personalized tutoring, this set-theoretic framework makes it possible to state assumptions about the structure of skills in a domain of knowledge and to link the latent skills with observable behavior.

3.1 Skill structures and performance structures

To address the challenges for research and development and to incorporate a separation of latent skills and observable performance, ELEKTRA utilizes the framework of CbKST to provide the game with a methodology for suitable adaptive interventions. It offers an internal cognition-based logic that is quite similar to the logic of ontologies: well-defined entities (the skills) stand in a well-defined relationship (a so-called prerequisite relation). Skills are defined as distinct entities of ability or knowledge. The term “competence” is often used synonymously. CbKST is an extension of the originally behavioral Knowledge Space Theory (KST; Doignon & Falmagne 1985, 1999), in which a knowledge domain Q is characterized by a set of problems or test items. The knowledge state of an individual is identified with the subset of problems this person is capable of solving.
Due to mutual dependencies between the problems, captured by prerequisite relations, not all potential knowledge states will occur. The collection of all possible states is called a knowledge structure Κ. To account for the fact that a problem might have several prerequisites (i.e., and/or-type relations), the notion of a prerequisite function was introduced. The basic idea of CbKST is to assume a set E of abstract skills underlying the problems and learning objects of the domain. The relationships between the skills and problems are established by a skill function. Such a function assigns to each problem a collection of subsets of skills (i.e., skill states) that are relevant for solving it, and it assigns to each learning object the skills it teaches. By associating skills with the problems of a domain, a knowledge structure on the set of problems is induced. The skills, which are not directly observable, can be uncovered on the basis of a person’s observable performance. A further extension is to assume prerequisite relationships between the skills, inducing a skill structure C on the set of skills (Korossy 1999). To illustrate this approach, assume that a knowledge domain is

represented by $Q = \{a, b, c, d\}$. Consider the set $E = \{V, W, X, Y, Z\}$ of skills that are relevant for solving these problems. A prerequisite function that might exist among these skills is shown in Figure 1a. For example, this function states that if a student has skill X, we can assume that this student also possesses skill V or skill W, or both; the corresponding skill structure is shown in Figure 1b. It includes only 13 possible skill states out of a total of $2^5 = 32$ states.
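How a prerequisite function prunes the power set of skills can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sole clause encoded below is the one spelled out in the text (X requires V or W); the remaining relations of Figure 1a are not recoverable from the prose and are left out, so the sketch yields 28 admissible states rather than the 13 of Figure 1b. The names `PREREQUISITES`, `is_admissible`, and `skill_structure` are hypothetical.

```python
from itertools import combinations

SKILLS = ("V", "W", "X", "Y", "Z")

# Prerequisite function: skill -> list of alternative clauses (and/or).
# A skill is admissible in a state only if at least one clause is fully
# contained in that state. Only the clause for X is taken from the text;
# the other skills are left unconstrained here (an assumption).
PREREQUISITES = {
    "X": [{"V"}, {"W"}],  # X requires V or W (or both)
}


def is_admissible(state, prerequisites):
    """Return True if every skill in the state has one satisfied clause."""
    for skill in state:
        clauses = prerequisites.get(skill)
        if clauses and not any(clause <= state for clause in clauses):
            return False
    return True


def skill_structure(skills, prerequisites):
    """Enumerate all subsets of skills that respect the prerequisites."""
    states = []
    for size in range(len(skills) + 1):
        for combo in combinations(skills, size):
            state = frozenset(combo)
            if is_admissible(state, prerequisites):
                states.append(state)
    return states


structure = skill_structure(SKILLS, PREREQUISITES)
print(len(structure), "of", 2 ** len(SKILLS), "states are admissible")
```

Adding further clauses to `PREREQUISITES` shrinks the structure accordingly; the empty set and the full skill set always remain admissible as long as the clauses are satisfiable.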

Figure 1: The left panel illustrates a prerequisite function (the curved line below skill X indicates a logical or). The right panel shows the corresponding skill structure. The bold line indicates one of several meaningful learning paths.

This approach entails several advantages. Given the performance, that is, the subset of problems a student could master, the latent skills underlying that problem-solving performance can be identified. Because representation and interpretation functions are used, no one-to-one mapping of performance to skills is required, and meaningful learning paths can be identified.

3.2 Problem spaces

In addition to the formal model of the knowledge domain, its skills, and the prerequisite relations between those skills, a formal model of tasks and problem definitions within a LeS must be defined: the so-called problem spaces. Each LeS is characterized by a set of props or objects (e.g., torch, blinds, and screen) that the learner can manipulate in order to achieve a certain goal. For example, the torch, two blinds, and the screen must be aligned in a row to narrow the torch’s light cone (Figure 2).

Figure 2: Blinds in a row

Formally, let $O$ be a set of props that can be used to define a certain LeS. For simplicity of notation, we assume that $O = \{o_1, \ldots, o_N\}$. Furthermore, for $1 \le n \le N$ let $P_n$ be a non-empty set such that $\emptyset \in P_n$, which contains the properties of the $n$-th object $o_n$. These properties can be of quite different character

(e.g., a six-dimensional vector describing the position and orientation of an object in the virtual space, or simply the two values “on” and “off” for a switch). Unfortunately, defining such properties for each object by exact location and alignment would result in an almost infinite number of combinations. To make the properties manageable and computable, we define categories for the objects’ properties. For example, there might be four “location categories” for a blind, each having a certain value of correctness (Figure 3). This allows us to describe a problem state as the $N$-tuple of all objects’ properties, that is, $(p_{o_1}, \ldots, p_{o_N})$, where $p_{o_n} \in P_{o_n}$ (for simplicity of notation, we write $p_n$ and $P_n$, respectively). If $p_n = \emptyset$, then the $n$-th object does not appear in the problem situation. If, on the other hand, $p_n \neq \emptyset$, then the $n$-th object $o_n$ appears in the problem situation and can be manipulated by the learner. The set $S$ of all problem states is called the problem space: $S = P_1 \times \cdots \times P_N$. Finally, to specify a problem situation, we have to fix an initial state $s \in S$ and, for this initial state, a set $S_s \subset S$ of solution states.
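The construction of a problem space as a Cartesian product of categorized properties can be illustrated with a small Python sketch. The props, their property categories, the initial state, and the solution state below are hypothetical values chosen for the torch example; `None` stands in for the empty property, meaning that the prop is absent from the situation.

```python
from itertools import product

# None plays the role of the empty property: the prop is not present.
ABSENT = None

# Hypothetical property categories per prop (assumed for illustration).
PROPERTIES = {
    "torch": (ABSENT, "off", "on"),
    "blind_1": (ABSENT, 1, 2, 3, 4),  # four location categories, 1 = most correct
    "blind_2": (ABSENT, 1, 2, 3, 4),
    "screen": (ABSENT, "placed"),
}


def problem_space(properties):
    """Return the set of all problem states, one property per prop."""
    return set(product(*properties.values()))


S = problem_space(PROPERTIES)

# A problem situation fixes an initial state and a set of solution states.
initial_state = ("off", 4, 4, "placed")
solution_states = {("on", 1, 1, "placed")}

print(len(S), "problem states")
```

Because every prop contributes only a handful of categories, the space stays small enough to enumerate, which is the point of replacing continuous positions by categories.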

Figure 3: Four categories of a blind’s locations; category 1 is the most correct location.

To solve a certain problem, the learner can perform different actions to modify the objects and thereby change the problem state. Additionally, we assume that any problem can be solved in a finite number of steps. For a problem space $S = P_1 \times \cdots \times P_N$, let us assume an initial state $s \in S$ and a set $S_s \subset S$ of solution states. Let $A$ be a non-empty set of actions a user may perform, and let $R \subset S \times A$ denote a “compatibility relation”, that is, $(s, a) \in R$ if and only if action $a$ is performable in problem state $s$. Furthermore, let $f: R \to S$ be a “transition function” in the following sense: if a learner performs action $a$ when the problem state $s$ is given, then the problem state $f(s, a)$ results. Finally, a finite sequence $\langle (s_1, a_1), (s_2, a_2), \ldots, (s_m, a_m) \rangle$ is called a problem solution process if the following conditions are satisfied: (1) $s_1 = s$; (2) $(s_t, a_t) \in R$ for all $t = 1, \ldots, m$; (3) $s_{t+1} = f(s_t, a_t)$ for all $t = 1, \ldots, m-1$; (4) $s_t \notin S_s$ for all $t = 1, \ldots, m$; and (5) $f(s_m, a_m) \in S_s$.

3.3 Interpreting learner behavior – continuous skill assessment

The combination of skill structure and problem spaces allows the continuous interpretation of the learner’s behavior and actions within a LeS in terms of present and absent skills. This interpretation cannot, of course, be deterministic but only probabilistic. For example, if a learner does not turn on the torch, we can assume, with a certain probability, that this learner lacks the skill to “know that the task requires a light source”. Let us assume a finite set $E$ of skills associated with a given problem state. Furthermore, let $\mathcal{C}$ be a family of subsets of $E$ containing at least $E$ and the empty set $\emptyset$. For simplicity of notation, we assume that $\mathcal{C} = \{C_0, \ldots, C_M\}$, where $C_0 = \emptyset$ and $C_M = E$. The elements $C \in \mathcal{C}$ are referred to as skill states, and the tuple $(E, \mathcal{C})$ is denoted as skill structure.
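The problem solution process of Section 3.2 is the raw material for this interpretation, and its five conditions can be checked mechanically. The following Python sketch does so for a deliberately tiny, hypothetical situation with one torch and one blind; the functions `compatible` and `transition` stand in for the compatibility relation and the transition function and are assumptions made for illustration only.

```python
def is_solution_process(process, initial, solutions, compatible, transition):
    """Check the five conditions for a list of (state, action) pairs."""
    if not process:
        return False
    if process[0][0] != initial:                      # condition (1)
        return False
    for t, (state, action) in enumerate(process):
        if not compatible(state, action):             # condition (2)
            return False
        if state in solutions:                        # condition (4)
            return False
        is_last = t + 1 == len(process)
        if not is_last and process[t + 1][0] != transition(state, action):
            return False                              # condition (3)
    last_state, last_action = process[-1]
    return transition(last_state, last_action) in solutions  # condition (5)


# Hypothetical toy situation: state = (torch status, blind location category).
def compatible(state, action):
    torch, blind = state
    if action == "switch_on":
        return torch == "off"
    if action == "move_blind":
        return blind > 1
    return False


def transition(state, action):
    torch, blind = state
    if action == "switch_on":
        return ("on", blind)
    return (torch, blind - 1)


initial = ("off", 3)
solutions = {("on", 1)}

good = [(("off", 3), "switch_on"),
        (("on", 3), "move_blind"),
        (("on", 2), "move_blind")]
incomplete = good[:2]                      # ends before a solution state is reached
broken = [(("off", 3), "switch_on"),
          (("on", 2), "move_blind")]       # second state does not follow from the first

print(is_solution_process(good, initial, solutions, compatible, transition))
```

Only sequences that pass this check would be handed on to the probabilistic interpretation; anything else indicates a logging or modeling error rather than learner behavior.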
We assume that at a certain point of time, any learner is in exactly one of the skill states in $\mathcal{C}$. However, since the skill state of a person is not directly observable, the problem solution process of a person is analyzed to obtain evidence about the person’s skill state. Let us assume a skill structure $(E, \mathcal{C})$ and a problem solution process $\langle (s_1, a_1), (s_2, a_2), \ldots, (s_m, a_m) \rangle$. Then, for $0 \le t \le m$, let us assume a probability distribution $L_t: \mathcal{C} \to [0, 1]$ with the following interpretation in mind: for $t \ge 1$, $L_t(C)$ denotes the likelihood that a person who has performed the actions $a_1$ (in state $s_1$), $a_2$ (in state $s_2$), …, and $a_t$ (in state $s_t$) is in skill state $C$. Accordingly, $L_0$ is the initial distribution at the beginning of the solution process (i.e., before action $a_1$ is observed).

Note that there are different options to obtain the initial distribution $L_0$. At the beginning of the learning process (i.e., before the first problem situation is presented to the user), the initial distribution can either be estimated from an entry test or the skill states can be assumed to be uniformly distributed. Alternatively, let us assume that the user has already solved some of the problems. Once we have observed the problem solution process $\langle (s_1, a_1), (s_2, a_2), \ldots, (s_m, a_m) \rangle$, the final likelihood function $L_m: \mathcal{C} \to [0, 1]$ is used as the initial distribution for the next problem situation. Similarly, we assume that the initial state of this next problem situation depends on the likelihood function $L_m$. This general idea can be formalized by introducing a function $g: \{(p_0, \ldots, p_M) : p_0 + \cdots + p_M = 1\} \to S$, which assigns to each probability distribution on $\mathcal{C}$ a problem state in $S$ that can be used as the initial state of the next problem situation.

The important question remains how to update the likelihood of the skill states. In the following, the multiplicative updating rule by Falmagne & Doignon (1988) is adapted to our needs. Let us assume the problem solution process $\langle (s_1, a_1), (s_2, a_2), \ldots, (s_m, a_m) \rangle$. Then the likelihood of the skill states is updated according to the following idea: If the action $a_t$ in state $s_t$ provides evidence in favor of the elementary skill $c \in E$, then increase the likelihood of all skill states containing $c$ and decrease the likelihood of all skill states not containing $c$. If the action $a_t$ in state $s_t$ provides evidence against the elementary skill $c \in E$, then decrease the likelihood of all skill states containing $c$ and increase the likelihood of all skill states not containing $c$. Formally, let us assume a skill structure $(E, \mathcal{C})$, a problem solution process $\langle (s_1, a_1), (s_2, a_2), \ldots, (s_m, a_m) \rangle$ and, for $0 \le t \le m$, a likelihood function $L_t: \mathcal{C} \to [0, 1]$.
Furthermore, let us assume two “skill assignment” functions $f_s: R \to 2^E$ and $f_u: R \to 2^E$ with the following interpretations in mind: if action $a$ is performed in problem state $s$, then we can surmise that the user has all the skills in $f_s(s, a)$ (“supported skills”) but does not have the skills in $f_u(s, a)$ (“unsupported skills”). Furthermore, to recalculate the likelihood of the skill state $C$, let us fix two input parameters $\zeta_0$ and $\zeta_1$ with $1 \le \zeta_0$ and $1 \le \zeta_1$. Then we update the likelihood function iteratively for each of the supported and unsupported skills $c$ according to the following formula.

$$L_{t+1}(C) = \frac{\zeta(C) \cdot L_t(C)}{\sum_{C' \in \mathcal{C}} \zeta(C') \cdot L_t(C')}$$

with a parameter function $\zeta(C)$ defined as

$$\zeta(C) = \begin{cases} \zeta_0, & c \notin C,\ c \in f_u(s_t, a_t) \\ \zeta_1, & c \in C,\ c \in f_s(s_t, a_t) \\ 1, & \text{otherwise.} \end{cases}$$

3.4 Non-invasive, individualized, adaptive interventions

On the basis of the probabilistic assessment of skills/skill states, several methods exist to provide the learner with tailored educational interventions without compromising the game’s narrative and the game flow.

3.4.1 Generating and adapting LeS

ELEKTRA’s methodology allows providing individualized game situations on the basis of the same pool of game assets. For example, a high performer will be provided with fewer but more complex situations than an underachiever. Moreover, based on the presence or absence of certain skills, specific props can be presented or not and tasks can be adjusted to the learner’s needs. In the same way, a specific LeS can be presented repeatedly if necessary, for example with an increasing level of difficulty.

3.4.2 Non-invasive interventions

In addition to tailoring an entire LeS, the learner can be educationally supported by interventions (e.g., hints) when necessary. The conditions under which a certain adaptive intervention is given are to be developed on the basis of pedagogical rules; these rules will, however, apply the micro-adaptivity framework and utilize the learner model obtained through the assessment within the framework. The types of interventions are as follows. A skill activation adaptive intervention may be applied if a learner gets “stuck” in some area of the problem space and some skills are not used although the user model assumes that the user masters these skills. A skill acquisition adaptive intervention may be applied in a similar situation in which, however, the user model assumes that the user does not master the unused skill. Motivational adaptive interventions are basically independent of the model. They might be applied, for example, if the learner does not act at all for an unexpectedly long time. Assessment clarification adaptive interventions may be applied, for example, if the learner’s actions give contradicting support for and against the assumption of a certain skill state.

4. Technical realization

The introduced framework for micro-adaptive skill assessment and non-invasive interventions is currently being implemented in a game demonstrator within the ELEKTRA project. The architecture consists of four modules or engines (Figure 4). The learner is connected to the ELEKTRA system through the game engine (GE). It provides the non-adaptive parts of the game, and as such it is also the user interface to the system. The GE provides information on the learner’s actions in the game to the skill assessment engine (SAE). The SAE updates the learner model (i.e., the skill state likelihoods) according to the procedure proposed in Section 3.3 and the information it has in the ELEKTRA ontology.
This ontology serves as a database, containing various information, particularly the skills assigned to objects and their properties as well as the prerequisite relations between those skills (Kickmeier-Rust & Albert, in press). The resulting information about the learner’s skill state and its changes are then forwarded to the Educational Reasoner (ER), the pedagogical part of microadaptivity. Based on pedagogical rules and learning objectives, the ER gives recommendations on adaptive interventions to the adaptation realization (AR) module which maps the abstractly formulated educational recommendations onto more concrete game recommendations. In this mapping process, data on game elements and information on previously given recommendations are considered. The game recommendations are then forwarded to the GE which realizes them as concrete adaptive interventions in the game.
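The interplay of the SAE and the ER can be sketched in Python under strong simplifying assumptions. The function `update` implements one step of the multiplicative rule from Section 3.3 for a single skill; `recommend` is a purely hypothetical pedagogical rule, since the actual rules are still to be developed, and the parameter values (`zeta0`, `zeta1`, `mastery_threshold`, `idle_limit`) as well as the three-state skill structure are invented for illustration.

```python
def update(likelihood, skill, supported, zeta0=2.0, zeta1=2.0):
    """SAE step: one multiplicative update for a single skill c.

    likelihood: dict mapping skill states (frozensets) to probabilities.
    supported:  True if the observed action supports skill c,
                False if it provides evidence against c.
    """
    def zeta(state):
        if supported and skill in state:
            return zeta1      # c in C and c in f_s(s_t, a_t)
        if not supported and skill not in state:
            return zeta0      # c not in C and c in f_u(s_t, a_t)
        return 1.0

    weighted = {state: zeta(state) * p for state, p in likelihood.items()}
    total = sum(weighted.values())
    return {state: w / total for state, w in weighted.items()}


def skill_probability(likelihood, skill):
    """Marginal probability that the learner holds the given skill."""
    return sum(p for state, p in likelihood.items() if skill in state)


def recommend(likelihood, unused_skill, stuck, idle_seconds,
              mastery_threshold=0.5, idle_limit=60.0):
    """ER step: hypothetical rule choosing an intervention type."""
    if idle_seconds > idle_limit:
        return "motivational"
    if stuck:
        if skill_probability(likelihood, unused_skill) >= mastery_threshold:
            return "skill activation"
        return "skill acquisition"
    return None


# Toy skill structure with a uniform initial distribution L_0.
states = [frozenset(), frozenset({"V"}), frozenset({"V", "X"})]
L0 = {state: 1 / len(states) for state in states}

L1 = update(L0, "V", supported=True)     # an action supporting skill V
L2 = update(L1, "X", supported=False)    # an action speaking against skill X

print(recommend(L2, "X", stuck=True, idle_seconds=5.0))
```

In this toy run the learner model ends up fairly confident about skill V and doubtful about skill X, so a learner who is stuck without using X would receive a skill acquisition intervention, whereas being stuck without using V would trigger a skill activation intervention. The mapping of such a recommendation onto concrete game elements, the task of the AR module, is not covered by the sketch.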

Figure 4: ELEKTRA’s architecture for micro-adaptive assessment and interventions

5. Conclusions

The aim of micro-adaptivity is to enable an assessment of skills and learning progress during the game, which does not compromise the game flow and therefore does not negatively impact intrinsic motivation. The probabilistic assessment on the basis of interpreting the learner’s behavior and actions within the game is supplemented with more “significant” test items, for example the accomplishment of a certain task in order to reach a new level of the game. On the basis of this

assessment, non-invasive adaptive interventions can be triggered in order to support the learning process. Based on sound psychological models for problem solving and for skill structures, we have outlined a framework for micro-adaptivity within complex learning objects. However, micro-adaptivity is still in an early stage of research and development. The underlying framework uses some simplifying assumptions, such as the identity of properties (or position categories) and actions. For example, with each action only a single object can be manipulated. Based on the experiences in the ELEKTRA project, the framework will be generalized within and beyond the domain of game-based learning. Future work will also address the integration of meta-cognitive aspects such as confidence ratings into the assessment procedure. In future projects, the realization of adaptive storytelling is also envisaged in order to give educational game technology an even broader range of individualization and adaptation to specific learners.

6. Acknowledgements

The research and development introduced in this work is funded by the European Commission under the sixth framework programme in the IST research priority, contract number 027986.

References

Brusilovsky, P. (1999) “Adaptive and intelligent technologies for web-based education”. In C. Rollinger & C. Peylo (Eds.), Special Issue on Intelligent Systems and Teleteaching, Künstliche Intelligenz, Vol. 4, pp. 19-25.
De Bra, P. (1997) “Teaching through adaptive hypertext on the WWW”, International Journal of Educational Telecommunications, Vol. 3, pp. 163-180.
Doignon, J.-P. and Falmagne, J.-C. (1985) “Spaces for the assessment of knowledge”, International Journal of Man-Machine Studies, Vol. 23, pp. 175-196.
Doignon, J.-P. and Falmagne, J.-C. (1999) Knowledge spaces, Springer-Verlag, Berlin.
Falmagne, J.-C. and Doignon, J.-P. (1988) “A class of stochastic procedures for the assessment of knowledge”, British Journal of Mathematical and Statistical Psychology, Vol. 41, pp. 1-23.
Kickmeier-Rust, M.D. and Albert, D. (in press) The ELEKTRA ontology model: A learner-centered approach to resource description.
Kickmeier-Rust, M.D., Schwarz, D., Albert, D., Verpoorten, D., Castaigne, J.-L., and Bopp, M. (2006) “The ELEKTRA project: towards a new learning experience”. In M. Pohl, A. Holzinger, R. Motschnig, & C. Swertz (Eds.), M3 – Interdisciplinary aspects on digital media & education, Österreichische Computer Gesellschaft, Vienna, pp. 19-48.
Korossy, K. (1999) “Modelling knowledge as competence and performance”. In D. Albert & J. Lukas (Eds.), Knowledge Spaces: Theories, Empirical Research Applications, Lawrence Erlbaum Associates, Mahwah, pp. 103-132.
Mateas, M. and Stern, A. (2007) “Façade, an artificial intelligence-based art/research experiment in electronic narrative”, [online], Procedural Arts, http://www.interactivestory.net.
Merrill, M.D. (2002) “First principles of instruction”, Educational Technology, Research and Development, Vol. 50, pp. 43-59.
Mitchell, A. and Savill-Smith, C. (2004) The use of computer and video games for learning: A review of the literature, Learning and Skills Development Agency, London.
Prensky, M. (2001) Digital game-based learning, McGraw-Hill, New York.
Van Eck, R. (2006) “Digital game-based learning: It's not just the digital natives who are restless”, Educause Review, Vol. 41, pp. 16-30.