Deeper Knowledge Tracing by Modeling Skill Application Context for Better Personalized Learning

Yun Huang∗



Intelligent Systems Program University of Pittsburgh 210 S. Bouquet Street Pittsburgh, PA, USA

[email protected]

ABSTRACT

Traditional Knowledge Tracing, which traces a student's knowledge of each decomposed individual skill, has been a popular learner model for adaptive tutoring. Typically, a student is guided to the next skill once the student's knowledge of the current skill is inferred to be mastered. Unfortunately, this traditional approach no longer suffices for modeling complex skill practice, where simple decompositions cannot capture the additional skills that underlie the context as a whole. In such cases, mastery should be granted only when a student not only understands the basics of a skill but can also fluently apply the skill in varied application contexts. In this thesis, we propose a data-driven approach, based primarily on Bayesian networks, to construct learner models that consider different skill application contexts for tracing deeper knowledge. We aim to conduct novel, comprehensive, "deep" evaluations, including internal data-driven evaluations and external end-user evaluations that examine the real-world impact on students' personalized learning.

Keywords

complex skill, multiple skills, deep learning, Knowledge Tracing, Bayesian Network

1. INTRODUCTION

Knowledge Tracing (KT) [3] has established itself as an efficient approach to modeling student skill acquisition in intelligent tutoring systems. To apply KT, one needs to decompose domain knowledge into elementary skills and map each problem-solving step to an individual skill. KT has demonstrated its ability to track student knowledge in different domains, and may now be considered the most popular learner modeling approach. However, a known limitation of KT is its assumption of skill independence in problems that involve multiple (complex) skills. Recent research on KT has challenged this assumption and has demonstrated that there is additional knowledge tied to specific skill combinations; in other words, the knowledge about a set of skills is more than the "sum" of the knowledge of the individual skills [8], and some skills must be integrated (or connected) with other skills to produce the desired behavior [11]. In addition, recent research applying difficulty factor assessment [12] demonstrated that some factors underlying the context, when combined with a student's original skills, can raise the difficulty of the material being learned and should be included in the skill model representation. Here are two examples of just such a situation:

• Students were found to be significantly worse at translating two-step algebra story problems into expressions (e.g., 800-40x) than at translating two closely matched one-step problems (with answers 800y and 40x) [8].
• "16-30" can be more difficult than "20-16" since it involves the difficulty factor of "Negative Result" [12].

This illustrates that in some domains we need to pay specific attention to the context of skill application. One of these domains is arguably computer programming. Research on computer science education has long argued that knowledge of a programming language cannot be reduced to a sum of knowledge about different programming constructs, since there are many stable combinations (patterns or plans) that must be taught and practiced [14]. Generalizing these findings to other domains, we argue that skill application context can represent "chunks", general problem-solving patterns that are critical for acquiring expertise in a domain. Meanwhile, modeling complex skill knowledge has been a challenge and has attracted increasing attention.

Starting from [4], which constructed simple variants of traditional Knowledge Tracing, more advanced models have been put forward to address the multiple-skill credit and blame assignment issue [13, 6, 15]. However, these student models use a "flat" knowledge structure that overlooks any potential interactions among skills, or between skills and difficulty factors. Essentially, these models do not provide the formalism required to consider the "context" of a skill's mastery. Works that consider relationships among skills mostly focus on prerequisite relationships [2, 1] or a granular hierarchy [7]. Regarding data-driven evaluations of learner models, prior studies mostly use student problem-solving performance prediction [3, 6], which raises some concerns. For example, [5] has shown that highly predictive models may be useless for adaptive tutoring, and [9] has shown that they can have low parameter plausibility or consistency. While some attempts have been made to evaluate models in terms of their effects on tutoring [13], a recent learner outcome-effort paradigm [5] offers a promising way to empirically evaluate student models for adaptive tutoring, which we plan to extend. We believe that a good learner model should demonstrate significant impact on real-world personalized learning, so we also plan to design classroom studies to investigate the impact of such a model on actual diagnosis accuracy, student awareness of mastery, recommendation quality, and content creation.

∗Advisor: Peter Brusilovsky, School of Information Sciences, University of Pittsburgh

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

UMAP '16, July 13-17, 2016, Halifax, NS, Canada. © 2016 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-4370-1/16/07. DOI: http://dx.doi.org/10.1145/2930238.2930373

• Model interpretability. We expect that the extracted "context" representation units can be interpreted by learners or teachers. However, data-driven methods can be sensitive to the size or characteristics of the data and to the information type (such as performance information or domain knowledge). We will explore the effect of such factors and will consider applying automatic text analysis, combined with affordable expert effort, to achieve our goal without loss of model predictive performance.

2. EXPECTED CONTRIBUTIONS

We expect to achieve the following major contributions:

• A novel perspective and data-driven approach to building skill and learner models that consider skill application context. Our work will be the first to introduce the variability of application context for modeling students' deeper knowledge with data-driven techniques. We propose to categorize the "context" by combinations among skills and between skills and difficulty factors. Mastery of a skill can be granted only when a student demonstrates the ability to apply the skill in varied contexts.

• A novel multifaceted evaluation framework for learner models that considers practically important aspects. We propose a new, comprehensive internal data-driven evaluation (covering, e.g., parameter plausibility and mastery prediction quality) and an external evaluation examining end-user value.

• A novel learner model that more accurately diagnoses deeper learning and increases students' awareness of pursuing it. We expect that our new learner model can more accurately differentiate shallow learning from deep learning. We also expect that an open learner model implementation can increase students' awareness of pursuing real mastery, rather than simply increasing a performance score by attempting simple problems.

• A novel learner model that enables better recommendation. We expect that under our new learner model, recommendation can more accurately target specific applications or aspects of a skill, which would save time on simple cases.

• A novel skill model representation that encourages better content creation to address different skill application contexts. We expect that our extracted skill model (used by the learner model) will be able to guide content authors toward content that addresses a variety of skill application contexts.

3. PROPOSED APPROACH AND EVALUATION

3.1 Model Construction Overview

We plan to construct a Bayesian network (BN) that we call conjunctive knowledge modeling with hierarchical skill-context units (CKM-HC) to model the context of skill application. Figure 1 shows the structure of the model. The O nodes (shaded) represent binary observed student performance, the K nodes represent binary latent skill knowledge levels, and the M nodes represent aggregated binary latent skill knowledge levels, which we call Mastery. Edges denote causal relations. The model has three main functionalities that derive from its three main parts: performance prediction (Prediction), dynamic knowledge estimation (Knowledge), and mastery decision (Mastery). To save space, we mainly describe how we represent the skill application context, learn the network, and obtain the mastery decision.
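To make the node semantics concrete, here is a minimal numeric sketch of a conjunctive (noisy-AND-style) performance prediction and the joint-probability mastery aggregation. The function names and the slip/guess parameterization are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical one-slice sketch of a conjunctive observation model:
# P(O = correct) depends on whether ALL skill/context units behind a
# step are known, with slip/guess noise (parameter names illustrative).
def p_correct(p_known_units, slip=0.1, guess=0.2):
    """Noisy-AND: the step succeeds reliably only if every unit is known."""
    p_all_known = 1.0
    for p in p_known_units:
        p_all_known *= p  # units treated as independent in this sketch
    return p_all_known * (1 - slip) + (1 - p_all_known) * guess

# Mastery aggregation: joint probability that all units assigned to a
# skill are in the known state (mirrors the Mastery-node description).
def p_mastery(p_known_units):
    m = 1.0
    for p in p_known_units:
        m *= p
    return m

# A student strong on the basic skill but weak on its context unit:
units = [0.95, 0.40]
print(round(p_correct(units), 3))  # 0.38 * 0.9 + 0.62 * 0.2 = 0.466
print(round(p_mastery(units), 3))  # 0.95 * 0.40 = 0.38
```

Under this aggregation, a high estimate on the basic skill alone is not enough to assert mastery: a weak context unit pulls the joint probability down, which is exactly the intended "deeper knowledge" behavior.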

Figure 1: The BN structure of a simplified instantiation of CKM-HC with pairwise combinations for skill application contexts in one practice time slice.

• Skill Application Context Representation. We represent application contexts by units arranged hierarchically in both the Knowledge and Mastery parts:
- The first layer models the basic understanding of each individual skill (Figure 1, Layer 1). It also contains individual difficulty factors (if any).
- The intermediate layers (e.g., Figure 1, Layer 2) model the skill application contexts by context units. Nodes in upper layers are derived from lower-layer nodes with smaller units. A skill context unit can be constructed from skill combinations or from combinations of skills and difficulty factors.

Also, we expect to contribute to addressing the following important issues in building learner models:

• Model complexity. This is a common concern when applying Bayesian networks to build learner models. We will explore suitable representations for the skill and learner model, heuristics to search the space of representations, and efficient implementations and advanced techniques to reduce the overall complexity.


- The last layer models the mastery of each individual skill, where the nodes are fed by skill context units or by single skills from the first layer. To avoid repeated computation, each skill context unit connects to only a single Mastery node: the one representing the later of its basic skills in the temporal order in which the skills appear in the course.

• Network Structure and Parameter Learning. The final structure of the network depends on the skill context units that it incorporates. If we do not limit the search space of units, the network's complexity grows exponentially. Moreover, since the network involves latent variables, we use the Expectation-Maximization algorithm, which requires time-consuming posterior computation. We therefore propose a greedy search algorithm for learning the network structure. It requires a pre-ordering of the candidate skill context units. During each iteration, it compares the cost function value (e.g., data log-likelihood) of the network incorporating a new skill context unit with the optimal network from previous iterations. To rank the candidate context units, we can use the following general information:

- The original problem-to-skill Q-matrix. This provides basic frequency information for skill context units: those with higher frequencies can be considered more stable "patterns" worth modeling.
- Students' performance data. We can employ strategies such as extracting context units for which the difficulty gap between the combined skill unit and its hardest constituent skill (unit) is large.
- Natural language processing on the problem text. We can employ strategies such as extracting context units whose constituent skills appear close to one another in the text.

We can also consider using domain knowledge or resources to help extract more meaningful or more typical skill context units; for example, in programming we can use the abstract syntax tree (AST). Another challenging aspect is accounting for the temporal learning effect in such a "complex" network. As a first step, we ignore this effect during the model learning process while maintaining dynamic knowledge estimation during the application phase, as in [2]. We leave this issue for further exploration.

• Mastery Decision. CKM-HC aggregates the knowledge estimates from the skill context units assigned to the current skill into a final knowledge estimate of the skill, on which the mastery decision is based. We currently aggregate knowledge levels by computing the joint probability of all required skill units being in the known state.

3.2 Model Evaluation

We plan to conduct both internal data-driven and external end-user evaluations to compare our proposed skill and learner model with different alternatives, including:

• Traditional Knowledge Tracing with the original coarse-grained individual skills, where each observation maps to a single skill.
• Weakest Knowledge Tracing [4] with the original individual skills, where each observation maps to multiple skills with a minimization function over the skills.
• Conjunctive knowledge modeling [13, 15] with the original individual skills, where each observation maps to multiple skills with a conjunctive relation among the skills.
• Conjunctive knowledge modeling with hierarchical skill-context units (our proposed CKM-HC), where each observation maps to multiple skills and their skill-context units with a conjunctive relation among them.

3.2.1 Internal Data-Driven Evaluation

We first conduct an internal data-driven evaluation based on extending a recent learner effort-outcome paradigm (LEOPARD) [5] and a multifaceted evaluation framework [9]:

• Mastery accuracy. Once a learner model asserts mastery of an item's required skills, the student should be unlikely to fail in actual performance.
• Mastery effort. This metric empirically quantifies the number of practices needed to reach mastery of a set of skills according to a learner model's estimates.
• Parameter plausibility. This metric investigates the consistency of the fitted parameters with the model's assumptions and with users' intuition.
• Predictive accuracy on student answers. We will evaluate how well the new model predicts the correctness of a student's answer, or the content of a student's solution, depending on the problem type.

We can also consider external data-driven evaluations, such as predicting external test performance where such data are available, as in [3].

3.2.2 External End-User Evaluation

We will conduct classroom and user studies with a personalized learning system built on our learner model. This system should contain the following components:

• Open learner model. Our proposed learner model intrinsically empowers new visualizations, and we intend to investigate possible open learner model implementations; Figure 2 demonstrates one example.
• Recommendation. We plan to implement new recommendations based on the new learner model. For example, in Figure 2, each cell that corresponds to a specific context unit will link to the recommended materials for that application context unit, so that students can be guided to more focused practice materials.
• Learning content creation. Once we construct a skill model that contains all of the important application context units, we can build a tool to help identify "missing" learning content that would cover the context of some important skill application.
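As a concrete illustration of the data such an open learner model could display, here is a small hypothetical sketch; the skill and context-unit names are invented for illustration (the actual grid in Figure 2 comes from the Java programming domain):

```python
# Hypothetical per-skill grid of knowledge estimates: each skill row
# holds an estimate for the basic skill and for each of its
# application-context units (all names and numbers are invented).
estimates = {
    "for-loop": {"basic": 0.92, "for-loop+array": 0.55, "for-loop+if": 0.71},
    "array":    {"basic": 0.88, "array+if": 0.60},
}

def render(estimates):
    """Render the grid as plain text, one skill per row."""
    lines = []
    for skill, units in estimates.items():
        cells = ", ".join(f"{u}: {p:.2f}" for u, p in units.items())
        lines.append(f"{skill:8s} | {cells}")
    return "\n".join(lines)

print(render(estimates))
```

Each cell could then link to recommended practice content for its context unit, in line with the recommendation component proposed above.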

Figure 2: An example of the open learner model implementation of our proposed learner model in the Java programming domain.

Based on the new personalized learning system, we will compare the following aspects of the different models:


1. Do students agree more with the knowledge and mastery estimates, or diagnoses, from the new learner model?
2. Does the new learner model increase students' awareness of pursuing true mastery rather than performance?
3. Does the new learner model enable more helpful recommendation or remediation?
4. Does the new learner model enable students to achieve deeper understanding, as evaluated by specifically designed intermediate and final tests?
5. Does the new learner model encourage better learning content creation?
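To illustrate how the mastery accuracy and mastery effort metrics from the internal evaluation can be computed from logged practice data, here is a simplified sketch; the threshold value and the data layout are assumptions, and the actual LEOPARD formulation in [5] differs in its details:

```python
# Simplified LEOPARD-style quantities from one student's practice log.
def mastery_metrics(attempts, threshold=0.95):
    """attempts: list of (p_known_before_attempt, correct) per practice.
    Returns (effort, accuracy): number of practices before the model
    first asserts mastery, and the correctness rate on all attempts
    from the assertion point onward (None if mastery never asserted)."""
    effort = 0
    post = []          # correctness observed after mastery is asserted
    asserted = False
    for p, correct in attempts:
        if not asserted and p < threshold:
            effort += 1
        else:
            asserted = True
            post.append(correct)
    accuracy = sum(post) / len(post) if post else None
    return effort, accuracy

attempts = [(0.4, 0), (0.6, 1), (0.8, 1), (0.96, 1), (0.97, 1), (0.98, 0)]
print(mastery_metrics(attempts))  # effort 3, accuracy 2/3
```

A good learner model under this lens asserts mastery with few wasted practices (low effort) while rarely being contradicted by later performance (high accuracy); the two metrics trade off against each other.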

4. PROGRESS

We have conducted preliminary studies with the simplest form of skill application context units: pairwise skill combinations constructed by grouping two original individual skills together, as in Figure 1. We evaluated the new model using some of the proposed internal data-driven metrics. The results appear promising, but more effort is required for further improvement. We used a Java program comprehension dataset and a SQL generation dataset collected across two years of University of Pittsburgh classes. Due to runtime limitations, we employed a heuristic approach to choose skill combinations (without a complete search procedure) and conducted internal data-driven evaluations (by 10-fold cross-validation). We found that incorporating pairwise skill combinations can significantly increase mastery accuracy and can direct students' practice efforts more reasonably, compared to traditional knowledge tracing models and their non-hierarchical counterparts. Meanwhile, the predictive performance of all models was similar. On the Java dataset, we also explored the effects of considering textual proximity (in a simple way) and of adding expert examination for skill combination extraction, and achieved some interesting results. Our preliminary results are summarized in [10]. As the next steps in our study, we aim to achieve the following goals:

• Consider and collect datasets with higher complexity and a greater variety of skill application contexts.
• Explore efficient implementations and advanced techniques to reduce the complexity of the Bayesian network and to include the temporal learning effect.
• Incorporate higher-order skill combinations and difficulty factors by intensively applying natural language processing techniques.
• Explore the necessity and effects of including external domain knowledge or resources in skill context unit extraction, particularly how to balance the model's accuracy against the interpretability of its results.
• Conduct an affordable user study and expert engineering to explore the significance, nature, and categorization of the skill application context units.
• Implement the new learner model in an adaptive system and conduct both user and classroom studies.

5. REFERENCES

[1] C. Carmona, E. Millán, J.-L. Pérez-de-la-Cruz, M. Trella, and R. Conejo. Introducing prerequisite relations in a multi-layered bayesian student model. In User Modeling, pages 347–356. Springer, 2005.
[2] C. Conati, A. Gertner, and K. VanLehn. Using bayesian networks to manage uncertainty in student modeling. User Modeling and User-Adapted Interaction, 12(4):371–417, 2002.
[3] A. T. Corbett and J. R. Anderson. Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4):253–278, 1995.
[4] Y. Gong, J. E. Beck, and N. T. Heffernan. Comparing knowledge tracing and performance factor analysis by using multiple model fitting procedures. In Intelligent Tutoring Systems, pages 35–44. Springer, 2010.
[5] J. P. González-Brenes and Y. Huang. Your model is predictive but is it useful? Theoretical and empirical considerations of a new paradigm for adaptive tutoring evaluation. In Proc. of the 8th Intl. Conf. on Educational Data Mining, pages 187–194, 2015.
[6] J. P. González-Brenes, Y. Huang, and P. Brusilovsky. General features in knowledge tracing: Applications to multiple subskills, temporal item response theory, and expert knowledge. In Proc. of the 7th Intl. Conf. on Educational Data Mining, pages 84–91, 2014.
[7] J. E. Greer and G. I. McCalla. A computational framework for granularity and its application to educational diagnosis. In Intl. Joint Conf. on Artificial Intelligence, pages 477–482, 1989.
[8] N. T. Heffernan and K. R. Koedinger. The composition effect in symbolizing: The role of symbol production vs. text comprehension. In Proc. of the 19th Annual Conf. of the Cognitive Science Society, pages 307–312. Lawrence Erlbaum Associates.
[9] Y. Huang, J. P. González-Brenes, R. Kumar, and P. Brusilovsky. A framework for multifaceted evaluation of student models. In Proc. of the 8th Intl. Conf. on Educational Data Mining, pages 203–210, 2015.
[10] Y. Huang, J. Guerra, and P. Brusilovsky. A data-driven framework of modeling skill combinations for deeper knowledge tracing. In Proc. of the 9th Intl. Conf. on Educational Data Mining (accepted).
[11] K. R. Koedinger, A. T. Corbett, and C. Perfetti. The knowledge-learning-instruction framework: Bridging the science-practice chasm to enhance robust student learning. Cognitive Science, 36(5):757–798, 2012.
[12] K. R. Koedinger, E. A. McLaughlin, and J. C. Stamper. Automated student model improvement. In Proc. of the 5th Intl. Conf. on Educational Data Mining, pages 17–24, 2012.
[13] K. R. Koedinger, P. I. Pavlik Jr, J. C. Stamper, T. Nixon, and S. Ritter. Avoiding problem selection thrashing with conjunctive knowledge tracing. In Proc. of the 4th Intl. Conf. on Educational Data Mining, pages 91–100, 2011.
[14] E. Soloway and K. Ehrlich. Empirical studies of programming knowledge. IEEE Transactions on Software Engineering, SE-10(5):595–609, 1984.
[15] Y. Xu and J. Mostow. A unified 5-dimensional framework for student models. In Workshop on Approaching Twenty Years of Knowledge Tracing at the 7th Intl. Conf. on Educational Data Mining, pages 122–129, 2014.
