Data-driven Construction of a Student Model using Bayesian Networks in an Electrical Domain

Yasmín Hernández1, Marilú Cervantes-Salgado2, Miguel Pérez-Ramírez1, Manuel Mejía-Lavalle2

1 Instituto Nacional de Electricidad y Energías Limpias, Gerencia de Tecnologías de la Información, Reforma 113, 62490 Cuernavaca, México
2 Centro Nacional de Investigación y Desarrollo Tecnológico, Computer Science Department, Interior Internado Palmira s/n, 62490 Cuernavaca, México
{myhp, mperez}@iie.org.mx, {marilu.cersa, mlavalle}@cenidet.edu.mx

Abstract. The student model is a key component of intelligent tutoring systems, since it enables them to respond to the particular needs of students. In recent years, educational systems have become widespread in schools and industry, and they produce data which can be used to know students and to understand and improve the learning process. Student modeling has been improved thanks to educational data mining, which is concerned with discovering novel and potentially useful information in large volumes of data. To build a student model, we used the data log of a virtual reality training system that has been used for several years to train electricians. We compared the results of this student model with a student model built by an expert. We rely on Bayesian networks to represent both student models. Here we present the student models and the results of an initial evaluation.

Keywords: Bayesian networks, educational data mining, student model, virtual reality, training systems.

1 Introduction

Over the last decades, intelligent tutoring systems (ITS) have evolved and proved to be a successful application of artificial intelligence techniques. Part of the intelligence of these systems resides in knowing students and, consequently, in responding to their individual needs. This adaptation process is based on the student model, a structure storing knowledge about the student such as errors, misconceptions and trials; it can also store information about personality, emotions, self-efficacy and motivation. There is extensive research in student modeling [1, 2, 3, 4], and a novel approach analyzes data from educational environments to understand learning and students. This emerging field is called educational data mining (EDM). EDM has emerged as a result of the growing usage of educational environments, such as e-learning systems and ITS, which produce an enormous volume of data about the student-system interaction and about how the learning process is advancing.


EDM exploits statistical, machine learning and data mining algorithms [5], and it is defined as an emerging discipline concerned with developing methods to explore the unique types of data that come from educational settings, and with using those methods to understand students and the settings in which they learn [6]. We developed a non-immersive virtual reality training system for electrical maintenance. This system has been used for several years as a complementary tool to certify electricians in maintenance procedures for medium tension power lines. We want to build a student model based on the data log of this system. Bayesian networks have been used in ITS to model student knowledge, predict student behavior and make tutoring decisions, due to their strong mechanisms for managing the uncertainty involved [7]. We rely on Bayesian networks to probabilistically relate the behavior and actions of the students with their current knowledge. The tree augmented naive Bayes algorithm [8] and the GeNIe software package [9] were used to learn the Bayesian model from the data produced by the system for electrical training. We conducted an initial evaluation comparing the data-driven student model with a student model built with expert knowledge. Here we describe the process to build the data-driven student model and the results of the initial evaluation. The rest of this paper is organized as follows: Section 2 presents the virtual reality system for electrical training that we used as a case study. Section 3 describes the procedure to build the data-driven student model. Section 4 presents the results of the initial evaluation of the student model. Finally, conclusions and future work are presented in Section 5.

2 Virtual Reality System for Electrical Training

We have developed several non-immersive virtual reality systems for training. The Virtual Reality System for Electrical Training (SRV) is one of them; it includes lessons and practices for 43 maintenance procedures (MP) for medium tension power lines, each of which is a rigorous sequence of steps. Fig. 1 shows the SRV while a trainee is performing an MP.

Fig. 1. Virtual Reality System for training on maintenance procedures for medium tension power lines. The MP "Change of a pin insulator using bucket truck" is being performed.

Each MP is composed of a number of steps, and in turn each step is composed of a number of sub-steps. At the beginning of each MP the system also includes a training section where students learn all the materials, equipment, safety gear and tools needed to perform the MP. Thus, each MP includes two sections: the selection of tools and the development of the MP through a series of steps and sub-steps. The system provides students with facilities for these two sections to be learnt, practiced and evaluated. Table 1 describes an MP step which consists of six sub-steps; each sub-step comprises a description and an instruction, where the instruction is an action to be executed by the trainee. The MP described in Table 1 corresponds to the MP shown in Fig. 1.

Table 1. Example of a step of the maintenance procedure "Change of a pin insulator using bucket truck". The step consists of six sub-steps.

Sub-step 1
Description: Place the new pin insulator and screw it to the crossarm.
Instruction: Select the 13PC pin insulator from the menu of materials.

Sub-step 2
Description: The lineman places the pin insulator on the crossarm. The insulator was previously hoisted using the errand bucket.
Instruction: Click on the 13PC pin insulator.

Sub-step 3
Description: Proceed to screw and fix the insulator using the 1/2" reversible ratchet with a 15/16" socket. Then the insulator base and the crossarm are covered back with the rubber blanket.
Instruction: Click on the 1/2" reversible ratchet.

Sub-step 4
Description: Proceed to remove the rubber blanket covering the auxiliary support and the medium tension line. Then the new insulator just placed is covered with the same blanket. See Chapter 100, Section 119, Paragraph L, for more details on the importance of covering the operating point.
Instruction: Click on the clamp clip or the rubber blanket.

Sub-step 5
Description: Once the insulator is covered, the floor lineman releases the moorage holder made in the rope restrainer, so the lineman on the pole can place the medium tension live line back on the new insulator just placed.
Instruction: Click on the polypropylene rope.

Sub-step 6
Description: Once the errand rope is released from the restraining-rope support, the medium tension line (the one covered) is placed on the new insulator.
Instruction: Click on the medium voltage line (the one covered).

Evaluations are organized in two separate sub-evaluations: a) a practical test, which consists of selecting the tools and developing the MP, covering the two sections described above; and b) a theoretical test, which consists of multiple-choice questionnaires whose questions are selected from a database and which is marked automatically by the system. During the evaluation process, the system generates relevant data such as approved MP, errors during tool selection, errors made in specific steps, and correct and incorrect answers. The errors are classified according to their impact on learning. Table 2 shows the types of errors in the performance of the MP and Table 3 shows the types of errors in the tool selection.

Table 2. Error types in the performance of a maintenance procedure.

Error type   Description
1            The trainee is trying to guess because he clicked on the wrong element in the virtual environment.
2            The trainee is trying to guess because he selected a tool which is not required for the MP. This error is moderate.
3            The trainee is unfamiliar with the interface because he clicked on an element when he was asked to interact with the menu. This error is weak.
4            The trainee was distracted because he selected a tool when a scene interaction was required.

Table 3. Error types in the tool selection section of a maintenance procedure.

Error type   Description
5            The trainee is trying to guess because he selected a tool which is not required for the MP.
6            The trainee has incomplete knowledge because he selected the correct tool but a wrong number of items.
7            The trainee was distracted because he selected a tool that was already selected.
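For illustration only, the error taxonomy of Tables 2 and 3 maps naturally onto a small enumeration keyed by the error id stored in the SRV database. The SRV itself is not written in Python, and the constant names below are our invention; this is a sketch of the data structure, not the system's code.

```python
from enum import IntEnum

class SrvErrorType(IntEnum):
    """Error taxonomy of the SRV (Tables 2 and 3). Values match the
    error ids stored in the database; names are illustrative only."""
    # Errors in the performance of the MP (Table 2)
    GUESS_WRONG_ELEMENT = 1   # clicked the wrong element in the scene
    GUESS_WRONG_TOOL    = 2   # selected a tool not required for the MP
    INTERFACE_CONFUSION = 3   # clicked the scene when menu interaction was asked
    DISTRACTED_TOOL     = 4   # selected a tool when a scene interaction was required
    # Errors in the tool selection section (Table 3)
    TOOL_NOT_REQUIRED   = 5   # selected a tool not required for the MP
    WRONG_ITEM_COUNT    = 6   # correct tool, wrong number of items
    TOOL_ALREADY_CHOSEN = 7   # selected a tool already selected

# Attribute names used later in the mined sample (Tables 5 and 7):
# EPT1-EPT4 for types 1-4, EMT5-EMT7 for types 5-7.
ATTRIBUTE_NAME = {t: (f"EPT{t.value}" if t.value <= 4 else f"EMT{t.value}")
                  for t in SrvErrorType}
```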

3 Data Mining for Student Modeling

Human teachers and tutors know students by means of interaction and observation, and in this way they adapt their instruction to the particular needs of students. The interaction between students and teachers provides data about student knowledge, goals, skills, motivation, and interests. In an intelligent tutoring system this information is recorded in the student model to ensure that the system has principled knowledge about each student, so that it can respond effectively, engage students and promote learning [1]. It is likely that educational data mining and machine learning techniques will play a larger role in augmenting student models automatically. Machine learning is concerned with the ability of a system to acquire and integrate new knowledge through observations of users, and with improving and extending itself by learning rather than by being programmed with knowledge [1, 10]. These techniques organize existing knowledge and acquire new knowledge by intelligently recording and reasoning about data. For example, observations of the previous behavior of students can provide training examples for a model designed to predict future behavior [1, 11]. The SRV has been used in several training courses by hundreds of trainees since 2006. During this time it has produced a large volume of data that can be exploited to understand electrical training and to know the trainees. This new knowledge could be useful to improve the learning process and could also be worthwhile for the training policies of the electricity company.

Data mining, or knowledge discovery in databases, is the field of discovering novel and potentially useful information in large amounts of data [12]. Educational data mining methods are often different from standard data mining methods, due to the need to explicitly account for the multi-level hierarchy and non-independence in educational data [6]. As a first attempt to take advantage of the data produced by the SRV, we decided to build a student model by mining the data about the performance of trainees in the SRV. As mentioned, the SRV includes 43 MP, and every MP differs from the others in its number of steps and sub-steps. Therefore we decided to mine the data about each MP separately; consequently, in this exercise we mined the data related to only one MP. We also took only the data of one division, since we believe that we could find differences in training among divisions of the company. In order to obtain a sample, we analyzed the different tables in the databases of the SRV. As mentioned, the SRV only keeps track of the errors made in performing the MP and of a mark for the theoretical evaluation. Some examples of the stored error records are shown in Table 4.

Table 4. Examples of the errors stored in the SRV database.

Evaluation Id   Trainee Id   MP Id   No. Step   No. Sub-step   Error Id
32              93713        1       5          1              1
32              67856        1       3          1              1
23              YF022        1       5          1              1
23              YF022        1       5          1              1
23              YF022        1       5          1              1

After several attempts to prepare and mine these data, we obtained a sample consisting of 518 records of 100 trainees in 67 courses. However, after analyzing the completeness and correctness of the data, we found only 157 useful records. We used 100 records for the learning process and 57 records for the evaluation. Examples of the resulting records after preprocessing are shown in Table 5; a sketch of this preprocessing step is given after the table. The attributes for the MP, step and sub-step are not included because the data mining algorithms did not find a relationship between them and the other attributes.

Table 5. Examples of resultant records after preprocessing the errors in the SRV database.

Mark   EPT1   EPT2   EPT3   EPT4   EMT5   EMT6   EMT7
87     8      0      0      0      7      4      5
100    3      0      0      0      5      0      0
80     0      0      0      0      0      3      3
90     0      0      0      0      5      3      3
70     3      0      0      0      1      0      0

The database contains the errors in the performance of the MP and the mark in the theoretical test; however, there is no field which tells us whether the trainee is qualified in the MP, because the system is not intended to certify trainees. To obtain a classification of the records, we used an unsupervised learning approach, applying the k-means algorithm in Weka [13], the popular suite of machine learning software. The k-means algorithm groups a set of objects in such a way that objects in the same cluster are more similar to each other than to those in other clusters [14]. As we need to know whether the trainee is skilled in the MP, we used two classes: trained and untrained. Table 6 shows the results of the k-means algorithm; an expert decided which cluster corresponds to trained participants and which corresponds to untrained participants. A minimal clustering sketch follows Table 6.

Table 6. Results of the k-means clustering algorithm with a sample of 100 records.

Class           Number of records classified
1 (Trained)     87
0 (Untrained)   13

Then, we analyzed the fields in the database to select adequate attributes for building the Bayesian network representing the student model. We selected eight attributes related to the errors made during the performance of the MP and to the mark of the theoretical evaluation. Table 7 describes the selected attributes; the description of each type of error can be found in Table 2 and Table 3.

Table 7. Selected attributes to be included in the learning process of the Bayesian network.

Attribute   Description                                              Values
Trained     Competency of the trainee in the MP                      1, 0
Mark        Mark of the trainee in the theoretical test              0-100
EPT1        Number of type 1 errors made by the trainee in the MP    0-n
EPT2        Number of type 2 errors made by the trainee in the MP    0-n
EPT3        Number of type 3 errors made by the trainee in the MP    0-n
EPT4        Number of type 4 errors made by the trainee in the MP    0-n
EMT5        Number of type 5 errors made by the trainee in the MP    0-n
EMT6        Number of type 6 errors made by the trainee in the MP    0-n
EMT7        Number of type 7 errors made by the trainee in the MP    0-n

Since the data are continuous (except for the class attribute, trained), preprocessing was needed in order to discretize them. The discretization of the attributes is shown in Table 8 and Table 9; a short discretization sketch follows Table 9. After this processing, the data were ready to be used to find patterns and to learn a Bayesian network representing the student model. We rely on the tree augmented naive Bayes algorithm of the GeNIe software package. GeNIe is the graphical interface to SMILE, a fully portable Bayesian inference engine implementing graphical decision-theoretic methods such as Bayesian networks, influence diagrams and structural equation models [9].

Table 8. Discrete values for the attribute mark.

Value            Description
Between 0-10     A mark between 0 and 10 was obtained by the trainee
Between 11-20    A mark between 11 and 20 was obtained by the trainee
Between 21-30    A mark between 21 and 30 was obtained by the trainee
Between 31-40    A mark between 31 and 40 was obtained by the trainee
Between 41-50    A mark between 41 and 50 was obtained by the trainee
Between 51-60    A mark between 51 and 60 was obtained by the trainee
Between 61-70    A mark between 61 and 70 was obtained by the trainee
Between 71-80    A mark between 71 and 80 was obtained by the trainee
Between 81-90    A mark between 81 and 90 was obtained by the trainee
Between 91-100   A mark between 91 and 100 was obtained by the trainee

Table 9. Discrete values for the attributes representing the errors.

Value            Description
Between 0-10     A number of errors between 0 and 10 was made by the trainee
Between 11-20    A number of errors between 11 and 20 was made by the trainee
Between 21-30    A number of errors between 21 and 30 was made by the trainee
Between 31-40    A number of errors between 31 and 40 was made by the trainee
Between 41-50    A number of errors between 41 and 50 was made by the trainee
Between 51-60    A number of errors between 51 and 60 was made by the trainee
Between 61-70    A number of errors between 61 and 70 was made by the trainee
Between 71-80    A number of errors between 71 and 80 was made by the trainee
Between 81-90    A number of errors between 81 and 90 was made by the trainee
More than 90     A number of errors greater than 90 was made by the trainee
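The binning of Tables 8 and 9 can be expressed directly with pandas. The following is a minimal sketch, assuming a sample with the Table 5 attributes; the interval labels pandas generates stand in for the "Between x-y" values of the tables.

```python
import pandas as pd

# Interval edges following Tables 8 and 9: ten bins of width 10 for the
# mark, and the same bins for error counts with an open-ended top bin.
MARK_BINS  = [-1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ERROR_BINS = [-1, 10, 20, 30, 40, 50, 60, 70, 80, 90, float("inf")]
ERROR_COLS = ["EPT1", "EPT2", "EPT3", "EPT4", "EMT5", "EMT6", "EMT7"]

def discretize(sample: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the sample with Mark and error counts binned."""
    out = sample.copy()
    out["Mark"] = pd.cut(out["Mark"], bins=MARK_BINS)
    for col in ERROR_COLS:
        out[col] = pd.cut(out[col], bins=ERROR_BINS)
    return out
```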

The selection of data-driven techniques to apply to a particular class of problems is a function of the nature of the data and of the problem to be solved [15]. There are several interesting alternatives; we tried the algorithms available in GeNIe in order to obtain the data-driven model: Bayesian search, PC, essential graph search, greedy thick thinning, tree augmented naive Bayes, augmented naive Bayes, and naive Bayes. However, the only one which found the beliefs and dependencies we were looking for was the tree augmented naive Bayes algorithm [16]. We are looking for dependencies such as the class node given the error nodes, and the network obtained with the tree augmented naive Bayes algorithm is the one that found this kind of relationship. For this exercise, we chose node EMT3 as the network class node because this type of error shows knowledge of both sections: tool selection and maintenance procedure. The tree augmented naive Bayes network is an extension of the naive Bayes network. As in naive Bayes, the root node is the class node, and it is causally connected to every evidence node. The tree augmented naive Bayes structure relaxes the assumption of independence between the evidence nodes and allows most evidence nodes to have a second parent, which can be a related evidence node. This maintains the directed acyclic graph requirements and produces a tree that captures relationships among the evidence [15]. The network resulting from this algorithm is presented in Fig. 2; a code sketch of this learning step follows the figure.

Fig. 2. Bayesian network learned from the data by the tree augmented naive Bayes algorithm. This network represents the student model.
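The authors learned the network with GeNIe's tree augmented naive Bayes algorithm; GeNIe is a graphical tool, so there is no script to quote. As an open-source stand-in, the pgmpy library implements the same TAN structure search. The sketch below is our substitution under that assumption: the input file name and the choice of root node are hypothetical, and where the paper reports using EMT3 as the class node in GeNIe, this sketch classifies the Trained node for clarity.

```python
import pandas as pd
from pgmpy.estimators import TreeSearch, BayesianEstimator
from pgmpy.models import BayesianNetwork

# The 100 discretized training records (hypothetical file name; every
# column is categorical after the binning of Tables 8 and 9).
data = pd.read_csv("srv_discrete.csv", dtype=str)

# TAN structure search: the class node is linked to every attribute and
# the attributes form a tree among themselves. root_node (the attribute
# that roots this tree) is our arbitrary choice.
dag = TreeSearch(data, root_node="Mark").estimate(
    estimator_type="tan", class_node="Trained")

# Parameter learning with a BDeu prior, which smooths the sparse counts
# of a 100-record sample.
model = BayesianNetwork(dag.edges())
model.fit(data, estimator=BayesianEstimator, prior_type="BDeu")
```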

In the learned Bayesian model we can see the relevance of the EMT3 node (which represents non-severe errors when selecting tools) for classifying the trainee as trained or untrained in the MP. This relationship confirms that knowledge of the tools is important for assessing the performance of the MP.
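Continuing the pgmpy sketch above, the learned network classifies a trainee by computing the posterior of the Trained node given the observed evidence. The evidence values below are illustrative interval labels and must match those produced during discretization.

```python
from pgmpy.inference import VariableElimination

# `model` is the Bayesian network fitted in the previous sketch.
infer = VariableElimination(model)

# Posterior probability that the trainee is trained, given a high mark
# and few errors of types 1 and 5.
posterior = infer.query(
    variables=["Trained"],
    evidence={"Mark": "81-90", "EPT1": "0-10", "EMT5": "0-10"},
)
print(posterior)
```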

4 Results

In order to evaluate the data-driven model, we built a student model based on expert knowledge. We asked an expert instructor to assert relationships between the factors of an MP, namely steps, sub-steps, tools, errors and the theoretical test. The expert built the model presented in Fig. 3. As can be observed, the expert considers that errors have an impact on each step; the values for the step nodes are the types of errors. The expert stated the conditional probabilities based on the types of errors and on their number: the probability of knowing a step depends on the number of errors and on their severity. This means that if the trainee makes type 1 errors, the probability of knowing that step decreases more than if the errors are of type 3.

Fig. 3. Bayesian network representing the student model built by the expert.

To evaluate the data-driven model we compared the classification results of both models. The results of this evaluation are presented in Table 10, together with some examples of the 47 evaluated records. As can be observed, the precision of the data-driven model is around 50%. We need more experimentation and a more principled evaluation in order to reach a comprehensive result. One possible reading of this comparison is sketched after Table 10.

Table 10. Comparison between the data-driven model and the model built by the expert.

Trainee   Data-driven model       Expert model            Difference
          Trained    Untrained    Trained    Untrained    between models
1         50%        50%          62%        38%          12%
2         75%        25%          80%        20%           5%
3         88%        13%          80%        20%           8%
4         88%        13%          85%        15%           3%
5         75%        25%          84%        16%           9%

Precision in cases with difference ≥ 5%: 50%
Precision in cases with difference ≥ 10%: 70%
Precision in cases with difference ≥ 20%: 80%
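One plausible reading of Table 10, sketched below: each model outputs P(trained) per record, two records agree when both probabilities fall on the same side of 0.5, and agreement is reported over the subset whose probabilities differ by at least a threshold. This is our interpretation, not the authors' published procedure, and it is run here only on the five example rows, so the printed figures will not reproduce the precision values computed over all 47 records.

```python
import pandas as pd

# P(trained) assigned to each record by the two models
# (the five example rows of Table 10).
cmp = pd.DataFrame({
    "data_driven": [0.50, 0.75, 0.88, 0.88, 0.75],
    "expert":      [0.62, 0.80, 0.80, 0.85, 0.84],
})
cmp["difference"] = (cmp["data_driven"] - cmp["expert"]).abs()

# Agreement on the trained/untrained decision among the records whose
# probabilities differ by at least each threshold.
for threshold in (0.05, 0.10, 0.20):
    subset = cmp[cmp["difference"] >= threshold]
    if len(subset) == 0:
        print(f"difference >= {threshold:.0%}: no cases")
        continue
    agree = ((subset["data_driven"] >= 0.5) == (subset["expert"] >= 0.5)).mean()
    print(f"difference >= {threshold:.0%}: {agree:.0%} agreement "
          f"on {len(subset)} cases")
```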

5 Conclusions and Future Work

We have presented a student model built from the data of a training system. We evaluated the model by comparing its classification results with those of a student model built by an expert. In an initial evaluation we obtained encouraging results, since in most cases the classification by the data-driven model coincides with the classification made by the expert model. However, more experimentation is needed before integrating the model into the training system. We need to build the structure for the rest of the MP. We also want to apply other EDM algorithms that we did not consider in this experiment. Additionally, we would like to compare the results of the data-driven model with the opinions of the trainees; in this way a trainee could evaluate the assessment of the model and give us his own opinion. For the integration of a student model into the SRV there are several alternatives. One of them is to use the model to adapt the instruction to the trainee's learning needs. Another alternative is to use it as an open student model, that is, to show the model to instructors and trainees. The trainee could see which topics and steps he ought to practice. The instructor could use the model to help the student learn, and also to plan the lessons and prepare his own teaching. The student model can also be useful for designing new courses and instructional material. These potential applications of the student model will in turn improve the learning process.

Acknowledgments. The authors would like to thank the CFE experts for much useful advice in the definition of the model. MCS thanks INEEL for all the support in the development of this experiment.

References

1. Woolf, B.P.: Student Modeling. In: Nkambou, R., Mizoguchi, R., Bourdeau, J. (eds.): Advances in Intelligent Tutoring Systems. Studies in Computational Intelligence, Vol. 308. Springer-Verlag, Berlin Heidelberg New York (2010) 267-279
2. Sosnovsky, S., Brusilovsky, P.: Evaluation of Topic-based Adaptation and Student Modeling in QuizGuide. User Modeling and User-Adapted Interaction 25, 4 (2015) 371-424
3. Conati, C., Kardan, S.: Student Modeling: Supporting Personalized Instruction, from Problem Solving to Exploratory, Open-Ended Interactions. AI Magazine 34, 3 (2013) 13-26
4. Mitrovic, A.: Modeling Domains and Students with Constraint-Based Modeling. In: Nkambou, R., Mizoguchi, R., Bourdeau, J. (eds.): Advances in Intelligent Tutoring Systems. Studies in Computational Intelligence, Vol. 308. Springer-Verlag, Berlin Heidelberg New York (2010) 63-80
5. Romero, C., Ventura, S.: Educational Data Mining: A Survey from 1995 to 2005. Expert Systems with Applications 33 (2007) 125-146
6. Baker, R.S.J.d.: Mining Data for Student Models. In: Nkambou, R., Mizoguchi, R., Bourdeau, J. (eds.): Advances in Intelligent Tutoring Systems. Studies in Computational Intelligence, Vol. 308. Springer-Verlag, Berlin Heidelberg New York (2010) 323-337
7. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press (2000)
8. Jiang, L., Zhang, H., Cai, Z., Su, J.: Learning Tree Augmented Naive Bayes for Ranking. In: Zhou, L., Ooi, B.C., Meng, X. (eds.): DASFAA 2005. LNCS, Vol. 3453. Springer-Verlag, Berlin Heidelberg New York (2005) 688-698
9. Druzdzel, M.: SMILE: Structural Modeling, Inference, and Learning Engine and GeNIe: A Development Environment for Graphical Decision-Theoretic Models. In: Proceedings of the 11th Conference on Innovative Applications of Artificial Intelligence (1999) 902-903
10. Shapiro, S.: Encyclopedia of Artificial Intelligence, 2nd edn. John Wiley & Sons, Chichester (1992)
11. Webb, G., Pazzani, M., Billsus, D.: Machine Learning for User Modeling. User Modeling and User-Adapted Interaction 11 (2001) 19-29
12. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA (1999)
13. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA Data Mining Software: An Update. SIGKDD Explorations 11 (2009) 10-18
14. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) 881-892
15. Mack, D.L.C., Biswas, G., Koutsoukos, X.D., Mylaraswamy, D.: Using Tree Augmented Naive Bayes Classifiers to Improve Engine Fault Models. In: Uncertainty in Artificial Intelligence: Bayesian Modeling Applications Workshop (2011)
16. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning 29 (1997) 131-163
