Learning Path Generation by Domain Ontology Transformation

Roberto Pirrone1, Giovanni Pilato2, Riccardo Rizzo2, and Giuseppe Russo1

1 DINFO - University of Palermo, Viale delle Scienze, 90128 Palermo, Italy
2 ICAR - Italian National Research Council, Viale delle Scienze, 90128 Palermo, Italy
[email protected],{pilato,ricrizzo}@pa.icar.cnr.it

Abstract. An approach to automated learning path generation inside a domain ontology supporting a web tutoring system is presented. A terminological ontology definition is needed in real systems to enable reasoning and/or planning techniques, and to take modern learning theories into account. Nevertheless, applying a planner to such an ontology is very hard, because the definition of actions, along with their preconditions and effects, has to take into account the semantics of the relations among concepts, and this results in building an ontology of learning. The proposed methodology is inspired by the Knowledge Space Theory, and proposes some heuristics to transform the original ontology into a weighted graph where the A* algorithm is used to find the path. The proposed approach is applied to the implementation of a web-based tutoring system about the Java programming language.

1 Introduction

The task of knowledge management for an e-learning application that aims to satisfy users' requests by generating personalized learning paths relies on the definition of a domain ontology structuring the concepts related to the course domain. An ontology can be defined as a way to specify concepts and relations among them [1], and a learning path between concepts is thus constrained by such relations. The correct definition of the ontological structures involved in an e-learning process has been a much debated topic in recent years. Stojanovic and his colleagues [2] devise three ontology types to provide a correct definition of a learning material: content, context, and structure. In the work of Panteleyev [3] up to five ontological levels or layers are defined: standard layer (basic concepts definition), relations layer (relations among concepts), logical layer (axioms to define concepts and relations), action layer (action definition), and methods layer (definition of sequences of action executions). These approaches, like many others, make use of the semantic web paradigm to build domain ontologies: the use of XML and RDF can provide interoperability between learning objects following different annotation standards such as IEEE LOM and IMS. The work by Nejdl [4] goes in this direction: the EDUTELLA framework is proposed as a peer-to-peer infrastructure to share information between distributed repositories of RDF metadata by means of a common data model (ECDM: Edutella Common Data Model). In all the previous works, knowledge is managed in the framework of a suitable RDF schema where concepts and relations are defined: the core of these works is the organization of learning materials, and little attention is devoted to the use of a particular learning theory. All the implementations inspired by these ideas result in simple curriculum systems. An interesting approach in the opposite direction is represented by the GetSmart system [5], where a constructivist environment is proposed, based on the integration of a searching tool, a curriculum tool, and a concept map to obtain a visual arrangement of information. Marshall observes that concept maps are crucial to the construction of mental schemata, which is the core concept in the constructivist theory, but this is not completely true. At least two knowledge organization levels are needed to obtain a learning system centered on the student's needs: an ontological structure of knowledge, and a concept map. Concept maps are intuitive interaction systems that allow the student to browse information according to his preferences, while still being able to provide a loose guidance procedure. Ontologies allow knowledge structuring and management aimed at several goals, such as the abstract definition of a learning theory used to inspire the real system, user modeling, and planning personalized learning paths. XML-based ontologies are not flexible enough to address all these tasks in a simple way. A linguistic approach that makes use of a terminological knowledge base is preferable, because predicates between concepts can be easily implemented, which allows further knowledge to be extracted by means of reasoning or planning techniques.

This work deals with a domain ontology for a web-based tutoring system, and it is particularly focused on a possible strategy to transform its structure to obtain simple generation of learning paths. In a generic domain ontology it is possible to devise two kinds of relations between concepts: structural and navigation relations. Structural relations are the classical specialization and subsumption predicates, plus some predicates related to the specific structure of the arguments in the knowledge domain. As an example, in a history domain, events have to be correlated to dates, while in an OOP domain, classes have to hold information about their methods. Navigation relations are related to the logical links between different pieces of knowledge: an argument is a prerequisite for another one, two arguments are related in some way, and so on. Moreover, given a concept, not all the other ones related to it contribute in the same way to its explanation, so the links have to be tagged with respect to concept relevance. A planning approach is needed to obtain an articulated learning path from such an ontology, but direct application of classical action-state planners is very hard because one has to define which are the most suitable actions to implement, with their preconditions and effects. It is not possible to simply follow the relations because they have different semantics, so a set of actions would have to be defined as functions of the link traversals. Such actions are learning actions, so an ontology about a particular learning process should be defined.

In this paper, the authors were inspired by the Knowledge Space Theory (KST) [6] to obtain a transformation from the original ontology defined in the OpenCyc [7] knowledge base to a weighted graph where the A* algorithm is applied to determine learning paths. KST is a classical theory about knowledge structuring to support learning, and in particular to devise what the student already knows about a certain topic, and what she/he wants to know. A natural language dialog session with the student elicits these sets of concepts and allows for an initial pruning of relations that are not useful. A suitable heuristic has been employed to obtain the arcs' weights from the semantics of each relation in the ontology, the absolute relevance of each concept in the domain, and a dynamic term describing the subjective relevance of each concept with respect to the student's goal. Finally, a map that represents and organizes a set of concepts in a spatial way is used to visualize the path inside the whole arrangement of learning materials. In the following discussion this map will be referred to as a concept map. A path may not be a unique trajectory, but it can be visualized with ramifications due to the level of detail needed to explain the goal topic: this is not in contrast with any modern learning theory, and in particular it accounts for a constructivist approach to the problem. The presented methodology is quite general, and the implementation for an ontology describing the Java programming language is detailed throughout the paper.

The rest of the paper is arranged as follows. Section 2 provides some brief remarks about the Knowledge Space Theory and the OpenCyc knowledge base. In section 3 the ontology-to-graph transformation is detailed, while in section 4 the implementation of the whole interaction cycle is explained, from the dialog with the student to the visual presentation of the path. Finally, in section 5 some conclusions are drawn, and future work is discussed.

2 Theoretical Background

The proposed methodology is based on the use of the KST as a framework to devise the starting point of the path to reach the goal, using a natural language dialog system to capture the intentions of the user. Another fundamental component of the implemented system is the OpenCyc knowledge base, which has been used because of the presence of truly general concepts and relations that can be considered as a sort of root for quickly developing a new ontology. In the presented work this holds both for the domain ontology and for the ontology about graphs, which has been defined in order to perform the transformation and to apply the A* algorithm. In this section a brief review of these two topics is reported.

2.1 Knowledge Space Theory

The KST was proposed by Doignon and Falmagne [6][8][9], and is a theoretical framework in the field of student modelling. KST is a psychological model for structuring domains of knowledge and offers a means to formally describe the structure of a given knowledge domain based on prerequisite relationships [10]. According to the KST, a field of knowledge can be decomposed into items: an item is a notion or skill to be learned. An item may also be presented as a task, which the student has to perform if the goal is to assess procedural and/or strategic knowledge. A field of knowledge is characterized by a set of items called a domain. A domain is the set of all items making up a particular subject matter. A student is considered to have learned the domain when she is capable of solving problems corresponding to all the items of the domain. Each student can be described by her knowledge state, i.e. the collection of items the student is capable of solving. Knowledge states are related to a specific student and a particular domain. Besides, a student's knowledge state changes over time, and the goal of learning is that, in the end, it should correspond to the complete domain. It is worthwhile to point out that, since there exist prerequisite relationships between the items, not all possible subsets of items are knowledge states. The collection of feasible knowledge states for a particular domain is called a knowledge space. Such a knowledge space contains the empty set $\emptyset$ and the complete item set $Q$ as elements, and, for any two knowledge states $K$, $K'$ belonging to the knowledge space $K_s$, their union $K \cup K'$ is also a member of $K_s$ [8]. The application of the knowledge space framework to tutoring tasks leads to the possibility of obtaining learning paths, which describe the student's possible paths from the total novice learner (identified by the empty set $\emptyset$) through the knowledge space to the complete expert of the domain (identified by the knowledge state $Q$) [11].
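The definitions above can be illustrated with a minimal sketch. It assumes a hypothetical four-item domain whose prerequisite relation (`PREREQS`) is invented for illustration and is not the ontology used in this work; the function names are likewise illustrative. The sketch enumerates the feasible knowledge states, checks closure under union, and lists the items a student is ready to learn from a given state.

```python
from itertools import combinations

# Hypothetical toy domain: item -> items that must be mastered first.
PREREQS = {
    "Type": set(),
    "ClassType": {"Type"},
    "InterfaceType": {"Type"},
    "Interface": {"ClassType", "InterfaceType"},
}

def is_feasible(state, prereqs):
    """A state is feasible when it contains the prerequisites of all its items."""
    return all(prereqs[item] <= state for item in state)

def knowledge_space(prereqs):
    """Enumerate every feasible knowledge state (exponential, toy sizes only)."""
    items = sorted(prereqs)
    states = set()
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            state = frozenset(combo)
            if is_feasible(state, prereqs):
                states.add(state)
    return states

def closed_under_union(states):
    """Check that the union of any two states is again a state."""
    return all((a | b) in states for a in states for b in states)

def ready_to_learn(state, prereqs):
    """Items not yet known whose prerequisites are all in the current state."""
    return {i for i in prereqs if i not in state and prereqs[i] <= state}

space = knowledge_space(PREREQS)
```

A learning path in this setting is a chain of states from the empty set to the full domain, obtained by repeatedly adding one item returned by `ready_to_learn`.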

2.2 OpenCyc

In recent years, Cycorp, Inc. has developed the Cyc knowledge base (KB), which has a very large ontology constituted by over one hundred thousand atomic terms axiomatized by a set of over one million assertions stated in nth-order predicate calculus. The Cyc KB is, at present, the largest and most complete general knowledge base, and it is equipped with a well-performing inference engine [12]. OpenCyc is suitable for automated logical inference to support knowledge-based reasoning applications. It also supports interoperability among software applications, is extensible, provides a common vocabulary, and is suitable for mapping to/from other ontologies. Cyc collections are natural kinds or classes. Collections are opposed to mathematical sets in that their instances have some common attribute(s). Each Cyc collection is similar to a set since it may have elements, subsets, and supersets, and may not have parts or spatial or temporal properties. The differences between a mathematical set and a Cyc collection are the following. A set can be constituted by an arbitrary group of uncorrelated things, and two sets with the same elements are considered equal. The instances of a collection, instead, have some common features, and two Cyc collections can have exactly the same instances without being identical.

3 The Proposed Methodology

In this work a three-level schema to model course topics is adopted. At the lowest level, information is aggregated as a set of HTML documents which can represent single learning objects (e.g. a Java documentation page) or a composition of them, as in the case of lessons. The intermediate representation is achieved by a concept map that is implemented as a self-organizing map (SOM) used to cluster documents in a Vector Space Representation using a measure of the similarity between the documents. The concept map is trained in an unsupervised way, and it is labelled with some landmark concepts that are used to bridge the gap with the symbolic level. The concept map holds an implicit representation of logical relations between concepts, and allows easy integration of new clusters standing for new concepts, which can be instantiated at the symbolic level together with their relations with the nearest regions. Finally, the linguistic representation of the domain ontology is provided, where the landmark concepts play the role of atomic assertions inside the knowledge base.
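The similarity measure used to cluster documents is not specified here. As a rough sketch only, the following assumes raw term-frequency vectors and cosine similarity, which is a common choice for a Vector Space Representation; both the weighting scheme and the function names are assumptions, not the measure actually adopted in the system.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term frequencies (assumed weighting, no stemming or stop words)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc_a = term_vector("an interface declares methods")
doc_b = term_vector("a class implements an interface")
doc_c = term_vector("arrays hold values")
```

Documents that share terms, such as `doc_a` and `doc_b`, obtain a positive similarity and would tend to fall in nearby regions of the map, while documents with no common terms obtain zero.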

Fig. 1. Three-levels information organization

All the information and the structures of the concepts in the implemented Java ontology (relations, terms, constraints, applications and so on) are organized and verified starting from the official Sun Microsystems document presenting the main language characteristics and the Java structure. Java is a strongly typed language; every variable and every expression has a type that is known at compile time. The Java ontology reflects this feature: the ontology is strongly connected, and every concept is related to one or many others via a few relations. Figure 2 illustrates a portion of the ontology as a UML class diagram. The domain-specific theory has been partitioned essentially into two levels: the structure level and the navigation level. The first level realizes a structural definition of the ontology concepts, in terms of composition, definition of properties, and all the other structural relations needed to represent the world. The navigation level gives the opportunity to tie together different concepts, allowing the student to improve her/his knowledge. Structural level predicates are:

– (#$isComposed #$Thing1 #$Thing2): Thing2 is part of Thing1;
– (#$genls #$Collection1 #$Collection2): OpenCyc inheritance predicate;
– (#$isaMethodOf #$Method #$Thing): an element of the Cyc collection Method is a method defined in class Thing.

The previous predicates are Ground Atomic Forms (GAF) in the OpenCyc vocabulary; they are just used to describe a fact in the representation. This is a very important starting point, but a formulation like this is still incomplete in terms of accuracy and importance of the concepts. Two relations are defined to navigate the ontology:

– (#$isaPrerequisiteFor #$Thing1 #$Thing2): Thing1 is a prerequisite to learn about Thing2;
– (#$conceptuallyRelated #$Thing1 #$Thing2): Thing1 and Thing2 are related in some way.

Fig. 2. A portion of the Java ontology
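To make the split between structural and navigation predicates concrete, the following sketch stores a few assertions as plain tuples and filters them by kind. It is an in-memory stand-in written for illustration and does not use the OpenCyc API; the `Assertion` type and the helper names are hypothetical, while the example facts are taken from the dialog reported in figure 4.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for binary GAFs; not the OpenCyc API.
Assertion = namedtuple("Assertion", ["predicate", "arg1", "arg2"])

STRUCTURAL = {"isComposed", "genls", "isaMethodOf"}
NAVIGATION = {"isaPrerequisiteFor", "conceptuallyRelated"}

# Facts as they appear in the example dialog of the paper.
KB = [
    Assertion("isaPrerequisiteFor", "Interface", "InterfaceDeclaration"),
    Assertion("isComposed", "Interface", "InterfaceBody"),
    Assertion("isComposed", "Interface", "InterfaceIdentifier"),
    Assertion("conceptuallyRelated", "InterfaceType", "ClassType"),
]

def relation_kind(assertion):
    """Classify an assertion as belonging to the structure or navigation level."""
    if assertion.predicate in STRUCTURAL:
        return "structural"
    if assertion.predicate in NAVIGATION:
        return "navigation"
    raise ValueError("unknown predicate: " + assertion.predicate)

def neighbours(kb, concept, kind):
    """Concepts reachable from `concept` through relations of the given kind."""
    return [a.arg2 for a in kb
            if a.arg1 == concept and relation_kind(a) == kind]
```

Keeping the kind of each link explicit is what later allows structure and navigation relations to be weighted differently.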

The predicate isaPrerequisiteFor has been implemented to provide a backbone structure inside the ontology, ensuring the possibility of extracting meaningful learning paths on the basis of a sequence of precedence relations. The conceptuallyRelated predicate is just an alternative way to explore the ontology: this is a weaker relation than isaPrerequisiteFor, but it enables free exploration of the ontology in response to a query like "what about ... ?". To avoid the combinatorial explosion of search paths resulting from the direct exploration of the ontology, a tag system has been introduced where every concept has an index representing its importance with respect to all the others. This tag is implemented with the relation (#$WeightRelation #$Thing (#$Unity SubLRealNumber)). This system presents two advantages: the selection of a navigation trajectory can be implemented simply by choosing, as a first approximation, the concept with the highest tag, and in the same way it is possible to introduce a heuristic criterion in the planning module. The tag system is related to the number of occurrences of a concept in the master document from which the ontology has been built. This citation weight can be assumed as an absolute relevance measure of the concept.

The ontology-to-graph transformation is performed according to the Knowledge Space Theory formulation of a learning path: the student is characterized by a knowledge state defined as the set of concepts she/he knows at a certain time, and she/he can move to another state by evaluating what she/he is ready to learn. In the original KST this is performed by questioning the user about topics directly related to the ones defining her current state. The ontology structure allows knowledge states to be defined in the same manner as in KST, and transitions between states take place by moving across the navigation relations. The natural language dialog interaction is used to determine the initial state and the goal. The transformation proceeds in this way:

– mapping of the original ontology into an ontology describing a graph;
– extraction of the path from the graph.

The mapping step starts by pruning the prerequisite concepts of the initial state: these are the things that the student already knows. Subsequently, a heuristic weighting is performed according to the following procedure. Two concepts $c_i$ and $c_j$ have citation weights $w_i$ and $w_j$ respectively. The global weight $w_{ij}$ of the arc connecting them in the graph is computed as:

$$ w_{ij} = \frac{w_i w_j}{M_W} \left( f(d_s)\, w_{0S} + f(d_g)\, w_{0N} \right) \qquad (1) $$

Here $M_W$ is the maximum value of the citation weight, $f(d_s)$ and $f(d_g)$ are two weighting functions depending respectively on the distance from the starting point and from the goal, while $w_{0N}$ and $w_{0S}$ are the initial values of the navigation and structure relations that may connect $c_i$ and $c_j$. This means that in the initial part of the path, navigation links will be preferred due to the need to move away from the initial knowledge state: in fact, in this phase the student is in a region of the knowledge space where she is more confident. On the contrary, in the neighbourhood of the goal state, structure links are preferred because they represent well-defined relationships between concepts. These relations are precisely defined in the ontology, and are taken from the domain structure. The resulting graph is represented by means of two predicates:

– (#$onPath #$Thing Path-Generic): Thing is part of the path defined in the graph whose Cyc type is Path-Generic;
– (#$pathConnection #$Thing1 #$Thing2 Path-Generic): Thing1 and Thing2 are connected inside the graph.
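Equation (1) can be evaluated directly once the weighting function is fixed. The sketch below assumes, purely for illustration, that $f$ is the identity function, since its actual form is not given in the text; the numeric values are likewise invented. An arc carrying only one kind of relation can be modelled by setting the other initial value to zero.

```python
def arc_weight(w_i, w_j, d_s, d_g, w0_s, w0_n, max_w, f=lambda d: d):
    """Global weight of the arc between concepts i and j, following equation (1).

    w_i, w_j : citation weights of the two concepts
    d_s, d_g : distances from the starting point and from the goal
    w0_s     : initial value of the structure relation (0 if absent)
    w0_n     : initial value of the navigation relation (0 if absent)
    max_w    : maximum citation weight M_W
    f        : weighting function; the identity default is an assumption
    """
    return (w_i * w_j / max_w) * (f(d_s) * w0_s + f(d_g) * w0_n)

# Invented values: the arc is one step from the start and four steps from the goal.
example = arc_weight(w_i=2, w_j=3, d_s=1, d_g=4, w0_s=0.5, w0_n=1.0, max_w=6)
```

With these values the navigation term $f(d_g) w_{0N} = 4.0$ dominates the structure term $f(d_s) w_{0S} = 0.5$, which matches the observation that navigation links carry more weight far from the goal.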

The A* algorithm is then applied to the resulting graph (see figure 3). In the implementation, the A* heuristic $h$ is proportional to $d_g$, and it satisfies the monotonicity conditions required by the algorithm.
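For reference, a generic A* search over a weighted graph can be sketched as follows. This is a textbook formulation rather than the Cyc-based implementation described here: the graph, its arc costs, and the hop-count heuristic are hypothetical, and the way the weights of equation (1) are turned into arc costs is left open.

```python
import heapq

def a_star(graph, start, goal, h):
    """Return (path, cost) of a cheapest path, or (None, inf) if none exists.

    graph : dict mapping node -> {neighbour: non-negative arc cost}
    h     : heuristic estimate of the remaining cost, assumed monotone
    """
    frontier = [(h(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found meanwhile
        for nxt, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier, (new_cost + h(nxt), new_cost, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical graph and costs, for illustration only.
GRAPH = {
    "Type": {"ClassType": 1, "InterfaceType": 2},
    "ClassType": {"Interface": 3},
    "InterfaceType": {"Interface": 1},
}
HOPS_TO_GOAL = {"Type": 2, "ClassType": 1, "InterfaceType": 1, "Interface": 0}

path, cost = a_star(GRAPH, "Type", "Interface", HOPS_TO_GOAL.get)
```

The heuristic used here, the number of hops to the goal, is monotone for this graph because every arc costs at least one.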

Fig. 3. Ontology-to-graph transformation

4 System Implementation and Results

An example of dialogue is reported in order to show the approach used to obtain information for the realization of the learning path. A session starts by chatting with a conversational agent: the A.L.I.C.E. chat-bot (http://www.alicebot.org), which uses an XML DTD called AIML to perform pattern matching on a repository of request/answer pairs. Conversations have to be redirected to the knowledge base in order to obtain information about the learning requests issued to the system. A properly modified version of the CyN chat-bot [13] has been used to make a bridge with Cyc. The original version of CyN is based on A.L.I.C.E. and allows "common sense" to be added to a chat-bot. With this solution it is possible to access the OpenCyc Common Sense Knowledge Base simply by using an enhanced version of AIML. The chat-bot tries to guide the user to explain her needs and her actual level of comprehension of the topic. Logs from the dialog are used to extract concepts for the ontology, and the inferential engine uses them to extract the learning path. An example of dialog is reported in figure 4. As can be seen, the dialogue is inspired by the Knowledge Space Theory because the user's answers drive the formulation of the questions. In fact, to build the knowledge structures of KST, the way the user answers the queries of the system is taken into account. Besides, it is supposed that, if a student correctly answers the system's questions, she probably masters that particular argument.
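The pattern matching on request/answer pairs can be approximated in a few lines. The sketch below is a strong simplification and not an AIML interpreter: real AIML categories are written in XML, whereas here the patterns and reply templates are hypothetical Python tuples. It only reproduces the idea of normalizing the input to upper case and matching it against patterns where `*` stands for one or more words.

```python
import re

# Hypothetical request/answer pairs; "*" matches one or more words.
CATEGORIES = [
    ("I AM INTERESTED IN *", "ask_about:{0}"),
    ("TELL ME SOMETHING ABOUT *", "describe:{0}"),
    ("HI *", "greet"),
]

def normalize(text):
    """Strip punctuation, collapse whitespace, and convert to upper case."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", " ", text)
    return " ".join(cleaned.upper().split())

def compile_pattern(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = ["(.+)" if token == "*" else re.escape(token)
             for token in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

COMPILED = [(compile_pattern(p), template) for p, template in CATEGORIES]

def respond(text):
    """Return the filled template of the first matching category, or None."""
    sentence = normalize(text)
    for regex, template in COMPILED:
        match = regex.match(sentence)
        if match:
            return template.format(*match.groups())
    return None
```

In a bridge such as the one described above, the text captured by the wildcard would be the concept name passed on to the knowledge base query.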

User: HI TUTORJ
Chat-bot: HI STUDENT DO YOU WANT KNOW ANYTHING ABOUT JAVA?
U: YES. I AM INTERESTED IN INTERFACE
C: TO OBTAIN THE LESSON ABOUT THE CONCEPT OF INTERFACE YOU MUST KNOW SOMETHING ABOUT InterfaceType, Class ARE YOU INTERESTED IN ONE OF THESE?
U: YES I DO. I WANT TO KNOW SOMETHING ABOUT InterfaceType
C: OK. InterfaceType IS RELATED TO ClassType AND ArrayType
U: OK. I AM INTERESTED IN THE CONCEPT OF ClassType
C: TO OBTAIN THE LESSON ABOUT THE CONCEPT OF ClassType YOU MUST KNOW SOMETHING ABOUT Type
U: NO, THANKS I WANT TO KNOW WHAT IS AN Interface?
C: OK. Interface IS Keyword, ClassLiteral, InterfaceType, ClassType. DO YOU WANT TO KNOW SOMETHING ABOUT ONE OF THESE?
U: NO, TELL ME ALL ABOUT Interface?
C: Interface : IS A PREREQUISITE FOR : InterfaceDeclaration, IS COMPOSED BY : ExtendInterface, InterfaceBody, InterfaceIdentifier, InterfaceModifier, IS A : Keyword, ClassLiteral, InterfaceType, ClassType, IS RELATED TO : ARE YOU INTERESTED IN ONE OF THESE?
U: TELL ME SOMETHING ABOUT InterfaceDeclaration
C: InterfaceDeclaration IS COMPOSED BY InterfaceBody, InterfaceIdentifier, InterfaceModifier, Interface
U: OK, PLEASE GIVE ME A LEARNING PATH.

Fig. 4. A dialog example

In this case, the dialogue proceeds through those concepts that are better known or more interesting for the user, as is the case in the KST. From the analysis of the previous dialog, the following sets have been extracted:

– set A: the goal: Interface;
– set B: subjects which are known or not interesting for the user: Type, Keyword, ClassLiteral, InterfaceType, ClassType;
– set C: arguments which are interesting for the user: InterfaceType, ClassType, InterfaceDeclaration, Interface;
– set D: prerequisites of the set $A \cup C$ (given by the Cyc ontology): null;
– set E: the set D-B: null.

The learning path will therefore start from the initial state defined by E, and it will stop at the concept belonging to A. The obtained path in Cyc is reported in figure 5. The visual interface reproducing the path is reported in figure 6. It is inspired by the StarryNight visualization tool (http://rhizome.org/starrynight), which gives immediate feedback to the user about the topics holding the greatest amount of information by means of the star cluster metaphor. The implemented interface is split into areas addressing the different landmark concepts. By clicking on a single area, the user can highlight more terms that have been clustered close to the landmark. Finally, from each term it is possible to obtain a pop-up window linking javadoc, lessons, and other learning material.
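The construction of the sets D and E from A, B, and C amounts to a few set operations, sketched below. The sets A, B, and C are those extracted from the dialog; the function name and the second, non-empty prerequisite relation are hypothetical and serve only to show a case where E is not empty.

```python
GOAL = {"Interface"}                                   # set A
KNOWN = {"Type", "Keyword", "ClassLiteral",
         "InterfaceType", "ClassType"}                 # set B
INTERESTING = {"InterfaceType", "ClassType",
               "InterfaceDeclaration", "Interface"}    # set C

def initial_state(goal, interesting, known, prereqs):
    """Compute E = D - B, where D collects the prerequisites of A ∪ C.

    prereqs maps a concept to the set of its prerequisite concepts.
    """
    targets = goal | interesting
    required = set()                                   # set D
    for concept in targets:
        required |= prereqs.get(concept, set())
    return required - known                            # set E

# With no prerequisites returned by the ontology, D and E are empty, as in the text.
empty_case = initial_state(GOAL, INTERESTING, KNOWN, {})

# Hypothetical prerequisite relation, for illustration only.
TOY_PREREQS = {"ClassType": {"Type"}, "InterfaceDeclaration": {"Interface"}}
toy_case = initial_state(GOAL, INTERESTING, KNOWN, TOY_PREREQS)
```

In the hypothetical case, Type is removed from D because it belongs to B, so only Interface remains in the initial state.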

Fig. 5. Cyc generation of the path

The user is not constrained to follow the highlighted path, and can also freely browse all the documents in the map.

5 Conclusions

A methodology has been presented to automatically extract learning paths from a domain ontology, via a transformation of the ontology itself into a suitable weighted graph. The graph can be managed using an A* path finding algorithm. Direct use of a classical planner on the ontology implies the definition of actions, along with their preconditions and effects, in terms of the semantics of the relations: this leads to the definition of an ontology of learning. The presented methodology avoids this problem, and makes the system responsive because no re-planning is necessary. Moreover, the explained technique is grounded on a well-known theory of knowledge management to support learning. Future work will address the development of an integrated system to support the application of the methodology to different domains. Further work will address the extension of the system to an IMS SCORM compliant framework for document annotation.

Fig. 6. Learning path visualization

References

1. Staab, S., Studer, R., Schnurr, H., Sure, Y.: Knowledge Processes and Ontologies. IEEE Intelligent Systems 16 (2001) 26–34
2. Stojanovic, L., Staab, S., Studer, R.: Elearning based on the semantic web. In: Proc. of World Conference on the WWW and Internet WebNet2001, Orlando, Florida, USA (2001)
3. Panteleyev, M., Puzankov, D., Sazykin, P.V., Sergeyev, D.: Intelligent educational environments based on the semantic Web technologies. In: Proc. of 2002 IEEE International Conference on Artificial Intelligence Systems (ICAIS 2002), Divnomorskoe, Russia, IEEE Computer Society Press (2002) 457–462
4. Nejdl, W., Wolf, B., Qu, C., Decker, S., Sintek, M., Naeve, A., Nilsson, M., Palmér, M., Risch, T.: EDUTELLA: A P2P Networking Infrastructure Based on RDF. In: Proc. of the 11th World Wide Web Conference, ACM Press (2002) 604–615
5. Marshall, B., Zhang, Y., Chen, H., Lally, A., Shen, R., Fox, E., Cassel, L.: Convergence of knowledge management and e-learning: the GetSmart experience. In: Proc. of Joint Conference on Digital Libraries 2003, Houston, TX, IEEE Computer Society Press (2003) 135–146
6. Falmagne, J., Doignon, J., Koppen, M., Vilano, M., Johannesen, L.: Introduction to knowledge spaces: How to build, test, and search them. Psychological Review 97 (1990) 201–224
7. Lenat, D., Guha, R.: Building Large Knowledge Bases. Addison-Wesley, Reading MA, USA (1990)
8. Albert, D., Hockemeyer, C.: Adaptive and Dynamic Hypertext Tutoring Systems Based on Knowledge Space Theory. Artificial Intelligence in Education: Knowledge and Media in Learning Systems, Frontiers in Artificial Intelligence and Applications 39 (1997) 553–555
9. Harp, S., Samad, T., Vilano, M.: Modeling student knowledge with self-organizing feature maps. IEEE Trans. on Systems, Man and Cybernetics 25 (1995) 727–737
10. Albert, D., Hockemeyer, C.: Applying demand analysis of a set of test problems for developing adaptive courses. In: Proc. of International Conference on Computers in Education. Volume 1. (2002) 69–70
11. Hockemeyer, C., Held, T., Albert, D.: Rath: A relational adaptive tutoring hypertext www-environment based on knowledge space theory. In: Proc. of Computer Aided Learning and Instruction in Science and Engineering (CALISCE'98), Göteborg, Sweden (1998) 417–423
12. Reed, S.L., Lenat, D.B.: Mapping Ontologies into Cyc (2002)
13. Coursey, K.: Living in CyN: Mating AIML and Cyc together with Program N (2004)
