Ontology-based Knowledge Retrieval

Héctor Díez-Rodríguez and Guillermo Morales-Luna
Computer Science, Cinvestav-IPN, Mexico City, Mexico
{hdiez,gmorales}@cs.cinvestav.mx

José Oscar Olmedo-Aguirre
Electrical Engineering, Cinvestav-IPN, Mexico City, Mexico
[email protected]

Abstract
We deal with knowledge retrieval within the context of Virtual Learning Environments (VLE). A good VLE should deliver relevant learning materials to the learner at the most appropriate time to facilitate knowledge acquisition by Problem-Based Learning (PBL). In PBL, students retrieve information about a problem by working in small groups under the guidance of a learning facilitator, who provides the materials required throughout the problem-solving process. Using ontologies to represent domain knowledge improves information management in a VLE because it enables automatic reasoning and facilitates the knowledge searching and retrieval needed to promote interest in problem solving. We propose an ontology-based approach for searching, discovering and publishing relevant materials as Learning Objects to help students in the PBL approach.

Keywords. Intelligent Tutoring System, Knowledge Management, Ontology, Problem-based Learning.

1. Introduction
Relevant information is necessary to help students in Problem-Based Learning (PBL) [4]. With current computing and information storage, the volume of learning materials has increased dramatically. However, saturated with huge and diverse learning materials, users find that retrieving relevant learning materials is still a challenging task. A Virtual Learning Environment (VLE) [9] increases productivity in education since it provides access to learning materials at any time and any place, and it may support the transmission of information for knowledge construction in PBL. Historically, relevant learning materials have been manually integrated into a VLE by teachers, and this approach involves hard work. Although students using a VLE also have access to information retrieval tools in a network (e.g. Google, Lycos or CiteSeer), the proliferation of superfluous data obtained from the Web under these conditions does not guarantee any form of validation or trustworthiness. An overabundance of search results imposes a cognitive overload [10].

Applying ontologies to model the related components of relevant learning materials contributes to effective knowledge search. Concept ontologies in any area aim at standardizing and improving knowledge search and discovery mechanisms. However, within VLE, there is a lack of formal ontological descriptions of learning materials as a knowledge repository that helps students retrieve the relevant domain information necessary to solve a specific problem.

We present a constructivist VLE paradigm that integrates automated mechanisms of knowledge searching and publishing based on ontologies describing learning domains within a PBL approach. Our main contribution is content search in known repositories of relevant learning materials. The relevant solutions, useful to solve a specific problem, can be extracted by queries based on semantic relationships embedded in the domain ontology.

In section 2 some foundational concepts of PBL theory are described; in section 3 the integration of ontologies and Learning Objects in a VLE as a suitable knowledge representation for pedagogical design is analyzed; in section 4 a problem domain ontology and its use in a problem-solving strategy are described; and in section 5 related work is cited. Section 6 consists of concluding remarks.
2. A learning model from problem-solving
PBL has been used in education for over forty years as a method of teaching the practical application of knowledge in a real-world setting [25], and it is defined as a way of constructing and teaching courses using problems as the stimulus and focus for student activity [6]. The traditional path of conventional learning is reversed when working in a PBL model. Traditionally, the course material is presented first and then its application is shown, with some comments about more general applications added later. In PBL, the problem is presented and analyzed first, and the students are asked to identify their learning needs. Afterwards they search for relevant information to solve the problem at hand, and if any information is found, they integrate it into the solution. There are certain common characteristics in PBL courses:
• use of real-life, complex, ill-structured problems, with any number of correct solutions;
• the problem is presented to students without direct information on how to solve it; however, resources and scaffolding are available for students to solve the problems by themselves;
• students work in small groups, with a facilitator; and
• the problem is used as the focus from which the learning is structured.
In PBL, the students collaborate on complex problems, thereby distributing the cognitive load among the whole group and profiting from the distributed expertise. Collaboration is a social structure in which two or more people interact with each other, leading to social situations and interactions that have a positive effect on them [13]. Searching, discovering, exchanging and publishing information are important parts of PBL, because knowledge is constructed socially through joint efforts directed towards common objectives. The appropriate application of PBL depends on the characteristics and representation of the knowledge domain, which guide the solution process in PBL.
3. Ontology and Learning Objects for knowledge management in a VLE
Learning Objects have been one of the main research topics in the e-learning community in recent years [14], and they are widely adopted for knowledge representation in many VLEs. An ontology [17, 24] is an explicit representation of domain concepts that provides the basic structure around which knowledge bases can be built. The intention behind representing the concepts of any area in ontologies is to standardize and improve knowledge searching and discovery mechanisms. An ontology that keeps its structure of concepts up to date allows the user of a VLE to play an active role in pedagogical development through semantically relevant knowledge searching.
3.1. Learning objects to represent knowledge in a VLE
Learning Objects (LO) constitute a new way of thinking about learning materials. An LO is a unit of digital resource that can be shared to support teaching and learning [28]; it is an independent and self-standing unit of learning content. Because learning materials can take a variety of forms, including plain text, program simulations, or any other form of multimedia content, there are LOs suitable to represent them. In VLE there are emerging standards for describing learning resources, among them Learning Objects Metadata (LOM), proposed by the IEEE [19] as an extension of Dublin Core, which has gradually become the standard reference for educational systems managing LOs. Usually LO metadata standards are intended to generalize taxonomies and vocabularies for learning object repositories across the involved disciplines [15, 29]. The development of such taxonomies suggests that there is a tacit ontology behind the metadata standard. Though LOs have already been used for knowledge sharing in previous VLE systems [12], [23], they have not been explicitly organized in ontological structures for problem-solving purposes.
3.2. Ontologies as conceptual models
Ontology is a science that studies the explicit formal specification of the terms of a knowledge domain along with the existing relations among them [18, 20]. Many different definitions of the term have been proposed. One of the most widely quoted and well-known definitions of ontology is Gruber’s: an ontology is an explicit specification of a conceptualization [16]. Thus an ontology is a logical system of concepts and their relations in which they are defined and interpreted in a declarative way.

The use of ontologies in the design of educational environments is not new. Bloom defined a taxonomy of educational goals, in which the category Contents had a role that specifies the concepts taught in a course [5]. Bloom’s taxonomy of educational objectives is a framework that has been widely used in all disciplines. The original Bloom framework includes six levels of learning: knowledge, comprehension, application, analysis, synthesis and evaluation. However, given recent developments in the field of knowledge management, the term knowledge is no longer appropriate in this context. Nevertheless, investigations that use ontologies in a VLE have focused on two fundamental issues [8]:
a) Interoperability and classification of Learning Objects used in Learning Management Systems (LMS) [22]. The ontologies define a vocabulary that is shared by the applications.
b) Generation of adaptable Learning Environments [27, 26]. The ontologies describe roles and contents to personalize a learning process.

In a rather general way, we realize ontologies as directed multigraphs. The nodes are classes and the edges are ordered pairs of classes. Each edge is labelled by the set of relations holding between the corresponding ordered pair of classes. Typically, a query can be translated into an equivalent problem consisting of finding a special path connecting two classes. The search for such paths may be guided by strategies based either on the discriminating words in the query or on the query’s syntactical structure.

Few research reports provide an explicit generic structure of ontologies for knowledge sharing for educational purposes [26, 31], although the literature related to PBL and LOM is huge. While learning object metadata describe the artifacts of LOs shared by diverse domains, an ontology represents a knowledge domain that shares the relationships of LOs within a specific context. The use of an ontology does not exclude the use of metadata. The intelligent discovery of LOs requires information not supported by the current set of metadata in the LOM standard. For example, it is necessary for each LO to specify exactly how that LO is related to concepts in a particular domain and the kinds of learning outcomes that are possible in that domain, i.e. an ontology of concepts in a domain. Another kind of ontology is required for the physical structuring of LOs. To allow LOs to be interpreted and rendered consistently in different learning environments, it is important that ontologies be developed for describing the structure of LOs.
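The directed-multigraph view described in this section can be illustrated with a minimal sketch. The class names and relation labels below are hypothetical, chosen only for illustration, and breadth-first search is just one plausible way of finding a path between two classes; the paper does not prescribe a particular path-finding method.

```python
from collections import deque

# Hypothetical toy ontology: each ordered pair of classes (an edge) is
# labelled with the set of relations that hold between the two classes.
EDGES = {
    ("ShortestPath", "Graph"): {"PartOf"},
    ("DirectedGraph", "Graph"): {"SubclassOf"},
    ("ShortestPath", "Dijkstra"): {"solvewith"},
    ("Dijkstra", "LearningObject"): {"linkPW"},
}


def find_path(edges, start, goal):
    """Return a shortest chain of classes from start to goal, or None."""
    # Build an adjacency list from the labelled edges.
    adjacency = {}
    for (source, target), relations in edges.items():
        adjacency.setdefault(source, []).append((target, relations))

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for target, _relations in adjacency.get(path[-1], []):
            if target not in visited:
                visited.add(target)
                queue.append(path + [target])
    return None


print(find_path(EDGES, "ShortestPath", "LearningObject"))
# prints ['ShortestPath', 'Dijkstra', 'LearningObject']
```

Storing a set of labels per ordered pair keeps the multigraph compact: several relations between the same two classes share one dictionary entry.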
4. Use of ontology in guided-problem solving
Table 1. Concepts and Solutions examples

Domain: Graph
  Concepts                                    Solutions
  Positive/Negative weight                    Bellman-Ford algorithm
  directed graph                              Dijkstra algorithm
  shortest path                               Kruskal algorithm
  maximum flow                                Boruvka algorithm

Domain: Sorting
  Concepts                                    Solutions
  Stable/Non stable                           Radix/Quick sort
  Quadratic/Semilogarithmic Time Complexity   Bubble/Merge sort
  Large/Small Data set size                   External/Internal sort
4.1. Searching for knowledge
In PBL, while students are identifying crucial parts of the problem, they are also conceiving possible solutions. These solutions can be characterized according to the description and the restrictions of the problem domain in order to guide the student to a good solution. In problem domains that lend themselves to formalization, there are fundamental concepts that may be classified with the basic ontological relationships SubclassOf and PartOf. The ontologies involved in guided problem solving organize knowledge in two categories: Concepts and Solutions. The Concepts class describes the context of the problem domain, whereas the Solutions class describes existing algorithms or solutions that are related to these concepts. Table 1 shows some examples of concepts and solutions from a Computational Algorithms course.

Class Concepts organizes the concepts describing a problem domain into subclasses, and each subclass has the name and solvewith properties. Property name is used to identify class or subclass names. Property solvewith associates concepts with solutions.

Class Solutions organizes the solutions that solve problems within the domain. Each subclass belonging to this category has several properties. Property description holds a brief narrative description of the solution for the students. Properties linkPW and linkOA contain pointers to educational materials (as LOs) describing solutions. Property linkPW points to the Permanent Learning Objects describing a solution, normally elaborated by an expert. Property linkOA points to Temporal Learning Objects (TLO), which are elaborated by students and professors as explained in section 4.2. These links allow navigation to review a learning material stored in the Learning Objects Repository. When those links are clicked, the relevant documents are displayed in the browser pane.
Searching for the set of solutions to a given problem by a query consists of determining the set of Learning Objects that represents an appropriate set of solutions to the problem. The algorithm SEARCH shown in Figure 1 retrieves all the known solutions that can best solve the given problem. The algorithm SEARCH receives as inputs an Ontology and a Query, which is an abstract narrative description of the problem, and returns as outputs the set of Solutions that solve the Query according to the Ontology and the set of LearningObjects associated with those Solutions. As the Ontology has a hierarchical structure, the search starts at the top of the structure and descends by a breadth-first traversal from the most general to the more specialized concepts. The algorithm begins by getting all the words extracted from the Query (line 2). The algorithm iterates over every Word in the set Words (lines 3 through 10) and over every Concept in the Ontology (lines 5 through 9) to find those Concepts whose property Name is the root of a discriminating word. In case the Name identifies an abstract Concept in the Ontology (lines 6 through 8), a new entry in the Solution array is defined to associate the Name with the Solution obtained from property SolveWith of the Concept (line 7). The set of all final Solutions is obtained by intersecting all partial solutions (lines 11 through 13), and the set of all LearningObjects is obtained by joining the sets of LOs given by property linkPW of each final solution (lines 14 through 16).

Figure 1. Algorithm SEARCH

ALGORITHM SEARCH
INPUT Ontology, Query
OUTPUT Solutions, LearningObjects
BEGIN
 1  Solutions, LearningObjects := EmptySet
 2  Words := Split(Query) \ NonDiscriminatingWords
 3  FORALL Word IN Words DO
 4    Name := Lexicon.GetStem(Word)
 5    FORALL Concept IN Ontology DO
 6      IF Concept.Name = Name THEN
 7        Solution[Concept.Name] := Concept.SolveWith
 8      END IF
 9    END FORALL
10  END FORALL
11  FORALL s IN Dom(Solution) DO
12    Solutions := Intersection(Solutions, Solution[s])
13  END FORALL
14  FORALL s IN Solutions DO
15    LearningObjects := Union(LearningObjects, s.linkPW)
16  END FORALL
END

In this algorithm, function Split(Query) returns the set of all Words (with no duplicates) that appear in Query, and function GetStem(Word) returns the root of Word by using a Lexicon such as WordNet [3]. The algorithm uses dynamic associative arrays (like those found in JavaScript) in which a new entry is defined by assignment (as in line 7). There are no duplicated entries in this array. Associative arrays have an intrinsic function Dom() that returns the set of all elements for which an entry of the array is defined. The predefined set NonDiscriminatingWords contains frequently used words, such as articles, pronouns, and verbs, which do not help to determine the problem domain. The operations Union(), Intersection() and Difference() for generic sets have their usual meaning. The algorithm also uses the high-level iterator FORALL, having the form FORALL element IN set DO action END, meaning that the variable element is instantiated with each member of set, if non-empty, to perform the given action upon element. For the Ontology, the iterator traverses the hierarchy of nodes in a breadth-first manner beginning at the top node, as explained before. Since a problem generally involves concepts whose solutions may completely differ from one another, the algorithm returns no solution when Solutions is empty. Finding no solution means that the Query is inconsistent.
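The steps of SEARCH can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ontology fragment, the learning-object links, the stop-word list and the `get_stem` helper are all hypothetical stand-ins (the paper uses a lexicon such as WordNet for stemming). One further assumption concerns the intersection step. Taken literally, Figure 1 initializes Solutions to the empty set, so intersecting with it would always yield the empty set; the sketch assumes the intent is to intersect the partial solution sets with each other.

```python
# Hypothetical stand-in for the predefined set NonDiscriminatingWords.
NON_DISCRIMINATING_WORDS = {"finding", "the", "in", "a", "an", "of", "for"}

# Hypothetical ontology fragment: concept name -> solutions (SolveWith).
ONTOLOGY = {
    "path": {"Dijkstra", "Bellman-Ford"},
    "graph": {"Dijkstra", "Bellman-Ford", "Kruskal"},
    "sort": {"Merge sort"},
}

# Hypothetical linkPW property: solution -> permanent learning objects.
LINK_PW = {
    "Dijkstra": {"lo/dijkstra"},
    "Bellman-Ford": {"lo/bellman-ford"},
    "Kruskal": {"lo/kruskal"},
    "Merge sort": {"lo/merge-sort"},
}


def get_stem(word):
    """Crude stand-in for Lexicon.GetStem: lowercase, drop a plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def search(ontology, link_pw, query):
    """Return (solutions, learning_objects) for a narrative query."""
    words = {w.lower() for w in query.split()} - NON_DISCRIMINATING_WORDS

    # Partial solutions: one entry per concept named by a query word.
    partial = {}
    for word in words:
        name = get_stem(word)
        if name in ontology:  # same effect as scanning every Concept
            partial[name] = ontology[name]

    # Assumption: intersect the partial sets with each other.
    solutions = set.intersection(*partial.values()) if partial else set()

    learning_objects = set()
    for solution in solutions:
        learning_objects |= link_pw.get(solution, set())
    return solutions, learning_objects


solutions, los = search(ONTOLOGY, LINK_PW,
                        "Finding the shortest path in a directed graph")
print(sorted(solutions))  # prints ['Bellman-Ford', 'Dijkstra']
```

A query that mixes unrelated concepts, such as one mentioning both "path" and "sort" in this toy ontology, yields an empty intersection, which matches the paper's remark that an empty result signals an inconsistent query.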
4.2. Publishing of knowledge
Publishing consists of augmenting a centralized repository of LOs, which follows the Sharable Content Object Reference Model (SCORM) standard [2], with the solutions known so far for the problem. The publication process is led by an instruction facilitator. In practice, LOs can be either permanent or temporal, according to their duration in the repository. Permanent Learning Objects are elaborated by experts (generally the facilitators) to be used as reference in the subject matter, and they represent the most complete information available. Temporal Learning Objects are elaborated by students as incomplete, tentative, discardable solutions that arise during the problem-solving process. These materials are implementations of a solution and complement the description of the solution presented in the Permanent Learning Objects.
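The distinction between permanent and temporal Learning Objects can be pictured with a small in-memory sketch. The `Repository` class, its method names and the link strings are hypothetical, introduced only for illustration; the actual repository follows SCORM and is not described at this level of detail in the paper.

```python
class Repository:
    """Hypothetical in-memory stand-in for the centralized LO repository."""

    def __init__(self):
        self.permanent = {}  # solution -> set of links (cf. linkPW)
        self.temporal = {}   # solution -> set of links (cf. linkOA)

    def publish(self, solution, link, permanent=False):
        """Add a link for a solution as a permanent or temporal LO."""
        target = self.permanent if permanent else self.temporal
        target.setdefault(solution, set()).add(link)

    def discard_temporal(self, solution):
        """Remove and return the tentative material for a solution."""
        return self.temporal.pop(solution, set())


repo = Repository()
repo.publish("Dijkstra", "lo/dijkstra-reference", permanent=True)
repo.publish("Dijkstra", "tlo/group3-implementation")
print(repo.permanent["Dijkstra"])  # prints {'lo/dijkstra-reference'}
```

Keeping the two kinds of LO in separate maps makes it easy to discard tentative student material without touching the reference material.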
4.3. Implementation
The domain ontology, the search algorithm and the LO repository are located on a dedicated server, which is manipulated by means of a Java application that executes in a Web server. We have chosen Protégé [1] for ontology development because it is a well-formalized, well-supported tool that outputs an ontology in the OWL language [21]. For knowledge management we implemented RIbONTOMiddleware, a middleware [24] that manipulates both the ontology and the LO repository, providing high-level services for discovery, searching and publishing of knowledge. In addition, we developed a constructivist VLE, EnEMoCi, that provides the functionality of a learning management system to conduct the PBL tasks involved in teaching courses. It facilitates user administration and the retrieval of knowledge as learning objects through RIbONTOMiddleware.
4.4. Case study
Let us analyze two different problems, a Graph Theory problem and a Sorting problem, that are posed to the students of a Computational Algorithms course:
Problem #1, Traveling Salesman Problem: “A road map contains information about 20 cities and the roads that connect them have a length given in kilometers. There is always at least one route between any two cities of the map. The problem consists in finding an optimal route between any two cities that minimizes the distance covered by the route.”
Problem #2, Bulk sorting: “Sort 1 gigabyte of data using a computer with only 128 megabytes of RAM.”

Following the PBL methodology, the students start their activities by identifying their learning objectives. For those problems, the learning objective can be specified in abstract terms by the queries “Finding the shortest path in a directed graph” for the first problem and “Finding a suitable external sorting algorithm” for the second problem.

Documents related to the specified queries were obtained from a search engine like Google [24], which comprises databases containing millions of documents organized by classical information retrieval methods. Table 2 summarizes the first ten results the search engine returned for each query. From the list of results, the students have to decide which information is most appropriate by examining each result. It was observed that only about 30% of the retrieved information is useful, in the sense that it contains enough information (theoretical explanations and algorithms) related to the purpose of the query for the students to satisfy their learning objectives.

Nevertheless, if the learning objective that the students have identified can be situated in an ontological domain of Computational Algorithms, then more precise query results can be obtained using RIbONTOMiddleware, a non-conventional search engine based on an ontology containing abstract terms. The results obtained using RIbONTOMiddleware for the same queries are summarized in Table 3. Using this search engine based on a context ontology, the following conclusions can be derived: (1) 100% of the retrieved information is useful for the students, and (2) the number of links was reduced significantly with respect to the results obtained from the Google search engine.

Table 2. Search results from Google

a) Graph Theory problem
  Description                                                           Avg (%)
  Dijkstra’s Algorithm Description                                      10 (*)
  Directed weighted Graph Theory and Dijkstra’s Algorithm Description   20 (*)
  Data structure exercises                                              50
  Floyd’s Algorithm application paper                                   20

b) Sorting problem
  Description                                                           Avg (%)
  Sorting Algorithms Description                                        20 (*)
  Merge sort                                                            10 (*)
  Quick sort                                                            10
  Complexity Analysis of Algorithms                                     30
  Design of Algorithms                                                  20
  The link could not be shown                                           10

(*) Link has information useful for students. Avg: Average.

Table 3. Algorithm search results using RIbONTOMiddleware

a) Graph Theory problem
  Solutions: Bellman-Ford, Floyd-Warshall, Dijkstra

b) Sorting problem
  Solutions: Merge Sort, Four Type Sort, Polyphase Sort
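The figure of about 30% useful results can be checked by summing the starred entries of Table 2. The short sketch below does this arithmetic; the lists simply transcribe the Avg (%) values and their (*) marks from the table, and the function name is a hypothetical helper.

```python
# (share in %, marked useful with (*)) for each row of Table 2.
GRAPH_RESULTS = [(10, True), (20, True), (50, False), (20, False)]
SORTING_RESULTS = [(20, True), (10, True), (10, False),
                   (30, False), (20, False), (10, False)]


def useful_share(rows):
    """Sum the percentage shares of the rows marked as useful."""
    return sum(share for share, useful in rows if useful)


print(useful_share(GRAPH_RESULTS))    # prints 30
print(useful_share(SORTING_RESULTS))  # prints 30
```

Both queries give 10 + 20 = 30 and 20 + 10 = 30 percent respectively, consistent with the observation reported in the text.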
5. Related work
Proposals that implement a constructivist approach and PBL [30] in the educational process have emphasized the experimentation phase for knowledge generation. Nevertheless, their effectiveness is limited because they have insufficient mechanisms for reusing and integrating the generated knowledge, and they lack motivation for searching for known solutions and for making new knowledge widely available through automated mechanisms integrated into the VLE. There have been a number of recent efforts aimed at the development of ontologies for e-learning. The O-DEST system, proposed in [26], comprises an ontology for the e-learning process, covering items such as course syllabus, teaching methods and learning activities. However, the description only refers to pedagogical roles and activities, and it does not address the use of search mechanisms for knowledge discovery. In [7] the authors give an example of ontology development in accordance with the ACM Computer Classification; this ontology is represented in RDF and is used in the Edutella System. However, these solutions do not allow an LO to be used in different ways. COFALE, a system to support flexible learning, is presented in [11]. The system approaches problem-based learning by allowing an adaptable presentation of learning contents, pedagogical resources and the generation of evaluations. Nevertheless, these systems do not include knowledge search and discovery mechanisms.
6. Conclusions
PBL is a constructivist learning process that requires knowledge searching and discovery, though historically knowledge discovery has not attracted much attention in VLE design. We have outlined a mechanism for guiding learners to find a solution to the problem at hand by means of VLE ontologies. We show how a semantic model can improve the searching and discovery of knowledge (represented as LOs according to the SCORM standard). The ontology defines the vocabulary of the problem domain and a set of constraints on how the terms can be combined to model the domain. The search algorithm is included in RIbONTOMiddleware. Its use has demonstrated that a retrieval mechanism based on context ontologies significantly reduces the number of links that students have to navigate, compared with the results obtained from a traditional search. The retrieved information is more useful for students and diminishes their cognitive overload. In order to guide them through problem solving, we developed the EnEMoCi VLE, which implements a methodology for knowledge searching and publishing. In the near future, we plan to include more domain ontologies and to improve the user interface.
References
[1] Protégé. http://protege.stanford.edu/.
[2] Scorm. http://www.adlnet.gov/scorm/.
[3] Wordnet. http://wordnet.princeton.edu/.
[4] H. Barrows. Practice-Based Learning: Problem-Based Learning Applied to Medical Education. Southern Illinois University School of Medicine, 1994.
[5] B. S. Bloom and D. R. Krathwohl. Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. Longmans, New York, 1956.
[6] D. Boud and G. I. Feletti, editors. The Challenge of Problem-Based Learning. 2nd Edition. Kogan Page Limited, London, 1997.
[7] J. Brase and W. Nejdl. Ontologies and metadata for elearning. In S. Staab and R. Studer, editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 555–574. Springer, 2004.
[8] J. Breuker and B. Bredeweg. Ontological modelling for designing educational systems. In AI-ED 99 Workshop on Ontologies for Intelligent Educational Systems, 1999.
[9] T. Browne. A longitudinal perspective regarding the use of VLEs by higher education institutions in the United Kingdom. Interactive Learning Environments, 14:177–192(16), August 2006.
[10] P. Chandler and J. Sweller. Cognitive load theory and the format of instruction. Cognition and Instruction, 8(4):293–332, 1991.
[11] V. M. Chieu. Cofale: An authoring system for supporting cognitive flexibility. In ICALT ’06: Proceedings of the Sixth IEEE International Conference on Advanced Learning Technologies, pages 335–339, Washington, DC, USA, 2006. IEEE Computer Society.
[12] E. B. Cohen and M. Nycz. Learning objects and e-learning: An informing science perspective. Interdisciplinary Journal of Knowledge and Learning Objects, 2:23–34, 2006.
[13] P. Dillenbourg, M. Baker, A. Blaye, and C. O’Malley. The evolution of research on collaborative learning. In E. Spada and P. Reiman, editors, Learning human and machine: towards an interdisciplinary learning science, pages 189–211. Oxford: Elsevier, 1995.
[14] E. Duval and W. Hodgins. A LOM research agenda. In WWW (Alternate Paper Tracks), 2003.
[15] N. Friesen. Interoperability and learning objects: An overview of e-learning standardization. Interdisciplinary Journal of Knowledge and Learning Objects, 1:23–31, 2005.
[16] T. R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199–220, June 1993.
[17] T. R. Gruber. Toward principles for the design of ontologies used for knowledge sharing? Int. J. Hum.-Comput. Stud., 43(5-6):907–928, 1995.
[18] N. Guarino and R. Poli. Formal ontology in conceptual analysis and knowledge representation. Guarino, N. and Poli, R. (eds.) Formal Ontology in Conceptual Analysis and Knowledge Representation. Special issue of the International Journal of Human and Computer Studies, vol. 43 n. 5/6, Academic Press, 1995.
[19] IEEE Learning Technology Standards Committee. IEEE standard for learning object metadata (draft). IEEE standard 1484.12.1, 2002.
[20] C. Knight, D. Gasevic, and G. Richards. An ontology-based framework for bridging learning design and learning content. Educational Technology & Society, 9(1):23–37, 2006.
[21] D. McGuinness and F. van Harmelen. OWL Web Ontology Language Overview. W3C Recommendation, 2004.
[22] P. Mohan and B. K. Daniel. A new distance education model for the University of the West Indies: A learning objects’ approach. In ICALT ’04: Proceedings of the IEEE International Conference on Advanced Learning Technologies, pages 938–942, Washington, DC, USA, 2004. IEEE Computer Society.
[23] P. Mustaro and I. Silveira. Learning objects: Adaptive retrieval through learning style. Interdisciplinary Journal of Knowledge and Learning Objects, 2:35–46, 2006.
[24] D. Serain. Middleware. Springer-Verlag, 1999.
[25] P. B. A. Smits, J. H. A. M. Verbeek, and C. D. de Buisonje. Problem based learning in continuing medical education: a review of controlled evaluation studies. British Medical Journal, 324(7330):153–156, 2002.
[26] C. Snae and M. Brueckner. Ontology-driven e-learning system based on roles and activities for thai learning environment. Interdisciplinary Journal of Knowledge and Learning Objects, 3:1–17, 2007.
[27] K. Verbert, D. Gašević, J. Jovanović, and E. Duval. Ontology-based learning content repurposing. In WWW ’05: Special interest tracks and posters of the 14th international conference on World Wide Web, pages 1140–1141, New York, NY, USA, 2005. ACM.
[28] D. A. Wiley. Connecting learning objects to instructional design theory: A definition, a metaphor, and a taxonomy. In The Instructional Use of Learning Objects: Online Version, 2000.
[29] K. Yordanova. Meta-data application in development, exchange and delivery of digital reusable learning content. Interdisciplinary Journal of Knowledge and Learning Objects, 3:229–337, 2007.
[30] H. Zhuge and Y. Li. Active e-course for constructivist learning. In WWW Alt. ’04: Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pages 246–247, New York, NY, USA, 2004. ACM.
[31] A. Zouaq, R. Nkambou, and C. Frasson. An integrated approach for automatic aggregation of learning knowledge objects. Interdisciplinary Journal of Knowledge and Learning Objects, 3:135–162, 2007.