LORDiLIS: Integrating Learning Objects Repositories and Digital Libraries
Geórgia R. R. Gomes (1), Sean W. M. Siqueira (2), Maria Helena L. B. Braz (3), and Rubens N. Melo (1)
(1) Computer Science Department, Pontifical Catholic University of Rio de Janeiro, Rua Marques de Sao Vicente 225, Gavea, Rio de Janeiro - RJ, Brazil 22453-900, {georgia, rubens}@inf.puc-rio.br
(2) Informatics Institute, Federal University of Goias, IMF 1, Campus Samambaia, Goiania - GO, Brazil 74001-970, [email protected]
(3) DECivil, Technical University of Lisbon, Av. Rovisco Pais, Lisbon, Portugal, [email protected]
Abstract. E-learning systems are intended to promote learning; they provide access to content and encourage knowledge building through the exchange of experiences. Digital Libraries (DLs), in turn, are good sources of information, where digital materials are made available. Although integrating these two environments appears straightforward, e-learning systems and DLs currently work separately and there is usually no integrated access to them. In this paper we describe an architecture for integrating distributed learning object (LO) repositories and DLs. This architecture is based on mediators and wrappers and is implemented through web services and ontologies. It provides more flexibility in the implementation of the integration components while providing semantic treatment through a more formal representation. A case study was developed considering the integration of LOs described in IEEE LOM and digital documents described in MARC.
Keywords: Information integration; e-Learning; Learning objects; Digital libraries; Metadata; Web services; Ontologies
1 – Introduction
Content is one of the important aspects of learning. Providing adequate content in addition to the right activities enables good learning experiences, which also depend on pedagogical, didactical, technological and administrative factors. However, developing high-quality content is time-consuming and expensive, which has led to content reuse approaches such as the development of learning objects (LOs). LOs are defined in this paper as reusable digital entities that may be used for learning, education or training. They are usually stored in Learning Content Management Systems (LCMSs) or e-learning repositories and referenced through metadata. Digital Libraries (DLs), in turn, have been developed worldwide and represent good sources of information for complementing learning resources. Therefore, e-learning systems should make use of digital documents stored in DLs in order to increase the resources available for reuse.
There are several proposed metadata standards for describing LOs, such as Dublin Core [DCMI, 2005], Ariadne [ARIADNE, 2005] and IEEE LOM [LTSC, 2002], as well as for describing digital assets stored in DLs, such as Dublin Core and MARC [LC, 2000], [LC, 2004]. The specification and use of standards guarantee the existence of a set of common information about a specific theme or area, with clearly established rules that are accepted by the community. Standards make it easier for users with different backgrounds and purposes to understand, integrate and share information. The establishment of standards implies a commitment between users and suppliers of information
that should accept, collaborate on and use the established terminologies and definitions. A metadata standard is made of a set of describing elements that can be related to each other [LC, 2004]. Generally, the names, information or data groups that are used to describe a specific type of collection are standardized.
However, as DLs and e-learning are two different areas with different metadata standards, a student currently needs to leave the e-learning environment and enter the DL environment in order to search the digital resources through the bibliographical metadata. Similarly, a student needs to leave the DL environment and enter the e-learning environment in order to search learning materials through the learning metadata. Neither the metadata standards of these areas nor the environments themselves are integrated. Therefore, in order to access distributed and heterogeneous LO repositories and DLs it is necessary to deal with these different metadata standards.
Data integration has been researched for several years in the database area. The use of mediators and wrappers [Gruber, 1994] [UDuham, 2003] [Tzitzikas, 2002] is one of the possible and consolidated ways to treat information integration. The use of wrappers to encapsulate data sources allows the sources to be maintained and to evolve with some independence. The wrappers can be seen as a communication mechanism that provides access to the data sources independently of their format and implementation. The mediators are used to provide uniform and integrated access to the information through the wrappers. A mediator has a set of articulations that represent the relations between its terms and the terms of its sources. The mediator uses these articulations to derive, from a global query, the queries to be submitted to the data sources. The mediator should also combine the answers of each data source and return a consolidated result to the global application.
Web services are defined by the W3C (World Wide Web Consortium) as software applications or components that are identified by a URI (Uniform Resource Identifier), whose interfaces and connections are capable of being defined, described and discovered in XML, and that support direct interactions with other software applications using XML-coded messages via Internet-based protocols. They provide interoperability among software components because they are based on standard mechanisms and protocols, so that web services are independent of implementation language (Java, C++, JScript, Perl, VB.Net, C#, and J#), object model (EJB, COM, etc.) and platform (J2EE, .Net, etc.).
The evolution of technology has brought different perspectives to the management of heterogeneous information resources. It is a great challenge to store any type of data, make it available and transform it into useful information for a segment of society, whether corporate or academic institutions. The use of ontologies to describe the implicit knowledge of data sources is a promising way to solve problems of semantic heterogeneity. An ontology is usually defined as an explicit specification of a conceptualization [Gruber, 1994]. The construction of common ontologies has been proposed as a promising approach to the interoperability of systems.
In this paper we describe an architecture for integrating distributed LO repositories and DLs. This architecture is based on mediators and wrappers and is implemented through web services and ontologies. It is an extension of the LORIS architecture (Learning Objects Repositories' Integration System) that integrates LO repositories and DLs. A case study was developed, considering the integration of LOs described in IEEE LOM (TecBD/PUC-Rio's Learning Objects Repository) and digital documents described in MARC (Bibliographical Data Repository of PUC-Rio's Library).
The remainder of this paper is organized as follows: in section 2 we present LORIS and describe its evolution to LORDiLIS; in section 3 we describe the prototype implementation and the case study; and in section 4 we present some final remarks.
2 – LORIS and its evolution to LORDiLIS
LORIS [Moura et al., 2005] uses an approach based on mediators and wrappers, which are implemented through web services and ontologies. The service-oriented architecture allows standardized content to be shared and exchanged among heterogeneous repositories in a decentralized environment. It provides more flexibility in the implementation of the integration components while providing semantic treatment through a more formal representation. The LORIS architecture (Figure 1) provides access to autonomous, heterogeneous and distributed data sources, which can be anything ranging from database systems to collections of files. We use the mediators and wrappers approach, with a global schema that has the role of a passive integrator [UDuham, 2003], [Tzitzikas, 2002]. The passive integrator looks for information on demand and offers a customized view of data.
[Figure 1 diagram. Labels: Application Layer (Client Web Service); Mediation Layer (Provider Web Service, Common Schema, with Query and Result flows); Data Access Layer (wrappers, marked Wr, each with a provider web service, for LOM, DC and MARC sources; storage labels: XML file, RDBMS, OODB, XMLDB).]
Fig. 1. An Overview of LORIS Architecture
The Application Layer allows the client applications to access the mediation layer, which offers the search services. This layer can be composed of different kinds of applications, which can be specific to each operational environment or share the same interfaces in the Web environment. These applications can be developed in C++, Java, PHP, ASP, etc. This layer accesses an integrated representation of the data sources, using a global query language. In order to allow an integrated query over the DLs and LCMSs, it is important to embed these applications in those environments or to make user queries activate these applications to get an integrated answer.
The Mediation Layer gets the applications' global queries, interprets and translates them, so that it is possible to access a uniform and transparent view of the data sources. In other words, this layer receives the global queries from the Application Layer and homogenizes the heterogeneous data sources. The global queries are validated against a Global Ontology (common schema). Then the mediator identifies which sources should be accessed, transforms the user queries into sub-queries, and sends them in a standard encoding to the corresponding wrapper (Wr) of each data source. The Global Ontology represents an integrated schema of the DL and LCMS metadata, while the mediator has the knowledge of the available DL and LCMS data sources. Therefore, when the mediator gets a global query, it validates this query against the global ontology and defines sub-queries for each wrapper that accesses the DLs and LCMSs.
The Data Access Layer (containing wrappers and local ontologies) receives the sub-queries from the mediator and accesses the local ontologies, which contain the mappings from the global schema to the local data source schema. Then, the wrappers transform the sub-queries into corresponding queries in the query language of each data source. Therefore, in order to access each DL or LCMS, its corresponding wrapper gets the respective sub-query, accesses the local ontology that maps the global schema into the local schema, and translates the sub-query into the adequate query for the DL or LCMS. The result of a query in the local data source is sent back to the corresponding wrapper, which translates the answer into the global language (through the local ontology). Then the wrappers send the results to the mediator, which centralizes all the answers. The mediator executes the necessary operations and finally sends the integrated answer to the user.
Therefore, learning objects and digital resources that correspond to the global query are presented to the user, who can then access them.
LORDiLIS stands for Learning Objects Repositories and Digital Libraries Integration System. When creating an integration system, the first step is to define the schema of the integrated view through the integration of the elements of the source schemas [Gomes, 1999]. When considering the metadata represented in the different standards for educational material and for digital bibliographic resources, it is important to have a "semantic understanding" among the concepts expressed by these standards. LORDiLIS' global (common) schema takes into account concepts from both bibliographic and e-learning metadata standards. LOM and MARC were the basis for the common schema because they are the most widely used and cover a large share of the metadata represented in other standards. Besides defining the common schema, it was also necessary to define the mappings between the global (common) schema and each of the local (source) schemas. Therefore we defined mapping rules to guarantee the integrated view of data. These mapping rules are represented through ontologies. Finally, in addition to considering metadata elements, it was also important to take into account the reference values. Therefore, complex elements such as subject/keyword were linked to taxonomies and only valid values were mapped.
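The query flow described in this section (validate the global query, dispatch sub-queries to the wrappers, translate local answers back to the global schema, merge) can be sketched as follows. This is only a minimal in-memory illustration under stated assumptions, not the LORIS or LORDiLIS code: the class and function names, the toy records, and the local field names (MARC tags 245 and 650, LOM paths) are all hypothetical choices made for the example, and the real system exchanges SOAP messages between web services rather than calling functions directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Search types accepted by the global schema (taken from the case study).
SEARCH_TYPES = {"Author", "Title", "Keyword", "Everywhere"}

Record = Dict[str, str]
# A wrapper takes (search_type, text) and returns records in the global schema.
Wrapper = Callable[[str, str], List[Record]]


@dataclass(frozen=True)
class GlobalQuery:
    search_type: str
    text: str
    sources: Tuple[str, ...]


def make_wrapper(rows: List[Record], local_to_global: Dict[str, str]) -> Wrapper:
    """Build a toy wrapper; local_to_global plays the role of a local ontology."""
    global_to_local = {g: l for l, g in local_to_global.items()}

    def wrapper(search_type: str, text: str) -> List[Record]:
        # Translate the global term into the local field(s) to search.
        if search_type == "Everywhere":
            fields = list(local_to_global)
        else:
            fields = [global_to_local[search_type.lower()]]
        needle = text.lower()
        hits = [r for r in rows if any(needle in r[f].lower() for f in fields)]
        # Translate each local record back into the global schema.
        return [{local_to_global[k]: v for k, v in r.items()} for r in hits]

    return wrapper


class Mediator:
    def __init__(self, wrappers: Dict[str, Wrapper]):
        self.wrappers = wrappers

    def search(self, query: GlobalQuery) -> List[Record]:
        # Validate the global query before contacting any source.
        if query.search_type not in SEARCH_TYPES:
            raise ValueError(f"unknown search type: {query.search_type}")
        unknown = [s for s in query.sources if s not in self.wrappers]
        if unknown:
            raise ValueError(f"unknown sources: {unknown}")
        # Dispatch one sub-query per source and merge the answers.
        merged: List[Record] = []
        for name in query.sources:
            for record in self.wrappers[name](query.search_type, query.text):
                merged.append({**record, "source": name})
        return merged


# Hypothetical toy data standing in for the two repositories.
marc = make_wrapper(
    [{"100": "Silva, Ana", "245": "Introduction to Databases", "650": "Databases"}],
    {"100": "author", "245": "title", "650": "keyword"},
)
lom = make_wrapper(
    [
        {"lifecycle.contribute.entity": "Silva, Ana",
         "general.title": "SQL Tutorial", "general.keyword": "Databases"},
        {"lifecycle.contribute.entity": "Costa, Rui",
         "general.title": "XML Basics", "general.keyword": "XML"},
    ],
    {"lifecycle.contribute.entity": "author",
     "general.title": "title", "general.keyword": "keyword"},
)
mediator = Mediator({"MARC": marc, "LOM": lom})
results = mediator.search(GlobalQuery("Author", "silva", ("MARC", "LOM")))
```

Keeping the schema translation inside each wrapper means the mediator only ever sees records in the global schema, which is the property that lets sources be added without changing the mediator.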
3 – The prototype and the case study
The LORDiLIS architecture has been implemented as web services and therefore relies on open standards and technologies such as Java, SOAP, WSDL and XML. XML is used as the common language for describing the data sources' schemas and for the queries and sub-queries that the mediators work with. We use XPath [W3C, 1999] as the XML query language and SOAP (Simple Object Access Protocol) [Clements, 2001] [Gudgin, 2001] as the communication protocol to send queries to the wrappers and get data from them. A prototype for integrating a LOM repository (the metadata repository of TecBD/PUC-Rio's LCMS) and a MARC repository (an extract with bibliographical metadata of digital resources from PUC-Rio's Library) has been developed at PUC-Rio according to the proposed architecture. The initial prototype therefore considers wrappers from two sites (TecBD and Library) as well as the mediation components. At each site we implemented the translation and data access services. At TecBD, we used .Net as the development platform, C# as the programming language and DB4Objects to store LOs. At the library, we used Java as the development language and SQL Server to store the digital documents and their descriptors in MARC. The JDBC and DOM APIs were used, respectively, to access the DBMS and to manipulate XML documents. In the development of the mediation components we used Tamino for storing ontologies and taxonomies, with OWL and RDF descriptions stored as XML. The mediation and translation services were described as web services using WSDL, allowing interoperability among applications that were developed on different platforms and in different programming languages. In order to treat structural and semantic heterogeneity in LORDiLIS we used mappings supported by ontologies. First, for each metadata standard that is used in the case study, there is an equivalent description in RDF containing its respective structure.
Similarly, we created a common ontology from these standards, containing a generalization of concepts such as author, title and keyword. The concept keyword has a taxonomy defining its possible values. This taxonomy is based on the Library of Congress and Brazilian National Library standard values. Then, we defined mappings from the common ontology to the schemas of the metadata standards. The descriptions in RDF, as well as the mapping ontologies, are stored in the ontology repository. The ontologies were represented in OWL, the Web Ontology Language [W3C, 2003]. OWL aims at providing a language that can be used to describe the classes, and the relationships among them, that are inherent to web documents and applications. This language can be used to formalize a domain through the definition of classes and their properties; to define individuals and assert properties about them; and to reason about these classes and individuals to the degree allowed by the formal semantics of OWL. In order to represent the semantic mappings we explored the available OWL resources. The OWL constructs that allow inter-ontology mapping are: equivalentClass, equivalentProperty, sameAs, differentFrom, and AllDifferent [Smith, 2004].
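To illustrate how such mapping constructs can be consumed by a program, the sketch below reads owl:equivalentClass statements from a small RDF/XML fragment. The paper does not show how its mapping ontologies are encoded, so the fragment, the example.org URIs and the function name are assumptions made purely for illustration; the code only inspects the XML syntax with Python's standard library and performs no OWL reasoning.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical mapping ontology; the example.org URIs are placeholders.
MAPPING_OWL = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/global#Author">
    <owl:equivalentClass rdf:resource="http://example.org/marc#Tag100"/>
    <owl:equivalentClass rdf:resource="http://example.org/marc#Tag700"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/global#Title">
    <owl:equivalentClass rdf:resource="http://example.org/marc#Tag245"/>
  </owl:Class>
</rdf:RDF>
"""


def equivalent_classes(owl_xml):
    """Return {global class URI: [equivalent local class URIs]}."""
    root = ET.fromstring(owl_xml)
    mappings = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        about = cls.get(f"{{{RDF}}}about")
        targets = [
            eq.get(f"{{{RDF}}}resource")
            for eq in cls.findall(f"{{{OWL}}}equivalentClass")
        ]
        mappings[about] = targets
    return mappings


author_targets = equivalent_classes(MAPPING_OWL)["http://example.org/global#Author"]
```

A wrapper could load such a table once at start-up and use it to expand a global term into the local fields it must query.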
In the Global Ontology (Figure 2) we have the classes Search Type, Source and Document. The Source and Search Type classes identify, respectively, the data sources and the search parameter. Although the Search Type class is currently defined to serve queries by Author, Keyword and Title (or over all of these parameters through the Everywhere option), the developed prototype can be easily extended to other types of searches, such as Edition Date and Language. The Document class allows the integration of the query results that the mediator receives from the wrappers.
Fig. 2. Global Ontology classes in Protégé
In the MARC Local Ontology (Figure 3) the Title class indicates to the wrapper its equivalent mappings in the MARC source, defining all tags that correspond to the title. The Author and Keyword classes work in the same way, defining the MARC tags that correspond to author and keyword.
Fig. 3. MARC Local Ontology in Protégé
In the LOM Local Ontology (Figure 4) the Title class indicates to the wrapper its equivalent mappings in the LOM source, defining all the categories and respective items that correspond to the title. The Author and Keyword classes work in the same way, defining the LOM categories and items that correspond to author and keyword.
Fig. 4. LOM Local Ontology in Protégé
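The local ontologies act, in effect, as lookup tables from global terms to source-specific fields. The sketch below applies the Author mappings reported in the case study (MARC tags 100, 110, 700, 710 and 720; LOM Lifecycle.Contribute.Entity restricted to the role "Author") to two small XML documents. The XML layouts, record identifiers and function names are hypothetical, substring matching is an assumption, and the queries use the limited XPath subset of Python's xml.etree.ElementTree rather than the full XPath 1.0 used by the prototype.

```python
import xml.etree.ElementTree as ET

# MARC tags that the MARC local ontology associates with the global term Author.
MARC_AUTHOR_TAGS = ["100", "110", "700", "710", "720"]

# Hypothetical, simplified XML layouts for the two repositories.
MARC_XML = """
<collection>
  <record id="m1">
    <datafield tag="100">Silva, Ana</datafield>
    <datafield tag="245">Introduction to Databases</datafield>
  </record>
  <record id="m2">
    <datafield tag="710">Silva Institute</datafield>
    <datafield tag="245">Annual Report</datafield>
  </record>
  <record id="m3">
    <datafield tag="100">Costa, Rui</datafield>
    <datafield tag="245">Silva and the Sea</datafield>
  </record>
</collection>
"""

LOM_XML = """
<repository>
  <lom id="l1">
    <lifecycle>
      <contribute><role>Author</role><entity>Silva, Ana</entity></contribute>
    </lifecycle>
  </lom>
  <lom id="l2">
    <lifecycle>
      <contribute><role>Editor</role><entity>Silva, Ana</entity></contribute>
    </lifecycle>
  </lom>
</repository>
"""


def search_marc_author(xml_text, text):
    """Return ids of MARC records whose author-related tags contain text."""
    root = ET.fromstring(xml_text)
    needle = text.lower()
    hits = []
    for record in root.findall("record"):
        for tag in MARC_AUTHOR_TAGS:
            fields = record.findall(f"datafield[@tag='{tag}']")
            if any(needle in (f.text or "").lower() for f in fields):
                hits.append(record.get("id"))
                break  # one matching tag is enough for this record
    return hits


def search_lom_author(xml_text, text):
    """Return ids of LOM records with a contributor in the Author role matching text."""
    root = ET.fromstring(xml_text)
    needle = text.lower()
    hits = []
    for lom in root.findall("lom"):
        # Only contribute elements whose role child equals "Author" are considered.
        for contribute in lom.findall("lifecycle/contribute[role='Author']"):
            if needle in contribute.findtext("entity", default="").lower():
                hits.append(lom.get("id"))
                break
    return hits


marc_hits = search_marc_author(MARC_XML, "silva")
lom_hits = search_lom_author(LOM_XML, "silva")
```

In this toy data, record m3 is not returned because "Silva" appears only in its title field, and l2 is not returned because the contributor's role is Editor rather than Author.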
One of the authors of this paper has been involved in the development of a Brazilian library system called Pergamum [Pergamum, 2005], which is used at PUC-Rio. Pergamum has a simple web search interface that we extended to incorporate the LORDiLIS client web service (Figure 5). This interface was also adapted to TecBD/PUC-Rio's LCMS. The Java application (Pergamum's extended interface) searches the LOM and MARC repositories through the mediator and wrappers.
Fig. 5. Application Interface
The user can choose the type of search to make (by Author, Title, Keyword or Everywhere) and the sources to be queried (LOM, MARC or Both). Then the user types the search text and submits it. Figure 5 shows the application interface. The mediator receives the user query and accesses the global ontology to validate the query. Then, the mediator defines the sub-queries and sends them to the wrappers. For example, in order to search for an Author in the MARC data source, the mediator sends the search text and other necessary information to the corresponding wrapper, which performs the mappings to the local schema with the support of the local ontology. In other words, the local ontology provides for Author (a global term) the following MARC tags: 100-Main Entry - Personal Name = "search text", 110-Main Entry - Corporate Name = "search text", 700-Added Entry - Personal Name = "search text", 710-Added Entry - Corporate Name = "search text", and 720-Added
Entry - Uncontrolled Name = "search text". Then, the sub-query will look for the author in all these tags in the MARC repository. Similarly, in order to search for an Author in the LOM data source, the mediator sends the search text and other necessary information to the corresponding wrapper, which performs the mappings to the local schema with the support of the local ontology. In other words, the local ontology provides for Author (a global term) the following LOM categories and items: Lifecycle.Contribute.Role = "Author" and Lifecycle.Contribute.Entity = "search text". Then, the sub-query will look for the author in the entity item of the LOM repository, but only when the role value is equal to “author”. The wrappers access the data sources and return the query results to the mediator, which accesses the global ontology to assemble the integrated answer for the application. With this prototype it was possible to search LOM and MARC repositories in an integrated way. Although some further developments are being considered, such as the use of taxonomies as controlled vocabularies for both repositories and the use of other query parameters, the prototype demonstrated the functionality and ease of use of the proposed architecture.
4 – Conclusion
The architecture presented in this paper aims at providing the integration of LO repositories and DLs. This architecture provides users with a transparent and integrated view of the learning objects and digital resources that are stored in the data sources. This integration is independent of data model, query language, operating system and location. A prototype of the proposed architecture has been implemented to integrate digital resources of PUC-Rio's DL and learning objects from TecBD/PUC-Rio. Other works in the specialized literature present the use of DLs for e-learning; however, their approaches differ from the one presented in this paper.
Ilumina [Ilumina, 2005] and DILLEO [Mikulecky, 2005] each provide a DL of LOs. A tentative mapping of the LOM and MARC standards can also be found in [Qin, 2002]. The most similar work is the LEBONED Project [Oldenettel, 2003] (Learning Environment Based on Non Educational Digital Libraries). It considers an architecture to integrate DLs into a Learning Management System (LMS). It uses an extension of the SCORM standard [ADL, 2004] (which is based on IEEE LOM) in order to support the METS [LC, 2005] standard (which is based on MARC), while we use ontologies to provide mappings among DL and e-learning standards. Although we used LOM and MARC in the prototype, any other standard for describing digital resources or learning objects can be considered. LEBONED uses wrappers to export data from the DLs to the extended SCORM repository. We instead considered a global view of the repositories and DLs, which acts as an integrated virtual database with the digital resources and learning objects of all data sources. In addition, the use of web services and ontologies as considered in the proposed architecture allows easier and more flexible implementation of the components of the architecture, the definition of better mapping rules, as well as a better semantic orientation through the representation of the respective metadata schemas. Therefore, integration processes are more reliable and the corresponding semantics are better represented.
When data sources are added to the integration environment, a series of conflicts (heterogeneity) needs to be treated. The proposed architecture, through the use of ontologies and mediators, offers more flexibility and versatility when a new data source is added. If a new data source uses a data model that already exists in a local schema, it can use the same wrapper as that other local source, thus allowing code reuse. As future work, it is important to define wrappers for other standards and to enrich the use of taxonomies, for example by treating values that were not initially considered in the taxonomies. It is also important to generate educational metadata values semi-automatically for bibliographic resources.
References
[ADL, 2004] Advanced Distributed Learning: SCORM 2004 2nd Edition Overview. http://www.adlnet.org/scorm/history/2004/documents.cfm
[ARIADNE, 2005] ARIADNE Foundation for the European Knowledge Pool. Available at: http://www.ariadneeu.org/ Accessed September 2005.
[Clements, 2001] Clements, T. (2001) "Overview of Soap Web Services - Technical Overviews", Sun Microsystems, August 2001.
[DCMI, 2005] Dublin Core Metadata Initiative. Available at: http://dublincore.org/ Accessed September 2005.
[Gomes, 1999] Gomes, G. R. R.: An environment for bibliographic data integration based on mediators. MSc. Thesis, Catholic University of Rio de Janeiro, Brazil, Computer Science Department, 1999 (in Portuguese).
[Gruber, 1994] Gruber, T.: Towards principles for the design of ontologies used for knowledge sharing. International Journal of Human and Computer Studies, v. 43, n. 5/6, pp. 907-928, 1994.
[Gudgin, 2001] Gudgin, M., Moreau, J. and Nielsen, H. F. (2001) "SOAP Version 1.2", W3C, July 2001.
[Ilumina, 2005] Ilumina, http://www.ilumina-dlib.org/browse.asp Accessed July 2005.
[LC, 2000] Library of Congress, Network Development and MARC Standards Office: MARC21 format for bibliographic data: including guidelines for content designation. 2000 ed.
[LC, 2004] Library of Congress: Cataloger's Desktop [CD-ROM], n. 2, 2004.
[LC, 2005] Library of Congress: METS: An Overview & Tutorial. http://www.loc.gov/standards/mets/METSOverview.v2.html, May 2005.
[LTSC, 2002] Learning Technology Standards Committee of the IEEE: LOM Draft Standard 1484.12.1-2002. http://ltsc.ieee.org/wg12/20020612-Final-LOM-Draft.html
[Mikulecky, 2005] Mikulecký, S., Maruna, Z., Smrčka, I.: DILLEO – Digital Library of Learning Object. https://dilleo.osu.cz/dilleo_v2/About.aspx Accessed August 2005.
[Moura et al., 2005] Moura, S. L., Coutinho, F. J., Siqueira, S. W. M. and Melo, R. N. (2005) "Integrating Repositories of Learning Objects Using Web-Services to Implement Mediators and Wrappers". In: Proceedings of the International Conference on Next Generation Web Services Practices (NWESP'05), 2005, Seoul.
[Oldenettel, 2003] Oldenettel, F., Malachinski, M., Reil, D.: Integrating Digital Libraries into Learning Environments: The LEBONED Approach. In: Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries, 2003, pp. 280-290.
[Pergamum, 2005] Pergamum – Integrated System of Libraries, http://www.pergamum.pucpr.br/
[Qin, 2002] Qin, J.: Crosswalk of MARC 21 and IEEE LOM. 2002. http://web.syr.edu/~jqin/LO/Metametadata_crosswalk.html#extended
[Smith, 2004] Smith, M. K., Welty, C. and McGuinness, D. L. (Eds.) "OWL Web Ontology Language Guide", W3C, Feb. 2004, http://www.w3.org/TR/owl-guide/
[Tzitzikas, 2002] Tzitzikas, Y., Spyratos, N., Constantopoulos, P.: Query Translation for Mediators over Ontology-based Information Sources. In: Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence, 2002, pp. 423-436, Lecture Notes in Computer Science, Springer-Verlag, London, UK.
[UDuham, 2003] Universities of Durham (2003) "IBHIS: Integration Broker for Heterogeneous Information Sources", Universities of Durham, Keele, http://www.co.umist.ac.uk/ibhis/
[W3C, 1999] World Wide Web Consortium (1999) "XML Path Language (XPath), Version 1.0 - W3C Recommendation", November 1999, http://www.w3.org/TR/xpath
[W3C, 2003] World Wide Web Consortium (2003) "Web Ontology Language (OWL) Reference Version 1.0", W3C Working Draft, 21 February 2003, http://www.w3.org/TR/2003/WD-owl-ref-20030221/