A generic component for exchanging user models ... - CiteSeerX

6 downloads 32596 Views 546KB Size Report
interoperability for user models is a component in a series of concrete generic components for model-based application development. These components are ...
64

Int. J. Cont. Engineering Education and Lifelong Learning, Vol. 16, Nos. 1/2, 2006

A generic component for exchanging user models between web-based systems Kees van der Sluijs* Eindhoven University of Technology, PO Box 513, NL-5600 MB Eindhoven, The Netherlands E-mail: [email protected] *Corresponding author

Geert-Jan Houben Eindhoven University of Technology, PO Box 513, NL-5600 MB Eindhoven, The Netherlands E-mail: [email protected] Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium E-mail: [email protected] Abstract: Educational web-based systems exemplify the increasing need for personalisation. Applications that adapt to individual users need a model of the user that contains as accurate data as possible. On the web, learners use multiple educational systems and spend their time over many applications: these are individually limited in their user modelling but can gain from joining forces. This boils down to establishing semantic interoperability of user or learner models. While semantic interoperability is hard, the emerging Semantic Web (SW) might offer just the mechanisms we need. In this paper, we develop the Generic User model Component (GUC): a generic software that utilises SW technology to support the exchange of user model data between applications. For a semantically effective user model exchange, GUC allows the configuration of a distributed management of mappings between user models. Thus, applications can choose different levels of uniting user models to maximise their personalisation. Keywords: learner model; personalisation; semantic interoperability; user model server; Semantic Web (SW); schema mapping. Reference to this paper should be made as follows: van der Sluijs, K. and Houben, G-J. (2006) ‘A generic component for exchanging user models between web-based systems’, Int. J. Continuing Engineering Education and Lifelong Learning, Vol. 16, Nos. 1/2, pp.64–76. Biographical notes: Kees van der Sluijs received his MSc degree in Computer Science from Technische Universiteit Eindhoven. He is currently a PhD student in the Department of Information Systems of the same university. His research is part of the Hera research program that includes research on web information systems, adaptation and personalisation and Semantic Web technologies. His individual research interests include web information systems, Semantic Web, user adaptation, user modelling, semantic

Copyright © 2006 Inderscience Enterprises Ltd.

A generic component for exchanging user models

65

interoperability and document retrieval. He is currently involved in the AlterEgo-project and the MobiLife-project that both focus on personalisation and sharing and exchanging user models. Geert-Jan Houben is a Professor of Information Systems at Vrije Universiteit Brussel in Belgium and part-time associated with Technische Universiteit Eindhoven in the Netherlands. He is heading the Hera research program on web information systems, adaptation and personalisation and Semantic Web technologies. In Brussels, he is codirector of the WISE Research Laboratory. Main research topics are (Semantic) Web technology, web query and transformation languages, adaptation and personalisation in web applications and web engineering. He (co-)organised, chaired or served as PC member for numerous events in the field and serves as an Editorial Board Member or Managing Editor for journals in the area.

1

Introduction

The nature of modern day learning has changed in many aspects. Web-based educational systems offer opportunities that could not be realised previously in the educational and instructional processes. The main aim today is to enable learners to have their own teacher in their own learning context. To realise this, web-based educational systems become personalised, which means that the applications require functionality for automatic adaptation. Having adapted to their learners, the applications can provide the behaviour that is desired in a specific learning context. Thus, an adaptation allows a better and more effective learning experience. One critical requirement for this adaptation is the availability of a good description of the current learning context. This description is captured in the user or learner model, a formal and explicit specification of the user’s knowledge and context. The user model typically records the user properties such as preferences, behaviour, knowledge and other facts that are deemed relevant to the application. To optimise the personalisation, this model should be accurate and complete as possible. However, learners spend their time for many educational applications and even use many devices to do so. Since learners are typically not willing to invest a lot of time to explicitly fill this user model for every single application they use, the result is the data on the user is fragmented for many applications. Furthermore, the user model for a single application is usually limited to the goals and purpose of this application. As applications often register similar data on the user, information on the user is also gathered, which has been already known in another application’s user model. Related is the so-called cold-start problem that refers to the difficulty of applications to start up the adaptation for new users: most adaptation does not work properly as long as the user model is (practically) empty. Being able to use the user data from other applications decreases this problem. Consequently, we see a number of reasons, both for the users and the applications they use, for integrating user models. Therefore, we need a way to support applications in sharing their knowledge to the user. Sharing user knowledge between applications requires semantic interoperability with respect to the user models. This is in general an ambitious challenge and requires a high degree of alignment between the applications on syntax, structure and semantics.

66

K. van der Sluijs and G-J. Houben

In the past, it is nearly impossible to enforce applications to use a prescribed vocabulary in a heterogeneous environment such as the web. In the educational field we have seen the emergence of many domain-specific standards, for example, PAPI (Collet et al., 2001) and LOM (Hodgins and Duval, 2002), but still a semantic understanding between applications remains a challenge. However, the recently emerged Semantic Web (SW) offers mechanisms that might help us in this quest. The SW provides a common framework for structuring data in a syntax-independent way. It allows the flexible definition of data structures and offers mechanisms to define relations between these structures. Moreover, using the SW’s well-defined semantics within the user models allows to reason on the model instance data. These ingredients fulfil the basic requirements to establish semantic interoperability. The context of our research and some related work are given in Section 2. We describe how we use rule-based mappings to translate between user models in Section 3. Then, in Section 4, we explain some mapping configuration techniques, which we have used within our component and motivate how we came to that choice. In Section 5, we present the Generic User model Component (GUC) and describe its architecture. Finally, we discuss some peripheral functionality in Section 6.

2

Background

The GUC component that we described in this paper as the basis for the semantic interoperability for user models is a component in a series of concrete generic components for model-based application development. These components are developed in the spirit to offer designers’ components that after little configuration can be linked together into a complete component-based Adaptive Web Information System (AWIS). This component-based architecture is based on the Hera (Vdovjak et al., 2003) and AMACONT (Fiala et al., 2003) technology. Formerly, we created GAC (Fiala and Houben, 2005), which is a generic transformation component for different transcoding steps, SAG (Houben et al., 2005) which is an extension to GAC that allows dynamic self-configuration and HPG (Rutten et al., 2004) which is a component for hypermedia presentation generation. In this series, GUC fits in as a generic component that offers functionality to store data models for applications and to exchange user data between these models. GUC can be used as a part of an AWIS, to support adaptation for personalisation, but can also be used as a stand-alone user model server. Section 5 explains the GUC architecture. When we look at the earlier work on user model servers and semantic interoperability for user models, we see that this focused mainly on the syntactic problems. The basic problems were already identified in the work of Orwant called ‘Doppelgänger’ (Orwant, 1995). This work focused primarily on the gathering and distribution of data and the use of several learning techniques. More recent approaches such as UserML (Heckmann and Krüger, 2003) and GUMO (Heckman et al., 2005) focus on creating a common language and ontology for the communication of the user models. Another example is MUMS (Brooks et al., 2004) in which communication interoperability is established by a generic communication framework and semantic interoperability is stimulated by offering an ontology database for the sake of reuse of (parts of) ontologies.

A generic component for exchanging user models 3

67

Translating user models through mappings

In the approach for translating user models, we allow the applications to use their own vocabulary, that is, a proprietary format for their user models. The SW language OWL allows a high expressivity, while also providing a semantic structure to exploit relationships between concepts. The fact that we have chosen to use such a language does not automatically mean a limitation for the models, for example, for legacy applications. Already many SW wrappers are out there (e.g. refer for a collection of wrappers to RDFizers (2005) and for most data models such a wrapper can be produced easily.

3.1 An example Within the approach we can assume that applications have an OWL-schema that describes the structure of their user model. Furthermore, we assume that applications maintain an instance of this model for every user they wish to model. Figure 1 shows an example with two rather simple user models (that are derived from actual user profiles from the two Dutch applications). Both applications maintain basically the same information about the user, namely their name, e-mail address, the region of residence and the password for an application. However, both applications use a different model to store this information both syntactically and semantically. Figure 1

Two examples of user model schemas

App1 for instance maintains a name by dividing it into ‘Title’, ‘First Name’, ‘Infix’ and ‘Last Name’, while App2 only maintains one string with the full name: the information is syntactically as well as differently structured. A more complex relationship is the semantic equivalence between the properties ‘Postal Code’ and ‘City’, as the values of these properties are different, but there is however a one-to-one relationship between the instances. A specific ‘Postal Code’ relates directly to a specific ‘City’ and a specific ‘City’ relates directly to a ‘Postal Code’ (range). Note that even if the semantics and the syntax are the same, as is with the Password property, there may still be a pragmatic difference, for example, the password is application-dependent. Now, the question that we try to solve in our component is: suppose we have an instance of App1, such as in Figure 2 how would we translate that into an instance of App2? In the solution, discussed in the next sections, we utilise the fact that user models

68

K. van der Sluijs and G-J. Houben

have a predefined context in terms of their relation with a specific user. For this we need to assume that the component can uniquely identify a user, which is a separate problem that we will not discuss here. We will assume throughout the rest of the paper that the component knows which instances belong to which user. Figure 2

Instance of a user model for App1

3.2 Mapping instances based on schemas Figure 3 gives a schematic overview of the translation process for user models that in GUC is available as a Translation Service. Based on User Model (UM) schema A and UM schema B we create a mapping. This mapping acts as a recipe with which we are able to later translate into an instance of schema A, say ‘a’, into an instance of schema B, namely ‘a′’. Since schemas A and B may maintain different information about the user, ‘a′’ may not be a complete instance of schema B in the sense that B can contain more information than A and thus the translation of ‘a’ might not completely ‘fill’ schema B. Figure 3

Schematic overview of the translation service

The mappings are generated based on the similarities between two input schemas. This can be done by hand or by an external component. To find matching information we can effectively utilise SW technology. The SW language OWL (McGuinness and van Harmelen, 2004), together with a rule language like SWRL (Horrocks et al., 2004), provides just the mechanisms we need to relate these data structures. We express the mappings in the rule language SWRL, which is based on RuleML (Hirtle et al., 2005). Now consider the partial (simplified) schemas of Figures 4 and 5 for the applications App1 and App2 from Figure 1. These schema snippets contain the different structures for the user name. Now based on these schemas we can construct a mapping for translating an instance of App1, in particular the name part, into an equivalent instance of App2. This will result in the (simplified) mapping as expressed in

A generic component for exchanging user models

69

Figure 6. Using SWRL Built-Ins (Horrocks et al., 2004) many value transformations can be performed, so that most mapping needs can be fulfilled. By extending these Built-ins and by using additional resources (e.g. a standard postal code – city table or Wordnet, etc.), more powerful mappings can be created. Figure 4

Part of schema App1

Figure 5

Part of schema App2

Note that the mappings are constructed only once and that is done by a (human) designer. Since this can be a labourious task, we aim to provide the designer with some supporting techniques to semi-automatically generate these mappings. For example, the designer could use graph-matching techniques to match two input schemas and identify the overlapping information between the schemas. There exist numerous heuristic algorithms for data in a language like OWL that can generate such a merged schema. These algorithms exploit a variety of heuristics such as name similarity, thesauri, schema structures, value distribution of instances, past mappings (e.g. by using statistical analysis), constraints, cluster analysis and similarities to standard schemas. Our component allows to use such matching algorithms, but whether such an algorithm is used and if so which one, is optional and part of the configuration by the designer. Examples of such algorithms are similarity flooding (Melnik et al., 2002) and the machine learning-based matching software GLUE (Doan et al., 2004). For a survey on these techniques see Rahm and Bernstein (2001). An existing framework that is able to match OWL graphs is COMA++ (Broekstra et al., 2002). Whatever the algorithm used for schema matching, the result must be verified and possibly be edited by hand before it can be effectively used. This is because the semantic structures may not be interchangeable, for example, pragmatics as we explained earlier.

70

K. van der Sluijs and G-J. Houben

Figure 6

4

Mapping, to translate the name structure from App1 to App2

Translation configuration

4.1 Exchanging user models between multiple applications In Section 3, we have showed how we can translate instances between two user model schemas. In our educational context we do not want to translate learner models from one to other application, but we want to exchange learner models between N applications. We can use several strategies to support this, for example, the strategy depicted in Figure 7 by making mappings to and from every application. The advantage of this approach is that the application is not required to use a specific vocabulary and structure to be interoperable with other applications and specific custom-made translations between applications can be built. The disadvantage is the complexity involved in the obligation to have 2N 2 mapping for N applications. This means that for adding one application to the system 2N mappings need to be created before being able to fully utilise the exchange of data between all the applications. A variant of this would be a P2P approach.

A generic component for exchanging user models Figure 7

71

Direct connection approach

Another strategy is depicted in Figure 8. The principle is to create a Shared User Model (S-UM), which contains the most used concepts within the domain. Typically this S-UM could be based on one of the available standard ontologies for UM like Heckman et al. (2005). By creating a mapping to and from every application and S-UM, S-UM can be used as a mediator for the exchange of user data between the applications. The advantage of this approach is that new applications can be easily added to the system through only two mappings. The complexity here is 2N mappings for N applications. Disadvantage is that translating in two steps, via S-UM, might result in loss of information. Moreover, one cannot make specific translations between applications that take into account subtle details of the two applications that are not reflected in S-UM. Figure 8

Shared-UM approach

The approach in our system is to mix the two strategies. We use S-UM for exchanging the most common UM information between the applications, allowing to easily adding new applications to the system with a low design complexity. Next we allow for direct mappings between applications, such that the need for a more custom-made exchange of application-specific information can be fulfilled.

72

K. van der Sluijs and G-J. Houben

4.2 Data integration Part of what applications aim for when they exchange information with other applications is to gain previously unknown information or to update existing information to the user. This gives rise to data integration problems. As the information from other applications may be inconsistent with each other or with the existing information, data reconciliation is an issue. We support data reconciliation by again applying the OWL and SWRL techniques. For each application, rules can be defined that specify how to reconcile data in the case of a conflict. These rules are evaluated by the reconciliation engine. For example, a possibility for elements that are annotated with dates is to choose the element with the newest date. The simplified rule example in Figure 9 depicts comparing two elements with a date property and choosing the element with the newest date. Another example is to reconcile based on the trust values for applications in case that trust ratios are stored. In general, the system allows the designer to use every reconciliation rule conceivable in SWRL. Figure 9

5

Simplified SWRL-rule for date-reconciliation

GUC

We have combined the techniques and approaches that we described in Sections 3 and 4 in our software component called GUC. GUC contains a user model server, which is able to utilise the discussed techniques to support the exchange of the user data between applications with heterogeneous user models.

A generic component for exchanging user models

73

5.1 User model server Applications can ‘subscribe’ to GUC. Subscribed applications can request and upload the user data of particular users. To provide this functionality for an application, GUC requires the schema describing the user model in this application. GUC stores these application schemas in its application schema repository. If this application schema is present in GUC, an application can upload instances of that schema for particular users, such that for each user an instance is stored in GUC. An application may ask GUC for a particular user for the corresponding instance of its application schema.

5.2 UM exchange In GUC, we combine this basic user model server functionality with the translation framework discussed in Section 4. The translation service needs the data integration engine for data integration purposes. To get the mappings we couple the translation service to a mapping creator. In this mapping creator, designers can create the mappings with or without the supporting matching tools. Furthermore, there is a user model editor, so that the user can edit their data. This leads to the basic architecture of our model as depicted in Figure 10. Figure 10 Basic architecture of GUC

5.3 Implementation The implementation of GUC is currently done within the framework of the AlterEgo-project (Schuurmans and Zijlstra, 2004) as a component for a cross-domain user profiling system. However, GUC is designed to be applicable within every user model or learner model-based environment. The system is entirely developed in Java. Furthermore, we use Sesame for the OWL-repositories and RuleML for executing the SWRL-rules. We are currently building the translation service, based on hand-made mappings. Simultaneously we are building a UM editor to let users edit their own UM data. For the Mapping Creator, we investigate the possible application of existing software components such as COMA++ (Aumüller et al., 2005). Furthermore, we consider the implementation of the peripheral functionality, which we discuss in the next section.

74 6

K. van der Sluijs and G-J. Houben

Peripherals

In this section, we briefly discuss current views on some parts of the peripheral functionality that is important for a system as GUC. We will discuss our ideas, which we would later on implement within our system.

6.1 Reasoning and learning Using SW languages such as OWL and SWRL, we have a great collection of technologies to our disposal. The most important feature of OWL is its reasoning capabilities. By writing statements in a language as OWL, we can use reasoners (e.g. RACER (Haarslev and Möller, 2003) and FACT++ (FaCT++, 2005)) to make implicit data explicit. In this way additional, possibly useful, data for a UM becomes available. This reasoning can be done before storing the data in the UM repository (‘offline’). Or it can be done upon application request, in which case less data are stored and the inferred data can be more easily dismissed by the application. As applications can store UM data in GUC, they would benefit from having data mining functionality. An example is finding patterns in user and cross user data. This can be used to learn the user model based on application access data. Thus, UM data can be gathered without constant user dialogue and the user can get recommendations.

6.2 Context-based UM Users typically act in different roles. Their behaviour and preferences will differ depending on the context. A person will, for instance, display different preferences alone or acting within a group. Therefore, we want to give support to maintain a UM given a certain context. This can be done by storing a UM for a user that is only valid given a certain context. If the user uses the application in another context another UM is created. This topic is part of our research in the IST MobiLife-project (MobiLife, 2005).

6.3 Privacy and trust Within a system that stores UM-data and exchanges UM data between applications, trust of the user in the system is crucial, as otherwise the user might not be willing to use the applications in the first place. Therefore, we make GUC as transparent as possible and under control by the user. Via the UM editor we want to give the user access to all data that is stored about the user and enable the user to alter this data. We also want to give the user control over which applications may get access to which data. Therefore, GUC must be able to apply privacy policies. We assume this can be solved by attaching a privacy enforcer component to the translation service. The privacy and trust topics are prominent topics within the AlterEgo-project (Schuurmans and Zijlstra, 2004) in which we are involved. 7

Conclusion

In this paper, we introduced a new generic component for the storage of UM data and the exchange of this UM data between applications. For this system we utilise SW technologies. The exchange of UM data is done by applying SWRL mappings. In this

A generic component for exchanging user models

75

way, an instance of one application’s UM schema (modelled in OWL) can be translated into a (partial) instance of another application’s UM schema. For the translation, mappings are needed which drive the instance translation. Such mappings can be semi-automatically generated using schema-matching techniques and are then fed to the translation service in the component. To control all mappings between N applications, the component supports the configuration of the distributed management of these mappings utilising a shared-UM that contains the most used concepts within the domain combined with hand-made mappings for more specific UM data sharing. We have illustrated the architecture of the component and discussed some important peripheral functionality such as learning, reasoning, context-awareness, privacy and trust.

References Aumüller, D., Do, H.H., Massmann, S. and Rahm, E. (2005) ‘Schema and ontology matching with COMA++’, ACM SIGMOD/PODS, Baltimore, Maryland, Software Demonstration, Available at: http://lips.informatik.uni-leipzig.de:80/pub/2005-5. Broekstra, J., Kampman, A. and van Harmelen, F. (2002) ‘A generic architecture for storing and querying rdf and rdf schema’, Proceedings of the First International Semantic Web Conference (ISWC), Sardinia, Italy, 9–12 June, pp.54–68. Brooks, C., Winter, M., Greer, J. and Gord, M. (2004) ‘The massive user modelling system (MUMS)’, Proceedings of the Seventh International Conference on Intelligent Tutoring Systems (ITS), Maceio, Brazil, pp.635–645. Collet, M., Linton, F., Goodman, B. and Farance, F. (2001) ‘Standard for learning technology – public and private information (PAPI) for learners (PAPI learner)’, IEEE P1484.2.1/D8, Draft. Doan, A., Domingos, P. and Halevy, A. (2004) ‘Ontology matching: a machine learning approach’, in S. Staab and S. Studer (Eds). Handbook on Ontologies, International Handbooks on Information Systems, Springer, pp.384–405. FaCT++ (2005) ‘OWL: FaCT++’, Available at: http://owl.man.ac.uk/factplusplus/. Fiala, Z. and Houben, G.J. (2005) ‘A generic trancoding tool for making web applications adaptive’, Proceedings of the 17th Conference on Advanced Information Systems Engineering (CAiSE’05) Forum, Porto, Portugal, 13–17 June, pp.15–20. Fiala, Z., Hinz, M., Meissner, K. and Wehner, F. (2003) ‘A component-based approach for adaptive dynamic web documents’, Journal of Web Engineering, Vol. 2, Nos. 1/2, pp.58–73. Haarslev, V. and Möller, R. (2003) ‘Racer: a core inference engine for the Semantic Web’, Proceedings of the Second International Workshop on Evaluation of Ontology-Based Tools, (EON2003), Sanibel Island, Florida, 20 October, pp.27–36. Heckman, D., Schwartz, T., Brandherm, B. and Von Wilamowitz-Moellendorff, M. (2005) ‘Gumo, the general user model ontology’, Proceedings of the Tenth International Conference on User Modeling (UM’2005), Poster, Edinburgh, Scotland, 24–29 July, pp.428–432. Heckmann, D. and Krüger, A. (2003) ‘A user modeling markup language (userml) for ubiquitous computing’, Proceedings of the Ninth International Conference on User Modeling (UM’2005), Poster, Johnstown, USA, 22–26 June, pp.393–397. Hirtle, D., Boley, H., Grosof, B., Kifer, M., Sintek, M., Tabet, S. and Wagner, G. (2005) ‘Schema specification of RuleML 0.89’, Available at: http://www.ruleml.org/spec/. Hodgins, W. and Duval, E. (2002) ‘Learning Object Metadata standard (LOM)’, IEEE 1484.12.1-2002. Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B. and Dean, M. (2004) ‘SWRL: a Semantic Web rule language combining OWL and RuleML’, Draft Version 0.7 of 21 December 2004, Available at: http://www.daml.org/rules/proposal/.

76

K. van der Sluijs and G-J. Houben

Houben, G.J., Fiala, Z., van der Sluijs, K. and Hinz, M. (2005) ‘Building self-managing web information systems from generic components’, First International Workshop on Adaptive and Self-Managing Enterprise Applications (ASMEA’05), Proceedings of the CAiSE2005, Porto, Portugal, 14 June, Vol. 2, pp.53–67. McGuinness, D.L. and van Harmelen, F. (2004) ‘OWL web ontology language overview’, W3C Recommendation 10 February 2004, Available at: http://www.w3.org/TR/owl-features/. Melnik, S., Garcia-Molina, H. and Rahm, E. (2002) ‘Similarity flooding: a versatile graph matching algorithm and its application to schema matching’, Proceedings of the 18th International Conference on Data Engineering (ICDE’02), San Jose, USA, pp.117–129. MobiLife (2005) ‘MobiLife – life goes mobile, Available at: http://www.ist-mobilife.org/. Orwant, J. (1995) ‘Heterogeneous learning in the Doppelgänger user modeling system’, User Modeling and User-Adapted Interaction, Vol. 4, No. 2, pp.107–130. Rahm, E. and Bernstein, P.A. (2001) ‘A survey of approaches to automatic schema matching’, The VLDB Journal, Vol. 10, pp.334–350. RDFizers (2005) ‘SIMILE | RDFizers’, Available at: http://simile.mit.edu/RDFizers/index.html. Rutten, B., Barna, P., Frasincar, F., Houben, G.J. and Vdovjak, R. (2004) ‘HPG: a tool for presentation generation in WIS’, Proceedings of the 13th International World Wide Web Conference (WWW2004), Poster, New York, USA, 17–22 May, pp.242–243. Schuurmans, J. and Zijlstra, E. (2004) ‘Towards a continuous personalization experience’, Proceedings of the Eighth Conference on Dutch Directions in HCI (SIGCHI.NL), Amsterdam, The Netherlands, 10 June, pp.19–22. Vdovjak, R., et al. (2003) ‘Engineering Semantic Web information systems in Hera’, Journal of Web Engineering, Vol. 2, Nos. 1/2, pp.3–26. h

Suggest Documents