Ontology Markup for Web Forms Generation

Marlon Dumas1, Lachlan Aldred1, Mitra Heravizadeh1,2, Arthur H.M. ter Hofstede1

1 Centre for Information Technology Innovation, Queensland University of Technology, GPO Box 2434, Brisbane QLD 4001, Australia
{m.dumas, l.aldred, a.terhofstede}@qut.edu.au

2 GBST Holdings Pty Ltd, PO Box 1511, Milton QLD 4064, Australia
[email protected]

WWW’02 Workshop on Real-World Applications of RDF and the Semantic Web

Abstract

Ontologies promise to become the keystones of the Semantic Web. Realising this promise requires adequate approaches to model, represent, and mark up ontologies on the Web. This paper presents an approach to model ontologies as populated conceptual schemas and to implement them using XML tools and relational databases. The issue of marking up these ontologies with presentation information is then addressed. The approach is applied to the generation of Web forms from marked up ontologies, in such a way that the generated forms encode the constraints expressed in the ontology.

1 Introduction

The explosive development of the Web has brought forward the need for machine-processable representations of semantically rich information: a vision at the heart of the Semantic Web [1]. At present, the concept of ontology is being put forward as a potential enabler of this vision [9]. In a nutshell, an ontology is a “shared understanding of a domain that can be communicated between people and application systems” [5]. Ontologies provide a consensual conceptualisation through which heterogeneous programs can communicate with each other and with their users. Potential applications of ontologies include site organisation and browsing support, integration of heterogeneous data sources, and advertisement of products and services.

Since ontologies are intended (among other things) to support communication between application programs and their users, there needs to be a way to mark them up with presentation information. In particular, it should be possible to generate, from a given ontology with presentation markups, user interfaces supporting tasks such as designating or displaying one or several elements of this ontology.

This paper describes an approach to model and represent marked up ontologies in a way that enables their processing through mature and scalable technologies. We specifically study the case where ontologies are marked up to generate Web forms allowing users to designate a specific term of an ontology (e.g., designating a car in a car ontology, or a wine in a wine ontology). In the proposed approach, an ontology is modelled as a populated conceptual schema and implemented as an XML schema and a relational database. This populated conceptual schema is then marked up by a forms developer, and the resulting marked up ontology is translated into presentation annotations over the underlying XML schema. These annotations are used to generate the XHTML/Javascript code for the form, which is ultimately displayed by a browser. The data entered by the user in the form is then parsed by a servlet, which creates an XML document conforming to the XML schema of the marked up ontology.

In the rest of the paper, we successively introduce our approach to model and mark up ontologies for generating Web forms (Sections 2 and 3), and we describe an ongoing implementation (Section 4). This description is followed by a review of related work (Section 5) and concluding remarks (Section 6).

2 Modelling ontologies

An ontology can be seen as a set of terms, relationships between these terms, and rules governing these relationships. For tasks that do not involve complex reasoning, the rules captured in an ontology can be limited to those found in conceptual modelling languages such as UML (Unified Modelling Language) [10] and ORM (Object-Role Modelling) [7]. In order to ground the discussion, we make use of ORM, although the results can certainly be adapted to UML.

For the purposes of this paper, the basic concepts of ORM are those of entity, role and fact. Entities are the things being modelled (e.g., the terms of an ontology), while facts are the statements that are made about entities (e.g., their relationships). A fact relates two or more entities, each of them playing a different role in the fact. Entities are grouped into entity types (graphically denoted by circles) and facts are grouped into fact types made up of one or more roles (denoted by rectangles with a line pointing to an entity type).

The special fact type “Subtyping”, denoted by a directed arrow, captures a notion of inheritance. If an entity type ET1 is a subtype of an entity type ET2, then any entity of ET1 is also an entity of ET2, and consequently, all the fact types involving ET2 also involve ET1. If ET1 is a subtype of ET2, ET1 will typically have a subtyping condition expressed in terms of the fact types in which ET2 is involved. All the instances of ET2 which satisfy this condition are in ET1.

An example ORM diagram is given in Figure 1. This diagram involves the entity types Item, Car, Book, Model, Make, and Year, where Car and Book are subtypes of Item. In addition to these entity types, the diagram features five fact types: ItemCategory, CarYear, CarMake, CarModel, and MakeModel. ItemCategory is used to define the subtyping conditions (see bottom left corner of the diagram). For example, a car is an item related to the category “Car” through ItemCategory. The fact types CarYear, CarMake, and CarModel capture the relationship between cars on the one hand, and makes, models, and years on the other. The fact type MakeModel captures the relationship between models and makes.

[Figure 1: Fragment of a populated ORM schema modelling an ontology. The diagram shows the entity types Item, Category, Car (ID), Book (ISBN), Model (Name), Make (Name), and Year (AD) with the value constraint {1950..}, and the fact types ItemCategory (roles “is of” / “includes”), CarYear (“has year” / “is year of”), CarModel (“has model” / “is model of”), CarMake (“has make” / “is make of”), and MakeModel (“is made by” / “makes”).

Subtyping conditions:
Car is Item is of Category "Car"
Book is Item is of Category "Book"

Populations (non-exhaustive list):
population(CarModel) = { (C1, Civic), (C2, Legend), (C3, Falcon), (C4, Mustang), ... }
population(MakeModel) = { (Honda, Civic), (Honda, Legend), (Ford, Falcon), (Ford, Mustang), ... }]

The following types of constraints (among others) can be captured in an ORM diagram:

• Role optionality: states whether all the entities of a type must take a given role or not. Mandatory roles are denoted by filled circles at the extremities of the role. In Figure 1, the roles “is of”, “has model”, “has make” and “has year” are mandatory.
• Role disjointness: states that an entity of a type can play only one among a set of roles.
• Role uniqueness: states whether a binary fact type captures a 1:1, 1:N or N:M relationship. In Figure 1, the fact type CarModel has a uniqueness constraint on the role “has model” (indicated by a double arrow on top of the role), meaning that a car has only one model. There is no uniqueness constraint on the role “is model of”, so several cars can have the same model. Hence, the fact type CarModel captures a 1:N relationship. The full definition of role uniqueness takes into account higher arity fact types.
• External uniqueness constraint: specifies uniqueness constraints over combinations of roles belonging to different fact types. In the example of Figure 1, there is an external uniqueness constraint involving the roles “is year of” and “is model of”. This means that a type of car is uniquely identified by a year and a model (e.g., Mustang 1988).
• Value constraint: enumerates, in extension or through ranges, the values that a value type can take. In Figure 1 for example, there is a value constraint over the type AD stating that the ontology only deals with years greater than or equal to 1950.

When describing ontologies, it is important to consider not only the structure of the Universe of Discourse (its schema) but also its population. Thus, in order to capture an ontology, a conceptual schema must be accompanied by a population, leading to what we call a populated conceptual schema. The population of an ORM schema is composed of the populations of the fact types of the schema. The population of a fact type is simply a set of tuples. For example, the population of the fact type MakeModel is a set of pairs, each composed of the name of a make and the name of a model, as exemplified at the bottom of Figure 1. For further details about the ORM notation, the reader is referred to [7].
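To make the notion of a populated conceptual schema more concrete, the following is a minimal Python sketch (not the authors' implementation) of binary fact types with their populations, together with checks for role uniqueness and for the value constraint on years. The class, method, and function names are illustrative assumptions; the populations are those listed in Figure 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FactType:
    """A binary fact type: two role names plus a population of pairs."""
    name: str
    roles: tuple[str, str]                  # (role of first entity, role of second)
    population: frozenset[tuple[str, str]]  # one pair per fact

    def players(self, role: str) -> list[str]:
        """Entities playing `role`, one entry per fact they take part in."""
        index = self.roles.index(role)
        return [fact[index] for fact in self.population]

    def satisfies_uniqueness(self, role: str) -> bool:
        """True if no entity plays `role` in more than one fact."""
        entities = self.players(role)
        return len(entities) == len(set(entities))


# Populations copied from Figure 1 (the paper marks them as non-exhaustive).
car_model = FactType(
    "CarModel", ("has model", "is model of"),
    frozenset({("C1", "Civic"), ("C2", "Legend"),
               ("C3", "Falcon"), ("C4", "Mustang")}),
)
make_model = FactType(
    "MakeModel", ("makes", "is made by"),
    frozenset({("Honda", "Civic"), ("Honda", "Legend"),
               ("Ford", "Falcon"), ("Ford", "Mustang")}),
)

MIN_YEAR = 1950  # value constraint {1950..} on years


def year_allowed(year: int) -> bool:
    """Value constraint: only years from 1950 onwards are in the ontology."""
    return year >= MIN_YEAR
```

In this sketch, `car_model.satisfies_uniqueness("has model")` holds because each car has a single model, whereas `make_model.satisfies_uniqueness("makes")` does not, since Honda makes several models. Mandatory roles and external uniqueness constraints could be checked in a similar way over the populations.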

3 Marking up ontologies

Once an ontology has been specified as a populated conceptual schema, it can be marked up with presentation information. Below, we sketch a method for marking up ontologies in order to generate Web forms for designating an entity O of a given type T within an ontology. In the context of the car ontology (Figure 1), this form would allow a user to designate a particular type of car (e.g., “Honda Legend 1986”) by specifying a make, a model, and a year. To simplify the discussion, we assume that all fact types are binary. The extension to ternary and higher arity fact types is straightforward.

The marking up process works as follows. First, a subset of the fact types in which T participates is selected as those for which an input element should be created in the form. These fact types must be chosen in such a way that specifying the entity to which entity O is related through each of these fact types leads to an unambiguous designation of O. In the context of designating a car in the ontology of Figure 1, the fact types CarModel and CarYear are sufficient to perform this designation. Indeed, as discussed in the previous section, the presence of the external uniqueness constraint over the roles “is year of” and “is model of” entails that there is no need to take into account the fact type CarMake when designating a car. Still, the developer may decide to include the fact type CarMake in the form, with the idea that this might help the user in finding the model of the car to be designated.

In the second step of the marking up process, the following items are specified for each of the fact types selected in the first step:

1. The role played by the designated object O in this fact type.
2. The external name (as it is to appear in the form) of the above role. For example, the external name for the role “has make” can be “Make”.
3. A documentation of the fact type’s meaning, to be included in or linked to the form. For example, the documentation of the fact type CarMake could be: “The make of a car is the brand given to it by its manufacturer. Examples of makes are Honda and Ford”.
4. Whether the entity O′ to which O is related through this fact type is to be designated:
   • By an identifier (e.g., an agreed-upon label or a URI)
   • By means of a sub-form
   These two cases are respectively called designation by identifier and designation by sub-form. In the example of Figure 1, the developer will certainly use designation by identifier for all three fact types involving the type Car, since there is no means of designating a make, a model or a year other than by an identifier (at least according to the ontology’s schema). Sub-forms would be useful, for example, in the case of an ontology involving a type Hotel related to a type Address. In this setting, a form for designating an object of the type Hotel is likely to involve a sub-form for designating the address of the hotel. This sub-form would contain an input element for each of the fact types involving the type Address (e.g., street number, street name, city).
5. In the case of designation by identifier, the type of graphical element used to enter the identifier (text field, pull-down menu, check-box, slider, etc.). In the working example, the fact types CarModel and CarMake can be marked up so that a pull-down menu is used to represent them, while the fact type CarYear can be marked up as a text field.

For each fact type designated by sub-form, the marking up process is recursively repeated, this time taking as starting point the entity type that plays the inverse of the role specified in item 1 above. In the working example (i.e., designating a car), no designation by sub-form occurs, so the process stops at this point. In the example of marking up an ontology to designate a hotel, the marking up process would need to be repeated taking as starting point the type Address (to which the entity type Hotel is related).

The role optionality, role disjointness, and role uniqueness information contained in the ORM schema of an ontology is exploited during the form generation. Role optionality is used to determine whether an input element in a form is optional or not. Role disjointness is used to detect that if the user fills in an input element, (s)he no longer needs to fill in another one. Finally, role uniqueness is used to determine whether the term appearing in an input element of a form fully or partially determines the term(s) that must appear in another element. With respect to the example of Figure 1, if the ontology is used to generate a form to designate a type of car, and at some point the user selects a given make of car, only the car models of the selected make will appear as options in the pull-down menu for the CarModel fact type. This is because the fact type MakeModel captures a 1:N relationship. Similarly, the form will detect that once an input element corresponding to a type T is filled in, all other input elements corresponding to entity types that are in a 1:1 relationship with T through a fact type can be automatically filled in.

Following this principle of exploiting the ontology information during the form generation, the value constraints appearing in the ORM schema are translated into range checks in the code of the generated form. In the working example, a range check will be performed by the form to ensure that the value entered for the year is greater than or equal to 1950.

The population of the ontology is also exploited during the form generation. Specifically, if a fact type is mapped into a pull-down menu, the options that appear in this menu are derived from the population of one of the roles in this fact type. With respect to the example of Figure 1, if the population of the role “is make of” is { Honda, Ford, Toyota }, the pull-down menu for the make of a car will contain these values as its options. Similarly, in the case of a text field, the population of the corresponding role is used to check that the term entered by the user in this field actually exists in the ontology.
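The marking up items and the way constraints and populations feed into the form can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' markup format: the `FieldMarkup` record, the function names, and the external names “Model” and “Year” are hypothetical (the paper only gives “Make”), while the (make, model) pairs come from Figure 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMarkup:
    """Presentation markup for one selected fact type (items 1 to 5 above)."""
    fact_type: str             # e.g. "CarMake"
    role: str                  # role played by the designated entity O
    external_name: str         # label shown in the form
    documentation: str         # help text included in or linked to the form
    designation: str           # "identifier" or "sub-form"
    widget: str | None = None  # only meaningful for designation by identifier


# Hypothetical markup for the car form; only the "Make" label is from the paper.
CAR_FORM = [
    FieldMarkup("CarMake", "has make", "Make",
                "The make of a car is the brand given to it by its manufacturer.",
                "identifier", "pull-down menu"),
    FieldMarkup("CarModel", "has model", "Model", "", "identifier", "pull-down menu"),
    FieldMarkup("CarYear", "has year", "Year", "", "identifier", "text field"),
]

# (make, model) pairs from the MakeModel population in Figure 1.
MAKE_MODEL = [("Honda", "Civic"), ("Honda", "Legend"),
              ("Ford", "Falcon"), ("Ford", "Mustang")]


def make_options(make_model: list[tuple[str, str]]) -> list[str]:
    """Options of the make menu, derived from the population."""
    return sorted({make for make, _ in make_model})


def model_options(make_model: list[tuple[str, str]], selected_make: str) -> list[str]:
    """Options of the model menu once a make is selected (correlated menus)."""
    return sorted(model for make, model in make_model if make == selected_make)


def year_is_valid(text: str, minimum: int = 1950) -> bool:
    """Range check derived from the value constraint on years."""
    try:
        return int(text) >= minimum
    except ValueError:
        return False


def required_fields(form: list[FieldMarkup], mandatory_roles: set[str]) -> list[str]:
    """Labels of input elements that are mandatory because their role is."""
    return [field.external_name for field in form if field.role in mandatory_roles]
```

For instance, `model_options(MAKE_MODEL, "Honda")` yields the Civic and Legend entries only, which mirrors how selecting a make narrows the model menu in the generated form.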


4 Implementation aspects

An overview of the ontology-based forms generation system is given in Figure 2. In this figure, the circles denote system modules, the boxes denote inputs and outputs of these modules, and the dashed lines denote interactions between the system and the users.

[Figure 2: Ontology-based forms generation system. Components shown: Ontology Editor, Ontology Marker, Forms Generator, Forms Processing Servlet, Web Server, Browser. Users: Ontology Developer, Forms Developer, End User. Inputs and outputs: ontology schema (serialized ORM), ontology population (relational database), document schema (XML schema), presentation markups (XML), form implementation (XHTML + Javascript), and XML document (instance of the document schema). Technologies labelled in the figure: XSLT, JDBC. Legend: system module, external module, input, output, user interaction.]

The process begins when the ontology developer(s) interact(s) with the ontology editor to design an ontology (1). The ontology editor is essentially an ORM diagram drawing tool in which populations can be specified by attaching textual annotations to the entity and fact types (2). The output of the ontology editing step is an ontology schema, specified using a serialised syntax of ORM (e.g., based on XML), and a relational database. There are well-known techniques to map ORM diagrams into relational tables [7].

(1) Alternatively, the ontology could be obtained through a translation from another ontology language.
(2) The populations can be stored in external files or databases and linked to the diagram through a reference.
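As a rough illustration of how a populated schema such as the one in Figure 1 might be stored relationally, the sketch below uses Python's standard `sqlite3` module in place of the JDBC-accessed database of the actual system. The table and column names are assumptions made for this example, and the mapping is deliberately simplified (the CarMake fact type is omitted); it is not the mapping procedure of [7].

```python
import sqlite3


def build_population_db() -> sqlite3.Connection:
    """Create an in-memory database holding part of the Figure 1 population."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE make (name TEXT PRIMARY KEY);
        -- MakeModel is 1:N, so each model row references exactly one make.
        CREATE TABLE model (
            name TEXT PRIMARY KEY,
            make TEXT NOT NULL REFERENCES make(name)
        );
        CREATE TABLE car (
            id    TEXT PRIMARY KEY,
            model TEXT NOT NULL REFERENCES model(name),   -- mandatory "has model"
            year  INTEGER NOT NULL CHECK (year >= 1950),  -- value constraint
            UNIQUE (model, year)                          -- external uniqueness
        );
    """)
    conn.executemany("INSERT INTO make VALUES (?)", [("Honda",), ("Ford",)])
    conn.executemany(
        "INSERT INTO model VALUES (?, ?)",
        [("Civic", "Honda"), ("Legend", "Honda"),
         ("Falcon", "Ford"), ("Mustang", "Ford")],
    )
    return conn


def models_of_make(conn: sqlite3.Connection, make: str) -> list[str]:
    """Models made by the given make, as needed for a correlated menu."""
    rows = conn.execute(
        "SELECT name FROM model WHERE make = ? ORDER BY name", (make,)
    )
    return [name for (name,) in rows]
```

Declaring the value constraint and the external uniqueness constraint in the table definition means the database itself rejects, for example, a car with a year before 1950.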


At this point of the process, a developer of a Web page can import the ontology schema into the Ontology Marker in order to specify the structure of a form by specifying presentation markup information. The output of this marking up process is an XML schema containing the marked up portion of the ontology’s schema, and an XML document containing the presentation markup information. This markup information describes how the elements of the schema are mapped into elements of the form, how the elements of the form are constrained and inter-related, and how the contents of the form elements (e.g., the options of the pull-down menus) are extracted from the ontology population.

The presentation markup information is then given as input to the Forms Generator module, which uses it together with the schema population contained in the database to produce an XHTML form element with embedded Javascript code. Javascript is used to code the variable parts of the pages, such as the correlated pull-down menus. In the example discussed in Section 3, the pull-down menus for the make and the model will be correlated since there exists a relationship between these two items. As a result, each time the user selects a given make of car, the pull-down menu corresponding to the model is updated so that it displays only the models of that particular make.

The form element generated in the previous step can be included within the body of an XHTML page. This XHTML page is then served to an end user who fills in the form. The data entered by the user is then submitted to the Forms Processing Servlet, which formats these data as an XML document. By construction, this XML document will conform to (i.e., will be an instance of) the XML schema generated by the Ontology Marker. This XML document can then be used by any application program.

A developer can mark up the same ontology in different ways. The resulting marked up ontologies are then processed separately by the Forms Generator, leading to different XHTML form elements. As an example, we consider the case of an ontology capturing a set of product categories, where each category is modelled as a type within a type hierarchy: the type Item appears at the top of the hierarchy, and the most specific product types such as Car, Mobile Phone, or PDA appear as leaves. Each type inherits the roles of its super-type(s) while having its own roles (e.g., Mobile Phone will inherit the role “has price” from Item while having its own roles such as “has model” and “has network”). Given this ontology, a forms developer can generate a separate Web form for each of the leaf categories (e.g., one for Mobile Phone, one for PDA, etc.), so that an end user looking to designate a mobile phone will get a different form than another user looking to designate a PDA.

At present, we have a working prototype that implements the Forms Generator and the Forms Processing Servlet. The presentation markup information and the schema population are assumed to be given. The Forms Generator uses XSLT to process the presentation markup information and JDBC to access the database containing the population.
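The step in which submitted form data is turned into an XML document, as described above for the Forms Processing Servlet, can be sketched as follows. This is a Python illustration using the standard `xml.etree.ElementTree` module rather than the servlet itself, and the element names (`Car`, `Make`, `Model`, `Year`) are assumptions, since the paper does not list the generated XML schema. Nested dictionaries stand in for data gathered through sub-forms.

```python
import xml.etree.ElementTree as ET


def to_element(name: str, value) -> ET.Element:
    """Build an element; a dict value (sub-form data) becomes child elements."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for child_name, child_value in value.items():
            element.append(to_element(child_name, child_value))
    else:
        element.text = str(value)
    return element


def form_data_to_xml(root_name: str, data: dict) -> str:
    """Serialise submitted form data as an XML document string."""
    return ET.tostring(to_element(root_name, data), encoding="unicode")


# Hypothetical submission for the car form of Section 3.
submission = {"Make": "Honda", "Model": "Legend", "Year": "1986"}
document = form_data_to_xml("Car", submission)
# document == "<Car><Make>Honda</Make><Model>Legend</Model><Year>1986</Year></Car>"
```

In a complete system the resulting document would additionally be validated against the XML schema produced by the Ontology Marker; that validation is not shown here.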

5 Related work

The use of Knowledge Representation (KR) languages to describe ontologies has recently gained considerable momentum, as evidenced by the ongoing activity of the W3C Web Ontology Working Group [14]. KR languages such as DAML+OIL [4] are indeed powerful enough to capture structural constraints (classes, properties), formal is-a relationships, value restrictions, and general logical constraints, all within a unified framework. General logical constraints can be exploited by intelligent agents in order to reason about the elements of the ontology.

While acknowledging their advantages, we believe that in the near future, the use of KR languages to describe ontologies at the implementation level is not likely to lead to mature and scalable solutions. Indeed, KR languages rely on inference engines that are either in a prototype stage, or do not yet scale well to large amounts of facts and rules. This is true even for inference engines supporting tractable KR languages such as those based on Description Logics [6], especially when the size of the ontology is such that secondary memory processing techniques are required. Hence, alternative approaches to ontology modelling and representation need to be explored in order to enable the first stages of the Semantic Web. In these early stages, ontologies will not necessarily be used to conduct complex reasoning, but rather as a means to enable meaningful communication between humans and heterogeneous application programs.

The need to explore means for representing ontologies at the implementation level has been indirectly brought forward in [8]. The authors present a method for mapping the “schema” part of an ontology expressed in OIL [2] into an XML schema. Each instance of the ontology can then be represented as an XML document conforming to this XML schema. This differs from our approach, where the set of instances of the ontology is stored in a relational database. Our proposal also differs from the above in that it addresses the issue of marking up ontologies with information that allows their processing for a specific purpose (the generation of Web forms).

A methodology for modelling Web pages using UML is proposed in [3]. Specifically, the author describes a collection of UML stereotypes and icons for modelling Web forms. These include a stereotype to declare that a class models a form, a stereotype to declare that a given method will process the submission of a form, and other presentation stereotypes such as , , etc. This approach differs from ours in that the generation of forms only takes into account the schema, but not a given population. So, for example, the forms generated using the above approach will not contain information about the instances of the schema types, such as the models and makes of cars in the example of Figure 1.

The Schema Adjunct Framework (SAF) [11] is an XML-based language used to associate domain-specific data with XML schemas (or DTDs) and their instances. In a nutshell, a schema adjunct is a set of data items associated with the elements of an XML schema that describe a particular kind of processing. Schema adjuncts can be used, for example, to describe a mapping of an XML schema into a relational database, or a transformation method from an XML schema into a Web form. In particular, the presentation markups generated by the Ontology Marker in our architecture (see Figure 2) could very well be represented as a schema adjunct. The SAF API could then be used to process these markups.

XForms [13] is an XML-based framework for specifying Web forms in a way that separates the purpose from the presentation. The framework includes a language for specifying device-independent forms, a language for representing the data gathered from a form, and a protocol for submitting these data. With respect to our proposal, XForms can be seen as an attractive alternative to XHTML-based forms as the target language of the form generation.

6 Conclusion

The main contributions of this paper are:

• An approach to model ontologies as populated conceptual schemas.
• An implementation of this approach using XML tools and relational databases.
• An approach to mark up ontologies for the purpose of generating Web forms that encode the constraints captured by the ontology.

Regarding this last item, a direction for further work is to explore other applications of ontology markup, including the generation of structured and comparative search interfaces.

Another direction for future work is to study the synergies between ontology modelling and representation approaches based on populated conceptual models, and approaches based on KR languages. In particular, the development of mappings from ORM and UML diagrams to KR languages such as DAML+OIL, and vice-versa, could lead to a seamless approach to ontology-based systems development, whereby populated conceptual models are used for simple tasks, whereas full-fledged ontology descriptions with general logical rules are used for tasks requiring inference capabilities.

Finally, the possibility of using RDF(S) [12] and its associated technologies as an implementation platform, in lieu of the XML tools and relational databases that we currently use, is also worth consideration. The use of RDF(S) would probably simplify the current implementation by providing a higher-level model for handling the complex relationships captured by a marked up ontology.

Acknowledgment

This work was supported by the Australian Research Council SPIRT Grant “Self-describing transactions operating in a large, open, heterogeneous, distributed environment” involving QUT, UNSW and GBST Holdings Pty Ltd.

References

[1] T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific American, May 2001.
[2] J. Broekstra, M. Klein, S. Decker, D. Fensel, F. van Harmelen, and I. Horrocks. Enabling knowledge representation on the Web by extending RDF schema. In Proceedings of the 10th International World Wide Web Conference (WWW10), Hong Kong, China, May 2001.
[3] J. Conallen. Modeling Web applications architecture with UML. Communications of the ACM, 42(10), October 1999.
[4] D. Connolly, F. van Harmelen, I. Horrocks, D.L. McGuinness, P.F. Patel-Schneider, and L.A. Stein. DAML+OIL (March 2001) Reference Description. Note, W3C Consortium, December 2001. Accessed from http://www.w3.org/TR/daml+oil-reference on 11 March 2002.
[5] D. Fensel. Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce. Springer Verlag, 2001.
[6] J. Gonzalez-Castillo, D. Trastour, and C. Bartolini. Description logics for matchmaking services. In Workshop on Applications of Description Logics, Vienna, Austria, September 2001. Accessed from http://www-lti.informatik.rwth-aachen.de/ki01dlws.html on 8 March 2002.
[7] T. Halpin. Information Modeling and Relational Databases. Morgan Kaufmann, 2001.
[8] M. Klein, D. Fensel, F. van Harmelen, and I. Horrocks. The relation between ontologies and XML schemas. Electronic Transactions on Artificial Intelligence (ETAI), 6, 2002. To appear.
[9] D. McGuinness. Ontologies come of age. In D. Fensel, J. Hendler, H. Lieberman, and W. Wahlster, editors, The Semantic Web: Why, What, How. MIT Press, 2002.
[10] J. Rumbaugh, I. Jacobson, and G. Booch. The Unified Modeling Language Reference Manual. Addison-Wesley, 1999.
[11] S. Vorthmann, J. Robie, and L. Buck. The Schema Adjunct Framework. Note, W3C Consortium, July 2001. Accessed from http://www.w3.org/TR/daml+oil-reference on 12 March 2002.
[12] W3C Consortium. Resource Description Framework (RDF), 1997-2002. http://www.w3.org/RDF/.
[13] W3C Consortium. XForms - The next generation of Web forms, 2000-2002. http://www.w3.org/MarkUp/Forms.
[14] W3C Consortium. Web ontology working group, 2001. http://www.w3.org/2001/sw/WebOnt.
