A coordination model for the Semantic Web Robert Tolksdorf
Elena Paslaru Bontas
Lyndon J. B. Nixon
Freie Universitat ¨ Berlin, Institut fur ¨ Informatik, Netzbasierte Informationssysteme Takustr. 9 14195 Berlin, Germany
Freie Universitat ¨ Berlin, Institut fur ¨ Informatik, Netzbasierte Informationssysteme Takustr. 9 14195 Berlin, Germany
Freie Universitat ¨ Berlin, Institut fur ¨ Informatik, Netzbasierte Informationssysteme Takustr. 9 14195 Berlin, Germany
[email protected]
[email protected]
[email protected] ABSTRACT
The Semantic Web foresees a Web of machine-processable knowledge interacted with by clients in an operationalized manner. At Web scale, the coordination between clients will be vital for ensuring the success of their interactions. In this paper we consider how a coordination model for the Semantic Web would look, as a precursor to the design and implementation of Semantic Web Spaces, a middleware platform for real-world Semantic Web applications.
Categories and Subject Descriptors C.2 [Computer Systems Organization]: Computer Communication Networks; C.2.4 [Distributed systems]: [Distributed applications]; D.3 [Software]: Programming Languages; D.3.2 [Language Classifications]: [Linda]
General Terms Semantic Web Space
Keywords Semantic Web, middleware, Linda, tuplespaces
1.
INTRODUCTION
The vision of the Semantic Web is of distributed knowledge interacted with by clients with reasoning capabilities in a similar manner to how (human) clients today interact with Web pages through a Web browser. However, due to the machine-understandable representation of the Web data, the Semantic Web enables a novel level of automatization of these activities. It is clear in that in such a scenario coordination will play an important role, as clients seek to interact with multiple knowledge sources (which need to handle concurrent access for knowledge updating, access and removal) in real time in order to fulfill their goals.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC’06 April 23-27, 2006, Dijon, France Copyright 2006 ACM 1-59593-108-2/06/0004 ...$5.00.
The current Semantic Web research is still focused on desktop applications or on distributed ones following the client/server or the P2P model. However, the advantages of Linda [7] and the tuplespace model for other cases of open distributed systems have already been discussed in the literature [18]. We consider a coordination model for the Semantic Web as an important contribution to the emerging Semantic Web infrastructure. This paper outlines a coordination model that is being realized in our Semantic Web Spaces,1 a middleware platform for real world Semantic Web applications [25, 24].2 By applying a tuplespace-based approach to the concurrent interaction of multiple clients with distributed knowledge repositories, we foresee the benefits of a simple, yet powerful coordination model in which parallel and distributed Semantic Web processes can be uncoupled in space and time. We consider the necessary conceptual changes to the classical Linda/tuplespace paradigm to support its integration with technologies of the Semantic Web. As a result, we can give a clearer overview of how Semantic Web Spaces could be implemented and function. The structure of this paper is thus: Section 2 introduces the Linda coordination model and the Semantic Web Spaces, while Section 3 outlines the Linda-based conceptual model that we use which takes into account the particularities of the Semantic Web. Related and future work are considered in Section 4.
2.
SEMANTIC WEB SPACES
The coordination language Linda [8] has its origins in parallel computing. It consists of coordination operations (the coordination primitives) and a shared data space (the tuplespace) which contains data in form of (tuples). Tuples can be inserted into the space (“out”), read from the space (“read”) or removed from the space (“in”)3 . The tuplespace provides for persistent publication of data decoupled in space and time from the processes which are interacting in the space. As the operations are blocking (i.e. they wait until a matching tuple is found) a simple coordination model is formed. The original Linda idea has been extended 1 Further information at http://nbi.inf.fu-berlin.de/ research/semanticwebspaces 2 This work is partially supported by the EU Network of Excellence KnowledgeWeb (FP6-507482). 3 Another primitive “eval” has become less used in Linda implementations due to issues about its operational semantics.
in various ways to be applied to open distributed systems, multi-user and workflow coordination, XML middleware and self-organization [21, 4, 22, 5]. The central components of a tuplespace-based system use the Linda coordination model and the (potentially distributed) tuplespace as a shared data space for tuples. In Semantic Web Spaces we extend the core architecture with a reasoning component for interpreting ontologies according to their formal semantics (and drawing inferences, checking satisfiability etc.) as this is out of the scope of the Linda paradigm. Accordingly, the tuplespace is extended to support building a semantic view upon the tuples (i.e. construction of a RDF graph model from RDF data stored in the tuplespace and linking RDF statements with the ontologies they reference [10]). Additionally, we introduce a component intended to fulfill different administrative services as are determined as requisite in a Semantic Web middleware (e.g. issues of security and trust), and a set of metadata for the tuplespace itself. The metadata is aligned to an ontology we define for describing a tuplespace and the tuples that it contains. This ontology provides concepts for expressing security and trust policies, and hence allows for an ontology-based approach to organizing and initializing these extension modules. Furthermore, it explicitly describes the structure of the space (e.g. whether sub-spaces are allowed) and the supported matching templates. Finally, as the system is foreseen as a middleware platform, it should be independent of the underlying implementations of the different computer systems it interacts with. This necessitates interfaces to isolate the heterogeneity of both the clients communicating with the system and the physical storage solutions from the system kernel.
3.
THE CONCEPTUAL MODEL
In order to realize a coordination middleware for the emerging Semantic Web we identified, following the categorization in [18], 4 categories of extensions to the Linda coordination model:4 1) A tuplespace model which enables the partitioning of the space into semantically related subspaces and provides a formal description of the tuplespace is introduced. Using contexts and ontologies to represent the structure of the tuplespace allows a more flexible and efficient management of the tuplespace content and of the interaction between tuplespace and information providers and consumers (extendability, automatic inferencing etc.). 2) New types of tuples are required in order to represent standard Semantic Web languages such as RDF(S) [10] or OWL [17] within a tuplespace. 3) New matchings extending the standard Linda matching approach are needed to manage the newly defined tuple types. Furthermore, semantic matching techniques taking into account ontological knowledge can be used to enrich the retrieval capabilities of the tuplespace. 4) New coordination primitives and a revision of core Linda operations are required in order to ensure the transit from classical data-centered tuplespaces to the new semanticsaware Semantic Web Spaces.
3.1
Tuplespace model
4 A comprehensive description of the conceptual model can be found in [23].
Template
Access Policy
appliesToAgent
Agent
hasPredefinedTemplate definesPolicy
Trust Policy
TupleSpace hasSubSpace
hasContext
hasTuple hasContext
Context
Tuple rdf:resource
XMLDocument
hasSubTuple is-a
hasField
…
Primitive Datatype
is-a is-a
Type
Field hasType
Figure 1: Excerpt of the Semantic Web Spaces Ontology
The structure of Semantic Web Spaces is represented explicitly by means of an ontology (Figure 1). The ontology describes the typical components of the tuplespace, such as sub-spaces, supported tuple types and matching templates, and coordinates the clients’ access to the tuplespace content. The ontology provides an explicit means to model access rights, trust policies and active contexts of clients (here named agents) of the space.5 It can easily be extended with new types of tuplespaces, rules describing new primitives and further metadata about spaces and allows the usage of automatic reasoning in the management of tuples and accessing agents. The ontological model, syntactically (a set of) RDF(S) or OWL documents, can be seen as a meta-level referenced by concrete spaces and tuples published by different parties, which are represented as typed RDF statements. Representing the entire tuplespace by an ontological knowledge base offers a means to reference tuples and tuplespaces which can be identified and addressed using URIs – by definition assigned to named Semantic Web resources. In [23], some concrete examples of using the tuplespace ontology are given.
3.1.1
Contexts
To support the semantic partitioning of Semantic Web Spaces, we introduce the concept of “contexts”. Both clients and tuples are associated to a set of contexts, while clients can only see the tuples which exist in a same context. The concept is based on scopes [14], which allow for a tuplespace to be split into (overlapping) partitions. This provides a means to control client access to statements in the space and to split client operations into subsets of relevant statements. In order to support contexts in Semantic Web Spaces we include two operations, allowing clients to construct a context and to acquire a context from a matched tuple, respectively. Further operations (e.g. if a client upon joining a space is allocated a certain context set) are handled at system level.
3.2
New types of tuples
Following the Linda paradigm, Semantic Web Spaces should be able to represent Semantic Web information through tuples. The expressivity of the information representation should be aligned to the expressivity of common Semantic 5 The format of access rights and trust policies is currently undefined.
Web languages, while respecting their semantics, so that tuples could be mapped to and from external Semantic Web resources. As Semantic Web languages, we currently focus on RDF(S) [10] and OWL [17]. In the tuplespace model, we add a new dedicated type of tuple known as a RDFTuple. RDF statements can be represented in a four field RDFTuple of form . Following the RDF data model, each tuple field contains an URI (or, in the case of the object, a literal). These URIs identify Semantic Web resources. Each field is also typed accordingly, URIs by RDFS/OWL Classes and literals by datatypes. Finally, we additionally choose to uniquely identify each tuple by means of a fourth ID field6 . In this way tuples sharing the same subject, predicate and object can be addressed separately, which is consistent with the Linda model. For example, a RDF tuple (using QNames instead of full URIs) could look like this: In respecting the RDF data model three special cases must also be taken into consideration: blank nodes, containers and reified statements: 1). Blank nodes Blank nodes are anonymous resources in a RDF graph, which have no externally addressable label, but are assigned internal identifiers in order to allow other local statements to reference and distinguish them. In order to provide the same semantics in Semantic Web Spaces, we define a field type ts:BlankNode which identifies the use of a blank node in a tuple. The system itself will administer internal IDs for these blank nodes, ensuring that they remain unique within the tuplespace or within the scope of the tuple. For a client placing a tuple with a blank node into the tuplespace, a local identifier is allocated to the field when the tuple is added to the tuplespace. To add multiple tuples which refer to the same blank node, it is necessary to claim all the tuples together in the form of a ‘subspace’ (see Section 3.4). 2). Collections and containers A collection is a closed set of objects. Containers are incomplete sets built according to the open world assumption. The RDF semantics does not impose any special conditions on the defined RDF vocabulary, which permits some flexibility in the tuple representation. We consider the use of array constructs to model collections and containers in tuples. The client can retrieve the List as an array object and handle it locally using host programming language constructs. 3). Reification To reify a statement means to be able to refer to it in another statement. In RDF, this is achieved by creating an instance of rdf:Statement and giving values to its properties rdf:subject, rdf:property and rdf:object. The URI of the rdf:Statement instance is then used as subject or object in another statement. In Semantic Web Spaces we permit standard RDF syntax for expressing a reified triple. We refer to a rdf:Statement instance in a tuple, and give this instance values for the properties rdf:subject, rdf:property and rdf:object. As an alternative approach, a client could form a RDF tuple and add it to a tuple field typed as rdf:Statement. This allows a tuple containing a 6 Since the tuplespace is described by means of an ontology, every RDFTuple becomes an instance of the ontology class ts:RDFTuple – where ts is the namespace of the tuplespace ontology – and is therefore automatically identified by the URI assigned to this instance.
reified statement to be placed in the space in a single operation.
3.3
New matchings
Matchings in Linda are based on checking the equality of number of fields, equality of field constants and binding of field variables. Whether a matching occurs is therefore dependant upon matching rules for different types. In Semantic Web Spaces, dedicated matching templates can be defined and associated to a tuplespace. For example, RDF can be handled both as a special datatype with a particular syntax (following the abstract syntax model) and as a knowledge representation form with pre-defined meaning (following the RDF model-theoretic semantics). We consider a simple template language for constructing queries on Semantic Web Spaces, which can be easily transformed to RDF query languages. For example (we use here a abbreviated syntax), a RDF query like SELECT ?x WHERE (book, title, ?x) is equally expressible in the template (book, title, ?x) using URIs for book and title and a variable of the desired type for x. We now turn to a description of the different types of matching supported in Semantic Web Spaces.
3.3.1
Matching RDF abstract syntax
RDF syntactic matching considers only tuples identified in the space as RDFTuples. Every RDF statement contains typed URI values (the object of a statement can also be a literal, i.e. a XML Schema datatype) and an ID in the form of a system-asigned URI. RDF resources are matched on the basis of URI reference equivalence (as defined in the RDF abstract syntax, section 6.4) and RDF type equivalence (using URI string equivalence of the URIs used to identify the corresponding classes). In accordance to the RDF model, matching on RDF statements considers only the first three fields of the RDFTuple. Blank nodes match on variables of type ts:BlankNode or wildcards (which match all values). Matches on reified triples, containers or collections only occur if every variable type in the template matches the corresponding RDFS/OWL class. In the case of constants, collections and containers are treated as array datatypes in which the unordered rdf:Alt or rdf:Bag matches any other rdf:Alt or rdf:Bag containing exactly the same set of objects while the ordered List and Seq matches with the same set of objects in the same order. Syntactic matching provides a retrieval mechanism which takes into account RDF types and constructs such as blank nodes, containers, collections and reification, while avoiding the added complexity of subsumption reasoning.
3.3.2
Matching RDF Semantics
In the RDF model-theoretic semantics, RDF URIs are no longer considered as URI references, but resources which, at a knowledge representation level, identify some concept which has some agreed-upon meaning.7 Hence, matching of RDF tuples at this level shall support the interpretation of tuple content according to the known meaning of the concepts in the tuple (as defined by a RDF Schema or OWL ontology). This semantic matching may 7 http://www.w3.org/TR/rdf-mt/ discusses RDF semantics.
choose to support interpretations at different levels of expressibility, in order to provide a compromise with computational complexity (the more expressive the interpretation we attempt to match, the more computationally complex it will be). As an example of a core matching algorithm for the information view of Semantic Web Spaces we mention subsumption relationships between RDF/OWL classes, a relationship which is covered by every Description Logics reasoner. In this case any T 1 in a template is considered to match with any T 2 in a tuple, given that T 2 subsumes T 1. Again special attention is paid to certain RDF constructs. Blank nodes now also match on variables typed as RDF resources (ts:BlankNode is subclass of rdf:Resource in the tuplespace ontology) and RDF statements match on variables typed as RDF tuples (as we define rdf:Statement as a subclass of ts:RDFTuple in the tuplespace ontology). This type of matching can be further extended to support OWLbased inferences (such as satisfiability).
3.4
New coordination primitives
In Semantic Web Spaces we make a fundamental distinction between a data view and an information view upon stored tuples. In the data view all tuples, despite their RDF-based typing, are seen as plain data, without semantics, like in traditional Linda systems. In the information view we see the set of RDF tuples in the tuplespace as a RDF graph, which imposes consistency and satisfiability constraints w.r.t. the RDF semantics and to its RDF schema. Hence, we make two distinctions which both need a new set of Linda primitives, as each distinction determines a new semantic for the traditional Linda operations of out, rd and in (Figure 2).
%&'
()
inr: (s,p,o,id) -> tuple.
These operations are still considered to occur in the data view, i.e. a RDF tuple conforming to RDF syntax is accepted, no check is made against ontological information. They serve to allow well formed RDF statements to be placed in the tuple space, and be retrieved destructively or non-destructively. However, for the information view, we want that the RDF statements are also satisfiable according to a constraining schema and retrievable based on ontological information. To assert tuples in the information view, we propose a further set of primitives: claim: (s,p,o,id) -> boolean. claim: (Subspace) -> boolean.
The insertion primitive at the information level only permits the insertion of tuples whose content satisfies the available ontological information. A claim contains either a single tuple or a “subspace”, which is defined at the client and contains one or more tuples, thus allowing multiple tuples to be claimed in one operation. Additionally it provides the means to make claims which contain blank nodes. Within a subspace, a blank node with the same identifier are considered as being the same when tuples are added into the space. endorse: (s,p,o,id) -> Subspace.
Reading a tuple from the information view requires that it is “endorsed”, i.e. that it has been found to be satisfiable according to the current ontological information. The match is returned as a subspace. This subspace may contain a single matching tuple, however in the case of blank nodes the “linking” tuples are included in the response, e.g. if the matching tuple has a blank node as its subject, then the set of tuples with the blank node as their object are included. excerpt: (s,p,o) -> Context.
! " ! "
*+
#"!
*+
#"!
%&'
*+ &
$
Figure 2: Different views on different spaces The first distinction is between non-RDF tuples and RDF tuples. A simple data tuple could also contain three fields with URIs as such tuples are unconstrained in terms of field number or field content. RDF tuples however have a special structure and components, which have to be handled according to the usage of RDF in the Semantic Web. So besides the traditional data tuple operations (out, rd, in) we redefine the primitives for RDF tuples, adding the constraint that the tuple or template that is used must conform to a RDF tuple or template: outr: (s,p,o,id) -> boolean. rdr: (s,p,o,id) -> tuple.
We also support a multiple read operation (i.e. get all matching tuples for a template) similar to the copy-collect primitive [19]. This primitive works within contexts – a context is created by the system into which all matching tuples are copied. A reference to this context is passed to the client who is given alone the right of access and can then read the tuples out of that context. Importantly, endorse and excerpt as information view operations, can match tuples which are not in the data view but exist implicitly in the information view (i.e. as inferrable tuples based on the available ontological information and a suitable reasoner). retract: (s,p,o,id) -> Subspace.
Destructively reading from the information view is not the same as expressing negation (which is not a part of the RDF/OWL semantics). Rather, we are removing a statement from the space without denying its truthfulness. As a result, the retraction operation does not remove a matched RDFTuple completely but replaces the RDF statement it contains with empty values. The tuple content remains in the data view but inferred tuples in the information view based on the retracted statement will be removed.
Synchronization between data and information views must also be maintained. As has been noted, operations on one view can have consequences in the other view. Fundamentally, both views are the same tuplespace, with or without the RDF graph and its inferred content. Furthermore, the RDF Schema/OWL ontologies which underlie the information views’ interpretation of the tuplespace content may change, or the model of the space built by the tuplespace ontology. Hence, the space not only coordinates the external processes’ interactions but also internally the different conceptual representations that it contains.
4.
RELATED WORK
We consider this work to be the first comprehensive and formal specification of a Semantic Web-enabled coordination model. The idea of combining Linda and Semantic Web information has also been proposed elsewhere [6], yet subsequent formalizations of a semantics-enabled coordination model [2, 3] have not addressed issues covered in this paper such as the particular representation of RDF syntax, different levels of matching or tuplespace partitioning. It is also unclear to what extent they continue to respect the basic principles of Linda, while our approach is clearly “backwards compatible”. One prototype system, sTuples, has been developed, in which the JavaSpaces platform was extended to support OWL types in tuple fields [12]. However, this approach has also not further considered the implications of coordinating Semantic Web information with Linda, as we have done. Many emerging areas of computing such as agent-based computing [26, 11, 9, 1] and Web Services [13, 20, 1, 15, 16], which identify value in their extension with rich semantics, are fields in which the issue of coordination is already extremely relevant. Adding simple yet powerful coordination capabilities to the Semantic Web infrastructure can also be of benefit in these efforts.
5.
CONCLUSION
We believe that Linda and tuplespaces could be used as a basis of a middleware platform for the Semantic Web, allowing us to benefit from the paradigms powerful coordination capabilities, asynchronous messaging and uncoupling of processes from space and time. We illustrated that certain extensions to the tuple structure, coordination model, matching rules and tuplespace model need to be made in order to perform this integration and outlined what we foresee these changes to be. We will realize this conceptual model in a prototype Semantic Web Space in order to validate the proposed coordination model and bring the possibility of a coordination middleware for the Semantic Web a step closer to reality.
6.
REFERENCES
[1] P. Buhler and J. M. Vidal. Semantic Web Services as Agent Behaviors. In Agentcities: Challenges in Open Agent Environments, 25–31. Springer-Verlag, 2003. [2] C. Bussler. A minimal triple space computing architecture. In Proc. of the WIW’05 Workshop on WSMO Implementations, 2005. [3] C. Bussler, E. Kilgarriff, R. Krummenacher, F. Martin-Recuerda, I. Toma, and B. Sapkota. D21. v0.1 WSMX Triple-Space Computing, June 2005.
[4] P. Ciancarini, D. Rossi, F. Vitali, A. Knoche, and R. Tolksdorf. Coordination Technology for the WWW. In Proce. of the 5th IEEE Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WET ICE’96), pages 321–326. IEEE Computer Society, Press, 1996. [5] P. Ciancarini, R. Tolksdorf, and F. Zambonelli. Coordination Middleware for XML-centric Applications. In Proc. of ACM SAC’02, 335–343, 2002. [6] D. Fensel. Triple-Space Computing: Semantic Web Services Based on Persistent Publication of Information. In INTELLCOMM, 43–53, 2004. [7] D. Gelernter. Generative Communication in Linda. ACM Trans. Prog. Lang. Syst., 7(1):80–112, 1985. [8] D. Gelernter and N. Carriero. Coordination languages and their significance. Commun. ACM, 35(2):97–107, 1992. [9] N. Gibbins, S. Harris, and N. Shadbolt. Agent-based Semantic Web Services. In Proc. of the 12th International Conference on World Wide Web WWW’03, 2003. [10] P. Hayes and B. McBride. RDF Semantics. http://www.w3.org/TR/rdf-mt/, 2004. [11] J. Hendler. Agents and the Semantic Web. IEEE Intelligent Systems, 16(2), 2001. [12] D. Khushraj, O. Lassila, and T. W. Finin. sTuples: Semantic Tuple Spaces. In MobiQuitous, 268–277, 2004. [13] R. Lara, H. Lausen, S. Arroyo, J. de Bruijn, and D. Fensel. Semantic Web Services: description requirements and current technologies. In Proc. of the ICEC’03, 2003. [14] I. Merrick and A. Wood. Coordination with scopes. In Proc. of ACM SAC’00, 210–217. ACM Press, 2000. [15] E. Motta, J. Domingue, L. Cabral, and M. Gaspari. IRS-II: A Framework and Infrastructure for Semantic Web Services. http://www.cs.unibo.it/ gaspari/www/iswc03.pdf, 2003. [16] OWL Services Coalition. OWL-S: Semantic Markup for Web Services. http://www.daml.org/services/owl-s/1.0/owl-s.pdf, November 2003. [17] P. F. Patel-Schneider, P. Hayes, and I. Horrocks. OWL Web Ontology Language Semantics and Abstract Syntax. http://www.w3.org/TR/owl-absyn/, 2004. [18] D. Rossi, G. Cabri, and E. Denti. Tuple-based technologies for coordination. In A. Omicini, F. Zambonelli, M. Klusch, and R. Tolksdorf, editors, Coordination of Internet Agents: Models, Technologies, and Applications, chapter 4, 83–109. Springer Verlag, 2001. ISBN 3540416137. [19] A. I. T. Rowstron and A. M. Wood. Solving the Linda multiple rd problem using the copy-collect primitive. Sci. Comput. Program., 31(2-3):335–358, 1998. [20] K. Sivashanmugam, K. Verma, A. Sheth, and J. Miller. Adding Semantics to Web Services Standards. In Proc. of the Int. Conf. on Web Services ICWS’03, 2003. [21] R. Tolksdorf and P. Ciancarini. Integrating Internet Services with a PageSpace. In Proc. of the ACM SIGCOMM’95 Workshop on Middleware, 1995. [22] R. Tolksdorf, F. Liebsch, and D. M. Nguyen. XMLSpaces.NET: An Extensible Tuplespace as XML Middleware. In Proc. of .NET Technologies’04, 2004. [23] R. Tolksdorf, L. Nixon, and E. Paslaru Bontas. A Conceptual Model for Semantic Web Spaces. Technical Report TR-B-05-14, Free University of Berlin, September 2005. [24] R. Tolksdorf, L. Nixon, E. Paslaru Bontas, D. M. Nguyen, and F. Liebsch. Enabling real world Semantic Web applications through a coordination middleware. In Proc. of the 2nd European Semantic Web Conf. ESWC’05, 2005. [25] R. Tolksdorf, E. Paslaru Bontas, and L. Nixon. Towards a tuplespace-based middleware for the Semantic Web. In Proc. of the IEEE Web Intelligence Conf. WI’05, 2005. [26] Y. Zou, T. Finin, L. Ding, H. Chen, and R. Pan. Using Semantic Web technology in Multi-Agent systems: a Case Study in the TAGA Trading Agent Environment. Proc. of the 5th Int. Conf. on Electronic Commerce, 2003.