A Publish/Subscribe Scheme for Peer-to-Peer Database Networks

Jian Yang¹, Mike P. Papazoglou¹, and Bernd J. Krämer²

¹ Tilburg University, INFOLAB, P.O. Box 90153, 5000 LE Tilburg, The Netherlands
{jian,mikep}@uvt.nl
² FernUniversität Hagen, D-58084 Hagen, Germany
[email protected]

Abstract. Peer-oriented computing is a natural way of meeting the data sharing requirements of decentralized, highly dynamic, scalable applications. In this paper we present a framework for data sharing in a peer-to-peer database network. We first introduce a publish/subscribe model in which peer groups are formed by matching peer interests (subscriptions) against publications published by relevant peers in the network. We then show that queries can be processed on the basis of peer collaboration without the need for a global schema.

Keywords: Peer-to-peer databases, query processing, publish/subscribe models, query transformation.

1 Introduction

Modern data-intensive applications, such as web-based information systems, digital libraries, electronic catalogues, and content-based management, require information sharing among different data sources that are heterogeneous by nature and highly dynamic in particular. Heterogeneity implies that the data sources may be based on different data modelling formalisms, may organize the same kind of data in different formats, and may represent different aspects of the same data elements. In most cases, these data sources either have some data in common or are complementary to each other, regardless of their heterogeneous nature. To cope with these challenges, cooperating sites need to share information.

R. Meersman et al. (Eds.): CoopIS/DOA/ODBASE 2003, LNCS 2888, pp. 244–262, 2003. © Springer-Verlag Berlin Heidelberg 2003

The typical solution to this problem is data or schema integration techniques [2, 3]. These techniques distinguish between a mediator schema and a set of local source schemas and specify schema mappings between the local and the mediated schemas. All queries are posed against a mediator schema, which plays the role of a global schema, and are finally processed locally based on schema mappings. Data integration techniques can be classified according to the way the local data source schemas are related to the global schema. The global-as-view (GAV) approach defines a global schema as a view over local schemas. Another approach, known as local-as-view (LAV), defines the local sources as views over the global schema [1]. The trade-offs between GAV and LAV are in terms of query
processing and scalability. With the GAV approach, translating a global query into sub-queries on the local schemas is straightforward. In the case of LAV, the global query needs to be reformulated in terms of local schema elements, which is the well-known hard problem of “rewriting queries using views” [2,4]. Modification of local schemas in the GAV approach involves redesigning the global schema, whereas in the LAV approach a local change only involves adding, deleting, and updating the local view definitions. As a consequence, the LAV approach scales better.

The ever-increasing need for data sharing between geographically dispersed data sources over the Internet has resulted in loosely coupled, open applications. These require highly dynamic solutions in which the arrangement of data requesters and providers is constantly changing. Schema integration is no solution for such dynamic applications, as it does not scale well and requires constant redefinition and updating of the global schema and related schema mappings. In contrast to LAV or GAV solutions, the dynamic nature of information sharing in an open, distributed environment demands adaptive and reconfigurable architectures that are capable of supporting ad hoc collaboration among related data sources and that can also update themselves when data sources join and leave the network.

The initial results from peer-to-peer file systems [10,11,12] are very encouraging and inspiring in that regard. They demonstrate that ad hoc sharing of files is achievable in a system with high scalability and resiliency. Peer-to-peer applications rely on equal rights of all network participants to share resources among them. No host has a special administrative or facilitating role; all hosts provide the same application logic and perform both client and server roles. The work presented herein is based on the peer-to-peer communication principle between data sources in a database network.
This implies the absence of a global schema or global knowledge. Instead, all peers (data sources) in the network handle queries based on their own schema and may engage acquainted peers (in their group) to process the query further, if necessary. Peer groups are formed strictly on the basis of publish/subscribe techniques, and algorithms are provided for subscribe/publish matching and peer group formation. In particular, we demonstrate how the system scales with respect to peers joining and leaving the network.

The remainder of this paper is organized as follows. Section 2 frames the problem and outlines our proposed solution. In Section 3 we present a publish/subscribe model for a P2P database network and define how a peer group is formed. In Section 4 we describe how XML-based queries are distributed among peers and how they are processed. Section 5 discusses related work, while Section 6 presents our summary.

2 Problem Framing and Solution

In this section we introduce an architecture for sharing data among P2P databases. In particular, we explain how peer information is organized and discovered.

2.1 A Motivating Example

This example relates to activities performed in the travel industry. A travel plan could involve air ticket reservation, hotel booking, car rental, and leisure activities. Different peers in a common network compete with each other to provide better quality services to their customers in terms of cheaper prices, better package deals, or comprehensive information. At the same time they also need to share information, collaborate, and form partnerships in order to gain maximum competitive advantages.

BudgetCarRental
  Car (regNo, type, model, carID, branchNo)
  Customer (customerID, name, address, contact, creditCardNo)
  Branch (branchNo, location, phNo)
  Booking (customerID, carID, duration, pickUpBranch, returnBranch)
  CustomerStay (customerID, hotelName, address)

FlexiCarRental
  Car (regNo, type, model, mileage, carID, status, branchNo)
  Customer (customerID, name, address, contact, creditCardNo)
  Branch (branchNo, location, phNo)
  Booking (bookingId, customerID, carID, duration, pickUpBranch, returnBranch)

HappyTravelAgent
  Customer (name, address, email, contactNo, description)
  IssuedTicket (airline, departTime, departPlace, arriPlace, customerName, price)
  Airline (airlineName, airlineCode, contact, location)
  Hotel (hotelName, contact, location, priceInfo)
  CarRental (companyName, contact, location)
  PackageInfo (dealNo, airline, hotel, carRent, time, price, condition)
  Reservation (ResNo, customer, flightNo, hotel, carRental, package)

LeisureHotelBooking
  Hotel (hotelName, location, service, price)
  Customer (customerID, name, homeAddress, contact, creditCardNo)
  Room (hotelName, roomNo, type, status)
  reservation (customerId, hotelName, roomNo, dates, price)

AirlineTravel
  FlightInfo (flightNo, route, from, to, price)
  customer (customerID, bookingNo, name, address, contact, creditCardNo)
  frequentFlyer (no, name, mileage, contactDetails)
  issuedTicket (customerID, customerName, departTime, departPlace, arriPlace, customerName, price)

Fig. 1. Example schemas in the travel domain.

We assume that there are five data sources (peers) available in a database network (see Figure 1):
– BudgetCarRental is a relational database that stores information about customers (current and previous), cars for rental purposes, their branches, and booking information.
– FlexiCarRental is a relational database that stores similar information but contains only records of its current customers.
– LeisureHotelBooking is a relational database that stores information about hotels, available rooms, customers, and reservations.
– AirlineTravel is a relational database that stores information about flights, customers, frequent flyers, and tickets.
– HappyTravelAgent is a relational database that stores information about customers, airlines, car rentals, hotels, tickets, reservations, and package deals. The latter may include specific airlines, car rental companies, and hotels under certain conditions.

A travel agency needs to work with airlines, car rentals, hotels, and other travel agents to be able to provide attractive package deals for its customers. On
the other hand, a car rental company may wish to work with hotel chains and travel agencies to attract more customers. Consequently, there is a strong desire for these sources to share and exchange information. In addition, this network is highly dynamic, as existing data sources may decide to leave the network or need to get updated, and new data sources may join in, with each peer intent on its own agenda.

Bearing all this in mind, we require a system that exhibits the following characteristics:
– all data sources are autonomous: they have total control over their data and decide which of their data other peers in the network can share;
– there is no need for a global schema. When a query is posed against a peer, it is processed on the basis of the schema of this peer and, at the same time, the query together with relevant schema information is forwarded to related peers for further processing. This is conducted on the basis of the minimal peer information that this specific node has accumulated;
– information about other peers gets updated whenever new pertinent peers join the network, or when existing peers update their shareable data or leave the network.

To support the aforementioned requirements, we employ a publish/subscribe mechanism as the backbone of the peer-to-peer database network. An important requirement of this kind of P2P network is that every time a new peer database joins the network, it has to publish its shareable contents, i.e., its relations and attributes. All peer databases in the network can subscribe to the data contents that are relevant or interesting to them. For example, FlexiCarRental may choose to subscribe to Car and Branch data, which is relevant to it, and Hotel data, which, although not necessarily relevant, could prove to be interesting for it.

2.2 A P2P Architecture for Data Sharing and Exchange

The P2P database network combines aspects of the directory services P2P model exemplified by Napster [10] and the “pure” P2P architecture exemplified by Gnutella and Freenet [11,12]. It follows a federated approach where a relatively small number of event servers provide directory services to peer groups. Peers register high-level information about themselves with an event server, such as their name, address, and the names of the data elements they are willing to share with other peers. However, they do not use the event server to locate or communicate with each other. Instead, peers form cohesive groups that provide a common set of information, e.g., hotel accommodation, recreation activities, car rental, etc., depending on their requirements and interests. Each peer builds up a (constantly changing) peer group of other peers and stores some minimal information about them. Whenever a peer receives a data request it attempts to forward it to appropriate peers within its peer group for execution. Thus, queries get propagated from peer to peer within a group and responses follow the same path back. The primary advantages of this approach are scalability and lack of logical centralization.


When joining the P2P database network, a peer should first register itself by publishing the (public) data items that it is willing to share with other peers. Secondly, it may subscribe to data items that it is interested in obtaining from other peers. All peers can subscribe to the data items they are interested in and publish those data items they wish to share with other peers. A peer group is formed on the basis of matching a peer's subscription needs against relevant publications that other peers advertise. Matching between subscription and publication data is the responsibility of a server module (see Figure 2). Whenever a match is detected, the server forwards the publication data and the addresses of relevant peers to all the peers that have subscribed to the published data. As a consequence, each peer holds a set of addresses of the peers that have published information that this peer has subscribed to.

Fig. 2. High-level view of the P2P database network architecture.

The key components of the P2P database network depicted in Figure 2 are as follows:
– Event Server: this component is similar to that of a hybrid P2P architecture where indexing is centralized and file exchange is distributed [16]. The event server manages a select set of meta-operations for peers, such as joining/leaving the network, publishing data offerings, and data subscriptions. The event server also performs subscription/publication matching, so that relevant notifications can be sent to interested subscribers (peers). When the server receives a new publish/subscribe event from a peer that wishes to join the network, it performs two kinds of matching operations.
The first matching operation matches the new peer's publication against all subscriptions that are relevant to it. In this way a notification (regarding the newly published peer) is sent to all peers in the network that have subscribed to data offered by this new peer. The second matching operation matches all previous publications against the data subscriptions of this new peer. In this way the new peer is informed about existing peers whose publications match its subscription needs and can thus establish its own peer group. Subscribe/publish mechanisms and peer group formation are discussed in Section 3.
– Peers: Query processing is performed within peers. For each peer in the network, its peer group is established implicitly by matching its subscription needs against all relevant publications made by other peers in the network. Whenever a match is detected, the event server sends the new peer, say A, a notification containing the addresses of the matching peers, say peers B and C. Subsequently, peers B and C are included in the group of peer A, and the relevant published data contents of peers B and C are stored in peer A's peer information base (called peer-info in Figure 2). Peers B and C are called acquaintances of peer A. Queries in the P2P database network are handled by the peers where the queries originate. We refer to these peers as query-hosting peers. If a query-hosting peer detects that a query involves data from other peers in its peer group, the peer first decides which of its acquaintances are the most appropriate for processing segments of the query. This is determined on the basis of peer-acquaintance relevance to the query by analysing this peer's information base. Subsequently, the query-hosting peer forwards the query, together with its (partial) schema and a desired result format in XML, to the relevant peers in its group (the acquaintances that may answer parts of the query). Peer acquaintances reformulate the part of the query they receive according to their own schema and process it. Partial results are then sent back to the query-hosting peer for validation and aggregation. Query processing in this P2P network is discussed in Section 4, where we use the running example of Section 2.1 to exemplify how schematic heterogeneity can be resolved and how results from diverse peers are merged.
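The two matching operations performed when a peer joins can be sketched as follows. This is only an illustrative sketch under simplifying assumptions, not the system's actual implementation: the class EventServer and its methods join and peer_group are hypothetical names, publications and subscriptions are reduced to plain relation names, and the sample publications and subscriptions are invented for the example (only FlexiCarRental's subscription to Car, Branch, and Hotel is taken from Section 2.1).

```python
from collections import defaultdict


class EventServer:
    """Toy event server (hypothetical): matches publications and subscriptions."""

    def __init__(self):
        self.publications = {}   # peer name -> set of published relation names
        self.subscriptions = {}  # peer name -> set of subscribed relation names
        # peer name -> list of (publisher, matching relation names)
        self.notifications = defaultdict(list)

    def join(self, peer, published, subscribed):
        published, subscribed = set(published), set(subscribed)
        # First matching operation: the new publication is matched against
        # the subscriptions of all peers already in the network.
        for other, wanted in self.subscriptions.items():
            common = wanted & published
            if common:
                self.notifications[other].append((peer, common))
        # Second matching operation: all previous publications are matched
        # against the subscription of the new peer.
        for other, offered in self.publications.items():
            common = subscribed & offered
            if common:
                self.notifications[peer].append((other, common))
        self.publications[peer] = published
        self.subscriptions[peer] = subscribed

    def peer_group(self, peer):
        """Acquaintances of a peer: every publisher it has been notified about."""
        return {publisher for publisher, _ in self.notifications[peer]}


server = EventServer()
server.join("BudgetCarRental", {"Car", "Booking", "Customer"}, {"Hotel"})
server.join("HappyTravelAgent", {"Hotel", "Airline", "Customer"}, {"Car"})
server.join("FlexiCarRental", {"Car", "Booking"}, {"Car", "Branch", "Hotel"})
print(server.peer_group("FlexiCarRental"))
```

In this sketch peers only learn about each other through notifications; they never query the server to reach an acquaintance, which mirrors the division of labour described above.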

3 Publish/Subscribe in the P2P Database Network

Publish/subscribe is a communication mechanism that enables the loose coupling of peers in peer-to-peer data exchange networks. The participants of such networks learn about data publications and subscriptions via asynchronous notifications. A peer publishes a set of relations¹, each having a set of attributes² characterizing the contents published. For a given application domain we assume that the sets of all possible relation and attribute names, $\mathcal{R}$ and $\mathcal{A}$ respectively, are well defined. All publications are maintained by an event notification server known to all potential peers $\mathcal{P}$ (see Figure 2).

¹ The term relation is used in a very loose way. It can be a relational table in a database or an element in XML terms.
² The term attribute is also used in a loose way. It can be an attribute in a database relation, or a simple element or attribute in XML terms.

In this paper we follow the practices of e-marketplaces, i.e., all the data sources that join an e-marketplace, such as those of the travel, chemical, or semiconductor industries, are required to use a standard vocabulary and the same naming conventions when they subscribe and publish information. Accordingly, we are not dealing with terminology fluctuations and semantic mismatches. In addition, we assume that all the peers in the database network employ XML views over the data structures that a peer wishes to make publicly available. For example, HappyTravelAgent may decide to publish only some of its data elements, such as Customer, Airline, Reservation, and Hotel, with attributes. In a similar fashion, a subscription can specify a set of contents (e.g., car, branch) with or without attributes.

3.1 A Model for Publication and Subscription Matching

In this section we introduce a formal model for publication and subscription matching for a given application domain, where the sets $\mathcal{R}$, $\mathcal{A}$, and $\mathcal{P}$ contain all permissible relation, attribute, and peer names, respectively.

Publication Contexts. A publication is a set of pairs $(r, A)$ with $r \in \mathcal{R}$ and $A \subseteq \mathcal{A}$. More concretely, we can express a publication as:

$$\{(r_1, \{a_{r_1,1}, \ldots, a_{r_1,k_1}\}), \ldots, (r_l, \{a_{r_l,1}, \ldots, a_{r_l,k_l}\})\} \qquad (1)$$

The set of all publications known in a specific domain at a particular time is called its publication context. Publication contexts and the matching of a subscription against a given publication context can be modelled mathematically in terms of formal concept analysis [18], which relies on the theory of ordered sets and complete lattices. A publication context can be represented by a matrix that relates a set of peer names to the relation names and accompanying attribute names that peers in the P2P network have published. The peer names are represented by rows in the matrix, while the relations and their associated attributes are represented by columns. A cross in row $p$ and column $r$ indicates that peer $p$ has published relation $r$. Table 1 illustrates a publication context for a subset of the relations and attributes used in the example in Figure 1.

Formally, a publication context $C := (P, R, I)$ consists of a set $P \subseteq \mathcal{P}$ of peer names, a publication $R \subseteq \{(r, A) \mid r \in \mathcal{R} \wedge A \subseteq \mathcal{A}\}$, and an incidence relation $I \subseteq P \times R$. The fact that a peer $p$ has published a certain relation $(r, A)$ is written as $(p, (r, A)) \in I$. The set of relation names in $R$ is defined by:

$$\mathrm{rel}(R) = \{r \mid (r, A) \in R\} \qquad (2)$$

and the set of attributes published for some relation name $r \in \mathcal{R}$ is defined by:

$$\mathrm{att}(r, R) = \{a \mid (r, A) \in R \wedge a \in A\} \qquad (3)$$


Table 1. Example of a publication context for the travel domain depicted in Fig. 1

Columns: (1) Airline(), (2) Booking(pickUpBranch), (3) Booking(bookingNo,pickUpBranch), (4) Car(type), (5) Car(mileage,type), (6) Customer(address), (7) Customer(address,description), (8) Customer(address,creditCardNo), (9) Hotel(contact,hotelName,location), (10) Hotel(hotelName,service), (11) Reservation(), (12) Room()

                     1  2  3  4  5  6  7  8  9  10 11 12
BudgetCarRental      .  ×  .  ×  .  ×  .  ×  .  .  .  .
FlexiCarRental       .  ×  ×  ×  ×  ×  .  ×  .  .  .  .
HappyTravelAgent     ×  .  .  .  .  ×  ×  .  ×  .  ×  .
LeisureHotelBooking  .  .  .  .  .  ×  .  ×  .  ×  ×  ×
Airline              .  .  .  .  .  ×  .  ×  .  .  .  .

The following expression $Q'$ computes the set of commonly reachable relations for a set of peers $Q \subseteq P$:

$$Q' := \{t \in R \mid (p, t) \in I \text{ for all } p \in Q\} \qquad (4)$$

Similarly, we define $T'$ as the set of peers that published all relations in a set of relations $T \subseteq R$:

$$T' := \{p \in P \mid (p, t) \in I \text{ for all } t \in T\} \qquad (5)$$

When we apply definitions 2, 3, 4 and 5 to the publication context $C$ in Table 1, we obtain:

$$\mathrm{rel}(C) = \{\text{Airline}, \text{Booking}, \text{Car}, \text{Customer}, \text{Hotel}, \text{Reservation}, \text{Room}\}$$

$$\mathrm{att}(\text{Customer}, C) = \{\text{address}, \text{description}, \text{creditCardNo}\}$$

$$\{\text{Airline}\}' = \{\text{Customer(address)}, \text{Customer(address, creditCardNo)}\}$$

and

$$\{\text{Car(type)}\}' = \{\text{BudgetCarRental}, \text{FlexiCarRental}\}$$

A simple matrix such as the one illustrated in Table 1 can easily represent the elementary publication context represented by our running example and can be used to compute the results shown in the previous examples. However, for extended publication contexts this procedure is impractical and may lead to errors.
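The definitions above can be tried out directly on the small publication context of Table 1. The following is a minimal sketch, not the authors' implementation: the helper names rel, all_relations, common_relations, common_peers, rel_names, and att are hypothetical, and the incidence data is an assumption reconstructed from the worked results and the lattice description in the text.

```python
def rel(name, *attrs):
    """A published relation as a pair (relation name, frozenset of attributes)."""
    return (name, frozenset(attrs))


# Incidence relation I, written as peer -> set of published relations
# (assumed reconstruction of Table 1; the peer label "Airline" follows the table).
CONTEXT = {
    "BudgetCarRental": {
        rel("Booking", "pickUpBranch"), rel("Car", "type"),
        rel("Customer", "address"), rel("Customer", "address", "creditCardNo")},
    "FlexiCarRental": {
        rel("Booking", "pickUpBranch"), rel("Booking", "bookingNo", "pickUpBranch"),
        rel("Car", "type"), rel("Car", "mileage", "type"),
        rel("Customer", "address"), rel("Customer", "address", "creditCardNo")},
    "HappyTravelAgent": {
        rel("Airline"), rel("Customer", "address"),
        rel("Customer", "address", "description"),
        rel("Hotel", "contact", "hotelName", "location"), rel("Reservation")},
    "LeisureHotelBooking": {
        rel("Customer", "address"), rel("Customer", "address", "creditCardNo"),
        rel("Hotel", "hotelName", "service"), rel("Reservation"), rel("Room")},
    "Airline": {
        rel("Customer", "address"), rel("Customer", "address", "creditCardNo")},
}


def all_relations(context):
    """The publication R: every relation published by at least one peer."""
    return set().union(*context.values())


def common_relations(context, peers):
    """Q' of definition (4): relations published by every peer in `peers`."""
    result = all_relations(context)
    for peer in peers:
        result &= context[peer]
    return result


def common_peers(context, relations):
    """T' of definition (5): peers that published every relation in `relations`."""
    return {p for p, published in context.items() if set(relations) <= published}


def rel_names(relations):
    """rel(R) of definition (2)."""
    return {name for name, _ in relations}


def att(name, relations):
    """att(r, R) of definition (3)."""
    return set().union(*(attrs for n, attrs in relations if n == name))


print(common_peers(CONTEXT, {rel("Car", "type")}))
```

Storing each row of the matrix as a set keeps both derivation operators down to a set intersection and a subset test, which is adequate for a context of this size.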


A more elegant way to visualise publication contexts and perform context-based computations is to construct a concept lattice C(P, R, I) from a given publication context (P, R, I), using an efficient algorithm that relies on the notion of formal concepts and the formation of sub-context spaces [19].

[Figure 3: lattice diagram. Node labels:
 Customer{address}
 Customer{address,creditCardNo}; peer AirlineTravel
 Reservation{ }
 Airline{ }, Customer{address,description}, Hotel{contact,hotelName,location}; peer HappyTravelAgent
 Booking{pickUpBranch}, Car{type}; peer BudgetCarRental
 Hotel{hotelName,service}, Room{ }; peer LeisureHotelBooking
 Booking{BookingNo,pickUpBranch}, Car{type,mileage}; peer FlexiCarRental]

Fig. 3. Concept lattice derived from Table 1

The concept lattice of the sample publication context in Table 1 is shown in Figure 3. There are two types of nodes in this lattice: concept (publication) nodes and peer nodes. This concept lattice contains eight concepts (including the bottom node representing the universal concept) and five peers. All concept nodes reachable upwards from a peer node in the lattice form the complete concept set that this peer has published. For instance, the peer BudgetCarRental has published: {Booking(pickUpBranch), Car(type), Customer(address,creditCardNo), Customer(address)}. Conversely, all the peer nodes reachable downwards from a concept node form the set of peers that have published a common concept. For instance, the concept node Reservation() is supported by the peer nodes LeisureHotelBooking and HappyTravelAgent.
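To make the notion of a concept concrete, the sketch below enumerates all formal concepts of the Table 1 context by brute force: for every subset of peers it computes the common relations and then the peers sharing all of them. This naive enumeration stands in for the efficient algorithm of [19], which is not reproduced here; the function names intent, extent, and concepts are hypothetical, relations are written as plain strings, and the context data is an assumed reconstruction of Table 1.

```python
from itertools import combinations

# Assumed reconstruction of Table 1: peer -> published relations (as labels).
CONTEXT = {
    "BudgetCarRental": {"Booking(pickUpBranch)", "Car(type)", "Customer(address)",
                        "Customer(address,creditCardNo)"},
    "FlexiCarRental": {"Booking(pickUpBranch)", "Booking(bookingNo,pickUpBranch)",
                       "Car(type)", "Car(mileage,type)", "Customer(address)",
                       "Customer(address,creditCardNo)"},
    "HappyTravelAgent": {"Airline()", "Customer(address)",
                         "Customer(address,description)",
                         "Hotel(contact,hotelName,location)", "Reservation()"},
    "LeisureHotelBooking": {"Customer(address)", "Customer(address,creditCardNo)",
                            "Hotel(hotelName,service)", "Reservation()", "Room()"},
    "Airline": {"Customer(address)", "Customer(address,creditCardNo)"},
}


def intent(peers):
    """Relations published by all given peers (all relations for no peers)."""
    published = [CONTEXT[p] for p in peers]
    if not published:
        return frozenset(set().union(*CONTEXT.values()))
    return frozenset(set.intersection(*published))


def extent(relations):
    """Peers that have published every one of the given relations."""
    return frozenset(p for p, pub in CONTEXT.items() if relations <= pub)


def concepts():
    """All (extent, intent) pairs, found by closing every subset of peers."""
    found = set()
    peers = sorted(CONTEXT)
    for size in range(len(peers) + 1):
        for group in combinations(peers, size):
            relations = intent(group)
            found.add((extent(relations), relations))
    return found


print(len(concepts()))
```

Under the assumed table, the enumeration yields eight concepts, in line with the count given in the text. The exponential loop over peer subsets is only acceptable because the example has five peers.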


Subscriptions. A subscription $S$ is specified, just like a publication, as a set of pairs $(s, A)$ with $s \in \mathcal{R}$ and $A \subseteq \mathcal{A}$. We say that a peer $p$ matches a subscription $S$ if it has published at least the relations listed in the subscription, i.e., $S \subseteq \{p\}'$. All peers of a publication context $(P, R, I)$ that match a subscription form the acquaintances of this subscription:

$$[S] := \{p \in P \mid S \subseteq \{p\}'\} \qquad (6)$$

Table 2 shows simple examples of subscriptions and their resulting acquaintances.

Table 2. Examples of subscriptions and their acquaintances

example        subscription                      acquaintances
Subscription1  { Customer(address) }             P
Subscription2  { Customer(address), Car(type) }  { BudgetCarRental, FlexiCarRental }
Subscription3  ∅                                 P
Subscription4  { Car() }                         ∅
Subscription5  { Car(mileage) }                  ∅
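The entries of Table 2 follow mechanically from definition (6). The sketch below recomputes them; the function name acquaintances is hypothetical, relations are written as plain string labels, and the context data is an assumed reconstruction of Table 1.

```python
# Assumed reconstruction of Table 1: peer -> published relations (as labels).
CONTEXT = {
    "BudgetCarRental": {"Booking(pickUpBranch)", "Car(type)", "Customer(address)",
                        "Customer(address,creditCardNo)"},
    "FlexiCarRental": {"Booking(pickUpBranch)", "Booking(bookingNo,pickUpBranch)",
                       "Car(type)", "Car(mileage,type)", "Customer(address)",
                       "Customer(address,creditCardNo)"},
    "HappyTravelAgent": {"Airline()", "Customer(address)",
                         "Customer(address,description)",
                         "Hotel(contact,hotelName,location)", "Reservation()"},
    "LeisureHotelBooking": {"Customer(address)", "Customer(address,creditCardNo)",
                            "Hotel(hotelName,service)", "Reservation()", "Room()"},
    "Airline": {"Customer(address)", "Customer(address,creditCardNo)"},
}
ALL_PEERS = set(CONTEXT)


def acquaintances(subscription):
    """[S] of definition (6): peers whose publications include all of S."""
    return {p for p, published in CONTEXT.items() if set(subscription) <= published}


for s in ({"Customer(address)"}, {"Customer(address)", "Car(type)"}, set(),
          {"Car()"}, {"Car(mileage)"}):
    print(sorted(s), "->", sorted(acquaintances(s)))
```

Because the test is an exact subset check on whole columns, Car() and Car(mileage) find no peer: neither appears as a column of Table 1 in exactly that form.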

These examples show that Subscription 2 extends Subscription 1 by requesting an additional relation to be matched. We observe that Subscription 2 is a superset of Subscription 1, while the acquaintances of Subscription 2 form a subset of the acquaintances of Subscription 1. An empty subscription is matched by all peers, while Subscriptions 4 and 5 yield an empty set of acquaintances because the requested combination of attributes and relation names is not supported.

Intuitively, we want the subscriptions {Car()} and {Car(mileage)} to be inter-related within the publication context in Table 1, since there are two peers, BudgetCarRental and FlexiCarRental, that have published a common part of this relation with some differing attributes. It is thus useful to view the relation Car() as a subtype of the relations Car(type) and Car(mileage,type). In general, we want a subscription element $(s, A)$ to be matched by a peer that has published a relation $(r, B)$ if $s = r$ and $A \subseteq B$. To achieve this, we extend a given publication context $(P, R, I)$ to the context $(P, \bar{R}, \bar{I})$, where:

$$\bar{R} = \{(r, B) \mid r \in \mathrm{rel}(R) \wedge B \subseteq \mathrm{att}(r, R)\} \qquad (7)$$

and

$$\bar{I} = I \cup \{(p, (r, B)) \mid (p, (r, A)) \in I \text{ for all } A \subseteq \wp(B)\} \qquad (8)$$

That is, we supplement the relations $(r, A_1), \ldots, (r, A_n)$ published in $R$ by additional relations $(r, B)$, where $B$ ranges over all possible subsets of $A_1 \cup \cdots \cup A_n$, and we introduce a cross in row $p$ and the new column $(r, A_i \cup \cdots \cup A_k)$ if we find a cross in all columns $(p, (r, A_j))$ in the original publication context (for $j = i, i+1, \ldots, k$). Table 3 shows, for illustration purposes, only part of the new matrix resulting from extending Table 1 with definitions 7 and 8. All new columns are marked in a light grey colour. Figure 4 depicts the extended concept lattice corresponding to Table 3. This lattice is stored at the event server and can be used to assist in locating peers related to a query that cannot be completely decomposed by its query-hosting peer; this is explained in detail in Section 4.
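The matching rule stated in prose above (a subscription element (s, A) is matched by a published relation (r, B) if s = r and A ⊆ B) can be evaluated without materialising the extended matrix. The following sketch implements only that rule; it is not a transcription of definitions 7 and 8. The names rel, matches_element, and acquaintances_extended are hypothetical, and the context is an assumed excerpt of Table 1 restricted to Car, Customer, and Hotel relations.

```python
def rel(name, *attrs):
    """A relation as a pair (relation name, frozenset of attributes)."""
    return (name, frozenset(attrs))


# Assumed excerpt of Table 1 (only some peers and relations).
CONTEXT = {
    "BudgetCarRental": [rel("Car", "type"), rel("Customer", "address")],
    "FlexiCarRental": [rel("Car", "type"), rel("Car", "mileage", "type"),
                       rel("Customer", "address")],
    "HappyTravelAgent": [rel("Customer", "address"),
                         rel("Hotel", "contact", "hotelName", "location")],
    "LeisureHotelBooking": [rel("Customer", "address"),
                            rel("Hotel", "hotelName", "service")],
}


def matches_element(published, element):
    """True if some published (r, B) satisfies s = r and A <= B for element (s, A)."""
    name, attrs = element
    return any(r == name and attrs <= b for r, b in published)


def acquaintances_extended(subscription):
    """Peers that match every element of the subscription under the subset rule."""
    return {p for p, published in CONTEXT.items()
            if all(matches_element(published, e) for e in subscription)}


print(sorted(acquaintances_extended({rel("Car")})))
```

With this rule the subscription {Car()} is matched by BudgetCarRental and FlexiCarRental, whereas under exact matching it had no acquaintances.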

Table 3. Example of a publication context for the travel domain depicted in Fig. 1

Columns: Airline(), Booking(), Booking(bookingNo), Booking(pickUpBranch), Booking(bookingNo,pickUpBranch), Car(), Car(mileage), Car(type), Car(mileage,type), Customer(), Customer(address), Customer(creditCardNo)

BudgetCarRental      × × × × × × ×
FlexiCarRental       × × × × × × × × × × ×
HappyTravelAgent     × × × ×
LeisureHotelBooking  × × ×
Airline              × × ×

If we now attempt to match Subscriptions 4 and 5 listed in Table 2 against the extended context $(P, \bar{R}, \bar{I})$, we obtain the acquaintances {BudgetCarRental, FlexiCarRental} and {FlexiCarRental, HappyTravelAgent}, respectively. To exemplify this, consider the extended concept lattice shown in Figure 4 and the subscription Car(). First we need to locate a concept node that matches this subscription. Then the peer node associated with this concept node (i.e., BudgetCarRental) and all peer nodes associated with the nodes reachable from edges descending from this concept node (i.e., FlexiCarRental) form the acquaintances of this subscription.


We henceforth always refer to the extended publication context and its corresponding concept lattice when matching subscriptions with publications. In addition to the above, we also require that a subscription such as S = {Hotel(), Room(type)} (partially) matches a certain publication context if at least one of the elements in S is matched against a publication such as Hotel(contact, hotelName). We refer to this type of matching as a partial match. A partial match of a subscription $S$ over a publication context $C := (P, R, I)$, in terms of the function match, is defined as follows:

$$\mathrm{match}(S, C) = \{p \in P \mid S \cap \{p\}' \neq \emptyset\} \qquad (9)$$
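A partial match differs from the full match only in requiring at least one matched element instead of all of them. The sketch below illustrates definition (9) on the example subscription {Hotel(), Room(type)}. It evaluates matches against the extended context using the subset rule s = r and A ⊆ B; the names rel, matches_element, and match are hypothetical, and the context is an assumed excerpt of Table 1.

```python
def rel(name, *attrs):
    """A relation as a pair (relation name, frozenset of attributes)."""
    return (name, frozenset(attrs))


# Assumed excerpt of Table 1 (only some peers and relations).
CONTEXT = {
    "BudgetCarRental": [rel("Car", "type")],
    "HappyTravelAgent": [rel("Hotel", "contact", "hotelName", "location")],
    "LeisureHotelBooking": [rel("Hotel", "hotelName", "service"), rel("Room")],
}


def matches_element(published, element):
    """True if some published (r, B) satisfies s = r and A <= B for element (s, A)."""
    name, attrs = element
    return any(r == name and attrs <= b for r, b in published)


def match(subscription):
    """Definition (9): peers that match at least one element of the subscription."""
    return {p for p, published in CONTEXT.items()
            if any(matches_element(published, e) for e in subscription)}


print(sorted(match({rel("Hotel"), rel("Room", "type")})))
```

In this excerpt no peer has published Room with a type attribute, so both resulting peers qualify through the Hotel() element alone.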

[Figure 4: lattice diagram. Labels, grouped by position in the diagram:
 Customer{ }, Customer{address}, Hotel{ }, Hotel{hotelName}, Reservation{ }
 Customer{creditCardNo}, Customer{address,creditCardNo}; peer AirlineTravel
 Airline{ }, Customer{description}, Customer{address,description}, Hotel{contact}, Hotel{location}, Hotel{contact,hotelName}, Hotel{contact,location}, Hotel{hotelName,location}, Hotel{contact,hotelName,location}; peer HappyTravelAgent
 Booking{ }, Booking{pickUpBranch}, Car{ }, Car{type}; peer BudgetCarRental
 Hotel{service}, Hotel{hotelName,service}, Room{ }; peer LeisureHotelBooking
 Customer{creditCardNo,description}, Customer{address,creditCardNo,description}, Hotel{location,service}, Hotel{contact,hotelName,service}, Hotel{hotelName,location,service}, Hotel{contact,location,service}, Hotel{contact,hotelName,location,service}
 Booking{BookingNo}, Booking{BookingNo,pickUpBranch}, Car{type}, Car{type,mileage}; peer FlexiCarRental]

Fig. 4. Concept lattice including subtype relationships among published relations.

3.2 Peer Group Formation and Updating

A peer group is formed by applying the match function, defined in Section 3.1, to a peer's subscription against the publication context C. Suppose that the subscription of BudgetCarRental is: S := {Car(type), Booking(pickUpBranch), Hotel(hotelName, location, contact)}


A matched-subscription lattice for BudgetCarRental, shown in Figure 5, is generated by matching its subscription S against the extended publication lattice of the event server, shown in Figure 4. The matched-subscription lattice is a proper subset of the extended publication lattice stored at the event server and indicates the peers and matching concepts that conform to BudgetCarRental's subscription. This information is stored locally at this peer's site so that it can be used in the future to locate relevant peer acquaintances when attempting to process a query.

[Figure 5: lattice diagram. Node labels:
 Booking{pickUpBranch}, Car{type}; peer FlexiCarRental
 Hotel{contact,hotelName,location}; peer HappyTravelAgent]

Fig. 5. A matched-subscription lattice for BudgetCarRental

Whenever a query is posed at a query-hosting peer, its set of subscriptions needs to be evaluated to determine whether it can support the new query. If execution of the new query is not fully supported, a new subscription (for this peer) is generated from this query and sent to the event server. Eventually, a new matched-subscription lattice, which satisfies the new subscription, is generated and sent to the query-hosting peer. When a peer leaves the P2P network, a notification is sent from the event server to all relevant peers, which in turn update their matched-subscription lattices. This happens in accordance with the lattice updating algorithms found in [19].
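The peer-side bookkeeping described above can be pictured with the following sketch. It is a deliberately simplified illustration and not the lattice updating algorithm of [19]: the matched-subscription lattice is flattened to a dictionary from acquaintance to relation names, queries are reduced to the set of relation names they touch, and the class Peer with its methods missing_for, extend_subscription, and on_peer_left is hypothetical.

```python
class Peer:
    """Toy query-hosting peer (hypothetical) with a flattened matched subscription."""

    def __init__(self, name, own_relations, subscription):
        self.name = name
        self.own_relations = set(own_relations)
        self.subscription = set(subscription)
        self.matched = {}  # acquaintance -> relation names it publishes for us

    def missing_for(self, query_relations):
        """Relations a query needs that neither the peer nor its subscription covers."""
        return set(query_relations) - (self.own_relations | self.subscription)

    def extend_subscription(self, query_relations):
        """Add the uncovered relations; the result would be sent to the event server."""
        missing = self.missing_for(query_relations)
        self.subscription |= missing
        return missing

    def on_peer_left(self, peer):
        """React to a leave notification by forgetting the acquaintance."""
        self.matched.pop(peer, None)


budget = Peer("BudgetCarRental",
              {"Car", "Customer", "Branch", "Booking", "CustomerStay"},
              {"Car", "Booking", "Hotel"})
budget.matched = {"FlexiCarRental": {"Booking", "Car"},
                  "HappyTravelAgent": {"Hotel"}}

print(budget.missing_for({"Car", "Booking", "Hotel"}))
print(budget.extend_subscription({"Hotel", "Room"}))
budget.on_peer_left("FlexiCarRental")
print(sorted(budget.matched))
```

The initial contents of matched mirror Figure 5; a query that also needs Room data is not covered, so Room is added to the subscription.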

4 Query Processing

The flexibility and wide use of XML as a data exchange format make it a prime candidate for use as a common data model in data integration applications. Using an XML-based schema at the interface level makes it possible to hide the proprietary data elements that the owners of data do not wish to disclose, and to adhere to a standard interface without having to migrate existing data. All peers in the database network expose their own private XML views of their schema and data content. Figure 6 illustrates such an XML view over the BudgetCarRental schema and data contents. To query the XML peer views we use XQuery, the standard XML query language proposed by the W3C [20].

[Figure 6: sample BudgetCarRental data and the corresponding XML view. Surviving sample values: "BJ100", "small", "MAZDA121", "B1", "CA2", "CU1", "Bredasweg 421, Tilburg"]

Fig. 6. The BudgetCarRental database and its XML view.

Suppose that the following query is issued at the BudgetCarRental peer: “Find the types of all the cars which are picked up at Tilburg and find all the hotels that are close to each car’s pick up location.” Figure 7 shows an XQuery corresponding to the previous narrative and formulated on the basis of XML view presented in Figure 6.

    FOR $car in document("BudgetCarRental.xml")//Car {
      FOR $booking in document("BudgetCarRental.xml")//Booking
      WHERE $car/carID=$booking/carID
        AND $booking/pickUpBranch=*Tilburg
      RETURN

      {
        FOR $hotel in document ($Hotel)//Hotel
        WHERE distance ($hotel/location, $booking/pickUpBranch)
