
A Self-organizing Data Store for Large Scale Distributed Infrastructures∗

Thomas Risse, Predrag Knežević
Fraunhofer IPSI, Integrated Publication and Information Systems Institute
Dolivostrasse 15, 64293 Darmstadt, Germany
{risse|knezevic}@ipsi.fraunhofer.de

Abstract

The introduction of Service Oriented Architectures enables the transition from integrated, centrally controlled systems to federated and dynamically configurable systems. Scalability and reliability are key requirements for such infrastructures. Due to their size and dynamics, only decentralized and self-organizing systems are able to meet these requirements. In this paper we describe our approach to tackling the problem within the BRICKS project, where we use a completely decentralized XML database to store several types of metadata. We present the database architecture and the way it is used in the BRICKS architecture, using service discovery as an example. Furthermore, we give an overview of current research activities and preliminary results.

1. Introduction

The introduction of Service Oriented Architectures, based on either Web Services or Grid Services, and the wider acceptance of peer-to-peer and decentralized architectures enable the transition from integrated, centrally controlled systems to federated and dynamically configurable systems. The benefits for the individual service providers and users are robustness of the system, independence from central authorities, and flexibility in the usage of services. Within the European project BRICKS1 a decentralized and service-oriented architecture will be used to enable integrated access to distributed cultural resources. The project idea is motivated by the fact that the amount of digital information and digitized content is continuously increasing, but that a lot of effort still has to be spent to discover and access this information. The reasons are heterogeneous data formats, restricted access, proprietary access interfaces, etc. Economic constraints also exist, especially for smaller cultural institutions, which simply mean that they do not have the money to make their content accessible.

∗ This work is partly funded by the European Commission under BRICKS (IST 507457).
1 BRICKS - Building Resources for Integrated Cultural Knowledge Services

The goal of the BRICKS project is to develop and maintain an infrastructure which gives cultural users the possibility to share and use cultural knowledge resources (e.g. digital libraries) as well as sophisticated processing tools in an integrated way. Obvious requirements for such an infrastructure are scalability, robustness and reliability, but also low maintenance costs. A centralized architecture rules itself out, as it cannot guarantee the required scalability and reliability while at the same time being very cost effective. Peer-to-peer file sharing systems like Gnutella or Kazaa [14] have already shown in practice that decentralized systems are able to achieve these goals. Hence a similar approach is used in BRICKS.

The major challenge in the implementation of a decentralized system is the missing central coordination unit. Such coordination units are part of all distributed systems; e.g., a transaction manager is used to handle transactions in distributed database systems. In decentralized systems the situation is different: no global view on all participating nodes exists, and hence the necessary global agreements cannot be reached. Furthermore, large scale decentralized systems like BRICKS have to deal with a certain degree of dynamics in system participation, meaning that nodes can join and leave the system whenever they like. These facts make it impossible to define or elect a central coordination unit. Even if many functionalities, e.g.
distributed query processing, can be implemented in a decentralized way, there are always situations where at least some sort of highly available centralized storage is necessary. In BRICKS, the need arises to have service descriptions, administrative information about collections, ontologies and some annotations globally available for all nodes.

To fulfil this requirement of an always available and transparently accessible data store, we implement a self-organizing decentralized XML storage on top of a distributed hash table (DHT) overlay network. In this paper we show how our decentralized XML storage is used in large scale information environments like BRICKS.

The paper is organized as follows: In the next section we give an introduction to the BRICKS project, its requirements and the overall architecture. Afterwards, in Section 3 we focus on the challenges and foreseen solutions for the decentralized XML storage. Some insights into the current implementation are given in Section 4. In Section 5 we demonstrate the usage of our decentralized XML storage to provide an alternative implementation of UDDI service discovery. Section 6 gives an overview of related work. We conclude the paper with a summary of achievements and an outlook on future activities.

2. The BRICKS Project

The aim of the BRICKS project [8] is to design, develop and maintain an open, user- and service-oriented infrastructure to share knowledge and resources in the Cultural Heritage domain. The target audience is very broad and heterogeneous; it involves cultural heritage and educational institutions, the research community, industry, and citizens. Typical usage scenarios are integrated queries among several knowledge resources, e.g. to discover all Italian artefacts from the Renaissance in European museums. Another example is to follow the life cycle of historic documents whose manual copies are distributed all over Europe. These examples are specific applications running on top of the BRICKS infrastructure.

The BRICKS infrastructure uses the Internet as a backbone and has to fulfil the following requirements:

• Expandability, which means the ability to acquire new services, new content, or new users without any interruption of service.
• Scalability, which means the ability to maintain excellence in service quality as the volumes of requests, content and users increase.
• Availability, which means the ability to operate in a reliable way over the longest possible time interval.
• Graduality of Engagement, which means the ability to offer a wide spectrum of solutions to the content and service providers that want to become members of BRICKS.
• Interoperability, which means the ability to make services available to, and exploit services from, other digital libraries.

Figure 1. Decentralized BRICKS Topology

In addition, the user community has the economic requirement that BRICKS be low-cost. This means (1) that an institution should be able to become a BRICKS member with minimal investments, and (2) that the maintenance costs of the infrastructure, and in consequence the running costs of each BRICKS member, are minimized. An interested institution should not have to invest much additional money in its already existing infrastructure to become a member of BRICKS. In the ideal case, the institution only gets the BRICKS software distribution, which will be available for free, installs it, connects to the Internet and becomes a BRICKS member. This already gives the possibility to search for content and access some services. Of course, additional work is necessary to integrate and adapt existing content and services in order to provide them in BRICKS. Also, BRICKS membership will be flexible, such that parties can join or leave the system at any point in time without administrative overhead. To minimize the maintenance cost of the infrastructure, any central organization which maintains e.g. the service directory should be avoided. Instead, the infrastructure should be self-organizing, in such a way that the scalability and availability of the fundamental services, e.g. service discovery or metadata storage, are guaranteed.

2.1. Overall Architecture

In order to fulfil these requirements, the BRICKS architecture will be service-oriented and decentralized. There are several reasons for the decentralization approach. Decentralized architectures do not have central points which could stop or slow down in failure or overload situations. Instead, decentralized architectures offer better load balancing, and a failure will not affect the availability of the system as a whole; only parts of the system might be affected. Furthermore, having central points (servers) limits scalability in the long run. It is always a good idea to design a system so that it is able to handle loads that were not foreseen during the initial design phase. For a standard client-server system the load must be carefully estimated, e.g. by guessing the maximum number of users. In an open system like BRICKS this is not possible: due to the simplicity of joining the system, the number of nodes can increase rapidly, and hence upper limits cannot be given.

With the decentralization approach we can avoid single points of failure for core functionalities, e.g. service discovery, by completely distributing them. Furthermore, the lack of central points removes the need for centralized maintenance. Important infrastructure functionalities are all implemented in a self-organizing way, or use services which are implemented in that way. This is a strong advantage of the decentralized approach, because a centralized administration costs additional money and personnel must be dedicated to the task. BRICKS is going to be a very heterogeneous system without a central body maintaining it. Also, our aim is to build a system whose infrastructure costs are as low as possible, and a decentralized architecture has the necessary properties.

Figure 1 shows a possible topology of the future BRICKS system. Every node represents a member institution where the software for accessing BRICKS is installed. Such nodes are called BNodes. BNodes communicate among each other and use the available resources for content and metadata management. Every BNode directly knows only a subset of the other BNodes in the system. However, if a BNode wants to reach another member that is unknown to it, it forwards the request to some of its known neighbour BNodes, which deliver the request to the final destination or forward it again. This is depicted in Figure 2, which also shows that BRICKS users access the system only through the local BNode available at their institution. Hence every user request is first sent to the institution's BNode and is then routed between other BNodes to the final destination. Search requests behave like this: the BNode preselects a list of BNodes where a search request could be fulfilled, and then routes it there. When the location of the content is known, e.g. as a result of a query, the BNode is contacted directly.

Figure 2. Request routing in BRICKS (BNodes shown: Austrian Library, Studio Azzuro, FhG IPSI and Ufizzi, each with attached workstations)
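The neighbour-based forwarding described above can be pictured as a hop-by-hop search: each BNode knows only some peers and hands an unresolved request onward. The sketch below is an illustrative model, not the BRICKS routing protocol; the neighbour table is a made-up topology using the BNode names from Figure 2.

```python
# Each BNode knows only some neighbours; requests hop until they
# reach the target or run out of unvisited nodes (hypothetical topology).
neighbours = {
    "AustrianLibrary": ["StudioAzzuro"],
    "StudioAzzuro": ["AustrianLibrary", "FhGIPSI"],
    "FhGIPSI": ["StudioAzzuro", "Ufizzi"],
    "Ufizzi": ["FhGIPSI"],
}

def route(start: str, target: str) -> list:
    # Breadth-first forwarding: returns the hop path, or [] if unreachable.
    frontier, seen = [[start]], {start}
    while frontier:
        path = frontier.pop(0)
        if path[-1] == target:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return []

path = route("AustrianLibrary", "Ufizzi")
```

A request from the Austrian Library's BNode to Ufizzi's thus travels via Studio Azzuro and FhG IPSI, even though the two endpoints do not know each other directly.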

3. Decentralized Metadata Management

Standard Digital Library architectures store all types of metadata in a central place, e.g. in a database. Unfortunately, this approach cannot be applied in BRICKS, because it would add central points to the architecture, with all the disadvantages they carry. Therefore, a decentralized system like BRICKS requires decentralized metadata management, which enables the system members to share metadata without the need for a central point.

The decentralized metadata storage will provide an API that makes the metadata location transparent to users. Users of the metadata storage will not know where metadata are stored, or if and how they are replicated or queried. Data access will be transparent to the user, and the provided API will be similar to that of existing centralized metadata management solutions. Since most metadata can be encoded in XML, the decentralized metadata storage will be implemented on top of a decentralized XML storage. The decentralized XML storage will use the storage space available at every BNode of a BRICKS member, balance the load of the individual storage spaces, and take care of data replication and querying.

The decentralized metadata storage should be used for all metadata that have very large volume and/or must have very high availability. Examples are service descriptions, administrative information about collections, ontologies and some annotations, which need to be globally available during the whole life of the system. Other sorts of metadata, like descriptive metadata of content, are not required to be stored there, as they are kept locally under the control of the owner.

Figure 3. Peer-to-peer XML Storage Architecture (layers, top to bottom: Applications; Query Engine and P2P-DOM, with an optional Index Manager; DHT Abstraction Layer; DHT Network Layers)

3.1. Database Architecture

Within BRICKS we design and implement a decentralized/P2P datastore that manages XML documents within a P2P community. The approach differs from current P2P datastores like Gnutella or Kazaa, where each peer makes its local files available for the community to download. In our approach, XML documents are split into finer pieces that are then spread across the community. The documents are created and modified by the community during system run-time, and they can be accessed from any peer in a uniform way, i.e. a peer does not have to know anything about the data allocation.

The proposed datastore mimics a subset of the Document Object Model (DOM) [3] interface, which has been widely adopted among developers; a significant legacy of applications is already built on top of the DOM. Therefore, many applications could be ported easily to the new environment. XML query languages like XPath [22] or XQuery [23] can use the DOM as well, so they could be used for querying the datastore. The finer data granularity will make querying and updating more efficient.

The storage must be able to operate in highly dynamic communities: peers can depart or join at any time, and nobody has a global system overview or can rely on any particular peer. These requirements differ significantly from those in distributed databases, where a node leaves only due to failure and the system overview is globally known.

Figure 3 presents the proposed database architecture. All layers exist on every peer in the system. The datastore is accessed through the P2P-DOM component or by using the query engine. The query engine can be supported by an optional index manager that maintains indices. P2P-DOM is the core system layer. It exports a large portion of the DOM interface to the upper layers and maintains a part of an XML tree in a local store (e.g. files, a database). P2P-DOM serves local requests (adding, updating and removing of DOM-tree nodes) as well as requests coming from other peers through a Distributed Hash Table (DHT) [18, 5, 21] overlay, and tries to keep the decentralized database in a consistent state. In order to make the DHT layer pluggable, it is wrapped in a tiny layer that unifies the APIs of the particular implementations, so the upper layers do not need to be modified.
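The pluggable DHT layer can be pictured as a minimal put/get interface that P2P-DOM codes against, with one thin adapter per concrete overlay. The sketch below is illustrative only; the names are assumptions, not the actual BRICKS API.

```python
from abc import ABC, abstractmethod
from typing import Optional

class DHT(ABC):
    """Thin layer unifying the APIs of concrete DHT implementations,
    so the upper layers never depend on one particular overlay."""

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

class InMemoryDHT(DHT):
    # Stand-in for local testing; a real adapter would instead route
    # the call through an overlay such as Chord, Pastry or P-Grid.
    def __init__(self) -> None:
        self._store: dict = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._store.get(key)

# Upper layers code against the interface only, keeping overlays swappable.
dht: DHT = InMemoryDHT()
dht.put(b"node-set-0A", b"<serialized node set>")
```

Swapping the overlay then means providing another `DHT` subclass, with no change to the layers above.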

3.2. Replication

Since BRICKS members can leave and join the system at any time, it is an important requirement to provide high availability of the stored data. The usual way of achieving data availability is to use redundant data; typically, replica control protocols are used to ensure consistency among all copies of the original data. On the other hand, as a trade-off, replication introduces overhead: it requires more storage space for placing replicas and uses network bandwidth for communication. Selecting the most appropriate protocol for a given environment is then a matter of choosing the protocol with an acceptable trade-off.

Applying a replica control protocol in decentralized architectures requires taking care of some new issues along with those that exist in distributed architectures. In decentralized architectures, peers are completely autonomous; they join and leave the system at any time and may even change their identities/addresses when going online the next time. Therefore, the system topology is very dynamic; there is neither a global system overview nor reliable points that could take over critical operations. Going offline in a distributed architecture is considered a fault and should rarely happen, so replication there is used to increase system performance and decrease response times. Our research has a different focus: we are trying to guarantee data availability in the system, rather than to optimize response times.

As already mentioned, replication, besides achieving higher data availability, introduces higher storage and communication costs. Selecting a suitable protocol for a given decentralized system requires a careful analysis of these costs. Also, the selected protocol must be fully decentralized, i.e. it must not require any sort of centralized synchronization among peers. A first analysis of existing replication protocols has been performed in [13]. The research started from the replica control protocols that exist in distributed data management.

The paper has shown that the existing protocols have some drawbacks when applied in a decentralized/P2P environment; e.g., they are very costly in terms of the resources used for achieving high data availability. Therefore, we have slightly modified the original ROWA-A (Read One Write All Available) protocol: stored data and their replicas are periodically re-created, i.e. in a refreshment round all missing replicas of a particular data object are inserted into the community again. With this approach it is possible to decrease the needed storage space by introducing additional communication costs. However, one has the freedom to balance those costs based on the system parameters and additional requirements (e.g. which cost type is cheaper).

Data objects and their replicas are stored using the DHT layer. Every object and every replica is associated with a unique key; object and replica keys are correlated, i.e. knowing the object key, all replica keys can be generated:

replicaKey(originalKey, i) = hash(originalKey + i)

where i is the ordinal replica number, hash is a good hash function, and + denotes the concatenation of two byte arrays. Every data object is replicated a fixed and globally known number of times R. Therefore, creating an object requires generating the original and R replicas, obtaining the original key from a random number generator and calculating the replica keys by applying the above formula. Reading an object requires that at least one replica is online and available.

The proposed protocol is fully decentralized and requires a much lower number of replicas than ROWA-A for keeping the same high data availability (≥ 99%) in communities with low peer-online probability [20]. Also, the protocol is able to sense its environment and to adapt itself to changes, keeping the requested data availability while minimizing the costs at the same time. Therefore, when a community is fairly stable (i.e. the peer online probability is high), the protocol behaves very similarly to ROWA-A (i.e. the time until the next refreshment round is longer).
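The key-derivation formula and the availability trade-off can be sketched as follows. This is an illustrative sketch, not the BRICKS implementation: SHA-1 stands in for the unspecified "good hash function", and the availability estimate assumes replicas sit on independent peers, each online with probability p.

```python
import hashlib

def replica_key(original_key: bytes, i: int) -> bytes:
    # replicaKey(originalKey, i) = hash(originalKey + i):
    # concatenate the object key with the replica number, then hash.
    return hashlib.sha1(original_key + i.to_bytes(4, "big")).digest()

def read_availability(p: float, r: int) -> float:
    # A read succeeds if at least one of the r replicas is online,
    # assuming independent peers with online probability p.
    return 1.0 - (1.0 - p) ** r

def replicas_needed(p: float, target: float = 0.99) -> int:
    # Smallest r that keeps the requested data availability (here >= 99%).
    r = 1
    while read_availability(p, r) < target:
        r += 1
    return r

# Derive all replica keys for one (randomly keyed) object.
original = hashlib.sha1(b"some-object").digest()
keys = [replica_key(original, i) for i in range(replicas_needed(0.3))]
```

Under this simple model, a community with peer online probability 0.3 needs 13 replicas to reach 99% read availability, while at 0.8 three replicas suffice, which illustrates why the number of replicas must be tuned to the environment.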

3.3. Updates

The data managed by the decentralized storage are not read-only: annotations, service descriptions, or collection descriptions can be changed during system run-time. Therefore, updates must be allowed and must be supported by the replication protocol. In the following we summarize our current preliminary results, which will be presented in more detail in a separate paper, as space here is limited.

The previously described replication protocol has been extended to support updates, allowing concurrent modifications while keeping the data consistent at the same time. Since peers can often be offline, locally stored replicas can be inaccessible. Therefore, an update might not reach all replicas, leaving some of them unmodified. Furthermore, uncoordinated concurrent updates of an object result in unpredictable values of the object's replicas. As a consequence, different replicas of an object may have different values. Thus, the main issues are how to ensure that the correct value is read, how to synchronize offline replicas after they come online again, and how to handle concurrent updates of the same data.

There are two ways of doing concurrency control: pessimistic and optimistic. Our current preliminary results show that the pessimistic approach cannot be taken, because of the properties of decentralized environments (lack of central coordination and of a global system overview, freedom of peers to join or leave the system). Therefore, we handle concurrency control in an optimistic way: every update produces a new version of the data object. Every replica tracks a version number that is increased with every update by the peer making the modification. Peers that store replicas in their local storage refuse an update if they already have the same version. A peer whose update is refused can re-read the data to learn the newest version and try the update again, or cancel the request.

Data versioning changes the way the stored data are accessed. Without support for updates, it was enough to find at least one available data replica in order to read it. With updates enabled and several replicas available, it is important to access the one with the highest version number; even then, there is no guarantee that this is the latest version. The proposed protocol has been analyzed with the same goal as the protocol without update support: finding the costs of maintaining the requested data availability. Our preliminary analysis has shown that the costs are definitely higher than before: starting at 4-5 times higher when the peer online probability is low, and decreasing as the peer online probability p increases (less than 2 times higher when p ≥ 0.8).

4. Implementation

In parallel to our analysis, we have started implementing our approach. In a first step, our major goal is to test the proposed protocols in practice. Later we will integrate our datastore into the BRICKS architecture. The current Java prototype is based on the JXTA peer-to-peer framework [4], implementing a Pastry DHT layer on top of it. The P2P-DOM layer currently supports a subset of the DOM Level 2 specification and has full XPath support. The layer manages documents as tree structures, but the working data granularity is not at the level of individual XML nodes, because this would decrease system performance. Instead, XML

nodes are grouped into sets, which are then managed in the community using the DHT layer. The maximum size of a set (i.e. its number of XML nodes) is limited and equal for all sets. Every node set is associated with a unique ID used by the DHT. Within a set, every XML node has a unique ID. Thus, accessing an XML node requires knowing both the set ID, where the node is stored, and the node ID. Usually, sets contain related XML nodes, e.g. whole subtrees or siblings. Therefore, when a single XML node is accessed, the whole set is sent to the remote side, where it is cached in the hope that the other XML nodes will be needed soon. Finally, in order to get access to a stored XML document, one needs access to its root XML node. To make root references more friendly, when a root node is created, its reference can be obtained by hashing a symbolic name that is later known to users. As an example, consider a small document with text nodes Text1 through Text8.

Figure 4 shows how the example document could be placed in the storage. Nodes with the same fill pattern belong to the same set. Next to the first node of each set, the full reference is written (e.g. 0A/00 means that 0A is the set ID and 00 is the node ID within the set). By using set IDs as DHT keys, the sets end up in the local storage of different peers, and by generating set IDs randomly, we balance the usage of the storage available at every peer.

The querying of documents is also fully decentralized. The peer that starts a query does not retrieve the needed node sets, because this would not be efficient for large documents and would generate high traffic. Instead, the peer sends the relevant parts of the query to peers that might have relevant results. If some nodes match, references to these nodes are sent back to the originator of the query. In order to make querying even more efficient, we are working on support for decentralized indices over the stored XML documents, which should help us locate more quickly the XML nodes that could match posed queries.
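The addressing scheme above can be sketched as follows: node sets live under set IDs in the DHT, a node is addressed as (set ID, node ID), and the root set is located by hashing a symbolic document name. The maximum set size, the two-character IDs and all names are illustrative assumptions, not the prototype's actual values.

```python
import hashlib

MAX_SET_SIZE = 4   # globally fixed limit on XML nodes per set (assumed value)
dht = {}           # stands in for the DHT layer: set ID -> node set

def root_set_id(symbolic_name: str) -> str:
    # Root references are obtained by hashing a user-known symbolic name.
    return hashlib.sha1(symbolic_name.encode()).hexdigest()[:2].upper()

def store_set(set_id: str, nodes: dict) -> None:
    assert len(nodes) <= MAX_SET_SIZE  # set size is limited and equal for all sets
    dht[set_id] = nodes

def fetch_node(set_id: str, node_id: str):
    # Accessing one node transfers the whole set, which the caller can
    # cache, hoping that related nodes (siblings, subtrees) are needed soon.
    node_set = dht[set_id]
    return node_set[node_id], node_set

store_set(root_set_id("example-doc"), {"00": "<root element>", "01": "Text8"})
node, cached_set = fetch_node(root_set_id("example-doc"), "01")
```

Fetching node 01 also delivers node 00 in the same set, mirroring the whole-set transfer and caching described above.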

5. Decentralized Service Discovery

In the previous section we introduced our decentralized XML storage. In the following we briefly present how it can be used for service discovery. Service registration and discovery play a very important role in service-oriented architectures like BRICKS.

Many system components are available only through a web service interface, and we would like to achieve dynamic binding of components at run-time. Service registration and discovery mechanisms therefore help in finding an appropriate component/service that provides the needed functionality. There are already well-established standards for web service registration and discovery. Before a service is registered, it has to be described. WSDL [2] is used for this purpose; it gives all the needed details about service functionality, its location, method names, and parameter types. Universal Description, Discovery and Integration (UDDI) [1] is a repository of service descriptions, which allows querying information about specific Web Services. UDDI provides the possibility to search for Web Services according to certain criteria. Web Service suppliers can register their services with a UDDI registry provider; the relevant information about the specific Web Services is then stored in the respective UDDI database.

The BRICKS service registration and discovery component follows the well-established standards of WSDL and UDDI. Usually, UDDI repositories are centralized, built on top of a database. Since one aim of the BRICKS project is to build a decentralized architecture, we cannot simply take an existing UDDI implementation and deploy it in the system. Hence we keep the standard interface, but build a decentralized UDDI repository on top of our decentralized XML storage. By doing this we avoid the UDDI repository becoming a single point of failure in the system.

Every UDDI entry about a web service contains a link to the WSDL entry that describes how to access the given service. Usually, the link points to the machine where the service is deployed. In environments where web services are always online, such a data distribution is perfectly valid. However, in BRICKS it can happen that some services are not always online, and if we keep their definitions on the same machines, the definitions will be offline as well. As a consequence, during the discovery phase we could find services whose WSDL description is not accessible. One could argue that there is no drawback if one cannot access the WSDL description of an offline service. However, the service could come online again, and if we had its description, we could simply try to access it.

Managing WSDL descriptions in a repository has an additional advantage: we can perform a new sort of query during the discovery phase. For example, we could search for services that implement a certain interface or method, or have specific bindings. Such a search is currently not possible using the UDDI specification. By storing WSDL documents in the decentralized XML storage, such sophisticated discovery operations can be performed.
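The kind of structural query this enables can be sketched with a standard XML toolkit over a hypothetical, heavily abridged WSDL entry: find the port types that offer a given operation, a query plain UDDI cannot express. The service and operation names are invented for illustration.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical, abridged WSDL entry as it might sit in the XML storage.
wsdl = ET.fromstring(
    '<definitions xmlns="http://schemas.xmlsoap.org/wsdl/">'
    '  <portType name="SearchService">'
    '    <operation name="searchArtefacts"/>'
    '    <operation name="getCollection"/>'
    '  </portType>'
    '</definitions>'
)

def port_types_with_operation(doc, op_name):
    # Discovery by interface structure: which port types offer op_name?
    ns = {"w": WSDL_NS}
    return [
        pt.get("name")
        for pt in doc.findall("w:portType", ns)
        if any(op.get("name") == op_name
               for op in pt.findall("w:operation", ns))
    ]

matches = port_types_with_operation(wsdl, "searchArtefacts")
```

In the decentralized storage, the same selection would be posed as an XPath query over all stored WSDL documents instead of over one local tree.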

Figure 4. A possible distribution of an XML document (node sets 19, 0A, FF, 32 and 70; nodes are referenced as setID/nodeID, e.g. 0A/00; the sets contain the text nodes Text1 through Text8 and the attributes @d and @e)

6. Related Work

Decentralized data management can be seen as a less restricted variant of distributed data management: some properties of distributed systems (e.g. operation in stable, well-connected environments such as LANs, a global system overview, and the eventual replacement of every crashed node by a proper one) cannot be assumed in decentralized/P2P systems. The area of distributed systems is well explored, and many books cover its different aspects [6, 15]. The field of distributed databases (DDBS) [15] gives a good overview of existing limitations and solutions that could be applied to decentralized data stores as well. However, distributed databases do not capture node departures and the absence of a known global structure, both consequences of the high dynamics of P2P systems. In particular, this volatility makes it very challenging to achieve high data availability, i.e. data must remain available even after its creator goes offline. Therefore, data must be replicated. It is the job of the replica control protocol to find a good trade-off between several system parameters (e.g. peer availability, community size, and required data availability). Known DDBS replica control protocols and algorithms [15, 16] are good starting points for further research. They offer different levels of data consistency (from strict to lazy to none) and support different node availability requirements.

Some form of replication was already introduced in distributed file systems. The Andrew File System (AFS) and later proposals like Coda and xFS [9] cache frequently accessed files. In order to deal with failures and to provide high availability, xFS applies erasure coding and implements a software RAID storage system by striping file data across many machines; only a portion of the blocks is needed to restore a complete file. Erasure codes for replication are investigated in [7], where it is shown that this replication method gives better file availability than the ROWA-A approach, but only for very large files and high peer online probabilities.

Current popular P2P file-sharing systems (e.g. KaZaA, eDonkey, or Gnutella [14]) have no built-in support for replication: a file is replicated every time it is downloaded by a new peer. The file availability therefore depends on its popularity, which works well for media file exchange. Unfortunately, file-sharing systems do not consider updates at all. If a file is updated, the update is not propagated to the other replicas, and a peer fetching the file has no way to determine which version is the freshest.

Updates in replicated distributed databases are a widely researched field. As already mentioned in Section 3.3, both optimistic and pessimistic approaches for resolving updates exist [15]. However, they all assume a high peer online probability and a global system view; e.g. Kemme and Alonso [12] propose hierarchy-less data distribution, but their approach requires a high peer online probability.

P2P file-storing systems like CFS [10] or PAST [19] have recognized the need for replication. CFS splits every file into a number of blocks that are then replicated a fixed number of times; PAST applies the same protocol without chunking files into blocks. OceanStore [17] supports updates and versions its objects. An update request is first sent to the object's inner ring (primary replicas), which performs a Byzantine agreement protocol to achieve fault tolerance and consistency. When the inner ring commits the update, it multicasts the result down the dissemination tree. To our knowledge, an analysis of its consistency guarantees has not been published so far. Datta et al. [11] address updates in P2P systems, but their aim is to reduce communication costs; data consistency is not addressed.
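The trade-off a replica control protocol must strike between peer availability, replica count, and resulting data availability can be quantified with elementary probability. The following sketch is our own illustration (not code from any of the cited systems) and assumes independent peer failures with a uniform online probability p:

```python
from math import ceil, comb, log

def replication_availability(p: float, n: int) -> float:
    """Availability with n full replicas: at least one replica online."""
    return 1.0 - (1.0 - p) ** n

def erasure_availability(p: float, n: int, m: int) -> float:
    """Availability with an (n, m) erasure code: any m of n fragments suffice."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))

def replicas_needed(p: float, target: float) -> int:
    """Smallest replica count that reaches the target availability."""
    return ceil(log(1.0 - target) / log(1.0 - p))

# With a 30% peer online probability (of the order measured in [20]):
p = 0.3
print(replication_availability(p, 10))  # ~0.972
print(erasure_availability(p, 10, 3))   # ~0.617: any 3 of 10 fragments
print(replicas_needed(p, 0.99))         # 13 replicas for 99% availability
```

The comparison also shows why [7] finds erasure coding attractive only under high peer online probabilities: with unreliable peers, requiring m fragments simultaneously online depresses availability compared to full replication with the same storage budget.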

7. Conclusion and Outlook

In this paper we presented the BRICKS approach to addressing the requirement of globally accessible data with high availability guarantees in a decentralized environment. We introduced a peer-to-peer based XML storage that uses some of the storage capacity of each node in the system and distributes parts of an XML document among the nodes in a self-organizing way. To achieve high data availability we use a modified version of the ROWA-A replication protocol that automatically re-creates missing replicas. Furthermore, we extended it to support concurrent updates while keeping data consistency at a requested degree.

The current implementation supports only static nodes and is not fault tolerant. Hence, the next steps are to implement our replication and update mechanisms. Further analysis of the protocols is still necessary, as is a verification of the theoretical results in practice. Other aspects we have not addressed so far are security and access control; research in these areas is also necessary.
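The replica re-creation idea behind the modified ROWA-A protocol can be sketched as follows. This is a minimal toy model, not the BRICKS implementation; the class and method names are ours, and failure detection, versioning across partitions, and concurrent updates are deliberately omitted:

```python
class Replica:
    """One copy of a data item held by some peer."""
    def __init__(self):
        self.online = True
        self.value = None
        self.version = 0

class RowaaStore:
    """Toy ROWA-A: read any available replica, write all available
    replicas, and re-create replicas lost to departed peers so the
    target replica count is restored on each write."""
    def __init__(self, target: int):
        self.target = target
        self.replicas = [Replica() for _ in range(target)]

    def write(self, value) -> None:
        available = [r for r in self.replicas if r.online]
        if not available:
            raise RuntimeError("no replica reachable")
        version = max(r.version for r in available) + 1
        for r in available:
            r.value, r.version = value, version
        # Re-creation step: place fresh replicas for unreachable peers.
        while len(available) < self.target:
            fresh = Replica()
            fresh.value, fresh.version = value, version
            available.append(fresh)
        self.replicas = available

    def read(self):
        available = [r for r in self.replicas if r.online]
        return max(available, key=lambda r: r.version).value
```

For example, after two of five peers go offline, the next write still succeeds on the three reachable replicas and places two fresh copies, restoring the replication degree without any central coordinator.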

References

[1] Universal Description, Discovery and Integration (UDDI), 2001. http://www.uddi.org/.
[2] Web Services Description Language (WSDL) 1.1, March 2001. http://www.w3.org/TR/wsdl.
[3] Document Object Model, 2002. http://www.w3.org/DOM/.
[4] JXTA Project, 2002. http://www.jxta.org/.
[5] K. Aberer. P-Grid: A self-organizing access structure for P2P information systems. Lecture Notes in Computer Science, 2172, 2001.
[6] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1997.
[7] R. Bhagwan, D. Moore, S. Savage, and G. Voelker. Replication strategies for highly available peer-to-peer storage, 2002.
[8] BRICKS Project. BRICKS - Building Resources for Integrated Cultural Knowledge Services (IST 507457), 2004. http://www.brickscommunity.org/.
[9] G. F. Coulouris and J. Dollimore. Distributed Systems - Concepts and Design. Addison-Wesley, 1989.
[10] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Wide-area cooperative storage with CFS. In Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles, pages 202-215. ACM Press, 2001.
[11] A. Datta, M. Hauswirth, and K. Aberer. Updates in highly unreliable, replicated peer-to-peer systems. In Proceedings of the 23rd International Conference on Distributed Computing Systems, page 76. IEEE Computer Society, 2003.
[12] B. Kemme and G. Alonso. Don't be lazy, be consistent: Postgres-R, a new way to implement database replication. In The VLDB Journal, pages 134-143, 2000.
[13] P. KneževInhaltsverzeichnisSTOPić, A. Wombacher, T. Risse, and P. Fankhauser. Recycling the ROWA-A protocol in a decentralized/peer-to-peer environment. In the International Conference on Distributed Computing Systems (ICDCS'05) (submitted).
[14] D. Milojičić, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, and Z. Xu. Peer-to-peer computing. Technical report, HP, 2002. http://www.hpl.hp.com/techreports/2002/HPL-2002-57.pdf.
[15] M. T. Özsu and P. Valduriez. Principles of Distributed Database Systems. Prentice Hall, 1999.
[16] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison Wesley, 1987.
[17] S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, and J. Kubiatowicz. Maintenance-free global data storage. IEEE Internet Computing, 5(5):40-49, 2001.
[18] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. Lecture Notes in Computer Science, 2218, 2001.
[19] A. Rowstron and P. Druschel. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. In Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles, pages 188-201. ACM Press, 2001.
[20] S. Saroiu, P. K. Gummadi, and S. D. Gribble. A measurement study of peer-to-peer file sharing systems. In Proceedings of Multimedia Computing and Networking 2002 (MMCN'02), San Jose, CA, USA, January 2002.
[21] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 ACM SIGCOMM Conference, pages 149-160, 2001.
[22] W3C. XML Path Language (XPath) Version 1.0, 1999. http://www.w3c.org/TR/xpath.
[23] W3C. XML Query, 2003. http://www.w3c.org/XML/Query.