
Computer Communications 31 (2008) 437–451 www.elsevier.com/locate/comcom

SPICE: Scalable P2P implicit group messaging

Daniel Cutting a,*, Aaron Quigley b, Björn Landfeldt a

a School of Information Technologies J12, University of Sydney, NSW 2006, Australia
b School of Computer Science and Informatics B2.13, University College Dublin, Belfield, Dublin 4, Ireland

Available online 31 August 2007

Abstract
Implicit group messaging (IGM) is a decoupled messaging paradigm for connecting content publishers and consumers over the Internet. Unlike traditional multicast or publish/subscribe messaging, IGM delivers content to “implicit groups” of consumers with characteristics specified by the publisher at the time of publication. IGM systems must support thousands of users and an infinite number of implicit groups formed on demand as messages are published. These groups may be messaged repeatedly or once only, with group sizes scaling from no members to the entire network. Load distribution is a key problem of such systems. This paper broadens our earlier work [D. Cutting, B. Landfeldt, A. Quigley, Implicit group messaging over peer-to-peer networks, in: A. Montresor, A. Wierzbicki, N. Shahmehri (Eds.), Sixth IEEE International Conference on Peer-to-Peer Computing (P2P2006), IEEE Computer Society, Cambridge, United Kingdom, September 2006, pp. 125–132.] in three ways: we provide a formal specification of implicit groups and implicit group messaging; we introduce a comprehensive framework for analysing the efficiency and fairness of generic IGM implementations; and our distributed structured peer-to-peer IGM model, SPICE, is augmented with adaptive load distribution techniques. Through detailed simulation and analysis using Zipfian data sources we demonstrate these techniques are capable of very fairly distributing incoming and outgoing loads over peers irrespective of the scale of implicit groups or frequency of messages.
© 2007 Elsevier B.V. All rights reserved.

Keywords: P2P; Implicit groups; Implicit group messaging; Distribution; Replication; Fairness

* Corresponding author. Tel.: +61 2 9351 3423. E-mail addresses: [email protected] (D. Cutting), aquigley@ucd.ie (A. Quigley), [email protected] (B. Landfeldt).
0140-3664/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2007.08.026

1. Introduction
The technical difficulty of publishing content on the Internet has significantly diminished with the introduction of free, straightforward tools such as Blogger [1] and Flickr [2]. These tools support the proliferation of blogs, web sites, and photo journals [20,26,24] which has led not only to copious content, but also content of a qualitatively different kind. For example, the “Baghdad blogger”, Salam Pax [5] was able to publish news and opinions from within Baghdad before and during the invasion of Iraq in 2003, exposing many people to a side of the conflict not prominent in mainstream media. “Open source journalism” [4] is also beginning to take root and the concurrent evolution of connected mobile devices such as camera phones also
means people are able to capture photographs and upload them to web sites ‘‘on-the-go’’. This multitude of niche publishers serves a correspondingly large number of consumers [34,29]. Their ability to cope with the swell of information in these environments is somewhat aided by familiar technologies such as search engines including Google [3] which allow keyword searches of web pages. Directories such as Yahoo! [6] also provide detailed taxonomies of sites and podcasts, for example. However, these approaches generally place the onus on the consumers to hunt for information, sifting through search results, or progressively narrowing their browsing through pre-determined categories. A complementary technique to delivering content is push technology, which has the publishers of messages initiating the delivery to consumers [37]. For instance, publish/subscribe messaging (pub/sub) allows consumers to subscribe to specific sorts of information and have matching items delivered as they are published. Push is also the basis of traditional


D. Cutting et al. / Computer Communications 31 (2008) 437–451

multicasting, where consumers subscribe to certain channels or join particular multicast groups. Many websites now offer periodic RSS feeds that deliver news and updates to readers. While push addresses the issue of timeliness and is appropriate for information-driven applications, it does not address the initial burden on the consumer of finding interesting content or RSS feeds. However, since the publisher of a message knows precisely the demographic for which it is intended, an alternative strategy is to have the publisher specify the type of consumer for each message created. Underpinning this is the notion of an implicit group, a set of consumers that have some inherent features in common [15,8,23], such as ‘‘all English football supporters’’. An implicit group is defined by the required characteristics for membership, rather than by an explicit list of members. We posit that combining push with implicit groups mitigates the need for consumers to search for interesting content; it is automatically delivered to them from a range of publishers without intervention. We term this concept implicit group messaging (IGM). Publishers are able to deliver content to interested consumers without knowing their names or addresses and consumers do not need to subscribe to specific types of content or find particular groups catering to their interests. Instead, IGM allows publishers to deliver to implicit groups by describing the consumers. Each message can be directed to a different implicit group without any reconfiguration of the network and without any interaction by the consumers. IGM is designed to act as a messaging middleware for scientific collaboration tools, multimedia distribution, blog syndication and any other applications where the publishers of messages know the characteristics of their audiences but not their actual members. Although sharing some features with traditional multicast and publish/subscribe messaging, IGM is functionally distinct [15,16]. 
IGM systems must support thousands of Internet users and an infinite number of implicit groups formed on demand as messages are published. These groups may be messaged repeatedly or once only, with group sizes scaling from no members to the entire network. Although success stories such as Google show that the client/server approach can support many users when well-provisioned, there is a trend in the research community towards distributed peer-to-peer (P2P) systems that can conceptually scale smoothly from small to large numbers of participants. P2P systems can also avoid centralised bottlenecks and points of failure, and eliminate inherent bias or censorship of content, whether intentional [19] or not [27]. Distributed systems may also be more resilient to manipulation by individuals or organisations, and less vulnerable to legal or technical attacks [25]. In earlier work [15], we informally established the concept of implicit group messaging and presented a structured P2P system capable of reducing peer load as compared to a centralised model. This paper extends that work by formalising IGM (Section 2), greatly augmenting the load distribution features of the P2P model (Section 3),

and introducing a generic framework for analysing IGM implementations (Section 4). In particular, this paper focuses on evaluating the ability of the new P2P features to fairly diffuse incoming and outgoing load over all participating peers, regardless of the scale of implicit groups or frequency of casts (Section 5). We then present related work (Section 6) and conclude in Section 7.

2. Formal basis of implicit group messaging
Any concrete implementation of an IGM system (such as SPICE, the structured P2P design of Section 3) should have the following basic properties:
(i) All selected consumers eventually receive messages.
(ii) Messages are only delivered once to each consumer.
(iii) Only messages that are published are delivered.
(iv) Non-selected consumers do not receive messages.
The delivery of messages to all members of an implicit group is of the highest priority. Beyond these basic properties, a model should seek to minimise overall network and participant load and deliver messages without delay, although the relative importance of these is application-specific.

Each participant in an IGM system has certain associated attributes or aspects of their identity. These attributes are codified with a descriptive modelling language which can be tailored specifically for the application domain. An implicit group is defined as a subset of all participants that are described by a target expression over the same language. An expression implicitly segments the entire network into two parts: those participants that are described by the expression and those that are not. An expression is said to select those participants it describes. Implicit group messaging is the process of delivering a message from a source participant (the publisher) to the implicit group (the consumers) selected by the target expression associated with a message. Publishers need not be members of the implicit groups to which they publish. Fig. 1 shows publishers at left delivering messages to the consumers in various implicit groups.

Fig. 1. Conceptual model of implicit group messaging.


IGM does not prescribe any particular modelling language to describe consumers and implicit groups; it may be chosen to suit a particular application domain. The following generic definitions are independent of language. Let $P$ be the set of all participants in the system. Each participant $p \in P$ has a registration $p_r$ which describes it according to a modelling language. In fact, a participant may have many registrations, but without loss of generality we assume exactly one registration per participant. Any language used to model implicit groups must define a surjective function $\triangleright$ (read “selects”) which maps a target expression and registration to true or false. $T$ and $R$ are the sets of all target expressions and registrations for a language, respectively.

$$\triangleright : T \to R \to \{\mathrm{true}, \mathrm{false}\} \tag{1}$$

The notation $t \triangleright r$ is read as “target expression $t$ selects registration $r$”. $P_t$ is then defined as the implicit group (a subset of $P$) selected by target expression $t$.

$$P_t = \{\, p \in P \mid t \triangleright p_r \,\} \tag{2}$$

The structured P2P model described in Section 3 uses a concrete modelling language based around the idea of simple descriptive tagging. Consumers’ attributes are expressed as tags (non-empty strings). Target expressions are strings that combine tags using conjunctive (&) and disjunctive (|) operators, and parentheses to modify evaluation order.

tag := [a-zA-Z]+                                      (3)
expression := factor ("&" factor | "|" factor)*       (4)
factor := tag | "(" expression ")"                    (5)

The selection function $\triangleright$ operates as expected under Boolean logic. For example, the target expression (A|B)&C selects all consumers that have registered the tags A and C, and/or B and C. More formally, let a registration $r$ describing a participant’s attributes be represented by a registration set $R(r)$ containing a tag for each attribute. E.g., a registration for a peer with tags A and B has a registration set {A,B}. Let a target expression $t$ be represented by a target set $T(t)$ which is a set of target elements, each of which represents a disjunctive term of the expression. Each target element is a set of conjunctive tags. E.g., the expression (A|B)&C expands to two disjunctive terms, (A&C)|(B&C), each containing two conjunctive terms. This expression is therefore represented by a target set {{A,C},{B,C}}. The selection function $\triangleright$ can then be defined as true if and only if any non-empty subset of the registration set is an element of the target set:

$$t \triangleright r \iff \exists K \in T(t),\; K \subseteq R(r),\; K \neq \emptyset \tag{6}$$

The definition of an implicit group (Eq. (2)) can thus be rewritten using the tag-based selection function (Eq. (6)) to yield:

$$P_t = \{\, p \in P \mid \exists K \in T(t),\; K \subseteq R(p_r),\; K \neq \emptyset \,\} \tag{7}$$
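The tag-based selection function can be illustrated with a minimal Python sketch. It assumes the grammar of Eqs. (3)–(5) is applied strictly left to right, with no precedence between & and |, as the grammar is written. The names `target_set`, `selects` and `implicit_group` are hypothetical and are not part of SPICE.

```python
import re


def tokenize(expr):
    """Split a target expression into tags and the symbols & | ( )."""
    tokens = re.findall(r"[A-Za-z]+|[&|()]", expr)
    if "".join(tokens) != re.sub(r"\s+", "", expr):
        raise ValueError(f"illegal character in expression: {expr!r}")
    return tokens


def target_set(expr):
    """Expand an expression into T(t): a set of frozensets of conjunctive tags."""
    tokens = tokenize(expr)
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expression()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return inner
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return {frozenset([tok])}

    def expression():
        nonlocal pos
        result = factor()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            if op == "&":
                # Conjunction distributes over the disjunctive terms.
                result = {a | b for a in result for b in rhs}
            else:
                result = result | rhs
        return result

    result = expression()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


def selects(expr, registration_tags):
    """Eq. (6): some non-empty target element is a subset of the registration set."""
    reg = frozenset(registration_tags)
    return any(k and k <= reg for k in target_set(expr))


def implicit_group(expr, registrations):
    """Eq. (7): participants whose registration is selected by the expression."""
    return {p for p, tags in registrations.items() if selects(expr, tags)}
```

For instance, `target_set("(A|B)&C")` yields the two target elements {A,C} and {B,C} used in the example above.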


That is, an implicit group is the set of participants that have registered all of the tags in at least one element of a target set. Publishing a message to an implicit group is termed casting a message. Such messages may themselves also be referred to as casts. Casting is based around an interface of operations in an IGM system, and conditions that define which orders of invocations are legal. This specification is based upon the trace-based specifications of Mühl [28]. A minimal IGM interface is specified in Table 1 which allows consumers to express their attributes with a registration, and publishers to cast messages to implicit groups. To guarantee the desired IGM properties introduced at the beginning of the section, these operations must occur only in certain legal orderings. For example, it is legal for a participant $p$ to be notified of a cast $c$ if it has previously made a registration selected by the message’s target expression (i.e., $c_t \triangleright p_r$):

$$\mathrm{reg}(p),\ \mathrm{cast}(q, c),\ \mathrm{notify}(p, c) \tag{8}$$

This instance satisfies all informal properties of an IGM system. It would however constitute an illegal ordering if $c_t \not\triangleright p_r$, since $p$ would be notified of a cast for which it had not previously registered a selected registration, violating Property 4.

The legal orderings of IGM operations can be formalised by adapting Mühl’s safety and liveness conditions. These were derived for content-based publish/subscribe systems, but the same concept can be applied to any event-based messaging system such as IGM or IP multicast. Similar safety and liveness conditions have also been used to specify other event-based messaging systems such as Gryphon [9]. The conditions use temporal operators with the following meanings:
– $\Box X$ means X is true in all future states.
– $\Diamond X$ means X is true in at least one future state.
– $\bigcirc X$ means X is true in the next state.
The reader is referred to Mühl [28] for a full treatment of trace-based specifications.

Table 1
The IGM interface

Method         Description
reg(p)         Consumer p registers p_r
cast(p, c)     Publisher p casts c to the group selected by expression c_t
notify(p, c)   Consumer p is notified of cast c

A safety condition (Predicate 9) is used to express IGM Properties 2, 3, and 4. It states that when a consumer is notified of a cast, it will not be notified again, the message was previously cast by some publisher, and its registration is selected by the cast’s target expression. Additionally, the condition requires that the consumer has completed a registration. Let $C(q)$ be the set of all messages cast by $q$ so far, and $R$ be the set of all peers that have registered so far.


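$$\Box\,[\,\mathrm{notify}(p,c) \Rightarrow \bigcirc\Box\,\neg\mathrm{notify}(p,c) \;\wedge\; \exists q.\; c \in C(q) \;\wedge\; c_t \triangleright p_r \;\wedge\; p \in R\,] \tag{9}$$

A liveness condition (Predicate 10) is used to express IGM Property 1. It states that if a consumer registers, it will eventually (and subsequently) be notified of all casts from any publisher that select its registration.

$$\Box\,[\,\mathrm{reg}(p) \Rightarrow \Diamond\Box\,[\,(\mathrm{cast}(q,c) \wedge c_t \triangleright p_r) \Rightarrow \Diamond\,\mathrm{notify}(p,c)\,]\,] \tag{10}$$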
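To make the safety condition concrete, the following Python sketch checks a finite trace of `reg`, `cast` and `notify` events against Predicate 9. It is an illustration under stated assumptions, not part of the specification: events are encoded as tuples, registrations are tag sets, and each cast's target expression is given directly as a target set. The function name `check_safety` is hypothetical. Liveness (Predicate 10) concerns infinite behaviours and cannot be decided from a finite prefix, so it is not checked here.

```python
def check_safety(trace, registrations, targets):
    """Return the index of the first event violating safety, or None.

    trace:          list of ("reg", p), ("cast", q, c) or ("notify", p, c)
    registrations:  dict mapping participant -> set of tags (p_r)
    targets:        dict mapping cast id -> target set, i.e. a set of
                    frozensets of tags representing c_t
    """
    registered = set()    # R: peers that have registered so far
    cast_so_far = set()   # union of C(q) over all publishers q
    notified = set()      # (p, c) pairs already delivered

    for index, event in enumerate(trace):
        kind = event[0]
        if kind == "reg":
            registered.add(event[1])
        elif kind == "cast":
            cast_so_far.add(event[2])
        elif kind == "notify":
            _, p, c = event
            tags = registrations.get(p, set())
            selected = any(k and k <= tags for k in targets.get(c, ()))
            legal = (
                (p, c) not in notified   # Property 2: delivered at most once
                and c in cast_so_far     # Property 3: message was published
                and selected             # Property 4: registration is selected
                and p in registered      # consumer has completed registration
            )
            if not legal:
                return index
            notified.add((p, c))
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return None
```

The legal ordering of Eq. (8) passes this check, while a trace that notifies a consumer whose registration is not selected is rejected at the offending `notify` event.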

Note that these specifications do not state how quickly a notification must take place after a message is published, or place limits on resources used. These are domain and implementation concerns that are highly dependent on the intended usage, the system design and the modelling language. The following section presents a concrete implementation of IGM that considers such issues. A more detailed description of IGM can be found in an associated technical report [16].

3. SPICE (P2P model)
The focus of this paper is the adaptive load distribution features of our P2P IGM implementation called SPICE.¹ The general approach is to construct registration and casting algorithms over a Distributed Object Location and Routing (DOLR) structured overlay network. The initial SPICE design presented in this section could use any DOLR, such as Tapestry [38]. However, the advanced load distribution algorithms presented in Section 3.3 require specific features enabled by ICE, our DOLR substrate described in Section 3.2. This section first describes the basic SPICE model without load distribution features.

DOLRs are similar to distributed hash tables (DHTs) which operate much like classical hash tables, storing and retrieving objects over a network of peers structured in an address space. Peers are typically responsible for a portion of the address space and store and serve the objects that are hashed to it. DOLRs are more generic; they support the routing of arbitrary messages to objects or nodes in the substrate.

A simple DOLR-based IGM system is implemented in two parts. The basic approach is known as “vertical” partitioning, a technique often used for distributed search indices [33]. A new peer first constructs a description of all of its tags called a summary. The summary is then stored in registries at several addresses throughout the DOLR.
The peers maintaining the registries are known as rendezvous peers (RPs) and they are found by hashing tags with a globally-known uniform function. Fig. 2(a) shows the process of 3 peers registering 2 tags each. Note that the summaries stored by each RP contain all of the tags of the peer registering.

¹ Named for the variety of delivered content.

Fig. 2. The basic SPICE (a) registration and (b) cast algorithms.
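The basic registration and cast algorithms of Fig. 2 can be sketched in a few lines of Python. This is a simplification under stated assumptions: an in-memory dictionary stands in for the DOLR, SHA-1 stands in for the globally known uniform hash function, and unicast delivery is modelled by appending to a per-peer inbox. The class and method names (`BasicSpice`, `register`, `cast`) are hypothetical.

```python
import hashlib
from collections import defaultdict


def rendezvous_address(tag, bits=32):
    """Hash a tag to an address; SHA-1 is an assumed stand-in hash."""
    digest = hashlib.sha1(tag.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


class BasicSpice:
    """Toy model of basic SPICE without load distribution."""

    def __init__(self):
        # address -> {peer: summary}; models the registry held by each RP
        self.registries = defaultdict(dict)
        # peer -> list of delivered messages; models IP unicast delivery
        self.inbox = defaultdict(list)

    def register(self, peer, tags):
        """Store the peer's full summary at the RP of each of its tags."""
        summary = frozenset(tags)
        for tag in summary:
            self.registries[rendezvous_address(tag)][peer] = summary

    def cast(self, conjunctive_tags, message):
        """Route a cast to the RP of any one tag and unicast to matches."""
        tags = frozenset(conjunctive_tags)
        if not tags:
            raise ValueError("a target expression needs at least one tag")
        any_tag = next(iter(tags))
        registry = self.registries.get(rendezvous_address(any_tag), {})
        # Each summary holds all of a peer's tags, so one RP can resolve
        # the whole conjunctive expression locally.
        members = sorted(p for p, s in registry.items() if tags <= s)
        for peer in members:
            self.inbox[peer].append(message)
        return members
```

Because every stored summary contains all of the registering peer's tags, the result is the same whichever tag of the expression is hashed.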

When a peer wishes to send a cast to an implicit group, it hashes any one of the tags in the target expression and routes the cast to the rendezvous peer. When the rendezvous peer receives the cast it is a simple matter to determine which peers are members of the implicit group, as it has stored registrations for all peers expressing that tag. It then IP unicasts the cast to each of them. Fig. 2(b) shows a message cast to the implicit group “Football & England”. The publisher hashes the tag “Football” and routes the cast to its RP. The RP determines from its tables that two peers have registered both of the tags in the target expression and unicasts it to them. Target expressions containing disjunctive terms can be treated as separate casts to several conjunctive expressions.

3.1. Load imbalance
The basic SPICE model does not scale well to frequently messaged or large implicit groups. Frequent casts containing a tag would generally increase the incoming load to the rendezvous peer responsible for the tag. Likewise, a tag that is commonly expressed by peers means the RP must store many summaries, and transmit many unicasts to members of implicit groups. Table 2 permutes the loading possibilities on RPs in terms of incoming and outgoing load. It is preferable to spread this load over many peers. We define two parameters that capture this notion. The storage limit (SL) is the maximum number of summaries a rendezvous peer is willing to store (which consequently limits the number of consumers to which it must forward casts). The frequency limit (FL) is the maximum number of casts per second it is prepared to service. The basic SPICE model is augmented by load distribution algorithms which use these parameters to address loading problems in Section 3.3. However, it is necessary to first describe the ICE substrate that supports SPICE, as its unique features form the basis of the algorithms.

Table 2
Varieties of casts engendered by peer registrations and casts

                  Common tag     Rare tag
Frequently cast   ↑ in, out      ↑ in
Seldom cast       ↑ out          ↓ in, out

3.2. The ICE substrate

ICE [17] is our novel DOLR substrate based around tesseral addressing and an efficient amortised multicast routing algorithm. ICE is a useful fundament for building P2P systems such as SPICE and has just two functions: to organise the peers into a structured overlay over the physical network; and to route messages from source peers to multiple destination peers. Fig. 3(a) illustrates the layered approach to building applications with ICE. Peers are organised on a d-dimensional surface. The entire surface is claimed by peers; there are no ‘‘holes’’. Each peer ‘‘owns’’ an exclusive region of the surface and communicates directly only with peers that own bordering regions (additionally, the surface wraps on all edges so as to not introduce discontinuities). Messages are routed across the surface by passing them from neighbour to neighbour in the direction of the destination. Fig. 3(b) shows an example of a two-dimensional surface. The ICE surface is similar to the CAN DHT [32], but there is a crucial difference; unlike CAN which uses a Cartesian coordinate space, ICE surfaces use hierarchical tesseral addressing.

3.2.1. Tesseral addressing
Tesseral addressing is a compact and elegant addressing scheme commonly used for spatial indexing. Instead of specifying points with coordinates, tesseral addressing describes regions of space. It is particularly useful for arbitrarily decomposing a space to different granularities and is used here to decompose the ICE surface. ICE regions are addressed by strings of d-bit digits (of base $2^d$). These regions are termed extents, and each digit represents a progressive index into the hierarchically addressed surface. Extents may be large and coarse-grained or extremely small and fine-grained. The depth of an extent is the number of digits in its address. The solitary extent of zero length is called the universal extent (denoted U) and represents the entire surface. A collection of extents is called a tract and can be used to describe an arbitrary region of the surface.

Fig. 3. (a) Layered P2P architecture; (b) extents on a 2D ICE surface.
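The hierarchical decomposition of Fig. 3(b) can be made concrete with a small Python sketch that maps an extent address to the region it covers on a unit surface. The bit-to-axis convention is an assumption chosen to match the left-to-right, top-to-bottom ordering of the figure: bit i of each digit selects the far half along axis i, with the vertical axis growing downwards. The function names are hypothetical, and digits are read as single hexadecimal characters, so the sketch covers d ≤ 4 only.

```python
def extent_bounds(address, d=2):
    """Return (origin, size) of an extent on the unit d-dimensional surface.

    The empty address is the universal extent U covering the whole surface.
    """
    origin = [0.0] * d
    size = 1.0
    for ch in address:
        digit = int(ch, 16)
        if digit >= 2 ** d:
            raise ValueError(f"digit {ch!r} is out of range for d={d}")
        size /= 2  # every further digit halves the extent along each axis
        for axis in range(d):
            if (digit >> axis) & 1:
                origin[axis] += size
    return tuple(origin), size


def depth(address):
    """The depth of an extent is the number of digits in its address."""
    return len(address)


def parent(address):
    """Omitting the last digit yields the enclosing, larger extent."""
    if not address:
        raise ValueError("the universal extent has no parent")
    return address[:-1]


def contains(outer, inner):
    """An extent contains every extent whose address it prefixes."""
    return inner.startswith(outer)


def intersects(a, b):
    """Two extents overlap only if one address is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)
```

Under these assumptions, address 0 covers the top-left quadrant, and each step from 210 to 21 to 2 to U quadruples the covered area on a two-dimensional surface.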


Fig. 3(b) illustrates the addressing scheme, ordered left-to-right and top-to-bottom. For example, the address 0 specifies the top-left quadrant of a two-dimensional surface, and address 3032 specifies a small extent towards the bottom-right corner. This mapping approach can be applied to surfaces of arbitrary dimensionality without loss of generality. This addressing scheme affords compact descriptions of large and small regions of the surface, and is a convenient way for peers to communicate with one another. The self-similar properties of tesseral addressing are an integral part of the load distribution mechanisms of SPICE described in Section 3.3. It is important to note that although tesseral addressing is similar in concept to quad-tree data structures, nowhere is such a structure maintained by peers. In particular there is no “root” peer. The addressing scheme is only a convention used to communicate surface information.

3.2.2. Amortised routing
ICE’s other major component is an efficient amortised multicast routing algorithm. Payloads are routed from a source peer to a destination tract. All peers that own tracts intersecting the destination tract receive the message. Messages are routed geometrically across the surface between neighbouring peers. At each hop, peers forward the message to a neighbour that owns an extent nearer to the destination. Fig. 4(a) shows an example of a message routed across a two-dimensional surface. As with CAN, routing a message across the surface takes $O(dn^{1/d})$ hops. The routing algorithm is not restricted to point-to-point routing; it delivers copies of a message to all peers with tracts intersecting a destination tract. Often the tract will be a small contiguous region of the surface intersecting with only one or a few peers. However there is no requirement for a tract to be contiguous. A tract can specify any region of the surface whether contiguous or not.
An optional mask can also be applied such that only extents in the destination tract that also match the mask receive the routed message. There are times when it is useful for a message to be delivered to disparate parts of a surface. For instance, a resilient storage layer built atop ICE could

Fig. 4. (a) Point-to-point routing over a 2D ICE surface; (b) a message delivered from a source to the shaded region using amortised routing.


use the routing algorithm to copy data to multiple points on a surface. Routing separate copies of a message from a source is inefficient if some of the route is the same for each copy. Because ICE is a structured overlay and the surface is a geometric construct, the routing algorithm can cluster destinations based on their direction away from the source in order to amortise the routing cost. If several destinations lie in the same direction away from a source, it is sufficient for a single physical packet to be routed that way. This technique is recursively applied. Not only are destinations clustered from the initial source, they are clustered again at each hop. This results in messages taking a tree-like route from the source to all peers with tracts that intersect the destination tract. Initially the route has few branches but as the messages approach their destinations they branch as necessary to balance the total number of physical packets against direct individual routes. Fig. 4(b) shows an example of amortised routing delivering a message to a highlighted destination tract.

Clustering is based on a divisive hierarchical cluster using the angular difference between destinations from the current peer as a distance metric. Due to parallax, distant extents are more likely to be clustered than nearby extents, which results in the tree-like branching as messages approach destinations. The parameter controlling the degree of branching is called the branch factor. This is the angular threshold value used in the clustering algorithm, expressed in radians and taking a value in the range 0–π. A value of 0 means messages are never amortised; each copy is independently routed point-to-point. A value of π means the message never branches but visits each destination in turn. The advantage of a high branch factor is that a minimal total number of messages are needed, which in turn reduces the incoming and outgoing load on peers and network links. The drawback is that messages take longer to reach their destinations, because more circuitous routes are taken. A more detailed description of ICE can be found in an associated technical report [17].

3.3. SPICE unloaded

The unique features of ICE – tesseral addressing and amortised routing – are used to distribute load in the SPICE model. The intention is to reduce outgoing load on RPs that hold registries for common tags, and incoming load on RPs that hold registries for tags that appear frequently in target expressions. The two complementary techniques employed are called registry distribution and registry replication. The former adaptively distributes the summaries held in a registry over many peers surrounding the original RP, such that each is required to forward only a fraction of the total outgoing load for a cast. The latter makes complete copies of registries at other parts of the surface that can be independently used to resolve casts, thereby reducing the total incoming

load on any single RP. The techniques are guided by the parameters defined in Section 3.1 – the storage limit (SL) and the frequency limit (FL) – and can be combined for registries of tags that are both commonly registered by peers, and frequently used in target expressions. The techniques are presented in the following sections and detailed in an associated technical report [18]. 3.3.1. Registry distribution The basic SPICE registration algorithm requires registries to be stored by an RP at the addresses found by hashing tags. In SPICE, tags are hashed to deep extents on the surface known as rendezvous extents. When many peers register the same tags, these registries will become large and the RP will need to store many summaries and forward many casts to large implicit groups. Registry distribution is designed to reduce the outgoing load of peers that store many summaries. It works by expanding in incremental steps the size of the rendezvous extent where a registry is stored. When the rendezvous extent is small, it is covered by a single peer which stores all of the summaries. As a rendezvous extent grows, it covers more neighbouring peers, each of which becomes an RP and is responsible for storing a fraction of the summaries. The rate at which the rendezvous extent grows is determined by the storage limit and the popularity of the tag being registered. When an RP reaches its storage limit, the rendezvous extent is increased in size by omitting a digit from the end of its address. Due to the hierarchical tesseral addressing scheme provided by ICE, this results in a new extent covering a larger part of the surface. Take Fig. 5 as an example. Suppose on a two-dimensional surface the ‘‘Football’’ tag hashes to the rendezvous extent 210 (the small highlighted extent in the first subfigure). All peers that register ‘‘Football’’ route their summaries to the RP that owns that extent. 
If the tag is common, the storage limit of the RP will soon be reached so it will increase the rendezvous extent to 21 by omitting the trailing 0. This extent is $2^d = 4$ times the size and encompasses four peers (assuming the surface is evenly divided), each of which becomes an RP for the “Football” registry and stores a fraction of the summaries. This process can continue as needed to comfortably accommodate all summaries stored in the registry. Eventually the rendezvous extent can reach the universal extent which contains the entire surface. When a cast arrives at an RP that has reached its storage limit, it is routed to the next largest rendezvous extent. This continues until all summaries are found, whereupon the cast is forwarded to selected consumers. By distributing the summaries over larger extents, individual peers need store fewer summaries, and perform fewer unicasts when matches are found. In one version of SPICE [18], the dispersion of summaries to RPs within larger rendezvous extents is random. In this paper, we explore the possibility of intentionally placing summaries at locations that encode information about


Fig. 5. Registry distribution – a registry’s rendezvous extent grows as it is overloaded with summaries. Rendezvous extent 210 overflows to 21, then 2, and finally to U. Summary 312 is stored at the locations marked with a star as the rendezvous extent grows.

the tags they contain. This allows casts to be routed only to the small subset of locations in rendezvous extents that may contain matching summaries. This is achieved by representing the summaries as Bloom Filters [10]. Bloom Filters are compact representations of sets of objects which support membership tests with an adjustable error rate of false positives, effectively trading precision for reduced storage. Stored as bit strings, Bloom Filters can incorporate new objects by hashing them with k different functions, each returning a position in the string to be set. An object can be tested for membership by hashing it with the k functions and ensuring all returned bit positions in the filter have been previously set. Because Bloom Filters are bit strings, they can be read as if they were an extent address on the ICE surface by taking d bits at a time. The location where each summary is stored is calculated by combining digits of the rendezvous extent with those of the summary. By using the rendezvous extent as the head of the address, the summary will be stored somewhere within the rendezvous extent, and by using the summary itself as the tail, its precise location will encode the tags it contains. Continuing with the example above, suppose a summary contains the tags ‘‘Football’’ and ‘‘England’’, resulting in a Bloom Filter with the bits 110110. This summary is read as the extent address 312 on a two-dimensional surface. Storing this summary in the ‘‘Football’’ registry requires it to be stored at extent 210, the hash of that tag. If the RP owning that extent has reached its storage limit, the rendezvous extent becomes 21. This is then combined with the summary, 21 + 3(12), to produce the address 213, which is where the summary is stored. Storing the summary in the next largest rendezvous extent would result in the address 2 + 31(2) = 231. If the rendezvous extent reaches U, the entire summary determines the location where it is stored, i.e., 312. Fig. 
5 shows this particular summary as a star. Note that the locations of summaries produce a self-similar pattern as the rendezvous extent grows in size. Since the locations of summaries encode the tags they contain, this can be used to limit the tract that must be explored to find all selected summaries for any given cast. Recall the amortised routing algorithm allows an optional mask to select only those extents in a destination tract that also match the masked address. When routing a cast to a

larger rendezvous extent, a Bloom Filter containing the tags in the target expression is used as a mask. Hence, the cast will only be routed to those parts of the next rendezvous extent that can possibly be storing summaries including all necessary tags. For example, the target expression “Football & England” would create a mask from the Bloom Filter 110110. Routing this cast to the rendezvous extent 21 would have a mask 21 + 3 = 213, meaning only this extent would need to be explored for selected summaries. Every summary that includes the tags “Football” and “England” must be stored in that extent.

Our earlier P2P model also employs this notion of storing summaries at locations that encode information about their contents [15]. However, it does not use smoothly scaling rendezvous extents, instead storing summaries only in the universal extent. While that approach works well for very large registries, it is not suitable for registries of moderate size because casts must be routed large distances in the sparsely populated universal extent. The new approach can adapt to registries of any size as necessary.

3.3.2. Registry replication

Registry replication is designed to reduce the incoming load on peers caused by frequent casts to the same implicit groups. Once a rendezvous peer reaches its frequency limit, it routes copies of all of its summaries to corresponding extents on the surface. These replicas then resolve a fraction of new casts in the same way as the original RP, reducing its incoming load. The extents to which registries are copied are found by replacing the head of the rendezvous extent address with a number of “wildcards”. For example, Fig. 6 shows a two-dimensional surface with the RP at 210 holding the registry for the “Football” tag handling many casts (at left). Level 1 replication of this rendezvous extent, *10, copies the registry to corresponding extents in the other three quadrants, i.e., 010, 110 and 310.
Level 2 replication makes 16 copies at corresponding extents within the next deepest set of extents, **0. To make use of replicas, it is necessary to adjust the cast algorithm slightly. Ordinarily, casts are routed directly from the publisher across the surface to the rendezvous extent of one of the tags in the target expression. The cast is typically routed a reasonable distance over the surface on its way to the RP. By perturbing the route slightly so as to


D. Cutting et al. / Computer Communications 31 (2008) 437–451

Fig. 6. Registry replication – 0, 1 and 2 levels of replication. Registries are replicated to extents around the surface corresponding to the original RP. Casts need only find a nearby replica to be resolved.
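The wildcard addressing behind the replication levels of Fig. 6 can be sketched as follows. This is a hedged illustration rather than the SPICE implementation: `replica_extents` is a hypothetical helper, addresses are assumed to be strings with one hexadecimal character per digit, and each digit is assumed to take 2^d values on a d-dimensional surface (four quadrants in two dimensions).

```python
from itertools import product


def replica_extents(rendezvous_extent: str, level: int, dimensions: int) -> list[str]:
    """Expand `level` leading wildcards of a rendezvous extent address."""
    if not 0 <= level <= len(rendezvous_extent):
        raise ValueError("level must lie between 0 and the address length")
    fanout = 2 ** dimensions  # assumed number of sub-extents per digit
    digits = [format(d, "x") for d in range(fanout)]
    tail = rendezvous_extent[level:]
    # Every combination of head digits, followed by the unchanged tail.
    return ["".join(head) + tail for head in product(digits, repeat=level)]


level1 = replica_extents("210", level=1, dimensions=2)  # pattern *10
level2 = replica_extents("210", level=2, dimensions=2)  # pattern **0
```

Here `level1` contains 010, 110, 210 and 310 (the original extent plus its three copies) and `level2` contains the 16 extents matching **0. Level 0 returns only the original rendezvous extent.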

pass through extents where the summaries for the destination tag may have been replicated, a cast may happen upon a replica, allowing a reduction in path length and incoming load for the original rendezvous peer. If no replicas exist, the message will still reach the original rendezvous extent, having deviated only slightly from the optimum route. Replication combined with the modified cast algorithm divides the amount of incoming traffic for a tag evenly over all replicas (assuming publishers are randomly spread through the network).

For common tags that also frequently appear in target expressions, replication can be trivially combined with distribution. When a registry is replicated, all tags are routed to the corresponding rendezvous extents for storage. As the summaries arrive, the replica RPs may reach their storage limits and the summaries will be distributed to larger rendezvous extents based around the replicated regions.

4. Experimental design

Evaluation is by way of OMNeT++/INET simulations [36] at the UDP packet level, simulating 1024 casts between 4096 peers, with one cast per second. Each simulation is run five times with different random seeds and data sets. Errors shown on graphs are 95% confidence intervals.

SPICE and the simulation environment have several parameters, shown in Table 3 together with a range of typical values and the defaults chosen for this evaluation. This paper is focused on the effect of varying the storage and frequency limits. Other parameters will be investigated in future work, although preliminary investigation suggests the defaults chosen for this evaluation are appropriate for a range of scenarios.

Table 3
Parameters used in the evaluation

Parameter                        Typical range   Default
Surface dimensionality           2–4             3
Branch factor (c)                0–π             π/3
Summary error (%)                1–10            1
Frequency limit (FL) (casts/s)   ≤1              Varied
Storage limit (SL)               ≤#peers         Varied
Number of peers                                  4096
Number of casts                                  1024
Tags per peer                    5–50            18
Number of summary bits           Dependent       174

Summaries are represented by Bloom Filters, so the number of hash functions and bits must be determined. These can be calculated according to the equations of Broder and Mitzenmacher [12]. With the false positive rate fixed at 1% (the summary error), the optimal number of hash functions is approximately 7. For 18 tags per peer, this requires approximately 173 summary bits. However, summaries are used as extent addresses and must be expressed as a whole number of digits for a particular surface dimensionality, so we increase the number of summary bits accordingly. Here dimensionality is 3, so the number of summary bits is 174.

We now outline a generic framework for analysing IGM implementations which will be used in the experiments of Section 5. Analysis is broadly broken into three facets reflecting various concerns – peer, network and cast.

Peer facet. By joining an IGM system peers permit the use of four resources: storage; computation; incoming network traffic; and outgoing network traffic. With ever-increasing computational power and storage capacity available to desktop computers, the slight computational load and storage requirements imposed by a P2P client are presumed to be unproblematic. However, incoming and outgoing network loads on peers are important.

Network facet. IGM models are designed to run over fixed physical networks comprising a core of routers. Participants reside at the fringe of the network and communicate exclusively with IP unicast messaging. The network facet examines physical network link loading.

Cast facet. The performance of the IGM model in terms of total messages needed to deliver casts reflects the overall efficiency of a model. The latency of cast delivery may also be important in some applications.
In earlier work [15], we compared a simpler version of SPICE to a benchmark centralised model and found it to greatly reduce maximum peer and network link load. This paper is concerned with investigating how SPICE’s augmented load distribution features fairly and efficiently distribute the load over peers. As such, only the peer and cast facets are used and the benchmark for comparison is the basic SPICE model itself with no load distribution features employed; in all experiments, this is the instance with the greatest storage and frequency limits.
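The summary sizing quoted in Section 4 (about 7 hash functions, about 173 bits, rounded up to 174) can be checked with the standard Bloom Filter sizing formulas, m = -n ln p / (ln 2)^2 and k = (m/n) ln 2. The sketch below assumes these are the equations meant by the reference to [12]; the function name `bloom_parameters` is hypothetical, and the rounding step assumes one address digit consumes d bits on a d-dimensional surface.

```python
import math


def bloom_parameters(n_items: int, false_positive_rate: float, dimensions: int):
    """Return (hash functions, minimum bits, bits rounded up to whole digits)."""
    # Standard sizing for a target false positive rate p with n inserted items.
    exact_bits = -n_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    hash_functions = round((exact_bits / n_items) * math.log(2))
    minimum_bits = math.ceil(exact_bits)
    # Summaries double as extent addresses, so pad to a whole number of digits.
    whole_digits = math.ceil(minimum_bits / dimensions)
    return hash_functions, minimum_bits, whole_digits * dimensions


hashes, bits, padded_bits = bloom_parameters(n_items=18, false_positive_rate=0.01, dimensions=3)
```

For 18 tags per peer, a 1% summary error and a three-dimensional surface, this yields 7 hash functions, 173 bits and 174 padded bits (58 digits of 3 bits each), in agreement with the defaults in Table 3.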


4.1. Metrics

This section describes the metrics that make up the peer and cast analysis facets.

4.1.1. Cast facet

Ratio of Total Hops (RTH) is the total number of overlay messages needed to deliver a cast as a ratio of the number needed in a centralised client/server model. In a centralised model, a publishing peer sends a single message to the server, which forwards the message to all matching peers in the system. This is the minimal number of application-level packets needed. In order to measure RTH for cast x in SPICE, we divide the sum of messages used in the delivery of a cast by this minimum (Eq. (11)).

\[ \mathrm{RTH}_x = \frac{\sum_{k=1}^{n} \mathrm{out}_x^k}{1 + g_x} \qquad (11) \]

$\mathrm{out}_x^k$ is the number of packets for cast x forwarded by peer k (including duplicates). $g_x$ is the number of peers in the implicit group selected by cast x.

4.1.2. Peer facet

TOUT. Total outgoing load measures the total number of cast packets a peer forwards to other peers. RPs for tags frequently used in target expressions that select large implicit groups would be expected to have high TOUT since they must forward many copies of those casts to consumers. Registry distribution and replication would be expected to reduce TOUT by spreading the outgoing load to neighbouring peers and other replicated RPs, respectively.

TIN. Total incoming load measures the total number of cast packets handled by a peer on behalf of others (i.e., messages it is not interested in). Very frequent casts to the same group may be expected to load the same RPs in the system. Registry replication should reduce TIN by having other peers act as RPs for the same tags for a fraction of the casts.

POUT. Per cast outgoing load measures the maximum number of cast packets a peer forwards for a single cast. Although any peer may act as an RP for a cast (thus spreading the load over many casts), the transient outgoing load on peers can still be very high when implicit groups are large. POUT should fall as registry distribution is employed.

PIN.
Per cast incoming load measures the maximum number of cast packets received by a peer for a single cast. Generally this will be very low, but registry distribution may increase it by forwarding duplicates of casts to the same peers during a single cast. Loading. We take the maximum of these metrics across all peers and casts – denoted TINM, TOUTM, PINM and POUTM – to characterise the worst loading of any peer


in a particular simulation. A high maximum indicates at least one peer is heavily loaded according to the metric. Fairness. The notion of ‘‘fairness’’ is important in P2P systems, though ambiguous; load is balanced over many peers and content is available to all. The main design goal of SPICE is to ensure that the incoming and outgoing number of packets are not biased towards any peers, even when handling skewed distributions of peer registrations and casts. In this sense, fairness means all peers have the same incoming and outgoing loads. A possible measure of fairness is the standard deviation of a metric across all peers. However, this is in the same units as the value, and difficult to compare across different configurations of the model. The concept of fairness is found in other fields such as economics, which has several measures of inequality: the unequal distribution of income in a society. For a completely equal distribution, every member has the same level of income while in an unequal society only one member has any income. The analogy to peer loading in a P2P network is clear. One of the most commonly used measures of inequality is the Gini coefficient. The Gini coefficient (G) is based on Lorenz curves, which are the plotted representations of cumulative distribution functions (CDFs). Fig. 7 is an example of a Lorenz curve, showing the cumulative distribution of a set of tags registered by peers in this evaluation. G is defined as the shaded area between the diagonal line of equality and the Lorenz curve, as a fraction of the entire area below the line of equality. It can thus take a value from 0.0 (equality, or absolutely fair) to 1.0 (inequality, or absolutely unfair). The Gini coefficient has recently been used to measure fairness in P2P networks [30]. 
In that paper, the authors demonstrate that within the context of a DHT, a ‘‘good’’ value of G representing a fairly loaded system falls between 0.5 and 0.65, while an unfairly loaded system has G in the range 0.85 to 0.99.

Fig. 7. A Lorenz curve representing the frequency of tags registered by peers. The Gini coefficient is 0.72.
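As a minimal sketch of how a fairness value like the one in Fig. 7 could be computed from per-peer loads, the function below uses the standard form of the Gini coefficient for values sorted in increasing order. The name `gini` and the sample load vectors are illustrative assumptions, not data from the evaluation.

```python
def gini(values) -> float:
    """Gini coefficient of non-negative values: 0.0 is perfectly equal."""
    xs = sorted(values)  # the formula requires increasing order
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention of this sketch: no load means no inequality
    mean = total / n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * n * mean)


equal_load = gini([5, 5, 5, 5])    # every peer carries the same load
single_peer = gini([0, 0, 0, 10])  # one peer carries everything
```

For the equal case the result is 0.0. When a single peer out of N carries the whole load the result is (N - 1)/N, here 0.75, which approaches 1.0 as the number of peers grows.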



Eq. (12) defines the Gini coefficient for a set of N values sorted in increasing order. We use this measure to summarise the fairness of all peers’ total incoming and outgoing packet distributions, denoted TING and TOUTG.

\[ G = \frac{\sum_{i=1}^{N} (2i - N - 1)\,x_i}{N^2 \bar{x}} \qquad (12) \]

Here $x_i$ is the $i$th value in sorted order and $\bar{x}$ is the mean of the values.

4.2. Data source

Many Internet phenomena follow a Zipfian frequency distribution [7]. A skewed distribution such as this will tend to load some parts of an IGM system more heavily than others and serve as a suitable test of the load distribution features of SPICE.

A list of ~30,000 distinct tags is generated with a Zipfian distribution such that some tags are very common and most are very rare. Specifically, the frequency of a tag with rank i is $1/i$. From this distribution, each peer randomly selects a number of tags which make up their registration. Conjunctive casts comprising two tags (e.g., “Football & England”) are also constructed from the distribution.

The Lorenz curve of Fig. 7 shows an example of the cumulative frequency of tags registered by peers. This curve shows, for instance, that 75% of the distinct tags registered by peers account for just 20% of the sum of all registered tags. That is, most tags are registered by only a single peer, but a few common tags are registered by a majority of peers.

5. Results

This section presents the results of simulations testing the fairness and loading properties of SPICE. Section 5.1 first investigates how the model performs under extremely difficult casting conditions to gauge its “worst case” effectiveness. Section 5.2 then demonstrates SPICE under more realistic conditions using a Zipfian distribution of cast frequencies.
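The Zipfian data source of Section 4.2 can be approximated with the sketch below. It is an assumption-laden illustration: the paper states only that a tag of rank i has frequency 1/i, so the helper names, the fixed seed, and the choice to redraw until a peer holds distinct tags are all hypothetical.

```python
import random
from itertools import accumulate


def zipf_cumulative_weights(num_tags: int) -> list[float]:
    """Cumulative weights where the tag of rank i has weight 1/i."""
    return list(accumulate(1.0 / rank for rank in range(1, num_tags + 1)))


def sample_distinct_tags(rng: random.Random, cum_weights: list[float], count: int) -> list[int]:
    """Draw `count` distinct tag ranks, favouring low (common) ranks."""
    ranks = range(1, len(cum_weights) + 1)
    chosen = set()
    while len(chosen) < count:
        # Redraw on duplicates (assumption: a registration holds distinct tags).
        chosen.add(rng.choices(ranks, cum_weights=cum_weights, k=1)[0])
    return sorted(chosen)


rng = random.Random(42)  # arbitrary seed, for repeatability only
cum_weights = zipf_cumulative_weights(30_000)
registration = sample_distinct_tags(rng, cum_weights, 18)  # 18 tags per peer (Table 3)
cast = sample_distinct_tags(rng, cum_weights, 2)           # a two-tag conjunctive cast
```

Precomputing the cumulative weights keeps each draw cheap, which matters when thousands of peers each register many tags.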

5.1. Extreme cast load

This experiment tests how the model performs when 1024 casts are made to the same very large implicit group (a quarter of the whole network, or ~1024 members). The storage limit is varied from 16 (heavy distribution) to 4096 (no distribution) and the frequency limit from 2.56 casts/s (no replication) to 0.005 casts/s (heavy replication). The benchmark for these experiments is the basic SPICE model with no distribution or replication. This is the instance where the storage and frequency limits are greatest.

Fig. 8 shows that TOUTM is worst when there is no replication or distribution. This is because a single peer is required to unicast each of the 1024 casts to ~1024 peers (~1 million unicasts). A very large reduction in TOUTM is observed by reducing the storage limit, because individual peers store only a small number of summaries each. Replication also reduces TOUTM because each registry replica is only used to resolve a fraction of all casts. TOUTM converges to a minimum when replication is combined with distribution, because the expanded rendezvous extents used by registry distribution coincide with those from other replicas. For example, with level 1 replication, all distributed rendezvous extents eventually coincide at the universal extent. For heavier levels of replication, distributed rendezvous extents coincide more quickly at deeper extents. This is known as tag saturation.

Fig. 8. TOUTM decreases with heavier registry distribution and replication.

Fig. 9 shows that TINM is lowest when there is no distribution. This is a consequence of the routing algorithm; by not needing to route a cast to all RPs in a distributed registry, fewer total messages are needed. This is confirmed in Fig. 10 which shows that the RTH is highest when registries are distributed. Replication significantly reduces TINM, as it is designed to do, though the reduction is less dramatic as the degree of replication increases. Again tag saturation limits the benefits of heavy distribution and replication beyond a certain point.

Fig. 9. TINM increases with heavier registry distribution but can be reduced by also applying registry replication.

Fig. 10. RTH is highest with heavy distribution since more peers must be contacted. Replication has no significant effect.

With no distribution or replication, the total number of messages needed to publish a single cast is comparable to a centralised model (Fig. 10), since a single RP is acting like a server. When heavy replication and distribution are applied, the RTH increases to just twice that of a centralised model, while greatly reducing the load any single peer endures.

Fig. 13 shows that registry distribution is capable of greatly reducing per cast outgoing load (POUT) (in addition to total outgoing load), at the cost of slightly increased per cast incoming load (PIN). The slight increase in PINM is due to a small number of duplicate messages being routed through the same RPs when searching for summaries in distributed rendezvous extents. As these metrics measure per cast load, registry replication does not affect them.

It is very difficult to achieve global fairness with an extremely skewed cast set such as this, but the combination of registry distribution and replication is decidedly effective (Figs. 11 and 12). The benchmark SPICE model with no distribution or replication has TOUTG of 0.99 and TING of 0.93. With heavy distribution and replication, RTH increases but TOUTG and TING are reduced to exceptionally fair values of 0.52 and 0.50, respectively.

TIN is inherently fairer than TOUT for two reasons. First, the ICE routing algorithm branches as it delivers its payload, so some peers will send several messages for the one they received. Second, RPs must unicast to potentially many group members for each cast they receive. Hence, there will always be a significant number of peers that have transmitted more messages than others. Messages received, however, will tend to be similar in number across peers.

5.2. Realistic cast load

This section tests the model using a set of casts with a Zipfian distribution. This is a more realistic type of access


Fig. 11. The extremely skewed casts make fairness difficult, but heavy distribution and replication have a remarkable effect on TOUTG.

Fig. 12. The extremely skewed casts make fairness difficult, but heavy distribution and replication have a remarkable effect on TING.

load than that used in Section 5.1. Consequently, the range of the frequency limit is modified to produce similar levels of replication, between 0.2 (no replication) and 0.0125 (high replication). The general trends seen mirror those in Section 5.1 and most of the observations remain valid. However, there are several differences worth noting. TOUTM is more than an order of magnitude lower generally with Zipfian casts than the extreme casts of the previous experiment (Fig. 14). This is due to the far more varied mix of implicit group sizes selected by the casts; many of the groups have no members at all. TINM is also an order of magnitude lower than the extreme case (Fig. 15), since even the most popular tag is only accessed relatively seldom in comparison. RTH is dramatically higher for the Zipfian cast set on the whole (Fig. 16). The total number of SPICE messages for a cast using a registry that is not distributed is 14 times that of a centralised model. This is because many of the groups in the Zipfian cast data have very few or no members. However, the number of members is not known until the cast has been routed from the publisher to the RP,



Fig. 13. POUTM decreases with registry distribution at the cost of increased PINM. Replication does not affect per cast metrics.

which is approximately 24 hops over a 4096-peer three-dimensional ICE surface (Section 3.2.2). In a centralised model, only a single message to the server is required to cast to an empty group. Although the RTH seems high, the actual number of messages required is still quite low, and could be reduced further by increasing the dimensionality of the surface, shortening the average path length across the surface.

Fig. 14. Zipfian casts result in much lower TOUTM, but distribution/replication still shows a clear improvement.

Fig. 15. TINM is generally lower for Zipfian casts, but registry distribution still leads to increased incoming peer load.

Fig. 16. RTH is higher for Zipfian casts since they must still be routed to rendezvous peers, despite many groups being memberless.

Fig. 19 shows that POUTM and PINM follow much the same trend as the previous experiment because there are still some casts published to large implicit groups. As before, registry distribution reduces by two orders of magnitude the outgoing load on any single peer for any single cast, at the cost of a slight increase in incoming load.

Fig. 17. With Zipfian casts, SPICE shows exceptional fairness in terms of TOUTG.

Generally, SPICE is fairer for the Zipfian cast distribution (Figs. 17 and 18). With no replication or distribution, TOUTG is 0.89. More prominently, TING is just 0.39. As the degree of distribution increases, TOUTG only improves and as replication is added it falls to an extremely fair 0.43. As before, TING increases slightly due to the increased number of packets transmitted around distributed tag registries. Replication keeps TING fair despite this by dividing the load between replicas.
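The effect of memberless groups on RTH, discussed earlier in this section, can be illustrated with a short sketch of Eq. (11). The function name and both sample casts are hypothetical: they assume a route of 24 hops in which each peer on the path forwards one packet, and, for the second cast, a single undistributed RP that unicasts to every group member.

```python
def ratio_of_total_hops(packets_forwarded_per_peer, group_size: int) -> float:
    """Total overlay packets for a cast divided by the centralised minimum."""
    # A centralised server needs one message in, plus one per group member.
    centralised_minimum = 1 + group_size
    return sum(packets_forwarded_per_peer) / centralised_minimum


# Cast to an empty group: 24 routing hops, nothing to deliver.
empty_group = ratio_of_total_hops([1] * 24, group_size=0)

# Cast to 1024 members: 24 routing hops, then one RP sends 1024 unicasts.
large_group = ratio_of_total_hops([1] * 24 + [1024], group_size=1024)
```

Under these assumptions the empty-group cast has an RTH of 24, while the large-group cast stays close to 1, which is consistent with RTH being dominated by routing cost when many groups have few or no members.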


Fig. 18. TING is less fair with heavy distribution, unless replication is also employed.


Fig. 19. POUTM decreases with registry distribution at the cost of increased PINM. Replication does not affect per cast metrics.

6. Related work

Many P2P systems have been proposed to support distributed keyword search queries [33,39]. Typically documents are partitioned by keyword, much like the tag partitioning used in our model. However, these systems are designed to return a list of relevant documents to the source of the query, and could not be directly adapted to efficiently handle multicasting to implicit groups.

Joung et al. [22] propose a system that stores document references at a single location in a DHT, similar to the way a summary’s location encodes its tags in SPICE. To resolve a query, a dynamically constructed hypercube containing only possible matches is flooded and results returned. Mirinae [14] uses a similar construct to route pub/sub events. These routing algorithms are both conceptually similar to ICE’s amortised routing.

SelectCast [11] allows a sender to multicast a message to peers matching an SQL-like query. The expression language supports aggregation functions (e.g., MAX or MIN) for specifying implicit groups of recipients but, unlike SPICE, relies on hierarchically structured domain “superpeers” responsible for maintaining aggregation information and forwarding messages to deeper domains.

mSearch [21] uses application-level multicast (Scribe [13]) to resolve keyword search queries in a P2P network. Each peer joins the set of multicast trees corresponding to its documents’ keywords. A conjunctive query is resolved by multicasting the request to the group of whichever is the least common term (i.e., the smallest group). Each peer in the group then performs a local document search. mSearch rendezvous peers can opt to classify a keyword as “too common”, disbanding its multicast tree. In cases where the query comprises only common keywords, the system resorts to a limited breadth-first search. This approach shares elements with SPICE’s registry distribution technique, although in SPICE, searches are constrained to rendezvous extents and guaranteed to find all summaries selected by a cast. Such a guarantee is not needed (or supported) in mSearch which only needs to find “enough” matches to satisfy a search query.

Beehive [31] is a replication framework for improving lookup performance in DHTs. It achieves this by proactively replicating objects along paths away from the rendezvous point such that future requests will encounter replicas first. This is similar to the way SPICE replicas are strategically placed on the surface such that casts may encounter them on their way to rendezvous extents.

7. Conclusion

We have formally described implicit group messaging, a push-based multicast scheme allowing publishers to deliver messages to implicit groups of consumers formed on demand by specifying their common characteristics. This paper has also introduced a generic analysis framework for measuring IGM implementations, and extended our innovative structured P2P model, SPICE, with registry distribution and replication algorithms which automatically adapt to the distribution of registered tags and cast frequencies. P2P designs must consider peer reliability. Often peers will fail or leave unexpectedly or behave maliciously so as to subvert the system. Like most structured networks, SPICE can respond to peer failure with self-stabilisation [35], by defining rules for converging illegitimate states to stable states in finite time. Likewise, malicious rendezvous peers that forward casts incorrectly, for example, may be detected and then excised by periodically casting messages to groups managed by peers suspected of malicious behaviour. These important issues remain as future work. Other future work will investigate how SPICE performs with a real data source and physical network topologies derived from actual measurements (analysed using the network facet). We expect the results to be consistent with those presented in this paper.



The load distribution techniques of SPICE are capable of delivering messages from publishers to dynamically specified implicit groups of any size with a reasonably low overhead. Through detailed simulation and analysis using realistically skewed data sources, we have shown that SPICE greatly curtails the maximum total incoming and outgoing peer load, and furthermore, distributes the load with exceptional fairness over all peers. Beyond Zipfian distributions of casts, SPICE is also thoroughly capable of dealing with extremely skewed cast distributions that would require a highly provisioned server in a client/server model.

Acknowledgements

The authors wish to acknowledge the support of the Smart Internet Technology CRC, and the constructive suggestions of Derek J. Corbett and the anonymous reviewers.

References

[1] Blogger. . March 2007.
[2] Flickr. . March 2007.
[3] Google. . March 2007.
[4] OhmyNews. . March 2007.
[5] Salam Pax. . March 2007.
[6] Yahoo! . March 2007.
[7] L.A. Adamic, B.A. Huberman, Zipf’s law and the Internet, Glottometrics 3 (2002) 143–150.
[8] X. Ao, N.H. Minsky, T.D. Nguyen, V. Ungureanu, Law-governed Internet communities, COORDINATION’00: Proceedings of the 4th International Conference on Coordination Languages and Models, Springer-Verlag, London, UK, 2000, pp. 133–147.
[9] S. Bhola, R.E. Strom, S. Bagchi, Y. Zhao, J.S. Auerbach, Exactly-once delivery in a content-based publish-subscribe system, DSN’02: Proceedings of the 2002 International Conference on Dependable Systems and Networks, IEEE Computer Society, Washington, DC, USA, 2002, pp. 7–16.
[10] B.H. Bloom, Space/time trade-offs in hash coding with allowable errors, Commun. ACM 13 (7) (1970) 422–426.
[11] A. Bozdog, R. van Renesse, D. Dumitriu, SelectCast: a scalable and self-repairing multicast overlay routing facility, SSRS’03: Proceedings of the 2003 ACM workshop on Survivable and self-regenerative systems, ACM Press, New York, NY, USA, 2003, pp. 33–42.
[12] A. Broder, M. Mitzenmacher, Network applications of Bloom Filters: a survey, in: Proceedings of 40th Annual Allerton Conference, October 2002.
[13] M. Castro, P. Druschel, A. Kermarrec, A. Rowstron, SCRIBE: A large-scale and decentralized application-level multicast infrastructure, IEEE J. Select. Areas Commun. (JSAC) 20 (8) (2002) 1489–1499.
[14] Y. Choi, D. Park, Mirinae: a peer-to-peer overlay network for large-scale content-based publish/subscribe systems, NOSSDAV’05: Proceedings of the international workshop on Network and operating systems support for digital audio and video, ACM Press, New York, NY, USA, 2005, pp. 105–110.
[15] D. Cutting, B. Landfeldt, A. Quigley, Implicit group messaging over peer-to-peer networks, in: A. Montresor, A. Wierzbicki, N. Shahmehri (Eds.), Sixth IEEE International Conference on Peer-to-Peer Computing (P2P2006), IEEE Computer Society, Cambridge, United Kingdom, September 2006, pp. 125–132.
[16] D. Cutting, B. Landfeldt, A. Quigley, A formal basis of implicit group messaging. Technical Report 616, School of Information Technologies, University of Sydney, Australia, August 2007.

[17] D. Cutting, B. Landfeldt, A. Quigley, ICE: a tesseral P2P substrate. Technical Report 617, School of Information Technologies, University of Sydney, Australia, August 2007.
[18] D. Cutting, B. Landfeldt, A. Quigley, SPICE: implicit group messaging on ICE. Technical Report 618, School of Information Technologies, University of Sydney, Australia, August 2007.
[19] C. Ding, C.-H. Chi, J. Deng, C.-L. Dong, Centralized content-based web filtering and blocking: How far can it go? in: IEEE International Conference on Systems, Man and Cybernetics (SMC), 1999, pp. 115–119.
[20] D. Gruhl, D. Liben-Nowell, R. Guha, A. Tomkins, Information diffusion through blogspace, SIGKDD Explor. 6 (2) (2004) 43–52.
[21] A. Gulati, A. Nandi, S. Ranjan, Efficient keyword search using multicast trees in structured P2P networks. Work in progress at Department of Computer Science, Rice University, TX, USA, 2006.
[22] Y.-J. Joung, C.-T. Fang, L.-W. Yang, Keyword search in DHT-based peer-to-peer networks, ICDCS’05: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05), IEEE Computer Society, Washington, DC, USA, 2005, pp. 339–348.
[23] M. Khambatti, Peer-to-peer communities: architecture, information and trust management. PhD thesis, Arizona State University, December 2003.
[24] R. Kumar, J. Novak, P. Raghavan, A. Tomkins, On the bursty evolution of blogspace, in: WWW’03: Proceedings of the 12th International Conference on World Wide Web, ACM Press, New York, NY, USA, 2003, pp. 568–576.
[25] J. Lee, An end-user perspective on file-sharing systems, Commun. ACM 46 (2) (2003) 49–53.
[26] C. Lindahl, E. Blount, Weblogs: simplifying web publishing, Computer 36 (11) (2003) 114–116.
[27] A. Mowshowitz, A. Kawaguchi, Bias on the web, Commun. ACM 45 (9) (2002) 56–60.
[28] G. Mühl, Large-Scale Content-Based Publish/Subscribe Systems, PhD thesis, Darmstadt University of Technology, Darmstadt, Germany, August 2002.
[29] B.A. Nardi, D.J. Schiano, M. Gumbrecht, Blogging as social activity, or, would you let 900 million people read your diary? CSCW’04: Proceedings of the 2004 ACM conference on Computer supported cooperative work, ACM Press, New York, NY, USA, 2004, pp. 222–231.
[30] T. Pitoura, N. Ntarmos, P. Triantafillou, Replication, load balancing and efficient range query processing in DHTs, in: Y.E. Ioannidis, M.H. Scholl, J.W. Schmidt, F. Matthes, M. Hatzopoulos, K. Böhm, A. Kemper, T. Grust, C. Böhm (Eds.), Proceedings of 10th International Conference on Extending Database Technology (EDBT06), Springer, Munich, Germany, 2006, pp. 131–148.
[31] V. Ramasubramanian, E.G. Sirer, Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays, NSDI’04: Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation, USENIX Association, Berkeley, CA, USA, 2004, pp. 99–112.
[32] S.P. Ratnasamy, A scalable content-addressable network. PhD thesis, University of California at Berkeley, 2002. Chair-Scott Shenker and Chair-Ion Stoica.
[33] P. Reynolds, A. Vahdat, Efficient peer-to-peer keyword searching, Proceedings of ACM/IFIP/USENIX International Middleware Conference (Middleware 2003), Springer, Rio de Janeiro, Brazil, 2003, pp. 21–40.
[34] S. Saroiu, K.P. Gummadi, R.J. Dunn, S.D. Gribble, H.M. Levy, An analysis of Internet content delivery systems. 5th Symposium on Operating Systems Design and Implementation SIGOPS, 36(SI) (2002) 315–327.
[35] M. Schneider, Self-stabilization, ACM Comput. Surv. 25 (1) (1993) 45–67.
[36] A. Varga, The OMNeT++ discrete event simulation system. in: Proceedings of the European Simulation Multiconference (ESM’2001), Prague, Czech Republic, June 2001.

D. Cutting et al. / Computer Communications 31 (2008) 437–451 [37] W3C. W3C workshop on push technology. , September 1997. [38] B.Y. Zhao, L. Huang, J. Stribling, S.C. Rhea, A.D. Joseph, J. Kubiatowicz, Tapestry: a resilient global-scale overlay for service deployment, IEEE J. Select. Areas Commun. 22 (1) (2004) 41–53. [39] M. Zhong, J. Moore, K. Shen, A.L. Murphy, An evaluation and comparison of current peer-to-peer full-text keyword search techniques, in: A. Doan, F. Neven, R. McCann, G.J. Bex (Eds.), Eighth International Workshop on the Web and Databases (WebDB 2005), Baltimore, Maryland, June 2005, pp. 61–66.

Daniel Cutting is a Ph.D. candidate at the University of Sydney, Australia, supervised by Drs. Quigley and Landfeldt. He received his bachelor's degree from the University of New South Wales in 1998 and holds an Australian Postgraduate Award and a Smart Internet Technology CRC scholarship. His research interests include P2P, operating systems and middleware.

Dr. Quigley is an academic member of the Systems Research Group of the School of Computer Science & Informatics, UCD. He is a member of Lero@UCD, the Irish Software Engineering Research Center, and is an IBM CAS visiting scientist and a director of the ODCSSS research internship program. His research interests include software engineering, pervasive computing, P2P and ad hoc networking, information visualisation and human-computer interaction. He has over 50 peer-reviewed publications including edited proceedings, book papers and research papers in leading journals, conferences and workshops. His current research team consists of nine and his research is supported by companies including IBM, Microsoft, Intel, HP, MERL and the Smart Internet CRC Australia. In addition, he has been awarded many competitive grants from the Science Foundation Ireland, Enterprise Ireland, EU FP6 (Marie Curie) and others. He serves as a journal editorial board member and has had leading roles in the organisation of seven international conferences and workshops: IBM CAS 2007; PERVASIVE 2006; CASCON Dublin 2006; GD 2005; ITI@AVI 2004; AIED 2004 and SoftVis 97. In addition, he has served as a member of 26 conference/workshop program committees. He holds a B.A. (Mod.) from Trinity College and a Ph.D. (Newcastle) in Computer Science, and is a member of the IEEE and ACM.

Dr. Landfeldt started his studies at the Royal Institute of Technology in Sweden. After receiving a B.Sc. equivalent, he continued studying at the University of New South Wales, where he received his Ph.D. in 2000. In parallel with his studies in Sweden, he ran a mobile computing consultancy company, and after his studies he joined Ericsson Research in Stockholm as a Senior Researcher, where he worked on mobility management and QoS issues. In November 2001, Dr. Landfeldt took up a position as a CISCO Senior Lecturer in Internet Technologies at the University of Sydney with the School of Electrical and Information Engineering and the School of Information Technologies. Dr. Landfeldt has been awarded 8 patents in the U.S. and globally. He has over 50 publications in international conferences, journals and books and has been awarded many competitive grants such as ARC Discovery and Linkage grants. Dr. Landfeldt is also a research associate of National ICT Australia (NICTA) and the Smart Internet CRC. Currently, he is serving on the editorial boards of international journals and as a program member of many international conferences, and is supervising 8 Ph.D. students. Dr. Landfeldt’s research interests include mobility management, QoS, performance-enhancing middleware, wireless systems and service provisioning.
