Journal of Network and Computer Applications 34 (2011) 1210–1224
Super-peer-based coordinated service provision

Meirong Liu a,*, Timo Koskela a, Zhonghong Ou b, Jiehan Zhou a, Jukka Riekki a, Mika Ylianttila a

a Computer Science and Engineering Laboratory, Department of Electrical and Information Engineering, University of Oulu, Finland
b Data Communications Software Laboratory, Department of Computer Science and Engineering, Aalto University, Finland
Article info

Article history: Received 10 April 2010; received in revised form 22 December 2010; accepted 21 January 2011; available online 31 January 2011

Keywords: Service provision; Coordination; Peer-to-peer; Super-peer

Abstract

Leveraging P2P technologies for Web service provision attracts considerable research interest. One of the challenges is how to enable service providers to adapt themselves in response to dynamic service demand. More specifically, one interesting research issue is coordinating the service groups in order to enable inter-group collaboration and resource sharing. In this paper, we propose a super-peer-based coordinated service provision framework (SCSP), consisting of an S-labor-market model (super-peer-based labor-market model), a recruiting protocol based on a weighting mechanism, and an optimal dispatch algorithm. In the SCSP, the S-labor-market model is designed to build the coordination among service groups by employing the proposed recruiting protocol. The optimal dispatch algorithm is designed to select the optimal service peers within a service group to process service requests. Finally, we perform simulations to evaluate the SCSP with four application scenarios. The experimental results show that our SCSP is efficient in coordinating the service groups, and possesses good scalability and robustness.
© 2011 Elsevier Ltd. All rights reserved.
* Corresponding author. Tel.: +358 468119788.
E-mail addresses: [email protected].fi (M. Liu), [email protected].fi (T. Koskela), zhonghong.ou@tkk.fi (Z. Ou), [email protected].fi (J. Zhou), [email protected].fi (J. Riekki), [email protected].fi (M. Ylianttila).
1084-8045/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.jnca.2011.01.007

1. Introduction

Web services have emerged as a popular middleware to offer dynamic integration and interaction of heterogeneous software artifacts (Alonso et al., 2004). Typical applications include enterprise integration, e-business, and Web applications such as weather forecasts and context-aware map services (Alonso et al., 2004). Web services are loosely coupled, reusable software components that encapsulate discrete functionality and are accessible through standard Internet protocols (Fensel and Bussler, 2002). The Web service architecture supports service provision, service registry, and service consumption.

One of the challenges in Web service provision is how to make Web service providers adapt to changes in dynamic service demand (Brazier et al., 2009; Chen et al., 2008; Cuenca-Acuna and Nguyen, 2004; Foster, 2005; Pacifici et al., 2005; Papazoglou et al., 2007). In particular, the question is how to enable service providers to automatically monitor their resources and tune themselves to meet end-user or business requirements on the quality of Web services (e.g. service response time) by deploying new service instances or removing existing ones. This challenge is widespread in application areas that are computationally intensive and have dynamic fluctuations in service demand, such as cosmology, climate, and computational grid services (e.g. Condors (Butt et al., 2006) and bioinformatics analysis (Chakravarti et al., 2005)). In these applications, on the one hand, simple Web services are required; on the other hand, a considerable amount of computation (i.e. a large number of Web services) is required to process the large sets of data (Foster, 2005). Furthermore, as Web service entities are autonomous and heterogeneous, connecting and coordinating them is a delicate and time-consuming task (Benatallah et al., 2002).

To address this challenge, some studies adopted a central server infrastructure for Web service provision (e.g. Chappell, 2004; Dhesiaseelan and Ragunathan, 2004). In these methods, a Web service container works as an administrative server to provide service components and to connect business services and low-level services. Some studies used clusters with resource managers to monitor and configure Web service provision dynamically (e.g. Pacifici et al., 2005; Whalley et al., 2006). Other studies applied peer-to-peer (P2P) technologies to Web service provision to increase scalability and robustness (e.g. Benatallah et al., 2002; Chen et al., 2008; Cuenca-Acuna and Nguyen, 2004; Gu and Nahrstedt, 2006; Kanellopoulos and Panagopoulos, 2008; Sioutas et al., 2008). In addition, some specifications also address the cooperation of Web services, such as WS-Coordination (Burdett and Kavantzas, 2004) and WS-Choreography (Cabrera et al., 2005).

In this paper, we propose a super-peer-based coordinated service provision framework (SCSP) to coordinate service groups and their service peers, which enables service groups to adapt to dynamic service demand. The SCSP framework is efficient in coordinating the
service groups, and possesses good scalability and robustness. Specifically, the SCSP framework consists of an S-labor-market model (super-peer-based labor-market model), a recruiting protocol based on a weighting mechanism, and an optimal dispatch algorithm. In addition, we measure the coordination effectiveness of our SCSP by examining two metrics for the service provision: the average response time and the resource efficiency. The response time is emphasized because we focus on minimizing the service response time experienced by users. Improving the resource efficiency, in turn, aims at utilizing the peers of the service groups to the best extent and reducing the cost of managing the service groups.

In the proposed SCSP framework, the S-labor-market model defines an abstract coordination model and employs the recruiting protocol to manage service groups. When the workload of a service group increases, the group uses the recruiting protocol to recruit peers from other service groups for processing service requests. Thus, the service response time of the service group is reduced by the recruited peers. When the workload of a service group decreases, the recruiting protocol enables the service group to dismiss peers, which improves the resource efficiency of the service group. The optimal dispatch algorithm selects optimal service peers within a service group for processing service requests, which further reduces the average service response time. In summary, the SCSP framework coordinates the service groups and their service peers efficiently to meet the dynamic service demand. Consequently, the average response time of the overall service groups is reduced, and the resource efficiency is kept high in all the service groups.
Overall, our SCSP has the following contributions:

(1) The SCSP introduces two core components, the S-labor-market model and a recruiting protocol based on a weighting mechanism, to coordinate service groups. They dynamically recruit service peers from one or more service groups to the other groups to reduce the service response time. In addition, they dismiss idle service peers within service groups to improve resource efficiency.
(2) The SCSP presents an optimal dispatch algorithm for super-peers to dispatch service requests, aiming to reduce the response time of the service groups further.
(3) The SCSP adopts a super-peer-based overlay for managing service groups in service provision. Thus, the scalability and robustness of the service provision are enhanced.

The rest of this paper is organized as follows: Section 2 reviews related work on Web service provision and P2P technologies. Section 3 introduces the proposed SCSP framework, including its architecture and components: the S-labor-market model, the recruiting protocol, and the optimal dispatch algorithm. Section 4 presents the experiments to evaluate the performance of the SCSP. Section 5 concludes the paper.
2. Related work

In this section, we briefly review some related work. We start with Web service provision and P2P technology, and then present the related work on Web service provision leveraging P2P technology. Some studies adopted central server infrastructure for Web service provision (e.g. Chappell, 2004; Dhesiaseelan and Ragunathan, 2004; Papazoglou and Van den Heuvel, 2007). In this infrastructure, the
Web service container works as a server to host multiple Web service components and provides facilities such as location, routing, service invocation, and lifecycle management. This approach is simple, but usually suffers from poor scalability and a single point of failure.

Other studies presented cluster architectures with resource managers for coordinated Web service provision (e.g. Pacifici et al., 2005; Whalley et al., 2006). Specifically, Pacifici et al. (2005) proposed a cluster-based framework to manage Web services in the face of fluctuating workloads. They utilized two types of resource managers, gateways and a global resource manager, to monitor the workload and schedule requests. The gateways implement local resource allocation mechanisms. The global resource manager solves an optimization problem and tunes the parameters for the gateways. Whalley et al. (2006) proposed a cluster framework with two types of resource managers (node group managers and a provision manager) to manage resources in an autonomic Web service system. Each node group manager allocates service processes and requests within its node group using modeling and optimization algorithms. The provision manager manages machine allocations for the node groups (which are managed by node managers) to balance their workload. The detailed comparison between the two works (Pacifici et al., 2005; Whalley et al., 2006) and our SCSP is presented in Section 3.5.

Some specifications were proposed for the cooperation of Web service provision, such as WS-Coordination (Burdett and Kavantzas, 2004) and WS-Choreography (Cabrera et al., 2005). WS-Coordination describes a generic framework for application services to create a shared context to propagate an activity to other services and to register for coordination protocols. WS-Choreography aims to create interoperable collaborations among different Web service parties by defining their common and complementary behaviors and tracking public message exchanges.
However, our work focuses on defining a mechanism to support cooperation in terms of sharing resources between service groups.

P2P technologies have been widely applied in many applications (e.g. file and video stream sharing) due to their scalability, self-organization, and adaptability (Androutsellis-Theotokis and Spinellis, 2004). In a P2P system, nodes organize themselves into network topologies for sharing resources. A P2P system does not require the intermediation or support of a global centralized server, but still maintains acceptable connectivity and performance. Most research on P2P networks concerns the following issues: (1) routing algorithms and resource discovery (e.g. Lv et al., 2002; Stoica et al., 2001; Yang and Garcia-Molina, 2002), (2) traffic optimization (e.g. Qiu and Srikant, 2004; Saroiu et al., 2002), (3) the overlay construction of distributed hash table (DHT) based structured systems (e.g. Stoica et al., 2001) and that of unstructured systems (e.g. Kwong and Tsang, 2008), and (4) security, reliability, and fault resiliency (e.g. Jain et al., 2007; Zhou and Hwang, 2007).

Applying P2P technologies to Web service provision is attracting increasing attention from researchers. For example, Cuenca-Acuna and Nguyen (2004) proposed a distributed resource management framework for self-managing federated service provision in the P2P environment. Service management agents and node monitoring agents were designed to monitor the state of service nodes and adjust the service nodes’ configuration (e.g. the number of services and their position). Chen et al. (2008) proposed a service provision framework to coordinate service groups to adapt to dynamic service demand. They introduced a DHT-based P2P overlay at the bottom of their framework. They presented a labor-market model and a recruiting protocol to build the coordination mechanism among different service groups. Since these two works (Cuenca-Acuna and Nguyen (2004) and Chen et al.
(2008)) bear some similarity to our SCSP, the
comparison between the two studies and our SCSP is detailed in Section 3.5 (after our SCSP is presented). Composed Web service provision using P2P technologies focuses on the provision of composed Web services without requiring a central authority (e.g. Gu and Nahrstedt, 2006; Goela et al., 2007; Ranjan et al., 2008; Votis et al., 2008). The target is to find a proper workflow to coordinate or integrate different peers distributed across multiple enterprises. These studies focused on data conversion rules between service peers, shared data repository design, and the provider (peers) selection policies. The SCSP does not address those issues, but focuses on how to efficiently share service peers among service groups. Service discovery using P2P technologies emphasizes the improvement of scalability and robustness in service lookup (e.g. Banaei-Kashani et al., 2004; Sioutas et al., 2008). These works utilized P2P technologies to distribute and discover services in a decentralized manner. Our work focuses on service peer management rather than service discovery.
3. The SCSP framework

This section introduces the SCSP framework, i.e. the architecture and its three components: the S-labor-market model, the recruiting protocol, and the optimal dispatch algorithm.

3.1. Architecture of the SCSP

Figure 1 illustrates the SCSP architecture. It consists of three layers: the super-peer layer, the service peer layer, and the network layer. Herein, the super-peer layer has two functions: (1) dispatch users’ service requests to service peers and (2) maintain the service peers in the service groups. The service peers that provide the same Web services are grouped as service groups (Le Fessant et al., 2004). Different super-peers manage different services, and one service group provides only one type of service that differs from the service provided in other groups. The service peer layer contains the service peers providing the services.

Fig. 1. SCSP architecture. [Figure not reproduced. Its labels show User-1 and User-2 sending requests to the super-peers of service groups 1–3 (super-peer layer), the service peers of each group (service peer layer), and the P2P overlay network (network layer). The legend distinguishes connections between super-peers, connections between super-peers and service peers, and peers running in the overlay.]

A
service peer belongs to one service group, provides the type of service in that service group, and is managed by the super-peer of that service group. A service peer provides only one type of service at a time and can provide a different service by quitting the current service group and joining another service group. Service groups cooperate with each other by sharing service peers to meet the dynamic service demand. The network layer is a P2P overlay that provides basic network facilities, such as service lookup and communication. In this paper, we focus only on the super-peer layer and the service peer layer. More studies on the P2P overlay can be found in Androutsellis-Theotokis and Spinellis (2004) and Lua et al. (2005).

A coordinated system based on the SCSP architecture in Fig. 1 both handles users’ requests and coordinates the service groups. On the one hand, when a super-peer receives a service request from an end user, the super-peer dispatches the request to one of its member service peers using the optimal dispatch algorithm. The service peer processes the service request and returns the results to the end user. The details of routing service requests to super-peers are not discussed in this paper; related studies can be found in Lua et al. (2005) and Yang and Garcia-Molina (2003). On the other hand, the service groups utilize the S-labor-market model and the recruiting protocol to coordinate their peers, aiming to reduce the service response time and to improve the resource efficiency. The S-labor-market model defines two events (the service change event and the resignation event) that are performed by service groups for coordination. The recruiting protocol defines how to perform the S-labor-market events with policies.

The coordination between service groups aims to let service groups share their service peers to meet dynamic service demand.
The coordination is initiated by a service peer applying a service change event in the S-labor-market model. Each service peer applies a service change event at regular intervals. The decision on the service change application is made by using the recruiting protocol. If this application is approved, the service peer joins another service group. For example, suppose that group A is mostly idle and group B is under heavy workload. Group B performs coordination using the recruiting protocol to recruit a service peer from group A. If the service peer is approved to leave through negotiation with the super-peer of group A, the service peer is recruited into group B and provides the service managed by group B. Thus, the service peer is shared between groups A and B through coordination.

The coordination within service groups aims to make full use of the service peers in the service groups to reduce service response time and improve resource efficiency. It is conducted by service peers performing the resignation event (defined in the S-labor-market model) and super-peers performing the optimal dispatch algorithm.

The reasons for adopting a super-peer overlay for the coordinated service provision shown in Fig. 1 are as follows. The super-peer-based approach provides an effective way to run applications in a network where nodes’ capabilities are heterogeneous (Li et al., 2005; Montresor, 2004). The super-peer approach combines the robustness and scalability of the P2P approach with the search efficiency of the centralized approach (Yang and Garcia-Molina, 2003). Scalability and robustness are crucial requirements, especially in computationally intensive applications that need a huge number of services.

3.2. The S-labor-market model

The market model has been widely applied to regulate the coordination among the distributed entities (Buyya et al., 2000;
Chen et al., 2008; Li and Li, 2009; Hausheer and Stiller, 2005), since it provides an analogy between coordinated systems and the social recruiting structure. We design an S-labor-market model to define the operations that enable service groups to conduct coordination. The S-labor-market model features two market roles and two events. One role is the recruiter, taken by super-peers, which is responsible for recruiting service peers from other service groups. The other role is the employee, taken by service peers, which is in charge of processing service requests. The two events are the service change event and the resignation event.

Before presenting the events in the S-labor-market model, we define the notations used in the events as follows:

Service group is a peer group providing a type of service and processing service requests. It is expressed as $SG_i = \langle SP_i, S_i, \{p_{ij}\}, ST_i, N_i \rangle$, $i \in [0, n-1]$, $j \in [1, m]$, where $n$ is the number of service groups and $m$ is the number of all the service peers. In a service group $SG_i$:

(1) $SP_i$ is a super-peer. Each service group $SG_i$ has only one super-peer.
(2) $S_i$ is the only type of service provided in $SG_i$. This service is self-contained and independent of other types of services. The service provided in one service group differs from that provided in other service groups.
(3) $p_{ij}$ is a service peer that provides service $S_i$ in the group $SG_i$, where $i$ is the ID of the group $SG_i$ and $j$ is the unique ID of the service peer. $p_{ij}$ has two features, capability and state, i.e. $p_{ij} = \langle C_{p_{ij}}, \{\text{busy} \mid \text{idle} \mid \text{resigned}\} \rangle$. Here, $C_{p_{ij}}$ denotes the capability of $p_{ij}$, which is a combination of three types of resource metrics that $p_{ij}$ provides: CPU cycles, bandwidth, and storage. That is, $C_{p_{ij}} = \sum_{k=1}^{K} (v_k w_k)$, where $K$ is the number of resource types and $K = 3$; $v_k$ is the value of the $k$th resource metric; $w_k$ is the weight of the $k$th resource metric and $\sum_{k=1}^{K} w_k = 1$. The value of $w_k$ can be set empirically according to the requirements of specific applications. For example, if a specific application has a higher requirement on the bandwidth, the bandwidth weight could be set larger than the other two weights (e.g. the weight for CPU cycles $w_1 = 0.3$, the weight for bandwidth $w_2 = 0.4$, and the weight for storage $w_3 = 0.3$). The busy state denotes that the peer $p_{ij}$ is processing a service request. The idle state denotes that $p_{ij}$ provides a service but is not processing a service request at the moment. The resigned state denotes that $p_{ij}$ does not provide a service. We assume that one service peer $p_{ij}$ provides only one type of service at a time and processes only one service request at a time as well.
(4) $ST_i$ denotes the state of service group $SG_i$. It has three possible values: idle, busy, and resigned. If there is at least one idle service peer in $SG_i$, $ST_i$ is idle. If every peer is busy, $ST_i$ is busy; otherwise, $ST_i$ is resigned. There is only one resigned service group, $SG_0$, which manages all the resigned service peers.
(5) $N_i$ is the cardinality of the set $\{p_{ij}\}$ in $SG_i$, and $m = \sum_{i=0}^{n-1} N_i$.

The contributed resource of a service peer, $CR_{p_{ij}}$, is defined as the resource of the service peer $p_{ij}$ that is actually utilized for processing service requests per unit time. $CR_{p_{ij}}$ is measured in terms of $C_{p_{ij}}$ (the capability of $p_{ij}$).

The idle time of a service peer, $T^{idle}_{p_{ij}}$, is defined as the amount of time that $p_{ij}$ is idle during a period of time.

Response time of a service group to a service request, $SRRT_{SG_i}$, is the average value of the response times of all the service requests received by all the service peers in $SG_i$ within a period of time.

Resource efficiency of a service group is defined as the fraction of the resources actually utilized in the service group $SG_i$ to process service requests per unit time. It indicates the current service processing workload of $SG_i$ and is expressed as follows:

$$\frac{\sum_{j=1}^{N_i} CR_{p_{ij}}}{\sum_{j=1}^{N_i} C_{p_{ij}}} \qquad (1)$$

where $CR_{p_{ij}}$ is the contributed resource of $p_{ij}$ and $C_{p_{ij}}$ is the capability of $p_{ij}$.

Average processing capability of a service group is the average value of all the service peers’ capabilities in a service group.

The two events in the S-labor-market model are presented as follows:

(1) Service change: after staying in a service group for a specific time period, every service peer in the SCSP can apply to change the service it provides. If a service peer’s service change application is approved, the service change event happens. In brief, the event is composed of the following steps:
(a) a service peer $p_{ij}$ sends the service change application to its super-peer $SP_i$;
(b) $SP_i$ communicates with other super-peers, selects service groups (with heavy workload) that $p_{ij}$ might join, and sends the service change application to these service groups;
(c) the selected service groups that receive the application make an employment decision;
(d) $SP_i$ makes an initial decision and then negotiates with $p_{ij}$ for the final decision based on other groups’ employment decisions, the incentive mechanism, and the current group’s workload;
(e) $p_{ij}$ takes action based on the negotiated service change decision: either join a new service group or continue staying in the current service group.
The detail of the incentive mechanism is not discussed in the paper. Note that to guarantee that $p_{ij}$ is managed by one service group, $SP_i$ must make the service change decision (step (d)) after the employment decision is made (step (c)). Otherwise, $p_{ij}$ could end up not being managed by any service group. This happens if $p_{ij}$ first leaves the current group (step (e)) and is then declined by the super-peer $SP_k$ of the target group (step (c)).
(2) Resignation: a service peer $p_{ij}$ can apply to terminate its service provision after it has served for a certain time period. $p_{ij}$ sends a resignation application to its super-peer $SP_i$ when $p_{ij}$ wants to resign. $SP_i$ makes an initial decision and then negotiates the final decision with $p_{ij}$ considering the group’s workload. $p_{ij}$ takes action based on the negotiated decision: either resign or continue staying in the group. Specifically, if $SP_i$ agrees to let $p_{ij}$ leave, $p_{ij}$ resigns. If $SP_i$ wants $p_{ij}$ to stay considering the group’s workload, $SP_i$ sends $p_{ij}$ a message inviting $p_{ij}$ to stay for a longer time with an incentive mechanism. At this point, (1) if $p_{ij}$ insists on the resignation, $p_{ij}$ sends a message insisting on resignation to $SP_i$, and $SP_i$ agrees to let $p_{ij}$ resign; (2) if $p_{ij}$ agrees to stay for a longer time considering the incentive mechanism, $p_{ij}$ sends a message extending its staying time to $SP_i$ and continues staying in the current service group.

If $p_{ij}$ leaves abruptly for some accidental reason (e.g. power off or network disconnection), $SP_i$ removes $p_{ij}$ from its list of group members. Service peers decide autonomously whether to stay in the super-peer overlay, and the super-peers manage the service provision of these service peers in the service groups.

The reasons for maintaining a resigned group $SG_0$ are as follows: (1) to manage the resigned peers efficiently and (2) to improve the resource efficiency of the other service groups. Service requests are not routed to the resigned peers in $SG_0$ because these peers are not active. Furthermore, the service peers
in the service groups (except SG0) can, in general, handle service requests efficiently through coordination (i.e. a service group recruits service peers from the other groups when it needs more service peers). Only when all the service groups are under heavy workload and cannot handle service requests efficiently are the peers in SG0 recruited into the heavily loaded service groups to process service requests again.
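
As an illustration only, the notations of this section (the capability of a service peer, the state of a service group, and the resource efficiency of Eq. (1)) can be sketched in Python. The names ServicePeer, capability, group_state, and resource_efficiency, the assumption that the three metric values are already normalized to a common scale, and the default weights (taken from the bandwidth-heavy example above) are choices made for this sketch and are not part of the SCSP definition.

```python
from dataclasses import dataclass
from enum import Enum


class PeerState(Enum):
    BUSY = "busy"
    IDLE = "idle"
    RESIGNED = "resigned"


@dataclass
class ServicePeer:
    # Hypothetical container; metric values are assumed to be normalized.
    cpu: float          # v_1: CPU cycles
    bandwidth: float    # v_2: bandwidth
    storage: float      # v_3: storage
    contributed: float  # CR_p: resource actually used per unit time
    state: PeerState = PeerState.IDLE


def capability(peer, weights=(0.3, 0.4, 0.3)):
    """C_p = sum over k of v_k * w_k, where the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    values = (peer.cpu, peer.bandwidth, peer.storage)
    return sum(v * w for v, w in zip(values, weights))


def group_state(peers):
    """ST_i: idle if any peer is idle, busy if every peer is busy, else resigned."""
    if any(p.state is PeerState.IDLE for p in peers):
        return PeerState.IDLE
    if peers and all(p.state is PeerState.BUSY for p in peers):
        return PeerState.BUSY
    return PeerState.RESIGNED


def resource_efficiency(peers):
    """Eq. (1): sum of contributed resources divided by sum of capabilities."""
    total_capability = sum(capability(p) for p in peers)
    if total_capability == 0:
        return 0.0  # assumption: an empty or zero-capability group has efficiency 0
    return sum(p.contributed for p in peers) / total_capability
```

Under these assumptions, a group whose peers contribute half of their combined capability has a resource efficiency of about 0.5, which a super-peer could report when neighboring groups sort candidates by resource efficiency.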
3.3. The recruiting protocol

To manage the S-labor-market model events above, we propose a recruiting protocol that uses a set of policies and introduces a weighting mechanism. Using the weighting mechanism, service groups with heavy workload are given priority to recruit service peers. The recruiting protocol tells service groups how to perform the S-labor-market model events.
Combining the S-labor-market model and the recruiting protocol enables service groups to self-organize their resources by recruiting more service peers when the demand increases and dismissing service peers when the demand decreases. In other words, the SCSP proposes a coordination process for distributing the peers among service groups. Table 1 summarizes all the basic operations used in Figs. 2–4. Figure 2 shows that the recruiting protocol performs S-labormarket model events. The recruiting protocol consists of two subprotocols: the service change sub-protocol (as shown in Fig. 2(a)), and the resignation sub-protocol (as shown in Fig. 2(b)). In brief, in Fig. 2(a), a service peer pij sends a service change application to its super-peer SPi. SPi calls the policy of sending the service change application to select service groups and sends out the service change application (e.g. to SPk, SPm, and SPn). The selected service groups (e.g. SPk) that receive the application call the employment policy to make an employment decision. After this,
Table 1 Basic operations in the recruiting protocol. Operation
Description
AddMember(SPi, pij) CollectNeighborsbyResEff(SPi, nNG)
The super-peer SPi adds the peer pij into its service group. SPi queries its neighbor groups and obtains a set of sorted list of service groups based on their resource efficiencies, and nNG is the number of neighbor groups. The super-peer SPi gets the capability of pij. SPi gets the contributed resource of pij.
GetCapability(SPi, pij) GetContriRes(SPi, pij) GetIdleTime(SPi, pij) GetLowestCapPeer (SPi, {pij}) GetMedianCapPeer(SPi, {pij}) GetMedianResPeer(SPi, {pij}) GetMemberPeersbyCap(SPi, {pij}) GetPeerWithShortestQueue(SPi,Pix ) GetResponseTime(SPi,pi,x,R); IsExtGoupEmpyAppr(SPi, pij) IsOnlyPeerinGroup(SPi, pij) MakeEmpyDecision(SPi, pij) MakeInitResgDecision(SPi, pij ) MakeInitServChangeDecision(SPi, SGr, pij) RemoveMember(SPi, pij) RetrieveRequiredResource(SPi, R) SendServChangeApp(SPi, SGr,pij) UpdateMembership(pij, SGi, SGr)
SPi gets the idle time of the peer pij. SPi gets the service peer with the lowest capability. SPi gets the median peer from the set {pij} according to their capabilities. SPi gets the median peer from the set {pij} according to their contributed resources. SPi gets sorted member peers based on their capabilities, and returns a sorted set of peers {pij}. SPi computes the peer that has the shortest queue in the peer set of Pix and returns that peer. SPi queries the response time of service request R that is processed by the peer pi,x SPi checks whether there exists one service group that approves employing pij, and returns that group (if such a group exists) or null. SPi checks whether there is only pij in SGi, and returns true or false. SPi decides whether to approve employing pij, and returns true or false. SPi decides whether should approve pij’s resignation, and returns true or false. SPi decides whether should approve pij to join the group SGr, and returns true or false. The super-peer SPi removes the peer pij from its service group. SPi queries the required resource for handling the service request R, and returns the value of the required resource CR SPi sends pij’s service change application to a service group SGr. pijupdates its membership from SGi to SGr.
Fig. 2. Recruiting protocol performs S-labor-market model events. (a) The service change sub-protocol performs a service change event. (b) The resignation sub-protocol performs a resignation event.
M. Liu et al. / Journal of Network and Computer Applications 34 (2011) 1210–1224
1215
Fig. 3. (a) Policy of sending the service change application, (b) employment policy, (c) policy of service change decision, and (d) resignation policy.
the super-peer SPi calls the policy of service change decision to make an initial service change decision and then negotiates the final service change decision with pij. pij takes action according to the negotiated service change decision. Specifically, if at least one service group agrees to employ pij but SPi wants pij to stay because of the workload, SPi sends pij a message inviting it to stay for a longer time, supported by an incentive mechanism. At this point, (1) if pij insists on changing service provision, it sends SPi a message saying so, SPi allows pij to change service provision and join another service group, and the negotiated approval decision on the service change is made; (2) if pij agrees to stay for a longer time, the negotiated rejection decision on the service change is made.

Figure 3 describes the four policies: the policy of sending the service change application, the employment policy, the policy of service change decision, and the resignation policy. These policies are used in Fig. 2 for making initial decisions on performing the S-labor-market model events. Each policy is expressed as a precondition-action statement. Super-peers check the preconditions periodically. When a precondition is true, the super-peer performs the corresponding actions to generate the decision for performing the event (e.g. resignation).
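As a rough illustration of the precondition-action form, the sketch below expresses the employment policy and the policy of service change decision as Python predicates and adds the negotiation step between SPi and pij. The ratio tests follow the textual description of Fig. 3(b) and (c) in this section, and the parameter values are those reported in Section 4.1. The way the weight factor enters the comparison, the use of the lower median, and all function and field names are assumptions made for this example; they are not taken from Fig. 3.

```python
from dataclasses import dataclass

OMEGA = 1.2      # weight factor (> 1), value reported in Section 4.1
THETA_CAP = 0.9  # capability threshold, Section 4.1
THETA_IDL = 0.5  # idle time threshold, Section 4.1


@dataclass
class Peer:
    capability: float
    idle_time: float


def median_peer(peers):
    """Lower median by capability (the exact median rule is an assumption)."""
    ranked = sorted(peers, key=lambda p: p.capability)
    return ranked[(len(ranked) - 1) // 2]


def approve_employment(candidate, group, group_busy):
    """Employment policy: should the group SG_r employ the candidate?"""
    if group_busy:
        ref = median_peer(group)
    else:
        ref = min(group, key=lambda p: p.capability)  # lowest-capability peer
    weight = OMEGA if group_busy else 1.0  # assumed: weight scales the ratio
    return weight * candidate.capability >= THETA_CAP * ref.capability


def approve_leave_initial(candidate, group, target_busy):
    """Policy of service change decision: initial decision of SP_i."""
    ref = median_peer(group)
    weight = OMEGA if target_busy else 1.0
    # Written without division so that a zero idle time of ref is harmless.
    return weight * candidate.idle_time >= THETA_IDL * ref.idle_time


def negotiate(group_found, initial_approve, peer_insists):
    """Final decision: True means the peer changes its service provision."""
    if not group_found:      # no group approved employing the peer
        return False
    if initial_approve:      # SP_i already agrees (assumed to be final)
        return True
    return peer_insists      # SP_i invited the peer to stay; the peer decides
```

For example, with a group of three peers whose capabilities are 10, 5, and 2, a candidate of capability 4 passes the busy-group test (1.2 × 4 ≥ 0.9 × 5), whereas a candidate of capability 3.5 does not.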
Figure 3(a) shows the policy of sending the service change application. The application is sent to the service groups with a high workload because they are more likely to employ the service peer. Thus, in step 2, the top half of the ranked neighbor service groups (i.e. r = [1, nNG/2]) are selected as the groups with a relatively high workload.

Figure 3(b) shows the employment policy. A reference peer pref is used to determine the minimum capability required for a peer (e.g. pij) to be employed. In step 1, pref is selected as follows. When SGr is busy, the median peer is selected as pref. The reason is that, if pij’s capability is higher than the median value, pij can contribute more (with a relatively higher capability) to processing requests for the busy group SGr once it is employed. When SGr is idle, we select pi,low (the service peer with the lowest capability) as pref. This is because pij can contribute to processing requests only if its capability is higher than that of pi,low. This constraint decreases the number of idle service peers that are unable to contribute to processing requests in SGr and hence improves the resource efficiency of SGr. In step 2, a weight factor ω (ω > 1) is used to increase the chance of a busy service group employing pij. As shown in step 3, the capability threshold Θcap is used to judge whether the ratio (θcap) of pij’s capability to pref’s capability is large enough. If this is the case, SGr approves employing pij. Note that Θcap affects the response time and the resource efficiency of service groups. If Θcap becomes smaller, the response time of some service groups might become shorter, and vice versa. This is because more service peers would be approved to join the service groups when Θcap becomes smaller.

Figure 3(c) shows the policy of service change decision. The focus of this policy is to make an initial decision on whether a peer pij should be allowed to leave its current group SGi when some other groups have approved employing pij. The median peer in SGi is set as a reference peer pref, which is used to determine the minimum idle time that pij must satisfy when it leaves SGi. If a group (e.g. SGr) that approves employing pij is busy, a weight factor ω (ω > 1) is used to increase the chance that pij is approved to leave SGi and join SGr in the initial decision. The idle time threshold Θidl is used to judge whether the ratio (θidl) of pij’s idle time to that of pref is large enough. If this is the case, pij is allowed to change the service provision in the initial decision. The variation in Θidl affects the performance of the service groups. If Θidl becomes smaller, the resource efficiency of the service groups would be improved and, in parallel, the difference between the service groups’ response times would become smaller, and vice versa. This is because, when Θidl becomes smaller, more service peers with a relatively long idle time would be approved to join other service groups, and the resource efficiency in the current group is improved.

Figure 3(d) presents the resignation policy. Both the capability threshold Θcap and the resource threshold Θres are used to help SGi make an initial decision on whether to let pij resign. Two reference peers, pref1 and pref2, are obtained in step 2. According to pref1 and pref2, we can judge (1) whether the capability of pij is low enough and (2) whether pij’s contributed resource is small enough. If both conditions are met, pij is permitted to resign in step 4. The variation in Θres has an impact on the service groups’ performance. If Θres becomes larger, the resource efficiency of the service groups would be improved, but the response time would become longer, and vice versa. This is because, when Θres becomes larger, more service peers with a relatively high contribution would resign, and the number of service peers available for processing service requests in the service groups would decrease.

3.4. The optimal dispatch algorithm

Fig. 4. Optimal dispatch algorithm executed by a super-peer SPi.
To further reduce the service response time and improve resource efficiency within a service group, we design an optimal dispatch algorithm. The super-peers use this algorithm to dispatch service requests to optimal service peers, as shown in Fig. 4. Because the service peers’ capabilities are heterogeneous, we categorize them into three classes: C(h) (the high capability class), C(m) (the middle capability class), and C(l) (the low capability class). The capability variance within each class is much smaller than the capability variance between the classes. The details of how to classify the peers are presented in Section 4.1.

Step 2 is executed when the service group SGi is idle. This step aims to find the idle service peer whose capability is the closest to the resource requirement of the service request R, and to use it as the optimal service peer. The step starts by calculating the class of service peer capability (i.e. C(x), x ∈ {l, m, h}) that is needed to process the service request. Cmax(x) (x ∈ {l, m, h}) denotes the upper bound of C(x). The capability class Cy is the one with the smallest difference to CR (the capability needed to process the request R). After calculating the capability class Cy, the service peer fulfilling the following conditions is returned as the optimal one (if such a peer exists): (1) the peer is idle, (2) the peer belongs to the capability class Cy, and (3) the peer’s capability is the highest among the peers that fulfill conditions (1) and (2). Hence, the capability of the optimal service peer is utilized to the largest extent and the resource efficiency of the service group is improved. Furthermore, the response time is reduced by selecting the service peer with the highest capability in Cy as the optimal one for processing requests.

Step 3 is executed when the service group SGi is busy. The basic idea of this step is to find the service peer that provides the shortest response time as the optimal service peer. Here, we define the sets Pih, Pim, and Pil as follows: Pih = {pij | Cpij ∈ Ch, pij ∈ SGi}; Pim = {pij | Cpij ∈ Cm, pij ∈ SGi}; and Pil = {pij | Cpij ∈ Cl, pij ∈ SGi}. They denote the three sets of service peers with high, middle, and low capability in SGi. pi,x is the service peer that can process the request R in the shortest time within the set Pix (x ∈ {l, m, h}). More specifically, if the set of service peers Pix is empty, pi,x is set to null, as no service peer in an empty set can provide the shortest response time, and the response time of pi,x (i.e. SRRTpi,x) is set to infinity accordingly. However, if Pix is not empty, the peer that has the shortest service request queue in Pix is taken as pi,x, the peer with the shortest response time in Pix. The reason is that the shortest request queue means the shortest waiting time.

3.5. Comparison with existing methods

In this section, we compare our SCSP in detail with the existing related methods of Pacifici et al. (2005), Whalley et al. (2006), Cuenca-Acuna and Nguyen (2004), and Chen et al. (2008) in terms of architecture, solution, type of services, and performance, as shown in Tables 2 and 3. As shown in Table 2, first, both Pacifici et al. (2005) and Whalley et al. (2006) utilized a cluster architecture for Web service provision. The cluster in Pacifici et al. (2005) consists of a set of servers, gateways, and a global resource manager. The cluster in Whalley et al. (2006) is composed of a provisioning manager, a set of node group managers, and servers. Second, both works had a global resource manager (i.e. the global resource manager and the provisioning manager, respectively) to maintain the workload of all the servers and compute the server allocation in a cluster. Cuenca-Acuna and Nguyen (2004), in turn, proposed a P2P-based framework to automatically adjust the number and placement of service components. Their P2P framework consists of peers, per-node monitoring agents, and per-service management agents. Each per-node monitoring agent publishes the peer’s state. Each per-service management agent manages a service. Meanwhile, each per-service management agent searches for a better configuration for the service using a genetic algorithm, and decides the service allocation (i.e. stop or migrate the current service, or spawn a new instance). Finally, all three of these works focus on the provision of one type of service. In contrast, our SCSP utilizes a super-peer-based P2P architecture, which consists of super-peers and service peers. Super-peers manage the service groups and exchange the groups’ workload status in gossip messages in a distributed way. That is, no central
Table 2 Comparison between our SCSP and related works.
Methods | Architecture | Solution: resource manager | Solution: resource management | Type of services
Pacifici et al. (2005) | A cluster | Global resource manager | (1) Maintain the workload; (2) compute the servers allocation | One type
Whalley et al. (2006) | A cluster | Provisioning manager | (1) Maintain the workload; (2) compute the servers allocation | One type
Cuenca-Acuna and Nguyen (2004) | Peer-to-peer overlay | Each per-service management agent | (1) Manage a service; (2) search better configuration using a genetic algorithm; (3) decide a service allocation | One type
Our SCSP | Super-peer-based P2P overlay | Super-peers in the service groups | (1) Manage service groups; (2) exchange groups’ workload status in gossip message; (3) coordinate peers using a recruiting protocol | More than one type of services
Table 3 Comparison between our SCSP and Chen et al. (2008).
Aspect | Chen et al. (2008) | Our SCSP
Architecture: P2P overlay | DHT based | Super-peer based
Architecture: service groups | Service peers | Super-peers and service peers
Solution: market model, recruiter | Each service peer | Super-peers
Solution: market model, employee | Each service peer | Service peer
Solution: recruiting protocol, service change application | A service peer selects other service groups and sends the groups the application | A super-peer selects other super-peers and sends the super-peers the application
Solution: recruiting protocol, make recruiting decision | Service peers use the extremal optimization and stimulus–response mechanism | Super-peers use a simple weighting mechanism for initial decisions and make negotiated decisions
Solution: recruiting protocol, communication | Communicate with other peers within the group | Communicate with a service peer for negotiation
Solution: request dispatch | Randomly assign service requests | Using an optimal dispatch algorithm
Performance: SRRT (5 service groups with 30 peers) | 27.00 to 46.60 s | 21.00 to 33.00 s
resource manager is required in the SCSP. Furthermore, the super-peers coordinate the allocation of service peers among service groups in a decentralized way using the proposed recruiting protocol. The SCSP supports the provision of more than one type of service.

The comparison between our SCSP and the work of Chen et al. (2008) is shown in Table 3. Specifically, (1) for the architecture, the latter applied a DHT-based P2P overlay framework, where each service peer takes the same responsibility and the service groups consist of service peers. In contrast, our SCSP uses a super-peer-based P2P overlay, where super-peers take more responsibility than service peers and the service groups consist of super-peers and service peers. Adopting the super-peer overlay makes use of the heterogeneity of peers’ capabilities, which shortens the time for making decisions and reduces communication traffic. (2) For the market model, the latter proposed a labor-market model, in which each service peer takes both the role of recruiter (recruiting or dismissing peers) and the role of employee (processing service requests). In contrast, our SCSP proposes an S-labor-market model, in which super-peers take the role of recruiter, due to their high capacity, and service peers take the role of employee. Consequently, the coordination between service groups is performed in different ways. (3) For the recruiting protocol, in the latter work, a service peer selects service groups, sends the service change applications to these groups, and utilizes extremal optimization and a stimulus–response mechanism for making the recruiting decision. In our SCSP, a super-peer selects a few other super-peers and sends them the service change application. Our SCSP uses a simple weighting mechanism for making an initial recruiting decision and then makes a negotiated decision with the service peer. In addition, each service peer in the latter work makes decisions through communication with other peers within the service group. In our SCSP, a super-peer negotiates with a service peer for the final negotiated decision, which takes into account the service peer’s unilateral action (e.g. insisting on the resignation). (4) For dispatching requests, the latter randomly assigned service requests, whereas our SCSP uses an optimal dispatch algorithm. (5) For the performance, our SCSP achieves a much shorter service response time and comparable resource efficiency compared to the latter according to the experimental results (presented in Section 4).
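To make the contrast in item (4) more concrete, the following sketch places a random dispatcher next to a simplified version of the dispatch rule described in Section 3.4. It is only an illustration under stated assumptions: the class upper bounds in C_MAX are made-up toy values, the class closest to CR is taken as the one whose upper bound minimizes the absolute difference, and the response time estimate (queued requests plus the new one, each costing CR, divided by the peer's capability) is a hypothetical stand-in for the SRRT query of the super-peer.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    capability: float
    cls: str        # capability class: "l", "m" or "h"
    queue: int = 0  # number of waiting requests; 0 means the peer is idle


# Hypothetical upper bounds Cmax(x) of the three capability classes.
C_MAX = {"l": 3.0, "m": 7.0, "h": 12.0}


def estimated_srrt(peer, c_r):
    """Assumed estimate: each queued request and the new one cost c_r."""
    return (peer.queue + 1) * c_r / peer.capability


def dispatch_idle_group(peers, c_r):
    """Idle group: best idle peer in the class closest to the demand c_r."""
    y = min(C_MAX, key=lambda x: abs(C_MAX[x] - c_r))
    candidates = [p for p in peers if p.queue == 0 and p.cls == y]
    return max(candidates, key=lambda p: p.capability, default=None)


def dispatch_busy_group(peers, c_r):
    """Busy group: shortest-queue peer per class, then the fastest of them."""
    best, best_time = None, math.inf
    for x in ("h", "m", "l"):
        members = [p for p in peers if p.cls == x]
        if not members:
            continue  # empty set: p_{i,x} is null and its SRRT is infinite
        p_x = min(members, key=lambda p: p.queue)
        t = estimated_srrt(p_x, c_r)
        if t < best_time:
            best, best_time = p_x, t
    return best


def dispatch_random(peers, rng):
    """Baseline used for contrast: any peer, regardless of class or queue."""
    return rng.choice(peers)
```

In this toy setting a request with CR = 6 sent to an idle group goes to the most capable idle peer of the middle class, while the random baseline may pick a low capability peer that needs several times longer for the same request.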
4. Performance evaluation

In this section, we evaluate the performance of the proposed SCSP. First, we introduce the setups of the experiments. We then conduct experiments under four different application scenarios to evaluate the SCSP from three aspects: the efficiency of the coordination, the scalability, and the robustness. In what follows, we use the term peers to denote both the super-peers and service peers for simplicity.

4.1. Experimental setups

We evaluate the SCSP in a P2P simulator called PeerSim (Jelasity et al., 2006). Two performance metrics defined in Section 3.2 are examined: (1) the response time of a service group to a service request (i.e. SRRTSGi) and (2) the resource efficiency. The settings of the service peers, the service groups, and the S-labor-market model events for the simulation are presented below.

The settings of the peers include their classification and capability distributions. First, as mentioned in Section 3.4, peers are classified into three categories Ch, Cm, and Cl according to their capabilities. We assume that the peers’ capabilities follow the normal distribution. For each category C(x) (x ∈ {l, m, h}), there is a mean value μC(x) and a standard deviation σC(x). We say that a service peer pij ∈ C(x) if Cpij has the highest probability of falling into the category C(x). The value settings for the three categories in our simulation are shown in Table 4. For example, for category C(h), the mean value μC(h) is 10 × 10^4 and the standard deviation σC(h) is 10 × 10^3. In our simulation, the service peers are randomly assigned to C(h), C(m), and C(l) due to the heterogeneity of service peers’ capabilities. However, the super-peers are only assigned to C(h) because of their high capabilities. Second, the service peers’ capabilities follow the distributions from the second to the fourth rows in Table 4. The super-peers’ capabilities follow the distribution shown in the second row of Table 4.

Table 4 Capability distributions of the peers.
Category | Mean value (×10^4) | Deviation (×10^3) | Percentage (%)
High category (Ch) | 10 | 10 | 30
Middle category (Cm) | 5 | 8 | 40
Low category (Cl) | 2 | 3 | 30

Five service groups used in the experiments are defined in Table 5. Each group provides a service that differs from the services provided by the other groups. For example, SG1 provides the service S1. All the peers are randomly assigned to one of the service groups. The number of service requests that each service group receives is listed in the third column of Table 5. The amount of resources consumed for processing each type of service request follows the normal distribution. The mean values and standard deviations of the normal distributions for the different types of service requests are shown in the fourth and fifth columns of Table 5. In addition, the arrivals of the service requests are assumed to follow the Poisson distribution with rate λ. Different values of λ simulate varying service demands. Note that, as shown in Table 5, there is a clear difference in the resource required for processing each request between service groups SG2 (20 × 10^5) and SG5 (5 × 10^5). Hence, we focus on the comparison of SG2 and SG5.

Table 5 Settings of the service groups.
Service group | Service | Number of requests | Mean value of the resource consumed for processing a request (×10^5) | Deviation value of the resource consumed for processing a request (×10^4)
SG1 | S1 | 4000 | 10 | 5
SG2 | S2 | 4000 | 20 | 50
SG3 | S3 | 4000 | 15 | 5
SG4 | S4 | 4000 | 10 | 5
SG5 | S5 | 4000 | 5 | 5

The settings concerning the S-labor-market model events are as follows. The service change event is set to occur every 10 s (10 simulation seconds). One simulation second is a simulation round, in which all the peers finish executing their operations once. The service change applications are sent out by randomly selected service peers. Service peers are encouraged to stay in their current service groups to keep the structure of the service groups stable and to reduce the communication overhead between the service groups. However, the service peers might change their service provision when a relatively high demand for a particular service appears in another service group. The precondition for performing the service change event is that the service peer has stayed in its current group for at least 40 s. The resignation event is set to be performed every 10 s. The weight factor and the thresholds are set empirically, i.e. ω = 1.2, Θidl = 0.5, Θcap = 0.9, and Θres = 0.5. We believe that, based on the analysis of the impact of the thresholds’ variations on the SCSP’s performance in Section 3.3, these reference values provide a good starting point for setting the parameter values in other applications.

Note that, besides the assumption on the service peers in Section 3.2 (i.e. one service peer provides one type of service at a time and processes only one service request at a time), in this subsection we assume that peers have fast connections to the network (i.e. the time used to transfer a message between any two peers is negligible compared to the time used to process service requests). In network-based systems, the network delay might take up a very large portion of the overall response time and could significantly affect the accuracy of the simulation results. In this paper, our SCSP targets applications in which processing the service request takes up the dominant time and the network delay accounts for only a small portion of the overall response time. For example, in computationally intensive applications for bioinformatics analysis (Bardram and Venkataraman, 2010), the computation usually takes up the dominant portion of the overall application response time and the time used to transfer a message between any two peers is negligible. In addition, we assume that all the peers’ capabilities and the amount of resource consumed by each type of service request follow the normal distribution. However, these last assumptions are not required by the SCSP framework; they are used only for carrying out the experiments that evaluate the SCSP’s performance.
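The peer settings of this subsection can be mimicked with a few lines of Python. The sketch below draws peer capabilities from the three normal distributions of Table 4 and assigns a capability to the category under which it is most likely. It is not the PeerSim code used in the experiments: reading "highest probability" as "highest normal density", using the percentage column as sampling weights, clamping capabilities to positive values, and all names are assumptions of this example.

```python
import math
import random

# (mean, standard deviation, share of peers) per category, values from Table 4.
CATEGORIES = {
    "h": (10e4, 10e3, 0.30),
    "m": (5e4, 8e3, 0.40),
    "l": (2e4, 3e3, 0.30),
}


def normal_pdf(x, mean, sd):
    """Density of the normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def classify(capability):
    """Return the category whose density is highest at this capability."""
    return max(
        CATEGORIES,
        key=lambda c: normal_pdf(capability, CATEGORIES[c][0], CATEGORIES[c][1]),
    )


def sample_peer_capability(rng):
    """Pick a category by its share, then draw a capability from it."""
    names = list(CATEGORIES)
    weights = [CATEGORIES[c][2] for c in names]
    category = rng.choices(names, weights=weights)[0]
    mean, sd, _ = CATEGORIES[category]
    # Clamp to a small positive value so that a capability is never negative.
    return max(rng.gauss(mean, sd), 1.0)
```

A call such as `[classify(sample_peer_capability(rng)) for _ in range(30)]` with `rng = random.Random(seed)` would give one possible class assignment for the 30 peers used later in Section 4.2.2.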
4.2. Evaluation of the coordination mechanism

4.2.1. Evaluation of the coordination within service groups

In this subsection, a simple computationally intensive application is examined, for example, the directory service provision in the European Data Grid (http://www.eu-datagrid.org/). In this application, how to efficiently coordinate the nodes that provide the directory service is an important issue. We carry out simulations to test whether our coordination mechanism (the recruiting protocol and the S-labor-market model) can efficiently coordinate the provision of a single type of service in one service group. In the simulation, we set 24 peers in one service group SG1. Note that the number of peers can be set according to the specific application.

The settings of the peers follow Table 4. The settings of SG1 follow the second row of Table 5. For example, the resources required to process each service request for S1 are listed in the fourth and fifth columns of the second row of Table 5. The arrival rate of service requests λ is set to approximately two service requests per 10 s. These settings are the same as those in Chen et al. (2008).

Figure 5 illustrates the simulation results. On the one hand, in Fig. 5(a), the group size of SG1 decreases quickly and then stays at around four peers. On the other hand, in Fig. 5(b), the average group processing capability of SG1 gradually increases and then becomes stable after 800 s. These results show that the resource efficiency in SG1 has improved dramatically.

Fig. 5. (a) Group size of SG1 during simulation time. (b) Average group processing capability of SG1. Axes: simulated seconds (0 to 2000) versus group size in (a) and average capability of the service group (×10^4) in (b).

The increased group processing capability and improved resource efficiency of SG1 in Fig. 5 are brought about by the optimal dispatch algorithm and the resignation policy in the recruiting protocol. Specifically, service requests are only dispatched to high capability service peers, whose capability is the closest to the resource required for processing a service request. Most of the service peers with a relatively low capability turn idle and then resign. Thus, the group size decreases, but the average group processing capability increases. Furthermore, the resource efficiency becomes higher because most of the remaining service peers are in the busy state. In addition, the average processing capability of SG1 becomes stable after 800 s (shown in Fig. 5(b)) because only the needed service peers are kept in SG1 after 800 s.

Table 6 shows the comparison among the SCSP, Chen et al. (2008), and Cuenca-Acuna and Nguyen (2004). First, the SCSP is more efficient regarding the convergence of the response time (800 s in comparison to 10,000 s in Chen et al., 2008). Second, the SCSP achieves a smaller minimum SRRT, 10 s compared to 13.8 s in Chen et al. (2008) and 11.8 s in Cuenca-Acuna and Nguyen (2004). Third, the SCSP achieves relatively higher resource efficiency.

Table 6 Comparison of our method to the related work.
Methods | The minimum SRRT (s) | Resource efficiency | Converging time of the response time (s)
The SCSP | 10 | 0.9 | 800
Chen et al. | 13.8 | 0.8 | 10,000
Cuenca-Acuna et al. | 11.8 | 0.4 | N/A

The performance difference between our SCSP and Chen et al. (2008) is caused by the different dispatch algorithms: the optimal dispatch algorithm used in our SCSP, and the random dispatch method used in the latter. In the optimal dispatch algorithm, only the optimal service peers, which process service requests in a relatively short time, are dispatched service requests. However, in the random dispatch method used in the latter, peers with high, middle, and low capabilities are all dispatched requests. Therefore, the minimum SRRT becomes longer. In addition, in the latter work, the service peers with low and middle capabilities do not resign because of their contribution to processing service requests. Consequently, it takes longer for the response time to converge and the resource efficiency becomes relatively lower in the latter work. Cuenca-Acuna and Nguyen (2004) reduced the SRRT at the expense of lower resource efficiency. This is because they used a fitness function to identify the fit peers for forming a service group, and there is a trade-off between the SRRT and the resource efficiency in the fitness function. This shows that our SCSP is able to achieve both a relatively short SRRT and a relatively high resource efficiency.

Note that the convergence of the S-labor-market model is verified by the convergence of the response time. The coordination built by the S-labor-market model converges when the response time converges. In our work, the convergence of the service groups’ response time (one example is shown in the second row, fourth column of Table 6) relies on the following assumptions: (1) the resources required for processing service requests are assumed to follow the normal distribution with a small deviation and (2) the arrival rate of service requests is assumed to follow the Poisson distribution.
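The two assumptions above can be turned into a small request generator, which may help when reproducing the single-group experiment. The sketch draws exponential inter-arrival times, which yields Poisson arrivals, with a rate of 0.2 requests per simulated second (about two requests per 10 s), and draws the resource demand of each request from the normal distribution given for S1 in Table 5. The function name, the clamping of demands at zero, and the choice of Python instead of PeerSim are assumptions of this example.

```python
import random

ARRIVAL_RATE = 0.2    # requests per simulated second (about 2 per 10 s)
MEAN_RESOURCE = 10e5  # Table 5, SG1: mean resource consumed per request
SD_RESOURCE = 5e4     # Table 5, SG1: deviation of the resource per request


def generate_requests(horizon, rng):
    """Return (arrival_time, required_resource) pairs up to `horizon` seconds."""
    requests = []
    t = 0.0
    while True:
        # Exponential gaps between arrivals give a Poisson arrival process.
        t += rng.expovariate(ARRIVAL_RATE)
        if t > horizon:
            return requests
        resource = max(rng.gauss(MEAN_RESOURCE, SD_RESOURCE), 0.0)
        requests.append((t, resource))
```

Over 2000 simulated seconds this produces roughly 400 requests. Under the further assumption that a capability counts the resource units processed per simulated second, a request of mean size 10 × 10^5 on a peer of mean high capability 10 × 10^4 needs about 10 s of pure processing, which is of the same order as the minimum SRRT reported for the SCSP in Table 6.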
4.2.2. Evaluation of coordination between service groups Some computational intensive service provisions include several types of services. For example, the applications for bioinformatics analysis (Bardram and Venkataraman, 2010), which utilizes available computational resources in biology laboratories on an ad-hoc basis, need to perform different tasks and complete
the tasks in relatively limited time. In this case, it is important to coordinate the available resources between different tasks (which can be conduct by different service groups) to complete the bioinformatics analysis. We carry out simulations to evaluate the coordination among different service groups built by our SCSP. In the simulations, there are five different service groups and 30 peers involved totally. The peers’ capabilities follow the distributions shown in Table 4. In turn, the settings of the service groups follow the distribution shown in Table 5. We compare two groups, SG2 and SG5. The reason is twofold. On the one hand, there is obviously a difference in the resource consumed for processing each service request sent to SG2 (i.e. 20 105) and SG5 (i.e. 5 105), as shown in Table 5. On the other hand, the initial service peers distributions of SG2 and SG5 are quite different from the requirements of their service requests, since SG2 and SG5 are randomly assigned with initial service peers (as mentioned in Section 4.1). Thus, this experiment provides a good chance to verify the efficiency of the coordination between service groups in the SCSP as follows. First, the resources owned by the service peers in each service group should be in proportion to the resource requirement of service requests in each service group after performing coordination. For example, SG2 should be able to recruit its needed service peers from other groups, and SG5 should be able to decrease its idle service peers to improve the resource efficiency. Second, the difference in the response time between SG2 and SG5 should be relatively small. Figures 6 and 7 demonstrate the simulation results and comparisons between our SCSP and Chen et al. (2008). As shown in Figs. 6(a) and (b), SG2 possesses a higher processing capability and a larger group size than SG5 during the simulation. 
In other words, the processing capabilities of SG2 and SG5 are in proportion to their service requests’ resource requirement after SG2 and SG5 become relatively stable. Specifically, before 400 s, the group size and processing capability of SG2 increase, but the group size and processing capability of SG5 decrease. It is because SG2 recruits some service peers and SG5
Fig. 6. (a) Total processing capability of SG2 and SG5 over the simulation time. (b) Group size of SG2 and SG5 over simulation time. (c) Resource efficiency of SG2 and SG5 over the simulation time.
Fig. 7. Performance in terms of coordination efficiency: (a) SRRTs, (b) group sizes, (c) group processing capabilities, and (d) communication messages for coordination per service peer.
M. Liu et al. / Journal of Network and Computer Applications 34 (2011) 1210–1224
shown in Fig. 7(d). Another four scenarios, in which the total number of peers in a system is relatively large, are evaluated, and traffic results are shown in Fig. 8. More detailed traffic analysis could refer to the Appendix. Figure 7(d) presents the communication messages for coordination in Chen et al. and that in our SCSP based on the simulation settings. According to this figure, the number of messages to complete a service change operation per peer in Chen et al. is larger than that in our SCSP. The reason is that, in Chen et al., a service peer in one service group needs to communicate with the other 5 service peers within the same group and also needs to communicate with peers in other service groups. In our SCSP, a service peer in one service group only needs to contact its superpeer and this super-peer communicates with super-peers in other service groups. In addition, in our SCSP, the super-peer contacting other super-peers generates the dominant communication messages. Given the total 30 peers in this simulation, the number of service groups (i.e. 5) is small and the number of super-peers is 5 as well. Thus, the number of communication messages in our SCSP is smaller than that in Chen et al. As for the resignation, the number of communication messages in Chen et al. is larger than that in our SCSP. The rationale is that, in Chen et al., a service peer in one service group needs to communicate with all the other service peers within this group for the resignation operation. In our SCSP, a service peer only communicates with its super-peer. Figure 8 illustrates the trend of the traffic difference between Chen et al. and our SCSP in a system with relatively large number of peers. The Y-axis shows the traffic difference between Chen et al. and our SCSP in terms of the number of communication messages to complete a service change operation. The X-axis is the number of service groups. 
Four curves illustrate four scenarios with different sizes of total peers in a system. Some critical points (i.e. the traffic difference in communication messages is 0) are shown in Fig. 8 (i.e. red dots), which mean that the same number of communication messages is generated in Chen et al. and in our work. Above the critical point in each curve, more communication messages are generated in Chen et al. than that in our SCSP, while below the critical points, more communication messages are generated in our SCSP than that in Chen et al. It can be clearly found in Fig. 8 that (1) when the total number of peers in a system is relatively small (e.g. 250 and 500), our SCSP generated less communication messages when compared to Chen et al. because the settings above the critical points (i.e. each service group contains at least 3 peers on average) are usually utilized in a system, and (2) when the total number of peers in a system is The difference in commumication messages
dismisses some service peers. Between 400 and 800 s, the group size of SG5 increases. A possible reason is that the service requests sent to SG5 increase during this period. After 800 s, both the group size and the processing capability of SG2 and SG5 become relatively stable, and SG2 possesses a higher processing capability and a larger group size than SG5, in line with the requirements of their service requests shown in Table 5. As shown in Fig. 6(c), there is only a small difference between the resource efficiencies of SG2 and SG5 after the two service groups become relatively stable. This indicates that a relatively even workload distribution is achieved. The above results verify that the coordination mechanism works well between service groups.
According to Fig. 7(a), the variations in the response time (SRRT) among the five service groups in our SCSP are small. This result verifies that the coordination mechanism in the SCSP works well among different service groups. Furthermore, the variations in the SRRTs among the five service groups in the SCSP are smaller than those of Chen et al. (2008). Specifically, the SRRTs in the SCSP vary from 21.00 to 33.00 s with a standard deviation of 5.30. The SRRTs in Chen et al. (2008) vary from 27.00 to 46.60 s with a standard deviation of 5.60. In addition, as shown in Fig. 7(b) and (c), the variations in the service groups’ sizes and group processing capabilities in our SCSP are comparable to those of Chen et al. (2008). That is, the SCSP achieves resource efficiency comparable to that of Chen et al. (2008). Note that the performance values (e.g. the SRRT) in Fig. 7 are average values. For instance, the SRRT of SG1 is the average of 200 SRRT values sampled every 10 s.
The rationale behind the small differences among the service groups’ SRRTs in our SCSP is as follows. First, the optimal dispatch algorithm helps a service group reduce its SRRT. Second, whenever a service group is under a heavy workload, the recruiting protocol with the weighting mechanism assigns this service group a high priority to recruit the service peers it needs. Thus, the SRRTs of the service groups with a heavy workload are reduced and the differences in SRRTs among the five service groups decrease.
The relatively longer SRRTs in Chen et al. (2008) can be partly explained by its random dispatch algorithm. In their algorithm, peers with high, middle, and low capabilities are all used for processing the service requests, which lengthens the SRRTs of the service groups. Furthermore, in their work, more peers with middle and low capabilities are assigned to process service requests since they stay in the service groups for a longer time than in the SCSP. In the SCSP, some of the peers with middle and low capabilities turn idle and resign from the service groups according to the recruiting protocol. Thus, fewer peers with middle or low capability are used to process service requests and the SRRTs in the SCSP become relatively shorter.
In summary, the SCSP effectively coordinates the service groups in sharing their service peers. In addition, the SCSP achieves relatively smaller SRRTs and comparable resource efficiency in comparison to Chen et al. (2008).
Traffic analysis. In the following paragraphs, we analyze the traffic generated in Chen et al. and in our SCSP. The overall traffic basically consists of (1) the traffic for maintaining the overlay network (the super-peer overlay or the DHT-based overlay) and (2) the traffic generated by the coordination mechanism (i.e. communication messages). Because we concentrate on coordinating service groups, we analyze only the second part of the overall traffic, i.e. the communication messages generated by the coordination mechanism. For the traffic analysis of the first part, please refer to Zoels et al. (2006), Li et al. (2004), Pyun and Reeves (2004), and Acosta and Chandra (2007).
In the following traffic analysis, one scenario, in which the total number of peers in a system is relatively small, is evaluated, and traffic results are
Fig. 8. Traffic difference between Chen et al. and the SCSP in terms of the number of communication messages for a service peer to complete the service change operation. (Plot omitted; x-axis: the number of service groups, 0 to 400; y-axis: the difference in communication messages, -1500 to 2000; four curves, for a total number of peers of 250, 500, 1000, and 2000.)
relatively large (e.g. 1000 and 2000) and the number of service groups is relatively small (i.e. above the critical points in the curves), the number of communication messages generated by a peer to complete the service change operation in the SCSP is smaller than that in Chen et al. For example, for a total of 1000 peers, the number of service groups is smaller than 188 and each group contains at least 5 peers; for 2000 peers, the number of service groups is smaller than 281 and each group contains at least 7 peers. In these cases, the number of communication messages in the SCSP is smaller than that in Chen et al.
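The relation between the critical points and the average group size quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original evaluation: the two critical group counts (188 and 281) come from the text, while the function names and the `scsp_expected_cheaper` helper are hypothetical and only restate the rule "fewer groups than the critical number means fewer messages in the SCSP".

```python
import math

# Critical points quoted in the text: total peers -> number of service groups
# at which both schemes generate the same number of communication messages.
CRITICAL_GROUPS = {1000: 188, 2000: 281}


def average_group_size(total_peers: int, num_groups: int) -> float:
    """Average number of peers per service group."""
    return total_peers / num_groups


def scsp_expected_cheaper(total_peers: int, num_groups: int) -> bool:
    """Hypothetical helper: True if the setting has fewer service groups
    than the critical number, i.e. lies above the critical point."""
    return num_groups < CRITICAL_GROUPS[total_peers]


for total, groups in CRITICAL_GROUPS.items():
    size = average_group_size(total, groups)
    # The floor of the average size matches the "at least 5" and
    # "at least 7" peers per group stated in the text.
    print(f"{total} peers, {groups} groups: "
          f"{size:.2f} peers per group (floor {math.floor(size)})")
```

With these inputs, the average group sizes at the critical points are slightly above 5 and 7 peers, which is consistent with the group sizes stated in the text.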
4.3. Evaluation of the scalability
Computationally intensive applications, e.g. applications for bioinformatics analysis (Bardram and Venkataraman, 2010), usually need a large number of nodes and a huge amount of computation to complete their tasks. When the end users submit more tasks, more nodes are needed to process them within a relatively limited time. In this case, a critical issue is whether the time to complete these tasks varies only within a small range when more types of services and nodes are employed. That is, scalability is a critical issue.
We conduct three sets of simulations to examine the scalability of our SCSP (i.e. the change in the SCSP’s performance with the increase in the number of service groups and service peers). The settings of the three sets of experiments are shown in Table 7. The settings of peers’ capabilities follow Table 4. The settings of the service groups are as follows. In Experiment 1, the settings of each set of five service groups, SGi (i = 1, 2, ..., 5) and SGj (j = 6, 7, ..., 10), follow Table 5. In Experiment 2, the settings of the first ten service groups SGi (i = 1, 2, ..., 10) are the same as the ten in Experiment 1, and so are those of the second, third, and fourth ten service groups. In Experiment 3, there are 2000 peers in total. The settings for the first ten service groups SGi (i = 1, 2, ..., 10) are the same as those of Experiment 1, and each of the following nine sets of ten service groups (e.g. SGi (i = 11, 12, ..., 20), SGi (i = 21, ..., 30)) uses the settings of the first ten service groups.
Table 7 Settings of the three sets of experiments.
                Number of service groups    Number of peers    Types of services
Experiment 1    10                          200                10
Experiment 2    40                          500                40
Experiment 3    100                         2000               100
Figure 9 presents the results of the three sets of experiments: the SRRTs and group sizes of the 10 service groups. In Fig. 9, μ_expe_i(SRRT) and σ_expe_i(SRRT) (i = 1, 2, 3) denote the average value and the standard deviation of the SRRTs in the ith experiment, respectively. μ_expe_i(group size) and σ_expe_i(group size) (i = 1, 2, 3) denote the average value and the standard deviation of the group sizes in the ith experiment, respectively. In addition, the performance values of each service group (e.g. the SRRT) shown in Fig. 9 are average values. For example, for the SRRT of SG1 in Experiment 1, we sample the SRRT every 10 s during the simulation time, obtain 200 sampled SRRTs, and take their average as the final SRRT.
According to Fig. 9(a), the SRRT of each service group SGi (i = 1, 2, ..., 10) varies marginally among the three sets of experiments. For instance, there is only a small difference among the SRRTs of SG2 in the three sets of experiments. Furthermore, there is only a small difference in the standard deviations of the service response time (σ_expe_i(SRRT)) among the three sets of experiments. Specifically, when the number of service groups increases from 10 to 100 (10 times), σ_expe_i(SRRT) (i = 1, 2, 3) changes from 4.7 to 4.9 (only 0.2 s). These results verify that the coordination mechanism scales well with the increase in the number of service groups and service peers. As shown in Fig. 9(b), there is also only a small difference in the standard deviations of the group sizes (i.e. σ_expe_i(group size)) among the three sets of experiments. Specifically, when the number of service groups increases from 10 to 100, σ_expe_i(group size) (i = 1, 2, 3) changes from 1.5 to 2.1 (only 0.6 peer compared to 10 or 100 peers).
In conclusion, the SRRTs vary marginally among the three sets of simulations when the numbers of service groups and service peers increase. The average group sizes also fluctuate only slightly among the three sets of simulations. That is, the SCSP achieves good scalability.
4.4. Evaluation of the robustness
In a computationally intensive application, for example, an application supporting bioinformatics analysis on an ad-hoc basis (Bardram and Venkataraman, 2010), the nodes that contribute resources for bioinformatics processing might join or leave the application system. The random joining or leaving of nodes
Fig. 9. Performance of ten groups in terms of scalability: (a) SRRTs and (b) group sizes.
Fig. 10. Variation in SRRT_SG2 with the activity of peers’ joining. (Plot omitted; x-axis: simulated seconds, 0 to 12000; y-axis: SRRT (s), 15 to 30; marked point: (8670, 20).)
could affect the time to finish the application tasks. Thus, an important issue is whether the application is robust to nodes’ joining or leaving. Our SCSP adopts a super-peer-based overlay for service provision to increase the robustness of the applications. We conduct two sets of simulations to verify the robustness of the SCSP. Both simulations employ five service groups that use the settings shown in Table 5.
The first simulation simulates the activity of service peers’ joining. In the beginning, 30 peers are assigned to the five service groups randomly. As the simulation proceeds, new service peers randomly join the five service groups at a rate of one service peer per 20 s. The joining activity stops once 170 peers have joined the five service groups. We only depict the variations in the SRRT of SG2 in Fig. 10 because SG2 requires much more resources to process service requests than the other service groups. Figure 10 illustrates that (1) the SRRT of SG2 decreases significantly after several new peers join and (2) the SRRT of SG2 stops decreasing after 8670 s, with a final SRRT value of 20 s. The reason for these results is that in the beginning, SG2 employs only a few optimal service peers because the total number of initial service peers in the five service groups is small. As the joining activity goes on, SG2 employs more of the optimal service peers that apply to join it. Therefore, the SRRT of SG2 decreases significantly. After all 170 service peers have joined the five service groups, SG2 has employed enough optimal service peers and achieves a short SRRT (i.e. 20 s).
The second simulation simulates the activity of service peers’ leaving. There are 200 peers initially. As the simulation goes on, some of the service peers that resign from service groups are selected to leave the system. When the simulation terminates at 20,000 s, only 36 peers are left in the five service groups. The simulation results are shown in Fig. 11. According to Fig. 11(a), the number of peers in SG2 decreases when the total number of peers in the five service groups decreases. According to Fig. 11(b), a flattened SRRT value of SG2 is achieved. The reason for this result is twofold. First, in the beginning, SG2 employs enough optimal service peers with a high capability since the total number of initial service peers in the five service groups is large. Second, the leaving peers are chosen from the resigned service peers, and most of them have relatively low and middle capabilities according to the recruiting protocol. That is, most of the service peers with a high capability remain in SG2 even when the leaving activity continues. Therefore, the performance of SG2 is not affected much by service peers’ leaving. In addition, we also find that the final SRRT (i.e. 20 s in Fig. 10) in the activity of service peers’ joining is the same as the SRRT (i.e. 20 s in Fig. 11(b)) in the activity of service peers’ leaving. This finding further indicates that the SCSP is robust to service peers’ joining and leaving.
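The churn settings of the two robustness simulations can be turned into rough timelines. The sketch below is illustrative only: the peer counts, the 20 s joining interval, and the 20,000 s duration come from the text, whereas the assumptions that joins are evenly spaced from the start of the run and that departures are spread over the whole run are ours, as are all function and constant names.

```python
# Values taken from the description of the two robustness simulations.
INITIAL_PEERS_JOIN = 30      # peers present when the joining simulation starts
JOINING_PEERS = 170          # peers that join afterwards
JOIN_INTERVAL_S = 20         # one new peer per 20 s

INITIAL_PEERS_LEAVE = 200    # peers present when the leaving simulation starts
REMAINING_PEERS = 36         # peers left when the simulation terminates
LEAVE_DURATION_S = 20_000    # simulated seconds


def join_phase_end(joining_peers: int, interval_s: int) -> int:
    """Time (s) of the last join, ASSUMING evenly spaced joins that start
    one interval after t = 0 (the paper only gives the rate)."""
    return joining_peers * interval_s


def mean_departure_interval(initial: int, remaining: int, duration_s: int) -> float:
    """Average time (s) between departures, ASSUMING departures are spread
    over the whole simulated duration."""
    return duration_s / (initial - remaining)


total_after_joining = INITIAL_PEERS_JOIN + JOINING_PEERS
print("peers after joining phase:", total_after_joining)
print("joining phase ends at about",
      join_phase_end(JOINING_PEERS, JOIN_INTERVAL_S), "s")
print("mean departure interval: about",
      round(mean_departure_interval(INITIAL_PEERS_LEAVE, REMAINING_PEERS,
                                    LEAVE_DURATION_S), 1), "s")
```

Under these assumptions, the joining phase would end well before the 8670 s mark reported for Fig. 10, which suggests that SG2 keeps recruiting better peers for some time after the last join; the paper does not state this explicitly, so it should be read as an inference from the quoted settings.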
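Fig. 11. Performance with the activity of peers’ leaving. (a) Variations in the number of peers in all service groups and SG2. (b) SRRT_SG2 during the simulation time. (Plots omitted; x-axis in both panels: simulated seconds, 0 to 2 × 10^4; y-axis: (a) number of peers, 0 to 200, with curves for the total of all service groups and for SG2; (b) SRRT (s), 15 to 25.)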
Note that these events (i.e. the service peers’ joining and leaving) affect the convergence of a service group’s response time. Specifically, these events change the processing capability of a service group. The change in the service group’s processing capability changes the response time to service requests, which in turn affects the convergence of the service group’s response time. For example, when nodes leave the service groups (which is different from the case in which resigned nodes join the resigned service group), the number of available service peers in all the service groups becomes smaller. When performing the recruiting protocol, the service groups have less chance to recruit the service peers they need because fewer service peers are available. Thus, it takes longer for the service groups to recruit the needed service peers, which in turn increases the convergence time. In contrast, when more nodes join the service groups, the convergence time decreases accordingly.
5. Conclusion
In this paper, we present a super-peer-based coordinated service provision framework (SCSP) to coordinate the service groups to work collaboratively and share their service peers. The SCSP is made up of an S-labor-market model, a recruiting protocol based on a weighting mechanism, and an optimal dispatch algorithm. The performance of the SCSP is examined by simulations of four different application scenarios. The simulation results show that the SCSP coordinates the service groups and their service peers efficiently. Furthermore, compared to existing work, the SCSP achieves much better performance in terms of reducing the response time and comparable performance in resource efficiency. In addition, the SCSP achieves good scalability and robustness over a network with 2000 peers. In future work, we will apply the SCSP to composed Web services and long-running Web service provision. The interdependencies between services and the transaction property will be considered in the coordinated service provision. Moreover, it would also be interesting future work to evaluate the scenario in which different types of services are controlled by some probability density function (e.g. Zipf).
Acknowledgments
This work was supported by the ITEA2-Expeshare project, the DECICOM project, and the project of SOPSCC (Pervasive Service Computing: A Solution Based on Web Service), funded in the Ubiquitous Computing and Diversity of Communication (MOTIVE) Program by the Academy of Finland. The authors would like to thank Dr. Jie Chen for his valuable comments in improving the paper and the anonymous reviewers for their helpful comments.
References
Alonso G, Kuno H, Casati F, Machiraju V. Web services: concepts, architectures and applications. Springer; 2004.
Androutsellis-Theotokis S, Spinellis D. A survey of peer-to-peer content distribution technologies. ACM Computing Surveys 2004;36(4):335–371.
Acosta W, Chandra S. Trace driven analysis of the long term evolution of Gnutella peer-to-peer traffic. In: Proceedings of the passive and active network measurement; 2007.
Banaei-Kashani F, Chen C, Shahabi C. WSPDS: web services peer-to-peer discovery service. In: Proceedings of international symposium on web services and applications; 2004.
Bardram J, Venkataraman N. The Mini-Grid framework: application programming support for ad-hoc, peer-to-peer volunteer grids. In: Proceedings of the advances in grid and pervasive computing; 2010.
Benatallah B, Dumas M, Sheng QZ, Ngu AHH. Declarative composition and peer-to-peer provisioning of dynamic web services. In: Proceedings of the international conference data engineering; 2002.
Brazier FMT, Kephart JO, Van Dyke Parunak H, Huhns MN. Agents and service-oriented computing for autonomic computing: a research agenda. IEEE Internet Computing 2009;13(3):82–87.
Burdett D, Kavantzas N. The WS-Choreography model overview. W3C Draft, <http://www.w3.org/TR/ws-chor-model/>, March 2004.
Butt AR, Zhang R, Hu YC. A self-organizing flock of condors. Journal of Parallel and Distributed Computing 2006;66(1):145–61.
Buyya R, Abramson D, Giddy J. An economy driven resource management architecture for computational power grids. In: Proceedings of the international conference on parallel and distributed processing techniques and applications; 2000.
Cabrera L, Copeland G, Freund T, Klein J, Langworthy D, Orchard D, et al. Web services coordination specification. <http://www.ibm.com/developerworks/library/specification/ws-tx/>, 2005.
Chakravarti AJ, Baumgartner G, Lauria M. The organic grid: self-organizing computation on a peer-to-peer network. IEEE Transactions Systems, Man, and Cybernetics 2005;35(3):373–384.
Chappell D. Enterprise service bus. O’Reilly Media, Inc.; 2004.
Chen G, Low CP, Yang Z. Coordinated service provision in peer-to-peer environments. IEEE Transactions Parallel and Distributed Systems 2008;19(4):373–384.
Cuenca-Acuna FM, Nguyen TD. Self-managing federated services. In: Proceedings of the 23rd IEEE symposium on reliable distributed systems; 2004.
Dhesiaseelan A, Ragunathan A. Web services container reference architecture (WSCRA). In: Proceedings of the IEEE conference on web services; 2004.
Fensel D, Bussler C. The web service modelling framework WSMF. Electronic Commerce Research and Applications 2002;1(2):113–137.
Foster I. Service-oriented science. Science 2005;308(5723):814–817.
Goela S, Talya SS, Sobolewski M. Service-based P2P overlay network for collaborative problem solving. Decision Support Systems 2007;43(2):547–568.
Gu X, Nahrstedt K. On composing stream applications in peer-to-peer environments. IEEE Transactions Parallel and Distributed Systems 2006;17(8):824–837.
Hausheer D, Stiller B. PeerMart: the technology for a distributed auction-based market for peer-to-peer services. In: Proceedings of the IEEE conference communication (ICC); 2005.
Jain K, Lovasz L, Chou PA. Building scalable and robust peer-to-peer overlay networks for broadcasting using network. Distributed Computing 2007;19(4):301–311.
Jelasity M, Montresor A, Jesi GP, Voulgaris S. PeerSim P2P simulator. <http://peersim.sourceforge.net/>, 2006.
Kanellopoulos DN, Panagopoulos AA. Exploiting tourism destinations’ knowledge in an RDF-based P2P network. Journal of Network and Computer Applications 2008;31(2):179–200.
Kwong KW, Tsang DHK. Building heterogeneous peer-to-peer networks: protocol and analysis. IEEE/ACM Transaction Networking 2008;16(2):281–292.
Le Fessant F, Handurukande S, Kermarrec AM, Massoulie L. Clustering in peer-to-peer file sharing workloads. Int’l Workshop on Peer-to-Peer Systems 2004.
Li C, Li L. Three-layer control policy for grid resource management. Journal of Network and Computer Applications 2009;32(3):525–537.
Li X, Zhang Z, Liu Y. Dynamic layer management in superpeer architectures. IEEE Transactions Parallel and Distributed Systems 2005;16(11):1078–1091.
Lua EK, Crowcroft J, Pias M, Sharma R, Lim S. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys and Tutorials 2005;7(2):72–93.
Lv Q, Cao P, Cohen E, Li K, Shenker S. Search and replication in unstructured peer-to-peer networks. In: Proceedings of the ACM conference supercomputing; 2002.
Li J, et al. Comparing the performance of distributed hash tables under churn. Workshop on peer-to-peer systems; 2004.
Montresor A. A robust protocol for building superpeer overlay topologies. In: Proceedings of the international conference on peer-to-peer computing; 2004.
Pacifici G, Spreitzer M, Tantawi A, Youssef A. Performance management for cluster-based Web services. IEEE Journal on Selected Areas in Communications 2005;23(12):2333–2343.
Papazoglou MP, Traverso P, Dustdar S, Leymann F. Service-oriented computing: state of the art and research challenges. IEEE Computer 2007;40(11):38–45.
Papazoglou MP, van den Heuvel W-J. Service-oriented architectures: approaches, technologies and research issues. VLDB 2007;16(3):389–415.
Pyun YJ, Reeves DS. Constructing a balanced, (log(N)/loglog(N))-diameter superpeer topology for scalable P2P systems. In: Proceedings of the IEEE peer-to-peer computing; 2004.
Qiu D, Srikant R. Modeling and performance analysis of BitTorrent-like peer-to-peer networks. In: Proceedings of the ACM SIGCOMM; 2004.
Ranjan R, Rahman M, Buyya R. A decentralized and cooperative workflow scheduling algorithm. In: Proceedings of the IEEE conference cluster computing and the grid (CCGRID); 2008.
Saroiu S, Gummadi PK, Gribble SD. A measurement study of peer-to-peer file sharing systems. In: Proceedings of the international conference on multimedia computing and networking; 2002.
Sioutas S, Sakkopoulos E, Drossos L, Sirmakessis S. Balanced distributed web service lookup system. Journal of Network and Computer Applications 2008;31(2):149–162.
Stoica I, Morris R, Nowell D, Karger D, Kaashoek M, Dabek F, et al. Scalable peer-to-peer lookup protocol for internet applications. In: Proceedings of the ACM SIGCOMM; 2001.
Votis K, Alexakos C, Vassiliadis B, Likothanassis S. An ontologically principled service-oriented architecture for managing distributed e-government nodes. Journal of Network and Computer Applications 2008;31(2):131–148.
Whalley I, Tantawi A, Steinder M, Spreitzer M, Pacifici G, Das R, et al. Experience with collaborating managers: node group manager and provisioning manager. Cluster Computing 2006;9(4):401–416.
Yang B, Garcia-Molina H. Efficient search in peer-to-peer networks. In: Proceedings of the IEEE conference distributed computing systems (ICDCS); 2002.
Yang B, Garcia-Molina H. Designing a super-peer network. In: Proceedings of the international conference on data engineering; 2003.
Zhou R, Hwang K. Powertrust: A robust and scalable reputation system for trusted peer-to-peer computing. IEEE Transactions on Parallel and Distributed Systems 2007;18(4):460–473.
Zoels S, et al. Cost-based analysis of hierarchical DHT design. In: Proceedings of the IEEE peer-to-peer computing; 2006.