
An Efficient I/O Based Clustering HTN in Web Service Composition

Abdullah Abdullah
School of Computing Science, University of Guelph, Guelph, Canada
Email: [email protected]

Xining Li
School of Computing Science, University of Guelph, Guelph, Canada
Email: [email protected]

Abstract—AI planning has proved effective as a Web Service Composition (WSC) technique. However, with the large number of Web services available in the real world, planning time and search space become limiting factors for performance. To provide better plans in terms of both efficiency and quality, it is desirable to increase the automation level of the composition approach. In this paper, we propose a novel AI-planning model for WSC. The model adopts an I/O-based clustering technique to generate a clustered Hierarchical Task Network (HTN) planning domain. Experiments show that the proposed approach is effective and efficient in service composition.

Keywords—Web service composition; I/O-based clustering; AI-planning; HTN.

I. INTRODUCTION

Service Oriented Architecture (SOA) is an emerging system application domain in which individual, smaller software services are used to build larger systems. Similarly, creating new Web services from existing ones, with a certain degree of automation, is achievable through Web Service Composition (WSC) techniques [1]. As the number and diversity of Web services on the Internet continue to increase, a higher level of WSC automation leads to greater adoption of Web services in commerce and business. A WSC approach involves a sequence of activities, such as service discovery, selection, and integration. Efficiency is therefore the main issue in developing an effective WSC system.

A popular approach to WSC is to use Hierarchical Task Network (HTN) planning [2]. HTN has been one of the most practical artificial intelligence planning techniques of the last fifteen years. The reason for adopting HTN is that the decomposition performed in HTN is essentially identical in nature to WSC. In other words, during decomposition, a new service can be generated from a selected set of existing ones. However, in the presence of a huge number of Web services, planning time and search space become bottlenecks.

In this paper, we introduce a new I/O-based clustering HTN planning approach to WSC. The motivation of this work is to improve the efficiency of the WSC process. In order to

978-1-4673-2088-7/13/$31.00 ©2013 IEEE

obtain reliable clusters, we propose a prioritized I/O similarity clustering method. The output of the method is an HTN domain consisting of multiple function-based clusters of Web services. The HTN domain is then fed to the planner to find all possible plans. The proposed method significantly reduces the search space. Compared with the HTN domains generated by classical approaches, the I/O-based clustering method helps the planner find all plans in a timely manner.

The novelty of our research lies in two major points. First, the system is built on top of the well-known HTN planning paradigm. We propose two new types of methods by extending HTN's essential base of definitions. The main role of the new methods is to enable the planner to rapidly explore all available services, without altering the name of a Web service operator or adding any profile-linking predicates. Second, we deploy a function-based clustering approach which is able to regenerate the HTN problem domain effectively. To the best of our knowledge, this approach is the first to show a beneficial combination of clustering and HTN planning in the design of a WSC system.

This paper is organized as follows. Section II briefly introduces related research work and HTN planning. In Section III, we present the proposed model as well as the associated I/O-based clustering approach. Section IV discusses implementation issues and experimental results. Finally, we give concluding remarks as well as some suggestions for future work.

II. RELATED WORK AND HTN PLANNING

In the last decade, several approaches have been proposed to tackle WSC as a planning problem by converting OWL-S service profiles into a suitable planning representation such as HTN or PDDL. For example, a typical work adopts HTN for WSC by utilizing the well-known planner SHOP2 [3]. Another notable work is OWLS-XPlan [4], in which a PDDL 2.1 description is generated by converting the original Web services written in OWL-S. Then a simplified HTN-Graph-based planner is called to solve the problem. An OWLS-PDDL-OWLS approach is presented in [5], which has an additional final step that translates the obtained plan back to OWL-S as a new set of composite Web services.


On the other hand, many semantic Web service I/O-based matchmakers have been developed in the past few years, such as the OWLS-UDDI matchmaker [6] and OWLS-MX [7]. Other researchers classify Web services by using functional and non-functional characteristics such as QoS parameters, concept analysis, or user requirements [8]–[11]. To the best of our knowledge, no I/O-based clustering technique has been applied to the HTN planning domain in order to improve WSC performance.

HTN planning is one of the most popular planning paradigms used in WSC. As an approach to automated planning, the primary idea of HTN is to reach a goal in the form of a set of primitive tasks. A typical HTN planner receives three inputs: an initial state, an initial task network, and a planning domain which consists of a set of operators and decomposable methods. The planner starts by repeatedly decomposing compound methods into smaller ones until a list of operators, i.e. a plan, is reached. Two types of constraints can be associated with every decomposable method, namely order and condition. The order constraint specifies the sequence of subtasks listed within the method, while the condition constraint determines the applicability of the method under the current state.

To simplify our discussion, we adopt a simple e-commerce scenario throughout this paper. We assume that the initial task network is a single task, Buy-a-Book. This task can be decomposed into Get-Book-Info, Add-to-Cart, Order, Payment, and Dispatch. In HTN planning, an operator is a primitive action, which has a name, preconditions, and effects. In our scenario, Pay-by-Visa is an example of an operator associated with the Payment task.

Following Ghallab et al.'s work [2], we give a set of basic definitions for HTN planning.

Definition 1: A domain D is a pair D = (O, M), where O is a set of operators and M is a set of methods.

Definition 2: An HTN planning problem P is a 3-tuple P = (S_0, W_0, D), where S_0 is the initial state, W_0 the initial task network, and D the HTN planning domain.

Definition 3: An HTN method M is a 4-tuple M = (H, T, ST, C), where H is a symbolic name of the method, T a non-primitive task, ST the set of subtasks of the method's task network, and C a set of constraints.

A method instance of M is applicable in a state if its preconditions are satisfied in that state. A method instance of M is relevant to task T if there is a substitution σ such that σ(M) = T. In general, several methods can be relevant to a particular non-primitive task T, leading to different decompositions of T. In our example, the task Buy-a-Book can be decomposed into a set of subtasks, such as Get-Book-Info and Payment, with the constraint that the subtask of getting book information precedes the payment subtask. Moreover, Get-Book-Info can be further decomposed by more than one option, such as Book-by-Author, Book-by-ISBN, etc.

Definition 4: A task network T is a pair T = (U, C), where U is a set of task nodes and C a set of constraints. The constraints are normally of type precedence constraint, before-constraint, after-constraint, or between-constraint.

Definition 5: A plan π = O_1 O_2 ... O_k is a solution for an HTN planning problem P if there is a primitive decomposition W of W_0 such that π is an instance of the task network W.

In our proposed planning model, we define two new methods.

Definition 6: A Wrapper method WM is an expression of the form WM = (H, PE, SST), where H is the method name, assigned automatically, PE a precondition expression, and SST a single subtask that represents an operator call.

To invoke a Web service, the caller must satisfy the precondition list and call the designated operator. The main purpose of the Wrapper method is to wrap the basic operator. The wrapping prepares the operator for strict involvement in the later search stage and retains the original operator name.

Definition 7: A Cluster-Head method CH is an expression of the form CH = (H, ST), where H is the method name and ST a list of subtasks associated with corresponding Wrapper methods.

By this definition, a Cluster-Head method is an ordinary HTN method with multiple subtasks. Each subtask represents a Wrapper method that is chosen if its precondition expression is satisfied. If the precondition test fails, control is transferred to the next available Wrapper in the list. The selection process continues until a matching Wrapper method is found or the candidate Wrapper list is exhausted.

III. PROPOSED MODEL

The proposed planning model consists of three major components, namely Translator, Domain Generator, and Planner, as illustrated in Fig. 1. The input is a set of OWL-S service profiles representing the available services, while the output is a set of plans, if any, that fulfill a user request. The distinctive feature of our proposed model is the use of clustering to direct the search process. As a result, the approach provides a large reduction in the search space, i.e. the number of visited nodes, which means less time to reach the target.
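The selection process stated for Definitions 6 and 7 can be illustrated with a minimal Python sketch. It is not the SHOP2 mechanism itself: preconditions are simplified to sets of ground fact names, and the operator name `!IsbnBook1` is hypothetical, introduced only for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class WrapperMethod:
    """Definition 6: WM = (H, PE, SST)."""
    name: str                      # H, the automatically assigned method name
    preconditions: FrozenSet[str]  # PE, simplified here to a set of fact names
    operator: str                  # SST, the single subtask (an operator call)


@dataclass(frozen=True)
class ClusterHead:
    """Definition 7: CH = (H, ST)."""
    name: str                            # H, the method name
    wrappers: Tuple[WrapperMethod, ...]  # ST, tried in list order


def select_wrapper(head: ClusterHead, state: FrozenSet[str]) -> Optional[WrapperMethod]:
    """Return the first Wrapper whose preconditions hold in `state`.

    Returns None when the candidate list is exhausted without a match.
    """
    for wrapper in head.wrappers:
        if wrapper.preconditions <= state:  # every required fact is present
            return wrapper
    return None


# Assumed example data; "!IsbnBook1" is a hypothetical operator name.
head = ClusterHead(
    name="GetBookInfo",
    wrappers=(
        WrapperMethod("ISBNBook", frozenset({"ISBN"}), "!IsbnBook1"),
        WrapperMethod("BookPrice", frozenset({"book"}), "!Bookprice1"),
    ),
)

chosen = select_wrapper(head, frozenset({"book"}))
print(chosen.name if chosen else "no applicable wrapper")
```

With only the fact `book` in the state, the first Wrapper fails its precondition test and control passes to the second one, mirroring the fall-through behaviour described for the Cluster-Head method.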


Figure 1. System Components and Work Flow of the Proposed Model. [OWL-S service profiles → TRANSLATOR → HTN domain → DOMAIN GENERATOR → I/O-based Clustered Domain → PLANNER → Plan(s)]

To realize the idea of clustering-based domain generation, we added two new methods to HTN planning, i.e., the Wrapper method and the Cluster-Head method. These methods work with an input/output clustering step to generate the clustered domain. The domain is then fed to the well-known HTN planner, SHOP2, to solve the initial user WSC problem.

A. Wrapper method

The role of a Wrapper method is to translate Web services into their corresponding operators within the planning domain. For a given Web service, it retains the service identifier, without adding any form of linking predicate to the operator definition. Retaining the original service name makes it easier to refer to external Web service records within a real application system. Since every operator has a unique name, it is the responsibility of the Wrapper to unify all Web services belonging to the same functional cluster. In other words, all services that share similar functional characteristics, completely or partially, such as input/output parameters, are grouped within the same service cluster. Using unified naming for similar operators makes it easier to explore all available services within the cluster comprehensively, through the backtracking process already embedded in the HTN planner. The advantage of visiting all related services within the same cluster is that further search optimization techniques can be applied effectively. Because the clustering paradigm considerably reduces the search space, the search considers fewer yet related nodes, which makes it attractive to add more intelligent techniques that can lead to better results in a timely manner.

Based upon the definition of the Wrapper method, Algorithm 1 shows how Wrapper methods are generated with respect to the operators in the domain. The input to the Wrapper generator is a set of operators originally translated from OWL-S Web service profiles, whereas the output is a set of newly created methods.

Compared with an ordinary HTN domain method, a Wrapper method has four unique features. First, the set of input parameters is omitted from the method declaration. Second, the declaration retains the precondition list in order to check the applicability of the operator even before the operator is examined. Third, the Wrapper adopts an automatic naming approach, in which the method's name is a concatenation of all parameter types. Finally, the most distinctive feature is that the Wrapper is a single-subtask method, that is, it acts as the only entry to an atomic operator.

Algorithm 1 Wrapper Generator
Input: TO, a set of operators in SHOP2 format.
Output: WR, a set of Wrapper methods.
Procedure:
1: for each TOi do
2:   Pi ⇐ the list of all the preconditions of TOi.
3:   Ni ⇐ name of the Wrapper method, derived from the concatenation of all parameter types.
4:   Ti ⇐ a single subtask that represents calling operator TOi.
5:   WRi ⇐ (Ni(), Pi, Ti)
6: end for
7: return WR

B. Cluster-Head Method

The main role of a Cluster-Head method is to combine all related Wrapper methods, and by default their operators, within the same cluster in a single method. The automatically achieved combination of Wrapper methods provides a means of efficiently exploring the whole space. Specifically, every single subtask inside this method is itself a call to one single Wrapper method, attached with its own set of preconditions. As a result, the system is able to check the preconditions of all underlying operators before invoking any function. If the preconditions of a subtask are not satisfied in the current system state, then the whole cluster is immediately skipped by the planner. Skipping many operators in a single decision step can greatly improve the performance of the entire planning process. The Cluster-Head method is generated automatically by Algorithm 2.

Algorithm 2 Cluster Head Generator
Input: WR, a set of Wrapper methods; CHN, the name of the cluster-head method.
Output: CH, a cluster-head method.
Procedure:
1: for each WRi do
2:   Pi ⇐ the list of all the preconditions of WRi, plus the output parameters required (all items must be identical for all underlying Wrapper methods).
3:   Ni ⇐ ith Wrapper method name.
4:   Attach the ith Wrapper method to CH, i.e. CH = (CHN(), P1 N1, P2 N2, .., Pi−1 Ni−1, Pi Ni)
5: end for
6: return CH
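Algorithms 1 and 2 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the `Operator` and `Wrapper` record layouts, the capitalisation rule used for names, the `(Required ...)` rendering of required outputs, and the choice to emit one Cluster-Head case per distinct Wrapper name are all assumptions made so that the output resembles a SHOP2-style `(:method ...)` form.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Operator:
    name: str                       # original Web service name, e.g. "!Bookprice1"
    input_types: Tuple[str, ...]
    output_types: Tuple[str, ...]
    preconditions: Tuple[str, ...]  # precondition atoms kept as text
    args: Tuple[str, ...]           # operator arguments, e.g. ("?_book",)


@dataclass(frozen=True)
class Wrapper:
    name: str
    preconditions: Tuple[str, ...]
    required_outputs: Tuple[str, ...]
    subtask: str                    # the single operator call


def generate_wrappers(operators: List[Operator]) -> List[Wrapper]:
    """Algorithm 1: build one Wrapper per operator."""
    wrappers = []
    for op in operators:
        types = op.input_types + op.output_types
        # Step 3: name = concatenation of all parameter types ("book","price" -> "BookPrice").
        name = "".join(t[:1].upper() + t[1:] for t in types)
        # Step 4: a single subtask that calls the original operator.
        subtask = "(" + " ".join((op.name,) + op.args) + ")"
        wrappers.append(Wrapper(name, op.preconditions, op.output_types, subtask))
    return wrappers


def generate_cluster_head(wrappers: List[Wrapper], chn: str) -> List[str]:
    """Algorithm 2: one case per distinct Wrapper name, preconditions first."""
    cases, seen = [], set()
    for w in wrappers:
        if w.name in seen:          # Wrappers sharing a name share one entry (assumption)
            continue
        seen.add(w.name)
        pre = w.preconditions + tuple(f"(Required {o})" for o in w.required_outputs)
        cases.append(f"(:method ({chn}) ({' '.join(pre)}) (({w.name})))")
    return cases


# Assumed example input, modelled on the two BookPrice services of the running scenario.
operators = [
    Operator("!Bookprice1", ("book",), ("price",), ("(book ?_book)",), ("?_book",)),
    Operator("!Bookprice2", ("book",), ("price",), ("(book ?_book)",), ("?_book",)),
]
wrappers = generate_wrappers(operators)
cluster_head = generate_cluster_head(wrappers, "GetBookInfo")
for line in cluster_head:
    print(line)
```

Both services end up behind the same Wrapper name, so the planner's own backtracking can reach either operator through a single Cluster-Head case.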


The output of the Cluster Head Generator algorithm is a method whose body consists of a list of subtasks, each of which is an entry to a unified Wrapper method. A list of preconditions is located in front of each Wrapper entry for the purpose of condition checking before method invocation.

C. I/O-based domain clustering

The idea of I/O-based domain clustering is to rearrange the domain methods and operators in a way that yields a better domain design. In practice, instead of looking into all operators for an optimal plan, the proposed model only needs to go through the related clusters. In other words, it is beneficial to skip over operators that cannot be part of any successful plan. In the clustering mechanism, every cluster consists of a set of services that are similar to some degree in their input/output parameters. By matching those parameters, our approach provides both applicability and simplicity within the planning domain generation process.

There is earlier research on modeling the similarity of Web services. However, those approaches were mainly used within matchmaking algorithms, in which functional and/or non-functional properties are queried against a collection of available service profiles in some sort of registry structure. Our approach, on the other hand, takes OWL-S Web service profiles as input, in which both functional and non-functional properties are represented through ontology concepts. It is therefore important to define precisely when the compared parameters count as similar. Thus, we define a specific priority order, shown in Table I, which reflects the degree of similarity among the services under consideration in descending order. Our model follows this order when listing the clusters within the domain definition, specifically within the cluster-head method body.

From Table I, if two Web services are completely identical in their input/output parameters, then the label is Exact-IO with priority degree 4, the highest similarity. If they are identical only in their input or only in their output parameters, then they are labeled Exact-I or Exact-O, respectively. Finally, the Different label is assigned when no match can be found between the two services. Using a classical sorting algorithm with respect to the input/output types, a re-grouping process is performed to assign similarity labels and to organize the Wrapper methods within their designated clusters. The specified similarity order fits the multi-case structure of HTN. Therefore, the Cluster-Head method must be created as an HTN multi-case method, with each case labeled by its similarity degree. Needless to say, such an

arrangement can guarantee that highly matching clusters are considered early enough in the search process. This can lead to a significant reduction in the search space by rapidly skipping over all unrelated clusters with minimal space traversal.

To illustrate further, Fig. 2 shows the GetBookInfo cluster and the relationships with its sub-clusters, along with the associated preconditions. It is worth noting that every sub-cluster has its own relation line towards the main cluster, which means that those methods are listed following their priority of similarity, not as a list of subtasks that decompose the main task. The other important point is the order in which the sub-clusters are listed from top to bottom, where the top three are all labeled EXACT-IO. Furthermore, a set of services, such as WS1, WS2, ..., WSn, under the first sub-cluster BookAuthor, have fully matched input and output parameters; those services take Book as input and produce Author as output.

TABLE I. PRIORITY ORDER OF SERVICE SIMILARITY

Class      Input  Output  Label
Exact-IO   X      X       4
Exact-I    X              3
Exact-O           X       2
Different  -      -       1

Figure 2. An Example of Clustered Domain. [Main cluster GetBookInfo with sub-clusters BookAuthor (EXACT_IO), AuthorBook (EXACT_IO), BookPrice (EXACT_IO), BookInput (EXACT_I), PriceOutput (EXACT_O), and Default (Different); services WS1, WS2, ..., WSn appear under a sub-cluster.]

IV. EXPERIMENT AND RESULTS

An experimental prototype of the proposed model has been implemented. Listing 1 shows a partial snapshot of an HTN domain generated by our proposed model. In the snapshot, the first two operators are represented by their own original Web service names, while their associated Wrapper, namely BookPrice, has an identical name in both cases. The code that follows them shows the Cluster-Head method, where the first three methods represent Exact-IO sub-clusters. The next method is an Exact-I sub-cluster. In fact, those sub-clusters are Wrapper methods. A simplified version of the main method is given at the end of the domain.

In this work, we focus only on measuring the performance of our proposed model. Therefore, in our experiments, a scenario-based clustered domain is automatically generated and then fed to the planner. Our prototype model is built on top of the HTN planner SHOP2. On the other hand, the OWL-S profile translation and the automatic I/O clustering are applied manually. All experiments are performed on a 64-bit Windows 7 system with 8 GB RAM and an Intel Core i7 720 CPU.

Listing 1. A Partial Snapshot of Clustered HTN Domain

    ; List of all operators with their Wrappers
    (:operator (!Bookprice1 ?_book) (book ?_book) () ((known ?_price)))
    (:method (BookPrice) ((book ?_book)) ((!Bookprice1 ?_book)))
    (:operator (!Bookprice2 ?_book) (book ?_book) () ((known ?_price)))
    (:method (BookPrice) ((book ?_book)) ((!Bookprice2 ?_book)))
    . .
    ; Cluster-head-method
    (:method (GetBookInfo) ((ISBN ?_ISBN) (Required book)) ((ISBNBook)))
    (:method (GetBookInfo) ((book ?_book) (Required author)) ((BookAuthor)))
    (:method (GetBookInfo) ((book ?_book) (Required price)) ((BookPrice)))
    (:method (GetBookInfo) ((book ?_book)) ((BookInput)))
    . .
    ; The main entrance to the problem
    (:method (main) () ((GetBookInfo) (Add-to-Cart) (Order) (Payment) (Dispatch))))
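The order of the GetBookInfo cases in Listing 1 follows the priority order of Table I. A minimal Python sketch of that labelling and ordering step is given below. It assumes that parameters are compared as plain sets of type names (no ontology reasoning), and the service names other than Bookprice1 are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Service(NamedTuple):
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


# Priority degrees taken from Table I (4 is the highest similarity).
PRIORITY = {"Exact-IO": 4, "Exact-I": 3, "Exact-O": 2, "Different": 1}


def similarity_label(a: Service, b: Service) -> str:
    """Label two services by how their input/output parameter sets match."""
    same_in = a.inputs == b.inputs
    same_out = a.outputs == b.outputs
    if same_in and same_out:
        return "Exact-IO"
    if same_in:
        return "Exact-I"
    if same_out:
        return "Exact-O"
    return "Different"


def rank_against(reference: Service, services: List[Service]) -> List[Tuple[str, str]]:
    """Label every service against `reference` and sort by descending priority.

    Python's sort is stable, so services with the same label keep their input order.
    """
    labelled = [(similarity_label(reference, s), s.name) for s in services]
    return sorted(labelled, key=lambda pair: PRIORITY[pair[0]], reverse=True)


# Assumed example: a request that supplies a book and needs a price.
request = Service("request", frozenset({"book"}), frozenset({"price"}))
candidates = [
    Service("AuthorIsbn1", frozenset({"author"}), frozenset({"ISBN"})),  # hypothetical
    Service("IsbnPrice1", frozenset({"ISBN"}), frozenset({"price"})),    # hypothetical
    Service("Bookprice1", frozenset({"book"}), frozenset({"price"})),
    Service("BookAuthor1", frozenset({"book"}), frozenset({"author"})),  # hypothetical
]
ranking = rank_against(request, candidates)
for label, name in ranking:
    print(f"{label:10s} {name}")
```

Sorting on the numeric degree rather than on the label text keeps the case order aligned with Table I even if the labels are renamed.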

The experiments start by feeding the planner with the scenario domain and problem files. The design of operators in our experimental domain is adopted from the Web service profile collection OWL-S TC [12]. Because this collection lacks the variety and number of services needed to evaluate our planning model, we have added more services to the experimental domain. Services are added incrementally. In each round, after adding 10% more services, a set of measures is recorded. The observed measures are mainly related to the size of the search space and the time consumed. More precisely, the measures are (1) number of visited nodes, (2) total number of Web services, (3) total number of plans, (4) domain-to-java compiling time (DT), (5) problem-to-java compiling time (PT), (6) actual planning time (PLNT), and (7) the time at which the first plan is found (1stPlanT).

Table II shows the speedup ratio between our model, the Clustered Model (CM), and the plain or Un-Clustered Model (UCM). The gain in speed is negligible for small domains (at most 3.8% for up to 40 services) and becomes considerable for larger ones (43.4% to 58.0% for 50 to 100 services). The highest speedup recorded is close to 60%, which leaves room for applying additional optimizations, e.g. choosing only the best plan.

The next set of experimental results is shown in Fig. 3. These experiments measure three time segments, namely DT, PT, and PLNT, in both the Un-Clustered Model and the Clustered Model. The DT and PT compiling times are steadily proportional to the number of services, and therefore no time gain is obtained there. However, as the number of services grows, CM needs less time than UCM to perform the entire planning process. Fig. 3 also marks the 1stPlanT for each experiment. In CM, the first plan is reached much faster than in UCM, which indicates the effectiveness of our approach.

Figure 3. Time (ms) vs. Number of Services. [Series: DT-CM, PT-CM, PLNT-CM, DT-UCM, PT-UCM, PLNT-UCM, 1st PlanT-CM, 1st PlanT-UCM; x-axis: Number of Services, 10 to 100; y-axis: time in ms, 0 to 2500.]

Finally, Fig. 4 shows the number of visited nodes during the whole planning process. The number of visited nodes in UCM grows linearly with the number of services. CM, on the other hand, visits fewer nodes during planning. The fewer nodes visited, the higher the performance of the entire planning process. In real-world applications, it is desirable to find a satisfactory plan by visiting as few nodes as possible, because an invocation of a Web service might take a nondeterministic time to respond.


TABLE II. PERFORMANCE RESULTS

Services  Plans/Services  Speedup
10        20.0%           0.5%
20        15.0%           3.8%
30        6.7%            3.5%
40        20.0%           2.5%
50        52.0%           44.4%
60        41.7%           43.4%
70        54.3%           45.6%
80        6.3%            44.0%
90        41.1%           45.6%
100       42.0%           58.0%

Figure 4. Number of visited nodes in CM and UCM modes. [Series: # Visited - CM, # Visited - UCM; x-axis: Number of Services; y-axis: number of visited nodes.]

V. CONCLUSION

In this paper, we have presented a new approach to HTN-based Web service composition. By extending HTN planning with two new methods, namely Wrapper and Cluster-Head, an I/O-based clustering is applied to enhance the efficiency of the planning domain. A set of experiments has been performed on our prototype system. The results show the effectiveness and efficiency of the proposed approach in terms of time and search space. Future work includes extending the approach to seek the best plan within real-world scenarios and applying a context-aware paradigm during the planning process.

REFERENCES

[1] B. Srivastava and J. Koehler, “Web service composition - current solutions and open problems,” in ICAPS 2003 Workshop on Planning for Web Services, pp. 28–35, 2003.
[2] M. Ghallab, D. Nau, and P. Traverso, “Hierarchical task network planning,” in Automated Planning: Theory and Practice, pp. 229–261, Burlington: Morgan Kaufmann, 2004.
[3] E. Sirin, B. Parsia, D. Wu, J. Hendler, and D. Nau, “HTN planning for web service composition using SHOP2,” Web Semant., vol. 1, no. 4, pp. 377–396, 2004.
[4] M. Klusch and A. Gerber, “Semantic web service composition planning with OWLS-XPlan,” in Proceedings of the 1st Int. AAAI Fall Symposium on Agents and the Semantic Web, pp. 55–62, 2005.
[5] Y. Bo and Q. Zheng, “A method of semantic web service composition based PDDL,” in Proceedings of IEEE International Conference on Service-Oriented Computing and Applications (SOCA), pp. 1–4, 2009.
[6] K. Sycara, M. Paolucci, A. Ankolekar, and N. Srinivasan, “Automated discovery, interaction and composition of semantic web services,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 1, no. 1, pp. 27–46, 2003.
[7] M. Klusch, B. Fries, and K. Sycara, “Automated semantic web service discovery with OWLS-MX,” in Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 915–922, ACM, 2006.
[8] Z. Azmeh, M. Huchard, C. Tibermacine, C. Urtado, and S. Vauttier, “Using concept lattices to support web service compositions with backup services,” in the Fifth International Conference on Internet and Web Applications and Services (ICIW), pp. 363–368, 2010.
[9] G. Fenza and S. Senatore, “Friendly web services selection exploiting fuzzy formal concept analysis,” Soft Comput., vol. 14, no. 8, pp. 811–819, 2010.
[10] M. Driss, N. Moha, Y. Jamoussi, J.-M. Jézéquel, and H. H. B. Ghézala, “A Requirement-Centric approach to web service modeling, discovery, and selection,” in Proceedings of ICSOC’10, pp. 258–272, 2010.
[11] Z. Azmeh, M. Driss, F. Hamoui, M. Huchard, N. Moha, and C. Tibermacine, “Selection of composable web services driven by user requirements,” in Proceedings of IEEE International Conference on Web Services (ICWS), pp. 395–402, 2011.
[12] M. Klusch, “OWLS-TC-v4: OWL-S service retrieval test collection,” 2010. http://projects.semwebcentral.org/projects/owls-tc/. (Last accessed July 12, 2012).

