2011 Third IEEE International Conference on Cloud Computing Technology and Science

A Cloud-unaware Programming Model for Easy Development of Composite Services

Enric Tejedor∗†, Jorge Ejarque∗, Francesc Lordan∗, Roger Rafanell∗, Javier Álvarez∗, Daniele Lezzi∗†, Raül Sirvent∗ and Rosa M. Badia∗‡

∗ Barcelona Supercomputing Center (BSC-CNS), Barcelona, Spain
† Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
‡ Artificial Intelligence Research Institute (IIIA), Spanish National Research Council (CSIC)
E-mail: [email protected]

Abstract: Cloud computing is inherently service-oriented: cloud applications are delivered to consumers as services via the Internet. Therefore, these applications can potentially benefit from the Service-Oriented Architecture (SOA) principles: they can be programmed as added-value services composed of pre-existing ones, thus favouring code reuse. However, new programming models are required to simplify their development, along with systems that are capable of orchestrating the execution of the resulting SaaS in the Cloud. In that regard, this paper presents Service Superscalar (ServiceSs), an alternative to existing PaaS which provides a programming model and execution runtime to ease the development and execution of service-based applications in clouds. ServiceSs is a task-based model: the user is only required to select the tasks, which can be services or regular methods, to be spawned asynchronously. The application, a composite service, is programmed in a totally sequential way and no API call needs to be included in the code. The runtime is in charge of automatically orchestrating the execution of the tasks in the Cloud, as well as of elastically deploying new virtual resources depending on the load. After describing the main characteristics of the programming model and the runtime, we evaluate the productivity of ServiceSs and show how it offers a good trade-off between programmability and runtime performance.

Keywords: Parallel programming models, Cloud computing, Service composition/orchestration, Productivity, PaaS, SaaS.

978-0-7695-4622-3/11 $26.00 © 2011 IEEE. DOI 10.1109/CloudCom.2011.57

I. INTRODUCTION

Cloud computing is emerging as an IT paradigm shift, enabling technology to be accessed as services delivered over the Internet. Enterprises can outsource to the Cloud any part of the IT stack, paying only for what is consumed; the levels of that stack, which ranges from hardware to applications, are known in the Cloud as Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) [1]. The ability to adjust virtualized cloud resources in real time to match demand makes it possible to reduce infrastructure costs and maintenance effort. Moreover, by facilitating and accelerating software development, the time to market of new products decreases.

The growing interest in providing applications as cloud services raises one question: how to develop such applications in order to take full advantage of the service-oriented nature of the Cloud? A potential answer is the Service-Oriented Architecture (SOA) style [2]. SOA is based on loosely-coupled services that correspond to specific business processes; furthermore, such services can be reused to create new composite services with an added value. The convergence of cloud computing and SOA creates a need for, on the one hand, programming models that ease the development of applications composed of services and, on the other, systems which orchestrate the execution of those services in the Cloud.

In that sense, the existing approaches fail to provide true cloud-unaware programming combined with easy service composition and orchestration. Some of them [3], [4] are PaaS that require the programmer to use APIs for interprocess communication or for accessing cloud storage; besides, the deployment and execution of applications with these vendors are tied to their own infrastructure. Others permit the graphical composition of service-based workflows [5], [6], [7], where the data dependencies between services need to be manually and statically specified; most of them also lack the ability to express conditions or introduce control flow statements as in an imperative language. Lastly, programming models for distributed infrastructures which are now applied to the Cloud are either not flexible enough for all kinds of applications [8] or do not offer special support for service composition and orchestration [9], [10].

In order to correct these shortcomings, this paper presents Service Superscalar (ServiceSs), a new programming model and runtime system for developing and running added-value composite cloud services. ServiceSs gathers previous experience of the authors in parallel programming models for grids and clusters [11], [12]; nevertheless, it goes a step further and addresses how cloud computing changes application architecture and development. In the ServiceSs model, only sequential programming skills are required to write SaaS composites; the inner services are easily invoked as normal methods with no use of any library or new syntax. Service tasks can be combined with non-service ones and all together can be automatically orchestrated by the runtime. This runtime is able to run in cloud environments and elastically reserve virtual resources depending on the number of requested tasks. Moreover, ServiceSs prevents vendor lock-in since it can potentially work on top of any cloud provider.


The paper is organized as follows. Section II introduces the ServiceSs programming model. Section III describes how the ServiceSs runtime orchestrates a composite in the Cloud. Section IV illustrates the programming model and runtime operation with a real example. Section V evaluates the runtime performance. Section VI discusses the related work. Section VII presents the conclusions and future work.

II. PROGRAMMING MODEL

The ServiceSs programming model is a new approach for the easy development of cloud applications as composite services. A composite is written as a sequential Java program from which other services and regular methods are called. Therefore, composites can be hybrid codes that reuse functionalities wrapped in services or methods, adding some value to create a new product that can also be published as a service. Moreover, the Cloud is kept transparent to the programmer as no information about aspects like deployment or scheduling has to be included in the code.

The model can be defined as task-based and dependency-aware. In a first step, the user selects the set of services and methods called from the composite that she intends to run as tasks in parallel on the available resources. The task selection is done by means of a Java interface which declares those services/methods, along with some metadata in the form of Java annotations [13]. One of these annotations is used to state the direction of each task parameter (input, output or in-out); with this information the runtime discovers, at execution time, the data dependencies between tasks and dynamically builds a task dependency graph. Tasks can handle the same kinds of data that appear in any Java program (primitives, objects, files).

In summary, a ServiceSs composite service is programmed sequentially, while all the information needed for data-dependency detection and task-based parallelization is contained in a separate annotated Java interface. From this point on, the composite service will be referred to as Orchestration Element (OE). Those services and methods invoked from the OE and selected as tasks will be called Core Elements (CE), and the interface where they are chosen will be the Core Element Interface (CEI). The next subsections go into further details about how the CEI and the OE are defined.

A. Core Element Interface

A CEI is a plain Java interface for selecting the CEs of a given OE. Figure 1(a) is an example of a CEI, where two service CEs and one method CE are declared. A service CE corresponds to a SOAP [14] web service operation, defined in a WSDL [15] document. In a CEI, service CEs are declared as normal Java methods whose name and parameters match exactly those of the service operation to which they refer. The Java classes of the operation parameters or return value, like R and Q for statelessService, are generated from the WSDL. On the other hand, a method CE maps to a Java method.

Every CE is preceded by an annotation, at method level, which helps to uniquely identify that CE. In the case of a service CE, the @Service annotation contains attributes to determine the corresponding SOAP web service operation, namely the namespace, the service name and the port name; the operation name is the one used in the declaration, e.g. statelessService. For method CEs, the @Method annotation specifies the class implementing the method. Optionally, a @Constraints annotation can be added to a method CE to describe resource constraints for that method (see [11] for more details). At CE parameter level, the @Parameter annotation is used to state its direction.

CEs can be void, like statefulService and postProcess, or return a value like statelessService. They can also be static or non-static (i.e. always operate on an instance of an object); this can be used to distinguish between stateless or stateful services, as explained in Section II-B.

(a)
    public interface SampleCEI {
        @Service(namespace = "http://servicess.com/example", name = "StatelessWS", port = "samplePort")
        R statelessService(
            @Parameter(direction = IN) Q query
        );

        @Service(namespace = "http://servicess.com/example", name = "StatefulWS", port = "samplePort")
        void statefulService(
            @Parameter(direction = IN) U update
        );

        @Method(declaringClass = "simple.Example")
        void postProcess(
            @Parameter(direction = INOUT) R reply
        );
    }

(b)
     1  public class SampleWS {
     2      @Orchestration
     3      public static void sampleComposite() {
     4          Q query = new Q();
     5          R reply = statelessService(query);
     6
     7          postProcess(reply);
     8
     9          reply.printToLog();
    10
    11          ... // other statements
    12
    13          StatefulWS s = new StatefulWS();
    14          for (int i = 0; i < numIter; i++)
    15              s.statefulService(new U(i));
    16      }
    17  }

Figure 1. Sample CEI (a) and its associated OE (b). Two service operations (statelessService and statefulService) and one method (postProcess) are selected as CEs. Regarding the service CEs, the @Service annotation specifies their namespace, name and port. In the case of postProcess, the @Method annotation contains its declaring class simple.Example.
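The paper names the CEI annotations but does not list their declarations. As a minimal sketch of how such metadata could be declared and read back with standard Java reflection, the block below defines hypothetical versions of them. Everything in it is an assumption made for illustration: the annotation bodies, the Direction enum, the placeholder types Q and R, and the describe helper. The paper's @Method is renamed @MethodCE here only to avoid a clash with java.lang.reflect.Method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CeiSketch {
    // Hypothetical declarations: the paper does not show how these are defined.
    enum Direction { IN, OUT, INOUT }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Service { String namespace(); String name(); String port(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MethodCE { String declaringClass(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Parameter { Direction direction(); }

    // Placeholder data types standing in for Q and R of Figure 1.
    static class Q {}
    static class R {}

    interface SampleCEI {
        @Service(namespace = "http://servicess.com/example", name = "StatelessWS", port = "samplePort")
        R statelessService(@Parameter(direction = Direction.IN) Q query);

        @MethodCE(declaringClass = "simple.Example")
        void postProcess(@Parameter(direction = Direction.INOUT) R reply);
    }

    /** Returns one line per declared CE: its kind, its name and its parameter directions. */
    static List<String> describe(Class<?> cei) {
        List<String> lines = new ArrayList<>();
        for (Method m : cei.getDeclaredMethods()) {
            String kind = m.isAnnotationPresent(Service.class) ? "service" : "method";
            StringBuilder sb = new StringBuilder(kind + " " + m.getName() + ":");
            for (java.lang.reflect.Parameter p : m.getParameters()) {
                sb.append(" ").append(p.getAnnotation(Parameter.class).direction());
            }
            lines.add(sb.toString());
        }
        Collections.sort(lines); // reflection does not guarantee declaration order
        return lines;
    }

    public static void main(String[] args) {
        describe(SampleCEI.class).forEach(System.out::println);
    }
}
```

Runtime retention is what lets a runtime system inspect the interface while the application executes; a tool that only works at build time could use a weaker retention policy.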

B. Orchestration Element

Once the CEI is completed, the selected CEs can be invoked inside an OE. Continuing with the example, Figure 1(b) depicts a web service class that contains the composite (sampleComposite, lines 3-16) associated with the CEI in (a). The @Orchestration annotation marks a given method as an OE or composite service (line 2). The body of the OE is programmed as any sequential Java code, without using any API or special syntax constructs. The CEs are invoked as normal Java methods, and the runtime will be in charge of replacing these local invocations with the creation of CEs, also inspecting their data dependencies. The next subsections point out some key operations in the sample OE of Figure 1(b) and explain how the runtime behaves in each case.

1) Service CE Invocation: Lines 4-5 show a simple example of service CE invocation. Let us assume that statelessService implements a service with no state which receives a query and produces a reply. The call to statelessService from the OE is performed on a local representative of the service operation with the same signature. Note that this representative is generated beforehand only for invocation purposes; it will never actually be executed, since the runtime will replace its call with a service CE creation.

2) Method CE Invocation and Dependencies: After the CE for statelessService is asynchronously generated, the OE continues its execution and reaches line 7 with a call to a method CE. This method performs a post-process on the return value of statelessService, thus requiring the result of the latter. This is a case of a data dependency between two CEs, which is transparently handled by the runtime: the postProcess CE will not start until the statelessService CE finishes and its result is transferred to the resource where postProcess will run.

3) Synchronization in OE: Data updated by a CE can eventually be accessed from the OE, as in line 9, where a local method is called on the reply object to print it to a log file. In such a case, the runtime must enforce synchronization: the thread running the OE is blocked until the postProcess CE updates the reply object and that value is obtained.

4) Service CEs with State: Lines 13-15 illustrate how to program invocations to services with state. Such invocations may modify the internal state of the service, and consequently they must be serialized to ensure coherency. This is accomplished by invoking a non-static local representative, the associated object representing the service state. Again, the class declaring the representative is pre-generated to simplify programming. First, the service object s of class StatefulWS (which is the service name) is created in line 13. Then, a series of numIter statefulService CEs are spawned asynchronously and added to the CE dependency graph. They are all invoked on object s, which acts as an implicit in-out parameter; as a result, the CEs will be arranged as a chain in the graph, which guarantees serialization.
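The behaviour described in points 1) to 4) can be imitated by hand with the standard CompletableFuture API, which may help to see what the runtime automates. This is only an analogy under stated assumptions, not the ServiceSs mechanism: the local stubs statelessService and postProcess, the String payloads, the list used as service state and the thread pools are all hypothetical stand-ins. In ServiceSs none of this plumbing appears in the OE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncOeSketch {
    // Local stand-ins for the CEs of Figure 1; a real service CE would be a remote call.
    static String statelessService(String query) { return "reply(" + query + ")"; }
    static String postProcess(String reply) { return reply.toUpperCase(); }

    /** Points 1) to 3): asynchronous spawn, data dependency, synchronization on access. */
    static String runComposite(String query) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Service CE: spawned asynchronously, the caller continues immediately.
            CompletableFuture<String> reply =
                CompletableFuture.supplyAsync(() -> statelessService(query), pool);
            // Method CE: needs the reply, so it is chained after the first task.
            CompletableFuture<String> processed =
                reply.thenApplyAsync(AsyncOeSketch::postProcess, pool);
            // Synchronization: the main thread blocks only when it needs the value.
            return processed.join();
        } finally {
            pool.shutdown();
        }
    }

    /** Point 4): invocations on the same stateful object form a chain and run in order. */
    static List<Integer> runStateful(int numIter) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> state = new ArrayList<>(); // stands in for the service's internal state
            CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
            for (int i = 0; i < numIter; i++) {
                final int update = i;
                // Each update depends on the previous one, which serializes them.
                chain = chain.thenRunAsync(() -> state.add(update), pool);
            }
            chain.join();
            return state;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runComposite("q"));
        System.out.println(runStateful(4));
    }
}
```

Even with a pool of four threads, runStateful applies the updates in submission order, because every task is chained to its predecessor. This mirrors the chain that the implicit in-out parameter s produces in the CE graph.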

III. SERVICE ORCHESTRATION

The programming model presented in Section II allows a service developer to write composite services. These composites are declared in a service class, possibly together with other methods which can be normal service operations. The next subsections go through the steps of how a service programmed with ServiceSs is deployed and orchestrated.

A. Service Class Instrumentation

The deployment of a service requires the instrumentation of its class. This process takes as input the class of the service, containing the OEs, and the annotated CEI where the CEs were selected. Javassist [16], a Java library for class editing, is used by the ServiceSs runtime to replace the calls to the selected CEs from inside an OE with the creation of asynchronous CE tasks; moreover, code to trigger synchronization at OE level is inserted as well.

B. Service Deployment

After the instrumentation phase, the service can be deployed in a container publishing the methods it offers in a service interface. In the deployment phase an initial set of virtual resources is requested for hosting the service container (which encompasses the OEs) as well as the different method CEs, taking into account the resource constraints specified by the user. Figure 2 depicts such a scenario, where a service containing several composites (OEs) has been published for them to be consumed by service users. Once the service is deployed, whenever a request for one of the service OEs is received by the container, the execution of that OE starts.

C. Service Operation

1) Dependency Analysis: Every call to a selected CE from an OE implies the asynchronous creation of a CE node in a dependency graph. The graph is automatically built as an OE executes, based on the data dependencies between CEs. These dependencies are discovered by inspecting the parameters of each CE and taking into account their direction as specified in the CEI.
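One simple way to derive such a graph from parameter directions is to remember, for every datum, the last task that wrote it, and to add an edge from that task to each later reader. The sketch below does exactly that and nothing more. It is an assumption about how dependency discovery could work, not the ServiceSs implementation: the class and method names are hypothetical, data are identified by plain strings, and only read-after-write dependencies are tracked (write-after-read and write-after-write hazards are ignored).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DependencyGraphSketch {
    enum Direction { IN, OUT, INOUT }

    /** One parameter access of a task: which datum it touches and in which direction. */
    static final class Access {
        final String datum;
        final Direction direction;
        Access(String datum, Direction direction) { this.datum = datum; this.direction = direction; }
    }

    static Access access(String datum, Direction direction) { return new Access(datum, direction); }

    private final Map<String, Integer> lastWriter = new HashMap<>();
    private final List<Set<Integer>> predecessors = new ArrayList<>();

    /** Registers a task and returns its id; its predecessors are the last writers of what it reads. */
    int addTask(Access... accesses) {
        int id = predecessors.size();
        Set<Integer> deps = new TreeSet<>();
        for (Access a : accesses) {               // reads first: IN and INOUT parameters
            Integer writer = lastWriter.get(a.datum);
            if (a.direction != Direction.OUT && writer != null) deps.add(writer);
        }
        for (Access a : accesses) {               // then writes: OUT and INOUT parameters
            if (a.direction != Direction.IN) lastWriter.put(a.datum, id);
        }
        predecessors.add(deps);
        return id;
    }

    Set<Integer> predecessorsOf(int task) { return predecessors.get(task); }

    public static void main(String[] args) {
        DependencyGraphSketch g = new DependencyGraphSketch();
        // The OE of Figure 1(b): the return value is modelled as an OUT datum named "reply".
        int stateless = g.addTask(access("query", Direction.IN), access("reply", Direction.OUT));
        int post = g.addTask(access("reply", Direction.INOUT));
        // Stateful calls: the service object s acts as an implicit INOUT parameter.
        int first = g.addTask(access("s", Direction.INOUT));
        int second = g.addTask(access("s", Direction.INOUT));
        System.out.println("postProcess waits for " + g.predecessorsOf(post));
        System.out.println("second stateful call waits for " + g.predecessorsOf(second));
    }
}
```

Tasks whose predecessor set is empty, or whose predecessors have all finished, are the ones a scheduler could dispatch at any given moment.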
Figure 2. Architecture of the ServiceSs runtime. A service hosted in a WS container can be accessed by any service consumer (e.g. web portal, application). The interface of this service offers several operations, which can be OEs (composites) previously written by a service developer following the ServiceSs programming model. When the container receives a request for a given OE, the orchestration of the selected CEs starts, generating the corresponding CE dependency graph on the fly. Service CEs will lead to the invocation of external services (possibly deployed in the Cloud), while method CEs can be run either on virtualized cloud resources or physical ones.

The CE graph is key for the orchestration of the composite service since it contains information about what can be run at every moment and how the data has to flow between CEs.

2) Scheduling & Resource Management: The parallelism exhibited by the graph is exploited as much as possible, resolving the data dependencies and thus scheduling the CEs ready to be executed on the available resources. Such resources can be VMs under the supervision of the ServiceSs runtime, in the case of method CEs, or external services for service CEs. The runtime balances the load considering CE constraints, resource capabilities and data locality. Moreover, the ServiceSs runtime can adapt resource consumption to the topology of the graph; if a region of the graph with more concurrency is reached and the number

of ready method CEs grows, overloading the available resources, the ServiceSs runtime reacts by requesting new VMs on which to schedule more CEs. Once this region has been executed, the number of CEs decreases and the load in these VMs is reduced. In this situation, the runtime shuts down those VMs which are not executing CEs. This algorithm for dynamic resource provisioning will not be further described since it is not a contribution of this paper. The provisioning is achieved by contacting cloud providers through a configurable connector; currently, connectors for Amazon EC2 [17] and the Open Cloud Computing Interface (OCCI) [18] are supported.

Regarding the service CEs, since they are provided by external entities, their resources are not under the control of the resource management system of the ServiceSs runtime. However, ServiceSs allows defining different locations for a service and setting a limit on concurrent invocations for each location, so that the external services are not overloaded.

3) Data Management: ServiceSs manages the data accessed by CEs and how it flows between CEs connected in the graph. The data generated by a method CE in a given virtual machine may be transferred to another resource if a second CE is scheduled on that resource. In addition, data exchange can also occur between two service CEs, or between a service CE and a method CE. For instance, if a CE produces an input parameter for a service CE, that value is obtained before invoking the service. Similarly, the result of invoking a service CE can be passed to another CE.

4) CE Execution: Finally, the ServiceSs runtime also submits the CEs for execution and monitors their completion. The way CEs are submitted for execution depends on their type. The invocation of service CEs is performed by

using dynamic clients generated through Apache CXF [19], configured with the data extracted from the @Service annotation properties. Method CEs are run using JavaGAT [20], a uniform interface with adaptors for various underlying protocols; one of these adaptors works on top of SSH libraries and is used to connect to virtual machines and start computations on them.

IV. A REAL EXAMPLE: A COMPOSITE SERVICE FOR GENE DETECTION

In order to better illustrate the advantages of programming and executing a composite service with ServiceSs, a real-world Life Sciences application has been evaluated by porting the existing sequential code to ServiceSs and executing it on a Cloud testbed. The next subsections describe the considered algorithm and the required steps to implement it as a ServiceSs application, highlighting the simplicity of the proposed solution.

A. The Original Algorithm

The original application is a gene detection code [21] designed by members of the Life Sciences department of the Barcelona Supercomputing Center. Its core algorithm is GeneWise [22], a program for identifying genes in a genomic DNA sequence. First, the application finds a set of relevant regions in the DNA sequence, and then runs GeneWise only for those regions, which is faster than scanning the whole DNA. In the original Perl version, a set of publicly available bioinformatics services are invoked by means of WS libraries; in addition, those invocations are synchronous, and therefore no parallelism is achieved between different service calls.


Figure 3. Gene detection composite service. The dependency graph of the whole orchestration is depicted on the right of the figure: circles correspond to method CEs and diamonds map to service CE invocations, while stars represent synchronizations due to accesses on CE result values. A snippet of the OE code is represented, focusing on a particular fragment which runs BLAST against a set of amino acid sequences. The graph section corresponding to this piece of code is also highlighted in the overall structure of the OE.
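A loop of the kind described in the caption can be approximated in plain Java as follows. This is a hypothetical sketch, not the authors' code: only the names prepareBlast and runBlast come from the paper, while the BlastRequest and BlastReport types, the signatures, the stub bodies and the blastAll helper are assumptions. The stubs run locally and in order; under ServiceSs the same sequential loop would instead spawn one method CE and one service CE per iteration.

```java
import java.util.ArrayList;
import java.util.List;

class GeneDetectionLoopSketch {
    // Hypothetical request and report types, invented for this illustration.
    static final class BlastRequest {
        final String sequence;
        final String database;
        BlastRequest(String sequence, String database) { this.sequence = sequence; this.database = database; }
    }

    static final class BlastReport {
        final String summary;
        BlastReport(String summary) { this.summary = summary; }
    }

    // Stand-in for the method CE that builds the request object.
    static BlastRequest prepareBlast(String sequence, String database) {
        return new BlastRequest(sequence, database);
    }

    // Stand-in for the service CE; a real one would invoke a remote BLAST service.
    static BlastReport runBlast(BlastRequest request) {
        return new BlastReport(request.sequence + " vs " + request.database);
    }

    /** Sequential loop: each runBlast call consumes the request built by prepareBlast. */
    static List<BlastReport> blastAll(List<String> sequences, String genomeDb) {
        List<BlastReport> reports = new ArrayList<>();
        for (String sequence : sequences) {
            BlastRequest request = prepareBlast(sequence, genomeDb); // method CE
            reports.add(runBlast(request));                          // service CE, depends on request
        }
        return reports;
    }

    public static void main(String[] args) {
        for (BlastReport r : blastAll(List.of("seqA", "seqB"), "genomeDB")) {
            System.out.println(r.summary);
        }
    }
}
```

The only dependency inside an iteration is the request object passed from prepareBlast to runBlast, so different iterations are independent of each other and could run in parallel.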

B. Programming and Executing with ServiceSs

In order to address the aforementioned shortcomings and exploit the ServiceSs features, the gene detection application was ported to Java following the steps of the ServiceSs programming model, namely: first, defining a CEI to identify both kinds of CEs (bioinformatics services and regular methods); second, programming the OE as a sequential application that invokes the selected CEs.

The structure of the resulting OE is represented on the right side of Figure 3. Each box represents a different part of the OE which contributes to the overall process: translation of the input genomic DB to a given format; retrieval of a list of amino acid sequences which are similar to a reference input sequence; search of the relevant regions of the genomic database and execution of the GeneWise algorithm on them. In each of these parts CEs are invoked, thus leading to the creation of the dependency graph; the fragment of the graph generated in every part is shown inside the boxes.

The leftmost side of Figure 3 contains a brief summary of the OE code where the calls to the CEs are highlighted. The represented loop runs, for every sequence, the BLAST [23] program against the genomic database to find a set of relevant regions.

The gene detection OE is programmed sequentially, just like the original Perl script. Nevertheless, the behaviour of both versions differs significantly at execution time. The following points are worth noting:

• While the Perl script runs in a totally sequential way, ServiceSs is able to exploit the inherent parallelism between the CEs.
• runBlast, a selected CE which will invoke a BLAST service, is called as any other regular method. The ServiceSs runtime will replace this call with the creation of a service CE.
• Service CEs like runBlast can be combined with method CEs like prepareBlast. At each iteration, the latter prepares the request object for the former, which causes a data dependency between them.
• The prepareBlast and runBlast invocations lead to the asynchronous generation of CEs, so that the OE goes through all the iterations without stopping at any point. As a result, the part of the graph shown in the middle of Figure 3 is created.
• The data dependencies between the CEs are automatically discovered and represented as arrows in the graph. With this information, the ServiceSs runtime will ensure coherency when orchestrating the execution of the CEs in the Cloud.

V. EVALUATION

This section evaluates ServiceSs from a performance point of view. Two scenarios are considered here for analyzing, first, the speedup of a composite which uses publicly available services and, second, how the runtime exploits

cloud elasticity in the scheduling phase to adapt to a dynamic number of method CEs.

A. Speedup - Use of Public Services

This series of tests focuses on the core part of the workflow described in Section IV, based on an execution of a BLAST computation. The algorithm is composed, on the one hand, of a computational part programmed as service CEs which invoke the BlastProDom [24] public web service for each fragment of the input data and, on the other, of a merge phase implemented as a set of regular method CEs. In particular, a set of 5000 input protein sequences is split into eight fragments and then compared against the ProDom database with BLAST. Figure 4 depicts the dependency graph automatically generated by the runtime as a result of the invocation of CEs from the OE.

Figure 4. Dependency graph generated on-the-fly by the ServiceSs runtime for a particular execution of the BLAST algorithm, where the input data are split in eight parts. A total of eight BLAST service CEs, represented as diamonds, are first spawned from the OE, followed by seven method CEs (circles) that merge the BLAST results.

Figure 5 shows the application speedup according to the limit of CEs allowed to run simultaneously on the web service. The speedup grows close to the theoretical limit up to 8 simultaneous service invocations (since it is the maximum number of concurrent invocations the service accepts).

[Plot: speedup (0 to 10) versus number of simultaneous tasks (1 to 16).]

Figure 5. BLAST application speedup depending on the number of simultaneous tasks sent to the public service.

B. Cloud Elasticity for Method CEs

The main goal of these tests is to bring out the benefits offered by the Cloud in terms of scaling up and down the computational infrastructure. One of the most important advantages of the Cloud is that it provides an execution environment adjustable to specific and dynamic computational needs. Regarding a ServiceSs execution, this makes it possible both to reduce the response time of an OE invocation and to optimize the usage of resources in terms of cost.

[Plot: number of cores (0 to 128) versus time (0 to 1100 s), one curve per configuration: 8, 16, 32 and 64 initial cores.]

Figure 6. Evolution of the amount of resources reserved for an HMMPfam execution starting with 8, 16, 32 and 64 cores.

In this case, a different approach has been taken by implementing the computationally-intensive phase of the OE as method CEs using the HMMPfam [25] tool that, much like BLAST, compares a set of DNA sequences against a protein database; the merge phase of partial results has been implemented as service CEs. The topology of the generated graph, though, also corresponds to a binary tree, similar to the one in Figure 4. The experiments have been conducted on an Amazon EC2 testbed using extra-large instances.

Besides the ability to run on a cloud infrastructure, the ServiceSs runtime is also capable of exploiting its elasticity. The amount of resources reserved may change depending on the workload of the runtime, as depicted in Figure 6. At the beginning of the execution, a workload peak happens when the runtime deals simultaneously with all the comparison CEs, which are ready to run since they have no data dependencies. From this moment on, the runtime requests more instances in order to improve the task throughput and, therefore, it better exploits the parallelism inherent in the application (i.e. that of the dependency graph). The user can limit the amount of resources that the runtime can reserve; in these experiments, this upper limit was set to 128 cores. Once most of the method CEs have been executed, the number of CEs ready to be run decreases; the runtime detects this reduction in the load and consequently starts to switch off those VMs which will not be used anymore.

Table I presents the execution times of comparing 25,000 sequences against a database with 725 protein families using HMMPfam. Each row contains two execution times for an

initial number of CPUs. The first one corresponds to an execution where no resources are added dynamically and the second one benefits from the elasticity of the Cloud. The results confirm that the performance is improved if more processors are used and that the introduction of elasticity mechanisms brings a remarkable gain rate, as shown in the last column. This feature also makes it possible to improve the usage of resources without having to bear the high cost of reserving a bigger pool of nodes throughout the whole run.

Table I
EXECUTION TIMES IN SECONDS FOR 8, 16, 32, 64 AND 128 CPUS WITH AND WITHOUT ELASTICITY

    Initial CPUs   Without elasticity   With elasticity   Gain rate
    8              11573                1078              10.74
    16             5851                 1071              5.46
    32             3041                 1060              2.87
    64             1731                 1003              1.73
    128            897                  897               -

VI. RELATED WORK

Numerous Platform-as-a-Service (PaaS) solutions have appeared to facilitate the process of developing, deploying and running applications in the Cloud. Some of them propose programming models that offer APIs to write applications. In the Microsoft Azure [3] Cloud programming model applications are structured in roles, which use APIs to communicate (queues) and to access persistent storage (blobs and tables). Google App Engine [4] provides libraries to invoke external services and queue units of work (tasks) for execution; furthermore, it allows applications programmed in the MapReduce model to be run. In contrast to these platforms, ServiceSs does not require including any API call in the application code; CE creation (either from regular methods or services), data transfer and synchronization are handled automatically by the runtime. Moreover, data dependencies between CEs do not need to be managed manually in the application code since they are resolved by the runtime. Regarding the expressiveness of the models, ServiceSs is more flexible than MapReduce because its applications can generate any arbitrary CE (task) graph. Such generality is also pursued by Dryad [26], but in this case the graph has to be created programmatically by means of C++ libraries and overloaded operators. Finally, the aforementioned PaaS proposed by Microsoft and Google restrict the deployment and execution of their applications to their own infrastructure; by contrast, ServiceSs can potentially work on top of any cloud provider.

Another kind of approach allows the graphical composition of an application workflow whose nodes can be services. Some vendors have implemented their own WSBPEL [5] visual editor to create orchestrations of services which, in turn, can also be published as services; BPEL features control flow statements (if-then-else, while) and the data flow is defined by linking services. In the SWPT [6] GUI, in addition, the user can specify a semantic description for each task, enabling the discovery of the service that provides that task at execution time. The FAST [7] platform allows non-expert users to construct mashups out of pre-built gadgets whose core building blocks are services. In contrast to these approaches, the workflow of a ServiceSs application (the CE dependency graph) is not defined graphically, but dynamically created as the main program runs: each invocation of a method or service is replaced on-the-fly by the creation of an asynchronous CE which is added to the graph. ServiceSs only requires skills in sequential programming; no knowledge of multithreading, parallel/distributed programming or service invocation is necessary. While semantic information in CEs is not supported yet, it could be added as an interface annotation instead of selecting a particular instance of a service. JOLIE [27] allows service compositions to be programmed textually, but unlike ServiceSs it uses a custom syntax and requires the user to deal with parallelism and synchronization explicitly.

Similarly to ServiceSs, other projects have evolved from the Grid/Cluster scenario into a more Cloud/service-oriented perspective. Aneka [9], originally a .NET-based software system for the creation of enterprise grids, has moved to a market-oriented cloud platform. While the most important changes in Aneka concern its runtime and how it manages dynamic provisioning and accounting of virtual resources, the programming model it offers does not address the easy development of composite services. ProActive offers a new resource manager that has been developed in order to mix Cloud and Grid resources [10], but the programming model lacks proper service-orientation: although an active object can be deployed as a service, there is no special support for orchestration of several service active objects.

VII. CONCLUSIONS AND FUTURE WORK

This paper presented ServiceSs, a new programming framework that aims to fill the gap left by the lack of tools for the transparent development and execution of science applications in cloud environments. By writing only a sequential Java program and defining an interface, service developers can create complex orchestrations that invoke remote services and local libraries in a very simple way, without using APIs. The ServiceSs programming model is cloud-unaware, as the user does not have to provide any resource deployment or scheduling information. The ServiceSs runtime implements advanced scheduling policies for the optimized usage of resources, elastically adapting the number of available nodes to the computational load. A configurable connector, which currently supports Amazon EC2 and OCCI clouds, allows the use of different Cloud infrastructures and thus prevents vendor lock-in. The proposed framework has been evaluated through the implementation of real-world use cases. Experiments show that the runtime is able to make use of public services in a composite service execution, achieve good scalability and dynamically obtain new cloud resources depending on the application needs.

Future work includes support for REST web services in the composites, in addition to the SOAP ones already supported. The development of an Integrated Development Environment (IDE) is planned in order to provide developers with a complete programming tool. This IDE will include features such as the possibility of selecting service and method CEs graphically, and it will automate all the steps of code generation and deployment that are currently performed manually using scripts.

ACKNOWLEDGMENTS

This work has been supported by the Spanish Ministry of Science and Innovation (contract no. TIN2007-60625 and CSD2007-00050), by the Universitat Politècnica de Catalunya with a pre-doctoral grant and by the European Commission (grant agreement no. 257115, OPTIMIS project). We would also like to thank Romina Royo and Dmitry Repchevsky for their valuable help in the porting of a gene detection algorithm to ServiceSs.

REFERENCES

[1] P. Mell and T. Grance, “The NIST Definition of Cloud Computing (Draft). Recommendations of the National Institute of Standards and Technology,” Nist Special Publication, vol. 145, no. 6, pp. 1–2, 2011.
[2] D. S. Linthicum, Cloud Computing and SOA Convergence in Your Enterprise: A Step-by-Step Guide, 1st ed. Addison-Wesley Professional, 2009.
[3] “Microsoft Azure,” http://www.microsoft.com/azure/.
[4] “Google App Engine,” http://code.google.com/appengine/.
[5] “OASIS Web Services Business Process Execution Language,” http://www.oasis-open.org/committees/wsbpel/.
[6] G. Avellino, S. Beco, B. Cantalupo, and A. Cavallini, “A semantic workflow authoring tool for programming grids,” in Proceedings of the 2nd workshop on Workflows in support of large-scale science, ser. WORKS ’07. New York, NY, USA: ACM, 2007, pp. 69–74.
[7] Volker Hoyer et al., “The FAST Platform: An Open and Semantically-Enriched Platform for Designing Multi-channel and Enterprise-Class Gadgets,” in International Conference on Service Oriented Computing, 2009, pp. 316–330.
[8] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Commun. ACM, vol. 51, pp. 107–113, January 2008. [Online]. Available: http://doi.acm.org/10.1145/1327452.1327492
[9] C. Vecchiola, X. Chu, and R. Buyya, “Aneka: A Software Platform for .NET-based Cloud Computing,” Computing Research Repository, vol. abs/0907.4, 2009.
[10] B. Amedro et al., “An efficient framework for running applications on clusters, grids and clouds,” in Cloud Computing: Principles, Systems and Applications. Springer Verlag, 2010.
[11] E. Tejedor and R. Badia, “COMP Superscalar: Bringing GRID superscalar and GCM Together,” in 8th IEEE International Symposium on Cluster Computing and the Grid, 2008.
[12] E. Tejedor, M. Farreras, D. Grove, R. M. Badia, G. Almasi, and J. Labarta, “ClusterSs: A Task-Based Programming Model for Clusters,” 20th International ACM Symposium on High-Performance Parallel and Distributed Computing, June 2011.
[13] “Java annotations,” http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html.
[14] “Simple Object Access Protocol,” http://www.w3.org/TR/soap/.
[15] “Web Services Description Language,” http://www.w3.org/TR/wsdl.
[16] “Java programming assistant,” http://www.javassist.org.
[17] “Amazon Elastic Compute Cloud,” http://aws.amazon.com/es/ec2/.
[18] “Open Cloud Computing Interface,” http://www.occi-wg.org/.
[19] “Apache CXF,” http://cxf.apache.org/.
[20] G. Allen et al., “The Grid Application Toolkit: Towards Generic and Easy Application Programming Interfaces for the Grid,” in Proceedings of the IEEE, vol. 93, no. 3, Mar. 2005, pp. 534–550.
[21] R. Royo, J. López, D. Torrents, and J. Gelpi, “A BioMoby-based workflow for gene detection using sequence homology,” in International Supercomputing Conference (ISC’08), Dresden (Germany), 2008.
[22] E. Birney, M. Clamp, and R. Durbin, “GeneWise and Genomewise,” Genome Research, vol. 14, no. 5, pp. 988–995, May 2004.
[23] S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, pp. 403–410, 1990.
[24] F. Servant et al., “Prodom: Automated clustering of homologous domains,” Briefings in Bioinformatics, vol. 3, no. 3, pp. 246–251, 2002.
[25] “HMMER: biosequence analysis using profile hidden Markov models,” http://hmmer.janelia.org.
[26] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: Distributed data-parallel programs from sequential building blocks,” in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007. New York, NY, USA: ACM, 2007, pp. 59–72.
[27] F. Montesi, C. Guidi, and G. Zavattaro, “Composing Services with JOLIE,” in Proceedings of the Fifth European Conference on Web Services. Washington, DC, USA: IEEE Computer Society, 2007, pp. 13–22.
