Application Portability in Cloud Computing: An Abstraction Driven Perspective

Ajith Ranabahu, E. Michael Maximilien, Amit Sheth, and Krishnaprasad Thirunarayan

Abstract—Cloud computing has changed the way organizations create, manage, and evolve their applications. While the abundance of computing resources at low cost opens up many possibilities for migrating applications to the cloud, this migration also comes at a price. Cloud applications, in many cases, depend on certain provider-specific features or services. In moving applications to the cloud, application developers face the challenge of balancing these dependencies to avoid vendor lock-in. We present an abstraction-driven approach to address the application portability issues and focus on the application development process. We also present our theoretical basis and experience in two practical projects where we have applied the abstraction-driven approach.

Index Terms—Cloud computing, Domain Specific Languages, Application Generation
1 INTRODUCTION
Cloud computing is one of the most notable evolutions in computing. The availability of seemingly unlimited, readily provisioned, pay-per-use computing resources has not only spawned a number of new industries but has also changed the mindset of all information technology (IT) centric businesses. Larger tech businesses now offload their overflow computing requirements to computing clouds, while technology startups use them to establish their IT infrastructure without a heavy up-front capital expenditure.

The adoption of clouds by organizations, however, does not imply that all the challenges in using computing clouds have been well understood. Clouds offer access to cheap, abundant computing resources, but the most appropriate utilization of these resources is still limited by the unavailability of relevant software stacks. For example, infrastructure as a service (IaaS) clouds offer the ability to quickly and programmatically provision computing instances, but it is up to the user programs to make use of this capability, say, by dynamic load balancing. Some cloud service providers offer platform services where the difficulties in scaling and load management are transparent to certain types of user programs, e.g., Web applications. User programs merely adhere to a set of predefined software frameworks and the platform takes care of the otherwise mundane tasks such as load balancing. These platforms, however, are focused on limited technical domains and thus are not applicable across all types of applications. For example, Google App Engine (GAE)^1, one of the leading cloud platform service providers, supports only a limited set of Web development frameworks and two data storage options. Similarly, the Windows Azure cloud^2 primarily supports the .NET platform and has limited support for other languages and frameworks.

As illustrated by these examples, the current cloud computing landscape consists of a large number of heterogeneous service offerings, ranging from infrastructure-oriented services to specific software-based services. These differences result in application architectures dictated by service-provider-specific features, ultimately resulting in non-portable, vendor-locked applications. Many incidents have repeatedly shown that this is indeed a serious pitfall in adopting the cloud. Two publicly recorded incidents are listed below.

1) The Amazon Elastic Compute Cloud (EC2) became unavailable on April 21st, 2011 for about 12 hours due to a network misconfiguration^3. Many popular startups, including Foursquare, Reddit, and Quora, were unable to function during this period. None of these services were able to restore their functionality until EC2 was fixed.
2) The Microsoft Azure cloud became unavailable for about 3 hours on February 28th, 2012 due to a leap year (February 29th) time calculation bug^4 in the Azure platform software. Microsoft cloud services were not restored until a fix was deployed. Microsoft later issued service credits for all the customers affected by the outage.

1. http://code.google.com/appengine/
2. http://www.windowsazure.com/en-us/
3. http://aws.amazon.com/message/65648/
4. http://goo.gl/tt0Pp

• Ajith Ranabahu, Amit Sheth, and Krishnaprasad Thirunarayan are with the Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435. E-mail: {ajith,amit,tk}@knoesis.org
• E. Michael Maximilien is with IBM Research, 650 Harry Road, San Jose, CA 95120. E-mail: [email protected]
These incidents provide evidence that being locked into a cloud service provider is indeed an important issue to consider. We take a fundamentally different mindset and use an application-oriented perspective, where users focus on describing the application behavior rather than its implementation. This matches the perspective of typical cloud service consumers, who expect a certain functionality from their application and are oblivious to the actual underlying service provisioning mechanism. The service providers, on the other hand, design their interaction patterns, input parameters, and service interfaces taking a utilization perspective. This mismatch is essentially the root cause of the majority of the portability challenges we see today in cloud computing.

In our solution, cloud service consumers use abstract languages to specify their programs. A software infrastructure transforms the user program specifications into the required, provider-specific software components. These transformations are generic and can mostly be automated. If these specifications are kept at a sufficiently high level such that cloud-specific features are not directly exposed, they can be automatically compiled and customized for a variety of cloud environments. This process is described in detail in Section 3.

There are many aspects to consider in achieving this grand vision. In this paper, we focus only on one salient aspect that we consider to be the primary building block of cloud program portability: functional specification abstraction. Functional abstractions provide high-level specifications of the core business logic of a program. Our contributions in this paper are the following:

1) We present a set of fundamental transformational conditions that apply to translating abstract functional specifications to executable cloud programs.
2) We outline the practical impact of these conditions and their applicability in determining the feasibility of an abstraction-driven solution for a given domain.
3) We present the metrics and lessons learned from two successful projects that use abstract specifications to generate cloud applications.

The upcoming sections are organized as follows. We introduce all the background material in Section 2. Next, we provide an overview of the use of abstractions and establish the core area of interest in Section 3. Sections 4 and 5 discuss in detail the theoretical aspects of language transformations and their impact. Section 6 presents an evaluation using two practical applications, followed by Section 7, where we discuss our experience. Finally, we present a discussion (Section 8) and conclude.
2 BACKGROUND
Our approach has its basis in the fundamentals of programming languages. In this section, we briefly cover the necessary details as well as the pertinent background on languages.

2.1 Defining the Abstract Concept
We use a modified version of the definition of an abstract concept provided by Kleppe [1]:

The abstraction level of a concept present in a software language is the amount of detail required to either represent (for data) or execute (for processes) this concept in terms of the perceived (virtual) zero level.

The perceived zero level in this definition refers to the baseline that the abstraction level is measured from. The absolute zero line for a software language is the computer hardware. In other words, every computer program has to be converted to hardware-interpretable machine instructions if it is to be executed. Yet, with the advancement and sophistication of high-level computer languages and their tools, the zero line may be considered to be at a much higher level when constructing programs. This elevated zero line is what is referred to as a virtual zero line [1].

A fitting example can be found in object-oriented programming (OOP). In OOP, all program design happens using objects as the primitive building blocks, thus perceiving a virtual zero line at the level of objects. The objects, defined further in terms of data structures to hold their state and methods to transform their state, will obviously need to be mapped to memory and instructions that can be executed on hardware. However, such transformations can be mechanically and transparently performed by established software frameworks (compiler, linker, libraries), and hence the program designer can conveniently assume objects to be the lowest level of abstraction.

2.2 Language Specification
Language theory states that a language specification (L) requires three elements to be described:

1) An abstract syntax model (ASM): the high-level model of the language, often invisible and used directly inside the language interpreter mechanism. Also known as the conceptual model, the ASM can be represented as a directed, labeled graph.
2) One or more concrete syntax models (CSM): this is generally the syntax seen by the programmers and what is typically referred to as the language. A single ASM may have more than one related CSM.
3) A set of transformations (mappings): a mapping from the ASM to the CSM (defined per CSM) specifying the conversion of the ASM to concrete syntax and vice versa. These transformations are reversible, i.e., a program representation can be transformed losslessly from ASM to CSM and vice versa.

There are three more elements relevant to a language specification:

1) A semantic description: a description of the meaning of the program or model, including a model of the semantic domain.
2) Required language interfaces: a definition of what the programs need from programs written in other languages.
3) Offered language interfaces: a definition of what parts of the programs are available to programs written in other languages.

The semantic description warrants special attention. All language specifications require a semantic specification, yet this semantic specification is hardly ever provided in a formal notation. Instead, it is provided as prose, i.e., a rigorous but informal text for the benefit of programmers at large. Formal semantic specifications are indispensable only for advanced activities such as the construction of compilers and interpreters, program transformation and verification tools, etc. We direct the reader to Kleppe [1] for a thorough coverage of the fundamental language theory concepts.

2.3 Domain Specific Language
Van Deursen et al. [2] state that a domain-specific language (DSL) is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain. A domain in this case is the set of entities and their associated operations, selected specifically to represent and operate on these entities in a restricted context. Domains can be of varying degrees of granularity. Some domains are highly constrained, while others have a much larger scope. For example, matrices would be a more constrained domain, while mathematics is a domain with a much larger scope.

A DSL, although technically a programming language, has an entirely different focus. Hence, some activities considered critical in the construction of a general purpose programming language (GPPL) are not treated with similar importance in a DSL. In the next section, we discuss the concepts of modeling and establish the relationship between a GPPL and a DSL.

2.4 Modeling and Metamodeling
A model can be thought of as an abstraction of a system or its environment or both [3]. Models are represented in many forms, ranging from textual languages to graphical notations. The Unified Modeling Language (UML) [4] is one such modeling language software engineers are familiar with. A metamodel defines the abstractions used by the modeling notation. The metamodel acts as the schema for a model, defining the permissible components and the constraints applicable on the model. Thus, based on conformity, one can create a hierarchy of models [5], as illustrated in Figure 2(a). Specifications higher than meta-metamodels are typically not useful in the context of model-driven software development. Metamodels are important in our research since we consider a DSL to be a representation of a domain model. Hence, the ASM of the DSL is the domain metamodel.
Fig. 1: Standard metamodel for Mathematical expressions
To illustrate the relationship between metamodels and ASMs, consider the simple domain of unary and binary mathematical expressions. The high-level concepts that encapsulate this domain are operator and expression. Expression may further be specialized into unary and binary expressions, and number may also be added as a subclass of expression to support literal values. These concepts, arranged in a graph (Figure 1), define the metamodel for mathematical expressions in a graphical form (see [1] for a detailed version of this example). We have omitted some details for brevity and depict this model as a directed, labeled graph.
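To make the graph form concrete, the following minimal sketch encodes this metamodel as a directed, labeled graph in Ruby (the host language of the DSLs discussed in later sections). The specific vertex and edge labels are our own illustrative choices rather than the exact model of [1]:

# Vertices carry concept labels; edges carry role labels.
Vertex = Struct.new(:id, :label)
Edge   = Struct.new(:from, :to, :label)

vertices = [
  Vertex.new(:expr,   "Expression"),
  Vertex.new(:unary,  "UnaryExpression"),
  Vertex.new(:binary, "BinaryExpression"),
  Vertex.new(:number, "Number"),
  Vertex.new(:op,     "Operator")
]

edges = [
  Edge.new(:unary,  :expr, "isA"),         # specializations of expression
  Edge.new(:binary, :expr, "isA"),
  Edge.new(:number, :expr, "isA"),         # literals are expressions too
  Edge.new(:unary,  :op,   "hasOperator"),
  Edge.new(:binary, :op,   "hasOperator"),
  Edge.new(:binary, :expr, "hasOperand")   # operands are themselves expressions
]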
(a) The typical four layers of modeling (b) The relationship between model, metamodel, and model transformation
Fig. 2: The modeling hierarchy and its relationship to model transformation
Now assume that we want a DSL to describe mathematical expressions. We would need to model the relationships between the various components of mathematical expressions and represent these expressions in a syntax-agnostic manner, i.e., we need to construct an ASM for this language. It is easy to see that the components of this ASM are essentially the same as those of the metamodel. For example, each literal number present in a mathematical expression will need to be represented as an instance of the number type, which is one of the components we defined in the expression metamodel.

2.5 Model Transformations
A model transformation is a mapping from one model to another, defined on the metamodels but operated on the respective metamodel instances. This relationship is illustrated in Figure 2(b). There are multiple methods of model transformation, and graph-based model transformation is only one of them. For our work, we selected graphs as the primary representation of the models and hence limit our focus to graph transformations. Czarnecki et al. [3] provide an exhaustive list of techniques used for model transformations. Also, we do not consider a specific transformation implementation technique, say a rule-based mechanism. Our focus is only on the conditions and special requirements that apply to these transformations, rather than how they are performed.

2.6 Semantics of a DSL: Metamodeling vs. Traditional Semantics

Metamodeling is the preferred way of establishing the semantics of a DSL. This is an alternative to the rigorous semantic representations used in traditional language construction. GPPLs are not confined to a domain and thus require domain-independent specifications to establish their semantics formally. However, DSLs always represent a domain, and the domain metamodel in fact is sufficient to represent the semantics of the language. We direct the reader to Nordstrom [6] for a detailed discussion on the relationship of modeling and languages.
3 ABSTRACTION DRIVEN PORTABILITY
We now outline the use of abstractions in achieving cloud application portability. First we provide an overview of the process based on abstractions and then present a formal definition for a cloud application.

Fig. 3: Using domain driven abstractions to generate executable cloud programs

3.1 Overview of Using Abstractions
The essence of our approach is using an abstract specification, typically in the form of a DSL script, to generate platform-specific but functionally equivalent executable applications. The high-level process is illustrated in Figure 3.
The source program (script) is composed using a DSL, taking a domain perspective. This program is free of any concept specific to a cloud environment. A transformation and code generation engine mechanically converts the DSL script to target-platform-specific code. During this process, the specifics of the target platform remain transparent to the program composer; thus there is no lock-in. When the application needs to be ported to a different target platform, the composer simply reuses the original source program to regenerate a functionally equivalent application for the new target platform, thus achieving application portability.

In reality, the abstractions may not provide complete coverage of all required features. The generated programs can provide generic functionality out-of-the-box by using sensible defaults, but they may not be able to exploit highly specialized features present in the target platform. As an alternative, this mechanism can be used to produce boilerplate code that covers the otherwise mundane work to be done by developers. The generated programs can have well-defined placeholders to support further customizations. A minimal sketch of this generation step follows.
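The sketch below illustrates the idea with Ruby's standard ERB templating: one abstract specification is rendered through per-platform templates to yield functionally equivalent code. The specification fields and template contents are hypothetical and far simpler than the real generators discussed later.

require "erb"

# One abstract specification, free of platform-specific concepts.
spec = { name: "TaskManager", fields: %w[title due_date] }

# Per-platform templates stand in for the code generation engine.
templates = {
  java:  "public class <%= spec[:name] %> {<% spec[:fields].each do |f| %> String <%= f %>;<% end %> }",
  rails: "class <%= spec[:name] %> < ActiveRecord::Base\n  # columns: <%= spec[:fields].join(', ') %>\nend"
}

# Porting to another platform means re-rendering the same specification.
generated = templates.transform_values { |t| ERB.new(t).result(binding) }
puts generated[:rails]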
3.2 Different Aspects of a Cloud Application
Given that a cloud application is a complex composite entity, we first identify what constitutes a cloud application, thereby establishing the different aspects to be modeled. We identified four types of semantic aspects for a cloud application, namely data, functional, non-functional (QoS), and system aspects [7].

1) Data aspects refer to the core data structures and the behaviors of data items in the application.
2) Functional aspects refer to the core logic of the program, i.e., the operations and data manipulations expected from the application.
3) Non-functional aspects refer to the quality of service (QoS) concerns, such as security.
4) System aspects refer to the specific system-level details relevant to the application.

We present an example application to illustrate the differences in these aspects. The application of choice is a two-tier numerical data processing program we call spectra processor (SP). SP is a real application being used as a back-end processing component for an experimental bioinformatics tool. SP consists of a service interface, exposed via HTTP. The core function of SP is performing a set of statistical operations over large amounts of data, which can be uploaded via files or character large objects (CLOBs). SP uses a Hadoop cluster in the back-end; thus, submitting a processing task to SP initiates the following sequence of operations (a toy code sketch of this flow appears at the end of this subsection).

1) Obtain the data via appropriate methods and place it in the Hadoop file system.
2) Start the Hadoop process to perform the requested numerical operations. A unique numeric token, which is to be used to retrieve the results of the computation, is issued to the job submitter at this point.
3) Collect the logs and the data output and place them in a non-distributed file system when the processing concludes.
4) Provide the log output and data output when the user requests them by passing the numeric token.

A high-level architectural overview of SP is illustrated in Figure 4. Each of the aspects is highlighted over the actual application component. The core function expected from SP is numerical data processing; thus, the implementation of this logic constitutes the functional aspects. The representation and storage of the numerical data are considered under the data aspects. QoS capabilities of the interface are considered part of the non-functional aspects, and the system configuration is considered part of the system aspects.

These aspects are orthogonal to each other and thus can be addressed independently at design time. For example, non-functional aspects, such as security and privacy, are layered on top of the functional aspects and can be varied while the other aspects remain unchanged. Similarly, the system aspects can change while the functional, data, or non-functional aspects remain unchanged. Note that the independence of these aspects may not be visible in the implementation. For example, secure access to the service interface in SP is a non-functional consideration at design time, but it requires a system-level configuration change to enable an encrypted connection for the Web server. The same example also provides evidence of the relative independence of these aspects: securing the endpoint does not affect the functional or data aspects at all, even in the implementation. A similar notion of designing high-level details first and using tools to insert the necessary code changes is used by the Aspect Oriented Programming (AOP) [8] community. Although the concept of an aspect in the AOP context is not the same as ours, the success of AOP provides evidence that such separation is usable in practice.
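As a toy illustration of the four-step flow above, the Ruby sketch below models submission, token issuance, and token-based retrieval in memory. The class and method names are hypothetical, and a thread stands in for the Hadoop job; the real SP operates over HTTP against a Hadoop cluster.

class SpectraProcessor
  def initialize
    @jobs = {}
  end

  # Steps 1-2: accept the data, start processing, and issue a numeric token.
  def submit(data)
    token = rand(10**8)
    @jobs[token] = :running
    Thread.new do
      result = data.sum / data.size.to_f                    # placeholder statistical operation
      @jobs[token] = { logs: "completed", output: result }  # steps 3-4: stage results
    end
    token
  end

  # Step 4: the submitter retrieves logs and output with the token.
  def fetch(token)
    @jobs[token]
  end
end

sp = SpectraProcessor.new
token = sp.submit([1.0, 2.0, 3.0])
sleep 0.1
sp.fetch(token)   # => { logs: "completed", output: 2.0 }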
Fig. 4: Four types of aspects, highlighted for the Spectra Processor (SP) application
3.3 Formal Specification for a Cloud Application
Now we present a formal definition of a cloud application, based on the following assumptions.

1) Each semantic aspect can be expressed using a DSL.
2) The abstract syntax model (ASM) of each of these DSLs (i.e., the respective domain metamodels) can be represented using a graph.

We use the graph representation of a language ASM due to its generality and flexibility. Definition 1 presents this formally.

Definition 1. An Abstract Syntax Model (ASM) is a directed, labeled graph $G = (V, E, l_E, l_V)$, where
• $V$ is a set of vertices,
• $E$ is a set of ordered pairs of vertices, called edges,
• $l_E$ is a labeling function defined on and applying to $E$, giving the edge labels,
• $l_V$ is a labeling function defined on and applying to $V$, giving the vertex labels.
The interpretation of the meaning of the vertices, edges, and labels of an ASM graph is dependent upon the domain that is represented. Based on the above assumptions, we establish the following formal specification of a cloud application.

Definition 2. A cloud application CA is represented by the four-tuple $CA = \langle G_{data}, G_{func}, G_{qos}, G_{sys} \rangle$ where
• $G_{data}$ is the ASM graph for the DSL representing data,
• $G_{func}$ is the ASM graph for the DSL representing functional details,
• $G_{qos}$ is the ASM graph for the DSL representing non-functional details,
• $G_{sys}$ is the ASM graph for the DSL representing the system configuration.
Definition 2, in simpler terms, establishes that a cloud application is a collection of four specifications, each describing a different aspect of the application. In practice, one may use a single DSL to describe more than one aspect (say, data and function) and leave out certain aspects entirely, implying the use of defaults.
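A direct encoding of Definitions 1 and 2 can make the formalism tangible; the Ruby sketch below is our illustration, not the authors' implementation.

# An ASM is a directed, labeled graph G = (V, E, l_E, l_V).
ASM = Struct.new(:vertices, :edges, :edge_labels, :vertex_labels) do
  # A usable ASM has no dangling edge endpoints.
  def well_formed?
    edges.all? { |(a, b)| vertices.include?(a) && vertices.include?(b) }
  end
end

# A cloud application is a four-tuple of aspect graphs (Definition 2).
CloudApplication = Struct.new(:g_data, :g_func, :g_qos, :g_sys)

# Leaving an aspect out means falling back to an empty default specification.
EMPTY_ASM = ASM.new([], [], {}, {})
app = CloudApplication.new(EMPTY_ASM, EMPTY_ASM, EMPTY_ASM, EMPTY_ASM)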
3.4 Addressing Cloud Specific Features

Cloud-specific features, such as explicit parallelism, may be modeled as part of different semantic aspects of Definition 2. This also depends on the application. For example, the SP system may use a horizontally replicating configuration for load balancing, providing parallelism at the system level (such horizontal replication is common in IaaS cloud environments). However, this parallelism is neither visible in nor affects any other aspect. Similarly, data storage may be performed on a distributed database, yet the rest of the system may well be completely isolated, and such distribution is transparent at design time.
3.5 Theoretical Aspects of Interest and Motivation
The key component in abstraction-driven application generation is the code generation process that uses model transformations to convert the domain model to a model suitable for a cloud environment. Thus, the theoretical details we are interested in are the properties of the model transformation from the user domain ASMs to cloud ASMs.

We assume that an ASM (and possibly the relevant CSMs) suitable for clouds exists. This is a reasonable assumption for two reasons:

1) Programming languages and models built for distributed environments exist. Two notable examples are Erlang [10] and Chapel [11]. Distributed programming paradigms such as map-reduce [12] can also be considered to supply an ASM, since one can define clear syntactic translations from abstract map-reduce models. Thus, when the cloud of choice offers explicit parallelism, a specification can be used to generate either a program in a distributed programming language or a distributed programming model (which can be translated to a program that runs on a software framework supporting the programming model, say Hadoop).
2) When the distributed nature of the cloud is transparent (as in a platform cloud), a supported general purpose programming language can be used as the target.

To justify the importance of these theoretical investigations, we formulate our objectives as high-level questions. Our goal is to understand the applicability of DSLs in the context of clouds; thus there are three questions we are interested in answering.

1) To what kinds of domains can we apply DSL-based programming abstractions?
2) What savings in effort (and cost) can be achieved when DSL-based abstractions are used?
3) Is it possible to reverse engineer an existing program and create a DSL representation?

In order to answer these questions, one needs to understand the theoretical limitations of the transformations. Thus, we are motivated to investigate the DSL transformations and understand their applicability and limitations in the context of clouds.
4 LANGUAGE TRANSFORMATIONS FOR THE CLOUD

In this section, we investigate the language transformation features in detail by using a symbolic representation. Specifically, we focus on the transformation of the functional language ASM from the user's domain to the cloud environment, i.e., our focus is on the transformation of $G_{func}$, introduced in Definition 2 (Section 3.3). It is possible to address this transformation in isolation and without loss of generality, i.e., some of the requirements relevant to functional specification transformations are also applicable to the other graphs in Definition 2.

We make the following realization that forms the core of our transformation strategy: the transformation from the domain model to a cloud-supported implementation model depends heavily on the details of the domain metamodel. In other words, the domain metamodels must be detailed enough that a meaningful transformation can be made. This realization leads to our primary principle that source metamodel graphs need different vertices for semantically distinct language constructs, regardless of their syntactic representation. This is important since it is typical for ASMs to focus purely on giving an abstraction of the CSM, where syntactically similar constructs are modeled indistinctively.

We use the simple expression metamodel introduced in Figure 1 (Section 2.4) as an example. In this model, the concept operator is represented by a single vertex despite the fact that many semantically different operators may exist. For example, the increment and decrement operators represent completely different tasks but are represented as a single vertex in the typical metamodel since their representations are similar in a concrete syntax.
Fig. 5: Enhanced metamodel for mathematical expressions, showing the sub-expressions
Figure 5 illustrates an enhanced metamodel that defines each operator as a distinct vertex. Usually this level of detail is considered excessive and unnecessary in syntactically driven ASMs, where the difference between operators only becomes a consideration in the internals of compiler construction.

4.1 Requirements on ASM Transformations for Cloud Implementations

Now we state our requirements formally, using requirements and rationales. While these are not as rigorous as theorems and proofs, they can be considered the governing principles.

Consider $G_d$ as the ASM of the domain, $G_c$ as the ASM of the cloud, and $G^{meta}_d$ and $G^{meta}_c$ as the respective metamodels, represented as graphs. Consider the transformation $T_{d-c}$, denoting the source and target as the domain and cloud respectively. Thus, $T_{d-c}$ is defined using $G^{meta}_d$ and $G^{meta}_c$ but applies to $G_d$ and $G_c$ respectively. The relationship between these components is illustrated in Figure 6.

Fig. 6: Relationship between metamodels, models and transformations, represented symbolically

Requirement 1. $G^{meta}_d$ must define distinct vertices for each semantically distinct domain concept.

Requirement 1 states that the source metamodel should have an element for each and every distinct domain concept that may implement a semantically distinct operation.

Rationale 1. Assume that there is a vertex $\lambda^{meta}$ in $G^{meta}_d$ that has two interpretations. Then there exist at least two vertices in $G_d$, say $\lambda_1$ and $\lambda_2$, that comply with $\lambda^{meta}$ but have two interpretations, and hence should map to two different vertices in $G_c$. However, since there is one vertex in $G^{meta}_d$, only one mapping can exist for it in $T_{d-c}$. Thus, trivially, $T_{d-c}$ cannot manage different mappings to $\lambda_1$ and $\lambda_2$ unless they map to two different meta concepts in $G^{meta}_d$.

Requirement 2. $T_{d-c}$ is a surjective (onto) mapping, i.e., all vertices in $G_c$ must be defined by the transformation.

Requirement 2 highlights the fact that the transformation must yield a complete target graph. This does not mean that all vertices in the source graph will be mapped. Rather, the result of the transformation, i.e., the target graph that gets created as a result, should be complete. This is a different way of specifying that the transformation should provide sensible defaults to avoid an incomplete target graph.

Rationale 2. Assume that the mapping is not surjective. Then $G_c$ is missing at least one vertex $\lambda$ needed to complete the model, and thus $G_c$ is incomplete. Then $G_c$ cannot be converted to a working program. Thus, in order to have an executable program, the mapping must be surjective.

As a result of Requirement 2, we can derive Lemma 1.

Lemma 1. $T_{d-c}$ is not reversible.

Lemma 1 states that the transformation is not reversible. This can be trivially rationalized by considering the properties of a general surjective mapping, except for the special case of the mapping being bijective. The usual case in this context is that the target language is almost always at a lower level of abstraction, which makes the bijective case nonexistent (the transformation can be considered bijective when the models being translated are at equal levels of abstraction, enabling a lossless conversion in both directions).
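To ground Requirements 1 and 2, the sketch below (our illustration, not the authors' engine) expresses $T_{d-c}$ as a per-concept mapping table over the enhanced expression metamodel of Figure 5. Because increment and decrement are distinct metamodel vertices, each carries its own rule; a shared operator vertex could not, and any concept without a rule would leave the target graph incomplete.

# T_{d-c} declared per metamodel concept (Requirement 1: one entry per
# semantically distinct vertex), producing target-side instructions.
TRANSFORM = {
  "Number"    => ->(v) { "push #{v[:value]}" },
  "Increment" => ->(v) { "add 1" },
  "Decrement" => ->(v) { "sub 1" }
}

def transform(domain_vertices)
  domain_vertices.map do |v|
    rule = TRANSFORM.fetch(v[:concept]) do
      # Requirement 2: an unmapped concept would make G_c incomplete, so the
      # transformation must either supply a sensible default or fail loudly.
      raise "No mapping for #{v[:concept]}: target model would be incomplete"
    end
    rule.call(v)
  end
end

transform([{ concept: "Number", value: 41 }, { concept: "Increment" }])
# => ["push 41", "add 1"]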
4.2 Addressing Explicit Parallelism

These base requirements can be used to formulate more restricted requirements, applicable in domain-dependent contexts. One such context is the translation to a map-reduce program, which is a common requirement when the parallel nature of infrastructure clouds needs to be exploited. We first introduce the concept of the map-reduce task graph. A map-reduce task graph, sometimes called a physical plan, is a task graph representing the map and reduce task sequences for a given program. For almost all practical cases, a single map and reduce combination is insufficient, and a combination of multiple map and reduce tasks is required. The map-reduce task graph represents this sequence of map and reduce tasks. Task graphs are especially useful when abstractions, such as in Pig Latin, are used to generate map-reduce programs. Pig Latin is a SQL-like DSL targeted towards generating Hadoop-based map-reduce jobs [13]. Figure 7 illustrates a map-reduce task graph generated by the Pig compiler for a statistical operation, sum normalization (an in-memory illustration of such a plan follows Figure 7). When Requirements 1 and 2 are applied to the special case of translating a DSL to a map-reduce program, we derive the following condition, applicable only when transforming a domain model to a map-reduce model.

Fig. 7: An example map-reduce task graph (plan) for a sum normalization operation
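To make the plan in Figure 7 concrete, the following toy Ruby sketch (our own, in-memory illustration; a real plan runs as Hadoop jobs) performs sum normalization as two stages: one map-reduce job that reduces the dataset to its total, and one map-only stage dividing each value by that total.

# A minimal in-memory map-reduce skeleton: map, shuffle (group by key), reduce.
def map_reduce(records, mapper, reducer)
  pairs = records.flat_map { |r| mapper.call(r) }
  pairs.group_by(&:first)
       .map { |key, kvs| reducer.call(key, kvs.map(&:last)) }
end

data = [2.0, 3.0, 5.0]

# Stage 1: a single-key job whose reduce step yields the global sum.
_, total = map_reduce(data,
                      ->(v) { [[:sum, v]] },
                      ->(k, vs) { [k, vs.sum] }).first

# Stage 2: a map-only pass dividing every value by the total.
normalized = data.map { |v| v / total }   # => [0.2, 0.3, 0.5]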
Requirement 3. All vertices of $G^{meta}_d$ must be mappable to the components of a map-reduce task model.

Requirement 3 simply states that all domain concepts should have a translation that maps them to a map-reduce task graph.

4.3 A Practical Example
We present the applicability of Requirement 3 using an example from the Scalable Cloud Application Generator (SCALE) project [14]. SCALE is a DSL-based solution to create statistical workflows for scientific data processing (we use SCALE for our evaluations in Section 6 as well). The DSL is used to describe the nature of the processing job required by the scientists and is then converted to a program of their choice, running on either a cloud or a desktop. A simple example SCALE DSL script is illustrated in Listing 1. The SCALE metamodel and the conceptual model of the script are illustrated in Figure 8. The metamodel of the SCALE DSL is illustrated in Figure 8(a), having unique vertices for semantically distinct operators. The ASM of the illustrated script (complying with the metamodel) is illustrated in Figure 8(b). Note that both these models are illustrated as directed, labeled graphs rather than UML models.

Listing 1: Example SCALE script to sum normalize and auto scale a dataset

# load values
loaded_values = load_file(:raw_values)
# normalize
normalized = sum_normalize(loaded_values)
# scale
scaled = auto_scale(normalized)
# store it back to a file
store_file(:processed_data, scaled)

(a) Partial metamodel for the SCALE DSL (b) Model of the computation (instance of the metamodel), as in Listing 1

Fig. 8: SCALE metamodel and the complying model for the computation

The transformation from the source metamodel to a map-reduce metamodel is illustrated in Figure 9. Note that this translation maps all the operators we modeled to parts of a map-reduce task graph.
Fig. 9: SCALE metamodel transformation
5 IMPACT OF THE TRANSFORMATIONAL CONDITIONS
In this section, we look at the practical impact of the conditions we identified in Section 4.1. In other words, we look at how these conditions determine the answers to the questions we posited in Section 3.5.

5.1 Domain Modeling Requirements
The first impact of these conditions can be seen in the effort required for domain modeling (tacitly assuming that the application specification can indeed be converted to a cloud environment). Requirement 1 requires domain modelers to identify the semantically distinct concepts and incorporate them into the metamodel appropriately. Such a task obviously requires more effort than typically anticipated and adds overhead at the design phase. While such extra work is feasible in restricted domains, it may require considerable effort in domains that have a larger scope, encapsulating a large number of concepts.

One method we suggest is to use the effort tradeoff as a means of determining the suitability of the domain. This requires a preliminary metamodel of the domain. Given the large number of modeling and DSL creation tools, we assume that a preliminary metamodel can be constructed quickly and without significant commitment. We describe a systematic method of determining the effort tradeoff in the next section.

5.2 Determining the Effort Tradeoff
To determine the tradeoff in effort, we introduce a single indicator, R. The purpose of R is to determine the effort tradeoff for a single target platform, assuming the domain has already been modeled. We make the following assumptions.

1) The base code generation framework (parsing, syntax tree generation, etc.) is in place. Thus, the effort required is limited to the creation of templates.
2) The effort required to create the templates can be estimated by the lines of code (LOC) count. LOC does not indicate the complexity of the code. However, it can be measured easily and rapidly, making it suitable for this type of indicator.
3) The LOC of the DSL is always less than the LOC of the generated program.

Given these assumptions, we use the following statistics.

1) $LOC_{templates}$: LOC count of the templates.
2) $LOC_{generated}$: LOC of the generated code.
3) $LOC_{dsl}$: LOC of the DSL script.

We combine these metrics to form R, using Equation 1.

$$R = \left( \ln\frac{LOC_{dsl}}{LOC_{generated}} \cdot \ln\frac{1}{LOC_{templates}} \right)^{-1} \qquad (1)$$
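As a worked example (our sketch; the input values come from Table 3 in Section 6.4), the following Ruby snippet computes R for SCALE targeting Hadoop:

# R from Equation 1: the inverse of the product of two log terms.
def effort_indicator(loc_dsl, loc_generated, loc_templates)
  (Math.log(loc_dsl.to_f / loc_generated) * Math.log(1.0 / loc_templates))**-1
end

# SCALE generating Hadoop code: 3 DSL lines, 1320 generated lines,
# 977 template lines (Table 3).
effort_indicator(3, 1320, 977).round(3)   # => 0.024, as reported in Table 3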
The rationale behind organizing these metrics in Equation 1 is as follows. The ratio between $LOC_{dsl}$ and $LOC_{generated}$ is a direct indication of the saving in effort gained by using the DSL. However, the generation capability of the DSL comes via the templates, and thus the effort required in creating the templates should be discounted from the gain. Arranging these metrics using their logarithms ensures that R is normalized and remains between 0 and 1, given our assumption that $LOC_{dsl} < LOC_{generated}$. For practical cases, lower R values represent better applicability. Higher R values suggest either extensive template code efforts (with respect to the generated code) or the need for lengthy DSL scripts that would offset the advantage gained from introducing a DSL in the first place. The evaluation in Section 6 presents an experiment that calculates the R values.

5.2.1 Limitations in Transforming to Distributed Models

Requirement 3 states that a complete transformation from the domain model to the map-reduce task model should exist in order to implement explicit parallelism. In other words, every domain concept represented in the DSL should have an equivalent operation or a combination of operations in map-reduce. In practice this is hard to achieve, even in a limited domain. The reason for this can be attributed to the DSL feature gap, where all desired features may not be supported by the implemented features. It implies that it is always possible for some desired features not to be covered by the DSL. An example can be found in the context of the SCALE project (presented in detail in Section 6). Some specialized statistical operators used in advanced NMR data processing, such as orthogonal projection on latent structures (OPLS) [15], have no parallel implementations yet (OPLS is an iterative process and it may be possible to convert it to an explicitly parallel version, but it has not been formulated into an explicitly parallel process yet). Thus, the original SCALE language does not support OPLS as an operator, even though it is highly desired.

The impact of this is felt when a desired but unsupported feature becomes essential. In that case, only a portion of the program can be distributed. As in the case of SCALE, the code generators have modifications that allow them to incorporate sequential (non-distributed) code segments into the programs. This is very inefficient (for example, the data has to be pulled from the distributed file system to the local file system, processed, and put back into the distributed file system when a non-distributed operation is interleaved), yet deemed necessary in extreme cases.

5.2.2 Reverse Engineering

Lemma 1 states that it is unreliable (or impossible in most cases) to reverse the transformation. This means that trying to reverse engineer programs and generate a DSL representation is not possible in general. In practice, one may be able to glean a reasonable set of abstractions from a limited set of existing applications. This, however, should not be taken as a general property. The abstractions in the scope of this research are domain focused and can only be converted to an executable program by incorporating significant (often assumed) details. It is simply not possible to take an arbitrary program and convert it to an abstract form.
6 EVALUATION
Now we present two of our research projects and evaluate them based on code metrics.

MobiCloud [16], [17] presents a DSL-driven approach to generate cloud-mobile hybrid (CMH) applications. A CMH application has a cloud-based back-end as well as a mobile-device-based front-end. The current MobiCloud DSL has provisions for data and functional specifications. The QoS (non-functional) and system details are assumed by the generators, although the composer may tweak some of these parameters, either via metadata attributes or an extension mechanism [17]. The MobiCloud composer is available for public use^5.

SCALE, briefly introduced in Section 4.3, is a DSL-driven solution for scientific programs [14]. A DSL is used to describe the nature of the processing job required by the scientists and is then converted to a program of their choice, running on either a cloud or a desktop. The SCALE DSL is deliberately kept simple and tightly bound to the domain of interest since the primary users of this DSL are domain experts rather than cloud programmers. The SCALE composer is also available for public use^6.

5. http://mobicloud.knoesis.org
6. http://metabolink.knoesis.org/SCALE
| Script | Description | DSL LOC (LOC_dsl) | Models | Views | Controllers | Target | Generated LOC_gen | Ratio (LOC_dsl : LOC_gen) |
|---|---|---|---|---|---|---|---|---|
| Task Manager | A task manager application that stores and retrieves tasks | 11 | 1 | 2 | 1 | Android | 541 | 1:49 |
| | | | | | | Blackberry | 329 | 1:30 |
| | | | | | | EC2 | 156 | 1:15 |
| | | | | | | GAE | 346 | 1:31 |
| Shop Manager | An application to keep track of jobs and customers for a mechanics shop | 17 | 2 | 4 | 2 | Android | 1244 | 1:73 |
| | | | | | | Blackberry | 592 | 1:35 |
| | | | | | | EC2 | 628 | 1:37 |
| | | | | | | GAE | 1021 | 1:60 |
| URL Fetcher | Fetches and displays values from a Yahoo Web service | 9 | 1 | 1 | 1 | Android | 486 | 1:54 |
| | | | | | | Blackberry | 100 | 1:11 |
| | | | | | | EC2 | 289 | 1:32 |
| | | | | | | GAE | 466 | 1:52 |
| Salesforce Contacts | Fetches and displays the contact list from a Salesforce account | 9 | 1 | 1 | 1 | Android | 794 | 1:88 |
| | | | | | | Blackberry | 1377 | 1:153 |

TABLE 1: LOC counts of selected MobiCloud generated applications

| Operation | DSL | Azure | Hadoop | Desktop |
|---|---|---|---|---|
| Sum normalization | 3 | 847 | 952 | 88 |
| Auto Scaling | 3 | 848 | 956 | 94 |
| Sum normalize then auto scale | 4 | 938 | 1022 | 106 |

TABLE 2: LOC counts of selected SCALE generated applications
6.1 Experiments

We performed two main categories of experiments.

1) To evaluate the saving of effort, we measured the lines of code (LOC) of selected generated programs against the LOC of the DSL.
2) To evaluate the effort to create the generation mechanism, we measured the LOC of the code generation templates.

The CLOC tool^7 was used to obtain the LOC counts in all the listed experiments. The CLOC tool counts all code segments in a given directory recursively, separating counts for different types of languages (for example, when a project includes resources that use multiple languages, CLOC provides a breakdown of the different code counts, excluding comments and white space). These experiments use the sum of all the code counts. For example, for a generated Android application, the LOC count (listed in Table 1) includes the sum of the Java LOC and the XML LOC, as counted by CLOC. Android projects use an XML-based language for user interface layout, apart from their Java code segments.

LOC is used as the primary metric due to its simplicity and ease of use. LOC depends on the choice of language (scripting languages such as Ruby typically require fewer lines of code than a traditional programming language like Java for the same operation). However, we assume that the relative advantage in effort is minimally affected by the base language used, since the DSL script is in any case much more concise than the equivalent GPPL code. The next two sections present the details of these experiments.

7. http://cloc.sourceforge.net/

6.2 Effort Comparison Using Generated Code
For the code comparison experiment, four MobiCloud programs were selected. Two of the programs use the extension capabilities, while the other two are based on the base language (see [17] for a discussion of the difference between the base language and the extended language of MobiCloud). Table 1 outlines the types of the selected applications and their code statistics. The same statistics are presented as a graph in Figure 10(a).

To compare the effort saving in SCALE, three programs that represent either one or a combination of two operators were selected. The first two programs simply loaded a dataset, performed a single operation, and wrote the results to a (distributed) file. The third program used a sequence of the two operations before the result was written to the file. The SCALE code statistics are presented in Table 2. A graph presentation of the same data is available in Figure 10(b).
(a) MobiCloud LOC Comparison (b) SCALE LOC Comparison

Fig. 10: LOC statistics of MobiCloud and SCALE generated applications

| Target Platform | Template LOC | Resources LOC |
|---|---|---|
| Android | 1162 | 0 |
| EC2 | 519 | 10 |
| BlackBerry | 380 | 0 |
| GAE | 789 | 24 |

(a) MobiCloud Template LOC

| Target Platform | Template LOC | Resources LOC |
|---|---|---|
| Azure | 454 | 736 |
| Hadoop | 434 | 886 |
| Ruby | 133 | 90 |

(b) SCALE Template LOC

(c) MobiCloud Template LOC Comparison (d) SCALE Template LOC Comparison

Fig. 11: LOC statistics of MobiCloud and SCALE Generators
6.2.1 Discussion

There are varying degrees of savings in terms of code creation effort, which depend heavily on the target platform (when the effort is considered to be a matter of writing the code). The saving in effort is not uniform (it ranges from 10 to 153 times in our experiments) because the cloud platforms provide programming primitives of different granularities. In reality, three other types of effort also enter program creation.

1) Effort in algorithm conversion: this is more prominent in instances where explicit parallelism is required, such as in the case of SCALE. There is significant effort saved by using the generators to convert the algorithms to map-reduce versions, which is not reflected in the LOC counts.
2) Effort in code organization: some platforms require the software artifacts to be organized in a specific way. For example, GAE projects require a specific code organization that is expected by the GAE deployment tools. The effort saved in code organization is not reflected in the above statistics.
3) Effort in debugging API incompatibilities: a significant debugging effort is saved in some cases where the client and the server are both generated from the same specification. Often, the remote communications are the most error-prone segments in a distributed programming environment. This saved effort is not indicated in the LOC comparison.

Despite the inability of the LOC metric to capture the code translation burden and quality, the LOC count enables a rough assessment of the relative effort required to create a program. This is sufficient for us to rapidly calculate the effort savings; in other words, it is sufficient for us to find a lower-bound assessment.

6.3 Effort Comparison for Templates
The objective of this evaluation is to quantify the effort of actually creating the code generation mechanism. Both MobiCloud and SCALE are based on the same code generation engine; the only difference is the parser and the set of code templates. Thus, in this experiment, the code templates are analyzed, assuming the LOC counts of the templates are indicative of the effort to create them. Figure 11 presents these statistics. Table 11(a) and Table 11(b) (Charts 11(c) and 11(d), respectively) include the LOC counts of templates and resources for MobiCloud and SCALE, respectively. The counts under resources indicate static code files that get placed in the generated code without modification. Resources are especially important when significant code can be inserted without modification. For example, in SCALE, most of the relevant mapper and reducer implementations are inserted without modification and wired together using a dynamically generated controller class.

6.3.1 Discussion

Similar to the case of the generated applications, the effort in algorithm conversion and the effort in template conversion are not reflected in these numbers. Additionally, the effort in creating the parser is not taken into account in this experiment. This does not affect the accuracy of our experiments since both DSLs are subsets of Ruby (i.e., internal DSLs that use Ruby as the host language). Thus, the existing Ruby parser is reused and no explicit effort was taken to construct a parser. In cases where external DSLs are used, the parser would also contribute to part of the development effort. It is uncommon, however, to construct a parser from scratch. For most cases, one can use a parser generator tool; thus the effort to create a parser for the DSL is negligible in most cases.

6.4 R Value Comparison

Using Equation 1, we calculate the R values for a selected set of MobiCloud and SCALE generated applications. These figures are presented in Table 3.

| | LOC_gen | LOC_dsl | LOC_tpl | R |
|---|---|---|---|---|
| MobiCloud (Android) | 1162 | 12 | 766 | 0.034 |
| MobiCloud (GAE) | 813 | 12 | 802 | 0.036 |
| SCALE (Azure) | 1190 | 3 | 878 | 0.025 |
| SCALE (Hadoop) | 1320 | 3 | 977 | 0.024 |
| SCALE (Ruby) | 223 | 3 | 96 | 0.053 |

TABLE 3: R value calculations
The R values are relatively small, indicating that the selected targets are a good fit for this approach. It is noticeable that the Ruby target has a higher R, indicating that the benefit of using the DSL is not as great as in the other cases. Indeed, the Ruby versions of the programs were the smallest (in terms of LOC) implementations of the generated programs. Since R is a relative measure, there is no hard cutoff; it can be used at the discretion of the program developer to decide whether their own goals are met. There are other factors that are not reflected in R (since R is based on just LOC), thus it should not be used as the only decision factor.
7 PRACTICAL EXPERIENCE AND LESSONS LEARNED
In this section, we briefly present our experience in the research projects discussed in Section 6 and some of the lessons learned. These lessons are more practical in nature (i.e. they are not based on carefully designed objective experiments) and stem from the numerous discussions and interactions we had with practitioners. They outline the applicability of our suggested development process and highlight some of the important considerations in practice.
7.1 Experience from MobiCloud and SCALE
The most notable lessons we learned from the MobiCloud and SCALE projects are as follows:

1) Language and data transformation via a DSL into multiple platforms is practical, as long as the scope of the application domain is managed. In other words, the generated programs are functional but are not able to exploit every unique feature of a target platform. Adding special constructs to the DSL to exploit such features tends to take away the simplicity of the DSL; thus, good control of the domain scope is essential. In the case of MobiCloud, the DSL is deliberately kept simple to avoid contaminating the core MVC structure.
2) It is possible to intertwine the functional, non-functional, data, and system considerations in a single script, in a manner that is natural to the domain. Such a composition helps the application author define a single script with all the necessary details. Although designers may be tempted to use specialized DSLs to define the different aspects of the program, this produces a difficult learning process, defeating the purpose of the DSL. The MobiCloud DSL is produced as a single coherent language covering the data and functional aspects, which made it acceptable to many amateurs.
3) For domains driven primarily by domain experts, providing tools to run a complete process (develop, deploy, and monitor) is very important. Program generation alone is not useful to these domain scientists unless there is an associated mechanism and tooling that lets them deploy and monitor these applications. This was clearly observed in the MobiCloud project, where acceptance increased after automatic deployment tools were introduced. The tooling requirements are further discussed in Section 7.2.

7.2 Other Considerations in Practice
While the above-mentioned facts highlight specific considerations in the highlighted projects, the following are some of the more general considerations we noted during the use of DSLs in cloud program generation.

User Perception. Even though abstractions implemented via a DSL introduce a streamlined development life-cycle, user (developer) perception of the language plays an important role in adoption. For example, MobiCloud is targeted at the developer community, and the introduction of the DSL-driven technique was not met with enthusiasm. A survey identified that this is mostly due to the apparent inability of the DSL to exploit certain platform-specific features, in the mobile platforms as well as the cloud platforms. The underlying reason seems to be the perceived loss of fine-grained control due to the introduction of the abstractions, although the time saved by using MobiCloud was obvious and the developers could tweak the generated code if they wished, much faster than building it from scratch. Hence, user perception plays an important role in introducing this type of abstraction-driven programming paradigm. The polarity of the perception changes with the target user community and should be addressed accordingly.

Tools. Tools play an absolutely critical role in an abstraction-driven development process. Properly featured tools are essential to mitigate the learning curve of a DSL, even when the textual form of the DSL is fairly simple, featuring a much gentler learning curve than a typical programming language. Both in MobiCloud and SCALE, the primary composer is graphical, i.e., one can drag and drop graphical symbols onto a canvas to compose the respective program, reducing the learning effort. The graphical composers arose from necessity, since many users, especially non-programmers, were not interested in learning a new programming language, even when it was only a matter of a few hours. Also, tool support for other actions, such as cloud deployments, is expected by many users. The very first demonstration of the MobiCloud graphical composer was not well received since it did not support automatic deployments. The subsequent releases that had deployment support were received well by users, and the utility of MobiCloud could be demonstrated clearly.
8 DISCUSSION
Although an abstraction driven methodology has significant advantages, there are certain factors that need more exploration for a comprehensive solution. We discuss three important considerations in this regard. 8.1
8.1 Data Management and Migration
We have so far deliberately omitted application data management considerations. In many instances, the accumulated data is considered an important asset, and significant effort is spent on porting the data to the new platform. Although not explicitly discussed, our abstraction-driven, top-down approach forces user programs to organize their data around a high-level model, providing a methodical way to migrate data. The high-level model is translated to the platform-specific logical schema via a known translation process; thus, it is possible to mechanically generate a transformation that ports data from one platform to another using the higher-level model as an intermediary. Note that we tacitly assumed that the data models have equal expressive power and that the transformations can be performed losslessly. Investigating data model
compatibility and the applicable limitations is a non-trivial task and outside the scope of the current work.
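As a rough illustration of this round-trip idea, the following is a minimal sketch assuming two hypothetical platform record formats and a lossless mapping; the function names and formats are our own assumptions and are not part of MobiCloud or SCALE.

    # Porting data between two platforms through a high-level model.
    def platform_a_to_model(row: dict) -> dict:
        # Lift platform A's flat row into the high-level model.
        return {"entity": "Task",
                "attributes": {"title": row["task_title"], "due": row["due_date"]}}

    def model_to_platform_b(item: dict) -> dict:
        # Lower the high-level model into platform B's document format.
        attrs = item["attributes"]
        return {"kind": item["entity"],
                "props": {"name": attrs["title"], "deadline": attrs["due"]}}

    def migrate(rows: list) -> list:
        # Both translations target the same intermediary model, so their
        # composition is a mechanical A-to-B migration.
        return [model_to_platform_b(platform_a_to_model(r)) for r in rows]

    print(migrate([{"task_title": "file report", "due_date": "2012-03-01"}]))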
8.2 Deployment and Management of Applications
Application deployment (placing the application in a cloud) and management (updating configurations, taking backups, etc.) are important to the application's life-cycle. The programming abstractions become highly useful only when abstractions are also provided over the deployment and management process. For the sake of completeness, we mention our related research here. We considered the use of a middleware layer to provide abstractions over application deployment and management. This has been successfully demonstrated in the IBM Altocumulus research project [18]. Altocumulus allows users to deploy compatible applications to Amazon EC2, Google App Engine, and IBM HiPODS, an IBM private cloud offering, using a uniform user interface. The procedural differences between the cloud deployment processes are made transparent to the users via the middleware layer. The success of this strategy has been highlighted by its influence on a new IBM product, the IBM Workload Deployer, part of the IBM PureSystems private cloud solution.
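The essence of such a middleware layer is an adapter pattern: a single deployment interface with provider-specific implementations behind it. The following is a minimal sketch under that assumption; it is not the Altocumulus API, and the class and method names are illustrative.

    # Uniform deployment interface over heterogeneous clouds (adapter pattern).
    from abc import ABC, abstractmethod

    class CloudAdapter(ABC):
        @abstractmethod
        def deploy(self, bundle: str) -> str:
            ...

    class EC2Adapter(CloudAdapter):
        def deploy(self, bundle: str) -> str:
            # A real adapter would provision instances and push the bundle.
            return f"ec2://instance/{bundle}"

    class GAEAdapter(CloudAdapter):
        def deploy(self, bundle: str) -> str:
            # A real adapter would invoke the platform's upload tooling.
            return f"gae://app/{bundle}"

    def deploy_anywhere(adapter: CloudAdapter, bundle: str) -> str:
        # Callers never see the provider-specific deployment procedure.
        return adapter.deploy(bundle)

    print(deploy_anywhere(EC2Adapter(), "taskmanager-1.0"))
    print(deploy_anywhere(GAEAdapter(), "taskmanager-1.0"))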
8.3 Addressing Non-functional Aspects
In real-world applications, non-functional aspects are considered extremely important, sometimes as much as the core functionality of the application itself. For example, security and privacy are considered paramount in a number of industries (financial, online retail, etc.), and significant effort is spent on hardening and verifying application security. We have not focused on these issues, though the use of abstractions provides a clear way to incorporate such non-functional capabilities into the generated applications. The other aspects, as discussed in Section 3.2, can be incorporated into the DSL, either by embedding fragments of other DSLs or by extending the DSL itself. It is entirely possible for the code generators to insert QoS-related code segments. One example in this regard is the secure metadata attribute in MobiCloud. Setting the secure attribute to true generates code that forces all communications to happen over the HTTPS protocol. Although this requires changes across a number of components (client libraries, server configurations, and service interfaces), it is a matter of setting one attribute in the DSL script. We have discussed at length how such modifications can be made to MobiCloud via an extension mechanism [17]. Such a mechanism would also be applicable to other domains that use DSL-based solutions.
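As a rough illustration of how a single declarative attribute can fan out into several generated artifacts, the sketch below shows a generator branching on a secure flag. It is a simplification of what a MobiCloud-style generator might emit; the emitted snippets and names are assumptions made for illustration.

    # One DSL attribute (secure) fanning out into multiple generated artifacts.
    def generate_artifacts(model_name: str, secure: bool) -> dict:
        scheme = "https" if secure else "http"
        return {
            # Client library: base URL the generated client will call.
            "client": f"BASE_URL = '{scheme}://example.com/{model_name.lower()}'",
            # Server configuration: require TLS when secure is set.
            "server_conf": f"force_ssl = {str(secure).lower()}",
            # Service interface: advertised endpoint scheme.
            "interface": f"endpoint scheme: {scheme}",
        }

    for artifact, content in generate_artifacts("Task", secure=True).items():
        print(artifact, "->", content)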
9 CONCLUSION
We have explored the use of abstractions to support cloud programming. The driving principle is that the
use of clouds should take a user-driven perspective, rather than a provider-driven perspective. We investigated the use of DSLs to provide user-oriented abstractions for cloud programs and contributed a set of conditions on domain metamodel transformations applicable in this context. Based on these conditions, we devised an indicator, R, as a litmus test to determine the applicability of the abstraction-driven methodology to a given domain. Our experiments in two disparate domains indicate that DSL-based solutions are indeed applicable and provide a manageable way to generate programs targeted at the cloud. While the solution space is limited to domains of interest, clouds are increasingly being used for domain-driven processing tasks, and hence domain-based solutions are of great interest. In summary, we conclude that using abstractions via DSLs is an effective and feasible method to provide a uniform programming methodology for clouds.
REFERENCES
[1] A. Kleppe, Software Language Engineering: Creating Domain-Specific Languages Using Metamodels. Addison-Wesley Professional, 2009.
[2] A. van Deursen, P. Klint, and J. Visser, "Domain-specific languages: An annotated bibliography," SIGPLAN Not., vol. 35, no. 6, pp. 26–36, 2000.
[3] K. Czarnecki and S. Helsen, "Feature-based survey of model transformation approaches," IBM Systems Journal, vol. 45, no. 3, pp. 621–645, 2006. [Online]. Available: http://bit.ly/w6fD4S
[4] G. Booch, J. Rumbaugh, and I. Jacobson, The Unified Modeling Language User Guide (Addison-Wesley Object Technology Series). Addison-Wesley Professional, 2005.
[5] J. Sprinkle, A. Ledeczi, G. Karsai, and G. Nordstrom, "The new metamodeling generation," in Proceedings of the Eighth Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS 2001). IEEE Computer Society, pp. 275–279. [Online]. Available: http://bit.ly/wJLT6G
[6] G. Nordstrom, "Metamodeling - Rapid design and evolution of domain-specific modeling environments," Ph.D. dissertation, Vanderbilt University, 1999. [Online]. Available: http://bit.ly/w09iDU
[7] A. Sheth and A. Ranabahu, "Semantic modeling for cloud computing, Part 1," IEEE Internet Computing, vol. 14, no. 3, pp. 81–83, May 2010. [Online]. Available: http://bit.ly/yGTv6D
[8] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin, "Aspect-oriented programming," in ECOOP'97 - Object-Oriented Programming, 1997, pp. 220–242.
[9] R. Laddad, AspectJ in Action: Practical Aspect-Oriented Programming. Manning, 2003, vol. 512.
[10] J. Armstrong, R. Virding, C. Wikström, and M. Williams, Concurrent Programming in ERLANG. Prentice Hall, 1996, vol. 2.
[11] B. L. Chamberlain, D. Callahan, and H. P. Zima, "Parallel programmability and the Chapel language," International Journal of High Performance Computing Applications, vol. 21, no. 3, pp. 291–312, 2007.
[12] S. Ghemawat and J. Dean, "MapReduce: Simplified data processing on large clusters," in Symposium on Operating System Design and Implementation (OSDI '04), San Francisco, CA, USA, 2004.
[13] C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, "Pig Latin: A not-so-foreign language for data processing," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008, pp. 1099–1110.
[14] A. Ranabahu, P. Anderson, and A. Sheth, "The cloud agnostic e-Science analysis platform," IEEE Internet Computing, vol. 15, no. 6, pp. 85–89, Nov. 2011. [Online]. Available: http://bit.ly/HdAIqP
[15] P. Anderson, "Algorithmic techniques employed in the quantification and characterization of nuclear magnetic resonance spectroscopic data," Ph.D. dissertation, Wright State University, 2010.
[16] A. Manjunatha, A. Ranabahu, A. Sheth, and K. Thirunarayan, "Power of clouds in your pocket: An efficient approach for cloud mobile hybrid application development," in 2010 IEEE Second International Conference on Cloud Computing Technology and Science, 2010, pp. 496–503. [Online]. Available: http://bit.ly/zW2s4u
[17] A. Ranabahu, E. M. Maximilien, A. P. Sheth, and K. Thirunarayan, "A domain specific language for enterprise grade cloud-mobile hybrid applications," in 11th Workshop on Domain-Specific Modeling, 2011. [Online]. Available: http://bit.ly/ACKAzS
[18] E. Maximilien, A. Ranabahu, and K. Gomadam, "An online platform for web APIs and service mashups," IEEE Internet Computing, vol. 12, no. 5, pp. 32–43, 2008.
Ajith Ranabahu is an engineer with Amazon Web Services and earned his PhD in computer science at the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) at Wright State University. His primary research is focused on application and data portability in cloud computing. Contact him at
[email protected].
E. Michael Maximilien is a research staff member at IBM Research. He is active in a number of technical communities inside and outside of IBM. He is keenly interested in languages, systems, methods, practices, and techniques that make web computing easier and help make the web a trustable, social, and programmable platform and substrate for businesses and individuals. Contact him at
[email protected].
Amit Sheth is the LexisNexis Ohio Eminent Scholar and the director of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) at Wright State University. His research interests are in Web 3.0, including the Semantic Web, semantics-empowered Social Web, Sensor Web/Web of Things, mobile computing, and cloud computing. Contact him at
[email protected].
Krishnaprasad Thirunarayan is a Professor in the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) at Wright State University. His research interests are in semantic social and sensor data analytics, Web 3.0, information retrieval/extraction, and semantics of trust. Contact him at
[email protected].