A Cloud Infrastructure Service Recommendation System for Optimizing Real-time QoS Provisioning Constraints
arXiv:1504.01828v1 [cs.DC] 8 Apr 2015
Miranda Zhang, Rajiv Ranjan, Michael Menzel, Surya Nepal, Peter Strazdins and Lizhe Wang, Senior Member, IEEE
Abstract—The proliferation of cloud computing has revolutionized the hosting and delivery of Internet-based application services. However, with new cloud services and capabilities launched almost every month by both big (e.g., Amazon Web Services, Microsoft Azure) and small companies (e.g., Rackspace, Ninefold), decision makers (e.g., application developers, CIOs) are likely to be overwhelmed by the choices available. The decision-making problem is further complicated by heterogeneous service configurations and application provisioning Quality of Service (QoS) constraints. To address this hard challenge, in our previous work we developed a semi-automated, extensible, and ontology-based approach to infrastructure service discovery and selection based only on design-time constraints (e.g., renting cost, datacentre location, service features). In this paper, we extend our approach to include real-time (run-time) QoS (end-to-end message latency, end-to-end message throughput) in the decision-making process. Hosting next-generation applications in the domains of online interactive gaming, large-scale sensor analytics, and real-time mobile applications on cloud services necessitates optimizing such real-time QoS constraints to meet Service Level Agreements (SLAs). To this end, we present a real-time QoS-aware multi-criteria decision-making technique that builds on the well-known Analytic Hierarchy Process (AHP) method. The proposed technique is applicable to selecting Infrastructure as a Service (IaaS) cloud offers, and it allows users to define multiple design-time and real-time QoS constraints or requirements. These requirements are then matched against our knowledge base to compute the best-fit combinations of cloud services at the IaaS layer. We conducted extensive experiments to demonstrate the feasibility of our approach.
Index Terms—Decision support, Optimization, Service Selection, Web-based services
I. INTRODUCTION
In the cloud computing model, users access services according to their requirements, without needing to know where the services are hosted or how they are delivered. An increasing number of IT vendors (Amazon, GoGrid and Rackspace) are promising to offer applications, storage and computation resources as cloud hosting services. As a result, a large number of competing services are available for users [1] to choose from. Naturally, it is challenging for users to select the right services that meet their QoS requirements across the service cycle from selection and deployment to orchestration (e.g., determining the optimal web service when making a service selection, identifying suitable virtual machine servers for deploying web service instances, etc.) [2]. Effective service recommendation techniques are becoming important to help users (including developers) in their decision-making processes for critical application development and deployment [3]. Such applications include interactive games, real-time social networks, data analytics, scientific computing, business, Internet of Things (IoT) and other mobile applications, as discussed next. All these applications have different needs and requirements.

Miranda Zhang and Peter Strazdins are with the Australian National University. Rajiv Ranjan and Surya Nepal are with the CSIRO, Australia. Michael Menzel is with Karlsruhe Institute of Technology, Germany. Lizhe Wang is with the School of Information Science and Engineering, Yanshan University, Qinhuangdao, China. Corresponding author: Lizhe Wang; Email: [email protected]

A. Motivation
We next provide a few examples of different types of applications that need to cater for real-time QoS requirements during their deployment lifecycle.
Interactive Online Games: In the gaming industry, World of Warcraft counts over six million unique players on a daily basis. The operating infrastructure of this Massively Multiplayer Online Role Playing Game (MMORPG) comprises more than 10,000 computers [4]. Depending on the game, typical response times to ensure fluent play must remain below 100 milliseconds in online First Person Shooter (FPS) action games [5] and below 1-2 seconds for Role-Playing Games (RPGs). A good game experience is critical for keeping players engaged, and has an immediate consequence on the earnings and popularity of the game operators. Failing to deliver timely simulation updates leads to a degraded game experience and triggers player departure and account closures [6]. A startup gaming company with no existing infrastructure could launch a new game using public cloud infrastructure, as cloud services offer the flexibility to scale on demand with no upfront investment. Using cloud services, the game application services can be dynamically allocated or de-allocated according to demand fluctuations.
Game companies can also better serve diverse international users through the global presence of data centers owned by cloud providers.
Real-time Mobile Applications: There is an explosion of (primarily mobile-based) communication apps. For example, WhatsApp, acquired by Facebook, has 450 million users [7]; Viber, acquired by Rakuten, has 200 million users [8]; and WeChat, a Chinese rival, has 270 million users [9]. For these apps, low latency (a QoS constraint) is very important for the real-time collaboration experience. For example, video conferencing has a limit of about 200 to 250 milliseconds
IEEE SYSTEMS JOURNAL, VOL. X, NO. X, XXX 2014
delay for a conversation to appear natural [10]. These apps have requirements similar to the game apps: they require a large number of servers to support millions of users and need optimization of latency, speed and throughput. It is worth mentioning that even for a generic web application, experiments that delayed page loads in increments of 100 milliseconds found that even very small delays result in substantial and costly drops in revenue [10].
Big Data, IoT (Internet of Things) and eScience: We are closing in on the transfer of a zettabyte of data annually [11], resulting from internet search, social media, business transactions, and content distribution. Similarly, scientific disciplines increasingly produce, process, and visualize data sets gathered from sensors [12]. If the prediction holds true, the Square Kilometer Array (SKA) radio telescopes will transmit 400,000 petabytes (∼400 exabytes) per month, or a whopping 155.7 terabytes per second [13]. Furthermore, the European Space Agency (ESA) will launch several satellites in the next few years [14] that will collect data about the environment, such as air temperatures and soil conditions, and stream that data back in real time for analysis. Similarly, in the finance industry, the New York Stock Exchange creates 1 terabyte of market and reference data per day covering the use and exchange of financial instruments, while Twitter feeds generate 8 terabytes of social-interaction data per day [15]. Such "data explosions" have led to research issues such as how to effectively and optimally manage and analyze such large amounts of data. The issue is also known as the "Big Data" problem [16], which is defined as the practice of collecting complex data sets so large that they become difficult to analyze and interpret manually or using on-hand data management applications (e.g., Microsoft Excel).
Both storing and analyzing the data require massive amounts of storage capacity and processing power. Companies and institutions may therefore want to offload the complexity of managing hardware infrastructure to cloud providers who specialize in it, while also eliminating the need to wait for facilities to be built.
Other: Apart from the above scenarios, there are many more cases in which our proposed solution would be useful. A stock investor, individual or firm, may want to test a new strategy for monitoring and analyzing data that automatically triggers an alert when a certain price pattern or keyword is identified in the source data; this may require a lot of compute resources periodically. System administrators and developers may need many simulated clients from all around the world for website load testing before an official release. A bitcoin [17] (or other similar cryptocurrency [18]) miner may decide to invest in additional mining resources when the price of the currency is high, and stop mining when the profit no longer justifies the expense.
B. The Problem
While the elastic nature of cloud services makes them suitable for provisioning the aforementioned applications, the heterogeneity of cloud service configurations and their distributed nature raises serious technical challenges. In particular, we deal with the following research problems:
Selecting Optimal Service Configuration: The cloud computing landscape is evolving, with multiple and diverse options for compute (also known as virtual machine) and storage services. Hence, application owners face a daunting task when trying to select cloud services that can meet their constraints. According to Burstorm [19], there are over 426 compute and storage service providers with deployments in over 11,072 locations. Even within a particular provider there are different variations of the services. For example, Amazon Web Services (AWS) has 674 different offerings differentiated by price, QoS features and location [1]. Moreover, every quarter providers add about four new services, change business models (price and terms) and sometimes even add new locations. To select the best mix of service offerings from this abundance of possibilities, application owners must simultaneously consider and optimize complex dependencies and heterogeneous sets of criteria (price, features, location, QoS, etc.). For instance, it is not enough to select the optimal cloud storage service alone; corresponding computing capabilities are essential to guarantee that one can process the data as fast as possible while minimizing cost.
Incorporating Network QoS-awareness in the Service Selection Process: As cloud data centers are distributed across the Internet, the network QoS (data transfer latency) varies. This variation depends on the location of the data center and the location of the input data stream. Current approaches do not differentiate between the QoS of compute and storage services and the QoS of the wide area network that interconnects input data stream sources to cloud data centers. This raises a research question: how do we optimize the process of choosing the best compute and storage services, which are not only optimal in terms of price, availability and processing speed but also offer good network QoS (e.g., network throughput and response delivery latency)?
C.
Our Contributions
We propose a new technique that aids network QoS-aware selection of cloud services for provisioning mobile (or any device with internet access but limited processing capability and storage), real-time and interactive applications. We build upon our previous work [3], where we developed an automated approach along with a unified domain model capable of fully describing infrastructure services in cloud computing [20] [21]. While our previous approach supports simple cloud infrastructure service selection based on declarative Structured Query Language (SQL), it does not take into account real-time, variable network QoS constraints. Furthermore, a declarative SQL-based selection approach only allows users to compare and select a cloud service based on a single criterion (e.g., total cost, maximum size limit for storage, memory size for a compute instance). In other words, our previous approach was not capable of supporting a utility function that combines multiple selection criteria pertaining to storage, compute, and network services. In this paper, we make the following concrete contributions:
1. Problem Formulation. We provide a clear formulation of the research problem by identifying the most important cloud
TABLE I
A BRIEF COMPARISON OF THE CLOUDRECOMMENDER WITH OTHER EXISTING SOLUTIONS

Feature                          Broker@Cloud  Yuruware  CloudHarmony  Cloudorado  CloudBroker  CloudRecommender
QoS Benchmark                    —∗            No        Adjustable    No          Adjustable   Fixed
Single-Criteria Comparison       —∗            No        No            Yes         Yes          Yes
Aggregate Ranking & Comparison   —∗            No        No            No          No           Yes
Cloud Management                 —∗            Yes       No            No          No           No

∗ No evidence on the progress of the project.
service selection criteria relevant to specific real-time QoS-driven applications, selection objectives, and cloud service alternatives.
2. Multi-criteria QoS Optimization. We adopt and implement an Analytic Hierarchy Process (AHP) based decision-making (service selection) technique that handles multiple quantitative (i.e., numeric) as well as qualitative (descriptive, non-numeric, e.g., location, CPU architecture: 32 or 64 bit, operating system) QoS criteria. AHP determines the relative importance of criteria to each user by conducting pair-wise comparisons.
3. Network-aware QoS Computation. We implement a generic service that helps in collecting network QoS values from different points on the Internet (modeling big data source locations) to the cloud data centers.
The paper is structured as follows. In Section II, we survey the state of the art in Cloud Service Selection and Comparison (CSSC) techniques. We also highlight their significant limitations and their relationship to and dependency on prior concepts from other fields in computing. In Section III, we present the extension we made to our previously proposed decision-making framework. We also explain the benefits of applying AHP and the importance of considering QoS. In Section IV, we present evaluations (conducted in a real-world context) of the proposed decision support tool and techniques, which automate the mapping of users' specified application requirements to specific cloud service configurations. In Section V, we conclude and point out open research questions and future directions in this increasingly important area.
II. BACKGROUND AND RELATED WORK
Though branded calculators are available from individual cloud providers, such as Amazon [22] and Azure [23], for calculating service leasing cost, it is not easy for users to generalize their requirements to fit different service offers (with various quotas and limitations), let alone to compute and compare costs.
A number of research [24] and commercial projects (mostly in their early stages) provide simple cost calculation or benchmarking and status monitoring, but none is capable of consolidating all aspects and providing a comprehensive ranking of infrastructure services. For instance, CloudHarmony [25] provides up-to-date benchmark results without considering cost, while Cloudorado [26] calculates the price of IaaS-level CPU services based on static features (e.g., processor type, processor speed, I/O capacity) while ignoring dynamic QoS features (e.g., latency, throughput). Yuruware
[27] used to provide a Compare service during its beta in 2012 (now removed or integrated into another service). Although they aim to provide an integrated tool with monitoring and deployment capabilities, it is still under development. Another similar system is Swinburne University's Smart Cloud Broker Service [28]; from the screencast they released, we can tell that their benchmarking is done in real time, which means users have to wait for the results to come back. We considered this kind of design but decided to collect the benchmarking results beforehand, because this way, no matter how many cloud providers users want to compare, they can still get the result with minimum (or no) waiting time. Another reason for this choice is that, at any particular point in time, a network benchmark result is not conclusive, as performance fluctuates over time, so we use an aggregated average, which is a more reliable overall indication. To further distinguish ourselves from others, we offer the following two innovative features when ranking, selecting, and comparing vendor services: 1) we allow users to choose to include QoS requirements during comparison; 2) when users want to take into account mixed qualitative (e.g., hosting region, operating system type) and quantitative criteria, we apply the Analytic Hierarchy Process (AHP) to aggregate numerical measurements and non-numerical evaluations. Results are personalized according to each user's preferences, because AHP takes users' perceived relative importance of criteria (pair-wise comparisons) as input. Table I shows a brief comparison of the CloudRecommender with the other existing products mentioned previously. We should clarify that we are most interested in the first three features. Yuruware had claimed to have comparison features in the past, but removed them later.
Menzel and Ranjan [29] introduced a framework called "CloudGenius" that supports the decision-making process of web server migration into the cloud. Our system supplements and partially extends their work. While "CloudGenius" focuses on Virtual Machine (VM) selection, meaning it considers software requirements (i.e., operating system version, supported languages), our study focuses more on hardware requirements (i.e., size of memory and hard disk). Although we borrowed the idea of using AHP (with simplification) for rank calculation from "CloudGenius", we used it differently: we applied the method in our declarative program, which mainly handles data and calculation with a database and SQL. That means it may be easier to scale out the solution using Hive [30] with minimal change, as opposed to rewriting the Java
code to fit the MapReduce framework [31]. Queuing theory is one of the most studied methods for QoS modeling and control from the infrastructure system administrator's perspective [32], but our case is different because we have no control over the infrastructure. Since we can only measure the QoS, we collected the statistics using the "speedtest" service provided by CloudHarmony, due to its easy adoption and ever-evolving nature. Klein et al. [33] proposed a highly theoretical model based on Euclidean distance for estimating latency, which we believe omits too many details to be practically accurate. However, we can use this model to estimate latency when QoS data is not available for a new client location. There are methods proposed for network-aware service composition [34] [35] [36] that consider generic web services, i.e., at the Software-as-a-Service (SaaS) and Platform-as-a-Service (PaaS) levels. But the compatibility constraints at the IaaS level differ from those of web services. For example, generic web services are distinguished by their features, QoS and prices; it does not make sense to include two identical services in one composition, as one job does not need to be done twice, but using multiple instances of an IaaS offer is perfectly valid.

TABLE II
SYMBOLS USED IN THE FORMULAS

Symbol  Meaning
a       Resource usage; behaves like a decision variable.
C       Set of all possible cloud providers.
c       A cloud provider, e.g., Amazon, Rackspace, GoGrid.
D       Download speed.
i       Identifies a request.
L       Set of all possible datacenter locations.
l       A datacenter location, e.g., Sydney, Tokyo.
ζ       Latency (download).
M       Memory size (e.g., 8 GB).
P       Price.
R       Set of all possible resources, of all types, whether Compute, Storage or Network.
r       Identifies a resource, e.g., GoGrid XX-Large Instance, S3 Storage Service, EC2 instance.
γ       Set of requests from one user.
S       Storage.
T       Period of time the resource is used.
t       Exact point in time, like a timestamp.
U       CPU speed.
µ       Upload speed.
w       Weight.
III. SYSTEM DESIGN
This section describes our system's architecture and gives details on how it is realised, i.e., the formulas by which weight, rating and cost are calculated. We keep all the formulas in subsection III-A; we then show where and in which step the different formulas are applied, and how they relate to each other, in subsection III-B. In the last subsection, we provide illustrations of the overall system design and include any details worth mentioning that do not fit into the previous subsections.
A. Formal Model
To give a conceptual explanation of our approach to the QoS optimization problem, we define a formal model in
this section. Based on the formal model, we can describe the concepts that are incorporated in the algorithm presented later. In particular, we define a cost estimation function using resource utilization estimations, and a benefit-cost ratio-based evaluation function which considers weights. Furthermore, we present a pair-wise comparison method to calculate normalized weights. For more precise resource utilization estimations, we show how variable resource utilization patterns can be incorporated into cost estimation.
1) Cost Estimation: Let a be the resource usage of a particular resource from a data center location of a cloud provider. For example, we can use a_{storage,any,any} = 50 GB to represent a user's need to store 50 GB of data in the cloud. The symbols' meanings are summarized in Table II. Equation 1 means that the usage of the compute resource r from provider c at location l is between 0 and n. This value is usually suggested by users; our assumption is that users have a rough estimate of how much resource they might need.

a_{r,c,l} ∈ {0, 1, . . . , n}    (1)
To calculate the cost (represented by the function ℘) for one kind of resource used at one point in time, we multiply its usage by the corresponding unit price (P):

℘(t) = a_{r,c,l} · P_{r,c,l}    (2)
After initial filtering of which options are appropriate for the user, we can calculate the total (minimum) price per unit time for the desired resource(s), assuming a constant resource usage pattern throughout the time, as in formula 3. We assume users will choose the time period (T) they want to estimate the price for, e.g., 1 hour or 30 days.

Σ_{r,c,l} a_{r,c,l} · P_{r,c,l} · T_{r,c,l}    (3)
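As a minimal illustrative sketch (not the paper's implementation), the per-period cost of formula (3) sums usage × unit price × rental period over the selected (resource, provider, location) triples. All names and prices below are hypothetical.

```python
# Sketch of formula (3): total price = sum over selected resources of
# a_{r,c,l} * P_{r,c,l} * T_{r,c,l}. Names and prices are illustrative only.

def total_cost(selection):
    """selection: list of dicts with keys 'usage', 'unit_price', 'hours'."""
    return sum(s["usage"] * s["unit_price"] * s["hours"] for s in selection)

# Example: two compute instances plus 50 GB of storage, both for 720 hours.
plan = [
    {"usage": 2,  "unit_price": 0.25, "hours": 720},  # a * P * T for compute
    {"usage": 50, "unit_price": 0.01, "hours": 720},  # a * P * T for storage
]
print(total_cost(plan))  # 2*0.25*720 + 50*0.01*720 = 360 + 360 = 720
```

A real selection would first filter offers against the user's constraints and then evaluate this sum for each surviving combination.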
2) Cost Benefit Ratio: In our decision-making framework, we consider the following QoS statistics: download latency (ζ), download speed (D) and upload speed (µ). These characteristics are important for end-user experience and satisfaction. It is possible to have options with small price differences, or situations where having a high-quality service is more important than saving money. We therefore offer to calculate the cost/benefit ratio for the requested resources as in equation 4:

(w1 · Σ a_{r,c,l} P_{r,c,l} T_{r,c,l} + w2 · ζ̄_{c,l,r}) / (w3 · µ̄_{c,l,r} + w4 · D̄_{c,l,r})    (4)

Since users are likely to select a combination of compute, storage and network services, we sum over resources when calculating the cost. Note that the network QoS of the compute and storage services are collected and stored separately, since a user may be interested in only one of the services. For example, transferring files from (and to) a compute instance's relatively "local" mounted storage is different from downloading or uploading files from/to a dedicated storage-only service (like AWS S3 [37]). In case users select both, we use the average. For instance, in the equation we used D̄ to denote the average of D_compute (download speed measured from the compute service) and D_storage (download speed measured from the storage service).
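The ratio in equation (4) can be sketched as follows; the function name, weights and offer figures are hypothetical and only illustrate the shape of the computation, not measured values.

```python
# Sketch of equation (4): weighted cost (total price + mean download latency)
# divided by weighted benefit (mean upload speed + mean download speed).
# A lower ratio indicates a better offer. All numbers are illustrative.

def cost_benefit_ratio(total_cost, latency, up_speed, down_speed,
                       w1=0.5, w2=0.5, w3=0.5, w4=0.5):
    # w1 + w2 = 1 weights the "cost" side; w3 + w4 = 1 weights the "benefit" side.
    cost = w1 * total_cost + w2 * latency
    benefit = w3 * up_speed + w4 * down_speed
    return cost / benefit

offer_a = cost_benefit_ratio(total_cost=100.0, latency=80.0, up_speed=40.0, down_speed=120.0)
offer_b = cost_benefit_ratio(total_cost=120.0, latency=30.0, up_speed=50.0, down_speed=150.0)
print(offer_a, offer_b)  # offer_b has the smaller ratio, so it ranks higher
```

Note how offer_b ranks better despite its higher price, because its lower latency and higher speeds dominate under equal weights.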
TABLE III
ABSOLUTE VALUE AND CORRESPONDING DESCRIPTIVE SCALE REPRESENTING RELATIVE IMPORTANCE

Scale         Value   Reciprocal∗
equal         1       1
moderate      3       1/3
strong        5       1/5
very strong   7       1/7
extreme       9       1/9

∗ If activity i has one of the above nonzero numbers assigned to it when compared with activity j, then j has the reciprocal value when compared with i.

TABLE IV
SYMBOLS USED IN WEIGHT EXPLANATION

Symbol          Meaning
V               Value given by the user to rate importance.
Vcomputedisk    How important the size of the disk space on the VM is.
Vcost           Importance value for cost.
Vlatency        Importance value for download latency.
Vram            How important the size of the memory allocated to the VM is.
Vspeedupload    Importance value for upload speed.
Vspeeddownload  Importance value for download speed.
x               A user input value.
y               Sum of the row values.
y1              Σ_{n=1..3} x_n + 1
y2              Σ_{n=4..5} x_n + 1 + 1/x_1
τ               Σ_{n=1..4} y_n

Fig. 1. Criteria taken into consideration during comparison (tree figure; panel (a) "Criteria to maximize", panel (b) "Criteria to minimize"). There are two categories: benefit and cost. "Benefit" groups the "good" criteria, which are meant to be maximized: QoS to maximize (download speed and upload speed, under "Speed"; CPU speed and core number, under "Computational capability"; RAM) and non-numeric preferences (location preference). Similarly, "Cost" groups the "bad" criteria to be minimized: QoS to minimize (download latency) and cost (storage usage cost, storage request cost, compute, network data transfer in/out, and total cost for a period of time). The actual values to be collected and stored are at the "leaves" of the tree (i.e., nodes/criteria with no children). For example, under "Benefit", numeric values are collected for "Download/Upload Speed", "CPU Speed" and "Number of Cores". "QoS to Maximize" is the parent category to which "Download/Upload Speed" belongs; no value is stored for this node.
The symbol w represents a weight, which measures the user's perceived importance of a parameter; w1 + w2 = 1 and w3 + w4 = 1 mean that the weights of the cost side and the benefit side each sum to one. Fig. 1 shows the criteria to be optimized, categorized into two groups: to be maximized or to be minimized. As we named this ratio the "Cost Benefit Ratio", we put cost in the numerator and benefit in the denominator; as a result, a smaller ratio indicates a better option. Reversing numerator and denominator would also work, with bigger ratios then indicating better options.
3) Weight Computed by Pairwise Comparison: The weight is calculated based on AHP's pair-wise comparison method. We choose the commonly used scale [38] [39] shown in Table III. In case the user chooses to treat all options equally, (4) becomes (5):

(0.5 · Σ a_{r,c,l} P_{r,c,l} T_{r,c,l} + 0.5 · ζ̄_{c,l,r}) / (0.5 · µ̄_{c,l,r} + 0.5 · D̄_{c,l,r})    (5)

Otherwise, the weight is calculated as shown in Table V. The meaning of the symbols is explained in Table IV.
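Table III's descriptive scale translates directly into matrix entries for the pairwise comparisons. The following minimal sketch (the mapping dictionary and helper name are our own illustration, not the paper's code) shows how a stated preference becomes a matrix entry and its reciprocal.

```python
# Sketch: mapping Table III's descriptive scale to pairwise-comparison entries.
# If criterion i is "strongly" preferred over criterion j, then M[i][j] = 5
# and, by the reciprocal rule in Table III, M[j][i] = 1/5.

SAATY_SCALE = {"equal": 1, "moderate": 3, "strong": 5, "very strong": 7, "extreme": 9}

def pairwise_entry(preference, reversed_=False):
    value = SAATY_SCALE[preference]
    return 1.0 / value if reversed_ else float(value)

print(pairwise_entry("strong"))                  # 5.0
print(pairwise_entry("strong", reversed_=True))  # 0.2
```

Entries produced this way fill the upper triangle of a matrix such as Table V; the lower triangle is then fixed by the reciprocal rule, so the user only answers one question per pair of criteria.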
The fully fledged AHP method consists of repeated matrix squaring to compute the eigenvector; see (6). Each squaring gains a tiny improvement in precision at the cost of expensive computation, and the process is supposed to be repeated until no large enough difference (i.e., to four decimal places) can be observed. In our case, we noticed that the improvement is so small that this rule can be relaxed to omit the iterations of matrix squaring.

[ y1/τ, y2/τ, y3/τ, y4/τ ]^T    (6)

For example, a user may have preferences like those shown in Table VI, which produce the preference matrix M1.

TABLE VI
EXAMPLE USER PREFERENCE

                Vspeedupload  Vspeeddownload  Vram  Vcomputedisk
Vspeedupload    1             1/3             1/5   1/5
Vspeeddownload                1               3     5
Vram                                          1     3
Vcomputedisk                                        1

       | 1    1/3   1/5   1/5 |
M1 =   | 3    1     3     5   |
       | 5    1/3   1     3   |    (7)
       | 5    1/5   1/3   1   |
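The simplified weight calculation described above (normalized row sums instead of iterated matrix squaring) can be sketched as follows, using the matrix M1 from (7); the function name is our own illustration.

```python
# Sketch of the simplified AHP weight calculation: approximate the principal
# eigenvector of the pairwise comparison matrix by normalizing its row sums,
# skipping the repeated matrix-squaring iterations. Matrix values follow M1
# in equation (7).

M1 = [
    [1.0, 1/3, 1/5, 1/5],   # Vspeedupload
    [3.0, 1.0, 3.0, 5.0],   # Vspeeddownload
    [5.0, 1/3, 1.0, 3.0],   # Vram
    [5.0, 1/5, 1/3, 1.0],   # Vcomputedisk
]

def row_sum_weights(matrix):
    sums = [sum(row) for row in matrix]
    total = sum(sums)
    return [s / total for s in sums]

weights = row_sum_weights(M1)
print([round(w, 4) for w in weights])  # normalized weights, summing to 1
```

The resulting weight vector ranks download speed highest, then RAM, then disk, then upload speed, matching the preferences expressed in Table VI.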
TABLE V
MATRIX ILLUSTRATING HOW TO TURN PAIR-WISE PREFERENCES INTO GLOBAL WEIGHTS

                Vspeedupload  Vspeeddownload  Vram   Vcomputedisk
Vspeedupload    1             x1              x2     x3
Vspeeddownload  1/x1          1               x4     x5
Vram            1/x2          1/x4            1      x6
Vcomputedisk    1/x3          1/x5            1/x6   1
Table VII shows the step-by-step computation of the eigenvector from (7) before matrix squaring.

TABLE VII
EXAMPLE EIGENVECTOR CALCULATION

1 + 0.3333 + 0.2    + 0.2    = 1.7333   (row sum)
3 + 1      + 3      + 5      = 12
5 + 0.3333 + 1      + 3      = 9.3333
5 + 0.2    + 0.3333 + 1      = 6.5333
                  Column sum = 29.5999

The resulting eigenvector is:

v1 = [ 0.0586, 0.4054, 0.3153, 0.2207 ]^T
Symbol  AvgQoS  D_compute  D_storage  D̄  ℓ  M_min