Research Article
Brokel: Towards enabling multi-level cloud elasticity on publish/subscribe brokers
International Journal of Distributed Sensor Networks 2017, Vol. 13(8) © The Author(s) 2017 DOI: 10.1177/1550147717728863 journals.sagepub.com/home/ijdsn
Vinicius Facco Rodrigues1, Ivam Guilherme Wendt1, Rodrigo da Rosa Righi1, Cristiano André da Costa1, Jorge Luís Victória Barbosa1 and Antonio Marcos Alberti2
Abstract
Internet of Things networks, together with the data that flow between networked smart devices, are growing at unprecedented rates. Brokers, or intermediary nodes, combined with the publish/subscribe communication model, represent one of the most used strategies to enable Internet of Things applications. From a scalability viewpoint, cloud computing and its main feature, resource elasticity, appear as an alternative to over-provisioned clusters, which normally present a fixed number of resources. However, we perceive that today the elasticity and Pub/Sub duet presents several limitations, mainly related to application rewriting, single-level cloud elasticity, and false-positive resource reorganization actions. Aiming at bypassing the aforesaid problems, this article proposes Brokel, a multi-level elasticity model for Pub/Sub brokers. Users, things, and applications use Brokel as a centralized messaging service broker, but in the back-end the middleware provides better performance and cost (used resources × performance) on message delivery using virtual machine (VM) replication. Our scientific contribution regards the multi-level elasticity, spanning orchestrator and broker, and the addition of a geolocation domain name system service to define the most suitable entry point in the Pub/Sub architecture. Different execution scenarios and metrics were employed to evaluate a Brokel prototype using VMs that encapsulate the functionalities of the Mosquitto and RabbitMQ brokers. The obtained results were encouraging in terms of application time, message throughput, and cost (application time × resource usage) when comparing elastic and non-elastic executions.

Keywords
Internet of Things, publish subscribe communication, cloud elasticity, resource management, multi-level
Date received: 11 April 2017; accepted: 6 August 2017
Academic Editor: Sang-Woon Jeon
Introduction
The Internet of Things (IoT) is considered the next evolution in the era of computing after the Internet.1,2 In the IoT paradigm, many objects around us will be connected in networks and will communicate with each other without any human assistance or intervention. IoT devices implement different types of services for applications related to smart health, smart farming, smart cities, among others. Normally, these services use a middleware layer that can act as a messaging service,
1 Applied Computing Graduate Program, University of Rio dos Sinos Valley (UNISINOS), São Leopoldo, Brazil
2 National Institute of Telecommunication (INATEL), Santa Rita do Sapucaí, Brazil

Corresponding author:
Rodrigo da Rosa Righi, Applied Computing Graduate Program, University of Rio dos Sinos Valley (UNISINOS), Av. Unisinos, 950, São Leopoldo 93022-000, Brazil. Email:
[email protected]
Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (http://www.uk.sagepub.com/aboutus/ openaccess.htm).
receiving information, storing and processing it, and making automatic decisions based on the environment status and historical data.3 Often, brokers or intermediate nodes (in some cases also named super-peers) are used in this layer to enable communication among applications and things in a publish/subscribe (Pub/Sub) fashion.4 Pub/Sub is a communication paradigm for asynchronous data dissemination among services and users.5 In many Pub/Sub systems, clients are able to register subscriptions in a particular broker. In turn, brokers receive post messages from publishers and perform the filtering according to the subscriptions.6,7 As an indirect communication pattern,8 where publishers and subscribers sometimes do not have knowledge of each other, Pub/Sub is useful to implement several IoT scenarios and applications. A smart object is responsible for generating events based on the collected information, which can be used not only by other things or high-level applications, but also further processed by other modules (e.g. an event-processing unit).4 For instance, we can model a Pub/Sub system when collecting patients' vital data via a network of sensors connected to medical devices, delivering the data to a medical center for storage and processing and guaranteeing ubiquitous access to medical data in the electronic healthcare record (EHR) format. Figure 1(a) depicts a standard Pub/Sub deployment with a single broker. However, this organization suffers from the traditional problem of centralized systems, namely scalability. In other words, this deployment sometimes cannot provide quality of service (QoS) as the number of network clients grows, because the larger amount of data in the system imposes larger infrastructure requirements to store, process, and present it efficiently in real time. Figure 1(b) illustrates a possibility to explore scalability using a Peer-to-Peer (P2P) network of brokers.
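To make the subscription-matching behavior concrete, the following minimal in-memory sketch (our own illustration, not code from any of the cited systems; the topic names are invented) shows a topic-based broker that delivers a published message only to clients subscribed to its topic:

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory topic-based Pub/Sub broker (illustrative only)."""

    def __init__(self):
        self.subscriptions = defaultdict(list)   # topic -> subscriber callbacks

    def subscribe(self, topic, callback):
        # A client registers interest in a topic.
        self.subscriptions[topic].append(callback)

    def publish(self, topic, message):
        # Filtering: deliver only to the subscribers of this topic.
        for callback in self.subscriptions[topic]:
            callback(message)

# A sensor publishes a vital sign; only the matching subscriber receives it.
received = []
broker = Broker()
broker.subscribe("patient/42/heart-rate", received.append)
broker.publish("patient/42/heart-rate", "72 bpm")
broker.publish("patient/7/heart-rate", "90 bpm")   # no subscriber: dropped
```

Production brokers such as Mosquitto or RabbitMQ add features such as persistence, wildcard topics, and QoS levels on top of this basic matching loop.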
This organization could be assembled using a cluster or grid infrastructure. Traditionally, both clusters and grids have a fixed number of resources that must be maintained in terms of infrastructure configuration, scheduling (where tools such as PBS (http://www.pbspro.org), OAR (http://oar.imag.fr), and Open Grid Scheduler (http://gridscheduler.sourceforge.net) are usually employed for resource reservation and job scheduling), and energy consumption. The resource requirements of IoT Pub/Sub applications will inevitably fluctuate over time.9 Thus, parallel machines such as clusters and grids may lead to either under-provisioning or over-provisioning situations, incurring performance and cost penalties, in addition to not addressing the growing demand of IoT systems.10 IoT platforms should provide facilities to adapt (i.e. dynamically reconfigure) the allocated resources according to the perceived changes in the environment. Cloud computing can provide the virtual infrastructure to support the IoT paradigm by integrating
monitoring, storage, analytical tools, visualization platforms, and client access.8,10,11 Unlike clusters and grids, cloud providers offer features to hide all the complexity and functionalities necessary to implement an IoT ecosystem.12 Among the main features, it is possible to emphasize elasticity, which allows users to change the cloud capacity dynamically at any time.13 Through the on-demand and pay-as-you-go provisioning principle, the interest in elasticity is related to the benefits it can deliver: better performance, improved resource utilization, and reduced costs. In this way, we can allocate a small number of resources at application launch time, and this number is moldable at runtime without user interference. Although several principles have been studied for cloud computing elasticity and IoT systems, we have not seen such principles for engineering IoT Pub/Sub systems on top of cloud resource reorganization. Our review of the state-of-the-art in the aforementioned topics revealed five gaps: (1) the obligation to change the application source code or to develop additional scripts to transform a non-elastic application into an elastic one;1–3,14 (2) the use of proprietary software components, not available to buy or download;2,3 (3) system scalability, but only for a particular set of Pub/Sub middlewares;1–3,14 (4) virtual machine (VM) thrashing, that is, launching elasticity actions prematurely on sporadic peaks;1–3,14 and (5) single-level elasticity support.11,15–17 Exploring (5), we perceive that cloud elasticity is widely explored in Web systems, where a unique entry point acts as load balancer and elasticity manager. But a research question arises: what happens if the entry point itself becomes overloaded? Aiming at filling these gaps, we propose in this article a model named Brokel, a multi-level elasticity model for Pub/Sub brokers.
Thus, users, things, and applications employ Brokel as a centralized messaging service broker, which is responsible for providing better performance and cost (used resources × performance) on message delivery using cloud elasticity. Figure 1(c) presents the main ideas of our proposal, which includes two main levels: orchestrator and broker. Particularly, the orchestrator is a component in charge of managing the distribution of client requests among the brokers. The term orchestrator is commonly used to define a component with administration functions, as presented in Hurtle (http://hurtle.it/), Open Baton,18 LiveCloud,19 and Roboconf.20 The multi-level keyword refers to the two elasticity levels, orchestrator and broker, and to the addition of a geolocation domain name system (DNS) service to define the most suitable entry point in the Pub/Sub architecture. The broker level is responsible for processing and handling messages, while the orchestrator level performs load balancing and elasticity among brokers. The scientific contribution of Brokel relies on offering a peak-aware, agnostic, elastic
Figure 1. Pub/Sub deployment approaches: (a) centralized, (b) P2P-like distributed style, and (c) Brokel proposal.
Pub/Sub model, which can be implemented in any Pub/Sub system on the market, including RabbitMQ (https://www.rabbitmq.com) and Mosquitto (https://mosquitto.org). The peak-aware feature enables Brokel to avoid false-positive VM allocations, avoiding scale in or out operations when only a momentary load peak is observed. In the next sections, we present more details about Brokel. First, we introduce related work and then describe Brokel in detail, including its architecture and elasticity decision making. Second, we present the evaluation methodology in terms of client application, test
scenarios, and observed metrics. Finally, we present experiment results and final considerations. In particular, in the last part of the article, we emphasize again the scientific contribution of the work, also detailing several challenges that we can address in the future.
Related work
This section presents some initiatives that guided us in developing Brokel. Here, we address Pub/Sub systems and how they handle scalability.
Running in cloud environments, E-STREAMHuB covers a mechanism for automatically expanding and reducing resources for Pub/Sub services with content-based filtering.21 The E-STREAMHuB system is an extension of STREAMHuB,17 a scalable but static Pub/Sub engine. Subscribe messages are sent via unicast to a selected operator, while publish messages are transmitted through broadcast to all operators. E-STREAMHuB introduces data parallelism in Pub/Sub systems, enabling several publish messages to be matched against registered subscriptions in parallel. However, if the number of publish messages grows quickly, the whole solution becomes overloaded, since every message is sent to all operators for matching purposes. Elastic Queue Service (EQS)15 is a message queuing architecture classified as topic-based Pub/Sub, designed to allow elastic scalability, performance, and high availability. Its project concerns the deployment of a cloud computing infrastructure using Infrastructure as a Service (IaaS). EQS elasticity is obtained by migrating topics between processing nodes. If the migration is not enough, the model can instantiate new replicas to meet the growing demand. Migration of threads between nodes, however, can lead to cost and potential service disruption. Blue Dove22 seeks to deploy attribute-based Pub/Sub services in public cloud environments. Available nodes are divided into regions, each one with its attributes. Subscribe and publish messages are sent to the servers that manage a particular region. Its architecture has two layers. The first concerns Dispatcher Servers, which play the role of front-end, directing publishers and subscribers to the nodes that manage a given region. The second works as the system back-end, processing received messages and forwarding them to subscribers when necessary. Its approach only provides attribute-based filtering, in addition to not allowing extraction of contents from subscribe messages on the server side.
Blue Dove partially supports elasticity, allowing only scaling out operations without supporting any kind of resource release in under-utilization cases. In Wang and Ma,23 the authors present a content-based, scalable, and elastic Pub/Sub service called general scalable and elastic content-based Pub/Sub service (GSEC). GSEC proposes a framework with two layers, which uses a hybrid space partitioning technique to achieve a high throughput rate, dividing subscriptions into multiple clusters in a hierarchical way. In addition, a helper-based content distribution technique is proposed to achieve high upload bandwidth, in which servers act as providers and coordinators to fully exploit the system's upload capability. X Ma et al.5 focus on a hierarchical architecture of servers to provide low latency in event matching for attribute-based Pub/Sub systems. Their solution divides all servers into two specific layers that define their function. Servers in the first layer act as load balancers,
applying a hash strategy to define to which broker a request is sent. The brokers act in the second layer, processing client requests by matching events to subscriptions. In addition, a component named PDetection is in charge of monitoring the requests' waiting time in the broker layer and performs elasticity actions by comparing it with defined thresholds. After an elastic action, PDetection performs all reconfigurations in brokers and load balancers to adjust to the new configuration. Elastic computing framework (ECF)24 presents an architecture in which the programmability of front-end machines is drawn upon to dynamically divide data processing tasks into (1) those that see all of the data but perform simple operations on it and (2) those that see a transformed subset of the data and extract deep insights from it. This lifts pressure on the network that links the front-end to the back-end and achieves capability-adaptive deployment of IoT solutions. In Bellavista and Zanni,25 the authors propose a distributed architecture combining machine-to-machine industry-mature protocols (i.e. Message Queue Telemetry Transport (MQTT) and Constrained Application Protocol (CoAP)) to enhance the scalability of gateways for efficient IoT–cloud integration. In addition, the article presents how they applied the approach in a practical experience of efficiently and effectively extending the implementation of the open-source gateway available in the industry-oriented Kura framework for IoT. This architecture is used by Bellavista and Zanni26 to propose a scalability solution for IoT gateways via MQTT–CoAP integration. The authors use the hierarchical structure of MQTT to enable an easy way to manage messages. Using CoAP, the solution provides a CoapTreeHandler responsible for dealing with changes in the hierarchy. Through this strategy, it is possible to add or remove entities without needing topic reconfiguration.
Table 1 presents a comparison among the aforementioned research initiatives. We perceive that some academic research efforts are focusing on providing cloud elasticity for Pub/Sub systems. However, this combination remains an open challenge when it comes to elasticity support for middlewares. Alternatives have to deal with limitations related to either the difficulty of adapting the application to take advantage of elasticity or middleware implementation restrictions. A common strategy in the literature to transform a non-elastic application into an elastic one requires changes in the application source code or the development of additional scripts.5,15,21–23,26 This makes the adoption of resource reorganization for cloud-based Pub/Sub applications more difficult, since developers/administrators must have deep knowledge about both the system environment and the application code. Moreover, we perceive that elasticity is mainly addressed by solutions employing threshold-based reactive techniques, combining
Table 1. Related work comparison in front of seven relevant Pub/Sub features.

Related work | Filtering | Elasticity | Method | Model | Operation | Modification | Restrictions
E-STREAMHuB21 | Content | Yes, multi-level | Horizontal and migration | Reactive automatic | Process migration and VM replication | Yes, in the Middleware/Broker | Applies only to the STREAMHuB Middleware
EQS15 | Topic | Yes, only on one level | Horizontal and migration | Reactive automatic | Process migration and VM replication | Yes, in the Middleware/Broker | The Middleware must support process migration
Blue Dove22 | Attribute | Only adding resources, multi-level | Horizontal | Reactive automatic | VM replication | Yes, in the Middleware/Broker | Applies only to one Middleware
GSEC23 | Content | Yes, multi-level | Horizontal | Reactive automatic | VM replication | Yes, in the Middleware/Broker | Applies only to one Middleware
ECF24 | Content and topic | Yes, only on one level | Horizontal | Reactive automatic | Process migration | Yes, in the Middleware/Broker | Applies only to one Middleware
Bellavista and Zanni25 | Topic | No | – | – | – | Yes, in the Middleware/Broker | Applies only to one Middleware
X Ma et al.5 | Attribute | Yes, only on one level | Horizontal | Reactive automatic | VM replication | Yes, in the Middleware/Broker | Applies only to Matchers
Bellavista and Zanni26 | Topic | No | – | – | – | Yes, in the Middleware/Broker | Applies only to one Middleware

VM: virtual machine.
horizontal and replication techniques.5,15,21–23 Such a strategy requires user experience in configuring specific parameters and deploying cloud resources. However, the use of load thresholds to guide resource reorganization can cause the problem of VM thrashing, where sporadic peaks exceeding one of the thresholds can result in an undesired elasticity action, interpreted as a false positive. Finally, the main limitation of the state-of-the-art today refers to elasticity support, which is offered only at a single level.5,16,24,27–29 What happens when users have an additional broker and need to inform or rewrite their own applications to deal with this new broker? Or yet, how can users know the best broker to receive client (publisher or subscriber) requests in order to reduce communication latency in this interaction?
Brokel proposal
Brokel is a multi-level cloud elasticity model for Pub/Sub Brokers. The model also includes Orchestrator replicas, which are responsible for load balancing clients' requests among the active Broker instances. Brokel offers a two-level elasticity approach, providing elasticity for Brokers and Orchestrators. In this section, we present Brokel, including its architecture, application model, and elasticity mechanism. First, we present the design decisions. Then, we describe details about the architecture and model components. Finally, we discuss the application model and the elasticity decisions.
Design decisions
Brokel provides reactive and horizontal elasticity transparently for users, things, and applications at both the Broker and Orchestrator levels. We adopted horizontal elasticity by applying VM replication, since vertical elasticity is limited by the resources available in a particular physical node. In addition, the majority of operating systems do not allow the addition of resources at runtime without a system reboot.30,31 To develop Brokel, we adopted the following design decisions:

1. Users do not need to configure the elasticity mechanism; however, they can input elasticity thresholds;
2. Developers do not need to rewrite their source code to take advantage of elastic resources;
3. The model is agnostic from the Pub/Sub system viewpoint, since the code of Brokers is encapsulated in VM templates;
4. The Broker provides authentication to the actors who connect to it;
5. The cloud architecture considers homogeneous resources, that is, each VM has the same hardware configuration;
6. Resource provisioning is accomplished through a threshold-based, reactive, horizontal, and automatic elasticity mechanism with VM replication;
7. The elasticity runs in two levels: Broker replication and Orchestrator replication.

Figure 2. Brokel ideas representing how client requests are distributed among the Orchestrators and Brokers.
Architecture
Brokel operates at the Platform as a Service (PaaS) level of a cloud, acting as a middleware that transforms non-elastic Pub/Sub applications into elastic ones. Figure 2 depicts the main ideas of Brokel, highlighting the communication path for information exchange. The model focuses on enabling Pub/Sub Brokers to take advantage of cloud elasticity without needing source code modifications. In this way, Brokel provides elasticity by allocating and consolidating VMs with replicas of application services. Additionally, the Orchestrator plays an important role in the architecture, since it is in charge of balancing clients' requests among Brokers. From the user's viewpoint, the Orchestrator acts as a web server, processing clients' requests like a Broker itself. Users do not need to change their applications to connect to an Orchestrator, since it dispatches the requests to a Broker. Furthermore, aiming at providing a multi-level elasticity model, Brokel also provides elasticity for Orchestrators. The model maintains a DNS server allowing clients to query an appropriate Broker. Thus, the server returns a single address for a particular
Orchestrator, thus acting as a load balancer at the Orchestrator level. To illustrate the Brokel architecture, Figure 3 presents its three main components, each one running in a particular VM: (1) Broker, (2) Orchestrator, and (3) Elasticity Manager. The Broker is in charge of processing clients' requests. Each Broker VM runs a particular message broker system, which is managed by the Pub/Sub provider. Brokel is generic enough to cover any message broker system. In turn, the Orchestrator is in charge of transparently routing clients' requests to the active Brokers. Acting as a wrapper, the Orchestrator forwards requests to a suitable Broker and, if applicable, receives the Broker delivery and forwards it to clients. Moreover, elasticity is provided by the Elasticity Manager at both the Broker and Orchestrator levels. This component performs all activities related to cloud resource and communication topology reorganizations. Additionally, it provides the DNS service that load balances among the Orchestrator VMs. Brokel levels. At the Broker level, the system operates as a traditional Pub/Sub system, which means that the provider does not need to change its system behavior. A Broker receives requests from both clients and Orchestrator processes, without needing to be aware whether the request came from an Orchestrator or a client itself. In this context, for subscribe requests, it just updates its distribution list. On the other hand, for publish requests, the Broker generates a client distribution list and sends the message to these clients. At the Orchestrator level, a particular Orchestrator forwards
Figure 3. The Brokel architecture. While the number of Broker VMs is m, the number of Orchestrator VMs is identified by n. The number of VMs running in the cloud is v, which can be computed by n + m + 1.
clients' requests to Brokers in different ways depending on the type of the request. In particular, Orchestrator processes send requests to and receive data from Brokers as clients. When receiving a publish request, an Orchestrator forwards the message applying a round-robin strategy to select one of the active Brokers. We selected this algorithm due to its simplicity and efficiency for load balancing between servers. Nevertheless, when handling a subscribe request, the Orchestrator connects to all Brokers individually and sends the message to them. This strategy is pertinent to guarantee that all Brokers have their subscriber lists updated. Additionally, it enables a one-to-n communication topology that is not supported by default in a transmission control protocol (TCP) implementation. Brokel Elasticity Manager. The Brokel Manager is in charge not only of DNS services, but also of managing all operations related to resource monitoring and elasticity actions. Therefore, the Manager takes the processing load on VMs as input data for elasticity decision making. The Manager executes all cloud reconfigurations by sending requests to the Cloud Front-End, which is in charge of executing all operations related to the cloud environment by receiving requests through the cloud provider application programming interface (API). The Manager observes the cloud
resources' central processing unit (CPU) occupation levels at constant periods of time, evaluating the need for elasticity. To illustrate its functioning, Figure 4 shows the main monitoring phases that the Brokel Manager executes. At each monitoring observation, the Manager collects data from the available resources and evaluates it afterward for elasticity decision making. When scale out operations take place, the new resources require some time to be actually available to the application. This happens due to the process of transferring the VM image to a physical node and the bootstrap overhead of the VM operating system. The Brokel Manager is aware of this process, delivering the new resources only after they are in fact online. Thus, after starting the allocation process, the Manager continues its execution flow and, at the beginning of each monitoring observation, checks whether the new resources are online. In particular, after performing an elasticity action, the Manager needs to reorganize the communication topology. To clarify this process, Algorithms 1 and 2 present pseudo-codes with the main operations the Manager performs in each cloud elasticity moment. Algorithm 1 presents the pseudo-code when a scaling out operation takes place. When adding a new Broker VM (line 4), the Elasticity Manager configures this new Broker with the current subscribers list (line 5). Specifically, line 8 of the algorithm represents the
Figure 4. Brokel Manager computation phases.
Algorithm 1. Pseudo-code of a scaling out operation that the Elasticity Manager performs.
1 Input: Scaling out operation
2 Output: New resource
3 if add_broker() then
4 add_brokerVM(cloud_frontend);
5 copy_subscribers_to(new_broker);
6 for (i = 0; i < amount_of_orchestrator; i++) do
7 connect_to(orchestrator[i]);
8 config(orchestrator[i], new_broker);
9 disconnect();
10 end for
11 else
12 add_orchestratorVM(cloud_frontend);
13 config_broker_list(new_orchestrator);
14 add_dns(new_orchestrator);
15 end if
Algorithm 2. Pseudo-code of a scaling in operation that the Elasticity Manager performs.
1 Input: Scaling in operation
2 Output: Removed resource
3 if remove_broker() then
4 for (i = 0; i < amount_of_orchestrator; i++) do
5 connect_to(orchestrator[i]);
6 remove_broker(orchestrator[i], broker);
7 disconnect();
8 end for
9 while broker is processing request do
10 wait();
11 end while
12 remove_VM(cloud_frontend, broker);
13 notify_orchestrators();
14 else
15 remove_dns(orchestrator);
16 while orchestrator is processing request do
17 wait();
18 end while
19 remove_VM(cloud_frontend, orchestrator);
20 end if
operation where the Elasticity Manager reconfigures all active Orchestrator processes, including the new Broker in their lists. In the case of adding a new Orchestrator (line 12), the Elasticity Manager configures the list of Brokers in this new Orchestrator (line 13) and updates the DNS service (line 14). However, instead of connecting to external VMs, when updating the DNS the Manager only needs to reconfigure its own DNS service. When removing either a Broker or an Orchestrator, the Manager must ensure that the target VM is neither receiving nor processing clients' requests before performing the scale in operation. Thus, the Manager executes a collection of operations to guarantee it. Algorithm 2 presents the pseudo-code when a scaling in operation takes place, showing these operations. When removing a Broker VM, the Manager executes four steps. First, it configures all Orchestrators not to send requests to the Broker that will be removed (line 6). Second, it waits until the target Broker is idle and not processing client requests (line 9). This step is important to guarantee that clients' requests will not be lost. In the next step, the Manager removes the Broker VM from the cloud (line 12). Finally, it notifies all Orchestrators to completely remove the Broker from their lists (line 13). In the case of removing an Orchestrator, the Manager performs similar operations. First, it removes the Orchestrator from the DNS service (line 15) in order to guarantee that new client requests will not be processed by this particular Orchestrator. Then, as when removing a Broker, the Manager waits until the Orchestrator finishes clients' requests (line 16). Only after that does it remove the Orchestrator VM from the cloud (line 19).
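The drain-then-remove discipline of the scaling in operation can be sketched in Python as follows. Every class and method name here (the stubs, remove_vm, and so on) is a hypothetical stand-in for Brokel's components and the cloud provider API, written only to make the ordering of the four steps explicit:

```python
import time

class StubBroker:
    """Hypothetical stand-in for a Broker VM handle."""
    def __init__(self):
        self.pending = 2          # in-flight requests still being processed
        self.removed = False
    def is_processing(self):
        self.pending -= 1         # pretend one request drains per poll
        return self.pending > 0

class StubOrchestrator:
    """Hypothetical stand-in for an Orchestrator."""
    def __init__(self):
        self.active, self.known = True, True
    def remove_broker(self, broker):   # stop routing new requests to broker
        self.active = False
    def forget_broker(self, broker):   # drop broker from the list entirely
        self.known = False

class StubCloud:
    """Hypothetical stand-in for the Cloud Front-End API."""
    def remove_vm(self, broker):
        broker.removed = True

def scale_in_broker(broker, orchestrators, cloud, poll_interval=0.0):
    # 1. Stop routing new requests to the target Broker (Algorithm 2, line 6).
    for orch in orchestrators:
        orch.remove_broker(broker)
    # 2. Drain: wait until in-flight requests finish, so none are lost (line 9).
    while broker.is_processing():
        time.sleep(poll_interval)
    # 3. Only now release the VM back to the cloud (line 12).
    cloud.remove_vm(broker)
    # 4. Notify Orchestrators to forget the Broker for good (line 13).
    for orch in orchestrators:
        orch.forget_broker(broker)

broker, orchs, cloud = StubBroker(), [StubOrchestrator(), StubOrchestrator()], StubCloud()
scale_in_broker(broker, orchs, cloud)
```

The essential design point is the ordering: routing is cut before draining, and the VM is released only after the drain completes, which is what guarantees no client request is lost.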
Application model
Brokel offers a two-level elasticity approach for both Brokers and Orchestrators. Thus, many Brokers and Orchestrators are available to process clients' requests. Consequently, the flow of these requests differs depending on the request type (publish or subscribe). Figure 5 shows the request processing flowchart when a client request arrives. First, a client requests an Orchestrator from the DNS service, which selects one of the active Orchestrator replicas employing a load balancing algorithm. Next, the client sends the message to the Orchestrator, which analyzes it to discover the request type. Depending on this type, the Orchestrator acts in two different ways. If the message is a subscribe request, the Orchestrator stores this information and forwards the request to all active Brokers. The process of storing information in the Orchestrator is important in situations where the Brokel elasticity mechanism includes new Brokers: every new Broker needs to be updated with all subscriptions before it can operate properly. On the other hand, if the message is a publish request, the Orchestrator applies the load balancing algorithm to select the most suitable Broker to forward the message. This strategy is pertinent to prevent two different Brokers from processing the same client request. Also, it increases the model's processing capacity. In the last phase of the flowchart, the message arrives at a Broker. The Broker then analyzes its type and, if the message is a subscribe request, the Broker just stores this information in its table and the flow ends. However, if the message is a publish request, the Broker engine generates the client distribution list and sends it to the Orchestrator. Finally, the Orchestrator sends the message to all clients and the flow ends.

Figure 5. Message processing flowchart.

Elasticity decisions
When considering the task of monitoring resources, cloud platforms typically use metrics from the operating system to determine the node workload. In this context, CPU load in percentage is the most used metric, particularly pertinent for CPU-intensive applications.29,32,33 Thus, the Brokel Manager monitors the CPU load of each VM periodically and computes the load for each level. If resource reorganization is necessary, the Manager proceeds with elasticity actions using the cloud API. When completing the elasticity verification and action tasks, the Brokel Manager ends a monitoring observation and waits for the next monitoring cycle. One role of the Manager is to generate the system load (l) in a way that minimizes the effect of disturbances or noise on the behavior of the target level. Thus, we work with time series and simple exponential smoothing (SES),34 also referred to as a weighted moving average,35 over the CPU load metric of each VM. Equation (1) presents l(o) as the level load at the o-th monitoring observation, considering n active VMs on the considered level. This equation is an arithmetic average of the load on each VM, which is computed through l'(v, o). Here, v is a VM index, o is the current monitoring observation, and n is the number of VMs running application processes of one of the two levels (see equation (2)).

l(o) = (Σ_{v=0}^{n-1} l'(v, o)) / n    (1)

l'(v, o) = cpu(v, o),                      if o = 0
l'(v, o) = cpu(v, o)/2 + l'(v, o-1)/2,    if o ≠ 0    (2)
l' consists of an SES average in which the current observation o has a stronger influence than o-1 on the final result (the weights start at 1/2 and decay to 1/4, 1/8, and so on). The recurrence ends in the cpu(v, o) computation, which returns the CPU load of VM v at observation o. This smoothing technique is pertinent to avoid VM thrashing,28 that is, false-positive or false-negative operations when scaling VMs in or out. A false-positive situation happens when an elasticity action is triggered after a sudden load peak exceeds some threshold. False-negative scenarios appear mainly in elasticity strategies that trigger resource reorganization only when a threshold is exceeded either for a predefined number of consecutive observations or for a predefined time: in both situations, elasticity could be pertinent to adjust the load, but it does not occur or is delayed.

Algorithm 3. Pseudo-code of the Publisher.

Data: Input file
Result: Publish messages
initialization;
for each line in file do
    topic = get_param(line, 1);
    data1 = get_param(line, 2);
    data2 = get_param(line, 3);
    message = concat(data1, data2);
    publish(topic, message, server_address);
end for
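The level-load computation of equations (1) and (2) and the consecutive-observation trigger can be sketched as follows. The function names and the persistence window k are illustrative choices, not Brokel's actual implementation.

```python
def ses_load(prev, cpu_now):
    """Equation (2): smoothed per-VM load. The current sample and the
    accumulated history each weigh 1/2, so older samples decay with
    weights 1/4, 1/8, and so on."""
    return cpu_now if prev is None else prev / 2 + cpu_now / 2

def level_load(smoothed_loads):
    """Equation (1): arithmetic mean of the smoothed loads of the
    n active VMs of a level."""
    return sum(smoothed_loads) / len(smoothed_loads)

def should_scale_out(level_history, upper, k=3):
    """Trigger scale-out only after k consecutive observations above
    the upper threshold, filtering out one-off load peaks
    (false positives)."""
    return len(level_history) >= k and all(l > upper for l in level_history[-k:])
```

For instance, a single CPU spike of 95% decays to 57.5% and then 38.75% over the next two observations, so by itself it cannot sustain several observations above a 70% threshold.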
Evaluation methodology

In this section, we present methodological aspects related to the Brokel evaluation, starting from the application. Next, we discuss both the infrastructure and the evaluation scenarios. Finally, we show the metrics used to evaluate the experiments in the proposed scenarios.
Application prototype

To evaluate Brokel, we designed two different applications: (1) a Publisher application to produce and send messages to our platform and (2) a Subscriber application to receive these messages. The Publisher receives an input file containing a list of messages to produce for specific topics. Each line of the file contains three parameters: (1) target topic, (2) data field 1, and (3) data field 2. Algorithm 3 shows the Publisher as a loop processing each line of the file separately; each line results in the publication of one message. The Subscriber, on the other hand, receives the same file but processes it differently: it reads each line, retrieves the target topic, and sends a subscribe request to the server, as shown in Algorithm 4. Regarding optimizations, a topic receives a subscription only once. To simulate data input to our model, we used a data set36 with over 21 million records. Each record describes the precise location of one of the 316 taxis of Rome with the following structure: taxi identification, location, and timestamp (date/time). The Publisher application uses the identification field as the topic and the location and timestamp as the message body. The records cover the period between 1 February 2014 and 2 March 2014.
Algorithm 4. Pseudo-code of the Subscriber.

Data: Input file
Result: Topic subscriptions
initialization;
for each line in file do
    topic = get_param(line, 1);
    if find_topic(topic, topicList) == 0 then
        add_topic(topic, topicList);
        subscribe(topic, server_address);
    end if
end for
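For concreteness, Algorithms 3 and 4 can be rendered in Python roughly as below. The semicolon-separated record layout and the injected publish/subscribe callables are assumptions for illustration; the real clients would call a broker client library instead.

```python
def run_publisher(lines, publish):
    """Algorithm 3: every input record becomes one publish operation."""
    for line in lines:
        fields = line.strip().split(";")
        topic = fields[0]                    # taxi identification
        message = " ".join(fields[1:3])      # location + timestamp
        publish(topic, message)

def run_subscriber(lines, subscribe):
    """Algorithm 4: subscribe to each topic at most once."""
    seen_topics = set()
    for line in lines:
        topic = line.strip().split(";")[0]
        if topic not in seen_topics:         # optimization from the text
            seen_topics.add(topic)
            subscribe(topic)
```

With the taxi data set, `run_publisher` emits one message per location record on the taxi's own topic, while `run_subscriber` issues one subscribe request per distinct taxi.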
Elasticity manager prototype and cloud infrastructure

We implemented a Brokel prototype for private clouds using OpenNebula (https://opennebula.org/) version 4.12.1. The Brokel Elasticity Manager was coded in Java and uses the Java-based OpenNebula API for both monitoring and elasticity activities. Two VM image templates were provided: one for the Orchestrators and another for the Brokers. The grain of an elasticity action is always a single VM, so each resource reorganization adds or removes one Orchestrator or one Broker. Moreover, as presented in Sladescu and Fekete,14 we configured the Brokel Manager's periodic monitoring activity with an interval of 15 s, which is the lowest bound OpenNebula supports for periodic VM monitoring. Concerning the cloud infrastructure, our cloud is composed of 11 homogeneous 2.9 GHz dual-core nodes (1 front-end and 10 worker nodes), each with 4 GB of RAM, interconnected by a 100 Mbps network. These nodes run the Ubuntu Server 14.04 LTS operating system. In addition, we configured a shared data area on the cloud front-end to store application files and VM images. Finally, our DNS was deployed using the BIND (https://www.isc.org/downloads/bind/) DNS server version 9.9.5.
Evaluation scenarios

To evaluate our solution, we designed a set of scenarios with different configurations, considering strategies that support either single- or multi-level elasticity. We also evaluated the performance of our application prototype as a basis of comparison among the different scenarios; this strategy follows the work of Barazzutti et al.,21 which first evaluated an application without elasticity support as a baseline. Thus, aiming at analyzing the impact of different configurations on application performance, we defined three scenarios: (s1) running the application with the lowest number of VMs and the elasticity feature disabled, (s2) running the application starting with the lowest number of VMs and with elasticity enabled in the Broker level, and (s3) running the application starting with the lowest possible number of VMs and with multi-level elasticity enabled. For all executions in these scenarios, the application started with a single VM per level. In scenario s1, the goal is to obtain results without elasticity; these results represent traditional approaches, in which the application runs with the same amount of resources during the entire execution. In scenario s2, we evaluate our model as a single-level elasticity solution, reorganizing resources only in the Broker level. Finally, in scenario s3, we evaluate our model with elasticity enabled in both the Broker and Orchestrator levels; this is the scenario that characterizes our model as a multi-level elasticity solution. Regarding scenario s2, we used two Broker environments to model three subscenarios: (s2.1) VM replicas running the Mosquitto Broker, (s2.2) VM replicas running the RabbitMQ Broker, and (s2.3) 50% of VM replicas running Mosquitto and 50% running RabbitMQ. Furthermore, in these subscenarios, we first executed the application with the cloud service-level agreement (SLA) set to allow a maximum of 6 VMs in the cloud; later, we ran the application again with the SLA set to allow a maximum of 10 VMs. Concerning elasticity parameters, 70% and 90% were used for the upper threshold (tu), whereas 30% and 50% were used for the lower threshold (tl). The combinations of these thresholds guided the elasticity in the executions of scenarios s2 and s3. In summary, there are 30 different executions when considering all scenarios and thresholds. Figure 6 presents all experiment scenarios and the parameters used. To generate input load for scenarios s1 and s2, we instantiated 20 publisher processes and divided the first 400,000 records of our data set among them; we also instantiated 20 subscriber processes to receive these data. In scenario s3, as multi-level elasticity is enabled, we doubled the input data to generate a higher throughput and, consequently, a higher load: we instantiated 40 publisher processes, divided the first 800,000 records of our data set among them, and instantiated 40 subscriber processes to receive these data.
Figure 6. Evaluation scenarios: without elasticity in s1 and with elasticity in s2 and s3.
Figure 7. Application execution time of all scenarios and configurations.
Evaluation metrics

Focusing on performance and resource consumption, our evaluation analyzes the aforementioned scenarios against four metrics: time, messages per second, cost, and efficiency. The first represents the performance perspective, referring to the application execution time. The second is the throughput of our solution, calculated by dividing the total number of messages sent to the system by the total execution time. The cost metric, in turn, represents the amount of resources the application used along the whole execution. To estimate the cost, it is necessary to know each VM deployment time. In equation (3), s is the maximum number of allocated VMs and pte(j) is the time spent running the configuration with j VMs. For example, consider an execution with 10 s using 2 VMs, 60 s using 4 VMs, 50 s using 6 VMs, and 40 s using 4 VMs; here, cost = 10 × 2 + (60 × 4 + 40 × 4) + 50 × 6 = 720. Finally, efficiency represents the occupation rate of the allocated resources along the execution. In equation (4), s and pte(j) have the same meaning as in equation (3).

$cost = \sum_{j=1}^{s} j \cdot pte(j)$  (3)

$efficiency = \frac{\sum_{j=1}^{s} j \cdot pte(j)}{time}$  (4)

Results

This section presents the results, organized in five subsections for better exploring each perspective of cloud elasticity applied to Pub/Sub systems.

Application time

Figure 7 illustrates the time spent handling the workload in each environment configuration. The y-axis refers to the type of Broker used and the replica limit, while the x-axis corresponds to the time measured in seconds, where each bar refers to a threshold setting. In particular, the first value is the lower threshold (tl), while the second is the upper threshold (tu); for example, 50/90 refers to a lower threshold of 50% and an upper threshold of 90%. The environment with the Mosquitto broker and 1 VM obtained the best result among the executions of evaluation scenario s1 (without elasticity), taking 7,280 s (121.3 min) to finish. In this case, no load limits were used because cloud elasticity was disabled. RabbitMQ with 1 VM obtained a conclusion time of 10,520 s (175.3 min), being 44.5% slower than the Mosquitto environment. The percentage of reduction in execution time using Brokel is shown in Figure 8, which compares Mosquitto with 6 VMs and 10 VMs against the deployment using a single VM; the same comparison was conducted for the RabbitMQ broker. Brokel reduced the time required to handle the workload by up to 76.6% in the environments running the Mosquitto broker; the advantage was even greater with RabbitMQ, reaching a time reduction of 81.2%. In general, the best result obtained using elasticity was with Mosquitto (10 VMs) when using tl and tu equal to 30% and 70%, respectively, with a conclusion time of 1,700 s. The worst case was obtained in the RabbitMQ (6 VMs) environment using 50% for tl and 90% for tu, with a conclusion time of 2,600 s. The upper threshold of 70% was responsible for allocating VMs faster, since the system does not execute with a load greater than this limit. Consequently, the larger the number of VMs, the shorter the execution time, since we have a CPU-intensive incoming demand.

Figure 8. Percentage of reduction in execution time.

Table 2. Time in seconds the application ran with each resource configuration set and scenario.

Platform  | Limit  | tl/tu | 1 VM   | 2 VMs | 4 VMs | 6 VMs | 8 VMs | 10 VMs | Total (s)
Mosquitto | 1 VM   | –     | 7,280  | –     | –     | –     | –     | –      | 7,280
Mosquitto | 6 VMs  | 30/70 | –      | 180   | 160   | 1,600 | –     | –      | 1,940
          |        | 30/90 | –      | 300   | 200   | 1,720 | –     | –      | 2,220
          |        | 50/70 | –      | 240   | 240   | 1,600 | –     | –      | 2,080
          |        | 50/90 | –      | 220   | 300   | 1,440 | –     | –      | 1,960
Mosquitto | 10 VMs | 30/70 | –      | 120   | 220   | 240   | 220   | 900    | 1,700
          |        | 30/90 | –      | 200   | 180   | 200   | 180   | 960    | 1,720
          |        | 50/70 | –      | 180   | 160   | 160   | 160   | 1,120  | 1,780
          |        | 50/90 | –      | 160   | 440   | 180   | 200   | 900    | 1,880
RabbitMQ  | 1 VM   | –     | 10,520 | –     | –     | –     | –     | –      | 10,520
RabbitMQ  | 6 VMs  | 30/70 | –      | 140   | 160   | 2,200 | –     | –      | 2,500
          |        | 30/90 | –      | 200   | 180   | 2,060 | –     | –      | 2,440
          |        | 50/70 | –      | 300   | 160   | 1,980 | –     | –      | 2,440
          |        | 50/90 | –      | 380   | 200   | 2,020 | –     | –      | 2,600
RabbitMQ  | 10 VMs | 30/70 | –      | 120   | 160   | 160   | 200   | 1,340  | 1,980
          |        | 30/90 | –      | 160   | 220   | 160   | 200   | 1,480  | 2,220
          |        | 50/70 | –      | 380   | 240   | 160   | 160   | 1,300  | 2,240
          |        | 50/90 | –      | 220   | 180   | 160   | 200   | 1,280  | 2,040
Mix       | 6 VMs  | 30/70 | –      | 160   | 160   | 1,980 | –     | –      | 2,300
          |        | 30/90 | –      | 400   | 180   | 1,880 | –     | –      | 2,460
          |        | 50/70 | –      | 220   | 240   | 1,860 | –     | –      | 2,320
          |        | 50/90 | –      | 640   | 220   | 1,720 | –     | –      | 2,580
Mix       | 10 VMs | 30/70 | –      | 180   | 160   | 160   | 160   | 1,340  | 2,000
          |        | 30/90 | –      | 300   | 200   | 260   | 220   | 1,140  | 2,120
          |        | 50/70 | –      | 520   | 160   | 160   | 220   | 1,100  | 2,160
          |        | 50/90 | –      | 300   | 180   | 180   | 200   | 1,120  | 1,980

VM: virtual machine.

Resource allocation

Table 2 shows the utilization time of the virtual resources in each environment configuration. For example, the Mosquitto environment with a limit of 10 VMs using tl equal to 50% and tu equal to 90% used 2 VMs for 160 s, 4 VMs for 440 s, 6 VMs for 180 s, 8 VMs for 200 s, and 10 VMs for 900 s, totaling 1,880 s of execution. Figure 9 displays the utilization
time of the virtual resources in each environment configuration. Here, the y-axis refers to the type of Broker, the replica limit, and the thresholds used; the x-axis corresponds to the time in seconds, and each stacked bar of the graph corresponds to the amount of occupied resources. Figure 10 depicts the state of the thresholds at each monitoring point. The y-axis refers to the type of Broker, the allowed replica limit, and the thresholds used in the experiments; the x-axis corresponds to the number of collections performed, and each stacked bar represents the state of the thresholds. Thirty-nine readings above tu were registered in the following deployments: (i) Mosquitto with 10 VMs using tl of 50% and
Figure 9. Profile of VM configuration along the application execution in each scenario and configuration.
Figure 10. Load status in all monitoring observations.
Figure 11. Messages per second handled by the application.
tu of 90%; (ii) Mix with 10 VMs using tl of 30% and tu of 90%; and (iii) Mix with 10 VMs using tl of 50% and tu of 90%. This is the smallest number recorded in the experiments; in these settings, the application spent the least time overloaded. On the other hand, the environment with the greatest overhead was RabbitMQ (6 VMs) with tl of 50% and tu of 90%, registering 98 readings above tu.
Throughput analysis

Figure 11 illustrates the rate of messages per second handled in each environment setting. The y-axis refers to the type of broker used and the allowed replica limit; the x-axis corresponds to the number of messages per second (Mps), and each bar of the graph represents a threshold setting. It is important to highlight that the messages have the same size in all executions. Mosquitto (1 VM) obtained the best result among the configurations of scenario s1 (without elasticity), reaching a rate of 54.95 Mps; the RabbitMQ (1 VM) environment achieved 38.02 Mps, 30.8% lower. With the limit of six VM replicas, the best result was obtained by the Mosquitto (6 VMs) environment using tl of 30% and tu of 70%, reaching 206.19 Mps; the worst case was the RabbitMQ (6 VMs) environment using 50% and 90% for tl and tu, respectively, with 153.85 Mps. With the replica creation limit set to 10 VMs, 235.29 Mps were handled in the Mosquitto (10 VMs) environment using tl of 30% and tu of 70%; the worst result was recorded in the RabbitMQ (10 VMs) environment, where 178.57 Mps were handled using tl of 50% and tu of 70%, 24.1% lower than the Mosquitto (10 VMs) environment. In general, the best result with elasticity appeared when employing Mosquitto (10 VMs) with 30% and 70% for the thresholds, reaching 235.29 Mps, and the worst case was obtained with RabbitMQ (6 VMs) using tl of 50% and tu of 90%, resulting in 153.85 Mps.
Cost and efficiency

In Table 3, we present the results regarding the cost of each execution. The cost corresponds to the total amount of resources used during the execution of the experiment; thus, the idea is to minimize it. Figure 12 depicts the information presented in Table 3: the y-axis refers to the type of broker used and the allowed replica limit, while the x-axis corresponds to the cost value, where each bar represents a threshold setting. The best cost was obtained by the Mosquitto (6 VMs) environment with tl equal to 50% and tu equal to 90%, using 10,280 resources to complete the execution. The worst case was perceived in RabbitMQ (10 VMs) with 30% and 90% for the thresholds, totaling 18,560 resources during the execution. Table 4, in turn, presents the efficiency of each evaluated environment. The efficiency corresponds to the average number of resources (VMs) allocated in each second of execution; thus, the idea is to minimize this number to maximize system efficiency. The best result was obtained by the Mosquitto (6 VMs) environment with tl of 50% and tu of 90%, averaging 5.245 VMs in each second of execution. The worst case was obtained in the RabbitMQ (10 VMs) environment with 30% and 70% for the thresholds, averaging 8.505 VMs per second.
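The cost and efficiency figures follow equations (3) and (4) and can be checked with a few lines of code. The helper names are hypothetical; the profiles below come from the worked example in the metrics section and from the Mosquitto (6 VMs, 50/90) row of Table 2.

```python
def cost(profile):
    """Equation (3): sum of (number of VMs x seconds) over the run.
    `profile` is a list of (vms, seconds) pairs."""
    return sum(vms * seconds for vms, seconds in profile)

def efficiency(profile):
    """Equation (4): cost divided by total time, i.e. the average
    number of VMs allocated per second of execution."""
    return cost(profile) / sum(seconds for _, seconds in profile)

# Worked example from the metrics section:
# 10 s with 2 VMs, 60 s with 4 VMs, 50 s with 6 VMs, 40 s with 4 VMs.
example = [(2, 10), (4, 60), (6, 50), (4, 40)]

# Mosquitto, limit of 6 VMs, thresholds 50/90 (Table 2 row).
mosquitto_6_5090 = [(2, 220), (4, 300), (6, 1440)]
```

Here `cost(example)` yields 720, and `cost(mosquitto_6_5090)` yields 10,280 with an efficiency of about 5.245 VMs per second, matching Tables 3 and 4.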
Multi-level resource allocation

The main objective of scenario s3 is to evaluate the elasticity performance in the Orchestrator layer. To accomplish this, two replicas of Orchestrator VMs were used,
Table 3. Cost results of all evaluation scenarios.

Platform  | Resources | –      | 30/70  | 30/90  | 50/70  | 50/90
Mosquitto | 1 VM      | 7,280  | –      | –      | –      | –
Mosquitto | 6 VMs     | –      | 10,600 | 11,720 | 11,040 | 10,280
Mosquitto | 10 VMs    | –      | 13,320 | 13,360 | 14,440 | 13,760
RabbitMQ  | 1 VM      | 10,520 | –      | –      | –      | –
RabbitMQ  | 6 VMs     | –      | 14,120 | 13,480 | 13,120 | 13,680
RabbitMQ  | 10 VMs    | –      | 16,840 | 18,560 | 16,960 | 16,520
Mix       | 6 VMs     | –      | 12,840 | 12,800 | 12,560 | 12,480
Mix       | 10 VMs    | –      | 16,640 | 16,120 | 15,400 | 15,200

Columns give the thresholds (tl/tu). VM: virtual machine.

Figure 12. Total cost to execute the application in all scenarios.

Table 4. Efficiency results of all evaluation scenarios.

Platform  | Resources | –    | 30/70 | 30/90 | 50/70 | 50/90
Mosquitto | 1 VM      | 1.00 | –     | –     | –     | –
Mosquitto | 6 VMs     | –    | 5.464 | 5.279 | 5.308 | 5.245
Mosquitto | 10 VMs    | –    | 7.835 | 7.767 | 8.112 | 7.319
RabbitMQ  | 1 VM      | 1.00 | –     | –     | –     | –
RabbitMQ  | 6 VMs     | –    | 5.648 | 5.525 | 5.377 | 5.262
RabbitMQ  | 10 VMs    | –    | 8.505 | 8.360 | 7.571 | 8.098
Mix       | 6 VMs     | –    | 5.583 | 5.203 | 5.414 | 4.837
Mix       | 10 VMs    | –    | 8.320 | 7.604 | 7.130 | 7.677

Columns give the thresholds (tl/tu). VM: virtual machine.
as well as three Mosquitto Broker replicas and another three Broker replicas running RabbitMQ. Table 5 shows the utilization time of the virtual resources in each environment configuration. For example, the Broker level with a limit of 6 VMs using tl equal to 50% and tu equal to 90% used 2 VMs for 1,900 s, 4 VMs for 280 s, and 6 VMs for 3,040 s, totaling 5,220 s of execution. During the same experiment, the Orchestrator level used 1 VM for 4,120 s and 2 VMs for 1,100 s, totaling the same 5,220 s of execution. Figure 13 corresponds to the resource allocation graph of the experiments in scenario s3. Figure 14 shows the CPU utilization from the Orchestrator viewpoint when running scenario s3 with
Table 5. Time the application ran with each resource configuration set in the multi-level scenario.

Level        | Limit | tl/tu | 1 VM  | 2 VMs | 4 VMs | 6 VMs | Total (s)
Broker       | 6 VMs | 30/70 | –     | 100   | 1,380 | 3,140 | 4,620
             |       | 30/90 | –     | 200   | 1,400 | 3,320 | 4,920
             |       | 50/70 | –     | 1,260 | 340   | 3,040 | 4,640
             |       | 50/90 | –     | 1,900 | 280   | 3,040 | 5,220
Orchestrator | 2 VMs | 30/70 | 360   | 4,260 | –     | –     | 4,620
             |       | 30/90 | 480   | 4,440 | –     | –     | 4,920
             |       | 50/70 | 3,200 | 1,440 | –     | –     | 4,640
             |       | 50/90 | 4,120 | 1,100 | –     | –     | 5,220

VM: virtual machine.
Figure 13. Profile of resource allocation in scenario s3.
Figure 14. Resource utilization and allocation in scenario s3 with thresholds tl of 30% and tu of 70%.
30% and 70% for thresholds. Here, the blue line represents the total CPU allocated at the moment of data collection, while the red line refers to the CPU usage. Figure 15 presents the same information as Figure 14, but for scenario s3 with 50% and 70% for thresholds. These two figures were selected to demonstrate the impact that the threshold settings can cause on an environment with elasticity. In the experiment represented by Figure 14, the second Orchestrator replica is instantiated around observation 13. This action is performed because the total processing of the environment exceeds the value of tu. The environment is maintained with two replicas until the end of the experiment. During the execution shown in Figure 15, the second Orchestrator replica is also instantiated around observation 13. However, the
Figure 15. Resource utilization and allocation in scenario s3 with thresholds tl of 50% and tu of 70%.
second replica is deactivated soon after its creation, as the total processing of the environment drops below tl. The activation and deactivation of the second replica repeat until the end of the experiment: the load exceeds tu with a single replica but falls below tl with two replicas.
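This oscillation can be reproduced with a toy simulation. The assumption that the level load divides evenly across replicas, as well as the two-replica cap and the step count, are simplifications for illustration only.

```python
def simulate(level_load, lower, upper, steps=6):
    """Replay the scaling decisions: scale out when the per-VM load
    exceeds the upper threshold, scale in when it drops below the
    lower one. Returns the replica count after each observation."""
    vms, trace = 1, []
    for _ in range(steps):
        per_vm = level_load / vms          # idealized even load split
        if per_vm > upper and vms < 2:
            vms += 1                       # instantiate second replica
        elif per_vm < lower and vms > 1:
            vms -= 1                       # deactivate it again
        trace.append(vms)
    return trace
```

With a level load of 80%, thresholds 30/70 keep the second replica (80/2 = 40% stays between the thresholds), whereas 50/70 creates and destroys it at every other observation, which is the thrashing-like behavior seen in Figure 15.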
Conclusion

This article presented Brokel, a multi-level elasticity model for Pub/Sub brokers, enabling a performance-driven communication bus that integrates applications, things, and users. Our idea is to address the growing IoT demand in a way that is effortless from the clients' viewpoint, enabling them to perceive an acceptable performance level when using the Pub/Sub system regardless of load and time of day. We introduced two elasticity levels, Broker and Orchestrator, in which the latter acts as a load-balancing wrapper that does not require changing existing Pub/Sub applications. Cloud elasticity is also offered as a service agnostic to brokers; that is, we only need to create VM templates for each target broker, allowing Brokel to work with any Pub/Sub broker (such as RabbitMQ or Mosquitto). Our scientific contribution relies on providing horizontal elasticity at both the Orchestrator and Broker levels, adding the possibility of using a geolocation DNS service to define the most suitable entry point (Orchestrator) to the messaging system. Another Brokel feature that we did not observe in the Pub/Sub state-of-the-art is the aging (SES) method used to evaluate the load of both levels, which avoids VM thrashing on elasticity actions. To the best of our knowledge, there is no other solution that provides cloud elasticity at both the Broker and Orchestrator levels. Thus, we compared our solution with different configurations and non-elastic scenarios. Our results are encouraging and serve as a proof of concept for the use of elasticity in Pub/Sub systems. We evaluated Brokel by developing a prototype and a client application, testing both against different scenarios, brokers, and metrics. Comparing elastic and non-elastic environments, we highlight performance gains of 76.6% and 81.2% when using the Mosquitto and RabbitMQ brokers, respectively. From the throughput viewpoint, we also obtained benefits: from 54.95 to 235.29 Mps for Mosquitto and from 38.02 to 202.02 Mps for RabbitMQ. From the resource perspective, the best result averaged 5.24 VMs per second of execution, with a lower threshold of 50% and an upper threshold of 90%; these threshold values postpone resource allocation, since new resources are only instantiated after the system load crosses 90%. On the other hand, the worst result in the resource perspective happened when using 30% and 70% for the lower and upper thresholds, with an average of 8.51 VMs. In the elasticity scope, it is pertinent to observe that execution time and resource allocation are inversely proportional metrics. Future research includes the evaluation of Brokel in real environments; we are studying the possibility of assessing it in post office agencies. Furthermore, future studies include additional evaluation of research strategies employing our solution. We also intend to improve network performance by investigating event notification strategies such as those proposed in the Scalable Internet Event Notification Architecture (SIENA) (http://www.inf.usi.ch/carzaniga/siena/).6,7 Although presenting encouraging results, the current version of Brokel uses reactive elasticity, in which lower and upper thresholds must be informed beforehand. Thus, we plan to explore proactive elasticity in order to drive elasticity actions in advance, delivering resources before the system enters an over- or under-provisioning state.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the following Brazilian agencies: CNPq, FAPERGS, and CAPES. The work was also partially supported by Finep, with resources from Funttel, grant no. 01.14.0231.00, under the Radiocommunication Reference Center (Centro de Referência em Radiocomunicações (CRR)) project of Inatel, Brazil.
References 1. Gubbi J, Buyya R, Marusic S, et al. Internet of Things (IoT): a vision, architectural elements, and future directions. Future Gener Comp Sy 2013; 29(7): 1645–1660. 2. Ngu AH, Gutierrez M, Metsis V, et al. IoT middleware: a survey on issues and enabling technologies. IEEE Internet Things 2017; 4(1): 1–20. 3. Santos J, Rodrigues JJPC, Casal J, et al. Intelligent personal assistants based on internet of things approaches. IEEE Syst J. Epub ahead of print 19 May 2016. DOI: 10.1109/JSYST.2016.2555292. 4. Yao L, Sheng QZ and Dustdar S. Web-based management of the internet of things. IEEE Internet Comput 2015; 19(4): 60–67. 5. Ma X, Wang Y, Qiu Q, et al. Scalable and elastic event matching for attribute-based publish/subscribe systems. Future Gener Comp Sy 2014; 36: 102–119 (Special section: intelligent big data processing special section: behavior data security issues in network information propagation special section: energy-efficiency in large distributed computing architectures special section: eScience infrastructure and applications). 6. Tarkoma S. Scalable internet event notification architecture (SIENA). In: Proceedings of the research seminar on middleware for mobile computing, Helsinki, 20 February 2002. Helsinki: Department of Computer Science, University of Helsinki. 7. Carzaniga A, Rosenblum DS and Wolf AL. Design and evaluation of a wide-area event notification service. ACM T Comput Syst 2001; 19(3): 332–383. 8. Botta A, de Donato W, Persico V, et al. Integration of cloud computing and internet of things: a survey. Future Gener Comp Sy 2016; 56: 684–700. 9. Duran-Limon HA, Siller M, Blair GS, et al. Using lightweight virtual machines to achieve resource adaptation in middleware. IET Softw 2011; 5(2): 229–237. 10. Giang NK, Blackstock M, Lea R, et al. Developing IoT applications in the fog: a distributed dataflow approach. 
In: Proceedings of the 2015 5th international conference on the Internet of Things (IOT), Seoul, South Korea, 26–28 October 2015, pp.155–162. New York: IEEE. 11. Truong HL and Dustdar S. Principles for engineering IoT cloud systems. IEEE Cloud Comput 2015; 2(2): 68–76. 12. Robertazzi TG. Chapter 10: grids, clouds, and data centers. In: Robertazzi TG (ed.) Introduction to computer networking. Cham: Springer International Publishing, 2017, pp.113–127.
13. Chilipirea C, Constantin A, Popa D, et al. Cloud elasticity: going beyond demand as user load. In: Proceedings of the 3rd international workshop on adaptive resource management and scheduling for cloud computing (ARMSCC'16), Chicago, IL, 25–28 July 2016, pp.46–51. New York: ACM. 14. Sladescu M and Fekete A. Event aware elasticity control for cloud applications. Technical report 687, April 2012. Sydney, NSW, Australia: The University of Sydney. 15. Tran NL, Skhiri S and Zimányi E. EQS: an elastic and scalable message queue for the cloud. In: Proceedings of the 2011 IEEE 3rd international conference on cloud computing technology and science (CloudCom), Athens, 29 November–1 December 2011, pp.391–398. New York: IEEE. 16. Moore LR, Bean K and Ellahi T. Transforming reactive auto-scaling into proactive auto-scaling. In: Proceedings of the 3rd international workshop on cloud data and platforms (CloudDP'13), Prague, 14 April 2013, pp.7–12. New York: ACM. 17. Barazzutti R, Felber P, Fetzer C, et al. StreamHub: a massively parallel architecture for high-performance content-based publish/subscribe. In: Proceedings of the 7th ACM international conference on distributed event-based systems, Arlington, TX, 29 June–3 July 2013, pp.63–74. New York: ACM. 18. Carella GA and Magedanz T. Open Baton: a framework for virtual network function management and orchestration for emerging software-based 5G networks. Newsletter, 2016, http://resourcecenter.fd.ieee.org/fd/product/enewsletters/FDSDNNL0005 19. Wang X, Liu Z, Qi Y, et al. LiveCloud: a lucid orchestrator for cloud datacenters. In: Proceedings of the 4th IEEE international conference on cloud computing technology and science, Taipei, Taiwan, 3–6 December 2012, pp.341–348. New York: IEEE. 20. Pham LM, Tchana A, Donsez D, et al. Roboconf: a hybrid cloud orchestrator to deploy complex applications. In: Proceedings of the 2015 IEEE 8th international conference on cloud computing, New York, 27 June–2 July 2015, pp.365–372.
New York: IEEE. 21. Barazzutti R, Heinze T, Martin A, et al. Elastic scaling of a high-throughput content-based publish/subscribe engine. In: Proceedings of the 2014 IEEE 34th international conference on distributed computing systems (ICDCS), Madrid, 30 June–3 July 2014, pp.567–576. New York: IEEE. 22. Li M, Ye F, Kim M, et al. A scalable and elastic publish/ subscribe service. In: Proceedings of the 2011 IEEE international parallel distributed processing symposium (IPDPS), Anchorage, AK, 16–20 May 2011, pp.1254– 1265. New York: IEEE. 23. Wang Y and Ma X. A general scalable and elastic content-based publish/subscribe service. IEEE T Parall Distr 2015; 26(8): 2100–2113. 24. Zhong T, Doshi KA, Lu Z, et al. Capability adaptive elastic IoT architecture. In: Proceedings of the 2015 IEEE international conference on smart city/SocialCom/SustainCom (SmartCity), Chengdu, China, 19–21 December 2015, pp.615–622. New York: IEEE.
25. Bellavista P and Zanni A. Towards better scalability for IoT-cloud interactions via combined exploitation of MQTT and CoAP. In: Proceedings of the 2016 IEEE 2nd international forum on research and technologies for society and industry leveraging a better tomorrow (RTSI), Bologna, 7–9 September 2016, pp.1–6. New York: IEEE.
26. Bellavista P and Zanni A. Scalability of Kura-extended gateways via MQTT-CoAP integration and hierarchical optimizations. In: Proceedings of the 11th EAI international conference on body area networks (BodyNets'16), Turin, 15–16 December 2016, pp.210–216. Brussels: Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering (ICST).
27. Ali-Eldin A, Tordsson J and Elmroth E. An adaptive hybrid elasticity controller for cloud infrastructures. In: Proceedings of the 2012 IEEE network operations and management symposium (NOMS), Maui, HI, 16–20 April 2012, pp.204–212. New York: IEEE.
28. Bryant R, Tumanov A, Irzak O, et al. Kaleidoscope: cloud micro-elasticity via VM state coloring. In: Proceedings of the 6th conference on computer systems (EuroSys'11), Salzburg, 10–13 April 2011, pp.273–286. New York: ACM.
29. Nikolov V, Kächele S, Hauck FJ, et al. CLOUDFARM: an elastic cloud platform with flexible and adaptive resource management. In: Proceedings of the 2014 IEEE/ACM 7th international conference on utility and cloud computing (UCC'14), London, 8–11 December 2014, pp.547–553. Washington, DC: IEEE Computer Society.
30. Dutta S, Gera S, Verma A, et al. SmartScale: automatic application scaling in enterprise clouds. In: Proceedings of the 2012 IEEE 5th international conference on cloud computing (CLOUD), Honolulu, HI, 24–29 June 2012, pp.221–228. New York: IEEE.
31. Lorido-Botran T, Miguel-Alonso J and Lozano J. A review of auto-scaling techniques for elastic applications in cloud environments. J Grid Comput 2014; 12(4): 559–592.
32. Herbst NR, Kounev S, Weber A, et al. BUNGEE: an elasticity benchmark for self-adaptive IAAS cloud environments. In: Proceedings of the 10th international symposium on software engineering for adaptive and self-managing systems (SEAMS'15), Florence, 18–19 May 2015, pp.46–56. Piscataway, NJ: IEEE Press.
33. Lim HC, Babu S, Chase JS, et al. Automated control in cloud computing: challenges and opportunities. In: Proceedings of the 1st workshop on automated control for datacenters and clouds (ACDC'09), Barcelona, 19 June 2009, pp.13–18. New York: ACM.
34. Herbst NR, Huber N, Kounev S, et al. Self-adaptive workload classification and forecasting for proactive resource provisioning. In: Proceedings of the 4th ACM/SPEC international conference on performance engineering (ICPE'13), Prague, 21–24 April 2013, pp.187–198. New York: ACM.
35. Yazdanov L and Fetzer C. Vertical scaling for prioritized VMs provisioning. In: Proceedings of the 2012 2nd international conference on cloud and green computing, Xiangtan, China, 1–3 November 2012. Washington, DC: IEEE Computer Society.
36. Bracciale L, Bonola M, Loreti P, et al. CRAWDAD dataset roma/taxi (v. 2014-07-17), 2014, http://crawdad.org/roma/taxi/20140717