Cloud Monitoring and Management
DARGOS
Experimental Results
Conclusions and Future Work
DDS-Enabled Cloud Management Support for Fast Task Offloading
IEEE ISCC 2012, Cappadocia, Turkey
Antonio Corradi (1), Luca Foschini (1), Javier Povedano-Molina (2), Juan M. Lopez-Soler (2)
(1) Dipartimento di Elettronica, Informatica e Sistemistica, Università di Bologna (Italy)
(2) Departamento de Teoría de la Señal, Telemática y Comunicaciones, Universidad de Granada (Spain)
2nd July, 2012
Agenda
- Cloud Monitoring and Management
- DARGOS
  - Data-Centric Publish-Subscribe
  - Architecture
- Experimental Results
  - Testbed Description
  - Results
- Conclusions and Future Work
Cloud monitoring
Cloud monitoring systems can be categorized by:
- Architectural model: centralized vs. decentralized
- Communication model: pull vs. push
- Deployment model: agent-based vs. agentless
Resource monitoring in Clouds
- A typical approach: centralized pull
- A central node queries and stores remote resource usage
  - Pros: easy to implement
  - Cons: central point of failure, request-reply interaction, limited scalability in N:M scenarios, poor support for different update rates, no notifications
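The centralized pull model can be sketched as follows. This is only an illustration: the node names and the `query_node` stand-in are hypothetical, and no real network call or OpenStack API is involved. The point is that every polling round costs one request-reply exchange per monitored node, and that M independent consumers polling N nodes generate N x M requests per round.

```python
import random

NODES = ["node-1", "node-2", "node-3"]  # hypothetical node names


def query_node(node, rng):
    """Stand-in for a blocking request-reply call; returns a fake CPU load in [0, 1)."""
    return rng.random()


def poll_once(nodes, rng):
    """One polling round: the central node sends one request per monitored node."""
    store = {}
    for node in nodes:
        store[node] = query_node(node, rng)
    return store


def requests_per_round(n_nodes, n_consumers):
    """Requests issued per round when every consumer polls every node (N:M case)."""
    return n_nodes * n_consumers


rng = random.Random(0)
store = poll_once(NODES, rng)
print(sorted(store))                       # the nodes the central store now knows about
print(requests_per_round(len(NODES), 4))   # 3 nodes polled by 4 consumers
```

Between two rounds the central store is stale, and nodes have no way to notify the collector of a sudden change.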
Centralized Cloud management
[Figure: Centralized architecture (OpenStack)]
Centralized Cloud management (II)
Virtual Machine instantiation
[Figure: Typical Cloud scenario (OpenStack)]
Types of loads in Clouds
- Services
  - Long-term duration
  - Load is (almost) stable (e.g. Web servers, databases, ...)
- Tasks
  - Short duration (from seconds to a few minutes)
  - Load of each task is unknown a priori
Cloud resource monitoring in dynamic scenarios
- Short- to mid-duration tasks with dynamic load
  - Bag of Tasks (BoT)
  - Media transcoding
  - Computation offloading
- Require an accurate and reliable snapshot of the available resources (real-time update)
  - CPU load, memory usage, system load, hypervisor, ...
- Different goals: maximize throughput, minimize power consumption, ...
DARGOS
- Distributed Architecture for Resource manaGement and mOnitoring in cloudS
- A distributed monitoring system
- "Argos Panoptes": Argos, the "100-eyed" guardian
- Uses a publish-subscribe approach
- Used to collect real-time monitoring data for taking scheduling decisions
DCPS
Data-Centric Publish-Subscribe
- Entities share a data model instead of using interfaces
- Producers publish data conforming to this data model
- Subscribers receive data matching their interests
- Publishers and subscribers are decoupled in space and time
- Middleware can manage data samples
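The idea of sharing a data model rather than an interface can be sketched with a toy in-process bus. Everything here (`Sample`, `Bus`, the topic names) is a hypothetical stand-in, not a DDS implementation: the publisher and the subscriber only agree on the `Sample` type and a topic name, and neither holds a reference to the other. The sketch shows decoupling in space only; decoupling in time would additionally require the middleware to keep samples for late-joining subscribers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Shared data model (hypothetical): the only thing both sides agree on."""
    node: str
    value: float


class Bus:
    """Toy in-process stand-in for the middleware."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, sample):
        # The publisher does not know who, if anyone, receives the sample.
        for callback in self._subscribers[topic]:
            callback(sample)


bus = Bus()
received = []
bus.subscribe("cpu", received.append)          # interested in CPU data only
bus.publish("cpu", Sample("node-1", 0.42))     # delivered
bus.publish("memory", Sample("node-1", 0.10))  # no subscriber for this topic
print(received)
```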
Data Distribution Service (DDS)
- OMG specification for Data-Centric Publish-Subscribe
  - Data model
  - Wire protocol
- Entities exchange Topics (e.g. temperature, 2D position, ...)
- Topics are defined by their name and data type
- Topic samples can contain key data to identify them
- Publishers push Topic updates into Subscribers' local caches
- QoS control and management
- Partition mechanisms
- Unicast and multicast support
- Adopted in time-critical systems (avionics, stock exchange quotations, ...)
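The role of key data and of the subscriber-side cache can be illustrated with a small sketch. The `CpuLoad` type and the `LocalCache` class are assumptions made for this example and do not reproduce any DDS API; keeping only the latest sample per key is also a simplification chosen here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuLoad:
    """Hypothetical topic data type; node_id plays the role of the key."""
    node_id: str
    load: int  # percent


class LocalCache:
    """Toy subscriber-side cache that keeps the latest sample per key."""

    def __init__(self):
        self._latest = {}

    def on_update(self, sample):
        # Samples with the same key update the same instance.
        self._latest[sample.node_id] = sample

    def read(self, node_id):
        # Reading is local: no request is sent to the publisher.
        return self._latest.get(node_id)

    def instances(self):
        return sorted(self._latest)


cache = LocalCache()
for sample in [CpuLoad("node-1", 30), CpuLoad("node-2", 55), CpuLoad("node-1", 70)]:
    cache.on_update(sample)

print(cache.instances())
print(cache.read("node-1"))
```

Because updates are pushed into the cache, a consumer such as a scheduler can read the current state of every node without issuing a request at decision time.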
[Figure: Data Distribution Service]
Architecture
DARGOS Entities
- DARGOS has two kinds of entities:
  - Node Monitoring Agent (NMA): collects and publishes local resource usage (e.g. CPU, system load, memory, ...)
    - Installed at each node
    - 1 resource, 1 topic
  - Cloud Monitoring Supervisor (CMS): interested in remote monitoring data
    - Discovers and subscribes to remote resources
    - Defines its own requirements (reliability, acceptable deadlines)
    - Installed in every application interested in resource data (schedulers, dashboards, ...)
[Figure: DARGOS scenario]
Node Monitoring Agents (NMA)
- Collect local resource data and publish it as DARGOS Topics
- DARGOS NMAs have two operation modes:
  - Periodic
    - NMA periodically pushes resource usage information
    - Maximizes accuracy
  - Event-based
    - NMA pushes resource information under certain conditions (e.g. resource usage delta exceeds a threshold)
    - Maximizes scalability and bandwidth savings
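The event-based condition mentioned above (usage delta exceeds a threshold) could look like the following sketch. The class name, the threshold value and the readings are illustrative assumptions; the slides do not describe the actual NMA implementation.

```python
class EventBasedAgent:
    """Toy agent that publishes a reading only when it differs enough from the last published one."""

    def __init__(self, threshold, publish):
        self.threshold = threshold
        self.publish = publish          # callable that receives each published value
        self._last_published = None

    def observe(self, value):
        """Return True if the reading was published, False if it was suppressed."""
        changed_enough = (
            self._last_published is None
            or abs(value - self._last_published) > self.threshold
        )
        if changed_enough:
            self.publish(value)
            self._last_published = value
        return changed_enough


sent = []
agent = EventBasedAgent(threshold=10, publish=sent.append)
for reading in [20, 22, 25, 40, 41, 38, 15]:   # made-up CPU usage readings, in percent
    agent.observe(reading)

print(sent)  # [20, 40, 15]
```

Seven readings produce three publications here. The price is accuracy: a subscriber may hold a value that is off by up to the threshold.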
Periodic vs. Event-based
Periodic
- Period = 1 second
- Samples published = 10
Event-based
- Samples sent when usage changes range
- Samples published = 5
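The comparison can be reproduced on a small example. The trace below and the range width of 25 percentage points are invented for illustration, chosen so that the counts match the 10 and 5 samples quoted above; they are not the data behind the original figures.

```python
def periodic(samples):
    """Periodic mode: every reading is published."""
    return list(samples)


def on_range_change(samples, width=25):
    """Event-based mode: publish only when the reading enters a different usage range."""
    published = []
    last_range = None
    for value in samples:
        current_range = value // width   # e.g. 0-24 -> 0, 25-49 -> 1, ...
        if current_range != last_range:
            published.append(value)
            last_range = current_range
    return published


# Ten readings, one per second (made-up CPU usage, in percent).
trace = [10, 15, 30, 35, 60, 55, 80, 85, 40, 45]

print(len(periodic(trace)))          # 10
print(len(on_range_change(trace)))   # 5
print(on_range_change(trace))        # [10, 30, 60, 80, 40]
```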
Cloud Monitoring Supervisor (CMS)
- CMS discovers available nodes and their available sensors (DARGOS Topics)
- CMS subscribes to sensor information of interest (CPU, memory, ...)
- Applications that use CMS: Cloud dashboards, schedulers
- Each CMS defines its own quality of service (QoS) requirements
  - Reliability or best effort (RELIABILITY)
  - Maximum allowable delay between updates (DEADLINE)
  - Maximum refresh rate (TIME_BASED_FILTER)
- CMSs establish subscription contracts with NMAs
- On QoS violations, a CMS can trigger actions
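The effect of the DEADLINE and TIME_BASED_FILTER requirements can be sketched with a toy subscription object. The class and its attribute names are assumptions for illustration and are not a DDS API. Two simplifications are made: a missed deadline is detected only when the next sample arrives (real middleware uses a timer), and RELIABILITY is not modeled.

```python
class Subscription:
    """Toy model of two subscriber-side QoS requirements."""

    def __init__(self, deadline, minimum_separation):
        self.deadline = deadline                      # max allowed gap between updates (s)
        self.minimum_separation = minimum_separation  # min gap between deliveries (s)
        self.delivered = []    # (timestamp, value) pairs handed to the application
        self.violations = []   # timestamps at which a missed deadline was noticed
        self._last_delivery = None

    def on_sample(self, timestamp, value):
        if self._last_delivery is not None:
            gap = timestamp - self._last_delivery
            if gap < self.minimum_separation:
                return  # time-based filter: arrived too soon, drop it
            if gap > self.deadline:
                self.violations.append(timestamp)  # hook for a corrective action
        self.delivered.append((timestamp, value))
        self._last_delivery = timestamp


sub = Subscription(deadline=5, minimum_separation=2)
for timestamp, value in [(0, 10), (1, 11), (2, 12), (9, 50), (10, 51)]:
    sub.on_sample(timestamp, value)

print(sub.delivered)    # [(0, 10), (2, 12), (9, 50)]
print(sub.violations)   # [9]
```

A dashboard could use a large minimum separation to limit its refresh rate, while a scheduler could use a tight deadline and treat a violation as a sign that a node is unreachable.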
DARGOS-based Cloud management
[Figure: Cloud scenario (OpenStack + DARGOS)]
Testbed Description
Experimental testbed
- Testbed with DARGOS-enabled OpenStack Cloud
- DARGOS-based OpenStack scheduler
  - Server consolidation
  - Load balancing
- Run multiple tasks with random durations and loads
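The two scheduling goals can be contrasted with a short sketch. The slides do not give the scheduler's algorithm, so the policies below are a common interpretation and an assumption: consolidation packs a task onto the busiest node that can still host it, load balancing picks the least loaded one. Node names, loads and the capacity of 100 are made up.

```python
def pick_node(loads, demand, policy, capacity=100):
    """Return the node that should host a task needing `demand` units, or None if none fits."""
    candidates = {node: load for node, load in loads.items() if load + demand <= capacity}
    if not candidates:
        return None
    names = sorted(candidates)  # sorted for deterministic tie-breaking
    if policy == "consolidation":
        # Fill up busy nodes first so that idle nodes stay empty.
        return max(names, key=lambda node: candidates[node])
    if policy == "load_balancing":
        # Spread the load by choosing the least loaded node.
        return min(names, key=lambda node: candidates[node])
    raise ValueError(f"unknown policy: {policy}")


# Latest loads as a monitoring subscriber might see them (percent, invented values).
loads = {"node-1": 70, "node-2": 20, "node-3": 95}

print(pick_node(loads, 10, "consolidation"))    # node-1 (node-3 cannot fit the task)
print(pick_node(loads, 10, "load_balancing"))   # node-2
```

Either policy is only as good as the load values it reads, which is why an up-to-date snapshot matters for short tasks.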
Testbed description
- OpenStack Cloud fabric
- DARGOS-enabled scheduler service
- Three DARGOS-enabled compute nodes
- RTI DDS 4.5d middleware
- 1 Gbps switch
Results
Results (VM per node)
[Figure: OpenStack out-of-the-box scheduler]
[Figure: OpenStack DARGOS-based scheduler (consolidation)]
Results (bandwidth)
[Figure: Bandwidth consumption per protocol]
Conclusions
- Typical centralized Cloud monitoring systems are not suitable for dynamic scenarios
- DARGOS is a decentralized Cloud monitoring system suitable for fast tasks
- DARGOS can also satisfy service scenarios
- DARGOS is more robust than centralized systems
- The Data-Centric Publish-Subscribe model used by DARGOS makes it possible to manage task-oriented Clouds accurately and reliably
- DARGOS introduces low overhead while maintaining accuracy
Future Directions
- Include more sophisticated scheduling algorithms (e.g. using historical data)
- Extend the notification mechanism to include alarms or complex events
- Add customized action registration to automatically react to certain events (e.g. live migrations)
Q&A
Thank You