Configuration Discovery and Monitoring Middleware for Enterprise Datacenters
Sandip Agarwala, Luis A. Bathen†, Divyesh Jadav, Ramani Routray
IBM Almaden Research Center, San Jose, CA 95120
†University of California, Irvine, CA 92697
Email: {sagarwala, divyesh, routrayr}@, [email protected]
Abstract - Automatic discovery and monitoring of IT resources is a critical part of enterprise systems management. In addition to ascertaining internal device configurations, this discovery process may also need to capture the capabilities, usage, connectivity, availability, and other information related to various IT components. Systems resource management (SRM) tools typically implement this discovery process using device-specific APIs, custom agents and/or a standards-based solution (like WBEM and CIM). The discovery actions need to be systematically planned; an inefficient implementation or schedule may easily take from a few minutes to several hours to complete in a large heterogeneous enterprise datacenter. This paper discusses the various challenges associated with discovering the configuration of a datacenter environment and presents an autonomic configuration monitoring middleware called Magellan that builds upon industry best practices and standards. Magellan reduces the overall discovery time by more than 50% in our micro-benchmark experiments as well as in a large datacenter configuration of a major financial organization.
I. INTRODUCTION

Efficient management of datacenter resources is critical for any enterprise. The demand on IT resources and the breadth of their features and functionality is increasing, which in turn drives up the cost and complexity of managing them. Additionally, the proliferation of technologies and of vendor-specific hardware and software solutions makes the management of an enterprise datacenter very difficult. Systems Resource Management (SRM) tools provide an integrated administrative interface and automate some of the management tasks, which may include monitoring, reporting, planning, analysis and control. This integration is made possible by discovering and probing each device of interest, and storing its configuration and performance related information in a non-volatile repository (such as a relational database), which can later be queried to perform other management actions. For example, storage resource provisioning [8] or planning for disaster recovery [10] requires detailed data related to device configurations and historical performance behavior (to build workload models, etc.). For SRM tools to be scalable and efficient, the discovery process must also be scalable and efficient. There are, however, multiple issues that hinder scalable discovery of enterprise datacenters. First, enterprise applications and data impose numerous requirements in terms of performance, backup, disaster recovery, archival, etc. These make it necessary to capture all information related to the datacenter configuration and make it available to high-level management functions.
Second, in addition to traditional datacenters, SRM tools are increasingly being used for managing additional IT environments such as clouds and high-performance computing, which pose scalability challenges. Third, administrators need online access to workload and resource (both virtual and physical) information, including that of topology, connectivity, configuration and performance. Some information, like network topology, changes infrequently; other information, like health status, may change in a fraction of a second. Fourth, some of the existing tools [1], [15], [20], [23] address scalability issues in enterprise monitoring by relying on custom monitoring agents, custom protocols, custom data representations and/or custom formats for persistent storage. However, today's datacenters are commonly composed of heterogeneous devices like servers, switches, storage controllers, etc. These may potentially come from many different vendors and may have their own custom interfaces, protocols and/or 'element managers' to get access to their information. Building and managing custom discovery and monitoring infrastructures for such a heterogeneous environment would be a management nightmare. Also, maintaining custom agents for every new release of the devices is non-trivial. Fortunately, there exist standards like SNMP [19], CIM [3], WBEM [24] and SMI-S [17] that have been adopted by many server, storage and networking device vendors. These standards make it easier for different hardware and software products to interoperate and provide a uniform way of exchanging information across them for the purpose of monitoring and control. However, they don't address the scalability issue that arises because of the large amount of information that is collected by the SRM tools. This paper addresses some of the challenges associated with discovering CIM-based enterprise environments and presents and evaluates a new tool called Magellan that makes discovery more scalable and efficient. Specifically, it uses a combination of load balancing techniques and intelligent CIM graph traversal that divides the discovery process into smaller jobs, and parallelizes and distributes these jobs across different 'CIM Agents' that provide information about the physical and logical IT resources in their management domain. Magellan makes multiple technical contributions: First, it identifies inefficiencies associated with configuration discovery using the CIM standards-based approach in enterprise datacenters. Second, it presents various optimizations and load balancing strategies that make the process of
configuration discovery run faster. Third, the proxy-based architecture of Magellan keeps track of the load across different agents and does not require any modifications to the existing standards and/or CIM agents; it is also general enough to work with any CIM profile and managed device (or entity) that presents a CIM interface. Finally, the paper describes the implementation of the novel optimization strategies in an open-source SRM platform called Aperi [2], presents a comparative study with the base discovery process of Aperi, and evaluates and analyzes the performance of Magellan in a complex datacenter configuration deployed by a major financial organization.
SubSystems[] = getAllSubSystems();
for each subsystem in SubSystems; do
    Ranks[] = getAllRanks(subsystem);
    for each rank in Ranks; do
        Disks[] = getAllDisks(rank);
        for each disk in Disks; do
        end for
    end for
    Extents[] = getAllExtents(subsystem);
    for each extent in Extents; do
        Volumes[] = getAllVolumes(extent);
        for each volume in Volumes; do
        end for
    end for
    FCports[] = getAllFCports(subsystem);
    for each vol in Volumes; do
        Volume2Port[] = getAssociationsVolume2Port(vol);
    end for
end for

FCSwitches[] = getAllFCSwitches();
for each fcswitch in FCSwitches; do
    FCports[] = getAllFCports(fcswitch);
    Zones[] = getAllZones(fcswitch);
    ZoneSets[] = getAllZoneSets(fcswitch);
    for each zone in Zones; do
        Zone2ZoneSet[] = getAssociationsZone2ZoneSet(zone);
    end for
end for

Fig. 1. Sample pseudo-code for a CIM discovery recipe
II. BACKGROUND

In order to provide uniformity and interoperability in representing and managing IT assets, organizations such as DMTF [4] and SNIA [18] have come up with standards for systems management. CIM [3] provides a common definition of management information for systems, networks, applications and services. WBEM [24] defines a set of standard operations and protocols that allow CIM clients and agents to communicate. SNIA's Storage Management Initiative Specification (SMI-S) [17] is based on CIM and WBEM, and it standardizes and streamlines Storage Area Network (SAN) management functions. In a CIM-based managed environment, each device has a management module called a CIM agent, which may be either embedded in the device itself, or communicate with it from an external physical machine. Each CIM agent contains a set of providers, which are pluggable libraries that realize the CIM profiles by serving the requested information out of the respective devices. CIM agents provide a unified and consistent way for the SRM tools to monitor and control these devices. SRM tools discover their managed environment and store the configuration and properties of all the physical and logical entities in a relational database that is later queried to perform different management tasks. Before running the discovery process, a recipe is defined that lists the sequence of steps that need to be executed by the discovery engine of the SRM tool. This is done by the CIM client within the discovery engine. It makes WBEM calls to the CIM agents, parses the responses, correlates (maps) them with the results of other CIM queries in order to establish the dependencies across them, and finally stores them in the SRM database. Figure 1 shows a sample recipe to discover storage subsystems and fiber channel switches in a datacenter. Recipe formulation is a complex job and is usually done by highly skilled systems designers or developers. It requires a deep understanding of the CIM profiles published by different vendors as well as of the schema of the underlying database repository where the collected information is stored. The complexity arises due to a combination of factors: First, a device's CIM profile may have a large number of CIM elements (classes, associations, methods, etc.) that specify the management interface for the device. Second, the same information can be obtained using more than one set of WBEM
operations. However, the cost of an operation depends on its type (enumeration or association) as well as on the class type being requested. Third, there exist dependencies between different CIM classes, as a result of which the output of one WBEM operation may serve as an input to another WBEM operation. Fiber channel 'Port2Port' association calls, for example, are made only after all port instances have been enumerated. Fourth, different CIM agents and clients may have different capabilities and limitations in terms of processing power, memory, concurrency, etc. Fifth, the quality of the providers implemented by vendors varies widely. For example, in the case of a fiber channel switch, the process of collecting its ports' information may be faster using association-type WBEM calls for one switch vendor's provider, but faster using enumeration-type WBEM calls for another vendor's provider. Finally, there are industry best practices that these recipes need to adhere to before invoking CIM operations.

III. MAGELLAN OPTIMIZATIONS

CIM, WBEM, SMI-S and other standards address the interoperability issues in systems management, but they don't explicitly address the scalability problem that may arise in large enterprise datacenters, specifically with respect to discovery and monitoring when a large amount of data is gathered from different physical and logical entities. In order to make the discovery clients more scalable, we built a new autonomic middleware framework called Magellan that reduces the total discovery time using a series of optimizations. Before we get into the details of these optimizations, we list some of our goals and assumptions below.
Fig. 2. Magellan Architecture (SRM server with configuration discovery clients, Magellan, CIM agents (Ci), embedded CIM agents, and managed devices)
• Reduce the time taken to execute a given discovery recipe while staying in compliance with CIM-based standards, which allows for broader applicability and lets Magellan plug into any environment that adopts these standards.
• Prioritize between different discovery tasks and load balance across multiple CIM agents.
• We assume that at least one CIM agent reports each managed device. For a device that doesn't support CIM standards, we have to build a CIM provider that can communicate with the device using its proprietary APIs and respond to the CIM queries related to that device.
• We assume that the constraints related to the CIM agents (like supported APIs, version compatibility, etc.) are available from their vendors.

Figure 2 shows the architecture of the Magellan framework, interposed between the CIM clients and the CIM agents. It may reside on an external networked server or within the SRM server (along with the CIM clients). The job of Magellan is to control the execution of recipes by the CIM clients and to regulate the flow of requests (i.e. WBEM calls) to the CIM agents. This is achieved using four basic components: a controller, a knowledge base, a scheduler and a self-monitoring module.
There are multiple factors that govern the execution of a discovery recipe and the total time required for its completion. These include both the physical and the functional capabilities of the SRM server (running the CIM clients), the server(s) running the CIM agents, the database server and the underlying network. Table I lists some of the metrics that Magellan monitors and maintains in its knowledge base. In addition to these metrics, the knowledge base may also contain vendor-provided best practices and the results of customer experiences and large-scale interoperability (testing) labs. Some examples of knowledge-base information are: (a) association-based CIM calls for a particular combination of device type, vendor and provider version are slow; (b) the CIM agent for a particular combination of device type, vendor and provider version does not accept concurrent parallel CIM client connections. Since an enterprise datacenter may evolve over time due to provider version upgrades, addition of new vendor devices and configuration changes, it is hard to statically design one discovery recipe that would scale in all environments. In our proposed Magellan framework, the recipe author (or designer) writes the multi-pronged recipe in terms of CIM primitives. At runtime, Magellan's controller optimizes and orchestrates the execution of the recipe given the state, capabilities and monitored metric values of the managed environment. These optimizations are discussed in detail in the next few sub-sections.
TABLE I
LIST OF METRICS MONITORED BY MAGELLAN

Type              Metrics
SRM Server        CPU; Memory; Max no. of threads; Max no. of threads allocated for discovery; Max no. of concurrent CIM client connections
Database Server   CPU; Memory; Application heap size; Max no. of concurrent database connections
Network           Bandwidth; Latency
CIM Agent         CPU; Memory; Max no. of concurrent CIM client connections; No. of devices reported by CIM agent; Device to CIM agent mapping; CIM provider version; Avg. response time for each type of CIM call; Avg/Max no. of CIM instances per CIM class; Avg/Max size of CIM instances; Avg/Max no. of attributes per CIM instance
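As a concrete illustration, the fragment below sketches one way such knowledge-base entries could be represented in memory, keyed by device type, vendor and CIM provider version. This is only a sketch in Java; the class and field names (AgentProfile, preferredCallType, maxConcurrentConnections, KnowledgeBase) are assumptions made for illustration and are not Magellan's published data model.

import java.util.HashMap;
import java.util.Map;

// Illustrative knowledge-base entry, keyed by (device type, vendor, provider version).
// All names are hypothetical; the real Magellan schema is not described in this paper.
class AgentProfile {
    enum CallType { ENUMERATION, ASSOCIATION }

    final String deviceType;        // e.g. "FCSwitch", "StorageSubsystem"
    final String vendor;
    final String providerVersion;

    int maxConcurrentConnections;   // vendor-specified concurrency limit
    CallType preferredCallType;     // best-practice hint, e.g. "association calls are slow"
    Map<CallType, Long> avgResponseTimeMs = new HashMap<>();   // runtime profile

    AgentProfile(String deviceType, String vendor, String providerVersion) {
        this.deviceType = deviceType;
        this.vendor = vendor;
        this.providerVersion = providerVersion;
    }
}

// The knowledge base is then little more than a lookup table that the controller
// consults before deciding how to parallelize a recipe and which WBEM call to issue.
class KnowledgeBase {
    private final Map<String, AgentProfile> profiles = new HashMap<>();

    private static String key(String type, String vendor, String version) {
        return type + "/" + vendor + "/" + version;
    }

    void put(AgentProfile p) {
        profiles.put(key(p.deviceType, p.vendor, p.providerVersion), p);
    }

    AgentProfile lookup(String type, String vendor, String version) {
        return profiles.get(key(type, vendor, version));
    }
}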
A. Multi-level parallelization
The pseudo-code in Figure 1 shows that the discovery recipe executed by a CIM client is essentially a sequential depth-first traversal. In this sequential approach, significant processing cycles are wasted both at the CIM client and at the CIM agent(s), because a substantial amount of time is spent waiting for I/O. As discussed before, a CIM client makes WBEM calls to CIM agent(s), which in turn may make multiple command-line (or web service) requests to the managed device and wait for them to respond. The CIM client spends a significant amount of time waiting for responses from the CIM agent(s). After processing a response, it waits again for the database to finish storing the collected information in its repository. These idle windows provide substantial scope for parallelization. The Magellan framework parallelizes the discovery recipe along two dimensions. First, it identifies disjoint sets of operations (i.e. steps) in the recipes that can be executed concurrently. Each disjoint set forms a 'basic block' of execution. Two basic blocks won't have any common steps; similarly, two steps belonging to two different basic blocks won't depend on each other for their execution. Basic blocks are simply the connected components in the recipe graph, where steps are the vertices and edges represent the dependencies between them. Each basic block is executed in a separate CIM client instance. The second level of parallelization happens at the instance level. In the pseudo-code of Fig. 1, once the 'SubSystems' information is received from the CIM agent, each returned instance of 'SubSystems' is processed concurrently in a separate CIM client. That is, each iteration of the outer 'for loop' is executed in parallel in a separate thread. Since the data is not shared between different clients, there is no synchronization requirement between different threads. The same process is repeated for the next level 'for loop', and so on until the concurrency limit is reached. Fig. 3 illustrates this concept.

Fig. 3. Parallel execution of a CIM discovery recipe
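To make the instance-level dimension concrete, the sketch below unrolls the outer loop of the Figure 1 recipe and runs each subsystem's sub-recipe in its own CIM client thread. This is a simplified Java sketch under stated assumptions, not Magellan's actual implementation: AgentQuery, getAllSubSystems and discoverSubsystem are hypothetical stand-ins for the WBEM calls and per-subsystem steps of the recipe, and the pool size corresponds to the concurrency limit discussed next.

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelRecipeSketch {
    // Hypothetical handle to a CIM agent; the two methods stand in for the
    // enumeration and association steps of the Figure 1 recipe.
    interface AgentQuery {
        List<String> getAllSubSystems();
        void discoverSubsystem(String subsystem);   // ranks, disks, extents, volumes, ports, ...
    }

    static void runRecipe(AgentQuery agentQuery, int concurrencyLimit) throws InterruptedException {
        // One CIM client instance (thread) per subsystem, capped by the concurrency limit.
        ExecutorService clients = Executors.newFixedThreadPool(concurrencyLimit);
        for (String subsystem : agentQuery.getAllSubSystems()) {
            // Iterations share no data, so no synchronization is needed between threads.
            clients.submit(() -> agentQuery.discoverSubsystem(subsystem));
        }
        clients.shutdown();
        clients.awaitTermination(1, TimeUnit.HOURS);
    }
}

The same unrolling can be applied recursively to the inner loops (ranks, extents, volumes) until the concurrency limit is reached.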
The concurrency limit, i.e. the number of concurrent CIM client instances that can be created, depends on multiple factors, some of which are listed in Table I. For example, some vendors limit the number of concurrent connections to a small number, or may serialize requests from all incoming connections internally for security and consistency reasons. These limit the degree of parallelization. Also, a device may be reported by more than one CIM agent with potentially different provider versions. Furthermore, different CIM operations may have different costs (i.e. resource requirements) associated with them. Magellan automatically determines the concurrency level by actively monitoring the number of outstanding jobs that have been sent to the CIM agents, the CPU load on the server running the CIM agents, and the management server performing the discovery. The concurrency level is initially set to the maximum number of open connections supported by the CIM agents. This information comes from our knowledge base and is dependent on the provider's version. Depending on the monitored load, Magellan dynamically adjusts the number of CIM client instances using a simple algorithm. Specifically, if the load on a CIM agent rises above a specific threshold window, it decrements the concurrency level until the load goes below the window. This also helps in controlling the perturbation at the management server. The threshold window may be set to some default value (70% - 80%, for example) or it may be modified by the system administrator. Obviously, the number of CIM clients cannot exceed the concurrency limit specified by the vendor of the CIM agent.
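The fragment below sketches the threshold-window adjustment just described. The 70-80% window and the one-step decrement follow the text above; the load-sampling interface, the method names and the step that restores concurrency once the load drops are assumptions made for illustration.

// Illustrative sketch of Magellan's concurrency adjustment; names are hypothetical.
class ConcurrencyController {
    interface LoadMonitor {
        double cimAgentCpuUtilization();   // 0.0 - 1.0, sampled by the self-monitoring module
    }

    private final LoadMonitor loadMonitor;
    private final int vendorConcurrencyLimit;   // from the knowledge base
    private int concurrencyLevel;

    private static final double LOW_WATERMARK  = 0.70;   // default threshold window,
    private static final double HIGH_WATERMARK = 0.80;   // tunable by the administrator

    ConcurrencyController(LoadMonitor monitor, int vendorLimit) {
        this.loadMonitor = monitor;
        this.vendorConcurrencyLimit = vendorLimit;
        this.concurrencyLevel = vendorLimit;    // start at the maximum supported connections
    }

    // Called on every monitoring sample.
    void adjust() {
        double load = loadMonitor.cimAgentCpuUtilization();
        if (load > HIGH_WATERMARK && concurrencyLevel > 1) {
            concurrencyLevel--;                 // back off until the load re-enters the window
        } else if (load < LOW_WATERMARK && concurrencyLevel < vendorConcurrencyLimit) {
            concurrencyLevel++;                 // restoring parallelism is an assumption;
                                                // the paper only describes decrementing
        }
    }

    int concurrencyLevel() { return concurrencyLevel; }
}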
B. Choice of WBEM Operations

As discussed earlier, WBEM defines a set of operations that are used by a CIM client to fetch data from a CIM agent. The agent may provide more than one primitive to query the same data. The cost of a query (in terms of resource usage and latency) is in part dependent on the choice of WBEM operations. For example, in order to get pool, volume and their relationship information for a particular storage subsystem from the CIM agent, the CIM client may make the calls shown in the following two pseudo-codes:

function getPoolVolumeAssociator(SubSystem ss) {
    pools[] = agentQuery.associators(ss,
        "CIM_ComputerSystemToStoragePool", "CIM_StoragePool");
    for each pool in pools; do
        volumes[] = agentQuery.associators(pool,
            "CIM_StoragePoolToStorageVolume", "CIM_StorageVolume");
    end for
}

function getVolumePoolAssociator(SubSystem ss) {
    volumes[] = agentQuery.associators(ss,
        "CIM_ComputerSystemToStorageVolume", "CIM_StorageVolume");
    for each volume in volumes; do
        pools[] = agentQuery.associators(volume,
            "CIM_StorageVolumeToStoragePool", "CIM_StoragePool");
    end for
}

The choice between the above two recipes depends on the cardinality of pools and volumes. In most cases the number of volumes will be considerably greater than the number of pools, because of which the first pseudo-code (i.e. getPoolVolumeAssociator) will make fewer calls to the CIM agent, and hence fewer round-trips and lower cost. There are scenarios, however, where such choices are not obvious. For example, the most cost-efficient recipe for retrieving port information from fiber channel or IP switches and their connectivity is not known until the query is run. In such cases, domain knowledge becomes necessary to choose the direction of processing. Another primitive operation that WBEM provides to retrieve CIM data is called 'enumerateInstances', which may sometimes be more desirable than the above association-based recipes. It simply returns all the objects belonging to the specified CIM class. Consider the following pseudo-code:

function getPoolVolumeAssociator() {
    pools[] = agentQuery.enumerateInstances("CIM_StoragePool");
    volumes[] = agentQuery.enumerateInstances("CIM_StorageVolume");
    pool2Volume[] = agentQuery.enumerateInstances("CIM_StoragePool2StorageVolume");
}

The above recipe makes just 3 enumeration calls to the CIM agents. The last call enumerates the instances of the StoragePool2StorageVolume class, each of which contains a reference to a volume object and its corresponding pool object. The number of calls doesn't depend on the cardinality of pools and volumes, and the recipe therefore runs faster than the previous two examples. Note, however, that the discovery engine has to do additional processing to map and correlate the pool and volume data from the pool2Volume association information. During this process, it caches the pool and volume data in memory for faster correlation, or retrieves it from the SRM database. Magellan uses the following approach to choose between the enumeration and association calls:
• Full discovery recipes (all pools and volumes, for example) are performed using the enumeration approach;
• Specific discovery recipes (performance data of a particular pool of a particular storage subsystem, for example) are performed using the association approach;
• Partial discovery recipes (performance data of a subset of switch ports managed by a CIM agent, for example) are performed using either the association or the enumeration approach, as shown in the experimental evaluation. Domain knowledge (e.g. CIM provider version x for device y has an inefficient implementation of enumeration calls), if available, is also used to guide the type of CIM call.
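A minimal sketch of this selection logic is shown below. The full/specific/partial classification and the domain-knowledge override come from the bullets above; the type names, the notion of a recipe Scope and the default chosen for the partial case are illustrative assumptions.

// Illustrative choice of WBEM operation type based on the scope of a discovery recipe.
class CallTypeSelector {
    enum Scope { FULL, SPECIFIC, PARTIAL }
    enum CallType { ENUMERATION, ASSOCIATION }

    // knowledgeBaseHint is null when no best-practice entry exists for the
    // (device type, vendor, provider version) combination being queried.
    static CallType choose(Scope scope, CallType knowledgeBaseHint) {
        if (knowledgeBaseHint != null) {
            return knowledgeBaseHint;            // domain knowledge overrides the default rule
        }
        switch (scope) {
            case FULL:     return CallType.ENUMERATION;   // e.g. all pools and volumes
            case SPECIFIC: return CallType.ASSOCIATION;   // e.g. one pool of one subsystem
            case PARTIAL:                                  // either may win; runtime profiles
            default:       return CallType.ENUMERATION;   // (Section IV) decide in practice
        }
    }
}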
C. Load balancing between multiple CIM Agents
In a large datacenter environment, a single CIM agent reporting all the devices can easily become a bottleneck. Therefore, multiple CIM agents are deployed that perform data collection from different devices. In order to handle CIM agent failure, more than one CIM agent may be configured to report the same device. SRM tools typically submit discovery jobs to the first available CIM agent and switch to the other agents only in case of failure, or they may load balance at the recipe level (i.e. the entire recipe is sent to one CIM agent). As discussed earlier, Magellan divides the discovery recipes across multiple CIM client instances. CIM requests from these clients are queued in Magellan's scheduler, which does flow control and CIM agent selection (for query processing) in such a way that the overall discovery time is reduced. It keeps track of the available capacity of the CIM agents based on the number of outstanding calls to each of them and their vendor-specified concurrency level. In addition, it maintains a queue for the incoming requests (along with their timestamps) from the CIM clients. The scheduler assigns agents to the CIM requests as follows:
• For each request, a metric called 'compatibility count' (CC) is defined, which is equal to the number of CIM agents that can process that particular request.
• The scheduler re-orders the request queue based on increasing value of CC.
• The request with the smallest value of CC is scheduled first.
• If CC is greater than one, it means that more than one CIM agent can process the corresponding request, in which case the agent with the least load is chosen for processing.
• If the agent(s) for the request with the least CC are unavailable, the scheduler moves on to the next request in the queue and re-scans from the head of the queue in the next scheduling iteration.
The basic idea behind this scheme is to give preference to the CIM requests with the most constraints, which balances the load as well as increases the overall utilization of the CIM agent infrastructure.
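The compatibility-count policy can be summarized with the short sketch below. It illustrates the rule rather than Magellan's scheduler code; Request, Agent and the load accounting are hypothetical interfaces introduced only for this example.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative compatibility-count (CC) scheduling pass over the request queue.
class SchedulerSketch {
    interface Agent {
        boolean isAvailable();            // outstanding calls below its concurrency limit
        int outstandingCalls();
    }
    interface Request {
        List<Agent> compatibleAgents();   // CIM agents able to answer this request (CC = size)
    }

    // One scheduling iteration: order requests by increasing CC and dispatch the first
    // request whose least-loaded compatible agent is currently available.
    static Optional<Agent> scheduleNext(List<Request> queue) {
        queue.sort(Comparator.comparingInt(r -> r.compatibleAgents().size()));
        for (Request request : queue) {
            Optional<Agent> leastLoaded = request.compatibleAgents().stream()
                    .filter(Agent::isAvailable)
                    .min(Comparator.comparingInt(Agent::outstandingCalls));
            if (leastLoaded.isPresent()) {
                queue.remove(request);    // dispatch 'request' to leastLoaded.get()
                return leastLoaded;
            }
            // Otherwise skip it; the queue is re-scanned from the head next iteration.
        }
        return Optional.empty();
    }
}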
IV. EVALUATION

Experimental evaluation of Magellan is divided into two parts. In the first set of experiments, a series of micro-benchmarks evaluates various Magellan optimizations with a CIM agent infrastructure. In the second set, a macro evaluation is performed with a datacenter configuration of a large financial organization. Our prototype of Magellan is implemented by extending the Eclipse Aperi Storage Resource Manager suite [2]. The Aperi SRM suite is a vendor-neutral, open, extensible, standards-based system resource management framework. We mainly extended the discovery and monitoring engine of the Aperi SRM suite. The engine performs the discovery and monitoring of CIM compliant devices in the managed environment by traversing the CIM profiles of fiber channel switches, storage subsystems and tape libraries. The server infrastructure is developed entirely in Sun Java 1.4 and deployed on Windows XP and Windows 2003 Server platforms.

Our first set of micro evaluations deals with the scalability of the CIM agent infrastructure. Device discovery and performance query recipes consisting of multiple enumeration and association traversals were executed in a managed environment consisting of two enterprise-class mid-range storage subsystems and two CIM agents, with both CIM agents reporting both subsystems. The discovery recipe captures the storage subsystem, fiber channel ports, storage extents, storage pools, storage volumes, physical disks, etc. and their association relationships from the configuration perspective. Along with that, port and volume performance statistics were also collected at 5 minute intervals. We experimented with 4 different Magellan configurations:
• Single Parallel Probe: Multiple CIM client connections to one CIM agent that is reporting one storage subsystem.
• Single Parallel Probe Load-balanced: Multiple CIM client connections to two CIM agents that are reporting one storage subsystem.
• Dual Parallel Probe: Multiple CIM client connections to one CIM agent that is reporting two storage subsystems.
• Dual Parallel Probe Load-balanced: Multiple CIM client connections to two CIM agents that are reporting two storage subsystems.

Our first experiment is for discovery of one enterprise-class mid-range storage subsystem reported by two external CIM agents. Figure 4 shows a breakdown of the time taken in the execution of two discovery recipes by Magellan. The first recipe discovers one subsystem using one CIM agent and is indicated by the suffix '1' on the x-axis. The other recipe discovers the same subsystem using 2 CIM agents (indicated by the suffix '2' on the x-axis). Note that most of the time is spent in the CIM agent(s). The other logical steps involve complex correlation and referential integrity matches before persisting the information into the database repository, and these steps do not consume much time.

Fig. 4. Execution time breakdown for two scenarios: 1 CIM agent and 2 CIM agents (time per step type: Ports, Extents, Ranks, VolSet, Mask-Map; split among Database Server, CIM Client and CIM Agent)

Basic Aperi SRM does not query multiple CIM agents for discovery of a single device. It picks one of the available CIM agents for the given device and uses one thread to sequentially traverse all the profiles for completion of the discovery process.
Fig. 5. Single parallel probe: discovery time vs. number of CIM client connections (1 CIM agent vs. 2 CIM agents)

Fig. 6. Dual parallel probe: discovery time vs. number of CIM client connections (1 CIM agent vs. 2 CIM agents)

Fig. 7. Change in response time and CIM agent's CPU utilization with the number of CIM client connections

Fig. 8. Enumeration vs. association execution time with 128 ports, as the percentage of total ports queried varies
Magellan, on the other hand, divides the discovery recipes into disjoint components and executes them concurrently. Furthermore, it unfolds the iterative steps in the recipes and executes them in parallel in separate CIM client instances. Our next experiment is for discovery of two enterprise-class mid-range storage subsystems that are reported by two external CIM agents. Figure 5 shows the execution time speedup from using multiple threads for one storage subsystem discovery using one and two CIM agents. We got up to 33% speedup using the 'Single Parallel Probe' approach and 53% speedup using the 'Single Parallel Probe Load-balanced' approach compared to the base Aperi setup with just one thread. Similarly, Figure 6 shows the execution time speedup from using multiple threads for discovery of two storage subsystems using one and two CIM agents. 'Dual Parallel Probe' achieved a 25% speedup whereas 'Dual Parallel Probe Load-balanced' achieved a 45% speedup. Here, the total number of threads used is shared by the two parallel running discovery jobs for the two subsystems. Note that the graph basically flattens out when the number of CIM clients is increased beyond 8. This is due to the concurrency limit of the CIM agents used in our setup. Figure 7 shows the change in CPU utilization of a CIM agent and the corresponding change in response time as the number of concurrent discovery recipes sent to it for execution increases. CPU load increases to 70% and the average response time also increases by 35% with just 7 concurrent recipes. The graph shows that four simultaneous recipes can be executed by this CIM agent without significant degradation in response time. Beyond that, its CPU becomes a bottleneck. Magellan populates its knowledge base with these runtime profiles and uses them to optimize the execution of future discovery recipes. Section III-B discussed the choice of WBEM operations and how similar information can be obtained using different operations. Two such operations are enumeration and association. The number of association calls required is dependent on the cardinality of the association. Recipe execution time is dependent on two important factors: (i) the number of associator calls and (ii) the number of round-trips to the CIM agent. Figure 8 shows the result of a real experiment that collects performance data of active and connected ports in a fiber channel switch.
TABLE II
DATACENTER CONFIGURATION OF A LARGE FINANCIAL ORG

Type      Details
Servers   327 servers: 292 with one dual-port HBA (= 584 ports); 35 with two dual-port HBAs (= 140 ports) [6 vendors]
Network   18 Fiber Channel switches, 7 Fiber Channel fabrics: 6 FC 64-port switches (= 384 ports); 9 FC 32-port switches (= 288 ports); 3 FC 16-port switches (= 48 ports) [3 vendors]
Storage   21 storage subsystems: 14 mid-range subsystems (14 x 8 = 112 ports); 7 high-end subsystems (7 x 16 = 112 ports); 3 tape libraries (3 x 16 = 48 ports) [4 vendors]
There are two ways to accomplish this:

Enumeration-based approach:
    - Let Pa be the set of active and connected ports.
    - portStats[] = Enumerate all FCPortStatistics
    - portToPortStats[] = Enumerate all FCPortToFCPortStatistics
    - Filter out statistics of inactive ports from portStats based on Pa and portToPortStats

Association-based approach:
    - Let Pa be the set of active and connected ports.
    - let i = 0
    - for each port p in Pa
          portStats[i++] = GetAssociatedPortStatistic(p)
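A rough cost model helps explain why a crossover exists: the association-based approach issues one call per active port, while the enumeration-based approach issues a fixed two calls but transfers and filters statistics for every port. The sketch below is an illustrative model only; roundTripMs and perInstanceMs are assumed parameters, not values measured in this paper.

// Back-of-the-envelope cost model for choosing between the two approaches above.
class PortStatsCostModel {
    static double associationCostMs(int activePorts, double roundTripMs) {
        return activePorts * roundTripMs;          // one associator call per active port
    }

    static double enumerationCostMs(int totalPorts, double roundTripMs, double perInstanceMs) {
        // Two enumeration calls, plus client-side filtering over the statistics of all ports.
        return 2 * roundTripMs + totalPorts * perInstanceMs;
    }

    static boolean preferEnumeration(int activePorts, int totalPorts,
                                     double roundTripMs, double perInstanceMs) {
        return enumerationCostMs(totalPorts, roundTripMs, perInstanceMs)
                < associationCostMs(activePorts, roundTripMs);
    }
}

Under such a model the break-even point grows with the per-port filtering cost and shrinks with the round-trip time; the measured crossover at roughly 60 active ports out of 128 (Fig. 8) is one instance of this trade-off.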
The result in Fig. 8 shows that the enumeration-based approach is the preferred approach when the number of active ports is more than 60 in a 128 port fabric. This is because it makes only a fixed number (2 here) of calls to the CIM agents, while the number of calls in the other approach is equal to the total number of active ports. When the number of active ports is small, the association-based approach is better because it makes few round-trip calls, while the former approach wastes a lot of time getting statistics of inactive ports. Our second, macro-benchmark evaluates Magellan in the context of a datacenter configuration operated by a large financial company. Table II shows a high-level view of the configuration. Due to the large number of devices, the datacenter had a set-up of 29 CIM agents deployed to respond to the SRM server in this environment.
Fig. 9. Discovery time of the large datacenter vs. number of CIM client connections (Aperi - Real Env, Aperi - Simulated Env, Magellan - Simulated Env)
A snapshot of this configuration was taken from the live datacenter and replicated in our lab environment using the open-source Aperi SAN Simulator [16]. This simulator has a 'snapshot mode' that allows making a point-in-time copy of real, live CIM agent(s). The SAN Simulator copies the management information from the CIM agents and stores it in its internal relational repository. Later, it reports the data like a regular CIM agent. To mimic the exact provider delay, we modified the simulator provider to incorporate the real provider delay in responses to the different types of CIM client requests. Figure 9 shows that the base Aperi SRM discovery engine took 128 minutes to complete the whole discovery by communicating with the real, live CIM agents in the production environment. Then, by taking a snapshot of the 29 CIM agents through the SAN Simulator, we hosted the simulated environment that mimicked the provider delay factor. With the simulated environment using the snapshot CIM agents, base Aperi took 131 minutes. Once Magellan's optimizations were enabled, the total discovery time dropped to 107 minutes (an improvement of 18%) with just one CIM client. This improvement was achieved primarily due to Magellan's ability to tune the discovery recipe execution based on CIM agent availability and the datacenter configuration. As the number of CIM clients was increased, Magellan intelligently distributed the load across the different CIM agents, because of which the total discovery time dropped even further. With 10 concurrent CIM clients, the entire datacenter discovery was completed in 58 minutes, an improvement of 55.7% compared to base Aperi. In a real deployment, Magellan can automatically determine the appropriate number of CIM client instances using the techniques discussed in Section III-A.

V. RELATED WORK

Popular commercial operational systems management products like IBM TotalStorage Productivity Center (TPC) [22], EMC Control Center [5], HP Systems Insight Manager [9], and Network Appliance SANScreen [12], as well as the open-source project Aperi [2], each implement a comprehensive discovery engine using SMI-S, CIM and other standards to achieve their functionality. These tools may also use SNMP [19] and proprietary management methods in conjunction with CIM agents. The main contribution of Magellan that differentiates it from these tools is the ability to orchestrate and optimize the discovery process using runtime metrics and a
knowledge base of historical performance profiles and industry best practices.

Run-time monitoring tools exist in the area of high-performance computing (HPC). Ganglia [15] allows scalable monitoring of remote clusters using a combination of UDP multicast within a cluster and XML streams to communicate between an aggregator and multiple, potentially remote, clusters. Supermon [20] is a similar system, but the number of network connections required to obtain cluster state is higher, and monitoring data is collected using a pull model. HPC monitoring tools are characterized by frequent monitoring of cluster state, as encapsulated in a few common metrics, for a large number of mostly homogeneous nodes. The enterprise configuration discovery scope of Magellan is different in this context: the frequency of configuration discovery is low but computationally intensive, the target devices are usually heterogeneous, and the attributes queried are large in number (hundreds for typical enterprise devices and applications), all of which are of interest and need to be queried. Performance monitoring recipes in Magellan may be shorter and run more frequently.

Resource discovery is an essential component of grid [6], [11] and peer-to-peer computing. The MDS [7] component of the Globus Toolkit uses an aggregator service to make monitoring data gathered from registered information sources available via a Web-services interface. MDS is similar to Magellan in that it requires secure access to nodes of interest and uses a pull-based model to discover and monitor. However, unlike Magellan, it typically pertains to geographically dispersed resources, a best-effort paradigm is sufficient, SLA violations are not an issue, and the middleware and applications need to deal with the fact that the monitored resources may be untrustworthy.

Peer-to-peer information and resource discovery, as popularized by Napster and Gnutella, facilitates the location and exchange of files amongst a large group of independent users connected through the Internet. In general, peer-to-peer systems differ from enterprise IT resource discovery in that detailed configuration information is not required and peers are not managed by centralized SRM tools. The Astrolabe [14] project makes use of the peer-to-peer communication paradigm to build a distributed and scalable monitoring system. It differs from the problem area of this paper in that the target is large Internet-scale systems, where it is necessary to monitor a large number of nodes, each of which has a relatively small set of attributes to be monitored, whereas Magellan targets a smaller number of geographically proximate resources, each of which has a large set of attributes that need to be monitored, along with connectivity and other configuration information that needs to be inferred. Sword [13] and XenoSearch [21] are two related projects that deal with wide-area resource discovery. They differ from our work in that their goal is to find a set of resources that meet certain request criteria, they pertain to wide area networks, and they query the same set of static metrics for a given query. In summary, most interesting efforts in distributed system
resource discovery have focused on wide-area settings with a large number of entities, where each entity has relatively few metrics associated with it. They typically rely on custom monitoring infrastructures that include custom monitoring agents, custom communication protocols and custom data representations. Furthermore, their goal is not to configure and manage these wide-area infrastructures. Magellan, on the other hand, is built for an SRM toolkit that manages the entire datacenter and provides a "single" point of administrative control. The goal of the discovery process in this environment is to enable management of all aspects of the enterprise datacenter. This imposes several unique requirements on the discovery process: being able to communicate using standard APIs for maximum coverage; complete discovery of the datacenter configuration to enable runtime analytics and control; and a standard persistent data storage format for the purpose of reporting via standard tools like BIRT, Crystal Reports, Business Objects, etc. Magellan is designed to make discovery more efficient while meeting these requirements.

VI. CONCLUSION

This paper presents an autonomic middleware framework called Magellan that uses a combination of optimization techniques to speed up the discovery process. Magellan identifies independent steps in CIM discovery recipes and parallelizes them along multiple dimensions to launch simultaneous queries that are load balanced across multiple CIM agents. Further, it guides the choice of WBEM operations and provides the ability to progressively discover the SAN environment based on priority hints in the discovery recipes. Magellan has been implemented in the Aperi open-source system management framework. A series of experiments, including micro-benchmarks and a large datacenter configuration from a major financial organization, show that Magellan can achieve a speedup of 50% or more compared to the vanilla discovery process in Aperi. CIM, SMI-S and other standards have their own sets of supporters and detractors. Our aim has been to use a standards-based approach that is portable and can be used out-of-the-box without any modification to the existing components. During the course of our implementation, we ran into a few issues because of the choices we made. First, not all device vendors have adopted the CIM (SMI-S) standards, and even those that did may not expose their entire management interface via CIM. In such cases, we have to implement our own CIM providers that communicate with those devices using their proprietary APIs. Most of these are fairly straightforward to implement. Second, many embedded CIM agents run on low-end CPU cores, which may not have enough processing power to handle multiple concurrent clients. Third, the ability to speed up configuration discovery depends significantly on the vendor's implementation of the CIM providers, which in many cases are sub-optimal or single-threaded (to avoid concurrency bugs, for example). In the last few years, a considerable amount of work has been done by standards bodies like DMTF and SNIA to make systems and storage management interoperability and
integration possible. Also, the scalability and performance of core I/O have been studied extensively. However, scalability issues in enterprise datacenter configuration discovery have received little attention, one reason being that long-running discovery recipes are executed very infrequently. But with the advent of new paradigms like cloud computing, scalability needs to be addressed across all dimensions. Moreover, the increasing need for automated planning and management analytics requires fast access to device configurations and historical performance. Magellan is a step in this direction. As future work, we would like to investigate configuration discovery in wide area networks. We are also investigating online algorithms that would detect bottlenecks (both hardware and software) in the discovery infrastructure and recommend strategies to fix them.

REFERENCES

[1] S. Agarwala, C. Poellabauer, J. Kong, K. Schwan, and M. Wolf, "Resource-aware stream management with the customizable dproc distributed monitoring mechanisms," in HPDC, 2003.
[2] "Aperi Storage Management Project," http://www.eclipse.org/aperi/.
[3] "DMTF Common Information Model (CIM)," http://www.dmtf.org/standards/cim/.
[4] "Distributed Management Task Force (DMTF)," http://www.dmtf.org.
[5] "EMC Control Center," http://www.emc.com/products/storage_management/controlcenter.jsp.
[6] I. Foster and C. Kesselman, Eds., The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 2005.
[7] I. Foster, "A Globus Toolkit Primer," http://www.globus.org/toolkit/docs/4.0/key/GT4_Primer_0.6.pdf.
[8] S. Gopisetty et al., "Evolution of storage management: transforming raw data into information," IBM Journal of Research and Development, vol. 52, no. 4, 2008.
[9] "HP Systems Insight Manager," http://h18013.www1.hp.com/products/servers/management/hpsim/.
[10] K. Keeton, C. Santos, D. Beyer, J. Chase, and J. Wilkes, "Designing for disasters," in USENIX FAST, 2004, pp. 59-62.
[11] Z. Nemeth and V. Sunderam, "Characterizing grids: Attributes, definitions and formalisms," Journal of Grid Computing, vol. 1, 2003.
[12] "Network Appliance SANScreen: Data Center Automation for Storage Environments," http://media.netapp.com/documents/Data_Center_Automation_commentary.pdf.
[13] D. Oppenheimer et al., "SWORD: Scalable Wide-Area Resource Discovery," University of California, Berkeley, Tech. Rep. UCB/CSD-04-1334, 2004.
[14] R. V. Renesse, K. Birman, and W. Vogels, "Astrolabe: A robust and scalable technology for distributed system monitoring, management and data mining," ACM TOCS, vol. 21, no. 2, pp. 164-206, May 2003.
[15] F. D. Sacerdoti et al., "Wide Area Cluster Monitoring with Ganglia," in Cluster Computing, 2003.
[16] "Eclipse Aperi Storage Network Simulator," http://www.eclipse.org/aperi/documentation/r4/install_simulator.php.
[17] "SNIA Storage Management Initiative Specification (SMI-S)," http://www.snia.org/forums/smi/tech_programs/smis_home/.
[18] "Storage Networking Industry Association," http://www.snia.org.
[19] "A Simple Network Management Protocol (SNMP), IETF RFC 1157," May 1990, http://www.ietf.org/rfc/rfc1157.txt.
[20] M. J. Sottile and R. G. Minnich, "Supermon: A High Speed Cluster Monitoring System," in Cluster Computing, 2002, p. 39.
[21] D. Spence and T. Harris, "XenoSearch: Distributed Resource Discovery in the XenoServer Open Platform," in HPDC, 2003.
[22] "IBM TotalStorage Productivity Center," http://www.ibm.com/servers/storage/software/center/.
[23] C. Verbowski et al., "Flight data recorder: monitoring persistent-state interactions to improve systems management," in OSDI, 2006.
[24] "DMTF Web-based Enterprise Management (WBEM)," http://dmtf.org/wbem/.