Defining Behavior and Evaluating QoS Performance of the SLICE Scenario

John M. Slaby
Raytheon
Portsmouth, RI, USA

Steve Baker
Raytheon
Portsmouth, RI, USA

James Hill
Vanderbilt University
Nashville, TN, USA

Douglas C. Schmidt
Vanderbilt University
Nashville, TN, USA

Abstract

The purpose of this technical report is to provide supplemental details on the experiments described in the ECRTS paper submission [4]. It presents the predicted behavior of the components in the SLICE scenario, along with screenshots illustrating the modeling of SLICE components using the CUTS Workload Modeling Language (WML) and the evaluation of system performance using the Benchmark Manager Web-interface.

1. The SLICE Scenario

The DARPA Adaptive and Reflective Middleware Systems (ARMS) program [5] is a five-year, multi-phase effort that is developing multi-layer resource management (MLRM) services to support product-lines that coordinate a grid of computers to manage many aspects of a ship's power, navigation, command and control, and tactical operations. The ARMS MLRM services comprise hundreds of different types and instances of infrastructure components written in ~500,000 lines of Java and C++ code and ~1,000 files developed by six teams at different geographic locations. One of the challenge problems in the second phase of the ARMS program is called the SLICE scenario [4], which consists of two sensor, two planner, one configuration, one error-recovery, and two effector components, as shown in Figure 1.

Figure 1. Model of SLICE Showing the Components and Their Interconnections

The SLICE scenario requires the transmission of information detected by the sensors to each planner in sequence, then to the configuration component, and lastly to both effectors, which perform actions that control devices in the physical world. Components in the SLICE scenario are deployed across three computing nodes because the workload generated by the components in the critical path is more than a single node can handle. The sensors and actuators (represented as Sensor-1 and Effector-1 in Figure 1) are deployed on separate nodes to reflect the placement of physical equipment in the production shipboard system. Figure 1 shows a model of the end-to-end layout of SLICE components, with the critical path specified by the dashed arrows. The SLICE scenario also uses software components similar to product-lines and challenge problems in phase one of the ARMS program. We therefore already understood each component's behavior in SLICE. Table 1 lists the predicted behavior of each component based on our experience from phase one of the ARMS program.

Sensor-1 CoWorkEr
  Workload performed after receipt of a command event: alloc 35 KB; 15 CPU ops; 30 dbase ops; 30 CPU ops; publish track of size 486 bytes; dealloc 35 KB

Sensor-2 CoWorkEr
  Workload performed after receipt of 3 command events: alloc 35 KB; 15 CPU ops; 30 dbase ops; 30 CPU ops; publish track of size 486 bytes; dealloc 35 KB

Planner-1 CoWorkEr
  Workload performed every second: publish command of size 24 bytes
  Workload performed after receipt of a track event: alloc 30 KB; 55 dbase ops; 45 CPU ops; publish assessment of size 132 bytes; dealloc 30 KB

Planner-2 CoWorkEr
  Workload performed after receipt of a track event: alloc 30 KB; 55 dbase ops; 45 CPU ops; publish assessment of size 100 bytes; dealloc 30 KB

Configuration-Optimization CoWorkEr
  Workload performed at startup time: alloc 1 KB; 25 dbase ops; 1 CPU op; 10 dbase ops; dealloc 1 KB
  Workload performed after receipt of an assessment event: alloc 5 KB; 40 dbase ops; 1 CPU op; publish command of size 128 bytes; dealloc 5 KB
  Workload performed after receipt of a status event: 1 dbase op

Effector-1 & Effector-2 CoWorkEr
  Workload performed every second: 25 CPU ops; publish status of size 32 bytes
  Workload performed after receipt of a command event: alloc 30 KB; 25 CPU ops; publish status of size 256 bytes; dealloc 30 KB

Error Recovery CoWorkEr
  Workload performed at startup: alloc 1 KB; 25 dbase ops; 1 CPU op; 10 dbase ops; dealloc 1 KB
  Workload performed after receipt of an assessment event: alloc 5 KB; 40 dbase ops; 1 CPU op; publish command of size 128 bytes; dealloc 5 KB
  Workload performed after receipt of a status event: 1 dbase op

Table 1. Expected Behavior for the Components in the SLICE Scenario
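To make the structure of these workload specifications concrete, the following is a minimal, hypothetical C++ encoding of one Table 1 row as an ordered sequence of emulated actions. The type and field names are illustrative and are not part of CUTS; the actual CoWorkEr behavior is specified graphically in WML, as described in Section 2.3.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

enum class ActionType { Alloc, Dealloc, CpuOps, DbaseOps, Publish };

struct Action {
  ActionType type;
  std::size_t count;   // repetitions, e.g., number of CPU or database operations
  std::size_t bytes;   // allocation/deallocation size or published event size
  std::string event;   // event name for Publish actions, empty otherwise
};

struct Workload {
  std::string trigger;          // e.g., "receipt of a track event"
  std::vector<Action> actions;  // executed in order by the CoWorkEr's workers
};

// Planner-1's event-driven workload from Table 1, encoded as data.
const Workload planner1_on_track = {
  "receipt of a track event",
  {
    {ActionType::Alloc,    1,  30 * 1024, ""},
    {ActionType::DbaseOps, 55, 0,         ""},
    {ActionType::CpuOps,   45, 0,         ""},
    {ActionType::Publish,  1,  132,       "assessment"},
    {ActionType::Dealloc,  1,  30 * 1024, ""},
  }
};

int main() {
  std::cout << planner1_on_track.actions.size()
            << " actions in Planner-1's track-driven workload\n";
}
```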

2. Applying CUTS to the ARMS SLICE Scenario

The Component Workload Emulator (CoWorkEr) Utilization Test Suite (CUTS) is a system execution modeling tool chain for rapidly creating arbitrarily complex Service-Oriented Architecture (SOA)-based applications and performing experiments that systematically evaluate interactions that would otherwise be hard to simulate. In particular, CUTS provides Model-Driven Development (MDD)-based workload generation, data reduction, and visualization tools to rapidly construct experiments and comparatively analyze results from alternate execution architectures. CUTS can also import measured performance data from faux application components running over actual configured and deployed infrastructure middleware services to better estimate enterprise distributed real-time and embedded (DRE) system behavior in a realistic production environment. This section presents and explains screenshots generated by applying CUTS to the SLICE scenario in ARMS. CUTS allowed us to create a more robust and complete solution for emulating actual application components and evaluating QoS earlier in the enterprise DRE system lifecycle. For example, we used the CoSMIC [3] MDD tools to create models of production systems composed of faux application components and actual system infrastructure components. We then used these models in conjunction with CIAO [6] and DAnCE [2] to deploy these components into a representative testbed and conduct systematic experiments that measured how well they performed relative to QoS specifications from production computing systems.

2.1. SLICE Scenario in PICML

PICML [1] is a domain-specific modeling language (DSML) built using the Generic Modeling Environment (GME) [7] that provides the foundation for the CoSMIC MDD tools. It enables developers of CORBA Component Model (CCM) applications to model DRE systems and synthesize structural code, e.g., interface definition files, and XML deployment and configuration (D&C) files that are syntactically and semantically correct. The SLICE scenario and CUTS are both implemented using Real-time CCM and CIAO [6], so we used PICML to model the system and generate the necessary metadata rather than producing it manually. Figure 1 illustrates the PICML model of the SLICE scenario. In this figure, components in the SLICE scenario are represented by their respective shaded rectangles, and the WLGBum rectangle represents the BenchmarkDataCollector component in CUTS. The lines connecting the components represent the interconnections between them, which are created by DAnCE at deployment time. The transparent strip through the middle of the figure highlights the components in the critical path. PICML helps ensure correct-by-construction designs, e.g., two components can only be interconnected if their ports are compatible (i.e., same event type or interface). From this model, PICML generates XML metadata descriptors, which are interpreted by DAnCE to deploy and configure the system.

Figure 1. PICML Model of the SLICE Scenario
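The port-compatibility rule that PICML enforces can be illustrated with a small, hypothetical sketch. This is not PICML or GME code; it merely shows the constraint that a connection is only valid from an event source to an event sink of the same event type, and all names are illustrative.

```cpp
#include <iostream>
#include <string>

struct Port {
  std::string component;   // owning component instance, e.g., "Sensor-1"
  std::string event_type;  // event type carried by the port, e.g., "Track"
  bool is_source;          // true for an event source, false for an event sink
};

// A connection is valid only from an event source to an event sink of the
// same event type.
bool can_connect(const Port& from, const Port& to) {
  return from.is_source && !to.is_source && from.event_type == to.event_type;
}

int main() {
  Port sensor_track_out  {"Sensor-1",  "Track",  true};
  Port planner_track_in  {"Planner-1", "Track",  false};
  Port planner_status_in {"Planner-1", "Status", false};

  std::cout << can_connect(sensor_track_out, planner_track_in)  << '\n';  // 1
  std::cout << can_connect(sensor_track_out, planner_status_in) << '\n';  // 0
}
```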

2.2. The Component Workload Emulator (CoWorkEr)

Figure 2 shows the PICML model of the CoWorkEr in CUTS. The CoWorkEr is an assembly of monolithic CCM components, which are represented by the shaded rectangles. Each component in the CoWorkEr performs a specific type of work, such as CPU, database, memory, or publication operations. Each CoWorkEr receives events via its event sinks (represented by the chevrons on the left in Figure 2) and publishes events via its event sources (represented by the pentagons on the right in Figure 2). Events received by a CoWorkEr on its event sinks are delegated to the EventHandler, whereas events produced by the EventProducer are delegated to the appropriate event source of the CoWorkEr.

Figure 2. PICML Model of a CoWorkEr
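The delegation described above can be sketched as follows. This is a hypothetical illustration of the EventHandler and EventProducer roles, not the actual CCM assembly, which is wired together by DAnCE from the generated deployment descriptors; all class and function names are assumptions for illustration only.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <utility>

class CoWorkErSketch {
public:
  // All inbound events, regardless of which sink they arrive on, funnel into
  // a single EventHandler-style callback.
  void on_event(const std::string& type) {
    std::cout << "EventHandler received a " << type << " event\n";
  }

  // Register the outbound port for an event type (the event-source role).
  void add_source(const std::string& type,
                  std::function<void(const std::string&)> port) {
    sources_[type] = std::move(port);
  }

  // The EventProducer hands finished events here; each leaves through the
  // source registered for its type.
  void produce(const std::string& type, const std::string& payload) {
    auto it = sources_.find(type);
    if (it != sources_.end())
      it->second(payload);
  }

private:
  std::map<std::string, std::function<void(const std::string&)>> sources_;
};

int main() {
  CoWorkErSketch w;
  w.add_source("Track", [](const std::string& payload) {
    std::cout << "track source emits " << payload << '\n';
  });
  w.on_event("Command");                  // inbound event, delegated to the handler
  w.produce("Track", "a 486-byte track"); // outbound event, routed via its source
}
```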

2.3. The Workload Modeling Language (WML)

The Workload Modeling Language (WML) is a DSML for specifying the emulated behavior of the CoWorkErs. Figure 3 illustrates an example characterization using WML.

Figure 3. Example Workload Specification in WML

This figure illustrates an event-driven workload. The InputEvent specifies which event, and how many occurrences of that event, will trigger the workload. Beneath the EventDrivenActivity element lies a series of actions that constitute the workload. Each item in the series represents a different action to be performed by its corresponding worker in a CoWorkEr. In this example, the workload performs memory allocations, CPU operations, database operations, CPU operations, and memory deallocations, and finally publishes an event from the CoWorkEr. Each action in the workload also has specific attributes, e.g., number of repetitions, size of allocation and deallocation, and size of publication data, which are not pictured here.
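A minimal sketch of the event-driven semantics described above is shown below, assuming a hypothetical trigger count and action list. These names and values are illustrative; they are not code generated by WML.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <vector>

struct EventDrivenActivity {
  std::size_t trigger_count = 1;               // input events required to fire the workload
  std::vector<std::function<void()>> actions;  // allocation, CPU, database, publish steps

  void on_input_event() {
    if (++received_ % trigger_count == 0)
      for (auto& action : actions)
        action();                              // actions run in the modeled order
  }

private:
  std::size_t received_ = 0;
};

int main() {
  EventDrivenActivity activity;
  activity.trigger_count = 3;                  // e.g., Sensor-2 reacts to every third command event
  activity.actions = {
    [] { std::cout << "alloc 35 KB\n"; },
    [] { std::cout << "30 dbase ops\n"; },
    [] { std::cout << "publish track of size 486 bytes\n"; },
    [] { std::cout << "dealloc 35 KB\n"; },
  };
  for (int i = 0; i < 3; ++i)
    activity.on_input_event();                 // the workload fires on the third event
}
```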

2.4. General and Detailed Analysis of Performance

Figure 4 illustrates the general performance-data analysis feature of the Benchmark Manager Web-interface (BMW), which was used to create Table 4 in [4].

Figure 4. General Analysis Display Generated by the BMW for Test 8 in the SLICE Scenario

Table 4 in [4] presents the timing data for Sensor-1, which appears in Figure 4 as EnvDetector-1 in the Workload Generator (WLG) column that identifies the corresponding CoWorkEr. The columns in this figure are as follows:



• Host – the symbolic name/address of the node hosting the CoWorkEr.
• Event – the name of the event received by the CoWorkEr.
• WLG – the name of the CoWorkEr associated with the performance data and receiving the event specified in the Event column.
• Workload – the type of workload associated with the performance data.
• Timeline – a detailed graph of the performance data with respect to time of occurrence.
• Snapshots – the number of benchmarks collected for the workload during the experiment.
• Avg. Samples – the average number of samples per snapshot.
• Avg./Rep – the average time in milliseconds to complete an action (e.g., CPU, database, memory, or publication) in the workload specification generated using WML.
• Average – the average execution time in milliseconds to complete a full workload specification (e.g., event-driven, periodic, or startup workload).
• Worse Case – the worst execution time in milliseconds to complete a full workload specification.
• Best Case – the best execution time in milliseconds to complete a full workload specification.

The information presented in Figure 4 can be viewed either while a test is executing or after it has completed, because the performance data is read from the database used by the BenchmarkDataCollector component to store performance metrics for offline analysis. In addition to general analysis, the BMW also provides detailed analysis of a workload or action, which is accessed by clicking the respective graph in the Timeline column. Figure 5 shows the detailed graph that appears after clicking the Timeline graph for the Total Workload of Sensor-1 (or EnvDetector-1). The graph shows the complete timeline from the start to the end of the experiment in graphical format. The top dotted line in Figure 5 depicts the worst execution time for the workload, the middle dotted line depicts the average execution time, and the bottom dotted line depicts the best execution time. Each dot in these lines corresponds to a snapshot of the performance data collected at that time. This feature of the BMW allows further investigation of performance issues, such as high jitter or slow throughput and execution.

Figure 5. Detailed Timeline of Specific Workload Generated by the BMW
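A sketch of how the best, average, and worst execution times plotted for each snapshot could be computed from the collected samples is shown below. The values and function names are hypothetical; the BMW actually derives these statistics from the database populated by the BenchmarkDataCollector.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

struct SnapshotStats {
  double best;     // minimum execution time in ms
  double average;  // mean execution time in ms
  double worst;    // maximum execution time in ms
};

SnapshotStats summarize(const std::vector<double>& samples_ms) {
  SnapshotStats s{0.0, 0.0, 0.0};
  if (samples_ms.empty())
    return s;
  s.best = *std::min_element(samples_ms.begin(), samples_ms.end());
  s.worst = *std::max_element(samples_ms.begin(), samples_ms.end());
  s.average = std::accumulate(samples_ms.begin(), samples_ms.end(), 0.0)
              / samples_ms.size();
  return s;
}

int main() {
  // Hypothetical execution times (ms) for one snapshot of Sensor-1's workload.
  std::vector<double> samples = {41.2, 39.8, 44.5, 40.1, 42.7};
  SnapshotStats s = summarize(samples);
  std::cout << "best " << s.best << " avg " << s.average
            << " worst " << s.worst << '\n';
}
```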

The last analysis feature provided by the BMW is critical path analysis. Figures 6 and 7 illustrate the critical path analysis graphs created by the BMW for test 8, which is highlighted in Table 5 of [4], and test 9, respectively. To generate the timelines, the components in the critical path are specified using the BMW. The BMW then extracts their performance data and generates a detailed timeline. The top horizontal line in each figure represents, in order of execution of the components in the critical path, the time spent processing a workload and publishing the event resulting from that workload. The bottom horizontal line graphs the measured execution time of the critical path against the specified deadline. As illustrated in Figure 6, if the deadline is missed, the left region in the bottom horizontal graph highlights the specified execution deadline, and the right region highlights by how many milliseconds the deadline was overrun. As illustrated in Figure 7, if the deadline is not missed, the left region in the bottom horizontal graph highlights the specified execution deadline, and the right region highlights how much headroom exists before the deadline would be missed.

Figure 6. Critical Path Analysis Timeline for Average Execution Time Generated by the BMW for Test 8

Figure 7. Critical Path Analysis Timeline for Average Execution Time Generated by the BMW for Test 9

When running test 8 for the SLICE scenario, we used this feature to locate the CoWorkErs that had the longest execution times. As illustrated in Figure 6, in test 8 Sensor-1 (or EnvDetector-1) and Planner-2 (represented as Plan-1) have the longest execution-time regions and are deployed on the same host (blade5.isislab.vanderbilt.edu, as shown in Figure 4). In test 9, we deployed Sensor-1 and Planner-2 on separate hosts. This test produced the timeline illustrated in Figure 7, which shows that the average execution time for completing the critical path in test 9 was below 350 ms. We were therefore able to use the graphical analysis features to locate bottlenecks and create new deployments that meet the critical path deadline of 350 ms. We used the same process to create the deployments for test 10 and test 11, which both met the critical path deadline, with average execution times of 250 ms and 221 ms, respectively.
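The deadline comparison visualized in Figures 6 and 7 can be sketched as follows, using hypothetical per-component times: the processing and publication times along the critical path are summed and compared against the 350 ms deadline to report either an overrun or the remaining headroom. The component times below are illustrative, not measured values from the tests.

```cpp
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

struct PathSegment {
  std::string component;   // e.g., "Sensor-1"
  double exec_ms;          // time spent processing the workload
  double publish_ms;       // time spent publishing the resulting event
};

int main() {
  const double deadline_ms = 350.0;            // SLICE critical-path deadline
  std::vector<PathSegment> path = {
    {"Sensor-1",      80.0, 5.0},              // hypothetical numbers
    {"Planner-1",     90.0, 5.0},
    {"Planner-2",     85.0, 5.0},
    {"Configuration", 60.0, 5.0},
    {"Effector-1",    40.0, 5.0},
  };

  double total_ms = std::accumulate(path.begin(), path.end(), 0.0,
      [](double sum, const PathSegment& s) { return sum + s.exec_ms + s.publish_ms; });

  if (total_ms > deadline_ms)
    std::cout << "deadline missed by " << total_ms - deadline_ms << " ms\n";
  else
    std::cout << "headroom of " << deadline_ms - total_ms << " ms\n";
}
```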

3. References

[1] K. Balasubramanian, J. Balasubramanian, J. Parsons, A. Gokhale, and D. Schmidt, "A Platform-Independent Component Modeling Language for Distributed Real-time and Embedded Systems," Proceedings of the 11th IEEE Real-Time and Embedded Technology and Applications Symposium, San Francisco, CA, March 2005.

[2] G. Deng, J. Balasubramanian, W. Otte, D. Schmidt, and A. Gokhale, "DAnCE: A QoS-enabled Component Deployment and Configuration Engine," Proceedings of the 3rd Working Conference on Component Deployment, Grenoble, France, November 2005.

[3] A. Gokhale, K. Balasubramanian, J. Balasubramanian, A. Krishna, G. Edwards, G. Deng, E. Turkay, J. Parsons, and D. Schmidt, "Model Driven Middleware: A New Paradigm for Deploying and Provisioning Distributed Real-time and Embedded Applications," The Journal of Science of Computer Programming: Special Issue on Model Driven Architecture, 2005 (in press).

[4] J. Slaby, S. Baker, J. Hill, and D. Schmidt, "Applying System Execution Modeling Tools to Evaluate Enterprise Distributed Real-time and Embedded System QoS," submitted to the 18th Euromicro Conference on Real-Time Systems (ECRTS 06), 2005.

[5] The DARPA Adaptive and Reflective Middleware Systems (ARMS) program, http://dtsn.darpa.mil/ixodarpatech/ixo_FeatureDetail.asp?id=6.

[6] N. Wang, D. C. Schmidt, A. Gokhale, C. Rodrigues, B. Natarajan, J. P. Loyall, R. E. Schantz, and C. D. Gill, "QoS-enabled Middleware," in Middleware for Communications, Q. Mahmoud, Ed., Wiley and Sons, New York, 2003.

[7] G. Karsai, J. Sztipanovits, A. Ledeczi, and T. Bapty, "Model-Integrated Development of Embedded Software," Proceedings of the IEEE, January 2003.
