Simulation-based SW/HW Architectural Design Configurations for Distributed Mission Training Systems

Hessam Sarjoughian [1], Xiaolin Hu [2], Daryl Hild [3], Robert Strini [4]

Keywords: Co-Design, C2I, DEVS, Distributed Object Computing, DMT, JSAF, Modeling, Scalability, and Simulation
Abstract: Distributed Mission Training (DMT) is a concept composed of promising technologies that support training in a variety of domains such as defense and medicine. To develop and deploy such systems, it is important to account concurrently for hardware and software requirements, given the high demands on network bandwidth and computing resources and the complexity of software applications. In this paper, we present the application of a distributed co-design methodology (Discrete Event System Specification/Distributed Object Computing) to the Mission Training and Rehearsal System (MTRS), a DMT system. For example, we show that the developed simulation models allow prediction of the network capacity below which messages cannot be sent and therefore incorrect behavior results. The key issues presented are (1) characterization of DMT-style architectures in DEVS/DOC, (2) prediction of basic DMT-like scalability traits via simulation, and (3) discussion of some open problems underlying the applicability of distributed co-design for systems containing “off-the-shelf” components.
Introduction
For many years, more and more advanced frameworks, methodologies, and techniques have been proposed to help architect software/system designs. For example, in the arena of Distributed Mission Training systems, key capabilities such as voice recognition, large-scale simulation, and collaborative mission training demand an architecture that is scalable, efficient, open, fault-tolerant, etc. To meet such architectural requirements, it is imperative not to neglect combined software and hardware requirements. Of course, consideration of the relationships between hardware and software has been a focus of research for more than two decades in embedded systems. Thus, from a conceptual point of view, Distributed Mission Training systems must also trade off hardware and software constraints (i.e., architectural traits) against one another – just as has been required for the development of embedded systems. One of the main approaches to developing architectures for systems intended to perform DMT feats is simulation modeling. This approach models various aspects of the DMT system to gain an understanding of user requirements and developer constraints, and it enables verification and validation of simulation models within the system's architectural
1. Hessam S. Sarjoughian, Arizona Center for Integrative Modeling and Simulation, Computer Science & Engineering Department, Arizona State University, Tempe, Arizona, 85287-5406, Email: [email protected], URL: http://www.acims.arizona.edu.
2. Xiaolin Hu, Arizona Center for Integrative Modeling and Simulation, Electrical & Computer Engineering Department, University of Arizona, Tucson, AZ, 85721-0104, Email: [email protected], URL: http://www.acims.arizona.edu.
3. Daryl R. Hild, The MITRE Corporation, Colorado Springs, CO, Email: [email protected].
4. Robert A. Strini, Emerging Business Solutions, Smithfield, Virginia, Email: [email protected].
structure. The simulation model of this entire DMT architecture may further provide a testbed for examining what-if scenarios based on quantitative dynamic behavioral analysis. Therefore, given the importance of simulation modeling, in this article, we illustrate how modeling and simulation can be applied to study one facet of a DMT system’s architecture (i.e., scalability) using a distributed co-design framework. We develop simulation models for MTRS and evaluate its architecture as we increase bandwidth requirements based on increased computing loads and more complex hardware platforms. Further, we discuss some issues surrounding characterization of model components of DMT-like systems.
1.1
Mission Training Rehearsal System
Distributed simulation training systems have achieved various levels of success in replicating battlefield conditions for the warfighter. These systems are quite complex and have limitations, such as the ability to handle the various input/output requirements as the number of trainees and the complexity of their interactions increase. A potential solution – the Mission Training and Rehearsal System (MTRS) [1] – has been proposed, and to some extent developed, to support a wide range of capabilities (academic learning, proficiency training, and mission rehearsal) for a relatively large number of trainees. The prototype demonstrates an initial proof-of-concept for a comprehensive training system to support Air Force Command and Control (C2) training. One of the key objectives is to enable the warfighter to conduct various phases of mission execution within one training system, on demand and with minimal Human-in-the-Loop coordination, at an acceptable level of resolution. The prototype includes a distributed simulation system with computer-generated forces, terrain, environment, and cognitive agents such as pilots to replace today's human support role players. The combat scenarios developed for this prototype provide training opportunities for four elements of the Ground Theater Air Control System (GTACS): Control and Reporting Center (CRC), Aerospace Operations Center (AOC), Air Support Operations Center (ASOC), and the Tactical Air Control Party (TACP). A single operator duty position for each element was designed to receive training through web-based courseware, practice sessions, and distributed simulation mission sessions. Funding cutbacks reduced the AOC and ASOC positions to a role player observing the key elements of information necessary for the operator to execute mission processes. The two primary duty positions are the CRC Weapons Director (WD) and the TACP Enlisted Terminal Attack Controller (ETAC).
Differences between mission accomplishment by GTACS operators and individual pilot training are noteworthy in several ways. Initially, GTACS operator skill or positional training requires small-scale forces and scenario elements. As exercise levels progress toward team or crew training, MTRS must be able to scale the event to meet a Major Regional Conflict (MRC) or Major Theater of War (MTW). Operationally, this means the training system must be able to include many tactical missions, Rules of Engagement (ROE), Special Instructions (SPINS), and pre-mission planning considerations to handle the enormous number of combat situations that could arise. It is the responsibility of the training provider to determine whether the training environment is valid to support this requirement. Typically, human instructors maintain the realism factor by inserting expertise when the system cannot meet realistic demands. MTRS includes the use of intelligent tutors and Computer Managed Instruction to support the time management responsibilities of the human instructors. This element of support adds to the challenge of distributing responsibilities and is a factor in the evaluation of the MTRS architecture.
1.2
Related Work
The amount of reported research for DMT systems is relatively small. Organizations such as Boeing and Lockheed Martin have been, and are, developing various types of Distributed Mission Training systems. For example, Boeing has successfully demonstrated real-time distributed cooperative training missions by linking its Joint Strike Fighter full-mission simulator with U.S. Air Force Air Combat Command simulators [2]. Similarly, Lockheed Martin was recently awarded a contract to build F-16 Mission Training Centers (MTCs) for use by the US Air Force Air Combat Command [3]. Each MTC connects two to four training devices to simulate F-16 tactical formations and operations. However, while a considerable amount of research and development has been carried out by major defense contractors (e.g., Boeing and Lockheed Martin), it is not known to the outside community what methodologies, techniques, and frameworks may have been employed and to what extent. In the open literature, there is a handful of publications revealing some technical underpinnings for the analysis, design, realization, and testing of such systems – primarily from a software engineering point of view.
We now briefly discuss four articles that are most closely related to our work. In the first paper, Murray and Monson consider the applicability of real-time HLA/RTI [4] for the C-5 DMT simulators [5]. The aim of the study was to determine the number of entities that the RTI could support in a distributed training system. The authors provide a schematic of the system's hardware architecture and outline a layered software architecture. The analysis and testing measured network bandwidth, CPU utilization, transport delay, and behavioral correctness (entity displacement inaccuracies between the true position computed by one simulator and its corresponding dead-reckoned position by another). Such measures showed, for example, that the distributed C-5 Weapon System Trainer could support a few tens of entities. Based on extrapolated network bandwidth, it was concluded that hundreds of entities could be simulated. While this work underscores the importance of accounting for hardware and software aspects of the system, it does not provide a framework and associated methodologies. In the second paper, Mealy and his colleagues discuss performance analysis and testing of HLA/RTI with respect to network bandwidth utilization, RTI latency, processor utilization, and aircraft position error [6]. The experiments primarily focus on RTI processing time for incoming/outgoing messages and the effect on network delay. In some experiments, up to a maximum of 32 aircraft were accounted for. The experimental setup consists of three machines: two computers, which host the simulation of the aircraft and missiles, networked together via a third computer responsible for emulating real-time network traffic. The authors, however, do not offer or employ any integrative hardware and software approach.
Instead, they employ a layered approach to design the software aspect of the C-5 DMT and account for hardware constraints in the design of the overall system – for example, by allocating a separate processor to host the RTI and using a custom-built board to generate time stamps. In the third article, Stytz and Banks discuss a software architecture in support of designing HLA-compliant software applications such as DMT [7]. The architecture highlights the importance of defining system requirements, applying good software architectural paradigms, rules, and so on. The authors discuss in detail the CODB (Common Object Database Architecture) and its rules. The architecture addresses data handling for classes, containers, and a central runtime data repository that stores and routes data between major objects. Finally, in the last paper, Seidel, Testani, and Wagner describe the details of the CIMBLE system [8]. It focuses on system concepts, technology, and execution. The authors emphasize the importance of use-cases, scenarios, and execution sequences. These last two articles are primarily based on a “pure” software engineering approach to architecting DMT-like systems. Of course, within the software engineering and computer science communities, research has been underway to devise software frameworks and architectures [9,10], and this line of research provides a basis for architecting systems as complex as DMT. Unfortunately, treating hardware configurations and architectures in a peripheral manner remains prevalent in software engineering; it is vital to consider not only the software aspects of a system but, equally important, its hardware and in particular its network.
2
An Integrative Distributed Co-Design Approach
The approach undertaken to study the MTRS architecture follows the modeling and simulation methodology called DEVS/DOC (Discrete Event System Specification/Distributed Object Computing) [11-13], which is based on the concept of Distributed Object Computing [14]. This recently developed methodology provides a generic framework for modeling and simulating hardware and software components of distributed systems such as MTRS. An essential part of this approach is the validation of model components and their integration against their real-world counterparts. Hardware components of MTRS (e.g., computers and network) are modeled based on manufacturers' specifications. Software components are modeled based on suitable abstractions of their execution behaviors. The general strategy is to model various capabilities of MTRS (and its future incarnations) and then validate the model against measurements made on the actual prototype. It should be noted that while our focus in this report is on MTRS, the approach is applicable to other distributed systems such as those required to meet other DMT requirements. The approach relies on an iterative process through which the accuracy of the models is incrementally improved. This makes it possible to gain confidence that the overall MTRS model is valid at an adequate level of resolution. The model can then be extended and simulated to predict the behavior of alternative, “scaled up” versions of MTRS without actually building them.
2.1
Concurrent Software/Hardware Architectures Consideration
A traditional, simplified software component architecture for DMT is shown in Figure 1, which depicts the major software components of MTRS. The figure shows the data and control communications that take place among the software components (Browser, Courseware, SoarSpeak, SoarGen, Database (Computer Managed Instruction), JSAF, and AI-tutor (iGEN)) primarily from the point of view of the software applications. However, the hardware aspect of any distributed application such as DMT is integral to the behavior and performance of the software applications. For example, an attribute (e.g., bandwidth) of a system's architecture – as distinct from its software architecture – can only be fully characterized by explicitly taking into account hardware components such as physical links (e.g., fiber optics) and intermediate nodes (e.g., hubs) between any two processing nodes (e.g., laptop and workstation). Unfortunately, such architectural diagrams, whether for the software layer or the hardware layer alone, are increasingly limited in supporting the development of complex, large-scale, heterogeneous environments such as DMT. The development of software applications to meet the bandwidth requirements of successively larger organizations and the capabilities of future network infrastructures is tied directly to alternative network architecture choices (e.g., LAN vs. WAN). Hence, not only are individual software and hardware architecture specifications required, but these specifications must also be carried out concurrently – i.e., a unified software/hardware specification is necessary. Indeed, for systems such as DMT, architecting systems in a concurrent fashion is likely to require a repertoire of approaches and techniques including UML, formal methods, and, particularly, simulation models.
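To make the concurrency argument concrete, consider a back-of-envelope feasibility check: the aggregate message load generated by the software components must fit within the capacity of the hardware link that carries it. The sketch below is purely illustrative – the class, message rates, and sizes are our assumptions, not MTRS measurements:

```java
// Hypothetical feasibility check: does the aggregate message load offered by
// the software components fit within the network link's capacity? All rates,
// sizes, and names here are illustrative assumptions.
public class LoadCheck {
    // Offered load in bits/sec for a component sending messages of a given size.
    public static double offeredLoadBps(double msgsPerSec, double msgBytes) {
        return msgsPerSec * msgBytes * 8;
    }

    public static void main(String[] args) {
        double linkBps = 10e6;                    // assumed 10 Mb/s Ethernet
        double load = offeredLoadBps(50, 1500)    // e.g., simulation updates
                    + offeredLoadBps(20, 100);    // e.g., database replies
        // Sustained utilization at or above 1.0 means queues grow without
        // bound and messages are effectively never delivered.
        System.out.println("utilization = " + load / linkBps);
    }
}
```

A check of this kind requires both the software behavior (rates, sizes) and the hardware configuration (link capacity) – neither specification alone suffices.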
[Figure 1 depicts the MTRS software components – Course Builder, User Interface, iGEN, Database, SNN, Voice Recognition, SOAR Speak, SAF, and the JSAF Federation – grouped into client and server applications, with data flows and control/monitor flows among them.]

Figure 1: Sketch of MTRS Architecture
2.2
Distributed Object Computing Approach
Butler [14] proposed an abstract mathematical framework for specifying a static, structural model of a generic distributed object computing environment. This framework takes a simple but powerful view. It provides a model to characterize the dynamic behavior of a distributed object computing environment by representing two distinct layers of behavior – one for software objects and another for hardware objects, independently of one another – and allows a mapping between them (see Figure 2). That is, the framework facilitates modeling the abstract behavior of the software components independent of the computing and networking components. Similarly, the framework enables standalone modeling of the hardware components responsible for executing the software components. The software and hardware models are referred to as the Distributed Cooperative Objects (DCO) and Loosely Coupled Network (LCN) layers, respectively. The
framework also defines the so-called Object System Mapping (OSM) to provide for the mapping of software components to hardware components.
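The three-part structure can be sketched as data types with a mapping between them. The sketch below uses our own naming (a minimal illustration, not Butler's notation or the DEVS/DOC classes):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the three DOC layers (illustrative naming): software
// objects (DCO layer), hardware processors (LCN layer), and the OSM
// assignment of each software object to a processor.
public class DocModel {
    public record SoftwareObject(String name, double memBits) {}  // DCO layer
    public record Processor(String name, double cpuOpsPerSec) {}  // LCN layer

    // OSM: each software object executes on exactly one processor
    public static final Map<SoftwareObject, Processor> osm = new HashMap<>();

    public static void main(String[] args) {
        SoftwareObject jsaf = new SoftwareObject("JSAF", 256e6 * 8);
        Processor pcJsaf = new Processor("PCJSAF", 8e8);
        osm.put(jsaf, pcJsaf);
        System.out.println(jsaf.name() + " runs on " + osm.get(jsaf).name());
    }
}
```

The key design point is the separation: the DCO and LCN types carry no references to one another, and only the OSM table relates them, so either layer can be redesigned independently.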
[Figure 2 shows the Distributed Cooperative Objects layer (software objects and domain arcs), the Object System Mapping, and the Loosely Coupled Network layer (processors, links, and gates).]

Figure 2: Distributed Object Computing Architecture

Finally, a set of metrics is defined to extract key parameters of interest, enabling studies of alternative architectural designs given various choices for the DCO and LCN layers as well as their mappings. For the DCO layer, transducers should capture metrics for the Domain, Object, and Interaction Arc classes [14, 10]. Metrics for the Domain and Object classes are computational work performed, active time/utilization, I/O data load, percentage of active objects, degree of multithreading, length of execution queues, number of initialization invocations, total execution time, and coefficient of interactions. For the Interaction Arc class, metrics of interest are data traffic, percentage of packet retransmission, net throughput of data, rate of overhead, and gross and net response time. Similarly, for the LCN layer, three classes (Processor, Gate, and Link) are defined to contain key metrics. For the Processor class, the metrics are computational work performed, active time/utilization, I/O data load, utilization of storage, percentage of active objects, and degree of multithreading. For the Gate and Link classes, transducers should capture data traffic, percentage of packet retransmission, net throughput of data, and rate of overhead.
2.3
DEVS/DOC M&S Methodology and Environment
M&S has increasingly become the most suitable vehicle for studying the complexities associated with the development of distributed object computing systems such as Distributed Mission Training systems. The Discrete Event System Specification (DEVS) modeling and simulation framework supports modular, hierarchical representation of the components of such systems [15]. With it, we can model hardware and software components (e.g., processors, networking topologies, communication protocols, software objects) of a distributed system at varying degrees of resolution and complexity in a systematic and scalable manner. While the distributed object computing formulation proposed by Butler laid the basis for representing various components of a distributed object computing environment, it did not provide a characterization rich enough to represent the dynamic, time-driven behavior of software and hardware components, nor did it provide a modeling and
simulation environment to enable simulation modeling. As a result, the Discrete Event System Specification/Distributed Object Computing (DEVS/DOC) methodology and environment was proposed and implemented to enable and support simulation studies of distributed object computing systems [11-12].
The DEVS/DOC environment is based on the DEVS modeling and simulation framework [16, 17], which provides a sound, formal modeling and simulation framework based on concepts derived from dynamic systems theory. DEVS is a mathematical formalism with well-defined concepts of coupling of components, hierarchical, modular construction, support for discrete-event approximation of continuous systems, and an object-oriented substrate supporting repository reuse. Of particular interest for our purposes are the framework's object-orientation traits and its ability to provide a fundamental, rigorous mathematical formalism for representing dynamical systems. The DEVS/DOC modeling and simulation environment extends DEVSJAVA [18] by introducing a layer of modeling constructs on top of the generic DEVS modeling constructs. As shown in Figure 3, the modeling layer can employ alternative simulation engines (e.g., DEVSJAVA) to accommodate uniprocessor, parallel, or distributed execution of DOC models.
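To give a flavor of the underlying formalism, a DEVS atomic model combines a time-advance function with internal and external transition functions and an output function. The skeleton below is schematic – a simplified interface of our own, not the actual DEVSJAVA API:

```java
// Schematic DEVS atomic-model interface (simplified; not the DEVSJAVA API).
abstract class AtomicModel {
    protected double sigma = Double.POSITIVE_INFINITY; // time to next internal event

    abstract void deltExt(double elapsed, Object input); // react to an input event
    abstract void deltInt();                             // internal state transition
    abstract Object lambda();                            // output at internal events

    double timeAdvance() { return sigma; }
}

// A trivial processor model: accepts a job, is busy for one time unit,
// outputs "done", then passivates (waits indefinitely for the next job).
class BusyProcessor extends AtomicModel {
    private Object job;

    void deltExt(double elapsed, Object input) { job = input; sigma = 1.0; }
    void deltInt() { job = null; sigma = Double.POSITIVE_INFINITY; }
    Object lambda() { return "done"; }
}
```

Coupled models are then built by wiring the ports of such atomic models together, which is how DEVS/DOC composes the LCN and DCO layers hierarchically.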
[Figure 3 shows the layered DEVS/DOC architecture: a modeling layer (Distributed Object Computing modeling constructs, auxiliary GUI modules, and DEVS modeling constructs) atop a simulation layer offering uniprocessor simulation (Parallel DEVS via DEVSJAVA) and distributed simulation (DEVS/CORBA, with CORBA middleware and a persistent object store), all running over the operating system.]

Figure 3: DEVS/DOC Modeling and Simulation Architecture
3
Modeling DMT System Architecture
The ability to evaluate architectural attributes (e.g., scalability) of distributed systems using dynamic models is invaluable to designers, developers, and decision-makers. Within the DEVS/DOC modeling and simulation environment, important components of advanced systems such as DMT can be represented and simulated to include the complex interactions occurring among hardware and software components. Furthermore, alternative design choices and tradeoffs among them can be studied. As an example, we might be interested in evaluating alternative DMT architectures in terms of their scalability as the number of its concurrent users increases. Design choices to achieve scalability are available at the hardware and software levels. For example, from the hardware point of view, a processor’s speed, buffer size, and a network’s bandwidth are key design parameters. Similarly, from the software point of view, the choice of data communication protocols, objects' multi-threading capabilities and memory requirements are important considerations. These software/hardware design choices generally interact in complex ways so as to make the development of distributed systems such as DMT challenging and demanding. Complexities in the behavior and performance of such systems arise from two sources: 1) the intricacies of individual components (both software and hardware), and 2) more importantly, the numerous interactions of hundreds of
components dispersed across a network. Without modeling and simulation tools at hand, these complexities make it difficult to know how to distribute software components across a network in order to appropriately match software computational needs against hardware resources. And, without proper distribution of software components, we are likely to greatly overestimate the bandwidth and processing requirements of the underlying hardware, resulting in greater costs for less performance.

Table 1 shows the DMT system components and their counterpart simulation models, assuming three computers and a local area network. The table includes all major software and hardware components, each of which the DEVS/DOC simulation models represent either distinctly or as an aggregate. Consider the Virtual Network Computing (VNC) server and client applications. In the DMT prototype, the VNC client and server are distinct software applications; in the DMT model, however, they are aggregated into the Weapon Director and JSAF software applications, respectively. That is, while the model of the DMT system is based on its prototype realization, the model and the prototype do not have a one-to-one correspondence in terms of their LCN and DCO layers and their OSM mapping. Nevertheless, despite the absence of such formal relationships, key architectural design considerations (e.g., scalability) of the system can be demonstrated and studied via these approximate simulation models.

Software Components                                  | DEVS/DOC Simulation Models
Browser, VNC Client                                  | WD (Weapon Director)
Courseware                                           | CW (Courseware)
SoarSpeak/SoarGen                                    | SS (SoarSpeak)
Database, Computer Managed Instruction               | DB (Database)
JSAF, Soar, VNC Server                               | JSAF (Joint SAF)
iGEN                                                 | iGEN (AI Tutor)

Hardware Components                                  | DEVS/DOC Simulation Models
Ethernet Link                                        | Ether
Computer (Win. 98): Processor and Media Access Unit  | PCJSAF, PCJSAFmau
Computer (Win. NT): Processor and Media Access Unit  | PCDB, PCDBmau
Computer (Linux): Processor and Media Access Unit    | PC1, PC1mau

Table 1: Hardware/software components and their corresponding simulation models

The system modeling effort for systems such as DMT can proceed in four steps. First, the network is defined in terms of processing nodes, gates, and links. Second, the software objects and their interactions with one another are defined. Third, the software objects and their interactions are mapped onto the processing nodes of the hardware. Fourth, a set of models is defined to set up the experiments and control the simulation execution; these models collect pertinent measurements/metrics for functional and behavioral evaluation. The models to be developed for steps 1-3 are a Loosely Coupled Network (LCN), a set of Distributed Cooperative Objects (DCO), and an Object System Mapping (OSM), respectively. For step 4, a set of transducer models and an acceptor model are to be defined, developed, and incorporated with the rest of the models. The software components for the browser (the interface to the trainee), courseware, SoarSpeak/SoarGen, iGEN, database, and JSAF are modeled individually (see Figure 4). Each model is specified in terms of the operations (methods) it can perform and the memory size required by an executing software application. Two essential elements of any software object model are attributes and methods. The methods of software applications are specified in terms of their data size and the resources required for processing. For example, an executing JSAF application typically requires 256MB of RAM; messages produced by JSAF have an average size of 1500 bytes and messages received by JSAF have an average size of 100 bytes (see Section 3.2 for some details). Similar to software components, hardware components exhibit dynamic
behavior and therefore must be modeled (see Figure 4). Compared with software components, however, hardware components are generally simpler to model and easier to customize. For DMT, off-the-shelf processors, hubs, and Ethernet were modeled per the specifications of their real-world counterparts.
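A software component's execution behavior can be parameterized along the lines just described. The sketch below uses the JSAF figures quoted above (256 MB resident memory, ~1500-byte produced and ~100-byte received messages); the class and field names are ours, not the DEVS/DOC API:

```java
// Sketch of how a software component's execution behavior is parameterized.
// Field names are illustrative; the JSAF numbers come from the text above.
public class SoftwareComponent {
    public final String name;
    public final double memBits;     // resident memory footprint, in bits
    public final double outMsgBits;  // average produced-message size, in bits
    public final double inMsgBits;   // average received-message size, in bits

    public SoftwareComponent(String name, double memBytes,
                             double outMsgBytes, double inMsgBytes) {
        this.name = name;
        this.memBits = memBytes * 8;        // store sizes in bits, as DEVS/DOC does
        this.outMsgBits = outMsgBytes * 8;
        this.inMsgBits = inMsgBytes * 8;
    }
}
```

A JSAF-like component would then be instantiated as `new SoftwareComponent("JSAF", 256e6, 1500, 100)`.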
[Figure 4a depicts the candidate architecture: the Weapon Director training station and its software applications (VNC client, web browser, courseware, SoarSpeak), together with the server-side applications (VNC server, database, JSAF, web server, iGEN), hosted on PCs running Windows 98, Windows NT, and Linux and exchanging network traffic over a LAN.]

Figure 4a: Candidate DMT Systems (MTRS) Architecture
[Figure 4b depicts the alternative architecture: the Weapon Director training station (web browser, courseware, VNC client on a Windows 98 PC) connected through a wide area network to the server-side applications (VNC server, database, JSAF, web server, iGEN) hosted on Windows 98 and Linux PCs over a LAN.]

Figure 4b: Alternative DMT Systems (MTRS) Architecture
3.1
DEVS/DOC Simulation Models
In this section, we exemplify the DMT models of the software and hardware components, the LCN and DCO composite models, and the combined LCN/DCO layers (see Figure 2). We also outline the list of transducer and acceptor models and how they are composed with the hardware and software models to form experimental frames [16]. These models fall into two categories. The first category represents the dynamic behavior of the LCN and DCO components as well as their mappings via the Object System Mapping. These models are represented as DEVS atomic and coupled models. The second category represents transducers aimed at capturing metrics of interest emerging out of the simulation of the system – that is, a host of transducers targeted at one or more of the software and hardware components. This category also includes an acceptor responsible for controlling the overall behavior of the entire simulation. The DEVS/DOC modeling of DMT begins with declaring the highest-level architectural models (see Figure 5). The dynamic behavior of the DMT is captured in a hierarchical fashion – that is, by representing the hardware layer (Loosely Coupled Network), the software layer (Distributed Cooperative Objects), and the mapping of the former to the latter (Object System Mapping). Figure 5 also shows the experimental frame, which plays a vital role in extracting manifestations of DMT dynamic behavior. As shown in Figure 6, the components of the LCN layer (see Figure 1) are modeled in terms of their attributes. For each processor, the attributes are storage size, CPU-memory swap time penalty, and speed. Similarly, the other components of the LCN – Ethernet and gates (hubs and routers) – are modeled by capturing their key attributes, such as Ethernet speed and number of Ethernet segments. The LCN configuration shown in Figure 4 does not include any gates. The DEVS/DOC LCN models contain methods specifying dynamic behavior, such as collision detection by the Ethernet.
However, in comparison with software components, the dynamic aggregate behavior of hardware components is fixed, and there is generally no need to specify specialized methods for them. Of course, the DEVS/DOC environment supports incorporating new hardware components and behaviors [11]. Additionally, as shown in Figure 6, the hardware components are composed together into the LCN layer as coupled models. These models specify how messages/data are transmitted from one model to another (e.g., PC to hub) using their respective ports and couplings.
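Such port-to-port couplings can be represented as a simple table of connections. The sketch below is illustrative (our naming, not the exact DEVS/DOC classes), but it mirrors how a message path from a PC through its media access unit onto the Ethernet would be declared:

```java
import java.util.ArrayList;
import java.util.List;

// Schematic of port-to-port couplings in a coupled model (illustrative
// names, not the exact DEVS/DOC API).
public class CouplingTable {
    public record Coupling(String srcModel, String srcPort,
                           String dstModel, String dstPort) {}

    public final List<Coupling> couplings = new ArrayList<>();

    // Connect an output port of one component model to an input port of another.
    public void addCoupling(String srcModel, String srcPort,
                            String dstModel, String dstPort) {
        couplings.add(new Coupling(srcModel, srcPort, dstModel, dstPort));
    }
}
```

For example, routing PCJSAF's network output through its media access unit onto the Ethernet would take two couplings: `("PCJSAF", "out") -> ("PCJSAFmau", "in")` and `("PCJSAFmau", "out") -> ("Ether", "in")`.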
    /* Define constructor for the Distributed Mission Training model */
    public DistMissionTraining() {
        // Declare number of simulation runs and logical simulation time
        SimRuns = 5;
        SimTime = 100.0;
        …
        // Declare LCN, DCO, OSM, and EF coupled model components
        create_Loosely_Coupled_Network();         // construct LCN
        create_Distributed_Cooperative_Objects(); // construct DCO
        create_Object_System_Mapping();           // construct OSM
        create_Experimental_Frame();              // construct Experimental Frame
        …
    }
Figure 5: Snippet of the System level Model
    /* Declare Loosely Coupled Network components */
    …
    link_ethernet Ether;                  // ethernet link
    hub_ethernet PCJSAFmau, PCDBmau;      // media access units for PCs
    processor PCJSAF, PCDB;               // PCs to host JSAF and Database software components
    …
    /* Define Loosely Coupled Network layer and its components */
    public void create_Loosely_Coupled_Network() {
        double ipHS = 20*8;          // IP header size in bytes * 8 bits/byte
        double BF_hub = 2E+6*8;      // buffer size in bytes * 8 bits/byte
        double swapTime = 1.2;       // cpu-mem swapTimePenalty
        double ES = 1E+6;            // ethernet speed, bits/sec
        …
        double cpu1Sp = 8E+8;        // cpu speed, ops/sec
        double mem1Sz = 512E+6*8;    // cpu memory in bytes * 8 bits/byte
        …
        PCJSAF = new processor(name+"----PCJSAF", ipHS, cpu1Sp, …);
        add(PCJSAF);
        …
        for (int i = 0; i