Multi-Agent Data Collection in HLA-based Simulation Systems

Heng-Jie Song1, Zhi-Qi Shen2, Chun-Yan Miao1, Ah-Hwee Tan1, Guo-Peng Zhao2

1 School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, Singapore
[email protected], [email protected], [email protected]
2 Information Communication Institute, Nanyang Technological University, Singapore
[email protected], [email protected]
Abstract

The High Level Architecture (HLA) for distributed simulation was proposed by the Defense Modeling and Simulation Office of the US Department of Defense (DoD) to support interoperability among simulations as well as reuse of simulation models. One aspect of reusability is collecting and analyzing the data generated in simulation exercises, including a record of the events that occur during execution and the states of simulation objects. In order to improve the performance of existing data collection mechanisms in HLA simulation systems, this paper proposes a multi-agent data collection system. The proposed approach adopts a hierarchical data management/organization mechanism to achieve the fast data access that is indispensable to the analysis of simulation exercises. Furthermore, the multi-agent data collection system adopts a formal expression method to describe the system's behavioral characteristics, and implements hierarchical language support for the description by combining XML and Petri nets. In addition, we propose an independent reinforcement learning algorithm to generate an optimized joint recording program, which guarantees that data collection and query tasks are rationally distributed among the logging agents and that computational resources are utilized efficiently. The testing results indicate that the proposed approach, under the premise of complete collection of the simulation data, not only reduces the network load imposed by the data collection components, but also provides effective support for the analysis of simulation exercises.
1. Introduction

The High Level Architecture (HLA) for Modeling and Simulation was originally developed by the US Department of Defense in 1996. HLA, utilizing the Run Time Infrastructure (RTI), which provides a set of common and independent services, successfully separates simulation implementation, simulation execution management, and communication management, making implementation details transparent to each other. Therefore, HLA, with its guarantee of reusability and interoperability, is a suitable architecture for large and complex simulation applications. Currently, HLA-based simulations are widely utilized in problem domains that are particularly complex or involve multiple collaborating parties. In these simulation systems, analysts often need to construct a large-scale simulation with individual simulation components interacting over the Internet. Typical examples include military mission rehearsal, Internet gaming, supply chain simulation, etc. Consequently, a great deal of research and application work relating to HLA has been done during the last decade.

In HLA terms, an individual simulation component is referred to as a 'federate'. A group of federates that intend to interoperate with one another forms a simulation system, called a 'federation'. Generally, HLA is defined by three components [4]: an interface specification, a set of rules, and an object model template (OMT). The HLA interface specification defines the rules of federate interoperability via a set of common interfaces [4] [5]. The Interface Specification prescribes the interface between federates and the Runtime Infrastructure (RTI), which is the name given to a software implementation of the interface specification [11] [13]. The six types of management services, as shown in Fig. 1, provide communication and coordination services to federates. The HLA rules define a set of compliance rules to ensure proper interaction of federates during federation execution [6], governing the behavior of the overall federation and its federates. The HLA OMT provides a standard framework for recording the information
contained in a required HLA object model for each federation and federate. The essential components of the OMT [7] [11] are the federation object model (FOM) and the simulation object model (SOM). The FOM is a federation-level specification describing all information (objects, attributes, associations, and interactions) that is to be shared by the federates taking part in a particular exercise. The SOM is a federate-level specification describing the objects, attributes, and interactions in a particular federate that are accessible to other federates.

As an important component in HLA-based distributed simulation systems, data collection captures the data generated in a simulation exercise and produces a record of the federation execution which can be permanently preserved. These data are indispensable for on-line debugging of the simulation exercise and for supporting after-action review and analysis. Moreover, the data are the foundation for evaluating the performance and reliability of simulation systems [1] [12] [15]. However, the complex nature of HLA complicates data collection. Owing to the flexibility of the data exchange protocol and the diversity and complexity of the data produced during execution, the design of the data collection system will inevitably affect the overall simulation performance and may even limit the system scale. In order to eliminate this deficiency, different data collection mechanisms have been investigated in depth. Briefly, existing data collection mechanisms are divided into two categories, namely centralized collection and distributed collection [15]. The functional representation of the HLA is provided in Fig. 1, which illustrates the co-operation and exchange of data between various simulations and support utilities.

Centralized collection implies a single point of collection: the collector captures all required data at a single point in the network. The main advantages of the centralized approach are its inherent simplicity and the removal of data collation issues; since the data are collected at one point, they are immediately available for analysis. The main disadvantage is that a large volume of traffic converges at one point in the network, which may cause congestion and turn data collection into a bottleneck for the entire simulation system, especially in large-scale HLA simulation systems based on a Wide Area Network (WAN). Distributed collection has multiple data collection points, with each collector responsible for gathering a subset of the entire data within the exercise. The main
advantage of distributed collection is the avoidance of excessive network traffic.
Figure 1 Functional Representation of HLA Simulation

An additional advantage is realized when the collected data are required for analysis on the local LAN, where they are readily available. The main disadvantage of this mechanism is that additional processing is required to collate the data for a complete exercise analysis. If a single database is to be formed, all the distributed data must be moved across the network, and careful consideration should be given to the timing of this operation to avoid overloading the network or related components. Issues then remain of removing duplicates from the separately collected files in order to produce one clean set of data.

As HLA-based simulation systems are applied in ever more domains, the scale and complexity of these systems cause difficulties in the design and implementation of data collection systems and urgently call for data collection techniques that meet higher requirements. In order to improve the performance of existing data collection mechanisms in HLA-based simulation systems, a multi-agent data collection system is proposed.

In the remainder of the paper we proceed as follows. The next section proposes the architecture of the multi-agent data collection system. Section 3 details the key techniques involved in its design and implementation, including the hierarchical data management/organization mechanism, the formal expression of agents' behavioral characteristics, and the application of a reinforcement learning algorithm to generate the optimized joint recording program. By applying the multi-agent data collection system in a large-scale simulation application, Section 4 gives a performance evaluation of the proposed approach. Finally, conclusions are presented in Section 5.
2. Architecture of multi-agent data collection
Taking the deficiencies of existing data collection mechanisms in HLA-based simulations into account, it is necessary to design a new data collection system that satisfies the various requirements arising over the development cycle of HLA-based simulation systems. Based on previous research results [1] [12] [15], we propose the multi-agent data collection system to overcome the drawbacks of existing data collection mechanisms. The multi-agent data collection system consists of multiple interacting logging agents. Each agent is composed of a Communication Module, a Cooperative Decision Module, a Data Logging Module, and a Data Processing Module. In addition, each logging agent stores the simulation data in a corresponding Database Server (DS). The diagram shown in Figure 2 summarizes the architecture of the multi-agent data collection system presented in this paper, dividing it into four functional modules.
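To make this composition concrete, the following minimal Python sketch (all class and method names are our own illustrative assumptions, not the paper's implementation) shows how a logging agent might wire the four modules to its Database Server:

```python
# Illustrative sketch only: names are hypothetical, not from the paper.
from dataclasses import dataclass

@dataclass
class LoggingAgent:
    """A logging agent composed of the four functional modules."""
    comm: "CommunicationModule"            # RTI-facing interfaces (Section 2.1)
    decision: "CooperativeDecisionModule"  # joint recording program (Section 2.2)
    logging: "DataLoggingModule"           # asynchronous data capture (Section 2.3)
    processing: "DataProcessingModule"     # query handling (Section 2.4)
    ds_url: str                            # connection string of the agent's DS

    def on_simulation_data(self, event):
        # Record only what the current joint recording program assigns to us.
        if self.decision.should_record(event):
            self.logging.write_async(event)
```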
2.1. Communication Module

The Communication Module constructs a set of communication interfaces for the logging agents. These communication interfaces conform to the RTI, which facilitates the communication among federates and hides the low-level details of the network connection from the high-level development of simulation objects and support utilities. In this method, logging agents communicate with each other indirectly via the RTI instead of directly via a communication channel. The communication mechanism enables agents to exchange state information as time-stamped interactions. Based on this mechanism, the logging agents share appropriate messages with each other through the publish-and-subscribe mechanism, which belongs to the Declaration Management services of the RTI [7] [13] [16] and is used to realize remote access to simulation information.
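As an illustration, the sketch below shows the declaration-management style calls a logging agent would issue. The `rti` object stands for a hypothetical Python binding of the RTI ambassador; real HLA bindings are typically C++ or Java, and the exact call names vary by RTI implementation.

```python
# Hypothetical RTI binding: the call names mirror the HLA Declaration
# Management services conceptually, not any specific vendor API.
def subscribe_assigned_classes(rti, object_classes, interaction_classes):
    """Subscribe the agent to the classes its recording program covers."""
    for obj_class, attributes in object_classes.items():
        handle = rti.get_object_class_handle(obj_class)
        attr_handles = [rti.get_attribute_handle(handle, a) for a in attributes]
        rti.subscribe_object_class_attributes(handle, attr_handles)
    for interaction in interaction_classes:
        rti.subscribe_interaction_class(
            rti.get_interaction_class_handle(interaction))

def share_state(rti, interaction_handle, state, timestamp):
    """Exchange agent state with peers as a time-stamped interaction."""
    rti.send_interaction(interaction_handle, {"state": state}, timestamp)
```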
2.2. Cooperative Decision Module

By integrating data logging requests with the local agent's working load, the function of the Cooperative Decision Module is to generate the optimal recording program and the corresponding logging actions. Because this module adopts dynamic load balancing to make data logging decisions, it is efficient for resource management. As the core element of the Cooperative Decision Module, the
cooperative independent reinforcement learning algorithm is discussed in detail in Section 3.3.
Figure 2 Architecture of Proposed System
2.3. Data Logging Module

Referring to the optimal logging program generated by the Cooperative Decision Module, the Data Logging Module carries out the corresponding logging actions using the data acquisition mechanism provided by the RTI. In the implementation, asynchronous I/O is used so that the data collection system does not wait while data are written to the log file, greatly reducing resource consumption.
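A minimal sketch of such asynchronous logging (our illustration in Python, not the paper's implementation): a background thread drains a queue so that the thread receiving RTI callbacks never blocks on disk writes.

```python
import queue
import threading

class AsyncLogWriter:
    """Write records to a log file on a background thread so the
    RTI callback thread never blocks on disk I/O."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._file = open(path, "ab")
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write_async(self, record: bytes):
        self._queue.put(record)        # returns immediately

    def _drain(self):
        while True:
            record = self._queue.get()
            if record is None:         # sentinel: shut down
                break
            self._file.write(record)

    def close(self):
        self._queue.put(None)
        self._worker.join()
        self._file.close()
```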
2.4. Data Processing Module

The principal function of the Data Processing Module is to respond quickly to user demands. Logging agents provide efficient support for exercise analysis, such as on-line debugging of a simulation exercise. Given the stored simulation data, the speed of response to user demands depends largely on an effective data management/organization mechanism. Further details about the Data Processing Module are covered in Section 3.1.

In the proposed multi-agent data collection system, the logging agents adopt an independent reinforcement learning algorithm to generate the optimized logging program, subsequently update their evaluation functions, and finally accomplish the data logging and data acquisition tasks cooperatively. The following sections discuss the details of the design and implementation of the multi-agent data collection system.
3. Studies of key techniques in the multi-agent data collection system

From the aforementioned discussion, the application requirements for a multi-agent data collection system in HLA simulation systems comprise two main issues. Firstly, the data collection system should provide comprehensive and efficient support for the analysis of the simulation execution. Secondly, the data collection system should generate an optimal joint recording program that enables it to consume as few network resources as possible under the premise of recording the simulation data correctly and completely. Meeting these requirements depends on the data management/organization mechanism [17] and an effective cooperative decision algorithm, respectively. In this section, we study in depth the key techniques involved in the multi-agent data collection system.
3.1. Data management and organization mechanism
In order to efficiently support system analysis, for example on-line debugging of a simulation exercise, the capability to access information quickly from the logging nodes is crucial [10]. Taking the dispersed configuration of logging nodes into account, the proposed system introduces a hierarchical data management/organization mechanism (shown in Figure 3) which constitutes the kernel of the Data Processing Module in the multi-agent data collection system. As shown in Figure 3, the Data Processing Module is made up of the Service Requirement Component (SRC), Service Parsing Component (SPC), Service Communication Component (SCC), Service Grouping Component (SGC) and Service Access Point (SAP), which are all independent functional components. When a client query arrives, the local agent processes the data request through the five functional components until the entire service is accomplished.

3.1.1. SRC. Firstly, the SRC receives the client request and decides whether or not to accept it, according to the current load status of the local agent. If it accepts the client request, the SRC forwards the request to the SPC. Otherwise, the request is sent to the SCC.

3.1.2. SPC. The SPC parses the client request and judges whether the request content coincides with the data kept in the database caches of the local DS. If the caches hit, the
client request is forwarded to the SGC. Otherwise, the client request is distributed to the SCC.
Figure 3 Data Management and Organization Mechanism

3.1.3. SCC. In the event of an excessive load on the local agent, or when the request content does not coincide with the data kept in the local caches, the SCC publishes the corresponding client request as a time-stamped interaction to the other agents. In this manner, the SCC redirects the client request so as to optimize system efficiency.

3.1.4. SGC. Upon the arrival of client requests parsed by the SPC, the SGC divides data queries into several independent sub-queries according to the type of the data queries. Subsequently, the SGC queues the sub-queries according to the logical relationships between them. Finally, the SGC submits the logical queue of sub-queries to the SAP.

3.1.5. SAP. As the direct provider for data queries, the SAP finalizes the query output and returns it to the client. For users, the SAP masks data heterogeneity and provides uniform access services for data queries. Furthermore, the SAP offers data access methods appropriate to the characteristics of the different data sources. These data access methods hide the diversity of the underlying data and ensure consistent and transparent data access for the whole system.

Viewed across the whole network, each agent processes data queries and completes the transmission of the accessed data depending on its current load status. Compared to traditional mechanisms, the proposed data management/organization mechanism effectively supports fast database access while ensuring access load balancing and implementing integrated processing of the distributed data. In the meantime, it guarantees the independence of each data
logger. Therefore, it is possible to achieve acceptable performance even if the data loggers are widely distributed or the network connection is poor.
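As a hedged illustration of this pipeline, the Python sketch below traces a client request through the five components; all class, method, and threshold names are our own assumptions, not the paper's implementation:

```python
def handle_client_request(agent, request):
    """Route a client query through SRC -> SPC -> SGC -> SAP,
    falling back to SCC redirection when overloaded or on a cache miss."""
    # SRC: admission control based on the local agent's load.
    if agent.load() > agent.load_threshold:
        return agent.scc.redirect(request)   # forward to peer agents via RTI

    # SPC: parse the request and check the local DS database caches.
    parsed = agent.spc.parse(request)
    if not agent.spc.cache_hit(parsed):
        return agent.scc.redirect(request)   # content not held locally

    # SGC: split into independent sub-queries and order them logically.
    sub_queries = agent.sgc.decompose(parsed)
    ordered = agent.sgc.order_by_dependencies(sub_queries)

    # SAP: execute against the data sources and return a uniform result.
    return agent.sap.execute(ordered)
```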
3.2. Formal Expression of Behavior Characteristics

For the design and implementation of the multi-agent data collection system, the definitions and abstractions of the system's actions must first be made clear. Clear definitions of the behavioral characteristics of the system are a necessary precondition for generating the optimal joint logging program. Therefore, it is necessary to apply a formal expression method to describe the behavioral characteristics of the system. By combining XML and Petri nets, this paper employs a hierarchical language to support the description. An agent's behavioral characteristic is expressed as

$B = \langle O, E, A, V \rangle$

where $O$ is the set of the agent's objectives; $E$ is the set of detectable events; $A$ is the set of the agent's internal actions; and $V_i$ is the set of actions the agent executes to achieve goal $i$. For the logging agents,

O = < O1: Collect Data; O2: Provide Data >

E = < e1: Receive Subscribe Object Request; e2: Receive Subscribe Interaction Request; e3: Receive Transferred Subscription Request; e4: Receive Client Data Request; e5: Receive Transferred Client Data Request >

Acts = < Act1: Subscribe Simulation Object; Act2: Subscribe Simulation Interaction; Act3: Reflect Object Attribute Updates; Act4: Receive Simulation Interaction; Act5: Publish Subscribe Transfer Interaction; Act6: Parse Transferred Subscribe Interaction; Act7: Parse Transferred Client Data Request; Act8: Publish Client Data Request; Act9: Validate Request; Act10: Refuse Request; Act11: Process Data Request; Act12: Send Data >

For the data collection objective, $V_1 = \langle O_1, E_1, Acts_1, Q_1 \rangle$, where $E_1 = \langle e_1, e_2, e_3 \rangle$, $Acts_1 = \langle Act_1, Act_2, Act_3, Act_4, Act_5, Act_6, Act_9, Act_{10} \rangle$, and

$Q_1 = Act_9 \to (Act_5 \vee Act_6 \vee Act_{10}) \to (Act_1 \vee Act_2) \to (Act_3 \vee Act_4)$

For the data provision objective, $V_2 = \langle O_2, E_2, Acts_2, Q_2 \rangle$, where $E_2 = \langle e_4, e_5 \rangle$, $Acts_2 = \langle Act_7, Act_8, Act_9, Act_{10}, Act_{11}, Act_{12} \rangle$, and

$Q_2 = Act_9 \to (Act_7 \vee Act_8 \vee Act_{10}) \to Act_{11} \to Act_{12}$

Here $A \to B$ denotes that $A$ is followed by $B$, and $A \vee B$ denotes that either $A$ or $B$ occurs. The above expressions can be described using Petri nets, as illustrated in the figure below, where $S = \{S_i\}$ denotes the agent's current state.

[Figure: Petri net descriptions of the agent's behavior for events e1-e5, with states S0-S13 and transitions labeled by the actions Act1-Act12.]
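To make the description concrete, here is a minimal Python rendering of these tuples. The paper carries them in XML combined with Petri nets; this plain-data encoding is our own simplification for illustration.

```python
# Our simplified rendering of the behavioral description B = <O, E, A, V>.
OBJECTIVES = {"O1": "Collect Data", "O2": "Provide Data"}

EVENTS = {
    "e1": "Receive Subscribe Object Request",
    "e2": "Receive Subscribe Interaction Request",
    "e3": "Receive Transferred Subscription Request",
    "e4": "Receive Client Data Request",
    "e5": "Receive Transferred Client Data Request",
}

# V2 = <O2, E2, Acts2, Q2>: the behavior for providing data.
V2 = {
    "objective": "O2",
    "events": ["e4", "e5"],
    "actions": ["Act7", "Act8", "Act9", "Act10", "Act11", "Act12"],
    # Q2: validate, then parse/publish/refuse, then process, then send.
    # Each tuple lists alternatives (the "or" branches of the sequence).
    "order": [("Act9",), ("Act7", "Act8", "Act10"), ("Act11",), ("Act12",)],
}
```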
3.3. Cooperative independent reinforcement learning and cooperation between agents

For the multi-agent data collection system, the cooperative decision algorithm used to generate the optimized joint logging program is essential for complete collection of the simulation data. Based on the formal description of the behavioral characteristics of the multi-agent data collection system, we propose a cooperative independent reinforcement learning algorithm. The algorithm derives from Q-learning. The Q-learning algorithm proposed by Watkins [2] [3] has been widely utilized for cooperative decision making in multi-agent systems. As a dynamic difference algorithm, it can be denoted as

$Q_{t+1}(s,a) = (1-\alpha)Q_t(s,a) + \alpha\,[\,r_t + \gamma \max_{a'} Q_t(s',a')\,]$   (1)
where $Q_t(s',a')$ is the value of the evaluation function at the next state $s'$ reached when the agent executes action $a$ at state $s$. In the problem space represented by a finite state set $S = \{s_i\}$, let all agents have the same action set $A$. Following the activity strategy $\pi$, a given agent at current state $s_i \in S$ obtains the reward $r(s,a) \in R$ by taking action $a \in A$. Based on the reward $r$, the agent obtains the corresponding value of the evaluation function $Q : S \times A \to P$ and amends its activity strategy according to the evaluation function value $Q$.

For the logging agents in the HLA-based distributed simulation environment, we assume that each agent, as an autonomous system, is able to self-regulate its own behavior in order to maximize its utility. Therefore, each agent selects its own optimal activity strategy, and the individual strategies form the optimized joint activities. The reinforcement learning process of Agent $i$ is then described as [14]

$Q^i_{t+1}(s,a) = (1-\alpha)Q^i_t(s,a) + \alpha\,[\,r_t(s,a) + \gamma \max_{a^{u'}} Q_t(s',a^{u'})\,]$   (2)

where $\max_{a^{u'}} Q_t(s',a^{u'})$ represents the highest value of the evaluation function over all agents' joint activities $a^{u'}$ at state $s'$. Taking the incomplete information among agents into account, it is difficult to calculate $\max_{a^{u'}} Q_t(s',a^{u'})$ in Formula (2); [14] uses the Nash equilibrium to construct the iterative learning rule

$Q^i_{t+1}(s,a) = (1-\alpha)Q^i_t(s,a) + \alpha\,[\,r^i_t(s,a) + \gamma\, Q^i_t(s') \prod_{j \in A} \pi^j(s')\,]$   (3)

where $\pi^j(s')$ is the mixed-strategy Nash equilibrium for Agent $j$ at state $s'$. Therefore, this paper proposes independent learning to construct the iterative learning process of each logging agent. For each logging agent, the iterative learning rule depends entirely on its current state information and the reward received for taking its own activity strategy. A given agent adopts the following iterative process:
$Q^i_{t+1}(s,a) = \max\{\, Q^i_t(s,a),\; r_t(s,a^{u}) + \gamma \max_{a' \in A} Q^i_t(s',a') \,\}$   (4)

In Formula (4), the reward each agent receives for taking its local action strategy is set equal to the reward for executing the joint actions, that is,

$r(s,a^{u}) = r^i(s,a^i) = r^j(s,a^j) \quad \forall i,j \in m, \qquad a^u = (a^1,\ldots,a^i,\ldots,a^j,\ldots,a^m)$   (5)

Let the rule for updating strategies be

$\pi^i_0(s) = a \in A, \qquad \pi^i_{t+1}(s) = \begin{cases} \pi^i_t(s), & s_t \neq s \ \text{or}\ \max_{a \in A} Q^i_{t+1}(s,a) = \max_{a \in A} Q^i_t(s,a) \\ \arg\max_{a \in A} Q^i_{t+1}(s,a), & \text{otherwise} \end{cases}$   (6)
Formula (6) indicates that an agent keeps its activity strategy unless the Q values are updated. If every agent takes this policy, a joint strategy of all agents is obtained as $\pi^u(s) = (\pi^1(s), \pi^2(s), \ldots, \pi^m(s))$, where $m$ denotes the number of agents. We now prove that the strategy $\pi^u_t$ converges to a certain optimal strategy $\pi^{u*}_t$ when Formulas (4), (5), and (6) are adopted as the learning algorithm and the rule for updating strategies.

Theorem 1. The strategy $\pi^u_t$ converges to a certain optimal strategy $\pi^{u*}_t$ if the condition indicated in Formula (5), that all agents receive the same reward, is satisfied, the learning process stated in Formula (4) is followed, and the strategy updating policy defined in Formula (6) is applied.

Lemma 1 [9]. Define the Q-learning process for joint activities as

$Q_{t+1}(s,a^u) = \begin{cases} Q_t(s,a^u), & s \neq s_t \\ r(s,a^u) + \gamma \max_{a^{u'}} Q_t(s',a^{u'}), & \text{otherwise} \end{cases}$   (7)

Then the Q values converge to the optimal $Q^*$ as the number of experienced state-action pairs increases to infinity in the learning process.

Proof. 1) The Q value of an arbitrary state-action pair $(s,a)$ is the optimal Q value for Agent $i$ executing the joint actions $a^u$ with $a^i = a$ at state $s$, that is,

$Q^i_t(s,a) = \max_{a^u : a^i = a} Q_t(s,a^u)$   (8)

Formula (8) can be proved by induction on the iteration step $t$.

2) For an arbitrary state $s$ and an arbitrary iteration step $t \in \{1,2,\ldots\}$, the joint activity strategy is the optimal strategy, that is,

$Q_t(s, \pi^u_t(s)) = \max_{a^u} Q_t(s,a^u)$   (9)

where $\pi^u(s) = (\pi^1(s), \pi^2(s), \ldots, \pi^m(s))$ and $\pi^i(s)$ is defined by Formula (6). The proof takes two possibilities into consideration.

a) $\max_{a^u} Q_t(s,a^u) < \max_{a^u} Q_{t+1}(s,a^u)$. According to Formula (8), we have $\max_{a \in A} Q^i_t(s,a) < \max_{a \in A} Q^i_{t+1}(s,a)$. From the strategy rules described in Formula (6), we obtain the joint activity strategy at $t+1$ as $a^u_{t+1} = (\pi^1_{t+1}(s), \ldots, \pi^m_{t+1}(s))$ with $\pi^i_{t+1}(s) = \arg\max_{a \in A} Q^i_{t+1}(s,a)$, $i = 1,2,\ldots,m$. Since $Q_{t+1}$ only updates for the state-action pair $(s, a^u_{t+1})$, following all agents' strategy updating rule we have $Q_{t+1}(s, a^u_{t+1}) = \max_{a^u} Q_{t+1}(s,a^u)$.
b) $\max_{a^u} Q_t(s,a^u) = \max_{a^u} Q_{t+1}(s,a^u)$. Let $a^u = (a^1,\ldots,a^i,\ldots,a^j,\ldots,a^m) = \arg\max_{a^u} Q_{t+1}(s,a^u)$. According to Formula (6), we have $\pi^i_{t+1}(s) = \pi^i_t(s) = a^i_t$, $i = 1,2,\ldots,m$. Therefore the joint activity at $t+1$, $a^u_{t+1} = (\pi^1_{t+1}(s),\ldots,\pi^m_{t+1}(s))$, satisfies

$Q_{t+1}(s, a^u_{t+1}) = Q_{t+1}(s, (\pi^1_{t+1}(s),\ldots,\pi^m_{t+1}(s))) = \max_{a^u} Q_{t+1}(s,a^u)$

3) From the conclusion of 2) and Lemma 1, the strategy $\pi^u_t$ eventually converges to some optimal strategy $\pi^{u*}_t$, which completes the proof.

By adopting the above learning algorithm and the rule for updating strategies, the Cooperative Decision Module is able to generate the corresponding optimized actions, which guarantees that the multi-agent data collection system fully covers the simulation data and responds to client data queries promptly while effectively decreasing the computational cost.
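As a minimal illustration of the mechanics above, the following Python sketch implements the monotone update of Formula (4) together with the strategy-updating rule of Formula (6) for a single logging agent. The two-action space and the environment interface are our assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

def independent_q_update(Q, policy, s, a, reward, s_next, actions, gamma=0.9):
    """One step of the independent learning rule: Formula (4) updates Q,
    Formula (6) keeps the old strategy unless max_a Q(s, a) changed."""
    old_max = max(Q[(s, b)] for b in actions)
    # Formula (4): monotone update using the (shared) joint reward received.
    target = reward + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] = max(Q[(s, a)], target)
    # Formula (6): re-derive the greedy action only if the max Q value changed.
    new_max = max(Q[(s, b)] for b in actions)
    if new_max != old_max:
        policy[s] = max(actions, key=lambda b: Q[(s, b)])
    return Q, policy

# Usage sketch: tables start empty; the policy starts with arbitrary actions.
actions = ["record", "skip"]
Q = defaultdict(float)
policy = defaultdict(lambda: random.choice(actions))
```

Because the update in Formula (4) is monotone and all agents receive the same reward (Formula (5)), each agent can improve its local table without observing the other agents' choices, which is what makes the learning "independent".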
4. The performance measures

Taking the design requirements of a data collection system in HLA simulation into account, the chosen performance measures were the data loss rate, the network traffic, and the response speed to client data queries, which respectively measure the integrity of the recorded simulation data, the logging agents' impact on the simulation execution, and the effectiveness of the support for simulation analysis. In practice, we utilize the distributed simulation application system of [8] [9] as a testing platform to investigate the performance of the proposed approach. Figure 4 depicts the testing platform, which was originally designed to evaluate the performance of a certain weapon system. As a large-scale HLA-based simulation system, the testing platform includes numerous federates, for example missiles, fighter-interceptors, radars, satellites, a command and control center, etc. All testing processes were executed in the same experimental environment. During the experiments, the simulation data were updated every 50 milliseconds, and the simulation execution generated about 5.325 GB of simulation data in about 33 minutes.

Figure 4 Sketch Map of Testing Platform
In the testing process, different numbers of logging agents performed the data collection. By applying the independent reinforcement learning algorithm described in Section 3.3, multi-agent data collection systems with different numbers of logging agents produce different optimized joint recording programs to cover the object and interaction information generated during the simulation execution. Table 1 gives the detailed recorded content of each logging agent under the different configurations.

Table 1: Recorded Content of Each Logging Agent and Corresponding CPU Usage

| Configuration    | Agent     | Object Classes Recorded | Interaction Classes Recorded | Percentage of Total Data | Average CPU Usage |
|------------------|-----------|-------------------------|------------------------------|--------------------------|-------------------|
| 2 Logging Agents | 1st Agent | 32                      | 120                          | 74.56%                   | 78.37%            |
|                  | 2nd Agent | 28                      | 107                          | 61.24%                   | 66.17%            |
| 3 Logging Agents | 1st Agent | 23                      | 97                           | 57.13%                   | 64.45%            |
|                  | 2nd Agent | 21                      | 84                           | 49.67%                   | 57.26%            |
|                  | 3rd Agent | 20                      | 75                           | 48.43%                   | 55.14%            |
| 4 Logging Agents | 1st Agent | 14                      | 75                           | 36.7%                    | 45.64%            |
|                  | 2nd Agent | 11                      | 72                           | 34.24%                   | 39.74%            |
|                  | 3rd Agent | 12                      | 64                           | 36.15%                   | 41.14%            |
|                  | 4th Agent | 12                      | 67                           | 36.56%                   | 42%               |
| 5 Logging Agents | 1st Agent | 11                      | 32                           | 34%                      | 35.67%            |
|                  | 2nd Agent | 10                      | 32                           | 31.3%                    | 32%               |
|                  | 3rd Agent | 10                      | 25                           | 29.76%                   | 31.26%            |
|                  | 4th Agent | 9                       | 29                           | 27.72%                   | 29.43%            |
|                  | 5th Agent | 9                       | 22                           | 22.7%                    | 27.4%             |

Note: There are 47 Object Classes and 128 Interaction Classes involved in the simulation execution. The total amount of simulation data is 5.325 GB.
From Table 1, we can see that the application of the independent reinforcement learning algorithm makes the proposed approach distribute the tasks rationally among the different logging agents, reflecting optimal use of computational resources. Although certain simulation information is recorded repeatedly by different logging agents, this superposition does not affect the integrity of the recorded data. On the contrary, it tends to facilitate client data queries, because more data backups exist at different logging nodes.

As an important performance measure, the data loss rate reflects the integrity of the recorded data. Figure 5 illustrates the data loss rates of the different data collection mechanisms. In this case, the data loss rate of centralized data collection exceeds 10%, which indicates that centralized data collection is not able to record the simulation data acceptably. Compared to the centralized data collection system, the multi-agent data collection system effectively decreases the data loss rate. The statistical results in Figure 5 imply that, for a large-scale simulation system with fast data updates, the multi-agent data collection system effectively relieves the data loss caused by the high speed of data transmission. In the study case, five logging agents can satisfy the data-integrity requirement, which is normally specified as below 0.05%.
Figure 5 Comparisons of Data Loss Rate of Different Data Collection Mechanisms

Moreover, another attractive feature of the multi-agent data collection system is its capability to reduce the network load. In a large-scale HLA-based simulation system, as previously mentioned, the great mass of simulation information sent to the data collection system often leads to network congestion and performance degradation. The heavy network traffic caused by the data collection system often results in time delays in the exchanged messages, which destroys the temporal logic of the interactive information in the simulation execution. Figure 6 gives the network traffic when the centralized data collection system and the multi-agent data collection systems, respectively, are adopted in the execution of the testing platform. From Figure 6, we can see clearly that the multi-agent data collection system reduces the network load significantly. This testing result shows that the multi-agent data collection system can decrease the network traffic load imposed by the data collection component in HLA simulations. Furthermore, as the number of logging agents participating in the testing platform execution increases, the reduction of network load becomes more apparent.

Figure 6 Comparisons of Network Traffic

As mentioned previously, fast data access is important to support exercise analysis. Figure 7 illustrates the query response tests performed with the centralized data collection system and the multi-agent data collection system, respectively. It is obvious that the response speed for data queries in the multi-agent data collection system is faster than that of the conventional data collection mechanism in HLA simulation.
Figure 7 Response Speed of Data Request
5. Conclusion

In this paper, a multi-agent data collection system has been proposed to overcome the drawbacks of existing data collection mechanisms in HLA simulation systems. The architecture of the multi-agent data collection system is presented, and the key techniques involved in its design and implementation are discussed in detail.
By adopting a hierarchical expression to describe the behavioral characteristics of the multi-agent data collection system, the proposed approach provides higher-level abstractions than traditional data collection systems in HLA simulations. Based on the formal expression, the proposed approach employs a hierarchical data management/organization mechanism to increase data access performance, which effectively supports simulation analysis. Furthermore, the multi-agent data collection system adopts the independent reinforcement learning algorithm to generate an optimized joint action program, which guarantees that data collection and query tasks are rationally distributed among the logging agents while utilizing computational resources more efficiently. Compared with conventional data collection systems in HLA simulation, this capability gives the system more autonomy and flexibility in decision making. Experimental results show that the multi-agent data collection system is a promising alternative to existing data collection mechanisms in HLA simulations.
6. References

[1] S.T. Bachinsky, G.H. Tarbox, E.T. Powell, Data collection in an HLA environment, in: Proceedings of the 1997 Spring Simulation Interoperability Workshop, Orlando, FL, UCF-IST, 1997.
[2] C. Watkins, Learning from Delayed Rewards, PhD thesis, Cambridge University, Cambridge, 1989.
[3] C. Watkins, Q-learning, Machine Learning, 1992, pp. 279-292.
[4] Defense Modeling and Simulation Office, HLA Technical Specification.
[5] Department of Defense (US), High Level Architecture Interface Specification, version 1.3, February 1998.
[6] Department of Defense (US), High Level Architecture Rules, version 1.3, February 1998.
[7] Department of Defense (US), High Level Architecture Object Management and Template, version 1.1, February 1997.
[8] Heng-Jie Song, Ming Yang, The application of evaluation method based on HMM for results validity of complex simulation system, in: Winter Simulation Conference, 2005, pp. 1228-1233.
[9] Heng-Jie Song, Ming Yang, Research on federation integrated test platform based on HLA, Journal of Beijing University of Posts and Telecommunications, 28(4), August 2005.
[10] H. Zhao, N.D. Georganas, An approach for stream transmission over HLA-RTI in distributed virtual environments, in: Proceedings of the 3rd International Workshop on Distributed Interactive Simulation and Real-Time Applications, College Park, Maryland, March 1999, pp. 67-74.
[11] IEEE Std 1516.2-2000, IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) Object Model Template (OMT) Specification, Simulation Interoperability Standards Committee of the IEEE Computer Society, New York, 2000.
[12] J. Black, Data collection in an HLA federation, in: Proceedings of the Spring Simulation Interoperability Workshop, Orlando, FL, March 1999.
[13] J.O. Calvin, R. Weatherly, An introduction to the high level architecture (HLA) runtime infrastructure (RTI), in: Proceedings of the 14th Workshop on Standards for the Interoperability of Defence Simulations, Orlando, FL, March 1996, pp. 705-715.
[14] J. Hu, M.P. Wellman, Multiagent reinforcement learning: theoretical framework and an algorithm, in: Proceedings of the 15th International Conference on Machine Learning, Morgan Kaufmann, 1998, pp. 242-250.
[15] P.A. Wilcox, A. Burger, P.R. Hoare, Advanced distributed simulation: a review of developments and their implication for data collection and analysis, Simulation Practice and Theory, 8(3-4), 2000, pp. 201-231.
[16] S. Bachinsky, J. Russel, F.J. Hodum, Implementation of the next generation RTI, in: Proceedings of the Spring Simulation Interoperability Workshop, Orlando, FL, March 1999.
[17] M. Thom, M. Leo, Applying temporal database to HLA data collection and analysis, in: Winter Simulation Conference, 1998, pp. 235-239.