Comparison of Network Protocol and Architecture for Distributed Virtual Simulation Environment
Bu-Sung Lee, Wen-Tong Cai, Stephen J. Turner and Jit-Beng Koh
Nanyang Technological University, School of Computer Engineering, Blk N4, #2A-32, Nanyang Avenue, Singapore 639798
{ebslee, aswtcai, assjturner}@ntu.edu.sg, [email protected]

Keywords: Lightweight Reliable Multicast, Fast Messages, RTI, RTI-Kit, FDK, DIS, HLA

Abstract
In any distributed virtual simulation environment, the underlying network architecture and its protocols play an important part in its performance. This paper describes the different underlying protocols used in the support of the RTI implementation in the Federated Simulations Development Kit (FDK). The communication FM and MCAST modules were modified to support different protocols. The performance of two different protocols, TCP and a new lightweight reliable multicast called the Pseudo Reliable Multicast Protocol (PRMP), running on top of two different network architectures, Ethernet and Asynchronous Transfer Mode (ATM), was compared. The latter protocol was developed specifically to support the distributed virtual simulation environment. Furthermore, in the case of the ATM network architecture, the native ATM API was also implemented and its performance compared with the other protocols. The benchmarks used to compare their performance are the Latency Benchmark and the Time Advance Request Benchmark. The results show that PRMP outperforms the other protocol techniques when the number of subscribers is large and when bandwidth is limited, but it has some additional latency overhead due to the extra processing required to provide the reliability needed by the sender and receivers. Comparing the network architectures, the benchmark performance of the above protocols operating on top of a 100BaseT switched network is much better than over the ATM network, although the transmission speed is much higher in the case of the latter.

1. Introduction
A Distributed Virtual Simulation Environment (DVSE) is an important strategic technology for linking simulations of various types, at multiple locations, to create a realistic, complex "virtual world" for the simulation of highly interactive activities. One major use of DVSE is the Distributed Interactive Simulation (DIS) [1,2] system for military simulations such as war-gaming, where geographically distributed hardware and personnel interact with each other as if they were in actual combat situations. The US Department of Defense (DoD) has adopted the High Level Architecture (HLA) [3,4] as the framework and the RunTime Infrastructure (RTI) as the infrastructure to support the DVSE. One of the major issues faced by developers of DVSE is the communication requirement. Increasing the bandwidth or transmission rate does not necessarily give better performance.
The efficiency and appropriateness of the protocol used to support DVSE is also an important factor affecting its performance. Over the past decade, two network architectures have made a significant impact: 100BaseT [5] and Asynchronous Transfer Mode (ATM) [6]. They have raised the transmission rate of networks from tens of megabits to hundreds of megabits, and both have been widely accepted and deployed in various organisations. With regard to protocols, the Internet Protocol (IP) [7] has been widely used in network communication. IP was developed for general-purpose transmission of data. IP native multicast is a very useful mechanism for the group communication found in DVSE, and a number of researchers [17] have used reliable multicast in DVSE. However, such protocols are not tuned for DVSE. We have proposed and implemented a lightweight reliable multicast protocol, the Pseudo Reliable Multicast Protocol (PRMP) [8], which runs on top of IP. It was designed specifically to support the group communication found in DVSE. In this paper, we focus on the performance of the different network architectures, 10/100 BaseT Ethernet and ATM, as well as the performance of different protocols in DVSE. Section 2 introduces the Federated Simulations Development Kit, which provides the RTI for the DVSE. Section 3 introduces the different protocols and network architectures. Section 4 reports on the benchmark results. Conclusions drawn from the experiments are found in section 5.
2. Federated Simulations Development Kit
The US Department of Defense (DoD) has adopted the High Level Architecture (HLA) as a standard for all its simulations. With this adoption, HLA-compliant Runtime Infrastructures (RTI), which provide simulations with runtime services like basic communications, play a crucial role in the reduction of high communication latencies. A non-commercial RTI development kit, known as the Federated Simulations Development Kit (FDK) [9,11], forms the basis for our development and implementation. Figure 1 shows an architectural overview of FDK and its interconnections with a federate and the underlying network. The FDK consists of the following modules:
• Buffer Management and Queues Library: Manages the buffers and queues used by the other modules.
• Time Management Kit (TM): Manages time advancement in the DVS environment.
• Multicast Kit (MCAST): Manages multicast groups and the sending and receiving of multicast data.
• Fast Messages (FM): Manages communication with other entities. It ensures reliable and ordered delivery of messages. It uses the Fast Messages interface [12] for communication with other modules.
In addition to the RTI-Kit, the Federated Simulations Development Kit contains an RTI implementation compliant with the HLA Interface Specification, Debbie's RTI (DRTI). While
the implementation is not a complete realization of the HLA Interface Specification, sufficient HLA services are already in place for simple simulations and benchmarking.
[Figure 1 shows the Federate (Simulator/Application) on top of the RTI Interface and Debbie's RTI, which in turn sit on the RTI-Kit modules (MCAST, TM, Buffer Management and Queues Library, FM) above the Network.]
Figure 1: Architectural Overview of FDK
The MCAST and FM modules are of particular interest in this paper. The MCAST module is responsible for the management of multicast groups and group communications, while the FM module provides the low-level primitives for communication on the underlying network. MCAST and the TCP port of FM in FDK were modified to employ network-level native multicasting as well as the FORE ATM API. The new FM APIs are backward compatible; simulations need not be rewritten.
3. Communication Protocols and Network Architecture
3.1 Communication Protocols
Protocols refer to the set of rules that govern communication between different entities. The MCAST and FM modules in the FDK were modified to support a number of communication protocols. These modifications are transparent to the users of FDK. The communication protocols supported are as follows:
1. FM-TCP. This is the original communication protocol used by FDK. It only uses the unicast TCP protocol for communication. Sending to multiple entities requires the use of a TCP exploder, which is basically a dedicated process that copies the multicast data and sends it to all the subscribers one at a time using individual TCP connections.
2. FM-ATM. This uses the FORE ATM API, which uses ATM Adaptation Layer 5, and only runs over an ATM network. Since ATM only supports point-to-point communication, multicast is implemented using an exploder similar to that of FM-TCP.
3. FM-PRMP [22]. This uses both the multicast UDP and unicast TCP protocols for communication. FM-PRMP, proposed by the authors, was designed to exploit the efficiency of native multicast transmission, i.e. the ability of the network architecture to send to multiple receivers simultaneously. It provides reliable and ordered delivery using multicast transmission.
Figure 2 shows the protocol stack of the above protocols. Both FM-TCP and FM-PRMP can run on top of either the Ethernet or the ATM network architecture. FM-ATM can only run on top of an ATM network architecture.
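The exploder used by FM-TCP and FM-ATM can be illustrated with a small sketch. This is a simplified model, not the FDK code: the `Subscriber` class stands in for one unicast TCP connection, and byte counting replaces real socket I/O.

```python
# Sketch of a TCP exploder: a dedicated loop that replicates a "multicast"
# payload over individual unicast connections, one per subscriber.
# Illustrative model only; real code would use per-subscriber TCP sockets.

class Subscriber:
    """Stands in for one unicast TCP connection to a receiver."""
    def __init__(self):
        self.received = []

    def send(self, payload: bytes) -> int:
        self.received.append(payload)
        return len(payload)

def explode(payload: bytes, subscribers) -> int:
    """Copy the payload to every subscriber, one unicast send at a time.
    Returns the total number of bytes put on the wire."""
    total = 0
    for sub in subscribers:
        total += sub.send(payload)   # N sends for N subscribers
    return total

subs = [Subscriber() for _ in range(4)]
wire_bytes = explode(b"update", subs)   # the same payload crosses the wire 4 times
```

The key property is that the network cost grows linearly with the number of subscribers, which is what the later benchmarks expose.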
Applications
FM-TCP | FM-PRMP  | FM-ATM
TCP    | UDP, TCP | ATM AAL5
IP
Network Architecture (Ethernet, ATM)
Figure 2: Communication protocol stack
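The PRMP idea described above can be contrasted with the exploder in a similarly simplified sketch: one native multicast transmission, followed by unicast repair only for the receivers that missed it. The function names and the loss model are illustrative assumptions, not the PRMP wire protocol.

```python
# Sketch of lightweight reliable multicast: transmit once over native
# multicast, then use unicast retransmission only for receivers that did
# not get the multicast copy. Loss is simulated with a set of names.

def prmp_send(payload, receivers, lost):
    """receivers: dict name -> inbox list; lost: names that miss the
    multicast pass. Returns the number of packets put on the wire."""
    packets = 1                          # one multicast transmission
    acked = set()
    for name, inbox in receivers.items():
        if name not in lost:             # delivered by the multicast
            inbox.append(payload)
            acked.add(name)
    for name in receivers:               # NAK/timeout path: unicast repair
        if name not in acked:
            receivers[name].append(payload)
            packets += 1
    return packets
```

With no loss, one packet reaches every subscriber; each loss adds only one repair packet, rather than the N-per-update cost of the exploder.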
3.2 Network Architecture
3.2.1 10/100BaseT Switched Ethernet
Ethernet, which had its beginnings in the 1980s, has evolved tremendously. One major change is the shift from a shared medium to a shared switch [18]. In a 10/100BaseT switch the data packet is forwarded, in hardware, to the destination based on the packet's Medium Access Control (MAC) address. Full-duplex communication is supported. The packet format remains the same as in the original Ethernet specification.

3.2.2 ATM network
ATM is basically a connection-oriented (CO) service, whereas the existing LAN protocols support connectionless (CL) services. In ATM, a virtual connection (VC) is set up for the communication between any two physically separated processes. Thus, all ATM cells carry a Virtual Path Identifier (VPI) and Virtual Connection Identifier (VCI), which together represent a unique virtual connection. The VPI and VCI are also used for multiplexing, demultiplexing, and switching the cells through the network. Hence, interfacing between ATM and the existing connectionless LAN protocols is not an easy task. However, the ATM Forum proposes two ways to realize the existing connectionless LAN interfaces and functionality in an ATM environment: Classical IP and LAN Emulation [19,20]. However, the
former, Classical IP, is not suitable for our experiment as it does not support native multicast, which is used by FM-PRMP. In addition, Fore Systems has implemented a proprietary interface known as the Fore ATM API. Figure 3 shows the detailed protocol stack for the different implementations of LAN functionality in ATM.
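As an illustration of the VPI/VCI fields mentioned above, the following sketch decodes the 5-byte ATM cell header, assuming the standard UNI bit layout (GFC 4 bits, VPI 8 bits, VCI 16 bits, PT 3 bits, CLP 1 bit, HEC 8 bits); it is an aside for clarity, not part of the FDK implementation.

```python
# Decode a 5-byte ATM UNI cell header to recover the VPI/VCI pair that
# identifies the virtual connection used for switching.

def parse_uni_header(hdr: bytes):
    assert len(hdr) == 5
    gfc = hdr[0] >> 4                                   # Generic Flow Control
    vpi = ((hdr[0] & 0x0F) << 4) | (hdr[1] >> 4)        # Virtual Path Id
    vci = ((hdr[1] & 0x0F) << 12) | (hdr[2] << 4) | (hdr[3] >> 4)  # Virtual Connection Id
    pt  = (hdr[3] >> 1) & 0x07                          # Payload Type
    clp = hdr[3] & 0x01                                 # Cell Loss Priority
    hec = hdr[4]                                        # Header Error Control
    return {"gfc": gfc, "vpi": vpi, "vci": vci, "pt": pt, "clp": clp, "hec": hec}
```

Every switch along the path reads only these fields to forward the cell, which is why the VPI/VCI pair must uniquely identify the connection.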
Figure 3: Detailed ATM protocol stack of LANE, Classical IP and FORE ATM-API [21]

3.2.2.1 LAN Emulation (LANE)
LANE [19] over ATM provides full support for existing LAN-based applications without changes. The LANE specifications are standardized by the ATM Forum to handle the differences between shared or switched LANs and connection-oriented ATM networks, making the change to ATM completely transparent to the network's applications. The LANE architecture is based on the client-server concept, as shown in figure 4. A LANE client (LEC) is set up at each host; the LANE server (LES) can be centralized or distributed over the network. The Broadcast and Unknown Server (BUS) is the broadcast server for an Emulated LAN (ELAN), a LAN environment over ATM, and it takes care of the broadcasting functionality within the ELAN. The configuration information is maintained by the LAN Emulation Configuration Server (LECS), whose database is updated to keep a record of changes. LANE is configured using Switched Virtual Circuits (SVCs) to form Virtual Channel Connections (VCCs).
[Figure 4 shows the LANE components and the VCCs between them: Configuration VCC, Control VCC, Multicast VCC and Data Direct VCC.]
Figure 4: LANE Components [19]

3.2.2.2 Fore ATM-API
Fore Systems has implemented an interface that supports their proprietary signalling protocol, called SPANS. Fore IP thus allows communication using AAL 4 and 5 with no encapsulation, on top of SPANS signalling. Fore IP uses a broadcast ARP for SPANS address resolution and supports direct communication between all hosts on a physical ATM network without the use of IP routers. A host can support Fore IP and SPANS at the same time, as Q.2931 and SPANS use different VCIs. Fore IP is configured with an IP address of its own and belongs to the IP subnet defined by the mask.
4. Performance Measurements
Benchmark results vary with different network configurations and equipment. To justify and validate the benchmark results, the network configuration and equipment setup of the testbed is presented in Figure 5. The machines on these networks are isolated, so the benchmarks are unaffected by other network traffic. Furthermore, by using two interface cards per machine, we are able to compare network performance directly: the computing systems are the same and only the network interface used changes. Equipment used:
1. 10/100 Mbps Ethernet switch (D-Link)
2. Fore ASX-200BX switch
3. 5 Intel-based PCs with the following configuration:
• Intel 450 MHz Pentium II (3 machines)
• Intel 450 MHz Pentium III (2 machines)
• 128 Mbytes of RAM
• 3COM 3C905B-TX 10/100 Mbps Network Interface Card
• Fore Systems PCA-200EPC (OC-3 ATM card, 155 Mbps)
4. Sun Solaris 2.6/x86 and Sun Solaris 7.0/x86
[Figure 5 shows the test-bed: the PCs connected to both the 10/100 Ethernet switch and the Fore ASX-200BX ATM switch.]
Figure 5: Ethernet and ATM Test-bed
Two benchmarks are used to measure the performance of DRTI using Multicast FM. These are the Latency Benchmark and the Time Advance Request (TAR) Benchmark.
4.1 Latency Benchmark
The Latency Benchmark measures the one-way latency between two federates. This latency is calculated by taking the round-trip latency and dividing it by two. Round-trip latency is measured as the delay from an UpdateAttributeValues invocation to send a message until the time the same federate receives an acknowledgement message from the second federate. The message sent by UpdateAttributeValues contains a wall-clock time value and a payload of N bytes. The acknowledgement message, sent by the second federate on receiving the ReflectAttributeValues callback, contains a time stamp without any payload. This approach is consistent with the existing DMSO software and methodology.
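The round-trip arithmetic above can be sketched as follows. The timing wrapper is illustrative: in the real benchmark the timed call would be UpdateAttributeValues followed by waiting for the acknowledgement, not a simulated delay.

```python
# Sketch of the latency-benchmark arithmetic: one-way latency is estimated
# as half the measured round trip. The exchange itself is simulated here.

import time

def one_way_latency_us(send_and_wait_for_ack) -> float:
    """send_and_wait_for_ack blocks until the ack returns; halving the
    round-trip time approximates the one-way figure."""
    t0 = time.perf_counter()
    send_and_wait_for_ack()
    t1 = time.perf_counter()
    return (t1 - t0) / 2 * 1e6    # microseconds

# Simulated exchange: pretend the round trip takes about 2 ms,
# so the one-way estimate comes out near 1000 us.
latency = one_way_latency_us(lambda: time.sleep(0.002))
```

Halving the round trip avoids clock synchronisation between the two machines, which is why the benchmark bounces the message back rather than timestamping at the receiver.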
Experiment 1: Communication Protocol comparison (10/100Mbps Ethernet) Figure 6 shows the results of running the benchmark over the Ethernet test-bed with various payload sizes from 100 bytes to 1500 bytes, at speeds of 10Mbps and 100Mbps.
Figure 6: Latency Benchmark of 10 and 100 Mbps Ethernet
As compared to FM-TCP (tcp10, tcp100), FM-PRMP (r/udp10, r/udp100) incurs an additional latency of about 400 to 500 μs in both cases. This is as expected, since FM-PRMP has to assemble, process and buffer messages to attain reliability. A large part of this latency is incurred in context switching, as Multicast FM is a multi-process implementation. Furthermore, the gradient of the curve is consistent with the packet transmission time. The initial delay is obtained by extrapolating backwards, which gives approximately 300 μs for FM-TCP and 700 μs for FM-PRMP.
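The "extrapolating backwards" step is a linear fit: latency(payload) = fixed_overhead + payload × per-byte cost, so the intercept of a least-squares line through the measured points estimates the initial delay. The sample points below are synthetic, not the paper's measurements.

```python
# Ordinary least-squares fit; the intercept estimates the fixed per-message
# overhead and the slope the per-byte transmission cost.

def fit_line(xs, ys):
    """Returns (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope

payloads = [100, 500, 1000, 1500]                 # bytes
latencies = [300 + 0.8 * p for p in payloads]     # synthetic: 300 us overhead
intercept, slope = fit_line(payloads, latencies)  # intercept recovers the 300 us
```

On the real data the same fit would be applied to each protocol's curve separately, giving the two intercepts quoted above.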
Experiment 2: Communication Protocols comparison (ATM)
This experiment compares the performance of FM-TCP, FM-PRMP and FM-ATM over the ATM network. Figure 7 shows the results: FM-ATM gives the best performance and FM-PRMP the worst. This is partly due to the need to send the information to the BUS module (refer to figure 4), which distributes it to the receivers; this incurs delay due to the additional transmission and retransmission of the packet. The performance of FM-TCP over ATM is comparable to that of FM-ATM, as it is a direct source-to-destination transmission. The slightly higher delay is because, in the case of FM-TCP over LANE, additional processing time is required for the segmentation and reassembly of the data, as shown in figure 3.
Figure 7: Latency benchmark of communication protocols over ATM.
4.2 TAR Benchmark
The Time Advance Request (TAR) Benchmark measures the performance of the time management services and the time required to perform Lower Bound Time Stamp (LBTS) computations. The original benchmark involves N federates, in which federate i subscribes to attributes published by federate (i+1), and federate (N-1) subscribes to attributes published by federate 0. In effect, each multicast group contains a publishing (sending) federate and a subscribing (receiving) federate. Each federate then repeatedly performs an UpdateAttributeValues call followed by a TimeAdvanceRequest call with the same time parameter. The number of time advance grants (TAGs) per second of wall-clock time, as observed by each federate, is then measured. To measure the effects of utilizing IP multicasting, a variation of the TAR benchmark is used. In this modified TAR benchmark, federates 1 to (N-1) subscribe to attributes published by federate 0, and federate 0 does not subscribe to any attributes. In effect, there is only one multicast group, comprising a sending federate (federate 0) and multiple receiving federates (federates 1 to N-1).
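The LBTS computation that this benchmark stresses can be sketched conceptually: a federate may be granted an advance to time T only up to the minimum, over the other federates, of logical time plus lookahead. This is a conceptual model of the constraint, not FDK's distributed LBTS algorithm.

```python
# Conceptual model of the time-advance constraint behind the TAR benchmark.

def lbts(others):
    """others: iterable of (logical_time, lookahead) pairs for the other
    federates. LBTS is the earliest time any of them could still send to."""
    return min(t + la for t, la in others)

def grant(requested, others):
    """A TimeAdvanceRequest(requested) yields a TAG no later than LBTS."""
    return min(requested, lbts(others))
```

Because every UpdateAttributeValues/TimeAdvanceRequest cycle requires this computation across federates, the TAG rate is dominated by how quickly the underlying protocol can move the small control and update messages, which is exactly what the experiments below compare.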
Experiment 1: TAR Benchmark on 10/100 BaseT
Two protocols were compared, FM-TCP (tcp) and FM-PRMP (r/udp), running over the 100 Mbps Ethernet. As shown in Figure 8, FM-TCP offers better performance when there are few subscribers, because of the additional processing incurred in FM-PRMP. As the number of subscribers increases, FM-PRMP starts to perform better.
With 4 subscribers, FM-PRMP surpasses and outperforms FM-TCP. Generally, with FM-TCP, as the number of subscribers increases, the amount of network traffic and the latencies increase, because the exploder has to send the same packet to each and every one of its intended receivers. Thus, FM-PRMP is more favorable for large multicast groups, even with its associated overheads, because of its efficiency in transmission: multicast needs to send the packet only once, while the exploder needs to send the packet to each individual receiver.
Figure 8: TAR Benchmark (100Mbps Ethemet To verify the correctness of this deduction, the network speed was set to 10 Mbps. The results in figure 9, where the network is a 10Mbps Ethernet, FM-PRMP (r/udp) consistently outperforms FM-TCP (tcp). The higher latencies incurred by FM-PRMP, as established by Latency Benchmarks earlier, is outweighed by PRMP benefits of efficient utilisation of bandwidth using native multicast transmission. For 2 subscribers and above, FM-PRMP registers a 40% gain in performance over FM-TCP.
Figure 9: TAR Benchmark (10 Mbps Ethernet)
Experiment 2: Communication Protocols comparison (ATM)
The TAR benchmark was carried out to compare the performance of FM-TCP over LANE in ATM, FM-PRMP over LANE, and FM-ATM. The results, in figure 10, show that FM-ATM performs much better than the other protocols, with FM-PRMP's performance very close to that of FM-ATM. In this case FM-PRMP uses the BUS module in LANE, a hardware multicast, to act as the exploder for the packet, instead of the sender as in the case of FM-TCP.
Benchmark(ATM)
3000 2500 2000 1500 1000 500 0
t
2
3
4
# of subscribers { ~tcp
+r/udp
---A'--atm/api ]
Figure 10: TAR Benchmark of communication protocols (ATM)
5. Conclusions
The FDK communication modules, MCAST and FM, were successfully modified to support the use of PRMP and FORE ATM API communication. The Latency Benchmark and Time Advance Request Benchmark were carried out using the different protocols running over 10/100 BaseT switched Ethernet and a FORE ATM switch. The results from the experiments on the Ethernet network clearly show the benefit of using native multicast for DVSE. The lightweight reliable multicast, FM-PRMP, has proven to be the ideal choice of communication when the number of subscribers is large or where bandwidth is low, as shown in the TAR benchmark. FM-PRMP (r/udp) clearly outperforms FM-TCP as the number of subscribers increases, and thus offers higher scalability to the environment. This gain comes at a cost: increased latency, due to the additional processing required to wrap the data for multicast. Likewise, in our experiments on the ATM network, FM-PRMP outperforms FM-TCP in the TAR benchmark. However, the communication protocol that gives the best result is FM-ATM. It can also be seen from figure 10 that the FM-PRMP (r/udp) performance approaches the FM-ATM performance; for larger numbers of subscribers, we confidently predict that FM-PRMP will outperform FM-ATM. Comparing the TAR benchmark of the different protocols between the two network architectures, Ethernet and ATM, FM-PRMP is not affected, while FM-TCP performance over Ethernet is certainly much better than FM-TCP over ATM. The reason for the poor performance of FM-TCP over ATM is the additional processing required to run IP over the LANE ATM network infrastructure. This did not affect FM-PRMP, because FM-PRMP uses the BUS module in LANE to do the broadcasting to all the receivers, which is more efficient than the sender repeatedly sending the packets. Comparing the latency figures for the different network architectures, ATM certainly has a much higher delay.
This is due to the complexity of the underlying network architecture: the 100BaseT Ethernet architecture is very simple, while ATM support for IP is complex, as shown in figure 3. Given the above results, we find that it is better to run DVSE over 100BaseT Ethernet than ATM, and FM-PRMP is certainly the protocol to use. We are presently developing an RTI-Kit that switches the communication protocol used based on the type of event and the number of subscribers, so that the communication in DVSE is optimised.
6. References
[1] IEEE 1278.1-1995, Standard for Distributed Interactive Simulation - Application Protocols
[2] IEEE 1278.2-1995, Standard for Distributed Interactive Simulation - Communication Services and Profiles
[3] Defense Modeling & Simulation Office.
http://www.dmso.mil
[4] High Level Architecture. http://hla.dmso.mil
[5] Charles E. Spurgeon, "Ethernet: The Definitive Guide", O'Reilly and Associates. http://www.bellereti.com/ethernet/edg/edg.html
[6] Timothy Kwok, "ATM: The New Paradigm for Internet, Intranet, and Residential Broadband Services and Applications", Prentice Hall. ISBN 0-13-107244-7, 1997.
[7] Behrouz A. Forouzan, "TCP/IP Protocol Suite", McGraw Hill International Edition. ISBN 0-256-24166-X, 2000.
[8] Alex Koh Jit-Beng, "Communication Optimization Techniques for Distributed Virtual Simulation", Master's Thesis, Nanyang Technological University, 2000.
[9] Federated Simulations Development Kit. http://www.cc.gatech.edu/computing/pads/fdk.html
[10] S.T. Bachinsky, L. Mellon, G.H. Tarbox and R. Fujimoto, "RTI 2.0 Architecture", Proceedings of the Spring Simulation Interoperability Workshop, Mar 1998.
[11] R. Fujimoto and P. Hoare, "HLA RTI Performance in High Speed LAN Environments", Proceedings of the Fall Simulation Interoperability Workshop, Sep 1998.
[12] Fast Messages. http://www-csag.ucsd.edu/projects/comm/fm.html
[13] W.R. Stevens, "UNIX Network Programming, Vol. 1: Networking APIs: Sockets and XTI", Prentice Hall, 1999.
[14] MFTP: Multicast File Transfer Protocol. http://stardust.com/ipmulticast/community/whitepaper/MFTP-IPMI.pdf
[15] V.P. Laviano and J.M. Pullen, "Selectively Reliable Transmission Protocol", IETF Internet-Draft. http://nac.gmu.edu/~vlaviano/draft-laviano-srtp-01.txt
[16] Patrick W. Dowd, Todd M. Carrozzi, Frank A. Pellegrino and Amy Xin Chen, "Native ATM Application Programmer Interface Testbed for Cluster-Based Computing", IPPS'96.
[17] J.M. Pullen, V.P. Laviano and M.O. Moreau, "Creating a Light-Weight RTI Using Selectively Reliable Transmission as an Evolution of Dual-Mode Multicast", Proceedings of the Fall Simulation Interoperability Workshop, Sep 1997.
[18] Nalin K. Sharda, "Multimedia Information Networking", Prentice Hall, 1999. ISBN 0-13-258773-4.
[19] ATM Forum Technical Committee, "LAN Emulation over ATM version 1.0", Jan 1995.
[20] M. Laubach, "Classical IP and ARP over ATM", RFC 1577, Network Working Group.
[21] FORE Systems, "ForeRunner ASX-200BX/ASX-200BXE ATM Switch User's Manual", Feb 1996.
[22] Alex Koh Jit-Beng, Francis Lee Bu-Sung, Cai Wen-Tong and Stephen J. Turner, "Multicast Fast Messages in RTI-Kit", Simulation Interoperability Workshop, Spring 2000, 00S-SIW-039, March 26-31, 2000, Orlando, Florida.