
ISCOM'97, Hsinchu, Taiwan, December 17-19

$\Delta$-Causality and $\varepsilon$-delivery in Wide-Area Group Communications

Takayuki Tachikawa and Makoto Takizawa

Dept. of Computers and Systems Engineering, Tokyo Denki University
Ishizaka, Hatoyama, Hiki-gun, Saitama 350-03, Japan
Email: ftachi, [email protected]

Abstract: In distributed applications, a group of multiple processes cooperate by exchanging multimedia messages. It is critical to support the group of application processes with sufficient quality of service (QoS) in addition to the ordered delivery of messages. The delay time and the message loss ratio are significant QoS parameters. In Internet applications, the delay time and the loss ratio differ significantly among communication channels. We define a novel causality named $\Delta^*$-causality among the messages, which holds in the world-wide environment. We discuss how to transmit messages to the destination processes and how to resolve message loss and delay while supporting the $\Delta^*$-causality, given the requirements on delay time and message loss ratio.

I. Introduction

In distributed applications like teleconferences, a group of multiple processes cooperate by exchanging multimedia data. Group communication protocols support a group of processes with the reliable and ordered delivery of messages to multiple destinations in the group. Transis [3], ISIS (CBCAST) [5], and others [2,9] support the causally ordered delivery. The group communication protocols discussed so far assume that every communication channel has almost the same delay time.

The FACE project [10] is now developing world-wide teleconferences among agents distributed in Japan, the USA, and the UK. Here, let us consider a world-wide teleconference among processes $K$, $U$, $S$, and $H$ in Keele in the UK, UCLA in the USA, and Sendai and Hatoyama in Japan, respectively. Over the Internet, it takes about 60 msec to propagate a message within Japan, while it takes about 240 msec between Japan and Europe. In addition, the longer the distance is, the more messages are lost. For example, more than 10% of the messages are lost between Japan and Europe, while less than 1% are lost within Japan. Thus, each communication channel between the processes supports a different delay time [6] and a different level of reliability in the wide-area group. If the traditional group communication protocols are applied to the wide-area group, the time for delivering messages to the destinations is dominated by the channel with the longest delay and the lowest level of reliability.

In realtime multimedia applications, messages have to be delivered within some predetermined time units. The $\Delta$-causality [1,4,11,12] among messages has been discussed, where $\Delta$ denotes the maximum delay time between the processes required by the application. That is, it is meaningless to receive a message $m$ unless $m$ is delivered within $\Delta$ after $m$ is transmitted. The $\Delta$-causality assumes that every pair of processes has the same delay time $\Delta$.

Each communication channel between processes $P_i$ and $P_j$ supports a different quality of service (QoS), i.e. delay time $\delta_{ij}$ and message loss ratio $\varepsilon_{ij}$. $\delta_{ij}$ and $\varepsilon_{ij}$ are time-variant. For example, $\delta_{ij}$ and $\varepsilon_{ij}$ increase if the communication channel between $P_i$ and $P_j$ is congested. On the other hand, the application requires the system to support some QoS. Here, let $\Delta_{ij}$ and $E_{ij}$ be the delay time and the message loss ratio required between $P_i$ and $P_j$, respectively. The problem is how to support each pair of $P_i$ and $P_j$ with $\Delta_{ij}$ and $E_{ij}$ given $\delta_{ij}$ and $\varepsilon_{ij}$ in the group. If messages are lost in the network and $E_{ij} < \varepsilon_{ij}$, some of the lost messages have to be retransmitted. Hence, the delay $\delta_{ij}$ is related to the message loss ratio $\varepsilon_{ij}$. In addition, the causality among messages has to be defined taking into account that some pairs $\Delta_{ij}$ and $\Delta_{kl}$ may be significantly different.

In this paper, we newly define the $\Delta^*$-causality. Then, we present a $\Delta^*$-causally ordered ($\Delta^*$CO) protocol which supports the $\Delta^*$-causality and can reduce the delay time to deliver messages and the number of messages retransmitted, by means of destination forwarding and replicated transmission. In section 2, we present a system model. In section 3, we discuss the $\Delta^*$-causality. In section 4, we discuss the $\Delta^*$CO protocol. We evaluate the protocol in section 5.

II. System Model

II. A. System layers

A distributed system is composed of three hierarchical layers, i.e. application, transport, and network layers, as shown in Fig. 1. A group of $n$ ($\ge 2$) application processes $AP_1$, ..., $AP_n$ cooperate. Each $AP_i$ communicates with the others in the group by using the underlying group communication service supported by transport processes $TP_1$, ..., $TP_n$. Here, let $G$ denote the group of the transport processes ($G = \{TP_1, ..., TP_n\}$) supporting $AP_1$, ..., $AP_n$. Data units transmitted at the transport layer and the application layer are referred to as messages and streams, respectively. The network layer is considered to support each pair of processes $TP_i$ and $TP_j$ with a logical channel supporting the IP service. Data units at the network layer are referred to as packets. The channels are less reliable and asynchronous.

Fig. 1: Distributed system (application layer: $AP_1$, ..., $AP_n$; transport layer: $TP_1$, ..., $TP_n$ running GC/GCM; network layer: Internet).

The cooperation of $TP_1$, ..., $TP_n$ is coordinated by group communication (GC) and group communication management (GCM) protocols. The GC protocol establishes a group $G$ among $TP_1$, ..., $TP_n$ and then reliably and causally [5] delivers messages to the destination processes in $G$. The GCM protocol is used for monitoring and managing the membership of $G$.

$AP_i$ requests $TP_i$ to send a stream $s$ to $AP_j$. $TP_i$ decomposes $s$ into messages and sends the messages to $TP_j$. $TP_j$ assembles the messages into a stream $s_j$ and delivers $s_j$ to $AP_j$. In some applications, even if some messages are lost, $AP_j$ can accept $s_j$ without retransmission of the lost messages. Here, the message loss ratio $\varepsilon_{ij}$ is defined to be $1 - |s_j| / |s|$, where $|x|$ is the number of messages in a stream $x$. $E_{ij}$ is the maximum loss ratio allowed by the application. $AP_j$ can accept $s_j$ if $E_{ij} > \varepsilon_{ij}$. In addition, $s$ is required to be delivered to $AP_j$ within $\Delta_{ij}$ time units. If $\delta_{ij} > \Delta_{ij}$, there is no point in $TP_j$ receiving $s$.

II. B. Network parameters

Every $TP_i$ has to know the delay time $\delta_{ij}$ and the loss ratio $\varepsilon_{ij}$ for each $TP_j$ in the group $G$. In the GCM protocol, $TP_i$ calculates $\delta_{ij}$ from the measured round trip time between $TP_i$ and $TP_j$. $TP_i$ periodically sends ICMP packets to all the processes in $G$. Here, $TP_j$ is nearer to $TP_i$ than $TP_k$ if $\delta_{ij} < \delta_{ik}$. In addition, the GCM protocol obtains $\varepsilon_{ij}$ by monitoring packets lost between each pair of $TP_i$ and $TP_j$. Here, we assume that $\delta_{ij} = \delta_{ji}$ and $\varepsilon_{ij} = \varepsilon_{ji}$. $\bar{\varepsilon}_{ij}$ and $\bar{\delta}_{ij}$ denote the averages of $\varepsilon_{ij}$ and $\delta_{ij}$, respectively. The quality of service (QoS) supported by the network is characterized by $\delta^* = \{\delta_{ij}\}$ and $\varepsilon^* = \{\varepsilon_{ij}\}$.

The network is regular if $\delta_{ij} + \delta_{jk} \ge \delta_{ik}$ and $(1 - \varepsilon_{ij}) \cdot (1 - \varepsilon_{jk}) \le (1 - \varepsilon_{ik})$ for every three processes $TP_i$, $TP_j$, and $TP_k$. In an irregular network, $TP_i$ may deliver a packet $m$ to $TP_k$ via $TP_j$ faster than by directly sending $m$ to $TP_k$, depending on the routing strategies.

We measured $\delta_{ij}$ and $\varepsilon_{ij}$ in the world-wide environment. First, the delay times among the processes $H$ in Hatoyama, $S$ in Sendai, $U$ in UCLA, and $K$ in Keele are measured. $H$ sends 5,000 `PING' packets to $S$, $U$, and $K$. Each destination process monitors the delay time of each packet, i.e. how long it takes each packet to arrive at the destination. Here, $R_{ij}(t)$ shows the ratio of the packets which take $t$ time units to arrive at the destination $TP_j$ from $TP_i$ to the total number of packets transmitted. Fig. 2 shows $R_{HS}(t)$, $R_{HU}(t)$, and $R_{HK}(t)$. The longer the distance is, the more the receipt ratio fluctuates. For example, 80% of the packets arrive at $S$ from $H$ within 50 msec after the fastest packet arrives. On the other hand, 80% of the packets arrive at $U$ and $K$ within 85 and 137 msec after the fastest packet arrives there, respectively. In addition, the packet loss ratios for $S$, $U$, and $K$ are measured as shown in Tab. 1. Only 0.9% of the packets are lost for $S$, but 8.3% and 11.7% are lost for $U$ and $K$, respectively. Tab. 1 shows that the longer the distance is, the more packets are lost.

Tab. 1: Delay [msec] and loss [%].

  host    min        avrg       max         lost
  S       30.437     60.427     756.263     0.9
  U       119.506    157.171    532.433     8.3
  K       164.497    241.370    2733.565    11.7
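The regularity condition defined above can be checked mechanically once a full matrix of averages is available. The following is a minimal sketch, not part of the protocol: the function name `is_regular` and the matrix layout are assumptions made for illustration, and since only the channels from $H$ were measured, the $S$-$U$ entries are hypothetical values.

```python
from itertools import permutations


def is_regular(delay, loss):
    """True if, for every ordered triple (i, j, k), the direct channel i-k is
    neither slower nor less reliable than the two-hop path i-j-k."""
    n = len(delay)
    for i, j, k in permutations(range(n), 3):
        two_hop_delay = delay[i][j] + delay[j][k]
        two_hop_success = (1 - loss[i][j]) * (1 - loss[j][k])
        if two_hop_delay < delay[i][k] or two_hop_success > 1 - loss[i][k]:
            return False
    return True


# Processes 0 = H, 1 = S, 2 = U. The H-S and H-U entries are the averages of
# Tab. 1; the S-U entries are hypothetical, since S-U was not measured.
delay = [[0.0, 60.427, 157.171],
         [60.427, 0.0, 150.0],
         [157.171, 150.0, 0.0]]
loss = [[0.0, 0.009, 0.083],
        [0.009, 0.0, 0.080],
        [0.083, 0.080, 0.0]]

regular = is_regular(delay, loss)

# A hypothetical slow S-U channel makes the detour via H faster than the
# direct path, so the network becomes irregular.
slow_delay = [row[:] for row in delay]
slow_delay[1][2] = slow_delay[2][1] = 250.0
irregular = not is_regular(slow_delay, loss)
```

With the hypothetical slow channel, the path from $S$ to $U$ via $H$ takes about 217.6 msec against 250 msec directly, which is the situation in which forwarding through an intermediate process pays off.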

Fig. 2: Packet receipt ratio vs. delay (horizontal axis: delay $t$ [msec], 0 to 500; vertical axis: receipt ratio $R(t)$, 0 to 0.05; curves for Sendai (S), UCLA (U), and Keele (K)).

III. $\Delta^*$-Causality

III. A. $\Delta$-causality

Messages sent in a group $G = \{TP_1, ..., TP_n\}$ have to be delivered in the causal order [5].

[Causal precedence relation] A message $m_1$ causally precedes $m_2$ ($m_1 \rightarrow m_2$) iff $m_1$ is sent before $m_2$ by a process, $m_2$ is sent after delivering $m_1$ by a process, or $m_1 \rightarrow m_3 \rightarrow m_2$ for some message $m_3$. $\Box$

In realtime multimedia applications, messages have to be delivered to the destinations by the deadline specified for the messages. Thus, a process $TP_j$ has to receive a message $m$ within $\Delta$ time units after $TP_i$ sends $m$ [1,4,12]. $\Delta$ denotes the maximum delay time between the processes in $G$, which is required by the application. Here, let $ts(m)$ be the time when $m$ is sent. Let $tr_i(m)$ be the time when $TP_i$ receives $m$. Suppose that $TP_i$ sends a message $m$ to $TP_j$. $m$ is referred to as received in $\Delta$ by $TP_j$ iff $ts(m) + \Delta \ge tr_j(m)$. The causality based on $\Delta$ [1] is defined as follows.

[$\Delta$-causality] For every pair of messages $m_1$ and $m_2$, $m_1$ $\Delta$-causally precedes $m_2$ ($m_1 \xrightarrow{\Delta} m_2$) iff (1) $m_1 \rightarrow m_2$ and (2) $ts(m_1) + \Delta \ge ts(m_2)$. $\Box$

In a group $K = \{TP_1, TP_2, TP_3\}$ as shown in Fig. 3, $TP_1$ sends a message $m_1$ to $TP_2$ and $TP_3$. $TP_2$ sends $m_2$ after receiving $m_1$ in $\Delta$ ($m_1 \xrightarrow{\Delta} m_2$). Then, $TP_2$ sends $m_3$ to $TP_3$. $TP_3$ receives $m_2$ in $\Delta$ after $m_2$ is sent but does not receive $m_1$ in $\Delta$. Hence, $TP_3$ delivers $m_2$ but not $m_1$.

III. B. $\Delta^*$-causality

A wide-area group $G = \{TP_1, ..., TP_n\}$ is a group where the delay times $\delta_{ij}$ and $\delta_{kl}$, and the loss ratios $\varepsilon_{ij}$ and $\varepsilon_{kl}$, differ for some pairs of processes.

Here, each $\Delta_{ij}$ is specified for every pair of $TP_i$ and $TP_j$ by the application. $\Delta_{ij}$ can be obtained based on the statistics of $\delta_{ij}$ and $\varepsilon_{ij}$ between $TP_i$ and $TP_j$. The shorter $\Delta_{ij}$ is, the more messages are lost between $TP_i$ and $TP_j$. That is, if $\Delta_{ij}$ is smaller, more messages take longer than $\Delta_{ij}$ to be delivered, and there is no time to resend the messages even if they are lost. One idea is to take $\Delta_{ij}$ to be longer than the average of $\delta_{ij}$, i.e. $\Delta_{ij} > \bar{\delta}_{ij}$. $\Delta_{ij} > \Delta_{kl}$ may hold if the distance between $TP_i$ and $TP_j$ is larger than that between $TP_k$ and $TP_l$. In addition, $\Delta_{ij}$ is related to the loss ratios $\varepsilon_{ij}$ and $E_{ij}$. If $E_{ij} > \varepsilon_{ij}$, some messages are allowed to be lost. Otherwise, some messages lost by $TP_j$ have to be retransmitted by $TP_i$. Here, let $\Delta^*$ be the set $\{\Delta_{ij} \mid i, j = 1, ..., n\}$ of the delay requirements.

Fig. 3: $\Delta$-causality (processes $TP_1$, $TP_2$, and $TP_3$ exchanging $m_1$, $m_2$, and $m_3$ along the time axis).

[$\Delta^*$-causality] Let $m_1$ and $m_2$ be messages sent by $TP_i$ and $TP_j$, respectively. $m_1$ $\Delta^*$-causally precedes $m_2$ ($m_1 \xrightarrow{\Delta^*} m_2$) iff (1) $m_1 \rightarrow m_2$ and (2) $ts(m_1) + \Delta_{ij} \ge ts(m_2)$. $\Box$

That is, $m_2$ is sent within $\Delta_{ij}$ time units after $m_1$ is sent. In Fig. 4, $TP_1$ sends a message $m_1$ to $TP_2$ and $TP_3$, and $TP_2$ sends $m_2$ to $TP_3$ after receiving $m_1$. Since $TP_3$ receives $m_2$ in $\Delta_{32}$, $TP_3$ delivers $m_2$. Then, $TP_3$ receives $m_1$. Since $TP_3$ receives $m_1$ in $\Delta_{31}$, $TP_3$ could deliver $m_1$. However, since $m_2$ is already delivered and $m_1 \xrightarrow{\Delta^*} m_2$, $TP_3$ cannot deliver $m_1$. If $TP_3$ instead waits for $m_1$ and delivers it first, $m_2$ cannot be delivered because $m_2$ is obligated to be delivered by $ts(m_2) + \Delta_{32}$. There is an inconsistency between $\Delta_{12}$ and $\Delta_{23}$. This example shows that $TP_i$ may not deliver $m$ even if $m$ is received in $\Delta_{ij}$. Thus, the $\Delta^*$-causality may be inconsistent if each $\Delta_{ij}$ is decided independently. If $\Delta_{ij} = \Delta_{kj}$ for every triplet of processes $TP_i$, $TP_j$, and $TP_k$, $\xrightarrow{\Delta^*}$ is consistent. In particular, the $\Delta$-causality $\xrightarrow{\Delta}$ is consistent because $\Delta_{ij} = \Delta$ for every pair of $TP_i$ and $TP_j$.

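Fig. 4: $\Delta^*$-causality (processes $TP_1$, $TP_2$, and $TP_3$; messages $m_1$ and $m_2$; bounds $\Delta_{21}$, $\Delta_{32}$, and $\Delta_{31}$ along the time axis).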
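The inconsistency illustrated by Fig. 4 can be reproduced with concrete numbers. The sketch below is only an illustration under assumed values: the bounds and timestamps are hypothetical, and the names `Msg`, `received_in_bound`, and `delta_star_precedes` are not from the paper. Whether $m_1 \rightarrow m_2$ holds is passed in as a flag, since deciding causal precedence is a separate matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    name: str
    sender: int     # index of the sending process
    sent_at: float  # ts(m) in msec


def received_in_bound(m, receiver, received_at, bound):
    """ts(m) + bound >= receipt time, with the bound of the (sender, receiver) pair."""
    return m.sent_at + bound[(m.sender, receiver)] >= received_at


def delta_star_precedes(m1, m2, causally_precedes, bound):
    """Causal precedence plus the timing condition ts(m1) + bound >= ts(m2)."""
    return causally_precedes and m1.sent_at + bound[(m1.sender, m2.sender)] >= m2.sent_at


# Hypothetical bounds in msec, chosen independently per pair of processes.
bound = {(1, 2): 100.0, (2, 3): 60.0, (1, 3): 300.0}
m1 = Msg("m1", sender=1, sent_at=0.0)
m2 = Msg("m2", sender=2, sent_at=50.0)  # sent by TP_2 after receiving m1

precedes = delta_star_precedes(m1, m2, True, bound)
m2_in_time = received_in_bound(m2, 3, received_at=80.0, bound=bound)
m1_in_time = received_in_bound(m1, 3, received_at=200.0, bound=bound)
# m2 has to be delivered by ts(m2) + bound, which is before m1 even arrives.
m2_deadline = m2.sent_at + bound[(m2.sender, 3)]
conflict = precedes and m1_in_time and m2_deadline < 200.0
```

Both messages arrive within their own bounds, yet $m_2$ has to be delivered at 110 msec while $m_1$ arrives only at 200 msec, so $TP_3$ cannot deliver both in the $\Delta^*$-causal order.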

In the wide-area group, the network may not be regular depending on the routing strategies. Here, $\Delta^*$ may be inconsistent if each $\Delta_{ij}$ is proportional to $\delta_{ij}$. That is, $\Delta_{ij} + \Delta_{jk} < \Delta_{ik}$ if $\delta_{ij} + \delta_{jk} < \delta_{ik}$ for some $TP_i$, $TP_j$, and $TP_k$. There are the following ways to resolve the inconsistency of the $\Delta^*$-causality: (1) to neglect messages which do not satisfy the $\Delta^*$-causality, and (2) to change some $\Delta_{ij}$ so that $\Delta^*$ is consistent.

IV. $\Delta^*$-Causally Ordered Protocol

IV. A. Transmission strategies

First, we discuss how to transmit messages to the destination processes so that the $\Delta^*$-causality is satisfied given the network environment $\delta^*$ and $\varepsilon^*$. Since some messages are lost in the communication channels, $TP_i$ detects that $TP_k$ loses $m_1$ if $TP_i$ receives no receipt confirmation of $m_1$ from $TP_k$ within some time units $t_{ik}$. As shown in Fig. 2, the delay time $\delta_{ik}$ is variant. $\varepsilon'_{ik} = 1 - \int_0^{t_{ik}} R_{ik}(t)\,dt$ gives the probability that a message sent by $TP_i$ does not arrive at $TP_k$ within $t_{ik}$ after it is sent by $TP_i$. Here, $\varepsilon'_{ik} \ge \varepsilon_{ik}$. If $m_1$ is detected to be lost, $TP_i$ resends $m_1$ to $TP_k$. The expected delay time $\hat{\delta}_{ik}$ is

$$\hat{\delta}_{ik} = (1 + 2\varepsilon'_{ik} + 2\varepsilon'^{2}_{ik} + \cdots) \cdot t_{ik} = t_{ik} \cdot (1 + \varepsilon'_{ik}) / (1 - \varepsilon'_{ik}).$$

If $t_{ik}$ is taken to be $\delta_{ik}$ and $\varepsilon'_{ik}$ is approximated by $\varepsilon_{ik}$, we obtain $\hat{\delta}_{ik} = \delta_{ik} \cdot (1 + \varepsilon_{ik}) / (1 - \varepsilon_{ik})$.

Here, let us consider three processes $TP_i$, $TP_j$, and $TP_k$ in $G$. $TP_i$ sends a message $m_1$ to $TP_j$ and $TP_k$. $TP_j$ sends $m_2$ to $TP_k$ after receiving $m_1$ from $TP_i$. One traditional way is to send $m_1$ directly to $TP_j$ and $TP_k$. In that case, $m_1$ may arrive at $TP_k$ after $m_2$. Unless $m_1$ arrives at $TP_k$ within $\Delta_{ik}$ after $m_1$ is sent, $m_1$ is not delivered, as presented in the preceding section. If $m_1$ is lost by $TP_k$, the constraint $\Delta_{ik}$ or $\Delta_{jk}$ might not hold when $TP_k$ receives the copy of $m_1$ which $TP_i$ retransmits. In order to deliver $m_1$ so as to satisfy the $\Delta^*$-causality, multiple replicas of $m_1$ can be transmitted by sender replication and destination replication.

In the sender replication, the sender $TP_i$ sends $r_{ik}$ ($\ge 1$) replicas of $m_1$ to $TP_k$. Since each replica is lost with probability $\varepsilon_{ik}$, the expected number of replicas which arrive at $TP_k$ is $(1 - \varepsilon_{ik}) \cdot r_{ik}$. If at least one of the replicas arrives at $TP_k$, $TP_k$ receives $m_1$. This is similar to the saturation protocol [7]. Hence, the expected minimum delay time $\delta^{min}_{ik}$ for $TP_k$ to receive $m_1$ is given by $\int_0^{\delta^{min}_{ik}} R_{ik}(t)\,dt = 1 / [(1 - \varepsilon_{ik}) \cdot r_{ik}]$. If $\delta^{min}_{ik} \le \Delta_{ik}$, $TP_k$ can accept $m_1$.

We measured how reliably the destination can receive a message $m$ when multiple replicas of $m$ are sent in the Internet environment presented in Fig. 2. Here, the process $H$ sends $r$ replicas of $m$ to $S$, $U$, and $K$. Tab. 2 shows the measured receipt ratio of messages for $r$ = 1, 2, 3.

Tab. 2: Receipt ratio by replication [%].

  r      1        2        3
  S      99.08    99.96    99.96
  U      89.74    99.16    99.84
  K      88.36    98.20    99.84

Next, suppose that $TP_j$ sends $m_2$ together with $m_1$ to $TP_k$ [Fig. 5]. $TP_k$ can receive two replicas of $m_1$, one from $TP_i$ and one from $TP_j$. Even if it takes longer than $\Delta_{ik}$ to deliver $m_1$ from $TP_i$ to $TP_k$, $TP_k$ receives another replica of $m_1$ sent by $TP_j$. If $m_2$ satisfies $\Delta_{jk}$, $TP_k$ can receive both $m_1$ and $m_2$. This is the destination replication. Here, $TP_j$ sends $TP_k$ a message $m_2$ together with messages which $TP_j$ has received and which are destined to $TP_k$.

Fig. 5: Destination replication (processes $TP_i$, $TP_j$, and $TP_k$; $m_1$ is forwarded with $m_2$; bounds $\Delta_{ij}$, $\Delta_{jk}$, and $\Delta_{ik}$ along the time axis).

Here, we define which messages $TP_j$ can forward with $m_2$ in the destination replication.

[Definition] A message $m_2$ subsumes $m_1$ in $TP_j$ with respect to $TP_k$ ($m_1 \prec_{jk} m_2$) iff (1) $m_1$ and $m_2$ are sent to a common destination $TP_k$, (2) $m_1 \xrightarrow{\Delta^*} m_2$, and (3) $TP_j$ does not send a message to $TP_k$ after receiving $m_1$ and before sending $m_2$. $\Box$

Let $\sigma_{jk}(m_2)$ be the set of messages $\{m_1 \mid m_1 \prec_{jk} m_2\}$. $TP_j$ can send $\sigma_{jk}(m_2)$ to $TP_k$ on sending $m_2$. If each process $TP_j$ forwards a message $m$ to the other destinations, $s \cdot (s - 1)$ replicas of $m$ are transmitted, where $s$ is the number of destinations of $m$. Hence, each $TP_j$ has to decide which messages $m_1$ in $\sigma_{jk}(m_2)$ are forwarded to $TP_k$ on sending $m_2$ to $TP_k$, in order to reduce the number of replicas transmitted. In our protocol, $TP_j$ sends $m_2$ with a replica of $m_1$ to $TP_k$ if the following condition holds:

(1) $ts(m_1) + \Delta_{ik} \ge ts(m_2) + \delta_{jk}$,
(2) $\delta_{ij} + \delta_{jk} \le \delta_{ik}$, and
(3) $(1 - \varepsilon_{ij}) \cdot (1 - \varepsilon_{jk}) \ge (1 - \varepsilon_{ik})$.

IV. B. Data transmission procedure

The messages have to be delivered to the destination processes in $G$ so as to satisfy the $\Delta^*$-causality and $E^*$ given the QoS $\delta^*$ and $\varepsilon^*$ supported by the network. The messages are causally ordered by using the vector clock [8]. Each $TP_i$ manipulates the variables $VC_1$, ..., $VC_n$ showing the vector clock to causally order the messages received. $TP_i$ increments $VC_i$ by one each time $TP_i$ sends a message $m$. $m$ carries the vector clock $m.VC = \langle m.VC_1, ..., m.VC_n \rangle$. Here, $m.VC_j = VC_j$ ($j = 1, ..., n$). On receipt of $m$ from $TP_j$, $VC_j := \max(VC_j, m.VC_j)$ ($j = 1, ..., n$). For every pair of messages $m_1$ and $m_2$, $m_1$ causally precedes $m_2$ iff $m_1.VC < m_2.VC$ [8].

Each message $m$ is sent to its destination processes, not necessarily to all the processes in $G$. $m.DP$ denotes the set of destination processes, i.e. $m.DP \subseteq G$. Since a gap between messages cannot be detected by the vector clock, message loss is detected by the sequence numbers of the messages. $m$ is given a vector $m.SEQ = \langle m.SEQ_1, ..., m.SEQ_n \rangle$ of sequence numbers. $TP_i$ manipulates variables $SEQ = \langle SEQ_1, ..., SEQ_n \rangle$. Each $SEQ_j$ denotes the sequence number of a message for $TP_j$. If $m$ is destined to $TP_j$, $SEQ_j$ is incremented by one. $TP_i$ has a variable $T_i$ showing the current time. $m$ carries $m.ST$, which shows when $m$ is sent, i.e. $m.ST := T_i$.

The messages received are stored in the buffer $RBUF$. The messages in $RBUF$ are causally ordered by using the vector clock. The messages sent are also stored in the buffer $SBUF$ in the sending order. $TP_i$ manipulates variables $REQ_1$, ..., $REQ_n$ to receive messages. $REQ_j$ shows a sequence number such that $TP_i$ has received every message $m$ from $TP_j$ where $m.SEQ_i \le REQ_j$ but has not received $m$ where $m.SEQ_i \ge REQ_j + 1$. On receipt of a message $m$ from $TP_j$, if $m.SEQ_i = REQ_j$, $TP_i$ has received every message from $TP_j$ whose $SEQ_i \le REQ_j$. Then, $REQ_j := REQ_j + 1$ and $MREQ_j := REQ_j$. $MREQ_j$ shows the maximum sequence number of the messages received from $TP_j$, i.e. $MREQ_j := \max(MREQ_j, m.SEQ_i)$. If $m.SEQ_i > REQ_j$, $TP_i$ finds a gap between $m'$ and $m$, where $m'.SEQ_i = REQ_j - 1$. While $MREQ_j > REQ_j$, $TP_i$ searches $RBUF$ for a message $m$ from $TP_j$ where $m.SEQ_i = REQ_j + 1$. If it is found, $REQ_j := REQ_j + 1$. This step is iterated until no such message is found. In order to notify the others of what messages $TP_i$ has received, each message $m$ sent by $TP_i$ carries the fields $m.ACK_1$, ..., $m.ACK_n$.
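The two mechanisms described above, causal comparison by vector clocks and gap detection by sequence numbers, can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the class `Receiver` and its field names are hypothetical, and it adopts the convention (an assumption) that `req` holds the next expected sequence number and that numbering starts at 1.

```python
def vc_less(a, b):
    """Vector-clock order: a <= b component-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


class Receiver:
    """Tracks in-sequence receipt at TP_i of the messages sent by one TP_j."""

    def __init__(self):
        self.req = 1           # next expected sequence number (assumed to start at 1)
        self.mreq = 0          # largest sequence number seen so far
        self.buffered = set()  # out-of-order sequence numbers held in RBUF

    def on_receive(self, seq):
        """Record `seq` and return the sequence numbers currently missing."""
        self.mreq = max(self.mreq, seq)
        self.buffered.add(seq)
        # Advance over every message that is now in sequence.
        while self.req in self.buffered:
            self.buffered.remove(self.req)
            self.req += 1
        return [s for s in range(self.req, self.mreq) if s not in self.buffered]


r = Receiver()
gap_after_1 = r.on_receive(1)  # in sequence, nothing missing
gap_after_3 = r.on_receive(3)  # message 2 is missing
gap_after_2 = r.on_receive(2)  # the gap is filled
```

A vector clock alone would not reveal the missing message 2 in this trace, because the clocks of messages 1 and 3 are still comparable; this is why the sequence numbers are carried in addition.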

Each time $TP_i$ sends $m$, $m.ACK_j := REQ_j$ ($j = 1, ..., n$). $TP_i$ manipulates variables $ACK_{11}$, ..., $ACK_{nn}$. $ACK_{kj} := m.ACK_j$ ($j = 1, ..., n$) on receipt of a message $m$ from $TP_k$. This shows that $TP_i$ knows that $TP_j$ has received every message $m$ sent by $TP_k$ where $m.SEQ_j \le ACK_{kj}$. If $m.SEQ_k \le ACK_{jk}$ for every $TP_j$ in $m.DP$, $TP_i$ knows that $m$ is received by every destination. Then $m$ can be removed from $SBUF$. When some message $m$ sent by $TP_k$ is not received by every destination, $TP_i$ considers that $TP_j$ loses $m$ if the current time $T_i$ is larger than $m.ST + 2\delta_{ij}$.

Here, suppose that $TP_i$ detects that $TP_j$ loses $m$. $m$ has to be retransmitted to $TP_j$. In the $\Delta^*$CO protocol, not only the sender but also a destination can retransmit $m$ to $TP_j$. We discuss which process retransmits $m$ to $TP_j$. Here, let $RP_i(m)$ ($\subseteq m.DP$) be the subset of the destination processes of $m$ which $TP_i$ knows to have received $m$. $RP_i(m)$ also includes the sender of $m$. For each destination $TP_k$ in $RP_i(m)$, $C_k = \delta_{ik} + \delta_{kj} (1 + \varepsilon_{kj}) / (1 - \varepsilon_{kj})$. For the source process $TP_h$ of $m$, $C_h = \delta_{hj} + \delta_{hj} (1 + \varepsilon_{hj}) / (1 - \varepsilon_{hj})$. $TP_i$ selects a process $TP_k$ in $RP_i(m)$ whose $C_k$ is the minimum and satisfies $m.ST + C_k \le \Delta_{hj}$. If $TP_i$ is selected, $TP_i$ forwards $m$ to $TP_j$. Otherwise, $TP_i$ does not send $m$ to $TP_j$.

Suppose that $TP_i$ sends data $D$ to the processes $TP_{i1}$, ..., $TP_{il}$ in $G$. $TP_i$ constructs the following message $m$:

    m.SP := TP_i;
    m.DP := [TP_i1, ..., TP_il];
    m.ST := T_i;
    VC_i := VC_i + 1;
    m.VC := VC;
    SEQ_j := SEQ_j + 1 for every TP_j in m.DP;
    m.SEQ := SEQ;
    m.ACK := REQ;
    m.DT := D;

Before sending $m$, $TP_i$ has to find the messages subsumed by $m$ in the receipt buffer $RBUF$. For each message $m'$ in $RBUF$, $TP_i$ forwards $m'$ to $TP_j$ if the following condition holds:

[Subsumption condition] (1) $TP_j \in m.DP \cap m'.DP$, (2) $T_i + \Delta_{ij} < m'.ST + \Delta_{kj}$, where $TP_k = m'.SP$, (3) $\delta_{ki} + \delta_{ij} \le \delta_{kj}$, and (4) $(1 - \varepsilon_{ki}) \cdot (1 - \varepsilon_{ij}) \ge (1 - \varepsilon_{kj})$.

$TP_i$ sends $r_{ij}$ ($\ge 1$) replicas of $m$ to $TP_j$ in order to surely deliver $m$ to $TP_j$.
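The choice of the retransmitting process by the cost $C_k$ can be sketched as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, the same cost formula is applied to every holder including the source, the deadline test is left to the caller, and all channel figures other than the row for $H$ (taken from Tab. 1) are invented for the example.

```python
def expected_delay(timeout, loss):
    """Closed form of timeout * (1 + 2*loss + 2*loss**2 + ...)."""
    return timeout * (1 + loss) / (1 - loss)


def select_retransmitter(i, j, holders, delay, loss):
    """Return (k, C_k) for the holder k with the minimum
    C_k = delay[i][k] + expected_delay(delay[k][j], loss[k][j])."""
    def cost(k):
        return delay[i][k] + expected_delay(delay[k][j], loss[k][j])
    best = min(holders, key=cost)
    return best, cost(best)


# 0 = H, 1 = S, 2 = U, 3 = K. Row H uses the averages of Tab. 1;
# every other entry is hypothetical.
delay = [[0.0, 60.427, 157.171, 241.370],
         [60.427, 0.0, 150.0, 250.0],
         [157.171, 150.0, 0.0, 80.0],
         [241.370, 250.0, 80.0, 0.0]]
loss = [[0.0, 0.009, 0.083, 0.117],
        [0.009, 0.0, 0.080, 0.120],
        [0.083, 0.080, 0.0, 0.020],
        [0.117, 0.120, 0.020, 0.0]]

# H detects that K lost m; H itself and U are known to hold m.
best, best_cost = select_retransmitter(0, 3, holders=[0, 2], delay=delay, loss=loss)
```

With these figures the cost through $U$ is about 240.4 msec against about 305.3 msec for a direct retransmission by $H$, so $H$ would leave the retransmission to $U$.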
The number $r_{ij}$ of the replicas is obtained from the following equation:

$$\int_0^{\Delta_{ij}} R_{ij}(t)\,dt = 1 / [(1 - \varepsilon_{ij}) \cdot r_{ij}].$$

Here, suppose that $TP_i$ receives a message $m$ from $TP_j$. $m$ is stored in $RBUF$ and $MREQ_j := \max(MREQ_j, m.SEQ_i)$. $TP_i$ manipulates the variables as follows:

    if (m.SEQ_i = REQ_j) {
        REQ_j := REQ_j + 1;
        ACK_jk := max(ACK_jk, m.ACK_k) (k = 1, ..., n);
        VC_k := max(VC_k, m.VC_k) (k = 1, ..., n);
    }

Then, if $MREQ_j \ge REQ_j$, $TP_i$ searches $RBUF$ for a message $m$ whose $SEQ_i = REQ_j$. If it is found, the procedure presented above is executed.

Each message $m$ in $RBUF$ is delivered if the following condition holds: $T_i = m.ST + \Delta_{ki}$, where $TP_k = m.SP$. Before delivering $m$, some messages causally preceding $m$ in $RBUF$ have to be delivered even if the condition discussed above does not hold for them. $DVC$ shows the vector clock of the message most recently delivered. That is, each message $m'$ in $RBUF$ is delivered if the following condition is satisfied:

(1) $m' \xrightarrow{\Delta^*} m$, i.e. $m'.VC < m.VC$, and
(2) $m.VC > DVC$.

If $m'$ is delivered, $DVC$ is changed as follows: $DVC_k := \max(DVC_k, m'.VC_k)$ for $k = 1, ..., n$. After delivering every message satisfying (1) and (2) in $RBUF$, $m$ is delivered if $m.VC > DVC$. Otherwise, $m$ is discarded.
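The delivery step can be sketched as follows. The sketch is a simplified reading, not the authors' code: messages are modelled as hypothetical `(name, vc)` pairs, the timing test $T_i = m.ST + \Delta_{ki}$ is assumed to have already fired for `m`, and a buffered predecessor is delivered only if its own vector clock is newer than `dvc`, which is an interpretation of condition (2).

```python
def vc_less(a, b):
    """Vector-clock order: a <= b component-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def deliver(m, rbuf, dvc):
    """Deliver the buffered causal predecessors of m, then m itself,
    skipping anything that is not newer than dvc.

    Returns the delivered names in order and the updated dvc.
    """
    delivered = []
    # Sorting by the component sum yields a linear extension of the causal order.
    preds = sorted((p for p in rbuf if vc_less(p[1], m[1])), key=lambda p: sum(p[1]))
    for name, vc in preds + [m]:
        if vc_less(dvc, vc):
            delivered.append(name)
            dvc = tuple(max(d, v) for d, v in zip(dvc, vc))
    return delivered, dvc


m1 = ("m1", (1, 0, 0))
m2 = ("m2", (1, 1, 0))

# m1 is already buffered when m2 becomes due: both are delivered in causal order.
in_order, dvc_a = deliver(m2, [m1], (0, 0, 0))

# m2 was delivered before m1 arrived (the case of Fig. 4): m1 is discarded.
first, dvc_b = deliver(m2, [], (0, 0, 0))
late, dvc_c = deliver(m1, [], dvc_b)
```

The second trace corresponds to the rule that a message which arrives after one of its causal successors has been delivered is discarded rather than delivered out of order.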


V. Evaluations

We evaluate the $\Delta^*$CO protocol in a wide-area group $G$ of processes $TP_1$, ..., $TP_n$ in terms of the number of messages to be rejected. The $\Delta^*$CO protocol is implemented on UNIX workstations. Every $TP_i$ gets the statistics of the delay time $\delta_{ij}$ and the loss ratio $\varepsilon_{ij}$ for each $TP_j$ by using the GCM protocol, and reports them to the application process $AP_i$. $AP_i$ decides $\Delta_{ij}$ by using the statistics of $\delta_{ij}$ and $\varepsilon_{ij}$. One way to obtain $\Delta_{ij}$ is to add some constant $c_i$ to the average $\bar{\delta}_{ij}$. Another way is to take as $\Delta_{ij}$ the time $t$ within which a message is received with a probability of $p$ percent. For example, let $p$ be 70%. From Fig. 2, 70% of the messages sent by $U$ can be received by $H$ within 168 msec. Hence, $\Delta_{HU}$ = 168 msec. 70% of the messages sent by $K$ can be received within 320 msec. Hence, $\Delta_{HK}$ = 320 msec. On the other hand, the average delay time between $H$ and $S$ is about 60 msec, where only 0.9% of the messages are lost. Here, let $\Delta_{HS}$ be 90 msec, which is 50% larger than 60 msec, i.e. $c_i$ = 30 msec.

First, suppose that every $\Delta_{ij}$ is the same constant $\Delta$. Here, the minimum, average, and maximum values of $\Delta$ are 90, 168, and 320 msec, respectively. Tab. 3 shows the receipt ratios of the messages sent to $H$ from $S$, $U$, and $K$.

Next, we consider how many messages are rejected in order to satisfy the $\Delta^*$-causality. Fig. 6 shows how $m_1$ and $m_2$ such that $m_1 \xrightarrow{\Delta^*} m_2$ are received by $H$. We assume $\delta_{HK} \simeq \delta_{HU} + \delta_{UK}$ since there is a routing path from $H$ via $U$ to $K$. In Fig. 6, we also assume that $U$ and $S$ send $m_2$ on receipt of $m_1$. In (1) and (2), $m_1$ is sent by $K$. In (3), $U$ sends $m_1$. In (1), $U$ sends $m_2$ on receiving $m_1$, while $S$ sends $m_2$ in (2) and (3). Without destination replication, the reject ratios of the messages received are 6.9% in (1), 16.6% in (2), and 16.7% in (3).

Tab. 3: Message receipt ratio [%] for $\Delta$.

             $\Delta$ [msec]    S       U       K
  minimum    90                 60.5    0.0     0.0
  average    168                94.5    70.0    7.6
  maximum    320                98.9    88.9    70.0
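The second way of choosing $\Delta_{ij}$, as the time within which $p$ percent of the messages arrive, amounts to taking a quantile of the measured delays while counting lost messages as never arriving. The sketch below illustrates this under assumptions: the function name `delay_bound` is hypothetical and the delay samples are invented, not the measurements behind Fig. 2.

```python
def delay_bound(samples, sent, percent):
    """Smallest delay t such that at least `percent` % of the `sent` packets
    arrive within t; lost packets are simply absent from `samples`.

    Returns None when too few packets arrived for any bound to reach the target.
    """
    needed = (percent * sent + 99) // 100  # integer ceiling of percent * sent / 100
    if needed > len(samples):
        return None
    if needed == 0:
        return 0.0
    return sorted(samples)[needed - 1]


# Hypothetical measurement: 10 packets sent, 8 arrived with these delays [msec].
arrived = [120.0, 130.0, 125.0, 160.0, 150.0, 140.0, 170.0, 300.0]

bound_70 = delay_bound(arrived, sent=10, percent=70)
bound_90 = delay_bound(arrived, sent=10, percent=90)
```

In this invented sample a bound of 170 msec covers 70% of the packets sent, while no finite bound reaches 90% because two of the ten packets were lost; this mirrors the entries of Tab. 3 in which a loss-prone channel stays well below 100% even for the largest $\Delta$.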

Fig. 6: Inconsistent $\Delta^*$-causality. Panels (1) to (3) show processes $K$, $U$, $H$, and $S$ with messages $m_1$ and $m_2$ along the time axis; the bounds are (1) $\Delta_{HU}$ and $\Delta_{HK}$, (2) $\Delta_{HK}$ and $\Delta_{HS}$, and (3) $\Delta_{HS}$ and $\Delta_{HU}$.

VI. Concluding Remarks

We have discussed wide-area group communication, which involves multiple processes interconnected by the Internet. Here, each logical channel between the processes in the group has a different delay time. In this paper, we have proposed the $\Delta^*$-causality, which holds in the wide-area group, and have presented the $\Delta^*$CO protocol for supporting the $\Delta^*$-causality given the delay $\delta^*$ and the message loss ratio $\varepsilon^*$ of the network. We have measured the performance of the protocol in the wide-area group.

Acknowledgements

We would like to thank Prof. Lewis A. Davis, Tokyo Denki Univ., for his useful help and encouragement in writing this paper. We also would like to thank Prof. N. Shiratori, Tohoku Univ., Prof. M. T. Liu, OSU, Prof. M. Gerla, UCLA, and Prof. S. M. Deen, Keele Univ., for their cooperation in the Internet experiments.

References

[1] Adelstein, F. and Singhal, M., "Real-Time Causal Message Ordering in Multimedia Systems," Proc. of IEEE ICDCS-15, pp.36-43, 1995.
[2] Aiello, R., Pagani, E., and Rossi, G. P., "Causal Ordering in Reliable Group Communications," Proc. of ACM SIGCOMM'93, pp.106-115, 1993.
[3] Amir, Y., Dolev, D., Kramer, S., and Malki, D., "Transis: A Communication Sub-System for High Availability," Proc. of IEEE FTCS-22, pp.76-84, 1993.
[4] Baldoni, R., Mostefaoui, A., and Raynal, M., "Efficient Causally Ordered Communications for Multimedia Real-Time Applications," Proc. of IEEE HPDC-4, pp.140-147, 1995.
[5] Birman, K., Schiper, A., and Stephenson, P., "Lightweight Causal and Atomic Group Multicast," ACM Trans. Computer Systems, Vol.9, No.3, pp.272-314, 1991.
[6] Hofmann, M., Braun, T., and Carle, G., "Multicast Communication in Large Scale Networks," Third IEEE Workshop on High Performance Communication Subsystems (HPCS), 1995.
[7] Jones, M., Sorensen, S., and Wilbur, S., "Protocol Design for Large Group Multicasting: The Message Distribution Protocol," Computer Communications, Vol.14, No.5, pp.287-297, 1991.
[8] Mattern, F., "Virtual Time and Global States of Distributed Systems," Parallel and Distributed Algorithms (Cosnard, M. and Quinton, P. eds.), North-Holland, pp.215-226, 1989.
[9] Nakamura, A. and Takizawa, M., "Causally Ordering Broadcast Protocol," Proc. of IEEE ICDCS-14, pp.48-55, 1994.
[10] Shiratori, N., Sugawara, K., Kinoshita, T., and Chakraborty, G., "Flexible Networks: Basic Concepts and Architecture," IEICE Trans. Commun., Vol.E77-B, No.11, pp.1287-1294, 1994.
[11] Tachikawa, T. and Takizawa, M., "Multimedia Intra-Group Communication Protocol," Proc. of IEEE HPDC-4, pp.180-187, 1995.
[12] Yavatkar, R., "MCP: A Protocol for Coordination and Temporal Synchronization in Multimedia Collaborative Applications," Proc. of IEEE ICDCS-12, pp.606-613, 1992.