12th 2008 IEEE/ACM International Symposium on Distributed Simulation and Real-Time Applications

Effects of Group Synchronization Control in Networked Virtual Environments with Avatars

Kazuki Hosoya, Yutaka Ishibashi, Shinji Sugawara
Department of Scientific and Engineering Simulation, Nagoya Institute of Technology, Nagoya 466-8555, Japan

Kostas E. Psannis
Department of Technology Management, University of Macedonia, Thessaloniki 540 06, Greece

Abstract

By subjective assessment, this paper investigates the effects of group (or inter-destination) synchronization control in networked virtual environments where users converse with each other by using avatars constructed by computer graphics (CG) and live voices. Assessment results show that fairness among the users can be kept high under the group synchronization control in networked competitive work. We also demonstrate that the synchronization quality of networked collaborative work can be improved under the group synchronization control. In addition, we carry out an experiment in which we use videos instead of avatars, and we examine the difference between the results of the avatar case and those of the video case.

1. Introduction

In networked virtual environments which transmit voices and either videos or information about the movements of avatars constructed by computer graphics (CG), users can perform various kinds of competitive and collaborative work [1], [2]. However, when we transmit the media streams over a network which does not guarantee QoS (Quality of Service), the temporal relationships of the media streams among multiple terminals may be disturbed owing to network latency and jitter. This is a main factor which degrades the output quality of the media streams. Moreover, the interactivity and fairness may be impaired in networked competitive work and networked collaborative work if the output timings differ among the terminals [3], [4].

In [3], the authors investigate, by subjective assessment, the influence of network latency on the perception of fairness in a name-guessing task like fastest fingers first as networked competitive work. They show that the fairness among users is damaged by the difference in output timing among the users. To solve such a problem, we need to carry out group (or inter-destination) synchronization control [5]-[8], which synchronizes the output timing among multiple terminals. In [5], [6], and [7], the authors demonstrate the effects of group synchronization control by objective assessment when a single media source multicasts video and voice to two media destinations. However, they handle only one-way communication from the media source to the media destinations. In [8], the authors deal with a networked shooting game in which two players fight with each other and transmit information about shots and movements of fighters through a server, and they illustrate the effect of group synchronization control in duplex communication by objective assessment. However, they do not examine how the group synchronization control affects the user-level QoS (i.e., QoE: Quality of Experience), since subjective assessment has not been carried out.

This paper subjectively assesses the effects of group synchronization control in the case where we handle avatars in duplex communication. As competitive work, we deal with a name-guessing task like fastest fingers first, which is employed in [3]; as collaborative work, we deal with rock-paper-scissors played over a network (called networked rock-paper-scissors in this paper) [4]. We also investigate the effects of the group synchronization control in the case where we handle videos, which are more widely used, instead of avatars. Furthermore, we compare the subjective assessment results between the avatar case and the video case in order to investigate how the difference between the two cases influences the results.

The rest of this paper is organized as follows. Section 2 explains the group synchronization control, and Section 3 describes the two types of work handled in this paper. We explain our experimental system and the subjective assessment method in Section 4. Then, assessment results are presented in Section 5. We compare the assessment results between the avatar case and the video case in Section 6. Section 7 concludes the paper.

1550-6525/08 $25.00 © 2008 IEEE DOI 10.1109/DS-RT.2008.12


2. GROUP SYNCHRONIZATION CONTROL

The group synchronization control adjusts the output timing of media units (MUs), each of which is the information unit for media synchronization, among multiple destinations. Each destination notifies the other destinations of control information about the output timing of MUs (i.e., the target output time [5]). The target output time, which denotes the time at which an MU should be output, is set to the generation time of the MU plus ∆ (≥ 0) seconds. The control selects the reference output timing [7] of MUs, which represents the start time of output or the modification time of the target output time [5], from among the output timings included in the notifications; in this paper, it chooses the latest output timing as the reference one. Moreover, the value of ∆ is dynamically changed between ∆L and ∆H (∆L ≤ ∆H) according to the network latency (see footnote 1). The control tries to achieve group synchronization by using the same value of ∆ among different destinations.

For group synchronization control, the master-slave destination scheme [5], the synchronization maestro scheme [6], and the distributed control scheme [7] have been proposed so far. In this paper, we adopt the distributed control scheme because it suits our experimental system, which is based on a peer-to-peer (P2P) model [7], better than the others. In the distributed control scheme, each destination transmits information about its own output timing of MUs to all the other destinations. Thus, each destination knows the output timings of all the other destinations. It selects the reference output timing by using the information received from them; as described earlier, in this paper it selects the latest output timing from among those of all the destinations. It achieves group synchronization by making its output timing approach the reference output timing.

3. WORK DESCRIPTIONS

This paper deals with two types of work in which three users communicate with each other via a network by using voices and avatars or videos. One is a name-guessing task like fastest fingers first (we have enhanced the name-guessing task in ITU-T P.920 [9]). The other is networked rock-paper-scissors. In the name-guessing task like fastest fingers first, the users compete with each other. On the other hand, the users collaborate with each other in the networked rock-paper-scissors. In what follows, we explain the two types of work.

3.1. Name-Guessing Task Like Fastest Fingers First

One of the three users acts as an examiner (avatar 1) of questions, and the others behave as questioners (avatars 2 and 3) in this work. At first, the examiner selects a genre of a question (e.g., fruit) from among a list of genres and determines a keyword (e.g., apple) which is closely related to the selected genre. Next, the examiner tells the genre to the two questioners (for example, he/she says to them, “The genre is fruit.” See Fig. 1 (a)). The questioners guess the keyword and then say “I’ve got it!” while raising their hands (see Fig. 1 (b)). They are instructed to answer after hearing the end of the examiner’s sentence. The examiner orally gives the right to answer to the questioner who has responded earlier than the other, while pointing to that questioner’s image on the screen (see Fig. 1 (c)). Then, the questioner tells his/her guessed keyword to the examiner (for example, he/she says, “Is it an apple?”). If the questioner’s keyword coincides with the examiner’s one, the examiner says “That’s right!” and then selects a new genre. Otherwise, he/she makes the questioners guess the keyword again. The users can raise their avatars’ hands by hitting keyboards.

Figure 1. Displayed images in name-guessing task like fastest fingers first (avatar case). Panels: (a) The examiner tells a genre of question. (b) A questioner raises his/her avatar's hand. (c) The examiner gives one of the questioners the right to speak. Labels in the images: examiner avatar and questioner avatars; a questioner who has raised his/her avatar's hand faster than his/her partner can speak.

3.2. Networked Rock-Paper-Scissors

In this work, there are one caller and two receivers. The caller's role is to say “Rock, paper, scissors, go!” The caller and the receivers try to show rock, paper, or scissors at the same time (see Fig. 2).

Figure 2. Displayed images in networked rock-paper-scissors (avatar case). Labels in the images: caller avatar and two receiver avatars, showing rock, scissors, and paper.

Footnote 1. The maximum allowable delay ∆H for group synchronization may greatly influence the interactivity and the group synchronization quality. However, this influence has not been examined sufficiently; clarifying it is left for further study.

4. METHOD OF EXPERIMENT

4.1. Experimental System

As shown in Fig. 3, terminal 1 (CPU: Pentium4 processor at 2.4 GHz, RAM: 512 Mbytes) and terminals 2 and 3 (CPU: Pentium4 processor at 2.8 GHz, RAM: 512 Mbytes) are connected to each other via a network emulator (NIST Net [10]) by using 100BASE-T cables. Each terminal has a headset with a microphone. The terminal captures voice samples (PCM (Pulse Code Modulation), the average bit rate: 64 kbps) every 20 ms, and it transmits the samples to the other terminals as a voice MU. Moreover, it inputs information about the position and the direction of an avatar’s hand, and it transmits the information to the others as a CG MU. Each MU includes a timestamp, which denotes the generation time of the MU. The MU is transferred by UDP.

Figure 3. Configuration of experimental system (avatar case). Labels in the diagram: terminals, avatars, headsets, additional delays, and the network emulator (NIST Net).

NIST Net is employed to generate an additional delay for each MU transmitted in both directions between two terminals according to the Pareto-normal distribution [10]. As shown in Fig. 3, the additional delay between terminals 1 and 2 is called additional delay 1 in this paper, that between terminals 1 and 3 additional delay 2, and that between terminals 2 and 3 additional delay 3. In the experiment, additional delay 3 is set to 0 ms (see footnote 2), and the averages of additional delays 1 and 2 are changed from 50 ms to 250 ms at intervals of 50 ms. The standard deviation is set to 10 ms. Moreover, the maximum value of ∆ (∆H) for group synchronization is 200 ms, and the minimum value of ∆ (∆L) is 50 ms. The values of ∆H and ∆L were determined by a preliminary experiment so that the user-level QoS of interactivity is kept high.

In subjective assessment, we deal with two cases according to whether the group synchronization control is carried out or not. In both cases, intra-stream and inter-stream synchronization control is performed. We use the Virtual-Time Rendering (VTR) algorithm [8] for the intra-stream and inter-stream synchronization control. The algorithm dynamically changes the buffering time of MUs according to the network load.

4.2. Subjective Assessment Method

For subjective assessment, we have enhanced the double-stimulus method in ITU-R BT.500-11 [11]. The subjects are twenty persons (men and women) whose ages are between 21 and 24.

In the name-guessing task like fastest fingers first, each subject first converses for 40 seconds with no additional delay and then for the same duration with additional delays. He/she gives a score based on Table 1 in terms of fairness of the chance of speaking and based on Table 2 in terms of interactivity and comprehensive quality. The comprehensive quality is a synthesis of the fairness and interactivity. At first, each subject acts as the examiner at terminal 1 or as one of the questioners at terminal 2. Next, the subjects swap their roles with each other and assess the fairness, interactivity, and comprehensive quality again. One of the authors serves as the other questioner at terminal 3.

In the networked rock-paper-scissors, each subject first converses for 10 seconds with no additional delay and then for the same duration with additional delays. He/she gives a score based on Table 2 in terms of group synchronization quality, interactivity, and comprehensive quality. The comprehensive quality is a synthesis of the group synchronization quality and interactivity. At first, each subject acts as the caller at terminal 1 or as one of the receivers at terminal 2. Next, the subjects swap their roles with each other and assess the group synchronization quality, interactivity, and comprehensive quality again. One of the authors serves as the other receiver at terminal 3.

It should be noted that fairness is important in the name-guessing task like fastest fingers first because it is competitive work; on the other hand, since the networked rock-paper-scissors is collaborative work, we need to pay attention to the quality of group synchronization rather than to fairness.

Table 1. Five-grade scale of fairness.

Score   Description
5       Fair
4       On the fair side
3       Neither fair nor unfair
2       On the unfair side
1       Unfair

Table 2. Five-grade impairment scale for deterioration owing to network latency.

Score   Description
5       Imperceptible
4       Perceptible, but not annoying
3       Slightly annoying
2       Annoying
1       Very annoying

Footnote 2. We are now investigating the influence of additional delay 3 on the user-level QoS. We will report the assessment results in another paper.
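The delay settings of Subsection 4.1 can be enumerated with a short sketch. It lists every pairing of the five averages (50 ms to 250 ms at intervals of 50 ms) for additional delays 1 and 2, together with the average of additional delay 2 minus that of additional delay 1, which is the quantity plotted on the horizontal axis of the result figures. Listing all pairings is an assumption made for illustration; the names `AVERAGE_DELAYS_MS` and `delay_conditions` are hypothetical.

```python
from itertools import product

# Averages of additional delays 1 and 2: 50 ms to 250 ms at intervals of 50 ms.
AVERAGE_DELAYS_MS = list(range(50, 251, 50))


def delay_conditions():
    """Yield (avg_delay_1, avg_delay_2, difference) in milliseconds.

    difference = average of additional delay 2 - average of additional delay 1.
    Assumption: every pairing of the two averages is enumerated.
    """
    for d1, d2 in product(AVERAGE_DELAYS_MS, repeat=2):
        yield d1, d2, d2 - d1


conditions = list(delay_conditions())
differences = sorted({diff for _, _, diff in conditions})
print(len(conditions), differences)
```

The set of differences obtained in this way spans the same range as the horizontal axes of the figures in Section 5.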



5. ASSESSMENT RESULTS

5.1. Name-Guessing Task Like Fastest Fingers First

We show the MOS (Mean Opinion Score [11]) values of the fairness, interactivity, and comprehensive quality of the examiner as a function of the average of additional delay 2 minus the average of additional delay 1 (called the difference in network latency here) in Figs. 4, 5, and 6, respectively. The figures display the 95% confidence intervals (as do the following figures). From Fig. 4, we find that the MOS value is always high in the case where the group synchronization control is carried out. On the other hand, in the case where no group synchronization control is exerted, the MOS value becomes smaller as the absolute value of the difference in network latency increases, and it is smaller than 3 when the absolute value is larger than approximately 100 ms.

In Fig. 5, the difference in network latency hardly influences the MOS value regardless of whether the group synchronization control is exerted. The interactivity may deteriorate if the absolute value of the difference in network latency becomes larger than 200 ms; in other types of work, the MOS value may deteriorate even when the additional delay is less than about 250 ms. Investigating this is left for further study.

Figure 6 has a similar tendency to Fig. 4. This is because the influence of the group synchronization control on the interactivity is very small. We find in Fig. 6 that the MOS value of the comprehensive quality is always kept high by the group synchronization control. In contrast, the MOS value in the case where no group synchronization control is performed deteriorates when the absolute value of the difference in network latency is large. Furthermore, the MOS values of the questioner (see Figs. 7, 8, and 9) are almost the same as those of the examiner (see Figs. 4, 5, and 6, respectively).
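As an illustration of how a MOS value and its 95% confidence interval can be obtained from the scores of twenty subjects, a minimal sketch follows. The paper does not state how the intervals were computed; the use of Student's t distribution with 19 degrees of freedom is an assumption, and the scores below are hypothetical, not data from the experiment.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 19 degrees of freedom
# (twenty subjects). Assumption: the intervals are t-based.
T_CRIT_95_DF19 = 2.093


def mos_with_ci(scores, t_crit=T_CRIT_95_DF19):
    """Return (MOS, half-width of the 95% confidence interval)."""
    n = len(scores)
    mos = mean(scores)
    half_width = t_crit * stdev(scores) / sqrt(n)
    return mos, half_width


# Hypothetical five-grade scores from twenty subjects (not from the paper).
example_scores = [5, 4, 4, 5, 3, 4, 5, 4, 4, 3, 5, 4, 4, 5, 4, 3, 4, 5, 4, 4]
mos, half_width = mos_with_ci(example_scores)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

For a different number of subjects, the critical value passed as `t_crit` would have to be changed accordingly.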



Figure 4. MOS of fairness of examiner versus difference in network latency (avatar case).

Figure 5. MOS of interactivity of examiner versus difference in network latency (avatar case).

Figure 6. MOS of comprehensive quality of examiner versus difference in network latency (avatar case).

Figure 7. MOS of fairness of questioner versus difference in network latency (avatar case).

[Plots for Figs. 4 to 7. Vertical axis: MOS (1 to 5). Horizontal axis: difference in network latency [ms] (-200 to 200). Curves: group synchronization control and no group synchronization control, each for averages of additional delay 1 of 50 ms, 150 ms, and 250 ms, with 95% confidence intervals.]

Figure 8. MOS of interactivity of questioner versus difference in network latency (avatar case).

Figure 9. MOS of comprehensive quality of questioner versus difference in network latency (avatar case).

[Plots for Figs. 8 and 9. Vertical axis: MOS (1 to 5). Horizontal axis: difference in network latency [ms] (-200 to 200). Curves: group synchronization control and no group synchronization control, each for averages of additional delay 1 of 50 ms, 150 ms, and 250 ms, with 95% confidence intervals.]

5.2. Networked Rock-Paper-Scissors

1. Caller

We show the MOS values of the group synchronization quality, interactivity, and comprehensive quality of the caller versus the difference in network latency in Figs. 10, 11, and 12, respectively.

Figure 10. MOS of group synchronization quality of caller versus difference in network latency (avatar case).

Figure 11. MOS of interactivity of caller versus difference in network latency (avatar case).

Figure 12. MOS of comprehensive quality of caller versus difference in network latency (avatar case).

[Plots for Figs. 10 to 12: same axes and legend as Figs. 8 and 9.]

In Fig. 10, when the difference in network latency is positive, the MOS values become smaller as the difference becomes larger, regardless of whether the group synchronization control is performed. When the difference is negative, the MOS values hardly depend on it. The reason is as follows. We instructed the subjects to judge the group synchronization quality by the difference between the time at which an avatar had moved first and the time at which another avatar had moved last. Therefore, the MOS value depends on the larger of additional delays 1 and 2. Note that additional delay 1 is larger than additional delay 2 when the difference in network latency is negative. Thus, the MOS value is almost constant and depends only on additional delay 1.

From Fig. 10, we further see that the MOS value in the case where the group synchronization control is exerted is higher than that in the case where no group synchronization control is exerted. The reason is that the output timing of MUs which are generated at each terminal is delayed at that terminal by the group synchronization control. That is, the difference between the time at which an avatar has moved first and the time at which another avatar has moved last is smaller under the group synchronization control than without it, so the MOS value in the former case becomes higher than that in the latter case. In Figs. 11 and 12, the MOS values have almost the same tendency as those in Fig. 10.
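The observation that the caller's score for group synchronization quality depends on the larger of additional delays 1 and 2 can be illustrated with a short sketch. It only evaluates which average delay dominates for a given difference in network latency; it does not model the MOS itself. The function name `dominant_delay_ms` is hypothetical.

```python
def dominant_delay_ms(avg_delay_1_ms, difference_ms):
    """Return the larger of the two average additional delays.

    difference_ms is the average of additional delay 2 minus the average of
    additional delay 1 (the difference in network latency).
    """
    avg_delay_2_ms = avg_delay_1_ms + difference_ms
    return max(avg_delay_1_ms, avg_delay_2_ms)


# Average of additional delay 1 fixed at 150 ms, difference from -100 to 100 ms.
for diff in range(-100, 101, 50):
    print(diff, dominant_delay_ms(150, diff))
```

For negative differences the dominant delay stays at additional delay 1, which is consistent with the almost constant MOS values on that side of the plots; for positive differences it grows with the difference.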

2. Receiver

We show the MOS values of the group synchronization quality, interactivity, and comprehensive quality of the receiver as a function of the difference in network latency in Figs. 13, 14, and 15, respectively. In Fig. 13, we see that the MOS value in the case where the group synchronization control is exerted is smaller than that in the case where no group synchronization control is carried out when the average of additional delay 1 is 150 ms or 250 ms. The reason is as follows. In the case where no group synchronization control is performed, the avatar of the other receiver first moves at the subject’s terminal no later than the subject’s own avatar. Next, the subject’s avatar and the caller’s avatar start to move at the subject’s terminal at almost the same time when the subject hears the call. The subject hardly perceives a deterioration when the others’ avatars move earlier than the subject’s avatar. On the other hand, in the case where the group synchronization control is performed, the caller’s avatar moves at the subject’s terminal immediately after the call is heard, and the subject’s avatar and the other receiver’s avatar then start to move ∆ seconds after the caller’s avatar moved at the receiver’s terminal.

Figure 13 also reveals that the MOS value decreases as the additional delays become larger. This is because the value of ∆ is changed according to the additional delays. However, the MOS value is about 3 even in the worst case where the group synchronization control is exerted; this is not a large deterioration compared with the deterioration of the MOS value of the caller.

From Fig. 14, we find a similar tendency to Fig. 13. However, when the group synchronization control is exerted, the MOS values in Fig. 14 are higher than those in Fig. 13. The reason is as follows. The avatar of the caller moves earlier than the other avatars at the receiver's terminal, and the avatars of the two receivers start to move at the same time under the group synchronization control. The interactivity remains high because no avatar moves later than the subject’s avatar. In Fig. 15, the MOS value of the comprehensive quality is almost the same as that of the group synchronization quality (Fig. 13).
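The role of ∆ in this explanation follows from the control described in Section 2: the target output time is the generation time of an MU plus ∆, the latest notified output timing is taken as the reference, and ∆ is kept between ∆L and ∆H. A minimal sketch under stated assumptions follows. It assumes that each destination expresses its output timing as the delay it currently applies, and it jumps to the reference at once, whereas the control in this paper makes the output timing approach the reference. All function names and the notified values are hypothetical.

```python
DELTA_L_MS = 50.0   # minimum value of delta in the experiment (Subsection 4.1)
DELTA_H_MS = 200.0  # maximum value of delta in the experiment (Subsection 4.1)


def clamp_delta(delta_ms):
    """Keep delta within [DELTA_L_MS, DELTA_H_MS]."""
    return min(max(delta_ms, DELTA_L_MS), DELTA_H_MS)


def reference_delta(notified_deltas_ms):
    """Pick the latest (largest) notified output timing as the reference.

    Assumption: each destination reports its output timing as the delay
    delta it currently applies to the generation time of an MU.
    """
    return clamp_delta(max(notified_deltas_ms))


def target_output_time(generation_time_ms, delta_ms):
    """Target output time = generation time of the MU + delta."""
    return generation_time_ms + delta_ms


# Hypothetical notifications from the three terminals, in milliseconds.
notified = {"terminal 1": 60.0, "terminal 2": 150.0, "terminal 3": 90.0}
delta = reference_delta(notified.values())
print(target_output_time(1000.0, delta))
```

Because every terminal arrives at the same value of ∆ in this sketch, an MU generated at a given time would be output at the same target time everywhere, which is what delays the receivers' avatars relative to the caller's avatar at the receiver's terminal.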

Figure 13. MOS of group synchronization quality of receiver versus difference in network latency (avatar case).

Figure 14. MOS of interactivity of receiver versus difference in network latency (avatar case).

Figure 15. MOS of comprehensive quality of receiver versus difference in network latency (avatar case).

[Plots for Figs. 13 to 15. Vertical axis: MOS (1 to 5). Horizontal axis: difference in network latency [ms] (-200 to 200). Curves: group synchronization control and no group synchronization control, each for averages of additional delay 1 of 50 ms, 150 ms, and 250 ms, with 95% confidence intervals.]

Figure 16. Displayed images in name-guessing task like fastest fingers first (video case). Labels in the images: examiner and questioner.

6. EXPERIMENT WITH VIDEO

In this section, we carry out the two types of work described in Section 3 by using videos instead of avatars. Figure 16 shows displayed images in the name-guessing task like fastest fingers first in the video case. The situation in Fig. 16 corresponds to the scene of Fig. 1 (c) in the avatar case. We also show displayed images in the networked rock-paper-scissors in the video case in Fig. 17. In what follows, we describe our experimental system and the subjective assessment method, and then we present the assessment results.

6.1. Experimental System

Our experimental system is almost the same as that in Subsection 4.1. As shown in Fig. 18, each terminal has a camera. The terminal inputs a video picture (MPEG-1, GOP: I (Group Of Pictures: Intra-coded pictures only), the average bit rate: 616 kbps) as a video MU every 33 ms. The video MU is transferred to the other terminals by UDP.
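The average bit rates and MU intervals given in Subsections 4.1 and 6.1 imply an average payload size per MU. The arithmetic can be sketched as follows, assuming that 1 kbps means 1000 bits per second and ignoring timestamps and UDP/IP headers; the function name `mu_payload_bytes` is hypothetical.

```python
def mu_payload_bytes(bit_rate_kbps, interval_ms):
    """Average payload of one MU in bytes: bit rate x interval / 8.

    Assumptions: 1 kbps = 1000 bit/s; timestamps and headers are ignored.
    """
    bits = bit_rate_kbps * 1000 * interval_ms / 1000
    return bits / 8


voice_mu = mu_payload_bytes(64, 20)    # voice MU: 64 kbps, every 20 ms
video_mu = mu_payload_bytes(616, 33)   # video MU: 616 kbps, every 33 ms
print(voice_mu, video_mu)
```

Under these assumptions an average video MU is roughly sixteen times as large as a voice MU.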


Figure 17. Displayed images in networked rock-paper-scissors (video case). Labels in the images: caller and receiver.

Figure 18. Configuration of experimental system (video case). Labels in the diagram: terminals, cameras, headsets, additional delays, and the network emulator (NIST Net).

Figure 19. MOS of fairness of examiner versus difference in network latency (video case).

Figure 20. MOS of interactivity of examiner versus difference in network latency (video case).

[Plots for Figs. 19 and 20. Vertical axis: MOS (1 to 5). Horizontal axis: difference in network latency [ms] (-200 to 200). Curves: group synchronization control and no group synchronization control, each for averages of additional delay 1 of 50 ms, 150 ms, and 250 ms, with 95% confidence intervals.]

6.2. Subjective Assessment Method 㪌

For subjective assessment, we employ the same method as that in Subsection 4.2. We also have the same subjects as those in Subsection 4.2.



㪤㪦㪪

6.3. Assessment Results 1. Name-guessing task like fastest fingers first We show the MOS values of the fairness, interactivity, and comprehensive quality of the examiner versus the difference in network latency in Figs. 19, 20, and 21, respectively. From Figs. 19, 20, and 21, we find that results in the video case are not largely different from results in the avatar case (Figs. 4, 5, and 6). However, by comparison between Figs. 5 and 20, we see that the MOS value of the interactivity in the video case tends to be slightly smaller than that in the avatar case. This is because subjects can raise avatars’ hands instantly, but it takes some time for them to move their hands actually. The MOS values of the questioner (plotted in Figs. 22, 23, and 24) are almost the same as those of the examiner (Figs. 7, 8, and 9, respectively). 㧔







㪠㩷㪐㪌㩼㩷㪺㫆㫅㪽㫀㪻㪼㫅㪺㪼㩷㫀㫅㫋㪼㫉㫍㪸㫃

㪊 㪘㫍㪼㫉㪸㪾㪼㩷㫆㪽㩷㪸㪻㪻㫀㫋㫀㫆㫅㪸㫃㩷㪻㪼㫃㪸㫐㩷㪈 㪞㫉㫆㫌㫇㩷㫊㫐㫅㪺㪿㫉㫆㫅㫀㫑㪸㫋㫀㫆㫅 㪥㫆㩷㪾㫉㫆㫌㫇㩷㫊㫐㫅㪺㪿㫉㫆㫅㫀㫑㪸㫋㫀㫆㫅 㩷㩷㩷㩷㩷㩷㩷㩷㩷㩷㪺㫆㫅㫋㫉㫆㫃 㩷㩷㩷㩷㩷㩷㩷㩷㩷㩷㩷㩷㪺㫆㫅㫋㫉㫆㫃 㪌㪇㩷㫄㫊䇭䇭䇭䇭䇭䇭 㪌㪇㩷㫄㫊 㪈㪌㪇㩷㫄㫊 㪈㪌㪇㩷㫄㫊 㪉㪌㪇㩷㫄㫊 㪉㪌㪇㩷㫄㫊





㪈 㪄㪉㪇㪇

㪄㪈㪌㪇

㪄㪈㪇㪇 㪄㪌㪇 㪇 㪌㪇 㪈㪇㪇 㪛㫀㪽㪽㪼㫉㪼㫅㪺㪼㩷㫀㫅㩷㫅㪼㫋㫎㫆㫉㫂㩷㫃㪸㫋㪼㫅㪺㫐㩷㪲㫄㫊㪴

㪈㪌㪇

㪉㪇㪇

Figure 21. MOS of comprehensive quality of examiner versus difference in network latency (video case).

synchronization quality, interactivity, and comprehensive quality of the receiver versus the difference in network latency in Figs. 28, 29, and 30, respectively. From Figs. 25 though 30, we find that the MOS values in the video case have a similar tendency to those in the avatar case (Figs. 10 though 15). However, the improvement in the MOS value of the caller in the video case is smaller than that in the avatar case when we perform the group synchronization control. The reason is the same as that in the comparison between Figs. 5 and 20.

2. Networked rock-paper-scissors We show the MOS values of the group synchronization quality, interactivity, and comprehensive quality of the caller as a function of the difference in network latency in Figs. 25, 26, and 27, respectively. We also show the MOS values of the group

Figure 22. MOS of fairness of questioner versus difference in network latency (video case).

Figure 23. MOS of interactivity of questioner versus difference in network latency (video case).

Figure 24. MOS of comprehensive quality of questioner versus difference in network latency (video case).

Figure 25. MOS of group synchronization quality of caller versus difference in network latency (video case).

Figure 26. MOS of interactivity of caller versus difference in network latency (video case).

Figure 27. MOS of comprehensive quality of caller versus difference in network latency (video case).

[Plots for Figures 22 through 27 not reproduced. Vertical axis: MOS. Horizontal axis: difference in network latency [ms], from -200 to 200. Legend: average of additional delay, with group synchronization control and with no group synchronization control; I: 95% confidence interval.]

Figure 28. MOS of group synchronization quality of receiver versus difference in network latency (video case).

Figure 29. MOS of interactivity of receiver versus difference in network latency (video case).

Figure 30. MOS of comprehensive quality of receiver versus difference in network latency (video case).

[Plots for Figures 28 through 30 not reproduced. Vertical axis: MOS. Horizontal axis: difference in network latency [ms]. Legend: average of additional delay (50 ms, 150 ms, 250 ms), with group synchronization control and with no group synchronization control; I: 95% confidence interval.]

From the above observations, we can confirm that the user-level QoS of the receiver may deteriorate if we carry out the group synchronization control in the same way at the caller and the receivers. To solve this problem, we need to enhance the group synchronization control by taking account of the difference in roles in conversations. This is for further study.

7. CONCLUSIONS

This paper examined the effects of group synchronization control by subjective assessment for a name-guessing task like fastest fingers first and for networked rock-paper-scissors in networked virtual environments with avatars. Assessment results showed that in the name-guessing task, the group synchronization control improves the fairness and comprehensive quality while keeping the interactivity high. We also found that in the networked rock-paper-scissors, the group synchronization quality, interactivity, and comprehensive quality of the caller are improved by the group synchronization control. Moreover, we saw that the interactivity of the receiver does not depend strongly on the difference in network latency between the caller and receiver. We found that the group synchronization quality deteriorates when the network latency between the caller and each subject as the receiver exceeds about 150 ms, but most subjects do not perceive the deterioration as annoying. We obtained almost the same results for the comprehensive quality of the receiver as for the group synchronization quality.

Furthermore, we investigated the difference between the avatar case and the video case. As a result, we found that the interactivity in the video case is slightly lower than that in the avatar case.

As the next step of our research, we plan to enhance the group synchronization control by taking account of the difference in roles in conversations in order to alleviate the deterioration in the user-level QoS of the receiver in the networked rock-paper-scissors. In addition, we will carry out experiments with other competitive work and collaborative work.

References

[1] H. Nakanishi, C. Yoshida, T. Nishimura, and T. Ishida, “FreeWalk: Supporting casual meetings in a network,” in Proceedings of IEEE CSCW’96, pp. 308-314, April 1996.

[2] F. Gong, “Multipoint audio and video control for packet-based multimedia conferencing,” in Proceedings of ACM Multimedia, pp. 308-314, April 1996.

[3] Y. Ishibashi, M. Nagasaka, and N. Fujiyoshi, “Subjective assessment of fairness among users in multipoint communications,” in Proceedings of ACM SIGCHI ACE’06, June 2006.

[4] Y. Hashimoto and Y. Ishibashi, “Influences of network latency on interactivity in networked rock-paper-scissors,” in Proceedings of ACM NetGames ’06, October 2006.

[5] Y. Ishibashi, A. Tsuji, and S. Tasaka, “A group synchronization mechanism for stored media in multicast communications,” in Proceedings of INFOCOM’97, pp. 693-701, April 1997.

[6] Y. Ishibashi and S. Tasaka, “A group synchronization mechanism for live media in multicast communications,” in Conference Record of IEEE GLOBECOM’97, pp. 746-752, November 1997.

[7] Y. Ishibashi and S. Tasaka, “A distributed control scheme for group synchronization in multicast communications,” in Proceedings of ISCOM’97, pp. 313-323, November 1999.

[8] Y. Ishibashi, S. Tasaka, and Y. Tachibana, “Media synchronization and causality control for distributed multimedia applications,” IEICE Transactions on Communications, vol. E84-B, no. 3, pp. 667-677, March 2001.

[9] ITU-T P.920, “Interactive test methods for audiovisual communications,” International Telecommunication Union, May 2000.

[10] M. Carson and D. Santay, “NIST Net - A Linux-based network emulation tool,” ACM SIGCOMM Computer Communication Review, 33(3), pp. 111-126, July 2003.

[11] ITU-R BT.500-11, “Methodology for the subjective assessment of the quality of television pictures,” International Telecommunication Union, June 2002.
