
Characterization and comparison of Skype behavior in wired and wireless network scenarios Marc Cardenete-Suriol, Josep Mangues-Bafalluy, Álvaro Masó, Mónica Gorricho

Publication: Proc. IEEE Globecom

Date: November 2007

© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Characterization and comparison of Skype behavior in wired and wireless network scenarios*

Marc Cardenete-Suriol (1), Josep Mangues-Bafalluy (1), Álvaro Masó (2), Mónica Gorricho (2)

(1) Centre Tecnològic de Telecomunicacions de Catalunya (CTTC), Parc Mediterrani de la Tecnologia – Av. Canal Olímpic s/n, 08860 Castelldefels – Barcelona – Spain
(2) France Telecom R&D Spain, Àvila, 45 – 08005 Barcelona – Spain

{marc.cardenete, josep.mangues}@cttc.es, {alvaro.maso, monica.gorricho}@orange-ftgroup.com

Abstract: This paper analyzes the call quality perceived by Skype users in both fixed and mobile environments. Specifically, LAN, WLAN, ADSL and UMTS network connections are evaluated by means of both PCs and mobile terminals. The end-to-end delay is below 150 ms in all cases, so it does not degrade the call quality (in the case of no echo loss in the path). The traffic patterns of Skype were also studied for uplink/downlink, private/public addresses, which node initiates the call, and what kind of terminal is used. Additionally, PESQ measurements reveal that Skype calls are above the minimum acceptable quality when using PC-based nodes, whereas quality is not acceptable when using mobile terminals. The effect of the CPU load of the mobile terminal on the perceived call quality has also been analyzed.

Keywords: Call quality, VoIP, Skype, PESQ, UMTS, WLAN

I. INTRODUCTION

Skype is widely used in wired environments, and one factor contributing to its popularity is the widespread deployment of high-speed Internet connectivity. This allows the network to support VoIP calls appropriately, which ultimately translates into high voice quality as perceived by the user. In fact, most Skype clients run on PCs with a wired connection to the Internet. Wireless channels, on the other hand, introduce additional impairments to voice traffic. Therefore, assessing the performance of VoIP over wireless networks is a first step towards deploying these services globally.

The growing number of 3G and WLAN users raises the question of whether Skype will also succeed in mobile environments. Some mobile operators, such as Hutchison 3 in several countries and e-plus in Germany, already include the Skype service in their flat-rate offers. Its success in mobile environments depends on several factors, including business, social, and technical aspects. Regarding the business aspect, flat rates seem to be the option adopted by mobile operators partnering with Skype. As for the social factors, some methods to assess call quality take into account the degradation that users accept when they receive other benefits in return (e.g. free calls). Finally, from the technical point of view, the call quality perceived by the user mostly depends on the impairments introduced by the network (e.g. end-to-end delay, packet loss) and on the codec used. Furthermore, and of particular importance in a mobile environment, the performance of the terminal turns out to be a limiting factor, given its usually limited processing power. This paper focuses on the technical factors affecting voice quality.

* This work has been partially funded by Fundación France Telecom under research contract FREEDOM, and by Generalitat de Catalunya under grant number SGR2005-00690 (Grup de Recerca Singular).

The main contribution of this paper is the comparison of the behavior of Skype in different environments. The analysis first characterizes the traffic generated by Skype and then analyzes the voice quality perceived by users in multiple wired and wireless environments. In particular, the following scenarios were experimentally tested: LAN, ADSL, WLAN, and UMTS. The effect of using private addresses behind a NAT is also taken into account. Furthermore, the paper considers not only the effect of network-related parameters, but also the effect of the characteristics of the terminal, by comparing PC-based and mobile terminals.

Voice quality is assessed with the Perceptual Evaluation of Speech Quality (PESQ) algorithm [1]. This is an appropriate method because the proprietary nature of Skype highly restricts the applicability of other methods based on measuring network-level parameters. PESQ uses the analog audio signals at each edge of the communication. In this way, the voice quality evaluated is the one the end user actually hears after the voice has traversed all software and hardware components.

Results show that call quality is acceptable (i.e. Mean Opinion Score (MOS) > 3.6) when running the Skype client on PC-based nodes. On the other hand, perceived voice quality is below the acceptable call quality in all tested cases when using the mobile terminal (i.e. SPV M5000 PDA). Additionally, results indicate that the processing power of the terminal seems to affect the perceived call quality more than the network connection does.

The paper is organized as follows. The next section explains some background topics and related work. Section III describes the scenario used for experimentation. Section IV presents measurements of VoIP traffic characteristics and end-to-end (e2e) delay in the networks under study. Section V presents the call quality perceived by users in the different scenarios. Finally, the last section concludes the paper.

II. BACKGROUND AND RELATED WORK

This section introduces the Skype architecture and the codecs it uses, and the PESQ voice quality measurement method. It also presents related work on Skype traffic analysis.

A. Skype

Skype is a VoIP service that uses a hybrid peer-to-peer network as its infrastructure. This means that Skype nodes create an overlay network that is used to establish logical links between nodes. There are three different entities: ordinary hosts are the clients of the network; supernodes are ordinary hosts that, besides generating and receiving calls, perform server functions such as searching for users or maintaining location information of clients; and the login server is the only centralized entity in the Skype network, which authenticates the clients and charges them when they use certain services. Once the user has registered with the login server, the interaction with the rest of the nodes is done on a P2P basis. Additionally, a usual scenario for Skype clients is using private addressing behind a NAT. Some hypotheses suggest that Skype might be using firewall and NAT traversal strategies based on TURN and STUN; see [2] for further details.

As stated in [2], Skype uses the GlobalIPSound iLBC and iSAC codecs, or a third unknown codec. The features of iLBC [3] and iSAC [4] are listed in Table 1.

Table 1. iLBC and iSAC codec features

Features        iLBC                             iSAC
Frame size      20 and 30 ms                     Adaptive, 30-60 ms
Bit rate        13.3 kbps (30 ms frames) and     Adaptive and variable,
                15.2 kbps (20 ms frames)         range 10-32 kbps
Sampling rate   8 kHz                            16 kHz
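As a worked illustration of Table 1, the following sketch converts the two iLBC operating points into the amount of encoded speech carried by one frame and the number of frames produced per second. The function names are hypothetical, and the figures cover codec payload only: the paper does not state how many frames Skype places in a packet or what additional data it adds.

```python
def frame_payload_bytes(bit_rate_kbps: float, frame_ms: float) -> float:
    """Bytes of encoded speech in one codec frame (kbit/s x ms = bits)."""
    return bit_rate_kbps * frame_ms / 8


def frames_per_second(frame_ms: float) -> float:
    """Number of codec frames produced per second of speech."""
    return 1000.0 / frame_ms


# iLBC operating points as listed in Table 1: (bit rate in kbps, frame size in ms).
ILBC_MODES = [(13.3, 30), (15.2, 20)]

for rate_kbps, frame_ms in ILBC_MODES:
    payload = frame_payload_bytes(rate_kbps, frame_ms)
    fps = frames_per_second(frame_ms)
    print(f"iLBC {frame_ms} ms: {payload:.1f} bytes/frame, {fps:.1f} frames/s")
```

With the Table 1 values, a 30 ms iLBC frame carries roughly 50 bytes and a 20 ms frame carries 38 bytes. iSAC is omitted because its frame size and bit rate are adaptive, so no single value can be derived from the table.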

B. PESQ

The ITU-T recommendation P.862 [1] describes the Perceptual Evaluation of Speech Quality (PESQ) algorithm, an objective method for predicting the subjective quality of 3.1 kHz (narrow-band) handset telephony and narrow-band speech codecs. It was specifically developed to be applicable to end-to-end voice quality testing under real network conditions, such as VoIP, ISDN, GSM etc.

PESQ compares an original signal X(t) with a degraded signal Y(t) that is the result of transmitting X(t) through a communications system. The output of PESQ is a prediction of the perceived quality mark that would be given to Y(t) by subjects in a subjective listening test. The original and degraded signals are mapped onto an internal representation using a perceptual model. The difference in this representation is used by a cognitive model to predict the perceived speech quality of the degraded signal. This perceived listening quality is expressed in terms of Mean Opinion Score (MOS), an average quality score over a large set of subjects. Additionally, there is a time alignment process between both signals, so the end-to-end delay in the path is not taken into account within the calculation.

A PESQ calculation returns a value between -0.5 (worst) and 4.5 (best), although in most cases it is between 1 and 4.5, following the well-known MOS scale. However, the PESQ score tends to be optimistic for poor quality speech and pessimistic for good quality speech.

C. Related work

Skype is a proprietary protocol, and thus its operation is not perfectly known. Some work has been carried out to study how it works by analyzing its traffic. This is the case of [2], which analyzes its architecture and functions such as login, NAT and firewall traversal, call establishment, and media transfer.

One of the main factors contributing to the success of Skype, along with its free service, is the voice quality perceived by the user, because this is the key parameter indicating the level of user satisfaction with a telephony service. Previous work, such as [5], quantifies user satisfaction by means of call duration and source- and network-related parameters, such as bit rate and jitter. As our goal is to measure what the end user actually experiences after the voice has traversed all hardware and software components, our work differs from [5] in that we measure the perceived call quality by means of a perceptual, speech-based objective algorithm.

Furthermore, Skype call quality is measured in different scenarios. Of particular importance are mobile and wireless environments, given the increasing importance and availability of wireless access. In this respect, [6] presents an analysis of Skype VoIP traffic in LAN-to-LAN and UMTS-to-DSL environments, in terms of traffic features and voice quality, when the terminals are PC-based nodes. To the best of our knowledge, [6] is the only study experimentally assessing Skype quality in UMTS environments. However, there are some differences with this paper. First, this paper presents a comparison of Skype traffic behavior and call quality in different real environments: LAN, WLAN, ADSL, and UMTS networks. Second, in a mobile environment, the limited processing power of the terminals might introduce additional impairments to voice quality, as seen in the results. This is why this paper not only considers PC-based nodes as clients, but also assesses the use of mobile terminals in wireless environments and evaluates their effect on traffic features and voice quality. Third, some of the voice quality analysis in [6] is based on traces obtained with tcpdump running on the PCs that host the Skype clients, whereas this paper bases its voice quality analysis solely on the analog audio signals. By doing this, our study intends to avoid the potential influence of running tcpdump on the results, given the observed importance of the available processing power. Besides, the analog audio signal is obtained after the playout buffer has smoothed some of the effects introduced by the network, whilst tcpdump traces are obtained before it, which makes the analysis of results more complex. Moreover, this paper also assesses the traffic characteristics when running Skype behind a NAT with private addressing (a quite usual case among Skype users).

III. SYSTEM DESCRIPTION

A. Scenario setup

Experimentation has been carried out within EXTREME [7], a networking experimental testbed of the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC), in Barcelona. EXTREME provides the WLAN infrastructure for the tests. Moreover, it is interconnected using a loosely coupled approach with the production network of the Orange mobile operator. An EXTREME node has also been configured in an ADSL subscriber line of 1 Mbps uplink / 300 kbps downlink. Figure 1 shows the scenario setup used in the experiments.

All computers within EXTREME are Pentium IV PCs with 512 MB of RAM. They all run Linux with kernel 2.4.26. The WLAN Access Point (AP) carries an Atheros-based WLAN card with the Madwifi driver [8]. The sound cards used are Sound Blaster 4.1 Digital cards. Skype clients for Linux (version 1.2.0.18) run on these PCs. The mobile terminal evaluated is the SPV M5000 (i.e. Qtek 9000), which has both WLAN and UMTS network interfaces. This PDA carries an Intel Bulverde 520 MHz processor and 64 MB of SDRAM. The Skype client running on the SPV M5000 PDA is version 2.0.0.39 for Pocket PC. The most recent version of the client for each OS (as of the time the tests were carried out) has been used.

Figure 1. Scenario setup

B. Voice quality measurement setup

The general setup for the voice quality measurements is depicted in Figure 2. Client 1 is a PC-based node in all the scenarios, and Client 2 is either a PC-based node or the SPV M5000 PDA. Moreover, the network connection of Client 1 is a LAN in all cases, whereas the network connection of Client 2 changes depending on the scenario under test. Therefore, the scenarios differ in the network connection of Client 2 and in the type of terminal of Client 2 (PC-based node or SPV M5000 PDA).

Figure 2. Voice quality measurement setup

An audio cable connects the earphone jack of PC 1 and the microphone jack of Client 1, and another audio cable connects the earphone jack of Client 2 and the microphone jack of PC 2. A Skype VoIP session is established between Client 1 and Client 2. During the call, PC 1 plays a WAV file (the Reference WAV file) using the wavplay application (version 1.4), whereas PC 2 records a WAV file (the Degraded WAV file) using the wavrec application (version 1.4, 16 bits/sample). Thus, PC 2 records the WAV file played by PC 1 after it has traversed the scenario under test, which includes the audio cables (these are also present in real Skype setups). The quality from Client 2 to Client 1 is also tested by means of a flow in the reverse direction. Both PCs are synchronized using NTP so that PC 2 records the WAV file at the same time PC 1 plays it.

The WAV file used contains noise-free spoken English text, sampled at 8 kHz and encoded with 16 bits per sample. The duration of the file is about 20 seconds. Some seconds of silence have been inserted at the beginning and the end of the file in order to avoid losing part of the WAV file due to the end-to-end delay. Once an experiment is finished, the voice quality of the scenario under test is calculated by means of the PESQ algorithm, using the reference and the degraded WAV files. A script automates the process of playing the WAV file at PC 1, recording it at PC 2, and calculating the PESQ score between both WAV files.

The audio volume of PC 1 affects the voice quality obtained for the scenario under test. Therefore, for each scenario, the experiment is repeated several times using different volume levels to find the volume level that leads to the maximum PESQ score. This process is also automated within the script. A user in a real setup would carry out the same volume tuning to find the best possible quality. Each test (in which a maximum PESQ score is obtained) is repeated 10 times, and the results are averaged. The value reported is therefore the average of the maximum scores.

When the PDA is used as the mobile terminal, the voice quality can be affected by the performance of the terminal on which the Skype client is running, because of the coding/decoding process of the call. One of the main performance parameters is the CPU load. Its value may vary with the applications that are executing on the mobile terminal or with the interfaces used. The Pocket Hack Master application [9] has been used to check the CPU load of the SPV M5000 PDA during experimentation.

The PESQ algorithm does not take into account the degradation in quality caused by the end-to-end delay. In order to ensure that this parameter does not affect the perceived call quality, end-to-end delay measurements of the Skype VoIP traffic have been carried out in the different scenarios under study. E2e delays below 150 ms guarantee interactivity and thus allow the analysis to focus on the audio signal quality.

The use of audio cables in the voice quality measurement setup has been calibrated in a reference scenario composed of an audio cable connecting two PCs. In this scenario, the average perceived voice quality obtained is a MOS score of 4.27. So, it can be concluded that the use of audio cables degrades the call quality. The degradation is mainly due to the chain of digital-to-analog conversion, analog audio transmission through an audio cable, and analog-to-digital conversion. Moreover, different results are obtained depending on the material (e.g. audio cables) used. This gives an idea of the magnitude of the degradation attributable to audio cables in a real setup.

IV. CHARACTERIZATION OF SKYPE TRAFFIC

This section presents measurement results of the VoIP flow features of Skype calls as well as the end-to-end delay of Skype traffic in the different networks tested.

A. VoIP flow features

Table 2 shows the Skype VoIP flow features of the LAN and WLAN connections when using PC-based nodes. In both cases, the flow features take similar values regardless of the direction of the traffic flow (from Client 1 to Client 2 or the other way round). Traffic characteristics are also quite similar for LAN and WLAN. Given that neither the CPU nor the link rate is a limiting factor here, one might assume that these are the traffic patterns that Skype considers the best ones.

Table 2. VoIP flow features using PC-based nodes

Network connection   Avg packets/sec   Avg packet size eth (bytes)   Avg kbits/sec
LAN                  30.985            125.146                       31.016
WLAN                 30.790            128.787                       31.687
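The three columns of Table 2 are related: the average bit rate should be close to the average packet rate multiplied by the average packet size. The sketch below performs that cross-check; the function and variable names are illustrative, and a small gap is expected because a product of averages is not exactly the average of the products and because the table values are rounded.

```python
def bit_rate_kbps(packets_per_sec: float, packet_size_bytes: float) -> float:
    """Approximate bit rate implied by an average packet rate and size."""
    return packets_per_sec * packet_size_bytes * 8 / 1000


# Table 2 rows: (avg packets/sec, avg Ethernet packet size in bytes, reported kbits/sec).
TABLE_2 = {
    "LAN": (30.985, 125.146, 31.016),
    "WLAN": (30.790, 128.787, 31.687),
}


def relative_gap(network: str) -> float:
    """Relative difference between the computed and the reported bit rate."""
    pps, size, reported = TABLE_2[network]
    return abs(bit_rate_kbps(pps, size) - reported) / reported


for name, (pps, size, reported) in TABLE_2.items():
    computed = bit_rate_kbps(pps, size)
    print(f"{name}: computed {computed:.2f} kbit/s, reported {reported:.2f} kbit/s, "
          f"gap {100 * relative_gap(name):.2f}%")
```

For both rows the computed and reported rates agree to well within 1%, which supports reading the packet sizes as bytes per Ethernet frame.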

Table 3 lists the traffic features of the calls when using the SPV M5000 PDA in WLAN (Client 2 in Figure 2). The peer Skype client runs on a PC-based node using a LAN connection (Client 1 in Figure 2). For notation purposes, the downlink traffic flow is the one sent from the PC to the PDA, and the uplink traffic flow is the one sent from the PDA to the PC. Both flows are measured at the Client 1 side using ethereal, and the average packet sizes and average rates (kbit/s) presented are measured at the Ethernet layer. All possible private-public combinations are tested to assess the influence of intermediate NATs on the traffic patterns. As seen in the table, traffic features vary with the type of IP address (public or private) of both the PDA and the peer PC-based node. Furthermore, and unlike in the PC-based WLAN and LAN cases (Table 2), they also vary with the direction of the traffic flow (i.e. downlink or uplink).

Table 3. VoIP flow features using the SPV M5000 PDA in WLAN

PDA IP    PC IP     Traffic flow   Avg pack/sec   Avg pack size eth (bytes)   Avg kbits/sec eth
Public    Public    Downlink       33.33          103.86                      27.69
Public    Public    Uplink         49.05          90.75                       35.61
Public    Private   Downlink       26.12          104.96                      21.93
Public    Private   Uplink         47.41          89.44                       33.92
Private   Public    Downlink       33.43          115.78                      30.97
Private   Public    Uplink         49.37          97.53                       38.51
Private   Private   Downlink       26.08          116.57                      24.30
Private   Private   Uplink         47.08          100.98                      38.04

Some observations follow. Packet rates do not seem to depend on the type of address assigned to the PDA, i.e. the first four values are almost equal to the last four values in the Avg pack/sec column. Furthermore, in the uplink case, all packet rates are quite similar (around 48 packets/s), so this parameter does not seem to depend on the pair of address types. Packet sizes, on the other hand, do not seem to depend on the type of address of the PC. Packet sizes are larger when the PDA has a private address, by roughly the same percentage in all cases, which makes the overall bit rate higher in this case. Additionally, compared to Table 2, packet sizes are smaller in this case (by around 20% or even more), though packet rates are in general higher (except for some downlink cases). As for the uplink vs. downlink comparison, packet rates are much higher (around 45%) and packet sizes slightly smaller (a bit more than 10%) in the uplink than in the downlink.

Some further comments on these observations arise. Packet rates in VoIP applications are related to packetization delays, and in the uplink case the packet rate does not seem to change with the type of address. As traffic characteristics do change in the downlink for the different cases, it seems that there is some adaptation in the downlink that is not present in the uplink. This might indicate that the two clients act differently (recall that the clients are different because they run on different OSs and platforms).

Table 4 shows the VoIP traffic features when using the SPV M5000 PDA in UMTS. The peer Skype client runs on a PC-based node using a LAN connection. When using the UMTS network connection, the PDA can only use a public IP address (assigned by the Orange production network). In this case, the traffic features of the Skype call depend on several factors, namely the type of address of the peer PC-based node, the traffic flow (traffic from the PC to the PDA or the other way round), and, surprisingly, the node that initiates the VoIP session. Both average packet size and average rate (kbit/s) are measured at the Ethernet layer at the Client 1 side.

Table 4. VoIP flow features using the SPV M5000 PDA in UMTS

PC IP     Traffic flow   Initiated by   Avg pack/sec   Avg pack size eth (bytes)   Avg kbits/sec eth
Public    Downlink       PC             33.33          109.64                      29.23
Public    Downlink       PDA            16.70          150.07                      20.05
Public    Uplink         PC             47.37          122.84                      46.50
Public    Uplink         PDA            48.15          115.55                      44.38
Private   Downlink       PC             26.19          103.99                      21.78
Private   Downlink       PDA            13.22          150.09                      15.87
Private   Uplink         PC             42.90          132.92                      45.62
Private   Uplink         PDA            42.51          132.86                      45.18

The node that initiates the call influences the traffic features, especially those of the traffic generated by the PC. In particular, if the call is initiated by the PDA, the packet rate of the downlink traffic is halved and the packet size is around 50% bigger. Furthermore, if the PC initiates the communication, the traffic pattern is quite similar to that observed for WLAN. On the other hand, in the measured traffic generated by the PDA (uplink), the influence of this factor is small. Again, this latter behavior might be due to the different adaptation capabilities of the two clients, as explained above. In the uplink case, the bit rate is almost the same in all cases (regardless of the type of address and of who initiates the communication). However, packet rates and sizes are slightly different, so they seem to depend slightly on the type of address. When compared to the WLAN case (Table 3), packet rates are similar but packet sizes are much larger. Considering that the sampling rate is fixed (Table 1), it looks as if the packetization delay, which together with the sampling rate determines the number of samples per packet, was the same but each packet carried more redundancy because the Skype client somehow determined that this scenario is less reliable. On the other hand, in the downlink case when the PDA initiates the communication, it looks as if more samples are accumulated in each packet and transmitted at a lower packet rate, probably to introduce less overhead, based on some measurement coming from the PDA during the establishment of the call.
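The relation between packet rate, packetization delay, and samples per packet used in this reasoning can be made concrete with a short sketch. It assumes that each packet covers exactly one packetization interval of new audio (no redundancy) and takes the 8 kHz and 16 kHz sampling rates from Table 1; the paper does not state which codec was in use during the measurements, and the helper names are hypothetical.

```python
def packetization_interval_ms(packets_per_sec: float) -> float:
    """Time between consecutive packets, i.e. the speech duration per packet."""
    return 1000.0 / packets_per_sec


def samples_per_packet(packets_per_sec: float, sampling_rate_hz: float) -> float:
    """New audio samples per packet, assuming no redundancy in the packet."""
    return sampling_rate_hz / packets_per_sec


# Downlink packet rates from Table 4 (public PC address).
DOWNLINK_RATES = {"initiated by PC": 33.33, "initiated by PDA": 16.70}

for case, pps in DOWNLINK_RATES.items():
    interval = packetization_interval_ms(pps)
    for fs in (8000, 16000):  # sampling rates listed in Table 1 (assumed candidates)
        n = samples_per_packet(pps, fs)
        print(f"{case}: {interval:.1f} ms per packet, about {n:.0f} samples at {fs} Hz")
```

Under these assumptions, the PC-initiated downlink corresponds to about 30 ms of speech per packet and the PDA-initiated downlink to about 60 ms, which matches the observation that halving the packet rate means accumulating roughly twice as many samples per packet.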

B. End-to-end delay

PESQ bases its operation on the comparison of two audio signals. Therefore, delay is not taken into account in the final MOS score obtained. However, end-to-end (e2e) delay determines the interactivity of a call, and thus it influences the quality of a call. The ITU-T G.114 recommendation [10] specifies that, for good voice quality, end-to-end delay should not exceed 150 ms.

This section presents the average end-to-end delay measurements obtained when emulating Skype VoIP traffic in the different networks under study. Measuring e2e delay with real Skype traffic would imply matching packets captured by a sniffer (e.g. ethereal/wireshark) at the sender with those captured at the receiver. However, these measurements would not take into account the potential delay introduced by processing at the operating system level. An alternative approach is to emulate Skype traffic by means of the MGEN traffic generator (version 3.3a8). Besides being simpler, this approach also accounts for the processing of packets in the operating system. These values should therefore be understood as an approximate measure of the degree of interactivity. In this sense, they are used to explain whether, in spite of good PESQ values, quality might be lower than expected due to loss of interactivity (i.e. delay > 150 ms).

Both LAN and WLAN measurements have been carried out within EXTREME, and thus the e2e delay in these environments is almost negligible. On the other hand, the ADSL and UMTS networks under study are real production networks. The traffic features (UDP payload packet size and packets per second) of the VoIP traffic emulated in the ADSL and UMTS cases correspond to those of Skype when operating in these networks, whose measured values are provided in the previous section. For the ADSL case, the VoIP flow features are the same as in the LAN connection, and they are extracted from Table 2. The UMTS VoIP flow features are those of Table 4, and the worst case of the uplink and downlink traffic features has been taken into consideration. Table 5 shows the uplink and downlink average end-to-end delay measurements.

Table 5. End-to-end delay of Skype traffic in the ADSL and UMTS network connections

Network   Traffic flow   Pack/sec   UDP payload pack size (bytes)   E2E delay (msec)
ADSL      Downlink       30.1       83.15                           35
ADSL      Uplink         30.1       83.15                           36
UMTS      Downlink       33.33      67.64                           107
UMTS      Uplink         47.37      80.84                           129
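Two simple calculations help when reading Table 5. First, the UDP payload sizes match the Ethernet-layer sizes of Tables 2 and 4 minus 42 bytes, which is the sum of the Ethernet (14), IPv4 (20) and UDP (8) headers; this interpretation is an inference from the numbers, assuming frames counted without the trailing checksum and IPv4 without options, and is not stated in the paper. Second, each measured delay can be compared with the 150 ms limit of G.114 [10]. The names in this sketch are illustrative.

```python
ETHERNET_HEADER = 14  # bytes, without the frame check sequence (assumption)
IPV4_HEADER = 20      # bytes, no IP options (assumption)
UDP_HEADER = 8        # bytes
G114_LIMIT_MS = 150   # one-way delay limit cited from G.114 [10]


def udp_payload_bytes(ethernet_size_bytes: float) -> float:
    """UDP payload implied by a packet size measured at the Ethernet layer."""
    return ethernet_size_bytes - (ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER)


def delay_headroom_ms(e2e_delay_ms: float) -> float:
    """Margin left before the measured delay reaches the G.114 limit."""
    return G114_LIMIT_MS - e2e_delay_ms


# Ethernet-layer sizes from Table 4 (public PC address, call initiated by the PC).
print(f"UMTS downlink payload: {udp_payload_bytes(109.64):.2f} bytes")
print(f"UMTS uplink payload:   {udp_payload_bytes(122.84):.2f} bytes")

# Average e2e delays from Table 5, in ms.
TABLE_5_DELAYS = {
    ("ADSL", "Downlink"): 35,
    ("ADSL", "Uplink"): 36,
    ("UMTS", "Downlink"): 107,
    ("UMTS", "Uplink"): 129,
}
for (network, flow), delay in TABLE_5_DELAYS.items():
    print(f"{network} {flow}: {delay_headroom_ms(delay)} ms below the limit")
```

The UMTS uplink leaves the smallest margin (21 ms), while both ADSL directions leave more than 110 ms.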

As can be observed, all measurements present average values below 150 ms. Therefore, in the presence of perfect echo cancellation, the end-to-end delay does not affect the call quality, because it is not perceptible by users. As a consequence, the comparison of the behaviors in networks of different nature (real production network vs. lab-sized network) is not unfair in terms of e2e delay. It is true that the longer path of the UMTS and ADSL cases might introduce additional jitter and packet loss not present in a lab network. But the application still has room for playout buffers (up to 150 ms), which are expected to compensate for jitter. As for packet losses in the UMTS network, they were also negligible during the tests.

V. VOICE QUALITY MEASUREMENTS

This section analyzes the voice quality of Skype calls by means of the PESQ assessment method, using the measurement setup described in Section III.

A. Scenario calibration

Each component of the scenario under study can affect the PESQ measurement obtained. Thus, it is important to evaluate the voice quality degradation introduced by each component of the scenario. It must also be taken into account that the impairments introduced by the individual components are not additive when calculating the overall PESQ score.

This section analyzes the voice quality degradation in the PC-to-PC audio transmission process, i.e. the process of playing a WAV file on one PC and recording it on another PC by transmitting the audio signal through an audio cable. The average perceived voice quality is a MOS score of 4.27. Therefore, one may conclude that the use of audio cables (within earphones or microphones) degrades call quality. The degradation in this case, compared with the case of no degradation (MOS score of 4.5), is mainly due to the following processes: 1) digital-to-analog conversion, 2) analog audio transmission through an audio cable, and 3) analog-to-digital conversion. The results vary depending on the material (e.g. audio cables) used; for example, MOS scores of 4.134 and 4.405 have been obtained using two different audio cables. Recall that our setup uses one cable at each side.

B. PC-based nodes

The MOS scores of Skype VoIP calls when using PC-based nodes are listed in Table 6.

Table 6. MOS score using PC-based nodes

Network   Uplink   Downlink
ADSL      3.898    3.935
LAN       3.838    3.838
WLAN      3.631    3.631
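The Table 6 scores can be placed between two reference points used in this paper: the acceptability threshold (MOS of 3.6) and the cable-only calibration score (MOS of 4.27). The sketch below computes both margins for the worse direction of each network. Since the impairments are not additive, the gap to the calibration score is only a rough indicator and not a measure of the degradation caused by Skype alone; the names used are hypothetical.

```python
ACCEPTABLE_MOS = 3.6        # minimum acceptable call quality used in the paper
CABLE_REFERENCE_MOS = 4.27  # average score of the cable-only calibration scenario

# Table 6: (uplink MOS, downlink MOS) for PC-based nodes.
TABLE_6 = {
    "ADSL": (3.898, 3.935),
    "LAN": (3.838, 3.838),
    "WLAN": (3.631, 3.631),
}


def margins(network: str) -> tuple:
    """Return (margin above threshold, gap below calibration) for the worse direction."""
    worst = min(TABLE_6[network])
    return worst - ACCEPTABLE_MOS, CABLE_REFERENCE_MOS - worst


for network in TABLE_6:
    above, below = margins(network)
    status = "acceptable" if above >= 0 else "not acceptable"
    print(f"{network}: {status}, {above:.3f} above threshold, "
          f"{below:.3f} below the cable-only reference")
```

All three connections clear the threshold, although the WLAN case does so by only a few hundredths of a MOS point.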

The call quality obtained in all cases presents a MOS score above 3.6, which corresponds to the minimum acceptable call quality value for PSTN-like quality. Only small differences among the results are present, due to the fact that the available bandwidth in all network connections is much higher than the throughput required by a Skype call, and thus the network connection has little effect on call quality.

C. SPV M5000 PDA. Normal case

In this section, the degradation of the perceived voice quality introduced by the use of the SPV M5000 PDA as a VoIP mobile terminal is evaluated. The CPU load of the SPV M5000 varied among different experiments, ranging between 67% and 72% in the WLAN connection and between 66% and 81% in the UMTS connection for the normal case (Table 7), and between 83% and 90% in both connections for the increased CPU load case (Table 8). Table 7 lists the call quality of Skype when running on the SPV M5000 PDA, using both the WLAN and the UMTS network interfaces.

Table 7. MOS score using the SPV M5000 PDA (normal case)

Network   Uplink   Downlink
WLAN      3.39     2.85
UMTS      2.636    2.873

Results show that the call quality is below a MOS score of 3.6 in all cases, so it may not be acceptable if PSTN-like (toll) quality is expected. In the case of WLAN, the traffic from the PDA to the PC presents a MOS score of 3.39, whereas the reverse traffic flow has a MOS value of 2.85. As there is only one WLAN client operating on the network, the use of the WLAN itself does not affect the traffic. Therefore, the combined effect of terminal hardware, OS, and the Skype version running on it may be the reason why the PESQ scores differ depending on the traffic flow. As for UMTS, the values obtained in the downlink are similar to those for WLAN, which seems to mean that there is enough rate in the UMTS network to support the call (though with low quality). In the uplink, however, quality is much worse. This might be due to the uplink rate and to the different way the OS interacts with the WLAN and UMTS cards, as the clients used are the same.

D. SPV M5000 PDA. Increase of CPU load case

This section studies the effect of the SPV M5000 PDA CPU load on the call quality. In order to increase the CPU load, a video application runs on the PDA during the VoIP call. The PESQ scores obtained in this case are listed in Table 8, for both the WLAN and UMTS network connections used by the PDA.

Table 8. MOS score using the SPV M5000 PDA (increase of CPU load case)

Network   Uplink   Downlink
WLAN      2.496    2.477
UMTS      2.617    2.66
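Tables 7 and 8 can be compared directly by computing, for each network and direction, how much the MOS score drops when the CPU load is increased. The following sketch does only that arithmetic on the published values; the dictionary and function names are illustrative.

```python
# MOS scores from Table 7 (normal case) and Table 8 (increased CPU load).
NORMAL = {
    ("WLAN", "Uplink"): 3.39,
    ("WLAN", "Downlink"): 2.85,
    ("UMTS", "Uplink"): 2.636,
    ("UMTS", "Downlink"): 2.873,
}
LOADED = {
    ("WLAN", "Uplink"): 2.496,
    ("WLAN", "Downlink"): 2.477,
    ("UMTS", "Uplink"): 2.617,
    ("UMTS", "Downlink"): 2.66,
}


def mos_drop(network: str, flow: str) -> float:
    """MOS lost when moving from the normal case to the increased-load case."""
    return NORMAL[(network, flow)] - LOADED[(network, flow)]


for network, flow in NORMAL:
    print(f"{network} {flow}: MOS drops by {mos_drop(network, flow):.3f}")
```

The drop is largest for the WLAN uplink (about 0.89) and almost nil for the UMTS uplink (about 0.02), so the extra load penalizes the WLAN cases considerably more than the UMTS cases.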

Results show that the UMTS case presents a better MOS value than the WLAN case. One hypothesis is that the processing in the SPV M5000 terminal is more optimized when using UMTS than when using WLAN, and that this effect becomes more evident as more stress is put on the PDA. So, processing seems to affect the perceived call quality more than the link rates obtained with each technology do.

It is important to notice that these results do not include the degradation in quality accepted by the user in return for the scenario used (i.e. the advantage factor). This parameter would indicate the degradation in quality accepted by a user given the interest he/she has in using a mobile PDA running a Skype application instead of a PSTN or GSM telephone. This might render perceived quality acceptable to the user.

VI. CONCLUSIONS

This study has analyzed the Skype voice quality perceived by users in both fixed and mobile environments. Specifically, LAN, WLAN, ADSL, and UMTS network connections have been evaluated. Measurements of Skype calls in the networks under study reveal that, in general, end-to-end delay values are below 150 ms and thus do not significantly degrade call quality. The patterns of Skype traffic were also studied for uplink/downlink, private/public addresses, which node initiates the call, and what kind of terminal is used.

The PESQ quality assessment method has been used to predict the call quality of Skype calls. Results show that when using PC-based nodes as Skype clients, LAN, WLAN, and ADSL network connections provide acceptable call quality, i.e. above a MOS score of 3.6. On the other hand, when using the Orange SPV M5000 PDA as Skype mobile client, perceived voice quality is below the acceptable call quality in all cases. Nevertheless, these results do not include the degradation in quality accepted by the user in return for the setup used (i.e. the advantage factor).

One observed fact in the SPV M5000 mobile terminal scenario is that the CPU load of this PDA affects the perceived voice quality, and that the CPU load increases quickly. For instance, when only Skype is running, the CPU load ranges from 66% to 81% and the MOS scores obtained are 3.39 and 2.851 for the uplink and downlink cases in WLAN, respectively, and 2.636 and 2.873 for the uplink and downlink cases in UMTS, respectively. When the CPU load increases to levels ranging from 83% to 90%, the MOS score decreases to 2.496 and 2.477 for the uplink and downlink cases in WLAN, respectively, and to 2.617 and 2.66 for the uplink and downlink cases in UMTS, respectively. These results might indicate that the processing and hardware resources of the SPV M5000 terminal affect the perceived call quality more than the network connection used by the terminal does. This is of particular importance in mobile environments, where resource-constrained devices are commonly used.

REFERENCES

[1] ITU-T, "Perceptual evaluation of speech quality (PESQ)", ITU-T Recommendation P.862, Feb. 2001.
[2] S. A. Baset and H. Schulzrinne, "An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol", IEEE Infocom'06, April 2006.
[3] iLBC codec, http://www.globalipsound.com/datasheets/iLBC.pdf
[4] iSAC codec, http://www.globalipsound.com/datasheets/iSAC.pdf
[5] Kuan-Ta Chen, Chun-Ying Huang, Polly Huang, Chin-Laung Lei, "Quantifying Skype user satisfaction", ACM SIGCOMM, Oct. 2006.
[6] T. Hossfeld, A. Binzenhöfer, M. Fiedler, K. Tutschku, "Measurement and Analysis of Skype VoIP Traffic in 3G UMTS Systems", IPS-MoMe 2006, Feb. 2006.
[7] M. Portoles-Comeras, M. Requena-Esteso, J. Mangues-Bafalluy, M. Cardenete-Suriol, "EXTREME: Combining the ease of management of multi-user experimental facilities and the flexibility of proof of concept testbeds", IEEE TridentCom, March 2006. Information on EXTREME is also available at http://www.cttc.es/engineering/extreme.htm
[8] The Madwifi project, http://sourceforge.net/projects/madwifi/
[9] Pocket Hack Master application, http://www.antontomov.com/
[10] ITU-T, "One-way Transmission Time", ITU-T Recommendation G.114, May 2003.