
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 20, NO. 1, JANUARY 2009, p. 59

A Trace-Driven Approach to Evaluate the Scalability of P2P-Based Video-on-Demand Service

Jian-Guang Luo, Student Member, IEEE, Qian Zhang, Senior Member, IEEE, Yun Tang, Member, IEEE, and Shi-Qiang Yang, Member, IEEE

Abstract—Peer-to-peer (P2P) networks have emerged as one of the most promising approaches to improving the scalability of video-on-demand (VoD) service over the Internet. Although a number of architectures and streaming protocols have been proposed in past years, little work has studied the practical performance of P2P-based VoD service, especially under real user behavior, which has a significant impact on system scalability. Therefore, in this paper, we first characterize the user behavior by analyzing a large number of real traces from a popular VoD system supported by the biggest television station in China, cctv.com. Then, we examine the practical scalability of P2P-based VoD service through extensive trace-driven simulation under a general system framework. The results show that P2P networks scale well in providing VoD service under real user behavior, saving a considerable percentage of server bandwidth. Moreover, we observe that adopting a hard cache at the client side achieves much better system scalability than adopting a soft cache. We also identify the impact of various aspects of user behavior on system scalability through detailed simulation. We believe that our study sheds light on the practical scalability of P2P-based VoD service and will be helpful to future system design and optimization.

Index Terms—Peer-to-peer networks, video-on-demand, system scalability, user behavior.

• J.-G. Luo is with the Department of Computer Science and Technology, Tsinghua University, Room 1-512, Building FIT, Beijing 100084, P.R. China. E-mail: [email protected].
• Q. Zhang is with the Department of Computer Science, Hong Kong University of Science and Technology, Room 3533, Academic Building, Clear Water Bay, Kowloon, Hong Kong. E-mail: [email protected].
• Y. Tang is with the China Institute for Development Planning, Tsinghua University, Room 603, Building Wu ShunDe, Beijing 100084, P.R. China. E-mail: [email protected].
• S.-Q. Yang is with the Department of Computer Science and Technology, Tsinghua University, Room 3-518, Building FIT, Beijing 100084, P.R. China. E-mail: [email protected].

Manuscript received 3 Oct. 2007; revised 9 Feb. 2008; accepted 13 Mar. 2008; published online 24 Apr. 2008. Recommended for acceptance by C. Shahabi. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDS-2007-10-0350. Digital Object Identifier no. 10.1109/TPDS.2008.68. 1045-9219/09/$25.00 © 2009 IEEE. Published by the IEEE Computer Society.

1 INTRODUCTION

In recent years, peer-to-peer (P2P) networks have emerged as one of the most promising approaches to addressing the scalability problem in large-scale systems. In a P2P network, each node¹ plays the role of both server and client at the same time, contributing its available computation, storage, and/or bandwidth resources to the collective resource pool. This cooperative paradigm essentially amplifies service capacity without the need for special support from network infrastructure or costly servers. To date, P2P networks have achieved great success in supporting many applications over the Internet, e.g., P2P-based file sharing [1], [2], [3] and live streaming [4], [5], [6]. Researchers have also recognized the potential of P2P networks in providing video-on-demand (VoD) service and have proposed various system architectures and streaming protocols [7], [8], [9], [10], [11].

1. In this paper, node, peer, and client are used interchangeably unless explicitly stated otherwise.

In a typical P2P-based VoD system, video clips are divided into small segments, and a client can fetch video segments from other clients or from video servers according to data availability. The received segments are stored in the local cache of the client so that they can be used to serve future requests from other peers. In this way, the mutual cooperation among clients reduces the server workload and in turn increases the system scalability.

However, it is not a trivial task to evaluate the practical scalability of P2P-based VoD service. First, unlike in live streaming, users in VoD systems can individually request any video clip at any position at any time. This asynchronous nature of the user access pattern severely reduces the cooperation opportunities among peers. Second, an on-demand service usually provides a large number of video clips, while the requests from users are unevenly spread among those clips. This dispersal of requests over video clips also poses a challenge for P2P-based VoD service. For example, clips with low popularity might be requested only a few times, so the corresponding clients have less chance to cooperate with each other. Third, VCR-like interactivities from users, such as pause and jump forward/backward, are common in VoD systems and also have a great impact on the performance of P2P-based VoD systems. Last but not least, the online duration and appearance frequency of users, that is, the peer dynamics, are also vital in P2P networks. In VoD systems in particular, clients cache the video content they have visited, so recurring users can help others with their cached content as soon as they log in to the system. From all these observations, we can see that user behavior plays an essential role in determining the scalability of P2P-based VoD service. However, in previous works [7], [8], [9], [10], [11], user behavior has not been carefully considered in the performance evaluation, and therefore, their results do not reflect the practical scalability of P2P-based VoD service very well. Thus, it is necessary and interesting to revisit the practical scalability of P2P-based VoD systems by properly considering real user behavior in VoD service.

In this paper, we therefore aim to investigate user behavior in VoD service and to evaluate the practical scalability of P2P-based VoD service under real user behavior. To this end, in the first part of this paper, we analyze a large number of traces collected from a popular VoD system of cctv.com, which is supported by the biggest television station in China, to characterize the user behavior in VoD service. Note that although this VoD system is based on the traditional Client/Server (C/S) architecture, we believe that its traces substantially reflect the real user behavior in general VoD service. In our study, we classify the video clips into two categories, i.e., news and music, and perform statistical analysis for each category. We carefully study the aspects of user behavior that potentially have a great impact on the scalability of P2P-based VoD service, including the number of concurrent online users, user interactivities, popularity evolution and skewness of video clips, and user recurrence and repeated requests. After that, in the second part of this paper, we make a quantitative study of the scalability of P2P-based VoD service through trace-driven simulations and further identify the detailed impact of user behavior on system scalability.
In the simulations that we conducted, the traces collected from cctv.com are used as input to model the user behavior so that the scalability of P2P-based VoD service can be investigated in a more practical way. Cache management at the client side is known to be a key issue in designing a P2P-based VoD system. In the simulations, we adopt two cache types, i.e., soft cache and hard cache, as well as four representative cache replacement algorithms, i.e., FIFO, MCC, LRU, and LFU, to examine the system scalability under different settings. Based on the simulation results, we draw some important conclusions that can help guide future system design and optimization. Furthermore, in order to reveal the different impact of user behavior on different video types, we conduct simulations for the news and music categories separately and then discuss the results through a comparative study of the two categories.

Our main contributions in this paper are as follows:

1) We characterize the user behavior in a popular VoD system by analyzing a large number of real traces, giving a first picture of the key aspects that have a great impact on the scalability of P2P-based VoD service.
2) We evaluate the practical scalability of P2P-based VoD service through extensive trace-driven simulation. Specifically, two cache types and four cache replacement algorithms are investigated.
3) We identify the impact of various aspects of user behavior on the system scalability through comparative studies between the news and music categories.

To the best of our knowledge, this is the first work to study the practical scalability of P2P-based VoD service with the consideration of real user behavior.


The remainder of this paper is organized as follows: In Section 2, we analyze the real workload traces and identify the key aspects of user behavior in news and music categories. In Section 3, we introduce a general framework of P2P-based VoD service, evaluate the system scalability through trace-driven simulations, and identify the impact of various aspects of user behavior. Finally, we discuss the related work and conclude this paper in Sections 4 and 5.

2 USER BEHAVIOR IN VIDEO-ON-DEMAND SERVICE

In this section, we aim to provide a comprehensive view of user behavior through a careful study of a large number of traces collected from a popular VoD system in China. We first explain the methodology we adopted to collect and analyze the traces in Section 2.1 and then present the characterized user behavior in Section 2.2, especially in terms of the number of concurrent online users, user interactivities, popularity evolution and skewness of video clips, and user recurrence and repeated requests. We believe that these statistical results on user behavior in VoD service play an important role in understanding system scalability.

2.1 Trace Study Methodology

In this section, we introduce the methodology we used to collect and analyze the workload traces in VoD service.

2.1.1 Information of Traces

The traces used in this paper were collected from the VoD system of cctv.com, which is supported by CCTV, the biggest television station in China. The VoD system is based on the traditional C/S architecture and consists of 12 centralized, load-balanced, powerful servers running Windows Media Service (WMS). Each request is served by one of the streaming servers, which records a trace of the request. All 12 streaming servers are clock-synchronized, so we can easily merge and sort the traces according to the request arrival time. Hereafter, we treat the 12 streaming servers as a single large server for ease of explanation. On the server, there are around 5,000 video clips encoded at about 300 Kbps. The length of these clips ranges from tens to thousands of seconds. We collected traces from the system for a period of 50 days, gathering a total of about 12 million traces. Each trace contains 44 fields that record the request information. In this paper, we mainly use the seven fields listed in Table 1.

TABLE 1. The Fields of Traces Used in Our Study

2.1.2 User Identification

In order to distinguish requests from different users, we use the c-playerid field instead of the IP address to identify users.² In the traces, c-playerid is reported by the player at the client side. If a user has configured the player for anonymity, the c-playerid field is randomly generated for each request and, thus, cannot be used for identification. We therefore remove the records with an anonymous c-playerid from the collected traces, after which about 4 million traces with a unique c-playerid remain.

2. Using the IP address of the client generally has the following two weaknesses: 1) the presence of Network Address Translators (NATs), proxies, and DHCP could cause multiple clients that share one IP address to be mistakenly identified as one user and 2) DHCP could cause one client that is assigned different IP addresses to be mistakenly identified as multiple users.

Fig. 1. Session model.

2.1.3 Video Categories

In this system, different types of video clips are placed in different directories on the VoD server, so we can classify the video clips into categories by the cs-uri-stream field. Most of the video clips fall into two categories: news and music. Table 2 lists the trace information of these two categories. Because the content differs, user behavior in the two categories also presents different characteristics. In this paper, we conduct simulations over the news and music categories separately and try to reveal the impact of user behavior on system scalability through a comparative study.

TABLE 2. The Information of News and Music Categories

2.1.4 Session Model

In our study, the collected traces record the user behavior only at the granularity of requests. In order to capture the arrival and departure pattern of users, we define a session as a sequence of requests from a single user in which each interval between two sequential requests is no greater than a given time t_h, as schematically depicted in Fig. 1. Compared to the session model used in [12] and [13], our model allows a user to request multiple video clips in a single session, which we believe captures the arrivals and departures of users much better. Note that the user is considered online during the whole session, even though at times there is no request from this user. In a session, when a request finishes, the user may leave the system or send another request after a short silent time. t_h is the threshold of the silent time between two sequential requests in one session and is set to 20 minutes in this paper. We have also tested other values in various simulations and obtained similar results. As evidence, Fig. 2 shows that the proportion of intervals smaller than 5 minutes differs only slightly from the proportion smaller than 20 minutes: about 62 percent of the intervals between two sequential requests from the same user are smaller than 5 minutes, and 70 percent are smaller than 20 minutes.

Fig. 2. Distribution of the interval times.
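The session model of Section 2.1.4 can be illustrated with a minimal Python sketch. It is not the authors' tool: the function name `split_sessions`, the `(user_id, start, end)` tuple layout with times in seconds, and the choice to measure the silent time from the end of one request to the start of the next are all assumptions made for illustration.

```python
from collections import defaultdict

def split_sessions(requests, t_h=20 * 60):
    """Group requests into per-user sessions.

    requests: iterable of (user_id, start, end) tuples, times in seconds
              (hypothetical layout, not the real 44-field trace format).
    t_h:      silent-time threshold in seconds (20 minutes in the paper).
    Returns {user_id: [session, ...]}, each session a list of (start, end).
    """
    by_user = defaultdict(list)
    for user, start, end in requests:
        by_user[user].append((start, end))

    sessions = {}
    for user, reqs in by_user.items():
        reqs.sort()  # order each user's requests by arrival time
        current = [reqs[0]]
        user_sessions = [current]
        last_end = reqs[0][1]
        for start, end in reqs[1:]:
            # Assumption: silent time = next start minus latest end so far.
            if start - last_end <= t_h:
                current.append((start, end))
                last_end = max(last_end, end)
            else:
                current = [(start, end)]
                user_sessions.append(current)
                last_end = end
        sessions[user] = user_sessions
    return sessions
```

For example, two requests from one user separated by 100 seconds of silence fall into the same session, while a third request arriving more than 20 minutes after the second one finishes starts a new session.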

2.2 Characteristics of User Behavior

In this section, we present the statistical analysis of user behavior in the VoD system in terms of the number of concurrent online users, user interactivities, popularity evolution and skewness of video clips, and user recurrence and repeated requests. In the next section, we will further discuss the impact of user behavior on system scalability in detail.

2.2.1 The Number of Concurrent Online Users

Since the efficiency of P2P networks depends on the cooperation among peers, the number of concurrent online users should significantly affect the scalability of P2P-based VoD service. Figs. 3a and 3b depict the number of concurrent online users over the 24 hours of a day, averaged across the 50 days, for the news and music categories, respectively. The two categories exhibit significantly different daily variations in the number of concurrent users. While the number stays high from 10:00 until 22:00 for the music category, there are two obvious peaks at 9:00-10:00 and 22:00-23:00 for news.³ A likely reason is that users tend to watch news clips either in the morning or before going to sleep. These statistical results suggest that the content type of video clips affects user arrival patterns. In specific systems, the variations of concurrent online users may be used to guide system optimization. For example, the system may need to adopt a more efficient incentive scheme to encourage clients to contribute resources when there are more concurrent online users in the system.

3. Since most users of the VoD system are from China, all the timestamps used in this paper are in Beijing Time (GMT+08:00).

Fig. 3. Average number of concurrent online users. (a) News. (b) Music.

2.2.2 User Interactivity

Previous studies have shown that user interactivities, such as pause and jump forward/backward, degrade the performance of IP multicast-based solutions in providing VoD service [14]. Intuitively, user interactivities⁴ should also have a negative impact on the scalability of P2P-based VoD systems. As representative examples, Fig. 4 depicts the start and end position of each request for three randomly selected video clips with different lengths. We do not distinguish the news and music categories here because they present essentially the same characteristics in user interactivities. In the figures, the requests are sorted first by the start position and then by the end position in the requested video file, so the jumping requests, which usually do not start from the beginning of the video, all appear at the end of the curve. We can see that the longer the video clip is, the fewer requests start from the very beginning of the clip and last until its end. This indicates a high degree of user interactivity for long video clips. It is not surprising, since users may not be patient enough to go through the whole video when the clip is long. Instead, users prefer jumping to the positions they are interested in. Therefore, the sequential access model used in some previous works [7], [8], i.e., the assumption that the user always requests a video clip from the beginning, cannot capture the real user behavior very well.

4. Fast forward/backward operations are not taken into account in this paper since they appear too rarely in our traces. This may be because Windows Media Player does not support such operations well in streaming applications and the connection bandwidth is limited at the client side.

2.2.3 Popularity Evolution and Skewness of Video Clips

We present the analysis of video popularity in two aspects: popularity evolution and popularity skewness. The former reflects how the number of requests to a specific video clip changes over time, while the latter describes the distribution of requests among multiple video clips.

In VoD service, after a video clip is added to the streaming server, its popularity, reflected as the hit rate, changes over time. This so-called popularity evolution affects the scalability of P2P-based VoD service. For example, if the requests to a given video clip aggregate in a short period of time, the streaming server benefits more from P2P networks, because the aggregation of requests increases the cooperation chance among peers. Figs. 5a and 5b show the 10-day popularity evolution of a random set of 10 news clips and 10 music clips, respectively. As shown, for news clips, most requests arrive within only a few days after their initial launch on the server, while for music clips, the daily request rate stays roughly stable during this period. This “time-efficacy” characteristic of requests in the news category can potentially be exploited to improve the efficiency of P2P networks. For example, the cached content of news clips can be refreshed faster than that of music clips without much degradation of system performance and, thus, the cache size allocated to news clips can be reduced.

As an on-demand service, a typical VoD system usually offers a large number of video clips. Users can choose any video clip at any time, entirely at their own discretion, which results in different popularities among video clips. In P2P-based VoD systems, when a client requests a video clip, only those online users who cache the same clip can serve the request. Thus, the more requests there are to a video clip, the more opportunities the requesting clients have to cooperate with each other, and the less workload falls on the video server. Therefore, the distribution of requests among video clips, i.e., the popularity skewness, certainly affects the scalability of P2P-based VoD service. Because the popularity of each video clip also changes over time, as discussed under popularity evolution, we plot in Fig. 6 a log-log graph of the average distribution of daily requests across the 50 days over the rank of video clips. From this figure, we can see that the popularity distribution in both categories is similar to a Zipf distribution, but the daily requests for less popular clips tend to drop more quickly. Furthermore, the popularity distribution in the news category is more skewed than that in the music category. Recall that the total number of news video clips is as many as 3,064. However, as shown in the figure, the requests to news clips concentrate on only about 400 of them each day. This aggregation of requests on a few hot video clips increases the cooperation chance of clients and, thus, is helpful to the system scalability. By comparison, the daily requests in the music category are distributed more “evenly” over more clips than those in the news category. The impact of popularity skewness among video clips on system scalability will be further evaluated and discussed in the next section.
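A rank-frequency analysis of the kind plotted in Fig. 6 can be sketched as follows. This is only an illustration under stated assumptions: the helper names `rank_frequency` and `zipf_slope` are hypothetical, the input is a plain list of requested clip identifiers for one day, and the least-squares slope in log-log space is one common way to quantify skewness, not necessarily the one used by the authors.

```python
import math
from collections import Counter

def rank_frequency(clip_requests):
    """Return request counts sorted from most to least popular clip."""
    counts = Counter(clip_requests)
    return sorted(counts.values(), reverse=True)

def zipf_slope(freqs):
    """Least-squares slope of log(frequency) versus log(rank).

    A pure Zipf distribution with exponent a gives a slope of -a.
    Requires at least two ranks and strictly positive frequencies.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A steeper (more negative) slope corresponds to requests concentrating on fewer clips, which is the situation described for the news category.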



Fig. 4. Start and end position of requests to video clips. (a) Video length = 62 seconds. (b) Video length = 255 seconds. (c) Video length = 1,796 seconds.

Fig. 5. Popularity evolution in 10 days after being launched on the server. (a) News. (b) Music.

Fig. 6. Average daily popularity skewness.

2.2.4 User Recurrence and Repeated Requests

P2P-based VoD systems rely on the content cached at the client side to fulfill the requests from the user community. To some extent, user recurrence and repeated requests from users determine the cache efficiency and, thus, should be carefully analyzed.

User recurrence is defined here as the number of times a user enters the VoD service, i.e., the number of sessions of that user. In P2P-based VoD service, if the content cached at clients can be preserved while they are offline (we call this hard cache, which will be elaborated in the next section), it can be used as soon as the clients log in to the system again. In this case, user recurrence will definitely affect the system scalability. If a client enters the system only once, the content kept in its cache is of no use to the system after it leaves. Thus, a higher user recurrence is potentially favorable for P2P-based VoD service. Fig. 7 depicts the Cumulative Distribution Function (CDF) of the distribution of users over the number of sessions in both the news and music categories. In the 50 days of the trace study, about 80 percent and 65 percent of the users enter the system only once for the news and music categories, respectively, while about 97 percent and 95 percent of the users enter fewer than five times. The average numbers of sessions per user in the news and music categories are 1.5 and 2.0, respectively, indicating that the user recurrence of the music category is higher than that of news.

Fig. 7. User distribution over recurrence time.

Besides user recurrence, we also investigate a metric called repeated request, which is the number of times an identical video segment is requested by the same user. If a client renders a video segment multiple times, it does not need to request the content from the P2P network after the first playback, thanks to its local cache. In P2P-based VoD service, if the content cached at clients cannot be preserved while they are offline (we call this soft cache), then the local cache can only help the repeated requests within a single session. Otherwise, the repeated requests over multiple sessions can also be fulfilled locally unless the needed segments have been replaced by other segments. Therefore, we plot the CDF of the distribution of segments over the number of times they are requested in a single session and across sessions in Figs. 8 and 9, respectively. Since all the timestamps in our traces are at the granularity of seconds, we define one second of video content as a segment. In Fig. 8, for the news category, more than 96 percent of the segments are requested only once in a single session, while for the music category, the proportion falls to 88 percent. This is a reasonable result because users seldom watch the same news more than once, but they may watch their favorite music video clips again and again. It implies that more requests for music clips than for news clips will be fulfilled by the local cache. Moreover, Fig. 9 shows that about 6 percent of the news video segments and 18 percent of the music ones are requested more than once by the same user across sessions, compared to 4 percent and 12 percent in the single-session case in Fig. 8. This indicates that some segments are requested by a user in more than one session and, thus, in P2P-based VoD service, a client benefits more from its own local cache if the cached content can be preserved while it is offline. The impact of user recurrence and repeated requests will be further evaluated in the next section.

Fig. 8. The distribution of segments over request times in a single session.

So far, we have analyzed several important aspects of user behavior using the traces collected from the VoD system of cctv.com. In summary, our main findings are as follows:

1. The daily variations of the number of concurrent online users are quite different for the news and music categories.
2. The degree of user interactivity increases with the video length for both news and music clips, and the sequential access assumption does not hold in practical VoD systems.
3. The requests in the news category are more likely than those in the music category to aggregate in a short period of time and on a small proportion of video clips.
4. The degree of user recurrence and repeated requests in the music category is higher than that in the news category.

We believe these results help in understanding user behavior in VoD service and also give a first idea of the potential impact of user behavior on the scalability of P2P-based VoD service before the simulations in the next section.

Fig. 10. A schematic example of the general system framework of P2P-based VoD service.

3 SCALABILITY OF P2P-BASED VIDEO-ON-DEMAND SERVICE

In this section, we evaluate the scalability of P2P-based VoD service under the real user behavior obtained in the previous section and further examine the impact of user behavior on system scalability through extensive trace-driven simulations for both the news and music categories. For ease of explanation, we first introduce a general framework of P2P-based VoD systems that is used in our simulations. Then, we present the simulation methodology, including the assumptions, the cache types and cache replacement algorithms adopted at the client side, the metrics used to evaluate the system performance, and the network conditions of peers used in the simulations. After that, we present the results on the practical scalability of P2P-based VoD service under different cache types and cache replacement algorithms. Finally, we identify the impact of various aspects of user behavior on system scalability at the end of this section.

Fig. 9. The distribution of segments over request times across sessions.

3.1 A General Framework of P2P-Based VoD Systems

Fig. 10 depicts the general framework of P2P-based VoD systems that is used in our simulations. The framework mainly consists of three components:

1. Video server. The video server acts as the original source and offers on-demand streaming service to the requesting peers. In general, it is in charge of publishing available video files, responding to the requests from users, and streaming video content to the peer community. In this framework, the video server has all the video clips in its storage and is online all the time, so clients can always fetch a video segment from the video server if no appropriate peer holds that segment. Note that the video server may in practice be multiple centralized or distributed servers that balance the workload.
2. Client. A client requests video segments from peers or the video server, assembles them into a video stream, and then renders the stream for playback. As discussed, a client in P2P-based VoD stores the displayed segments in its cache and altruistically serves other clients' explicit requests with those cached segments. As mentioned in the last section, there are two types of cache at the client side, distinguished by whether data persists while the client is offline. In addition, the cache capacity of clients is limited, and when the cache is full, a replacement algorithm is invoked to replace some video segments in the cache. We will examine the impact of cache types and cache replacement algorithms on the system scalability in this section.
3. Tracker. The tracker is responsible for recording the cache availability information of online clients in the P2P community. It responds to queries from clients about where they can find the video segments they need. Although the tracker is drawn schematically as a centralized server in Fig. 10, in our framework it can be implemented in either a centralized or a distributed manner, for example, with DHT services [15].

Under such a system framework, when a user logs in, the client first checks the content in its cache and reports the cache information to the tracker for registration. For each on-demand request, the client first checks whether the video segments are stored in its local cache. If so, it begins the playback immediately. Otherwise, the client asks the tracker where it can download the missing segments. After getting the query results from the tracker, it tries to fetch the segments from other peers or the video server. During its lifetime, the client periodically updates its cache information at the tracker. When the client leaves, its cache information is removed from the tracker.

Though our proposed framework is simple, it abstracts the key components of a typical P2P-based VoD system, including peer registration, resource locating, and data retrieval. Most existing P2P-based VoD systems can be modeled in this framework, such as the ones proposed in [9] and [11]. So, we believe it is both important and valuable to evaluate the performance of P2P-based VoD systems in such a simple but representative framework.
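The registration and lookup workflow described above can be sketched with a small centralized tracker. This is an illustrative sketch only: the class name `Tracker`, the helper `fetch_source`, the use of `(clip, second)` tuples as segment identifiers, and the rule of picking the first peer in sorted order are assumptions, not details given in the paper.

```python
class Tracker:
    """Records which online peers cache which segments (centralized variant)."""

    def __init__(self):
        self._holders = {}  # segment id -> set of peer ids caching it
        self._caches = {}   # peer id -> set of segment ids it reported

    def register(self, peer, segments):
        """Called at login and on periodic updates with the peer's cache content."""
        self.unregister(peer)
        cached = set(segments)
        self._caches[peer] = cached
        for seg in cached:
            self._holders.setdefault(seg, set()).add(peer)

    def unregister(self, peer):
        """Called when the peer leaves; removes all of its cache information."""
        for seg in self._caches.pop(peer, ()):
            holders = self._holders[seg]
            holders.discard(peer)
            if not holders:
                del self._holders[seg]

    def lookup(self, segment, requester):
        """Return the other online peers that cache the segment."""
        return sorted(self._holders.get(segment, set()) - {requester})


def fetch_source(tracker, peer, local_cache, segment):
    """Decide where a client gets a segment: local cache, a peer, or the server."""
    if segment in local_cache:
        return "local"
    candidates = tracker.lookup(segment, peer)
    # Assumption: take the first candidate; the server is the fallback source.
    return candidates[0] if candidates else "server"
```

In this sketch, the server is contacted only when neither the local cache nor any online peer holds the segment, which mirrors the role of the video server as the source of last resort in the framework.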

3.2 Methodology of Trace-Driven Simulations

We have developed a trace-driven simulation tool that takes the collected traces as input and uses a discrete-event method to simulate the actions of each peer. In this section, we briefly present the assumptions, the two cache types, the four cache replacement algorithms, the three performance metrics, and the network connections of peers used in the simulations.

3.2.1 Assumptions

In our simulations, we make the following assumptions:

1. Clients receive the video segments from the video server or other peers at the playback rate so that continuous playback can be sustained.
2. A segment is one second of video content. That is, if the bit rate of the video stream is 512 Kbps, the size of a segment is 64 Kbytes.
3. Clients start to receive a video segment at the beginning of every second and, thus, finish the transmission of that segment at the end of that second.
4. Clients update their cache information at the end of every second, so that the tracker records the up-to-date cache information of every peer at the granularity of video segments.
5. There is no latency for clients to query cache information from the tracker. This is reasonable because the query can be issued in advance, before the client starts to request the segment.

The above assumptions essentially define several important parameters used in our simulations. We believe they are reasonable and represent the typical settings of a practical P2P-based VoD system.
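The segment size in assumption 2 follows from dividing the bit rate by eight bits per byte. The short sketch below reproduces that arithmetic; the function name `segment_size_kbytes` is hypothetical, and applying it to the roughly 300 Kbps clips of Section 2.1.1 is an illustration rather than a figure stated in the paper.

```python
def segment_size_kbytes(bit_rate_kbps, segment_seconds=1):
    """Size of one segment in Kbytes for a stream of the given bit rate."""
    return bit_rate_kbps * segment_seconds / 8  # 8 bits per byte

# The example from assumption 2: a 512 Kbps stream yields 64-Kbyte segments.
example_512 = segment_size_kbytes(512)
# Clips encoded at about 300 Kbps would yield segments of about 37.5 Kbytes.
example_300 = segment_size_kbytes(300)
```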

3.2.2 Cache Types

In our simulations, we mainly use two cache types at the client side:

1. Soft cache. In the case of soft cache, the client stores the received content in its temporary memory (RAM, for instance), and thus the cached content can only be accessed during the current online duration and is cleaned up as soon as the user goes offline.
2. Hard cache. In the case of hard cache, the client stores the received content in its permanent memory (hard disk, for instance), and thus the cached content is preserved during offline periods and is available once the user logs in to the system again.
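The difference between the two cache types comes down to what happens to the cached segments when the user goes offline. A minimal sketch, with a hypothetical `ClientCache` class and a `persistent` flag introduced only for this illustration:

```python
class ClientCache:
    """Toy client cache; persistent=True models a hard cache."""

    def __init__(self, persistent):
        self.persistent = persistent
        self.segments = set()

    def store(self, segment):
        self.segments.add(segment)

    def go_offline(self):
        # A soft cache lives in temporary memory, so it is emptied here.
        if not self.persistent:
            self.segments.clear()

    def go_online(self):
        # Whatever survived the offline period can be served again.
        return set(self.segments)
```

After storing a segment and going offline, a soft cache returns an empty set on the next login, whereas a hard cache still returns the stored segment.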

3.2.3 Cache Replacement Algorithms

When a client’s cache is full, a cache replacement algorithm must choose a cached video segment to replace. The performance of P2P-based VoD service is essentially decided by whether the client can find and fetch its desired segments from peers rather than from servers. Thus, the cache replacement algorithm is very important in determining the system performance. In the following simulations, we adopt and compare four rudimentary cache replacement algorithms:

1. First-in first-out (FIFO). In FIFO, the client caches segments in a queue. When the cache is full, it replaces the segment that was added to the queue first. The FIFO algorithm is rather simple and can be implemented locally at the client side without any global information.
2. Most copy cached (MCC). In MCC, the client replaces the segment that has the most cached copies in the whole system. In the simulations, we assume that the client can get the number of cached copies of each segment from the tracker.
3. Least recently used (LRU). In LRU, the client replaces the segment that was played less recently than any other segment. However, because most segments are requested only once by most users according to the statistical results in the last section, we find that the performance of such a local LRU is very similar to that of FIFO. In order to look into the performance of cache replacement algorithms related to the access patterns of users, we implement the LRU algorithm as follows: the client replaces the segment in its cache that was requested least recently by clients systemwide, rather than by the client itself. This LRU algorithm thus reflects the global access patterns of the P2P community in VoD service.
4. Least frequently used (LFU). In LFU, the client replaces the segment that was played less frequently than any other segment. Similar to LRU, we also implement the LFU algorithm in a global manner: the client replaces the segment in its cache that was requested least frequently by clients systemwide, rather than by the client itself.

It should be mentioned that although MCC, LRU, and LFU require global information, they are the most intuitive and representative cache replacement algorithms that consider client cache status and user access patterns in P2P-based VoD service. Therefore, we believe it is valuable to examine the performance of MCC, LRU, and LFU in our simulations so as to guide the further design of more sophisticated distributed cache replacement algorithms.
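The four replacement policies differ only in how the segment to evict is chosen. The Python sketch below expresses each policy as a victim-selection function. The function names and the dictionaries `copies`, `last_request_time`, and `request_count` are hypothetical; they stand for the systemwide statistics that a client is assumed to obtain from the tracker, and ties are broken by the order of `cached`.

```python
def fifo_victim(cached):
    """cached: list of segment ids, oldest insertion first."""
    return cached[0]


def mcc_victim(cached, copies):
    """Evict the cached segment with the most copies in the whole system."""
    return max(cached, key=lambda s: copies.get(s, 0))


def lru_victim(cached, last_request_time):
    """Global LRU: evict the segment requested least recently systemwide."""
    return min(cached, key=lambda s: last_request_time.get(s, float("-inf")))


def lfu_victim(cached, request_count):
    """Global LFU: evict the segment requested least frequently systemwide."""
    return min(cached, key=lambda s: request_count.get(s, 0))
```

FIFO needs only the local insertion order, which is why it can run without any tracker support, while the other three functions take a systemwide statistic as an extra argument.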

3.2.4 Evaluation Metrics

To evaluate the scalability of P2P-based VoD systems, we adopt three performance metrics as follows:

1. Saved server bandwidth (SSB). SSB is defined as the percentage of server bandwidth saved by P2P solutions compared to traditional C/S systems. SSB is the most important metric to evaluate the scalability of P2P-based VoD service.
2. Local cache hit ratio (LCHR). LCHR is defined as the ratio of the number of segments that are hit in the local cache of the client to the total number of requested segments.
3. Peer cache hit ratio (PCHR). PCHR is defined as the ratio of the number of segments that are not in the local cache but are hit in other peers to the total number of requested segments.

It is clear that the SSB benefits from the segments that are either buffered in the local cache of clients or hit in the cache of peers in the system. But in practical systems, the client may not be able to fetch a segment from peers even though it is hit in a peer cache, because of bandwidth restrictions. So the sum of LCHR and PCHR is generally no smaller than SSB.
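The relation between the three metrics can be made concrete with a short Python sketch. The function name and counters are hypothetical. SSB is computed here as one minus the share of segments fetched from the server, which assumes equal-size segments and a C/S baseline in which every requested segment comes from the server; both are assumptions of this sketch rather than details stated in the definitions above.

```python
def scalability_metrics(total, local_hits, peer_hits, server_fetches):
    """Return (SSB, LCHR, PCHR) as fractions of all requested segments.

    total          -- number of requested segments
    local_hits     -- segments found in the client's own cache
    peer_hits      -- segments not cached locally but cached by some peer
    server_fetches -- segments actually downloaded from the server
    """
    if total <= 0:
        raise ValueError("total must be positive")
    lchr = local_hits / total
    pchr = peer_hits / total
    ssb = 1.0 - server_fetches / total
    return ssb, lchr, pchr
```

As an illustrative case with made-up numbers, take 1,000 requests with 100 local hits and 700 peer hits, of which 100 cannot be delivered by peers because of upload limits. The server then serves 300 segments, so SSB is 0.7 while LCHR plus PCHR is 0.8.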

3.2.5 Network Conditions of Peers in Simulations

To simulate the bandwidth heterogeneity of the peers, we use three different types of typical ADSL nodes in our simulations. Their upload capacities are 1 Mbps, 384 Kbps, and 128 Kbps, and their download capacities are 3 Mbps, 1.5 Mbps, and 768 Kbps, respectively. In our simulations, we randomly assign the connection type to the peers with percentages of 30 percent, 40 percent, and 30 percent; thus, the average upload bandwidth of each peer is about 500 Kbps, which is even smaller than the stream bit rate. We will investigate how much server bandwidth can be saved in such a P2P community with “tight” bandwidth supply.

Fig. 11. SSB versus cache size. (a) News. (b) Music.
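As a quick check of the average upload bandwidth stated in Section 3.2.5, the following Python sketch computes the weighted mean and draws a random connection type. The names `PEER_TYPES`, `mean_upload_kbps`, and `assign_type` are hypothetical, and 1 Mbps is taken as 1,000 Kbps, which is an assumption of this example.

```python
import random

# (share of peers, upload Kbps, download Kbps) for the three ADSL types
PEER_TYPES = [
    (0.30, 1000, 3000),
    (0.40, 384, 1500),
    (0.30, 128, 768),
]


def mean_upload_kbps(types=PEER_TYPES):
    """Expected upload capacity of a randomly assigned peer."""
    return sum(share * up for share, up, _ in types)


def assign_type(rng):
    """Draw one connection type according to the configured shares."""
    weights = [share for share, _, _ in PEER_TYPES]
    return rng.choices(PEER_TYPES, weights=weights)[0]
```

The weighted mean is 300 + 153.6 + 38.4 = 492 Kbps, consistent with the figure of about 500 Kbps and below the 512 Kbps example bit rate used in Section 3.2.1.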

3.3 Practical Scalability of P2P-Based VoD Service of Different Cache Schemes

In this section, we present the simulation results with different settings of cache types and cache replacement algorithms in news and music categories to give a comprehensive view of the scalability of P2P-based VoD service. Figs. 11a and 11b depict the average SSB of the four cache replacement algorithms against the cache size of clients for news and music categories, respectively. For both hard cache and soft cache, the SSB always increases with the cache size, regardless of the cache replacement algorithm. When the cache size is infinite, meaning that the client can keep all the received content in its cache without any replacement, the SSB of news and music categories is about 75 percent and 76 percent, respectively, in the case of soft cache. This indicates that, with soft cache, the VoD system can save at most about 75 percent of the server bandwidth. In the case of hard cache, SSB reaches about 85 percent and 90 percent for news and music categories, respectively, much higher than that of soft cache. Hard cache thus improves the scalability of P2P-based VoD service over soft cache, and more importantly, the benefit of hard cache increases with the cache size. This suggests that hard cache is the better choice when designing a P2P-based VoD system, especially when the clients have large cache capacities.




Fig. 12. SSB versus concurrent online users.

Fig. 13. SSB versus number of requests to video clips.

We then proceed to examine the performance differences among the four cache replacement algorithms. As shown in Fig. 11, in the case of soft cache, MCC achieves the highest SSB, while LRU and LFU perform much worse for both news and music categories. Note that when the cache size is larger than 3,000 seconds, the SSB of all four cache replacement algorithms is almost the same because cache replacement rarely occurs. In the case of hard cache, the relative performance of the cache replacement algorithms is quite different. In the news category, LRU achieves the highest SSB when the cache size is relatively large, while MCC performs much worse. On the contrary, in the music category, LRU performs badly and MCC achieves higher SSB. This is because in the news category, the requests from users to a newly added video clip are generally crowded in time, so MCC tends to erase the latest and most popular video segments while keeping the older and less popular ones. However, it would be better to retain the hot segments since they are more likely to be requested than the older ones. In the music category, in contrast, hot video clips stay hot for a very long time. In order to avoid too many duplicated cache copies of hot segments among clients, MCC allows some of the clients to cache the less-popular segments, and thus achieves a better SSB. Interestingly, we notice that LFU is the worst algorithm in both news and music categories. This is not so surprising if we realize that clients tend to keep the most popular segments in cache when adopting the LFU algorithm, and thus cause cache misses for less-popular segments.

The above analysis also implies that there is hardly a unique best cache replacement algorithm for P2P-based VoD systems, because the performance is essentially affected by user behavior in the system. So we believe it is not a good idea to design a complex algorithm when the user behavior is unknown. However, from Fig. 11, we can observe that although FIFO is very simple, its performance is comparable to, if not better than, that of other more complex algorithms. So we suggest that FIFO is a good choice when designing a P2P-based VoD system.

3.4 Impacts of User Behavior on System Scalability of P2P-Based VoD Service

So far, we have discussed the system scalability of P2P-based VoD service with different cache types and cache replacement algorithms. Here, we highlight the impact of user behavior on system performance.

3.4.1 Impact of the Number of Concurrent Online Users

As shown in Fig. 12, when the cache size at clients is infinite, the SSB increases with the number of concurrent online users for both news and music categories, no matter whether soft cache or hard cache is enabled. This result shows that P2P-based VoD systems exhibit an appealing “self-scalability” characteristic, since the service capacity is amplified as the user scale grows. Furthermore, the figure shows that for the same number of concurrent online users, the SSB of the news category is higher than that of music, because of the flash-crowd requests for news clips, which will be discussed below.

3.4.2 Impact of Popularity Evolution and Skewness

As discussed, there is a direct relation between request popularity and system scalability in cooperative P2P networks. Here, we explore the impact of the popularity differences among video clips on the system scalability. We show the average SSB against the number of requests to the clips in Fig. 13. Note that the peers can cooperate with each other only when they are interested in the same clip. Therefore, if the requests for a specific video clip are few, the opportunity for peer cooperation is low, and in turn, the system exhibits poor scalability. Fig. 13 confirms that the SSB of video clips with more requests is higher than that of unpopular ones. However, it is worth pointing out that even if video clips have similar request rates, their SSB can still be significantly different. Referring to Fig. 13, the SSB for news clips is much higher than that of music clips with the same number of requests, in the case of both soft cache and hard cache. We attribute this to the stronger “time-efficacy” of the news category as analyzed in the last section, which means the requests for news clips tend to arrive in a short period of time. To some extent, the flash crowd of requests to a video clip compensates for the negative impact of the asynchronous nature of VoD service, because the requests exhibit a higher correlation over time.

3.4.3 Impact of User Recurrence and Repeated Requests

To gain an insightful understanding of the impact of user recurrence and repeated requests, we further consider whether the requested video segment is hit in the local cache or in other peers. Fig. 14 shows the LCHR against cache size for news and music categories. The LCHR of the music category is much higher than that of news, because users are more likely to request music video clips more than once. Besides, we also found that in the case of hard cache, the increase of LCHR of the music category is much larger than that of news, due to its higher user recurrence and more repeated requests over sessions. Additionally, Fig. 15 depicts the PCHR against the cache size for the two categories. Since more segments of the music category have already been hit in the local cache, the PCHR of music is smaller than that of news. The cache replacement algorithm is FIFO in Figs. 14 and 15.

Fig. 14. LCHR versus cache size.

Fig. 15. PCHR versus cache size.

3.4.4 How Much Can Systems Benefit from the Extended Online Duration of Clients?

Intuitively, if the client stays online for an extended time after a session finishes, the content in its cache can be used to serve other clients. Otherwise, it goes offline with no benefit for the system scalability. We are, hence, motivated to purposely enlarge the online duration of clients and plot SSB against the extended online duration of clients in Fig. 16, in which the x-axis is the extended duration we assume the clients stay in the system after sessions. In the simulations, the cache size at the client side is 3,600 seconds and the cache replacement algorithm is FIFO. We can see that SSB increases distinctly in all cases when the extended online duration becomes longer. This optimistic result suggests that a well-designed incentive mechanism, which encourages users to stay online when not actually using the system, will indeed improve the system scalability.

Fig. 16. SSB versus extended online duration.

In this section, we have described our trace-driven simulations, examined the scalability of P2P-based VoD service, and further identified the impact of user behavior on the system scalability. As a short conclusion, our main observations comprise the following:

1. P2P networks exhibit promising scalability in providing VoD service. For example, in the case of infinite hard cache, about 85 percent and 90 percent of the server bandwidth can be saved for news and music categories, respectively. Besides, the SSB increases with the system scale, exhibiting an appealing “self-scalability” characteristic.
2. Hard cache is always helpful to increase the system scalability, especially when the cache size is big at the client side. However, it is not easy to find the best cache replacement algorithm, because the performance depends on the user behavior in the VoD system. We also find that FIFO performs fairly well in both news and music categories.
3. User behavior has great impact on the system scalability. For example, the “time-efficacy” of requests and the more skewed popularity distribution help the news category to achieve better scalability than music, and user recurrence and repeated requests also lead to different LCHR and PCHR.
4. The system scalability benefits from extended online durations of clients, and thus incentive mechanisms that encourage users to stay online are favorable in P2P-based VoD service.

4 RELATED WORK

In the past years, P2P-based live streaming has been a very hot research topic, and its scalability has already been verified by industry experience. A good survey is given in [16]. But for P2P-based VoD service, although a number of P2P approaches have been proposed in previous works [7], [8], [9], [10], [11], its scalability has not been carefully studied under real VoD workloads and thus still remains far from clear, which motivates our work in this paper. In [7], the clients are grouped into generations and employ a caching scheme to relay the video stream among peers. In [8], a distributed patching technique is proposed to cooperatively stream video content to clients. In [9], an application-layer asynchronous streaming multicast mechanism is designed to address the problem of on-demand media distribution. In [10], a receiver-driven P2P media streaming system is proposed, which coordinates the peers, streams the media from multiple peers, performs load balancing, and handles peers going online and offline at the client side. In [11], BASS uses a BitTorrent-assisted method to support large-scale VoD services. All these approaches need the clients to cache video segments and cooperatively relay them to others, and thus can be partially modeled by our system framework. Following the definitions in this paper, soft cache is used in [7], [8], and [9], while hard cache is used in [10] and [11]. Our work in this paper complements these works by providing a more comprehensive idea about the practical scalability of P2P-based VoD service.

Traces collected from practical systems are very important in evaluating and understanding the real performance of P2P systems. For example, in [17], Gnutella query traces are used to evaluate a distributed caching mechanism and an adaptive search approach for reducing the search traffic in unstructured P2P networks. In [18], several categories of traces are used to evaluate the performance of node selection strategies in reducing the churn in P2P networks. In [19], traces collected from a practical P2P live streaming system are used to verify the performance of a newly designed system that considers the priority among peers based on their contributions. However, there is still little work studying the scalability of P2P-based VoD service with real VoD traces.

A number of studies have focused on characterizing the workloads of various VoD systems [20], [21], [22], [23], [24], [25]. Traces from the mMOD system were analyzed in [20], and some aspects of user behavior were observed, such as the high temporal locality of accesses and the preview of the initial portion of videos. Video access from a large university was studied in [21], and a detailed characterization of session duration, object popularity, and sharing patterns of streaming media among the clients was presented. In [22], the client session arrival process was carefully studied through the analysis of two educational media server workloads. Two enterprise media server workloads were extensively studied in [23], where the authors concentrated on the analysis of media server access trends, access locality, dynamics, and evolution of the media workload over time. In [24], the authors proposed a general model for workload characteristics in streaming media workloads. Traces from a large commercial VoD system were analyzed in [25], and the study focuses on user behavior and content access patterns. Compared to these works, we focus our analysis on the aspects of user behavior that have great impact on the scalability of P2P-based VoD systems. Furthermore, the traces are leveraged to evaluate the practical scalability of P2P-based VoD service in the second part of this paper, which also distinguishes our work from previous studies.

Reference [26] is closely related to this paper. The authors collected a large amount of traces from MSN Video and analyzed the potential benefits of P2P-based VoD service. However, the authors focused on the single-video approach, i.e., a peer only redistributes the video it is currently watching. In this paper, in contrast, the client can relay any video segments in its cache to other peers, which is a more complex multiple-video approach. Besides, we analyze the impacts of different cache types and cache replacement algorithms at clients on the whole system performance, which is not carefully addressed in [26].

5 CONCLUSION

P2P networks emerge as a promising way to provide large-scale VoD service over the Internet. However, although a number of architectures and protocols have been proposed, the practical scalability of P2P-based VoD service under real user behavior has not been well studied and still remains far from clear. In this paper, we are thus motivated to investigate the scalability of P2P-based VoD service under real user behavior summarized from real traces of a VoD service, and further identify the impact of user behavior upon system scalability.

This paper essentially consists of two parts. In the first part, we have analyzed the real traces of millions of requests from users collected from the publicly available VoD service of China’s largest television station, CCTV, over a period of 50 days to identify the key aspects of user behavior that potentially affect the scalability of P2P-based VoD service. The analysis is performed separately for news and music categories, and various aspects of user behavior are presented in terms of the number of concurrent online users, user interactivities, popularity evolution and skewness of video clips, as well as user recurrence and repeated requests. In the second part, we have conducted extensive trace-driven simulations to evaluate the scalability of P2P-based VoD service under real user behavior. On the one hand, we present the simulation results under different cache types and cache replacement algorithms to give comprehensive ideas about the system scalability in different settings; on the other hand, we compare the simulation results for news and music categories, and thus further identify the impacts of various aspects of user behavior upon system scalability. We believe our findings drawn from the analysis and simulations in this paper will help better understand user behavior in VoD service and its impact on the scalability of P2P-based VoD service, and can also be used to guide future system design and optimization.

In our future work, we will investigate the system performance of P2P-based VoD service in more sophisticated architectures. For example, we plan to evaluate the performance of the newly proposed Dynamic Skip List (DSL) [27] under practical user behavior using our traces. Besides, we also plan to conduct an analytical study of the performance of P2P-based VoD service.

ACKNOWLEDGMENTS

The research was supported in part by grants from RGC under Contracts CERG 622407 and N_HKUST609/07, the NSFC Oversea Young Investigator Grant under Grant 60629203, the National Natural Science Foundation of China under Grants 60773158 and 60503063, the National High-Tech Research and Development Plan of China (863 Program) under Grant 2006AA01Z321, the National Basic Research Program of China (973 Program) under Grant 2006CB303103, and the Key Project of Guangzhou Municipal Government Guangdong/Hong Kong Critical Technology under Grant 2006Z1-D6131.

REFERENCES

[1] BitTorrent, http://www.bittorrent.com, 2008.
[2] KaZaA, http://www.kazaa.com, 2008.
[3] X. Zhang, Q. Zhang, Z. Zhang, G. Song, and W. Zhu, “A Construction of Locality-Aware Overlay Network: mOverlay and Its Performance,” IEEE J. Selected Areas in Comm., special issue on recent advances on service overlay networks, Jan. 2004.
[4] Y.H. Chu, S.G. Rao, and H. Zhang, “A Case for End System Multicast,” Proc. ACM SIGMETRICS ’00, June 2000.
[5] X. Zhang, J. Liu, B. Li, and T.-S.P. Yum, “CoolStreaming/DONet: A Data-Driven Overlay Network for Live Media Streaming,” Proc. IEEE INFOCOM ’05, Mar. 2005.
[6] M. Zhang, J.-G. Luo, L. Zhao, and S.-Q. Yang, “A Peer-to-Peer Network for Live Media Streaming Using a Push-Pull Approach,” Proc. 13th Ann. ACM Int’l Conf. Multimedia (Multimedia ’05), Nov. 2005.
[7] T.T. Do, K.A. Hua, and M.A. Tantaoui, “P2VoD: Providing Fault Tolerant Video-on-Demand Streaming in Peer-to-Peer Environment,” Proc. IEEE Int’l Conf. Comm. (ICC ’04), June 2004.
[8] Y. Guo, K. Suh, J. Kurose, and D. Towsley, “P2Cast: Peer-to-Peer Patching Scheme for VoD Service,” Proc. 12th World Wide Web Conf. (WWW ’03), May 2003.
[9] Y. Cui, B. Li, and K. Nahrstedt, “oStream: Asynchronous Streaming Multicast in Application-Layer Overlay Networks,” IEEE J. Selected Areas in Comm., vol. 22, pp. 91-106, Jan. 2004.
[10] J. Li, “PeerStreaming: An On-Demand Peer-to-Peer Media Streaming Solution Based on a Receiver-Driven Streaming Protocol,” Proc. IEEE Int’l Workshop Multimedia Signal Processing (MMSP ’05), Oct. 2005.
[11] C. Dana, D. Li, D. Harrison, and C.-N. Chuah, “BASS: Bittorrent Assisted Streaming System for Video-on-Demand,” Proc. IEEE Int’l Workshop Multimedia Signal Processing (MMSP ’05), Oct. 2005.
[12] J.-G. Luo, Y. Tang, J. Zhang, and S.-Q. Yang, “Evaluation of Practical Scalability of Overlay Networks in Providing Video-on-Demand Service,” Proc. IEEE Int’l Conf. Multimedia and Expo (ICME ’06), July 2006.
[13] E. Veloso, V. Almeida, W. Meira, A. Bestavros, and S. Jin, “A Hierarchical Characterization of a Live Streaming Media Workload,” Proc. ACM SIGCOMM Internet Measurement Workshop (IMW ’02), Nov. 2002.
[14] M. Rocha, M. Maia, I. Cunha, J. Almeida, and S. Campos, “Scalable Media Streaming to Interactive Users,” Proc. 13th Ann. ACM Int’l Conf. Multimedia (Multimedia ’05), Oct. 2005.
[15] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Ratnasamy, S. Shenker, I. Stoica, and H. Yu, “OpenDHT: A Public DHT Service and Its Uses,” Proc. ACM SIGCOMM ’05, Aug. 2005.
[16] J.-C. Liu, S. Rao, B. Li, and H. Zhang, “Opportunities and Challenges of Peer-to-Peer Internet Video Broadcast,” Proc. IEEE, special issue on Recent Advances in Distributed Multimedia Communication, 2007.
[17] C. Wang, L. Xiao, Y. Liu, and P. Zheng, “Distributed Caching and Adaptive Search in Multi-Layer P2P Networks,” Proc. 24th IEEE Int’l Conf. Distributed Computing Systems (ICDCS ’04), Mar. 2004.
[18] P.B. Godfrey, S. Shenker, and I. Stoica, “Minimizing Churn in Distributed Systems,” Proc. ACM SIGCOMM ’06, Sept. 2006.
[19] Y.-W. Sung, M. Bishop, and S.G. Rao, “Enabling Contribution Awareness in an Overlay Broadcasting System,” Proc. ACM SIGCOMM ’06, Sept. 2006.
[20] S. Acharya, B. Smith, and P. Parns, “Characterizing User Access to Video on the World Wide Web,” Proc. ACM/SPIE Multimedia Computing Networking Conf. (MMCN ’00), Jan. 2000.
[21] M. Chesire, A. Wolman, G. Voelker, and H. Levy, “Measurement and Analysis of a Streaming Media Workload,” Proc. Third Conf. USENIX Symp. Internet Technologies and Systems (USITS ’01), Mar. 2001.
[22] J.M. Almeida, J. Krueger, D.L. Eager, and M.K. Vernon, “Analysis of Educational Media Server Workloads,” Proc. 11th Int’l Workshop Network and Operating Systems Support for Digital Audio and Video (NOSSDAV ’01), June 2001.
[23] L. Cherkasova and M. Gupta, “Characterizing Locality, Evolution, and Life Span of Accesses in Enterprise Media Server Workloads,” Proc. 12th Int’l Workshop Network and Operating Systems Support for Digital Audio and Video (NOSSDAV ’02), May 2002.
[24] W. Tang, Y. Fu, L. Cherkasova, and A. Vahdat, “MediSyn: A Synthetic Streaming Media Service Workload Generator,” Proc. 13th Int’l Workshop Network and Operating Systems Support for Digital Audio and Video (NOSSDAV ’03), June 2003.
[25] H. Yu, D. Zheng, B.Y. Zhao, and W. Zheng, “Understanding User Behavior in Large Scale Video-on-Demand Systems,” Proc. First ACM SIGOPS/EuroSys European Conf. Computer Systems (EuroSys ’06), Apr. 2006.
[26] C. Huang, J. Li, and K.W. Ross, “Can Internet Video-on-Demand Be Profitable?” Proc. ACM SIGCOMM ’07, Aug. 2007.
[27] D. Wang and J. Liu, “A Dynamic Skip List Based Peer-to-Peer Overlay for VoD with VCR Interactions,” IEEE Trans. Parallel and Distributed Systems, 2007.


Jian-Guang Luo received the BS degree in computer science from Tsinghua University, Beijing, in 2003, where he is currently in the PhD program in the Department of Computer Science and Technology. His research interests include peer-to-peer live and on-demand streaming, multimedia networking, and communication theory. He is a student member of the IEEE.

Qian Zhang received the PhD degree from Wuhan University, Wuhan, China, in 1999. She joined the Department of Computer Science, Hong Kong University of Science and Technology, Kowloon in September 2005 as an associate professor. Her current research interests include the areas of wireless communications, IP networking, multimedia, P2P overlay, and wireless security. From July 1999 to September 2005, she was a research manager at Microsoft Research, Asia. She has published or presented more than 150 refereed papers in leading international journals and at conferences. She has approximately 30 patents pending. She is an associate editor for the IEEE Transactions on Wireless Communications, the IEEE Transactions on Multimedia, the IEEE Transactions on Vehicular Technologies, Computer Networks, and Computer Communications. She has also served as a guest editor for special issues of the IEEE Wireless Communications, the IEEE Journal on Selected Areas in Communications, the IEEE Communications Magazine, and others. She received the TR 100 (MIT Technology Review) World’s Top Young Innovator Award, the Best Asia Pacific Young Researcher Award, elected by the IEEE Communications Society, the Best Paper Award from the Multimedia Technical Committee of the IEEE Communication Society in 2005, and also the Best Paper Award in the Proceedings of the Third International Conference on Quality of Service in Heterogeneous Wired/Wireless Networks and the Proceedings of the 50th Annual IEEE Global Communications Conference. She is a senior member of the IEEE.

Yun Tang received the PhD degree in computer science from Tsinghua University, Beijing, in 2007. He is currently a postdoctoral researcher in the China Institute for Development Planning, Tsinghua University. His research interests include multimedia networking, peer-to-peer networks, and socio-economic impacts of ICT. He is a member of the IEEE.

Shi-Qiang Yang received the BS and MS degrees in computer science from Tsinghua University, Beijing, in 1977 and 1983, respectively. He is a professor, a PhD supervisor, and the executive head in the Department of Computer Science and Technology, Tsinghua University. His research interests include multimedia technology and systems, video compression and streaming, content-based retrieval for multimedia information, and pervasive computing. He has published more than 80 technical papers. He served as a program cochair of the Workshop on ACM Multimedia ’05, PCM ’05, and MMM ’06. He is a codirector of the Tsinghua-Microsoft Multimedia Joint Research Laboratory. He is a member of the IEEE.