Multimed Tools Appl (2007) 35:311–333 DOI 10.1007/s11042-007-0134-7
Throughput optimization for video streaming proxy servers based on video staging

W. K. Cheuk & Daniel P. K. Lun
Published online: 11 May 2007
© Springer Science + Business Media, LLC 2007
Abstract A video streaming proxy server needs to handle hundreds of simultaneous connections between media servers and clients. Internally, every video arriving at the server and delivered from it follows a specific arrival and delivery schedule. While arrival schedules compete for incoming network bandwidth, delivery schedules compete for outgoing network bandwidth. As a result, a proxy server has to provide sufficient buffer and disk cache for storage, together with memory space, disk space and disk bandwidth. In order to optimize the throughput, a proxy server has to govern the usage of these resources. In this paper, we first analyze the properties of a traditional smoothing algorithm and a video staging algorithm. Then, based on the smoothing algorithm, we develop a video staging algorithm for video streaming proxy servers. This algorithm allows us to devise an arrival schedule based on the delivery schedule. Under this arrival and delivery schedule pair, we can achieve a better resource utilization rate and move gracefully between different parameter sets. It is also interesting to note that the usage of resources such as network bandwidth, disk bandwidth and memory space becomes interchangeable. This provides the basis for inter-resource scheduling to further improve the throughput of a video streaming proxy server system.

Keywords Variable-bit-rate video · Proxy server · Video smoothing · Video staging · Video streaming
W. K. Cheuk · D. P. K. Lun (*)
Centre for Multimedia Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
e-mail: [email protected]

1 Introduction

Video streaming is a flourishing technology used in current multimedia communication systems. Its idea is to provide a just-in-time video data delivery service that allows clients to render the video immediately during the transfer. Although existing network infrastructures and transport protocols were not designed for video data transfer, we can still adopt most of them. However, proxy service is one of the few exceptions. Traditionally, a proxy server is employed as an intermediary between servers and clients. Firstly, it is responsible for redirecting clients' requests to different servers to achieve load balancing. Secondly, it acts as a cache to allow an immediate response for duplicated information. It is a vital device for bandwidth saving in an Internet Service Provider deployment [11] for textual and pictorial information exchange. Video data exchange, however, consumes even higher bandwidth than traditional data and requires a larger cache storage space. Therefore, a growth in the demand for proxy services is projected [5].

However, video is a special form of information that requires real-time processing to render it in a human-perceptible form. Any delay or loss of synchronization may easily destroy its perceptibility. To handle the streaming of a video properly, we require a real-time application that is not available in traditional proxy servers, which may store up data for some time before delivery and provide no delivery synchronization among different streams. To cope with this, a special form of proxy server, namely the video streaming proxy server, was designed to serve this purpose [2, 9]. Whenever a client requests a video stream, the video streaming proxy server determines the corresponding media server for request redirection. The proxy server then begins to receive video data packets. These packets can be stored temporarily in the proxy server for transcoding [4, 6] or traffic re-shaping [7]. If the proxy server decides to cache this video stream, it makes a copy of these packets in its cache storage. Hence it is possible that the proxy will find a copy of the video stream in its cache storage the next time the same video stream is requested. The proxy server will then obtain the media directly from the cache rather than retrieving the stream from the actual media server. In addition to this all-or-nothing caching mechanism, recent research suggests that a proxy server may also implement the video staging mechanism [1, 10, 14, 15], in which it caches only part of the video stream, leaving the rest in the media server. During playback, it combines the packets retrieved from the local disk and from the remote server and forwards them to the client. Finally, when the client wants to terminate the operation, it requests a disconnection from the media server and the proxy server can release the resources allocated for this connection.

When there are concurrent connections, the actions described above impose different resource constraints on the proxy server. For example, the total incoming and outgoing data rates of the server limit the incoming and outgoing bandwidths of these concurrent connections, while the available memory space on the server limits the buffer size. Moreover, when the proxy server considers caching a video in its disk cache, its storage occupies disk space and consumes disk bandwidth. In order to increase the throughput of a proxy server, the usage of these resources has to be scheduled. Since different resources have different favorable data retrieval patterns, a proxy server should take this into account during scheduling.
Besides, a better schedule can usually be obtained by trading off usage between different resources. For example, the incoming bandwidth requirement can be reduced by increasing the memory buffer size. Therefore, in order to optimize the throughput, a proxy server can schedule its connections so that their resource usage is balanced to allow a higher number of sustainable concurrent connections. We refer to such a compromise of resource usage as "resource interchange," which we believe to be an important requirement for any video streaming proxy server to achieve optimal performance.
To achieve resource interchange, Rexford et al. proposed a smoothing algorithm [12, 13] for proxy servers that allows interchange between memory usage and network bandwidth. With this algorithm, a proxy server can utilize its memory and network much more effectively and thereby support a greater number of clients concurrently. However, the network is the only transportation medium considered by Rexford's algorithm. It does not consider bringing in the local storage of the proxy server to smooth the backbone network bandwidth. As mentioned above, the video staging technique [15] separates a video stream into two parts. One part is obtained from the media server while the other is obtained directly from the local storage of the proxy server, saving network bandwidth. It is desirable to incorporate the video staging technique into Rexford's algorithm so that a better video smoothing mechanism with a staging feature can be developed [3]. In this paper, we first analyze the relationships between the usage of different resources, including network bandwidth, disk bandwidth and memory, in traditional video streaming proxy systems. Based on Rexford's algorithm and the video staging technique, we propose a new algorithm that provides a mechanism to trade the usage of different resources against each other in order to achieve maximum throughput. We compare its performance with the traditional video staging algorithm and find that our algorithm improves the utilization of disk bandwidth dramatically while maintaining the same network bandwidth reservation. This paper is organized as follows. In Section 2, a brief introduction to the smoothing of the video delivery schedule in a proxy server is given. In Section 3, we analyze the properties of Rexford's video smoothing schedule and the video staging mechanism. In Section 4, we propose the new video staging algorithm. In Section 5, experimental results and a comparison with the traditional algorithm are given. Finally, we draw our conclusion in Section 6; all mathematical proofs are given in the appendices.
2 Quantifying video streaming

As shown by Rexford [12, 13], the data transfer of a video streaming service with a proxy server can be illustrated by a graph plotting its cumulative transferred data against time. As shown in Fig. 1, line A illustrates the earliest arrival schedule, which gives the maximum possible data accumulated at the proxy sent from the media server. Line S shows the client data deadline schedule, illustrating the least amount of data the client needs to have received in order to provide correct playback.

Fig. 1 Cumulative data transfer of a connection in a proxy server
Between these lines, line F illustrates the actual arrival schedule for this connection in the proxy. The value max{F − S} is the buffer size the proxy server requires to support this connection without buffer overflow. In addition, the incoming bandwidth requirement of the connection is given by max{dF/dt}, while the outgoing bandwidth requirement is max{dS/dt}. We denote the total amount of data delivered at time k as S_k, where k = 0, ..., N, for a video of total length N + 1. Hence the total size of the video is S_N.

Now we consider the case of caching. Under a traditional all-or-nothing caching strategy, both the storage and retrieval bandwidth requirements of the disk are less than or equal to max{dF/dt}. Video staging [15] divides a video stream into two parts, as shown in Fig. 2. The proxy obtains the first part from the media server through the network; it determines the network bandwidth requirement. The other part is obtained from the disk; it determines the disk storage and retrieval bandwidth requirements. Video staging can operate in two modes: CAS (Cut-off After Smoothing) and CBS (Cut-off Before Smoothing). Both rely on a cut-off bandwidth dC/dt to separate the upper and lower streams, where C is an arbitrary value measuring the volume of data acquired from a streaming schedule at a particular time. Usually we have dF/dt > dC/dt > dS/dt. For example, if at a certain time we have dF/dt = 100 Kbps and dC/dt = 25 Kbps, then the upper part has a rate of 75 Kbps and forms the upper stream, while the lower part has a rate of 25 Kbps and forms the lower stream. The upper part is retrieved from the disk while the lower part has to be obtained from the network. As the video in CAS mode is smoothed before being cut off, the incoming bandwidth requirement and the disk retrieval bandwidth requirement are dC/dt and max{d(F′ − C)/dt}, where F′ represents the smoothed version of schedule F. As for CBS, the video is cut off before smoothing, so the incoming bandwidth requirement becomes dC/dt and the retrieval bandwidth requirement becomes max{d(F − C)′/dt}, where (F − C)′ represents the smoothed version of schedule F − C. The staging mechanism essentially divides a variable-bit-rate video stream into a bounded-bit-rate video stream and another, smaller variable-bit-rate video stream. As the network is responsible for the transfer of the bounded-bit-rate stream only, this relieves the complexity of its bandwidth allocation. The local disk of the proxy server handles the rest, i.e., the unsmoothed stream.

From an operational point of view, a video staging schedule has to accept a video delivery schedule as its parameter and propose an arrival schedule. This arrival schedule consists of two parts: the proxy obtains the first part from the media server through the network and the other part from its local disk.
Fig. 2 Cut-off arrangement with video staging
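As an illustration of the CBS-style cut-off just described, the short sketch below splits a per-second delivery increment into a lower (network) stream bounded by the cut-off rate and an upper (disk) stream carrying the excess. The function name and the toy numbers are ours, not the authors'; this is a minimal sketch of the split, not the staging implementation used later in the paper.

```python
def split_by_cutoff(frame_rates_kbps, cutoff_kbps):
    """Split per-second bit rates into a lower (network) stream capped at the
    cut-off rate and an upper (disk) stream carrying the remainder.

    This mirrors the cut-off idea: the bounded-bit-rate part travels over the
    network, while the excess is served from the proxy's local disk.
    """
    lower = [min(r, cutoff_kbps) for r in frame_rates_kbps]   # network stream
    upper = [r - l for r, l in zip(frame_rates_kbps, lower)]  # disk stream
    return lower, upper

# Example from the text: an instant where dF/dt = 100 Kbps and dC/dt = 25 Kbps
# yields a 25 Kbps lower stream and a 75 Kbps upper stream.
lower, upper = split_by_cutoff([100, 40, 10], cutoff_kbps=25)
print(lower)  # [25, 25, 10]
print(upper)  # [75, 15, 0]
```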
Therefore, we can break this proposed arrival schedule down into a disk schedule and a network schedule. The design of a video staging schedule is therefore the design of these two schedules.

2.1 Smoothing algorithm

Rexford et al. have suggested a smoothing mechanism to obtain a schedule for a variable-bit-rate video stream in a proxy server [13]. We apply this algorithm and refer to it as algorithm Φ in the following. It works under a limited incoming bandwidth r and a delivery schedule S = {S_k : k = 0, ..., N} and generates a proposed arrival schedule f = {f_k : k = 0, ..., N} for the media server to deliver the video stream to the proxy. We state the algorithm with a flowchart as follows:
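The flowchart reduces to a backward recursion from f_N = S_N: at each step the proxy fetches only what the remaining schedule forces it to fetch under the rate limit r. The sketch below is our paraphrase of that recursion (consistent with the induction used in the proof of Lemma 3-1 in Appendix A); it is a minimal illustration rather than the authors' implementation, and the buffer computation at the end assumes zero start-up delay (w = 0).

```python
def smooth_phi(S, r):
    """Backward smoothing: given a cumulative delivery schedule S (S[0..N]) and
    an incoming bandwidth r (bytes per slot), return a cumulative arrival
    schedule f with f[k] >= S[k] and f[k] - f[k-1] <= r.

    Recursion (cf. Lemma 3-1): f[N] = S[N]; going backwards,
    f[k-1] = S[k-1] if f[k] - S[k-1] <= r, otherwise f[k-1] = f[k] - r.
    """
    N = len(S) - 1
    f = [0] * (N + 1)
    f[N] = S[N]
    for k in range(N, 0, -1):
        f[k - 1] = S[k - 1] if f[k] - S[k - 1] <= r else f[k] - r
    return f

def proxy_buffer_requirement(f, S):
    """b_p = max_k { f_k - S_k }, the proxy buffer needed by schedule f."""
    return max(fk - sk for fk, sk in zip(f, S))

# Toy cumulative schedule (bytes delivered by the end of each slot).
S = [0, 10, 30, 90, 100, 180, 200]
f = smooth_phi(S, r=50)
print(f, proxy_buffer_requirement(f, S))
```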
Without loss of generality, both schedules f and S are indexed in units of 1 s. Therefore, the difference between two consecutive points in a schedule, such as f_k − f_{k−1}, gives the bandwidth requirement of schedule f during the time interval [k−1, k). Applying algorithm Φ to a delivery schedule S as shown in Fig. 3 generates a proposed arrival schedule f under an incoming bandwidth constraint r and delay w. The buffer requirement this schedule imposes on the proxy is given by b_p = max{f_k − S_k : k = 0, ..., N}. Occasionally, a large delay is required to prevent any part of f from going below S, which would imply starvation in the buffer of the proxy server. In this case, the buffer requirement of the proxy has to be obtained as b_p = max{f_k − S_{k+w} : k = 0, ..., N}. Algorithm Φ provides a link between the schedule, the incoming bandwidth constraint, the delay and the buffer requirement. The relationship between them gives the upper limits of the incoming bandwidth and buffer requirements for each connection. A proxy server can adjust the usage of these two resources to achieve the maximum number of sustainable clients. However, the network is the only transportation medium considered by Φ.
Fig. 3 A smoothed arrival schedule
When we take video staging into consideration, disk access becomes another transportation medium. The timing of disk accesses, the amount of buffer required for temporary storage, and the relationship with the incoming network bandwidth should be investigated further. In the following sections, we construct an algorithm based on algorithm Φ that incorporates disk access into its scheduling considerations. It maximizes the throughput of the proxy server by optimally scheduling the usage of different resources when video staging is applied.
3 Video staging schedule

3.1 Properties of the smoothing algorithm

Before we describe the new algorithm, let us illustrate some properties of the smoothing algorithm Φ. We first consider the relationship between the proxy buffer size and the allowable incoming bandwidth. We shall show below that the proxy buffer requirement decreases monotonically as the allowable incoming bandwidth increases. This is reasonable, since the purpose of the buffer is to smooth out abrupt increases in the bandwidth demand of the video stream. As every video stream has its own maximum bandwidth, less buffer is required as the allowable incoming bandwidth increases towards that maximum value. This observation is proved as follows. Let f = Φ(S, r) and f′ = Φ(S, r′); the statement we have to prove becomes max{f − S} ≥ max{f′ − S} when r′ ≥ r. To prove this, we first show that f_k can be written as S_k + c_k.

Lemma 3-1 If f = Φ(S, r), where Φ() is the smoothing algorithm mentioned in [13], S = {S_k : k = 0, ..., N} and f = {f_k : k = 0, ..., N} are the delivery and arrival schedules, respectively, and r is the incoming bandwidth, then f_k can be written as S_k + c_k, where c_k is a constant and c_k ≥ 0 for k = 0, ..., N. (See Appendix A for proof)

Based on Lemma 3-1, we can prove the abovementioned relationship between the incoming bandwidth and the proxy buffer size under Φ as follows:

Theorem 3-2 Assume Φ() is the smoothing algorithm mentioned in [13] and S is the required delivery schedule. If there are two arrival schedules f = Φ(S, r) and f′ = Φ(S, r′) with two different incoming bandwidths such that r′ > r, then max{f_k − S_k} ≥ max{f′_k − S_k}. (See Appendix A for proof)
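As a quick numerical illustration of Theorem 3-2 (our own sanity check, not part of the original analysis), the following self-contained sketch re-implements the backward recursion of Φ and verifies that the buffer requirement does not increase when the incoming bandwidth grows.

```python
def smooth(S, r):
    # Backward recursion of the smoothing algorithm (cf. Lemma 3-1):
    # f[N] = S[N]; f[k-1] = S[k-1] if f[k] - S[k-1] <= r, else f[k] - r.
    f = list(S)
    for k in range(len(S) - 1, 0, -1):
        f[k - 1] = S[k - 1] if f[k] - S[k - 1] <= r else f[k] - r
    return f

def buffer_req(S, r):
    f = smooth(S, r)
    return max(fk - sk for fk, sk in zip(f, S))

S = [0, 10, 30, 90, 100, 180, 260, 300]   # toy cumulative delivery schedule
reqs = [buffer_req(S, r) for r in (40, 50, 60, 80, 100)]
print(reqs)
assert all(a >= b for a, b in zip(reqs, reqs[1:]))  # monotone non-increasing
```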
The second property of Φ is that it always minimizes the maximum required buffer size of the proxy server.

Theorem 3-3 Assume Φ() is the smoothing algorithm mentioned in [13] and S is the required delivery schedule. If there is an arrival schedule f = Φ(S, r) and any arbitrary feasible arrival schedule f′ with the same incoming bandwidth r, then we shall have max{f_k − S_k} ≤ max{f′_k − S_k} for all k = 0, ..., N. (See Appendix A for proof)

From the properties above, we conclude that algorithm Φ gives the optimal schedule in terms of the proxy buffer requirement. It yields a smaller buffer requirement when a higher incoming bandwidth is allowed.

3.2 Properties of staging and resource usage scheduling

For a proxy server, the staging mechanism divides the source of the incoming video data flow between the network and the hard disk. Their usage, however, exhibits distinct preferences. It is generally understood that the effective bandwidth supported by most wide area networks is not yet comparable with disk bandwidth. Hence it is impractical to expect a huge amount of data to be retrieved in every network access. However, as most network interfaces allow multiplexing of different network services, a media server can serve several clients at a time at low switching cost. This means that we may assume the proxy server can receive data from the media server rather frequently. The scenario for disk accesses is completely different. While the seek time of a disk rules out using it as a multiplexing device on this time scale, it is also impractical for a disk to provide a long period of service to only one client while ignoring the others. Based on these observations, we conclude that data retrieval from the network should be frequent but small-scale, while retrieval from the disk should be short-term but large-scale.

As mentioned, the lower part of a video staging schedule accounts for the data retrieved from the network and behaves like a constant-bit-rate schedule. This behavior is favorable to the data retrieval pattern of network resources; mechanisms such as IntServ [8] can easily fulfill such a constant-bit-rate network data requirement. The upper part of the schedule accounts for the data retrieved from the local disk. This part of the schedule is responsible for most of the bit-rate variation of the arrival schedule. Although we can reserve disk bandwidth in the same way as network bandwidth, this would introduce many small-scale disk accesses and lower the throughput. The method for constructing a schedule that requires fewer but larger-scale disk accesses is covered in the next section.

4 Video staging algorithm Γ

From algorithm Φ, we know that we can compensate for insufficient incoming bandwidth by means of a larger memory buffer that retrieves data in advance. On the other hand, from the staging mechanism we know that we can separate a video stream into two parts: one obtained from the network and the other from the disk. We now construct our own video staging algorithm Γ. In this algorithm, we take the incoming bandwidth constraint, memory buffer usage and disk access into consideration so that interchange between them is allowed for throughput optimization. Similar to Φ, algorithm Γ is also built under an incoming bandwidth constraint r. As shown in Fig. 3, the minimum buffer requirement of a proxy server working under the
smoothing algorithm Φ is the maximum vertical distance between the proposed arrival schedule obtained by Φ and the delivery schedule. Moreover, we recognize that the separation of these two schedules stems from the failure to satisfy the condition f_k − S_{k−1} ≤ r, which implies that the incoming network bandwidth is unable to cover the consumption required by the delivery schedule. However, we can compensate for this by having a disk access of data volume b_k = f_k − S_{k−1} − r. Both of the staging algorithms, CAS and CBS, introduce continuous disk accesses, which is undesirable for multiple-client services. In this case, we would have to multiplex disk services in much the same way as network services in order to retrieve the data required by different clients. Since the operating cost of multiplexing disk accesses is much higher, the throughput of a disk would drop dramatically. Worse still, this would increase the wearing rate of the disk and reduce its operational lifetime. Hence, in addition to the constraints imposed on algorithm Φ, we should also prevent algorithm Γ from generating continuous disk accesses.

During the development of the proposed algorithm, we have made the following assumptions without loss of generality. We assume that the employed memory buffer b is large enough to hold a substantial amount of data from the network at bandwidth r; this implies that the memory buffer is able to hold a substantially long period of network data. Besides, we assume that the maximum variation in the delivery schedule max{dS/dt} is not too much larger than r, which is true for general video titles. Under a limited incoming bandwidth r, a maximum proxy buffer size b, a delivery schedule S = {S_k : k = 0, ..., N} and the suggested arrival schedule g = {g_k : k = 0, ..., N}, we state the proposed algorithm Γ, with g = Γ(S, r, b), as follows:
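The flowchart for Γ (Flowchart 2) extends the backward recursion of Φ with the disk-access rule described in the next paragraph: whenever the gap between g and S would exceed b + r, the gap is closed and a disk access is recorded. The following is our paraphrase of that rule under the paper's assumptions; the function name and return convention are ours, not the authors'.

```python
def staging_gamma(S, r, b):
    """Backward construction of a staging arrival schedule g = Gamma(S, r, b).

    Like Phi, g is built from g[N] = S[N] backwards under the rate limit r,
    but whenever the accumulated gap would exceed b + r the gap is closed to
    zero and a disk access is recorded at that point (the disk supplies the
    portion of the jump that the network rate r cannot cover in that slot).
    Returns the schedule g and the list of time indices with disk accesses.
    """
    N = len(S) - 1
    g = [0] * (N + 1)
    g[N] = S[N]
    disk_accesses = []
    for k in range(N, 0, -1):
        if g[k] - S[k - 1] <= r:
            g[k - 1] = S[k - 1]           # network alone covers the step
        elif g[k] - S[k - 1] <= r + b:
            g[k - 1] = g[k] - r           # prefetch at the full network rate
        else:
            g[k - 1] = S[k - 1]           # close the gap: disk access here
            disk_accesses.append(k - 1)
    return g, disk_accesses

# Usage: r = 30 bytes/slot, buffer b = 60 bytes.
S = [0, 10, 30, 90, 100, 180, 260, 300]
g, accesses = staging_gamma(S, r=30, b=60)
print(g, accesses)
```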
Algorithm Γ is similar to algorithm Φ, except for blocks 6 and 7, which specify that whenever the vertical distance between the proposed arrival schedule g and the delivery schedule S accumulates to a volume exceeding the threshold b + r, the algorithm creates a rapid drop to close up the gap. However, if we traverse Fig. 4 in the reverse direction, from S_0 to S_N, the rapid drop is actually a rapid rise. We consider this rapid rise in accumulated data to be the result of a disk access. As shown in Fig. 4, the rise lifts up the schedule in the period [m, n) to avoid starvation during this period, even though the proxy relies on the network alone to obtain data for the rest of the schedule.
The volume of data retrieved in a disk access is in fact an upper bound on the buffer requirement of the proxy server. Firstly, it is interesting to note that although we define in Flowchart 2 the condition for having a disk access as g_k > S_{k−1} + r + b, in practice g_k can only be slightly greater than S_{k−1} + r + b when the disk access occurs. This is because the vertical distance between g and S accumulates at the rate S_k − S_{k−1} − (g_k − g_{k−1}), whose upper bound is max{dS/dt} − r, which is much smaller than b under the assumptions stated above. Hence, the algorithm does not allow g_k to go much greater than S_{k−1} + r + b before introducing a disk access. In fact, it is safe to rewrite the condition for a disk access as g_k ≅ S_{k−1} + r + b.

Now, suppose schedules g and S have the same value at time k, that is, g_k = S_k. According to algorithm Γ, the value of g_{k+1} is constrained by either g_{k+1} ≤ S_k + r or g_{k+1} ≅ S_k + r + b, depending on whether a disk access is needed at time k. We define the buffer requirement at time k+1 as g_{k+1} − S_{k+1}. For the case without a disk access, the buffer requirement is bounded by r − (S_{k+1} − S_k), which is much smaller than b (note that S_{k+1} − S_k < r in this case). In addition, for the case g_k = g_{k+1} − r, the value of g_{k+1} is constrained by g_{k+1} − r − S_k < b and g_{k+1} − S_k > r. After rearrangement, we have the inequality S_k + r < g_{k+1} < b + r + S_k, and after subtracting S_{k+1} from each term we arrive at r − (S_{k+1} − S_k) < g_{k+1} − S_{k+1} < b + r − (S_{k+1} − S_k). For the case with a disk access, g_{k+1} − S_{k+1} ≅ b + r − (S_{k+1} − S_k). As S_{k+1} − S_k > r in this case, it is safe to approximate the upper bound of g_{k+1} − S_{k+1} by b. Note that between schedules g and S there should never be a gap larger than b; otherwise algorithm Γ introduces a disk access to close the gap to zero. Hence, we have shown that the buffer requirement of the proxy server with algorithm Γ is bounded by b.

4.1 Minimum buffer size

We can adjust the volume of these disk accesses through the variable b, as mentioned above. Whenever a disk access is required, the access retrieves data of size b so that the data remaining in the buffer, with replenishment from the network, can cover the consumption indicated by the delivery schedule until the next disk access. Suppose the first disk access appears at time m and the next at time n, and g is never equal to the delivery schedule in that period; the relationship between these two disk accesses is then S_m + b + r(n − m) = S_n. To avoid starvation between the two disk accesses, the choice of b cannot be arbitrary. We must satisfy the necessary and sufficient condition b ≥ max{dS/dt} − r.
Fig. 4 Arrival schedule generated by algorithm Γ
Theorem 4-1 Given that Γ() is the video staging algorithm proposed in Section 4 and S is the required delivery schedule, it is a necessary and sufficient condition for the arrival schedule g = Γ(S, r, b) to be feasible that b ≥ max{dS/dt} − r, where r is the incoming bandwidth and b is the buffer size. (See Appendix B for proof)

Theoretically, disk accesses are allowed at any time during the streaming of the video data; however, this may result in continuous disk accesses that reduce the disk throughput, as mentioned above. If disk accesses are allowed at a frequency of 1/T, where T is an arbitrary constant, then the following condition has to be met in order to guarantee a feasible arrival schedule:

b ≥ max{S_{i+T} − S_i} − rT

That is, the buffer size has to be larger than the maximum amount of data consumed in any period of length T minus the volume of data contributed by the network during that period. Once the necessary buffer size is satisfied, and unlike algorithm Φ, it is independent of the incoming bandwidth constraint: regardless of changes in this rate, the buffer requirement remains less than or equal to b.

4.2 Disk access

Although the buffer size remains unchanged even if a lower incoming bandwidth is used, the proposed arrival schedule will introduce more disk accesses to make up for the resulting data insufficiency.

Theorem 4-2 For two different incoming bandwidths r′ > r, if a and a′ denote the corresponding numbers of disk accesses required when applying algorithm Γ with delivery schedule S and proxy buffer size b, then a > a′. (See Appendix B for proof)

Similarly, a change in the proxy buffer size also affects the number of disk accesses required.

Fig. 5 Relationship between incoming bandwidth, proxy buffer size and disk access for video staging algorithm Γ
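To make the conditions of Section 4.1 concrete, the short sketch below computes the minimum buffer size implied by b ≥ max_i{S_{i+T} − S_i} − rT for a given disk-access period T (with T = 1 reducing to b ≥ max{dS/dt} − r). The function name and the toy schedule are ours, for illustration only.

```python
def min_buffer_size(S, r, T=1):
    """Smallest b satisfying b >= max_i { S[i+T] - S[i] } - r*T for the
    cumulative delivery schedule S, incoming bandwidth r and disk-access
    period T (Section 4.1). T = 1 gives b >= max{dS/dt} - r."""
    worst_window = max(S[i + T] - S[i] for i in range(len(S) - T))
    return max(0, worst_window - r * T)

S = [0, 10, 30, 90, 100, 180, 260, 300]   # toy cumulative schedule
print(min_buffer_size(S, r=30))           # per-slot condition: 80 - 30 = 50
print(min_buffer_size(S, r=30, T=2))      # 2-slot windows: 160 - 60 = 100
```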
Theorem 4-3 Suppose there are two different proxies applying algorithm Γ with the same delivery schedule S and the same incoming bandwidth constraint r. If their buffer sizes are b and b′ such that b′ > b, and the corresponding numbers of disk accesses are a and a′ respectively, then a > a′. (See Appendix B for proof)

Hence, we can summarize the behavior of algorithm Γ in terms of the disk access requirement against the incoming bandwidth and the proxy buffer size as follows: when the incoming bandwidth or the proxy buffer size increases, the required number of disk accesses decreases, and vice versa. Figure 5 illustrates the relationship between the number of disk accesses, the incoming bandwidth and the memory buffer size. A darker color indicates a larger number of disk accesses.

4.3 Comparison with the traditional algorithm

Suppose we have a schedule g = Γ(S, r, b). We know from the algorithm that g_{i+1} − g_i < r if and only if S_{i+1} − S_i < r and g_{i+1} = S_{i+1} at a particular time i. If this condition is satisfied, we have g_i = S_i as well, and therefore g_{i+1} − g_i = S_{i+1} − S_i. Conversely, if S_{i+1} − S_i ≥ r or g_{i+1} ≠ S_{i+1}, then g_{i+1} − g_i = r, which implies full utilization of the network bandwidth. Now, consider the utilization of the incoming network bandwidth by the traditional video staging algorithm CBS. The cut-off bandwidth dC/dt in this case should be equal to r. When S_{i+1} − S_i ≥ r, the utilized incoming network bandwidth is at its maximum, which is equal to r. When S_{i+1} − S_i < r, the utilized incoming network bandwidth is S_{i+1} − S_i, which is less than r. Consider again algorithm Γ. When S_{i+1} − S_i < r, it is not necessary that g_{i+1} − g_i = S_{i+1} − S_i < r as well, since the condition g_{i+1} = S_{i+1} may not be satisfied. This means that at the times when the traditional CBS algorithm does not fully utilize the network bandwidth, Γ may still fully utilize the network bandwidth for data retrieval. As the traditional CBS algorithm does not fully utilize the network bandwidth, it has to supplement what is left over from the disk, leading to more disk accesses than algorithm Γ. Similarly, it can be shown that algorithm Γ has the same, if not better, network bandwidth utilization rate compared with the traditional algorithm CAS, which again leads to a lower disk access requirement.

Fig. 6 Relationship between incoming bandwidth, proxy buffer size and disk access for the traditional video staging algorithm
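The following toy simulation (our own, with an illustrative demand sequence and a simple forward greedy stand-in for Γ's behaviour) contrasts the two policies sketched above: CBS pulls at most min(S_{i+1} − S_i, r) from the network each slot, whereas a Γ-style schedule keeps the link at rate r whenever the prefetch buffer b still has room, and therefore leaves less for the disk to supply.

```python
def cbs_network(deltas, r):
    """CBS with cut-off rate r: the network carries only the portion of each
    slot's demand below r; the excess must come from the disk."""
    return sum(min(d, r) for d in deltas)

def gamma_like_network(deltas, r, b):
    """Greedy stand-in for Gamma: keep pulling at rate r while the prefetch
    buffer (at most b bytes ahead of the playback point) has room; whatever
    the network cannot cover in time is supplied by the disk."""
    total = sum(deltas)
    ahead, pulled = 0, 0               # network data buffered ahead of playback
    for d in deltas:
        x = max(0, min(r, b + d - ahead, total - pulled))
        pulled += x
        ahead = max(ahead + x - d, 0)  # any deficit is covered by the disk
    return pulled

deltas = [10, 20, 60, 10, 80, 80, 40]  # per-slot demand (bytes)
total = sum(deltas)
for r in (30, 50):
    net_cbs = cbs_network(deltas, r)
    net_gam = gamma_like_network(deltas, r, b=60)
    print(r, total - net_cbs, total - net_gam)  # bytes the disk must supply
```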
To summarize, algorithm Γ is built on algorithm Φ, so it also gives an optimal arrival schedule under a constrained proxy buffer size and incoming network bandwidth. These constraints are favorable to the scheduling of memory and network usage. Besides, algorithm Γ always gives better network bandwidth utilization than the traditional video staging algorithms. As it retrieves more data from the network, it can offload the disk by reducing the need for data accesses. This gain in network utilization over the traditional algorithms can be extremely high when the incoming network bandwidth is large, since the bandwidth wasted by the traditional algorithms is then also larger. It is interesting to note that the traditional algorithms do not benefit from additional memory at their disposal, since the separation of the upper and lower streams relies solely on the cut-off bandwidth. For the traditional algorithms, there is no relation between the memory buffer size and the incoming network bandwidth or the disk access. We illustrate this in Fig. 6.

5 Experimental results

Both the original video staging mechanism and the proposed video staging mechanism require disk accesses and network transfers to achieve information retrieval.
Fig. 7 Data delivery schedule of video title "Star Wars" (cumulative data volume in MB against frame number)
Fig. 8 Disk access generated by the traditional video staging algorithm (740 Kbps, 925 KB): data volume per frame, frames 100,000–105,000
In order to ensure the quality of transfer, related resources such as network bandwidth, disk bandwidth and memory are usually reserved to guarantee the service. However, such reservations also limit the number of clients supportable by the proxy server, as no reservation can exceed the capacity of these resources. Therefore, the ability to achieve better utilization of reserved resources becomes a figure of merit for a video staging algorithm.

Figure 7 shows the data delivery schedule of the video title "Star Wars." It is recorded at 30 fps with an average bandwidth of 740 Kbps. Figure 8 shows the proposed arrival schedule over the disk obtained by the traditional video staging algorithm. This arrival schedule requires small-volume continuous disk accesses and is therefore unfavorable for practical implementation. However, such dispersed disk accesses can be grouped together to form a few larger disk accesses. The volume is limited by the memory buffer constraint, and these accesses should be initiated before starvation of the memory buffer. Therefore, for a fair comparison with the proposed algorithm, we modify the traditional video staging algorithm according to these constraints to group the continuous disk accesses into a few separate ones. For the rest of this paper, the traditional video staging arrival schedule refers to this modified one and the gamma video staging arrival schedule refers to the one generated by the proposed algorithm Γ. Figure 9 shows the traditional and the proposed video staging arrival schedules for this video within an arbitrarily chosen period (frames 100,000–105,000) with the network bandwidth and memory buffer set at 740 Kbps and 925 KB (equivalent to 10 s of storage), respectively.
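One plausible reading of the grouping just described (our own sketch, not the authors' exact modification) is to accumulate the per-frame disk demands of the traditional schedule into batches of at most b bytes, issuing each batch no later than the frame at which the previously fetched disk data would run out.

```python
def group_disk_accesses(disk_demand, b):
    """Group per-frame disk demands into batched accesses of at most b bytes,
    issuing each batch just before the buffered disk data runs out.

    Returns a list of (frame_index, batch_size) pairs. Assumes no single
    frame's disk demand exceeds b. Only a sketch of the grouping constraint
    described in the text, not the exact modification used in the experiments.
    """
    batches, held = [], 0                 # 'held' = disk data still buffered
    for i, d in enumerate(disk_demand):
        if held < d:                      # would starve at frame i: fetch now
            # prefetch as much upcoming disk demand as one b-sized access allows
            need, j = 0, i
            while j < len(disk_demand) and need + disk_demand[j] <= held + b:
                need += disk_demand[j]
                j += 1
            batch = need - held           # top the buffer up to 'need' bytes
            batches.append((i, batch))
            held += batch
        held -= d                         # frame i consumes its disk share
    return batches

# Per-frame disk demand (bytes) of a hypothetical traditional schedule.
print(group_disk_accesses([0, 5, 5, 0, 20, 10, 0, 30, 5], b=40))
```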
Fig. 9 Video staging arrival schedules of the proposed and traditional algorithms (740 Kbps, 925 KB)
Fig. 10 Video staging arrival schedules of the proposed and traditional algorithms (370 Kbps, 925 KB)
Fig. 11 Video staging arrival schedules of the proposed and traditional algorithms (1,110 Kbps, 925 KB)
Fig. 12 Video staging arrival schedules of the proposed and traditional algorithms (740 Kbps, 462 KB)
Fig. 13 Video staging arrival schedules of the proposed and traditional algorithms (740 Kbps, 1,388 KB)
Fig. 14 Network retrieval comparison
The topmost graph in Fig. 9 shows the cumulative volume of data obtained from the network and the disk under the video staging arrival schedules generated by the proposed and traditional algorithms. The lower two graphs show the exact amount of data obtained for each frame, plotted on a logarithmic scale. In Fig. 9, the black lines and the dashed lines represent the results for the gamma (algorithm Γ) and the traditional video staging algorithms, respectively. The grey lines represent the original required amount of data as specified in the delivery schedule. The impulses in the middle and bottom graphs indicate the times at which there are disk accesses, while the rest indicates network retrievals. The middle graph shows that the network part of the proposed arrival schedule of Γ maintains its network access at 740 Kbps most of the time, while the traditional network arrival schedule drops below 740 Kbps very often during the testing period. In other words, the arrival schedule of Γ utilizes the network better than the traditional one. Consequently, the traditional arrival schedule has to compensate for the loss from the network by making more disk accesses. As shown in the graph, the arrival schedule of Γ has to make only two disk accesses compared with five for the traditional arrival schedule. When we cut the network bandwidth by half to 370 Kbps, as shown in Fig. 10, both the traditional and Γ arrival schedules utilize the network bandwidth efficiently and therefore their disk access requirements are the same during the period. However, if we increase the network bandwidth to 1,110 Kbps, as shown in Fig. 11, the arrival schedule of Γ again gives better network bandwidth utilization. While the Γ arrival schedule does not require any disk access within the period, the traditional arrival schedule still requires three disk accesses.
Fig. 15 Disk storage comparison (logarithmic)
In Figs. 12 and 13, the effect of the memory buffer size on the traditional and Γ arrival schedules is shown. When the memory buffer size increases, the number of disk accesses decreases. This is natural: as the memory buffer size increases, there is more room to hold the data retrieved from the disk, so the volume of each disk access can be increased and hence the frequency of disk accesses can be reduced.

Figure 14 compares the volume of data retrieved from the network for the complete video title using Γ and the traditional algorithm under different network bandwidth and memory buffer constraints. We observe that, given the same amount of incoming network bandwidth, algorithm Γ retrieves a data volume closer to the network bandwidth bound than the traditional algorithm. This implies that algorithm Γ utilizes its network bandwidth better. In addition, as the network bandwidth increases, the amount of data retrieved by algorithm Γ increases much faster than that retrieved by the traditional one, as revealed in the analysis above. Figure 15 shows the amount of data stored on the disk for both algorithms on a logarithmic scale. As the traditional algorithm does not fully utilize the reserved network bandwidth, the disk has to supplement the difference accordingly. As a result, the storage requirement of the traditional algorithm is higher than that of Γ for the same incoming bandwidth and memory buffer size. Figure 16 confirms the analysis of Fig. 5: the disk access requirement is directly proportional to the disk storage requirement. Figure 16 also shows that the disk access requirement of Γ is lower than that of the traditional algorithm, particularly when the bandwidth is large. This is because the network utilization of the traditional video staging algorithm decreases as the bandwidth increases; to compensate for this under-utilization, the disk storage requirement has to increase, which leads to a relatively higher disk access requirement than algorithm Γ.
Fig. 16 Disk access comparison (logarithmic)
6 Conclusion

Video staging provides an effective method to relax the bandwidth requirement between the video server and the proxy server. It separates a video stream into two parts: one is obtained from the video server and the other is obtained from the local disk cache. Owing to the temporal locality of video streams, network bandwidth can be saved. However, the traditional video staging algorithm does not optimize the throughput of a proxy server, since network bandwidth is not the only resource required to support a connection; disk bandwidth and memory space are also essential. The maximum number of connections that a system can support is the minimum of the maximum numbers supportable by the disk, the network and the memory alone. Therefore, if a system runs out of any one of these resources, it cannot support any further connections, even if it has the others in abundance. Hence, we should pursue the optimal usage of all three kinds of resources in order to optimize the system throughput. In addition, we need to attain a pattern of schedule suited to each resource so that when the resource is shared by different connections, its throughput can still be optimized. In our analysis of algorithm Φ, we have shown that Φ gives an optimal arrival schedule with respect to memory usage. As for the video staging algorithm Γ, it provides us with a way to interchange network bandwidth and memory usage within the proxy server. We have performed a series of simulations to verify the performance of algorithm Γ. We have shown that it provides the flexibility to increase memory usage in order to create a data retrieval pattern more suitable for multiple connections. We have also shown that the proposed algorithm allows us to increase the number of disk accesses in order to maintain a fixed buffer size when the incoming bandwidth decreases. The proposed algorithm also gives better performance in utilizing reserved resources. These are the benefits obtained from the interchange of resource usage.

Acknowledgement This work is supported by a grant provided by The Hong Kong Polytechnic University.
Appendix A

Lemma 3-1 If f = Φ(S, r), where Φ() is the smoothing algorithm mentioned in [13], S = {S_k : k = 0, ..., N} and f = {f_k : k = 0, ..., N} are the delivery and arrival schedules, respectively, and r is the incoming bandwidth, then f_k can be written as S_k + c_k, where c_k is a constant and c_k ≥ 0 for k = 0, ..., N.

Proof Assume that there exists an integer k = 0, ..., N such that f_k = S_k + c_k with c_k ≥ 0. If f_k − S_{k−1} ≤ r, then f_{k−1} = S_{k−1}. Otherwise f_k − S_{k−1} > r, so S_k + c_k − S_{k−1} > r, and then

f_{k−1} = f_k − r
⇒ f_{k−1} > S_{k−1} + r − r
⇒ f_{k−1} > S_{k−1}

Therefore, we have f_{k−1} = S_{k−1} + c_{k−1} with c_{k−1} > 0. As f_N = S_N, we have proved the statement by induction.
Theorem 3-2 Assume Φ() is the smoothing algorithm mentioned in [13] and S is the required delivery schedule. If there are two arrival schedules f = Φ(S, r) and f′ = Φ(S, r′) with two different incoming bandwidths such that r′ > r, then max{f_k − S_k} ≥ max{f′_k − S_k}.

Proof First, assume that there exists an integer k = 0, ..., N such that f_k = f′_k.

If f_k − S_{k−1} ≤ r < r′, then f′_{k−1} = S_{k−1}. It also implies f_k − S_{k−1} ≤ r ⇒ f_{k−1} = S_{k−1}. Therefore we have f_{k−1} ≥ f′_{k−1}, so f_{k−1} − S_{k−1} ≥ f′_{k−1} − S_{k−1}.

If r′ ≥ f_k − S_{k−1} > r, then f′_{k−1} = S_{k−1} and, by Lemma 3-1, f_{k−1} = S_{k−1} + c_{k−1}. As c_{k−1} ≥ 0, we have f_{k−1} ≥ f′_{k−1}, so f_{k−1} − S_{k−1} ≥ f′_{k−1} − S_{k−1}.

If f_k − S_{k−1} > r′ > r, then f_{k−1} = f_k − r and f′_{k−1} = f′_k − r′. As we have assumed that f_k = f′_k, and r′ > r, we have

f_k − r > f′_k − r′
⇒ f_{k−1} > f′_{k−1}
⇒ f_{k−1} − S_{k−1} > f′_{k−1} − S_{k−1}

Since we must have f_N = f′_N, by induction we have proved that f_k − S_k ≥ f′_k − S_k for all k, and therefore max{f_k − S_k} ≥ max{f′_k − S_k}.

Theorem 3-3 Assume Φ() is the smoothing algorithm mentioned in [13] and S is the required delivery schedule. If there is an arrival schedule f = Φ(S, r) and any arbitrary feasible arrival schedule f′ with the same incoming bandwidth r, then we shall have max{f_k − S_k} ≤ max{f′_k − S_k} for all k = 0, ..., N.

Proof Suppose we have two feasible arrival schedules f and f′, and suppose there exists an integer k = 0, ..., N such that f′_k < f_k, where k is the point at which the value f_k − S_k = max{f − S} determines the required proxy buffer size of arrival schedule f; therefore max{f − S} = f_k − S_k. When the two arrival schedules are plotted in a diagram such as Fig. 3, the two arrival schedule curves do not cross each other at point k. Suppose the two curves meet m points after point k and the incoming rate constraint is r; then

f_k + mr = S_{k+m}
f′_k + mr < S_{k+m}

This means that f′ goes below S at point k + m and hence cannot be a feasible arrival schedule. Thus no feasible schedule may give f′_k < f_k. As a result, max{f − S} = f_k − S_k gives the minimum of the maximum proxy buffer requirement.
Appendix B

Theorem 4-1 Given that Γ() is the video staging algorithm proposed in Section 4 and S is the required delivery schedule, it is a necessary and sufficient condition for the arrival schedule g = Γ(S, r, b) to be feasible that b ≥ max{dS/dt} − r, where r is the incoming bandwidth and b is the buffer size.

Proof Assume that at some instant m we have

dS_m/dt = max{dS/dt}

Also, assume that b < max{dS/dt} − r,

⇒ b < S_m − S_{m−1} − r

As g_m ≥ S_m,

⇒ g_m − S_{m−1} > b + r

According to algorithm Γ, we then have

⇒ g_{m−1} = S_{m−1}

and we need a disk access at instant m − 1. Since we must then have

S_{m−1} + b + r(m − (m − 1)) ≥ S_m
⇒ S_{m−1} + b + r ≥ S_m
⇒ S_m − S_{m−1} ≤ b + r

this violates our assumption; therefore, it is necessary to have b ≥ max{dS/dt} − r.

Now, assume b ≥ max{dS/dt} − r and take an arbitrary time n such that 0 < n ≤ N,

⇒ b ≥ S_m − S_{m−1} − r    for all m = 1, ..., n

Taking the sum of both sides over all m,

⇒ nb ≥ S_n − nr
⇒ n(b + r) ≥ S_n
This means that when both disk and network accesses are at their full strength during the period [0, n], they are able to support the scheduled delivery at that time. Therefore, as it is possible to allocate disk and network access to their extremes at all times, the condition is sufficient.

Theorem 4-2 For two different incoming bandwidths r′ > r, if a and a′ denote the corresponding numbers of disk accesses required when applying algorithm Γ with delivery schedule S and proxy buffer size b, then a > a′.

Proof Consider the first disk accesses, at times m and m′, of the schedules g and g′ obtained by Γ(S, r) and Γ(S, r′) respectively. Prior to any disk access, algorithm Γ behaves the same as Φ; therefore g_k ≥ g′_k for all such k, and g′ makes its first disk access later than g. Now consider two consecutive disk accesses of g and suppose the second of them occurs at time n. Since r′ > r, according to algorithm Γ the value of g′ at time n − 1 can exceed S_{n−1} by at most b, so g′ can make its second disk access at time n, but it will never have more than two disk accesses during this interval. Therefore, g′ makes its first disk access later than g, and for every two consecutive disk accesses of g there can be at most two disk accesses of g′; we can conclude that the number of disk accesses of g′ is smaller than that of g if their corresponding incoming rate constraints satisfy r′ > r.

Theorem 4-3 Suppose there are two different proxies applying algorithm Γ with the same delivery schedule S and the same incoming bandwidth constraint r. If their buffer sizes are b and b′ such that b′ > b, and the corresponding numbers of disk accesses are a and a′ respectively, then a > a′.

Proof Suppose g and g′ are the corresponding arrival schedules of the two proxies with buffer sizes b and b′. As b′ > b, g will make its first disk access later than g′. When we consider the interval between two consecutive disk accesses of g, if g′ has also made a disk access within the interval, it will only be able to make another at the beginning of the interval; it is not possible to have more than two disk accesses within this interval. Therefore, the number of disk accesses of g′ is smaller than that of g.
References 1. Chang S-H, Chang R-I, Ho J-M, Oyang Y-J (2002) An effective approach to video staging in streaming applications. In: Proceedings of the IEEE Global Telecommunications Conference, 17–21 Nov 2002. GLOBECOM ’02, vol 2, pp 1733–1737 2. Cheuk WK, Hsung TC, Lun DPK (2006) Design and implementation of contractual based real-time scheduler for multimedia streaming proxy server. Multimed Tools Appl 28(1):69–88 3. Cheuk WK, Lun DPK (2004) Video staging in video streaming proxy server. In: Proceedings of the IEEE International Conference on Multimedia & Expo, Taipei, 27–30 June 2004, vol 1, pp 459–462 4. Dogan S, Cellatoglu A, Uyguroglu M, Sadka AH, Kondoz AM (2002) Error-resilient video transcoding for robust internetwork communications using GPRS. IEEE Trans Circuits Syst Video Technol 12 (6):453–464
5. Fahmi H, Latif M, Sedigh-Ali S, Ghafoor A, Liu P, Hsu LH (2001) Proxy servers for scalable interactive video support. Computer 34(9):54–60 6. Goose S, Schneider G, Tanikella R, Mollenhauer H, Menard P, Le Floc'h Y, Pillan P (2002) Toward improving the mobile experience with proxy transcoding and virtual composite devices for a scalable bluetooth LAN access solution. In: Proceedings of the Third International Conference on Mobile Data Management 2002, pp 169–170 7. Gorinsky S, Baruah S, Stoyen A (1997) Boosting the network performance via traffic reshaping. In: Proceedings of the Sixth International Conference on Computer Communications and Networks, 1997, pp 285–290 8. Lombardo A, Schembra G, Morabito G (2001) Traffic specifications for the transmission of stored MPEG video on the internet. IEEE Trans Multimedia 3(1):5–17 9. Ma W-H, Du DHC (2000) Reducing bandwidth requirement for delivering video over wide area networks with proxy server. In: 2000 IEEE International Conference on Multimedia and Expo. ICME 2000, vol 2, pp 991–994 10. Ma W-H, Du DHC (2002) Reducing bandwidth requirement for delivering video over wide area networks with proxy server. IEEE Trans Multimedia 4(4):539–559 11. Mahanti A, Williamson C, Eager D (2000) Traffic analysis of a web proxy caching hierarchy. IEEE Netw 14(3):16–23 12. Rexford J, Sen S, Basso A (1999) A smoothing proxy service for variable-bit-rate streaming video. In: Global Telecommunications Conference, GLOBECOM ’99, vol 3, pp 1823–1829 13. Rexford J, Towsley D (1999) Smoothing variable-bit-rate video in an internetwork. IEEE/ACM Trans Netw 7(2):202–215 14. Sen S, Rexford J, Towsley D (1999) Proxy prefix caching for multimedia streams. In: Proceedings of the Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM '99, vol 3, pp 1310–1319 15. Zhang Z-L, Wang Y, Du DHC, Su D (2000) Video staging: a proxy-server-based approach to end-to-end video delivery over wide-area networks. IEEE/ACM Trans Netw 8(4):429–442
Wai-Kong Cheuk received his B.Eng. (Hons.), M. Phil. and Ph.D. degrees in 1996, 2001 and 2005, respectively, from the Hong Kong Polytechnic University. His main research interests include distributed operating systems and video streaming.
Daniel Pak-Kong Lun (M'91) received his B.Sc. (Hons.) degree from the University of Essex, England, and Ph.D. degree from the Hong Kong Polytechnic University (formerly called Hong Kong Polytechnic) in 1988 and 1991, respectively. He is now an Associate Professor in the Department of Electronic and Information Engineering at the Hong Kong Polytechnic University. His research interests include digital signal processing, wavelets, and multimedia technology. He is active in professional activities. He was the Chairman of the IEEE Hong Kong Chapter of Signal Processing in 1999–00. He was the Finance Chair of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing and the General Chair of the 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing. He is a Chartered Engineer, a corporate member of IET and HKIE, and a member of IEEE.