Master’s Thesis
Title
Dynamic Resource Management Scheme for TCP Connections at Internet Servers
Supervisor Prof. Masayuki Murata
Author Tatsuhiko Terai
February 13th, 2002
Department of Informatics and Mathematical Science, Graduate School of Engineering Science, Osaka University
Master’s Thesis
Dynamic Resource Management Scheme for TCP Connections at Internet Servers
Tatsuhiko Terai
Abstract
Many research efforts have been directed towards avoiding or resolving network congestion, which is caused by the increase in network traffic. However, there has been little recent discussion on improving the performance of Internet endhosts, despite the projection that the bottleneck is shifting from the network to the endhosts. In this thesis, we present a novel resource management scheme for Internet servers that provides high performance and fair service for the many TCP connections they accommodate. We first present a new architecture for fair socket buffer management at Web servers, called Scalable Socket Buffer Tuning (SSBT), which has two major features. One is the equation-based automatic TCP buffer tuning (E-ATBT) method, in which the sender host estimates the 'expected' throughput of each TCP connection through a simple mathematical equation and assigns the send socket buffer accordingly. The other is the simple memory-copy reduction (SMR) scheme, which reduces the number of memory access operations at a TCP sender host by means of new system calls. The second contribution of this thesis is a new resource/connection management scheme for Web proxy servers that improves their performance and reduces the Web document transfer time experienced when they are used. Our proposed scheme has the following two features. One is an enhanced E-ATBT, an enhancement of our E-ATBT tailored to Web proxy servers, which effectively handles the dependency between upward and downward TCP connections and dynamically assigns the receive socket buffer. The other is a connection management scheme that prevents newly arriving TCP connection requests from being rejected at the Web proxy server due to a lack of resources; it manages the persistent TCP connections provided by HTTP/1.1 and intentionally closes them when the resources at the Web proxy server run short. We evaluate the effectiveness of our proposed schemes through simulation and implementation experiments. From both sets of results, we confirm that SSBT can achieve up to a 30% gain in Web server throughput, together with fair and effective usage of the send socket buffer. Furthermore, we show that our proposed connection management scheme can improve Web proxy server performance by up to 70% and reduce the document transfer time for Web client hosts.
Keywords: TCP (Transmission Control Protocol), Web Server, Web Proxy Server, Socket Buffer, Persistent TCP Connection, Resource Management
Contents

1 Introduction  8

2 Handling TCP Connections at Internet Servers  12
  2.1 Web Servers and Web Proxy Servers  12
  2.2 Resources for TCP Connections  13
    2.2.1 Mbuf  15
    2.2.2 File Descriptor  15
    2.2.3 Control Blocks  16
    2.2.4 Socket Buffer  16
  2.3 HTTP/1.1 Persistent TCP Connections  18

3 Scalable Socket Buffer Tuning (SSBT) for Web Servers  23
  3.1 Algorithms  23
    3.1.1 Equation-based Automatic TCP Buffer Tuning (E-ATBT)  23
    3.1.2 Simple Memory-copy Reduction (SMR)  25
  3.2 Simulation Experiments  27
    3.2.1 Experiment 1: Constant Packet Loss Rate  28
    3.2.2 Experiment 2: Drop-Tail  33
  3.3 Implementation Experiments  38
    3.3.1 Experiment 1: Server Performance Improvement by SMR  38
    3.3.2 Experiment 2: Fair Buffer Assignment among Different Connections  39
    3.3.3 Experiment 3: Scalability against the Number of Connections  43
    3.3.4 Experiment 4: Web Server Performance Evaluation  46
  3.4 Summary  48

4 Connection Management for Web Proxy Servers  50
  4.1 Algorithms  50
    4.1.1 Enhanced E-ATBT (E2-ATBT)  50
    4.1.2 Management of Persistent TCP Connections  52
  4.2 Simulation Experiments  53
    4.2.1 Comparison of HTTP/1.0 and HTTP/1.1  55
    4.2.2 Evaluation of Web Proxy Server Performance  56
    4.2.3 Evaluation of Response Time  59
  4.3 Implementation and Experiments  60
    4.3.1 Implementation Issues  60
    4.3.2 Implementation Experiments  63
  4.4 Summary  69

5 Concluding Remarks  70

Acknowledgements  71

References  72
List of Figures

1  Mechanisms of Internet Servers  14
2  Analysis Model  19
3  Probability that a TCP Connection is Active  22
4  Memory Copy Operations in TCP Data Transfer  25
5  Flow Chart of Socket System Calls  27
6  Network Model for Simulation Experiment (1)  29
7  Result of Simulation Experiment (1): p_1 = 0.0001, p_2 = ... = p_8 = 0.02, B = 500 packets  30
8  Result of Simulation Experiment (2): p_1 = 0.0001, p_2 = ... = p_8 = 0.02, B = 100 packets  32
9  Result of Simulation Experiment (3): p_1 = p_2 = 0.02, p_3 = p_4 = 0.01, p_5 = p_6 = 0.002, p_7 = p_8 = 0.001, B = 200 packets  34
10 Network Model for Simulation Experiment (2)  35
11 Results of Simulation Experiment (4): B = 350 packets  36
12 Results of Simulation Experiment (5): B = 200 packets  37
13 Network Model for Implementation Experiment (1)  39
14 Result of Implementation Experiment (1): TCP Throughput using MAPOS Link  40
15 Network Model for Implementation Experiment (2)  41
16 Results of Implementation Experiment (2): Fairness among Three Connections  42
17 Network Model for Implementation Experiment (3)  43
18 Results of Implementation Experiment (3): Changes of Assigned Buffer Sizes  44
19 Network Model for Implementation Experiment (4)  45
20 Result of Implementation Experiment (4): Throughput of MAPOS Connection vs. the Number of Ethernet Connections  46
21 Results of Implementation Experiment (5): Document Size vs. Average Performance Gain by SSBT Scheme  49
22 Network Model for Simulation Experiments  54
23 Result of Simulation Experiment (1): Comparison of HTTP/1.0 and HTTP/1.1  57
24 Results of Simulation Experiment (2): Web Proxy Server Performance  58
25 Results of Simulation Experiment (3): Response Time  61
26 Connection Management Scheme  63
27 Network Model for Implementation Experiment  64
28 Result of Implementation Experiment (1): Web Proxy Server Throughput  65
29 Results of Implementation Experiment (3): Average Performance Gain  67
30 Results of Implementation Experiment (4): Response Time  68

List of Tables

1  Result of Implementation Experiment (2): Number of Established TCP Connections at the Web Proxy Server  66
1 Introduction
The rapid growth of the Internet population has been the impetus for many research efforts toward resolving the network congestion caused by increasing network traffic. However, there has been little work on improving the performance of Internet hosts, in spite of the projected shift of the performance bottleneck from the network to the endhosts. There are already hints of this scenario's emergence, as evidenced by the proliferation of busy Web servers on the present-day Internet that receive hundreds of document transfer requests every second during peak periods. Of course, improving protocol processing at endhosts is not a new subject. An early example can be found in [1], where the authors propose the 'fbuf' (fast buffer) architecture, which shares memory space between the system kernel and the user process to avoid redundant memory copies during data exchanges; it is based on the observation that memory copying is a main cause of the endhost bottleneck in TCP data transfer. Other approaches can be found in, e.g., [2, 3]. However, past research, including the above approaches, does not consider the 'fair' treatment of connections, by which they receive fair service from the server. By achieving fair service among connections, we can also expect a performance improvement, for the following reason. Suppose that a server host is sending TCP data to two clients: one on a 64 Kbps dial-up link (client A) and one on a 100 Mbps LAN (client B). If the server host assigns send socket buffers of equal size to both clients, the assigned buffer is likely to be too large for client A and too small for client B, because of the difference in the capacities (more precisely, the bandwidth–delay products) of the two connections. For an effective buffer allocation to both clients, a compromise on buffer usage should be taken into account. Another important example that requires 'fair' buffer treatment can be found in a busy Internet Web server, which simultaneously accepts a large number of TCP connections with different bandwidths and round-trip times (RTTs). In [4], the authors have proposed
a buffer assignment algorithm called Automatic TCP Buffer Tuning (ATBT), which dynamically adjusts the send socket buffer size according to changes in the TCP sending window size of the connection. However, it cannot provide 'fairness' among connections, because the throughput of a TCP connection is not proportional to its sending window size [5]. We discuss the problems of ATBT further in Subsection 2.2.4. In this thesis, we first propose a novel architecture, called SSBT (Scalable Socket Buffer Tuning), to provide high performance and fair service for many TCP connections at Web servers. For this purpose, we develop two mechanisms in SSBT: E-ATBT (Equation-based Automatic TCP Buffer Tuning) and SMR (Simple Memory-copy Reduction). E-ATBT provides a fair assignment of the socket buffer. In E-ATBT, we estimate the 'expected' throughput of each TCP connection by an analytic approach; the estimate is characterized by the packet loss rate, RTT, and RTO (Retransmission Time Out) values of the connection, which can easily be monitored by the sender host. The send socket buffer is then assigned to each connection based on its expected throughput, taking max-min fairness among connections into consideration. The SMR scheme provides a set of socket system calls that reduce the number of memory copy operations at the sender host in TCP data transfer. It is a replacement for the existing so-called 'zero-copy' mechanisms [2, 3], but it is much simpler and still achieves the same effect in reducing the overhead at the sender host. We validate the effectiveness of SSBT through simulation and implementation experiments. In the simulation experiments, we evaluate the fair assignment of the socket buffer by our E-ATBT algorithm under various network situations. In the implementation experiments, we first confirm the behavior of SSBT at the transport layer, that is, using native TCP data transfer, and then carry out experiments on an SSBT-implemented Web server. We show the effectiveness of SSBT in terms of overall server performance, the number of accommodated connections, and fairness among connections. The second contribution of this thesis concerns the performance improvement of Web proxy servers.
On the current Internet, many requests for Web document transfer are made via Web proxy servers [6]. Since Web proxy servers are usually provided by ISPs (Internet Service Providers) for their customers, they must accommodate a large number of simultaneous HTTP accesses. Furthermore, a Web proxy server must handle both upward TCP connections (from the Web proxy server to Web servers) and downward TCP connections (from the client hosts to the Web proxy server). Hence, the Web proxy server is a likely spot for bottlenecks to occur during Web document transfers, even when the network bandwidth and the Web server performance are sufficient. It is our contention that any effort to reduce the transfer time of Web documents must consider improving Web proxy server performance. The bulk of past research on improving Web proxy server performance has focused on cache replacement algorithms [7, 8]. In [9], the authors evaluated the performance of Web proxy servers through simulation experiments, focusing on the difference between HTTP/1.0 and HTTP/1.1 and including the effects of cookies and of document transfers aborted by client hosts. However, there is little work on resource management at the Web proxy server, and no effective mechanism has been proposed. In this thesis, we first discuss several problems that arise in handling TCP connections at the Web proxy server. One of these problems involves the assignment of the socket buffer to TCP connections. When a TCP connection is not assigned a send/receive socket buffer sized according to its 'expected' throughput, the assigned buffer may be left partly unused or may be insufficient for the intended transfer, which results in a waste of the socket buffer. Another problem is the management of persistent TCP connections, which waste the resources of a busy Web proxy server. A persistent TCP connection avoids the overhead of TCP's three-way handshake and thereby reduces the HTTP document transfer time. However, when a Web proxy server must accommodate many persistent TCP connections without an effective management scheme, its resources remain assigned to them whether they are 'active' or not. As a result,
new TCP connections cannot be established because the server resources run short. We then present a new resource management scheme for Web proxy servers that is capable of solving these problems. Our proposed scheme has the following two features. One is an enhanced E-ATBT, which is an enhancement of the E-ATBT mentioned above, originally proposed for Web servers. Unlike a Web server, the Web proxy server must handle upward and downward TCP connections and must behave as a TCP receiver host to obtain Web documents from Web servers. We therefore enhance E-ATBT so that it effectively handles the dependency between upward and downward TCP connections and dynamically assigns the receive socket buffer. The other is a connection management scheme that prevents newly arriving TCP connection requests from being rejected at the Web proxy server due to a lack of resources. It involves the management of the persistent TCP connections provided by HTTP/1.1, where the Web proxy server intentionally tries to close persistent connections when its resources run short. We show that our proposed scheme can improve Web proxy server performance through both simulation and implementation experiments. In the simulation experiments, we evaluate the essential performance and characteristics of our proposed scheme by comparing them with those of an original (unmodified) Web proxy server. We then present the results of the implementation experiments and confirm the effectiveness of the connection management scheme in terms of Web proxy server throughput and document transfer delay. This thesis is organized as follows. In Section 2, we first introduce how Internet servers handle TCP connections and their resources. In Section 3, we propose our SSBT algorithm for Web servers and confirm its effectiveness through simulation and implementation experiments. We next propose a connection management scheme for Web proxy servers and evaluate its performance through simulation experiments and benchmark tests in Section 4. Finally, we present some concluding remarks in Section 5.
2 Handling TCP Connections at Internet Servers
We first introduce the mechanisms of Internet servers, focusing on the resources used to establish TCP connections. In this thesis we concentrate on Web servers and Web proxy servers, which are widely used to provide the WWW (World Wide Web) service.
2.1 Web Servers and Web Proxy Servers
A Web server delivers Web documents using HTTP (Hypertext Transfer Protocol) in response to document transfer requests from client hosts, as shown in Figure 1(a). A client host browses the delivered Web document using a Web browser. In recent years, Internet services including the WWW have developed rapidly, driven by the explosive increase in the number of Internet users. As a result, Internet traffic has been increasing, which often causes serious network congestion. On the current Internet, a busy Web server must accommodate hundreds of TCP connections for document transfer requests every second during peak periods. Those connections have different characteristics, such as RTT, packet loss rate, and the available bandwidth of the path that the connection traverses, so the resources required by the TCP connections transferring Web documents also differ. However, almost all implementations of current operating systems do not consider such differences when they assign resources to each TCP connection, leading to unfair and ineffective usage of those resources. To establish fair service among many heterogeneous TCP connections, it is very important to assign the resources at the Web server fairly and effectively to each TCP connection. A Web proxy server works as an agent for Web client hosts that request Web documents [10], as shown in Figure 1(b). When it receives a Web document transfer request from a client host, it obtains the requested document from the original Web server on behalf of the client host and delivers it to the client host. It also caches the obtained Web documents; when another client host requests the same document, it transfers the cached document, which results in a much
reduced document transfer time. For example, it was reported in [9] that using Web proxy servers reduces document transfer times by up to 30%. Furthermore, when the cache is hit, the document transfer is performed without establishing any connection to a Web server, so the congestion occurring within the network and at Web servers can also be reduced. Unlike a Web server, the Web proxy server accommodates a large number of connections both from Web client hosts and to Web servers. The Web proxy server behaves as a sender host for downward TCP connections (between the client hosts and the proxy server) and as a receiver host for upward TCP connections (between the proxy server and the Web servers), and there is a close dependency between these connections. Therefore, if resource management is not appropriately configured at the Web proxy server, the document transfer time increases even when the network is not congested and the Web server load is not high. That is, careful and effective resource management is a critical issue in improving the performance of a Web proxy server. On the current Internet, however, most Web proxy servers, including those in [11, 12], lack such considerations. In the following subsections, we describe the resources of Internet servers used to establish TCP connections and point out a problem in handling persistent TCP connections at a busy Internet server through a simple mathematical analysis.
2.2 Resources for TCP Connections
The resources at the Internet server that we focus on in this thesis are the mbuf, the file descriptor, the control blocks, and the socket buffer. These are closely related to the performance of TCP connections transferring Web documents. The mbuf, file descriptor, and control blocks are resources needed to establish TCP connections, while the socket buffer is used for storing the data sent and received by the TCP connections. When these resources run short, the Web proxy server is unable to establish a new TCP connection, and the client host has to wait for existing TCP connections to be closed and their assigned resources to be released. If this cannot be accomplished, the Web proxy server rejects the request.
Figure 1: Mechanisms of Internet Servers ((a) Web Server: client hosts request documents over the Internet and the Web server delivers them; (b) Web Proxy Server: downward TCP connections from client hosts and upward TCP connections to Web servers, with cache hit/no-hit handling)
In what follows, we introduce the resources of Internet servers as they relate to TCP connections. Although our discussion considers FreeBSD-4.0 [13], we believe that it also applies to other OSs, such as Linux.
2.2.1 Mbuf

Each TCP connection is assigned an mbuf, which is located in the kernel memory space and is used to move transmission data between the socket buffer and the network interface. When the data size is larger than an mbuf, the data is stored in additional memory regions, called mbuf clusters, which are linked to the mbuf; several mbuf clusters are used according to the data size. The number of mbufs prepared by the OS is configured when building the kernel; the default is 4096 in FreeBSD [13]. Since each TCP connection is assigned at least one mbuf when it is established, the default number of connections a Web proxy server can establish simultaneously is 4096, which would be too small for busy Internet servers.
2.2.2 File Descriptor

A file descriptor is assigned to each file in a file system so that the kernel and user applications can identify the file. A file descriptor is also associated with each TCP connection when it is established; it is then called a socket file descriptor. The number of connections that can be established simultaneously is therefore limited by the number of file descriptors prepared by the OS. The default number of file descriptors is 1064 in FreeBSD. In contrast to the mbuf, the number of file descriptors can be changed after the kernel is booted. However, since user applications such as Squid [11] allocate memory according to the number of available file descriptors when they boot, it is very difficult to inform them of a change in this number at run time. That is, the number of file descriptors used by an application cannot be changed dynamically.
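For illustration, the sketch below shows how a server application typically queries (and, where permitted, raises) its own descriptor limit at start-up with the standard getrlimit()/setrlimit() interface. It is our own example, not code from Squid or from the thesis implementation, but it reflects why the limit is effectively fixed once the application has sized its tables from it.

```c
/* Sketch: querying and raising the per-process file descriptor limit.
 * An application reads the limit once at start-up (as Squid does when it
 * boots) and sizes its connection tables accordingly, so later changes to
 * the system-wide limit are not picked up. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    /* Raise the soft limit to the hard limit before allocating any
     * per-connection state; a change made after the tables are sized
     * would not be reflected in them. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");

    return 0;
}
```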
2.2.3 Control Blocks

When establishing a new TCP connection, additional memory is needed for the data structures that store the connection information, such as inpcb, tcpcb, and socket. The inpcb structure stores the source and destination IP addresses, the port numbers, and so on. The tcpcb structure stores network information such as the RTT (Round Trip Time), RTO (Retransmission Time Out), and congestion window size, which are used by TCP congestion control [14]. The socket structure stores information about the socket. The maximum number of these structures that can be built in the memory space is initially 1064. Since the memory space for these data structures is set when building the kernel and cannot be changed while the OS is running, new TCP connections cannot be established once this memory space runs short.
2.2.4 Socket Buffer

The socket buffer is used for data transfer operations between user applications and the sender/receiver TCP. When a user application transmits data using TCP, the data is copied to the send socket buffer and subsequently copied to mbufs (or mbuf clusters). The size of the assigned socket buffer is a key issue for effective data transfer by TCP. As explained in the previous section, a compromise on socket buffer usage should be taken into account for an effective socket buffer assignment to each TCP connection. In [4], the authors have proposed an Automatic TCP Buffer Tuning (ATBT) mechanism, where the buffer size assigned to each TCP connection is determined according to the current window size of the connection. That is, when the window size becomes large, the sender host tries to assign more buffer to the connection, and it decreases the assigned buffer size as the window size becomes small. When the total required buffer size of all TCP connections becomes larger than the send socket buffer size prepared at the sender host, the send socket buffer is assigned to each
connection according to a max-min fairness policy. More specifically, the sender host first assigns the buffer equally to all TCP connections; if some connections do not require such a large buffer, the excess buffer is re-assigned to the connections requiring more. Through this mechanism, ATBT is expected to provide a dynamic and fair buffer assignment that takes the different characteristics of connections into account. However, ATBT has several problems. It assigns the send socket buffer to each TCP connection according to its current window size at regular intervals. Therefore, when the sender TCP suddenly decreases its window size due to, e.g., an occasional packet loss, the assigned buffer size sometimes becomes smaller than the connection actually requires. This might be resolved by setting the update interval to a smaller value, but then the assigned buffer size changes too frequently, which makes the system unstable. Furthermore, as the network bandwidth becomes larger, the oscillation of the window size also becomes larger, leading to a large oscillation of the assigned buffer size. Another problem exists in the max-min sharing policy adopted in ATBT. Suppose that three TCP connections (connections 1, 2, and 3) are active, and that the required buffer sizes calculated from their window sizes are 20 KBytes, 200 KBytes, and 800 KBytes, respectively. If the total size of the send socket buffer is 300 KBytes, the sender host first assigns 100 KBytes to each connection. Since connection 1 does not require such a large buffer, the sender re-assigns the 80 KBytes of excess buffer from connection 1 equally to connections 2 and 3. As a result, the assigned buffer sizes of connections 2 and 3 both become 140 KBytes. However, it would be better to assign the excess buffer proportionally to the required buffer sizes of connections 2 and 3; the assigned buffer sizes then become 116 KBytes for connection 2 and 164 KBytes for connection 3. This assignment is more effective because the throughput improvement of connection 3 becomes larger. To overcome these problems and achieve an effective usage of the socket buffer, we propose a novel architecture, called SSBT (Scalable Socket Buffer Tuning), which provides high performance and a fair assignment of the socket buffer for many TCP connections. We will describe our
proposed method in detail in Section 3.
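To make the two re-assignment policies concrete, the following sketch (our own illustration, not code from ATBT or SSBT) reproduces the numerical example above: demands of 20, 200, and 800 KBytes against a 300 KByte total yield 140/140 KBytes under equal re-assignment and 116/164 KBytes under proportional re-assignment.

```c
/* Sketch: dividing a shared send socket buffer among connections,
 * comparing ATBT's equal re-assignment of the excess with the
 * proportional re-assignment argued for above. */
#include <stdio.h>

#define NCONN 3

static void assign(const double demand[NCONN], double total,
                   int proportional, double out[NCONN])
{
    double share = total / NCONN;   /* equal share handed out first        */
    double excess = 0.0;            /* buffer left over by small demands   */
    double unsat_sum = 0.0;         /* total demand of unsatisfied conns   */
    int nunsat = 0;

    for (int i = 0; i < NCONN; i++) {
        if (demand[i] <= share) {
            out[i] = demand[i];
            excess += share - demand[i];
        } else {
            out[i] = share;
            unsat_sum += demand[i];
            nunsat++;
        }
    }
    /* single-pass re-assignment of the excess (sufficient for this example) */
    for (int i = 0; i < NCONN; i++) {
        if (demand[i] > share) {
            double w = proportional ? demand[i] / unsat_sum : 1.0 / nunsat;
            out[i] += excess * w;
        }
    }
}

int main(void)
{
    const double demand[NCONN] = { 20, 200, 800 };   /* KBytes */
    double eq[NCONN], prop[NCONN];

    assign(demand, 300, 0, eq);     /* ATBT-style equal re-assignment      */
    assign(demand, 300, 1, prop);   /* proportional re-assignment          */

    for (int i = 0; i < NCONN; i++)
        printf("connection %d: equal %.0f KB, proportional %.0f KB\n",
               i + 1, eq[i], prop[i]);
    return 0;
}
```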
2.3 HTTP/1.1 Persistent TCP Connections
In recent years, many Web servers and client hosts (namely, Web browsers) have come to support the persistent connection option, which is one of the most important functions of HTTP/1.1 [15]. In the older version of HTTP (HTTP/1.0), the TCP connection between the server and client hosts is closed immediately when the document transfer is completed. However, since Web documents contain many in-line images, it is necessary in HTTP/1.0 to establish TCP connections many times to download them. This results in a significant increase of the document transfer time, since the average size of the Web documents at typical Web servers is only about 10 KBytes [16, 17]; the three-way handshake required for each TCP connection establishment makes the situation worse. In this subsection, we focus on the problems of the persistent connection option at busy Web and Web proxy servers. With an HTTP/1.1 persistent TCP connection, the server preserves the status of the TCP connection, including the congestion window size, RTT, RTO, ssthresh, and so on, when it finishes a document transfer. The server then re-uses the connection and its status when other documents are transferred in the same HTTP session, so that the three-way handshake can be avoided. However, the Web proxy server keeps the TCP connection established irrespective of whether the connection is active (in use for packet transfer) or not; that is, the resources at the server are wasted while the TCP connection is inactive. Thus, a Web proxy server that accommodates many persistent TCP connections may waste a significant portion of its resources maintaining them. In what follows, we present a simple estimation that shows how many TCP connections are active or idle, and explains that many resources are wasted at a typical Web proxy server when persistent TCP connections are used naively. For this purpose, we use the network topology shown in Figure 2, where a Web client host and a Web server are connected via a Web proxy server, and derive the probability that a persistent TCP connection is active, that is, in use for sending TCP packets.
Figure 2: Analysis Model (Client Host — Web Proxy Server — Web Server; cache hit ratio h; RTT, RTO, and loss probability rtt_c, rto_c, p_c on the client side and rtt_s, rto_s, p_s on the server side)
In the figure, p_c, rtt_c, and rto_c denote the packet loss ratio, RTT, and RTO of the TCP connection between the client host and the Web proxy server, respectively; similarly, p_s, rtt_s, and rto_s are those of the connection between the Web proxy server and the Web server. The packet size is fixed at m. The mean throughputs of the TCP connection between the client host and the Web proxy server and of that between the Web server and the Web proxy server, denoted as ρ_c and ρ_s respectively, can be obtained from the analysis results presented in our previous work [18]. Note that [18] provides a more accurate analysis of TCP throughput than those in [5] and [19], especially for small file transfers. Using the results presented in [18], we can derive the time for a Web document transfer via a Web proxy server. For this purpose, we also introduce the parameters h and f, which respectively represent the cache hit ratio of the document at the Web proxy server and the size of the document being transferred. Note that the Web proxy server is likely to cache a whole 'Web page', which includes the main document and some in-line images; that is, when the main document is found in the Web proxy server's cache, the following in-line images are likely to be cached as well, and vice versa. Thus, h is not a fully adequate metric when we examine the effect of the persistent connection, but the following observation also applies to that situation. When the requested document is cached by the Web proxy server, a request to the original Web server is not required, and the document is delivered directly from the Web proxy server to
the client host. When the Web proxy server does not have the requested document, on the other hand, it must be transferred from the appropriate Web server to the client host via the Web proxy server. Thus, the document transfer time, T(f), can be determined as follows:
\[
T(f) = h\left(S_c + \frac{f}{\rho_c}\right) + (1-h)\left(S_c + S_s + \frac{f}{\rho_c} + \frac{f}{\rho_s}\right)
\]
where S_c and S_s represent the connection setup times of the TCP connection between the client host and the Web proxy server and of that between the Web proxy server and the Web server, respectively. To derive S_c and S_s, we must consider the effect of the persistent connection provided by HTTP/1.1, which omits the three-way handshake. Here, we define X_c as the probability that the TCP connection between the client host and the Web proxy server is kept connected by the persistent connection, and X_s as the corresponding probability for the TCP connection between the Web proxy server and the Web server. Then S_c and S_s can be described as follows:
\[
S_c = X_c \cdot \frac{1}{2}\,rtt_c + (1 - X_c) \cdot \frac{3}{2}\,rtt_c \quad (1)
\]
\[
S_s = X_s \cdot \frac{1}{2}\,rtt_s + (1 - X_s) \cdot \frac{3}{2}\,rtt_s \quad (2)
\]
X_c and X_s depend on the length of the persistent timer, T_p. That is, if the idle time between two successive document transfers is smaller than T_p, the TCP connection can be used for the second document transfer; if the idle time is larger than T_p, on the other hand, the TCP connection has already been closed and a new TCP connection has to be established. According to the results in [16], where the authors modeled the access pattern of a client host, the idle time between Web document transfers follows a Pareto distribution whose probability density function is given as
\[
p(x) = \frac{\alpha k^{\alpha}}{x^{\alpha+1}} \quad (3)
\]
where α = 1.5 and k = 1. We can then calculate X_c and X_s as follows:
\[
X_c = d(T_p) \quad (4)
\]
\[
X_s = (1 - h) \cdot d(T_p) \quad (5)
\]
where d(x) is the cumulative distribution function of p(x). From Equations (1)–(5), the average document transfer time, T(f), can be determined. Finally, we can derive U, the utilization of the persistent TCP connection, as follows:
\[
U = \int_0^{T_p} p(x) \cdot \frac{T(f)}{T(f) + x}\, dx + \left(1 - d(T_p)\right) \cdot \frac{T(f)}{T(f) + T_p}
\]
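For illustration, the following sketch evaluates the model above numerically. The throughput values rho_c and rho_s are placeholder assumptions standing in for the results that the analysis in [18] would provide; the remaining parameters follow the text (α = 1.5, k = 1, h = 0.5, f = 12 KBytes).

```c
/* Sketch: numerical evaluation of the connection-utilization model.
 * rho_c and rho_s are assumed values, not results from [18]. */
#include <stdio.h>
#include <math.h>

static const double ALPHA = 1.5, K = 1.0;

static double pareto_pdf(double x) { return ALPHA * pow(K, ALPHA) / pow(x, ALPHA + 1.0); }
static double pareto_cdf(double x) { return x < K ? 0.0 : 1.0 - pow(K / x, ALPHA); }

/* Document transfer time T(f), Equations (1), (2), (4), (5) and T(f). */
static double transfer_time(double f, double h, double Tp,
                            double rtt_c, double rtt_s,
                            double rho_c, double rho_s)
{
    double Xc = pareto_cdf(Tp);              /* Eq. (4) */
    double Xs = (1.0 - h) * pareto_cdf(Tp);  /* Eq. (5) */
    double Sc = Xc * 0.5 * rtt_c + (1.0 - Xc) * 1.5 * rtt_c; /* Eq. (1) */
    double Ss = Xs * 0.5 * rtt_s + (1.0 - Xs) * 1.5 * rtt_s; /* Eq. (2) */
    return h * (Sc + f / rho_c)
         + (1.0 - h) * (Sc + Ss + f / rho_c + f / rho_s);
}

/* Utilization U: trapezoidal integration; p(x) is zero below x = K,
 * so the integral effectively starts there. */
static double utilization(double Tf, double Tp)
{
    const int N = 100000;
    double sum = 0.0, dx = (Tp - K) / N;
    for (int i = 0; i < N; i++) {
        double x0 = K + i * dx, x1 = x0 + dx;
        double f0 = pareto_pdf(x0) * Tf / (Tf + x0);
        double f1 = pareto_pdf(x1) * Tf / (Tf + x1);
        sum += 0.5 * (f0 + f1) * dx;
    }
    return sum + (1.0 - pareto_cdf(Tp)) * Tf / (Tf + Tp);
}

int main(void)
{
    double h = 0.5, f = 12e3 * 8;            /* 12 KBytes, in bits        */
    double rtt_c = 0.02, rtt_s = 0.2;        /* seconds                   */
    double rho_c = 1e6, rho_s = 500e3;       /* assumed throughputs [bps] */

    for (double Tp = 5.0; Tp <= 30.0; Tp += 5.0) {
        double Tf = transfer_time(f, h, Tp, rtt_c, rtt_s, rho_c, rho_s);
        printf("Tp = %4.1f s  T(f) = %.3f s  U = %.3f\n",
               Tp, Tf, utilization(Tf, Tp));
    }
    return 0;
}
```

Even with generous assumed throughputs, U stays small for large T_p, which is the qualitative behavior shown in Figure 3.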
Figure 3 plots the probability that a TCP connection is active as a function of the persistent timer length T_p, for various parameter sets of rtt_c, p_c, rtt_s, and p_s. Here we set h to 0.5, the packet size to 1460 Bytes, and f to 12 KBytes, according to the average size of Web documents reported in [16]. We can see from these figures that the utilization of the TCP connection is very low regardless of the network condition (the RTTs and packet loss ratios on the links between the Web proxy server and the client host/Web server). Thus, if idle TCP connections are kept established at the Web proxy server, a large part of its resources is wasted. Furthermore, we can observe that the utilization becomes larger when the persistent timer is short (< 5 sec), because a smaller T_p prevents situations in which the Web proxy server's resources are wasted. One solution to this problem is simply to discard HTTP/1.1 and use HTTP/1.0, since the latter closes the TCP connection immediately upon the completion of a document transfer. However, HTTP/1.1 has other elegant mechanisms, such as pipelining and content negotiation [15]. We should therefore develop an effective resource management scheme under HTTP/1.1. Our solution is that, as the resources become scarce, the Web proxy server intentionally closes persistent TCP connections that are unnecessarily wasting its resources. We describe our scheme for resolving this problem in detail in Section 4.
Figure 3: Probability that a TCP Connection is Active ((a) rtt_c = 0.02 sec, p_c = 0.001; (b) rtt_c = 0.2 sec, p_c = 0.01; connection utilization plotted against the persistent timer [sec] for rtt_s ∈ {0.02, 0.2} sec and p_s ∈ {0.001, 0.01, 0.1})
3 Scalable Socket Buffer Tuning (SSBT) for Web Servers
In this section, we propose a new resource management architecture for Web servers, called SSBT (Scalable Socket Buffer Tuning), and evaluate its effectiveness through both simulation and implementation experiments. SSBT consists of two mechanisms, the equation-based automatic TCP buffer tuning (E-ATBT) and the simple memory-copy reduction (SMR) scheme, as explained below.
3.1 Algorithms
3.1.1 Equation-based Automatic TCP Buffer Tuning (E-ATBT)

The equation-based automatic TCP buffer tuning (E-ATBT) method solves the problems raised in Subsection 2.2.4. In E-ATBT, the sender host estimates an 'expected' throughput for each TCP connection by monitoring three parameters (packet loss probability, RTT, and RTO values). It then determines the required buffer size of the connection from the estimated throughput, not from the current TCP window size as in ATBT [4]. The estimation method for TCP throughput is based on the analysis result of Hasegawa et al. [20], where the authors derived the average throughput of a TCP connection for a model in which multiple connections with different input link bandwidths share a bottleneck router employing the RED algorithm [21]. The following parameters are necessary to derive the throughput:
• p: the packet dropping probability of the RED algorithm
• rtt: the RTT value of the TCP connection
• rto: the RTO value of the TCP connection
In the analysis, the average window size of the TCP connection is first calculated from the above three parameters. The average throughput is then obtained by considering the performance degradation caused by TCP's retransmission timeouts. The analysis developed in [20] is easily
applied to our scheme by viewing the packet dropping probability of the RED algorithm as the observed packet loss rate. The parameter set (p, rtt, and rto) is obtained at the sender host as follows. The rtt and rto values can be obtained directly from the sender TCP. The packet loss rate p can be estimated from the number of successfully transmitted packets and the number of lost packets detected at the sender host via acknowledgement packets. A possible cause of estimation error in p is the stochastic nature of packet losses, since the analysis in [20] assumes that packet losses occur randomly; at a drop-tail router, however, packets tend to be dropped in a bursty manner. We present evaluation results on this aspect in Subsection 3.2. Note that the appropriateness of the estimation method used in the E-ATBT algorithm has been validated in [20]. However, we can also apply another estimation result for TCP throughput, given by Padhye et al. [5], which uses the additional parameter W_max, the buffer size of the receiver:
\[
\lambda = \min\left(\frac{W_{max}}{RTT},\ \frac{1}{RTT\sqrt{\frac{2bp}{3}} + T_0\,\min\!\left(1,\ 3\sqrt{\frac{3bp}{8}}\right) p\,(1 + 32p^2)}\right)
\]
where b is set to 2 if the receiver uses TCP's delayed ACK option, and to 1 if not. We denote the estimated throughput of connection i by ρ_i. From ρ_i, we simply determine B_i, the required buffer size of connection i, as B_i = ρ_i × rtt_i, where rtt_i is the RTT value of connection i. By this mechanism, a stable assignment of the send socket buffer to TCP connections is expected, provided that the parameter set (p, rtt, and rto) used in the estimation is stable. In ATBT, by contrast, the assignment is inherently unstable even when the three parameters are stable, since the window size oscillates significantly regardless of the stability of the parameters. As in ATBT, our E-ATBT adopts a max-min fairness policy for re-assigning the excess buffer. Differently from ATBT, however, E-ATBT employs a proportional re-assignment policy. That is, when excess buffer is re-assigned to connections needing more buffer,
the buffer is re-assigned proportionally to the required buffer sizes calculated from the analysis, whereas ATBT re-assigns the excess buffer equally, since it has no means of knowing the expected throughput of the connections.

Figure 4: Memory Copy Operations in TCP Data Transfer ((a) Original Mechanism; (b) Proposed Mechanism of the SMR Scheme)
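As a concrete illustration of how the monitored parameters are turned into a buffer demand, the following sketch (our own illustration, not the kernel implementation of E-ATBT) evaluates the throughput formula of [5] quoted above and derives B_i = ρ_i × rtt_i. The per-connection measurements are made-up example values; the resulting demands would then be reconciled with the total socket buffer by the proportional max-min re-assignment sketched in Subsection 2.2.4.

```c
/* Sketch: estimating the 'expected' throughput of each connection from
 * the monitored (p, rtt, rto) values using the formula of [5], and
 * deriving the required send socket buffer B_i = rho_i * rtt_i. */
#include <stdio.h>
#include <math.h>

/* Throughput estimate of [5], in packets per second.
 * wmax: receiver buffer in packets; b: 2 with delayed ACKs, 1 otherwise. */
static double expected_throughput(double p, double rtt, double rto,
                                  double wmax, double b)
{
    double denom = rtt * sqrt(2.0 * b * p / 3.0)
                 + rto * fmin(1.0, 3.0 * sqrt(3.0 * b * p / 8.0))
                   * p * (1.0 + 32.0 * p * p);
    return fmin(wmax / rtt, 1.0 / denom);
}

int main(void)
{
    /* made-up per-connection measurements: loss rate, RTT [s], RTO [s] */
    const double p[]   = { 0.001, 0.02 };
    const double rtt[] = { 0.05,  0.2  };
    const double rto[] = { 0.5,   1.0  };

    for (int i = 0; i < 2; i++) {
        double rho = expected_throughput(p[i], rtt[i], rto[i], 1000.0, 2.0);
        double bi  = rho * rtt[i];   /* required buffer, in packets */
        printf("connection %d: rho = %8.0f pkt/s, B = %7.1f packets\n",
               i + 1, rho, bi);
    }
    return 0;
}
```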
3.1.2 Simple Memory-copy Reduction (SMR)

The other mechanism we implemented for SSBT is the simple memory-copy reduction (SMR) scheme. In TCP data transfer, two memory-copy operations are necessary at the sender host [22]: one from the file system to the application buffer, and the other from the application buffer to the socket buffer, as shown in Figure 4(a). Memory access is the largest obstacle to improving TCP data transfer [23], so reducing the number of memory copy operations is a key to improving the server's protocol processing speed. Reducing the number of memory accesses in TCP data transfer is not a new subject [1, 2]. In Druschel and Peterson [1], the authors proposed the 'fbuf' (fast buffer) architecture, which shares the
memory space between the system kernel and user processes to avoid redundant memory copies during data transfers. However, it is difficult to implement fbuf on actual systems, because its complicated architecture requires many modifications to the memory management mechanism of the operating system. In this thesis, we propose the simple memory-copy reduction (SMR) scheme, a new and simpler mechanism that avoids the memory copy from the file system to the application buffer. The original sender-side mechanism for TCP data transfer in a BSD UNIX system is illustrated in Figure 4(a). When a user application transfers a file using TCP, the file is first copied from the file system (on disk or in memory) to the application buffer located in the user memory space by one memory copy operation. Then, the data in the application buffer is copied to the socket buffer, which is located in the kernel memory space. This is performed by a couple of system calls provided by the socket interface, and so two memory copy operations are necessary. It is redundant, however, to copy a file from the file system to the socket buffer via the application buffer, since it can be copied directly from the file system to the socket buffer during the file transfer. Of course, the memory space should be clearly divided into user and kernel spaces by the UNIX memory management system, so that the kernel memory space is protected from illegal accesses by user applications. However, when we consider the performance improvement of TCP data transfer, we can avoid the memory copy from the file system to the application buffer. Figure 4(b) illustrates the proposed mechanism for TCP data transfer. In the proposed mechanism, the file to be transferred is copied directly from the file system to the socket buffer by the new socket system calls. The socket system calls of the original mechanism take the application buffer as one of their arguments in order to copy the data from the application buffer to the socket buffer. In the proposed mechanism, the socket system calls are modified to take the file descriptor of the file being transferred as an argument, so that the data can be copied directly from the file system to the socket buffer and the redundant memory copy is avoided. In what follows, we explain the new socket system calls implemented in our proposed mechanism. See Figure 5 for the flow chart of the socket system calls of the FreeBSD system.
Figure 5: Flow Chart of Socket System Calls (sendto → sendit → sosend; TCP, UDP, and ICMP paths)

We implemented the proposed mechanism by modifying three socket system calls and adding two new arguments to the mbuf structure [14]. All system calls for TCP data transfer call the sosend routine in order to copy the data to be transferred from the user memory space to the kernel memory space and pass it on for further protocol processing. In the proposed mechanism, we change one of the arguments of sosend from the application buffer to the file descriptor, so that sosend can copy the data directly from the file system to the socket buffer. We also modify the sendto and sendit system calls to handle the file descriptor, in accordance with the change to sosend.
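To illustrate the difference in the data path, the sketch below contrasts the conventional two-copy transfer with the single-copy transfer that a modified, SMR-style socket call permits. The function smr_send() and its prototype are hypothetical; the thesis modifies the existing sendto()/sendit()/sosend() path rather than introducing this exact user-level interface.

```c
/* Sketch: two-copy send path vs. an SMR-style single-copy path.
 * smr_send() below is a hypothetical illustration of the shape of the
 * modified interface, not an actual system call. */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* (a) Original mechanism: file system -> application buffer -> socket
 * buffer, i.e. two memory copies per chunk (cf. Figure 4(a)). */
static int send_file_twocopy(int sock, const char *path)
{
    char buf[8192];
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)    /* copy 1 */
        if (send(sock, buf, (size_t)n, 0) != n)     /* copy 2 */
            break;
    close(fd);
    return n < 0 ? -1 : 0;
}

/* (b) SMR-style mechanism: the application passes the file descriptor,
 * and the kernel copies directly from the file system to the socket
 * buffer (cf. Figure 4(b)).  Hypothetical prototype: */
ssize_t smr_send(int sock, int filefd, size_t len, int flags);

static int send_file_smr(int sock, const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = smr_send(sock, fd, len, 0);         /* single in-kernel copy */
    close(fd);
    return n < 0 ? -1 : 0;
}
```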
3.2 Simulation Experiments
In this subsection, we present simulation results for E-ATBT obtained with the network simulator ns-2 [24]. To evaluate the effectiveness of E-ATBT, we compared the following three algorithms.
EQ: All active TCP connections are assigned an equal share of the send socket buffer.
ATBT: The send socket buffer is assigned according to the automatic TCP buffer tuning algorithm described in Subsection 2.2.4.
E-ATBT: The sender host assigns the send socket buffer in accordance with the equation-based automatic TCP buffer tuning algorithm described in Subsection 3.1.1.
In all three algorithms, buffer re-assignment is performed every second. We considered two scenarios for packet loss occurrence. We first assumed a constant packet loss rate, implicitly assuming that the router employs the RED algorithm; that is, packet losses take place randomly with a given probability. Such an assumption has been validated in [5]. However, random packet dropping is a rather ideal situation for E-ATBT, since our method relies largely on an adequate estimation of the packet loss rate to determine the throughput of TCP connections. Hence, we also investigated the effect of a drop-tail router, where packet losses are caused by buffer overflow at the router; in that case packet losses cannot be expected to be random and tend to occur in a bursty fashion.
3.2.1 Experiment 1: Constant Packet Loss Rate

We first investigate the effect of a constant packet loss rate experienced by each TCP connection. The network model used in this simulation experiment is depicted in Figure 6. It consists of a sender host and eight receiver hosts (receiver hosts 1 through 8), each of which is connected to the sender host by a 100 Mbps link. The propagation delays between the sender host and the receiver hosts (denoted by τ) are all set to 6 msec. We set the packet loss probability on the link between the sender host and receiver host i to p_i. The sender host has B packets of socket buffer in total. Each simulation experiment starts at time 0. TCP connection i, which is established from the sender host to receiver host i, starts packet transmission at time t = (i − 1) sec, and all connections stop packet transmission at t = 1000 sec. We first set the packet dropping probabilities to p_1 = 0.0001 and p_2 = ... = p_8 = 0.02. The socket buffer size B of the sender is set to 500 packets, and the packet size is fixed at 1000 Bytes. The expected throughput values at the receiver hosts then become about 95 Mbps for receiver host 1 and about 7 Mbps for the remaining receiver hosts.
Figure 6: Network Model for Simulation Experiment (1) (sender host with a total buffer of B packets; receiver hosts 1–8 connected by 100 Mbps links with τ = 6 msec and loss probabilities p_1–p_8)
Figure 7 plots the time-dependent behavior of the assigned buffer sizes and throughput values of the connections obtained by the three mechanisms: EQ, ATBT, and E-ATBT. In the figure, the labels "#1" – "#8" represent connections 1 through 8. In the EQ mechanism (Figures 7(a) and 7(b)), the throughput of connection 1 is degraded during 200 < t < 500 sec and after t = 600 sec. The first degradation occurs because the buffer assigned to connection 1 is suddenly decreased when the second and third connections start packet transmission, which results in a temporary degradation of the throughput of connection 1. At t = 600 sec, the number of active connections exceeds six. Connection 1 requires about 110 packets to achieve its estimated throughput for p_1 = 0.0001, but once the number of active connections exceeds six, the buffer assigned to it by the EQ mechanism falls below 110 packets, which is why its throughput decreases. The other connections, on the other hand, require only about 10 packets of the send socket buffer and therefore attain constant throughput even when their assigned buffer sizes become smaller.
Figure 7: Result of Simulation Experiment (1): p_1 = 0.0001, p_2 = ... = p_8 = 0.02, B = 500 packets ((a) EQ: Assigned Buffer Size; (b) EQ: Throughput; (c) ATBT: Assigned Buffer Size; (d) ATBT: Throughput; (e) E-ATBT: Assigned Buffer Size; (f) E-ATBT: Throughput)
In both ATBT and E-ATBT, on the other hand, the throughput of connection 1 keeps a high value even when the number of connections becomes large, as shown in Figures 7(d) and 7(f). This is because the send socket buffer is assigned appropriately; that is, connection 1 is assigned a larger buffer than the other connections (see Figures 7(c) and 7(e)) even after the other connections join. Comparing ATBT and E-ATBT, E-ATBT offers a more stable buffer assignment, as shown in Figure 7(e); however, comparing Figures 7(d) and 7(f) shows that the throughput values obtained in the two cases are not very different. We next decrease the total buffer size B to 100 packets (Figure 8). In all three mechanisms, the throughput of connection 1 is degraded as the number of connections increases; however, E-ATBT provides the highest throughput to connection 1. E-ATBT estimates the required buffer size for all connections and assigns only a small buffer to connections 2 through 8, since they do not require a large buffer because of their high packet loss rates. The excess buffer can then be utilized by connection 1, so that connection 1 enjoys high throughput, as shown in Figures 8(e) and 8(f). Furthermore, the buffer assignment of E-ATBT is very stable compared with that of ATBT. In ATBT, connections 2 through 8 temporarily inflate their window sizes according to the TCP congestion control algorithm, which decreases the buffer assigned to connection 1 (Figure 8(c)) and degrades its throughput. In another experiment using the network model depicted in Figure 6, we set the p_i as follows: p_1 = p_2 = 0.02, p_3 = p_4 = 0.01, p_5 = p_6 = 0.002, p_7 = p_8 = 0.001. That is, four kinds of connections are set up at the sender host. We set the buffer size B to 200 packets, which is relatively small compared with the total required buffer size of the eight connections. Results are shown in Figure 9. Connections 7 and 8 start packet transmission at 750 sec and 875 sec, respectively. In the EQ mechanism, connections 7 and 8 receive the same throughput as connections 5 and 6, as shown in Figure 9(b), even though connections 7 and 8 have lower packet loss rates than connections 5 and 6. In ATBT, the throughputs of connections 7 and 8 are slightly higher than those of connections 5 and 6 (Figure 9(d)), but the buffer assignment oscillates heavily (Figure 9(c)). On the other hand, Figure 9(e) shows that E-ATBT keeps a stable buffer assignment.
Figure 8: Result of Simulation Experiment (2): p_1 = 0.0001, p_2 = ... = p_8 = 0.02, B = 100 packets ((a) EQ: Assigned Buffer Size; (b) EQ: Throughput; (c) ATBT: Assigned Buffer Size; (d) ATBT: Throughput; (e) E-ATBT: Assigned Buffer Size; (f) E-ATBT: Throughput)
It can also provide the highest throughput to connections 7 and 8 without degrading the throughput of the other connections (Figure 9(f)).
3.2.2 Experiment 2: Drop-Tail

We next use the network model depicted in Figure 10 to examine the effect of packet losses caused by buffer overflow at a router. In contrast to the constant packet loss rate of the previous subsection, the packet loss rate of each TCP connection now oscillates because of bursty packet losses, which may affect the behavior of the E-ATBT mechanism. The simulation model consists of a sender host, seven receiver hosts, six routers, and links connecting the routers and the sender/receiver hosts. Seven TCP connections are established at the sender host (connections 1 through 7). As shown in Figure 10, connection 1 occupies router 1, connections 2 and 3 share router 2, and the remaining connections (4 through 7) share router 3. The capacities of the links between routers are all set to 1.5 Mbps; the links between the routers and the sender host, and between the routers and the receiver hosts, have capacities of 155 Mbps and 10 Mbps, respectively. The propagation delays of the links between routers, and of those between the routers and the sender/receiver hosts, are 100 msec and 1 msec, respectively. In this simulation experiment, connection i starts packet transmission at time t = (i − 1) × 500 sec, and the simulation ends at 4,000 sec. When all seven connections have joined the network, the ideal throughput becomes 1.5 Mbps for connection 1, 0.75 Mbps for connections 2 and 3, and 0.375 Mbps for connections 4 through 7. Figure 11 shows the throughput and assigned buffer size as a function of time, with the send socket buffer size set to B = 350 packets. As shown in Figures 11(a) and 11(b), the connections sharing router 3 (connections 4 through 7) receive unfair throughput under EQ, even though their conditions are completely identical. This is due to an incidental unfairness of the TCP congestion control algorithm, which is well known in the literature; see, e.g., [25, 26].
Figure 9: Result of Simulation Experiment (3): p_1 = p_2 = 0.02, p_3 = p_4 = 0.01, p_5 = p_6 = 0.002, p_7 = p_8 = 0.001, B = 200 packets ((a) EQ: Assigned Buffer Size; (b) EQ: Throughput; (c) ATBT: Assigned Buffer Size; (d) ATBT: Throughput; (e) E-ATBT: Assigned Buffer Size; (f) E-ATBT: Throughput)
[Figure 10: sender host (buffer size: B packets) connected to Routers 1-3 by 155 Mbps, 1 msec links; 1.5 Mbps, 100 msec links between router pairs; 10 Mbps, 1 msec links to the receiver hosts]
Figure 10: Network Model for Simulation Experiment (2)
This is because ATBT assigns the buffer to each TCP connection according to the current window size of the connection, which oscillates due to the inherent nature of the TCP window mechanism. Therefore, the assigned buffer sizes of the connections that do not require a large buffer (connections 2 through 7 in this situation) are often inflated. This leads to a temporary decrease of the buffer assigned to the connection that does need a large buffer (connection 1), as in the constant packet loss rate case of the previous subsection. As a result, the throughput of connection 1 is often degraded. On the other hand, our proposed E-ATBT keeps a stable and fair buffer assignment, as shown in Figure 11(e), leading to fair treatment in terms of throughput (Figure 11(f)).
Even when the buffer size is smaller, E-ATBT keeps its effectiveness. Results are shown in Figure 12, where the buffer size B is decreased to 200 KBytes. Figures 12(b) and 12(d) show that the EQ and ATBT mechanisms cannot provide fairness among the connections. In particular, the throughput of connection 1 is significantly degraded, and the connections going through router 3 receive different throughput values.
[Figure 11 panels: (a) EQ: Assigned Buffer Size, (b) EQ: Throughput, (c) ATBT: Assigned Buffer Size, (d) ATBT: Throughput, (e) E-ATBT: Assigned Buffer Size, (f) E-ATBT: Throughput; Assigned Buffer Size [packets] and Throughput [Mbps] plotted against Time [sec] for connections #1-#7]
Figure 11: Results of Simulation Experiment (4): B = 350 packets
[Figure 12 panels: (a) EQ: Assigned Buffer Size, (b) EQ: Throughput, (c) ATBT: Assigned Buffer Size, (d) ATBT: Throughput, (e) E-ATBT: Assigned Buffer Size, (f) E-ATBT: Throughput; Assigned Buffer Size [packets] and Throughput [Mbps] plotted against Time [sec] for connections #1-#7]
Figure 12: Results of Simulation Experiment (5): B = 200 packets
This is because, in assigning the buffer, the EQ method does not take the connections' characteristics into account at all; see Figure 12(a). The same holds for the ATBT mechanism (Figure 12(c)), mainly because of the oscillation of the buffer assigned to each connection, as shown in Figure 12(c). The figure also verifies that ATBT, which re-assigns the excess buffer equally to the connections, does not work well. On the other hand, in the E-ATBT algorithm the throughput of connection 1 is kept high, and much better fairness can be achieved (Figures 12(e) and 12(f)). Note that the slight difference in the throughput values of connections 2 and 3 (and of connections 4 through 7) is due to estimation errors in the required buffer sizes. With the drop-tail router, the parameters (p, rtt, and rto) observed for the throughput estimation vary more widely than in the constant packet loss rate case because of the bursty nature of packet losses, which leads to differences in the assigned buffer sizes.
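For reference, the kind of per-connection estimate that this throughput estimation refers to can be sketched as follows. The thesis only states that the expected throughput is derived from the observed packet loss rate p, round-trip time rtt, and retransmission timeout rto; the concrete formula below is the well-known steady-state approximation of Padhye et al. [5], used here only as a plausible stand-in and not necessarily the exact expression used in E-ATBT, and the bandwidth-delay-product buffer rule is likewise an assumption for illustration.

from math import sqrt

def expected_throughput(p, rtt, rto, mss=1460.0, b=2):
    """Steady-state TCP throughput estimate [bytes/sec] from loss rate p,
    round-trip time rtt [sec], and retransmission timeout rto [sec],
    following the Padhye et al. approximation (illustrative only)."""
    if p <= 0.0:
        return float("inf")  # no observed loss: throughput is limited elsewhere
    retransmit_term = rtt * sqrt(2.0 * b * p / 3.0)
    timeout_term = rto * min(1.0, 3.0 * sqrt(3.0 * b * p / 8.0)) * p * (1.0 + 32.0 * p * p)
    return mss / (retransmit_term + timeout_term)

def required_send_buffer(p, rtt, rto):
    """Rough required send socket buffer: expected throughput times rtt
    (a bandwidth-delay-product heuristic, assumed here for illustration)."""
    return expected_throughput(p, rtt, rto) * rtt

# A lossy connection needs a much smaller buffer than an almost loss-free one.
print(required_send_buffer(0.02, 0.2, 1.0) / 1024.0)    # a few KBytes
print(required_send_buffer(0.0001, 0.2, 1.0) / 1024.0)  # on the order of 100 KBytes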
3.3 Implementation Experiments
We next present the results obtained with our experimental system. We implemented the EQ, ATBT, and SSBT mechanisms described in the previous subsections on Intel Pentium PCs running FreeBSD. In the experiments, we focus on the following four subjects: (1) server performance improvement by the SMR scheme; (2) fair buffer assignment among different connections; (3) scalability against the number of connections; and (4) performance improvement of the Web server. The experimental results are presented in the following subsections in turn.
3.3.1 Experiment 1: Server Performance Improvement by SMR

We implemented the SMR scheme explained in Subsection 3.1.2 on Pentium-III Xeon 550 MHz machines with 512 MBytes of memory, running FreeBSD 4.0-RELEASE [13].
[Figure 13: sender host and receiver host directly connected by a 622 Mbps MAPOS link]
Figure 13: Network Model for Implementation Experiment (1)
The two machines are directly connected by a 622 Mbps MAPOS (Multiple Access Protocol Over SONET/SDH) [27] link, as shown in Figure 13. We did not change the TCP/IP implementation, except that the delayed ACK option [14] is not used. The sender host transmits 500 MBytes of data to the receiver host over a single TCP connection through the MAPOS link. We then calculate the mean TCP throughput by measuring the transfer time of the data. Figure 14 shows the mean TCP throughput of the original and proposed mechanisms as a function of the socket buffer size of the sender host. The socket buffer size of the receiver host is set to the same value as that of the sender host. The result shows that the proposed mechanism can improve the throughput by up to about 30% compared with the original mechanism. This is because the bottleneck is located at the endhosts, since the 622 Mbps bandwidth of the MAPOS link is large enough. That is, the sender host cannot inject data packets fast enough to fully utilize the link bandwidth in the original system, because its TCP data transfer processing is slow compared with the network bandwidth. Our SMR scheme, however, can utilize the bandwidth of the MAPOS link much more fully than the original system, and therefore the effect of the proposed mechanism becomes significant. In the following experiments, we use the server machine equipped with the SMR scheme.
3.3.2 Experiment 2: Fair Buffer Assignment among Different Connections

We next evaluate the E-ATBT mechanism of SSBT. First, we investigate the fairness property of the three buffer assignment algorithms described in Subsection 3.2.
[Figure 14: TCP throughput [Mbps] of the Proposed Method and the Original Method vs. socket buffer size (16-512 KBytes)]
Figure 14: Result of Implementation Experiment (1): TCP Throughput using MAPOS Link
In this experiment, we consider the situation where three TCP connections are established at the sender host through the 622 Mbps MAPOS link. The three connections have different packet loss rates: 0.005 for connection 1, 0.01 for connection 2, and 0.02 for connection 3, as shown in Figure 15. In the experiment, packets are intentionally dropped at the receiver host. Note that with those packet loss rates, the three connections can achieve throughput values of about 220, 90, and 55 Mbps, respectively, when a sufficiently large send socket buffer is assigned to each connection. Figure 16 compares the throughput values of the three connections and the total throughput. The horizontal axis shows the total size of the send socket buffer. In EQ (Figure 16(a)), the three connections can achieve their maximum throughputs only when the total buffer size is larger than 240 KBytes, while ATBT and E-ATBT need only about 200 KBytes and 170 KBytes, respectively. This is due to the same reason explained in Subsection 3.2: connections 2 and 3 do not need as much buffer as connection 1 does, but the EQ algorithm cannot re-allocate the excess buffer of connections 2 and 3 to connection 1. In contrast, the ATBT and E-ATBT algorithms assign the buffer well. Comparing the ATBT and E-ATBT algorithms in Figures 16(b) and 16(c), it is clear that E-ATBT provides much higher throughput for connection 1. The difference comes from the policies for re-assigning the excess buffer to connections.
[Figure 15: sender host and receiver host connected by a 622 Mbps MAPOS link carrying connection 1 (packet loss rate 0.005), connection 2 (0.01), and connection 3 (0.02)]
Figure 15: Network Model for Implementation Experiment (2)
In the ATBT algorithm, the excess buffer of connection 3 is re-assigned equally to connections 1 and 2. As a result, only connection 3 is assigned a sufficient buffer size, while connections 1 and 2 still need more buffer. E-ATBT, on the other hand, re-assigns the excess buffer in proportion to the required buffer sizes of connections 1 and 2. Accordingly, E-ATBT gives more buffer to connection 1 than to connection 2, because the required buffer size of connection 1 is larger than that of connection 2. Consequently, connection 1 can achieve higher throughput at the expense of a small throughput degradation of connection 2, and the total throughput of E-ATBT is the highest among the three algorithms regardless of the total buffer size, as shown in Figure 16.
We next present the time-dependent behavior of the buffer size assigned to each connection in Figure 18. We use the network topology depicted in Figure 17. For the routers (Routers 1 and 2) in the figure, we use PC-based routers with ALTQ [28] to realize RED routers. We establish connections 1, 2, and 4 between the server host and client host 1, and connection 3 between the server host and client host 2. In this experiment, connections 1 and 2 start packet transmission at time 0 sec, and connections 3 and 4 join the network at time 5 sec and 10 sec, respectively. Here, we set the total buffer size to 512 KBytes. In ATBT (Figure 18(a)), the assigned buffer sizes oscillate heavily because the ATBT algorithm adapts the buffer assignment according to the window size. Another, more important reason is that when a retransmission timeout occurs at a certain connection, that connection resets its window size to 1 packet according to the TCP retransmission mechanism.
[Figure 16 panels: (a) EQ, (b) ATBT, (c) E-ATBT; Throughput [Mbps] of Connection 1, Connection 2, Connection 3, and Total vs. Total Buffer Size [KBytes] (80-280)]
Figure 16: Results of Implementation Experiment (2): Fairness among Three Connections
[Figure 17: server host, Routers 1 and 2, and client hosts 1 and 2 connected by 1000 Mbps and 100 Mbps links]
Figure 17: Network Model for Implementation Experiment (3)
The assigned buffer size of that connection then becomes very low. Once the assigned buffer size becomes small, the connection cannot inflate its window size for a while because of the throttled buffer size. This is why the assigned buffer size stays low for a while in the ATBT algorithm. On the other hand, E-ATBT provides a stable and fair buffer assignment, as shown in Figure 18(b), which brings higher throughput to each TCP connection.
3.3.3 Experiment 3: Scalability against the Number of Connections

We next turn our attention to the scalability of the buffer assignment algorithms against the number of connections. Figure 19 depicts the experimental setting for this purpose. One TCP connection is established over the MAPOS link (referred to as the MAPOS connection below), and several TCP connections are simultaneously established through the Ethernet link (Ethernet connections). Through this experiment, we intend to investigate the effect of the number of Ethernet connections on the performance of the MAPOS connection. Figure 20 shows the throughput of the MAPOS connection as a function of the number of Ethernet connections.
[Figure 18 panels: (a) ATBT, (b) E-ATBT; Assigned Buffer Size [KBytes] of Connections 1-4 vs. Time [sec] (0-40)]
Figure 18: Results of Implementation Experiment (3): Changes of Assigned Buffer Sizes
[Figure 19: sender host with Ethernet connections 1 to N over a 100 Mbps link and a MAPOS connection over a 622 Mbps link to the receiver host]
Figure 19: Network Model for Implementation Experiment (4)
In addition to the results of the EQ, ATBT, and E-ATBT algorithms, we also present results where a constant send socket buffer size (16 KBytes or 64 KBytes) is assigned to each TCP connection. Such a constant assignment is the default mechanism of the current TCP/IP implementation in major operating systems. The results are plotted in the figure with the labels "16KB" and "64KB". In this figure, we set the total send socket buffer to 256 KBytes. Therefore, if we assign a constant buffer size of 64 KBytes to each TCP connection, the sender allows up to four connections at the same time.
We can make several important observations from Figure 20. First, we can see that the constant assignment algorithm, which is widely employed in current OSs, has the following drawback. If a 16 KBytes send socket buffer is assigned to each TCP connection, the MAPOS connection suffers from very low throughput, because such a buffer is too small for the 622 Mbps MAPOS link. When each connection is given a 64 KBytes buffer, on the other hand, the throughput of the MAPOS connection becomes considerably higher, but the number of Ethernet connections that can be simultaneously established is severely limited. In the EQ algorithm, when the number of Ethernet connections exceeds four, the throughput of the MAPOS connection suddenly decreases to about 11 Mbps. This is because the EQ algorithm does not distinguish the MAPOS connection from the Ethernet connections and assigns an equal share of the send socket buffer to all connections. Therefore, as the number of Ethernet connections increases, the buffer size assigned to the MAPOS connection decreases, leading to throughput degradation of the MAPOS connection. When we employ the ATBT and E-ATBT algorithms, the throughput degradation of the MAPOS connection is limited even when the number of Ethernet connections increases. However, when the number of Ethernet connections exceeds 10, the throughput values of the ATBT and
[Figure 20: Throughput [Mbps] of the MAPOS connection vs. the number of Ethernet connections (0-16) for E-ATBT, ATBT, EQ, and the constant 16KB and 64KB assignments]
Figure 20: Result of Implementation Experiment (4): Throughput of MAPOS Connection vs. the Number of Ethernet Connections
E-ATBT algorithms become distinguishable. This is again caused by the instability of the assigned buffer size and by the poor re-assignment algorithm of ATBT, as explained in Subsection 3.3.2. Even in the E-ATBT algorithm, the throughput of the MAPOS connection is slightly degraded as the number of Ethernet connections becomes larger. This is because the total required buffer size of the Ethernet connections increases, even though the required buffer size of each Ethernet connection is small; the buffer size assigned to the MAPOS connection then decreases because the excess buffer becomes small. Nevertheless, the degree of throughput degradation of the MAPOS connection remains limited in the E-ATBT algorithm.
3.3.4 Experiment 4: Web Server Performance Evaluation

In the last experiment, the effectiveness of the SSBT scheme is investigated using a machine running the Apache Web server [29]. We use the network topology in Figure 17 and run the SSBT-enabled Web server at the server host. For the routers (Routers 1 and 2) in the figure, we use PC-based routers with NISTNet [30] to emulate a variety of network conditions, such as large RTTs and packet losses. In this experiment, we set the delay between the server host and each client host to 2 msec. At the client host, we run httperf [31] to generate document transfer
requests to the Web server by HTTP. Furthermore, to consider a realistic scenario for the traffic arriving at the server, we follow [16] in determining the characteristics of the Web accesses, including the document size distribution, the idle time distribution between successive requests, and the distribution of the number of embedded documents. We use HTTP/1.1 for Web access to the server. The receive socket buffer size of each TCP connection is set to 64 KBytes. We run each experiment for 30 minutes, which corresponds to about 25,000 document transfer requests generated by each client.
Figure 21 shows the average performance gain obtained by the SSBT scheme as a function of the document size. Here, the average performance gain for documents of size i Bytes is determined as follows. Let the number of requests for documents of size i during the experiments be M_{SSBT,i} and M_{orig,i} with and without the SSBT scheme, respectively. Similarly, the observed transfer time of the j-th document of size i is denoted by T_{SSBT,i,j} and T_{orig,i,j}. Then, the average performance gain G_i for documents of size i is defined as

G_i \,[\%] = 100 \times \frac{\displaystyle \frac{1}{M_{SSBT,i}} \sum_{j=1}^{M_{SSBT,i}} \frac{i}{T_{SSBT,i,j}}}{\displaystyle \frac{1}{M_{orig,i}} \sum_{j=1}^{M_{orig,i}} \frac{i}{T_{orig,i,j}}}
Note that since we generate the requests according to probability distributions for the document size, the idle time, and the number of embedded documents, M_{orig,i} and M_{SSBT,i} may differ between the experiments with and without SSBT. If G_i exceeds 100%, our SSBT outperforms the original mechanism. The results for 10, 100, 300, and 600 clients are shown in Figures 21(a), 21(b), 21(c), and 21(d), respectively. At first sight, SSBT seems to degrade the performance of the Web server when the document size is small. This is caused by TCP's retransmission timeouts, as follows. When a requested document is small, the window size of the TCP connection carrying it does not grow large. Therefore, if a packet loss occurs, the TCP connection tends to wait for a retransmission timeout, since the window size is too small to invoke the fast retransmit algorithm. As a result, the timeout duration occupies most of
the document transfer. That is, the transfer time of a small document is dominated by whether a timeout occurs, not by whether the SSBT algorithm is used. When the document size is large, on the other hand, the effectiveness of SSBT becomes apparent, as expected. Especially when the number of clients increases, it greatly reduces the document transfer times (see Figures 21(c) and 21(d)). This means that the SSBT algorithm shares the Web server resources in an effective and fair manner compared with the original mechanism. These results show that the SSBT algorithm is effective even at actual Internet servers.
3.4 Summary
In this section, we have proposed SSBT, a novel architecture for effectively and fairly utilizing the send socket buffer of a busy Web server. Through various experiments, we have confirmed that the E-ATBT algorithm can assign send socket buffers to heterogeneous TCP connections in an effective and fair manner. We have also shown that the SMR scheme can reduce the number of memory-copy operations when data packets are sent by TCP, which results in a significant throughput improvement of TCP connections. We have verified that our proposed mechanisms can improve the overall performance of a Web server.
[Figure 21 panels: (a) 10 clients, (b) 100 clients, (c) 300 clients, (d) 600 clients; Average Performance Gain [%] (10-1000) vs. document size [Bytes]]
Figure 21: Results of Implementation Experiment (5): Document Size vs. Average Performance Gain by SSBT Scheme
4 Connection Management for Web Proxy Servers
In this section, we propose a new resource management scheme suitable for Web proxy servers, which solves the problems pointed out in Section 2.
4.1 Algorithms
4.1.1 Enhanced E-ATBT (E2-ATBT)

As described in Section 2, a Web proxy server must accommodate hundreds of TCP connections with different characteristics, just as a Web server does. The E-ATBT scheme proposed in Subsection 3.1.1 can therefore also be applied to a Web proxy server. Unlike a Web server, however, a Web proxy server behaves as a TCP receiver host when it obtains a requested document from a Web server. We thus have to incorporate a receive socket buffer management algorithm, which was not considered in the original E-ATBT described in Subsection 3.1.1. We also have to consider the dependency between upward and downward TCP connections. In what follows, we propose E2-ATBT (Enhanced E-ATBT), a revision of our original E-ATBT with two additional schemes that address these two problems.
Handling the Relation between Upward and Downward TCP Connections

A Web proxy server relays a document transfer request to a Web server on behalf of a Web client host, as depicted in Figure 1(b). Thus, there is a close relation between an upward TCP connection (from the Web proxy server to the Web server) and a downward TCP connection (from the client host to the Web proxy server); the difference in the expected throughput of the two connections should be taken into account when socket buffers are assigned to them. For example, when the throughput of a certain downward TCP connection is larger than that of the other concurrent downward TCP connections, E-ATBT assigns a larger socket buffer to that connection. However, if the throughput of the upward TCP connection corresponding to
the downward TCP connection is low, the send socket buffer assigned to the downward TCP connection is likely not to be fully utilized. In this situation, the unused send socket buffer should be assigned to the other concurrent TCP connections that have smaller socket buffers, so that the throughput of those connections is improved.
There is one problem that must be overcome to realize this method. The kernel identifies each TCP connection by its control block (tcpcb), but the relation between an upward and a downward connection cannot be determined from it. Two possible ways to resolve this problem can be considered:
• The Web proxy server monitors the utilization of the send socket buffer of each downward TCP connection, and decreases the assigned buffer size of connections whose send socket buffer is not fully utilized.
• When the Web proxy server sends the document transfer request to the Web server, it attaches to the packet header information about the corresponding downward TCP connection that originated the request.
The former algorithm can be realized by modifying only the Web proxy server, whereas the latter requires cooperation at the HTTP level. At a higher level of abstraction, the two algorithms have a similar effect. However, the latter is harder to implement, despite its ability to achieve more precise control. In the simulation experiments described in Subsection 4.2, we assume that the Web proxy server knows the dependency between the downward and upward TCP connections; we use this assumption to confirm the essential performance of the proposed algorithm.
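A minimal sketch of the first (monitoring-based) option is given below. The utilization threshold, shrink factor, and field names are assumptions introduced purely for illustration; they are not part of the thesis implementation.

def shrink_underused_downward_buffers(connections, low_util=0.5, shrink_factor=0.5):
    """For each downward connection, compare the data queued in the send socket
    buffer with the assigned buffer size; if the buffer stays mostly empty
    (the corresponding upward connection is the real bottleneck), shrink it
    so the surplus can be re-assigned to other connections."""
    reclaimed = 0
    for conn in connections:
        utilization = conn["queued_bytes"] / conn["sndbuf"]
        if utilization < low_util:
            new_size = max(int(conn["sndbuf"] * shrink_factor), conn["min_sndbuf"])
            reclaimed += conn["sndbuf"] - new_size
            conn["sndbuf"] = new_size
    return reclaimed  # bytes available for re-assignment by E2-ATBT

conns = [
    {"queued_bytes": 2_000, "sndbuf": 65_536, "min_sndbuf": 4_096},   # slow upward side
    {"queued_bytes": 60_000, "sndbuf": 65_536, "min_sndbuf": 4_096},  # fully used
]
print(shrink_underused_downward_buffers(conns))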
Control of Receive Socket Buffer

In most past research, it was assumed that a receiver host has a sufficiently large receive socket buffer, based on the consideration that the performance bottleneck of the data transfer is not at the endhosts but within the network. Accordingly, many OSs assign a small receive socket buffer to each TCP connection; for example, the default size of the receive socket buffer in the FreeBSD system is 16 KBytes. As reported in [32], however, this is now regarded as very small, because network bandwidth has increased dramatically in the current Internet and the performance of Internet servers keeps growing. To avoid a performance limitation introduced by the receive socket buffer, the receiver host should adjust its receive socket buffer size to match the congestion window size of the sender host. This can be done by monitoring the utilization of the receive socket buffer, or by adding information about the window size to the data packet headers, similarly to the handling of the connection dependency described above. In the simulation experiments of Subsection 4.2, we assume that the Web proxy server can obtain complete information about the required receive socket buffer sizes of the upward TCP connections and control them accordingly.
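Under that simulation assumption, the receive buffer control reduces to something like the following sketch; the clamp values and field names are illustrative only.

def tune_receive_buffer(upward_conn, min_rcvbuf=16 * 1024, max_rcvbuf=256 * 1024):
    """Grow or shrink the receive socket buffer of an upward TCP connection so
    that it tracks the congestion window of the sending Web server, instead of
    staying at the small OS default (16 KBytes in FreeBSD)."""
    target = upward_conn["sender_cwnd_bytes"]
    upward_conn["rcvbuf"] = max(min_rcvbuf, min(target, max_rcvbuf))
    return upward_conn["rcvbuf"]

print(tune_receive_buffer({"sender_cwnd_bytes": 90_000, "rcvbuf": 16_384}))  # 90000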
4.1.2 Management of Persistent TCP Connections

As explained in Subsection 2.3, careful treatment of persistent TCP connections at the Web proxy server is necessary for efficient usage of the Web proxy server resources. We propose a new management scheme for persistent TCP connections at the Web proxy server that considers the amount of remaining resources. The key idea is as follows. When the load of the Web proxy server is low and the remaining resources are sufficient, it tries to keep as many TCP connections open as possible. When the resources at the Web proxy server are about to run short, the Web proxy server closes idle persistent TCP connections and frees their resources, so that the released resources can be used for new TCP connections.
To realize this control, the remaining resources of the Web proxy server explained in Subsections 2.2.1 through 2.2.4 (mbuf, file descriptors, control blocks, and socket buffer) should be monitored. We also have to keep track of the persistent TCP connections at the Web proxy server, so that they can be kept open or closed according to the monitored resource utilization. The implementation issues for the management of persistent TCP connections will be discussed in Subsection 4.3. For further effective resource usage, we also add a mechanism that gradually decreases the amount of resources assigned to a persistent TCP connection after it becomes inactive. The socket buffer is not needed at all while a TCP connection is idle. Therefore, we gradually decrease the send/receive socket buffer of persistent TCP connections, taking account of the fact that the longer a connection stays idle, the more likely it is that the connection will be closed.
4.2 Simulation Experiments
In this subsection, we evaluate the performance of our proposed scheme through simulation experiments using ns-2 [24]. Figure 22 shows the simulation model. In the model, the bandwidths of the links between the client hosts and the Web proxy server and of those between the Web proxy server and the Web servers are all set to 100 Mbps. To see the effect of various network conditions, the packet loss probability of each link is randomly selected from 0.0001, 0.0005, 0.001, 0.005, and 0.01. The propagation delay of each link between the client hosts and the Web proxy server is varied between 10 msec and 100 msec, and that between the Web proxy server and the Web servers between 10 msec and 200 msec. The number of Web servers is fixed at 50, and the number of client hosts is set to 50, 100, 200, or 500. We ran a 1000 sec simulation in each experiment. In the simulation experiments, each client host randomly selects one of the Web servers and generates a document transfer request via the Web proxy server.
[Figure 22: client hosts (50, 100, 200, or 500) connected to the Web proxy server (hit ratio 0.5) over links with 10-100 msec propagation delay and 0.0001-0.01 loss probability; the Web proxy server connected to 50 Web servers over links with 10-200 msec propagation delay and 0.0001-0.01 loss probability]
Figure 22: Network Model for Simulation Experiments
The distribution of the requested document size follows [16]; that is, it is given by the combination of a log-normal distribution for small documents and a Pareto distribution for large ones. The access model of the client hosts also follows [16]: a client host first requests a main document, then requests the in-line images included in the document after a short interval (following [16], we call it the active off time), and then requests the next document after a somewhat longer interval (the inactive off time). Note that since we focus on the resource and connection management of Web proxy servers, we do not consider detailed caching behavior at the Web proxy server, such as the cache replacement algorithm. Instead, we set the cache hit ratio at the Web proxy server, h, to 0.5. Using h, the Web proxy server decides either to transfer the requested document to the client host directly, or to deliver it to the client host after downloading it from the Web server. The Web proxy server has 3200 KBytes of socket buffer, which it assigns as send/receive socket buffers to the TCP connections. This means that the original scheme can establish at most 200 TCP connections concurrently, since it fixedly assigns 16 KBytes of send/receive socket buffer to each TCP connection.
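In this simulation model, the per-request behavior of the proxy therefore reduces to a simple random choice; the sketch below mirrors that model (the function is illustrative, not ns-2 code).

import random

def handle_request(hit_ratio=0.5):
    """Simulation-model behavior of the Web proxy server for one request:
    with probability hit_ratio the document is served from the cache over the
    downward connection only; otherwise it is first fetched from the Web
    server over an upward connection and then relayed to the client."""
    if random.random() < hit_ratio:
        return "cache hit: send cached document to the client host"
    return "cache miss: download from the Web server, then relay to the client"

print(handle_request())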
In what follows, we compare the performance of the following four schemes:
• scheme (1), which does not use any of the enhanced algorithms presented in this section;
• scheme (2), which uses E2-ATBT;
• scheme (3), which uses E2-ATBT and the connection management scheme described in Subsection 4.1.2, but does not use the algorithm that gradually decreases the socket buffer, i.e., the socket buffer size remains unchanged after documents are transferred;
• scheme (4), which uses all of the proposed algorithms, that is, E2-ATBT and the connection management scheme including the algorithm that gradually decreases the size of the socket buffer assigned to persistent TCP connections.
Note that for schemes (3) and (4), we do not explicitly model the amount of resources at the Web proxy server. Instead, we introduce Nmax (= 200), the maximum number of connections that can be established simultaneously at the Web proxy server, to simulate the limitation of the Web proxy server resources. In schemes (1) and (2), newly arriving requests are rejected when the number of TCP connections at the Web proxy server is Nmax. On the other hand, schemes (3) and (4) forcibly terminate some of the persistent TCP connections that are inactive for document transfer and establish new TCP connections in their place. For scheme (4), we exclude persistent TCP connections from the calculation of the E2-ATBT algorithm and halve their assigned socket buffer size every second; the minimum size of the assigned socket buffer is 1 KByte. We use HTTP/1.1 for document transfer, except in Subsection 4.2.1, where we compare the effect of HTTP/1.0 and HTTP/1.1.
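Scheme (4)'s gradual release of socket buffer from idle persistent connections can be sketched as follows, using the parameters given above (halving once per second with a 1 KByte floor); the data structures and the timer loop are illustrative, not the simulator code.

MIN_BUF = 1024  # 1 KByte floor for scheme (4)

def decay_idle_buffers(persistent_conns):
    """Called once per second: halve the socket buffer assigned to each idle
    persistent TCP connection, down to the minimum, so the released memory
    can be re-assigned to active connections by E2-ATBT."""
    released = 0
    for conn in persistent_conns:
        if conn["idle"]:
            new_size = max(conn["bufsize"] // 2, MIN_BUF)
            released += conn["bufsize"] - new_size
            conn["bufsize"] = new_size
    return released

conns = [{"idle": True, "bufsize": 16 * 1024}, {"idle": False, "bufsize": 16 * 1024}]
for _ in range(3):          # three seconds of idleness
    decay_idle_buffers(conns)
print(conns[0]["bufsize"])  # 2048: halved three times; the active connection is untouched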
4.2.1 Comparison of HTTP/1.0 and HTTP/1.1

Before evaluating the proposed scheme, we first compare the effects of HTTP/1.0 and HTTP/1.1 on the Web proxy server performance, to confirm the necessity of the connection management scheme described in Subsection 4.1.2. In Figure 23, we plot the performance of the Web proxy server as a function of the number of client hosts for schemes (1) and (2). Here we define the performance of the Web proxy server as the total size of the documents transferred in both directions
by the Web proxy server during the 1000 sec simulation time. It can be clearly observed from this figure that scheme (2) improves on scheme (1) for both HTTP/1.0 and HTTP/1.1. This is because each TCP connection is assigned a proper socket buffer size by E2-ATBT. Also, the performance of the Web proxy server with HTTP/1.1 is higher than that with HTTP/1.0 when the number of client hosts is small (50 or 100). The reason is that document transfer with HTTP/1.1 uses persistent TCP connections, which avoid the three-way handshake for establishing a new TCP connection for successive document transfer requests. On the other hand, the Web proxy server performance with HTTP/1.1 becomes even worse than that with HTTP/1.0 when the number of client hosts is 200 or 500, that is, when the Web proxy server load is high. This is because many persistent TCP connections at the Web proxy server are not in actual use and waste the resources of the Web proxy server; as a result, TCP connections for new document transfer requests cannot be established. Since HTTP/1.0 closes the TCP connection immediately after transferring the requested document, however, new TCP connections can soon be established even when the load of the Web proxy server is high. Thus, although persistent TCP connections can improve the Web proxy server performance when the load of the Web proxy server is low, they significantly decrease the performance as the load becomes high. This result explicitly indicates the need for careful management of persistent TCP connections at a Web proxy server, since the utilization of persistent TCP connections is very low, as analytically shown in Subsection 2.3.
4.2.2 Evaluation of Web Proxy Server Performance

We next investigate the performance observed at the Web proxy server. In Figure 24, we show the performance of the Web proxy server as a function of the number of client hosts, where the length of the persistent timer for persistent TCP connections at the Web proxy server is set to 5 sec, 15 sec, and 30 sec in Figures 24(a), 24(b), and 24(c), respectively.
[Figure 23: Total Transfer Size [MBytes] vs. number of client hosts (50, 100, 200, 500) for HTTP/1.0 scheme (1), HTTP/1.0 scheme (2), HTTP/1.1 scheme (1), and HTTP/1.1 scheme (2)]
Figure 23: Result of Simulation Experiment (1): Comparison of HTTP/1.0 and HTTP/1.1
It is clear from Figure 24 that the performance of the original scheme (scheme (1)) decreases once the number of client hosts becomes larger than 200. This is because when the number of client hosts exceeds Nmax, the Web proxy server begins to reject some document transfer requests, even though most of the Nmax TCP connections are idle, as analytically shown in Subsection 2.3; those idle connections do nothing but waste the resources of the Web proxy server. The results for scheme (2) in Figure 24 show that E2-ATBT can improve the Web proxy server performance regardless of the number of client hosts. However, they also show performance degradation when the number of client hosts is large. This means that E2-ATBT alone is not enough to solve the problem of 'idle' persistent TCP connections, and that a connection management scheme is necessary to overcome it. We can also see that scheme (3) significantly improves the performance of the Web proxy server, especially when the number of client hosts is large. When the Web proxy server cannot accept all the incoming connections from the client hosts (i.e., when the number of client hosts is larger than 200 in Figure 24), scheme (3) closes idle TCP connections so that newly arriving TCP connections can be established. As a result, the number of TCP connections that actually transfer documents increases considerably.
[Figure 24 panels: (a) persistent timer: 5 sec, (b) persistent timer: 15 sec, (c) persistent timer: 30 sec; Total Transfer Size [MBytes] vs. number of client hosts (50, 100, 200, 500) for schemes (1)-(4)]
Figure 24: Results of Simulation Experiment (2): Web Proxy Server Performance
Scheme (4) can also improve the performance of the Web proxy server, especially when the number of client hosts is 100 or 200. For a larger number of client hosts (500), however, the degree of performance improvement is slightly smaller. This can be explained as follows. When the number of client hosts is small, most of the persistent TCP connections at the Web proxy server remain established. Therefore, the socket buffer assigned to the persistent TCP connections can be effectively re-assigned to other active TCP connections by scheme (4). When the number of client hosts is large, on the other hand, the persistent TCP connections are likely to be closed before scheme (4) begins to decrease their assigned socket buffer. Consequently, scheme (4) can do almost nothing for the persistent TCP connections.
Next, the effect of the length of the persistent timer is evaluated. We first focus on schemes (1) and (2) in Figure 24. For 50 client hosts, the performance becomes higher as the persistent timer becomes longer. The reason is that with a longer persistent timer, successive document transfers can be performed over persistent connections and the connection setup time is removed. For a larger number of client hosts, however, the performance degrades when the persistent timer is long, which is caused by the waste of Web proxy server resources due to idle TCP connections. It can also be observed from Figures 24(a) through 24(c) that schemes (3) and (4) provide much higher performance than schemes (1) and (2), regardless of the length of the persistent timer, especially when the number of client hosts is large. This is because schemes (3) and (4) manage persistent TCP connections as intended: they exploit the merits of persistent TCP connections when the number of client hosts is small, and they prevent the Web proxy server resources from being wasted and assign them effectively to active TCP connections when the number of client hosts is large.
4.2.3 Evaluation of Response Time

We next show the evaluation results for the response time of document transfers. We define the response time as the time from when a client host issues a document transfer request to when it
finishes receiving the requested document. Figure 25 shows the simulation results; we plot the response time as a function of the document size for the four schemes. From this figure, we can clearly observe that the response time is much improved when our proposed scheme is applied, especially for a large number of client hosts (Figures 25(b)-(d)). However, when the number of client hosts is 50, the proposed scheme does not help improve the response time. This is because the server resources are sufficient to accommodate 50 client hosts and all TCP connections are immediately established at the Web proxy server; therefore, the response time cannot be improved much. Note that since E2-ATBT improves the throughput of TCP data transfer to some degree, the Web proxy server performance can still be improved, as was shown in the previous subsection. Although schemes (3) and (4) improve the response time considerably, there is little difference between the two schemes. This can be explained as follows. Scheme (4) decreases the socket buffer assigned to persistent TCP connections and re-assigns it to other active TCP connections. Although the throughput of the active TCP connections improves, its effect on the response time is very small compared with the effect of introducing scheme (3). Nevertheless, scheme (4) is still worth using at the Web proxy server, since it has a good effect on the Web proxy server performance, as shown in Figure 24.
4.3 Implementation and Experiments
In this subsection, we first give an overview of the implementation of our proposed scheme on an actual machine running FreeBSD-4.0. We then discuss some experimental results to confirm the effectiveness of the proposed scheme.
4.3.1 Implementation Issues

Our proposed scheme consists of two algorithms: the enhanced E-ATBT (E2-ATBT) proposed in Subsection 4.1.1 and the connection management scheme described in Subsection 4.1.2.
[Figure 25 panels: (a) 50 client hosts, (b) 100 client hosts, (c) 200 client hosts, (d) 500 client hosts; Response Time [sec] vs. Document Size [Bytes] for schemes (1)-(4)]
Figure 25: Results of Simulation Experiment (3): Response Time
To realize E2-ATBT, we need to add mechanisms to the original E-ATBT that handle the connection dependency and control the receive socket buffer at the Web proxy server. Two methods can be considered for this purpose, as described in Subsection 4.1.1: monitoring the utilization of the send/receive socket buffers, and adding some information to the data/ACK packets. As a first step, we are now implementing the latter method, using additional bits to identify the connection dependency and the send socket buffer size at the Web server. Since this method requires cooperation between the server and client hosts, it is not a realistic solution. However, it lets us obtain an upper limit on the performance that can be expected from our proposed method, against which the former monitoring method can later be compared.
To implement the connection management scheme, we have to monitor the utilization of resources at the Web proxy server and maintain an adequate number of persistent TCP connections. Monitoring the resources at the Web proxy server is done as follows. The resources for establishing TCP connections in our mechanism are mbuf, file descriptors, control blocks, and socket buffer, as described in Subsections 2.2.1 through 2.2.4. These resources cannot be changed dynamically once the kernel has booted, but their total and remaining amounts can be observed in the kernel. Therefore, we introduce threshold values for the utilization of these resources, and if the utilization of any of them reaches its threshold, the Web proxy server starts closing persistent TCP connections and releases the resources assigned to those connections.
Figure 26 sketches our mechanism for managing persistent TCP connections at the Web proxy server. When a TCP connection finishes transmitting a requested document and becomes idle, the Web proxy server records the socket file descriptor and the process number as a new entry in the time scheduling list, which is used by the kernel to handle the persistent TCP connections. A new entry is added to the end of the time scheduling list. When the Web proxy server decides to close a persistent TCP connection, it selects the connection at the top of the time scheduling list. In this way, the Web proxy server closes the oldest persistent connection first.
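The threshold-based trigger described above might look like the following sketch; the resource names mirror Subsections 2.2.1 through 2.2.4, but the sample values, the 0.9 threshold, and the function names are placeholders rather than actual FreeBSD kernel interfaces.

THRESHOLD = 0.9  # fraction of a resource in use that triggers connection closing

def resources_short(usage):
    """usage maps each resource (mbuf, file descriptors, control blocks,
    socket buffer) to an (in_use, total) pair; return True if any resource
    has reached its utilization threshold."""
    return any(in_use / total >= THRESHOLD for in_use, total in usage.values())

usage = {
    "mbuf":             (8_500, 10_000),
    "file_descriptors": (950, 1_000),        # over threshold: start closing
    "control_blocks":   (700, 1_000),
    "socket_buffer":    (2_800_000, 3_276_800),
}
if resources_short(usage):
    print("close persistent connections from the head of the time scheduling list")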
[Figure 26: the time scheduling list, a linked list of (sfd, proc) entries; new idle connections are inserted at the tail and entries are deleted from the head]
Figure 26: Connection Management Scheme
When a persistent TCP connection in the list becomes active again before being closed, or when it is closed by persistent timer expiration, the Web proxy server removes the corresponding entry from the list. All operations on the time scheduling list can be performed with simple pointer manipulations.
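The time scheduling list is thus a simple FIFO of (socket file descriptor, process) pairs. The Python sketch below illustrates the three operations described above; the kernel implementation manipulates a linked list directly, so this is only a behavioral model.

from collections import OrderedDict

class TimeSchedulingList:
    """FIFO of idle persistent connections keyed by (sfd, proc): new idle
    connections are appended at the tail, and when resources run short the
    proxy closes connections taken from the head (oldest idle first)."""
    def __init__(self):
        self.entries = OrderedDict()

    def insert(self, sfd, proc):
        self.entries[(sfd, proc)] = True         # connection became idle

    def remove(self, sfd, proc):
        self.entries.pop((sfd, proc), None)      # became active or timer expired

    def pop_oldest(self):
        return self.entries.popitem(last=False)[0] if self.entries else None

tsl = TimeSchedulingList()
tsl.insert(5, 1001)
tsl.insert(7, 1002)
tsl.remove(5, 1001)          # connection 5 became active again
print(tsl.pop_oldest())      # (7, 1002) would be closed first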
4.3.2 Implementation Experiments

We now demonstrate the effectiveness of our implemented scheme through experiments. Figure 27 shows our experimental system. In this system, we implement our proposed scheme at the Web proxy server, running the Squid Web proxy server [11] on FreeBSD-4.0. The amount of Web proxy server resources is set such that the Web proxy server can accommodate up to 400 TCP connections simultaneously, and the threshold at which the Web proxy server begins to close persistent TCP connections is set to 300 connections. The Web proxy server monitors the utilization of its resources every second, using parameters maintained by the kernel.
[Figure 27: client host (50, 100, 200, or 600 users), Web proxy server (upper limit 400 connections, threshold 300 connections, persistent timer 15 sec, cache hit ratio 0.5), and Web server]
Figure 27: Network Model for Implementation Experiment
We intentionally set the size of the Web proxy server cache to 1024 KBytes so that the cache hit ratio is about 0.5. The length of the persistent timer is set to 15 sec. The client host uses httperf [31] to generate document transfer requests. As in the simulation experiments, the access model of each user at the client host (the distribution of the requested document size and of the think time between successive requests) follows that reported in [16]. When a request is rejected by the Web proxy server due to a lack of resources, the client host resends the request immediately. We compare the proposed scheme with the original scheme, which uses none of the mechanisms proposed in this section. Each experimental result presented below is the average over ten runs of the experiment.
Evaluation of Web Proxy Server Performance

We first evaluate the Web proxy server performance. In Figure 28, we compare the throughput of the proposed scheme and the original scheme as a function of the number of users at the client host. Here we define the throughput of the Web proxy server as the total amount of document data sent and received by the Web proxy server divided by the experiment time (10 minutes). It is clear that when the number of users is large, the throughput of the proposed scheme is larger than that of the original scheme, whereas both provide almost the same throughput when the number of users is small.
[Figure 28: Proxy Server Performance [Kbps] of the original scheme and the proposed scheme vs. number of users (50, 100, 200, 600)]
Figure 28: Result of Implementation Experiment (1): Web Proxy Server Throughput
When the number of users is small, the Web proxy server can accommodate all of the users' TCP connections without running short of resources. When the number of users becomes large, the original scheme cannot accept all of the document transfer requests, since many TCP connections established at the Web proxy server are idle and occupy the server resources, as described by the analysis in Subsection 2.3. The proposed scheme, on the other hand, can accept a larger number of document transfer requests than the original scheme, since the Web proxy server forces idle TCP connections to close when its resources run short and assigns the released resources to newly arriving TCP connections. Table 1 summarizes the number of TCP connections established at the Web proxy server for document transfer during the experimental runs. When the number of users becomes large, the proposed scheme increases the number of established TCP connections, which results in the increase of server throughput explained above.
Evaluation of End User Performance

We next focus on the user-perceived performance. Figure 29 shows the average performance gain as a function of the document size when the number of users is 100 (Figure 29(a)) and 600 (Figure 29(b)).
Table 1: Result of Implementation Experiment (2): Number of Established TCP Connections at the Web Proxy Server

Number of users    Original scheme    Proposed scheme
50                 66                 66
100                199                199
200                581                1234
600                2455               13235
Here, we define the average performance gain as the ratio of the response times for a document of a given size obtained by the original scheme and by the proposed scheme, where the response time is defined as in Subsection 4.2.3. From Figure 29, we can see that, especially when the number of users is large, our proposed scheme can reduce the response time to as little as 1/100 of that of the original scheme. In the original scheme, many document transfer requests from users are rejected by the Web proxy server; the user then resends the request later, which results in an increased response time. Since the proposed scheme tries to accept newly arriving requests by closing idle TCP connections at the Web proxy server, on the other hand, the response time can be kept small. To show this more clearly, we present in Figure 30 the relation between the requested document size and the response time in the original scheme and the proposed scheme, with the number of users set to 600. It is obvious that the response time in the original scheme is very large, which is mainly the result of the rejected requests, whereas the response time in the proposed scheme remains small.
[Figure 29 panels: (a) 100 users, (b) 600 users; Average Performance Gain [%] vs. Document Size [Bytes]]
Figure 29: Results of Implementation Experiment (3): Average Performance Gain
[Figure 30 panels: (a) original scheme, (b) proposed scheme; Response Time [sec] vs. Document Size [Bytes]]
Figure 30: Results of Implementation Experiment (4): Response Time
4.4 Summary
In this section, we have proposed a new resource management scheme for Web proxy servers. E2-ATBT, an enhanced E-ATBT, assigns the socket buffer to heterogeneous TCP connections according to their expected throughput. It also takes account of the dependency between upward and downward TCP connections and tunes the receive socket buffer, both of which are necessary to manage Web proxy servers effectively. The connection management scheme monitors the Web proxy server resources and intentionally closes idle connections when the resources run short, so that the Web proxy server can accept newly arriving TCP connections even when its load is high. We have evaluated our proposed scheme through various simulation and implementation experiments and confirmed that it can improve the Web proxy server performance by up to 70% and reduce the document transfer time for Web client hosts.
5 Concluding Remarks
In this thesis, we have presented a novel resource management scheme for Internet servers. We first proposed Scalable Socket Buffer Tuning (SSBT) for socket buffer management at Web servers. SSBT consists of two algorithms, the E-ATBT and SMR schemes. E-ATBT assigns send socket buffers to connections according to their estimated throughput, which is intended to provide a fair and effective assignment of the send socket buffer. SMR avoids redundant memory-copy operations when data is transferred by TCP and improves the overall performance of the server host. We next proposed a resource management scheme for Web proxy servers, which has two algorithms. One is an enhanced E-ATBT, which extends E-ATBT according to the characteristics of Web proxy servers: it takes account of the dependency between upward and downward TCP connections and tunes the receive socket buffer, both of which are necessary to manage Web proxy servers effectively. The other is a scheme for managing persistent TCP connections at Web proxy servers, which monitors the Web proxy server resources and intentionally closes idle connections when the resources run short. We have evaluated the proposed schemes through various simulation and implementation experiments. We have confirmed that SSBT can improve the Web server throughput by up to 30% and achieve fair and effective usage of the sender socket buffer. Furthermore, we have shown that our proposed connection management scheme can improve the Web proxy server performance by up to 70% and reduce the document transfer time for Web client hosts.
Acknowledgements
I would like to express my sincere appreciation and deep gratitude to my advisor, Professor Masayuki Murata of Osaka University, who introduced me to the area of computer networks, including the subjects of this thesis, and gave me advice throughout my studies and the preparation of this manuscript. The work in this thesis would not have been possible without the support of Research Assistant Go Hasegawa of Osaka University, who has been a constant source of encouragement and advice throughout my studies and the preparation of this thesis. I would like to extend my appreciation to Mr. Takuya Okamoto of Osaka University for his valuable support and many suggestions. Thanks are also due to Professor Hideo Miyahara and Professor Shinji Shimojo of Osaka University, who gave me much advice. I am indebted to Associate Professor Masashi Sugano of Osaka Prefecture College of Health Sciences, Associate Professor Ken-ichi Baba of Osaka University, Assistant Professor Naoki Wakamiya of Osaka University, Research Assistant Hiroyuki Ohsaki of Osaka University, and Research Assistant Shin'ichi Arakawa of Osaka University, who gave me helpful comments and feedback. I am heartily thankful to many friends and colleagues in the Department of Informatics and Mathematical Science, Graduate School of Engineering Science, Osaka University for their generous, enlightening, and valuable suggestions. Finally, I am deeply grateful to my parents, who have always loved and supported me.
References
[1] P. Druschel and L. L. Peterson, "Fbufs: A High-Bandwidth Cross-Domain Transfer Facility," in Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, pp. 189-202, Dec. 1993.
[2] J. Chase, A. Gallatin, and K. Yocum, "End-System Optimizations for High-Speed TCP," IEEE Communications Magazine, vol. 39, pp. 68-74, Apr. 2001.
[3] A. Gallatin, J. Chase, and K. Yocum, "Trapeze/IP: TCP/IP at Near-Gigabit Speeds," in Proceedings of USENIX '99 Technical Conference, pp. 109-120, June 1999.
[4] J. Semke, J. Mahdavi, and M. Mathis, "Automatic TCP Buffer Tuning," in Proceedings of ACM SIGCOMM '98, pp. 315-323, Aug. 1998.
[5] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP Throughput: a Simple Model and its Empirical Validation," in Proceedings of ACM SIGCOMM '98, pp. 303-314, Aug. 1998.
[6] Proxy Survey, available at http://www.delegate.org/survey/proxy.cgi.
[7] P. Cao and S. Irani, "Cost-Aware WWW Proxy Caching Algorithms," in Proceedings of USENIX Symposium on Internet Technologies and Systems, Dec. 1997.
[8] L. Rizzo and L. Vicisano, "Replacement Policies for a Proxy Cache," IEEE/ACM Transactions on Networking, vol. 8, pp. 158-170, Apr. 2000.
[9] A. Feldmann, R. Caceres, F. Douglis, G. Glass, and M. Rabinovich, "Performance of Web Proxy Caching in Heterogeneous Bandwidth Environments," in Proceedings of IEEE INFOCOM '99, pp. 107-116, Mar. 1999.
[10] A. Luotonen and K. Altis, "World-Wide Web Proxies," Computer Networks and ISDN Systems, vol. 27, no. 2, pp. 147-154, 1994.
[11] Squid Home Page, available at http://www.squid-cache.org/.
[12] Apache proxy mod_proxy, available at http://httpd.apache.org/docs/mod/mod_proxy.html.
[13] FreeBSD Home Page, available at http://www.freebsd.org/.
[14] W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols. Reading, Massachusetts: Addison-Wesley, 1994.
[15] J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, and L. Stewart, "Hypertext Transfer Protocol - HTTP/1.1," RFC 2616, June 1999.
[16] P. Barford and M. Crovella, "Generating Representative Web Workloads for Network and Server Performance Evaluation," in Proceedings of Performance '98/ACM SIGMETRICS '98, pp. 151-160, June 1998.
[17] M. Nabe, M. Murata, and H. Miyahara, "Analysis and Modeling of World Wide Web Traffic for Capacity Dimensioning of Internet Access Lines," Performance Evaluation, vol. 34, pp. 249-271, Dec. 1999.
[18] K. Tokuda, G. Hasegawa, and M. Murata, "TCP Throughput Analysis with Variable Packet Loss Probability for Improving Fairness among Long/Short-lived TCP Connections," to be presented at IEEE Workshop on Communications Quality & Reliability, May 2002.
[19] N. Cardwell, S. Savage, and T. Anderson, "Modeling TCP Latency," in Proceedings of IEEE INFOCOM 2000, pp. 1742-1751, Mar. 2000.
[20] G. Hasegawa, T. Matsuo, M. Murata, and H. Miyahara, "Comparisons of Packet Scheduling Algorithms for Fair Service among Connections on the Internet," in Proceedings of IEEE INFOCOM 2000, pp. 1253-1262, Mar. 2000.
[21] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, vol. 1, pp. 397-413, Aug. 1993.
[22] X. Xiao, L. Ni, and W. Tang, "Benchmarking and Analysis of the User-Perceived Performance of TCP/UDP over Myrinet," Tech. Rep., Michigan State Univ., Dec. 1997.
[23] D. D. Clark, V. Jacobson, J. Romkey, and H. Salwen, "An Analysis of TCP Processing Overhead," IEEE Communications Magazine, vol. 27, pp. 23-29, June 1989.
[24] The VINT Project, "UCB/LBNL/VINT network simulator ns (version 2)," available at http://www.isi.edu/nsnam/ns/.
[25] S. Floyd and V. Jacobson, "On Traffic Phase Effects in Packet-Switched Gateways," Internetworking: Research and Experience, vol. 3, pp. 397-413, Aug. 1992.
[26] A. Veres and M. Boda, "The Chaotic Nature of TCP Congestion Control," in Proceedings of IEEE INFOCOM 2000, pp. 1715-1723, Mar. 2000.
[27] K. Murakami and M. Maruyama, "MAPOS - Multiple Access Protocol over SONET/SDH Version 1," RFC 2171, June 1997.
[28] ALTQ (Alternate Queueing for BSD UNIX), available at http://www.csl.sony.co.jp/~kjc/software.html.
[29] Apache Home Page, available at http://www.apache.org/.
[30] NIST Net Home Page, available at http://www.antd.nist.gov/itg/nistnet/.
[31] D. Mosberger and T. Jin, "httperf - A Tool for Measuring Web Server Performance," Performance Evaluation Review, vol. 26, pp. 31-37, Dec. 1998.
[32] M. Allman, "A Web Server's View of the Transport Layer," ACM Computer Communication Review, vol. 30, no. 5, pp. 10-20, Oct. 2000.