Proceedings of the IASTED International Conference APPLIED INFORMATICS February 14-17, 2000, Innsbruck, Austria
Xwpt: An X-based Web Servers Performance Tool
Ibrahim Haddad, Wael Hassan, Lixin Tao
ifhadda, whassan, [email protected]
Computer Science Department, Concordia University
1455 de Maisonneuve Blvd West, Montreal, Quebec, Canada H3G 1M8
Fax: (514) 848-2830
Abstract
Our study aims at defining measures for evaluating the Apache web server, unveiling problems and bottlenecks, and proposing solutions and modifications to current settings. It is based on the analysis of the client/server communication model supported by Xwpt. Extensive simulations showed that Xwpt was able to repeatedly knock down some of its target machines, depending on their hardware capabilities and web server configuration. Based on this study, we summarise the factors that govern a web server's stability and service enhancement. This helps provide a better service and, by improving the resource allocation scheme, build a superior generation of web servers.
Keywords Web server, stability, scalability, performance, robustness.
1 Introduction
The survival and reputation of industries delivering online services are at stake whenever the volume or the quality of their services diminishes. Having a stable and robust web server is critical because of the nature of its functions: web servers negotiate the style and language of a response with the enquirer, authenticate users, serve documents, and, most importantly, have to be secure [1]. All online services depend on the stability, scalability and robustness of the web server in command. For this reason, companies invest a lot of money in their web servers to keep them up and running. In the event of web server failure, companies are liable to huge profit losses and customer dissatisfaction. Companies, thus, are worth as much as the responsiveness and uptime of their web frontier.
2 The Apache Web Server
Apache provides a robust and commercial-grade reference implementation of the HTTP (Hypertext Transfer Protocol) protocol. It is the most widely used web server platform in the world [1]. It is reliable, robust, configurable, scalable, well documented, and free. In addition, as of Apache 1.3, a lot of effort has been put into bringing performance up to a point where the difference with other high-end web servers is minimal [2]. All these factors make Apache the qualified web server for our performance testing tool.
3 Online Attacks and Xwpt
Several kinds of attacks can be employed against web servers, such as the SYN attack, the denial of service attack, the flood or ping attack, and the buffer overflow attack [3]. The Xwpt attack model is different from the above-mentioned models. It issues TCP/IP-compliant, parallel, and legitimate requests that do not include any buffer overflow. The purpose of Xwpt attacks is to test the web server for:

Robustness: to measure how well it can withstand incoming attacks.

Latency: from Xwpt's point of view, latency is the time consumed from the initiation of a request until the receipt of the last byte at the client side.

Performance: to measure the total effectiveness of a web server, including throughput, response time, and availability.
4 Xwpt System Description
Xwpt simulates a real-world situation of attacks against the machine hosting Apache, or any other web server. It is composed of two main subsystems: the interface and the testing engine. The interface handles user interactions and maintains data structures, log and configuration files. The testing engine hammers the web server with a stream of connections whose configuration is set from the control panel. The requested connections follow one of three models:

1. The parallel forked model forks multiple children, each of which loops sequentially to emit requests.

2. The threaded model creates clients using threads instead of forked processes.

3. The block sequential model initialises the maximum allowed number of resources that can be allocated on the client machine and initiates requests in a pseudo-parallel manner.
The system architecture seeks optimisation at different levels; it is portable and easy to modify. The design reflects modularity among its components. The client/server model of Xwpt is based on open standards, so it integrates easily with other software or additions.
4.1 Interface Design
The interface subsystem acts as the middle agent between the user and the testing engine. It accepts input from the user, maintains the related data structures and I/O files, and redirects user commands to the testing engine. It is composed of three modules:
The session module is responsible for creating, updating and loading session and configuration log files.
The machine module maintains the target machine structure.
The control panel module is composed of three sub-modules:

1. The general options sub-module allows users to specify the algorithm type to follow (breadth-first, depth-first, or a random pattern), the maximum number of links to request, and the protocol type (the default HTTP port, or a user-specified one).

2. From the graph sub-module, users specify the graph type (standard, minimum, average, or maximum). This sub-module is responsible for manipulating the output files of the testing engine and drawing the graph.

3. The attacks sub-module controls starting and stopping the attack.
Every session has a log file and a configuration file. The log file records all the attack session information in addition to the attack results in terms of forks and their corresponding latencies. The configuration file saves the session options; users can then load the configuration file and re-simulate the session.
5 Testing Engine Architecture
The testing engine is the entity that sends requests to the web server. It features three attack models: the parallel forked model, the parallel threaded model and the block sequential model. Each of these models tests the web server for a different criterion.
5.1 Parallel Forked Model
The parallel forked model tests the web server for scalability by creating a simulated environment of parallel users and multiple requests. This model accepts two parameters: the number of clients to fork and the number of sequential connections each should initiate. It reports different kinds of statistics, such as the time consumed by all connections; the minimum, average and maximum latency per fork; and the time required by all connections grouped by fork number. To record the full delay of each exchange, the client blocks waiting for the data on the socket. After all data has been read, another time-stamp is taken and the difference is dumped into the log file.
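A minimal sketch of the forked model follows. This is an illustration rather than the actual Xwpt source: the target address, port, and loop counts are placeholder values, and each request is a plain HTTP/1.0 GET so that the server closes the connection after the last byte.

/* Illustrative sketch of the parallel forked model: N_FORKS children
 * are forked; each loops N_SEQ times, timing one full request from
 * connect() to the last byte read. Target address is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define N_FORKS 10   /* simulated parallel clients        */
#define N_SEQ  100   /* sequential connections per client */

static double timed_request(struct sockaddr_in *srv)
{
    struct timeval t0, t1;
    char buf[4096];
    const char *req = "GET / HTTP/1.0\r\n\r\n";
    int s = socket(AF_INET, SOCK_STREAM, 0);

    gettimeofday(&t0, NULL);                      /* request initiation */
    if (connect(s, (struct sockaddr *)srv, sizeof *srv) < 0) {
        close(s);
        return -1.0;
    }
    write(s, req, strlen(req));
    while (read(s, buf, sizeof buf) > 0)          /* block until server */
        ;                                         /* closes connection  */
    gettimeofday(&t1, NULL);                      /* last byte received */
    close(s);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
}

int main(void)
{
    struct sockaddr_in srv;
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_port   = htons(80);
    srv.sin_addr.s_addr = inet_addr("10.0.0.1");  /* placeholder target */

    for (int i = 0; i < N_FORKS; i++)
        if (fork() == 0) {                        /* child: own request loop */
            for (int j = 0; j < N_SEQ; j++)
                printf("fork %d req %d: %f s\n", i, j, timed_request(&srv));
            _exit(0);
        }
    while (wait(NULL) > 0)                        /* parent reaps all children */
        ;
    return 0;
}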
5.2 Parallel Threaded Model
The parallel threaded model is an enhanced version of the parallel forked model. It uses threads instead of forks, which consumes fewer process resources and less memory on the client side. This model takes the same input as the parallel forked model. The problem we faced in this case was that we were running out of threads: on the SPARC 20 machine that we used as a base for attacks, we were limited to 92 threads. We were not able to go beyond that number because the system restricts the number of threads per user.
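A hedged sketch of the threaded variant follows, assuming a thread_body() function that runs the same request loop as in the forked model; the error handling shows where a per-user thread limit, like the 92-thread ceiling we hit, would surface.

/* Sketch of the parallel threaded model: the request loop runs in
 * threads instead of forked children. pthread_create() returns an
 * error code (typically EAGAIN) once the per-user thread limit is
 * reached. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

extern void *thread_body(void *arg);  /* request loop, as in the forked model */

int spawn_clients(int n)
{
    pthread_t tid[n];
    int created = 0;

    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&tid[i], NULL, thread_body, NULL);
        if (rc != 0) {                /* hit the per-user thread limit */
            fprintf(stderr, "thread %d: %s\n", i, strerror(rc));
            break;
        }
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return created;                   /* how many clients actually ran */
}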
5.3 Block Sequential Model
The block sequential model tests the response time of the web server. It is geared towards testing the behaviour of the server under a fast stream of connections. The model works by initialising the maximum allowed number of sockets (in our case 59), after which we sequentially send data on these sockets and record the response time.
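The allocation step might look like the following sketch (our illustration; the limit of 59 is simply the value we observed on our client machine):

/* Sketch of the block sequential model's set-up phase: grab sockets
 * until the limit is reached, then the caller drives them one after
 * the other so requests arrive in a fast stream. */
#include <sys/socket.h>

#define MAX_SOCKETS 59                 /* limit observed on our client */

int allocate_block(int socks[])
{
    int n = 0;
    while (n < MAX_SOCKETS) {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0)                     /* descriptor table exhausted */
            break;
        socks[n++] = s;
    }
    return n;                          /* sockets ready for the send loop */
}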
5.4 Timing requests
When Xwpt requests a page from the web server, it records the time it takes to serve it. The total time required to fulfil a request is the sum of the time it takes to establish a connection with the server, the time for the client to send a request, the processing time on the server side, and the time to send the information back to the testing engine, which acts as the web client:

T_total = T_connect + T_sendRequest + T_process + T_transfer
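As an illustration with made-up numbers: if connection establishment takes 5 ms, sending the request 1 ms, server-side processing 20 ms, and transferring the response 30 ms, then T_total = 5 + 1 + 20 + 30 = 56 ms. Note that Xwpt, sitting on the client side, measures T_total as a single interval, from the initiation of the request to the receipt of the last byte; it does not break out the individual terms.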
6 Experiments
We conducted experiments on three web servers: a search engine server, an e-commerce server, and our own Spectrum Linux server. We took into consideration hardware variations, connection variations, as well as web server configuration. Our decision on which sites to use as targets was based on several factors. The search engine has to prove availability and provide services with minimal latency. The e-commerce web site, aside from proving availability, has to prove robustness as a provider of secure and paid services. We also needed a configurable machine at hand where we could tweak the web server configuration; Spectrum (a Pentium III 450 MHz, 128 MB RAM Linux box running Red Hat 6.0 and Apache 1.3.6) served that purpose. We were very comfortable testing with Spectrum; we had the freedom to configure Apache to improve the server's performance. Given that Spectrum was on the same network as the machine initiating the requests (Orchid, an Ultra2/170 Sun machine), the network delay, traffic volume, and time-of-day factors were isolated. The isolation of delay factors allowed us to detect problems and suggest improvements to the web server configuration. The results give a clearer picture of the problems and the areas to be improved in Apache. Since we had no control over the two external servers, the most extensive tests were conducted on our own machine.
6.1 Search Engine Attack Session
The search engine site is considered to be one of the most used web search engines. Its servers, running FreeBSD UNIX [4], implement load balancing mechanisms to ensure optimal performance and response time. We have no information on what type of web server it runs. A web search engine has to prove availability to the Internet public, and its queries should slow down neither the response time of the server nor the latency, as defined earlier. The attack configuration consisted of 1,000 serial connections combined with 1,000 parallel connections, a total of 1,000,000 requests, at the Internet's usage peak time, 10 a.m.; the requests were served within three minutes!
6.2 E-Commerce Site Session
This site claims, on its home page, to have more than one hundred thousand accesses per day. Its server runs Red Hat Linux and is powered by Apache. The attack configuration consisted of 1,000 serial connections combined with 100 parallel connections (a total of 100,000 requests). The server halted and could not serve any more requests. It is the same case as Spectrum, only with a higher volume of traffic. We suspect that the system administrator of this server kept the default Apache directives and did not tune them to serve unexpectedly high traffic within a very limited time.
6.3 Spectrum: our own test server
We ran the test on our own server. The configuration of the attack on Spectrum consisted of 100 serial connections combined with 100 parallel connections (a total of 10,000 requests). The machine froze! At the time of the attack, the Apache directives used on Spectrum were left at their default values. For the purpose of testing Xwpt with different settings, we then changed some of Apache's variables, located in httpd.conf: Timeout, KeepAlive, MaxKeepAliveRequests, KeepAliveTimeout, StartServers, MaxClients, and MaxRequestsPerChild. Modifying these variables allowed Spectrum to withstand more extensive attacks than with the default values.
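For illustration, a tuned excerpt of httpd.conf might look like the following; the values are examples of the direction of the changes, not the exact settings we used on Spectrum.

# Illustrative httpd.conf excerpt (example values, not our exact settings)
Timeout              120     # give up on dead connections sooner
KeepAlive            On      # keep persistent connections enabled
MaxKeepAliveRequests 500     # serve more requests per connection
KeepAliveTimeout     5       # release idle connections faster
StartServers         20      # pre-fork more children at start-up
MaxClients           256     # cap on simultaneous children
MaxRequestsPerChild  1000    # recycle children to bound memory growth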
6.4 Resulting Graphs and Interpretations
This section presents our experimental results in the form of graphs. Generally speaking, when discussing any system's latency we need to see the best, worst and average cases. In e-commerce and time-critical web services, the worst-case analysis helps the most. The minimal time shows how fast the system responds. Scalability is determined by how many users the system can serve at a time. The graphs plot the fork number against its associated response time in nanoseconds. In order to isolate network delays and congestion at usage peak time, our analysis is based on the trend rather than on the absolute time consumed to serve the requests.
6.4.1 Minimum Delay Graph
Looking at the minimum delay graph in Figure 1, we notice that there is a larger delay for the search engine. This delay is accounted for by the network distance at which the web server sits. The web search site has a low but steady response time; this is the direct effect of load balancing mechanisms. The other two machines showed similar behaviour.

Figure 1: Minimum Delay Graph
6.4.2 Maximum Delay Graph
As for the maximum delay graph, shown in Figure 2, we can observe that the search engine's maxima are relatively close to the other sites' times. This shows a major breakthrough in what the web search engine is achieving: it has a comparable maximal response time, which reveals that the other two servers are struggling and taking much longer to fulfil the requests. Looking at the e-commerce site, we notice that it has the lowest maxima over the largest part of the graph. This proves the need for a maximal delay under which e-commerce transactions should be fulfilled.

Figure 2: Maximum Delay Graph

6.4.3 The Crash Graph

While doing a stress test on Spectrum, our client crashed. We thought that we had run out of resources and the machine experienced a complete halt. We repeated the same attack several times and were able to reproduce the crash every time we launched more than 100 parallel x 100 sequential connections. The graph in Figure 3 was the last feedback we received. On the server side, Apache recorded garbage data, environment variable definitions, and random byte code. Since we didn't see any unusual behaviour, and since the graph was very consistent, we can plead a very good handling of extensive requests from Apache's side. The graph shows how the very first connection in each fork takes a longer time to be fulfilled than its subsequent ones. We can think of either of these alternatives happening:

1. The web daemon is sucking up all system resources.

2. The Linux forking model has a problem of increased consumption of resources.

This might bring up a controversial issue on whether Apache or the Linux kernel is the source of the problem.

Figure 3: Crash Graph
7 Existing Problems
The Xwpt testing engine suffers from a client crash problem. The problem comes from the nature of the client/server deadlock situation [3]: when the server crashes, Xwpt is not able to produce feedback on its own status or the status of the web server, while its sockets remain pending, waiting for data, a SYN, or an ACK.
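One possible guard, offered as a sketch of our own suggestion rather than current Xwpt behaviour: wrap each blocking read in select() with a timeout, so a crashed server produces a diagnosable timeout instead of a hung client.

/* Sketch of a timeout guard around a blocking read: if the server
 * crashes and never sends data, select() returns 0 after `seconds`
 * and the caller can log the failure instead of hanging forever. */
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

ssize_t read_with_timeout(int fd, void *buf, size_t len, int seconds)
{
    fd_set rfds;
    struct timeval tv = { seconds, 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;                 /* timed out (0) or error (-1) */
    return read(fd, buf, len);
}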
8 Experimental Analysis
8.1 Variables Governing the Decision
Looking at the tasks that servers traditionally perform, we propose different alternatives for delivering these services. For an online service to be appropriately presented to the Internet public, the following issues have to be addressed:
Dependability of the platform running the web server.
Volume and throughput of transactions.
Type of web server to use (secure vs. insecure) [3].
The threshold at which the web server crashes, and whether this threshold can be reached.
The acceptable response time.
The amount of money you are willing to spend on the system hardware.
In our opinion, these variables must be carefully weighed and taken into consideration to reach an optimal server system.
8.2 The 5 Golden Rules
From our experiments with web servers, we have derived what we call the 5 Golden Rules. These rules should be applied to reach an optimal web server configuration:

1. Distribute service based on volume and threshold.

2. Increase hardware capabilities based on response time and transaction processing time.

3. Distribute and separate secure and insecure services.

4. Increase bandwidth based on volume and throughput.

5. Replicate the system based on geographic distribution and backbone connectivity.
8.3 Xwpt Contribution
Using Xwpt, we can find the threshold at which a system crashes. We can do this by cloning our system on x machines (or simply re-configuring the same machine with different Apache directives, benchmarking or profiling it, finding the bottleneck, fixing it, and repeating until the optimal configuration is reached); x clients can then be processed in a timely manner. Any increase in the client base implies that an increase in hardware capabilities should follow. In a different direction, Xwpt helps find the processor's capacity for handling multiple simultaneous buffers. Depending on the outcome, the CPU, memory, or disk may be upgraded.
9 Future Directions
As of kernel 2.4, Linux will include support for a rather revolutionary idea: a web server actually integrated into the kernel [5]. The advantages of this include a faster response time from the server, because it can use the cache directly at lower layers in the kernel and does not need to make any network calls to user space. However, this change is not meant as a general solution to web hosting: it can serve only files, not CGI (Common Gateway Interface) scripts, and it has been designed to be as simple as humanly possible. Any request it cannot handle is passed to user space, where Apache or another web daemon picks it up and serves it. This double-decker style of web hosting has already been observed to increase server performance in synthetic load environments.
The system has a lot of room for expansion. Models for testing error handling, SSL (Secure Socket Layer), web databases, and Perl modules can easily be integrated within Xwpt.

In our experiments, we noticed that the response time of Apache is phenomenal; the handling of requests could hardly be designed in a better way. The half-parabolic curves in the original Xwpt-generated graphs show the consistency of Apache's response and service time. Looking at the signal tracing of Apache (i.e. the infinite loop of accepting connections and forking children) shows that the Apache designers assumed powerful machines with plenty of resources. In the world of today's networks, no machine is big enough. We are therefore proposing a possible failure in the fork model of Apache. The system's architecture claims that the server has configured limits on how many children it creates. In practice, on our test machine, Apache left no room for any other processes to run and the machine came to a complete halt. Some responsibility can be laid on the operating system side. Given that we did not run similar tests on other operating systems, a precise criticism is difficult to make; yet we can say that there is room for improvement in the operating system's resource handling.

In conclusion, the three models show the vulnerability of the service and resource model of Apache 1.3.6 hosted on Red Hat Linux 6.0. The flaw lies jointly in the Apache forking model and resource accounting, and in the Linux system resource allocation. Resolving this problem requires a wise design and careful modelling of resource handling in Apache and in the Linux kernel. Distributing and separating secure and insecure services, and replicating the system based on geographical distribution, help minimise the resource problem.
10 Conclusion
Xwpt simulates attacks against web servers. The testing engine, with its three attack models, targets different features of the communication and service model.

11 References
[1] B. Laurie and P. Laurie, Apache: The Definitive Guide, 2nd Edition, O'Reilly, February 1999.
[2] www.apache.org
[3] W. R. Stevens, UNIX Network Programming, 2nd Edition, 1998.
[4] Netcraft.com
[5] J. Prannevish, "The Bullet Points: Linux 2.4", Linux Journal, January 2000, p. 32.