
Fast Web Page Allocation on a Server Using Self-Organizing Properties of Neural Networks

Vir V. Phoha Computer Science Louisiana Tech University Ruston, LA 71272

S. S. Iyengar Computer Science Louisiana State University Baton Rouge, LA 70803

R. Kannan Computer Science Louisiana State University Baton Rouge, LA 70803

[email protected]

[email protected]

[email protected]

ABSTRACT This paper presents a novel competitive neural network learning approach to schedule requests to a cluster of Web servers. Traditionally, the scheduling algorithms for distributed systems are not applicable to control Web server clusters because different client domains have different Web traffic characteristics, Web workload is highly variable, and Web requests show a high degree of self-similarity. Using the self-organizing properties of neural networks, we map Web requests to servers through an iterative update rule. This update rule is a gradient descent rule for a corresponding energy function that exploits the self-similarity of Web page requests and includes terms for load balancing. Heuristics to select parameters of the update rule that balance hits against load distribution among servers are presented. Simulations show an order of magnitude improvement over traditional DNS-based load-balancing approaches. More specifically, the performance of our algorithm ranged between an 85% and a 98% hit rate, compared to a range of 2% to 40% for a Round Robin scheme, when simulating real Web traffic. As the traffic increases, our algorithm performs much better than the Round Robin scheme. A detailed experimental analysis is presented in this paper.

Keywords: World Wide Web, Self-Organization, Competitive Learning, Internet, Load Balancing

1. Introduction

The World Wide Web offers tremendous opportunities for marketers to reach a vast variety of audiences at less cost than any other medium. With huge amounts of capital invested in Web sites and the various databases and services on these sites, it has become necessary to understand their effectiveness and realize the potential opportunities offered by these services. In the past few years the computational approach to artificial intelligence and cognitive engineering has undergone a significant evolution. This transformation, from discrete symbolic reasoning to massively parallel and connectionist neural modeling, is not only of compelling scientific interest but also of considerable practical value. In this paper, we present a novel application of connectionist neural modeling that maps Web page requests to Web server caches so as to maximize the hit ratio while meeting the conflicting requirement of distributing requests evenly among the caches. In particular, we describe and present a new neural network

learning algorithm for fast Web page allocation on a server using the self-organizing properties of neural networks. Internet traffic has stabilized at roughly doubling every year since 1997, and this growth rate is expected to be sustained for some time [CK01]. Web sites with heavy traffic loads must use multiple servers running on different hardware; such a structure facilitates the sharing of information among servers through a shared file system or a shared data space. Examples of such systems include the Andrew File System (AFS) and the Distributed File System (DFS). If such facilities are not available, each server may have its own independent file system.


Figure 1: A typical routing configuration for Web page requests.

Figure 1 describes an example of this type of system, where requests may come from various client sites to the router, which pools the requests and directs them to different servers [IC97]. Here the servers S1, S2, ..., SN each have their own cache and may share a common file system. Each of these servers may also have its own individual storage; the router decides the allocation of a Web page request to an individual server.

The organization of this paper is as follows. In Section 1.1, we describe the terminology used in this paper. In Section 2, we review the existing approaches to routing requests among Web servers. In Section 3, we present background concepts related to the Pareto distribution and self-similarity, and an overview of neural network learning algorithms. Section 4 describes the problem formulation, the new model for this problem, and our algorithmic approach to solving it. Section 5 presents the experimental results and Section 6 concludes the paper.


1.1 Terminology Used in This Paper

We define a client as a program that establishes connections to the Internet, whereas a Web server (or simply a server) stores information and serves client requests. For our purposes, a distributed Web-server system is any architecture of multiple Web server hosts that has some means of spreading client requests among the servers. The traffic load on a Web site is measured in the number of HTTP requests handled by the site. A session is an entire period of access from a single user to a given Web site; a session may issue many HTML page requests. Typically, a Web page consists of a collection of objects, and an object request requires an access to a server. Any access to a server for an object is defined as a hit. We use site and Web site interchangeably to mean a collection of Web objects meant to be served. An object is an entity, such as a Web page or an image, that is served by the server. An object-id is an item that uniquely identifies an object in a Web page. An object-id may be a URL, a mapping of a URL to a unique numerical value, and so on; however, only one chosen method is consistently followed. There are N servers that service the requests for Web objects; the servers are identified as S1, ..., SN. The Web-server cluster is scalable and uses one URL to provide a single interface to users. For example, a single domain name may be associated with many IP addresses, and each address may belong to a different Web server. The collection of Web servers is transparent to the users. Each request for a Web page is identified as a duplet (Oi, Ri), where 1 ≤ i ≤ M, Oi is the object identifier of the requested object, Ri is the number of requests served so far for that object, and M is the total number of objects present at the site.
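To make the terminology concrete, the following minimal Python sketch shows one way the duplet (Oi, Ri) and the site's object bookkeeping might be represented. The class and field names (RequestRecord, Site, requests_served) are illustrative assumptions, not structures from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class RequestRecord:
    """Duplet (O_i, R_i): an object identifier and the number of
    requests served so far for that object."""
    object_id: str          # e.g. a URL, or a numeric mapping of a URL
    requests_served: int = 0

@dataclass
class Site:
    """A Web site served by N cooperating servers S_1..S_N behind one URL."""
    num_servers: int
    objects: dict = field(default_factory=dict)   # object_id -> RequestRecord

    def record_request(self, object_id: str) -> RequestRecord:
        """Register one more request for the given object and return its duplet."""
        rec = self.objects.setdefault(object_id, RequestRecord(object_id))
        rec.requests_served += 1
        return rec
```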

2. A Review of the Existing Approaches

There are four basic approaches [CCY99b] to routing requests among distributed Web-server nodes: (1) client-based, (2) DNS-based, (3) dispatcher-based, and (4) server-based. In the client-based approach, requests can be routed to any Web server architecture even if the nodes are loosely connected or uncoordinated. The routing decisions can be embedded in Web clients such as browsers or in client-side proxy servers. For example, Netscape spreads the load among its servers by selecting a random number i between 1 and the number of servers and directing requests to the server wwwi.netscape.com. This approach is not widely applicable, as it is not easily scalable and most Web sites do not control the browsers that could distribute the load among their servers. One proposal that combines caching and server replication is given in [BBM97]. However, client-side proxy servers require modifications to Internet components that are beyond the control of many institutions that manage Web server systems. The Domain Name System (DNS) can implement a large set of scheduling policies by translating a symbolic name to an IP address. The DNS approach scales easily from LAN to WAN distributed systems, but it is limited to 32 Web servers for each public URL because of UDP packet size constraints. In the dispatcher approach, a single entity controls all routing decisions. This approach provides finely tuned load balancing, but failure of the centralized controller can disable the whole system. The server-based approach uses two levels of dispatching: the cluster DNS first assigns requests to Web servers, and each Web server may then reassign a request to any other server in the cluster. The server-based approach can attain as good a control as the dispatcher approach, but its redirection mechanisms typically increase the latency perceived by users.

To our knowledge, only the Internet2 Distributed Storage Infrastructure Project (I2-DSI) proposes a smart DNS that uses network proximity information, such as transmission delays, in making routing decisions [BM98]. However, none of these approaches incorporates any kind of intelligence or learning in routing Web requests. Many artificial neural network architectures have the ability to discover for themselves patterns, features, correlations, or categories in the input data and code them in the output; that is, these neural networks display self-organizing properties. Several researchers [GB99, LT94] have detected self-similarity in Internet traffic flows. Since Web page requests are shown to be self-similar and many of them may be generated by the same user, we can use neural networks to make routing decisions that take the self-similar nature of Web requests into consideration.

In earlier work, the self-organizing properties of neural networks have been used in various other applications; rather than listing all of them, we list those that have directly motivated this work. Phoha [P92] and Phoha and Oldham [PO96] use competitive learning to develop an iterative algorithm for image recovery and segmentation. Using Markov Random Fields, they develop an energy function with terms corresponding to smoothing and to edge detection and preservation; an update rule, which is a gradient descent rule on this energy function, is then developed to restore and segment images. A. Modares et al. [MSE99] use a self-organizing neural network to solve multiple traveling salesman and vehicle routing problems; their algorithm shows significant advances both in the quality of solutions and in computational effort for most of the experimental data. K. Yeung et al. [YY98] present a node placement algorithm for shuffle nets that uses a communication cost function between pairs of nodes and develop a gradient descent algorithm that places node pairs one by one.

2.1 Special Considerations of Web Server Routing

Requests encrypted using the Secure Socket Layer (SSL) use a session key to encrypt information passed between a client and a server. Since session keys are expensive to generate, each session key has a lifetime of about 100 seconds, and requests between a specific client and server within the lifetime of the key reuse the same session key. It is therefore highly desirable that multiple requests from the same client be routed to the same server, as a different server would not know about the session key. In our approach, the competitive learning algorithm achieves this by its very nature. (Note: (1) IBM's Network Dispatcher (ND) achieves this by routing two SSL requests received within 100 seconds from the same client to the same server. (2) A simple Round Robin scheme would be very inefficient here, as it would require the generation of many session keys.) Web server load information becomes obsolete quickly and is poorly correlated with future load conditions [CCY99a]. Since the dynamics of the WWW involve high variability of domain and client workloads, exchanging information about the load conditions of servers is not sufficient for making scheduling decisions. Hence, we need a real-time adaptive mechanism that adapts rapidly to a changing environment.
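As a hedged illustration of the session-affinity requirement described above (not the paper's algorithm and not ND's actual implementation), the sketch below pins a client to a server for the approximate lifetime of an SSL session key, taken here as 100 seconds; the class name, the fallback rotation, and the bookkeeping are assumptions for the sketch.

```python
import time

class AffinityRouter:
    """Route repeat requests from the same client to the same server while
    the SSL session key is presumed valid (~100 s), so the key is reused."""

    def __init__(self, servers, key_lifetime=100.0):
        self.servers = list(servers)
        self.key_lifetime = key_lifetime
        self.bindings = {}       # client_ip -> (server, time of last request)
        self.next_server = 0     # simple rotation for clients with no binding

    def route(self, client_ip, now=None):
        now = time.time() if now is None else now
        bound = self.bindings.get(client_ip)
        if bound and now - bound[1] < self.key_lifetime:
            server = bound[0]    # reuse the server that already holds the key
        else:
            server = self.servers[self.next_server % len(self.servers)]
            self.next_server += 1
        self.bindings[client_ip] = (server, now)
        return server
```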

3. Background

This section gives background material on the Pareto distribution, self-similarity, competitive learning, and Kohonen's algorithm, which are used in subsequent sections of the paper.

3.1 Pareto Distribution and Self-Similarity

Pareto distribution: Web traffic can be modeled using the Pareto distribution, an example of a heavy-tailed distribution. A heavy-tailed distribution is asymptotically hyperbolic; that is, irrespective of the distribution of a random variable x over a short period of time, over the long run the distribution function of x is hyperbolic. The probability density function of the Pareto distribution is p(x) = γ k^γ x^(−γ−1), for γ, k > 0 and x ≥ k, and its cumulative distribution function is F(x) = P[X ≤ x] = 1 − (k/x)^γ. Here k represents the smallest value of the random variable and γ determines the behavior of the distribution; both parameters may be estimated from historical data. For example, if γ ≤ 2 the Pareto distribution has infinite variance, and if γ ≤ 1 it also has infinite mean. For more details, the interested reader is referred to [PF95].

Self-similarity: Self-similarity is usually associated with fractals: the shape of a self-similar object is similar regardless of the scale at which it is viewed; for example, a coastline has a similar structure at a scale of one mile, 10 miles, or 200 miles. In the case of a stochastic phenomenon, such as a time series, self-similarity means that the object's correlational structure remains unchanged at different time scales. Both Ethernet traffic and Web traffic have been shown to be self-similar [LTWW94] [CB97].
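Because the cumulative distribution F(x) = 1 − (k/x)^γ can be inverted in closed form, heavy-tailed request streams can be generated by inverse-transform sampling. The short Python sketch below illustrates this standard technique; it is not code from the paper, and γ = 0.9, k = 0.1 are simply the values used later in the simulations.

```python
import random

def pareto_sample(gamma=0.9, k=0.1):
    """Draw one value from a Pareto(gamma, k) distribution by inverting
    F(x) = 1 - (k/x)**gamma, i.e. x = k / (1 - u)**(1/gamma) for u ~ U(0, 1)."""
    u = random.random()
    return k / (1.0 - u) ** (1.0 / gamma)

# With gamma = 0.9 <= 1 the distribution has infinite mean, so the sample
# average keeps growing with sample size -- the heavy-tail behaviour
# described above.
samples = [pareto_sample() for _ in range(10_000)]
print(min(samples), max(samples))
```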

3.2 Neural Network Background

This section gives a brief background of competitive learning and Kohonen’s rule. We first describe the idea of competitive learning and then describe a mapping of this technique to web server routing.

3.2.1 Competitive Learning

In the simplest competitive learning network there is a single layer of output units S = {S1, S2, ..., SN}, each of which is fully connected to a set of inputs Oi via connection weights wij. A brief description of a competitive learning algorithm follows. Let O = {O1, O2, ..., OM} be an input to a two-layer network with an associated set of weights wij. The standard competitive learning rule [H91] is given by:

Δwi*j = η (Oj − wi*j),


which moves wi* towards O; the i* indicates that only the set of weights corresponding to the winning node is updated. The winning node is taken to be the one with the largest output. Another way to write this is Δwij = η Si (Oj − wij), where

Si = 1 for the i corresponding to the largest output, and Si = 0 otherwise.

This is the adaptive approach taken by Kohonen in his first algorithm (see the Kohonen model below). The usual definition of competitive learning requires a winner-take-all strategy. In many cases this requirement is relaxed so that all of the weights are updated in proportion to some criterion; this form of competitive learning is referred to as leaky learning [H91]. Hertz et al. [H91] discuss various forms of this adaptive processing for different problems, including the traveling salesperson problem. It has become standard practice to refer to all of these as Kohonen-like algorithms.
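A minimal Python sketch of the winner-take-all update Δwi*j = η(Oj − wi*j) described above, with an optional leaky-learning term for the losing units. The matrix representation, learning rates, and example dimensions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def competitive_update(W, o, eta=0.1, leaky_eta=0.0):
    """One step of competitive learning.

    W : (N, M) weight matrix, one row of weights per output unit S_1..S_N.
    o : (M,) input vector O.
    The winner is the unit with the largest output W @ o; only its weights
    move towards the input (winner-take-all).  If leaky_eta > 0, the losing
    units are also nudged towards the input (leaky learning)."""
    outputs = W @ o
    winner = int(np.argmax(outputs))
    W[winner] += eta * (o - W[winner])          # Δw_{i*j} = η (O_j - w_{i*j})
    if leaky_eta > 0.0:
        losers = [i for i in range(W.shape[0]) if i != winner]
        W[losers] += leaky_eta * (o - W[losers])
    return winner

# Example: 4 output units competing for 8-dimensional inputs.
rng = np.random.default_rng(0)
W = rng.random((4, 8))
for _ in range(100):
    competitive_update(W, rng.random(8))
```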

3.2.2 Kohonen's Algorithm

Kohonen's algorithm [K89], [H91] adjusts the weights from common input nodes to N output nodes arranged in a two-dimensional grid (see Figure 2) to form a vector quantizer. Input vectors are presented sequentially in time, and after enough input vectors have been presented the weights specify clusters or vector centers. These vector centers sample the input space such that their point density function approximates the probability density function of the input vectors. The algorithm also organizes the weights such that topologically close nodes are sensitive to physically similar inputs; output nodes are thus ordered in a natural fashion, and the algorithm forms feature maps of the inputs. A description of the algorithm follows. Let x1, x2, ..., xN be a set of input vectors, each of which defines a point in N-dimensional space. The output units Oi are arranged in an array and are fully connected to the inputs via the weights wij. A competitive learning rule is used to choose a winner unit i*, such that |wi* − x| ≤ |wi − x| for all i.

5. Experimental Results

Web traffic is heavy-tailed, with P[X > x] ~ x^(−γ) as x → ∞ for 0 < γ < 2. The results reported here correspond to a Pareto distribution with probability density function p(x) = γ k^γ x^(−γ−1), where γ = 0.9 and k = 0.1. We conducted the simulations in two separate environments: (1) the Louisiana State University Networks Lab, where Web traffic was simulated using the Pareto distribution on a single PC, and (2) the Computer Science laboratory at Louisiana Tech University, where the simulation environment consisted of a network of five IBM PCs. One PC ran a dedicated Web server written in Java specifically for this experiment; one PC ran a proxy server; and on the remaining three PCs, referred to as client PCs, Microsoft Internet Explorer (IE) and Netscape browser settings pointed to the proxy server for all requests. In addition to IE, each client PC simulated 300 clients by creating a separate thread (in Java) for each client. We implemented the neural model on the PC running the proxy server. We compared the performance of our algorithm with Round Robin (RR), Round Robin 2 (RR2), and a special case of the Adaptive Time-to-Live (TTL) algorithm. In the RR2 algorithm, client domains are partitioned into two classes, normal domains and hot domains, based on domain load information, and the Round Robin scheme is applied separately to each class; details of this algorithm are given in [CCY99a]. In our implementation of the Adaptive TTL algorithm, we assigned a lower TTL value when a request originated from a hot domain and a higher TTL value when it originated from a normal domain; in this way the skew on Web objects is reduced.
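For comparison, a hedged sketch of the RR2 policy as just described: domains are split into hot and normal classes based on observed load, and round robin is applied separately within each class. The threshold and bookkeeping details are assumptions made for this sketch; see [CCY99a] for the actual algorithm.

```python
class RoundRobin2:
    """Round Robin 2: keep separate round-robin counters for requests that
    originate from hot domains and from normal domains."""

    def __init__(self, servers, hot_threshold=1000):
        self.servers = list(servers)
        self.hot_threshold = hot_threshold   # assumed per-domain load cut-off
        self.domain_load = {}                # domain -> requests seen so far
        self.counters = {"hot": 0, "normal": 0}

    def route(self, domain):
        load = self.domain_load.get(domain, 0) + 1
        self.domain_load[domain] = load
        klass = "hot" if load > self.hot_threshold else "normal"
        server = self.servers[self.counters[klass] % len(self.servers)]
        self.counters[klass] += 1
        return server
```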

The following table gives the characteristics of our simulations.

Table 1: Simulation Characteristics

Sample size: The number of Web objects ranged from 150 to 1050, and statistics were collected at intervals of 50 objects.
Number of servers: Statistics were collected for 4, 8, 16, and 32 servers.
Web page distribution used: Uniform and non-uniform (Pareto).
Algorithms: Neural Network (NN), Round Robin (RR), Round Robin 2 (RR2), and Adaptive Time-to-Live (TTL).

The comparison charts in the following discussion relate only to the Round Robin scheme and our proposed neural-network-based algorithm. The results (hit ratios) for the Adaptive TTL algorithm varied widely for different input sizes of Web objects and for different input object distributions, but never exceeded 0.68.

[Chart: hit ratio (y-axis) versus number of page requests (x-axis, 150 to 1050) for RR and NN with 4, 8, 16, and 32 servers.]

Figure 5: Performance of the page placement algorithm using competitive learning (Neural Network) versus Round Robin algorithms (for non-uniform input data distribution)

[Chart: hit ratio (y-axis) versus number of page requests (x-axis, 150 to 1050) for RR and NN with 4, 8, 16, and 32 servers.]

Figure 6: Performance of the page placement algorithm using competitive learning (Neural Network) versus Round Robin algorithms (for uniform input data distribution)

As can be seen from Figure 5, the Neural Network (NN) competitive learning algorithm performs much better than the Round Robin schemes (including RR2) when input pages follow a Pareto distribution. As the number of input objects increases, our algorithm achieves a hit ratio close to 0.98, whereas the Round Robin schemes never achieve a hit ratio of more than 0.4. For the NN algorithm, the lower hit ratio (0.86) at smaller numbers of objects is attributed to the learning phase of the algorithm; as the algorithm learns, the hit ratio asymptotically stabilizes at 0.98 for larger numbers of objects. For a uniform distribution of input objects, the NN algorithm performs similarly to the non-uniform case and is again much better than the Round Robin schemes (see Figure 6).

Table 2: Comparison of maximum hit ratio achieved (hit ratio, input size, number of servers)

              RR               NN
Uniform       (0.31, 150, 4)   (0.98, 1050, 4)
Non-Uniform   (0.32, 150, 4)   (0.98, 1050, 4)

The Round Robin scheme never achieves a hit ratio higher than 0.32, whereas NN achieves hit ratios close to 0.98 (see Table 2).

Table 3: Comparison of minimum hit ratio achieved (hit ratio, input size, number of servers)

              RR                 NN
Uniform       (0.03, 1050, 32)   (0.85, 150, 32)
Non-Uniform   (0.02, 1050, 32)   (0.86, 150, 32)

Even in the worst case, NN achieves a hit ratio of 0.85 with 32 servers, whereas the RR schemes drop to a hit ratio as low as 0.02 (see Table 3).

6. Conclusions

Our analysis shows the following results: (1) The performance of our algorithm (NN) increases considerably as the traffic increases (from a 0.85 hit rate to 0.98, compared with 0.02 to 0.38 for the Round Robin scheme), whereas the performance of Round Robin decreases. This result holds irrespective of the number of servers, and is a consequence of the learning component in equation (2), which pushes requests for an object towards the same server.

(2) For a uniform distribution of Web object requests at a lower traffic rate with a large number of servers (16 and 32), both algorithms perform equally well, but as the traffic increases our algorithm performs much better than the RR scheme. (3) For a non-uniform (Pareto) distribution, our algorithm performs considerably better at both lower and higher traffic rates, and this performance holds irrespective of the number of servers. Since the Pareto distribution closely models real Web traffic, the better performance of our algorithm at larger input rates of Web objects is a very attractive result.

In summary, we have presented a novel approach that uses a competitive neural network model for fast allocation of Web pages while balancing the load among servers. Our algorithm has the ability to learn the arrival distribution of Web requests and adapts itself to the load and to the changing characteristics of Web requests. Simulation results show an order of magnitude improvement over some existing techniques. We are currently working on methodologies to include meta-data about Web pages in our algorithm to further improve this approach.

7. References

[BBM97] M. Baentsch, L. Baum, and G. Molter, "Enhancing the Web's Infrastructure: From Caching to Replication," IEEE Internet Computing, Vol. 1, No. 2, pp. 18-27, Mar.-Apr. 1997.

[BM98] M. Beck and T. Moore, "The Internet2 Distributed Storage Infrastructure Project: An Architecture for Internet Content Channels," Proc. of the 3rd Workshop on WWW Caching, Manchester, England, June 1998.

[CB97] M. E. Crovella and A. Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes," IEEE/ACM Transactions on Networking, Vol. 5, No. 6, pp. 835-846, December 1997.

[CCY99a] V. Cardellini, M. Colajanni, and P. Yu, "DNS Dispatching Algorithms with State Estimators for Scalable Web-Server Clusters," World Wide Web Journal, Baltzer Science, Vol. 2, No. 2, July 1999.

[CCY99b] V. Cardellini, M. Colajanni, and P. Yu, "Dynamic Load Balancing on Web-Server Systems," IEEE Internet Computing, Vol. 3, No. 3, pp. 28-39, May-June 1999.

[CK01] K. G. Coffman and A. M. Odlyzko, "Growth of the Internet: Preliminary Research," available at http://www.dtc.umn.edu/~odlyzko/doc/oft.internet.growth.pdf. See also "Internet Growth: Is There a 'Moore's Law' for Data Traffic?" in Handbook of Massive Data Sets, J. Abello, P. M. Pardalos, and M. G. C. Resende, eds., Kluwer, 2001.

[GB99] M. Grossglauser and J.-C. Bolot, "On the Relevance of Long-Range Dependence in Network Traffic," IEEE/ACM Transactions on Networking, Vol. 7, No. 5, pp. 629-640, October 1999.


[H91] J. Hertz et al., Introduction to the Theory of Neural Computation, Lecture Notes, Vol. 1, Reading, MA: Addison-Wesley, 1991.

[IC97] A. Iyengar and J. Challenger, "Improving Web Server Performance by Caching Dynamic Data," Proceedings of the USENIX Symposium on Internet Technologies and Systems, December 1997.

[K89] T. Kohonen, Self-Organization and Associative Memory, 3rd ed., Berlin: Springer-Verlag, 1989.

[LT94] W. E. Leland, M. S. Taqqu, and D. V. Wilson, "On the Self-Similar Nature of Ethernet Traffic (Extended Version)," IEEE/ACM Transactions on Networking, Vol. 2, No. 1, February 1994.

[LTWW94] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, "On the Self-Similar Nature of Ethernet Traffic (Extended Version)," IEEE/ACM Transactions on Networking, Vol. 2, No. 1, pp. 1-15, 1994.

[MSE99] A. Modares, S. Somhom, and T. Enkawa, "A Self-Organizing Neural Network for Multiple Traveling Salesman and Vehicle Routing Problems," International Transactions in Operational Research, Vol. 6, pp. 591-606, 1999.

[P92] V. Phoha, "Image Recovery and Segmentation Using Competitive Learning in a Computational Network," Ph.D. Dissertation, Texas Tech University, Lubbock, Texas, 1992.

[PF95] V. Paxson and S. Floyd, "Wide-Area Traffic: The Failure of Poisson Modeling," IEEE/ACM Transactions on Networking, Vol. 3, No. 3, pp. 226-244, June 1995.

[PO96] V. Phoha and W. J. B. Oldham, "Image Restoration and Segmentation Using Competitive Learning in a Layered Network," IEEE Transactions on Neural Networks, Vol. 7, No. 4, pp. 843-856, July 1996. (See also V. V. Phoha and W. J. B. Oldham, Corrections to "Image Restoration and Segmentation Using Competitive Learning in a Layered Network," IEEE Transactions on Neural Networks, Vol. 7, No. 6, November 1996.)

[YY98] K. Yeung and T. Yum, "Node Placement Optimization in ShuffleNets," IEEE/ACM Transactions on Networking, Vol. 6, No. 3, pp. 319-324, June 1998.
