OPTIMAL WEBSERVER SESSION TIMEOUT SETTINGS FOR WEB USERS∗

Wei Xie†, Hairong Sun‡, Yonghuan Cao† and Kishor S. Trivedi†
{wxie, hairong, ycao, kst}@ee.duke.edu
†Center for Advanced Computing and Communications, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708
‡High Reliability and Availability Technology Center, Motorola, Elk Grove Village, IL 60007, USA

From an end user's point of view, too short a Webserver timeout implies too many forced logouts, and too long a timeout duration poses a higher security risk to users' sensitive data. We propose cost functions to select the timeout value, which are based on the solutions of a Markov regenerative process model. We discover that the distributions of user think time and server response time are important in determining the optimal timeout value. We thus present a non-parametric algorithm that dynamically estimates the optimal timeout without assuming prior knowledge of the distributions.

1 Introduction
The Internet has become an indispensable part of people's daily lives and a concrete keystone of today's business and entertainment [BURE, EPAY]. The performance of Websites, such as their response times, has been a major concern of service providers since the early stages of the Internet. Increasing the number of servers does not always solve the problem, as demand quickly outgrows capacity. At the same time, stricter security policies and mechanisms are being implemented and enforced [RESE01]. However, security and performance are conflicting requirements in many contexts, and the issue of how to trade off between these two important attributes has not been addressed. Timeout is an important and popular mechanism, in terms of both performance and security, utilized by almost all parts of an e-business system. The session timeout of the Webserver, with which the user directly interacts, is especially important. Unfortunately, little can be found in the literature, and it seems that Website administrators set this value arbitrarily.

∗ This research was supported in part by the Air Force Office of Scientific Research under MURI Grant No. F49620-00-1-0327, and in part by DARPA and the US Army Research Office under Award No. C-DAAD19-01-1-0646. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the sponsoring agencies.
† This work was done while K. Trivedi was a visiting Professor in the Department of Computer Science and Engineering, holding the Poonam and Prabhu Goel Chair, at the Indian Institute of Technology, Kanpur.
The decision of the Webserver timeout length involves two types of tradeoffs. From the end user's point of view, too short a timeout is inconvenient and may accrue other losses, as we will see in the next section, while too long a timeout implies a higher risk of intrusion. From the server's perspective, too short a timeout makes the unnecessary re-login overhead high, while too long a timeout may waste system resources. In other words, the former tradeoff concerns the individual user's experience, i.e., a usability issue, while the latter trades off system performance. Naturally, the approaches to these two different problems could be substantially different.

In this paper, we present a stochastic model which captures the user behavior and the Webserver timeout activities, and we analyze how the timeout value affects the user's normal access to the Web services.1 We then propose two cost functions and, based on them, formulate non-parametric estimators for optimal timeout durations. Our estimators are simple and fast, and they require no information about the underlying distributions. These features are especially attractive because Webservers are already very busy and the underlying distributions are unknown or nonstationary. With these non-parametric timeout estimators, Webserver administrators and Web application developers may incorporate an adaptive timeout adjusting mechanism into their e-business Websites.

1 Only the tradeoff for end users is considered in this paper. The server-oriented tradeoff will be addressed in another paper.

The rest of this paper is organized as follows: Section 2 gives the background of e-business, its infrastructure, Webserver timeout and its impact. In Section 3 we define a stochastic model of the user-Webserver system with timeout and solve it in closed form. Section 4 describes the cost functions used for selecting the timeout length and the non-parametric estimation algorithm; procedures for goodness-of-fit and minimum sample size using the Kolmogorov-Smirnov statistic are also described. Numerical simulations are given in Section 5, sensitivity analysis is presented in Section 6, and Section 7 concludes the paper.
2 Background

2.1 Overview of e-Business Infrastructures
The response times of modern Website servers are often very long because of heavy load and internal structural complexity. The most popular e-business Website infrastructure is built as a multi-tier structure consisting of client, presentation, business, and datasource tiers. In contrast to the legacy client-server model,2 the multi-tier structure distributes different pieces of business logic into different components, and it is scalable, efficient, secure and easy to maintain. The client tier is preferably made thin because network transport is generally a bottleneck; keep in mind that a thin client implies more burden on the servers. Then there is the request redirector, which is the facade of the entire Website. All requests go through it first, and it may work as a load balancer, proxy and/or firewall, depending on the organization; it adds many security and performance benefits. The next tier is the Webserver cluster, or the presentation layer, which would otherwise be loaded with every task of the back-end; in the multi-tier model, the Webserver's only job is to arrange and render HTML pages. Complex business processing, such as that accomplished by EJBs (Enterprise JavaBeans), is the responsibility of the next tier, the application server cluster, or the business layer. This is the core of the Website, and all business logic is implemented in this layer. It may also interface with other downstream systems via CORBA (Common Object Request Broker Architecture), RMI (Remote Method Invocation), etc., depending on the complexity and structure of the business. The last tier consists of the enterprise databases/data warehouses, where all information is stored after transactions are committed, for data persistence, consistency, sharing and security. If two or more processes are accessing the same portion of the database, it is likely that only one of them can obtain the lock and the others will have to wait; this increases the response time significantly when the load is very heavy.

The Webserver layer is the front-end which directly interacts with end users,3 and it is the only part we study in this paper. Everything behind the Webserver, including application servers and databases, is called the back-end.

2 Some authors use the terminology "client-server" with a broader meaning.
3 The request redirector, which does not do much more than simply forward requests and responses, is ignored in our analysis.

2.2 Active Webserver Session Timeout
As we mentioned previously, the Webserver is the interface to the clients, and its timeout behavior has a significant influence on the user's satisfaction and data security. Specifically, we will study the impact of the HTTP session timeout value in detail, because e-business is mainly built on the WWW, which in turn is based on the HTTP protocol. HTTP is a stateless protocol [FIEL99]. In order to keep track of users, the concept of a session is implemented by either cookies or URL-rewriting techniques; the users interact with the Website in the context of sessions. For most Websites, when a user first sends an HTTP request through a Web browser, either a cookie is stored on the user's local hard disk or the URL the user enters is altered to include an encrypted session ID. In this way the Webserver knows whether multiple requests come from the same user. For many online services, the user needs to log in with a unique user name and password combination before the personalized service begins, and the user is required to log out at the end of his or her visit for security and performance reasons. HTTPS, or HTTP over SSL (secure sockets layer) [FRIE96], is used extensively for data encryption, server authentication, message integrity, and client authentication. To protect sensitive private information from being accessed or modified by unauthorized persons when the user fails to explicitly log out, whether for technical reasons (e.g., the back-end servers are busy or down) or because of the user's negligence or ignorance (forgetting to click the logout button, or not knowing that closing all browsers is required), Web service providers often enforce a timeout policy, i.e., the user is automatically logged out after a certain period of inactivity. The timeout event can occur either at the end of the timeout duration or at the next user request. The former type, called active timeout, is more aggressive and depends on client-side programming such as JavaScript. In this paper, only the active timeout scenario is modeled, which is preferred by e-business Websites holding important user data. The duration of the timeout, typically a constant preset by the Website administrators in a rather arbitrary manner, ranges from minutes to hours.
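For concreteness, one way an active timeout can be enforced on the server side is to keep a table of sessions and their last-activity times and sweep it in the background. The sketch below is ours and not part of the original paper; the data structure, the helper names, and the sweep interval are illustrative assumptions.

import threading
import time
import uuid

TIMEOUT_A = 600.0          # timeout duration a in seconds (illustrative value)
sessions = {}              # session id -> time of last request
lock = threading.Lock()

def login():
    """Create a session after a successful authentication (details omitted)."""
    sid = uuid.uuid4().hex
    with lock:
        sessions[sid] = time.time()
    return sid

def touch(sid):
    """Record user activity; return False if the session has already been timed out."""
    with lock:
        if sid not in sessions:
            return False
        sessions[sid] = time.time()
        return True

def active_timeout_sweep():
    """Background task: forcibly log out every session idle for longer than a."""
    while True:
        now = time.time()
        with lock:
            for sid, last in list(sessions.items()):
                if now - last > TIMEOUT_A:
                    del sessions[sid]      # forced logout; the user must re-login
        time.sleep(5.0)

threading.Thread(target=active_timeout_sweep, daemon=True).start()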
2.3 Cost of Too Short and Too Long Session Timeout
Interruptions of an online service, such as a forced timeout, are very undesirable. Most mission-critical online services, such as banking and stock trading, rely heavily on transactions. A transaction is an atomic set of steps: either all of the steps finish successfully or none of them is carried out. These two alternatives are known as a commit and a roll-back, respectively. Distributed transactions involve multiple resource managers, and therefore they last longer and their roll-backs are more complex and expensive. Obviously, if the user is timed out during a transaction, the transaction is rolled back and the user will have to repeat the entire transaction when he or she logs in again. This, at a minimum, causes inconvenience and wastes user time and system resources. For some applications, such as online stock trading or online bidding, the loss to the customer may be significant and not recoverable. A short timeout will also increase the number of user re-logins. Creation of new sessions, especially secure sessions, is expensive, since the entire procedure of initiating a new session has to be repeated. Apostolopoulos et al. showed that SSL overheads can make Webservers slower by a couple of orders of magnitude [APOS00]. On the other hand, too long a timeout will greatly increase the risk of invasion and jeopardize the users' personal data and the safety of the Website itself. The consequences include the breach of confidentiality and integrity of customer information, monetary loss, Website downtime, and loss of customer trust and business.

2.4 Optimal Webserver Session Timeout
Clearly, e-businesses with different security needs should configure their Websites, including the timeout duration, differently. User account security requirements vary drastically among e-business providers. Online businesses fall into three categories based on the level of user data sensitivity.

1. High. The user's business or personal data stored with the service provider is highly precious, and the cost of losing such information is very high. Therefore, security, including confidentiality and integrity of user data, takes the top priority. Examples of this type include business-to-business (B2B), online banking, online stock brokerage, and online shopping sites which collect users' credit card numbers.

2. Medium. Websites in this class hold some of the users' personal information, such as name, home address, email address, date of birth, etc. These items are certainly private and should not be disclosed without the user's consent, but the cost of losing such information is not likely to be as serious as that of the previous category. Examples of this type include online email services, online insurance Websites, and online bill access from local electricity, gas or telephone companies.

3. Low. Most of the remaining online services. Little or no personal information is collected and stored. Hence security is not a major issue, while performance has a high priority. Examples of this type include online books, news, weather, search engines, and forums.

For class 1 Websites, optimal session timeout selection should include both security and usability considerations, while for class 3 Websites, usability is the only criterion for timeout selection. Websites of class 2 are in between, so both security and usability may be important.
3 MRGP Model

In this section, we introduce a stochastic model of the Webserver and its timeout behavior. We then obtain a closed-form solution to the model.

3.1 Model Definition
[Figure 1: MRGP Model. States: Waiting (0), Reading (1), Logged Out (2), Abnormal Exit (3), Normal Exit (4), Timed Out (5). Labeled transitions: Web page displayed G(a); Web page requested (1-p)F(a); User leaves w/o logout pqF(a); User logout p(1-q)F(a); Timed out 1-F(a); Timed out 1-G(a); Relogin; Timed out. Transition probabilities are shown in this figure.]
[Figure 2: Expanded Version of the MRGP Model. Solid arcs represent timed transitions and dashed arcs represent immediate transitions. Transition probabilities are shown in this figure.]

Fig. 1 is a Markov regenerative process (MRGP) model of the user-Webserver system, and Fig. 2, the expanded version of Fig. 1, is shown to assist in explaining the model. Assume that initially the user is logged in, has just sent an HTTP request, and is waiting for the page to be loaded. This state is denoted by state 0 (waiting). If the page is displayed after some time (generally distributed with cdf G(t)), the user begins to read the result and the system is in state 1 (reading). Note that the Webserver processing time and the network transmission time are ignored because they are small compared to the heavy-weight processing of the back-end.4 Thus, G(t) is also the distribution of the back-end response time. The user reading time is generally distributed with cdf F(t).

4 For transaction-oriented e-business Websites, a significant amount of time is spent on back-end processing for personalized data generation, such as account balances. Heavy-weight components of the rendered HTML page, for example images, are static and easy to cache in the client's machine or at the ISP. Therefore, the network transmission times (mostly for pure text files) are short.

Whether in state 0 or state 1, the user is subject to a deterministic active timeout transition with cdf T(t) = u(t − a), where u(t) is the unit step function. In other words, if the user fails to send a new request or the back-end fails to fulfill the request after time duration a has elapsed, the Webserver will automatically terminate the session and log out the user. This temporary exit is denoted by state 2 (logged out) in the model; the user will re-login to finish his or her task. Alternatively, the user may end the session willingly after reading a Web page by logging out explicitly. We call this a normal exit, labeled state 4 in the model. If the user leaves without properly terminating the session (by clicking the logout button and/or closing all active browsers), it is called an abnormal exit and is denoted by state 3. State 3 will be timed out and the browser will display the re-login page; this is state 5. The mean sojourn time of state 3, t3, is the timeout duration a minus the mean user think time t1, where ti denotes the expected sojourn time of state i. We introduce the constant p as the probability that the user finishes the work on this Website and logs out, and the constant q as the probability that the user does not fully end the session at the end of the visit. Clearly, the mean number of pages the user intends to load is 1/p. Expected branching probabilities are given in both Fig. 1 and Fig. 2.

We are interested in the quality of service and security measures for the end users, and in how the Webserver timeout settings affect them. If the timeout duration a is too short compared to the back-end server processing time or the user reading time, the user will be kicked out too many times during a visit. On the other hand, too long a timeout duration exposes the user's personal account information on the Website to security violations. How to determine the best timeout duration a, so as to minimize the inconvenience of interruptions and the risk of intrusions, is the focus of this paper.
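To make the model dynamics concrete, the following sketch (ours, not part of the paper; the exponential distributions, parameter values, and function names are illustrative) simulates a single visit through the states of Fig. 1 and estimates the mean number of forced timeouts per visit, which Section 3.2 derives in closed form as M02.

import math
import random

def simulate_visit(a, p, draw_think, draw_resp, rng):
    """Simulate one user visit through the MRGP of Fig. 1; return the number of
    forced timeouts (visits to state 2). The abnormal-exit probability q does not
    affect this count, so it is omitted here."""
    timeouts, state = 0, "waiting"
    while True:
        if state == "waiting":
            if draw_resp(rng) <= a:            # page arrives before the timeout fires
                state = "reading"
            else:                              # active timeout while waiting
                timeouts += 1                  # user is logged out, then re-logs in
        else:                                  # reading
            if draw_think(rng) <= a:           # user acts before the timeout fires
                if rng.random() < p:           # visit ends (normal or abnormal exit)
                    return timeouts
                state = "waiting"              # next page requested
            else:
                timeouts += 1
                state = "waiting"              # logged out, re-login, request again

rng = random.Random(1)
mean_think, mean_resp = 60.0, 10.0             # assumed means, matching Section 5 defaults
draw_think = lambda r: r.expovariate(1 / mean_think)
draw_resp = lambda r: r.expovariate(1 / mean_resp)
a, p = 120.0, 0.1
runs = 20000
est = sum(simulate_visit(a, p, draw_think, draw_resp, rng) for _ in range(runs)) / runs
Fa = 1 - math.exp(-a / mean_think)
Ga = 1 - math.exp(-a / mean_resp)
print(est, (1 - Fa * Ga) / (p * Fa * Ga))      # simulated vs. closed-form M02 (Section 3.2)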
3.2 Model Analysis

There are several methods available to solve this stochastic process, including the embedded discrete-time Markov chain (DTMC) and the Laplace transform (e.g., see [TRIV01, GERM95]). In this paper only the DTMC method is shown, and the detailed derivations are omitted for brevity. The MRGP's embedded DTMC has the transition probability matrix (assuming $P_{ii} = 1$ for the absorbing states $i \in \{4, 5\}$)

$$ P = \begin{pmatrix} Q & C \\ 0 & I \end{pmatrix}, \qquad (1) $$

where

$$ Q = \begin{pmatrix} 0 & G(a) & \bar{G}(a) & 0 \\ (1-p)F(a) & 0 & \bar{F}(a) & pqF(a) \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 0 \\ p(1-q)F(a) & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}, $$

$$ \bar{F}(t) = 1 - F(t), \qquad \bar{G}(t) = 1 - G(t). $$

A lot of information about the DTMC is contained in the fundamental matrix (e.g., see [TRIV01]) $M = (I - Q)^{-1}$. Given the initial state 0,

$$ M_{00} = \frac{1}{pF(a)G(a)}, \qquad M_{01} = \frac{1}{pF(a)}, \qquad M_{02} = \frac{1 - F(a)G(a)}{pF(a)G(a)}. $$

It is well known that Mij is the mean number of times state j is visited before absorption, given that the initial state is i. Note that M01 is greater than 1/p, because the active timeout can interrupt the reading of the current page and thus increase the number of re-logins and the actual number of pages the user loads. Obviously, the absorption probabilities to states 4 and 5 are p04 = (1 − q) and p05 = q, respectively. This result can be verified by computing the matrix A = MC = [aij], where aij is the probability that the system starts from transient state i and eventually enters absorbing state j.
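As an illustrative numerical check (ours, not the paper's; the values of p, q, F(a) and G(a) below are assumptions), the fundamental matrix and the absorption probabilities can be computed directly from the blocks of Eq. 1 and compared against the closed-form expressions above.

import numpy as np

# Illustrative parameters (assumed values, not taken from the paper's experiments)
p, q = 0.1, 0.05
Fa, Ga = 0.9, 0.8   # F(a) and G(a) evaluated at a candidate timeout a

# Embedded DTMC blocks for the transient states {0: waiting, 1: reading,
# 2: logged out, 3: abnormal exit}, as in Eq. (1)
Q = np.array([
    [0.0,           Ga,   1 - Ga, 0.0],
    [(1 - p) * Fa,  0.0,  1 - Fa, p * q * Fa],
    [1.0,           0.0,  0.0,    0.0],
    [0.0,           0.0,  0.0,    0.0],
])
C = np.array([
    [0.0,               0.0],
    [p * (1 - q) * Fa,  0.0],
    [0.0,               0.0],
    [0.0,               1.0],
])

M = np.linalg.inv(np.eye(4) - Q)                 # fundamental matrix M = (I - Q)^{-1}
A = M @ C                                        # absorption probabilities a_ij

print(M[0, 2], (1 - Fa * Ga) / (p * Fa * Ga))    # M02 vs. its closed form
print(A[0])                                      # should be [1 - q, q]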
4 Cost Functions and Timeout Choice
Armed with the closed-form solutions of the MRGP model, we are ready to investigate the problem of selecting the optimal timeout, based on different requirements.

4.1 Unilateral Cost Function

For most general-purpose business Websites, e.g., those in the medium or low user data sensitivity levels, success largely depends on the volume of traffic, which in turn depends on the number of customers. In order to attract and retain customers, the Website administrators need to make browsing smooth, and one of the major activities toward this end is to ensure that a maximum rate of kicking out active surfers is not exceeded. The timeout duration is directly related to this metric.
Denoting by Nmax the upper limit on the mean number of times an individual user is timed out by the system (and must re-login) during a visit, we have the following inequality:

$$ N_{max} \ge M_{02} = \frac{1 - F(a)G(a)}{pF(a)G(a)}. \qquad (2) $$

By solving Eq. 2, the minimum timeout duration a can be obtained. This cost function is relatively easy to apply in practice if the criterion Nmax is given.
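As an illustration (ours, not from the paper; the exponential forms and the parameter values are assumptions), the minimum a satisfying Eq. 2 can be found by scanning candidate values, since M02 decreases monotonically in a.

import math

def min_timeout_unilateral(F, G, p, n_max, candidates):
    """Smallest a among candidates with M02(a) = (1 - F(a)G(a)) / (p F(a) G(a)) <= n_max."""
    for a in sorted(candidates):
        fa, ga = F(a), G(a)
        if fa > 0 and ga > 0 and (1 - fa * ga) / (p * fa * ga) <= n_max:
            return a
    return None  # no candidate satisfies Eq. (2)

# Illustrative exponential think and response times (assumed forms, not prescribed by Eq. 2)
F = lambda t: 1 - math.exp(-t / 60.0)   # mean user reading time 60 s
G = lambda t: 1 - math.exp(-t / 10.0)   # mean server response time 10 s
a_min = min_timeout_unilateral(F, G, p=0.1, n_max=0.1,
                               candidates=[0.5 * i for i in range(1, 4001)])
print(a_min)

The same routine works unchanged if F and G are replaced by the empirical cdf's Fn and Gm of Section 4.3 evaluated on the observed samples.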
4.2 Bilateral Cost Function

For Websites containing sensitive or critical user data, e.g., those in the high user data sensitivity category, such as online banking, online brokerage, etc., the security of the user account is as important as, or even more important than, the user convenience and/or comfort factors. To quantify the risk of invasion and the expense of timeout/re-login, we introduce the bilateral cost function. If the timeout duration a is short, the user session is likely to be ended while the user is still thinking or the back-end is still processing, and the user will have to re-login to finish the interrupted work, which may not be resumable. We denote the cost coefficient of this timeout/re-login by Cr. If the timeout duration a is long, the risk of invasion, denoted by cost coefficient Ci,5 is likely to be high. The goal is to minimize the total cost given by the cost function

$$ C(a) = C_{re\text{-}login} + C_{intrusion} = C_r M_{02} + C_i\, p_{05}\, t_3 = C_r\, \frac{1 - F(a)G(a)}{pF(a)G(a)} + C_i\, q \int_0^a F(t)\, dt. \qquad (3) $$

In Eq. 3, t3 is the expected sojourn time of state 3, i.e., the mean duration for which the user session remains active without the user's attendance. As defined earlier, F(t) and G(t) are cdf's, so they are monotonically non-decreasing functions taking values in [0, 1]. Therefore the first term on the right-hand side of Eq. 3 is monotonically decreasing, while the last term is increasing once a is large. When a → 0, F(a)G(a) → 0 and C(a) → ∞, and when a → ∞, C(a) → ∞. Therefore, the cost function Eq. 3 attains at least one minimum at a finite a. From Eq. 3, we see that the cost functions depend on the distributions of the user think time and the server response time. It would be desirable to have automatic distribution estimation without making any assumptions about the underlying distributions. We hence adopt a non-parametric approach.

5 Note that Cr and Ci have different units. We can think of Cr as Cr = Cr∗ × t2, where the constant t2 is the mean sojourn time of state 2. In this way Cr∗ and Ci have the same units.

4.3 Non-Parametric Estimators

Non-parametric estimation has been applied to areas such as software rejuvenation. In [DOHI00b, DOHI00a], neither the form nor the parameters of the system failure distribution is required. As the number of complete failure time observations approaches infinity, the non-parametric estimation converges to the optimal software rejuvenation interval uniformly with probability one. In our problem, by dynamically monitoring the user/server interactions and/or by periodically analyzing the server log files, a large amount of data, such as server response times and user HTTP request inter-arrival times, is available. Let 0 ≡ x0 ≤ x1 ≤ x2 ≤ · · · ≤ xn and 0 ≡ y0 ≤ y1 ≤ y2 ≤ · · · ≤ ym be ordered, uncensored observations of user thinking times and server response times, respectively. It follows that the empirical cumulative distribution functions (ecdf) Fn(t) and Gm(t) are given by

$$ F_n(x) = \begin{cases} i/n, & x_i \le x < x_{i+1} \\ 1, & x_n \le x \end{cases} \qquad (4) $$

$$ G_m(y) = \begin{cases} j/m, & y_j \le y < y_{j+1} \\ 1, & y_m \le y \end{cases} \qquad (5) $$

By feeding each of the pairs (Fn(xi), xi), 0 ≤ i ≤ n, and (Gm(yj), yj), 0 ≤ j ≤ m, into the cost function Eq. 3, we obtain the optimal timeout duration as the xi or yj which gives the lowest cost:

$$ a^{\dagger} = \Big\{ z_k \;\Big|\; C_{m+n}(z_k) = \min_{0 \le k \le m+n} C_{m+n}(z_k) \Big\}, \qquad (6) $$

where

$$ C_{m+n}(z_k) = \frac{C_r}{p\, F_n(z_k) G_m(z_k)} - \frac{C_r}{p} + C_i\, q\, t_3(z_k), \qquad (7) $$

$$ t_3(z_k) = \frac{1}{n} \sum_{i=1}^{l} i\, (x_{i+1} - x_i), \qquad (8) $$

and zk ranges over x1, x2, . . . , xn, y1, y2, . . . , ym, with xl ≤ zk < xl+1 (let xn+1 ≡ +∞).
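The estimator of Eqs. 4-8 is straightforward to implement. The sketch below is ours rather than the paper's implementation; in particular, it truncates the last interval of Eq. 8 instead of using the convention xn+1 ≡ +∞, and all function and variable names are illustrative.

import bisect

def optimal_timeout_bilateral(x, y, p, q, c_r, c_i):
    """Non-parametric estimate of the optimal timeout per Eqs. (4)-(8).
    x: user think time observations, y: server response time observations."""
    x, y = sorted(x), sorted(y)
    n = len(x)

    def ecdf(samples, t):
        # Empirical cdf (Eqs. 4 and 5): fraction of observations <= t
        return bisect.bisect_right(samples, t) / len(samples)

    def t3(z):
        # Eq. (8); the last (unbounded) interval is truncated instead of using x_{n+1} = +inf
        l = bisect.bisect_right(x, z)
        return sum((i + 1) * (x[i + 1] - x[i]) for i in range(min(l, n - 1))) / n

    best_a, best_cost = None, float("inf")
    for z in x + y:                                   # candidate timeouts z_k of Eq. (6)
        fn, gm = ecdf(x, z), ecdf(y, z)
        if fn == 0.0 or gm == 0.0:
            continue                                  # cost is unbounded for such small z
        cost = c_r / (p * fn * gm) - c_r / p + c_i * q * t3(z)   # Eq. (7)
        if cost < best_cost:
            best_a, best_cost = z, cost
    return best_a

Given think-time observations x and response-time observations y extracted from the logs, a call such as optimal_timeout_bilateral(x, y, p=0.1, q=0.05, c_r=1.0, c_i=5.0) returns the candidate zk with the lowest estimated cost.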
4.4 Sample Size Determination

A practical and theoretical issue accompanying the non-parametric estimators is: how many samples are sufficient to give a good enough empirical cdf? We propose to use the Kolmogorov-Smirnov test iteratively to determine the sample size for the non-parametric approach [D'AG86]. The Kolmogorov-Smirnov test is typically used to assess the goodness-of-fit of an empirical cdf to a known distribution, based on the supremum statistic defined as the largest "vertical" difference between the empirical and known cdf's [TRIV01, D'AG86]. Intuitively, if we assume that the underlying distributions F(x) and G(x) exist and are stationary, increasing the number of samples makes the empirical cdf's approach the underlying distributions. In other words, the difference between the two empirical cdf's Fn(x) and Fn+∆n(x) (or Gm(x) and Gm+∆m(x)), where ∆n (or ∆m) is a positive integer, will diminish as n approaches infinity. We follow the procedure listed below to determine the minimum sample size (without loss of generality, we use distribution F(t) here).

1. Collect n data samples and plot the ecdf Fn(x) according to Eq. 4;
2. Collect ∆n more samples, combine them with the n samples collected in the previous step, and plot the ecdf Fn+∆n(x) according to Eq. 4;
3. Compute the Kolmogorov-Smirnov statistic D = sup_x |Fn+∆n(x) − Fn(x)|;
4. Compute the critical value dn+∆n;α (e.g., see [TRIV01]), where α is the level of significance at which we reject or accept the null hypothesis that our ecdf is good enough;
5. If D ≤ dn+∆n;α, the ecdf obtained from the n + ∆n samples is good at the level of significance α. Otherwise, repeat the steps.

Note that we have assumed that the distribution F(t) exists and is stationary during the period in which the data is collected. This assumption is realistic because e-business Websites often serve numerous customers concurrently, and a large volume of user activity and server response data is generated in a short period of time. In practice, we may compute the minimum sample size required for a given level of significance offline or infrequently. In this way we obtain at least a rule of thumb for the sample size and consume less computing power on production servers.
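A sketch of this iterative procedure follows (ours, not the paper's). The asymptotic critical-value approximation dn;0.05 ≈ 1.36/√n and the draw function are illustrative assumptions; in production one would feed batches of recorded observations instead of drawing synthetic samples.

import math
import random

def ks_distance(sample_a, sample_b):
    """Sup-norm distance between the ecdfs of two samples, evaluated at the pooled points."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for t in a + b:
        fa = sum(v <= t for v in a) / len(a)
        fb = sum(v <= t for v in b) / len(b)
        d = max(d, abs(fa - fb))
    return d

def minimum_sample_size(draw, n0=100, dn=100, alpha=0.05, max_n=5000, rng=None):
    """Iterate steps 1-5: grow the sample by dn until the ecdf stabilizes.
    draw(rng) returns one observation (e.g., one user think time)."""
    rng = rng or random.Random()
    sample = [draw(rng) for _ in range(n0)]
    while len(sample) < max_n:
        grown = sample + [draw(rng) for _ in range(dn)]
        d = ks_distance(sample, grown)                 # step 3
        # Asymptotic KS critical value d_{n;alpha} ~ 1.36 / sqrt(n) for alpha = 0.05
        # (an approximation used here for illustration)
        crit = 1.36 / math.sqrt(len(grown))            # step 4
        if d <= crit:                                  # step 5
            return len(grown)
        sample = grown
    return len(sample)

n_needed = minimum_sample_size(lambda r: r.expovariate(1 / 60.0))

The 1.36/√n approximation reproduces the critical values d800;0.05 = 0.0481 and d2000;0.05 = 0.0304 quoted later in Section 5.3.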
5 Numerical Simulations

In this section, we demonstrate our non-parametric estimation method with a few simulation examples. Although the estimators described above require neither the forms nor the parameters of the user and server behavior distributions, we use a random number generator to create data samples from various distributions. In reality these data samples are readily accessible, either from the system log files or from active monitoring. Our simulation results show that the non-parametric estimators are fairly accurate, fast and easy to implement.

5.1 Simulation Setup
Each simulation consists of A steps, and in each step we generate random numbers x∗i ∈ [0, 1], 1 ≤ i ≤ NF, and y∗i ∈ [0, 1], 1 ≤ i ≤ NG. We then feed x∗i and y∗i into the inverse functions of F(t) and G(t), respectively, and get xi = F−1(x∗i), yi = G−1(y∗i). It is well known that xi and yi then obey the distributions with cdf F(t) and G(t). Next, we plug the sample distribution data into our non-parametric estimators and calculate the timeout durations. In each of the following iterations, one new set of random numbers is generated and combined with the previous dataset, and a new estimate is made. In the figures of the following sections, the X-axis denotes the step of the simulation, and the Y-axis is the estimate of the timeout duration a. In some simulations we repeat the same program multiple times in order to observe how well and how fast the estimators converge. Long-tailed distributions such as Weibull and Pareto have been found to be prevalent in Internet traffic (e.g., see [DENG96, CROV97, DOWN01]), and we include them in our simulations. The distributions we use in the following simulations are summarized in Table 1.

Table 1: Summary of Distributions Employed
Distribution   Cumulative Probability Function (cdf)
Exponential    F(t) = 1 − e^(−λt), t ≥ 0
Uniform        F(t) = 0 for t < r1; (t − r1)/(r2 − r1) for r1 ≤ t ≤ r2; 1 for t > r2
Weibull        F(t) = 1 − e^(−(t/θ)^k), t ≥ 0
Pareto         F(t) = 0 for t < b1; [1 − (b1/t)^w]/[1 − (b1/b2)^w] for b1 ≤ t ≤ b2; 1 for t > b2

The default values for the parameters are k = 0.77, θ = 8.5788, b1 = 10, b2 = 360, and w = 0.5.6 Unless otherwise specified, we assume the mean user reading time to be 60 seconds and the mean server response time to be 10 seconds. We also assume p = 0.1, q = 0.05, NF = 100, NG = 80, A = 30, Ci = 5, Cr = 1, and Nmax = 0.1. Each simulation is repeated 5 times.

6 All the default values used in this paper are given only for the numerical examples. In practice these parameters should be decided case by case and may vary significantly.
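For illustration (this sketch is ours, not the paper's simulator, and the pairing of distributions with think and response times is arbitrary), the inverse-transform step above can be coded directly from the cdf's in Table 1; the exponential and truncated Pareto cases are shown, using the default parameter values listed above.

import math
import random

def inv_exponential(u, lam):
    """Inverse cdf of the exponential distribution in Table 1."""
    return -math.log(1 - u) / lam

def inv_pareto_truncated(u, b1, b2, w):
    """Inverse cdf of the truncated Pareto distribution in Table 1:
    F(t) = [1 - (b1/t)^w] / [1 - (b1/b2)^w] for b1 <= t <= b2."""
    scale = 1 - (b1 / b2) ** w
    return b1 / (1 - u * scale) ** (1 / w)

rng = random.Random(42)
# One simulation step: N_F think-time samples and N_G response-time samples
think_times = [inv_pareto_truncated(rng.random(), b1=10, b2=360, w=0.5) for _ in range(100)]
resp_times = [inv_exponential(rng.random(), lam=1 / 10.0) for _ in range(80)]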
5.2 Unilateral Cost Function Case
As we can see in the unilateral cost function Eq. 2, F(a) and G(a) are symmetric; hence the combination, not the permutation, of the two distribution functions is what matters. Fig. 3 (a) and (b) are for the unilateral case. Our non-parametric estimators give a slightly different estimate for each run, which is natural because different sets of random numbers are generated for each simulation. The estimates given in the unilateral case are just lower bounds for the timeout length, and the final endpoints of the curves in Fig. 3 (a) and (b) concentrate in a quite small range around the theoretical values. In practice, the Web administrators may create an automatic system task which periodically parses segments of the server log files and computes the intervals between consecutive user-level requests from the same address as well as the intervals between the corresponding back-end server responses. Then, according to Eqs. 4 and 5, Fn(x) and Gm(x) are ready for the unilateral cost function Eq. 2, from which the minimum timeout length can be estimated.
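One possible form of such a log-parsing task is sketched below (ours, not the paper's). The Apache common-log-format pattern and the per-address grouping heuristic are illustrative assumptions, and a real deployment would also need the corresponding response-time records (e.g., from application-level logs) to form Gm(x).

from collections import defaultdict
from datetime import datetime
import re

# Matches the client address and timestamp of a common-log-format line, e.g.
# 10.0.0.7 - alice [21/Feb/2002:10:17:03 -0500] "GET /account HTTP/1.1" 200 512
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def inter_request_intervals(log_lines):
    """Group requests by client address and return the intervals (in seconds)
    between consecutive requests from the same address."""
    last_seen, intervals = {}, defaultdict(list)
    for line in log_lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        addr = m.group(1)
        ts = datetime.strptime(m.group(2), "%d/%b/%Y:%H:%M:%S %z")
        if addr in last_seen:
            intervals[addr].append((ts - last_seen[addr]).total_seconds())
        last_seen[addr] = ts
    # Pool all per-address intervals into one sorted sample for the ecdf of Eq. (4)
    return sorted(t for v in intervals.values() for t in v)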
[Figure 3: Non-parametric Estimation Examples. Panels: (a) Unilateral: Exponential + Exponential; (b) Unilateral: Pareto + Weibull; (c) Bilateral: Exponential + Exponential; (d) Bilateral: Pareto + Weibull. X-axis: Number of Steps A; Y-axis: Estimation of Timeout Length a. Theoretical optimal timeout durations are (a) 276.9, (b) 326.8, (c) 60.0, and (d) 45.9 seconds.]
5.3 Bilateral Cost Function Case
Fig. 3 (c) and (d) show the simulation results for the bilateral cost function case. As in the unilateral case, the output of the estimator stabilizes after about 10 to 20 steps. With the parameters of our simulation, this is equivalent to 800 to 2000 observations of each of the user request times and the back-end server response times. If we choose 0.05 as the level of significance α, the critical values are d800;0.05 = 0.0481 and d2000;0.05 = 0.0304. For the exponential distribution, the Kolmogorov-Smirnov statistics we obtained in the simulation are D800 = 0.014107 and D2000 = 0.005211; for the Pareto distribution, D800 = 0.012857 and D2000 = 0.002974; for the Weibull distribution, D800 = 0.013750 and D2000 = 0.003105; and for the uniform distribution, D800 = 0.010179 and D2000 = 0.006737. As we can see, all of the Kolmogorov-Smirnov statistics D800 and D2000 are much smaller than d800;0.05 and d2000;0.05, respectively. This means that, as a rule of thumb, 800 to 2000 observations of each underlying distribution are good enough for the ecdf's and our non-parametric estimators, given the level of significance α = 0.05. This amount of data can be collected within minutes or even seconds at commercial Websites, which may serve thousands of concurrent users; it is reasonable to assume that these distributions do not change within such a short period of time. Note that the results of the individual simulations did not end at exactly the same value of a after 30 steps, but rather were distributed in a small range, as in the unilateral case. This is the asymptotic behavior of non-parametric estimation: the estimates converge strongly consistently as the sample size approaches infinity (e.g., see [DOHI00b, DOHI00a]). The resulting range is compact enough to serve as a guideline for Website owners to adjust to the right timeout length.
To implement this estimator in production Webservers, steps similar to those in the unilateral case may be taken, except that the bilateral cost function Eq. 3 is used and its minimum point is sought.

6 Sensitivity Analysis

[Figure 4: Sensitivity Analysis for Different Distributions. Panels: (a) Unilateral Cost Function; (b) Bilateral Cost Function. X-axis: Mean User Reading Time (seconds); Y-axis: Optimal Timeout Length a (seconds). Curves: uniform+uniform, exp+exp, Pareto+Weibull.]

[Figure 5: Sensitivity Analysis. Both the user thinking times and the server response times are assumed to be exponential. Panels: (a) Unilateral and (c) Bilateral Cost Function versus the Mean User Reading Time λ−1 (seconds), with curves for µ−1 = 1, 5, 10, 20, 30; (b) Unilateral Cost Function versus the Maximum # of Timeouts Nmax and (d) Bilateral Cost Function versus the Invasion Cost Ci, with curves for p = 0.005, 0.01, 0.05, 0.1, 0.5. Y-axis: Optimal Timeout Length a (seconds).]

As we can see in Fig. 4, the forms of the underlying distributions have a strong effect on the optimal timeout duration selections. Given distributions with the same means,7 the differences in the optimal timeout interval can be an order of magnitude. Moreover, depending on the parameters of those distributions, uniform distributions sometimes give a larger timeout value than exponential distributions, and sometimes the opposite is true. It is not hard to understand why the forms of the underlying distributions are so influential: the cost functions Eq. 2 and Eq. 3 are directly composed of the cdf's of the actual distributions. This fact also makes our non-parametric estimators outstanding, since no distributional assumption (neither the forms nor the parameters) is needed in our algorithm at all.

7 Note that the variances and other moments of these distributions may be different, and the default values given previously do not apply in this section.

Next, we take a look at the effects of other parameters when the forms of the distributions are given. Fig. 5 shows the sensitivity analysis of both cost functions. Both the user thinking times and the server response times are assumed to have exponential distributions, with means λ−1 and µ−1, respectively. If the users spend more time thinking, it naturally follows that the optimal timeout length will become longer; this is shown in Fig. 5 (a) and (c). When λ−1 is large compared to µ−1, the parameter µ−1 has little effect on the resulting estimates, hence the curves for different µ−1 in Fig. 5 (a) and (c) are very close or overlapping. For the unilateral case, Fig. 5 (b) is the semi-log plot of the maximum mean number of allowed timeouts per user per visit, Nmax, versus the minimum timeout length a. As we anticipate, the minimum a satisfying Eq. 2 drops quickly as Nmax increases. Fig. 5 (d) shows how the invasion cost coefficient Ci affects the optimal timeout choice in the bilateral case. When Ci increases, a shorter timeout duration a is desirable to reduce the risk of intrusion, and the trend toward a smaller a slows down as Ci becomes larger. Because the aggregated user behavior varies from peak hours to night and from weekdays to weekends (e.g., see [HELL98]), it helps to periodically update the Webserver timeout duration so as to improve the server performance and user satisfaction.
7 Conclusion
Timeout is a very popular protection mechanism found in many systems. In the world of Web browsing and e-business, the Webserver is the front-end that the user directly interacts with, and its timeout duration is closely related to the user's convenience and privacy. However, not much attention has been paid to Webserver timeout selection, and in practice Webmasters choose the value in a more or less arbitrary way. In this paper we have defined non-parametric estimators to fill this gap. Unlike traditional stochastic models, we have made no assumptions about the distributions involved, such as the user thinking times and the back-end server response times, and the timeout transitions are deterministic, which reflects real-world scenarios. We have solved the MRGP model and constructed two cost functions: the unilateral and the bilateral cost function. The former corresponds to Websites with relatively high user satisfaction requirements and low security requirements, while the latter is for Websites where the risk of invasion is also a major concern. We then described the non-parametric estimators for these two cost functions. The greatest benefits of our estimators are that they are simple, practical and fast, and that no information about the distributions is required. Based on this algorithm, Website administrators and Web application developers can dynamically adjust the Webserver timeout length to accommodate changes in the servers' load and the users' activities. We have also described iterative procedures to compute the minimum sample size required for the non-parametric estimations. Numerical examples have been given to demonstrate the estimators, and parameter sensitivity analysis has been carried out. It has been found that both the forms and the parameters of the underlying distributions are important for the optimal timeout selection; this fact makes our non-parametric estimators especially useful and appealing. In our simulations with several distributions, we have also demonstrated that a sample size of 800 to 2000 observations provides a fairly good empirical cdf at a level of significance of 0.05. Future work includes a prototype implementation based on the proposed approach and extensive simulations.
References

[APOS00] George Apostolopoulos, Vinod Peris, Prashant Pradhan, and Debanjan Saha, Securing electronic commerce: Reducing the SSL overhead, IEEE Network (2000), 8-16.
[BURE] U.S. Census Bureau, Measuring the electronic economy, http://www.census.gov/eos/www/ebusiness614.htm.
[CROV97] Mark E. Crovella and Azer Bestavros, Self-similarity in World Wide Web traffic: Evidence and possible causes, IEEE/ACM Trans. Networking 5 (1997), no. 6, 835-846.
[D'AG86] Ralph B. D'Agostino and Michael A. Stephens, Goodness-of-fit techniques, Marcel Dekker, 1986.
[DENG96] S. Deng, Empirical model of WWW document arrivals at access link, ICC'96, 1996, pp. 1797-1802.
[DOHI00a] Tadashi Dohi, Katerina Goseva-Popstojanova, and Kishor S. Trivedi, Analysis of software cost models with rejuvenation, IEEE Intl. Symposium on High Assurance Systems Engineering (HASE), November 2000.
[DOHI00b] Tadashi Dohi, Katerina Goseva-Popstojanova, and Kishor S. Trivedi, Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule, Pacific Rim International Symposium on Dependable Computing (PRDC), December 2000.
[DOWN01] Allen B. Downey, Evidence for long-tailed distributions in the Internet, ACM SIGCOMM'01, 2001.
[EPAY] epaynews.com, Statistics for online purchases, http://www.epaynews.com/statistics/purchases.html.
[FIEL99] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee, Hypertext Transfer Protocol - HTTP/1.1, June 1999.
[FRIE96] A. Frier, P. Karlton, and P. Kocher, The SSL 3.0 protocol, Netscape, November 1996.
[GERM95] R. German, D. Logothetis, and K. S. Trivedi, Transient analysis of Markov regenerative stochastic Petri nets: A comparison of approaches, Proc. 6th IEEE Workshop on Petri Nets and Performance Models (PNPM'95), Durham, North Carolina, USA, 1995, pp. 103-112.
[HELL98] Joseph L. Hellerstein, Fan Zhang, and Perwez Shahabuddin, Characterizing normal operation of a web server: Application to workload forecasting and problem detection, Proceedings of the Computer Measurement Group, 1998.
[RESE01] Internet Week Research, Analysis and insight for timely e-business decisions, Internet Week, http://www.internetweek.com/eresearch01/876/research.htm, 2001.
[TRIV01] K. S. Trivedi, Probability and Statistics with Reliability, Queueing, and Computer Science Applications, second ed., John Wiley & Sons, 2001.