Secure Application-Aware Service Differentiation in Public Area Wireless Networks ∗

Weisong Shi, Sharun Santhosh, and Hanping Lufei
Department of Computer Science, Wayne State University
431 State Hall, 5143 Cass Ave., Detroit, MI 48202, USA
{weisong, sharun, hlufei}@wayne.edu
Abstract
We are witnessing an increasing demand for pervasive Internet access from public area wireless networks (PAWNs). As their popularity grows, the inherently untrusted nature of public places and the diverse service requirements of end users are two key issues that need to be addressed. We propose two approaches to address these issues. First, the Home-based Authentication Protocol (HAP) provides a framework by which to establish trust between a nomadic client and a service provider using a trusted third party (home). Second, we argue that the best-effort service model provided by many access points is not enough to satisfy end-user fairness and to maximize the wireless link utilization for a diverse user population. We propose an application-aware service differentiation (AASD) mechanism that takes both application semantics and user requirements into consideration. Our analysis of this framework shows several fruitful results. The total authentication latency increases with the number of clients, but at a rate that is much less than linear. Also, in comparison with two other bandwidth allocation approaches, best effort and static access control, our proposed application-aware service differentiation method outperforms them in terms of client fairness and wireless bandwidth utilization.

Key Words: Mutual Authentication, Service Differentiation, Public-Area Wireless Network.

∗ This work was supported in part by Michigan Life Science Corridor under grant number MEDC-459 and Wayne State University Faculty Research Grant.

1 Introduction

Wireless connectivity is no longer just an add-on capability implemented through an interface card; it is now available as a built-in feature in a wide range of user devices. Public-area wireless networks (PAWNs) [1] are being deployed in public places such as airports, hotels, conference centers, shopping malls, libraries, and so on. As their popularity grows, the inherently untrusted nature of public places and the diverse service requirements of end users are two key challenges that need to be addressed.

We illustrate the problems faced in such an environment with the following scenario: Peter works for a large corporation. His company has contracts with a number of airports and hotels to provide Internet access (or other services) to their traveling employees. One day, Peter walks into an airport
with his laptop equipped with a wireless LAN card. When he turns it on, the laptop is assigned an IP address by the DHCP server at the airport. At this point two things must be done to ensure the security of both parties involved. First, Peter must verify the identity of the service provider (in this case the airport's Internet service provider). Second, the airport must verify Peter's identity. We call this process trust establishment.

In many existing infrastructure-based systems, before providing services to users, the service provider has to verify the identity of users, which is possible because "trusted" third parties, such as Certifying Authorities [2], credit companies, or global authenticators, vouch for them. In a nomadic environment, clients do not have direct access to the Internet. They have to rely on an untrusted server providing them a connection. This renders them incapable of ensuring they are really communicating with entities qualified to verify the identity of the service providers. The CHOICE project [1] of Microsoft Research uses a global authentication database to authenticate users at PAWNs. This method is a centralized approach and, moreover, it does not provide a mechanism for the client to authenticate the service provider. The problem with WEP (Wired Equivalent Privacy) [3], the protocol used by many wireless networks, is that the user must know the key in advance. While this is fine in SOHO (small office/home office) environments, it is not reasonable in a large public setting.

The other important issue in PAWNs is how best to enforce service differentiation for users. Public networks are open to a variety of users. Therefore, network access in public places must be able to support a wide range of service models, ranging from best-effort service to differentiated quality of service.
A service differentiation and access control mechanism is necessary in such an environment to ensure that genuine users receive the services they have paid for, to protect against malicious users, and to use the network's bandwidth resources effectively.

Several access control or bandwidth allocation algorithms have been proposed. RSVP [4] is a signaling protocol to reserve resources at all the routers along the path. While RSVP provides guaranteed quality of service, it has significant scalability problems and has not been widely deployed in today's Internet. Instead of providing such a hard performance guarantee, several measurement-based admission control algorithms [5] have been proposed to provide a soft guarantee. Allowing occasional performance degradation enables more efficient utilization of network resources while still providing acceptable service. Although these previous results are promising, few of them take application-level information into consideration. The result is either lower link utilization or unfair service provisioning.

We believe that an effective combination of mutual authentication and service differentiation should be the underpinning for the deployment of a PAWN. We have proposed a framework by which to establish trust between a nomadic client and a service provider. It consists of two components: the authentication module and the application-aware service differentiation module. The authentication module, called the home-based authentication protocol (HAP), uses a trusted third party, Home (the client's home organization), that is able to verify both the service provider's and the client's authenticity. The diverse service requirements of end users are handled by an efficient service differentiation module, which aims to satisfy two goals: end-user fairness and the maximization of wireless link utilization.

With improving the link utilization and user bandwidth allocation as our goals, we have developed an Application-Aware Service Differentiation (AASD) algorithm. Our algorithm is based on the observation that awareness of application-level demands is crucial to make predictions about client bandwidth requirements or to evaluate the overall network utilization.
A relatively accurate prediction of the requested object size and the Internet path bandwidth enables a more flexible and efficient implementation of traffic control and bandwidth allocation on the fly.

Our analysis of this framework shows that the total authentication latency increases as the number of concurrent clients increases. However, the HAP latency grows at a rate that is less than linear. We think that a latency of 4.75 seconds for 256 clients is acceptable. The service differentiation algorithm was implemented and evaluated under two different scenarios: fixed upstream bandwidth and dynamic upstream bandwidth. The results show that, in comparison with two other bandwidth allocation approaches, the best effort and static access control algorithms, our proposed method outperforms them in terms of both client fairness and wireless bandwidth utilization. For example, when the access point hosts 120 clients with different reserved bandwidths that follow a normal distribution, the utilization of our approach is twice that of the static allocation method and 50% more than that of the best effort approach, while at the same time satisfying 90% of clients with 95% of their individual reserved bandwidth. Furthermore, we proposed an exponential weighted queue optimization technique to adapt bandwidth allocation dynamically.

Note that the two components, the HAP authentication protocol and the AASD algorithm, are not tightly coupled and may be used jointly or separately with other protocols of the existing infrastructure. The application-aware service differentiation module can be integrated with other authentication protocols, while HAP may be used as an authentication mechanism for other distributed systems as well.

The rest of the paper is organized as follows. The design of the HAP protocol is described in Section 2. Section 3 outlines the design of the application-aware service differentiation algorithm. Performance evaluations of the HAP protocol and the AASD algorithm are presented in Section 4 and Section 5. Related work and concluding remarks are given in Section 6 and Section 7 respectively.
2 HAP Protocol Design

The basic idea of HAP is to use the trusted home organization of nomadic users to perform the mutual authentication and to assist the service provider in providing differentiated services.

Continuing with Peter's scenario in Section 1, and without loss of generality, we use H to represent the home organization of nomadic users (Peter's employer), A to represent the service provider in public places (the airport in Peter's case), and P to represent the nomadic user (Peter).

2.1 Assumptions and Terminology

Before getting into the details of the protocol, we list the assumptions and terminology used in the following sections.

2.1.1 Assumptions

• Each person has one home organization, which could be his/her employer. If the person has multiple employers, he or she can pick any one of them as his/her home organization. The home organization has the responsibility of issuing certificates to all employees, based on their roles in the organization.
• Each service provider at public places provides multiple service models, including authorized access and services with differentiated quality.
• Each service provider has contracts with multiple organizations, and each organization may have contracts with multiple service providers located at different public places. Those contracts can be set up either off-line or on-line. Basically, a contract includes a match between the role of a person in his/her home organization and the service-level agreement promised by the service provider. For example, if a university is a home organization, the faculty members from this university can be allocated 300Kbps bandwidth after being authenticated, while students from the same university can have 100Kbps bandwidth.

2.1.2 Terminology

Table 1 lists the definitions of all symbols used in the following sections.

Symbol      Meaning
P           Traveler
H           P's home organization
A           Service provider at a public place
$C_P$       P's certificate
$C_H$       H's certificate
$C_A$       A's certificate
$K_x^+$     The public key of entity x
$K_x^-$     The private key of entity x
$M_{xy}$    The secure message from entity x to y

Table 1: A list of the terminology.

2.2 Nomadic User (P)

As shown in Figure 1, while the nomadic user P is at his home organization, he obtains his certificate $C_P$ and H's certificate $C_H$ from H. Once P moves within A's field of control, it is assigned an IP address by the DHCP server deployed at A. After that, P sends a "hello" message to A. On receiving $C_A$ from A, P creates a message and sends it to A to perform the mutual authentication. This message contains three parts:

1. Part I includes two elements: $M_{PH}$, a predefined message to be forwarded to H by A, which includes P's identification, and a nonce N, a random number that is used only once to prevent the replay attack. Both $M_{PH}$ and N are signed with $K_P^-$ and then encrypted with $K_H^+$.
2. Part II is the certificate of the nomadic user, $C_P$.
3. Part III, $M_{PA}$, specifies P's identity, the H it belongs to, and a nonce N which has the same value as the nonce in Part I. Both $M_{PA}$ and N are signed by P with his private key $K_P^-$, and the resulting message is encrypted using A's public key $K_A^+$.

The entire message to be sent from P to A is now of the form:

$$\{[M_{PH}, N]_{K_P^-}\}_{K_H^+},\; C_P,\; \{[M_{PA}, N]_{K_P^-}\}_{K_A^+}$$

After sending this message to A, P waits for a response from H via A that authenticates A. If H is unable to authenticate A, P drops the connection.

2.3 Service Provider (A)

When a new nomadic user P enters A's field of control (detected by a beacon message), A's DHCP server assigns an IP address to it. On receiving a "hello" message from P, A sends its certificate $C_A$ to P and waits for P's authentication message. All traffic from P to A is rejected by A until the authentication process is complete. On receiving P's message, A does the following:

1. Decrypts Part III of the message using $K_A^-$, and verifies P's signature using $K_P^+$ (which it extracts from Part II).
2. Forwards Part I to H. A also creates its own message to H, which contains two elements: (1) $M_{AH}$, which identifies A and tells H that A has received a request from a P that claims to belong to H; (2) the nonce N, which is extracted from P's message. These two elements are signed by A with $K_A^-$ and encrypted with H's public key $K_H^+$. The entire message is of the form:

$$\{[M_{PH}, N]_{K_P^-}\}_{K_H^+},\; \{[M_{AH}, N]_{K_A^-}\}_{K_H^+}$$

A then waits for a response from H confirming P's membership and role in H. The message it receives from H has two parts, one for A and the other for P. If P is authenticated by H, A forwards the message for P to P, and provides services based on the information provided by H and the contract between H and A.

2.4 Home Organization (H)

On receiving a message from A, H extracts the first portion from P, and verifies P's identity and his/her standing in the organization. It then uses the second portion from A to verify A's identity (to see whether it is a legal contracted service provider). If both P and A are legal, and the nonces N in both parts match, H sends back a message to A. This message contains two parts, one that can be decrypted only by A and the other only by P. Again, to prevent the replay attack, H creates a new nonce N2 for both parts. $M_{HA}$ specifies the standing of P and the quality of services that P is entitled to. $M_{HP}$ says whether A is an authorized service provider and, if so, the quality of services P should receive (later, P can verify the quality of service he/she obtained based on this piece of information). The entire message is of the form:

$$\{[M_{HA}, N2]_{K_H^-}\}_{K_A^+},\; \{[M_{HP}, N2]_{K_H^-}\}_{K_P^+}$$

If the mutual authentication succeeds, A will provide P with a shared key for future secure communications. P starts using the services provided by A.

[Figure 1: The message exchange chart of the home-based authentication protocol. Labels: Peter (at Home), Peter (Traveling), Home, Airport; numbered exchanges 0 to 8; Internet access and other services; service differentiation.]

2.5 Protocol Summary

The nine steps of the HAP protocol, corresponding to Figure 1, are summarized as follows:

1. P is given $C_H$ when connected to H's private network.
2. P moves to A. After obtaining an IP address from A, P sends a "hello" message to A.
3. A replies to P with its $C_A$.
4. P creates and signs two messages, one ($M_{PA}$) for A and the other ($M_{PH}$) for H.
5. A receives the message from P, extracts $M_{PA}$, and uses the information about H to forward $M_{PH}$ to H.
6. H extracts $M_{PH}$ and uses it to determine P's identity and the services P is entitled to. H also extracts $M_{AH}$ and verifies A's identity. If both of them are legal, H creates two messages, $M_{HA}$ (informs A whether P is a valid member and the quality of services P is entitled to) and $M_{HP}$ (informs P whether A is a valid service provider). Both messages are sent to A.
7. A forwards $M_{HP}$ to P if P has been authenticated as a valid member of H.
8. P negotiates a shared key with A and begins using A's services if A has been authenticated as a valid service provider.
9. A provides service differentiation for P based on his/her role at H and the contract between H and A, which is implemented by the policy enforcement module in A, as shown in Figure 2. The figure shows all the modules involved in this protocol.

[Figure 2: The architecture of the HAP Prototype. Entities: Home, Service Provider, Nomadic Client. Modules: HAP Server, DHCP Server, Contract Database, Role Table, Policy Enforcement, HAP Component (Interception), Local Services, DHCP Client, HAP Client.]

2.6 Advantages of the HAP Protocol

Compared with existing authentication and authorization mechanisms, the HAP protocol has several unique features.

• The trust establishment procedure is restricted to two parties, the home organization and the service provider. The nomadic client is an extension of the home. Therefore, HAP does not need a global authentication or a centralized CA.
• By using the home organization to perform authentication, the HAP protocol is a completely decentralized and scalable approach.
• The fact that in the HAP protocol each message identifies the sender and verifies the receiver at the same time considerably aids the process of the mutual authentication.
• Resource-constrained clients need not maintain up-to-date CA lists and CRLs. Adding and removing users, roles, and permissions are greatly simplified.
• Authentication and authorization are integrated together seamlessly in the HAP protocol.

3 Application-Aware Service Differentiation

Continuing with our previous scenario, once Peter has been authenticated by the service provider, services are provided to him based on the prenegotiated contract between his organization and the service provider.¹ This section describes the details of a novel service differentiation algorithm called AASD that takes user (application-level) requirements into consideration when allocating network resources. AASD also follows changes of the Internet path bandwidth and adapts to maximize the system performance. We first present the performance metrics used for evaluation.

¹ Although our service differentiation scheme uses HAP for authentication, any other authentication protocol may be used, such as Kerberos [6].

3.1 Performance Metrics

The service providers care about what percentage of their wireless resources is being utilized in order to get the maximum profit. For the end users, what matters is whether the reserved bandwidth they paid for can be satisfied. We define two metrics to measure the performance of service differentiation, from the perspective of the service provider and the end user respectively.

Fairness: Fairness tries to guarantee that each user gets the bandwidth he deserves. Fairness is a relative concept. Assume user A sends out n connections, $bw_i$ is the bandwidth of connection i, and $BW^{A}_{reserv}$ is the reserved bandwidth of user A. We define the fairness as follows:

$$\text{Fairness of } A = \frac{\left(\sum_{i=1}^{n} bw_i\right)/n}{BW^{A}_{reserv}} \qquad (1)$$

Utilization: Utilization shows the usage of wireless resources. For a given number of users, say n users, in time interval $T_{interval}$, user i receives $size_i$ bytes in total and the wireless link bandwidth of the access point is $BW_{ap}$. We can calculate the utilization by the following equation:

$$\text{Utilization} = \frac{\left(\sum_{i=1}^{n} size_i\right)/T_{interval}}{BW_{ap}} \qquad (2)$$

3.2 User Behavior

Our proposed algorithm is inspired by previous studies on the user behavior of Web surfing. Balachandran et al.'s recent study, based on a trace collected at an academic conference (August 2001), found that the major part of the user traffic is HTTP traffic [7]. We believe that in other public places, such as airports and shopping malls, the portion of HTTP traffic will be much larger than the results observed in [7], where conference attendees are all computer professionals who prefer command line access, such as SSH. Thus, in this paper we consider HTTP traffic only. Note that the same technique proposed here can be applied to other applications, such as Telnet, SSH, and so on.

Regarding the Web traffic, the study performed by Barford and Crovella [8] shows that for Web surfing not much time is spent on the data transfer; on the contrary, most of the time the user is idle, and this is referred to as the user 'think' time. The intensity of service demand generated by a population of some known number of users can be measured in user equivalents (UEs). A user equivalent is defined as a single process in an endless loop that alternates between making requests for online content and lying idle. Both the Web file requests and the idle times must exhibit the distributional and correlational properties that are characteristic of real Web users. Each UE is therefore an ON/OFF process. Statistical models show that Active OFF times follow the Weibull distribution and Inactive OFF times follow the Pareto distribution. We will refer to periods during which files are being transferred as ON times, and to idle times as Active OFF times if they fall between embedded item URLs or Inactive OFF times if they fall between page URLs, as in Figure 3.
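As an illustration of the user-equivalent definition above, the following Python sketch generates the ON/OFF timeline of a single UE. It is a minimal sketch under stated assumptions, not the SURGE implementation: the names `user_equivalent` and `idle_fraction`, the fixed ON time, the fixed number of objects per page, and all distribution parameters are hypothetical choices made only for illustration.

```python
import random

# All numeric values below are illustrative assumptions, not taken from the text.
ACTIVE_OFF_SCALE = 1.46    # assumed Weibull scale (seconds)
ACTIVE_OFF_SHAPE = 0.382   # assumed Weibull shape
INACTIVE_OFF_MIN = 1.0     # assumed Pareto minimum k (seconds)
INACTIVE_OFF_ALPHA = 1.5   # assumed Pareto shape


def user_equivalent(pages, objects_per_page, on_time, rng):
    """Yield (state, seconds) pairs for one UE that fetches `pages` pages.

    Simplification (assumption): every object transfer takes the same `on_time`.
    """
    for _ in range(pages):
        for obj in range(objects_per_page):
            yield ("ON", on_time)
            if obj < objects_per_page - 1:
                # Gap between embedded item URLs of the same page.
                gap = rng.weibullvariate(ACTIVE_OFF_SCALE, ACTIVE_OFF_SHAPE)
                yield ("ACTIVE_OFF", gap)
        # Think time before the next page URL; paretovariate() returns values >= 1.
        think = INACTIVE_OFF_MIN * rng.paretovariate(INACTIVE_OFF_ALPHA)
        yield ("INACTIVE_OFF", think)


def idle_fraction(events):
    """Fraction of the timeline spent in either OFF state."""
    total = sum(seconds for _, seconds in events)
    idle = sum(seconds for state, seconds in events if state != "ON")
    return idle / total


if __name__ == "__main__":
    timeline = list(user_equivalent(3, 4, 0.2, random.Random(0)))
    print(f"idle fraction: {idle_fraction(timeline):.2f}")
```

With short transfers and heavy-tailed think times, the idle fraction of such a timeline is typically large, which is the unused reserved bandwidth that the rest of this section sets out to exploit.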
[Figure 3: The ON/OFF model used in SURGE (HTTP 1.0) [8]. Labels: User Requests Page; Page URL; Embedded Item URL; URL 1, URL 2, URL 3; ON; Object; OFF; Active OFF; Inactive OFF; Requested Page received; User requests Next Page.]

Statistical models show that Active OFF times follow the Weibull distribution and Inactive OFF times follow the Pareto distribution. Consider the Inactive OFF times. For a given time τ = 10 (here k = 1 and α = 1.5),

$$\int_{1}^{\tau} \alpha k^{\alpha} x^{-(\alpha+1)}\,dx = 1 - \tau^{-1.5} \approx 96.84\%$$

It means that about 96.84% of users have an average idle time of 1 to 10 seconds between their clicks; this is the users' 'think' time. During the think time, the bandwidth allocated to the user is wasted. In order to utilize the wasted bandwidth, we propose the application-aware service differentiation algorithm described in the following.

3.3 Design

The goal of a service differentiation algorithm is to efficiently allocate wireless bandwidth to users, and at the same time to satisfy both the fairness and the utilization metrics.

3.3.1 Overview

As we have seen in Section 3.2, the user's reserved bandwidth is usually not fully utilized during the connection time. Our algorithm is based on the queue-based access control model, as shown in Figure 4, and takes advantage of this underutilized bandwidth.

[Figure 4: The basic model of the application-aware service differentiation algorithm. Labels: Client 1, Client 2, ..., Client n; Agent; Service Queues (entries: T_delay, T_download, URL); GETURL; Object size cache; Bandwidth cache; Internet.]

The proposed application-aware service differentiation (AASD) algorithm consists of several modules: an agent, a queue module, a GetURL module, an object size cache, and a server bandwidth cache. Each agent is responsible for receiving requests from one user. Each agent looks up the requested object's size and the server bandwidth in the caches. Based on this information, the agent inserts the user request at an appropriate position in a proper queue, and then updates the object size and server bandwidth caches if necessary. Once a response has been received, the agent sends it back to the client. The queue in our design, which holds requests from clients, is different from the conventional FIFO queue, because a client's request may be inserted into any position in any queue. The GetURL module performs four functions: taking the head of the queue, getting the URL object from the Internet, monitoring the dynamic Web bandwidth and object size, and subsequently notifying the corresponding agent of the response.
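The interplay of the agent, the two caches, and the non-FIFO service queues described above can be sketched in Python as follows. This is only one plausible reading, not the authors' implementation: the class names `Entry`, `ServiceQueue`, and `Agent`, the default cache values, the use of a caller-supplied `t_delay` as the ordering key, and the "closest bandwidth" rule for picking a queue are all assumptions introduced for illustration.

```python
import bisect
from dataclasses import dataclass, field

DEFAULT_OBJECT_SIZE = 14_000    # bytes; assumed fallback for an unseen URL
DEFAULT_SERVER_BW = 1_000_000   # bits per second; assumed fallback for an unseen server


@dataclass(order=True)
class Entry:
    t_delay: float                       # sort key (assumed): how long the request may wait
    url: str = field(compare=False)
    client: str = field(compare=False)
    size: int = field(compare=False)


class ServiceQueue:
    """Queue that accepts inserts at any position but is served from the head."""

    def __init__(self, bandwidth):
        self.bandwidth = bandwidth       # service bandwidth of this queue, bits per second
        self.entries = []

    def insert(self, entry):
        # Keeps entries ordered by t_delay; ties stay in arrival order.
        bisect.insort(self.entries, entry)

    def pop_head(self):
        # What a GetURL-like worker would call to fetch the next request.
        return self.entries.pop(0) if self.entries else None


class Agent:
    """Receives the requests of one client and places them into a queue."""

    def __init__(self, client, queues, size_cache, bw_cache):
        self.client = client
        self.queues = queues
        self.size_cache = size_cache     # URL -> object size in bytes
        self.bw_cache = bw_cache         # server name -> bandwidth in bits per second

    def handle(self, url, server, t_delay):
        size = self.size_cache.get(url, DEFAULT_OBJECT_SIZE)
        server_bw = self.bw_cache.get(server, DEFAULT_SERVER_BW)
        # Assumed rule: pick the queue whose bandwidth is closest to the cached value.
        queue = min(self.queues, key=lambda q: abs(q.bandwidth - server_bw))
        queue.insert(Entry(t_delay, url, self.client, size))
        return queue
```

A sorted list with `bisect.insort` is chosen here because it makes "insert at any position, serve from the head" explicit; a heap would serve the head equally well but would hide the positional ordering.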
3.3.2 Multiple Service Queues

It is common knowledge in the network community that the real end-to-end Internet bandwidth to a specific Web server is usually much smaller than the total available upstream bandwidth, due to multiple factors such as network congestion, processing overhead, and router delays. Further, the available bandwidth between an access point and a Web server changes dynamically. Thus, using one queue for multiple clients and multiple servers probably cannot fill up the total wireless upstream bandwidth, given the head-of-queue blocking effect. Hence, in AASD, we divide the total upstream bandwidth into discrete service queues based on different individual bandwidth limits, as shown in Figure 4.

However, how to allocate servers to these service queues is a problem. To address this problem, we examined the top five Web servers accessed by the users of a medium-size research institution. We observed that the available bandwidth between the clients in the institution and the Web servers follows the exponential distribution [9], and this relationship can be used to decide the allocation of multiple service queues. We call these queues exponential weighted queues (EWQs). Suppose there are n queues; the bandwidth ratio of these queues would be $1 : 2 : 2^2 : 2^3 : \ldots : 2^n$.

Another reason to choose EWQs is to handle service downgrade dynamically. For example, in the case of a traffic jam, bandwidth decreases significantly, so some very small bandwidth queues can be used to take care of seriously jammed Web sites. Suppose we predict the Internet bandwidth for an incoming request as 4Mbps and proactively put it into the 4Mbps service queue, but when GetURL serves this request it finds that the actual Internet bandwidth is, say, 1Mbps, which is much slower than predicted. This means the object cannot finish downloading in the time we calculated. If the GetURL module keeps serving this request, all the following requests in this queue will be delayed. With several low-bandwidth service queues, such as 2Mbps and 1Mbps, in hand, GetURL can migrate this delayed request into an appropriate slower queue.

Each individual queue is different from the traditional first-in-first-out (FIFO) queue: a client's request can be inserted into any position in the queue. The GetURL module only takes out the head of the queue and processes the request, so that at any instant only one request is being served for each queue. Each entry of a service queue includes three important elements: $T_{delay}$ is the maximum delay time before serving this request, $T_{download}$ is the estimated download time of the requested object, and URL is the location of the object, as shown in Figure 4. The entries are ordered by increasing $T_{delay}$, so that the following formula holds:

$$T^{i}_{delay} \le T^{i+1}_{delay}, \quad i = 1, 2, \ldots, n \qquad (3)$$

Based on the statistical results on the Web object size distribution, most Web pages are no more than 14KB [10]. As a result, with high probability the following formula holds:

$$T^{i}_{delay} + T^{i}_{download} \le T^{i+1}_{delay} \qquad (4)$$

It means that downloading object i probably will not delay serving the (i+1)th request. Even if the inequality does not hold, the GetURL module will dynamically migrate the request to an appropriate lower-bandwidth service queue. $T_{delay}$ and $T_{download}$ are calculated using the following two formulas, where $S_{obj}$ is the object size, $B_{user}$ is the user bandwidth, and $B_{svr}$ is the Web server bandwidth:

$$T_{delay} = \frac{S_{obj}}{B_{user}} - T_{download} \qquad (5)$$

$$T_{download} = \frac{S_{obj}}{B_{svr}} \qquad (6)$$

Simply speaking, as long as the queues are not empty the wireless link will be fully utilized. So the AASD algorithm combines multiple lightly loaded user sessions with one (or more) heavily loaded service queue to improve the link utilization and maintain the user fairness at the same time.

3.3.3 Two Important Modules

Figure 5 shows the steps performed by an agent while handling one request-response cycle. The computation of the download time and delay time is shown in the figure as well.

    Agent {
        Parse URL;
        Get object size by accessing object size cache;
        Get server bandwidth by accessing server bandwidth cache;
        Calculate download time T_download = object_size / server_bandwidth;
        Calculate delay time T_delay = object_size / user_reserved_bandwidth;
        Choose the proper queue whose service bandwidth is closest to server bandwidth;
        Insert request into position i so that T^i_delay ...
        Go to sleep;
        Waken up by GetURL module;
        Update object size or server bandwidth if necessary;
        Send response back to client;
    }

    GetURL {
        for(;;) {
            Get head of the queue;
            Get response from web server based on URL;
            Monitor the server bandwidth;
            If T_download is used up
                transfer request to the proper queue whose service bandwidth is closest to the real server bandwidth;
            else
                calculate object size and the new server bandwidth;
            Wake up agent;
        }
    }

Figure 6: Pseudocode of the GetURL module.

The pseudocode of the GetURL module is listed in Figure 6. To decide the bandwidth for a specific Web server, a default value is used at the beginning. Later on, the Internet path bandwidth is calculated from the latest bandwidth that each connection observed and the bandwidth value of the immediately previous connection. The new available server bandwidth ($B_{i+1}$) is calculated as follows: