Mobile Networks and Applications 3 (1998) 419–431
WebExpress: A client/intercept based system for optimizing Web browsing in a wireless environment
Barron C. Housel (a), George Samaras (b) and David B. Lindquist (a)
(a) IBM Corporation (BRQA/502), P.O. Box 12195, RTP, NC 27709, USA
(b) Department of Computer Science, University of Cyprus, CY-1678 Nicosia, Cyprus
This paper describes an application model and software technology that make it possible to run World Wide Web applications in wide area wireless networks. Web technology, in conjunction with today’s mobile devices (e.g., laptops, notebooks, personal digital assistants) and the emerging wireless technologies (e.g., digital cellular, packet radio, CDPD), offers the potential for unprecedented access to data and applications by mobile workers. Yet the limited bandwidth, high latency, high cost, and poor reliability of today’s wireless wide-area networks greatly inhibit (to the point of infeasibility) supporting such applications over wireless networks. This paper presents the Client/Intercept computational model that makes it possible to run such distributed applications efficiently in wide area wireless networks. Furthermore, it presents WebExpress, a client/intercept based system for optimizing Web browsing that reduces data volume and latency of wireless communications by intercepting the HTTP data stream and performing various optimizations, including file caching, forms differencing, protocol reduction, and the elimination of redundant HTTP header transmission. This paper describes these optimizations and presents some experimental results.
1. Introduction

The growth in mobile computing devices¹ and the emergence of wide area wireless technologies (e.g., Ardis² [6], Mobitex³ [7], Cellular Digital Packet Data (CDPD) [4], GSM [21], and Personal Communication Service [20]) pave the way for rapid growth in mobile wireless communications. Mobile computing brings about a new paradigm of distributed computing in which communications may be achieved through wireless networks and users can compute even when they relocate from one support environment to another. The impact of wireless computing on system design goes beyond the networking level and directly affects data management and data computational paradigms. Mobile computing is indeed extremely important. Industry and academia should not concentrate their efforts on developing a computational model that fits only one new type of application, but rather one that is able to support multiple types of application models, including current and emerging TCP/IP applications, terminal emulation, dial services and emerging mobile application paradigms. A key point for mobile computing’s wide acceptance is the user’s efficient access to applications [24]. This includes access to legacy and existing applications with minimum effort and cost, and cost-effective and reliable support for new applications.

The wireline communication and computational paradigm is based on the client/server model [18], where the client directly communicates various requests to the server(s) (see figure 1⁴). Wireless links, however, have unique characteristics that have to be recognized and handled by the communicating application models. Wide area wireless connectivity has four major negative characteristics that render the existing client/server paradigm inadequate in practice, namely:
• High cost;
• High latency;
• Low bandwidth;
• Low reliability.

Figure 1. Wireless computational models.

¹ E.g., laptops, notebooks, personal digital assistants (PDAs), personal communications assistants (PCAs).
² Ardis is a registered service mark of the Ardis Co.
³ Mobitex is a registered trademark of Telia Mobitel.
⁴ In figure 1 the client/server model is shown adapted to the wireless environment. Part of the wired link is replaced by a wireless one.
Baltzer Science Publishers BV
B.C. Housel et al. / WebExpress
High cost: The cost per byte transmitted is orders of magnitude greater than in traditional wireline (LAN/WAN) networks [30]. This makes bandwidth consumption a major concern for wireless computing designs. Sending a 10 K HTML page costs from one to two dollars over current wireless networks. We accessed a popular quote server over a packet radio network and incurred a charge of $27. Even after establishing a local cache the second quote was still very expensive, about $0.50. Consequently, connectivity is weak and often intermittent since mobile units may prefer to remain disconnected for long periods of time for reasons of cost.

High latency: The response time for wireless links in WANs is much longer than that of their wireline or wireless LAN counterparts. Indeed, latency is a factor affecting the performance of wireless applications. Systems such as WebExpress depend on communications support (e.g., CDPD [4], ARTour⁵ [8]) that provides TCP/IP over wireless links. Even typical TCP connection times may take between 5 and 15 seconds [9].

Low bandwidth: The capacity of WAN wireless links is limited compared to most wireline links. Wireless communications face many obstacles because the surrounding environment interacts with the signal. Thus, while the growth in wired network bandwidth has been tremendous (in current technology Ethernet provides 100 Mbps, FDDI 100 Mbps and ATM 622 Mbps), products for wide area wireless communication achieve much lower rates. Depending on the technology, the channel bit rate of WAN wireless links ranges from 4800 bps (e.g., Ardis MDC4800) to 19200 bps (CDPD, RD-LAP [27]), with the effective data rate being substantially lower due to latency and retransmission. Furthermore, for wide area radio transmission the error rate is often so high that the effective bandwidth is limited to less than 10 Kbps [31]. A WAN wireless link is also shared among all the mobile terminals within range of the wireless base station.
Thus, the data rate realized by any given wireless device is the link speed divided by the number of devices simultaneously using the link (i.e., similar to a multi-drop link). This affects not only bandwidth but latency as well.

Low reliability: Wireless WAN connections are significantly less reliable than wireline connections [25,33]. Wireless connections may be intermittently or permanently disrupted for various reasons: the mobile device goes out of range; the mobile device goes behind a barrier that blocks the signal; or higher layer communications or application protocols time out because of delayed responses. Disruptions may be hidden from the wireless application by error recovery procedures in the link-level and higher layer communications protocols. However, this often results in excessive retransmissions (and extra cost), not to mention user frustration that may result in request resubmissions.

These limitations of wireless communications result in a form of distributed computing with drastically different connectivity assumptions than in traditional distributed systems. New computational paradigms are needed to achieve a usable system. The client/agent/server [9,14–16] paradigm, while appropriate for certain new types of applications, does not efficiently support existing applications since it requires changes to the client code and optimizes communication only from the agent to the mobile client. We propose a variation of the client/server model, called Client/Intercept, that aims, based on interception techniques, to alleviate the negative characteristics of the wireless link. The client/intercept (or client/intercept/server) model achieves this aim efficiently and transparently without the need for changes in the client and server application code. It uses a pair of components that run on the client system and within the server’s wireline network for the interception and optimization of various wireless requests. It applies to a class of client/server applications; the optimizations used, while generic, are implemented in a way specific to the chosen type of applications.

WebExpress is such a client/intercept system for optimizing Web access. It reduces the traffic over wide area wireless links in the context of World Wide Web (simply Web) client/server applications. It reduces data volume and latency by intercepting the HTTP data stream and performing various optimizations, including file caching, forms differencing, protocol reduction, and the elimination of redundant HTTP header transmission. While other wireless systems, e.g., [19], share intercept characteristics, none, to our knowledge, base their formal implementation on the client/intercept model. WebExpress is, so far, the only true client/intercept system. The system described in [19] provides only a client-side file system agent, called Venus, responsible for coping with the consequences of mobility through intercepting, caching and prefetching.

⁵ ARTour is a registered trademark of the IBM Corp.
This paper presents the mobile computational paradigms (client/agent/server and client/intercept models) and describes the WebExpress system as an instance of the client/intercept model. Section 2 presents the wireless computational models. Section 3 briefly describes the Web, the need for the WebExpress system, the HTTP transport protocol, and its inefficiencies for wireless communication. Section 4 describes the WebExpress intercept model and summarizes the optimizations that it uses (i.e., caching, differencing, protocol reduction and header reduction). Sections 5, 6 and 7 describe the implementations of these optimizations. Performance improvements and experimental results of the WebExpress system are presented in section 8. WebExpress for mobile users, related work, and comparison with other web browsing systems are discussed in sections 9 and 10. Section 11 concludes the paper.
2. Wireless computational models

As noted above, the ideal wireless computational model should efficiently support existing client/server based applications without any changes to the client or server application code, in addition to supporting other new types of applications.

The client/agent/server model (see figure 1) uses messaging and queuing infrastructure for communications from the mobile client to the agent. Architectures based on such models somewhat alleviate the impact of the limited bandwidth and the poor reliability of the wireless link by continuously maintaining the client’s presence via the agent on the fixed network. While well suited for some applications (i.e., emulators for host based applications) in the mobile environment, the weakness of this model is that it still requires changes to the client code for the development of the client/agent interaction. Although there is no need for changes to the server application, there is a clear need for intimate knowledge of server interaction. Finally, the agent can directly optimize only data transmission over the wireless link from the fixed network to the mobile client and not vice versa.

Client/Intercept/Server (see figure 1) is a computational model that is able to satisfy the needed requirements. The model uses an intercept technique that enables the application to intercept and control communications (and computations, depending on the type of the application) over the wireless link for the purpose of reducing traffic volume and optimizing the communication protocol to reduce latency. The components that implement the intercept technique are inserted in the data path between the client and the server. The Client Site Intercept (CSI) process runs in the end user mobile device. The Server Site Intercept (SSI) process runs within the wireline network. The CSI intercepts a client’s request and, together with the SSI, performs optimizations to reduce data transmission over the wireless link, improve data availability and sustain the mobile computation uninterrupted. The CSI/SSI pair aims to transparently minimize the effect of the wireless link.
From the point of view of the client, the CSI appears as the local server proxy that is co-resident with the client. Similarly, the SSI appears as the local client proxy that resides on the fixed network, typically co-resident with the server. In essence the Client/Intercept/Server model adapts the client/server model to wireless computing; the CSI/SSI pair minimizes the effect of the wireless link. The intercept model offers a number of advantages: it is transparent to both the client and server, therefore can be employed with any client application and be insensitive to the development of the particular client/server application or communication technology. The CSI/SSI protocol can facilitate highly effective data reduction and protocol optimization without limiting client functionality or interoperability. While well suited to the new mobile environment, the weakness of this new model is that every application that is to be accessed requires development work, for the implementation of the CSI/SSI pair, in the server and client sites. The CSI/SSI pair is not developed for every instance of an application; its development and optimizations are generic for a particular type of applications. For example, Web and mobile file [19] applications might require their
CSI/SSI interaction to facilitate different and application-specific optimizations. Cache reconciliation in Web applications requires different and less elaborate processing than reconciling cached data in file or database applications [16,19,22].

The rest of the paper describes WebExpress, a software system that reduces the traffic over wide area wireless links in the context of Web client/server applications. We measured 60% to 99% reductions in wireless network traffic and 36% to 97% improvements in application response time. WebExpress provides this traffic reduction using the client/intercept model and is thus transparent to Web clients and servers.

3. WebExpress: a client/intercept wireless Web browsing system

The World Wide Web [1] is rapidly being accepted as a universal access mechanism for network information. The Web is based on the HyperText Markup Language (HTML) [2] and the HyperText Transfer Protocol (HTTP) [3]. HTML provides a common representation for information, and HTTP defines the common protocol for transporting information between Web clients and Web servers. The Web browser serves as the user interface; it is responsible for sending user requests to the appropriate Web server (typically via a Web proxy gateway) and formatting and displaying HTML data streams returned to the client device.

Web browsers serve as user interfaces to many forms processing applications in addition to their uses for information retrieval and browsing. While browsers currently lack certain features useful for forms processing (e.g., field editing and flexible display formatting), their widespread availability, the availability of the Internet to offer worldwide any-to-any connectivity, the ease with which users can create forms using HTML, and the ability to easily write Web application servers (using the HTTP defined Common Gateway Interface (CGI)) make the Web very attractive (and inexpensive) for a large class of forms-based applications.
In contrast to random surfing (i.e., ad hoc Web browsing), WebExpress is aimed at routine repetitive commercial applications, such as:
• Visiting nurse or medical personnel;
• Mobile salespersons that need to query product data, place orders, and check credit;
• Service workers involved with equipment repair or checking warehouses for parts.
The popularity of the Web suggests that such browsers may offer a compelling end user interface for many mobile wireless applications. The objective of WebExpress is to facilitate the use of Web technology to run typical commercial transaction processing applications over wireless networks. The predictability of this type of usage makes it possible to employ optimizations that make wireless Web access
practical both from a usability and cost perspective. The successful deployment of WebExpress extends Web technology to a new usage domain. Next, we briefly summarize the HTTP protocol and describe the inhibitors to Web browsing over wireless networks.

3.1. HTTP summary

A Web browser communicates directly with a Web server (or proxy server) over a TCP connection using the HTTP protocol (see figure 2). The user specifies a Universal Resource Locator (URL) to address the object requested. This object may be a stored (HTML) text document or an HTML data stream that is generated by a program. In the latter case, the Web server invokes the program via the Common Gateway Interface (CGI) [32]. The HTML object returned to the Web browser client may contain hyperlinks to other HTML objects and directly embed graphic (GIF, JPEG, TIFF, MPEG, etc.) objects. It is the browser’s responsibility to issue additional requests for the embedded objects (on behalf of the end user) until the document is complete.

Figure 2. Normal Web browsing (HTTP) protocol.

3.2. Inhibitors to Web browsing over WAN wireless networks

In addition to the limitations of wireless communications (limited bandwidth, high latency, high cost, and poor reliability), HTTP presents the following inefficiencies:
• Connection overhead: Each request for an HTML page or graphic object (GIF or JPEG file) requires the browser to open a TCP/IP socket. This operation adds to data overhead and increases the latency. We have observed⁶ connection times in the neighborhood of 5–15 seconds in low speed (e.g., 4800 bps) WAN wireless networks [9].
• Redundant transmission of capabilities: As the HTTP protocol is stateless, the browser must (re)send its capabilities, a list of 200–400 bytes in length, with each request. These capabilities are normally the same each time for any given browser (i.e., client device).
• Verbose protocol: HTTP control information is coded in standard ASCII and employs human-friendly keywords, which increases the number of bytes transmitted per request.

⁶ Experimental results: For packet radio networks, the ping time ranges from 0.5 to 3 seconds. During a connection attempt, there appear to be several retries, probably because the normal TCP/IP window size is not set to accommodate slow wireless links. Thus, a total TCP connection set-up time ranging from 5 to 15 seconds is not unusual. Time measurements (by watch) were taken for RAM and ARDIS but not for CDPD or GSM. Our measurements coincide with the measurements taken by other ARTour developers.
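To give a feel for the scale of these inefficiencies, the following back-of-the-envelope sketch combines the figures quoted above: 5–15 seconds of connection set-up per object and 200–400 bytes of capabilities per request on a 4800 bps link. The page composition (one HTML document plus four graphics), the assumption that objects are requested one after another, and the assumption that the device gets the full channel bit rate are illustrative choices of ours, not measurements from the paper.

```python
# Rough per-page overhead estimate built from figures quoted in the text.
# Assumptions (ours, for illustration only): requests are issued sequentially,
# one TCP connection per object, and the full channel bit rate is available.

LINK_BPS = 4800                 # low-speed WAN wireless channel bit rate
CONNECT_SECONDS = (5, 15)       # observed TCP connection set-up range
CAPABILITY_BYTES = (200, 400)   # capabilities list resent with every request


def overhead_seconds(num_objects, connect_s, capability_bytes, link_bps=LINK_BPS):
    """Seconds spent on connection set-up and capability headers alone."""
    header_seconds = capability_bytes * 8 / link_bps
    return num_objects * (connect_s + header_seconds)


objects = 5  # hypothetical page: 1 HTML document + 4 embedded graphics
best = overhead_seconds(objects, CONNECT_SECONDS[0], CAPABILITY_BYTES[0])
worst = overhead_seconds(objects, CONNECT_SECONDS[1], CAPABILITY_BYTES[1])
print(f"overhead before any content bytes: {best:.1f} s to {worst:.1f} s")
```

Under these assumptions the overhead alone is roughly 27 to 78 seconds for a five-object page, before any page content is transferred.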
While the above overhead can be tolerated in wireline networks, it renders Web access in a wireless environment less attractive because of poor response times and occasional time-outs. Even in wireline networks the inefficiency of HTTP is a problem, as evidenced by performance improvements in HTTP/1.1 [3] and active research to improve HTTP [23]. The remainder of the paper describes the WebExpress Intercept model and the variety of data reduction techniques that it employs: caching, differencing, protocol reduction, and header reduction.

4. The WebExpress intercept model

An important objective of WebExpress is to be able to run with any Web browser (e.g., Netscape, Mosaic, etc.) and any Web server without imposing any changes to either. To accomplish this we use the client/intercept model that enables WebExpress to intercept and control communications over the wireless link for the purposes of reducing traffic volume and optimizing the communications protocol to reduce latency. The WebExpress components that implement this intercept technique are shown in figure 3. Following the model, as described in section 2, we insert two components into the data path between the Web client and the Web server⁷: the WebExpress Client Side Intercept (CSI) process that runs in the end user client mobile device and the WebExpress Server Side Intercept (SSI) process that runs within the wireline network. The CSI intercepts HTTP requests and, together with the SSI, performs optimizations to reduce (Web related) data transmission over the wireless link.

From the viewpoint of the browser, the CSI appears as a local Web proxy that is co-resident with the Web browser (i.e., it resides on the same mobile host but runs in its own process). The CSI communicates with the Web browser over a local TCP connection (using the TCP/IP “loopback” feature⁸) via the HTTP protocol. Therefore, no external communication occurs over the TCP/IP connection between the browser and the CSI.
No changes to the browser are required other than specifying the (local) IP address of the CSI as the browser’s proxy address. The actual proxy (or socket server) address is specified as part of the SSI configuration. The CSI communicates with an SSI process over a TCP connection using a reduced version of HTTP (see section 7). The SSI reconstitutes the HTML data stream and forwards it to the designated Web proxy server. Likewise, for responses returned by Web servers (or proxies), the CSI reconstitutes an HTML data stream received from the SSI and sends it to the Web browser over the local TCP connection as though it came directly from the Web server.

Figure 3. WebExpress intercept model.

The intercept model implemented in WebExpress offers a number of advantages. It is transparent to both Web browsers and Web (proxy) servers and, therefore, can be employed with any Web browser. It is largely insensitive to the development of the rapidly maturing HTML/HTTP technology since WebExpress will let through what it does not understand (e.g., new HTTP access tags, encrypted data, etc.). Although we must parse user requests (URLs), the optimizations utilized by WebExpress are almost⁹ totally independent of HTTP (see sections 5, 6 and 7). Thus, WebExpress does not have to be upgraded to run with new (or different) versions of Web browsers that are available in the market place.¹⁰ The CSI/SSI protocols facilitate effective data reduction and protocol optimization without limiting any of the Web browser functionality or interoperability.

The remainder of this paper describes the WebExpress optimization methods summarized below:
• Caching: Both the SSI and CSI cache graphic and HTML objects. If the URL specifies an object in the CSI’s cache, it is returned immediately as the browser response. The caching functions guarantee cache integrity within a client-specified time interval. The SSI cache is populated by responses from the requested Web servers. If a requested URL received from a CSI is cached in the SSI, it is returned as the response to the request. As we shall see later, the cache plays an important role in the differencing function.
• Differencing: Each new CGI request to a particular URL may well result in a different response (e.g., a stock quote server). Essential to differencing is caching a common base object on both the CSI and SSI. When a response is received, the SSI computes the difference between the base object and the response and then sends the difference to the CSI. The CSI then merges the difference with its base form to create the browser response. This same technique is used to determine the difference between HTML documents.
• Protocol reduction: Each CSI connects to its SSI with a single TCP/IP connection. All requests are routed over this connection to avoid the costly connection establishment overhead. Requests and responses are multiplexed over the connection. A similar approach has recently been proposed in the new HTTP Proposed Standard, HTTP/1.1 [3].
• Header reduction: Currently the HTTP protocol is stateless, requiring that each request contain the browser’s capabilities, called access lists. For a given browser, this information is the same for all requests. When the CSI establishes a connection with its SSI, it sends its capabilities only on the first request. This information is maintained by the SSI for the duration of the connection. The SSI includes the capabilities as part of the HTTP request that it forwards to the target server (in the wireline network).

⁷ Henceforth, the term Web server can be a proxy server, a socket server, or the target Web server.
⁸ Interference might occur in case other applications are listening to the same port. Care must be taken during system set up time to prevent such interference.
⁹ WebExpress is sensitive to the HTTP protocol to the extent that it processes the HTTP “last modified” tag to determine caching criteria.
¹⁰ The new HTTP draft, HTTP/1.1, required no changes to WebExpress.
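The differencing step summarized in the list above (the SSI diffs a response against a base object it shares with the CSI, and the CSI merges the difference back into its copy) can be sketched roughly as follows. This is a minimal illustration under our own assumptions, not the WebExpress differencing engine: Python's difflib.SequenceMatcher stands in for the differencing technology, zlib.crc32 stands in for the CRC, and the function names and the tuple encoding of the copy/insert commands are hypothetical.

```python
import zlib
from difflib import SequenceMatcher


def crc(data):
    """CRC used to check that both sides hold the same base object."""
    return zlib.crc32(data) & 0xFFFFFFFF


def make_diff(base, new):
    """SSI side: encode `new` as copy/insert commands against `base`."""
    ops = []
    matcher = SequenceMatcher(None, base, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already held by the client: send only (offset, length).
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # New bytes that must actually cross the wireless link.
            ops.append(("insert", new[j1:j2]))
        # "delete": the base bytes are simply never copied.
    return ops


def apply_diff(base, ops):
    """CSI side: rebuild the new response from the base and the commands."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def ssi_reply(ssi_base, client_base_crc, response):
    """Send a difference only if the client's base matches the SSI's base."""
    if ssi_base is not None and crc(ssi_base) == client_base_crc:
        return ("diff", make_diff(ssi_base, response))
    return ("full", response)


# Hypothetical stock-quote pages: same formatting, different symbol and price.
base = b"<html><body><h1>Quote</h1><p>Symbol: ABC</p><p>Price: 101.25</p></body></html>"
new = b"<html><body><h1>Quote</h1><p>Symbol: XYZ</p><p>Price: 47.50</p></body></html>"

kind, payload = ssi_reply(base, crc(base), new)
rebuilt = apply_diff(base, payload) if kind == "diff" else payload
sent = sum(len(op[1]) for op in payload if op[0] == "insert") if kind == "diff" else len(payload)
print(kind, rebuilt == new, f"{sent} literal bytes instead of {len(new)}")
```

The CRC comparison is what makes the reconstruction safe: copy commands are only meaningful if both sides apply them to byte-identical base objects, so the sketch falls back to sending the full response whenever the CRCs disagree.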
5. Caching

The WebExpress caching methods significantly reduce the volume of application data transmitted over the wireless link. These cache methods are designed for browsing of stored documents or files that change relatively infrequently. Information that does change frequently is handled by the differencing methods described in the next section.

Today’s Web browsers offer a variety of cache technologies and cache management options. In general, their caches are designed to meet the needs of wired users where surfing is common; cache methods trade inexpensive network bandwidth for reduced storage consumption. These cache methods are designed around a browser session, where we define a browser session to be the start and close of a browser application. These methods either purge the cache at the end of a session or let the cache objects persist across sessions with updates occurring once on first reference per session. Advanced user options are often available to cause cached objects to be updated on every reference, or to never be updated.

Our experience with wireless systems indicates that cache methods designed for wired networks are not well suited for wireless users of Web applications. We found that cross-session persistence of cached objects is critical, that update methods should be optimistic techniques that look for changes to stored objects based on elapsed time, and that changed objects should be updated through differencing rather than a complete object update. WebExpress attempts to maximize the value of its client cache by blending these requirements with some additional user controls such that the cache management technique is adaptable to the unique characteristics of each application.

Figure 4. WebExpress caching.

As illustrated in figure 4, WebExpress supports client and server caching. The algorithm used to manage the client cache is basically Least Recently Used (LRU), with a user option to specify indefinite persistence of specific objects. It is our intention to maximize client cache efficiencies through cache methods that allow a user to declare frequently accessed or critical information for object persistence and to enable the system to adapt to the browsing patterns of the individual user. The server cache, which is also LRU managed, is designed to adapt to the browsing patterns of a set of users. Specifically, information retrieved by one user may be reused by others, avoiding delays associated with retrieving information from the Web server (see also section 6). Server prefetching based on the client’s surfing profile is also being considered.

Objects loaded into either the WebExpress client or server caches persist across browser sessions. This decision increased the cache hit ratios but presented us with a cache coherency problem. We needed a mechanism to detect when objects change and the ability to update changed objects, without over-utilizing the wireless link. To satisfy this requirement we designed the cache coherency methods to be based on the age of information. This approach allows WebExpress users to adjust the frequency of update requests to the dynamics of each Web application.

To provide this cache coherency model, WebExpress associates a digital signature (CRC), computed via a cyclic redundancy check algorithm [28,29], and a coherency interval (CI) with each cached object. The coherency interval specifies when the object is to be checked for changes. The coherency interval, measured in minutes, is set by each user or an administrator as a default for all cache objects. The user may override the default coherency interval for any given set of cached objects.
When a cached object is referenced, the CSI checks to see if the coherency interval has been exceeded. If it has not, the cached page is used. If the coherency interval has been exceeded, the CSI and SSI execute a protocol to determine if a fresh copy of the page has to be fetched. Essentially, the CSI requests that the SSI verify that the object in question has not changed; the CSI provides to the SSI the object’s URL, a CRC of the object, and the coherency interval associated with the object. The SSI attempts to satisfy the coherency request with the contents of its cache, depending upon the age of the object within its cache. If the object within the SSI is too old based on the coherency interval, or the object is not in the SSI cache, the SSI will obtain a new copy of the object from the Web server and enter a Store Date Time (SDT) (indicating the current time the object is accessed) and CRC for the object into its cache directory. Now, if the CRCs of the CSI’s and SSI’s copies match, the SSI indicates to the CSI that the object in question is up to date. The CSI then updates the store date time (SDT) of the object to reflect the SSI’s SDT of the same object; an age scheme is used to avoid clock synchronization problems. In this scheme the time remaining (produced by the SSI) before the server’s copy expires relative to the client’s coherency interval is added to the client’s SDT to produce the updated SDT. This scheme is required since the SDT at the server might be different due to requests of other clients (for more explanation of why the SDTs might be different see section 6). If the object has changed, based on the CRC comparison, the SSI indicates to the CSI that the object is out of date and sends the updated object to the CSI. The CSI then updates its cache and directory appropriately. The use of a coherency interval allows users to specify how frequently to update cached Web objects, trading off wireless traffic for cache coherency.
Initiating coherency checks on object references limits wireless traffic to only those objects actually being referenced. An alternative approach, which we may investigate in a future version of WebExpress, would be to initiate a single asynchronous coherency check on browser session start (or CSI startup, or after a period of user inactivity, or based on user input) for all cached objects that are older than the coherency interval. This batch check has the potential to further reduce the per-object wireless latency associated with maintaining cache coherency at the cost of some additional bandwidth utilization.

The WebExpress cache methods reduce the volume (up to almost 100% reduction, see section 8) of application data transmitted over the wireless link through cross-browser-session persistence, age-based coherency algorithms, digital signature based modification verifications, and the user options (CI or object persistence¹¹). When updates to cached objects are required, differencing methods are invoked to further reduce application traffic, as described in the next section.
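The coherency protocol described in this section can be sketched as follows. This is a simplified illustration under our own assumptions rather than the WebExpress implementation: the class and method names are hypothetical, zlib.crc32 stands in for the CRC algorithm, clocks are callables returning minutes, and the age scheme is interpreted as "the SSI reports how old its copy is, and the CSI back-dates its own SDT by that age", so that only durations (never absolute timestamps) cross between the two machines. The handling of a CSI cache miss is also our addition.

```python
import zlib
from dataclasses import dataclass


def crc32(data):
    return zlib.crc32(data) & 0xFFFFFFFF


@dataclass
class Entry:
    body: bytes
    crc: int
    sdt: float  # Store Date Time, in minutes on the *local* clock


class SSI:
    """Server-side intercept: answers coherency checks from its own cache."""

    def __init__(self, fetch, clock):
        self.fetch = fetch    # hypothetical callable: url -> bytes from the Web server
        self.clock = clock    # hypothetical callable: () -> minutes
        self.cache = {}

    def check(self, url, client_crc, ci_minutes):
        now = self.clock()
        entry = self.cache.get(url)
        if entry is None or now - entry.sdt > ci_minutes:
            # Missing or too old for this client's CI: refetch over the wireline side.
            body = self.fetch(url)
            entry = Entry(body, crc32(body), now)
            self.cache[url] = entry
        age = now - entry.sdt  # a duration, so no clock synchronization is needed
        if entry.crc == client_crc:
            return ("fresh", age, None)
        return ("stale", age, entry.body)


class CSI:
    """Client-side intercept: serves from cache while the CI has not elapsed."""

    def __init__(self, ssi, clock, ci_minutes):
        self.ssi, self.clock, self.ci = ssi, clock, ci_minutes
        self.cache = {}
        self.wireless_requests = 0

    def get(self, url):
        now = self.clock()
        entry = self.cache.get(url)
        if entry is not None and now - entry.sdt <= self.ci:
            return entry.body  # within the coherency interval: no wireless traffic
        self.wireless_requests += 1
        client_crc = entry.crc if entry is not None else None
        status, age, body = self.ssi.check(url, client_crc, self.ci)
        if status == "stale":
            entry = Entry(body, crc32(body), now - age)
        else:
            entry.sdt = now - age  # back-date by the age of the SSI's copy
        self.cache[url] = entry
        return entry.body
```

A "fresh" reply carries only a status and an age, so a coherency check on an unchanged object costs a few bytes on the wireless link instead of a full retransmission.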
Figure 5. The problem with CGI-generated pages.
¹¹ “Persistence” means that the object will never be deleted from cache, although it can be refreshed based on the CI.

6. Differencing

The expectation of browsing stored documents or files is that they change relatively infrequently. This fact is fundamental to the caching techniques discussed previously. However, Web browsers and HTML can be used to perform transaction processing. The HTML definition allows the specification of HTML streams that enable users to enter data and then submit the data (or form) for processing by an executable program located elsewhere in the Web. The program is identified by a URL just like a non-executable file, and a command (e.g., GET or POST) is sent from the browser to the server specified by the URL. This command may be coded explicitly or generated implicitly as a result of entering data on a displayed form. Input data (if any) follow the URL as part of the HTTP data stream. The rules for invoking programs and enabling them to read parameter data and generate replies are the responsibility of the Common Gateway Interface (CGI). The term CGI Processing refers to the process of executing programs from Web browsers.

The caching techniques described earlier do not help in CGI processing because no two replies to requests to the same URL are likely to be the same. This is not surprising because users enter different data for different requests and expect to receive different results. Figure 5 shows two different queries to a stock-quote server. Naturally, the reports for Company A (e.g., IBM) and Company B (e.g., Motorola) stocks are different.

To minimize responses from CGI programs, we use a differencing technology. This approach is based on the observation that different replies from the same program (application server) are usually very similar. For example, the replies to stock queries for different companies vary only in numbers (e.g., price) and symbols (e.g., XYZ or ABC). HTML byte streams representing query responses often contain lots of unchanging formatting data (including graphics).

Figure 6. CGI request at time T.

To illustrate the algorithm, we consider two queries to program (Form) X at times T and T + DT as shown in figures 6 and 7. The CSI determines that the HTTP request is a CGI request if the method is POST or if the URL is followed by a name/value parameter list. Initially (time T), there is no record of a cached response for the URL at the CSI, and the request is sent to the SSI and forwarded to the server as normal. When the response is received by the SSI, it is cached (and its CRC is computed) before forwarding it to the CSI. Likewise, the form is cached at the CSI before it is sent to the browser. At this point a base object has been established for the CGI URL. The cache coherency methods described previously are not used for base objects since CGI responses are never coherent; they change with every request.

Now, consider the flow when another request is issued to program X using the same URL at time T + DT (see figure 7). When a request for CGI processing is detected, the CSI checks to see if the URL is cached. In this example, at time T + DT a cached version is found. Now, the CSI forwards the request (i.e., URL plus parameters) to the SSI along with the CRC value of the base object (i.e., the report received for the request at time T). This
B.C. Housel et al. / WebExpress
Figure 7. Request at T + DT.
CRC is maintained as part of the request state. The HTTP data stream is forwarded to the HTTP server to execute the request. Subsequently, a report is received at the SSI. The SSI determines that differencing is possible because a base object for the URL exists in the cache and its CRC matches that received with the request from the CSI. The differencing engine computes the difference stream between the received report and the base object of the URL. A difference stream, consisting of a sequence of copy and insert commands, is sent to the CSI. The CSI update engine uses the difference stream and the request’s base object to reconstruct the new report. The copy commands tell which byte sequences of the base object are to be copied to the new report, and the insert commands cause data received in the difference stream to be copied to the new report during reconstruction. We are guaranteed that the reconstruction is correct because the SSI verified (using CRCs) that the URL’s base objects are equal. Finally, the CSI sends the reconstructed report to the browser for display.
The differencing engine, operating on the application layer, uses well known differencing technology [10,11] that has typically been used in code library maintenance. However, it is important that the differencing engine works well on binary files because the concept of a line does not exist in HTML; indeed, in our experience, programs that generate HTML streams often do not generate carriage-returns or line feeds. Also, the difference processing should be efficient for small files, since up to 80% of the CGI responses are 10 K or less.
Basing clients: An SSI may serve many CSIs (clients). To avoid the SSI maintaining a separate base object (for a given URL) for each client, we need to return the same base object for each client the first time12 it requests a given URL. This is easily accomplished with a slight variation of the logic described in the previous example. When the SSI receives a response from the CGI server and the CRC received from the CSI does not match the CRC of the corresponding base object at the SSI, the SSI computes the difference stream and returns both the base object and the difference stream to the CSI. The CSI caches the new base object before constructing the browser response. The significant point is that the CSI is prepared to receive two objects from the SSI: the difference stream and the new base object. This also handles the problem of rebasing the client, which is necessary when the base objects in the CSI and SSI get out of sync as described below.
Rebasing: Rebasing the client is periodically necessary because, for a variety of reasons, the base objects in the CSI and SSI may become different. An object in either the CSI or SSI may be flushed from the cache as a result of the LRU policy. Alternatively, the base object in the SSI may be updated because the SSI detects that the difference stream has grown beyond a certain threshold, which often indicates that the response CGI data stream has changed for reasons other than different request parameters (e.g., the application changes the format for aesthetics or adds information such as a copyright notice). When the SSI updates its base object, the CSI is rebased (as previously described) the next time the URL is requested.
Differencing for non-CGI responses: Although the above discussion has focused on the use of differencing to accommodate CGI processing, the technique is more generally useful. Presently, differencing is not applied to graphics objects (GIF, JPEG files). However, differencing may produce a dramatic reduction for text HTML files that have incurred minor updates.
12 Recalling that objects in the cache may persist across many activations of the client browser and CSI, the first time is defined as the initial instantiation of the base object in the cache.
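The copy/insert difference stream and the CRC-based basing logic described in this section can be sketched as follows. This is an illustration under stated assumptions, not the WebExpress differencing engine: Python's `difflib.SequenceMatcher` stands in for the differencing technology of [10,11], `zlib.crc32` stands in for the CRC computation, and the function names (`make_diff`, `apply_diff`, `ssi_reply`, `csi_reconstruct`) are hypothetical.

```python
import zlib
from difflib import SequenceMatcher


def make_diff(base: bytes, new: bytes):
    """Build a difference stream of ('copy', start, end) and ('insert', data) commands."""
    commands = []
    matcher = SequenceMatcher(None, base, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            commands.append(("copy", i1, i2))          # bytes already held in the base object
        elif tag in ("replace", "insert"):
            commands.append(("insert", new[j1:j2]))    # bytes that must cross the link
        # 'delete': the base bytes are simply not copied
    return commands


def apply_diff(base: bytes, commands) -> bytes:
    """Reconstruct the new report from the base object and a difference stream."""
    out = bytearray()
    for command in commands:
        if command[0] == "copy":
            out += base[command[1]:command[2]]
        else:
            out += command[1]
    return bytes(out)


def ssi_reply(ssi_base: bytes, client_crc, response: bytes):
    """SSI side: return (new_base_or_None, diff); the base is included only to (re)base the client."""
    diff = make_diff(ssi_base, response)
    if client_crc == zlib.crc32(ssi_base):
        return None, diff            # base objects match: send the difference stream only
    return ssi_base, diff            # mismatch: send the base object and the difference stream


def csi_reconstruct(csi_base, new_base, diff):
    """CSI side: adopt a new base object if one was sent, then rebuild the report."""
    base = new_base if new_base is not None else csi_base
    return base, apply_diff(base, diff)
```

Because the matcher operates on raw bytes rather than lines, the sketch does not depend on the response containing carriage-returns or line feeds.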
7. Protocol reduction
The use of caching and differencing significantly reduces the volume of application data (i.e., HTML and graphic objects) transmitted over the wireless link that connects the client workstation to its wireline backbone network. However, these provisions do not address the overhead of repeated TCP/IP connections and redundant header transmissions. The WebExpress system employs techniques to reduce the overhead in both of these categories.
7.1. Reduction of TCP/IP connection overhead
The normal HTTP protocol is depicted in figure 8. The browser establishes a connection with the server and sends a single request to the server. The server sends a response document and closes its end of the connection. The browser receives the document and closes its end of the connection. For HTML documents this scenario (shown in figure 8 as a pair of arrows) is repeated for each image referenced in the document. It is also repeated every time the user clicks a hyperlink on the displayed page. The continuous opening and closing of TCP/IP connections increases both the response time to user requests and the network traffic [23].
Virtual sockets: As figure 9 illustrates, the WebExpress system eliminates most of the overhead of opening and closing connections across the wireless link by establishing a single TCP/IP connection between the CSI and the SSI. The CSI intercepts connection requests and document requests from the browser and sends the document requests over the single TCP/IP connection to the SSI. For each request received from the CSI, the SSI establishes a connection with the destination server and forwards the request. When the SSI receives the response from the server, it closes the connection with the server and sends the document to the CSI via the single TCP/IP connection, but it does not close this single TCP/IP connection. The CSI then forwards the document to the browser and closes its TCP/IP connection with the browser. The connection setup and takedown overhead is incurred between the browser and the CSI, and between the SSI and the Web server, but not over the wireless link between the CSI and SSI.
WebExpress uses a mechanism called virtual sockets to provide this multiplexing support. Virtual sockets enable a CSI to establish a single TCP/IP connection with an SSI and use the connection for many HTTP requests. Data sent for a given request is prefixed by a small header that contains a virtual socket id, a command byte, and a length field. At the CSI, the virtual socket id is associated with a (real) socket to the browser; likewise, at the SSI the virtual socket id is mapped to a socket connection to an HTTP server. A suite of virtual socket interfaces is defined that corresponds to the TCP/IP socket calls (e.g., open, close, select, etc.). A real TCP/IP connection is established with the first virtual socket open, and is closed when a preset time interval expires after the last virtual socket is closed. In summary, this mechanism permits efficient transport of HTTP requests and responses while maintaining correct HTTP protocol and transparency with respect to Web browsers and servers.
7.2. Reduction of HTTP headers
Figure 8. Normal HTTP connections.
Figure 9. Multiple browser connections over one TCP/IP connection.
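The virtual socket framing of section 7.1 (a virtual socket id, a command byte, and a length field in front of each piece of data) can be sketched as below. The paper does not give field widths or command codes, so the 2-byte id, 4-byte length, the `CMD_*` values, and the function names are assumptions made only for illustration.

```python
import struct

# Assumed header layout: 2-byte virtual socket id, 1-byte command, 4-byte length,
# all in network byte order. The real WebExpress field widths are not given.
HEADER = struct.Struct("!HBI")
CMD_OPEN, CMD_DATA, CMD_CLOSE = 1, 2, 3   # hypothetical command codes


def pack_frame(vsock_id: int, command: int, payload: bytes = b"") -> bytes:
    """Prefix the payload with the virtual socket header."""
    return HEADER.pack(vsock_id, command, len(payload)) + payload


def unpack_frames(buffer: bytes):
    """Split bytes read from the single CSI-SSI connection into complete frames.

    Returns (frames, leftover), where leftover holds an incomplete trailing frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        vsock_id, command, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break                      # wait for the rest of this frame
        frames.append((vsock_id, command, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

With this layout, each multiplexed request costs 7 bytes of framing on the wireless link instead of a TCP connection setup and takedown.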
HTTP requests and responses are prefixed with headers. HTTP request headers contain a list of MIME content-types that tell the server the various document formats the browser can handle. This list can be several hundred bytes in length. Since it is usually the same each time, it is unnecessary to send it across the wireless link in every request. Instead, the CSI allows this information to flow in the first request after the CSI-to-SSI connection has been established. Both the CSI and the SSI save this list as part of the connection state information. For each request received from the browser, the CSI compares the list received with its saved version; if they match, the list is deleted from the request before it is forwarded to the SSI. When the SSI receives a request from the CSI with no access list, it inserts its saved copy into the request header. If an access list is present in the received request, it replaces the saved version at the SSI if one exists. In either event, the correct access list is sent to the server as though there were a direct browser-server connection.
HTTP response headers, unlike the request headers, may be different each time. Typically (as with CGI responses) only a few bytes (e.g., date-time) vary from one response to another. Encoding the constant data (e.g., content-type) can reduce the response header to just a few bytes. We have observed response headers ranging from 73 to 389 bytes. This reduction, while often inconsequential in wireline networks, can be worthwhile when multiplied by all the mobile wireless units sharing a wireless link (i.e., every little bit helps!). The reduction techniques described in this section, including compression, achieve an overall HTTP header reduction of up to 90%.
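The request-header reduction described above can be sketched as a pair of filters, one at each end of the CSI-to-SSI connection. This is a simplified illustration, not the WebExpress implementation: headers are modeled as a dictionary, the content-type list is assumed to travel in the HTTP `Accept` header, every browser request is assumed to carry that list, and the class names are hypothetical.

```python
class CsiHeaderFilter:
    """Client side: drop the content-type list when it matches the saved copy."""

    def __init__(self):
        self.saved_accept = None   # part of the connection state

    def outgoing(self, headers: dict) -> dict:
        accept = headers.get("Accept")
        if accept is not None and accept == self.saved_accept:
            # Unchanged list: remove it before crossing the wireless link.
            return {k: v for k, v in headers.items() if k != "Accept"}
        self.saved_accept = accept  # first request, or the list changed
        return dict(headers)


class SsiHeaderFilter:
    """Server side: restore the list so the Web server sees the full request."""

    def __init__(self):
        self.saved_accept = None   # part of the connection state

    def incoming(self, headers: dict) -> dict:
        restored = dict(headers)
        if "Accept" in restored:
            self.saved_accept = restored["Accept"]    # new list replaces the saved one
        elif self.saved_accept is not None:
            restored["Accept"] = self.saved_accept    # reinsert the saved copy
        return restored
```

In either case the server receives the same header set the browser sent, which is what keeps the optimization transparent to both ends.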
8. Usage scenarios and results
To demonstrate the effectiveness of WebExpress, we ran a series of transactions against two applications on active Web sites in the Internet: a DB2 World Wide Web Connection demo application and a popular quote server application. The results are summarized in table 1. The bytes transferred and the elapsed response time were recorded for each transaction. Each test case was evaluated with and without WebExpress. The Base column represents the measurements made without the use of WebExpress.
To achieve maximum benefit and usability of WebExpress, the contents of the cache were established in the mobile device (using fast, cheap wireline communications) before the wireless access was attempted, i.e., the cache was prewarmed. To prewarm the client cache, users can run through application scenarios over the fixed network.
The test environment was not a controlled environment. These tests were performed on a production Mobitex [26] network operated at 8,000 bps, connected to an enterprise network which was connected to the Internet. Consequently, the utilization of the various network and server components varied during our measurements, typical of a production environment.
Test cases 1, 2 and 3 correspond to the DB2 application. The DB2 application consisted of: querying product information, querying and updating a customer profile and querying a parts order. This resulted in 10 Web pages, 7 documents and 3 forms, totaling about 30,000 bytes (including images). With all three test cases we exercised all 10 pages of the complete application. Test case 1 was the first pass through the application during peak hours, between 10 am and 4 pm. Test cases 2 and 3 were subsequent passes through the application with test case 2 occurring during peak hours and test case 3 occurring after 10 pm.
With the first test case, the initial pass through the application, we evaluated a worst case scenario for the browser since the browser’s cache did not contain any of the application data. Although this is a worst case, it is a reality for a mobile worker when the browser does not support a persistent cache or refreshes pages on first time reference within a session. Without WebExpress each Web page and its corresponding images were fetched by the browser, generating 56 KB of traffic and taking in excess of 20 minutes to
complete.13 With WebExpress, the persistent cache methods satisfied the browser requests for the application documents and images. The differencing and communication methods were able to update the query reports with 2 KB of network traffic. The complete elapsed time of the job was reduced from 20 minutes to under 3 minutes during peak network utilization. In test cases 2 and 3, the browser cache contained all the documents and images in its memory cache. Without WebExpress the three queries (CGI processing) generated about 9,600 bytes of network traffic during peak hours and about 4,900 bytes during off hours. We believe that the difference between the measurements was due to the amount of packet retransmission during the peak hours. With WebExpress, the network traffic did not vary much between test cases 1 and 2 as we expected since the queries still needed to be resolved. And with test case 3 again we measured a significant drop in network traffic during the off hours. Test cases 4–6 correspond to the quote server application. The quote server application consisted of a home page, an input form and a report. This resulted in 3 Web pages, totaling about 50,000 bytes (including images). Like the DB2 tests, we exercised the complete application for each test case. Test case 4 was the first pass through the application during peak hours, between 10 am and 4 pm. Test cases 5 and 6 were subsequent passes through the application with test case 5 occurring during peak hours and test case 6 occurring after 10 pm. In the 4th test case, without WebExpress each Web page and its corresponding images were fetched by the browser, generating 137,000 bytes of traffic and taking in excess of 17 minutes to complete. With WebExpress, the persistent cache methods satisfied the browser requests for the application documents and images. The differencing and communication methods were able to update the stock quote requests with about 500 bytes of network traffic. 
The complete elapsed time of the job was reduced from 17 minutes to 30 seconds during peak network utilization times. In test cases 5 and 6, the browser cache contained all the documents and images in its memory cache. Without WebExpress, the stock quote requests generated 2.2 KB of network traffic during peak hours and 1.3 KB during off hours. With WebExpress, the network traffic did not vary much between any of the three test cases 4, 5, or 6. The quote requests were satisfied with about 500 bytes of network traffic. Similarly, the charges incurred were basically the same. All quotes were about $0.10 each. Without WebExpress the charges were much higher. Test case 4 incurred a charge of $27 and test cases 5 and 6 charges of $0.50 and $0.30, respectively.
In summary, our experiments with the WebExpress prototype indicate significant data reductions and response time improvements. As illustrated in table 1, we measured 60% to 99% reductions in wireless network traffic and 36% to 97% improvements in application response time. Experiments concerning the size of the cache, performed using more elaborate scenarios, indicated that for business applications 1 MB of cache storage (per client) was enough.
13 The number of users sharing the wireless link, the contention on the server and the number of pages (documents and forms) affect the connection statistics as well as the absolute amount of data.
Table 1 Results of running Web applications with and without WebExpress.
Test   Bytes                                 Seconds
       Base     WebExpress   Reduction, %    Base   WebExpress   Reduction, %
1      56779    2302         96              1260   166          87
2      9635     2643         73              343    96           72
3      4853     1272         74              114    59           48
4      137649   456          99.7            1079   30           97
5      2234     533          76              59     37           37
6      1326     515          61              19     11           42
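As a worked check of the Reduction columns in table 1, the percentages can be recomputed from the Base and WebExpress measurements as 100 × (Base − WebExpress) / Base. The sketch below uses the values from the table; the names `TABLE_1` and `reduction_pct` are introduced here only for illustration.

```python
# (test, bytes base, bytes WebExpress, seconds base, seconds WebExpress), from table 1
TABLE_1 = [
    (1, 56779, 2302, 1260, 166),
    (2, 9635, 2643, 343, 96),
    (3, 4853, 1272, 114, 59),
    (4, 137649, 456, 1079, 30),
    (5, 2234, 533, 59, 37),
    (6, 1326, 515, 19, 11),
]


def reduction_pct(base: float, optimized: float) -> float:
    """Percentage reduction of the optimized measurement relative to the base."""
    return 100.0 * (base - optimized) / base


for test, b_base, b_wx, s_base, s_wx in TABLE_1:
    print(test, round(reduction_pct(b_base, b_wx), 1), round(reduction_pct(s_base, s_wx), 1))
```

For example, test case 1 gives 100 × (56779 − 2302) / 56779 ≈ 96% fewer bytes and 100 × (1260 − 166) / 1260 ≈ 87% less elapsed time.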
9. WebExpress for mobile computing, and future work
In this paper we presented a system model and optimization techniques for reducing data volume and communication latency sufficiently for practical web access over slow, low volume, unreliable wireless links. However, there are many other issues that need to be addressed for a mobile user to work effectively. One primary concern is operation in the presence of lost connectivity due to signal loss or temporary blockages. WebExpress uses TCP/IP sockets like any other TCP/IP application. Therefore, the loss of signal is manifested by a lost TCP/IP connection from the viewpoint of the WebExpress software. Wireless link protocols attempt to minimize these disruptions with sophisticated roaming functions and timely packet retransmission. Nevertheless, wireless environments cause much greater disruption than traditional wireline networks.
To help during lost connections, WebExpress provides an “asynchronous/disconnected” mode that permits requests to be automatically queued when connectivity is lost and resumed when connectivity is re-established. In addition, users can issue multiple web requests without having to wait for their respective replies. Responses are queued for the user to view at leisure. This function in conjunction with the disconnected operation capability enables users to make web requests without being connected at all, assuming that the required HTML pages are in the cache.
Transferring the cache to sites closer to the roaming client does not seem to be needed, so far, since for established caches, differencing provides significant savings (up to 99%). Thus, the cost of the required transmission over high speed wireline networks remains relatively insignificant.
A variety of other functions could be employed by the WebExpress client/server intercept system that would reduce user inconvenience and cost when operating in a mobile wireless environment, including:
• Lossy compression [14];
• Information filters (e.g., sending only the title of a document);
• Alert messages to indicate when a given request is expensive;
• Adaptive algorithms to tune resource usage based on the type of network connectivity.
10. Related work
A number of studies have shown promising results regarding performance improvements of wireless Web browsing. Kaashoek et al. [12] reduced the latency of slow links by modifying the Mosaic client with scripts to perform caching and prefetching. Rover [13] used relocatable dynamic objects and queued RPC to develop a nonblocking operation for most browsers to allow a user to click ahead. GloMop [14] uses lossy compression while preserving semantic information for documents and images. Liljeberg et al. [15] provide an optimized sockets library and transport for wireless cellular links. Additionally, they implemented an agent-proxy model to facilitate an HTTP batch get and disconnected operations. Like WebExpress, most of these approaches employ variations of agent-proxy as well as cache technologies.
The novelty of WebExpress lies in its use of differencing technology with forms (i.e., transaction) processing, its distributed cache model with its coherency algorithms, and the virtual socket design and implementation that minimizes the number of network connections needed for HTTP requests. In addition, the WebExpress system combines these technologies via the Intercept Model, and thus transparently supports today’s browsers, servers, and transport stacks.
11. Conclusions
The emerging client/agent/server paradigm, while appropriate for certain new types of applications, does not efficiently support existing applications. In this paper we propose a variation of the client/server model, called client/intercept, based on interception techniques, that aims to alleviate the negative characteristics of the wireless link. We have described WebExpress, a client/intercept system, for optimizing wireless Web browsing. WebExpress makes it feasible, from both a usability and cost viewpoint,
to run commercial Web applications over wide area wireless networks. An important key to this success is the repetitive and predictable nature of transaction processing. This predictability enables the WebExpress caches to be preloaded using wireline access before wireless access is attempted, thereby, enabling the caching and differencing functions to work with minimal data transfer. Due to the limitations of wireless communications, it is necessary to employ a variety of optimization techniques to achieve a usable system. WebExpress demonstrates one set of optimizations that has proven successful for Web applications. In particular, the distributed caching and differencing functions are critical, since they effectively extend caching to work with continually updated objects without requiring that the entire object (response) be transferred.
Acknowledgements
We acknowledge Michael L. Fraenkel, Reed R. Bittinger and Andy Citron for their extensive contributions to the design and development of WebExpress.
References
[1] T. Berners-Lee et al., The World-Wide Web, CACM 37(8) (August 1994) 76–82.
[2] T. Berners-Lee and D. Connolly, Hypertext Markup Language specification/2.0, Internet Draft, Internet Engineering Task Force (IETF), HTML Working Group, June 1995. Available at: http://www.ics.uci.edu/pub/ietf/html/html2spec.ps.gz (work in progress).
[3] R. Fielding, J. Gettys, J.C. Mogul, H. Frystyk and T. Berners-Lee, Hypertext Transfer Protocol – HTTP/1.1, RFC 2068, HTTP Working Group, January 1997 (work in progress).
[4] An introduction to wireless technology, IBM International Technical Support Center, SG24-4465-01 (October 1995).
[5] G. Calhoun, Wireless Access and the Local Telephone Network (Artech House, Boston, 1992).
[6] ARDIS Network Connectivity Guide (ARDIS, Illinois, March 1992).
[7] RAM Mobile Data System Overview, RAM Mobile Data Limited Partnership, USA RMDUS 031-RMDSO-RM, Release 5.2 (October 1994).
[8] ARTour technical overview release 1, IBM Corp. SB14-0110-0 (March 1995).
[9] Oracle Mobile Agents Technical Product Summary, Oracle White Paper, Oracle Corp. (March 1995).
[10] K. Coppieters, A cross-platform binary diff, Dr. Dobb’s Journal (May 1995).
[11] D.M. Ludlow, Compare process for quick determination of text changes, IBM Technical Disclosure Bulletin 22(8A) (January 1980).
[12] M.F. Kaashoek et al., Dynamic document: mobile wireless access to the WWW, in: Proc. IEEE Workshop on Mobile Computing and Applications, Santa Cruz, CA (December 1995).
[13] A.D. Joseph et al., Rover: a toolkit for mobile information access, in: Proc. 15th Symposium on Operating Systems Principles (December 1995).
[14] GloMop: global mobile computing by proxy, GloMop Group (March 13, 1995) ([email protected]).
[15] M. Liljeberg et al., Optimizing World-Wide Web for weakly connected mobile workstations: an indirect approach, in: Proc. SDNE‘95, Whistler, Canada (June 5–6, 1995) (IEEE 0-8186-70924/95).
[16] A. Demers et al., The Bayou architecture: support for data sharing among mobile users, in: Proc. Workshop on Mobile Computing Systems and Applications, Santa Cruz, CA (1994) pp. 2–7.
[17] G.M. Voelker and B.N. Bershad, Mobisaic: An Information System for a Mobile Wireless Computing Environment (Dept. of Computer Science and Engineering, University of Washington, September 19, 1994).
[18] J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques (Morgan Kaufmann, 1993).
[19] M. Satyanarayanan, J.J. Kistler, P. Kumar, M.E. Okasaki, E.H. Siegel and D.C. Steere, Coda: A Highly Available File System for distributed Workstation Environment, IEEE Trans. Computers 39(4) (April 1990).
[20] B.Z. Kobb, Personal Wireless, IEEE Spectrum 30(6) (June 1993) 20–25.
[21] M. Mouly and M.-B. Pautet, The GSM System for Mobile Communications, published by authors (1992).
[22] R. Gruber, F. Kaashoek, B. Liskov and L. Shrira, Disconnected operations in the Thor object-oriented database system, in: Proc. Mobile Computing Systems and Applications, IEEE, Los Alamitos, CA, USA (1995) pp. 51–56.
[23] V.N. Padmanabhan and J.C. Mogul, Improving HTTP latency, Computer Networks and ISDN Systems 28(1) (December 1995).
[24] T. Imielinski and B.R. Badrinath, Wireless Mobile Computing: Challenges in Data Management, Communications of the ACM (October 1994) 19–27.
[25] D. Everitt and M. Rumsewicz, Multiaccess, Mobility and Teletraffic: Advances in Wireless Networks (Kluwer Academic, 1997).
[26] Mobitex features and services, RAM Mobile Data White Paper (February 1997). Available at: http://www.ram-wireless.com/new/white/mobitex2.html.
[27] What is ARDIS/DataTAC, Research in Motion (September 1997). Available at: http://www.rim.net/networks.html.
[28] N.R. Saxena and E.J. McCluskey, Analysis of checksums, extended-precision checksums and cyclic redundancy checks, IEEE Trans. Computers 39(7) (July 1990) 969–974.
[29] D.V. Sarwate, Computation of cyclic redundancy checks via table look-up, Communications of the ACM 31(8) (August 1988) 1008–1013.
[30] D. Hayden, The New Age of Wireless (Mobile Office, 1992).
[31] K. Miller, Cellular essentials for wireless data transmission, Data Communications 23(5) (March 1994) 61–67.
[32] J. Rowe, Building Internet Database Servers with CGI (New Riders, 1996).
[33] G.H. Forman and J. Zahorjan, The challenges of mobile computing, IEEE Computer 27(6) (April 1994) 38–47.
Barron Housel is a Senior Technical Staff Member involved with advanced technology in IBM’s Software Solutions Division located in Raleigh, North Carolina. Dr. Housel has been involved in networking technology since 1980 as an architect and product manager and developer. Prior to coming to IBM Raleigh, Dr. Housel was a member of the IBM Research Division where he worked on database conversion and database design technologies. Dr. Housel is a co-inventor of IBM’s MQSeries Message Queueing Interface (MQI). He has a number of patents relating to networking and data stream technology and over 20 technical conference and journal publications. In 1991 Dr. Housel was elected to the IBM Academy of Technology. He received his B.S. in mechanical engineering from the University of Oklahoma, an M.S. in computer science from Stanford University, and a Ph.D. from Purdue University. For the past two years, Dr. Housel has been involved with wireless and mobile computing technologies, and is one of the principal inventors and architects of the WebExpress technology. E-mail:
[email protected]
David Lindquist joined IBM Data Systems Division in 1982 specializing in large-system performance, and later in large-system architecture and design. In 1990 he joined the Networking Systems Division as a member of the technology staff, where he has focused on distributed multimedia and mobile products. David is one of the principal architects of the WebExpress technology. His research has led to numerous patents in the area of distributed processing, and recognition as an IBM Master Inventor. Mr. Lindquist is currently a Senior Technical Staff Member for IBM’s Software Solutions Division in Raleigh, North Carolina. He received a B.S. in computer engineering from Boston University in 1982. E-mail:
[email protected]
George Samaras received a Ph.D. in computer science from Rensselaer Polytechnic Institute, USA, in 1989. He is currently an associate professor at the University of Cyprus. He was previously at IBM Research, Triangle Park, USA. He served as the lead architect of IBM’s distributed commit architecture (LU6.2 Sync Point) and as a member of IBM’s wireless division. Dr. Samaras coauthored the IBM book on “Sync Point (commit) Services” and holds a number of patents related to distributed data processing. He also served on several of IBM’s internal international standards committees related to distributed computing (OSI/TP, Xopen, OMG). His research interests include mobile computing, transaction processing, databases, object-oriented technology and real-time systems. He is a member of ACM and IEEE. E-mail:
[email protected]