A Mechanism for Supporting Client Migration in a Shared Window System

Goopeel Chung & Prasun Dewan
Department of Computer Science
University of North Carolina at Chapel Hill
Chapel Hill, NC 27599-3175
{chungg,dewan}@cs.unc.edu

ABSTRACT

Migrating collaborative applications to or near the workstations of active users can offer better performance in many scenarios. We have developed a client migration mechanism for centralized shared window systems that does not require changes to existing application and system software. It is based on logging input at the old site and replaying it at the new site. This approach raises several difficult questions: How should the log size be kept low? How should response time be kept low while migration is in progress? How should applications that depend on the rate at which input is received be accommodated? How should the transition from the replay phase to the play phase be detected at the new site? How should the software at the old and new sites be synchronized? We have developed a series of alternative approaches for answering these questions and implemented them in the XTV [1] shared window system. In this paper, we motivate, describe, illustrate and evaluate these approaches, and outline how they are implemented.

KEYWORDS: multiuser interface, collaborative system, logging, groupware, migration, window system, replication

INTRODUCTION

A shared window system is an extension of a single-user window system that replicates the windows of a single-user application program on multiple workstations. Shared window systems are important because they allow a group of users to collaborate through an existing single-user application program that was never intended for multiple users. For this reason, several shared window systems have been developed. These systems fall into two groups: centralized and replicated. A centralized system uses only one client for processing user input events, while a replicated system uses multiple copies of a client, each of which runs at the site where a user is working.

Both centralized and replicated systems have problems. In particular, centralized systems, although quite simple in architecture, suffer from slow performance, while replicated systems, despite better performance, do not always guarantee state synchronization among client replicas. Because of synchronization and other problems (described later) of replicated systems, most shared window systems are centralized, despite the poor performance of this architecture.

An alternative to replication and (static) centralization is migrating a centralized client. This approach can offer performance benefits in several scenarios, some of which are given below.

• Two plastic surgeons, A and B, at a hospital in North Carolina are trying to set up a plan for reconstructive surgery on a patient using a program shared through a centralized system. The program lets the surgeons view a 3D volume image of the patient's head, "fly" around and through the image by changing their viewpoint, and perform a virtual operation on the 3D head image. Realizing that they now need some help from C, a reconstructive surgery expert in New York, they open a program interface on C's workstation in New York and expect C to take control. However, because of the network delay, C cannot satisfactorily perform a virtual operation on the patient's image, delaying progress on the operation plan. If the process on A's workstation could be migrated to C's workstation, C would be able to interact with the program more efficiently, and the whole group would make faster progress.

• While the client migration scenario described above favors one active user over the others, there are cases where process migration can increase performance for all users. Such cases include: 1) an imminent shutdown or decreasing performance due to too many processes on the machine running a shared application, 2) a user with a powerful workstation joining the collaboration session, and 3) the user hosting the client leaving the collaboration session. It is to everyone's advantage to be able to move the shared application in order to proceed more quickly with the collaboration.

• Besides processor resources, there are other resources that may become overused and not readily available at the hosting site. For example, a group of users may find, after long hours of collaboration, that they cannot save their work because the disk is full. In such cases, we would like to be able to migrate the shared application to a machine that has the resources.

To support these scenarios, we need a mechanism for implementing process migration in a centralized shared window system. Based on these scenarios and the results of previous process migration research, we have identified the following requirements such a mechanism must satisfy. It should:

• not require changes to existing single-user application and system software such as operating systems, programming languages and single-user window systems.

• accommodate heterogeneous computing environments such as different machine architectures and file systems, since collaborating users, especially in a wide-area environment, cannot be expected to use a homogeneous computing environment.

• be cost-effective, that is, more than compensate for the cost of migration with increased performance.

• account for the unique architecture that a centralized system uses to facilitate window sharing. In this architecture, it is not sufficient to simply migrate the client to a new location. It is also necessary to retarget it by breaking its connection to the window system at the previous location and making a new connection to the window system at the new location.

Process migration mechanisms have been developed previously, but none of them meets all of these requirements. In this paper, we describe a new process migration mechanism designed to meet them. We first describe the related work on which our research is based. Next, we describe our approach. Finally, we give conclusions and directions for future work. In this paper, we are concerned with migrating clients of a shared window system, not arbitrary processes. Since these clients are the only application processes we consider here, we use the terms client migration and process migration interchangeably.

RELATED WORK

Our research is related to previous research in shared window systems (both centralized and replicated architectures), caching, logging and process migration.

As mentioned before, a shared window system is an extension of a single-user window system that allows multiple users to work together by replicating the user interface (windows) of a single-user application program on each user's workstation. The shared window system synchronizes the multiple copies of the user interface by providing them with the same sequence of display update requests. The shared application program generates these requests in response to a sequence of user input events (such as keyboard events and mouse pointer operations) gathered and delivered by the shared window system. As mentioned earlier, shared window systems can be categorized into centralized and replicated systems. In centralized systems such as XTV [1], requests from the single copy of a shared application are distributed to multiple display servers, and input events from the servers are delivered to the shared application. In replicated shared window systems such as MMConf [10], a copy of the application runs at every site in the conference. Input from each server is distributed to every copy, and output from each copy is delivered only to the local display server.

The replicated architecture offers better performance. It requires lower communication bandwidth because only input, rather than output, must be distributed among the collaboration sites, and it is less sensitive to variations in network latency: all participants in the conference receive good interactive performance because they interact with local replicas of the application. However, the replicated architecture has some fundamental problems. Replicating a client operation wastes computation resources when the operation is expensive; creates bottlenecks when the operation accesses a shared (non-replicated) resource such as a file (since all replicas tend to access the device at the same time); and causes undesired semantics when the operation is a non-idempotent action such as sending a mail message. Moreover, good response times can be achieved only if input events are given to the local copy of a client first and then distributed to other sites. This can result in different copies of the client receiving different sequences of input events.

Some of these problems are caused by the need to fully synchronize all replicas by sending each input event immediately to all of them. An alternative, ''lazy'' method of synchronization is provided by the Rover [13] caching scheme developed to support mobile computing. Like a replicated window system, Rover creates multiple copies of an application, one at a fixed host and another on the mobile client. Since the client and host are not always connected, Rover cannot provide full synchronization. Therefore, in the disconnected state, the client logs messages for the host, and transmits the log to the host when a connection is next established. This scheme, of course, cannot be used to support real-time collaboration, but as we shall see later, the general idea of logging forms the basis of our mechanism.

Both the caching and replication methods described above use the input refeed method, that is, they feed the input given to one incarnation of an application to another (remote) incarnation of it. Research in process migration illustrates another method for achieving good performance in distributed systems, wherein the image of an incarnation is copied to a remote location.

Process migration offers an extreme case of lazy synchronization: it requires synchronization only when a process migrates to another location. Process migration has been an active area of research for load balancing in distributed systems. The primary factors considered when migrating processes are the processor loads and the communication loads on the network. When a processor in a distributed environment is overloaded with too many processes, process migration balances loads across processors by moving some processes from the overloaded processor to less utilized ones. When multiple communicating processes are spread across multiple processors, load balancing reduces communication traffic by placing intensely communicating processes on the same processor, thereby decreasing host-to-host communication.

When a process migrates, it should resume where it left off at the source processor. To facilitate this, the migration system needs state information about the process, such as its address space, information on opened files, register values, and links to other processes. Many process migration systems such as DEMOS/MP [15], Charlotte [2] and MOS [3,4] use migration-aware kernels to automatically extract and ship this state information to the destination processor, where the process context is rebuilt before execution resumes. This approach has a clear efficiency advantage over the proposed log-based scheme, since the former transfers the process state itself, while the latter takes the round-about route of transferring what causes state changes. However, this approach can be supported only by operating systems that provide location-independent resource references (e.g. socket descriptors). Most current operating systems do not have this property, and the operating systems mentioned above are research systems; it is questionable that users will take the rather radical step of replacing existing kernels for the sake of collaboration. Another disadvantage of this approach is that it does not address the problem of retargeting to a local window system after migration. Migration-aware applications such as [5,7,12,14] can solve this problem, but these systems assume special language support. It is to overcome these limitations of current process migration systems that we have developed the log-based scheme.

The idea of logging operations has been used previously in collaboration systems to support latecomers to a conference, log replays in asynchronous collaboration and user interface animation, and failure recovery in service acquisition [5,6,8,9]. However, these works have not considered migration, which requires a significantly different mechanism.

APPROACH

Overview

Our approach has aspects of several of the related systems discussed above. As in a centralized system, at any one time, a single copy of the client interacts with all users.

Like the replicated and caching systems, our approach uses the input refeed method. Like the caching and failure recovery systems, it logs input events. Like process migration systems, it dynamically moves the client to remote locations. Like the latecomer and log replay systems, it considers the domain of collaborative applications. Like user interface animation systems, it synthesizes events based on logged events.

A simple scheme for applying the input refeed method to process migration is to save the entire input sequence while a process is executing and, when the time comes to "migrate" the process, start a fresh copy of the process at the destination site and replay the saved sequence to the copy so that it reaches the state where the old process left off. Unlike the image-copy method, this approach does not require changes to existing single-user application or system software, does not depend on a specific operating system or language, and supports automatic retargeting of the client to the local server. However, it has the following fundamental problems:

• During replay, the output of the new client might duplicate the requests issued by the old client, which may destroy the window state created by the old client.

• Some events generated by the user may have been in response to requests made by the client. For instance, a mouse movement in a window may have been made in response to a request to map the window. Unexpected results can occur if the new client receives events before it has issued the corresponding requests.

• The number of input packets to be saved can grow without bound, depending on how long the application has been in use in a collaboration. The later in the application process's lifetime a migration occurs, the more packets must be saved, and hence the longer it takes for the "migrated" process to resume at the destination.

• There is no guarantee that this scheme will work for all applications. This is due particularly to non-determinism on the part of some applications, which respond differently depending on what kinds of events are queued when they inspect the queue to determine which events to process next. For example, a free-hand drawing application may decide to process only the latest of the queued events notifying it of the current pointer position. Hence, the best we can do is to develop a migration scheme that works for a large set of practical cases. The generality of a migration scheme depends on how well it deals with the non-determinism exhibited by different applications.

• Another serious generality problem for input refeed methods is the multiple execution of non-idempotent operations on the external environment, such as file systems and other processes. For example, many applications read input from external sources (e.g. files), process user requests, and write the outcome back to the external sources. If these operations are repeated in sequence by another application, the two applications can reach different states.

We reduce the first problem by trying to distinguish between the replay and play phases of the new client and ignoring its output during replay. To address the second problem, we control the flow of events to the new client. To reduce the third problem, we compress the log during interaction and uncompress it during replay. To deal better with the fourth problem, we use a subset of the client output requests as checkpoints to deduce how far the client has progressed. For the fifth problem, we try to recognize, with the help of users, input events that cause non-idempotent external operations. We discuss below a series of schemes with different generality and efficiency, and describe in detail how each scheme deals with these problems. These schemes have been added to the XTV system. We first describe the XTV algorithm for supporting window sharing and then give our modifications for migration.

The XTV Algorithm

In XTV (Figure 1), a process called the host pseudo-server is inserted in the middle of a conventional connection between an X application process and an X server. The host pseudo-server delivers the output from the application process not only to the local X server, but also to subscriber pseudo-servers running on the machines of the other users in the collaboration. After translating resource ID's in an output packet to the local resource ID's allocated by its local server, each subscriber pseudo-server delivers the packet to the local server for display. Input from each server goes to the application process in the opposite direction.

The algorithms for the XTV host and subscriber pseudo-servers are given in Figure 2. We use a CSP [11]-like language to describe the distributed algorithms: a [ and a matching ] denote a block of commands. Prefixing a block with a * makes the block an infinite loop. Indentation is used to separate the arms of an alternative command block. The symbol → separates the guard and statement of an arm. An input command consists of the process name followed by the received message, with a ? interposed between them. An output statement uses the symbol ! instead of ?. The symbol >> denotes concatenation.

The host (pseudo-server) receives, in an infinite loop, messages from a new subscriber (pseudo-server), a client or the local X server. The host responds to a join message from a new subscriber by simply adding the subscriber's connection information to a list of subscribers. It responds to an output message from the client by transmitting the message, without any modification, to its local server and to all the subscribers in the subscriber list. It responds to an input message from either the local server or one of the subscribers by transmitting the message to the client.

The subscriber sends a join message and then receives messages in an infinite loop. It responds to an output message from the host by first performing a forward translation on the message (i.e. resource ID's contained in the message are translated to those allocated by the local server), and then transmitting it to the local server. It responds to an input message from the local X server by first performing a backward translation on the message and then sending it to the host.

Figure 1: XTV architecture. The X client's output flows through the host pseudo-server to its local X display server and to the subscriber pseudo-servers, each of which drives its own X display server; input flows along the same paths in the opposite direction.

Host::
  subscribers := NULL;
  *[ Subscriber ? join(subscriber) → subscriber >> subscribers
     Client ? output → [ Server ! output; subscribers ! output ]
     Server ? input, Subscriber ? input → Client ! input
   ]

Subscriber(Host)::
  Host ! join(subscriber);
  *[ Host ? output → [ local_output := forward_translation(output);
                       Server ! local_output ]
     Server ? input → [ host_input := backward_translation(input);
                        Host ! host_input ]
   ]

Figure 2: algorithms for XTV host and subscriber pseudo-servers
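To make the translation step concrete, the following sketch shows one way a subscriber's forward and backward translations might be implemented. It is a simplification in Python with hypothetical names; XTV itself parses binary X protocol packets, which are reduced here to (kind, resource_id, payload) tuples.

# Hypothetical sketch of a subscriber pseudo-server's resource ID maps.
class IdTranslator:
    def __init__(self):
        self.host_to_local = {}   # host resource ID -> local resource ID
        self.local_to_host = {}   # local resource ID -> host resource ID

    def bind(self, host_id, local_id):
        # Record that the resource the host calls host_id is known to the
        # local server as local_id (e.g. when a window is created).
        self.host_to_local[host_id] = local_id
        self.local_to_host[local_id] = host_id

    def forward(self, packet):
        # Host -> local server: rewrite the ID into the local name space.
        kind, rid, payload = packet
        return (kind, self.host_to_local[rid], payload)

    def backward(self, packet):
        # Local server -> host: rewrite the ID back into the host's name space.
        kind, rid, payload = packet
        return (kind, self.local_to_host[rid], payload)

t = IdTranslator()
t.bind(0x200001, 0x400001)                  # host's w, subscriber's w'
r1 = t.forward(('Draw', 0x200001, 'ab'))    # becomes a Draw on w'
e3 = t.backward(('Key', 0x400001, 'c'))     # becomes Key('c', w)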

To illustrate, let us assume that two users, A and B, are interacting with a client, A through the host and B through a subscriber. A first types 'a' and 'b' into a window w, causing the local X server to send the events E1=Key('a',w) and E2=Key('b',w) to the host pseudo-server. The host sends these events to the client without modifying them. The client, upon receiving the two events, generates a request R1=Draw('ab',w) asking the X server to draw 'ab' at the current cursor position in window w. The host, upon receiving this output request, relays it to the local X server and to all the subscribers. The subscriber receives R1, forward translates w to w' in the output request (where w and w' are the two different resource ID's given by the two X servers to the same logical window), and sends the request to its local X server. Now, when B enters 'c' and 'd', generating the events E3=Key('c',w') and E4=Key('d',w'), the subscriber backward translates the events (changing w' to w) and sends them to the host, which relays them to the client without modification. In response to these events, the client generates R2=Draw('cd',w), which travels the same route as R1 did. Thus, at the end of this interaction, both users see the string 'abcd' in w and w'.

We now describe the series of process migration mechanisms we have added to XTV. The XTV approach assumes homogeneous window managers, color models, input devices and server extensions. We retain this assumption in our work.

Brute Force Approach

The brute force approach adopts the simple scheme introduced above: we log the entire input sequence while the client is running and hand this log over to a new host when migration takes place. The new host starts a new copy of the client, replays the saved log until the new copy reaches the state where the old copy left off, and then assumes the XTV host role. When this occurs, the old host changes its role to that of an XTV subscriber. Hence, unlike the original XTV host and subscribers, the pseudo-servers are now able to change their modes as ownership of the client changes. Figure 3 illustrates the mode transitions a pseudo-server can take. Figure 4 gives the algorithm.

Figure 3: XTV pseudo-server mode transitions. A host enters the host migration mode on migrate out and becomes a subscriber on migration complete; a subscriber enters the subscriber migration mode on migrate in and becomes the host when it finishes replay.

Pseudo_Server(Mode, Host)::
  subscribers := NULL; input_log := NULL; mode := Mode;
  mode = SUBSCRIBER → Host ! join(subscriber);
  init(request_count);
  *[ mode = HOST → [
       Subscriber ? join(subscriber) → subscriber >> subscribers
       Client ? output → [ Server ! output; subscribers ! output;
                           increment(request_count) ]
       Server ? input, Subscriber ? input → [ Client ! input;
                                              input >> input_log ]
       User ? migrate_out(new_host) → [
         new_host ! migrate_in(input_log,
                               subscribers - new_host + Host,
                               request_count);
         mode := HOST_MIGRATION; init(request_count);
         terminate(Client) ] ]

     mode = HOST_MIGRATION → [
       Server ? input, Subscriber ? input →
         new_host ! input                    /* reroute to new host */
       new_host ? migration_complete(new_host) → [
         /* new host is now distributing output */
         mode := SUBSCRIBER; Host := new_host ] ]

     mode = SUBSCRIBER → [
       Host ? output → [ local_output := forward_translation(output);
                         Server ! local_output ]
       Server ? input → [ host_input := backward_translation(input);
                          Host ! host_input ]
       Host ? migration_complete(new_host) →
         Host := new_host   /* new host is now distributing output */
       Host ? migrate_in(input_log, subscribers, request_count) → [
         /* host asks me to become a new host */
         mode := SUBSCRIBER_MIGRATION; init(local_count);
         temp_log := NULL; initiate(Client) ] ]

     mode = SUBSCRIBER_MIGRATION → [
       Host ? input → [ /* temporarily store rerouted msg */
         host_input := forward_translation(input);
         host_input >> temp_log ]
       Client ? output → [
         increment(local_count);
         *[ sendable(first_event(input_log), local_count) → [
              /* local_count = sequence number */
              input := dequeue(input_log);
              Client ! input ] ];
         ( request_count = local_count ) → [
           mode := HOST; Client ! temp_log;
           subscribers ! migration_complete ] ] ]
   ]

Figure 4: Brute Force Algorithm

In the host and subscriber modes, a pseudo-server essentially implements the XTV host and subscriber algorithms, respectively, with the following differences. In the host mode, the pseudo-server logs the input events and also keeps track of the number of requests made by the client. This count is given to a new host so that it can determine when to start distributing requests from the new copy of the client. The pseudo-server also accepts a command from the user asking it to migrate the client to a new host, and responds by entering the host migration mode.

In the subscriber mode, the pseudo-server receives two additional messages: migration complete and migrate in. The first tells the subscriber that the client has finished migrating to a new location; the subscriber responds by updating its information about the new host, particularly the information associated with resource ID translations. The second tells the subscriber that it should now become the new host; it responds by starting the new client and changing its mode to subscriber migration.
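The transitions of Figure 3 can also be read as a small lookup table. The following sketch (Python, with hypothetical event names taken from the figure) is one way to express them:

# Pseudo-server mode transitions of Figure 3 as a (mode, event) table.
TRANSITIONS = {
    ('HOST', 'migrate_out'): 'HOST_MIGRATION',
    ('HOST_MIGRATION', 'migration_complete'): 'SUBSCRIBER',
    ('SUBSCRIBER', 'migrate_in'): 'SUBSCRIBER_MIGRATION',
    ('SUBSCRIBER_MIGRATION', 'finish_replay'): 'HOST',
}

def next_mode(mode, event):
    # Events not listed for a mode leave the mode unchanged.
    return TRANSITIONS.get((mode, event), mode)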

The host migration mode is a transient mode the old host occupies before entering the subscriber mode; it coincides with the subscriber migration mode of the new host. A pseudo-server in the host migration mode can still receive input messages from subscribers that have not yet been notified of the change in ownership. It responds by rerouting the input to the new host, which will play it after replaying all the logged events. While the new host is replaying the input log to the client, the old host waits for the migration complete message; in response to this message, it changes its mode to subscriber.

In the subscriber migration mode, the pseudo-server replays the input message log to a new copy of the client. One could simply send all the logged events in one batch. However, this would confuse the client when it sees events in its event queue that it has not yet solicited. Hence, the pseudo-server needs to control the flow of events to the client, as mentioned before. Our brute force approach uses a very simple scheme: the pseudo-server uses the sequence number attribute contained in all X input events. X servers use this number to notify clients of how many client requests have been processed. The pseudo-server therefore checks the sequence number in a logged event, and sends the event only when the same number of requests has arrived from the new copy.

The new host also needs to determine when it should start distributing output from the new client to the X servers involved in the collaboration, that is, when to change its mode to host. It could start the distribution when the last event in the input log has been sent to the new client. However, if one or more output messages from the client in response to the last event had already been distributed to the local server by the old host, and those messages happen to be non-idempotent operations (such as a request to destroy a window), the user interface may crash. A better, yet still simple, approach is to count the number, n, of requests made by the old client, and to start distributing from the (n+1)'th request. However, this approach works only for clients that create the same number of requests in response to a fixed sequence of input events fed at variable rates. Some clients respond each time with a different number of requests, since they have the option of compressing multiple requests into one when the same kind of input event occurs consecutively on the same window (e.g. Draw('ab',w) instead of Draw('a',w) and Draw('b',w)). Hence, the pseudo-server has a good chance of waiting indefinitely for the (n+1)'th request, which will never come unless triggered by additional events. This can happen quite often with synthesized user interaction, as is done in our scheme. The brute force algorithm therefore assumes that requests are not compressed; this assumption is also made by its flow control component described above.

When changing to the host mode, the new host also sends the client all the events accumulated in the temporary log during the subscriber migration mode. There is no need to check the sequence numbers contained in these events, since they depend on no future request from the client.
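The sequence-number flow control just described might be sketched as follows. This is a simplification in Python, assuming each logged event carries the X sequence number attribute and that the client does not compress requests:

from collections import deque
from dataclasses import dataclass

@dataclass
class LoggedEvent:
    name: str
    sequence_number: int   # requests the server had seen when the event fired

def replay_step(input_log, requests_seen, send):
    # Release a logged event only once the new client has issued at least
    # as many requests as the event's sequence number records.
    while input_log and input_log[0].sequence_number <= requests_seen:
        send(input_log.popleft())

log = deque([LoggedEvent("Key('a',w)", 1), LoggedEvent("Key('b',w)", 2)])
replay_step(log, 1, print)   # delivers only Key('a',w); Key('b',w) waits

The caller would increment requests_seen for every request received from the new client and call replay_step again; when requests_seen reaches the request count handed over by the old host, replay is complete.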

This is the simplest approach to implement, and it is independent of the server-client protocol for interpreting events and requests. However, it causes the size of the log and the migration time to be large. Moreover, interaction is paused during migration, in that users do not receive feedback. We performed several experiments to estimate the log sizes and migration times. For instance, we interacted with xcalc for 105 seconds, sending it 900 events, before migrating it to a new server; the log size was 42K bytes and the migration time 13 seconds. Thus, this approach is suitable only for short-lived collaborations among users separated by networks slow enough that a pause of a few seconds is tolerable. Another problem with this approach, as mentioned above, is that it assumes that an application does not compress requests. This assumption does not hold for several applications, such as xterm and xedit. We describe below schemes that improve the log sizes and migration times and that work for applications that compress requests. Because of the complexity of these schemes, we describe them in words, without giving detailed algorithms.

Proxy Serverization

In order to deal better with the problems of log size and compressible requests, we have developed a scheme called proxy serverization. The scheme moves a subset of the X server's functionality into the pseudo-server, based on certain assumptions we make about X window managers and clients. By implementing some of the X server functionality, the pseudo-server gains the intelligence required to compress the input log and to better handle compressible output requests.

The key idea behind compression of the input log is the observation that, under some assumptions about clients and window managers, many events sent by the X server are either spurious during replay or dependent on other events or client requests. We assume that Expose events do not change the state of a client and hence can be discarded. Expose events are usually sent to notify a client that the associated windows need to be updated; when the new client is replaying the log, drawing requests for these windows are not necessary, since the windows were already updated by the old client. When a user moves the mouse pointer in and out of different windows, the X server sends events such as EnterNotify (FocusIn) or LeaveNotify (FocusOut) to inform the client of the window in which the mouse pointer is positioned. These events allow the client to provide a better user interface, for example by highlighting a button window to show that the button is pressable, and dehighlighting it when the user moves the pointer out of the window. If the user never interacts with the window before leaving it, saving and replaying these notification events not only takes up space in the input log but also delays the new client's synchronization on the new host. Therefore, we can log only those enter and focus events that precede and follow an actual window interaction, assuming that the unlogged events will not cause state changes in the client.

As it turns out, even these enter and focus events do not need to be logged. They are easy to synthesize from user interaction events (pointer and keyboard events) if we have the hierarchy of all windows. For example, if a user, after moving into a window w, presses the left mouse button, generating a ButtonPress event, we can save only the ButtonPress event in the log and simulate the user moving the mouse into w by generating a series of EnterNotify and FocusIn events for each window from w's highest ancestor down to w itself, as sketched below. In order to generate these artificial events, the new host must recreate the window hierarchy information that existed when the event being replayed was generated. It cannot rely on its local X server for this hierarchy, since the window hierarchy maintained inside the local X server may be quite different from that constructed so far by the new client. We use the series of requests sent by the client as the basis for building the window hierarchy and attributes. These requests give accurate information only if the window manager did not tamper with them; thus, we need to assume a permissive window manager. This assumption also allows us to automatically generate notification events for requests that change the window hierarchy or attributes: the old host does not save these events, and the new one always sends positive notification events for these requests, which include CreateWindow, MapWindow and DestroyWindow.

We can also use the request history to automatically reply to any query request regarding a client's window, and thus do not need to save the reply. Examples of these requests include the GetWindowAttributes and GetGeometry requests. However, we do not do this for all query requests. Some requests, such as GetFontPath, return site-specific information. These requests are broadcast to each server, and the corresponding pseudo-server logs the replies to them. When a client is migrated to a new host, the pseudo-server returns the logged information in response to the local requests issued by the new client. Logging replies to site-specific requests can be tricky, because an X reply does not identify the request to which it corresponds. To distinguish these replies from others, the subscriber blocks when it issues a query request until it receives the corresponding reply from its server. So the events logged at the host site are the keyboard and button events (KeyPress, KeyRelease, ButtonPress, ButtonRelease, MotionNotify) and the replies to QueryPointer and GetMotionEvents.

A pseudo-server synchronizes a new client with an old one using three sources of information: the local log of site-specific replies, the window data structure dynamically maintained within the new host, and the global input log gathered at the previous host. It steps through the same mode changes as a brute force pseudo-server (Figure 3).
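The synthesis of enter and focus events described above can be sketched as follows. This is a Python simplification with invented names; the real implementation walks the window hierarchy rebuilt from the client's own requests and emits X protocol events:

def ancestor_chain(window, parent):
    # Path from the window's highest ancestor down to the window itself,
    # using the hierarchy the new host rebuilt from client requests.
    chain = []
    while window is not None:
        chain.append(window)
        window = parent.get(window)
    return list(reversed(chain))

def replay_button_press(window, parent, send):
    # Simulate the unlogged pointer movement into `window`, then replay
    # the ButtonPress event that was actually logged.
    for w in ancestor_chain(window, parent):
        send(('EnterNotify', w))
        send(('FocusIn', w))
    send(('ButtonPress', window))

parent = {'button': 'panel', 'panel': 'top', 'top': None}
replay_button_press('button', parent, print)   # top, panel, button, press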

In addition to event compression, proxy serverization brings about two additional major changes: in order to handle compressible requests, it uses a different mechanism for flow control and a different mechanism for determining when the output of the new client should be distributed.

In the brute force approach, the sequence numbers contained in each X event and the count of client requests are used to control the flow of events. In proxy serverization, user interaction events are sent from the global input log in chronological order, subject to two conditions: first, the window associated with the event has been mapped; and second, the window has an event mask set that solicits the event. The input log is inspected for the next event to send whenever a window is mapped, a window's event mask is changed, a QueryPointer/GetMotionEvents request arrives, or an entry has just been sent from the input log. Replies and other events are sent when the corresponding requests arrive from the client. Most of these responses are produced by a simple lookup of the local log or by simulation of non-site-specific requests. For example, a reply to an AllocColor request is sent from the local log, and a MapNotify event is sent when a MapWindow request arrives from the local client.

For determining the first request that should be distributed, we use the following method. We distinguish between drawing and non-drawing requests. Instead of counting all requests, the pseudo-servers count only the non-drawing operations. As a result, some drawing operations issued by the new client may be distributed too soon, that is, they may be replay requests that were also issued by the old client. However, since these operations are idempotent, they do not change the screen contents, although they do cause the screen to flicker. The flicker is typically removed if we assume that, for every sequence of consecutive drawing operations issued by the old client, the new client issues at least one (possibly compressed) drawing request, and we wait for this request to be issued before distributing client requests. This approach works as long as non-drawing operations are not compressed. While we have not found an example of a client that compresses these requests, it is theoretically possible to compress them; for instance, a client may map a window tree in one request or in multiple requests, depending on the events queued. To increase the robustness of the scheme, we reset the count after matching a query request, which we assume is not compressed since it requires exactly one reply.

In summary, this approach differs in two ways from the brute force method. First, it accommodates applications with compressible requests; in particular, it works for xterm, xedit, and several other existing X applications we tested. Second, it compresses event logs. To get an idea of how well compression works, we repeated the xcalc experiment given above and observed that the log size was reduced from 42K bytes to 2K bytes. The saving is dramatic in the case of xcalc because it creates a large number of windows, which in turn leads to a large number of compressible EnterNotify, LeaveNotify, FocusIn and FocusOut events. We also experimented with xedit to get an idea of how much compression is possible in applications that create few windows. We continuously interacted with xedit for 105 seconds, sending it 500 X events, before migrating it to a new site. Without compression the log size was 24K bytes; with compression it was 15K bytes. Thus the saving in log size is substantial but not as dramatic. While the compression optimization does lead to a significant saving in log size, it also reduces the generality of the mechanism in some ways, because of the assumptions it makes about clients and window managers.
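The request-counting rule of this section can be sketched as follows. The classification of requests as drawing or non-drawing is schematic here (the real predicate is defined over X protocol opcodes), and the query-based reset is omitted:

# Deciding when the new client's output may start being distributed.
NON_DRAWING = {'CreateWindow', 'MapWindow', 'ConfigureWindow', 'DestroyWindow'}

def make_distribution_gate(old_count):
    # old_count: non-drawing requests the old client had issued when stopped.
    seen = 0
    def ready(opcode):
        nonlocal seen
        if opcode in NON_DRAWING:
            seen += 1
        # Drawing requests are not counted, since the old and new clients
        # may compress them differently.
        return seen >= old_count
    return ready

ready = make_distribution_gate(2)
ready('CreateWindow')   # False: one non-drawing request seen so far
ready('PolyText8')      # False: drawing requests do not count
ready('MapWindow')      # True: distribution may begin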

Client Prestart

To reduce the time needed for a new copy of a client to synchronize with the old one, we have developed a simple scheme that starts a copy of the shared client at each site when the conference is created. Even though this scheme has the flavor of a replicated shared window system, it still implements the centralized architecture, in that output requests are distributed to multiple subscribers and input events are targeted to the one copy at the host site. Once all the subscriber copies reach a stable state, this scheme is almost the same as the proxy serverization scheme. The main difference is that in this scheme a copy of the client is always ''prestarted'' at each site. When a host has to yield the client to another pseudo-server, it does not kill the local client, but keeps it "prestarted". We now transfer only the difference between the input logs of the current and next hosts. To compute this difference, each site maintains an array that records the log sizes of all subscribers. This array is created by the first host of the client and initialized to 0. When the input log is passed to a new host, the old host sets the array entries for itself and the new host, and passes the whole array along with the log difference. In this way, the current host of the client keeps correct log sizes for all subscribers; a sketch of this bookkeeping appears at the end of this section.

Prestart can substantially reduce the migration time, especially for heavyweight applications with a large startup cost. For instance, we performed the xedit experiment using this optimization and found that the migration time was cut from 22 seconds to 10 seconds. The main drawback of this approach is that it uses more memory and other resources, since a copy of the application is active at each site during the conference. This can be a problem because it adds load to the local processor, resulting in slower responses from the local server, especially during the startup phase.

When a client is first started at a subscriber site, it typically issues several requests before waiting for user input. Thus, the subscriber receives requests from both the host and the local client until the local client stabilizes. Moreover, the local client may overrun the host by issuing requests that the subscriber has not yet received from the host. This can be a problem if migration occurs during this setup phase, since all local client requests, including overrun requests, are not sent to the local server but discarded. The overrun problem could be eliminated by delaying the prestart of a client until the client at the host site has stabilized but, as we shall see, we would still face the overrun problem in the play/replay scheme discussed next. We have therefore created a new mode, called the synchronized mode, to handle concurrent requests from host and client and the associated overrun problem. The pseudo-server at a subscriber site starts in this mode and moves to the subscriber mode when the client stabilizes. After that, it makes the transitions of Figure 4. We describe this new mode in detail in the next section.
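The log-size bookkeeping described above might look as follows; this is a Python sketch with a simplified log representation and invented site identifiers:

# Shipping only the part of the input log the new host has not yet seen.
def hand_over(log, log_sizes, old_host, new_host):
    # log_sizes[site] is the number of log entries `site` already holds.
    diff = log[log_sizes[new_host]:]
    log_sizes[old_host] = len(log)   # the old host holds the whole log;
    log_sizes[new_host] = len(log)   # after the transfer, so does the new one
    return diff

log = [('Key', i) for i in range(100)]
sizes = {'A': 100, 'B': 40}          # B last held the log 60 entries ago
diff = hand_over(log, sizes, 'A', 'B')
assert len(diff) == 60               # only entries 40..99 are shipped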

As in the other two schemes described above, interaction is paused during the entire migration. We now describe a scheme for eliminating this pause.

Concurrent Play/Replay

Figure 5: Concurrent Play/Replay mode transitions. The diagram of Figure 3 is extended with a host synchronization mode (entered from host on synchronize out, leading to host migration on migrate out) and a subscriber synchronization mode (entered from subscriber on synchronize in, leading to subscriber migration on migrate in); the migration complete and finish replay transitions are as in Figure 3.

The main idea behind this scheme is to allow new user input events to be processed, or ''played'', by the old client until the new client has finished replaying the old log, hence the name concurrent play/replay. To accommodate this behavior, we add two new modes, the host synchronization and subscriber synchronization modes, which are occupied by the old and new hosts, respectively, during the play/replay phase. Figure 5 gives the new state transition diagram.

As shown above, the host now goes into the host synchronization mode, instead of the host migration mode, when it is asked to migrate the client. The difference between the two modes is that in the former the pseudo-server sends the input it receives to both the new pseudo-server and the local client. The old host stays in the host synchronization mode until the new host sends a message indicating that it has drained its global input log. Upon receiving this message, the old host sends the remaining global input log and goes into the host migration mode.

Similarly, the new host goes into the subscriber synchronization mode, instead of the subscriber migration mode, when it is informed of the migration decision. One important difference between the two modes is that in the former the pseudo-server receives requests simultaneously from both the old host and its own client. Moreover, it is possible for the client to overrun the host in the manner discussed in the previous section. Therefore, we log ''future'' requests from the client. When the corresponding request arrives from the host, the future request is removed from this log and the events associated with it are sent to the client. For instance, when the request corresponding to a MapWindow future request arrives from the host, a MapNotify event is sent to the client. Our scheme for matching the requests of the client and host is analogous to the request counting scheme described earlier, in that we match only the non-drawing requests, since the drawing requests may be compressed differently.
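A sketch of the future-request log follows. It is a simplification in Python: requests are compared for equality on hypothetical opcode names, whereas the real scheme matches parsed non-drawing X requests:

from collections import deque

def make_synchronizer(send_event, event_for):
    future = deque()   # non-drawing requests the new client issued early

    def on_client_request(req):
        # The client has overrun the host; park the request for now.
        future.append(req)

    def on_host_request(req):
        # The old host has caught up with a request the client anticipated;
        # release the associated event (e.g. MapNotify for MapWindow).
        if future and future[0] == req:
            send_event(event_for(future.popleft()))

    return on_client_request, on_host_request

events = {'MapWindow': 'MapNotify'}
on_client, on_host = make_synchronizer(print, events.get)
on_client('MapWindow')   # the client is ahead; nothing is sent yet
on_host('MapWindow')     # the host catches up; MapNotify goes to the client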

Concurrent play/replay eliminates the pause time but does not cut down the migration time. This time matters because users do not get the benefits of migration until migration completes.

Handling Non-Idempotent External Operations

Non-idempotent operations combined with our migration scheme can cause different copies of a client to reach different states when they share a common (centralized) environment (e.g. a file system or the addressee of an e-mail program). A simple solution is to replicate the environment at each user's site, so that each copy of a client works with its own copy of the environment. For example, files to be used in a collaboration can be replicated on all sites beforehand. However, replication may be impossible (e.g. an e-mail addressee cannot be replicated) or too costly (e.g. large files). This problem is not unique to our migration scheme; it also occurs in replicated window systems. However, it is more severe under our migration scheme because the different clients are not fully synchronized.

We do not have a satisfactory general solution for the centralized case, but we have developed an ad-hoc solution for the typical application that reads, processes and then saves a file. In this case, there is a button on the XTV interface for training XTV to recognize a series of events indicating the save operation. The user presses this button to indicate the start of training, inputs a sequence of events indicating the save operation, and ends the training by pressing the XTV button again. Afterwards, during normal (non-training) execution of the application, XTV monitors server events to locate the learned save event sequence. When it finds such a sequence, it holds off sending it to the main client until it has synchronized the other copies of the client with the main copy; that is, XTV sends the current global log to all the other copies, which then read the old file before it is overwritten. (We assume the client is prestarted at all sites.) The save event sequence is sent only to the main client, to prevent redundant writes. The load and edit operations are processed as before. In some applications, the save operation may change the internal state of the application by, for instance, writing to a log. Therefore, we allow the user to indicate, at training time, whether the save sequence should be sent to all replicas.

This approach can also be used to handle other non-idempotent operations. For instance, XTV can be trained to recognize a sequence of events indicating a mail operation. In this case, the user would indicate that the sequence should not be replicated. Note that synchronization is not necessary here, because the send-mail operation normally does not change the external environment (e.g. the file system) to which the application refers. Therefore, we also allow the training user to indicate whether synchronization is necessary before the execution of the operation.
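Recognition of a trained sequence can be sketched as a sliding-window comparison over the event stream. This Python sketch is a simplification: XTV would compare parsed X events and would have to tolerate interleaved events such as pointer motion:

def make_detector(trained, on_match):
    window = []   # the most recent len(trained) events
    def on_event(event):
        window.append(event)
        if len(window) > len(trained):
            window.pop(0)
        if window == trained:
            on_match()        # e.g. synchronize replicas before forwarding
            window.clear()
    return on_event

detect = make_detector(['KeyPress Ctrl', 'KeyPress s', 'KeyRelease s'],
                       lambda: print('save sequence detected'))
for ev in ['KeyPress x', 'KeyPress Ctrl', 'KeyPress s', 'KeyRelease s']:
    detect(ev)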

This approach is not general, since different replicas may need to take different, application-specific actions in response to non-idempotent operations. For instance, in the mail case, this approach would not work if the mail operation were written to a log. Moreover, it puts an additional burden on users, who must now train XTV for each new application. A more general solution is to develop migration-aware applications that are themselves responsible for handling this problem. Our ad-hoc solutions are meant for existing applications that cannot be changed.

Evaluation

In this section, we have presented several process migration schemes. The brute force method is the simplest approach, but it causes log sizes and migration times to become very large and does not work for applications that compress requests. Proxy serverization reduces the log size substantially and accommodates applications with compressible requests, but it makes assumptions about the handling of compressed events. In both cases, migration time includes the time required to start the new client, and feedback is suspended during migration. Prestarting applications reduces the migration and pause times, but uses more resources. Concurrent play/replay completely eliminates the pause time but does not reduce the migration time. Since each of these schemes has some unique benefits, we support all of them in our system. Our system allows the user who starts an application to indicate which of these optimizations (serverization, prestart, and concurrent play/replay) should be performed. By default, all optimizations are performed, since the assumptions they make about application behavior have held for the existing X applications we have tested so far.

We have conducted experiments to measure the benefits of migration over a local area network within UNC. In our experiments, we measured the times needed to respond to sequences of 100 keystroke events, generated by holding the space bar down, and found that migration improves the response time from 8.26 seconds to 6.83 seconds. The benefits of migration depend on the speed of the network connecting the collaborators and would be substantially greater over a slow network (WAN). The overhead of logging, about nine percent in our experiments, was not noticeable to users.

CONCLUSIONS AND FUTURE WORK

This paper makes several contributions. First, it motivates the use of process migration in centralized collaborative systems. Second, it presents log-based migration as a practical alternative to process migration mechanisms based on image copying. Third, it identifies several difficult issues that must be resolved in the implementation of such a mechanism. Finally, it presents, illustrates and evaluates several new approaches for implementing this mechanism in a centralized shared window system.

While our algorithms have been developed for the XTV shared window system, many of the ideas they embody apply to other centralized shared window systems as well. All our migration schemes assume that mechanisms exist on all sites for starting copies of the new client. The basic brute force scheme can be applied to any shared window system. If the system is based on a pseudo-server, then the existing single-user window system does not have to be changed. For our algorithms to work correctly, the window system must compress only idempotent window operations. Event compression offers benefits in a system supporting hierarchical windows, window manager notification events, and Expose events. Thus, each of our ideas can be implemented in any system that satisfies the assumptions it makes.

While the log-based migration schemes described here offer some important practical advantages over other schemes, they also have several disadvantages. The time to migrate an application is about an order of magnitude greater than in schemes based on the image-copy method and depends on the length of the collaboration since the last migration out of the destination site; thus, the time to migrate a client to a host is inversely proportional to the frequency with which the client is migrated to that host. While we have devised a scheme for eliminating pauses during migration, the time required to perform migration does affect when users get its benefits. Also, our solution to the non-idempotent external operation problem is ad-hoc: there are too many different actions to take for the different kinds of non-idempotent external operations in various applications, and they cannot all be recognized by a training-based approach. Finally, like replicated systems, our scheme increases the computation cost by executing potentially expensive operations multiple times.

We intend to explore the use of migration awareness to address these and other problems of our scheme. For instance, we plan to allow clients to define handlers for start-replay and end-replay events. We are also exploring fine-grained migration, allowing the processing of an individual window to be migrated to a remote host. Migration is only one method for increasing performance in a distributed system; it is not a replacement for replication. We intend to extend our mechanism to support a hybrid window architecture allowing both dynamic replication and migration. We also plan to devise policies for automatically migrating a client based on the kind of collaboration in which the users are involved. Finally, we intend to explore the use of lazy or delayed synchronization to support asynchronous collaboration in a shared window system.

ACKNOWLEDGMENTS

This research was supported in part by National Science Foundation grant IRI-9508514. Many thanks to Prof. Hussein Abdel-Wahab for providing us with the XTV software. The insightful comments of the reviewers and Jonathan Munson helped improve the paper.

REFERENCES

1. Abdel-Wahab, H.M. & Feit, M.A., XTV: A Framework for Sharing X Window Clients in Remote Synchronous Collaboration, Proceedings, IEEE Conference on Communications Software: Communications for Distributed Applications & Systems, Chapel Hill, NC, April 1991, pp. 159-167.
2. Artsy, Y. & Finkel, R., Designing a Process Migration Facility: The Charlotte Experience, IEEE Computer, 1989, pp. 47-56.
3. Barak, A. & Shiloh, A., A Distributed Load-balancing Policy for a Multicomputer, Software Practice and Experience, Vol. 15, No. 9, September 1985, pp. 901-913.
4. Barak, A. & Litman, A., MOS: A Multicomputer Distributed Operating System, Software Practice and Experience, Vol. 15, No. 8, August 1985, pp. 725-737.
5. Bharat, K. & Cardelli, L., Migratory Applications, Proceedings of UIST '95, pp. 133-142.
6. Bharat, K., Hudson, S. & Sukaviriya, N., Synthesized Interaction on the X Window System, GVU Center, Georgia Tech, Technical Report #95-07.
7. Black, A., Hutchinson, N., Jul, E., Levy, H. & Carter, L., Distribution and Abstract Types in Emerald, IEEE Transactions on Software Engineering, Vol. SE-13, No. 1, January 1987, pp. 65-76.
8. Chang, R.N. & Ravishankar, C.V., A Service Acquisition Mechanism for Server-Based Heterogeneous Distributed Systems, IEEE Transactions on Parallel and Distributed Systems, February 1994, pp. 154-168.
9. Chung, G., Jeffay, K. & Abdel-Wahab, H., Dynamic Participation in Computer-based Conferencing System, Journal of Computer Communications, 17(1):7-16, January 1994.
10. Crowley, T., Milazzo, P., Baker, E., Forsdick, H. & Tomlinson, R., MMConf: An Infrastructure for Building Shared Multimedia Applications, CSCW '90 Proceedings, October 1990, pp. 329-342.
11. Hoare, C.A.R., Communicating Sequential Processes, CACM, Vol. 21, August 1978, pp. 666-677.
12. Johansen, D., Renesse, R. & Schneider, F.B., Operating System Support for Mobile Agents, IEEE Computer, September 1989, pp. 42-45.
13. Joseph, A.D., deLespinasse, A.F., Tauber, J.A., Gifford, D.K. & Kaashoek, M.F., Rover: A Toolkit for Mobile Information Access, Proceedings of the Fifteenth Symposium on Operating Systems Principles, December 1995.
14. Jul, E., Levy, H., Hutchinson, N. & Black, A., Fine-Grained Mobility in the Emerald System, ACM Transactions on Computer Systems, Vol. 6, No. 1, February 1988, pp. 109-133.
15. Powell, M.L. & Miller, B.P., Process Migration in DEMOS/MP, Proceedings of the Sixth ACM Symposium on Operating System Principles, November 1983, pp. 110-119.
