To appear in (FastAbstract) Proc. of IEEE Int. Conf. on Dependable Systems and Networks, Florence, Italy, June 28–July 1, 2004
Extensions to ST-TCP
Shivakant Mishra, Manish Marwah
Department of Computer Science, University of Colorado, Campus Box 0430, Boulder, CO 80309-0430
1. Introduction
Web-based services have grown rapidly in the past few years. Business over the Internet is a substantial part of the revenue of many companies. In fact, a significant number of companies depend solely on the Internet for all their revenue. For these companies, a service outage can be very expensive: for example, an hour of service outage costs $200K for Amazon Inc. and $6 million for a brokerage firm. Client-transparent TCP connection migration, as provided by ST-TCP [1], would contribute towards an organization's efforts to maintain service continuity. ST-TCP allows a client TCP connection to be migrated to a backup server in the event of the failure of the primary, such that the migration is completely transparent to the client. This is particularly advantageous for client applications that have already been deployed in the field, since no code changes at all are required on the clients. In addition, (1) fail-over to the backup is fast and seamless, (2) from the client's perspective, the TCP/IP stack on the primary behaves exactly the same as a regular TCP/IP stack, and (3) there is negligible overhead on the primary server during normal operation. FT-TCP [2] is a similar client-transparent TCP connection migration mechanism; however, it does not have advantages (2) and (3) listed above.
Figure 1. ST-TCP architecture (components shown: clients, gateway, switch, primary, backup; HB: heartbeat).

ST-TCP has a primary-backup architecture, as shown in Figure 1. A replica of the primary application runs on the backup and receives (through a tapping mechanism) all the bytes destined for the primary. The backup maintains a shadowed connection that uses the same TCP sequence numbers, virtual IP address and port number as the primary.
Also with Avaya Labs, 1300 West 120th Ave., Westminster, CO 80234.
ST-TCP assumes, however, that the application is deterministic, that is, that the primary and replica applications stay consistent. A heartbeat mechanism exists between the primary and the backup for detecting failures. In this paper, we augment the design of ST-TCP in two ways: we extend the architecture to support a cluster of backups instead of a single backup, and we address the handling of non-deterministic applications.
2. Cluster of Backups
Only a single backup is considered in ST-TCP [1]. However, the concept of a single backup server can be extended to a cluster of backup servers (see Figure 2), resulting in a highly reliable system built from inexpensive, not-so-reliable, off-the-shelf machines. Each backup receives the client-primary TCP byte stream and runs an application replica.
Figure 2. ST-TCP architecture with a backup cluster.

Adding backup servers increases the load on the primary, since it has to exchange heartbeat messages with each of them. Further, the primary has to make sure that every backup server has received a particular client byte before purging it from its receive buffers. To off-load the primary, one backup acts as a master backup. The role of the master backup is to exchange heartbeats with all the remaining backups and to make sure that all the backups receive all the client bytes. In fact, the primary does not communicate with any of the backups except the master backup. This makes the overhead of the backups on the primary constant, independent of the number of backup servers in the cluster.

Figure 3 shows the state diagram of a server. A goal is to make sure that there is only one primary and only one master backup server. If a server fails, it transitions to the failed state. If the primary fails, the master backup becomes the primary. If the master backup fails, the primary chooses the next master backup from the backup servers based on a criterion, e.g., a pre-assigned priority. A new server enters the cluster as a backup (note that this server only shadows connections that are created after it joins). If it is unable to find the master backup but is able to communicate with a majority of the backup servers, it becomes the master backup and tries to communicate with the primary. If that fails, it assumes that the primary has failed and becomes the primary.
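The role transitions described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the ST-TCP implementation: the names `Role`, `Server`, `Cluster`, `fail`, and `join` are hypothetical, reachability is passed in as a set of server names instead of being detected through heartbeats, and the step in which a newly promoted primary picks a replacement master backup is an assumption that the text does not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    BACKUP = "backup"
    MASTER_BACKUP = "master backup"
    PRIMARY = "primary"
    FAILED = "failed"


@dataclass
class Server:
    name: str
    priority: int               # hypothetical: lower value is preferred
    role: Role = Role.BACKUP    # a new server always starts as a plain backup


class Cluster:
    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}

    def holder(self, role):
        """Return a server currently holding `role`, or None."""
        return next((s for s in self.servers.values() if s.role is role), None)

    def backups(self):
        return [s for s in self.servers.values() if s.role is Role.BACKUP]

    def _choose_master_backup(self):
        # Selection criterion from the text: a pre-assigned priority.
        candidates = self.backups()
        if candidates:
            min(candidates, key=lambda s: s.priority).role = Role.MASTER_BACKUP

    def fail(self, name):
        """Move a server to the failed state and apply the take-over rules."""
        server = self.servers[name]
        old_role, server.role = server.role, Role.FAILED
        if old_role is Role.PRIMARY:
            master = self.holder(Role.MASTER_BACKUP)
            if master is not None:
                master.role = Role.PRIMARY
                # Assumption: the new primary immediately picks a replacement
                # master backup; the text does not describe this step.
                self._choose_master_backup()
        elif old_role is Role.MASTER_BACKUP:
            # Only a live primary can choose the next master backup.
            if self.holder(Role.PRIMARY) is not None:
                self._choose_master_backup()

    def join(self, server, reachable):
        """Add a new server; `reachable` is the set of names it can contact."""
        others = self.backups()
        server.role = Role.BACKUP
        self.servers[server.name] = server
        master = self.holder(Role.MASTER_BACKUP)
        found_master = master is not None and master.name in reachable
        reached = sum(1 for s in others if s.name in reachable)
        # Strict majority of the existing backups; the case of an empty
        # cluster is deliberately left out of this sketch.
        if not found_master and 2 * reached > len(others):
            server.role = Role.MASTER_BACKUP
            primary = self.holder(Role.PRIMARY)
            if primary is None or primary.name not in reachable:
                server.role = Role.PRIMARY
```

For example, in a cluster with a primary, a master backup, and two plain backups, calling `fail()` on the primary promotes the master backup, and the plain backup with the lowest `priority` value takes over the master backup role.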
Figure 3. Server state diagram.

3. Handling Non-Determinism
ST-TCP assumes that the application running on the primary is deterministic, that is, if another instance of the application, running on a backup, is provided with the same TCP stream from the client, then the two application instances will remain consistent with each other. In real applications, this assumption is unlikely to hold in most cases. The main source of non-determinism is system calls, e.g., gettimeofday(). System calls like read() can also lead to non-determinism in some cases, e.g., if read() returns a different number of bytes to the application on the primary than on a backup and the subsequent action of the application depends on the number of bytes read. Functions like random() could also lead to non-deterministic behavior; however, as long as random() is initialized with identical seeds on the primary and on a backup, it behaves deterministically. Threads and signals are another source of non-determinism in applications that use them. This issue has been discussed in the literature and is something we plan to explore as future work; it is not addressed here.

We plan to use a leader/follower synchronizing algorithm to keep the primary and a backup consistent. First, we identify functions (mainly system calls) that can lead to non-deterministic behavior. These functions need to be modified on both the primary and the backup servers; however, the applications themselves may still not need any changes. In most applications, libraries are dynamically linked, and it would be sufficient to update the affected libraries on the primary and the backup. This way the application would not even have to be recompiled. We are exploring how far the generation of these modified functions can be automated.

gettimeofday() wrapper on the primary (events out):
    gettimeofday()
    {
        real_gettimeofday()
        GenerateEventForBackup()
    }

gettimeofday() wrapper on a backup (events in):
    gettimeofday()
    {
        WaitForEvent()
        ExtractTime()
    }

Figure 4. A leader/follower synchronizing mechanism.

Figure 4 depicts how the modified functions would behave for gettimeofday(). Whenever the primary reaches a non-deterministic function, it generates an event for the backup servers; similarly, whenever a backup reaches such a function, it waits for an event from the primary to resolve the non-determinism. Implementing non-deterministic functions that simply return a value and have no side effects (e.g., gettimeofday()) is quite straightforward on both the primary and a backup. On the primary, just before the function returns, an event with the result is generated. On a backup, the function blocks until the event from the primary arrives, and then extracts and returns the result. For non-deterministic functions that have side effects, the event sent by the primary would provide enough information for the backup to execute the function in a deterministic fashion (that is, to produce the same result and side effects as the primary).

Since security is an important part of any design, we are looking at running applications that use TLS with ST-TCP. We expect that, as long as we use the same keys and certificates on the primary and the replicas and address the few sources of non-determinism that may exist, this should not be a problem.
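The leader/follower behaviour for a side-effect-free function such as gettimeofday() can be sketched as follows. This is a minimal in-process simulation and not the modified-library mechanism itself: `EventChannel`, `make_leader`, and `make_follower` are hypothetical names, a `queue.Queue` stands in for the primary-to-backup event link, and Python's `time.time` stands in for the real gettimeofday().

```python
import queue
import time


class EventChannel:
    """Hypothetical stand-in for the link carrying events to a backup."""

    def __init__(self):
        self._events = queue.Queue()

    def send(self, name, result):
        self._events.put((name, result))

    def wait(self, name, timeout=1.0):
        # Blocks until the primary's event arrives (queue.Empty on timeout).
        got_name, result = self._events.get(timeout=timeout)
        if got_name != name:
            raise RuntimeError(
                f"replicas diverged: expected {name}, got {got_name}")
        return result


def make_leader(real_fn, name, channel):
    """Wrapper for the primary: run the real call, then emit an event."""
    def wrapper(*args):
        result = real_fn(*args)
        channel.send(name, result)   # generated just before returning
        return result
    return wrapper


def make_follower(name, channel):
    """Wrapper for a backup: never run the real call, replay the event."""
    def wrapper(*args):
        return channel.wait(name)    # block, then extract the result
    return wrapper


channel = EventChannel()
primary_gettimeofday = make_leader(time.time, "gettimeofday", channel)
backup_gettimeofday = make_follower("gettimeofday", channel)

t_primary = primary_gettimeofday()   # real clock read on the primary
t_backup = backup_gettimeofday()     # same value replayed on the backup
```

Tagging each event with the function name is a design choice of this sketch: it lets the follower notice when it reaches a different non-deterministic call than the leader did, which would indicate that the replicas have already diverged.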
References
[1] M. Marwah, S. Mishra, and C. Fetzer. TCP server fault tolerance using connection migration to a backup server. In Proceedings of the IEEE International Conference on Dependable Systems and Networks, San Francisco, June 2003.
[2] D. Zagorodnov, K. Marzullo, L. Alvisi, and T. Bressoud. Engineering fault tolerant TCP/IP services using FT-TCP. In Proceedings of the IEEE International Conference on Dependable Systems and Networks, San Francisco, June 2003.