An Advanced Commit Protocol for MLS Distributed Database Systems

Indrajit Ray (1,*), Elisa Bertino (2,†), Sushil Jajodia (1,‡), Luigi Mancini (3,§)

1 Center for Secure Information Systems and Department of Information and Software Systems Engineering, George Mason University, Fairfax, VA 22030-4444, U.S.A. firay, [email protected]
2 Dipartimento di Scienze dell'Informazione, Università di Milano, 20135 Milano, Italy [email protected]
3 Dipartimento di Informatica e Scienze, Università di Genova, Genova, Italy [email protected]

ABSTRACT

The classical Early Prepare commit protocol (EP), used in many commercial systems, is not suitable for use in multilevel secure distributed database systems that employ a locking protocol for concurrency control. This is because EP requires that read locks be not released by a subtransaction during its window of uncertainty; however, it is not possible for a locking protocol to provide this guarantee in a multilevel secure system (since read lock of a higher level transaction on a lower level data object must be released whenever a lower level transaction wants to write it). The Secure Early Prepare protocol (SEP) overcomes this difficulty by aborting those distributed transactions that release their low level read locks prematurely. We see this approach as being too restrictive. One of the major benefits of distributed processing is its robustness to failures, and SEP fails to take advantage of this. In this work, we propose the Advanced Secure Early Prepare commit protocol (ASEP) together with a number of language primitives that can be used as system calls in distributed transactions. These language primitives permit features like partial rollback and forward recovery, and allow a distributed transaction to proceed even when a subtransaction has released its low level read locks prematurely. This not only offers flexibility, but also can be used, if desired, by a sophisticated programmer to trade off consistency for atomicity of the distributed transaction.

* Partially supported by National Science Foundation under grants IRI-9303416
† Partially supported by Italian M.U.R.S.T. and by NATO Collaborative Research grant number 930888
‡ Partially supported by National Science Foundation under grants IRI-9303416 and INT-9412507 and by National Security Agency under grant MDA904-94-C-6118
§ Partially supported by Italian M.U.R.S.T.

© 1996 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that new copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or fee. Request Permissions from Publication Dept., ACM Inc., Fax +1 (212) 869-0481 or
1 Introduction

A distributed database consists of several data objects that are physically located at different sites or nodes. A user can initiate a distributed transaction at any site. If access to objects stored at remote sites is required, the distributed transaction initiates a subtransaction at the remote site. To guarantee correct execution of transactions, each site in the distributed database is equipped with a concurrency control protocol and an atomic commit protocol. The concurrency control protocol ensures that the execution of all local transactions and subtransactions at each site is serializable. The commit protocol, on the other hand, ensures the atomicity of a distributed transaction: all its subtransactions either commit or abort. There are several different concurrency control and commit protocols, with two-phase locking (2PL) and basic two-phase commit (2PC) being the most well-known, respectively [BHG87].

A major problem of the locking protocols in multilevel secure (MLS) systems is that, in order to avoid a covert channel, any read locks acquired by a high level transaction on a low level data object must be released whenever a low transaction attempts to acquire a write lock on the same data object [HS75, MJ93, Inf93b, Inf93a, ABJ94]. Unfortunately, this has grave implications for the corresponding commit protocol, especially the early prepare commit protocol (EP) [MLO86, SC93]. What it implies is that read locks may get released within a subtransaction's window of uncertainty (the period after a participant has voted yes to commit, but before it receives the commit or abort decision from the coordinator), possibly resulting in nonserializable executions [JM93, JMB94]. To guarantee serializability, Atluri, Bertino, and Jajodia proposed a secure early prepare commit protocol (SEP) in [ABJ94].
SEP introduces a confirmation phase before the decision phase of EP to ensure that if a read lock is released prematurely (i.e., after a participant has voted yes to commit a subtransaction, but before it receives a confirm message from the coordinator; see section 3), the subtransaction and, therefore, the distributed transaction are aborted. We feel that such a strategy of aborting the distributed transaction is too conservative. One of the major benefits of distributed processing is its robustness to failures, and SEP fails to take advantage of this. We can introduce partial rollback features to prevent the distributed transactions from aborting unilaterally.
SEP suffers from another shortcoming in that a distributed transaction may be subjected to starvation. This occurs when a higher level transaction gets aborted repeatedly because of some malicious lower level transaction. While this may be inevitable if we are to guarantee serializability, we may still incorporate forward recovery measures which will prevent starvation if required, albeit at the expense of serializability. To this end, we propose in this paper an advanced secure early prepare commit protocol (ASEP) together with several language primitives. These primitives can be used by a sophisticated programmer to trade off consistency for atomicity of the distributed transaction. In particular, we offer the programmer a choice to specify in the main transaction actions to be taken by each of the participating subtransactions that read low data, in case it has released locks prematurely. The choices offered may vary from outright abort of the subtransaction (thereby abort of the main one) to doing partial rollback so that the main transaction can proceed to completion without sacrificing consistency, to doing some kind of forward recovery which may result in execution that is meaningful in a particular application, even though not serializable.

The rest of the paper is organized as follows. In section 2, we define the basic concepts and notation, followed by a brief review of SEP in section 3. Our system primitives and ASEP are described next in sections 4 and 5, respectively. Sections 6 and 7 contain some examples that show how a distributed transaction can proceed under ASEP even when a subtransaction has released its low level read locks prematurely. The emphasis of the examples in section 6 is on the commit of distributed transactions, possibly at the expense of consistency, while the examples in section 7 attempt to commit distributed transactions, preserving serializability at the same time. Finally we conclude in section 8.
2 MLS Distributed Database Model

An MLS distributed database consists of a set $\mathcal{N}$ of sites, where each site $N \in \mathcal{N}$ is an MLS database. The sites in the system are interconnected via communication links over which they can communicate. We assume that these links are secure (possibly using encryption) such that any communication between two sites is tamper proof. We model an MLS distributed database as a quadruple $\langle \mathcal{D}, \mathcal{T}, \mathcal{S}, L \rangle$, where $\mathcal{D}$ is the set of data objects (objects), $\mathcal{T}$ is the set of transactions (subjects), $\mathcal{S}$ is the partially ordered set of access classes (or security levels) with an ordering relation $\leq$, and $L$ is a mapping from $\mathcal{D} \cup \mathcal{T}$ to $\mathcal{S}$. For every $x \in \mathcal{D}$, $L(x) \in \mathcal{S}$, and for every $T \in \mathcal{T}$, $L(T) \in \mathcal{S}$. In other words, every data object as well as every transaction has an access class associated with it.

We extend the mapping $L$ such that it maps each MLS database $N$ to an ordered pair of security classes $L_{min}(N)$ and $L_{max}(N)$. Clearly, it should always be the case that $L_{min}(N), L_{max}(N) \in \mathcal{S}$, and $L_{min}(N) \leq L_{max}(N)$. In other words, every MLS database in the distributed database has a range of security levels associated with it. For every data object $x$ stored in an MLS database $N$, $L_{min}(N) \leq L(x) \leq L_{max}(N)$. Similarly, for every transaction $T$ executed at $N$, $L_{min}(N) \leq L(T) \leq L_{max}(N)$. A site $N_i$ is allowed to communicate with another site $N_j$ only if $L_{max}(N_i) = L_{max}(N_j)$.1 The reader may refer to [JM93] for additional details on the MLS distributed database model.

Our security policy is based on the Bell-LaPadula model [BL76]. According to this model, the following two conditions are necessary for a system to be secure:

1. A transaction $T$ is allowed to read a data object $x$ only if $L(x) \leq L(T)$.
2. A transaction $T$ is allowed to write a data object $x$ only if $L(x) = L(T)$.

Thus, a transaction can read objects at its level or below, but it can write objects only at its level. Although the original *-property proposed in the Bell-LaPadula model allows transactions to write into levels above their security level, it seems appropriate to disallow transactions that write to higher levels for the sake of database integrity [HS75, JK90]. The trusted DBMSs that are available commercially have the same restriction. In addition to these two requirements, a secure system must guard against illegal information flows through covert channels.
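The two access conditions above can be sketched as a pair of predicates. This is only an illustration: the paper requires a partially ordered set of access classes but does not fix its shape, so the `AccessClass` type (a numeric level plus a category set) and the function names are assumptions made here to obtain a concrete partial order.

```python
from typing import FrozenSet, NamedTuple


class AccessClass(NamedTuple):
    """Hypothetical access class: a level plus a set of categories."""
    level: int
    categories: FrozenSet[str]


def dominates(a: AccessClass, b: AccessClass) -> bool:
    """True when b <= a in the (assumed) partial order."""
    return a.level >= b.level and a.categories >= b.categories


def can_read(txn: AccessClass, obj: AccessClass) -> bool:
    # Read allowed only if L(x) <= L(T).
    return dominates(txn, obj)


def can_write(txn: AccessClass, obj: AccessClass) -> bool:
    # Restricted write rule used in the paper: L(x) = L(T).
    return txn == obj
```

Two classes with equal levels but different category sets are incomparable under this order, so neither can read the other, which is the case a totally ordered sketch would miss.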
2.1 Distributed transaction model
For each security level $s$ in the range $\{L_{min}, L_{max}\}$ at a particular site, there is one transaction manager ($TM_s$) at that level. A user logged in at level $s$ submits a transaction $T$ to $TM_s$. If $T$ does not require access to remote objects, $TM_s$ simply executes it as a local transaction. If, on the other hand, $T$ is a distributed transaction, $TM_s$ assumes the role of the coordinator. It decomposes the transaction $T$ into a collection of subtransactions $T_i$, each of which inherits the security level of the main transaction $T$, and forwards each $T_i$ to the appropriate remote MLS system $N_i$. (It must be the case that $L_{min}(N_i) \leq L(T_i) \leq L_{max}(N_i)$.) A subtransaction $T_i$ can execute at a site $N_i$ only if whenever $T_i$ wishes to read an object $x$, $L_{min}(N_i) \leq L(x) \leq L(T)$, and whenever $T_i$ wishes to write an object $x$, $L(x) = L(T) \leq L_{max}(N_i)$.

We model a distributed transaction $T$ as a nested transaction [Mos85]. Nested transactions have a hierarchical grouping structure: each nested transaction consists of either primitive actions (read or write) or some nested transactions. For simplicity we assume that a distributed transaction $T$ in our model has only one level of nesting with the transaction $T$ at the top level and each participating transaction $T_i$ at the next level. The top-level transaction $T$ can also have some read and write operations that are to be executed locally at the originating site. In such a case, those read/write operations are treated as another subtransaction $T_i$ which is executed at the originating site. The top-level transaction $T$ performs only coordination and supervisory functions.
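As a rough illustration of the placement conditions stated above, the following sketch checks whether a subtransaction may run at a site and whether its individual reads and writes fall inside the site's range. Levels are modeled as plain integers purely for brevity, and `Site`, `can_host`, `may_read` and `may_write` are hypothetical names that do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical MLS site with its range of security levels."""
    name: str
    l_min: int  # L_min(N)
    l_max: int  # L_max(N)


def can_host(site: Site, txn_level: int) -> bool:
    # L_min(N_i) <= L(T_i) <= L_max(N_i)
    return site.l_min <= txn_level <= site.l_max


def may_read(site: Site, txn_level: int, obj_level: int) -> bool:
    # L_min(N_i) <= L(x) <= L(T)
    return site.l_min <= obj_level <= txn_level


def may_write(site: Site, txn_level: int, obj_level: int) -> bool:
    # L(x) = L(T) <= L_max(N_i)
    return obj_level == txn_level and txn_level <= site.l_max
```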
1 This assumption is necessary to avoid the cascade vulnerability problem [HCH+93, MS88].
We have chosen to use the nested structure for two reasons. First, it allows for an elegant way of representing asynchronously and independently executing subtransactions. Second, in a nested transaction, subtransactions can fail independently of each other and independently of the containing transaction. This includes the possibilities we provide such as attempting a rollback and reexecution at only one site if there is some kind of failure.
2.2 Secure concurrency control protocol
We assume that each transaction manager $TM_s$ uses a secure locking protocol as the concurrency control protocol [MJ93, Inf93b, Inf93a, ABJ94]. These locking protocols have the common feature that whenever a transaction $T_i$ wishes to read (write) a data object $x$, it must first acquire a read lock (write lock) on $x$. However, $T_i$ must release a read lock on a data object $x$ whenever another transaction $T_j$ such that $L(T_j) < L(T_i)$ requests a write lock on $x$.
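The lock-breaking rule can be illustrated with a deliberately small lock table. This is a sketch under simplifying assumptions, not any of the cited protocols: levels are integers, conflicts between transactions at the same level (which would block as in ordinary locking) are ignored, and the class and method names are invented for the example.

```python
from collections import defaultdict


class SecureLockManager:
    """Toy lock table that only models breaking of high read locks."""

    def __init__(self):
        self.read_locks = defaultdict(dict)  # object -> {txn id: txn level}
        self.signals = defaultdict(list)     # txn id -> objects with broken read locks

    def read_lock(self, txn_id, level, obj):
        self.read_locks[obj][txn_id] = level

    def write_lock(self, txn_id, level, obj):
        """Handle a write-lock request; return the readers whose locks were broken."""
        # Every reader strictly above the writer's level must give up its lock.
        broken = [t for t, lvl in self.read_locks[obj].items() if lvl > level]
        for t in broken:
            del self.read_locks[obj][t]
            self.signals[t].append(obj)  # remembered so the reader can be notified
        return broken
```

Recording the affected object per reader is what later allows a high subtransaction to learn which of its low reads are no longer protected.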
3 Secure Early Prepare Protocol (SEP)
As we indicated in the introduction, the conventional EP [MLO86, SC93] fails in an MLS system since EP requires that no locks can be released by a subtransaction within its window of uncertainty. However, in an MLS environment, read locks on a low level object have to be released whenever a low transaction tries to write to the same object. [ABJ94] solves this problem by introducing a confirmation phase before the decision phase of EP. The complete description of SEP is as follows:

1. The coordinator $C$ generates subtransactions $T_1, T_2, \ldots, T_n$, and sends them to the participants $N_1, N_2, \ldots, N_n$, respectively.
2. Each participant $N_i$ executes $T_i$ and sends $C$ either a yes vote or a no vote. A yes vote is sent if $N_i$ could successfully complete the subtransaction. Additionally, if $T_i$ has read any low data, $N_i$ sends a read-low indicator to $C$ along with the yes vote.
3. If $C$ receives a yes vote from all participants, it checks to see if it has received any read-low indicator. If so, $C$ sends to each site $N_j$ that sent a read-low indicator a confirm message to verify that it has not released any read locks on the low level data objects prior to receiving the confirm message.
4. If $N_j$ has not released its read locks on any low level data object prior to receiving the confirm message, it responds with a confirmed message; otherwise it sends a not-confirmed message.
5. If the coordinator $C$ receives a confirmed message from all $N_j$'s to which $C$ had sent the additional round of messages, then it commits the transaction and sends a commit message to all participants; otherwise it aborts the transaction and sends abort messages to all participants.

Thus, we see that in SEP, whenever any of the subtransactions of $T$ releases any read locks on low level data objects prior to receiving the confirm message from the coordinator, $T$ gets aborted. In the worst case this may lead to starvation of certain high transactions by malicious low transactions. We feel that such a strategy of aborting the distributed transaction is too conservative. In the following section, we introduce partial rollback features to prevent the distributed transactions from aborting. In fact our modified commit protocol sends to each low-reading site instructions regarding what to do if it had to release low read locks, instead of simply aborting the distributed transaction.

Main (top-level) transaction (left panel):

    Begin_Top_Level_Transaction
      Subtran_1
        Begin_Local_Transaction
        End_Local_Transaction
      * * *
      Subtran_n
        Begin_Local_Transaction
        End_Local_Transaction
      SignalStatus[
        repeat
          Subtran_1 => GetSignal[sl_1 -> RollBack(sl_1)];
          Subtran_n => GetSignal[-> RollBack(sl_1)]
        until redoneSignalStatus = 6 ]
    End_Top_Level_Transaction

Subtransactions (right panel):

    Subtran_1 :   /* low reading trans. */
    Begin_Local_Transaction
      /* implicit sl_1 = SaveWork() */
      r[x] ; r[y] ;   /* low read operation */
      sl_2 = SaveWork();
      *
      w[x] ;
      sl_3 = SaveWork();
      *
    End_Local_Transaction

    Subtran_n :   /* another low reading trans */
    Begin_Local_Transaction
      /* implicit sl_1 = SaveWork() */
      r[p] ;   /* low read operation */
      w[q] ;
      sl_2 = SaveWork();
      *
    End_Local_Transaction

Figure 1: Example of a distributed transaction
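The decision logic of the five SEP steps in section 3 can be condensed into a single function. This is a minimal sketch that replaces message passing with plain arguments; the function name `sep_decide` and its parameters are assumptions made for illustration, not part of [ABJ94].

```python
def sep_decide(votes, read_low, released_early):
    """Return 'commit' or 'abort' for one distributed transaction under SEP.

    votes          -- dict mapping site name to 'yes' or 'no' (step 2)
    read_low       -- set of sites that sent a read-low indicator (step 2)
    released_early -- set of sites that released a low read lock before
                      the confirm message arrived (steps 3 and 4)
    """
    # Steps 2-3: every participant must have voted yes.
    if any(vote != "yes" for vote in votes.values()):
        return "abort"
    # Steps 3-4: only low-reading sites take part in the confirmation round.
    for site in read_low:
        reply = "not-confirmed" if site in released_early else "confirmed"
        # Step 5: a single not-confirmed reply aborts the whole transaction.
        if reply != "confirmed":
            return "abort"
    return "commit"
```

The second branch is the behavior the paper considers too conservative: one prematurely released read lock is enough to abort every subtransaction.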
4 System Primitives
As mentioned in section 2.1, we model a distributed transaction in the form of a nested transaction with one level of nesting. We provide a number of linguistic constructs that a programmer can use as system calls in distributed transactions. These system primitives include the familiar transaction bracketing commands like Begin and End transaction plus a few others which we now describe.

A typical distributed transaction in our model is shown in figure 1. The main (top-level) transaction is shown on the left, and details of two typical subtransactions are shown on the right. A main transaction, submitted by a user, is bracketed by the basic transaction processing system calls Begin_Top_Level_Transaction and End_Top_Level_Transaction. The entire code for the distributed transaction is within these system calls. Each subtransaction is bracketed by the pair of system calls Begin_Local_Transaction and End_Local_Transaction. The primitives for commit or abort of a transaction are Commit and Abort respectively. These can either be used explicitly by the programmer in the transaction, or they can implicitly be a part of the SignalStatus command, which is one of the set of seven new commands that we introduce. The syntax of these commands is as follows:

1. sl = SaveWork()
2. RollBack(sl)
3. GetSignal[[sl_1 -> handler_1], ..., [sl_n -> handler_n]]
4. SignalStatus[repeat TID_p => GetSignal[...]; ... TID_q => GetSignal[...] until <condition>]
5. SendNo
6. SendYes
7. NoSignalServiced

These primitives, when used appropriately, enable the programmer to dictate the secure commit protocol to either ensure consistency or to trade it for more concurrency.

Associated with the top level transaction, there are three system variables that can be accessed by the programmer.2 One is a counter variable called the redoneSignalStatus variable and the other two are boolean variables called returnValue and allReady, respectively. When the top level transaction executes the Begin_Top_Level_Transaction primitive, it initializes all three system variables as follows: redoneSignalStatus is initialized to zero, and returnValue and allReady are set to TRUE. The usage of these variables will be explained when we describe the semantics of the SignalStatus command. The execution of the End_Top_Level_Transaction command marks the end of the distributed transaction; the transaction manager for $T$ can now forget about $T$.

We associate with each subtransaction a counter variable called rollbackCount which is local to the subtransaction. When a subtransaction executes the Begin_Local_Transaction primitive, it initializes its rollbackCount to zero. This variable is incremented by one every time the subtransaction is rolled back. By checking this variable, a programmer can determine how many times a particular subtransaction has been rolled back. Note that when a subtransaction encounters the End_Local_Transaction primitive, it waits for the coordinator to initiate the commit protocol (provided it has not already decided to abort). As and when it receives an instruction from the coordinator, the subtransaction comes out of its wait state, executes the instruction and, if the instruction was not a commit or abort, returns to the wait state.
In the following we describe the semantics of each of the seven primitives in detail and show how the experienced programmer can use them to direct each of the participating low-reading sites what to do in case low-read locks had to be released. We will refer to the example of the distributed transaction in figure 1 to help clarify the usage of these primitives.
2 Their values cannot be modified by a programmer, however.

4.1 Semantics of SaveWork and RollBack

The SaveWork() system call establishes a savepoint which causes the system to record the current state of processing. Each transaction manager writes a savepoint record on the local transaction log, while the run-time support of the programming language saves the current values of any local variables on the volatile memory. The SaveWork call returns to the subtransaction a handle in the form of the identifier sl (called a signal label) which can subsequently be used to refer to that savepoint. Typically, this handle is a monotonically increasing number. The subtransaction can reestablish (return to) any savepoint by invoking a RollBack command and passing to it the signal label of the savepoint that it wants to be restored. Depending on the application logic, the transaction programmer can decide to return to the most recent savepoint or to any other savepoint. Note that each savepoint established by a subtransaction is local to only that transaction. We assume that the successful execution of the Begin_Local_Transaction primitive establishes the first savepoint for that subtransaction.

The RollBack(sl) primitive takes as parameter a signal label sl and restores the state of the system to the state that existed at the time of the establishment of the savepoint sl. The transaction then restarts its execution from the step following the sl = SaveWork() step. More formally, the result of the RollBack(sl) command is the execution of a series of undo(op_i) operations, the duals of the operations op_i. For each operation op_i that precedes RollBack(sl) and up to the command SaveWork() corresponding to the signal label sl (but excluding the SaveWork() command), an undo(op_i) is executed. The effect of an undo operation is to release any lock that op_i acquired on a data object, and remove the result of op_i from the system, as if op_i was never executed. Consequently, the data objects as well as the local program variables are restored to the state at the savepoint sl. Once that state has been restored, the RollBack(sl) call terminates. The transaction is now ready to re-execute starting from the operation that follows the SaveWork() command.

Note that if there is a conditional branching command between the sl = SaveWork() step and the RollBack(sl) step, then a different set of commands may be executed during the re-execution than was executed initially before the rollback. Figure 2 shows how the SaveWork() and RollBack(sl) primitives control the flow of a transaction. The original transaction is shown on the left. The steps that are effectively executed are shown on the right. The Begin_Local_Transaction command establishes implicitly the first savepoint sl_1. The results of the r[x] and w[y] operations are saved as a result of the sl_2 = SaveWork() step. Since the RollBack command has sl_2 as the parameter, the transaction performs undo operations undo(w[p]) and undo(r[z]), and then reexecutes r[z] and w[p]. Once this is done, the transaction resumes normal execution with the command following the RollBack command.

    Original Transaction        Effective Transaction
    Begin_Local_Transaction     Begin_Local_Transaction   (establishes implicitly the first savepoint sl_1)
    r[x]                        r[x]
    w[y]                        w[y]
    sl_2 = SaveWork()           sl_2 = SaveWork()         (results of r[x], w[y] are saved by sl_2)
    r[z]                        r[z]
    w[p]                        w[p]
    RollBack(sl_2)              undo(w[p])                (r[z], w[p] are first undone by this command
                                undo(r[z])                 and then they are re-executed)
                                r[z]
                                w[p]
    w[x]                        w[x]                      (these commands are after the rollback is complete)
    End_Local_Transaction       End_Local_Transaction

Figure 2: Use of SaveWork and RollBack primitives

4.2 Semantics of GetSignal

In our protocol, we assume that the lock manager sends a message (signal) to the transaction manager whenever a low level read lock, held by a high subtransaction, has to be released prematurely. The GetSignal system primitive is used by the programmer to specify how such signals are to be handled (or serviced) by the high subtransaction. The GetSignal call has two exit points: a standard one, which is the next instruction after the GetSignal step, and an exceptional continuation, which is represented by the expression

    [sl_1 -> handler_1], ..., [sl_n -> handler_n]

Each sl_i represents a signal label and the corresponding handler represents the code to be executed for this signal label.

When a read lock acquired by a high subtransaction $T_i$ on a low data object has to be released, the Lock Manager raises a signal (i.e., sends a message) to the high transaction manager for $T_i$ giving the low data object in question. Each signal received by the transaction manager indicates a new value for a low data object that has been previously read by $T_i$. Upon receipt of a signal, the transaction manager locates the savepoint which immediately precedes the read of the data object identified by the signal. We say the savepoint covers the data object. For example, if the signal indicates a new value for the data object x, then the signal label sl_i corresponding to the savepoint before the operation r_i[x] is chosen. If multiple low read locks of $T_i$ are broken, the transaction manager receives multiple signals, one for each broken lock, from the lock manager.

The high transaction manager buffers all signals that it receives from the lock manager for each transaction $T_i$. Later on, when $T_i$ invokes a GetSignal call, the transaction manager considers all the signals it has buffered for $T_i$, and selects one signal to be serviced as follows: it selects a signal label that covers all data objects with broken read locks (i.e., the signal label that precedes all other signal labels that are generated due to broken read locks).

To illustrate how the transaction manager selects a signal to service, consider the transaction fragment shown in figure 3. In the example, a high transaction $T_i$ performs three low read operations r_i[p], r_i[q], and r_i[s]. If the low read lock on p is broken, the signal generated by the transaction manager is sl_p since sl_p covers r_i[p]. Similarly, if read locks on q and s are broken, the transaction manager will generate signals sl_q and sl_s, respectively. If all three low read locks are broken, the signal sl_p will be selected for servicing by the transaction manager since it covers all three objects.

    *
    sl_p = SaveWork()      (this savepoint covers r_i[p], r_i[q] & r_i[s])
    r_i[p]
    *
    sl_q = SaveWork()      (this savepoint covers r_i[q] & r_i[s])
    r_i[q]
    *
    sl_s = SaveWork()      (this savepoint covers only r_i[s])
    r_i[s]
    *
    GetSignal[sl_p -> ...; sl_q -> ...; sl_s -> ...]

Figure 3: Choosing a signal to be serviced

When a signal sl_i is selected for servicing, the corresponding handler handler_i is executed. The programmer can explicitly specify the default invocation by using the notation GetSignal[-> RollBack], which does not specify any signal label. If there is a signal to be serviced, the default invocation rolls back the transaction $T_i$ to the savepoint immediately preceding the earliest low read operation among all the low reads of $T_i$ that have been signalled. If no savepoint has been explicitly established in the transaction, then the default invocation rolls back $T_i$ to the default savepoint coinciding with Begin_Local_Transaction. It should be noted that the GetSignal call is non-blocking, i.e., the call does not wait for the arrival of a signal. If a signal is already available, it is serviced; otherwise no action is taken.
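The savepoint bookkeeping of sections 4.1 and 4.2 can be sketched with a small in-memory model. This is an illustration under several assumptions: signal labels are consecutive integers (the text only says they are typically monotonically increasing), operations are recorded rather than executed, re-execution after a rollback is left to the caller, and the class `SubTransaction` with its method names is hypothetical.

```python
class SubTransaction:
    """Toy model of savepoints, rollback and signal selection."""

    def __init__(self):
        self.ops = []            # operations performed so far, in order
        self.savepoints = {}     # signal label -> number of ops at that savepoint
        self.covering = {}       # low object -> label of the savepoint preceding its read
        self.pending = []        # buffered signals (objects whose read lock was broken)
        self.rollback_count = 0
        self.latest = 0
        self.save_work()         # implicit first savepoint (sl_1)

    def save_work(self):
        self.latest = max(self.savepoints, default=0) + 1
        self.savepoints[self.latest] = len(self.ops)
        return self.latest

    def read_low(self, obj):
        self.covering.setdefault(obj, self.latest)
        self.ops.append(("r", obj))

    def write(self, obj):
        self.ops.append(("w", obj))

    def roll_back(self, label):
        """Undo every operation after savepoint `label`; return them newest first."""
        cut = self.savepoints[label]
        undone = list(reversed(self.ops[cut:]))
        del self.ops[cut:]
        # Savepoints and reads established after `label` no longer exist.
        self.savepoints = {l: n for l, n in self.savepoints.items() if l <= label}
        self.covering = {o: l for o, l in self.covering.items() if l < label}
        self.latest = label
        self.rollback_count += 1
        return undone

    def signal(self, obj):
        self.pending.append(obj)  # buffered until GetSignal is invoked

    def select_signal(self):
        """Earliest signal label covering all broken reads, or None if no signal."""
        labels = [self.covering[o] for o in self.pending if o in self.covering]
        return min(labels) if labels else None
```

Replaying figure 2 with this model (reads of x and z, writes of y and p, one explicit savepoint) undoes w[p] and then r[z]; replaying figure 3 with signals for q and s selects the label that plays the role of sl_q, and adding a signal for p moves the selection to sl_p.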
4.3 Semantics of SignalStatus, SendNo, SendYes, and NoSignalServiced

The SignalStatus primitive constitutes the heart of our commit protocol. When a programmer uses this command, the system will embark on the secure commit protocol. The SignalStatus primitive is always invoked by a top level transaction at the coordinating site. The body of the command consists of a Repeat ... Until loop. This means that the body is executed at least once. Within the loop there are a number of steps of the form TID_i => GetSignal[...]. Each of these steps represents a concurrent thread of operation. When the SignalStatus command is executed the following things happen at the coordinator:

1. For each step of the form TID_i => GetSignal[...] within the body of the loop, the top level transaction forks a concurrent thread of operation.
2. Each of these concurrent threads invokes a remote system call at the corresponding subtransaction denoted by TID_i. The remote system call invoked is the GetSignal primitive specified in the right hand side of TID_i => GetSignal[...]. Essentially what each thread does is to send the GetSignal call with all its defined handlers in a message to the participant TID_i. The thread then waits for a response to come from the participant.
3. Upon receiving the message from the coordinator, the subtransaction TID_i executes the GetSignal call as if it has been locally invoked. If the participant is able to successfully execute the handler specified in the GetSignal, it sends a yes message to the coordinator by executing the SendYes command. If no signal is serviced, the participant replies with a NoSignalServiced message, generated by invoking the NoSignalServiced primitive. Finally, if the participant fails to service a signal, it sends a no message by executing the SendNo command.
4. A thread at the coordinator comes out of its wait state when it receives a response from the participant. If the thread receives a NoSignalServiced message, it performs the operation "allReady = allReady ∧ TRUE". For all other messages it first performs the operation "allReady = FALSE". Then it checks to see if it has received a yes or a no message. If the thread receives a yes response from the participant it performs the operation "returnValue = returnValue ∧ TRUE", else it performs "returnValue = FALSE" and then returns. This step is performed by each thread at the coordinator.
5. When all the threads return from their wait states, first the value of returnValue is checked. If it is FALSE (which will be the case if even one of the lower reading participants replies with a no message; see the computation in the above step), the coordinator aborts the transaction and informs all participants accordingly. On the other hand, if returnValue is TRUE (i.e., none of the lower reading participants has replied with a no message), the variable allReady is tested. If this variable is TRUE (which will be the case only if all low reading participants have replied with NoSignalServiced), the coordinator commits and informs all participants of the decision. If it is FALSE (i.e., there is at least one lower reading participant that has sent a yes message), the condition portion of the Repeat ... Until loop is checked. If the condition is false, the boolean variables returnValue and allReady are reset to TRUE, the counter variable redoneSignalStatus is incremented by one and the loop is reexecuted starting from step 1 above. If on the other hand the condition evaluates to true, the coordinating transaction commits and sends a commit message to each participant transaction.

We would like to mention here that the programmer can utilize the redoneSignalStatus variable within the condition portion of the loop to specify the number of times the loop is executed. We leave it to the programmer to decide what should be an appropriate value of redoneSignalStatus as there is no single metric on which to decide the value.
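The coordinator side of SignalStatus, as described in steps 1 to 5 above, can be sketched as a loop over participant replies. The sketch makes two simplifications that are not in the paper: participants are polled sequentially instead of through forked threads, and each remote GetSignal call is represented by a local callable. The names `signal_status`, `participants` and `until` are hypothetical.

```python
def signal_status(participants, until):
    """Run the Repeat ... Until loop; return (decision, redoneSignalStatus).

    participants -- dict mapping a TID to a zero-argument callable that stands
                    in for the remote GetSignal call and returns 'yes', 'no'
                    or 'NoSignalServiced'
    until        -- callable taking redoneSignalStatus and returning a bool
    """
    redone_signal_status = 0
    while True:
        return_value, all_ready = True, True      # reset before every pass
        for remote_get_signal in participants.values():
            reply = remote_get_signal()
            if reply != "NoSignalServiced":
                all_ready = False                 # some signal had to be serviced
                if reply != "yes":
                    return_value = False          # servicing the signal failed
        if not return_value:
            return "abort", redone_signal_status
        if all_ready:
            return "commit", redone_signal_status
        if until(redone_signal_status):
            # Forced commit: locks may have been broken again during this pass.
            return "commit", redone_signal_status
        redone_signal_status += 1
```

A condition such as `lambda n: n == 6` mirrors the `until redoneSignalStatus = 6` clause of figure 1; a commit reached through that branch is the case where atomicity is obtained at the possible expense of serializability.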
5 Advanced Secure Early Prepare Protocol (ASEP)
We now present our version of the secure early prepare commit protocol. We assume that a secure locking protocol is being used as the concurrency control algorithm by all transaction managers. Furthermore, we assume that the programmer knows which subtransactions will be reading low data.

Algorithm 1 [Advanced Secure Early Prepare]

1. A user who is logged on at security level s initiates a distributed transaction T at a node C. The level of the transaction becomes s. The coordinator runs at site C. Henceforth C will denote the coordinator.

2. Coordinator C generates subtransactions T1, T2, ..., Tn. The security level of each Ti is the same as that of T, viz., s. C forces a membership log record and forwards the subtransactions to participants N1, N2, ..., Nn, respectively.

3. The s-level transaction manager TMs at each participant receives the subtransaction. It proceeds to acquire the pertinent locks and then executes the subtransaction. If the subtransaction is able to complete successfully, TMs forces a prepare log record and sends a yes vote to the coordinator C. If, on the other hand, the subtransaction is unable to complete successfully, it is aborted. In this case, TMs sends a no vote to the coordinator C.

4. When the coordinator C has received a vote from each participant, it checks to see if there is any no vote from any participant or if there is any missing vote (if a vote does not arrive within a timeout period, it is considered to be a missing vote).

5. Depending on the votes received, the coordinator takes the following actions:

   (a) If there is even one no vote or one missing vote, the coordinator aborts the transaction and sends an abort message to each participant. On receipt of the abort message, the participant aborts the subtransaction and sends an acknowledgement to the coordinator.

   (b) If there is no missing vote and all are yes votes, the coordinator checks if there was any subordinate which read any low objects. If there is no such subordinate, the coordinator commits T, forces a commit record and then sends commit messages to all its subordinates. On receipt of the commit message, the participant commits the subtransaction and sends an acknowledgement to the coordinator.

   (c) If there were subtransactions that read lower level data, then the coordinator executes a SignalStatus command which contains one GetSignal remote system call for each participant that read low. Essentially, by executing the SignalStatus command the coordinator invokes a remote system call at the transaction manager of each participant that has read low data (and only at these low reading participants).

   (d) The SignalStatus command is executed according to the semantics specified earlier.

       i. If the coordinator receives NoSignalServiced from all participants as replies to the execution of the remote system calls, then it commits the transaction, forces a commit record and sends commit messages to all participants. The participants then commit and send an acknowledgement to the coordinator.

       ii. If the remote system call at a low reading participant specifies GetSignal[sl → RollBack(sl)], then whenever that participant needs to service a signal, it performs a rollback. If the reexecution is successful, the subtransaction sends a yes message to the coordinator; otherwise it sends a no message. If the coordinator receives even one no message from a participant (with a missing message being counted as a no), it aborts the transaction and notifies all participants accordingly. On the other hand, if the coordinator has not received any no response (i.e., the responses are either (1) all yes or (2) some yes and others NoSignalServiced), then it checks the condition specified in the Repeat ... Until loop of the SignalStatus command. (Note that the programmer can make use of the variables allReady and returnValue to specify the condition.) If the condition is FALSE, the coordinator reissues the remote system calls to the low reading participants and performs this step all over again. If the execution exits the loop because the condition has evaluated to TRUE, the transaction is committed.

       iii. As in the previous step, if any of the remote system calls specifies a GetSignal[sl → <some forward recovery>], the participant executes that forward recovery in case it has to service the corresponding signal. The rest of the execution remains the same as that in step 5(d)ii.

Subtransactions that do not read lower level data hold on to all their locks until they commit. Subtransactions that read lower level data hold on to their locks until commit time, unless they are forced to release lower level read locks by updating lower level transactions. In such a case, if a low reading subtransaction decides to roll back to a savepoint s, it releases the locks acquired after the savepoint s and holds on to the locks acquired before s. Such a strategy is essential if consistency of the database is required.
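The lock retention rule for partial rollback can be illustrated with a minimal Python sketch. The class SubtransactionLocks and its method names are hypothetical and are not part of ASEP; the sketch only records the order in which locks were acquired relative to savepoints, and it does not model how a lower level writer breaks a read lock.

```python
from dataclasses import dataclass, field


@dataclass
class SubtransactionLocks:
    """Hypothetical bookkeeping of locks and savepoints for one subtransaction."""
    acquired: list = field(default_factory=list)    # locked objects, in acquisition order
    savepoints: dict = field(default_factory=dict)  # label -> number of locks held at SaveWork()

    def save_work(self, label):
        # Remember how many locks were held when the savepoint was established.
        self.savepoints[label] = len(self.acquired)

    def lock(self, obj):
        self.acquired.append(obj)

    def roll_back(self, label):
        # Release the locks acquired after the savepoint and keep the earlier ones.
        keep = self.savepoints[label]
        released = self.acquired[keep:]
        del self.acquired[keep:]
        # Savepoints established after `label` are discarded as well.
        labels = list(self.savepoints)
        for later in labels[labels.index(label) + 1:]:
            del self.savepoints[later]
        return released


# A subtransaction with three savepoints (the object names are illustrative).
t = SubtransactionLocks()
t.save_work("sl1")
t.lock("s")
t.lock("x")
t.save_work("sl2")
t.lock("z")
t.lock("y")
t.save_work("sl3")
t.lock("q")
released = t.roll_back("sl2")
print(released, t.acquired)
```

Rolling back to sl2 releases the locks taken after that savepoint and leaves the locks on s and x in place, which is the behaviour the paragraph above requires.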
6 Ensuring Commit under ASEP
In this section, we give two examples to illustrate how it is possible to commit distributed transactions under ASEP even if subtransactions release their low level read locks prematurely. We assume that there are two sites, A and B. Site A stores data objects q, s, x, y and z, with L(x) = L(z) = High and L(q) = L(s) = L(y) = Low. Site B stores the data objects m, n, o and p, with L(o) = L(p) = High and L(m) = L(n) = Low. TA denotes the subtransaction to be executed at site A, while TB represents the subtransaction destined for site B.

    Begin Top Level Transaction
      TA : Begin Local Transaction
             r[s]                /* this is a low read */
             r[x]
             sl2 = SaveWork()
             w[z]
             r[y]                /* another low read */
             sl3 = SaveWork()
             r[q]                /* a third low read */
           End Local Transaction
      TB : Begin Local Transaction
             r[o]
             w[p]
           End Local Transaction
      SignalStatus[
        repeat
          TA ⇒ GetSignal[ sl2 → RollBack(sl2)
                              → RollBack ]
        until (redoneSignalStatus = 1) ]
    End Top Level Transaction

Figure 4: A distributed transaction in which the Repeat ... Until loop is executed once

Example 1 Consider the distributed transaction given in figure 4. Since only subtransaction TA reads low data, the SignalStatus command contains code for TA only. TA contains three low read operations, viz., r[s], r[y] and r[q]. If the read lock on s is broken, then the signal raised will be sl1 (which was established when the Begin Local Transaction command in TA was executed). The signal will be sl2 if the lock on y is broken, and it will be sl3 if the lock on q is broken. Notice that the programmer has chosen to define an explicit handler for sl2, but has chosen the default invocation if either of the other two signals is selected for servicing. The condition that has been specified in the Repeat ... Until loop is "redoneSignalStatus = 1". As a result, the loop will be executed once and only once. Therefore, if TA replies yes (using SendYes) after servicing a signal or if it replies with a NoSignalServiced, the top level transaction can commit. □

Example 2 Consider now the distributed transaction shown in figure 5. Since both subtransactions TA and TB read lower level data, signal handlers have been defined for both of them within the SignalStatus command. The subtransaction TA and the corresponding parameters in the SignalStatus command are identical to the ones in example 1. TB, on the other hand, now reads low data items n and m. The read of n is covered by the signal label sl1, while sl2 covers the read of m. Notice that the programmer has specified a forward recovery operation r[n] as the handler for sl1. Notice also that the condition portion of the Repeat ... Until loop contains "redoneSignalStatus = 6". This means that the loop may be executed as many as six times before the distributed transaction commits.

    Begin Top Level Transaction
      TA : Begin Local Transaction
             r[s]                /* this is a low read */
             r[x]
             sl2 = SaveWork()
             w[z]
             r[y]                /* another low read */
             sl3 = SaveWork()
             r[q]                /* a third low read */
           End Local Transaction
      TB : Begin Local Transaction
             r[n]                /* this is a low read */
             w[o]
             sl2 = SaveWork()
             r[m]                /* a second low read */
             w[p]
           End Local Transaction
      SignalStatus[
        repeat
          TA ⇒ GetSignal[ sl2 → RollBack(sl2)
                              → RollBack ]
          TB ⇒ GetSignal[ sl1 → r[n]
                          sl2 → RollBack(sl1) ]
        until (redoneSignalStatus = 6) ]
    End Top Level Transaction

Figure 5: A distributed transaction in which the Repeat ... Until loop is executed multiple times

A typical scenario for the execution of the SignalStatus command may be as follows. In the first pass through the loop, suppose TA services a signal and sends a yes response while TB sends a NoSignalServiced response to the coordinator. At the end of this execution, redoneSignalStatus becomes 1. Also, at this stage the value of allReady is FALSE, and that of returnValue is TRUE. Consequently, the loop is executed a second time. During this execution, suppose TB services a signal, while TA sends a NoSignalServiced response. As a result, redoneSignalStatus will have a value of two, allReady will be FALSE, and returnValue will be TRUE. This will make the loop execute a third time. If during this third execution of the loop both TA and TB send NoSignalServiced messages, then the distributed transaction will go ahead and commit. Notice that the loop may execute a maximum of six times before the distributed transaction commits. Of course, if either TA or TB sends a no message after handling a signal, the distributed transaction will abort. □
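Both examples follow the coordinator logic of step 5 of Algorithm 1, and the two scenarios can be traced with a small Python sketch. The names Reply, decide_after_votes and signal_status are hypothetical, participant replies are supplied as a script rather than obtained through remote system calls, and we assume that redoneSignalStatus is incremented before the loop condition is tested, which matches the walk-through of example 2.

```python
from enum import Enum


class Reply(Enum):
    YES = "yes"                     # a signal was serviced and re-execution succeeded
    NO = "no"                       # a signal was serviced and re-execution failed
    NO_SIGNAL = "NoSignalServiced"  # no signal had to be serviced


def decide_after_votes(votes, low_readers):
    """Steps 4-5(c). votes maps participant -> "yes", "no" or None (missing vote)."""
    if any(v != "yes" for v in votes.values()):
        return "abort"
    return "signal_status" if low_readers else "commit"


def signal_status(rounds, until):
    """Step 5(d) over scripted replies.

    rounds: list of dicts {participant: Reply or None}, one dict per loop pass;
            None models a missing reply, which is counted as a no.
    until:  predicate on redoneSignalStatus (the programmer's exit condition).
    Returns (outcome, number of passes executed).
    """
    redone = 0
    for replies in rounds:
        redone += 1
        got = list(replies.values())
        if any(r is None or r is Reply.NO for r in got):
            return "abort", redone
        if all(r is Reply.NO_SIGNAL for r in got):
            return "commit", redone   # step 5(d)i
        if until(redone):
            return "commit", redone   # loop condition evaluated to TRUE
    raise ValueError("script ended before the loop terminated")


# Example 1: TA services a signal and replies yes; the loop runs once.
example1 = signal_status([{"TA": Reply.YES}], until=lambda n: n == 1)

# Example 2: the three passes described in the scenario above.
example2 = signal_status(
    [
        {"TA": Reply.YES, "TB": Reply.NO_SIGNAL},
        {"TA": Reply.NO_SIGNAL, "TB": Reply.YES},
        {"TA": Reply.NO_SIGNAL, "TB": Reply.NO_SIGNAL},
    ],
    until=lambda n: n == 6,
)
print(example1, example2)
```

In this sketch example 1 commits after one pass and example 2 commits after three passes; a single no or missing reply in any pass aborts the transaction.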
7 Ensuring Consistency under ASEP
In the previous section, we gave two examples that show how our primitives can be used to commit transactions that would have to abort under a different commit protocol (e.g., SEP [ABJ94]). The cost, however, is that this is achieved by possibly sacrificing serializability. We shift our focus in this section and show how our primitives can help even if a programmer desires serializability at all times. Once again, our primitives give greater flexibility to programmers in that they can be used to commit transactions that would otherwise have to abort, without sacrificing serializability. If we examine the steps of ASEP, we see that the only time a programmer can be sure to have a serializable execution is when the coordinator receives a NoSignalServiced response from all participants. This can be accomplished in one of the following three ways:
1. The Repeat ... Until loop in the SignalStatus command is executed once. If all the low reading participants send a NoSignalServiced message to the coordinator, then the distributed transaction commits; otherwise it aborts. (Note that in this case ASEP reduces to SEP [ABJ94].)

2. The Repeat ... Until loop in the SignalStatus command is executed over and over again, with each of the low reading participants rolling back completely when servicing a signal, until all of them send NoSignalServiced messages.

3. The programmer specifies the maximum number of times each of the low reading participants rolls back before the distributed transaction aborts.

Note that for each of these cases, a subtransaction may either roll back completely (i.e., to the first savepoint) or roll back partially. If we choose the former (which is conservative), then the programmer need only specify the default invocation of GetSignal in the SignalStatus command. On the other hand, if the latter is desired, then the programmer must specify a RollBack(slj) handler for every possible signal. The easiest way to achieve this is as follows: Suppose a subtransaction Ti contains the low read operations ri[x], ri[y] and ri[z] (in this order). The programmer can precede each of these low read operations by a sl = SaveWork() command, and specify handlers for each of the signal labels slx, sly and slz in the SignalStatus command, as shown in figure 6.

    Ti : Begin Local Transaction
           ...
           slx = SaveWork()
           ri[x]               /* this is a low read */
           ...
           sly = SaveWork()
           ri[y]               /* this is the second low read */
           ...
           slz = SaveWork()
           ri[z]               /* this is the last low read */
           ...
         End Local Transaction
    SignalStatus[
      repeat
        ...
        Ti ⇒ GetSignal[ slx → RollBack(slx)
                        sly → RollBack(sly)
                        slz → RollBack(slz) ]
        ...
      until some condition ]

Figure 6: Incorporating multiple savepoints and partial rollbacks of subtransactions

The program fragments for each of the three aforementioned cases are given in figures 7-9. For simplicity, we assume that the distributed transaction T consists of two subtransactions T1 and T2, both of which read lower level data. Furthermore, we assume that the only savepoint that is established in each subtransaction is the first savepoint, resulting from the execution of the primitive Begin Local Transaction. Thus, if a subtransaction has to roll back, it does so by rolling back to the first savepoint. As discussed above, we can easily eliminate this assumption by establishing more savepoints and allowing the subtransactions to roll back to different savepoints.

    Begin Top Level Transaction
      T1 : Begin Local Transaction
             ...
           End Local Transaction
      T2 : Begin Local Transaction
             ...
           End Local Transaction
      SignalStatus[
        repeat
          T1 ⇒ GetSignal[ sl1 → SendNo ]
          T2 ⇒ GetSignal[ sl1 → SendNo ]
        until (TRUE) ]
    End Top Level Transaction

Figure 7: Simulating SEP in ASEP

Note that in figure 7 the Repeat ... Until loop in the SignalStatus command will be executed once and only once. Since the handler for the GetSignal command in each subtransaction specifies a SendNo command, the coordinator aborts if any of the subtransactions services a signal (i.e., there is a broken lock). The only situation in which the coordinator commits is if each Ti sends a NoSignalServiced response.

In figure 8, first of all note that since the programmer has specified the default invocation of GetSignal, whichever signal is selected to be serviced, the subtransaction will always roll back to the first savepoint. In our example we could have provided sl1 → RollBack(sl1) as the parameter of each GetSignal call. Furthermore, if there were a number of savepoints (one for each low read operation), we could have written the code as discussed previously (i.e., with a slx → RollBack(slx) for each slx). The point to observe here is that the Repeat ... Until loop is repeated indefinitely, for as long as some participant has to service a signal.

In figure 9, the programmer requires the use of the counter variable rollbackCount, which was introduced in section 4. Recall that this counter variable is associated with each subtransaction and is local to the particular subtransaction. When a subtransaction commences execution, its rollbackCount variable is initialized to zero, and the counter is incremented by one every time the subtransaction rolls back, whether partially or completely. Note that allowing the value of the counter variable redoneSignalStatus to go up to twelve ensures that at least one of the subtransactions is allowed to roll back (if required) its maximum specified number of times.
If a subtransaction has rolled back its specified number of times and still has to service a signal, it sends a no vote which causes the distributed transaction to abort.
    Begin Top Level Transaction
      T1 : Begin Local Transaction
             ...
           End Local Transaction
      T2 : Begin Local Transaction
             ...
           End Local Transaction
      SignalStatus[
        repeat
          T1 ⇒ GetSignal[ → RollBack ]
          T2 ⇒ GetSignal[ → RollBack ]
        until not allReady ]
    End Top Level Transaction

Figure 8: Executing the Repeat ... Until loop again and again until all participants respond with NoSignalServiced messages
    Begin Top Level Transaction
      T1 : Begin Local Transaction
             ...
           End Local Transaction
      T2 : Begin Local Transaction
             ...
           End Local Transaction
      SignalStatus[
        repeat
          T1 ⇒ GetSignal[ sl1 → If rollbackCount < 5
                                   RollBack(sl1)
                                 else SendNo ]
          T2 ⇒ GetSignal[ sl1 → If rollbackCount < 6
                                   RollBack(sl1)
                                 else SendNo ]
        until (redoneSignalStatus = 12) ]
    End Top Level Transaction

Figure 9: Executing each subtransaction six times before the distributed transaction is aborted
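The bounded handler of figure 9 can be sketched in Python as follows. The class BoundedRollback and the reexecute callback are hypothetical stand-ins: the callback represents redoing the work after the savepoint and reporting whether it succeeded, and the counter plays the role of rollbackCount, which starts at zero and is local to one subtransaction.

```python
class BoundedRollback:
    """Hypothetical handler: If rollbackCount < limit RollBack(sl1) else SendNo."""

    def __init__(self, limit):
        self.limit = limit
        self.rollback_count = 0  # initialized to zero when the subtransaction starts

    def service_signal(self, reexecute):
        # Called only when a signal has to be serviced (a low read lock was broken).
        if self.rollback_count < self.limit:
            self.rollback_count += 1  # incremented on every rollback
            return "yes" if reexecute() else "no"
        return "no"  # SendNo: the rollback limit has been reached


# A subtransaction whose handler allows five rollbacks, as for T1 in figure 9,
# and whose re-execution always succeeds in this illustration.
t1 = BoundedRollback(limit=5)
replies = [t1.service_signal(lambda: True) for _ in range(6)]
print(replies, t1.rollback_count)
```

With a limit of five, the first five serviced signals lead to a rollback and a yes reply, and the sixth produces a no, which causes the coordinator to abort the distributed transaction.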
8 Conclusions
This paper presents an advanced secure commit protocol (ASEP) for MLS distributed database systems. ASEP improves upon the secure early prepare protocol (SEP) of [ABJ94] by allowing the distributed transaction to proceed even when a subtransaction has released its low level read locks prematurely. ASEP allows partial rollback of subtransactions so that the main transaction can proceed to completion without sacrificing consistency. ASEP also allows forward recovery measures to be incorporated, which will prevent starvation if required, albeit at the expense of consistency. As part of our future work, we would like to implement ASEP using a transaction processing language like Avalon [EMS91]. This will enable us to analyze not only its performance but also its impact on different types of transaction processing, particularly long lived transactions in a cooperative environment.
REFERENCES
[ABJ94] Vijayalakshmi Atluri, Elisa Bertino, and Sushil Jajodia. Degrees of Isolation, Concurrency Control Protocols and Commit Protocols. In J. Biskup et al., editors, Database Security, VIII: Status and Prospects, pages 259-274. North-Holland, Amsterdam, 1994.
[BHG87] Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley Publishing Company, Reading, MA, 1987.
[BL76] D. E. Bell and L. J. LaPadula. Secure Computer Systems: Unified Exposition and Multics Interpretation. Technical Report MTR-2997, The Mitre Corp., Burlington Road, Bedford, MA 01730, March 1976.
[EMS91] J. L. Eppinger, L. B. Mummert, and A. Z. Spector, editors. Camelot and Avalon: A Distributed Transaction Facility. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1991.
[HCH+93] J. D. Horton, R. H. Cooper, W. F. Hyslop, B. G. Nickerson, O. K. Ward, R. Harland, E. Ashby, and W. M. Stewart. The Cascade Vulnerability Problem. Journal of Computer Security, 2(4):279-290, 1993.
[HS75] Thomas H. Hinke and Marvin Schaefer. Secure Database Management System. Technical Report RADC-TR-75-266, System Development Corporation, November 1975.
[Inf93a] Informix Software, Inc., Menlo Park, CA. Informix OnLine/Secure Administrator's Guide, April 1993.
[Inf93b] Informix Software, Inc., Menlo Park, CA. Informix OnLine/Secure Security Features User's Guide, April 1993.
[JK90] Sushil Jajodia and Boris Kogan. Transaction Processing in Multilevel Secure Databases Using Replicated Architecture. In Proc. of the IEEE Symposium on Research in Security and Privacy, pages 360-368, Oakland, May 1990.
[JM93] Sushil Jajodia and Catherine McCollum. Using Two-phase Commit for Crash Recovery in Federated Multilevel Secure Database Management Systems. In C. E. Landwehr et al., editors, Dependable Computing and Fault Tolerant Systems, Vol. 8, pages 365-381. Springer-Verlag, New York, 1993.
[JMB94] Sushil Jajodia, Catherine D. McCollum, and Barbara T. Blaustein. Integrating Concurrency Control and Commit Algorithms in Distributed Multilevel Secure Databases. In T. F. Keefe and C. E. Landwehr, editors, Database Security, VII: Status and Prospects, pages 109-121. North-Holland, Amsterdam, 1994.
[MJ93] John McDermott and Sushil Jajodia. Orange locking: Channel-free database concurrency control via locking. In B. M. Thuraisingham and C. E. Landwehr, editors, Database Security, VI: Status and Prospects, pages 267-284. North-Holland, Amsterdam, 1993.
[MLO86] C. Mohan, B. Lindsay, and R. Obermarck. Transaction Management in the R* Distributed Database Management System. ACM Transactions on Database Systems, 11(4):378-396, December 1986.
[Mos85] J. Eliot B. Moss. Nested Transactions: An Approach to Reliable Distributed Computing. Information Systems Series. The MIT Press, Cambridge, Massachusetts, 1985.
[MS88] Jonathan K. Millen and Martin W. Schwartz. The Cascading Problem for Interconnected Networks. In Proc. of the Fourth Aerospace Computer Security Applications Conference, pages 269-274, December 1988.
[SC93] James W. Stamos and Flaviu Cristian. Coordinator Log Transaction Execution Protocol. Distributed and Parallel Databases, 1:383-408, 1993.