Logging Last Resource Optimization for Distributed Transactions in Oracle WebLogic Server

Tom Barnes, Adam Messinger, Paul Parkinson, Amit Ganesh, Saraswathy Narayan, Srinivas Kareenhalli, German Shegalov

Oracle Corporation, 500 Oracle Parkway, Redwood Shores, CA 94065, USA

{firstname.lastname}@oracle.com

ABSTRACT

State-of-the-art OLTP systems execute distributed transactions using the XA-2PC protocol, a presumed-abort variant of the Two-Phase Commit (2PC) protocol. While the XA specification provides for the Read-Only and 1PC optimizations of 2PC, it does not deal with another important optimization, coined Nested 2PC. In this paper, we describe the Logging Last Resource (LLR) optimization in Oracle WebLogic Server (WLS). It adapts and improves the Nested 2PC optimization for the Java Enterprise Edition (JEE) environment. It reduces the number of forced (synchronous) writes and the number of exchanged messages when executing distributed transactions that span multiple transactional resources, including a SQL database integrated as a JDBC datasource. This optimization has been validated in SPECjAppServer2004 (a standard industry benchmark for JEE) and a variety of internal benchmarks. LLR has been successfully deployed by high-profile customers in mission-critical high-performance applications.

1. INTRODUCTION

A transaction (a sequence of operations delimited by commit or rollback calls) is a so-called ACID contract between a client application and a transactional resource such as a database or a messaging system that guarantees: 1) Atomicity: effects of aborted transactions (hit by a failure prior to commit or explicitly aborted by the user via rollback) are erased. 2) Consistency: transactions violating consistency constraints are automatically rejected/aborted. 3) Isolation: from the application perspective, transactions are executed one at a time even in typical multi-user deployments. 4) Durability (Persistence): state modifications by committed transactions survive subsequent system failures (i.e., they are redone when necessary) [13]. Transactions are usually implemented using a sequential recovery log containing undo and redo information concluded by a commit record stored in persistent memory such as a hard disk. A transaction is considered committed by recovery when its commit record is present in the log.

When a transaction involves several resources (participants, also denoted as agents in the literature) with separate recovery logs (regardless of whether they are local or remote), the commit process has to be coordinated in order to prevent inconsistent subtransaction outcomes. A dedicated resource or one of the participants is chosen to coordinate the transaction. To avoid inconsistent subtransaction outcomes, the transaction coordinator (TC) executes a client commit request using a 2PC protocol. Several presume-nothing, presumed-abort, and presumed-commit variants of 2PC are known in the literature [2, 3, 4, 5, 13]. We briefly outline the Presumed-Abort 2PC (PA2PC) because it has been chosen to implement the XA standard [12] that is predominant in today's OLTP world.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. EDBT 2010, March 22-26, 2010, Lausanne, Switzerland. Copyright 2010 ACM 978-1-60558-945-9/10/0003 ...$10.00

1.1 Presumed-Abort 2PC

Voting (Prepare) Phase: TC sends a prepare message to every participant. When a participant determines that its subtransaction can be committed, it makes the subtransaction recoverable and replies with a positive vote (ACK). Subtransaction recoverability is achieved by creating a special prepared log record and forcing the transaction log to disk. Since the transaction is not committed yet, no locks are released, which preserves the chosen isolation level. When the participant determines that the transaction is not committable for whatever reason, the transaction is aborted; nothing has to be force-logged (presumed abort); a negative vote (NACK) is returned.

Commit Phase: When every participant has returned an ACK, the TC force-logs the committed record for the current transaction and notifies all participants using commit messages. Upon receiving a commit message, participants force-log the commit decision, release locks (acquired for transaction isolation), and send a commit status to the TC. Upon collecting all commit status messages from the participants, the TC can discard the transaction information (i.e., release log records for garbage collection).

Abort Phase: When at least one participant sends a NACK, the TC sends rollback messages without forcing the log (presumed abort) and does not have to wait for rollback status messages.

A failure-free run of a PA2PC transaction on n participants costs 2n+1 forced writes (n prepared records, one committed record at the TC, and n commit records at the participants) and 4n messages (prepare, vote, commit, and commit status for each participant). Since commit/rollback status messages are needed only for the garbage collection at the coordinator site, they can be sent asynchronously, e.g., as a batch, or they can be piggybacked on the vote messages of the next transaction instance. Thus, the communication cost is reduced to 3n messages and, more importantly, to just one synchronous round trip. If one of the participants takes the role of the coordinator, we save one set of messages at the coordinator site, and we save one forced write since we do not need to log the transaction commit twice at the coordinator site. Thus, the overhead of a PA2PC commit can be further reduced to 2n forced writes and 3(n-1) messages [13].

1.2 Related Work

There are several well-known optimizations of the 2PC protocol. Two of them, the Read-Only and 1PC optimizations, are described in the XA spec and should be implemented by the vendors.

Nested 2PC: Another interesting optimization that is not part of the XA standard was originally described by Jim Gray in [2]. It is also known as Tree or Recursive 2PC. It assumes that we deal with a fixed linear topology of participants P1, …, Pn. There is no fixed TC. Instead, the role of TC is propagated left-to-right during the vote phase and right-to-left during the commit phase. Clients call commit_2pc on P1, which subsequently generates a prepare message to itself. Each Pi (indicated by variable i in the pseudocode) executes the simplified logic outlined in Figure 1 when processing a 2PC message from the source Pfrom. A failure-free execution of an instance of the Nested 2PC Optimization costs 2n forced writes and 2(n-1) messages.

    // Figure 1: simplified Nested 2PC logic executed by participant Pi (pseudocode)
    void process(TwoPCMessage msg) {
        switch (msg.type) {
        case PREPARE:
            if (i == n) {
                // Last participant: decide the outcome and report it backwards.
                if (executeCommit() == OK) {
                    sendCommitTo(msg.from);
                } else {
                    executeAbort();
                    sendAbortTo(msg.from);
                }
            } else {
                // Inner participant: prepare locally, then pass prepare to the right.
                if (executePrepare() == OK) {
                    sendPrepareTo(i+1);
                } else {
                    executeAbort();
                    sendAbortTo(i-1);
                    sendAbortTo(i+1);
                }
            }
            break;
        case ABORT:
            // Forward the abort away from its sender, then abort locally.
            if (i != n) {
                sendAbortTo(msg.from == i-1 ? i+1 : i-1);
            }
            executeAbort();
            break;
        case COMMIT:
            // Pass the commit decision to the left, then commit locally.
            if (i != 1) {
                sendCommitTo(i-1);
            }
            executeCommit();
            break;
        } // switch
    }

1.3 Java Transaction API

Java Transaction API (JTA) is a Java “translation” of the XA specification; it defines standard Java interfaces between a TC and the parties involved in a distributed transaction system: the transactional resources (e.g., messaging and database systems), the application server, and the transactional client applications [11]. A TC implements the JTA interface javax.transaction.TransactionManager. Conversely, in order to participate in 2PC transactions, participant resources implement the JTA interface javax.transaction.xa.XAResource. In order to support a hierarchical 2PC, a TC also implements javax.transaction.xa.XAResource, which can be enlisted by third-party TCs. From this point on, we will use the JTA jargon only: we say transaction manager (TM) for a 2PC TC, and (XA)Resource for a 2PC participant.

Every transaction has a unique id that is encoded into a structure named Xid (along with the id of the TM) representing a subtransaction, or a branch in the XA jargon. The isSameRM method allows the TM to compare the resource it is about to enlist in the current transaction against every resource it has already enlisted, in order to reduce the complexity of 2PC during the commit processing and potentially enable the 1PC Optimization. The TM makes sure that the work done on behalf of a transaction is delimited with calls to start(xid) and end(xid), respectively, which adds at least 4n messages per transaction since these calls are typically blocking. When a client requests a commit of a transaction, the TM first calls prepare on every enlisted resource and, based on the returned statuses, calls either commit or rollback in the second phase. After a crash, the TM contacts all registered XAResources and executes the misnamed method recover to obtain the Xid list of prepared but not committed, so-called in-doubt, transactions. The TM then checks every Xid belonging to it (there can be Xids of other TMs) against its log and (re)issues the commit call if the commit entry is present in the log; otherwise abort is called.

1.4 Contribution

Our contribution is an implementation of the LLR optimization through JDBC datasources in WLS [8], an adaptation of the Nested 2PC optimization to the JEE. The rest of the paper is organized as follows: Section 2 outlines relevant WLS components, Section 3 discusses the design considerations of the LLR optimization, Section 4 describes the implementation details of normal operation and failure recovery, Section 5 highlights performance achieved in a standard JEE benchmark, and finally Section 6 concludes this paper.
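Since LLR builds on Nested 2PC, it may help to see the message flow of Figure 1 in executable form. The following is only a minimal sketch, not the WLS implementation: the class name Nested2pcSketch, the State enum, the synchronous in-process message delivery, and the convention that index 0 stands for the client are all assumptions made for illustration. It mirrors the branching of Figure 1 and counts the messages exchanged between distinct participants, so that the 2(n-1) figure quoted for the failure-free case can be checked for a given n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative simulation of the Figure 1 logic on a chain P1..Pn (hypothetical names). */
public class Nested2pcSketch {
    enum Type { PREPARE, COMMIT, ABORT }
    enum State { ACTIVE, PREPARED, COMMITTED, ABORTED }

    final int n;                 // chain length: participants P1..Pn
    final int failing;           // participant whose local step fails; 0 means none (assumption)
    final State[] state;         // state[i] belongs to Pi; index 0 is unused
    final List<String> trace = new ArrayList<>();
    int messages = 0;            // messages between distinct participants only

    Nested2pcSketch(int n, int failing) {
        this.n = n;
        this.failing = failing;
        this.state = new State[n + 1];
        Arrays.fill(state, State.ACTIVE);
    }

    /** The client calls commit on P1; the prepare P1 sends to itself is not counted. */
    Nested2pcSketch run() {
        process(Type.PREPARE, 0, 1);
        return this;
    }

    /** Delivers a message synchronously; targets outside 1..n (the client) are ignored. */
    void send(Type type, int from, int to) {
        if (to < 1 || to > n) return;
        messages++;
        trace.add("P" + from + " -" + type + "-> P" + to);
        process(type, from, to);
    }

    /** Same branching as Figure 1, executed by participant Pi. */
    void process(Type type, int from, int i) {
        switch (type) {
            case PREPARE:
                if (i == n) {
                    if (i != failing) { state[i] = State.COMMITTED; send(Type.COMMIT, i, from); }
                    else { state[i] = State.ABORTED; send(Type.ABORT, i, from); }
                } else if (i != failing) {
                    state[i] = State.PREPARED;
                    send(Type.PREPARE, i, i + 1);
                } else {
                    state[i] = State.ABORTED;
                    send(Type.ABORT, i, i - 1);
                    send(Type.ABORT, i, i + 1);
                }
                break;
            case ABORT:
                if (i != n) send(Type.ABORT, i, from == i - 1 ? i + 1 : i - 1);
                state[i] = State.ABORTED;
                break;
            case COMMIT:
                if (i != 1) send(Type.COMMIT, i, i - 1);
                state[i] = State.COMMITTED;
                break;
        }
    }

    /** Number of participants currently in the given state. */
    long count(State s) {
        return Arrays.stream(state, 1, n + 1).filter(x -> x == s).count();
    }

    public static void main(String[] args) {
        Nested2pcSketch ok = new Nested2pcSketch(4, 0).run();
        ok.trace.forEach(System.out::println);
        System.out.println("messages=" + ok.messages
                + " committed=" + ok.count(State.COMMITTED));

        Nested2pcSketch bad = new Nested2pcSketch(4, 3).run();
        System.out.println("messages=" + bad.messages
                + " aborted=" + bad.count(State.ABORTED));
    }
}
```

For n = 4 without failures, the sketch exchanges three prepare and three commit messages, i.e. 2(n-1) = 6, and every participant ends up committed. Forced writes are deliberately not counted here, because how many log forces each local step implies depends on the resource and is not specified by Figure 1.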

[Figure residue: a) normal operation, with prepare messages passed along the chain P1, P2, P3, P4.]
