Semantics–based Multilevel Transaction Management in Federated Systems

Andrew Deacon, Hans–Jörg Schek, Gerhard Weikum
Department of Computer Science, ETH Zürich, Zürich CH–8092, Switzerland

Abstract

A federated database management system (FDBMS) is a special type of distributed database system that enables existing local databases, in a heterogeneous environment, to maintain a high degree of autonomy. One of the key problems in this setting is the coexistence of local transactions and global transactions, where the latter access and manipulate data of multiple local databases. In modeling FDBMS transaction executions we propose a more realistic model than the traditional read/write model; in our model a local database exports high–level operations which are the only operations distributed global transactions can execute to access data in the shared local databases. Such restrictions are not unusual in practice as, for example, no airline or bank would ever permit foreign users to execute ad hoc queries against their databases for fear of compromising autonomy. The proposed architecture can be elegantly modeled using the multilevel nested transaction model for which a sound theoretical foundation exists to prove concurrent executions correct. A multilevel scheduler that is able to exploit the semantics of exported operations can significantly increase concurrency by ignoring pseudo conflicts. A practical scheduling mechanism for FDBMSs is described that offers the potential for greater performance and more flexibility than previous approaches based on the read/write model.

1: Introduction

The proliferation of databases has resulted in an increasing demand for interoperability between these often autonomous systems. Over the last few years a substantial amount of research has been conducted into finding solutions for interoperability problems in autonomous systems [ACM90]. A federated database management system (FDBMS) is designed to provide such solutions. The term federated emphasizes that there is a willingness of the subsystems to cooperate while they remain autonomous and allow only partial and controlled access to the data they manage. We consider an FDBMS as a system for managing access to multiple pre–existing local databases (LDBs). FDBMSs execute global transactions (GTs) which span multiple autonomous LDBs on behalf of a user. We refer to the transactions that execute on a single LDB, outside the control of the FDBMS, as local transactions (LTs). An LDB is a collection of data and applications that is managed by a database management system (DBMS) and is administered under a policy specifying the interoperability and autonomy requirements of the LDB [Gr86, OV91]. Note that we permit two LDBs, administered under different policies, to be managed by the same DBMS.

One of the key problems of FDBMSs is the support of global transactions and their coexistence with local transactions [BGS92]. The central issues that have to be addressed relate to the heterogeneity of the federated system and to the execution autonomy of the local systems. By heterogeneity we mean that different local systems may use different concurrency control and recovery mechanisms. By execution autonomy we mean that the execution of global transactions should have no or minimal impact on the execution and performance of local transactions.

There are two lines of approach to the problem of federated transaction management:

• The first approach is to rely on some common properties of the underlying local DBMSs. It has been shown that global serializability and atomicity, i.e., ACID guarantees for both GTs and LTs, are self–guaranteed if all local DBMSs ensure commit–order serializability [BGRS91, BS92, Raz92] and are able and willing to participate in a global two–phase commit protocol. Note, however, that some of today’s widely used DBMSs do not necessarily ensure commit–order serializability since they make use of multiversion concurrency control protocols (e.g., Oracle and Rdb). In addition, the problem of execution autonomy is ignored by this approach.
• The second is to implement the necessary scheduling and commit protocols on top of the existing DBMSs (e.g., [BST92, GRS91, Pu88, VW92, WV90]).
The correctness reasoning for this approach has been based on a read/write model in which all transactions read and write primitive data items such as pages. This model is an oversimplification since no commercial DBMS allows the observation or control of the read/write accesses to pages; the interface that FDBMSs have with a DBMS is at a higher operation level, for example, SQL operations. In addition, to ensure that GT and LT executions do not interfere incorrectly, the approaches in this category restrict the possible concurrency significantly and may incur severe performance problems.

1.1: Our approach

This paper proposes an approach to federated transaction management that is based on exploiting semantics of high–level operations explicitly exported by the LDBs, and embedding invocations of these operations into open nested transactions within a requestor–server architecture. We require that the operations invoked by GTs on an LDB be pre–defined by a local database administrator. This approach is not unusual, as no airline, bank or car rental company would allow other corporations to access their data by means of arbitrary SQL statements, for example. Simply restricting access using permissions or views is generally insufficient as this does not provide the level of execution autonomy and security demanded by many organizations. Rather we say that an LDB exports a well–defined set of high–level operations, each of which may be invoked by GTs to perform a specific task. These high–level operations may also serve as building blocks for local applications. The exported high–level operations correspond to the “steps” used in the Sagas and ConTracts models [GS87, GGKKS90, WR92, GR93].

An important advantage of this architecture is that it is simple to administer. It enables greater control of the performance impact of GTs on the local applications by virtue of being able to control what types of operations may be performed, and on which data items they may be performed. This autonomy for LDBs is very important in many organizations.

While it may appear that such a high–level requestor–server approach is a fairly special case of an FDBMS, in practice we expect this will remain an important case, requiring careful consideration. There are many examples of database interoperability that are based on the above model [Gr86, GA87, ANRS92, VEH92]. These include networks of travel agencies and international interbank clearing systems.
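The export mechanism described above can be pictured as a small registry: a local administrator pre–defines named operations, and a global transaction can reach the local data only through those names. The following is a minimal sketch under our own assumptions, not the interface of any actual system; the class `LocalDatabase`, its methods `export` and `invoke`, and the sample operation `credit` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class LocalDatabase:
    """Hypothetical LDB wrapper: only exported operations are reachable from outside."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, int] = {}  # private local state
        self._exported: Dict[str, Callable[..., object]] = {}

    def export(self, op_name: str, fn: Callable[..., object]) -> None:
        # Called by the local administrator to pre-define an operation.
        self._exported[op_name] = fn

    def invoke(self, op_name: str, *args: object) -> object:
        # Entry point for global transactions; anything not exported is refused.
        if op_name not in self._exported:
            raise PermissionError(f"{op_name!r} is not exported by {self.name}")
        return self._exported[op_name](self._data, *args)


def credit(data: Dict[str, int], account: str, amount: int) -> int:
    # Illustrative exported operation: adjust a balance and return the new value.
    data[account] = data.get(account, 0) + amount
    return data[account]


ldb = LocalDatabase("bank_a")
ldb.export("credit", credit)
balance = ldb.invoke("credit", "acct-1", 100)

try:
    ldb.invoke("ad_hoc_query", "SELECT * FROM accounts")
    ad_hoc_refused = False
except PermissionError:
    ad_hoc_refused = True
```

Because the set of operations is closed, the administrator knows in advance which operation types can touch which data, which is what makes the later semantic analysis of operation pairs feasible.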
The simplest case of the high–level requestor–server approach restricts GTs to invoking a single exported operation at exactly one LDB; such a GT can be viewed as a local DBMS transaction that is initiated remotely. Even if multiple GTs are chained within the same process, as supported by most TP monitors and advocated in the Sagas model [GS87] (as well as the S–transaction model [VEH92] and the Flex Transaction model [ELLR90, KPE92]), the spheres of control with respect to isolation are single high–level operations. This does not require any additional concurrency control to be performed by the FDBMS, and GTs can be treated like local transactions. However, with such an approach, it is impossible to enforce global consistency constraints, which are indeed crucial for important classes of applications (see Section 2 for an example). Thus, removing the restriction and permitting GTs to invoke multiple exported operations will require scheduling of these operations to prevent anomalies resulting from interleaved executions of high–level operations.

Our approach to the scheduling of high–level operations is based on the model of open nested transactions, specifically on its variant known as multilevel transactions [BBG89, GR93, Wei91, WS92]. The classical read/write model of a database system defines a conflict between two operations on the same data item if at least one operation is a write [BHG87]. A more general notion of a conflict is to consider two operation invocations as compatible if their execution order is irrelevant from an application point of view. The semantic property of the exported high–level operations that we want to exploit is the compatibility of pairs of operations. This enables certain low–level conflicts between operations to be ignored.

High–level operations are usually not indivisible; rather they could spawn a number of subactions in the underlying local system. To develop correct and efficient semantics–based protocols, it is important to consider both the interleaving of transactions at the level of exported operations, and the interleaving of the executed high–level operations in terms of the resulting read/write accesses in the underlying local systems. Multilevel transactions are a natural model to reason about such problems. In addition, the implementation techniques for multilevel transaction management [WH93] are of great value for actually building efficient federated systems [SSW93].
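To make the two notions of conflict concrete, the sketch below contrasts them for a pair of increments on the same item. It is an illustration under our own assumptions: the helper names `rw_conflict` and `commute_on`, and the modeling of an operation as a Python function from state to state, are not part of the paper's formal model. The semantic test compares only final states on the sample states supplied, so it can refute compatibility but cannot prove it in general.

```python
from typing import Callable, Dict, Iterable, Set

State = Dict[str, int]
Op = Callable[[State], State]


def rw_conflict(reads1: Set[str], writes1: Set[str],
                reads2: Set[str], writes2: Set[str]) -> bool:
    # Classical test: a common item is accessed and at least one access is a write.
    return bool(writes1 & (reads2 | writes2)) or bool(writes2 & reads1)


def commute_on(op1: Op, op2: Op, states: Iterable[State]) -> bool:
    # Semantic test: both execution orders yield the same final state
    # on every sample state.
    return all(op2(op1(dict(s))) == op1(op2(dict(s))) for s in states)


def add(item: str, delta: int) -> Op:
    # Operation that increments an item by delta.
    def op(state: State) -> State:
        new = dict(state)
        new[item] = new.get(item, 0) + delta
        return new
    return op


def assign(item: str, value: int) -> Op:
    # Operation that overwrites an item with a fixed value.
    def op(state: State) -> State:
        new = dict(state)
        new[item] = value
        return new
    return op


samples = [{"x": 0}, {"x": 7}]

# Two increments of x both read and write x: a conflict in the read/write model.
increments_rw_conflict = rw_conflict({"x"}, {"x"}, {"x"}, {"x"})
# Their order is nevertheless irrelevant, so they are compatible semantically.
increments_commute = commute_on(add("x", 5), add("x", -2), samples)
# An increment and an overwrite do depend on the order of execution.
mixed_commute = commute_on(add("x", 5), assign("x", 1), samples)
```

The pair of increments is exactly the kind of pseudo conflict that a scheduler aware of operation semantics can ignore, whereas a read/write scheduler would have to serialize the two operations.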

1.2: Contribution and outline of the paper

This work is a continuation of our earlier results reported in [SWS91]. We provide a rigid model for using multilevel transactions in federated systems, and we present a concurrency control protocol that is being implemented in a federated system prototype at ETH Zürich. This paper makes the following contributions:

• It explains why multilevel transactions are a correct and efficient execution model for federated systems with a high–level requestor–server architecture, and how local transactions are dealt with in this environment.
• It provides a realistic model of transaction executions in federated systems, compared to the previous work based on the read/write model.
• It develops a practical FDBMS scheduling protocol that enhances performance and achieves a higher degree of execution autonomy, compared to previous approaches.

The rest of the paper is organized as follows. In Section 2 a simplified banking scenario is presented to illustrate how the exported operations may be defined and how the semantics may then be used to construct a compatibility matrix. This scenario provides background information for examples used in the remainder of the paper. Section 3 applies the theory for multilevel concurrency control and recovery to FDBMS transaction management. In Section 4 a multilevel FDBMS scheduler is described.

2: Banking scenario

Consider the information management problems in a large banking organization operating in financial markets such as security trading, foreign exchange and loans. The bank has a separate division for each of these markets, enabling the divisions to operate largely independently of one another. Because of the interrelationship between these markets in the larger banking structure, there is a need to establish a global risk assessment procedure, without which it is possible that significant losses may result due to unfortunate interleavings of operations of different transactions. For example, it could be desired that the bank’s involvement in Japanese Yen, totaled over all divisions, not reach “dangerous” levels. For this purpose a risk management database is maintained. Every deal, other than very small deals for which a reserve is held, has to be entered into the risk management database to reflect the total involvement in each currency. This is done by invoking the following operation that raises or lowers the holding of a currency in the risk management database:

    Enter (:currency, :amount)

For very large financial deals that can significantly alter the state of the bank’s holdings, such as when a deal exceeds a predetermined amount of, say, 10 million Swiss francs, an additional check against the risk management database is required. In reality many factors would have to be considered, such as the relationships between different business partners. If the deal is approved then it is also registered in the risk management database, otherwise an error code is returned. The rules for checking risk limits and approving deals are implemented by the operation:

    Alter (:currency, :amount, :ok)
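As a rough illustration of how the two operations behave, the sketch below runs them against an in-memory SQLite database. The table name `TotalInvolvement` and its columns `Currency` and `Total` are those used in the paper's Figure 1; everything else is an assumption made for this example, in particular the `RiskLimit` column and the rule that a deal is approved only if the absolute new total stays within that limit. The `:ok` output parameter is modeled as a Boolean return value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TotalInvolvement ("
    " Currency TEXT PRIMARY KEY,"
    " Total INTEGER NOT NULL,"
    " RiskLimit INTEGER NOT NULL)"  # RiskLimit is an assumed column
)
conn.execute("INSERT INTO TotalInvolvement VALUES ('JPY', 40, 100)")
conn.commit()


def enter(conn: sqlite3.Connection, currency: str, amount: int) -> None:
    # Unconditionally raise or lower the holding of a currency.
    with conn:
        conn.execute(
            "UPDATE TotalInvolvement SET Total = Total + :amount "
            "WHERE Currency = :currency",
            {"currency": currency, "amount": amount},
        )


def alter(conn: sqlite3.Connection, currency: str, amount: int) -> bool:
    # Conditional update: register the deal only if the assumed limit still holds.
    with conn:
        cur = conn.execute(
            "UPDATE TotalInvolvement SET Total = Total + :amount "
            "WHERE Currency = :currency "
            "AND ABS(Total + :amount) <= RiskLimit",
            {"currency": currency, "amount": amount},
        )
    # One updated row means the deal was approved; zero rows means it was rejected.
    return cur.rowcount == 1


def total(conn: sqlite3.Connection, currency: str) -> int:
    row = conn.execute(
        "SELECT Total FROM TotalInvolvement WHERE Currency = ?", (currency,)
    ).fetchone()
    return row[0]


enter(conn, "JPY", 10)             # total becomes 50
approved = alter(conn, "JPY", 40)  # 90 is within the assumed limit of 100
rejected = alter(conn, "JPY", 40)  # 130 would exceed it, so nothing changes
```

Folding the check into the `WHERE` clause keeps test and update in a single statement, so a rejected deal leaves the stored total untouched.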

Suppose that the risk management database is stored in a DBMS with an SQL–like interface. The skeleton versions of the operations Enter and Alter are given in Figure 1. These operations are examples of the exported high–level operations discussed in Section 1.1. We now consider the semantics of these operations to decide which are compatible and which conflict. It follows that two Enter operations are always compatible, even with the same currency parameter. In contrast, two Alter operations with the same currency parameter do not generally commute and nor is

OPERATION Enter (:currency, :amount):
    UPDATE TotalInvolvement
    SET Total = Total + :amount
    WHERE Currency = :currency;

OPERATION Alter(:currency, :amount, :ok):
    UPDATE TotalInvolvement
    SET Total = Total + :amount
    WHERE Currency = :currency
      AND Total + :amount
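The compatibility reasoning for these operations can be explored mechanically by running a pair of invocations in both orders from the same state and comparing return values and final states. The sketch below does this for invocations on one currency. It rests on assumptions of ours: the constant `LIMIT` and the approval rule in `alter` stand in for the risk check, and the helper `order_irrelevant` is a hypothetical name. A result of `False` shows that the order matters for the sampled state; a result of `True` is only evidence, since a full argument must cover all states (for two Enter operations it follows from the commutativity of addition).

```python
from itertools import combinations_with_replacement
from typing import Callable, Dict, Optional

State = Dict[str, int]
Invocation = Callable[[State], Optional[bool]]

LIMIT = 100  # hypothetical risk limit per currency, not taken from the paper


def enter(state: State, currency: str, amount: int) -> None:
    # Unconditional adjustment of the total involvement.
    state[currency] = state.get(currency, 0) + amount


def alter(state: State, currency: str, amount: int) -> bool:
    # Assumed rule: approve only if the new total stays within LIMIT.
    new_total = state.get(currency, 0) + amount
    if abs(new_total) > LIMIT:
        return False
    state[currency] = new_total
    return True


def order_irrelevant(op1: Invocation, op2: Invocation, start: State) -> bool:
    # Execute op1;op2 and op2;op1 from copies of the same start state, then
    # compare each invocation's return value and the resulting final state.
    s12 = dict(start)
    r1_first, r2_second = op1(s12), op2(s12)
    s21 = dict(start)
    r2_first, r1_second = op2(s21), op1(s21)
    return (r1_first, r2_second, s12) == (r1_second, r2_first, s21)


invocations: Dict[str, Invocation] = {
    "Enter": lambda s: enter(s, "JPY", 30),
    "Alter": lambda s: alter(s, "JPY", 60),
}
start = {"JPY": 40}

# One entry per unordered pair of operation types, all on the same currency.
matrix = {
    (a, b): order_irrelevant(invocations[a], invocations[b], start)
    for a, b in combinations_with_replacement(invocations, 2)
}
```

With the sample state chosen here, only the pair of Enter invocations is insensitive to order; the two Alter invocations differ in which of them is approved. Under the assumed limit rule an Enter also changes the outcome of a later Alter, but that entry is a consequence of our assumption rather than a claim taken from the text.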
