dbSwitch™ – Towards a Database Utility

Shaul Dar

Gil Hecht

Eden Shochat

Savantis Systems Ltd. 11 Galgaley Haplada St. Herzelia, Israel +972-54-225019

Savantis Systems Ltd. 11 Galgaley Haplada St. Herzelia, Israel +972-54-953096

Savantis Systems Ltd. 11 Galgaley Haplada St. Herzelia, Israel +972-54-461666


Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD 2004, June 13–18, 2004, Paris, France. Copyright 2004 ACM 1-58113-859-8/04/06 …$5.00.

ABSTRACT
Savantis Systems’ dbSwitch™ is an innovative commercial product providing database server virtualization and advancing a database utility model. The dbSwitch enables a new architecture, called a Database Area Network (DAN), which pools database server resources and shares them among multiple database applications. Specific benefits of the DAN architecture for enterprise data centers include server consolidation, improved utilization, high availability and capacity management. We describe the major components of the dbSwitch, namely routing of application requests to database instances, optimization of database server resources, and capacity visualization and manipulation. We also relate dbSwitch to recent work on utility and grid computing.

Keywords
dbSwitch™, Database Area Network, DAN, Consolidation, Grid, Utility.

1. INTRODUCTION
Databases are ubiquitous. Large enterprises today often have tens to thousands of database instances serving a similar number of applications. These database instances, and the servers they run on, have no shared resources and are managed individually. As a result the database layer represents a major source of cost and complexity to enterprise IT. Savantis's vision is to transform enterprise database infrastructure in a manner paralleling the recent paradigm change at the storage layer, where the introduction of Storage Area Networks (SAN) and Network Attached Storage (NAS) [21] allowed enterprises to transition from server attached disks to shared networked storage.

We start by describing in Section 1.1 the typical enterprise database infrastructure and highlighting the problems associated with it. In Section 2 we introduce our solution, the Database Area Network (DAN), and the product that enables it, the dbSwitch™, and outline the benefits of this approach. We follow in Section 3 with a description of the components that collectively comprise the dbSwitch system. In Section 4 we explain how the dbSwitch is deployed into the enterprise network and how a DAN is created and populated. We then proceed in Section 5 with a more detailed discussion of the major features of the dbSwitch and how they are implemented. We discuss related work in Section 6. Finally in Section 7 we summarize our work and outline open research questions.

The majority of this document follows the terminology of the Oracle DBMS; however, the DAN architecture is generally applicable to any database (and in principle could be extended to other applications as well). At the time of this writing the Oracle version of the dbSwitch has been sold to several enterprise customers, and support for Microsoft’s SQL Server and IBM’s UDB is in the works.

1.1 Existing Database Infrastructure

Figure 1: Typical Enterprise Database Infrastructure

The common enterprise data center configuration, illustrated in Figure 1, allocates to each application a dedicated database server, and sometimes two servers for high availability. This configuration suffers from the following acute problems:


1. Expensive: Database servers are typically over-provisioned to accommodate peak conditions and are largely underutilized, yielding a poor return on investment. Additional software licenses are required for both the DBMS and associated products such as backup, file system, volume manager, and monitoring.

2. Static: Database resource requirements change over time. Moving a database instance to a larger server, for example, requires significant manual labor and may incur lengthy downtime.

3. Complex: Database servers are islands that must be managed individually, increasing management cost.

4. Vulnerable: While high availability is paramount for databases, traditional approaches, such as replication ([23], [10]) or clustering ([17], [27]), are expensive, complex, and difficult to scale.

Similar problems plagued the storage infrastructure a decade ago, when most data were stored on server-attached disks. As shown in Figure 1, however, most enterprise data centers have changed their storage infrastructure, or are in the midst of changing it, from local disks and tapes to centralized, shared storage, usually in the form of a Storage Area Network (SAN). This paradigm shift at the storage level enables a similar paradigm shift at the database server level.

2. THE DATABASE AREA NETWORK
Savantis addresses the above problems by introducing a new database networking architecture, the Database Area Network (DAN). The DAN concept is enabled by Savantis’s dbSwitch™ product. As a first approximation think of dbSwitch as an appliance situated between the applications and database servers, as illustrated in Figure 2, and capable of making switching decisions at the database operation level in real time.

Figure 2: dbSwitch and the Database Area Network

The DAN architecture provides several important benefits, described in greater detail in Section 5:

1. Database Server Virtualization: shield applications from the actual location of their associated databases. Enable database instances to be relocated between servers without requiring changes to the database configuration or to the client applications.

2. Scalability on Demand: allow dynamic addition of database servers, or removal for rightsizing or upgrades.

3. Resource Optimization: let multiple database instances share server resources, allowing more efficient overall utilization and consolidation.

4. High Availability: guarantee the availability of all database servers and instances.

5. Secure Database Access: filter application to database requests to prevent unauthorized access.

6. Centralized Management: enable administration of many servers and databases from a single console.

The DAN architecture is applicable whenever multiple database servers have access to the same data, which may be the case when storage is either shared or replicated. The current dbSwitch product assumes an underlying shared storage in the form of a Storage Area Network (SAN), Network Attached Storage (NAS) or shared disk, as shown in Figure 2. However the emergence of fiber optic networks and fast replication techniques will enable other, longer distance, configurations:

• Disaster recovery (DR) is typically based on storage replication using BCVs (Business Continuance Volumes) and protocols such as EMC’s SRDF (Symmetrix Remote Data Facility). In addition database replication techniques such as Quest’s SharePlex [23] or Oracle’s DataGuard can provide near real-time synchronization between remote databases, allowing the backup site to serve queries or even transactions (using bidirectional replication). As an example Priceline has a primary site in Manhattan and a backup site in New Jersey connected via a fiber optic backbone. Continuous replication between the sites using SharePlex incurs only a 5 second delay [24].

• Emerging storage networking protocols, such as iSCSI and InfiniBand, would allow SANs to span longer geographical distances.

• Several companies are working on providing shared storage access across a WAN at LAN speed using either a file system [9], [25] or shared memory [11] abstraction.

3. THE DBSWITCH SYSTEM
As can be seen in Figure 2, the dbSwitch’s enabling role in the Database Area Network (DAN) resembles that of the storage switch for the Storage Area Network (SAN) or the Server Load Balancer (SLB) for Web and Application servers. In reality the dbSwitch combines two hardware elements and three major software components. One hardware element is a standard router connected between the database clients and database servers, and the other is a standard Linux box running the dbSwitch software and connected to the router on the side, i.e. out of the critical path. The separation of the dbSwitch and router entails several important advantages:

• The dbSwitch system is highly robust. The MTBF (Mean Time Between Failures) for critical failures, i.e. ones that result in database service interruption, depends on the router only.

• The dbSwitch software can be modified using rolling upgrades, avoiding service interruptions.

For increased fault tolerance the dbSwitch also supports a redundant hardware configuration, comprising two Linux boxes and two routers (using Cisco’s HSRP protocol [4]), and providing automatic takeover in less than a minute. The dbSwitch system includes three major software components: the dbSwitch software, the DAN Management System (DMS), and the Savantis agents. The following sections describe each of these hardware and software components in more detail.


3.1 dbSwitch Router

The dbSwitch router (currently Cisco 2691 or 5606 models) is responsible for directing packets of a database networking protocol, such as Oracle’s SQL*Net, from the clients to the servers. The router uses NAT (Network Address Translation) to achieve server virtualization, hiding from the clients the identity of the machines serving the requests, as explained in Section 4.1. The router also provides IP filtering, using ACLs (Access Control Lists), to enforce database zoning security, as described in Section 5.4.

3.2 dbSwitch Software

The dbSwitch software runs on the Linux box and executes the resource management algorithms described in Section 5.2. In addition, the dbSwitch software controls the router, communicates with the Savantis agents, and provides backend services, such as persistency, to the DAN Management System (DMS).

3.3 DAN Management System (DMS)

The DMS is an intuitive Web-based user interface for DAN administration. It runs on the Linux box under the Apache Web server and Tomcat application server. Among its salient features are the following.

• A tree view and topological view for monitoring DAN elements, namely SubDANs, servers, databases and storage, and information panes allowing the user to see aggregated resource utilization or drill down to individual elements.

• Wizards for DAN configuration, namely addition of SubDANs, servers, and databases, and the ability to set high availability and specify QoS and security policies.

• Charts and reports providing visibility into server and database resource consumption over time.

• Alerts for notification of abnormal events as well as the actions taken by the dbSwitch. The DMS also allows integration with the enterprise Network Operations Center (NOC) software (e.g. HPOV) by sending SNMP traps.

• Guides for performing resource optimization, as explained in Section 5.2.

A complete reference of the DMS may be found in [8].

3.4 Savantis Agent

Each database server in a DAN runs a low overhead dbSwitch Agent that listens for TCP/IP communication from the dbSwitch on the server's IP address and a dedicated port. The Agent acts as the delegate of the dbSwitch on the server, performing various processes on the database servers and databases, including:

• Server and database discovery, to automate the addition of servers and databases to the DAN.

• Configuration changes, such as creating, propagating to all servers, and updating listener and instance control files.

• Health and load monitoring of the server and instances.

• Instance control, e.g. stopping, starting and relocating databases and related storage.

Currently supported server platforms include Solaris, HP-UX, Linux and Windows 2000.

4. DEPLOYMENT

4.1 dbSwitch Networking

In the simplest case the dbSwitch and router are physically deployed in the enterprise network between the database clients and servers, enabling virtualization of the database servers. Externally, i.e. to the database clients, each database is identified by a unique virtual IP (VIP). These VIPs are not defined on any machine in the network, and are routable by the dbSwitch router only. Typically the VIPs form a class C subnet such as 10.2.255.*.

Internally each database is mapped at any point in time to a specific database server, and therefore requests sent to the associated VIP must be routed to that server. This is achieved by assigning each database server a new IP, called LIP, for Local (or Listener) IP. Typically the LIPs form a class C network, such as 10.2.0.*. Additionally each database is associated with a unique port on every server. The dbSwitch router dynamically translates connection requests to a VIP into the appropriate LIP and port using NAPT (Network Address and Port Translation).

The network topology described above is called “dbSwitch in the middle” (see [7] for more details). An alternative topology called “dbSwitch on the side” allows the dbSwitch and router to be inserted anywhere in an enterprise LAN or even WAN while logically still appearing to be located between the database clients and servers (by “attracting” and manipulating packets destined to the VIP network and responses to them). A third topology supports L2 VLANs for organizations concerned with application isolation. A detailed description of such topologies and their tradeoffs is beyond the scope of this paper, but it should be evident that they allow the construction of local or global DANs far more flexible than the typical local server cluster.

5. DAN BENEFITS

5.1 Database Server Virtualization

The dbSwitch allows database instances to be relocated from one server to another, usually in 30 seconds or less. Relocation may be initiated proactively by the user or reactively by the system in response to a detected failure. Several techniques are used to minimize the effect of relocation on database applications. One important technique is the use of Oracle TAF (Transparent Application Failover), a feature of the Oracle Net8 client that shields application code from session or even query (DML) failure [16]. TAF is configured by specifying multiple addresses (VIPs + ports) for an alias in the appropriate location, such as the tnsnames.ora file on the client machine. In addition, before a proactive relocation the dbSwitch induces a checkpoint in order to minimize the recovery time of the instance on the new server.

5.2 Resource Optimization

The dbSwitch employs innovative resource optimization algorithms to enable sharing of server resources, such as CPU and memory, between the database instances. In essence the algorithms strive to find a mapping (assignment) of database instances to servers within a SubDAN satisfying particular goals described below. Input includes user constraints (such as “do not move this instance from this server”), server capabilities, and historical data on database resource consumption over a representative reference period (by default, the last month).
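To make the optimization inputs of Section 5.2 concrete, the following is a minimal Python sketch, not the dbSwitch implementation. It assumes a single resource (say, CPU), models the historical data as one load sample per database per period, and scores a mapping by its worst per-server utilization over the reference period. The names `peak_utilization`, `best_mapping`, `pinned` and `max_per_server`, the scoring function, and the sample figures are all hypothetical; the paper does not specify the product's objective function. The search is a plain exhaustive enumeration, which is practical only for small SubDANs.

```python
from itertools import product


def peak_utilization(mapping, load, capacity):
    """Worst per-server utilization of `mapping` over the reference period.

    mapping:  {database: server}
    load:     {database: [load sample per period]}  (assumed single resource)
    capacity: {server: capacity in the same units as the load samples}
    """
    periods = len(next(iter(load.values())))
    worst = 0.0
    for server, cap in capacity.items():
        for t in range(periods):
            used = sum(load[db][t] for db, s in mapping.items() if s == server)
            worst = max(worst, used / cap)
    return worst


def best_mapping(load, capacity, pinned=None, max_per_server=None):
    """Enumerate all S**D mappings and keep the one with the lowest peak.

    pinned:         {database: server} user constraints ("do not move").
    max_per_server: optional cap on the number of instances per server.
    Returns (mapping, score), or (None, inf) if no mapping is feasible.
    """
    pinned = pinned or {}
    dbs = sorted(load)
    servers = sorted(capacity)
    best, best_score = None, float("inf")
    for choice in product(servers, repeat=len(dbs)):
        mapping = dict(zip(dbs, choice))
        # Skip mappings that violate a user constraint.
        if any(mapping[db] != server for db, server in pinned.items()):
            continue
        # Skip mappings that overcrowd a server.
        if max_per_server is not None and any(
            choice.count(server) > max_per_server for server in servers
        ):
            continue
        score = peak_utilization(mapping, load, capacity)
        if score < best_score:
            best, best_score = mapping, score
    return best, best_score


# Hypothetical data: two reference periods, two equal servers.
load = {"crm": [30, 10], "erp": [10, 30], "hr": [20, 20]}
capacity = {"s1": 100, "s2": 100}
mapping, score = best_mapping(load, capacity, pinned={"hr": "s1"}, max_per_server=2)
print(mapping, score)  # {'crm': 's2', 'erp': 's2', 'hr': 's1'} 0.4
```

In this example the two databases whose peaks fall in different periods (crm and erp) end up sharing a server, which is the kind of complementary placement that historical data makes visible.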

Specific goals of the algorithms include the following.

• Improved Utilization: redistribute database instances to servers to get the best retroactive load balance among servers over the reference period.

• Server Consolidation: remove a subset of the servers (e.g. specific ones, a certain number of servers, or allowing a specified increase in average utilization) and redistribute the instances among the remaining servers.

• High Availability: quickly (i.e. within a few seconds) relocate failed instances (or instances from a failed server) to the most suitable available servers.

To see the importance of an algorithmic approach note that given S servers and D database instances, there are S^D possible mappings. The current algorithm performs a complete enumeration of the search space when possible, and a partial search based on simulated annealing otherwise. As a simplifying assumption it places a limit of (D/S)*K instances/server, where K=2 or K=3. The implementation uses a concise bit field representation to pre-compute and store the effects of all state transitions (single database relocations), as well as other incremental techniques. The algorithm takes only several minutes on the largest SubDAN tested (20 servers, 80 database instances, assuming up to 8 databases/server). Database relocation is not a trivial operation, and furthermore the concept of proactive relocation is new to database administrators. To address this concern, resource optimization results are presented as recommendations, letting the user take the final decision and implement it.

Figure 3: The Resource Optimizer

The Resource Optimization GUI is shown in Figure 3. The user can ask for a recommended mapping, which he may approve, reject or modify. Charts allow the user to visually compare the actual recorded resource requirements with estimated hypothetical requirements as they would have been under the evaluated mapping. Pressing the Apply button (known affectionately as “Make It So”) starts a sequence of relocations (serially or in parallel) that bring about the desired mapping. Additionally the user may save a mapping and restore it at a later time, allowing for example one mapping to be used on weekdays and another on weekends.

5.3 High Availability

Because the dbSwitch treats DAN servers as a pool, it guarantees high availability to all managed databases, guarding against any server or instance failure. The dbSwitch high availability is dynamic, relocating failed database instances individually to the most suitable available server according to the resource optimization algorithms described in Section 5.2, and requiring no passive backup servers or prior client configuration. In fact, any number of servers can fail and the instances will find their way to the surviving machines, so long as they can carry the load.

5.4 Security (Zoning)

The dbSwitch introduces a new approach to database security called zoning (after a similar SAN concept), which filters application-to-database access at the network level, and thus goes beyond the basic user/password mechanisms used by the DBMS. Zoning allows the administrator to define which applications can access which databases from which clients, preventing inadvertent security violations as well as malicious attacks. A client specification can be any set of IP ranges. As an example the administrator can grant access to a particular database from only two IPs, the application server for that database and his own personal PC. Or access can be granted from any machine in a subnet, allowing for dynamic IP protocols such as DHCP. The zoning specifications are translated into an IP Access Control List (ACL), utilizing the facts that (1) database = VIP, (2) clients = set of IPs. The dbSwitch router strictly enforces those routing rules, blocking and reporting on any violations.

5.5 Heterogeneity

Most IT database environments exhibit a mix of hardware, O.S. and DBMS vendors. Hence support for heterogeneity is crucial. The dbSwitch provides an important device for heterogeneity, called a SubDAN. A SubDAN is a logical grouping of compatible servers and databases, where by definition any database instance could run on any server. For example a heterogeneous DAN may be divided into two SubDANs, one with HP servers and Oracle 8i databases, the other with Solaris servers and Oracle 9i databases. SubDANs may also be used for other purposes, such as distinguishing critical and less important applications (giving each SubDAN different high availability and load balancing settings), or delineating organizational IT boundaries (e.g. separating departments).

6. RELATED WORK

Utility Computing has become a major focus of leading hardware and software vendors, including IBM, Sun, HP, Oracle and Veritas, as well as many smaller companies. Viewed from the supplier’s side, it is also known as Grid Computing, in reference to the power grid that provides electricity as a utility. Emerging Grid standards such as OGSA [18], OGSI [19], and DAIS [6] seem to garner early vendor support.

Marketing hype aside, the aim of these initiatives is to create infrastructures that deliver IT services as utilities, i.e. make the service virtual, reliable and available on demand, allow consumers to connect easily and pay per usage etc. Savantis’s vision fits naturally into this paradigm, addressing the virtualization of the database layer and ultimately the delivery of database as a utility. As noted elsewhere however (e.g. [13]), Grid computing means different things to different people. The main interest of the scientific community seems to be in large computations that require the harnessing of many distributed, heterogeneous computing resources (e.g. [5], [14], [22]). The focus of the database research community appears to be huge data grids spanning remote locations (e.g. [1], [15], [26]).

While there are important applications for these Grid problems, our current emphasis is on providing server transparency and dynamic allocation to typical commercial database applications, namely not one very large computation (CPU or data wise) but rather a stream of “normal size” database interactions (both OLTP and batch / decision support) that must be satisfied while making the most of the underlying resources.

Oracle has recently outlined its vision of grid computing and announced a new release of its database and application server software called 10g, to be rolled out during 2004 [20]. Oracle’s database grid relies on its Real Application Clusters (RAC) product, a shared disk parallel database. Currently RAC and its predecessor OPS represent a small percentage of the Oracle installations, and are mostly used to provide high availability and increased performance for a single critical application. Conversely dbSwitch aims to provide server sharing, consolidation and optimal resource usage in a mixed database environment using vanilla (non-parallel) DBMS. While we see these approaches as complementary, we are also exploring the benefits of dbSwitch on top of RAC, extending our resource optimization algorithms to take advantage of RAC's mechanisms to dynamically add and remove instances, and the dbSwitch router’s ability to distribute the load between the instances in real time.

From a purely technical perspective the dbSwitch most resembles application switches such as [3], often used for Web and application (e.g. FTP) server load balancing (SLB). To the best of our knowledge none of these products addresses DBMS issues such as resource optimization or high availability, and typically their load balancing algorithms are based on the network traffic (the pipes) rather than on server and application load (the utility).

7. SUMMARY AND FUTURE WORK

We described the dbSwitch product and the Database Area Network (DAN) architecture it enables. Servers in a DAN are virtualized and their resources are pooled together and shared among the database instances, bringing about resource optimization, high availability and the opportunity for consolidation and scaling on demand.

dbSwitch is now a commercial enterprise appliance and software sold to and installed at several large data centers. Savantis’s development focus is currently on DAN capacity management, adding extensive monitoring and reporting capabilities. We are also enhancing the resource optimization algorithms to provide better scalability, trending and prediction.

We believe dbSwitch and the DAN architecture represent important steps towards a database utility model. Yet many questions must still be addressed, such as:

1. What storage and networking breakthroughs are required to make a WAN database grid economically feasible [12]?

2. How do we build a scalable, interoperable database grid?

3. How should database usage be measured and charged for, i.e. what is a “database Watt”?

8. REFERENCES

[1] The Access Grid, www.accessgrid.org
[2] The Alteon Application Switch, www.nortel.com
[3] The Arrowpoint Application Switch, www.cisco.com
[4] Hot Standby Router Protocol Features and Functionality, Cisco Systems, 2003.
[5] The Condor Project, http://www.cs.wisc.edu/condor/
[6] Database Access and Integration Services (DAIS), http://forge.gridforum.org/projects/dais-wg/
[7] dbSwitch for Oracle Administrator’s Guide, Version 2.0, Savantis Systems, October 2003.
[8] dbSwitch for Oracle User’s Guide, Version 2.0, Savantis Systems, October 2003.
[9] Disksites W-NAS Architecture, www.disksites.com
[10] EMC TimeFinder, EMC Corporation, www.emc.com
[11] Gigaspaces, www.gigaspaces.com/
[12] Jim Gray: Distributed Computing Economics, Technical Report, MSR-TR-2003-24, March 2003.
[13] Jim Gray: Microsoft and Grid Computing, Microsoft Memorandum, August 2002.
[14] The Grid: Blueprint for a New Computing Infrastructure, Ian Foster, Carl Kesselman (Ed.), Morgan Kaufmann, San Francisco, 1999.
[15] GriPhyN Project, www.griphyn.org/index.php
[16] Hugo Toledo, Jonathan Gennick, Oracle Net8: Configuration and Troubleshooting, O’Reilly, 2000.
[17] Murali Vallath, Oracle Real Application Clusters, Elsevier, 2003.
[18] The Open Grid Services Architecture (OGSA), https://forge.gridforum.org/projects/ogsa-wg
[19] The Open Grid Services Infrastructure (OGSI), https://forge.gridforum.org/projects/ogsi-wg
[20] Oracle Grid Computing, www.oracle.com/grid/
[21] Peterson, M., SAN Overview, Strategic Research Corporation, 1998, www.sresearch.com/wp_9801.htm
[22] Platform Computing, www.platform.com
[23] Quest SharePlex, www.shareplex.com/shareplex-portal/
[24] Ron Royce, Priceline CIO, Personal Communications.
[25] Tacit Networks, http://www.tacitnetworks.com/
[26] The TeraGrid, www.teragrid.org
[27] Veritas Cluster Server, Veritas Software, 2003, www.veritas.com