pdf handouts - C-JDBC

9 downloads 2102 Views 3MB Size Report
Emic 2005. Emmanuel Cecchet | ApacheCon EU 2005. Outline. • RAIDb. • C- JDBC. • Derby and C-JDBC. • Scalability. • High availability ...
Building Highly Available Database Applications for Apache Derby

Emmanuel Cecchet Principal architect - Emic Networks Chief architect – ObjectWeb consortium

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

1

Motivations • Database tier should be – – – – –

scalable highly available without modifying the client application database vendor independent on commodity hardware

JDBC JDBC

JDBC

Internet

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

2

Scaling the database tier – Master-slave replication • Cons – failover time/data loss on master failure – read inconsistencies – scalability App. server Web

Master

frontend

Internet

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

3

Scaling the database tier – Atomic broadcast • Cons – atomic broadcast scalability – no client side load balancing – heavy modifications of the database engine

Internet

Atomic broadcast © Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

4

Scaling the database tier – SMP • Cons – Cost – Scalability limit

Web frontend

App. server Database

Internet

Well-known database vendor here Well-known hardware + database vendors here © Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

5

Scaling the database tier – Shared disks • Cons – still expensive hardware – availability Web frontend

App. server

Database

Disks

Internet

Another well-known database vendor © Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

6

Outline

• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

7

RAIDb concept

• Redundant Array of Inexpensive Databases • RAIDb controller – gives the view of a single database to the client – balance the load on the database backends

• RAIDb levels offers various tradeoff of performance and fault tolerance

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

8

RAIDb levels • RAIDb-0 – partitioning – no duplication and no fault tolerance – at least 2 nodes

SQL requests RAIDb controller

table 1 © Emic 2005

table 2 & 3

table ...

table n-1

Emmanuel Cecchet | ApacheCon EU 2005

table n www.emicnetworks.com

9

RAIDb levels • RAIDb-1 – mirroring – performance bounded by write broadcast – at least 2 nodes

SQL requests RAIDb controller

Full DB © Emic 2005

Full DB

Full DB

Full DB

Emmanuel Cecchet | ApacheCon EU 2005

Full DB www.emicnetworks.com

10

RAIDb levels • RAIDb-2 – partial replication – at least 2 copies of each table for fault tolerance – at least 3 nodes

SQL requests RAIDb controller

Full DB © Emic 2005

table x

table y

table x & y

Emmanuel Cecchet | ApacheCon EU 2005

table z www.emicnetworks.com

11

SQL requests

RAIDb levels composition

RAIDb-1 controller

• RAIDb-1-0 – no limit to the composition deepness

RAIDb-0 controller

table w

table x & y

RAIDb-0 controller

table z

table w

table y

table x & z

RAIDb-0 controller

© Emic 2005

table w

table x

Emmanuel Cecchet | ApacheCon EU 2005

table y

table z www.emicnetworks.com

12

Outline

• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

13

C-JDBC overview • Middleware implementing RAIDb – 100% Java implementation – open source (LGPL)

• Two components – generic JDBC driver (C-JDBC driver) – C-JDBC Controller

• Read-one, Write all approach – provides eager (strong) consistency

• Supports heterogeneous databases

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

14

architectural overview

Application server C-JDBC DB2 Univ. JDBC Driver driver JDBC JVM

© Emic 2005

C-JDBC controller Derby with Network server

JVM

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

15

Using C-JDBC as an open source driver for Derby

Application server

C-JDBC controller

C-JDBC JDBC driver JVM

Embedded Derby

© Emic 2005

JVM

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

16

Inside the C-JDBC Controller Client application (Servlet, EJB, ...)

Client application (Servlet, EJB, ...)

C-JDBC driver

C-JDBC driver

C-JDBC Controller

HTTP RMI ... JMX administration console

C-JDBC driver

Virtual database 1

Virtual database 2

Authentication Manager

Authentication Manager

Request Manager

Request Manager

Configuration Monitoring

JMX Server

Client application (Servlet, EJB, ...)

Scheduler

Scheduler Recovery Query result cache Log

Recovery Query result cache Log Load balancer

Load balancer Database dumps management

Checkpointing service

© Emic 2005

Database Backend

Database Backend

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Oracle JDBC driver

Derby

Derby

Derby

Derby

Emmanuel Cecchet | ApacheCon EU 2005

Oracle

www.emicnetworks.com

17

Virtual Database Java client program (Servlet, EJB, ...)

• gives the view of a single database

C-JDBC driver

Virtual database Authentication Manager

• establishes the mapping between the database name used by the application and the backend specific settings

Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

• backends can be added and removed dynamically • configured using an XML configuration file

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

18

Authentication Manager Java client program (Servlet, EJB, ...)

C-JDBC driver

Virtual database Authentication Manager Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

• Matches real login/password used by the application with backend specific login/ password • Administrator login to manage the virtual database

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

19

Scheduler Java client program (Servlet, EJB, ...)

• Manages concurrency control

C-JDBC driver

Virtual database Authentication Manager

• Pass-through

Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

• Specific implementations for RAIDb 0, 1 and 2 • Optimistic and pessimistic transaction level – uses the database schema that is automatically fetched from backends

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

20

Request cache •

Java client program (Servlet, EJB, ...)

– tunable sizes

C-JDBC driver

• Virtual database

Request Manager



Scheduler Request Cache

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

metadata cache – column metadata – fields of a request

Load balancer

Database Backend

parsing cache – parse request skeleton only once – INSERT INTO t VALUES (?,?,?)

Authentication Manager

Recovery Log

3 optional caches



result cache – – – –

caches results from SQL requests tunable consistency fine grain invalidations optimizations for findByPk requests

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

21

Load balancer 1/2 Java client program (Servlet, EJB, ...)



C-JDBC driver

– query directed to the backend having the needed tables

Virtual database Authentication Manager



Scheduler Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

RAIDb-1 – read executed by current thread – write executed in parallel by a dedicated thread per backend – result returned if one, majority or all commit – if one node fails but others succeed, failing node is disabled

Request Manager

Recovery Log

RAIDb-0



RAIDb-2 – same as RAIDb-1 except that writes are sent only to nodes owning the updated table

Derby

© Emic 2005

Derby

Derby

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

22

Load balancer 2/2 Java client program (Servlet, EJB, ...)

• Static load balancing policies

C-JDBC driver

– Round-Robin (RR) – Weighted Round-Robin (WRR)

Virtual database Authentication Manager

• Least Pending Requests First (LPRF)

Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

– request sent to the node that has the shortest pending request queue – efficient even if backends do not have homogeneous performance

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

23

Connection Manager Java client program (Servlet, EJB, ...)

• C-JDBC driver provides transparent connection pooling

C-JDBC driver

• Connection pooling for a backend

Virtual database Authentication Manager Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

– – – –

no pooling blocking pool non-blocking pool dynamic pool

• Connection pools defined on a per login basis – resource management per login – dedicated connections for admin

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

24

Recovery Log Java client program (Servlet, EJB, ...)

• Checkpoints are associated with database dumps

C-JDBC driver

Virtual database Authentication Manager

• Record all updates and transaction markers since a checkpoint

Request Manager Scheduler Recovery Log

Request Cache Load balancer

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

© Emic 2005

• Used to resynchronize a database from a checkpoint • JDBCRecoveryLog – store log information in a database – can be re-injected in a C-JDBC cluster for fault tolerance

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

25

Functional overview (read)

Client application (Servlet, EJB, ...)

Client application (Servlet, EJB, ...)

C-JDBC driver

C-JDBC driver

connect connect login, password execute SELECT *myDB FROM t

C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer

© Emic 2005

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

26

Functional overview (write)

Client application (Servlet, EJB, ...)

Client application (Servlet, EJB, ...)

C-JDBC driver

C-JDBC driver

execute INSERT INTO t …

C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer

© Emic 2005

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby

Derby

Derby

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

27

Failures execute INSERT INTO t … • No 2 phase-commit – parallel transactions – failed nodes are automatically disabled

Client application (Servlet, EJB, ...)

Client application (Servlet, EJB, ...)

C-JDBC driver

C-JDBC driver

C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer

© Emic 2005

Database Backend

Database Backend

Database Backend

Connection Manager

Connection Manager

Connection Manager

Derby JDBC driver

Derby JDBC driver

Derby JDBC driver

Derby Derby Emmanuel Cecchet | ApacheCon EU 2005

Derby

www.emicnetworks.com

28

Outline

• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

29

Highly available web site •

Apache clustering – L4 switch, RR-DNS, One-IP techniques, LVS, …



Tomcat clustering – mod_jk (T4), mod_proxy/mod_rewrite (T5), session replication



Database clustering – C-JDBC

RR-DNS

mod-jk

Open source driver Parsing cache Result cache Metadata cache

Internet

embedded

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

30

Result cache • Cache contains a list of SQL • Policy defined by queryPattern

ResultSet Policy

• 3 policies – EagerCaching: variable granularities for invalidations – RelaxedCaching: invalidations based on timeout – NoCaching: never cached RUBiS bidding mix with 450 clients Throughput (rq/min)

© Emic 2005

No cache

Coherent cache

Relaxed cache

3892

4184

4215

Avg response time

801 ms

284 ms

134 ms

Database CPU load

100%

85%

20%

C-JDBC CPU load

-

15%

7%

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

31

Configuring C-JDBC as a Derby driver (1/3) C-JDBC Controller

Client application

SingleDB configuration

org.objectweb.cjdbc.driver.Driver

C-JDBC Driver

jdbc:derby:path;create=true

Embedded Derby

jdbc:cjdbc://host/db

• copy c-jdbc-driver.jar in client application classpath Class.forName(“org.objectweb.cjdbc.driver.Driver”); Connection c = DriverManager.getConnection( “jdbc:cjdbc://host/db”, “login”, “password”);

• copy derby.jar in $CJDBC_HOME/drivers

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

32

Configuring C-JDBC as a Derby driver (2/3)

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

34

Highly available web site • Multiple databases – choosing RAIDb level – recovery log for • adding nodes dynamically • recovering from failures

Internet

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

35

Derby clustering with C-JDBC Client application

Client application

org.objectweb.cjdbc.driver.Driver

org.objectweb.cjdbc.driver.Driver

C-JDBC Driver

C-JDBC Driver

jdbc:cjdbc://host1/db C-JDBC Controller RAIDb configuration com.ibm.db2.jcc.DB2Driver

org.objectweb.cjdbc.driver.Driver

DB2 Universal Driver for Derby

C-JDBC Driver

jdbc:derby:net://host2:1527/db; create=true;retrieveMessagesFro mServerOnGetMessage=true;

jdbc:cjdbc://host3/db

C-JDBC Controller Network Server

SingleDB configuration

jdbc:derby:path;create=true

Derby © Emic 2005

Embedded Derby Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

36

Configuring C-JDBC with Derby Network server •

Virtual database configuration file



© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

37

Configuring C-JDBC with Derby/C-JDBC •

Virtual database configuration file

© Emic 2005

Emmanuel Cecchet | ApacheCon EU 2005

www.emicnetworks.com

39

Configuring C-JDBC Clustering with Derby (2/2)