Emic 2005. Emmanuel Cecchet | ApacheCon EU 2005. Outline. • RAIDb. • C-
JDBC. • Derby and C-JDBC. • Scalability. • High availability ...
Building Highly Available Database Applications for Apache Derby
Emmanuel Cecchet Principal architect - Emic Networks Chief architect – ObjectWeb consortium
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
1
Motivations • Database tier should be – – – – –
scalable highly available without modifying the client application database vendor independent on commodity hardware
JDBC JDBC
JDBC
Internet
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
2
Scaling the database tier – Master-slave replication • Cons – failover time/data loss on master failure – read inconsistencies – scalability App. server Web
Master
frontend
Internet
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
3
Scaling the database tier – Atomic broadcast • Cons – atomic broadcast scalability – no client side load balancing – heavy modifications of the database engine
Internet
Atomic broadcast © Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
4
Scaling the database tier – SMP • Cons – Cost – Scalability limit
Web frontend
App. server Database
Internet
Well-known database vendor here Well-known hardware + database vendors here © Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
5
Scaling the database tier – Shared disks • Cons – still expensive hardware – availability Web frontend
App. server
Database
Disks
Internet
Another well-known database vendor © Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
6
Outline
• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
7
RAIDb concept
• Redundant Array of Inexpensive Databases • RAIDb controller – gives the view of a single database to the client – balance the load on the database backends
• RAIDb levels offers various tradeoff of performance and fault tolerance
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
8
RAIDb levels • RAIDb-0 – partitioning – no duplication and no fault tolerance – at least 2 nodes
SQL requests RAIDb controller
table 1 © Emic 2005
table 2 & 3
table ...
table n-1
Emmanuel Cecchet | ApacheCon EU 2005
table n www.emicnetworks.com
9
RAIDb levels • RAIDb-1 – mirroring – performance bounded by write broadcast – at least 2 nodes
SQL requests RAIDb controller
Full DB © Emic 2005
Full DB
Full DB
Full DB
Emmanuel Cecchet | ApacheCon EU 2005
Full DB www.emicnetworks.com
10
RAIDb levels • RAIDb-2 – partial replication – at least 2 copies of each table for fault tolerance – at least 3 nodes
SQL requests RAIDb controller
Full DB © Emic 2005
table x
table y
table x & y
Emmanuel Cecchet | ApacheCon EU 2005
table z www.emicnetworks.com
11
SQL requests
RAIDb levels composition
RAIDb-1 controller
• RAIDb-1-0 – no limit to the composition deepness
RAIDb-0 controller
table w
table x & y
RAIDb-0 controller
table z
table w
table y
table x & z
RAIDb-0 controller
© Emic 2005
table w
table x
Emmanuel Cecchet | ApacheCon EU 2005
table y
table z www.emicnetworks.com
12
Outline
• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
13
C-JDBC overview • Middleware implementing RAIDb – 100% Java implementation – open source (LGPL)
• Two components – generic JDBC driver (C-JDBC driver) – C-JDBC Controller
• Read-one, Write all approach – provides eager (strong) consistency
• Supports heterogeneous databases
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
14
architectural overview
Application server C-JDBC DB2 Univ. JDBC Driver driver JDBC JVM
© Emic 2005
C-JDBC controller Derby with Network server
JVM
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
15
Using C-JDBC as an open source driver for Derby
Application server
C-JDBC controller
C-JDBC JDBC driver JVM
Embedded Derby
© Emic 2005
JVM
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
16
Inside the C-JDBC Controller Client application (Servlet, EJB, ...)
Client application (Servlet, EJB, ...)
C-JDBC driver
C-JDBC driver
C-JDBC Controller
HTTP RMI ... JMX administration console
C-JDBC driver
Virtual database 1
Virtual database 2
Authentication Manager
Authentication Manager
Request Manager
Request Manager
Configuration Monitoring
JMX Server
Client application (Servlet, EJB, ...)
Scheduler
Scheduler Recovery Query result cache Log
Recovery Query result cache Log Load balancer
Load balancer Database dumps management
Checkpointing service
© Emic 2005
Database Backend
Database Backend
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Oracle JDBC driver
Derby
Derby
Derby
Derby
Emmanuel Cecchet | ApacheCon EU 2005
Oracle
www.emicnetworks.com
17
Virtual Database Java client program (Servlet, EJB, ...)
• gives the view of a single database
C-JDBC driver
Virtual database Authentication Manager
• establishes the mapping between the database name used by the application and the backend specific settings
Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
• backends can be added and removed dynamically • configured using an XML configuration file
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
18
Authentication Manager Java client program (Servlet, EJB, ...)
C-JDBC driver
Virtual database Authentication Manager Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
• Matches real login/password used by the application with backend specific login/ password • Administrator login to manage the virtual database
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
19
Scheduler Java client program (Servlet, EJB, ...)
• Manages concurrency control
C-JDBC driver
Virtual database Authentication Manager
• Pass-through
Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
• Specific implementations for RAIDb 0, 1 and 2 • Optimistic and pessimistic transaction level – uses the database schema that is automatically fetched from backends
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
20
Request cache •
Java client program (Servlet, EJB, ...)
– tunable sizes
C-JDBC driver
• Virtual database
Request Manager
•
Scheduler Request Cache
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
metadata cache – column metadata – fields of a request
Load balancer
Database Backend
parsing cache – parse request skeleton only once – INSERT INTO t VALUES (?,?,?)
Authentication Manager
Recovery Log
3 optional caches
•
result cache – – – –
caches results from SQL requests tunable consistency fine grain invalidations optimizations for findByPk requests
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
21
Load balancer 1/2 Java client program (Servlet, EJB, ...)
•
C-JDBC driver
– query directed to the backend having the needed tables
Virtual database Authentication Manager
•
Scheduler Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
RAIDb-1 – read executed by current thread – write executed in parallel by a dedicated thread per backend – result returned if one, majority or all commit – if one node fails but others succeed, failing node is disabled
Request Manager
Recovery Log
RAIDb-0
•
RAIDb-2 – same as RAIDb-1 except that writes are sent only to nodes owning the updated table
Derby
© Emic 2005
Derby
Derby
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
22
Load balancer 2/2 Java client program (Servlet, EJB, ...)
• Static load balancing policies
C-JDBC driver
– Round-Robin (RR) – Weighted Round-Robin (WRR)
Virtual database Authentication Manager
• Least Pending Requests First (LPRF)
Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
– request sent to the node that has the shortest pending request queue – efficient even if backends do not have homogeneous performance
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
23
Connection Manager Java client program (Servlet, EJB, ...)
• C-JDBC driver provides transparent connection pooling
C-JDBC driver
• Connection pooling for a backend
Virtual database Authentication Manager Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
– – – –
no pooling blocking pool non-blocking pool dynamic pool
• Connection pools defined on a per login basis – resource management per login – dedicated connections for admin
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
24
Recovery Log Java client program (Servlet, EJB, ...)
• Checkpoints are associated with database dumps
C-JDBC driver
Virtual database Authentication Manager
• Record all updates and transaction markers since a checkpoint
Request Manager Scheduler Recovery Log
Request Cache Load balancer
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
© Emic 2005
• Used to resynchronize a database from a checkpoint • JDBCRecoveryLog – store log information in a database – can be re-injected in a C-JDBC cluster for fault tolerance
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
25
Functional overview (read)
Client application (Servlet, EJB, ...)
Client application (Servlet, EJB, ...)
C-JDBC driver
C-JDBC driver
connect connect login, password execute SELECT *myDB FROM t
C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer
© Emic 2005
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
26
Functional overview (write)
Client application (Servlet, EJB, ...)
Client application (Servlet, EJB, ...)
C-JDBC driver
C-JDBC driver
execute INSERT INTO t …
C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer
© Emic 2005
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby
Derby
Derby
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
27
Failures execute INSERT INTO t … • No 2 phase-commit – parallel transactions – failed nodes are automatically disabled
Client application (Servlet, EJB, ...)
Client application (Servlet, EJB, ...)
C-JDBC driver
C-JDBC driver
C-JDBC Controller Virtual database 1 Authentication Manager Request Manager Scheduler Recovery Query result cache Log Load balancer
© Emic 2005
Database Backend
Database Backend
Database Backend
Connection Manager
Connection Manager
Connection Manager
Derby JDBC driver
Derby JDBC driver
Derby JDBC driver
Derby Derby Emmanuel Cecchet | ApacheCon EU 2005
Derby
www.emicnetworks.com
28
Outline
• RAIDb • C-JDBC • Derby and C-JDBC • Scalability • High availability
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
29
Highly available web site •
Apache clustering – L4 switch, RR-DNS, One-IP techniques, LVS, …
•
Tomcat clustering – mod_jk (T4), mod_proxy/mod_rewrite (T5), session replication
•
Database clustering – C-JDBC
RR-DNS
mod-jk
Open source driver Parsing cache Result cache Metadata cache
Internet
embedded
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
30
Result cache • Cache contains a list of SQL • Policy defined by queryPattern
ResultSet Policy
• 3 policies – EagerCaching: variable granularities for invalidations – RelaxedCaching: invalidations based on timeout – NoCaching: never cached RUBiS bidding mix with 450 clients Throughput (rq/min)
© Emic 2005
No cache
Coherent cache
Relaxed cache
3892
4184
4215
Avg response time
801 ms
284 ms
134 ms
Database CPU load
100%
85%
20%
C-JDBC CPU load
-
15%
7%
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
31
Configuring C-JDBC as a Derby driver (1/3) C-JDBC Controller
Client application
SingleDB configuration
org.objectweb.cjdbc.driver.Driver
C-JDBC Driver
jdbc:derby:path;create=true
Embedded Derby
jdbc:cjdbc://host/db
• copy c-jdbc-driver.jar in client application classpath Class.forName(“org.objectweb.cjdbc.driver.Driver”); Connection c = DriverManager.getConnection( “jdbc:cjdbc://host/db”, “login”, “password”);
• copy derby.jar in $CJDBC_HOME/drivers
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
32
Configuring C-JDBC as a Derby driver (2/3)
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
34
Highly available web site • Multiple databases – choosing RAIDb level – recovery log for • adding nodes dynamically • recovering from failures
Internet
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
35
Derby clustering with C-JDBC Client application
Client application
org.objectweb.cjdbc.driver.Driver
org.objectweb.cjdbc.driver.Driver
C-JDBC Driver
C-JDBC Driver
jdbc:cjdbc://host1/db C-JDBC Controller RAIDb configuration com.ibm.db2.jcc.DB2Driver
org.objectweb.cjdbc.driver.Driver
DB2 Universal Driver for Derby
C-JDBC Driver
jdbc:derby:net://host2:1527/db; create=true;retrieveMessagesFro mServerOnGetMessage=true;
jdbc:cjdbc://host3/db
C-JDBC Controller Network Server
SingleDB configuration
jdbc:derby:path;create=true
Derby © Emic 2005
Embedded Derby Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
36
Configuring C-JDBC with Derby Network server •
Virtual database configuration file
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
37
Configuring C-JDBC with Derby/C-JDBC •
Virtual database configuration file
© Emic 2005
Emmanuel Cecchet | ApacheCon EU 2005
www.emicnetworks.com
39
Configuring C-JDBC Clustering with Derby (2/2)