Replication


IBM Software Group – Information Management – solidDB

Data Replication in Contemporary Database Products
Antoni Wolski, Chief Researcher, solidDB
IBM Helsinki Lab, Software Group

HUT 2008-11-26


Network environment of the mobile age

[Figure: "Clients" reach "Servers" over a mix of networks: WLAN, LAN, Internet, mobile net, VPN, fixed telephone net.]

Problem: predictability of the end-to-end path
• Bandwidth and its fluctuations
• Delays
• Availability (connectability)
• Reliability (fault tolerance)

Quality of Service (QoS) is still only a promise.

What are the forms of distribution?

[Figure: the logical view of data (global schema) with tables T1, T2, T3 and T4 is mapped onto possible scenarios (local schemas) at Node A, Node B and Node C, which hold the fragments T1a, T1b, T1c and the tables T3 and T4.]

• Tablewise partitioning: T3, T4
• Horizontal partitioning: table T1
• Replication: table T2

Why distribute data?

You partition your data in order to
• get better throughput (by load balancing)
• delegate responsibility for data geographically
• downsize

You replicate your data in order to
• get better throughput
• improve response time
• improve data availability (fault tolerance) in the presence of
  - node failures
  - network failures (partitions)
  - weak connectivity (mobile!)

Data recharging in mobile environment

1. Connected to the net: power and data are charged (loaded)
2. Disconnected: using pre-loaded power and data
3. Connected to the net again: RECHARGING power and data

Replication: Main Concepts

• Replication: copying data with the purpose of using the copied data (the replica) instead of the original (the master)
• Synchronization: a method to propagate data changes to the respective copies (in a consistency-preserving way)
• Refresh: a one-time operation of bringing a copy up to date

Replication submodels:
- update model
- refresh model
- management model

Principles of transaction-based processing

Transaction (unit of work): a sequence of operations between BEGIN TRANSACTION and COMMIT, having the following properties (ACID):
• Atomicity: either all or none
• Consistency: the effect of a transaction is a consistent database state
• Isolation: the changes are not seen before they are committed
• Durability: once committed, the effects are permanent

A system maintaining ACID properties produces serializable and recoverable transaction schedules. If multiple copies are updated in an ACID transaction, the updates are one-copy serializable (1SR). This is the highest possible consistency level for replicated data.

Replication Techniques: Main Division

• Synchronous (eager) replication
  - multiple copies are updated in a single transaction (a distributed atomic commit protocol, like Two-Phase Commit, is needed)
  - guarantees ACID characteristics and one-copy serializability
  → strict consistency

• Asynchronous (lazy) replication
  - you read or write one copy only, in a transaction
  - the change is propagated to other copies later on
  - may be one-way or two-way
  - may be peer-to-peer or hierarchical
  - various synchronization models
  - various refresh models
  - data consistency is maintained by way of conflict detection and resolution (i.e. reconciliation)
  → eventual consistency (snapshot consistency)

Synchronous replication requires a distributed transaction [Eager replication]

[Figure: the Application issues a one-phase commit to the Coordinator, which runs the commit protocol with the Participants; the Coordinator and each Participant keep their own log.]

The coordinator has to be run on a reliable node.

Eager replication with Two-Phase Commit (2PC)

Message flow between the Coordinator and a Participant (one of many):

Update operations
• Coordinator: send UPDATE
• Participant: take write locks, execute updates, log changes

COMMIT is requested

Phase 1
• Coordinator: log prepare, send PREPARE ("get ready for committing")
• Participant: log vote (WAL), reply PREPARE OK ("ready for committing")

Phase 2
• Coordinator: log decision (WAL), send COMMIT ("now, commit all")
• Participant: log commit (WAL) and forget, reply COMMIT OK
• Coordinator: log final; the transaction is Committed

WAL = write-ahead log
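The two phases can be sketched in a few lines of Python. This is a minimal in-memory illustration, not solidDB code: the names `Participant`, `two_phase_commit` and the `can_commit` flag are hypothetical, the "logs" are plain lists, and timeouts, crashes and recovery are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """One replica taking part in the commit protocol (hypothetical model)."""
    name: str
    can_commit: bool = True              # simulates a local failure or lock conflict
    log: list = field(default_factory=list)
    state: str = "active"

    def prepare(self) -> bool:
        # Phase 1: record the vote in the write-ahead log before replying.
        if self.can_commit:
            self.log.append("prepared")
            self.state = "prepared"
            return True
        self.log.append("abort")
        self.state = "aborted"
        return False

    def finish(self, decision: str) -> None:
        # Phase 2: log the coordinator's decision, then forget the transaction.
        if self.state != "aborted":
            self.log.append(decision)
            self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(participants: list, coordinator_log: list) -> str:
    """Run both phases and return the global decision ('commit' or 'abort')."""
    coordinator_log.append("prepare")
    votes = [p.prepare() for p in participants]    # Phase 1: collect all votes
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append(decision)               # decision is logged before it is sent
    for p in participants:                         # Phase 2: propagate the decision
        p.finish(decision)
    coordinator_log.append("final")
    return decision
```

A single "no" vote aborts the transaction everywhere, which is why every participant has to be reachable for an update to commit.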

ROWA (Read One Write All, with 2PC) [Eager replication]

Definition: update all the copies in a single transaction.

• Why good: offers full transactional ACID-correctness (and 1SR)

• Why bad:
  - all nodes have to be on-line (prone to net partitioning and node failures)
    → unacceptable in a mobile environment
    → slight improvement: ROWAA (write all available)
  - long response time
  - long lock times at participants
    → data availability deteriorates
    → probability of deadlock grows [GHOS96]: Deadlock_rate ~ TPS × Trans.time × Actions^5 × Nodes^3
  - participant loses autonomy
  - coordinator has to be reliable

New opportunity: HA cluster platforms with a high-speed backplane net

Voting copy protocols [Eager replication]

Objective: preserve 1SR in the presence of net and node failures

Basic majority voting [Gif79]
[Figure: the App talks to a Manager; the Manager collects votes from the copies.]
• Quorum rule: any two write quorums, as well as any read quorum and any write quorum, have at least one node in common
• Voting takes place both when data is written and when it is read → makes reading difficult
• Basic method: lock-based; also optimistic with validation [Tho79]

Improvements
• Weighted voting [Gif79]
• Dynamic voting [JM87]
• Tree quorum [AE90] (read one, vote for updates)
• ... about 150 papers have been written on (eager) voting protocols
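The quorum rule can be checked mechanically. The sketch below assumes the usual weighted-voting formulation, in which each copy holds a number of votes and a quorum is a vote count; the function name `quorums_are_valid` is hypothetical.

```python
def quorums_are_valid(votes: list, read_quorum: int, write_quorum: int) -> bool:
    """Check the quorum rule for copies holding the given numbers of votes.

    Two write quorums overlap when 2 * w exceeds the total number of votes;
    a read quorum and a write quorum overlap when r + w exceeds the total.
    """
    total = sum(votes)
    write_write_overlap = 2 * write_quorum > total
    read_write_overlap = read_quorum + write_quorum > total
    return write_write_overlap and read_write_overlap
```

With five equally weighted copies, `r = 3, w = 3` is plain majority voting, while `r = 1, w = 5` degenerates into ROWA: reads become cheap, but every copy must be reachable for a write.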

Other 1SR approaches [Eager replication]

• Partial write operations (smaller granularity) [RL94] → fewer conflicts
• Using group communication protocols (atomic multicast) [KA00]
  - total order multicast
  - conflict checking at arrival
  - delaying at conflict
  - no 2PC, no deadlock
• Graph-based "lazy" methods [BKRS99]
  - access a global replication graph
• Epidemic commit protocols (decentralized voting) [AES97, RGK96]
  - stages: voting, decision reached, decision propagation
  - all copies are eventually consistent

Summary of eager methods

• Guarantee the best (strict) consistency
• Are resilient to certain failures
• Because the nodes need to be connected, they are not applicable to mobile environments

Lazy Update Models [Lazy replication]

• Lazy master
  - All updates are directed to the master (primary) copy (update transactions also read from the master copy)
  → the master copy is consistent at all times
  → other copies are refreshed by some method

• Lazy replica
  - Any replica copy may be updated
  - The change is pushed to the master
  - Checking against possible conflicts is done
  - If necessary, reconciliation is performed
  → reconciliation leads to compensating transactions

• Intelligent transaction (Solid)
  - The update is sent to the master in the form of a transaction including conflict checking and reconciliation code (user-defined procedures)

• Peer-to-peer (symmetric) update (anywhere)
  - All copies are equal, any may be updated
  → distributed conflict resolution and reconciliation

Primary copy method (Lazy Master) [Lazy replication]

• Basic version [Sto79]
  - Updates are done in the primary copy (single-node transaction)
  - Reads may be done from secondary copies, but the read objects have to be locked at the primary copy
  - Once the primary copy transaction is completed, the secondary copies are refreshed

• Why good:
  - there is always a consistent database somewhere
  - no distributed transactions needed → better performance
  - readers of local data get snapshot consistency

• Why bad:
  - primary copy has to be on-line
  - primary copy is a bottleneck

This is the dominant method in practice.

Primary copy methods: virtual primary copy [Lazy replication]

• Partitioned primary copy
• Virtual node [BK97]
• Partitioning is enforced by way of an application-level agreement

[Figure: several copies of Table T1, each consisting of a primary copy partition and a replica part.]

Operation of Update Models [Lazy replication]

Lazy master
• The data update is executed at the Master, which returns Commit OK
• The Replica is brought up to date by asynchronous refresh

Lazy replica
• The data update is executed at the Replica, which returns Commit OK
• Update propagation to the Master, followed by consistency enforcement (system internal)
• Asynchronous refresh of the Replica

Variations of lazy replica [Lazy replication]

• Base/mobile transaction (two-tier) [GHOS96]
  - base transactions run in the ROWA mode
  - mobile transactions run locally, when not connected
  - when reconnected, mobile transactions are run as base transactions → reconciliation needed
  [Figure: the Replica runs a tentative mobile transaction (data update, Commit OK); later it is run as a distributed base transaction involving the Master; a compensating transaction is needed when the base transaction fails.]

• Intelligent transaction (Solid) [Wol01]
  [Figure: the Replica executes the intelligent transaction (data update, Commit OK); transaction propagation carries it to the Master, where its execution includes consistency enforcement; the Replica is then updated by asynchronous refresh.]

Update-anywhere Methods [Lazy replication]

• You can update (and commit) any copy
• Auxiliary information is collected about the transactions: timestamps, read set, etc.
• The updates are propagated to other copies
• With the use of the auxiliary information, conflicts are detected
• Conflicts are resolved using various methods
• How does the application know?

[Figure: Copy 1, Copy 2, Copy 3 and Copy 4 exchanging an update; one step is marked ×.]

Resolution heuristics:
- First one wins
- Last one wins
- The boss wins
- Master (primary) node wins
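The resolution heuristics listed above amount to picking one winner among conflicting updates of the same item. A minimal sketch, assuming each update carries a timestamp, an originating node and a user priority (the `Update` fields, the heuristic names and `resolve` are all hypothetical, chosen for illustration):

```python
from dataclasses import dataclass


@dataclass
class Update:
    node: str          # copy where the update was committed
    timestamp: int     # commit time of the update
    value: object      # new value of the item
    priority: int = 0  # hypothetical user rank; "the boss" has the highest


def resolve(updates: list, heuristic: str, master_node: str = None) -> Update:
    """Pick the winning update among conflicting ones."""
    if heuristic == "first_wins":
        return min(updates, key=lambda u: u.timestamp)
    if heuristic == "last_wins":
        return max(updates, key=lambda u: u.timestamp)
    if heuristic == "boss_wins":
        return max(updates, key=lambda u: u.priority)
    if heuristic == "master_wins":
        for u in updates:
            if u.node == master_node:
                return u
        raise ValueError("no conflicting update came from the master node")
    raise ValueError(f"unknown heuristic: {heuristic}")
```

Whichever heuristic is used, the losing updates were already committed locally, so the application still has to learn that they were overridden.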

Bad News about Update Anywhere [Lazy replication]

• Impossible to guarantee serializability
• Requires application support for compensating transactions
• Reconciliation rate does not scale with the number of copies:
  Reconc_rate ~ TPS^2 × disconnect_time × Actions^2 × Nodes^2
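To see what the proportionality means in practice, the sketch below evaluates the right-hand side of the reconciliation-rate relation. The constant of proportionality is not given on the slide, so only ratios between two configurations are meaningful; the function name is hypothetical.

```python
def relative_reconciliation_rate(tps: float, disconnect_time: float,
                                 actions: float, nodes: float) -> float:
    """Right-hand side of Reconc_rate ~ TPS^2 x disconnect_time x Actions^2 x Nodes^2.

    The result is only defined up to an unknown constant factor,
    so compare two configurations by taking the ratio of the values.
    """
    return tps ** 2 * disconnect_time * actions ** 2 * nodes ** 2


# Doubling the number of nodes while keeping everything else fixed:
before = relative_reconciliation_rate(tps=10, disconnect_time=60, actions=5, nodes=10)
after = relative_reconciliation_rate(tps=10, disconnect_time=60, actions=5, nodes=20)
growth = after / before   # quadratic growth in the number of nodes
```

Under this relation, doubling the number of copies quadruples the reconciliation rate, while doubling the disconnection time only doubles it.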

Conflict detection in symmetric methods [Lazy replication]

Problem definition:
• transactions should produce locally consistent states
• transactions should be globally ordered

Basic conflict detection method (using elements of synchronous timestamp-based voting [Tho79])
• Assumption: unique, monotonic timestamps are available
• Each Xact has a timestamp TSx (the ordering of the TSx's dictates the eventual serialization order)
• Each item has a timestamp TSu of the last update Xact
• Xact collects the TSu's of its read set on a read list
• Xact is propagated to a symmetric copy site and is validated iff the TSu's on the read list match the corresponding ones on the copy site (= read-write conflicts avoided)
• When validated, the update is applied if TSx > TSu, for each item separately; otherwise it is ignored (Thomas write rule) (= transactions are globally ordered in the case of a write-write conflict). Assign TSu := TSx.
• If Xact is not validated, reconciliation is needed.

Alternative:
• use original item values in place of timestamps (more restrictive)
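The validation and write steps of this method can be sketched as follows. It is an illustrative in-memory model under the slide's assumption of unique, monotonic timestamps; the class names `Item` and `Xact` and the function `apply_at_copy` are hypothetical, and a copy site is modelled as a plain dictionary.

```python
from dataclasses import dataclass


@dataclass
class Item:
    value: object
    ts_u: int = 0      # TSu: timestamp of the last update transaction


@dataclass
class Xact:
    ts_x: int          # TSx: timestamp of this transaction
    read_list: dict    # item key -> TSu observed at the originating copy
    writes: dict       # item key -> new value


def apply_at_copy(copy: dict, xact: Xact) -> bool:
    """Validate a propagated transaction at a copy site and apply its writes.

    Returns False when validation fails, i.e. reconciliation is needed.
    """
    # Validation: every TSu on the read list must match the copy site.
    for key, seen_ts in xact.read_list.items():
        current = copy.get(key)
        if current is None or current.ts_u != seen_ts:
            return False
    # Thomas write rule, applied to each item separately.
    for key, value in xact.writes.items():
        item = copy.setdefault(key, Item(None))
        if xact.ts_x > item.ts_u:
            item.value = value
            item.ts_u = xact.ts_x
        # otherwise the write is older than the stored one and is ignored
    return True
```

A transaction that read a stale version of any item fails validation and leaves the copy untouched; a validated transaction may still have some of its writes silently dropped when a newer write has already arrived.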

Update anywhere: epidemic methods [Lazy replication]

Deno system: weighted voting via epidemic information flow [CK00, CKF01, CK02]

[Figure: elections for voting on an update of item x among nodes A, B, C, D and E; the update candidates collect votes (numeric labels 2, 2, 3, 1, 1) and one of them becomes the winner.]

Step 1: voting and collecting votes (anti-entropy sessions)
Step 2: awarding election
Step 3: decision propagation

Problems
• cannot commit an update immediately (eager/lazy?)
• single objects only (no transactions)
• propagation takes time → more conflicts

Update anywhere: evaluation [Lazy replication]

• Why good:
  - the application uses the local copy only → the best one for mobile applications

• Why bad:
  - never guarantees 1SR
  - more programming: the reconciliation method is a part of the application
  - the application must be ready for compensating transactions
  - reconciliation rate does not scale well with the number of copies [GHOS96]:
    Reconc_rate ~ TPS^2 × disconnect_time × Actions^2 × Nodes^2

Consistency Models of Replicated Data

• Transactional consistency
  - ACID, one-copy serializability
  → synchronous, eager replication methods

• Snapshot-correctness
  - Strong snapshot consistency: a transaction views a consistent state that is a point (possibly in the past) in the global serializable history
  - Weak snapshot consistency: transactions view only committed data, but the views may reflect different serialization orders (not updateable)
  → lazy replication methods

• Otherwise, in non-transactional methods:
  - Dirty read: a copy may contain uncommitted data (dangerous!)

• Other consistency criteria
  - temporal consistency
  - application-specific (semantic) consistency

Strong and Weak Snapshot Consistency

[Figure: two Master/Replica pairs, labelled Strong and Weak. In the weak case, transactions arrive in the wrong order!]

Weakly consistent snapshots are not really updateable.

Refresh Models

• Push: on a data change, the Server refreshes the data at the Client
  + Client programming is easy
  + Can guarantee data freshness
  - Does not scale with the number of replicas

• Pull: the Client requests a refresh; the Server returns the refresh data
  + Scales well
  - Data freshness is difficult
  - Client programming is more demanding

• Notified Pull (Solid): on a data change, the Server notifies the Client about the change; the Client requests a refresh and the Server returns the refresh data
  + Scales best
  + Data freshness is OK
  - Client programming is demanding

Efficiency of Refresh

[Figure: with Full refresh, the whole table is sent from Master to Replica; with Differential refresh, only the changed data is sent.]

Characteristics of a good refresh method [Lin+86]

• sends all changes
• minimizes impact on the base table (i.e. additional columns, etc.)
• transmits as little data as possible
• maintains an n-to-m relationship between tables and snapshots

Differential refresh methods

Example: differential refresh using versions (Solid)
• Additional history table at master
• Additional column in the master and replica tables: row version
• On master update:
  - increment the row version
  - copy the old version to the history table and stamp it with the new version # (to indicate the time the data became 'old')
  - Note: on delete, the deleted row will be here.
• On refresh request
  - parameters: latest version received before, query constraint
  - retrieve from the history table the rows since the last refresh and send them to the replica for removal (delete)
  - retrieve from the base table the rows changed since the last refresh and send them to the replica for insertion
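The version-based scheme can be illustrated with a small in-memory model. This is a sketch of the idea only, not the solidDB implementation: the class `Master`, its methods and `apply_refresh` are hypothetical, tables are dictionaries, a single global version counter is assumed, and the query constraint of the refresh request is omitted.

```python
class Master:
    """Master table with a row-version column and a history table (hypothetical model)."""

    def __init__(self):
        self.version = 0
        self.base = {}      # key -> (row_version, data)
        self.history = []   # (key, version at which the row became 'old')

    def _retire(self, key):
        # Copy the old row to the history table, stamped with the new version.
        if key in self.base:
            self.history.append((key, self.version))

    def upsert(self, key, data):
        self.version += 1
        self._retire(key)
        self.base[key] = (self.version, data)

    def delete(self, key):
        self.version += 1
        self._retire(key)            # on delete, the deleted row ends up in history
        self.base.pop(key, None)

    def refresh(self, last_version):
        """Return (keys to delete, rows to insert, new latest version)."""
        deletes = [k for k, v in self.history if v > last_version]
        inserts = {k: d for k, (v, d) in self.base.items() if v > last_version}
        return deletes, inserts, self.version


def apply_refresh(replica: dict, deletes: list, inserts: dict) -> None:
    # Remove outdated rows first, then insert the current versions.
    for key in deletes:
        replica.pop(key, None)
    replica.update(inserts)
```

The replica only has to remember the latest version it received; an updated row travels as a delete of the old version followed by an insert of the new one.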

Controlling push refresh

• Transaction-wise, transactional: results of each transaction are sent to replicas immediately after Commit
• Clock-based: committed changes are sent according to a timetable
• Interval-based: committed changes are sent at intervals

Methods for extracting the refresh data:
• log sniffing
• table-wise logs
• subscription-wise persistent queues (typically maintained in a replication server)

Summary of lazy replication

• There must be readiness for committed transactions to be undone (compensated)
• Data is always available to mobile applications
• It is difficult to maintain consistency of copies → various levels of consistency are available

Replication vs. caching

Why not cache the data when needed? (disk or web caching analogy) → transactional caching

Two approaches to transactional caching [FCL97]
• Avoid access to stale data: check/reload the cache each time the data is accessed
  - eager → ROWA (contradicts the idea of caching)
  - lazy → dynamic subscribe, lazy replica
• Detect access to stale data: validate data at commit
  - eager → cache 2PL method
  - lazy → dynamic subscribe, lazy replica

Caching is a form of dynamic replication.

Replication Management Models

• Static: use data definition commands (CREATE SNAPSHOT ...)
• Dynamic: publish/subscribe model (CREATE PUBLICATION ...)

[Figure: a Publication (= a database view) over tables T1 and T2 is refreshed into Subscriptions that hold T1 and/or T2.]

A subscription may be restricted.

User Profiling

A profile represents the needs of a user.

[Figure: a Publication over tables T1 and T2 is refreshed into Subscriptions; Profile 1 and Profile 2 determine what each Subscription receives.]

• In the push model, the profile has to be on the Master side
• In the pull model, the profile can be on the Replica side

Introducing Availability

Availability = Actual Time in Service / Intended Time in Service

Availability    Allowed downtime per year
95%             438 h = 18 days
99%             87 h = 3.5 days
99.9%           8 3/4 hours
99.99%          53 min
99.999%         5 1/4 min
99.9999%        32 sec

[Figure labels placed along the availability scale: Traditional Data Center; No redundancy; 7/24 Web Server; Clustering, Shared Resources; Network Infrastructure; Components of Five 9's Systems; Hot Standby, Shared Nothing.]
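The downtime figures follow directly from the availability definition. A short sketch, assuming the intended time in service is a full year of 365 days (the names `HOURS_PER_YEAR` and `allowed_downtime_hours` are hypothetical):

```python
HOURS_PER_YEAR = 365 * 24   # intended time in service: every hour of a 365-day year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for target in (95, 99, 99.9, 99.99, 99.999, 99.9999):
    hours = allowed_downtime_hours(target)
    print(f"{target}%: {hours:.3f} h = {hours * 60:.2f} min = {hours * 3600:.0f} s")
```

Each additional nine divides the downtime budget by ten, which is why the step from 99.9% to 99.999% moves the budget from hours to minutes.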

A fully replicated hot-standby database (HSB)

[Figure: Node A runs the DB Process (Active) with the Primary database D; Node B runs the DB Process (Standby) with the Secondary database D'. Database transactions go to the active process, and HSB replication carries them to the standby.]

• checkpoints the state
• maintains transactional consistency
• maintains transaction durability

HA state diagram of the solidDB HA Database Server

[Figure: state diagram with the states OFFLINE, SECONDARY ALONE, SECONDARY ACTIVE, PRIMARY UNCERTAIN, STANDALONE, PRIMARY ALONE and PRIMARY ACTIVE; entry points "Start (no database)" and "Start (database exists)"; labelled transitions include Failover and Switchover.]

solidDB HA: logging architecture

[Figure: on Commit, the Transaction Logger of the Active server writes to the Log of the Primary DB and, through the replication protocol, reaches the Transaction Logger of the Standby server, which writes to the Log of the Secondary DB; OK is returned.]

There are two ways to maintain transaction durability:
• disk-based durability (transaction logging)
• network-based durability (transactional replication)

1-safe and 2-safe Replication

[Figure, 1-safe: the Active server (Primary DB) returns Commit OK first; the committed transaction is then sent to the Standby server (Secondary DB), which answers OK.]

[Figure, 2-safe: the Active server (Primary DB) sends the committed transaction to the Standby server (Secondary DB), receives its OK, and only then returns Commit OK.]

The safeness level may be controlled dynamically, by a session or transaction.
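The difference between the two safeness levels is the point at which the application receives Commit OK. The sketch below models this with plain lists; `HotStandbyPair` and its methods are hypothetical names, and the network is reduced to an in-flight queue.

```python
class HotStandbyPair:
    """Active and standby server with a per-transaction safeness level (hypothetical model)."""

    def __init__(self):
        self.primary = []     # transactions committed at the active server
        self.secondary = []   # transactions received by the standby server
        self.in_flight = []   # 1-safe transactions not yet shipped to the standby

    def commit(self, txn, safeness="2-safe"):
        if safeness not in ("1-safe", "2-safe"):
            raise ValueError(f"unknown safeness level: {safeness}")
        self.primary.append(txn)
        if safeness == "2-safe":
            # Commit OK is returned only after the standby has the transaction.
            self.secondary.append(txn)
        else:
            # Commit OK is returned at once; shipping happens afterwards.
            self.in_flight.append(txn)
        return "OK"

    def ship(self):
        # Asynchronous delivery of the pending 1-safe transactions.
        self.secondary.extend(self.in_flight)
        self.in_flight.clear()

    def lost_on_failover(self):
        # Committed transactions the standby would miss if the active server failed now.
        return [t for t in self.primary if t not in self.secondary]
```

With 1-safe the commit is faster, but a transaction acknowledged to the application can be lost if the active server fails before shipping; with 2-safe the commit waits for the standby and nothing acknowledged is lost.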

Data Replication Implementation

Three approaches:
• Do it yourself
• Use sync middleware
• Use a replicating database

Three Approaches: (1) Do It Yourself

[Figure: the Server app (server node, Master) and the Mobile app (mobile node, Replica) communicate over the net through a self-made data transmission and replication layer.]

You have to implement:
• Replication protocol with recovery
• Data structures
• Definition tools for data structures
• Access methods
• Methods for consistency preserving
• Methods for data recovery

Three Approaches: (2) Sync Middleware

[Figure: the Server app (server node, Master DBMS) and the Mobile app (mobile node, Replica) communicate over the net through a sync middleware platform.]

You have to implement:
• Data structures in the mobile node
• Definition tools for mobile data
• Access methods for mobile data
• Methods for consistency preserving
• Methods for mobile data recovery

Three Approaches: (3) Replicating Database

[Figure: the Server app and the Mobile app each run on top of a replicating database (DBMS), acting as Server and Client, connected over the net.]

You have to implement:
• Only the application logic

You get:
• data definition
• data storage
• efficient access methods
• standard interfaces (SQL, ODBC)
• flexible data views
• expandability
• run-time adaptability
• transactions and concurrency
• recovery
• replication protocols

Solid's Replication Technology

• A tree-structured replication hierarchy
• Publish/Subscribe model (Master-based publications and Replica-based subscription profiles)
• Consistency verified at the Master → strong snapshot consistency at Replica
• Lazy master and lazy replica models
• Refresh by way of Pull by Replica
• Notified Pull is available
• Conflict detection and reconciliation by way of Intelligent Transactions
• Cross-platform operation
• High availability with Hot Standby

[Figure: Flow Engine hierarchy. Tier 1: Master with Hot Standby; Tier 2: Master & replica; Tier 3: Replicas.]

More at: http://www.solidtech.com

Products: replicating databases

Vendor     Server product            Client product               Update model                                               Refresh + management
Oracle     Oracle9i                  Oracle9i Lite                lazy master; lazy replica                                  pull; full + differential
Microsoft  SQL Server                SQL Server for CE            lazy master; lazy replica                                  pull; full + differential; publications
IBM        DB2 Universal Server      DB2 Everyplace               lazy master; lazy replica                                  push; pull; full + differential
Sybase     Adaptive Server Anywhere  UltraLite + MobiLink         lazy master; lazy replica                                  pull; full + differential
Solid      FlowEngine                FlowEngine, HotStandby, CDC  lazy + eager master; lazy replica; intelligent transaction  pull, notified pull; full + differential; publications

Oracle9i and Oracle9i Lite

Symmetry: asymmetric: master (Oracle9i) and snapshot (Lite)
Direction: (1) one-way: read-only snapshot; (2) two-way: updatable snapshot
Architecture: master site (Oracle9i) and snapshot site (Oracle9i Lite)
Update models: (1) lazy master (read-only snapshot); (2) lazy replica (updatable snapshot limited to one table)
Refresh model: (1) snapshot pull, differential and full; (2) in updatable snapshot: push to master
Consistency: (1) serializability in master; (2) snapshot consistency in snapshots
Conflict detection: comparing row values
Reconciliation: done in master: 10 pre-defined methods, PL/SQL procedures
Management: master: with SQL; snapshot: with OLE functions

Microsoft SQL Server and SQL Server for CE

Symmetry: asymmetric: master (SQL Server) and replica (SQL Server for CE)
Direction: (1) one-way: read-only snapshot; (2) two-way: merge replication
Architecture: master site (SQL Server) and replica site (SQL Server for CE)
Update models: (1) lazy master (read-only snapshot); (2) lazy replica (updateable snapshot limited to one table); (3) updateable result sets (with RDA)
Refresh model: (1) replica pull, differential and full; (2) upload to master
Consistency: (1) serializability in master; (2) snapshot consistency in snapshots
Conflict detection: comparing row values
Reconciliation: done in master: programmable with procedures
Management: SQL and C functions

IBM DB2 and DB2 Everyplace

Symmetry: asymmetric: master (DB2) and replica (DB2 Everyplace)
Direction: (1) one-way: read-only replica (publication); (2) two-way: updateable replica (publication)
Architecture: DB2, DB2 Everyplace and DB2 Everyplace Sync Server
Update models: (1) lazy master (read-only replica); (2) lazy replica (updateable replica limited to one table)
Refresh model: (1) master push: differential and full; (2) in updateable replica: push to master
Consistency: (1) serializability in master; (2) snapshot consistency in replicas
Conflict detection: version-based
Reconciliation: done in master: predefined methods
Management: SQL, GUI tools

Sybase Adaptive Server Anywhere (ASA) and UltraLite

Symmetry: asymmetric: master (ASA) and replica (UltraLite)
Direction: (1) one-way: read-only replica; (2) two-way: updateable replica
Architecture: master site (ASA), replica site (UltraLite) and MobiLink Server
Update models: (1) lazy master (read-only replica); (2) lazy replica (updateable replica limited to one table)
Refresh model: (1) replica pull, differential and full; (2) in updateable replica: push to master
Consistency: (1) serializability in master; (2) snapshot consistency in replica
Conflict detection: comparing row values
Reconciliation: done in master
Management: with extended SQL and C functions

IBM solidDB Replication (Smart Flow)

Symmetry: asymmetric: master and replica
Direction: (1) one-way: read-only subscription; (2) two-way: push and pull by replica
Architecture: identical servers may run as masters and replicas
Update models: (1) lazy master; (2) intelligent transaction (updateable replica)
Refresh model: (1) replica pull, differential and full
Consistency: (1) serializability in master; (2) snapshot consistency in replica
Conflict detection: programmable
Reconciliation: done in master: programmable
Management: with extended SQL

IBM solidDB Universal Cache (December 2008)

A solution to off-load enterprise servers to high-speed front-ends. The more read-intensive the load, the greater the benefit.

[Figure: Front-end: solidDB with the solidDB JDBC driver and CDC for solidDB. Back-end: a data server (DB2, Informix, Oracle, Sybase or MS SQL Server) with "CDC for ...". A CDC management node runs the CDC Management Console.]

CDC = Changed Data Capture (was: DataMirror)

solidDB Universal Cache: subscriptions

[Figure: the CDC for solidDB instance at the Front-end (solidDB) and the CDC for back-end instance at the Back-end are connected by a Download subscription and an Upload subscription.]

IBM solidDB Universal Cache (also: CDC)

Symmetry: symmetric: each side may be source or target, or both
Direction: (1) one-way: read-only subscription; (2) two-way = two one-way subscriptions
Architecture: data servers are equal; CDC instances and tools
Update models: (1) lazy master
Refresh model: (1) replica push, differential and full
Consistency: (1) serializability in master; (2) snapshot consistency in replica
Conflict detection: automatic, based on data changes
Reconciliation: selected rules and user-programmable
Management: with GUI tool and command line (partly)

Summary

• data replication is a predominant form of data distribution
• there are already commercially available technologies to support rudimentary mobile data replication
• more research is needed in the area of profile management and correctness preserving

Bibliography

[AE90] Divyakant Agrawal and Amr El Abbadi. The Tree Quorum Protocol: An Efficient Approach for Managing Replicated Data. VLDB 1990: 243-254.
[AL80] Michel E. Adiba, Bruce G. Lindsay. Database Snapshots. VLDB 1980: 86-91.
[AES97] D. Agrawal, A. El Abbadi, R.C. Steinke. Epidemic Algorithms in Replicated Databases. Proc. ACM PODS'97: 161-172.
[BHG87] Philip Bernstein, Vassos Hadzilacos, Nathan Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley Publishing Company, 1987.
[BK97] Breitbart, Y., Korth, H.F. Replication and consistency: being lazy helps sometimes. Proc. PODS'97: 173-184.
[BKRS99] Yuri Breitbart, Raghavan Komondoor, Rajeev Rastogi, S. Seshadri, Abraham Silberschatz. Update Propagation Protocols For Replicated Databases. SIGMOD Conference 1999: 97-108.
[BM99] Siwoo Byun and Songchun Moon. Resilient data management for replicated Mobile database systems. Data & Knowledge Engineering, 29 (1999): 43-55.
[BN97] Philip A. Bernstein, Eric Newcomer. Principles of Transaction Processing For the System Professional. Morgan Kaufmann Publishers, 1997. Chapter 10: Replication.
[CK00] Ugur Çetintemel, Peter J. Keleher. Performance of Mobile, Single-Object, Replication Protocols. SRDS 2000: 218-227.
[CFZ01] Mitch Cherniack, Michael J. Franklin, Stan Zdonik. Expressing User Profiles for Data Recharging. IEEE Personal Communications, August 2001, 6-13.

[CKF01] Ugur Çetintemel, Peter J. Keleher, Michael J. Franklin. Support for Speculative Update Propagation and Mobility in Deno. ICDCS 2001.
[CK02] Ugur Çetintemel, Peter J. Keleher. Light-Weight Currency Management Mechanisms in Mobile and Weakly-Connected Environments. Distr. and Parallel Databases 11(1) (Jan 2002): 53-71.
[FCL97] Michael J. Franklin, Michael J. Carey, Miron Livny. Transactional Client-Server Cache Consistency: Alternatives and Performance. TODS 22(3): 315-363 (1997).
[GHOS96] J. Gray, P. Helland, P. O'Neil, D. Shasha. The Dangers of Replication and a Solution. SIGMOD'96: 173-182.
[Gif79] D.K. Gifford. Weighted voting for replicated data. 7th Symp. on Operating Systems Principles: 150-162, 1979.
[GR92] Jim Gray and Andreas Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers, 1992.
[GW82] Hector Garcia-Molina, Gio Wiederhold. Read-Only Transactions in a Distributed Database. TODS 7(2): 209-234 (1982).
[Ham95] Brad Hammond. WINGMAN: A Replication Service for Microsoft Access and Visual Basic. Microsoft internal report, 1995.
[JM87] Sushil Jajodia and David Mutchler. Dynamic voting. SIGMOD'87: 227-234.
[KA00] B. Kemme, G. Alonso. A New Approach to Developing and Implementing Eager Database Replication Protocols. TODS 25(3) (Sep 2000): 333-379.
[KD01] Vinay Kanitkar and Alex Delis. Time Constrained Push Strategies in Client-Server Databases. Distributed and Parallel Databases, Vol. 9, no. 1 (Jan 2001), pp. 5-38.
[KR87] Bo Kähler, Oddvar Risnes. Extending Logging for Database Snapshot Refresh. VLDB'87: 389-398.
[KS88] Akhil Kumar and Michael Stonebraker. Semantics Based Transaction Management Techniques for Replicated Data. SIGMOD'88: 117-125.

[LC02] Susan Weissman Lauzac, Panos K. Chrysanthis. Personalizing Information Gathering for Mobile Database Clients. ACM SAC'02, Madrid, Spain.
[Lin+86] Bruce G. Lindsay, Laura M. Haas, C. Mohan, Hamid Pirahesh, Paul F. Wilms. A Snapshot Differential Refresh Algorithm. SIGMOD'86: 53-60.
[LSLH98] Kwok-Wa Lam, Sang Hyuk Son, Victor C. S. Lee, Sheung-lun Hung. Using Separate Algorithms to Process Read-Only Transactions in Real-Time Systems. RTSS 1998: 50-59.
[MPC01] Subhasish Mazumdar, Mateusz Pietrzyk, Panos K. Chrysanthis. Caching Constrained Mobile Data. CIKM 2001: 442-449.
[PB95] Evaggelia Pitoura, Bharat K. Bhargava. Maintaining Consistency of Data in Mobile Distributed Environments. ICDCS 1995: 404-413.
[PB99] Evaggelia Pitoura, Bharat K. Bhargava. Data Consistency in Intermittently Connected Distributed Systems. TKDE 11(6): 896-915 (1999).
[Pit98] Evaggelia Pitoura. Supporting Read-Only Transactions in Wireless Broadcasting. DEXA Workshop 1998: 428-433.
[PC99a] Evaggelia Pitoura, Panos K. Chrysanthis. Scalable Processing of Read-Only Transactions in Broadcast Push. ICDCS 1999: 432-439.
[PC99b] Evaggelia Pitoura, Panos K. Chrysanthis. Exploiting Versions for Handling Updates in Broadcast Disks. VLDB 1999: 114-125.
[Pit96] Evaggelia Pitoura. A Replication Schema to Support Weak Connectivity in Mobile Information Systems. DEXA 1996: 510-520.
[PL91] Calton Pu, Avraham Leff. Replica Control in Distributed Systems: An Asynchronous Approach. Proc. SIGMOD 1991: 377-386.
[PMS01] Esther Pacitti, Pascale Minet, Eric Simon. Replica Consistency in Lazy Master Replication Databases. Distr. and Parallel Databases 9(3): 237-267 (May 2001).

[RGK96] M. Rabinovich, N.H. Gehani, A. Kononov. Scalable Update Propagation in Epidemic Replicated Databases. Proc. EDBT'96: 207-222.
[SA93] O. T. Satyanarayanan, Divyakant Agrawal. Efficient Execution of Read-Only Transactions in Replicated Multiversion Databases. TKDE 5(5): 859-871 (1993).
[RL94] Michael Rabinovich, Edward D. Lazowska. Efficient Support for Partial Write Operations in Replicated Databases. ICDE 1994: 43-53.
[SR90] Amit P. Sheth, Marek Rusinkiewicz. Management of Interdependent Data: Specifying Dependency and Consistency Requirements. Workshop on the Management of Replicated Data 1990: 133-136.
[Sto79] M. Stonebraker. Concurrency Control and Consistency of Multiple Copies of Data in Distributed Ingres. IEEE Trans. on Software Engineering 5(3): 188-194 (May 1979).
[Tho79] Robert H. Thomas. A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases. TODS 4(2): 180-209 (1979).
[Wol01] A. Wolski. Embedding Data Recharging in Mobile Platforms. Real-Time & Embedded Computing Conference (RTEC'01), Milan, Italy, November 26-29, 2001.