Object Replication and CORBA Fault-tolerant Object Service
*Yao-ming Yeh (葉耀明), *Wen-Da Sun (孫文達), and **Yeong-Sheng Chen (陳永昇)
*Department of Information and Computer Engineering, National Taiwan Normal University, Taipei
**Department of Information Management, Hwa-Hsia College of Technology and Commerce, Taipei
Outline
─ Background
─ Fault-Tolerant Mechanisms for CORBA
─ FTOS: Object Replication embedded in CORBA
    ─ Object Replication Mechanism
    ─ Object-Oriented Dynamic Voting Mechanism
    ─ Smart Binding
─ Performance Evaluation
─ Conclusions
Background
CORBA (Common Object Request Broker Architecture): an object-oriented distributed platform proposed by the OMG (Object Management Group)
─ IDL (Interface Definition Language): a cross-platform interface for Java, C, C++, and Ada
─ ORB (Object Request Broker): provides the CORBA object bus
─ Common Object Services
─ Common Facilities
CORBA Architecture
[Figure: CORBA architecture. Components shown: Application Objects, Common Facilities, the CORBA ORB, and the Common Object Services (16), including Event, LifeCycle, Query, Naming, Time, ...]
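A Distributed Object Platform
[Figure: Java, C++, and Ada programs on the client side attach through IDL client stubs, and Java, C++, and Ada programs on the server side attach through IDL server skeletons; both sides are connected by the CORBA ORB]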
Fault-Tolerant Mechanisms for CORBA
─ Integration Approach
─ Interception Approach
─ Service Approach
Integration Approach
The ORB kernel and IDL are modified so that the fault-tolerant mechanisms can be embedded.
─ Isis: IONA Tech. Ltd.
─ Electra: Landis and Maffeis 1997
Interception Approach
System calls from the client are intercepted and modified so that the fault-tolerant mechanisms can be provided.
─ Eternal: Narasimhan et al. 1997
─ FTMP: Moser et al. 1999
Service Approach
A fault-tolerant interface is added for CORBA programmers, providing fault-tolerant capabilities at the level above the ORB.
─ GCS: Felber et al. 1996
─ Primary-Secondary: Sheu et al. 1997
─ Primary-Backup: Guerraoui and Schiper 1997
Our Work: FTOS (Fault-Tolerant Object Service)
─ A service approach that provides fault-tolerant capability for CORBA
─ The fault-tolerant mechanism is based on object replication
─ Dynamic voting is used to provide maximal fault-tolerant capability
─ Smart binding is implemented to improve the performance of the server service
─ A COSS (Common Object Service Specification) library for distributed programmers
FTOS in CORBA
[Figure: a Client Object and a Server Object communicate over the CORBA ORB; FTOS appears among the Common Object Services alongside Event, LifeCycle, Naming, Time, ...]
Object Replication Mechanism
─ The Object Manager automatically replicates the server object using a divide-and-conquer scheme.
─ To improve the performance of object replication, data compression and a local cache are added to the replicate operations.
[Figure: replicas of object A]
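The slide names three ingredients of replication (divide and conquer, data compression, a local cache) without giving details, so the following is only a minimal Python sketch under stated assumptions. It assumes "divide and conquer" means that each host holding a copy forwards it to half of the remaining targets, that compression is DEFLATE (used here via `zlib`; the evaluation slides use JAR, which is also DEFLATE-based), and that the cache is keyed by a content digest. The names `fan_out`, `ReplicaCache`, and `replicate` are hypothetical and are not part of FTOS.

```python
import hashlib
import zlib


def fan_out(holder, targets):
    """Plan (sender, receiver) transfers, halving the target list each step.

    Assumption: the holder hands one copy to a delegate, which then serves
    the second half of the targets while the holder serves the first half.
    """
    if not targets:
        return []
    mid = len(targets) // 2
    left, right = targets[:mid], targets[mid:]
    delegate = right[0]
    return ([(holder, delegate)]
            + fan_out(holder, left)
            + fan_out(delegate, right[1:]))


class ReplicaCache:
    """Hypothetical per-host cache of object images, keyed by content digest."""

    def __init__(self):
        self._store = {}

    def has(self, digest):
        return digest in self._store

    def put(self, digest, image):
        self._store[digest] = image

    def get(self, digest):
        return self._store[digest]


def replicate(image, cache):
    """Deliver an object image to a host; return the number of bytes sent."""
    digest = hashlib.sha256(image).hexdigest()
    if cache.has(digest):           # cache hit: nothing needs to be transferred
        return 0
    payload = zlib.compress(image)  # sender compresses before transfer
    cache.put(digest, zlib.decompress(payload))  # receiver unpacks and stores
    return len(payload)
```

With three targets, `fan_out("A", ["B", "C", "D"])` has A send to C and B while C forwards to D, so every target receives the object exactly once and no single host performs all the transfers.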
Dynamic Voting Mechanism
─ Jajodia (1987) proposed dynamic voting for distributed databases.
─ It is based on the majority voting strategy.
─ Our system proposes a distributed-object version of the dynamic voting mechanism.
Majority Partition Strategy
Replicated objects can tolerate process failures and node failures, but replication alone cannot deal with network partitions. When a network failure splits the system into two parts, the Majority Partition Strategy ensures that the majority part of the system can continue to provide the server service. The majority part is decided according to:
─ the number of replicated objects in each partition;
─ when the two partitions hold equal numbers, the partition that contains the Distinguished Object is the majority part.
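The majority rule above can be sketched as a small Python predicate. This is an illustration only: the function name `is_majority_partition` and the representation of replicas as strings are assumptions, not part of FTOS.

```python
def is_majority_partition(members, all_replicas, distinguished):
    """Return True if `members` forms the majority part of the system.

    members       -- replicas reachable inside this partition
    all_replicas  -- every replica of the object in the whole system
    distinguished -- the Distinguished Object used to break ties
    """
    universe = set(all_replicas)
    inside = set(members) & universe
    if 2 * len(inside) > len(universe):    # strictly more than half
        return True
    if 2 * len(inside) == len(universe):   # exact tie: Distinguished Object decides
        return distinguished in inside
    return False
```

For four replicas S1 to S4 with S1 distinguished, a split into {S1, S2} and {S3, S4} leaves only the first partition able to continue serving requests.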
[Figure: Majority Partition Mechanism: Object Cardinality]
[Figure: Majority Partition Mechanism: Distinguished Object]
State Concurrency Control of Replicated Objects
[Figure: client C sends a request through the ORB to coordinator S1, which sends concurrency requests to subordinators S2 and S3]
Dynamic Voting
[Figure: message sequence among Client, Server1, Server2, and Server3]
1. Request
2. Concurrency_Request
3. Vote_Request
4. Commit or Abort
5. Return
Dynamic Voting: Object Data Structure
─ OS (Object State): indicates the number of times that the data in the object has been changed
─ OC (Object Cardinality): indicates the number of replicated objects
─ DO (Distinguished Object): indicates the distinguished object
Distributed Dynamic Voting Algorithm

[Message]:
receive("request", req, ts), req ∈ R, ts ∈ T
    if req not in request-record then status := active
    else return request-record(ts)
receive("Concurrency_Request", ts), ts ∈ T
    if not conform to Bernstein's algorithm then waiting
receive("Decision_Request", ts), ts ∈ T
    if ts found in lock-record then return false
    else return true
receive("Vote_Request", ts), ts ∈ T
    if lock = true then lock := send Decision_Request to all objects
    if lock = false then {store ts in lock-record
                          return OS, OC, DO}
    else return null

[Transition]:
if status = active then
    {send Concurrency_Request to all subordinators
     status := voting}
if status = voting then
    if lock = true then lock := send Decision_Request to all subordinators
    if lock = false then
        {send Vote_Request to all subordinators
         run Is_Distinguished
         if majority partition exists then
             {store request and result in request-record
              status := commit}
         else status := abort}
if status = commit then
    send commit to all subordinators
    status := idle
if status = abort then
    send abort to all subordinators
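To make the vote, commit, and abort steps concrete, here is a minimal single-process Python sketch of the coordinator's side of one update. It is not the FTOS implementation: the majority test follows the usual dynamic voting rule (a majority of the replicas that took part in the latest commit, with the distinguished object breaking an exact tie), which is an assumption because the slides do not spell it out. The `Replica` class, `run_vote`, and the choice of the new DO as the smallest name are hypothetical, and the Concurrency_Request and Decision_Request steps are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    name: str
    os: int = 0                  # OS: number of committed state changes
    oc: int = 0                  # OC: replicas that took part in the last commit
    do: str = ""                 # DO: name of the distinguished object
    locked: bool = False
    state: Optional[str] = None

    def vote_request(self):
        """Return (OS, OC, DO), or None if another request holds the lock."""
        if self.locked:
            return None
        self.locked = True
        return self.os, self.oc, self.do

    def commit(self, os, oc, do, state):
        self.os, self.oc, self.do, self.state = os, oc, do, state
        self.locked = False

    def abort(self):
        self.locked = False


def run_vote(reachable, new_state):
    """Run one update over the reachable replicas; return 'commit' or 'abort'."""
    voters = []
    for replica in reachable:
        vote = replica.vote_request()
        if vote is not None:
            voters.append((replica, vote))
    if not voters:
        return "abort"
    latest = max(vote[0] for _, vote in voters)
    current = [r for r, vote in voters if vote[0] == latest]  # up-to-date copies
    oc, do = current[0].oc, current[0].do
    majority = (2 * len(current) > oc or
                (2 * len(current) == oc and any(r.name == do for r in current)))
    if not majority:
        for replica, _ in voters:
            replica.abort()          # release only the locks this round took
        return "abort"
    new_do = min(r.name for r, _ in voters)  # arbitrary deterministic choice
    for replica, _ in voters:
        replica.commit(latest + 1, len(voters), new_do, new_state)
    return "commit"
```

Because OC is rewritten at every commit, the quorum shrinks with the surviving group: after an update commits on only S1 and S2 out of three replicas, S1 alone can still commit (it is the distinguished object in a tie), while a stale S3 or a lone S2 cannot.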
Fault-Tolerant Capability
FTOS can tolerate the following failures in a distributed system:
─ Process failures
─ Host failures
─ Communication link failures
─ Network partitions
Client/Server Binding
CORBA provides binding transparency for the client. However, the client may bind to a remote server object instead of the local server object. FTOS provides two library calls:
─ SmartBinding( ): bind to the local object first
─ FindAllReplica( ): find all replicated objects in the system
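The local-first binding policy can be illustrated with a short Python sketch. The functions below are stand-ins that only mirror the names SmartBinding( ) and FindAllReplica( ); their signatures, the `ReplicaRef` type, and the dictionary used as a replica registry are assumptions, since the slides do not show the real FTOS library interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaRef:
    host: str   # host on which the replica runs
    ior: str    # stringified object reference (placeholder value here)


def find_all_replica(registry, group):
    """Return every replica registered for `group` (stand-in for FindAllReplica)."""
    return list(registry.get(group, []))


def smart_binding(registry, group, local_host):
    """Prefer a replica on `local_host`; otherwise fall back to any replica."""
    replicas = find_all_replica(registry, group)
    if not replicas:
        raise LookupError(f"no replica registered for group {group!r}")
    for ref in replicas:
        if ref.host == local_host:
            return ref            # local object found: avoid a network hop
    return replicas[0]            # no local copy: bind to a remote replica
```

A client on hostB would call `smart_binding(registry, "DataStore", "hostB")` and receive the hostB replica when one exists, which is the case the evaluation slides compare against remote access.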
The Component Structure of FTOS
[Figure: component structure. Elements shown: Object Managers, servers wrapped in ServerObjectLevel layers, and a Client Object that sends a Request to a server and receives a Return]
FTOS Components: IDL defined in ServerObjectLevel1
module FTOS {
    interface ServerObjectLevel1 {
        string GetInformation ( );
        boolean Setup ( in boolean simulation, in string group );
        void Deactivate ( );
        boolean ConcurrencyRequest ( in string ior, in string ts );
        boolean VoteRequest ( out long os, out long oc, out string do, in string ior_id );
        boolean Commit ( in long os, in long oc, in string do, in any state, in string ior_id );
        boolean Abort ( );
        boolean DecisionRequest ( in string ior_id );
        boolean CatchUp ( out long os, out long oc, out string do, out any state );
    };
};
FTOS Components: IDL defined in ServerObjectLevel2
module FTOS {
    interface ServerObjectLevel2 {
        string GetInformation ( );
        void Deactivate ( );
    };
};
FTOS Components: IDL defined in ObjectManager
module FTOS {
    interface ObjectManager {
        struct ObjectContent {
            long content;
        };
        struct ObjectContentArray {
            sequence<ObjectContent> FileArray;
        };
        string GetInformation ( );
        boolean Setup ( in boolean simulation, in string group );
        long Init ( in string host, in long number, in string filename, in string ior );
        boolean SendData ( in ObjectContentArray data );
        boolean IsAlive ( );
        boolean ErrorMessage ( in string host );
        boolean Reset ( );
    };
};
Performance Evaluation
─ Object Replication
─ Dynamic Voting
Object Replication: the data amount before and after compression with JAR
[Figure: Data Size (1 Kbytes), 0 to 300, for each Object Name (DataStore, JcbWt, VbjGk, OM, Server); series: Before Compression, After Compression]
Object Replication: the time differences of replicating an object with and without compression
[Figure: Total Processing Time (1 sec), 0 to 80, for each Object Name: DataStore (298.1K), JcbWt (205.2K), VbjGk (96.8K), OM (53.8K), Server (25.4K); series: No Compression, With Compression]
Object Replication: the time of replicating objects by ObjectManager
[Figure: Process Time (1 sec), 0 to 50, versus Number of Objects, 0 to 6. Note: for a 25.4K object]
Object Replication: the time to start a local object by ObjectManager
[Figure: Process Time (1 sec), 0 to 35, versus Number of Objects, 0 to 6. Note: for a 25.4K object]
Dynamic Voting: the response time of a request with dynamic voting
[Figure: Response Time (ms), 0 to 1500, versus Number of Objects, 0 to 6]
Note: the average response time of a client to a single server object is 550 ms.
Dynamic Voting: the time of accessing a local object and remote objects
[Figure: Process Time (1 sec), 0 to 40, versus Data Size (1 Kbytes), 0 to 300; series: access local object, access remote object]
Conclusions
FTOS provides a fault-tolerant service for CORBA distributed systems.
─ Object Replication Mechanism
─ Dynamic Voting Mechanism
FTOS can tolerate many hardware and software failures in a distributed system, including process failures, host failures, communication link failures, and network partitions.