An Approach To Increase Reliability in Service Oriented Systems

ReServE Service: An Approach To Increase Reliability in Service Oriented Systems

A. D. Danilecki, M. Holenko, A. Kobusinska, M. Szychowiak, P. Zierhoffer

{adanilecki,akobusinska}@cs.put.poznan.pl

Poznan University of Technology

September 13, 2011

ITSOA project

The IT-SOA project aims:

- Self-healing
- Manageable
- Easy monitoring
- Virtualization
- Dependable

The research presented was partially supported by the European Union in the scope of the European Regional Development Fund program no. POIG.01.03.01-00-008/08.

Danilecki, Holenko, Kobusinska ...

ReServE: An Approach to Increase Reliability in SOA systems

[1/27]

The IT-SOA toolkits from Poznan University of Technology

DyMST: Dynamic Management SOA Toolkit
- Obligations, Restrictions, Capabilities, Audit
- M3: Metrics, Monitoring, Management
- Failure Detection Service Module

ReSP: Reliable SOA Platform
- Replication service
- Reliable Service Environment (ReServE)
- AtomicRMI
- RESTgroups
- Service-Oriented ad-hoc systems support
- Transaction support

The Problem

- In any distributed environment, failures are inevitable
- Existing fault-tolerance techniques carry a high cost
- Replication: improves availability
- Transactions: highly restrictive; they may be withdrawn even on transient failures
- Different organizations have different requirements with respect to fault tolerance

The Constraints: solution should...

- Respect the administrative independence of the different organizations
- Impose as few requirements on services as possible
- Allow the use of both WS-* and RESTful services
- Not interfere with the application level
- Be transparent to the applications
- Be an automatic solution (allowing business logic and fault tolerance to be separated)

What we’ve done

- Reliability is provided by external logging web services
- An organization’s web services must meet a few requirements and implement several methods exposed through a standard interface
- Each organization may use the provided proxy services or implement their functionality on its own
- Each organization may implement its own independent reliability policy, which is complemented by the capabilities of the ReServE environment
- ReServE may be used standalone or with the rest of the ReSP toolkit
- RESTful services are supported (support for WS-* is possible)
- The ReServE services may be offered by many different organizations

Assumptions

- The services are piecewise deterministic: starting from some service state S_x, repeating the same sequence of requests (in the same order) always results in the same state S_y. As a consequence, the response generated by a service depends only on its state and the request.
- The clients are piecewise deterministic: client execution is always the same, given the same responses.

But the important thing is: clients and services may be required to obey "reasonable" restrictions and requirements.
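The piecewise-determinism assumption can be illustrated with a small sketch. The `CounterService` class and its tuple-based request format are hypothetical and not part of ReServE; the sketch only shows that replaying the same requests in the same order from the same starting state reproduces both the final state and the responses.

```python
import copy


class CounterService:
    """Hypothetical deterministic service: state changes depend only on requests."""

    def __init__(self):
        self.state = {}

    def handle(self, request):
        # A request is a (method, key, value) tuple; the response depends
        # only on the current state and the request itself.
        method, key, value = request
        if method == "PUT":
            self.state[key] = value
            return ("OK", key, value)
        if method == "GET":
            return ("OK", key, self.state.get(key))
        return ("ERROR", key, None)


def replay(service, requests):
    """Apply the requests in order and return the responses."""
    return [service.handle(r) for r in requests]


requests = [("PUT", "x", 1), ("PUT", "y", 2), ("GET", "x", None), ("PUT", "x", 3)]

original = CounterService()
snapshot = copy.deepcopy(original)       # state S_x before the requests
first_responses = replay(original, requests)

recovered = snapshot                     # restart from S_x after a failure
second_responses = replay(recovered, requests)

same_state = original.state == recovered.state
same_responses = first_responses == second_responses
```

A service that consulted a clock or a random number generator inside `handle` would break this property, which is why the assumption has to be stated explicitly.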

Normal execution

[Figure: time diagram (axis t) of client C_i, the ReServE service, and service S_i during normal execution]

Service’s fault

[Figure: time diagram (axis t) of client C_i, the ReServE service, and service S_i, with a CRASH event marked]

Client’s fault

[Figure: time diagram (axis t) of client C_i, the ReServE service, and service S_i, with a CRASH event marked]

General Architecture

Problems to be solved

- Message losses
- Detecting duplicates
- What to log?
- When to start recovery?
- Where to start recovery from?
- How to find out the request execution order?
- What to do with service dependencies?
- How to ensure that service state changes seen by the client are not lost?

The answers

- Message losses: messages are retransmitted until a response or acknowledgement is received
- Detecting duplicates: a message identifier must be attached to each message
- What to log? HTTP semantics help here: GET requests need not be logged
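The three answers on this slide can be combined in a minimal sketch. Everything here is hypothetical: `Service`, `LoggingProxy`, the `drop_pattern` parameter that simulates lost responses, and the tuple formats are illustrative assumptions, not the ReServE interface. The sketch retransmits a request with the same identifier until a response arrives, lets the service detect duplicates by that identifier, and logs only requests other than GET.

```python
import itertools


class Service:
    """Hypothetical service that remembers responses by message identifier."""

    def __init__(self):
        self.data = {}
        self.seen = {}          # message id -> saved response
        self.executions = 0     # counts real (non-duplicate) executions

    def handle(self, msg_id, method, key, value=None):
        if msg_id in self.seen:             # duplicate: do not execute again
            return self.seen[msg_id]
        self.executions += 1
        if method == "PUT":
            self.data[key] = value
        response = (msg_id, self.data.get(key))
        self.seen[msg_id] = response
        return response


class LoggingProxy:
    """Hypothetical stand-in for a logging service between client and service."""

    def __init__(self, service, drop_pattern):
        self.service = service
        self.log = []                        # logged state-changing requests
        self.drops = iter(drop_pattern)      # True means "this response is lost"
        self.ids = itertools.count(1)

    def send(self, method, key, value=None, max_attempts=5):
        msg_id = next(self.ids)              # identifier attached to the message
        if method != "GET":                  # GET is treated as read-only: not logged
            self.log.append((msg_id, method, key, value))
        for _ in range(max_attempts):
            response = self.service.handle(msg_id, method, key, value)
            if not next(self.drops, False):  # the response arrived
                return response
            # The response was lost: retransmit with the same identifier.
        raise TimeoutError("no response after retransmissions")


service = Service()
proxy = LoggingProxy(service, drop_pattern=[True, True, False])
put_response = proxy.send("PUT", "x", 42)    # two lost responses, the third arrives
get_response = proxy.send("GET", "x")
```

Although the PUT is delivered three times, the service executes it once, and the log holds a single entry for it.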

The answers (cont)

- When to start recovery? Currently, recovery is started by hand; we are testing cooperation with the FADE failure detector service using its callback capabilities
- Where to start recovery from? The service provider must expose information about the last request processed
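One way a service provider could expose the last request processed is to store that request's identifier together with each checkpoint. The sketch below assumes this design; `CheckpointingService` and all of its methods are hypothetical names, and the crash is simulated in-process.

```python
import copy


class CheckpointingService:
    """Hypothetical service that checkpoints its state with the last request id."""

    def __init__(self):
        self.state = {}
        self.last_request_id = None
        self._checkpoint = (copy.deepcopy(self.state), None)

    def handle(self, request_id, key, value):
        self.state[key] = value
        self.last_request_id = request_id

    def checkpoint(self):
        # Save the state together with the id of the last request it reflects.
        self._checkpoint = (copy.deepcopy(self.state), self.last_request_id)

    def crash_and_restore(self):
        # Simulated failure: everything after the checkpoint is lost.
        state, last_id = self._checkpoint
        self.state = copy.deepcopy(state)
        self.last_request_id = last_id

    def last_processed(self):
        # The information a recovery service would query before replaying.
        return self.last_request_id


service = CheckpointingService()
service.handle("a", "x", 1)
service.handle("b", "y", 2)
service.checkpoint()
service.handle("c", "x", 3)          # not covered by the checkpoint
service.crash_and_restore()
restart_point = service.last_processed()
```

After the restore, `last_processed()` reports `"b"`, so a recovery service would know that request `c` and anything later must be sent again.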

The answers (cont)

- How to find out the request execution order? The service provider must attach ordering information to the responses, reflecting the execution order
- What to do with service dependencies? Postpone some outgoing requests; care is needed here to avoid deadlocks
- How to ensure that service state changes seen by the client are not lost? Postpone some of the service’s responses when necessary
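The role of the ordering information can be sketched as follows. `OrderingService` and `OrderLog` are hypothetical, and a plain counter stands in for whatever ordering information a real service would attach. The log stores the order before handing the response back, a simplified stand-in for postponing a response until it is safe to release.

```python
import itertools


class OrderingService:
    """Hypothetical service that tags each response with its execution order."""

    def __init__(self):
        self._order = itertools.count(1)

    def handle(self, request_id):
        return {"request": request_id, "order": next(self._order)}


class OrderLog:
    """Hypothetical log that keeps the order reported in each response."""

    def __init__(self):
        self.order_of = {}

    def record(self, response):
        # Store the ordering information before the response is released.
        self.order_of[response["request"]] = response["order"]
        return response

    def execution_order(self):
        return sorted(self.order_of, key=self.order_of.get)


service = OrderingService()
log = OrderLog()

# The service executes b first and a second.
resp_b = service.handle("b")
resp_a = service.handle("a")

# The responses reach the log in the opposite order.
log.record(resp_a)
log.record(resp_b)
replay_order = log.execution_order()
```

Because the order travels with the response, the log can reconstruct the execution order (`b`, then `a`) regardless of the order in which requests or responses passed through it.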

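Failure-Free run

[Figure: message diagram with clients C_i and C_j, the ReServE service, and service S_j. Labels in the figure: requests a, b, and c; responses tagged with execution order b:1 and a:2; "checkpoint, last request b"; "last request a".]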

The recovery after the failure 1

[Figure: ReServE holds the requests d:1, c:2, f:3 and a, b, e:5 in an ordered queue and an unordered queue. It asks S_j: "List of your recovery points, pretty please". The reply is "C1: d, C2: e". S_j has a checkpoint (last request d) and a standby replica (last request e).]

The recovery after the failure 2

[Figure: ReServE, holding the requests d:1, c:2, f:3 and a, b, e:5, tells S_j "Withdraw to C1"; S_j answers "Done". ReServE then sends c, then f, then a, b, and e; each step is answered with "Done".]
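The two recovery slides can be read as the following procedure, sketched here under explicit assumptions; it is one interpretation of the figures, not the authors' algorithm. Requests whose order numbers form a gap-free run starting at 1 are treated as the ordered queue, and everything else (no order number, or a number after a gap) as the unordered queue. The function names and data layout are hypothetical.

```python
def split_queues(entries):
    """Split logged requests into an ordered prefix and an unordered rest.

    `entries` maps request name -> order number, or None when no ordering
    information was received. The ordered queue is the gap-free run 1, 2, 3...
    """
    by_order = {order: name for name, order in entries.items() if order is not None}
    ordered = []
    expected = 1
    while expected in by_order:
        ordered.append(by_order[expected])
        expected += 1
    unordered = sorted(name for name in entries if name not in ordered)
    return ordered, unordered


def recovery_plan(entries, recovery_points, chosen):
    """Return the replay steps after rolling the service back to `chosen`."""
    ordered, unordered = split_queues(entries)
    last = recovery_points[chosen]
    if last not in ordered:
        raise ValueError("recovery point is not in the ordered queue")
    # Requests after the recovery point are replayed one by one, in order.
    steps = [[name] for name in ordered[ordered.index(last) + 1:]]
    if unordered:
        steps.append(unordered)     # no known order: replayed as one batch
    return steps


# Values taken from the slides: d:1, c:2, f:3, e:5, and a, b without an order.
entries = {"a": None, "b": None, "e": 5, "d": 1, "c": 2, "f": 3}
points = {"C1": "d", "C2": "e"}
plan = recovery_plan(entries, points, "C1")
```

With the values from the slides, the plan is `c`, then `f`, then the batch `a, b, e`, which matches the sequence shown after "Withdraw to C1". In this sketch the recovery point C2 is rejected because its last request `e` lies after the gap in the order numbers; whether ReServE applies the same rule is not stated on the slides.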

Distributed Architecture

[Figure: distributed rollback-recovery service. Labels in the figure: Clients A, B, and C; Services A, B, C, and D; three RMUs; two Service Repositories.]

The experiments

- 15 workstations connected by a Gigabit Ethernet network, running 64-bit openSUSE 11.3 (Linux 2.6.34.8-0.2-desktop-x86_64)
- Hardware: 8 GB RAM, Gigabit 82567LM-3 card, Core2 Quad Q9650 3.00 GHz CPU, Barracuda 7200.12 SATA 3Gb/s 500 GB HDD
- Parameters (number of clients and requests) were tuned so as not to saturate the network

Read only requests

[Figure: ReServE overhead during failure-free runs (GET). Average response time in ms (axis 0 to 4000) against number of clients (20 to 200). Series: GET; GET ReServE.]

Small PUT requests

[Figure: ReServE overhead during failure-free runs (PUT). Average response time in ms (axis 0 to 4000) against number of clients (20 to 200). Series: PUT 4kB; PUT 4kB ReServE; PUT 32kB; PUT 32kB ReServE.]

PUT requests (cont)

[Figure: ReServE overhead during failure-free runs (PUT). Average response time in ms (axis 0 to 8000) against number of clients (20 to 200). Series: PUT 4kB; PUT 4kB ReServE; PUT 32kB; PUT 32kB ReServE; PUT 128kB; PUT 128kB ReServE.]

PUT requests (cont)

[Figure: ReServE overhead during failure-free runs (PUT). Average response time in ms (axis 0 to 8000) against number of clients (20 to 200). Series: PUT 32kB ReServE; PUT 32kB ReServE 6services.]

Recovery time

[Figure: Service recovery time. Total recovery time in ms (axis 0 to 1000) against number of requests in queue (2000 to 10000). Series: No resource groups; Low granularity resource groups; Low granularity resource groups+semantics; High granularity resource groups.]

Remaining problems and the future work

- Performance is still not satisfactory
- Better SOAP integration
- Security problems
- Browser clients
- What to do when piecewise determinism is only almost true
- Integration with monitoring and audit tools from the DyMST package
- Testing with real services and applications

Key points

- ReServE: outsourcing reliability, using logging and external web services
- Web services using ReServE must meet a few requirements and implement several methods exposed through a standard interface
- Each organization may use the provided proxy services or implement their functionality on its own
- Each organization may implement its own independent reliability policy, which is complemented by the capabilities of the ReServE environment
- ReServE may be used as a standalone tool or with the rest of the ReSP toolkit
- RESTful services are supported (WS-* support is possible)
- The ReServE services may be offered by many different organizations

Thank you!
