FAULT-TOLERANT CLOCK SYNCHRONIZATION

Joseph Y. Halpern, Barbara Simons, Ray Strong
IBM Research Laboratory, San Jose, California 95193

Danny Dolev
Hebrew University, Givat Ram, 91904 Jerusalem, Israel

Abstract: This paper gives two simple efficient distributed algorithms: one for keeping clocks in a network synchronized and one for allowing new processors to join the network with their clocks synchronized. The algorithms tolerate both link and node failures of any type. The algorithm for maintaining synchronization will work for arbitrary networks (rather than just completely connected networks) and tolerates any number of processor or communication link faults as long as the correct processors remain connected by fault-free paths. It thus represents an improvement over other clock synchronization algorithms such as [LM1,LM2,LL1]. Our algorithm for allowing new processors to join requires that more than half the processors be correct, a requirement which is provably necessary.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1984 ACM 0-89791-143-1/84/008/0089 $00.75

1. Introduction

In a distributed system it is often necessary for processors to perform certain actions at roughly the same time. In such a system each processor usually possesses its own independent clock. However, despite the marvels of modern technology, clocks tend to drift apart. Therefore, clocks must be resynchronized periodically.

Recently, many protocols for resynchronization in the presence of faults have received wide attention (cf. [LM1,LM2,Ma,LL1]). The algorithms mentioned above are all based on an averaging process that involves reading the clocks of all the other processors. Because of this use of averaging, there must be more nonfaulty than faulty processors for these algorithms to work. Two of the algorithms presented in [LM1,LM2] and the algorithm of [LL1] require $3f+1$ processors in order to handle $f$ faults; a third algorithm of [LM1,LM2], which assumes the existence of unforgeable signatures, requires $2f+1$ processors. The algorithms of [Ma], for which no worst case analysis is provided, deal with ranges of times rather than a single logical clock time and therefore are not directly comparable.

In this paper a synchronization algorithm is presented that does not require any minimum number of processors to handle $f$ processor faults, so long as the network remains connected. The crucial point is that since we do not use averaging, it is not necessary that the majority of processors be correct. Moreover, our algorithm requires the transmission of at most $n^2$ messages per synchronization (where $n$ is the total number of processors in the system). The algorithm of [LL1] and one of the algorithms of [LM1,LM2] also require only $n^2$ messages; the other two algorithms of [LM1,LM2] might need as many as $n^{f+1}$ messages to tolerate $f$ faults. A final advantage of our algorithm is that it can deal with either processor or link faults in any network, provided the network remains connected. The algorithms of [LM1,LM2,LL1] deal only with processor faults in a completely connected network.

The algorithm is based on the following simple observation. If there are no faulty processors, a processor can be chosen to be a synchronizer and to broadcast a message with its current time once an hour (or day, or week, depending on the frequency of synchronization required). Each processor would then adjust its clock accordingly, making minor allowances if necessary for the transmission time of the message.

If there are faults, however, then there are obvious problems with the above approach. A faulty synchronizer might broadcast different messages (i.e. different times) to different processors, or it might broadcast the same message but at different times, or it might "forget" to broadcast the message to some processors. Note that it is not necessary to assume "malevolence" on the part of the synchronizer for such behavior to occur. For example, a synchronizer might fail in the middle of broadcasting the message "The time is 9 A.M.," spontaneously recover five minutes later, and continue broadcasting the same message. Thus, some of the processors would receive the message "The time is 9 A.M." at 9 A.M., while the remainder would receive it at 9:05.

Nevertheless, the idea of using a synchronizer can be modified to obtain an efficient synchronization algorithm which is correct even in the presence of faults. The key idea is to distribute the role of the synchronizer: every (correct) processor will try to act as a synchronizer at roughly the same time, and at least one will succeed. To ensure that this really happens at "roughly the same time", we use a protocol that guarantees that all the correct processors agree on the expected time for the next synchronization.

In practice the periodic resynchronization algorithm must be supplemented by a method for synchronizing the original participants and for bringing in new processors. Our techniques can also be used to construct such a join algorithm, which can be used to allow new processors to join the network with their clocks synchronized to those of already existing processors. This algorithm can also be applied to repaired (previously faulty) processors that must be resynchronized with the rest of the network. The join algorithm requires that fewer than half the processors in the network be faulty in order to work, a requirement which is provably necessary.

The remainder of the paper is organized as follows. In the next section the problem is formalized and the precise assumptions underlying the algorithm are described. These assumptions include the existence of a bounded rate of drift between the clocks of nonfaulty processors, a known upper bound on the transmission time of messages between nonfaulty processors, and the ability to authenticate signatures. The resynchronization algorithm is described in Section 3 and analyzed in Section 4. The degree of synchronization obtained is almost as tight as possible, but a careful discussion of this property is beyond the scope of this paper (v. [DHS] and [LL2]). Finally, the join algorithm is presented and analyzed in Section 5.

2. A specification of the algorithm.

In this section both the properties (CS1-CS3) that the clock synchronization algorithm satisfies and the assumptions (A1-A3) that are made in the model are presented.
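The failure scenario from the Introduction, in which a single synchronizer stalls partway through its broadcast and resumes five minutes later, can be made concrete with a small Python sketch. It is not part of the paper: the processor names, the times expressed in minutes after midnight, and the drift-free, zero-delay clock model are all illustrative assumptions.

```python
# Hypothetical scenario: every name and number here is an illustrative assumption.
ANNOUNCED = 540.0  # the message "The time is 9 A.M.", in minutes after midnight


def clock_reading(announced, received_at, observed_at):
    """Reading of a drift-free clock that was set to `announced` at real time
    `received_at` and is inspected later at real time `observed_at`."""
    return announced + (observed_at - received_at)


# Real times at which each processor hears the broadcast. The synchronizer
# reaches p1 and p2 at 9:00, fails, recovers, and reaches p3 and p4 at 9:05.
deliveries = {"p1": 540.0, "p2": 540.0, "p3": 545.0, "p4": 545.0}

observed_at = 550.0  # compare all clocks at real time 9:10
clocks = {
    name: clock_reading(ANNOUNCED, received_at, observed_at)
    for name, received_at in deliveries.items()
}
spread = max(clocks.values()) - min(clocks.values())

print(clocks)
print("spread in minutes:", spread)
```

Under these assumptions the two groups of clocks end up five minutes apart even though every receiver followed the naive rule correctly, which is the motivation for distributing the synchronizer's role.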

The clock of a processor is defined to be a particular time service delivered by that processor. In response to a time query the service responds with a number indicating the "time." In particular, the no- [...]

[...] That is:

(A1)  $(1+\rho)^{-1}(v-u) < C(v)-C(u) < (1+\rho)(v-u)$.

For technical reasons the leftmost term has a factor of $(1+\rho)^{-1}$ rather than the more common $1-\rho$; for small $\rho$ both approaches are essentially the same. An advantage of (A1) is that it implies the symmetric condition

$(1+\rho)^{-1}(C(v)-C(u))$ [...]

[...] clocks at the beginning of the period.

[...] schedule the synchronization process tends to dominate the time required to transmit a message along the communication links. Therefore, we have not analyzed a refined version of assumption (A2) (such as that used in [LL1,LM1,LM2]) that, if $t$ is as above, then $\delta - \epsilon < t < \delta + \epsilon$. [...] that our results could also be obtained using this re- [...]
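As a numerical illustration of the drift bound (A1), the following Python sketch checks the bound for a clock that runs at a constant rate, checks one symmetric rearrangement of it, and measures how far the factor of (A1)'s leftmost term is from the more common "one minus the drift bound". The linear clock model, the function names, and the sample values are assumptions made for illustration. The symmetric form used here is simply what follows algebraically from (A1) by swapping the roles of real time and clock time; since the source's own statement of the symmetric condition is cut off, treat the match as an assumption.

```python
def satisfies_a1(rate, rho, u=0.0, v=100.0):
    """Check (A1) for a hypothetical linear clock C(t) = rate * t
    over the real-time interval [u, v]."""
    elapsed_real = v - u
    elapsed_clock = rate * v - rate * u
    return elapsed_real / (1 + rho) < elapsed_clock < (1 + rho) * elapsed_real


def satisfies_symmetric(rate, rho, u=0.0, v=100.0):
    """Same bound with the roles of real time and clock time exchanged
    (an algebraic rearrangement of (A1), assumed to be the intended form)."""
    elapsed_real = v - u
    elapsed_clock = rate * (v - u)
    return elapsed_clock / (1 + rho) < elapsed_real < (1 + rho) * elapsed_clock


rho = 0.001                  # illustrative drift bound
lower_a1 = 1 / (1 + rho)     # factor on the left of (A1)
lower_common = 1 - rho       # the "more common" factor
gap = lower_a1 - lower_common  # algebraically rho**2 / (1 + rho)

print(satisfies_a1(1.0005, rho), satisfies_symmetric(1.0005, rho))
print(satisfies_a1(1.002, rho))
print(gap)
```

The gap between the two lower factors is of order the square of the drift bound, which is one way to read the remark that both approaches are essentially the same when the bound is small.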


(1.5) [...] $end_k$,

(1.6) the $k$th clocks of correct processors differ by at most DMAX in the interval $[end_k, end_{k+1}]$; thus CS1 holds in this interval,

(1.7) conditions (a) and (b) hold with $k$ replaced by $k+1$ and $D$ replaced by any $D^* \geq$ DMAX.

[...] MSG. In either case, it passes a message on to $p_j$, which arrives within time $trt_{G,F}(h,j)$. Either $p_j$ has already started its $k$th clock by the time the message arrives, or, as we now show, the message will pass the validity test of Task MSG, so that $p_j$ will start its $k$th clock within $trt_{G,F}(i,j)$ of $p_i$.

Let $X$ be the value of ET shared by all correct processors according to hypothesis (b). When the $k$th clock of a correct processor is started, it is set to $X$. Suppose $p_j$ has not started its $k$th clock when the message from $p_h$ arrives. If $p_h$ sent the message as a result of initiating Task TM, this must have happened at time $X$ on $p_h$'s $(k-1)$st clock. Since, by hypothesis (a), $p_j$'s clock differs from $p_h$'s by at most $D$, this happens at a time later than $X-D$ on $p_j$'s clock. Thus $p_j$ receives the message from $p_h$ at a time later than $ET-D$ (since $ET = X$, by hypothesis, until $p_j$ starts its $k$th clock). Since the message has one signature ($p_h$'s), it passes the validity test.

Now suppose $p_h$ sent the message to $p_j$ as a result of getting a valid message with $s$ distinct signatures. The message must come at a time after $X-sD$ on $p_h$'s clock. By a similar argument to that above, it comes to $p_j$ at a time after $X-(s+1)D$ on $p_j$'s clock, and since it now has $s+1$ signatures (including $p_h$'s), the message also passes the validity test for $p_j$. □

[...] this time ($beg_k$). Thus, the $(k-1)$st clocks of correct processors read a time after $ET-(f_p+1)D = ET-ADJ$ at $beg_k$. This proves (1.4). □

Proof of (1.5). Suppose $p_i$ is the first correct processor to start its $(k+1)$st clock, and let $v'$ be its value of ET immediately before the $(k+1)$st clock is started. Let $v$ be the value on its $k$th clock when the $k$th clock is started. From the definition of the algorithm, it follows that $v' = v + PER$. An identical argument to that of (1.3) above shows that $p_i$ starts its $(k+1)$st clock later than $v'-f_pD$ on its $k$th clock; i.e., $C_i(beg_{k+1}) > v'-f_pD$. From (1.1), it follows that the [...] $C_i(end_k) \geq v+(1+\rho)dmin$. By the Interval Separation inequality, $v+(1+\rho)dmin$ [...]
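The validity argument in the proof above (a message with one signature arrives later than $ET-D$, and each further relay adds one signature while losing at most another $D$) can be traced with a short Python sketch. The form of the test used here, "accept a message with $s$ distinct signatures if it arrives later than $ET - sD$ on the receiver's clock", is inferred from that proof rather than quoted from the definition of Task MSG, so it should be read as an assumption. The function names, the zero transmission delay, and the sample numbers are likewise hypothetical; the relation $ADJ = (f_p+1)D$ is taken from the line that proves (1.4).

```python
def passes_validity_test(local_time, et, signatures, d):
    """Assumed form of the validity test: a message carrying s distinct
    signatures is accepted if it arrives later than ET - s*D locally."""
    s = len(set(signatures))
    return s >= 1 and local_time > et - s * d


def adj(f_p, d):
    """ADJ = (f_p + 1) * D, as read off from the proof of (1.4)."""
    return (f_p + 1) * d


def relay_chain(et, d, skews):
    """Follow one message along a path of relays. skews[i] is how far the
    i-th receiver's clock lags its sender's clock; hypothesis (a) bounds
    each lag by d. Transmission delay is taken as zero, the earliest case."""
    sender_time = et            # Task TM fires when the initiator's clock reads ET
    signatures = ["p0"]         # the initiator's signature
    accepted = []
    for hop, skew in enumerate(skews, start=1):
        receiver_time = sender_time - skew
        accepted.append(passes_validity_test(receiver_time, et, signatures, d))
        signatures.append(f"p{hop}")   # the receiver signs before forwarding
        sender_time = receiver_time
    return accepted


print(relay_chain(1000.0, 2.0, [1.5, 1.5, 1.5]))
print(passes_validity_test(997.0, 1000.0, ["p0"], 2.0))
print(adj(2, 2.0))
```

With every per-hop lag below $D$, each relay in the sketch accepts the message, mirroring the induction on the number of signatures, while a one-signature message that arrives more than $D$ early is rejected.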