The Extended BG-Simulation Eli Gafni∗ May 5, 2008
Abstract
The BG-simulation is currently the main tool, short of arguing directly about topology, to show the impossibility of an r/w protocol solving a task T (res) on n processors t-resiliently. It does so by deriving from T (res) a counterpart task T (wf ) on t + 1 processors and showing that T (wf ) is not wait-free solvable. The task T (wf ) derived from T (res) is driven by the capability of the BG-simulation. Yet, the task T (wf ) proved meaningful only when derived from a task T (res) which was "colorless" - essentially a decision task in which any processor may adopt the output of any other processor as its own. It falls short when a terminating simulator cannot easily adopt an arbitrary simulated output as its own. In this paper we extend the capability of the BG-simulation to this case. As a result, the definition of T (wf ) as derived from T (res) becomes tighter, i.e. the new task is a subtask of the previous one. This makes for a T (wf ) that is of higher level and therefore a better tool. We demonstrate the convenience that the Extended BG-simulation provides in two ways. First, we prove a new impossibility result: we show that n processes cannot solve t-resilient weak renaming with n + (t + 1) − 2 names, where n > 1 and 0 < t < n. Second, we show that the t-resilient solvability of n-processor tasks, for t > 1 and n > 2, is undecidable, by a simple reduction to the undecidability of the wait-free solvability of 3-processor tasks.
1 Introduction
The most celebrated result of distributed computing, the FLP impossibility result [6], is couched in the domain of resiliency: any number of processors, with the possibility of at most one being faulty (1-resiliently), cannot asynchronously reach consensus. One might think that 3 processors with one faulty cannot reach consensus, since the majority of non-faulty processors is not overwhelming, yet that 100 processors with the possibility of just one faulty, and thus an overwhelming majority of 99 non-faulty processors, could. The FLP result proved they cannot. Yet, this generality came at a cost. Dealing directly with any number of processors, and showing that all but one may take infinitely many steps, made the original FLP derivation quite complex. On the other hand, it is easy to derive FLP for 2 processors in Shared-Memory (SM). SM is equivalent to Message-Passing for t < n/2 [5]. If we could only reduce t-resiliency questions to wait-free questions, then FLP would boil down to a proof for just 2 processors wait-free. The BG-Simulation [2], [3] does just that. It reduces certain questions about resiliency to questions about wait-free solvability. It can be used to prove FLP in the SM context by proving the impossibility for 2 processors, and then showing that the existence of an r/w protocol that can solve
Computer Science Department, University of California, Los Angeles. 3731F Boelter Hall, UCLA, LA. CA. 90095, USA.
[email protected]
consensus for 100 processors 1-resiliently would imply a wait-free protocol for 2 processors, leading to a contradiction. The idea of the reduction is to take 2 simulators s0 and s1. If there exists code code0, code1, ..., code99 for 100 processors that solves consensus 1-resiliently, then there exists an execution e0 that results in the decision 0, and an execution e1 that results in the decision 1. Simulator si, i = 0, 1, in a solo execution emulates execution ei. Otherwise, when not alone, the simulators service the codes in a round-robin fashion. It is shown that the simulation is such that the failure of a simulator manifests itself as lack of service of a single code. Thus the emulated execution is 1-resilient. Consequently, at least 99 codes will have an output, which will be adopted as the consensus output by s0 and s1. In this simple example there is only one natural wait-free problem to reduce to. In general, though, one follows the mechanism of the BG-simulation as it operates on the original task T (res) to arrive at a problem T (wf ) that may not be wait-free solvable. The BG-simulation does not dictate a single problem to reduce to, but rather a family of problems. It can be seen that the correctness of the emulation relies heavily on the fact that the two simulators, one observing a 99-entry output tuple and the other observing 100, will nevertheless output the same value. But what about tasks in which this is not the case? Imagine you wanted a reduction from 100 processors, 1-resilient, to 2 processors wait-free, in which s0's output is the output of code0 and s1's output is the output of code1. If the task is not consensus, then when s0 observes 99 outputs, these may be outputs for codes code1, ..., code99, while code0 is the one on which progress was blocked as a result of a failure of simulator s1. Thus the BG-simulation cannot accommodate such a requirement. This drawback is remedied by the Extended BG-simulation. It allows a simulator to be associated uniquely with a certain code, and guarantees that when the simulator s0 above observes only a 99-entry output tuple, one of these outputs will be the output of code0. Let T (wf ) be the family of tasks on t + 1 processors that can be derived via the BG-simulation from T (res) [3]. The Extended BG-simulation allows us to tighten this correspondence between T (res) and T (wf ): if we make an a priori association between si and codei, the outcome in which a processor returns 99 outputs that do not include an output for codei is removed. Thus the T (wf ) that results from the Extended BG-simulation is a subtask of the one that results from the BG-simulation. Call the former T (wf − extended) and the latter T (wf − regular). Thus T (wf − extended) is potentially harder to solve than T (wf − regular). Consequently, there may be an instance in which T (wf − regular) is wait-free r/w solvable and therefore cannot serve in an impossibility reduction, while T (wf − extended) is not solvable and therefore can serve in a reduction. Even when that is not the case, the Extended BG-simulation is a higher-level construct with more structure, and therefore easier to employ. Below we outline two relevant examples in which it was natural to prove impossibility using T (wf − extended), while neither we nor anybody else were able to do it with T (wf − regular): 1. Suppose that we know that there is no r/w wait-free protocol for 2(t + 1) − 1 processors, out of which at most t + 1 will participate, to rename in the range 1 to 2(t + 1) − 2.
By the BG-simulation philosophy, since t-resilience is like t + 1 processors wait-free, and the renaming overhead for t + 1 processors is t, consequently there should be no algorithm for n + (t + 1) − 1 processors, out of which at most n will participate, to 2t-resiliently rename into the range 1 to n + (t + 1) − 2 (the resiliency here pertains to all n + (t + 1) − 1 processors, which translates to t-resiliency for a participating set of size n). No such reduction is known, and indeed there is no known topological proof of this result.
Here we extend the weak renaming impossibility result to the t-resilient case, whenever the wait-free impossibility for t + 1 processors holds [9], [4]. 2. Suppose that we know that 3-processor tasks are undecidable [14]. But perhaps for a task on 100 processors the question of whether it is 2-resiliently solvable is decidable, since, as in the FLP presumption, conceivably the fact that at least 98 processors will take steps forever will help decide the problem. Indeed, this is not the case, as [13], generalizing [14], has shown. The paper [13] is complex as it invokes topology. Here we show that the result can be obtained simply by reduction to [14], using the Extended BG-simulation rather than the BG-simulation, relegating topology just to the wait-free case. The paper is organized as follows: in the following Model section we define resiliency and present the details of the BG Agreement-Protocol and how it is extended. In the subsequent section we present a comparison of the protocols when we execute a few of them as threads in parallel. Afterward we present the Extended BG-simulation. Finally we argue the two reductions outlined above, and conclude with open problems.
2 The Model
We assume a SM-SWMR Atomic Snapshot [1] model. Each processor pi has a dedicated cell Ci, to which it writes exclusively, and it can read the whole memory in a snapshot. Processors alternate between writing and reading in a snapshot. W.l.o.g. we assume that when processors write, they write their entire history. To start with, the memory cells are initialized to ⊥, and the initial state of each processor is its own unique id. A snapshot of a processor is called a "View." A Task ∆ is a binary relation between a vector of distinct processor names (participating set) and a commensurately sized vector of output values (output tuple). Task A is a Subtask of B if A ⊂ B, i.e. A is derived by removing some input-output tuples from B. A task is "colorless" if, given an input-output tuple (In, Out) ∈ ∆ where the length of the Out vector is k, and the set of values that appear in Out is σ, then for all vectors v ∈ σ^k, (In, v) ∈ ∆. (For a generalization of colorlessness that captures the outputs of the BG-simulation declaratively, see [3].) A Protocol is a mapping from views to output values. An Execution is an infinite sequence of processor names. Every appearance of a processor in the sequence is called a Step. With each entry in an execution we associate a set of views, one for each processor. We do this by interpreting the first appearance of a processor name in the sequence as a write action, the second appearance as a snapshot action, the third as a write action, etc. Initially views are ⊥. A processor which took a step in an execution is called "Participating." An Environment is a set of infinite executions. A processor pi is non-faulty in an execution e if its name appears infinitely many times in e. It is otherwise faulty. A task ∆ is solvable in an environment N by a protocol Π if for all executions e ∈ N , all processors which are non-faulty have somewhere in the execution some views that are mapped by
Π to an output value. For each processor that has a view that maps to an output we consider only the first such view and its corresponding output. If we then take the vector of the participating set and fill a vector of corresponding size with these output values in the appropriate positions, then we say that the task is solvable by Π if the possibly empty entries can be filled out (completed) such that the resulting output-value vector and the vector of the participating set are members of ∆. Given a system of n processors, the t-resilient environment is the set N of all executions such that for all e ∈ N at least n − t processors are non-faulty. The wait-free environment is the largest environment - the one which places no constraints on the set of infinite executions. It is easy to see that the wait-free environment and the (n − 1)-resilient environment coincide. A task is bounded-wait-free solvable if there is a uniform (over all executions) bound on the number of steps a non-faulty processor may take before producing an output.
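To make the model concrete, here is a minimal toy rendering of the SM-SWMR atomic-snapshot memory in Python. It is sequential (no real concurrency), and the class and method names (SnapshotMemory, write, snapshot) are illustrative only, not part of the paper's formal model.

class SnapshotMemory:
    def __init__(self, n):
        self.cells = [None] * n           # cells C_i, initialized to "bottom"

    def write(self, i, history):
        self.cells[i] = history           # p_i writes only to its own cell (w.l.o.g. its whole history)

    def snapshot(self):
        return tuple(self.cells)          # an atomic view of the whole memory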
2.1 An Agreement-Protocol (AP) [2], [3]
An agreement-protocol is a misnomer since it is a protocol together with an environment. The task is an Election Task: all participating processors output the name of a participating processor on which they all agree. The Election Task is tantamount to consensus and is not solvable even 1-resiliently. Yet, it is solvable in the "Agreement Environment." The agreement environment is specified by a number b: in any infinite execution in which each processor either failed before taking any step or failed only after taking b steps, all processors that take an infinite number of steps have to output. Notice that directly from the abstract definition of an AP we can derive the following properties: 1. If all participating processors are past b steps then a processor can immediately decide; it can simulate itself continuing solo. The decision cannot change from this solo outcome since, from here on, we are in the r/w wait-free environment. 2. If at some point in time all processors participating at that time are past b steps, then processors can immediately decide. This follows again from the reason above: if a solo wait-free run has to decide at that point in time, no further steps in the protocol can change that potential decision in a r/w wait-free environment. This potential decision is available to all, as they read the whole history. Consequently, we can talk about the state of the AP: it is either "decided" or not. Here is a simple instance of an agreement-protocol, to make things concrete. Conceptually, consider that each cell Ci is subdivided into two cells Ci,1 and Ci,2, initialized to ⊥. The algorithm proceeds in two phases. In the first phase pi writes its name i in Ci,1. It then takes a snapshot of the names, Si := C∗,1, to get a set of names Si. In the second phase processor pi posts the snapshot, Ci,2 := Si. It then waits until its snapshot of C∗,2 is such that there exists a non-⊥ cell Cj,2 such that for all k ∈ Cj,2, Ck,2 ≠ ⊥. When this happens pi returns the smallest name from the smallest set it sees. For this agreement-protocol b = 3. Safety - all processors that return a name return the same name - follows from the fact that as time progresses the smallest set posted at phase 2 can only decrease. Since posted sets are snapshots it follows that two sets of the same size are identical. By way of contradiction assume |Sp| > |Sq| and one processor returned a name from Sp and another
from Sq. Thus Sq was posted after Sp, and by virtue of snapshots Sq ⊂ Sp and q ∈ Sq. But the halting condition says that there exists some snapshot Sm, greater than or equal to Sp, all of whose processors have already posted their sets. Since q belongs to Sm, we have a contradiction with Sq being posted after Sp. Liveness follows from the fact that if all processors that took a step executed at least 3 steps, then if q appears in any snapshot, it took the first step, and since it took the 3rd step, a snapshot is now posted at Cq,2, and consequently the halting condition holds.
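The following Python fragment is a minimal sketch of this two-phase agreement-protocol. It is not the paper's formal model: the shared SWMR cells are plain Python lists, the scheduler is whoever calls the step methods, and all names (AgreementProtocol, try_decide, ...) are illustrative.

class AgreementProtocol:
    def __init__(self, n):
        self.C1 = [None] * n   # phase-1 cells C_{i,1}: posted names
        self.C2 = [None] * n   # phase-2 cells C_{i,2}: posted snapshots (sets of names)

    def step1(self, i):
        self.C1[i] = i         # step 1: p_i writes its name

    def step2(self, i):
        # step 2: p_i snapshots the phase-1 cells
        return {j for j, c in enumerate(self.C1) if c is not None}

    def step3(self, i, snap):
        self.C2[i] = snap      # step 3: p_i posts its snapshot

    def try_decide(self, i):
        # halting condition (the same for every checking processor i): some
        # posted snapshot S all of whose members have themselves posted; then
        # return the smallest name of the smallest posted set, else keep waiting
        posted = [s for s in self.C2 if s is not None]
        if any(all(self.C2[k] is not None for k in s) for s in posted):
            return min(min(posted, key=len))
        return None

# one possible interleaving with two processors: p0 runs its b = 3 steps solo,
# then p1 does; both end up deciding p0's name
ap = AgreementProtocol(2)
ap.step1(0); ap.step3(0, ap.step2(0))
ap.step1(1); ap.step3(1, ap.step2(1))
assert ap.try_decide(0) == ap.try_decide(1) == 0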
2.2 Commit-Adopt (CA) Protocol [11]
The Commit-Adopt Task is specified as follows: processors are divided into groups, and the members of each group share the group's unique name. 1. If the participating set is from a single group then at least one processor outputs "commit my-group-name." 2. If a processor outputs "commit j" for some group name j, then all processors output either "commit j" or "adopt j." The CA task is wait-free solvable in two phases similar to the phases of the agreement-protocol: in the first phase pi writes its group name to Ci,1 and then reads all of C∗,1. If in the first phase pi observes only a single group name, it writes "commit group-name" in Ci,2; otherwise it writes "abort." It then reads C∗,2; if it reads only "commit group-name" it outputs "commit group-name," and if it reads both "commit group-name" and "abort" it outputs "adopt group-name." Obviously, only the group name written first in the first phase has a chance of being written by some processor as a commit argument in the second phase. Also, if a processor commits, it did not read any abort in the second phase; consequently its "commit" was written first, so all will see its commit posting and will either commit or adopt.
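Here is a small Python sketch of this two-phase commit-adopt, in the same toy style as the agreement-protocol sketch above; phase1 and phase2 are assumed to be called once per processor, and a faithful asynchronous model would break them into individual write and snapshot steps.

class CommitAdopt:
    def __init__(self, n):
        self.C1 = [None] * n   # phase-1 cells: group names
        self.C2 = [None] * n   # phase-2 cells: ("commit", group) or "abort"

    def phase1(self, i, group):
        self.C1[i] = group
        seen = {g for g in self.C1 if g is not None}
        # commit only if mine is the single group name observed
        self.C2[i] = ("commit", group) if seen == {group} else "abort"

    def phase2(self, i, group):
        posted = [v for v in self.C2 if v is not None]
        commits = [v[1] for v in posted if v != "abort"]
        if commits and "abort" not in posted:
            return ("commit", commits[0])
        if commits:
            return ("adopt", commits[0])   # saw a commit next to an abort: adopt it
        return ("adopt", group)            # saw only aborts: keep my own group name

# with two groups and this interleaving, p0 writes a commit but then observes
# p1's abort, so both end up adopting group "A"
ca = CommitAdopt(2)
ca.phase1(0, "A"); ca.phase1(1, "B")
assert ca.phase2(0, "A") == ca.phase2(1, "B") == ("adopt", "A")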
2.3 Extended Agreement Protocol
The Extended Agreement Protocol (EAP) is a sequence of protocols. Each element in the sequence is an Agreement-Protocol or a Commit-Adopt. The sequence alternates between AP and CA, starting with an AP. What are the properties of an EAP? To terminate an EAP we require the processor to commit a value in the EAP. Consequently, by the properties of CA, an EAP is safe: all processors that terminate an EAP output the same value.
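A minimal sketch of one processor's traversal of an EAP is given below, reusing the CommitAdopt sketch above. Each AP is modeled by a toy stand-in (ap_decide) that simply fixes the first value proposed to it; that stand-in is an assumption of the sketch, not the paper's AP.

def ap_decide(ap, value):
    # toy stand-in for an AP reaching an output: remember the first proposal
    return ap.setdefault("decision", value)

def run_eap(stages, i, my_name):
    # stages: list of (ap, ca) pairs, alternating AP then CA as in the text
    value = my_name
    for ap, ca in stages:
        value = ap_decide(ap, value)      # the AP's agreed output (once decided)
        ca.phase1(i, value)
        tag, value = ca.phase2(i, value)
        if tag == "commit":
            return value                  # terminating an EAP requires a commit
        # otherwise carry the adopted value into the next AP of the sequence
    return None                           # not committed yet: the EAP continues

# two processors and a single (AP, CA) stage: both propose through the same AP,
# agree on the first value, and commit it
stage = ({}, CommitAdopt(2))
assert run_eap([stage], 0, "v0") == "v0"
assert run_eap([stage], 1, "v1") == "v0"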
3 Comparison between AP and EAP
Consider n processors with m parallel APs, AP0, ..., APm−1. Let each processor arbitrarily order the protocols and, in its order, execute b steps on each (recall that b is the parameter of an AP such that if all processors that started the protocol executed b steps on it, then it is decided). The idea behind the BG-simulation is the observation that in this setting, when a processor pi has executed all the m agreement-protocols, then at least m − (n − 1) of them are decided. This follows from the Pigeonhole Principle. For an APj not to be decided, at least one processor must have taken at least one but fewer than b steps in APj. Since processors execute the APs in order,
a processor can be executing only a single AP at a time. Thus for an AP not to be decided, an adversary must "sacrifice" a processor. Since pi has executed all the APs, it is not in the middle of any AP; thus the adversary has only n − 1 processors at its disposal and consequently can block agreement on at most n − 1 APs. Continuing this logic we conclude that for all k, 1 ≤ k ≤ n, at most n − k processors will see at most m − (n − k) APs decided. Call this property of the relation between the number of processors and the number of outputs they see the number-of-outputs property.

Nevertheless, pi is not guaranteed to observe the output of APi. Can we maintain all the flexibility we have above, i.e. that the order in which processors traverse the APs is arbitrary, guarantee the number-of-outputs property, and on top of that have pi see an output for APi? The AP as specified cannot solve this problem. Yet, if we replace each AP with an EAP we can.

The way we do that is as follows. In addition to using an EAP instead of an AP, we also use an "ejected board," initially empty. When snapshotting the system, if a processor sees its name on the ejected board, it stops and returns all the EAPs which committed. Similarly, we have an "aborted board," initially empty. Each AP in each of the EAPs has a unique identifier. An AP which is posted on the aborted board is to be treated, by a processor that seeks its decision, as if it has decided that processor's own name. Each non-ejected processor executes the first AP in all the m EAPs. As we said, it will observe an output in the first AP of at least m − (n − 1) of the m EAPs. It then proceeds through the appropriate CAs, those whose preceding APs have an output, to try to commit these outputs. On some of them it might fail to commit, and on those where it failed, it proceeds to the next AP either with its name or with its adopted value, as dictated by the CA. In the meantime, some of the first APs which did not have an output before may now have one. On those it proceeds to the following CA, etc. It continues until it gets "stuck" - each of the EAPs is either committed, or has a processor in the middle of a non-aborted AP which is at the front of the execution of that EAP, and consequently that AP is not decided. Notice that an aborted AP, one whose id appears on the aborted board, is considered "decided" and does not count toward getting a processor stuck, since the processor can proceed to the succeeding CA inputting its own name. Notice that the stuck state is exactly the state that can be seen by a processor terminating in the BG-simulation. Thus, we get the first property of the EAP, which is that it subsumes the AP: if processors quit upon being stuck, the outputs will satisfy the number-of-outputs property.

But a processor pi is not to quit when it gets stuck, unless its EAPi is committed. Thus when pi gets stuck, it looks for a single arbitrary name (maybe its own) of a processor pj such that EAPj is committed and pj does not appear on the ejected board. It may find one, or not. If it does, it puts pj on the ejected board. Whether it does or does not, it now proceeds to get itself "unstuck." To get unstuck it aborts all those APs that block its progress and continues to the successive CAs with its name as an input. It will again continue until it gets stuck; it will then again try to eject, etc., until it is ejected and consequently returns.
To see that mayhem will not ensue, and that processors will terminate rather than run forever, notice that for that to happen we must get into a stable situation in which no new processor is added to the ejected board, and all processors on the ejected board have terminated or "crashed" by now. Let k be the number of processors on the ejected board. Then some of the n − k processors that are left get stuck infinitely often and cannot eject any of the n − k remaining processors. Consequently, at least n − k EAPs, those whose indices correspond to the n − k remaining processors, have not been committed.
But n − k processors can commit at least one EAP out of n − k EAPs, a contradiction. To see that the number-of-outputs property holds, notice that an ejected processor pi, which was ejected by pj, could have conceptually "switched places" with pj, observed the system as stuck, and output for itself. (That is why, for each stuck state, a processor should not post more than a single additional eject.) A processor that observes the system as stuck observes exactly its termination condition for the case in which we have APs instead of EAPs. But for APs we have seen that the number-of-outputs property holds.
4 The BG and Extended BG-simulation
It is now easy to see how both the BG and the Extended BG-simulation proceed. Given a sequence of codes, simulators try to reach consensus on moving the program-counter of each code. The codes are written for the SWMR Atomic-Snapshot Shared-Memory model. There is no problem with moving the program counter of a code past a "write" command, as the outcome of the command is predetermined by the preceding "read" command. The "read" commands are the problematic ones. Each simulator locally executes the "read" on behalf of the code, since the view of the shared memory for the codes has, inductively, been correctly simulated so far. The problem is that the emulated shared memory is changing, and different simulators reading on behalf of the same "read" command may return different values. While each of these values is valid, as it simulates the processor executing the code asynchronously and "reading" at a different time, the simulators nevertheless have to stick with one single outcome, i.e. reach consensus. For this we use either an AP or an EAP. Thus with every "read" command we have an AP or an EAP. These are precisely the parallel APs and EAPs we discussed in the previous section. Consequently, if the number of simulators is t + 1 and the number of codes is n > t + 1, then the execution we get is t-resilient: at most t simulators may fail, and each fault of a simulator manifests itself as at most the inability to continue executing commands of a single code, which in turn is a fault of the simulated processor which owns this code. Thus, the whole difference between the discussion of the previous section and a full-fledged simulation is that a single agreement, be it an AP or an EAP, is now a sequence of agreements. What changes is therefore the notion of "termination." While in the previous section, once an AP (in the context of BG) or an EAP (in the context of Extended BG) has reached an output, or committed an output, respectively, this thread from the set of "parallel things to execute" has terminated, now for a thread to terminate, the whole sequence of agreements that corresponds to the sequence of "reads" of the code has to terminate.
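The following Python fragment sketches the per-read agreement just described. ReadAgreement here is a toy stand-in (it simply fixes the first snapshot proposed to it) rather than the actual AP/EAP, and all names are illustrative.

class ReadAgreement:
    # toy stand-in for the AP/EAP attached to one simulated "read" command:
    # the agreed value is simply the first snapshot proposed
    def __init__(self):
        self.decision = None

    def propose(self, snapshot):
        if self.decision is None:
            self.decision = snapshot
        return self.decision

def simulate_read(code_id, read_index, emulated_memory, agreements):
    # each simulator first evaluates the read locally on the emulated memory,
    # then funnels its candidate view through the agreement object attached to
    # this particular read, so that all simulators adopt the same view
    local_view = tuple(emulated_memory)
    ag = agreements.setdefault((code_id, read_index), ReadAgreement())
    return ag.propose(local_view)

# two simulators evaluate the same read of code 7 at different times, yet
# both return the view agreed on first
agreements = {}
v1 = simulate_read(7, 0, ["a", None, None], agreements)
v2 = simulate_read(7, 0, ["a", "b", None], agreements)
assert v1 == v2 == ("a", None, None)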
4.1 T (wf − regular) and T (wf − extended)
Thus, tailored to the mechanism of the simulation, we can create a wait-free r/w task for the simulators that is solvable if the codes exist [3]. If we are able to create such a wait-free task that is known to be wait-free r/w unsolvable, we have proved that the speculated codes do not exist. To prove that 100 processors cannot 1-resiliently solve consensus, we tailored a task for t + 1 = 2 processors. Using the fact that a bivalent state exists [6], we take an execution in which "the bivalent" code participates, and one in which it does not. We program our simulators to follow the order dictated by these executions. Thus we obtain the outputs for the "solo" participating sets. The rest of the outputs, for the participating set that includes both simulators, are determined by
all the possible outputs the simulators may observe, and these are the 99 outputs, or a superset of 100 outputs, inherited from the original task. For each tailoring of a task that we produced with the BG-simulation, that is T (wf − regular), we are now able to tailor with the Extended BG-simulation the same task with the added condition that, if we uniquely associate each simulator a priori with a unique code, then the output of this simulator contains an output for that code, i.e. T (wf − extended) - thus tightening the task.
5 Impossibility of t-resilient Weak-Renaming From the Impossibility of Wait-free Weak-Renaming
5.1 Weak-Renaming
The Weak-Renaming problem is a task on infinitely many processors p0, p1, .... The only constraint is that for participating sets of size n, each processor outputs a unique integer between 0 and 2n − 3 [10]. It was shown by the author that w.l.o.g. the task can be posed for just n + (n − 1) processors p0, p1, ..., p2n−2 instead of infinitely many [8], and that the task is equivalent to the one he called Weak-Symmetry-Breaking (WSB), in which processors, instead of outputting unique integers, output 0 or 1, such that at least one processor outputs 0 and at least one processor outputs 1 [7]. Notice that WSB, although not colorless, nevertheless has the property that seeing a single output enables another processor to immediately decide an output for itself.
5.2 t-resilient Weak-Renaming
The t-resilient Weak-Renaming problem is a task on n + t processors p0, ..., pn+t−1. The only constraint is that for participating sets of size n, each processor outputs a unique integer between 0 and n + t − 2.
5.3 Reduction
We want to show that if WSB is impossible, then t-resilient Weak-Renaming is impossible to solve 2t-resiliently (the other direction is known, and therefore after the reduction the problems are proved equivalent). The reader who is puzzled by the 2t-resiliency should recall that resiliency is with respect to the set of processors in the task, rather than w.r.t. the particular participating set with which this problem is distinguished. The 2t-resiliency for the task translates into t-resiliency for the participating set of n processors. Let code0, ..., coden+t−1 solve t-resilient Weak-Renaming 2t-resiliently. We are going to simulate the codes by the Extended BG-simulation using t + 1 simulators. These simulators will solve WSB among themselves. In the WSB we pick t + 1 processors out of 2(t + 1) − 1 processors. Since n > t + 1 (n = t + 1 is the wait-free case) we get that n + t − 2 ≥ 2(t + 1) − 1. Thus, each simulator is a priori associated with a unique code, as required by the Extended BG-simulation. We now run the simulation as follows. Before a simulator starts simulating, it posts its arrival. Its arrival signifies that the code it is associated with is going to be one of the n codes that participate in solving the t-resilient Weak-Renaming. At each point of the simulation the participating set of codes consists of the codes whose associated simulator has posted its arrival, call this set P (wsb), together with the first k codes, code0, ..., codek−1, such that |{code0, ..., codek−1} ∪ P (wsb)| = n − (t + 1) + |P (wsb)|. That is, when only one simulator is active, it will wake up its associated code plus enough prefix-
index codes so that n − (t + 1) + 1 codes will participate. This will allow it, if alone, to simulate a 2t-resilient run of the codes, and correspondingly for every number of simulators until t + 1 is reached. When simulators terminate they output "left" or "right." If all t + 1 participated, then at least one simulator must output "left" and at least one must output "right." They will do this according to the slot their code has acquired, which from now on will be called "their slot." When a simulator has terminated and it is not the only one to have terminated by now, the resolution of "left" or "right" is simple: assigning "left" to the 0 direction and "right" to the other, a simulator that sees itself to the left of another simulator outputs "left," and otherwise it outputs "right." Obviously, if no simulator saw itself as the only one to terminate, then the problem is solved, as the leftmost simulator will output "left" and the rightmost simulator will output "right." The "action" occurs with the simulator that terminated and sees only itself. By the number-of-outputs property, when a simulator terminates it will see at least n − t code outputs. Since the total number of slots is n + (t + 1) − 2 = n + (t − 1), the maximum number of slots it will see unoccupied is t + (t − 1). Thus, either to its left or to its right, there are at most t − 1 unoccupied slots. W.l.o.g. assume that on its left the number of unoccupied slots is less than t. Then, among the other t simulators that have not terminated yet, at most t − 1 can end up to its left; hence it cannot be the rightmost of the t + 1 simulators. Thus it will choose to output "left" with the guarantee that some other simulator will output "right."
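As an illustration of the rule for choosing the participating codes, here is a small Python helper; the function name and interface are illustrative, and this is only a sketch of the counting rule stated above, not part of the simulation itself.

def participating_codes(P_wsb, n, t):
    # P_wsb: indices of the codes whose associated simulator has posted its
    # arrival.  Return P_wsb together with the shortest prefix code_0..code_{k-1}
    # such that the union contains exactly n - (t+1) + |P_wsb| codes.
    target = n - (t + 1) + len(P_wsb)
    k = 0
    while len(set(range(k)) | set(P_wsb)) < target:
        k += 1
    return set(range(k)) | set(P_wsb)

# e.g. n = 5, t = 2: a single arrived simulator associated with code 4 wakes up
# codes {0, 1, 4}, i.e. n - (t+1) + 1 = 3 participating codes
assert participating_codes({4}, 5, 2) == {0, 1, 4}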
6 2-resilient Tasks are Undecidable
6.1 3-Processor Tasks are Undecidable
Some tasks are r/w wait-free solvable and some are not [6]. A task on n processors has a precise finite encoding. Why not build a decision program whose input is a task encoding and whose output is "yes" or "no" according to whether the encoded task is r/w wait-free solvable or not? Indeed, this question was raised in [12], where it was shown that solvability of 2-processor tasks is a reachability problem and consequently decidable. In [14] and [15], it was shown that solvability of the family of tasks on 3 processors is undecidable. Later, in [13], the authors worried that perhaps 3-processor tasks are undecidable, but if you take 100 processors of which only 2 may fail, then the question of whether the task is solvable is decidable. They proved that it is not. Using the Extended BG-simulation we now show that a program that decides whether a task on 100 processors is 2-resiliently solvable can be used to decide 3-processor tasks. Let T (3) be a task on 3 processors p0, p1, p2. Build from it a task T (2 − res) on n processors p0, ..., pn−1 as follows. The projection of the relation ∆ of T (2 − res) on p0, p1, p2 - that is, the input-output relation on p0, p1, p2 implied by taking any input-output tuple in T (2 − res) - is T (3). Conversely, any input-output tuple which may syntactically be in T (2 − res) is there if its projection on p0, p1, p2 is in T (3), and moreover the rest of the symbols in the output appear as an output of one of the processors p0, p1, p2 in the tuple. That is, any processor from p3, ..., pn−1 adopts an output it sees from p0, p1, p2. We want to show that T (3) is solvable if and only if T (2 − res) is. If T (3) is solvable then T (2 − res) is: processors p3, ..., pn−1 wait on processors p0, p1, p2 executing T (3) until at least one output is available. They adopt this output and drop out. The rest
of the processors among p0, p1, p2 finish executing T (3) to get an output. If T (2 − res) is solvable then T (3) is: processors p0, p1, p2 simulate the code for T (2 − res) using the Extended BG-simulation. When a simulator finishes the simulation it has an output for itself in T (3). Without the Extended BG-simulation, the last sentence would not be justified.
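A Python sketch of the membership test this construction induces is given below; the encoding of tuples as dictionaries and of T (3) as a set of projected pairs is my own illustrative choice, not the paper's formal encoding.

def in_T2res(inputs, outputs, T3):
    # inputs, outputs: dicts mapping each participating processor index to its
    # input/output value; T3: a set of (input-projection, output-projection)
    # pairs, each projection a frozenset of (index, value) pairs over {0, 1, 2}
    core = {0, 1, 2} & set(inputs)
    proj = (frozenset((p, inputs[p]) for p in core),
            frozenset((p, outputs[p]) for p in core))
    if proj not in T3:
        return False                       # the projection on p0, p1, p2 must be in T(3)
    core_outputs = {outputs[p] for p in core}
    # every other participating processor must adopt an output of p0, p1 or p2
    return all(outputs[p] in core_outputs for p in inputs if p not in core)

# e.g. a toy T(3) in which each participating processor among p0, p1, p2
# outputs its own input; then p3 must copy one of those outputs
T3 = {(frozenset({(0, "a"), (1, "b")}), frozenset({(0, "a"), (1, "b")}))}
assert in_T2res({0: "a", 1: "b", 3: "x"}, {0: "a", 1: "b", 3: "a"}, T3)
assert not in_T2res({0: "a", 1: "b", 3: "x"}, {0: "a", 1: "b", 3: "x"}, T3)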
7 Conclusion
We tightened the BG-simulation by devising the Extended BG-simulation. We showed some ramifications by deriving a new result, and by a simple re-derivation of a formerly involved result. The question left begging is whether tomorrow we will have an Extended^2 BG-simulation, or whether this is the end of the road. Our feeling is that it is. An analogous question arose with Immediate Snapshots as a tightening of Atomic Snapshots, and in that case reference [9] implied that it is the end of the road. Appropriately formulated, there must be a Meta Theorem that says that t-resiliency "equals" the Extended BG-simulation. There is an unresolved question, which we were not obliged to answer for the paper to go through, hovering between the lines. Is there really an example in which T (wf − regular) is wait-free solvable, while T (wf − extended) is not? Were we able to work out the two examples in the paper because of this concrete difference between the tasks, or just because T (wf − extended) is a higher-level construct with more structure? We conjecture that they are equivalent: whenever T (wf − regular) is solvable, T (wf − extended) is.

Acknowledgment. I am greatly indebted to Hagit Attiya who, back in the summer of 2005, made me aware of the seeming inapplicability of the BG-simulation to the t-resilient Weak-Renaming problem.
References
[1] Afek Y., Attiya H., Dolev D., Gafni E., Merritt M. and Shavit N., Atomic Snapshots of Shared Memory. Proc. 9th ACM Symposium on Principles of Distributed Computing (PODC'90), ACM Press, pp. 1-13, 1990.
[2] Borowsky E. and Gafni E., Generalized FLP Impossibility Results for t-Resilient Asynchronous Computations. Proc. 25th ACM Symposium on the Theory of Computing (STOC'93), ACM Press, pp. 91-100, 1993.
[3] Borowsky E., Gafni E., Lynch N. and Rajsbaum S., The BG Distributed Simulation Algorithm. Distributed Computing, 14(3):127-146, 2001.
[4] Armando Castañeda and Sergio Rajsbaum, New Combinatorial Topology Upper and Lower Bounds for Renaming. To appear in PODC 2008.
[5] Hagit Attiya, Amotz Bar-Noy and Danny Dolev, Sharing Memory Robustly in Message-Passing Systems. Journal of the ACM, 42(1):124-142, Jan. 1995.
[6] Fischer M.J., Lynch N.A. and Paterson M.S., Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2):374-382, 1985.
[7] Gafni E., R/W Reductions. DISC/GODEL presentation (DISC'04), 2004. http://www.cs.ucla.edu/~eli/eli/godel.ppt
[8] Gafni E., Private Communication: Bed-time Story to Maurice Herlihy. May 1993.
[9] Herlihy M.P. and Shavit N., The Topological Structure of Asynchronous Computability. Journal of the ACM, 46(6):858-923, 1999.
[10] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg and Rüdiger Reischuk, Renaming in an Asynchronous Environment. Journal of the ACM, 37(3):524-548, 1990.
[11] Eli Gafni, Round-by-Round Fault Detectors: Unifying Synchrony and Asynchrony (Extended Abstract). PODC 1998: 143-152.
[12] Ofer Biran, Shlomo Moran and Shmuel Zaks, A Combinatorial Characterization of the Distributed Tasks that are Solvable in the Presence of One Faulty Processor. Proc. 7th ACM Symposium on Principles of Distributed Computing (PODC'88), ACM Press, 1988.
[13] Maurice Herlihy and Sergio Rajsbaum, The Decidability of Distributed Decision Tasks (Extended Abstract). STOC 1997: 589-598.
[14] Eli Gafni and Elias Koutsoupias, 3-Processor Tasks Are Undecidable (Abstract). PODC 1995: 271.
[15] Eli Gafni and Elias Koutsoupias, Three-Processor Tasks Are Undecidable. SIAM J. Comput., 28(3):970-983, 1999.