The Randomness Complexity of Private Computation
Carlo Blundo, Alfredo De Santis, Giuseppe Persiano, and Ugo Vaccaro
Dipartimento di Informatica ed Applicazioni, Università di Salerno, 84081 Baronissi (SA), Italy. E-mail: {carblu, ads, giuper, [email protected]}
A preliminary version of this work has been presented at ICALP '95 [3]. Work partially supported by the National Council for Research (C.N.R.) and by the Italian Ministry of University and Research (M.U.R.S.T.).
Abstract
We consider the classic problem of $n$ honest but curious players with private inputs $x_1, \ldots, x_n$ who wish to compute the value of a fixed function $F(x_1, \ldots, x_n)$ in such a way that at the end of the protocol every player knows the value $F(x_1, \ldots, x_n)$. Each pair of players is connected by a secure point-to-point communication channel. The players have unbounded computational resources and they intend to compute $F$ in a $t$-private way. That is, after the execution of the protocol no coalition of size at most $t$ can get any information about the inputs of the remaining players other than what can be deduced from their own inputs and the value of $F$.

We study the amount of randomness needed in $t$-private protocols. We prove a lower bound on the randomness complexity of any $t$-private protocol to compute a function with sensitivity $n$. As a corollary we obtain that when the private inputs are uniformly distributed, at least $k(n-1)(n-2)/2$ random bits are needed to compute the sum modulo $2^k$ of $n$ $k$-bit integers in an $(n-2)$-private way. This result is tight, as there are protocols for this problem that use exactly this number of random bits.
1 Introduction

We consider the classic problem of $n$ honest but curious players with private inputs $x_1, \ldots, x_n$ who wish to compute the value of a fixed function $F(x_1, \ldots, x_n)$ in such a way that at the end of the protocol every player knows the value $F(x_1, \ldots, x_n)$. Each pair of players is connected by a secure point-to-point communication channel. The players have unbounded computational resources and they intend to compute $F$ in a $t$-private way. That is, after the execution of the protocol no coalition of size at most $t$ can get any information about the inputs of the remaining players other than what can be deduced from their own inputs and the value of $F$. Private computation in this model has been the subject of several papers [1, 6, 7, 8, 9, 16, 17, 20]. If $t = n-1$, then $t$-private computation is referred to as totally private computation. Chor and Kushilevitz [8] characterized the boolean functions that can be computed in a totally private way. More precisely, they proved that a boolean function $F(x_1, \ldots, x_n)$ is totally private if and only if it can be represented as the XOR
of $n$ one-argument boolean functions. In [17], the concept of universality in totally private computation has also been investigated.

It is well known that no non-trivial function can be computed privately by means of a deterministic protocol; randomness is therefore an essential ingredient of all secure computations. The aim of this paper is to quantify the amount of randomness needed in protocols for $t$-private computation. Randomness plays an important role in several other areas of Computer Science, most notably Algorithm Design and Complexity. Since random bits are a natural computational resource, the amount of randomness used in computation is an important issue in many applications. Therefore, considerable effort has been devoted both to reducing the number of random bits used by probabilistic algorithms (see for instance [13]) and to analyzing the amount of randomness required in order to achieve a given performance [4, 5, 12, 15, 21]. Motivated by the fact that "truly" random bits are hard to obtain, the possibility of using imperfect sources of randomness in randomized algorithms has also been investigated recently [22].

Our approach is close in spirit to [2], in that we mainly concentrate on the rigorous quantification of the number of random bits the players need to execute a protocol for secure computation. Since different algorithms might use random bits produced by different sources, we first need a uniform measure for the amount of randomness provided by different sources. To this aim, we use the Shannon entropy of the source generating the random bits, since it represents the most general and natural measure of randomness in settings where no limitation is imposed on the computational power of the parties. We also recall the important result of Knuth and Yao [14], which shows that the Shannon entropy of a random variable (i.e., of a memoryless random source) is closely related to a more algorithmically oriented measure of randomness. More precisely, Knuth and Yao have shown that the entropy of a random variable $R$ is approximately equal to the average number of tosses of an unbiased coin necessary to simulate the outcomes of $R$.

In this paper we prove a lower bound on the entropy of the source supplying the stream of bits used by the players as a source of randomness to compute in a $t$-private way a function with sensitivity $n$. As a corollary, we obtain that when the private inputs are uniformly distributed, at least $k(n-1)(n-2)/2$ random bits are needed to compute the sum modulo $2^k$ of $n$ $k$-bit integers in an $(n-2)$-private way. This lower bound is tight, as Chor and Kushilevitz have presented in [9] a protocol for computing the sum modulo $2^k$ of $n$ $k$-bit integers that uses exactly $k(n-1)(n-2)/2$ random bits. The importance of the computation of the modular sum when total privacy is required lies in the result of Chor and Kushilevitz [8], which tells us that, in the boolean case, the modular sum is the only building block for constructing totally-private functions. To prove our lower bound we make no assumption on the general structure of the protocol; for example, our lower bound holds also for non-oblivious protocols. (Oblivious protocols are protocols where the decision whether a player $i$ sends a message to player $j$ at round $k$ depends only on $i$, $j$, and $k$, and not on the player's input or random coin tosses.) The randomness complexity of private computation protocols has also been studied in [18].
In particular, in [18] it has been proved that in any $t$-private protocol computing the XOR function at least $t$ random bits are needed, under the assumption that the source of randomness of each player is a stream of independently and uniformly distributed random bits. Moreover, in [19] the relationship between private protocols and circuits has been analyzed. More precisely, it has been proved that a function $F$ has a linear-size circuit if and only if $F$ has a 1-private $n$-party protocol in which the total number of random bits used by all players is constant.
The paper is organized as follows. In Section 2 we recall the definition and some basic properties of the entropy function. In Section 3 we describe the model for $t$-private protocols and formally define it within the entropy framework. In Section 4 we define the sensitivity of a function $F$ and prove useful properties of protocols computing functions with fixed sensitivity. In Section 5 we formally define the protocol's randomness. Finally, in Section 6 we prove a lower bound on the number of random bits needed to compute in a $t$-private way any function with sensitivity $n$.
2 Information Theoretic Background

In this section we review the basic concepts of Information Theory used in our definitions and proofs. For a complete treatment of the subject the reader is advised to consult [11]. All logarithms in this paper are to the base 2. Given a probability distribution $\{\Pr(x)\}_{x \in \mathcal{X}}$ on a finite set $\mathcal{X}$, we define the entropy of the random variable $X$ taking values on $\mathcal{X}$, denoted by $H(X)$, as
$$H(X) = -\sum_{x \in \mathcal{X}} \Pr(x) \log \Pr(x).$$
The entropy enjoys the following property:
$$0 \le H(X) \le \log|\mathcal{X}|.$$
Given two random variables $X$ and $Y$ taking values on the finite sets $\mathcal{X}$ and $\mathcal{Y}$, respectively, and a joint probability distribution $\{\Pr(x, y)\}_{x \in \mathcal{X}, y \in \mathcal{Y}}$ on the cartesian product $\mathcal{X} \times \mathcal{Y}$, the conditional entropy $H(X|Y)$ is defined as
$$H(X|Y) = -\sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} \Pr(y) \Pr(x|y) \log \Pr(x|y).$$
From the definition of conditional entropy it is easy to see that
$$H(X|Y) \ge 0. \qquad (1)$$
We have $H(X|Y) = 0$ when the value taken by $Y$ completely determines the value chosen from $\mathcal{X}$; whereas $H(X|Y) = H(X)$ means that the choices from $\mathcal{X}$ and $\mathcal{Y}$ are independent, that is, the probability that the value $x$ has been chosen from $\mathcal{X}$, given that $y$ has been chosen from $\mathcal{Y}$, is the same as the a priori probability of choosing $x$ from $\mathcal{X}$. Therefore, knowing the values chosen from $\mathcal{Y}$ does not enable a Bayesian opponent to modify an a priori guess regarding which element has been chosen from $\mathcal{X}$. The mutual information $I(X; Y)$ between $X$ and $Y$ is defined by
$$I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = I(Y; X). \qquad (2)$$
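As a small self-contained illustration of these definitions (the toy joint distribution, the names, and the printed values below are ours, not from the paper), the following sketch computes $H(X)$, $H(X|Y)$, and $I(X;Y)$ numerically:

```python
import math

def entropy(p):
    """Entropy (base 2) of a distribution given as a list of probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Toy joint distribution Pr(x, y) over X = {0, 1} and Y = {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px = [sum(p for (x, y), p in joint.items() if x == v) for v in (0, 1)]
py = [sum(p for (x, y), p in joint.items() if y == v) for v in (0, 1)]

# H(X|Y) = -sum_{x,y} Pr(x,y) log Pr(x|y), matching the definition above.
h_x_given_y = -sum(p * math.log2(p / py[y]) for (x, y), p in joint.items())

print(entropy(px))                  # H(X) = 1.0
print(h_x_given_y)                  # H(X|Y) ~ 0.722
print(entropy(px) - h_x_given_y)    # I(X;Y) ~ 0.278, as in equation (2)
```

Here $I(X;Y) > 0$ reflects the mild dependence between the two coordinates of the toy distribution.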
Given $n+1$ random variables $X_1, \ldots, X_n, Y$, taking values on the finite sets $\mathcal{X}_1, \ldots, \mathcal{X}_n, \mathcal{Y}$, respectively, and a joint probability distribution on the cartesian product $\mathcal{X}_1 \times \cdots \times \mathcal{X}_n \times \mathcal{Y}$, the entropy of $X_1 \ldots X_n$ conditioned on $Y$ can be expressed as
$$H(X_1 \ldots X_n|Y) = H(X_1|Y) + H(X_2|X_1 Y) + \cdots + H(X_n|X_1 \ldots X_{n-1} Y). \qquad (3)$$
Let $X$ be a random variable taking values on the finite set $\mathcal{X}$ and let $Y$ be a random variable consisting of an infinite sequence of random variables $Y_i$, each taking values on the finite set $\mathcal{Y}_i$; that is, $Y = \langle Y_1, Y_2, \ldots, Y_m, \ldots \rangle$. We define the entropy of $X$ given $Y$ as
$$H(X|Y) = \lim_{m \to \infty} H(X|Y_1 \ldots Y_m) \qquad (4)$$
if the above limit exists; otherwise, the quantity $H(X|Y)$ is not defined. Given three random variables $X, Y, Z$ taking values on the finite sets $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$, respectively, the conditional mutual information $I(X; Y|Z)$ between $X$ and $Y$ given $Z$ is defined by
$$I(X; Y|Z) = H(X|Z) - H(X|YZ). \qquad (5)$$
If the random variable $Z$ is an infinite sequence of random variables $Z_i$, each taking values on the finite set $\mathcal{Z}_i$ (i.e., $Z = \langle Z_1, Z_2, \ldots, Z_m, \ldots \rangle$), then the conditional mutual information $I(X; Y|Z)$ is defined as
$$\lim_{m \to \infty} \bigl[ H(X|Z_1 Z_2 \ldots Z_m) - H(X|Y Z_1 Z_2 \ldots Z_m) \bigr],$$
if this limit exists. The conditional mutual information enjoys the following properties:
$$I(X; Y|Z) = H(X|Z) - H(X|YZ) = H(Y|Z) - H(Y|XZ) = I(Y; X|Z) \qquad (6)$$
and
$$I(X; Y|Z) \ge 0, \qquad (7)$$
from which one gets
$$H(X|Z) \ge H(X|YZ). \qquad (8)$$
3 The Model for t-Private Computation

In this section we describe the model and then give formal definitions for $t$-private protocols. There is a set $\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$ of $n$ players, each holding a private input $x_i$ taken from a finite domain $\mathcal{X}_i$ according to a probability distribution $\{p_{X_i}(x)\}_{x \in \mathcal{X}_i}$. The players wish to compute, in a distributed fashion, a function $F(x_1, x_2, \ldots, x_n)$ of their private inputs. Player $P_i$ has access to an infinite source $R_i$ of randomness, independent of its input value; in every execution of a protocol each player $P_i$ uses only a finite amount of randomness. The source $R_i$ can be seen as a sequence of countably many (independent) random variables $R_i = R_i^1 R_i^2 \cdots$, each taking values on some finite set. Typically, each $R_i^j$ is a uniformly distributed random variable taking values in $\{0, 1\}$, but we stress that our results hold for $R_i^j$ taking values on finite sets according to any probability distribution. The players compute the value of the function $F$ by exchanging messages over a complete network of private channels; these are channels in which only the sender and the receiver can read the transmitted message. Each message sent by player $P_i$ is a function of its private input $x_i$, of the random input string $r_i$ it has obtained from its random source $R_i$, and of the messages previously exchanged with the other players. The players are honest (that is, they follow the protocol) but curious (that is, after the protocol is finished some of them might pool the information they have obtained during the execution of the protocol in order to learn something about the other players' inputs). This motivates the need for computing $F$ in a $t$-private way; that is, in such a way that no subset of at most $t$ players can infer any additional information from the execution of the protocol other than what they can obtain from the value of the function and their own private inputs. Let us start by formally defining the notion of a protocol.
Definition 3.1 A randomized protocol for $n$ players $P_1, \ldots, P_n$, holding values $x_1, \ldots, x_n$, respectively, to compute a function $F$ is a pair $(M, E)$, where:
$M$ is a probability distribution on the messages sent during the execution of the protocol. More precisely, $M(i, j, k, x, \mathrm{History}_i^k, m)$ is the probability that player $i$, holding the private input $x$, sends the message $m$ to player $j$ at round $k$, where $\mathrm{History}_i^k$ is the concatenation of all messages sent/received by player $i$, along with the identity of the receiver/sender of each message, at rounds $1, 2, \ldots, k-1$.
$E$ is an evaluation function. That is, for every execution of the protocol, there exists an integer $g_0$ such that for any $1 \le i, j \le n$ and for any $g > g_0$ it holds that $M(i, j, g, x, \mathrm{History}_i^g, \text{"EMPTY MESSAGE"}) = 1$ and $E(i, \mathrm{History}_i^{g_0}) = F(x_1, \ldots, x_n)$.
We consider protocols proceeding in rounds, starting from round 1, where each player $P_i$ chooses the message $m_{ij}^1$ to be sent to player $j$ according to $M(i, j, 1, x, \emptyset, m_{ij}^1)$, and each player can send and receive messages to and from any other player. To sample its messages according to $M$, player $P_i$ uses $R_i$ as its source of randomness. Also, we have imposed no limitation on the difficulty of sampling according to $M$, but only that $M$ be samplable in a finite expected number of steps; for example, the time to choose $m$ according to $M$ might very well be superpolynomial in the length of $\mathrm{History}_i^k$.

For defining the concept of a $t$-private protocol we use the following notation. We denote random variables by capital letters and any value that a random variable can take by the corresponding lowercase letter. Thus, for $i = 1, 2, \ldots, n$, by $X_i$ we denote the random variable induced by the $i$-th player's input, taking values in $\mathcal{X}_i$, and by $F$ the random variable induced by the value assumed by the function $F(x_1, x_2, \ldots, x_n)$. The communication between players $P_i$ and $P_j$ is represented by the random variable $C_{i,j} = C_{j,i}$. We denote all the communication involving player $P_i$ (messages sent and received) during a specific execution of the protocol by $c_i$, while $C_i$ denotes the corresponding random variable; hence $C_i = C_{i,1} \ldots C_{i,i-1} C_{i,i+1} \ldots C_{i,n}$. If $A_i$ ($a_i$) is a random variable (a value) for $i = 1, \ldots, n$ and $W = (i_1, \ldots, i_m)$, with $1 \le i_1, i_2, \ldots, i_m \le n$, then $A_W$ ($a_W$) denotes the random variable $A_{i_1} \ldots A_{i_m}$ (the value $a_{i_1} \ldots a_{i_m}$). Thus, $x_W = x_{i_1} \ldots x_{i_m}$, $r_W = r_{i_1} \ldots r_{i_m}$, $C_W = C_{i_1} \ldots C_{i_m}$, and $C_{j,W} = C_{j,i_1} \ldots C_{j,i_m}$.

In an ideal evaluation of the function $F$ the players give their private inputs to a trusted party. The trusted party computes the value $f = F(x_1, x_2, \ldots, x_n)$ of the function on those inputs and reveals it to all players, without giving any other information. Thus, all the information that a coalition $W$ of players can get about the other players' private inputs is what can be deduced from the values $x_W$ and $f$. We want to achieve the same result without the help of a trusted party; it is well known that this goal is impossible to achieve without randomness. Informally, in order for a protocol to be private against coalitions of size at most $t < n$, we require that all the information that a set $W$, $|W| \le t$, of players can compute about the other players' private inputs is the same that they can compute in an ideal function evaluation. That is, we say that a protocol is $t$-private if all the information that a coalition $W$ of at most $t$ players can get about the other players' private inputs from what they see during the execution of the protocol, that is, from the values $x_W$, $F(x_1, x_2, \ldots, x_n)$, and the set $c_W$ of all messages exchanged, is equal to what can be deduced from the values $x_W$ and $F(x_1, x_2, \ldots, x_n)$ alone. We formalize this concept in the next definition, using the information-theoretic quantities reviewed in Section 2.
Definition 3.2 Let $\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$ be a set of $n$ players and let $F$ be a function $F: \mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_n \to \mathcal{X}$. For $1 \le t \le n-1$, we say that a protocol is a $t$-private protocol for computing the function $F$ if the following two conditions are satisfied.
1) Each player can compute the value of $F$; that is, for each player $P_i$ it holds that $H(F|C_i X_i) = 0$.
2) Any coalition of at most $t$ players gets no additional information; that is, for all $Y, W \subseteq \{1, \ldots, n\}$ such that $|Y| \le t$ and $W \cap Y = \emptyset$, it holds that $H(X_W|C_Y X_Y F) = H(X_W|X_Y F)$.
Remark 3.3 While condition 1) of the above definition expresses the natural fact that at the end of the protocol any player should be able to compute the function, condition 2) expresses the "level of security" of the protocol. We point out that condition 2) is weaker than the corresponding one generally found in the literature (e.g., [9]). For example, as we shall see in Lemma 3.6, according to the definition of $t$-private protocol given in [9], Property 2) of Definition 3.2 should be replaced by:
2′) Any coalition of at most $t$ players gets no additional information; that is, for all $Y, W \subseteq \{1, \ldots, n\}$ such that $|Y| \le t$ and $W \cap Y = \emptyset$, it holds that $H(X_W|C_Y R_Y X_Y F) = H(X_W|R_Y X_Y F)$,
which is a stronger condition than 2); see Lemma 3.5 below. Nevertheless, our definition is strong enough to prove a non-trivial lower bound on the number of random bits needed to compute in a $t$-private way a function with sensitivity $n$.
Remark 3.4 Notice that condition 2′) above is expressed using entropies of random variables conditioned on infinite random variables (i.e., $R_Y$). We stress here that the definition is sound, as all entropies are well defined. Consider for example $H(X_W|C_Y X_Y R_Y F)$. As we have already discussed, the source of randomness $R_Y$ can be seen as a sequence of countably many finite random variables $R_Y = R_Y^1 R_Y^2 \cdots$. However, since in every execution of a protocol each player uses only a finite amount of randomness, all random variables depend upon this finite amount of randomness; that is, there exists an integer $\ell = \ell(Y)$ such that for each $m > \ell$ it holds that $H(X_W|C_Y X_Y R_Y^1 R_Y^2 \cdots R_Y^m F) = H(X_W|C_Y X_Y R_Y^1 R_Y^2 \cdots R_Y^\ell F)$. Therefore $H(X_W|C_Y X_Y R_Y F)$ is well defined and is equal to $H(X_W|C_Y X_Y R_Y^1 R_Y^2 \cdots R_Y^\ell F)$.
Lemma 3.5 If, for $Y, W \subseteq \{1, \ldots, n\}$ such that $|Y| \le t$ and $W \cap Y = \emptyset$, it holds that $H(X_W|C_Y R_Y X_Y F) = H(X_W|R_Y X_Y F)$, then $H(X_W|C_Y X_Y F) = H(X_W|X_Y F)$.

Proof: Notice that
$$H(X_W|X_Y F) = H(X_W|R_Y X_Y F). \qquad (9)$$
Indeed,
$$H(X_W|X_Y F) - H(X_W|R_Y X_Y F) = I(X_W; R_Y|X_Y F) = H(R_Y|X_Y F) - H(R_Y|X_Y X_W F) = 0.$$
(The last equality is justified since the source of randomness $R_Y$ is independent of the private inputs and of the value of the function.) We have that
$$H(X_W|C_Y X_Y F) \ge H(X_W|C_Y R_Y X_Y F) \qquad \text{(by (8))}$$
$$= H(X_W|R_Y X_Y F) \qquad \text{(by hypothesis)}$$
$$= H(X_W|X_Y F) \qquad \text{(by (9))}$$
$$\ge H(X_W|C_Y X_Y F). \qquad \text{(by (8))}$$
Hence, we get that $H(X_W|C_Y X_Y F) = H(X_W|X_Y F)$, which proves the lemma.
The next lemma gives equivalent formulations of Property 2′). In particular, Property 2a) is a simpler but equivalent formulation of Property 2′), whereas Property 2b) has been used in the literature (e.g., [9]).

Lemma 3.6 For $1 \le t \le n-1$, Property 2′) is equivalent to each of the following:
2a) For all $Y \subseteq \{1, \ldots, n\}$ such that $|Y| \le t$, it holds that $H(X_W|C_Y R_Y X_Y F) = H(X_W|R_Y X_Y F)$, where $W = \{1, \ldots, n\} \setminus Y$.
2b) For all $Y \subseteq \{1, \ldots, n\}$ such that $|Y| \le t$, and for all $x_Y, r_Y, c_Y, x_W, x'_W$ such that the function evaluated at $x_Y x_W$ and at $x_Y x'_W$ has the same value $f$, it holds that $\Pr(c_Y|x_W, r_Y, x_Y) = \Pr(c_Y|x'_W, r_Y, x_Y)$, where $W = \{1, \ldots, n\} \setminus Y$ and the probability is taken over all random inputs $r_W$.
Proof: First, we prove that Properties 2′) and 2a) are equivalent. Since Property 2′) states that $H(X_W|C_Y R_Y X_Y F) = H(X_W|R_Y X_Y F)$ for any pair of disjoint sets $Y, W$, the same equality holds in particular for any pair $Y$ and $W = \{1, \ldots, n\} \setminus Y$, and Property 2a) is satisfied. On the other hand, suppose that Property 2a) holds; that is, for all $Y \subseteq \{1, \ldots, n\}$ with $|Y| \le t$ we have that the mutual information $I(X_W; C_Y|R_Y X_Y F) = H(X_W|C_Y R_Y X_Y F) - H(X_W|R_Y X_Y F) = 0$, where $W = \{1, \ldots, n\} \setminus Y$. We have to prove that for all $Y, W' \subseteq \{1, \ldots, n\}$ with $W' \cap Y = \emptyset$ it holds that $I(X_{W'}; C_Y|R_Y X_Y F) = 0$. Let $W'' = W \setminus W'$. We have that
$$0 = I(X_W; C_Y|R_Y X_Y F) = I(X_{W'}; C_Y|R_Y X_Y F) + I(X_{W''}; C_Y|R_Y X_Y F X_{W'}) \ge I(X_{W'}; C_Y|R_Y X_Y F) \ge 0.$$
Hence, for all $Y, W' \subseteq \{1, \ldots, n\}$ with $W' \cap Y = \emptyset$, it holds that $I(X_{W'}; C_Y|R_Y X_Y F) = 0$. Thus, Property 2′) is satisfied.

Now, we prove that Properties 2a) and 2b) are equivalent. Property 2a) says that $C_Y$ and $X_W$ are statistically independent given $R_Y X_Y F$; that is, $\Pr(x_W|c_Y, r_Y, x_Y, f) = \Pr(x_W|r_Y, x_Y, f)$. This is equivalent to $\Pr(c_Y|x_W, r_Y, x_Y, f) = \Pr(c_Y|r_Y, x_Y, f)$ which, in turn, is equivalent to $\Pr(c_Y|x_W, r_Y, x_Y, f) = \Pr(c_Y|x'_W, r_Y, x_Y, f)$, where $x_W$ and $x'_W$ are two sets of values such that the function $F$ evaluated at $x_Y x_W$ and at $x_Y x'_W$ has the same value $f$. Therefore, we get that $\Pr(c_Y|x_W, r_Y, x_Y) = \Pr(c_Y|x'_W, r_Y, x_Y)$, which proves the lemma.
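Before turning to sensitivity, a small illustration of the model may help. The following sketch (ours, not part of the paper) shows the classic additive-sharing protocol for the modular sum: every message is a function of the sender's private input and of its local coin tosses, exactly as prescribed by Definition 3.1. Note that it draws $n(n-1)$ random elements of $\mathbb{Z}_q$, far more than the optimal amount established in Section 6.

```python
import secrets

def additive_sharing_sum(xs, q):
    """Sketch: players P_1..P_n compute sum(xs) mod q by additive sharing.

    Player i splits x_i into n random additive shares mod q, keeps one,
    and sends one to each other player over the private channels.
    """
    n = len(xs)
    shares = []
    for x in xs:
        s = [secrets.randbelow(q) for _ in range(n - 1)]
        s.append((x - sum(s)) % q)   # last share makes the shares sum to x
        shares.append(s)
    # Player j announces the sum of the shares it received.
    announced = [sum(shares[i][j] for i in range(n)) % q for j in range(n)]
    # Everyone adds the announcements: the random masks cancel out.
    return sum(announced) % q

print(additive_sharing_sum([3, 5, 6], 8))   # (3 + 5 + 6) mod 8 = 6
```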
4 Sensitivity

In this section we define the sensitivity of a function $F$ and we prove useful properties of protocols computing functions with fixed sensitivity. We say that a function $F: \mathcal{X}^n \to \mathcal{X}$ is sensitive to its $i$-th variable on the assignment $x = (x_1, \ldots, x_n)$ if
$$|\{F(x_1, \ldots, x_{i-1}, z_i, x_{i+1}, \ldots, x_n) : z_i \in \mathcal{X}\}| = |\mathcal{X}|.$$
In other words, a function $F$ is sensitive to its $i$-th variable, on a fixed assignment $x = (x_1, \ldots, x_n)$, if its value changes whenever the value of the $i$-th variable does. We say that the function $F$ is $i$-sensitive if it is sensitive to its $i$-th variable on every assignment $x$. The sensitivity of a function $F$, denoted by $S(F)$, is the number of indices $i$ for which the function $F$ is $i$-sensitive. For instance, the sensitivity of the function $\mathrm{MSum}(x_1, \ldots, x_n) = \sum_{i=1}^n x_i \bmod q$ is $n$. More generally, if $(G, \cdot)$ is a group, then the function $F: G^n \to G$ defined by $F(x_1, x_2, \ldots, x_n) = x_1 \cdot x_2 \cdots x_n$ has sensitivity $n$. Notice that our definition of sensitivity differs from other definitions found in the literature [20].
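On small domains this definition can be checked by brute force; the following sketch (ours, for illustration only) does exactly that and confirms that MSum has sensitivity $n$:

```python
from itertools import product

def is_i_sensitive(f, domain, n, i):
    """True iff f: domain^n -> domain is sensitive to its i-th variable
    on every assignment, per the definition above."""
    for x in product(domain, repeat=n):
        values = {f(*(x[:i] + (z,) + x[i + 1:])) for z in domain}
        if len(values) < len(domain):   # some output value is never hit
            return False
    return True

def sensitivity(f, domain, n):
    """S(F): the number of indices i to which f is i-sensitive."""
    return sum(is_i_sensitive(f, domain, n, i) for i in range(n))

q, n = 4, 3
msum = lambda *xs: sum(xs) % q          # MSum over Z_4 with 3 players
print(sensitivity(msum, range(q), n))   # prints 3, i.e. S(MSum) = n
```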
The following lemma states that, for any function $F$ with sensitivity $n$, any $n-1$ input values, say $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$, together with the value of the function $F$ itself, uniquely determine the missing input value $x_i$.

Lemma 4.1 Let $F: \mathcal{X}^n \to \mathcal{X}$ be a function with sensitivity $n$. For any $i \in \{1, \ldots, n\}$ it results that
$$H(X_i|Y_i F) = 0,$$
where $Y_i = X_1 \ldots X_{i-1} X_{i+1} \ldots X_n$.

Proof: Since the function $F$ has sensitivity $n$, for any fixed value $f$ of the function and any $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ there exists a value $x_i$ such that $f = F(x_1, \ldots, x_i, \ldots, x_n)$. Suppose by contradiction that $H(X_i|Y_i F) > 0$. Then for some $f$ and some $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ there are at least two values, say $x'_i$ and $x''_i$, such that $f = F(x_1, \ldots, x'_i, \ldots, x_n) = F(x_1, \ldots, x''_i, \ldots, x_n)$. This implies that
$$|\{F(x_1, \ldots, x_{i-1}, z_i, x_{i+1}, \ldots, x_n) : z_i \in \mathcal{X}\}| < |\mathcal{X}|.$$
Therefore the function $F$ is not sensitive to its $i$-th variable on this assignment; hence $F$ is not $i$-sensitive, a contradiction.

In the case of independent and uniformly distributed private inputs, the previous lemma can be generalized as follows.
Lemma 4.2 Let $X_1, \ldots, X_n$ be independent and uniformly distributed random variables and let $F$ be a function with sensitivity $n$. For any $W, Y \subseteq \{1, \ldots, n\}$ such that $W \cap Y = \emptyset$, and for any $i \in Y$, it results that
$$H(X_Y|X_W F) = \begin{cases} H(X_Y), & \text{if } |W \cup Y| < n; \\ H(X_{T_i}), & \text{otherwise}, \end{cases}$$
where $T_i = Y \setminus \{i\}$.

Proof: Since the function $F$ has sensitivity $n$, we have that for any $Z$ of size less than $n$ it holds that $H(F|X_Z) = H(F)$. Therefore, $H(F|X_W) = H(F|X_W X_Y)$ if $|W \cup Y| < n$. By (5), this implies that $H(X_Y|X_W F) = H(X_Y|X_W)$. By the independence of the $X_i$'s we have $H(X_Y|X_W) = H(X_Y)$, and thus $H(X_Y|X_W F) = H(X_Y)$ if $|W \cup Y| < n$. On the other hand, if $|W \cup Y| = n$, then by Lemma 4.1 we have that $H(X_i|X_W X_{T_i} F) = 0$. Hence, it holds that $H(X_Y|X_W F) = H(X_{T_i}|X_W F) + H(X_i|X_W X_{T_i} F) = H(X_{T_i}|X_W F) = H(X_{T_i})$. The last equality is justified by the first part of this proof. Thus, the lemma holds.

The following lemma relates the sensitivity of a function $F$ to a property of any protocol computing $F$. Namely, if $F$ is an $i$-sensitive function, then the communication $C_i$ uniquely determines the input value $x_i$ of player $P_i$.
Lemma 4.3 In any protocol computing an $i$-sensitive function $F$, it results that $H(X_i|C_i) = 0$.

Proof: If $H(X_i) = 0$ then, since $0 \le H(X_i|C_i) \le H(X_i)$, we have $H(X_i|C_i) = 0$. Assume then $H(X_i) > 0$, so that there are at least two values $x'_i \ne x''_i$ with non-zero probability. The identity $H(X_i|C_i) = 0$ means that the communication involving player $P_i$ uniquely determines its input value $x_i$. Were it otherwise, the player $P_i$ with two different inputs, $x'_i$ and $x''_i$, could be involved in the same communication $c_i$; but this leads to a contradiction. Indeed, once the inputs of all the players but $P_i$ are fixed, the protocol outputs the same value $f$ whether $P_i$ holds $x'_i$ or $x''_i$. But, as $F$ is $i$-sensitive, it must be the case that $F(x_1, \ldots, x'_i, \ldots, x_n) \ne F(x_1, \ldots, x''_i, \ldots, x_n)$, contradicting the fact that the protocol computes $F$. Thus, the lemma holds.

The next lemma states that if a function $F$ is $i$-sensitive, then its value is uniquely determined by the communication $C_i$ involving the player $P_i$.
Lemma 4.4 In any protocol computing an $i$-sensitive function $F$, it results that $H(F|C_i) = 0$.

Proof: We get
$$0 = H(X_i|C_i) \quad \text{(by Lemma 4.3)} \quad \ge I(X_i; F|C_i) \quad \text{(by (6) and (1))} \quad \ge 0 \quad \text{(by (7))}.$$
Therefore, we have $I(X_i; F|C_i) = 0$. Since $I(X_i; F|C_i) = H(F|C_i) - H(F|C_i X_i)$, we have that $H(F|C_i) = H(F|C_i X_i)$. By definition of private protocol, $H(F|C_i X_i) = 0$, from which we obtain that $H(F|C_i) = 0$.
Since the function MSum has sensitivity n, the following corollary holds.
Corollary 4.5 In any protocol computing the function MSum, it results that $H(X_i|C_i) = 0$ and $H(F|C_i) = 0$, for $i = 1, 2, \ldots, n$.

The next lemma relates the communication $C_i$ involving the player $P_i$ to the value of an $i$-sensitive function $F$.
Lemma 4.6 In any protocol computing an $i$-sensitive function $F$, it results that
$$H(C_i) = H(F) + H(C_i|F).$$
Proof: From equation (2) we have that $H(C_i) - H(C_i|F) = H(F) - H(F|C_i)$. Since the function $F$ is $i$-sensitive, from Lemma 4.4 we get $H(F|C_i) = 0$. Hence, we obtain $H(C_i) = H(F) + H(C_i|F)$, which proves the lemma.
5 Protocol's Randomness

To measure the randomness of a protocol we use the Shannon entropy of a random variable. Indeed, Knuth and Yao [14] have shown that the entropy of a finite random variable is closely related to a more algorithmically oriented measure of randomness: the entropy of a random variable $X$ is very close to the average number of tosses of an unbiased coin necessary to simulate the outcomes of $X$. Let $A$ be an algorithm that generates a random variable $X$ with distribution $(p_1, \ldots, p_n)$, using only independent and unbiased random bits as input. Denote by $T(A)$ the average number of random bits used by the algorithm $A$ and let $T(X) = \min_A T(A)$. In [14] the following theorem has been proved.
Theorem 5.1 (Knuth and Yao [14]) $H(X) \le T(X) < H(X) + 2$.
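To make Theorem 5.1 concrete, here is a minimal sketch (ours, not from [14]) of a Knuth-Yao-style sampler: it scans the binary expansions of the probabilities, spending one fair coin flip per level of the implicit generating tree, and it assumes every probability is an exact multiple of $2^{-\mathtt{bits}}$.

```python
import random

def knuth_yao(probs, bits=16):
    """Sample an index from `probs` using fair coin flips (DDG-tree walk).

    Returns (sample, flips); the expected number of flips lies in
    [H(X), H(X) + 2) when the probabilities are exactly representable
    with `bits` binary digits.
    """
    # digits[i][d] = d-th binary digit of probs[i].
    digits = [[(int(p * (1 << bits)) >> (bits - 1 - d)) & 1
               for d in range(bits)] for p in probs]
    col = 0
    for d in range(bits):
        col = 2 * col + random.getrandbits(1)   # one unbiased coin flip
        for i in range(len(probs)):
            col -= digits[i][d]
            if col < 0:
                return i, d + 1
    raise ValueError("probabilities not exactly representable in `bits` bits")

# X with distribution (1/2, 1/4, 1/4), so H(X) = 1.5.
avg = sum(knuth_yao([0.5, 0.25, 0.25])[1] for _ in range(100000)) / 100000
print(avg)   # empirically close to 1.5, consistent with Theorem 5.1
```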
Thus, by Theorem 5.1, the entropy of a random source is very close to the average number of independent unbiased random bits necessary to simulate the source. In view of this, the total randomness present in a private protocol for $n$ players is lower bounded by the entropy $H(C_1 \ldots C_n)$ of all the communication in the protocol, while the protocol's randomness, that is, the randomness needed to compute the function $F(x_1, x_2, \ldots, x_n)$ once the values $x_1, x_2, \ldots, x_n$ and the probability distribution on them are known, is lower bounded by the entropy $H(C_1 \ldots C_n|X_1 \ldots X_n)$. The two bounds are related by the following lemma.
Lemma 5.2 For any protocol computing a function $F$ of $n$ arguments, it holds that
$$H(C_1 \ldots C_n|X_1 \ldots X_n) \ge H(C_1 \ldots C_n) - H(X_1 \ldots X_n).$$
Equality holds if the sensitivity of $F$ is $S(F) = n$.

Proof: Considering the mutual information $I(C_1 \ldots C_n; X_1 \ldots X_n)$, from equation (2) we have
$$H(C_1 \ldots C_n) = H(C_1 \ldots C_n|X_1 \ldots X_n) + H(X_1 \ldots X_n) - H(X_1 \ldots X_n|C_1 \ldots C_n).$$
Since $H(X_1 \ldots X_n|C_1 \ldots C_n) \ge 0$, we get
$$H(C_1 \ldots C_n) \le H(C_1 \ldots C_n|X_1 \ldots X_n) + H(X_1 \ldots X_n).$$
If the sensitivity of $F$ is $S(F) = n$, then from (3), (8), and Lemma 4.3 we get
$$0 \le H(X_1 \ldots X_n|C_1 \ldots C_n) \le \sum_{i=1}^n H(X_i|C_i) = 0.$$
Hence, it follows that $H(X_1 \ldots X_n|C_1 \ldots C_n) = 0$. Thus,
$$H(C_1 \ldots C_n|X_1 \ldots X_n) = H(C_1 \ldots C_n) - H(X_1 \ldots X_n).$$
To estimate the randomness needed by a $t$-private protocol $\Pi$ for $n$ players that computes the function $F$, we define the quantity
$$R_F(n, t, \Pi, \mathcal{D}) = H(C_1 \ldots C_n|X_1 \ldots X_n),$$
where $\Pi$ is a $t$-private protocol and $\mathcal{D}$ is a probability distribution on the players' inputs $(x_1, x_2, \ldots, x_n)$. The value $R_F(n, t, \Pi, \mathcal{D})$ represents a lower bound on the amount of randomness required by the $t$-private protocol $\Pi$ to evaluate the function $F$ when $\mathcal{D}$ is the probability distribution on the inputs. Notice that $R_F(n, t, \Pi, \mathcal{D})$ depends on both $\Pi$ and $\mathcal{D}$, since the probability that the participants receive/send a particular message depends both on the protocol and on the input distribution. Since we are interested in the minimum possible amount of randomness for computing a function $F$, we give the following definition.

Definition 5.3 The $t$-randomness of a function $F$ with respect to the probability distribution $\mathcal{D}$ on the inputs $(x_1, x_2, \ldots, x_n)$ is defined as
$$R_F(n, t, \mathcal{D}) = \inf_{\Pi \in \mathcal{T}} R_F(n, t, \Pi, \mathcal{D}),$$
where $\mathcal{T}$ is the set of all $t$-private protocols computing $F$. In the sequel, whenever the function $F$ is clear from the context, we will simply write $R(n, t, \mathcal{D})$ instead of $R_F(n, t, \mathcal{D})$.
6 A Lower Bound for t-Private Protocols

We are now ready to prove a lower bound on the number of random bits needed to compute in a $t$-private way any function with sensitivity $n$. We consider the case of private inputs which are independent and uniformly distributed. We start by proving the following preliminary lemmas.

Lemma 6.1 Let $2 \le t \le n-2$, let $F$ be a function with sensitivity $n$, and let $1 \le j \le n$. Moreover, let $Z$ and $W$ be two disjoint non-empty subsets of $\{1, \ldots, n\} \setminus \{j\}$ such that $|Z| + |W| = t$. If $X_1, \ldots, X_n$ are uniformly distributed, then, for any $t$-private protocol computing $F$, it results that
$$H(C_{j,Y}|C_Z C_{j,W}) \ge H(X_j),$$
where $Y = \{1, \ldots, n\} \setminus (Z \cup W \cup \{j\})$.

Proof: By (1) and (8) one obtains $0 \le H(X_j|C_Z C_{j,W} C_{j,Y}) \le H(X_j|C_j)$. Since $F$ has sensitivity $n$, by Lemma 4.3 we have $H(X_j|C_j) = 0$, and thus $H(X_j|C_Z C_{j,W} C_{j,Y}) = 0$. We have that
$$H(C_{j,Y}|C_Z C_{j,W}) \ge I(X_j; C_{j,Y}|C_Z C_{j,W}) \qquad \text{(by (5) and (1))}$$
$$= H(X_j|C_Z C_{j,W}) - H(X_j|C_Z C_{j,W} C_{j,Y}) = H(X_j|C_Z C_{j,W})$$
$$\ge H(X_j|C_Z C_W) \ge H(X_j|C_Z C_W X_Z X_W F) = H(X_j|X_Z X_W F) \qquad \text{(by 2) of Definition 3.2)}$$
$$= H(X_j). \qquad \text{(by Lemma 4.2)}$$
Lemma 6.2 Let $2 \le t \le n-2$, let $F$ be a function with sensitivity $n$, and let $1 \le j \le n$. Moreover, let $Z$ be a non-empty subset of $\{1, \ldots, n\} \setminus \{j\}$ such that $|Z| < t$. If $X_1, \ldots, X_n$ are uniformly distributed, then, in any $t$-private protocol computing $F$, for any $0 \le r < n-t-1$ there exists a set $T \subseteq \{1, \ldots, n\} \setminus (Z \cup \{j\})$ of cardinality $r$ such that
$$H(C_{j,T}|C_Z) \ge \frac{r}{n-t-1}\,H(X_j).$$

Proof: For $r = 0$ we choose $T = \emptyset$, and thus $C_{j,T}$ is the "empty" random variable, for which $H(C_{j,T}|C_Z) = 0$; hence the lemma holds for $r = 0$. Now, fix any $t$-private protocol computing $F$, an index $j \in \{1, \ldots, n\}$, and let $\ell = |Z| < t$. Recall that, by Lemma 6.1, for any set $W \subseteq \{1, \ldots, n\} \setminus (Z \cup \{j\})$ with $|W| = t-\ell$, letting $Y = \{1, \ldots, n\} \setminus (Z \cup W \cup \{j\})$ be a set of cardinality $n-t-1$, it results that $H(C_{j,Y}|C_Z C_{j,W}) \ge H(X_j)$. For $r = 1$, notice that $H(C_{j,Y}|C_Z) \ge H(C_{j,Y}|C_Z C_{j,W}) \ge H(X_j)$; hence, by (3) and (8), there exists an index $k \in Y$ such that $H(C_{j,k}|C_Z) \ge H(X_j)/(n-t-1)$, and the lemma is proved for $r = 1$.

Assume the lemma true for a set $V$ of size $r$, and consider an index $k \in \{1, \ldots, n\} \setminus (Z \cup V \cup \{j\})$. If $H(C_{j,V} C_{j,k}|C_Z) \ge (r+1)H(X_j)/(n-t-1)$, then we choose $T = V \cup \{k\}$ and the lemma holds for $|T| = r+1$. Suppose instead that $H(C_{j,V} C_{j,k}|C_Z) < (r+1)H(X_j)/(n-t-1)$. Let $W \subseteq \{1, \ldots, n\} \setminus (Z \cup V \cup \{j, k\})$ with $|W| = t-\ell$, and let $Y = \{1, \ldots, n\} \setminus (Z \cup W \cup \{j\})$. Then $Y$ is a set of cardinality $n-t-1$ such that $V \cup \{k\} \subseteq Y$. By inequality (8) and Lemma 6.1 we have $H(C_{j,Y}|C_Z) \ge H(C_{j,Y}|C_Z C_{j,W}) \ge H(X_j)$. Therefore, using (3) with $U = Y \setminus (V \cup \{k\})$, we have that
$$H(C_{j,Y}|C_Z) = H(C_{j,V} C_{j,k}|C_Z) + H(C_{j,U}|C_Z C_{j,V} C_{j,k}) \ge H(X_j).$$
Since $H(C_{j,V} C_{j,k}|C_Z) < (r+1)H(X_j)/(n-t-1)$, we obtain
$$H(C_{j,U}|C_Z C_{j,V}) \ge H(C_{j,U}|C_Z C_{j,V} C_{j,k}) > \Bigl(1 - \frac{r+1}{n-t-1}\Bigr) H(X_j) = \frac{n-t-r-2}{n-t-1}\,H(X_j).$$
Since $|U| = n-t-r-2$, there exists a $u \in U$ such that $H(C_{j,u}|C_Z C_{j,V}) > H(X_j)/(n-t-1)$. Using (3), the inductive hypothesis, and $H(C_{j,u}|C_Z C_{j,V}) > H(X_j)/(n-t-1)$, we get
$$H(C_{j,V} C_{j,u}|C_Z) = H(C_{j,V}|C_Z) + H(C_{j,u}|C_Z C_{j,V}) > \Bigl(\frac{r}{n-t-1} + \frac{1}{n-t-1}\Bigr) H(X_j) = \frac{r+1}{n-t-1}\,H(X_j),$$
which, by choosing $T = V \cup \{u\}$, proves the lemma.
Lemma 6.3 Let $2 \le t \le n-2$, let $F$ be a function with sensitivity $n$, and let $1 \le j \le n$. Moreover, let $Z$ be a non-empty subset of $\{1, \ldots, n\} \setminus \{j\}$ such that $|Z| = d < t$. If $X_1, \ldots, X_n$ are uniformly distributed, then in any $t$-private protocol computing $F$ it results that
$$H(C_j|C_Z) \ge \frac{n-d-1}{n-t-1}\,H(X_j).$$

Proof: Let $t-d = q(n-t-1) + r$, where $0 \le r < n-t-1$ and $q \in \mathbb{N}$. If $r = 0$, then set $T = \emptyset$; otherwise, let $T \subseteq \{1, \ldots, n\} \setminus (Z \cup \{j\})$ be a set of cardinality $r$ satisfying Lemma 6.2. Consider a set $W \subseteq \{1, \ldots, n\} \setminus (Z \cup \{j\})$ of cardinality $t-d$ such that $T \subseteq W$, and let $Y = \{1, \ldots, n\} \setminus (Z \cup W \cup \{j\})$. Since $C_{j,W}$ and $C_{j,Y}$ are part of $C_j$, by (3)
$$H(C_j|C_Z) \ge H(C_{j,W} C_{j,Y}|C_Z) = H(C_{j,W}|C_Z) + H(C_{j,Y}|C_Z C_{j,W}) \ge H(C_{j,W}|C_Z) + H(X_j). \qquad \text{(by Lemma 6.1)}$$
For $\ell = 1, 2, \ldots, q$, denote by $Y_\ell$ sets of indices in $W \setminus T$ such that $|Y_\ell| = n-t-1$ and $Y_k \cap Y_h = \emptyset$ for $k \ne h$; moreover, let $W_\ell = \{1, \ldots, n\} \setminus (Z \cup Y_\ell \cup \{j\})$. If $t-d < n-t-1$, then $q = 0$, $r = t-d$, and there are no sets $W_\ell$ and $Y_\ell$. We get
$$H(C_j|C_Z) \ge H(C_{j,W}|C_Z) + H(X_j) \ge H(C_{j,T}|C_Z) + \sum_{\ell=1}^{q} H(C_{j,Y_\ell}|C_Z C_{j,W_\ell}) + H(X_j)$$
$$\ge \Bigl(\frac{r}{n-t-1} + q + 1\Bigr) H(X_j) \qquad \text{(by Lemmas 6.2 and 6.1)}$$
$$= \frac{n-d-1}{n-t-1}\,H(X_j).$$
Thus, the lemma holds.
Remark 6.4 By the same arguments used to prove Lemma 6.3, we obtain that in any $t$-private protocol computing a function $F$ with sensitivity $n$, and for any $j \in \{1, \ldots, n\}$, it holds that
$$H(C_j|F) \ge \frac{n-1}{n-t-1}\,H(X_j).$$

We are now ready to prove a lower bound on the randomness complexity of any $t$-private protocol computing a function $F$ whose sensitivity is $S(F) = n$. For the sake of clarity of exposition, we will prove the theorem for probability distributions such that $H(X_i) = H(X)$ for $i = 1, 2, \ldots, n$. Moreover, we assume that each player $P_i$ chooses his input value $X_i$ independently of the other players; hence, $H(X_1 \ldots X_n) = \sum_{i=1}^n H(X_i) = nH(X)$.

Theorem 6.5 Let $2 \le t \le n-2$. For any function $F$ with sensitivity $S(F) = n$, if the $X_i$'s are independent and $H(X) \stackrel{\mathrm{def}}{=} H(X_1) = \cdots = H(X_n)$, then the $t$-randomness $R(n, t, \mathcal{U})$ satisfies
$$R(n, t, \mathcal{U}) \ge \frac{t(t+3) - 2(n-1)}{2(n-t-1)}\,\log|\mathcal{X}| + H(F),$$
where $\mathcal{U}$ is the uniform probability distribution on the $n$-tuples of inputs $(x_1, \ldots, x_n)$.
Proof: By Lemma 4.4 and by 2) of Definition 3.2 we have that
$$H(X_{t+1} \ldots X_n|C_1 \ldots C_t X_1 \ldots X_t) = H(X_{t+1} \ldots X_n|C_1 \ldots C_t X_1 \ldots X_t F) = H(X_{t+1} \ldots X_n|X_1 \ldots X_t F).$$
Hence, from Lemma 4.2 it results that
$$H(X_{t+1} \ldots X_n|C_1 \ldots C_t X_1 \ldots X_t) = (n-t-1)H(X). \qquad (10)$$
The mutual information $I(C_1 \ldots C_t; X_1 \ldots X_n)$ can be written both as $H(C_1 \ldots C_t) - H(C_1 \ldots C_t|X_1 \ldots X_n)$ and as $H(X_1 \ldots X_n) - H(X_1 \ldots X_n|C_1 \ldots C_t)$. Hence, we get
$$H(C_1 \ldots C_t|X_1 \ldots X_n) = H(C_1 \ldots C_t) - H(X_1 \ldots X_n) + H(X_1 \ldots X_n|C_1 \ldots C_t)$$
$$= H(C_1 \ldots C_t) - nH(X) + H(X_1 \ldots X_t|C_1 \ldots C_t) + H(X_{t+1} \ldots X_n|C_1 \ldots C_t X_1 \ldots X_t)$$
$$= H(C_1 \ldots C_t) - nH(X) + H(X_{t+1} \ldots X_n|C_1 \ldots C_t X_1 \ldots X_t) \qquad \text{(since } H(X_i|C_i) = 0 \text{ for } i \in \{1, \ldots, n\})$$
$$= H(C_1 \ldots C_t) - (t+1)H(X). \qquad \text{(by (10))}$$
Now we are ready to provide a lower bound on $H(C_1 \ldots C_n|X_1 \ldots X_n)$. We have
$$H(C_1 \ldots C_n|X_1 \ldots X_n) \ge H(C_1 \ldots C_t|X_1 \ldots X_n) = H(C_1 \ldots C_t) - (t+1)H(X)$$
$$= H(C_1) + \sum_{i=2}^{t} H(C_i|C_1 \ldots C_{i-1}) - (t+1)H(X)$$
$$= H(F) + H(C_1|F) + \sum_{i=2}^{t} H(C_i|C_1 \ldots C_{i-1}) - (t+1)H(X) \qquad \text{(by Lemma 4.6)}$$
$$\ge H(F) + \frac{n-1}{n-t-1}\,H(X) + \sum_{i=2}^{t} \frac{n-i}{n-t-1}\,H(X) - (t+1)H(X) \qquad \text{(by Remark 6.4 and Lemma 6.3)}$$
$$= \frac{t(t+3) - 2(n-1)}{2(n-t-1)}\,H(X) + H(F) \qquad \Bigl(\text{a direct computation, using } \sum_{i=1}^{t}(n-i) = tn - \tfrac{t(t+1)}{2}\Bigr)$$
$$= \frac{t(t+3) - 2(n-1)}{2(n-t-1)}\,\log|\mathcal{X}| + H(F). \qquad \text{(since } X \text{ is uniformly distributed)}$$
Thus, the theorem holds.

The next corollary is an immediate consequence of the previous theorem.

Corollary 6.6 Let $2 \le t \le n-2$. For any function $F$ with sensitivity $S(F) = n$, if the $X_i$'s are independent and $H(X) \stackrel{\mathrm{def}}{=} H(X_1) = \cdots = H(X_n) = H(F)$, then the $t$-randomness $R(n, t, \mathcal{U})$ satisfies
$$R(n, t, \mathcal{U}) \ge \frac{t(t+1)}{2(n-t-1)}\,\log|\mathcal{X}|,$$
where $\mathcal{U}$ is the uniform probability distribution on the $n$-tuples of inputs $(x_1, \ldots, x_n)$.
If $t = n-c$, for some constant $c > 1$, then from the bound of Corollary 6.6 one can conclude that $R(n, n-c, \mathcal{U}) = \Omega(n^2) \log|\mathcal{X}|$; whereas, if $t = \beta n$, where $\beta$ is a constant such that $0 < \beta < 1$, then we get a linear lower bound, namely $R(n, \beta n, \mathcal{U}) = \Omega(n) \log|\mathcal{X}|$.
6.1 A Tight Lower Bound for Totally Private Protocols
In this section we consider the case of totally private protocols; that is, protocols after whose execution no coalition, of arbitrary size, can get any information about the inputs of the remaining players other than what can be deduced from their own inputs and the value of $F$. In the case of functions with sensitivity $n$, it is immediate to see that $(n-2)$-privacy implies $(n-1)$-privacy: indeed, if we know the value of the function $F$ and $n-1$ of its inputs, then the remaining input can be computed immediately. Therefore, in this section we restrict our attention to the case $t = n-2$. The next corollary holds when the private inputs are uniformly distributed.
Corollary 6.7 Let $\mathcal{U}_{n,k}$ be the uniform probability distribution on $n$-tuples of $k$-bit integers $(x_1, \ldots, x_n)$ and let $F$ be any function with sensitivity $n$. Then,
$$R(n, n-2, \mathcal{U}_{n,k}) \ge k\,\frac{(n-1)(n-2)}{2}.$$
Indeed, since $F$ has sensitivity $n$ and the inputs are uniformly distributed, the value of $F$ is itself uniformly distributed, so that $H(F) = k$; Theorem 6.5 with $t = n-2$ then gives $R(n, n-2, \mathcal{U}_{n,k}) \ge k\,n(n-3)/2 + k = k(n-1)(n-2)/2$. In the case of the MSum function the bound of the previous corollary is tight, in that the protocol presented in [9] to compute the sum modulo $q = 2^k$ of $n$ values uniformly and independently distributed in $\mathbb{Z}_q$ makes use of exactly $k(n-1)(n-2)/2$ random bits.
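To see where the count $k(n-1)(n-2)/2$ can come from, here is a pairwise-masking sketch in the spirit of the protocol of [9] (the code and its details are ours; the actual protocol of [9] may differ): each of the $(n-1)(n-2)/2$ unordered pairs of players among $P_2, \ldots, P_n$ shares one random element of $\mathbb{Z}_{2^k}$, i.e. $k(n-1)(n-2)/2$ random bits overall, and the masks cancel in the sum collected by $P_1$.

```python
import secrets

def pairwise_masked_sum(xs, k):
    """Sketch: sum of xs modulo 2**k using (n-1)(n-2)/2 random masks.

    Players are indexed 0..n-1, with player 0 acting as collector.  Each
    unordered pair {i, j}, 1 <= i < j <= n-1, shares one random element
    of Z_{2^k}: player i adds it and player j subtracts it, so the masks
    cancel in the total while blinding the individual messages.
    """
    n = len(xs)
    q = 1 << k
    masks = {(i, j): secrets.randbelow(q)
             for i in range(1, n) for j in range(i + 1, n)}
    msgs = []
    for i in range(1, n):
        m = xs[i]
        for j in range(1, n):
            if i < j:
                m = (m + masks[(i, j)]) % q
            elif j < i:
                m = (m - masks[(j, i)]) % q
        msgs.append(m)                # sent to player 0 over a private channel
    return (xs[0] + sum(msgs)) % q    # player 0 announces the result

print(pairwise_masked_sum([3, 5, 6, 7], 4))   # (3 + 5 + 6 + 7) mod 16 = 5
```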
References
[1] M. Ben-Or, S. Goldwasser, and A. Wigderson, Completeness Theorems for Non-Cryptographic Fault-Tolerant Distributed Computation, Proceedings of the 20th Annual ACM Symposium on Theory of Computing, 1988, pp. 1-10.
[2] C. Blundo, A. De Santis, and U. Vaccaro, Randomness in Distribution Protocols, to appear in: Information and Computation. A preliminary version appears in: ICALP '94, Vol. 820 of LNCS, 1994, pp. 568-579.
[3] C. Blundo, A. De Santis, G. Persiano, and U. Vaccaro, On the Number of Random Bits in Totally Private Computation, ICALP '95, Vol. 944 of LNCS, 1995, pp. 171-182.
[4] R. Canetti and O. Goldreich, Bounds on Tradeoffs Between Randomness and Communication Complexity, Computational Complexity, Vol. 3, 1993, pp. 141-167.
[5] S. Chari, P. Rohatgi, and A. Srinivasan, Randomness-Optimal Unique Element Isolation, with Applications to Perfect Matching and Related Problems, Proceedings of the 25th Annual ACM Symposium on Theory of Computing, 1993, pp. 458-467.
[6] D. Chaum, C. Crépeau, and I. Damgård, Multiparty Unconditionally Secure Protocols, Proceedings of the 20th Annual ACM Symposium on Theory of Computing, 1988, pp. 11-19.
[7] B. Chor, M. Gereb-Graus, and E. Kushilevitz, On the Structure of the Privacy Hierarchy, Journal of Cryptology, Vol. 7, 1994, pp. 53-60.
[8] B. Chor and E. Kushilevitz, A Zero-One Law for Boolean Privacy, SIAM J. Discrete Math., Vol. 4, 1991, pp. 36-47.
[9] B. Chor and E. Kushilevitz, A Communication-Privacy Tradeoff for Modular Addition, Information Processing Letters, Vol. 45, 1993, pp. 205-210.
[10] B. Chor and N. Shani, The Privacy of Dense Symmetric Functions, to appear in: Computational Complexity.
[11] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, 1991.
[12] R. Fleischer, H. Jung, and K. Mehlhorn, A Time-Randomness Tradeoff for Communication Complexity, 4th International Workshop on Distributed Algorithms, Vol. 486 of LNCS, 1991, pp. 390-401.
[13] R. Impagliazzo and D. Zuckerman, How to Recycle Random Bits, Proceedings of the 30th IEEE Symposium on Foundations of Computer Science, 1989, pp. 248-255.
[14] D. E. Knuth and A. C. Yao, The Complexity of Nonuniform Random Number Generation, in: Algorithms and Complexity, Academic Press, 1976, pp. 357-428.
[15] D. Krizanc, D. Peleg, and E. Upfal, A Time-Randomness Tradeoff for Oblivious Routing, Proceedings of the 20th Annual ACM Symposium on Theory of Computing, 1988, pp. 93-102.
[16] E. Kushilevitz, Privacy and Communication Complexity, SIAM J. Discrete Math., Vol. 5, 1992, pp. 273-284.
[17] E. Kushilevitz, S. Micali, and R. Ostrovsky, Universal Boolean Judges and their Characterization, Proceedings of the 35th IEEE Symposium on Foundations of Computer Science, 1994, pp. 478-489.
[18] E. Kushilevitz and Y. Mansour, Randomness in Private Computations, Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, 1996.
[19] E. Kushilevitz, R. Ostrovsky, and A. Rosén, Characterizing Linear Size Circuits in Terms of Privacy, Proceedings of the 28th Annual ACM Symposium on Theory of Computing, 1996.
[20] E. Kushilevitz and A. Rosén, A Randomness-Rounds Tradeoff in Private Computation, CRYPTO '94, Vol. 839 of LNCS, 1994, pp. 397-410.
[21] P. Raghavan and M. Snir, Memory Versus Randomization in On-line Algorithms, ICALP '89, Vol. 372 of LNCS, 1989, pp. 687-703.
[22] D. Zuckerman, Simulating BPP Using a General Weak Random Source, Proceedings of the 32nd IEEE Symposium on Foundations of Computer Science, 1991, pp. 79-89.