Random Bit Recycling, PCPs and the Complexity of NP *
Dimitris A. Fotakis 1,2 and Paul G. Spirakis 1,2
1 Computer Engineering and Informatics Department, Patras University, 265 00 Rion, Patras, Greece
2 Computer Technology Institute (CTI), Kolokotroni 3, 262 21 Patras, Greece
Email:
[email protected],
[email protected]
Abstract. In this paper, we study quantitative aspects of randomness in Probabilistically Checkable Proof (PCP) Systems for NP. We show that, under a general assumption, we can recycle random bits so as to reduce the need for true randomness in such PCP Systems. Then, we prove that a structural property of PCP systems for NP implies that the number of random bits can be decreased by small constant factors. Further, we study the decision and the optimization versions of a combinatorial problem expressed in expander graphs, and we describe an algorithmic procedure such that, if a certain family of instances has feasible solutions, the time complexity of this procedure bounds from above the time complexity of the whole class NP. Moreover, if we can solve these instances in time $O(2^{n^{1+\epsilon}})$, for some $\epsilon > 0$, then NP is included in $DTIME(2^{n^{1+\epsilon}})$. Since the Probabilistically Checkable Debate Systems for the class PSPACE are very similar to the Probabilistically Checkable Proof Systems for NP, we suggest that analogous results hold for the whole class PSPACE.
* This work was partially supported by ESPRIT LTR Project no. 20244, ALCOM-IT.
1 Introduction
All recent results on the hardness of approximating some NP-hard optimization problems rely on a probabilistic definition of NP based upon the complexity class of Probabilistically Checkable Proofs, PCP. According to this definition, the class NP contains exactly those languages whose membership proofs can be checked by a polynomial time probabilistic verifier that uses a logarithmic number of random bits and inspects a constant number of proof bits ([ALMSS92, AS92]). This can be summarized as NP = PCP(log n, 1). All known reductions from PCP systems to NP-hard optimization problems exploit the logarithmic randomness and the small number of proof bits that the verifier inspects in order to produce big gaps among the optimal values of the resulting instances (e.g. [FGLSS91, LY93, ALMSS92]). Further, the resulting non-approximability factors and the query bit complexity of the verifier are strongly related to each other (e.g. [BGS96]).
The notion of Probabilistically Checkable Proofs was generalized by [CFLS93] to the notion of Probabilistically Checkable Debates, PCD. A Probabilistically Checkable Debate System for a language L consists of a polynomial time probabilistic verifier and a debate between a player who claims that the input is in L and a player who claims that the input is not in L. Condon, Feigenbaum, Lund, and Shor characterized the class PSPACE as the class of languages for which there exists a PCD system such that the verifier uses a logarithmic number of random bits and inspects a constant number of the debate bits (i.e. PSPACE = PCD(log n, 1)). Further, this characterization of PSPACE was used for proving lower bounds on the difficulty of approximating some PSPACE-hard optimization problems ([CFLS93]). The subject of deriving (as tight as possible) non-approximability results via proof checking has already received quite a lot of exposure (see for example [AL96, Bel96, BGS96, Sud96]).
There has been a long line of work on improving the query bit complexity of the proof systems and thereby improving the non-approximability factors. In addition, there has been some work concerning randomness in interactive proofs and probabilistic proof checking. Bellare, Goldreich, and Goldwasser [BGG93] initiated a study of the quantitative aspects of randomness in interactive proofs. Their main result was a randomness-efficient technique for decreasing the error probability of an equivalent form of Interactive Proof Systems known as Arthur-Merlin Games. Also, Feige and Kilian [FK95] proved that random bit recycling meets its limits when two-prover one-round (MIP(2, 1)) proof systems are concerned. More specifically, they proved that:
- There are MIP(2, 1) systems for which no pseudo-random "straight" verifier can reduce the error below a constant using logarithmic randomness.
- There are MIP(2, 1) systems for which no pseudo-random "clever" verifier can reduce the error below $\log^{-c} n$ using logarithmic randomness.
- There are MIP(2, 1) systems for which "straight" repetition does not reduce the error below a constant, but "clever" repetition does.
Recently, Fotakis and Spirakis [FS96a] used a novel method for reducing randomness in some hypothetical probabilistic proof systems so as to prove that P = PCP(log log n, 1). Their technique consists of recursive random walks on expander graphs of different sizes, and it decreases, by large constant factors, the number of random bits used by systems of low randomness.
In this paper, we study quantitative aspects of randomness in PCP systems for NP, an almost unaddressed topic. We provide a general assumption under which a generalized
version of the method of [FS96a] can be applied to PCP systems of logarithmic randomness. Then, we prove that a structural property of PCP systems for NP implies that the number of random bits can be decreased by small constant factors, since the necessary general assumption can be deduced from this property. Further, we study the decision and the optimization versions of a combinatorial problem expressed in expander graphs, and we prove that if a certain family of instances of the decision version are yes-instances, then there exists an algorithmic procedure whose time complexity bounds from above the complexity of the class NP. Moreover, if we can compute the solutions of a family of instances in time $O(2^{n^{1+\epsilon}})$, for some $\epsilon > 0$, then any language in NP can be decided in deterministic time $O(2^{n^{1+\epsilon}})$. Since the Probabilistically Checkable Debate Systems for the class PSPACE are very similar to the Probabilistically Checkable Proof Systems for NP, we suggest that analogous results hold for the whole class PSPACE.
As stated in [FK95], a natural question is to explore the limits of the principle of random bit recycling. The work of Feige and Kilian captures a "phase transition" from MIP(2, 1) systems where random bits can be successfully recycled to MIP(2, 1) systems where random bits cannot be recycled. Our results suggest that PCP and PCD systems share a structural property which leads the principle of random bit recycling to success. Also, we study some properties of the PCP and PCD systems in which random bits can be successfully recycled. Clearly, this is a step towards understanding in general which probabilistic situations lend themselves to recycling of random bits, and which do not.
Our results may be thought of as a generalization of the results in the area of random bit recycling (e.g. [IZ89], [BGG93]). The main strategies for random bit recycling achieve smaller error probabilities without using more randomness.
Further, they do not need to consider the internal details of the algorithms to which they apply. We show that if a randomized algorithm satisfies some assumptions, then we can reduce the need for true randomness without increasing the error probability. Our assumptions are quite restrictive and are not expected to be satisfied by a general randomized algorithm. On the other hand, they are expected to be satisfied by PCP and PCD Systems. Additionally, as far as we know, no connection has been suggested between the random bit and the query bit complexity of a PCP, or a PCD, System and the time complexity of the whole class NP, respectively PSPACE, apart from the trivial one. Moreover, we define a family of decision and optimization combinatorial problems whose time complexity is closely related to the time complexity of very important complexity classes like NP and PSPACE, a result that seems interesting in its own right.
From now on, we only consider one-sided error randomized systems, and we follow the notation of [Aro94] and [HPS94], which contain similar self-contained proofs of the PCP Theorem.
1.1 Probabilistically Checkable Proofs
A verifier V is a probabilistic polynomial time Turing machine with access to an input x and a string r of random bits. Furthermore, the verifier has access to a proof $\pi$ via an oracle, which takes as input a position of the proof the verifier wants to query and outputs the corresponding bit of the proof $\pi$. The result of V's computation, denoted by $V(x, r, \pi)$, is either accept or reject. A verifier is (r(n), q(n))-restricted if on each input of size n it uses at most $\hat{r}(n) = O(r(n))$ random bits for its computation, and queries at most $\hat{q}(n) = O(q(n))$ bits from the proof.
Definition 1 (Arora, Safra [AS92]). A language L is in PCP(r(n), q(n)) iff there exists an (r(n), q(n))-restricted verifier V such that:
(a) for all $x \in L$ there exists a proof $\pi_x$ such that $\mathrm{Prob}_r[V(x, r, \pi_x) = \mathrm{accept}] = 1$, while
(b) for all $x \notin L$ every proof $\pi$ satisfies $\mathrm{Prob}_r[V(x, r, \pi) = \mathrm{accept}] \le 1/4$.
Let $PCP_{\epsilon}(\cdot, \cdot)$ denote the class of languages defined in the same way as $PCP(\cdot, \cdot)$ except that the constant 1/4 in Definition 1 is replaced by $\epsilon$. Since the probability of getting a wrong answer can be made arbitrarily (but constantly) small by repeating the run of the restricted verifier a (suitable) constant number of times, we have $PCP_{\epsilon}(\cdot, \cdot) = PCP(\cdot, \cdot)$ for any constant $\epsilon$.
A verifier is non-adaptive if the positions it queries from the proof depend solely on the random string r, and are independent of the outcome of any previously queried positions. In contrast, the original definition of PCP ([AS92]) allowed the verifier to be adaptive, i.e. to base its next query on the bits it had already read from the proof $\pi$. Obviously, given a proof $\pi$ for a verifier V, the number of different sets of proof positions that can be addressed by a random string for V is an important parameter for an adaptive verifier. We define ([FS96b]) a verifier V to be q-adaptive if, for all possible proofs, the number of different sets of proof positions that can be addressed by a random string for V is at most q. Since, for all possible proofs, the number of different sets of proof positions that can be addressed by a non-adaptive verifier is exactly one, a non-adaptive verifier is 1-adaptive. Clearly, a verifier that queries $\hat{q}(n)$ proof bits is $O(2^{\hat{q}(n)})$-adaptive.
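The amplification by independent repetition mentioned above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a one-sided error verifier that is rerun with fresh random bits and accepts only if every run accepts, so that the soundness error multiplies; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def repetitions_for_error(target: Fraction, base: Fraction = Fraction(1, 4)) -> int:
    """Smallest t with base**t <= target, for a one-sided error verifier
    whose single run has soundness error `base` (1/4 in Definition 1)."""
    if not (0 < target < 1 and 0 < base < 1):
        raise ValueError("target and base must lie strictly between 0 and 1")
    t, err = 0, Fraction(1)
    while err > target:
        err *= base  # each independent run multiplies the error by `base`
        t += 1
    return t


def amplified_cost(r_hat: int, q_hat: int, t: int) -> tuple:
    """Random bits and proof bits used by t independent repetitions."""
    return t * r_hat, t * q_hat
```

With independent repetitions both the random bit and the query bit complexity grow by the factor t, which stays constant for constant target error. Random bit recycling aims at paying less than t times the original number of random bits.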
1.2 Random Bit Recycling
Random bit recycling, that is replacing independent random bits by dependent random bits extracted from a pseudo-random bit generator, has been a successful enterprise in many scenarios, including cryptography, NC computations, space bounded computation, RP and BPP algorithms, and interactive proofs. A wide variety of techniques have been introduced for this purpose. A method presented in [AKS87] and [IZ89] uses random walks on expander graphs for producing long pseudo-random bit strings. The following lemma gives the theoretical background for this method:
Lemma 2 (Ajtai, Komlos, Szemeredi [AKS87]). Let $\mathcal{G}$ be an infinite family of d-regular graphs with the following property: if G = (V, E) is a member of $\mathcal{G}$ and A denotes its adjacency matrix multiplied by 1/d, then all but the largest eigenvalue of A are less than 1 and positive. Then for every subset C of V with $|C| \le |V|/16$ there exists a constant c such that the probability that a random walk on G of length $c \cdot k$ arrives in every c-th step in a vertex of C is at most $2^{-k}$.
The existence of families of graphs $\mathcal{G}$ satisfying the requirements of Lemma 2 is based on the existence of constant degree expanders. An explicit construction of constant degree expanders is given by Gabber and Galil [GG81]. The so-called Gabber-Galil expander has the advantage that we do not need to explicitly construct the entire graph. In particular, for any vertex in the expander, it is possible to compute the neighboring vertices in time polynomial in $\log |V|$. Random walks on Gabber-Galil expanders are used (e.g. [MR95, IZ89, BGG93]) for obtaining probability amplification results for randomized algorithms. We also use the following generalization of Lemma 2 that is proved in [BGG93] using the ideas of [AKS87]:
Lemma 3 (Bellare, Goldreich, Goldwasser [BGG93]). For any family $\mathcal{G}$ of d-regular expander graphs there is a constant $\lambda < 1$ such that the following is true. Suppose $\delta < 1/2$ and let $c = \log_{\lambda} \delta$. Let $k \in \mathbb{N}$ and let $C_1, \ldots, C_k$ be subsets of V which have density at most $\delta$. Let b be an integer and let $1 \le j_1 < \cdots < j_b \le k$ be a sequence of indices between 1 and k. Consider a random walk of length $c \cdot k$ on the expander and denote by $Y_j$ the vertex visited at time $c \cdot j$ for $j = 1, \ldots, k$. Then $\mathrm{Prob}[Y_{j_1} \in C_{j_1}, \ldots, Y_{j_b} \in C_{j_b}] \le (2\delta)^{b/2}$.
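The sampling scheme behind Lemma 2 and Lemma 3 (pick a random start vertex, walk on a constant degree graph, keep every c-th vertex) can be sketched as follows. This is only an illustration under stated assumptions: the neighbor rule below is a stand-in on $Z_m \times Z_m$ in the spirit of explicit expander constructions, it is not claimed to be the exact construction of [GG81] nor to satisfy the eigenvalue condition of Lemma 2, and the names `neighbors` and `walk_samples` are hypothetical. The point is the accounting of random bits and the fact that neighbors are computed locally, without building the graph.

```python
import random

DEGREE = 8  # every vertex has 8 neighbors, so one step costs 3 random bits


def neighbors(v, m):
    """Neighbors of v = (x, y) in an 8-regular multigraph on Z_m x Z_m.
    Assumed stand-in rule: four affine maps together with their inverses."""
    x, y = v
    return [
        ((x + 2 * y) % m, y), ((x - 2 * y) % m, y),
        ((x + 2 * y + 1) % m, y), ((x - 2 * y - 1) % m, y),
        (x, (y + 2 * x) % m), (x, (y - 2 * x) % m),
        (x, (y + 2 * x + 1) % m), (x, (y - 2 * x - 1) % m),
    ]


def walk_samples(m, k, c, rng):
    """Walk c*k steps from a random start and keep every c-th vertex.
    Returns (samples, random_bits_used); m is assumed to be a power of two."""
    v = (rng.randrange(m), rng.randrange(m))
    bits = 2 * (m - 1).bit_length()  # bits needed to name the start vertex
    samples = []
    for step in range(1, c * k + 1):
        v = neighbors(v, m)[rng.randrange(DEGREE)]
        bits += 3  # log2(DEGREE) bits per step
        if step % c == 0:
            samples.append(v)
    return samples, bits
```

For example, with m = 1024 the start vertex costs 20 bits, and k = 5 samples with c = 2 cost 20 + 30 = 50 bits in total, whereas five independent vertices would cost 100 bits.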
1.3 Outline of the paper
In Section 2, we show how to reduce the need for true randomness in probabilistic systems that satisfy a family of general assumptions. In Section 3.1, we describe the trivial connection between the random and the query bit complexity of a verifier and the deterministic time complexity of the corresponding class. In Section 3.2, we prove that the random bit complexity of PCP systems for NP can be decreased by small constant factors. In Section 3.3, we describe an algorithmic procedure such that (under a combinatorial assumption) its time complexity bounds from above the deterministic time complexity of the class NP. We conclude with some open problems. Due to lack of space, the proofs of Lemma 4, Lemma 6 and Lemma 7 can be found in the appendix.
2 How to Reduce Randomness Using Random Walks
Lemma 2 implies the following well known probability amplification result in a straightforward manner:
Lemma 4 (Fotakis, Spirakis [FS96a]). Let $L \in PCP_{1/16}(r(n), q(n))$. Then for any $0 < \epsilon < 1$, L has a restricted verifier $R'$ achieving an error rate $\epsilon$, using $\hat{r}(n) + c \log\frac{1}{\epsilon} (\log d)$ random bits (c, d are the constants of Lemma 2) and inspecting $\hat{q}(n) \log\frac{1}{\epsilon}$ proof bits.
Proof. See the appendix.
Let L be a language in PCP(r(n), q(n)) and S be the set of all bit strings of length $\hat{r}(n)$. Further, for any fixed integer $\kappa \ge 1$, let $S_{\kappa}$ be the set of all bit strings of length $\hat{r}(n)/\kappa$. Let us assume that there exist an integer $\kappa$, $\hat{r}(n)^{1/2} > \kappa > 1$, and a sequence of sets $C_1, C_2, \ldots, C_{\kappa} \subseteq S_{\kappa}$ such that the following proposition holds:
Proposition 5. A bit string $s \in S$ causes the verifier to result in a wrong outcome iff there exist substrings $m_1 \in C_1, m_2 \in C_2, \ldots, m_{\kappa} \in C_{\kappa}$ such that $s = m_1 m_2 \cdots m_{\kappa}$.
Obviously, Proposition 5 characterizes the set of "misleading" strings as the set of the strings that can be produced by the concatenation of substrings $m_i \in C_i$, $C_i \subseteq S_{\kappa}$, $i = 1, \ldots, \kappa$.
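The product structure that Proposition 5 asks for can be illustrated on toy data. The sketch below assumes the random string is cut into equally long consecutive blocks and that each block has its own set of "bad" substrings; the names `split_blocks`, `is_misleading` and `misleading_set` and the variable `kappa` (the number of blocks) are chosen here for illustration only.

```python
from itertools import product


def split_blocks(s: str, kappa: int) -> list:
    """Cut the bit string s into kappa consecutive blocks of equal length."""
    if kappa <= 0 or len(s) % kappa != 0:
        raise ValueError("len(s) must be a positive multiple of kappa")
    width = len(s) // kappa
    return [s[i * width:(i + 1) * width] for i in range(kappa)]


def is_misleading(s: str, block_sets: list) -> bool:
    """s is misleading iff its i-th block lies in the i-th set, for every i."""
    blocks = split_blocks(s, len(block_sets))
    return all(b in c_i for b, c_i in zip(blocks, block_sets))


def misleading_set(block_sets: list) -> set:
    """All concatenations m_1 m_2 ... with m_i taken from the i-th set."""
    return {"".join(parts) for parts in product(*block_sets)}
```

For instance, with block sets {"00", "01"} and {"11"} the misleading strings are exactly "0011" and "0111", so their number is the product of the sizes of the block sets.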
In the following, we assume that we can calculate some $\kappa > 1$ such that Proposition 5 holds, and we prove that the techniques of [FS96a] can be applied to general PCP Systems.
Lemma 6. Let L be any language in PCP(r(n), q(n)) and let $\kappa > 1$ be an integer such that Proposition 5 holds. Then, there exists a restricted verifier $\hat{R}$ for L that uses $\hat{r}(n)/\kappa + O(\kappa)$ random bits, inspects $O(\kappa\, \hat{q}(n))$ proof bits and achieves error rate $2^{-O(\kappa)}$.
Proof. See the appendix.
Lemma 6 implies that we can reduce the number of random bits used by a Probabilistically Checkable Proof System if there exist an integer $\kappa > 1$ and sets $C_1, C_2, \ldots, C_{\kappa} \subseteq S_{\kappa}$ such that the strings $s_1 s_2 \cdots s_{\kappa}$, $s_i \in C_i$, correspond to the "misleading" strings of the original system. Using similar arguments, we can prove analogous lemmas for the following cases:
(A) There exist an integer $\kappa > 1$ and at least one set $C_j \subseteq S_{\kappa}$, $|C_j| < |S_{\kappa}|/2$, such that the strings $s_1 s_2 \cdots s_{\kappa}$, $s_j \in C_j$ and $s_i \in S_{\kappa}$, $i \ne j$, correspond to the "misleading" strings.
(B) There exist an integer $\kappa > 1$ and sets $C_1, C_2, \ldots, C_{\kappa} \subseteq S_{\kappa}$ such that the strings $s_1 s_2 \cdots s_{\kappa}$, $s_i \in C_i$, correspond to the "correct" strings of the original system. The query bit complexity of the resulting verifier is $O(2^{\kappa} \hat{q}(n))$.
(C) There exist an integer $\kappa > 1$ and at least one set $C_j \subseteq S_{\kappa}$, $|C_j| > |S_{\kappa}|/2$, such that the strings $s_1 s_2 \cdots s_{\kappa}$, $s_j \in C_j$ and $s_i \in S_{\kappa}$, $i \ne j$, correspond to the "correct" strings. The query bit complexity of the resulting verifier is $O(2^{\kappa} \hat{q}(n))$.
Thus, we have obtained a family of sufficient conditions for reducing the need for true randomness. Lemma 4 and Lemma 6 can be further generalized so as to cover any one-sided error randomized system that is susceptible to random bit recycling. However, since the assumptions are quite general and artificial, our techniques seem difficult to apply to arbitrary natural randomized systems. In the following, we show how to translate the aforementioned conditions to natural situations in the case of Probabilistic Proof Checking.
3 Random Bit Recycling in PCP Systems
3.1 Comparing Random and Query Bit Complexity with Time Complexity
We begin with a well known lemma that shows the trivial connection between the random bit and the query bit complexity of deciding, using a PCP system, whether an input x is in a language L, and the time complexity of solving the same problem deterministically. We provide a generalized version of this result that also holds for adaptive verifiers.
Lemma 7 (Fotakis, Spirakis [FS96b]). Let L be any language over $\Sigma$ and R a g(n)-adaptive (r(n), q(n))-restricted verifier for L. Then for any $x \in \Sigma^*$ there exists a Boolean formula $B_x$ of size $O(2^{\hat{r}(n)} \hat{q}(n) g(n))$ that is satisfiable iff $x \in L$.
Proof. See the appendix.
Remark. Using a reduction similar to Theorem 16.1 of [Pap94], we can also reduce the optimal debate problem to an instance of the Quantified Boolean Formula problem of size $O(\mathrm{poly}(n)\, 2^{\hat{r}(n)} \hat{q}(n) g(n))$. Another example of such a reduction can be found in [CFLS93], where the optimal debate problem is reduced to the Maximum Quantified 3-Sat problem.
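One way to see where the factor $2^{\hat{r}(n)}$ in Lemma 7 comes from is that a deterministic simulation has to account for every random string of the verifier. The following sketch only illustrates this enumeration for a fixed proof; it is not the construction used in the proof of Lemma 7 (which is deferred to the appendix), and `toy_verifier` is a hypothetical stand-in that reads a single proof bit.

```python
from itertools import product


def acceptance_probability(verifier, x: str, proof: str, r_bits: int) -> float:
    """Exact acceptance probability of `verifier` on (x, proof), obtained by
    running it on all 2**r_bits random strings instead of sampling one."""
    accepted = 0
    total = 0
    for bits in product("01", repeat=r_bits):
        total += 1
        if verifier(x, "".join(bits), proof):
            accepted += 1
    return accepted / total


def toy_verifier(x: str, r: str, proof: str) -> bool:
    """Hypothetical non-adaptive verifier: the random string selects one
    position, and the verifier accepts iff the proof agrees with x there."""
    i = int(r, 2) % len(proof)
    return proof[i] == x[i % len(x)]
```

The loop body runs $2^{\hat{r}(n)}$ times, so with logarithmically many random bits the enumeration is polynomial in n, while each saved random bit halves its cost.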
3.2 Decreasing Randomness By Small Constant Factors
It is straightforward that our techniques can be applied to a probabilistic system which uses $\hat{r}(n)$ random bits and fulfills the following condition:
Proposition 8. There exist an integer $\kappa > 1$ and $\hat{r}(n)/\kappa$ fixed positions of the random string, that form a string $s \in C_j$, such that:
- For any value of the string consisting of the remaining $\frac{\kappa - 1}{\kappa} \hat{r}(n)$ positions, less than half of the values of s cause the system to result in a wrong outcome.
Obviously, this probabilistic system fulfills case (A) of Proposition 5. Thus, we can apply the corresponding version of Lemma 6 so as to decrease the error rate of the system while saving random bits. In the following, for the sake of simplicity, we always assume that the $\hat{r}(n)/\kappa$ positions that form the string $s \in C_j$ are consecutive in the random string r. Then, we prove that for the PCP Systems there exists a $\kappa > 1$ such that Proposition 8 holds. Thus, we can apply our techniques so as to reduce the need for true randomness while decreasing the failure probability.
Let us consider the first PCP System for NP that uses logarithmic randomness and inspects a constant number of proof bits, as described in [Aro94] and [HPS94]. The corresponding verifier V is non-adaptive and consists of the composition of a (log n, log log n)-restricted verifier and an $(n^3, 1)$-restricted verifier. The former verifier results from the composition of a (log n, log n)-restricted non-adaptive verifier with itself. The verifier's operation may be viewed as having three stages. The first stage reads the input and the random string, and decides what locations to examine in the proof $\pi$. The second stage reads the symbols from $\pi$. The third stage decides whether or not to accept. The proof locations which the verifier examines are used for performing the following randomized algebraic tests:
- the Low Degree Test that checks whether a given function is ($\delta$-)close to a low degree polynomial;
- the Zero Test that, given a function that is a low degree polynomial, checks whether this function is identically zero; and
- the Consistency Test that is used by the composition lemma.
The verifier accepts iff all the tests succeed, otherwise it rejects. Since the verifier V is non-adaptive, the tests are performed independently and, given the input x, the examined proof locations only depend on the random string. Also, as we can deduce from the calculation of the error rate of the verifier V, each application of a test uses its own (disjoint) random substring. The probability that one repetition of the above tests fails depends on various parameters, but its typical value is no more than 1/4. Each of the above tests is invoked repeatedly in order to achieve a very small failure probability, because the error rate of the verifier is bounded by the sum of the failure probabilities of all the tests. A large number of random bits is usually wasted on the repeated applications of the tests.
Obviously, since the verifier is non-adaptive and the failure probabilities of the tests are no more than 1/4, Proposition 8 holds for all the invocations of the above tests of this PCP System. Since this implies the more general Proposition 5 for the verifier V, we can apply
Lemma 6 for obtaining a verifier $\hat{V}$ for 3-Sat that achieves a smaller error rate and uses fewer truly random bits than the original verifier.
Let us now compare the random bit complexity of the verifiers V and $\hat{V}$. Let us assume that the size of the proof is n, that V performs a total number of $t$ tests, and that each invocation of a test consists of $\kappa$ independent repetitions. Further, assume, without loss of generality, that each repetition of a test uses $\log n$ random bits. Thus, the random bit complexity of the verifier V is $t \kappa \log n$. Further, the verifier $\hat{V}$ uses only $\log n + O(\kappa)$ truly random bits per invocation of a test. Consequently, the random bit complexity of the verifier $\hat{V}$ is $t \log n + O(t \kappa)$. The decreasing factor $\kappa$ is a small constant for the PCP Systems described in [Aro94, HPS94]. The factor $\kappa$ depends on the proof and on the types of the tests used by the verifier V. A typical value of $\kappa$ lies in the range between 2 and 4. The previous observations can be summarized in the following:
Theorem 9. Random bit recycling can be applied to restricted verifiers for NP in order to decrease the random bit complexity by a small constant factor $\kappa$ that depends on the verifier.
However, the query bit complexity of the verifier V is increased by a small constant factor $\kappa'$ that depends on the failure probabilities of the various tests. If the failure probabilities of all the tests are no more than 1/4, then $\kappa'$ is almost equal to 2. As a final remark, since the reduction of Lemma 7 results in smaller Boolean formulae $B_x$ for the verifier $\hat{V}$ than for V, we conclude that the total complexity of the verifier $\hat{V}$ is smaller than the complexity of V.
Remark. The family of verifiers described in [BGS96] results from the composition of the canonical verifier of [Raz95] and a family of "inner" verifiers. Our techniques are also applicable to these verifiers, since the results of Feige and Kilian [FK95] suggest that there exist MIP(2, 1) systems for which pseudo-random "clever" canonical verifiers can reduce the error rate down to any constant.
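The comparison of random bit complexities in this section is simple arithmetic, and a small numerical sketch may help. It assumes that every repetition of a test uses $\lceil \log_2 n \rceil$ random bits, that a test invocation consists of `reps` repetitions, and that the additive term hidden in the O-notation is `c * reps` for an assumed constant `c`; the function names, the parameter names and the value of `c` are all hypothetical.

```python
import math


def bits_independent(n: int, tests: int, reps: int) -> int:
    """Random bits when every repetition of every test uses fresh bits."""
    return tests * reps * math.ceil(math.log2(n))


def bits_recycled(n: int, tests: int, reps: int, c: int = 3) -> int:
    """Random bits when each invocation pays log n bits once, plus an
    assumed c bits per repetition for the steps of the random walk."""
    return tests * (math.ceil(math.log2(n)) + c * reps)
```

For n = 2^20, five tests and four repetitions per test, the independent scheme uses 400 random bits and the recycled one 160, a ratio of 2.5. For fixed `reps` and `c` the ratio approaches `reps` as n grows, which matches the small constant decreasing factor discussed above.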
3.3 A Combinatorial Problem Equivalent to Proposition 5
In this section, we define a combinatorial problem equivalent to Proposition 5. Thus, if we can solve this problem efficiently, then we can apply our techniques for reducing the need for true randomness in randomized algorithms. Further, under the assumption that a family of instances are feasible instances, we describe an algorithmic procedure whose time complexity provides a highly non-trivial upper bound on the deterministic time complexity of the class NP.
Let G(V, E) be a d-regular expander graph of $N = 2^{\hat{r}(n)}$ vertices. Assume that G is a member of a family of graphs $\mathcal{G}$ and fulfills the requirements of Lemma 2 and Lemma 3. Let F be a computable one-to-one function that assigns bit strings of length $\hat{r}(n)$ to the vertices of G, $F : V \to S$. We would like to find a pair $(g(N), \kappa)$, $g(N) < N^{1/2}$ and $\kappa > 2$, such that there exists a function F with the following property:
Property 10. For every set of vertices $C \subseteq V$, $|C| \ge g(N)$, such that the induced subgraph $G_C(C, E_C)$ consists of a single connected component, there exist sets $C_1, \ldots, C_{\kappa} \subseteq S_{\kappa}$ of size $|C_1| \cdots |C_{\kappa}| = |C|$ such that a vertex u is in C iff it is assigned (by F) a bit string that is produced by members of the $C_i$ (i.e. $u \in C \Leftrightarrow F(u) = s_1 \cdots s_{\kappa}$, $s_i \in C_i$, $i = 1, \ldots, \kappa$).
Now we are ready to define the decision and the optimization versions of the corresponding combinatorial problem:
Problem 11 (General Decision Version). Does there exist a triple $(G(V, E), g(N), \kappa)$, defined as previously, such that there exists a computable function $F : V \to S$ with Property 10?
Problem 12 (General Optimization Version). Find a triple $(G(V, E), g(N), \kappa)$ that minimizes the function $2^{\hat{r}(n)/\kappa} g(N)$, such that there exists a computable function $F : V \to S$ with Property 10.
Additionally, we can define two decision-optimization pairs of problems similarly, if we consider g(N), respectively $\kappa$, as given parameters of the problem. From now on, we call these problems $\kappa$-Decision/Optimization, respectively g-Decision/Optimization. Obviously, the $\kappa$-Optimization is a maximization problem while the g-Optimization is a minimization problem, as deduced from the definition of the General Optimization Version.
Let us return to the PCP Systems and the proofs of Lemma 4 and Lemma 6 in order to show how this family of artificial combinatorial Decision/Optimization pairs of problems can be exploited. Consider a (log n, 1)-restricted verifier R for an arbitrary language in NP that uses $\hat{r}(n) = \alpha \log n$ random bits, inspects $\hat{q}(n) = \beta$ proof bits and achieves error rate 1/16. Assume that we can compute a d-regular expander graph G(V, E), $|V| = N = n^{\alpha}$, a function g(N) and a constant $\kappa$, such that there exists a computable function $F : V \to S$ with Property 10. Also, consider the set $C \subseteq V$, $|C| \le |V|/16$, of the vertices of G that cause the verifier R to result in a wrong outcome. Since we assume that we can compute a feasible solution to Problem 12, there exist the following possibilities for the distribution of the "misleading" vertices to induced connected subgraphs of G:
(1) The "misleading" vertices form small induced connected components of size less than g(N).
(2) The "misleading" vertices form large induced connected components $G_{C'}(C', E_{C'})$, $C' \subseteq C$, of size at least g(N). However, Property 10 implies that there exist sets $C_1, \ldots, C_{\kappa} \subseteq S_{\kappa}$ of size $|C_1| \cdots |C_{\kappa}| = |C'|$ such that a vertex u is in $C \cap C'$ iff the corresponding bit string consists of the concatenation of substrings $s_i \in C_i$, $i = 1, \ldots, \kappa$.
(3) Both cases (1) and (2).
Note that case (2) can be expressed by Proposition 5 (more specifically, by case (A) of Proposition 5). Thus we can use the techniques of Lemma 4 and Lemma 6 for avoiding the large induced connected components consisting of "misleading" vertices. The techniques of Lemma 4 and Lemma 6 ensure that we have to use only $\hat{r}(n)/\kappa + O(\kappa)$ random bits in order to avoid the large components of "misleading" vertices, except with probability no more than $2^{-O(1)}$. Further, if each time we visit a vertex $v \in V$ in the random walk on G (see the proof of Lemma 4) we perform an arbitrary walk of length g(N), then we avoid the small components of "misleading" vertices (case (1)) with probability 1. Consequently, the resulting verifier $R'$ uses $\hat{r}(n)/\kappa + O(\kappa)$ random bits, inspects $O(g(N)\, \hat{q}(n))$ proof bits and achieves error rate no more than 1/4. However, the running time of the verifier $R'$ consists of the time for the verification procedure, that is O(poly(n)), plus the time for finding the triple $(G, g(N), \kappa)$ and computing the function F. Thus, we have proved the following:
Theorem 13. Let $L \in PCP(r(n), q(n))$ and R be the corresponding verifier that runs in time $T_R(n)$. If there exists at least one feasible triple $(G, g(N), \kappa)$ to Problem 12 such that both the triple and the function F can be computed by an algorithm A in time $T_A(n)$, then there exists a verifier $R'$ for L that uses $\hat{r}(n)/\kappa + O(\kappa)$ random bits, inspects $O(g(N)\, \hat{q}(n))$ proof bits and achieves error rate no more than 1/4. Further, the running time of $R'$ is $T_{R'}(n) = T_R(n) + T_A(n) + \mathrm{poly}(n)$.
In the following, consider the aforementioned verifier R for an arbitrary language in NP
and assume that the answer to the Decision Problem 11 is yes for some triples. We are interested in the $\kappa$-Optimization Problem, i.e. we fix the d-regular expander graph G(V, E) of $N = n^{\alpha}$ vertices and an integer g(N), and we try to find the biggest possible $\kappa$ for which we can compute a function $F : V \to S$ with Property 10. Let A be an algorithm that finds such a $\kappa$ and F, and let $T_A(N)$ be its time complexity. We are only interested in algorithms that run in time at most subexponential in N. Then, we use Lemma 7 and Theorem 13 for showing that, provided that Problem 12 is feasible, the complexity of the $\kappa$-Optimization Problem is closely related to the complexity of any problem in the class NP. Let us consider the following cases:
(A) g(N) = O(1)
(A.1) If the best possible feasible $\kappa$ is equal to $\alpha$, then for any language in NP there exists a verifier $R'$ that uses exactly log n + O(1) random bits and inspects O(1) proof bits. Thus, Lemma 7 implies that we can decide any language in NP in time $O(2^{n} + T_A(N)) = O(2^{n} + T_A(n^{\alpha}))$. If $T_A(N) = O(2^{N^{1/\alpha}}) = O(2^{n})$, then any language in NP can be decided in time $O(2^{n})$. Consequently, $NP \subseteq DTIME(2^{n}) = DTIME(T_A(N))$.
(A.2) If there exists feasible $\kappa$ with $\alpha < \kappa$, then the same approach yields that any language in NP can be decided in time $O(2^{n^{\alpha/\kappa}} + T_A(n^{\alpha}))$. Further, if $T_A(n^{\alpha}) = o(2^{n})$, then any language in NP can be decided in subexponential, $o(2^{n})$, time.
(A.3) If there exists feasible $\kappa$ with $\alpha/\kappa = o(1)$, i.e. $\alpha/\kappa = \frac{1}{\log\log\log n}$, then an observation of [AS92] about PCP(o(log n), o(log n)) implies that any language in NP can be decided in time $O(T_A(n^{\alpha}) + \mathrm{poly}(n))$.
(B) g(N) = o(n)
(B.1) Using similar arguments, we can deduce that any language $L \in NP$ has a restricted verifier that uses $\frac{\alpha}{\kappa} \log n + O(1)$ random bits, inspects o(n) proof bits, achieves error rate no more than 1/4 and runs in time $O(T_A(N) + \mathrm{poly}(n))$. Consequently, Lemma 7 implies that for any input x we can construct a Boolean formula $B_x$ of size $O(n^{\alpha/\kappa}\, o(n))$ that is satisfiable iff $x \in L$, where $\kappa$ is the best possible integer for which Problem 12 is feasible. Provided that $\alpha/\kappa \le \epsilon$ and $T_A(N) = O(2^{n^{1+\epsilon}})$ for a suitable $\epsilon > 0$, we can decide any language in NP in time $O(2^{n^{1+\epsilon}})$. Thus, NP is included in $DTIME(2^{n^{1+\epsilon}})$ for some $\epsilon > 0$.
A straightforward generalization of the previous arguments implies the following:
Theorem 14. If there exists a feasible $\kappa$ such that both $\kappa$ and the function F can be computed in time $T_A(n)$, then we can decide any language in NP in time $O(2^{n^{\alpha/\kappa} g(N)} + T_A(n))$.
Obviously, if $T_A(n) = \Omega(2^{n^{\alpha/\kappa}})$ then the time complexity of the $\kappa$-Optimization Problem bounds from above the time complexity of the whole class NP. Further, the same holds for the General Optimization Version (Problem 12).
4 Conclusions
Let us recall our main results. We showed how to use random walks on expander graphs so as to reduce the need for true randomness in one-sided error probabilistic systems while decreasing the failure probability. Then, we exploited these technical results in order to prove that the random bit complexity of PCP Systems can be decreased by small constant factors. Finally, under some assumptions on the feasibility of a family of well-defined instances of a combinatorial optimization problem expressed in expander graphs, we described an algorithmic procedure whose time complexity provides a non-trivial upper bound on the deterministic time complexity of the whole class NP. More specifically, based on the assumption that a family of instances of the $\kappa$-Optimization Problem are feasible and that we can solve them in time $O(2^{n^{1+\epsilon}})$, for some $\epsilon > 0$, we showed that there exists an $\epsilon' > 0$ such that the whole class NP is included in $DTIME(2^{n^{1+\epsilon'}})$. Moreover, since PCD Systems are very similar to PCP Systems, our results can be generalized to the class PSPACE, a topic that is of independent interest. We conclude with some areas for further investigation:
- A lower bound for parameter $\kappa$ and an upper bound for parameter g(N), such that Problem 12 is feasible, should be found. Note that there exists a quite simple proof that $(G, N^{1/2}, 2)$ is a yes instance of Problem 11.
- For specific interesting values of g(N), e.g. O(1), poly(log n), and o(n), the values of $\kappa$ should be investigated in order to establish a feasible region for the $\kappa$-Optimization Problem.
- An algorithm for either the General Optimization Problem or the restricted problems would be of independent interest. It is quite important for such an algorithm to operate efficiently for a wide range of parameters and to run in time subexponential in the number of vertices of the expander graph.
- It is quite important to investigate the assumptions under which our techniques can be applied recursively so as to obtain tight results.
- We provide a family of sufficient conditions for the probabilistic systems to which our techniques can be applied. Further investigation is needed so as to obtain an exact characterization of them.
Acknowledgements: We wish to thank Sanjeev Arora, Christos Papadimitriou, and Madhu
Sudan for their insightful comments on earlier drafts of this paper. We also would like to thank Luca Trevisan for helpful discussions on the subject of [FS96a].
References
[AKS87] M. Ajtai, J. Komlós, and E. Szemerédi. Deterministic simulation in logspace. Proc. of the 19th ACM Symposium on Theory of Computing, pp. 132-140, 1987.
[Aro94] S. Arora. Probabilistic Checking of Proofs and Hardness of Approximation Problems. Revised version of PhD Thesis, CS Division, UC Berkeley, 1994. CS-TR-476-94. Available from ftp.cs.princeton.edu.
[AL96] S. Arora and C. Lund. Hardness of Approximations. In Approximation Algorithms for NP-hard Problems, D. Hochbaum, ed. PWS Publishing, Boston, 1996.
[ALMSS92] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and hardness of approximation problems. Proc. of the 33rd Annual IEEE Symposium on Foundations of Computer Science, pp. 14-23, 1992.
[AS92] S. Arora and S. Safra. Probabilistic checking of proofs: A new characterization of NP. Proc. of the 33rd Annual IEEE Symposium on Foundations of Computer Science, pp. 2-13, 1992.
[Bel96] M. Bellare. Proof Checking and Approximation: Towards Tight Results. Complexity Theory, Column 12, Sigact News, 27, No. 1, March 1996.
[BGG93] M. Bellare, O. Goldreich, and S. Goldwasser. Randomness in Interactive Proofs. Computational Complexity 3, pp. 319-354, 1993.
[BGS96] M. Bellare, O. Goldreich, and M. Sudan. Free Bits, PCPs and Non-Approximability - Towards Tight Results (3rd Version). TR 95-024. Electronic Colloquium on Computational Complexity. December 1995.
[CFLS93] A. Condon, J. Feigenbaum, C. Lund, and P. Shor. Probabilistically Checkable Debate Systems and Approximation Algorithms for PSPACE-hard Functions. Chicago Journal of Theoretical Computer Science, Volume 1995, Article 4, 1995. Preliminary version in Proc. of the 25th Annual ACM Symposium on Theory of Computing, 1993.
[FGLSS91] U. Feige, S. Goldwasser, L. Lovász, S. Safra, and M. Szegedy. Approximating clique is almost NP-complete. Proc. of the 32nd Annual IEEE Symposium on Foundations of Computer Science, pp. 2-12, 1991.
[FK95] U. Feige and J. Kilian. Impossibility Results for Recycling Random Bits in Two-Prover Proof Systems. Proc. of the 27th Annual ACM Symposium on Theory of Computing, pp. 457-468, 1995.
[FS96a] D. Fotakis and P. Spirakis. (poly(loglog n), poly(loglog n))-restricted verifiers are unlikely to exist for languages in NP. Proc. of the 21st Mathematical Foundations of Computer Science, 1996.
[FS96b] D. Fotakis and P. Spirakis. Randomness in Probabilistic Proof Checking and the Complexity of NP. Unpublished Manuscript.
[GG81] O. Gabber and Z. Galil. Explicit constructions of linear-sized superconcentrators. Journal of Computer and System Sciences 22, pp. 407-420, 1981.
[HPS94] S. Hougardy, H.J. Prömel, and A. Steger. Probabilistically Checkable Proofs and their Consequences for Approximation Algorithms. Discrete Mathematics 136, pp. 175-223, 1994.
[IZ89] R. Impagliazzo and D. Zuckerman. How to recycle random bits. Proc. of the 30th Annual IEEE Symposium on Foundations of Computer Science, pp. 248-253, 1989.
[LY93] C. Lund and M. Yannakakis. On the hardness of approximating minimization problems. Proc. of the 25th Annual ACM Symposium on Theory of Computing, pp. 286-293, 1993.
[MR95] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, New York, 1995.
[Pap94] C.H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
[Raz95] R. Raz. A Parallel Repetition Theorem. Proc. of the 27th Annual ACM Symposium on Theory of Computing, pp. 447-456, 1995.
[Sud96] M. Sudan. Efficient checking of polynomials and proofs, and the hardness of approximation problems. PhD Thesis, UC Berkeley, 1992. ACM Distinguished Theses series, Lecture Notes in Computer Science, Vol. 1001, Springer 1996.
Appendix
A. Proof of Lemma 4
Proof. Since $L \in PCP_{1/16}(r(n), q(n))$, there exists a restricted verifier R for L that achieves an error rate of 1/16, uses $\hat{r}(n)$ random bits and inspects $\hat{q}(n)$ proof bits. The restricted verifier R' with the desired properties will be the result of multiple invocations of R. Lemma 2 implies that the bits required for multiple invocations of R can be provided by a random walk on a Gabber-Galil expander. The restricted verifier R' should proceed as follows:
(1) It should read a random string r of length $\hat{r}(n) + c \log(1/\delta) \log d$.
(2) It should construct a d-regular graph G(V, E) that is a member of a family of graphs $\mathcal{G}$ for which Lemma 2 holds. The graph G should have a vertex for every possible bit string of length $\hat{r}(n)$. Let c be the constant of Lemma 2.
(3) R' should perform a random walk on the graph G. Let $c k$ be the length of the random walk. Let $r_i$ be the bit string that is associated with the vertex reached by the random walk at the $(i \cdot c)$-th step, $i = 1, \ldots, k$. If R' has been invoked as $R'(x, r, \pi)$, then R' should invoke R for each bit string $r_i$ (k times) as $R(x, r_i, \pi)$. R' should accept iff all invocations of R accept.
Since the error probability of R is no more than 1/16, there exists a set $C \subseteq V(G)$, $|C| \le |V(G)|/16$, such that a vertex $u \in C$ iff it is associated with a bit string $r_i$ that causes the invocation $R(x, r_i, \pi)$ to produce a wrong outcome. Let the number of times that R is invoked by R' be $k = \log(1/\delta)$. Then Lemma 2 implies that the probability $P_{error}$ that all bit strings $r_i$ cause the invocations $R(x, r_i, \pi)$ to produce wrong outcomes satisfies $P_{error} \le 2^{-\log(1/\delta)} = 2^{\log \delta} = \delta$. Further, a bit string r of length $\hat{r}(n) + c \log(1/\delta) \log d$ completely specifies the steps of a walk on G ($\hat{r}(n)$ bits are used for choosing the initial vertex and $c \log(1/\delta) \log d$ bits are used for specifying the $c \log(1/\delta)$ steps of the walk). Since R is invoked by R' exactly $\log(1/\delta)$ times, the number of proof bits inspected by R' is $\hat{q}(n) \log(1/\delta)$. Also, provided that the initial verifier R is non-adaptive, the resulting verifier R' is non-adaptive. □
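The construction in this proof can be illustrated by a small Python sketch. It is not the construction of Lemma 2 itself: the 8-regular graph on Z_m x Z_m below only follows the style of the Margulis and Gabber-Galil neighbour maps, which is an assumption made for illustration, and no expansion bound is claimed for it. The names mgg_neighbors, bits_needed, recycled_strings and amplified_verifier, as well as the toy parameters b (bits per coordinate), k (number of invocations) and c (steps between samples), are hypothetical.

```python
def mgg_neighbors(x, y, m):
    """Eight neighbours of (x, y) on Z_m x Z_m (four maps and their inverses).

    Assumption: maps in the style of Margulis / Gabber-Galil graphs.
    """
    return [
        ((x + 2 * y) % m, y),
        ((x - 2 * y) % m, y),
        ((x + 2 * y + 1) % m, y),
        ((x - 2 * y - 1) % m, y),
        (x, (y + 2 * x) % m),
        (x, (y - 2 * x) % m),
        (x, (y + 2 * x + 1) % m),
        (x, (y - 2 * x - 1) % m),
    ]


def bits_needed(b, k, c):
    """2*b bits pick the start vertex; each of the c*k steps costs log2(8) = 3 bits."""
    return 2 * b + 3 * c * k


def recycled_strings(bits, b, k, c):
    """Return the labels of the vertices reached at steps c, 2c, ..., k*c."""
    if b < 1 or len(bits) != bits_needed(b, k, c):
        raise ValueError("wrong number of random bits")

    def to_int(chunk):
        return int("".join(str(v) for v in chunk), 2)

    m = 1 << b
    x, y = to_int(bits[:b]), to_int(bits[b:2 * b])
    pos = 2 * b
    samples = []
    for step in range(1, c * k + 1):
        choice = to_int(bits[pos:pos + 3])  # which of the 8 edges to follow
        pos += 3
        x, y = mgg_neighbors(x, y, m)[choice]
        if step % c == 0:
            samples.append(x * m + y)  # vertex label read as a 2b-bit string
    return samples


def amplified_verifier(verifier, x, proof, bits, b, k, c):
    """Accept iff the base verifier accepts on every recycled random string."""
    return all(verifier(x, r_i, proof) for r_i in recycled_strings(bits, b, k, c))
```

In this toy setting the amplified verifier consumes 2b + 3ck random bits, whereas k independent invocations of the base verifier would consume 2bk bits.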
B. Proof of Lemma 6
Proof. Since $L \in PCP(r(n), q(n))$, Lemma 4 implies that there exists a restricted verifier R for L that achieves error rate $2^{-4}$, uses $\hat{r}(n)$ + O() random bits and inspects O( $\hat{q}(n)$) proof bits. The verifier $\hat{R}$ with the desired properties will be the result of multiple invocations of R. Lemma 3 implies that the bits required for multiple invocations of R can be provided by a random walk on a Gabber-Galil expander. Let $\hat{R}(x, r, \pi)$ be an invocation of $\hat{R}$, where r is a truly random string of length $\hat{r}(n)/\rho$ + O(). The restricted verifier $\hat{R}$ should proceed as follows:
(1) It should construct a $\hat{d}$-regular graph $\hat{G}(\hat{V}, \hat{E})$ that is a member of a family of graphs $\mathcal{G}$ for which Lemma 3 holds. The graph $\hat{G}$ should have a vertex for every possible bit string of length $\hat{r}(n)/\rho$. Let $\hat{c}$ be the constant defined in Lemma 3.
(2) $\hat{R}$ should perform a random walk on the graph $\hat{G}$. Let $\hat{c}$ be the length of the random walk and let $m_i$ be the bit string associated with the vertex reached by the $(i \cdot \hat{c})$-th step of the walk, i = 1, ..., . Let $r_j$ be the bit string that consists of the concatenation of $\rho$ consecutive results of the random walk, $r_j = m_{j+1} m_{j+2} \cdots m_{j+\rho}$, j = 0, ..., - 1.
(3) $\hat{R}$ should invoke $R(x, r'_j, \pi)$, j = 0, ..., - 1 ( times). The verifier R uses random strings $r'_j$ of length $\hat{r}(n)$ + O() (see the proof of Lemma 4). More specifically, since the verifier R also performs a random walk, it uses $\hat{r}(n)$ bits for choosing the initial vertex and O() bits for specifying the steps of the walk. Thus, $r_j$, with $|r_j| = \hat{r}(n)$, can be used by R for the choice of the initial vertex of its walk. Further, the remaining bits are truly random bits and are provided by the random string of $\hat{R}$.
(4) $\hat{R}$ should accept iff all invocations of R accept.
The complete specification of the walk requires $\hat{r}(n)/\rho$ random bits for the choice of the initial vertex and $\hat{c}$ log $\hat{d}$ = O() random bits for the specification of the $\hat{c}$ steps of the walk. Additionally, O() random bits are needed for the invocations of R. Thus, the verifier $\hat{R}$ uses $\hat{r}(n)/\rho$ + O() random bits. Further, the number of bits that $\hat{R}$ inspects from the proof is O( $\hat{q}(n)$).
Let us determine the error probability of $\hat{R}$. Let $2^{\hat{r}(n)} = N$ and $|\hat{V}(\hat{G})| = 2^{\hat{r}(n)/\rho} = N^{1/\rho}$. Since the error rate of R is 1/16, without loss of generality, no more than N/16 random strings $r_j$ cause the verifier R to produce a wrong outcome. By hypothesis, there exists a sequence of sets $C_1, C_2, \ldots, C_\rho \subseteq S$ such that Proposition 5 holds. Obviously, the sets $C_i$ correspond to sets of "misleading" vertices of the graph $\hat{G}$. Consequently, Proposition 5 implies that there exist subsets $\hat{C}_i$ of the vertex set of $\hat{G}$ such that the verifier $\hat{R}$ produces a wrong outcome iff the random walk of (2) only reaches vertices of $\hat{C}_i$ at its $(i \cdot \hat{c})$-th steps. Since the error rate of the verifier R is no more than 1/16, we can assume, without loss of generality, that the densities of the sets $\hat{C}_i$ are no more than 1/16. Thus, we can apply Lemma 3 for the corresponding value of b and density 1/16. Then, the probability that a random walk of length $\hat{c}$ reaches at every $\hat{c}$-th step a "misleading" vertex is at most 2^{- }.
Remark. Actually, the average density of the sets $\hat{C}_i$ is 1/16. It is straightforward that we can use Lemma 3 in order to handle the case of sets of "misleading" vertices $\hat{C}_i$ which have varying densities. Additionally, we can achieve the desired error probability using the same amount of random bits by carefully adjusting the parameters of the random walk.
The proof is concluded by the observation that if the original verifier for L is a non-adaptive one, then the verifier $\hat{R}$ is also non-adaptive. □
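Step (2) of the proof builds each random string by concatenating consecutive walk results. A minimal Python sketch of that sliding-window concatenation follows; the function name concat_windows and the parameters piece_bits (bits per walk result), w (pieces per string) and t (number of strings produced) are hypothetical and chosen only for illustration.

```python
def concat_windows(samples, piece_bits, w, t):
    """Build t strings; string j concatenates samples[j], ..., samples[j + w - 1].

    Each sample is an integer of piece_bits bits; the result is an integer
    of w * piece_bits bits, with the earliest sample in the high-order bits.
    """
    strings = []
    for j in range(t):
        window = samples[j:j + w]
        if len(window) < w:
            raise ValueError("not enough walk samples for window %d" % j)
        value = 0
        for piece in window:
            value = (value << piece_bits) | piece
        strings.append(value)
    return strings
```

Consecutive strings share w - 1 pieces, which is why the walk supplies far fewer fresh bits than t independent strings would need.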
C. Proof of Lemma 7
Proof. For any $x \in \Sigma^*$, |x| = n, we will exploit the verifier R in order to construct a Boolean formula $B_x$ such that $B_x$ is satisfiable iff x is an element of L. For every position of a membership proof we introduce a variable whose values True and False correspond to the values 1 and 0 of the bit at this position. Using these variables, the Boolean formula $B_x$ is obtained as follows:
- For any possible random string r let $B_r$ denote the Boolean formula that expresses which proofs are accepted by R on input x. Since R queries only $\hat{q}(n)$ bits from the proof, the size of the formulas $B_r$ is $O(\hat{q}(n))$. Furthermore, since $|r| = \hat{r}(n)$ and the verifier is g(n)-adaptive, the number of formulas $B_r$ is $2^{\hat{r}(n)} g(n)$.
- Let $B_x$ be the conjunction of all the formulas $B_r$. Since the formulas $B_r$ contain $O(\hat{q}(n))$ variables and the number of formulas $B_r$ is $2^{\hat{r}(n)} g(n)$, the size of the Boolean formula $B_x$ is $O(2^{\hat{r}(n)} \hat{q}(n) g(n))$. □
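The construction above can be mimicked on toy instances with the following Python sketch. It assumes a non-adaptive verifier, so each random string contributes exactly one constraint, and it represents each formula by the set of accepting assignments to the queried positions rather than by an explicit Boolean formula. The names build_constraints and satisfiable and the callbacks query_positions and accepts are hypothetical; the brute-force satisfiability check is only meant for very small proofs.

```python
from itertools import product


def build_constraints(query_positions, accepts, x, r_bits):
    """One constraint per random string r: (queried positions, accepting values)."""
    constraints = []
    for r in range(1 << r_bits):
        positions = query_positions(x, r)
        accepting = {
            values
            for values in product((0, 1), repeat=len(positions))
            if accepts(x, r, values)
        }
        constraints.append((positions, accepting))
    return constraints


def satisfiable(constraints, proof_len):
    """Brute force: is there a proof string that every constraint accepts?"""
    for proof in product((0, 1), repeat=proof_len):
        if all(tuple(proof[p] for p in pos) in acc for pos, acc in constraints):
            return True
    return False
```

The conjunction has one constraint per random string, which mirrors the factor $2^{\hat{r}(n)}$ in the size bound.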