On the Query Complexity of Testing for Eulerian

0 downloads 0 Views 435KB Size Report
The dense graph model is in a sense too lenient, since for n-vertex graphs, the distance ... graph model of [13], in which a sparse representation of sparse graphs is ... function allows only edges inversions, but no edge insertions or deletions. ... We later derive ϵ-tests from the p-tests by proving two lower bounds for p. The.
On the Query Complexity of Testing for Eulerian Orientations Eldar Fischer



Arie Matsliah



Ilan Newman



Orly Yahalom

§

Abstract We consider testing directed graphs for being Eulerian in the orientation model introduced in [15]. Despite the local nature of the property of being Eulerian, it turns out to be significantly harder for testing than other properties studied in the orientation model. We show a superconstant lower bound on the query complexity of 2-sided tests and a linear lower bound on the query complexity of 1-sided tests for this property. On the positive side, we give several 1-sided and 2-sided tests, including a sub-linear 2-sided test for general graphs. For special classes of graphs, including bounded-degree graphs and expander graphs, we provide improved results. In particular, for dense graphs we give a 2-sided test with constant query complexity.

1

Introduction

Property testing deals with the following relaxation of decision problems: Given a property P, an input structure S and  > 0, distinguish between the case where S satisfies P and the case where S is -far from satisfying P. Roughly speaking, an input S is said to be -far from satisfying a property P if more than an -fraction of its values must be modified in order to make it satisfy the property. Algorithms which distinguish with high probability between the two cases are called property testers or simply testers for P. Furthermore, a tester for P is said to be 1-sided if it never rejects an input that satisfies P. Otherwise, the tester is called 2-sided. We say that a tester is called adaptive if some of the choices of the locations for which the input is queried may depend on the values (answers) of previous queries. Otherwise, the tester is called non-adaptive. Property testing normally deals with problems with a very large input or with costly retrieval procedures. Thus, the number of queries of input values, rather than the computation time, is considered to be the most expensive resource. Property testing has been a very active field of research since it was initiated by Blum, Luby and Rubinfeld [5]. The general definition of Property testing was formulated by Rubinfeld and Sudan [24], who were interested mainly in testing algebraic properties. The study of property testing for combinatorial objects, and mainly for labelled graphs, began in the seminal paper of Goldreich, Goldwasser and Ron [12]. They introduced the dense graph model, where the graph is assumed ∗ Computer Science Department, Technion - Israel Institute of Technology, Haifa 32000, Israel. Email: [email protected]. Research supported in part by an ISF grant number 1101/06. † Computer Science Department, Technion - Israel Institute of Technology, Haifa 32000, Israel. Email: [email protected]. ‡ Computer Science Department, Haifa University, Haifa 31905, Israel. Email: [email protected]. Research supported in part by an ISF grant number 1101/06. § Computer Science Department, Technion - Israel Institute of Technology, Haifa 32000, Israel. Email: [email protected].

1

to be represented by an adjacency matrix, and the distance function is computed accordingly. Properties of directed graphs were studied in the same context in [1, 2]. For comprehensive surveys on Property testing see [23, 8]. The dense graph model is in a sense too lenient, since for n-vertex graphs, the distance function allows adding and removing o(n2 ) edges, regardless of the number of actual edges in the graph. Thus, many interesting properties, such as connectivity in undirected or directed graphs, are trivially testable in this model, as all the graphs are close to satisfying the property. In recent years, researchers have studied several alternative models for graph testing, including the bounded-degree graph model of [13], in which a sparse representation of sparse graphs is considered, and the general density model (also called the mixed model) of [20] and [17]. In these models, the distance function allows edge insertions and deletions whose number is at most a fraction of the number of the edges in the original graph. Here we continue the study of testing digraph related properties in the orientation model, which started in [15] and followed in [14] and [7]. In this model, an underlying undirected graph → − G = (V, E) is given in advance, where the input is an orientation G of G, in which every edge in E has a direction. Our testers may access the input using edge queries. That is, every query concerns → − → − an edge e ∈ E, and the answer to the query is the direction of e in G . An orientation G of G is called -close to a property P if it can be made to satisfy P by inverting at most an -fraction of → − the edges of G, and otherwise G is said to be -far from P. Note that the distance function in the orientation model naturally depends on the size of the underlying graph and is independent of representation details. Moreover, the testing abilities strongly depend on the structure of the underlying graph. The model is strict in that the distance function allows only edges inversions, but no edge insertions or deletions. On the other hand, we assume that our algorithms have a full knowledge of the underlying graph, whose size is roughly the same as the input size. Viewing the underlying graph as a parameter that the testing algorithms receive in advance, we say that the orientation model is a massively parameterized model. Other examples that can be thought of as massively parameterized models appear in [4], where a truth assignment is tested for satisfying a known 3CNF formula, in [9], where the input is a vertex-coloring of a known graph, and in other works. In this paper we consider the property of being Eulerian, which was presented in [14] as one of → − the natural orientation properties whose query complexity was still unknown. A directed graph G is called Eulerian if for every vertex v in the graph, the in-degree of v is equal to its out-degree. An → − undirected graph G has an Eulerian orientation G if and only if all the degrees of G are even. Such an undirected graph is called Eulerian also. Throughout the paper we assume that our underlying undirected graph G is Eulerian. Eulerian graphs and Eulerian orientations have attracted researchers since the dawn of graph theory in 1736, when Leonard Euler published his solution for the famous “K¨onigsberg bridge problem”. Throughout the years, Eulerian graphs have been the subject of extensive research (e.g. [22, 18, 25, 19, 6, 3]; see [10, 11] for an extensive survey). Aside from their appealing theoretic characteristics, Eulerian graphs have been studied in the context of networking [16] and genetics [21]. Testing for being Eulerian in the orientation model is equivalent to the following problem. We have a known network (a communication network, a transportation system or a piping system) where every edge can transport a unit of “flow” in both directions. Our goal is to know whether the network is “balanced”, or far from being balanced, where being balanced means that the number of flows entering every node in the network is equal to the number of flows exiting it (so there is 2

no “accumulation” in the nodes). To examine the network, we detect the flow direction in selected individual edges, which is deemed to be the expensive operation. The main difficulty in testing orientations for being Eulerian arises from the fact that an orientation might have a small number of unbalanced vertices, each of them with small imbalance, and yet be far from being Eulerian. This is since trying to balance an unbalanced vertex by inverting some of its incident edges may violate the balance of its balanced neighbors. Thus, we must continue to invert edges along a directed path between a vertex with a positive imbalance and one with a negative imbalance. We call such a path a correction path. A main component of our work is giving upper bounds for the length of correction paths. In this context we mention the work of Babai [3], who studied the ratio between the diameter of Eulerian digraphs and the diameter of their underlying undirected graphs. While he gave an upper bound for this ratio for vertex-transitive graphs, he showed an infinite family of undirected graphs with diameter 2 which have an Eulerian orientation with diameter Ω(n1/3 ). Our upper bounds are based on three “generic” tests, one 1-sided test and two 2-sided tests. Instead of receiving  as a parameter, the generic tests receive a parameter p, which stands for the number of required correction paths in an orientation that is far from being Eulerian. We hence call these tests p-tests. We later derive -tests from the p-tests by proving two lower bounds for p. The first gives an efficient test for dense graphs and the other one gives an efficient test for expander graphs. Finally, we show how to use the expander tests for obtaining a 1-sided test and a 2-sided test for general graphs, using a decomposition (“chopping”) procedure into subgraphs that are roughly expanders. The 2-sided test that we obtain this way has a sub-linear query complexity for every graph. Unfortunately, our chopping procedure is adaptive and has an exponential computational time in |E|. All of our other algorithms are non-adaptive and their computational complexity is of the same order as their query complexity. On the negative side, we provide several lower bounds. We show that any 1-sided test for being Eulerian must use Ω(m) queries for some graphs. For bounded-degree graphs, we use the toroidal grid to prove non-constant 1-sided and 2-sided lower bounds. These bounds are quite surprising, as bounded-degree graphs have a constant size witness for not being Eulerian, namely, the edges incident with one unbalanced vertex. In contrast, the st-connectivity property, whose witness must include a cut in the graph, is testable with a constant number of queries in the orientation model [7]. However, in other testing models there are known super-constant lower bounds also for properties which have constant-size witness. For instance, in [4] it is proved that testing whether a truth assignment satisfies a known 3CNF formula requires a linear number of queries for some formulas. The following chart gives a summary of our query complexity upper and lower bounds. Here and throughout the paper, we let n = |V |, m = |E|, ∆ be the maximum vertex-degree in G, and def d = m/n. The tilde notation hides polylogarithmic factors.

3

Result (Section)

1-sided  O ∆m 2 d2   m O ∆ log α

Tests for large d (5) Tests for α-expanders (6) General tests (8)

O



(∆m log m)2/3 4/3

Simple lower bound (3)

Ω(m)

Lower bounds for

Ω(log m),

bounded-degree graphs (9)



Ω(m1/4 ) non-adaptive

2-sided h  3  √ i e m e 2∆m min O , O 6 6  d2    d  √  3 ∆ log m log m e e ,O min O α α  1/3 2/3  m e ∆ 4/3 O for ∆ ≤ (m)4/7 ,   3/16  e ∆ 5/4m3/4 for ∆ > (m)4/7 O  —



q

Ω(log log m),  log m non-adaptive log log m

The rest of the paper is organized as follows. Section 2 includes general definitions and lemmas to be used in the sequel. In Section 3 we provide a simple 1-sided lower bound for general graphs. Section 4 is dedicated to our three p-tests, which distinguish between Eulerian orientations and orientations with many correction paths. In Section 5 we give a lower bound on the number of correction paths as a function of the average degree in the graph, and derive our tests for graphs with high average degree. In Section 6 we give such a bound and tests for expander graphs. Section 7 considers testing subgraphs that we call “lame” expanders, providing results for them that are similar to those obtained for expanders, and that will be used in the sequel. Section 8 presents our most general tests, which use the results of Section 7. Section 9 gives our lower bound for bounded-degree graphs. Finally, Section 10 summarizes the still open questions.

2

Preliminaries

Throughout the paper, we assume a fixed and known underlying graph G = (V, E) which is Eulerian, that is, for every v ∈ V , the degree deg(v) of v is even (although connectivity is sometimes regarded as a second necessary condition in the definition of an undirected Eulerian graph, we note that all our algorithms and proofs work also for the case where G is not connected). → − → − → (v) denote the in-degree of v Given an orientation G = (V, E ) and a vertex v ∈ V , let indeg− G → − → − → (v) denote the out-degree of v with respect to G . We define with respect to G and let outdeg− G → − def → (v) = outdeg− → (v) − indeg− → (v). In all the above, we may omit the the imbalance of v in G as ib− G G G → − → − subscript G whenever it is obvious from the context. We say that a vertex v ∈ V is a spring in G → − → (v) > 0. We say that v is a drain in G if ib− → (v) < 0. If ib− → (v) = 0 then we say that v is if ib− G G G → − → − balanced in G . We say that G is Eulerian if all its vertices are balanced. Since all the vertices of → − G have even degrees, there always exists some Eulerian orientation G of G. → − Note that testing whether G is Eulerian is trivial for  ≥ 12 . → − Observation 2.1 Every orientation G of G is 21 -close to being Eulerian. − → − → ← − Proof. Let G1 be an arbitrary Eulerian orientation of G. Let G2 = G1 , namely, the orientation − → derived from G1 by inverting all the edges. Clearly, G2 is Eulerian as well, since inverting all the 4

edges maintains the absolute value of the imbalance of all vertices. Now, for every edge e ∈ E, the → − − → − → direction of e in G is the same as in G1 if and only if it is opposite to the direction of e in G2 . → − − → → − − → → − − → − → Hence, dist( G , G1 ) = 1 − dist( G , G2 ), and therefore, G is 12 -close to either G1 or G2 . Given a set U ⊆ V , we let def

E(U ) = {{u, v} ∈ E | u, v ∈ U }, def

∂U = {{u, v} ∈ E | u ∈ U, v ∈ / U }, and

− → → − def E (U ) = {(u, v) ∈ E | u, v ∈ U }.

Given two disjoint sets U, W ⊆ V , let def

E(U, W ) = {{u, w} ∈ E | u ∈ U, w ∈ W } and

2.1

− → → − def E (U, W ) = {(u, w) ∈ E | u ∈ U, w ∈ W }.

Correction subgraphs and p-tests

→ − → − −→ → − Let G be an orientation of G. Given a subgraph H = (VH , EH ) of G (that is, a directed graph −→ − → → def − → − − = (V, E ← − ) to be the orientation of G derived from where VH ⊆ V and EH ⊆ E ) we define G ← H H → − → − G by inverting all the edges of H . Namely, − → − −→ → −→ − = E \ EH ∪ {(v, u) ∈ (VH )2 | (u, v) ∈ EH }. E← H → − − → → − − → − is Eulerian. Note that in such a case, G is We say that H is a correction subgraph of G if G ← H −→ |EH | m -close

to being Eulerian.

→ − − → Lemma 2.2 A subgraph H of G is a correction subgraph if and only if the following conditions hold for every v ∈ V : − → 1. If v ∈ / VH , then v is balanced in G . → (v) = 2. If v ∈ VH , then ib− H

1 2

→ (v). · ib− G → − − → − → In particular, a vertex v is a spring in G if and only if it is a spring in H , and v is drain in G if → − and only if it is a drain in H .

Proof.

→ − − → → (v) = 0 for every Remember that H is a correction subgraph of G if and only if ib− − G←

v ∈ V . The proof follows from the following facts, which can be easily verified: → (v) = ib− → (v). 1. If v ∈ / VH , then ib− − G← G H

→ (v) = ib− → (v) − 2 · ib− → (v). 2. If v ∈ VH , then ib− − G H G← H

5

H

→ − Since we assume that G is Eulerian, there always exists some correction subgraph of G . Furthermore, without loss of generality, we may focus on acyclic correction graphs. → − − → → − Observation 2.3 For any orientation G of G and for any correction subgraph H1 of G , there → − → − − → exists an acyclic correction subgraph H of G that is a subgraph of H1 . → − − → − → Proof. We obtain H from H1 as follows. While H1 is not acyclic, we arbitrarily choose a directed cycle and remove its edges from the subgraph. Note that this operation maintains the − → balance factors of all the vertices of G with respect to H1 . The proof thus follows from Lemma 2.2. → − → − Let S be the set of springs in G and let T be the set of drains in G . We say that a directed → − → − path P = hu0 , . . . , uk i in G is a spring-drain path if u0 ∈ S and uk ∈ T . Note that in this case, → − → − → − for any correction subgraph H of G , we have by Lemma 2.2 that u0 is a spring in H and uk is a → − drain in H . The following observations and lemmas follow easily. → − → − → − Observation 2.4 If G is not Eulerian then any correction subgraph H of G contains a springdrain path. → − → − → − → − Observation 2.5 Let H be a correction subgraph of G and let P be a spring-drain path in H . → − − → → − → − → − − → def Define H \ P to be the graph obtained from H by removing all the edges of P , that is, E \ P = → − − → → − − → → − → − − , the graph obtained from H by inverting (VH , E \ P ). Then H \ P is a correction subgraph of G ← P → − → − → − − → → − − is Eulerian. all the edges of P . Moreover, if H is acyclic then H \ P is edgeless if and only if G ← P → − → − → − Lemma 2.6 If G is not Eulerian then any acyclic correction subgraph H of G is a union of → − p = 14 Σu∈V |ib(u)| edge-disjoint spring-drain paths. Moreover, every decomposition of H into edgedisjoint spring-drain paths has exactly p paths. In particular, removing any spring-drain path from → − H results in a graph that is a collection of p − 1 edge-disjoint spring-drain paths. → − Proof. From Observations 2.4 and 2.5, it is clear that H may be decomposed into edge-disjoint spring-drain paths. Note that when removing a spring-drain path, we reduce the sum Σu∈V |ib(u)| by exactly four (as we reduce the absolute value of the imbalance of both the spring and the → − drain by two). Therefore, every decomposition of H into disjoint spring-drain paths contains p = 14 Σu∈V |ib(u)| paths. → − → − → − Lemma 2.7 Let G be an orientation of G and let H be an acyclic correction subgraph of G which → − is a union of p disjoint spring-drain paths. Let S be the set of springs in H . Then Σu∈S ib(u) = 2p. Proof. By Lemma 2.2, for any spring u ∈ S, the number of spring-drain paths starting at u is ib(u)/2. Thus Σu∈S ib(u)/2 = p. → − If every correction subgraph of an orientation G is a union of at least p disjoint spring-drain → − path, we will say that G is p-far from being Eulerian. An algorithm will be called a p-test for being Eulerian for some positive number p if it can distinguish between Eulerian orientations and p-far orientations. Namely, the algorithm should accept an Eulerian orientation with probability at least 2/3 and reject a p-far orientation with probability at least 2/3. 6

2.2

β-correction subgraphs and (p, β)-tests

Given β > 0, we say that a vertex v is β-small if deg(v) ≤ β and β-big if deg(v) > β. An orientation → − → − → − G is called β-Eulerian if all the β-small vertices in V are balanced in G . Note that for β ≥ ∆, G is → − → − −→ → − β-Eulerian if and only if G is Eulerian. A directed subgraph H = (VH , EH ) of G is a β-correction → − subgraph of G if: → − − is β-Eulerian. 1. G ← H → − → (v)| ≤ |ib− → (v)| and if v is a spring (drain) in G ← − then it is 2. For every v ∈ V we have: |ib− − G← G H H → − a spring (drain) in H . → − That is, H fixes the balance of all the β-small vertices without increasing or changing the sign of the imbalance of β-big vertices. − → A directed path in an orientation G is called a β-spring-drain path if it is a spring-drain path, where at least the spring or the drain is β-small. − → Observation 2.8 Every orientation G which is not β-Eulerian has a β-correction subgraph which is a union of edge-disjoint β-spring-drain paths. → − For 0 ≤  ≤ 1 and β > 0, we say that an orientation G is -close to being β-Eulerian if there → − −→ → − −→ → − exists a β-correction subgraph H = (VH , EH ) of G with |EH | ≤ m. Otherwise we say that G is -far from being β-Eulerian. → − For p, β > 0, we say that an orientation G is p-close to being β-Eulerian if there exists a β→ − − → correction subgraph H of G that is a union of at most p edge-disjoint spring-drain paths. Otherwise, → − we say that G is p-far from being β-Eulerian. One can show that all the lemmas and observations that we have proved in Subsection 2.1 for correction subgraphs and spring-drain paths can be adapted to β-correction subgraphs and β-spring-drain paths. In particular: → − → − → − Lemma 2.9 If G is not β-Eulerian then any acyclic β-correction subgraph H of G is a union of → − at least 41 Σu∈V |ib(u)| and at most 12 Σu∈V |ib(u)| edge-disjoint β-spring-drain paths. Hence, if G is p-far from being β-Eulerian then Σu∈V,deg(u)≤β |ib(u)| ≥ 2p. An algorithm will be called a (p, β)-test for being Eulerian for some positive number p if it can distinguish between β-Eulerian orientations and orientations that are p-far from being β-Eulerian. Namely, the algorithm should accept a β-Eulerian orientation with probability at least 2/3 and reject an orientation that is p-far from being β-Eulerian with probability at least 2/3.

3

A linear lower bound for 1-sided tests

In this section we prove that there exists no sub-linear 1-sided test for Eulerian orientations of general graphs.

7

→ − Consider an algorithm that tests an orientation G of G. At a given moment, we represent the → − −→ edges that the algorithm has queried so far in a directed knowledge graph H = (V, EH ), where −→ − → → − EH ⊆ E . We say that a cut M = (U, V \ U ) of G is valid with respect to a knowledge graph H if −→ −→ 1 1 |EH (U, V \ U )| ≤ |E(U, V \ U )| and |EH (V \ U, U )| ≤ |E(U, V \ U )|. 2 2 → − → − → − Otherwise, M is called invalid. Clearly, if G is Eulerian, then every knowledge graph H of G contains only valid cuts. We show that any 1-sided test for being Eulerian must obtain a knowledge → − graph that contains some invalid cut in order to reject G . → − We say that a (valid) cut M = (U, V \ U ) of G is restricting with respect to H if −−→ −→ 1 1 |EH (U, V \ U )| = |E(U, V \ U )| or |EH (V \ U, U )| = |E(U, V \ U )|. 2 2 → − → − Note that, given that G is Eulerian, a restricting cut with respect to H forces the orientations of all the unqueried edges in the cut. We say that two restricting cuts M1 and M2 are conflicting (with → − respect to a knowledge graph H ) if they force contrasting orientations of at least one unqueried edge. → − → − Lemma 3.1 Let H be a knowledge graph of G and suppose that all the cuts in G are valid with → − → − respect to H . Then there are no conflicting cuts with respect to H . Proof. Assume, on the contrary, that M1 = (V1 , V \ V1 ) and M2 = (V2 , V \ V2 ) are conflicting → − with respect to H . That is, there exists an edge {u, w} ∈ E, which was not queried and hence is → − not oriented in H , and is forced to have contrasting orientations by M1 and M2 . Without loss of generality, assume that u ∈ V1 \ V2 , w ∈ V2 \ V1 , and −→ 1 |EH (V1 , V \ V1 )| = · |E(V1 , V \ V1 )|, 2

(1)

−→ 1 |EH (V2 , V \ V2 )| = · |E(V2 , V \ V2 )|. (2) 2 Thus, M1 forces e to be oriented from w to u, whereas M2 forces e to be oriented from u to w. → − Recall now that all the cuts in G are valid with respect to H . Consider the cuts (V1 ∩ V2 , V \ (V1 ∩ V2 )) and (V1 ∪ V2 , V \ (V1 ∪ V2 )). We have −→ 1 |EH (V1 ∩ V2 , V \ (V1 ∩ V2 ))| ≤ · |E(V1 ∩ V2 , V \ (V1 ∩ V2 ))| 2

(3)

and

−→ 1 · |E(V1 ∪ V2 , V \ (V1 ∪ V2 ))| |EH (V1 ∪ V2 , V \ (V1 ∪ V2 ))| ≤ 2 since these cuts are valid. Note that |E(V1 ∩ V2 , V \ (V1 ∩ V2 ))| + |E(V1 ∪ V2 , V \ (V1 ∪ V2 ))| = |E(V1 , V \ V1 )| + |E(V2 , V \ V2 )| − 2 · |E(V1 \ V2 , V2 \ V1 )| 8

(4)

(5)

and

−→ −→ |EH (V1 ∩ V2 , V \ (V1 ∩ V2 ))| + |EH (V1 ∪ V2 , V \ (V1 ∪ V2 ))| = −→ −→ −→ −→ |EH (V1 , V \ V1 )| + |EH (V2 , V \ V2 )| − |EH (V1 \ V2 , V2 \ V1 )| − |EH (V2 \ V1 , V1 \ V2 )|.

(6)

Summing Equation (1) with Equation (2) yields −→ −→ 1 |EH (V1 , V \ V1 )| + |EH (V2 , V \ V2 )| = (|E(V1 , V \ V1 )| + |E(V2 , V \ V2 )|) . 2 Summing Inequality (3) with Inequality (4) yields −→ −→ EH (V1 ∩ V2 , V \ (V1 ∩ V2 ))| + |EH (V1 ∪ V2 , V \ (V1 ∪ V2 ))| ≤

(7)

(8)

1 (|E(V1 ∩ V2 , V \ (V1 ∩ V2 ))| + |E(V1 ∪ V2 , V \ (V1 ∪ V2 ))|) . 2 Substituting Equations (5) and (6) in Inequality (8) we obtain: −→ −→ −→ −→ |EH (V1 , V \ V1 )| + |EH (V2 , V \ V2 )| − |EH (V1 \ V2 , V2 \ V1 )| − |EH (V2 \ V1 , V1 \ V2 )| ≤ 1 (|E(V1 , V \ V1 )| + |E(V2 , V \ V2 )|) − |E(V1 \ V2 , V2 \ V1 )|. 2 Now, from Equation (7) we have: −→ −→ |EH (V1 \ V2 , V2 \ V1 )| + |EH (V2 \ V1 , V1 \ V2 )| ≥ |E(V1 \ V2 , V2 \ V1 )|. → − That is, all the edges in E(V1 \ V2 , V2 \ V1 ) are oriented in H . This is a contradiction to our assumption that {u, w} ∈ E(V1 \ V2 , V2 \ V1 ) was not yet oriented. → − → − Lemma 3.2 Suppose that H is a knowledge graph that does not contain invalid cuts. Then H is − → −→ −→ extensible to an Eulerian orientation G of G. That is, EH ⊆ EG . Proof. We orient unoriented edges in the following manner. If there exists a restricting cut with unoriented edges, we orient them as obliged by the cut. According to Lemma 3.1, this will not invalidate any of the other cuts in the graph, and so we may continue. If there are no restricting cuts in the graph, we arbitrarily orient one unoriented edge and repeat (and this cannot violate any cut in the graph since there were no restricting cuts). Eventually, after orienting all the edges, we receive a complete orientation of G whose cuts are all valid, and thus it is Eulerian. Theorem 3.3 There exists an infinite family of graphs for which every 1-sided test for being Eulerian must use Ω(m) queries. def

Proof. For every even n, let Gn = K2,n−2 , namely, the graph with a set of vertices V = {v1 , . . . , vn } and a set of edges E = {{vi , vj } | i ∈ {1, 2}, j ∈ {3, . . . , n}}. Clearly, Gn is Eulerian and n = Ω(m). → − Consider the orientation G n of Gn in which all the edges incident with v1 are outgoing and → − all the edges incident with v2 are ingoing. Clearly, G n is 21 -far from being Eulerian. According to Lemma 3.2, every 1-sided test must query at least half of the edges in some unbalanced cut (because otherwise it would clearly not obtain an invalid cut in the knowledge graph). However, one can see that every cut which does not separate v1 and v2 is balanced, and every cut which separates v1 and v2 is of size n − 2 = Ω(m). 9

4

Generic tests

In this section we present one 1-sided p-test and two 2-sided p-tests for being Eulerian. Namely, our → − tests distinguish with high probability between the case where G is Eulerian and the case where → − G is p-far from being Eulerian (see Subsection 2.1). In later sections we devise several such lower → − bounds on p for every orientation G that is -far from being Eulerian, thus obtaining corresponding upper bounds on the tests below. In fact, the 1-sided and 2-sided tests that we give in Subsection 4.2 are (p, β)-tests (see Subsection 2.2), which are, in particular, p-tests when β = ∆. We will use these tests for β < ∆ in Section 7.

4.1

A 2-sided p-test

We give a simple 2-sided p-test that is independent of the maximum degree ∆. To simplify notation, def p we denote δ = 4m . → − Algorithm 4.1 SIMPLE-2( G , p): • Repeat

4 δ

times independently:

→ − – Select an edge e ∈ E(G) uniformly and query it. Denote the start vertex of e in E by u → − and the end vertex of e in E by v. – Query

16 ln(12/δ) δ2

edges incident with u uniformly and independently and reject if the

outgoing edges. sample contains at least (1 + δ) 8 ln(12/δ) δ2 • Accept if the input was not rejected earlier. e Lemma 4.2 SIMPLE-2 is a 2-sided p-test for being Eulerian with query complexity O  3 e m3 . O

1 δ3



=

p

Proof. The query complexity is clearly as stated. To prove the correctness of our algorithm, → − suppose first that G is Eulerian. For every vertex u ∈ V , the expected number of outgoing edges in the sample of u’s incident edges is 8 ln(12/δ) . Applying Chernoff’s upper tail bound, the probability δ2 of having at least (1 + δ) 8 ln(12/δ) outgoing edges in a sample is at most δ2 

eδ (1 + δ)(1+δ)

 8 ln(12/δ) 2 δ

By Lemma A.1 (see Appendix A), we have eδ 2 < e−δ /8 . (1+δ) (1 + δ) Thus, the probability of sampling (1 + δ) 8 ln(12/δ) outgoing edges for a balanced vertex is at most δ2 4 δ . Since we sample vertices, the probability of rejecting an Eulerian orientation is at most 13 . 12 δ

10

→ − Assume now that G is p-far from being Eulerian. A vertex u ∈ V will be called a δ-spring in − → → − G if ib(u) > 3δ · deg(u). Let Sδ be the set of all δ-springs in G . Note that Sδ ⊆ S, where S is the → − set of al springs in G . We have Σu∈S ib(u) ≤ Σu∈Sδ deg(u) + 3δ · Σu∈S\Sδ deg(u). Let y = Σu∈Sδ deg(u). Since Σu∈S deg(u) < 2m, we obtain Σu∈S ib(u) < y + 3δ · (2m − y) = (1 − 3δ)y + 6δm. However, from Lemma 2.7, we have Σu∈S ib(u) = 2p = 8δm, and therefore, y>

2δm > 2δm. 1 − 3δ

Thus, Σu∈Sδ deg(u) > 2δm, and since for every u ∈ Sδ we have degout (u) > 12 deg(u), there exist → − → − at least δm edges (u, v) ∈ E such that u is a δ-spring. As SIMPLE-2 samples 4δ edges (u, v) ∈ E uniformly and independently, the probability of not sampling any edge (u, v) such that u is a 4 δ-spring is at most (1 − δ) δ < e−4 < 0.03. → − Suppose now that the algorithm has sampled an edge (u, v) such that u is a δ-spring. Then G will be rejected unless fewer than (1 + δ) 8 ln(12/δ) of the queried edges (u, w) are outgoing. Since δ2

. Note that ib(u) > 3δ · deg(u), the expected number of outgoing edges is at least (1 + 3δ/2) 8 ln(12/δ) δ2

(1 + δ) 8 ln(12/δ) < (1 − 41 δ)(1 + 32 δ) 8 ln(12/δ) . Therefore, by the Chernoff bound, the probability of δ2 δ2 sampling fewer than (1 + δ) 8 ln(12/δ) outgoing edges is at most δ2

 2     δ 3δ 8 ln(12/δ) ln(12/δ) exp − 1+ < exp − . 32 2 δ2 4 > ln(10/3), and therefore, the probability of not detecting that u is a δ-spring is Note that ln(12/δ) 4 → − → − at most 0.3. We thus conclude that if G is p-far from being Eulerian, then SIMPLE-2 accepts G with probability smaller than 31 , which completes the proof of the theorem.

4.2

(p, β)-tests

We now give a simple 1-sided (p, β)-test. → − Algorithm 4.3 GENERIC-1( G , p, β): 1. Repeat

ln 3 m p

times independently:

→ − • Select an edge e ∈ E(G) uniformly and query it. Denote the start vertex of e in E by u → − and the end vertex of e in E by v. • If | deg(u)| ≤ β then: Query all the edges {u, w} ∈ E and reject if and only if u is unbalanced, namely, if ib(u) 6= 0. 2. Repeat

ln 3 m p

times independently: 11

→ − • Select an edge e ∈ E(G) uniformly and query it. Denote the start vertex of e in E by u → − and the end vertex of e in E by v. • If | deg(v)| ≤ β then: Query all the edges {w, v} ∈ E and reject if and only if v is unbalanced, namely, if ib(v) 6= 0. 3. Accept if the input was not rejected by the above. Lemma 4.4 GENERIC-1 is a 1-sided (p, β)-test for being Eulerian with query complexity O   In particular, for β = ∆, GENERIC-1 is a 1-sided p-test with query complexity O ∆m . p



βm p



.

Proof. The query complexity is clearly as stated. Since Algorithm 4.3 rejects only when an → − → − unbalanced β-small vertex is discovered, it always accepts a β-Eulerian G . Suppose now that G → − → − is p-far from being β-Eulerian. Clearly, to reject G , it suffices to sample one edge e = (u, v) ∈ E → − such that u is a spring in Step 1 or an edge e = (u, v) ∈ E such that v is a drain in Step 2. By Lemma 2.9, E contains at least p edges of at least one of these two kinds of edges, and so their → − fraction is at least p/m. We thus conclude that the probability of accepting G is at most 

1 p  ln 3p m < . 1− m 3

We note that GENERIC-1 has a better query complexity than SIMPLE-2 for ∆ 

m2 p2

ln( m p ).

We conclude this section with a 2-sided (p, β)-test, which gives better query complexity than GENERIC-1 for β  log2 m and better than SIMPLE-2 for p  √mβ . The main idea of the algorithm is to perform roughly O((log β)2 ) testing stages, each designed to detect unbalanced vertices whose degree and imbalance lie in a certain interval. We show that with high probability, a β-Eulerian → − orientation G is accepted by all the testing stages, while an orientation that is p-far from being β-Eulerian is rejected by at least one of them. In the following, log denotes the logarithm with base 2. → − Algorithm 4.5 MULTISTAGE-2( G , p, β): For i = 1, . . . , dlog βe − 1, do: def

def

1. Let Vi = {u ∈ V | deg(u) ∈ [2i , 2i+1 ) } and ni = |Vi |. 2. If Let j = di/2e. If 2j · ni ≥ 2

2p (log β)2

then:

j+1

2 ni • Sample xi,j = ln 12 (log β) vertices in Vi uniformly and independently. 2p • For every sampled vertex u, query all the edges incident with u, and reject if u is unbalanced.

3. For every j ∈ {di/2e + 1, . . . , i − 1} such that 2j · ni ≥ • Sample xij =

ln 12 (log β)2 2j+1 ni 2p

2p (log β)2

do:

vertices in Vi uniformly and independently.

• For every sampled vertex u, query qij = 256 · ln( 6(log β)2 xij ) · 22(i−j) edges adjacent to u, uniformly and independently, and reject if the absolute difference between q the number of ingoing and outgoing edges in the sample is at least 4·2iji−j . 12

Accept if the input was not rejected earlier. Lemma √ 4.6 MULTISTAGE-2 is a 2-sided (p, β)-test for being Eulerian with query complexity β m e O . In particular, for β = ∆, MULTISTAGE-2 is a 2-sided p-test for being Eulerian with p √  ∆ m e query complexity O . p Proof.

First, we compute the asymptotic query complexity of Algorithm 4.5. Since   (log β)2 · 2j · ni xij = O p

and





qij = O log

log β · 2j · ni p

 ·2

2(i−j)







= O log

log β · m p

 ·2

2(i−j)

 ,

the total query complexity is at most β i−1 Σlog i=1 Σj=di/2e

 log β · m (log β)2 log β 2i 1 i−1 xij · qij = log Σi=1 2 · ni Σj=di/2e p p 2j   log β · m (log β)2 p = log βm. p p 

→ − → − Now suppose that G is Eulerian. Then G can only be rejected in Step 3, where we randomly sample qij edges incident with a vertex u ∈ Vi . Since u is balanced, the expected number of q ingoing edges in the sample is 2ij . By the Chernoff bound,   the probability of sampling fewer than

(1 −

q 1 ) ij 4·2i−j 2

1 ingoing edges is at most exp − 22(i−j) ·

of sampling fewer than

qij 64

1 . Similarly, the probability 6(log β)2 xij qij 1 1 (1 − 4·2i−j ) 2 outgoing edges is at most 6(log β)2 xij . Since for every pair (i, j) and since there are less than (log β)2 pairs, the total probability of rejecting




2p . (log β)2 · 2j0 +1 13

(9)

Recall that |Vi0 | = ni0 . Hence, when we sample xi0 j0 vertices in Step 2 or in Step 3, the probability of not sampling any vertex u ∈ Vi0 j0 is smaller than   xi j  0 0 2p 1 2p 1− < exp − · x . = i j 0 0 2 j +1 2 j +1 (log β) · 2 0 · ni0 (log β) · 2 0 · ni0 12 Assume that the algorithm has sampled a vertex u ∈ Vi0 j0 . If j0 ≤ di0 /2e then all the edges → − incident with u are queried (Step 2), and since u is unbalanced, G is now rejected with probability 1. Otherwise, qi0 j0 edges incident with u are queried independently in Step 3. Since deg(u) < 2i0 +1  qand |ib(u)| ≥ 2j0 , either the expected number of ingoing edges in the sample is at least 1 + 2·2i10 −j0 i02j0 q or the expected number of outgoing edges in the sample is at least 1 + 2·2i10 −j0 i02j0 . To accept q  q the input, we must sample fewer than 1 + 4·2i10 −j0 i02j0 < 1 − 8·2i10 −j0 · 1 + 2·2i10 −j0 i02j0 edges in the majority direction. By the Chernoff bound, the probability of doing so is at most    qi0 j0 1 = exp − ln(6(log β)2 xi0 j0 ) = . exp − 2(i −j ) 6(log β)2 xi0 j0 256 · 2 0 0 Note that from Inequality (9) we must have 2j0 +1 · ni0 > (log4pβ)2 and thus xi0 j0 > ln 12. This, in addition to the fact that β ≥ 2, yields that the probability of not rejecting the input when sampling u’s edges is smaller than 1/4. We thus conclude that the probability of accepting an orientation that is p-far from being β-Eulerian is at most 1/3. We note that the number of testing stages in Algorithm 4.5 can be reduced in some graphs by skipping pairs (i, j) where 2j · ni < (log2pβ)2 , and thus i and j cannot be the i0 and j0 which satisfy Equation (9). In particular, for regular graphs, only one value of i is relevant for testing, which eliminates the square over log β in the query complexity. However, this does not change the power of β nor the dependency on p and m.

5

Testing graphs with high average degree

In this section we obtain a general lower bound for p as a function of d = m/n. This, together with the analysis of the previous section, will provide us with an efficient test for dense graphs. → − → − − → Lemma 5.1 Suppose that G is not Eulerian and H is an acyclic correction subgraph of G which → − is a union of p edge-disjoint spring-drain paths. Then H contains a spring-drain path of length smaller than √np . → − Proof. Consider a breadth-first search (BFS) traversal of H starting at S. We define a partition of VH into levels L0 , . . . , Lt for some t > 0 as follows. Let L0 = S. Now, for any i > 0, while S i−1 j=0 Lj 6= VH , define def

Li = {v ∈ VH \

i−1 [

−→ Lj | there exists u ∈ Li−1 such that (u, v) ∈ EH }.

j=0

→ − Note that if v ∈ Li then H contains a path of length i from some spring to v. Let ` be the minimum index of a level Li which contains a drain. We prove the claim by showing that ` < √np . 14

For every i = 0, . . . , t we denote ni = |Li |. Recall that there are p edge disjoint spring-drain → − paths in H . We first show that for every 0 ≤ i < ` we have ni+1 ≥ p/ni . Consider the level Li for some 0 ≤ i < `. Since there are no drains in levels L0 , . . . , Li , there exist at least p edges from Li to Li+1 . Therefore, there exists a vertex v ∈ Li which has at least p/ni neighbors in Li+1 . Hence, ni+1 ≥ p/ni . `−1 1 Summing over all i = 0, . . . , ` − 1 we obtain Σ`−1 i=0 ni+1 ≥ p · Σi=0 ni , `−1 Σ`−1 i=0 ni+1 ≥ p · Σi=0

and so p≤

Σ`−1 i=0 ni+1 1 Σ`−1 i=0 ni

1 , ni

.

`−1 `−1 1 Now, for a given Σ`−1 i=0 ni , the minimum value of Σi=0 ni is reached when ` divides Σi=0 ni and ni = Σ`−1 i=0 ni /` for every i = 0, . . . , ` − 1. Thus,

p≤

Σ`−1 i=0 ni+1 `2 /Σ`−1 i=0 ni

< n2 /`2 ,

which proves the lemma. → − − → Lemma 5.2 If G is -far from being Eulerian then G is p-far for p > 22 d2 . − → → − − → Proof. Let H0 be an acyclic correction subgraph of G . Then H0 is a union of p edge-disjoint −−−→ spring-drain paths. At each step j ≥ 1, while Hj−1 is not empty, we choose a shortest spring-drain −−→ −−−→ − → −−−→ −−→ − → path Pj−1 in Hj−1 and set Hj = Hj−1 \ Pj−1 . By Lemma 2.6, H0 is a union of p disjoint spring-drain − → paths, and moreover, every Hj is a union of p − j disjoint spring-drain paths. Hence, the graphs − → −−−→ − → H0 , . . . , Hp−1 are non-empty. Furthermore, every subgraph Hj is clearly acyclic, and Hence, by − → Observation 2.5, for j = 0, . . . , p − 1, Hj is a correction subgraph for some non-Eulerian orientation − → n of G. Let `j be the length of Pj for j = 0, . . . , p − 1. Then, by Lemma 5.1 we have `j < √p−j . Summing over j, we obtain r 1 p 1 p p−1 p−1 Σj=0 `j < n · Σj=0 √ 2 m /n = 2 d . Substituting the lower bound for p of Lemma 5.2 in Lemmas 4.4, 4.2 and 4.6, we obtain the following theorem. Theorem 5.3 → − 1. GENERIC-1( G , 22 d2 , ∆) is a 1-sided -test for being Eulerian with query complexity O ∆m = O ∆n . 2 d2 2 d 15

→ 2 2 − 2. SIMPLE-2(  3  G ,2 3d) is a 2-sided -test for being Eulerian with query complexity e m e n6 3 . O =O 6 d6  d → 2 2 − 3. MULTISTAGE-2(  √ G ,2 d , ∆) is a 2-sided -test for being Eulerian with query complexity √  ∆n e e 2∆m =O . O 2 2  d

 d

√  GENERIC-1 gives a sub-linear query complexity for d = ω ∆ , SIMPLE-2 gives a sub √ n·(log n)1/4 linear complexity for d = ω , and MULTISTAGE-2 gives a sub-linear complexity for 3/2  1/4  d = ω ∆  . All of the tests yield the lowest query complexity relative to m when m = Θ(n2 ) (i.e., 6 ) for SIMPLE-2 and O( e e √n/2 ) for MULTISTAGE-2. d = Θ(n)): O(n/2 ) for GENERIC-1, O(1/

6

Testing orientations of an expander graph

A graph G = (V, E) is called an α-expander for some α > 0 if it is connected and for every U ⊆ V such that 0 < |E(U )| ≤ m/2 we have |∂U | ≥ α|E(U )|. |∂U | ≥ α. |E(U )| → − Lemma 6.1 Let G be an Eulerian α-expander. Let G be a non-Eulerian orientation of G. Then → − G contains a spring-drain path of length ` = O(log(1+α) m). Note that while the diameter of G is O(log(1+α) m), this does not automatically implies that its “oriented-diameter” is low, even if we assume that the orientation is Eulerian, as was shown by [3]. → − Proof. Consider a BFS traversal of G starting from the set S of springs. For everySi ≥ 0, let Li def S def be the ith level of the traversal, where L0 = S, and let U 0 while |E(U 0. In the next two lemmas we give lower bounds for the numbers of internal spring-drain paths and β 0 -spring-drain paths for some β 0 in an (α, β)-expander. Using our (p, β)-tests from Section 4 with these bounds, we will later obtain -tests for (α, β)-expanders. → − def Lemma 7.2 Let G [U ] be an (α, β)-expander for some 0 < α < 1 and 0 ≤ β and let mU = |E(U )|. → − Suppose that all the cuts in G are valid with respect to ∂U . Suppose that G [U ] is 0 -far from being → − β 0 -Eulerian for some 0 < 0 < 1 and 0 < β 0 ≤ ∆. Then G [U ] is p(β 0 , 0 )-far from being β 0 -Eulerian  0 0 mU U for p(β 0 , 0 ) = Ω log  mm = Ω log m . U +β U /α+β (1+α)

→ − → − Proof. We show that in any orientation G such that G [U ] is not β 0 -Eulerian, there exists a → − → − spring-drain path inside G [U ] of length O(log(1+α) mU + β). Note that an (α, β)-expander G [U ] → − remains an (α, β)-expander after we invert some of its internal edges. Thus, if G [U ] is 0 -far from being β 0 -Eulerian, then we have to invert internal edges along Ω(0 mU /(log(1+α) m + β)) β 0 -springdrain paths (as inverting spring-drains paths in which both the spring and the drains are β 0 -big is irrelevant for being β 0 -Eulerian). Thereof the lemma will follow. → − Assume that G [U ] is not β 0 -Eulerian, and let ` be the minimum length of a spring-drain path → − of internal edges in G [U ]. Such a path exists by Lemma 7.1. If ` ≤ β then the claim is obviously true. We thus assume that ` > β. Let SU be the set of springs in U . Consider a BFS traversal of → − def G [U ] starting from L0 = SU . For every i > 0, set by induction → − def Li = {v ∈ U | T here exists u ∈ Li−1 s.t. (u, v) ∈ E }. def S def S In addition, let Ai = 0≤j 0, and thus  → → − → − → − − → 1 − E (Ai , Bi ) > E (Ai , Bi ) + E (Bi , Ai ) + E (V \ U, Ai ) − E (Ai , V \ U ) . 2 Substituting the above in Equation (14) we have − → E (Ai , Bi ) > α|E(Ai )| 19

for every i such that β < i ≤ ` and |E(Ai )| ≤ |E(Bi )|. Recall that all the directed edges that exit Ai enter Ai+1 . We thus have |E(Ai+1 )| > (1 + α)|E(Ai )|, and by induction, |E(Ai )| > (1 + α)i−β |E(Aβ )| ≥ (1 + α)i−β β. Therefore, i − β ≤ log(1+α) (|E(Ai )|/β) < log(1+α) mU

(15)

for every i such that β < i ≤ ` and |E(Ai )| ≤ |E(Bi )|. Now, if for every β < i ≤ ` we have |E(Ai )| ≤ |E(Bi )|, then, from Equation (15) we have ` − β < log(1+α) mU and thus ` = O(log(1+α) mU + β). Otherwise, let k > 0 be the minimal index for which |E(Ai )| > |E(Bi )|. From Equation (15) we have k − 1 − β < log(1+α) mU . (16) For every k ≤ i ≤ ` while |E(Bi )| ≥ β, (Bi , Ai ) is a β-cut, and thus from Equation (13) we have → − → − |E(Ai , Bi )| − | E (V \ U, Bi )| + | E (Bi , V \ U )| ≥ 2α|E(Bi )|.

(17)

Note that the set Bi contains drains but no springs. Hence, more edges enter Bi then exit it: → − → − → − → − | E (Ai , Bi )| − | E (Bi , Ai )| + | E (V \ U, Bi )| − | E (Bi , V \ U )| > 0, and thus  → − → → − → − → − 1− | E (Ai , Bi )| > | E (Ai , Bi )| + | E (Bi , Ai )| − | E (V \ U, Bi )| + | E (Bi , V \ U )| . 2 Substituting the above in Equation (17) we have − → | E (Ai , Bi )| > α|E(Bi )| → − for every i such that k ≤ i ≤ ` and |E(Ai )| > |E(Bi )| ≥ β. Since all the edges in E (Ai , Bi ) enter Li , we have |E(Bi )| > (1 + α)|E(Bi+1 )|, and by induction |E(Bi )| > (1 + α)j−i |E(Bj )|. Hence, j − i ≤ log(1+α) (|E(Bi )|/|E(Bi+j )|) < log(1+α) mU

(18)

for every i, j such that k ≤ i ≤ j ≤ ` and |E(Aj )| > |E(Bj )| ≥ β. Now, if |E(A` )| > |E(B` )| ≥ β then, taking i = k and j = ` we have ` − k < log(1+α) mU . Combined with Equation (16) we obtain ` < 2 log(1+α) mU + β + 1 = O(log(1+α) mU + β).

20

Otherwise, let r be the minimum index such that |E(Bi )| < β. Then, for i = k and j = r − 1, Equation (18) yields r − 1 − k < log(1+α) mU . (19) Since |E(Br )| < β, there are less then β levels between r and ` and thus ` < r + β. Hence, with Equations (16) and (19) we achieve ` < 2 log(1+α) mU + 2β + 2 = O(log(1+α) mU + β).

→ − def Lemma 7.3 Let G [U ] be an (α, β)-expander for some 0 < α < 1 and 0 ≤ β ≤ ∆ and let mU = |E(U )|. Suppose that all the cuts in G are valid with respect to ∂U . Suppose that for some  > 0, → − → −  G [U ] is -far from being Eulerian, but β-Eulerian. Then G [U ] is pU -far from   2 -close  to being 

being Eulerian for pU = Ω

mU log(1+α) mU

=Ω

αmU log mU

.

→ − − → → − Proof. As G [U ] is 2 -close to being β-Eulerian, there exists a β-correction subgraph H of G [U ] → − → → − def − − , that is, the digraph obtained from G [U ] by of size at most m/2. Consider G 1 [U ] = G [U ]← H → − → − inverting all the edges in the β-correction subgraph H . Then G 1 [U ] is β-balanced. However, since → − G [U ] is -far from being balanced, at least m/2 more edges must be inverted in order to make it → − Eulerian. By Item 3 of Lemma 7.1, there exists a correction subgraph for G [U ] which is a union of internal paths from β-big springs to β-spring drains. → − To complete the proof, we show that while G 1 [U ] is not Eulerian, it contains an internal springdrain path of length ` = O(log(1+α) mU ) = O(log mU /α). This is done similarly to the proof in Lemma 7.2 which shows the existence of a short β 0 -spring-drain path. However, here we do not need to assume that there are β layers in the beginning and at the end of the BFS where the cut between layers is not a β-cut. Since all the springs are necessarily β-big, we necessarily have at least β edges going between each two successive layers. Note that if β ≤ ∆U then either the conditions of Lemma 7.2 apply for β 0 = β and 0 = /2 or the conditions of Lemma 7.3 apply. Also, note that if β > ∆ then the conditions of Lemma 7.2 apply for β 0 = ∆U and 0 = . We thus obtain the two -tests below. Note that whenever our tests use samples of edges incident with a vertex u ∈ U , the sampling is among all the edges incident with u, and not only internal edges. In the following, mU = |E(U )| and ∆U is the maximum degree in G[U ]. → − Algorithm 7.4 GEN-1( G [U ], ) → − → − 1. If β ≤ ∆U then run GENERIC-1( G [U ], p(β, /2)), and, otherwise run GENERIC-1( G [U ], p(∆U , )). In both cases, p(β 0 , 0 ) is the lower bound given in Lemma 7.2. → − 2. If β ≤ ∆U then run GENERIC-1( G [U ], pU ), where pU is the lower bound given in Lemma 7.3. 3. Reject if at least one of the tests has rejected and accept otherwise. → − Algorithm 7.5 MULTI-2( G [U ], ) 21

→ − 1. If β ≤ ∆U then run MULTISTAGE-2( G [U ], p(β, /2)), and, otherwise run MULTISTAGE→ − 2( G [U ], p(∆U , )). In both cases, p(β 0 , 0 ) is the lower bound given in Lemma 7.2. → − 2. If β ≤ ∆U then run MULTISTAGE-2( G [U ], pU ), where pU is the lower bound given in Lemma 7.3. 3. Reject if at least one of the tests has rejected and accept otherwise. Combining Lemma 7.2 and Lemma 7.3 with Lemma 4.4 and Lemma 4.6, we obtain the following lemmas, which will be used in Section 8. → − → − Lemma 7.6 GEN-1( G [U ], ) is a 1-sided -test for an (α, β)-expander subgraph G [U ], assuming that the external and do not induce an invalid cut. The query complexity of  edges of U are known  β·min[β,∆] ∆U log mU the test is O + , where mU = |E(U )| and ∆U is the maximum degree in G[U ]. α  → − → − Lemma 7.7 MULTI-2( G [U ], ) is a 2-sided -test for an (α, β)-expander subgraph G [U ], assuming that the external  √ edges of U are √ known and do not induce an invalid cut. The query complexity of β· min[β,∆] ∆U log mU e the test is O + , where mU = |E(U )| and ∆U is the maximum degree in α



G[U ].

8

General tests based on chopping

In this section we use our results from Section 7 to provide a 1-sided and a 2-sided test as follows. → − → − Given an orientation G of an Eulerian graph G, we show how to decompose G into a collection of (α, β)-expanders with a relatively small number of edges that are outside the (α, β)-expanders, called henceforth external edges. We will find this “chopping” adaptively while querying external → − edges only. If we do not find a witness showing that G is not Eulerian during the chopping procedure, then we sample a few (α, β)-expanders and test them using GEN-1 (Lemma 7.6) or using MULTI-2 (Lemma 7.7), obtaining a 1-sided test or a 2-sided test respectively. → − Lemma 8.1 (The chopping lemma) Given an orientation G as input and parameters α, β > 0, → − we can either find a witness showing that G is not Eulerian or find non-empty induced subgraphs → − − → → − → − G i = (Vi , E i = E (Vi )) of G (where i = 1, . . . , k for some k), which we call (α, β)-components (or simply components), that satisfy the following: 1. The vertex sets V1 , . . . , Vk of the components are mutually disjoint. → − 2. | E i | ≥ β for i = 1, . . . k. → − 3. All the components G i are (α, β)-expanders. 4. The total number of external edges satisfies [ − → − → E (Vi )| = O(αm2 log m/β). |E \ i=1,...,k

22

During the chopping procedure, we query only external edges, i.e., edges that are not in any com→ − ponent Gi . The query complexity is in the same order also if we find a witness that G is not Eulerian. → − − → Proof. The chopping procedure proceeds as follows. At first, we define G = G [V ] as our single → − component. Then, at each step, we decompose a component G [U ] into two separate components → − → − G [A] and G [B], if (A, B) is a β-cut of U and − → − → |E(A, B)| + | E (V \ U, A)| − | E (A, V \ U )| < 2α|E(A)|. (20) When decomposing, we query the edges of the cut (A, B), and mark them as external edges. Note that we need not query any additional edges to decide on cutting a component, all the information is given by the infrastructure graph G and by the orientation of external edges that were queried in previous steps. After every stage, we check whether the orientations of the edges queried so → − far violate any of the cuts in the graph (see Section 3), in which case we conclude that G is not Eulerian. The procedure terminates once there is no cut of any component that satisfies the chopping conditions. The components are clearly disjoint throughout the procedure. Since we only chopped components across β-cuts, every final component contains at least β edges. Moreover, note that a component is always chopped by the procedure unless all its β-cuts satisfy Equation (13). Hence, → − if the algorithm terminates without finding a witness that G is not Eulerian, then every Gi is an (α, β)-expander. It remains to prove the upper bound for the number of external edges and the query complexity of the chopping procedure. Consider a component U and a β-cut (A, B) of U that was queried in some step of the lemma. Suppose that the cut (A, V \ A) is valid. Then → − → − → − → − | E (A, B)| − | E (B, A)| + | E (A, V \ U )| − | E (V \ U, A)| = 0.

(21)

Combining this with the chopping condition (20), and considering two cases depending on whether → − → − | E (A, V \ U )| − | E (V \ U, A)| is positive or negative, we obtain that −  → → − min | E (A, B)|, | E (B, A)| < α|E(A)|. (22) Hence, if Equation (22) does not hold, then, in any case, after querying the edges in (A, B) we discover an invalid cut in the graph (which could be (A, B) or another cut). We next compute the query complexity in the case where our knowledge graph contains no invalid cuts throughout the procedure, and later show how to modify our analysis for the case where an invalid cut is detected after querying the last cut. For every β-cut (A, B) that we use for partitioning, we refer to the edges in the minimal cut → − → − among E (A, B) and E (B, A) as rare edges, and to the edges in the other direction as common edges. Let us first compute the cost of querying the rare edges only. For every partition of a β-cut (A, B), we “charge” a cost of α on every edge in E(A). From equation (22), the sum of charges is larger than the number of rare edges queried. Since by our definition |E(A)| ≤ |E(B)|, every edge can belong to Side A of a partitioned β-cut (A, B) at most log m times throughout the entire procedure. Hence, the sum of charges is at most α log m for every edge in the graph and at most 23

αm log m in total. We complete the proof by showing that the ratio between the number of external edges and the number of rare external edges is O(m/β). → − Consider the component multigraph G comp , whose vertices are the components Gi and whose → − edges are the external edges of G . By our assumption that the knowledge graph contains no invalid → − cuts, G comp is Eulerian (as a directed multigraph), and hence it is an edge disjoint union of directed → − simple cycles. However, G comp does not contain a directed cycle of common edges. This can proved by induction as follows. When starting the chopping procedure there is only one component and no external edges. Then at every step we partition one component into two sets, and add the common edges, if any, from one of the sets to the other. One can see that this cannot create directed cycles → − of common edges. Therefore, G comp is an edge disjoint union of simple directed cycles, where each → − cycle contains at least one rare edge. As the number of vertices in G comp is at most m/β, we → − conclude that the number of external edges in G is at most m/β times the number of rare edges. → − From the discussion above, it follows that the number of external edges in G is O(αm2 log m/β). Regarding the case where querying the edges of a β-cut (A, B) reveals a violation of a cut in the graph, recall that as long as the knowledge graph does not induce an invalid cut, there exists an Eulerian orientation extending the knowledge graph (see Lemma 3.2). Thus, the edges of (A, B) have a “good” orientation that does not violate any cut in the graph. Clearly, the total query complexity after we query the edges of (A, B) and terminate is no higher than in the case where we would have queried the edges of (A, B) and discovered the good orientation. We are now ready to present our 1-sided test. Algorithm 8.2 (1-sided chopping test) Let α, β > 0 be parameters. → − → − 1. Use Lemma 8.1 (the chopping lemma) for finding (α, β)-components G 1 , . . . , G k and querying their external edges, or reject and terminate if an invalid cut is found in the process. → − 2. Sample 3 ln 3/ (α, β)-components G i randomly and independently, where the probability of → − def selecting a component G i in each sample is proportional to mi = |E(Vi )|. → − → − 3. Test every selected component G i using GEN-1( G i , /2) (Algorithm 7.4). Reject if the test rejects for at least one of the components selected. 4. Accept if the input was not rejected by any of the above steps. → − − → Lemma 8.3 If G is Eulerian, then Algorithm 8.2 accepts G with probability 1. → − → − Proof. If G is Eulerian then G has no invalid cuts and therefore Algorithm 8.2 does not reject → − in Step 1. In addition, every (α, β)-component of G is Eulerian, and since GEN-1 is 1-sided (see Lemma 7.6), Algorithm 8.2 does not reject any of the tested (α, β)-components. Lemma 8.4 If the external edges induce an invalid cut, or if more than an /2-fraction of the internal edges are in (α, β)-components that are /2-far from being Eulerian, then Algorithm 8.2 rejects with probability at least 2/3.

24

Proof. If the external edges induce an invalid cut then the algorithm rejects in Step 1 while trying to perform the chopping. Otherwise, every (α, β)-component sampled in Step 2 is /2-far from being Eulerian with probability at least /2. By Lemma 7.6, for every (α, β)-component that is /2-far from being Eulerian, the rejection probability is at least 2/3. Thus, the probability of rejecting a sampled component is at least /3. Since the components are selected independently, the probability of accepting all the components is at most (1 − /3)3 ln 3/ < e− ln 3 = 1/3. Lemma 8.5 If the external edges do not induce an invalid cut and at most an /2-fraction of the → − internal edges are in (α, β)-components that are /2-far from being Eulerian, then G is -close to being Eulerian. Proof. If the external edges do not induce an invalid cut, then, in particular, all the vertices outside (α, β)-components are balanced. Furthermore, by Lemma 3.2, our knowledge graph may → − be extended into an Eulerian orientation of G . We thus orient the (α, β)-components that are /2-far from being Eulerian so as to make them Eulerian. These components consist of at most m/2 edges. In addition, we invert a minimum number of edges in each of the (α, β)-components that are /2-close to being Eulerian, so as to make them Eulerian too. This requires at most m/2 → − more alterations. We thus obtain an Eulerian orientation that is -close to G . Theorem 8.6 Algorithm 8.2 is a 1-sided test for being Eulerian with query complexity   αm2 log m ∆ log m β · min[β, ∆] O + + . β 2 α 2 √ log m)1/3 In particular, if ∆ = O( m log m) then, for α = (∆(m) and β = 2/3 complexity is ! !  4/3 (∆m log m)2/3 ∆ 2/3 O =O (n log n) .  4/3

(m log m)2/3 , ∆1/3

the query

Proof. The correctness of the test follows from Lemmas 8.3, 8.4, and 8.5. The query complexity follows from Lemmas 7.6 and 8.1 (The chopping lemma). Wenote that Theorem 8.6 provides a sub-linear algorithm for every graph of maximum degree √  2 m ∆ = o log m . For nearly regular graphs, i.e. graphs with m = Ω(∆n), the algorithm is sub-linear  4   n . for every ∆ = o log 2 n We conclude with a similar 2-sided test which gives a sub-linear query complexity for every graph. Algorithm 8.7 (2-sided chopping test) Let α, β > 0 be parameters. → − → − 1. Use Lemma 8.1 (the chopping lemma) for finding (α, β)-components G 1 , . . . , G k and querying their external edges, or reject and terminate if an invalid cut is found in the process. → − 2. Sample 3 (α, β)-components G i independently, where the probability of selecting a component → − def G i is proportional to mi = |E(Vi )|. 25

→ − 3. Test every selected component G i for being Eulerian 12 ln(9/) times independently using → − → − MULTI-2( G i , /2) (Algorithm 7.5). Reject if there is a component G i which was rejected by at least half of its tests. 4. Accept if the input was not rejected in a previous step. → − Lemma 8.8 If G is Eulerian, then Algorithm 8.7 accepts with probability at least 2/3. → − → − Proof. If G is Eulerian then all the (α, β)-components G i are Eulerian. Thus, by Lemma → − 7.7, every run of MULTI-2 on a component G i rejects with probability at most 1/3. By standard → − arguments, the probability of rejecting a component G i at most half of its tests is at most /9. Applying the union bound for the 3/ (α, β)-components sampled, we the probability of rejecting → − G is at most 1/3. Lemma 8.9 If the external edges induce an invalid cut, or if more than /2-fraction of the internal edges are in (α, β)-components that are /2-far from being Eulerian, then Algorithm 8.7 rejects with probability at least 2/3. Proof. If the external edges induce an invalid cut then the algorithm rejects in Step 1 while trying to perform the chopping. Otherwise, the probability of not sampling any (α, β)-component that is /2-far from being Eulerian is at most (1− 2 )3/ < 14 . Suppose that we have sampled at least one (α, β)-component that is -far from being Eulerian. By Lemma 7.7, the acceptance probability of a single test of this component is at most 1/3. Using standard arguments, it can be shown that the probability of accepting in at most half of the tests of this component is smaller than 1/12. We → − conclude that the probability of accepting G in this case is at most 1/3. Note that Lemma 8.5 is true for Algorithm 8.7 as well as for Algorithm 8.2. Theorem 8.10 Algorithm 8.7 is a 2-sided test for being Eulerian with query complexity ! p √   β · min[β, ∆] ∆ log m αm2 log m e O +O + . β 2 α 2 1/6

2/3

(m) ∆ In particular, if ∆ ≤ (m)4/7 , then, for α = (m) , the query complexity is 2/3 and β = ∆1/6  1/3 2/3   6/7  5/16 m e ∆ 4/3 e m8/7 . If ∆ > (m)4/7 , then, for α = ∆ 3/4 and β = ∆1/8 √m, the query O =O  (m)  3/16 3/4   15/16  ∆ m m e e complexity is O = O 5/4 . 5/4

Proof. The correctness of the test follows from Lemmas 8.8, 8.9, and 8.5. The query complexity follows from Lemmas 7.7 and 8.1 (The chopping lemma).

9

Lower bounds for bounded-degree graphs

In this section we prove the following theorems. Theorem 9.1 Non-adaptive q  testing for Eulerian orientations of bounded degree graphs with 2-sided log m error requires Ω log log m queries. Consequently, adaptive testing requires Ω(log log m) queries. 26

Theorem 9.2 Non-adaptive testing for Eulerian orientations of bounded degree graphs with 1-sided 1 error requires 100 m1/4 queries. Consequently, adaptive testing with 1-sided error requires Ω(log m) queries. We start by proving Theorem 9.1. Next, based on the proof of Theorem 9.1, we prove Theorem 9.2.

9.1

Overview of the proof of Theorem 9.1

The proof of Theorem 9.1 is by Yao’s principle. Namely, for infinitely many natural numbers m we define a graph Gm with m edges, and two distributions over the orientations of Gm . The first distribution, Pm , will contain only Eulerian orientations of Gm , and the second distribution, Fm , will contain orientations that are with high probability -far from being Eulerian, where q   = 1/64.

log m Then we show that any non-adaptive deterministic algorithm that makes o log log m orientation queries cannot distinguish between the distributions Pm and Fm with probability larger than 1/5. All our underlying graphs Gm are two dimensional tori, which are 4-regular graphs having a highly symmetric structure (the exact definition is given below). We exploit this symmetry to  q log m construct distributions Pm and Fm such that in both cases, for any fixed set Q of o log log m edges, the orientation of every pair in Q has (with high probability) either no correlation at all, or a correlation that is identical in both cases. To construct these distributions we build the orientations from repeated “patterns” of varying sizes, and show that in order to succeed, a deterministic algorithm must know the approximate size of these patterns.

9.2

Preliminaries

We denote by [`] the set {1, 2, . . . , `}. For i, j ∈ [`] we let ⊕ denote addition modulo `, that is:  i⊕j =

, i+j ≤`

i+j

.

i+j−` , i+j >` Given a graph G and two edges e1 , e2 ∈ E(G), we define the distance between e1 and e2 (or shortly dist(e1 , e2 )) as the minimal distance between an endpoint of e1 and an endpoint of e2 . For an edge e = {u, v} ∈ E(G) and a vertex w ∈ V (G) we define the distance of e from w, or shortly → − dist(e, w), as the minimum of dist(u, w) and dist(v, w). We stress that even in oriented graphs G , the distances between edges and vertices are still measured on the underlying (undirected) graph G. Definition 9.3 (Torus) A torus is a two dimensional cyclic grid. Formally, an ` × ` torus is the graph T = (V,nE) on n = `2 vertices V o= {vi,j : i ∈ [`], E = EH ∪ EV , n j ∈ [`]} and m = 2n edges o where EH = {vi,j1 , vi,j2 } : j2 = j1 ⊕1 and EV = {vi1 ,j , vi2 ,j } : i2 = i1 ⊕1 . We also call EH the set of horizontal edges, and we call EV the set of vertical edges. Two edges e1 , e2 ∈ E are perpendicular if one of them is horizontal and the other is vertical, and otherwise they are parallel. → − Given an orientation T of T , we say that a horizontal edge e = {vi,j , vi,j⊕1 } is directed to the right if vi,j is the start-point of e, and otherwise we say that e is directed to the left. Similarly, we 27

say that a vertical edge e = {vi,j , vi⊕1,j } is directed upwards if vi,j is the start-point of e, and else it is directed downwards. To avoid irrelevant special cases, throughout this section we always assume that ` is even. Now we define a graph operation that is later used in the construction of the distributions Pm and Fm . Definition 9.4 (shifting) Let T be an ` × ` torus, and let a, b ∈ [`]. We say that a mapping π : V (T ) → V (T ) is an (a, b)-shifting of T if for every vi,j ∈ V (T ) we have π(vi,j ) = vi⊕a,j⊕b . Given → − an orientation T of T , shifting preserves the directions of the edges. Namely, if e = {vi,j , vi0 ,j 0 } is directed from vi,j to vi0 ,j 0 in T , then in the (a, b)-shifting of T the edge e0 = {π(vi,j ), π(vi0 ,j 0 )} is directed from π(vi,j ) to π(vi0 ,j 0 ).

9.3

Defining two auxiliary distributions (k)

In this section we describe two simple distributions, Rm and Cm , over the orientations of an ` × ` torus T with m = 2`2 edges. We later build the final distributions Fm and Pm based on Rm and (k) Cm respectively. → − The distribution Rm is simply a random orientation of T ’s edges. Namely, in T ∼ Rm the orientation of each edge e ∈ E(T ) is chosen uniformly at random, independently of the other edges. → − Lemma 9.5 Let T be an orientation of a torus T with m edges, distributed according to the → − distribution Rm . With probability 1 − o(1), there are at least n/4 = m/8 unbalanced vertices in T . Proof.

Recall that V = {vi,j : i, j ∈ [`]} is the set of T ’s vertices. Define a subset I = {vi,j : i + j is even} ⊆ V

of n/2 vertices. Observe that I is an independent set in T , and so every vertex vi ∈ I is balanced with  probability xi = 42 /24 = 3/8 independently from all other vertices in I. By Chernoff’s inequality, the probability that at least half of the vertices in I are balanced is bounded by exp(−n/64). Namely, with probability 1 − o(1) there are at least n/4 unbalanced vertices in I. (k) The second distribution, C√ m , is a distribution over Eulerian orientations of T . The parameter m/2

(k)

k is assumed to divide 2` = 2 . The Eulerian orientations given by Cm are constructed as follows. First, the torus T is decomposed into m/4k edge-disjoint 4k-cycles c1 , c2 , . . . , cm/4k , where each cycle ci has exactly four “corner” vertices, that is vertices that are adjacent to both vertical and horizontal edges. Then, for each cycle ci , one of its two possible Eulerian orientations is chosen → − uniformly at random, independently of other cycles. Let T 0 denote the orientation of T at this → − → − stage. Finally, a, b ∈ [`] are chosen uniformly at random, and T is set to be an (a, b)-shift of T 0 . → − In what follows, for a pair of edges ei , ej ∈ T and an orientation T of T , we say that ei and ej → − → − (k) are independent if either T ∼ Rm , or if T ∼ Cm and the edges ei ,ej reside in different 4k-cycles ci , cj . A set Q ⊆ E(T ) is independent if all pairs e1 , e2 ∈ Q are independent. Observe that if Q is independent, then the orientation of every e ∈ Q is distributed uniformly at random, independently − → of the other members of Q. Clearly if T is distributed according to Rm then every set Q ⊆ E(T ) is (k) independent, but this is not the case for orientations distributed according to Cm . In the following lemmas we claim that under some conditions on the set Q, with high probability it is independent → − (k) even if T is distributed according to Cm . 28

Lemma 9.6 Let T be an ` × ` torus with m edges, and let e1 , e2 ∈ E(T ) be two perpendicular edges → − (k) of T . Let T be an orientation of T distributed according to Cm , for an integer k that divides `/2. Then the probability that e1 and e2 are independent is at least 1 − k2 . Proof. Suppose that e1 and e2 are not independent. Then they must reside in the same cycle c0 . Observe that there is a unique vertex v0 defined by e1 and e2 , which must be one of the “corner” vertices of c0 , namely, v0 must be an endpoint of both horizontal and vertical edges of c0 . The 2 `2 number of 4k-cycles that are required to cover all edges of T is 2` 4k = 2k , so the total number of 2 `2 corner vertices is 4 · 2k = 2`k . Therefore, the fraction of corner vertices is exactly 2/k. Since in the (k)

last step of the definition of Cm the orientation is randomly shifted, the probability that v0 is not a corner of any of the 4k-cycles is exactly 1 − 2/k. Lemma 9.7 Let T be an √ ` × ` torus with m edges and let k be an integer that divides `/2. Let Q ⊆ E(T ) be a set of o( k) edges such that for every pair e1 , e2 ∈ Q either dist(e1 , e2 ) > 2k, or → − (k) e1 and e2 are perpendicular. Then for an orientation T of T distributed according to Cm , the probability that Q is independent is 1 − o(1). Proof. Fix a pair e1 , e2 ∈ Q. If dist(e1 , e2 ) > 2k then e1 and e2 must reside in different 4k-cycles, and hence they are independent. Otherwise, e1 , e2 are perpendicular, and by Lemma 9.6 they are independent with probability at least 1 − k2 . Now the proof is completed by applying the union bound for all o(k) pairs e1 , e2 ∈ Q.

9.4

Defining the distributions Pm and Fm

First we need to define the following operation, that allows us to construct orientations of large tori based on orientations of small ones. This operation preserves the distance of the orientations → − from being Eulerian. In particular, if it is applied on T ∼ Rm , then the resulting orientation is far → − from being Eulerian as long as T contains many unbalanced vertices (and then we can use Lemma → − (k) 9.5), and on the other hand, when applied on the Eulerian orientation T ∼ Cm then the resulting orientation is also Eulerian. → − Definition 9.8 (t-tiling) Let T be a directed `×` torus, and let t be a natural number. We define → − → − the t-tiling of T as a 2t` × 2t` directed torus T t which is constructed as follows. → − For convenience, let vi,j , i, j ∈ [`] denote the vertices of T and let ui,j , i, j ∈ [`] denote the − → → − vertices of T t . First, we partition the 2t` × 2t` torus T t into `2 disjoint 2t × 2t grids {Gi,j }i,j∈[`] , → − → − where every grid Gi,j is associated with the vertex vi,j ∈ V ( T ). The partition of T t into `2 grids → − Gi,j is according to the original layout of the vertices vi,j in the ` × ` torus T . Formally, For every i, j ∈ [`], the grid Gi,j contains the vertices V (Gi,j ) = {ui0 ,j 0 : 2t(i − 1) < i0 ≤ 2ti, 2t(j − 1) < j 0 ≤ 2tj}. Next, every grid Gi,j is partitioned into the following four disjoint t × t grids. → − • The grid Ri,j , which we call the representative of the vertex vi,j ∈ V ( T ). r , which we call the padding to the right of R . • The grid Pi,j i,j

29

b , which we call the padding below R . • The grid Pi,j i,j rb , which we call the padding diagonally to R . • The grid Pi,j i,j

These four t × t grids are composed into a 2t × 2t grid Gi,j by placing Ri,j in the upper left corner, r to the right of R , placing P b below R rb r placing Pi,j i,j i,j and by placing Pi,j below Pi,j . See an example i,j of how a 2 × 2 torus corresponds to its 3-tiling in Figure 1. Figure 1: A 2 × 2 torus and its 3-tiling. The vertex v2,1 is encircled in the original torus, and its r , P b and P rb are encircled in the 3-tiling. corresponding sub-grids R2,1 , P2,1 2,1 2,1

→ − → − →t 1 , r 2 , . . . , r t ∈ V (− The orientation T t is defined as follows. For every vi,j ∈ V ( T ), let ri,j T ) be i,j i,j the t vertices on the main diagonal of the representative grid Ri,j . For every edge e = {vi,j , vi0 ,j 0 } ∈ → − E( T ) directed from vi,j to vi0 ,j 0 and every h ∈ [t], we orient the edges on the shortest path from →t − h to r h h h 0 ri,j i0 ,j 0 in a way that forms a directed path from ri,j to ri0 ,j 0 . For every edge e ∈ E( T ) that def

participates in this path, we call e the originating edge of e0 , and we use the notation org(e0 ) = e. As for the remaining edges (the “padding” part), all horizontal edges are directed to the right, and all vertical edges are directed upwards, as per Definition 9.3. See an example in Figure 2. For def every padding edge e we define org(e) = ∅, since they have no origin in T . The next lemma states that a tiling of an Eulerian torus is also Eulerian, while on the other hand, a tiling of a torus with many unbalanced vertices results with a torus which is far from being Eulerian. → − → − → − Lemma 9.9 Let T be a directed ` × ` torus on n = `2 vertices, and let T t be the t-tiling of T for some natural number t. Then , → − → − • If T is Eulerian, then T t is also Eulerian. → − → − δ • For every 0 < δ < 1, if T contains δn = δ`2 unbalanced vertices, then T t is 16 -far from being Eulerian. 30

Figure 2: A 2 × 2 directed torus and the corresponding 3-tiling. The vertex v2,1 is encircled in 1 , r 2 and r 3 are encircled in its tiling. In the original torus, and the corresponding vertices r2,1 2,1 2,1 addition, the edge {v1,2 , v1,1 } is emphasized in the original torus, and the corresponding edges are emphasized in the 3-tiling. Notice the circular orientation of the dashed edges (the padding part).

→ − Proof. The first statement of the lemma follows from Definition 9.8. Assume now that T has δn unbalanced (spring or drain) vertices. According to the definition of a t-tiling, for every unbalanced → − 1 , r 2 , . . . , r t on the main diagonal of vertex vi,j ∈ V ( T ) we have exactly t unbalanced vertices ri,j i,j i,j → − → − vi,j ’s representative grid Ri,j in T t , so the number of unbalanced vertices in T t is δnt. In addition, → − 1 , r 2 , . . . , r t are also whenever vi,j is a spring (respectively drain) vertex in T , the vertices ri,j i,j i,j →t − springs (respectively drains) in T , so every pair of spring-drain vertices must reside in different grids Ri,j and Ri0 ,j 0 . This implies that (due to the padding parts) the distance from any spring → − → − vertex to any drain vertex in T t is at least t. Consequently, every correction path in T t must be of → − length at least t. Since every correction path in T t can balance at most two unbalanced vertices, → − δnt2 /2 δ and since the length of every such path is at least t, we conclude that T t is tδnt/2 →t = 8nt2 = 16 -far − |E( T )|

from being Eulerian. → − → − Lemma 9.10 Let T t be a t-tiling of a randomly oriented `×` torus T ∼ Rm . Then with probability →t − 1 − o(1), T is 1/64-far from being Eulerian. Proof.

Follows by combining Lemma 9.9 (with δ = 1/4) and Lemma 9.5.

Now we describe the distributions Pm and Fm over the orientations of an ` × ` torus T , where m = 2`2 . To avoid divisibility concerns, we assume that ` = 2k , and k = 2b for some natural number b > 1. It is easy to verify that the same proof works also for general values of ` and k by using rounding as appropriate. Distribution Pm :

→ − Choosing T ∼ Pm is done according to the following steps. 31

• Choose s uniformly at random from the range [k/4, k/2]. Let t = 2s , that is, t can take values in the range [n1/4 , n1/2 ]. • For an

` 2t

×

` 2t

log ` 4

→ − (k) torus H, choose a random orientation H according to the distribution Cm .

→ − − → • Set T 0 to be the t-tiling of H . → − − → • Choose a, b ∈ [`] uniformly at random, and set T to be the (a, b)-shift of T 0 . Distribution Fm :

→ − Choosing T ∼ Fm is done according to the following steps.

• Choose s uniformly at random from the range [k/4, k/2] and set t = 2s . → − • For an 2t` × 2t` torus H, choose a random orientation H according to the distribution Rm . → − − → • Set T 0 as the t-tiling of H . → − − → • Choose a, b ∈ [`] uniformly at random, and set T to be the (a, b)-shift of T 0 .

9.5

Bounding the probability of distinguishing between Pm and Fm

→ − → − According to Lemma 9.9, T ∼ Pm is always Eulerian, and according to Lemma 9.10, T ∼ Fm 1 -far from being Eulerian. Our aim is to show that any non-adaptive is with high probability 64 q  log ` deterministic algorithm making o queries will fail to distinguish between orientations log log ` that are distributed according to Pm , and those q that are distributed according to Fm . Let Q ⊆ E(T ) be a fixed set of at most

1 10

log ` log log `

edges queried by a deterministic algorithm.

def

Let org(Q) = {org(e) : e ∈ Q} ⊆ E(H) be the set of Q’s originating edges (see Definition 9.8). → − Clearly, if T is distributed according to Fm , then the set org(Q) ⊆ E(H) is independent in H, and hence the distribution of the orientations of the edges in org(Q) ⊆ E(H) is uniform. Therefore, it → − is enough to show that whenever T is distributed according to Pm , with probability at least 54 the set org(Q) is independent in H too. In order to prove this we need the next lemma. → − Lemma 9.11 Let q1 , q2 ∈ E(T ) be a pair of edges within distance x. Let T be a random orientation of T , chosen according to distribution Pm . Then Prs [

t 8(log log ` + 2) ≤ x ≤ 4tk] < . 4k log `

Proof. Observe that there are at most 2 log k + 4 = 2 log log ` + 4 values of s for which a fixed t number x can satisfy 2s−log k−2 = 4k ≤ x ≤ 4tk = 2s+log k+2 . On the other hand, s is distributed uniformly among k4 = log4 ` possible values, so the lemma follows. → − For a set Q ⊆ E(T ) and an orientation T of T , we define the following event IQ : For all pairs t q1 , q2 ∈ Q either dist(q1 , q2 ) < 4k or dist(q1 , q2 ) > 4tk in T . The next lemma is a direct consequence of Lemma 9.11. q → − log ` 1 Lemma 9.12 Let Q ⊆ E(T ) be a fixed set of 10 log log ` edges, and let T be a random orientation of T , according to distribution Pm . Then the event IQ occurs with probability at least 9/10. 32

Proof. Follows by applying the union bound on the inequality from Lemma 9.11 for all pairs q1 , q2 ∈ Q. √ Recall that since |org(Q)| = o( log `) = o(k), according to Lemma 9.7 it is enough to show 4 that with probability at least   5 the set org(Q) is such that for every pair org(e1 ), org(e2 ) ∈ org(Q) either dist org(e1 ), org(e2 )

> 2k in H, or org(e1 ) and org(e2 ) are perpendicular. Clearly, if   dist(e1 , e2 ) > 4tk in T , then dist org(e1 ), org(e2 ) > 2k in H. We take care of the other case in the next lemma.

Lemma 9.13 Conditioned on the event IQ , with probability 1 − o(1) for every pair of edges e1 , e2 ∈ t Q such that dist(e1 , e2 ) < 4k one of the following holds: (1) e1 , e2 are perpendicular; (2) one of e1 , e2 has no origin in H; (3) org(e1 ) = org(e2 ). Before proving the lemma, observe that since IQ occurs with probability at least 9/10, the conditions 9 of Lemma 9.7 are satisfied by Q with (total) probability at least 10 − o(1), and hence org(Q) is 9 independent with probability at least 10 − 2 · o(1) > 4/5, where the probabilities are taken over Pm . t Proof. Fix a pair e1 , e2 ∈ Q such that dist(e1 , e2 ) < 4k . If one of them has no originating edge in t H then we are done. Otherwise, since dist(e1 , e2 ) < 4k , by the definition of the t-tiling org(e1 ) and org(e2 ) must have a common endpoint in H, say vi,j . If e1 and e2 are perpendicular, then again we are done. On the other hand, if e1 and e2 are parallel, then in order to have different origins they must be separated by the the main diagonal of Ri,j (the representative grid of the common vertex vi,j ). Notice that this may happen only if the distance of both e1 and e2 from the main diagonal t is at most 4k . But the probability that an edge is within that distance from the main diagonal t 1 1 of some representative grid is at most 4k t t2 = 4k = o( logloglog` ` ). Now the proof is completed by applying the union bound for all pairs e1 , e2 ∈ Q .

9.6

Proof of Theorem 9.1

Let Q ⊆ E(T ) be the fixed set of t Pm

1 10

q

log ` log log `

edge queries that a deterministic algorithm makes.

t Fm

For every t let and be (Pm | t), that is Pm conditioned on fixed t, and (Fm | t) respectively. Note that fixing t, org(Q) becomes defined and, in particular, it is the same fixed set for inputs → t , then the set t and F t . Now, for every t, if − T is distributed according to Fm drawn according to Pm m org(Q) of originating edges is independent. On the other hand, by Lemmas 9.11, 9.12, and 9.13 the set org(Q) is independent with probability at least 4/5 (with respect to the choice of t). Thus with probability at least 4/5 (over the choice of s which determines t), the set org(Q) is independent t on Q is identical to that of F t . Altogether, this implies which implies that the restriction of Pm m that q distinguishing q between  the distributions with probability larger than 1/5 requires more than 1 10

9.7

log ` log log `

=Ω

log m log log m

queries.

Overview of the proof of Theorem 9.2

As opposed to testers with 2-sided error, a tester with 1-sided error is not allowed to reject the input unless a negative witness was found. In our case, as claimed in Lemma 3.1, the only possible witness that an orientation is not Eulerian is an invalid cut, i.e. a (possibly partial) cut that cannot be made balanced under any orientation of the non-queried edges. 33

Following this observation, we prove Theorem 9.2 using the distribution Fm defined in section 0 which are similar to the distributions F , except that t is 9.4. First we define distributions Fm m − → fixed to be `/16, and the orientation H of an 8 × 8 torus H is fixed to be one that makes all 64 0 , any vertices unbalanced. Then we show that for orientations that are distributed according to Fm 1/4 non-adaptive deterministic algorithm that makes o(n ) orientation queries cannot find an invalid 1 cut (a negative witness) with probability larger than 1/5. This will imply that there exists an 16 -far orientation on which any randomized tester fails with probability at least 4/5. The main idea is as follows. A cut can be invalid (and hence unbalanced) only if both its components contain unbalanced vertices. Let us now fix a cut (A, B) of an ` × ` torus T , and let → − 0 . Suppose that indeed both A and B contain T be an orientation of T chosen according to Fm unbalanced vertices, and let Q be the subset of the edges in the cut (A, B) that witness its invalidity. Since the underlying graph T is a torus, one can show that either Q contains Ω(n1/4 ) edges, or otherwise, one of the edges e ∈ Q must be within distance at most O(n1/4 ) from one of the → − → − 0 is O(`) = O(√n), unbalanced vertices of T . Since the amount of unbalanced vertices in T ∼ Fm and since they are grouped into 64 diagonals of length `/32, the number of edges that are within → − distance O(n1/4 ) or less from these unbalanced vertices is bounded by O(n3/4 ). Finally, since T is randomly shifted the probability that a set Q of size o(n1/4 ) contains any such edge goes to zero.

9.8

Proof of Theorem 9.2

0 . First we formally define the distribution Fm

→ − 0 is done according to the following steps. Choosing T ∼ Fm √ m/2 • Set t = `/16 = 16 .

0 : Distribution Fm

→ − • Fix an orientation H of a 2t` × unbalanced. → − → − • Set T 0 to be the t-tiling of H .

` 2t

→ − = 8 × 8 torus H, such that H makes all 64 vertices of H

→ − − → • Pick a, b ∈ [`] uniformly at random and set T to be the (a, b)-shift of T 0 . → − Lemma 9.14 Let T be an ` × ` torus, and let T be an orientation of T distributed according to → 0 . Let Q be a fixed set of 1 m1/4 edges from E(T ). Then the probability (over − 0 ) that Fm T ∼ Fm 100 √ → − one of the edges in Q are within distance at most ` from an unbalanced vertex v ∈ V ( T ) is at most 1/5. → − 0 . Observe that |U | = 64t = Proof. Let U denote the set of unbalanced vertices in T ∼ Fm √ 4` = 4 n, and recall that the vertices in U ⊆ V (T ) are grouped into 64 diagonals of length √ t (see Definition 9.8). Thus the amount of vertices√v ∈ V√ (T ) that are within distance at most ` from some vertex u ∈ U is bounded by 64 · (t√+ 2 `) · 2 ` ≤ 10`3/2 ≤ 20m3/4 . So the probability of a single edge e ∈ Q satisfying dist(e, u) ≤ ` for some u ∈ U is bounded by 20m−1/4 and the lemma follows. As stated in the overview, the proof of Theorem 9.2 relies on the properties of the two dimensional grid. We point out the required features in the next lemma. 34

→ − Lemma 9.15 Let T be an ` × ` torus, and let T be a non-Eulerian orientation of T . Denote → − by U ⊆ V (T ) the set of unbalanced vertices with respect to T . Let Q ⊆ E(T ) be a set of edges → − forming a witness that T is not Eulerian. Denote by q the minimal distance of an edge in Q from an unbalanced vertex, that is, def q = mine∈Q, u∈U {dist(e, u)}. Then |Q| ≥ q. → − Proof. According to Lemma 3.1, if Q is a witness that T is not Eulerian, then Q must contain → − more than half of the edges of some invalid cut, say (A, B), in T . Assume without loss of generality that |A| ≤ |B|, and that the subgraph of T induced on A is connected (otherwise take another unbalanced cut (A0 , B 0 ) witnessed by Q, such that A0 ⊂ A). Since (A, B) is an unbalanced cut, A must contain at least one unbalanced vertex u ∈ U . On the other hand, Q > |E(A,B)| and 2 mine∈Q {dist(e, u)} ≥ q. Therefore, if A forms a connected subgraph and the cut E(A, B) contains edges that are within distance ≥ q from u ∈ A, then E(A, B) must contain at least 2q edges, and consequently, Q must contain at least q edges. Now we can easily complete the proof of Theorem 9.2 by combining Lemma 9.14 and Lemma 1 m1/4 edge queries that a deterministic algorithm 9.15. Namely, let Q ⊆ E(T ) be the fixed set of 100 → − makes. By Lemma 9.15, in order to form a witness that T is not Eulerian, one of the edges in √ → − 1 0 . But Q must be within distance at most 100 m1/4 < ` from an unbalanced vertex in T ∼ Fm according to Lemma 9.14, the probability of the above is at most 1/5. We thus conclude that → − 0 is not Eulerian with probability larger than 1/5 requires more discovering a witness that T ∼ Fm 1 than 100 m1/4 queries.

10

Concluding comments and open problems

For general graphs we have shown a test that has a sub-linear number of queries in most cases. The test procedure is surprisingly involved considering the problem statement. However, unlike the special cases of dense graphs and expander graphs, this should be only considered as a first step for this problem, as many questions still remain open. First, the question arises as to whether we can reduce the computational complexity of the test. While having a sub-linear number of queries, the exponential in m time complexity makes this test unrealistic for implementation. Also, to make the test truly attractive, not only its computation time needs to be polynomial in m, but most of the calculations should be performed in a preprocessing stage, while the amount of calculations done while making the queries should ideally also be sub-linear in m. Related to the preprocessing question is the unresolved question of adaptivity. The current test is adaptive, but we would like to think that a sub-linear query complexity non-adaptive test also exists for the same class of graphs. Other adaptive versus non-adaptive gaps, such as the one concerning the 2-sided lower bounds, need also be addressed.

Acknowledgments We wish to thank Oded Lachish for introducing us to this problem and for his helpful discussion.

35

References [1] N. Alon, E. Fischer, I. Newman and A. Shapira, A combinatorial characterization of the testable graph properties: It’s all about regularity, Proceedings of the 38th ACM STOC (2006), 251–260. [2] N. Alon and A. Shapira, A Characterization of the (natural) graph properties testable with one-sided error, Proceedings of the 46th IEEE FOCS (2005), 429–438, Also SIAM Journal on Computing, to appear. [3] L. Babai, On the diameter of Eulerian orientations of graphs, Proceedings of the 17th SODA, (2006), 822–831. [4] E. Ben-Sasson, P. Harsha and S. Raskhodnikova, Some 3CNF properties are hard to test, SIAM Journal on Computing:35(1), 1–21, 2005 (a preliminary version appeared in Proc. 35th STOC, 2003). [5] M. Blum, M. Luby and R. Rubinfeld, Self-testing/correcting with applications to numerical problems. Journal of Computer and System Sciences 47 (1993), 549–595 (a preliminary version appeared in Proc. 22nd STOC, 1990). [6] G. R. Brightwell and P. Winkler, Counting Eulerian circuits is #P-complete, Proc. 7th ALENEX and 2nd ANALCO 2005 (Vancouver BC), C. Demetrescu, R. Sedgewick and R. Tamassia, eds, SIAM Press, 259–262. [7] S. Chakraborty, E. Fischer, O. Lachish, A. Matsliah and I. Newman: Testing st-Connectivity, Proceedings of the 11th RANDOM and the 10th APPROX (2007): 380–394. [8] E. Fischer, The art of uninformed decisions: A primer to property testing, Current Trends in Theoretical Computer Science: The Challenge of the New Century, G. Paun, G. Rozenberg and A. Salomaa (editors), World Scientific Publishing (2004), Vol. I 229-264. [9] E. Fischer and O. Yahalom, Testing convexity properties of tree colorings, Proc. of the 24th International Symposium on Theoretical Aspects of Computer Science (STACS 2007), LNCS 4393, Springer-Verlag 2007, 109–120. [10] H. Fleishcner, Eulerian graphs and related topics, Part 1. Vol. 1. Annals of Discrete Mathematics 45, 1990. [11] H. Fleishcner, Eulerian graphs and related topics, Part 1. Vol. 2. Annals of Discrete Mathematics 50, 1991. [12] O. Goldreich, S. Goldwasser and D. Ron, Property testing and its connection to learning and approximation, JACM 45(4): 653-750 (1998). [13] O. Goldreich and D. Ron, Property testing in bounded degree graphs, Algorithmica, (2002), 32(2):302– 343 [14] S. Halevy, O. Lachish, I. Newman and D. Tsur, Testing properties of constraint-graphs, Proceedings of the 22nd IEEE Annual Conference on Computational Complexity (CCC 2007), 264–277. [15] S. Halevy, O. Lachish, I. Newman and D. Tsur, Testing orientation properties, Electronic Colloquium on Computational Complexity (ECCC), (2005), 153. [16] T. Ibaraki, A. V. Karzanov and H. Nagamochi, A fast algorithm for finding a maximum free multiflow in an inner Eulerian network and some generalizations, Combinatorica 18(1) (1988), 61–83. [17] T. Kaufman, M. Krivelevich and D. Ron, Tight bounds for testing bipartiteness in general graphs, SICOMP, (2004), 33(6):1441–1483. [18] L. Lov´asz, On some connectivity properties of Eulerian graphs, Acta Math. Hung. 28 (1976) 129–138.

36

[19] M. Mihail, P. Winkler, On the number of Eulerian orientations of a graph, Algorithmica 16(4/5) (1996), 402–414. [20] M. Parnas and D. Ron, Testing the diameter of graphs, Random Struct. and Algorithms, (2002), 20(2):165–183. [21] P. A. Pevzner, H. Tang and M. S. Waterman, An Eulerian path approach to DNA fragment assembly, Proc. Natl. Acad. Sci. USA 98, (2001), 9748–9753. [22] R. W, Robinson, Enumeration of Euler graphs, In Proof Techniques in Graph Theory (Ed. F. Harary), New York: Academic Press, 1969, 147–153. [23] D. Ron, Property testing (a tutorial), In: Handbook of Randomized Computing (S. Rajasekaran, P. M. Pardalos, J. H. Reif and J. D. P. Rolim eds), Kluwer Press (2001), Vol. II Chapter 15. [24] R. Rubinfeld and M. Sudan, Robust characterization of polynomials with applications to program testing, SIAM Journal on Computing 25 (1996), 252–271 (first appeared as a technical report, Cornell University, 1993). [25] W. T. Tutte, Graph theory, Addison-Wesley, New York, 1984.

A

Appendix

Lemma A.1 For every δ > 0 we have 2 eδ < e−δ /8 (1+δ) (1 + δ)

Proof.

¿From the Taylor expansion, we have ln(1 + δ) = δ −

δ2 δ3 + − ... 2 3

and δ · ln(1 + δ) = δ 2 −

δ3 δ4 + − .... 2 3

Summing the two series we obtain that (1 + δ) ln(1 + δ) > δ +

δ3 δ2 δ2 − >δ+ . 2 6 8

Therefore (1 + δ) ln(1 + δ) − δ > and hence

δ2 , 8

2 eδ < e−δ /8 . (1+δ) (1 + δ)

37