Creating a Giant Component∗

Tom Bohman†, David Kravitz‡
Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213

October 26, 2003

Abstract. Let c be a constant and (e1, f1), (e2, f2), ..., (ecn, fcn) be a sequence of ordered pairs of edges on vertex set [n] chosen uniformly and independently at random. Let A be an algorithm for the on-line choice of one edge from each presented pair, and for i = 1, ..., cn let GA(i) be the graph on vertex set [n] consisting of the first i edges chosen by A. We prove that all algorithms in a certain class have a critical value cA for the emergence of a giant component in GA(cn) (i.e. if c < cA then with high probability the largest component in GA(cn) has o(n) vertices, and if c > cA then with high probability there is a component of size Ω(n) in GA(cn)). We show that a particular algorithm in this class with high probability produces a giant component before 0.385n steps in the process (i.e. we exhibit an algorithm that creates a giant component relatively quickly). An algorithm considered by Spencer and Wormald also lies in the class, and the fact that this algorithm has a critical value resolves a conjecture of Spencer. We establish a lower bound on the time of emergence of a giant component in any process produced by an on-line algorithm, and show that there is a phase transition for the off-line version of the problem of creating a giant component.

1 Introduction

Let c be a constant and (e1, f1), (e2, f2), ..., (ecn, fcn) be a sequence of pairs of edges on vertex set [n] = {1, 2, ..., n} chosen uniformly and independently at random. Our goal is to choose exactly one edge from each pair to form a graph that contains a giant component (i.e. a component of size Ω(n)). We consider both the on-line version (pairs appear sequentially and the choice of edge from

∗Results similar to Theorems 3 and 4 were obtained independently by Flaxman, Gamarnik and Sorkin [9].
†Supported in part by NSF grant DMS-0100400.
‡Supported in part by a VIGRE grant from the NSF, DMS-9819950.


the pair (et, ft) is made without knowledge of future edges) and the off-line version (the choice of edge from the pair (et, ft) is made with complete knowledge of the full set of edges) of the problem. It is well known, a classical result of Erdős and Rényi [7], that for any c > 1/2 we can create a giant component whp by choosing ei for all i, and for any c < 1/4 whp we cannot create a giant even if we are allowed to choose all 2cn edges. Thus, the question is interesting for c ∈ (1/4, 1/2). Recently, work has been done on a very similar problem in which the object is to avoid a giant component instead of creating one. Here, the interesting case is c > 1/2. Bohman and Frieze introduced an on-line algorithm and proved that whp it produces a graph with no giant component for any c < 0.535 [3]. Spencer and Wormald claim that c = 0.89 can be achieved by an on-line algorithm [10]. Bohman and Kim showed that the off-line version of the avoiding a giant component problem has a threshold at c_off roughly equal to 0.976 [5]. (If c > c_off then whp every graph that consists of at least one edge from each pair (e1, f1), ..., (ecn, fcn) has a giant component, and if c < c_off then whp there exists a choice of one edge from each pair that produces a graph in which the largest component has size o(n).) This threshold is strictly greater than an upper bound on the on-line version of the avoiding a giant component problem that was established by Bohman, Frieze and Wormald [4]; in other words, there exist values of c for which any on-line algorithm whp produces a graph with a giant but there exists a way to make an off-line choice of one edge from each pair that succeeds whp in producing a graph with no giant component. We begin with general on-line processes. Let A be an on-line algorithm for the choice of one edge from each presented pair (i.e. the choice of edge from the pair (ei, fi) is made without knowledge of pairs (ej, fj) such that j > i).
Let GA(i) be the graph on vertex set [n] consisting of the first i edges chosen by A. This produces a random graph process GA(1), GA(2), ..., GA(cn). Such a process is called an Achlioptas process, after Dimitris Achlioptas who posed the question of on-line avoidance of a giant component. Note that this class of processes includes both processes produced by algorithms designed for avoiding a giant component and processes produced by algorithms designed for creating a giant. Our main result is a general theorem that establishes the existence of a critical value for the emergence of a giant component for a class of Achlioptas processes. For notational convenience let et+1 = {ut+1, vt+1}, for t = 0, ..., cn − 1, and let xt+1 and yt+1 be the sizes of the components containing ut+1 and vt+1, respectively, in GA(t). We call A a bounded first-edge algorithm if there is a constant a and a fixed set SA ⊆ {(i, j) : i, j ∈ [a] ∪ {ℓ}}, where ℓ is a symbol (i.e. ℓ ∉ [a]), such that A accepts edge et+1 if and only if one of the following holds:

1. (xt+1, yt+1) ∈ SA,

2. (xt+1, ℓ) ∈ SA and yt+1 > a,

3. (ℓ, yt+1) ∈ SA and xt+1 > a,

4. (ℓ, ℓ) ∈ SA and xt+1, yt+1 > a.

Thus, the symbol ℓ simply indicates all integers greater than a. In words, a bounded first-edge algorithm observes the component sizes of the vertices of et+1, accepts et+1 if these component sizes fall in some prescribed set, and otherwise accepts the purely random edge ft+1.

Theorem 1. Let A be a bounded first-edge algorithm. There exists a constant cA such that

(a) If c < cA then whp the largest component in the graph GA(cn) has O((log n) n^{12/13}) vertices, and

(b) If c > cA then whp the graph GA(cn) has a component of size Ω(n).
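To make the bounded first-edge setup concrete, here is a small simulation sketch (our own illustration, not code from the paper). It tracks components with a union-find structure and uses one particular rule, a = 1 and SA = {(ℓ, ℓ)}: accept et+1 exactly when both of its endpoints already lie in components of size at least 2, and otherwise fall back to the purely random edge ft+1.

```python
import random

def find(parent, v):
    # Root lookup with path halving.
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def achlioptas_process(n, steps, rng):
    """Run a bounded first-edge Achlioptas process on vertices 0..n-1.

    Illustrative rule (a = 1, S_A = {(l, l)}): accept the first edge e of
    each pair iff both endpoints already lie in components of size >= 2;
    otherwise take the purely random second edge f.  Loops are allowed,
    matching the with-replacement model used later in the paper.
    """
    parent = list(range(n))
    size = [1] * n
    for _ in range(steps):
        e = (rng.randrange(n), rng.randrange(n))
        f = (rng.randrange(n), rng.randrange(n))
        ru, rv = find(parent, e[0]), find(parent, e[1])
        chosen = e if size[ru] >= 2 and size[rv] >= 2 else f
        a, b = find(parent, chosen[0]), find(parent, chosen[1])
        if a != b:  # merge the two components, smaller under larger
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
    roots = {find(parent, v) for v in range(n)}
    return sorted((size[r] for r in roots), reverse=True)

if __name__ == "__main__":
    comps = achlioptas_process(3000, 3000, random.Random(0))
    print("largest component:", comps[0], "of", sum(comps))
```

Run at c = 1, above the classical Erdős-Rényi threshold 1/2, any choice rule yields a linear-size component whp; locating a rule's actual critical value cA is what the differential-equation analysis below is for.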

The main tool in the proof of Theorem 1 is the differential equations method for random graph processes (see Wormald [11] for an excellent treatment of this method). The critical value cA is given by the blow-up point in the differential equation for the sum of the squares of the component sizes, as was conjectured by Spencer (Conjecture 6 of [10]) for a broad class of algorithms, called bounded size algorithms (these algorithms consider the component sizes of all 4 vertices in et ∪ ft – without making any distinction between component sizes that exceed some fixed bound – before making a decision between et and ft), that includes the bounded first-edge algorithms. Spencer also announced that he and Wormald have a proof that for any bounded size algorithm A and c < cA the largest component in GA(cn) has at most log^{O(1)} n vertices (see HornBlowing in [10]). This is a much stronger bound than that given by part (a) of Theorem 1, and in light of this no attempt to strengthen part (a) of Theorem 1 is made here. Indeed, our focus is part (b) of Theorem 1, which establishes the creation of a giant component. The machinery introduced in the proof of Theorem 1 can be applied to other situations. Note that bounded first-edge algorithms are static, that is, they employ the same rule throughout the process. The proof goes through, for example, for certain algorithms that always make the choice between et+1 and ft+1 without observing ft+1 but allow the rule to change some bounded number of times during the process. The main limitation of the proof of Theorem 1 is that it requires a supply of chosen edges around the critical point that are purely random. Since the critical value cA in Theorem 1 is given by a blow-up point in a system of differential equations, it can be estimated by numerically solving the system. In Section 5 we estimate the critical point for two bounded first-edge algorithms.
The first algorithm is designed to create a giant component quickly and achieves the following.

Theorem 2. There exists an Achlioptas process such that for any c > 0.385 whp the graph produced after cn steps has a component of size Ω(n).

The second algorithm was introduced for the purpose of avoiding a giant. This algorithm is the subject of Conjecture 2 of [10], which is resolved by Theorem 1. We establish a lower bound on the on-line version of the creating a giant component problem. We show that all on-line algorithms whp fail to create a giant for c slightly larger than 1/4. Define

    f(d) = (3 + 8d − 8d e^{−4d} − 14d e^{−12d} + 14d e^{−16d}) / (20 − 8 e^{−4d} − 14 e^{−12d} + 14 e^{−16d}).    (1)

Note that f(d) has a local maximum at d ≈ 0.019974, where f(d) ≈ 0.2545.

Theorem 3. Let c be a constant. If there exists d ∈ (0, c) such that c < f(d), then for any Achlioptas process whp all of the components of the graph created in cn steps will be of size O(log n).

Finally, we consider the off-line version of the problem of creating a giant component. Here we are given pairs of random edges (e1, f1), ..., (ecn, fcn), and we try to create a giant by choosing one edge from each pair. In this case we establish a phase transition.

Theorem 4. Let c be a constant and let (e1, f1), (e2, f2), ..., (ecn, fcn) be a sequence of ordered pairs of edges on vertex set [n] chosen independently and uniformly at random. If c > 1/4 then whp there exists a collection of edges E such that |E ∩ {ei, fi}| ≤ 1 for all i and the graph ([n], E) has a component of size Ω(n).

Note that it follows from Theorems 3 and 4 that there exist values of c for which any on-line algorithm whp produces a graph with no giant but there exists an off-line choice of one edge from each pair that succeeds whp in producing a graph with a giant component (which is analogous to the situation for the problem of avoiding a giant component). The remainder of this paper is organized as follows. In Section 2 we give the proof of our main result, Theorem 1. In Section 5 we discuss methods for estimating the critical values for bounded first-edge algorithms, prove Theorem 2, and make some comments about the smallest value of c for which an Achlioptas process can create a giant component. Section 3 consists of the proof of Theorem 3, and Section 4 consists of the proof of Theorem 4. Finally, in Section 6 we mention a very simple Achlioptas process that succeeds in creating a giant component for c > 0.46.
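The function f(d) in (1) is easy to evaluate numerically; the following sketch (ours, not from the paper) confirms the stated local maximum f(0.019974) ≈ 0.2545 by a simple grid search over small d.

```python
import math

def f(d):
    """The function f(d) from equation (1)."""
    num = 3 + 8*d - 8*d*math.exp(-4*d) - 14*d*math.exp(-12*d) + 14*d*math.exp(-16*d)
    den = 20 - 8*math.exp(-4*d) - 14*math.exp(-12*d) + 14*math.exp(-16*d)
    return num / den

if __name__ == "__main__":
    # Grid search on (0, 0.1]; the local maximum sits near d = 0.02.
    best = max(f(k / 1e6) for k in range(1, 100000))
    print(round(f(0.019974), 4), round(best, 4))
```

Note that f(0) = 3/12 = 0.25, f rises to its local maximum near d ≈ 0.02, and then dips before eventually growing for large d, which is why the theorem requires some d ∈ (0, c) with c < f(d).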

2 Bounded first-edge algorithms

Let A be a fixed bounded first-edge algorithm. We assume without loss of generality that SA ≠ ([a] ∪ {ℓ})². We will show (roughly) that the sum of the squares of the component sizes of GA(cn) is bounded for c < cA but goes to infinity as c approaches cA. Furthermore, we will show that for c < cA the ‘large’ components do not make a significant contribution to the sum of the

squares of the component sizes. Part (b) of Theorem 1 then follows from an application of Lemma 5.

Lemma 5. Let G be a graph on n vertices and let τ be a constant such that G has ai n vertices in components of size i for i = 1, 2, ..., τ, and a1 + ... + aτ = 1. If e is a constant such that

    2e Σ_{i=1}^{τ} i ai > 1,    (2)

then the graph obtained by adding en random edges to G will whp have a component of size Ω(n).

Note that n times the summation in (2) is the sum of the squares of the component sizes of G. The proof of Lemma 5 is given at the end of this section.

We track a + 2 random variables over the evolution of the process using the differential equations method for random graph processes (we follow the notation and use a variation of a general theorem of Wormald, who gives an excellent treatment of this method [11]). For i = 1, 2, ... let Yi(t) be the number of vertices in components of size i in G′A(t). (This is a slightly modified version of GA(t) which is defined below.) Furthermore, define

    X(t) = Σ_{i=1}^{n} i Yi(t)    and    Z(t) = Σ_{i=1}^{n} i² Yi(t).

Note that X gives the sum of the squares of the component sizes while Z gives the sum of the cubes of the component sizes. We track the random variables Y1, Y2, ..., Ya, X, Z and show that Y1, Y2, ..., Ya, X are concentrated around an expected trajectory. We will only give an upper bound on the growth of Z over the course of the algorithm. The relationship between the sum of the squares of the component sizes and the sum of the cubes of the component sizes near the critical value is also a key feature of a recent result of Aldous and Pittel [1] on the emergence of the giant component in a version of the random graph in which both vertices and edges appear as the process evolves.

We define a set of a + 2 functions on R^{a+2}. For (z1, ..., za, zX, zZ) ∈ R^{a+2} set

    zℓ = 1 − z1 − ... − za    and    ρ = Σ_{(x,y)∈SA} zx zy.


For i = 1, ..., a define

    fi(z1, ..., za, zX, zZ) = Σ_{(x,y)∈SA: x+y=i} zx zy i
                             − Σ_{(x,y)∈SA: x=i} zi zy i − Σ_{(x,y)∈SA: y=i} zx zi i
                             + (1 − ρ) i [ Σ_{j=1}^{i−1} zj z_{i−j} − 2 zi ].

Further define

    ξi = i zi,    ζi = i² zi    for i = 1, ..., a,
    ξℓ = zX − Σ_{i=1}^{a} ξi,    ζℓ = zZ − Σ_{i=1}^{a} ζi,

and

    fX(z1, ..., za, zX, zZ) = Σ_{(x,y)∈SA} 2 ξx ξy + (1 − ρ) 2 zX²,

    fZ(z1, ..., za, zX, zZ) = Σ_{(x,y)∈SA} (3 ξx ζy + 3 ξy ζx) + (1 − ρ) 6 zX zZ.

Note that f1, ..., fa, fX, fZ are continuous and satisfy a Lipschitz condition on any bounded domain. We are interested in the solution of the system of differential equations

    dzx/dt = fx(z1, ..., za, zX, zZ),    x ∈ [a] ∪ {X, Z},

with initial condition z1(0) = 1, z2(0) = ... = za(0) = 0, zX(0) = zZ(0) = 1. Note that f1, ..., fa do not depend on zX or zZ. Note further that the solution of the system

    dzx/dt = fx(z1, ..., za, zX, zZ) = fx(z1, ..., za),    x ∈ [a],
    z1(0) = 1, z2(0) = ... = za(0) = 0,    (3)

satisfies

    z1(t) < 1    and    z2(t), ..., za(t), zℓ(t) > 0

for t > 0 in some neighborhood of 0. It follows that the solution of (3) can be viewed as a collection of functions on the positive reals that satisfies

    z1(t), z2(t), ..., za(t), zℓ(t) > 0    for t > 0.    (4)

With these functions in hand we can write

    dzX/dt = fX(z1, ..., za, zX, zZ) = g1(t) + g2(t) zX + g3(t) zX²    (5)

where g1, g2, g3 are bounded smooth functions of t defined on [0, +∞). The differential equation (5) has a unique solution zX(t) passing through zX(0) = 1. Note that this function blows up at some point (to see this, it may be easier to work with ξℓ instead of zX). We define cA to be this blow-up point. With zX(t) in hand we can write

    dzZ/dt = fZ(z1, ..., za, zX, zZ) = g4(t) + g5(t) zZ    (6)

where g4 and g5 are smooth, bounded functions of t on intervals of the form [0, s) where s < cA. It follows that the unique solution zZ(t) of (6) passing through zZ(0) = 1 exists for 0 ≤ t < cA. In other words zX and zZ blow up at the same point in time, cA. Finally, we note that it follows from (4) that there exists a constant δ > 0 such that

    z1(t), z2(t), ..., za(t), zℓ(t) > δ    (7)

for 1/4 ≤ t ≤ 2cA.

We now add a small wrinkle to the graph process GA(1), GA(2), ... to produce G′A(1), G′A(2), .... Set r(n) = n^{1/13}. If A calls for the acceptance of an edge {x, y} at round i and x or y is in a component of size greater than r(n), then the edge is not added to the process (and no edge is added to the graph in that round). This produces the slightly altered graph process G′A(1), G′A(2), .... The random variables Y′1, ..., Y′a, X′, Z′ refer to this altered graph process (i.e. Y′x(i) is the number of vertices in G′A(i) in components of size x, etc.).

Lemma 6. Let 0 < s < cA be fixed. With probability 1 − O(n^{4/13} exp(−n^{1/13})) we have

    Y′x(i) = n zx(i/n) + O(n^{12/13})    for all x ∈ [a],
    X′(i) = n zX(i/n) + O(n^{12/13}),    and
    Z′(i) ≤ n zZ(i/n) + O(n^{12/13}),
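Equation (5) is a Riccati-type equation, and cA is its blow-up time; numerically, such a blow-up point can be estimated by integrating until the solution exceeds a large cap. A minimal sketch of the idea (ours, with made-up constant coefficients; the real g1, g2, g3 are time-dependent and come from the solution of (3)):

```python
def blowup_time(g1, g2, g3, z0=1.0, h=1e-5, cap=1e4):
    """Euler-integrate dz/dt = g1(t) + g2(t)*z + g3(t)*z**2 from z(0) = z0
    and return the first time t at which z exceeds `cap`, a numerical
    proxy for the blow-up point of the solution."""
    t, z = 0.0, z0
    while z < cap:
        z += h * (g1(t) + g2(t) * z + g3(t) * z * z)
        t += h
        if t > 100:  # safety stop: no blow-up detected in this window
            return None
    return t

if __name__ == "__main__":
    # Toy check: dz/dt = z^2, z(0) = 1 has exact solution 1/(1 - t),
    # which blows up at t = 1.
    t_est = blowup_time(lambda t: 0.0, lambda t: 0.0, lambda t: 1.0)
    print(round(t_est, 2))
```

Section 5 of the paper carries out this kind of numerical estimate for the full system to approximate the critical values of particular rules.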

The proof of Lemma 6 is given below. In order to complete the proof of Theorem 1 we must make an observation about the relationship between GA(i) and G′A(i). Note that G′A(i) is not simply a subgraph of GA(i): the edges we neglect in the formation of G′A may influence future decisions. We use the symbol Δ to denote symmetric difference.

Lemma 7. Let 0 < s < cA be fixed. Then

    Pr[ |E(GA(sn)) Δ E(G′A(sn))| = O(n log n / r²(n)) ] = 1 − o(1).

Proof. For i = 1, 2, ... let L(i) be the set of vertices x ∈ [n] such that the component of G′A(i) containing x has more than r(n) vertices. Let D(i) be the set of vertices x ∈ [n] such that the component of GA(i) containing x is different from the component of G′A(i) containing x and the size of at least one of these components is at most a. Let B be the set of rounds i ≤ sn such that E(GA(sn)) ∩ {ei, fi} ≠ E(G′A(sn)) ∩ {ei, fi}. Let A be the event that there exists i < sn such that |L(i)| > Kn/r²(n), where K = 2 zZ(s). Note that it follows from Lemma 6 that the probability of A is exponentially small. We have

    Pr(i + 1 ∈ B) ≤ 4 (|L(i)| + |D(i)|) / n.

Furthermore, |D(i + 1)| ≤ |D(i)| + 3a. We have

    E[|D(i + 1)|] ≤ Σ_{k=0}^{n} Pr[D(i) = k] ( k + 12a (k + Kn/r²(n)) / n ) + n Pr(A)
                 = E[|D(i)|] (1 + 12a/n) + 12Ka/r²(n) + n Pr(A)
                 ≤ E[|D(i)|] (1 + 12a/n) + 24Ka/r²(n)

for n sufficiently large. It follows that

    E[|D(i)|] ≤ (24Ka/r²(n)) Σ_{j=0}^{i} (1 + 12a/n)^j = (2Kn/r²(n)) [ (1 + 12a/n)^{i+1} − 1 ].

A similar calculation gives the bound E[|B|] = O(n/r²(n)).
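As a sanity check on the geometric sum above (ours, with arbitrary test parameters, not values from the paper), the recursion d(i+1) = d(i)(1 + 12a/n) + 24Ka/r² can be iterated and compared against the closed form (2Kn/r²)((1 + 12a/n)^i − 1); the paper's bound uses exponent i + 1, which upper-bounds this.

```python
def iterate(a, K, n, r2, steps):
    # d(i+1) = d(i)*(1 + 12a/n) + 24Ka/r^2, starting from d(0) = 0.
    q, s = 1 + 12 * a / n, 24 * K * a / r2
    d = 0.0
    for _ in range(steps):
        d = d * q + s
    return d

def closed_form(a, K, n, r2, i):
    # Exact solution of the recursion: (2Kn/r^2) * ((1 + 12a/n)^i - 1).
    return (2 * K * n / r2) * ((1 + 12 * a / n) ** i - 1)

if __name__ == "__main__":
    a, K, n, r2, i = 2, 3.0, 1000, 10.0, 500  # arbitrary test parameters
    d = iterate(a, K, n, r2, i)
    print(abs(d - closed_form(a, K, n, r2, i)) < 1e-6 * (1 + d),
          d <= closed_form(a, K, n, r2, i + 1))
```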

Let c < cA. The largest component in G′A(cn) has at most 2r(n) vertices. It then follows from Lemma 7 that whp the largest component in GA(cn) is of size O(n log n / r(n)). This establishes part (a) of Theorem 1.

We now turn to part (b) of Theorem 1. Let ε > 0 and c = cA + ε. Let K1 be a constant such that

    ε δ² K1 > 1.    (8)

(Recall that δ is defined in (7).) There exists t < cA and a constant K2 such that zX(t) > K1 + 3 and zZ(t) < K2 − 1. It follows from Lemma 6 that whp we have X′(tn) > (K1 + 2)n and Z′(tn) < K2 n.

Thus

    Σ_{x=K2}^{∞} x Y′x(tn) ≤ (1/K2) Σ_{x=K2}^{∞} x² Y′x(tn) < Z′(tn)/K2 < n.

It follows that

    Σ_{x=1}^{K2} x Y′x(tn) > (K1 + 1) n.

Now, it follows from Lemma 7 that o(n) of the components in G′A(tn) intersect edges in E(GA(tn)) Δ E(G′A(tn)). Therefore,

    Σ_{x=1}^{K2} x Yx(tn) > K1 n.

We note that (i/n, Y1(i)/n, ..., Ya(i)/n) follows (i/n, z1(i/n), ..., za(i/n)) well past the critical value.

Lemma 8. Let 0 < s < 2cA be fixed. With probability 1 − O(n^{1/4} exp(−n^{1/4}/(8a³))) we have

    Yx(i) = n zx(i/n) + O(n^{3/4})

for all x ∈ [a], uniformly for all 0 ≤ i ≤ sn.

It follows from Lemma 8 and (7) that the probability that fi is chosen by A is at least δ² for each i = tn, ..., cn. Therefore, whp at least

    (3/4) δ² (c − t) n > (3/4) δ² ε n

edges fi with tn < i < cn are chosen by A. As these are purely random edges, part (b) of Theorem 1 follows from an application of Lemma 5.

Proof of Lemma 6. We apply Theorem 5.1 of [11]. Let D be the domain

    D = [0, cA] × [0, 1]^a × [0, 2 zX(s)] × [0, 2 zZ(s)].

The stopping time T is the smallest i for which

    ( i/n, Y′1(i)/n, ..., Y′a(i)/n, X′(i)/n, Z′(i)/n )

does not lie in the domain D.


For note-keeping purposes we set

    Y′ℓ(i) = Σ_{x=a+1}^{2r(n)} Y′x(i) = n − Σ_{x=1}^{a} Y′x(i),
    Y′r(i) = Σ_{x=r(n)+1}^{2r(n)} Y′x(i),
    X′ℓ(i) = Σ_{x=a+1}^{2r(n)} x Y′x(i) = X′(i) − Σ_{x=1}^{a} x Y′x(i),
    X′r(i) = Σ_{x=r(n)+1}^{2r(n)} x Y′x(i),
    X′x(i) = x Y′x(i),    Z′x(i) = x² Y′x(i)    for x = 1, ..., a,
    p(i) = Σ_{(x,y)∈SA} Y′x(i) Y′y(i) / n².

Note that p(i) gives the probability that the edge ei+1 is chosen. Note further that concentration of X′ℓ(i) and Y′ℓ(i) around some expected trajectory will follow from the concentration results we obtain for Y′1(i), Y′2(i), ..., Y′a(i), X′(i). We will not establish concentration of Y′r(i) or X′r(i). However, by bounding Z′(i) we will have a trivial upper bound on Y′r(i) and X′r(i). We are now ready to consider the expected changes in our random variables that result from one step of the process. We use hi to denote the history of the process up to time i (this is just the sequence of edges e1, f1; e2, f2; ...; ei, fi). For 1 ≤ u ≤ a we have

    E[Y′u(i+1) − Y′u(i) | hi] =
          Σ_{(x,y)∈SA: x+y=u} Y′x(i) Y′y(i) u / n²
        − Σ_{(x,y)∈SA: x=u} Y′u(i) Y′y(i) u / n² − Σ_{(x,y)∈SA: y=u} Y′x(i) Y′u(i) u / n²
        + (1 − p(i)) u [ Σ_{v=1}^{u−1} Y′v(i) Y′_{u−v}(i) / n² − 2 Y′u(i) / n ]
        − 1_{(u/2,u/2)∈SA} Y′_{u/2}(i) u² / (2n²) + 2 · 1_{(u,u)∈SA} Y′u(i) u² / n²
        + 1_{(u,ℓ)∈SA} Y′u(i) Y′r(i) u / n² + 1_{(ℓ,u)∈SA} Y′u(i) Y′r(i) u / n²
        + (1 − p(i)) u [ −1_{u even} Y′_{u/2}(i) (u/2) / n² + 2 Y′u(i) Y′r(i) / n² + 2 Y′u(i) u / n² ].

The first three lines of this expression give the expected change in Y′u under the assumption that ei+1 and fi+1 neither fall within a connected component nor touch a component of size greater than r(n) (the first two lines give the change when ei+1 is chosen while the third line gives the change when fi+1 is chosen). The other lines account for these other possibilities. It follows that for i < T we have

    | E[Y′u(i+1) − Y′u(i) | hi] − fu( Y′1(i)/n, ..., Y′a(i)/n, X′(i)/n, Z′(i)/n ) |
        = O( Y′r(i)/n ) + O( 1/n )    (9)
        = O( 1/n ) + O( 1/r(n)² ).

Note that we use the fact that i < T implies Z′(i) < 2 zZ(s) n and therefore Y′r(i) = O(n/r(n)²). The expected changes for X′ and Z′ are more delicate. The key to the calculation for X′ is the following observation: if H is an arbitrary graph with connected components C1, ..., Cm, we set X(H) = Σ_{i=1}^{m} |Ci|², and we add a single random edge to H to form the graph H⁺, then the expected value of X(H⁺) − X(H) is

    Σ_{i≠j} (|Ci| |Cj| / n²) · 2 |Ci| |Cj| = 2 X²(H)/n² − 2 Σ_{i=1}^{m} |Ci|⁴ / n².    (10)

The idea is that 2X²/n² will be the main term and, so long as no large components have appeared, the other term is just a small error (this reasoning is attributed to Janson [10]). We have

    E[X′(i+1) − X′(i) | hi] =
          Σ_{(x,y)∈SA} 2 X′x(i) X′y(i) / n² + (1 − p(i)) 2 X′(i)² / n²
        − Σ_{x: x≠ℓ, (x,x)∈SA} 2 x³ Y′x(i) / n² − 1_{(ℓ,ℓ)∈SA} Σ_{x=a+1}^{r(n)} 2 x³ Y′x(i) / n²
        − (1 − p(i)) Σ_{x=1}^{r(n)} 2 x³ Y′x(i) / n²
        − Σ_{(x,y)∈SA: x=ℓ} 2 X′y(i) X′r(i) / n² − Σ_{(x,y)∈SA: y=ℓ} 2 X′x(i) X′r(i) / n²
        + 1_{(ℓ,ℓ)∈SA} 2 X′r(i)² / n² + (1 − p(i)) ( −4 X′(i) X′r(i) + X′r(i)² ) / n².

The second and third lines account for the sum of the fourth powers of the component sizes, as pointed out in (10). The fourth and fifth lines account

for the fact that we drop edges that touch components that have more than r(n) vertices. It follows that for i < T we have

    | E[X′(i+1) − X′(i) | hi] − fX( Y′1(i)/n, ..., Y′a(i)/n, X′(i)/n, Z′(i)/n ) |
        = O( X′r(i)/n ) + O( r(n)/n )    (11)
        = O( r(n)/n ) + O( 1/r(n) ).

Note that we use the fact that i < T implies both Z′(i) = O(n) and Y′r(i) = O(n/r(n)²). For Z′, we note that if H is an arbitrary graph with connected components C1, ..., Cm, we define Z(H) = Σ_{i=1}^{m} |Ci|³, and we add a single random edge to H to form the graph H⁺, then the expected value of Z(H⁺) − Z(H) is

    Σ_{i≠j} (|Ci| |Cj| / n²) ( 3 |Ci|² |Cj| + 3 |Ci| |Cj|² ) = 6 X(H) Z(H) / n² − 6 Σ_{i=1}^{m} |Ci|⁵ / n² ≤ 6 X(H) Z(H) / n².

It follows that we have

    E[Z′(i+1) − Z′(i) | hi] ≤ Σ_{(x,y)∈SA} ( 3 X′x(i) Z′y(i) + 3 X′y(i) Z′x(i) ) / n² + (1 − p(i)) 6 X′(i) Z′(i) / n².

Note that we do not have to take into account the edges that touch vertices in components having r(n) or more vertices, as doing so would only result in a stronger upper bound. Also note that this function is increasing in Z′(i). We have

    E[Z′(i+1) − Z′(i) | hi] ≤ fZ( Y′1(i)/n, ..., Y′a(i)/n, X′(i)/n, Z′(i)/n ).    (12)

We now apply Theorem 5.1 of [11] (conforming to the notation there as much as possible). We set

    β(n) = 8 r(n)³ = 8 n^{3/13},    λ(n) = 1/r(n) = n^{−1/13},    and    γ(n) = 0.

Note that the boundedness hypothesis follows from the fact that we do not add edges that touch components that have r(n) or more vertices. The trend hypothesis for the variables Y′1, ..., Y′a, X′ follows from (9) and (11). Note that we have only a one-sided trend hypothesis for Z′. A minor alteration of the proof of Theorem 5.1 in [11] can account for this because

(a) zZ does not appear in any of the functions other than fZ, and (b) fZ is increasing in zZ. It follows from (a) that a one-sided bound on Z′ does not influence the bounds for any other variable. It follows from (b) that having only a one-sided bound on Z′ suffices to establish a one-sided bound on Z′ in future iterations. Finally, we note that in our application of Theorem 5.1 of [11] we violate one of the stated conditions: there is not a constant C0 such that |yl(hi)| < C0 n for all hi. However, this condition is not necessary as we have γ(n) = 0.

Proof of Lemma 5. We add edges to G independently, each with probability p = c/n, where c = 2e − n^{−1/4}. Say that this adds a set F of fn edges to create the graph G′. Define the random variable Z ∈ Bi(C(n,2), c/n). We see that whp f ≤ e:

    Pr(f > e) ≤ Pr( Z > en ) = Pr( Z ≥ E[Z] + Ω(n^{3/4}) ) ≤ exp( −Ω(n^{1/2}) ) = o(1),

with the last inequality coming from the Chernoff bound (equation (2.6) in [8]). The condition of Lemma 5 implies

    c Σ_{i=1}^{τ} i ai > 1    (13)

for n sufficiently large. We prove Lemma 5 by proving that if c satisfies (13), then whp G′ has a component of size Ω(n). We follow the branching process proof of the emergence of the giant component in the traditional random graph given in [8]. Define ε > 0 as the difference in (13), i.e. ε = c Σ_{i=1}^{τ} i ai − 1. Let δ > 0 be a constant satisfying

    ( c aj − ε/(τj) ) (1 + δ) < c aj    for j = 1, 2, ..., τ    (14)

and n sufficiently large. Define k− = M log n, where M is a constant to be named later, and k+ = n^{2/3}. Components of G′ with k− or fewer vertices will be called small, while components with k+ or more vertices will be called large. Two vertices in the same component in the graph G will be called siblings. For v ∈ G′, we determine the component of v by way of a process that can be approximated by a branching process. In this process we first reveal the edges in F that contain v. The vertices in these edges (other than v itself), as well as their siblings, are now members of the component of v. These vertices are now unsaturated and v is saturated. As long as the set of unsaturated vertices is not empty, each step of the process consists of choosing an unsaturated vertex w, determining the remaining neighbors of w in edges in F, declaring those neighbors and their siblings unsaturated vertices in the component of v, and finally declaring the vertex w saturated. Define Pv to be the process which starts at vertex v. We will prove the following three claims:

I. For any v ∈ G′ and k ∈ [k−, k+], define Ev,k as the event that Pv is still alive after k steps but has fewer than δk unsaturated vertices. We have Pr(Ev,k) = o(1/(n k+)).

II. Given any two vertices v, w in large components of G′, Pr(v and w are in different components) = o(1/n²).

III. There exists π ∈ [0, 1) such that whp there are at most (1 − π + o(1))n vertices in small components.

Claim I implies that whp every vertex v is in a component of size less than k− or more than k+. Claims II and III together imply that whp G′ has a unique component of size at least πn + o(n), and all other components of G′ have size k− or less.

Pf I. We examine Pv for some vertex v. For i = 1, 2, ..., k, let Xi be the number of new vertices which we add to the component on the i-th step. Note that if vertex w is an immediate offspring, then all siblings of w from G are also immediate offspring. Therefore, we can bound Xi from below by the quantity

    Xi− = Σ_{j=1}^{τ} j ( Y−i,j − Ki,j )

where

    Y−i,j ∈ Bi( aj n − (1 + δ) k+, p )

and Ki,j is the number of pairs of siblings that are in the same component of size j in G and are both connected to w by edges in F. Note that a simple first moment calculation gives

    Pr( Σ_{i=1}^{k} Σ_{j=1}^{τ} Ki,j > 20 ) = o(1/n²).

Thus, we can ignore the influence of the Ki,j's in the following. Fix some k ∈ [k−, k+]. If Pv is still alive at step k then Ev,k implies

    Σ_{i=1}^{k} Σ_{j=1}^{τ} j Y−i,j = Σ_{i=1}^{k} Xi− < (1 + δ)k = (1 + δ)k Σ_{j=1}^{τ} ( c j aj − ε/τ ).

So,

    Pr(Ev,k) ≤ Pr( ∪_{j=1}^{τ} { Σ_{i=1}^{k} j Y−i,j < ( c j aj − ε/τ )(1 + δ)k } )
             ≤ Σ_{j=1}^{τ} Pr( Σ_{i=1}^{k} Y−i,j < ( c aj − ε/(τj) )(1 + δ)k ).

We have

    E[ Σ_{i=1}^{k} Y−i,j ] = k aj c − (1 + δ) c k k+ / n = c aj k − o(k).

Define θj = ( c aj − ε/(τj) )(1 + δ). Since δ satisfies (14), we may apply the Chernoff bound to get

    Pr(Ev,k) ≤ Σ_{j=1}^{τ} Pr( Σ_{i=1}^{k} Y−i,j < θj k ) ≤ Σ_{j=1}^{τ} exp( −Ω( (c aj − θj)² k ) ).

Taking k− = M log n with M sufficiently large, this implies Pr(Ev,k) = o(1/n²) = o(1/(n k+)).

Pf II. Suppose v and w are in components of size k+ or more. We know from I that whp v is in a component with a set V of δk+ vertices which are unsaturated after k+ steps of Pv. Similarly, we have a set W corresponding to w. If V ∩ W ≠ ∅, then v and w are in the same component. If they are disjoint,

    Pr( no edge in V × W ) = (1 − p)^{(δk+)²} ≤ exp( −(c/n) δ² k+² ) = exp( −Ω( k+²/n ) ) = o(1/n²).

Pf III. Let Z be the number of vertices of G′ which are in small components. We show first that E[Z] ≤ πn for some 0 ≤ π < 1, then we show that Var(Z) = o(n²). The result then follows from Chebyshev's inequality. For i = 1, ..., n define Zi to be 1 if vertex i is in a small component, and 0 otherwise. Therefore, E[Z] = n E[Z1]. Now, Z1 is bounded from above by the extinction probability of the branching process in which the distribution of immediate offspring of a particle is given by

    X1− = Σ_{j=1}^{τ} j ( Y−1,j − K1,j )

where

    Y−1,j ∈ Bi( aj n − k−, p )

and K1,j is the number of pairs of siblings that are in the same component of size j in G and are both connected to a particular vertex w by edges in F. We have E[X1−] > 1:

    E[X1−] = Σ_{j=1}^{τ} ( j E[Y−1,j] − j E[K1,j] )
           = Σ_{j=1}^{τ} j ( c aj − c k−/n − c² aj j(j−1)/(2n) )
           = Σ_{j=1}^{τ} c j aj − o(1) = 1 + ε − o(1).

So there exists 0 ≤ π < 1 such that the branching process with offspring distribution X1− becomes extinct with probability π. Therefore, E[Z] ≤ πn. In order to compute the covariance we condition on the event Z1 = 1 and let S be the component in G′ that contains vertex 1. We can view S as a set of random vertices from G, where 0 ≤ |S| ≤ k−. Now, our conditioning has an influence on the probability that Z2 = 1 only if 2 ∈ S or if one of the first k− steps of P2 included an edge into S. The probability of this happening is bounded by |S|/n + k− |S| p = O(log²n / n). Thus,

    | Pr(Z2 = 1 | Z1 = 1) − Pr(Z2 = 1) | = O(log²n / n).

So,

    Var(Z) ≤ E[Z] + Σ_{i≠j} Cov(Zi, Zj) ≤ O(n) + n² O(log²n / n) = o(n²).

3 On-line Lower Bound

Let c be a constant and let (e1, f1), (e2, f2), ..., (ecn, fcn) be a sequence of ordered pairs of edges on vertex set [n] chosen independently and uniformly at random. We will show that if there exists d ∈ (0, c) such that c < f(d), where f(d) is the function defined in (1) (i.e. if c ≤ 0.2545...), then any on-line algorithm for the sequential choice of one edge from each presented pair (ei, fi), i = 1, 2, ..., cn, whp will create a graph whose components are all of size O(log n). To do this, we will use the fact that on-line algorithms usually cannot make "good" choices early in the process. For small i, there is a good chance that all four vertices from (ei, fi) are appearing for the first time. In this case any on-line algorithm must make an essentially arbitrary choice, which could be a costly mistake if, for example, fi is an isolated edge in the graph consisting of all 2cn edges but ei is a bridge in the giant component.

Fix some constant d ∈ (0, c). We will determine its value later. Divide the process into two parts, the first dn pairs and the last (c − d)n pairs. Given the pairs (e1, f1), ..., (ecn, fcn), we create an auxiliary graph G on vertex set [n] using the following steps:

(i.) For all i < dn, if all four vertices in (ei, fi) are occurring for the first time in the process, then randomly eliminate one of the two edges (each with probability 1/2).

(ii.) Take all remaining edges, including the last 2(c − d)n edges, and let this be the edge set of G.

In other words, G will contain every edge except for some that were eliminated with probability 1/2. It follows from symmetry that for any fixed algorithm A, the

probability that A produces a giant is bounded above by the probability that G has a giant. Indeed, if we condition on the first dn rounds of the process then there is a permutation of [n] that maps the graph produced by A in the first dn rounds to a subgraph of the graph consisting of the edges in G that come from (e1, f1), ..., (edn, fdn). In order to analyze G we consider a slightly different probability space. Let a1, b1, c1, d1; a2, b2, c2, d2; ... be a sequence of vertices chosen uniformly and independently at random from [n] with replacement (e.g. we allow ai = bi). Setting ei = {ai, bi} and fi = {ci, di}, we get a sequence of pairs of random edges. Of course, this model allows loops and multiple edges, but since the expected number of each over the cn rounds of our process is bounded by an absolute constant, whp the total number of loops and multiple edges is at most log(n). Thus, we can accommodate these flaws in the process by adding an extra log(n) rounds (which have no impact on the rest of the argument).

The following is a variation of the branching process proof of the classical giant component results for Gn,p which is given in [8], p. 109. We generate an upper bound on the size of the component of G containing a fixed vertex v with the following process. We begin by observing the locations of all appearances of v in the sequence a1, b1, c1, d1; a2, b2, c2, d2; .... Note that we have not yet observed any whole edge and, conditioning on these locations, the rest of the random vertices in the sequence are uniform in [n] \ {v}. The edge-partners of each occurrence of v are now active positions (we still haven't looked at the vertices in these positions). An active position in round i such that i ≤ dn is called an early active position. An active position in round i such that i > dn is called a late active position. Now, if the first appearance of v is in round i and i ≤ dn then we reveal the other vertices in round i.
Let x be the edge-partner of v in this round and let y and z be the other two vertices that appear in this round. We check to see if there is an appearance of the vertices x, y or z before round i. In particular, we determine the first occurrence of a vertex from the set {x, y, z}. If x, y and z do not appear before round i then we delete this active position with probability 1/2. If x is the first vertex from the set {x, y, z} to appear and its first appearance occurs before round i then the position remains active and we call it a heavy active position. Otherwise the position remains an early active position. The vertices x, y and z are now sussed and the vertex v is saturated. Note that so far this process has simply determined the degree of v and checked to see if the edge containing the first occurrence of v is deleted in the formation of G while revealing as little information about the random graph as possible.
Now suppose that after some number of steps in this process we have sets of saturated and sussed vertices, and sets of early, heavy, and late active positions. The saturated vertices are vertices in the component containing v whose complete neighborhood has already been determined, and the active positions correspond to vertices in the component containing v for which we have not yet observed the full neighborhood. We now consider the vertex, say vertex u, in an arbitrary active position π. We reveal all other occurrences of u in the

sequence a1, b1, c1, d1; a2, b2, c2, d2; . . ., and the edge-partners of all occurrences of u (other than the edge-partner of π, which is an already saturated vertex) become active positions (early or late depending on the round in which they occur). If π is an early or heavy active position then we may have already checked to see if the first occurrence of u is in a deleted edge, and we cannot make that check again. If π is a late active position and the first occurrence of u is in a round i such that i ≤ dn, then we suss out the other vertices in round i. If all vertices appearing in round i are making their first appearance, then we delete the active position in round i with probability 1/2. If, on the other hand, the edge-partner of the first occurrence of u appears before any of the other vertices that appear in round i then this position becomes a heavy active position. This step is repeated until the set of active positions is exhausted.
Before turning to a thoroughly rigorous analysis of this process, it will be helpful to note that it can be loosely modeled with a branching process in which there are three types of offspring: types 1, 2 and 3 corresponding to late, early and heavy active positions, respectively. Note that the probability that a late active position generates an early active position that is deleted can (roughly speaking) be bounded from below by
$$ \frac{1}{2}\left(1 - \left(1 - \frac{1}{n}\right)^{4dn}\right)\left(1 - \frac{3}{n}\right)^{4dn} \approx \frac{1 - e^{-4d}}{2e^{12d}} =: p. $$

Furthermore, the probability that a late active position generates a heavy active position is (roughly speaking) at most
$$ \frac{1}{3}\left(1 - \left(1 - \frac{1}{n}\right)^{4dn}\right)\left(1 - \left(1 - \frac{3}{n}\right)^{4dn}\right) \approx \frac{1}{3}\left(1 - e^{-4d}\right)\left(1 - e^{-12d}\right) =: q. $$

The i, j position in the matrix
$$ A = \begin{pmatrix} 4(c-d) & 4d - p & q \\ 4(c-d) & 4d & 0 \\ 4(c-d) & 4d + 1 & 0 \end{pmatrix} $$
gives the expected number of offspring of type j of an animal of type i in this multi-type branching process. The largest eigenvalue of A is less than 1 if and only if c < f(d), where f(d) is defined in (1). Since this branching process dies out with probability one (see [2, page 186]) if the largest eigenvalue of the matrix A is less than one, we expect c < f(d) to imply that whp the component of G containing v is small.
We now make a rigorous argument. Let ε be a constant such that 4c + 4(c − d)(2q − p) + 8ε < 1; note that such a constant exists, as the condition we impose on c in the statement of the theorem is equivalent to c < (1/4 + d(2q − p))/(1 + 2q − p). Also, set
$$ m := \frac{4(4c + \varepsilon)}{\varepsilon^2} \log n. $$
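The link between subcriticality of the branching process and the largest eigenvalue of A can be explored numerically. The sketch below builds A from the formulas for p and q and estimates the largest eigenvalue by power iteration, which converges here because A is nonnegative with a strongly connected dependency structure among the types. The parameter values c = 0.2, d = 0.1 and c = 0.3 are illustrative choices, not values from the paper.

```python
import math

def offspring_matrix(c, d):
    """Mean-offspring matrix A of the 3-type branching process
    (types 1, 2, 3 = late, early, heavy active positions)."""
    p = (1 - math.exp(-4 * d)) / (2 * math.exp(12 * d))
    q = (1 - math.exp(-4 * d)) * (1 - math.exp(-12 * d)) / 3
    return [[4 * (c - d), 4 * d - p, q],
            [4 * (c - d), 4 * d, 0.0],
            [4 * (c - d), 4 * d + 1, 0.0]]

def largest_eigenvalue(A, iters=500):
    """Power iteration; for a nonnegative irreducible matrix this
    converges to the Perron (largest) eigenvalue."""
    v = [1.0, 1.0, 1.0]
    lam = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

print(largest_eigenvalue(offspring_matrix(0.2, 0.1)))  # below 1: subcritical
print(largest_eigenvalue(offspring_matrix(0.3, 0.1)))  # above 1: supercritical
```

With these illustrative values the eigenvalue is roughly 0.85 in the first case and roughly 1.26 in the second, matching the qualitative dichotomy described above.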

We consider the first m steps in the process. We begin by noting that it may happen that when we reveal the vertex in some position we will find that it has already been sussed (note that the vertices in some of the early active positions and all of the heavy active positions have already been sussed, but this fact is built in to the process). This 'bad' event occurs with probability at most 3m/(n − m) at each position. Since the number of positions that we inspect is at most 4m, the probability that the bad event occurs more than one time in the process is at most $\binom{4m}{2}\left(\frac{3m}{n-m}\right)^2 = o(1/n)$. Furthermore, when this bad event occurs it introduces at most 2 'extra' active positions, which will have no effect on the rest of the argument. We henceforth assume that no vertex that is revealed has been previously sussed.
It may also happen that the first appearance of some vertex that is revealed in a late active position is in a round that also contains a previously viewed vertex (either saturated or sussed). Of course, this event will change both the probability of the deletion of this active position and the probability that this position becomes a heavy active position. The probability of this event is at most 3/(n − 4m) times the number of rounds that hold previously viewed vertices. Since the probability that there exists a vertex that appears more than 2 log n times in the sequence a1, b1, c1, d1; a2, b2, c2, d2; . . . is o(1/n), the probability of this second type of 'bad' event in any step of the process is at most 12m log n/(n − 4m). Again, we see that we may assume that this event occurs at most once during the process, with no consequence for the rest of the proof.
Let X be the number of late active positions that are introduced in the first m steps of the process. Since the number of late active positions introduced in any one of these steps is dominated by Bi(4(c − d)n, 1/(n − m)), X is dominated by Bi(4(c − d)mn, 1/(n − m)). It follows from the Chernoff bound ([8], p. 26) that
$$ \Pr\left[X \le (4(c-d) + \varepsilon)m\right] = 1 - o(1/n). \qquad (15) $$

Note that we do not necessarily saturate all of the late active positions during the first m steps of the process: if there are many active positions it may be the case that some of the late active positions are left over after the mth step. Let X′ be the number of late active positions that are actually saturated during the first m steps. Let Z be the number of heavy active positions that are generated during this process. The probability that a late active position generates a heavy active position is at most
$$ \frac{1}{3}\left(1 - \left(1 - \frac{1}{n-4m}\right)^{4dn}\right)\left(1 - \left(1 - \frac{3}{n-4m}\right)^{4dn}\right) = q + O\!\left(\frac{m}{n}\right). $$
So, conditioning on (15), Z is dominated by Bi((4(c − d) + ε)m, q + O(m/n)). It follows from the Chernoff bound that
$$ \Pr\left[Z < (4(c-d)q + 2\varepsilon)m\right] = 1 - o(1/n). \qquad (16) $$

It remains to bound the number of early active positions that are generated in the first m steps of the process. The probability that a late active position generates an early active position that is deleted is at least
$$ \frac{1}{2}\left(1 - \left(1 - \frac{1}{n}\right)^{4dn - 8dm\log n}\right)\left(1 - \frac{3}{n-4m}\right)^{4dn} = p - O\!\left(\frac{m\log n}{n}\right). $$
Let W be the number of deleted early positions in the first m steps of the process. W dominates Bi(X′, p − O(m log n/n)), and hence
$$ \Pr\left[W \ge X'(p - \varepsilon)\right] = 1 - o(1/n). \qquad (17) $$

Let Y be the sum of m i.i.d. Bi(4dn, 1/(n − 4m)) random variables. We have
$$ \Pr\left[Y \le (4d + \varepsilon)m\right] = 1 - o(1/n). \qquad (18) $$
The number of early active positions that are generated (and not deleted) in this process is dominated by Y + Z − W. Now, in the event that the component containing v has more than m vertices, the process does not terminate until after the mth step and we have
$$ m \le X' + Z + (Y + Z - W). \qquad (19) $$
It follows from (15), (16), (17) and (18) that, with probability 1 − o(1/n), the right hand side of (19) is at most
$$ X'(1 - p + \varepsilon) + 2m(4(c-d)q) + 4dm + 5\varepsilon m \le 4(c-d)(1-p)m + 2m(4(c-d)q) + 4dm + 8\varepsilon m < m. $$
Thus, the probability that the component containing v has more than m vertices is o(1/n), the probability that G contains a component having more than m vertices is o(1), and we have proved Theorem 3.
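The two forms of the condition on c used in the proof — the existence of an ε with 4c + 4(c − d)(2q − p) + 8ε < 1, and the threshold form c < (1/4 + d(2q − p))/(1 + 2q − p) — can be checked against each other numerically. A minimal sketch (the value d = 0.1 and the sample values of c are illustrative, not taken from the paper):

```python
import math

def p_q(d):
    """The constants p and q from the branching-process bounds."""
    p = (1 - math.exp(-4 * d)) / (2 * math.exp(12 * d))
    q = (1 - math.exp(-4 * d)) * (1 - math.exp(-12 * d)) / 3
    return p, q

d = 0.1                      # illustrative value of d
p, q = p_q(d)
bound = (0.25 + d * (2 * q - p)) / (1 + 2 * q - p)

for c in (0.1, 0.2, 0.25, 0.3):
    # the epsilon condition and the threshold form agree for each c
    print(c, 4 * c + 4 * (c - d) * (2 * q - p) < 1, c < bound)
```

For d = 0.1 the threshold evaluates to about 0.236, and the two printed booleans agree for every c, as the algebra predicts.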

4 Off-line Phase Transition

Fix c > 1/4 and suppose (e1, f1), . . . , (ecn, fcn) are pairs of random edges. In this section we prove that there exists a set E of edges such that |E ∩ {ei, fi}| ≤ 1 for i = 1, . . . , cn and the graph ([n], E) has a component of size Ω(n).
Let G be the graph with vertex set [n] and all 2cn edges. For any tree T within the graph G, we will say that T survives if there is no i ∈ {1, 2, . . . , cn} such that {ei, fi} ⊆ E(T). In other words, T survives if no two edges in T were paired together. Clearly, if T is a tree in G which survives, then it is possible to make choices so that all of T ends up in the final graph.

If T is a tree in G with t vertices, let φ(t) be the probability that it survives. A straightforward calculation shows
$$ \phi(t) = \prod_{i=1}^{t-2} \frac{2cn - 2i}{2cn - i} = \prod_{j=1}^{t/2 - 1} \frac{2cn - t - 2j}{2cn + 1 - 2j} = \prod_{j=1}^{t/2 - 1} \left(1 - \frac{t+1}{2cn + 1 - 2j}\right). $$
Bounding using the first and last terms in the product and using e^{−2x} ≤ 1 − x ≤ e^{−x} for small x leads to
$$ \exp\left(-\frac{t^2}{2cn - t}\right) \le \phi(t) \le \exp\left(-\frac{t^2}{4cn} + O\!\left(\frac{t}{n}\right)\right). \qquad (20) $$
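The product formula for φ(t) and the bounds in (20) can be sanity-checked numerically. A small sketch using the first form of the product; the parameters n, c and t are illustrative, and a generous explicit constant stands in for the O(t/n) term:

```python
import math

def phi(t, n, c):
    """Survival probability of a fixed t-vertex tree, via the product formula."""
    value = 1.0
    for i in range(1, t - 1):          # i = 1, ..., t-2
        value *= (2 * c * n - 2 * i) / (2 * c * n - i)
    return value

n, c, t = 10**6, 0.5, 1000             # illustrative parameters
p = phi(t, n, c)
lower = math.exp(-t**2 / (2 * c * n - t))
upper = math.exp(-t**2 / (4 * c * n) + 10 * t / n)   # generous O(t/n) slack
print(lower <= p <= upper)             # True
```

For these parameters φ(t) is also very close to exp(−t²/(4cn)), reflecting that the upper bound in (20) is tight up to the O(t/n) correction.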

With high probability, there exists a surviving tree with ⌊n^{1/3}⌋ vertices, because φ(⌊n^{1/3}⌋) → 1 and whp G has a component of size Ω(n) ≥ ⌊n^{1/3}⌋. If T is a surviving tree, we will say that T is maximal if there is no edge e such that T ∪ {e} is a surviving tree. So, if T is maximal then for every edge {u, v} ∈ E(G) such that u ∈ T and v ∉ T, necessarily {u, v} is paired with some edge in T; otherwise we could add it to T and create a larger surviving tree. Define
$$ T = \{t : \log^2 n \le t \le (c - \tfrac{1}{4})^2 n\}. \qquad (21) $$
We will prove:
$$ \Pr(\exists \text{ a maximal surviving tree with size } t \in T) \to 0. \qquad (22) $$
Since ⌊n^{1/3}⌋ ∈ T, (22) establishes the existence of a surviving tree of size at least (c − 1/4)²n whp.
For the remainder of this section, note that any given edge appears in G with probability $2cn/\binom{n}{2} = 4c/(n-1)$. Since the appearance of fixed edges makes others less likely, we can say
$$ \Pr(\text{a set of } j \text{ edges appears}) \le \left(\frac{4c}{n-1}\right)^j. $$

Lemma 9. Let P = Pr(T maximal | T survives), where T is a tree on Kn with t vertices. We have
$$ P \le t \exp\left[-\frac{4c}{n-1}\, t(n-t)\left(1 - \frac{t}{c(n-1)} - \frac{t}{2cn - t} + O\!\left(\frac{t}{n}\right)\right)\right]. \qquad (23) $$

Proof. To find the probability that T is maximal, note that there are t(n − t) edges "leaving" T. Every one of these edges must either be left out of G or be paired with an edge from T. The probability that an edge is paired with something in T is bounded above by t/(2cn − t). In the following sum, j is the exact number of the t(n − t) edges "leaving" T that appear in G:
$$ P \le \sum_{j=0}^{t-1} \binom{t(n-t)}{j} \left(\frac{4c}{n-1}\right)^j \left(\frac{t}{2cn-t}\right)^j (1 - \rho)^{t(n-t)-j}, $$

where ρ is a lower bound on the probability that a fixed edge appears in G, conditioned on all edges that have been observed. The probability ρ will take its smallest possible value when 2(t − 1) edges have already been observed to exist in G while possibly zero edges have been left out. We can say
$$ \rho \ge \frac{2cn - 2(t-1)}{\binom{n}{2} - 2(t-1)} \ge \frac{4c}{n-1} - \frac{4t}{(n-1)^2}. $$
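The second inequality in the bound on ρ is elementary but easy to mistype; it can be verified numerically for concrete parameters. The values of n, c and t below are illustrative only:

```python
from math import comb

n, c = 10**5, 0.5                      # illustrative parameters
for t in (100, 1000, 10000):
    lhs = (2 * c * n - 2 * (t - 1)) / (comb(n, 2) - 2 * (t - 1))
    rhs = 4 * c / (n - 1) - 4 * t / (n - 1) ** 2
    print(t, lhs >= rhs)               # True for each t
```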

Now we will find the maximum possible value of P. We avoid the sum by bounding it by t times its maximum term:
$$ P \le t\,\exp\left(-t(n-t)\left(\frac{4c}{n-1} - \frac{4t}{(n-1)^2}\right)\right)\cdot \max_{0\le j < t}\left[t(n-t)\cdot\frac{4c}{n-1}\cdot\frac{te}{2cn-t}\cdot\exp\left(\frac{4c}{n-1} - \frac{4t}{(n-1)^2}\right)\right]^j. $$
Writing t = αn, the relevant quantity is positive whenever c > 1/4, and the exponent in the resulting bound Et on the probability that some tree with t vertices is a maximal surviving tree is negative whenever α < (c − 1/4)². Therefore, nEt → 0 whenever t ∈ T. We have proved Theorem 4.

5 Two Algorithms

We begin with the bounded first-edge algorithm defined by a = 1 and SA1 = {(ℓ, ℓ)}. In words, A1 chooses edge ei if it includes no isolated vertices and otherwise chooses edge fi. For this algorithm we have
$$ \frac{dz_1}{dt} = -2z_1(2z_1 - z_1^2) $$
$$ \frac{dz_X}{dt} = 2(z_X - z_1)^2 + 2z_X^2(2z_1 - z_1^2) $$
with initial conditions z1(0) = 1, zX(0) = 1. In order to get an upper bound on the critical value cA1, we approximate the solution by implementing Euler's method, being sure to underestimate zX. (The program for our approximation, written in C++, is available at http://www.math.cmu.edu/∼tbohman.) Let $\tilde z_1$ and $\tilde z_X$ denote our approximations. We use the standard Euler's method for $\tilde z_1$. For $\tilde z_X$ we set $\tilde z_X(t + h) = \tilde z_X(t) + h\phi(\tilde z_X, \tilde z_1) - 2\varepsilon$, where
$$ \phi(\tilde z_X, \tilde z_1) = 2\max\left(\tilde z_X - \tilde z_1 - \frac{2\varepsilon}{h},\, 0\right)^2 + 2\tilde z_X^2\left(2\tilde z_1 - \tilde z_1^2 - \frac{2\varepsilon}{h}\right), $$

h is the step size, and ε accounts for the computational rounding errors. For our approximation we take h = 10^{−7}, and ε = 10^{−12} suffices. We claim that we have the following:
(a.) $\tilde z_X(t) < z_X(t)$,
(b.) $\tilde z_X(0.3847) > 10^4$, and
(c.) z1(t) > 1/2 for t ∈ [0, 0.385].
It then follows from Lemmas 5 and 6 that cA1 < 0.3847 + 0.0001 < 0.385. In order to establish these claims we first note that
(d.) $e(t) := |z_1(t) - \tilde z_1(t)| \le 2\varepsilon/h$ for t ∈ [0, 0.4].
To see this, let f1(z1) = −2z1(2z1 − z1²). Since |z1′′(t)| ≤ 4.2 and |f1(z1(t)) − f1($\tilde z_1$(t))| ≤ 3e(t) for all t ∈ [0, 0.4], we have e(t + h) ≤ (1 + 3h)e(t) + 2.1h² + ε. This leads to (d.). We note that (d.) together with observation of $\tilde z_1$ suffices to establish (c.). Of course, (b.) follows from the observation of $\tilde z_X$ alone. It remains to prove (a.). To this end, we first note that
(e.) zX′′(t) > 0 for all t > 0.

We have
$$ z_X'' = 2(z_X - z_1)(z_X' - z_1') + 2 z_X z_X' z_1(2 - z_1) + 2 z_X^2(1 - z_1) z_1', $$
and therefore we may use −2 ≤ z1′ ≤ 0, zX ≥ 1, and z1 ≥ 1/2 to get
$$ z_X'' \ge 2(1 - z_1)\left[z_X' + \tfrac{1}{2} z_X z_X' - 2 z_X^2\right] \ge \tfrac{3}{2} z_X' - 2 z_X^2. $$
It is easy to see from the original differential equation that zX′ ≥ (3/2)zX² whenever z1 ≥ 1/2; thus zX′′ ≥ (1/4)zX² > 0. Finally, we note that
(f.) $\tilde z_X(t) \le z_X(t)$ implies $\phi(\tilde z_X(t), \tilde z_1(t)) < z_X'(t)$.
Claim (a.) now follows from $z_X(0) = \tilde z_X(0)$, (e.) and (f.).
Another rule for choosing edges was considered by Spencer and Wormald [10]. Although their intention was to avoid a giant component rather than to create one, our machinery can be applied to determine the existence and location of a phase transition. We consider the bounded first-edge algorithm A2 defined by a = 1 and SA2 = {(1, 1)}. In words, this algorithm chooses edge ei if it is an isolated edge and otherwise chooses fi. For this algorithm we have
$$ \frac{dz_1}{dt} = -2z_1(t)^2 - 2z_1(t)(1 - z_1(t)^2) $$
$$ \frac{dz_X}{dt} = 2z_1(t)^2 + 2z_X(t)^2(1 - z_1(t)^2) $$
Using the methods above to approximate the solution of this system of differential equations we have zX(0.5882) > 10^4. This implies cA2 < 0.589.
We close this section by noting that simulations suggest that the Achlioptas process that chooses the edge from the pair (ei, fi) resulting in the largest increase in the sum of the squares of the component sizes creates a giant component in as few as 0.34n rounds. However, this algorithm is not a first-edge algorithm: it depends on all four vertices in ei ∪ fi. For such an algorithm it is a considerable challenge to bound the errors in the numerical simulations.
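A rough re-implementation of the integration for A1 illustrates the blow-up of zX near the critical value. This is plain Euler with a coarser step and no control of rounding errors, so it is only indicative and is not the paper's validated C++ computation:

```python
def blowup_time(h=1e-5, cap=1e4, t_max=0.5):
    """Integrate the A1 system with plain Euler steps and return the first
    time at which z_X exceeds `cap` (None if it stays below until t_max)."""
    z1, zX, t = 1.0, 1.0, 0.0
    while t < t_max:
        dz1 = -2 * z1 * (2 * z1 - z1 * z1)
        dzX = 2 * (zX - z1) ** 2 + 2 * zX * zX * (2 * z1 - z1 * z1)
        z1 += h * dz1
        zX += h * dzX
        t += h
        if zX > cap:
            return t
    return None

print(blowup_time())
```

With these settings the returned time should land near the 0.385 bound established above; since Euler's method underestimates the convex solution, the crossing time it reports is, if anything, slightly late.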

6 A simple Achlioptas process.

In this section we give a simple Achlioptas process that succeeds whp in creating a giant component for any c > 3√6/16.
Begin by fixing a set $S \in \binom{[n]}{\alpha n}$ for some α ∈ (0, 1). During each round, choose only edges which are in $\binom{S}{2}$. If two such edges are presented, choose one at random. If no such edges are presented, choose neither.
In each round, the probability that we choose an edge from $\binom{S}{2}$ is 2α² − α⁴. Furthermore, in each round, any edge in $\binom{S}{2}$ is equally likely. Therefore, whp we will take (2α² − α⁴ + o(1))cn edges at random from $\binom{S}{2}$. So whp we have a giant component whenever
$$ (2\alpha^2 - \alpha^4 + o(1))cn > \tfrac{1}{2}\alpha n. \qquad (24) $$
To optimize (24), divide by α and note that the left side is then maximized when α² = 2/3. Therefore, we create a component of size Ω(n) whp whenever c > 3√6/16 = 0.459 . . . .
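The computation above can be illustrated by directly simulating the process. A minimal sketch, with assumptions beyond the text: S is taken to be the first ⌊αn⌋ vertices, the values of n and c are illustrative, and a union–find structure tracks components:

```python
import random
from collections import Counter

def largest_component_fraction(n, c, alpha, seed=0):
    """Simulate the restricted process: from each presented pair of random
    edges, keep a random one lying inside S (here S = {0,...,s-1}), if any."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    s = int(alpha * n)
    for _ in range(int(c * n)):
        pair = [(rng.randrange(n), rng.randrange(n)) for _ in range(2)]
        inside = [e for e in pair if e[0] < s and e[1] < s]
        if not inside:
            continue                        # neither edge is in S: take neither
        u, v = rng.choice(inside)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    sizes = Counter(find(x) for x in range(n))
    return max(sizes.values()) / n

alpha = (2 / 3) ** 0.5                      # the optimal alpha from above
print(largest_component_fraction(20000, 0.6, alpha))   # c above the threshold
print(largest_component_fraction(20000, 0.2, alpha))   # c below the threshold
```

With c = 0.6 > 3√6/16 a linear-size component inside S appears, while with c = 0.2 the largest component stays tiny, in line with the threshold computed above.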

References
[1] D. Aldous and B. Pittel, On a random graph with immigrating vertices: Emergence of the giant component, Random Structures and Algorithms 17 (2000), 79–102.
[2] K.B. Athreya and P.E. Ney, Branching Processes, Springer-Verlag (1972).
[3] T. Bohman and A. Frieze, Avoiding a Giant Component, Random Structures and Algorithms 19 (2001), 75–85.
[4] T. Bohman, A. Frieze and N. Wormald, Avoiding a Giant Component II, submitted.
[5] T. Bohman and J.H. Kim, A Phase Transition for Avoiding a Giant Component, submitted.
[6] B. Bollobás, Random Graphs, Second Edition, Academic Press (2001).
[7] P. Erdős and A. Rényi, On the Evolution of Random Graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5 (1960), 17–61.
[8] S. Janson, T. Łuczak and A. Ruciński, Random Graphs, John Wiley & Sons, Inc. (2000).
[9] G. Sorkin, personal communication.
[10] J. Spencer, Percolating Thoughts, email message of January 31, 2001.
[11] N. Wormald, The Differential Equation Method for Random Graph Processes and Greedy Algorithms, in Lectures on Approximation and Randomized Algorithms (M. Karoński and H.J. Prömel, eds.) (1999), 73–155.

