Connected Components and Evolution of Random Graphs: An Algebraic Approach
René Schott∗, G. Stacey Staples†
March 19, 2007
Abstract
Questions about a graph’s connected components are answered by studying appropriate powers of a special “adjacency matrix” constructed with entries in a commutative algebra whose generators are idempotent. The approach is then applied to the Erdős-Rényi model of sequences of random graphs. Developed herein is a method of encoding the relevant information from graph processes into a “second quantization” operator and using tools of quantum probability and infinite-dimensional analysis to derive formulas that reveal the exact values of quantities that otherwise can only be approximated. In particular, the expected size of a maximal connected component, the probability of existence of a component of particular size, and the expected number of spanning trees in a random graph are obtained.
1
Introduction
The evolution of random graphs has been studied in some detail. The first works in this area are attributed to Erdős and Rényi [9], [10], [11].

Definition 1.1. Let n be a positive integer, let V = {1, 2, . . . , n}, and define $\Omega = \binom{n}{2}$. A graph process on V is a sequence $(G_t)_0^{\Omega} = (V, E_t)_0^{\Omega}$ such that each $G_t$ is a graph on V with t edges, and $G_0 \subset G_1 \subset \cdots \subset G_{\Omega}$. Let Ĝ denote the probability space formed by the set of all Ω! graph processes, each assigned equal probability. It is well known that any graph process is a Markov chain whose states are graphs on V.

Example 1.2. Graphs $G_0$ through $G_{11}$ of a graph process on ten vertices are pictured in Figure 1. Note that $G_{11}$ is the first connected graph of the sequence.

∗ IECN and LORIA, Université Henri Poincaré-Nancy I, BP 239, 54506 Vandoeuvre-lès-Nancy, France, email: [email protected]
† Department of Mathematics and Statistics, Southern Illinois University Edwardsville, Edwardsville, IL 62026-1653, email: [email protected]
Figure 1: A graph process on 10 vertices.

Erdős and Rényi proved that if t ∼ cn for some fixed c ∈ R where 0 < c < 1/2, then almost every $G_t$ is such that its largest component has O(log n) vertices. If c > 1/2, then the largest component of almost every $G_t$ has $(1 - \alpha_c + o(1))n$ vertices for some $0 < \alpha_c < 1$. Finally, if t = ⌊n/2⌋, then the maximal size of a component of almost every $G_t$ is $O(n^{2/3})$.

Another notable work is that of Bollobás [7], who showed that almost every G ∈ Ĝ is such that for $t \ge n/2 + (\log n)^{1/2} n^{2/3}$, the graph $G_t$ has a unique component of order at least $n^{2/3}$, referred to as the giant component.

Considering “online” processes, Bohman and Frieze [3] investigated algorithms for avoiding the emergence of the giant component in a graph process. Bohman, Frieze, and Wormald [4] and Bohman and Kim [5] also considered avoiding the giant component. In contrast, Flaxman, Gamarnik, and Sorkin [12] and Bohman and Kravitz [6] considered algorithms for obtaining a giant component in a graph process.

Chung and Lu [8] investigated the distribution of the sizes of the connected components in a family of random graphs with given expected degree sequence. Frieze and Łuczak [13] considered maximal numbers of edge-disjoint spanning trees in random graphs. In related work, Palmer and Spencer [17] showed that in almost every random graph process, the hitting time for having k edge-disjoint spanning trees equals the hitting time for having minimum degree k. Also of interest is the investigation by Molloy [15] of the connections between satisfiability thresholds for random k-SAT and thresholds for the emergence of the giant component in a graph process.

What is proposed in this paper is an algebraic framework in which many quantities related to a random graph’s connected components can be expressed explicitly. Defined here is an adjacency matrix whose entries lie in a commutative algebra with idempotent generators.
After labeling the graph’s edges with idempotent generators of the algebra, computing powers of this matrix reveals information about the graph’s connected components.

The idempotent-adjacency matrix approach is then extended to graph processes by creating a second quantization space of graph processes. All possible graph processes are encoded in one operator, and information about the connected components contained in the Nth graph of the sequence is revealed by considering powers of this operator.

The method of second quantization is well known to physicists and goes back to the original work of Berezin [2]. Quantum probabilistic approaches to graph theory are also not new. Hashimoto, Hora, and Obata [14] used the method of quantum decomposition to obtain central limit theorems for growing sequences of graphs. Other applications of quantum probabilistic techniques to graph theory include the work of Obata [16] and Accardi, et al. [1].

Historically, the graph-theoretic work done by quantum probabilists has dealt with specific graphs whose relationship to problems of mathematical physics is understood. In contrast, the philosophy of the current authors is that the tools of quantum probability can be applied to more general graph-theoretic problems.

In earlier work, the current authors defined nilpotent adjacency matrices and applied them to the study of cycles in random graphs [19], [20]. A similar approach based on a commutative algebra whose generators $\{\varsigma_i\}$ satisfy $\varsigma_i^2 = 1$ has been used to formulate random walks on the hypercube [21].
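The graph process of Definition 1.1 is easy to simulate, which can help build intuition for the threshold results quoted above. The following is a minimal sketch, not part of the algebraic method of this paper: it draws one process by shuffling the edges of the complete graph and tracks the largest component with a union-find structure. The function names `graph_process` and `largest_component_sizes` are hypothetical.

```python
import itertools
import random


def graph_process(n, rng):
    """Return the edges of K_n in a uniformly random order.

    The first t edges of the list form G_t, so the list encodes one of the
    Omega! equally likely graph processes on {1, ..., n}.
    """
    edges = list(itertools.combinations(range(1, n + 1), 2))
    rng.shuffle(edges)
    return edges


def largest_component_sizes(n, edges):
    """Return [s_0, ..., s_Omega], s_t = largest component size of G_t."""
    parent = list(range(n + 1))
    size = [1] * (n + 1)

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    largest = 1
    sizes = [largest]  # G_0 has no edges: every component is one vertex
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            largest = max(largest, size[ru])
        sizes.append(largest)
    return sizes


if __name__ == "__main__":
    n = 10
    process = graph_process(n, random.Random(0))
    sizes = largest_component_sizes(n, process)
    first_connected = sizes.index(n)  # smallest t with G_t connected
    print("largest component sizes:", sizes)
    print("first connected graph: G_%d" % first_connected)
```

The index returned by `sizes.index(n)` plays the role of the first connected graph in Example 1.2; the particular value depends on the random seed.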
1.1
Algebraic Preliminaries
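Definition 1.3. Let V be a finite set with n > 0 elements. Let $I_V$ be the associative algebra generated by commuting idempotents $\{\varepsilon_{\{i\}} : i \in V\}$ along with the unit scalar $\varepsilon_{\emptyset} = 1 \in \mathbb{R}$. In particular, for i, j ∈ V, the generators satisfy
$$\varepsilon_{\{i\}}\,\varepsilon_{\{j\}} = \varepsilon_{\{j\}}\,\varepsilon_{\{i\}}, \tag{1.1}$$
and
$$\varepsilon_{\{i\}}\,\varepsilon_{\{i\}} = \varepsilon_{\{i\}}. \tag{1.2}$$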
For simplicity of notation, linear basis elements will be indexed by elements of the power set $2^V$ (that is, by subsets of V); i.e.,
$$\varepsilon_i = \prod_{\iota \in i} \varepsilon_{\{\iota\}}. \tag{1.3}$$
Remark 1.4. An easy realization is the algebra generated by canonical projections onto orthogonal hyperplanes in $\mathbb{R}^n$. For example, consider the collection $\{\varepsilon_i\}_{1 \le i \le n}$ defined by
$$\varepsilon_i(x_1, \ldots, x_n) = (x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n). \tag{1.4}$$
Note that any element $x \in I_V$ has canonical expansion of the form
$$x = \sum_{i \in 2^V} x_i\, \varepsilon_i, \tag{1.5}$$
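To make the multiplication rule concrete, the following is a minimal sketch of the algebra $I_V$ in Python. It relies only on (1.1) through (1.3): because the generators commute and are idempotent, the product of two basis elements is the basis element indexed by the union of their index sets. The class name `IdempotentElement` and its methods are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class IdempotentElement:
    """Element of I_V stored as {frozenset index: real coefficient}."""

    def __init__(self, terms=None):
        # Drop zero coefficients so that equal elements compare equal.
        self.terms = {k: c for k, c in (terms or {}).items() if c != 0}

    @classmethod
    def generator(cls, i):
        """The generator epsilon_{i}."""
        return cls({frozenset([i]): 1})

    @classmethod
    def scalar(cls, c):
        """The scalar c times the unit epsilon_emptyset."""
        return cls({frozenset(): c})

    def __add__(self, other):
        out = defaultdict(int)
        for k, c in self.terms.items():
            out[k] += c
        for k, c in other.terms.items():
            out[k] += c
        return IdempotentElement(out)

    def __mul__(self, other):
        out = defaultdict(int)
        for k1, c1 in self.terms.items():
            for k2, c2 in other.terms.items():
                # Commuting idempotents: index sets combine by union.
                out[k1 | k2] += c1 * c2
        return IdempotentElement(out)

    def __eq__(self, other):
        return self.terms == other.terms


if __name__ == "__main__":
    e1 = IdempotentElement.generator(1)
    e2 = IdempotentElement.generator(2)
    x = e1 + e2
    print((x * x).terms)  # coefficients of (e1 + e2)^2 = e1 + e2 + 2 e1 e2
```

Storing an element as a dictionary keyed by subsets of V mirrors the canonical expansion (1.5) directly.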
[Figure and matrix residue: a graph on vertices 1 through 12 and part of a 12 × 12 idempotent-adjacency matrix with entries such as $\varepsilon_{\{1\}}, \varepsilon_{\{2\}}, \varepsilon_{\{3\}}, \varepsilon_{\{4\}}$; the remainder is not recoverable from the source.]
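0,
$$\left(t\,\hat{\Psi}(B)\right)^{\ell} = t^{\ell}\,\ell! \sum_{\substack{\text{pairwise-disjoint } \ell\text{-tuples} \\ \zeta_{j_1}, \ldots, \zeta_{j_\ell}}} \;\prod_{k=1}^{\ell} \varepsilon_{i_k} \otimes \zeta_{j_k}. \tag{2.27}$$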
Thus, the multivectors associated with pairwise-edge-disjoint ℓ-tuples of spanning trees are recovered with multiplicity ℓ! from the ℓth power. The multiplicity factor is removed by considering the power series expansion of the exponential, and the highest power of t appearing in this expansion reveals the size of a maximal pairwise-edge-disjoint collection.
3
Second Quantization of Graph Processes
With tools in hand, graph processes can now be formulated as sequences in the algebraic probability space (G, τ). Associated with any graph process $(G_t)_0^{\Omega}$ is a corresponding sequence of idempotent-adjacency matrices, $(a_t)_0^{\Omega}$. This sequence is a quantum stochastic process associated with $(G_t)_0^{\Omega}$.

For each 0 ≤ k ≤ Ω, define the indicator function $\chi_k : (G, \tau) \to \{0, 1\}$ by
$$\chi_k(a) = \begin{cases} 1 & \text{if } \tau\!\left(a_k^{\,2n-2}\right) > \tfrac{1}{2}, \\ 0 & \text{otherwise.} \end{cases} \tag{3.1}$$
Define the set $S = \{(a_t)_0^{\Omega}\}$ of all quantum stochastic processes associated with graph processes on n vertices. Then, (S, τ) is an algebraic probability space.

Lemma 3.1. On the space of quantum stochastic processes associated with graph processes on n > 1 vertices, define the random variable
$$X(\omega) = \sum_{k=1}^{\infty} 2^{-k} \chi_k(\omega). \tag{3.2}$$
Then, the time step $k_0$ at which a giant component first emerges in the corresponding graph sequence is given by
$$k_0 = 1 - \log_2 X(\omega). \tag{3.3}$$
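Formula (3.3) can be checked numerically on a hand-made indicator sequence. The sketch below makes an explicit assumption that is not spelled out in the text: once the indicator switches to 1 at step $k_0$ it stays 1 for every later k, including all k beyond the end of the supplied list. Under that assumption the tail of the series is a geometric sum and X(ω) can be evaluated exactly. The function name `emergence_time` and the parameter `tail_value` are hypothetical.

```python
from fractions import Fraction
import math


def emergence_time(chi, tail_value=1):
    """Recover k_0 = 1 - log2(X) from indicator values chi[1], chi[2], ....

    chi[0] is ignored because the sum in (3.2) starts at k = 1.  All terms
    beyond the end of the list are assumed to equal tail_value.
    """
    K = len(chi) - 1
    X = sum(Fraction(chi[k], 2 ** k) for k in range(1, K + 1))
    X += Fraction(tail_value, 2 ** K)  # sum over k > K of 2^{-k} equals 2^{-K}
    return 1 - math.log2(float(X))


if __name__ == "__main__":
    k0, K = 7, 20
    # chi[k] = 0 for k < k0 and chi[k] = 1 for k0 <= k <= K
    chi = [0] * k0 + [1] * (K - k0 + 1)
    print(emergence_time(chi))
```

With the indicator switching on at step 7, X(ω) equals $2^{-6}$, and the function recovers 7.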
Proof. Begin by noting that the value $k_0$ corresponds to the first value of k for which $\chi_k(\omega) = 1$. It then follows from the identity
$$1 = \sum_{k=1}^{\infty} 2^{-k} = \sum_{k=1}^{k_0 - 1} 2^{-k} + \sum_{k=k_0}^{\infty} 2^{-k} \tag{3.4}$$
that
$$1 - \sum_{k=1}^{k_0 - 1} 2^{-k} = 2^{-k_0 + 1}. \tag{3.5}$$
Hence, $k_0 = -\log_2 X(\omega) + 1$.

The method of second quantization refers to the extension of operator-theoretic models of single-particle systems to systems of arbitrarily many particles. In quantum probability theory, a single particle can be represented in a Hilbert space H. In order to work with a system of arbitrarily many particles, an infinite-dimensional Hilbert space is constructed. For example, the Hilbert space $\bigoplus_{n=1}^{\infty} H^{\otimes n}$ is referred to as the free Fock space over H. The nth direct summand is the n-particle subspace (cf. [18]). Second quantization associates the Hilbert space with the corresponding Fock space. Operators on the finite-dimensional subspaces are extended to operators on the Fock space.

Borrowing the notion of creation operators from quantum probability, we can think of graphs on n vertices as systems of some number of particles between 0 and Ω. At each step of the process, an edge is “created” between a randomly chosen pair of non-adjacent vertices. Hence, the Nth graph of the process $(G_t)_0^{\Omega}$ will correspond to an N-particle system. This system can be in any one of $\binom{\Omega}{N}$ states, depending on which edges are present.

The goal now is to create a single operator that encodes all possible graph processes on n vertices. For fixed n > 0, consider the vertex set V = {1, 2, . . . , n}. For each 1 ≤ i ≤ Ω, let $a_i$ denote the idempotent-adjacency matrix associated with G = (V, E) where |E| = 1. In other words, the collection $\{a_i\}$ represents all idempotent-adjacency matrices of one-edge subgraphs of the complete graph $K_n$. Define
$$\Gamma_1 = a_1 \otimes a_2 \otimes \cdots \otimes a_{\Omega}. \tag{3.6}$$
By construction, $\Gamma_1$ encodes all one-step graph processes on n vertices. Let
$$\Gamma_2 = \left(a_1^{\otimes \Omega} \otimes a_2^{\otimes \Omega} \otimes \cdots \otimes a_{\Omega}^{\otimes \Omega}\right) + \Gamma_1^{\otimes \Omega}. \tag{3.7}$$
The operator $\Gamma_2$ now encodes all 2-step graph processes on n vertices. As the construction continues, it becomes apparent that $\Gamma_N$ is the tensor product of $\Omega^N$ idempotent-adjacency matrices.
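In particular, the sequence $\{\Gamma_N\}$ is defined via the recurrence
$$\Gamma_1 = a_1 \otimes a_2 \otimes \cdots \otimes a_{\Omega}, \tag{3.8}$$
$$\Gamma_N = (\Gamma_{N-1})^{\otimes \Omega} + \left(a_1^{\otimes \Omega^{N-1}} \otimes \cdots \otimes a_{\Omega}^{\otimes \Omega^{N-1}}\right), \quad (N > 1). \tag{3.9}$$
By construction, $\Gamma_N$ has the form
$$\Gamma_N = \bigotimes_{\ell=1}^{\Omega^N} m_{\ell},$$
where each $m_{\ell}$ is the idempotent-adjacency matrix of a multi-graph on n vertices having N edges. In particular, those $m_{\ell}$ associated with simple graphs represent Nth steps of graph processes.

For any $x \in I_V \otimes I_{V \times V}$, define the scalar sum functional by
$$\langle\langle x \rangle\rangle = \Big\langle\Big\langle \sum_{i \in 2^V,\, j \in 2^{V \times V}} \alpha_{i,j}\, \varepsilon_i \otimes \gamma_j \Big\rangle\Big\rangle = \sum_{i \in 2^V,\, j \in 2^{V \times V}} \alpha_{i,j}. \tag{3.10}$$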
Note that when m is the edge-labeled idempotent-adjacency matrix of a simple graph G on n vertices and N edges, $\langle\langle \mathrm{tr}(m^{\dagger} m) \rangle\rangle = 2N + n$. This is because the edge between each adjacent pair of vertices appears with multiplicity two, and each vertex is self-adjacent by construction of the matrix. Define the operator $\varrho_N$ on G by
$$\varrho_N(m) = \begin{cases} I & \text{if } \langle\langle \mathrm{tr}(m^{\dagger} m) \rangle\rangle > 2N + n, \\ m & \text{otherwise.} \end{cases} \tag{3.11}$$
Then extend the operator via
$$\varrho_N(\Gamma_N) = \bigotimes_{\ell=1}^{\Omega^N} \varrho_N(m_{\ell}). \tag{3.12}$$
The purpose of the operator $\varrho_N$ is to “sieve out” the edge-labeled idempotent-adjacency matrices not associated with simple graphs. The non-identity components of $\varrho_N(\Gamma_N)$ are idempotent-adjacency matrices representing the Nth steps of graph processes. All graph processes on n vertices are represented by this single tensor product. For simplification, the following notation will be adopted:
$$\hat{\Gamma}_N = \varrho_N(\Gamma_N).$$
Now define the mapping µ : G → {0, 1} by
$$\mu(m) = \begin{cases} 0 & \text{if } \deg(\mathrm{tr}(m^{\dagger} m)) = 0, \\ 1 & \text{otherwise.} \end{cases} \tag{3.13}$$
Next, define
$$\mu^{\star}(m_1 \otimes \cdots \otimes m_{\Omega^N}) = \prod_{\ell=1}^{\Omega^N} \exp(\mu(m_{\ell})). \tag{3.14}$$
Finally, define the mapping $\tau^{\star}$ by
$$\tau^{\star}(m_1 \otimes \cdots \otimes m_{\Omega^N}) = \prod_{\ell=1}^{\Omega^N} \exp(\tau(m_{\ell})). \tag{3.15}$$
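The bookkeeping behind the tensor product and the sieve can be illustrated combinatorially. In the sketch below, each tensor factor of the N-step operator is represented simply by the ordered N-tuple of edges it encodes, and the sieve is replaced by a direct check for repeated edges. This is an illustrative stand-in under those assumptions, not the trace-based test of (3.11); the function names `edge_tuples` and `sieve_simple` are hypothetical.

```python
import itertools
from math import comb, factorial


def edge_tuples(n, N):
    """All Omega^N ordered N-tuples of edges of K_n (repetitions allowed)."""
    edges = list(itertools.combinations(range(1, n + 1), 2))
    return list(itertools.product(edges, repeat=N))


def sieve_simple(tuples):
    """Keep only tuples with no repeated edge, i.e. those giving simple graphs."""
    return [t for t in tuples if len(set(t)) == len(t)]


if __name__ == "__main__":
    n, N = 4, 3
    omega = comb(n, 2)
    all_tuples = edge_tuples(n, N)
    kept = sieve_simple(all_tuples)
    # One factor per ordered tuple: Omega^N factors in total.
    assert len(all_tuples) == omega ** N
    # Each simple N-edge graph survives once per ordering of its edges.
    assert len(kept) == factorial(N) * comb(omega, N)
    print(len(all_tuples), len(kept))
```

For n = 4 and N = 3 there are 216 tuples, of which 120 survive. On the reading that µ vanishes on the identity and equals 1 on every surviving matrix, the logarithm in (3.14) evaluated on the sieved operator is this surviving count, so each N-edge simple graph is counted N! times.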
Proposition 3.2. Let a graph process $(G_t)$ be given. Let $X_N$ denote the size of a maximal connected component of $G_N$. Then, the expected value of $X_N$ is given by
$$E(X_N) = \frac{n \ln\!\left(\tau^{\star}\big((\hat{\Gamma}_N)^{2n-2}\big)\right)}{\ln\!\left(\mu^{\star}(\hat{\Gamma}_N)\right)}. \tag{3.16}$$
Proof. Because each edge is chosen from those remaining with equal probability $p_i$ at each step i of the graph process, any N-edge graph $G_N$ occurs with equal probability. In particular, the N edges are appended independently and in any order, so the probability of each graph $G_N$ is $N! \prod_{i=1}^{N} p_i$.

The denominator $\ln\!\left(\mu^{\star}(\hat{\Gamma}_N)\right)$ represents the number of N-edge graphs possible, and the numerator $n \ln\!\left(\tau^{\star}\big((\hat{\Gamma}_N)^{2n-2}\big)\right)$ represents the sum of sizes of maximal components among all possible N-edge graphs.

Define the mapping $\nu_M : G \to \{0, 1\}$ by
$$\nu_M(m) = \begin{cases} 1 & \text{if } \tau(m) = \frac{M}{n}, \\ 0 & \text{otherwise.} \end{cases} \tag{3.17}$$
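For small n, the expectation in Proposition 3.2 can be cross-checked by brute force. The sketch below does not evaluate the operator expression (3.16); it computes the same ratio combinatorially, summing the largest-component size over every ordered sequence of N distinct edges and dividing by the number of such sequences, so the N! orderings of each graph cancel. The function names are hypothetical.

```python
import itertools
from fractions import Fraction


def max_component_size(n, edges):
    """Size of a largest connected component of the graph ({0..n-1}, edges)."""
    parent = list(range(n))
    size = [1] * n

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(v)] for v in range(n))


def expected_max_component(n, N):
    """Average largest-component size over all ordered N-step edge sequences."""
    all_edges = list(itertools.combinations(range(n), 2))
    total = 0  # sum of largest-component sizes, each graph counted N! times
    count = 0  # number of ordered sequences of N distinct edges
    for seq in itertools.permutations(all_edges, N):
        total += max_component_size(n, seq)
        count += 1
    return Fraction(total, count)


if __name__ == "__main__":
    print(expected_max_component(4, 2))
```

For n = 4 and N = 2 the result is 14/5: of the 15 two-edge graphs, 12 have two edges sharing a vertex (largest component 3) and 3 are perfect matchings (largest component 2).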
Now, define $\nu_M^{\star}$ by
$$\nu_M^{\star}(m_1 \otimes \cdots \otimes m_{\Omega^N}) = \prod_{\ell=1}^{\Omega^N} \exp\left(\nu_M(m_{\ell})\right). \tag{3.18}$$
Proposition 3.3. Let a graph process $(G_t)$ be given. Let M ≤ n be an arbitrary positive integer. Let $E_{N,M}$ be the event that $G_N$ contains a maximal connected component of size M. Then,
$$P(E_{N,M}) = \frac{\ln\!\left(\nu_M^{\star}\big((\hat{\Gamma}_N)^{2n-2}\big)\right)}{\ln\!\left(\mu^{\star}(\hat{\Gamma}_N)\right)}. \tag{3.19}$$
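The probability in Proposition 3.3 can likewise be tabulated by direct enumeration for small n. The sketch below reads the event as "the largest component of the N-edge graph has exactly M vertices", which is an interpretation of the statement rather than something the text spells out, and it counts unordered N-edge graphs rather than evaluating (3.19). The function names are hypothetical.

```python
import itertools
from collections import Counter
from fractions import Fraction


def component_sizes(n, edges):
    """Sizes of the connected components of the graph ({0..n-1}, edges)."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, sizes = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first search from an unvisited vertex
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sizes


def max_component_distribution(n, N):
    """Map M -> P(largest component of a uniform N-edge graph has size M)."""
    all_edges = list(itertools.combinations(range(n), 2))
    counts = Counter(
        max(component_sizes(n, g)) for g in itertools.combinations(all_edges, N)
    )
    total = sum(counts.values())
    return {M: Fraction(c, total) for M, c in sorted(counts.items())}


if __name__ == "__main__":
    print(max_component_distribution(4, 3))
```

For n = 4 and N = 3, the 20 possible graphs are 4 triangles with an isolated vertex and 16 spanning trees, giving probability 1/5 for M = 3 and 4/5 for M = 4.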
Proof. As in the proof of Proposition 3.2, the denominator represents the number of N-edge graphs. Here, the numerator $\ln\!\left(\nu_M^{\star}\big((\hat{\Gamma}_N)^{2n-2}\big)\right)$ represents the number of N-edge graphs containing a maximal component of size M.

The following corollary is an immediate consequence of the preceding results using simple inclusion-exclusion.

Corollary 3.4. Let a graph process $(G_t)$ be given. Let M ≤ n be an arbitrary positive integer. Let $X_{N,M}$ denote the event that a maximal connected component of size M emerges at time step N. Then,
$$P(X_{N,M}) \le \frac{\ln\!\left(\mu^{\star}(\hat{\Gamma}_{N-1})\right) - \ln\!\left(\nu_M^{\star}\big((\hat{\Gamma}_{N-1})^{2n-2}\big)\right)}{\ln\!\left(\mu^{\star}(\hat{\Gamma}_{N-1})\right)} + \frac{\ln\!\left(\nu_M^{\star}\big((\hat{\Gamma}_N)^{2n-2}\big)\right)}{\ln\!\left(\mu^{\star}(\hat{\Gamma}_N)\right)}. \tag{3.20}$$

By considering a second quantization using edge-labeled idempotent-adjacency matrices, it becomes possible to compute the expected number of (k, d)-components of the Nth graph of the process. In particular, the expected number of spanning trees of $G_N$ can be computed.

By considering edge-labeled idempotent-adjacency matrices $\{\hat{a}_i\}$ in place of the matrices $\{a_i\}$ used to construct $\Gamma_N$, the second quantization operator $\Upsilon_N$ is analogously defined. That is,
$$\Upsilon_1 = \hat{a}_1 \otimes \hat{a}_2 \otimes \cdots \otimes \hat{a}_{\Omega}, \tag{3.21}$$
$$\Upsilon_N = (\Upsilon_{N-1})^{\otimes \Omega} + \left(\hat{a}_1^{\otimes \Omega^{N-1}} \otimes \cdots \otimes \hat{a}_{\Omega}^{\otimes \Omega^{N-1}}\right), \quad (N > 1). \tag{3.22}$$
The operator $\Upsilon_N$ is of the form
$$\Upsilon_N = \bigotimes_{\ell=1}^{\Omega^N} \hat{m}_{\ell}, \tag{3.23}$$
where each $\hat{m}_{\ell}$ is the edge-labeled idempotent-adjacency matrix of a multi-graph on n vertices having N edges. In particular, those $\hat{m}_{\ell}$ associated with simple graphs represent Nth steps of graph processes.

The operator $\hat{\Upsilon}_N$ is now defined by
$$\hat{\Upsilon}_N = \varrho(\Upsilon_N), \tag{3.24}$$
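where $\varrho$ is defined as in (3.11).

Define the mapping $d^{\star} : \tilde{G}^{\otimes \Omega^N} \to \mathbb{R}$ by
$$d^{\star}(\hat{m}_1 \otimes \cdots \otimes \hat{m}_{\Omega^N}) = \prod_{\ell=1}^{\Omega^N} \exp\left(\dim\left(\left\langle \mathrm{tr}\big(\hat{m}_{\ell}^{\,2n-2}\big) \right\rangle_{n,n-1}\right)\right). \tag{3.25}$$
The following proposition follows from Proposition 2.14 and the construction of the second quantization operator.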
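Proposition 3.5. Let a graph process $(G_t)$ be given. Let $T_N$ denote the number of spanning trees of $G_N$. Then, the expected value of $T_N$ is given by
$$E(T_N) = \frac{n \ln\!\left(d^{\star}(\hat{\Upsilon}_N)\right)}{\ln\!\left(\mu^{\star}(\hat{\Upsilon}_N)\right)}. \tag{3.26}$$

For fixed positive integers k and d, use the mapping $\chi^{(i)}_{k,d} : \tilde{G} \to \{0, 1\}$ of (2.12) to define $\chi^{\star}_{(k,d)} : \tilde{G}^{\otimes \Omega^N} \to \mathbb{R}$ by
$$\chi^{\star}_{(k,d)}(\hat{m}_1 \otimes \cdots \otimes \hat{m}_{\Omega^N}) = \prod_{\ell=1}^{\Omega^N} \exp\left[\frac{1}{k} \sum_{i=1}^{n} \chi^{(i)}_{k,d}\big(\hat{m}_{\ell}^{\,n-1}\big)\right]. \tag{3.27}$$

Proposition 3.6. Let a graph process $(G_t)$ be given. Let $X_N^{(k,d)}$ denote the number of (k, d)-components of $G_N$. Then, the expected value of $X_N^{(k,d)}$ is given by
$$E\big(X_N^{(k,d)}\big) = \frac{\ln\!\left(\chi^{\star}_{(k,d)}(\hat{\Upsilon}_N)\right)}{\ln\!\left(\mu^{\star}(\hat{\Upsilon}_N)\right)}. \tag{3.28}$$
Proof. As in the proof of Proposition 3.2, the denominator $\ln\!\left(\mu^{\star}(\hat{\Upsilon}_N)\right)$ represents the number of N-edge graphs possible, and the numerator $\ln\!\left(\chi^{\star}_{(k,d)}(\hat{\Upsilon}_N)\right)$ represents the sum of numbers of (k, d)-components among all possible N-edge graphs.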
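The expected number of spanning trees treated in Proposition 3.5 also admits a direct check for small n. The sketch below is not the operator method of this section. It enumerates all N-edge graphs and counts their spanning trees by testing every set of n − 1 edges, then compares the average with a closed form obtained from Cayley's formula and linearity of expectation: each of the $n^{n-2}$ spanning trees of the complete graph lies in a uniformly random N-edge graph with probability $\binom{\Omega - (n-1)}{N - (n-1)} / \binom{\Omega}{N}$. The function names are hypothetical, and n ≥ 2 is assumed.

```python
import itertools
from fractions import Fraction
from math import comb


def is_spanning_tree(n, edges):
    """True if the given n - 1 edges are acyclic, hence span all n vertices."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge closes a cycle
        parent[ru] = rv
    return len(edges) == n - 1


def expected_spanning_trees(n, N):
    """Average number of spanning trees over all N-edge graphs on n vertices."""
    all_edges = list(itertools.combinations(range(n), 2))
    graphs = list(itertools.combinations(all_edges, N))
    total = 0
    for g in graphs:
        total += sum(
            is_spanning_tree(n, t) for t in itertools.combinations(g, n - 1)
        )
    return Fraction(total, len(graphs))


def closed_form(n, N):
    """Cayley's formula combined with linearity of expectation (n >= 2)."""
    omega = comb(n, 2)
    if N < n - 1:
        return Fraction(0)
    return Fraction(
        n ** (n - 2) * comb(omega - (n - 1), N - (n - 1)), comb(omega, N)
    )


if __name__ == "__main__":
    print(expected_spanning_trees(4, 4), closed_form(4, 4))
```

For n = 4 and N = 4 both computations give 16/5, which offers a small sanity check for any implementation of (3.26).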
4
Conclusion
This paper represents one step toward a comprehensive study of graph processes and algorithms using tools of algebraic probability.
References
[1] L. Accardi, A. Ben Ghorbal, and N. Obata, Monotone independence, comb graphs and Bose-Einstein condensation, Infin. Dimens. Anal. Quantum Probab. Relat. Top., 7 (2004), 419–435.
[2] F.A. Berezin, The Method of Second Quantization, Academic Press, New York, 1966.
[3] T. Bohman, A. Frieze, Avoiding a giant component, Random Structures and Algorithms, 19 (2001), 75–85.
[4] T. Bohman, A. Frieze, N. Wormald, Avoiding a giant component in half the edge set of a random graph, Random Structures and Algorithms, 25 (2004), 432–449.
[5] T. Bohman, J.H. Kim, A phase transition for avoiding a giant component, Random Structures and Algorithms, 28 (2006), 195–214.
[6] T. Bohman, D. Kravitz, Creating a giant component, Combin. Probab. Comput., 15 (2006), 489–511.
[7] B. Bollobás, The evolution of random graphs, Trans. Amer. Math. Soc., 286 (1984), 257–274.
[8] F. Chung, L. Lu, Connected components in random graphs with given expected degree sequences, Annals of Combinatorics, 6 (2002), 125–145.
[9] P. Erdős and A. Rényi, On random graphs. I, Publ. Math. Debrecen, 6 (1959), 290–297.
[10] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci., 5 (1960), 17–61.
[11] P. Erdős and A. Rényi, On the evolution of random graphs, Bull. Inst. Internat. Statist. Tokyo, 38 (1961), 343–347.
[12] A. Flaxman, D. Gamarnik, G. Sorkin, Embracing the giant component, Random Structures and Algorithms, 27 (2005), 277–289.
[13] A.M. Frieze, T. Łuczak, Edge-disjoint spanning trees in random graphs, Periodica Mathematica Hungarica, 21 (1990), 35–37.
[14] Y. Hashimoto, A. Hora, and N. Obata, Central limit theorems for large graphs: Method of quantum decomposition, J. Math. Phys., 44 (2003), 71–88.
[15] M. Molloy, When does the giant component bring unsatisfiability?, Combinatorica, to appear.
[16] N. Obata, Quantum probabilistic approach to spectral analysis of star graphs, Interdisciplinary Information Sciences, 10 (2004), 41–52.
[17] E.M. Palmer, J.J. Spencer, Hitting time for k edge-disjoint spanning trees in a random graph, Periodica Mathematica Hungarica, 31 (1995), 235–240.
[18] K.R. Parthasarathy, An Introduction to Quantum Stochastic Calculus, Birkhäuser Verlag, Basel, 1992.
[19] R. Schott, G.S. Staples, Nilpotent adjacency matrices and random graphs, http://www.siue.edu/∼sstaple/index files/RndGrphs.pdf, (2006).
[20] R. Schott, G.S. Staples, Nilpotent adjacency matrices, random graphs, and quantum random variables, Prépublications de l’Institut Élie Cartan 2007/n° 8, (2007).
[21] G.S. Staples, Clifford-algebraic random walks on the hypercube, Advances in Applied Clifford Algebras, 15 (2005), 213–232.