Standard canonical forms for certain rank one perturbations and an application to the (complex) Google PageRanking problem
Stefano Serra Capizzano Dipartimento FM - Sede di Como Universit`a dell’Insubria Via Valleggio 11 - 22100 Como, Italy e-mail:
[email protected]
Inspired by • G. Golub. Part of the results joint with: • R. Horn (thanks to D. Noutsos). Joint/Related work with/by: • C. Breziski, • M. Redivo Zaglia.
The general matrix theoretic problem Assumptions • A ∈ Cn×n with eigenvalues λ, λ2, . . . , λn, • x, y ∈ Cn s.t. Ax = λx, y ∗A = λy ∗,y ∗x = 1, • v ∈ Cn s.t. v ∗x = 1 and c ∈ C, • assume λ 6= cλj for each j = 2, . . . , n. (*)
A(c) := cA + (1 − c)λxv ∗.
• Pb1: eigenvalues of A(c), • Pb2: standard canonical forms (SCFs) of A(c).
The Google matrix theoretic problem
The web hyperlink matrix G(c) used by Google for computing the PageRank is a special case of the matrix A(c) in which: • A ∈ Rn×n real nonnegative, • A row stochastic (dangling nodes considered), • c ∈ [0, 1), x vector of all ones, • v pos. prob. vector i.e. every vi > 0, kvkl1 = 1. Therefore for each j = 2, . . . , n we uniformly have: λ = 1, |λj | ≤ 1 (Perron-Frobenius), v ∗x = kvkl1 = 1, assumption (*) satisfied since |cλj | ≤ c < 1 = λ.
• Obs: c ∈ D1 i.e. c ∈ C, |c| < 1 allowed (!): complex Google matrix, • Pb1: eigenvalues of G(c), • Pb2: standard canonical forms (SCFs) of G(c).
The Google PageRanking problem Given G(c) = cA + (1 − c)xv ∗, c ∈ [0, 1) compute the vector y(c) s.t. y ∗(c)G(c) = y ∗(c),
y(c) ≥ 0, ky(c)kl1 = 1.
• y(c) is the PageRank vector, • the problem for c = 1 is ill-posed in the Hadamard sense (infinitely many solutions, discontinuous dependency on c), • c can be viewed as the regularization parameter that forces uniqueness and stability, • the nonnegativity constraint makes the problem close to the image restoration problem (ill-posed and with the same constraint).
Solution of Problem 1 • Solution of Pb1: the eigenvalues of A(c) are λ, cλ2, . . . , cλn, • Solution of Pb1: the eigenvalues of G(c) are 1, cλ2, . . . , cλn.
The above statements follow from the following result: Proposition 1 Let B ∈ Cn×n with eigenvalues λ, λ2, . . . , λn and let x, w ∈ Cn, Bx = λx. The eigenvalues of B +αxw∗ are λ+αw∗x, λ2, . . . , λn.
• B := cA with eigenvalues cλ, cλ2, . . . , cλn, • α := (1 − c)λ, w := v, • hence αw∗x = (1 − c)λv ∗x =(1 − c)λ, • therefore Spect(A(c)) is given by λ, cλ2, . . . , cλn, • and Spect(G(c)) is given by 1, cλ2, . . . , cλn since λ = 1.
Solution of Problem 1: (Horn’s) poetry of PageRanking • For generic u, w ∈ Cn, det(I + uw∗) = 1 + w∗u, • Since (tI − B)x = (t − λ)x, we have (tI − B)−1x = (t − λ)−1x for any t ∈Spect(B), / • pX (t) denotes the characteristic polynomial of any X ∈ Cn×n.
Proof pB+αxw∗ (t) = = = = =
det(tI − (B + αxw∗)) det((tI − B) − αxw∗) det(tI − B)det(I − α(tI − B)−1xw∗) pB (t)det(I − α(t − λ)−1xw∗) ¡ ¢ −1 ∗ pB (t) 1 − α(t − λ) w x .
Hence (t − λ)pB+αxw∗ (t) = (t − λ − αw∗x)pB (t). Consequently, the eigenvalues of B + αxw∗ are λ + αw∗x, λ2, . . . , λn. •
Solution of Problem 2: A(c) SCFs Theorem 1 Under the first three assumptions on A(c): (a) The Jordan Canonical Form (JCF) of A is [λ] ⊕ J, (J being the ’remaining’ JCF). Let B ∈ Mn−1 with J as JCF. ¤ £ (b) ∃S2 ∈ Mn,n−1 s.t. S := x S2 is nonsingular and S −1AS = [λ] ⊕ B. (c) · ¸ ∗ λ w −1 S A(c)S = , in which w∗ := (1−c)λv ∗S2 0 cB Assume (*) i.e. λ 6= cλj for each j = 2, . . . , n. Let J(c) denote the JCF of cJ (J(c) obtained from J by replacing every block Jk (λj ) by Jk (cλj )). (d) ∃!ξ (with entries rational functions of c) s.t. w∗ = ξ ∗(cB − λI) and · ¸ £ ¤ 1 ξ∗ £ ¤ ∗ S(c) := x S2 = x S2 + xξ . 0 I Then S(c) is nonsingular and S(c)−1A(c)S(c) = [λ] ⊕ cB. In particular, the JCF of A(c) is [λ] ⊕ J(c). £ ¤ −∗ Let S := y S2 . (e) The first column of S(c)−∗ is y − S2ξ.
Solution of Problem 2: G(c) JCF Theorem 2 • Let G(c) the hyperlink Google matrix with complex c ∈ D1. • Assume G(1) = XJ(1)X −1, X = [x|x2| · · · |xn], ˜ = DJ(c)D−1, D = X −∗ = [y|y2| · · · |yn], J(c) diag(1, c, . . . , cn−1), J(c) =
1 cλ2 ∗ ...
... cλn−1
, ∗ cλn
with ∗ ∈ {0, 1}. Then −1 ˜ G(c) = Z J(c)Z , Z = XR−1,
with R = In + e1w∗, w∗ = (0, w2, · · · , wn), w2 = (1 − c)v ∗x2/(1 − cλ2), wj = [(1 − c)v ∗xj + [J(c)]j−1,j wj−1]/(1 − cλj ), j = 3, . . . , n.
Rational expansion for y(c) With the notations of the above theorem w2 = (1 − c)v ∗x2/(1 − cλ2), wj = [(1 − c)v ∗xj + [J(c)]j−1,j wj−1]/(1 − cλj ), j = 3, . . . , n.
• y(c) = y +
Pn
j=2 wj yj ,
• y(c) easily computed for |c| ¿ 1.
• Observation: ∃! lim y(c) = y, c→1
• Suggestion: Vector Extrapolation for c ≈ 1.
Algorithmic proposal • Step 1: with cj = 0.25∗exp(i2jπ/m), i2 = −1, compute y(cj ), j = 0, . . . , m−1 (Evaluation via vector FFT). • Step 2: Vector Extrapolation at the desired (difficult) c ≈ 1 (e.g. c = 0.85, c = 0.99, c = 1) obtaining y˜(c). • Step 3: Project y˜(c) into the nonnegative cone and make l1 normalization. • Step 4: Make iterative refinement by classical procedures: since c ≈ 1 it is advisable to use preconditioning and Krylov techniques.
Observations on the Google PageRanking problem Question: which vector y?
• c = 1: set of l1 norm solutions Pt {z : z ∗ A = z∗, z ≥ 0, kzkl1 = 1}= {z : j=1 λj Z[j], λj ≥ Pt 0, j=1 λj = 1} = Y . • Z[j] ∈ Y , Z[j], j = 1, . . . t, have “local” support. • ∃! lim y(c) = y. c→1
• Such y depends on v (!): exact dependency has to be investigated.
Conclusions and References • Complex Google ⇔ SCFs derived in a more general setting (nonnegativity of G(c) not required). • Asymptotic expansions ⇔ vector FFT Evaluation + Extrapolation. • Ill-position with nonnegativity constraints ⇔ Image Restoration techniques (SPAMS as noise?).
• Spectral analysis, JCF, idea of vector extrapolation for y(c): Serra-Capizzano, SIMAX 2005. • Spectral analysis, SCFs, complex Google: Horn, Serra-Capizzano, 2006. • Extrapolation procedures for PageRanking computation: Breziski, Redivo-Zaglia, Serra-Capizzano, Comptes Rendus 2005; Breziski, Redivo-Zaglia, 2005 (several proposals). Related work by Golub, Langville, Mayer, Ipsen, Kamvar, Del Corso, Gulli’, Romani, etc.