arXiv:1604.08188v5 [math.PR] 28 Feb 2018

Stability of the Matrix Dyson Equation and Random Matrices with Correlations

Oskari H. Ajanki*          László Erdős*          Torben Krüger*
IST Austria                IST Austria            IST Austria
[email protected]        [email protected]    [email protected]

December 8, 2016

Abstract We consider real symmetric or complex hermitian random matrices with correlated entries. We prove local laws for the resolvent and universality of the local eigenvalue statistics in the bulk of the spectrum. The correlations have fast decay but are otherwise of general form. The key novelty is the detailed stability analysis of the corresponding matrix valued Dyson equation whose solution is the deterministic limit of the resolvent.

Keywords: Correlated random matrix, Local law, Bulk universality
AMS Subject Classification (2010): 60B20, 15B52, 46T99

Contents

1 Introduction                                          2
2 Main results                                          4
  2.1 The Matrix Dyson Equation                         4
  2.2 Random matrices with correlations                 9
3 Local law for random matrices with correlations      13
4 The Matrix Dyson Equation                            16
  4.1 The solution of the Matrix Dyson Equation        16
  4.2 Stability of the Matrix Dyson Equation           20
5 Estimating the error term                            30
6 Fluctuation averaging                                37
7 Bulk universality and rigidity                       51
  7.1 Rigidity                                         51
  7.2 Bulk universality                                54
A Appendix                                             58

* Partially supported by ERC Advanced Grant RANMAT No. 338804

1 Introduction

E. Wigner's vision of the ubiquity of random matrix spectral statistics in quantum systems has posed a major challenge to mathematics. The basic conjecture is that the distribution of the eigenvalue gaps of a large self-adjoint matrix with sufficient disorder is universal in the sense that it is independent of the details of the system and depends only on the symmetry type of the model. These universal statistics were computed by Dyson, Gaudin and Mehta for the Gaussian Unitary and Orthogonal Ensembles (GUE/GOE) in the limit as the dimension of the matrix goes to infinity. GUE and GOE are the simplest mean field random matrix models in their respective symmetry classes. They have centered Gaussian entries that are identically distributed and independent (modulo the hermitian symmetry). The celebrated Wigner-Dyson-Mehta (WDM) universality conjecture, as formulated in the classical book of Mehta [44], asserts that the same gap statistics holds if the matrix elements are independent and have an arbitrary identical distribution (such matrices are called Wigner ensembles). The WDM conjecture has recently been proved in increasing generality in a series of papers [18, 21, 24, 25] for both the real symmetric and complex hermitian symmetry classes via the Dyson Brownian motion. An alternative approach, introducing the four-moment comparison theorem, was presented in [49, 50, 52]. In this paper we only discuss universality in the bulk of the spectrum, but we remark that a similar development took place for edge universality. The next step towards Wigner's vision is to drop the assumption of identical distribution in the WDM conjecture while still maintaining the mean field character of the model by requiring uniform lower and upper bounds on the variances of the matrix elements. This generalization has been achieved in two steps.
If the matrix of variances is stochastic, then universality was proved in [18, 27, 29], in parallel with the proof of the original WDM conjecture for Wigner ensembles. Without the stochasticity condition on the variances, the limiting eigenvalue density is no longer the Wigner semicircle law; the correct density was analyzed in [1, 2] and universality was proved in [4]. We remark that one may also depart from the semicircle law by adding a large diagonal component to Wigner matrices; universality for such deformed Wigner matrices was obtained in [43]. Finally, we mention a separate direction of generalizing the original WDM conjecture that aims at departing from the mean field condition: bulk universality for general band matrices with a band width comparable to the matrix size was proved in [11], see also [48] for Gaussian block-band matrices.

In this paper we drop the third key condition in the original WDM conjecture, the independence of the matrix elements, i.e. we consider matrices with correlated entries. Correlations come in many different forms, and if they are extremely strong and long range, universality may even be violated. We therefore consider random matrix models with a suitable decay of correlations. These models still carry sufficiently many random degrees of freedom for Wigner's vision to hold and, indeed, our main result yields spectral universality for such matrices.

We now describe the key points of the current work. Our main result is a local law for the resolvent

  G(ζ) := (H − ζ1)⁻¹ ,   (1.1)

of the random matrix H = H* ∈ C^{N×N}, with the spectral parameter ζ in the complex upper half plane H := {ζ ∈ C : Im ζ > 0} lying very close to the real axis. We show that, as the size N of the random matrix tends to infinity, G = G(ζ) is well approximated by a deterministic matrix M = M(ζ) that satisfies a nonlinear matrix equation of the form

  1 + (ζ1 − A + S[M])M = 0 .   (1.2)

Here the self-adjoint matrix A and the operator S on the space of matrices are determined by the first two moments of the random matrix:

  A := E H ,   S[R] := E (H − A)R(H − A) .   (1.3)

The central role of (1.2) in the context of random matrices has been recognized by several authors [33, 38, 45, 53]. We will call (1.2) the Matrix Dyson Equation (MDE), since the analogous equation for the resolvent is sometimes called the Dyson equation in perturbation theory. Local laws have become a cornerstone in the analysis of spectral properties of large random matrices [4, 8, 20, 23, 29, 35, 37, 51]. In its simplest form, a local law concerns the normalized trace (1/N) Tr G(ζ) of the resolvent. Viewed as a Stieltjes transform, it describes the empirical density of eigenvalues on the scale determined by η = Im ζ. Assuming a normalization such that the spectrum of H remains bounded as N → ∞, the typical eigenvalue spacing in the bulk is of order 1/N. The local law asserts that this normalized trace approaches a deterministic function m(ζ) as the size N of the matrix tends to infinity, and this convergence holds uniformly even if η = η_N depends on N, as long as η ≫ 1/N. Equivalently, the empirical density of the eigenvalues converges, on any scale slightly above 1/N, to a deterministic limit measure on R with Stieltjes transform m(ζ). Since G is asymptotically close to M, the deterministic limit of the Stieltjes transform of the empirical spectral measure is given by m(ζ) = (1/N) Tr M(ζ). Already in the case of random matrices with centered independent entries (Wigner-type matrices), the limiting measure ρ(dω) and its Stieltjes transform m(ζ) typically depend on the entire matrix of variances s_{xy} := E|h_{xy}|², and the only known way to determine ρ is to solve (1.2). However, in this setting the problem simplifies considerably because the off-diagonal elements of G tend to zero, M is a diagonal matrix, and (1.2) reduces to a vector equation for its diagonal elements. In case the variance matrix is doubly stochastic, Σ_y s_{xy} = 1 (generalized Wigner matrix), the problem simplifies yet again, leading to M = m_sc 1, where m_sc = m_sc(ζ) is the Stieltjes transform of the celebrated semicircle law.

The main novelty of this work is to handle general correlations that do not allow one to simplify (1.2). The off-diagonal matrix elements G_{xy}, x ≠ y, do not vanish in general, even in the N → ∞ limit. The proof of the local law consists of two major parts. First, we derive an approximate equation

  1 + (ζ1 − A + S[G])G ≈ 0 ,   (1.4)

for the resolvent of H. Second, we show that the Matrix Dyson Equation (1.2) is stable under small perturbations, concluding that G ≈ M. The nontrivial correlations and the non-commutativity of the matrix structure in the Dyson equation pose major difficulties compared to the uncorrelated case. Local laws are the first step of a general three step strategy developed in [24, 25, 27, 29] for proving universality. The second step is to add a tiny independent Gaussian component and prove universality for this slightly deformed model by analyzing the fast convergence of the Dyson Brownian motion (DBM) to local equilibrium. Finally, the third step is a perturbation argument showing that the tiny Gaussian component does not alter the local statistics.
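As an aside (our own illustration, not part of the paper's argument): in the simplest GUE-type case A = 0 and S[M] = ⟨M⟩1, the MDE (1.2) reduces to the scalar equation 1 + (ζ + m)m = 0, whose unique solution with Im m > 0 is the Stieltjes transform of the semicircle law. A minimal numerical sketch, with all names our own:

```python
import numpy as np

def m_sc_fixed_point(zeta, steps=2000):
    """Solve 1 + (zeta + m) m = 0, i.e. m = -1/(zeta + m),
    by fixed-point iteration (attracting for Im zeta > 0)."""
    m = 1j  # any starting point in the upper half plane
    for _ in range(steps):
        m = -1.0 / (zeta + m)
    return m

def m_sc_exact(zeta):
    """Closed form m_sc(zeta) = (-zeta + sqrt(zeta^2 - 4))/2,
    with the square-root branch chosen so that Im m > 0."""
    s = np.sqrt(zeta * zeta - 4.0)
    if ((-zeta + s) / 2).imag < 0:
        s = -s
    return (-zeta + s) / 2

zeta = 0.3 + 0.05j          # spectral parameter slightly above the real axis
m = m_sc_fixed_point(zeta)
assert m.imag > 0                          # Stieltjes transform of a measure
assert abs(m - m_sc_exact(zeta)) < 1e-10   # iteration matches the closed form
```

The same fixed-point picture, with matrices replacing scalars, underlies the analysis of (1.2) in the general correlated case.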

In fact, the second and the third steps are very robust arguments and they extend easily to the correlated case. They do not use any properties of the original ensemble other than the a priori bounds encoded in the local laws, provided that the variances of the matrix elements have a positive lower bound (see [12, 26, 41, 42]). Therefore our work focuses on the first step, establishing the stability of (1.2) and thus obtaining a local law. Prior to the current paper, bulk universality had already been established for several random matrix models that carry some specific correlation from their construction. These include sample covariance matrices [25], adjacency matrices of large regular graphs [7] and invariant β-ensembles at various levels of generality [9, 10, 16, 17, 30, 34, 47]. However, none of these papers aimed at understanding the effect of general correlations, nor were their methods suitable to deal with them. Universality for Gaussian matrices with a translation invariant covariance structure was established in [3]. For general distributions of the matrix entries, but with a specific two-scale finite range correlation structure that is smooth on the large scale and translation invariant on the short scale, universality was proved in [14], independently of the current work. Finally, we mention that there exists an extensive literature on the limiting eigenvalue distribution of random matrices with correlated entries on the global scale (see e.g. [5, 6, 13, 33, 36, 46] and references therein); however, these works either dealt with Gaussian random matrices or with more specific correlation structures that allow one to effectively reduce (1.2) to a vector or scalar equation.

While the Matrix Dyson Equation in full generality was introduced for the analysis on the global scale before us, we are not aware of a proof establishing that the empirical density of states converges to the deterministic density given by the solution of the MDE for a class of models as broad as the one considered in this paper. This convergence is expressed by the fact that (1/N) Tr G(ζ) ≈ (1/N) Tr M(ζ) holds for any fixed ζ ∈ H. We thus believe that our proof identifying the limiting eigenvalue distribution is a new result even on the global scale for ensembles with general short range correlations and non-Gaussian distribution.

We present the stability of the MDE and its application to random matrices with correlated entries separately. Our findings on the MDE are given in Section 2.1, while Section 2.2 contains the results about random matrices with correlated entries. These sections can be read independently of each other. In Section 3 we prove the local law for random matrices with correlations. The proof relies on the results from Section 2.1. These results concerning the MDE are established in Section 4, which can be read independently of any other section. The main technical ingredients of the proof in Section 3 are (i) estimates on the random error term appearing in the approximate MDE (1.4) and (ii) the fluctuation averaging mechanism for this error term. These two inputs will be provided in Sections 5 and 6, respectively. Finally, we apply the local law to establish the rigidity of eigenvalues and bulk universality in Section 7.

2 Main results

2.1 The Matrix Dyson Equation

In this section we present our main results on the Matrix Dyson Equation and its stability. The corresponding proofs are carried out in Section 4. We consider the linear space C^{N×N} of N×N complex matrices R = (r_{xy})_{x,y=1}^N, and make it a Hilbert space by equipping it with the standard normalized scalar product

  ⟨R, T⟩ := (1/N) Tr R*T .   (2.1)

We denote the cone of strictly positive definite matrices by C₊ := {R ∈ C^{N×N} : R > 0}, and by C̄₊ its closure, the cone of positive semidefinite matrices. Let A = A* ∈ C^{N×N} be a self-adjoint matrix. We will refer to A as the bare matrix. Furthermore, let S : C^{N×N} → C^{N×N} be a linear operator that is

• self-adjoint w.r.t. the scalar product (2.1), i.e. Tr R*S[T] = Tr S[R]*T for any R, T ∈ C^{N×N},
• positivity preserving, i.e. S[R] ≥ 0 for any R ≥ 0.

Note that in particular S commutes with taking the adjoint, S[R]* = S[R*], and hence it is real symmetric, Tr R S[T] = Tr S[R] T, for all R, T ∈ C^{N×N}. We will refer to S as the self-energy operator. We call a pair (A, S) consisting of a bare matrix and a self-energy operator with the properties above a data pair. For a given data pair (A, S) and a spectral parameter ζ ∈ H in the upper half plane we consider the associated Matrix Dyson Equation (MDE),

  −M(ζ)⁻¹ = ζ1 − A + S[M(ζ)] ,   (2.2)

for a solution matrix M = M(ζ) ∈ C^{N×N} with positive definite imaginary part,

  Im M := (1/2i)(M − M*) ∈ C₊ .   (2.3)

The question of existence and uniqueness of solutions to (2.2) with the constraint (2.3) has been answered in [38]. The MDE has a unique solution matrix M(ζ) for any spectral parameter ζ ∈ H, and these matrices constitute a holomorphic function M : H → C^{N×N}.

On the space of matrices C^{N×N} we consider three norms. For R ∈ C^{N×N} we denote by ‖R‖ the operator norm induced by the standard Euclidean norm ‖·‖ on C^N, by ‖R‖_hs := √⟨R, R⟩ the norm associated with the scalar product (2.1), and by

  ‖R‖_max := max_{x,y=1}^N |r_{xy}| ,   (2.4)

the entrywise maximum norm on C^{N×N}. We also denote the normalized trace of R by ⟨R⟩ := ⟨1, R⟩. For linear operators T : C^{N×N} → C^{N×N} we denote by ‖T‖ the operator norm induced by the norm ‖·‖ on C^{N×N}, and by ‖T‖_sp the operator norm induced by ‖·‖_hs.

The following proposition provides a representation of the solution M as the Stieltjes transform of a measure with values in C̄₊. This is a standard result for matrix-valued Nevanlinna functions (see e.g. [32]). For the convenience of the reader we provide a proof, which also gives effective control on the boundedness of the support of this matrix-valued measure.

Proposition 2.1 (Stieltjes transform representation). Let M : H → C^{N×N} be the unique solution of (2.2) with Im M ∈ C₊. Then M admits a Stieltjes transform representation,

  m_{xy}(ζ) = ∫_R v_{xy}(dτ)/(τ − ζ) ,   ζ ∈ H , x, y = 1, . . . , N .   (2.5)

The measure V(dτ) = (v_{xy}(dτ))_{x,y=1}^N on the real line with values in positive semidefinite matrices is unique. It satisfies the normalization V(R) = 1 and has support in the interval [−κ, κ], where

  κ := ‖A‖ + 2‖S‖^{1/2} .   (2.6)
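To make (2.2) and the support bound (2.6) concrete, here is a small numerical sketch of ours, with flat data A = 0 and S[R] = ⟨R⟩1 (so that κ = 2 and the solution is the semicircle law): iterating M ↦ −(ζ1 − A + S[M])⁻¹ approximates the solution, and the harmonic extension of the density of states is indeed negligible outside [−κ, κ].

```python
import numpy as np

N = 40
A = np.zeros((N, N))
I = np.eye(N)

def S(R):
    # flat self-energy: S[R] = <R> 1, with <R> = Tr(R)/N
    return (np.trace(R) / N) * I

def solve_mde(zeta, steps=3000):
    """Fixed-point iteration for -M^{-1} = zeta*1 - A + S[M]."""
    M = 1j * I
    for _ in range(steps):
        M = -np.linalg.inv(zeta * I - A + S(M))
    return M

eta = 0.05
rho = lambda tau: np.trace(solve_mde(tau + 1j * eta)).imag / (np.pi * N)
assert rho(0.0) > 0.25    # inside the support (semicircle value at 0 is 1/pi)
assert rho(3.0) < 0.02    # outside [-kappa, kappa] = [-2, 2]
```

The iteration is carried out on full matrices only for illustration; for this flat data pair it collapses to the scalar semicircle equation.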

We will now make additional quantitative assumptions on the data pair (A, S) that ensure a certain regularity of the measure V(dτ). Our assumptions, labeled A1 and A2, always come together with a set of model parameters P₁ and P₂, respectively, that control them effectively. Estimates will typically be uniform in all data pairs that satisfy these assumptions with the given set of model parameters. In particular, they are uniform in the size N of the matrix, which is of great importance in the application to random matrix theory.

A1 Flatness: Let P₁ = (p₁, P₁) with p₁, P₁ > 0. The self-energy operator S is called flat (with model parameters P₁) if it satisfies the lower and upper bounds

  p₁ ⟨R⟩ 1 ≤ S[R] ≤ P₁ ⟨R⟩ 1 ,   R ∈ C̄₊ .   (2.7)

Proposition 2.2 (Regularity of density of states). Assume that S is flat, i.e. it satisfies A1 with some model parameters P₁, and that the bare matrix has a bounded spectral norm,

  ‖A‖ ≤ P₀ ,   (2.8)

for some constant P₀ > 0. Then the holomorphic function ⟨M⟩ : H → H is the Stieltjes transform of a Hölder-continuous probability density ρ with respect to the Lebesgue measure,

  ⟨V(dτ)⟩ = ρ(τ) dτ .   (2.9)

More precisely,

  |ρ(τ₁) − ρ(τ₂)| ≤ C |τ₁ − τ₂|^c ,   τ₁, τ₂ ∈ R ,

where c > 0 is a universal constant and the constant C > 0 depends only on the model parameters P₁ and P₀. Furthermore, ρ is real analytic on the open set {τ ∈ R : ρ(τ) > 0}.

Definition 2.3 (Density of states). Assuming a flat self-energy operator, the probability density ρ : R → [0, ∞), defined through (2.9), is called the density of states (of the MDE with data pair (A, S)). We denote by supp ρ ⊆ R its support on the real line. With a slight abuse of notation we also denote by

  ρ(ζ) := (1/π) Im ⟨M(ζ)⟩ ,   ζ ∈ H ,   (2.10)

the harmonic extension of ρ to the complex upper half plane.

The second set of assumptions describes the decay properties of the data pair (A, S). To formulate them, we need to equip the index set {1, . . . , N} with a concept of distance. Recall that a pseudometric d on a set A is a symmetric function d : A × A → [0, ∞] such that d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ A. We say that the pseudometric space (A, d) with a finite set A has sub-P-dimensional volume, for some constant P > 0, if the metric balls B_τ(x) := {y : d(x, y) ≤ τ} satisfy

  |B_τ(x)| ≤ τ^P ,   τ ≥ 2 , x ∈ A .   (2.11)

A2 Faster than power law decay: Let P₂ = (P, π₁, π₂), where P > 0 is a constant and π_k = (π_k(ν))_{ν=0}^∞, k = 1, 2, are sequences of positive constants. The data pair (A, S) is said to have faster than power law decay (with model parameters P₂) if there exists a pseudometric d on the index space {1, . . . , N} such that the pseudometric space X = ({1, . . . , N}, d) has sub-P-dimensional volume (cf. (2.11)) and

  |a_{xy}| ≤ π₁(0)/N + π₁(ν)/(1 + d(x, y))^ν ,   (2.12)

  |S[R]_{xy}| ≤ ‖R‖_max ( π₂(0)/N + π₂(ν)/(1 + d(x, y))^ν ) ,   R ∈ C^{N×N} ,   (2.13)

hold for any ν ∈ N and x, y ∈ X.

In order to state bounds of the form (2.12) and (2.13) more conveniently, we introduce the following matrix norms.

Definition 2.4 (Faster than power law decay). Given a pseudometric d on {1, . . . , N} and a sequence π = (π(ν))_{ν=0}^∞ of positive constants, we define

  ‖R‖_π := sup_{ν∈N} max_{x,y=1}^N ( π(0)/N + π(ν)/(1 + d(x, y))^ν )^{−1} |r_{xy}| ,   R ∈ C^{N×N} .   (2.14)

If ‖R‖_π ≤ 1 for some sequence π, we say that R has faster than power law decay (up to level 1/N) in the pseudometric space X := ({1, . . . , N}, d).

This norm expresses the typical behavior of many matrices in this paper: they have an off-diagonal decay faster than any power, up to a possible mean-field term of order 1/N. Using this norm, the bounds (2.12) and (2.13) take the simple forms

  ‖A‖_{π₁} ≤ 1 ,   ‖S[R]‖_{π₂} ≤ ‖R‖_max .

Our main result, the stability of the MDE, holds uniformly for all spectral parameters that are either away from the support of the density of states or where the density of states takes positive values. Therefore, for any δ > 0 we set

  D_δ := { ζ ∈ H : ρ(ζ) + dist(ζ, supp ρ) > δ } .

Theorem 2.5 (Faster than power law decay of solution). Assume A1 and A2 and let δ > 0. Then there exists a positive sequence γ such that

  ‖M(ζ)‖_γ ≤ 1 ,   ζ ∈ D_δ .   (2.15)

The sequence γ depends only on δ and the model parameters P₁ and P₂.

Our main result on the MDE is its stability with respect to the entrywise maximum norm on C^{N×N}, see (2.4). The choice of this norm is especially useful for applications in random matrix theory, since the matrix valued error terms are typically controlled in this norm. We denote by

  B^max_τ(R) := { Q ∈ C^{N×N} : ‖Q − R‖_max ≤ τ } ,

the ball of radius τ > 0 around R ∈ C^{N×N} w.r.t. the entrywise maximum norm.

Theorem 2.6 (Stability). Assume A1 and A2, let δ > 0 and ζ ∈ D_δ. Then there exist constants c₁, c₂ > 0 and a unique function G : B^max_{c₁}(0) → B^max_{c₂}(M) such that

  −1 = (ζ1 − A + S[G(D)])G(D) + D ,   (2.16)

for all D ∈ B^max_{c₁}(0), where M = M(ζ). The function G is analytic. In particular, there exists a constant C > 0 such that

  ‖G(D₁) − G(D₂)‖_max ≤ C ‖D₁ − D₂‖_max ,   (2.17)

for all D₁, D₂ ∈ B^max_{c₁/2}(0). Furthermore, there is a positive sequence γ and a linear operator Z : C^{N×N} → C^{N×N} such that the derivative of G, evaluated at D = 0, has the form

  ∇G(0) = Z + M Id ,   (2.18)

and Z, as well as its adjoint Z* with respect to the scalar product (2.1), satisfy

  ‖Z[R]‖_γ + ‖Z*[R]‖_γ ≤ 1 ,   (2.19)

for every R ∈ C^{N×N}. Here c₁, c₂, C and γ depend only on δ and the model parameters P₁, P₂ from assumptions A1 and A2.

We now introduce a generalization of A1 that relaxes the lower bound on S in (2.7), to demonstrate that our theory of the MDE also applies to a somewhat more general setup. This generalization is of relevance for band matrices with a band width of order N.

A1' L-Flatness: Let P₁' = (p₁, P₁, p₀, K) with p₁, P₁, p₀ > 0 and K ∈ N. The self-energy operator S is called L-flat (with model parameters P₁' and L ∈ N) if for some K×K zero/one-matrix Z = (z_{kl})_{k,l=1}^K ∈ {0, 1}^{K×K} with positive diagonal, z_{kk} = 1, and some partition A₁, . . . , A_K of {1, . . . , N}, satisfying

  min_{k=1}^K |A_k| ≥ p₀ N   and   min_{k,l=1}^K (Z^L)_{kl} > 0 ,   (2.20)

the following upper and lower bounds hold:

  p₁ Σ_{k,l=1}^K z_{kl} ⟨P_k, R⟩ P_l ≤ S[R] ≤ P₁ ⟨R⟩ 1 ,   for all R ∈ C̄₊ .   (2.21)

Here, P_k denotes the orthogonal projection onto the subspace of vectors with support in A_k, i.e. (P_k v)_x := v_x 1(x ∈ A_k) for x = 1, . . . , N.

Note that S being 1-flat simply means that S is flat, i.e. satisfies A1.

Remark 2.7 (Band matrices). Consider the example of a symmetric P-dimensional random band matrix H = (h_{xy})_{x,y∈X} ∈ R^{N×N} with index space

  X = {1, . . . , n}^P ,   d(x, y) = ‖x − y‖ .

Suppose for simplicity that H has independent centered entries, E h_{xy} = 0, up to the symmetry constraint h_{xy} = h_{yx}, with variances

  s_{xy} := E h_{xy}² ≤ C/n^P ,

for some positive constant C > 0, and that H has a macroscopic band width, i.e.

  s_{xy} ≥ (c/n^P) 1(d(x, y) ≤ ε n) ,

with constants c > 0 and ε ∈ (0, 1). As explained in Section 2.2 below, the MDE associated to H has the data pair

  A = 0 ,   S[R]_{xy} = s_{xy} r_{xy} 1(x ≠ y) + δ_{xy} Σ_{u∈X} s_{xu} r_{uu} ,   x, y ∈ X .

Then the lower bound on S in (2.7) is not satisfied, but A1' holds true uniformly for n ≥ 2/ε.

Theorem 2.8 (Replacement of A1 by A1'). Theorems 2.5 and 2.6 remain true without any changes if A1 is replaced by A1', i.e. if there is some L ∈ N such that S is L-flat and the model parameters P₁ are replaced by P₁' and K. Proposition 2.2 also remains valid under the change from A1 to A1', provided the constant c is allowed to depend on L.

The proof of Theorem 2.8 requires only minor changes to the proofs under the simple flatness assumption A1. We explain these changes, and thus prove Theorem 2.8, in the appendix.
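The stability statement of Theorem 2.6 can be probed numerically in the simplest flat setting (an illustration of ours, not the proof's mechanism): perturb the MDE by a small matrix D, solve the perturbed equation (2.16) by fixed-point iteration, and observe a Lipschitz bound of the form (2.17).

```python
import numpy as np

N = 30
I = np.eye(N)
zeta = 0.0 + 2.0j                        # spectral parameter well inside H
S = lambda R: (np.trace(R) / N) * I     # flat self-energy, bare matrix A = 0

def solve_perturbed(D, steps=500):
    """Iterate G = -(zeta*1 + S[G])^{-1} (1 + D), a rearrangement of (2.16)."""
    G = 1j * I
    for _ in range(steps):
        G = -np.linalg.inv(zeta * I + S(G)) @ (I + D)
    return G

M = solve_perturbed(np.zeros((N, N)))    # D = 0 recovers M(zeta)
# Im M is positive definite, as required of the MDE solution
assert np.linalg.eigvalsh((M - M.conj().T) / 2j).min() > 0

rng = np.random.default_rng(1)
D = 1e-4 * rng.standard_normal((N, N))
G = solve_perturbed(D)
lip = np.max(np.abs(G - M)) / np.max(np.abs(D))
assert lip < 10.0     # stability: ||G(D) - M||_max <= C ||D||_max for moderate C
```

The constant 10 here is specific to this toy data pair and to Im ζ = 2; the content of Theorem 2.6 is that such a constant exists uniformly on D_δ for general flat, fast-decaying data.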

2.2 Random matrices with correlations

In this section we present our results on the local eigenvalue statistics of random matrices with correlations. Let H = (h_{xy})_{x,y=1}^N ∈ C^{N×N} be a self-adjoint random matrix. For a spectral parameter ζ ∈ H we consider the associated Matrix Dyson Equation (MDE),

  −M(ζ)⁻¹ = ζ1 − A + S[M(ζ)] ,   A := E H ,   S[R] := E HRH − ARA ,   (2.22)

for a solution matrix M(ζ) with positive definite imaginary part (cf. (2.3)). The linear self-energy operator S : C^{N×N} → C^{N×N} from (2.22) preserves the cone C̄₊ of positive semidefinite matrices, and the MDE therefore has a unique solution [38] whose properties have been presented in Subsection 2.1. Our main result states that, under natural assumptions on the correlations of the entries of the random matrix H, the resolvent

  G(ζ) := (H − ζ1)⁻¹ ,   (2.23)

is close to the non-random solution M(ζ) of the MDE (2.22), provided N is large enough. In order to list these assumptions, we write H as a sum of its expectation and its fluctuation,

  H = A + N^{−1/2} W .   (2.24)

Here, the bare matrix A is a non-random self-adjoint matrix and W is a self-adjoint random matrix with centered entries, E W = 0. The normalization factor N^{−1/2} in (2.24) ensures that the spectrum of the fluctuation matrix W, whose entries are of typical size of order one, remains bounded. In the following we will assume that there exists some pseudometric d on the index set {1, . . . , N} such that the resulting pseudometric space

  X = ({1, . . . , N}, d) ,

has sub-P-dimensional volume for some constant P > 0, i.e. d satisfies (2.11), and that the bare and fluctuation matrices satisfy the following assumptions:

B1 Existence of moments: Moments of all orders of W exist, i.e. there is a sequence of positive constants κ₁ = (κ₁(ν))_{ν∈N} such that

  E |w_{xy}|^ν ≤ κ₁(ν) ,   (2.25)

for all x, y ∈ X and ν ∈ N.

B2 Decay of expectation: The entries a_{xy} of the bare matrix A decay in the distance between the indices x and y, i.e. there is a sequence of positive constants κ₂ = (κ₂(ν))_{ν∈N} such that

  |a_{xy}| ≤ κ₂(ν)/(1 + d(x, y))^ν ,   (2.26)

for all x, y ∈ X and ν ∈ N.

B3 Decay of correlation: The correlations in W decay fast, i.e. there is a sequence of positive constants κ₃ = (κ₃(ν))_{ν∈N} such that for all symmetric sets A, B ⊆ X² (A is symmetric if (x, y) ∈ A implies (y, x) ∈ A) and all smooth functions φ : C^{|A|} → R and ψ : C^{|B|} → R, we have

  |Cov(φ(W_A), ψ(W_B))| ≤ κ₃(ν) ‖∇φ‖_∞ ‖∇ψ‖_∞ / (1 + d₂(A, B))^ν ,   (2.27)

for all ν ∈ N. Here, Cov(Z₁, Z₂) := E Z₁Z₂ − E Z₁ E Z₂ is the covariance, W_A := (w_{xy})_{(x,y)∈A}, and

  d₂(A, B) := min { max{ d(x₁, x₂), d(y₁, y₂) } : (x₁, y₁) ∈ A , (x₂, y₂) ∈ B } ,

is the distance between A and B in the product metric on X. The supremum norm of a vector valued function Φ = (φ_i)_i is ‖Φ‖_∞ := sup_Y max_i |φ_i(Y)|.

B4 Flatness: There is a positive constant κ₄ such that for any two deterministic vectors u, v ∈ C^N we have

  E |u*Wv|² ≥ κ₄ ‖u‖² ‖v‖² ,   (2.28)

where ‖·‖ denotes the standard Euclidean norm on C^N.

We consider the set

  K := (P, κ₁, κ₂, κ₃, κ₄) ,   (2.29)

appearing in the above assumptions (2.11) and (2.25)-(2.28), as model parameters. These parameters are regarded as fixed, and our statements are uniform in the ensemble of all correlated random matrices of all dimensions N satisfying B1-B4 with given K. Under the assumptions B1-B4 the function ρ : H → [0, ∞), given in terms of the solution M to (2.22) by

  ρ(ζ) = (1/πN) Im Tr M(ζ) ,

is the harmonic extension of a Hölder-continuous probability density ρ : R → [0, ∞) (cf. Proposition 2.2), which is called the density of states (cf. Definition 2.3).

Theorem 2.9 (Local law for correlated random matrices). Let G be the resolvent of a random matrix H written in the form (2.24) that satisfies B1-B4. For all δ, ε > 0 and ν ∈ N there exists a positive constant C such that in the bulk,

  P[ ∃ ζ ∈ H s.t. ρ(ζ) ≥ δ , Im ζ ≥ N^{−1+ε} , max_{x,y=1}^N |G_{xy}(ζ) − m_{xy}(ζ)| ≥ N^ε/√(N Im ζ) ] ≤ C/N^ν .   (2.30)

Furthermore, the normalized trace converges with the improved rate

  P[ ∃ ζ ∈ H s.t. ρ(ζ) ≥ δ , Im ζ ≥ N^{−1+ε} , |(1/N) Tr G(ζ) − (1/N) Tr M(ζ)| ≥ N^ε/(N Im ζ) ] ≤ C/N^ν .   (2.31)

The constant C depends only on the model parameters K in addition to δ, ε and ν.

In Section 3 we present the proof of Theorem 2.9, which is based on the results from Section 2.1 about the Matrix Dyson Equation. As a standard consequence of the local law (2.30) and the uniform boundedness of Im m_{xx} from Theorem 2.5, the eigenvectors of H in the bulk are completely delocalized. This follows directly from the uniform boundedness of Im G_{xx}(ζ) and the spectral decomposition of the resolvent (see e.g. [20]).

Corollary 2.10 (Delocalization of eigenvectors). Pick any δ, ε, ν > 0 and let u be a normalized, ‖u‖ = 1, eigenvector of H corresponding to an eigenvalue λ ∈ R in the bulk, i.e. ρ(λ) ≥ δ. Then

  P[ max_{x=1}^N |u_x| ≥ N^ε/√N ] ≤ C/N^ν ,

for a positive constant C, depending only on the model parameters K in addition to δ, ε and ν.

The averaged local law (2.31) directly implies the rigidity of the eigenvalues in the bulk. For any τ ∈ R, we define

  i(τ) := ⌈ N ∫_{−∞}^τ ρ(ω) dω ⌉ .   (2.32)

This is the index of an eigenvalue that is typically close to a spectral parameter τ in the bulk. Then the standard argument presented in Section 7.1 proves the following result.
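For intuition (our own example, not from the paper): when ρ is the semicircle density, which arises from (2.22) in the pure Wigner case, the index i(τ) of (2.32) can be computed from the closed-form antiderivative of ρ.

```python
import math

def i_of_tau(tau, N):
    """Index i(tau) = ceil(N * integral_{-inf}^{tau} rho(omega) d omega),
    cf. (2.32), for the semicircle density rho(t) = sqrt(4 - t^2)/(2*pi)."""
    if tau <= -2.0:
        F = 0.0
    elif tau >= 2.0:
        F = 1.0
    else:
        # antiderivative of the semicircle density on [-2, 2]
        F = 0.5 + tau * math.sqrt(4.0 - tau * tau) / (4.0 * math.pi) \
                + math.asin(tau / 2.0) / math.pi
    return math.ceil(N * F)

assert i_of_tau(-2.5, 100) == 0      # below the spectrum
assert i_of_tau(2.5, 100) == 100     # above the spectrum
assert i_of_tau(1.0, 100) == 81      # N * F(1) = 80.45, so index 81
```

Rigidity then says that, with overwhelming probability, the eigenvalue λ_{i(τ)} lies within N^{ε}/N of τ throughout the bulk.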

Corollary 2.11 (Rigidity). For any δ, ε, ν > 0 we have

  P[ sup { |λ_{i(τ)} − τ| : τ ∈ R , ρ(τ) ≥ δ } ≥ N^ε/N ] ≤ C/N^ν ,   (2.33)

for a positive constant C, depending only on the model parameters K in addition to δ, ε and ν.

Another consequence of Theorem 2.9 is the universality of the local eigenvalue statistics in the bulk of the spectrum of H, both in the sense of averaged correlation functions and in the sense of gap universality. For the universality statement we make the following additional assumption, which is stronger than B4:

B5 Fullness: We say that H is β = 1 (β = 2) - full if H ∈ R^{N×N} is real symmetric (H ∈ C^{N×N} is complex hermitian) and there is a positive constant κ₅ such that

  E |Tr RW|² ≥ κ₅ Tr R² ,

for any real symmetric R ∈ R^{N×N} (any complex hermitian R ∈ C^{N×N}).

In this case we consider κ₅ as an additional model parameter. The first formulation of bulk universality states that the k-point correlation functions ρ_k of the eigenvalues of H, rescaled around an energy parameter ω in the bulk, converge weakly to those of the GUE/GOE. The latter are given by the correlation functions of well known determinantal processes. The precise statement is the following:

Corollary 2.12 (Correlation function bulk universality). Let H satisfy B1-B3 and B5 with β = 1 (β = 2). Pick any δ > 0 and choose any ω ∈ R with ρ(ω) ≥ δ. Fix k ∈ N and ε ∈ (0, 1/2). Then for any smooth, compactly supported test function Φ : R^k → R, the k-point local correlation functions ρ_k : R^k → [0, ∞) of the eigenvalues of H converge to the k-point correlation function Υ_k : R^k → [0, ∞) of the GOE (GUE) determinantal point process,

  | ∫_{R^k} Φ(τ) [ ρ(ω)^{−k} ρ_k( ω + τ₁/(Nρ(ω)), . . . , ω + τ_k/(Nρ(ω)) ) − Υ_k(τ) ] dτ | ≤ C/N^c ,

where τ = (τ₁, . . . , τ_k), and the positive constants C, c depend only on δ, ε, Φ and the model parameters.

The second formulation compares the joint distribution of gaps between consecutive eigenvalues of H in the bulk with that of the GUE/GOE. The proofs of Corollaries 2.12 and 2.13 are presented in Section 7.2.

Corollary 2.13 (Gap universality in bulk). Let H satisfy B1-B3 and B5 with β = 1 (β = 2). Pick any δ > 0, an energy τ in the bulk, i.e. ρ(τ) ≥ δ, and let i = i(τ) be the corresponding index defined in (2.32). Then for all n ∈ N and all smooth compactly supported observables Φ : R^n → R, there are two positive constants C and c, depending on n, δ, Φ and the model parameters, such that the local eigenvalue distribution is universal,

  | E Φ( (Nρ(λ_i)(λ_{i+j} − λ_i))_{j=1}^n ) − E_G Φ( (Nρ_sc(0)(λ_{[N/2]+j} − λ_{[N/2]}))_{j=1}^n ) | ≤ C/N^c .

Here the second expectation E_G is with respect to the GUE and GOE in the cases of complex hermitian and real symmetric H, respectively, and ρ_sc(0) = 1/(2π) is the value of Wigner's semicircle law at the origin.

During the final preparation of this manuscript and after announcing our theorems, we learned that a similar universality result, but with a special correlation structure, was proved independently in [14]. The covariances in [14] have a specific finite range and translation invariant structure,

  E h_{xy} h_{uv} = ψ( x/N, y/N, u − x, v − y ) ,   (2.34)

where ψ is a piecewise Lipschitz function with finite support in the third and fourth variables. The short scale translation invariance in (2.34) allows one to use a partial Fourier transform after effectively decoupling the slow variables from the fast ones. This renders the matrix equation (1.2) into a vector equation for N² variables, and the necessary stability result directly follows from [1]. The main difference between the current work and [14] is that here we analyze (1.2) as a genuine matrix equation without relying on translation invariance, and thus arbitrary short range correlations are allowed.

3  Local law for random matrices with correlations

In this section we prove Theorem 2.9, the local law for random matrices with correlations. The proof relies heavily on the results of Section 2.1, whose proofs are presented separately in Section 4. By its definition (2.23) the resolvent of $H$ satisfies the perturbed MDE
\[
-\mathbf{1} = \big( \zeta \mathbf{1} - A + \mathcal{S}[G(\zeta)] \big) G(\zeta) + D(\zeta) \,, \tag{3.1a}
\]
where the random error matrix $D : \mathbb{H} \to \mathbb{C}^{N \times N}$ is given by
\[
D(\zeta) := - \big( \mathcal{S}[G(\zeta)] + H - A \big) G(\zeta) \,. \tag{3.1b}
\]
Note that the self-energy operator $\mathcal{S} : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$, defined in (2.22), is self-adjoint with respect to the scalar product (2.1) and preserves the cone $\mathcal{C}_+$. In particular, $(A, \mathcal{S})$ is a data pair for the MDE as defined at the beginning of Section 2.1. We now verify that this data pair satisfies the assumptions A1 and A2 made in Section 2.1.

Lemma 3.1 (MDE data from H). If $H$ satisfies B1-B4, then the data pair $(A, \mathcal{S})$ from (2.22) satisfies A1 and A2. The corresponding model parameters $P_1$ and $P_2$ depend only on $\mathcal{K}$.

Proof. The condition (2.12) on the bare matrix $A$ is clearly satisfied by (2.26). The lower bound on $\mathcal{S}$ in (2.7) follows from (2.28). To show this, let $R = \sum_i \varrho_i \mathbf{r}_i \mathbf{r}_i^* \in \mathcal{C}_+$ with an orthonormal basis $(\mathbf{r}_i)_{i=1}^{N}$. Then

\[
\mathbf{v}^* \mathcal{S}[R] \mathbf{v} = \sum_{i=1}^{N} \varrho_i \, \mathbb{E}\, | \mathbf{r}_i^* W \mathbf{v} |^2 \ge \kappa_4 \sum_{i=1}^{N} \varrho_i \,,
\]
for any normalized vector $\mathbf{v} \in \mathbb{C}^N$. We now verify the upper bounds on $\mathcal{S}$ in (2.7) and (2.13). Both bounds follow from the decay of covariances,
\[
| \mathbb{E}\, w_{xu} w_{vy} | \le \kappa_3(2\nu) \big( (q_{xy} q_{uv})^{\nu} + (q_{xv} q_{uy})^{\nu} \big) \,, \qquad q_{xy} := \frac{1}{1 + d(x,y)} \,, \qquad \nu \in \mathbb{N} \,, \tag{3.2}
\]

which is an immediate consequence of (2.27) with the choices $W_A = (w_{xu}, w_{ux})$, $W_B = (w_{vy}, w_{yv})$, $\varphi(\xi_1, \xi_2) = \overline{\xi}_1$ and $\psi(\xi_1, \xi_2) = \xi_1$. Indeed, to see the upper bound in (2.7) it suffices to show
\[
\| \mathcal{S}[\mathbf{r}\,\mathbf{r}^*] \| \le \frac{C}{N} \,, \tag{3.3}
\]
for a constant $C > 0$, depending on $\mathcal{K}$, and any normalized vector $\mathbf{r} \in \mathbb{C}^N$, because for any $R \in \mathcal{C}_+$ we can use the spectral decomposition $R = \sum_i \varrho_i \mathbf{r}_i \mathbf{r}_i^*$ as above. The estimate (3.2) yields
\[
\| \mathcal{S}[\mathbf{r}\,\mathbf{r}^*] \| \le \frac{\kappa_3(2\nu)}{N} \Big( \| Q^{(\nu)} \| \, |\mathbf{r}|^* Q^{(\nu)} |\mathbf{r}| + \big\| (Q^{(\nu)} |\mathbf{r}|)(Q^{(\nu)} |\mathbf{r}|)^* \big\| \Big) \,, \tag{3.4}
\]
where we defined the matrix $Q^{(\nu)}$ with entries $q^{(\nu)}_{xy} := q_{xy}^{\nu}$ and $|\mathbf{r}| := (|r_x|)_{x \in X}$. Since
\[
\| Q^{(\nu)} \| \le \max_{x \in X} \sum_{y \in X} q_{xy}^{\nu} \,, \tag{3.5}
\]
the inequality (3.3) follows from the sub-$P$-dimensional volume growth (2.11) by choosing $\nu$ sufficiently large. To show (2.13) we fix any $R \in \mathbb{C}^{N \times N}$ and estimate the entries $\mathcal{S}_{xy}[R]$ of $\mathcal{S}[R]$ by using (3.2),
\[
| \mathcal{S}_{xy}[R] | \le \kappa_3(2\nu) \bigg( q_{xy}^{\nu} \Big( \frac{1}{N} \sum_{u,v} q_{uv}^{\nu} \Big) + \frac{1}{N} \Big( \sum_{v} q_{xv}^{\nu} \Big) \Big( \sum_{u} q_{uy}^{\nu} \Big) \bigg) \| R \|_{\max} \,.
\]
The bound (2.13) follows because the right hand side of (3.5) is finite.

We use the notion of stochastic domination, first introduced in [20], which is designed to compare random variables up to $N^{\varepsilon}$-factors on very high probability sets.

Definition 3.2 (Stochastic domination). Let $X = X^{(N)}$, $Y = Y^{(N)}$ be sequences of non-negative random variables. We say $X$ is stochastically dominated by $Y$ if
\[
\mathbb{P}\big[ X > N^{\varepsilon} Y \big] \le C(\varepsilon, \nu) \, N^{-\nu} \,, \qquad N \in \mathbb{N} \,,
\]
for any $\varepsilon > 0$, $\nu \in \mathbb{N}$ and some ($N$-independent) family of positive constants $C$. In this case we write $X \prec Y$.
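As a quick sanity check of the definition (our own illustration, with a distribution chosen by us, not taken from the paper): the maximum of $N$ independent standard Gaussians grows only like $\sqrt{2 \log N}$, so it lies below $N^{\varepsilon}$ for every fixed $\varepsilon > 0$ once $N$ is large, i.e. it is stochastically dominated by the constant $1$.

```python
import numpy as np

# Empirical illustration: X_N = max_i |g_i| for i.i.d. standard Gaussians
# is ~ sqrt(2 log N), hence far below N^0.3 already for moderate N.
rng = np.random.default_rng(0)
checks = []
for N in (10**3, 10**4, 10**5):
    X = np.abs(rng.standard_normal(N)).max()
    checks.append(X <= N**0.3)
print(checks)
```

Of course this only probes the definition for one sample size at a time; stochastic domination is a statement about the whole sequence $N \to \infty$ with polynomially small exceptional probabilities.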

In this paper the family $C$ of constants in Definition 3.2 will always be an explicit function of the model parameters (2.29) and possibly of some additional parameters that are considered fixed and apparent from the context. However, the constants are always uniform in the spectral parameter $\zeta$ on the domain under consideration, and in the indices $x, y$ in case $X = r_{xy}$ is an element of a matrix $R = (r_{xy})_{x,y}$. To use the notion of stochastic domination, we think of $H = H^{(N)}$ as embedded into a sequence of random matrices with the same model parameters. The following lemma asserts that the error matrix $D$ from (3.1b) converges to zero as the size $N$ of the random matrix grows to infinity. The convergence holds with respect to the entrywise maximum norm (2.4). Section 5 is devoted to proving this key technical result. In this section we merely use it, in combination with the stability of the MDE, Theorem 2.6, to show that $G$ approaches the solution $M$ of the unperturbed MDE (2.2). We will use the shorthand
\[
\Lambda(\zeta) := \| G(\zeta) - M(\zeta) \|_{\max} \,. \tag{3.6}
\]

Lemma 3.3 (Smallness of the error matrix). Let $C > 0$ and $\delta, \varepsilon > 0$ be fixed. Away from the real axis the error matrix $D$ satisfies
\[
\| D(\tau + \mathrm{i}) \|_{\max} \prec \frac{1}{\sqrt{N}} \,, \qquad \tau \in [-C, C] \,. \tag{3.7}
\]
Near the real axis, in the regime where the harmonic extension of the density of states is bounded away from zero, we have
\[
\| D(\zeta) \|_{\max} \, \mathbf{1}\big( \Lambda(\zeta) \le N^{-\varepsilon} \big) \prec \frac{1}{\sqrt{N \operatorname{Im} \zeta}} \,, \tag{3.8}
\]
for all $\zeta \in \mathbb{H}$ with $\rho(\zeta) \ge \delta$ and $\operatorname{Im} \zeta \ge N^{-1+\varepsilon}$.

Proof of Theorem 2.9. The proof is divided into two steps. The first step is the proof of (2.30). In the second step we show that the improved convergence rate (2.31) for the trace is a consequence of (2.30) and the fluctuation averaging mechanism (introduced in [28] for Wigner matrices), which is explained in more detail in Section 6.

Proof of (2.30): First we conclude from (3.7) and the stability of the MDE, Theorem 2.6, that for spectral parameters with large imaginary part,
\[
\Lambda(\tau + \mathrm{i}) \prec \frac{1}{\sqrt{N}} \,, \qquad \tau \in [-C_1, C_1] \,, \tag{3.9}
\]

for any fixed constant $C_1 > 0$. Now let $\tau \in \mathbb{R}$, $\eta_0 \in [N^{-1+\varepsilon}, 1]$ and $\zeta_0 = \tau + \mathrm{i}\eta_0 \in \mathbb{H}$ be such that $\rho(\zeta_0) \ge \delta$ for some $\delta \in (0, 1]$. Note that $\rho(\zeta_0) \ge \delta$ and $\eta_0 \le 1$ imply $\tau \in [-C_1, C_1]$ for some positive constant $C_1$, because $\rho$ is the harmonic extension of the density of states, which has compact support in $[-\kappa, \kappa]$ (Proposition 2.1). Since in addition the density of states is uniformly Hölder continuous (cf. Proposition 2.2), there is a constant $c_1$, depending on $\delta$ and $P$, such that
\[
\inf_{\eta \in [\eta_0, 1]} \rho(\tau + \mathrm{i}\eta) \ge c_1 \,.
\]
Therefore, by (3.8) and (2.17) we infer that
\[
\Lambda(\zeta) \, \mathbf{1}\big( \Lambda(\zeta) \le N^{-\varepsilon/4} \big) \prec \frac{1}{\sqrt{N \operatorname{Im} \zeta}} \,, \qquad \zeta \in \tau + \mathrm{i}[\eta_0, 1] \,. \tag{3.10}
\]
Since $N^{-\varepsilon/2} \ge (N \operatorname{Im} \zeta)^{-1/2}$, the inequality (3.10) establishes, on a high probability event, a gap in the possible values that $\Lambda(\zeta)$ can take. The indicator function in (3.10) is absent for $\zeta = \tau + \mathrm{i}$ because of (3.9), i.e. at that point the value lies below the gap. From the Lipschitz continuity of $\zeta \mapsto \Lambda(\zeta)$, with Lipschitz constant bounded by $2N^2$ for $\operatorname{Im} \zeta \ge \frac{1}{N}$, and a standard continuity argument together with a union bound (e.g. Lemma A.1 in [4]), we conclude that the values lie below the gap for any $\zeta$ with $\operatorname{Im} \zeta \in [\eta_0, 1]$ with very high probability. Thus (2.30) holds.

Proof of (2.31): Let $\zeta \in \mathbb{H}$ with $\operatorname{Im} \zeta \ge N^{-1+\varepsilon}$ and $\rho(\zeta) \ge \delta$. We show the improved convergence rate for the normalized trace $\frac{1}{N} \operatorname{Tr}[G - M] = \langle \mathbf{1} \,, G - M \rangle$. By Step 1 there is an $\widetilde{\varepsilon} > 0$ such that for every $\nu > 0$ we have $\| G - M \|_{\max} \le N^{-\widetilde{\varepsilon}}$ on an event whose complement has probability bounded by $N^{-\nu}$. On this event $G$ coincides with the unique

solution of the perturbed MDE (2.16) from Theorem 2.6, which depends analytically on $D$. We use the representation (2.18) of its derivative to see that
\[
\langle \mathbf{1} \,, G - M \rangle = \langle \mathbf{1} \,, \mathcal{Z}[D] + M D \rangle + \mathcal{O}(\| D \|_{\max}^2) = \langle \mathcal{Z}^*[\mathbf{1}] + M^* \,, D \rangle + \mathcal{O}(\| D \|_{\max}^2) \,.
\]
Since $\| D \|_{\max} \prec (N \operatorname{Im} \zeta)^{-1/2}$ (cf. (3.8)), we infer that
\[
| \langle \mathbf{1} \,, G - M \rangle | \prec | \langle \mathcal{Z}^*[\mathbf{1}] + M^* \,, D \rangle | + \frac{1}{N \operatorname{Im} \zeta} \,. \tag{3.11}
\]

By (2.19) and (2.15) the entries of $\mathcal{Z}^*[\mathbf{1}] + M^*$ have faster than power law decay, i.e. there are a positive sequence $\alpha$ and a constant $C_2 > 0$, depending only on $\delta$ and the model parameters $\mathcal{K}$, such that $\| \mathcal{Z}^*[\mathbf{1}] + M^* \|_{\alpha} \le C_2$. Thus we can apply the following proposition.

Proposition 3.4 (Fluctuation averaging). Assume B1-B4. Let $\delta, C > 0$ and $\zeta \in \mathbb{H}$ with $\delta \le \operatorname{dist}(\zeta, [\kappa_-, \kappa_+]) + \rho(\zeta) \le \delta^{-1}$ and $\operatorname{dist}(\zeta, \operatorname{Spec}(H_B))^{-1} \prec N^{C}$ for all $B \subsetneq X$. Let $\varepsilon \in (0, 1/2)$ be a constant and $\Psi$ a non-random control parameter with $N^{-1/2} \le \Psi \le N^{-\varepsilon}$. Suppose that the local law holds in the form
\[
\| G(\zeta) - M(\zeta) \|_{\max} \prec \Psi \,. \tag{3.12}
\]
Then the error matrix $D$, defined in (3.1b), satisfies
\[
| \langle R \,, D(\zeta) \rangle | \prec \Psi^2 \,, \tag{3.13}
\]
for every non-random $R \in \mathbb{C}^{N \times N}$ with faster than power law decay.

The proof of Proposition 3.4 is carried out in Section 6. Applied with the choices $R := \mathcal{Z}^*[\mathbf{1}] + M^*$ and $\Psi := (N \operatorname{Im} \zeta)^{-1/2}$, the proposition, in combination with (3.11), implies (2.31). This finishes the proof of Theorem 2.9, up to verifying Lemma 3.3 and Proposition 3.4.

4  The Matrix Dyson Equation

4.1  The solution of the Matrix Dyson Equation

In this section we establish a variety of properties of the solution $M$ of the MDE. The section starts with the proof of Proposition 2.1 and ends with the proof of Theorem 2.5. The main technical result is Proposition 4.2 below, which lists bounds on $M$. Most of the inequalities in this and the following section are uniform in the data pair $(A, \mathcal{S})$ that determines the MDE and its solution, given a fixed set of model parameters $P_k$ corresponding to the assumptions A$k$. We therefore introduce a convention for inequalities up to constants depending only on the model parameters.

Convention 4.1 (Comparison relation and constants). Suppose a set of model parameters $P$ is given. Within the proofs we write $C$ and $c$ for generic positive constants depending on $P$. In particular, $C$ and $c$ may change their values from inequality to inequality. If $C, c$ depend on additional parameters $L$, we indicate this by writing $C(L)$, $c(L)$. We also use the comparison relation $\alpha \lesssim \beta$, or $\beta \gtrsim \alpha$, for positive $\alpha$ and $\beta$ if there exists a constant $C > 0$ that depends only on $P$, but is otherwise uniform in the

data pair $(A, \mathcal{S})$, such that $\alpha \le C\beta$. In particular, $C$ does not depend on the dimension $N$ or on the spectral parameter $\zeta$. In case $\alpha \lesssim \beta \lesssim \alpha$ we write $\alpha \sim \beta$. For two matrices $R, T \in \mathcal{C}_+$ we similarly write $R \lesssim T$ if the inequality $R \le CT$ holds in the sense of quadratic forms with a constant $C > 0$ depending only on the model parameters. In the upcoming analysis many quantities depend on the spectral parameter $\zeta$. We often suppress this dependence in our notation and write e.g. $M = M(\zeta)$, $\rho = \rho(\zeta)$, etc.

Proof of Proposition 2.1. In this proof we generalize the proof of Proposition 2.1 from [2] to our matrix setup. Taking the imaginary part of both sides of the MDE and using $\operatorname{Im} M \ge 0$ and $A = A^*$, we see that
\[
- \operatorname{Im}\big[ M(\zeta)^{-1} \big] = M^*(\zeta)^{-1} \, \operatorname{Im} M(\zeta) \, M(\zeta)^{-1} \ge \operatorname{Im} \zeta \, \mathbf{1} \,.
\]
In particular, this implies the trivial bound on the solution of the MDE,
\[
\| M(\zeta) \| \le \frac{1}{\operatorname{Im} \zeta} \,, \qquad \zeta \in \mathbb{H} \,. \tag{4.1}
\]
Let $\mathbf{w} \in \mathbb{C}^N$ be normalized, $\mathbf{w}^* \mathbf{w} = 1$. Since $M(\zeta)$ has positive imaginary part, the analytic function $\zeta \mapsto \mathbf{w}^* M(\zeta) \mathbf{w}$ takes values in $\mathbb{H}$. From the trivial upper bound (4.1) and the MDE itself, we infer the asymptotics $\mathrm{i}\eta \, \mathbf{w}^* M(\mathrm{i}\eta) \mathbf{w} \to -1$ as $\eta \to \infty$. By the characterization of Stieltjes transforms of probability measures on the complex upper half plane (cf. Theorem 3.5 in [31]), we infer
\[
\mathbf{w}^* M(\zeta) \mathbf{w} = \int \frac{v_{\mathbf{w}}(\mathrm{d}\tau)}{\tau - \zeta} \,,
\]
where $v_{\mathbf{w}}$ is a probability measure on the real line. By polarization, we find the general representation (2.5). We now show that $\operatorname{supp} V \subseteq [-\kappa, \kappa]$, where $\kappa = \| A \| + 2 \| \mathcal{S} \|^{1/2}$ (cf. (2.6)). Note that A1 implies $\| \mathcal{S} \| \lesssim 1$. Indeed, let $(\,\cdot\,)_\pm$ denote the positive and negative parts, so that $R = (\operatorname{Re} R)_+ - (\operatorname{Re} R)_- + \mathrm{i}(\operatorname{Im} R)_+ - \mathrm{i}(\operatorname{Im} R)_-$ holds for any $R \in \mathbb{C}^{N \times N}$. Using this representation, we see that
\[
\| \mathcal{S}[R] \| \le P_1 \big( \langle (\operatorname{Re} R)_+ \rangle + \langle (\operatorname{Re} R)_- \rangle + \langle (\operatorname{Im} R)_+ \rangle + \langle (\operatorname{Im} R)_- \rangle \big) \le 2 P_1 \| R \|_{\mathrm{hs}} \,, \tag{4.2}
\]

where vw is a probability measure on the real line. By polarization, we find the general representation (2.5). We now show that supp V Ď r´κ, κs, where κ “ kAk ` 2kSk1{2 (cf. (2.6)). Note that A1 implies kSk À 1. Indeed, let p ¨ q˘ denote the positive and negative parts, so that R “ pRe Rq` ´ pRe Rq´ ` ipIm Rq` ´ ipIm Rq´ , holds for any R P CN ˆN . Using this representation, we see that ` ˘ kSrRsk ď P1 xpRe Rq` y ` xpRe Rq´ y ` xpIm Rq` y ` xpIm Rq´ y ď 2P1kRkhs . (4.2)

and since kRkhs ď kRk the bound kSk À 1 follows. The following argument will prove that kIm Mpζ qk Ñ 0 as Im ζ Ó 0 locally uniformly for all ζ P H with |ζ| ą κ. This implies supp V Ď r´κ, κs. Let us fix ζ P H with |ζ| ą κ and suppose that kMk satisfies the upper bound kMk ă

|ζ| ´ kAk . 2kSk

(4.3)

Then by taking the inverse and then the norm on both sides of (2.2) we conclude that kMk ď

2 1 ď . |ζ| ´ kAk ´ kSkkMk |ζ| ´ kAk 17

(4.4)

Therefore, (4.3) implies (4.4) and we see that there is a gap in the possible values of $\| M \|$, namely
\[
\| M(\zeta) \| \notin \Big( \frac{2}{|\zeta| - \| A \|} \,,\, \frac{|\zeta| - \| A \|}{2 \| \mathcal{S} \|} \Big) \qquad \text{for } |\zeta| > \kappa \,.
\]
Since $\zeta \mapsto \| M(\zeta) \|$ is a continuous function and for large $\operatorname{Im} \zeta$ its values lie below the gap by the trivial bound (4.1), we infer
\[
\| M(\zeta) \| \le \frac{2}{|\zeta| - \| A \|} \qquad \text{for } |\zeta| > \kappa \,. \tag{4.5}
\]

Let us now take the imaginary part of the MDE and multiply it by $M^*$ from the left and by $M$ from the right,
\[
\operatorname{Im} M = \operatorname{Im} \zeta \, M^* M + M^* \mathcal{S}[\operatorname{Im} M] M \,. \tag{4.6}
\]
Taking the norm on both sides of (4.6), using a trivial estimate on the right hand side and rearranging the resulting terms, we get
\[
\| \operatorname{Im} M \| \le \frac{\operatorname{Im} \zeta \, \| M \|^2}{1 - \| M \|^2 \| \mathcal{S} \|} \,. \tag{4.7}
\]
Here we used $\| M \|^2 \| \mathcal{S} \| < 1$, which is satisfied by (4.5) for $|\zeta| > \kappa$. Estimating the right hand side of (4.7) further by applying (4.5), we find
\[
\| \operatorname{Im} M \| \le \frac{4 \operatorname{Im} \zeta}{(|\zeta| - \| A \|)^2 - 4 \| \mathcal{S} \|} = \frac{4 \operatorname{Im} \zeta}{\big( |\zeta| - \kappa + 2 \| \mathcal{S} \|^{1/2} \big)^2 - 4 \| \mathcal{S} \|} \,. \tag{4.8}
\]

The right hand side of (4.8) converges to zero locally uniformly for all $\zeta \in \mathbb{H}$ with $|\zeta| > \kappa$ as $\operatorname{Im} \zeta \downarrow 0$. This finishes the proof of Proposition 2.1.

The following proposition lists a number of important bounds on $M$.

Proposition 4.2 (Properties of the solution). Assume A1 and that $\| A \| \le P_0$ for some constant $P_0 > 0$. Then the following bounds hold uniformly for all spectral parameters $\zeta \in \mathbb{H}$:

(i) The solution is bounded in the spectral norm,
\[
\| M(\zeta) \| \lesssim \frac{1}{\rho(\zeta) + \operatorname{dist}(\zeta, \operatorname{supp} \rho)} \,. \tag{4.9}
\]

(ii) The inverse of the solution is bounded in the spectral norm,
\[
\| M(\zeta)^{-1} \| \lesssim 1 + |\zeta| \,. \tag{4.10}
\]

(iii) The imaginary part of $M$ is bounded from below and above in terms of the harmonic extension of the density of states,
\[
\rho(\zeta) \, \mathbf{1} \lesssim \operatorname{Im} M(\zeta) \lesssim (1 + |\zeta|^2) \| M(\zeta) \|^2 \rho(\zeta) \, \mathbf{1} \,. \tag{4.11}
\]

Proof. The inequalities (4.9) and (4.10) provide upper and lower bounds on the singular values of the solution, respectively. Before proving these bounds we show that $M$ has a bounded normalized Hilbert-Schmidt norm,
\[
\| M(\zeta) \|_{\mathrm{hs}} \lesssim 1 \,. \tag{4.12}
\]
For this purpose we take the imaginary part of (2.2) (cf. (4.6)) and find
\[
\operatorname{Im} M \ge M^* \mathcal{S}[\operatorname{Im} M] M \,. \tag{4.13}
\]
The lower bound on $\mathcal{S}$ from (2.7) implies
\[
\operatorname{Im} M \gtrsim \rho \, M^* M \,, \tag{4.14}
\]
where we used the definition of $\rho$ in (2.10). Taking the normalized trace on both sides of (4.14) shows (4.12).

Proof of (ii): Taking the norm on both sides of (2.2) yields
\[
\| M^{-1} \| \le |\zeta| + \| A \| + \| \mathcal{S} \|_{\mathrm{hs} \to \| \cdot \|} \, \| M \|_{\mathrm{hs}} \lesssim 1 + |\zeta| \,, \tag{4.15}
\]
where $\| \mathcal{S} \|_{\mathrm{hs} \to \| \cdot \|}$ denotes the norm of $\mathcal{S}$ as an operator from $\mathbb{C}^{N \times N}$ equipped with $\| \cdot \|_{\mathrm{hs}}$ to $\mathbb{C}^{N \times N}$ equipped with $\| \cdot \|$. For the last inequality in (4.15) we used (4.12) and the fact that A1 implies $\| \mathcal{S} \|_{\mathrm{hs} \to \| \cdot \|} \lesssim 1$ (cf. (4.2)).

Proof of (iii): First we treat the simple case of large spectral parameters, $|\zeta| \ge 1 + \kappa$, where $\kappa$ was defined in (2.6). Recall that the matrix valued measure $V(\mathrm{d}\tau)$ (cf. (2.5)) is supported in $[-\kappa, \kappa]$ by Proposition 2.1. The normalization $V(\mathbb{R}) = \mathbf{1}$ implies that for any vector $\mathbf{u} \in \mathbb{C}^N$ with $\| \mathbf{u} \| = 1$ the function $\zeta \mapsto \frac{1}{\pi} \operatorname{Im}[\mathbf{u}^* M(\zeta) \mathbf{u}]$ is the harmonic extension of a probability measure with support in $[-\kappa, \kappa]$; hence $\mathbf{u}^* M(\zeta) \mathbf{u}$ behaves as $-\zeta^{-1}$ for large $|\zeta|$. We conclude that
\[
\operatorname{Im} M(\zeta) \sim \rho(\zeta) \sim \frac{\operatorname{Im} \zeta}{|\zeta|^2}
\]
for $|\zeta| \ge 1 + \kappa$. Since for these $\zeta$ we also have $\| M(\zeta) \| \sim |\zeta|^{-1}$ by the Stieltjes transform representation (2.5), we conclude that (4.11) holds in this regime. Now we consider $\zeta \in \mathbb{H}$ with $|\zeta| \le 1 + \kappa$. We start with the lower bound on $\operatorname{Im} M$. From (4.14) we see that
\[
\operatorname{Im} M \gtrsim \rho \, M^* M \ge \rho \, \| M^{-1} \|^{-2} \, \mathbf{1} \,, \tag{4.16}
\]
and since $\| M^{-1} \| \lesssim 1$ by (ii), the lower bound in (4.11) is proven. For the upper bound, taking the imaginary part of the MDE (cf. (4.6)) and using A1 and the fact that $\operatorname{Im} M \gtrsim \operatorname{Im} \zeta \, \mathbf{1}$ by the Stieltjes transform representation (2.5), we get
\[
\operatorname{Im} M = \operatorname{Im} \zeta \, M^* M + M^* \mathcal{S}[\operatorname{Im} M] M \lesssim (\operatorname{Im} \zeta + \rho) M^* M \lesssim \rho \, \| M \|^2 \, \mathbf{1} \,.
\]

Proof of (i): In the regime $|\zeta| \ge 1 + \kappa$ the bound (4.9) follows from the Stieltjes transform representation (2.5). Thus we consider $|\zeta| \le 1 + \kappa$. We take the imaginary part on both sides of (2.2) and use the lower bound in (4.11) and $\mathcal{S}[\mathbf{1}] \gtrsim \mathbf{1}$ to get
\[
- \operatorname{Im}\big[ M(\zeta)^{-1} \big] \ge \mathcal{S}[\operatorname{Im} M(\zeta)] \gtrsim \rho(\zeta) \, \mathbf{1} \,.
\]
Since in general $-\operatorname{Im}[R^{-1}] \ge \mathbf{1}$ implies $\| R \| \le 1$ for any $R \in \mathbb{C}^{N \times N}$, we infer that $\| M(\zeta) \| \lesssim \rho(\zeta)^{-1}$. On the other hand, $\| M(\zeta) \| \lesssim \operatorname{dist}(\zeta, \operatorname{supp} \rho)^{-1}$ follows from (2.5) again.

Proof of Theorem 2.5. Recall the model parameters $\pi_1, \pi_2$ from A2. We consider the MDE (2.2) entrywise and see that
\[
| (M^{-1})_{xy} | \le | a_{xy} | + |\zeta| \, \delta_{xy} + | \mathcal{S}_{xy}[M] | \le |\zeta| \, \delta_{xy} + \frac{\pi_1(\nu) + \pi_2(\nu) \| M \|}{(1 + d(x,y))^{\nu}} + \frac{\pi_1(0) + \pi_2(0) \| M \|}{N} \,,
\]
where we used (2.12), (2.13) and $\| M \|_{\max} \le \| M \|$. By (4.9) and $\zeta \in \mathbb{D}_\delta$, we have $\| M \| \lesssim \frac{1}{\delta}$. Furthermore, for large $|\zeta|$ we also have $\| M \| \lesssim 1/|\zeta|$. We can now apply Lemma A.2 from the appendix with the choice $R := \| M \| M^{-1}$ to obtain a positive sequence $\gamma$ such that (2.15) holds. This finishes the proof of Theorem 2.5.

4.2  Stability of the Matrix Dyson Equation

The goal of this section is to prove Proposition 2.2 and the stability of the MDE, Theorem 2.6. The main technical ingredient for these proofs is the linear stability of the MDE. For its statement we introduce, for any $R \in \mathbb{C}^{N \times N}$, the sandwiching operator $\mathcal{C}_R : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$,
\[
\mathcal{C}_R[T] := R \, T \, R \,. \tag{4.17}
\]
Note that $\mathcal{C}_R^{-1} = \mathcal{C}_{R^{-1}}$ and $\mathcal{C}_R^* = \mathcal{C}_{R^*}$ for any invertible $R \in \mathbb{C}^{N \times N}$, where $\mathcal{C}_R^*$ denotes the adjoint with respect to the scalar product (2.1).
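These two identities for the sandwiching operator are easy to verify numerically. The following sketch (our own illustration, with randomly generated test matrices) checks the inverse property and the adjoint property $\langle S, \mathcal{C}_R[T] \rangle = \langle \mathcal{C}_{R^*}[S], T \rangle$ with respect to the normalized scalar product $\langle A, B \rangle = \operatorname{Tr}(A^* B)/N$.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
R = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
S_ = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
T = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

def C(R, T):          # sandwiching operator C_R[T] = R T R
    return R @ T @ R

def ip(A, B):         # normalized scalar product <A, B> = Tr(A* B) / N
    return np.trace(A.conj().T @ B) / N

lhs = ip(S_, C(R, T))           # <S, C_R[T]>
rhs = ip(C(R.conj().T, S_), T)  # <C_{R*}[S], T>
print(abs(lhs - rhs))           # ~ 0 by cyclicity of the trace
```

The adjoint identity follows from cyclicity of the trace, which is exactly what the numerical agreement reflects.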

Proposition 4.3 (Linear stability). Assume A1 and $\| A \| \le P_0$ for some constant $P_0 > 0$ (cf. (2.8)). There exists a universal numerical constant $C > 0$ such that
\[
\big\| ( \operatorname{Id} - \mathcal{C}_{M(\zeta)} \mathcal{S} )^{-1} \big\|_{\mathrm{sp}} \lesssim 1 + \frac{1}{\big( \rho(\zeta) + \operatorname{dist}(\zeta, \operatorname{supp} \rho) \big)^{C}} \,, \tag{4.18}
\]
uniformly for all $\zeta \in \mathbb{H}$.

Before proving a few technical results that prepare the proof of Proposition 4.3, we give a heuristic argument explaining the connection between this proposition and the linear stability of the MDE. Let us suppose that the perturbed MDE
\[
-\mathbf{1} = \big( \zeta \mathbf{1} - A + \mathcal{S}[G(D)] \big) G(D) + D \,, \tag{4.19}
\]
with perturbation matrix $D$, has a unique solution $G(D)$, depending differentiably on $D$. Then by differentiating both sides of (4.19) with respect to $D$, setting $D = 0$ and using the MDE for $M(\zeta) = G(0)$, we find
\[
0 = - M(\zeta)^{-1} \nabla_R G(0) + \mathcal{S}[\nabla_R G(0)] \, M(\zeta) + R \,, \tag{4.20}
\]
where $\nabla_R$ denotes the directional derivative with respect to $D$ in the direction $R \in \mathbb{C}^{N \times N}$. Rearranging the terms in (4.20) and multiplying by $M = M(\zeta)$ from the left yields
\[
( \operatorname{Id} - \mathcal{C}_M \mathcal{S} ) \, \nabla_R G(0) = M R \,. \tag{4.21}
\]
Thus $G(D)$ has a bounded derivative at $D = 0$, i.e. the MDE is stable with respect to the perturbation $D$ to linear order, whenever the operator $\operatorname{Id} - \mathcal{C}_M \mathcal{S}$ is invertible with bounded inverse. The following definition plays a crucial role in the upcoming analysis.

Definition 4.4 (Saturated self-energy operator). Let $M = M(\zeta)$ be the solution of the MDE at some spectral parameter $\zeta \in \mathbb{H}$. We define the linear operator $\mathcal{F} = \mathcal{F}(\zeta) : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ by
\[
\mathcal{F} := \mathcal{C}_{W} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{S} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{C}_{W} \,, \tag{4.22a}
\]
where we have introduced the auxiliary matrix
\[
W := \Big( \mathbf{1} + \big( \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} [\operatorname{Re} M] \big)^2 \Big)^{1/4} \,. \tag{4.22b}
\]

We call $\mathcal{F}$ the saturated self-energy operator, or the saturation of $\mathcal{S}$ for short. The operator $\mathcal{F}$ inherits from the self-energy operator $\mathcal{S}$ both the self-adjointness with respect to (2.1) and the property of mapping $\mathcal{C}_+$ to itself.

We now briefly discuss the reason for introducing $\mathcal{F}$. In order to invert $\operatorname{Id} - \mathcal{C}_M \mathcal{S}$ in (4.21) we have to show that $\mathcal{C}_M \mathcal{S}$ is dominated by $\operatorname{Id}$ in some sense. Neither $\mathcal{S}$ nor $M$ can be directly related to the identity operator, but their specific combination $\mathcal{C}_M \mathcal{S}$ can. We extract this delicate information from the MDE via a Perron-Frobenius argument. Unfortunately, $\mathcal{C}_M \mathcal{S}$ is neither positivity preserving nor symmetric. The key is to find an appropriate symmetrization of this operator before Perron-Frobenius is applied. A similar problem appeared in a simpler, commutative setting in [2]. There, $M = \operatorname{diag}(\mathbf{m})$ was a diagonal matrix and the MDE became a vector equation. In this case the problem of inverting $\operatorname{Id} - \mathcal{C}_M \mathcal{S}$ reduces to inverting a matrix $\mathbf{1} - \operatorname{diag}(\mathbf{m})^2 S$, where $S \in \mathbb{R}^{N \times N}$ is a matrix with non-negative entries that plays the role of the self-energy operator $\mathcal{S}$ in the current setup. The idea in [2] was to write
\[
\mathbf{1} - \operatorname{diag}(\mathbf{m})^2 S = R \, (U - F) \, T \,, \tag{4.23}
\]
with invertible diagonal matrices $R$ and $T$, a diagonal unitary matrix $U$ and a self-adjoint $F$ with positive entries that satisfies the bound $\| F \| \le 1$. It is then possible to see that $U - F$ is invertible as long as $U$ does not leave the Perron-Frobenius eigenvector of $F$ invariant. In this commutative setting one may choose $F = \operatorname{diag}(|\mathbf{m}|) \, S \, \operatorname{diag}(|\mathbf{m}|)$, where the absolute value is taken in each component. In our current setting we will achieve a decomposition similar to (4.23) on the level of operators acting on $\mathbb{C}^{N \times N}$ (cf. (4.38) below). The definition (4.22) ensures that the saturation $\mathcal{F}$ is self-adjoint, positivity preserving and satisfies $\| \mathcal{F} \| \le 1$, as we will establish later.
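The Perron-Frobenius mechanism invoked here can be illustrated in the commutative setting of [2]: for a matrix with strictly positive entries, power iteration converges to an entrywise positive top eigenvector. The sketch below uses a generic random positive matrix of our own choosing, not the saturation $\mathcal{F}$ of the paper.

```python
import numpy as np

# Power iteration for an entrywise positive (hence cone preserving) matrix:
# by the Perron-Frobenius theorem the dominant eigenvalue is simple and its
# eigenvector can be chosen entrywise positive.
rng = np.random.default_rng(2)
F = rng.random((6, 6)) + 0.1        # strictly positive entries
v = np.ones(6)
for _ in range(1000):               # power iteration
    v = F @ v
    v /= np.linalg.norm(v)
lam = v @ F @ v                     # Rayleigh quotient ~ spectral radius
print((v > 0).all(), lam)
```

In the paper the same principle is applied not to a matrix but to the cone preserving operator $\mathcal{F}$ acting on $\mathbb{C}^{N\times N}$; its "Perron-Frobenius eigenvector" is the eigenmatrix $F$ of Lemma 4.6.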

Lemma 4.5 (Bounds on W). Assume A1 and $\| A \| \le P_0$ for some constant $P_0 > 0$. Then uniformly for all spectral parameters $\zeta \in \mathbb{H}$ with $|\zeta| \le 3(1 + \kappa)$ the matrix $W = W(\zeta) \in \mathcal{C}_+$, defined in (4.22b), fulfils the bounds
\[
\| M \|^{-1} \, \mathbf{1} \;\lesssim\; \rho^{1/2} \, W \;\lesssim\; \| M \|^{1/2} \, \mathbf{1} \,. \tag{4.24}
\]

Proof. We write $W^4$ in a form that follows immediately from its definition (4.22b),
\[
W^4 = \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} \Big[ \big( \mathcal{C}_{\operatorname{Im} M} + \mathcal{C}_{\operatorname{Re} M} \big) \big[ (\operatorname{Im} M)^{-1} \big] \Big] \,.
\]
We estimate $(\operatorname{Im} M)^{-1}$ from above and below by employing (4.11) in the regime $|\zeta| \lesssim 1$,
\[
\frac{1}{2 \rho \| M \|^2} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} \big[ M^* M + M M^* \big] \;\lesssim\; W^4 \;\lesssim\; \frac{1}{\rho} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} \big[ M^* M + M M^* \big] \,.
\]

Using the trivial bounds $2 \| M^{-1} \|^{-2} \, \mathbf{1} \le M^* M + M M^* \le 2 \| M \|^2 \, \mathbf{1}$ and $\| M^{-1} \| \lesssim 1$ from (4.10), as well as (4.11) again, we find $\| M \|^{-4} \rho^{-2} \lesssim W^4 \lesssim \| M \|^{2} \rho^{-2}$. This is equivalent to (4.24).

Lemma 4.6 (Spectrum of $\mathcal{F}$). Assume A1 and $\| A \| \le P_0$ for some constant $P_0 > 0$. Then the saturated self-energy operator $\mathcal{F} = \mathcal{F}(\zeta)$, defined in (4.22), has a unique normalized, $\| F \|_{\mathrm{hs}} = 1$, eigenmatrix $F = F(\zeta) \in \mathcal{C}_+$ corresponding to its largest eigenvalue,
\[
\mathcal{F}[F] = \| \mathcal{F} \|_{\mathrm{sp}} \, F \,. \tag{4.25}
\]
Furthermore, the following properties hold uniformly for all spectral parameters $\zeta \in \mathbb{H}$ such that $|\zeta| \le 3(1 + \kappa)$ and $\| \mathcal{F}(\zeta) \|_{\mathrm{sp}} \ge 1/2$.

(i) The spectral radius of $\mathcal{F}$ is given by
\[
\| \mathcal{F} \|_{\mathrm{sp}} = 1 - \frac{\langle F \,, \mathcal{C}_W[\operatorname{Im} M] \rangle}{\langle F \,, W^{-2} \rangle} \, \operatorname{Im} \zeta \,. \tag{4.26}
\]

(ii) The eigenmatrix $F$ is bounded from above and below by
\[
\| M \|^{-7} \, \mathbf{1} \;\lesssim\; F \;\lesssim\; \| M \|^{6} \, \mathbf{1} \,. \tag{4.27}
\]

(iii) The saturation $\mathcal{F}$ has a uniform spectral gap, i.e.,
\[
\operatorname{Spec}\big( \mathcal{F} / \| \mathcal{F} \|_{\mathrm{sp}} \big) \subseteq [-1 + \vartheta \,, 1 - \vartheta] \cup \{1\} \,, \qquad \vartheta \gtrsim \| M \|^{-42} \,. \tag{4.28}
\]

Proof. Since $\mathcal{F}$ preserves the cone $\mathcal{C}_+$ of positive semidefinite matrices, a version of the Perron-Frobenius theorem for cone preserving operators implies that there exists a normalized $F \in \mathcal{C}_+$ such that (4.25) holds. We show the uniqueness of this eigenmatrix later in the proof. First we prove that (4.26) holds for any such $F$.

Proof of (i): For any matrix $R \in \mathbb{C}^{N \times N}$ we define the operator $\mathcal{K}_R : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ via
\[
\mathcal{K}_R[T] := R^* \, T \, R \,. \tag{4.29}
\]
Note that for self-adjoint $R \in \mathbb{C}^{N \times N}$ we have $\mathcal{K}_R = \mathcal{C}_R$ (cf. (4.17)). Using definition (4.29), the imaginary part of the MDE (4.6) can be written in the form
\[
\operatorname{Im} M = \operatorname{Im} \zeta \, \mathcal{K}_M[\mathbf{1}] + \mathcal{K}_M \mathcal{S}[\operatorname{Im} M] \,. \tag{4.30}
\]
We now rewrite equation (4.30) in terms of $\operatorname{Im} M$, $\mathcal{F}$ and $W$. In order to express $M$ in terms of $W$, we introduce the unitary matrix
\[
U := \frac{ \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} [\operatorname{Re} M] - \mathrm{i} \mathbf{1} }{ \big| \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} [\operatorname{Re} M] - \mathrm{i} \mathbf{1} \big| } \,, \tag{4.31}
\]
defined via the spectral calculus of the self-adjoint matrix $\mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} [\operatorname{Re} M]$. With (4.31) and the definition $W = \big| \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} [\operatorname{Re} M] - \mathrm{i} \mathbf{1} \big|^{1/2}$ from (4.22b), we may write $M$ as
\[
M = \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{C}_{W} [U^*] \,. \tag{4.32}
\]

Here the matrices $W$ and $U$ commute. Using (4.32) we also find an expression for $\mathcal{K}_M$, namely
\[
\mathcal{K}_M = \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{C}_{W} \, \mathcal{K}_{U^*} \, \mathcal{C}_{W} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}} \,. \tag{4.33}
\]
Plugging (4.33) into (4.30) and applying the inverse of $\mathcal{C}_{\sqrt{\operatorname{Im} M}} \mathcal{C}_{W} \mathcal{K}_{U^*}$ to both sides, we end up with
\[
W^{-2} = \mathcal{C}_{W}[\operatorname{Im} M] \, \operatorname{Im} \zeta + \mathcal{F}[W^{-2}] \,, \tag{4.34}
\]
where we used the definition of $\mathcal{F}$ from (4.22) and $\mathcal{K}_{U^*}^{-1}[W^{-2}] = W^{-2}$, which holds because $U$ and $W$ commute. We project both sides of (4.34) onto the eigenmatrix $F$ of $\mathcal{F}$. Since $\mathcal{F}$ is self-adjoint with respect to the scalar product (2.1), the eigenmatrix equation (4.25) yields
\[
\langle F \,, W^{-2} \rangle = \langle F \,, \mathcal{C}_W[\operatorname{Im} M] \rangle \, \operatorname{Im} \zeta + \| \mathcal{F} \|_{\mathrm{sp}} \, \langle F \,, W^{-2} \rangle \,.
\]
Solving this identity for $\| \mathcal{F} \|_{\mathrm{sp}}$ yields (4.26).

Proof of (ii) and (iii): Let $\zeta \in \mathbb{H}$ with $|\zeta| \le 3(1 + \kappa)$ and $\| \mathcal{F}(\zeta) \|_{\mathrm{sp}} \ge 1/2$. The bounds on the eigenmatrix (4.27) and on the spectral gap (4.28) are a consequence of the following property of $\mathcal{F}$:

• For all matrices $R \in \mathcal{C}_+$ the operator $\mathcal{F}$ satisfies
\[
\| M \|^{-4} \, \langle R \rangle \, \mathbf{1} \;\lesssim\; \mathcal{F}[R] \;\lesssim\; \| M \|^{6} \, \langle R \rangle \, \mathbf{1} \,. \tag{4.35}
\]

We verify (4.35) below. Given (4.35), the remaining assertions (4.27) and (4.28) of Lemma 4.6 follow from a technical result, Lemma A.3, in the appendix; this lemma also shows the uniqueness of the eigenmatrix $F$. In the regime $|\zeta| \ge 3(1 + \kappa)$ the constants hidden in the comparison relation of (4.35) would depend on $|\zeta|$, but otherwise the upcoming arguments are not affected. In particular, the qualitative property of having a unique eigenmatrix $F$ remains true even for large values of $|\zeta|$.

Proof of (4.35): The bounds in (4.35) are a consequence of Assumption A1 and of the bounds (4.24) on $W$ and (4.11) on $\operatorname{Im} M$, respectively. Indeed, from A1 we have $\mathcal{S}[R] \sim \langle R \rangle \mathbf{1}$ for positive semidefinite matrices $R$. By the definition (4.22a) of $\mathcal{F}$ this immediately yields
\[
\mathcal{F}[R] \;\sim\; \big\langle \mathcal{C}_{\sqrt{\operatorname{Im} M}} \mathcal{C}_W[R] \big\rangle \, \mathcal{C}_W \mathcal{C}_{\sqrt{\operatorname{Im} M}}[\mathbf{1}] \;=\; \big\langle \mathcal{C}_W[\operatorname{Im} M] \,, R \big\rangle \, \mathcal{C}_W[\operatorname{Im} M] \,.
\]
Since (4.24) and (4.11) imply
\[
\| M \|^{-2} \, \mathbf{1} \;\lesssim\; \mathcal{C}_W[\operatorname{Im} M] \;\lesssim\; \| M \|^{3} \, \mathbf{1} \,,
\]
we conclude that (4.35) holds.

Proof of Proposition 4.3. To show (4.18) we consider the regimes of large and small values of $|\zeta|$ separately. We start with the simpler regime, $|\zeta| \ge 3(1 + \kappa)$. In this case we apply the bound
\[
\| M(\zeta) \| \le \frac{1}{|\zeta| - \kappa} \,,
\]

which is an immediate consequence of the Stieltjes transform representation (2.5) of $M$. In particular,
\[
\| \mathcal{C}_{M(\zeta)} \mathcal{S} \|_{\mathrm{sp}} \le \frac{\| \mathcal{S} \|_{\mathrm{sp}}}{(|\zeta| - \kappa)^2} \le \frac{\| \mathcal{S} \|}{4 (1 + \kappa)^2} \le \frac{1}{4} \,, \tag{4.36}
\]
where we used $\kappa \ge \| \mathcal{S} \|^{1/2}$ in the last and second to last inequalities. We also used that $\| T \|_{\mathrm{sp}} \le \| T \|$ for any self-adjoint $T \in \mathbb{C}^{N \times N}$. The claim (4.18) hence follows in the regime of large $|\zeta|$.

Now we consider the regime $|\zeta| \le 3(1 + \kappa)$. Here we use the spectral properties of the saturated self-energy operator $\mathcal{F}$, established in Lemma 4.6. First we rewrite $\operatorname{Id} - \mathcal{C}_{M(\zeta)} \mathcal{S}$ in terms of $\mathcal{F}$. For this purpose we recall the definition of $U$ from (4.31). With the identity (4.32) we find
\[
\mathcal{C}_M = \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{C}_{W} \, \mathcal{C}_{U^*} \, \mathcal{C}_{W} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}} \,. \tag{4.37}
\]
Combining (4.37) with the definition of $\mathcal{F}$ from (4.22a) we verify
\[
\operatorname{Id} - \mathcal{C}_M \mathcal{S} = \mathcal{C}_{\sqrt{\operatorname{Im} M}} \, \mathcal{C}_{W} \, \mathcal{C}_{U^*} \big( \mathcal{C}_U - \mathcal{F} \big) \, \mathcal{C}_{W}^{-1} \, \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} \,. \tag{4.38}
\]

The bounds (4.24) on $W$ and (4.11) on $\operatorname{Im} M$ imply bounds on $\mathcal{C}_W$ and $\mathcal{C}_{\sqrt{\operatorname{Im} M}}$, respectively. In fact, in the regime of bounded $|\zeta|$, we have
\[
\| \mathcal{C}_W \| \lesssim \frac{\| M \|}{\rho} \,, \qquad \| \mathcal{C}_W^{-1} \| \lesssim \rho \, \| M \|^2 \,, \qquad \| \mathcal{C}_{\sqrt{\operatorname{Im} M}} \| \lesssim \rho \, \| M \|^2 \,, \qquad \| \mathcal{C}_{\sqrt{\operatorname{Im} M}}^{-1} \| \lesssim \frac{1}{\rho} \,. \tag{4.39}
\]
Therefore, taking the inverse and then the norm $\| \cdot \|_{\mathrm{sp}}$ on both sides of (4.38), and using (4.39) as well as $\| \mathcal{C}_T \|_{\mathrm{sp}} \le \| \mathcal{C}_T \|$ for self-adjoint $T \in \mathbb{C}^{N \times N}$, yields
\[
\| ( \operatorname{Id} - \mathcal{C}_M \mathcal{S} )^{-1} \|_{\mathrm{sp}} \lesssim \| M \|^{5} \, \| ( \mathcal{C}_U - \mathcal{F} )^{-1} \|_{\mathrm{sp}} \,. \tag{4.40}
\]
Note that $\mathcal{C}_U$ and $\mathcal{C}_{U^*}$ are unitary operators on $\mathbb{C}^{N \times N}$, so that $\| \mathcal{C}_U \|_{\mathrm{sp}} = \| \mathcal{C}_{U^*} \|_{\mathrm{sp}} = 1$. It remains to estimate the norm of the inverse of $\mathcal{C}_U - \mathcal{F}$. In case $\| \mathcal{F} \|_{\mathrm{sp}} < 1/2$ we simply use the bound $\| ( \mathcal{C}_U - \mathcal{F} )^{-1} \|_{\mathrm{sp}} \le 2$ in (4.40), together with (4.9) to estimate $\| M \|$, thus verifying (4.18) in this case. If $\| \mathcal{F} \|_{\mathrm{sp}} \ge 1/2$, we apply the following lemma, which was stated as Lemma 5.6 in [2].

Lemma 4.7 (Rotation-Inversion Lemma). Let $\mathcal{T}$ be a self-adjoint and $\mathcal{U}$ a unitary operator on $\mathbb{C}^{N \times N}$. Suppose that $\mathcal{T}$ has a spectral gap, i.e., there is a constant $\operatorname{Gap}(\mathcal{T}) > 0$ such that
\[
\operatorname{Spec}(\mathcal{T}) \subseteq \big[ - \| \mathcal{T} \|_{\mathrm{sp}} + \operatorname{Gap}(\mathcal{T}) \,,\, \| \mathcal{T} \|_{\mathrm{sp}} - \operatorname{Gap}(\mathcal{T}) \big] \cup \big\{ \| \mathcal{T} \|_{\mathrm{sp}} \big\} \,,
\]
with a non-degenerate largest eigenvalue $\| \mathcal{T} \|_{\mathrm{sp}} \le 1$. Then there exists a universal positive constant $C$ such that
\[
\| ( \mathcal{U} - \mathcal{T} )^{-1} \|_{\mathrm{sp}} \le \frac{C}{\operatorname{Gap}(\mathcal{T}) \, \big| 1 - \| \mathcal{T} \|_{\mathrm{sp}} \langle T \,, \mathcal{U}[T] \rangle \big|} \,,
\]
where $T$ is the normalized, $\| T \|_{\mathrm{hs}} = 1$, eigenmatrix of $\mathcal{T}$ corresponding to $\| \mathcal{T} \|_{\mathrm{sp}}$.

With the lower bound (4.28) on the spectral gap of $\mathcal{F}$, we find
\[
\| ( \mathcal{C}_U - \mathcal{F} )^{-1} \|_{\mathrm{sp}} \;\lesssim\; \frac{\| M \|^{42}}{\big| 1 - \| \mathcal{F} \|_{\mathrm{sp}} \langle F \,, \mathcal{C}_U[F] \rangle \big|} \;\le\; \frac{\| M \|^{42}}{\max\big\{ 1 - \| \mathcal{F} \|_{\mathrm{sp}} \,,\, | 1 - \langle F \,, \mathcal{C}_U[F] \rangle | \big\}} \,. \tag{4.41}
\]
Plugging (4.41) into (4.40) and using (4.9) to estimate $\| M \|$ shows (4.18), provided the denominator on the right hand side of (4.41) satisfies
\[
\max\big\{ 1 - \| \mathcal{F} \|_{\mathrm{sp}} \,,\, | 1 - \langle F \,, \mathcal{C}_U[F] \rangle | \big\} \;\gtrsim\; \big( \rho(\zeta) + \operatorname{dist}(\zeta, \operatorname{supp} \rho) \big)^{C} \,, \tag{4.42}
\]
for some universal constant $C > 0$. In the remainder of this proof we verify (4.42). We establish lower bounds on both arguments of the maximum in (4.42) and combine them afterwards.

We start with a lower bound on $1 - \| \mathcal{F} \|_{\mathrm{sp}}$. Estimating the numerator of the fraction on the right hand side of (4.26) from below,
\[
\langle F \,, \mathcal{C}_W[\operatorname{Im} M] \rangle \;\gtrsim\; \rho \, \langle F \,, W^2 \rangle \;\gtrsim\; \| M \|^{-2} \, \langle F \rangle \,,
\]
and its denominator from above,
\[
\langle F \,, W^{-2} \rangle \;\lesssim\; \rho \, \| M \|^2 \, \langle F \rangle \,,
\]
by applying the bounds from (4.24) and (4.11), we see that
\[
1 - \| \mathcal{F}(\zeta) \|_{\mathrm{sp}} \;\gtrsim\; \frac{\operatorname{Im} \zeta}{\rho(\zeta) \, \| M(\zeta) \|^4} \,. \tag{4.43}
\]
Since $\rho(\zeta)$ is the harmonic extension of a probability density (namely the density of states $\rho$), we have the trivial upper bound $\rho(\zeta) \lesssim \operatorname{Im} \zeta / \operatorname{dist}(\zeta, \operatorname{supp} \rho)^2$. Continuing from (4.43) we find the lower bound
\[
1 - \| \mathcal{F} \|_{\mathrm{sp}} \;\gtrsim\; \| M \|^{-4} \operatorname{dist}(\zeta, \operatorname{supp} \rho)^2 \;\gtrsim\; \big( \rho + \operatorname{dist}(\zeta, \operatorname{supp} \rho) \big)^{4} \operatorname{dist}(\zeta, \operatorname{supp} \rho)^2 \,, \tag{4.44}
\]
where we used (4.9) in the second inequality.

Now we estimate $| 1 - \langle F \,, \mathcal{C}_U[F] \rangle |$ from below. We begin with
\[
| 1 - \langle F \,, \mathcal{C}_U[F] \rangle | \;\ge\; \operatorname{Re}\big( 1 - \langle F \,, \mathcal{C}_U[F] \rangle \big) \;=\; 1 - \langle F \,, ( \mathcal{C}_{\operatorname{Re} U} - \mathcal{C}_{\operatorname{Im} U} )[F] \rangle \;\ge\; \langle F \,, \mathcal{C}_{\operatorname{Im} U}[F] \rangle \,, \tag{4.45}
\]
where we used $1 - \langle F \,, \mathcal{C}_{\operatorname{Re} U}[F] \rangle \ge 0$ in the last inequality, because $U$ is unitary and $\| F \|_{\mathrm{hs}} = 1$. Since $\operatorname{Im} U = - W^{-2}$ (cf. (4.31) and (4.22)) and because of (4.24), we have
\[
- \operatorname{Im} U \;\gtrsim\; \frac{\rho}{\| M \|^2} \,.
\]
Continuing from (4.45) and using the normalization $\| F \|_{\mathrm{hs}} = 1$, we get the lower bound
\[
| 1 - \langle F \,, \mathcal{C}_U[F] \rangle | \;\gtrsim\; \frac{\rho^2}{\| M \|^4} \;\gtrsim\; \big( \rho + \operatorname{dist}(\zeta, \operatorname{supp} \rho) \big)^{4} \rho^2 \,. \tag{4.46}
\]
Here, (4.9) was used again. Combining (4.46) with (4.44) shows (4.42) and thus finishes the proof of Proposition 4.3.

Proof of Proposition 2.2. We show that the harmonic extension $\rho(\zeta)$ of the density of states (cf. (2.10)) is uniformly $c$-Hölder continuous on the entire complex upper half plane. Its unique continuous extension to the real line, the density of states, then inherits this regularity. We differentiate both sides of the MDE with respect to $\zeta$ and find the equation
\[
( \operatorname{Id} - \mathcal{C}_M \mathcal{S} ) [\partial_\zeta M] = M^2 \,. \tag{4.47}
\]
Inverting the operator $\operatorname{Id} - \mathcal{C}_M \mathcal{S}$ and taking the normalized Hilbert-Schmidt norm reveals a bound on the derivative of the solution of the MDE,
\[
\| \partial_\zeta M \|_{\mathrm{hs}} \le \| ( \operatorname{Id} - \mathcal{C}_{M(\zeta)} \mathcal{S} )^{-1} \|_{\mathrm{sp}} \, \| M \|^2 \,. \tag{4.48}
\]
Since $\zeta \mapsto \langle M(\zeta) \rangle$ is an analytic function on $\mathbb{H}$, we have the basic identity $2\pi \mathrm{i} \, \partial_\zeta \rho = 2 \mathrm{i} \, \partial_\zeta \operatorname{Im} \langle M \rangle = \partial_\zeta \langle M \rangle$. Therefore, making use of (4.48), we get
\[
| \partial_\zeta \rho | = \frac{1}{2\pi} | \langle \partial_\zeta M \rangle | \le \frac{1}{2\pi} \| ( \operatorname{Id} - \mathcal{C}_{M(\zeta)} \mathcal{S} )^{-1} \|_{\mathrm{sp}} \, \| M \|^2 \lesssim \frac{1}{\rho^{C+2}} \,. \tag{4.49}
\]
For the last inequality in (4.49) we employed the bound (4.9) and the linear stability, Proposition 4.3; the universal constant $C$ stems from its statement (4.18). From (4.49) we read off that the harmonic extension $\rho$ of the density of states is $\frac{1}{C+3}$-Hölder continuous.

It remains to prove that $\rho$ is real analytic at any $\tau_0$ with $\rho(\tau_0) > 0$. Since $\rho$ is continuous, it is bounded away from zero in a neighborhood of $\tau_0$. Using (4.47), (4.9) and (4.18) we conclude that $M$ is uniformly continuous in the intersection of a small neighborhood of $\tau_0$ in $\mathbb{C}$ with the complex upper half plane. In particular, $M$ has a unique continuous extension $M(\tau_0)$ to $\tau_0$. Furthermore, by differentiating (2.2) with respect to $\zeta$ and by the uniqueness of the solution of (2.2) with positive imaginary part, one verifies that $M$ coincides with the solution $Q$ of the holomorphic initial value problem
\[
\partial_\omega Q = ( \operatorname{Id} - \mathcal{C}_Q \mathcal{S} )^{-1} [Q^2] \,, \qquad Q(0) = M(\tau_0) \,,
\]

i.e. $M(\tau_0 + \omega) = Q(\omega)$ for any $\omega \in \mathbb{H}$ with sufficiently small absolute value. Since the solution $Q$ is analytic in a small neighborhood of zero, we conclude that $M$ can be holomorphically extended to a neighborhood of $\tau_0$ in $\mathbb{C}$. By continuity, (2.10) remains true for $\zeta \in \mathbb{R}$ close to $\tau_0$, and thus $\rho$ is real analytic there.

In the proof of Theorem 2.6 we will often consider $\mathcal{T} : (\mathbb{C}^{N \times N}, \| \cdot \|_A) \to (\mathbb{C}^{N \times N}, \| \cdot \|_B)$, i.e., $\mathcal{T}$ is a linear operator on $\mathbb{C}^{N \times N}$ equipped with two different norms. We indicate the norms in the notation of the corresponding induced operator norm $\| \mathcal{T} \|_{A \to B}$. We will use $A, B = \mathrm{hs}, \| \cdot \|, 1, \infty, \max$, etc. We still keep our convention that $\| \mathcal{T} \|_{\mathrm{sp}} = \| \mathcal{T} \|_{\mathrm{hs} \to \mathrm{hs}}$ and $\| \mathcal{T} \| = \| \mathcal{T} \|_{\| \cdot \| \to \| \cdot \|}$. Some of the norms on matrices $R \in \mathbb{C}^{N \times N}$ are ordered, e.g.
\[
\max\big\{ \| R \|_{\max} \,, \| R \|_{\mathrm{hs}} \big\} \le \| R \| \le \| R \|_{1 \vee \infty} \,, \qquad \text{where} \qquad \| R \|_{1 \vee \infty} := \max\big\{ \| R \|_1 \,, \| R \|_\infty \big\} \,, \tag{4.50}
\]
with $\| \cdot \|_1$ and $\| \cdot \|_\infty$ denoting the operator norms induced by the $\ell^1$- and $\ell^\infty$-norms on $\mathbb{C}^N$, respectively:
\[
\| R \|_1 := \max_{y=1}^{N} \sum_{x=1}^{N} | r_{xy} | \,, \qquad \| R \|_\infty := \max_{x=1}^{N} \sum_{y=1}^{N} | r_{xy} | \,. \tag{4.51}
\]
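The induced norms in (4.51) are simple column and row sum maxima; the following small sketch (our own illustration, with an arbitrary test matrix) computes them and checks the ordering relation in (4.50) between the spectral norm and the $1 \vee \infty$ norm.

```python
import numpy as np

R = np.array([[1.0, -2.0],
              [3.0,  4.0]])
norm_1   = np.abs(R).sum(axis=0).max()   # ||R||_1:   max column sum = 6
norm_inf = np.abs(R).sum(axis=1).max()   # ||R||_inf: max row sum    = 7
norm_max = np.abs(R).max()               # ||R||_max: entrywise maximum
norm_sp  = np.linalg.norm(R, 2)          # operator (spectral) norm ||R||
# Part of the ordering in (4.50): ||R||_max <= ||R|| <= ||R||_{1 v oo}
print(norm_1, norm_inf, norm_max <= norm_sp <= max(norm_1, norm_inf))
```

Note that the paper's Hilbert-Schmidt norm is normalized by the dimension, so the comparison $\| R \|_{\mathrm{hs}} \le \| R \|$ in (4.50) uses that normalized convention rather than the unnormalized Frobenius norm.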

Note that if $\| \cdot \|_{\widetilde{A}} \le \| \cdot \|_{A}$ and $\| \cdot \|_{B} \le \| \cdot \|_{\widetilde{B}}$, then $\| \cdot \|_{A \to B} \le \| \cdot \|_{\widetilde{A} \to \widetilde{B}}$. In particular, for $\mathcal{T} : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ we have e.g. $\| \mathcal{T} \|_{\max \to \mathrm{hs}} \le \| \mathcal{T} \|_{\max \to \| \cdot \|} \le \| \mathcal{T} \|_{\max \to 1 \vee \infty}$.

Proof of Theorem 2.6. We apply a quantitative version of the implicit function theorem, Lemma A.4. For this purpose we define the function $\mathcal{J} : \mathbb{C}^{N \times N} \times \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ by
\[
\mathcal{J}[G, D] := ( \zeta \mathbf{1} - A + \mathcal{S}[G] ) G + D \,. \tag{4.52}
\]
With this definition the perturbed MDE (2.16) takes the form $\mathcal{J}[G(D), D] = -\mathbf{1}$, and the unperturbed MDE (2.2) is $\mathcal{J}[M, 0] = -\mathbf{1}$, where $M = G(0)$. For the application of the implicit function theorem we control the derivatives of $\mathcal{J}$ with respect to $G$ and $D$. With the shorthand notation
\[
\mathcal{W}_R[T] := M \big( \mathcal{S}[T] R + \mathcal{S}[R] T \big) \,, \tag{4.53}
\]
we compute the directional derivative of $\mathcal{J}$ with respect to $G$ in the direction $R \in \mathbb{C}^{N \times N}$,
\[
\nabla_R^{(G)} \mathcal{J}[G, D] = ( \zeta \mathbf{1} - A + \mathcal{S}[G] ) R + \mathcal{S}[R] G = - M^{-1} \big( \operatorname{Id} - \mathcal{C}_M \mathcal{S} - \mathcal{W}_{G - M} \big) [R] \,. \tag{4.54}
\]

For the second identity in (4.54) we used (2.2). The derivative with respect to D is simply the identity operator, ∇pDqJ rG, Ds “ Id . Therefore, estimating ∇pDq J is trivial. We consider CN ˆN with the entrywise maximum norm k ¨ kmax and use the short hand notation kT kmax :“ kT kmaxÑmax for the induced operator norm of any linear T : CN ˆN Ñ CN ˆN . To apply Lemma A.4 we need the following two estimates: (i) The operator norm of the inverse of Id ´ CM S on pCN ˆN , k ¨ kmax q is controlled by its spectral norm, kpId ´ CM Sq´1 kmax À 1 ` kMk2 ` kMk4 kpId ´ CM Sq´1 ksp .

(4.55)

(ii) The operator norm of WG´M is small, provided G is close to M, kWG´M kmax À kMk1_8 kG ´ Mkmax .

(4.56)

We will prove these estimates after we have used them to show that the hypotheses of the quantitative implicit function theorem hold. Let us first bound the operator R ↦ ∇_R^{(G)} J[M, 0]. To this end, using (4.54) we have

  ‖(∇^{(G)} J[M, 0])⁻¹[R]‖_max = ‖(Id − C_M S)⁻¹[MR]‖_max ≤ ‖(Id − C_M S)⁻¹‖_max ‖M‖_{1∨∞} ‖R‖_max,   (4.57)

for an arbitrary R. For the last inequality we have used ‖MR‖_max ≤ ‖M‖_{1∨∞}‖R‖_max. By Theorem 2.5 there is a sequence γ, depending only on δ and P, such that

  ‖M‖_{1∨∞} ≤ max_x Σ_y γ(ν)/(1 + d(x, y))^ν + γ(0),  ν ∈ ℕ.   (4.58)

Here and in the following, unrestricted summations Σ_x are understood to run over the entire index set from 1 to N. Since the sizes of the balls with respect to d grow only polynomially in their radii (cf. (2.11)), the right hand side of (4.58) is bounded by a constant that only depends on δ and P for a sufficiently large choice of ν. Using this estimate together with the bound (i) for the inverse of Id − C_M S in (4.57) yields ‖(∇_R^{(G)} J[M, 0])⁻¹‖_max ≲ 1. Next, to verify assumption (A.53) of Lemma A.4 we write

  Id − (∇^{(G)} J[M, 0])⁻¹ ∇^{(G)} J[G, D] = (Id − C_M S)⁻¹ W_{G−M}.   (4.59)

Using (4.56) and (4.55), in conjunction with (4.9) and (4.18), we see that

  ‖Id − (∇^{(G)} J[M, 0])⁻¹ ∇^{(G)} J[G, D]‖_max ≤ 1/2,   (4.60)

for all (G, D) ∈ B^max_{c₁}(M) × B^max_{c₂}(0), provided c₁, c₂ ∼ 1 are sufficiently small. The first part of Theorem 2.6, the existence and uniqueness of the analytic function G, now follows from the implicit function theorem, Lemma A.4.

Proof of (i): First we remark that (2.13), for a large enough ν and together with (2.11), implies

  ‖S‖_{max→1∨∞} ≲ 1.   (4.61)

We expand the geometric series corresponding to the operator (Id − C_M S)⁻¹ to second order,

  (Id − C_M S)⁻¹ = Id + C_M S + (C_M S)² (Id − C_M S)⁻¹.   (4.62)
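The second order expansion (4.62) is elementary algebra; as an added check (with the shorthand T := C_M S):

```latex
\[
(\mathrm{Id}-T)\bigl(\mathrm{Id} + T + T^2(\mathrm{Id}-T)^{-1}\bigr)
= \mathrm{Id} + T + T^2(\mathrm{Id}-T)^{-1} - T - T^2 - T^3(\mathrm{Id}-T)^{-1}
= \mathrm{Id},
\]
% since $T^2(\mathrm{Id}-T)^{-1} - T^3(\mathrm{Id}-T)^{-1}
%        = T^2(\mathrm{Id}-T)(\mathrm{Id}-T)^{-1} = T^2$.
```

Multiplying from the left by (Id − T)⁻¹ gives (4.62).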

We consider each of the three terms on the right hand side separately and estimate their norms as operators from ℂ^{N×N} with the entrywise maximum norm to itself. The easiest is ‖Id‖_max = 1. For the second term we use the estimate

  ‖C_M S‖_max ≤ ‖C_M S‖_{max→‖·‖} ≤ ‖C_M‖ ‖S‖_{max→‖·‖}.   (4.63)

For the third term on the right hand side of (4.62) we apply

  ‖(C_M S)² (Id − C_M S)⁻¹‖_max ≤ ‖C_M S‖_{hs→max} ‖(Id − C_M S)⁻¹‖_sp ‖C_M S‖_{max→hs}.   (4.64)

The last factor on the right hand side of (4.64) is bounded by

  ‖C_M S‖_{max→hs} ≤ ‖C_M S‖_{max→‖·‖} ≤ ‖C_M‖ ‖S‖_{max→‖·‖}.   (4.65)

For the first factor we use

  ‖C_M S‖_{hs→max} ≤ ‖C_M S‖_{hs→‖·‖} ≤ ‖C_M‖ ‖S‖_{hs→‖·‖}.   (4.66)

We plug (4.66) and (4.65) into (4.64). Then we use the resulting inequality in combination with (4.63) in (4.62) and find ‖(Id − C_M S)⁻¹‖_max ≲ 1 + ‖C_M‖ + ‖C_M‖² ‖(Id − C_M S)⁻¹‖_sp, where we also used ‖S‖_{max→‖·‖} ≤ ‖S‖_{max→1∨∞} and (4.61). Since ‖C_M‖ ≤ ‖M‖², the claim (4.55) follows.

Proof of (ii): Recall the definition of W_R in (4.53). We estimate

  ‖W_R[T]‖_max ≤ 2 ‖M‖_{1∨∞} ‖S‖_{max→1∨∞} ‖R‖_max ‖T‖_max.   (4.67)

From the bound (4.61) we infer (4.56).

Proof of (2.18) and (2.19): We are left with showing the second part of Theorem 2.6, namely that the derivative of G at D = 0 can be written in the form (2.18) with the operator Z satisfying (2.19). Since we have shown the analyticity of G, the calculation leading up to (4.21) is now justified and we see that ∇_R G(0) = (Id − C_M S)⁻¹[MR] = MR + Z[R] for all R ∈ ℂ^{N×N}. Here, the linear operator Z is given by

  Z[R] := C_M S (Id − C_M S)⁻¹[MR] = (C_M S + (C_M S)² + (C_M S)² (Id − C_M S)⁻¹ C_M S)[MR].   (4.68)

We will estimate the entries of the three summands separately. We show that ‖Z[R]‖_γ ≤ 1/2 for any R ∈ ℂ^{N×N} with ‖R‖_max ≤ 1, where γ depends only on δ and P. We begin with a few easy observations: For two matrices R, T ∈ ℂ^{N×N} that have faster than power law decay, ‖R‖_{γ_R} ≤ 1 and ‖T‖_{γ_T} ≤ 1, their sum and product have faster than power law decay as well, i.e., ‖R + T‖_{γ_{R+T}} ≤ 1 and ‖RT‖_{γ_{RT}} ≤ 1. Here, γ_{R+T} and γ_{RT} depend only on γ_R, γ_T and P (cf. (2.11)). Furthermore, we see that by (2.13) the matrix S[R] has faster than power law decay for any R ∈ ℂ^{N×N} with ‖R‖_max ≤ 1.

By the following argument we estimate the first summand on the right hand side of (4.68). Using (2.13), ‖MR‖_max ≤ ‖M‖_{1∨∞}‖R‖_max and the estimate (4.58), the matrix S[MR] has faster than power law decay. Since C_M multiplies by M on both sides (cf. (4.17)) and M has faster than power law decay (cf. Theorem 2.5), we conclude that so has C_M S[MR].

Now we turn to the second summand on the right hand side of (4.68). Since C_M S[MR] has faster than power law decay, its entries are bounded. Using (2.13) again as above, we see that C_M S applied to C_M S[MR] has faster than power law decay as well.

Finally, we estimate the third summand from (4.68). Since the matrix C_M S[MR] has faster than power law decay, its ‖·‖_hs-norm is bounded. By (4.18) and ζ ∈ D_δ, we conclude that

  ‖(Id − C_M S)⁻¹[C_M S[MR]]‖_hs ≤ C(δ).

Thus, we get

  ‖C_M S[(Id − C_M S)⁻¹ C_M S[MR]]‖_max ≤ C(δ) ‖S‖_{hs→max} ‖C_M‖_max ≤ C(δ) ‖S‖_{hs→1∨∞} ‖M‖²_{1∨∞},

which is bounded by (4.61) and (4.58). Therefore, the third term on the right hand side of (4.68) is an application of C_M S to a matrix with bounded entries, which results in a matrix with faster than power law decay. Altogether we have established that (2.19) holds with only ‖Z[R]‖_γ on the left hand side. It remains to show that Z*[R] satisfies this bound as well. Since Z* has a structure that resembles the structure (4.68) of Z, namely

  Z*[R] = M* (S C_{M*} + (S C_{M*})² + (S C_{M*})² (Id − (C_M S)*)⁻¹ S C_{M*})[R],

we can follow the same line of reasoning as for the entries of Z[R]. This finishes the proof of (2.19) and with it the proof of Theorem 2.6.

5  Estimating the error term

In this section we prove the key estimates, stated precisely in Lemmas 3.3 and 5.1, for the error matrix D that appears as the perturbation in the equation (3.1) for the resolvent G. We start by estimating D away from the convex hull of supp ρ in terms of Λ defined in (3.6). To this end, we define

  κ₋ := min supp ρ,  κ₊ := max supp ρ,   (5.1)

to be the two endpoints of this convex hull.

Lemma 5.1. Let δ > 0 and ε > 0. Then the error matrix D satisfies

  ‖D(ζ)‖_max 1(Λ(ζ) ≤ N^{−ε}) ≺ 1/√N + (Λ(ζ)/(N Im ζ))^{1/2},   (5.2)

for all ζ ∈ ℍ with δ ≤ dist(ζ, [κ₋, κ₊]) ≤ δ⁻¹ and Im ζ ≥ N^{−1+ε}.

Convention 5.2. Throughout this section we will use Convention 4.1 with the set of model parameters P replaced by the set K from (2.29). If the constant C, hidden in the comparison relation, depends on additional parameters L, then we write α ≲_L β.

We rewrite the entries d_xy of D in a different form that allows us to see their smallness, by expanding the term (H − A)G (cf. (3.1b)) in neighborhoods of x and y. For any B ⊆ {1, …, N} we introduce the matrix

  H^B = (h^B_xy)_{x,y=1}^N,  h^B_xy := h_xy 1(x, y ∉ B),   (5.3)

obtained from H by setting the rows and the columns labeled by the elements of B equal to zero. The corresponding resolvent is

  G^B(ζ) := (H^B − ζ1)⁻¹.   (5.4)

With this definition, we have the resolvent expansion formula G = G^B − G^B(H − H^B)G. In particular, for any y = 1, …, N the rows of G outside B have the expansion

  G_uy = − Σ^B_v Σ_{z∈B} G^B_uv h_vz G_zy,  u ∉ B.   (5.5)
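The expansion formula G = G^B − G^B(H − H^B)G used above is an instance of the second resolvent identity; the following one-line derivation is an added remark:

```latex
\[
G^B - G
= (H^B - \zeta\mathbf{1})^{-1}\bigl[(H-\zeta\mathbf{1}) - (H^B-\zeta\mathbf{1})\bigr](H-\zeta\mathbf{1})^{-1}
= G^B\,(H - H^B)\,G .
\]
```

Since (H − H^B)_{vz} vanishes unless v ∈ B or z ∈ B, and G^B_{uv} = 0 for u ∉ B, v ∈ B (the rows and columns of H^B labeled by B are decoupled), only the terms with v ∉ B and z ∈ B survive in the row u ∉ B, which yields (5.5).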

Here we introduced, for any two index sets A, B ⊆ X, the shorthand notation

  Σ^B_{x∈A} := Σ_{x∈A\B}.

In case A = X we simply write Σ^B_x, i.e., the superscript over the summation means exclusion of these indices from the sum. Recall that H is written as a sum of its expectation matrix A and its fluctuation (1/√N)W (cf. (2.24)) and therefore

  D = −(1/√N) WG − S[G]G.

We use the expansion formula (5.5) on the resolvent elements in (WG)_xy = Σ_u w_xu G_uy and find that the entries of D can be written in the form

  d_xy = −(1/√N) Σ_{u∈B} w_xu G_uy + (1/√N) Σ^B_{u,v} Σ_{z∈B} w_xu G^B_uv h_vz G_zy − (1/N) Σ_{u,v,z} G_uv (E w_xu w_vz) G_zy.   (5.6)

Note that the set B here is arbitrary, e.g., it may depend on x and y. In fact, we will choose it to be a neighborhood of {x, y} momentarily. Let A ⊆ B be another index set. We split the sum over z ∈ B in the second term on the right hand side of (5.6) into a sum over z ∈ A and z ∈ B\A and use (2.24) again,

  Σ_{z∈B} h_vz G_zy = Σ_{z∈A} a_vz G_zy + Σ^A_{z∈B} h_vz G_zy + (1/√N) Σ_{z∈A} w_vz G_zy.

We end up with the following decomposition of the error matrix:

  D = Σ_{k=1}^5 D^{(k)},

where the entries d^{(k)}_xy of the individual matrices D^{(k)} are given by

  d^{(1)}_xy := −(1/√N) Σ_{u∈B_xy} w_xu G_uy,   (5.7a)

  d^{(2)}_xy := (1/√N) Σ^{B_xy}_{u,v} Σ_{z∈A_xy} w_xu G^{B_xy}_uv a_vz G_zy,   (5.7b)

  d^{(3)}_xy := (1/√N) Σ^{B_xy}_{u,v} Σ^{A_xy}_{z∈B_xy} w_xu G^{B_xy}_uv h_vz G_zy,   (5.7c)

  d^{(4)}_xy := (1/N) Σ^{B_xy}_{u,v} Σ_{z∈A_xy} G^{B_xy}_uv (w_xu w_vz − E w_xu w_vz) G_zy,   (5.7d)

  d^{(5)}_xy := (1/N) Σ^{B_xy}_{u,v} Σ_{z∈A_xy} G^{B_xy}_uv (E w_xu w_vz) G_zy − (1/N) Σ_{u,v,z} G_uv (E w_xu w_vz) G_zy,   (5.7e)

and we set

  B_xy := B_{2N^{ε₁}}(x) ∪ B_{2N^{ε₁}}(y),  A_xy := B_{N^{ε₁}}(x) ∪ B_{N^{ε₁}}(y),   (5.8)

for some ε₁ > 0. Note that although D itself does not depend on the choice of ε₁, its decomposition into the D^{(k)} does. We will estimate each error matrix D^{(k)} separately, where the estimates may still depend on ε₁. Since ε₁ > 0 is arbitrarily small, it is eliminated from the final bounds on D using the following property of stochastic domination (Definition 3.2): If some positive random variables X, Y satisfy X ≺ N^ε Y for every ε > 0, then X ≺ Y. The following lemma provides entrywise estimates on the individual error matrices.

Lemma 5.3. Let C > 0 be a constant and ζ ∈ ℍ with dist(ζ, Spec(H^B))⁻¹ ≺ N^C for all B ⊊ X. The entries of the error matrices D^{(k)} = D^{(k)}(ζ), defined in (5.7), satisfy the bounds

  |d^{(1)}_xy| ≺ (|B_xy|/√N) ‖G‖_max,   (5.9a)

  |d^{(2)}_xy| ≺ |A_xy| N (max_{z∈A_xy} Σ^{B_xy}_v |a_vz|²)^{1/2} max_{u∉B_xy} (Im G^{B_xy}_uu/(N Im ζ))^{1/2} ‖G‖_max,   (5.9b)

  |d^{(3)}_xy| ≺ (|B_xy|/√(N Im ζ)) max_{z∈B_xy} ((Im G^{B_xy\{z}}_zz)^{1/2}/|G^{B_xy\{z}}_zz|) ‖G‖_max,   (5.9c)

  |d^{(4)}_xy| ≺ |A_xy| (N⁻¹ Σ^{B_xy}_u Im G^{B_xy}_uu/(N Im ζ))^{1/2} ‖G‖_max,   (5.9d)

  |d^{(5)}_xy| ≺ |A_xy| |B_xy| max_{k=0,…,|B_xy|−1} ( Im G^{B_k}_{x_k x_k}/(|G^{B_k}_{x_k x_k}| N Im ζ) + (Im G^{B_k}_{x_k x_k}/(N² Im ζ))^{1/2} ) ‖G‖_max + (Im G_yy/(N Im ζ))^{1/2} ‖G‖_max,   (5.9e)

where the (B_k)_{k=0}^{|B_xy|} in (5.9e) are an arbitrary increasing sequence of subsets of B_xy with B_{k+1} = B_k ∪ {x_k} for some x_k ∈ B_xy. In particular, ∅ = B₀ ⊊ B₁ ⊊ ⋯ ⊊ B_{|B_xy|−1} ⊊ B_{|B_xy|} = B_xy.

Proof. We show the estimates (5.9a) to (5.9e) one by one. The bound (5.9a) is trivial since by the bounded moment assumption (2.25) the entries of W satisfy |w_xy| ≺ 1. For the proof of (5.9b) we simply use first Cauchy-Schwarz in the v-summation of (5.7b) and then the Ward identity,

  Σ^B_u |G^B_xu(ζ)|² = Im G^B_xx(ζ)/Im ζ,  B ⊊ X, x ∉ B.   (5.10)
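The Ward identity (5.10) follows from the first resolvent identity applied to G^B and its adjoint; the following derivation is an added remark:

```latex
\[
G^B - (G^B)^*
= G^B\bigl[(H^B-\bar\zeta\mathbf{1}) - (H^B-\zeta\mathbf{1})\bigr](G^B)^*
= 2\mathrm{i}\,(\operatorname{Im}\zeta)\, G^B (G^B)^* ,
\]
% taking the (x,x)-entry:
\[
\frac{\operatorname{Im} G^B_{xx}}{\operatorname{Im}\zeta}
= \bigl(G^B (G^B)^*\bigr)_{xx}
= \sum_u |G^B_{xu}|^2 .
\]
```

The same identity is what makes the factor Im ζ appear in all the bounds (5.9b)–(5.9e).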

For (5.9c) we rewrite the entries of D^{(3)} in the form

  d^{(3)}_xy = −(1/√N) Σ^{A_xy}_{z∈B_xy} ( Σ^{B_xy}_u w_xu G^{B_xy\{z}}_uz / G^{B_xy\{z}}_zz ) G_zy,   (5.11)

where we used the Schur complement formula in the form of the general resolvent expansion identity

  G^B_uz = −G^B_zz Σ^B_v G^{B∪{z}}_uv h_vz,  B ⊊ X, u, z ∉ B.

To the u-summation in (5.11) we apply the large deviation estimate (A.57) of Lemma A.5 with the choices X_u := w_xu and b_u := G^{B_xy\{z}}_uz, i.e.

  |Σ^{B_xy}_u w_xu G^{B_xy\{z}}_uz| ≺ (Σ^{B_xy}_u |G^{B_xy\{z}}_uz|²)^{1/2}.   (5.12)

The assumption (A.55) of Lemma A.5 is an immediate consequence of the decay of correlations (2.27). In order to verify (A.56) we use both (2.27) and the N-dependent smoothness

  ‖∇_R G^B‖ = (1/√N) ‖G^B R G^B‖ ≤ N^{2C} ‖R‖,   (5.13)
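The identity in (5.13) is the standard formula for the derivative of a resolvent; an added derivation (recall that H^B = A^B + W^B/√N, so perturbing W^B in the direction R perturbs H^B by tR/√N):

```latex
\[
\nabla_{\!R}\, G^B
= \frac{d}{dt}\Bigl( H^B + \tfrac{t}{\sqrt N}\, R - \zeta\mathbf{1} \Bigr)^{-1}\Big|_{t=0}
= -\frac{1}{\sqrt N}\, G^B R\, G^B ,
\]
% and, by submultiplicativity of the operator norm,
\[
\|G^B R\, G^B\| \le \|G^B\|^2 \|R\|
\le \operatorname{dist}\bigl(\zeta, \operatorname{Spec}(H^B)\bigr)^{-2} \|R\| .
\]
```

Combined with the assumed lower bound dist(ζ, Spec(H^B)) ≥ N^{−C}, this gives the inequality in (5.13).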

of the resolvent, where ∇_R denotes the directional derivative with respect to W^B in the direction R = R* ∈ ℂ^{(N−|B|)×(N−|B|)}. For the inequality in (5.13) we used the assumption dist(ζ, Spec(H^B)) ≥ N^{−C}, which holds with high probability. By the Ward identity (5.10), the bound (5.9c) follows from (5.12) and (5.11).

To show (5.9d) we employ the quadratic large deviation result, Lemma A.6, with the choices

  b_uv := G^{B_xy}_uv,  X := (w_xu)_{u∈X\B_xy},  Y := (w_vz)_{v∈X\B_xy}.

The assumptions (A.74) and (A.75) are again easily verified using (2.27) and (5.13). Applying (A.76) to the (u, v)-summation in (5.7d) we find

  |Σ^{B_xy}_{u,v} G^{B_xy}_uv (w_xu w_vz − E w_xu w_vz)| ≺ (Σ^{B_xy}_{u,v} |G^{B_xy}_uv|²)^{1/2} = (Σ^{B_xy}_u Im G^{B_xy}_uu/Im ζ)^{1/2},

where we used (5.10) again. Finally, we turn to the proof of (5.9e). Let B_k be as in the statement of Lemma 5.3. We set

  α^{(k)}_xz := (1/N) Σ^{B_k}_{u,v} G^{B_k}_uv E w_xu w_vz,

and use a telescopic sum to write d^{(5)}_xy as

  d^{(5)}_xy = Σ_{z∈A_xy} Σ_{k=0}^{|B_xy|−1} (α^{(k+1)}_xz − α^{(k)}_xz) G_zy − (1/N) Σ_{u,v} Σ^{A_xy}_z G_uv (E w_xu w_vz) G_zy.   (5.14)

We estimate the rightmost term in (5.14) simply by

  |(1/N) Σ_{u,v} Σ^{A_xy}_z G_uv (E w_xu w_vz) G_zy|² ≤ (‖G‖²_max/N²) Σ^{A_xy}_z (Σ_{u,v} |E w_xu w_vz|)² Σ^{A_xy}_z |G_zy|² ≲ ‖G‖²_max Im G_yy/(N Im ζ),
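The telescopic structure behind (5.14) is elementary; as an added remark, with B₀ = ∅ (so that G^{B₀} = G) and B_{|B_xy|} = B_xy,

```latex
\[
\sum_{k=0}^{|B_{xy}|-1} \bigl(\alpha^{(k+1)}_{xz} - \alpha^{(k)}_{xz}\bigr)
= \alpha^{(|B_{xy}|)}_{xz} - \alpha^{(0)}_{xz}
= \frac1N \sum_{u,v}^{B_{xy}} G^{B_{xy}}_{uv}\, \mathbf{E}\, w_{xu} w_{vz}
  \;-\; \frac1N \sum_{u,v} G_{uv}\, \mathbf{E}\, w_{xu} w_{vz} .
\]
```

Each increment removes a single row and column (the one labeled x_k), which is exactly what the one-site resolvent identity below (5.16) controls via the bound (5.15).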

where the sum over u and v on the right hand side of the first inequality is bounded by a constant because of the decay of covariances (3.2), and we used (5.10) in the second inequality. Thus, (5.9e) follows from (5.14) and the bound

  |α^{(k+1)}_xz − α^{(k)}_xz| ≺ (1/N) Im G^{B_k}_{x_k x_k}/(|G^{B_k}_{x_k x_k}| Im ζ) + (1/√N) (Im G^{B_k}_{x_k x_k}/(N Im ζ))^{1/2}.   (5.15)

To show (5.15) we first see that

  α^{(k+1)}_xz − α^{(k)}_xz = −(1/N) Σ^{B_{k+1}}_{u,v} (G^{B_k}_{u x_k} G^{B_k}_{x_k v}/G^{B_k}_{x_k x_k}) E w_xu w_vz − (1/N) Σ^{B_{k+1}}_u G^{B_k}_{u x_k} E w_xu w_{x_k z} − (1/N) Σ^{B_{k+1}}_v G^{B_k}_{x_k v} E w_{x x_k} w_vz,   (5.16)

where we used the general resolvent identity

  G^B_xy = G^{B∪{u}}_xy + G^B_xu G^B_uy/G^B_uu,  B ⊊ {1, …, N}, x, y, u ∉ B, x ≠ u, y ≠ u.

The last two terms on the right hand side of (5.16) are estimated by the second term on the right hand side of (5.15), using first Cauchy-Schwarz, then the decay of covariances (3.2) and the Ward identity (5.10). For the first term in (5.16) we use the same argument as in (3.4) to see that (3.2) implies

  Σ_{u,v} |r_u t_v| |E w_xu w_vz| ≲ ‖r‖ ‖t‖,   (5.17)

for any two vectors r, t ∈ ℂ^N. We obtain (5.15) by applying (5.17) with the choice r_u := G^{B_k}_{u x_k}, t_v := G^{B_k}_{x_k v} and using the Ward identity afterwards. In this way (5.9e) follows and Lemma 5.3 is proven.

The following definition is motivated by the formula that expresses the matrix elements of G^B in terms of the matrix elements of G. For R ∈ ℂ^{N×N} and A, B ⊊ X we denote by R_{AB} := (r_xy)_{x∈A,y∈B} its submatrix. In case A = B we write R_{AB} = R_A for short. Now let x ∈ X\B. Then we have

  G^B_{X\B} = (H_{X\B} − ζ1)⁻¹ = ((G⁻¹)_{X\B})⁻¹.   (5.18)

In particular, (G^B)_{X\B} = G^B_{X\B}.

Definition 5.4. For B ⊊ X we define the ℂ^{(N−|B|)×(N−|B|)}-matrix

  M^B := ((M⁻¹)_{X\B})⁻¹.   (5.19)

Lemma 5.5. Let δ > 0 and ζ ∈ ℍ be such that

  δ ≤ dist(ζ, [κ₋, κ₊]) + ρ(ζ) ≤ 1/δ.

Then for all B ⊊ X and x, y ∉ B the matrix M^B, defined in (5.19), satisfies

  ‖M^B‖_γ ≲_δ 1,   (5.20)

for some sequence γ, depending only on δ and the model parameters. For x ∉ B we have

  |m^B_xx(ζ)| ∼_δ 1,  Im m^B_xx(ζ) ∼_δ ρ(ζ).   (5.21)

Furthermore, there is a positive constant c, depending only on K and δ, such that

  |G^B_xy(ζ) − m^B_xy(ζ)| 1(Λ(ζ) ≤ c/(1 + |B|)) ≲_δ (1 + |B|) Λ(ζ),   (5.22)

for all x, y ∉ B.

Proof. We begin by establishing upper and lower bounds on the singular values of M^B,

  ‖M^B‖ ∼_δ 1,  ‖(M^B)⁻¹‖ ∼_δ 1.   (5.23)

We will make use of the following general fact: Let R ∈ ℂ^{N×N} satisfy the upper bound ‖R‖ ≲_δ 1 as well as

  Im R ≳_δ 1  or  Re R ≳_δ 1  or  −Re R ≳_δ 1.   (5.24)

Then any submatrix R_A of R satisfies

  ‖R_A‖ ∼_δ 1,  ‖(R_A)⁻¹‖ ∼_δ 1,  A ⊆ X.   (5.25)

We verify (5.24) for R = M in two separate regimes and thus show (5.23). First let ζ be such that ρ(ζ) ≥ δ/2. Then the lower bound on the imaginary part in (5.24) follows from (4.11) and (4.9). Now let ζ be such that δ/2 ≤ dist(ζ, [κ₋, κ₊]) ≤ δ⁻¹. Then we may also assume that dist(Re ζ, [κ₋, κ₊]) ≥ δ/4, because otherwise Im ζ ≥ δ/4 and thus ρ(ζ) ≳_δ 1; in this situation the claim follows from the case that we already considered, namely ρ(ζ) ≥ δ/2, because there δ was arbitrary. Since M is the Stieltjes transform of a positive semidefinite matrix-valued measure with support in [κ₋, κ₊] (cf. (2.5)), its real part is positive definite to the left of κ₋ and negative definite to the right of κ₊. In both cases we also have the effective bound |Re M| ≳_δ 1 because dist(Re ζ, [κ₋, κ₊]) ≥ δ/4.

Now we apply (5.23) to see (5.20). By (2.26) and (2.13), the right hand side of (2.22), and with it M⁻¹, has faster than power law decay. The same is true for its submatrix with indices in X\B. Thus (5.20) follows directly from the definition (5.19) of M^B, the upper bound on its singular values from (5.23), and Lemma A.2.

To prove (5.21) we use

  Im M^B = −(M^B)* (Im(M⁻¹)_{X\B}) M^B ∼_δ −Im(M⁻¹)_{X\B} ∼_δ ρ 1,

where we applied (5.23) for the first comparison relation and used −Im M⁻¹ ∼_δ ρ 1 (cf. (4.11) and (4.10)) for the second. The bound on Im m^B_xx in (5.21) follows, and the bound on |m^B_xx| follows at least in the regime ρ(ζ) ≥ δ/2. We are left with showing |m^B_xx| ≳_δ 1 in the case δ/2 ≤ dist(ζ, [κ₋, κ₊]) ≤ δ⁻¹. As we did above, we may assume that dist(Re ζ, [κ₋, κ₊]) ≥ δ/4. We restrict to Re ζ ≤ κ₋ − δ/4; the case Re ζ ≥ κ₊ + δ/4 is treated analogously. In this regime

  Re M^B = (M^B)* Re(M⁻¹)_{X\B} M^B ∼_δ Re(M⁻¹)_{X\B} ∼_δ 1,

where we used Re M⁻¹ = (M⁻¹)* (Re M) M⁻¹ ∼_δ Re M ∼_δ 1 for the last comparison relation. Thus, (5.21) follows.

Now we show (5.22). By the Schur complement formula we have for any T ∈ ℂ^{N×N} the identity

  (T^B)_{ {x,y} } = T_{ {x,y} } − T_{ {x,y} B} (T_B)⁻¹ T_{B {x,y} } = ( ((T_{B∪{x,y}})⁻¹)_{ {x,y} } )⁻¹,   (5.26)

for x, y ∉ B and T^B := ((T⁻¹)_{X\B})⁻¹, provided all inverses exist. We will use this identity for T = M, G. Note that this definition of T^B with T = G is consistent with the definition (5.4) on the index set X\B because of (5.18). Recalling that G_{B∪{x,y}} = (G_{u,v})_{u,v∈B∪{x,y}} and M_{B∪{x,y}} are matrices of dimension |B| + 2, we have

  ‖G_{B∪{x,y}} − M_{B∪{x,y}}‖ ≤ (|B| + 2) ‖G_{B∪{x,y}} − M_{B∪{x,y}}‖_max ≤ (|B| + 2) Λ.

Therefore, as long as (|B| + 2) Λ ‖(M_{B∪{x,y}})⁻¹‖ ≤ 1/2 we get

  ‖(G_{B∪{x,y}})⁻¹ − (M_{B∪{x,y}})⁻¹‖ ≤ 2 ‖(M_{B∪{x,y}})⁻¹‖² ‖G_{B∪{x,y}} − M_{B∪{x,y}}‖ ≲_δ (1 + |B|) Λ,

where we used in the last step that ‖(M_{B∪{x,y}})⁻¹‖ ∼_δ 1, which follows from using (5.24) and (5.25) for the choice R = M in the regimes ρ ≳_δ 1 and dist(Re ζ, [κ₋, κ₊]) ≳_δ 1, respectively. Again using the definite signs of the imaginary and real parts of M, as well as those of (M_{B∪{x,y}})⁻¹, in these two regimes, we infer that

  ‖( ((M_{B∪{x,y}})⁻¹)_{ {x,y} } )⁻¹‖ ∼_δ 1,

as well. We conclude that there is a constant c, depending only on δ and K, such that

  ‖( ((G_{B∪{x,y}})⁻¹)_{ {x,y} } )⁻¹ − ( ((M_{B∪{x,y}})⁻¹)_{ {x,y} } )⁻¹‖ 1(Λ ≤ c/(1 + |B|)) ≲_δ (1 + |B|) Λ.

With the identity (5.26) the claim (5.22) follows and Lemma 5.5 is proven.
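The identity (5.26) rests on the standard block Schur complement formula; the following added reminder states the underlying linear-algebra fact:

```latex
% For an invertible block matrix with invertible lower-right block,
\[
T = \begin{pmatrix} T_{AA} & T_{AB} \\ T_{BA} & T_{BB} \end{pmatrix},
\qquad
\bigl((T^{-1})_{AA}\bigr)^{-1} \;=\; T_{AA} - T_{AB}\, T_{BB}^{-1}\, T_{BA} .
\]
```

Applying this once with A = X\B identifies T^B = ((T⁻¹)_{X\B})⁻¹ with the Schur complement of T_B in T, and once more inside the (|B|+2)-dimensional matrix T_{B∪{x,y}} with A = {x, y} gives the second equality in (5.26): both expressions equal the same Schur complement T_{ {x,y} } − T_{ {x,y} B}(T_B)⁻¹T_{B {x,y} }.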

Proof of Lemmas 3.3 and 5.1. We begin with the proof of (3.7). We continue the estimates on the error matrices listed in Lemma 5.3. Therefore, we fix ζ = τ + i with |τ| ≤ C. Since Im ζ = 1, we have the trivial resolvent bound

  ‖G^B‖ ≤ 1,   (5.27)

for any B ⊆ X. We also have a lower bound on the diagonal elements,

  1/|G^B_xx| ≺ 1,  x ∉ B.   (5.28)

Indeed, by the Schur complement formula applied to the (x, x)-element of the resolvent G^B = (H^B − ζ1)⁻¹ we have

  −1/G^B_xx = ζ − a_xx + Σ^B_{u,v} h_xu G^{B∪{x}}_uv h_vx.

We take the absolute value on both sides and estimate trivially,

  1/|G^B_xx| ≤ |ζ| + |a_xx| + ‖G^{B∪{x}}‖ Σ_u |h_xu|² ≺ 1.

Here we used (5.27) to control the norm of the resolvent and the assumptions (2.26) and (2.25) to bound Σ_u |h_xu|². By the choice of A_xy and B_xy from (5.8) and assumption (2.11) we also have

  |A_xy| ≤ |B_xy| ≺ N^{ε₁P}.   (5.29)

We apply (5.27), (5.28) and (5.29) to estimate the right hand sides of the bounds in (5.9) further and find

  |d_xy| ≺ N^{2ε₁P}/√N + N^{ε₁P} N κ₂(ν)/N^{ε₁ν},   (5.30)

for any ν ∈ ℕ. Here we also used that by assumption (2.26), for any ν ∈ ℕ the expectation matrix satisfies

  |a_vz| ≤ κ₂(ν)/N^{ε₁ν},  z ∈ A_xy, v ∉ B_xy,   (5.31)

to obtain the second summand on the right hand side of (5.30) from estimating |d^{(2)}_xy|. Since ε₁ > 0 was arbitrary, (5.30) implies (3.7).

Now we prove (3.8) and (5.2) in tandem. Let δ > 0 and ζ ∈ ℍ such that δ ≤ dist(ζ, [κ₋, κ₊]) + ρ(ζ) ≤ δ⁻¹ and Im ζ ≥ N^{−1+ε}. We show that

  ‖D‖_max 1(Λ ≤ N^{−ε}) ≺ ((ρ + Λ)/(N Im ζ))^{1/2}.   (5.32)

From (5.32) the bound (3.8) follows immediately in the regime where ρ ≤ δ. Also (5.2) follows from (5.32). Indeed, in the regime of spectral parameters ζ ∈ ℍ with δ ≤ dist(ζ, [κ₋, κ₊]) ≤ δ⁻¹ we have ρ ∼_δ Im ζ because ρ is the harmonic extension of a probability density supported inside [κ₋, κ₊]. For the proof of (5.32) we use (5.22), (5.21), (5.31), (5.29) and ‖G‖_max ≲ 1 + Λ (cf. (4.9)) to estimate the right hand side of each inequality in (5.9). In this way we get

  ‖D‖_max 1(Λ ≤ N^{−ε}) ≺ N^{2ε₁P} ((ρ + Λ)/(N Im ζ))^{1/2} + N^{3/2} N^{2ε₁P} (κ₂(ν)/N^{ε₁ν}) ((ρ + Λ)/(N Im ζ))^{1/2},   (5.33)

for any ν ∈ ℕ, provided ε > ε₁P to ensure N^{−ε} ≤ c/|B_xy|, i.e. that the constraint Λ ≤ N^{−ε} makes (5.22) applicable. Here, we also used ρ ≳_δ Im ζ to see that (ρ + Λ)/(N Im ζ) ≳_δ 1/N. Since (5.33) holds for arbitrarily small ε₁ > 0, the claim (5.32), and with it Lemmas 3.3 and 5.1, are proven.

6  Fluctuation averaging

The main result of this section is Proposition 3.4, by which an error bound Ψ for the entrywise local law can be used to improve the bound on the error matrix D to Ψ², once D is averaged against a non-random matrix R with faster than power law decay.

Proof of Proposition 3.4. Let R ∈ ℂ^{N×N} with ‖R‖_β ≤ 1 for some positive sequence β. Within this proof we use Convention 4.1 such that φ ≲ ψ means φ ≤ Cψ for a constant C depending only on P̃ := (K, δ, ε₁, β, C), where C and δ are the constants from the statement of the proposition, K are the model parameters (cf. (2.29)) and ε₁ enters in the splitting of the error matrix D into the D^{(k)} (cf. (5.8)). Note that since ε₁ is arbitrary, it suffices to show (3.13) up to factors of N^{ε₁}. We will also use the notation O_≺(Φ) for a random variable that is stochastically dominated by some nonnegative Φ.

We split the expression ⟨R, D⟩ from (3.13) according to the definition (5.7) of the matrices D^{(k)}. Then we estimate ⟨R, D^{(k)}⟩ separately for every k. We do this in three steps. First we estimate ‖D^{(k)}‖_max for k = 2, 3, 5 directly, without using the averaging effect of R. Afterwards we show the bounds on ⟨R, D^{(1)}⟩ and ⟨R, D^{(4)}⟩, respectively.

In the upcoming arguments the following observation will be useful. The local law (3.12) together with (5.22) implies that for every B ⊆ X with |B| ≤ N^{ε/2} we have

  ‖M^B − G^B‖_max ≺ (1 + |B|) Ψ.   (6.1)

Here, until the end of this proof, we consider G^B as the ℂ^{(N−|B|)×(N−|B|)}-matrix G^B = (G^B_xy)_{x,y∉B}, as opposed to the general convention (5.3).

Estimating ‖D^{(k)}‖_max: Here we show that under the assumption (3.12) the error matrices with indices k = 2, 3, 5 satisfy the improved entrywise bound

  ‖D^{(k)}‖_max ≺ N^{3ε₁P} Ψ²,  k = 2, 3, 5,   (6.2)

where ε₁ stems from (5.8) and P is the model parameter from (2.11). We start by estimating the entries of D^{(2)}. Directly from its definition in (5.7b) we infer

  |d^{(2)}_xy| ≺ N^{3/2} |A_xy| ‖G‖_max ‖G^{B_xy}‖_max max_{z∈A_xy} Σ^{B_xy}_v |a_vz|.

The maximum norms of the entries of the resolvents G and G^{B_xy} are bounded by (6.1) and (5.20). The decay (2.12) of the entries of the bare matrix, and the fact that d(v, z) ≥ N^{ε₁} in the last sum, then imply ‖D^{(2)}‖_max ≺ N^{−ν} for any ν ∈ ℕ.

To show (6.2) for k = 3 we use the representation (5.11) and the large deviation estimate (5.12), just as we did in the proof of Lemma 5.3. In this way we get

  |d^{(3)}_xy| ≺ (|B_xy|/√N) max_{z∉A_xy} (Σ^{B_xy}_u |G^{B_xy\{z}}_uz|²)^{1/2} |G_zy|/|G^{B_xy\{z}}_zz|.

Now we use (6.1), (5.20), (5.21) and (5.29) to conclude

  ‖D^{(3)}‖_max ≺ (N^{ε₁P}/√N) (Ψ + max_{z∉A_xy} |m_zy|).   (6.3)

The faster than power law decay of M from (2.15), together with the definition of A_xy in (5.8), implies

  max_{z∉A_xy} |m_zy| ≤ C(ν) N^{−ε₁ν},   (6.4)

for any ν ∈ ℕ. Since Ψ ≥ N^{−1/2}, we infer (6.2) for k = 3 from (6.3).

Finally we consider the case k = 5. We follow the proof of Lemma 5.3 and use the representation (5.14). We estimate the two summands on the right hand side of (5.14), starting with the second term. We rewrite this term in the form Σ^{A_xy}_z S_xz[G] G_zy and use (2.13) as well as ‖G‖_max ≤ ‖M‖_max + Ψ together with the upper bound on M in (2.15). To bound the first term on the right hand side of (5.14) we use (5.16). Each of the three terms on the right hand side of (5.16) has to be bounded by N^{2ε₁P} Ψ². The second and third terms are bounded by 1/N by the decay of covariances (3.2). For the first term we use (5.17), (6.1) and (5.20).

Estimating ⟨R, D^{(1)}⟩: Here we will show that

  |⟨R, D^{(1)}⟩| ≺ N^{CPε₁} Ψ²,   (6.5)

for some numerical constant C > 0. We split the error matrix D^{(1)} into two pieces D^{(1)} = D^{(1a)} + D^{(1b)}, defined by

  d^{(1a)}_xy := −(1/√N) Σ_{u∈B_xy} w_xu m_uy,  and  d^{(1b)}_xy := −(1/√N) Σ_{u∈B_xy} w_xu (G_uy − m_uy),

where B_xy is a 2N^{ε₁}-environment of the set {x, y} (cf. (5.8)). The matrix D^{(1b)} we estimate simply by

  ‖D^{(1b)}‖_max ≺ (N^{ε₁P}/√N) Ψ ≤ N^{ε₁P} Ψ²,   (6.6)

where we used (5.29) and (6.1). For the bound on ⟨R, D^{(1a)}⟩ we write

  ⟨R, D^{(1a)}⟩ = X + Y + Z,   (6.7)

where the three summands on the right hand side are defined as

  X := Σ_x Σ^{B¹_x}_y Σ_{u∈B²_x} σ_xuy,  Y := Σ_x Σ^{B¹_x}_y Σ_{u∈B²_y} σ_xuy,  Z := Σ_x Σ_{y∈B¹_x} Σ_{u∈B_xy} σ_xuy,

and we introduced the short notations

  σ_xuy := (1/N^{3/2}) r̄_xy w_xu m_uy,  and  B^k_x := B_{kN^{ε₁}}(x).

The faster than power law decay of both R and M (cf. (2.15)), i.e. that |r_xy| + |m_xy| ≲ 1/N for d(x, y) ≥ N^{ε₁}, yields the high moment bounds

  E|X|^{2µ} ≲_µ (1/N^{5µ}) Σ_x Σ_{u∈B²_x} |E Π_{i=1}^µ w_{x_iu_i} w̄_{x_{µ+i}u_{µ+i}}|,   (6.8a)

  E|Y|^{2µ} ≲_µ (N^{2Pε₁µ}/N^{5µ}) Σ_{x,u} |E Π_{i=1}^µ w_{x_iu_i} w̄_{x_{µ+i}u_{µ+i}}|,   (6.8b)

  E|Z|^{2µ} ≲_µ (N^{2Pε₁µ}/N^{3µ}) Σ_x Σ_{u∈B³_x} |E Π_{i=1}^µ w_{x_iu_i} w̄_{x_{µ+i}u_{µ+i}}|,   (6.8c)

where the sums are over index tuples x = (x₁, …, x_{2µ}) ∈ X^{2µ}, and B^k_x := B_{kN^{ε₁}}(x) is the ball around x with respect to the product metric

  d(x, y) := max_{i=1,…,2µ} d(x_i, y_i).

In (6.8c) we have used the triangle inequality to conclude that d(u, x) ≤ 3N^{ε₁}. By trivial bounds we see that the right hand side of (6.8a) is ≲_µ N^{2Pε₁µ} N^{−3µ}. This suffices to show (6.5). For Y and Z we continue the estimates in (6.8) by using the decay of correlations (2.27) and the ensuing lumping of index pairs (x_i, u_i):

  |E Π_{i=1}^µ w_{x_iu_i} w̄_{x_{µ+i}u_{µ+i}}| ≲_{µ,ν} { N^{−ν} if ∃ i : d_sym((x_i, u_i), {(x_j, u_j) : j ≠ i}) ≥ N^{ε₁};  1 otherwise },   (6.9)

where d_sym((x₁, x₂), (y₁, y₂)) := d((x₁, x₂), {(y₁, y₂), (y₂, y₁)}) is the symmetrized distance on X², induced by d. Inserting (6.9) into the moment bounds on Y and Z effectively reduces the combinatorics of the sum in (6.8b) from N^{4µ} to N^{2µ} and in (6.8c) from N^{2µ} to N^µ. We conclude that

  |⟨R, D^{(1a)}⟩| ≺ N^{CPε₁}/N.

Combining this with (6.6) implies (6.5) because Ψ ≥ N^{−1/2}.

Estimating ⟨R, D^{(4)}⟩: Similarly to the strategy for estimating |⟨R, D^{(1)}⟩|, we write

  D^{(4)} = D^{(4a)} + D^{(4b)},  d^{(4a)}_xy := Σ_{z∈A_xy} Z^{B_xy}_xz m_zy,  d^{(4b)}_xy := Σ_{z∈A_xy} Z^{B_xy}_xz (G_zy − m_zy),

where we introduced, for any B ⊊ X, the shorthand

  Z^B_xz := (1/N) Σ^B_{u,v} G^B_uv (w_xu w_vz − E w_xu w_vz).   (6.10)

From the decay of correlations (2.27) and d({x, z}, X\B_xy) ≥ N^{ε₁} for any z ∈ A_xy, as well as the N-dependent smoothness of the resolvent as a function of the matrix entries of W for dist(ζ, Spec(H)) ≥ N^{−C}, we see that Lemma A.6 can be applied for a large deviation estimate on the (u, v)-sum in the definition (6.10) of Z^B_xz for B = B_xy, i.e.

  |Z^{B_xy}_xz| ≺ ((1/N²) Σ^{B_xy}_{u,v} |G^{B_xy}_uv|²)^{1/2} ≺ N^{ε₁P} Ψ.   (6.11)

Here we also used (6.1) and (5.20) for the second stochastic domination bound. Combining (6.11) with (6.1) we see that

  ‖D^{(4b)}‖_max ≺ N^{2ε₁P} Ψ².   (6.12)

The rest of the proof of Proposition 3.4 is dedicated to showing the high moment bound

  E|⟨R, D^{(4a)}⟩|^{2µ} ≲_µ N^{C(µ)ε₁} Ψ^{2µ}.   (6.13)

Together with (6.2), (6.5) and (6.12) this bound implies (3.13), since ε₁ can be chosen arbitrarily small. In analogy to (6.7) we write ⟨R, D^{(4a)}⟩ = X + Y + Z for the three summands

  X := Σ_x Σ^{B¹_x}_y Σ_{z∈B¹_x} σ_xzy,  Y := Σ_x Σ^{B¹_x}_y Σ_{z∈B¹_y} σ_xzy,  Z := Σ_x Σ_{y∈B¹_x} Σ_{z∈A_xy} σ_xzy,

with the shorthand

  σ_xzy := (1/N) r̄_xy Z^{B_xy}_xz m_zy.

Similarly to (6.8), the faster than power law decay of R and M implies the moment bounds

  E|X|^{2µ} ≲_µ (1/N^{6µ}) Σ_{x,y} Σ_{z∈B¹_x} |E Π_{i=1}^µ Z^{B_{x_iy_i}}_{x_iz_i} Z̄^{B_{x_{µ+i}y_{µ+i}}}_{x_{µ+i}z_{µ+i}}|,   (6.14a)

  E|Y|^{2µ} ≲_µ (1/N^{4µ}) Σ_{x,y} Σ_{z∈B¹_y} |E Π_{i=1}^µ Z^{B_{x_iy_i}}_{x_iz_i} Z̄^{B_{x_{µ+i}y_{µ+i}}}_{x_{µ+i}z_{µ+i}}|,   (6.14b)

  E|Z|^{2µ} ≲_µ (1/N^{2µ}) Σ_x Σ_{y∈B¹_x} Σ_{z∈B¹_x∪B¹_y} |E Π_{i=1}^µ Z^{B_{x_iy_i}}_{x_iz_i} Z̄^{B_{x_{µ+i}y_{µ+i}}}_{x_{µ+i}z_{µ+i}}|.   (6.14c)

Using the a priori bound (6.11), it trivially follows that E|X|^{2µ} is bounded by C(µ) N^{4ε₁Pµ} N^{−2µ} Ψ^{2µ}. Since this is already sufficient for (6.14a), we focus on the terms Y and Z in (6.14). We call the subscripts i of the indices x_i and z_i labels. In order to further estimate the moments of Y and Z we introduce the set of lone labels of (x, z):

  L(x, z) := { i : d({x_i, z_i}, ⋃_{j≠i} {x_j, z_j}) ≥ 3N^{ε₁} }.   (6.15)

The corresponding index pair (x_i, z_i) for i ∈ L(x, z) is called a lone index pair. We partition the sums in (6.14b) and (6.14c) according to the number of lone labels, i.e. we insert the partition of unity

  1 = Σ_{ℓ=0}^{2µ} 1(|L(x, z)| = ℓ).

A simple counting argument reveals that fixing the number of lone labels reduces the combinatorics of the sums in (6.14b) and (6.14c). More precisely,

  Σ_{x,y} Σ_{z∈B¹_y} 1(|L(x, z)| = ℓ) ≤ N^{Cµε₁} N^{2µ+ℓ},  Σ_x Σ_{y∈B¹_x} Σ_{z∈B¹_x∪B¹_y} 1(|L(x, z)| = ℓ) ≤ N^{Cµε₁} N^{µ+ℓ/2}.   (6.16)

The expectation in (6.14b) and (6.14c) is bounded using the following technical result.

Lemma 6.1 (Key estimate for averaged local law). Assume the hypotheses of Proposition 3.4 hold, let µ ∈ ℕ and x, y ∈ X^{2µ}. Suppose there are 2µ subsets Q₁, …, Q_{2µ} of X such that B_{N^{ε₁}}(x_i, y_i) ⊆ Q_i ⊆ B_{3N^{ε₁}}(x_i, y_i) for each i. Then

  |E Π_{i=1}^µ Z^{(Q_i)}_{x_iy_i} Z̄^{(Q_{µ+i})}_{x_{µ+i}y_{µ+i}}| ≲_µ N^{C(µ)ε₁} Ψ^{2µ+|L(x,y)|}.   (6.17)

Using (6.16) and Lemma 6.1 on the right hand sides of (6.14b) and (6.14c), after partitioning according to the number of lone labels, yields

  E|Y|^{2µ} ≲_µ (N^{C(µ)ε₁}/N^{4µ}) Σ_{ℓ=0}^{2µ} Ψ^{2µ+ℓ} N^{2µ+ℓ},  E|Z|^{2µ} ≲_µ (N^{C(µ)ε₁}/N^{2µ}) Σ_{ℓ=0}^{2µ} Ψ^{2µ+ℓ} N^{µ+ℓ/2}.   (6.18)

Since Ψ ≥ N^{−1/2}, the high moment bounds in (6.18) together with (6.14a) imply (6.13). This finishes the proof of Proposition 3.4, up to verifying Lemma 6.1, which will occupy the rest of the section.

Proof of Lemma 6.1. Let us consider the data ξ := (x, y, (Q_i)_{i=1}^{2µ}) fixed. We start by writing the product on the left hand side of (6.17) in the form

  Π_{i=1}^µ Z^{(Q_i)}_{x_iy_i} Z̄^{(Q_{µ+i})}_{x_{µ+i}y_{µ+i}} = Σ_{u,v} w_{x,y}(u, v) Γ_ξ(u, v),   (6.19)

where the two auxiliary functions Γ_ξ, w_{x,y} : X^{2µ} × X^{2µ} → ℂ are defined by

  Γ_ξ(u, v) := 1{ u_i, v_i ∉ Q_i, ∀ i = 1, …, 2µ } Π_{i=1}^µ G^{(Q_i)}_{u_iv_i} Ḡ^{(Q_{µ+i})}_{u_{µ+i}v_{µ+i}},   (6.20a)

  w_{x,y}(u, v) := Π_{i=1}^µ w_{x_iy_i}(u_i, v_i) w̄_{x_{µ+i}y_{µ+i}}(u_{µ+i}, v_{µ+i}),   (6.20b)

and

  w_{xy}(u, v) := (1/N) (w_xu w_vy − E w_xu w_vy).   (6.21)

In order to estimate (6.19) we partition the sum over the indices u_i and v_i depending on their distance from the set of lone index pairs (x_i, y_i) with i ∈ L, where L = L(x, y). To this end we introduce the partition {B_i : i ∈ {0} ∪ L} of X,

  B_i := B_{N^{ε₁}}(x_i) ∪ B_{N^{ε₁}}(y_i) when i ∈ L,  B₀ := X \ ⋃_{j∈L} (B_{N^{ε₁}}(x_j) ∪ B_{N^{ε₁}}(y_j)),   (6.22)

and the shorthand

  B(ξ, σ) := { (u, v) ∈ X^{4µ} : u_i ∈ B_{σ_i¹}\Q_i, v_i ∈ B_{σ_i²}\Q_i, i = 1, …, 2µ },   (6.23)

if any. For any fixed ξ, as σ runs through all possible elements of pt0u Y Lq4µ , the sets Bpξ, σq form a partition of the summation set on the right hand side of (6.19) (taking into account the restriction ui , vi R Qi ). Therefore it will be sufficient to estimate ÿ wx,y pu, vq Γξ pu, vq (6.24) pu,vqPBpξ,σq

for every fixed $\sigma \in (\{0\} \cup L)^{4\mu}$. Since $x_i$ and $y_i$ are fixed, while $u_i$ and $v_i$ are free variables with their domains depending on $(\xi,\sigma)$, we say that the former are external indices and the latter are internal indices. Let us define the set of isolated labels,
\[
  \widehat{L}(x,y,\sigma) \;=\; L(x,y) \setminus \{ \sigma_{11}, \dots, \sigma_{2\mu,1}, \sigma_{12}, \dots, \sigma_{2\mu,2} \} ,
  \tag{6.25}
\]
so that if an external index has an isolated label as subscript, then it is isolated from all the other indices in the following sense:
\[
  d\Bigl( \{x_i, y_i\} \,,\; \bigcup_{j=1}^{2\mu} \{u_j, v_j\} \cup \bigcup_{j \ne i} \{x_j, y_j\} \Bigr) \;\ge\; N^{\varepsilon_1} ,
  \qquad (u,v) \in \mathbb{B}(\xi,\sigma) \,,\ i \in \widehat{L}(x,y,\sigma) .
\]

Notice that isolated labels indicate not only separation from all other external indices, as lone labels do, but also from all internal indices. Given a resolvent entry $G^{B}_{uv}$, we will refer to $u, v$ as lower indices and to the set $B$ as an upper index set. The next lemma, whose proof we postpone until the end of this section, yields an algebraic representation for (6.24), provided the internal indices are properly restricted.

Lemma 6.2 (Monomial representation). Let $\xi$ and $\sigma$ be fixed. Then the restriction $\Gamma_{\xi}|_{\mathbb{B}(\xi,\sigma)}$ of the function (6.20a) to the subset $\mathbb{B}(\xi,\sigma)$ of $\mathbb{X}^{4\mu}$ has a representation
\[
  \Gamma_{\xi}|_{\mathbb{B}(\xi,\sigma)} \;=\; \sum_{\alpha=1}^{M(\xi,\sigma)} \Gamma_{\xi,\sigma,\alpha} ,
  \tag{6.26}
\]
in terms of
\[
  M(\xi,\sigma) \;\lesssim_{\mu}\; N^{C(\mu)\varepsilon_1}
  \tag{6.27}
\]
(signed) monomials $\Gamma_{\xi,\sigma,\alpha} : \mathbb{B}(\xi,\sigma) \to \mathbb{C}$, such that $\Gamma_{\xi,\sigma,\alpha}(u,v)$ for each $\alpha$ is of the form
\[
  (-1)^{\#} \prod_{t=1}^{n} \bigl( G^{E_t}_{a_t b_t} \bigr)^{\#}
  \prod_{r=1}^{q} \frac{1}{\bigl( G^{F_r}_{w_r w_r} \bigr)^{\#}}
  \prod_{r \in R^{(1)}} \bigl( G^{U_r}_{u_r v_r} \bigr)^{\#}
  \prod_{t \in R^{(2)}} \bigl( G^{U'_t}_{u_t u'_t}\, G^{V'_t}_{v'_t v_t} \bigr)^{\#} .
  \tag{6.28}
\]
Here the notations $(-1)^{\#}$ and $(\,\cdot\,)^{\#}$ indicate possible signs and complex conjugations, respectively, that may depend only on $(\xi,\sigma,\alpha)$ and that will be irrelevant for our estimates. The dependence on $(\xi,\sigma,\alpha)$ has been suppressed in the notation, e.g., $n = n(\xi,\sigma,\alpha)$, $U_r = U_r(\xi,\sigma,\alpha)$, etc. The numbers $n$ and $q$ of factors in (6.28) are bounded, $n + q \lesssim_{\mu} 1$. Furthermore, for any fixed $\alpha$ the two subsets $R^{(k)}$, $k = 1,2$, form a partition of $\{1, \dots, 2\mu\}$, and the monomials (6.28) have the following properties:

1. The lower indices $a_t, b_t, u'_t, v'_t, w_t$ are in $\bigcup_{i \in \widehat{L}} B_i$, and $d(a_t, b_t) \ge N^{\varepsilon_1}$.

2. The upper index sets $E_t, F_r, U_r, U'_r, V'_r$ are bounded in size by $N^{C(\mu)\varepsilon_1}$, and $Q_r \subseteq U_r, U'_r, V'_r$. The total number of these sets appearing in the expansion (6.26) is bounded by $N^{C(\mu)\varepsilon_1}$.

3. At least one of the following two statements is always true:
\[
  \text{(I)}\quad \exists\, i \in \widehat{L} \,,\ \text{s.t.}\quad
  B_i \;\subseteq\; \bigcap_{t=1}^{n} E_t \,\cap\, \bigcap_{r=1}^{q} F_r \,\cap\, \bigcap_{r \in R^{(1)}} U_r \,\cap\, \bigcap_{t \in R^{(2)}} \bigl( U'_t \cap V'_t \bigr) ;
\]
\[
  \text{(II)}\quad n + |R^{(1)}| + 2\,|R^{(2)}| \;\ge\; 2\mu + |\widehat{L}| .
\]

Since Lemma 6.1 relies heavily on this representation, we make a few remarks:

(i) Monomials with different values of $\alpha$ may be equal. The indices $a_t, b_t, u'_t, v'_t, w_t$ may overlap, but they are always distinct from the internal indices, since from (6.23) and (6.25) we see that
\[
  \{ u_r, v_r \}_{r=1}^{2\mu} \;\subseteq\; \mathbb{X} \setminus \bigcup_{i \in \widehat{L}} B_i ,
  \qquad (u,v) \in \mathbb{B}(\xi,\sigma) .
\]

(ii) The reciprocals of the resolvent entries are not important for our analysis, because the diagonal resolvent entries are comparable to 1 in absolute value when a local law holds (cf. (5.21)).

(iii) Property 3 asserts that each monomial is either a deterministic function of $H^{(B_i)}$ for some isolated label $i$, and consequently almost independent of the rows/columns of $H$ labeled by $x_i, y_i$ (Case (I)), or the monomial contains at least $|\widehat{L}|$ additional off-diagonal resolvent factors (Case (II)). In the second case, each of these extra factors will provide an additional factor $\Psi$ for typical internal indices, due to the faster than power law decay of $M$ and the local law (3.12). Atypical internal indices, e.g. when $u_r$ and $v_r$ are close to each other, do not give a factor $\Psi$, since $m_{u_r v_r}$ is not small; but there are much fewer atypical indices than typical ones, and this entropy factor makes up for the lack of smallness. These arguments will be made rigorous in Lemma 6.3 below.

By using the monomial sum representation (6.26) in (6.24), and estimating each summand separately, we obtain
\[
  \Bigl|\, \mathbb{E}\, \prod_{i=1}^{\mu} Z^{(Q_i)}_{x_i y_i}\, \overline{Z}^{(Q_{\mu+i})}_{x_{\mu+i} y_{\mu+i}} \Bigr|
  \;\lesssim_{\mu}\; N^{C(\mu)\varepsilon_1}\, \max_{\sigma}\, \max_{\alpha=1}^{M(\xi,\sigma)}\,
  \Bigl|\, \mathbb{E} \sum_{(u,v) \in \mathbb{B}(\xi,\sigma)} w_{x,y}(u,v)\, \Gamma_{\xi,\sigma,\alpha}(u,v) \Bigr| ,
  \tag{6.29}
\]

where the factor $N^{C(\mu)\varepsilon_1}$ originates from (6.27), and we have bounded the summation over $\sigma$ by a $\mu$-dependent constant. Thus (6.17) holds if we show, uniformly in $\alpha$, that
\[
  \Bigl|\, \mathbb{E} \sum_{(u,v) \in \mathbb{B}(\xi,\sigma)} w_{x,y}(u,v)\, \Gamma_{\xi,\sigma,\alpha}(u,v) \Bigr|
  \;\le\; N^{C(\mu)\varepsilon_1}\, N^{-\frac{1}{2}\bigl( |L(x,y)| - |\widehat{L}(x,y,\sigma)| \bigr)}\, \Psi^{\,2\mu + |\widehat{L}(x,y,\sigma)|} .
  \tag{6.30}
\]

In order to prove this bound, we fix $\alpha$ and sum over the internal indices to get
\[
  \Bigl|\, \mathbb{E} \sum_{(u,v) \in \mathbb{B}(\xi,\sigma)} w_{x,y}(u,v)\, \Gamma_{\xi,\sigma,\alpha}(u,v) \Bigr|
  \;\le\; \mathbb{E}\, \prod_{t=1}^{n} \bigl| G^{E_t}_{a_t b_t} \bigr|
  \prod_{r=1}^{q} \bigl| G^{F_r}_{w_r w_r} \bigr|^{-1}
  \prod_{r \in R^{(1)}} \Theta^{(1)}_r
  \prod_{r \in R^{(2)}} \Theta^{(2)}_r ,
  \tag{6.31}
\]
where we have used the formula (6.28) for the monomial $\Gamma_{\xi,\sigma,\alpha}$. The sums over the internal

indices have been absorbed into the following factors:
\[
  \Theta^{(1)}_r \;:=\; \Bigl|\, \sum_{u \in B_{\sigma_{r1}} \setminus Q_r}\ \sum_{v \in B_{\sigma_{r2}} \setminus Q_r} w_{x_r y_r}(u,v)\, G^{U_r}_{uv} \Bigr| ,
  \qquad r \in R^{(1)} ,
  \tag{6.32}
\]
\[
  \Theta^{(2)}_r \;:=\; \Bigl|\, \sum_{u \in B_{\sigma_{r1}} \setminus Q_r}\ \sum_{v \in B_{\sigma_{r2}} \setminus Q_r} w_{x_r y_r}(u,v)\, G^{U'_r}_{u u'_r}\, G^{V'_r}_{v'_r v} \Bigr| ,
  \qquad r \in R^{(2)} .
\]

The right hand side of (6.31) will be bounded using the following three estimates which follow by combining the monomial representation with our previous stochastic estimates.

Lemma 6.3 (Three sources of smallness). Consider an arbitrary monomial $\Gamma_{\xi,\sigma,\alpha}$ of the form (6.28). Then, under the hypotheses of Proposition 3.4, the following three estimates hold:

1. The resolvent entries with no internal lower indices are small, while the reciprocals of the resolvent entries are bounded, in the sense that
\[
  \bigl| G^{E_t}_{a_t b_t} \bigr| \;\prec\; \Psi ,
  \qquad
  \bigl| G^{F_r}_{w_r w_r} \bigr|^{-1} \;\prec\; 1 .
  \tag{6.33}
\]

2. If $\Gamma_{\xi,\sigma,\alpha}$ satisfies (I) of Property 3 of Lemma 6.2, then its contribution is very small, in the sense that
\[
  \bigl| \mathbb{E}\, w_{x,y}(u,v)\, \Gamma_{\xi,\sigma,\alpha}(u,v) \bigr| \;\lesssim_{\mu,\nu}\; N^{-\nu} ,
  \qquad (u,v) \in \mathbb{B}(\xi,\sigma) .
  \tag{6.34}
\]

3. Sums over the internal indices around external indices with lone labels yield extra smallness:
\[
  \Theta^{(k)}_r \;\prec\; N^{C(\mu)\varepsilon_1}\, N^{-\frac{1}{2}|\sigma_r|_*}\, \Psi^{k} ,
  \qquad 1 \le r \le 2\mu \,,\ k = 1,2 ,
  \tag{6.35}
\]
where $|\sigma_r|_* := |\{0, \sigma_{r1}, \sigma_{r2}\}| - 1$ counts how many, if any, of the two indices $u_r$ and $v_r$ are restricted to the vicinity of distinct external indices.

We postpone the proof of Lemma 6.3 and first see how it is used to finish the proof of Lemma 6.1. The bound (6.30) follows by combining Lemma 6.2 and Lemma 6.3 to estimate the right hand side of (6.31). If (I) of Property 3 of Lemma 6.2 holds, then applying (6.34) and (6.33) in (6.31) yields (6.30). On the other hand, if (I) of Property 3 of Lemma 6.2 is not true, then we use (6.33) and (6.35) to get
\[
  \prod_{t=1}^{n} \bigl| G^{E_t}_{a_t b_t} \bigr|
  \prod_{r=1}^{q} \bigl| G^{F_r}_{w_r w_r} \bigr|^{-1}
  \prod_{r \in R^{(1)}} \Theta^{(1)}_r
  \prod_{r \in R^{(2)}} \Theta^{(2)}_r
  \;\prec\; N^{C(\mu)\varepsilon_1}\, \Psi^{\,n + |R^{(1)}| + 2|R^{(2)}|}\, N^{-\frac{1}{2} \sum_r |\sigma_r|_*} .
  \tag{6.36}
\]
By Part 3 of Lemma 6.2 we know that (II) holds. Thus the power of $\Psi$ on the right hand side of (6.36) is at least $2\mu + |\widehat{L}|$. On the other hand, from (6.25) we see that
\[
  |L| - |\widehat{L}|
  \;\le\; \Bigl| \bigcup_{r=1}^{2\mu} \{\sigma_{r1}, \sigma_{r2}\} \setminus \{0\} \Bigr|
  \;\le\; \sum_{r=1}^{2\mu} \bigl| \{\sigma_{r1}, \sigma_{r2}\} \setminus \{0\} \bigr|
  \;\le\; \sum_{r=1}^{2\mu} |\sigma_r|_* .
\]
Hence the power of $N^{-1/2}$ on the right hand side of (6.36) is at least $|L| - |\widehat{L}|$. Using these bounds together with $\Psi \ge N^{-1/2}$ in (6.36), and then taking expectations, yields (6.30). Plugging (6.30) into (6.29) completes the proof of (6.17).
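For completeness, the power counting just performed can be recorded in a single display (a restatement of the argument above, using only the standing assumption $N^{-1/2} \le \Psi$):
\[
  N^{-\frac{1}{2}\bigl( |L| - |\widehat{L}| \bigr)}\, \Psi^{\,2\mu + |\widehat{L}|}
  \;\le\; \Psi^{\,|L| - |\widehat{L}|}\, \Psi^{\,2\mu + |\widehat{L}|}
  \;=\; \Psi^{\,2\mu + |L|} ,
\]
so the right hand side of (6.30) is indeed dominated by $N^{C(\mu)\varepsilon_1}\, \Psi^{\,2\mu + |L(x,y)|}$, the bound claimed in (6.17).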

Proof of Lemma 6.3. Combining (6.1) and (5.20) we see that, for some sequence $\alpha(\nu)$,
\[
  \bigl| G^{E}_{uv} \bigr| \;\prec\; N^{C\varepsilon_1}\, \Psi \;+\; \frac{\alpha(\nu)}{(1 + d(u,v))^{\nu}} ,
  \qquad \text{whenever}\quad u, v \notin E \,,\ \text{and}\ |E| \le N^{C\varepsilon_1} .
  \tag{6.37}
\]

By the bound on the size of $E_t, F_r$ in Property 2 of Lemma 6.2, the estimate (6.37) is applicable for these upper index sets. Then (6.33) follows from the second bound of Property 1 of Lemma 6.2 and the decay of the entries of $M^{E}$ from (5.20).

In order to prove Part 2, let $i \in \widehat{L}$ be the label from (I) of Property 3 of Lemma 6.2. We have
\[
  \mathbb{E}\, w_{x,y}(u,v)\, \Gamma_{\xi,\sigma,\alpha}(u,v)
  \;=\; \mathbb{E}\Bigl[ w_{x_i y_i}(u_i, v_i)^{\#} \Bigr]
  \cdot \mathbb{E}\Bigl[ \Gamma_{\xi,\sigma,\alpha}(u,v) \prod_{r \ne i} w_{x_r y_r}(u_r, v_r)^{\#} \Bigr]
  \;+\; \mathrm{Cov}\Bigl( w_{x_i y_i}(u_i, v_i)^{\#} \,,\ \Gamma_{\xi,\sigma,\alpha}(u,v) \prod_{r \ne i} w_{x_r y_r}(u_r, v_r)^{\#} \Bigr) ,
  \tag{6.38}
\]

where the first term on the right hand side vanishes because the $w_{xy}(u,v)$ are centered random variables by (6.21). Now, the covariance is smaller than any inverse power of $N$, since $w_{x_i y_i}(u_i, v_i)$ depends only on the $x_i$-th and $y_i$-th rows/columns of $H$, while $\Gamma_{\xi,\sigma,\alpha}$ is a deterministic function of $H^{(B_i)}$ by (I) of Property 3 of Lemma 6.2. Indeed, the faster than power law decay of correlations (2.27) yields (6.34), because the derivatives of $\Gamma_{\xi,\sigma,\alpha}(u,v)$ with respect to the entries of $H$ are bounded in absolute value by $N^{C(\mu)}$, by the $N$-dependent smoothness of the resolvents $G^{E}(\zeta)$ as a function of $H$ for spectral parameters $\zeta$ with $\mathrm{dist}(\zeta, \mathrm{Spec}(H^{E})) \ge N^{-C}$. For more details we refer to the proofs of Lemmas A.6 and A.5, where a similar argument was used.

Now we prove Part 3. To this end, fix an arbitrary label $r = 1, 2, \dots, 2\mu$, and denote $B_L := \bigcup_{s \in L} B_s$ and $B_{\widehat{L}} := \bigcup_{t \in \widehat{L}} B_t$.

Let us first consider $\Theta^{(1)}_r$. If $\sigma_{r1} = s$ and $\sigma_{r2} = t$, then we need to estimate
\[
  \sum_{u \in B_s \setminus Q_r}\ \sum_{v \in B_t \setminus Q_r} w_{x_r y_r}(u,v)\, G^{U_r}_{uv} ,
  \qquad \text{where}\quad s, t \in \{0\} \cup L \setminus \widehat{L} \,,\ \text{and}\quad Q_r \subseteq U_r \subseteq Q_r \cup B_{\widehat{L}} .
  \tag{6.39}
\]
Since $B_s \setminus Q_r \,, B_t \setminus Q_r \subseteq \mathbb{X} \setminus U_r$, the indices $u, v$ do not overlap with the upper index set $U_r$. Hence, in the case $k = 1$ and $s = t = 0$, the estimate (6.35) follows from (A.76) of Lemma A.6. If $s, t \in L$, then taking the modulus of (6.39) and using (6.37) yields (6.35):
\[
  \Theta^{(1)}_r \;\le\; |B_s \setminus Q_r|\, |B_t \setminus Q_r|\,
  \Bigl( \max_{u,v \in \mathbb{X}} |w_{x_r y_r}(u,v)| \Bigr)
  \max_{u \in B_s \setminus Q_r \,,\ v \in B_t \setminus Q_r} \bigl| G^{U_r}_{uv} \bigr|
  \;\prec\; \frac{N^{C\varepsilon_1}}{N}\, \Bigl( N^{C\varepsilon_1}\, \Psi + \frac{\alpha(\nu)}{(1 + d(B_s, B_t))^{\nu}} \Bigr)
  \;\le\; N^{C\varepsilon_1}\, N^{-\frac{1}{2}|\sigma_r|_*}\, \Psi ,
  \tag{6.40}
\]
where $d(A,B) := \inf\{ d(a,b) : a \in A \,,\ b \in B \}$ for any subsets $A$ and $B$ of $\mathbb{X}$. Here we have also used the definition (6.15) of lone labels and $\Psi \ge N^{-1/2}$.

Suppose now that exactly one component of $\sigma_r$ equals $0$ and one is in $L$. In this case, we split $w_{x_r y_r}(u,v)$ in (6.39) into two parts, corresponding to $w_{x_r u}\, w_{v y_r}$ and its expectation, and estimate the corresponding sums separately. First, using (A.57) of Lemma A.5 yields
\[
  \frac{1}{N} \Bigl|\, \sum_{u \in B_s \setminus Q_r}\ \sum_{v \in B_0 \setminus Q_r} w_{x_r u}\, w_{v y_r}\, G^{U_r}_{uv} \Bigr|
  \;\prec\; \frac{|B_s \setminus Q_r|}{N}\, \Bigl( \max_{u \in \mathbb{X}} |w_{x_r u}| \Bigr)\,
  \max_{u \notin U_r} \Bigl|\, \sum_{v \in B_0 \setminus Q_r} G^{U_r}_{uv}\, w_{v y_r} \Bigr|
  \;\prec\; N^{C\varepsilon_1}\, N^{-1/2}\, \Psi .
  \tag{6.41}
\]

On the other hand, using (6.37) we estimate the expectation part:
\[
  \frac{1}{N} \Bigl|\, \sum_{u \in B_s \setminus Q_r}\ \sum_{v \in B_0 \setminus Q_r} \bigl( \mathbb{E}\, w_{x_r u}\, w_{v y_r} \bigr)\, G^{U_r}_{uv} \Bigr|
  \;\prec\; \frac{|B_s \setminus Q_r|}{N}\, \max_{u \in \mathbb{X}} \sum_{v \in B_0 \setminus Q_r}
  \bigl| \mathbb{E}\, w_{x_r u}\, w_{v y_r} \bigr|\, \Bigl( N^{C\varepsilon_1}\, \Psi + \bigl| m^{U_r}_{uv} \bigr| \Bigr) .
  \tag{6.42}
\]
As with the other part (6.41), because of (3.2) this is also $O_{\prec}(N^{C\varepsilon_1} N^{-1})$. Since $\Psi \ge N^{-1/2}$, this finishes the proof of (6.35) in the case $k = 1$.

Now we prove (6.35) for $\Theta^{(2)}_r$. In this case, we need to bound
\[
  \sum_{u \in B_s \setminus Q_r}\ \sum_{v \in B_t \setminus Q_r} w_{x_r y_r}(u,v)\, G^{U'_r}_{u u'_r}\, G^{V'_r}_{v'_r v} ,
  \tag{6.43}
\]

where $s = \sigma_{r1}$, $t = \sigma_{r2}$ again have values in $\{0\} \cup L \setminus \widehat{L}$, and
\[
  u'_r \in B_{\widehat{L}} \setminus U'_r ,
  \qquad v'_r \in B_{\widehat{L}} \setminus V'_r ,
  \qquad Q_r \;\subseteq\; U'_r \,, V'_r \;\subseteq\; Q_r \cup B_{\widehat{L}} .
\]
By the definitions of the lone and isolated labels, (6.15) and (6.25) respectively, we know that if $s \in L \setminus \widehat{L}$, then $d(u, u'_r) \ge N^{\varepsilon_1}$, and similarly, if $t \in L \setminus \widehat{L}$, then $d(v'_r, v) \ge N^{\varepsilon_1}$. Thus, if $s, t \in L \setminus \widehat{L}$, then estimating similarly as in (6.40) with (6.37) yields
\[
  \Theta^{(2)}_r \;\prec\; N^{C\varepsilon_1}\, N^{-1}\, \Psi^{2} ,
  \qquad s, t \in L \setminus \widehat{L} .
\]
In the remaining cases, we split (6.43) into two parts, corresponding to the term $w_{x_r u}\, w_{v y_r}$ and its expectation in the definition (6.21) of $w_{x_r y_r}(u,v)$, and estimate these two parts separately.

The average part is bounded similarly as in (6.42); i.e., if $s \in L \setminus \widehat{L}$ and $t = 0$, then
\[
  \frac{1}{N} \Bigl|\, \sum_{u \in B_s \setminus Q_r}\ \sum_{v \in B_0 \setminus Q_r} \bigl( \mathbb{E}\, w_{x_r u}\, w_{v y_r} \bigr)\, G^{U'_r}_{u u'_r}\, G^{V'_r}_{v'_r v} \Bigr|
  \;\prec\; \frac{|B_s \setminus Q_r|}{N}\, \max_{u \in B_s} \sum_{v \in B_0 \setminus Q_r}
  \bigl| \mathbb{E}\, w_{x_r u}\, w_{v y_r} \bigr|\,
  \Bigl( N^{C\varepsilon_1}\, \Psi + \frac{\alpha(\nu)}{(1 + d(u, u'_r))^{\nu}} \Bigr)
  \Bigl( N^{C\varepsilon_1}\, \Psi + \frac{\alpha(\nu)}{(1 + d(v'_r, v))^{\nu}} \Bigr) .
  \tag{6.44}
\]
Here $d(u, u'_r) \ge N^{\varepsilon_1}$, since $u \in B_s$ with $s \in L \setminus \widehat{L}$, while $u'_r \in B_{\widehat{L}}$. Taking $\nu > C\varepsilon_1^{-1}$ and using (3.2) to bound the sum over the covariances by a constant, we see that the right hand side is $O_{\prec}(N^{C\varepsilon_1} N^{-1} \Psi)$. Since $\Psi \ge N^{-1/2}$, this matches (6.35), as $|\sigma_r|_* = |\{0, s, t\}| - 1 = 1$.

We are now left to bound terms of the form (6.43) in which $w_{x_r y_r}(u,v)$ is replaced with $\frac{1}{N}\, w_{x_r u}\, w_{v y_r}$, and either $s = 0$ or $t = 0$. In these cases the sums over $u$ and $v$ factorize, i.e., we have
\[
  \frac{1}{N}\, \Bigl( \sum_{u \in B_s \setminus Q_r} w_{x_r u}\, G^{U'_r}_{u u'_r} \Bigr)
  \Bigl( \sum_{v \in B_t \setminus Q_r} G^{V'_r}_{v'_r v}\, w_{v y_r} \Bigr) .
\]
When the sum runs over a small set, i.e., over $B_{s'}$ for some $s' \in L \setminus \widehat{L}$, then we estimate the sizes of the entries of $W^{(\#)}$ and $G$ by $O_{\prec}(N^{-1/2})$ and $O_{\prec}(\Psi)$, respectively. On the other hand, when $u$ or $v$ is summed over $B_0 \setminus Q_r$, we use (A.57) of Lemma A.5 to obtain a bound of size $O_{\prec}(\Psi)$. In each case, we obtain an estimate that matches (6.35).

Proof of Lemma 6.2. We consider the data $(\xi,\sigma)$ fixed, and write $\widehat{L} = \widehat{L}(x,y,\sigma)$, etc. We start by enumerating the isolated labels (see (6.25)),
\[
  \{ s_1, \dots, s_{\widehat{\ell}} \} \;=\; \widehat{L} ,
  \qquad \widehat{\ell} \;:=\; |\widehat{L}| ,
  \tag{6.45}
\]
and set $\widehat{B}(k) := \bigcup_{j=1}^{k} B_{s_j}$ for $1 \le k \le \widehat{\ell}$ (recall the definition (6.22), and that the $B_{s_j}$ are disjoint). The monomial expansion (6.26) is constructed iteratively in $\widehat{\ell}$ steps. Indeed, we will define $1 + \widehat{\ell}$ representations,
\[
  \Gamma_{\xi}|_{\mathbb{B}(\xi,\sigma)} \;=\; \sum_{\alpha=1}^{M_k} \Gamma^{(k)}_{\alpha} ,
  \qquad k = 0, 1, \dots, \widehat{\ell} ,
  \tag{6.46}
\]
where the $M_k = M_k(\xi,\sigma)$ monomials $\Gamma^{(k)}_{\alpha} = \Gamma^{(k)}_{\xi,\sigma,\alpha} : \mathbb{B}(\xi,\sigma) \to \mathbb{C}$, evaluated at $(u,v) \in \mathbb{B}(\xi,\sigma)$, are of the form
\[
  (-1)^{\#} \prod_{t=1}^{m} \bigl( G^{E_t}_{a_t b_t} \bigr)^{\#}
  \prod_{r=1}^{q} \frac{1}{\bigl( G^{F_r}_{w_r w_r} \bigr)^{\#}} ,
  \tag{6.47}
\]
with some indices $a_t, b_t \notin E_t$ and $w_r \notin F_r$. The numbers $m$ and $q$, as well as the sets $E_t, F_r$, may vary from monomial to monomial, i.e., they are functions of $k$ and $\alpha$. Furthermore, for each fixed $k$ and $\alpha$, the lower indices and the upper index sets satisfy:

(a) $a_t, b_t \in \{u_r, v_r\}_{r=1}^{2\mu} \cup \widehat{B}(k)$, and $w_s \in \{a_t, b_t\}_{t=1}^{m}$;

(b) $E_t \subseteq \widehat{B}(k) \cup Q_{t'}$, for some $1 \le t' \le 2\mu$, and $F_r \subseteq \widehat{B}(k) \cup Q_{r'}$, for some $1 \le r' \le 2\mu$;

(c) if $a_t \in B_{s_i}$ and $b_t \in B_{s_j}$, with $1 \le i, j \le k$, then $i \ne j$;

(d) for each $s = 1, \dots, 2\mu$ there are two unique labels $1 \le t_1(s), t_2(s) \le m$, such that $a_{t_1(s)} = u_s$, $b_{t_1(s)} \notin \{v_r\}_{r \ne s}$, and $a_{t_2(s)} \notin \{u_r\}_{r \ne s}$, $b_{t_2(s)} = v_s$ hold, respectively.

We will call the right hand side of (6.46) the level-$k$ expansion in the following, and we will define it by a recursion on $k$. The level-$0$ expansion is determined by the formula (6.20a):
\[
  \Gamma^{(0)}_{1} \;:=\; \Gamma_{\xi}|_{\mathbb{B}(\xi,\sigma)} ,
  \qquad M_0 \;:=\; 1 .
  \tag{6.48}
\]
This monomial clearly satisfies (a)–(d), with $m = 2\mu$, $q = 0$, $E_t = Q_t$, and $t_1(s) = t_2(s) = s$. The final goal, the representation (6.26), is the last, level-$\widehat{\ell}$, expansion, i.e.,
\[
  \Gamma_{\xi,\sigma,\alpha} \;:=\; \Gamma^{(\widehat{\ell})}_{\alpha} ,
  \qquad \alpha = 1, 2, \dots, M_{\widehat{\ell}} \;=:\; M(\xi,\sigma) .
  \tag{6.49}
\]
Now we show how the level-$k$ expansion is obtained, given the level-$(k-1)$ expansion. In order to do that, first we list the elements of each $B_{s_k}$ as $B_{s_k} = \{ x_{ka} : 1 \le a \le |B_{s_k}| \}$, and we define
\[
  B_{k1} \;:=\; \emptyset ,
  \qquad B_{ka} \;:=\; \{ x_{kb} : 1 \le b \le a-1 \} ,
  \qquad a = 2, \dots, |B_{s_k}| \,,\ k = 1, 2, \dots, \widehat{\ell} ,
\]
which is a one-by-one exhaustion of $B_{s_k}$; namely, $B_{k1} \subseteq B_{k2} \subseteq \dots \subseteq B_{k,|B_{s_k}|} \subseteq B_{s_k}$. Note that $B_{k,a+1} = B_{ka} \cup \{x_{ka}\}$.

We now consider a generic level-$(k-1)$ monomial $\Gamma^{(k-1)}_{\alpha}$, which is of the form (6.47) and satisfies (a)–(d). Each monomial $\Gamma^{(k-1)}_{\alpha}$ will give rise to several level-$k$ monomials that are constructed independently for different $\alpha$'s, as follows. Expanding each of the $m$ factors in the first product of (6.47) using the standard resolvent expansion identity
\[
  G^{E}_{ab} \;=\; G^{E \cup B_{s_k}}_{ab}
  \;+\; \sum_{a'=1}^{|B_{s_k}|} \mathbb{1}\bigl( x_{ka'} \notin E \bigr)\,
  \frac{ G^{E \cup B_{ka'}}_{a\, x_{ka'}}\, G^{E \cup B_{ka'}}_{x_{ka'}\, b} }{ G^{E \cup B_{ka'}}_{x_{ka'}\, x_{ka'}} } ,
  \tag{6.50a}
\]
and each of the $q$ factors in the second product of (6.47) using
\[
  \frac{1}{G^{F}_{ww}} \;=\; \frac{1}{G^{F \cup B_{s_k}}_{ww}}
  \;-\; \sum_{a=1}^{|B_{s_k}|} \mathbb{1}\bigl( x_{ka} \notin F \bigr)\,
  \frac{ G^{F \cup B_{ka}}_{w\, x_{ka}}\, G^{F \cup B_{ka}}_{x_{ka}\, w} }
       { G^{F \cup B_{ka}}_{ww}\, G^{F \cup B_{k,a+1}}_{ww}\, G^{F \cup B_{ka}}_{x_{ka}\, x_{ka}} } ,
  \tag{6.50b}
\]
yields a product of sums of resolvent entries and their reciprocals. Inserting these formulas into (6.47), and expressing the resulting product as a single sum, yields the representation
\[
  \Gamma^{(k-1)}_{\alpha} \;=\; \sum_{\beta \in A_{\alpha}(k)} \Gamma^{(k)}_{\beta} ,
  \tag{6.51}
\]
where $A_{\alpha}(k)$ is some finite subset of integers, and $\beta$ simply labels the resulting monomials in an arbitrary way. From the resolvent identities (6.50) it is easy to see that the monomials $\Gamma^{(k)}_{\beta}$ inherit the properties (a)–(d) from the level-$(k-1)$ monomials. In particular, summing over $\alpha = 1, \dots, M_{k-1}$ in (6.51) yields the level-$k$ monomial expansion (6.46), with $M_k := \sum_{\alpha} |A_{\alpha}(k)|$. We will assume w.l.o.g. that the sets $A_{\alpha}(k)$, $1 \le \alpha \le M_{k-1}$, form a partition of the first $M_k$ integers.

This procedure defines the monomial representation recursively. Since $\Gamma^{(k)}_{\alpha}$ is a function of the $(u,v)$ indices, strictly speaking we should record which lower indices in the generic form (6.47) are considered independent variables. Initially, at level $k = 0$, all indices are variables, see (6.20a). Later, the expansion formulas (6.50) bring in new lower indices, denoted generically by $x_{ka}$, from the set $\bigcup_{s \in \widehat{L}} B_s$, which is disjoint from the range of the components $u_r, v_r$ of the variables $(u,v)$, as $\mathbb{B}(\xi,\sigma)$ is a subset of $\bigl( \mathbb{X} \setminus \bigcup_{s \in \widehat{L}} B_s \bigr)^{4\mu}$. However, the structure of (6.50) clearly shows at which location the "old" $a, b$ indices from the left hand side of these formulas appear in the "new" formulas on the right hand side. Now the simple rule is that if any of these indices $a, b$ were variables on the left hand side, they are considered variables on the right hand side as well. In this way the concept of independent variables is naturally inherited along the recursion. With this simple rule we avoid the cumbersome notation of explicitly indicating which indices are variables in the formulas.

We note that the monomials of the final expansion (6.49) can be written in the form (6.28). Indeed, the second products in (6.28) and (6.47) are the same, while the first product of (6.47) is split into the three other products in (6.28) using (d). Properties 1 and 2 in Lemma 6.2 for the monomials in (6.49) follow easily from (a)–(d).
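As background (standard resolvent algebra, not part of the original argument), the identities (6.50) are obtained by iterating the following one-site expansion, valid for $c \notin E$ and $a, b \ne c$:
\[
  G^{E}_{ab} \;=\; G^{E \cup \{c\}}_{ab} \;+\; \frac{ G^{E}_{ac}\, G^{E}_{cb} }{ G^{E}_{cc} } ,
\]
and, taking $a = b = w$ and rearranging, its diagonal counterpart
\[
  \frac{1}{G^{E}_{ww}} \;=\; \frac{1}{G^{E \cup \{c\}}_{ww}}
  \;-\; \frac{ G^{E}_{wc}\, G^{E}_{cw} }{ G^{E}_{ww}\, G^{E \cup \{c\}}_{ww}\, G^{E}_{cc} } .
\]
Removing the elements of $B_{s_k}$ one at a time and telescoping these one-site formulas yields (6.50a) and (6.50b).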
Indeed, (a) yields the first part of Property 1, while the second part of Property 1 follows from (c) and the basic property $d(B_s, B_t) \ge N^{\varepsilon_1}$ for distinct lone labels $s, t \in \widehat{L}$. For a given $\xi$, we define the family of subsets of $\mathbb{X}$:
\[
  \mathcal{E} \;:=\; \bigl\{\, B_{1,a(1)} \cup B_{2,a(2)} \cup \dots \cup B_{\widehat{\ell}, a(\widehat{\ell})} \cup Q_r
  \,:\, 1 \le a(k) \le |B_{s_k}| \,,\ 1 \le k \le \widehat{\ell} \,,\ 1 \le r \le 2\mu \,\bigr\} .
\]
By construction (cf. (6.50) and (b)), the upper index sets are members of this $\xi$-dependent family. Since $|Q_r|, |B_{s_k}| \le N^{C_0 \varepsilon_1}$, for some $C_0 \sim 1$, we get $|E| \lesssim_{\mu} N^{C_0 \varepsilon_1}$ for every member $E$ of $\mathcal{E}$. Property 2 follows directly from these observations.

Next we prove Property 3 of the monomials (6.49). To this end, we use the formula (6.51) to define a partial ordering '$<$' on the monomials by

\[
  \Gamma^{(k-1)}_{\alpha} \;<\; \Gamma^{(k)}_{\beta}
  \quad \Longleftrightarrow \quad
  \beta \in A_{\alpha}(k) .
  \tag{6.52}
\]
It follows that for every $\alpha = 1, 2, \dots, M = M_{\widehat{\ell}}$, there exists a sequence $(\alpha_k)_{k=1}^{\widehat{\ell}-1}$, such that
\[
  \Gamma_{\xi}|_{\mathbb{B}(\xi,\sigma)} \;=\; \Gamma^{(0)}_{1}
  \;<\; \Gamma^{(1)}_{\alpha_1} \;<\; \dots \;<\; \Gamma^{(\widehat{\ell}-1)}_{\alpha_{\widehat{\ell}-1}}
  \;<\; \Gamma^{(\widehat{\ell})}_{\alpha} \;=\; \Gamma_{\xi,\sigma,\alpha} .
  \tag{6.53}
\]
Let us fix an arbitrary label $\alpha = 1, \dots, M$ of the final expansion. Suppose that the $k$-th monomial $\Gamma^{(k)}_{\alpha_k}$ in the chain (6.53) is of the form (6.47), and define
\[
  D_k \;:=\; \Bigl( \bigcap_{t=1}^{m} E_t \Bigr) \cap \Bigl( \bigcap_{r=1}^{q} F_r \Bigr) ,
  \qquad m_k \;:=\; m .
  \tag{6.54}
\]
Here, $D_k$ is the largest set $A \subseteq \mathbb{X}$ such that $\Gamma^{(k)}_{\alpha_k}$ depends only on the matrix elements of $H^{(A)}$. Since both the upper index sets and the total number of resolvent entries of the form $G^{(A)}_{ab}$ are larger (or equal) on the right hand side than on the left hand side of the identities (6.50), and the added indices on the right hand side are from $B_{s_k}$, we have
\[
  D_{k-1} \;\subseteq\; D_k ,
  \qquad D_k \setminus D_{k-1} \;\subseteq\; B_{s_k} ,
  \qquad \text{and} \qquad m_k \;\ge\; m_{k-1} .
  \tag{6.55}
\]
We claim that
\[
  B_{s_k} \not\subseteq D_{\widehat{\ell}}
  \;\Longrightarrow\; B_{s_k} \not\subseteq D_k
  \;\Longrightarrow\; m_k \;\ge\; m_{k-1} + 1 .
  \tag{6.56}
\]
The first implication follows from the monotonicity of the $D_k$'s. In order to get the second implication, suppose that $\Gamma^{(k-1)}_{\alpha_{k-1}}$ equals (6.47). Since $D_k$ does not contain $B_{s_k}$, the monomial $\Gamma^{(k)}_{\alpha_k}$ cannot be of the form (6.47) with the upper index sets $E_t$ and $F_r$ replaced with $E_t \cup B_{s_k}$ and $F_r \cup B_{s_k}$, respectively. The formulas (6.50) hence show that $\Gamma^{(k)}_{\alpha_k}$ contains at least one more resolvent entry of the form $G^{(A)}_{ab}$ than $\Gamma^{(k-1)}_{\alpha_{k-1}}$, and thus $m_k \ge m_{k-1} + 1$.

Property 3 follows from (6.56). Indeed, suppose that there is no isolated label $s$ such that $B_s \subseteq D_{\widehat{\ell}}$. Then applying (6.56) for each $k = 1, \dots, \widehat{\ell}$ yields $m_{\widehat{\ell}} \ge m_0 + \widehat{\ell}$. Since $m_0 = 2\mu$ and, using the notations from (6.28), we have
\[
  m_{\widehat{\ell}} \;=\; n + |R^{(1)}| + 2\,|R^{(2)}| ,
\]
by Property (d) of the monomials, this yields (II). (If, on the other hand, $B_s \subseteq D_{\widehat{\ell}}$ for some isolated label $s$, then (I) holds by the definition (6.54) of $D_{\widehat{\ell}}$.) This completes the proof of Property 3.

Now only the bound (6.27) on the number of monomials $M = M_{\widehat{\ell}}$ remains to be proven, which is a simple counting argument. Let $p_k$ be the largest number of factors among the monomials of the level-$k$ expansion, i.e., if the monomials $\Gamma^{(k)}_{\alpha}$ are written in the form (6.28), then
\[
  p_k \;:=\; \max_{\alpha=1}^{M_k} \bigl( n + |R^{(1)}| + 2\,|R^{(2)}| + q \bigr) .
\]

Let us set $b_* := 1 + \max_{x,y} |B_{N^{\varepsilon_1}}(x,y)|$. Each of the factors in every monomial at level $k-1$ is turned into a sum of monomials by the resolvent identities (6.50). Since each such monomial contains at most five resolvent entries (cf. the last term in (6.50b)), we obtain the first of the following two bounds:
\[
  p_k \;\le\; 5\, p_{k-1}
  \qquad \text{and} \qquad
  M_k \;\le\; M_{k-1}\, b_*^{\,p_{k-1}} .
  \tag{6.57}
\]
In order to derive the second bound, we recall that each of the at most $p_{k-1}$ factors in every level-$(k-1)$ monomial is expanded by the resolvent identities (6.50) into a sum of at most $b_*$ terms. The product of these sums yields a single sum of at most $b_*^{\,p_{k-1}}$ terms. From (6.48) and (6.20a) we get $M_0 = 1$ and $p_0 = 2\mu$. Since $k \le \widehat{\ell} \le 2\mu$, we immediately see that $\max_k p_k \le 2\mu\, 5^{2\mu}$. Plugging this into the second bound of (6.57) yields $M_k \le \bigl( b_*^{\,2\mu\, 5^{2\mu}} \bigr)^{2\mu}$. Since $b_* \le N^{C\varepsilon_1}$ by (2.11), this completes the proof of (6.27). Finally, we obtain the bound on the number of factors in (6.28) using $n + q \le p_{\widehat{\ell}} \lesssim_{\mu} 1$.
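The counting in (6.57) is elementary arithmetic and can be sanity-checked numerically. The sketch below (illustrative parameter values only; the names `mu`, `b_star` and the helper `expansion_bounds` are ours, not from the paper) iterates the worst-case recursions $p_k \le 5 p_{k-1}$, $M_k \le M_{k-1} b_*^{p_{k-1}}$ and confirms the closed-form bound derived above.

```python
def expansion_bounds(mu, b_star, steps):
    # Worst-case iteration of the recursions (6.57):
    #   p_k <= 5 * p_{k-1},  M_k <= M_{k-1} * b_*^{p_{k-1}},
    # started from the level-0 values p_0 = 2*mu, M_0 = 1.
    p, M = 2 * mu, 1
    for _ in range(steps):
        M *= b_star ** p
        p *= 5
    return p, M

mu, b_star = 2, 3
# at most ell_hat <= 2*mu expansion steps are performed in the proof
p, M = expansion_bounds(mu, b_star, steps=2 * mu)

# closed-form bound from the proof: M_k <= (b_*^{2*mu*5^{2*mu}})^{2*mu}
closed_form = (b_star ** (2 * mu * 5 ** (2 * mu))) ** (2 * mu)
assert p <= 2 * mu * 5 ** (2 * mu)   # max_k p_k <= 2*mu * 5^{2*mu}
assert M <= closed_form
```

With $b_* \le N^{C\varepsilon_1}$, the closed-form bound is $N^{C(\mu)\varepsilon_1}$ with a $\mu$-dependent constant, which is exactly (6.27).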

7  Bulk universality and rigidity

7.1  Rigidity

Proposition 7.1 (Local law away from $[\kappa_-, \kappa_+]$). Let $G$ be the resolvent of a random matrix $H$ of the form (2.24) that satisfies B1–B4. Let $\kappa_-, \kappa_+$ be the endpoints of the convex hull of $\mathrm{supp}\,\rho$, as in (5.1). For all $\delta, \varepsilon > 0$ and $\nu \in \mathbb{N}$ there exists a positive constant $C$ such that, away from $[\kappa_-, \kappa_+]$,
\[
  \mathbb{P}\Bigl[\, \exists\, \zeta \in \mathbb{H} \,,\ \delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta}
  \,:\, \max_{x,y=1}^{N} \bigl| G_{xy}(\zeta) - m_{xy}(\zeta) \bigr| \ge \frac{N^{\varepsilon}}{\sqrt{N}} \,\Bigr]
  \;\le\; C N^{-\nu} .
  \tag{7.1}
\]
The normalized trace converges with the improved rate $\frac{1}{N}$:
\[
  \mathbb{P}\Bigl[\, \exists\, \zeta \in \mathbb{H} \,,\ \delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta}
  \,:\, \Bigl| \frac{1}{N}\, \mathrm{Tr}\, G(\zeta) - \frac{1}{N}\, \mathrm{Tr}\, M(\zeta) \Bigr| \ge \frac{N^{\varepsilon}}{N} \,\Bigr]
  \;\le\; C N^{-\nu} .
  \tag{7.2}
\]

The constant C depends only on the model parameters K in addition to δ, ε and ν.

Remark 7.2. Theorem 2.9 and Proposition 7.1 provide a local law with the optimal convergence rate $\frac{1}{N \mathrm{Im}\,\zeta}$ inside the bulk of the spectrum and the convergence rate $\frac{1}{N}$ away from the convex hull of $\mathrm{supp}\,\rho$, respectively. In order to prove a local law inside spectral gaps and at the edges of the support of the density of states, additional assumptions on $H$ are needed to exclude a naturally appearing instability that may be caused by exceptional rows and columns of $H$ and the outlying eigenvalues they create. This instability is already present in the case of independent entries, as explained in Section 11.2 of [1].

Proof of Proposition 7.1. The proof has three steps. In the first step we will establish a weaker version of Proposition 7.1, where instead of the bound $\Lambda \prec \frac{1}{\sqrt{N}}$ we will only show $\Lambda \prec \frac{1}{\sqrt{N}} + \frac{1}{N \mathrm{Im}\,\zeta}$. Then we will use this version in the second step to prove that there are no eigenvalues outside a small neighborhood of $[\kappa_-, \kappa_+]$. Finally, in the third step we will show (7.1) and (7.2).

Step 1: The proof of this step follows the same strategy as the proof of Theorem 2.9; only instead of using Lemma 3.3 to estimate the error matrix $D$ we will use Lemma 5.1. In analogy to the proof of (2.30), we begin by showing the entrywise bound
\[
  \Lambda(\zeta) \;\prec\; \frac{1}{\sqrt{N}} + \frac{1}{N \mathrm{Im}\,\zeta} ,
  \qquad \zeta \in \mathbb{H} \,,\ \delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta} \,,\ \mathrm{Im}\,\zeta \ge N^{-1+\varepsilon} .
  \tag{7.3}
\]

In fact, following the same line of reasoning that was used to prove (3.10), but using (5.2) instead of (3.8) to estimate $\|D\|_{\max}$, we see that
\[
  \Lambda(\zeta)\, \mathbb{1}\bigl( \Lambda(\zeta) \le N^{-\varepsilon/2} \bigr)
  \;\prec\; \frac{1}{\sqrt{N}} + \Bigl( \frac{\Lambda(\zeta)}{N \mathrm{Im}\,\zeta} \Bigr)^{1/2}
  \;\le\; \frac{1}{\sqrt{N}} + \frac{N^{\varepsilon}}{N \mathrm{Im}\,\zeta} + \frac{1}{4}\, N^{-\varepsilon}\, \Lambda(\zeta) ,
  \tag{7.4}
\]
for any $\varepsilon > 0$. The last term on the right hand side can be absorbed into the left hand side, and since $\varepsilon$ was arbitrary, (7.4) yields
\[
  \Lambda(\zeta)\, \mathbb{1}\bigl( \Lambda(\zeta) \le N^{-\varepsilon/2} \bigr)
  \;\prec\; \frac{1}{\sqrt{N}} + \frac{1}{N \mathrm{Im}\,\zeta} .
  \tag{7.5}
\]
This inequality establishes a gap in the possible values that $\Lambda$ can take, provided $\varepsilon < 1/2$, because $N^{-\varepsilon} \ge \frac{1}{\sqrt{N}} + \frac{1}{N \mathrm{Im}\,\zeta}$. Exactly as we argued for (3.10), we can get rid of the indicator function in (7.5) by using a continuity argument together with a union bound, in order to obtain (7.3).

As in the proof of Theorem 2.9, we now use the fluctuation averaging to get an improved convergence rate for the normalized trace,
\[
  \Bigl| \frac{1}{N}\, \mathrm{Tr}\bigl[ G(\zeta) - M(\zeta) \bigr] \Bigr|
  \;\prec\; \frac{1}{N} + \frac{1}{(N \mathrm{Im}\,\zeta)^2} ,
  \tag{7.6}
\]
for all $\zeta \in \mathbb{H}$ with $\delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta}$ and $\mathrm{Im}\,\zeta \ge N^{-1+\varepsilon}$. Indeed, (7.6) is an immediate consequence of (7.3) and the fluctuation averaging Proposition 3.4.

Step 2: In this step we use (7.6) to prove the following lemma.

Lemma 7.3 (No eigenvalues away from $[\kappa_-, \kappa_+]$). For any $\delta, \nu > 0$ we have
\[
  \mathbb{P}\bigl[\, \mathrm{Spec}(H) \cap \bigl( \mathbb{R} \setminus [\kappa_- - \delta, \kappa_+ + \delta] \bigr) = \emptyset \,\bigr]
  \;\ge\; 1 - C N^{-\nu} ,
  \tag{7.7}
\]

(7.7)

for a positive constant C, depending only on the model parameters K in addition to δ and ν. In order to show (7.7) we consider the imaginary part of

1 N

TrrG ´ Ms and use that

N η 1 ÿ 1 Im Tr Gpτ ` iηq “ , N N i“1 pλi ´ τ q2 ` η 2 ´1 ´1 where pλi qN i“1 are the eigenvalues of H. We choose τ P r´δ , κ´ ´ δs Y rκ` ` δ, δ s and η P rN ´1`ε , 1s and estimate a single term in the sum on the right hand side by employing (7.6), 1 1 η 1 1 1 1 ă Im Tr Mpτ ` iηq ` ` À η ` ` . (7.8) δ N pλi ´ τ q2 ` η 2 N N pNηq2 N pNηq2

Here, we used in the last inequality that N1 Tr M is the Stieltjes transform of the density of states ρ with supp ρ Ď rκ´ , κ` s. Since the left hand side of (7.8) is a Lipschitz continuous function in τ with Lipschitz constant bounded by N we can use a union bound to establish (7.8) first on a fine grid of τ -values and then uniformly for all τ and for the choice η “ N ´2{3 , 1 1 sup 4{3 ă 1{3 . 2 pλi ´ τ q ` 1 N τ N 52

In particular, the eigenvalue λi cannot be at position τ with very high probability, i.e. “ ‰ P D i : δ ď distpλi , rκ´ , κ` sq ď δ ´1 ě 1 ´ Cpδ, νqN ´ν . (7.9)
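The spectral identity used above, $\mathrm{Im}\, N^{-1} \mathrm{Tr}\, G(\tau + \mathrm{i}\eta) = N^{-1} \sum_i \eta / ((\lambda_i - \tau)^2 + \eta^2)$, is easy to verify numerically. A minimal sketch (random Hermitian test matrix and arbitrary test values $\tau, \eta$; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 30
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H = (A + A.conj().T) / 2                     # Hermitian test matrix

tau, eta = 0.7, 0.05
zeta = tau + 1j * eta

G = np.linalg.inv(H - zeta * np.eye(N))      # resolvent G(zeta) = (H - zeta)^{-1}
lam = np.linalg.eigvalsh(H)                  # eigenvalues of H

lhs = np.trace(G).imag / N
rhs = np.mean(eta / ((lam - tau) ** 2 + eta ** 2))
assert abs(lhs - rhs) < 1e-10
```

The identity holds exactly because $\mathrm{Tr}\, G(\zeta) = \sum_i (\lambda_i - \zeta)^{-1}$ and $\mathrm{Im}\, (\lambda - \tau - \mathrm{i}\eta)^{-1} = \eta / ((\lambda - \tau)^2 + \eta^2)$.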

Now we exclude that there are eigenvalues far away from the support of the density of states by using a continuity argument. Let $\widetilde{W}$ be a standard GUE matrix with $\mathbb{E}\, |\widetilde{w}_{xy}|^2 = \frac{1}{N}$, let $H^{(\alpha)} := \alpha H + (1-\alpha) \widetilde{W}$ for $\alpha \in [0,1]$, let $(\lambda^{(\alpha)}_i)_i$ be the eigenvalues of $H^{(\alpha)}$, and set $\kappa := \sup_{\alpha} \max\{ |\kappa^{(\alpha)}_+|, |\kappa^{(\alpha)}_-| \}$, where $\kappa^{(\alpha)}_{\pm}$ are defined as in (5.1) for the matrix $H^{(\alpha)}$. In particular, $\kappa^{(0)}_{\pm} = \pm 2$. Since the constant $C(\delta, \nu)$ in (7.9) is uniform for all random matrices with the same model parameters $\mathcal{K}$, we see that
\[
  \sup_{\alpha \in [0,1]} \mathbb{P}\bigl[\, \exists\, i : |\lambda^{(\alpha)}_i| \in [\kappa + \delta, \delta^{-1}] \,\bigr]
  \;\le\; C(\delta, \nu)\, N^{-\nu} .
\]
The eigenvalues $\lambda^{(\alpha)}_i$ are Lipschitz continuous in $\alpha$. In fact, $|\partial_{\alpha} \lambda^{(\alpha)}_i| \le \| H - \widetilde{W} \| \prec \sqrt{N}$. Here, the simple bound on $\| H - \widetilde{W} \|$ follows from
\[
  \mathbb{E}\, \| H - \widetilde{W} \|^{2\mu}
  \;\le\; \mathbb{E}\, \bigl[ \mathrm{Tr}\, (H - \widetilde{W})^2 \bigr]^{\mu}
  \;\le\; C(\mu)\, N^{\mu} ,
\]
for some positive constant $C(\mu)$, depending on $\mu$, the upper bound $\kappa_1$ from (2.25) on the moments, the sequence $\kappa_2$ from (2.26), and $P$ from (2.11). Thus we can use a union bound to establish
\[
  \mathbb{P}\bigl[\, \exists\, \alpha, i : |\lambda^{(\alpha)}_i| \in [\kappa + 2\delta, \delta^{-1} - \delta] \,\bigr]
  \;\le\; C(\delta, \nu)\, N^{-\nu} .
  \tag{7.10}
\]
Since for $\alpha = 0$ all eigenvalues are in $[-\kappa - 2\delta, \kappa + 2\delta]$ with very high probability, and with very high probability no eigenvalue can leave this interval by (7.10), we conclude that
\[
  \mathbb{P}\bigl[\, \exists\, i : |\lambda_i| \ge \kappa + 2\delta \,\bigr]
  \;\le\; C(\delta, \nu)\, N^{-\nu} .
\]
Together with (7.9), this finishes the proof of Lemma 7.3.

Step 3: In this step we use (7.7) to improve the bound on the error matrix $D$ away from $[\kappa_-, \kappa_+]$, and thus show (7.1) and (7.2) by following the same strategy that was used in Step 1 and in the proof of Theorem 2.9.

By Lemma 7.3, with very high probability there are no eigenvalues outside $\bigl[ \kappa_- - \frac{\delta}{2}, \kappa_+ + \frac{\delta}{2} \bigr]$. Therefore, for any $B \subseteq \mathbb{X}$, the submatrix $H^{B}$ of $H$ also has no eigenvalues outside this interval. In particular, for any $x \in \mathbb{X} \setminus B$ we have
\[
  \mathrm{Im}\, G^{B}_{xx}(\zeta) \;\sim_{\delta}\; \mathrm{Im}\,\zeta ,
  \qquad \zeta \in \mathbb{H} \,,\ \delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta} ,
  \tag{7.11}
\]
in a high probability event. As in the proof of Lemma 3.3, we bound the entries of the error matrix $D$ by estimating the right hand sides of the equations (5.9a) to (5.9e) further. But now we use (7.11), so that the factors $\mathrm{Im}\,\zeta$ in the denominators cancel, and we end up with
\[
  \| D(\zeta) \|_{\max}\, \mathbb{1}\bigl( \Lambda(\zeta) \le N^{-\varepsilon} \bigr) \;\prec\; \frac{1}{\sqrt{N}} ,
  \tag{7.12}
\]
on the domain of spectral parameters with $\delta \le \mathrm{dist}(\zeta, [\kappa_-, \kappa_+]) \le \frac{1}{\delta}$. Following the strategy of proof from Step 1, we see that (7.12) implies (7.1) and (7.2). This finishes the proof of Proposition 7.1.

Proof of Corollary 2.11. The proof follows a standard argument that establishes rigidity from the local law, which we present here for the convenience of the reader. The argument uses a Cauchy-integral formula that was also applied in the construction of the Helffer–Sjöstrand functional calculus (cf. [15]), and it already appeared in different variants in [28], [22] and [27].

Let $\tau \in \mathbb{R}$ be such that $\rho(\tau) \ge \delta$ for some $\delta > 0$. We apply Lemma 5.1 of [4] with the choices
\[
  \nu_1(\mathrm{d}\sigma) := \rho(\sigma)\, \mathrm{d}\sigma ,
  \qquad \nu_2(\mathrm{d}\sigma) := \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i}(\mathrm{d}\sigma) ,
  \qquad \tau_1 := \kappa_- - \widetilde{\delta} ,
  \qquad \tau_2 := \tau ,
  \qquad \eta_1 := N^{-1/2} ,
  \qquad \eta_2 := N^{-1+\widetilde{\varepsilon}} ,
  \qquad \varepsilon := 1 ,
\]
for some fixed $\widetilde{\delta}, \widetilde{\varepsilon} > 0$. We estimate the error terms $J_1$, $J_2$ and $J_3$ in the statement of this lemma by using (7.2) and (2.31). In this way we find
\[
  \Bigl|\, N \int_{\kappa_- - \widetilde{\delta}}^{\tau} \rho(\sigma)\, \mathrm{d}\sigma
  \;-\; \bigl| \mathrm{Spec}(H) \cap [\kappa_- - \widetilde{\delta}, \tau] \bigr| \,\Bigr| \;\prec\; N^{\widetilde{\varepsilon}} .
\]
Since $\widetilde{\varepsilon}$ was arbitrary, and there are no eigenvalues of $H$ to the left of $\kappa_- - \widetilde{\delta}$ (cf. Lemma 7.3), we infer
\[
  \Bigl|\, N \int_{-\infty}^{\tau} \rho(\sigma)\, \mathrm{d}\sigma
  \;-\; \bigl| \mathrm{Spec}(H) \cap (-\infty, \tau] \bigr| \,\Bigr| \;\prec\; 1 ,
  \tag{7.13}
\]
for any $\tau \in \mathbb{R}$ with $\rho(\tau) \ge \delta$. Combining (7.13) with the definition (2.32) of $i(\tau)$ yields
\[
  \Bigl| \int_{\tau}^{\lambda_{i(\tau)}} \rho(\sigma)\, \mathrm{d}\sigma \Bigr| \;\prec\; \frac{1}{N} .
\]
This in turn implies (2.33), and Corollary 2.11 is proven.

7.2  Bulk universality

Given the local law (Theorem 2.9), the proof of bulk universality (Corollaries 2.12 and 2.13) follows standard arguments based upon the three step strategy explained in the introduction. We will only sketch the main differences due to the correlations.

We start by introducing an Ornstein–Uhlenbeck (OU) process on random matrices $\mathbf{H}_t$ that conserves the first two mixed moments of the matrix entries,
\[
  \mathrm{d}\mathbf{H}_t \;=\; -\frac{1}{2}\, (\mathbf{H}_t - A)\, \mathrm{d}t \;+\; \Sigma^{1/2}[\mathrm{d}\mathbf{B}_t] ,
  \qquad \mathbf{H}_0 = H ,
  \tag{7.14}
\]
where the covariance operator $\Sigma : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ is given as $\Sigma[R] := \mathbb{E}\, \langle W, R \rangle\, W$, and $\mathbf{B}_t$ is a matrix of standard real (complex) independent Brownian motions with the appropriate symmetry $\mathbf{B}_t^* = \mathbf{B}_t$ for $\beta = 1$ ($\beta = 2$), whose distribution is invariant under the orthogonal (unitary) symmetry group. We remark that a large Gaussian component, as created by the flow (7.14), was first used in [39] to prove universality for the Hermitian symmetry class.
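The moment-conserving property of (7.14) can be illustrated on its scalar analogue $\mathrm{d}w = -\frac{w}{2}\,\mathrm{d}t + \sigma\,\mathrm{d}B$ (an illustrative sketch, not taken from the paper): the second moment solves $\frac{\mathrm{d}}{\mathrm{d}t}\, \mathbb{E}\, w^2 = \sigma^2 - \mathbb{E}\, w^2$, so $\mathbb{E}\, w^2 = \sigma^2$ is a fixed point. A deterministic Euler integration of this moment ODE:

```python
def second_moment_flow(v0, sigma2, t_max, dt=1e-3):
    # Integrate v' = sigma2 - v, the second-moment ODE of the scalar OU
    # analogue dw = -(w/2) dt + sigma dB of the matrix flow (7.14).
    v = v0
    for _ in range(int(t_max / dt)):
        v += dt * (sigma2 - v)
    return v

sigma2 = 0.25
# Started in the invariant state, the second moment is conserved exactly:
assert abs(second_moment_flow(sigma2, sigma2, t_max=5.0) - sigma2) < 1e-12
# Started elsewhere, it relaxes to sigma2 (exact solution: sigma2 + (v0 - sigma2) * exp(-t)):
assert abs(second_moment_flow(1.0, sigma2, t_max=20.0) - sigma2) < 1e-3
```

In the matrix setting, the role of $\sigma^2$ is played by the full covariance operator $\Sigma$, which is why (7.14) preserves all first and second mixed moments of the entries simultaneously.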

Along the flow, the matrix $\mathbf{H}_t = A + \frac{1}{\sqrt{N}}\, \mathbf{W}_t$ satisfies the condition B3 on the dependence of the matrix entries uniformly in $t$. In particular, since $\Sigma$ determines the operator $S$, we see that $\mathbf{H}_t$ is associated to the same MDE as the original matrix $H$. Also the conditions B4 and B5 can be stated in terms of $\Sigma$, and are hence both conserved along the flow.

For the following arguments we write $\mathbf{W}_t$ as a vector containing all degrees of freedom originating from the real and imaginary parts of the entries of $\mathbf{W}_t$. This vector has $N(N+1)/2$ real entries for $\beta = 1$ and $N^2$ real entries for $\beta = 2$. We partition $\mathbb{X}^2 = I_{\le} \,\dot\cup\, I_{>}$ into its upper triangular part, $I_{\le} := \{ (x,y) : x \le y \}$, and lower triangular part, $I_{>} := \{ (x,y) : x > y \}$. Then we identify
\[
  \frac{1}{\sqrt{N}}\, \mathbf{W}_t \;=\;
  \begin{cases}
    (\mathbf{w}_t(\alpha))_{\alpha \in I_{\le}} & \text{if } \beta = 1 ,\\[2pt]
    (\mathbf{w}_t(\alpha))_{\alpha \in \mathbb{X}^2} & \text{if } \beta = 2 ,
  \end{cases}
\]
where $\mathbf{w}_t((x,y)) := \frac{1}{\sqrt{N}}\, w_{xy}$ for $\beta = 1$, and
\[
  \mathbf{w}_t((x,y)) \;:=\;
  \begin{cases}
    \frac{1}{\sqrt{N}}\, \mathrm{Re}\, w_{xy} & \text{for } (x,y) \in I_{\le} ,\\[2pt]
    \frac{1}{\sqrt{N}}\, \mathrm{Im}\, w_{xy} & \text{for } (x,y) \in I_{>} ,
  \end{cases}
\]
for $\beta = 2$. In terms of the vector $\mathbf{w}_t$, the flow (7.14) takes the form
\[
  \mathrm{d}\mathbf{w}_t \;=\; -\frac{\mathbf{w}_t}{2}\, \mathrm{d}t \;+\; \Sigma^{1/2}\, \mathrm{d}\mathbf{b}_t ,
  \tag{7.15}
\]
where $\mathbf{b}_t = (\mathbf{b}_t(\alpha))_{\alpha}$ is a vector of independent standard Brownian motions, and $\Sigma^{1/2}$ is the square root of the covariance matrix corresponding to $\mathbf{H} = \mathbf{H}_0$:
\[
  \Sigma(\alpha, \beta) \;:=\; \mathbb{E}\, \mathbf{w}_0(\alpha)\, \mathbf{w}_0(\beta) .
\]
Recall the notation $B_{\tau}(x) = \{ y \in \mathbb{X} : d(x,y) \le \tau \}$ for any $x \in \mathbb{X}$, and set
\[
  B_k((x,y)) \;:=\; \bigl( B_{k N^{\varepsilon}}(x) \times B_{k N^{\varepsilon}}(y) \bigr) \cup \bigl( B_{k N^{\varepsilon}}(y) \times B_{k N^{\varepsilon}}(x) \bigr) ,
  \qquad k = 1, 2 .
\]
Using (2.11) and B3 we see that, for any $\alpha$,
\[
  |B_2(\alpha)| \;\le\; N^{C\varepsilon}
  \qquad \text{and} \qquad
  |\Sigma(\alpha, \gamma)| \;\le\; C(\varepsilon, \nu)\, N^{-\nu} ,
  \quad \gamma \notin B_1(\alpha) ,
  \tag{7.16}
\]
respectively. For any fixed $\alpha$, we denote by $\mathbf{w}^{\alpha}$ the vector obtained by removing all the entries of $\mathbf{w}$ which may become strongly dependent on the component $\mathbf{w}(\alpha)$ along the flow (7.15), i.e., we define
\[
  \mathbf{w}^{\alpha}(\gamma) \;:=\; \mathbf{w}(\gamma)\, \mathbb{1}\bigl( \gamma \notin B_2(\alpha) \bigr) .
  \tag{7.17}
\]
In the case that $H$ has independent entries, it was proven in [12] that the process (7.15) conserves the local eigenvalue statistics of $H$ up to times $t \ll N^{-1/2}$, provided a bulk local law holds uniformly in $t$ along the flow as well. We will now show that this insight extends to dependent random matrices as well. The following result is a straightforward generalization of Lemma A.1 from [12] to matrices with dependent entries. We remark that a similar result was independently given in [14].

Lemma 7.4 (Continuity of the OU flow). For every $\varepsilon > 0$, $\nu \in \mathbb{N}$ and smooth function $f$ there is $C(\varepsilon,\nu) < \infty$ such that
$$
|\mathbb{E}\, f(w_t) - \mathbb{E}\, f(w_0)| \,\le\, C(\varepsilon,\nu) \bigl( N^{1/2+\varepsilon}\, \Xi + N^{-\nu}\, \widetilde{\Xi} \bigr)\, t\,,
\qquad (7.18)
$$
where
$$
\Xi \,:=\, \sup_{s \le t}\, \max_{\alpha,\delta,\gamma}\, \sup_{\theta \in [0,1]} \mathbb{E} \Bigl[ \bigl( N^{1/2} |w_s(\alpha)| + N^{3/2} |w_s(\alpha) w_s(\delta) w_s(\gamma)| \bigr) \bigl| \partial^3_{\alpha\delta\gamma} f(w_s^{\alpha,\theta}) \bigr| \Bigr]\,,
$$
$$
\widetilde{\Xi} \,:=\, \sup_{\widetilde{w}}\, \max_{\alpha,\delta,\gamma} \Bigl( \bigl| \partial^2_{\alpha\delta} f(\widetilde{w}) \bigr| + \bigl( 1 + |\widetilde{w}(\alpha)| \bigr) \bigl| \partial^3_{\alpha\delta\gamma} f(\widetilde{w}) \bigr| \Bigr)\,,
\qquad (7.19)
$$
where $w_s^{\alpha,\theta} := w_s^\alpha + \theta\, (w_s - w_s^\alpha)$ for $\theta \in [0,1]$, and $\partial^k_{\alpha_1 \cdots \alpha_k} = \frac{\partial}{\partial w(\alpha_1)} \cdots \frac{\partial}{\partial w(\alpha_k)}$.

Proof. We will suppress the $t$-dependence, i.e. we write $w = w_t$, etc. Itô's formula yields
$$
\mathrm{d} f(w) \,=\, \sum_\alpha \Bigl( -\frac{w(\alpha)}{2}\, \partial_\alpha f(w) + \frac{1}{2} \sum_\delta \Sigma(\alpha,\delta)\, \partial^2_{\alpha\delta} f(w) \Bigr)\, \mathrm{d} t \,+\, \mathrm{d} M\,,
\qquad (7.20)
$$
where $\mathrm{d} M = \mathrm{d} M_t$ is a martingale term. Taylor expansion around $w = w^\alpha$ yields
$$
\partial_\alpha f(w) \,=\, \partial_\alpha f(w^\alpha) + \sum_{\delta \in B_2(\alpha)} w(\delta)\, \partial^2_{\alpha\delta} f(w^\alpha) + \sum_{\delta,\gamma \in B_2(\alpha)} w(\delta) w(\gamma) \int_0^1 (1-\theta)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta})\, \mathrm{d}\theta\,,
$$
$$
\partial^2_{\alpha\delta} f(w) \,=\, \partial^2_{\alpha\delta} f(w^\alpha) + \sum_{\gamma \in B_2(\alpha)} w(\gamma) \int_0^1 (1-\theta)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta})\, \mathrm{d}\theta\,.
$$
By plugging these into (7.20) and taking the expectation, we obtain
$$
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d} t}\, \mathbb{E}\, f(w)
\,=\,& -\frac{1}{2} \sum_\alpha \mathbb{E}\, w(\alpha)\, \partial_\alpha f(w^\alpha)
&& (7.21\mathrm{a}) \\
& - \frac{1}{2} \sum_\alpha \sum_{\delta \in B_2(\alpha)} \mathbb{E} \Bigl[ \bigl( w(\alpha) w(\delta) - \Sigma(\alpha,\delta) \bigr)\, \partial^2_{\alpha\delta} f(w^\alpha) \Bigr]
&& (7.21\mathrm{b}) \\
& + \frac{1}{2} \sum_\alpha \sum_{\delta \notin B_2(\alpha)} \Sigma(\alpha,\delta)\, \mathbb{E}\, \partial^2_{\alpha\delta} f(w^\alpha)
&& (7.21\mathrm{c}) \\
& - \frac{1}{2} \sum_\alpha \sum_{\delta,\gamma \in B_2(\alpha)} \int_0^1 (1-\theta)\, \mathbb{E} \Bigl[ w(\alpha) w(\delta) w(\gamma)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta}) \Bigr]\, \mathrm{d}\theta
&& (7.21\mathrm{d}) \\
& + \frac{1}{2} \sum_{\alpha,\delta}\, \sum_{\gamma \in B_2(\alpha)} \Sigma(\alpha,\delta) \int_0^1 (1-\theta)\, \mathbb{E} \Bigl[ w(\gamma)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta}) \Bigr]\, \mathrm{d}\theta\,.
&& (7.21\mathrm{e})
\end{aligned}
$$

Now we estimate the five terms on the right hand side of (7.21) separately. First, (7.21a) is small since $w(\alpha)$ is almost independent of $w^\alpha$ by B3 and (7.17):
$$
\mathbb{E}\, w(\alpha)\, \partial_\alpha f(w^\alpha) \,=\, \mathbb{E}\, w(\alpha)\; \mathbb{E}\, \partial_\alpha f(w^\alpha) + \operatorname{Cov}\bigl( w(\alpha),\, \partial_\alpha f(w^\alpha) \bigr) \,=\, O_{\varepsilon,\nu}\bigl( \widetilde{\Xi}\, N^{-\nu} \bigr)\,.
$$
In the term (7.21b), if $\delta \in B_1(\alpha)$, then $w(\alpha) w(\delta)$ is almost independent of $w^\alpha$:
$$
\mathbb{E} \Bigl[ \bigl( w(\alpha) w(\delta) - \Sigma(\alpha,\delta) \bigr)\, \partial^2_{\alpha\delta} f(w^\alpha) \Bigr] \,=\, \operatorname{Cov}\bigl( w(\alpha) w(\delta),\, \partial^2_{\alpha\delta} f(w^\alpha) \bigr) \,=\, O_{\varepsilon,\nu}\bigl( \widetilde{\Xi}\, N^{-\nu} \bigr)\,.
$$
If $\delta \in B_2(\alpha) \setminus B_1(\alpha)$, then $w(\alpha)$ is almost independent of $(w(\delta), w^\alpha)$ and
$$
\begin{aligned}
\Bigl| \mathbb{E} \Bigl[ \bigl( w(\alpha) w(\delta) - \Sigma(\alpha,\delta) \bigr)\, \partial^2_{\alpha\delta} f(w^\alpha) \Bigr] \Bigr|
\,&\le\, \bigl| \operatorname{Cov}\bigl( w(\alpha),\, w(\delta)\, \partial^2_{\alpha\delta} f(w^\alpha) \bigr) \bigr| + |\Sigma(\alpha,\delta)|\, \bigl| \mathbb{E}\, \partial^2_{\alpha\delta} f(w^\alpha) \bigr| \\
&\le\, C(\varepsilon,\nu)\, \sup_w \max_{\alpha,\delta,\gamma} \Bigl( |\partial^2_{\alpha\delta} f(w)| + |w(\alpha)|\, |\partial^3_{\alpha\delta\gamma} f(w)| \Bigr)\, N^{-\nu}\,,
\end{aligned}
$$
where we have used (7.16). The last term containing derivatives is bounded by $\widetilde{\Xi}$. The term (7.21c) is negligible by $|\Sigma(\alpha,\delta)| \lesssim_{\varepsilon,\nu} N^{-\nu}$ and $|\mathbb{E}\, \partial^2_{\alpha\delta} f(w^\alpha)| \le \widetilde{\Xi}$. For (7.21d) we use (7.16) and the definition of $\Xi$ to obtain
$$
\Bigl| \sum_\alpha \sum_{\delta,\gamma \in B_2(\alpha)} \int_0^1 (1-\theta)\, \mathbb{E} \Bigl[ w(\alpha) w(\delta) w(\gamma)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta}) \Bigr]\, \mathrm{d}\theta \Bigr| \,\le\, N^{1/2+C\varepsilon}\, \Xi\,.
$$
The last term (7.21e) is estimated similarly:
$$
\Bigl| \sum_{\gamma \in B_2(\alpha)} \int_0^1 (1-\theta)\, \mathbb{E} \Bigl[ w(\gamma)\, \partial^3_{\alpha\delta\gamma} f(w^{\alpha,\theta}) \Bigr]\, \mathrm{d}\theta \Bigr| \,\le\, N^{-1/2+C\varepsilon}\, \Xi\,,
$$

and the double sum over α, δ produces a factor of size CN due to the exponential decay of Σ. Combining the estimates for the five terms on the right hand side of (7.21) we obtain (7.18). Proof of Corollaries 2.12 and 2.13. We will only sketch the argument here as the procedure is standard. First we show that the matrix Ht defined through (7.14) satisfies the bulk universality if t ě tN :“ N ´1`ξ1 , for any ξ1 ą 0. For simplicity, we will focus on t ď N ´1{2 only. Indeed, from the fullness assumption B5 it follows that Ht is of the form r t ` cptqt1{2 U , Ht “ H

(7.22)

r t . For t ď N ´1{2 the where cptq „ 1 and U is a GUE/GOE-matrix independent of H r t has essentially the same correlation structure as H, controlled by essentially matrix H the same model parameters. In particular the corresponding Srt operator is almost the Ă t solve the corresponding MDE with S replaced by Srt and let ρrt denote same as S. Let M Ă t similarly as ρ is related to M (see Definition 2.3). Using the the function related to M general stability for MDEs, i.e., Theorem 2.6 with Gp0q :“ M ,

Ăt s M Ăt , D :“ p Srt ´ S qr M

Ăt , GpDq “ M

Ă t is close to M, in particular ρrt pωq ě δ{2 when (cf. (2.17)) it is easy to check that M r t as well, i.e. the resolvent G r t pζq of ρpωq ě δ. Moreover, the local law applies to H r t approaches M Ă t pζq for spectral parameters ζ with ρpRe ζq ě δ. The bulk spectrum H r t is therefore the same as that of H in the limit. Combining these facts with the of H decomposition (7.22) we can apply Theorem 2.2 from the recent work [41] to conclude bulk universality for Ht , with t “ tN “ N ´1`ξ1 . This result yields universality both in the sense of correlation functions and consecutive gaps. We remark that in order to prove the gap universality Corollary 2.13 as well as a version of Corollary 2.12 where the energy is averaged over a small interval of length N ´1`ε one may also use Theorems 2.4 and 2.5 from [42] or Theorem 2.1 and Remark 2.2 from [26]. 57

Second, we use Lemma 7.4 to show that $H$ and $H_{t_N}$ have the same local correlation functions in the bulk. Suppose $\rho(\omega) \ge \delta$ for some $\omega \in \mathbb{R}$. We show that the difference
$$
(\tau_1, \dots, \tau_k) \,\mapsto\, \bigl( \rho_{k;t_N} - \rho_k \bigr)\Bigl( \omega + \frac{\tau_1}{N},\, \dots,\, \omega + \frac{\tau_k}{N} \Bigr)
$$
of the local $k$-point correlation functions $\rho_k$ and $\rho_{k;t_N}$ of $H$ and $H_{t_N}$, respectively, converges weakly to zero as $N \to \infty$. This convergence follows from standard arguments provided that
$$
| \mathbb{E}\, F(H_{t_N}) - \mathbb{E}\, F(H) | \,\to\, 0\,,
\qquad (7.23)
$$
where $F = F_N$ is a function of $H$ expressed as a smooth function $\Phi$ of the following observables
$$
\frac{1}{N^p}\, \operatorname{Tr} \prod_{j=1}^p \bigl( H - \zeta_j^\pm \bigr)^{-1}\,,
\qquad
\zeta_j^\pm \,:=\, \omega + \frac{\tau_{i_j}}{N} \pm \mathrm{i}\, N^{-1-\xi_2}\,,
\qquad j = 1, \dots, p\,,
$$

with $p \le k$ and $\xi_2 \in (0,1)$ sufficiently small. Here the derivatives of $\Phi$ may grow only as a negligible power of $N$ (for details see the proof of Theorem 6.4 in [27]). In particular, basic resolvent formulas yield
$$
\text{RHS of (7.18)} \,\le\, C\, N^{\varepsilon'}\, N^{C_1 \xi_2}\, \mathbb{E}\bigl[ (1 + \Lambda_t)^{C_2} \bigr]\, N^{1/2+\varepsilon}\, t\,,
\qquad (7.24)
$$
where $\Lambda_t$ is defined like $\Lambda$ in (3.6), but for the entries of $G_t(\zeta) := (H_t - \zeta)^{-1}$ with $\operatorname{Im} \zeta \ge N^{-1+\xi_2}$. In particular, we have used $|G_{xy}(t)| \le |m_{xy}(t)| + \Lambda_t \lesssim 1 + \Lambda_t$ here. The constant $\Xi$ from (7.19) is easily bounded by $N^{\varepsilon' + C\xi_2}$, where the arbitrarily small constant $\varepsilon' > 0$ originates from stochastic domination estimates for $\Lambda_t$ and the $|w_s(\alpha)|$'s. The constant $\widetilde{\Xi}$ from (7.19), on the other hand, is trivially bounded by $N^C$, since the resolvents satisfy trivial bounds in the regime $|\operatorname{Im} \zeta| \ge N^{-2}$, and the weight $|w(\alpha)|$ multiplying the third derivatives of $f$ is canceled for large values of $|w(\alpha)|$ by the inverse in the definition $G = (A + \frac{1}{\sqrt N} W - \zeta \mathbf{1})^{-1}$. Since the local law holds for $H_t$, uniformly in $t \in [0, t_N]$, we see that $\Lambda_t \le N^{\varepsilon'} (N\eta)^{-1/2} \le N^{\varepsilon' - \xi_2/2}$ with very high probability, and hence
$$
| \mathbb{E}\, F(H_{t_N}) - \mathbb{E}\, F(H) | \,\le\, C(\varepsilon)\, N^{1/2+\varepsilon}\, N^{-1+\xi_1}\, N^{C(\varepsilon' + \xi_2)}\,.
\qquad (7.25)
$$
Choosing the exponents $\varepsilon, \varepsilon', \xi_1, \xi_2$ sufficiently small, we see that the right hand side goes to zero as $N \to \infty$. This completes the proof of Corollary 2.12. Finally, the comparison estimate (7.25) and the rigidity bound (2.33) allow us to compare the gap distributions of $H_{t_N}$ and $H$; see Theorem 1.10 of [40]. This proves Corollary 2.13.

A Appendix

Proof of Theorem 2.8. Here we list the necessary changes when proving our main results under the weaker assumption A1' instead of A1. There are only three places in our proofs where the lower bound of (2.7) has been used:

(i) In the proof of the Hilbert-Schmidt bound (4.12), to infer (4.14) from (4.13).

(ii) In the proof of the lower bound from (4.11), because (4.14) is used in (4.16).

(iii) In the proof of the lower bound in the property (4.35) of the operator $F$.

The adjustments of the arguments in (i)-(iii) result in a weaker bound on the eigenmatrix $\mathrm{F}$, namely (4.27) is replaced by
$$
\| M \|^{-C_1}\, \mathbf{1} \,\lesssim\, \mathrm{F} \,\lesssim\, \| M \|^{C_2}\, \mathbf{1}\,,
$$
where the positive constants $C_1, C_2$ may depend on $L$ from A1'. The lower bound on the spectral gap $\vartheta$ in (4.28) is modified in a similar fashion, i.e., $\vartheta \gtrsim \| M \|^{-C_3}$ for some positive constant $C_3$, depending only on $L$. These two changes imply that the constant $C$ in (4.18) depends on $L$ as well, and so does the constant $c$ from Proposition 2.2.

Modification of (i): Recall the notations, the model parameters and the partition $A_1, \dots, A_K$ from A1', and that $P_k$ denotes the projection onto the subspace of vectors with support in $A_k$. The matrix $Z$ satisfies $z_{kk} = 1$ and $(Z^L)_{kl} \ge 1$ for every $k,l = 1, \dots, K$. The proof of (4.12) continues from (4.13) as follows: We multiply with $P_j$ on both sides of (4.13) and take the normalized trace. Then we use (2.21) to get
$$
\langle P_j \operatorname{Im} M \rangle \,\gtrsim\, \sum_{k,l=1}^K z_{kl}\, \langle P_k \operatorname{Im} M \rangle\, \| P_j M P_l \|_{\mathrm{hs}}^2\,.
\qquad (\mathrm{A.1})
$$

A convenient way of writing (A.1) is by introducing the shorthand
$$
t_{kl} \,:=\, \| P_k M P_l \|_{\mathrm{hs}}^2\,,
\qquad
w_k \,:=\, \langle P_k \operatorname{Im} M \rangle\,.
$$
With these definitions (A.1) becomes a componentwise inequality for vectors in $\mathbb{R}^K$ with non-negative entries,
$$
w \,\gtrsim\, T Z w\,.
\qquad (\mathrm{A.2})
$$
Note that we used $Z = Z^t$, which can be assumed because $S$ is symmetric with respect to the scalar product (2.1) on $\mathbb{C}^{N \times N}$. Since the diagonal entries of $Z$ are bounded away from zero by A1', we also have $w \gtrsim T w$. Since $\operatorname{Im} M$ is positive definite, we conclude that $w$ has positive entries. Since $w \gtrsim T w$, the Perron-Frobenius theorem implies that the symmetric matrix $T$ with non-negative entries has a bounded spectral norm. Therefore, (4.12) holds because
$$
\| M \|_{\mathrm{hs}}^2 \,=\, \sum_{k,l=1}^K t_{kl} \,\lesssim\, 1\,.
$$

Modification of (ii): Instead of using (4.14) in the first inequality of (4.16), we use (4.13) and the lower bound
$$
S[\operatorname{Im} M] \,\gtrsim\, \rho\, \mathbf{1}\,.
\qquad (\mathrm{A.3})
$$
We provide a proof of (A.3) under the assumption A1': Iterating (A.2) $2L$ times shows that the entries of $w$ are bounded from below by their sum,
$$
w \,\gtrsim\, (T Z T Z)^L w \,\gtrsim\, (T^2 Z)^L w \,\gtrsim\, Z^L w \,\gtrsim\, \sum_{k=1}^K w_k \,=\, \langle \operatorname{Im} M \rangle\,.
\qquad (\mathrm{A.4})
$$
In the second and fourth inequalities we used A1', namely $z_{kk} = 1$ and $(Z^L)_{kl} \ge 1$, while in the third inequality the lower bound on the diagonal elements of $T^2$,
$$
\sum_{l=1}^K t_{kl}^2 \,\gtrsim\, \Bigl( \sum_{l=1}^K t_{kl} \Bigr)^{2} \,=\, \langle P_k M M^* \rangle^2 \,\ge\, \frac{|A_k|^2}{N^2\, \| M^{-1} \|^4} \,\gtrsim\, 1\,,
\qquad (\mathrm{A.5})
$$
was used. The last inequality of (A.5) holds because $\| M^{-1} \| \lesssim 1$ by (4.10) and $|A_k| \sim N$ by A1'. With (2.21) and $z_{kk} = 1$ we see the first inequality of
$$
S[\operatorname{Im} M] \,\gtrsim\, \sum_{k=1}^K w_k\, P_k \,\gtrsim\, \langle \operatorname{Im} M \rangle\, \mathbf{1}\,.
\qquad (\mathrm{A.6})
$$

The second inequality of (A.6) follows from (A.4).

Modification of (iii): Without the lower bound from (2.7), the lower bound from (4.35) may also no longer be valid. Thus the property (4.35) is replaced by:

• For all matrices $R \in \overline{\mathcal{C}}_+$ the $L$-th iterate of $F$ satisfies
$$
\| M \|^{-4L}\, \langle R \rangle\, \mathbf{1} \,\lesssim\, F^L[R] \,\lesssim\, \| M \|^{6L}\, \langle R \rangle\, \mathbf{1}\,.
\qquad (\mathrm{A.7})
$$

Here $L$ is the model parameter from Assumption A1'. The rest of the argument in the proof of Lemma 4.6, following the statement of the property (4.35) until the start of the proof of (4.35), remains unchanged.

Proof of (A.7): The upper bound in (A.7) follows from iterating the upper bound of (4.35), which still holds and whose proof does not require any adjustment. For the lower bound in (A.7) we will use Assumption A1'. We use the abbreviation
$$
F \,=\, C^* S C\,,
\qquad
C \,:=\, C_{\sqrt{\operatorname{Im} M}}\, C_W\,.
\qquad (\mathrm{A.8})
$$
We will now show that for any $l = 1, \dots, L$ the $l$-th iterate of $F$ satisfies
$$
F^l[R] \,\gtrsim\, \| M \|^{-4(l-1)} \sum_{k,j=1}^K (Z^l)_{kj}\, \langle C^*[P_k]\, R \rangle\, C^*[P_j]\,.
\qquad (\mathrm{A.9})
$$
The lower bound in claim (A.7) follows from (A.9) by taking $l = L$ and using $(Z^L)_{kj} \gtrsim 1$ (cf. A1'), as well as
$$
C^*[\mathbf{1}] \,=\, C_W[\operatorname{Im} M] \,\gtrsim\, \| M \|^{-2}\, \mathbf{1}\,,
$$
which is a consequence of the lower bounds in (4.11) and (4.24). We prove (A.9) by induction on $l$. For $l = 1$ we simply combine the definition of $F$ with (2.21). For the induction step we assume (A.9) for $l < L$. Acting with $F$ on both sides and applying (2.21) again reveals
$$
F^{l+1}[R] \,\gtrsim\, \| M \|^{-4(l-1)} \sum_{k,j=1}^K \sum_{i=1}^K (Z^l)_{ki}\, \langle C^*[P_i]^2 \rangle\, z_{ij}\, \langle C^*[P_k]\, R \rangle\, C^*[P_j]\,.
\qquad (\mathrm{A.10})
$$
Now we use $\langle P_i \rangle = \frac{1}{N} |A_i| \gtrsim 1$ and the lower bounds from (4.11) and (4.24) again to get
$$
\langle C^*[P_i]^2 \rangle \,=\, \Bigl\langle P_i\, \sqrt{\operatorname{Im} M}\, W^2 \sqrt{\operatorname{Im} M}\; P_i\, \sqrt{\operatorname{Im} M}\, W^2 \sqrt{\operatorname{Im} M} \Bigr\rangle \,\gtrsim\, \| M \|^{-4}\, \langle P_i \rangle \,\gtrsim\, \| M \|^{-4}\,.
\qquad (\mathrm{A.11})
$$
Plugging (A.11) into (A.10) shows the induction step. This finishes the proof of (A.9) and concludes the proof of (A.7) as well.

Lemma A.1 (Combes-Thomas estimate). Let $\beta = (\beta(\nu))_{\nu \in \mathbb{N}}$ be a positive sequence and $C > 0$. Suppose that $R \in \mathbb{C}^{N \times N}$ is invertible and satisfies $\| R^{-1} \| \le C$ and
$$
|r_{xy}| \,\le\, \frac{\beta(\nu)}{(1 + d(x,y))^\nu}\,,
\qquad \nu \in \mathbb{N}\,,\ x,y \in X\,.
\qquad (\mathrm{A.12})
$$
Then there exists another positive sequence $\alpha = (\alpha(\nu))_{\nu \in \mathbb{N}}$, depending only on $\beta$, $C$ and $P$ (cf. (2.11)), such that
$$
|(R^{-1})_{xy}| \,\le\, \frac{\alpha(\nu)}{(1 + d(x,y))^\nu}\,,
\qquad \nu \in \mathbb{N}\,,\ x,y \in X\,.
\qquad (\mathrm{A.13})
$$

Proof. We fix $\nu \in \mathbb{N}$. For any $\varepsilon \in (0,1]$ we define the diagonal matrices
$$
D_{u,\varepsilon} \,:=\, \operatorname{diag}\Bigl( \bigl( 1 + \tfrac{\varepsilon}{\nu}\, d(x,u) \bigr)^{\nu} \Bigr)_{x=1}^N\,,
\qquad u \in X\,.
$$

We estimate the size of the entries of $D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R$ via
$$
|(D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R)_{xy}| \,=\, \biggl| \Bigl( \frac{1 + \frac{\varepsilon}{\nu}\, d(u,y)}{1 + \frac{\varepsilon}{\nu}\, d(x,u)} \Bigr)^{\nu} - 1 \biggr|\, |r_{xy}|\,.
$$
If $d(u,y) \ge d(x,u)$, then we use $1 + \frac{\varepsilon}{\nu} d(u,y) \le (1 + \frac{\varepsilon}{\nu} d(x,u))(1 + \frac{\varepsilon}{\nu} d(x,y))$. If $d(u,y) \le d(x,u)$, then we first estimate
$$
\biggl| \Bigl( \frac{1 + \frac{\varepsilon}{\nu}\, d(u,y)}{1 + \frac{\varepsilon}{\nu}\, d(x,u)} \Bigr)^{\nu} - 1 \biggr| \,\le\, \Bigl( \frac{1 + \frac{\varepsilon}{\nu}\, d(x,u)}{1 + \frac{\varepsilon}{\nu}\, d(u,y)} \Bigr)^{\nu} - 1\,,
$$
and afterwards use $1 + \frac{\varepsilon}{\nu} d(x,u) \le (1 + \frac{\varepsilon}{\nu} d(u,y))(1 + \frac{\varepsilon}{\nu} d(x,y))$. In both cases we find the upper bound
$$
|(D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R)_{xy}| \,\le\, \Bigl( \bigl( 1 + \tfrac{\varepsilon}{\nu}\, d(x,y) \bigr)^{\nu} - 1 \Bigr)\, |r_{xy}|\,.
$$
With $(1 + \frac{\varepsilon}{\nu} \tau)^{\nu} - 1 \le \varepsilon\, \tau\, (1 + \tau)^{\nu}$ for any $\tau \ge 0$ and (A.12) we get
$$
|(D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R)_{xy}| \,\le\, \varepsilon\, \frac{\beta(2\nu)\, d(x,y)}{(1 + d(x,y))^{\nu}}\,.
$$

Now we estimate the matrix $D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R$ in the operator norms $\| \cdot \|_\infty$ and $\| \cdot \|_1$ (cf. (4.51)) induced by the maximum norm and the $\ell^1$-norm on $\mathbb{C}^N$,
$$
\max_{p \in \{1, \infty\}} \| D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R \|_p \,\le\, \varepsilon\, \kappa(\nu)\,,
\qquad
\kappa(\nu) \,:=\, \sup_d\, \max_{x \in X} \sum_{y \in X} \frac{\beta(2\nu)\, d(x,y)}{(1 + d(x,y))^{\nu}}\,,
\qquad (\mathrm{A.14})
$$
where the supremum is taken over all pseudometrics $d$ that satisfy (2.11). The sequence $\kappa$ depends only on $\beta$ and $P$, and for large enough $\nu$ we have $\kappa(\nu) < \infty$. Since $\| \cdot \| \le \max\{ \| \cdot \|_\infty, \| \cdot \|_1 \}$ by interpolation and $\| R^{-1} \| \le C$, we can use the identity
$$
D_{u,\varepsilon}^{-1} R^{-1} D_{u,\varepsilon} \,=\, \bigl( 1 + R^{-1} ( D_{u,\varepsilon}^{-1} R D_{u,\varepsilon} - R ) \bigr)^{-1} R^{-1}\,,
$$
together with (A.14) to conclude $\| D_{u,\varepsilon}^{-1} R^{-1} D_{u,\varepsilon} \| \le 2C$ for the choice $\varepsilon := \frac{1}{2 \kappa(\nu)}$. In particular,
$$
2C \,\ge\, |(D_{x,\varepsilon}^{-1} R^{-1} D_{x,\varepsilon})_{xy}| \,=\, \Bigl( 1 + \frac{d(x,y)}{2 \nu\, \kappa(\nu)} \Bigr)^{\nu}\, |(R^{-1})_{xy}| \,\ge\, \frac{2C\, (1 + d(x,y))^{\nu}}{\alpha(\nu)}\, |(R^{-1})_{xy}|\,,
$$
where we may choose $\alpha(\nu) := 2C\, (1 + 2 \nu\, \kappa(\nu))^{\nu}$. This finishes the proof of (A.13).
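Lemma A.1 is easy to probe numerically. The sketch below is an illustration only, with $X = \{0, \dots, N-1\}$ and $d(x,y) = |x-y|$ (a choice compatible with (2.11)): a matrix with polynomially decaying entries and a bounded inverse has an inverse whose entries decay away from the diagonal as well.

```python
import numpy as np

# Numerical illustration (a sketch, not the proof) of Lemma A.1: if the entries
# of R decay away from the diagonal and ||R^{-1}|| is bounded, then the entries
# of R^{-1} decay as well.  Here X = {0,...,N-1} with d(x,y) = |x-y|.
N, nu = 200, 3
x = np.arange(N)
dist = np.abs(x[:, None] - x[None, :])
R = 2.0 * np.eye(N) + 0.5 / (1.0 + dist) ** nu   # decaying entries, diagonally dominant

Rinv = np.linalg.inv(R)
# the inverse inherits the off-diagonal decay: far entries are tiny
near = np.abs(Rinv[0, 1])
far = np.abs(Rinv[0, N - 1])
print(near, far)   # far << near
```

Here diagonal dominance guarantees $\|R^{-1}\| \lesssim 1$, so the hypotheses of the lemma hold by inspection.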


Lemma A.2. Let $R \in \mathbb{C}^{N \times N}$ be such that
$$
|r_{xy}| \,\le\, \frac{\beta(\nu)}{(1 + d(x,y))^{\nu}} + \frac{\beta(0)}{N}
$$
holds for every $\nu \in \mathbb{N}$ and $x,y \in X$ with some positive sequence $\beta = (\beta(\nu))_{\nu=0}^{\infty}$, and suppose $\| R^{-1} \| \le 1$. Then there exists a sequence $\alpha = (\alpha(\nu))_{\nu=0}^{\infty}$, depending only on $\beta$ and $P$ (cf. (2.11)), such that
$$
|(R^{-1})_{xy}| \,\le\, \frac{\alpha(\nu)}{(1 + d(x,y))^{\nu}} + \frac{\alpha(0)}{N}\,,
\qquad (\mathrm{A.15})
$$
for every $\nu \in \mathbb{N}$ and $x,y \in X$.

Proof. Within this proof we adapt Convention 4.1 such that $\varphi \lesssim \psi$ means $\varphi \le C \psi$ for a constant $C$, depending only on $\mathcal{P} := (\beta, P)$. It suffices to prove (A.15) for $N \ge N_0$ for some threshold $N_0 \lesssim 1$. Thus, we will often assume $N$ to be large enough in the following. We split $R$ into a decaying component $S$ and an entrywise small component $T$, i.e. we define
$$
R \,=\, S + T\,,
\qquad
s_{xy} \,:=\, r_{xy}\, \mathbf{1}\Bigl( |r_{xy}| \ge \frac{2 C_1}{N} \Bigr)\,.
\qquad (\mathrm{A.16})
$$
The main part of the proof of Lemma A.2 is to show that $S$ has a bounded inverse,
$$
\| S^{-1} \| \,\lesssim\, 1\,.
\qquad (\mathrm{A.17})
$$

We postpone the proof of (A.17) and show first how it is used to establish (A.15). Since the entries of $S$ are decaying as
$$
|s_{xy}| \,\le\, \frac{\beta(\nu)}{(1 + d(x,y))^{\nu}}
$$
for any $\nu \in \mathbb{N}$, we can apply Lemma A.1 in order to get the decay of the entries of $S^{-1}$ to arbitrarily high polynomial order $\nu \in \mathbb{N}$,
$$
|(S^{-1})_{xy}| \,\le\, \frac{C(\nu)}{(1 + d(x,y))^{\nu}}\,.
\qquad (\mathrm{A.18})
$$
In particular, we find that the $\| \cdot \|_{1 \vee \infty}$-norm (introduced in (4.50)) of $S^{-1}$ is bounded,
$$
\| S^{-1} \|_{1 \vee \infty} \,\lesssim\, 1\,.
\qquad (\mathrm{A.19})
$$
We show now that $\| R^{-1} - S^{-1} \|_{\max} \lesssim \frac{1}{N}$, which together with (A.18) implies (A.15). For a matrix $Q \in \mathbb{C}^{N \times N}$, viewed as an operator mapping between $\mathbb{C}^N$ equipped with the standard Euclidean norm and the maximum norm, we use the induced operator norms
$$
\| Q \|_{2 \to \infty} \,:=\, \max_x \Bigl( \sum_y |q_{xy}|^2 \Bigr)^{1/2}\,,
\qquad
\| Q \|_{\infty \to 2} \,:=\, \Bigl( \sum_x \Bigl( \sum_y |q_{xy}| \Bigr)^{2} \Bigr)^{1/2}\,.
$$
We write the difference between $R^{-1}$ and $S^{-1}$ as
$$
-\,S^{-1} T R^{-1} \,=\, R^{-1} - S^{-1} \,=\, -\,R^{-1} T S^{-1}\,.
\qquad (\mathrm{A.20})
$$

The first equality in (A.20) implies
$$
\| R^{-1} \|_{\infty} \,\le\, \| S^{-1} \|_{\infty} \bigl( 1 + \| T \|_{2 \to \infty}\, \| R^{-1} \|_{\infty \to 2} \bigr)
\,\le\, \| S^{-1} \|_{\infty} \bigl( 1 + N\, \| T \|_{\max}\, \| R^{-1} \| \bigr) \,\lesssim\, 1\,,
\qquad (\mathrm{A.21})
$$
where we used
$$
\| Q \|_{\infty \to 2} \,\le\, \sqrt{N}\, \| Q \|\,,
\qquad
\| Q \|_{2 \to \infty} \,\le\, \sqrt{N}\, \| Q \|_{\max}\,,
$$
for any $Q \in \mathbb{C}^{N \times N}$, as well as (A.19) and $\| T \|_{\max} \lesssim \frac{1}{N}$ from the definition of $T$ in (A.16). The second equality in (A.20) on the other hand implies
$$
\| R^{-1} - S^{-1} \|_{\max} \,\le\, \| R^{-1} \|_{\infty}\, \| T \|_{\max}\, \| S^{-1} \|_{1} \,\lesssim\, \frac{1}{N}\,,
$$

where (A.19) and (A.21) were used in the second inequality. This finishes the proof of Lemma A.2 up to verifying (A.17). We split $R^* R$ into a decaying and an entrywise small piece, as we did with $R$ itself in (A.16),
$$
R^* R \,=\, L + K\,,
\qquad
L \,:=\, S^* S\,,
\qquad
K \,:=\, S^* T + T^* S + T^* T\,.
$$
From the related properties of $S$ and $T$ we can easily see that $L = (l_{xy})_{x,y}$ is decaying and $K$ has entries of order $\frac{1}{N}$, i.e.
$$
|l_{xy}| \,\le\, \frac{C(\nu)}{(1 + d(x,y))^{\nu}}\,,
\qquad
\| K \|_{\max} \,\lesssim\, \frac{1}{N}\,.
\qquad (\mathrm{A.22})
$$
Using the a priori knowledge
$$
L + K \,=\, R^* R \,\gtrsim\, 1\,,
\qquad (\mathrm{A.23})
$$

from the assumption $\| R^{-1} \| \lesssim 1$ of Lemma A.2, we will show that $\| L^{-1} \| \lesssim 1$, which is equivalent to (A.17). Note that both $L$ and $K$ are self-adjoint. Via spectral calculus we write $K$ as a sum of a matrix $K_s$ with small spectral norm and a matrix $K_b$ with bounded rank,
$$
K \,=\, K_s + K_b\,,
\qquad
K_s \,:=\, K\, \mathbf{1}_{(-\varepsilon, \varepsilon)}(K)\,,
\qquad (\mathrm{A.24})
$$
with some $\varepsilon > 0$ determined later. Indeed, from the Hilbert-Schmidt norm bound on the eigenvalues $\lambda_i(K)$ of $K$,
$$
\sum_i \lambda_i(K)^2 \,=\, \operatorname{Tr} K^* K \,\le\, N^2\, \| K \|_{\max}^2 \,\lesssim\, 1\,,
$$
we see that $\operatorname{rank} K_b \lesssim \frac{1}{\varepsilon^2}$. On the other hand $\| K_s \| \le \varepsilon$ by its definition in (A.24). Since $L + K$ has a bounded inverse (cf. (A.23)), so does $L + K_b$ for small enough $\varepsilon$, i.e.
$$
\| ( L + K_b )^{-1} \| \,\lesssim\, 1\,.
\qquad (\mathrm{A.25})
$$

Now we fix $\varepsilon \sim 1$ such that (A.25) is satisfied. In particular, the eigenvalues of $L + K_b$ are separated away from zero. Since $\operatorname{rank} K_b \lesssim 1$, we can apply the interlacing property of rank-one perturbations finitely many times to see that there are only finitely many eigenvalues of $L$ in a neighborhood of zero, i.e.
$$
\operatorname{rank}\bigl[ \mathbf{1}_{[0, c_1)}(L) \bigr] \,\lesssim\, 1\,,
\qquad (\mathrm{A.26})
$$
for some constant $c_1 \sim 1$. In particular, there are constants $c_2 \sim c_3 \sim 1$ such that $c_2 + c_3 \le c_1$ and $L$ has a spectral gap at $[c_2, c_2 + c_3]$,
$$
\mathbf{1}_{[c_2,\, c_2 + c_3]}(L) \,=\, 0\,.
\qquad (\mathrm{A.27})
$$
We split $L$ into the finite rank part $L_s$, associated to the spectrum below the gap, and the rest,
$$
L \,=\, L_s + L_b\,,
\qquad
L_s \,:=\, L\, \mathbf{1}_{[0, c_2)}(L)\,,
\qquad
L_b \,:=\, L\, \mathbf{1}_{(c_2 + c_3,\, \infty)}(L)\,.
$$
The rest of the proof is devoted to showing that $L_s \gtrsim \mathbf{1}_{[0, c_2)}(L)$, which implies that $L$ has a bounded inverse and thus shows (A.17). More precisely, we will show that there are points $x_1, \dots, x_{\mathcal{L}}$ with $\mathcal{L} := \operatorname{rank} L_s \lesssim 1$ (cf. (A.26)) and a positive sequence $(C(\nu))_{\nu \in \mathbb{N}}$ such that
$$
\| \mathbf{1}_{[0, c_2)}(L)\, e_x \| \,\le\, \sum_{i=1}^{\mathcal{L}} \frac{C(\nu)}{(1 + d(x_i, x))^{\nu}}\,,
\qquad (\mathrm{A.28})
$$

for any $\nu \in \mathbb{N}$ and $x \in X$, where $(e_x)_{x \in X}$ denotes the canonical basis of $\mathbb{C}^N$. Let $l = (l_x)$ be any normalized eigenvector of $L_s$ in the image of $\mathbf{1}_{[0, c_2)}(L)$ with associated eigenvalue $\lambda$. We need to show that $\lambda \gtrsim 1$. Since $\langle l,\, \mathbf{1}_{[0, c_2)}(L)\, e_x \rangle = l_x$, the decay property (A.28) of the spectral projection $\mathbf{1}_{[0, c_2)}(L)$ away from the finitely many centers $x_i$ implies that the components $l_x$ have arbitrarily high polynomial decay away from the points $x_1, \dots, x_{\mathcal{L}}$. In particular, $\sum_x |l_x|$ is bounded and therefore we have (cf. (A.22))
$$
|l^* K l| \,\le\, \| K \|_{\max} \Bigl( \sum_x |l_x| \Bigr)^{2} \,\lesssim\, \frac{1}{N}\,.
$$
We infer that for the eigenvalue $\lambda$ we get a lower bound via
$$
1 \,\lesssim\, l^* ( L + K )\, l \,=\, \lambda + l^* K l\,,
$$
where we used (A.23) for the inequality. Thus, $\lambda \gtrsim 1$ for large enough $N$. Now we prove (A.28) by induction. We show that for any $l = 0, \dots, \mathcal{L}$ there is an $l$-dimensional subspace of the image of $\mathbf{1}_{[0, c_2)}(L)$ such that the associated orthogonal projection $P_l$ satisfies
$$
\| P_l\, e_x \| \,\le\, \sum_{i=1}^{l} \frac{C(\nu)}{(1 + d(x_i, x))^{\nu}}\,.
\qquad (\mathrm{A.29})
$$

The induction is over $l$. For $l = 0$ there is nothing to show. Now suppose that (A.29) has been established for some $l < \mathcal{L}$. We will see that it then holds for $l$ replaced by $l+1$ as well. We maximize $\| ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_x \|$ over $x$ and pick the index $x_{l+1}$ where the maximum is attained,
$$
\xi \,:=\, \max_x \| ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_x \| \,=\, \| ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_{x_{l+1}} \|\,.
\qquad (\mathrm{A.30})
$$

Here, $\xi > 0$ since $l < \mathcal{L}$. Now we extend the projection $P_l$ by the normalized vector $v$, defined as
$$
P_{l+1} \,:=\, P_l + v\, v^*\,,
\qquad
v \,:=\, \frac{1}{\xi}\, ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_{x_{l+1}}\,.
\qquad (\mathrm{A.31})
$$
The vector $v$ defined in this way attains its maximum norm at the point $x_{l+1}$, and the value of this norm is $\xi$, since for any $x$ we have
$$
|v_x| \,=\, |v^* ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_x| \,\le\, \xi \,=\, \frac{1}{\xi}\, e_{x_{l+1}}^* ( \mathbf{1}_{[0, c_2)}(L) - P_l )\, e_{x_{l+1}} \,=\, v_{x_{l+1}}\,.
\qquad (\mathrm{A.32})
$$
Here we used (A.30) and that $\mathbf{1}_{[0, c_2)}(L) - P_l \ge 0$ is an orthogonal projection. We will show that $P_{l+1}$ satisfies (A.29) with $l$ replaced by $l+1$. We start by establishing that $\xi \gtrsim 1$. We write $\| v \|^2$ as a sum of contributions originating from the neighborhoods $B := \bigcup_{i=1}^{l+1} B_R(x_i)$ of the points $x_i$, with some radius $R$ to be determined later, and their complement. We estimate the components of $v$ by using (A.32) and the definition of $v$ in (A.31),
$$
1 \,=\, \| v \|^2 \,=\, \sum_{y \in B} |v_y|^2 + \sum_{y \notin B} |v_y|^2
\,\le\, |B|\, \xi^2 + \frac{1}{\xi^2} \sum_{y \notin B} \Bigl( |e_y^*\, \mathbf{1}_{[0, c_2)}(L)\, e_{x_{l+1}}| + |e_y^*\, P_l\, e_{x_{l+1}}| \Bigr)^{2}\,.
\qquad (\mathrm{A.33})
$$

With the induction hypothesis (A.29), the second summand in the sum on the right hand side of (A.33) is bounded by
$$
|e_y^*\, P_l\, e_{x_{l+1}}| \,\le\, \| P_l\, e_y \| \,\le\, \sum_{i=1}^{l} \frac{C(\nu)}{(1 + R)^{\nu}}
\qquad (\mathrm{A.34})
$$
for $y \notin B$. For the other summand in (A.33) we use the decay estimate
$$
|e_y^*\, \mathbf{1}_{[0, c_2)}(L)\, e_x| \,=\, |( \mathbf{1}_{[0, c_2)}(L) )_{yx}| \,\le\, \frac{C(\nu)}{(1 + d(x,y))^{\nu}}\,,
\qquad \nu \in \mathbb{N}\,.
\qquad (\mathrm{A.35})
$$
The bound (A.35) follows from the integral representation
$$
\mathbf{1}_{[0, c_2)}(L) \,=\, \frac{1}{2 \pi \mathrm{i}} \oint_{\Gamma} ( L - \zeta \mathbf{1} )^{-1}\, \mathrm{d} \zeta\,,
\qquad (\mathrm{A.36})
$$
where the integral is over a closed (positively oriented) contour $\Gamma$ encircling only the eigenvalues of $L$ within $[0, c_2)$. Since $L$ has a spectral gap above $c_2$ (cf. (A.27)), we may choose $\Gamma$ such that
$$
\max_{\zeta \in \Gamma} \| ( L - \zeta \mathbf{1} )^{-1} \| \,\lesssim\, 1\,.
$$

Since the entries of $L$ are decaying by (A.22), we can apply Lemma A.1 to see that the entries of $(L - \zeta \mathbf{1})^{-1}$ decay as well. Then (A.35) follows from (A.36). Using (A.34) and (A.35) in (A.33) yields
$$
1 \,\le\, |B|\, \xi^2 + \frac{C(\nu)\, |B|}{\xi^2\, R^{2\nu}} \,\lesssim\, R^{P}\, \xi^2 + \frac{C(\nu)}{\xi^2\, R^{2\nu - P}}\,,
\qquad (\mathrm{A.37})
$$
where in the second inequality we estimated the size of $B$ with (2.11). Now we choose $R := \xi^{-1/P}$ and $\nu := \lceil 4P \rceil$. Using $\xi \le 1$ (cf. (A.30)), we obtain that the right hand side of (A.37) is bounded by a constant multiple of $\xi$. Thus (A.37) proves $\xi \gtrsim 1$. We finish the induction by using the definition (A.31) of $v$ and estimating
$$
\| P_{l+1}\, e_x \| \,\le\, \| P_l\, e_x \| + |v_x| \,\le\, \| P_l\, e_x \| + \frac{1}{\xi} \Bigl( \| P_l\, e_x \| + |( \mathbf{1}_{[0, c_2)}(L) )_{x_{l+1}\, x}| \Bigr)\,.
$$

Since $\xi \gtrsim 1$, the bound (A.29) with $l$ replaced by $l+1$ follows from the induction hypothesis (A.29) together with (A.35), and Lemma A.2 is proven.

Lemma A.3 (Spectral gap). Let $\mathcal{T} : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ be a linear self-adjoint operator preserving the cone $\overline{\mathcal{C}}_+$ of positive semidefinite matrices. Suppose $\mathcal{T}$ is normalized, $\| \mathcal{T} \|_{\mathrm{sp}} = 1$, and satisfies the upper and lower bounds
$$
c\, \langle R \rangle\, \mathbf{1} \,\le\, \mathcal{T}[R] \,\le\, C\, \langle R \rangle\, \mathbf{1}\,,
\qquad R \in \overline{\mathcal{C}}_+\,,
\qquad (\mathrm{A.38})
$$
where $c$ and $C$ are fixed positive constants. Then there is a gap in the spectrum of $\mathcal{T}$,
$$
\operatorname{Spec} \mathcal{T} \,\subseteq\, \Bigl[ -1 + \frac{c^6}{2 C^4}\,,\ 1 - \frac{c^6}{2 C^4} \Bigr] \cup \{ 1 \}\,.
\qquad (\mathrm{A.39})
$$
The eigenvalue $1$ is non-degenerate and the corresponding normalized, $\| T \|_{\mathrm{hs}} = 1$, eigenmatrix $T \in \overline{\mathcal{C}}_+$ satisfies the bounds
$$
\frac{c}{\sqrt{C}}\, \mathbf{1} \,\le\, T \,\le\, C\, \mathbf{1}\,.
\qquad (\mathrm{A.40})
$$

Proof. Within this proof $c$ and $C$ have the same meaning as in (A.38). Since $\mathcal{T}$ leaves the cone of positive semidefinite matrices invariant, the Perron-Frobenius theorem guarantees the existence of a normalized $T \in \overline{\mathcal{C}}_+$ with $\mathcal{T}[T] = T$. We first verify the bounds (A.40) on this eigenmatrix. From the upper and lower bounds (A.38) on $\mathcal{T}$ we infer
$$
c\, \langle T \rangle\, \mathbf{1} \,\le\, T \,\le\, C\, \langle T \rangle\, \mathbf{1}\,.
\qquad (\mathrm{A.41})
$$
Multiplying by $T$ on both sides of the second inequality and taking the normalized trace yields $1 = \| T \|_{\mathrm{hs}}^2 \le C\, \langle T \rangle^2$. With the lower bound from (A.41) on $T$ we see that $T \ge \frac{c}{\sqrt{C}}\, \mathbf{1}$. Furthermore, the normalization of $T$ and the upper bound from (A.41) imply
$$
T \,\le\, C\, \langle T \rangle\, \mathbf{1} \,\le\, C\, \| T \|_{\mathrm{hs}}\, \mathbf{1} \,=\, C\, \mathbf{1}\,.
$$
Now we show the existence of a spectral gap and that $1$ is a non-degenerate eigenvalue. Let $R \in \mathbb{C}^{N \times N}$ be normalized, $\| R \|_{\mathrm{hs}} = 1$, and orthogonal to $T$, i.e., $\langle T,\, R \rangle = 0$. Showing (A.39) is equivalent to showing
$$
\langle R\,,\, (\operatorname{Id} \pm \mathcal{T})[R] \rangle \,\ge\, \frac{c^6}{2 C^4}\,.
\qquad (\mathrm{A.42})
$$
Since $\mathcal{T}$ preserves $\overline{\mathcal{C}}_+$ and therefore $\mathcal{T}[R]^* = \mathcal{T}[R^*]$, we have
$$
\langle R\,,\, (\operatorname{Id} \pm \mathcal{T})[R] \rangle \,=\, \langle \operatorname{Re} R\,,\, (\operatorname{Id} \pm \mathcal{T})[\operatorname{Re} R] \rangle + \langle \operatorname{Im} R\,,\, (\operatorname{Id} \pm \mathcal{T})[\operatorname{Im} R] \rangle\,.
$$

Thus, it suffices to show (A.42) for self-adjoint matrices $R$. For a normalized self-adjoint $R \in \mathbb{C}^{N \times N}$ with $\langle T, R \rangle = 0$ we use the spectral representation
$$
R \,=\, \sum_{i=1}^{N} \varrho_i\, r_i\, r_i^*\,,
\qquad (\mathrm{A.43})
$$
with the orthonormal eigenbasis $(r_i)_{i=1}^N$ of $R$. Plugging (A.43) into the right hand side of (A.42) reveals the identity
$$
\langle R\,,\, (\operatorname{Id} \pm \mathcal{T})[R] \rangle \,=\, q^* ( 1 \pm S )\, q\,,
\qquad (\mathrm{A.44})
$$
where we introduced the vector $q \in \mathbb{R}^N$ of eigenvalues of $R$ and the matrix $S \in \mathbb{R}^{N \times N}$ with non-negative entries as
$$
q_i \,:=\, \frac{1}{\sqrt{N}}\, \varrho_i\,,
\qquad
s_{ij} \,:=\, r_i^*\, \mathcal{T}[\, r_j\, r_j^*\,]\, r_i\,.
$$
The vector $q$ is normalized since $\| q \| = \| R \|_{\mathrm{hs}} = 1$, and the matrix $S$ is symmetric because of the self-adjointness of $\mathcal{T}$,
$$
s_{ij} \,=\, \langle r_i\, r_i^*\,,\, \mathcal{T}[\, r_j\, r_j^*\,] \rangle \,=\, \langle \mathcal{T}[\, r_i\, r_i^*\,]\,,\, r_j\, r_j^* \rangle \,=\, s_{ji}\,.
$$
Furthermore, by (A.38) the entries of $S$ satisfy the lower and upper bounds
$$
\frac{c}{N} \,\le\, s_{ij} \,\le\, \frac{C}{N}\,.
\qquad (\mathrm{A.45})
$$
In particular, by the Perron-Frobenius theorem, the matrix $S$ has a unique normalized eigenvector $s$ with positive entries and with associated eigenvalue equal to its spectral norm,
$$
S\, s \,=\, \| S \|\, s\,.
\qquad (\mathrm{A.46})
$$

We will now show that $S$ has a spectral gap and that $q$ has a non-vanishing component in the direction orthogonal to $s$. This will imply
$$
|q^*\, S\, q| \,\le\, 1 - \frac{c^6}{2 C^4}\,,
\qquad (\mathrm{A.47})
$$
which is equivalent to (A.42) by (A.44) and therefore proves Lemma A.3. To verify (A.47) we start with the observation that the norm of $S$ is bounded by
$$
c \,\le\, e^*\, S\, e \,\le\, \| S \| \,=\, \sup_{\| w \| = 1} w^*\, S\, w \,\le\, \| \mathcal{T} \|_{\mathrm{sp}} \,=\, 1\,,
\qquad (\mathrm{A.48})
$$
where $e = ( \frac{1}{\sqrt{N}}, \dots, \frac{1}{\sqrt{N}} )$, and that the Perron-Frobenius eigenvector $s = (s_i)_i$ satisfies
$$
\max_{i=1}^{N} s_i \,=\, \max_{i=1}^{N} \frac{(S s)_i}{\| S \|} \,\le\, \frac{C}{c\, N} \sum_{i=1}^{N} s_i \,\le\, \frac{C}{c\, \sqrt{N}}\,,
$$
where we used (A.46), (A.48), (A.45) and $\| s \| = 1$, in that order. Now we employ Lemma 5.6 from [2] to see that $S$ has a spectral gap,
$$
\operatorname{Spec}(S) \,\subseteq\, \Bigl[ -\| S \| + \frac{c^3}{C^2}\,,\ \| S \| - \frac{c^3}{C^2} \Bigr] \cup \{ \| S \| \}\,.
\qquad (\mathrm{A.49})
$$
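A quick numerical sanity check of this Perron-Frobenius gap (an illustration only; the explicit constant $c^3/C^2$ from Lemma 5.6 of [2] is not verified here): for a symmetric matrix with entries in $[c/N, C/N]$ the top eigenvalue is of order one and simple, while all other eigenvalues are much smaller in absolute value, so the gap stays bounded below as $N$ grows.

```python
import numpy as np

# Sketch (illustration only): a symmetric matrix S with entries in [c/N, C/N]
# has a Perron-Frobenius eigenvalue ||S|| that is simple and separated from the
# rest of Spec(S), as in (A.49).  We only check that the gap is of order one.
rng = np.random.default_rng(1)
N, c, C = 300, 0.5, 2.0
A = rng.uniform(c / N, C / N, size=(N, N))
S = (A + A.T) / 2                       # symmetric, entries still in [c/N, C/N]

evals = np.linalg.eigvalsh(S)           # sorted ascending
top, rest = evals[-1], np.max(np.abs(evals[:-1]))
gap = top - rest
print(top, rest, gap)                   # gap stays of order one
```

The top eigenvalue concentrates near $N$ times the mean entry, while the remaining spectrum lives on the much smaller random-matrix scale $N^{-1/2}$.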

Finally we show that $q$ has a non-vanishing component in the direction orthogonal to $s$. More precisely, we write
$$
q \,=\, ( 1 - \| w \|^2 )^{1/2}\, s + w\,,
\qquad (\mathrm{A.50})
$$
for some $w \perp s$. By taking the scalar product with $t := ( N^{-1/2}\, r_i^*\, T\, r_i )_{i=1}^N$ and using the orthogonality $t^* q = \langle T, R \rangle = 0$, we get the equality in
$$
( 1 - \| w \|^2 )\, \frac{c^2}{C N} \Bigl( \sum_{i=1}^{N} s_i \Bigr)^{2} \,\le\, ( 1 - \| w \|^2 )\, ( t^* s )^2 \,=\, ( t^* w )^2 \,\le\, \| t \|^2\, \| w \|^2\,.
\qquad (\mathrm{A.51})
$$
The first inequality in (A.51) follows from the lower bound on $T$ in (A.40). Since $\| t \| \le \| T \|_{\mathrm{hs}} = 1$ and (cf. (A.48))
$$
c \,\le\, \| S \| \,=\, s^*\, S\, s \,\le\, \frac{C}{N} \Bigl( \sum_{i=1}^{N} s_i \Bigr)^{2}\,,
$$
we conclude that
$$
\| w \|^2 \,\ge\, \frac{c^3}{C^2 + c^3} \,\ge\, \frac{c^3}{2 C^2}\,,
\qquad (\mathrm{A.52})
$$
where we used $c \le C$. Combining (A.49) with (A.50) and (A.52) yields
$$
|q^*\, S\, q| \,\le\, \| S \|\, ( 1 - \| w \|^2 ) + \Bigl( \| S \| - \frac{c^3}{C^2} \Bigr)\, \| w \|^2 \,\le\, 1 - \frac{c^3}{C^2}\, \| w \|^2 \,\le\, 1 - \frac{c^6}{2 C^4}\,,
$$

where we also used $\| S \| \le 1$. Thus, (A.47) is established and Lemma A.3 is proven.

Lemma A.4 (Quantitative implicit function theorem). Let $T : \mathbb{C}^A \times \mathbb{C}^D \to \mathbb{C}^A$ be a continuously differentiable function with $T(0,0) = 0$ whose derivative $\nabla^{(1)} T(0,0)$ at the origin with respect to the first argument is invertible. Suppose $\mathbb{C}^A$ and $\mathbb{C}^D$ are equipped with norms, both denoted by $\| \cdot \|$, and let the linear operators on these spaces be equipped with the corresponding induced operator norms. Let $\delta > 0$ be such that
$$
\sup_{(a,d) \in B_\delta^A \times B_\delta^D} \bigl\| \operatorname{Id}_{\mathbb{C}^A} - ( \nabla^{(1)} T(0,0) )^{-1}\, \nabla^{(1)} T(a,d) \bigr\| \,\le\, \frac{1}{2}\,,
\qquad (\mathrm{A.53})
$$
where $B_\delta^{\#}$ is the $\delta$-ball around $0$ with respect to $\| \cdot \|$ in $\mathbb{C}^{\#}$. Suppose
$$
\| ( \nabla^{(1)} T(0,0) )^{-1} \| \,\le\, C_1\,,
\qquad
\sup_{(a,d) \in B_\delta^A \times B_\delta^D} \| \nabla^{(2)} T(a,d) \| \,\le\, C_2\,,
$$
for some positive constants $C_1, C_2$, where $\nabla^{(2)}$ is the derivative with respect to the second variable. Then there is a constant $\varepsilon > 0$, depending only on $\delta$, $C_1$ and $C_2$, and a unique function $f : B_\varepsilon^D \to B_\delta^A$ such that $T(f(d), d) = 0$ for all $d \in B_\varepsilon^D$. The function $f$ is continuously differentiable. If $T$ is analytic, then so is $f$.

Proof. The proof is elementary and left to the reader.

Lemma A.5 (Linear large deviation). Let $X = (X_x)_{x \in X}$ and $b = (b_x)_{x \in X}$ be families of random variables that satisfy the following assumptions:

(i) The family $X$ is centered, $\mathbb{E}\, X_x = 0$.

(ii) The family $X$ has uniformly bounded moments, i.e.
$$
\mathbb{E}\, |X_x|^{\mu} \,\le\, \beta_1(\mu)\,,
\qquad \mu \in \mathbb{N}\,,
\qquad (\mathrm{A.54})
$$
for some sequence $\beta_1$ of positive constants.

(iii) The correlations within $X$ decay, i.e. for $A, B \subseteq X$ and smooth functions $\phi : \mathbb{C}^{|A|} \to \mathbb{C}$, $\psi : \mathbb{C}^{|B|} \to \mathbb{C}$ we have
$$
\max_{\substack{A, B \subseteq X \\ d(A,B) \ge N^{\varepsilon}}} \bigl| \operatorname{Cov}\bigl( \phi(X_A),\, \psi(X_B) \bigr) \bigr| \,\le\, \beta_2(\varepsilon, \nu)\, \frac{\| \nabla \phi \|_{\infty}\, \| \nabla \psi \|_{\infty}}{N^{\nu}}\,,
\qquad \varepsilon > 0\,,\ \nu \in \mathbb{N}\,,
\qquad (\mathrm{A.55})
$$
where $X_A := (X_x)_{x \in A}$, $d(A,B) := \min\{ d(x,y) : x \in A,\, y \in B \}$, and $\beta_2$ is a family of positive constants.

(iv) The correlations between $X$ and $b$ are asymptotically small,
$$
| \operatorname{Cov}( \phi(b),\, \psi(X) ) | \,\le\, \beta_3(\nu)\, \| \nabla \phi \|_{\infty}\, \| \nabla \psi \|_{\infty}\, N^{-\nu}\,,
\qquad \nu \in \mathbb{N}\,,
\qquad (\mathrm{A.56})
$$
for any smooth functions $\phi, \psi : \mathbb{C}^N \to \mathbb{C}$.

Then the large deviation estimate
$$
\Bigl| \sum_x b_x\, X_x \Bigr| \,\prec\, \Bigl( \sum_x |b_x|^2 \Bigr)^{1/2} + N^{-\nu}
\qquad (\mathrm{A.57})
$$
holds for any $\nu \in \mathbb{N}$.

for two families X and b of random variables that satisfy the upper bounds ÿ ? max|Xx | ď N , |bx |2 ď 1 , x

(A.59)

x

in addition to the assumptions of Lemma A.5. Indeed, for X and b as in Lemma A.5 we define the new random variables rbx :“ `ř

rx :“ p1 ´ EqrXx θpN ´1 |Xx |2 qs , X

bx , ˘ 2 1{2` N ´r ν |b | y y

(A.60)

where νr P N and θ : r0, 8q Ñ r0, 1s is a smooth cutoff function such that θ|r0,1{2s “ 1 and r and rb satisfy the assumption of Lemma A.5. For this θ|r1,8q “ 0. We will now see that X purpose, let φ be smooth. We use the short hand notations θ “ θpN ´1 |Xx |2 q ,

φ “ φpXA q , 69

rA q . φr “ φpX

We claim that the following two bounds are satisfied: r µ ď Cpµ, νqk∇φkµ N ´ν , E |φ ´ φ| 8

Varrφs ď CNk∇φk8 ,

(A.61)

for any ν, µ P N. For the first bound in (A.61) we estimate, for any ν P N, a |E rXx θs| “ | E rXx p1 ´ θqs| ď VarrXx s Pr 2|Xx|2 ą Ns ď CpνqN ´ν ,

where we used EXx “ 0 in the first step, then the Cauchy-Schwarz inequality in the second step and finally (A.54) for the last inequality. We conclude that rx | “ |Xx |p1 ´ θq ` |E rXx θs| ď |Xx |1p2|Xx |2 ą Nq ` CpνqN ´ν . |Xx ´ X

In combination with (A.54) we infer

rx |µ ď Cpµ, νqN ´ν , E|Xx ´ X

(A.62)

for any ν, µ P N. The first inequality in (A.61) now follows from the identity ż1 φpaq ´ φpbq “ dτ pb ´ aq˚ ∇φpa ` τ pb ´ aqq ,

(A.63)

0

r with a “ X and b “ X. For the second inequality we use (A.63) to see that

|φpaq ´ φpbq|2 ď k∇φk2ka ´ bk2 ď Nk∇φk28

ÿ x

p|ax | ` |bx |q2 .

We integrate a and b with respect to the distribution of X and see (A.61). With the help of (A.61) the assumptions (A.54) and (A.55) are easily verified when r The assumption (A.56) is checked similarly. Furthermore, X r and rb we replace X by X. obviously satisfy the additional bounds (A.59). ř r r r rb. In particular, Now suppose that (A.58) holds with X, b replaced by X, x bx Xx ă 1. Then we see that ÿ ÿ r r r ´ν b X ă ă 1, b X x x x x ` N (A.64) x

x

rx | ă N ´ν´1 (cf. (A.62)) and |rbx | ď 1. Plugging the for any ν P N, where we used |Xx ´ X definition (A.60) of rb into (A.64) reveals ÿ ´ÿ ¯1{2 |bx |2 ` N ´rν , bx Xx ă x

x

and since νr was arbitrary, Lemma A.5 is proven, up to checking (A.58) for random variables X and b that satisfy (A.59) in addition to the assumptions of the lemma. Step 2: In this step we completely remove the weak dependence between X and b, i.e. we show that it is enough to prove (A.58) for a centered family X independent of b satisfying (A.59), (A.54) and (A.55). Indeed, suppose that X and b are not independent, 70

but satisfy (A.56) instead. Let rb be a copy of b that is independent of X and b. We show that for any µ, ν P N, ÿ 2µ ÿ 2µ r E bx Xx ´ E bx Xx ď Cpµ, νqN ´ν . (A.65) x

x

The bound (A.65) implies (A.58), provided (A.58) holds with b replaced by rb. To prove (A.65) we expand the powers on the left hand side, compare term by term and find l.h.s. of (A.65) ď N 2µ max CovpXx1 . . . Xxµ Xxµ`1 . . . Xx2µ , bx1 . . . bxµ bxµ`1 . . . bx2µ q , x1 ,...,x2µ

where the maximum is taken over ? all x1 , . . . , x2µ P X. Now we employ (A.56) as well as the bounds |bx | ď 1 and |Xx | ď N to infer (A.65).

Step 3: By Step 1 and Step 2 we may assume for the proof of (A.58) that X is independent of b and that these random vectors satisfy (A.59), (A.54) and (A.55). In this final step we construct for every ε ą 0 a partition of X into non-empty sets I1 , . . . , IK with the following properties: (i) With a constant Cpεq, depending only on ε and P (cf. (2.11)), the size of the partition is bounded by K ď CpεqN pP `1qε .

(A.66)

(ii) The indices within an element of the partition are far away from each other, dpx, yq ě N ε ,

x, y P Ik , x ‰ y .

(A.67)

In other words, the elements within each Ik are far from each other hence the corresponding components of X and b are practically independent. We postpone the construction of this partition to the end of the proof and explain first how it is used to get (A.58). We split the sum according to the partition and estimate K ÿ 2µ ÿ 2µ ÿ 2µ ÿ 2µ K bx Xx . bx Xx ď K max E E bx Xx “ E x

k“1

k“1 xPIk

xPIk

By the estimate (A.66) on the size of the partition and by choosing ε sufficiently small, it remains to show 2µ ÿ bx Xx ď Cpµq . E (A.68) xPIk

For an independent family pXx qxPIk with (A.54), the moment bound (A.68) would be a simple consequence of the Marcinkiewicz-Zygmund inequality. Therefore, (A.54) follows from 2µ 2µ ÿ ÿ rx ď Cpµ, νq , E bx X bx Xx ´ E (A.69) Nν xPI xPI k

k

71

r “ pX rx qx is an independent family of random variables, which is for all µ, ν P N, where X also independent of X and b and has the same marginal distributions as X. To show (A.69) we expand the powers on the left hand side and use the independence r as well as the upper bound |bx | ď 1, of b from X and X l.h.s. of (A.69) 2µ r r r r ď N max E Xx1 . . . Xxµ Xxµ`1 . . . Xx2µ ´ E Xx1 . . . Xxµ Xxµ`1 . . . Xx2µ ,

(A.70)

x1 ,...,x2µ

where the maximum is over all ξ “ px1 , . . . , x2µ q P Ik2µ . For such a ξ let ξ1 , . . . , ξR P Ik denote the indices appearing within ξ, clearly R ď 2µ. Let furthermore the non-negative integers µ1 , . . . , µR and µ r1 , . . . , µ rR denote the corresponding numbers of appearances within px1 , . . . , xµ q and pxµ`1 , . . . , x2µ q, respectively. Then we can further estimate the term inside the maximum on the right hand side of (A.70) corresponding to ξ by using the telescopic sum, ˙ź ˆ R r´1 R R´1 R ź ź ÿ ź µr µ rr rr rr rs µr r µ µr µ µs µ r E Xξr X ξr ´ E E Xξµss X µξrss . (A.71) Xξr X ξr “ Cov Xξr X ξr , Xξs X ξs r“1

r“1

r“1

s“r`1

s“1
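The telescopic decomposition (A.71) is an exact algebraic identity for arbitrary integrable random variables $Z_1, \dots, Z_R$ (here $Z_r = X_{\xi_r}^{\mu_r} \overline{X}_{\xi_r}^{\widetilde{\mu}_r}$). The following sketch, which is not from the paper and uses illustrative names throughout, verifies the identity exactly on a small finite probability space.

```python
import math

# Exact check of the telescoping identity behind (A.71): for any random
# variables Z_1, ..., Z_R on a finite probability space,
#   E prod_r Z_r - prod_r E Z_r
#     = sum_{r=1}^{R-1} Cov(Z_r, prod_{s>r} Z_s) * prod_{s<r} E Z_s .
# `dist` maps an outcome tuple (z_1, ..., z_R) to its probability.

def telescoping_sides(dist):
    R = len(next(iter(dist)))
    E = lambda f: sum(p * f(w) for w, p in dist.items())
    means = [E(lambda w, r=r: w[r]) for r in range(R)]
    lhs = E(math.prod) - math.prod(means)
    rhs = 0.0
    for r in range(R - 1):
        tail = lambda w, r=r: math.prod(w[s] for s in range(r + 1, R))
        cov = E(lambda w, r=r: w[r] * tail(w)) - means[r] * E(tail)
        rhs += cov * math.prod(means[:r])   # empty product for r = 0 is 1
    return lhs, rhs

# A correlated toy distribution on {0.5, 1.5}^3: the first two coordinates
# prefer to agree, the third is independent of them.
raw = {(a, b, c): (2.0 if a == b else 1.0)
       for a in (0.5, 1.5) for b in (0.5, 1.5) for c in (0.5, 1.5)}
total = sum(raw.values())
dist = {w: p / total for w, p in raw.items()}

lhs, rhs = telescoping_sides(dist)
assert abs(lhs - rhs) < 1e-12   # both sides agree exactly
```

In the proof each covariance on the right hand side is then made small by the decay assumption (A.55), which is what the identity alone cannot provide.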

To bound the covariance in (A.71) we use $|X_x| \le \sqrt{N}$ and (A.55) with a sufficiently large $\nu$, in combination with the lower bound (A.67) on the distance between indices within one element $I_k$ of the partition. The claim (A.69) follows.

We will now inductively construct the partition $I_1, \dots, I_K$ with the properties (i) and (ii) above. Suppose that the disjoint sets $I_1, \dots, I_k$ have already been constructed. Then we pick an arbitrary $x_0 \in J_0 := \mathbb{X} \setminus (I_1 \cup \dots \cup I_k)$. Next we pick $x_1 \in J_1 := J_0 \setminus B_{N^\varepsilon}(x_0)$, then $x_2 \in J_2 := J_1 \setminus B_{N^\varepsilon}(x_1)$ and so on. The process stops at some step $L$ when $J_{L+1}$ is empty, and we set $I_{k+1} := \{x_0, \dots, x_L\}$. By construction, property (ii) is satisfied for all elements $I_k$ of the partition. We verify the upper bound (i) on the number $K$ of such elements. For every $k$ we have
\[
\mathbb{X} \setminus \bigcup_{l=1}^{k-1} I_l \;\subseteq\; \bigcup_{x \in I_k} B_{N^\varepsilon}(x) \,, \tag{A.72}
\]
because otherwise another element of $I_k$ would have been chosen in the construction. The inclusion (A.72) implies
\[
N - \sum_{l=1}^{k-1} |I_l| \;\le\; |I_k| \max_{x \in I_k} |B_{N^\varepsilon}(x)| \;\le\; |I_k|\, N^{\varepsilon P} \,, \tag{A.73}
\]
where we used (2.11) for the last inequality. In particular, (A.73) provides a lower bound on the size of $I_k$, which we use to obtain
\[
N - \sum_{l=1}^{k} |I_l| \;\le\; \big( 1 - N^{-\varepsilon P} \big) \Big( N - \sum_{l=1}^{k-1} |I_l| \Big) \,.
\]
Since $I_K$ contains at least one element, we infer by induction that
\[
1 \;\le\; N - \sum_{l=1}^{K-1} |I_l| \;\le\; \big( 1 - N^{-\varepsilon P} \big)^{K-1} N \;\le\; N \mathrm{e}^{-(K-1) N^{-\varepsilon P}} \,.
\]
We solve for $K$ and thus see that (i) holds true. This finishes the proof of Lemma A.5.
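The inductive construction above is a simple greedy sweep. A minimal sketch, illustrative only, with $\mathbb{X} = \{0, \dots, n-1\}$ and $d(x, y) = |x - y|$ (so that $P = 1$ in the ball-growth bound (2.11)):

```python
def separated_partition(points, dist, r):
    """Greedy construction of I_1, ..., I_K: sweep through the remaining
    indices, always skipping the r-ball around each already picked point,
    so every class is r-separated (property (ii))."""
    remaining = list(points)
    partition = []
    while remaining:
        cls, leftover = [], []
        for x in remaining:
            if all(dist(x, y) >= r for y in cls):
                cls.append(x)        # x avoids the r-balls of the picked points
            else:
                leftover.append(x)   # postponed to a later class
        partition.append(cls)
        remaining = leftover
    return partition

n, r = 50, 7
d = lambda x, y: abs(x - y)
parts = separated_partition(range(n), d, r)
# The classes partition {0, ..., n-1} and each class is r-separated.
assert sorted(x for cls in parts for x in cls) == list(range(n))
assert all(d(x, y) >= r for cls in parts for x in cls for y in cls if x != y)
# On the line each ball B_r(x) contains at most 2r - 1 points, so the
# number of classes K stays comparable to r, uniformly in n.
assert len(parts) <= 2 * r - 1
```

Each pass removes at least one point, so the sweep terminates, and the covering argument (A.72)-(A.73) is what turns the ball-volume bound into the estimate (A.66) on the number of classes.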

Lemma A.6 (Quadratic large deviation). Let $X = (X_x)_{x \in \mathbb{X}}$, $Y = (Y_x)_{x \in \mathbb{X}}$, $b = (b_{xy})_{x, y \in \mathbb{X}}$ be families of random variables that satisfy the following assumptions:

(i) The families $X$ and $Y$ are centered, $\mathbb{E}\, X_x = \mathbb{E}\, Y_x = 0$.

(ii) Both $X$ and $Y$ have uniformly bounded moments, i.e.,
\[
\mathbb{E}\, |X_x|^{\mu} + \mathbb{E}\, |Y_x|^{\mu} \;\le\; \beta_1(\mu) \,, \qquad \mu \in \mathbb{N} \,,
\]
for all $x \in \mathbb{X}$ and some sequence $\beta_1$ of positive constants.

(iii) The correlations within $X$ and $Y$ decay, i.e., for $\varepsilon > 0$, $A, B \subseteq \mathbb{X}$ with $d(A, B) \ge N^{\varepsilon}$ and smooth functions $\varphi : \mathbb{C}^{|A|} \to \mathbb{C}$, $\psi : \mathbb{C}^{|B|} \to \mathbb{C}$ we have
\[
\max_{Z, Q \in \{X, Y\}} \big| \mathrm{Cov}\big( \varphi(Z_A), \psi(Q_B) \big) \big| \;\le\; \beta_2(\varepsilon, \nu)\, \frac{\|\nabla \varphi\|_{\infty} \|\nabla \psi\|_{\infty}}{N^{\nu}} \,, \tag{A.74}
\]
where $Z_A := (Z_x)_{x \in A}$ and $d(A, B)$ are defined as in Lemma A.5, and $\beta_2$ is a family of positive constants.

(iv) The correlations between $(X, Y)$ and $b$ are asymptotically small,
\[
\big| \mathrm{Cov}\big( \varphi(b), \psi(X, Y) \big) \big| \;\le\; \beta_3(\nu)\, \|\nabla \varphi\|_{\infty} \|\nabla \psi\|_{\infty}\, N^{-\nu} \,, \qquad \nu \in \mathbb{N} \,, \tag{A.75}
\]
for smooth functions $\varphi : \mathbb{C}^{N^2} \to \mathbb{C}$, $\psi : \mathbb{C}^{2N} \to \mathbb{C}$ and a sequence $\beta_3$ of positive constants.

Then the large deviation estimate
\[
\sum_{x, y} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \;\prec\; \Big( \sum_{x, y} |b_{xy}|^2 \Big)^{1/2} + \frac{1}{N^{\nu}} \tag{A.76}
\]
holds for any $\nu \in \mathbb{N}$.

Proof. We use Convention 4.1, so that $\varphi \lesssim \psi$ means $\varphi \le C \psi$ for a constant $C$, depending only on $\widetilde{P} := (\beta_1, \beta_2, \beta_3, P)$ (cf. (2.11)). The proof of Lemma A.6 follows a similar strategy as the proof of Lemma A.5. Exactly as in Step 1 of the proof of Lemma A.5, we introduce new families of centered random variables
\[
\widetilde{X}_x := (1 - \mathbb{E}) \big[ X_x\, \theta( N^{-1} |X_x|^2 ) \big] \,, \qquad \widetilde{Y}_x := (1 - \mathbb{E}) \big[ Y_x\, \theta( N^{-1} |Y_x|^2 ) \big] \,,
\]

and rescaled coefficients
\[
\widetilde{b}_{xy} \;:=\; \frac{b_{xy}}{N^{-\widetilde{\nu}} + \big( \sum_{u, v} |b_{uv}|^2 \big)^{1/2}} \,.
\]
In this way we reduce the proof of (A.76) to showing the moment bound
\[
\mathbb{E}\, \Big| \sum_{x, y} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \Big|^{2\mu} \;\le\; C(\mu, \delta)\, N^{\delta} \,, \qquad \delta > 0 \,, \tag{A.77}
\]
for random variables $X$, $Y$ and $b$ that satisfy all assumptions of Lemma A.6, and the additional bounds
\[
\max_x |X_x| \le \sqrt{N} \,, \qquad \max_x |Y_x| \le \sqrt{N} \,, \qquad \sum_{x, y} |b_{xy}|^2 \le 1 \,. \tag{A.78}
\]
Following Step 2 of the proof of Lemma A.5 and using (A.75) we may also assume that $b$ is independent of $(X, Y)$. To show (A.77) we fix $\varepsilon > 0$ and choose the partition $I_1, \dots, I_K$ of the index set $\mathbb{X}$ from Step 3 of the proof of Lemma A.5. In particular, (A.66) and (A.67) are satisfied. We split the sums over $x$ and $y$ in (A.77) according to this partition and estimate
\[
\text{l.h.s. of (A.77)} \;\le\; K^{4\mu} \max_{k, l = 1, \dots, K} \mathbb{E}\, \Big| \sum_{x \in I_k,\, y \in I_l} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \Big|^{2\mu} \,.
\]
By choosing $\varepsilon$ sufficiently small and using (A.66) it suffices to show that for any fixed $k, l = 1, \dots, K$ we have the moment bound
\[
\mathbb{E}\, \Big| \sum_{x \in I_k,\, y \in I_l} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \Big|^{2\mu} \;\le\; C(\mu, \delta)\, N^{\delta} \,, \qquad \delta > 0 \,. \tag{A.79}
\]
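Both splitting steps, here and in Step 3 of the proof of Lemma A.5, rely on the elementary convexity bound $|\sum_{k=1}^K a_k|^{2\mu} \le K^{2\mu-1} \sum_k |a_k|^{2\mu}$ (hence $\le K^{2\mu} \max_k |a_k|^{2\mu}$), which follows from Jensen's inequality applied to $t \mapsto t^{2\mu}$. A quick numerical sanity check, illustrative only:

```python
def split_bound(a, mu):
    """Return both sides of the convexity inequality used to split the
    sums over the partition:
        |sum_k a_k|^(2 mu) <= K^(2 mu - 1) * sum_k |a_k|^(2 mu)."""
    K = len(a)
    lhs = abs(sum(a)) ** (2 * mu)
    rhs = K ** (2 * mu - 1) * sum(abs(x) ** (2 * mu) for x in a)
    return lhs, rhs

for a in ([1.0, -2.0, 0.5], [3.0] * 7, [0.1, 0.2, -0.3, 0.4]):
    for mu in (1, 2, 3):
        lhs, rhs = split_bound(a, mu)
        assert lhs <= rhs + 1e-9
# Equality holds when all summands coincide:
lhs, rhs = split_bound([3.0] * 7, 2)
assert lhs == rhs
```

This is why the prefactors $K^{2\mu}$ and $K^{4\mu}$ are harmless: by (A.66) they only cost a factor $N^{C\mu\varepsilon}$, which is absorbed into $N^{\delta}$ for small $\varepsilon$.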

For any $x \in I_k$ and $y \in I_l$ we introduce the relation
\[
x \sim y \qquad \text{whenever} \qquad d(x, y) \le \frac{N^{\varepsilon}}{3} \,.
\]
If $d(x, y) > \frac{N^{\varepsilon}}{3}$, we correspondingly write $x \not\sim y$. Since the distances between indices within the set $I_k$ are bounded from below by $N^{\varepsilon}$ (cf. (A.67)), for every $x \in I_k$ there exists at most one $y \in I_l$ such that $x \sim y$, and vice versa. We set
\[
\iota(x) := \begin{cases} 1 & \text{if there exists } \widetilde{y} \in I_l \text{ s.t. } x \sim \widetilde{y} \,, \\ 0 & \text{otherwise} \,, \end{cases}
\qquad
\iota(y) := \begin{cases} 1 & \text{if there exists } \widetilde{x} \in I_k \text{ s.t. } \widetilde{x} \sim y \,, \\ 0 & \text{otherwise} \,, \end{cases}
\]
for any $x \in I_k$ and $y \in I_l$. Note that if $k = l$, then $\iota(x) = 1$ for all $x \in I_k$. Furthermore, let us define
\[
\mathcal{S} \;:=\; \big\{ (S, T) \,:\, S \subseteq I_k \,,\; T \subseteq I_l \,,\; \text{such that } x \not\sim y \text{ for all } x \in S \,,\; y \in T \big\} \,,
\]
the pairs of subsets with a mutual distance of at least $\frac{N^{\varepsilon}}{3}$. Inspired by Appendix B of [19] we use the partition of unity
\[
1 \;=\; \frac{\sigma_{xy}}{|\mathcal{S}|} \sum_{(S, T) \in \mathcal{S}} \mathbb{1}(x \in S)\, \mathbb{1}(y \in T) \,, \qquad x \in I_k \,,\; y \in I_l \,,\; x \not\sim y \,, \tag{A.80}
\]
where we introduced the numbers
\[
\sigma_{xy} \;:=\; \begin{cases} 4 & \text{if } \iota(x) = \iota(y) = 0 \,, \\ 6 & \text{if } \iota(x) = 0 \text{ and } \iota(y) = 1 \,, \\ 6 & \text{if } \iota(x) = 1 \text{ and } \iota(y) = 0 \,, \\ 9 & \text{if } \iota(x) = \iota(y) = 1 \,. \end{cases}
\]
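For $k \ne l$ the values of $\sigma_{xy}$ have a simple counting interpretation: each index without a close partner contributes a free factor $2$ to $|\mathcal{S}|$ (it may be in or out of its set), while each matched pair contributes a factor $3$ (at most one of the two may be selected), so fixing $x \in S$ and $y \in T$ cuts $|\mathcal{S}|$ down by exactly $\sigma_{xy} \in \{4, 6, 9\}$. A brute-force check of this counting on a toy configuration (illustrative only, not from the paper):

```python
import itertools

def powerset(xs):
    return [set(c) for r in range(len(xs) + 1)
            for c in itertools.combinations(xs, r)]

def admissible_pairs(Ik, Il, close):
    """All (S, T) with S in I_k, T in I_l and no close pair between
    them -- the family called S (calligraphic) in the proof."""
    return [(S, T) for S in powerset(Ik) for T in powerset(Il)
            if not any(close(a, b) for a in S for b in T)]

Ik, Il = [0, 10, 20], [1, 50, 60]            # two separated index sets, k != l
close = lambda a, b: abs(a - b) <= 3         # the relation x ~ y; only 0 ~ 1 here
pairs = admissible_pairs(Ik, Il, close)

def sigma(x, y):
    iota_x = any(close(x, b) for b in Il)    # does x have a partner in I_l?
    iota_y = any(close(a, y) for a in Ik)    # does y have a partner in I_k?
    return 9 if iota_x and iota_y else 6 if iota_x or iota_y else 4

# (A.80) amounts to:  #{(S, T) : x in S, y in T} * sigma_xy = |S_cal|.
for x in Ik:
    for y in Il:
        if close(x, y):
            continue                         # (A.80) only covers x ~/~ y
        count = sum(1 for S, T in pairs if x in S and y in T)
        assert count * sigma(x, y) == len(pairs)
```

The case $k = l$ is slightly different (every index is then matched, consistently with $\iota(x) = 1$ for all $x \in I_k$ as noted above), but the same normalization mechanism applies.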

We split the sum in (A.79) into a sum over pairs $(x, y)$ with $x \sim y$ and one over pairs with $x \not\sim y$. Afterwards we use (A.80) and find
\[
\sum_{x \in I_k,\, y \in I_l} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \;=\; U + \frac{1}{|\mathcal{S}|} \sum_{(S, T) \in \mathcal{S}} V(S, T) \,,
\]
with the shorthand notation
\[
U \;:=\; \sum_{\substack{x \in I_k,\, y \in I_l \\ x \sim y}} b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \,, \qquad
V(S, T) \;:=\; \sum_{x \in S,\, y \in T} \sigma_{xy}\, b_{xy}\, (X_x Y_y - \mathbb{E}\, X_x Y_y) \,.
\]
Thus, proving (A.79) reduces to showing for any pair of index sets $(S, T) \in \mathcal{S}$ that $\mathbb{E}\, |U|^{2\mu} + \mathbb{E}\, |V(S, T)|^{2\mu} \le C(\mu)$. The moment bound on $U$ follows by exactly the same argument as (A.58) in the proof of Lemma A.5, since the centered random variables $X_x Y_y - \mathbb{E}\, X_x Y_y$ appearing in this sum are almost independent. The moment bound on $V(S, T)$ follows by comparing the moments of $V(S, T)$ with the moments of
\[
\widetilde{V}(S, T) \;:=\; \sum_{x \in S,\, y \in T} \sigma_{xy}\, b_{xy}\, \widetilde{X}_x \widetilde{Y}_y \,,
\]
where $\widetilde{X}_S = (\widetilde{X}_x)_{x \in S}$ and $\widetilde{Y}_T = (\widetilde{Y}_x)_{x \in T}$ are independent families of random variables, independent of $(b, X, Y)$ as well, with the same marginal distributions as $X_S = (X_x)_{x \in S}$ and $Y_T = (Y_x)_{x \in T}$, respectively. The result of this comparison is
\[
\big|\, \mathbb{E}\, |V(S, T)|^{2\mu} - \mathbb{E}\, |\widetilde{V}(S, T)|^{2\mu} \big| \;\le\; C(\mu, \nu)\, N^{-\nu} \,,
\]
because $X_S$ and $Y_T$ are essentially uncorrelated since $d(S, T) \ge \frac{N^{\varepsilon}}{3}$, and because the families $X_S$ and $Y_T$ themselves are already essentially uncorrelated (cf. (A.67)). Finally, the moments of $\widetilde{V}(S, T)$ satisfy the necessary bound by the Marcinkiewicz-Zygmund inequality, as in the proof of Lemma A.5. The details are left to the reader.
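The Marcinkiewicz-Zygmund step invoked for (A.68) and for $\widetilde{V}(S, T)$ bounds a $2\mu$-th moment of $\sum_x b_x X_x$ by $(\sum_x |b_x|^2)^{\mu}$, uniformly in the number of summands. For independent Rademacher signs and $\mu = 2$ this can be checked exactly by enumeration; an illustrative sketch, not from the paper, where the constant $C(2) = 3$ is the classical Khintchine constant for the fourth moment:

```python
import itertools

def fourth_moment(b):
    """E (sum_x b_x X_x)^4 over independent Rademacher signs X_x = +-1,
    computed exactly by enumerating all 2^n sign patterns."""
    n = len(b)
    return sum(sum(s * w for s, w in zip(signs, b)) ** 4
               for signs in itertools.product((-1, 1), repeat=n)) / 2 ** n

b = [0.9, -0.4, 0.25, 0.1, -0.05]
m4 = fourth_moment(b)
s2 = sum(w * w for w in b)
# Marcinkiewicz-Zygmund / Khintchine bound with mu = 2:
#   E |sum_x b_x X_x|^4 <= 3 (sum_x |b_x|^2)^2 ,
# uniformly in the number of summands.
assert m4 <= 3 * s2 ** 2
# For Rademacher signs the fourth moment is explicit:
#   E S^4 = 3 (sum b^2)^2 - 2 sum b^4 <= 3 (sum b^2)^2 .
explicit = 3 * s2 ** 2 - 2 * sum(w ** 4 for w in b)
assert abs(m4 - explicit) < 1e-12
```

Combined with the normalizations $\sum_x |b_x|^2 \le 1$ and $\sum_{x,y} |b_{xy}|^2 \le 1$ from (A.78), this is exactly the dimension-free moment bound the two proofs reduce to.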

References

[1] O. Ajanki, L. Erdős, and T. Krüger. Quadratic vector equations on the complex upper half plane. arXiv:1506.05095.
[2] O. Ajanki, L. Erdős, and T. Krüger. Singularities of solutions to quadratic vector equations on the complex upper half-plane. To appear in Comm. Pure Appl. Math.
[3] O. Ajanki, L. Erdős, and T. Krüger. Local spectral statistics of Gaussian matrices with correlated entries. J. Stat. Phys., pages 1–23, 2016.
[4] O. Ajanki, L. Erdős, and T. Krüger. Universality for general Wigner-type matrices. Probab. Theory Related Fields, pages 1–61, 2016.
[5] G. Anderson and O. Zeitouni. A law of large numbers for finite-range dependent random matrices. Comm. Pure Appl. Math., 61(8):1118–1154, 2008.
[6] M. Banna, F. Merlevède, and M. Peligrad. On the limiting spectral distribution for a large class of symmetric random matrices with correlated entries. Stoch. Proc. Appl., 125(7):2700–2726, 2015.
[7] R. Bauerschmidt, J. Huang, A. Knowles, and H.-T. Yau. Bulk eigenvalue statistics for random regular graphs. arXiv:1505.06700.
[8] R. Bauerschmidt, J. Huang, and H.-T. Yau. Local spectral stability for random regular graphs of fixed degree. arXiv:1609.09052.
[9] F. Bekerman, A. Figalli, and A. Guionnet. Transport maps for Beta-matrix models and universality. Comm. Math. Phys., 338:589–619, 2015.
[10] P. Bourgade, L. Erdős, and H.-T. Yau. Universality of general β-ensembles. Duke Math. J., 163(6):1127–1190, 2014.
[11] P. Bourgade, L. Erdős, H.-T. Yau, and J. Yin. Universality for a class of random band matrices. arXiv:1602.02312.
[12] P. Bourgade and H.-T. Yau. The eigenvector moment flow and local quantum unique ergodicity. arXiv:1312.1301.
[13] A. Boutet de Monvel, A. Khorunzhy, and V. Vasilchuk. Limiting eigenvalue distribution of random matrices with correlated entries. Markov Process. Related Fields, 2(4):607–636, 1996.
[14] Z. Che. Universality of random matrices with correlated entries. arXiv:1604.05709.
[15] E. B. Davies. The functional calculus. J. Lond. Math. Soc. (2), 52(1):166–176, 1995.
[16] P. Deift. Orthogonal polynomials and random matrices: A Riemann-Hilbert approach. Courant Institute of Mathematical Sciences, New York; American Mathematical Society, 1999.
[17] P. Deift and D. Gioev. Random Matrix Theory: Invariant Ensembles and Universality. Courant Institute of Mathematical Sciences, New York; American Mathematical Society, 2009.
[18] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. Spectral statistics of Erdős-Rényi graphs II: Eigenvalue spacing and the extreme eigenvalues. Comm. Math. Phys., 314(3):587–640, 2012.
[19] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. Delocalization and diffusion profile for random band matrices. Comm. Math. Phys., 323:367–416, 2013.
[20] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. The local semicircle law for a general class of random matrices. Electron. J. Probab., 18:1–58, 2013.
[21] L. Erdős, S. Péché, J. A. Ramírez, B. Schlein, and H.-T. Yau. Bulk universality for Wigner matrices. Comm. Pure Appl. Math., 63(7):895–925, 2010.
[22] L. Erdős, J. A. Ramírez, B. Schlein, and H.-T. Yau. Universality of sine-kernel for Wigner matrices with a small Gaussian perturbation. Electron. J. Probab., 15(18):526–603, 2010.
[23] L. Erdős, B. Schlein, and H.-T. Yau. Local semicircle law and complete delocalization for Wigner random matrices. Comm. Math. Phys., 287:641–655, 2010.
[24] L. Erdős, B. Schlein, and H.-T. Yau. Universality of random matrices and local relaxation flow. Invent. Math., 185:75–119, 2011.
[25] L. Erdős, B. Schlein, H.-T. Yau, and J. Yin. The local relaxation flow approach to universality of the local statistics for random matrices. Ann. Inst. Henri Poincaré Probab. Stat., 48(1):1–46, 2012.
[26] L. Erdős and K. Schnelli. Universality for random matrix flows with time-dependent density. arXiv:1504.00650 (to appear in Ann. Inst. Henri Poincaré Probab. Stat.).
[27] L. Erdős, H.-T. Yau, and J. Yin. Bulk universality for generalized Wigner matrices. Probab. Theory Related Fields, 154(1-2):341–407, 2011.
[28] L. Erdős, H.-T. Yau, and J. Yin. Universality for generalized Wigner matrices with Bernoulli distribution. J. Comb., 2(1):15–82, 2011.
[29] L. Erdős, H.-T. Yau, and J. Yin. Rigidity of eigenvalues of generalized Wigner matrices. Adv. Math., 229(3):1435–1515, 2012.
[30] A. Figalli and A. Guionnet. Universality in several-matrix models via approximate transport maps. arXiv:1407.2759.
[31] J. Garnett. Bounded Analytic Functions, volume 236 of Grad. Texts in Math. Springer, New York, 2007.
[32] F. Gesztesy and E. Tsekanovskii. On matrix-valued Herglotz functions. Math. Nachr., 218(1):61–138, 2000.
[33] V. L. Girko. Theory of Stochastic Canonical Equations. Vol. I, volume 535 of Mathematics and its Applications. Kluwer Academic Publishers, Dordrecht, 2001.
[34] F. Götze, A. Naumov, and A. Tikhomirov. Local universality of repulsive particle systems and random matrices. Ann. Probab., 42(6):2207–2242.
[35] F. Götze, A. Naumov, A. Tikhomirov, and D. Timushev. On the local semicircular law for Wigner ensembles. arXiv:1602.03073.
[36] W. Hachem, P. Loubaton, and J. Najim. The empirical eigenvalue distribution of a Gram matrix: From independence to stationarity. Markov Process. Related Fields, 11(4):629–648, 2005.
[37] Y. He, A. Knowles, and R. Rosenthal. Isotropic self-consistent equations for mean-field random matrices. arXiv:1611.05364.
[38] J. W. Helton, R. R. Far, and R. Speicher. Operator-valued semicircular elements: Solving a quadratic matrix equation with positivity constraints. Int. Math. Res. Notices, 2007.
[39] K. Johansson. Universality of the local spacing distribution in certain ensembles of Hermitian Wigner matrices. Comm. Math. Phys., 215(3):683–705, 2001.
[40] A. Knowles and J. Yin. Eigenvector distribution of Wigner matrices. Probab. Theory Related Fields, 155(3):543–582, 2013.
[41] B. Landon, P. Sosoe, and H.-T. Yau. Fixed energy universality of Dyson Brownian motion. arXiv:1609.09011.
[42] B. Landon and H.-T. Yau. Convergence of local statistics of Dyson Brownian motion. arXiv:1504.03605.
[43] J. O. Lee, K. Schnelli, B. Stetler, and H.-T. Yau. Bulk universality for deformed Wigner matrices. arXiv:1405.6634.
[44] M. L. Mehta. Random Matrices. Elsevier, 2014.
[45] L. A. Pastur and M. Shcherbina. Eigenvalue Distribution of Large Random Matrices, volume 171 of Mathematical Surveys and Monographs. Amer. Math. Soc., 2011.
[46] J. H. Schenker and H. Schulz-Baldes. Semicircle law and freeness for random matrices with symmetries or correlations. Math. Res. Lett., 12(4):531–542, 2005.
[47] M. Shcherbina. Change of variables as a method to study general β-models: Bulk universality. J. Math. Phys., 55(4):043504, 2014.
[48] T. Shcherbina. Universality of the local regime for the block band matrices with a finite number of blocks. J. Stat. Phys., 155(3):466–499, 2014.
[49] T. Tao and V. Vu. Random matrices: Universality of local eigenvalue statistics. Acta Math., 206(1):127–204, 2011.
[50] T. Tao and V. Vu. The Wigner-Dyson-Mehta bulk universality conjecture for Wigner matrices. Electron. J. Probab., 16(77):2104–2121, 2011.
[51] T. Tao and V. Vu. Random matrices: Sharp concentration of eigenvalues. Random Matrices Theory Appl., 2(3):1350007, 2013.
[52] T. Tao and V. Vu. Random matrices: the universality phenomenon for Wigner ensembles. In Modern Aspects of Random Matrix Theory, pages 121–172. Amer. Math. Soc., Providence, RI, 2014.
[53] F. J. Wegner. Disordered system with n orbitals per site: n = ∞ limit. Phys. Rev. B, 19, 1979.