Almost all quantum channels are equidistant

Ion Nechita^{4}, Zbigniew Puchała^{1,2}, Łukasz Pawela^{*1} and Karol Życzkowski^{2,3}

arXiv:1612.00401v1 [quant-ph] 1 Dec 2016

^{1} Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
^{2} Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, ulica prof. Stanisława Łojasiewicza 11, Kraków, Poland
^{4} Zentrum Mathematik, M5, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany and CNRS, Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, UPS, F-31062 Toulouse, France

December 2, 2016
Abstract

In this work we analyze properties of generic quantum channels in the case of large system size. We use random matrix theory and free probability to show that the distance between two independent random channels tends to a constant value as the dimension of the system grows. As a measure of the distance we use the diamond norm. In the case of a flat Hilbert-Schmidt distribution on quantum channels, we obtain that the distance converges to \frac12 + \frac{2}{\pi}. Furthermore, we show that for a random state ρ acting on a bipartite Hilbert space H_A ⊗ H_B, sampled from the Hilbert-Schmidt distribution, the reduced states Tr_A ρ and Tr_B ρ are arbitrarily close to the maximally mixed state. This implies that, for large dimensions, the state ρ may be interpreted as a Jamiołkowski state of a unital map.
1 Introduction

We will denote the set of density operators on X by Ω(X). We will denote the set of quantum channels Φ : L(X) → L(Y) by Θ(X, Y), and we put Θ(X, X) ≡ Θ(X). For any linear map Φ : M_{d_1}(C) → M_{d_2}(C), we define its Choi-Jamiołkowski matrix as

    J(Φ) := \sum_{i,j=1}^{d_1} |i⟩⟨j| ⊗ Φ(|i⟩⟨j|) ∈ M_{d_1}(C) ⊗ M_{d_2}(C).    (1)
This isomorphism was first studied by Choi [7] and Jamiołkowski [18]. Note that some authors prefer to add a normalization factor of d_1^{-1} in front of the expression for J(Φ). Other authors use the other order for the tensor product factors, a choice resulting in an awkward order for the space in which J(Φ) lives. The rank of the matrix J(Φ) is called the Choi rank of Φ; it is the minimum number r such that the map Φ can be written as

    Φ(·) = \sum_{i=1}^{r} A_i \cdot B_i^*,

* [email protected]
for some operators A_i, B_i ∈ M_{d_2 × d_1}(C). Given two pairs of unitary operators, U_{1,2} ∈ U(d_1), respectively V_{1,2} ∈ U(d_2), define the "rotated map"

    Φ_{U_1,U_2;V_1,V_2}(·) := V_1 Φ(U_1 \cdot U_2^*) V_2^*.    (2)

It is an easy exercise to check that the Choi-Jamiołkowski matrix of the rotated map is given by

    J(Φ_{U_1,U_2;V_1,V_2}) = (U_1^⊤ ⊗ V_1) J(Φ) (U_2^⊤ ⊗ V_2)^*.    (3)

The transposition appearing in the equation above is due to the following key fact (here U is an arbitrary unitary operator):

    (I ⊗ U) \sum_i |i⟩ ⊗ |i⟩ = (U^⊤ ⊗ I) \sum_i |i⟩ ⊗ |i⟩.
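The construction in Eq. (1) is straightforward to implement numerically. The following minimal numpy sketch (the function name `choi` and the example channel are our own, not from the paper) builds J(Φ) matrix unit by matrix unit and checks that trace preservation of Φ translates into Tr_2 J(Φ) = I_{d_1}:

```python
import numpy as np

def choi(phi, d1, d2):
    """J(Phi) = sum_{i,j} |i><j| (x) Phi(|i><j|), the (unnormalized) matrix of Eq. (1)."""
    J = np.zeros((d1 * d2, d1 * d2), dtype=complex)
    for i in range(d1):
        for j in range(d1):
            Eij = np.zeros((d1, d1), dtype=complex)
            Eij[i, j] = 1.0
            J += np.kron(Eij, phi(Eij))
    return J

# Example: the completely depolarizing channel Phi(X) = Tr(X) I / d2
d1 = d2 = 3
J = choi(lambda X: np.trace(X) * np.eye(d2) / d2, d1, d2)

# Trace preservation of Phi is equivalent to Tr_2 J(Phi) = I_{d1}
Tr2J = np.einsum('ikjk->ij', J.reshape(d1, d2, d1, d2))
print(np.allclose(Tr2J, np.eye(d1)))  # True
```

With the tensor order chosen in Eq. (1), the partial trace over the second (output) factor recovers the identity, which is the trace-preservation condition recalled later in Section 3.1.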
The diamond norm was introduced in Quantum Information Theory by Kitaev [17, Section 3.3] as a counterpart to the 1-norm in the task of distinguishing quantum channels. First, define the 1 → 1 norm of a linear map Φ : M_{d_1}(C) → M_{d_2}(C) as

    \|Φ\|_{1→1} := \sup_{X ≠ 0} \frac{\|Φ(X)\|_1}{\|X\|_1}.

Kitaev noticed that the 1 → 1 norm is not stable under tensor products (as can easily be seen by looking at the transposition map), and considered the following "regularization":

    \|Φ\|_⋄ := \sup_{n ≥ 1} \|Φ ⊗ \mathrm{id}_n\|_{1→1}.
In operator theory, the diamond norm was known before as the completely bounded trace norm; indeed, the 1 → 1 norm of an operator is the ∞ → ∞ norm of its dual, hence the diamond norm of Φ is equal to the completely bounded (operator) norm of Φ^* (see [24, Chapter 3]). We shall need two simple properties of the diamond norm. First, note that the supremum in the definition can be replaced by taking the value n = d_1 (recall that d_1 is the dimension of the input Hilbert space of the linear map Φ); actually, one could also take n equal to the Choi rank of the map Φ, see [28, Theorem 3.3] or [29, Theorem 3.66]. Second, using the fact that the extremal points of the unit ball of the 1-norm are unit rank matrices, we always have

    \|Φ\|_⋄ = \sup \{ \|(Φ ⊗ \mathrm{id}_{d_1})(|x⟩⟨y|)\|_1 : x, y ∈ C^{d_1} ⊗ C^{d_1}, \|x\| = \|y\| = 1 \}.

Moreover, if the map Φ is Hermiticity-preserving (e.g. Φ is the difference of two quantum channels), one can optimize over x = y in the formula above, see [29, Theorem 3.53]. Given a map Φ, it is in general difficult to compute its diamond norm. Computationally, there is a semidefinite program for the diamond norm [30], which has a simple form and which has been implemented in various places (see, e.g. [20]). We bound next the diamond norm in terms of the partial trace of the absolute value of the Choi-Jamiołkowski matrix.

The diamond norm has applications in the problem of quantum channel discrimination. Suppose we have an experiment in which our goal is to distinguish between two quantum channels Φ and Ψ, each of which may appear with probability 1/2. Then, a celebrated result by Helstrom [15] gives an upper bound on the probability of correct discrimination:

    p ≤ \frac12 + \frac14 \|Φ − Ψ\|_⋄.    (4)

From the discussion in this section we readily arrive at

    p ≤ \frac12 + \frac14 \| \mathrm{Tr}_2 |J(Φ − Ψ)| \|_∞.    (5)
The main goal of this work is to determine the asymptotic behavior of the diamond norm. To achieve this, in Section 2 we find a new upper bound on the diamond norm of a general map. In the case of a Hermiticity-preserving map it has the simple form

    \|Φ\|_⋄ ≤ \| \mathrm{Tr}_2 |J(Φ)| \|_∞.    (6)

Next, in Section 4 we prove that the well known lower bound on the diamond norm, d_1^{-1} \|J(Φ − Ψ)\|_1 ≤ \|Φ − Ψ\|_⋄, converges to a finite value for random independent quantum channels Φ and Ψ in the limit d_{1,2} → ∞. We obtain that for channels sampled from the flat Hilbert-Schmidt distribution the value of the lower bound is

    \lim_{d_{1,2} → ∞} \frac{1}{d_1} \|J(Φ − Ψ)\|_1 = \frac12 + \frac{2}{\pi} \quad a.s.    (7)

The general case is discussed in depth in the aforementioned section. Finally, in Section 5 we show that the upper bound converges to the same value as the lower bound. From these results we have, for channels sampled from the Hilbert-Schmidt distribution,

    \lim_{d_{1,2} → ∞} \|Φ − Ψ\|_⋄ = \frac12 + \frac{2}{\pi} \quad a.s.    (8)
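As a concrete illustration of the bounds behind (5) and (6) (our own sketch, with our own function names), one can compare the quantities d_1^{-1}\|J\|_1 and \|Tr_2|J|\|_∞ for the difference between the identity channel and the completely depolarizing channel on qubits, whose diamond distance is known to be 2(1 − 1/d^2) = 3/2:

```python
import numpy as np

def choi(phi, d):
    J = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0
            J += np.kron(E, phi(E))
    return J

def diamond_bounds(Jdiff, d1, d2):
    """Lower bound d1^{-1}||J||_1 and upper bound ||Tr_2 |J| ||_inf for self-adjoint J."""
    w, V = np.linalg.eigh(Jdiff)              # Jdiff is Hermitian here
    absJ = (V * np.abs(w)) @ V.conj().T       # spectral absolute value |J|
    lower = np.abs(w).sum() / d1
    Tr2 = np.einsum('ikjk->ij', absJ.reshape(d1, d2, d1, d2))
    upper = np.linalg.norm(Tr2, 2)            # operator (infinity) norm
    return lower, upper

d = 2
Jdiff = choi(lambda X: X, d) - choi(lambda X: np.trace(X) * np.eye(d) / d, d)
lo, up = diamond_bounds(Jdiff, d, d)
print(lo, up)  # 1.5 1.5
```

In this particular example both bounds collapse onto the exact diamond distance 3/2, so the sandwich is tight; for generic maps the two values differ.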
2 Some useful bounds for the diamond norm
We discuss in this section some bounds for the diamond norm. For a matrix X, we denote by \sqrt{X^* X} and \sqrt{X X^*} its right and left absolute values, i.e.

    \sqrt{X^* X} = V Σ V^* \quad \text{and} \quad \sqrt{X X^*} = U Σ U^*,

when X = U Σ V^* is the SVD of X. In the case where X is self-adjoint, we obviously have \sqrt{X^* X} = \sqrt{X X^*}. In the result below, the lower bound is well-known, while the upper bound appears in a weaker and less general form in [19, Theorem 2].

Proposition 1 For any linear map Φ : M_{d_1}(C) → M_{d_2}(C), we have

    \frac{1}{d_1} \|J(Φ)\|_1 \;≤\; \|Φ\|_⋄ \;≤\; \frac{\| \mathrm{Tr}_2 \sqrt{J(Φ)^* J(Φ)} \|_∞ + \| \mathrm{Tr}_2 \sqrt{J(Φ) J(Φ)^*} \|_∞}{2}.    (9)
Proof. Consider the semidefinite programs for the diamond norm given in [30, Section 3.2]:

Primal problem
    maximize:  \frac12 ⟨X, J(Φ)⟩ + \frac12 ⟨X^*, J(Φ)^*⟩
    subject to:  \begin{pmatrix} ρ_0 ⊗ I_{d_2} & X \\ X^* & ρ_1 ⊗ I_{d_2} \end{pmatrix} ≥ 0,
                 ρ_0, ρ_1 density matrices in M_{d_1}(C),  X ∈ M_{d_1 d_2}(C)

Dual problem
    minimize:  \frac12 \|\mathrm{Tr}_2 Y_0\|_∞ + \frac12 \|\mathrm{Tr}_2 Y_1\|_∞
    subject to:  \begin{pmatrix} Y_0 & -J(Φ) \\ -J(Φ)^* & Y_1 \end{pmatrix} ≥ 0,
                 Y_0, Y_1 ∈ M_{d_1 d_2}^{+}(C)

The lower and upper bounds will follow from very simple feasible points for the primal, resp. the dual problems. Let J(Φ) = U Σ V^* be an SVD of the Choi-Jamiołkowski state of the linear map. For the primal problem, consider the feasible point ρ_{0,1} = d_1^{-1} I_{d_1} and X = d_1^{-1} U V^*. The value of the primal problem at this point is

    \frac{1}{2 d_1} ⟨U V^*, U V^* |J(Φ)|⟩ + \frac{1}{2 d_1} ⟨V U^*, |J(Φ)| V U^*⟩ = \frac{1}{d_1} \|J(Φ)\|_1,
showing the lower bound.

For the upper bound, set Y_0 = \sqrt{J(Φ) J(Φ)^*} = U Σ U^* and Y_1 = \sqrt{J(Φ)^* J(Φ)} = V Σ V^*, both PSD matrices. The condition in the dual problem is satisfied:

    \begin{pmatrix} Y_0 & -J(Φ) \\ -J(Φ)^* & Y_1 \end{pmatrix}
      = \begin{pmatrix} U Σ U^* & -U Σ V^* \\ -V Σ U^* & V Σ V^* \end{pmatrix}
      = \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix} \cdot \left[ \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} ⊗ Σ \right] \cdot \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}^* ≥ 0,

and the proof is complete. □

Remark 2 If the map Φ is Hermiticity-preserving (i.e. the matrix J(Φ) is self-adjoint), the inequality in the statement reads simply

    \frac{1}{d_1} \|J(Φ)\|_1 ≤ \|Φ\|_⋄ ≤ \| \mathrm{Tr}_2 |J(Φ)| \|_∞.

Remark 3 The two bounds in (9) are equal iff the PSD matrices φ := \mathrm{Tr}_2 \sqrt{J(Φ)^* J(Φ)} and ψ := \mathrm{Tr}_2 \sqrt{J(Φ) J(Φ)^*} are both scalar. Indeed, the lower bound in (9) can be rewritten as

    \frac{1}{d_1} \|J(Φ)\|_1 = \frac{1}{d_1} \mathrm{Tr}\, φ = \frac{1}{d_1} \mathrm{Tr}\, ψ,

and the two bounds are equal exactly when the spectra of φ and ψ are flat.

Remark 4 The upper bound in (9) can be seen as a strengthening of the inequality \|Φ\|_⋄ ≤ \|J(Φ)\|_1, which already appeared in the literature (e.g. [29, Section 3.4]). Indeed, again in terms of φ and ψ, we have \|φ\|_∞ ≤ \|φ\|_1 and \|ψ\|_∞ ≤ \|ψ\|_1.
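A quick numerical sanity check of Remark 2 (our own sketch): any self-adjoint matrix J is the Choi matrix of some Hermiticity-preserving map, and the lower bound d_1^{-1}\|J\|_1 = d_1^{-1}\mathrm{Tr}(\mathrm{Tr}_2|J|) can never exceed the upper bound \|\mathrm{Tr}_2|J|\|_∞, since the normalized trace of the PSD matrix \mathrm{Tr}_2|J| is dominated by its largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(6)

d1 = d2 = 4
n = d1 * d2
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
J = (H + H.conj().T) / 2                     # a generic self-adjoint "Choi matrix"
w, V = np.linalg.eigh(J)
absJ = (V * np.abs(w)) @ V.conj().T          # |J| by spectral calculus
lower = np.abs(w).sum() / d1                 # d1^{-1} ||J||_1 = d1^{-1} Tr(Tr_2 |J|)
Tr2absJ = np.einsum('ikjk->ij', absJ.reshape(d1, d2, d1, d2))
upper = np.linalg.norm(Tr2absJ, 2)           # || Tr_2 |J| ||_inf
print(lower <= upper)  # True
```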
3 Discriminating random quantum channels

3.1 Probability distributions on the set of quantum channels
There are several ways to endow the convex body of quantum channels with probability distributions. In this section, we discuss several possibilities and the relations between them. Recall that the Choi-Jamiołkowski isomorphism puts into correspondence a quantum channel Φ : M_{d_1}(C) → M_{d_2}(C) with a bipartite matrix J(Φ) ∈ M_{d_1}(C) ⊗ M_{d_2}(C) having the following two properties:

• J(Φ) is positive semidefinite;
• Tr_2 J(Φ) = I_{d_1}.

The above two properties correspond, respectively, to the fact that Φ is completely positive and trace preserving. Hence, it is natural to consider probability measures on quantum channels obtained as the image measures of probabilities on the set of bipartite matrices with the above properties. Given some fixed dimensions d_1, d_2 and a parameter s ≥ d_1 d_2, let G ∈ M_{d_1 d_2 × s}(C) be a random matrix having i.i.d. standard complex Gaussian entries; such a matrix is called a Ginibre random matrix. Define then

    W := G G^* ∈ M_{d_1}(C) ⊗ M_{d_2}(C)    (10)
    D := \left[ (\mathrm{Tr}_2 W)^{-1/2} ⊗ I_{d_2} \right] W \left[ (\mathrm{Tr}_2 W)^{-1/2} ⊗ I_{d_2} \right] ∈ M_{d_1}(C) ⊗ M_{d_2}(C).    (11)
The random matrices W and D are called, respectively, Wishart and partially normalized Wishart. The inverse square root in the definition of D uses the Moore-Penrose convention if W is not invertible; note however that this is almost never the case, since a Wishart matrix with parameter s larger than its size is invertible with unit probability. It is for this reason that we do not consider here smaller integer parameters s. Note that the matrix D satisfies the two conditions discussed above: it is positive semidefinite and its partial trace over the second tensor factor is the identity:

    \mathrm{Tr}_2 D = \mathrm{Tr}_2 \left[ \left( (\mathrm{Tr}_2 W)^{-1/2} ⊗ I_{d_2} \right) W \left( (\mathrm{Tr}_2 W)^{-1/2} ⊗ I_{d_2} \right) \right] = (\mathrm{Tr}_2 W)^{-1/2} (\mathrm{Tr}_2 W) (\mathrm{Tr}_2 W)^{-1/2} = I_{d_1}.

Hence, there exists a quantum channel Φ_G such that J(Φ_G) = D (note that D, and thus Φ_G, are functions of the original Ginibre random matrix G).

Definition 1 The image measure of the standard Gaussian measure through the map G ↦ Φ_G defined in (10), (11) and the equation J(Φ_G) = D is called the partially normalized Wishart measure and is denoted by γ^W_{d_1,d_2,s}.

Another way of introducing a probability distribution on the set of quantum channels is via the Stinespring dilation theorem [27]: for any channel Φ : M_{d_1}(C) → M_{d_2}(C), there exists, for some given s ≤ d_1 d_2, an isometry V : C^{d_1} → C^{d_2} ⊗ C^s such that

    Φ(·) = \mathrm{Tr}_2(V \cdot V^*).    (12)

Definition 2 For any integer parameter s, let γ^H_{d_1,d_2,s} be the image measure of the Haar distribution on isometries V through the map in (12).

Finally, one can consider the Lebesgue measure γ^L_{d_1,d_2} on the convex body of quantum channels. In this work, we shall however be concerned only with the measure γ^W coming from partially normalized Wishart matrices.
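Definitions (10)-(11) give a direct recipe for sampling from γ^W. A minimal numpy implementation (function and variable names are our own):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_channel_choi(d1, d2, s, rng):
    """Choi matrix D of a random channel with distribution gamma^W (Definition 1)."""
    G = (rng.standard_normal((d1 * d2, s))
         + 1j * rng.standard_normal((d1 * d2, s))) / np.sqrt(2)
    W = G @ G.conj().T                                       # Wishart matrix, Eq. (10)
    Tr2W = np.einsum('ikjk->ij', W.reshape(d1, d2, d1, d2))  # partial trace over factor 2
    w, V = np.linalg.eigh(Tr2W)
    R = np.kron((V * w ** -0.5) @ V.conj().T, np.eye(d2))    # (Tr_2 W)^{-1/2} (x) I_{d2}
    return R @ W @ R                                         # Eq. (11)

d1, d2 = 3, 4
D = sample_channel_choi(d1, d2, d1 * d2, rng)
Tr2D = np.einsum('ikjk->ij', D.reshape(d1, d2, d1, d2))
print(np.allclose(Tr2D, np.eye(d1)))  # True: D is a valid Choi matrix
```

Since conjugation by a matrix of the form A ⊗ I only acts on the first factor, the partial trace of D equals (Tr_2 W)^{-1/2}(Tr_2 W)(Tr_2 W)^{-1/2} = I, exactly as in the computation above.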
3.2 The (two-parameter) subtracted Marčenko-Pastur distribution
In this section we introduce and study the basic properties of a two-parameter family of probability measures which will appear later in the paper. This family generalizes the symmetrized Marčenko-Pastur distributions from [26]; see also [13, 23] for other occurrences of some special cases. Before we start, recall that the Marčenko-Pastur (or free Poisson) distribution of parameter x > 0 has density given by [22, Proposition 12.11]

    dMP_x = \max(1 − x, 0)\, δ_0 + \frac{\sqrt{4x − (u − 1 − x)^2}}{2 \pi u} \, 1_{[a,b]}(u) \, du,

where a = (\sqrt{x} − 1)^2 and b = (\sqrt{x} + 1)^2.

Definition 3 Let a, b be two free random variables having Marčenko-Pastur distributions with respective parameters x and y. The distribution of the random variable a/x − b/y is called the subtracted Marčenko-Pastur distribution with parameters x, y and is denoted by SMP_{x,y}. In other words,

    SMP_{x,y} = D_{1/x} MP_x ⊞ D_{-1/y} MP_y.    (13)
We have the following result.

Proposition 5 Let W_x (resp. W_y) be two Wishart matrices of parameters (d, s_x) (resp. (d, s_y)). Assuming that s_x/d → x and s_y/d → y for some constants x, y > 0, then, almost surely as d → ∞, we have

    \lim_{d → ∞} \| (x d^2)^{-1} W_x − (y d^2)^{-1} W_y \|_1 = \int |u| \, dSMP_{x,y}(u) =: ∆(x, y).
Proof. The proof follows from standard arguments in random matrix theory, and from the fact that the Schatten 1-norm is the sum of the singular values, which are the absolute values of the eigenvalues in the case of self-adjoint matrices. □

We gather next some properties of the probability measure SMP_{x,y}. Examples of this distribution are shown in Fig. 1.

Proposition 6 Let x, y > 0. Then,

1. If x + y < 1, then the probability measure SMP_{x,y} has exactly one atom, located at 0, of mass 1 − (x + y). If x + y ≥ 1, then SMP_{x,y} is absolutely continuous with respect to the Lebesgue measure on R.

2. Define

    U_{x,y}(u) = 9u^2 (x + y + 2) − 9u (x − y)(x + y − 1) + 2(x + y − 1)^3
    T_{x,y}(u) = (x + y − 1)^2 + 3u (y − x + u)    (14)
    Y_{x,y}(u) = U_{x,y}(u) + \sqrt{[U_{x,y}(u)]^2 − 4 [T_{x,y}(u)]^3}.

The support of the absolutely continuous part of SMP_{x,y} is the set

    \{ u : [U_{x,y}(u)]^2 − 4 [T_{x,y}(u)]^3 ≥ 0 \}.    (15)

This set is the union of two intervals if y ∈ (0, y_c) and it is connected when y ≥ y_c, with y_c = −x + 3(2x)^{2/3} − 6(2x)^{1/3} + 4.

3. On its support, the density of SMP_{x,y} is given by

    \frac{dSMP_{x,y}}{du} = \frac{[Y_{x,y}(u)]^{2/3} − 2^{2/3} T_{x,y}(u)}{2^{4/3} \sqrt{3} \, \pi u \, [Y_{x,y}(u)]^{1/3}}.    (16)
Proof. The statement regarding the atoms follows from [4, Theorem 7.4]. The formula for the density and equation (15) come from Stieltjes inversion, see e.g. [22, Lecture 12]. Indeed, since the R-transform of the Marčenko-Pastur distribution MP_x reads R_x(z) = x/(1 − z), the R-transform of the subtracted measure reads

    R(z) = \frac{1}{1 − z/x} − \frac{1}{1 + z/y}.

The Cauchy transform G of SMP_{x,y} is the functional inverse of K(z) = R(z) + 1/z. To write down the explicit formula for G, one has to solve a degree 3 polynomial equation, and we omit here the details. The statement regarding the number of intervals of the support follows from (15). The inequality is given by a polynomial of degree 6 which factorizes by u^2, hence an effective degree 4 polynomial. The nature of the roots of this polynomial is given by the sign of its discriminant, which, after some algebra, is the same as the sign of y − y_c, see [35]. □

In the case where x = y, some of the formulas from the result above become simpler (see also [26]). The distribution SMP_{x,x} is supported between

    u_± = ± \frac{1}{\sqrt{2}} \sqrt{10x − x^2 + (x + 4)^{3/2} \sqrt{x} + 2}.

Finally, in the case x = y = 1, which corresponds to a flat Hilbert-Schmidt measure on the set of quantum channels, we get that ∆(1, 1) = \frac12 + \frac{2}{\pi}.
Figure 1: Subtracted Marčenko-Pastur distribution for (x, y) = (1, 1) (a), (1, 2) (b), (0.5, 1) (c) and (0.2, 0.5) (d). The red curve is the plot of (16), while the black histogram corresponds to a Monte Carlo simulation.
3.3 The asymptotic diamond norm of the difference of two random quantum channels
We state here the main result of the paper. For the proof, see the following two sections, each providing one of the bounds needed to conclude.

Theorem 7 Let Φ, resp. Ψ, be two independent random quantum channels from Θ(d_1, d_2) having γ^W distribution with parameters (d_1, d_2, s_x), resp. (d_1, d_2, s_y). Then, almost surely as d_{1,2} → ∞ in such a way that s_x/(d_1 d_2) → x, s_y/(d_1 d_2) → y (for some positive constants x, y), and d_1 ≪ d_2^2,

    \lim_{d_{1,2} → ∞} \|Φ − Ψ\|_⋄ = ∆(x, y) = \int |u| \, dSMP_{x,y}(u).

Proof. The proof follows from Theorems 9 and 13, which give the same asymptotic value. □

Corollary 8 Combining Theorem 7 with Helstrom's theorem for quantum channels, in the case x = y = 1 we get that the asymptotic probability p of distinguishing two quantum channels is equal to

    p = \frac{5}{8} + \frac{1}{2\pi}.    (17)

Additionally, any maximally entangled state may be used to achieve this value.
4 The lower bound
In this section we compute the asymptotic value of the lower bound in Theorem 7. Given two random quantum channels Φ, Ψ, we are interested in the asymptotic value of the quantity d_1^{-1} \|J(Φ − Ψ)\|_1.

Theorem 9 Let Φ, resp. Ψ, be two independent random quantum channels from Θ(d_1, d_2) having γ^W distribution with parameters (d_1, d_2, s_x), resp. (d_1, d_2, s_y). Then, almost surely as d_{1,2} → ∞ in such a way that s_x/(d_1 d_2) → x and s_y/(d_1 d_2) → y for some positive constants x, y,

    \lim_{d_{1,2} → ∞} \frac{1}{d_1} \|J(Φ − Ψ)\|_1 = ∆(x, y) = \int |u| \, dSMP_{x,y}(u).

The proof of this result (as well as the proof of Theorem 13) uses in a crucial manner the following approximation result for partially normalized Wishart matrices.

Proposition 10 Let W ∈ M_{d_1}(C) ⊗ M_{d_2}(C) be a random Wishart matrix of parameters (d_1 d_2, s), and consider its "partial normalization" D as in (11). Then, almost surely as d_{1,2} → ∞ in such a way that s ∼ t d_1 d_2 for a fixed parameter t > 0,

    \| D − (t d_1 d_2^2)^{-1} W \|_∞ = O(d_2^{-2}).

Note that in the statement above, the matrix W is not normalized; we have

    \frac{1}{d_1 d_2} \sum_{i=1}^{d_1 d_2} δ_{λ_i((d_1 d_2)^{-1} W)} → MP_t,

the Marčenko-Pastur distribution of parameter t. In other words, W = G G^*, where G is a random matrix of size d_1 d_2 × s, having i.i.d. standard complex Gaussian entries. Let us introduce the random matrices

    X = (t d_1 d_2^2)^{-1} \mathrm{Tr}_2 W \quad \text{and} \quad Y = X^{-1/2} ⊗ I_{d_2}.
The first observation we make is that the random matrix X is also a (rescaled) Wishart matrix. Indeed, the partial trace operation can be seen, via duality, as a matrix product, so we can write

    X = \frac{1}{t d_1 d_2^2} \tilde{G} \tilde{G}^*,

where \tilde{G} is a complex Gaussian matrix of size d_1 × d_2 s; remember that s scales like t d_1 d_2. Since, in our model, both d_1 and d_2 grow to infinity, the behavior of the random matrix X follows from [10].

Lemma 11 As d_{1,2} → ∞, the random matrix \sqrt{t} d_2 (X − I_{d_1}) converges in moments toward a standard semicircular distribution. Moreover, almost surely, the limiting eigenvalues converge to the edges of the support of the limiting distribution:

    \sqrt{t} d_2 \, λ_{min}(X − I_{d_1}) → −2,
    \sqrt{t} d_2 \, λ_{max}(X − I_{d_1}) → 2.

Proof. The proof is a direct application of [10, Corollary 2.5 and Theorem 2.7]; we just need to check the normalization factors. In the setting of [10, Section 2], the Wishart matrices are not normalized, so the convergence result deals with the random matrices (here the dimension is d_1 and the number of columns is t d_1 d_2^2)

    \sqrt{\frac{t d_1 d_2^2}{d_1}} \left( \frac{\tilde{G} \tilde{G}^*}{t d_1 d_2^2} − I_{d_1} \right) = \sqrt{t} \, d_2 (X − I_{d_1}). \; □
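Lemma 11 is easy to probe numerically (a sketch with our own variable names): form X directly from a Ginibre tensor by contracting the second factor and the sample index, then inspect the extreme eigenvalues of √t d_2 (X − I):

```python
import numpy as np

rng = np.random.default_rng(2)

t = 1.0
d1 = d2 = 40
s = int(t * d1 * d2)
# X = (t d1 d2^2)^{-1} Tr_2(G G*): contract the second tensor factor and the
# sample index of the Ginibre tensor directly, avoiding the large matrix W
G = (rng.standard_normal((d1, d2, s)) + 1j * rng.standard_normal((d1, d2, s))) / np.sqrt(2)
X = np.einsum('ikm,jkm->ij', G, G.conj()) / (t * d1 * d2 ** 2)
scaled = np.sqrt(t) * d2 * (np.linalg.eigvalsh(X) - 1.0)
print(scaled.min(), scaled.max())  # close to the semicircle edges -2 and +2
```

Already at d_1 = d_2 = 40 the rescaled extreme eigenvalues sit within a few percent of ±2.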
We look now for a similar result for the matrix Y; the result follows by functional calculus.

Lemma 12 Almost surely as d_{1,2} → ∞, the limiting eigenvalues of the random matrix \sqrt{t} d_2 (Y − I_{d_1 d_2}) converge respectively to ±1:

    \sqrt{t} d_2 \, λ_{min}(Y − I_{d_1 d_2}) → −1,
    \sqrt{t} d_2 \, λ_{max}(Y − I_{d_1 d_2}) → 1.

Proof. By functional calculus, we have λ_{max}(Y) = [λ_{min}(X)]^{-1/2}, so, using the previous lemma, we get

    λ_{max}(Y) = \left( 1 − \frac{2}{\sqrt{t} d_2} + o(d_2^{-1}) \right)^{-1/2} = 1 + \frac{1}{\sqrt{t} d_2} + o(d_2^{-1}),

and the conclusion follows. The case of λ_{min}(Y) is similar. □

We have now all the ingredients to prove Proposition 10.

Proof of Proposition 10. We have
    D − (t d_1 d_2^2)^{-1} W = (t d_1 d_2^2)^{-1} (Y W Y − W) = (t d_1 d_2^2)^{-1} \left[ (Y − I) W Y + W (Y − I) \right],

hence

    \| D − (t d_1 d_2^2)^{-1} W \|_∞ ≤ (t d_1 d_2^2)^{-1} \|Y − I\| \, \|W\| \, (1 + \|Y\|)
      = \frac{t^{-3/2}}{d_2^2} \cdot \sqrt{t} d_2 \|Y − I\| \cdot (d_1 d_2)^{-1} \|W\| \cdot (1 + \|Y\|).

Note that, almost surely, the three random matrix norms in the last equation above converge respectively to the following finite quantities:

    \sqrt{t} d_2 \|Y − I\| → 1
    (d_1 d_2)^{-1} \|W\| → (\sqrt{t} + 1)^2
    1 + \|Y\| → 2.

The first and the third limit above follow from Lemma 12, while the second one is the Bai-Yin theorem [3, Theorem 2] or [2, Theorem 5.11]. □

Let us now prove Theorem 9.

Proof of Theorem 9. The result follows easily by approximating the partially normalized Wishart matrices with scalar normalizations. By the triangle inequality, with D_x := J(Φ) and D_y := J(Ψ), we have

    \left| \frac{1}{d_1} \|D_x − D_y\|_1 − \frac{1}{d_1} \|(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y\|_1 \right|
      ≤ \frac{1}{d_1} \|D_x − (x d_1 d_2^2)^{-1} W_x\|_1 + \frac{1}{d_1} \|D_y − (y d_1 d_2^2)^{-1} W_y\|_1
      ≤ d_2 \|D_x − (x d_1 d_2^2)^{-1} W_x\|_∞ + d_2 \|D_y − (y d_1 d_2^2)^{-1} W_y\|_∞.

The conclusion follows from Propositions 5 and 10. □
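The convergence claimed in Theorem 9 is already visible at moderate dimensions. The sketch below (our own code; `sample_channel_choi` implements Eqs. (10)-(11)) draws two channels from γ^W with x = y = 1 and evaluates the normalized trace norm of the difference of their Choi matrices:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_channel_choi(d1, d2, s, rng):
    """Partially normalized Wishart sample, Eqs. (10)-(11)."""
    G = (rng.standard_normal((d1 * d2, s))
         + 1j * rng.standard_normal((d1 * d2, s))) / np.sqrt(2)
    W = G @ G.conj().T
    Tr2W = np.einsum('ikjk->ij', W.reshape(d1, d2, d1, d2))
    w, V = np.linalg.eigh(Tr2W)
    R = np.kron((V * w ** -0.5) @ V.conj().T, np.eye(d2))
    return R @ W @ R

d1 = d2 = 16                        # s = d1 d2 gives x = y = 1
Dx = sample_channel_choi(d1, d2, d1 * d2, rng)
Dy = sample_channel_choi(d1, d2, d1 * d2, rng)
lower = np.abs(np.linalg.eigvalsh(Dx - Dy)).sum() / d1   # d1^{-1} ||J(Phi - Psi)||_1
print(lower, 0.5 + 2 / np.pi)       # the bound is already near Delta(1,1)
```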
5 The upper bound
The core technical result of this work consists of deriving the asymptotic value of the upper bound in Theorem 7. Given two random quantum channels Φ, Ψ, we are interested in the asymptotic value of the quantity \| \mathrm{Tr}_2 |J(Φ − Ψ)| \|_∞.

Theorem 13 Let Φ, resp. Ψ, be two independent random quantum channels from Θ(d_1, d_2) having γ^W distribution with parameters (d_1, d_2, s_x), resp. (d_1, d_2, s_y). Then, almost surely as d_{1,2} → ∞ in such a way that s_x/(d_1 d_2) → x, s_y/(d_1 d_2) → y (for some positive constants x, y), and d_1/d_2^2 → 0,

    \lim_{d_{1,2} → ∞} \| \mathrm{Tr}_2 |J(Φ − Ψ)| \|_∞ = ∆(x, y) = \int |u| \, dSMP_{x,y}(u).

The proof of Theorem 13 is presented at the end of this section. It is based on the following lemma, which appears in [12]; see also [6, Eq. (5.10)] or [5, Chapter X].

Lemma 14 For any matrices A, B of size d, the following holds:

    \| |A| − |B| \| ≤ C \log d \, \|A − B\|,    (18)

for a universal constant C which does not depend on the dimension d. For the sake of completeness, we give here a proof, relying on a similar estimate for the Schatten classes proved in [12].

Proof. Using [12, Theorem 8], we have, for any p ∈ [2, ∞):

    \| |A| − |B| \|_∞ ≤ \| |A| − |B| \|_p ≤ 4(1 + cp) \|A − B\|_p ≤ 4(1 + cp) d^{1/p} \|A − B\|_∞,

for some universal constant c ≥ 1. Choosing p = \log d gives the desired bound for d large enough. The case of small values of d is obtained by a standard embedding argument. □

Proof of Theorem 13. Using the triangle inequality and Lemma 14, we first prove an approximation result (as before, we write D_x := J(Φ) and D_y := J(Ψ)):

    \left| \| \mathrm{Tr}_2 |D_x − D_y| \|_∞ − \| \mathrm{Tr}_2 |(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y| \|_∞ \right|
      ≤ \| \mathrm{Tr}_2 |D_x − D_y| − \mathrm{Tr}_2 |(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y| \|_∞
      = \| \mathrm{Tr}_2 \left( |D_x − D_y| − |(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y| \right) \|_∞
      ≤ d_2 \| |D_x − D_y| − |(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y| \|_∞
      ≤ C d_2 \log(d_1 d_2) \| (D_x − D_y) − \left( (x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y \right) \|_∞
      ≤ C d_2 \log(d_1 d_2) \left( \| D_x − (x d_1 d_2^2)^{-1} W_x \|_∞ + \| D_y − (y d_1 d_2^2)^{-1} W_y \|_∞ \right)
      = \frac{\log(d_1 d_2)}{d_2} O(1) → 0,

where we have used Proposition 10 and the fact that d_1 ≪ d_2^2 implies \log(d_1 d_2) ≪ d_2. This proves the approximation result, and we focus now on the simpler case of Wishart matrices. Let us define

    Z := (x d_1 d_2)^{-1} W_x − (y d_1 d_2)^{-1} W_y
    \tilde{Z}_1 := \mathrm{tr}_2(|Z|) = \mathrm{Tr}_2 |(x d_1 d_2^2)^{-1} W_x − (y d_1 d_2^2)^{-1} W_y|.

It follows from [16, Proposition 4.4.9] that the random matrix Z converges almost surely (see Appendix A for the definition of almost sure convergence for a sequence of random matrices) to a non-commutative random variable having distribution SMP_{x,y}, see (13). Moreover, using a standard strong convergence argument [21], the extremal eigenvalues of Z converge almost surely to the extremal points of the support of the limiting probability measure SMP_{x,y}. Hence, the almost sure convergence extends from the traces of the powers of Z to any continuous bounded function (on the support of SMP_{x,y}), in particular to the absolute value, i.e. to |Z|. From Proposition 16, the asymptotic spectrum of the random matrix \tilde{Z}_1 is flat, with all the eigenvalues being equal to

    a = \lim_{d_1, d_2 → ∞} \mathbb{E} \, \frac{\mathrm{Tr} \, |(x d_1 d_2)^{-1} W_x − (y d_1 d_2)^{-1} W_y|}{d_1 d_2} = \int |u| \, dSMP_{x,y}(u),

which, by Proposition 5, is equal to ∆(x, y), finishing the proof. □

Remark 15 We think that the condition d_1 ≪ d_2^2 in the statement can be replaced by a much weaker condition.
6 Concluding remarks
In this work we analyzed properties of generic quantum channels, concentrating on the case of large system size. Using tools provided by the theory of random matrices and free probability, we showed that the diamond norm of the difference between two random channels asymptotically tends to a constant specified in Theorem 7. In the simplest case x = y = 1, the limit value of the diamond norm of the difference is ∆(1, 1) = 1/2 + 2/π. In Fig. 2 we illustrate the convergence of the upper and lower bounds to this value. This statement allows us to quantify the mean distinguishability between two random channels.

To arrive at this result we considered an ensemble of normalized random density matrices, acting on a bipartite Hilbert space H_A ⊗ H_B, and distributed according to the flat (Hilbert-Schmidt) measure. Such matrices can be generated with the help of a complex Ginibre matrix G as ρ = GG^*/\mathrm{Tr}\, GG^*. In the simplest case of square matrices G of order d = d_1^2, the average trace distance of a random state ρ from the maximally mixed state ρ_* = I/d behaves asymptotically as \|ρ − ρ_*\|_1 → 3\sqrt{3}/(4π) [26]. However, analyzing both reduced matrices ρ_A = \mathrm{Tr}_B ρ and ρ_B = \mathrm{Tr}_A ρ, we can show that they become close to the maximally mixed state in the sense of the operator norm, so that their smallest and largest eigenvalues coincide asymptotically. This observation implies that the state ρ can be directly interpreted as a Jamiołkowski state J representing a stochastic map Φ, as its partial trace ρ_A is proportional to the identity. Furthermore, as it becomes asymptotically equal to the other partial trace ρ_B, it follows that a generic quantum channel (stochastic map) becomes unital and thus bistochastic. The partial trace of a random bipartite state is shown to be close to the identity provided the support of the limiting measure characterizing the bipartite state is bounded.

In particular, this holds for the family of subtracted Marčenko-Pastur distributions defined in Eq. (13) as a free additive convolution of two rescaled Marčenko-Pastur distributions with different parameters, which determines the density of a difference of two random density matrices. In this way we could establish the upper bound for the average diamond norm between two channels and show that it asymptotically converges to the lower bound ∆(x, y) given in Theorem 9. The results obtained can be understood as an application of the measure concentration paradigm [1] to the space of quantum channels.

Acknowledgments. I.N. would like to thank Anna Jenčová and David Reeb for very insightful discussions regarding the diamond norm, which led to several improvements and simplifications
of the proof of Proposition 1. I.N.'s research has been supported by the ANR projects RMTQIT ANR-12-IS01-0001-01 and StoQ ANR-14-CE25-0003-01, as well as by a von Humboldt fellowship.

Figure 2: The convergence of upper (green circles) and lower (blue triangles) bounds on the distance between two random quantum channels sampled from the Hilbert-Schmidt distribution (d_1 = d_2 = d). The results were obtained via Monte Carlo simulation with 100 samples for each data point.
A On the partial traces of unitarily invariant random matrices
In this section we show a general result about unitarily invariant random matrices: under some technical convergence assumptions, the partial trace of a unitarily invariant random matrix is "flat", i.e. it is close in norm to its average. Recall that the normalized trace functional can be extended to arbitrary permutations as follows: for a matrix X ∈ M_d(C) and a permutation π with cycles c, write

    \mathrm{tr}_π(X) := \prod_{c ∈ π} \frac{1}{d} \mathrm{Tr}(X^{|c|}).

Recall the following definition from [16, Section 4.3].

Definition 4 A sequence of random matrices X_d ∈ M_d(C) is said to have almost sure limit distribution µ if

    ∀ p ≥ 1, \quad a.s.\text{-}\lim_{d → ∞} \mathrm{tr}(X_d^p) = \int x^p \, dµ(x).
Proposition 16 Consider a sequence of hermitian random matrices A_d ∈ M_{d_1(d)}(C) ⊗ M_{d_2(d)}(C) and assume that:

1. Both functions d_{1,2}(d) grow to infinity, in such a way that d_1/d_2^2 → 0.
2. The matrices A_d are unitarily invariant.
3. The family (A_d) has almost sure limit distribution µ, for some compactly supported probability measure µ.

Then, the normalized partial traces B_d := d_2^{-1} [\mathrm{id} ⊗ \mathrm{Tr}](A_d) converge almost surely to a multiple of the identity matrix:

    a.s.\text{-}\lim_{d → ∞} \| B_d − a I_{d_1(d)} \| = 0,

where a is the average of µ:

    a := \int x \, dµ(x).
Proof. In the proof, we shall drop the parameter d → ∞, but the reader should remember that the matrix dimensions d_{1,2} are functions of d and that all the matrices appearing are indexed by d. To conclude, it is enough to show that

    a.s.\text{-}\lim_{d → ∞} λ_{max}(B_d) = a,

since the statement for the smallest eigenvalue follows in a similar manner. Let us denote by

    b := \frac{1}{d_1} \sum_{i=1}^{d_1} λ_i(B) = \mathrm{tr}_{(1)}(B)

    v := \frac{1}{d_1} \sum_{i=1}^{d_1} (λ_i(B) − b)^2 = \frac{1}{d_1} \sum_{i=1}^{d_1} λ_i(B)^2 − \left[ \frac{1}{d_1} \sum_{i=1}^{d_1} λ_i(B) \right]^2 = \mathrm{tr}_{(12)}(B) − [\mathrm{tr}_{(1)}(B)]^2

the average eigenvalue and, respectively, the variance of the eigenvalues of B; these are real random variables (actually, sequences of random variables indexed by d). By Chebyshev's inequality, we have

    λ_{max}(B) ≤ b + \sqrt{v} \sqrt{d_1}.

Note that one could replace the \sqrt{d_1} factor in the inequality above by \sqrt{d_1 − 1} by using Samuelson's inequality [31, 32], but the weaker version is enough for us. We shall prove now that b → a almost surely, and later that d_1 v → 0 almost surely, which is what we need to conclude. To do so, we shall use the Weingarten formula [11, 33]. In the graphical formalism for the Weingarten calculus introduced in [8], the expectation value of an expression involving a random Haar unitary matrix can be computed as a sum over diagrams indexed by permutation matrices; we refer the reader to [8] or [9] for the details. Using the unitary invariance of A, we write A = U \mathrm{diag}(λ) U^*, for a Haar-distributed random unitary matrix U ∈ U(d_1 d_2), and some (random) eigenvalue vector λ. Note that traces of powers of A depend only on λ, so we shall write \mathrm{tr}_π(λ) := \mathrm{tr}_π(A). We apply the Weingarten formula to a general moment of B, given by a permutation π ∈ S_p:

    \mathbb{E}_U \mathrm{tr}_π(B) = \mathbb{E}_U \prod_{i=1}^{\#π} \frac{1}{d_1} \mathrm{Tr}\, B^{|c_i|},
where c_1, …, c_{\#π} are the cycles of π ∈ S_p, and \mathbb{E}_U denotes the conditional expectation with respect to the Haar random unitary matrix U. From the graphical representation of the Weingarten formula [8, Theorem 4.1], we can compute the conditional expectation over U (note that below, the vector of eigenvalues λ is still random):

    \mathbb{E}_U \mathrm{tr}_π(B) = d_1^{-\#π} d_2^{-p} \sum_{α, β ∈ S_p} d_1^{\#(π^{-1}α)} d_2^{\#α} (d_1 d_2)^{\#β} \, \mathrm{tr}_β(λ) \, \mathrm{Wg}_{d_1 d_2}(α^{-1}β).    (19)

Above, Wg is the Weingarten function [11] and \mathrm{tr}_β(λ) is the moment of the diagonal matrix \mathrm{diag}(λ) corresponding to the permutation β. The combinatorial factors d_1^{\#(π^{-1}α)} and d_2^{\#α} come from the initial wirings of the boxes respective to the vector spaces of dimensions d_1 (initial wiring given by π) and d_2 (initial wiring given by the identity permutation), see Figure 3. The pre-factors d_1^{-\#π} d_2^{-p} contain the normalization from the (partial) traces. Finally, the (random) factors \mathrm{tr}_β(λ) are the normalized power sums of λ:

    \mathrm{tr}_β(λ) = \prod_{i=1}^{\#β} (d_1 d_2)^{-1} \sum_{j=1}^{d_1 d_2} λ_j^{|w_i|},

where w_1, …, w_{\#β} are the cycles of β. Recall that we have assumed almost sure convergence for the sequence (A_d) (and, thus, for (λ_d)):

    ∀ β, \quad a.s.\text{-}\lim_{d → ∞} \mathrm{tr}_β(λ) = \prod_{i=1}^{\#β} \int x^{|w_i|} \, dµ(x) =: m_β(µ).    (20)
Figure 3: The i-th group in the diagram corresponding to m_π(B).

As a first application of the Weingarten formula (19), let us find the distribution of the random variable b = \mathrm{Tr}(B)/d_1. Obviously,

    \mathbb{E}_U b = \mathbb{E}_U \mathrm{tr}_{(1)}(B) = d_1^{-1} d_2^{-1} \cdot d_1 d_2 (d_1 d_2) \, \mathrm{tr}_{(1)}(λ) \, \mathrm{Wg}_{d_1 d_2}(e) = \mathrm{tr}_{(1)}(λ).    (21)

Actually, b does not depend on the random unitary matrix U, since

    b = \frac{1}{d_1} \mathrm{Tr}(B) = \frac{1}{d_1 d_2} \mathrm{Tr}(A) = \frac{1}{d_1 d_2} \mathrm{Tr}(U \mathrm{diag}(λ) U^*) = \frac{1}{d_1 d_2} \sum_{i=1}^{d_1 d_2} λ_i = \mathrm{tr}_{(1)}(λ).
From the hypothesis (20) (with π = (1)), we have that, almost surely as d → ∞, the random variable b converges to the scalar a = m_{(1)}(µ).

Let us now move on to the variance v of the eigenvalues. First, we compute its expectation E_U v = E_U tr_{(12)}(B) − E_U tr_{(1)(2)}(B). We now apply the Weingarten formula (19) to E_U tr_{(12)}(B); the sum has 2!² = 4 terms, which we compute below:
• α = β = (1)(2): T1 = d1²d2² tr_{(1)(2)}(λ)/(d1²d2² − 1);
• α = (1)(2), β = (12): T2 = −tr_{(12)}(λ)/(d1²d2² − 1);
• α = (12), β = (1)(2): T3 = −d1² tr_{(1)(2)}(λ)/(d1²d2² − 1);
• α = β = (12): T4 = d1² tr_{(12)}(λ)/(d1²d2² − 1).
Combining the expressions above with (21), we get
\[
\mathbb{E}_U v = \frac{d_1^2-1}{d_1^2 d_2^2-1}\left(\operatorname{tr}_{(12)}(\lambda) - \operatorname{tr}_{(1)(2)}(\lambda)\right).
\]
Using the hypothesis (20), we have thus, as d_{1,2} → ∞,
\[
\mathbb{E} v = (1+o(1))\, d_2^{-2}\left(m_{(12)}(\mu) - m_{(1)(2)}(\mu)\right).
\]
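The expression for E_U v above is exact in finite dimensions, so it lends itself to a direct Monte Carlo check; the standalone sketch below (our own illustration, with d1 = d2 = 2 and a fixed spectrum λ, both arbitrary choices) samples Haar unitaries and compares the empirical mean of v with the formula:

```python
import numpy as np

# Monte Carlo check of the exact finite-dimensional formula
#   E_U v = (d1^2 - 1)/(d1^2 d2^2 - 1) * (tr_(12)(lambda) - tr_(1)(2)(lambda)).
rng = np.random.default_rng(3)
d1, d2 = 2, 2
n = d1 * d2
lam = np.array([1.0, 2.0, 3.0, 4.0])

def haar_unitary(n, rng):
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

vals = []
for _ in range(20_000):
    U = haar_unitary(n, rng)
    A = (U * lam) @ U.conj().T                 # A = U diag(lambda) U*
    B = np.trace(A.reshape(d1, d2, d1, d2), axis1=1, axis2=3) / d2
    vals.append(np.trace(B @ B).real / d1 - (np.trace(B).real / d1) ** 2)

tr_12 = np.mean(lam**2)                        # tr_(12)(lambda)
tr_1_2 = np.mean(lam) ** 2                     # tr_(1)(2)(lambda)
predicted = (d1**2 - 1) / (d1**2 * d2**2 - 1) * (tr_12 - tr_1_2)
empirical = np.mean(vals)
```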
Let us now proceed and estimate the variance of v; more precisely, let us compute E(v²). As before, we shall compute the expectation in two steps: first with respect to the random Haar unitary matrix U, and then, using our assumption (20), with respect to λ, in the asymptotic limit. To perform the unitary integration, note that the Weingarten sum is indexed by a couple (α, β) ∈ S4², so it contains 4!² = 576 terms, see [35]. In Appendix B we compute the variance of v using symmetry arguments. To first order, the result reads
\[
\mathbb{E}_U(v^2) - (\mathbb{E}_U v)^2 = (1+o(1))\, 2 d_1^{-2} d_2^{-4}\left[m_{(12)}(\lambda) - m_{(1)(2)}(\lambda)\right]^2.
\]
Taking the expectation over λ and the limit (we are allowed to, by dominated convergence), we get
\[
\operatorname{Var}(v) = (1+o(1))\, 2 d_1^{-2} d_2^{-4}\left[m_{(12)}(\mu) - m_{(1)(2)}(\mu)\right]^2.
\]
We now put all the ingredients together:
\[
\mathbb{P}\left(\sqrt{d_1 v} \geq \varepsilon\right) = \mathbb{P}\left(v \geq \varepsilon^2 d_1^{-1}\right) \leq \frac{\operatorname{Var}(v)}{\left[\varepsilon^2 d_1^{-1} - \mathbb{E}v\right]^2} \sim \frac{C d_1^{-2} d_2^{-4}}{\left[\varepsilon^2 d_1^{-1} - (1+o(1))\, C' d_2^{-2}\right]^2},
\]
where C, C' are non-negative constants depending on the limiting measure µ. Using d1 ≪ d2², the dominating term in the denominator above is ε²d1^{-1}, and thus we have
\[
\mathbb{P}\left(\sqrt{d_1 v} \geq \varepsilon\right) \lesssim C \varepsilon^{-4} d_2^{-4}.
\]
Since the series Σ d2^{-4} is summable, we obtain the announced almost sure convergence by the Borel–Cantelli lemma, finishing the proof.
B Calculation of the variance
We recall that A ∈ Md1(C) ⊗ Md2(C) and B = d2^{-1}[id ⊗ Tr](A). Because we assume that A has a unitarily invariant distribution, we can write
\[
A = U\operatorname{diag}(\lambda)U^\dagger = \sum_{i=0}^{d_1 d_2-1}\lambda_i\,|U_i\rangle\langle U_i|, \tag{22}
\]
where |U_i⟩ = U|i⟩ is the i-th column of the matrix U, and
\[
B = d_2^{-1}\operatorname{Tr}_2 A = d_2^{-1}\sum_{i=0}^{d_1 d_2-1}\lambda_i\operatorname{Tr}_2|U_i\rangle\langle U_i|. \tag{23}
\]
We denote ρ_i = Tr_2 |U_i⟩⟨U_i| and consider the mixed moments computed in Appendix C,
\[
\mathrm{M}(i,j,k,l) = \mathbb{E}\operatorname{Tr}(\rho_i\rho_j)\operatorname{Tr}(\rho_k\rho_l), \tag{24}
\]
and the symmetric mixed moments
\[
\mathrm{SM}(i,j,k,l) = \mathbb{E}\operatorname{Tr}(\rho_i\rho_j)\operatorname{Tr}(\rho_k\rho_l) - \mathbb{E}\operatorname{Tr}(\rho_i\rho_j)\,\mathbb{E}\operatorname{Tr}(\rho_k\rho_l). \tag{25}
\]
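Since the columns |U_i⟩ form an orthonormal basis, the reduced matrices satisfy the resolution Σ_i ρ_i = d2 1_{d1}, a fact used repeatedly in the computations of Appendix C. A quick numerical confirmation (our own standalone sketch, dimensions arbitrary):

```python
import numpy as np

# Columns of a unitary form an orthonormal basis, so sum_i |U_i><U_i| = 1
# and, taking the partial trace over the second factor,
# sum_i rho_i = d2 * 1_{d1}.
rng = np.random.default_rng(4)
d1, d2 = 2, 3
n = d1 * d2

z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
U, r = np.linalg.qr(z)
U = U * (np.diagonal(r) / np.abs(np.diagonal(r)))

def ptrace2(X, d1, d2):
    # partial trace over the second tensor factor
    return np.trace(X.reshape(d1, d2, d1, d2), axis1=1, axis2=3)

rhos = [ptrace2(np.outer(U[:, i], U[:, i].conj()), d1, d2) for i in range(n)]
err = np.abs(sum(rhos) - d2 * np.eye(d1)).max()
```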
Proposition 17 Let v = tr(B²) − (tr(B))². We have
\[
\operatorname{Var}(v) = \frac{2(\mu_1^2-\mu_2)^2}{d_1^2 d_2^4}\,(1+o(1)) \tag{26}
\]
as d1, d2 → ∞, where in the above $\mu_k = \frac{1}{d_1 d_2}\sum_i \lambda_i^k$.
Direct computations with the usage of the symmetric moments SM give us
\[
\begin{aligned}
\operatorname{Var}(v) ={}& \frac{1}{d_1^2 d_2^4}\Big[(d_1 d_2)^4\mu_1^4\,\mathrm{SM}(0,1,2,3) \\
&+ 2(d_1 d_2)^3\mu_2\mu_1^2\left(\mathrm{SM}(0,0,1,2) + 2\,\mathrm{SM}(0,1,0,2) - 3\,\mathrm{SM}(0,1,2,3)\right) \\
&+ 4(d_1 d_2)^2\mu_3\mu_1\left(\mathrm{SM}(0,0,0,1) - \mathrm{SM}(0,0,1,2) - 2\,\mathrm{SM}(0,1,0,2) + 2\,\mathrm{SM}(0,1,2,3)\right) \\
&+ (d_1 d_2)^2\mu_2^2\left(\mathrm{SM}(0,0,1,1) - 2\,\mathrm{SM}(0,0,1,2) + 2\,\mathrm{SM}(0,1,0,1) - 4\,\mathrm{SM}(0,1,0,2) + 3\,\mathrm{SM}(0,1,2,3)\right) \\
&+ d_1 d_2\mu_4\left(\mathrm{SM}(0,0,0,0) - 4\,\mathrm{SM}(0,0,0,1) - \mathrm{SM}(0,0,1,1) + 4\,\mathrm{SM}(0,0,1,2) - 2\,\mathrm{SM}(0,1,0,1) + 8\,\mathrm{SM}(0,1,0,2) - 6\,\mathrm{SM}(0,1,2,3)\right)\Big] \\
={}& \frac{2(d_1^2-1)(d_2^2-1)\left[d_1^4 d_2^4(\mu_1^2-\mu_2)^2 + d_1^2 d_2^2\left(11\mu_1^4 - 22\mu_2\mu_1^2 + 20\mu_3\mu_1 - 4\mu_2^2 - 5\mu_4\right) + 5\left(3\mu_2^2 - 4\mu_1\mu_3 + \mu_4\right)\right]}{d_2^2(d_1^2 d_2^2-1)^2(d_1^4 d_2^4 - 13 d_1^2 d_2^2 + 36)} \\
={}& \frac{2(\mu_1^2-\mu_2)^2}{d_1^2 d_2^4}\,(1+o(1)).
\end{aligned}
\tag{27}
\]
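The two displayed expressions can be cross-checked exactly: Var(v) = (d1 d2²)^{-2} Σ_{ijkl} λ_i λ_j λ_k λ_l SM(i,j,k,l), and all required mixed moments are available in Lemma 18 below. The standalone sketch (our own verification, rational arithmetic, values hard-coded at d1 = d2 = 2 and an arbitrary spectrum) confirms the second equality of (27) at these dimensions:

```python
from fractions import Fraction as F

# Exact cross-check of (27) at d1 = d2 = 2: compute
#   Var(v) = (d1 d2^2)^{-2} sum_{ijkl} l_i l_j l_k l_l SM(i,j,k,l)
# from the mixed moments of Lemma 18 (hard-coded at these dimensions)
# and compare with the closed rational expression above.
d1 = d2 = 2
n = d1 * d2
lam = [F(1), F(2), F(3), F(4)]               # an arbitrary rational spectrum

M = {(0, 0, 0, 0): F(23, 35), (0, 0, 0, 1): F(11, 35),
     (0, 0, 1, 1): F(9, 14), (0, 0, 1, 2): F(9, 28),
     (0, 1, 0, 1): F(27, 140), (0, 1, 0, 2): F(41, 280),
     (0, 1, 2, 3): F(13, 70)}                # Lemma 18 at d1 = d2 = 2
E2 = F(d1 + d2, n + 1)                       # E Tr rho_i^2
E01 = F(d1 * (d2**2 - 1), n**2 - 1)          # E Tr(rho_i rho_j), i != j

REPS = {(0,0,0,0): (0,0,0,0),
        (0,0,0,1): (0,0,0,1), (0,0,1,0): (0,0,0,1),
        (0,1,0,0): (0,0,0,1), (0,1,1,1): (0,0,0,1),
        (0,0,1,1): (0,0,1,1), (0,1,0,1): (0,1,0,1), (0,1,1,0): (0,1,0,1),
        (0,0,1,2): (0,0,1,2), (0,1,2,2): (0,0,1,2),
        (0,1,0,2): (0,1,0,2), (0,1,2,0): (0,1,0,2),
        (0,1,1,2): (0,1,0,2), (0,1,2,1): (0,1,0,2),
        (0,1,2,3): (0,1,2,3)}                # coincidence patterns -> representative

def SM(i, j, k, l):
    seen, pat = {}, []
    for x in (i, j, k, l):
        seen.setdefault(x, len(seen))
        pat.append(seen[x])
    ea = E2 if i == j else E01
    eb = E2 if k == l else E01
    return M[REPS[tuple(pat)]] - ea * eb

var_direct = sum(lam[i] * lam[j] * lam[k] * lam[l] * SM(i, j, k, l)
                 for i in range(n) for j in range(n)
                 for k in range(n) for l in range(n)) / (d1 * d2**2)**2

mu = [sum(x**k for x in lam) / n for k in range(5)]
num = 2 * (d1**2 - 1) * (d2**2 - 1) * (
    d1**4 * d2**4 * (mu[1]**2 - mu[2])**2
    + d1**2 * d2**2 * (11*mu[1]**4 - 22*mu[2]*mu[1]**2 + 20*mu[3]*mu[1]
                       - 4*mu[2]**2 - 5*mu[4])
    + 5 * (3*mu[2]**2 - 4*mu[1]*mu[3] + mu[4]))
den = d2**2 * (d1**2 * d2**2 - 1)**2 * (d1**4 * d2**4 - 13 * d1**2 * d2**2 + 36)
var_closed = num / den
```

For the spectrum (1, 2, 3, 4) both sides evaluate to the same rational number.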
C Mixed moments calculation
Lemma 18 We have the following formulas for the mixed moments, which cover all possible cases (because of the symmetry):
\[
\mathrm{M}(0,0,0,0) = \frac{d_2 d_1^3 + 2(d_2^2+2)d_1^2 + d_2(d_2^2+10)d_1 + 4d_2^2 + 2}{(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{28}
\]
\[
\mathrm{M}(0,0,0,1) = \frac{(d_2^2-1)\left(d_1(d_1+d_2)(d_1 d_2+4)+2\right)}{(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{29}
\]
\[
\mathrm{M}(0,0,1,1) = \frac{\left(d_1 d_2(d_1 d_2+2)-4\right)(d_1+d_2)^2 + 4}{d_1 d_2(d_1 d_2-1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{30}
\]
\[
\mathrm{M}(0,0,1,2) = \frac{(d_2^2-1)\left(d_1(d_1+d_2)\left(d_1 d_2(d_1 d_2+4)+2\right)-2\right)}{d_1 d_2(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{31}
\]
\[
\mathrm{M}(0,1,0,1) = \frac{(d_2^2-1)\left(d_1\left(6d_2 + d_1(d_2^2(d_1 d_2+5)-2)\right)+2\right)}{d_1 d_2(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{32}
\]
\[
\mathrm{M}(0,1,0,2) = \frac{(d_2^2-1)\left(d_1^4 d_2^2(d_2^2-1) + d_1^3 d_2(3d_2^2-4) - d_1^2(3d_2^2-2) - 2(4d_1 d_2+1)\right)}{d_1 d_2(d_1 d_2-2)(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}, \tag{33}
\]
\[
\mathrm{M}(0,1,2,3) = \frac{(d_2^2-1)\left(d_2^2(d_2^2-1)d_1^4 + 2(7-6d_2^2)d_1^2 + 22\right)}{(d_1^2 d_2^2-1)(d_1^2 d_2^2-4)(d_1^2 d_2^2-9)}. \tag{34}
\]
First we note
\[
\mathrm{M}(0,0,0,0) = \mathbb{E}\left(\operatorname{Tr}\rho_0^2\right)^2 = \frac{d_2 d_1^3 + 2(d_2^2+2)d_1^2 + d_2(d_2^2+10)d_1 + 4d_2^2 + 2}{(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}. \tag{35}
\]
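The closed forms of Lemma 18 are easy to probe by Monte Carlo sampling; the standalone sketch below (our own check, at d1 = d2 = 2, where (28) and (29) evaluate to 23/35 and 11/35) estimates the first two moments:

```python
import numpy as np

# Monte Carlo check of (28) and (29) at d1 = d2 = 2:
# the closed forms evaluate there to 23/35 and 11/35.
rng = np.random.default_rng(5)
d1 = d2 = 2
n = d1 * d2
N = 20_000

def haar_unitary(n, rng):
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

def ptrace2(X, d1, d2):
    return np.trace(X.reshape(d1, d2, d1, d2), axis1=1, axis2=3)

m0000 = m0001 = 0.0
for _ in range(N):
    U = haar_unitary(n, rng)
    r0 = ptrace2(np.outer(U[:, 0], U[:, 0].conj()), d1, d2)
    r1 = ptrace2(np.outer(U[:, 1], U[:, 1].conj()), d1, d2)
    p0 = np.trace(r0 @ r0).real
    m0000 += p0 * p0
    m0001 += p0 * np.trace(r0 @ r1).real
m0000 /= N
m0001 /= N
```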
Next we consider the case M(0,0,0,1):
\[
\begin{aligned}
\mathrm{M}(0,0,0,1) = \mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}(\rho_0\rho_1) &= \frac{1}{d_1 d_2-1}\,\mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}\left(\rho_0(d_2\mathbb{1}_{d_1}-\rho_0)\right) \\
&= \frac{1}{d_1 d_2-1}\left(d_2\,\mathbb{E}\operatorname{Tr}\rho_0^2 - \mathbb{E}(\operatorname{Tr}\rho_0^2)^2\right) \\
&= \frac{1}{d_1 d_2-1}\left(d_2\,\frac{d_1+d_2}{d_1 d_2+1} - \mathrm{M}(0,0,0,0)\right) \\
&= \frac{(d_2^2-1)\left(d_1(d_1+d_2)(d_1 d_2+4)+2\right)}{(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}.
\end{aligned}
\tag{36}
\]
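The computation (36) can be replayed in exact rational arithmetic; the standalone sketch below (standard-library code, our own verification) checks that the recursion reproduces the closed form (29) for several dimensions:

```python
from fractions import Fraction as F

# Exact replay of the recursion (36) versus the closed form (29).
def M0000(d1, d2):  # closed form (28)
    n = d1 * d2
    num = d2*d1**3 + 2*(d2**2 + 2)*d1**2 + d2*(d2**2 + 10)*d1 + 4*d2**2 + 2
    return F(num, (n + 1) * (n + 2) * (n + 3))

def M0001_closed(d1, d2):  # closed form (29)
    n = d1 * d2
    num = (d2**2 - 1) * (d1 * (d1 + d2) * (n + 4) + 2)
    return F(num, (n - 1) * (n + 1) * (n + 2) * (n + 3))

def M0001_recursion(d1, d2):  # the computation (36)
    n = d1 * d2
    e_tr_rho_sq = F(d1 + d2, n + 1)          # E Tr rho_0^2
    return (d2 * e_tr_rho_sq - M0000(d1, d2)) / (n - 1)
```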
In order to get the other mixed moments we need to perform another integration.
C.1 Inner integral
Here we will consider expectations of the following kind:
\[
\mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}\rho_1^2 = \mathbb{E}\operatorname{Tr}\left(\operatorname{Tr}_1|U_0\rangle\langle U_0|\right)^2\operatorname{Tr}\left(\operatorname{Tr}_1|U_1\rangle\langle U_1|\right)^2; \tag{37}
\]
note that if we multiply the matrix U by a unitary matrix which does not change the first column, the expectation is unchanged; in fact, we can integrate over the subgroup of matrices which do not change the first column of U. Now, for a moment, we fix the matrix U and consider
\[
\mathbb{E}_V\operatorname{Tr}\left(\operatorname{Tr}_1 UV|1\rangle\langle 1|V^\dagger U^\dagger\right)^2, \tag{38}
\]
where the matrices V are of the form
\[
V = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & v_{1,1} & v_{1,2} & \cdots & v_{1,d_1 d_2-1} \\
0 & v_{2,1} & v_{2,2} & \cdots & v_{2,d_1 d_2-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & v_{d_1 d_2-1,1} & v_{d_1 d_2-1,2} & \cdots & v_{d_1 d_2-1,d_1 d_2-1}
\end{pmatrix}. \tag{39}
\]
Here E_V denotes the expectation with respect to the Haar measure on U(d1d2 − 1), embedded in U(d1d2) in the above way. Note that the vector UV|1⟩ represents a random vector orthogonal to |U0⟩ = U|0⟩. First we calculate
\[
\begin{aligned}
\mathbb{E}_V\left(UV|1\rangle\langle 1|V^\dagger U^\dagger\right)\otimes\left(UV|1\rangle\langle 1|V^\dagger U^\dagger\right) &= \mathbb{E}_V\left(U|V_1\rangle\langle V_1|U^\dagger\right)\otimes\left(U|V_1\rangle\langle V_1|U^\dagger\right) \\
&= (U\otimes U)\,\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1|\,(U\otimes U)^\dagger. \tag{40}
\end{aligned}
\]
Now, using standard integrals, we obtain
\[
\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1| = \frac{1}{(d_1 d_2-1)d_1 d_2}\sum_{i_1 i_2 j_1 j_2}\left(\delta_{i_1 j_1}\delta_{i_2 j_2} + \delta_{i_1 j_2}\delta_{i_2 j_1}\right)\theta(i_1 i_2 j_1 j_2)\,|i_1 i_2\rangle\langle j_1 j_2|, \tag{41}
\]
where θ(x) = 1 − δ_{x,0} incorporates the condition that the first element of the vector |V1⟩ is zero. Now we obtain, after elementary calculations, using the fact that U is unitary,
\[
\begin{aligned}
&(U\otimes U)\,\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1|\,(U\otimes U)^\dagger \\
&\quad= \frac{1}{(d_1 d_2-1)d_1 d_2}\sum_{i_1 i_2 j_1 j_2}\Big[\left(\delta_{i_1 j_1} - u_{i_1,0}\bar u_{j_1,0}\right)\left(\delta_{i_2 j_2} - u_{i_2,0}\bar u_{j_2,0}\right) \\
&\qquad\qquad + \left(\delta_{i_1 j_2} - u_{i_1,0}\bar u_{j_2,0}\right)\left(\delta_{i_2 j_1} - u_{i_2,0}\bar u_{j_1,0}\right)\Big]\theta(i_1 i_2 j_1 j_2)\,|i_1 i_2\rangle\langle j_1 j_2| \\
&\quad= \frac{1}{(d_1 d_2-1)d_1 d_2}\Big[\mathbb{1}\otimes\mathbb{1} + |U_0\rangle\langle U_0|\otimes|U_0\rangle\langle U_0| - \mathbb{1}\otimes|U_0\rangle\langle U_0| - |U_0\rangle\langle U_0|\otimes\mathbb{1} \\
&\qquad\qquad + S_{d_1 d_2,d_1 d_2}\left(\mathbb{1}\otimes\mathbb{1} + |U_0\rangle\langle U_0|\otimes|U_0\rangle\langle U_0| - \mathbb{1}\otimes|U_0\rangle\langle U_0| - |U_0\rangle\langle U_0|\otimes\mathbb{1}\right)\Big], \tag{42}
\end{aligned}
\]
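The integral (41) is a second-moment formula for a Haar-random unit vector supported on d1d2 − 1 coordinates, and it can be checked numerically by averaging |V1⟩⟨V1| ⊗ |V1⟩⟨V1| over samples; a standalone sketch (our own illustration, with d1d2 = 4):

```python
import numpy as np

# Check of (41): for a Haar-random unit vector |V1> supported on the
# last D-1 coordinates of C^D,
#   E |V1><V1| (x) |V1><V1| = (P (x) P + S (P (x) P)) / ((D - 1) D),
# with P the projector onto those coordinates and S the swap.
rng = np.random.default_rng(6)
D, N = 4, 100_000

w = rng.standard_normal((N, D - 1)) + 1j * rng.standard_normal((N, D - 1))
w /= np.linalg.norm(w, axis=1, keepdims=True)        # uniform on the unit sphere
v = np.concatenate([np.zeros((N, 1)), w], axis=1)    # first coordinate zero

# empirical fourth-moment tensor, reordered as a matrix on C^D (x) C^D
T = np.einsum('na,nb,nc,nd->abcd', v, v.conj(), v, v.conj()) / N
est = T.transpose(0, 2, 1, 3).reshape(D * D, D * D)

P = np.diag([0.0] + [1.0] * (D - 1))
PP = np.kron(P, P)
S = np.zeros((D * D, D * D))
for a in range(D):
    for b in range(D):
        S[a * D + b, b * D + a] = 1.0
exact = (PP + S @ PP) / ((D - 1) * D)
err = np.abs(est - exact).max()
```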
where S_{d1d2,d1d2} is the swap operation on two systems of dimension d1d2, i.e. S = Σ |i1 i2⟩⟨i2 i1|. So we get
\[
(U\otimes U)\,\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1|\,(U\otimes U)^\dagger = \frac{\left(\mathbb{1}\otimes\mathbb{1} + S_{d_1 d_2,d_1 d_2}\right)\left(\mathbb{1}\otimes\mathbb{1} + |U_0\rangle\langle U_0|\otimes|U_0\rangle\langle U_0| - \mathbb{1}\otimes|U_0\rangle\langle U_0| - |U_0\rangle\langle U_0|\otimes\mathbb{1}\right)}{(d_1 d_2-1)d_1 d_2}. \tag{43}
\]
Next we consider
\[
\begin{aligned}
\mathbb{E}_V\operatorname{Tr}\left(\operatorname{Tr}_2 U|V_1\rangle\langle V_1|U^\dagger\right)^2 &= \mathbb{E}_V\operatorname{Tr} S_{d_1,d_1}\left(\operatorname{Tr}_2 U|V_1\rangle\langle V_1|U^\dagger\right)\otimes\left(\operatorname{Tr}_2 U|V_1\rangle\langle V_1|U^\dagger\right) \\
&= \mathbb{E}_V\operatorname{Tr} S_{d_1,d_1}\operatorname{Tr}_{2,4}\left(U|V_1\rangle\langle V_1|U^\dagger\otimes U|V_1\rangle\langle V_1|U^\dagger\right) \\
&= \operatorname{Tr} S_{d_1,d_1}\operatorname{Tr}_{2,4}(U\otimes U)\,\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1|\,(U\otimes U)^\dagger \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\operatorname{Tr} S_{d_1,d_1}\operatorname{Tr}_{2,4}\left(\mathbb{1}\otimes\mathbb{1} + S_{d_1 d_2,d_1 d_2}\right)\big(\mathbb{1}\otimes\mathbb{1} + |U_0\rangle\langle U_0|\otimes|U_0\rangle\langle U_0| \\
&\qquad\qquad - \mathbb{1}\otimes|U_0\rangle\langle U_0| - |U_0\rangle\langle U_0|\otimes\mathbb{1}\big). \tag{44}
\end{aligned}
\]
So we have obtained
\[
\begin{aligned}
\mathbb{E}_V\operatorname{Tr}\left(\operatorname{Tr}_1 U|V_1\rangle\langle V_1|U^\dagger\right)^2 &= \frac{1}{(d_1 d_2-1)d_1 d_2}\left(d_1 d_2^2 + \operatorname{Tr}\rho_0^2 - d_2\operatorname{Tr}\rho_0 - d_2\operatorname{Tr}\rho_0 + d_1^2 d_2 + \operatorname{Tr}(\rho_0')^2 - d_1\operatorname{Tr}\rho_0' - d_1\operatorname{Tr}\rho_0'\right) \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\left(d_1 d_2^2 + d_1^2 d_2 - 2d_1 - 2d_2 + 2\operatorname{Tr}\rho_0^2\right); \tag{45}
\end{aligned}
\]
in the above formulas we used ρ0′ = Tr1 |U0⟩⟨U0| and the fact that Trρ0² = Tr(ρ0′)². Using the above we write
\[
\begin{aligned}
\mathrm{M}(0,0,1,1) = \mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}\rho_1^2 &= \mathbb{E}\left[\operatorname{Tr}\rho_0^2\,\mathbb{E}_V\operatorname{Tr}\left(\operatorname{Tr}_1 U|V_1\rangle\langle V_1|U^\dagger\right)^2\right] \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\left((d_1 d_2^2 + d_1^2 d_2 - 2d_1 - 2d_2)\,\mathbb{E}\operatorname{Tr}\rho_0^2 + 2\,\mathbb{E}(\operatorname{Tr}\rho_0^2)^2\right) \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\left((d_1 d_2^2 + d_1^2 d_2 - 2d_1 - 2d_2)\,\frac{d_1+d_2}{d_1 d_2+1} + 2\,\mathrm{M}(0,0,0,0)\right) \\
&= \frac{\left(d_1 d_2(d_1 d_2+2)-4\right)(d_1+d_2)^2 + 4}{d_1 d_2(d_1 d_2-1)(d_1 d_2+2)(d_1 d_2+3)}. \tag{46}
\end{aligned}
\]
Using the inner integral we can also calculate M(0,1,0,1). Here we write
\[
\begin{aligned}
\mathrm{M}(0,1,0,1) &= \mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}(\rho_0\rho_1) = \mathbb{E}\operatorname{Tr}(\rho_0\otimes\rho_0)(\rho_1\otimes\rho_1) \\
&= \mathbb{E}\operatorname{Tr}(\rho_0\otimes\rho_0)\operatorname{Tr}_{2,4}(U\otimes U)\,\mathbb{E}_V|V_1\rangle\langle V_1|\otimes|V_1\rangle\langle V_1|\,(U\otimes U)^\dagger \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\,\mathbb{E}\Big[\operatorname{Tr}(\rho_0\otimes\rho_0)\left(d_2^2\,\mathbb{1}_{d_1}\otimes\mathbb{1}_{d_1} + \rho_0\otimes\rho_0 - d_2\,\mathbb{1}_{d_1}\otimes\rho_0 - d_2\,\rho_0\otimes\mathbb{1}_{d_1}\right) \\
&\qquad + \operatorname{Tr} S_{d_1 d_2,d_1 d_2}\left(\mathbb{1}\otimes\mathbb{1} + |U_0\rangle\langle U_0|\otimes|U_0\rangle\langle U_0| - \mathbb{1}\otimes|U_0\rangle\langle U_0| - |U_0\rangle\langle U_0|\otimes\mathbb{1}\right)\left(\rho_0\otimes\mathbb{1}_{d_2}\otimes\rho_0\otimes\mathbb{1}_{d_2}\right)\Big] \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\,\mathbb{E}\left(d_2^2 - 2d_2\operatorname{Tr}\rho_0^2 + (\operatorname{Tr}\rho_0^2)^2 + d_2\operatorname{Tr}\rho_0^2 + (\operatorname{Tr}\rho_0^2)^2 - 2\operatorname{Tr}\rho_0^3\right) \\
&= \frac{1}{(d_1 d_2-1)d_1 d_2}\left(d_2^2 - d_2\,\frac{d_1+d_2}{d_1 d_2+1} + 2\,\mathrm{M}(0,0,0,0) - 2\,\mathbb{E}\operatorname{Tr}\rho_0^3\right) \\
&= \frac{(d_2^2-1)\left(d_1\left(6d_2 + d_1(d_2^2(d_1 d_2+5)-2)\right)+2\right)}{d_1 d_2(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}. \tag{47}
\end{aligned}
\]
This is because
\[
\mathbb{E}\operatorname{Tr}\rho_0^3 = \frac{(d_1+d_2)^2 + d_1 d_2 + 1}{(d_1 d_2+1)(d_1 d_2+2)},
\]
see [34]. Using the above results we obtain the other moments.
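As with (36), the derivation (47) can be verified in exact arithmetic against the closed form (32); a standalone sketch (our own check), using the value of E Trρ0³ quoted above:

```python
from fractions import Fraction as F

# Exact check of the derivation (47) against the closed form (32),
# using E Tr rho_0^3 from [34].
def M0000(d1, d2):  # closed form (28)
    n = d1 * d2
    num = d2*d1**3 + 2*(d2**2 + 2)*d1**2 + d2*(d2**2 + 10)*d1 + 4*d2**2 + 2
    return F(num, (n + 1) * (n + 2) * (n + 3))

def M0101_closed(d1, d2):  # closed form (32)
    n = d1 * d2
    num = (d2**2 - 1) * (d1 * (6*d2 + d1 * (d2**2 * (n + 5) - 2)) + 2)
    return F(num, n * (n - 1) * (n + 1) * (n + 2) * (n + 3))

def M0101_derived(d1, d2):  # the computation (47)
    n = d1 * d2
    e2 = F(d1 + d2, n + 1)                            # E Tr rho_0^2
    e3 = F((d1 + d2)**2 + n + 1, (n + 1) * (n + 2))   # E Tr rho_0^3, see [34]
    return (d2**2 - d2 * e2 + 2 * M0000(d1, d2) - 2 * e3) / ((n - 1) * n)
```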
\[
\begin{aligned}
\mathrm{M}(0,0,1,2) = \mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}(\rho_1\rho_2) &= \frac{1}{d_1 d_2-2}\,\mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}\left(\rho_1(d_2\mathbb{1}_{d_1}-\rho_0-\rho_1)\right) \\
&= \frac{1}{d_1 d_2-2}\left(d_2\,\mathbb{E}\operatorname{Tr}\rho_0^2 - \mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}(\rho_1\rho_0) - \mathbb{E}\operatorname{Tr}\rho_0^2\operatorname{Tr}\rho_1^2\right) \\
&= \frac{1}{d_1 d_2-2}\left(d_2\,\frac{d_1+d_2}{d_1 d_2+1} - \mathrm{M}(0,0,0,1) - \mathrm{M}(0,0,1,1)\right) \\
&= \frac{(d_2^2-1)\left(d_1(d_1+d_2)\left(d_1 d_2(d_1 d_2+4)+2\right)-2\right)}{d_1 d_2(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}. \tag{48}
\end{aligned}
\]
Next we consider the mixed moment of type (0, 1, 0, 2):
\[
\begin{aligned}
\mathrm{M}(0,1,0,2) = \mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}(\rho_0\rho_2) &= \frac{1}{d_1 d_2-2}\,\mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}\left(\rho_0(d_2\mathbb{1}_{d_1}-\rho_0-\rho_1)\right) \\
&= \frac{1}{d_1 d_2-2}\left(d_2\,\mathbb{E}\operatorname{Tr}(\rho_0\rho_1) - \mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}\rho_0^2 - \mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}(\rho_0\rho_1)\right) \\
&= \frac{(d_2^2-1)\left(d_1^4 d_2^2(d_2^2-1) + d_1^3 d_2(3d_2^2-4) - d_1^2(3d_2^2-2) - 2(4d_1 d_2+1)\right)}{d_1 d_2(d_1 d_2-2)(d_1 d_2-1)(d_1 d_2+1)(d_1 d_2+2)(d_1 d_2+3)}. \tag{49}
\end{aligned}
\]
Now the last case, with all indices different:
\[
\begin{aligned}
\mathrm{M}(0,1,2,3) = \mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}(\rho_2\rho_3) &= \frac{1}{d_1 d_2-3}\,\mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\operatorname{Tr}\left(\rho_2(d_2\mathbb{1}_{d_1}-\rho_0-\rho_1-\rho_2)\right) \\
&= \frac{1}{d_1 d_2-3}\,\mathbb{E}\operatorname{Tr}(\rho_0\rho_1)\left(d_2 - \operatorname{Tr}(\rho_2\rho_0) - \operatorname{Tr}(\rho_2\rho_1) - \operatorname{Tr}\rho_2^2\right) \\
&= \frac{(d_2^2-1)\left(d_2^2(d_2^2-1)d_1^4 + 2(7-6d_2^2)d_1^2 + 22\right)}{(d_1^2 d_2^2-1)(d_1^2 d_2^2-4)(d_1^2 d_2^2-9)}. \tag{50}
\end{aligned}
\]
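Finally, the whole chain can be closed in exact arithmetic: the standalone sketch below (our own verification) rebuilds M(0,1,2,3) from the recursion in (50), which reduces to (d1d2 − 3) M(0,1,2,3) = d2 E Tr(ρ0ρ1) − 2M(0,1,0,2) − M(0,0,1,2), using the closed forms (31) and (33), and compares with (34) for several dimensions:

```python
from fractions import Fraction as F

# Exact check of (50): rebuild M(0,1,2,3) from
#   (d1 d2 - 3) M(0,1,2,3) = d2 E Tr(rho0 rho1) - 2 M(0,1,0,2) - M(0,0,1,2)
# and compare with the closed form (34).
def M0012(d1, d2):  # closed form (31)
    n = d1 * d2
    num = (d2**2 - 1) * (d1 * (d1 + d2) * (n * (n + 4) + 2) - 2)
    return F(num, n * (n - 1) * (n + 1) * (n + 2) * (n + 3))

def M0102(d1, d2):  # closed form (33)
    n = d1 * d2
    num = (d2**2 - 1) * (d1**4 * d2**2 * (d2**2 - 1) + d1**3 * d2 * (3*d2**2 - 4)
                         - d1**2 * (3*d2**2 - 2) - 2 * (4*n + 1))
    return F(num, n * (n - 2) * (n - 1) * (n + 1) * (n + 2) * (n + 3))

def M0123_closed(d1, d2):  # closed form (34)
    n2 = (d1 * d2)**2
    num = (d2**2 - 1) * (d2**2 * (d2**2 - 1) * d1**4
                         + 2 * (7 - 6 * d2**2) * d1**2 + 22)
    return F(num, (n2 - 1) * (n2 - 4) * (n2 - 9))

def M0123_recursion(d1, d2):
    n = d1 * d2
    e01 = F(d1 * (d2**2 - 1), n**2 - 1)     # E Tr(rho0 rho1)
    return (d2 * e01 - 2 * M0102(d1, d2) - M0012(d1, d2)) / (n - 3)
```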
References

[1] Aubrun, G. and Szarek, S. Alice and Bob meet Banach. Book in preparation.
[2] Bai, Z. and Silverstein, J. W. Spectral Analysis of Large Dimensional Random Matrices. Springer Series in Statistics, 2010.
[3] Bai, Z. D. and Yin, Y. Q. Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. Ann. Probab. 21(3), 1275–1294 (1993).
[4] Bercovici, H. and Voiculescu, D. Regularity questions for free convolution. Oper. Theory Adv. Appl. 104, 37–47 (1998).
[5] Bhatia, R. Matrix Analysis. Graduate Texts in Mathematics 169, Springer-Verlag, New York, 1997.
[6] Bhatia, R. Matrix factorizations and their perturbations. Linear Algebra Appl. 197, 245–276 (1994).
[7] Choi, M.-D. Completely positive linear maps on complex matrices. Linear Algebra Appl. 10, 285 (1975).
[8] Collins, B. and Nechita, I. Random quantum channels I: Graphical calculus and the Bell state phenomenon. Comm. Math. Phys. 297(2), 345–370 (2010).
[9] Collins, B. and Nechita, I. Random matrix techniques in quantum information theory. J. Math. Phys. 57, 015215 (2016).
[10] Collins, B., Nechita, I. and Ye, D. The absolute positive partial transpose property for random induced states. Random Matrices: Theory Appl. 1, 1250002 (2012).
[11] Collins, B. and Śniady, P. Integration with respect to the Haar measure on unitary, orthogonal and symplectic group. Comm. Math. Phys. 264(3), 773–795 (2006).
[12] Davies, E. B. Lipschitz continuity of operators in the Schatten classes. J. London Math. Soc. 37, 148–157 (1988).
[13] Deya, A. and Nourdin, I. Convergence of Wigner integrals to the tetilla law. Lat. Am. J. Probab. Math. Stat. 9, 101 (2012).
[14] Giraud, O. Distribution of bipartite entanglement for random pure states. J. Phys. A: Math. Theor. 40, 2793 (2007).
[15] Helstrom, C. W. Quantum Detection and Estimation Theory. Academic Press, 1976.
[16] Hiai, F. and Petz, D. The Semicircle Law, Free Random Variables, and Entropy. AMS, 2000.
[17] Kitaev, A. Y. Quantum computations: algorithms and error correction. Russian Math. Surveys 52(6), 1191 (1997).
[18] Jamiołkowski, A. Linear transformations which preserve trace and positive semidefiniteness of operators. Rep. Math. Phys. 3(4), 275–278 (1972).
[19] Jenčová, A. and Plávala, M. Conditions for optimal input states for discrimination of quantum channels. Preprint arXiv:1603.01437.
[20] Johnston, N. QETLAB: A MATLAB toolbox for quantum entanglement, version 0.9. http://www.qetlab.com, January 12, 2016.
[21] Male, C. The norm of polynomials in large random and deterministic matrices. Probab. Theory Related Fields 154(3–4), 477–532 (2012).
[22] Nica, A. and Speicher, R. Lectures on the Combinatorics of Free Probability. Cambridge University Press, 2006.
[23] Nica, A. and Speicher, R. Commutators of free random variables. Duke Math. J. 92, 553–592 (1998).
[24] Paulsen, V. Completely Bounded Maps and Operator Algebras. Cambridge University Press, 2002.
[25] Popescu, S., Short, A. J. and Winter, A. Entanglement and the foundations of statistical mechanics. Nature Physics 2(11), 754–758 (2006).
[26] Puchała, Z., Pawela, Ł. and Życzkowski, K. Distinguishability of generic quantum states. Phys. Rev. A 93(6), 062112 (2016).
[27] Stinespring, W. F. Positive functions on C*-algebras. Proc. Amer. Math. Soc. 6(2), 211–216 (1955).
[28] Timoney, R. M. Computing the norms of elementary operators. Illinois J. Math. 47(4), 1207–1226 (2003).
[29] Watrous, J. Theory of Quantum Information. Book draft available online, July 2016.
[30] Watrous, J. Simpler semidefinite programs for completely bounded norms. Chicago J. Theor. Comput. Sci. 8, 1–19 (2013).
[31] Wolkowicz, H. and Styan, G. P. H. Bounds for eigenvalues using traces. Linear Algebra Appl. 29, 471–506 (1980).
[32] Samuelson, P. A. How deviant can you be? J. Amer. Statist. Assoc. 63(324), 1522–1525 (1968).
[33] Weingarten, D. Asymptotic behavior of group integrals in the limit of infinite rank. J. Math. Phys. 19(5), 999–1001 (1978).
[34] Sommers, H.-J. and Życzkowski, K. Statistical properties of random density matrices. J. Phys. A: Math. Gen. 37(35), 8457 (2004).
[35] A Mathematica notebook can be found in the supplementary material of the current paper.