Lecture notes on the Statistical Structure of Quantum Theory

Peter Harremoës

January 17, 2012

1 Convexity

1.1 Convex sets

Let H be a vector space. We will identify the vectors in H with points. Let C be a subset of H and let u, v ∈ C. If u ≠ v then the map α ↦ α·u + (1 − α)·v is a parametrization of the straight line through u and v if the domain is R. If the domain of the parametrization is [0; 1] then the curve is the line segment with end points u and v. If C is a disc or a square in two dimensions then the whole segment α·u + (1 − α)·v, α ∈ [0; 1], lies within C. The same holds if C has the shape of a ball in 3 dimensions or of a box [a; b]^d in R^d. Not all subsets of a vector space satisfy this property. For instance {0, 1} as a subset of the vector space R, or a torus as a subset of a three dimensional vector space, do not satisfy the property. The set C ⊆ H is said to be convex if for all u, v ∈ C and all α ∈ [0; 1] the vector α·u + (1 − α)·v belongs to C. If α ∈ [0; 1] then α·u + (1 − α)·v is called a convex combination or a mixture of the two points/vectors u and v.


[Figure: convex sets and sets which are not convex.]

Example 1 Let M_+^1(U) be the set of probability vectors on a finite space U. Then M_+^1(U) is convex.
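As a quick numerical illustration of Example 1, the following sketch (assuming numpy is available; the helper random_prob_vector is ours, not part of the notes) checks that a mixture of two probability vectors is again a probability vector:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_prob_vector(n, rng):
    """Draw a random probability vector on an n-point space."""
    p = rng.random(n)
    return p / p.sum()

p, q = random_prob_vector(4, rng), random_prob_vector(4, rng)
alpha = 0.3
mix = alpha * p + (1 - alpha) * q  # convex combination

# The mixture is again a probability vector: nonnegative and sums to 1.
assert np.all(mix >= 0)
assert abs(mix.sum() - 1.0) < 1e-12
```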

A hyperplane PL in H is a subset of the form {u ∈ H | Φ(u) = λ} where Φ is linear and λ is a constant. Let C and K be subsets of H. Then C and K are separated by the hyperplane PL if

Φ(u) < λ for u ∈ K   and   Φ(u) ≥ λ for u ∈ C.


[Figure: two convex sets separated by a hyperplane.]

The distance dist(C, K) between the sets is defined by

dist(C, K) = inf_{u ∈ C, v ∈ K} ‖v − u‖.

Theorem 2 Assume that H is a Hilbert space. Let C ⊆ H be a convex set and assume that C is closed. Let u ∈ H be a vector. Put d_min = dist(u, C). If u ∉ C then there exists v ∈ C such that dist(u, v) = d_min, and the hyperplane

{w ∈ H | ⟨w − u | v − u⟩ = d_min²}

separates the sets C and {u}.

Proof. Let v_n ∈ C be a sequence such that ‖v_n − u‖ → d_min. For all m, n ∈ N

(1/2)·v_m + (1/2)·v_n ∈ C.

Therefore

d_min² ≤ ‖(1/2)·v_m + (1/2)·v_n − u‖²
       = (1/4)‖(v_m − u) + (v_n − u)‖²
       = (1/4)(2‖v_m − u‖² + 2‖v_n − u‖² − ‖v_m − v_n‖²)
       = (1/2)‖v_m − u‖² + (1/2)‖v_n − u‖² − (1/4)‖v_m − v_n‖²,

where the second equality is the parallelogram law. Then

‖v_m − v_n‖² ≤ 2‖v_m − u‖² + 2‖v_n − u‖² − 4d_min² → 0 for m, n → ∞,

and v_n is a Cauchy sequence and converges to some vector v. The set C is closed, which proves that v ∈ C. The equation ‖v − u‖ = d_min holds by continuity. Let w be an element of C. Then α·w + (1 − α)·v ∈ C by convexity. Therefore

‖v − u‖² ≤ ‖αw + (1 − α)v − u‖²
         = ‖α(w − u) + (1 − α)(v − u)‖²
         = α²‖w − u‖² + (1 − α)²‖v − u‖² + 2α(1 − α)⟨w − u | v − u⟩.

For α = 0 equality holds. Taking the derivative with respect to α at α = 0 gives

0 ≤ −2‖v − u‖² + 2⟨w − u | v − u⟩,

which is the desired inequality.

We see that a point belongs to a closed convex set if and only if it cannot be separated from the set. The vector v ∈ C which minimizes ‖v − u‖ is called the projection of u on C. The following theorem is known as the Separation Theorem for Convex Sets.

Theorem 3 Let C, K ⊆ H be convex sets and assume that C is closed and K is compact. Then C ∩ K = ∅ if and only if C and K are separated by a hyperplane.

Proof. Put d_min = dist(C, K). Choose v_0 ∈ K such that dist(C, v_0) ≤ d_min + 1. Then C̃ = {u ∈ C | dist(u, K) ≤ d_min + 1} is compact and there exist u_1 ∈ C̃ and v_1 ∈ K such that ‖u_1 − v_1‖ = d_min. Then u_1 is the projection of v_1 on C and v_1 is the projection of u_1 on K, and the set

{w ∈ H | ⟨w − v_1 | u_1 − v_1⟩ = (1/2)·d_min²}

is a separating hyperplane.

The Separation Theorem can be generalized to infinite dimensional spaces with suitable topologies, but then the proof requires the axiom of choice (or one of its equivalents).

Theorem 4 Let C be a closed convex subset of a finite dimensional Hilbert space H. Let P be a probability measure on C. Then the vector defined by

v = E(u) = ∫_C u dP(u)

is an element of C.


Proof. Assume that v ∉ C. Then there exists an affine map Φ such that Φ(u) ≥ 0 for u ∈ C and Φ(v) < 0. Then Φ(u) can be considered as a nonnegative random variable, and

0 ≤ E(Φ(u)) = Φ(E(u)) = Φ(v) < 0,

and we have a contradiction.

The point/vector v = E(u) = ∫_C u dP(u) is called a convex combination or mixture of the points/vectors in supp(P). For a set A ⊆ R^d (which is not assumed to be convex) the convex hull of A is the set of vectors of the form ∫ u dP(u) where P runs over the probability measures on A. One sees that the convex hull is the smallest convex set containing A as a subset. In infinite dimensional spaces the situation is more complicated, and one has to distinguish between finitely supported probability measures (probability vectors) and more general probability measures.

A subset F of a convex set C is said to be a face if for all u, v ∈ C and α ∈ ]0; 1[ we have that α·u + (1 − α)·v ∈ F implies that u, v ∈ F. A face of the convex set C consisting of one point is called an extreme point of C. The set of extreme points in C is denoted ∂_e C.
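Theorem 2 can be illustrated numerically. The sketch below (numpy assumed; the concrete set C, the closed unit ball in R^3, is our choice) projects a point onto C and checks the separating inequality ⟨w − u | v − u⟩ ≥ d_min² on random points of C:

```python
import numpy as np

rng = np.random.default_rng(1)

# C is the closed unit ball in R^3; u is a point outside C.
u = np.array([2.0, 1.0, -2.0])      # ||u|| = 3
v = u / np.linalg.norm(u)           # projection of u on C
d_min = np.linalg.norm(u - v)       # = ||u|| - 1 = 2

# Theorem 2: <w - u | v - u> >= d_min^2 for every w in C.
for _ in range(1000):
    w = rng.normal(size=3)
    w = w / np.linalg.norm(w) * rng.random()  # random point of the ball
    assert np.dot(w - u, v - u) >= d_min**2 - 1e-9
```

Equality is approached as w approaches the projection v itself.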

[Figure: a convex set with one of its faces marked.]

By the dimension of a convex set or face we will mean the dimension of the affine space spanned by the set.

Lemma 5 A non-empty compact convex set C ⊆ R^d contains at least one extreme point.

Proof. The proof is by induction on the dimension of C. For dim(C) = 0 the set C contains only one point, which must be an extreme point. Assume that the lemma is true for all compact convex sets of dimension less than n and that dim(C) = n. Let Φ : R^d → R be a linear map such that Φ(C) is not a point. Then Φ(C) is an interval [a; b]. The set Φ⁻¹(b) ∩ C is a face of C of dimension less than n, and therefore Φ⁻¹(b) ∩ C contains at least one extreme point; this point must also be an extreme point of C.

The following important theorem is due to Carathéodory.


Theorem 6 Let C be of finite dimension d. Assume that C is convex and compact, and that u ∈ C is a point. Then there exist extreme points v_1, v_2, ..., v_{d+1} and a probability vector (α_1, α_2, ..., α_{d+1}) such that

u = Σ_{j=1}^{d+1} α_j · v_j.

Proof. The proof is by induction on d. For d = 0 it is obvious. Assume that the theorem is true for convex sets of dimension less than d. The convex set C is compact and therefore C contains an extreme point v_1. Let β_0 be the smallest β ∈ R such that

w = β·v_1 + (1 − β)·u ∈ C

(note that β = 0 gives u ∈ C, so β_0 ≤ 0). Put

F = {v ∈ C | ∃γ ∈ ]0; 1[, ∃ṽ ∈ C : γ·v + (1 − γ)·ṽ = w}.

One easily checks that F is a face containing w, and its dimension must be less than d. Therefore w can be written as a convex combination of at most d extreme points. Substituting such a combination for w in the formula

u = (1 − β_0)⁻¹ (w − β_0·v_1)

and using that β_0 ≤ 0, we get u written as a convex combination of at most d + 1 extreme points.

Another way to state the theorem is that the map M_+^1(∂_e C) → C is surjective. In general it is not injective. For instance the centre of a circle is a convex combination of any pair of antipodal points. If the map M_+^1(∂_e C) → C is injective then C is said to be a simplex.
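For a simplex the representation in Theorem 6 is unique and can be computed by solving a linear system. A minimal sketch (numpy assumed; the triangle and the point u are our example):

```python
import numpy as np

# Extreme points of a 2-dimensional simplex (a triangle) in R^2.
V = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

u = np.array([0.2, 0.3])  # a point of the triangle

# Solve u = sum_j alpha_j v_j with sum_j alpha_j = 1 (barycentric coordinates).
A = np.vstack([V.T, np.ones(3)])   # 3x3 system
b = np.append(u, 1.0)
alpha = np.linalg.solve(A, b)

assert np.all(alpha >= -1e-12)     # a genuine convex combination
assert np.allclose(V.T @ alpha, u) # it reproduces u
# For a simplex the system is invertible, so the representation is unique.
```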

1.2 Convex functions

Let f be a real function with a convex set C as domain. Then f is said to be convex if for all u, v ∈ C and all α ∈ [0; 1] the following inequality is satisfied:

f(α·u + (1 − α)·v) ≤ α·f(u) + (1 − α)·f(v).

If f is convex then −f is said to be concave. The epigraph of f is the set

{(v, y) ∈ C × R | y ≥ f(v)}.

We see that f is convex if and only if the epigraph is convex. Applying the results of the previous section to the epigraph of a convex function we get

f(E(u)) ≤ E(f(u)).

The inequality is called Jensen's inequality after the Danish mathematician J. L. W. V. Jensen (1859-1925).
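Jensen's inequality is easy to test numerically. A minimal sketch with the convex function f(x) = x² (our choice) and a uniform distribution on random samples:

```python
import numpy as np

rng = np.random.default_rng(2)

f = lambda x: x**2                 # a convex function
u = rng.normal(size=10_000)        # samples of a random variable
p = np.full(u.size, 1 / u.size)    # uniform probability weights

lhs = f(np.sum(p * u))             # f(E(u))
rhs = np.sum(p * f(u))             # E(f(u))
assert lhs <= rhs                  # Jensen's inequality
```

Here the gap rhs − lhs is exactly the variance of the sample, which is one way to remember the inequality for f(x) = x².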

2 States and measurements

A physical experiment consists of a physical arrangement O and a result R. These data may be of an arbitrary nature: they may be discrete if the measurement instrument registers an event, for instance the appearance of a particle; they can represent a scalar or a vector depending on whether the measurement instrument has one or more scales; or the result may be the entire track of a particle in a bubble chamber. To give a unified treatment we will assume that the results are the elements of a finite set U and denote by B(U) the set of all subsets of U. Similarly we will assume that the set of physical arrangements O (and the possible values of all other variables introduced in this chapter) is a measurable set. An individual result is rarely completely determined by the preparation, but if one repeats the experiment the frequencies of the different results will be uniquely determined. Often O and R will be composed of a number of other variables. In the situations we shall consider, O can be divided into a preparation P and a measurement procedure M. We will assume that P and M are independent, and thereby we exclude correlation between P and M. We put O = (P, M). By the preparation an experimental setup is established by giving initial conditions and input data. By the measurement procedure the "prepared object" is coupled to the measurement apparatus, which results in the observation R. The "object" can be considered as a "black box", a coupling or an information channel between preparation and measurement apparatus. Let P be the set of preparations and M the set of measurement procedures. Instead of making just one specific preparation we can equip the set of preparations with a probability measure and perform the different preparations with probabilities according to the probability measure. This is called randomization. A measurement µ ∈ M maps P into the probability measures M_+^1(U) and can be extended to a map M_+^1(P) → M_+^1(U). The set M_+^1(P) can be equipped with the pseudo-metric dist defined by

dist(s_1, s_2) = sup_{µ∈M} ‖µ_{s_1} − µ_{s_2}‖_tot.

We define the state space S as the completion of M_+^1(P) with respect to dist, and the elements of S are called states. Hereby we get a mapping M_+^1(P) → S. This means that two preparations give the same state if and only if it is not possible to distinguish them by any measurement. The state space is automatically bounded because the total variation is bounded. By a measurement we will understand any affine mapping S → M_+^1(U). With these definitions the state space depends on the set of measurements considered. Therefore it is misleading to say for instance "the electron is in state φ". Instead one should say "our knowledge about the electron is completely described by φ".

Example 7 At time t = 0 a classical particle is sent from the position x = 0 with velocity v ∈ R³. This shall be our set of preparations, P = R³. Assume that the particle is not subjected to any forces. For any B ⊆ R³ define a measurement by detecting at a time t > 0 whether the particle is in B. In general classical mechanics is characterized by S = M_+^1(P) and M = B(P).

Example 8 A Stern-Gerlach apparatus contains an anisotropic magnetic field which splits a beam of electrons into 2 beams of identical intensity. As the preparation we consider a source emitting the electrons one by one such that only

the electrons localized in one of the beams will continue. The Stern-Gerlach apparatus can be rotated around the axis formed by the beam. The apparatus is therefore characterized by an angle θ_1 ∈ T = R/2πZ, which is our set of preparations. Another Stern-Gerlach apparatus is placed after the first, and at last there are two detectors which detect whether an electron is drawn in one or the other direction. This apparatus can be rotated too, and therefore it is characterized by an angle θ_2. Let θ^i denote the vector (cos(θ_i), sin(θ_i)), and let d_i be detection by the i'th detector. As we shall see, we will have (under ideal circumstances) that

P(d_1 | d_2) = cos²((θ_1 − θ_2)/2) = (1 + θ¹·θ²)/2.

If P is a probability distribution on the set of preparations then the measurement θ_2 maps it into a probability distribution on the event space given by

P(d_1 | θ_2) = ∫ (1 + θ¹·θ²)/2 dP(θ¹) = (1 + (∫ θ¹ dP(θ¹))·θ²)/2.

The mapping P ↦ ∫ θ¹ dP(θ¹) is therefore exactly the mapping M_+^1(T) → S. This shows that S is isomorphic with {(a, b) ∈ R² | a² + b² ≤ 1}. This set can, via the mapping

(a, b) ↦ (1/2) [ 1+a   b  ]
               [  b   1−a ]

be identified with the set of density matrices on the Hilbert space R². In elementary quantum mechanics the state space can usually be identified with the set of density matrices/operators on a Hilbert space (often complex).

By the above construction S will be a metrically complete convex set. Any metrically complete convex set K can be obtained by a similar construction. Let U be equal to K, and let the set of measurement procedures M be the affine mappings K → [0; 1]. Then there exists an affine mapping A : M_+^1(K) → K where a probability distribution s on K is mapped into the corresponding convex combination in K. For s ∈ M_+^1(K) and m ∈ M, m(A(s)) defines a distribution on U = [0; 1], which is the corresponding measurement. Then S = K.

Definition 9 A measurement is said to be simple if the map φ ↦ µ_φ(D) is extreme among all functionals S → [0; 1] for all subsets D ⊆ U.

With this definition the simple measurements are extreme in the set of all measurements. As we shall see later, there may exist extreme measurements which are not simple.

Definition 10 Let S = M_+^1(U) be a simplex and let µ : M_+^1(U) → M_+^1(V) be a measurement. Then µ is given by its values on the extreme points δ_x, x ∈ U. The measurement is extreme if and only if µ(δ_x) is extreme in M_+^1(V) for all x ∈ U, but the extreme points in M_+^1(V) are of the form δ_y, y ∈ V. Therefore the extreme measurements are given by a map f : U → V and

µ_φ(D) = φ(f⁻¹(D))

where φ is a probability measure in M_+^1(U). The functional φ ↦ φ(f⁻¹(D)) from M_+^1(U) to [0; 1] maps the extreme points δ_x into extreme points of [0; 1], and therefore the functional is extreme. We see that the extreme measurements on a simplex are the simple measurements of the form φ ↦ φ(f⁻¹(D)). All other measurements are mixtures of simple measurements. The simple measurements correspond to observations of which of the subsets f⁻¹(D) of U the underlying point belongs to.

Example 11 In the Stern-Gerlach experiment the state space can be identified with the unit disc. The simple measurements can then be identified with projections of the unit disc onto a diameter of the disc. If the state is given by the density matrix

S = (1/2) [ 1+a   b  ]
          [  b   1−a ]

then the simple measurements are given by

(a, b) ↦ (1 + (a, b)·θ²)/2

where θ² = (x, y) is a unit vector. Put

T = (1/2) [ 1+x   y  ]
          [  y   1−x ]

Then the measurement can also be written as

(1 + (a, b)·(x, y))/2 = Tr(ST).

3 Group representations on convex sets

In physics it is often possible to perform certain "actions" on the set of preparations: the instruments may for instance be rotated in space. Then it is often possible to make the same measurements on the rotated preparation as on the original preparation. Similarly a measurement instrument may be rotated, so that the rotations also act on the set of measurements. The set of rotations forms a group

because there exists an inverse to any rotation. We see that two preparations can be distinguished by measurements if and only if the preparations can be distinguished after the action of a rotation. Therefore the rotations induce an action on the state space. In this section we shall see that the assumption that a group acts on the state space leaves only few possibilities for the shape of the state space.

Definition 12 Let G be a (topological) group and let S be a finite dimensional state space. Then an action of G on S is a map which sends a group element g ∈ G into an affine map a_g : S → S in such a way that the following conditions are fulfilled:

• The neutral element e ∈ G is mapped into the identity.

• For all g, h ∈ G the following equation is satisfied: a_{gh} = a_g ◦ a_h.

• The action is continuous, i.e. for any sequence g_n converging to g in G the sequence a_{g_n} converges to a_g.

Representations may be complicated, but here the focus will be on the simplest representations, which are of special interest in quantum theory.

Definition 13 Let G be a group acting on a finite dimensional state space S. Then the representation is said to be irreducible if any invariant convex subset is either a point or spans S.

In elementary quantum mechanics we are interested in irreducible representations, and the word is used because the phenomena studied have no internal structure. When a free neutron decays and suddenly stops moving according to the Dirac equation for a free spin 1/2 particle (it disappears), this shows exactly that the neutron has a more complicated structure than just being a spin 1/2 particle.

Theorem 14 If the state space S has dimension at most 3 and there exists a connected group G acting irreducibly on S, then S is isomorphic to the set of density matrices on a (real or complex) Hilbert space.

Proof. The group of affine maps S → S is compact. Therefore the range of G in this group is compact, so we may assume that G itself is compact and is equipped with the unique normalized Haar measure m. If ϑ ∈ S then

ϑ_0 = ∫_G a_g(ϑ) dm(g)

is invariant under the action of G. The state space can be embedded into a real 3-dimensional Hilbert space with ϑ_0 at the origin. Let (· | ·) be the inner product of this vector space. Then

⟨v | w⟩ = ∫_G (a_g(v) | a_g(w)) dm(g)

is an inner product which is invariant under the action of G. Therefore G can be identified with a connected subgroup of the group O(n), n ≤ 3. Now, all connected subgroups of O(n), n ≤ 3, are isomorphic to SO(m), m ≤ 3. Therefore S is isomorphic to the unit ball in 1, 2 or 3 dimensions. The group SO(1) is trivial and can be excluded. Then we just have to remark that the unit disc and the unit ball are isomorphic to the sets of density matrices in 2 real, respectively complex, dimensions via the map

(a, b, c) ↦ (1/2) [ 1+a    b+ic ]
                  [ b−ic   1−a  ]
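The map used at the end of the proof can be checked numerically: it sends the unit ball onto density matrices. A sketch (numpy assumed; bloch is our name for the map):

```python
import numpy as np

def bloch(a, b, c):
    """The map (a, b, c) -> density matrix used in Theorem 14."""
    return 0.5 * np.array([[1 + a, b + 1j * c],
                           [b - 1j * c, 1 - a]])

rng = np.random.default_rng(4)
v = rng.normal(size=3)
v = v / np.linalg.norm(v) * rng.random()   # a point of the unit ball
rho = bloch(*v)

assert abs(np.trace(rho).real - 1) < 1e-12        # unit trace
assert np.allclose(rho, rho.conj().T)             # self-adjoint
assert np.linalg.eigvalsh(rho).min() >= -1e-12    # positive semidefinite
```

The eigenvalues are (1 ± r)/2 with r = (a² + b² + c²)^{1/2}, so positivity holds exactly when r ≤ 1, i.e. on the unit ball.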

3.1 Representations of compact groups

Let G be a compact group with Haar measure µ. Let G act irreducibly on the state space S. We will assume that the representation is non-trivial. Choose φ ∈ S such that G(φ) ≠ {φ}. Then G(φ) spans the convex set S. For a probability measure ν ∈ S(G) define

m(ν) = ∫ g(φ) dν(g).

Then m is a map S(G) → S. This map can be extended to a map S_0(G) → S where S_0(G) is the set of Radon distributions on G. Further, L²(G, µ) is embedded in S_0(G).

[Diagram (1): a commutative diagram relating C(G), S_0(G), L²(G, µ) and the map m.]

By compactness of G,

L²(G, µ) = ⊕_I H_i

where the H_i are minimal invariant subspaces. This can be rewritten as

L²(G, µ) = ⊕_J L_j

where L_j is of the form H_i if H̄_i = H_i, or of the form H_i ⊕ H̄_i if H̄_i ≠ H_i. The elements of L_j are bounded functions by compactness of G. If (f_k) is a generating set for L_j then (Re(f_k), Im(f_k)) is a generating set of real functions. For f ∈ L_j we have ∫ f dµ = ⟨f | 1⟩ = 0 because C ⊥ L_j. Therefore there exists a generating set (b) for L_j such that 1 + b ≥ 0 and ∫(1 + b) dµ = 1, which shows that 1 + b ∈ S(G). Therefore (1 + L_j) ∩ S(G) = (1 + L_j)_+ spans the convex set 1 + L_j. Now, m((1 + L_j)_+) ⊆ S is an invariant convex subset and it will either be a point or span S. If m((1 + L_j)_+) is a point then m(1 + L_j) is a point, which implies that m(L_j) = 0. If m((1 + L_j)_+) is a point for all j ∈ J then m(S_0(G)) = m(C) and m(S(G)) is a point, which contradicts that the span of m(S(G)) is non-trivial. Therefore there exists j ∈ J such that m((1 + L_j)_+) spans S.

Put K = m⁻¹(m(1)) ∩ (1 + L_j). Then K is an invariant convex subset of 1 + L_j such that K − 1 is an invariant subspace of L_j. There are 4 possibilities for what K − 1 can be: 0, H_i, H̄_i or L_j. If H_i ⊆ K − 1 then we also have H̄_i ⊆ K − 1 and therefore L_j ⊆ K − 1. This shows that m(1 + L_j) ⊆ m(K) = m(1), which contradicts that S is non-trivial. In a similar way one can exclude that K − 1 is equal to H̄_i or L_j. Therefore K − 1 = 0 and K = {1}. This shows that m is injective on (1 + L_j)_+.

The above argument shows how all irreducible convex representations can be constructed from the irreducible unitary representations. If G is commutative the unitary irreducible representations of G are 1-dimensional, so that dim(H_i) = 1. Therefore dim(L_j) equals 1 or 2.

dim(L_j) = 1: Then S has the form [0; 1] and is isomorphic to the diagonal density matrices on a 2-dimensional Hilbert space. The group of automorphisms is Z_2.

dim(L_j) = 2: Then S can be embedded in the disc ∆, which is isomorphic to the density matrices on a 2-dimensional real Hilbert space. The group of automorphisms is T × Z_2.

If G is connected then there are only 2 possibilities: either S is trivial or S is isomorphic to ∆.

Example 15 If G is the group of rotations in 2 dimensions then S is trivial or S is isomorphic to ∆. The trivial representation is called the spin 0 representation. Assume that S is isomorphic to ∆. Then the representation is given by θ ↦ nθ for some n ∈ Z. For n = 0 we have the trivial representation, and further the representations n and −n are isomorphic, which shows that the representation is given by a number n ∈ N_0. A quantum mechanical system characterized by the number n is said to have spin n/2.

3.2 Spin

First we shall study rotations in two dimensions. The orthogonal group O(2) is the group of orthogonal transformations of the plane. Orthogonal transformations have determinant 1 or −1; the ones with determinant −1 are the reflections. We shall focus on the orthogonal transformations with determinant 1. They form a subgroup called SO(2) that can be identified with T = R/2πZ. The Haar measure on T is simply the uniform distribution U on T, that is, the Lebesgue measure divided by 2π. Assume that T acts irreducibly on a state space S. We shall now find the exact shape of S and classify the actions of T on S. One possibility is that T acts trivially on S, i.e. a_ϑ(φ) = φ for all ϑ ∈ T and all φ ∈ S. This is called the spin 0 representation. We will now assume that the representation is non-trivial, and choose φ ∈ S and ϑ ∈ T such that a_ϑ(φ) ≠ φ. Then the set {a_ϑ(φ) | ϑ ∈ T} spans the convex set S. For a probability measure


ν ∈ M_+^1(T) put

m(ν) = ∫ a_ϑ(φ) dν(ϑ).

Then m is a map M_+^1(T) → S. If ν is the measure δ_{ϑ_0} with all its mass in the point ϑ_0, then

m(δ_{ϑ_0}) = ∫ a_ϑ(φ) dδ_{ϑ_0}(ϑ) = a_{ϑ_0}(φ).

For n ∈ N and (a, b) ∈ R² consider the function

f(ϑ) = 1 + a cos(nϑ) + b sin(nϑ) = 1 + (a, b)·(cos(nϑ), sin(nϑ)),

and we see that f is a nonnegative function if and only if a² + b² ≤ 1. This gives a map m_n : ∆ → S. Now we consider f as the probability density of a probability measure µ on T with respect to U. Then

a_ζ(m_n(µ)) = a_ζ(∫ a_ϑ(φ) (1 + a cos(nϑ) + b sin(nϑ)) dU(ϑ))
            = ∫ a_{ζ+ϑ}(φ) (1 + a cos(nϑ) + b sin(nϑ)) dU(ϑ)
            = ∫ a_ϑ(φ) (1 + a cos(n(ϑ − ζ)) + b sin(n(ϑ − ζ))) dU(ϑ).

Then we use that

a cos(n(ϑ − ζ)) + b sin(n(ϑ − ζ))
  = a (cos(nϑ) cos(nζ) + sin(nϑ) sin(nζ)) + b (sin(nϑ) cos(nζ) − cos(nϑ) sin(nζ))
  = (a cos(nζ) − b sin(nζ)) cos(nϑ) + (a sin(nζ) + b cos(nζ)) sin(nϑ),

i.e. the coefficient vector (a, b) is replaced by its rotation by the angle nζ. Therefore the action of ζ on S corresponds to a rotation of ∆ by the angle nζ. This action of T on ∆ is obviously irreducible. Therefore either m_n(∆) spans S or m_n(∆) is a point.

Assume that m_n(∆) is a point for all n ∈ N. Then all functions of the form 1 + a cos(nϑ) + b sin(nϑ) are mapped into the same point. Then also functions of the form

1 + Σ_{n=1}^{k} (a_n cos(nϑ) + b_n sin(nϑ))

are mapped into the same point in S. These functions are (weakly) dense in the set of probability measures on T, and therefore all probability measures on T are mapped into one point. Especially the probability measures with all mass in one point are mapped into the same point, and we see that T acts trivially on S. This is called the spin 0 representation of T.

Assume that m_n(∆) spans S. Then S can be identified with a subset of R² spanned by ∆. The group T acts on ∆ as rotations, and there is a unique extension of this action to an action as rotations of R². Now S is bounded and closed, so there exists φ ∈ S such that ‖φ‖_2 is maximal. Rotations of this state give a circle in R². Then S contains no states outside the circle, and by convexity S contains all states inside the circle. After a suitable scaling S can be identified with ∆. As we have seen, the unit disc can be identified with the set of density matrices on a real 2-dimensional Hilbert space. A quantum mechanical system characterized by the number n is said to have spin n/2.

To recognize a spin n/2 system one should do the following. First one should find a measurement which is sensitive to rotations. If no such measurement exists, the system has spin 0. If a rotation sensitive measurement has been found, one should observe the effect of rotations. If ζ_0 is the smallest rotation such that a rotation by the angle ζ_0 gives the same measurement results as no rotation, then the system has spin π/ζ_0. Table 1 lists some quantum particles and their spin. The particles with integer spin are called bosons and the others are called fermions. Roughly speaking, matter is composed of fermions and forces are carried by bosons.
We should also remark that

[ cos t  −sin t ] [ 1/2+a    b   ] [  cos t  sin t ]   [ 1/2 + a cos 2t − b sin 2t      a sin 2t + b cos 2t     ]
[ sin t   cos t ] [   b    1/2−a ] [ −sin t  cos t ] = [    a sin 2t + b cos 2t      1/2 − a cos 2t + b sin 2t  ]

so that a rotation of ∆ by an angle 2t, embedded as a matrix acting on a Hilbert space, corresponds to a rotation by t of the Hilbert space. We see that for a spin n/2 particle a rotation by ζ gives a rotation in the Hilbert space by ζn/2. For bosons ζn/2 is well defined, but for fermions n/2 is not an integer and there is no unique way to divide by 2 in T. To solve this problem we introduce a double covering T → T via multiplication by 2. Note that the unitary matrix

[ cos t  −sin t ]
[ sin t   cos t ]

has eigenvalues e^{it} and e^{−it}, so it is more instructive to consider the representation

[ e^{it}    0    ] [ 1/2    b+ic ] [ e^{−it}    0   ]   [     1/2          e^{2it}(b+ic) ]
[   0    e^{−it} ] [ b−ic    1/2 ] [    0    e^{it} ] = [ e^{−2it}(b−ic)       1/2       ]

Symbol             Name            Spin
H^0                Higgs particle  0
π^0, π^+, π^−      pion            0
e                  electron        1/2
ν_e                neutrino        1/2
µ                  muon            1/2
τ                  tau             1/2
p                  proton          1/2
n                  neutron         1/2
u, d, s, c, b, t   quarks          1/2
γ                  photon          1
W^+, W^−           W-boson         1
Z^0                Z-boson         1
g                  gluon           1
∆                  delta baryon    3/2
Ω^−                omega particle  3/2
                   graviton        2

Table 1: A selection of elementary particles. The Higgs particle, the gluon and the graviton are hypothetical. The quarks cannot exist independently, but there is a lot of experimental evidence for their existence.

Until now only planar rotations have been discussed. Often it is possible to rotate a particle in 3 dimensions, but not always. For instance it is only obvious how to rotate a photon if the axis of rotation is parallel to its velocity vector. Irreducible representations of rotations in 3 dimensions are more complicated than in two dimensions, but we can show that the state space of an irreducible representation can be embedded in the set of density matrices on a complex Hilbert space. We shall see the relation to representations in 3 dimensions, i.e. representations of SO(3). The group SU(2) acts on the Bloch sphere, and therefore we get a group homomorphism π : SU(2) → SO(3). Let U_1 and U_2 be special unitary operators in SU(2) leading to the same rotation of the Bloch sphere. Then U_1 = αU_2. Taking the determinant gives α² = 1. Therefore α = ±1, and U_1 = ±U_2. Therefore the map π : SU(2) → SO(3) is 2 to 1. Any projective representation ρ : SO(3) → U(n) gives a projective representation ρ ◦ π : SU(2) → U(n). A representation τ : SU(2) → U(n) gives a unitary representation of SO(3) if and only if τ(−1) = 1.
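The double-angle conjugation formula displayed earlier in this section can be verified numerically (numpy assumed; the values of t, a and b are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
t = rng.random() * 2 * np.pi
a, b = 0.3, -0.2

R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
M = np.array([[0.5 + a, b], [b, 0.5 - a]])

lhs = R @ M @ R.T  # conjugation of the state matrix by a rotation by t
rhs = np.array([[0.5 + a*np.cos(2*t) - b*np.sin(2*t), a*np.sin(2*t) + b*np.cos(2*t)],
                [a*np.sin(2*t) + b*np.cos(2*t), 0.5 - a*np.cos(2*t) + b*np.sin(2*t)]])
assert np.allclose(lhs, rhs)   # rotation by t acts as rotation by 2t on (a, b)
```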

We introduce the matrices

i = [ i   0 ]    j = [  0  1 ]    k = [ 0  i ]
    [ 0  −i ]        [ −1  0 ]        [ i  0 ]

Then

i² = j² = k² = −1,   ij = −ji = k,   jk = −kj = i,   ki = −ik = j.

The matrix U = a + bi + cj + dk has the form

[  a+ib  c+id ]
[ −c+id  a−ib ]

and determinant a² + b² + c² + d². The adjoint is U* = a − bi − cj − dk. Therefore

UU* = U*U = (a² + b² + c² + d²)·1.

Hence U ∈ SU(2) if and only if a² + b² + c² + d² = 1. It is easy to check that all elements of SU(2) are of this form. Therefore SU(2) has the same topology as the unit sphere in 4 dimensions.

Example 16 Now put z_1 = a + ib and z_2 = c + id. Then SU(2) can be identified with the set of vectors (z_1, z_2) with |z_1|² + |z_2|² = 1. The group SU(2) acts on the set {(z_1, z_2) ∈ C² | |z_1|² + |z_2|² = 1} via

(z_1, z_2) ↦ (u_11 z_1 + u_21 z_2, u_12 z_1 + u_22 z_2)

where

U = [ u_11  u_12 ] ∈ SU(2).
    [ u_21  u_22 ]

If U = a + bi + cj + dk and (z_1, z_2) = (1, 0), then (z_1, z_2) is mapped into (u_11, u_12) = (a + ib, c + id).

Let m be a nonnegative integer. Let H_m be the linear space of homogeneous polynomials of degree m in two complex variables z_1 and z_2, provided with the scalar product

(p | q) = ∫_{|z_1|²+|z_2|²=1} p(z_1, z_2) q̄(z_1, z_2) dµ(z_1, z_2)


where µ denotes the Haar measure on SU(2), i.e. the uniform distribution on the sphere. Then H_m is a Hilbert space of dimension m + 1. Left translation of a polynomial is given by

(T_m(u) p)(z_1, z_2) = p(u_11 z_1 + u_21 z_2, u_12 z_1 + u_22 z_2)

where

u = [ u_11  u_12 ] ∈ SU(2).
    [ u_21  u_22 ]

We see that H_m is invariant under left translation, so for each m we have a unitary representation. One can show that the representations are irreducible and that any L²-function on SU(2) can be written as a sum of homogeneous polynomials, so any irreducible representation is isomorphic to the representation on H_m for some m. The group SU(2) has the same topology as the sphere in 4 dimensions, which is simply connected. Therefore any projective (finite dimensional) representation of SU(2) is equal to a unitary representation times a complex function of modulus 1. Therefore we are interested in the unitary representations of SU(2). The unitary matrix

u_θ = [ exp(iθ)     0     ]
      [    0    exp(−iθ)  ]

corresponds to a rotation by 2θ in SO(2). Let P_k be the homogeneous polynomial P_k(z_1, z_2) = z_1^{m−k} z_2^k (k = 0, 1, ..., m). Then

T_m(u_θ) P_k(z_1, z_2) = (exp(iθ) z_1)^{m−k} (exp(−iθ) z_2)^k = exp(i(m − 2k)θ) P_k(z_1, z_2).

On span(P_0, P_m) the unitary T_m(u_θ) has the matrix

[ exp(imθ)     0     ]
[    0     exp(−imθ) ]

so the restriction of the representation of SU(2) to span(P_0, P_m) is a spin m/2 representation of SO(2). The (m + 1)-dimensional representation of SU(2) is therefore called the spin m/2 representation. It is possible to show that any irreducible unitary representation of SU(2) is isomorphic to a spin m/2 representation for some nonnegative integer m.

Let U_g denote a unitary representation of SU(2). Then the representation on the density matrices is given by S ↦ U_g S U_g*, which is a unitary transformation of the Hilbert space of matrices with the inner product (S | T) = Tr(ST*). It can be considered as the tensor product of the representation U_g with itself. The character of a unitary representation is defined as

χ(g) = Tr(U_g).

The character satisfies that if U and V are unitary representations then

χU⊕V(g) = χU(g) + χV(g),
χU⊗V(g) = χU(g) χV(g),

and χU*(g) is the complex conjugate of χU(g). For the spin representations we have, for n ≥ m,

χTm(g) χTn(g) = χTn−m(g) + χTn−m+2(g) + ... + χTn+m−2(g) + χTn+m(g).

In particular, since χTm is real,

χTm(g) χTm*(g) = (χTm(g))² = Σ_{j=0}^{m} χT2j(g).

Therefore the complex unitary representation Tm induces all the orthogonal real representations of even order up to 2m. Similarly

χTm⊕1(g) χ(Tm⊕1)*(g) = χTm(g) χTm*(g) + χTm(g) + χTm*(g) + 1,

so the complex unitary representation Tm ⊕ 1 induces the orthogonal real representation corresponding to Tm.
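These character identities are easy to check numerically. The following sketch (numpy; the test angle and the spins are arbitrary values chosen for illustration) evaluates the characters at a diagonal element uθ and verifies the product formula and the decomposition of |χTm|²:

```python
import numpy as np

def chi(m, theta):
    # Character of the spin-m/2 representation evaluated at
    # u_theta = diag(e^{i theta}, e^{-i theta}): the eigenvalues of
    # T_m(u_theta) on P_0, ..., P_m are e^{i(m - 2k) theta}.
    return sum(np.exp(1j * (m - 2 * k) * theta) for k in range(m + 1))

theta = 0.7            # arbitrary test angle
m, n = 2, 3            # two spin representations with n >= m

# Product formula for spin characters.
lhs = chi(m, theta) * chi(n, theta)
rhs = sum(chi(n - m + 2 * j, theta) for j in range(m + 1))
assert abs(lhs - rhs) < 1e-12

# |chi_Tm|^2 = chi_T0 + chi_T2 + ... + chi_T2m.
assert abs(abs(chi(m, theta)) ** 2
           - sum(chi(2 * j, theta) for j in range(m + 1))) < 1e-12
```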

3.3 Superposition

Once again the simplest example, with rotations in 2 dimensions, will be treated. Consider a quantum mechanical system containing 2 independent preparations which can be rotated independently around a given axis. The group T² acts on the set of preparations and we will assume that it also acts on the state space. We look for irreducible representations of T². The group T² is commutative and connected, so an irreducible representation is either trivial or the state space is isomorphic to ∆, and a rotation by the angles α, β corresponds to a rotation of the unit disc by the angle kα + lβ where (k, l) ∈ Z². We will concentrate on the case (k, l) = (1, −1). If one of the angles is fixed we get a spin-1/2 representation in the other. The standard interpretation is that one has made a preparation of 2 spin-1/2 particles. Now the system is equipped with a detector which can give the results detected or not detected. As in the previous section the measurement can be written as a convex combination of a measurement not depending on the state and a measurement giving the result with certainty for some states. We will assume that the detector is of the latter type.


Let φ be a pure state which with certainty gives the result detection. Then

P(detection | φ rotated by angles α and β)
= (cos(α − β) + 1)/2
= (cos α cos β + sin α sin β + 1)/2
= (⟨(cos α, sin α) | (cos β, sin β)⟩ + 1)/2
= ‖(cos α, sin α) + (cos β, sin β)‖²/4.

The standard interpretation is that a particle in a state given by the vector (cos α, sin α) interferes with another particle in a state given by the vector (cos β, sin β), or, even worse if the states are not in phase, that the particles extinguish each other. This kind of language usage has been a source of many paradoxes. In general the superposition principle can be explained using representations of products of groups with themselves.
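The identity behind this computation can be confirmed numerically (a small numpy sketch with arbitrary angles):

```python
import numpy as np

# The detection probability (cos(a - b) + 1)/2 equals a quarter of the
# squared length of the sum of the two unit vectors, as in the derivation.
def p_detect(a, b):
    return (np.cos(a - b) + 1) / 2

a, b = 0.3, 1.1                        # arbitrary rotation angles
u = np.array([np.cos(a), np.sin(a)])
v = np.array([np.cos(b), np.sin(b)])
assert np.isclose(p_detect(a, b), np.dot(u + v, u + v) / 4)
assert np.isclose(p_detect(a, b), (np.dot(u, v) + 1) / 2)
```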

4 Algebras

4.1 The group algebra

Let G be a finite group with n elements and neutral element e. A composition (convolution) on C(G) is defined by

(f ∗ g)(x) = Σ_{y∈G} f(xy⁻¹) g(y).

An involution is defined by f∗(x) = f̄(x⁻¹), where the bar denotes complex conjugation. With this structure C(G) is a so-called finite dimensional ∗-algebra, called the group algebra of G. Thus symmetry groups lead to matrix algebras in a natural way. Define φ(f) = f(e)


and note that φ(1) = 1. We also have

φ(f ∗ f∗) = (f ∗ f∗)(e)
= Σ_{y∈G} f(ey⁻¹) f∗(y)
= Σ_{y∈G} f(y⁻¹) f̄(y⁻¹)
= Σ_{y∈G} |f(y)|²
≥ 0

with equality if and only if f = 0. A function φ on an algebra satisfying these conditions is called a faithful state. Here one should note that (f ∗ f∗)(x) may be negative for some values of x.

Example 17 The group Z4 = Z/4Z is an example of a commutative group. For n = 0, 1, 2, 3 consider the function fn : j → i^(nj), j ∈ Z4, where i is the imaginary unit. These functions form a basis for C(Z4). If m ≠ n then fm ∗ fn = 0. Therefore each of the functions fn generates a 1 dimensional sub-algebra of C(Z4), and C(Z4) is isomorphic to the sum of these algebras.

There exist groups that have isomorphic group algebras although the groups themselves are not isomorphic. Therefore part of the structure of the group is lost by forming the group algebra, but often what is lost is not important for the applications we have in mind, and losing irrelevant structure is what sometimes makes life easier.
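Example 17 can be verified directly; a small numpy sketch of the convolution on Z4:

```python
import numpy as np

def conv(f, g, n=4):
    # Convolution on the cyclic group Z_n: (f * g)(x) = sum_y f(x - y) g(y).
    return np.array([sum(f[(x - y) % n] * g[y] for y in range(n))
                     for x in range(n)])

i = 1j
f = [np.array([i ** (n * j) for j in range(4)]) for n in range(4)]

# f_m * f_n = 0 for m != n, while f_n * f_n = 4 f_n,
# so each f_n spans a one dimensional sub-algebra of C(Z_4).
assert np.allclose(conv(f[1], f[2]), 0)
assert np.allclose(conv(f[1], f[1]), 4 * f[1])
```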

4.2 ∗-Algebras

In order to describe quantum systems in more detail we will introduce as an axiom that the state space of a quantum system geometrically has the same shape as the set of states of a complex ∗-algebra. In this section we shall study certain sets of matrices and their algebraic properties. Both the Hilbert spaces and the algebras are over the field of complex numbers. Hilbert spaces and algebras over other fields than C will not be discussed, and it is important to realize that many of the results cannot be generalized to other fields.

4.3 Spectral theory

Let H be a finite dimensional complex Hilbert space. The linear maps H → H will be called operators, and the set of operators on H will be denoted B(H). The characteristic polynomial of X is

Pchar(λ) = det(X − λI),

and λ ∈ C is an eigenvalue of X if and only if Pchar(λ) = 0. The set of eigenvalues of X is called the spectrum of X and is denoted Sp(X). A complex polynomial of degree d has at least one root and at most d roots, so Sp(X) is a finite non-empty set. Let P denote a polynomial given by P(z) = a_n z^n + a_{n−1} z^{n−1} + ... + a_1 z + a_0. Then P(X) is defined by P(X) = a_n X^n + a_{n−1} X^{n−1} + ... + a_1 X + a_0 · 1. If X is self-adjoint, i.e. X = X*, then eigenvectors corresponding to different eigenvalues are orthogonal, and using an orthonormal basis of eigenvectors X can be written as diag(λ1, λ2, ...) with the eigenvalues in the diagonal. Then P(X) can be written as diag(P(λ1), P(λ2), ...).

Let f be a complex function with domain Sp(X), where X is self-adjoint. Then f(X) is defined as the linear map with matrix diag(f(λ1), f(λ2), ...). In this way an operator is associated to any function with domain Sp(X). We see that (f + g)(X) = f(X) + g(X) and (f · g)(X) = f(X) · g(X).
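A minimal numpy sketch of this functional calculus (the test matrix is an arbitrary choice):

```python
import numpy as np

def apply_fn(f, X):
    # Functional calculus for a self-adjoint matrix X:
    # diagonalize and apply f to the eigenvalues.
    w, V = np.linalg.eigh(X)
    return V @ np.diag(f(w)) @ V.conj().T

X = np.array([[2.0, 1.0], [1.0, 2.0]])   # self-adjoint, Sp(X) = {1, 3}
Y = apply_fn(np.sqrt, X)
# (f . f)(X) = f(X) . f(X): here sqrt(X)^2 recovers X.
assert np.allclose(Y @ Y, X)
```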

4.4 ∗-algebras and their decomposition

Definition 18 A finite dimensional ∗-algebra A over the complex numbers is a finite dimensional vector space over C equipped with a composition · : A × A → A and a map ∗ : A → A such that the following identities hold:

1. X · (Y · Z) = (X · Y) · Z for all X, Y, Z ∈ A.
2. X · (Y + Z) = X · Y + X · Z for all X, Y, Z ∈ A.
3. λ(X · Y) = λX · Y = X · λY for all λ ∈ C and X, Y ∈ A.
4. (X*)* = X for all X ∈ A.
5. (X + Y)* = X* + Y* for all X, Y ∈ A.
6. (X · Y)* = Y* · X* for all X, Y ∈ A.
7. (λX)* = λ̄X* for all λ ∈ C and X ∈ A.

The elements in a ∗-algebra are called operators. A ∗-algebra is commutative if X · Y = Y · X for all X, Y ∈ A. The algebra is said to have a unit (or to be unital) if there exists an operator E which is a neutral element with respect to multiplication. A standard argument proves that a ∗-algebra can have at most one unit. Unless stated otherwise, the ∗-algebras in these notes are always assumed to be unital, and the unit will be denoted 1.

Denote by B(H) the set of linear maps H → H. It is easy to check that B(H) is a finite dimensional ∗-algebra. If H has dimension d then B(H) can be identified with the algebra of d × d complex matrices. Let C(U) denote the set of complex functions U → C where U is a finite set. Then C(U) is a commutative ∗-algebra with the usual addition, multiplication and conjugation. The rest of this section is devoted to a classification of all finite dimensional ∗-algebras, and the result will be that B(H) is the most non-commutative algebra and that any such algebra lies somewhere in between algebras like C(U) and B(H). For a self-adjoint operator X ∈ B(H) the algebra C(Sp(X)) is isomorphic to the smallest unital ∗-algebra in B(H) containing X.

An operator X in a ∗-algebra is said to be self-adjoint if X = X*. An operator is said to be positive if there exists an operator Y such that X = Y*Y. Positive operators are automatically self-adjoint.

Definition 19 An element P in the ∗-algebra A is called an orthogonal projection if P is self-adjoint and P² = P.

If P is an orthogonal projection then 1 − P is also an orthogonal projection. The set of operators in A which commute with P is a sub-algebra B of A, and X → PXP + (1 − P)X(1 − P) is a projection of A onto B. The pair of operators P and 1 − P is an example of a resolution of the identity, which will now be defined in full generality.

Definition 20 Let X1, X2, ..., Xn be a set of operators in a ∗-algebra. Then X1, X2, ..., Xn is said to be a resolution of the identity if all the operators are positive and

Σ_{i=1}^{n} Xi = 1.

If Xi Xj = 0 for i ≠ j then the resolution of the identity is said to be orthogonal.


Let P1, P2, ..., Pn be an orthogonal resolution of the identity. Then

Pi = Pi · 1 = Pi · Σ_{j=1}^{n} Pj = Pi²

for all i = 1, 2, ..., n, so all the Pi are projections. For all X ∈ A define E(X) by

E(X) = Σ_{i=1}^{n} Pi X Pi.
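The map E can be checked numerically; a small numpy sketch (the two coordinate projections on C³ are an arbitrary choice of orthogonal resolution of the identity):

```python
import numpy as np

# E(X) = P1 X P1 + P2 X P2 for the orthogonal resolution of the identity
# given by two complementary coordinate projections on C^3.
P1 = np.diag([1.0, 1.0, 0.0])
P2 = np.diag([0.0, 0.0, 1.0])

def E(X):
    return P1 @ X @ P1 + P2 @ X @ P2

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
assert np.allclose(E(E(X)), E(X))           # E is idempotent
assert np.allclose(E(X) @ P1, P1 @ E(X))    # E(X) commutes with P1
assert np.allclose(E(X) @ P2, P2 @ E(X))    # ... and with P2
```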

Then E is a projection of A onto the sub-algebra B consisting of the operators which commute with all Pi.

Let A and B be ∗-algebras. Then the sum of the algebras is denoted A ⊕ B, where addition and multiplication are defined componentwise. Let P be a projection in A commuting with all operators in A. Then PA and (1 − P)A are sub-algebras of A, and the algebra A is isomorphic to PA ⊕ (1 − P)A. A ∗-algebra is said to be simple if it cannot be written as a non-trivial sum of ∗-algebras. We see that a simple algebra has no central projections except 0 and 1.

Definition 21 Let φ be a linear map A → C. Then φ is called a state if X ≥ 0 implies φ(X) ≥ 0, and φ(1) = 1. The extreme states are called pure states. If φ(X*X) = 0 implies X = 0 then φ is said to be faithful.

Example 22 Let U be a finite set and C(U) the ∗-algebra of functions U → C. Then for any probability vector (p1, p2, ..., pn) on U a state is given by

f → Σ_{i=1}^{n} f(i) pi,

which is the mean value of the random variable f. The state is faithful if and only if pi > 0 for i = 1, 2, ..., n.

Theorem 23 Let φ be a state on the finite dimensional ∗-algebra B(H). Then there exists a positive operator Sφ ∈ B(H) with Tr(Sφ) = 1 such that φ(X) = Tr(XSφ) for all X in B(H). A state is pure if and only if there exists a unit vector ~u ∈ H such that φ(X) = ⟨~u | X~u⟩. The state φ is faithful if and only if 0 is not an eigenvalue of Sφ.


Proof. First we observe that the formula (X | Y) = Tr(XY*) defines an inner product on B(H). Therefore there exists an operator S in B(H) such that φ(X) = (X | S) = Tr(XS*). For any vector ~v in H a positive operator X is defined by the linear map ~u → ⟨~u | ~v⟩ · ~v. The operator XS* gives the linear map ~u → ⟨S*~u | ~v⟩ · ~v = ⟨~u | S~v⟩ · ~v, and the trace of XS* is ⟨~v | S~v⟩. Therefore ⟨~v | S~v⟩ = φ(X) ≥ 0 for all ~v ∈ H, and S is a positive operator.

A state is thus given by a positive operator Sφ which has trace one. Choosing a suitable basis it can be written as diag(λ1, λ2, ..., λd), where λi ≥ 0 and Σ λi = 1. The matrices of this form constitute a simplex, and the extreme points are the matrices where one of the eigenvalues is 1 and the others are 0. If φ is extreme, choose ~u as an eigenvector of Sφ with eigenvalue 1.

The operator Sφ defined in the theorem is called the density operator corresponding to the state φ. Let X be an operator in A1 ⊕ A2, and let P1 and P2 be the projections on A1 and A2. Let φ be a state on A1 ⊕ A2 such that φ(Pi) ≠ 0. Then

φ(X) = φ(P1X + P2X) = φ(P1X) + φ(P2X) = φ(P1) · φ(P1X)/φ(P1) + φ(P2) · φ(P2X)/φ(P2).

Remark that

X → φ(PiX)/φ(Pi)

is a state on Ai. In ordinary probability theory this is the well-known Bayes' formula. In our more general setup it tells us that a state on a sum of algebras is a mixture of states on the individual algebras. To find the states on a ∗-algebra we therefore have to write the algebra as a sum of algebras where the set of states is known.

A representation of a finite dimensional ∗-algebra A is a pair (π, H) where H is a Hilbert space and π : A → B(H) is a ∗-homomorphism. The representation is said to be cyclic if there exists a (cyclic) vector ~v in H such that {π(X)~v | X ∈ A} = H. The following theorem is known as the Gelfand-Naimark-Segal construction.
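Theorem 23 can be illustrated numerically; the following numpy sketch (with an arbitrarily chosen density matrix S) checks that φ(X) = Tr(XS) is normalized and positive on positive operators:

```python
import numpy as np

# An arbitrarily chosen density operator on C^2 (positive, trace one)
# and the state it induces as in Theorem 23.
S = np.array([[0.75, 0.25], [0.25, 0.25]])
assert np.all(np.linalg.eigvalsh(S) > 0) and np.isclose(np.trace(S), 1)

def phi(X):
    return np.trace(X @ S)

assert np.isclose(phi(np.eye(2)), 1)                 # phi(1) = 1
X = np.array([[1.0, 1j], [0.5, 2.0]])
pos = X.conj().T @ X                                 # a positive operator
assert phi(pos).real >= 0 and abs(phi(pos).imag) < 1e-12
# Since no eigenvalue of S is 0, phi is faithful: phi(X*X) = 0 forces X = 0.
```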


Theorem 24 (Gelfand-Naimark-Segal construction) Let A be a finite dimensional ∗-algebra with a faithful state φ. Then there exists a cyclic representation (πφ, Hφ, ~vφ) such that φ(X) = (πφ(X)~vφ | ~vφ). The ∗-homomorphism πφ is injective.

Proof. Put Hφ = A equipped with the inner product (X | Y) = φ(Y*X). The ∗-homomorphism is given by πφ(X)Y = XY. Then

(πφ(X)Y | Z) = (XY | Z) = φ(Z*XY) = φ((X*Z)*Y) = (Y | X*Z) = (Y | πφ(X*)Z).

Put ~vφ = 1 ∈ Hφ. Then (πφ(X)~vφ | ~vφ) = φ(1*X1) = φ(X). The vector ~vφ is cyclic because πφ(X)~vφ = X · 1 = X. The ∗-homomorphism πφ is injective because πφ(X)~vφ = πφ(Y)~vφ if and only if X = Y.

A representation π : A → B(H) is said to be irreducible if the only subspaces that are invariant under the action of A are the trivial subspaces.

Theorem 25 Any representation π : A → B(H) of a finite dimensional ∗-algebra is a sum of irreducible representations.

Proof. Let K be an invariant subspace of H. Then for u ∈ K and v ∈ K⊥ we have

(u | π(X)v) = (π(X)*u | v) = (π(X*)u | v) = 0


because X* ∈ A and K is invariant under the action of A. Therefore K⊥ is also invariant under the action of A. Let P denote the projection of H onto K and Q the projection onto K⊥. Then

π(X) = (P + Q) π(X) (P + Q) = Pπ(X)P + Pπ(X)Q + Qπ(X)P + Qπ(X)Q = Pπ(X)P + Qπ(X)Q.

Now we note that X → Pπ(X)P and X → Qπ(X)Q are representations. In this way we can decompose a representation until we have a sum of irreducible representations.

A representation is said to be transitive if any vector different from ~0 is cyclic. It is straightforward to check that a representation is irreducible if and only if it is transitive.

Theorem 26 (Burnside's Theorem) Any irreducible representation π : A → B(H) is surjective.

The proof given here builds on the master's thesis of Rune Johansen.

Proof. The proof is by induction on the dimension of H. For dim(H) = 1 the result is obvious. For the induction step assume that the theorem holds for any Hilbert space W with dim(W) ≤ n, and let dim(H) = n + 1. If π(A) = C · 1 then π is not transitive, so π(A) ≠ C · 1. Let X ∈ π(A)\C · 1 be some matrix. Then X has an eigenvalue λ; define F = X − λ · 1. Note that F ≠ 0 and that F is not invertible. Thus the range of F is a proper subspace of H which shall be denoted W. An algebra B is defined by

B = { F A|W | A ∈ π(A) }.

Let ~x be any vector in W different from ~0. Then

B~x = {B~x | B ∈ B} = {F A|W ~x | A ∈ π(A)} = F {A~x | A ∈ π(A)} = F(H) = W.

Thus B is transitive on W, and according to the induction hypothesis B = B(W). In particular B contains a one dimensional projection P on some vector ~u, given by P(~x) = (~x | ~u) · ~u. Now

π(A1) P π(A2) (~x) = (π(A2)~x | ~u) · π(A1)~u = (~x | π(A2*)~u) · π(A1)~u.

Using transitivity of π we see that π(A) contains all mappings of the form ~x → (~x | ~v) · ~w, and therefore π(A) = B(H).

Theorem 27 Let A be a finite dimensional unital ∗-algebra with a faithful state φ. Then A is isomorphic to a sum of matrix algebras. If A is commutative then A is isomorphic to C(U) for some finite set U.

Example 28 Let G be the set of permutations of the elements of {1, 2, 3}. This group has 6 elements, so the group algebra is 6 dimensional. A matrix algebra of dimension less than or equal to 6 has dimension 1 or 4. If the group algebra were a sum of 1 dimensional matrix algebras then C(G) and G would be commutative. Therefore the group algebra is the sum of 2 one dimensional algebras and one four dimensional algebra. The constant function x → 1 commutes with any other element f ∈ C(G) because

(f ∗ 1)(x) = Σ_{y∈G} f(xy⁻¹) · 1 = Σ_{y∈G} 1 · f(y) = (1 ∗ f)(x).

Therefore the constant functions form a 1 dimensional sub-algebra of the group algebra. Similarly the function which is 1 on the even permutations and −1 on the odd permutations commutes with any other element, so this function also generates a 1 dimensional sub-algebra of the group algebra. The functions which are orthogonal to these 2 functions then form a sub-algebra of the group algebra isomorphic to the algebra of 2 × 2 matrices.

Using the decomposition of an algebra into matrix algebras it follows that any state ψ on A is given by a positive operator Sψ ∈ A with Tr(Sψ) = 1 such that ψ(X) = Tr(Sψ X), where the trace is the sum of the traces on the matrix algebras. Until now we have seen density operators with respect to the trace. Similarly, if S ≥ 0 and φ(S) = 1 then the map X → φ(SX) is a state. All states are of this form if and only if φ is faithful.

4.5 Measurements in ∗-algebras

From now on we will assume that the state space can be represented by the set of states on a ∗-algebra. There are several reasons to choose ∗-algebras. First of all they seem to include all examples in mechanics (classical, quantum mechanical or statistical mechanical). Secondly, the category of ∗-algebras and ∗-homomorphisms is nice in the sense that one can form sums, tensor products and limits inside the category. Further one can perform the crossed product construction inside the category, where symmetries of the algebra are expressed as unitary operators in an extended algebra. This illustrates that the commutative part of the algebra has its origin in a commutative algebra of random variables, while the non-commutative properties reflect symmetries. Our next goal is to

make representations of the set of measurements. According to the general definitions a measurement with values in U is given by an affine mapping from the state space into the set of probability distributions on U. Let Xu1, Xu2, ..., Xun be a resolution of the identity with U = {u1, u2, ..., un}. For B ⊆ U put

MB = Σ_{u∈B} Xu.

Then the map B → MB is called a positive operator valued measure (POVM). POVMs corresponding to orthogonal resolutions of the identity are sometimes called spectral measures, and it is possible to integrate with respect to such spectral measures.

Theorem 29 Let A be a finite dimensional unital ∗-algebra with a faithful state φ. The relation

µψ(B) = ψ(MB) = Tr(Sψ MB),  B ⊆ U,

establishes a bijective correspondence between measurements on the state space S(A) and POVMs MB in A.

Proof. Let MB be a POVM. Then it is easy to check that µψ(B) = ψ(MB), B ⊆ U, defines a measurement. Conversely, for any u ∈ U the probability µψ(u) is a linear function of ψ. Therefore there exists an operator Xu ∈ A such that

µψ(u) = (Sψ | Xu) = Tr(Sψ Xu*).

If ψ is a pure state corresponding to the vector ~v ∈ H, then µψ(u) = (~v | Xu~v) ≥ 0. Therefore Xu ≥ 0. For B ⊆ U we have

µψ(B) = Σ_{u∈B} µψ(u) = Σ_{u∈B} Tr(Sψ Xu) = Tr(Sψ Σ_{u∈B} Xu) = Tr(Sψ MB).

Example 30 In a commutative ∗-algebra C(U) the simple measurements correspond to partitions of U.

Let µ be a simple measurement. Then φ → Tr(Sφ MB) is extreme in the set of functionals S(A) → [0; 1]. All operators MB satisfy 0 ≤ MB ≤ 1, i.e. all eigenvalues of MB are in [0; 1]. The extreme operators satisfying 0 ≤ MB ≤ 1 can only have the eigenvalues 0 and 1, and therefore MB is a projection. Then the corresponding resolution of the identity is orthogonal.

Theorem 31 There is a unique correspondence between simple measurements µ with values in R and self-adjoint operators Xµ such that

Σ_{u∈U} f(u) µφ(u) = φ(f(Xµ))

for every function f on U.

Proof. Let Mu, u ∈ U ⊆ R, be the orthogonal resolution of the identity corresponding to µ and put Xµ = Σ_u u · Mu. Then Xµ is a self-adjoint matrix, and according to the spectral theorem the correspondence is uniquely determined.

A simple measurement with values in R is called an observable. The theorem tells us that observables can be identified with self-adjoint operators. Let X be an observable. If the state is φ then the mean of X is φ(X) and the variance of X is

Var(X) = φ((X − φ(X))²).

Let X and Y be self-adjoint operators. Then the commutator of X and Y is given by [X, Y] = XY − YX. The following theorem is a version of Heisenberg's uncertainty relation.

Theorem 32 Let X and Y be self-adjoint operators in a ∗-algebra A with state φ. Then

Var(X) Var(Y) ≥ ¼ (φ(i[X, Y]))².

Proof. First remark that Var(X) = Var(X − φ(X)) and

[X − φ(X), Y] = (X − φ(X))Y − Y(X − φ(X)) = XY − YX = [X, Y].

Therefore X can be replaced by X − φ(X) in the formula we have to prove, and similarly Y can be replaced by Y − φ(Y). The algebra can be equipped with the inner product (X | Y) = φ(XY*). Then for all c ∈ R we have

0 ≤ ‖X − icY‖² = Var(X) + c² Var(Y) + c φ(i[X, Y]).

Since this quadratic in c is nonnegative for all c, its discriminant is non-positive, i.e.

(φ(i[X, Y]))² − 4 Var(X) Var(Y) ≤ 0.

Theorem 33 If MB is an orthogonal resolution of the identity then MB MC = 0 if B ∩ C = ∅ and MB² = MB. All extreme measurements are simple if and only if U has only 2 elements or the algebra is commutative.

Proof. Using that the measurement is extreme, MB is a projection and therefore MB² = MB. Then MB(1 − MB)MB = MB² − MB MB = 0, and for C disjoint from B we get 0 ≤ MC ≤ 1 − MB and further MB MC MB = 0. This proves that MB MC² MB = (MB MC)(MB MC)* = 0, and therefore MB MC = 0.

If U has 2 elements then a measurement is given by 2 operators M and M′ where M′ = 1 − M. Therefore the set of measurements is isomorphic to the set of operators between 0 and 1. The extreme measurements are exactly the projections and give the orthogonal resolutions of the identity.

We have to prove that if U has at least 3 elements and the algebra is not commutative then there exists an extreme measurement which is not simple. If this is true when U contains 3 elements then it is true also when U has more than 3 elements. Therefore, assume that U has 3 elements. The algebra is not commutative, and therefore there exists a sub-algebra isomorphic to B(C²). The matrices below are imbedded in the algebra via this sub-algebra. Define

Mu = (1/3) [ 1                      cos(θu) + i sin(θu)
             cos(θu) − i sin(θu)    1                   ]

where θu takes the values 0°, 120° and 240°. Then (Mu) is a resolution of the identity, and each matrix Mu has eigenvalues 0 and 2/3. Assume that M′u and M″u are 2 resolutions of the identity such that Mu = ½(M′u + M″u). Then M′u = αu · Mu because Mu is 2/3 of a projection. Then

1 = Σ_u M′u = Σ_u αu · Mu = [ Σ αu/3                          Σ (αu/3)(cos θu + i sin θu)
                              Σ (αu/3)(cos θu − i sin θu)      Σ αu/3                     ].

This shows that αu = 1 for all u, and therefore M′u = Mu.
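The properties of the operators Mu used in the proof are easy to verify numerically (a small numpy sketch):

```python
import numpy as np

# The matrices M_u from the proof, with theta_u = 0, 120 and 240 degrees.
def M(t):
    return np.array([[1, np.exp(1j * t)], [np.exp(-1j * t), 1]]) / 3

ts = [0, 2 * np.pi / 3, 4 * np.pi / 3]
assert np.allclose(sum(M(t) for t in ts), np.eye(2))    # resolution of 1
for t in ts:
    evals = np.linalg.eigvalsh(M(t))
    assert np.allclose(np.sort(evals), [0, 2 / 3])      # 2/3 of a projection
    assert not np.allclose(M(t) @ M(t), M(t))           # not a projection
```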
The theorem shows that in general it is not sufficient to consider simple measurements in quantum theory. Many interesting measurements are described by non-simple measurements. Let the Hilbert space H0 be imbedded as a subspace of the Hilbert space H, and let P be the projection onto H0. If MB is a measurement in A ⊆ B(H) then P MB P is a measurement in H0.

Theorem 34 For any measurement MB in A ⊆ B(H) there exists a Hilbert space L containing H and a simple measurement EB in B(L) such that MB = P EB P, where P is the projection of L onto H.

Proof. Let L be the set of mappings U → H. Put

⟨f | g⟩ = Σ_u (f(u) | M_u(g(u))).

Then L is a pre-Hilbert space. Let L denote the completion of L with respect to ⟨· | ·⟩. The map l : v → (u → v) is an isometry of H into L, and H can be identified with a subspace of L. Let Eu be the operator f → f · 1u. Then (Eu) obviously is an orthogonal resolution of the identity. For v ∈ H we have

(v | M_u v) = (l(v) | E_u(l(v))),

which proves that MB = P EB P.
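The dilation can be realized concretely. The sketch below uses a standard construction equivalent to the one in the proof: stacking the square roots √Mu gives an isometry W, the block projections Eu form a simple measurement on the larger space, and compressing back recovers the POVM (numpy; the trine POVM from the proof of Theorem 33 is reused as the example):

```python
import numpy as np

def sqrtm_psd(M):
    # Square root of a positive semidefinite matrix via the spectral theorem.
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

# The trine POVM on C^2 (the M_u from the proof of Theorem 33).
Ms = [np.array([[1, np.exp(1j * t)], [np.exp(-1j * t), 1]]) / 3
      for t in (0, 2 * np.pi / 3, 4 * np.pi / 3)]

# Stack the square roots sqrt(M_u) into an isometry W: C^2 -> C^3 (x) C^2.
W = np.vstack([sqrtm_psd(M) for M in Ms])        # shape (6, 2)
assert np.allclose(W.conj().T @ W, np.eye(2))    # isometry: sum_u M_u = 1

# E_u = |u><u| (x) 1 is an orthogonal resolution of the identity on C^6,
# and compressing it back down to C^2 recovers the POVM: M_u = W* E_u W.
for u, M in enumerate(Ms):
    E = np.zeros((6, 6), dtype=complex)
    E[2 * u:2 * u + 2, 2 * u:2 * u + 2] = np.eye(2)
    assert np.allclose(W.conj().T @ E @ W, M)
```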

5 Group representations on a Hilbert space

Let G be a connected group acting on the state space of a sum of matrix algebras. Let φ be a pure state. Then ag(φ) is a pure state for each g in G. Therefore the set {ag(φ) | g ∈ G} is a connected set of pure states, so all the states ag(φ) are states on one of the matrix algebras in the sum. If G is a connected group with an irreducible action on the state space of a ∗-algebra with a faithful state, then the ∗-algebra is isomorphic to a matrix algebra. The most important groups in physics are connected, and therefore we will restrict our attention to state spaces of the form B1+(H) where H is a Hilbert space. An operator V from the Hilbert space H into itself is called anti-unitary if it is conjugate linear and satisfies (V~u | V~v) = (~v | ~u). The following theorem is due to Wigner.

Theorem 35 Any automorphism of the state space B1+(H) is of the form S → V SV*, where S denotes the density operator of a state, and where V is a unitary or anti-unitary operator on the Hilbert space H.

It is straightforward to check that V SV* is a density operator when S is a density operator. We will not prove the converse in the general case, but in the case dim(H) = 2 it is easy and instructive. If dim(H) = 2 then the state space has the shape of a ball in 3 real dimensions. Therefore an automorphism of the state space maps the ball onto itself. If the ball is centered at the origin then an automorphism is given by an orthogonal map in 3 real dimensions. An orthogonal map is given by a rotation around an axis together with, if the map reverses orientation, a reflection in a plane orthogonal to the axis. By

a suitable choice of basis the density operators can be identified with 2 × 2 density matrices such that the axis of rotation is identified with the matrices

[ 1/2    ic
  −ic    1/2 ].

An orientation preserving rotation is given by

V = [ cos θ    sin θ
      −sin θ   cos θ ].

The reflection

[ 1/2 + a    b − ic          [ 1/2 + a    b + ic
  b + ic     1/2 − a ]   →     b − ic     1/2 − a ]

is given by the anti-unitary operator

[ x + iy          [ x − iy
  z + iw ]   →      z − iw ].

Remark 36 An automorphism given by S → V SV* maps the pure state corresponding to the vector ~v into the pure state corresponding to V~v.

It is important to note that the unitary or anti-unitary operator is only unique up to a scalar factor of unit modulus; that is, V can be multiplied by ω ∈ C, |ω| = 1, without changing the state V SV*. Now let G act on the state space. Then for each g ∈ G there exists a unitary or anti-unitary operator Vg in H such that ag(S) = Vg S Vg*. Further

Vg Vh S Vh* Vg* = Vgh S Vgh*,   g, h ∈ G,

for all density operators S. Therefore there exists a complex function (g, h) → ω(g, h) with |ω(g, h)| = 1 such that

Vg Vh = ω(g, h) Vgh,   g, h ∈ G.                (2)

Assume that G is a topological group which is connected, and assume that g → Vg is continuous. Then Vg is unitary because it is not possible to pass continuously from a unitary to an anti-unitary operator.

Definition 37 A continuous map G → U(H) such that (2) is satisfied is called a projective unitary representation of the group G on the Hilbert space H. If ω(g, h) = 1 for all g, h ∈ G then the representation is said to be unitary. A projective unitary representation is said to be irreducible if {0} and H are the only subspaces of H invariant under Vg for all g ∈ G.

Remark that if a projective unitary representation gives an irreducible action of the group on the state space, then the projective unitary representation is irreducible, but the opposite is not necessarily the case.

Theorem 38 Let Vθ be a projective unitary representation of R on a complex Hilbert space H of dimension d. Then there exists a self-adjoint operator A and numbers αθ with |αθ| = 1 such that Vθ = αθ exp(iθA).

Proof. First observe that det(Vθ) is a continuous function R → {z ∈ C | |z| = 1}. Then there exists a uniquely determined continuous function αθ with α0 = 1 such that αθ^d = det(Vθ). Then

det(Vθ/αθ) = det(Vθ)/αθ^d = 1.

Thus Ṽθ = Vθ/αθ is a projective unitary representation R → SU(H) with

Ṽθ Ṽζ = ω̃(θ, ζ) Ṽθ+ζ.                (3)

Taking the determinant on each side of (3) leads to

1 = (ω̃(θ, ζ))^d.

The equation z^d = 1 has the d unit roots as solutions, which form a discrete set. The map (θ, ζ) → ω̃(θ, ζ) from R² to C is continuous and ω̃(0, 0) = 1. Therefore ω̃(θ, ζ) = 1 for all (θ, ζ) ∈ R², and the representation Ṽθ is unitary.

Let ~v1, ~v2, ..., ~vd be eigenvectors of the unitary operator Ṽθ. Then ~v1, ~v2, ..., ~vd are also eigenvectors of Ṽnθ for every positive integer n. This is proved by induction on n: for n = 1 it is obvious, and if ~v1, ..., ~vd are eigenvectors of Ṽnθ then Ṽ(n+1)θ = Ṽnθ+θ = Ṽnθ Ṽθ has the same eigenvectors. Therefore, if L is the eigenspace corresponding to an eigenvalue λ of Ṽθ, then L is invariant under Ṽmθ/n for all integers m and n, and by continuity L is invariant under Ṽθ′ for all θ′. Therefore H can be written as a sum of Hilbert spaces Li such that Ṽθ has only one eigenvalue λθ^i on the invariant subspace Li. Then θ → λθ^i is a one dimensional unitary representation, so there exists ai such that λθ^i = exp(iθai). Let Pi denote the projection of H onto Li and put A = Σ ai Pi. Then

Vθ = αθ Ṽθ = αθ Σ_i λθ^i Pi = αθ Σ_i exp(iθai) Pi = αθ exp(iθA).

An important example of the group R acting on a state space is time translation. Assume that a (closed) quantum system at time 0 is in a state described by the density operator S0. Then the theorem states that there exists a self-adjoint operator H such that the state at time t is described by the density operator

St = exp(−iHt/ℏ) S0 exp(iHt/ℏ)

where ℏ = h/2π and h is Planck's constant, and A in the theorem is given by −H/ℏ. In this case the self-adjoint operator H is called the Hamiltonian and the corresponding observable is called the energy observable. Taking the derivative of this equation we get

iℏ dSt/dt = [H, St],

where [A, B] is the commutator of the operators A and B given by AB − BA. If S0 is a pure state given by the vector ψ then the time evolution is given by the Schrödinger equation

iℏ dψt/dt = H ψt.

If the group of time translations and time reversals is considered, the group is no longer connected, and the time reversals should be described by anti-unitary operators. Remark also that

Et(H) = Tr(exp(−iHt/ℏ) S0 exp(iHt/ℏ) H) = Tr(exp(−iHt/ℏ) S0 H exp(iHt/ℏ)) = Tr(S0 H) = E0(H),

so the mean energy is constant. Now put Sθ = exp(iθA) S exp(−iθA), where A is self-adjoint, and let X be any self-adjoint operator. Then we get the Mandelstam-Tamm inequality

Varθ(X) Varθ(A) ≥ ¼ (Tr(Sθ i[X, A]))² = ¼ (d/dθ Eθ(X))².
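The time evolution St = exp(−iHt/ℏ) S0 exp(iHt/ℏ) and the constancy of the mean energy can be checked numerically (a numpy sketch in units with ℏ = 1; the Hamiltonian and initial state are arbitrary choices):

```python
import numpy as np

# Time evolution S_t = e^{-iHt} S_0 e^{iHt} (units with hbar = 1) for a
# 2-level system; the mean energy Tr(S_t H) stays constant in time.
H = np.array([[1.0, 0.5], [0.5, -1.0]])
S0 = np.array([[0.9, 0.2], [0.2, 0.1]])          # density matrix, trace 1

w, V = np.linalg.eigh(H)
U = lambda t: V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

for t in (0.0, 0.5, 2.0):
    St = U(t) @ S0 @ U(t).conj().T
    assert np.isclose(np.trace(St @ H).real, np.trace(S0 @ H).real)
    assert np.isclose(np.trace(St).real, 1.0)    # still a state
```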

Remark that

Varθ(A) = Tr(Sθ (A − E(A))²) = Tr(exp(iθA) S exp(−iθA) (A − E(A))²) = Tr(exp(iθA) S (A − E(A))² exp(−iθA)) = Tr(S (A − E(A))²) = Var(A).

Now consider the measurement corresponding to X as an estimator of θ. The estimator is said to be unbiased if Eθ(X) = θ; then (d/dθ) Eθ(X) = 1. Therefore for an unbiased estimator X of θ we have

Varθ(X) ≥ 1/(4 Var(A)).

This is a kind of non-commutative Cramér-Rao inequality.

Let Ug be a unitary representation of a group G on a Hilbert space H, and let A be the group algebra of G. Then a ∗-homomorphism of A into B(H) is given by

f → Σ_{g∈G} f(g) Ug,

where f in the group algebra is a function G → C. Therefore finding all unitary representations is equivalent to finding all ∗-homomorphisms of A into B(H). Now it is useful to know that A is a sum of matrix algebras. Using this technique we just have to find ∗-homomorphisms of matrix algebras, or equivalently find unitary representations of SU(n) for different values of n.

Example 39 Let G = R² be the set of 1 dimensional space translations and changes of velocity. Let Wx,v denote the unitary operator corresponding to a spatial shift of x and a velocity translation of v. Put Vx = Wx,0 and Uv = W0,v. Then x → Vx and v → Uv are projective representations, and we can assume that they are unitary representations. Now (x, v) = (x, 0) + (0, v) = (0, v) + (x, 0), but the corresponding unitary operators should give the same change of state

S → Uv Vx S Vx* Uv* = Vx Uv S Uv* Vx*

for any S. It follows that

Uv Vx = exp(iη(x, v)) Vx Uv,

where (x, v) → η(x, v) is a real continuous function. For x = 0 or v = 0 this implies 1 = exp(iη(0, v)) = exp(iη(x, 0)), so we can put η(x, 0) = η(0, v) = 0 since V0 = U0 = 1. Then

η(x, v + v′) = η(x, v) + η(x, v′) mod 2π.

The only continuous solutions to this equation satisfying η(x, 0) = 0 are of the form η(x, v) = η(x) · v. In the same way η(x + x′, v) = η(x, v) + η(x′, v) mod 2π. Therefore η(x, v) = mxv for some real constant m, which is called the mass. We thus get

Uv Vx = exp(imxv) Vx Uv.

This is called Weyl's canonical commutation relation (CCR). Since Wx,v is Vx Uv up to an arbitrary factor of unit modulus we can choose

Wx,v = exp(imxv/2) Vx Uv.

Assume that the representation $(x, v) \to W_{x,v}$ is irreducible. Then $m = 0$ implies $[V_x, U_v] = 0$, and the only possibility is the one dimensional representation $(x, v) \to \exp(i(\alpha x + \beta v))$ with $\alpha, \beta \in \mathbb{R}$. The case with $m > 0$ is more important. The constant $m$ is (proportional to) the mass of the quantum system (particle) associated with the representation. The group $\mathbb{R}^2$ is simply connected. Therefore no finite dimensional projective representation can be associated with a quantum system with $m \neq 0$.

Assume that $U_v = \exp(ivA)$ and $V_x = \exp(ixB)$. Then the derivative of Weyl's CCR with respect to $x$ and $v$ at $(x, v) = (0, 0)$ is

$$iA\,iB = \frac{im}{2} + iB\,iA,$$

which is equivalent to

$$[A, B] = \frac{m}{2i}.$$

This is called Heisenberg's canonical commutation relation. Then

$$\mathrm{Var}(A)\,\mathrm{Var}(B) \ge \frac{1}{4}\,\phi(i[A, B])^2 = \frac{1}{4}\left(\frac{m}{2}\right)^2 = \frac{m^2}{16}.$$

Let $H$ be the infinite dimensional Hilbert space $L^2(\mathbb{R})$ equipped with the usual scalar product. Then unitary operators are defined by

$$V_x\psi(\xi) = \psi(\xi - x), \qquad U_v\psi(\xi) = \exp(imv\xi)\,\psi(\xi).$$

Then a projective representation is given by

$$W_{x,v}\psi(\xi) = \exp(imxv/2)\, V_x U_v \psi(\xi) = \exp\Bigl(imv\Bigl(\xi - \frac{x}{2}\Bigr)\Bigr)\,\psi(\xi - x).$$
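Weyl's CCR can be checked numerically on a crude discretization of $L^2(\mathbb{R})$. The grid, the mass m and the Gaussian ψ below are illustrative choices; the shift x is taken as a whole number of grid steps so that translation is exact, and wrap-around at the grid boundary is negligible because ψ vanishes there.

```python
import numpy as np

# Numerical check of Weyl's CCR  U_v V_x = exp(imxv) V_x U_v  on a crude
# discretization of L^2(R). All parameter values are illustrative choices.
N, L = 1024, 40.0
xi = np.linspace(-L/2, L/2, N, endpoint=False)
m = 1.3                         # the mass constant in the CCR
k = 37                          # shift in grid steps
x = k * (L / N)                 # spatial shift
v = 0.7                         # velocity translation

psi = np.exp(-xi**2)            # a Gaussian, negligible at the grid boundary

V = lambda f: np.roll(f, k)               # (V_x psi)(xi) = psi(xi - x)
U = lambda f: np.exp(1j*m*v*xi) * f       # (U_v psi)(xi) = exp(imv xi) psi(xi)

lhs = U(V(psi))                           # U_v V_x psi
rhs = np.exp(1j*m*x*v) * V(U(psi))        # exp(imxv) V_x U_v psi
assert np.allclose(lhs, rhs, atol=1e-12)
```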

6 Crossed products

We have seen that groups and ∗-algebras are closely related. With the tools that we now have available we can make a new construction that emphasizes this relation. Let $\mathcal{A}$ be a ∗-algebra represented on a Hilbert space $H$, and let $\alpha$ be an action of a group $G$ on $\mathcal{A}$. If $g \to U_g$ is a unitary representation of the group then an action $\alpha$ is given by $\alpha_g(X) = U_g X U_g^*$. If $U_g \in \mathcal{A}$ for all $g$ then the action $\alpha$ is said to be an inner action.

Proposition 40 Let $G$ be a group acting on $\mathcal{A}$. Then the inner automorphisms form a normal subgroup of $G$.

Proof. Assume that $\alpha_{g_1}(X) = U_{g_1} X U_{g_1}^*$ and let $\alpha_{g_2}$ be another automorphism. Then

$$\begin{aligned}
\alpha_{g_2}\alpha_{g_1}\alpha_{g_2}^{-1}(X) &= \alpha_{g_2}\bigl(U_{g_1}\,\alpha_{g_2}^{-1}(X)\,U_{g_1}^*\bigr)\\
&= \alpha_{g_2}(U_{g_1})\,\alpha_{g_2}\bigl(\alpha_{g_2}^{-1}(X)\bigr)\,\alpha_{g_2}\bigl(U_{g_1}^*\bigr)\\
&= \alpha_{g_2}(U_{g_1})\,X\,\bigl(\alpha_{g_2}(U_{g_1})\bigr)^*,
\end{aligned}$$

so $\alpha_{g_2}\alpha_{g_1}\alpha_{g_2}^{-1}$ is given by the unitary operator $\alpha_{g_2}(U_{g_1})$.

Theorem 41 If $\mathcal{A}$ is a simple ∗-algebra with a faithful state then any automorphism of $\mathcal{A}$ is inner.

Proof. A simple ∗-algebra has the form $B(H)$ for some Hilbert space $H$. Let $P_1, P_2, \dots, P_n$ be orthogonal one dimensional projections in $\mathcal{A}$ such that $\sum_j P_j = 1$. Then

$$\sum_j \alpha(P_j) = \alpha\Bigl(\sum_j P_j\Bigr) = \alpha(1) = 1.$$

We know that $\alpha(P_j)$ is a projection different from 0 for all $j$, and therefore $\alpha(P_j)$ must be an orthogonal one dimensional projection. Let $v_j$ be the unit vector corresponding to $P_j$ and $w_j$ the unit vector corresponding to $\alpha(P_j)$. Then there exists a unique unitary operator $U$ such that $U(v_j) = w_j$. With these definitions

$$\begin{aligned}
\alpha(X) &= \alpha\Bigl(\sum_{j,k} P_j X P_k\Bigr)\\
&= \alpha\Bigl(\sum_{j,k} (v_j \otimes v_j^*)\, X\, (v_k \otimes v_k^*)\Bigr)\\
&= \alpha\Bigl(\sum_{j,k} (X v_k \mid v_j)\,(v_j \otimes v_k^*)\Bigr)\\
&= \sum_{j,k} (X v_k \mid v_j)\,(w_j \otimes w_k^*)\\
&= \sum_{j,k} (X U^* w_k \mid U^* w_j)\,(w_j \otimes w_k^*)\\
&= \sum_{j,k} (U X U^* w_k \mid w_j)\,(w_j \otimes w_k^*)\\
&= U X U^*.
\end{aligned}$$

We see that if $\alpha$ is an inner automorphism of a simple ∗-algebra then the associated unitary operator $U$ is uniquely determined modulo a factor $\lambda \in \mathbb{C}$ with $|\lambda| = 1$. Thus any action of a group on a simple ∗-algebra is given by a projective unitary representation of $G$. An automorphism that is not inner is said to be outer.

We shall now embed the algebra in a larger algebra such that the group action is given by a unitary representation that is inner. We shall assume that the algebra is represented on a Hilbert space $H$. A new Hilbert space $L^2(G, H)$ is defined as the set of functions from $G$ to $H$. The group $G$ is assumed to be compact so that it has a Haar measure $\mu$, and the inner product on $L^2(G, H)$ is defined by

$$(f \mid h) = \int_G (f(g) \mid h(g))\, d\mu(g).$$

The algebra is embedded in the operators on $L^2(G, H)$ as follows. An element $g \in G$ is embedded in $B(L^2(G, H))$ as the unitary operator $U_g$,

$$U_g(f)(g_0) = f\bigl(g^{-1} * g_0\bigr).$$

Let $X$ be an operator. Then $\pi(X) \in B(L^2(G, H))$ is given by

$$\pi(X)(f)(g) = \alpha_g^{-1}(X)\,(f(g)).$$

One easily checks that $g \to U_g$ and $\pi$ are embeddings. Next we see that

$$\begin{aligned}
U_g \pi(X) U_g^*(f)(g_0) &= \pi(X) U_g^*(f)\bigl(g^{-1} * g_0\bigr)\\
&= \alpha_{g_0^{-1} * g}(X)\,\bigl(U_g^*(f)\bigl(g^{-1} * g_0\bigr)\bigr)\\
&= \alpha_{g_0^{-1} * g}(X)\,(f(g_0))\\
&= \alpha_{g_0}^{-1}(\alpha_g(X))\,(f(g_0))\\
&= \pi(\alpha_g(X))(f)(g_0).
\end{aligned}$$

Thus $U_g \pi(X) U_g^* = \pi(\alpha_g(X))$.

Definition 42 The algebra generated by the subalgebra $\pi(\mathcal{A})$ and the operators $U_g$ is called a crossed product and is denoted $\mathcal{A} \rtimes_\alpha G$.

Any element in $\mathcal{A} \rtimes_\alpha G$ can be written as

$$\sum_{g \in G} \pi(X_g)\, U_g$$

where $g \to X_g$ is an arbitrary function from $G$ to $\mathcal{A}$. We see that

$$\pi(X) = \sum_{g \in G} \pi\bigl(\alpha_g^{-1}(X)\bigr) U_g$$

and

$$U_{g_0} = \sum_{g \in G} \pi(\delta_{g, g_0})\, U_g.$$

We have

$$\begin{aligned}
\sum_{g_1 \in G} \pi(X_{g_1}) U_{g_1} \sum_{g_2 \in G} \pi(Y_{g_2}) U_{g_2}
&= \sum_{g_1 \in G} \sum_{g_2 \in G} \pi(X_{g_1})\, U_{g_1} \pi(Y_{g_2}) U_{g_1}^*\, U_{g_1} U_{g_2}\\
&= \sum_{g_1 \in G} \sum_{g_2 \in G} \pi\bigl(X_{g_1}\, \alpha_{g_1}(Y_{g_2})\bigr)\, U_{g_1 g_2}\\
&= \sum_{g \in G} \Bigl(\sum_{g_1 g_2 = g} \pi\bigl(X_{g_1}\, \alpha_{g_1}(Y_{g_2})\bigr)\Bigr) U_g\\
&= \sum_{g \in G} \Bigl(\sum_{g_1 \in G} \pi\bigl(X_{g_1}\, \alpha_{g_1}\bigl(Y_{g_1^{-1} g}\bigr)\bigr)\Bigr) U_g.
\end{aligned}$$
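For a finite group the embeddings π and U_g are concrete block matrices. The sketch below uses an assumed toy example (not from the text): G = Z/2 acting on the algebra of diagonal 2×2 matrices by swapping the diagonal entries, and it verifies the covariance relation U_g π(X) U_g* = π(α_g(X)).

```python
import numpy as np

# Toy crossed product setup (an assumed example): G = Z/2 acts on the algebra
# A of diagonal 2x2 matrices by swapping the diagonal entries. On
# l^2(G, H) = C^2 (x) C^2, block r of a vector is its value at group element r.
P = np.array([[0.0, 1.0], [1.0, 0.0]])            # unitary implementing the swap
alpha = lambda g, X: X if g % 2 == 0 else P @ X @ P

def U(g):
    # (U_g f)(g0) = f(g^{-1} g0): for g = 1 this swaps the two blocks
    return np.kron(np.eye(2) if g % 2 == 0 else P, np.eye(2))

def pi(X):
    # (pi(X) f)(g0) = alpha_{g0}^{-1}(X) f(g0); note alpha_g^{-1} = alpha_g in Z/2
    Z = np.zeros((4, 4))
    Z[:2, :2] = alpha(0, X)
    Z[2:, 2:] = alpha(1, X)
    return Z

X = np.diag([1.0, 2.0])
for g in (0, 1):
    # the covariance relation U_g pi(X) U_g^* = pi(alpha_g(X))
    assert np.allclose(U(g) @ pi(X) @ U(g).T, pi(alpha(g, X)))
```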

Example 43 If $\mathcal{A} = \mathbb{C}$ then the Hilbert space can be chosen to be one dimensional and can be identified with $\mathbb{C}$. In this case $L^2(G, H)$ consists simply of the functions on $G$. The elements of $\mathcal{A} \rtimes_\alpha G$ have the form

$$\sum_{g \in G} f(g)\, U_g$$

where $f$ is a complex function on $G$. Then

$$\Bigl(\sum_{g \in G} f(g)\, U_g\Bigr)\Bigl(\sum_{g \in G} h(g)\, U_g\Bigr) = \sum_{g \in G} \Bigl(\sum_{g_1 \in G} f(g_1)\, h\bigl(g_1^{-1} * g\bigr)\Bigr) U_g = \sum_{g \in G} (f * h)(g)\, U_g.$$

Therefore $\mathbb{C} \rtimes_\alpha G$ is isomorphic to the group algebra of $G$.

If $\phi$ is a state on $\mathcal{A}$ then a state on $\mathcal{A} \rtimes_\alpha G$ is defined by

$$\phi\Bigl(\sum_{g \in G} \pi(X_g)\, U_g\Bigr) = \int \phi(\alpha_g(X_g))\, d\mu(g).$$

With this definition

$$\phi(\pi(X)) = \phi\Bigl(\sum_{g \in G} \pi\bigl(\alpha_g^{-1}(X)\bigr) U_g\Bigr) = \int \phi\bigl(\alpha_g \alpha_g^{-1}(X)\bigr)\, d\mu(g) = \int \phi(X)\, d\mu(g) = \phi(X).$$



Assume that ρ : G → U (H) is a unitary representation such that ρ (g) ∈ A ∗ for all g ∈ G and such that α (X) = ρ (g) Xρ (g) . Then A n G can be projected into A via X X l: π (Xg ) Ug → Xg ρ (g) g∈G

g∈G

We shall show that it is a homomorphism. Obviously it is linear.     X X X X l π (Xg1 ) Ug1  l  π (Xg2 ) Ug2  = Xg1 ρ (g1 ) Xg2 ρ (g2 ) g1 ∈G

g2 ∈G

g1 ∈G

=

g2 ∈G ∗

X X

Xg1 ρ (g1 ) Xg2 ρ (g1 ) ρ (g1 ) ρ (g2 )

g1 ∈G g2 ∈G

=

X X

Xg1 αg1 (Xg2 ) ρ (g1 g2 )

g1 ∈G g2 ∈G

=

  Xg1 αg1 Xg−1 g ρ (g)

X X

1

g∈G g1 ∈G

 = l

 X X





π Xg1 αg1 Xg−1 g 1



Ug 

g∈G g1 ∈G

 = l

 X

π (Xg1 ) Ug1

g1 ∈G

X

π (Xg2 ) Ug2  .

g2 ∈G

Note that we actively use that ρ is a unitary representation and not only a projective unitary representation. Next we shall show that l ◦ π is the identity. For this (l ◦ π) (X) = l (π (X) Ue ) = Xρ (e) = X. For A = C we know that A n G is the group algebra which is |G|-dimensional so if G is not trivial then A will be a proper subalgebra of A n G. This is a major disadvantage about crossed products: In general the crossed product is an algebra that is much bigger than it ought to be for our applications. Let G be a group acting on A and let N be a normal subgroup of G. Then there also an action of N on G. It is interesting to compare A n N with A n G. The quotient G/N acts on A n N via   X X α ˜ g0  π (Xg ) Ug  = π (αg0 (Xg )) Ug0 gg−1 . 0

g∈N

g∈N

40

We have to check that this is a homomorphism. We have

$$\begin{aligned}
\tilde{\alpha}_{g_0}\Bigl(\sum_{g_1 \in N} \pi(X_{g_1}) U_{g_1}\Bigr)\, \tilde{\alpha}_{g_0}\Bigl(\sum_{g_2 \in N} \pi(Y_{g_2}) U_{g_2}\Bigr)
&= \sum_{g_1 \in N} \sum_{g_2 \in N} \pi(\alpha_{g_0}(X_{g_1}))\, U_{g_0 g_1 g_0^{-1}}\, \pi(\alpha_{g_0}(Y_{g_2}))\, U_{g_0 g_2 g_0^{-1}}\\
&= \sum_{g_1 \in N} \sum_{g_2 \in N} \pi(\alpha_{g_0}(X_{g_1}))\, U_{g_0 g_1 g_0^{-1}} \pi(\alpha_{g_0}(Y_{g_2}))\, U_{g_0 g_1 g_0^{-1}}^*\, U_{g_0 g_1 g_2 g_0^{-1}}\\
&= \sum_{g_1 \in N} \sum_{g_2 \in N} \pi(\alpha_{g_0}(X_{g_1}))\, \pi\bigl(\alpha_{g_0 g_1 g_0^{-1}}\, \alpha_{g_0}(Y_{g_2})\bigr)\, U_{g_0 g_1 g_2 g_0^{-1}}\\
&= \sum_{g_1 \in N} \sum_{g_2 \in N} \pi\bigl(\alpha_{g_0}(X_{g_1})\, \alpha_{g_0 g_1}(Y_{g_2})\bigr)\, U_{g_0 g_1 g_2 g_0^{-1}}\\
&= \sum_{g \in N} \sum_{g_1 \in N} \pi\Bigl(\alpha_{g_0}\bigl(X_{g_1}\, \alpha_{g_1}\bigl(Y_{g_1^{-1} g}\bigr)\bigr)\Bigr)\, U_{g_0 g g_0^{-1}}\\
&= \tilde{\alpha}_{g_0}\Bigl(\sum_{g \in N} \sum_{g_1 \in N} \pi\bigl(X_{g_1}\, \alpha_{g_1}\bigl(Y_{g_1^{-1} g}\bigr)\bigr)\, U_g\Bigr).
\end{aligned}$$

We also have

$$\begin{aligned}
(\tilde{\alpha}_{g_1} \circ \tilde{\alpha}_{g_2})\Bigl(\sum_{g \in N} \pi(X_g)\, U_g\Bigr)
&= \tilde{\alpha}_{g_1}\Bigl(\sum_{g \in N} \pi(\alpha_{g_2}(X_g))\, U_{g_2 g g_2^{-1}}\Bigr)\\
&= \sum_{g \in N} \pi(\alpha_{g_1}(\alpha_{g_2}(X_g)))\, U_{g_1 g_2 g g_2^{-1} g_1^{-1}}\\
&= \sum_{g \in N} \pi(\alpha_{g_1 g_2}(X_g))\, U_{g_1 g_2 g (g_1 g_2)^{-1}}\\
&= \tilde{\alpha}_{g_1 g_2}\Bigl(\sum_{g \in N} \pi(X_g)\, U_g\Bigr).
\end{aligned}$$

Next we shall see that $(\mathcal{A} \rtimes N) \rtimes G/N$ is isomorphic to $\mathcal{A} \rtimes G$.

7 Tensor products

7.1 Tensor products of Hilbert spaces

Let $H$ and $K$ be finite dimensional vector spaces. A bilinear map $f$ from $H \times K$ into a vector space $L$ is a map which satisfies:

• For $\vec{v} \in K$ the map $\vec{u} \to f(\vec{u}, \vec{v})$ is linear.

• For $\vec{u} \in H$ the map $\vec{v} \to f(\vec{u}, \vec{v})$ is linear.

Now we will define a new vector space called the tensor product of $H$ and $K$. Let $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_m)$ be an orthonormal basis of $H$ and let $(\vec{v}_1, \vec{v}_2, \dots, \vec{v}_n)$ be an orthonormal basis of $K$. Then $\mathbb{C}^{nm}$ has a structure as a vector space of dimension $nm$. Choose an orthonormal basis of $\mathbb{C}^{nm}$ and label the $nm$ basis vectors $\vec{w}_{i,j}$, $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$. Then the map $l: H \times K \to \mathbb{C}^{nm}$ defined by

$$l\Bigl(\sum_i x_i \vec{u}_i, \sum_j y_j \vec{v}_j\Bigr) = \sum_{i,j} x_i y_j \cdot \vec{w}_{i,j}$$

is bilinear. Let $\vec{u} \in H$ and $\vec{v} \in K$; then $l(\vec{u}, \vec{v})$ is called the tensor product of $\vec{u}$ and $\vec{v}$ and is denoted $\vec{u} \otimes \vec{v}$. We see that $\otimes$ is distributive in the sense that

$$(\vec{u} + \vec{w}) \otimes \vec{v} = \vec{u} \otimes \vec{v} + \vec{w} \otimes \vec{v} \quad\text{and}\quad \vec{v} \otimes (\vec{u} + \vec{w}) = \vec{v} \otimes \vec{u} + \vec{v} \otimes \vec{w}.$$

The resulting space is called the tensor product of $H$ and $K$ and is denoted $H \otimes K$. An equivalent way to define the tensor product is the following. Identify the vector space $H$ with the set $C(U)$ for some finite set $U$ and identify the vector space $K$ with $C(V)$ for some finite set $V$. Then the tensor product of $H$ and $K$ is identified with $C(U \times V)$. The map $l$ is then identified with the map $l(f, g): (u, v) \to f(u)\, g(v)$.

Theorem 44 Let $H$, $K$ and $L$ be vector spaces, and let $f: H \times K \to L$ be a bilinear map. Then there exists a unique linear map $g: H \otimes K \to L$ such that $f(\vec{u}, \vec{v}) = g(\vec{u} \otimes \vec{v})$. Assume that $M$ is a vector space and $l: H \times K \to M$ is a bilinear map such that for any bilinear map $f: H \times K \to L$ there exists a linear map $g: M \to L$ such that $f(u, v) = g(l(u, v))$. Then there exists an injective map $g: H \otimes K \to M$ such that $l(\vec{u}, \vec{v}) = g(\vec{u} \otimes \vec{v})$.

Proof. Let $f: H \times K \to L$ be a bilinear map. Let $(\vec{u}_1, \dots, \vec{u}_m)$ be an orthonormal basis of $H$ and let $(\vec{v}_1, \dots, \vec{v}_n)$ be an orthonormal basis of $K$. If $\vec{u} = \sum x_i \vec{u}_i$ and $\vec{v} = \sum y_j \vec{v}_j$ then

$$f(\vec{u}, \vec{v}) = \sum_{i,j} x_i y_j \cdot f(\vec{u}_i, \vec{v}_j).$$

Therefore the linear map $g$ is uniquely determined by $g(\vec{u}_i \otimes \vec{v}_j) = f(\vec{u}_i, \vec{v}_j)$. Let $M$ be a vector space with the properties stated. Then there exists a $g: H \otimes K \to M$ such that $l(\vec{u}, \vec{v}) = g(\vec{u} \otimes \vec{v})$, so we just have to prove that $g$ is injective. Let $\vec{w}$ be a vector in $H \otimes K$. Then the map $f: H \times K \to \mathbb{C}$ given by $f(\vec{u}, \vec{v}) = (\vec{u} \otimes \vec{v} \mid \vec{w})$ is bilinear, and there exists a linear map $h: M \to \mathbb{C}$ such that

$$(\vec{u} \otimes \vec{v} \mid \vec{w}) = f(\vec{u}, \vec{v}) = h(l(\vec{u}, \vec{v})) = h(g(\vec{u} \otimes \vec{v})).$$

If $g(\vec{w}) = 0$ and $\vec{w} = \sum_i \vec{u}^{\,i} \otimes \vec{v}^{\,i}$ then

$$(\vec{w} \mid \vec{w}) = \sum_i \bigl(\vec{u}^{\,i} \otimes \vec{v}^{\,i} \mid \vec{w}\bigr) = \sum_i h\bigl(g\bigl(\vec{u}^{\,i} \otimes \vec{v}^{\,i}\bigr)\bigr) = h(g(\vec{w})) = 0$$

and $\vec{w} = 0$. Therefore the map $g: H \otimes K \to M$ is injective.

Let $\oplus$ denote the direct sum of vector spaces. Then the tensor product is distributive in the sense that

$$H \otimes (K \oplus L) = (H \otimes K) \oplus (H \otimes L) \quad\text{and}\quad (K \oplus L) \otimes H = (K \otimes H) \oplus (L \otimes H).$$
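Theorem 44 can be illustrated numerically: a bilinear form f(u, v) = uᵀMv is determined by the unique linear functional g on H ⊗ K whose coefficient vector is M itself. The data below are arbitrary illustrative choices, with np.kron playing the role of ⊗.

```python
import numpy as np

# Theorem 44, concretely: the bilinear form f(u, v) = u^T M v factors through
# the tensor product as a linear functional g on kron(u, v), with coefficient
# vector M.reshape(-1). All data are arbitrary illustrative choices.
rng = np.random.default_rng(1)
m, n = 3, 4
M = rng.standard_normal((m, n))
u = rng.standard_normal(m)
v = rng.standard_normal(n)

f_uv = u @ M @ v             # the bilinear map f(u, v)
g = M.reshape(-1)            # the induced linear map on H (x) K
assert np.allclose(f_uv, g @ np.kron(u, v))
```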

7.2 Tensor products of ∗-algebras

The construction of tensor products can be extended to ∗-algebras. Let $\mathcal{A}$ and $\mathcal{B}$ be finite dimensional ∗-algebras, each of them unital and with a faithful state. Then $\mathcal{A}$ and $\mathcal{B}$ both have a structure as a vector space. Therefore it is possible to define $\mathcal{A} \otimes \mathcal{B}$ as the vector space tensor product of $\mathcal{A}$ and $\mathcal{B}$. The tensor product is organized as an algebra via the product given by

$$(X \otimes Y)(V \otimes W) = XV \otimes YW.$$

If $\mathcal{A}$ is represented on $H$ via $\pi: \mathcal{A} \to B(H)$ and $\mathcal{B}$ is represented on $K$ via $\rho: \mathcal{B} \to B(K)$ then $\mathcal{A} \otimes \mathcal{B}$ is represented on $H \otimes K$ via

$$(\pi \otimes \rho)(X \otimes Y)(\vec{u} \otimes \vec{v}) = \pi(X)(\vec{u}) \otimes \rho(Y)(\vec{v}).$$

The special case where $\mathcal{A} = B(H)$ and $\mathcal{B} = B(K)$ gives a "natural" isomorphism $B(H) \otimes B(K) \to B(H \otimes K)$. It works as follows. For a pair of operators $(X, Y) \in B(H) \times B(K)$ a bilinear map from $H \times K$ into $H \otimes K$ is given by

$$(\vec{u}, \vec{v}) \to X(\vec{u}) \otimes Y(\vec{v}).$$

This bilinear map is given by an operator $Z: H \otimes K \to H \otimes K$. The map which sends $(X, Y) \in B(H) \times B(K)$ to $Z \in B(H \otimes K)$ is bilinear and is therefore given by a linear map $B(H) \otimes B(K) \to B(H \otimes K)$. It is obviously injective, and therefore it must be an isomorphism because $B(H) \otimes B(K)$ and $B(H \otimes K)$ have the same dimension.
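Numerically the isomorphism B(H) ⊗ B(K) → B(H ⊗ K) is realized by the Kronecker product. The sketch below (with randomly chosen matrices, an illustrative check) verifies (X ⊗ Y)(u ⊗ v) = X(u) ⊗ Y(v) and the distributivity of ⊗.

```python
import numpy as np

# The isomorphism B(H) (x) B(K) -> B(H (x) K) as the Kronecker product:
# np.kron(X, Y) acts on np.kron(u, v) as X(u) (x) Y(v).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((4, 4))
u = rng.standard_normal(3)
v = rng.standard_normal(4)
w = rng.standard_normal(3)

assert np.allclose(np.kron(X, Y) @ np.kron(u, v), np.kron(X @ u, Y @ v))
# distributivity: (u + w) (x) v = u (x) v + w (x) v
assert np.allclose(np.kron(u + w, v), np.kron(u, v) + np.kron(w, v))
```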

8 Partial measurements

A measurement has a state as input and a probability distribution as output. A partial measurement has a state as input, but the output consists of two parts: a probability distribution and a state. The output state can then serve as input for a new measurement or a new partial measurement. The formal definition of a partial measurement is quite abstract and is given as follows. As usual, subsets of a set are identified with their indicator functions.

Definition 45 Let $\mathcal{A}$ and $\mathcal{B}$ be finite dimensional ∗-algebras embedded in the sets of operators on Hilbert spaces, and let $U$ be a finite set. A partial measurement on $\mathcal{A}$ with values in $U$ and output states in $\mathcal{B}$ is given by a positive map valued measure (PMVM) $E$ mapping subsets of $U$ into positive linear maps from $\mathcal{A}$ into $\mathcal{B}$ such that

$$E(1_{A \cup B}) = E(1_A) + E(1_B)$$

for disjoint subsets $A$ and $B$ of $U$, and such that $E(1)$ maps density operators into density operators.

The interpretation is as follows. If the input state is given by the density operator $S$ then the probability of measuring a result in $A \subseteq U$ is $\mathrm{Tr}(E(1_A)(S))$, and if a result in $A$ is observed then our knowledge of the output state is described by the density operator

$$\frac{E(1_A)(S)}{\mathrm{Tr}(E(1_A)(S))}.$$

If $A = \{u_1, u_2, \dots, u_n\}$ then by additivity

$$E(1_A) = \sum_{i=1}^{n} E(1_{u_i}).$$

Therefore $E$ is determined by its values on the elements of $U$. The partial measurement can formally be extended to admit functions $U \to \mathbb{C}$ as input:

$$E(f) = \sum_{u \in U} f(u)\, E(1_u).$$

Sometimes partial measurements are called instruments. It is possible to compose two partial measurements by using the output state of the first partial measurement as input of the second. The formal definition of the composed measurement is as follows.
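A concrete special case of Definition 45 is the projective (Lüders) instrument E(1_u)(S) = P_u S P_u; this is an assumed standard example, not a construction from the text. The sketch below computes outcome probabilities and unnormalized post-states for a qubit and checks that applying E(1_u) twice gives the same output (the repetitive property).

```python
import numpy as np

# Projective instrument (an assumed standard example): E(1_u)(S) = P_u S P_u
# for orthogonal projections P_u summing to the identity.
def instrument(projections, S):
    """Map a density matrix S to {outcome: (probability, P_u S P_u)}."""
    return {u: (np.trace(P @ S @ P).real, P @ S @ P)
            for u, P in projections.items()}

# Qubit example: measurement in the computational basis.
P = {0: np.diag([1.0, 0.0]), 1: np.diag([0.0, 1.0])}
S = np.array([[0.75, 0.25], [0.25, 0.25]])        # a density matrix

out = instrument(P, S)
assert abs(sum(p for p, _ in out.values()) - 1.0) < 1e-12
# applying E(1_u) twice gives the same unnormalized output (repetitivity)
for u, (p, T) in out.items():
    assert np.allclose(P[u] @ T @ P[u], T)
```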

Definition 46 Let $E_1$ be a partial measurement with input states in $\mathcal{A}$, output states in $\mathcal{B}$ and results in the finite set $U$, and let $E_2$ be a partial measurement with input states in $\mathcal{B}$, output states in $\mathcal{C}$ and results in the finite set $V$. Then a partial measurement with input states in $\mathcal{A}$, output states in $\mathcal{C}$ and results in $V \times U$ is defined by the formula

$$(E_2 \circ E_1)(f) = \sum_{(v,u) \in V \times U} f(v, u)\, E_2(1_v)\, E_1(1_u).$$

Remark 47 This definition relies heavily on the condition that $U$ and $V$ are finite sets. The definition of partial measurements with values in infinite sets would require additional care.

Theorem 48 If $E$ is an instrument then there exists a uniquely defined measurement $M_E$ such that

$$\frac{E(1_B)(\phi)}{M_{E,1_B}(\phi)} \in \mathcal{S}$$

for $1_B \in B(U)$.

Proof. For any measurable function $f$ there exists a real number $c_f(\phi)$ such that

$$\frac{E(f)(\phi)}{c_f(\phi)} \in \mathcal{S}.$$

Using the properties of $E$ we see that $M_{E,1_B}(\phi) = c_{1_B}(\phi)$ is a measurement.

It is possible to compose instruments:

Theorem 49 Let $E^i$ be an instrument with values in $U_i$ for $i = 1, 2$. If the $U_i$ are finite then there exists an instrument $E^2 \times E^1$ with values in $U_2 \times U_1$ such that

$$E^2 \times E^1(f_2 \times f_1)(S) = E^2(f_2)\, E^1(f_1)(S)$$

for all states $S$ and all $f_i \in B(U_i)$.

Proof. Using that $U_i$ is finite we can define

$$E^2 \times E^1(f)(S) = \sum_{(e_2, e_1) \in U_2 \times U_1} f(e_2, e_1) \cdot E^2(e_2)\, E^1(e_1)(S).$$

Similarly it is possible to compose an instrument with a measurement.

Definition 50 An instrument is said to be repetitive if

$$E(B)\, E(B) = E(B)$$

for all measurable sets $B$.
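The composition in Theorem 49 can be carried out concretely for two projective qubit instruments (z-basis followed by x-basis, an assumed toy example): the joint outcome probabilities form a probability distribution on the product of the outcome sets.

```python
import numpy as np

# Composing two projective qubit instruments (an assumed toy example):
# first a z-basis instrument E1, then an x-basis instrument E2 on its output.
Pz = {0: np.diag([1.0, 0.0]), 1: np.diag([0.0, 1.0])}
plus = np.full((2, 2), 0.5)              # projection onto (e0 + e1)/sqrt(2)
Px = {'+': plus, '-': np.eye(2) - plus}

def E(P, u, S):
    # unnormalized branch P_u S P_u; its trace is the outcome probability
    return P[u] @ S @ P[u]

S = np.array([[0.6, 0.3], [0.3, 0.4]])
joint = {(v, u): np.trace(E(Px, v, E(Pz, u, S))).real
         for u in Pz for v in Px}
# the joint outcome probabilities on V x U sum to one
assert abs(sum(joint.values()) - 1.0) < 1e-12
```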

Definition 51 The mapping $E(1): \mathcal{S} \to \mathcal{S}$ will be called the reduction of the state space.

Theorem 52 If an instrument $E$ with values in $U$ is repetitive then the corresponding measurement $M_E$ is simple on the reduced state space $E(1)(\mathcal{S})$.

Proof. Without loss of generality we may assume that $U$ contains 2 elements and that $E(1)(\mathcal{S}) = \mathcal{S}$. We have to prove that $M(1)$, $M(1_{u_1})$, $M(1_{u_2})$ and $M(0)$ are extreme functionals $\mathcal{S} \to [0; 1]$. We have $M(1) = 1$, $M(0) = 0$ and $M(1_{u_2}) = 1 - M(1_{u_1})$, and therefore it is sufficient to show that $M(1_{u_1})$ is extreme. Assume $M(1_{u_1}) = \frac{1}{2}\Phi_1 + \frac{1}{2}\Phi_2$. If $E(1_{u_1})(\phi) = 0$ then $M_\phi(1_{u_1}) = 0$ and $\Phi_i(\phi) = 0$. If $E(1_{u_1})(\phi) = \phi$ then $M_\phi(1_{u_1}) = 1$ and $\Phi_i(\phi) = 1$. In general we have

$$\phi = M_\phi(1_{u_1})\, \frac{E(1_{u_1})(\phi)}{M_\phi(1_{u_1})} + M_\phi(1_{u_2})\, \frac{E(1_{u_2})(\phi)}{M_\phi(1_{u_2})},$$

which proves that

$$\Phi_i(\phi) = M_\phi(1_{u_1}) \cdot 1 + M_\phi(1_{u_2}) \cdot 0 = M_\phi(1_{u_1}).$$

Theorem 53 Let $M$ be a simple measurement on $\mathcal{S}$. Then there exists at most one repetitive instrument $E_M$ such that $M$ is the measurement corresponding to $E_M$ and such that $E_M(1)$ is the identity on $\mathcal{S}$.

Proof. Let $E$ be a repetitive instrument such that $M$ is the corresponding measurement and such that $E(1)$ is the identity on $\mathcal{S}$. Let $B$ be a measurable subset of $U$ and let $\phi \in \mathcal{S}$ be an extreme state. Then

$$\phi = M_\phi(1_B)\, \frac{E(1_B)(\phi)}{M_\phi(1_B)} + M_\phi(1_{B^c})\, \frac{E(1_{B^c})(\phi)}{M_\phi(1_{B^c})},$$

which proves that

$$\phi = \frac{E(1_B)(\phi)}{M_\phi(1_B)}$$

and therefore $E(1_B)(\phi) = M_\phi(1_B) \cdot \phi$, so that $E(1_B)$ is uniquely determined on the extreme points of $\mathcal{S}$ and therefore on the whole of $\mathcal{S}$.

Theorem 54 Let $E$ be a repetitive instrument. Then $E$ is uniquely determined by its reduction $E(1)$ and its measurement $M_E$ via the formula

$$E = E_{M_E} \circ E(1).$$

Proof. This is an immediate consequence of the two previous theorems.

Although $E$ is repetitive, the corresponding measurement need not be simple on all of $\mathcal{S}$. To see this let $\mathcal{S}$ be the convex hull of the extreme points $\phi_1$, $\phi_2$ and $\phi_3$. Let an instrument $E$ with values in $U = \{u_1, u_2\}$ be given by

$$\begin{aligned}
E(1_{u_1})(\phi_1) &= \phi_1, & E(1_{u_2})(\phi_1) &= 0,\\
E(1_{u_1})(\phi_2) &= 0, & E(1_{u_2})(\phi_2) &= \phi_2,\\
E(1_{u_1})(\phi_3) &= \tfrac{1}{2}\phi_1, & E(1_{u_2})(\phi_3) &= \tfrac{1}{2}\phi_2.
\end{aligned}$$

Then $E(1)(\phi_1) = \phi_1$, $E(1)(\phi_2) = \phi_2$ and $E(1)(\phi_3) = \frac{1}{2}\phi_1 + \frac{1}{2}\phi_2$, which shows that $E(1) = \frac{1}{2}E_1(1) + \frac{1}{2}E_2(1)$ where

$$E_i(1)(\phi_1) = \phi_1, \qquad E_i(1)(\phi_2) = \phi_2, \qquad E_i(1)(\phi_3) = \phi_i.$$

This shows that $M = \frac{1}{2}M_1 + \frac{1}{2}M_2$, and therefore that $M$ is not extreme and therefore not simple.

As we have seen earlier, a repetitive instrument is completely determined by its measurement and its reduction. The following theorems give a complete description of these.

Theorem 55 Let $M$ be a simple measurement given by the orthogonal resolution of the identity $\{M_B, B \subseteq U\}$. If the instrument $E_M$ exists then it is given by

$$E_M(f)(S) = \Bigl(\int f\, dM\Bigr)\, E_M(1)(S) \tag{4}$$

and $M_B$ is an element of the commutant of $E_M(1)(\mathcal{W})$.

Proof. First we show that the $M_B$ are central. We have

$$\mathrm{tr}\Bigl(M_B\, \frac{E_M(1_B)(S)}{\mathrm{tr}(E_M(1_B)(S))}\Bigr) = 1,$$

which shows that $M_B$ commutes with $E_M(1_B)(S)$. Therefore $M_B$ commutes with

$$E_M(1)(S) = E_M(1_B)(S) + E_M(1_{B^c})(S),$$

and $M_B$ commutes with all density operators in $E_M(1)(\mathcal{W})$. This proves that (4) defines an instrument, so it is sufficient to remark that $E_M(1) = \mathrm{id}$ and that $M_{E_M} = M$, which is obvious. The general result is obtained using that a von Neumann algebra is the direct limit of its finite sub-algebras.

Theorem 56 Let $E$ be a repetitive instrument on a von Neumann algebra $\mathcal{W}$. Then $E(1)$ is given by a conditional expectation $\mathcal{E}: \mathcal{W} \to \mathcal{W}'$, where $\mathcal{W}'$ is a sub-algebra of $\mathcal{W}$, and $E(1)(\phi) = \phi \circ \mathcal{E}$.

Proof. A proof appears in Davies (1976) and only needs minor changes. A conditional expectation is typically of the form

$$\mathcal{E}(A) = \sum_i P_i A P_i$$

where $\{P_i\}$ is the measurement corresponding to the instrument.

Theorem 57 (Holevo 1986) For any measurement $M$ in a Hilbert space $H$ there exists a Hilbert space $K$ with a pure state $S_r$ and a simple measurement $E$ in $H \otimes K$ such that

$$\mu^E_{S \otimes S_r}(D) = \mu^M_S(D)$$

for any state $S$ on $H$.

Theorem 58 For any measurement $M_B$ in $\mathcal{A} \subseteq B(H)$ there exists a Hilbert space $L$ containing $H$ and a simple measurement $E_B$ in $B(L)$ such that $M_B = P E_B P$, where $P$ is the projection of $L$ onto $H$.

Proof. Let $L$ be the set of mappings $U \to H$. Put

$$\langle f \mid g \rangle = \sum_u \bigl(f(u) \mid M_{\{u\}}(g(u))\bigr).$$

Then $L$ is a Hilbert space; in general, let $L$ denote the completion with respect to $\langle \cdot \mid \cdot \rangle$. The map $l: v \to (u \to v)$ is an isometry of $H$ into $L$, and $H$ can be identified with a subspace of $L$. Let $E_{\{u\}}$ be the operator $f \to f \cdot 1_u$. Then $E$ obviously is an orthogonal resolution of the identity. For $v \in H$ we have

$$\bigl(v \mid M_{\{u\}} v\bigr) = \bigl(l(v) \mid E_{\{u\}}(l(v))\bigr),$$

which proves that $M_B = P E_B P$. The general result can be obtained by going to the limit.

Let $E$ be a resolution of the identity in a Hilbert space $L$ containing $H$ such that $M = P E P$. Then $L$ is isomorphic to $H \otimes L^2(U)$, and there exists an isomorphism such that $H = H \otimes 1_u$. Put $S_0 = |1_u\rangle\langle 1_u|$. For $X = E(D)$ we have

$$\mathrm{Tr}\bigl((S \otimes S_0)\, X\bigr) = \mathrm{Tr}(S P X P) = \mathrm{Tr}(S M(D)).$$

9 Entropy

Let $P = (p_1, p_2, \dots, p_n)$ be a probability vector. Then the entropy of $P$ is given by

$$H(P) = -\sum_{i=1}^{n} p_i \log p_i.$$

Here we have used the convention $0 \log 0 = 0$. The function $P \to H(P)$ is positive, concave and continuous because each of the functions $p_i \to -p_i \log p_i$ is concave and continuous. Here we will use the natural logarithm. Then $H\bigl(\frac{1}{2}, \frac{1}{2}\bigr) = \log 2$, and we say that the entropy of an experiment with the possible outcomes 0 and 1, each with probability $\frac{1}{2}$, is one bit. Using that $H(P)$ is concave and symmetric in its arguments we see that $H(P)$ is minimal on the deterministic distributions and maximal, with value $\log n$, on the uniform distribution. A sequence of length $d$ of independent zeros and ones, each with probability $\frac{1}{2}$, has entropy $-2^d \cdot \frac{1}{2^d} \log \frac{1}{2^d} = d \log 2$, i.e. $d$ bits. According to Shannon's first coding theorem the value of a random variable with entropy $H$ can be encoded in a sequence of zeros and ones of (mean) length approximately equal to $H$.
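The stated properties of H can be checked with a short numerical sketch (natural logarithm, convention 0 log 0 = 0):

```python
import numpy as np

def H(P):
    # Shannon entropy with the natural logarithm, convention 0 log 0 = 0
    P = np.asarray(P, dtype=float)
    nz = P[P > 0]
    return float(-np.sum(nz * np.log(nz)))

assert abs(H([0.5, 0.5]) - np.log(2)) < 1e-12        # one bit
assert H([1.0, 0.0]) == 0.0                          # deterministic: minimal
n = 5
assert abs(H(np.full(n, 1/n)) - np.log(n)) < 1e-12   # uniform: maximal, log n
d = 10
assert abs(H(np.full(2**d, 2.0**-d)) - d*np.log(2)) < 1e-12  # d bits
```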

Now let a state be given by a density matrix $S = \sum_{i=1}^{d} \lambda_i P_i$, where the $\lambda_i$ are eigenvalues and the $P_i$ are projections onto the corresponding eigenvectors. Then $\sum_{i=1}^{d} \lambda_i = 1$ and $\lambda_i \ge 0$. Now $\{P_i\}$ is a resolution of the identity, and the corresponding measurement maps $S$ into the probability vector $(\lambda_1, \lambda_2, \dots, \lambda_d)$. Then

$$H(\lambda_1, \lambda_2, \dots, \lambda_d) = -\sum_{i=1}^{d} \lambda_i \log \lambda_i = -\mathrm{Tr}(S \log S).$$

This is called the entropy of $S$ and is denoted $H(S)$. We see that the entropy satisfies:

• $H(S) \ge 0$ with equality if and only if $S$ is a pure state.

• $H(S) \le \log d$ with equality if and only if $S = \frac{1}{d} \cdot 1$.

Later we shall see how to encode a quantum state with entropy $H(S)$ into approximately $H(S)$ qubits. The entropy only depends on the eigenvalues of the state; therefore the entropy is invariant under a unitary transformation:

$$H(U S U^*) = -\mathrm{Tr}\bigl(U S U^* \log(U S U^*)\bigr) = -\mathrm{Tr}\bigl(U S \log(S)\, U^*\bigr) = -\mathrm{Tr}(S \log S) = H(S).$$

In particular the entropy is invariant under time shifts. This seems to contradict the second law of thermodynamics. We also introduce the information divergence (often called relative entropy) by the equation

$$D(S \,\|\, T) = \mathrm{Tr}\bigl(S (\log S - \log T)\bigr).$$

Proposition 59 Let $S$ and $T$ be densities of states, and let $\alpha, \beta \ge 0$ be numbers with $\alpha + \beta = 1$. Then

$$H(\alpha S + \beta T) = \alpha H(S) + \beta H(T) + \alpha D(S \,\|\, \alpha S + \beta T) + \beta D(T \,\|\, \alpha S + \beta T).$$

Proof. The proof is a simple exercise in the definitions and is left to the reader.
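Proposition 59 is easy to verify numerically. The sketch below (with an arbitrarily chosen pair of full-rank qubit density matrices) computes H and D via the spectral decomposition and also checks the unitary invariance H(USU*) = H(S).

```python
import numpy as np

def mlog(S):
    # matrix logarithm of a symmetric positive definite matrix via eigh
    lam, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(lam)) @ V.T

def entropy(S):
    # von Neumann entropy H(S) = -Tr(S log S)
    lam = np.linalg.eigvalsh(S)
    lam = lam[lam > 1e-12]                 # convention 0 log 0 = 0
    return float(-np.sum(lam * np.log(lam)))

def divergence(S, T):
    # information divergence D(S || T) = Tr(S (log S - log T))
    return float(np.trace(S @ (mlog(S) - mlog(T))))

S = np.array([[0.7, 0.2], [0.2, 0.3]])     # full-rank density matrices
T = np.array([[0.5, 0.1], [0.1, 0.5]])
a, b = 0.3, 0.7
M = a * S + b * T

# Proposition 59: H(aS + bT) = a H(S) + b H(T) + a D(S||M) + b D(T||M)
lhs = entropy(M)
rhs = a*entropy(S) + b*entropy(T) + a*divergence(S, M) + b*divergence(T, M)
assert abs(lhs - rhs) < 1e-10

# unitary invariance: H(U S U^*) = H(S)
th = 0.4
U = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
assert abs(entropy(U @ S @ U.T) - entropy(S)) < 1e-10
```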

Theorem 60 For density operators $S$ and $T$ the following inequality holds:

$$D(S \,\|\, T) \ge \frac{1}{2}\, \mathrm{Tr}\bigl((S - T)^2\bigr).$$

Proof. Put $\eta(t) = -t \log t$, $t \in \,]0; 1]$. Then

$$\eta'(t) = -\log t - 1, \qquad \eta''(t) = -\frac{1}{t} \le -1.$$

Then a Taylor expansion gives

$$\eta(x) = \eta(y) + (x - y)\,\eta'(y) + \frac{1}{2}(x - y)^2\, \eta''(\theta) \le \eta(y) + (x - y)\,\eta'(y) - \frac{1}{2}(x - y)^2$$

for some $\theta$ between $x$ and $y$. Therefore

$$0 \le -\eta(x) + \eta(y) + (x - y)\,\eta'(y) - \frac{1}{2}(x - y)^2 = x \log x - x \log y + y - x - \frac{1}{2}(x - y)^2.$$

Let $S = \sum \lambda_i P_i$ and $T = \sum \kappa_j Q_j$ be spectral decompositions of the operators $S$ and $T$. Then

$$0 \le \Bigl(\lambda_i \log \lambda_i - \lambda_i \log \kappa_j + \kappa_j - \lambda_i - \frac{1}{2}(\kappa_j - \lambda_i)^2\Bigr)\, \mathrm{Tr}(P_i Q_j) = \mathrm{Tr}\Bigl(\Bigl(\lambda_i \log \lambda_i - \lambda_i \log \kappa_j + \kappa_j - \lambda_i - \frac{1}{2}(\kappa_j - \lambda_i)^2\Bigr)\, P_i Q_j\Bigr).$$

Summing over $i$ and $j$ gives

$$\begin{aligned}
0 &\le \mathrm{Tr}\Bigl(\sum_{i,j} \Bigl(\lambda_i \log \lambda_i\, P_i Q_j - \lambda_i \log \kappa_j\, P_i Q_j + \kappa_j P_i Q_j - \lambda_i P_i Q_j - \frac{1}{2}(\kappa_j - \lambda_i)^2 P_i Q_j\Bigr)\Bigr)\\
&= \mathrm{Tr}\Bigl(\sum_i \lambda_i \log \lambda_i\, P_i - \sum_{i,j} \lambda_i P_i \log \kappa_j\, Q_j\Bigr) + \sum_i \kappa_i - \sum_j \lambda_j - \mathrm{Tr}\Bigl(\sum_{i,j} \frac{1}{2}(\kappa_j - \lambda_i)^2 P_i Q_j\Bigr)\\
&= \mathrm{Tr}\Bigl(S \log S - S \log T - \frac{1}{2}(S - T)^2\Bigr),
\end{aligned}$$

and the result follows.

The theorem shows that the divergence is non-negative and therefore that the entropy is concave.
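Theorem 60 can likewise be checked numerically for a pair of full-rank qubit density matrices (an arbitrarily chosen illustrative example):

```python
import numpy as np

def mlog(S):
    # matrix logarithm of a symmetric positive definite matrix
    lam, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(lam)) @ V.T

S = np.array([[0.7, 0.2], [0.2, 0.3]])
T = np.array([[0.5, 0.1], [0.1, 0.5]])

D = float(np.trace(S @ (mlog(S) - mlog(T))))
bound = 0.5 * float(np.trace((S - T) @ (S - T)))
assert D >= bound - 1e-12    # D(S||T) >= (1/2) Tr((S-T)^2)
```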

10 List of notation

$\mathcal{A}, \mathcal{B}$ : ∗-algebras.
$a_g$ : The action of the group element $g$ on a state space.
$B(H)$ : Matrices/bounded operators on the Hilbert space $H$.
$B_1^+(H)$ : Density matrices/operators on the Hilbert space $H$.
$C, K$ : Convex sets.
$d$ : Dimension.
$\det(X)$ : The determinant of $X$.
$\delta_x$ : Dirac measure, i.e. the probability measure with all weight in $x$.
$\Delta$ : The unit disc $\{(a, b) \in \mathbb{R}^2 \mid a^2 + b^2 \le 1\}$.
$\phi, \psi$ : Abstract states or states on a ∗-algebra.
$G$ : A group.
$H, K$ : Hilbert spaces.
$M_+^1(U)$ : The set of probability vectors (or probability measures) on $U$.
$\|\cdot\|_{tot}$ : The total variation norm.
$\|\cdot\|_2$ : The 2-norm in a Hilbert space.
$\mathbb{N}$ : The set of natural numbers.
$(\cdot \mid \cdot), \langle \cdot \mid \cdot \rangle$ : Inner products in Hilbert spaces.
$\times$ : Product set or product measure.
$\mathrm{Sp}(X)$ : The spectrum of $X$.
$\mathrm{Tr}$ : Trace, the sum of the diagonal elements of a matrix.
$\mathrm{char}(X)$ : The characteristic polynomial of $X$.
$\mathcal{P}$ : Set of preparations.
$\mathbb{R}$ : Set of real numbers.
$S(\mathcal{A})$ : States on the algebra $\mathcal{A}$.
$S, T$ : Density matrices or density operators.
$\mathcal{S}$ : State space.
$\otimes$ : Tensor product of Hilbert spaces or vectors or algebras.
$\mathbb{T}$ : $\mathbb{R}/2\pi\mathbb{Z}$, which can be identified with $[0; 2\pi[$.
$\vec{u}, \vec{v}, \vec{w}$ : Vectors in a real or complex Hilbert space.
$\mathbb{Z}$ : The set of integers.

References
