Hyperbolic and Spherical Geometry

Norbert Dragon

Geometry of Special Relativity Concise Courses

Preface

So einfach wie möglich, aber nicht einfacher. As simple as possible, but not simpler.

This guideline of Albert Einstein applies in particular to each presentation of relativistic physics, a subject which often puzzles laymen, stirs their imagination and tantalizes their comprehension, unnecessarily, because relativistic physics relies on simple geometric notions. If one wants to understand the basic features of the theory of relativity, one needs coordinates or virtual systems of clocks filling the universe no more than one needs millimeter paper and coordinate axes for Euclidean geometry. One only has to consider what observers see rather than to argue that this or that observer is right. Relativity is a physical, not a judicial theory.

The slowdown of moving clocks and the shortening of a moving measuring rod unfold naturally from the principle of relativity, just as a tilted ladder is less high than an upright ladder. Clocks are no more mysterious than odometers: they show a distance between start and end which depends on the path taken in between. This is the unspectacular answer to the seemingly paradoxical aging of twins. Just as no one is puzzled by a triangle, where the straight line between two corners is shorter than the detour over the third corner, no one should be shocked by the conclusion, and its experimental verification, that a clock picks up more time on a straight history than the twin clock of a traveler who takes a detour.

The first two chapters are intended to be understandable in essence also to nonphysicists with little mathematical knowledge. Their simplicity, however, may be deceptive. Real understanding requires careful consideration of the arguments, the equations and the diagrams, preferably by reading with pencil and paper at hand. The following chapters presume mathematical knowledge which physicists and mathematicians acquire during their undergraduate years.

To clarify more complicated questions we introduce coordinates as functions of the measured times and directions of light rays and deduce the Lorentz transformations, which relate these values to the ones which moving observers measure. These transformations determine how velocities combine, what pictures are seen by moving observers, and how the energy and momentum of a particle depend on its velocity.


Chapter 4 assembles the basics of mechanics and applies them to relativistic particles. Stress is laid on the correspondence between physics and geometry, between conserved quantities like energy, momentum and angular momentum, and symmetries like a shift in time or space or a rotation or a Lorentz transformation. Jet spaces, which are introduced and used in this investigation, may strike the reader as an unnecessary complication. But they provide the clearest and therefore simplest setting to exhibit the correspondence of conserved quantities and infinitesimal symmetries.

Chapter 5 presents electrodynamics as a relativistic field theory and in particular shows that changes of the electric charges cause changes of the electromagnetic fields which propagate with the speed of light. The electrodynamic interactions are invariant under dilations, which is why they cannot explain the particular values of particle masses or the particular sizes of atoms.

In the last chapter we discuss the mathematical properties of the Lorentz group. It acts on the directions of light rays just as the Möbius transformations act on the Riemann sphere.

The text originated from courses which I taught on the subject and from my answers to questions which were frequently asked in the newsgroup de.sci.physik. After a few years the notes hardly changed any more and slumbered on my homepage with a few hundred interested visitors per year until Christian Caron from Springer Verlag encouraged me to have them published. Whether this kiss of a prince awoke a sleeping beauty or a frog, still to be thrown against the wall, is for the reader to judge.

Helpful comments and patient listening were contributed by Frédéric Arenou, Werner Benger, Christian Böhmer, Christoph Dehne, Jürgen Ehlers, Christopher Eltschka, Chris Hillman, Olaf Lechtenfeld, Volker Perlick, Markus Pössel and Bernd Schmidt. Ulrich Theis translated the early versions of the notes.

Sincere thanks are given to Ulla and Hermann Nicolai for their friendly hospitality during my stay at the Albert-Einstein-Institut der Max-Planck-Gesellschaft.

Hannover, January 2012

Norbert Dragon

Preface to the Second Edition

After I became aware of several friendly comments on my small book, I polished some of its formulations, rearranged the material a little and added an appendix with further background which is required to understand the advanced subjects. But primarily I tried to account for the objection of Abraham Ungar that I had disregarded the unit mass shell, the hyperbolic space H³, with its intriguing relations of boosts (Fermi-Walker transport) and Wigner rotations (Thomas precession).

Ungar's comment not only stimulated the addition of chapter 7 on hyperbolic geometry. Investigating the Wigner rotation, I found in chapter 6 the means to easily calculate its angle. Geometrically, it can be understood as the area of a hyperbolic triangle in terms of the lengths and included angle of two sides.

The Wigner rotation is required in the calculation of the generators of the Lorentz group in relativistic quantum theory. Once determined, they can be analyzed in detail. The surprising result that the helicity of massless particles is a topological charge is presented and exploited in the chapters on relativistic quantum theory and quantum fields. They answer Steven Weinberg's question at an intermediate stage of my investigations, "What is missing in conventional treatments?", showing, among other things and contrary to my expectations and my original intentions, the logical inconsistency of massless helicity states and commuting position operators as they are postulated in string theory.

I am deeply indebted to Florian Oppermann, Elmar Schrohe, Olaf Lechtenfeld, Domenico Giulini and Stefan Theisen for clarifying discussions and for their untiring patience.

Hannover, July 2017

Norbert Dragon


Contents

1 Structures of Spacetime
    1.1 Properties of the Vacuum
        Omnipresence of Gravity
        Straight Worldlines in Curved Spacetime
        Rotational Motion
        Lightcone
    1.2 Measuring Rods
    1.3 Equilocal and Equitemporal
    1.4 Limit Speed
        Tachyon and Rigid Bodies
    1.5 Quantum Teleportation and Bell's Inequality

2 Time and Distance
    2.1 Theorem of Minkowski
        Three Equal Clocks
        Construction of the Referee
    2.2 Addition of Velocities
    2.3 Time Dilation
        Twin Paradox
    2.4 Length Contraction
        Equally Accelerated Rockets
        Length Paradox
    2.5 Doppler Effect
    2.6 Apparent Superluminal Velocity
    2.7 Spacetime Coordinates
    2.8 Scalar Product and Length Squared
        Orthogonal
    2.9 Perspectives

3 Transformations
    3.1 Lorentz Transformation of Coordinates
        Lorentz Transformations in Four Dimensions
        Addition of Velocities
    3.2 Perception
        Yearly Aberration of the Light of Stars
        Shape of Moving Spheres
        Moving Ruler
        Luminosity
    3.3 Energy and Momentum
        Transformation of Additive Conserved Quantities
        Four-Momentum of a Particle
    3.4 Mass
    3.5 Decay into Two Particles
    3.6 Compton Scattering

4 Relativistic Particles
    4.1 Clocks on Worldlines
    4.2 Free Particles
    4.3 Jet Functions
    4.4 Local Functionals and Euler Derivative
    4.5 Principle of Stationary Action
    4.6 Symmetries and Conserved Quantities
        Energy and Momentum
        Angular Momentum and Weighted Position

5 Electrodynamics
    5.1 Covariant Maxwell Equations
    5.2 Local Charge Conservation
    5.3 Energy and Momentum
        Uniqueness and Domain of Dependence
    5.4 Electrodynamic Potentials
        Poisson Equation
        Harmonic Functions
    5.5 Gauge Transformation of the Four-Potential
    5.6 Wave Equation
        Uniqueness and Domain of Dependence
        Plane Waves
        Huygens' Principle
        Retarded Potential
        Poincaré Covariance of the Fields
    5.7 Charged Point Particle
    5.8 Action Principle and Noether's Theorems

6 The Lorentz Group
    6.1 Rotations
    6.2 Lorentz Transformations
    6.3 The Rotation Group SU(2)/ℤ₂
    6.4 The Group SL(2, ℂ)
    6.5 The Complex Lorentz Group
    6.6 Möbius Transformations of Light Rays

7 Hyperbolic and Spherical Geometry
    7.1 Unit Mass Shell
    7.2 Spherical Geometry
    7.3 Addition of Velocities and the Wigner Rotation
    7.4 Fermi-Walker Transport
    7.5 Thomas Precession

8 Relativistic Quantum Theory
    8.1 Induced Representations of the Poincaré Group
    8.2 Vacuum
    8.3 Massive Representations
    8.4 Position of a Massive Particle
    8.5 Massless Representations and Helicity
    8.6 Angular Momentum and Position of a Massless Particle
    8.7 Space Inversion and Time Reflection

9 Quantum Fields
    9.1 Free and Interacting Time Evolution
    9.2 The Perturbative Relativistic S-Matrix
    9.3 Fields in Hilbert Space
    9.4 The Massless Vector Field

A Notations and Facts
    A.1 Linear Algebra
    A.2 Area and Volume
    A.3 Transformations of Functions
    A.4 Schur's Lemma and Tensor Operators
    A.5 Complex Conjugate and Contragredient Representations
    A.6 Finite Dimensional Representations of SL(2, ℂ)
    A.7 Baker Campbell Hausdorff Formula
    A.8 Hopf Bundle

B Aspects of Quantization
    B.1 Graded Commutative Algebra
    B.2 Two-particle States
    B.3 The Many-Particle Space H
    B.4 Time Order, Normal Order and Wick's Theorem
    B.5 Unitary Representations of Rotations
    B.6 The Wightman Distribution Δ+

C Systems of Units

D Collapse of the Wave Function

References

Index

Chapter 1

Structures of Spacetime

Abstract

Simple geometric properties of spacetime and free particles underlie the theory of relativity just as Euclidean geometry follows from simple properties of points and straight lines. The vacuum, the empty four-dimensional curved spacetime, determines straight lines and light rays. In the absence of gravity, the vacuum is isotropic and homogeneous and does not allow one to distinguish rest from uniform motion. Therefore, contrary to Newton's opinion, the vacuum cannot contain the information about a universal time which could be attributed to events. Whether two different events are simultaneous depends on the observer, just as in Euclidean geometry it depends on a given direction whether two points lie on an orthogonal line.

1.1 Properties of the Vacuum

We can denote each point in space by specifying how far it is ahead, to the right and to the top of a chosen reference point. These specifications are the coordinates of the point. One needs three of them to specify the point. Space is three-dimensional. The coordinates of a point depend of course on the reference point and on the directions which the observer chooses as ahead, right and above.

As for appointments in daily life, for physical processes not only the position where an event takes place is important, but also the time when it occurs. The set of all events, spacetime, is four-dimensional, because to specify a single event one needs four labels, the position where it takes place and the time when it occurs. The position and time specifications which label an event depend on the observer, just as the three coordinates of a position do.

The four-dimensional spacetime fascinates and defies our imagination, which is shaped in everyday life. Nevertheless it is quite simple. We can easily envisage a stack of pictures, as they are stored in a film reel, which show the sequence of three-dimensional situations. Thereby one conceives the four-dimensional spacetime the same way as an architect, who draws two-dimensional blueprints, horizontal plans and transversal sections to envisage a three-dimensional building.

Using the same means we depict the sequence of events in two-dimensional spacetime diagrams. For example, the geometric figure of two intersecting straight lines shows the physical process that two particles move uniformly and collide in the event where the lines intersect. If one displayed only the position and not the time, one would not know whether the particles pass the same position at the same time or fail to meet each other.

The physical findings add the insight, which is alien to our intuition, that the four-dimensional spacetime is an entity which is decomposed into layers of equal time only by the observer. Different from what Newton thought, these slices of equal time do not coincide for observers who move relative to each other. That simultaneity depends on the observer is the largest obstacle to understanding relativity. Not only the three coordinates of the position, but also the time which denotes an event, depend on the observer. Spacetime has no measurable universal time which pertains to the events.

Each event E determines the later events which can be influenced by E by means of light or electromagnetic signals, and vice versa, it determines the earlier events from which it could have been influenced. These events form the forward and backward lightcone of E, which both belong geometrically to each event in spacetime.

An unaccelerated clock which passes two events shows the time which passes in between. This time does not depend on the particular type of clock and is a geometric property of the two events, their temporal distance. From this time all length standards derive; in particular, spatial distance is the time which it takes light to run back and forth (1.5). The temporal distance of events imparts a geometry to spacetime which is similar to Euclidean geometry in many respects. As we will see, the events with the same temporal distance from a chosen event lie on a hyperboloid and not, as in Euclidean geometry, on a sphere. The geometry of spacetime is ordinary school geometry, with circles replaced by hyperbolas.

To avoid the multitude of possible effects, we investigate processes in an empty region of spacetime, the vacuum, from which all particles have been removed and all influences from outside, such as electric and magnetic fields, are shielded. This vacuum is the stage on which we study the behavior of light and particles that are seen by observers and are measured with clocks and measuring rods.

As simple as the idea of a vacuum seems, it is an idealization and can be realized only approximately. We are constantly passed by neutrinos, which come from the sun and single out a particular direction in the otherwise isotropic space. We cannot shield our experiments from these neutrinos since they do not interact sufficiently. But since neutrinos penetrate everything, they do not bother either and cause effects only if we look for them on purpose.

The cosmos filled by background radiation is not a vacuum. This radiation is a remnant from the early evolution of the universe and defines a rest frame through which the sun moves with a speed of roughly 370 km/s [47]. This background radiation can be shielded by walls, but the walls have to be cooled so that the heat radiation of the walls does not fill the space.
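The remark above that events of equal temporal distance lie on a hyperboloid rather than a sphere can be illustrated numerically. The following Python sketch is only an illustration and not part of the original text. It assumes units with c = 1, one spatial dimension, and the standard relation τ² = Δt² − Δx² for the time shown by an unaccelerated clock, which is derived only later; the function names are hypothetical.

```python
import math


def temporal_distance(dt, dx):
    """Time shown by an unaccelerated clock between two events.

    Assumes units with c = 1 and the relation tau^2 = dt^2 - dx^2,
    which is taken for granted here and only holds for events that
    a clock can actually connect (|dx| < |dt|).
    """
    s2 = dt * dt - dx * dx
    if s2 < 0:
        raise ValueError("no clock can pass through both events")
    return math.sqrt(s2)


def events_at_temporal_distance(tau, rapidities):
    """Events (t, x) on the hyperbola t^2 - x^2 = tau^2, one per rapidity."""
    return [(tau * math.cosh(eta), tau * math.sinh(eta)) for eta in rapidities]


# Five events reached from the origin by clocks with different velocities.
events = events_at_temporal_distance(1.0, [-1.0, -0.5, 0.0, 0.5, 1.0])
distances = [temporal_distance(t, x) for t, x in events]

for (t, x), tau in zip(events, distances):
    print(f"t = {t:.4f}, x = {x:+.4f}, temporal distance = {tau:.4f}")
```

All listed events have the same temporal distance from the origin although their coordinate times differ, which is the hyperbolic counterpart of points on a circle having the same Euclidean distance from its center.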


Omnipresence of Gravity

Even after removing all particles and shielding from all external influences, the vacuum retains structure. Gravity cannot be shielded or extracted from the vacuum. It therefore belongs to the properties of the empty spacetime, though in special arrangements one can compensate gravitational attraction in a region by additional masses. For example, if one completes a segment of a spherical shell, which causes gravitational attraction, to a complete shell, then in the interior of the shell the gravitation of the segment is compensated.

However, one cannot compensate unforeseen disturbances from outside by the gravitational analogue of a Faraday cage, that is, by particles which can move freely, like charges in a wire, with negligible inertia. For freely moving particles, the state with lowest energy is not a vanishing gravitational field, but, because gravity is attractive, the closest packing of the particles and the largest possible gravitational attraction. They do not shield gravity but enlarge it. Also, there are no bodies which are inert and nearly insensitive to gravity. Irrespective of their masses all test particles fall in the same way.

One cannot completely transform away gravity by performing experiments in freely falling laboratories. In a laboratory which orbits the earth in free fall one can distinguish three different directions, behind, below and beside, by gravitational effects without a view of the earth and without reference to the walls. Test particles in the laboratory orbit the earth in ellipses, in the simplest case in circles. If the test particles orbit in circles in the same plane with different radii, then the particle which is nearer to the earth is faster and departs from the other. If the test particles circle one behind the other in the same circle, then their distance remains unchanged. If they orbit initially beside each other in different circles of the same radius, then the orbital planes intersect and the circles intersect twice per revolution: freely falling particles beside each other oscillate around each other with the orbital frequency.

Fig. 1.1 Orbits around the Earth
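The statements about test particles in orbit can be checked with a little Newtonian arithmetic. The following Python sketch is an illustration only and is not taken from the text. It assumes circular Kepler orbits around a point-mass earth, uses the standard value of the earth's gravitational parameter, and treats two particles on equal circles in slightly tilted planes in the small-angle approximation; the function names and the 400 km altitude are arbitrary choices.

```python
import math

GM_EARTH = 3.986004418e14  # gravitational parameter of the earth, m^3/s^2
R_EARTH = 6.371e6          # mean radius of the earth, m


def circular_orbit(radius_m):
    """Return (speed in m/s, period in s) of a circular orbit of given radius."""
    speed = math.sqrt(GM_EARTH / radius_m)
    period = 2.0 * math.pi * radius_m / speed
    return speed, period


def sideways_separation(max_separation_m, period_s, t_s):
    """Separation of two particles on equal circles in slightly tilted planes.

    Small-angle approximation: the separation perpendicular to the orbital
    plane oscillates harmonically with the orbital period and vanishes twice
    per revolution, where the two circles intersect.
    """
    return max_separation_m * math.cos(2.0 * math.pi * t_s / period_s)


# Two particles in the same plane, one of them 1 km farther out.
inner_speed, inner_period = circular_orbit(R_EARTH + 400e3)
outer_speed, outer_period = circular_orbit(R_EARTH + 401e3)
print(f"inner: {inner_speed:.1f} m/s, period {inner_period / 60:.1f} min")
print(f"outer: {outer_speed:.1f} m/s, period {outer_period / 60:.1f} min")

# Two particles initially 10 m beside each other on circles of equal radius.
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    t = fraction * inner_period
    print(f"t = {fraction:.2f} T: {sideways_separation(10.0, inner_period, t):+.2f} m")
```

Under these assumptions the inner particle is slightly faster and has the shorter period, so it departs from the outer one, while the particles that start beside each other pass through each other at a quarter and at three quarters of a revolution.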


Straight Worldlines in Curved Spacetime

In the vacuum an observer can determine without reference to other positions whether he is freely falling or accelerating. This he can read from an hourglass (it does not run in free fall) or from a pendulum which he carries along. If it swings back and forth, an acceleration acts perpendicular to the axis of rotation; otherwise the pendulum rotates with constant angular velocity.

An observer passes in the course of time a set of events. This line in spacetime, directed from his past to his future, is his worldline. For each freely falling particle this line is determined by an event which it passes and by the velocity at that instant, corresponding to a point and a direction with which the worldline traverses the point. These worldlines do not depend on particulars of the particles, because all particles fall the same way. Therefore the worldlines of freely falling particles define a geometric structure of spacetime itself: the set of straight lines.

Note well: the worldlines of freely falling particles and of flashes of light are the straight lines of the four-dimensional spacetime, but their spatial projections are not straight lines of the three-dimensional space. Through each point in space and with given direction there pass different parabolas which are traced out by particles which fall with different velocities. These curves in three-dimensional space are not determined, differently from what one has to require from straight lines, by a point and a direction.

Among the spatial curves of freely falling particles one can and does choose a class of curves to define straight lines, if gravity does not change with time. Straight is the path of light. Whether an edge is straight is checked by comparing with light, by sighting along the edge. But light rays are gravitationally deflected and can intersect each other repeatedly. They do not satisfy Euclid's axiom of parallels. They define straight lines in a space which is curved by gravity. If gravity changes in time because the masses which generate gravity move, then the light rays no longer define straight spatial lines, because the ways to and fro differ.

One could imagine attributing the label straight to other lines in spacetime, for example to lines which in some coordinate system can be drawn with a ruler. But these lines are no property of spacetime, and test particles traverse them only if they are subject to forces which differ for different particles. The only worldlines which are singled out by nature are the worldlines of freely falling particles, including the worldlines of flashes of light.

If one follows mentally the path of freely falling particles, which in Fig. 1.1 orbit the earth in different circles, then one realizes that spacetime is curved. If one relates the positions of the second particle to the positions of the first particle, then the straight worldline of the second particle oscillates around the straight worldline of the first particle: straight lines can intersect each other repeatedly.

The reason for the curvature of spacetime and the relative motion of freely falling particles is the fact that gravity is not the same everywhere: the attraction is stronger the nearer the particles, and at equal distance it acts in different directions. By the different gravitational attraction one can distinguish different positions and directions. If, however, one restricts physical evolutions to such short times and small regions that the inconstancy of gravity does not make itself felt at the given precision of measurement, then the effects of gravity become imperceptible in a freely falling system of reference. In sufficiently small regions the curvature of spacetime is insensible and spacetime has the geometric properties of a flat space.

Fig. 1.2 Straight Lines in Curved Spacetime (axes: space, time)

We cannot shield gravity but want to avoid the related complications of a curved spacetime. Therefore we restrict our considerations to short times and distances such that gravitational effects are immeasurably small, or we account mentally for the known gravitational effects and subtract them from the observed behavior of the physical systems.

Once one has shielded all external influences and subtracted gravitational effects, one cannot measure the time and the position of an event without reference to other events, just as at sea one cannot measure longitude without reference to the sun and the time in Greenwich or without GPS. Physical evolutions are the same everywhere and at any time: the spacetime is homogeneous. Similarly physical evolutions are the same in all directions: the spacetime is isotropic.

Rotational Motion

Rotational motion, the temporal change of directions, can be measured, different from uniform straight motion, without reference to other bodies such as the distant stars. If one rotates and emits light into some direction, then the reflected light is seen to return from a different direction (7.57). In a rotating cinema one projects into one direction and observes from a different one. Only for nonrotating observers does reflected light return from the direction into which it had been emitted. In rotating reference systems, light to and fro does not follow the same path. This property is used in interferometers. They measure rotation and by now have reached a precision of $10^{-8}$ degrees per second [11].

The situation where some object orbits around the observer is different from his own rotation. In both situations the light rays from the object come from directions which change in the course of time, but if one does not rotate one sees each single reflected light ray return from the direction into which one had emitted it (7.57).

Remarkably, these rotation-free reference frames, defined by the local property of reflected light to return from the direction of emission, coincide within high experimental precision with the systems in which the light from the distant stars comes from directions that, apart from parallax, aberration and the motion of the star, do not change in time. This is by no means self-evident and, measured with today's highest precision [17], does not hold if one orbits the rotating earth. Unfortunately, the experimental sensitivity is insufficient to determine whether the corrections due to the orbital motion and the rotating earth apply with respect to the stars of the Milky Way, which is known to rotate itself, or with respect to the distant quasars.

Lightcone

Long before Einstein’s theory of relativity the principle of relativity was known: one cannot distinguish by any effect of Newtonian mechanics whether an unaccelerated observer moves or rests. However, one was convinced that this principle of relativity is valid only approximately, because it had been known since 1676 from Ole Rømer’s observation and interpretation of the orbital periods of the four large moons of Jupiter, Io, Europa, Ganymede and Callisto, that c, the speed of light in the vacuum, is finite. Therefore light was assumed to single out an absolute rest system, in which light propagated equally fast in all directions and in which its medium, the ether, rested. For an observer moving with a velocity v with respect to the ether, light should propagate in different directions with different velocities ranging from c − v to c + v.

This conclusion is obvious and wrong: no experiment has ever measured that a moving source of light emits light which in the vacuum propagates with different velocities in different directions. Moreover, the motion of the observer has never made him register different velocities of light in different directions in the vacuum. This is the result of the seminal experiment of Albert Michelson.¹ The most astonishing property of the ether is that no one ever found a trace of it. Ether has all the properties of the vacuum; it is the vacuum. Light propagates in the vacuum with a velocity which does not allow one to distinguish a stationary observer from a uniformly moving observer.

Principle of Relativity: The speed of light in the vacuum does not depend on the motion of the source. No physical observation allows one to distinguish an observer at rest from a uniformly moving observer.

¹ Experimental findings are discussed in detail in [22] and [69].

1.1 Properties of the Vacuum


In the vacuum there is no faster or slower light. Light does not outrun light [9]. For example, in 1987 one observed a supernova in the Large Magellanic Cloud, SN 1987a, which exploded 160 000 years ago and whose luminous plasma was emitted in all directions with a velocity of 25 000 km/s. If this velocity v had added to the velocity of light to give c′ = c + v, then the light from the plasma which moved towards us would have arrived 12 000 years earlier than the light from the plasma which moved transverse to the line of sight. Nobody had observed the star when the first light of the explosion arrived here, but the explosion also emitted neutrinos, some of which were recorded on arrival. As one later found in the data, they had triggered the counters one hour before one happened to look in the direction of the star and saw the explosion. At that time it was completely visible. Thus, there could have been runtime differences of at most one hour between the different light rays. A year has roughly 365 · 24 hours. With a runtime of 160 000 years the velocities of the light rays therefore were equal up to 1/(160 000 · 365 · 24) ≈ $0.7 \cdot 10^{-9}$, i.e. in the first nine decimals. As the neutrinos had moved with the speed of light to this precision, the observation also implies that their mass is less than 10 eV/$c^2$ [56].

That the speed of light in the vacuum is independent of the speed of the source agrees with the conclusions from the Maxwell equations (5.3, 5.4) for the electromagnetic fields. From these equations we deduce (page 89) that a charge which is at the position x′ at the time t′ influences the electric and magnetic fields at the position x at the time t,

$$ c\,(t - t') = |x - x'| , \tag{1.1} $$

which is later than t′ by the runtime of light |x − x′|/c. The runtime t − t′ does not depend on the velocity of the charge which causes the field. The events (t, x) constitute the forward lightcone of the event (t′, x′); electromagnetic causes produce effects with the speed of light in the vacuum.

The independence of the propagation of light from the velocity of the source does not imply that other properties, such as the color of the light, the direction of the light rays and the intensity of the radiation, do not depend on it. The direction of the incoming light rays, the color of the light and the number of photons per time and unit area, the luminosity, do depend on the velocity of the source and the velocity of the receiver. The intensity of electromagnetic radiation depends on the acceleration of the charges that emit the radiation.

The independence of the propagation of light from the velocity of the source is illustrated in the spacetime diagram 1.3: A stationary observer traverses the straight worldline $O_0$; his position x remains unchanged at all times t. A second observer traverses the worldline $O_v$ and moves uniformly in the x-direction. If both observers are at some time at the same position and emit a flash of light, then the light propagates from this event E equally fast in all directions² and independently of whether the light source moves.

² Our diagrams show only one spatial dimension. Therefore there are only the two directions forwards or backwards.
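The supernova estimates above can be rechecked with a few lines of arithmetic. The following is only a sketch under the rounded figures quoted in the text (160 000 years of flight, 25 000 km/s, a year of 365 · 24 hours); the function names are illustrative and not taken from the source.

```python
# Rough check of the SN 1987a estimates quoted in the text.
FLIGHT_TIME_YEARS = 160_000      # runtime of the light, from the text
PLASMA_SPEED_KM_S = 25_000       # speed of the luminous plasma, from the text
C_KM_S = 299_792.458             # speed of light in km/s
HOURS_PER_YEAR = 365 * 24        # rough value used in the text


def head_start_years(flight_time_years, v_over_c):
    """Head start of light that hypothetically travelled with c + v instead of c."""
    return flight_time_years - flight_time_years / (1.0 + v_over_c)


def relative_speed_bound(flight_time_years, max_delay_hours):
    """Bound on the relative speed difference if arrival times differ by at most max_delay_hours."""
    return max_delay_hours / (flight_time_years * HOURS_PER_YEAR)


head_start = head_start_years(FLIGHT_TIME_YEARS, PLASMA_SPEED_KM_S / C_KM_S)
bound = relative_speed_bound(FLIGHT_TIME_YEARS, 1.0)

print(f"hypothetical head start: {head_start:.0f} years")
print(f"relative speed difference below {bound:.1e}")
```

The first number reproduces the order of magnitude of the 12 000 years in the text, the second the bound of about $0.7 \cdot 10^{-9}$.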


Fig. 1.3 Observers with Outgoing Light Rays (spacetime diagram with axes x and ct: worldlines O0 and Ov and the two light rays emitted from the event E)

We call the worldline of a flash of light a “light ray”. Our diagrams are rotated such that the light rays which are emitted from E in the forward and backward directions run with increasing time from the bottom to the top of the diagram, symmetrically to the vertical axis. We choose the units such that light rays include an angle of ±45° with the axes. The light rays which pass through other events E′ are parallel to the light rays through E, because light does not outrun light. Light rays in the same direction do not intersect.

Each event E determines the later events which can be influenced from E by light, and conversely it determines the earlier events which could have influenced E. These events constitute the forward and backward lightcone of E. The lightcone of each event does not depend on the motion of the source nor, for instance, on the color of light. The lightcones belong to spacetime itself, like the groove to a gramophone record. They are a geometric structure of spacetime.

Apart from the light rays there are no other distinguished straight lines in empty spacetime. There is no ether which traverses detectable worldlines and there is no measurable universal time which would define layers of equal time. The time and space axes are, unlike lightcones, no geometric structure of spacetime but depend on the observer. For this reason, and in order not to overcrowd the diagrams, we omit the axes: the geometry of spacetime is independent of coordinate axes, just as Euclidean geometry is.

If light and electromagnetic waves were the oscillations of an ether which filled the vacuum, just as sound is an oscillation of air and other matter, and if the worldlines traversed by the constituents of the ether were detectable, then one could measure the motion relative to the ether and could distinguish it from rest.
If one could measure a universal time for each event, with a result which is the same for all observers and which therefore pertained to the event itself, then one could distinguish an observer at rest from an observer in uniform motion by the physical property that flashes of light, which he emitted in one event in different directions and which are reflected at the same universal time, would return to him, but not to the moving observer, at the same instant.

The statement that in the vacuum one cannot distinguish rest from uniform motion sums up the experimental results that no ether is detectable and that no universal time is measurable. These findings disprove Newtonian convictions and seem to contradict common sense, our everyday knowledge. But our experience is restricted to small relative velocities and does not know the vacuum. Physicists clarify by observations and experiments how nature behaves outside our everyday environment.

1.2 Measuring Rods

Metresticks are subject to many influences which restrict their precision. The length of metresticks varies with their temperature. On earth one has to see, i.e. to examine with light, whether they are bent by their weight, which causes them to be shorter if they stand and longer if they hang. Over larger distances there are no rigid length standards at all, though at the exit Echte³ of the highway Hannover–Kassel a sign proudly announces “Echte 1000 m”. Distances which exceed a few metres can be measured only with poor accuracy by placing metresticks one after the other. One measures length optically, with light.

Michelson’s measurements prove that, independent of the velocity of the observer, measuring rods and rigid bodies measure distances which agree, within the precision limited by the inaccurate rods, with the distance determined by the runtime of light. Because this time can be measured with the highest precision, one adopted in 1983 the definition that 299 792 458 metres is the distance which light in the vacuum traverses in one second. We choose to specify distance directly by the runtime of light. Then a light year is just a year and a light second just a second, and

$$ 1\ \text{second} = 299\,792\,458\ \text{metre} . \tag{1.2} $$

By this choice of the unit of length velocities are pure numbers, namely their ratio to the velocity of light, and c, the speed of light in the vacuum, has the natural value 1. Metre per second is a numerical factor such as kilo or milli and denotes approximately 3.3 nano,

$$ \frac{\text{metre}}{\text{second}} = \frac{1}{299\,792\,458} \approx 3.3356 \cdot 10^{-9} , \qquad c = 299\,792\,458\ \frac{\text{metre}}{\text{second}} = 1 . \tag{1.3} $$

In units with c = 1 the equations of relativistic physics are particularly simple and reveal common features of position and time. Only in some results do we insert the factors c as they ordinarily occur if we do not convert seconds into metres.

³ The German word “Echte” means genuine, real.
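The convention that one second equals 299 792 458 metres can be illustrated by a short conversion. This is a minimal sketch; the helper names are made up for the illustration and only the defining number comes from the text.

```python
# Units with c = 1: a length is specified by the runtime of light.
METRES_PER_SECOND_OF_LENGTH = 299_792_458  # exact by the 1983 definition


def metres_to_seconds(length_in_metres):
    """Express a length as the time light needs to traverse it."""
    return length_in_metres / METRES_PER_SECOND_OF_LENGTH


def velocity_as_number(metres, seconds):
    """A velocity in metres per second as a pure number, i.e. as a fraction of c."""
    return metres_to_seconds(metres) / seconds


metre_per_second = velocity_as_number(1.0, 1.0)      # the factor "metre per second"
c = velocity_as_number(299_792_458, 1.0)             # the speed of light itself

print(f"metre per second = {metre_per_second:.4e}")
print(f"c = {c}")
```

With these helpers a velocity of 30 km/s, for instance, is simply `velocity_as_number(30_000, 1.0)`, a number of the order of one ten-thousandth.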


That distances can be specified as times is child’s play. In Grimm’s Little Red Riding Hood grandmother’s house is half an hour away from the village. To speak of a second rather than a light second is no more a revolutionary alteration or unlawful frivolity than to denote the distance of half an hour’s walk just by half an hour. That length and time are different does not prevent us from measuring them in common units. After all, in aviation height and distance are different and are measured in feet and in nautical miles. Nevertheless, the slope of a flight path does not depend on units, because foot per nautical mile is nothing but the numerical factor $1.646 \cdot 10^{-4}$, which differs only in size from metre per second, which equals $3.3356 \cdot 10^{-9}$.

Though distances are measured by definition by the time of flight of light, and though consequently its velocity has the constant value c = 1, one can nevertheless check experimentally the consistency of this definition. First choose four points O, X, Y and Z which do not lie in a plane and determine their distances. The results determine the scales and angles in a reference system and are not restricted to satisfy any equation. If one then measures the distances of a fifth point A to the four reference points, then the distance to O restricts A to a sphere around O, the distance to X restricts A to the intersection of two spheres, a circle, and the distance to Y to the intersection of this circle with a third sphere, which consists of two points. Therefore the measured distance to Z has to coincide with one of two possible values if the definition is consistent, that is, if the velocity of light is constant everywhere and at all times. The n(n − 1)/2 measured distances of n ≥ 5 particles have to satisfy n(n − 1)/2 − 3n + 6 relations if the velocity of light is constant. This is not guaranteed by definition but abstracted from experience, without any conflicting evidence.
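The consistency check with five points can be tried out numerically. The sketch below counts the relations and tests the single relation for n = 5 with the Cayley–Menger determinant, a standard tool of distance geometry which is not mentioned in the text and is used here only as one possible way to express the relation. The coordinates are hypothetical and chosen as integers so that the arithmetic is exact.

```python
from fractions import Fraction


def relation_count(n):
    """Number of relations among the n(n-1)/2 distances of n >= 5 points."""
    return n * (n - 1) // 2 - 3 * n + 6


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def determinant(matrix):
    """Exact determinant by Gaussian elimination over fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det


def cayley_menger(squared):
    """Determinant of the bordered matrix of squared distances.

    It vanishes if five points with these distances fit into flat
    three-dimensional space.
    """
    k = len(squared)
    rows = [[0] + [1] * k] + [[1] + list(squared[i]) for i in range(k)]
    return determinant(rows)


# O, X, Y, Z (not in a plane) and a fifth point A: hypothetical coordinates
points = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5), (1, 2, 3)]
squared = [[squared_distance(p, q) for q in points] for p in points]


def with_squared_distance_a_to_z(value):
    """Same distances, but with the squared distance between A and Z replaced."""
    copy = [row[:] for row in squared]
    copy[3][4] = copy[4][3] = value
    return cayley_menger(copy)


flat = cayley_menger(squared)                 # measured value 9: consistent
mirrored = with_squared_distance_a_to_z(69)   # A mirrored at the plane O, X, Y
tampered = with_squared_distance_a_to_z(10)   # any other value: inconsistent
print(relation_count(5), flat, mirrored, tampered != 0)
```

The two admissible values of the squared distance, 9 and 69 in this example, correspond to the two points in which the circle intersects the third sphere.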

1.3 Equilocal and Equitemporal

In the course of time a traveller in a train traverses a set of events, his worldline. Related to a point in the train, all events on his worldline occur at the same place. For an observer on the railway embankment the traveller passes different places. Equilocality, the property of two different events to occur at the same place, depends on the observer.

Diagram 1.4 shows the worldlines of a uniformly moving clock C and of an observer O, at rest with respect to the clock, who also carries a clock. We denote the events on the worldlines by the time on the clock which is carried along. At time $t_-$ the observer emits light which is reflected and received at time $t_+$. With this light the observer sees the distant clock C show the time t while his own clock shows $t_+$. Because the clock C does not move relative to the observer O, the time $t_+ - t_-$ which it takes light to run back and forth and the direction of the light do not depend on $t_-$, and C traverses a worldline parallel to the observer’s. Similarly the ends of a nonrotating measuring rod traverse parallel worldlines.

Fig. 1.4 Equilocal (spacetime diagram: parallel worldlines O and C, with the times $t_-$ and $t_+$ on O and t on C)

If the worldline O is parallel to C and if C is parallel to the worldline of another observer R, then R is parallel also to O. Therefore, if and only if two observers do not move with respect to each other, they agree whether two different events occur at the same place. In the same way observers agree, if and only if they rest with respect to each other, whether different events occur at the same time. To occur simultaneously or to be equitemporal depends on the observer and is no intrinsic property of pairs of events, because no measurable universal time exists.

Fig. 1.5 Equal Clocks at Relative Rest (spacetime diagram: worldlines of the observers O and C and of the referee R, with the events A and B on R)

Diagram 1.5 clarifies which events are simultaneous or equitemporal for an observer. It shows the worldlines of two observers O and C and a referee R in their middle.


The referee is equally far away from O and C because he sees flashes of light, which he emits in the event A and which are reflected by O and C, return together in the event B. In the event A the light from O and C shows the referee R the times $t_-$ and $t'_-$ on the clocks; he sees them show the times $t$ and $t'$ in event B. Both clocks are equal if the time differences coincide, $t - t_- = t' - t'_-$. For simplicity, we assume the clocks set such that they show him the same time, $t = t'$.

That it takes light from O to C the same time as for the way back is not a mere convention, as occasionally claimed, but an observation. In event A the referee sees the light start at equal times $t_- = t'_-$ and in event B he sees the light arrive while the clocks show $t = t'$. That it takes light the same time to run back as to run forth follows from the homogeneity and isotropy of spacetime. The triangles $t_- A\, t$ and $t\, B\, t_+$ are congruent, so $t - t_- = t_+ - t$. The reflection of light which is emitted by an observer at $t_-$ and received after the reflection at $t_+$ occurs in the middle of these times, at

$$ t = \frac{t_+ + t_-}{2} . \tag{1.4} $$

The time between emission and reception of light to and from E is by definition (in units with c = 1) twice the distance r between E and the observer,

$$ r = \frac{t_+ - t_-}{2} . \tag{1.5} $$

If one solves these relations for $t_+$ and $t_-$ one finds their notation justified,

$$ t_+ = t + r , \qquad t_- = t - r . \tag{1.6} $$
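Relations (1.4) to (1.6) translate directly into two small functions. The following sketch assumes units with c = 1; the names `radar` and `echo_times` are chosen for this illustration only.

```python
def radar(t_minus, t_plus):
    """Time and distance of a reflection event from emission and reception times."""
    t = (t_plus + t_minus) / 2    # equation (1.4)
    r = (t_plus - t_minus) / 2    # equation (1.5)
    return t, r


def echo_times(t, r):
    """Emission and reception times for an event at time t and distance r, equation (1.6)."""
    return t - r, t + r


# Light emitted at 3 s and received back at 7 s was reflected
# at time 5 s in a distance of 2 (light) seconds.
t, r = radar(3.0, 7.0)
print(t, r, echo_times(t, r))
```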

In two-dimensional spacetime diagrams with a given worldline of an observer O one constructs the events which for him occur simultaneously with an event E as a diagonal in a rectangle of light rays, which we call a lightangle [35]. One draws the light rays through E to their intersections $t_+$ and $t_-$ with the worldline O. The light rays emanating from $t_-$ and the incoming light rays of $t_+$ form the lightangle $t_- E\, t_+ E'$. For the observer O the events E and E′ occur at the same time and at the same distance in opposite directions, because both events correspond to the emission time $t_-$ and the reception time $t_+$. If one enlarges or shrinks the lightangle, holding the intersection of the diagonals fixed, one confirms that all events on the straight line through E and E′ are simultaneous for O.

The worldline of the observer and the events which for him occur at the same moment constitute the diagonals of a lightangle: one diagonal consists of equilocal events, the other of equitemporal ones. Equilocality and simultaneity are not geometric properties which pertain to pairs of events alone; they also depend on the observer. The worldlines of mutually moving observers are not parallel, and the diagonals in their corresponding lightangles are not parallel. Therefore mutually moving observers agree neither on which events are equilocal nor on which are equitemporal.


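Fig. 1.6 Equilocal and Equitemporal Diagonals (spacetime diagram: lightangle with the worldline O through $t_-$ and $t_+$ as one diagonal and the line through E and E′ as the other)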
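The lightangle construction can be imitated in coordinates. The sketch below is an illustration under assumptions which are not part of the text: events are given as pairs (t, x) in the coordinates of the diagram with c = 1, the observer traverses the straight worldline x = v·t, and the times along his worldline are read off in diagram coordinates, which for a uniformly moving clock are taken to be proportional to its own readings.

```python
def light_ray_intersections(event, v):
    """Times at which the two light rays through `event` meet the worldline x = v*t.

    The smaller value plays the role of t_minus, the larger of t_plus.
    """
    t, x = event
    return sorted(((t - x) / (1 - v), (t + x) / (1 + v)))


def simultaneity_label(event, v):
    """Midpoint of t_minus and t_plus; equal labels mean 'simultaneous for this observer'."""
    t_minus, t_plus = light_ray_intersections(event, v)
    return (t_plus + t_minus) / 2


E1, E2 = (0.0, 1.0), (0.0, -1.0)       # simultaneous for the observer with v = 0
E3, E4 = (0.5, 1.0), (-0.5, -1.0)      # simultaneous for the observer with v = 0.5

for v in (0.0, 0.5):
    print(v, [simultaneity_label(e, v) for e in (E1, E2, E3, E4)])
```

The pair E1, E2 carries equal labels only for the observer at rest in the diagram, the pair E3, E4 only for the moving one, in line with the statement that mutually moving observers do not agree on which events are equitemporal.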

1.4 Limit Speed

By the principle of relativity, that in the vacuum one cannot distinguish rest from uniform motion, all causes, not only electromagnetic ones, produce effects at most with the speed of light. Otherwise, if by some interaction an event E′ in the vacuum could cause an effect with a limit speed faster than light and influenced the event E as in figure 1.6, then there would be an observer O for whom E′ and E occur simultaneously. In this way the superluminal velocity and light would single out a particular observer whom one could distinguish by physical observation in the vacuum from other observers.

In a medium, a fluid say, the velocity of sound could surpass the velocity of light in the vacuum. Though one has never observed such a miraculous medium, its existence would not lead to logical contradictions, even if one could then construct a sequence of events on a closed loop, where each event causes the next. In such a situation one could, so it is argued, shoot one’s own grandfather, a logical contradiction, because the grandfather is necessary for the existence of his killer. The contradiction rests on the assumption that on closed causal loops one still has the freedom to choose what to do. However, as the contradiction shows, this assumption is wrong. If there are closed causal loops, then one can change the conditions only in such a way that along the closed causal loop one arrives again and again at the initial situation. On causal loops one is bound to the wheel of perpetual rebirth.

That the speed of light is the limit velocity for all interactions can be confirmed or disproved by observations. Neutrinos do not interact electromagnetically. As long as one did not know, one assumed that they propagated with the limit velocity of light. Today, we know it. The limit velocity of neutrinos coincides with the speed of light at least in the first nine decimals [56], because from the supernova SN 1987a one detected in February 1987 neutrinos and light which arrived simultaneously after 160 000 years of flight.


If we receive a message, then the information is caused by the sender. It cannot be transmitted faster than light. The events E′ which can be influenced by E constitute the causal future of E and occur later than E by at least the runtime of light. The causal past of E consists of all events E′ which could have influenced E. They have occurred earlier than E by at least the runtime of light from E′ to E.

Fig. 1.7 Causal Future and Past of the Event E (spacetime diagram: the lightcone of E separates the regions labelled Future, Past and, on both sides, spacelike)

We call events E′ spacelike with respect to E if they occur so early that they cannot be influenced by E and not so long ago that one could know about them at E. If E′ is spacelike to E then, vice versa, E is spacelike to E′. Mutually spacelike events cannot influence one another. Unlike simultaneity, the property of pairs of events to be spacelike does not depend on an observer. But this property is not transitive: if A is spacelike to B and B spacelike to C, then this does not make A spacelike to C. The causal future and the past of E are separated by the forward and backward lightcone of E from the events which are spacelike to E.
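The classification into causal future, causal past and spacelike events, and the fact that being spacelike is not transitive, can be spelled out in a few lines. This is a sketch with one spatial dimension, as in the diagrams, and units with c = 1; events are hypothetical pairs (t, x) and the lightcone itself is counted with the future or the past.

```python
def causal_relation(e, f):
    """Relation of the event f = (t, x) to the event e = (t, x)."""
    dt = f[0] - e[0]
    dx = abs(f[1] - e[1])
    if dt == 0 and dx == 0:
        return "same event"
    if dt >= dx:
        return "future"      # f is later than e by at least the runtime of light
    if -dt >= dx:
        return "past"        # f is earlier than e by at least the runtime of light
    return "spacelike"


A, B, C = (0, 0), (1, 5), (2, 0)
print(causal_relation(A, B))   # A and B are mutually spacelike
print(causal_relation(B, C))   # B and C are mutually spacelike
print(causal_relation(A, C))   # but C lies in the causal future of A
```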

Tachyon and Rigid Bodies

Since a particle is at a position because it was previously at a position nearby, and because effects propagate at most with the speed of light, the worldline of each particle and of each observer always runs from the past to the future and lies within the lightcone. There is nothing faster than light. Nothing outruns light.

Tachyon is the name of a hypothetical particle which moves faster than light. It would traverse a worldline such as E′E in diagram 1.6. If a tachyon could scatter light it would at first be invisible to an observer O whose worldline it intersects, since its worldline does not intersect the backward lightcone of early events on the worldline of the observer O. At later times each backward lightcone of the worldline of O intersects the worldline of the tachyon in two events. These two intersections depart in opposite directions in the course of time as O traverses his worldline. Hence, the tachyon would appear to the observer as a pair of particles which emerge out of nowhere at some point and run away in opposite directions. There is not a single serious observation which would suggest the existence of tachyons.

One can easily cause effects on a tachyonic worldline, for instance if one connects the lamps along a runway of an airport with separate cables of equal length and switches them on simultaneously. An observer on the runway then at first sees the nearest lamp light up, then the two neighboring lamps, as if a signal propagated from the nearest lamp to both sides with superluminal speed. However, no lamp lights up because the neighboring lamp has turned on: walls which one puts between the lamps do not interrupt a signal from lamp to lamp. The cause is the flight controller who has switched on the light. This cause produces effects at most with the speed of light.

Since causes act at most with the speed of light, there is no extended body which is precisely rigid. For instance, a blow on one end of a bar does not at first influence the other end. The compression caused by the blow propagates in the bar as a wave with the speed of sound, and only after it has traversed the body and the vibrations have faded away does the bar recover its original state. One cannot evade the conclusion that there is no ideal rigid body by imagining a very hard body which resists its deformation by strong internal forces. Hard bodies have high sonic speeds, but the speed of sound is always less than c, the speed of light in the vacuum. Causality is compatible with rigidity only for bodies of infinitesimal size.
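The runway example can be made quantitative under an additional assumption which the text does not state: the observer stands at some perpendicular distance (here called `offset`) from the straight line of lamps, and all lamps light up at the same time. In units with c = 1 the lamp at position x along the line is then seen after the light has travelled the distance between lamp and observer.

```python
import math


def seen_at(lamp_x, offset):
    """Time (c = 1) at which the observer sees the lamp at lamp_x light up."""
    return math.hypot(lamp_x, offset)


def apparent_speed(x1, x2, offset):
    """Speed of the apparent signal running from the lamp at x1 to the lamp at x2."""
    return (x2 - x1) / (seen_at(x2, offset) - seen_at(x1, offset))


# Nearest lamp at x = 0, a neighbor at x = 30, observer 40 away from the line:
# the light needs 40 and 50 units of time, so the apparent signal covers
# 30 units of length in 10 units of time.
print(apparent_speed(0.0, 30.0, 40.0))
```

The apparent speed exceeds 1 although every light ray travels with c; far down the runway it approaches 1 from above.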

1.5 Quantum Teleportation and Bell’s Inequality

The revolutionary conclusion of quantum physics is that even if one prepares particles as well as possible there are always measurements with results which no one can predict in each single case, but where one can only know the probability of the different possible results. In the particular example of polarization measurements of pairs of photons one can disprove the assumption that the inability to predict the individual results relies only on incomplete knowledge. One is forced to conclude that measuring devices do not show results which were determined previously but ascertain results which were uncertain before.

If light passes a polarization filter, it becomes polarized. It passes unchanged through a second filter which polarizes in the same direction a, and its intensity decreases by a factor

$$ p_a(b) = \cos^2\beta \tag{1.7} $$

if one rotates the second filter in the plane which is orthogonal to the light by an angle β into the direction b. No light passes through crossed filters, where $b = a_\perp$ is orthogonal to a. In the following we denote a filter which polarizes in direction a as filter a for short.


Surprisingly, light with low intensity exhibits properties of particles. The photoelectric effect does not become weaker with decreasing intensity of light but rarer. One therefore has to interpret the intensity of light as proportional to the probability to find a photon, and the reduction factor $p_a(b)$ as the probability for the photon, which has been polarized in direction a, to pass the filter b. With the remaining probability $1 - \cos^2\beta = \sin^2\beta$ the photon is absorbed. This is the same as the probability to pass the crossed filter $b_\perp$,

$$ p_a(b_\perp) = 1 - p_a(b) . \tag{1.8} $$

In a suitably chosen transition of excited calcium atoms pairs of photons are created, which are emitted in opposite directions with such a polarization that the first photon passes a filter a and the second a filter b with probability [2]

$$ p(a, b) = \frac{1}{2} \cos^2\beta , \tag{1.9} $$

where β is the angle between the directions a and b. The probability for the result that the first photon passes the filter and the second photon is absorbed is the same as the probability $p(a, b_\perp)$ that the first photon passes the filter a and the second passes the crossed filter $b_\perp$. In the same way $p(a_\perp, b)$ and $p(a_\perp, b_\perp)$ are the probabilities that the first photon is absorbed while the second passes and that both photons are absorbed,

$$ p(a, b_\perp) = \frac{1}{2} \sin^2\beta , \qquad p(a_\perp, b) = \frac{1}{2} \sin^2\beta , \qquad p(a_\perp, b_\perp) = \frac{1}{2} \cos^2\beta . \tag{1.10} $$

These probabilities shatter, as we shall see, the physicists’ view of the world. They exclude the interpretation that each photon is equipped with a property which determines in each case and for all filters whether the photon passes or is absorbed. The result is genuinely random, as the following considerations show.

Combining the two possibilities, that the second photon passes or not, we obtain the probability

$$ p_1(a) = p(a, b) + p(a, b_\perp) = \frac{1}{2} \tag{1.11} $$

that the first photon passes the filter a, irrespective of what happens to the other photon. This probability is the same as the probability $p_1(a_\perp)$ for the first photon to be absorbed. The probability is independent of the direction a. The second photon is likewise absorbed with the same probability $p_2(b_\perp) = p_2(b) = 1/2$ with which it passes filter b. Both photons of the pair are unpolarized, as far as measurements of one photon are concerned. If one considers the subset of cases in which the first photon passes the filter a, then the second photon passes filter b with the conditional probability

$$ \frac{p(a, b)}{p_1(a)} = \cos^2\beta . \tag{1.12} $$
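The probabilities (1.9) to (1.12) can be tabulated for any angle. The following sketch merely evaluates the formulas of the text; the dictionary keys and function names are chosen for the illustration.

```python
import math


def pair_probabilities(beta):
    """Probabilities (1.9) and (1.10) for filters a and b enclosing the angle beta."""
    cos2 = math.cos(beta) ** 2
    sin2 = math.sin(beta) ** 2
    return {
        ("passes", "passes"): 0.5 * cos2,        # p(a, b)
        ("passes", "absorbed"): 0.5 * sin2,      # p(a, b_perp)
        ("absorbed", "passes"): 0.5 * sin2,      # p(a_perp, b)
        ("absorbed", "absorbed"): 0.5 * cos2,    # p(a_perp, b_perp)
    }


def first_passes(beta):
    """Probability (1.11) that the first photon passes, whatever happens to the second."""
    p = pair_probabilities(beta)
    return p[("passes", "passes")] + p[("passes", "absorbed")]


def second_passes_given_first(beta):
    """Conditional probability (1.12)."""
    return pair_probabilities(beta)[("passes", "passes")] / first_passes(beta)


beta = math.radians(30)
print(first_passes(beta), second_passes_given_first(beta))
```

For every angle the four probabilities add up to 1 and the first photon passes in half of the cases, while the conditional probability reproduces $\cos^2\beta$.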


The conditional probability (1.12) is the same as for photons which are polarized by a filter a (1.7). If the first photon passes the filter a, then in this subset of events the second photon is polarized in direction a. In particular, it passes a polarization filter a with certainty. This behavior is commonly described by saying that the polarization measurement of one photon instantaneously changed the state of the other photon of the pair, no matter how far away it may be, to the state of the same polarization, and that the state of the pair collapsed or became reduced. The result of the first measurement, so the alleged interpretation goes, was transmitted to the second photon or, even more impressively, was quantum teleported. The reduction of state supposedly happened instantaneously, faster than light.

Untouched by these sensational claims one can conclude as a matter of fact that a measurement of one photon does not cause anything at the second photon. No measurement performed on one photon can detect whether the other photon had been measured, is being measured or will be measured, let alone in which direction and with which result. That the second photon is polarized in direction a in case the first photon passes the filter a can be confirmed only after one knows at the second filter whether the first photon has passed and what the direction of the filter was. This information can be transmitted, according to all accumulated experimental evidence, with the speed of light at the fastest. The correlation of the results of the polarization measurements is caused by the joint preparation of the two photons as a pair in one atomic transition. This preparation works only if both photons are created at the same place. They move with the speed of light. Therefore, this preparation does not produce effects which propagate faster than light.
If one throws a coin repeatedly and sends the picture of the upper side to one recipient and the picture of the lower side to a second recipient, then each of them obtains pictures of heads and tails with equal probability. Each of them knows, the instant he opens his letter, the picture which the other receives. The knowledge of the result collapses the probability to a conditional probability, in this example to certainty. In the same way the reduction of state by the result of a measurement replaces the previous state by the conditional state which pertains to the conditional probability in the case of the measured result.

Before opening the letter, the receiver is uncertain about its content, but the content is not uncertain but only unknown. The content is certain, whether one opens the letter or not. The probability (1.9), however, contradicts the assumption that the results of the polarization measurements are determined for all directions and in each case and that one does not know the result prior to the measurement only because the individual causes are insufficiently known. Such an assumption seems to be irrefutable, but it leads to a mathematical restriction, an inequality, which may or may not be confirmed by experiment. The measured results violate the inequality and disprove the assumption.

To evaluate the assumption, we consider repeated measurements and enumerate them by i, i = 1, 2, . . . , N. We assume that in the measurement number i the result of the polarization measurement of the first photon in direction a is determined by some causes, even if we do not know them, and we attribute $a_{1i} = 1$ to the case that the first photon passes, and $a_{1i} = -1$ if it does not. With $b_{1i}$ we denote the result which in experiment number i would be obtained if the polarization of the first photon was measured in direction b. In the same way $c_{2i}$ and $d_{2i}$ denote the results if we measure the polarization of the second photon in experiment number i with a filter c or d. In all cases the results $a_{1i}$, $b_{1i}$, $c_{2i}$ and $d_{2i}$ can only take the values 1 or −1, and $c_{2i}$ can only be equal or opposite to $d_{2i}$. If $a_{1i}(c_{2i} + d_{2i})$ is ±2 then $b_{1i}(c_{2i} - d_{2i})$ vanishes, and vice versa. Therefore their sum never exceeds 2 [12],

$$ a_{1i} c_{2i} + a_{1i} d_{2i} + b_{1i} c_{2i} - b_{1i} d_{2i} \le 2 . \tag{1.13} $$

The average $\langle a_1 c_2 \rangle$ of the products $a_{1i} c_{2i}$ of the results in N experiments is the sum over the individual products divided by N,

$$ \langle a_1 c_2 \rangle = \frac{1}{N} \sum_{i=1}^{N} a_{1i} c_{2i} . \tag{1.14} $$

Correspondingly we obtain the averages $\langle a_1 d_2 \rangle$, $\langle b_1 c_2 \rangle$ and $\langle b_1 d_2 \rangle$. If we sum the inequality (1.13) over i and divide by N, we obtain a Bell inequality [4, page 19] for the averages of products of results of polarization measurements,

$$ \langle a_1 c_2 \rangle + \langle a_1 d_2 \rangle + \langle b_1 c_2 \rangle - \langle b_1 d_2 \rangle \le 2 . \tag{1.15} $$
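The step from predetermined results to the bound 2 can be checked by brute force: there are only 16 ways to assign the values ±1 to the four results of one experiment. The sketch below enumerates them and then averages over randomly drawn assignments; the random sample is an arbitrary illustration, not a model of any experiment.

```python
import random
from itertools import product


def bell_term(a, b, c, d):
    """Left-hand side of (1.13) for one experiment with predetermined results."""
    return a * c + a * d + b * c - b * d


# All 16 assignments of +1 or -1 to the four results
possible_values = {bell_term(*signs) for signs in product((1, -1), repeat=4)}
print(possible_values)

# Averaging over many experiments as in (1.14) can therefore never exceed 2
rng = random.Random(0)
sample = [tuple(rng.choice((1, -1)) for _ in range(4)) for _ in range(1000)]
average = sum(bell_term(*s) for s in sample) / len(sample)
print(average)
```

Each single term is either 2 or −2, so any average of such terms lies between −2 and 2, whatever the frequencies of the individual assignments are.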

We can evaluate the average of $a_{1i} c_{2i}$ also by counting the frequencies $N_+$ and $N_-$ with which each possible value +1 or −1 occurs. If we multiply each frequency with the corresponding value, we get $N_+ - N_- = \sum_{i=1}^{N} a_{1i} c_{2i}$ and the average $\langle a_1 c_2 \rangle = (N_+ - N_-)/N$. If N is sufficiently large, then the relative frequency $N_+/N$ is the probability for the cases where $a_{1i} c_{2i}$ has the value +1 and $N_-/N$ is the probability for the value −1. The probability for $a_{1i} c_{2i}$ to have the value +1 is $p(a, c) + p(a_\perp, c_\perp)$; with probability $p(a, c_\perp) + p(a_\perp, c)$ the product has the value −1. Therefore the experimental probability (1.9), which is also predicted by quantum theory, yields, with γ denoting the angle between a and c, the average

$$ \langle a_1 c_2 \rangle = \cos^2\gamma - \sin^2\gamma = \cos(2\gamma) . \tag{1.16} $$

It is given by the scalar product of unit vectors A and C which include double the angle that a and c include, $\cos(2\gamma) = A \cdot C$. For the same reason $\langle a_1 d_2 \rangle = A \cdot D$, $\langle b_1 c_2 \rangle = B \cdot C$ and $\langle b_1 d_2 \rangle = B \cdot D$ are scalar products of unit vectors with doubled angles. If one varies the vectors A and B, then the sum

$$ \langle a_1 c_2 \rangle + \langle a_1 d_2 \rangle + \langle b_1 c_2 \rangle - \langle b_1 d_2 \rangle = A \cdot C + A \cdot D + B \cdot C - B \cdot D \tag{1.17} $$

becomes maximal, if A has the direction of C + D and B the direction of C −√ D. Then the sum has the value |C + D| + |C − D| which takes its maximal value 2 2


if the unit vectors C and D are orthogonal. Therefore, if one takes c to include the angle π/8 = 22.5° with b, a an angle of π/4 = 45° and d an angle of 3π/8 = 67.5°, then the sum of the averages becomes maximal and has the value $2\sqrt{2}$. This value is confirmed by the measurements and disproves Bell's inequality (1.15).
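The two bounds can be checked numerically. The following sketch is an illustration only: the function names `chsh` and `correlation` are our own, and the angles are measured from the direction b. It enumerates all predetermined results ±1 for the combination in (1.13) and then evaluates the sum (1.17) with the averages cos(2γ) from (1.16).

```python
import itertools
import math

def chsh(a, b, c, d):
    """Combination a*c + a*d + b*c - b*d that appears in (1.13)."""
    return a * c + a * d + b * c - b * d

# Largest value for any assignment of predetermined results +1 or -1.
max_predetermined = max(
    chsh(*values) for values in itertools.product((1, -1), repeat=4)
)

def correlation(angle_1, angle_2):
    """Average of the product of results, cos(2*gamma) as in (1.16)."""
    return math.cos(2.0 * (angle_1 - angle_2))

# Filter directions in radians, measured from b (assumed reference direction).
b = 0.0
c = math.pi / 8
a = math.pi / 4
d = 3 * math.pi / 8

quantum_sum = (correlation(a, c) + correlation(a, d)
               + correlation(b, c) - correlation(b, d))

print(max_predetermined)            # 2
print(quantum_sum, 2 * math.sqrt(2))
```

The enumeration covers all sixteen combinations of predetermined results, so the bound 2 does not rely on any particular choice of values.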

Fig. 1.8 Directions of Polarization (directions a, b, c and d)

To derive the inequality we only assumed that the results $a_{1i}$, $b_{1i}$, $c_{2i}$ and $d_{2i}$ of each measurement i are certain and do not depend on the direction in which one chooses to measure the other photon. In real life [2] one can measure in each single case only in one direction, either in direction a or b and either in direction c or d. One has to determine $a_{1i}$ and $b_{1j}$ or $c_{2i}$ and $d_{2j}$ in different measurements $i \ne j$. A random generator chooses the directions of the measurements after the photons have left the source and so late that the choice cannot be known by signals with the speed of light at the time of measurement at the position of the other photon.

That the measured values of the product polarization do not satisfy the Bell inequality shatters the views which physicists have of the world. Measurements do not read off properties which determine the result, because then the results would have to satisfy Bell's inequality. Measurements ascertain results which previously were not certain. The measured violation of Bell's inequality refutes the evasion that each result was certain in each case and that merely its cause was insufficiently known. In quantum physics there is no cause for each single result but only causes for probabilities of results of measurements. This indeterminacy has nothing to do with the time evolution or with the interaction and occurs already if the measurement is performed immediately after the preparation of the quantum state.

Chapter 2

Time and Distance

Abstract An elementary geometric fact, stated as the intercept theorem, makes an observed clock run visibly slower if it moves away along the line of sight, and run visibly faster by the inverse factor if it approaches the observer with the same velocity. This Doppler effect of light in the vacuum is particularly simple because, different from the Doppler effect of sound, it depends only on the relative velocity of the light source and its observer. We employ a referee to determine whether moving clocks are equal and how the times between pairs of events compare. This time endows spacetime with a geometric structure, the distance, which is similar to but also different from Euclidean distance. From the Doppler effect we determine the addition of velocities, time dilation and length contraction and clarify the related paradoxes.

2.1 Theorem of Minkowski

Consider as in diagram 2.1 a clock C and an observer O with another clock. Both move uniformly along straight worldlines and meet in the event O, the origin, where their worldlines intersect. There both clocks are set to zero for simplicity, so that in the following we can speak of times rather than of time differences. When the observer O looks at the clock C, which moves away from him uniformly in the line of sight, then he reads the time $t_C$ which passed on C until the emission of the light. At the moment of observation his own clock shows a time $t_O$. This time of reception is proportional to the time of emission¹

$$t_O = \kappa(O, C)\, t_C \quad \text{for } t_C > 0 , \tag{2.1}$$

with a coefficient $\kappa(O, C)$ which does not depend on $t_C$ [9]: if the observer later reads the time $t_C'$, then the triangle $O\, t_C'\, t_O'$ is similar to $O\, t_C\, t_O$ and all distances are enlarged by the same factor. Therefore the ratios $t_O/t_C$ and $t_O'/t_C'$ coincide.

¹ κ and ν are the Greek letters kappa and nu.


If a quartz is carried along with the clock C and oscillates n times during the time $t_C$ with a frequency $\nu_C = n/t_C$, then the observer O sees these oscillations while on his own clock the time $t_O$ elapses. So he observes the frequency

$$\nu_O = \frac{1}{\kappa(O, C)}\, \nu_C . \tag{2.2}$$

Fig. 2.1 Intercept Theorem

This visible change of frequency of the clock which moves in the line of sight is the longitudinal Doppler effect. It is related to the Doppler effect of sound, which one can hear as the whining drop in pitch of passing police cars or racing cars. As one cannot distinguish rest from uniform motion, the Doppler factor $\kappa(O, C)$ only depends on the relative velocity of C and O and, contrary to the Doppler effect of sound, not on the velocity with respect to a medium.

Moreover, κ depends on whether both clocks run equally fast. For two clocks at rest this can be easily seen. For the moving clocks O and C this is more difficult. One has to correct for the various and changing times which it takes light to run from the clock to the observer who compares both clocks. However, no correction is necessary for a referee R as in diagram 2.2 who is always in the middle of the clocks. Flashes of light which he emits at some time to O and C are reflected and return in the same instant. Because the referee is always in the middle, the runtimes of light to and from O and C are equal. Both clocks run the same if they show the referee equal times,

$$\tau' = \tau . \tag{2.3}$$

Fig. 2.2 Comparison of Clocks

This is the geometric definition of equal lengths on intersecting straight worldlines of moving observers. Without exception the definition agrees with the physical behavior of real equal clocks.

We continue the worldlines of the light rays, which are received and reflected by C as it shows the time $\tau'$, to the worldline of the observer O and denote in figure 2.3 with $t_-$ and $t_+$ the times shown by the clock of O as he emits the light ray to C and receives it, respectively. Due to (2.1) the clock O shows the time

$$\tau = \kappa(O, R)\, \kappa(R, O)\, t_- \tag{2.4}$$

when the light ray emitted at time $t_-$ and reflected by R arrives. This is because τ is a multiple of the time at which the light ray was reflected by R, and this time is a multiple of the time $t_-$ at which the light ray was emitted by O. For the same reason

$$t_+ = \kappa(O, R)\, \kappa(R, O)\, \tau . \tag{2.5}$$

Thus, $t_+/\tau = \tau/t_-$ and $\tau^2 = t_+ t_- = t^2 - r^2$ (1.6). Moreover, the equal clocks show equal times, $\tau' = \tau$. This proves the

Theorem of Minkowski: Let two observers O and C move linearly and uniformly and meet in some event O, when they set their equal clocks to zero. Then the time τ, which elapses on the clock C until an event E, is the geometric mean of the time $t_-$ shown by the clock of the observer O when he emits light to E and the time $t_+$ which his clock shows when he receives light from E,

$$\tau^2 = t_+ t_- = t^2 - r^2 . \tag{2.6}$$
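As a small numerical illustration of (2.6), the following sketch takes an event on the clock C and computes the emission and reception times on the clock of O from $t_\pm = t \pm r$ (1.6). The function names and the chosen values v = 0.6 and t = 5 are our own assumptions; units are such that c = 1.

```python
import math

def light_times(t, r):
    """Times on O's clock at which light to an event at time t and
    distance r is emitted (t_minus) and received back (t_plus)."""
    return t - r, t + r

def proper_time(t_minus, t_plus):
    """Time shown by the clock C at the event, geometric mean as in (2.6)."""
    return math.sqrt(t_plus * t_minus)

v = 0.6        # assumed velocity of C relative to O
t = 5.0        # time which O attributes to the event on C
r = v * t      # distance of C from O at that time

t_minus, t_plus = light_times(t, r)
tau = proper_time(t_minus, t_plus)

print(t_minus, t_plus, tau)            # tau is 4 for these values
print(t_plus / tau, tau / t_minus)     # both ratios equal the Doppler factor
```

Both printed ratios coincide, which is the statement $t_+/\tau = \tau/t_-$ that leads to the theorem.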

Fig. 2.3 Theorem of Minkowski

This relation is as fundamental for the geometry of spacetime as the Pythagorean theorem $c^2 = a^2 + b^2$ for Euclidean geometry.² According to the Pythagorean theorem in Euclidean geometry all points on a circle are equally far away from the center. By $\tau^2 = t^2 - r^2$ all spacetime points of equal temporal distance to the origin lie on hyperbolas.

Three Equal Clocks

That equal clocks show their referee equal time differences is consistent: if the clock O1 equals the clock O2 and if O2 equals O3 then O1 equals O3. If the clocks move in the same direction and meet in a common event, then the relation

$$t^4 = t_+^2\, t_-^2 = t_{++}\, t_{+-}\, t_{-+}\, t_{--} \tag{2.7}$$

holds. As in (2.4) one has

$$t_{-+} = \kappa(O_1, O_2)\, \kappa(O_2, O_1)\, t_{--} \quad \text{and} \quad t_{++} = \kappa(O_1, O_2)\, \kappa(O_2, O_1)\, t_{+-} . \tag{2.8}$$

Therefore $t^4 = t_{++}^2\, t_{--}^2$ and the clock of O3 equals the clock of O1. If two clocks equal a third then they equal each other. This holds also if the worldlines of the clocks do not lie in a plane or do not intersect, because the worldlines can be translated and rotated without changing the clocks.

Fig. 2.4 Three Equal Clocks

² The temporal distance, however, is inappropriate for the topology of spacetime as each neighbourhood $U_\varepsilon(x) = \{z : |(x - z)^2| < \varepsilon\}$, $\varepsilon > 0$, has a nonempty intersection with each $U_{\varepsilon'}(y)$.

Construction of the Referee

To construct the worldline of the referee between two observers O and C one draws the light rays through a point $\tau'$ on one worldline. They intersect the other worldline in $t_+$ and $t_-$ and determine the geometric mean τ of $t_+$ and $t_-$. The worldline of the referee is the straight line through the intersection of the light rays through τ and $\tau'$. This worldline also passes the origin O, as τ is the geometric mean of $t_+$ and $t_-$.

Fig. 2.5 Geometric Mean

The geometric mean $\sqrt{t_+ t_-}$ is constructed in Euclidean geometry with the help of a circle with a diameter which consists of the line segments $t_+$ and $t_-$. Its radius is the arithmetic mean $t = (t_+ + t_-)/2$; the line segment $t_+$ is longer by $r = (t_+ - t_-)/2$, $t_+ = t + r$, and the line segment $t_-$ is shorter, $t_- = t - r$. The orthogonal line through the endpoint of $t_+$ cuts the circle with a segment of length $\tau = \sqrt{t^2 - r^2} = \sqrt{t_+ t_-}$.


2.2 Addition of Velocities

In diagram 2.3 the Doppler factor $\tau'/t_- = \kappa(C, O)$ is the ratio of the time of reception to the time of emission (2.1) of light rays sent from the observer O to C, and $t_+/\tau' = \kappa(O, C)$ is the ratio for the way back. Both clocks are equal, $\tau = \tau'$. Therefore (2.4) and (2.5) state that the Doppler factor $\kappa(O, C)$, by which O sees frequencies of C shifted, equals the Doppler factor $\kappa(C, O)$, by which C perceives shifted frequencies of O,

$$\kappa(C, O) = \kappa(O, C) . \tag{2.9}$$

On motion in the line of sight the Doppler shift is reciprocal. This reciprocity alone, $t_+ = \kappa \tau$ together with $\tau = \kappa t_-$, implies Minkowski's theorem and the dependence of the Doppler factor on the relative velocity,

$$\kappa^2 = \frac{t_+}{t_-} , \qquad \tau^2 = t_+ t_- . \tag{2.10}$$

By $t_+ = t + r$ and $t_- = t - r$ (1.6) this implies

$$\kappa^2 = \frac{t + r}{t - r} = \frac{1 + r/t}{1 - r/t} , \qquad \tau^2 = t^2 - r^2 = \Bigl(1 - \frac{r^2}{t^2}\Bigr) t^2 , \tag{2.11}$$

and, because $v = r/t$ is the velocity with which the clock C moves away from the observer O,

$$\kappa(v) = \sqrt{\frac{1 + v}{1 - v}} = \frac{1 + v}{\sqrt{1 - v^2}} , \tag{2.12}$$

$$v = \frac{\kappa^2 - 1}{\kappa^2 + 1} , \tag{2.13}$$

$$\tau = \sqrt{1 - v^2}\; t . \tag{2.14}$$

If the observer O emits a pulse of light at a time $t_E < 0$, while the clock moves towards him (recedes with negative velocity), then, as diagram 2.6 shows, the ratio $\kappa(-v) = t_R/t_E$ of the times of reception and emission is the inverse of the ratio of the times which the clocks show later, when they move away from each other,

$$\kappa(-v) = \frac{t_R}{t_E} = \frac{t_C}{t_O} = \frac{1}{\kappa(v)} . \tag{2.15}$$
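The relations (2.12), (2.13) and (2.15) are easily checked with a few lines of code. This is a minimal sketch in units with c = 1; the function names `doppler_factor` and `velocity_from_doppler` are our own.

```python
import math

def doppler_factor(v):
    """Doppler factor kappa(v) of (2.12); v > 0 means recession."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def velocity_from_doppler(kappa):
    """Relative velocity from a measured Doppler factor, as in (2.13)."""
    return (kappa**2 - 1.0) / (kappa**2 + 1.0)

v = 0.6
kappa = doppler_factor(v)

print(kappa)                         # close to 2
print(velocity_from_doppler(kappa))  # recovers v
print(doppler_factor(-v) * kappa)    # close to 1: approach inverts the factor
```

The last line illustrates (2.15): the factor for approach with velocity −v is the inverse of the factor for recession with v.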

A clock which moves away from an observer appears slower, because it shows him the time $t_C = t_O/\kappa$ when his own and equal clock shows $t_O$, and $\kappa(v)$ is larger than 1 for positive velocity v > 0. On motion in the line of sight an approaching clock appears faster, because during the approach the Doppler factor is inverse to the Doppler factor during recession.

With (2.13) one can determine the velocity v (as is routinely done in traffic supervision) by measuring the Doppler shift κ. It retains its value if one exchanges observer and observed object. Therefore, two observers who move in the line of sight measure the same relative velocity.

Fig. 2.6 Towards and Away

Fig. 2.7 Addition of Velocities

If three observers, O1, O2 and O3, move in the same direction and register the times on their clocks at which a light pulse passes, then these times are proportional,

$$t_2 = \kappa_{21} t_1 , \qquad t_3 = \kappa_{32} t_2 , \qquad t_3 = \kappa_{31} t_1 . \tag{2.16}$$

From $\kappa_{31} t_1 = \kappa_{32} \kappa_{21} t_1$ one immediately concludes

$$\kappa_{31} = \kappa_{32}\, \kappa_{21} . \tag{2.17}$$

The Doppler factor $\kappa_{31}$, by which O3 sees his clock run faster than the clock of O1, is the product of the Doppler factor $\kappa_{32}$, by which O3 observes his clock run faster than the clock of O2, with the Doppler factor $\kappa_{21}$ for the observer O2 and the clock of O1. In terms of velocities (2.12) and squared this means (in units with c = 1)

$$\frac{1 + v_{31}}{1 - v_{31}} = \frac{1 + v_{32}}{1 - v_{32}} \cdot \frac{1 + v_{21}}{1 - v_{21}} \tag{2.18}$$

or, solved for $v_{31}$,

$$v_{31} = \frac{v_{32} + v_{21}}{1 + v_{32} v_{21}} . \tag{2.19}$$

The velocity $v_{31}$, with which O3 sees the observer O1 recede, is not the sum $v_{32} + v_{21}$ of the velocity $v_{32}$, with which O3 observes O2 recede, and the velocity $v_{21}$, with which O2 perceives the recession of O1. The naive addition of velocities is only approximately correct as long as, as in ordinary life, $v_{32}$ and $v_{21}$ are small compared to the speed of light, c = 1.

Up to the sign in the denominator velocities add like inclinations. If the bed of a tipper lorry is inclined by an angle α, then on even ground it has the slope $m_1 = \tan\alpha$. If the truck drives a street with slope $m_2 = \tan\beta$ then its bed has an overall angle α + β to the horizontal and the overall inclination

$$m_3 = \frac{\sin(\alpha + \beta)}{\cos(\alpha + \beta)} = \frac{\cos\alpha \sin\beta + \sin\alpha \cos\beta}{\cos\alpha \cos\beta - \sin\alpha \sin\beta} = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha \tan\beta} = \frac{m_1 + m_2}{1 - m_1 m_2} . \tag{2.20}$$

We define the rapidity σ as the logarithm of the Doppler factor κ,

$$\sigma = \ln \kappa = \frac{1}{2} \ln \frac{1 + v}{1 - v} , \qquad v = \frac{e^{\sigma} - e^{-\sigma}}{e^{\sigma} + e^{-\sigma}} = \tanh \sigma . \tag{2.21}$$

To the addition of rapidities σ there corresponds the multiplication of the Doppler factors, $\kappa = e^{\sigma}$. These rapidities, not the velocities, add on motion of several observers in the same direction.
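The equivalence of the three descriptions (2.17), (2.19) and (2.21) can be illustrated numerically. The sketch below uses our own function names and the example velocities 0.5 and 0.5 in units with c = 1.

```python
import math

def doppler_factor(v):
    """kappa(v) of (2.12)."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def add_velocities(v32, v21):
    """Relativistic addition of collinear velocities (2.19)."""
    return (v32 + v21) / (1.0 + v32 * v21)

def rapidity(v):
    """sigma = ln kappa = atanh(v), see (2.21)."""
    return math.atanh(v)

v32, v21 = 0.5, 0.5
v31 = add_velocities(v32, v21)

print(v31)                                              # 0.8, not 1.0
print(doppler_factor(v32) * doppler_factor(v21),
      doppler_factor(v31))                              # Doppler factors multiply
print(math.tanh(rapidity(v32) + rapidity(v21)))         # rapidities add
```

Two velocities of half the speed of light combine to 0.8, and the same value results from multiplying Doppler factors or from adding rapidities.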

2.3 Time Dilation

If the time t elapses on a clock between two events O and E, then on a second equal clock, which moves relative to the first one with a velocity v, the shorter time (2.14)

$$\tau = \sqrt{1 - v^2}\; t \tag{2.22}$$

goes by between the corresponding events which are simultaneous to O and E for the first clock.

Time dilation is reciprocal. This can be deduced from diagram 2.8 where we have continued the light rays of diagram 2.3 to the worldlines of the observers O and C. The events are denoted by the times on the clocks which the observers carry along. All the time the referee R is in the middle of O and C and sees both clocks show equal times. Therefore the times $t_-$ and $t_-'$ coincide as do τ and $\tau'$ and also $t_+$ and $t_+'$, because light from each pair of events in which the clocks show these times reaches the referee in the same instant.

Fig. 2.8 Reciprocal Dilation of Time

For the observer O the event $E'$, in which the moving clock C shows the time $\tau' = \tau$, is simultaneous to the event in which his own clock shows the arithmetic mean $t = (t_+ + t_-)/2$ of the time $t_-$, which it shows when light to $E'$ starts, and the time $t_+$, at which the reflected light returns. So for O the event $E'$ occurs at time t, but the moving clock shows less time, $\tau = \sqrt{t_+ t_-} = \sqrt{1 - v^2}\, t$ (2.14), which is smaller than the arithmetic mean t (given that the velocity v does not vanish).

For C the event in which his clock shows the time $t' = (t_+' + t_-')/2 = t$ is simultaneous to the event E, when the clock of O, which moves with respect to C, shows the time $\tau = \sqrt{t_+' t_-'} = \sqrt{1 - v^2}\, t$. So for C the clock of O runs slower just as well.

Time dilation is reciprocal because the observers do not agree on which events are simultaneous. In Euclidean geometry the corresponding fact is commonplace: if you look in horizontal direction from a lighthouse at sea level to a second lighthouse of identical construction, also at sea level, some miles away, then the other lighthouse does not reach the height of the first one because the surface of the earth is curved. Height depends on which direction is horizontal and the horizontal directions of both lighthouses do not coincide.

Twin Paradox

Reciprocal time dilation appears to be contradictory if, for example, one considers twins. The first twin, the traveller T, departs in an event A with a velocity v for Mars and turns back with velocity $v'$ after the arrival in event M. The other twin, the stay-at-home S, waits calmly the time $t + t'$ until the return of his brother. Which twin, if any, is younger in the end E? For each twin, the other has moved. Does this imply the contradiction that each twin has aged less than the other? It is often tacitly insinuated that the observations of both twins agree up to the short acceleration at Mars and that from their observations one cannot distinguish the traveller from the stay-at-home. This is wrong.


Each twin sees the other redshifted during the travel to Mars and blueshifted on the way back. In the first period each twin sees the clock of his brother run slower, in the second faster, than his own clock by a Doppler factor which agrees with the Doppler factor of his brother. But the stay-at-home sees the traveller redshifted (and age slower) for a longer time and blueshifted (and age faster) for a shorter time than the traveller sees the stay-at-home. Both twins see the stay-at-home age more than the traveller.

Fig. 2.9 Twins

On arrival at Mars M the traveller T sees his clock show the travel time τ and a redshifted light ray from the stay-at-home S show the time $t_-$, which has elapsed on the clock of S since the start A. The travel time is larger by a Doppler factor κ (2.1), $\tau = \kappa t_-$. During the return trip the traveller observes on his clock the time $\tau'$ go by while blueshifted light shows him that the time $t + t' - t_-$ passes on the clock of the stay-at-home until the end E, $\tau' = \kappa' (t + t' - t_-)$. Altogether, he sees the stay-at-home age by

$$t + t' = \frac{\tau}{\kappa} + \frac{\tau'}{\kappa'} \tag{2.23}$$

while he has grown older by $\tau + \tau'$.

The stay-at-home sees the traveller redshifted during the time $t_+ = \kappa \tau$, which is longer than τ, and blueshifted during the rest of the waiting time $t + t' - t_+ = \kappa' \tau'$, which is shorter than $\tau'$. Altogether, S ages by

$$t + t' = \kappa \tau + \kappa' \tau' \tag{2.24}$$

while he observes the traveller grow older by $\tau + \tau'$. This agrees with (2.23) as one confirms with (2.12, 2.14) and with the relation $v t + v' t' = 0$ that the traveller returns. Both equations imply

$$t + t' = \frac{\tau}{2} \Bigl(\frac{1}{\kappa} + \kappa\Bigr) + \frac{\tau'}{2} \Bigl(\frac{1}{\kappa'} + \kappa'\Bigr) = \frac{\tau}{\sqrt{1 - v^2}} + \frac{\tau'}{\sqrt{1 - v'^2}} . \tag{2.25}$$

The waiting time $t + t'$ is longer than the travel time $\tau + \tau'$. “Who rests, rusts” and “Travelling keeps young” correctly state the relativistic effect. The worldline of the traveller differs from the one of the stay-at-home by the acceleration on arrival at Mars M. Such an acceleration is necessary if the second worldline through the two events A and E is to differ from the straight worldline of the first twin, because in flat spacetime there is only one straight line through two points.
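A worked example makes the bookkeeping of (2.23) and (2.24) concrete. In this sketch the numbers are assumed for illustration: the traveller moves outwards with v = 0.8 for t = 3 (as measured by the stay-at-home) and returns with v' = −0.6, so that the return condition v t + v' t' = 0 fixes t' = 4. Units are such that c = 1 and the function names are our own.

```python
import math

def doppler_factor(v):
    """kappa(v) of (2.12); negative v describes approach."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def dilation(v):
    """Factor sqrt(1 - v^2) of (2.14)."""
    return math.sqrt(1.0 - v * v)

v, t = 0.8, 3.0              # outward trip, as timed by the stay-at-home
v_back = -0.6                # return velocity (assumed value)
t_back = -v * t / v_back     # from v*t + v'*t' = 0

tau = dilation(v) * t                   # traveller's time to Mars
tau_back = dilation(v_back) * t_back    # traveller's time for the return
kappa, kappa_back = doppler_factor(v), doppler_factor(v_back)

seen_by_traveller = tau / kappa + tau_back / kappa_back      # (2.23)
seen_by_stay_at_home = kappa * tau + kappa_back * tau_back   # (2.24)
waiting_time = t + t_back
travel_time = tau + tau_back

print(waiting_time, seen_by_traveller, seen_by_stay_at_home)
print(travel_time)
```

With these values both twins account for the same waiting time of 7, while the traveller has aged by 5 only.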

Fig. 2.10 Equal Phases of Acceleration

But the traveller does not become younger during the acceleration. Even if both twins undergo identical phases of acceleration they can age differently, as diagram 2.10 shows. There both twins travel together until event A where the stay-at-home S brakes. The traveller T brakes later to reach M, where he accelerates to return. After a fitting waiting time the stay-at-home accelerates in exactly the same way and joins the traveller in E, from where both continue their joint flight. Between A and E the twins age differently though their acceleration consisted of equal phases. During these phases they age the same, but the remaining pieces of their worldlines constitute the sides of the triangle AME in diagram 2.9 and there S ages more.

Time is what clocks show. The clocks of the twins show different times on return. Therefore, time between two events does not only depend on these events but also on the worldline which the clock passes in between; just as in Euclidean geometry the path length between two points of a curve depends on the path which connects both points. Clocks are like mileage counters.

In a spacetime diagram the different aging of the twins is as paradoxical as in Euclidean geometry the statement that in a triangle each side is shorter than the sum of the other two sides. In order to understand triangles one does not need differential geometry of curved spaces, even if one deals with circles and corners, i.e. with curved trajectories. Similarly, the general theory of relativity is not needed for the solution of the twin paradox. It can be used, but gives the same explanation and the same answer as the special theory of relativity: between every two sufficiently adjacent events on the worldline of every free-falling observer there elapses more time than on all other timelike worldlines connecting these two events.

If the two events are not sufficiently adjacent, then gravity can cause the complication that different worldlines of free-falling observers connect these events and that on these worldlines different times go by, even though none of the observers has experienced a perceptible acceleration. For instance, a space station may orbit the earth in free fall and a second station launched vertically from the earth may fly past the first in free fall during the motion upwards. If the apogee of the second space station is suitably chosen, it can meet the first space station again on the way downwards after the first station has orbited the earth. During the vertical fall more time has elapsed between the two encounters than in the space station orbiting the earth.

The different aging of the twins can be measured with atomic clocks flying around the earth [24] such that for one twin his velocity adds to the revolution of the earth and subtracts for the other twin. In addition, the gravity on ground and during the flight differs and influences the clocks, just as gravity and motion influence the clocks of GPS satellites. There these relativistic effects are routinely accounted for.

Luckily, the precision of clocks enabled the measurement of the relativistic effects only after the gravitational influence was understood. Clocks at sea level, which are carried along with the rotating earth, run equally fast. The rotation does not only lead to different velocities, which depend on the latitude, but also to a flattening of the globe, such that the clocks which move faster are further away from the center. Taken together, the different gravity and the different velocity compensate their effects on the clocks at sea level exactly.

2.4 Length Contraction

Two moving measuring rods have the same length if they are equally long for a referee R who, just as in the left diagram 2.11, is always in their middle [41]. The beginning of each rod traverses the worldline of the corresponding observer C and O, the end traverses a parallel worldline. As the referee R confirms, the rods of C and O have the same length, because in the events τ and $\tau'$, which are simultaneous for him and equally far away, both ends of both rods coincide.

A moving rod is shorter than an equal rod at rest by the same factor $\sqrt{1 - v^2}$ by which a moving clock runs slower than an equal clock at rest. This can be deduced from the middle diagram of 2.11. There we have omitted all auxiliary lines and shown the segment from t to $\tau'$ which consists of events which occur simultaneously for the observer C. At this moment, his measuring rod extends from t to $\tau'$ and the right ends of both rods coincide. The moving rod is shorter; its left end intersects the line segment from t to $\tau'$ in the event q. The triangles $t\,O\,\tau'$ and $t\,\tau\,q$ are similar, therefore the length $l_v$ of the segment $\tau' q$ relates to the length l of the segment $\tau' t$ as the length of $O\tau$ to the length of $Ot$. But $\tau = \sqrt{1 - v^2}\, t$ is the length of $O\tau$ and t the length of $Ot$. Therefore, a measuring rod which moves uniformly with a velocity v has the shorter length

$$l_v = \sqrt{1 - v^2}\; l , \tag{2.26}$$

if compared to an equal measuring rod of length l at rest.

As the right diagram in 2.11 shows, length contraction is reciprocal. For the observer O the events τ and $t'$ are simultaneous and the measuring rod of C is shorter.

Fig. 2.11 Contraction of Moving Rods

Equally Accelerated Rockets

We consider two rockets which we idealize as points. Initially they rest at a distance L, later they are accelerated in an equal way such that, as in diagram 2.12, their worldlines are related by a translation by L. For an observer at rest both rockets have a distance L at all times. After the acceleration the rockets follow straight worldlines with velocity v. If the crews of the rockets then measure the mutual distance with measuring rods which they carry along, they obtain some value l. For the observer at rest, this rod is moving and contracted and has length $L = \sqrt{1 - v^2}\, l$, because it reaches from one rocket to the other. So l is larger than L.

A rope as considered in [5, chapter 9], initially spanned between the rockets and stretched to rupture, snaps immediately if the rockets and the rope are accelerated equally. This is also what the crews of both rockets observe. For them the rocket in front reaches the final velocity earlier and veers away from the rear rocket.

Fig. 2.12 Accelerated Rockets

If one wants to accelerate the constituents of the rope, which rest initially until a time t = 0, to a velocity v such that their distances, as seen by the constituents, remain unchanged, then one has to accelerate the pieces in the rear more but for a shorter time than the pieces in front, such that all points at r, 0 ≤ r ≤ L, traverse worldlines $x(t) = \sqrt{(r + R)^2 + t^2} - R$ during the times $0 \le t \le v\,(r + R)/\sqrt{1 - v^2}$ and move straight and uniformly afterwards. Here 1/R is the acceleration of the last point in the rear.
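That each piece of the rope indeed reaches the velocity v at the end of its phase of acceleration can be checked with a short sketch. The function names and the values R = 1, L = 1 and v = 0.6 are assumptions chosen for illustration; the velocity is the derivative dx/dt of the worldline given above.

```python
import math

def worldline(t, r, R):
    """Position x(t) of the rope point which starts at rest at x = r."""
    return math.sqrt((r + R)**2 + t**2) - R

def velocity(t, r, R):
    """dx/dt along the worldline above."""
    return t / math.sqrt((r + R)**2 + t**2)

def end_of_acceleration(v, r, R):
    """Time t at which the point starting at r stops accelerating."""
    return v * (r + R) / math.sqrt(1.0 - v**2)

R, L, v = 1.0, 1.0, 0.6                 # assumed example values
points = [0.0, 0.25 * L, 0.5 * L, L]    # from the rear (r = 0) to the front

end_times = [end_of_acceleration(v, r, R) for r in points]
final_velocities = [velocity(t_end, r, R) for t_end, r in zip(end_times, points)]

print(end_times)          # increase from rear to front
print(final_velocities)   # all close to v
```

The end times grow with r, so the rear point, which accelerates most strongly, finishes first, in agreement with the statement that the pieces in the rear are accelerated more but for a shorter time.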

Length Paradox

Just as time dilation leads to the twin paradox, length contraction seemingly leads to a contradiction if one considers whether a car with high speed fits into a garage of equal length. For the owner of the garage it is at rest and the moving car is shorter, therefore the car fits into the garage. Seen from the driver, however, the garage is shorter and does not fit the car.

The situation is depicted in the spacetime diagram 2.11, where C and the parallel worldline represent the owner at the gate and the rear wall of the garage while O and the parallel worldline correspond to the front and rear fender of the car. Consider a red flash of light, emitted by a photo sensor in the event $\tau'$, when the front fender hits the garage wall, and a green flash of light, which is emitted in the event τ, when the rear fender passes the gate. The referee sees both flashes in the same instant and confirms that the garage and the car are equally long, because he is in the middle of the gate and the wall and the runtimes of light from τ and $\tau'$ to him are equal.

The owner of the garage C observes the red flash from $\tau'$ after the green one. If he accounts for the runtime of light, he concludes that the front fender had hit the garage wall in the event $\tau'$ at the time t after the event τ, in which the rear fender passed the gate. For him, the car had fitted into the garage at time t; the car was shorter.

The driver O sees the green flash of light from the rear of his car τ later than the red flash. If he accounts for the runtime of light he concludes that the green flash τ had been emitted at the time $t'$ after the red flash. For him, the front fender had hit the wall before the rear fender had passed the gate. So he concludes that the garage is shorter than the car.

This is not a contradiction and not a paradox. Observers who move relative to each other do not have to agree on the order of events which are not cause and effect, as in the case under consideration. The passage of the rear fender through the gate does not cause the crash of the front fender on the wall and vice versa.

Both observers agree that a fast, slim car can pass a slim garage of equal length if the car in addition has some transverse velocity, just as one can thread long yarn through the narrow eye of a needle. With some transverse velocity of the car the worldlines of the front and rear fender no longer lie in the plane of the diagram 2.11. They can intersect the plane in the events q and τ. For the garage owner these events are simultaneous; before them the car was on one side and afterwards it is on the other: the car has passed the garage, which is longer than the car. Also the driver observes his car pass the garage, though it is shorter than his car. He first drives around the wall with his front fender and later passes the gate with his rear. Whether a fast car fits through a garage does not only depend on the length but also on the temporal sequence of the events, just as it depends on the direction of a long ladder whether it fits through a low door.

2.5 Doppler Effect

If a clock C moves with a velocity v in direction e at an angle θ to the line of sight, then its distance to an observer O changes by $dr = v \cos\theta\, dt$ during the short time dt. The changed distance causes a changed runtime of light, and light rays l and l from two events on the clock which started with a time difference dt reach the observer (in units with c = 1) with a time difference

$$\tau_O = dt + v \cos\theta\, dt . \tag{2.27}$$

In this consideration we use times, velocities and angles as determined by the observer O.

On the moving clock C the time $\tau_C = \sqrt{1 - v^2}\, dt$ elapses between the emission of the two flashes of light. This follows from (2.14), because spacetime is homogeneous and time flows between the origin (0, 0, 0, 0) and (t, x, y, z) the same as between $(t_0, x_0, y_0, z_0)$ and $(t_0 + dt,\, x_0 + v_x dt,\, y_0 + v_y dt,\, z_0 + v_z dt)$. Consequently the observer O sees the time

$$\tau_C = \frac{\sqrt{1 - v^2}}{1 + v \cos\theta}\, \tau_O \tag{2.28}$$

pass by on the moving clock while on his own, equal clock the time $\tau_O$ passes. Eq. (2.12), $\tau_O = \kappa \tau_C$, is the special case in which the clock recedes in the line of sight with cos θ = 1.³

Fig. 2.13 Doppler Effect

If an oscillator is carried along with the clock and oscillates n times with a frequency $\nu_C = n/\tau_C$, the observer sees these n oscillations while the time $\tau_O$ passes on his own clock. He observes the frequency $\nu_O = n/\tau_O$,

$$\nu_O = \frac{\sqrt{1 - v^2}}{1 + v \cos\theta}\, \nu_C . \tag{2.29}$$

If $v \cos\theta > \sqrt{1 - v^2} - 1$, then the clock is seen slower and the frequency of light from the clock is shifted to smaller values of red light, it is redshifted. Otherwise, if $v \cos\theta < \sqrt{1 - v^2} - 1$ and the clock moves towards the observer, then it appears faster and its light is blueshifted. This change of the perceived frequency is the Doppler effect. It is commonly used to measure velocities.

On motion crosswise to the line of sight, cos θ = 0, the transversal Doppler effect $\tau_C = \sqrt{1 - v^2}\, \tau_O$ directly shows the slowdown of moving clocks, because the distance between source and observer just does not change.

The Doppler shift is usually time dependent (because the direction changes) and reciprocal only for motion in the line of sight. If the observer O sends two flashes of light with a delay of $dt = \hat\tau_O$ to the clock then the second flash reaches the clock later by $dt' = dt + v \cos\theta\, dt'$, that is $dt' = \hat\tau_O/(1 - v \cos\theta)$. During this interval the time $\hat\tau_C = \sqrt{1 - v^2}\, dt'$ elapses on the moving clock. Seen from the clock, frequencies from O are shifted to

$$\hat\nu_C = \frac{1 - v \cos\theta}{\sqrt{1 - v^2}}\, \hat\nu_O . \tag{2.30}$$

This agrees with $\hat\nu_C = \sqrt{1 - v^2}/(1 + v \cos\theta')\; \hat\nu_O$ (2.29) because $\theta'$ is the angle to the line of sight, changed by aberration (3.21), with which C sees O move.

³ Diagram 2.13 depicts the worldlines of the observer O and the clock C in a plane. However, we consider the general case in which the worldline of the observer is parallel to the plane and does not intersect the worldline of the clock. Note that in spacetime diagrams the frequency of light is not a property of a light ray but pertains to the distance of two events on parallel light rays.
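The special cases of (2.29) mentioned in the text can be evaluated directly. The following sketch uses our own function name and the example velocity v = 0.6 in units with c = 1; the angle enters only through cos θ.

```python
import math

def frequency_ratio(v, cos_theta):
    """Observed over emitted frequency, nu_O / nu_C, as in (2.29)."""
    return math.sqrt(1.0 - v**2) / (1.0 + v * cos_theta)

def doppler_factor(v):
    """kappa(v) of (2.12)."""
    return math.sqrt((1.0 + v) / (1.0 - v))

v = 0.6

receding = frequency_ratio(v, 1.0)       # equals 1 / kappa(v)
transverse = frequency_ratio(v, 0.0)     # equals sqrt(1 - v^2)
approaching = frequency_ratio(v, -1.0)   # equals kappa(v)

# Direction which separates redshift from blueshift:
# v * cos(theta) = sqrt(1 - v^2) - 1.
cos_neutral = (math.sqrt(1.0 - v**2) - 1.0) / v
neutral = frequency_ratio(v, cos_neutral)

print(receding, transverse, approaching, neutral)
```

For v = 0.6 the neutral direction has cos θ = −1/3: the clock must already approach slightly for the shortened light path to compensate the slowdown of the moving clock.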

2.6 Apparent Superluminal Velocity

A jet of gas streams out of the quasar 3C 273 with a measurable angular velocity [32, chapter 11]. If one multiplies the observed angular velocity with the known distance, one obtains a velocity of seven times the speed of light for the crosswise motion. The quasar seems to emit particles with superluminal velocity.

This conclusion is wrong: the product of the distance with the observed angular velocity is not the velocity transverse to the line of sight. The clock C in diagram 2.13 moves by $v\sin\theta\,dt = r\,d\theta$ within the short time $dt$ transverse to the line of sight, where $r$ denotes its present distance. The two flashes of light reach the observer with a difference angle $d\theta$ and a time difference $\tau_O = dt + v\cos\theta\,dt$, because the second flash starts later by $dt$ and has to pass a distance which is larger by $dr = v\cos\theta\,dt$. So the observed angular velocity $\omega_O = d\theta/\tau_O$ and the apparent transverse velocity $u = r\,\omega_O$ are

\[
\omega_O = \frac{v\sin\theta}{r\,(1+v\cos\theta)} , \qquad
u = \frac{v\sin\theta}{1+v\cos\theta} . \tag{2.31}
\]

This velocity $u$ becomes maximal for the angle $\cos\theta = -v$ between the direction of motion and the line of sight and in this case has the value $v/\sqrt{1-v^2}$. This value can be arbitrarily large though $|v|$ is smaller than $c = 1$, the speed of light.
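As an illustration, the following sketch evaluates (2.31) and compares a brute-force scan over the angle with the closed-form maximum. The value $v = 0.99$ is an arbitrary assumption for the example, not a measured jet speed, and the function names are hypothetical.

```python
import math

def apparent_transverse_velocity(v, theta):
    """Apparent transverse velocity u from (2.31), in units of c."""
    return v * math.sin(theta) / (1.0 + v * math.cos(theta))

def max_apparent_velocity(v):
    """Closed-form maximum v / sqrt(1 - v^2), reached at cos(theta) = -v."""
    return v / math.sqrt(1.0 - v * v)

v = 0.99                                  # assumed speed of the emitter, below c = 1
theta_max = math.acos(-v)                 # angle at which u is maximal
u_at_theta_max = apparent_transverse_velocity(v, theta_max)

# Brute-force scan over 0 < theta < pi as an independent check of the maximum.
thetas = [i * math.pi / 10000 for i in range(1, 10000)]
u_scan = max(apparent_transverse_velocity(v, th) for th in thetas)

print(u_at_theta_max, max_apparent_velocity(v), u_scan)
```

With this assumed speed the apparent velocity exceeds seven times the speed of light although the emitter itself is slower than light.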

2.7 Spacetime Coordinates

We can denote the events E in spacetime simply by the values $(t_+, t_-, \theta, \varphi)$, the light coordinates of E, which an observer O determines as he sends light to E and receives it from E. He reads the times of emission, $t_-$, and reception, $t_+$, from his clock and determines the direction of the outgoing light ray by means of, say, two angles $\theta$ and $\varphi$.

Primarily, coordinates only have to denote the events uniquely, at least in some range of their values. Other coordinates, which are invertible functions of the light coordinates, are equally conceivable. In particular, light coordinates are related in a simple way to inertial coordinates $(t, x, y, z)$, in which particles that move uniformly on straight lines traverse straight coordinate lines. The time $t$ and the distance $r = \sqrt{x^2 + y^2 + z^2}$, at which the event E occurs, are the arithmetic mean and half the difference of the light coordinates $t_+$ and $t_-$

(1.4, 1.5),

\[
t = \frac{t_+ + t_-}{2} , \qquad r = \frac{t_+ - t_-}{2} . \tag{2.32}
\]

The direction of the outgoing light ray from O to the event E is the incident direction of the light ray from E because the observer does not rotate meanwhile but uses reference directions that do not change in time.

Fig. 2.14 Spherical Coordinates (diagram labels: axes x, y, z, position vector $\mathbf{x}$, angles $\theta$ and $\varphi$)

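The angles $\theta$ and $\varphi$ of the light ray to E and the distance $r$ are the spherical coordinates and define the Cartesian spatial coordinates of the event by

\[
\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= r\,\mathbf{e}_{\theta,\varphi}
= r \begin{pmatrix} \sin\theta\cos\varphi \\ \sin\theta\sin\varphi \\ \cos\theta \end{pmatrix} . \tag{2.33}
\]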
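The conversion (2.32), (2.33) and its inverse can be sketched as follows. The function names are hypothetical, and the convention that an event on the observer's worldline ($r = 0$) gets the angles $\theta = \varphi = 0$ is an assumption of this sketch, because the angles are not defined there.

```python
import math

def light_to_inertial(t_plus, t_minus, theta, phi):
    """Inertial coordinates (t, x, y, z) from light coordinates, (2.32) and (2.33)."""
    t = 0.5 * (t_plus + t_minus)
    r = 0.5 * (t_plus - t_minus)
    return (t,
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def inertial_to_light(t, x, y, z):
    """Light coordinates (t_plus, t_minus, theta, phi) of an event."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0.0 else 0.0   # assumed convention at r = 0
    phi = math.atan2(y, x)
    return (t + r, t - r, theta, phi)

# Light emitted at t_minus = 1 and received back at t_plus = 5.
event = light_to_inertial(5.0, 1.0, math.pi / 3, math.pi / 4)
round_trip = inertial_to_light(*event)
print(event)
print(round_trip)
```

The event lies at time $t = 3$ and distance $r = 2$, the mean and half the difference of the two clock readings.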

For events on the worldline of the observer O one has $t_+ = t_-$, hence $\mathbf{x} = 0$. In particular, the origin O has coordinates $(0, 0, 0, 0)$. If O emits a light ray at time $t_0$ in the direction $\mathbf{e}_{\theta,\varphi}$, then the light ray passes events for which $t_- = t_0$, $\theta$ and $\varphi$ are constant,

\[
t = \frac{t_+ + t_0}{2} , \qquad \mathbf{x}(t) = \frac{t_+ - t_0}{2}\,\mathbf{e}_{\theta,\varphi} , \tag{2.34}
\]

or, if we express the variable $t_+$ in terms of $t$, then the light ray is given by the map

\[
\Gamma : t \mapsto \bigl(t, \mathbf{x}(t)\bigr) = \bigl(t, (t - t_0)\,\mathbf{e}_{\theta,\varphi}\bigr) . \tag{2.35}
\]

This is a worldline parameterized by $t$, which at the time $t_0$ intersects the worldline of the observer. In the coordinates $(t, x, y, z)$ it is a straight worldline which is traversed with the speed of light $c = 1$, since the velocity $\mathbf{v} = d\mathbf{x}/dt$ is a unit vector.

The equations also hold for $t < t_0$ for a light pulse incident from the opposite direction $-\mathbf{e}_{\theta,\varphi}$. For such a light ray $t_+ = t_0$ is constant, $t = (t_0 + t_-)/2$ and $\mathbf{x} = -(t_0 - t)\,\mathbf{e}_{\theta,\varphi}$. While the light coordinates $(t_+, t_-, \theta, \varphi)$ of a passing light ray are discontinuous in the event in which it intersects the worldline of the observer, inertial coordinates are continuous.

Displacing the light ray by $\mathbf{x}_0 + t_0\,\mathbf{e}_{\theta,\varphi}$ yields more generally the light ray which passes $\mathbf{x}_0$ at the time $t = 0$,

\[
\Gamma : t \mapsto \bigl(t, \mathbf{x}(t)\bigr) = \bigl(t, t\,\mathbf{e}_{\theta,\varphi} + \mathbf{x}_0\bigr) . \tag{2.36}
\]

If the worldline of a linearly and uniformly moving massive particle passes the origin O at time $t = 0$, the observer O sees afterwards all events on this worldline from the same direction. The angles $\theta$ and $\varphi$ are constant, except at $t = 0$. The particle departs into the opposite of the direction from which it approached, and the angles change discontinuously from $\theta$ to $\pi - \theta$ and from $\varphi$ to $\varphi + \pi$ at $t = 0$.

According to (2.10) one has $t_+ = \kappa^2 t_-$ for events on the straight worldline of the particle. For its coordinates this means

\[
t = (\kappa^2 + 1)\,\frac{t_-}{2} , \qquad \mathbf{x} = (\kappa^2 - 1)\,\frac{t_-}{2}\,\mathbf{e}_{\theta,\varphi} , \tag{2.37}
\]

or, if we express $t_-$ in terms of $t$ and use (2.13), the worldline is given by

\[
\Gamma : t \mapsto \bigl(t, \mathbf{x}(t)\bigr) = (t, \mathbf{v}\,t) \quad\text{with}\quad
\mathbf{v} = \frac{d\mathbf{x}}{dt} = \frac{\kappa^2 - 1}{\kappa^2 + 1}\,\mathbf{e}_{\theta,\varphi} . \tag{2.38}
\]

Translating the worldline by $\mathbf{x}_0$ one obtains more generally the worldline of a uniformly moving particle which passes the point $\mathbf{x}_0$ at time $t = 0$,

\[
\Gamma : t \mapsto \bigl(t, \mathbf{x}(t)\bigr) = (t, \mathbf{v}\,t + \mathbf{x}_0) . \tag{2.39}
\]

So the coordinates $(t, x, y, z)$ which we have constructed from the light coordinates $t_+$, $t_-$, $\theta$ and $\varphi$ are inertial coordinates in which the worldlines of free particles and light rays are straight coordinate lines.

2.8 Scalar Product and Length Squared

Together, the time and the spatial coordinates of each event constitute an ordered set $(t, x, y, z)$ of four real numbers, and each such four-tuple corresponds to one and only one event. In Special Relativity spacetime, the set of all events, is $\mathbb{R}^4$. We denote the components of the four-tuple which corresponds to a particular event E either by $t_E$, $x_E$, $y_E$ and $z_E$ or we use a name like $u$ for the four-tuple $u = (u^0, u^1, u^2, u^3)$ and enumerate the components with a superscript which can take one of the values 0, 1, 2, or 3. One has to determine from the context whether a superscript denotes an exponent (rarely) or enumerates a footnote (rarely) or a component.

The homogeneity of spacetime makes it an affine vector space. If one shifts all events $u = (u^0, u^1, u^2, u^3)$, which participate in some physical process, by $s = (s^0, s^1, s^2, s^3)$ in space and time, then the events

\[
u + s = (u^0 + s^0, u^1 + s^1, u^2 + s^2, u^3 + s^3) \tag{2.40}
\]

can participate in an equally possible process. The scaled versions of spacetime diagrams consist of events

\[
a\,u = (a u^0, a u^1, a u^2, a u^3) \tag{2.41}
\]

scaled by a common factor $a$. However, elementary physical processes are not scale invariant. While scaled diagrams of physical processes with free pointlike particles correspond to equally possible physical processes, this is not true for interacting particles: for example, one has never observed an enlarged hydrogen atom (electron and proton bound by electromagnetic interactions).

The set $\mathbb{R}^4$, equipped with the operations of addition and multiplication by a scale factor, is a four-dimensional vector space. Its elements are called four-vectors.⁴

On a uniformly moving clock, which passes the two events $(t_0, x_0, y_0, z_0)$ and $(t_0 + t, x_0 + x, y_0 + y, z_0 + z)$, there elapses the time $\tau$ with

\[
\tau^2 = t^2 - x^2 - y^2 - z^2 . \tag{2.42}
\]

This follows from (2.12), because spacetime is homogeneous and time flows between the origin $(0, 0, 0, 0)$ and the event $(t, x, y, z)$ the same as between $(t_0, x_0, y_0, z_0)$ and $(t_0 + t, x_0 + x, y_0 + y, z_0 + z)$. The time between two events does not depend on the details of the clock used to measure it. The time is a measure for distance, i.e. a geometric structure, in spacetime.

Events with equal temporal distance to the origin are not in a plane $t = \text{const}$ as in nonrelativistic physics or on a sphere $x^2 + y^2 + z^2 = r^2 = \text{const}$ as in Euclidean geometry, but on a hyperboloid $t^2 - x^2 - y^2 - z^2 = \tau^2 = \text{const}$. The square of the temporal distance between two events is not subject to the Pythagorean theorem but to the theorem of Minkowski.

The clock does not depend on which observer determines coordinates for the events. If another observer measures light coordinates $t'_+$, $t'_-$, $\theta'$ and $\varphi'$ and converts them into spacetime coordinates $(t'_0, x'_0, y'_0, z'_0)$ and $(t'_0 + t', x'_0 + x', y'_0 + y', z'_0 + z')$ of the two events, then the sums of squares appearing in (2.42) have to agree,

\[
t^2 - x^2 - y^2 - z^2 = t'^2 - x'^2 - y'^2 - z'^2 . \tag{2.43}
\]

The sum of squares plays a central role in relativistic physics. We introduce the related scalar product of four-vectors like $u = (u^0, u^1, u^2, u^3)$ and $w = (w^0, w^1, w^2, w^3)$,

\[
u \cdot w := u^0 w^0 - u^1 w^1 - u^2 w^2 - u^3 w^3 . \tag{2.44}
\]

As length squared of a four-vector $w$ we define⁵

\[
w^2 = w \cdot w = (w^0)^2 - (w^1)^2 - (w^2)^2 - (w^3)^2 . \tag{2.45}
\]

⁴ Without mentioning it explicitly we shall consider different copies of $\mathbb{R}^4$, e.g. spacetime or the set of four-velocities, four-momenta or four-accelerations. Vectors from different spaces cannot be added, because they differ in units, e.g. a velocity $v$ cannot be added to a position $x$. What can be added is the image $vt$ of a velocity $v$ under the linear map $t$, which maps it to the space of positions. Though vectors from different four-spaces cannot be added, their directions can be compared, because, as we shall see, the Lorentz group acts on each of these spaces and the x-direction, for example, is the set of vectors which is invariant under rotations around the x-axis and under boosts in y- and z-directions (also compare page 187).

⁵ One has to deduce from the context whether the length squared or the y-component of a vector is meant.

In this notation, the time $\tau$ between events $u$ and $w$ is given by

\[
\tau^2 = (u - w)^2 . \tag{2.46}
\]

The scalar product (2.44) maps each pair of four-vectors to a real number and is symmetric and linear in each argument ($a$ denotes an arbitrary real factor),

\[
v \cdot w = w \cdot v , \tag{2.47}
\]
\[
u \cdot (v + w) = u \cdot v + u \cdot w , \qquad v \cdot (a\,w) = a\,(v \cdot w) , \tag{2.48}
\]

but, different from Euclidean geometry, not definite. Lightlike vectors have length squared zero though they do not vanish. The scalar product is nondegenerate, i.e. the scalar product of a vector $v$ with all other vectors vanishes if and only if $v = 0$.

The scalar product of two vectors $u$ and $v$ can be written as the difference of lengths squared,

\[
u \cdot v = \frac{1}{4}\bigl((u + v)^2 - (u - v)^2\bigr) . \tag{2.49}
\]

Since different observers determine different coordinates but the same lengths squared of differences of four-vectors (2.43), scalar products of difference vectors do not depend on the coordinate system of the respective observer either,

\[
u \cdot v = u' \cdot v' . \tag{2.50}
\]

If the length squared $w^2$ is positive we call $w$ timelike; if it is negative we call $w$ spacelike; if $w^2 = 0$, $w \neq 0$, $w$ is called lightlike. A timelike or lightlike vector $w$ is future directed if its component $w^0$ is positive, otherwise it is past directed. Two events A and B are mutually spacelike if the corresponding difference vector from B to A,

\[
w_{AB} = (t_A - t_B, x_A - x_B, y_A - y_B, z_A - z_B) , \tag{2.51}
\]

is spacelike. Correspondingly we define lightlike or timelike pairs of events. An event B can cause an effect A only if $w_{AB}$ is future directed timelike or lightlike.

Events on a light ray are mutually lightlike. Events on the worldline of an observer are mutually timelike since each observer is slower than light. If his worldline is straight then the length squared of the difference vector of two of his events is the square of the time which passes on his clock between the two events.
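The definitions (2.44) to (2.51) translate directly into a few lines of code. The sketch below represents four-vectors as plain tuples; the helper names are hypothetical, and integer components are used so that the comparisons with zero are exact.

```python
from fractions import Fraction

def dot(u, w):
    """Scalar product (2.44) of two four-vectors given as 4-tuples."""
    return u[0] * w[0] - u[1] * w[1] - u[2] * w[2] - u[3] * w[3]

def length_squared(w):
    """Length squared (2.45)."""
    return dot(w, w)

def add(u, w):
    return tuple(a + b for a, b in zip(u, w))

def sub(u, w):
    return tuple(a - b for a, b in zip(u, w))

def classify(w):
    """Name the type of a four-vector according to the sign of its length squared."""
    s = length_squared(w)
    if s > 0:
        return "timelike"
    if s < 0:
        return "spacelike"
    return "lightlike" if any(w) else "zero"

u = (5, 3, 0, 0)
w = (1, 0, 2, 0)

# Polarization identity (2.49): the scalar product as a difference of lengths squared.
polarization = Fraction(length_squared(add(u, w)) - length_squared(sub(u, w)), 4)

print(dot(u, w), polarization)
print(classify(u), classify(w), classify((1, 1, 0, 0)))
print(length_squared(u))   # tau^2 for a clock moving from the origin to the event u
```

For the event $u = (5, 3, 0, 0)$ the length squared is 16, so a uniformly moving clock from the origin to $u$ shows the time $\tau = 4$.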


Orthogonal

To construct the line $O_\perp$, which orthogonally intersects the line O in the point $t$, one chooses two points on O, $t_+$ and $t_-$, which are equally far away from $t$, and determines a second point E which is equally far away from $t_+$ and $t_-$. The orthogonal line $O_\perp$ is the line through $t$ and E. This is true in Euclidean geometry and in spacetime. In spacetime, however, the distance is given by $\tau$ (2.42). If $t_-$ and $t_+$ are two events on the worldline of the observer O and if $t$ is in their middle, then the intersections E and E′ of the light rays through $t_-$ and $t_+$ lie on the orthogonal line through $t$, because E and E′ are equally far away from $t_-$ and from $t_+$: the distance vanishes because the separations are lightlike.

Fig. 2.15 Orthogonal Vectors with Hyperbola (diagram labels: worldline O, orthogonal line $O_\perp$, hyperbola H, events $t_-$, $t$, $t_+$, E, E′, vectors $v$ and $w$)

The worldline O consists of events which are equilocal for the observer. The events on $O_\perp$ are equitemporal for him (figure 1.6). The lines of equilocal events are orthogonal to the lines of equitemporal events.

Using the vector $v$ from $t_-$ to $t$ and from $t$ to $t_+$ and the vector $w$ from $t$ to E, the light ray from $t_-$ to E is $v + w$. The vector $v - w$ is the light ray back from E to $t_+$. The length squared of the lightlike vectors $v + w$ and $v - w$ vanishes,

\[
0 = (v + w)^2 = v^2 + 2\,v \cdot w + w^2 , \qquad
0 = (v - w)^2 = v^2 - 2\,v \cdot w + w^2 . \tag{2.52}
\]

Therefore $v^2 = -w^2$ and the scalar product of the orthogonal vectors vanishes,

\[
v \cdot w = 0 . \tag{2.53}
\]


The length squared $v^2$ is the square of the time between the events $t_-$ and $t$ on the worldline of the observer O. This is the runtime of light and therefore the distance from O to the event E. Because $v^2 = -w^2$, the square of the distance of two simultaneous events, which are separated by the spacelike vector $w$, is $-w^2$.

Orthogonal: The vector $w$ from an event $t$ on the worldline of a uniformly moving observer to an event E, which occurs simultaneously for him, is orthogonal in terms of the scalar product (2.44) to his worldline. The negative length squared $-w^2$ is the square of the distance between E and the observer.

The hyperbola H through $t$ around $t_-$ is defined to consist of points which are obtained from $t_-$ by equally long translations $u(s)$,

\[
u(s) = \sqrt{1 + s^2}\,v + s\,w , \qquad u(s)^2 = v^2 , \tag{2.54}
\]

where $s$ varies in the real numbers. In particular, the point $t$ on the hyperbola corresponds to $s = 0$. In terms of the length squared of spacetime, all points of H are equally far away from $t_-$.

Each vector from $t_-$ to a point A on the orthogonal line $O_\perp$ is of the form $x(s) = v + s\,w$, where $s$ is some real number. Because of $v \cdot w = 0$ it is as long as the vector $-v + s\,w$ from $t_+$ to A.

Because $\sqrt{1 + s^2} > 1$ for $s \neq 0$, all points of H apart from $t$ lie on the side of $O_\perp$ which is opposite to $t_-$: one has $u(s) = x(s) + a(s)\,v$ with a positive $a(s)$. In addition, $t$ belongs to both $O_\perp$ and H. Therefore both curves touch each other at $t$ and the straight line $O_\perp$ is tangent to the hyperbola H in the point $t$. The tangent at $t$ is orthogonal to the vector from $t_-$ to $t$.

The same conclusion is obtained by differentiating $u(s)^2$ with respect to $s$. The tangential vector $t(s) = \frac{du}{ds}(s)$ is orthogonal to the position vector $u(s)$,

\[
u(s) \cdot u(s) = \text{constant} \quad\Rightarrow\quad \frac{du}{ds} \cdot u = 0 . \tag{2.55}
\]
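A short numerical check of (2.54) and (2.55) may help. The vectors $v = (2, 0, 0, 0)$ and $w = (0, 2, 0, 0)$ are assumed example values which satisfy $v^2 = -w^2$ and $v \cdot w = 0$; the function names are hypothetical.

```python
import math

def dot(u, w):
    """Scalar product (2.44)."""
    return u[0] * w[0] - u[1] * w[1] - u[2] * w[2] - u[3] * w[3]

# Assumed example: v timelike along the observer's worldline, w spacelike and orthogonal.
v = (2.0, 0.0, 0.0, 0.0)
w = (0.0, 2.0, 0.0, 0.0)

def u_of_s(s):
    """Translation u(s) = sqrt(1 + s^2) v + s w from (2.54)."""
    root = math.sqrt(1.0 + s * s)
    return tuple(root * a + s * b for a, b in zip(v, w))

def du_ds(s):
    """Tangent vector: derivative of u(s) with respect to s."""
    root = math.sqrt(1.0 + s * s)
    return tuple(s / root * a + b for a, b in zip(v, w))

samples = [-3.0, -1.0, 0.0, 0.5, 2.0]
lengths = [dot(u_of_s(s), u_of_s(s)) for s in samples]   # each should equal v^2
tangency = [dot(du_ds(s), u_of_s(s)) for s in samples]   # each should vanish, (2.55)

print(lengths)
print(tangency)
```

At $s = 0$ the tangent vector is $w$ itself, so the tangent to the hyperbola at $t$ is the orthogonal line $O_\perp$.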

2.9 Perspectives

If one takes bearing in horizontal direction from a lighthouse at sea level to a second lighthouse of identical construction, also at sea level some miles away, then the other lighthouse does not reach the same height because the surface of the earth is curved (see figure 2.16). Height is a perspective quantity. It depends on which direction is horizontal, and the horizontal directions of both lighthouses do not coincide.

Perspective shortening is physically relevant. Because one can change the height of a ladder by rotation, it may pass a low door though the ladder is longer than the height of the door and though rotations leave the sizes of the door and the ladder unchanged.


Figure 2.16 depicts the perspective height of two measuring rods $M_0$ and $M_\alpha$ in Euclidean geometry which are rotated with respect to each other. The circle consists of points of equal distance to the center; each tangent vector is orthogonal to the position vector. The measuring rods intersect in the point O. For an observer who measures height with $M_0$, all points on the straight line through B which is orthogonal to $M_0$ are equally high. In particular the point E is as high as B and higher than the end point of $M_\alpha$. Rotated measuring rods reach less high.

Fig. 2.16 Rotated Measuring Rods (diagram labels: rods $M_0$ and $M_\alpha$, points O, B, E)

The perspective shortening of height is reciprocal. Judged from $M_\alpha$ the rotated rod $M_0$ is lower.

If one replaces in fig. 2.16 the circle by a hyperbola, one obtains the geometric relations in spacetime. In diagram 2.17 the hyperbola H consists of events with equal temporal distance $\tau$ to the origin O (2.42). Equal uniformly moving clocks of observers $O_0$ and $O_v$, who pass the origin, show the same lapse of time, $\tau$, when their worldlines intersect the hyperbola. The tangents in B and B′ are orthogonal to the worldlines of the observers $O_0$ and $O_v$ respectively (2.55). Therefore they consist of events which occur simultaneously for $O_0$ or $O_v$. The tangents intersect the worldline of the other observer before the time $\tau$ has elapsed on it.

If the time $\tau$ elapses on a clock between two events, then the shorter time $\tau_{EO} = \tau_{E'O} = \sqrt{1 - v^2}\,\tau$ (2.12) passes between the simultaneous events on a moving clock, just as two points on a vertical ladder have a shorter distance than equally high points on a tilted ladder. The perspective relations in spacetime are reciprocal as in Euclidean geometry.

The abbreviated summary “moving clocks run slower” suppresses the specifications of the segments OE and OB or OE′ and OB′ the duration of which is to be compared. The abbreviation is the reason for misunderstandings, because “running slower” is an order relation and the clock of $O_0$ cannot run slower and also faster than the clock of $O_v$. In fact, both clocks are equal as confirmed by the referee in diagram 2.2.

Fig. 2.17 Time Dilation (diagram labels: worldlines $O_0$ and $O_v$, hyperbola H, events O, B, B′, E, E′)

Contraction of moving measuring rods can be read off diagram 2.18, which is the mirrored version of diagram 2.17. The beginning and the end of uniformly moving measuring rods of observers $O_0$ and $O_v$ traverse pairs of parallel straight worldlines. Both rods have the length $l$, because the left ends coincide in O and the right ends B and B′ lie on the auxiliary hyperbola, which consists of points P which satisfy $-w_{PO}^2 = l^2$. The segments OB and OB′ are orthogonal to the worldlines of the observers $O_0$ and $O_v$, because they are position vectors and parallel to tangent vectors of the hyperbola (page 43).

Fig. 2.18 Contraction of Moving Rods (diagram labels: worldlines $O_0$ and $O_v$, events O, B, B′, E, E′, length $l$)


Therefore the events O, E′ and B are simultaneous for $O_0$. At this moment, the left ends coincide, but the right end of the moving rod only reaches to E′, so the moving rod is shorter than his own rod, which reaches up to B. For $O_v$ the events O, E and B′ are simultaneous. Then the left ends coincide in O and the rod which moves relative to $O_v$ only reaches to E and is shorter than his own rod, which reaches up to B′.

In diagram 2.19 the waiting time and the travel times of the twin paradox are not determined from the Doppler factors as in diagram 2.9 but compared by the auxiliary hyperbola from M to $\tau'$ with origin at A and the hyperbola from M to $\tau''$ with origin at E.

Fig. 2.19 Twin Paradox (diagram labels: stay-at-home S, traveller T, events A, M, E, $t'$, $t''$, $\tau'$, $\tau''$)

The line segments from A to $\tau'$ and from $\tau''$ to E on the worldline of the stay-at-home S last as long as for the traveller T the travel to and from Mars. The stay-at-home has aged in addition during the time which passed between $\tau'$ and $\tau''$. On the straight worldline of the stay-at-home more time has passed than on the worldline of the traveller with a kink.

The tangents $Mt'$ and $Mt''$ of the hyperbolas consist of events which are simultaneous to the arrival at Mars for a uniformly moving observer $O_{\text{to}}$, who also flies to Mars, respectively for a uniformly moving observer $O_{\text{from}}$, who flies back. The tangents intersect the worldline of S in $t'$ and $t''$, confirming that for observers who fly to or from Mars the clock of the stay-at-home shows less time than simultaneously on their own clocks. But the events $t'$ and $t''$ do not coincide; they are simultaneous to the arrival at Mars for different observers. Between $t'$ and $t''$ the stay-at-home ages so much that in the end he has grown older than the traveller.

Chapter 3

Transformations

Abstract

Motion of an observer makes him measure Doppler shifted or Lorentz transformed light coordinates. These Lorentz transformations relate the velocities of particles which different observers measure, and change by aberration the directions from which incident light rays are perceived. Aberration is conformal: angles and relative sizes of small, neighbouring objects agree in aberrated pictures. Lorentz transformations determine how the energy and the momentum of a particle depend on its velocity. The conservation of energy and momentum restricts the decay of a particle and the scattering of two particles with observable consequences.

3.1 Lorentz Transformation of Coordinates

How do the coordinates $(t, x, y, z)$ which an observer O attributes to an event E correspond to the coordinates $(t', x', y', z')$ which an observer O′ measures for the same event, if he moves with a velocity $v$ relative to O?

Fig. 3.1 Lorentz Transformation (diagram labels as extracted: observers O and O′, event E, times $t_+$, $t'_+$, $t$, $t'$)

We start with the case that the worldlines of the two observers intersect in the event O and that they set their clocks to $t' = t = 0$ on this occasion. To keep the discussion simple, both observers choose their x-axis in the direction of relative motion such that for O the observer O′ moves in x-direction and vice versa the observer O moves in the $-x'$-direction of O′. Then the y- and z-coordinates of each event E in the plane of the worldlines of the observers vanish, $y = z = 0 = y' = z'$.

Each observer sees the clock of the other observer slowed down by the same Doppler factor $\kappa(O, O') = \kappa(O', O)$ (2.9). When flashes of light to and from an event E as in diagram 3.1 pass the observers, then the times on their clocks, $t'_-$ and $t_-$ and also $t_+$ and $t'_+$, are proportional to each other with the Doppler factor (2.12)

\[
\kappa(v) = \sqrt{\frac{1 + v}{1 - v}} = \frac{1 + v}{\sqrt{1 - v^2}} \tag{3.1}
\]

because they are pairs of times which equal clocks show at the emission and reception of a flash of light. So the observer O′ measures the light coordinates

\[
t'_+ = \kappa^{-1}\,t_+ , \qquad t'_- = \kappa\,t_- . \tag{3.2}
\]

The transformation of the two light coordinates decomposes into two transformations of coordinates, which scale separately ($t'_+$ depends only on $t_+$ and $t'_-$ only on $t_-$) and inversely, such that their product, the area $t_+ t_- = t'_+ t'_-$ of the lightangle with corners O and E, remains unchanged.

In spacetime coordinates $t' = (t'_+ + t'_-)/2$ and $x' = (t'_+ - t'_-)/2$ (1.4, 1.5) the transformation is coupled,

\[
t' = \frac{1}{2}(\kappa + \kappa^{-1})\,t - \frac{1}{2}(\kappa - \kappa^{-1})\,x , \qquad
x' = -\frac{1}{2}(\kappa - \kappa^{-1})\,t + \frac{1}{2}(\kappa + \kappa^{-1})\,x . \tag{3.3}
\]

Inserting $\kappa(v)$ and $1/\kappa(v) = \kappa(-v) = (1 - v)/\sqrt{1 - v^2}$ (2.15) one obtains

\[
t' = \frac{t - v\,x}{\sqrt{1 - v^2}} , \qquad x' = \frac{-v\,t + x}{\sqrt{1 - v^2}} , \tag{3.4}
\]

or in matrix notation

\[
\begin{pmatrix} t' \\ x' \end{pmatrix} = L_{-v} \begin{pmatrix} t \\ x \end{pmatrix} , \qquad
L_v = \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 & v \\ v & 1 \end{pmatrix} . \tag{3.5}
\]
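The equivalence of the light coordinate form (3.2) and the coupled form (3.4) can be checked numerically. The following is a sketch in units with $c = 1$ for events in the t-x-plane; the function names are hypothetical.

```python
import math

def kappa(v):
    """Doppler factor (3.1)."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def transform_via_light_coordinates(t, x, v):
    """Passive transformation: scale t_+ by 1/kappa and t_- by kappa, as in (3.2)."""
    k = kappa(v)
    t_plus, t_minus = t + x, t - x            # light coordinates in the t-x-plane
    tp_plus, tp_minus = t_plus / k, t_minus * k
    return (0.5 * (tp_plus + tp_minus), 0.5 * (tp_plus - tp_minus))

def transform_via_matrix(t, x, v):
    """The same transformation written as in (3.4)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (t - v * x), g * (-v * t + x))

v = 0.6
event = (5.0, 3.0)                            # lies on the worldline x = v t of O'
primed_light = transform_via_light_coordinates(*event, v)
primed_matrix = transform_via_matrix(*event, v)

print(primed_light, primed_matrix)
```

The chosen event lies on the worldline of O′, so its primed position is $x' = 0$, and its primed time $t' = 4$ is the time $\sqrt{5^2 - 3^2}$ which passes on the clock of O′.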

The linear transformation $L_{-v}$ is (in units with $c = 1$) the Lorentz transformation of the coordinates of an event in the t-x-plane to the t′-x′-coordinates which an observer, who moves in x-direction with velocity $v$, attributes to the same event. Such a coordinate transformation is called passive. It does not change the events.

A transformation is called active if it transforms events. If in (3.5) one reverses the sign of $v$, then one obtains the active Lorentz transformation $L_v$ which maps the worldline of a particle at rest to the worldline $\Gamma_v : t \mapsto (t, t\,v)/\sqrt{1 - v^2}$ of a particle which moves in the t-x-plane with velocity $v$ in x-direction. From (3.2) one concludes that $\kappa^{-1}$ and therefore the negative velocity $-v$ corresponds to the inverse transformation, $L_{-v} = (L_v)^{-1}$.

Lorentz Transformations in Four Dimensions

Also if the event E does not lie in the t-x-plane, the Lorentz transformation $\Lambda$ of the $(t, x, y, z)$-coordinates to $(t', x', y', z')$-coordinates has to be linear because the worldlines of free particles are straight lines for each observer. Consequently, triangles formed by three intersecting straight lines are mapped to triangles. The sides of a triangle correspond to vectors $u$ and $v$ and, for the third side, $w = u + v$. They are mapped to $\Lambda(u)$, $\Lambda(v)$ and $\Lambda(u + v) = \Lambda(u) + \Lambda(v)$, because triangles transform into triangles. From $\Lambda(u + u) = \Lambda(u) + \Lambda(u)$ one concludes $\Lambda(n\,u) = n\,\Lambda(u)$ for integer $n$, and from $\Lambda(u) = \Lambda(n\,(u/n)) = n\,\Lambda(u/n)$ one deduces $\Lambda(a\,u) = a\,\Lambda(u)$ for rational $a$ and, because $\Lambda$ is continuous, for each real $a$. Therefore $\Lambda$ is linear.

The coordinates $y'$ and $z'$ are linear combinations of $t$, $x$, $y$ and $z$, which vanish for arbitrary $t$ and $x$ if the event E lies in the plane $y = z = 0$. So $y'$ and $z'$ do not depend on $t$ and $x$,

\[
\begin{pmatrix} y' \\ z' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} . \tag{3.6}
\]

Also $t' = (t - v\,x)/\sqrt{1 - v^2} + e\,y + f\,z$ and $x' = (-v\,t + x)/\sqrt{1 - v^2} + g\,y + h\,z$ are the most general linear combinations which coincide with (3.4) for $y = z = 0$.

The Lorentz transformation has to leave scalar products invariant (2.50). Therefore $e$, $f$, $g$ and $h$ vanish: the vectors $(\sqrt{1 - v^2}, 0, 0, 0)$, $(0, \sqrt{1 - v^2}, 0, 0)$, $(0, 0, 1, 0)$ and $(0, 0, 0, 1)$ are mutually orthogonal. For the observer O′ they have components $(1, -v, 0, 0)$, $(-v, 1, 0, 0)$, $(e, g, a, c)$ and $(f, h, b, d)$, and the first two vectors have to be orthogonal to the last two. The two conditions $e + v\,g = 0$ and $v\,e + g = 0$ imply $e = g = 0$ for $|v| < 1$, and likewise $f = h = 0$.

So (3.4) and (3.6) are valid for arbitrary $(t, x, y, z)$, where $(a, c)$ and $(b, d)$ are the components of Lorentz transformed normalized, mutually orthogonal vectors and consequently are normalized and mutually orthogonal. By $a^2 + c^2 = 1$ the coefficients $a$ and $c$ can be written as $\cos\alpha$ and $\sin\alpha$ of some angle $\alpha$. Because of $ab + cd = 0$ the coefficients $(b, d)$ are a multiple of $(-c, a)$, and due to $b^2 + d^2 = 1$ only $(b, d) = \pm(-c, a)$ can hold. The Lorentz transformation of the plane orthogonal to the direction of motion is a rotation (or a rotary reflection),

\[
\begin{pmatrix} y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} . \tag{3.7}
\]

If the angle of rotation vanishes, $\alpha = 0$, the Lorentz transformation is called a boost or rotationfree,

\[
t' = \frac{t - v\,x}{\sqrt{1 - v^2}} , \qquad
x' = \frac{-v\,t + x}{\sqrt{1 - v^2}} , \qquad
y' = y , \qquad z' = z . \tag{3.8}
\]

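This is the passive transformation to the coordinates used by an observer who moves in x-direction and uses the same spatial directions. Combining $t, x, y, z$ and $t', x', y', z'$ to column vectors, this passive boost is compactly written with a matrix $L_{-v,x}$ as

\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = L_{-v,x} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} , \qquad
L_{v,x} = \begin{pmatrix}
\frac{1}{\sqrt{1 - v^2}} & \frac{v}{\sqrt{1 - v^2}} & 0 & 0 \\
\frac{v}{\sqrt{1 - v^2}} & \frac{1}{\sqrt{1 - v^2}} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} . \tag{3.9}
\]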

The matrix $L_{v,x}$ maps actively the worldline of a particle at rest to the worldline $\Gamma_{v,x} : t \mapsto (t, t\,v, 0, 0)/\sqrt{1 - v^2}$ of a particle moving with velocity $v$ in x-direction.

If one applies rotations $D_\alpha$ around a fixed axis to a point $y$ and continuously increases the angle $\alpha$, then $D_\alpha y$ traverses a path $\alpha \mapsto D_\alpha y$, the orbit (page 188) of $y$ under rotations. It is a circle, because rotations leave Euclidean distances invariant. For the same reason points $y$ in figure 3.2 traverse hyperbolas if one applies Lorentz boosts $L_v$ in the 0-1-plane and varies the velocity $v$.

Fig. 3.2 Lorentz Flow

Lightangles with the origin as one corner and the opposite corner on a fixed hyperbola have the same area, $t_+ t_- = t'_+ t'_-$. Lightlike vectors get stretched or shrunk. The origin is a hyperbolic fixed point. It is a stagnation point of the flow, not a vortex as in the case of rotations.

Denoting the four-vectors $(t, x, y, z)$ and $(t', x', y', z')$ in (3.9) even shorter just by $x$ and $x'$ and dropping in the notation $L_{v,x}$ the label $x$, the active boost is

\[
x' = L_v\,x . \tag{3.10}
\]

Vice versa one has $x = L_{-v}\,x'$ because the inverse boost is the boost in the opposite direction,

\[
L_{-v} = (L_v)^{-1} . \tag{3.11}
\]
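The boost matrix (3.9), its inverse (3.11) and the hyperbolic orbits of events can be illustrated with plain nested lists. This is a sketch with hypothetical helper names; the sample event and the velocities are arbitrary assumptions.

```python
import math

def boost_x(v):
    """Active boost matrix L_{v,x} from (3.9), in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0.0, 0.0],
            [g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(m, x):
    """Matrix times four-vector."""
    return [sum(m[i][j] * x[j] for j in range(4)) for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def length_squared(x):
    """Length squared (2.45)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2

event = [2.0, 1.0, 0.5, -0.3]
# Orbit of the event under boosts with varying velocity: it stays on one hyperbola.
orbit = [apply(boost_x(v), event) for v in (-0.9, -0.5, 0.0, 0.5, 0.9)]

# The boost with the opposite velocity undoes the boost, (3.11).
product = matmul(boost_x(0.6), boost_x(-0.6))

# A particle at rest is mapped to a particle moving with velocity 0.6.
moving = apply(boost_x(0.6), [1.0, 0.0, 0.0, 0.0])
print(moving[1] / moving[0])
```

All points of the orbit share the length squared of the original event, which is the statement that they lie on a hyperbola in the 0-1-plane.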

More generally, the observer O′ can move with velocity $v$ in an arbitrary direction and his spatial directions can be rotated by a matrix $D$ with respect to O. This is the most general Lorentz transformation $\Lambda$: it can be uniquely decomposed into a boost $L_v$ in the direction of $v$ and a rotation $D$, $\Lambda = L_v D$ (6.35).

Moreover, the worldlines of the two observers do not have to intersect, as we assumed up to now to simplify our considerations. They can be shifted in space and time by a translation $a = (a^0, a^1, a^2, a^3)$. Then their coordinates are related by a Poincaré transformation

\[
T_{\Lambda,a} : \mathbb{R}^4 \to \mathbb{R}^4 , \qquad x \mapsto x' = \Lambda\,x + a . \tag{3.12}
\]

Successive Poincaré transformations $T_{\Lambda_1,a_1}$ and $T_{\Lambda_2,a_2}$ define their product

\[
T_{\Lambda_2\Lambda_1,\;\Lambda_2 a_1 + a_2} = T_{\Lambda_2,a_2} \circ T_{\Lambda_1,a_1} . \tag{3.13}
\]

The inverse transformation is $T_{\Lambda,a}^{-1} = T_{\Lambda^{-1},\,-\Lambda^{-1}a}$. Poincaré transformations constitute a group, i.e. a set of elements $g$ with an associative product and a unit element, here $T_{1,0}$, where each element has an inverse $g^{-1}$.

Poincaré transformations are not the most general transformations which map lightcones to lightcones and leave invariant the velocity of light. The velocity of light is invariant under the larger group of conformal transformations, which contains in particular dilations $x \mapsto e^a x$ by a positive factor $e^a$.
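The product rule (3.13) and the inverse can be sketched by storing a Poincaré transformation as a pair of a matrix and a translation. The helper names are hypothetical. The sketch computes the inverse Lorentz matrix as $\eta\Lambda^{\mathsf T}\eta$ with $\eta = \mathrm{diag}(1, -1, -1, -1)$, which follows from the invariance of the scalar product (2.50) but is not spelled out in the text above.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the metric of the scalar product (2.44)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, x):
    return [sum(m[i][j] * x[j] for j in range(4)) for i in range(4)]

def lorentz_inverse(m):
    """Inverse of a Lorentz matrix, computed as eta * transpose(m) * eta."""
    return [[ETA[i] * m[j][i] * ETA[j] for j in range(4)] for i in range(4)]

def boost_x(v):
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0.0, 0.0], [g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def act(transformation, x):
    """x -> Lambda x + a, as in (3.12)."""
    lam, a = transformation
    return [p + q for p, q in zip(apply(lam, x), a)]

def compose(t2, t1):
    """Product (3.13): first t1, then t2."""
    (lam2, a2), (lam1, a1) = t2, t1
    return (matmul(lam2, lam1), [p + q for p, q in zip(apply(lam2, a1), a2)])

def inverse(transformation):
    lam, a = transformation
    lam_inv = lorentz_inverse(lam)
    return (lam_inv, [-c for c in apply(lam_inv, a)])

T1 = (boost_x(0.6), [1.0, 0.0, 2.0, 0.0])
T2 = (rotation_z(0.3), [0.0, -1.0, 0.0, 4.0])
x = [3.0, 1.0, -2.0, 0.5]

step_by_step = act(T2, act(T1, x))
composed = act(compose(T2, T1), x)
restored = act(compose(inverse(T1), T1), x)
print(composed, restored)
```

Storing the pair $(\Lambda, a)$ rather than a single 5×5 matrix keeps the sketch close to the notation $T_{\Lambda,a}$ used above.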

Addition of Velocities

Let the worldline $\Gamma : \lambda \mapsto x(\lambda)$ of a particle be parameterized such that the time $t$ strictly increases with the parameter $\lambda$, $dt/d\lambda > 0$. The tangent vector, the derivative $dx/d\lambda$, transforms under Poincaré transformations simply as a four-vector and for an observer moving with a velocity $u$, $|u| < 1$, has components which are related by

\[
\frac{dx'}{d\lambda} = \Lambda\,\frac{dx}{d\lambda} \tag{3.14}
\]

to the components for an observer at rest. The velocity $\mathbf{v}$, the derivative of the spatial components with respect to the time $t$,

\[
\frac{d\mathbf{x}}{d\lambda} = \frac{d\mathbf{x}}{dt}\,\frac{dt}{d\lambda} = \mathbf{v}\,\frac{dt}{d\lambda} , \tag{3.15}
\]

is a rational function of the derivatives with respect to the parameter $\lambda$. Therefore, the velocity $\mathbf{v}'$, which the moving observer measures, is a rational function of the components of $\mathbf{v}$ ($i, j, k \in \{1, 2, 3\}$),

\[
v'^{\,i} = \frac{\bigl(\Lambda\,\frac{dx}{d\lambda}\bigr)^i}{\bigl(\Lambda\,\frac{dx}{d\lambda}\bigr)^0}
= \frac{\Lambda^i{}_0 + \sum_j \Lambda^i{}_j\,v^j}{\Lambda^0{}_0 + \sum_k \Lambda^0{}_k\,v^k} . \tag{3.16}
\]

In particular, if we choose the x-axis in direction of the relative motion, $\Lambda = L_{-u,x}$ (3.9), and the y-axis such that $\mathbf{v}$ lies in the x-y-plane, and if we denote $\mathbf{v}$ by its modulus and its angle $\theta$,

\[
(v_x, v_y, v_z) = v\,(\cos\theta, \sin\theta, 0) , \qquad
(v'_x, v'_y, v'_z) = v'\,(\cos\theta', \sin\theta', 0) , \tag{3.17}
\]

then the transformation of $\mathbf{v}$, solved for $\cos\theta'$ and $v'$, yields

\[
\cos\theta' = \frac{v\cos\theta - u}{\sqrt{(u - v\cos\theta)^2 + (1 - u^2)\,v^2\sin^2\theta}} , \tag{3.18}
\]
\[
v' = \frac{\sqrt{(u - v\cos\theta)^2 + (1 - u^2)\,v^2\sin^2\theta}}{1 - u\,v\cos\theta} . \tag{3.19}
\]
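Formulas (3.18) and (3.19) can be evaluated directly. The sketch below uses a hypothetical function name and returns no angle when the transformed velocity vanishes, a convention assumed here because the direction is undefined in that case.

```python
import math

def transformed_velocity(v, theta, u):
    """Return (v_prime, cos_theta_prime) from (3.19) and (3.18), units with c = 1.

    v, theta: speed and angle to the x-axis for the observer at rest.
    u: velocity in x-direction of the observer who measures v_prime.
    """
    root = math.sqrt((u - v * math.cos(theta)) ** 2
                     + (1.0 - u * u) * (v * math.sin(theta)) ** 2)
    v_prime = root / (1.0 - u * v * math.cos(theta))
    # The direction is undefined if the particle is at rest for the moving observer.
    cos_theta_prime = (v * math.cos(theta) - u) / root if root > 0.0 else None
    return v_prime, cos_theta_prime

# Collinear case: particle with 0.5 in x-direction, observer with 0.5 in the
# opposite direction. The result stays below the speed of light.
collinear = transformed_velocity(0.5, 0.0, -0.5)

# Light (v = 1) moves with v' = 1 for every observer, whatever the angle.
light = transformed_velocity(1.0, 1.0, 0.8)

print(collinear, light)
```

In the collinear case the formula reduces to $(v - u)/(1 - u\,v)$, which gives $0.8$ instead of the nonrelativistic sum $1.0$.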

3.2 Perception

If two mutually moving observers meet in some event, which we choose as origin O, then their past lightcones coincide and each light ray seen by one observer is also perceived by the other observer. Nevertheless they see something different. Color, direction and luminosity of the light depend on the velocity of the observer relative to the source.

The Doppler effect changes the frequency, i.e. the color, of the light and in addition the number of photons received per second. Aberration changes the direction $\mathbf{n}$ from which the light ray is seen to come. The change of their directions also changes the density of the light rays. Hence, the moving observer sees a smoothly deformed version of the image of the stationary observer with changed colors and brightness. Topologically, i.e. in terms of the relative positions, the two images coincide, since they are related by a continuous invertible map. In particular, a moving observer cannot see things hidden from an observer at rest in this instant at this position. However, at high speed the moving observer's field of vision in the direction of motion contains light rays which for the stationary observer come from behind [15].

To be definite: let the observer O′ move with velocity $v$ relative to O in z-direction and consider a light ray in the z-x-plane, which enters the eye of O at $t = 0$ with an angle $\theta$ to the z-axis. In his coordinates the light traverses the worldline $\Gamma : t \mapsto l(t) = t\,(1, -\mathbf{e})$, $\mathbf{e} = (\sin\theta, 0, \cos\theta)$ (2.35).¹ For O′ the same light ray enters with some angle $\theta'$ to the opposite of the direction into which he sees O move. O′ attributes the worldline $t' \mapsto t'\,(1, -\mathbf{e}') = \Lambda\,l(t)$, $\mathbf{e}' = (\sin\theta', 0, \cos\theta')$, to the same light ray.

The rotationfree Lorentz boost in z-direction leaves the x- and y-coordinates invariant and stretches the difference $t_- = t - z$ by $\kappa$ (2.9), $t'_- = \kappa\,t_-$ (3.2). So the observer O′ perceives the ratio $u = -l_x/l_- = \sin\theta/(1 + \cos\theta)$ diminished by a factor $\kappa$.

¹ The light pulse moves into the opposite of the direction $\mathbf{e}$ from which it is seen.

Fig. 3.3 Stereographic Projections to u = x/(1 + z) and to u′ = x/(1 − z) = 1/u

The ratio u is the stereographic projection of (x, z) = (sin θ, cos θ) from the south pole (0, −1). It projects (x, z) to u = tan(θ/2): as (0, −1), (0, 0) and (x, z) are the vertices of an isosceles triangle, the line from (0, −1) to (x, z) encloses with the vertical axis half the angle θ which the line from the origin (0, 0) to (x, z) encloses. This is the geometric background of the trigonometric identity

$$ \frac{\sin\theta}{1 + \cos\theta} = \frac{2\cos\frac{\theta}{2}\,\sin\frac{\theta}{2}}{2\cos^2\frac{\theta}{2}} = \tan\frac{\theta}{2}\,. \tag{3.20} $$

So the angle θ′ of the incident light ray, seen by the observer O′, is related by

$$ \tan\frac{\theta'}{2} = \sqrt{\frac{1 - v}{1 + v}}\;\tan\frac{\theta}{2} \tag{3.21} $$

to the angle θ which O observes [48]. The change of the direction of the incident light ray is called aberration. As tan(θ/2) increases monotonically with the angle θ, the angle θ′ is smaller than θ for 0 < v < 1. Just like rain, incident light comes more from the direction into which an observer moves.
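The aberration formula (3.21) is easy to tabulate. The sketch below is only an illustration in units with c = 1; the function name `aberrated_angle` is hypothetical, and the formula is used for 0 ≤ θ < π, where tan(θ/2) is finite.

```python
import math

def aberrated_angle(theta, v):
    """Angle theta' of (3.21) for an observer moving with speed v towards theta = 0."""
    kappa_inverse = math.sqrt((1.0 - v) / (1.0 + v))
    return 2.0 * math.atan(kappa_inverse * math.tan(theta / 2.0))

# Angles seen by the moving observer for rays which the observer at rest
# sees at 30, 90 and 150 degrees to the direction of motion.
for v in (0.0, 0.5, 0.9, 0.99):
    angles = [math.degrees(aberrated_angle(math.radians(deg), v)) for deg in (30, 90, 150)]
    print(v, [round(a, 1) for a in angles])
```

With increasing v all listed angles shrink, so the directions crowd towards the direction of motion.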

Yearly Aberration of the Light of Stars

The earth orbits the sun at a mean radius of r = 1.50 · 10¹¹ m. The length 2π r, covered within a year, as compared to a light year, 9.46 · 10¹⁵ m, is the velocity v = 1.00 · 10⁻⁴. For such small velocities δθ = θ′ − θ is small and (3.21) is approximately

$$ \tan\frac{\theta'}{2} \approx \tan\frac{\theta}{2} + \frac{\delta\theta}{2\cos^2(\theta/2)}\,, \qquad \tan\frac{\theta'}{2} = \sqrt{\frac{1 - v}{1 + v}}\,\tan\frac{\theta}{2} \approx (1 - v)\tan\frac{\theta}{2}\,, \qquad \delta\theta \approx -v\,\sin\theta\,. \tag{3.22} $$

A star which for a stationary observer lies in a direction θ = π/2 orthogonal to the current direction of motion of the earth is seen by an observer on earth in a direction shifted by |δθ| = 10⁻⁴, i.e. 20.5 arc seconds.² During each year the direction of the velocity of the terrestrial observers changes. Therefore, as discovered by James Bradley in 1728, we perceive distant stars in directions which in the course of the year traverse ellipses with a major axis of 41 arc seconds. Bradley’s observation of the aberration of stars was the first direct proof of the Copernican system, that the earth orbits the sun.
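The numbers of this estimate can be reproduced in a few lines. This is a sketch with the values quoted in the text; the names are hypothetical, and the exact shift is evaluated from (3.21) only to show how little it differs from the first-order result (3.22).

```python
import math

ORBIT_RADIUS = 1.50e11                         # mean orbital radius in metres (from the text)
LIGHT_YEAR = 9.46e15                           # metres (from the text)
ARC_SECOND = 2.0 * math.pi / (360 * 60 * 60)   # radians

def orbital_speed():
    """Speed of the earth in units of c: circumference of the orbit per light year."""
    return 2.0 * math.pi * ORBIT_RADIUS / LIGHT_YEAR

def shift_first_order(theta, v):
    """delta theta = -v sin(theta), the approximation (3.22)."""
    return -v * math.sin(theta)

def shift_exact(theta, v):
    """theta' - theta from the exact relation (3.21)."""
    theta_new = 2.0 * math.atan(math.sqrt((1.0 - v) / (1.0 + v)) * math.tan(theta / 2.0))
    return theta_new - theta

v = orbital_speed()
print("v =", v)
print("first order:", shift_first_order(math.pi / 2, v) / ARC_SECOND, "arc seconds")
print("exact:      ", shift_exact(math.pi / 2, v) / ARC_SECOND, "arc seconds")
```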

Shape of Moving Spheres

Aberration, the change of directions e_{θ,ϕ} of incident light rays to the directions e_{θ′,ϕ′} from which a moving observer sees the incoming light, is an invertible map of the set of all directions, the two-dimensional sphere S², to itself. Moving observers do not see spheres length-contracted to pancakes but again as spheres [48, 33, 59, 71]. So aberration is a map of S² to itself which maps circles, the outlines of spheres, to circles. This is concluded from the following arguments.

All directions e, e² = 1, of incident light rays from the outline of a sphere constitute a circular cone with some opening angle δ to the axis n, n² = 1,

$$ \cos\delta = \mathbf{e}\cdot\mathbf{n}\,. \tag{3.23} $$

The light ray which an observer at the origin sees from the direction e traverses the worldline

$$ \Gamma : t \mapsto l(t) = t\,k\,, \qquad k = (1, -\mathbf{e})\,, \qquad k^2 = 0\,. \tag{3.24} $$

Its tangent vector, k, is lightlike and belongs to the circular cone (3.23) if and only if k is orthogonal (in the sense of the scalar product (2.44)) to the spacelike four-vector

$$ n = a\,(-\cos\delta, \mathbf{n})\,, \qquad n^2 < 0\,, \tag{3.25} $$

$$ k\cdot n = a\,(-\cos\delta + \mathbf{e}\cdot\mathbf{n}) = 0\,. \tag{3.26} $$

² Like metre per second (1.3), an arc second is just a number, 1′′ = 2π/(360 · 60 · 60) ≈ 4.848 · 10⁻⁶.

Since Lorentz transformations preserve scalar products (2.50), the transformed vectors n′ = Λ n and the transformed tangent vectors k′ = Λ k satisfy the equations k′² = 0, n′² < 0 and k′ · n′ = 0. Because each spacelike vector is of the form (3.25), the vector n′ = Λ n defines an axis n′ and an opening angle δ′ of a circular cone of light rays. For a moving observer the light rays transformed by aberration come from directions which form a circular cone.

Infinitesimal cones with opening angle δ are scaled by a factor D, which is obtained by differentiation of θ′(θ) (3.21),

$$ D = \frac{d\theta'}{d\theta} = \frac{1}{\kappa}\,\frac{\cos^2(\theta'/2)}{\cos^2(\theta/2)} = \frac{1}{\kappa\,\bigl(1 + \kappa^{-2}\tan^2(\theta/2)\bigr)\cos^2(\theta/2)} = \frac{\sqrt{1 - v^2}}{1 + v\cos\theta}\,. \tag{3.27} $$

Because a small circle is magnified by a factor D, each of its diameters appears enlarged by this factor irrespective of its direction. Moreover the scale factor D of nearby objects is nearly the same. Pictures which are perceived by observers in relative motion at the same place in the same instant agree in the relative sizes of nearby, small objects. In particular, angles between intersecting lines are preserved by aberration, because they are the ratio of small circular arcs to small radii of circles: aberration is conformal.

That aberration is conformal also follows simply from the fact that it scales by κ⁻¹ (3.21) the stereographic projection (figure 3.3)

$$ (x, y, z) \mapsto (u, v) = \Bigl(\frac{x}{1 + z}\,,\ \frac{y}{1 + z}\Bigr) \tag{3.28} $$

of the sphere S² = {(x, y, z) : x² + y² + z² = 1} ⊂ ℝ³ of the directions e = (x, y, z) of incident light rays. This dilation is a conformal map and also the projection (3.28) is conformal, because on S² one has

$$ u^2 + v^2 + 1 = \frac{x^2 + y^2 + (z + 1)^2}{(1 + z)^2} = \frac{2}{1 + z}\,. \tag{3.29} $$

A circle on S², the set of points e = (x, y, z) which in S² have distance d to the center n ∈ S², satisfies the linear equation n_x x + n_y y + n_z z = cos d (7.26). Dividing by (1 + z) one obtains

$$ n_x u + n_y v - \frac{n_z + \cos d}{1 + z} + n_z = 0\,, \tag{3.30} $$

and because of 2/(1 + z) = u² + v² + 1 this is a quadratic equation for u and v with the same coefficient for u² and v², i.e. a circle (or a straight line) in the u-v-plane.
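The statement that aberration maps circles of directions to circles can be checked numerically. The following sketch assumes the scalar product k·n = k⁰n⁰ − k·n and a boost in z-direction with t′ = γ(t − v z), z′ = γ(z − v t), both in units with c = 1; all function names and sample values are hypothetical. It transforms the rays of a cone and the four-vector n of (3.25) and compares the resulting directions with the predicted cone.

```python
import math

def boost_z(w, v):
    """Components of w = (t, x, y, z) for an observer moving with velocity v along z."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = w
    return (gamma * (t - v * z), x, y, gamma * (z - v * t))

def seen_direction(e, v):
    """Aberrated direction e' of a light ray with tangent vector k = (1, -e)."""
    k = boost_z((1.0, -e[0], -e[1], -e[2]), v)
    return tuple(-c / k[0] for c in k[1:])

def transformed_cone(axis, delta, v):
    """Axis n' and cos(delta') read off from n' = Lambda n with n as in (3.25), a = 1."""
    n = boost_z((-math.cos(delta), axis[0], axis[1], axis[2]), v)
    length = math.sqrt(sum(c * c for c in n[1:]))
    return tuple(c / length for c in n[1:]), -n[0] / length

def cone_directions(delta, tilt, count=8):
    """Axis (sin tilt, 0, cos tilt) and unit vectors e with e . axis = cos(delta)."""
    axis = (math.sin(tilt), 0.0, math.cos(tilt))
    b1 = (math.cos(tilt), 0.0, -math.sin(tilt))   # unit vector orthogonal to the axis
    b2 = (0.0, 1.0, 0.0)                          # orthogonal to the axis and to b1
    rays = []
    for i in range(count):
        phi = 2.0 * math.pi * i / count
        rays.append(tuple(math.cos(delta) * a
                          + math.sin(delta) * (math.cos(phi) * p + math.sin(phi) * q)
                          for a, p, q in zip(axis, b1, b2)))
    return axis, rays

axis, rays = cone_directions(delta=0.4, tilt=1.1)
new_axis, cos_new = transformed_cone(axis, 0.4, v=0.8)
for e in rays:
    e_new = seen_direction(e, 0.8)
    # deviation from the cone condition e' . n' = cos(delta'), zero up to rounding
    print(sum(a * b for a, b in zip(e_new, new_axis)) - cos_new)
```

The printed deviations vanish within rounding errors: the aberrated rays again form a circular cone, although its axis and opening angle differ from the original ones.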

Moving Ruler

Let us investigate, as illustrated in figure 3.4, the one-dimensional image of the edges of a flat infinitesimal ruler in the x-y-plane with sides dx and dy parallel to the axes. An observer at rest looking at the ruler under an angle θ = α + π/2 to the x-axis from a distance a = A/cos α perceives the edges under the angles (in radians) dx cos α/a and dy sin α/a. At the same time and at the same position an observer moving in x-direction with velocity v sees the edges of a ruler flying past him, where all pixels are shifted by aberration as compared to the stationary observer and form an angle θ′ = α′ + π/2 with the x-axis. Since aberration is conformal, the length ratio of the visible edges of the ruler is the same for the moving observer and the stationary observer. Both observers therefore see the projection of a ruler rotated by α perpendicular to the line of sight.

Fig. 3.4 Moving Ruler

If one observes a moving ruler with an angle θ′ to the x-axis and compares it with a ruler at rest, it appears to be rotated by θ′(v, θ) − θ. Because aberration preserves the relative size of small, nearby objects, a moving ruler does not appear contracted in any direction but is seen scaled by sin θ′/sin θ = cos α′/cos α (3.27). The moving observer therefore sees in the direction α′ the edges of a ruler at the distance A/cos α′. This is the distance a ruler at rest has on the x-axis shifted by A, if it is seen under the angle α′.

The visible size of the moving ruler does not depend on its velocity but only on the retarded position which it had for the observer when it emitted its light. An object at rest is observed as it is and where it is, even if one perceives its light later. A moving object is not seen where it is but where it was, with the corresponding size. Each aspect ratio of visible parts, however, is seen as it is perceived by a comoving observer at the same instant at the same place.

A concealed side of the ruler cannot be seen before the ruler has passed the observer. For the stationary observer the right, short edge of the ruler is visible only if it is behind him in x-direction. Then it is also behind the moving observer, for if two mutually moving observers meet at some time and some position, they agree upon which events occur at this time behind or ahead of them in the direction of the relative motion. In particular, the events (t = 0, x = 0, y, z) which separate behind and ahead are invariant under rotation-free Lorentz transformations in x-direction.

Fig. 3.5 Janus Angle

A light ray which is emitted at an angle θ with

$$ \cos\theta = v \tag{3.31} $$

to the direction of motion v of a particle reaches the observer in the same instant in which the particle is at his side. If seen under this angle, a uniformly moving particle is next to the observer. At this moment only the sides of the particle are visible; the co-moving observer, for whom the particle is at rest, observes it under an angle of π/2 beside, above or below himself. Because this angle separates what is behind and what is before the observer, we call it the Janus angle after the Roman god of doors and doorways, just as January lies between the old year and the remainder of the new year.

Luminosity

The images which moving observers perceive are not only deformed but also changed in their luminosity and color. We denote with n(ω, θ, ϕ) dω dt dΩ the number of photons in the frequency interval dω which an observer perceives in the time dt in the direction (θ, ϕ) in the solid angle segment dΩ = sin θ dθ dϕ. The observer moving in direction θ = 0 perceives photons in the Doppler-shifted frequency interval dω′ = D⁻¹ dω (2.29) in the solid angle segment dΩ′ = D² dΩ modified by aberration (3.27). The time interval dt′ in which the moving observer at the same position and the same moment in time sees the same number of photons is given by dt′ = D dt. This one infers most easily from the fact that for a uniform photon current the number of photons per time defines a frequency, just as the number of vibrations of the wave per time. Because both observers detect the same number of photons,

$$ n'\,d\omega'\,dt'\,d\Omega' = n' D^2\,d\omega\,dt\,d\Omega = n\,d\omega\,dt\,d\Omega\,, \tag{3.32} $$

and the moving observer sees the spectral photon current density

$$ n'(\omega', \theta', \varphi') = \frac{(1 + v\cos\theta)^2}{1 - v^2}\; n(\omega, \theta, \varphi)\,. \tag{3.33} $$
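The scale factor D of (3.27) enters both the deformation of the image and the photon current density (3.33). As a hedged numerical illustration (hypothetical function names, units with c = 1), one can compare D with a finite-difference derivative of θ′(θ) from (3.21) and with the ratio sin θ′/sin θ, and confirm that the factor in (3.33) is 1/D².

```python
import math

def aberrated_angle(theta, v):
    """theta' of (3.21)."""
    return 2.0 * math.atan(math.sqrt((1.0 - v) / (1.0 + v)) * math.tan(theta / 2.0))

def scale_factor(theta, v):
    """D of (3.27)."""
    return math.sqrt(1.0 - v * v) / (1.0 + v * math.cos(theta))

def photon_density_factor(theta, v):
    """Ratio n'/n of (3.33)."""
    return (1.0 + v * math.cos(theta)) ** 2 / (1.0 - v * v)

theta, v, h = 1.2, 0.6, 1e-6
derivative = (aberrated_angle(theta + h, v) - aberrated_angle(theta - h, v)) / (2.0 * h)
print(derivative, scale_factor(theta, v))                            # d theta'/d theta and D
print(math.sin(aberrated_angle(theta, v)) / math.sin(theta))         # sin theta'/sin theta
print(photon_density_factor(theta, v) * scale_factor(theta, v) ** 2) # n'/n times D squared
```

Since dΩ′/dΩ = (sin θ′ dθ′)/(sin θ dθ), the first two checks together reproduce dΩ′ = D² dΩ.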


3.3 Energy and Momentum

A conserved quantity is a function φ(t, x, v) of the time t, the position x and the velocity v which on the paths of physical particles retains its initial value,

$$ \varphi\Bigl(t, \mathbf{x}(t), \frac{d\mathbf{x}}{dt}(t)\Bigr) = \varphi\Bigl(0, \mathbf{x}(0), \frac{d\mathbf{x}}{dt}(0)\Bigr)\,. \tag{3.34} $$

For example, in Newtonian physics the energy E and the momentum p of a free particle with mass m,

$$ E = E_0 + \frac{1}{2}\,m\,\mathbf{v}^2\,, \qquad \mathbf{p} = m\,\mathbf{v}\,, \tag{3.35} $$

are conserved due to the equations of motion

$$ \frac{d\mathbf{p}}{dt} = 0\,, \qquad \frac{d\mathbf{x}}{dt} = \frac{1}{m}\,\mathbf{p}\,. \tag{3.36} $$

The value of the energy for a particle at rest is irrelevant in Newtonian physics (unless one considers the decay of a particle); usually E₀ is simply set to zero. The mass m is trivially conserved during the motion, because it is a constant which depends neither on the position nor on the velocity. Mass is not conserved in decays (3.54), both in Newtonian and relativistic physics. What is conserved are the numbers of baryons and leptons. This is understood as related to a symmetry only in quantum field theory.

Transformation of Additive Conserved Quantities

Of course, for a free particle all functions of the velocity are conserved quantities, since the velocity is constant for free motion. The outstanding importance of energy and momentum comes from the fact that they are additive conserved quantities. These quantities can be exchanged with other particles in interactions, such that their sum, evaluated before and after the interaction, remains unchanged. As the multiples and sums of additive conserved quantities are also additive conserved quantities, they constitute a vector space of functions φ(t, x, v).

Another observer attributes Poincaré transformed time, position and velocity to the same particle (3.12, 3.16), which cursorily we denote by M_g(t, x, v), with g a shorthand for a Poincaré transformation T_{Λ,a} : x ↦ Λ x + a. As all observers are equivalent, also φ(M_g(t, x, v)) is an additively conserved quantity and the composite function φ ∘ M_g is in the vector space of all conserved quantities. The map N_g : φ ↦ φ ∘ M_g maps this space linearly to itself, as composing functions is linear. For all Poincaré transformations g there exist linear maps N_g, acting on the values of the conserved quantities, such that

$$ N_g\,\varphi = \varphi\circ M_g\,, \qquad N_g\circ\varphi\circ M_g^{-1} = \varphi\,. \tag{3.37} $$

Mathematically, additive conserved quantities φ are equivariant functions (A.38). Stated catchily and loosely: physics is the same for all observers. Successive Poincaré transformations g₁ and g₂ yield their product g₂ g₁ (3.13), M_{g₂} ∘ M_{g₁} = M_{g₂ g₁}, therefore the linear maps N_g satisfy

$$ N_{g_2 g_1}\,\varphi = (\varphi\circ M_{g_2})\circ M_{g_1} = (N_{g_2}\varphi)\circ M_{g_1} = N_{g_2}(\varphi\circ M_{g_1}) = N_{g_2} N_{g_1}\,\varphi\,. \tag{3.38} $$

As this holds for all φ, the matrices N_g satisfy

$$ N_{g_2 g_1} = N_{g_2} N_{g_1}\,. \tag{3.39} $$

A map N : g ↦ N_g which attributes matrices to the elements g of a group G such that their matrix product N_{g₂} N_{g₁} corresponds to the group product g₂ g₁ is called a representation of the group G.

Four-Momentum of a Particle

In the simplest example, the representation matrices N_g act in a four-dimensional space of conserved quantities p = (p⁰, p¹, p², p³), which in anticipation of the result of our investigation we call the four-momentum. Translations T_a : x′ = x + a are taken to be represented trivially by N_{T_a} = 1 and Lorentz transformations Λ (including rotations R) by themselves, N_Λ = Λ. Then (3.37) reads

$$ p(t + a^0, \mathbf{x} + \mathbf{a}, \mathbf{v}) = p(t, \mathbf{x}, \mathbf{v})\,, \qquad p\bigl(M_\Lambda(t, \mathbf{x}, \mathbf{v})\bigr) = \Lambda\,p(t, \mathbf{x}, \mathbf{v})\,. \tag{3.40} $$

As spacetime translations do not change p(t, x, v), it depends only on the velocity v. A particle at rest traces out a worldline Γ_rest : t ↦ (t, a) with vanishing velocity. Since v = 0 is invariant under rotations, M_R(t, a, 0) = (t, Ra, 0), and since the four-momentum only depends on the velocity, rotations R do not change the four-momentum of a particle at rest, R p_rest = p_rest. Accordingly, for a particle at rest the spatial part p = (p¹, p², p³) of the four-momentum vanishes and the momentum at rest has the form

$$ p_{\text{rest}} = (m, 0, 0, 0)\,. \tag{3.41} $$

The particle which traces out the Lorentz transformed worldline Γ_{v,x} = L_{v,x} Γ_rest, with L_{v,x} given by (3.9), moves with velocity v in x-direction. It has the Lorentz transformed momentum L_{v,x} p_rest,

$$ \begin{pmatrix} p^0 \\ p^1 \end{pmatrix} = \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 & v \\ v & 1 \end{pmatrix} \begin{pmatrix} m \\ 0 \end{pmatrix} = \begin{pmatrix} \dfrac{m}{\sqrt{1 - v^2}} \\[2ex] \dfrac{m\,v}{\sqrt{1 - v^2}} \end{pmatrix}. \tag{3.42} $$

If finally we rotate the velocity and the momentum into an arbitrary direction, we obtain

$$ p^0(\mathbf{v}) = \frac{m}{\sqrt{1 - \mathbf{v}^2}}\,, \qquad \mathbf{p}(\mathbf{v}) = \frac{m\,\mathbf{v}}{\sqrt{1 - \mathbf{v}^2}}\,. \tag{3.43} $$

We give the components of the four-momentum the same names as the quantities in Newtonian physics with which they agree at small velocities. Up to higher powers of v one has

$$ p^0(\mathbf{v}) = m + \frac{1}{2}\,m\,\mathbf{v}^2 + \dots\,, \qquad \mathbf{p}(\mathbf{v}) = m\,\mathbf{v} + \dots\,. \tag{3.44} $$
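How quickly the relativistic expressions (3.43) depart from their Newtonian limits (3.44) can be seen in a short table. This is a sketch in units with c = 1 for motion along one axis; the function names are hypothetical.

```python
import math

def four_momentum(m, v):
    """Energy p0 and momentum p1 of (3.43) for speed v along one axis."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * m, gamma * m * v

def newtonian(m, v):
    """Leading terms of (3.44): m + m v^2 / 2 and m v."""
    return m + 0.5 * m * v * v, m * v

for v in (0.001, 0.01, 0.1, 0.5, 0.9):
    p0, p1 = four_momentum(1.0, v)
    e_newton, p_newton = newtonian(1.0, v)
    # deviations from the Newtonian terms, and the invariant p0^2 - p1^2
    print(v, p0 - e_newton, p1 - p_newton, p0 * p0 - p1 * p1)
```

The deviations grow with the velocity, whereas the last column stays at m² = 1 for every v.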

Therefore (making factors c explicit for once) E = c p⁰ is the energy,

$$ E(\mathbf{v}) = \frac{m\,c^2}{\sqrt{1 - \frac{\mathbf{v}^2}{c^2}}}\,, \tag{3.45} $$

the spatial part of the four-momentum is the momentum p and m is the mass of the particle. It has to be positive. The energy is bounded from below by the energy m (in units with c = 1) of the particle at rest. It increases with |v| and diverges if |v| approaches the velocity of light. A massive particle is always slower than light.

A massless particle, m = 0, has to move with the speed of light if it is to carry energy and momentum. Otherwise (3.43) would apply and exclude both. Let the particle move in x-direction such that dx/dt = (1, 1, 0, 0) is the tangential vector to its world line. It is invariant under rotations around the x-axis and under multiplication with³ Λ_a = e^{aω}, where ω satisfies (ηω)ᵀ = −(ηω),

$$ \omega = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \Lambda_a = e^{a\omega} = \begin{pmatrix} 1 + \frac{a^2}{2} & -\frac{a^2}{2} & a & 0 \\ \frac{a^2}{2} & 1 - \frac{a^2}{2} & a & 0 \\ a & -a & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.46} $$

Its four-momentum p must also be invariant under rotations around the x-axis and under multiplication by Λ_a. Thus, the y- and z-components of the momentum p vanish and p⁰ = p¹ has to hold, p = |p|(1, 1, 0, 0). If the particle moves in an arbitrary direction, one has p = |p|(1, v), where |v| = 1,

$$ p^0 = E = |\mathbf{p}|\,, \qquad \mathbf{p} = E\,\mathbf{v}\,. \tag{3.47} $$

Also for particles which move with the velocity of light the four-momentum is a multiple of the tangent vector of the world line. For massive and for massless particles the velocity v is the ratio (3.15)

$$ \mathbf{v} = \frac{\mathbf{p}}{p^0} = \frac{\mathbf{p}}{\sqrt{m^2 + \mathbf{p}^2}}\,. \tag{3.48} $$

³ As the columns of Λ_a consist of the components of the images of the standard basis vectors and as they constitute an orthonormal four-basis, the Λ_a are Lorentz matrices.
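The properties claimed for the matrices Λ_a of (3.46) can be verified with a few lines of plain Python. The sketch below assumes the metric η = diag(1, −1, −1, −1) and the Lorentz condition Λᵀ η Λ = η; the helper names are hypothetical.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def lambda_a(a):
    """The matrix exp(a * omega) as written out in (3.46)."""
    h = a * a / 2.0
    return [[1 + h, -h, a, 0.0],
            [h, 1 - h, a, 0.0],
            [a, -a, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def max_difference(A, B):
    """Largest absolute difference of corresponding matrix elements."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

L = lambda_a(0.7)
# Lorentz condition: the columns form an orthonormal four-basis
print(max_difference(matmul(transpose(L), matmul(ETA, L)), ETA))
# the lightlike tangent vector (1, 1, 0, 0) is left invariant
print([sum(L[i][k] * t for k, t in enumerate((1, 1, 0, 0))) for i in range(4)])
# one-parameter group: Lambda_a Lambda_b = Lambda_(a+b)
print(max_difference(matmul(lambda_a(0.7), lambda_a(-0.2)), lambda_a(0.5)))
```

The exponential series terminates after the quadratic term because ω³ = 0, which is why the entries of Λ_a are polynomials in a.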

The velocity is a function of the momentum and the momentum is conserved. This is the statement that particles, whether massive or massless, are inert. To change their velocity, forces have to transfer momentum in the course of time. The vacuum is the same for all observers and therefore has a four-momentum which is invariant under all transformations p′ = Λ p. Therefore its energy and its momentum must vanish pvacuum = (0, 0, 0, 0). This also holds for the contribution of the so-called quantum fluctuations to the energy which trouble some physicists.

3.4 Mass

The mass is the length of the four-momentum of a free particle. No matter what its velocity is, in units with c = 1 it satisfies

$$ p^2 = (p^0)^2 - \mathbf{p}^2 = m^2\,, \qquad p^0 = \sqrt{m^2 + \mathbf{p}^2} > 0\,. \tag{3.49} $$

This is the equation for a sheet of a hyperboloid: the four-momenta of a free particle lie on the mass-shell. The relation (3.49) of energy and momentum also holds for particles which move with the velocity of light, for instance for photons. They are massless. Their four-momentum p is lightlike,

$$ p^2 = (p^0)^2 - \mathbf{p}^2 = 0\,, \qquad p^0 = |\mathbf{p}| > 0\,. \tag{3.50} $$

Photons with four-momentum p correspond to quanta of plane electromagnetic waves (5.82) with four-wave-vector k = (|k|, k). Here the momentum p = ħ k is a multiple of the wave-vector; the factor ħ = 1.055 · 10⁻³⁴ Js [47] is Planck’s constant. The energy E = ħω = ħ|k| of the photons is a multiple of the frequency ν = ω/(2π) of the electromagnetic wave. This relation is fundamental for Planck’s derivation of the thermal radiation density and Einstein’s interpretation of the photoelectric effect. According to (3.45) particles at rest have the energy

$$ E_{\text{rest}} = m\,c^2\,. \tag{3.51} $$

This is probably the most famous equation in physics. It underlies the realization that during the transmutation of atomic nuclei by fission or fusion energy can be released, for the total mass of the nuclei is measurably different from the sum of the individual masses. The mass difference is due to the binding energy, which can be used in war or peace, destructively or beneficially. Equation (3.45) also contains the statement that it requires infinite energy to accelerate a massive particle to the speed of light. Massive particles are always slower than light.

The mass m is independent of the velocity. In some presentations of the theory of relativity the term “mass” is used for the product γ m with the velocity dependent factor γ = 1/√(1 − v²). This is a waste of a denomination, because for E = γ m c² we already have a name: energy. These days one denotes with mass the quantity which in old and outdated presentations goes by the long-winded “rest mass”. Moreover, if one denotes γ m “mass” it is tempting to insert this into formulas of Newtonian physics, which emerge in the limit of low velocities from relativistic physics, expecting to obtain equations which are valid for all velocities. Even if this is true in one case, for the momentum p = γ m v, one nearly always obtains nonsense: the kinetic energy is neither γ m v²/2 nor p²/(2γ m).

A particle does not become heavier with increasing speed: weight depends on the acceleration as compared to free fall. A fast moving particle does not generate the gravity of a mass magnified by a factor γ. It does not become a black hole by rapid motion. If it required only a factor γ, then Einstein would have needed ten minutes rather than ten years to include gravity into his relativistic formulation of mechanics and electrodynamics.

Force is not mass times acceleration. The equations of motion of a relativistic charged particle state, as we will show in section 5.3, that the total momentum of the particle and the fields remains conserved,

$$ \mathbf{F} = \frac{d\mathbf{p}}{dt}\,. \tag{3.52} $$

The force F is the momentum per time dp/dt transferred to the particle. If the force is neither parallel nor orthogonal to the velocity, then the particle is not accelerated in the direction of the force,

$$ \frac{d\mathbf{v}}{dt} = \sum_i \frac{\partial\mathbf{v}}{\partial p^i}\,\frac{dp^i}{dt} \overset{(3.48)}{=} \frac{\mathbf{F}}{\sqrt{m^2 + \mathbf{p}^2}} - \frac{(\mathbf{p}\cdot\mathbf{F})\,\mathbf{p}}{\bigl(\sqrt{m^2 + \mathbf{p}^2}\bigr)^3} = \frac{1}{\sqrt{m^2 + \mathbf{p}^2}}\,\bigl(\mathbf{F} - (\mathbf{v}\cdot\mathbf{F})\,\mathbf{v}\bigr)\,. \tag{3.53} $$

Inertia of fast particles depends on the direction of the velocity v. A transverse force causes an acceleration dv⊥/dt = √(1 − v²) F⊥/m; parallel to the velocity, the particle is more inert by a factor 1/(1 − v²). Also massless particles are inert, dv⊥/dt = F⊥/|p|, in their direction of motion even infinitely inert, dv∥/dt = 0. If one defined force to denote mass times acceleration, it would not satisfy “actio et reactio”, because momentum, not mass times velocity, is conserved.

If macroscopic bodies move and interact with each other, then effects from the finite speed of sound become large long before relativistic effects are measurable. At high relative velocities the binding of the constituents of macroscopic bodies becomes negligible; they do not behave as rigid bodies but nearly as a gas of free particles which collide, restricted by energy and momentum conservation, and which interact with fields like the electromagnetic or the gravitational field.
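The direction dependence of inertia in (3.53) can be made concrete with numbers. The sketch below evaluates the right-hand side of (3.53) for a particle with m = 1 moving with speed 0.6 along x, in units with c = 1; the function name and the sample forces are hypothetical.

```python
import math

def acceleration(m, p, force):
    """dv/dt of (3.53) for momentum p and force F given as 3-tuples."""
    energy = math.sqrt(m * m + sum(c * c for c in p))
    v = [c / energy for c in p]                        # velocity, see (3.48)
    v_dot_f = sum(a * b for a, b in zip(v, force))
    return [(f - v_dot_f * vi) / energy for f, vi in zip(force, v)]

m, speed = 1.0, 0.6
p = (m * speed / math.sqrt(1.0 - speed ** 2), 0.0, 0.0)   # motion along x
print(acceleration(m, p, (0.0, 1.0, 0.0)))   # transverse force
print(acceleration(m, p, (1.0, 0.0, 0.0)))   # force parallel to the velocity
print(acceleration(m, p, (1.0, 1.0, 0.0)))   # oblique force
```

For the oblique force the two non-vanishing components of the acceleration differ although the components of the force are equal, so the acceleration does not point in the direction of the force.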

3.5 Decay into Two Particles

If a particle at rest with mass m decays into two particles with masses m₁ and m₂, then the energies of the decay products are fixed by the masses involved. Due to momentum conservation the momentum p of one particle is opposite to the momentum of the other. Their energies are E₁ = √(m₁² + p²) and E₂ = √(m₂² + p²), since energy and momentum lie on the mass shell (3.49). Energy conservation implies that the sum of these energies coincides with the energy m of the decaying particle at rest,

$$ m = \sqrt{m_1^2 + \mathbf{p}^2} + \sqrt{m_2^2 + \mathbf{p}^2} > m_1 + m_2\,. \tag{3.54} $$

In particular, the mass m of the decaying particle is not conserved. It is greater than the sum of the masses of the decay products. This is the same geometric fact as with the twin paradox, that the sum p₁ + p₂ of two timelike four-vectors is longer than the sum of the individual lengths of p₁ and p₂. Repeatedly taking the square and reordering terms one solves for p², E₁ and E₂,

$$ \mathbf{p}^2 = \frac{1}{4m^2}\,\bigl(m^4 + m_1^4 + m_2^4 - 2m^2 m_1^2 - 2m^2 m_2^2 - 2m_1^2 m_2^2\bigr)\,, \tag{3.55} $$

$$ E_1 = \frac{1}{2m}\,\bigl(m^2 + m_1^2 - m_2^2\bigr)\,, \qquad E_2 = \frac{1}{2m}\,\bigl(m^2 - m_1^2 + m_2^2\bigr)\,. \tag{3.56} $$

If more generally two particles are not at rest, then their momenta p₁ = aP + p and p₂ = bP − p consist of a fraction of the total momentum P = p₁ + p₂, P² = m², with a + b = 1, and their relative momentum p with p · P = 0,

$$ p_1 = \Bigl(1 + \frac{m_1^2 - m_2^2}{m^2}\Bigr)\frac{P}{2} + p\,, \qquad p_2 = \Bigl(1 - \frac{m_1^2 - m_2^2}{m^2}\Bigr)\frac{P}{2} - p\,, $$

$$ p = \Bigl(1 - \frac{m_1^2 - m_2^2}{m^2}\Bigr)\frac{p_1}{2} - \Bigl(1 + \frac{m_1^2 - m_2^2}{m^2}\Bigr)\frac{p_2}{2}\,, \qquad \sqrt{-p^2} = \frac{w(m^2, m_1^2, m_2^2)}{2m}\,, \tag{3.57} $$

where w is the permutation symmetric function

$$ w(x, y, z) = \sqrt{x^2 + y^2 + z^2 - 2xy - 2xz - 2yz} = \sqrt{x - (\sqrt{y} + \sqrt{z})^2}\;\sqrt{x - (\sqrt{y} - \sqrt{z})^2}\,. \tag{3.58} $$
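The decay kinematics (3.55), (3.56) and the function w of (3.58) are easily checked against each other. The following sketch uses units with c = 1 and purely illustrative masses; the function names are hypothetical.

```python
import math

def two_body_decay(m, m1, m2):
    """Energies E1, E2 (3.56) and squared momentum (3.55) for a decay at rest."""
    e1 = (m * m + m1 * m1 - m2 * m2) / (2.0 * m)
    e2 = (m * m - m1 * m1 + m2 * m2) / (2.0 * m)
    p_squared = (m ** 4 + m1 ** 4 + m2 ** 4
                 - 2 * m * m * m1 * m1 - 2 * m * m * m2 * m2
                 - 2 * m1 * m1 * m2 * m2) / (4.0 * m * m)
    return e1, e2, p_squared

def w(x, y, z):
    """Permutation symmetric function (3.58)."""
    return math.sqrt(x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z)

m, m1, m2 = 10.0, 3.0, 4.0                # illustrative values with m > m1 + m2
e1, e2, p_squared = two_body_decay(m, m1, m2)
print(e1 + e2, m)                          # energy conservation (3.54)
print(e1 * e1 - p_squared, m1 * m1)        # particle one is on its mass shell
print(e2 * e2 - p_squared, m2 * m2)        # particle two is on its mass shell
print(w(m * m, m1 * m1, m2 * m2) / (2.0 * m), math.sqrt(p_squared))
```

Each printed pair should agree. For m < m₁ + m₂ the argument of the square root in w becomes negative and the decay is kinematically forbidden.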

If tachyons existed with a spacelike four-momentum p, p² = −m², then the modulus of their momentum, |p| = √(m² + E²), not their energy E, would be bounded from below. If they interacted electromagnetically, they could emit photons with arbitrarily large energy, because the tachyon mass shell is a doubly-ruled surface [70] and the four-momentum of the tachyon after the emission of the photon with four-momentum k, k² = 0, is again on-shell, (p − k)² = −m², if p · k = 0 (3.26), i.e. if the photon is emitted with an angle cos θ = p⁰/|p| to the momentum of the tachyon.

3.6 Compton Scattering

On elastic scattering of two particles, i.e. for a scattering process where the number of particles and their masses remain unchanged, the conservation of energy and momentum fixes the energies after the collision as a function of the scattering angle θ and the initial energies. Let us consider, for example, an incident photon with energy E which is scattered elastically from an electron initially at rest. This process is called Compton scattering.

Let p and q be the four-momenta of the photon and the electron before and p′ and q′ after the scattering. Four-momentum conservation implies

$$ p + q = p' + q'\,, \tag{3.59} $$

or in more detail, if we choose the x-axis in the initial direction of motion of the photon and the y-axis such that the photon, scattered by an angle θ, finally moves in the x-y-plane,

$$ \begin{pmatrix} E \\ E \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} m \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} E' \\ E'\cos\theta \\ E'\sin\theta \\ 0 \end{pmatrix} + \begin{pmatrix} m + E - E' \\ E - E'\cos\theta \\ -E'\sin\theta \\ 0 \end{pmatrix}. \tag{3.60} $$

Here we have already taken into account that the scattered photon with energy E′ is massless and satisfies p′² = 0. That also the electron is on its mass shell after the collision, q′² = m², relates the scattering angle and the energies,

$$ (m + E - E')^2 - (E - E'\cos\theta)^2 - (E'\sin\theta)^2 = m^2\,. \tag{3.61} $$

The squares of E, E′ and m cancel. With the conventional factors c one obtains

$$ \frac{m\,c^2}{E'} = \frac{m\,c^2}{E} + 1 - \cos\theta\,. \tag{3.62} $$

So the energy of the scattered photon is fixed by the scattering angle. The energy is smaller than the energy of the incoming photon. This contradicts the notion of an incoming electromagnetic wave, corresponding to the photon of energy E = ħω, which accelerates the charged electron, which then itself emits a wave with the scattered photons. In such a process the frequency of the emitted wave would coincide with the original frequency. Equation (3.62), on the other hand, follows from the assumption that electrons are particles and that electromagnetic waves consist of particles, namely photons. However, this is not a proof of the particle property of electromagnetic waves. One may also obtain (3.62) if one treats the electron as well as the photon as a wave, which we did not do. That waves behave like particles and that particles have properties of waves belongs to the basic findings of quantum physics.
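Relation (3.62) can be evaluated directly and inserted back into the mass-shell condition (3.61). The sketch below works in units with c = 1; the electron rest energy of 0.511 MeV is an assumed value which does not appear in the text, and the function names are hypothetical.

```python
import math

def scattered_energy(energy, theta, m):
    """Photon energy E' after scattering by the angle theta, solved from (3.62)."""
    return 1.0 / (1.0 / energy + (1.0 - math.cos(theta)) / m)

def mass_shell_residual(energy, theta, m):
    """Left side minus right side of (3.61); it vanishes if the electron is on shell."""
    e_new = scattered_energy(energy, theta, m)
    return ((m + energy - e_new) ** 2 - (energy - e_new * math.cos(theta)) ** 2
            - (e_new * math.sin(theta)) ** 2 - m * m)

M_ELECTRON = 0.511   # electron rest energy in MeV (assumed value, not from the text)
for degrees in (0, 45, 90, 180):
    theta = math.radians(degrees)
    print(degrees, scattered_energy(1.0, theta, M_ELECTRON),
          mass_shell_residual(1.0, theta, M_ELECTRON))
```

For θ = 0 the photon keeps its energy; for every other angle the scattered energy is smaller, and the residual of (3.61) vanishes within rounding errors.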

Chapter 4

Relativistic Particles

Abstract The time evolution of particles and fields is characterized by a corresponding functional, the action. It is stationary for the real evolution as compared to other imaginable evolutions. This principle of the stationary action allows one to understand the conservation of energy, momentum and angular momentum as consequences of symmetries, because to each infinitesimal symmetry of the action there corresponds a conserved quantity and, vice versa, to each conserved quantity there corresponds an infinitesimal symmetry of the action. Physical and geometrical properties, conservation laws and symmetries, are intimately related by this Noether theorem.

4.1 Clocks on Worldlines

In the course of time a clock traverses its worldline f, parameterized by some parameter s from some real interval I,

$$ f : \begin{cases} I \subset \mathbb{R} \to \mathbb{R}^4 \\ s \mapsto f(s) \end{cases} \tag{4.1} $$

and shows in each event f(s) = (f⁰(s), f¹(s), f²(s), f³(s)) the time τ(s) which goes by on the clock. The coordinate time f⁰ is assumed to increase monotonically. Then the events on the worldline are traversed precisely once and causally ordered. On a straight worldline the time on the clock increases by (2.42)

$$ \Delta\tau = ds\,\sqrt{\Bigl(\frac{df^0}{ds}\Bigr)^2 - \Bigl(\frac{df^1}{ds}\Bigr)^2 - \Bigl(\frac{df^2}{ds}\Bigr)^2 - \Bigl(\frac{df^3}{ds}\Bigr)^2} = ds\,\sqrt{\frac{df}{ds}\cdot\frac{df}{ds}} \tag{4.2} $$

between adjacent events with difference vector ds (df⁰/ds, df¹/ds, df²/ds, df³/ds). The root is real, because events on worldlines of clocks are mutually timelike and the tangent vector df/ds has positive length squared.

If the worldline f is not straight but accelerated, it is approximated by an increasingly finer sequence of small straight segments. An ideal clock measures and adds the times which pass on these segments.

Time: An ideal clock records the time

$$ \tau[f] = \int_f \Delta\tau = \int_{\underline{s}}^{\overline{s}} ds\,\sqrt{\Bigl(\frac{df^0}{ds}\Bigr)^2 - \Bigl(\frac{df^1}{ds}\Bigr)^2 - \Bigl(\frac{df^2}{ds}\Bigr)^2 - \Bigl(\frac{df^3}{ds}\Bigr)^2} \tag{4.3} $$

between the events $A = f(\underline{s})$ and $B = f(\overline{s})$ on its worldline f.

In contrast to Newtonian physics the time on the clock is not a function of the events A and B, and Δτ is not the derivative dτ of a function τ which is defined in spacetime. The time on the clock does not pertain to the events A and B alone but is the path length of the worldline f from A to B.

Time is additive. If C is an event on the worldline f in between A and B and if f₁ and f₂ denote the parts of the worldline to and from C, then in a suggestive notation f = f₁ + f₂ and

$$ \int_{f_1 + f_2} \Delta\tau = \int_{f_1} \Delta\tau + \int_{f_2} \Delta\tau\,. \tag{4.4} $$

The time τ[f] is independent of the parameterization of the worldline. This holds because each other parameterization f′ : s′ ↦ f′(s′) of the worldline with monotonically increasing coordinate time f′⁰ is given by f′(s′) = f(s(s′)) with monotonically increasing s(s′), therefore ds/ds′ = √((ds/ds′)²). By the chain rule one has df′/ds′ = (ds/ds′) df/ds, and with the integral substitution theorem the assertion follows,

$$ \int_{\underline{s}'}^{\overline{s}'} ds'\,\sqrt{\frac{df'}{ds'}\cdot\frac{df'}{ds'}} = \int_{\underline{s}'}^{\overline{s}'} ds'\,\frac{ds}{ds'}\,\sqrt{\frac{df}{ds}\cdot\frac{df}{ds}} = \int_{\underline{s}}^{\overline{s}} ds\,\sqrt{\frac{df}{ds}\cdot\frac{df}{ds}}\,. \tag{4.5} $$

If one parameterizes the worldline with the coordinate time, f⁰(s) = s, then the worldline is given by f : t ↦ (t, f¹(t), f²(t), f³(t)) and the tangent vector has the components df/dt = (1, df¹/dt, df²/dt, df³/dt) = (1, v_x, v_y, v_z) and the length squared 1 − v². So the clock registers the time

$$ \tau[f] = \int_{\underline{t}}^{\overline{t}} dt\,\sqrt{1 - \mathbf{v}^2(t)}\,. \tag{4.6} $$
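The definition of the clock time as a sum over short straight segments, its independence of the parameterization and the formula (4.6) can be compared numerically. The following is a sketch in units with c = 1 for a clock on a circular path; the radius, the angular velocity and all function names are illustrative assumptions.

```python
import math

def clock_time(worldline, s_start, s_end, steps=20000):
    """Approximate (4.3): add the times which pass on short straight segments."""
    total = 0.0
    previous = worldline(s_start)
    for i in range(1, steps + 1):
        current = worldline(s_start + (s_end - s_start) * i / steps)
        dt, dx, dy, dz = (c - p for c, p in zip(current, previous))
        total += math.sqrt(dt * dt - dx * dx - dy * dy - dz * dz)
        previous = current
    return total

RADIUS, OMEGA = 1.0, 0.6   # illustrative values, speed v = RADIUS * OMEGA = 0.6

def circle(t):
    """Circular worldline, parameterized by the coordinate time t."""
    return (t, RADIUS * math.cos(OMEGA * t), RADIUS * math.sin(OMEGA * t), 0.0)

def circle_reparameterized(s):
    """The same worldline with a different parameter s, where t = s**2."""
    return circle(s * s)

print(clock_time(circle, 0.0, 10.0))
print(clock_time(circle_reparameterized, 0.0, math.sqrt(10.0)))
print(10.0 * math.sqrt(1.0 - (RADIUS * OMEGA) ** 2))   # (4.6) for constant speed
```

All three numbers agree up to the discretization error: for constant speed the integral (4.6) is the coordinate time multiplied by √(1 − v²), whatever the parameter used to describe the worldline.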

If one chooses the parameter s along the worldline such that the tangent vector has unit length, (df/ds)² = 1, everywhere, then s coincides up to a constant with τ, the proper time which passes on the clock. Conversely, the tangent vector has unit length everywhere if the worldline is parameterized with its proper time,

$$ \Bigl(\frac{df}{ds}\Bigr)^2 = 1 \quad\Leftrightarrow\quad \tau[f] = \overline{s} - \underline{s}\,. \tag{4.7} $$

On straight worldlines τ² coincides with the length squared (2.45) of the difference vector w_BA (2.51) from A to B. The proper time τ[f] is independent of the acceleration in the sense that the integrand Δτ (4.2) does not depend on the second derivatives d²f/ds². Yet, the time τ[f] depends on the worldline f and has different values for straight and for curved worldlines from A to B. By definition an ideal clock records the path length of the worldline irrespective of its acceleration.

Whether a real clock is ideal has to be checked by measurement. Many real clocks deviate considerably from ideal clocks. Sundials display the angle of an earthbound axis to the sun. Pendulum clocks and hourglasses measure the acceleration. Paradoxically, an hourglass runs if one keeps hold of it and stops if one lets it loose and drops it, even before it hits the ground. Each pendulum circulates with constant angular velocity if one drops the pendulum clock. Quartz clocks change their frequency if the accelerating forces deform the quartz crystal. In quantum mechanical systems the modification of the Hamiltonian which causes the acceleration normally also changes the energy levels with which the clock operates. If, for example, atoms are stored in a magnetic field, it detunes the transition frequencies.

Clocks can be realized by completely different physical processes, for instance by comparison with the rotation of the earth or by electromagnetic transitions in atoms. After the exclusion of obviously unsuitable clocks, e.g. moving sundials, and after correction of the known imperfections of real clocks, in particular after gravitational corrections, (4.3) agrees without exception with all observations. Among others, it underlies the everyday operation of the global positioning system [46, 54].

To check our understanding of elementary particles, physicists [3] keep muons in a cyclotron on a circular path and compare the observed rotation of the muon spin in the magnetic field to the theoretical predictions with a precision of nine decimals. All observations are compatible with the simple assumption that the inner clock of the muons, which controls their decay, records the time (4.3) independently of the acceleration. There was no deviation from (4.3) within the measurement precision of 1%, even though for high energy muons the acceleration in the cyclotron, a, amounts to roughly 4 · 10¹⁶ times the acceleration by the gravity of the earth. However, in terms of the muon mass m_µ c² = 105 MeV [47], Planck’s constant ħ = 1.05 · 10⁻³⁴ Js and the speed of light c, the acceleration is still small, a ħ/(m_µ c³) ≈ 10⁻¹³. This number is the change of the velocity of the muon in the cyclotron relative to the speed of light during one oscillation of its wave function.

4 Relativistic Particles

4.2 Free Particles

On straight worldlines, the worldlines of free particles, more time passes between each pair of events A and B than on any other worldline from A to B. To show this one considers a worldline $s \mapsto f(s)$ and evaluates the time τ for neighbouring curves $s \mapsto f(s) + \delta f(s)$ which also pass through A and B, $\delta f(\underline{s}) = 0$ and $\delta f(\overline{s}) = 0$, where $\underline{s}$ and $\overline{s}$ denote the initial and the final value of the parameter. Then τ changes to first order in δf by

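$$\delta\tau = \int_{\underline{s}}^{\overline{s}} ds\,\Bigl(\sqrt{\frac{d(f+\delta f)}{ds}\cdot\frac{d(f+\delta f)}{ds}} - \sqrt{\frac{df}{ds}\cdot\frac{df}{ds}}\,\Bigr) = \int_{\underline{s}}^{\overline{s}} ds\,\frac{d\,\delta f}{ds}\cdot\frac{df}{ds}\,\Bigl(\frac{df}{ds}\cdot\frac{df}{ds}\Bigr)^{-\frac12} \tag{4.8}$$

and after integration by parts one obtains

$$\delta\tau = -\int_{\underline{s}}^{\overline{s}} ds\,\delta f\cdot\frac{d}{ds}\Bigl(\frac{df}{ds}\,\Bigl(\sqrt{\frac{df}{ds}\cdot\frac{df}{ds}}\,\Bigr)^{-1}\Bigr) . \tag{4.9}$$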

No boundary terms occur on integrating by parts, since $\delta f(\underline{s}) = 0$ and $\delta f(\overline{s}) = 0$. For a worldline f of extremal length of time, δτ vanishes for all variations δf which vanish at the boundary. So the map f satisfies the differential equation

$$\frac{d}{ds}\,\frac{\frac{df}{ds}}{\sqrt{\frac{df}{ds}\cdot\frac{df}{ds}}} = 0 , \tag{4.10}$$

which states that the direction of the tangent is constant. The length of the tangent vector $df/ds$ cannot be fixed by the condition of stationary time, since the time τ, as shown in (4.5), does not depend on the parameterization of the worldline.

That (4.10) is not only sufficient but also necessary for the time to be stationary is shown by the following consideration. If in the integral (4.9) one component, say the zero component, of the vector which multiplies δf is greater than zero at some value of the parameter, then it is positive in a whole neighbourhood, because by assumption it is continuous. If one then chooses the corresponding $\delta f^0$ such that it is positive inside this neighbourhood and vanishes outside, then δτ is negative and the time τ is not stationary.

If one chooses the parameterization such that the tangent vector has unit length (4.7), then the parameter s coincides up to a constant with the time τ on the worldline and (4.10) states that the tangent vector does not change along the path,

$$\frac{d^2 f}{ds^2} = 0 . \tag{4.11}$$

If moreover we choose the constant by $f^0(0) = 0$, then the worldline f of extremal time is the straight worldline (2.39) of a free particle

$$f_{\text{free}} : s \mapsto \frac{s}{\sqrt{1-\mathbf{v}^2}}\begin{pmatrix} 1 \\ \mathbf{v} \end{pmatrix} + \begin{pmatrix} 0 \\ \mathbf{x}(0) \end{pmatrix} . \tag{4.12}$$

The initial position x(0) and the initial velocity v are chosen by the initial conditions. On the straight worldline from A to B the time τ is not only extremal but longer than on each curved worldline, as our discussion of the twin paradox has shown. Time between two events is maximal on the straight worldline which connects them.
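The statement that the straight worldline carries the longest time can be probed numerically. The following is a minimal sketch, assuming units with c = 1, one spatial dimension and a hypothetical sinusoidal detour between the events A = (0, 0) and B = (2, 0); the function names and the amplitude are illustrative choices, not taken from the text.

```python
import math

def proper_time(x_of_t, t_start, t_end, steps=20000):
    """Approximate tau = integral of sqrt(1 - v^2) dt (units with c = 1).

    Each small interval contributes sqrt(dt^2 - dx^2), where dx is the
    change of position during dt.
    """
    dt = (t_end - t_start) / steps
    tau = 0.0
    for i in range(steps):
        t0 = t_start + i * dt
        dx = x_of_t(t0 + dt) - x_of_t(t0)
        tau += math.sqrt(dt * dt - dx * dx)
    return tau

def straight(t):
    # clock at rest: straight worldline from A = (0, 0) to B = (2, 0)
    return 0.0

def detour(t, amplitude=0.3):
    # hypothetical curved worldline with the same end points and |v| < 1
    return amplitude * math.sin(math.pi * t / 2.0)

tau_straight = proper_time(straight, 0.0, 2.0)
tau_detour = proper_time(detour, 0.0, 2.0)
print(tau_straight, tau_detour)
```

The clock on the detour shows less time than the clock at rest, in line with the discussion of the twin paradox.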

4.3 Jet Functions

Each smooth, parameterized curve f (a map of some parameter interval $I \subset \mathbb{R}$ to $\mathbb{R}^d$, where the dimension d is called the number of degrees of freedom) naturally defines its lift of order k

$$\hat f : t \mapsto \hat f(t) = \Bigl(t, f(t), \frac{df}{dt}(t), \frac{d^2 f}{dt^2}(t), \dots, \frac{d^k f}{dt^k}(t)\Bigr) \tag{4.13}$$

to a curve in the jet space $J_k = I \times \mathbb{R}^d \times \mathbb{R}^d \times \dots \times \mathbb{R}^d = I \times \mathbb{R}^{d(k+1)}$.

Traditionally the coordinates of points of $J_1$ carry names like (t, x, v) for the time t, the position $x = (x^1, x^2, \dots, x^d)$ and the velocity $v = (v^1, v^2, \dots, v^d)$. Correspondingly and in a more systematic notation, which indicates the order of differentiation, a point in $J_k$ is given by coordinates $(t, x, x_{(1)}, x_{(2)}, \dots, x_{(k)})$. Functions of $J_k$ are also functions of $J_m$ for m > k; they just do not depend on $x_{(k+1)}, \dots, x_{(m)}$.

Jet space (or a domain of it) is the domain of definition of conserved quantities; for example the energy $E(t, x, v) = m/\sqrt{1-v^2}$ (3.45) of a free particle with mass m is a function of the domain |v| < 1 of $J_1$. Composed with the lift of a curve f one obtains the energy on the curve in the course of time, $t \mapsto (E \circ \hat f)(t)$. On the straight worldline of a free particle $E \circ \hat f_{\text{free}}$ is constant. Nevertheless, the energy E is not a constant, but a jet function which varies with the velocity. The same applies to each other conserved quantity (3.34). Conserved quantities are jet functions φ which become constant if composed with the lift of a physical path, $d/dt\,(\phi \circ \hat f_{\text{physical}}) = 0$.

The convective derivative $d_t$ differentiates jet functions φ such that the subsequent composition of $d_t\phi$ with the lift of a path yields the derivative of the composed function with respect to the parameter t of the path,

$$(d_t\phi) \circ \hat f = \frac{d}{dt}(\phi \circ \hat f) . \tag{4.14}$$

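It consists of the partial derivative with respect to t and differentiates with respect to each variable $x_{(k)}$ and replaces it by the variable $x_{(k+1)}$. At $(t, x, x_{(1)}, \dots, x_{(k+1)}) \in J_{k+1}$ the convective derivative $d_t$ of a function φ of $J_k$ is

$$d_t\phi = \bigl(\partial_t + x^m_{(1)}\,\partial_{x^m} + x^m_{(2)}\,\partial_{x^m_{(1)}} + \dots + x^m_{(k+1)}\,\partial_{x^m_{(k)}}\bigr)\phi , \tag{4.15}$$

which implies (4.14) by the chain rule,¹

$$\bigl((d_t\phi) \circ \hat f\bigr)(t) = \Bigl(\partial_t + \frac{df^m}{dt}\frac{\partial}{\partial x^m} + \frac{d^2 f^m}{dt^2}\frac{\partial}{\partial x^m_{(1)}} + \dots + \frac{d^{k+1} f^m}{dt^{k+1}}\frac{\partial}{\partial x^m_{(k)}}\Bigr)\phi\,\Big|_{(t,\, f(t),\, \frac{df}{dt}(t),\, \dots,\, \frac{d^k f}{dt^k}(t))} = \frac{d}{dt}\,\phi\Bigl(t, f(t), \frac{df}{dt}(t), \dots, \frac{d^k f}{dt^k}(t)\Bigr) . \tag{4.16}$$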

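Consider a curve in the space of functions, i.e. a map $\lambda \mapsto f_\lambda$ of a parameter λ which ranges in some interval, which for λ = 0 passes through $f = f_0$. Jet functions φ become functions of the parameter λ when they are evaluated on this curve, i.e. when they are composed with the lift $\hat f_\lambda$. According to the chain rule and because partial derivatives can be reordered, jet functions change on the curve by

$$\frac{\partial}{\partial\lambda}\bigl(\phi \circ \hat f_\lambda\bigr)(t) = \frac{\partial}{\partial\lambda}\,\phi\Bigl(t, f(t,\lambda), \frac{\partial f}{\partial t}(t,\lambda), \dots, \frac{\partial^k f}{\partial t^k}(t,\lambda)\Bigr) = \Bigl[\frac{\partial f^m}{\partial\lambda}\,\frac{\partial\phi}{\partial x^m} + \Bigl(\frac{\partial}{\partial t}\frac{\partial f^m}{\partial\lambda}\Bigr)\frac{\partial\phi}{\partial v^m} + \dots + \Bigl(\frac{\partial^k}{\partial t^k}\frac{\partial f^m}{\partial\lambda}\Bigr)\frac{\partial\phi}{\partial x^m_{(k)}}\Bigr]_{\hat f_\lambda(t)} . \tag{4.17}$$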

We call $\delta f = \partial f_\lambda/\partial\lambda\,|_{\lambda=0}$ the change of f. If through each f of a considered set of functions there passes a curve $f_\lambda$ such that for each f the change depends at each time t only on t, f(t) and a finite number of derivatives of f at t, then we call the change local (not to be confused with gauged) and δf is a jet function δx(t, x, v, …) evaluated on the lift $\hat f$,

$$\delta x \circ \hat f = \frac{\partial}{\partial\lambda}\Big|_{\lambda=0} f_\lambda . \tag{4.18}$$

To a local change δx there corresponds the convective change

$$\delta = \delta x^m\,\frac{\partial}{\partial x^m} + (d_t\,\delta x^m)\,\frac{\partial}{\partial v^m} + \dots + \bigl((d_t)^k\,\delta x^m\bigr)\,\frac{\partial}{\partial x^m_{(k)}} + \dots \tag{4.19}$$

which maps jet functions φ to their change or variation δφ. Composed with $\hat f = \hat f_0$ it gives the change of the composed function $\phi \circ \hat f_\lambda$ at λ = 0,

$$(\delta\phi) \circ \hat f = \frac{\partial}{\partial\lambda}\Big|_{\lambda=0}\bigl(\phi \circ \hat f_\lambda\bigr) . \tag{4.20}$$

The convective change δ commutes with the convective time derivative dt and is completely determined by δ x.
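The defining property (4.14) of the convective derivative can be checked on a concrete jet function. The sketch below assumes one degree of freedom, a hypothetical test function φ(t, x, v) = t x + v² of J₁ and the curve f(t) = sin t; partial derivatives are approximated by central differences, so the agreement holds up to small numerical errors.

```python
import math

def convective_derivative(phi, t, x, v, a, h=1e-5):
    """d_t phi at the point (t, x, v, a) of J_2 for a function phi of J_1, see (4.15)."""
    d_phi_dt = (phi(t + h, x, v) - phi(t - h, x, v)) / (2 * h)
    d_phi_dx = (phi(t, x + h, v) - phi(t, x - h, v)) / (2 * h)
    d_phi_dv = (phi(t, x, v + h) - phi(t, x, v - h)) / (2 * h)
    return d_phi_dt + v * d_phi_dx + a * d_phi_dv

def phi(t, x, v):
    # hypothetical jet function, chosen only for illustration
    return t * x + v * v

def lift(t):
    # lift of order 2 of the curve f(t) = sin(t): (t, f, f', f'')
    return (t, math.sin(t), math.cos(t), -math.sin(t))

def phi_on_curve(t):
    t_, x, v, _ = lift(t)
    return phi(t_, x, v)

t0 = 0.7
h = 1e-5
lhs = convective_derivative(phi, *lift(t0))                    # (d_t phi) composed with the lift
rhs = (phi_on_curve(t0 + h) - phi_on_curve(t0 - h)) / (2 * h)  # d/dt of the composed function
print(lhs, rhs)
```

Both numbers approximate sin t + t cos t − sin 2t at t = 0.7, the derivative of t sin t + cos² t.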

1 We use Einstein’s summation convention: the index pair m designates the sum over the range of its possible values.


4.4 Local Functionals and Euler Derivative

Functionals are maps of functions into the real numbers. For example, the proper time τ[f] (4.6) shown by an ideal clock in an event B is a functional of the parameterized curve $f : t \mapsto (f^1(t), f^2(t), f^3(t))$ which the clock traverses in the course of the parameter time until it reaches B. A local functional S employs its Lagrangian $\mathcal{L}$, a smooth real function of some domain $D \subset J_k$ of a jet space (we restrict our considerations mainly to k = 1 because otherwise the energy cannot be bounded from below),

$$\mathcal{L} : D \subset J_1 \to \mathbb{R} , \quad (t, x, v) \mapsto \mathcal{L}(t, x, v) , \tag{4.21}$$

to map curves f (with $\hat f$ taking values in D) to real numbers S[f] via

$$S[f] = \int dt\,\bigl(\mathcal{L} \circ \hat f\bigr)(t) = \int dt\,\mathcal{L}\Bigl(t, f(t), \frac{df}{dt}(t)\Bigr) . \tag{4.22}$$

The functional S is called local because it is an integral over the range of parameters t where the integrand at parameter time t depends only on t, the value of f at this time and the value of a finite number of its derivatives, but not on the value of f at some other parameter time t′ nor on a Taylor series of f which in $J_\infty$ represents analytic f at neighbouring times t′ ≠ t. For example, the proper time τ (4.6) is a local functional with Lagrangian $\mathcal{L}(t, x, v) = \sqrt{1-v^2}$. The ideal clock is insensitive to acceleration in the sense that its Lagrangian is a function of the domain |v| < 1 of the jet space $J_1$ and does not depend on second or higher derivatives $x_{(k)}$ with k > 1.

To investigate a functional S in the neighbourhood of some f, we consider it on a curve through f, i.e. on a one parameter family of curves $f_\lambda$ with $f_0 = f$. We denote by $\delta f = \partial f_\lambda/\partial\lambda\,|_{\lambda=0}$ the tangent of the family at f. The functional S is differentiable at f if for all smooth δf which vanish on the boundary the derivative $dS[f_\lambda]/d\lambda\,|_{\lambda=0} =: \delta S[f, \delta f]$ exists and is of the form

$$\delta S[f, \delta f] = \int dt\,\delta f^m(t)\,\frac{\delta S}{\delta f^m(t)} , \tag{4.23}$$

where $\frac{\delta S}{\delta f}$, the functional derivative of S at f, is continuous.

The functional derivative $\frac{\delta S}{\delta f}$ is unique: if for continuous h and g and all smooth δf

$$\int dt\,\delta f^m(t)\,h_m(t) = \int dt\,\delta f^m(t)\,g_m(t) \tag{4.24}$$

then h = g. Otherwise, if the difference $h_1(t) - g_1(t)$ were positive at some time t it would be positive in a whole neighbourhood of this time. Then one could choose a $\delta f^1$ which is positive within the neighbourhood and vanishes outside and $\delta f^m = 0$ for m ≠ 1. Then $\int dt\,\delta f^m(t)\,(h_m(t) - g_m(t))$ would be positive and contradict (4.24).

To determine the functional derivative of a local functional (4.22), we differentiate $S[f_\lambda]$ with respect to λ at λ = 0,

$$\frac{d}{d\lambda} S[f_\lambda]\Big|_{\lambda=0} = \int dt\,\frac{\partial}{\partial\lambda}\Big|_{\lambda=0}\bigl(\mathcal{L} \circ \hat f_\lambda\bigr)(t) \overset{(4.20)}{=} \int dt\,\bigl((\delta\mathcal{L}) \circ \hat f\bigr)(t) , \tag{4.25}$$

where the change of the Lagrangian, assuming it is a function of $J_1$, is

$$\delta\mathcal{L} = \delta f^m\,\frac{\partial\mathcal{L}}{\partial x^m} + (d_t\,\delta f^m)\,\frac{\partial\mathcal{L}}{\partial v^m} . \tag{4.26}$$

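In the last term we shift the derivative away from δf at the cost of a complete time derivative

$$(d_t\,\delta f^m)\,\frac{\partial\mathcal{L}}{\partial v^m} = -\delta f^m\,d_t\frac{\partial\mathcal{L}}{\partial v^m} + d_t\Bigl(\delta f^m\,\frac{\partial\mathcal{L}}{\partial v^m}\Bigr) \tag{4.27}$$

and obtain

$$\delta\mathcal{L} = \delta f^m\Bigl(\frac{\partial\mathcal{L}}{\partial x^m} - d_t\frac{\partial\mathcal{L}}{\partial v^m}\Bigr) + d_t\Bigl(\delta f^m\,\frac{\partial\mathcal{L}}{\partial v^m}\Bigr) . \tag{4.28}$$

The integral over the complete time derivative contributes only boundary terms to the change of the action,

$$\int_{\underline{t}}^{\overline{t}} dt\,\Bigl(d_t\Bigl(\delta f^m\,\frac{\partial\mathcal{L}}{\partial v^m}\Bigr)\Bigr) \circ \hat f \overset{(4.14)}{=} \int_{\underline{t}}^{\overline{t}} dt\,\frac{d}{dt}\Bigl(\delta f^m\,\frac{\partial\mathcal{L}}{\partial v^m} \circ \hat f\Bigr) = \delta f^m\,\frac{\partial\mathcal{L}}{\partial v^m} \circ \hat f\,\Big|_{\underline{t}}^{\overline{t}} , \tag{4.29}$$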

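and vanishes if all functions $f_\lambda$ have the same initial and final values as f, since then the variation δf vanishes at the boundary. The jet function

$$\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} := \frac{\partial\mathcal{L}}{\partial x^m} - d_t\frac{\partial\mathcal{L}}{\partial v^m} \tag{4.30}$$

is the Euler derivative of $\mathcal{L}$ in the case that $\mathcal{L}$ is a function of $D \subset J_1$. The Euler derivative of a Lagrangian which depends on higher derivatives $x_{(r)}$ is derived analogously from $\delta\mathcal{L}$ by shifting all derivatives away from $\delta x_{(r)} = (d_t)^r\,\delta x$,

$$\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} := \frac{\partial\mathcal{L}}{\partial x^m} + \sum_{r \ge 1} (-1)^r (d_t)^r\,\frac{\partial\mathcal{L}}{\partial x^m_{(r)}} . \tag{4.31}$$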

So the functional derivative at f of a local action (4.22) with Lagrangian $\mathcal{L}$ with respect to variations which vanish on the boundary is the Euler derivative of the Lagrangian, composed with the lift of f,

$$\frac{\delta S}{\delta f^m} = \frac{\hat\partial\mathcal{L}}{\hat\partial x^m} \circ \hat f . \tag{4.32}$$


4.5 Principle of Stationary Action

The equations which determine the time evolution of elementary physical systems² state that a corresponding local functional, the action of the system, is stationary at the physical paths. From its initial position the system traverses in the course of time that curve to the final position which makes the action stationary in comparison to all other conceivable curves from the same initial position to the same final position. For example, a free particle with mass m traverses straight worldlines and therefore worldlines which extremize the action (4.22) with Lagrangian³

$$\mathcal{L}_{\text{free}}(t, x, v) = -m\sqrt{1-v^2} . \tag{4.33}$$

The action S is stationary at f if its variational derivative vanishes at f. So the Euler derivative (4.30) of the Lagrangian vanishes on the physical path $f_{\text{physical}}$,

$$\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} \circ \hat f_{\text{physical}} = 0 . \tag{4.34}$$

These are the Euler-Lagrange equations which single out the physical paths $f_{\text{physical}}$ from all other conceivable paths from the same initial to the same final position. The Euler-Lagrange equations of the free relativistic particle state that its momentum (3.43) is constant,

$$-\frac{d}{dt}\,\frac{m v}{\sqrt{1-v^2}} = 0 , \quad v = \frac{df_{\text{physical}}}{dt} . \tag{4.35}$$

Consequently its worldline is straight.

In Newtonian mechanics without magnetic field, the Lagrangian is the difference

$$\mathcal{L}(t, x, v) = E_{\text{kin}} - E_{\text{pot}} \tag{4.36}$$

of kinetic and potential energy. In particular, the Lagrangian of the harmonic oscillator, a particle with mass m on a spring with constant κ, is

$$\mathcal{L}_{\text{oscillator}}(t, x, v) = \frac12\,m v^2 - \frac12\,\kappa x^2 . \tag{4.37}$$

Its Euler derivative is $-m\,x_{(2)} - \kappa\,x$. The Euler-Lagrange equation $-m\ddot f - \kappa f = 0$ has the solution $f : t \mapsto a\cos(\omega t + \varphi)$, where $\omega = \sqrt{\kappa/m}$ is 2π times the frequency. The amplitude a and the phase φ are chosen by the initial conditions.

2 These are systems with no friction and with only such constraints on the jet variables $x_{(r)}$ which follow as convective time derivatives of constraints on the positions.

3 We choose the sign and the normalization of the Lagrangian of the free relativistic particle such that to first order in $v^2$ it agrees with the Newtonian Lagrangian $\mathcal{L}_{\text{Newton}} = E_{\text{kin}} = \frac12 m v^2$ up to an irrelevant constant. Then the conserved quantities of the relativistic and the nonrelativistic particles coincide in leading order and in denomination.
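For the harmonic oscillator (4.37) the stationarity of the action on the solution a cos(ωt + φ) can be probed numerically. The sketch below assumes illustrative values for m, κ, a and φ, the time interval [0, 1] and a hypothetical variation η(t) = sin(πt) which vanishes at both end points. Since the action is quadratic in f, the shift S[f + εη] − S[f] contains no term linear in ε on a solution and therefore scales with ε².

```python
import math

M, KAPPA = 1.0, 4.0                  # illustrative mass and spring constant
OMEGA = math.sqrt(KAPPA / M)
A, PHASE, T_END = 0.5, 0.3, 1.0      # illustrative amplitude, phase and final time

def action(f, df, steps=20000):
    """Midpoint-rule approximation of S[f] = int_0^T dt (m/2 f'^2 - kappa/2 f^2)."""
    dt = T_END / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += (0.5 * M * df(t) ** 2 - 0.5 * KAPPA * f(t) ** 2) * dt
    return total

def f(t):
    return A * math.cos(OMEGA * t + PHASE)

def df(t):
    return -A * OMEGA * math.sin(OMEGA * t + PHASE)

def eta(t):
    # variation which vanishes at t = 0 and t = T_END
    return math.sin(math.pi * t / T_END)

def deta(t):
    return math.pi / T_END * math.cos(math.pi * t / T_END)

def action_shift(eps):
    """S[f + eps * eta] - S[f] for fixed end points."""
    return action(lambda t: f(t) + eps * eta(t),
                  lambda t: df(t) + eps * deta(t)) - action(f, df)

for eps in (1e-1, 1e-2):
    print(eps, action_shift(eps))
```

Reducing ε by a factor of 10 reduces the shift by a factor of about 100, as expected for a vanishing first variation.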


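The Euler-Lagrange equations (4.34) hold in all coordinate systems, for the principle of stationary action does not make use of the choice of coordinates which describe the path. If we view the coordinates x as functions x(t, y) of other coordinates y and if we convert the Lagrangian

$$\tilde{\mathcal{L}}(t, y, w) = \mathcal{L}\Bigl(t, x(t, y), \frac{\partial x}{\partial t} + \frac{\partial x}{\partial y}\,w\Bigr) , \tag{4.38}$$

then the Euler-Lagrange equations hold in y-coordinates if and only if they are satisfied in x-coordinates,

$$\frac{\hat\partial\tilde{\mathcal{L}}}{\hat\partial y^m} = \frac{\partial x^n}{\partial y^m}\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^n} . \tag{4.39}$$

This follows from (4.23), $\delta x = \frac{\partial x}{\partial y}\,\delta y$ and (4.32).

A jet function $\mathcal{L}$ is a convective time derivative if and only if its Euler derivative vanishes. It is simple to verify that the Euler derivative of $d_t K = \partial_t K + v^m\,\partial_{x^m} K$ vanishes. To show the converse, we write the Lagrangian as an integral over its derivative,

$$\mathcal{L}(t, x, v) = \mathcal{L}(t, 0, 0) + \int_0^1 d\lambda\,\frac{\partial}{\partial\lambda}\,\mathcal{L}(t, \lambda x, \lambda v) . \tag{4.40}$$

By (4.28), the derivative of the Lagrangian with respect to λ is

$$\frac{\partial}{\partial\lambda}\,\mathcal{L}(t, \lambda x, \lambda v) = x^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m}\Big|_{(t, \lambda x, \lambda v)} + d_t\Bigl(x^m\,\frac{\partial\mathcal{L}}{\partial v^m}\Big|_{(t, \lambda x, \lambda v)}\Bigr) . \tag{4.41}$$

Therefore the Lagrangian can be written as

$$\mathcal{L}(t, x, v) = x^m \int_0^1 d\lambda\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m}\Big|_{(t, \lambda x, \lambda v)} + d_t\Bigl(x^m \int_0^1 d\lambda\,\frac{\partial\mathcal{L}}{\partial v^m}\Big|_{(t, \lambda x, \lambda v)} + \int_0^t dt'\,\mathcal{L}(t', 0, 0)\Bigr) . \tag{4.42}$$

This is a convective time derivative if the Euler derivative of the Lagrangian vanishes identically in the jet variables for all (t, λx, λv).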

4.6 Symmetries and Conserved Quantities

Examples of continuous transformations $T_\alpha$ of curves $f : t \mapsto f(t)$ are translations by a multiple of a vector c

$$T_\alpha f = f + \alpha\,c \tag{4.43}$$

or a temporal translation by minus some time α

$$(T_\alpha f)(t) = f(t + \alpha) \tag{4.44}$$

or a rotation by an angle α

$$T_\alpha \begin{pmatrix} f^1 \\ f^2 \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} f^1 \\ f^2 \end{pmatrix} . \tag{4.45}$$

In these examples the real transformation parameter α varies continuously and parameterizes a curve in a group. It is chosen such that for α = 0 the curve passes the unit element, $T_0 = \mathrm{id}$. The tangent vector to the curve $T_\alpha$ at the unit element is called the infinitesimal transformation δ,

$$\delta f = \partial_\alpha\big|_{\alpha=0}\,T_\alpha f . \tag{4.46}$$

We have δf = c for spatial translations, $\delta f = \frac{df}{dt}$ for temporal translations and $\delta(f^1, f^2) = (-f^2, f^1)$ for rotations. In these cases the change of all curves f under the infinitesimal transformation is local (compare section 4.3), i.e. a jet function δx evaluated on the lift of the curve, $\delta f = \delta x \circ \hat f$. Correspondingly the Lagrangian of a local action changes by (assuming it is a function of $J_1$)

$$\delta\mathcal{L} = \delta x^m\,\frac{\partial\mathcal{L}}{\partial x^m} + (d_t\,\delta x^m)\,\frac{\partial\mathcal{L}}{\partial v^m} \overset{(4.28)}{=} \delta x^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} + d_t\Bigl(\delta x^m\,\frac{\partial\mathcal{L}}{\partial v^m}\Bigr) . \tag{4.47}$$

If the action (4.22) changes by boundary terms only, i.e. if $\delta\mathcal{L}$ is the convective derivative of some jet function K (which is determined only up to a constant),

$$\delta\mathcal{L} + d_t K = 0 , \tag{4.48}$$

then we call δ an infinitesimal symmetry of the action. From the definition of an infinitesimal symmetry of the action (4.48) and from (4.47) one concludes

$$\delta x^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} + d_t Q = 0 , \tag{4.49}$$

$$Q = K + \delta x^m\,\frac{\partial\mathcal{L}}{\partial v^m} . \tag{4.50}$$

Equation (4.49) subsumes the Noether theorem [43]: To each infinitesimal symmetry $\delta x^m$ of the action there corresponds a conserved quantity Q. Vice versa, to each conserved quantity there corresponds an infinitesimal symmetry of the action. The conserved quantity is unique up to a constant, the infinitesimal symmetry is unique up to contributions $c^{mn}\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^n}$ with $c^{mn} = -c^{nm}$.

The theorem holds because the physical paths satisfy the equations of motion $\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} \circ \hat f_{\text{physical}} = 0$ (4.34). Consequently the Noether charge Q is a conserved quantity, $d/dt\,(Q \circ \hat f_{\text{physical}}) = 0$.


Conversely, to each conserved quantity there corresponds an infinitesimal symmetry of the action. This is true because by definition a jet function Q is a conserved quantity if the time derivative of $Q \circ \hat f_{\text{physical}}$ vanishes due to the equations of motion, i.e. if $d_t Q$ can be written as a multiple of the Euler derivative of the Lagrangian and of the derivatives of the Euler derivative with some jet functions $r_0$ and $r_1$,

$$d_t Q + r_0^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} + r_1^m\,d_t\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} = 0 . \tag{4.51}$$

If we combine the terms with the product rule and redefine the conserved quantity by terms $r_1^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m}$ which vanish on physical paths, then it is related to an infinitesimal symmetry $\delta x = r_0 - d_t r_1$ in standard form (4.49),

$$d_t\Bigl(Q + r_1^m\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m}\Bigr) + \bigl(r_0^m - d_t r_1^m\bigr)\,\frac{\hat\partial\mathcal{L}}{\hat\partial x^m} = 0 . \tag{4.52}$$

One cannot require infinitesimal symmetries δ x to vanish on the boundary. This restriction would exclude translations and rotations. We also do not require that infinitesimal symmetries can be integrated to finite transformations. Such requirements would spoil the correspondence to conserved quantities (5.167). The Noether theorem is important because often symmetries of the action are evident and can be seen as a geometric property of the Lagrangian, e.g. that it is invariant under translations or rotations or independent of the parameter time t. Conserved quantities are crucial for the integrability of the equations of motion, i.e. whether the solutions can be obtained by integration of given functions or by solving for implicitly given functions. If the equations of motion apply to d degrees of freedom, the equations are integrable if and only if there are d independent conserved quantities Q1 . . . Qd , whose corresponding infinitesimal transformations δ1 , . . . δd , when applied successively, can be interchanged just like translations δi δ j = δ j δi . If one modifies integrable equations of motion by additional terms, such perturbations, even if they are small, lead to chaotic paths which, though they have small measure, are dense in the space of all paths. These results are derived and discussed in e.g. [1, 6, 16, 39], to which we refer for a thorough presentation. Symmetries of the action are not only important because they are related to conserved quantities but also because the transformed solutions to the equations of motion are solutions which in some cases may turn out to be unknown previously. In general relativity one obtains the gravitational field of a uniformly moving mass from the gravitational field of a mass at rest by a Lorentz transformation. Thereby one proves that rapid motion does not turn a mass into a black hole. 
Transformations which map solutions of the equations of motion to solutions are not necessarily symmetries of the action, e.g. the solutions $f : t \mapsto -\frac12 g t^2 + v_0 t + x_0$ of the equation $\ddot f = -g$ of a vertically falling particle are mapped to each other by $T_\alpha : (T_\alpha f)(t) = e^{2\alpha} f(e^{-\alpha} t)$. But the infinitesimal transformation $\delta x = 2x - t\,v$ does not leave the Lagrangian $\mathcal{L} = \frac12 m v^2 - m g x$ invariant up to a derivative, as the Euler derivative of $\delta\mathcal{L} = d_t(-t\,\mathcal{L}) + 3\,\mathcal{L}$ does not vanish.
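Both claims about the falling particle can be checked with a few lines. The sketch below assumes illustrative values for g, m, the initial data and the sample jet point; the helper names are hypothetical. It verifies that the scaled curve still satisfies f'' = −g and that δL − d_t(−tL) equals 3L at an arbitrary point (t, x, v, a) of J₂.

```python
import math

G, M = 9.81, 1.0   # illustrative values for g and m

def fall(t, v0=2.0, x0=1.0):
    return -0.5 * G * t * t + v0 * t + x0

def scaled(f, alpha):
    """(T_alpha f)(t) = exp(2 alpha) f(exp(-alpha) t)."""
    return lambda t: math.exp(2 * alpha) * f(math.exp(-alpha) * t)

def second_derivative(f, t, h=1e-3):
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def lagrangian(x, v):
    return 0.5 * M * v * v - M * G * x

def delta_L(t, x, v, a):
    dx = 2 * x - t * v          # delta x
    dv = v - t * a              # d_t(delta x) = 2 v - v - t a
    return M * v * dv - M * G * dx

def dt_minus_t_L(t, x, v, a):
    # d_t(-t L) = -L - t d_t L with d_t L = m v a - m g v
    return -lagrangian(x, v) - t * (M * v * a - M * G * v)

acceleration = second_derivative(scaled(fall, 0.4), 0.7)
t, x, v, a = 0.3, 1.2, -0.8, 2.5      # arbitrary point of J_2
residual = delta_L(t, x, v, a) - dt_minus_t_L(t, x, v, a) - 3 * lagrangian(x, v)
print(acceleration, residual)
```

The remaining term 3L has the Euler derivative 3(−m g − m x₍₂₎), which does not vanish identically, so the scaling is not an infinitesimal symmetry of the action.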


Energy and Momentum

The Lagrangian $\mathcal{L}(t, \mathbf{x}, \mathbf{v}) = -m\sqrt{1-\mathbf{v}^2}$ (4.33) of a free relativistic particle is invariant under Poincaré transformations, i.e. under spatial and temporal translations, rotations and boosts. The conserved quantity (4.50) which corresponds to the infinitesimal translation $\delta\mathbf{x} = \mathbf{c}$ by a constant vector $\mathbf{c}$,

$$c^i\,\frac{\partial\mathcal{L}}{\partial v^i} = \mathbf{c}\cdot\mathbf{p} , \quad \mathbf{p} = \frac{m\,\mathbf{v}}{\sqrt{1-\mathbf{v}^2}} , \tag{4.53}$$

is by definition and in agreement with (3.43) the momentum $\mathbf{p}$ in direction of the vector $\mathbf{c}$. Translation invariance of the action corresponds to the conservation of momentum.

If the Lagrangian depends only on the velocity, not on the coordinate of a degree of freedom, then the variable is called cyclic. Then the Lagrangian is invariant under translation in this direction, e.g. under $(x^1, x^2, \dots, x^d) \mapsto (x^1 + \alpha, x^2, \dots, x^d)$ and under the infinitesimal transformation δx = (1, 0, …, 0). The Noether charge (4.50) is the canonically conjugate momentum $\frac{\partial\mathcal{L}}{\partial v^1}$ of the cyclic variable. The Euler-Lagrange equations just state that it is conserved,

$$\frac{\partial\mathcal{L}}{\partial x^1} = 0 \;\wedge\; \Bigl(\frac{\partial\mathcal{L}}{\partial x^1} - d_t\frac{\partial\mathcal{L}}{\partial v^1}\Bigr) \circ \hat f_{\text{physical}} = 0 \;\Rightarrow\; \frac{d}{dt}\Bigl(\frac{\partial\mathcal{L}}{\partial v^1} \circ \hat f_{\text{physical}}\Bigr) = 0 . \tag{4.54}$$

Temporal translation (4.44) yields the infinitesimal transformation

$$\delta x = v . \tag{4.55}$$

This is an infinitesimal symmetry of the action whenever the Lagrangian $\mathcal{L}(t, x, v)$ does not depend on t, since $\partial_t\mathcal{L} = 0$ implies

$$\delta\mathcal{L} = v^m\,\partial_{x^m}\mathcal{L} + (d_t v^m)\,\partial_{v^m}\mathcal{L} \overset{(4.15)}{=} d_t\mathcal{L} - \partial_t\mathcal{L} = d_t\mathcal{L} , \tag{4.56}$$

i.e. (4.48) with $K = -\mathcal{L}$. The Noether charge (4.50) corresponding to the invariance under time translation is by definition the energy E,

$$E = v^m\,\frac{\partial\mathcal{L}}{\partial v^m} - \mathcal{L} . \tag{4.57}$$

The energy is conserved if the Lagrangian does not depend explicitly on the time. In agreement with (3.45) the energy of a free particle with Lagrangian (4.33) is

$$E(t, x, v) = \frac{m}{\sqrt{1-v^2}} . \tag{4.58}$$
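As a quick plausibility check, the definition (4.57) can be evaluated numerically for the free Lagrangian (4.33) and compared with (4.58). This is a minimal sketch for one degree of freedom with m = 1 and c = 1; the derivative with respect to the velocity is approximated by a central difference and the function names are illustrative.

```python
import math

def lagrangian_free(v, m=1.0):
    """Free relativistic Lagrangian (4.33) for one degree of freedom, c = 1."""
    return -m * math.sqrt(1.0 - v * v)

def energy(lagrangian, v, h=1e-6):
    """E = v dL/dv - L (4.57), with dL/dv from a central difference."""
    dL_dv = (lagrangian(v + h) - lagrangian(v - h)) / (2 * h)
    return v * dL_dv - lagrangian(v)

for v in (0.0, 0.3, 0.6, 0.9):
    print(v, energy(lagrangian_free, v), 1.0 / math.sqrt(1.0 - v * v))
```

At v = 0.6, for instance, the momentum is 0.75 and the Lagrangian is −0.8, so E = 0.45 + 0.8 = 1.25 = 1/0.8.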

The definition (4.57) of the energy contains the recipe (4.36) for the Newtonian Lagrangian. The operator $v\,\partial_v$ counts the degree of homogeneity in velocities, $v\,\partial_v\,(v)^n = n\,(v)^n$, where the superscript denotes an exponent for once. If one decomposes the Lagrangian and the energy into pieces $\mathcal{L}_n$ and $E_n$ which are homogeneous of degree n in the velocities, then by (4.57) $E = \sum_n E_n = \sum_n (n-1)\,\mathcal{L}_n$, so

$$\mathcal{L} = \sum_{n \ne 1} \frac{E_n}{n-1} + \mathcal{L}_1 . \tag{4.59}$$

A part of the Lagrangian which is linear in the velocities, $\mathcal{L}_1(x, v) = q\,v^i A_i(x)$, does not contribute to the energy but adds a magnetic force $q\,v^j(\partial_{x^i} A_j - \partial_{x^j} A_i)$ to the Euler derivative and to the equations of motion. In Newtonian mechanics the energy is the sum of the kinetic energy $E_{\text{kin}}$, which is quadratic in the velocity, n = 2, and the potential energy $E_{\text{pot}}$, which is independent of the velocity, n = 0, so the Lagrangian is $\mathcal{L} = E_{\text{kin}} - E_{\text{pot}} + \mathcal{L}_1$.

In Hamiltonian mechanics the energy (4.57) defines the Hamiltonian

$$H(x, p) = v^m p_m - \mathcal{L} , \quad p_m = \partial_{v^m}\mathcal{L} , \tag{4.60}$$

as a function of the phase space variables, the position x and the canonically conjugate momentum $p = \partial_v\mathcal{L}$, rather than the jet variables x and v. The motion of particles is an area preserving map with respect to the measure $dp_m\,dx^m - dH\,dt$ [1]. This important geometric property is basic for the Kolmogorov-Arnold-Moser theorem [39] and the conclusion that even small perturbations of integrable motions lead to chaotic curves [6] which are dense in the space of all curves though they can have small measure.

If in a one-dimensional motion the energy, a function of $J_1$, is conserved, i.e. if $\frac{d}{dt}\,(E \circ \hat f_{\text{physical}}) = 0$, then the time t(x) at which the particle passes the point x can be calculated as an integral and the path x = f(t) can be obtained as the inverse function of t(x). For example, if the energy is of the form

$$E(x, v) = \frac12\,m v^2 + V(x) \tag{4.61}$$

with some potential V, then we solve⁴ for the velocity v and obtain

$$v(x, E) = \pm\sqrt{\frac{2}{m}\,(E - V(x))} . \tag{4.62}$$

On the physical path $v \circ \hat f = df/dt$ and $E \circ \hat f$ is a constant number. So f(t) satisfies the first order differential equation

$$\frac{df}{dt}(t) = \pm\sqrt{\frac{2}{m}\,(E - V(f(t)))} . \tag{4.63}$$

4 By the implicit function theorem, an analogous solution v(x, E) exists for general E(x, v) in the neighbourhood of each possible value of the energy where the energy varies with the velocity.


This cannot be easily integrated as the right side contains the unknown function f. But the derivative of the inverse function t(x), f(t(x′)) = x′, is the reciprocal

$$\frac{dt}{dx}\Big|_{x'} = \Bigl(\frac{df}{dt}\Big|_{t(x')}\Bigr)^{-1} , \tag{4.64}$$

so

$$\frac{dt}{dx}(x') = \pm\Bigl(\sqrt{\frac{2}{m}\,(E - V(x'))}\,\Bigr)^{-1} . \tag{4.65}$$

Applying $\int_{\underline{x}}^{x} dx'$ to both sides now yields t(x) as an integral over a known integrand,

$$t(x) - t(\underline{x}) = \pm\int_{\underline{x}}^{x} dx'\,\frac{1}{\sqrt{\frac{2}{m}\,(E - V(x'))}} . \tag{4.66}$$
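The integral (4.66) is easy to evaluate numerically. As a sketch under stated assumptions, take the spring potential V(x) = κx²/2 with illustrative values κ = 4, m = 1, E = 2 and positive velocity. The path is then f(t) = a sin(ωt) with a = √(2E/κ) and ω = √(κ/m), so the time to travel from 0 to x should be arcsin(x/a)/ω. The function names are hypothetical.

```python
import math

def travel_time(potential, energy, mass, x_start, x_end, steps=10000):
    """Midpoint-rule value of the integral (4.66) for motion with positive velocity."""
    dx = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        x = x_start + (i + 0.5) * dx
        total += dx / math.sqrt(2.0 / mass * (energy - potential(x)))
    return total

KAPPA, MASS, ENERGY = 4.0, 1.0, 2.0      # illustrative values

def spring(x):
    return 0.5 * KAPPA * x * x

omega = math.sqrt(KAPPA / MASS)
amplitude = math.sqrt(2.0 * ENERGY / KAPPA)

t_numeric = travel_time(spring, ENERGY, MASS, 0.0, 0.5)
t_exact = math.asin(0.5 / amplitude) / omega
print(t_numeric, t_exact)
```

The integration range is kept away from the turning point x = a, where the integrand has an integrable singularity that a plain midpoint rule resolves only slowly.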

All solvable equations of motion are solvable because they possess sufficiently many conserved quantities and because the conserved quantities determine the solutions as inverse functions of integrals over known functions just as in the example of one-dimensional motion.

Angular Momentum and Weighted Position

If the action is invariant under rotations around an axis (4.45), the corresponding Noether charge is by definition the angular momentum in the direction of this axis. A rotation D is a linear transformation $\mathbf{x} \mapsto D\mathbf{x}$, where D is orthogonal (6.6), $D^{\mathrm T} D = 1$. Differentiating this relation for a one parameter set of rotations $D_\lambda$ with $D_0 = 1$ at λ = 0 one concludes that each infinitesimal rotation $\Omega = \partial_\lambda D_\lambda|_{\lambda=0}$ is an antisymmetric matrix, $\Omega^{\mathrm T} = -\Omega$. In three dimensions the linear map Ω maps each vector $\mathbf{x}$ to the vector product with a vector $\boldsymbol\omega = (\omega^1, \omega^2, \omega^3)$,

$$\Omega = \begin{pmatrix} 0 & -\omega^3 & \omega^2 \\ \omega^3 & 0 & -\omega^1 \\ -\omega^2 & \omega^1 & 0 \end{pmatrix} , \quad \Omega\,\mathbf{x} = \boldsymbol\omega \times \mathbf{x} . \tag{4.67}$$

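The Lagrangian (4.33) is invariant under each infinitesimal rotation $\delta\mathbf{x} = \boldsymbol\omega \times \mathbf{x}$ and the corresponding change of the velocities $\delta\mathbf{v} = d_t\,\delta\mathbf{x} = \boldsymbol\omega \times \mathbf{v}$. So the corresponding Noether charge (4.50) is

$$\boldsymbol\omega \cdot \mathbf{L} = (\boldsymbol\omega \times \mathbf{x}) \cdot \frac{m\,\mathbf{v}}{\sqrt{1-\mathbf{v}^2}} = \boldsymbol\omega \cdot (\mathbf{x} \times \mathbf{p}) . \tag{4.68}$$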

Here we have used $(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$. The Noether charge defines the angular momentum $\mathbf{L}$ of a relativistic particle. As in Newtonian mechanics it is the vector product of the position $\mathbf{x}$ with the momentum $\mathbf{p}$,

$$\mathbf{L} = \mathbf{x} \times \mathbf{p} . \tag{4.69}$$

Each Lorentz transformation Λ of spacetime can be decomposed into a rotation and a symmetric matrix $L_P = (L_P)^{\mathrm T}$ (6.35), the Lorentz boost. The conserved quantities which are related to rotations have already been considered. It remains to investigate boosts and their Noether charges.

To lowest order in the transformation parameter v, the boost (3.8) in x-direction changes t by −vx and x by −vt and leaves y and z invariant. More generally a boost in an arbitrary direction changes t and $\mathbf{x}$ by an infinitesimal transformation $\delta_t = \mathbf{c} \cdot \mathbf{x}$ and $\delta_{\mathbf{x}} = \mathbf{c}\,t$. The corresponding change of curves $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$, which particles follow in the course of time, is more complicated than (A.35) because Lorentz boosts do not act on the product space $\mathbb{R} \times \mathbb{R}^3$ by a product of representations. They map the set of pairs $(t, \mathbf{x}) = (t, \mathbf{f}(t))$ (which define the map $\mathbf{f}$) to lowest order to the set $(t + \delta_t, \mathbf{x} + \delta_{\mathbf{x}}) = (t + \delta_t, \mathbf{f}(t) + \delta_{\mathbf{x}})$, which up to this order is the same as the set $(t, \mathbf{f}(t) - \frac{d\mathbf{f}}{dt}\,\delta_t + \delta_{\mathbf{x}})$, i.e. the function $\mathbf{f}$ changes under a boost by the jet function

$$\delta\mathbf{x}(t, \mathbf{x}, \mathbf{v}) = -\mathbf{v}\,\delta_t + \delta_{\mathbf{x}} = -\mathbf{v}\,(\mathbf{c} \cdot \mathbf{x}) + \mathbf{c}\,t \tag{4.70}$$

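evaluated on the lift, $\delta\mathbf{f} = \delta\mathbf{x} \circ \hat{\mathbf{f}}$. Correspondingly the velocities change by

$$\delta\mathbf{v}(t, \mathbf{x}, \mathbf{v}, \mathbf{x}_{(2)}) = d_t\,\delta\mathbf{x} = \mathbf{c} - \mathbf{x}_{(2)}\,(\mathbf{c} \cdot \mathbf{x}) - \mathbf{v}\,(\mathbf{c} \cdot \mathbf{v}) . \tag{4.71}$$

This is an infinitesimal symmetry of the Lagrangian $\mathcal{L} = -m\sqrt{1-\mathbf{v}^2}$ (4.33),

$$-\delta\sqrt{1-\mathbf{v}^2} = \frac{\mathbf{v} \cdot \delta\mathbf{v}}{\sqrt{1-\mathbf{v}^2}} = \mathbf{c} \cdot \mathbf{v}\,\sqrt{1-\mathbf{v}^2} - \mathbf{c} \cdot \mathbf{x}\,\frac{\mathbf{x}_{(2)} \cdot \mathbf{v}}{\sqrt{1-\mathbf{v}^2}} = d_t\bigl(\mathbf{c} \cdot \mathbf{x}\,\sqrt{1-\mathbf{v}^2}\bigr) . \tag{4.72}$$

The Noether charge (4.50) is the product of $\mathbf{c}$ with the energy weighted initial position ($\mathbf{K} \circ \hat f_{\text{free}} = E\,\mathbf{f}_{\text{free}}(0)$),

$$\mathbf{K} = m\,\frac{\mathbf{x} - \mathbf{v}\,t}{\sqrt{1-\mathbf{v}^2}} = \mathbf{x}\,E - t\,\mathbf{p} . \tag{4.73}$$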

Its components and the components of the angular momentum $\mathbf{L}$ are the antisymmetrized products of the components of the four-vectors $x = (t, \mathbf{x})$ and $p = (E, \mathbf{p})$,

$$M^{mn} = x^m p^n - x^n p^m , \quad M^{mn} = -M^{nm} , \quad m, n \in \{0, 1, 2, 3\} . \tag{4.74}$$

The four-momentum p is a jet function which depends only on the velocity $\mathbf{v}$. The angular momentum and the weighted position also depend on $x = (t, \mathbf{x})$. Under Poincaré transformations $x' = \Lambda x + a$ the momentum p transforms as a difference vector, $p' = \Lambda p$ (3.40). By construction M transforms together with p linearly and reducibly, but indecomposably, as

$$M'^{\,mn} = \Lambda^m{}_k\,\Lambda^n{}_l\,M^{kl} + \bigl(a^m \Lambda^n{}_l - a^n \Lambda^m{}_l\bigr)\,p^l , \quad p'^{\,k} = \Lambda^k{}_l\,p^l . \tag{4.75}$$


In particular, the transformation of the angular momentum under translations is the Huygens-Steiner theorem $\mathbf{L}' = \mathbf{L} + \mathbf{a} \times \mathbf{p}$.

The Lagrangian for N free particles is simply the sum $-\sum_i m_i \sqrt{1 - \mathbf{v}_{[i]}^2}$ of the Lagrangians (4.33) of the individual particles which traverse paths $\mathbf{f}_{[i]} : t \mapsto \mathbf{f}_{[i]}(t)$. Trivially, the total four-momentum (or momentum for short)

$$p = \sum_{i=1}^{N} p_{[i]} \tag{4.76}$$

remains conserved, as each individual momentum is a conserved quantity. However, by the Noether theorem the total momentum remains a conserved quantity of interacting particles if the interaction is invariant under translations of space and time. If one can neglect the contribution of the interaction to the momentum when the particles are far apart, then this conserved momentum is given by the sum (4.76) of the momenta of the individual, sufficiently distant particles. That the energy and spatial momentum of the real particles are conserved has been verified in all relevant experiments and observations. By the Noether theorem this means that the action is invariant under translations in spacetime. It required a “desperate hypothesis” by Pauli who introduced a neutrino as an additional bookkeeping entry to account for the energy-momentum balance in the beta decay of nuclei. Decades later the neutrinos, more precisely three kinds of neutrinos, were detected and their postulated properties were verified experimentally. Energy and momentum conservation in general relativity is a subtle problem. The pseudo tensor of Landau and Lifschitz [34, §101] describes the gravitational energy and momentum densities which are exchanged with the energy and momentum of matter while their sum is conserved. The components of the pseudo tensor are quadratic forms in the partial derivatives of the gravitational field and therefore cannot be the components of a tensor, as a change of the coordinates can make all these derivatives vanish at a chosen point. But there cannot be anything better than the pseudo tensor. If it existed and differed essentially, it would constrain the motion of the gravitational and matter fields in addition to the conservation laws from the pseudo tensor. For such additional restrictions, that not only the energy but also a better energy are conserved, there is no experimental or observational indication. 
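The invariance of the action under boosts implies that the sum of $\mathbf{K} = \mathbf{x}\,E - t\,\mathbf{p}$ over the individual particles is time-independent on the physical path, i.e. composed with $\hat f_{\text{physical}}$. Therefore the center of energy $\mathbf{X} = \sum \mathbf{x}\,E / \sum E$ moves linearly and uniformly with the velocity $\mathbf{V} = \sum \mathbf{p} / \sum E$,

$$\frac{\sum (\mathbf{x}\,E - t\,\mathbf{p})}{\sum E} = \frac{\sum \mathbf{x}(0)\,E}{\sum E} \;\Leftrightarrow\; \mathbf{X}(t) = \mathbf{V}\,t + \mathbf{X}(0) . \tag{4.77}$$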
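For free particles the uniform motion of the center of energy (4.77) can be verified directly. The sketch below assumes one spatial dimension, c = 1 and two hypothetical particles given as (mass, velocity, initial position); all names and numbers are illustrative.

```python
import math

# (mass, velocity, initial position) of two hypothetical free particles, c = 1
PARTICLES = [(1.0, 0.6, 0.0), (2.0, -0.2, 3.0)]

def energy(mass, velocity):
    return mass / math.sqrt(1.0 - velocity ** 2)

def momentum(mass, velocity):
    return mass * velocity / math.sqrt(1.0 - velocity ** 2)

TOTAL_E = sum(energy(m, v) for m, v, _ in PARTICLES)
TOTAL_P = sum(momentum(m, v) for m, v, _ in PARTICLES)
V = TOTAL_P / TOTAL_E                      # velocity of the center of energy

def center_of_energy(t):
    """X(t) = sum(x E) / sum(E) with x = x(0) + v t on the free paths."""
    return sum((x0 + v * t) * energy(m, v) for m, v, x0 in PARTICLES) / TOTAL_E

def boost_charge(t):
    """Sum of K = x E - t p over the particles, evaluated at time t."""
    return sum((x0 + v * t) * energy(m, v) - t * momentum(m, v)
               for m, v, x0 in PARTICLES)

for t in (0.0, 1.0, 5.0):
    print(t, center_of_energy(t), center_of_energy(0.0) + V * t, boost_charge(t))
```

The second and third columns agree and the last column does not change with t, which is the content of (4.77) for this simple case.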

Chapter 5

Electrodynamics

Abstract The electromagnetic fields and charge and current densities constitute a relativistic physical system. Changes in the charge distribution cause changes of the fields with the speed of light. Though the effects are retarded, they nevertheless approximately obey actio et reactio because the electric field of a charge in straight uniform motion points to its instantaneous, not its retarded position. Only acceleration leads to a loss of energy and momentum by radiation. The electrodynamic interactions are invariant under dilations, which is why they cannot explain the particular values of particle masses or the particular sizes of atoms.

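5.1 Covariant Maxwell Equations

Electric and magnetic fields $\mathbf{E}(x)$ and $\mathbf{B}(x)$ change by the Lorentz force

$$\mathbf{F}_{\text{Lorentz}}(x, \mathbf{v}) = q\,\bigl(\mathbf{E}(x) + \mathbf{v} \times \mathbf{B}(x)\bigr) \tag{5.1}$$

the momentum $\mathbf{p}_{\text{particle}} = m\mathbf{v}/\sqrt{1-\mathbf{v}^2}$ (3.43) and thereby the velocity $\mathbf{v}$ of a particle with mass m and charge q, which passes at time t the position $\mathbf{x} = (x^1, x^2, x^3)$,

$$\frac{d\mathbf{p}_{\text{particle}}}{dt} = \mathbf{F}_{\text{Lorentz}} . \tag{5.2}$$

In (5.1) $(t, x^1, x^2, x^3)$ are the cartesian coordinates of the point x in spacetime. The electromagnetic fields are related to the charge density ρ and the current density $\mathbf{j}$ by the Maxwell equations. To stress the essential, we discuss the equations in length units with c = 1 and charge units with $\varepsilon_0 = 1$ (appendix C),

$$\operatorname{div}\mathbf{B} = 0 , \quad \operatorname{rot}\mathbf{E} + \frac{\partial}{\partial t}\mathbf{B} = 0 , \tag{5.3}$$

$$\operatorname{div}\mathbf{E} = \rho , \quad \operatorname{rot}\mathbf{B} - \frac{\partial}{\partial t}\mathbf{E} = \mathbf{j} . \tag{5.4}$$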


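The divergence and the rotation or curl of a vector field $\mathbf{C} = (C^1, C^2, C^3)$ denote the following linear combinations of derivatives with respect to the cartesian coordinates
\[ \operatorname{div}\mathbf{C} = \partial_1 C^1 + \partial_2 C^2 + \partial_3 C^3 , \qquad \operatorname{rot}\mathbf{C} = \begin{pmatrix} \partial_2 C^3 - \partial_3 C^2 \\ \partial_3 C^1 - \partial_1 C^3 \\ \partial_1 C^2 - \partial_2 C^1 \end{pmatrix} , \tag{5.5} \]
where we employed the paper saving notation which we will use from now on,
\[ \partial_0 = \frac{\partial}{\partial t} , \quad \partial_1 = \frac{\partial}{\partial x^1} , \quad \partial_2 = \frac{\partial}{\partial x^2} , \quad \partial_3 = \frac{\partial}{\partial x^3} . \tag{5.6} \]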

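Explicitly the homogeneous Maxwell equations (5.3) read
\[ \begin{aligned} \partial_1 B^1 + \partial_2 B^2 + \partial_3 B^3 &= 0 , \\ \partial_2 E^3 - \partial_3 E^2 + \partial_0 B^1 &= 0 , \\ \partial_3 E^1 - \partial_1 E^3 + \partial_0 B^2 &= 0 , \\ \partial_1 E^2 - \partial_2 E^1 + \partial_0 B^3 &= 0 . \end{aligned} \tag{5.7} \]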

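We interpret the field strengths as the six components of an antisymmetric tensor, the field strength tensor $F$, which maps pairs of four-vectors $v$ and $w$ to the flux $F(v, w)$ of field strength, which flows through the parallelogram with edges $v$ and $w$,
\[ F_{mn} = -F_{nm} , \; m, n \in \{0, 1, 2, 3\} , \qquad F_{0i} = E^i , \quad F_{ij} = -\varepsilon_{ijk} B^k , \; i, j, k \in \{1, 2, 3\} , \tag{5.8} \]
\[ \begin{pmatrix} F_{00} & F_{01} & F_{02} & F_{03} \\ F_{10} & F_{11} & F_{12} & F_{13} \\ F_{20} & F_{21} & F_{22} & F_{23} \\ F_{30} & F_{31} & F_{32} & F_{33} \end{pmatrix} = \begin{pmatrix} 0 & E^1 & E^2 & E^3 \\ -E^1 & 0 & -B^3 & B^2 \\ -E^2 & B^3 & 0 & -B^1 \\ -E^3 & -B^2 & B^1 & 0 \end{pmatrix} . \tag{5.9} \]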

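Then the homogeneous Maxwell equations have the intriguing symmetric pattern
\[ \begin{aligned} \partial_1 F_{23} + \partial_2 F_{31} + \partial_3 F_{12} &= 0 , \\ \partial_2 F_{30} + \partial_3 F_{02} + \partial_0 F_{23} &= 0 , \\ \partial_3 F_{01} + \partial_0 F_{13} + \partial_1 F_{30} &= 0 , \\ \partial_0 F_{12} + \partial_1 F_{20} + \partial_2 F_{01} &= 0 . \end{aligned} \tag{5.10} \]
They relate the cyclic permutations of the derivatives of the components of $F$,
\[ \partial_l F_{mn} + \partial_m F_{nl} + \partial_n F_{lm} = 0 . \tag{5.11} \]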

If a quantity $X_{lmn} = -X_{lnm}$ is antisymmetric in the last index pair, such as $\partial_l F_{mn}$, then its cyclic sum $Z_{lmn} = X_{lmn} + X_{mnl} + X_{nlm}$ is antisymmetric under each permutation of two indices, e.g. $Z_{mln} = X_{mln} + X_{lnm} + X_{nml} = -X_{mnl} - X_{lmn} - X_{nlm} = -Z_{lmn}$. So (5.11) does not consist of $4 \cdot 4 \cdot 4$ independent equations, as one might deduce from three indices $l$, $m$ and $n$ which range over four values. Rather $l$, $m$ and $n$ must be pairwise different in a nontrivial equation and their permutation does not lead to a new equation. Thus, (5.11) are the $4 \cdot 3 \cdot 2 / 3! = 4$ independent equations (5.10).

The inhomogeneous Maxwell equations (5.4) read explicitly
\[ \begin{aligned} \partial_1 E^1 + \partial_2 E^2 + \partial_3 E^3 &= \rho , \\ -\partial_0 E^1 + \partial_2 B^3 - \partial_3 B^2 &= j^1 , \\ -\partial_0 E^2 + \partial_3 B^1 - \partial_1 B^3 &= j^2 , \\ -\partial_0 E^3 + \partial_1 B^2 - \partial_2 B^1 &= j^3 \end{aligned} \tag{5.12} \]

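or, in terms of the components of the field strength tensor if we insert vanishing terms like $\partial_0 F_{00}$ to emphasize the structure and denote $\rho$ by $j^0$,
\[ \begin{aligned} \partial_0 F_{00} - \partial_1 F_{10} - \partial_2 F_{20} - \partial_3 F_{30} &= j^0 , \\ -\partial_0 F_{01} + \partial_1 F_{11} + \partial_2 F_{21} + \partial_3 F_{31} &= j^1 , \\ -\partial_0 F_{02} + \partial_1 F_{12} + \partial_2 F_{22} + \partial_3 F_{32} &= j^2 , \\ -\partial_0 F_{03} + \partial_1 F_{13} + \partial_2 F_{23} + \partial_3 F_{33} &= j^3 . \end{aligned} \tag{5.13} \]
The minus signs pertain to the definition of $F$ with raised indices (A.14)
\[ F^{mn} = \eta^{mk} \eta^{nl} F_{kl} , \quad F^{0i} = -F_{0i} , \quad F^{ij} = F_{ij} , \quad i, j \in \{1, 2, 3\} , \tag{5.14} \]
\[ \begin{pmatrix} F^{00} & F^{01} & F^{02} & F^{03} \\ F^{10} & F^{11} & F^{12} & F^{13} \\ F^{20} & F^{21} & F^{22} & F^{23} \\ F^{30} & F^{31} & F^{32} & F^{33} \end{pmatrix} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix} . \tag{5.15} \]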
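The sign pattern of (5.9) and (5.15) can be spot-checked numerically. The following is only a minimal sketch: the helper names `levi_civita`, `field_strength_lower` and `raise_indices` and the sample field values are assumptions made for this illustration, not notation of the text.

```python
# Metric with signature (+, -, -, -), as used for raising indices in (5.14).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for 0-based indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_strength_lower(E, B):
    """F_mn with F_0i = E^i and F_ij = -eps_ijk B^k, see (5.8)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = -sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F


def raise_indices(F):
    """F^mn = eta^mk eta^nl F_kl; eta is diagonal, so only signs change."""
    return [[ETA[m][m] * ETA[n][n] * F[m][n] for n in range(4)] for m in range(4)]


# Arbitrary sample values for (E^1, E^2, E^3) and (B^1, B^2, B^3).
E = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]
F_lower = field_strength_lower(E, B)
F_upper = raise_indices(F_lower)

for row in F_upper:
    print(row)
```

Raising both indices flips the sign of the electric components only, while the magnetic block is unchanged, in agreement with (5.14).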

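Then the inhomogeneous Maxwell equations have the form
\[ \begin{aligned} \partial_0 F^{00} + \partial_1 F^{10} + \partial_2 F^{20} + \partial_3 F^{30} &= j^0 , \\ \partial_0 F^{01} + \partial_1 F^{11} + \partial_2 F^{21} + \partial_3 F^{31} &= j^1 , \\ \partial_0 F^{02} + \partial_1 F^{12} + \partial_2 F^{22} + \partial_3 F^{32} &= j^2 , \\ \partial_0 F^{03} + \partial_1 F^{13} + \partial_2 F^{23} + \partial_3 F^{33} &= j^3 , \end{aligned} \tag{5.16} \]
or in index notation
\[ \partial_m F^{mn} = j^n . \tag{5.17} \]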

5.2 Local Charge Conservation

A double sum over a symmetric and an antisymmetric pair of indices vanishes,
\[ T_{rs}{}^{mn} = -T_{rs}{}^{nm} = T_{sr}{}^{mn} \;\Rightarrow\; T_{kl}{}^{kl} = 0 , \tag{5.18} \]
since, after permuting and relabeling the summation indices, it equals its negative,
\[ T_{kl}{}^{kl} = T_{lk}{}^{kl} = -T_{lk}{}^{lk} = -T_{lm}{}^{lm} = -T_{km}{}^{km} = -T_{kl}{}^{kl} . \tag{5.19} \]
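The statement (5.18) can be illustrated with the simplest instance, a product $S_{kl} A^{kl}$ of a symmetric and an antisymmetric array. The sketch below uses random sample entries; the names `S`, `A` and `double_sum` are hypothetical and chosen only for this example.

```python
import random

random.seed(0)
N = 4

# Start from an arbitrary array and split it into its two symmetry parts.
M = [[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]
S = [[M[k][l] + M[l][k] for l in range(N)] for k in range(N)]  # symmetric
A = [[M[k][l] - M[l][k] for l in range(N)] for k in range(N)]  # antisymmetric

# Double sum over both index pairs; it vanishes up to rounding errors.
double_sum = sum(S[k][l] * A[k][l] for k in range(N) for l in range(N))
print(double_sum)
```

Each term with indices $(k, l)$ is cancelled by the term with $(l, k)$, which is the relabeling argument of (5.19).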

The field strength is antisymmetric, $F^{mn} = -F^{nm}$. Partial derivatives, if they are continuous, can be interchanged, $\partial_n \partial_m = \partial_m \partial_n$, so the double sum $\partial_n \partial_m F^{mn}$ vanishes. So, applying $\partial_n$ to (5.17), we obtain the continuity equation $0 = \partial_n j^n$. The four-divergence $\partial_0 j^0 + \partial_1 j^1 + \partial_2 j^2 + \partial_3 j^3$ of the four-current vanishes or, in other words, the charge density decreases by the divergence of the current density,
\[ \partial_n j^n = 0 , \qquad \partial_0 \rho + \operatorname{div}\mathbf{j} = 0 . \tag{5.20} \]

The continuity equation restricts conceivable sources $\rho$ and $\mathbf{j}$ of the electromagnetic fields. Only such charge and current densities can occur in the Maxwell equations which satisfy the continuity equation and therefore local charge conservation.

Local charge conservation implies more than just charge conservation. Charge would already be conserved if it vanished in the laboratory and reappeared at the same instant on the other side of the moon. Local charge conservation, however, implies that the charge $Q_V$ in each time independent volume $V$ changes in the course of time only because unbalanced currents flow in and out through the boundary surface $\partial V$ (read “boundary of $V$”). This follows by Gauß’ theorem if one integrates the continuity equation over the volume $V$,
\[ Q_V(t) = \int_V d^3x\, \rho(t, \mathbf{x}) , \tag{5.21} \]
\[ \frac{d}{dt} Q_V(t) = \int_V d^3x\, \partial_0 \rho = -\int_V d^3x\, \operatorname{div}\mathbf{j} = -\oint_{\partial V} d^2\mathbf{f} \cdot \mathbf{j} . \tag{5.22} \]
In particular, a single charge cannot be produced from the vacuum.

5.3 Energy and Momentum

The electromagnetic field generates an energy density, energy currents, momentum densities and momentum currents. These quantities are the components $T^{kl}$ of the energy-momentum tensor of the electromagnetic field
\[ T^{kl} = -\Bigl( F^k{}_n F^{ln} - \frac{1}{4}\, \eta^{kl} F_{mn} F^{mn} \Bigr) . \tag{5.23} \]
The energy-momentum tensor is symmetric and traceless,
\[ T^{kl} = T^{lk} , \qquad \eta_{kl} T^{kl} = 0 . \tag{5.24} \]
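The properties (5.24) and the identification of the components $T^{00}$ and $T^{i0}$ with $\frac{1}{2}(\mathbf{E}^2 + \mathbf{B}^2)$ and $\mathbf{E} \times \mathbf{B}$, stated in (5.29) and (5.30), can be checked numerically from the definition (5.23). This is a sketch under assumptions: the function names and the sample field values are made up for the illustration.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+, -, -, -)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def field_strength_upper(E, B):
    """F^mn as in (5.15)."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0.0, -E1, -E2, -E3],
            [E1, 0.0, -B3, B2],
            [E2, B3, 0.0, -B1],
            [E3, -B2, B1, 0.0]]


def energy_momentum(E, B):
    """T^kl = -(F^k_n F^ln - 1/4 eta^kl F_mn F^mn), see (5.23)."""
    Fu = field_strength_upper(E, B)
    # F_mn F^mn = sum over m, n of eta_mm eta_nn (F^mn)^2 for a diagonal metric.
    invariant = sum(ETA[m] * ETA[n] * Fu[m][n] ** 2
                    for m in range(4) for n in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for k in range(4):
        for l in range(4):
            # F^k_n F^ln = sum over n of eta_nn F^kn F^ln.
            contraction = sum(ETA[n] * Fu[k][n] * Fu[l][n] for n in range(4))
            eta_kl = ETA[k] if k == l else 0.0
            T[k][l] = -(contraction - 0.25 * eta_kl * invariant)
    return T


E = [1.0, 2.0, 3.0]  # arbitrary sample values
B = [4.0, 5.0, 6.0]
T = energy_momentum(E, B)
trace = sum(ETA[k] * T[k][k] for k in range(4))
print(T[0][0], [T[i][0] for i in (1, 2, 3)], trace)
```

For these sample values the energy density is $(14 + 77)/2 = 45.5$ and the momentum density is $\mathbf{E} \times \mathbf{B} = (-3, 6, -3)$.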

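For each value of the index $k$, the $T^{kl}$ are four components of a four-current which, in absence of electric charge and currents, is conserved,
\[ \partial_l T^{lk} = -F^k{}_n\, j^n . \tag{5.25} \]
To verify this we calculate $-\partial_l T_k{}^l$,
\[ \begin{aligned} \partial_l \Bigl( F_{kn} F^{ln} - \frac{1}{4}\, \delta_k{}^l F_{mn} F^{mn} \Bigr) &= (\partial_l F_{kn}) F^{ln} + F_{kn}\, \partial_l F^{ln} - \frac{1}{2} (\partial_k F_{mn}) F^{mn} \\ &= \frac{1}{2} \bigl( \partial_l F_{kn} - \partial_n F_{kl} - \partial_k F_{ln} \bigr) F^{ln} + F_{kn}\, \partial_l F^{ln} = F_{kn}\, j^n . \end{aligned} \tag{5.26} \]
Here we have used the Maxwell equations (5.11, 5.17) and that the double sum with $F^{ln} = -F^{nl}$ antisymmetrizes in the indices $l$ and $n$, because a double sum of a symmetric and antisymmetric index pair vanishes (5.18),
\[ 2 F^{ln} \partial_l F_{kn} = F^{ln} \bigl( \partial_l F_{kn} - \partial_n F_{kl} \bigr) + F^{ln} \bigl( \partial_l F_{kn} + \partial_n F_{kl} \bigr) = F^{ln} \bigl( \partial_l F_{kn} - \partial_n F_{kl} \bigr) . \tag{5.27} \]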

If the charge and current densities vanish then the energy-momentum tensor consists of components of four conserved currents (5.25). For each $k$ the component $T^{k0}$ is a density whose spatial integral
\[ p^k = \int d^3x\, T^{k0}(x) \tag{5.28} \]
is time-independent (5.21) if the integral over the current $(T^{k1}, T^{k2}, T^{k3})$ through the boundary vanishes, i.e. if no energy or momentum is radiated away. The conserved quantities correspond, as we shall show on page 110, to the invariance of the action under temporal and spatial translations and are therefore the energy and momentum $(p^0, p^1, p^2, p^3) = (E, \mathbf{p})$. So $T^{00}$ is the energy density and $(T^{10}, T^{20}, T^{30})$ is the momentum density. Using (5.9, 5.15) we write the densities as quadratic expressions in the electric and magnetic field ($i, j, k \in \{1, 2, 3\}$),
\[ T^{00} = \frac{1}{2} \bigl( \mathbf{E}^2 + \mathbf{B}^2 \bigr) , \qquad E = \int d^3x\, \frac{1}{2} \bigl( \mathbf{E}^2 + \mathbf{B}^2 \bigr) , \tag{5.29} \]
\[ T^{i0} = \varepsilon_{ijk} E^j B^k , \qquad \mathbf{p} = \int d^3x\, \mathbf{E} \times \mathbf{B} . \tag{5.30} \]
The energy density $u = \frac{1}{2} ( \mathbf{E}^2 + \mathbf{B}^2 )$ is nonnegative and vanishes if and only if the field vanishes. The Poynting vector
\[ \mathbf{S} = \mathbf{E} \times \mathbf{B} \tag{5.31} \]
is the momentum density of the electromagnetic field. Because the energy-momentum tensor is symmetric, $T^{0i} = T^{i0}$, the Poynting vector is also the current density of the energy. In presence of electric charge and current density, energy and momentum of the electromagnetic field vary in time due to (5.25)

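\[ \partial_0\, \frac{1}{2} \bigl( \mathbf{E}^2 + \mathbf{B}^2 \bigr) + \operatorname{div}\mathbf{S} = -\mathbf{j} \cdot \mathbf{E} , \tag{5.32} \]
\[ \partial_0 S^i + \partial_j T^{ij} = -\bigl( \rho\, E^i + \varepsilon_{ijk}\, j^j B^k \bigr) , \tag{5.33} \]
where
\[ T^{ij} = -\Bigl( E^i E^j + B^i B^j - \frac{1}{2}\, \delta^{ij} \bigl( \mathbf{E}^2 + \mathbf{B}^2 \bigr) \Bigr) , \tag{5.34} \]
since energy and momentum can be exchanged with charged particles. If the overall momentum of particles and electromagnetic field is conserved, then $\rho\, \mathbf{E} + \mathbf{j} \times \mathbf{B}$ must be the increase of the momentum density of the carriers of charge, and $\mathbf{j} \cdot \mathbf{E}$ is the change of their energy density. The momentum $\mathbf{p}_{\text{particle}}$ and the energy $E_{\text{particle}}$ of a point particle with charge $q$, which traverses the curve $\mathbf{x}(t)$ and has the charge density $\rho(t, \mathbf{z}) = q\, \delta^3(\mathbf{z} - \mathbf{x}(t))$ and the current density $\mathbf{j}(t, \mathbf{z}) = q\, \dot{\mathbf{x}}\, \delta^3(\mathbf{z} - \mathbf{x}(t))$, therefore change in time by
\[ \frac{d\mathbf{p}_{\text{particle}}}{dt} = \mathbf{F}_{\text{Lorentz}} = q\, \bigl( \mathbf{E} + \mathbf{v} \times \mathbf{B} \bigr) , \qquad \frac{dE_{\text{particle}}}{dt} = q\, \frac{d\mathbf{x}}{dt} \cdot \mathbf{E} . \tag{5.35} \]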

So the Lorentz force (5.1) in the relativistic equation of motion of charged point particles states the overall conservation of energy and momentum. The energy of the particle changes by the work done by the electric field; the magnetic field does not change the particle energy.

The spatial components of the energy-momentum tensor are current densities of momentum. Through a small parallelogram with edges $\mathbf{a}$ and $\mathbf{b}$ passes the flow of momentum $F^i(\mathbf{a}, \mathbf{b}) = T^{ij} (\mathbf{a} \times \mathbf{b})^j$. If the flow is absorbed by the parallelogram, then $F^i(\mathbf{a}, \mathbf{b})$ is the force which acts on it. The isotropic part of $T^{ij}$, i.e. the part which is proportional to $\delta^{ij}$, is the pressure. By (5.34) the pressure of the electromagnetic field is a third of its energy density, as the energy-momentum tensor is traceless. If the radiation is not isotropic, e.g. light from the sun, then $T^{ij} n^j$ are the components of the radiation pressure exerted on an absorber with normal vector $(n^1, n^2, n^3)$.

Uniqueness and Domain of Dependence

The electrodynamic fields depend at time $t > 0$ in position $\mathbf{x}$ only on the charges and currents and initial values at time $t = 0$ in the causal past of $(t, \mathbf{x})$, i.e. the domain $G$, which is bounded by the backward light cone of $(t, \mathbf{x})$ and the spacelike initial surface $I$, which is cut by the backward light cone from the surface $t = 0$ [13, chapter VI, §4],
\[ G = \{ (t', \mathbf{y}) : 0 \le t' \le t ,\; |\mathbf{x} - \mathbf{y}| \le t - t' \} , \qquad I = \{ (0, \mathbf{y}) : |\mathbf{x} - \mathbf{y}| \le t \} . \tag{5.36} \]

To be specific, if two solutions with the same charges and currents in $G$ coincide in their initial values of the fields $\mathbf{E}$ and $\mathbf{B}$ on the initial surface $I$, then the difference of both solutions satisfies the Maxwell equations with charges and currents which vanish in $G$ and with initial values which vanish on $I$. Such a solution, however, has to vanish in $G$: nothing comes from nothing. This holds because the energy density $u$ is nowhere smaller than the modulus of the energy current density $\mathbf{S}$,
\[ u = \frac{1}{2} \bigl( \mathbf{E}^2 + \mathbf{B}^2 \bigr) \ge |\mathbf{E} \times \mathbf{B}| = |\mathbf{S}| , \tag{5.37} \]
for $(|\mathbf{E}| - |\mathbf{B}|)^2 \ge 0$ implies $(|\mathbf{E}|^2 + |\mathbf{B}|^2) \ge 2 |\mathbf{E}|\, |\mathbf{B}|$ and $|\mathbf{E}|\, |\mathbf{B}| \ge |\mathbf{E} \times \mathbf{B}|$. Consequently for all future directed, timelike four-vectors
\[ w = (w^0, \mathbf{w}) , \qquad w^0 - |\mathbf{w}| > 0 , \tag{5.38} \]
the density $w_m T^{m0}$ is nonnegative, $w_m T^{m0} \ge 0$,
\[ w_m T^{m0} = w^0 u - \mathbf{w} \cdot \mathbf{S} \ge w^0 u - |\mathbf{w}|\, |\mathbf{S}| \ge (w^0 - |\mathbf{w}|)\, u \ge 0 \tag{5.39} \]
and vanishes if and only if the energy density $u = (\mathbf{E}^2 + \mathbf{B}^2)/2$ and therefore all field strengths vanish, $w_m T^{m0} = 0 \Leftrightarrow \mathbf{E} = 0 = \mathbf{B}$.

Fig. 5.1 Domain of Dependence $G$ (labels in the figure: the point $(t, \mathbf{x})$, the surfaces $S$ and $I$, the domain $V$)

Each inner point $(t', \mathbf{y})$ of the domain $G$ lies on a spacelike surface
\[ S = \{ (t(\mathbf{y}), \mathbf{y}) : t(\mathbf{y}) \ge 0 ,\; \mathbf{y} \in I \} , \tag{5.40} \]
which together with $I$ borders a domain $V \subset G$. Within $V$ the current $j^m$ and therefore the four-divergence $\partial_m T^{m0}$ vanish (5.25), so
\[ \int_V d^4y\, \partial_m T^{m0} = 0 . \tag{5.41} \]
By Gauß’ theorem the integral over $d\omega = dt\, d^3y\, \partial_m T^{m0}$ over the volume $V$ equals the integral over $\omega = d^3y\, T^{00} - \frac{1}{2}\, dt\, dy^i\, dy^j\, \varepsilon_{ijk} T^{k0}$ over the boundary $I$ and $S$. But $I$ does not contribute, because the initial values vanish, therefore
\[ \int_S d^3y\, \bigl( T^{00} - \partial_i t\; T^{i0} \bigr) = 0 . \tag{5.42} \]


The dual four-vector w = (1, −∂1t, −∂2t, −∂3t) is future directed and timelike everywhere because S is spacelike. Therefore the integrand wm T m0 is nonnegative and the integral vanishes only if E and B vanish everywhere on S . So both fields vanish in each point (t ′ , y) of the interior of G. Because the fields are continuous, they vanish in all of G. Each solution of the Maxwell equations is uniquely determined by the charge and current densities and the initial values of E and B on a spacelike surface I . The fields at (t, x) with t > 0 depend only on the initial values of that part of the surface I, which lies in the backward light cone of (t, x) and on the charge and current densities in the domain G, which is bounded by the backward light cone and the initial surface. For t < 0 the corresponding results hold for the forward light cone. Localized changes in the initial data and in the charges and currents cause changes in the fields at most with the velocity of light. Nothing outruns light.

5.4 Electrodynamic Potentials

The electric field $\mathbf{E}$ of a spherically symmetric, static charge distribution follows from the obvious ansatz that $\mathbf{E}$ is time-independent and radial¹
\[ \mathbf{E}(\mathbf{x}) = \frac{\mathbf{x}}{|\mathbf{x}|}\, E(|\mathbf{x}|) . \tag{5.43} \]
The modulus of the $\mathbf{E}$-field is determined by integrating the Maxwell equation $\operatorname{div}\mathbf{E} = \rho$ (5.4) over a ball $B_r$ with radius $r$. On the right-hand side one obtains the charge $Q(r)$ in the ball. By Gauß’ theorem the left-hand side equals the integral over the boundary of the ball, the sphere $\partial B_r$,
\[ \int_{B_r} d^3x\, \operatorname{div}\mathbf{E}(\mathbf{x}) = Q(r) = \oint_{\partial B_r} d^2\mathbf{f} \cdot \mathbf{E}(\mathbf{x}) . \tag{5.44} \]
On the sphere the outward normal $d\mathbf{f}$ is parallel to the electric field $\mathbf{E}$, so the scalar product $d\mathbf{f} \cdot \mathbf{E}$ equals the product of the absolute values. The modulus of the electric field is constant on the sphere and can be extracted from the integral, which gives the size $4\pi r^2$ of the sphere. We obtain
\[ Q(r) = 4\pi r^2 E(r) , \qquad E(r) = \frac{Q(r)}{4\pi r^2} . \tag{5.45} \]
For a spherically symmetric charge distribution only those charges influence a test charge at the point $\mathbf{x}$ which are within the sphere of radius $|\mathbf{x}|$. The electrostatic force $\mathbf{F} = q\mathbf{E}$ repels charges $q$ with the same sign as $Q(r)$.

¹ Unfortunately, “energy” and “electric field” begin with the same letter. We denote the absolute value of the electric field and the energy by $E$ and let the context resolve the ambiguity.

Inside a homogeneously charged sphere with radius $R$ the ratio of $Q(r)$ to the total charge $Q$ equals the ratio of the volume $\frac{4}{3} \pi r^3$ to the total volume $\frac{4}{3} \pi R^3$, so one has $Q(r) = Q\, r^3 / R^3$ and the electric field of a homogeneously charged sphere is
\[ E(r) = \frac{Q}{4\pi} \cdot \begin{cases} \dfrac{r}{R^3} & \text{if } r < R \\[1ex] \dfrac{1}{r^2} & \text{if } r \ge R \end{cases} . \tag{5.46} \]

A test particle in a homogeneously charged ball with opposite charge is subject to the same force, increasing linearly with the distance, as a spherically symmetric harmonic oscillator. It orbits ellipses around the center (different from Kepler ellipses which orbit a focal point).

The electrostatic field is the negative gradient of a potential $\phi(\mathbf{x})$ and therefore also satisfies the remaining Maxwell equations with $\mathbf{B} = 0$ and $\mathbf{j} = 0$,
\[ \mathbf{E} = -\operatorname{grad}\phi , \qquad \phi(\mathbf{x}) = \frac{Q}{4\pi} \cdot \begin{cases} -\dfrac{\mathbf{x}^2}{2R^3} + \dfrac{3}{2R} & \text{if } |\mathbf{x}| < R \\[1ex] \dfrac{1}{|\mathbf{x}|} & \text{if } |\mathbf{x}| \ge R \end{cases} . \tag{5.47} \]
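That the potential (5.47) reproduces the field modulus (5.46) and is continuous at the surface of the ball can be checked with a short script. The sketch below assumes the units of the text ($\varepsilon_0 = 1$); the function names and the default values $Q = R = 1$ are illustrative choices.

```python
import math


def field_modulus(r, Q=1.0, R=1.0):
    """E(r) of a homogeneously charged ball, see (5.46)."""
    if r < R:
        return Q / (4.0 * math.pi) * r / R ** 3
    return Q / (4.0 * math.pi) / r ** 2


def potential(r, Q=1.0, R=1.0):
    """phi as a function of r = |x|, see (5.47)."""
    if r < R:
        return Q / (4.0 * math.pi) * (-r ** 2 / (2.0 * R ** 3) + 3.0 / (2.0 * R))
    return Q / (4.0 * math.pi) / r


def minus_dphi_dr(r, Q=1.0, R=1.0, h=1e-6):
    """Central difference approximation of -d(phi)/dr, the radial field."""
    return -(potential(r + h, Q, R) - potential(r - h, Q, R)) / (2.0 * h)


for r in (0.5, 2.0):
    # Inside and outside the ball the two columns should agree closely.
    print(r, field_modulus(r), minus_dphi_dr(r))
```

Inside the ball the field grows linearly with $r$, outside it falls off like $1/r^2$, and both branches of the potential take the value $Q/(4\pi R)$ at $r = R$.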

Poisson Equation

The static potential outside of a point particle at the origin with charge $q$ is the Coulomb potential $\phi : \mathbf{y} \mapsto q / (4\pi |\mathbf{y}|)$. If the particle is at $\mathbf{x}$, the corresponding potential $\phi(\mathbf{y}) = q / (4\pi |\mathbf{y} - \mathbf{x}|)$ is the shifted potential, i.e. the Maxwell equations are translation-invariant. The potential of several point charges is the sum of the potentials of the individual charges, because the Maxwell equations are linear,
\[ \phi(\mathbf{y}) = \frac{1}{4\pi} \sum_i \frac{q_i}{|\mathbf{y} - \mathbf{x}_i|} . \tag{5.48} \]
For a continuous, static charge distribution $\rho$ the potential becomes the integral
\[ \phi(\mathbf{y}) = \frac{1}{4\pi} \int d^3x\, \frac{\rho(\mathbf{x})}{|\mathbf{x} - \mathbf{y}|} . \tag{5.49} \]
It satisfies the Poisson equation, $-\operatorname{div}\operatorname{grad}\phi = \rho$, i.e. $\Delta\phi = -\rho$, where
\[ \Delta = \operatorname{div}\operatorname{grad} = \partial_1{}^2 + \partial_2{}^2 + \partial_3{}^2 \tag{5.50} \]

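is the Laplace operator. To show that (5.49) solves the Poisson equation, we cut out of the integration volume $V$ a ball $B_{\varepsilon, \mathbf{y}}$ around $\mathbf{y}$ with radius $\varepsilon$
\[ B_{\varepsilon, \mathbf{y}} = \{ \mathbf{x} : |\mathbf{x} - \mathbf{y}| \le |\varepsilon| \} \tag{5.51} \]
and consider the integral over the remaining volume $V_\varepsilon = V - B_{\varepsilon, \mathbf{y}}$ for $\varepsilon \to 0$. Within $V_\varepsilon$ the function $1/|\mathbf{x} - \mathbf{y}|$ is differentiable and satisfies $\Delta \frac{1}{|\mathbf{x} - \mathbf{y}|} = 0$. So in $V_\varepsilon$ one has
\[ I_\varepsilon(\mathbf{y}) = \int_{V_\varepsilon} d^3x\, \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \Delta\phi(\mathbf{x}) = \int_{V_\varepsilon} d^3x\, \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \Delta\phi(\mathbf{x}) - \Bigl( \Delta \frac{1}{|\mathbf{x} - \mathbf{y}|} \Bigr) \phi(\mathbf{x}) \Bigr) . \tag{5.52} \]
The integrand is a sum of derivative terms
\[ \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \Delta\phi(\mathbf{x}) - \Bigl( \Delta \frac{1}{|\mathbf{x} - \mathbf{y}|} \Bigr) \phi(\mathbf{x}) = \partial_i \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \partial_i \phi(\mathbf{x}) - \Bigl( \partial_i \frac{1}{|\mathbf{x} - \mathbf{y}|} \Bigr) \phi(\mathbf{x}) \Bigr) . \]
Therefore, according to Gauß’ theorem
\[ \begin{aligned} I_\varepsilon(\mathbf{y}) &= \int_{\partial V_\varepsilon} d^2\mathbf{f} \cdot \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \operatorname{grad}\phi(\mathbf{x}) - \Bigl( \operatorname{grad} \frac{1}{|\mathbf{x} - \mathbf{y}|} \Bigr) \phi(\mathbf{x}) \Bigr) \\ &= \int_{\partial V_\varepsilon} d^2\mathbf{f} \cdot \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \operatorname{grad}\phi(\mathbf{x}) + \frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3}\, \phi(\mathbf{x}) \Bigr) . \end{aligned} \tag{5.53} \]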

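The boundary of $V_\varepsilon$ consists of the boundary of $V$ and the sphere $\partial B_{\varepsilon, \mathbf{y}}$. Observe that in the surface integral (5.53) the normal vector $d^2\mathbf{f} = d^2f\, \mathbf{n}$ points out of $V_\varepsilon$ into the ball $B_{\varepsilon, \mathbf{y}}$. On the sphere $\partial B_{\varepsilon, \mathbf{y}}$ one has
\[ \frac{1}{|\mathbf{x} - \mathbf{y}|} = \frac{1}{\varepsilon} , \qquad \frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3} = -\frac{1}{\varepsilon^2}\, \mathbf{n} . \tag{5.54} \]
The integral over the first term vanishes in the limit $\varepsilon \to 0$,
\[ \int_{\partial B_{\varepsilon, \mathbf{y}}} d^2\mathbf{f} \cdot \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \operatorname{grad}\phi(\mathbf{x}) = \frac{1}{\varepsilon} \int_{\partial B_{\varepsilon, \mathbf{y}}} d^2\mathbf{f} \cdot \operatorname{grad}\phi(\mathbf{x}) \longrightarrow 0 , \tag{5.55} \]
because by the mean value theorem of integration it equals a value of $(\mathbf{n} \cdot \operatorname{grad}\phi)$ at some point of the sphere times its size $4\pi\varepsilon^2$ divided by $\varepsilon$. The scalar product in the second term of the integrand is minus the product of the absolute values. So for some point $\mathbf{z}$ of the sphere $\partial B_{\varepsilon, \mathbf{y}}$
\[ \int_{\partial B_{\varepsilon, \mathbf{y}}} d^2\mathbf{f} \cdot \frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3}\, \phi(\mathbf{x}) = -\frac{1}{\varepsilon^2} \int_{\partial B_{\varepsilon, \mathbf{y}}} d^2f\, \phi(\mathbf{x}) = -\frac{4\pi\varepsilon^2}{\varepsilon^2}\, \phi(\mathbf{z}) . \tag{5.56} \]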

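In the limit of vanishing $\varepsilon$ the point $\mathbf{z}$ tends to the center $\mathbf{y}$ of $B_{\varepsilon, \mathbf{y}}$, so the integral over $\partial B_{\varepsilon, \mathbf{y}}$ tends to $-4\pi\phi(\mathbf{y})$. Altogether we obtain
\[ \int_V d^3x\, \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \Delta\phi(\mathbf{x}) = -4\pi\phi(\mathbf{y}) + \int_{\partial V} d^2\mathbf{f} \cdot \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \operatorname{grad}\phi(\mathbf{x}) + \frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3}\, \phi(\mathbf{x}) \Bigr) \tag{5.57} \]
or, solved for $\phi(\mathbf{y})$,
\[ \phi(\mathbf{y}) = -\frac{1}{4\pi} \int_V d^3x\, \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \Delta\phi(\mathbf{x}) + \frac{1}{4\pi} \int_{\partial V} d^2\mathbf{f} \cdot \Bigl( \frac{1}{|\mathbf{x} - \mathbf{y}|}\, \operatorname{grad}\phi(\mathbf{x}) + \frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3}\, \phi(\mathbf{x}) \Bigr) . \tag{5.58} \]
Each function $\phi$ with continuous second derivatives in $V \subset \mathbb{R}^3$, which is continuous on the closure of $V$, is determined by $\Delta\phi$ and its boundary values on $\partial V$. The boundary values of the electrostatic potential of spatially bounded charge distributions can be chosen to vanish for $V = \mathbb{R}^3$, i.e. (5.49) solves the Poisson equation for insular charge distributions with vanishing boundary values.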

Harmonic Functions

Functions $\phi$, which in a domain $V$ solve the Laplace equation,
\[ \Delta\phi = 0 , \tag{5.59} \]
e.g. the electrostatic potential in a charge free region, are called harmonic in $V$. The surface integral over the normal derivative of harmonic functions over the boundary surface of $V$ yields the charge within $V$ and therefore vanishes,
\[ \int_{\partial V} d^2\mathbf{f} \cdot \operatorname{grad}\phi = \int_V d^3x\, \partial_i \partial_i \phi = 0 . \tag{5.60} \]
The representation (5.58) of functions with continuous second derivatives in $V$ also holds for each ball $B_{R, \mathbf{y}} \subset V$ (5.51) around some inner point $\mathbf{y}$ of $V$. Because $\phi$ is harmonic in $V$, the volume integral over $B_{R, \mathbf{y}}$ vanishes, and so does the surface integral over the normal derivative (5.60), because the factor $1/R$ is constant on the boundary. So $\phi$ equals at $\mathbf{y}$ its mean value $M_{R, \mathbf{y}}[\phi]$ on the sphere $\partial B_{R, \mathbf{y}}$,
\[ \phi(\mathbf{y}) = M_{R, \mathbf{y}}[\phi] , \qquad M_{R, \mathbf{y}}[\phi] = \frac{1}{4\pi} \int_{\cos\theta = -1}^{\cos\theta = +1} d\cos\theta \int_0^{2\pi} d\varphi\; \phi\bigl( \mathbf{y} + R\, \mathbf{n}(\theta, \varphi) \bigr) , \tag{5.61} \]
where $\mathbf{n}$ denotes the unit vector with angles $\theta$, $\varphi$ (2.33),
\[ \mathbf{n}(\theta, \varphi) = (\sin\theta \cos\varphi ,\; \sin\theta \sin\varphi ,\; \cos\theta) . \tag{5.62} \]

As each harmonic function equals its mean value on surrounding spheres, it becomes minimal or maximal on the boundary of the domain V , in which it is harmonic. In particular, as an electrostatic potential does not have a local minimum in a charge free region, there is no electrostatic trap for charged particles. Each solution φ of the Poisson equation ∆ φ = −ρ is uniquely determined by its boundary values and the source ρ , because the difference of two solutions with the same ρ and the same boundary values is a solution of the Laplace equation, which vanishes on the boundary ∂ V and therefore in the enclosed volume V .
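The mean value property (5.61) can be tried out numerically. The sketch below approximates the spherical mean with a midpoint rule in $\cos\theta$ and $\varphi$; the grid size, the two test functions, and the names `sphere_mean`, `harmonic` and `not_harmonic` are assumptions made for this illustration.

```python
import math


def sphere_mean(phi, center, radius, n_theta=60, n_phi=60):
    """Midpoint-rule approximation of the spherical mean M_{R,y}[phi] in (5.61)."""
    total = 0.0
    for a in range(n_theta):
        cos_t = -1.0 + (a + 0.5) * 2.0 / n_theta
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        for b in range(n_phi):
            ang = (b + 0.5) * 2.0 * math.pi / n_phi
            n = (sin_t * math.cos(ang), sin_t * math.sin(ang), cos_t)
            point = [c + radius * ni for c, ni in zip(center, n)]
            total += phi(*point)
    # Each cell has weight (2 / n_theta) * (2 pi / n_phi) / (4 pi).
    return total / (n_theta * n_phi)


def harmonic(x, y, z):
    """x^2 - y^2 + 3 z has vanishing Laplacian."""
    return x * x - y * y + 3.0 * z


def not_harmonic(x, y, z):
    """x^2 + y^2 + z^2 has Laplacian 6, so the mean value property fails."""
    return x * x + y * y + z * z


center = (0.3, -0.2, 0.5)
print(harmonic(*center), sphere_mean(harmonic, center, 0.7))
print(not_harmonic(*center), sphere_mean(not_harmonic, center, 0.7))
```

For the harmonic test function the mean over the sphere agrees with the value at the center, whereas for $|\mathbf{x}|^2$ the mean exceeds the central value by $R^2$.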

The potential on a conducting surface $\partial V$ becomes constant after all currents have faded away. If the surface encloses a charge free region $V$, then the potential is also constant in $V$ because its values are between the minimal and maximal values at $\partial V$. Therefore the electric field strength vanishes in each Faraday cage.

If the normal derivative $n^i \partial_i \phi$ of a harmonic function vanishes on $\partial V$,
\[ 0 = -\int_V d^3x\, \phi\, \Delta\phi = \int_V d^3x\, \partial_i \phi\, \partial_i \phi - \int_{\partial V} d^2f\, n^i \phi\, \partial_i \phi = \int_V d^3x \sum_i (\partial_i \phi)^2 , \tag{5.63} \]
then the gradient $\partial_i \phi$ vanishes in $V$ and $\phi$ is constant. The normal derivative on the boundary determines the solution up to a constant, because the difference of two solutions with the same normal derivative is a harmonic function with vanishing normal derivative and therefore constant.

As $\phi^2$ and $(\partial_i \phi)^2$ are nonnegative, (5.63) shows that in regions without boundary the Laplace operator $\Delta$ does not have positive eigenvalues, $\Delta\phi = \lambda\phi \Rightarrow \lambda \le 0$.

5.5 Gauge Transformation of the Four-Potential

The homogeneous Maxwell equations (5.11) imply that the field strengths $F_{kl}(x)$ are the antisymmetrized derivatives of four potential functions $A = (A_0, A_1, A_2, A_3)$,
\[ F_{kl}(x) = \partial_k A_l(x) - \partial_l A_k(x) , \qquad k, l \in \{0, 1, 2, 3\} . \tag{5.64} \]

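Field strengths of this form solve the homogeneous Maxwell equations, for (5.11) is totally antisymmetric under permutations and the order of partial derivatives can be interchanged, $\partial_k \partial_l A_m - \partial_l \partial_k A_m = 0$. Conversely, one verifies by differentiation that the antisymmetrized derivatives of the following four-potential (which is defined in star-shaped regions, which contain with each point $x$ also the connecting line to the origin)
\[ A_l(x) = \int_0^1 d\lambda\, \lambda\, x^m F_{ml}(\lambda x) \tag{5.65} \]
yield the field strengths, if they satisfy the homogeneous Maxwell equations (5.11),
\[ \begin{aligned} \partial_k A_l - \partial_l A_k &= \int_0^1 d\lambda\, \Bigl( \lambda\, \delta_k{}^m F_{ml}(\lambda x) + \lambda\, x^m \cdot \lambda\, (\partial_k F_{ml})\big|_{(\lambda x)} - k \leftrightarrow l \Bigr) \\ &= \int_0^1 d\lambda\, \Bigl( 2\lambda\, F_{kl}(\lambda x) + \lambda^2 x^m (\partial_k F_{ml} - \partial_l F_{mk})\big|_{(\lambda x)} \Bigr) \\ &\overset{5.11}{=} \int_0^1 d\lambda\, \Bigl( 2\lambda\, F_{kl}(\lambda x) + \lambda^2 x^m\, \partial_m F_{kl}(\lambda x) \Bigr) \\ &= \int_0^1 d\lambda\, \frac{\partial}{\partial\lambda} \bigl( \lambda^2 F_{kl}(\lambda x) \bigr) = \lambda^2 F_{kl}(\lambda x) \Big|_{\lambda = 0}^{\lambda = 1} = F_{kl}(x) . \end{aligned} \tag{5.66} \]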
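As a quick plausibility check of (5.65), consider the special case of a constant field strength $F_{kl}$, an assumption made only for this illustration. The $\lambda$-integral is then elementary and the antisymmetrized derivative indeed returns the field strength:

```latex
A_l(x) = \int_0^1 d\lambda\, \lambda\, x^m F_{ml} = \tfrac{1}{2}\, x^m F_{ml} ,
\qquad
\partial_k A_l - \partial_l A_k
  = \tfrac{1}{2}\, (F_{kl} - F_{lk}) = F_{kl} .
```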


The component $A_0$ is called scalar potential $\phi$. The spatial components are combined to the vector potential $\mathbf{A} = (-A_1, -A_2, -A_3)$. Then (5.64) states for the magnetic and the electric field (5.9)
\[ \mathbf{B} = \operatorname{rot}\mathbf{A} , \qquad \mathbf{E} = -\operatorname{grad}\phi - \partial_0 \mathbf{A} . \tag{5.67} \]
Because of $\partial_k \partial_l \chi = \partial_l \partial_k \chi$ and (5.64) the field strengths do not change if one adds to the four-potential $A_l$ the derivative of a function $\chi$,
\[ A'_l = A_l + \partial_l \chi . \tag{5.68} \]
This change (5.68) of the four-potential with an arbitrary function is called a gauge transformation. It changes the scalar potential $\phi$ and the vector potential $\mathbf{A}$ to
\[ \phi' = \phi + \partial_0 \chi , \qquad \mathbf{A}' = \mathbf{A} - \operatorname{grad}\chi . \tag{5.69} \]
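Written out, the invariance of the field strengths under (5.68) and (5.69) takes one line in four-dimensional notation and one line for the fields $\mathbf{E}$ and $\mathbf{B}$; the second line uses only $\operatorname{rot}\operatorname{grad}\chi = 0$ and the interchangeability of partial derivatives:

```latex
F'_{kl} = \partial_k (A_l + \partial_l \chi) - \partial_l (A_k + \partial_k \chi)
        = F_{kl} + (\partial_k \partial_l - \partial_l \partial_k)\chi = F_{kl} ,
\\
\mathbf{B}' = \operatorname{rot}(\mathbf{A} - \operatorname{grad}\chi) = \mathbf{B} ,
\qquad
\mathbf{E}' = -\operatorname{grad}(\phi + \partial_0 \chi)
              - \partial_0 (\mathbf{A} - \operatorname{grad}\chi) = \mathbf{E} .
```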

Four-potentials related by gauge transformations cannot be distinguished from each other by any physical effect. Relativistic quantum field theory provides a deep reason for gauge invariance [14]. A quantized four-potential A creates, among others, states of negative norm. Unless all physical processes are gauge invariant these states participate in physical processes and lead to logical contradictions. So gauge invariance is required for logical consistency of theories with a quantized four-potential.

5.6 Wave Equation

If one raises the indices (A.14, 5.14) of $F_{kl} = \partial_k A_l - \partial_l A_k$,
\[ \partial^m = \eta^{mk} \partial_k , \qquad A^n = \eta^{nl} A_l , \tag{5.70} \]
and inserts $F^{mn} = \eta^{mk} \eta^{nl} F_{kl}$ into the inhomogeneous Maxwell equations (5.17), one obtains
\[ \partial_m \partial^m A^n - \partial_m \partial^n A^m = j^n . \tag{5.71} \]
In the second term the derivatives can be interchanged, $\partial_m \partial^n A^m = \partial^n \partial_m A^m$, and by a choice of the gauge one can satisfy the Lorenz condition²,
\[ \partial_m A^m = 0 , \tag{5.72} \]
which makes the second term vanish. Then each component of the four-potential satisfies separately an inhomogeneous wave equation
\[ \Box A^n = j^n . \tag{5.73} \]

² This gauge condition is due to Ludvig Valentin Lorenz, not Hendrik Antoon Lorentz [31, 36].

The differential operator in the wave equation (pronounced “box”)
\[ \Box = \partial_m \partial^m = \eta^{mk} \partial_m \partial_k = \partial_0{}^2 - \partial_1{}^2 - \partial_2{}^2 - \partial_3{}^2 \tag{5.74} \]
is called the wave operator or d’Alembert operator.

If $\partial_m A'^m = f$ does not vanish, then the Lorenz condition (5.72) holds for the gauge transformed four-potential $A = A' - \partial\chi$ if $\chi$ satisfies $\Box\chi = f$. However, the Lorenz condition $\partial_m A^m = 0$ does not completely fix the gauge. One can still change $A$ by the gradient $\partial\chi$, if $\chi$ satisfies the wave equation $\Box\chi = 0$.

Uniqueness and Domain of Dependence

The solution $u$ of the inhomogeneous wave equation
\[ \Box u = g \tag{5.75} \]
is uniquely determined by the source $g$ and by the initial values of $u$ and $\partial_0 u$ at $t = 0$ and depends at $(t, \mathbf{x})$ only on the source in the domain $G$ and the initial values in $I$ (5.36). This is deduced just as the corresponding result for the electromagnetic fields with the arguments of page 89 from the energy-momentum tensor, which corresponds to the wave equation,
\[ T_{mn} = \partial_m u\, \partial_n u - \frac{1}{2}\, \eta_{mn}\, \partial_r u\, \partial^r u . \tag{5.76} \]
It is conserved, $\partial_m T^{mn} = 0$, if $u$ satisfies the homogeneous wave equation $\Box u = 0$, and is positive definite for each future directed timelike vector $w = (1, w_1, w_2, w_3)$,
\[ \begin{aligned} 2 w_m T^{m0} &= (\partial_0 u)^2 + \partial_i u\, \partial_i u - 2 w_i\, \partial_i u\, \partial_0 u = (\partial_0 u - w_i \partial_i u)^2 + (\partial_i u\, \partial_i u) - (w_i \partial_i u)^2 \\ &\ge (\partial_0 u - w_i \partial_i u)^2 + (\partial_i u\, \partial_i u)(1 - w_j w_j) \ge 0 . \end{aligned} \tag{5.77} \]
The sum $w_m T^{m0}$ vanishes only if all derivatives $\partial_m u$ vanish. By the arguments of page 89 the difference of two solutions with the same source and with the same initial values is constant on each spacelike surface $S$ within the domain $G$. It vanishes, because $S$ and $I$ have common points. So it vanishes in $G$.

More generally, the solutions $u = (u^1, u^2, \dots)$ to systems of wave equations
\[ \Box u^i(x) + f^i(x, u(x), \partial u(x)) = h^i(x) , \qquad \Box u^i(x) = g^{mn}(x)\, \partial_m \partial_n u^i(x) , \tag{5.78} \]
depend at $x = (t, \mathbf{x})$, $t > 0$, only on the initial data $u(0, \mathbf{x})$, $\partial_t u(0, \mathbf{x})$, and on the inhomogeneity $h$ in the backward light cone of $x$ (of the metric $g$ with one timelike and $D - 1$ spacelike dimensions) whenever in a neighbourhood of $u = 0 = \partial u$, $f(x, 0, 0) = 0$, the jet function $|f(x, u, \partial u)|^2 < c\,(|u|^2 + |\partial u|^2)$ is bounded by a quadratic polynomial in $u$ and $\partial u$. In particular, the solutions of the Klein-Gordon equation, $(\Box + m^2) u = 0$, which describes massive particles, and even tachyonic fields with a negative $m^2$, only depend on the initial data and inhomogeneity in the backward light cone.

Plane Waves

All terms in the homogeneous wave equation are derivative terms of the same order, two, and the quadratic form $k^2 = k \cdot k = \eta^{mn} k_m k_n = k_0{}^2 - k_1{}^2 - k_2{}^2 - k_3{}^2$ which corresponds to the wave operator $\Box = \eta^{mn} \partial_m \partial_n$ allows for real, lightlike vectors $k \ne 0$, $k^2 = 0$. Therefore the wave equation $\Box u = 0$ has the remarkable property to allow plane waves $u(x) = f(k \cdot x)$ of arbitrary form $f$ (as long as $f$ has continuous second derivatives)
\[ \Box f(k \cdot x) = \eta^{mn} k_m k_n\, f''(k \cdot x) = k^2 f''(k \cdot x) = 0 . \tag{5.79} \]
Plane waves of the form $u(t, \mathbf{x}) = f(t - \mathbf{n}\,\mathbf{x})$ with arbitrary, smooth $f$ move with the speed of light in direction $\mathbf{n}$, $\mathbf{n}^2 = 1$, and solve the homogeneous wave equation of the four-dimensional spacetime. The plane wave deserves its name: it is constant on planes orthogonal to $\mathbf{n}$,
\[ u(t, \mathbf{x} + \mathbf{x}_0) = f(t - \mathbf{n}\,(\mathbf{x} + \mathbf{x}_0)) = f(t - \mathbf{n}\,\mathbf{x}) = u(t, \mathbf{x}) \quad \text{if} \quad \mathbf{n}\,\mathbf{x}_0 = 0 . \tag{5.80} \]
It is also constant along the worldlines of photons, which move with the speed of light in the direction $\mathbf{n}$,
\[ u(t, \mathbf{x}(0) + t\, \mathbf{n}) = f(t - \mathbf{n}\,(\mathbf{x}(0) + t\, \mathbf{n})) = f(-\mathbf{n}\,\mathbf{x}(0)) = u(0, \mathbf{x}(0)) . \tag{5.81} \]
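That a profile of arbitrary shape travelling with the speed of light solves the wave equation can be checked with finite differences. The following sketch uses a Gaussian profile and a particular unit vector; both choices, the step size, and the names `profile`, `u`, `second_difference` and `box` are assumptions of this example.

```python
import math

N_DIR = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit vector: 1/9 + 4/9 + 4/9 = 1


def profile(s):
    """Arbitrary smooth profile f; a Gaussian bump serves as an example."""
    return math.exp(-s * s)


def u(t, x, y, z):
    """Plane wave u(t, x) = f(t - n x) moving in direction n."""
    return profile(t - (N_DIR[0] * x + N_DIR[1] * y + N_DIR[2] * z))


def second_difference(func, point, axis, h=1e-3):
    """Central second difference of func along one coordinate axis."""
    up = list(point)
    down = list(point)
    up[axis] += h
    down[axis] -= h
    return (func(*up) - 2.0 * func(*point) + func(*down)) / (h * h)


def box(func, point, h=1e-3):
    """Finite-difference wave operator d_t^2 - d_x^2 - d_y^2 - d_z^2."""
    return second_difference(func, point, 0, h) - sum(
        second_difference(func, point, axis, h) for axis in (1, 2, 3))


point = (0.2, 0.1, -0.3, 0.4)  # sample event (t, x, y, z)
print(second_difference(u, point, 0), box(u, point))
```

The second time derivative alone is of order one at the sample event, while the full wave operator vanishes up to the discretization error of the finite differences.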

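A real, plane monochromatic wave $u$ is more particularly of the form
\[ u(t, \mathbf{x}) = \Re\, a\, e^{-i k \cdot x} = \Re\, a\, e^{i (\mathbf{k}\,\mathbf{x} - \omega t)} = A \cos(\mathbf{k}\,\mathbf{x} - \omega t + \alpha) , \tag{5.82} \]
where $a = A e^{i\alpha}$ is constant. The circular frequency $\omega = k^0$ is determined up to its sign by $0 = k^2 = \omega^2 - \mathbf{k}^2$ as the absolute value of the wave vector $\mathbf{k}$. We choose $\omega$ to be positive,
\[ \omega = k^0 = \sqrt{\mathbf{k}^2} , \tag{5.83} \]
and write the plane wave with negative $k^0$ as $e^{i k \cdot x}$.

A superposition of monochromatic plane waves with different wave vectors is called a wave packet
\[ u(t, \mathbf{x}) = \int \widetilde{dk}\, \Bigl( e^{i (\mathbf{k}\,\mathbf{x} - k^0 t)}\, a^*(\mathbf{k}) + e^{-i (\mathbf{k}\,\mathbf{x} - k^0 t)}\, a(\mathbf{k}) \Bigr) , \qquad k^0 = \sqrt{\mathbf{k}^2} , \tag{5.84} \]
\[ \widetilde{dk} = \frac{d^3k}{(2\pi)^3\, 2 k^0} . \tag{5.85} \]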

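It solves the wave equation, $\Box u = 0$, with initial values $u|_{t=0} = \psi$ and $(\partial_0 u)|_{t=0} = \phi$,
\[ \begin{aligned} \psi(\mathbf{x}) &= \int \widetilde{dk}\; e^{i \mathbf{k}\,\mathbf{x}} \bigl( a^*(\mathbf{k}) + a(-\mathbf{k}) \bigr) = \int \frac{d^3k}{(2\pi)^3}\, e^{i \mathbf{k}\,\mathbf{x}}\, \tilde\psi(\mathbf{k}) , \\ \phi(\mathbf{x}) &= \int \widetilde{dk}\; e^{i \mathbf{k}\,\mathbf{x}}\, (-i k^0) \bigl( a^*(\mathbf{k}) - a(-\mathbf{k}) \bigr) = \int \frac{d^3k}{(2\pi)^3}\, e^{i \mathbf{k}\,\mathbf{x}}\, \tilde\phi(\mathbf{k}) , \end{aligned} \tag{5.86} \]
which determine by their Fourier amplitudes $\tilde\psi$ and $\tilde\phi$ the amplitudes $a$ and $a^*$
\[ a^*(\mathbf{k}) = k^0\, \tilde\psi(\mathbf{k}) + i\, \tilde\phi(\mathbf{k}) , \qquad a(\mathbf{k}) = k^0\, \tilde\psi(-\mathbf{k}) - i\, \tilde\phi(-\mathbf{k}) . \tag{5.87} \]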

The components of the four-potential Am satisfy the wave equation and the Lorenz condition (5.72), which restricts the amplitudes by km am (k) = 0. Moreover, adding to Am the gradient ∂ m χ of a solution χ of the wave equation yields a gauge equivalent potential with the same field strengths and changes the amplitudes am (k) by −ikm c(k), where c(k) is the amplitude of χ . Therefore, for each k only two of the four amplitudes am (k) are unrestricted and physically relevant. An electromagnetic wave oscillates in the two directions, which are orthogonal to the wave vector. In other words: light of a given colour and direction is a superposition of two basic, transverse polarizations.

Huygens’ Principle

It seems physically plausible that the wave $\phi$ at the position $\mathbf{x}$ at time $t > 0$ should be a superposition of the earlier wave at $t = 0$ from all the places $\mathbf{x} + t\mathbf{n}$, $\mathbf{n}^2 = 1$, from which one can reach $\mathbf{x}$ with the speed of light in the run time $t$. To clarify this assumption we consider the mean value (5.61) $M_{t, \mathbf{x}}[\phi]$ of a function $\phi$ on spheres $\partial B_{t, \mathbf{x}}$ around the point $\mathbf{x}$ with radius $t$. The mean value (5.61) is also defined for negative $t$, because the mean value over all directions coincides with the mean value over all opposite directions, i.e. $M_{t, \mathbf{x}}[\phi]$ is an even function of $t$ with a continuation to $t = 0$,
\[ M_{t, \mathbf{x}}[\phi] = M_{-t, \mathbf{x}}[\phi] , \qquad M_{0, \mathbf{x}}[\phi] = \phi(\mathbf{x}) . \tag{5.88} \]
It has continuous second derivatives if $\phi$ has. The derivative with respect to $t$ is a surface integral over the normal derivative
\[ \partial_0 M_{t, \mathbf{x}}[\phi] = \frac{1}{4\pi} \int d\cos\theta\, d\varphi\; n^i \partial_i \phi(\mathbf{x} + t\, \mathbf{n}) = \frac{1}{4\pi t^2} \int_{\partial B_{t, \mathbf{x}}} d^2f\; n^i \partial_i \phi(\mathbf{y}) . \tag{5.89} \]
For $t > 0$ the vector $\mathbf{n}$ is the outward directed normal vector. By Gauß’ theorem the surface integral equals the volume integral over the divergence
\[ \partial_0 M_{t, \mathbf{x}}[\phi] = \frac{1}{4\pi t^2} \int_0^t dr\, r^2 \int d\cos\theta\, d\varphi\; \Delta\phi . \tag{5.90} \]
This holds also for negative $t$, because then $\mathbf{n}$ is the inward directed normal vector at $\mathbf{y} = \mathbf{x} + t\, \mathbf{n}$. If we differentiate again with respect to $t$ and if the derivative acts by the product rule on $1/t^2$, then we obtain a contribution which coincides with $\partial_0 M$ up to a factor $-2/t$. The derivative with respect to the upper boundary $t$ of the integral yields the integrand, which is the mean value of the derivative $\Delta\phi$ and which equals the derivative $\Delta M_{t, \mathbf{x}}[\phi]$ of the mean value,
\[ \partial_0{}^2 M_{t, \mathbf{x}}[\phi] = -\frac{2}{t}\, \partial_0 M_{t, \mathbf{x}}[\phi] + \Delta M_{t, \mathbf{x}}[\phi] . \tag{5.91} \]
So the mean value satisfies Darboux’s differential equation,
\[ \Bigl( \partial_0{}^2 + \frac{2}{t}\, \partial_0 - \Delta \Bigr) M_{t, \mathbf{x}}[\phi] = 0 , \tag{5.92} \]
consequently $u : (t, \mathbf{x}) \mapsto t\, M_{t, \mathbf{x}}[\phi]$ solves the homogeneous wave equation $\Box u = 0$. This holds also for $t = 0$, as the second derivatives of $t\, M_{t, \mathbf{x}}[\phi]$ are continuous. The solution $t\, M_{t, \mathbf{x}}[\phi]$ vanishes initially at $t = 0$, its time derivative at $t = 0$ in position $\mathbf{x}$ has the value $\phi(\mathbf{x})$.

The coefficients $\eta^{nm}$ of the wave equation $\eta^{mn} \partial_m \partial_n u = 0$ are constant (5.74). Therefore the derivative $\partial_0 (t\, M_{t, \mathbf{x}}[\psi])$ of a solution of the wave equation is also a solution. $\partial_0 (t\, M_{t, \mathbf{x}}[\psi])$ has vanishing time derivative at $t = 0$, because it is an even function of $t$, and assumes at $t = 0$ in position $\mathbf{x}$ the value $\psi(\mathbf{x})$. Therefore
\[ u(t, \mathbf{x}) = t\, M_{t, \mathbf{x}}[\phi] + \partial_0 \bigl( t\, M_{t, \mathbf{x}}[\psi] \bigr) \tag{5.93} \]
solves in $1 + 3$ dimensions the wave equation $\Box u = 0$ with the initial values
\[ u(0, \mathbf{x}) = \psi(\mathbf{x}) , \qquad \partial_0 u(0, \mathbf{x}) = \phi(\mathbf{x}) . \tag{5.94} \]

Each solution of the wave equation is uniquely determined by its initial values (page 96). Accordingly (5.93) is the solution with the initial values ψ and φ . As it depends on the mean of the initial values it changes little if the initial values change little. The initial value problem of the wave equation is well-posed. The solution (5.93) satisfies Huygens’ principle: at time t in position x it is composed additively from the initial values at time t = 0 from all points y, from which one can reach x in time t with the speed of light. This principle does not hold in even space dimensions and in one-dimensional space. There also the initial values from nearer places contribute to the solution. In d = 1, 2, 4, . . . space dimensions the initial values reverberate.
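A simple worked case makes (5.93) concrete. The initial data $\psi = 0$ and $\phi(\mathbf{x}) = \mathbf{x}^2$ are chosen here purely for illustration. The spherical mean of $(\mathbf{x} + t\,\mathbf{n})^2 = \mathbf{x}^2 + 2t\,\mathbf{x}\,\mathbf{n} + t^2$ drops the term linear in $\mathbf{n}$, so that

```latex
M_{t,\mathbf{x}}[\phi] = \mathbf{x}^2 + t^2 ,
\qquad
u(t,\mathbf{x}) = t\, M_{t,\mathbf{x}}[\phi] = t\,\mathbf{x}^2 + t^3 ,
\\
\partial_0{}^2 u = 6t , \quad \Delta u = 6t
\;\Rightarrow\; \Box u = 0 ,
\qquad
u(0,\mathbf{x}) = 0 , \quad \partial_0 u(0,\mathbf{x}) = \mathbf{x}^2 = \phi(\mathbf{x}) .
```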

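Retarded Potential

To get an idea of the solution of the inhomogeneous wave equation, we integrate
\[ \partial_0{}^2 u - \Delta u = \hat g \tag{5.95} \]
over the short time interval $\tau < t < \tau + \varepsilon$. The initial values of $u$ at time $\tau$ are taken to vanish, $u(\tau, \mathbf{x}) = 0$, $\partial_t u(\tau, \mathbf{x}) = 0$. On the left hand side we obtain $\partial_0 u(\tau + \varepsilon)$ up to terms which, because of the initial conditions, are of second order in $\varepsilon$; the result of the right hand side is simply given a name,
\[ \partial_0 u(\tau + \varepsilon) = \int_\tau^{\tau + \varepsilon} dt\, \bigl( \partial_0{}^2 u - \Delta u \bigr) = \int_\tau^{\tau + \varepsilon} dt\; \hat g(t, \mathbf{x}) = g(\tau + \varepsilon, \mathbf{x}) . \tag{5.96} \]
If the inhomogeneity $\hat g$ differs from zero only for this short interval, then subsequently for $t > \tau$ the corresponding solution $\varphi_\tau$ solves $\Box \varphi_\tau = 0$ with initial values
\[ \varphi_\tau(\tau, \mathbf{x}) = 0 , \qquad \partial_0 \varphi_\tau(t, \mathbf{x})\big|_{t = \tau} = g_\tau(\mathbf{x}) , \qquad g_\tau(\mathbf{x}) = g(\tau, \mathbf{x}) , \tag{5.97} \]
up to terms which vanish with $\varepsilon$. $\varphi_\tau$ is given by (5.93) translated in time by $\tau$
\[ \varphi_\tau(t, \mathbf{x}) = (t - \tau)\, M_{(t - \tau), \mathbf{x}}[g_\tau] . \tag{5.98} \]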

If the inhomogeneity $g$ acts longer, then we compose the solution from solutions which are generated by short sources: if $u_1$ is a solution of a linear inhomogeneous equation $L u_1 = g_1$ with source $g_1$ and if $u_2$ is a solution with source $g_2$, then the sum $u = u_1 + u_2$ is a solution with the sum of the sources, $L(u_1 + u_2) = g_1 + g_2$. To make these considerations precise, we consider the integral

$$u(t, \mathbf{x}) = \int_0^t\! d\tau\, \varphi_\tau(t, \mathbf{x})\,. \tag{5.99}$$

It vanishes initially at $t = 0$, as the domain of integration vanishes then. Its time derivative differentiates with respect to the upper boundary of the integral, where it yields the integrand, and with respect to the first argument of $\varphi_\tau$,

$$\partial_0 u(t, \mathbf{x}) = \varphi_t(t, \mathbf{x}) + \int_0^t\! d\tau\, \partial_0 \varphi_\tau(t, \mathbf{x}) = \int_0^t\! d\tau\, \partial_0 \varphi_\tau(t, \mathbf{x})\,. \tag{5.100}$$

The time derivative vanishes at $t = 0$ together with the domain of integration. The second time derivative is

$$\partial_0^{\,2} u(t, \mathbf{x}) = \partial_0 \varphi_\tau(t, \mathbf{x})\big|_{\tau=t} + \int_0^t\! d\tau\, \partial_0^{\,2} \varphi_\tau(t, \mathbf{x}) = g(t, \mathbf{x}) + \int_0^t\! d\tau\, \partial_0^{\,2} \varphi_\tau(t, \mathbf{x})\,. \tag{5.101}$$

If we add $-\Delta u = \int_0^t d\tau\, (-\Delta \varphi_\tau(t, \mathbf{x}))$ and take into account that $\varphi_\tau$ solves the wave equation we are left with

$$\Box u(t, \mathbf{x}) = g(t, \mathbf{x})\,. \tag{5.102}$$

So as a function of $t$ and $\mathbf{x}$ the integral

$$u(t, \mathbf{x}) = \int_0^t\! d\tau\, (t - \tau)\, M_{(t-\tau),\mathbf{x}}[g_\tau] \tag{5.103}$$

solves $\Box u = g$ with vanishing initial values $u(0, \mathbf{x}) = 0$ and $\partial_t u(0, \mathbf{x}) = 0$. To evaluate the integral for $t > 0$ we substitute $\tau(r) = t - r$ and integrate over $r$,

$$u(t, \mathbf{x}) = \int_0^t\! dr\, r\, M_{r,\mathbf{x}}[g_{t-r}] = \frac{1}{4\pi} \int_0^t\! dr\, r^2 \int_{-1}^{1}\! d\cos\theta \int_0^{2\pi}\! d\varphi\; \frac{1}{r}\, g(t - r, \mathbf{x} + r\, \mathbf{n})\,. \tag{5.104}$$

The three integrals extend over the points $\mathbf{y} = \mathbf{x} + r\, \mathbf{n}$ of the ball $B_{t,\mathbf{x}}$ around $\mathbf{x}$ with radius $t$, and $r = |\mathbf{x} - \mathbf{y}|$ is the distance of $\mathbf{y}$ to the center. So the solution of the inhomogeneous wave equation $\Box u = g$ with vanishing initial values is

$$t \ge 0: \qquad u(t, \mathbf{x}) = \frac{1}{4\pi} \int_{B_{t,\mathbf{x}}}\! d^3y\; \frac{g(t - |\mathbf{x} - \mathbf{y}|, \mathbf{y})}{|\mathbf{x} - \mathbf{y}|}\,. \tag{5.105}$$

For $t < 0$ we substitute $\tau(r) = t + r$ in (5.103). After the interchange of the lower and upper boundary the $r$-integral extends from $0$ to $|t| = -t$, and the three integrals are again a volume integral over the ball $B_{t,\mathbf{x}}$ around $\mathbf{x}$ with radius $|t|$,

$$t \le 0: \qquad u(t, \mathbf{x}) = \frac{1}{4\pi} \int_{B_{t,\mathbf{x}}}\! d^3y\; \frac{g(t + |\mathbf{x} - \mathbf{y}|, \mathbf{y})}{|\mathbf{x} - \mathbf{y}|}\,. \tag{5.106}$$

If more generally the initial values at $t = 0$ do not vanish, then the solution of the initial value problem of the inhomogeneous wave equation consists of this particular solution and a wave packet (5.93) with the initial values ($\varepsilon(t)$ is the sign of $t$ (B.106))

$$u(t, \mathbf{x}) = t\, M_{t,\mathbf{x}}[\phi] + \partial_0 \bigl( t\, M_{t,\mathbf{x}}[\psi] \bigr) + \frac{1}{4\pi} \int_{B_{t,\mathbf{x}}}\! d^3y\; \frac{g(t - \varepsilon(t)\, |\mathbf{x} - \mathbf{y}|, \mathbf{y})}{|\mathbf{x} - \mathbf{y}|}\,. \tag{5.107}$$

The solution exists, is unique and depends continuously on the initial data. So the initial value problem of the inhomogeneous wave equation is well-posed.

The retarded potential

$$u_{\text{ret}}(t, \mathbf{x}) = \frac{1}{4\pi} \int\! d^3y\; \frac{g(t - |\mathbf{x} - \mathbf{y}|, \mathbf{y})}{|\mathbf{x} - \mathbf{y}|} \tag{5.108}$$

is a particular solution of $\Box u = g$ which vanishes at early times if $g$ does. Different from (5.107), $u_{\text{ret}}$ may suffer infrared divergences, i.e. the integral over the unbounded integration volume may not exist if $|g|$ does not vanish on all backward light cones faster than $1/r^2$ with increasing distance $r$.


Poincaré Covariance of the Fields

The linear map of the source $g$ to its retarded potential $u_{\text{ret}}$ (5.108) is Poincaré equivariant: if $x \mapsto x' = \Lambda x + a$ is a time orientation preserving Poincaré transformation (3.12), then the transformed source $\hat{g} : x \mapsto g(\Lambda^{-1}(x - a))$ (A.35) generates the transformed retarded potential $\hat{u}_{\text{ret}} : x \mapsto u_{\text{ret}}(\Lambda^{-1}(x - a))$. To see this we shift in (5.108) the integration variable and integrate over $\mathbf{z} = \mathbf{x} - \mathbf{y}$,

$$4\pi\, u_{\text{ret}}(t, \mathbf{x}) = \int\! d^3z\; \frac{g(t - |\mathbf{z}|, \mathbf{x} - \mathbf{z})}{|\mathbf{z}|}\,. \tag{5.109}$$

If we denote $(t, \mathbf{x})$ by $x$ and $(|\mathbf{z}|, \mathbf{z})$ by $z$, $z^2 = 0$, then $f = 4\pi\, u_{\text{ret}}$ is written as the special case $m^2 = 0$ of an integral

$$f(x) = \int\! \frac{d^3z}{z^0}\; g(x - z)\,, \qquad z^0 = \sqrt{m^2 + \mathbf{z}^2}\,. \tag{5.110}$$

The integral extends over all points $x - z$ of the backward light cone of $x$, as $z^2 = 0$ and $z^0 > 0$. For $m^2 > 0$ the domain of integration, the points $z$ with $z^2 = m^2$, $z^0 > 0$, constitutes a mass shell.

For translations, $x' = x + a$, $\Lambda = 1$, the translated potential

$$\hat{f}(x) = f(x - a) = \int\! \frac{d^3z}{z^0}\; g(x - a - z) = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z) \tag{5.111}$$

is the potential of the translated source. Because Lorentz transformations are linear, $\Lambda^{-1} x - z = \Lambda^{-1}(x - z')$ with $z' = \Lambda z$, one has

$$\hat{f}(x) = f(\Lambda^{-1} x) = \int\! \frac{d^3z}{z^0}\; g(\Lambda^{-1} x - z) = \int\! \frac{d^3z}{z^0}\; g(\Lambda^{-1}(x - z')) = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z')\,. \tag{5.112}$$

If in particular $\Lambda$ is a rotation or rotary reflection, $z'^0 = z^0$, $\mathbf{z}' = D \mathbf{z}$, then we can integrate over the rotated variables $(z'^1, z'^2, z'^3)$. The modulus of the determinant of the Jacobi matrix $J = \partial \mathbf{z}' / \partial \mathbf{z} = D$ is $|\det D| = 1$ (6.6), and $z'^0 = \sqrt{m^2 + \mathbf{z}'^2}$ is unchanged by a rotation and coincides with $z^0 = \sqrt{m^2 + \mathbf{z}^2}$. So for rotations one concludes

$$\hat{f}(x) = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z') = \int\! \frac{d^3z'}{z'^0}\; \hat{g}(x - z') = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z)\,. \tag{5.113}$$

In the last step we have named the integration variable $z$ again. So the rotated potential $\hat{u}_{\text{ret}}$ is the potential of the rotated source $\hat{g}$.

The measure $d^3z / z^0$ is invariant under arbitrary time orientation preserving Lorentz transformations. We have to show this only for boosts in $x$-direction because each Lorentz transformation can be written as such a boost which is preceded and followed by a rotation (6.35). The boost $L_{v,x}$ in $x$-direction (3.9), applied to the four-vector $(\sqrt{m^2 + \mathbf{z}^2}, \mathbf{z})$, gives

$$\sqrt{m^2 + \mathbf{z}'^2} = \frac{\sqrt{m^2 + \mathbf{z}^2} + v\, z_x}{\sqrt{1 - v^2}} = \frac{1}{\sqrt{1 - v^2}} \Bigl( 1 + \frac{v\, z_x}{\sqrt{m^2 + \mathbf{z}^2}} \Bigr) \sqrt{m^2 + \mathbf{z}^2}\,,$$
$$z'_x = \frac{z_x + v \sqrt{m^2 + \mathbf{z}^2}}{\sqrt{1 - v^2}}\,, \qquad z'_y = z_y\,, \qquad z'_z = z_z\,. \tag{5.114}$$

The determinant of the $3 \times 3$ Jacobi matrix $J$, $J^i{}_j = \partial z'^i / \partial z^j$, is

$$\det J = \frac{\partial z'_x}{\partial z_x} = \frac{1}{\sqrt{1 - v^2}} \Bigl( 1 + \frac{v\, z_x}{\sqrt{m^2 + \mathbf{z}^2}} \Bigr)\,. \tag{5.115}$$

Altogether one has for each $m^2 \ge 0$

$$\frac{d^3z'}{\sqrt{m^2 + \mathbf{z}'^2}} = \frac{\det J\; d^3z}{\sqrt{m^2 + \mathbf{z}'^2}} = \frac{d^3z}{\sqrt{m^2 + \mathbf{z}^2}}\,. \tag{5.116}$$
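The invariance (5.116) lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions, not part of the original text: it applies the boost of (5.114) to the spatial part of a vector on the mass shell, estimates the Jacobi determinant by central finite differences, and returns the ratio of $\det J / z'^0$ to $1/z^0$, which should equal 1. The function names and the sample values are hypothetical.

```python
import math

def energy(z, m):
    """z^0 = sqrt(m^2 + z^2) on the mass shell."""
    return math.sqrt(m * m + sum(c * c for c in z))

def boost_x(z, v, m):
    """Spatial part of the boosted four-vector, following (5.114)."""
    zx, zy, zz = z
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (zx + v * energy(z, m)), zy, zz)

def jacobian_det(z, v, m, h=1e-6):
    """Determinant of J^i_j = d z'^i / d z^j by central finite differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        fp, fm = boost_x(zp, v, m), boost_x(zm, v, m)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
            - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
            + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))

def measure_ratio(z, v, m):
    """(det J / z'^0) divided by (1 / z^0); equals 1 if d^3z / z^0 is invariant."""
    z_prime = boost_x(z, v, m)
    return jacobian_det(z, v, m) * energy(z, m) / energy(z_prime, m)

if __name__ == "__main__":
    print(measure_ratio((0.3, -1.2, 0.5), 0.6, 1.0))   # massive shell
    print(measure_ratio((0.3, -1.2, 0.5), -0.8, 0.0))  # light cone, m = 0
```

The same check works for $m = 0$ as long as $\mathbf{z} \ne 0$, where the square root is differentiable.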

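After changing and renaming the integration variable in (5.112) we can conclude

$$\hat{f}(x) = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z') = \int\! \frac{d^3z'}{z'^0}\; \hat{g}(x - z') = \int\! \frac{d^3z}{z^0}\; \hat{g}(x - z)\,. \tag{5.117}$$

So the Lorentz transformed potential corresponds to the Lorentz transformed source. For the same reason the map of the current to the retarded four-potential

$$A^m_{\text{ret}}(x) = \frac{1}{4\pi} \int\! \frac{d^3z}{|z^0|}\; j^m(x - z)\Big|_{z^0 = \sqrt{\mathbf{z}^2}} \tag{5.118}$$

is equivariant under Poincaré transformations

$$\hat{A}^m(x) = \Lambda^m{}_n\, A^n(\Lambda^{-1}(x - a))\,, \qquad \hat{j}^n(x) = \Lambda^n{}_m\, j^m(\Lambda^{-1}(x - a))\,. \tag{5.119}$$

The transformed current $\hat{j}$ is conserved if $j$ is,

$$\partial_{x^n} \hat{j}^n(x) = \Lambda^n{}_m\, \partial_{x^n} \bigl( \Lambda^{-1}(x - a) \bigr)^r\, \partial_{z^r} j^m(z)\Big|_{z = \Lambda^{-1}(x - a)} = \Lambda^n{}_m\, (\Lambda^{-1})^r{}_n\, \partial_{z^r} j^m(z) = \delta^r{}_m\, \partial_{z^r} j^m(z) = \partial_{z^m} j^m(z) = 0\,. \tag{5.120}$$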

Also the map of the amplitudes to the wave packet (5.84) is Poincaré equivariant under suitable transformations of the amplitudes,

$$A^n_{\text{hom}}(x) = \int\! \tilde{dk}\, \bigl( a^{n*}(k)\, e^{i k \cdot x} + a^n(k)\, e^{-i k \cdot x} \bigr)\Big|_{k^0 = \sqrt{\mathbf{k}^2}}\,, \tag{5.121}$$

because the integration measure $\tilde{dk}$ (5.85) and the domain of integration are both Poincaré invariant (5.116).

Under a translation $x \mapsto x + b$ the amplitudes $a = (a^0, a^1, a^2, a^3)$ change by multiplication with the function $f_b : k \mapsto e^{i k \cdot b}$, $\hat{a} = f_b\, a$ (a transformation which is not adjoint to the transformations $N_g$ and $M_g$ of the target and base manifold (A.35)),

$$\hat{A}^n(x) = A^n(x - b) = \int\! \tilde{dk}\, \bigl( (a^{n*}(k)\, e^{-i k \cdot b})\, e^{i k \cdot x} + (a^n(k)\, e^{i k \cdot b})\, e^{-i k \cdot x} \bigr)\,. \tag{5.122}$$

Under Lorentz transformations, the amplitudes transform as a vector field which is defined on the massless shell of momenta $k$, $k^2 = 0$. To be specific, if we evaluate the transformed field $\hat{A}^n(x) = \Lambda^n{}_m\, A^m(\Lambda^{-1} x)$, then the scalar product $k \cdot (\Lambda^{-1} x)$ in (5.121) equals $k' \cdot x$ with $k' = \Lambda k$, because scalar products are Lorentz invariant. If we integrate over the three spatial components of $k' = \Lambda k$ rather than of $k$, use $a^m(k) = a^m(\Lambda^{-1} k')$ and the fact that $\tilde{dk} = \tilde{dk}'$ is a Lorentz invariant measure (5.116), then the transformed wave packet satisfies

$$\hat{A}^n(x) = \Lambda^n{}_m\, A^m(\Lambda^{-1} x) = \Lambda^n{}_m \int\! \tilde{dk}'\, \bigl( a^{m*}(\Lambda^{-1} k')\, e^{i k' \cdot x} + a^m(\Lambda^{-1} k')\, e^{-i k' \cdot x} \bigr)\,. \tag{5.123}$$

The denomination of the integration variables, $k$ or $k'$, is irrelevant. So the Lorentz transformed wave packet corresponds to the Lorentz transformed amplitudes

$$\hat{a}^n(k) = \Lambda^n{}_m\, a^m(\Lambda^{-1} k)\,. \tag{5.124}$$

The components of the wave packet are related by the Lorenz condition $\partial_m A^m = 0$ (5.72), which is already satisfied by the retarded potential,

$$\partial_m \int\! \frac{d^3z}{|\mathbf{z}|}\; j^m(x - z) = \int\! \frac{d^3z}{|\mathbf{z}|}\; \partial_m j^m(x - z) = 0\,. \tag{5.125}$$

Therefore the amplitudes $a^{n*}(k)$ of the wave packet are restricted,

$$k_n\, a^{n*}(k) = 0\,. \tag{5.126}$$

Moreover, one can change them by a gauge transformation by the gradient of a function $\chi$ (5.68) with amplitude $c^*(k)$ which satisfies the homogeneous wave equation. Thereby the amplitudes change by

$$a'^{\,*}_n(k) = a^*_n(k) + i\, k_n\, c^*(k)\,, \qquad k^2 = 0\,. \tag{5.127}$$

By (5.126) only three of the four complex amplitudes of the four-potential are independent. The amplitude in the direction of $k$ can be gauged away. So the wave packet contains for given wave vector $k$ two degrees of freedom. They correspond to two independent transverse directions of polarization.
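The counting of the two degrees of freedom can be made concrete for a single wave vector. The following sketch is only an illustration: it assumes the signature $(+,-,-,-)$, a lightlike $k$ along the $z$-axis, and a hypothetical amplitude with upper components `a` that satisfies the constraint (5.126). A shift along $k$ as in (5.127), written here with upper indices, removes the components $a^0$ and $a^3$ and leaves the two transverse ones untouched.

```python
def minkowski_dot(a, b):
    """Scalar product with signature (+,-,-,-); the entries may be complex."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def gauge_fix(a, k):
    """Shift a^n -> a^n + i k^n c, with c chosen such that the new a^0 vanishes."""
    c = 1j * a[0] / k[0]
    return tuple(an + 1j * kn * c for an, kn in zip(a, k))

OMEGA = 2.0                                   # hypothetical frequency
k = (OMEGA, 0.0, 0.0, OMEGA)                  # lightlike, k.k = 0
# Hypothetical amplitude; k.a = OMEGA * (a^0 - a^3) = 0 because a^0 = a^3.
a = (0.5 + 0.2j, 1.0 - 1.0j, 0.3j, 0.5 + 0.2j)
a_fixed = gauge_fix(a, k)

if __name__ == "__main__":
    print("constraint k.a =", minkowski_dot(k, a))
    print("after gauge fixing:", a_fixed)
```

Only the components `a_fixed[1]` and `a_fixed[2]` survive, which are the two transverse polarizations for this choice of $k$.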


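5.7 Charged Point Particle

A particle with charge $q$, which is forced to traverse a worldline $t \mapsto (t, \mathbf{z}(t))$ with bounded velocity, $\dot{\mathbf{z}}^2(t) \le v_{\max}^2 < 1$, generates by its charge and current densities

$$\rho(t, \mathbf{y}) = q\, \delta^3(\mathbf{y} - \mathbf{z}(t))\,, \qquad \mathbf{j}(t, \mathbf{y}) = q\, \frac{d\mathbf{z}}{dt}\, \delta^3(\mathbf{y} - \mathbf{z}(t)) \tag{5.128}$$

the scalar potential (5.118)

$$\phi(t, \mathbf{x}) = \frac{1}{4\pi} \int\! d^3y\; \frac{\rho(t_{\text{ret}}, \mathbf{y})}{|\mathbf{x} - \mathbf{y}|} = \frac{q}{4\pi} \int\! d^3y\; \frac{\delta^3(\mathbf{y} - \mathbf{z}(t_{\text{ret}}))}{|\mathbf{x} - \mathbf{y}|}\,, \qquad t_{\text{ret}} = t - |\mathbf{x} - \mathbf{y}|\,. \tag{5.129}$$

The argument of the $\delta$-function, $\mathbf{y}' = \mathbf{y} - \mathbf{z}(t_{\text{ret}})$, is a composite function of the integration variable $\mathbf{y}$, as the retarded time $t_{\text{ret}} = t - |\mathbf{x} - \mathbf{y}|$ depends on $\mathbf{y}$. By the substitution theorem for integrals such a composite $\delta$-function, applied to a test function $f$, yields

$$\int\! d^3y\; \delta^3(\mathbf{y}'(\mathbf{y}))\, f(\mathbf{y}) = \int\! d^3y'\; \Bigl| \det \frac{\partial \mathbf{y}}{\partial \mathbf{y}'} \Bigr|\, \delta^3(\mathbf{y}')\, f(\mathbf{y}(\mathbf{y}')) = \frac{1}{\bigl| \det \frac{\partial \mathbf{y}'}{\partial \mathbf{y}} \bigr|_{\hat{\mathbf{y}}}}\, f(\hat{\mathbf{y}})\,, \qquad \mathbf{y}'(\hat{\mathbf{y}}) = 0\,. \tag{5.130}$$

In the case at hand, $\mathbf{y}'$ vanishes at $\hat{\mathbf{y}} = \mathbf{z}(t_{\text{ret}})$. The test function $f$ is the Coulomb potential $1/|\mathbf{x} - \mathbf{y}|$, which takes the value $1/|\mathbf{x} - \mathbf{z}(t_{\text{ret}})|$ there. The Jacobi matrix of the substitution and its determinant are

$$\frac{\partial y'^i}{\partial y^j} = \delta^i{}_j + N^i{}_j\,, \qquad N^i{}_j = -\frac{dz^i}{dt}\, \frac{x^j - z^j}{|\mathbf{x} - \mathbf{z}|}\,, \qquad \det \frac{\partial \mathbf{y}'}{\partial \mathbf{y}} = 1 - \frac{d\mathbf{z}}{dt} \cdot \frac{\mathbf{x} - \mathbf{z}}{|\mathbf{x} - \mathbf{z}|}\,, \tag{5.131}$$

where we exploited that $N$ has rank 1, which implies $\det(1 + N) = 1 + \operatorname{tr} N$. We obtain the scalar potential and analogously the vector potential

$$4\pi\, \phi(t, \mathbf{x}) = \frac{q}{|\mathbf{x} - \mathbf{z}(t_{\text{ret}})| \Bigl( 1 - \frac{d\mathbf{z}}{dt} \cdot \frac{\mathbf{x} - \mathbf{z}}{|\mathbf{x} - \mathbf{z}|} \Bigr)}\,, \qquad 4\pi\, \mathbf{A}(t, \mathbf{x}) = \frac{q}{|\mathbf{x} - \mathbf{z}(t_{\text{ret}})| \Bigl( 1 - \frac{d\mathbf{z}}{dt} \cdot \frac{\mathbf{x} - \mathbf{z}}{|\mathbf{x} - \mathbf{z}|} \Bigr)}\, \frac{d\mathbf{z}}{dt}\,. \tag{5.132}$$

This four-potential is named after Alfred-Marie Liénard and Emil Wiechert. In the form

$$A^m(x) = \frac{q}{4\pi}\, \frac{u^m}{y \cdot u} \tag{5.133}$$

one can calculate the corresponding field strength with tolerable algebraic effort. Here $u = \frac{dz}{ds}$ is the normalized tangent vector to the worldline of the particle at its intersection $z(s(x))$ with the backward light cone of $x$. We parameterize the worldline with its proper time $s$, which is shown by a watch which is carried along,

$$u = \frac{dz}{ds} = \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 \\ \mathbf{v} \end{pmatrix}\,, \qquad \mathbf{v} = \frac{d\mathbf{z}}{dt}\,, \qquad u^2 = 1\,. \tag{5.134}$$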
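To evaluate the scalar potential of (5.132) in practice one first has to find the retarded time, i.e. the intersection of the worldline with the backward light cone of the observer. The sketch below is one possible way to do this and is not taken from the original text: it solves $t_{\text{ret}} = t - |\mathbf{x} - \mathbf{z}(t_{\text{ret}})|$ by bisection, which works because the left-hand side minus the right-hand side is strictly monotonic for speeds below 1. The helper names and the uniformly moving test charge with speed `V` are assumptions of the example; for that case the result is compared with the well-known closed form of the potential of a uniformly moving charge.

```python
import math

def retarded_time(t, x, worldline):
    """Solve t_ret = t - |x - z(t_ret)| by bisection.

    f(s) = t - s - |x - z(s)| decreases strictly in s for speeds below 1,
    so the root is unique.
    """
    def f(s):
        return t - s - math.dist(x, worldline(s))

    hi = t                       # f(t) <= 0
    step = 1.0
    lo = t - step
    while f(lo) < 0.0:           # walk back until the root is bracketed
        step *= 2.0
        lo = t - step
    for _ in range(100):         # fixed number of halvings
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scalar_potential(t, x, worldline, velocity, q=1.0):
    """phi = q / (4 pi (r - v.(x - z))) at the retarded time, cf. (5.132)."""
    s = retarded_time(t, x, worldline)
    z, v = worldline(s), velocity(s)
    d = [xi - zi for xi, zi in zip(x, z)]
    r = math.sqrt(sum(c * c for c in d))
    return q / (4.0 * math.pi * (r - sum(vi * di for vi, di in zip(v, d))))

V = 0.6                          # hypothetical constant speed (units with c = 1)

def uniform_worldline(s):
    return (V * s, 0.0, 0.0)

def uniform_velocity(s):
    return (V, 0.0, 0.0)

def uniform_closed_form(t, x, q=1.0):
    """Known potential of a charge moving uniformly along the x-axis."""
    rho2 = x[1] ** 2 + x[2] ** 2
    return q / (4.0 * math.pi * math.sqrt((x[0] - V * t) ** 2 + (1.0 - V * V) * rho2))

if __name__ == "__main__":
    obs = (1.5, 0.7, -0.4)
    print(scalar_potential(2.0, obs, uniform_worldline, uniform_velocity))
    print(uniform_closed_form(2.0, obs))
```

For an accelerated worldline only `worldline` and `velocity` have to be exchanged; the bisection itself does not depend on the form of the motion as long as the speed stays below 1.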


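The lightlike vector $y = x - z(s(x))$ points from the cause to the effect: from the event $z(s(x))$, in which the particle crosses the backward light cone of $x$, to the observer at $x$ in direction $\mathbf{n}$ and in distance $r$,

$$y = \begin{pmatrix} |\mathbf{x} - \mathbf{z}(s(x))| \\ \mathbf{x} - \mathbf{z}(s(x)) \end{pmatrix} = r \begin{pmatrix} 1 \\ \mathbf{n} \end{pmatrix}\,, \qquad y^2 = 0\,, \qquad y \cdot u = \frac{r}{\sqrt{1 - v^2}}\, (1 - \mathbf{v} \cdot \mathbf{n})\,. \tag{5.135}$$

The proper time $s$ on the worldline defines the time $s(x)$ which an observer at $x$ reads from the clock. It has the constant value $s$ on the forward light cone of $z(s)$. The gradient $k_m$ of $s(x)$ is calculated by differentiation of $y^2 = 0$,

$$0 = (\delta_m{}^n - u^n\, \partial_m s)\, y_n\,, \qquad k_m := \partial_m s = \frac{y_m}{y \cdot u}\,. \tag{5.136}$$

It is lightlike, $k^2 = 0$, and given by

$$k = \frac{\sqrt{1 - v^2}}{1 - \mathbf{v} \cdot \mathbf{n}} \begin{pmatrix} 1 \\ \mathbf{n} \end{pmatrix}\,. \tag{5.137}$$

With its help we can express the derivatives of $y$ and $y \cdot u$ by the four-acceleration

$$\dot{u} = \frac{du}{ds} = \frac{dt}{ds}\, \frac{d}{dt} \Bigl[ \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 \\ \mathbf{v} \end{pmatrix} \Bigr] = \frac{1}{(1 - v^2)^2} \begin{pmatrix} \mathbf{v} \cdot \mathbf{a} \\ \mathbf{a}\, (1 - v^2) + \mathbf{v}\, (\mathbf{v} \cdot \mathbf{a}) \end{pmatrix}\,, \qquad \mathbf{a} = \frac{d^2 \mathbf{z}}{dt^2}\,, \tag{5.138}$$

and the quantities which we have introduced,

$$\partial_m y^n = \delta_m{}^n - u^n k_m\,, \qquad \partial_m (y \cdot u) = (\partial_m y^n)\, u_n + y \cdot \dot{u}\; k_m = u_m + (y \cdot \dot{u} - 1)\, k_m\,. \tag{5.139}$$

We obtain the field strengths

$$F_{mn} = \partial_m A_n - \partial_n A_m = -\frac{q}{4\pi\, (y \cdot u)^2}\, \partial_m (y \cdot u)\, u_n + \frac{q}{4\pi\, (y \cdot u)}\, \partial_m u_n - (m \leftrightarrow n) = k_m w_n - k_n w_m\,,$$
$$w_m = \frac{q}{4\pi\, (y \cdot u)^2}\, u_m + \frac{q}{4\pi\, (y \cdot u)}\, (\dot{u}_m - u_m\; k \cdot \dot{u})\,, \tag{5.140}$$

and in particular the electric field, $E^i = F_{0i} = k_0 w_i - k_i w_0$,

$$\mathbf{E}(t, \mathbf{x}) = \frac{q\, (1 - v^2)}{4\pi\, r^2\, (1 - \mathbf{v} \cdot \mathbf{n})^3}\, (\mathbf{n} - \mathbf{v}) + \frac{q}{4\pi\, r\, (1 - \mathbf{v} \cdot \mathbf{n})^3}\; \mathbf{n} \times \bigl( (\mathbf{n} - \mathbf{v}) \times \mathbf{a} \bigr)\,. \tag{5.141}$$

The part which does not depend on the acceleration decreases with distance as $1/r^2$. It does not point in the direction $\mathbf{n}$ from the cause, the event $z$, to the effect at $x$, but in the direction $\mathbf{x} - (\mathbf{z} + r \mathbf{v})$, away from the destination $\mathbf{z} + r \mathbf{v}$ which the particle would reach with constant velocity $\mathbf{v}$ at the moment when its field takes effect at $x$. The acceleration dependent part, the radiation field, decreases with $1/r$ and is orthogonal to the direction $\mathbf{n}$ from the cause to its effect at $x$.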
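The two parts of the electric field (5.141) can be evaluated separately to check the statements about their directions. The sketch below is an illustration only, with hypothetical sample values for the retarded quantities $\mathbf{n}$, $\mathbf{v}$, $\mathbf{a}$ and $r$; the function names are not from the original text. It also forms $\mathbf{n} \times \mathbf{E}$ and the vector product $\mathbf{E} \times (\mathbf{n} \times \mathbf{E})$ for the radiation part, which should point along $\mathbf{n}$.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lw_electric_field(n, v, a, r, q=1.0):
    """Velocity part and radiation part of (5.141) from retarded quantities.

    n: unit vector from the retarded position to the observer,
    v, a: velocity and acceleration at the retarded time, r: distance.
    """
    kappa = 1.0 - dot(v, n)
    n_minus_v = tuple(ni - vi for ni, vi in zip(n, v))
    pref_vel = q * (1.0 - dot(v, v)) / (4.0 * math.pi * r * r * kappa ** 3)
    pref_rad = q / (4.0 * math.pi * r * kappa ** 3)
    rad = cross(n, cross(n_minus_v, a))
    velocity_part = tuple(pref_vel * c for c in n_minus_v)
    radiation_part = tuple(pref_rad * c for c in rad)
    return velocity_part, radiation_part

if __name__ == "__main__":
    n = (2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0)      # hypothetical unit vector
    v = (0.3, -0.4, 0.1)                        # hypothetical velocity, |v| < 1
    a = (0.0, 0.5, -0.2)                        # hypothetical acceleration
    e_vel, e_rad = lw_electric_field(n, v, a, r=4.0)
    b_rad = cross(n, e_rad)                     # magnetic field n x E
    s_rad = cross(e_rad, b_rad)                 # energy current density E x B
    print("n . E_rad =", dot(n, e_rad))
    print("S_rad =", s_rad, " |E_rad|^2 n =", tuple(dot(e_rad, e_rad) * c for c in n))
```

For $\mathbf{v} = 0$ and $\mathbf{a} = 0$ the velocity part reduces to the Coulomb field $q\, \mathbf{n} / (4\pi r^2)$.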


The magnetic field of the particle is orthogonal to $\mathbf{n}$ and $\mathbf{E}$,

$$B^k = -\varepsilon_{ijk}\, k_i\, w_j = \varepsilon_{ijk}\, k^i\, w_j = \varepsilon_{ijk}\, \frac{k^i}{k^0}\, E^j\,, \qquad \mathbf{B} = \mathbf{n} \times \mathbf{E}\,. \tag{5.142}$$

The energy current density $\mathbf{S} = \mathbf{E} \times \mathbf{B}$ of the radiation points into the direction $\mathbf{n}$ away from the cause: an accelerated charge loses energy by radiation.

Mathematically and physically the classical description of matter by point particles is untenable. If no interaction contributes with a negative energy density, then the charge of the electron cannot be concentrated in a sphere with a smaller radius than half the electron radius

$$r_{\text{electron}} = \frac{e^2}{4\pi\, \varepsilon_0\, m_{\text{electron}}\, c^2} = 2.818 \cdot 10^{-15}\, \text{m}\,. \tag{5.143}$$

Otherwise the electric field outside of an electron at rest would contain already more energy than the rest energy of the electron. Relativistic quantum theory solves this problem, as particles cannot be localized for a finite time better than their Compton wavelength (page 151), which is roughly 137 times their critical radius.
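The numbers in this paragraph are easy to reproduce. The following sketch is an illustration with hard-coded SI values (CODATA 2018, an assumption of the example): it evaluates (5.143), checks that the electrostatic field energy $e^2 / (8\pi \varepsilon_0 R)$ outside a sphere of radius $R$ equals the rest energy at $R = r_{\text{electron}}/2$, and forms the ratio of the reduced Compton wavelength $\hbar / (m c)$ to the electron radius. Reading "Compton wavelength" as the reduced one is an interpretation made here; it reproduces the factor of about 137 relative to $r_{\text{electron}}$.

```python
import math

# SI values (CODATA 2018); e and c are exact by definition.
E_CHARGE = 1.602176634e-19      # C
EPSILON_0 = 8.8541878128e-12    # F/m
M_ELECTRON = 9.1093837015e-31   # kg
C_LIGHT = 299792458.0           # m/s
HBAR = 1.054571817e-34          # J s

def classical_electron_radius():
    """r_electron of (5.143) in metres."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT ** 2)

def field_energy_outside(radius):
    """Energy of the Coulomb field of charge e outside a sphere of the given radius."""
    return E_CHARGE ** 2 / (8.0 * math.pi * EPSILON_0 * radius)

def reduced_compton_wavelength():
    """hbar / (m c) in metres."""
    return HBAR / (M_ELECTRON * C_LIGHT)

if __name__ == "__main__":
    r_e = classical_electron_radius()
    rest_energy = M_ELECTRON * C_LIGHT ** 2
    print("r_electron [m]:", r_e)
    print("field energy at r_e/2 over rest energy:", field_energy_outside(r_e / 2.0) / rest_energy)
    print("reduced Compton wavelength over r_electron:", reduced_compton_wavelength() / r_e)
```

The last ratio is the inverse fine-structure constant.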

5.8 Action Principle and Noether’s Theorems

Each field $f$, e.g. the field strength $\mathbf{E}$ or $\mathbf{B}$, is a differentiable map

$$f : \begin{cases} M \to N \\ x \mapsto f(x) \end{cases} \tag{5.144}$$

of an $n$-dimensional base space $M$ to a $d$-dimensional target space $N$. It defines a corresponding map, its lift $\hat{f}$ to the jet space $J_k$,

$$\hat{f} : \begin{cases} M \to J_k \\ x \mapsto \hat{f}(x) = (x, f(x), \partial f(x), \dots, \partial^k f(x)) \end{cases} \tag{5.145}$$

which maps each $x \in M$ to $x$ and the values of $f$ and all its partial derivatives up to order $k$ at this point.

We denote a typical point of $J_k$ by its coordinates $(x, y, y_{(1)}, y_{(2)}, \dots, y_{(k)})$. Here $x$ denotes a point in the base space, $y$ is a $d$-tuple with components $y^r$, $y_{(1)}$ a point in the tangent space denoted by components $y^r_m$, where $m$ ranges over $n$ values. $y_{(l)}$ has components $y^r_{m_1 m_2 \dots m_l}$ where each $m_i$ ranges over $n$ values. Because the order of partial derivatives can be exchanged, the components of the variables $y_{(l)}$ are completely symmetric under each permutation $\pi$ of the derivative indices,

$$y^r_{m_1 m_2 \dots m_l} = y^r_{m_{\pi(1)} m_{\pi(2)} \dots m_{\pi(l)}}\,. \tag{5.146}$$

For a unique identification one can order them lexicographically and restrict the independent ones by $m_1 \le m_2 \le \dots \le m_l$.


We define the convective derivatives $d_m$ of jet functions in analogy with (4.15),

$$d_m \phi = \Bigl( \partial_{x^m} + \sum_{l \ge 0} \sum_r \sum_{m_1 \le m_2 \le \dots \le m_l} y^r_{m\, m_1 m_2 \dots m_l}\; \partial_{y^r_{m_1 m_2 \dots m_l}} \Bigr) \phi \tag{5.147}$$

such that their composition with the lift gives the derivatives of the composed function (4.14),

$$(d_m \phi) \circ \hat{f} = \partial_{x^m} (\phi \circ \hat{f})\,. \tag{5.148}$$

A curve in the space of fields $\mathcal{F}$ is a one parameter family of fields $\lambda \mapsto f_\lambda$, where $\lambda$ ranges in some interval. Jet functions $\phi$, evaluated on the lift of $f_\lambda$, define by their change as a function of $\lambda$ and by the chain rule the convective change $\delta$,

$$\frac{\partial}{\partial \lambda} (\phi \circ \hat{f}_\lambda) = (\delta \phi) \circ \hat{f}_\lambda\,, \tag{5.149}$$

which acts as differential operator on jet functions by

$$\delta = \delta y^r\, \partial_{y^r} + (d_m \delta y^r)\, \partial_{y^r_m} + \dots + \sum_r \sum_{m_1 \le m_2 \le \dots \le m_l} (d_{m_1} \dots d_{m_l} \delta y^r)\; \partial_{y^r_{m_1 m_2 \dots m_l}} + \dots\,, \tag{5.150}$$

where $\delta y \circ \hat{f}_\lambda = \frac{\partial}{\partial \lambda} f_\lambda$ is the change of $f_\lambda$ and the derivatives of $f_\lambda$ change by the derivatives of the change, $\delta y^r_{m_1 m_2 \dots m_k} \circ \hat{f}_\lambda = (d_{m_1} d_{m_2} \dots d_{m_k} \delta y^r) \circ \hat{f}_\lambda$. Derivatives with respect to coordinates and the parameter may be interchanged (4.19), so the convective derivative $d_m$ commutes with the convective change $\delta$.

If there are curves through each point of $\mathcal{F}$ such that for all $f \in \mathcal{F}$ the change $\delta f = \delta y \circ \hat{f}$ is given by a $d$-tuple of jet functions $\delta y$, evaluated on the lift $\hat{f}$, we call $\delta y$ and $\delta$ local (not to be confused with gauged).

The Lagrangian $\mathcal{L}$ is a function of some jet space, e.g. $J_1$,

$$\mathcal{L} : \begin{cases} D \subset \mathbb{R}^{n + d + d\,n} \to \mathbb{R} \\ (x, y, y_{(1)}) \mapsto \mathcal{L}(x, y, y_{(1)}) \end{cases} \tag{5.151}$$

It defines a local functional on the space $\mathcal{F}$ of fields $f$, the action $S$,

$$S : \begin{cases} \mathcal{F} \to \mathbb{R} \\ f \mapsto S[f] = \int\! d^n x\, (\mathcal{L} \circ \hat{f})(x) = \int\! d^n x\; \mathcal{L}(x, f(x), \partial f(x)) \end{cases} \tag{5.152}$$

The equations of motion state for the physical fields that the action is stationary up to boundary terms, i.e. for all paths $f_\lambda$ through $f_0 = f_{\text{phys}}$ the derivative of $S[f_\lambda]$ vanishes for $\lambda = 0$ up to boundary terms,

$$0 = \frac{\partial}{\partial \lambda} S[f_\lambda]\Big|_{\lambda = 0} = \int\! d^n x\, (\delta \mathcal{L} \circ \hat{f}_{\text{phys}})(x)\,. \tag{5.153}$$

Because in $\delta \mathcal{L}$ the derivatives of $\delta y$ can be shifted away,

$$\delta \mathcal{L} = \delta y^r\, \partial_{y^r} \mathcal{L} + d_m (\delta y^r)\, \partial_{y^r_m} \mathcal{L} = \delta y^r \bigl( \partial_{y^r} \mathcal{L} - d_m\, \partial_{y^r_m} \mathcal{L} \bigr) + d_m \bigl( \delta y^r\, \partial_{y^r_m} \mathcal{L} \bigr)\,, \tag{5.154}$$

at the cost of complete derivatives, which however only contribute boundary terms, and because the change of the action has to vanish for all $\delta f = \delta y \circ \hat{f}$, the Euler derivative of the Lagrangian

$$\frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} = \partial_{y^r} \mathcal{L} - d_m\, \partial_{y^r_m} \mathcal{L} \tag{5.155}$$

vanishes for physical fields, i.e. they satisfy the Euler-Lagrange equations

$$\frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y} \circ \hat{f}_{\text{phys}} = 0\,. \tag{5.156}$$

In particular the Maxwell equations (5.71) for the potential $A = (A_0, A_1, A_2, A_3)$ are the Euler-Lagrange equations of the action $S = S_{\text{Maxwell}} + S_{\text{matter}}$ with

$$S_{\text{Maxwell}}[A] = -\frac{1}{4} \int\! d^4x\; F_{mn}(x)\, F^{mn}(x)\,, \qquad F_{mn} = \partial_m A_n - \partial_n A_m\,. \tag{5.157}$$

The action of the charged matter provides by definition the current

$$\frac{\delta S_{\text{matter}}}{\delta A_m(x)} = -j^m(x)\,, \tag{5.158}$$

which in the Maxwell equations is considered as given source, only restricted by the continuity equation $\partial_m j^m = 0$ (5.20).

Infinitesimal local transformations $\delta y$ are maps of some jet space, e.g. $J_1$, to the target space $N$. They are called infinitesimal symmetries of the action $S$ if the Lagrangian is invariant in first order up to derivative terms, i.e. if there exist jet functions $K^m$ such that

$$\delta y^r\, \partial_{y^r} \mathcal{L} + (d_m \delta y^r)\, \partial_{y^r_m} \mathcal{L} + d_m K^m = 0\,, \tag{5.159}$$

which is equivalent to

$$\delta y^r\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} + d_m j^m = 0\,, \tag{5.160}$$

where the current $j^m$ (and $K^m$ to start with), given by

$$j^m = \delta y^r\, \partial_{y^r_m} \mathcal{L} + K^m + d_n B^{mn}\,, \tag{5.161}$$

is unique only up to so-called improvement terms, i.e. jet functions $d_n B^{mn}$ with $B^{mn} = -B^{nm}$ [14, 45]. (5.160) is the

Noether Theorem of Field Theory: To each infinitesimal symmetry $\delta y$ of the action there corresponds a conserved current $j$. Vice versa, to each conserved current there corresponds an infinitesimal symmetry of the action.


By (5.160) the divergence of $j^m \circ \hat{f}_{\text{phys}}$ vanishes on account of the equations of motion (5.156). Conversely, to each conserved current $\bar{j}$ there corresponds an infinitesimal symmetry $\delta y$ of the action, because by definition jet functions $\bar{j}^m$ are the components of a conserved current if its divergence is proportional to (derivatives of) the equations of motion, i.e. if there are jet functions $s^r$ and $s^{r\,m}$ such that

$$d_m \bar{j}^m + s^r\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} + s^{r\,m}\, d_m \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} = 0\,, \tag{5.162}$$

$$d_m \Bigl( \bar{j}^m + s^{r\,m}\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} \Bigr) + (s^r - d_m s^{r\,m})\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} = 0\,. \tag{5.163}$$

This shows (5.160) with $j^m = \bar{j}^m + s^{r\,m}\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r}$ and $\delta y^r = (s^r - d_m s^{r\,m})$.

In particular, the infinitesimal translation of the potential $A$ by a constant four-vector $-c$ accompanied by an infinitesimal gauge transformation (5.68) by the gradient of $-c^l A_l$,

$$\delta A_n = c^l (d_l A_n - d_n A_l) = c^l F_{ln}\,, \tag{5.164}$$

is a symmetry of the action $S_{\text{Maxwell}}$ with the Lagrangian $\mathcal{L} = -\frac{1}{4} F_{mn} F^{mn}$. The Lagrangian changes by $\delta \mathcal{L} = -d_m (c^l F_{ln})\, F^{mn} = -\frac{1}{2} \bigl( d_m (c^l F_{ln}) - d_n (c^l F_{lm}) \bigr) F^{mn}$, because the double sum with $F^{mn}$ antisymmetrizes in $m$ and $n$ (5.27). Moreover, by construction of $F$ as antisymmetrized derivatives of the potential $A$, the sum over the cyclic permutations of $d_m F_{ln}$ vanishes, therefore $\delta \mathcal{L} = -\frac{1}{2}\, d_l (c^l F_{mn})\, F^{mn} = d_l (c^l \mathcal{L})$, i.e. $\delta$ is a symmetry (5.159) with $K^m = -c^m \mathcal{L}$.

The conserved current (5.161) turns out to be the energy-momentum tensor (5.23) in direction of $c$, $j^m = c_l\, T^{lm}$. This justifies calling $p^k = \int d^3x\; T^{k0}(x)$ (5.28) the energy and the momentum. They are the conserved quantities which correspond to the invariance of the action under translations in spacetime.

Because the energy-momentum tensor of the electromagnetic field is symmetric and traceless (5.24), the current $j^m = c_l\, T^{lm}$ is conserved not only for constant $c$ but more generally if $c$ satisfies the conformal Killing equation [44]

$$\partial_m c_n + \partial_n c_m - \frac{1}{2}\, \eta_{mn}\, \partial_k c^k = 0\,, \tag{5.165}$$

which has the general 15-parameter solution

$$c^m(x) = a^m + \omega^{mn} x_n + d\, x^m + b^m\, x \cdot x - 2\, (b \cdot x)\, x^m\,, \tag{5.166}$$

where $a^m$ parameterizes an infinitesimal translation, $\omega^{mn} = -\omega^{nm}$ a Lorentz transformation, $d$ a scale transformation and $b^m$ a proper conformal transformation. The vector fields $-c^m \partial_m$ satisfy the commutator algebra (6.23) of $O(2, 4)$. More generally, the conformal group of a space with a conformally flat metric with $p$ plus signs and $q$ minus signs is isomorphic to (a cover of) the group $O(p + 1, q + 1)/\mathbb{Z}_2$.

The action $S_{\text{Maxwell}}$ of the potential $A$ is invariant under infinitesimal conformal transformations $\delta A_n = c^m F_{mn}$, but the physical properties of matter are not invariant under dilations and proper conformal transformations: atoms exist only in their invariable size, not in each scaled version. That matter breaks the symmetry group of the electrodynamic interactions to the Poincaré group and not to a larger or smaller subgroup, e.g. that it does not define a rest frame, cannot be deduced logically from the Maxwell equations, but is the result of observations and experiments.

Exponentiating an infinitesimal proper conformal transformation yields the map

$$T_b : x \mapsto x' = \frac{x + b\, x^2}{1 + 2\, b \cdot x + b^2 x^2}\,. \tag{5.167}$$
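Both the solution (5.166) of the conformal Killing equation and the finite map (5.167) can be checked numerically. The sketch below is an illustration under stated assumptions: signature $(+,-,-,-)$, all vectors stored with upper indices, and hypothetical parameter values. It evaluates the left-hand side of (5.165) by central differences, which are exact for the quadratic polynomial $c^m(x)$ up to rounding, and verifies that $T_b$ is an inversion $x \mapsto x/x^2$, followed by a translation by $b$ and another inversion, i.e. $x'/x'^2 = x/x^2 + b$.

```python
ETA = (1.0, -1.0, -1.0, -1.0)       # diagonal of the Minkowski metric

def dot(a, b):
    return sum(e * ai * bi for e, ai, bi in zip(ETA, a, b))

def conformal_vector(x, a, omega, d, b):
    """c^m(x) of (5.166); omega[m][n] holds the antisymmetric omega^{mn}."""
    x_low = [e * xi for e, xi in zip(ETA, x)]
    xx, bx = dot(x, x), dot(b, x)
    return [a[m] + sum(omega[m][n] * x_low[n] for n in range(4))
            + d * x[m] + b[m] * xx - 2.0 * bx * x[m] for m in range(4)]

def killing_residual(x, a, omega, d, b, h=1e-3):
    """Largest |d_m c_n + d_n c_m - (1/2) eta_mn d_k c^k| over all m, n."""
    grad = []                        # grad[m][n] = d c^n / d x^m
    for m in range(4):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        cp = conformal_vector(xp, a, omega, d, b)
        cm = conformal_vector(xm, a, omega, d, b)
        grad.append([(cp[n] - cm[n]) / (2.0 * h) for n in range(4)])
    div = sum(grad[k][k] for k in range(4))
    worst = 0.0
    for m in range(4):
        for n in range(4):
            sym = ETA[n] * grad[m][n] + ETA[m] * grad[n][m]   # lowered indices
            eta_mn = ETA[m] if m == n else 0.0
            worst = max(worst, abs(sym - 0.5 * eta_mn * div))
    return worst

def special_conformal(x, b):
    """The map T_b of (5.167)."""
    xx, bx, bb = dot(x, x), dot(b, x), dot(b, b)
    denom = 1.0 + 2.0 * bx + bb * xx
    return [(xi + bi * xx) / denom for xi, bi in zip(x, b)]

def inversion(x):
    xx = dot(x, x)
    return [xi / xx for xi in x]

if __name__ == "__main__":
    x = [2.0, 0.3, -0.5, 0.1]
    b = [0.05, 0.02, -0.01, 0.03]
    omega = [[0.0, 0.4, -0.2, 0.1], [-0.4, 0.0, 0.3, 0.5],
             [0.2, -0.3, 0.0, -0.6], [-0.1, -0.5, 0.6, 0.0]]
    print(killing_residual(x, [1.0, 2.0, -1.0, 0.5], omega, 0.3, b))
    print(inversion(special_conformal(x, b)), [i + bi for i, bi in zip(inversion(x), b)])
```

The inversion is undefined on the light cone $x^2 = 0$, which is one way to see why $T_b$ cannot be an invertible map of all of spacetime.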

The map $T_b$ is not an invertible transformation of $\mathbb{R}^4$, no matter how small $b \ne 0$ is, because the denominator vanishes on the light cone $(x + b/b^2)^2 = 0$ or, if $b^2 = 0$, on the plane $1 + 2\, b \cdot x = 0$. An infinitesimal symmetry, which corresponds to a conserved current, need not be the derivative of a one parameter group of transformations. More precisely, the cartesian coordinates of Minkowski space are stereographic coordinates of $S^p \times S^q$, the set of directions of light rays in $\mathbb{R}^{p+1, q+1}$, on which conformal transformations act transitively (compare (5.166) and (8.52)).

If an infinitesimal symmetry $\delta y$ of the action is a jet function which is linear in a field $\xi$ and its derivatives $\xi_{(1)}, \xi_{(2)}, \dots$, on which the Lagrangian does not depend, e.g. of the form

$$\delta y^r = \xi\, R^r + (d_m \xi)\, R^{r\,m}\,, \tag{5.168}$$

where $R^r$ and $R^{r\,m}$ are some $\xi$-independent jet functions, then the transformation is called an infinitesimal gauge symmetry. For example, the change of the potential by the gradient of an arbitrary function $\xi$, $\delta A_m = d_m \xi$ (5.68), is a gauge symmetry of the action $S_{\text{Maxwell}}$ (5.157).

To each infinitesimal gauge symmetry there corresponds an identity, the Noether identity, among the Euler derivatives with respect to the fields,

$$\frac{\hat{\partial}}{\hat{\partial} \xi} \Bigl( \delta y^r\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} \Bigr) = R^r\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} - d_m \Bigl( R^{r\,m}\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} \Bigr) = 0\,. \tag{5.169}$$

To prove the identity take the Euler derivative of (5.160) with respect to $\xi$. Applied to $d_m j^m$ the Euler derivative vanishes identically.

Vice versa, to each identity among the Euler derivatives of the Lagrangian,

$$r^r\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} + r^{r\,m}\, d_m \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} = 0\,, \tag{5.170}$$

there corresponds the infinitesimal gauge symmetry $\delta y^r = \xi\, (r^r - d_m r^{r\,m}) - r^{r\,m}\, d_m \xi$. Just multiply the identity with $\xi$ and rearrange the terms to obtain (5.160).

Second Noether Theorem: To each infinitesimal gauge symmetry of the action corresponds an identity among the Euler derivatives of the Lagrangian. Vice versa, to each identity among the Euler derivatives of the Lagrangian there corresponds an infinitesimal gauge symmetry of the action.


In electrodynamics the gauge transformation $\delta A_r = d_m \xi\; \delta^m{}_r$ is of the form (5.168) with $R_r = 0$ and $R_r{}^m = \delta^m{}_r$, and the Noether identity (5.169) states that the divergence of the Euler derivative with respect to the potential vanishes, $d_m \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} A_m} = 0$; in particular for $\mathcal{L}_{\text{Maxwell}}$ the identity is $d_m d_n F^{nm} = 0$.

If one inserts (5.168) into (5.160) and uses (5.169) one obtains for infinitesimal gauge symmetries of the Lagrangian $\mathcal{L}$

$$d_m \Bigl( j^m + \xi\, R^{r\,m}\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} \Bigr) = 0 \tag{5.171}$$

as identity in jet space. The divergence vanishes if and only if the terms in the braces are the divergence of antisymmetric jet functions [14, 45],

$$j^m = -\xi\, R^{r\,m}\, \frac{\hat{\partial} \mathcal{L}}{\hat{\partial} y^r} + d_n B^{nm}\,, \qquad B^{nm} = -B^{mn}\,. \tag{5.172}$$

Therefore, in gauge theories by the equations of motion the charge in each volume is determined by a surface integral over its boundary, as $d_m B^{0m}$ does not contain a time derivative, but is the divergence of some three-vector field $\mathbf{E}$,

$$Q_V = \int_V\! d^3x\; j^0 = \int_V\! d^3x\; \operatorname{div} \mathbf{E} = \int_{\partial V}\! d^2\mathbf{f} \cdot \mathbf{E}\,. \tag{5.173}$$

Consider a gauge invariant matter Lagrangian $\mathcal{L}_{\text{matter}}$ with a fixed gauge field. Such a field is called a background field. It defines a subset of infinitesimal gauge transformations, called rigid transformations, which leave the background field invariant, $\delta A = 0$. For example, a vanishing potential $A = 0$ in electrodynamics is invariant under constant gauge transformations $\xi$, $\delta A = \partial \xi = 0$. As another example, the metric of Minkowski space is invariant under infinitesimal Poincaré transformations, which satisfy Killing’s equation $\partial_m \xi_n + \partial_n \xi_m = 0$. To these symmetries of the background there correspond conserved Noether currents which by (5.172) and by the equations of motion of the matter fields are given, up to improvement terms, by the Euler derivatives with respect to the gauge field, evaluated at the background field,

$$j^m_{\text{rigid}} = \delta \phi_i\, \frac{\partial \mathcal{L}_{\text{matter}}}{\partial\, \partial_m \phi_i} + K^m + d_n B^{mn} = -\xi\, R^m_{A_k}\, \frac{\hat{\partial} \mathcal{L}_{\text{matter}}}{\hat{\partial} A_k}\,. \tag{5.174}$$

In general relativity, this explains why the Euler derivative of the matter Lagrangian with respect to the metric is the energy-momentum tensor $T^{mn}$, the collection of Noether currents, which are conserved because Minkowski space is invariant under translations of space and time,

$$j^m \circ \hat{\phi}_{\text{phys}} = T^{mn}\, \xi_n \circ \hat{\phi}_{\text{phys}} \Big|_{g_{mn} = \eta_{mn}}\,, \qquad T^{mn} = -\frac{1}{2}\, \frac{\hat{\partial} \mathcal{L}_{\text{matter}}}{\hat{\partial} g_{mn}}\,. \tag{5.175}$$

Chapter 6

The Lorentz Group

Abstract Rotations in higher dimensional spaces define one- and two-dimensional subspaces in which they act just as in the Euclidean plane. Similarly Lorentz boosts in higher dimensions transform two-dimensional subspaces in the same way as in $1 + 1$-dimensional spacetimes. Each Lorentz transformation $\Lambda$ of the four-dimensional spacetime corresponds uniquely to a pair $\pm M$ of linear transformations of a complex two-dimensional space, the space of spinors. Their inspection reveals that aberration, the Lorentz transformation of the directions of light rays, acts as a Möbius transformation of the Riemann sphere.

6.1 Rotations

Rotations and rotary reflections of a $d$-dimensional Euclidean space $V$

$$D : \begin{cases} V \to V \\ (v^1, v^2, \dots, v^d) \mapsto (D^1{}_j v^j, D^2{}_j v^j, \dots, D^d{}_j v^j) \end{cases} \tag{6.1}$$

constitute the subgroup $O(d)$ of linear transformations which leave the scalar product (here given in an orthonormal basis)

$$u \cdot v = u^j\, \delta_{jk}\, v^k = u^1 v^1 + u^2 v^2 + \dots + u^d v^d\,, \tag{6.2}$$

$$\delta_{jk} = \begin{cases} 1 & \text{if } j = k \\ 0 & \text{if } j \ne k \end{cases} \tag{6.3}$$

of all vectors $v = (v^1, v^2, \dots, v^d)$ and $u = (u^1, u^2, \dots, u^d)$ and thereby all angles and lengths invariant,

$$D^i{}_j u^j\, D^i{}_k v^k = u^j\, \delta_{jk}\, v^k\,. \tag{6.4}$$

Footnote 1 (to (6.1)): We use Einstein’s summation convention. Unless stated otherwise, each pair of indices denotes the sum over the range of its values, $D^i{}_j v^j = D^i{}_1 v^1 + D^i{}_2 v^2 + \dots + D^i{}_d v^d$.


This holds if and only if the orthogonality relations hold,

$$D^i{}_j\, D^i{}_k = \delta_{jk}\,. \tag{6.5}$$

The columns of the matrix $D$ contain the components of orthonormal vectors, or phrased as a matrix property, the transposed matrix $D^T$, $(D^T)^j{}_i = D^i{}_j$, equals the inverse matrix,

$$D^T D = 1\,. \tag{6.6}$$

In particular $(\det D)^2 = 1$, because of $1 = \det 1 = \det(D^T D) = \det D^T \det D = (\det D)^2$. The determinant of $D$ therefore is $1$ or $-1$.

Rotary reflections and rotations form the group $O(d)$ of the orthogonal transformations in $d$ dimensions. We reserve the name rotations for the orientation preserving transformations with determinant $\det D = 1$. They constitute the group of special orthogonal transformations, $SO(d)$.

The determinant is a continuous function of the matrix elements. Therefore there does not exist a one parameter set of rotary reflections $D_\lambda$ which varies continuously with $\lambda$ and connects a rotation with $\det D_{\lambda=0} = 1$ and a reflection with $\det D_{\lambda=1} = -1$: rotations are not connected to reflections.

Each rotation can be generated by repeated application of infinitesimal rotations; it is continuously connected to the identity $1$. In other words, the group $SO(d)$ is connected. This is seen by inspecting the eigenvalue equation of rotary reflections,

$$D w = \lambda\, w\,, \qquad w \ne 0\,. \tag{6.7}$$

The eigenvalues $\lambda$ satisfy the characteristic equation

$$\det(D - \lambda\, 1) = 0\,. \tag{6.8}$$

For real $d \times d$-matrices this is a polynomial equation of order $d$ with real coefficients and has $d$ not necessarily distinct complex solutions, where we count each real solution as complex, though special, solution.

To each complex eigenvalue $\lambda = \sigma + i \tau$ of the real rotary reflection $D = D^*$ there belongs an eigenvector $w \in \mathbb{C}^d$ with complex components, $w^i = u^i + i\, v^i$. The conjugate $\lambda^* = \sigma - i \tau$ is the eigenvalue of the eigenvector $w^{*\,i} = u^i - i\, v^i$, because $D$ is real and conjugation yields $0 = ((D - \lambda 1)^i{}_j\, w^j)^* = (D - \lambda^* 1)^i{}_j\, w^{*\,j}$. By the orthogonality relations (6.5) and the eigenvalue equation

$$(D w^*)^i (D w)^i - w^{*\,i} w^i = 0 = (\lambda^* \lambda - 1)\, w^{*\,i} w^i = (|\lambda|^2 - 1)(\mathbf{u}^2 + \mathbf{v}^2)\,,$$
$$(D w)^i (D w)^i - w^i w^i = 0 = (\lambda^2 - 1)\, w^i w^i = (\lambda^2 - 1)(\mathbf{u}^2 - \mathbf{v}^2 + 2\, i\, \mathbf{u} \cdot \mathbf{v})\,, \tag{6.9}$$

and because $\mathbf{u}^2 + \mathbf{v}^2$ does not vanish, each eigenvalue of an orthogonal transformation has modulus $|\lambda|^2 = 1$.

Each real eigenvalue $\lambda$ is $1$ or $-1$. The corresponding eigenvector is invariant, $D \mathbf{n} = \mathbf{n}$, or is reflected, $D \mathbf{a} = -\mathbf{a}$.

If the eigenvalue $\lambda = \cos\alpha + i \sin\alpha$ is not real, then $(\lambda^2 - 1)$ does not vanish and $\mathbf{u}$ and $\mathbf{v}$ are orthogonal and have equal length. We choose the normalized vectors $e_1 = \mathbf{v}/|\mathbf{v}|$ and $e_2 = \mathbf{u}/|\mathbf{u}|$ as basis and spell out the eigenvalue equation

$$D (e_2 + i\, e_1) = (\cos\alpha + i \sin\alpha)(e_2 + i\, e_1) \tag{6.10}$$

or, decomposed into real and imaginary part,

$$D e_1 = (\cos\alpha)\, e_1 + (\sin\alpha)\, e_2\,, \qquad D e_2 = -(\sin\alpha)\, e_1 + (\cos\alpha)\, e_2\,, \tag{6.11}$$

i.e. on this orthonormal basis of the plane which is spanned by $e_1$ and $e_2$ the rotary reflection acts by the matrix (3.7)

$$D_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}\,. \tag{6.12}$$

The real subspace $U_\perp$ of vectors $y$ which are orthogonal to a real or complex eigenvector $w$, $y \cdot w = 0$, is mapped to itself, $D(U_\perp) \subset U_\perp$,

$$y \cdot w = 0 = (D y) \cdot (D w) = \lambda\, (D y) \cdot w\,, \qquad D\, U_\perp \subset U_\perp\,. \tag{6.13}$$

The space $U_\perp$ together with $e_1$ and $e_2$ or $\mathbf{n}$ or $\mathbf{a}$ spans the Euclidean space $V$ because, due to the positive definiteness of the scalar product, the vectors in $U_\perp$ are linearly independent from $e_1$ and $e_2$ or $\mathbf{n}$ or $\mathbf{a}$.

Restricted to $U_\perp$ the rotation $D$ has an eigenvalue which is either real, then $D$ leaves the eigenvector invariant or reflects it, or the eigenvalue is not real, in which case $D$ acts by a rotation $D_\beta$ of the two-dimensional eigenplane. Altogether, there is an orthonormal basis of $V$ in which the matrix $D$ takes the form

$$D = \begin{pmatrix} D_\alpha & & & \\ & \ddots & & \\ & & 1 & \\ & & & -1 \end{pmatrix}\,, \tag{6.14}$$

where $1$ denotes a block of eigenvalues $1$ and $-1$ another block of eigenvalues $-1$.

If the dimension of $V$ is odd, there has to exist a real eigenvalue $1$ or $-1$: there is an axis of rotation or of a reflection. In particular, in 3 dimensions we can use the axis $\mathbf{n}$ to decompose each vector into its parallel and orthogonal parts, $\mathbf{k} = \mathbf{k}_\parallel + \mathbf{k}_\perp$, $\mathbf{k}_\parallel = \mathbf{n}\, (\mathbf{n} \cdot \mathbf{k})$. They are completed by $\mathbf{n} \times \mathbf{k}_\perp$ to a right handed basis (if $\mathbf{k}_\perp$ does not vanish). In these terms the rotation by an angle $\alpha$ is given by

$$D_{\alpha \mathbf{n}} (\mathbf{k}_\parallel + \mathbf{k}_\perp) = \mathbf{k}_\parallel + (\cos\alpha)\, \mathbf{k}_\perp + (\sin\alpha)\, \mathbf{n} \times \mathbf{k}_\perp\,. \tag{6.15}$$

Because of $\cos(-\alpha) = \cos\alpha$ and $\sin(-\alpha) = -\sin\alpha$ a rotation around the axis $n$ by an angle $\alpha$ coincides with the rotation around $-n$ by the angle $2\pi - \alpha$. Moreover $D_{0 n} = D_{2\pi n} = 1$ are independent of $n$. So each rotation corresponds uniquely to an antipodal pair $\pm p$ of points on $S^3 = \{p \in \mathbb{R}^4 : (p^1)^2 + (p^2)^2 + (p^3)^2 + (p^4)^2 = 1\}$, the three-dimensional sphere, if we conceive $n$ as the direction to $p$, as seen from the north pole, and $\alpha/2$ as its distance along a great circle. So the group SO(3) is the manifold $S^3/\mathbb{Z}_2$, the set of pairs of antipodal points of $S^3$.

In case of a proper reflection the number of eigenvalues $-1$ is odd, in case of a rotation even. Each pair of eigenvalues $-1$ can be considered to belong to a rotation $D_\pi$ by $\alpha = \pi$. Because each angle of rotation can be continuously increased from zero to $\alpha$, all rotations are continuously connected to $1$ and to one another. Consequently the orthogonal group O($d$) consists of two disconnected components, the group of rotations SO($d$) and the set $P\,$SO($d$), where the parity transformation $P$ reflects an odd-dimensional subspace, $Pa = -a$, and leaves the orthogonal subspace pointwise invariant, $Pn = n$.

6.2 Lorentz Transformations

The Lorentz group O($p, q$), $p\,q \neq 0$, consists of the real, linear transformations $\Lambda$ of the points $x$ of the $(p+q)$-dimensional Minkowski space $\mathbb{R}^{p,q}$,
\[ x' = \Lambda x , \tag{6.16} \]
which leave invariant the scalar product of $\mathbb{R}^{p,q}$ (here given in an orthonormal basis)
\[ x \cdot y = \sum_{i=1}^{p} x^i y^i - \sum_{i=p+1}^{p+q} x^i y^i . \tag{6.17} \]

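In matrix notation it is given by $x \cdot y = x^T \eta\, y$ with a matrix
\[ \eta = \begin{pmatrix} 1 & \\ & -1 \end{pmatrix} \tag{6.18} \]
which has a $p \times p$ unit block and a $q \times q$ block $-1$. So Lorentz matrices have to satisfy $(\Lambda x)^T \eta\, \Lambda y = x^T \eta\, y$ for all $x$ and $y$ and therefore
\[ \Lambda^T \eta \Lambda = \eta . \tag{6.19} \]
Each exponential
\[ \Lambda = e^{\omega} \tag{6.20} \]
of a matrix $\omega$ is a Lorentz transformation if $\eta\omega$ is antisymmetric
\[ \eta\omega = -(\eta\omega)^T , \tag{6.21} \]
as $\eta\omega = (-\omega^T)\,\eta$ implies $\eta\, e^{\omega} = e^{-\omega^T} \eta$ and $(e^{\omega})^T \eta\, (e^{\omega}) = e^{\omega^T} e^{-\omega^T} \eta = \eta$ (6.19).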


Vice versa, (6.21) is necessary for $\Lambda = e^{t\omega}$ to be a Lorentz transformation for each value of $t \in \mathbb{R}$, because to first order in $t$ (6.19) amounts to $\omega^T \eta + \eta\omega = 0$. The matrices $\omega$ are linear combinations $\omega = \frac{1}{2}\omega^{mn} l_{mn}$ of a basis $l_{mn} = -l_{nm}$, enumerated by an index pair $mn$, $m < n$, and with matrix elements
\[ (l_{mn})^r{}_s = \delta_m{}^r \eta_{ns} - \delta_n{}^r \eta_{ms} \tag{6.22} \]
consistent with $\omega^r{}_s = \frac{1}{2}\omega^{mn} (l_{mn})^r{}_s$. Their commutators $[l_{kl}, l_{mn}] = l_{kl} l_{mn} - l_{mn} l_{kl}$ satisfy the Lorentz algebra
\[ [l_{kl}, l_{mn}] = -\eta_{km} l_{ln} + \eta_{kn} l_{lm} + \eta_{lm} l_{kn} - \eta_{ln} l_{km} . \tag{6.23} \]
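The algebra (6.23) can be checked mechanically. The following is a minimal sketch, assuming the signature (+, −, −, −) in four dimensions (the text allows any O(p, q)); the helper names `generator`, `combine` and `lorentz_algebra_holds` are illustrative and not taken from the text.

```python
from itertools import product

DIM = 4
# Assumed signature (+, -, -, -); any diagonal eta with entries +1/-1 works.
ETA = [[(1 if r == 0 else -1) if r == s else 0 for s in range(DIM)]
       for r in range(DIM)]


def generator(m, n):
    """Matrix (l_mn)^r_s = delta_m^r eta_ns - delta_n^r eta_ms of (6.22)."""
    return [[int(r == m) * ETA[n][s] - int(r == n) * ETA[m][s]
             for s in range(DIM)] for r in range(DIM)]


def matmul(a, b):
    return [[sum(a[r][t] * b[t][s] for t in range(DIM)) for s in range(DIM)]
            for r in range(DIM)]


def combine(terms):
    """Linear combination sum_i c_i M_i of (coefficient, matrix) pairs."""
    return [[sum(c * mat[r][s] for c, mat in terms) for s in range(DIM)]
            for r in range(DIM)]


def commutator(a, b):
    return combine([(1, matmul(a, b)), (-1, matmul(b, a))])


def eta_omega_is_antisymmetric(m, n):
    """Condition (6.21) for omega = l_mn."""
    x = matmul(ETA, generator(m, n))
    return all(x[r][s] == -x[s][r] for r in range(DIM) for s in range(DIM))


def lorentz_algebra_holds():
    """Compare both sides of (6.23) for every index combination."""
    for k, l, m, n in product(range(DIM), repeat=4):
        lhs = commutator(generator(k, l), generator(m, n))
        rhs = combine([(-ETA[k][m], generator(l, n)),
                       (ETA[k][n], generator(l, m)),
                       (ETA[l][m], generator(k, n)),
                       (-ETA[l][n], generator(k, m))])
        if lhs != rhs:
            return False
    return True


print(lorentz_algebra_holds())
```

All entries are integers, so the comparison is exact and needs no tolerance.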

The group O($p, q$) is the manifold
\[ \mathrm{O}(p, q) = \mathrm{O}(p) \times \mathrm{O}(q) \times \mathbb{R}^{qp} . \tag{6.24} \]

To show this, we decompose the $(p+q) \times (p+q)$-matrix $\Lambda$ into a $p \times p$-matrix $A$, a $q \times q$-matrix $D$, a $q \times p$-matrix $C$ and a $p \times q$-matrix $B$
\[ \Lambda = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \tag{6.25} \]

and write (6.19) in terms of these matrices, AT A = 1 + C T C ,

DT D = 1 + B T B ,

AT B = C T D .

(6.26)

The symmetric matrix $C^T C$ is diagonalizable by a rotation and has nonnegative diagonal elements $\lambda_j = \sum_i C_{ij} C_{ij}$. Therefore the eigenvalues of $1 + C^T C$ are not smaller than 1 and $A$ is invertible, $(\det A)^2 = \det(1 + C^T C) \geq 1$. The same applies to $D$, $(\det D)^2 \geq 1$. Each invertible real matrix $A$ can be uniquely and continuously decomposed into the product of an orthogonal transformation $O$, $O^T = O^{-1}$, and a symmetric matrix $S$, $S^T = S$, with positive eigenvalues,
\[ A = O S . \tag{6.27} \]

For $A^T A$ defines a symmetric matrix $S^2$ with positive eigenvalues $\lambda_i > 0$, $i = 1, \dots, p$, and also the positive symmetric matrix
\[ S = \sqrt{A^T A} , \quad S = S^T . \tag{6.28} \]
We recall that if a function $f : U \subset \mathbb{C} \to \mathbb{C}$ is defined in a domain $U$ which contains the eigenvalues $\lambda$ of a diagonalizable matrix $M$ then the linear map $f(M)$ is defined to map the eigenvectors $w$ of $M$ to $f(M)\, w = f(\lambda)\, w$. The matrix $O = A S^{-1}$ is orthogonal as $(S^T)^{-1} A^T A S^{-1} = S^{-1} S^2 S^{-1} = 1$ confirms.
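The decomposition (6.27) with (6.28) can be tried out on a small example. The sketch below is restricted to real 2×2 matrices, an assumption made only to stay within the standard library: for a positive symmetric 2×2 matrix M the square root has the closed form (M + √(det M) 1)/tr √M, which follows from the Cayley-Hamilton theorem and is not part of the text. The names `polar_2x2`, `matmul2` and `transpose2` are illustrative.

```python
import math


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def polar_2x2(a):
    """Return (o, s) with a = o s for an invertible real 2x2 matrix a."""
    m = matmul2(transpose2(a), a)                       # S^2 = A^T A
    root_det = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    trace_s = math.sqrt(m[0][0] + m[1][1] + 2 * root_det)
    # 2x2 shortcut (assumption, see lead-in): sqrt(M) = (M + sqrt(det M) 1) / tr sqrt(M)
    s = [[(m[i][j] + (root_det if i == j else 0.0)) / trace_s
          for j in range(2)] for i in range(2)]
    det_s = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det_s, -s[0][1] / det_s],
             [-s[1][0] / det_s, s[0][0] / det_s]]
    return matmul2(a, s_inv), s                         # O = A S^{-1}


a = [[2.0, 1.0], [0.5, 3.0]]
o, s = polar_2x2(a)
print(matmul2(transpose2(o), o))   # close to the unit matrix
print(matmul2(o, s))               # close to a
```

For larger matrices one would diagonalize $A^T A$ as described in the text instead of using the 2×2 shortcut.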


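Because $A$ and $D$ are invertible, $\Lambda$ (6.25) can be decomposed uniquely (6.27)
\[ \Lambda = \begin{pmatrix} O & \\ & \hat{O} \end{pmatrix} \begin{pmatrix} S & Q \\ P & \hat{S} \end{pmatrix} , \tag{6.29} \]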

with $P = \hat{O}^{-1} C$ and $Q = O^{-1} B$. $S$ and $\hat{S}$ are invertible and symmetric, $O$ and $\hat{O}$ are orthogonal matrices. Equation (6.19) reads
\[ S^2 = 1 + P^T P , \quad S Q = P^T \hat{S} , \quad \hat{S}^2 = 1 + Q^T Q . \tag{6.30} \]
Using $Q = S^{-1} P^T \hat{S}$ and $S^{-2} = (1 + P^T P)^{-1}$ in the last equation, one obtains $\hat{S}^2 = 1 + \hat{S} P S^{-1} S^{-1} P^T \hat{S}$ or
\[ 1 = \hat{S}^{-2} + P (1 + P^T P)^{-1} P^T . \tag{6.31} \]

For all functions $f(\lambda)$ which are defined on the possible spectrum $\lambda \geq 0$ of $P^T P$ and $P P^T$, one has
\[ f(P^T P)\, P^T = P^T f(P P^T) . \tag{6.32} \]
This holds, as the next remarks show, applied to each eigenvector $w$ of $P P^T$ and, as there is a basis of eigenvectors, as a matrix equation. If $P^T w$ does not vanish, it is an eigenvector of $P^T P$ with the same eigenvalue $\lambda$, $(P^T P) P^T w = P^T (P P^T) w = \lambda P^T w$, which implies $f(P^T P)\, P^T w = f(\lambda) P^T w = P^T f(P P^T) w$. This holds also if $P^T w = 0$, because then $P P^T w = 0$ and $f(P^T P)\, P^T w = 0 = P^T f(0)\, w$. Applied to (6.31) one obtains $1 = \hat{S}^{-2} + P P^T/(1 + P P^T)$ or
\[ \hat{S}^2 = 1 + P P^T . \tag{6.33} \]
Inserting this into $Q = S^{-1} P^T \hat{S}$ one concludes with (6.32)
\[ Q = S^{-1} P^T \hat{S} = \left(\sqrt{1 + P^T P}\right)^{-1} P^T \sqrt{1 + P P^T} = P^T . \tag{6.34} \]

So each Lorentz matrix is uniquely given by a pair of orthogonal transformations $O \in \mathrm{O}(p)$, $\hat{O} \in \mathrm{O}(q)$ and a rotationfree Lorentz transformation $L_P$, $L_P = (L_P)^T$, a boost, which is determined by a matrix $P$ with $q$ rows and $p$ columns
\[ \Lambda = \begin{pmatrix} O & \\ & \hat{O} \end{pmatrix} L_P , \quad L_P = \begin{pmatrix} \sqrt{1 + P^T P} & P^T \\ P & \sqrt{1 + P P^T} \end{pmatrix} . \tag{6.35} \]
One easily confirms
\[ (L_P)^{-1} = L_{-P} \tag{6.36} \]
and that rotations of a boost yield the boost in rotated directions
\[ \begin{pmatrix} O & \\ & \hat{O} \end{pmatrix} L_P \begin{pmatrix} O^T & \\ & \hat{O}^T \end{pmatrix} = L_{\hat{O} P O^T} , \tag{6.37} \]
i.e. the columns of $P$ transform as $q$-dimensional vectors by left multiplication with $\hat{O}$ and the rows as $p$-dimensional row vectors by right multiplication with $O^T$.


Because $q \times p$ matrices $P$ constitute $\mathbb{R}^{qp}$, each Lorentz transformation corresponds one to one to a point in the manifold $\mathrm{O}(p) \times \mathrm{O}(q) \times \mathbb{R}^{qp}$. The space $\mathbb{R}^{qp}$ is connected and each orthogonal group consists of two disconnected components, so O($p, q$) has four disconnected components. The determinant of (6.19) implies
\[ (\det \Lambda)^2 = 1 \tag{6.38} \]

because $\det(\Lambda^T \eta \Lambda) = (\det \Lambda^T)(\det \eta)(\det \Lambda)$ and because $\det \Lambda^T = \det \Lambda$. So the determinant of a Lorentz transformation can only have the values $+1$ or $-1$. The boosts $L_P$ are connected to the identity, and along the connecting path their determinants vary continuously while taking only the discrete values $\pm 1$. So the determinant of each boost is constant, $\det L_P = 1$. The special transformations $\Lambda$ with $\det \Lambda = 1$ constitute the special orthogonal group SO($p, q$). Lorentz transformations with $\det O = \det \hat{O} = 1$ preserve the orientation of the timelike and spacelike directions and constitute the restricted Lorentz group SO($p, q$)$^\uparrow$. It is connected. Analogously we define the restricted Poincaré group to be the connected component which contains the identity.

The other connected components of O($p, q$) are obtained by multiplication with the inversion of time $T$ and by the parity transformation $P$, which reflect an odd-dimensional timelike or spacelike subspace, and by $T P$,
\[ T = \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix} , \quad P = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & -1 \end{pmatrix} . \tag{6.39} \]

The boosts $L_P$ act in $1+1$-dimensional mutually orthogonal subspaces $U_i$ as the transformation $L_v$ (3.5). To see this consider the eigenvectors $w_i$ of $P P^T$. They are mutually orthogonal if they correspond to different eigenvalues $\lambda_i$ or can be chosen to be mutually orthogonal, if the eigenvalue is degenerate. In addition we choose them to be normalized, $w_j \cdot w_i = -w_j^T w_i = -\delta_{ij}$. The real nullvectors $u$ of $P P^T$ are annihilated already by $P^T$, as $P P^T u = 0$ implies $u^T P P^T u = 0$ and this sum of squares of the components of $P^T u$ vanishes only if $P^T u = 0$. Therefore $L_P$ leaves each $u$ invariant, $L_P u = u$. Similarly one has $P v = 0$ and $L_P v = v$ for each nullvector $v \in \mathbb{R}^p$ of $P^T P$.

Each normalized eigenvector $e_1$ of $P P^T$ with nonvanishing eigenvalue $\lambda$ defines an orthogonal, normalized eigenvector $e_0 = P^T e_1/\sqrt{\lambda}$ of $P^T P$ with the same eigenvalue. On these vectors $L_P$ acts by
\[ L_P(e_0) = \sqrt{1 + \lambda}\; e_0 + \sqrt{\lambda}\; e_1 , \quad L_P(e_1) = \sqrt{\lambda}\; e_0 + \sqrt{1 + \lambda}\; e_1 . \tag{6.40} \]
This is the two-dimensional Lorentz boost $L_v$ (3.5) with velocity $v = \sqrt{\lambda/(1 + \lambda)}$.

Though each Lorentz transformation $\Lambda$ decomposes into a rotation $R$ and a boost $L_P$ (6.35) and though the (real and imaginary parts of the) eigenvectors of $R$ and also of $L_P$ span $\mathbb{R}^{p,q}$, this is not true for the eigenvectors of each Lorentz transformation, i.e. not each Lorentz transformation decomposes into a transformation of one- or two-dimensional subspaces. As a counterexample, the Lorentz transformation $\Lambda_a$ (3.46) has only one eigenvector (up to multiples).

If the indefinite metric contains only one plus sign, then the Lorentz boost (6.35) simplifies, because $P$ is a $q \times 1$ matrix, a column vector which we denote by $\mathbf{p}/m$, $m > 0$. The product $P^T P$ is its Euclidean length squared $\mathbf{p}^2/m^2$. Write $\sqrt{1 + P P^T}$ as $1 + \left(\sqrt{1 + P P^T} - 1\right)$, then the second term is the function $g(P P^T)$ with $g(x) = \sqrt{1 + x} - 1 = x f(x)$ where $f(x) = (\sqrt{1 + x} - 1)/x$ is continuous for $x \geq 0$. As $P P^T f(P P^T) = P f(P^T P) P^T$ (6.32) and as the number $f(P^T P) = f(\mathbf{p}^2/m^2)$ commutes with $P$,
\[ \sqrt{1 + P P^T} - 1 = \frac{\sqrt{1 + \mathbf{p}^2/m^2} - 1}{\mathbf{p}^2/m^2}\, P P^T = \frac{m}{\sqrt{m^2 + \mathbf{p}^2} + m}\, P P^T , \tag{6.41} \]
the rotationfree Lorentz transformation (6.35) takes the form
\[ L_p = \frac{1}{m} \begin{pmatrix} p^0 & \mathbf{p}^T \\ \mathbf{p} & m\,\mathbf{1} + \dfrac{\mathbf{p}\,\mathbf{p}^T}{p^0 + m} \end{pmatrix} , \quad p^0 = \sqrt{m^2 + \mathbf{p}^2} , \tag{6.42} \]

which occurs frequently in particle physics. The boost $L_p$ transforms the four-momentum $\underline{p} = (m, 0, 0, 0)$ of a massive particle at rest to the four-momentum $p = (p^0, p^1, p^2, p^3)$ of a particle which moves with velocity $\mathbf{v} = \mathbf{p}/p^0$ (3.48),
\[ L_p\, \underline{p} = p . \tag{6.43} \]
The boosts $L_p$ do not depend on the mass $m > 0$ but correspond one to one to points $u = p/m$, $u^2 = 1$, in the hyperbolic space $H^3$ (7.1).
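The properties stated for (6.42) can be confirmed numerically. The sketch below assumes 1+3 dimensions with $\eta = \mathrm{diag}(1, -1, -1, -1)$; the function names `boost`, `apply` and `preserves_eta` are hypothetical helpers, and the sample momentum is arbitrary.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of eta, one plus sign


def boost(p, m):
    """Matrix L_p of (6.42) for spatial momentum p = (p1, p2, p3) and m > 0."""
    p0 = math.sqrt(m * m + sum(x * x for x in p))
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = p0 / m
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = p[i] / m
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + p[i] * p[j] / (m * (p0 + m))
    return L


def matmul(a, b):
    return [[sum(a[r][t] * b[t][s] for t in range(4)) for s in range(4)]
            for r in range(4)]


def apply(L, k):
    return [sum(L[r][s] * k[s] for s in range(4)) for r in range(4)]


def preserves_eta(L, tol=1e-12):
    """Check Lambda^T eta Lambda = eta, eq. (6.19)."""
    for r in range(4):
        for s in range(4):
            value = sum(L[t][r] * ETA[t] * L[t][s] for t in range(4))
            expected = ETA[r] if r == s else 0.0
            if abs(value - expected) > tol:
                return False
    return True


m, p = 1.5, (0.3, -1.2, 2.0)
L = boost(p, m)
print(preserves_eta(L))
print(apply(L, [m, 0.0, 0.0, 0.0]))   # the four-momentum (p0, p1, p2, p3), eq. (6.43)
```

Rescaling `p` and `m` by a common factor returns the same matrix, which illustrates that $L_p$ depends only on $p/m$.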

6.3 The Rotation Group SU(2)/$\mathbb{Z}_2$

The unitary, unimodular transformations of a two-dimensional complex vector space constitute the group SU(2), which as a manifold is the three sphere $S^3$. The unitarity condition $U^\dagger U = 1$ states that the columns of each unitary $2 \times 2$-matrix $U$ contain the components of a complex, orthonormal basis. If $U$ has the form
\[ U = \begin{pmatrix} a & c \\ b & d \end{pmatrix} , \tag{6.44} \]
then $|a|^2 + |b|^2 = 1$. The second column is orthogonal to the first, $a^* c + b^* d = 0$, if and only if $(c, d)$ is a multiple of $(-b^*, a^*)$ and the condition $\det U = 1$ determines this multiple
\[ U = \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix} , \quad (\Re a)^2 + (\Im a)^2 + (\Re b)^2 + (\Im b)^2 = 1 . \tag{6.45} \]


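To each $U$ corresponds a point on $S^3 = \{(v, x, y, z) \in \mathbb{R}^4 : v^2 + x^2 + y^2 + z^2 = 1\}$ and to each point on $S^3$ there corresponds a $U \in$ SU(2) with $a = v + i z$ and $b = x + i y$. Points on $S^3$ can be designated by an angle $0 \leq \alpha \leq 2\pi$ and a three-dimensional unit vector $n = (n_x, n_y, n_z)$ as $(v, x, y, z) = \cos\frac{\alpha}{2}\, (1, 0, 0, 0) - \sin\frac{\alpha}{2}\, (0, n_x, n_y, n_z)$. Correspondingly one can write each SU(2)-matrix as the following linear combination of the 1-matrix $\sigma^0$ and the three Pauli matrices $\sigma^i$, $i = 1, 2, 3$,
\[ \sigma^0 = \begin{pmatrix} 1 & \\ & 1 \end{pmatrix} , \quad \sigma^1 = \begin{pmatrix} & 1 \\ 1 & \end{pmatrix} , \quad \sigma^2 = \begin{pmatrix} & -i \\ i & \end{pmatrix} , \quad \sigma^3 = \begin{pmatrix} 1 & \\ & -1 \end{pmatrix} , \tag{6.46} \]
\[ U_{\alpha n} = \Big(\cos\frac{\alpha}{2}\Big)\, 1 - i \Big(\sin\frac{\alpha}{2}\Big)\, n \cdot \sigma = \begin{pmatrix} \cos\frac{\alpha}{2} - i (\sin\frac{\alpha}{2})\, n_z & -i (\sin\frac{\alpha}{2}) (n_x - i n_y) \\ -i (\sin\frac{\alpha}{2}) (n_x + i n_y) & \cos\frac{\alpha}{2} + i (\sin\frac{\alpha}{2})\, n_z \end{pmatrix} . \tag{6.47} \]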

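Elementary calculation confirms that $1$, $i\sigma^1$, $i\sigma^2$ and $i\sigma^3$ satisfy the real algebra of quaternions,
\[ \sigma^i \sigma^j = \delta^{ij}\, 1 + i\, \varepsilon_{ijk}\, \sigma^k , \quad i, j, k \in \{1, 2, 3\} . \tag{6.48} \]
If multiplied and summed with $m^i$ and $n^j$ this multiplication can also be written as
\[ (m \cdot \sigma)(n \cdot \sigma) = (m \cdot n)\, 1 + i\, (m \times n) \cdot \sigma . \tag{6.49} \]
In particular for each unit vector $n$ one has $(n \cdot \sigma)^2 = 1$ and with $(n \cdot \sigma)^{2k} = 1$, $(n \cdot \sigma)^{2k+1} = n \cdot \sigma$ and $(-i)^{2k} = (-1)^k$ the exponential series of $-i \frac{\alpha}{2}\, n \cdot \sigma$ simplifies
\[ \exp\Big(-i \frac{\alpha}{2}\, n \cdot \sigma\Big) = \sum_k \frac{(-1)^k (\frac{\alpha}{2})^{2k}}{(2k)!}\, 1 - i \sum_k \frac{(-1)^k (\frac{\alpha}{2})^{2k+1}}{(2k+1)!}\, n \cdot \sigma . \tag{6.50} \]
So each $U \in$ SU(2) is generated by an infinitesimal transformation $-i \frac{\alpha}{2}\, n \cdot \sigma$,
\[ U_{\alpha n} = \exp\Big(-i \frac{\alpha}{2}\, n \cdot \sigma\Big) = \cos\frac{\alpha}{2}\, 1 - i \sin\frac{\alpha}{2}\, n \cdot \sigma . \tag{6.51} \]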

A subgroup $H$ which is mapped by the adjoint action of all $g \in G$ to itself, $g H g^{-1} = H$, is called normal. Its cosets $gH$ form a group $G/H$, as the product of $g_2 H$ with $g_1 H$ does not depend on which element one chooses to represent the coset, $g_2 h_2\, g_1 h_1 = g_2 g_1 (g_1^{-1} h_2 g_1) h_1 = g_2 g_1 h'$. For instance, $\mathbb{Z}_2$, the cyclic group with two elements $\pm 1$, is a normal subgroup of SU(2). Matrices $U \in$ SU(2) are $\mathbb{Z}_2$-equivalent, if they differ at most by their sign. The group SU(2)/$\mathbb{Z}_2$ consists of SU(2)-transformations up to the sign.

The group SO(3) of rotations in three dimensions is isomorphic to the group SU(2)/$\mathbb{Z}_2$. This is to say, there exists a bijective, i.e. invertible and exhaustive, map $D$ from SU(2)/$\mathbb{Z}_2$ to SO(3), which maps each pair $\pm U$ of unitary $2 \times 2$-matrices with $\det U = 1$ to a $3 \times 3$-rotation $D_U = D_{-U}$ and which is compatible with the group multiplication, $D_{U_1 U_2} = D_{U_1} D_{U_2}$. The map is exhaustive, each rotation $R \in$ SO(3) is a representation $D_U$ of some unitary $2 \times 2$-matrix $U$. The preimage of $D_U$ in SU(2)/$\mathbb{Z}_2$ is unique. If $D_U = D_V$ then $U = V$ or $U = -V$. Each $U \in$ SU(2) defines the linear map
\[ D_U : \hat{k} \mapsto \hat{k}' = U \hat{k}\, U^\dagger \tag{6.52} \]
of the space of hermitean, traceless $2 \times 2$-matrices $\hat{k}$ to itself. A $2 \times 2$-matrix $\hat{k}$ is hermitean, $\hat{k} = \hat{k}^\dagger$,
\[ \begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix} = \begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix}^\dagger = \begin{pmatrix} k_{11}^* & k_{21}^* \\ k_{12}^* & k_{22}^* \end{pmatrix} , \tag{6.53} \]

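if the matrix elements $k_{11}$ and $k_{22}$ are real and if $k_{12}$ is the complex conjugate of $k_{21}$. The matrix is traceless if $k_{11} = -k_{22}$. Such matrices constitute a three-dimensional real vector space and can be written as real combinations of the Pauli matrices (6.46)
\[ \hat{k} = k \cdot \sigma = k^i \sigma^i = \begin{pmatrix} k^3 & k^1 - i k^2 \\ k^1 + i k^2 & -k^3 \end{pmatrix} . \tag{6.54} \]
The linear transformation $k \cdot \sigma \mapsto U (k \cdot \sigma) U^\dagger = k' \cdot \sigma$ is a rotation $k \mapsto k' = D_U k$. To check this we write (6.51) with $c = \cos\frac{\alpha}{2}$ and $s = \sin\frac{\alpha}{2}$ just as $U = c - i s\, n \cdot \sigma$ and decompose $k = k_\parallel n + k_\perp$ into a part which is parallel to $n$ and a transverse part, $\hat{k} = \hat{k}_\parallel + \hat{k}_\perp$ with $\hat{k}_\parallel = k_\parallel\, n \cdot \sigma$ and $\hat{k}_\perp = k_\perp \cdot \sigma$. The part $\hat{k}_\parallel$ commutes with each power series in $n \cdot \sigma$, in particular with $U$, and therefore is invariant
\[ U \hat{k}_\parallel U^\dagger = \hat{k}_\parallel U U^\dagger = \hat{k}_\parallel . \tag{6.55} \]
As $k_\perp$ is orthogonal to $n$ the matrix $k_\perp \cdot \sigma$ anticommutes with $n \cdot \sigma$ (6.49)
\[ (k_\perp \cdot \sigma)(n \cdot \sigma) = -(n \cdot \sigma)(k_\perp \cdot \sigma) . \tag{6.56} \]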

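Therefore $U (k_\perp \cdot \sigma) U^\dagger = U^2 (k_\perp \cdot \sigma)$ and with $U_{\alpha n}^2 = U_{2\alpha n}$ (6.51), (6.49)
\[ U (k_\perp \cdot \sigma) U^\dagger = (\cos\alpha - i \sin\alpha\; n \cdot \sigma)(k_\perp \cdot \sigma) = (\cos\alpha\; k_\perp + \sin\alpha\; n \times k_\perp) \cdot \sigma . \tag{6.57} \]
So $U_{\alpha n} = e^{-i \frac{\alpha}{2} n \cdot \sigma}$ causes by $k \cdot \sigma \mapsto U_{\alpha n}\, k \cdot \sigma\, U_{\alpha n}^\dagger = (D_{\alpha n} k) \cdot \sigma$ the rotation $D_{\alpha n}$ of vectors $k$ around the axis $n$ by the angle $\alpha$ (6.15)
\[ D_{\alpha n} : k_\parallel + k_\perp \mapsto k_\parallel + \cos\alpha\; k_\perp + \sin\alpha\; n \times k_\perp . \tag{6.58} \]
Vice versa, there corresponds to each rotation $D_{\alpha n}$ around the axis $n$ by the angle $\alpha$ the pair of unitary matrices $U = e^{-i \frac{\alpha}{2} n \cdot \sigma}$ and $-U = e^{-i \frac{\alpha + 2\pi}{2} n \cdot \sigma}$.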
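The correspondence between $\pm U$ and the rotation $D_{\alpha n}$ can be illustrated numerically. The sketch below reads off the rotation matrix from $U$ via $D_{ij} = \frac{1}{2}\,\mathrm{tr}(\sigma^i U \sigma^j U^\dagger)$, which follows from (6.52) and $\mathrm{tr}(\sigma^i \sigma^j) = 2\delta^{ij}$, and compares it with (6.58). The function names and the sample values are illustrative assumptions.

```python
import math

# Pauli matrices sigma^1, sigma^2, sigma^3 of (6.46)
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def su2(alpha, n):
    """U_{alpha n} = cos(alpha/2) 1 - i sin(alpha/2) n.sigma, eq. (6.47)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    nx, ny, nz = n
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]


def rotation_from_su2(u):
    """D_ij = tr(sigma^i U sigma^j U^dagger) / 2, so U (k.sigma) U^dagger = (D k).sigma."""
    ud = dagger(u)
    d = []
    for i in range(3):
        row = []
        for j in range(3):
            x = mul(SIGMA[i], mul(u, mul(SIGMA[j], ud)))
            row.append(((x[0][0] + x[1][1]) / 2).real)
        d.append(row)
    return d


def rotate(alpha, n, k):
    """Right-hand side of (6.58) for a unit vector n."""
    par = sum(a * b for a, b in zip(n, k))
    k_par = [par * a for a in n]
    k_perp = [a - b for a, b in zip(k, k_par)]
    cross = [n[1] * k_perp[2] - n[2] * k_perp[1],
             n[2] * k_perp[0] - n[0] * k_perp[2],
             n[0] * k_perp[1] - n[1] * k_perp[0]]
    return [k_par[i] + math.cos(alpha) * k_perp[i] + math.sin(alpha) * cross[i]
            for i in range(3)]


alpha, n, k = 0.7, (1 / 3, 2 / 3, 2 / 3), (0.4, -1.0, 2.5)
D = rotation_from_su2(su2(alpha, n))
print([sum(D[i][j] * k[j] for j in range(3)) for i in range(3)])
print(rotate(alpha, n, k))   # agrees with the line above up to rounding
```

Replacing `alpha` by `alpha + 2 * math.pi` flips the sign of the matrix returned by `su2` but leaves `D` unchanged.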

6.4 The Group SL(2, $\mathbb{C}$)

Each invertible, complex matrix $M$ can be uniquely decomposed into a product of a unitary matrix $U$, $U^\dagger = U^{-1}$, and the exponential $e^H$ (which has positive eigenvalues) of a hermitean matrix, $H = H^\dagger$,
\[ M = U e^H . \tag{6.59} \]

This equation generalizes the well known polar decomposition $z = e^{i\varphi} r$ of a nonvanishing complex number $z$ into its phase and its modulus $r > 0$. The polar decomposition of each invertible matrix exists because
\[ e^{2H} = M^\dagger M \tag{6.60} \]
defines uniquely a hermitean matrix $H = H^\dagger$ because $M^\dagger M$ is hermitean, and therefore diagonalizable, and has positive eigenvalues $\lambda = e^{2h}$. Therefore the hermitean matrix $H = \frac{1}{2} \ln M^\dagger M$ exists which has the same eigenvectors as $M^\dagger M$ and the real eigenvalues $h$. The matrix
\[ U = M e^{-H} \tag{6.61} \]
is unitary, $(M e^{-H})^\dagger (M e^{-H}) = e^{-H} M^\dagger M e^{-H} = e^{-H} e^{2H} e^{-H} = 1$, so $M = U e^H$.

As each invertible complex matrix $M$ corresponds one to one to a pair $(U, H)$ of a unitary and a hermitean matrix and because the hermitean $N \times N$-matrices constitute a real $N^2$-dimensional vector space, the group GL($N, \mathbb{C}$) of the general linear transformations in $N$ complex dimensions is the manifold $\mathrm{U}(N) \times \mathbb{R}^{N^2}$.

If the matrix $M$ belongs to the subgroup SL($N, \mathbb{C}$) of the special linear transformations with $\det M = 1$, then the trace of $H$, the sum $\mathrm{tr}\, H = H^i{}_i$ of its diagonal elements, vanishes, $\mathrm{tr}\, H = 0$, because $\det(e^{2H}) = \det(M^\dagger M) = 1$ and $\det(e^{2H})$ is the product of its eigenvalues $e^{2h}$. Moreover $U$ is unimodular, $\det U = \det(M e^{-H}) = 1$. The group SL($N, \mathbb{C}$) therefore is the manifold $\mathrm{SU}(N) \times \mathbb{R}^{N^2 - 1}$. In particular, SU(2) is the manifold $S^3$ (6.45) and SL(2, $\mathbb{C}$) is the manifold $S^3 \times \mathbb{R}^3$.

The unique decomposition of each Lorentz transformation into an orthogonal transformation and a rotationfree Lorentz transformation (6.35) shows that the group SO(1, 3)$^\uparrow$ of the transformations which preserve the time orientation and the space orientation is the manifold $\mathrm{SO}(3) \times \mathbb{R}^3$. The rotation group SO(3) is $S^3/\mathbb{Z}_2$. Therefore SL(2, $\mathbb{C}$) is the covering manifold of SO(1, 3)$^\uparrow$.

Also as a group SL(2, $\mathbb{C}$) is a cover of SO(1, 3)$^\uparrow$. This means: there is a four-dimensional, real representation of SL(2, $\mathbb{C}$), which maps each $M \in$ SL(2, $\mathbb{C}$) to a Lorentz transformation $\Lambda_M$ with $\Lambda^0{}_0 \geq 1$ and $\det \Lambda = 1$, compatible with the group product, $\Lambda_{M_1 M_2} = \Lambda_{M_1} \Lambda_{M_2}$. The representation is exhaustive, each Lorentz transformation $\Lambda$ with $\Lambda^0{}_0 \geq 1$ and $\det \Lambda = 1$ is a representation $\Lambda_M$ of a complex matrix $M$ with $\det M = 1$. The preimage of $\Lambda$ is unique up to the sign. One has $\Lambda_M = \Lambda_N$ exactly if $M = N$ or $M = -N$. The transformation $\Lambda_M$ is the linear map
\[ \Lambda_M : \hat{k} \mapsto \hat{k}' = M \hat{k} M^\dagger \tag{6.62} \]


of hermitean $2 \times 2$-matrices $\hat{k} = \hat{k}^\dagger$ to hermitean matrices $\hat{k}'$. They are real linear combinations
\[ \hat{k} = k^0 \sigma^0 - k^1 \sigma^1 - k^2 \sigma^2 - k^3 \sigma^3 = \begin{pmatrix} k^0 - k^3 & -k^1 + i k^2 \\ -k^1 - i k^2 & k^0 + k^3 \end{pmatrix} \tag{6.63} \]
of the matrix $\sigma^0 = 1$ and the three Pauli matrices (6.46), which span a four-dimensional real vector space. Also $M \hat{k} M^\dagger$ is hermitean and defines a real four-vector $k' = (k'^0, k'^1, k'^2, k'^3)$. Its components
\[ k'^m = \Lambda^m{}_n\, k^n \tag{6.64} \]
are linear in $k$ with real matrix elements $\Lambda^m{}_n$. The matrices $\Lambda$ are a representation of the group SL(2, $\mathbb{C}$), as $\Lambda_{M_1 M_2} = \Lambda_{M_1} \Lambda_{M_2}$ holds for successive transformations. The transformation $\Lambda_M$ is a Lorentz transformation, because the determinant of the $2 \times 2$-matrix $\hat{k}$ is a quadratic polynomial, namely the length squared (2.45) of the four-vector $k$,
\[ \det \hat{k} = (k^0)^2 - (k^1)^2 - (k^2)^2 - (k^3)^2 , \tag{6.65} \]
and coincides because of $\det M = 1$ with the determinant of $\hat{k}' = M \hat{k} M^\dagger$. Therefore $k'^2 = k^2$ and $k \mapsto k' = \Lambda k$ is a Lorentz transformation.

We have already shown that each $U \in$ SU(2) can be written as $e^{-i \frac{\alpha}{2} n \cdot \sigma}$ (6.51) and causes, just as $-U$, a rotation of $k$ around the axis $n$ by the angle $\alpha$ (6.58). We calculate the Lorentz transformation which corresponds to the hermitean factor $e^H$ in $M = U e^H$. The traceless, hermitean matrix $H$ is a real linear combination $H = -\frac{\beta}{2}\, n \cdot \sigma$ of the three Pauli matrices (6.46) ($n$ denotes a unit vector). The exponential series $e^H$ simplifies due to $(n \cdot \sigma)^2 = 1$ as in (6.50)
\[ e^H = \exp\Big(-\frac{\beta}{2}\, n \cdot \sigma\Big) = \Big(\operatorname{ch} \frac{\beta}{2}\Big)\, 1 - \Big(\operatorname{sh} \frac{\beta}{2}\Big)\, n \cdot \sigma . \tag{6.66} \]

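The matrix $\hat{k}$ is mapped to $\hat{k}' = e^H \hat{k}\, (e^H)^\dagger = e^H \hat{k}\, e^H$ and splits into $\hat{k} = \hat{k}_\parallel + \hat{k}_\perp$ with $\hat{k}_\parallel = k^0 - k_\parallel\, n \cdot \sigma$ and $\hat{k}_\perp = -k_\perp \cdot \sigma$, where $k = k_\parallel n + k_\perp$ decomposes $k$ into parts which are parallel and orthogonal to $n$. The perpendicular part $\hat{k}_\perp$ anticommutes with $H \propto n \cdot \sigma$ (6.56), $\hat{k}_\perp H = -H \hat{k}_\perp$, because $k_\perp$ and $n$ are orthogonal to each other. Therefore $k_\perp$ is unchanged,
\[ \hat{k}'_\perp = e^H \hat{k}_\perp e^H = e^H e^{-H} \hat{k}_\perp = \hat{k}_\perp . \tag{6.67} \]
The matrix $\hat{k}_\parallel$ commutes with $e^H$ and $e^{2H}$ is $e^H$ with doubled rapidity
\[ e^H \hat{k}_\parallel e^H = \hat{k}_\parallel e^{2H} = (k^0\, 1 - k_\parallel\, n \cdot \sigma)(\operatorname{ch}\beta\; 1 - (\operatorname{sh}\beta)\, n \cdot \sigma) = \big((\operatorname{ch}\beta)\, k^0 + (\operatorname{sh}\beta)\, k_\parallel\big)\, 1 - \big((\operatorname{sh}\beta)\, k^0 + (\operatorname{ch}\beta)\, k_\parallel\big)\, n \cdot \sigma = k'^0\, 1 - k'_\parallel\, n \cdot \sigma . \tag{6.68} \]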

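We conclude
\[ \begin{pmatrix} k'^0 \\ k'_\parallel \end{pmatrix} = \begin{pmatrix} \operatorname{ch}\beta & \operatorname{sh}\beta \\ \operatorname{sh}\beta & \operatorname{ch}\beta \end{pmatrix} \begin{pmatrix} k^0 \\ k_\parallel \end{pmatrix} . \tag{6.69} \]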

So (6.69, 6.67) effect the boost $L_v$ (3.9) in direction of $n$ with velocity $v = \tanh\beta$.

In contrast to rotations and rotationfree Lorentz transformations, not all matrices $M \in$ SL(2, $\mathbb{C}$) can be written as exponential of an infinitesimal transformation
\[ N = \exp((k + i l) \cdot \sigma) = \operatorname{ch} z + \frac{\operatorname{sh} z}{z}\, (k + i l) \cdot \sigma , \quad (k + i l)^2 = z^2 . \tag{6.70} \]
There are exceptions of the form
\[ M = \begin{pmatrix} -1 & b \\ 0 & -1 \end{pmatrix} , \quad b \neq 0 . \tag{6.71} \]

For $M$ to be of the form (6.70) $\operatorname{sh} z/z$ must not vanish as $N$ must not be diagonal. For the main diagonal elements to coincide, one has to have $k^3 = l^3 = 0$. $N_{21} = 0$ states $k^1 + i l^1 + i (k^2 + i l^2) = 0$. Consequently $z = 0$ and $N_{11} = \operatorname{ch} z = 1 \neq M_{11}$. The matrix $M$ is not an exponential but a product $M = U e^H$ (6.59) of exponentials. This shows that the Hausdorff series (A.92) for $C(A, B)$ in $e^A e^B = e^C$ has only a restricted domain of convergence.
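The map $M \mapsto \Lambda_M$ of (6.62) to (6.64) can be made concrete with a few lines of code. The sketch below builds the 4×4 matrix column by column from the images of the basis vectors; the names `hat`, `unhat`, `lorentz` and `boost_matrix` are hypothetical, and the boost direction $n = (0, 0, 1)$ is chosen only for convenience.

```python
import math


def mul(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def hat(k):
    """Hermitean matrix k^0 sigma^0 - k.sigma of (6.63)."""
    k0, k1, k2, k3 = k
    return [[complex(k0 - k3), -k1 + 1j * k2], [-k1 - 1j * k2, complex(k0 + k3)]]


def unhat(kh):
    """Inverse of hat: read off the real four-vector from a hermitean matrix."""
    k0 = (kh[0][0] + kh[1][1]).real / 2
    k3 = (kh[1][1] - kh[0][0]).real / 2
    k1 = -(kh[0][1] + kh[1][0]).real / 2
    k2 = (kh[0][1] - kh[1][0]).imag / 2
    return [k0, k1, k2, k3]


def lorentz(M):
    """4x4 matrix Lambda_M with k' = Lambda_M k, from (6.62) and (6.64)."""
    Md = dagger(M)
    columns = []
    for n in range(4):
        basis = [1.0 if i == n else 0.0 for i in range(4)]
        columns.append(unhat(mul(M, mul(hat(basis), Md))))
    return [[columns[n][m] for n in range(4)] for m in range(4)]


def minkowski_square(k):
    return k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2


def boost_matrix(beta):
    """e^H of (6.66) for n = (0, 0, 1): diag(e^{-beta/2}, e^{beta/2})."""
    return [[complex(math.exp(-beta / 2)), 0j], [0j, complex(math.exp(beta / 2))]]


L = lorentz(boost_matrix(0.7))
print(L[0][0], L[3][0])                      # ch(0.7) and sh(0.7), as in (6.69)
M = [[-1, 2 + 1j], [0, -1]]                  # exceptional matrix of type (6.71)
print(lorentz(M) == lorentz([[1, -2 - 1j], [0, 1]]))
```

The last line compares $\Lambda_M$ with $\Lambda_{-M}$; the exceptional matrix (6.71) still defines a perfectly ordinary Lorentz transformation, only $M$ itself is not an exponential.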

6.5 The Complex Lorentz Group

Complex Poincaré transformations map $\mathbb{C}^4$ to itself,
\[ T_{\Lambda, a} : k \mapsto \Lambda k + a , \quad a \in \mathbb{C}^4 , \quad \Lambda \in \mathrm{GL}(4, \mathbb{C}) : \Lambda^T \eta \Lambda = \eta . \tag{6.72} \]
They are of interest as real analytic functions of a real domain $U$ are the restriction to real arguments of functions which are holomorphic in a complex domain $U_c \supset U$. Just as $\mathbb{R}^4$ corresponds to hermitean $2 \times 2$ matrices $\hat{k} = k^m \sigma^m$, $k \in \mathbb{R}^4$, which transform under SL(2, $\mathbb{C}$) matrices $A$ or $-A$ to $\hat{k}' = A \hat{k} A^\dagger$ (6.62), so does $\mathbb{C}^4$ correspond to general complex $2 \times 2$ matrices $\hat{k} = k^m \sigma^m$, $k \in \mathbb{C}^4$, which transform under pairs $(A, B)$ or $(-A, -B)$ of SL(2, $\mathbb{C}$) matrices to
\[ T_{(A,B)}(\hat{k}) = A \hat{k} B^T . \tag{6.73} \]

The transformations $T_{(A,B)}$ are linear in $k \in \mathbb{C}^4$ and leave invariant the determinant of $\hat{k}$ (6.65). So they are complex Lorentz transformations. The subgroup of complex Lorentz transformations with $\det \Lambda = 1$ is connected and isomorphic to $\mathrm{SL}(2, \mathbb{C}) \times \mathrm{SL}(2, \mathbb{C})/\mathbb{Z}_2$. It contains the real restricted Lorentz transformations $\pm(A, A^*)$ and the space time reflecting Lorentz transformations $\pm(A, -A^*)$ which in the real Lorentz group constitute a disconnected component.

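6.6 Möbius Transformations of Light Rays

The determinant of $\hat{k}$ (6.63) vanishes if $k$ is the wavevector of a light ray, because $k$ is lightlike, $k = (|\mathbf{k}|, \mathbf{k})$, and $\det \hat{k} = k^2 = 0$ (6.65). As the matrix $\hat{k}$ only has rank 1, its elements are products of the components of a two-dimensional, complex vector $n(k)$ with the components of the conjugate vector. It is unique up to a phase $e^{i\gamma(k)}$,²
\[ \begin{pmatrix} |\mathbf{k}| - k^3 & -k^1 + i k^2 \\ -k^1 - i k^2 & |\mathbf{k}| + k^3 \end{pmatrix} = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix} \begin{pmatrix} n_1^* , & n_2^* \end{pmatrix} , \quad \begin{pmatrix} n_1(k) \\ n_2(k) \end{pmatrix} = \begin{pmatrix} -\dfrac{k^1 - i k^2}{\sqrt{|\mathbf{k}| + k^3}} \\ \sqrt{|\mathbf{k}| + k^3} \end{pmatrix} . \tag{6.74} \]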

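Lorentz transformations change $\hat{k} = n\, n^\dagger$ into $\hat{k}' = M \hat{k} M^\dagger = (M n)(M n)^\dagger$, i.e. $n$ transforms as a two-dimensional complex vector into $n' = M n$. Such a vector with this SL(2, $\mathbb{C}$) transformation is called a spinor,
\[ \begin{pmatrix} n_1' \\ n_2' \end{pmatrix} = \begin{pmatrix} a n_1 + b n_2 \\ c n_1 + d n_2 \end{pmatrix} , \quad M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \quad a, b, c, d \in \mathbb{C} , \quad a d - b c = 1 . \tag{6.75} \]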

The ratio $z = n_1/n_2$ corresponds one to one to the incident direction $\mathbf{e}$, from which the light ray is seen. The wave vector has the form $(|\mathbf{k}|, \mathbf{k}) = |\mathbf{k}|\,(1, -\mathbf{e})$. In spherical coordinates $\mathbf{e} = (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta)$ (2.33) and by (3.20) one has
\[ z = \frac{n_1}{n_2} = -\frac{k^1 - i k^2}{|\mathbf{k}| + k^3} = -\tan\frac{\theta}{2}\; e^{-i\varphi} . \tag{6.76} \]

The set of directions of light rays is the Riemann sphere $\mathbb{C} \cup \{\infty\}$. As $z$ is the ratio of spinor components, it is changed by Lorentz transformations $\Lambda$, which correspond to the pair $\pm M(\Lambda) \in \mathrm{SL}(2, \mathbb{C})/\mathbb{Z}_2$, by the corresponding Möbius transformation
\[ T_M : z \mapsto \frac{a z + b}{c z + d} . \tag{6.77} \]
Aberration and rotation of light rays are Möbius transformations of $z = -\tan\frac{\theta}{2}\, e^{-i\varphi}$. Two Möbius transformations $T_M$ and $T_N$ coincide exactly if $M = N$ or $M = -N$, so the Möbius group is isomorphic to $\mathrm{SL}(2, \mathbb{C})/\mathbb{Z}_2$, i.e. to the restricted Lorentz group.

If $z_1, z_2, z_3$ are three different points on the Riemann sphere $\mathbb{C} \cup \{\infty\}$ and if also $w_1, w_2, w_3$ are different, then there is exactly one Möbius transformation [40] $z \mapsto T z$:
\[ \frac{(T z - w_1)(w_2 - w_3)}{(T z - w_2)(w_1 - w_3)} = \frac{(z - z_1)(z_2 - z_3)}{(z - z_2)(z_1 - z_3)} , \tag{6.78} \]
which maps $z_1$ to $w_1 = T z_1$, $z_2$ to $w_2 = T z_2$ and $z_3$ to $w_3 = T z_3$. Therefore, at a given place there is exactly one observer, who perceives three given stars in three given directions. The positions of the other stars are then fixed.

² We choose $n(k)$ proportional to the second column of $n_k$ (8.88). It is real analytic outside the negative 3-axis $A_- = \{\mathbf{k} \in \mathbb{R}^3 : \mathbf{k} = (0, 0, -|\mathbf{k}|)\}$ where its phase is discontinuous, $n(k) = s(k)\, e^{-i\varphi(k)}$.
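The three-point property (6.78) translates directly into matrix algebra. A minimal sketch, assuming finite points and using hypothetical helper names: the matrix `to_zero_inf_one(z1, z2, z3)` represents the right-hand side of (6.78) as a Möbius map sending $z_1 \mapsto 0$, $z_2 \mapsto \infty$, $z_3 \mapsto 1$, and $T$ is obtained by composing it with the inverse of the corresponding map for the $w_i$. The matrices are not normalized to determinant 1, which does not change the Möbius map.

```python
def mobius(M, z):
    """T_M of (6.77) for M = [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


def mul(M, N):
    return [[M[i][0] * N[0][j] + M[i][1] * N[1][j] for j in range(2)]
            for i in range(2)]


def to_zero_inf_one(z1, z2, z3):
    """Matrix of z -> (z - z1)(z2 - z3) / ((z - z2)(z1 - z3))."""
    return [[z2 - z3, -z1 * (z2 - z3)], [z1 - z3, -z2 * (z1 - z3)]]


def adjugate(M):
    """Inverse of a 2x2 matrix up to the factor det M (irrelevant for T_M)."""
    (a, b), (c, d) = M
    return [[d, -b], [-c, a]]


def through_three_points(zs, ws):
    """Matrix of the unique Moebius map with z_i -> w_i, solving (6.78)."""
    return mul(adjugate(to_zero_inf_one(*ws)), to_zero_inf_one(*zs))


zs = (0.5 + 1j, -2 + 0.3j, 1 - 1j)
ws = (1j, 2 + 0j, -1 - 0.5j)
T = through_three_points(zs, ws)
print([mobius(T, z) for z in zs])   # close to ws
```

Composition of maps corresponds to the matrix product, `mobius(mul(M, N), z) == mobius(M, mobius(N, z))` up to rounding, and `M` and `-M` give the same map.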

Chapter 7

Hyperbolic and Spherical Geometry

Abstract In the same way as spherical geometry arises from Euclidean geometry on the unit sphere, hyperbolic geometry ensues on the unit mass-shell. The relations between lengths and angles of triangles in hyperbolic and spherical geometry differ only by the substitution of the trigonometric functions cos and sin by their hyperbolic counterparts ch and sh, and some signs. Lorentz boosts accelerate 4-velocities and define by their successive application a kind of addition $\hat{+}$ which, however, is neither commutative nor associative, because the deficit angle of nondegenerate triangles does not vanish. We determine this angle from simple matrix products and derive the precession of directions which are transported without torque.

7.1 Unit Mass Shell

The normalized tangent vectors to timelike worldlines, the four velocities $q$, $q^2 = 1$, constitute the three-dimensional hyperbolic space $H^3$, the unit mass shell,
\[ H^3 = \left\{ q = (q^0, q^1, q^2, q^3) : (q^0)^2 - (q^1)^2 - (q^2)^2 - (q^3)^2 = 1 \right\} \subset \mathbb{R}^{1,3} . \tag{7.1} \]
It is the orbit of the arbitrarily chosen origin
\[ \underline{q} = (1, 0, 0, 0) \tag{7.2} \]

under the restricted Lorentz group SO(1, 3)$^\uparrow$ which acts on the column¹ vectors $q$ (and on sets of them such as lines or triangles) by matrix multiplication from the left. The orbit $H^3$ can be viewed as the set of cosets SO(1, 3)$^\uparrow$/SO(3). Each 1-parameter group of boosts in a fixed direction $n$, $n^2 = 1$, acting on $\underline{q}$ defines an oriented, straight line through $\underline{q}$
\[ \Gamma_n : \lambda \mapsto q_n(\lambda) = \begin{pmatrix} \operatorname{ch}\lambda \\ n \operatorname{sh}\lambda \end{pmatrix} . \tag{7.3} \]

¹ For a better print layout we write $q$ in unambiguous context and in running text as row vector.

Transforming these straight lines with the boost $L_p$ by multiplication from the left, one obtains the straight lines $L_p \Gamma_n$ through $p = L_p\, \underline{q}$
\[ L_p \Gamma_n : \lambda \mapsto p_n(\lambda) = L_p\, q_n(\lambda) . \tag{7.4} \]
Through each pair of distinct points $(p, q)$ there passes a unique straight line. It endows its points with a natural distance, the length of the segment from $q = p(\lambda_1)$ to $p = p(\lambda_2)$,
\[ d(p, q) = |\lambda_2 - \lambda_1| . \tag{7.5} \]
This distance is Lorentz invariant and related to the equally Lorentz invariant scalar product by
\[ \operatorname{ch}(d(p, q)) = p \cdot q = p^0 q^0 - p^1 q^1 - p^2 q^2 - p^3 q^3 . \tag{7.6} \]
This holds for pairs $(p, \underline{q})$ and extends to arbitrary pairs $(p, q)$ as the metric and the straightness of lines are invariant under Lorentz transformations. The distance is symmetric,
\[ d(p, q) = d(q, p) \tag{7.7} \]
and vanishes only if $\operatorname{ch}(0) = 1 = \sqrt{1 + \mathbf{p}^2} \sqrt{1 + \mathbf{q}^2} - \mathbf{p} \cdot \mathbf{q}$ or $(1 + \mathbf{p}^2)(1 + \mathbf{q}^2) = (1 + \mathbf{p} \cdot \mathbf{q})^2$ or $(\mathbf{p} - \mathbf{q})^2 + (\mathbf{p}^2 \mathbf{q}^2 - (\mathbf{p} \cdot \mathbf{q})^2) = 0$ which implies $p = q$. So
\[ d(p, q) = 0 \Leftrightarrow p = q . \tag{7.8} \]

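For $H^3$ several coordinate systems are convenient. Points $q$ may be denoted by their spatial part $\mathbf{q}$,
\[ q^0 = \sqrt{1 + \mathbf{q}^2} , \quad \mathbf{q} = \mathbf{q} , \tag{7.9} \]
by their 3-velocity
\[ \mathbf{v} = \mathbf{q}/q^0 , \quad q^0 = \frac{1}{\sqrt{1 - \mathbf{v}^2}} , \quad \mathbf{q} = \frac{\mathbf{v}}{\sqrt{1 - \mathbf{v}^2}} , \tag{7.10} \]
or by their stereographic projection
\[ \mathbf{u} = \frac{\mathbf{q}}{q^0 + 1} , \quad q^0 = \frac{1 + |\mathbf{u}|^2}{1 - |\mathbf{u}|^2} , \quad \mathbf{q} = \frac{2 \mathbf{u}}{1 - |\mathbf{u}|^2} . \tag{7.11} \]
Spherical coordinates denote each point $q$ by its distance $\lambda \geq 0$ to the origin $\underline{q}$ and the direction $n$ of $\mathbf{q}$,
\[ \operatorname{ch}\lambda = q^0 , \quad n = \frac{\mathbf{q}}{|\mathbf{q}|} , \quad q = \begin{pmatrix} \operatorname{ch}\lambda \\ n \operatorname{sh}\lambda \end{pmatrix} . \tag{7.12} \]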
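The coordinate systems (7.9) to (7.12) are easy to convert into one another. The following sketch represents a point of $H^3$ as a 4-tuple $(q^0, q^1, q^2, q^3)$; all function names are illustrative, and the sample velocity is arbitrary.

```python
import math


def minkowski_square(q):
    return q[0] ** 2 - q[1] ** 2 - q[2] ** 2 - q[3] ** 2


def from_spatial(qs):
    """(7.9): q^0 = sqrt(1 + q^2)."""
    return (math.sqrt(1 + sum(x * x for x in qs)), *qs)


def from_velocity(v):
    """(7.10): q = (1, v) / sqrt(1 - v^2), requires |v| < 1."""
    gamma = 1 / math.sqrt(1 - sum(x * x for x in v))
    return (gamma, *(gamma * x for x in v))


def from_stereographic(u):
    """(7.11): point of H^3 from a point u of the ball |u| < 1."""
    u2 = sum(x * x for x in u)
    return ((1 + u2) / (1 - u2), *(2 * x / (1 - u2) for x in u))


def to_stereographic(q):
    """(7.11): u = q / (q^0 + 1)."""
    return tuple(x / (q[0] + 1) for x in q[1:])


def from_spherical(lam, n):
    """(7.12): q = (ch lambda, n sh lambda) for a unit vector n."""
    return (math.cosh(lam), *(math.sinh(lam) * x for x in n))


def rapidity(q):
    """Distance lambda of q to the origin (1, 0, 0, 0)."""
    return math.acosh(q[0])


q = from_velocity((0.3, -0.4, 0.5))
print(minkowski_square(q))                        # close to 1
print(from_stereographic(to_stereographic(q)))    # close to q
print(math.tanh(rapidity(q)))                     # close to |v| = sqrt(0.5)
```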

Fig. 7.1 Stereographic Projection of $H^3$ to the Poincaré Ball $|\mathbf{u}| < 1$ in $\mathbb{R}^3$

The modulus of the velocity is $|\mathbf{v}| = |\mathbf{q}|/q^0 = \tanh\lambda$, so $\lambda$ is the rapidity (2.21). To speak of the rapidity between two velocities or of their distance is synonymous. In spherical coordinates for the vertices $A$ and $B$ of a triangle $(A, B, \underline{q})$ the length $c = d(A, B)$ of the edge opposite to the vertex $\underline{q}$ is given by $\operatorname{ch} c = A \cdot B$,
\[ \operatorname{ch} c = (\operatorname{ch} a)(\operatorname{ch} b) - (\operatorname{sh} a)(\operatorname{sh} b) \cos\gamma , \tag{7.13} \]
where $\gamma$ is the angle included by $n_A$ and $n_B$. This relation of the length $c$ to the lengths $a$ and $b$ of the other two edges and their included angle $\gamma$ is the hyperbolic law of cosines. It holds for all triangles $(A, B, C)$ as each of them can be mapped isometrically to a triangle $(A', B', \underline{q})$. The law of cosines and $-\cos\gamma \leq 1$ imply $\operatorname{ch} c \leq (\operatorname{ch} a)(\operatorname{ch} b) + (\operatorname{sh} a)(\operatorname{sh} b) = \operatorname{ch}(a + b)$ and consequently the triangle inequality $c \leq a + b$ as the hyperbolic cosine is strictly monotonic for positive arguments.

The straight segment from $\underline{q}$ to a point $p$ is the shortest curve from $\underline{q}$ to $p$ because for each point $C$ in line one has $d(O, C) + d(C, A) = d(O, A)$ ((7.13) with $\gamma = \pi$) while for all points $C'$, which are not in line, the triangle inequality implies $d(O, C') + d(C', A) > d(O, A)$. Lorentz transformations map these shortest curves from the origin to other shortest curves, for Lorentz transformations are isometries.

By the law of cosines (7.13) one can express each of the angles $\alpha, \beta, \gamma$ and lengths $a, b, c$ of a hyperbolic triangle by three of these variables, if they are positive and satisfy $\alpha + \beta + \gamma \leq \pi$ and $a \leq b + c$ and their permuted versions. From (7.13) one readily calculates $\sin\gamma = \sqrt{1 - \cos^2\gamma}$ in terms of $a$, $b$ and $c$
\[ \sin\gamma = \frac{\Delta}{(\operatorname{sh} a)(\operatorname{sh} b)} , \quad \Delta^2 = 1 - (\operatorname{ch} a)^2 - (\operatorname{ch} b)^2 - (\operatorname{ch} c)^2 + 2 (\operatorname{ch} a)(\operatorname{ch} b)(\operatorname{ch} c) . \tag{7.14} \]
Combined with its permuted versions this shows the hyperbolic law of sines
\[ \frac{\sin\alpha}{\operatorname{sh} a} = \frac{\sin\beta}{\operatorname{sh} b} = \frac{\sin\gamma}{\operatorname{sh} c} . \tag{7.15} \]
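The laws (7.13) to (7.15) can be checked on a concrete triangle with one vertex at the origin. In the sketch below the side lengths `a`, `b` and the included angle `gamma` are arbitrary sample values, and the helper names are hypothetical.

```python
import math


def dot(p, q):
    """Minkowski scalar product of (7.6)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def point(lam, n):
    """Point of H^3 in spherical coordinates (7.12)."""
    return (math.cosh(lam), *(math.sinh(lam) * x for x in n))


def distance(p, q):
    """d(p, q) from ch d = p.q, eq. (7.6); max() guards against rounding below 1."""
    return math.acosh(max(1.0, dot(p, q)))


def angle_opposite(a, b, c):
    """Angle opposite to the side a, from the law of cosines (7.13)."""
    return math.acos((math.cosh(b) * math.cosh(c) - math.cosh(a))
                     / (math.sinh(b) * math.sinh(c)))


a, b, gamma = 0.8, 1.3, 1.1
A = point(b, (1.0, 0.0, 0.0))                           # vertex A, distance b from the origin
B = point(a, (math.cos(gamma), math.sin(gamma), 0.0))   # vertex B, distance a from the origin
c = distance(A, B)
alpha = angle_opposite(a, b, c)
beta = angle_opposite(b, c, a)

print(math.cosh(c),
      math.cosh(a) * math.cosh(b) - math.sinh(a) * math.sinh(b) * math.cos(gamma))
print(math.sin(alpha) / math.sinh(a),
      math.sin(beta) / math.sinh(b),
      math.sin(gamma) / math.sinh(c))
print(alpha + beta + gamma < math.pi)
```

The three ratios on the second output line agree up to rounding, and the angle sum stays below $\pi$.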


In a rectangular triangle, $\gamma = \pi/2$, one has $\sin\alpha = (\operatorname{sh} a)/(\operatorname{sh} c)$ as compared to $\sin\alpha = a/c$ in Euclidean geometry or $\sin\alpha = (\sin a)/(\sin c)$ in spherical geometry. In a triangle with $a = b = r$ and infinitesimal $c$ and $\gamma$ the expansion of (7.13)
\[ 1 + \frac{c^2}{2} = (\operatorname{ch} r)^2 - (\operatorname{sh} r)^2 \Big(1 - \frac{\gamma^2}{2}\Big) = 1 + (\operatorname{sh} r)^2\, \frac{\gamma^2}{2} , \tag{7.16} \]
yields the arc length $c = \gamma \operatorname{sh} r$ of a segment of a circle with angle $\gamma$ and radius $r$. The circumference of the hyperbolic circle with radius $r$ is
\[ C(r) = 2\pi \operatorname{sh} r . \tag{7.17} \]
Integrating $C(r)\, dr$ one obtains the area of the hyperbolic circle with radius $r$
\[ A(r) = \int_0^r dr'\, C(r') = 2\pi \big((\operatorname{ch} r) - 1\big) . \tag{7.18} \]

It is bounded by the circumference,
\[ \frac{A(r)}{C(r)} = \frac{(\operatorname{ch} r) - 1}{\operatorname{sh} r} < 1 . \]
[...]

In dimensions $d > 2$, the deficit angle of nonplanar triangles cannot simply be added, because the corresponding rotations around different axes do not commute. There the curvature 2-form is Hodge dual to $(d-2)$-forms which take values in the Lie algebra of SO($d$).

\[ \frac{d\delta}{db}\Big|_{b=0} = (\sin\varphi) \tanh\frac{a}{2} \tag{7.44} \]

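is of prime importance for the generators of infinitesimal Lorentz transformations in massive, unitary representations of the Poincaré group (8.48). To express $\delta$ in terms of the lengths $a$, $b$ and $c$ of the sides of the hyperbolic triangle, one uses $\tanh\frac{a}{2} = \operatorname{sh} a/(1 + \operatorname{ch} a)$ and (7.13, 7.14) to express $\cos\gamma$ and $\sin\gamma$, $\gamma = \pi - \varphi$, as functions of $a$, $b$ and $c$. Then the factors $(\operatorname{sh} a)(\operatorname{sh} b)$ cancel in $\tan\frac{\delta}{2}$,
\[ \Big(\tan\frac{\delta}{2}\Big)^2 = \frac{1 - (\operatorname{ch} a)^2 - (\operatorname{ch} b)^2 - (\operatorname{ch} c)^2 + 2 (\operatorname{ch} a)(\operatorname{ch} b)(\operatorname{ch} c)}{\big(1 + (\operatorname{ch} a) + (\operatorname{ch} b) + (\operatorname{ch} c)\big)^2} . \tag{7.45} \]
Solving $(\tan\frac{\delta}{2})^2 = N/D = (1 - \cos\delta)/(1 + \cos\delta)$ for $(1 + \cos\delta) = 2D/(D + N)$ one obtains Heron's formula for hyperbolic triangles [37]
\[ \cos\delta + 1 = \frac{\big(1 + (\operatorname{ch} a) + (\operatorname{ch} b) + (\operatorname{ch} c)\big)^2}{(1 + \operatorname{ch} a)(1 + \operatorname{ch} b)(1 + \operatorname{ch} c)} . \tag{7.46} \]
This can also be derived with tedious manipulations from $\cos\delta = -\cos(\alpha + \beta + \gamma)$, trigonometric identities for sums of angles and with the law of cosines (7.13, 7.14). Recalling $\operatorname{ch} a = p_b \cdot p_c$, $\operatorname{ch} b = p_c \cdot p_a$ and $\operatorname{ch} c = p_a \cdot p_b$ (7.6) one can read (7.46) as area of the triangle $p_a, p_b, p_c$ in $H^3$ in terms of Minkowski scalar products.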
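Formula (7.46) can be compared numerically with the deficit angle $\delta = \pi - (\alpha + \beta + \gamma)$, which is the reading of $\cos\delta = -\cos(\alpha + \beta + \gamma)$ with $0 < \delta < \pi$. In the sketch below the angles are computed from the side lengths by the law of cosines (7.13); the function names and sample triangles are illustrative assumptions.

```python
import math


def angle_opposite(a, b, c):
    """Angle opposite to the side a, from the law of cosines (7.13)."""
    return math.acos((math.cosh(b) * math.cosh(c) - math.cosh(a))
                     / (math.sinh(b) * math.sinh(c)))


def deficit_from_angles(a, b, c):
    """delta = pi - (alpha + beta + gamma) for the triangle with sides a, b, c."""
    total = (angle_opposite(a, b, c) + angle_opposite(b, c, a)
             + angle_opposite(c, a, b))
    return math.pi - total


def deficit_from_sides(a, b, c):
    """delta from Heron's formula for hyperbolic triangles, eq. (7.46)."""
    A, B, C = math.cosh(a), math.cosh(b), math.cosh(c)
    return math.acos((1 + A + B + C) ** 2 / ((1 + A) * (1 + B) * (1 + C)) - 1)


for sides in [(1.0, 1.0, 1.0), (0.8, 1.3, 1.0), (0.2, 0.25, 0.3)]:
    print(deficit_from_angles(*sides), deficit_from_sides(*sides))
```

Each printed pair agrees up to rounding; the deficit shrinks for small triangles, where hyperbolic geometry approaches Euclidean geometry.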

7.4 Fermi-Walker Transport

An observer is not completely described by his worldline $\Gamma : \tau \mapsto x(\tau)$ which he traverses in his proper time $\tau$. One has also to specify the three spatial directions to which he relates his observations. Together with the normalized tangent vector $e_0 = dx/d\tau$, they constitute at each point of his worldline a vierbein $e_0, e_1, e_2, e_3$, an orthonormal basis of the local tangent space,
\[ e_a \cdot e_b = \eta_{ab} , \quad a, b \in \{0, 1, 2, 3\} . \tag{7.47} \]
The normalized tangent vectors $e_0$, the four velocities, constitute the three-dimensional hyperbolic space (7.1). The spacelike vectors $e_1$, $e_2$ and $e_3$, to which the observer relates spatial directions, are orthogonal to $e_0$ and constitute a basis of the tangent space to $H^3$ at $e_0$.

Considered as columns of a matrix, each vierbein defines a Lorentz transformation and, vice versa, the columns of each Lorentz transformation $\Lambda$ are a vierbein $e_a{}^m = \Lambda^m{}_a$. It relates the components of a vector $v = \hat{e}_m \hat{v}^m$ in the coordinate basis to the observer basis $v = e_a v^a$, $\hat{v}^m = \Lambda^m{}_a v^a$. If the observer accelerates and rotates, his vierbein changes along $\Gamma$ by
\[ \frac{d e_a}{d\tau} = e_c\, \omega^c{}_a , \quad \omega_{ab} = \eta_{ac}\, \omega^c{}_b , \quad \omega_{ab} = -\omega_{ba} , \tag{7.48} \]
with coefficient functions $\omega_{ab}(\tau)$, which, as differentiation of (7.47) shows, are antisymmetric (compare (6.21)). If we expand the vierbein around $\tau = 0$ then up to quadratic terms it is given by
\[ e_a{}^m(\tau) = e_a{}^m + \tau\, e_b{}^m\, \omega^b{}_a , \tag{7.49} \]
where all coefficients on the right hand side are understood to be evaluated at $\tau = 0$. The worldline is the integral of $e_0$ and up to cubic terms given by
\[ x^m(\tau) = x^m + \tau\, e_0{}^m + \frac{1}{2} \tau^2\, e_b{}^m\, \omega^b{}_0 . \tag{7.50} \]

From which direction m = (m1 , m2 , m3 ) with respect to his vierbein does the observer see a light ray return at x(τ ), if it had been emitted in direction n = (n1 , n2 , n3 ) at x(−τ ) and had been reflected somewhere in the neighbourhood [57, p. 123]?

Fig. 7.2 Rotationfree Directions

The emitted and reflected light rays, parameterized by λ and λ′, are given by

$$y_E^m(\lambda) = x^m(-\tau) + \lambda\, u^m , \quad y_R^m(\lambda') = x^m(\tau) - \lambda'\, v^m \tag{7.51}$$

with tangent vectors u and v

$$u^m = e_0{}^m(-\tau) + n^i e_i{}^m(-\tau) , \quad v^m = e_0{}^m(\tau) - m^i e_i{}^m(\tau) , \quad \sum_{i=1}^{3} (m^i)^2 = \sum_{i=1}^{3} (n^i)^2 = 1 . \tag{7.52}$$

The light rays intersect in the event, where they are reflected,

$$y_E(\lambda) = y_R(\lambda') . \tag{7.53}$$

These are four equations which for each τ determine λ, λ′ and m as functions of n. In linear order in τ and taking scalar products with $e_a$, a = 0, 1, 2, 3, one obtains λ = λ′ = τ and $m^i = n^i$. So, up to higher order one has

$$\lambda = \tau + \tau^2 \lambda_2 , \quad \lambda' = \tau + \tau^2 \lambda_2' , \quad m^i = n^i + \tau\, \delta n^i . \tag{7.54}$$

In second order (7.53) implies

$$\lambda_2 + \lambda_2' = 2\, \omega^0{}_j\, n^j , \quad \delta n^i = (\lambda_2 - \lambda_2')\, n^i - 2\, \omega^i{}_j\, n^j , \tag{7.55}$$

which is possible only if $\lambda_2 = \lambda_2'$, as n and m are normalized, n · δn = 0 (2.55),

$$\delta n^i = -2\, \omega^i{}_j\, n^j , \quad \lambda_2 = \lambda_2' = \omega^0{}_j\, n^j . \tag{7.56}$$

The transport of the vierbein is rotationfree and the observer is nonrotating, if each light ray, which is reflected nearby, is seen to return from the direction into which it was emitted, i.e. if δn = 0 for all directions n. This holds if and only if $\omega_{ij} = 0$, i.e. if the observer's vierbein changes by infinitesimal boosts. The ensuing rotationfree transport (7.48) of the vierbein along the worldline is called Fermi-Walker transport,

$$\frac{de_a}{d\tau} = e_b\, \omega^b{}_a , \quad \omega = \begin{pmatrix} 0 & f^T \\ f & 0 \end{pmatrix} ,$$
$$\frac{de_0}{d\tau} = f^1 e_1 + f^2 e_2 + f^3 e_3 , \quad \frac{de_1}{d\tau} = f^1 e_0 , \quad \frac{de_2}{d\tau} = f^2 e_0 , \quad \frac{de_3}{d\tau} = f^3 e_0 . \tag{7.57}$$

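7.5 Thomas Precession

The velocities of an earth-bound observer orbit in the course of a year a plane circle

$$\tau \mapsto x(\tau) = \bigl(e\, \tau ,\ r \sin(\omega\tau) ,\ -r \cos(\omega\tau) ,\ 0\bigr) ,$$
$$\frac{dx}{d\tau} = \bigl(e ,\ r\omega \cos(\omega\tau) ,\ r\omega \sin(\omega\tau) ,\ 0\bigr) , \quad e = \sqrt{1 + r^2\omega^2} . \tag{7.58}$$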

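Using the abbreviations p = rω and

$$C = \cos(e\, \omega\tau) , \quad S = \sin(e\, \omega\tau) , \quad c = \cos(\omega\tau) , \quad s = \sin(\omega\tau) , \tag{7.59}$$

one checks that

$$e_0 = \begin{pmatrix} e \\ p\,c \\ p\,s \\ 0 \end{pmatrix} , \quad e_1 = \begin{pmatrix} p\,C \\ S s + e\, C c \\ -S c + e\, C s \\ 0 \end{pmatrix} , \quad e_2 = \begin{pmatrix} p\,S \\ -C s + e\, S c \\ C c + e\, S s \\ 0 \end{pmatrix} , \quad e_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{7.60}$$

is a vierbein (7.47), $e_a \cdot e_b = \eta_{ab}$, which is Fermi-Walker transported (7.57) (use d/d(ωτ)(c, s, C, S) = (−s, c, −e S, e C) and $e^2 - 1 = p^2$ for confirmation),

$$\frac{d}{d\tau} e_0 = f^1 e_1 + f^2 e_2 , \quad \frac{d}{d\tau} e_1 = f^1 e_0 , \quad \frac{d}{d\tau} e_2 = f^2 e_0 , \quad \text{with } (f^1, f^2) = r\omega^2 (-S, C) . \tag{7.61}$$

Considered as columns of a Lorentz matrix $\Lambda = (e_0, e_1, e_2, e_3)$ the vierbein decomposes into rotations $D_{\omega\tau}$ and $D^{-1}_{e\omega\tau}$ in the 1-2-plane and a boost $L_p$ in 1-direction,

$$\Lambda = D_{\omega\tau}\, L_p\, D^{-1}_{e\omega\tau} = \begin{pmatrix} 1 & & & \\ & c & -s & \\ & s & c & \\ & & & 1 \end{pmatrix} \begin{pmatrix} e & p & & \\ p & e & & \\ & & 1 & \\ & & & 1 \end{pmatrix} \begin{pmatrix} 1 & & & \\ & C & S & \\ & -S & C & \\ & & & 1 \end{pmatrix} . \tag{7.62}$$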

Using $D\, L_p = L_{Dp}\, D$ (6.37) exhibits the vierbein of the orbiting, rotationfree observer to rotate in the reverse sense with angular velocity (e − 1)ω as compared to an observer, who reaches the same velocity with a pure boost $L_{D_{\omega\tau} p}$,

$$(e_0, e_1, e_2, e_3) = L_{D_{\omega\tau} p}\, D^{-1}_{(e-1)\omega\tau} . \tag{7.63}$$

In particular, after each revolution τ = 2π/ω the basis $e_1' = e_1 \cos\delta - e_2 \sin\delta$ and $e_2' = e_1 \sin\delta + e_2 \cos\delta$ is rotated by $\delta = 2\pi (e - 1) = 2\pi \bigl(1/\sqrt{1 - v^2} - 1\bigr)$ in the reverse direction. The angle δ is the hyperbolic area (7.18) of the plane orbit in the velocity space $H^3$. The velocity of the earth in orbit around the sun is small as compared to the velocity of light, $v = 1.00 \cdot 10^{-4}$. So the area is sufficiently approximated by its Euclidean value $\delta = \pi v^2$. If the directions to the stars have not changed otherwise during the year, i.e. if their motion can be neglected, they appear according to Special Relativity rotated forwards in the Fermi-Walker transported basis. The rotation is caused by the curvature of $H^3$.

However, in the analysis of the orbit of the earth one has to account also for gravity and its related curvature of spacetime. This is beyond the scope of Special Relativity. The analogous investigation in General Relativity of Fermi-Walker transport in free fall around a gravitating mass with Schwarzschild radius $r_0$ in distance r yields the de Sitter precession $\delta = 2\pi \bigl(\sqrt{1 - (3 r_0)/(2 r)} - 1\bigr)$ per revolution. This is the main contribution to the observed yearly precession [10] of the normal vector of the orbit of satellites orbiting the earth. In addition, the vector precesses by the Thirring-Lense effect because the rotation of the earth around its axis contributes to its gravitational field.
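The statements about the vierbein (7.60) lend themselves to a numerical check. The following Python sketch is only an illustration under stated assumptions: the metric is taken as η = diag(1, −1, −1, −1), the derivatives in (7.61) are approximated by central differences, and the function names as well as the sample values r = 1, ω = 0.5 and τ = 0.7 are invented for the example.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-)

def dot(u, v):
    """Minkowski scalar product of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, u, v))

def vierbein(tau, r, omega):
    """Vierbein (7.60) on the circular worldline (7.58)."""
    p = r * omega
    e = math.sqrt(1.0 + p * p)
    c, s = math.cos(omega * tau), math.sin(omega * tau)
    C, S = math.cos(e * omega * tau), math.sin(e * omega * tau)
    return (
        (e, p * c, p * s, 0.0),
        (p * C, S * s + e * C * c, -S * c + e * C * s, 0.0),
        (p * S, -C * s + e * S * c, C * c + e * S * s, 0.0),
        (0.0, 0.0, 0.0, 1.0),
    )

def orthonormality_error(frame):
    """Largest deviation of e_a . e_b from eta_ab, cf. (7.47)."""
    return max(
        abs(dot(frame[a], frame[b]) - (ETA[a] if a == b else 0.0))
        for a in range(4) for b in range(4)
    )

def transport_error(tau, r, omega, h=1e-5):
    """Largest deviation of central-difference derivatives from (7.61)."""
    e = math.sqrt(1.0 + (r * omega) ** 2)
    f1 = -r * omega**2 * math.sin(e * omega * tau)
    f2 = r * omega**2 * math.cos(e * omega * tau)
    e0, e1, e2, _ = vierbein(tau, r, omega)
    plus, minus = vierbein(tau + h, r, omega), vierbein(tau - h, r, omega)
    expected = (
        tuple(f1 * x + f2 * y for x, y in zip(e1, e2)),  # d e_0 / d tau
        tuple(f1 * x for x in e0),                       # d e_1 / d tau
        tuple(f2 * x for x in e0),                       # d e_2 / d tau
    )
    err = 0.0
    for a in range(3):
        for m in range(4):
            numeric = (plus[a][m] - minus[a][m]) / (2.0 * h)
            err = max(err, abs(numeric - expected[a][m]))
    return err

def thomas_angle(v):
    """Rotation angle 2 pi (1/sqrt(1 - v^2) - 1) per revolution."""
    return 2.0 * math.pi * (1.0 / math.sqrt(1.0 - v * v) - 1.0)

frame = vierbein(0.7, 1.0, 0.5)
print(orthonormality_error(frame))
print(transport_error(0.7, 1.0, 0.5))
print(thomas_angle(1.0e-4), math.pi * 1.0e-8)
```

The first two numbers should vanish up to rounding and discretization errors. The last line compares the exact angle for v = 10⁻⁴ with its Euclidean approximation πv².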

Chapter 8

Relativistic Quantum Theory

Quantum theory associates to each physical state of a system a ray, usually represented by a normalized vector Ψ, in a Hilbert space $\mathcal H$ and predicts the probabilities of the results of all conceivable measurements. In the simplest case the results $a_i$ of the measuring device A can be enumerated by some discrete index i. The device A is perfect if for each result $a_i$ there exists a state $\Lambda_i$, the corresponding eigenstate of A, which yields $a_i$ with certainty.¹ The result is nondegenerate, if no finer measurement can distinguish several eigenstates. Then the probability for $a_i$ to result if the state Ψ is measured with a perfect apparatus A is given by

$$w(i, A, \Psi) = |\langle \Lambda_i | \Psi \rangle|^2 . \tag{8.1}$$

In particular, $w(i, A, \Lambda_j) = |\langle \Lambda_i | \Lambda_j \rangle|^2 = \delta_{ij}$. The basic equation (8.1) is sometimes called the Copenhagen interpretation of quantum theory, as if one could understand quantum physics differently. But no formulation of quantum theory can do without this statement of what the physical properties are in case that the system is represented mathematically by Ψ. The fundamental equation of quantum theory has no simpler explanation. The polarization measurements of pairs of photons violate Bell's inequality (section 1.4), showing that the results are not determined until the measurement. This rules out all deterministic explanations as to how and when it is that a result becomes certain.

To recapitulate the mathematical structure in (8.1): a Hilbert space $\mathcal H$ is a separable, complete, complex vector space with a scalar product, $\langle\, \cdot\, |\, \cdot\, \rangle : \mathcal H \times \mathcal H \to \mathbb C$. It is linear in the second argument and conjugate symmetric, i.e. for all $c_1, c_2 \in \mathbb C$ and $\Lambda, \Psi_1, \Psi_2 \in \mathcal H$ one has

$$\langle \Lambda | \Psi_1 c_1 + \Psi_2 c_2 \rangle = \langle \Lambda | \Psi_1 \rangle\, c_1 + \langle \Lambda | \Psi_2 \rangle\, c_2 , \tag{8.2a}$$
$$\langle \Lambda | \Psi \rangle = \langle \Psi | \Lambda \rangle^* . \tag{8.2b}$$

Accordingly the scalar product is conjugate linear in the first argument,

$$\langle c_1 \Psi_1 + c_2 \Psi_2 | \Lambda \rangle = c_1^* \langle \Psi_1 | \Lambda \rangle + c_2^* \langle \Psi_2 | \Lambda \rangle . \tag{8.3}$$

¹ Two consecutive polarization filters with oblique polarization directions constitute a simple example of an imperfect device: One cannot prepare photons which surely pass.

As the scalar product is only linear with respect to real factors of the first argument, it is called a sesquilinear form from Latin ‘sesqui’ for one and a half.

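The scalar product of a vector with itself is positive definite (so $\langle \Lambda_i | \Lambda_j \rangle = \delta_{ij}$ and $\Psi = \sum_j \Lambda_j \langle \Lambda_j | \Psi \rangle$) and defines the norm and the topology in Hilbert space

$$0 \le \langle \Psi | \Psi \rangle = \|\Psi\|^2 < \infty , \quad \|\Psi\| = 0 \Leftrightarrow \Psi = 0 . \tag{8.4}$$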

Each Hilbert space is complete by definition, Cauchy sequences converge. If a complex vector space with a positive definite scalar product is not complete, then its Cauchy completion, the set of equivalence classes of Cauchy sequences, is. To be separable requires $\mathcal H$ to be the Cauchy completion of a denumerable set of vectors. As the probabilities (8.1) add to $\sum_i w(i, A, \Psi) = 1$, each physical state Ψ corresponds to a normalized vector, $\langle \Psi | \Psi \rangle = 1$. This normalized vector is unique only up to a phase as Ψ and $e^{i\alpha} \Psi$, $\alpha \in \mathbb R$, yield the same probability distributions.

An operator H is a linear map of a dense subspace of $\mathcal H$, its domain of definition D(H), to $\mathcal H$. Its hermitean adjoint is the operator $H^\star$ defined by swapping sides in the scalar product

$$\langle \Lambda | H \Psi \rangle = \langle H^\star \Lambda | \Psi \rangle \tag{8.5}$$

implying $c^\star = c^*$ for complex numbers, $(AB)^\star = B^\star A^\star$ and $D(H) \subseteq D(H^\star)$. We admit to the carelessness with which many physicists disregard the distinction of symmetric, self adjoint and essentially self adjoint operators; for a correct treatment see [50, 53]. An operator U is unitary, if it is invertible and leaves all scalar products invariant,

$$\langle U \Lambda | U \Psi \rangle = \langle \Lambda | \Psi \rangle , \quad U^\star U = U U^\star = 1 . \tag{8.6}$$
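The probability rule (8.1) and the irrelevance of the phase of a normalized vector can be illustrated in the two-dimensional Hilbert space ℂ². The following Python sketch is only an example: the basis, the state and the phase are hypothetical values chosen here, and the function names are not taken from the text.

```python
import cmath
import math

def inner(phi, psi):
    """Scalar product: conjugate linear in the first argument and
    linear in the second argument, cf. (8.2) and (8.3)."""
    return sum(x.conjugate() * y for x, y in zip(phi, psi))

def probabilities(basis, psi):
    """Probabilities w(i, A, psi) = |<Lambda_i|psi>|^2 of (8.1)."""
    return [abs(inner(lam, psi)) ** 2 for lam in basis]

# Hypothetical orthonormal eigenstates of a perfect device A.
h = 1.0 / math.sqrt(2.0)
basis = [(complex(h), complex(h)), (complex(h), complex(-h))]

# A normalized state and the same state multiplied by a phase.
psi = (complex(0.6), complex(0.0, 0.8))
phase = cmath.exp(0.3j)
psi_phase = tuple(phase * x for x in psi)

w = probabilities(basis, psi)
w_phase = probabilities(basis, psi_phase)
print(w, sum(w))
print(w_phase)
```

The probabilities add up to one because Ψ is normalized, and they do not change if Ψ is multiplied by a phase.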

An operator is self adjoint or hermitean if $H = H^\star$ and $D(H) = D(H^\star)$. It generates a one parameter group of unitary transformations $U_t : \Psi \mapsto e^{-iHt} \Psi$, $U_t U_s = U_{t+s}$, $U_0 = 1$. Vice versa, if a one parameter group of unitary transformations $U_t$ acts continuously, $\lim_{t \to t'} U_t \Psi = U_{t'} \Psi$, then the group is generated by a skew hermitean operator, $(-iH)^\star = (iH)$, defined by $-iH\Psi = \lim_{t \to 0} (U_t \Psi - \Psi)/t$ on states Ψ for which $t \mapsto U_t \Psi$ is differentiable. An invertible, additive map A is antiunitary if $\langle A\Lambda | A\Psi \rangle = \langle \Psi | \Lambda \rangle$. So it is conjugate linear, $A(c_1 \Psi_1 + c_2 \Psi_2) = c_1^* (A\Psi_1) + c_2^* (A\Psi_2)$.

The Hilbert space of a particle with mass m > 0 and spin s consists of measurable and square integrable wave functions

$$\Psi : M_m \to \mathbb C^{2s+1} \tag{8.7}$$

which map the mass shell $M_m$ of momenta p, $p^0 = \sqrt{m^2 + \mathbf p^2}$, to a complex vector space $\mathbb C^{2s+1}$, the space of spin. The scalar product of two states is

$$\langle \Phi | \Psi \rangle = \int_{M_m} \widetilde{dp} \sum_{n=-s}^{s} \Phi_n^*(p)\, \Psi_n(p) = \int_{M_m} \widetilde{dp}\ \Phi^{*\,T}(p)\, \Psi(p) \tag{8.8}$$

with a Lorentz invariant measure (5.116) and the conventional normalization factors of quantum field theory,

$$\widetilde{dp} = \frac{d^3p}{(2\pi)^3\, 2 \sqrt{m^2 + \mathbf p^2}} , \quad \widetilde{d}(\Lambda p) = \widetilde{dp} . \tag{8.9}$$

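The probability $w(n, \mathcal U, \Psi)$ to obtain the spin n and a momentum p in a domain $\mathcal U$ if measuring the state Ψ is given by

$$w(n, \mathcal U, \Psi) = \int_{\mathcal U} \widetilde{dp}\ |\Psi_n(p)|^2 . \tag{8.10}$$

So the momentum amplitude $\hat\Psi_n$

$$\hat\Psi_n(p) = \frac{\Psi_n(p)}{(2\pi)^{\frac{3}{2}} \sqrt{2}\, (m^2 + \mathbf p^2)^{\frac{1}{4}}} \tag{8.11}$$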

is a half density of the momentum p. Its modulus squared is the probability density for the spin value n and momentum p per volume $d^3p$. The map ˆ relates unitarily the functions Ψ in $L^2(M_m, \widetilde{dp}; \mathbb C^{2s+1})$, which are square integrable with respect to $\widetilde{dp}$, to functions $\hat\Psi$ in $L^2(\mathbb R^3, d^3p; \mathbb C^{2s+1})$.

Wave functions Ψ which differ only in a set of vanishing measure represent the same vector in $\mathcal H$ because ‖Ψ‖ = 0 if and only if Ψ = 0 (8.4). Such an equivalence class of wave functions contains at most one continuous function.

Quantum theory ensues from (8.1) together with the assumption, that the time evolution of states, i.e. the paths t ↦ Ψ(t) which they traverse in Hilbert space in the course of time t, is linear in the initial state. To be consistent with the probability statement (8.1) the linear operators $U_t$ which map the initial state Ψ(0) to its time evolved state Ψ(t) at time t

$$\Psi(t) = U_t\, \Psi(0) \tag{8.12}$$

have to be unitary. If they constitute a one parameter group, $U_t U_s = U_{t+s}$, then the paths t ↦ Ψ(t) with a differentiable time evolution satisfy the Schrödinger equation² (in units with ħ = 1)

$$i\, \partial_t \Psi(t) = H\, \Psi(t) \tag{8.13}$$

with a time independent hermitean generator of the time evolution, the Hamiltonian H,

$$H = i\, \partial_t U_t|_{t=0} , \quad H = H^\star , \quad U_t = e^{-itH} . \tag{8.14}$$

² The derivative $\partial_t$ acts on paths in Hilbert space, not on states, $\partial_t$ is no operator of $\mathcal H$. The Schrödinger equation does not determine states but paths in Hilbert space.

The nontrivial content of the Schrödinger equation in each quantum model is the choice of the appropriate Hamiltonian acting in an appropriate Hilbert space. The eigenvalues E of the Hamiltonian are the possible energies of the states.

The Schrödinger equation seems to be nonrelativistic and to violate relativistic invariance as space (if at all) and time enter in a completely different way. But for relativistic invariance of the time evolution it is only necessary that Poincaré transformations map to each other the physical paths in Hilbert space. More precisely, Poincaré transformations have to map the probabilities w(i, A, Ψ(t)) (8.1) to themselves, as they constitute all observable properties of the system.

The states Ψ correspond to rays, the sets $\{ c\Psi : c \in \mathbb C \}$, represented by normalized vectors which are unique up to a phase. Also the eigenvectors $\Lambda_i$ of each device A (8.1) are unique only up to a nonvanishing factor. So in a relativistic theory Poincaré transformations have to map rays of a Hilbert space $\mathcal H$ to rays leaving all probabilities $|\langle \Lambda | \Psi \rangle|^2 / \bigl(\langle \Lambda | \Lambda \rangle \langle \Psi | \Psi \rangle\bigr)$ invariant. Wigner's theorem [68, page 233] states that each probability preserving map of rays corresponds to a unitary map or an antiunitary map, which is unique up to a phase, just as in geometry each length preserving transformation of triangles, which keeps the origin fixed, is a rotation or a rotary reflection.

As the square of a unitary or antiunitary map is unitary and as each translation or boost or rotation in the Poincaré group is a square, each of its probability preserving realizations corresponds to unitary transformations $U_g$. They are unique up to a phase, $U_g' = \exp(i\alpha(g))\, U_g$. The product $U_{g_1} U_{g_2}$ of two such unitary transformations has to yield $U_{g_1 g_2}$ up to a phase, i.e. the map $g \mapsto U_g$ constitutes a unitary ray representation

$$U_{g_1} U_{g_2} = e^{i\omega(g_1, g_2)}\, U_{g_1 g_2} . \tag{8.15}$$

For example, the map (6.47) which attributes to each rotation around the axis n by the angle α the matrix $U_{\alpha n}$ is a ray representation of SO(3) in SU(2). Associativity restricts the phases

$$\omega(g_1, g_2 g_3) + \omega(g_2, g_3) = \omega(g_1, g_2) + \omega(g_1 g_2, g_3) \tag{8.16}$$

and a redefinition of $U_g$ changes them to

$$\omega'(g_1, g_2) = \omega(g_1, g_2) + \alpha(g_1) + \alpha(g_2) - \alpha(g_1 g_2) . \tag{8.17}$$
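The phase in (8.15) can be made explicit for rotations. The following Python sketch assumes that the map (6.47) has the standard form U = cos(α/2) − i sin(α/2) n·σ with the Pauli matrices σ; this form, the function names and the chosen angles are assumptions of the example, not statements of the text. Two rotations by π around the 3-axis compose to the rotation by 2π, which is the identity in SO(3), while the product of their SU(2) matrices is −1.

```python
import math

def su2(alpha, n):
    """Assumed form of (6.47): cos(alpha/2) - i sin(alpha/2) n.sigma
    for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    c, s = math.cos(alpha / 2.0), math.sin(alpha / 2.0)
    return ((complex(c, -s * nz), complex(-s * ny, -s * nx)),
            (complex(s * ny, -s * nx), complex(c, s * nz)))

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

z_axis = (0.0, 0.0, 1.0)
half_turn = su2(math.pi, z_axis)
product = matmul(half_turn, half_turn)  # represents the rotation by 2 pi
identity = su2(0.0, z_axis)             # representative of the identity

# Phase exp(i omega(g1, g2)) in (8.15) for g1 = g2 = half turn.
phase = product[0][0] / identity[0][0]
print(product)
print(phase)
```

The phase −1 cannot be removed by a redefinition (8.17) within SO(3). It disappears if one represents the cover SU(2) instead.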

In case of the cover of the restricted Poincaré group one can choose α such that ω′ vanishes [26, page 483]. To the probability preserving transformations of rays there corresponds a unitary representation of the cover of the restricted Poincaré group. We therefore call a quantum theory relativistic if and only if its Hilbert space carries a unitary, faithful representation of the cover of the restricted Poincaré group.

This unitary representation imparts physical properties to the vectors in Hilbert space. In particular, the unitary representation $U_a = e^{iP \cdot a}$ of translations $T_a x = x + a$ is generated by four commuting, hermitean operators $P = (P_0, P_1, P_2, P_3)$. Therefore, by definition, P is the four-momentum. Eigenstates of P correspond to systems, such as elementary or composite particles, with definite energy and momentum.

8.1 Induced Representations of the Poincaré Group

Mackey's theorems classify the unitary representations $U : g \mapsto U_g$, $U_g : \mathcal H \to \mathcal H$, of the Poincaré group P, for which the matrix elements $\langle \Phi | U_g \Psi \rangle$ are measurable functions of P to $\mathbb C$ for each $\Phi, \Psi \in \mathcal H$ [38]. Each U decomposes, uniquely up to unitary equivalence, into the sum or integral of induced irreducible representations. To describe the unitary, irreducible representations denote the restricted Lorentz group by G, let H ⊂ G be the stabilizer of a fixed momentum $\underline p \in \mathbb R^{p,q}$,

$$H = \bigl\{ h \in G : h \underline p = \underline p \bigr\} , \tag{8.18}$$

and let R represent H unitarily and irreducibly in a Hilbert space V, e.g. $V = \mathbb C^{2s+1}$,

$$R(h_2)\, R(h_1) = R(h_2 h_1) . \tag{8.19}$$

Denote by $\mathcal H_R$ the space of functions Ψ : G → V which satisfy

$$\Psi(gh) = R(h^{-1})\, \Psi(g) \quad \forall g \in G ,\ h \in H . \tag{8.20}$$

$\mathcal H_R$ is mapped linearly to itself by the multiplicative representation $U_a$ of translations

$$(U_a \Psi)(g) = e^{i p a}\, \Psi(g) , \quad p = g \underline p , \tag{8.21}$$

and by the adjoint transformation $U_g$ of Lorentz transformations (A.35)

$$(U_g \Psi)(g') = \Psi(g^{-1} g') . \tag{8.22}$$

One easily confirms that $U_a$ and $U_g$ (8.21, 8.22) represent the Poincaré group, $U_a U_b = U_{a+b}$, $U_g U_{g'} = U_{g g'}$ and $U_g U_a = U_b U_g$ with b = ga. This representation is called induced by the representation R of the stabilizer H. As R is unitary, the scalar product of the values at g of two functions $\Phi, \Psi \in \mathcal H_R$

$$\bigl(\Phi(g), \Psi(g)\bigr) = \bigl(R(h^{-1})\, \Phi(g), R(h^{-1})\, \Psi(g)\bigr) = \bigl(\Phi(gh), \Psi(gh)\bigr) \tag{8.23}$$

is constant in each left coset gH. Each coset corresponds uniquely to a momentum $p = g \underline p$ of the mass shell M = G/H, so one can define a scalar product in $\mathcal H_R$ by

$$\langle \Phi | \Psi \rangle = \int_M \widetilde{dp}\ \bigl(\Phi | \Psi\bigr)_p \tag{8.24}$$

where $\widetilde{dp} = \widetilde{d}(gp)$ is the Lorentz invariant measure (5.116). Equipped with this scalar product $\mathcal H_R$ becomes a Hilbert space and U a unitary representation

$$\int_M \widetilde{d}(gp)\ \bigl(U_g \Phi | U_g \Psi\bigr)_{gp} = \int_M \widetilde{d}(gp)\ \bigl(\Phi | \Psi\bigr)_p = \int_M \widetilde{dp}\ \bigl(\Phi | \Psi\bigr)_p . \tag{8.25}$$

Two such irreducible representations are equivalent if and only if the inducing representations R are equivalent and their mass shells coincide.

Each function $\Psi \in \mathcal H_R$ is completely determined within each coset gH by its value in one point, $\Psi(gh) = R(h^{-1})\, \Psi(g)$ (8.20), and each coset gH corresponds one to one to a momentum $p = g \underline p = g h \underline p \in M$. Therefore, if one chooses for each momentum p a Lorentz transformation σ(p), which maps $\underline p$ to $\sigma(p) \underline p = p$, then each function $\Psi \in \mathcal H_R$ defines a momentum wave function ψ(p) = Ψ(σ(p)). The other way round, to each momentum wave function ψ there corresponds the function $\Psi(g) = R\bigl(g^{-1} \sigma(g \underline p)\bigr)\, \psi(g \underline p)$ in $\mathcal H_R$.

However, in D = 4 the massless shell $M_0$ does not allow a continuous choice σ(p) of Lorentz transformations, which for all $p \in M_0$ map $\underline p$ to $p = \sigma(p) \underline p$. One has to cover the massless shell by several open domains $\mathcal U_\alpha \subset M$, $\cup_\alpha\, \mathcal U_\alpha = M_0$. We enumerate them by Greek indices which we exempt from the summation convention. A section corresponding to such a cover of M consists of maps $\sigma_\alpha : \mathcal U_\alpha \to G$, the local sections, which in their domain $\mathcal U_\alpha$ are a right inverse of the projection

$$\pi : G \to M , \quad g \mapsto g \underline p \tag{8.26}$$

i.e. $\sigma_\alpha$ assigns smoothly to $p \in \mathcal U_\alpha$ a transformation $\sigma_\alpha(p)$ which maps $\underline p$ to p,

$$\sigma_\alpha(p)\, \underline p = p , \quad \pi \sigma_\alpha = \mathrm{id}_{\mathcal U_\alpha} . \tag{8.27}$$

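If already one domain alone covers M, then the section σ is called global and the group G can be viewed as the product M × H of the base manifold with a fiber H. In their common domain local sections differ by their transition function $h_{\alpha\beta}$

$$\sigma_\beta(p) = \sigma_\alpha(p)\, h_{\alpha\beta}(p) , \quad h_{\alpha\beta}^{-1}(p) = h_{\beta\alpha}(p) , \quad h_{\alpha\beta} : \mathcal U_\alpha \cap \mathcal U_\beta \to H . \tag{8.28}$$

The composition with a section makes each function $\Psi \in \mathcal H_R$ (8.20) a momentum wave function, i.e. a section of the V bundle over the mass shell consisting of the collection of functions $\psi_\alpha$,

$$\psi_\alpha = \Psi \circ \sigma_\alpha : \mathcal U_\alpha \to V \tag{8.29}$$

which in their common domains $\mathcal U_\alpha \cap \mathcal U_\beta$ are related by (8.20)

$$\psi_\beta = \Psi(\sigma_\alpha h_{\alpha\beta}) = R\bigl(h_{\alpha\beta}^{-1}\bigr)\, \psi_\alpha = R(h_{\beta\alpha})\, \psi_\alpha . \tag{8.30}$$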

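Vice versa, a collection $\psi_\alpha$ with such transitions defines a function $\Psi \in \mathcal H_R$

$$\Psi(g) = R\bigl(g^{-1} \sigma_\alpha(g \underline p)\bigr)\, \psi_\alpha(g \underline p) \overset{8.30,\, 8.28}{=} R\bigl(g^{-1} \sigma_\beta(g \underline p)\bigr)\, \psi_\beta(g \underline p) . \tag{8.31}$$

So momentum wave functions correspond one to one to functions $\Psi \in \mathcal H_R$, and $U_g \Psi$ and $U_a \Psi$ correspond to the transformations of the momentum wave functions. Left multiplication by g maps $\sigma_\alpha(p)$ to $g \sigma_\alpha(p)$ in the fiber over some domain $\mathcal U_\beta$, where it is related to $\sigma_\beta(gp)$ by an H-transformation, the Wigner rotation $W_{\beta\alpha}(g, p) \in H$,

$$g\, \sigma_\alpha(p) = \sigma_\beta(g p)\, W_{\beta\alpha}(g, p) , \quad W_{\beta\alpha}(g, p) = \sigma_\beta(g p)^{-1}\, g\, \sigma_\alpha(p) . \tag{8.32}$$

It enters the transformation of the momentum wave functions as follows,

$$\bigl(U_g \psi\bigr)_\beta(gp) \overset{8.29}{=} \bigl(U_g \Psi\bigr)(\sigma_\beta(gp)) \overset{8.32}{=} \bigl(U_g \Psi\bigr)\bigl(g \sigma_\alpha(p) W_{\beta\alpha}^{-1}\bigr) \overset{8.20}{=} R(W_{\beta\alpha})\, \bigl(U_g \Psi\bigr)(g \sigma_\alpha(p)) \overset{8.22}{=} R(W_{\beta\alpha})\, \Psi(\sigma_\alpha(p)) \overset{8.29}{=} R(W_{\beta\alpha})\, \psi_\alpha(p) ,$$

$$\bigl(U_g \psi\bigr)_\beta(gp) = R\bigl(W_{\beta\alpha}(g, p)\bigr)\, \psi_\alpha(p) . \tag{8.33}$$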

These transformations of momentum wave functions represent G, as the Wigner rotations satisfy for $p \in \mathcal U_\alpha$, $g_1 p \in \mathcal U_\beta$ and $g_2 g_1 p \in \mathcal U_\gamma$ by (8.32)

$$W_{\gamma\alpha}(g_2 g_1, p) = W_{\gamma\beta}(g_2, g_1 p)\, W_{\beta\alpha}(g_1, p) . \tag{8.34}$$

One can drop the denomination of the coordinate patches in (8.33)

$$\bigl(U_g \psi\bigr)(gp) = R\bigl(W(g, p)\bigr)\, \psi(p) , \tag{8.35}$$

if one agrees to read equations about values of sections to apply by definition in the coordinate patches which contain the arguments.

Operators R are called multiplicative if they multiply each wave function with a possibly momentum dependent matrix, (Rψ)(p) = R(p)ψ(p). Under composition with any map g : q ↦ p = g(q) they and their products satisfy

$$(R\psi) \circ g = (R \circ g)(\psi \circ g) , \quad (R_2 R_1) \circ g = (R_2 \circ g)(R_1 \circ g) . \tag{8.36}$$

In terms of the multiplicative operators $R_g$, defined by $(R_g \psi)(p) = R(W(g, p))\, \psi(p)$, the induced representation $U_g$ (8.33) and the representation R of (8.34) simply read

$$U_g \psi = (R_g \psi) \circ g^{-1} , \quad R_{g_2 g_1} = (R_{g_2} \circ g_1)\, R_{g_1} . \tag{8.37}$$

Lorentz transformations Λ sufficiently close to the identity are invertibly related by the exponential map to the elements ω of a neighbourhood of 0 of the Lorentz algebra, $\Lambda = e^\omega$. Their unitary representation $U_{e^\omega} = e^{-iM_\omega}$, applied to states Ψ, which transform smoothly, determines the generators $-iM_\omega = -\frac{i}{2} \omega^{mn} M_{mn}$ [53]

$$-iM_\omega \Psi = \lim_{t \to 0} \frac{(U_{e^{t\omega}} \Psi - \Psi)}{t} . \tag{8.38}$$

In physics, one widely assumes the power series expansion

$$U_{e^\omega} \Psi(p) = \lim_{N \to \infty} \sum_{k=0}^{N} \frac{1}{k!} (-iM_\omega)^k\, \Psi(p) \tag{8.39}$$

but this restricts the smooth wave functions Ψ to be uniformly analytic: if ω varies in a complementary subspace of the algebra of the stabilizer of p ≠ 0 then the bijection $\omega \mapsto e^\omega p$ defines coordinates ω for a neighbourhood of p. There $\Psi(e^{-\omega} p)$ is given by a power series $R^{-1}\bigl(W(e^\omega, e^{-\omega} p)\bigr) \sum_k \frac{1}{k!} (-iM_\omega)^k\, \Psi(p)$ (8.33) in the coordinates ω.

For $(-iM_\omega)^k \Psi$ to exist for all k the state Ψ has to allow the application of arbitrary polynomials in $M_{mn}$. This requires the momentum wave function Ψ to be a Schwartz function, i.e. to decrease with all its derivatives faster than any inverse polynomial of the momenta. For each multiindex $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ and $\beta = (\beta_1, \beta_2, \beta_3) \in \mathbb N_0^3$ there have to exist constants $C_{\alpha,\beta}$ such that

$$|p^\alpha\, d^\beta\, \Psi(p)| < C_{\alpha,\beta} \tag{8.40}$$

for all $p \in M_m$, where $p^\alpha = p_1^{\alpha_1} p_2^{\alpha_2} p_3^{\alpha_3}$ and $d^\beta = (\partial_{p_1})^{\beta_1} (\partial_{p_2})^{\beta_2} (\partial_{p_3})^{\beta_3}$.

Provided the maps $g \mapsto U_g \Psi$ of G to $\mathcal H$ are measurable for all Ψ, then the states

$$\Psi_f = \int_G d\mu_g\, f(g)\, U_g \Psi , \tag{8.41}$$

averaged with an invariant measure $d\mu_g$ and with smooth functions $f : G \to \mathbb C$ of compact support, exist and transform by

$$U_g \Psi_f = \Psi_{f \circ g^{-1}} . \tag{8.42}$$

Consequently they are infinitely often differentiable with respect to the coordinates ω . The space spanned by these states, the G˚arding space, is invariant under Ug and dense in H . It is the core of the skew hermitean generators −iMω , i.e. the common domain of their algebra [53]. That such a core exists under the above measurability condition justifies physicists who manipulate the unbounded generators purely algebraically. But it shows also maps from smooth to discontinuous functions to be incompatible with the algebra of the generators.

8.2 Vacuum

The orbit of p = 0 consists of p only and its stabilizer is the Lorentz group. Its only finite-dimensional, irreducible, unitary representation is trivial, R(g) = 1. The state Ω, ⟨Ω|Ω⟩ = 1, which spans the representation space, is identified by its invariance $U_a \Omega = \Omega$, $U_g \Omega = \Omega$ under the Poincaré group: it is the vacuum, the only state which looks the same to all observers. Because of PΩ = pΩ = 0 the energy of the vacuum vanishes in relativistic quantum theory, unheeding all unnecessary attempts at its calculation. If the vacuum had an energy, then in relativistic quantum theory there would have to exist Lorentz transformed vacua with Lorentz transformed energy and momentum, i.e. the name vacuum would simply turn out to be a misnomer.

That the vacuum fluctuates is a myth which culminates in Sokal's hoax that in quantum gravity π fluctuates. Before a generic measurement one only knows the probabilities (8.1) of the various outcomes. The single results, not the state, fluctuate. This is what quantum fluctuations mean at bottom. The particle number of the vacuum, its charge, energy, momentum and angular momentum, however, are certain and vanish.

8.3 Massive Representations

The mass shell of a particle with mass m > 0 is the orbit of $\underline p = (m, 0, 0, 0)$,

$$M_m = \bigl\{ (p^0, \mathbf p) : p^0 = \sqrt{m^2 + \mathbf p^2} ,\ \mathbf p \in \mathbb R^3 \bigr\} . \tag{8.43}$$

The stabilizer of $\underline p$ is the group of rotations, H = SO(3). Each of its irreducible, unitary representations R is determined by its dimension 2s + 1, where s, the spin of the particle, is nonnegative and half integer or integer (appendix B.5). The manifold $SO(1, 3)^\uparrow$ is the product $SO(3) \times \mathbb R^3$ as each Lorentz transformation $\Lambda = L_p O$ can be uniquely decomposed into a boost $L_p$ and a rotation O (6.24). So there exists a global section

$$\sigma : M_m \to SO(1, 3)^\uparrow , \quad \sigma(p) = L_p , \quad L_p\, \underline p = p , \tag{8.44}$$

and we need no index to distinguish several local sections. Composed with this section, the functions $\Psi \in \mathcal H_R$ become wave functions of $M_m$. For $g_t = e^{t\omega}$ the derivative of the left side of (8.33) with respect to t at t = 0 yields $(\omega p)^i \partial_i \Psi - iM_\omega \Psi$.³ Rotations $D_t = e^{t\omega}$ coincide with their Wigner rotation $W(D_t, p)$ (6.37). Let them be unitarily represented by $R(e^{t\omega})\, \Psi(p) = \exp\bigl(\frac{t}{2} \omega^{ij} \Gamma_{ij}\bigr)\, \Psi(p)$ with skew hermitean matrices $\Gamma_{ij}$, which represent the Lie algebra of rotations (6.23) (i, j, k, l ∈ {1, 2, 3})

$$[\Gamma_{ij}, \Gamma_{kl}] = \delta_{ik} \Gamma_{jl} - \delta_{il} \Gamma_{jk} - \delta_{jk} \Gamma_{il} + \delta_{jl} \Gamma_{ik} , \quad \Gamma_{ij} = -\Gamma_{ij}^\dagger = -\Gamma_{ji} . \tag{8.45}$$
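For spin 1/2 the rotation algebra can be checked explicitly. The following Python sketch assumes the realization $\Gamma_{ij} = -\frac{i}{2} \varepsilon_{ijk} \sigma_k$ in terms of the Pauli matrices, which is a choice made for this example, and verifies the relation $[\Gamma_{ij}, \Gamma_{kl}] = \delta_{ik} \Gamma_{jl} - \delta_{il} \Gamma_{jk} - \delta_{jk} \Gamma_{il} + \delta_{jl} \Gamma_{ik}$ for all index combinations. All function names are hypothetical.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3 as nested tuples.
SIGMA = {
    1: ((0, 1), (1, 0)),
    2: ((0, -1j), (1j, 0)),
    3: ((1, 0), (0, -1)),
}

def eps(i, j, k):
    """Levi-Civita symbol for indices 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def add(a, b, factor=1):
    """Matrix a + factor * b for 2x2 matrices."""
    return tuple(tuple(a[r][c] + factor * b[r][c] for c in range(2))
                 for r in range(2))

def mul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2))
                       for c in range(2)) for r in range(2))

ZERO = ((0, 0), (0, 0))

def gamma(i, j):
    """Assumed spin-1/2 realization Gamma_ij = -(i/2) eps_ijk sigma_k."""
    result = ZERO
    for k in (1, 2, 3):
        result = add(result, SIGMA[k], -0.5j * eps(i, j, k))
    return result

def delta(i, j):
    return 1 if i == j else 0

def algebra_error():
    """Largest violation of the commutation relations over all indices."""
    err = 0.0
    idx = (1, 2, 3)
    for i in idx:
        for j in idx:
            for k in idx:
                for l in idx:
                    lhs = add(mul(gamma(i, j), gamma(k, l)),
                              mul(gamma(k, l), gamma(i, j)), -1)
                    rhs = ZERO
                    rhs = add(rhs, gamma(j, l), delta(i, k))
                    rhs = add(rhs, gamma(j, k), -delta(i, l))
                    rhs = add(rhs, gamma(i, l), -delta(j, k))
                    rhs = add(rhs, gamma(i, k), delta(j, l))
                    diff = add(lhs, rhs, -1)
                    err = max(err, max(abs(x) for row in diff for x in row))
    return err

print(algebra_error())
```

The printed error vanishes up to rounding. The matrices are skew hermitean because the Pauli matrices are hermitean.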

The derivative with respect to t at t = 0 is $\frac{1}{2} \omega^{ij} \Gamma_{ij} \Psi(p)$ and we obtain

$$\bigl(-iM_{ij} \Psi\bigr)(p) = -\bigl(p^i \partial_{p^j} - p^j \partial_{p^i}\bigr)\, \Psi(p) + \Gamma_{ij} \Psi(p) . \tag{8.46}$$

As $-iM_{ij}$ generate rotations,⁴ $(J^1, J^2, J^3) = (M_{23}, M_{31}, M_{12})$ are the components of the angular momentum operator J consisting of orbital angular momentum $L = -i\, \mathbf p \times \partial_{\mathbf p}$ and spin S contributed by the matrices iΓ, J = L + S, $[L^i, S^j] = 0$.

The component functions of the differential operator on the right side of (8.46) are the negative of the infinitesimal motion δp of the points p. This motion occurs on the left side of (8.33) or as inverse transformation $M_g^{-1}$ on the right side of (A.35). The infinitesimal rotation $l_{ij}$ rotates the basis vectors $e_i$ to $e_j$ (6.22). The infinitesimal boost $l_{0i}$ maps $e_0$ to $-e_i$, moves points p by $(l_{0i}\, p)^j = -p^0 \delta^j{}_i$, $p^0 = \sqrt{m^2 + \mathbf p^2}$, and changes wave functions by $p^0 \partial_{p^i}$.

³ The summation index i enumerates coordinates of the mass shell, in terms of which Lorentz transformations act smoothly, e.g. the spatial components of the momentum. Though the components and the derivatives depend on the coordinates, the differential operator is independent of coordinates. It is the restriction to the mass shell of the vector field $\sum_{m=0}^{3} (\omega p)^m \partial_m$ which is tangent to the Lorentz flow (figure 3.2) of four-momentum. The derivatives $\partial_m$, m = 0, 1, 2, 3, however, cannot be applied separately to the wave functions Ψ as they are no functions of $\mathbb R^{1,3}$.

⁴ The sign is chosen such that $U_{e^\omega} = \exp(-i\, \omega \cdot J)$ (6.51) with $(\omega^1, \omega^2, \omega^3) = (\omega^{23}, \omega^{31}, \omega^{12})$.

The Wigner rotation $W(L_q, p)$ of a boost $L_q$ has been determined in (7.43, 7.44). It rotates in the p-q-plane from p to q by the deficit angle δ of the hyperbolic triangle with vertices $(\underline p, p, L_p q)$,

$$\frac{d\delta}{db}\Big|_{b=0} = (\sin\varphi) \tanh\frac{a}{2} = (\sin\varphi) \frac{\mathrm{sh}\,a}{\mathrm{ch}\,a + 1} = (\sin\varphi) \frac{|\mathbf p|}{p^0 + m} . \tag{8.47}$$
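The rate (8.47) can be compared with the Wigner rotation computed directly from (8.32) with the section σ(p) = L_p. The following Python sketch is restricted, as a simplifying assumption, to the plane spanned by p and q, so that Lorentz transformations are 3×3 matrices. The boost matrix is the standard pure boost which maps (m, 0, 0) to p. The sample values of m, a, b and ϕ as well as all function names are hypothetical, and only the size of the rotation angle is compared.

```python
import math

def boost(p, m):
    """Pure boost L_p in 2+1 dimensions mapping (m, 0, 0) to p = (p0, p1, p2)."""
    p0, p1, p2 = p
    k = 1.0 / (m * (p0 + m))
    return [[p0 / m, p1 / m, p2 / m],
            [p1 / m, 1.0 + k * p1 * p1, k * p1 * p2],
            [p2 / m, k * p1 * p2, 1.0 + k * p2 * p2]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(a, v):
    return tuple(sum(a[i][k] * v[k] for k in range(3)) for i in range(3))

def momentum(m, rapidity, angle):
    """Momentum on the mass shell with given rapidity and direction."""
    return (m * math.cosh(rapidity),
            m * math.sinh(rapidity) * math.cos(angle),
            m * math.sinh(rapidity) * math.sin(angle))

def wigner_rotation(q, p, m):
    """W(L_q, p) = L_{p'}^{-1} L_q L_p with p' = L_q p, cf. (8.32)."""
    Lq, Lp = boost(q, m), boost(p, m)
    p_new = apply(Lq, p)
    inverse = boost((p_new[0], -p_new[1], -p_new[2]), m)  # L_{p'}^{-1}
    return matmul(inverse, matmul(Lq, Lp))

m, a, b, phi = 1.0, 0.8, 1.0e-4, 1.1  # small rapidity b of the boost L_q
p = momentum(m, a, 0.0)
q = momentum(m, b, phi)
W = wigner_rotation(q, p, m)

angle = math.atan2(W[2][1], W[1][1])  # rotation angle of the spatial block
predicted = b * math.sin(phi) * math.hypot(p[1], p[2]) / (p[0] + m)
print(abs(angle), predicted)
```

For small b both numbers agree up to corrections of relative order b, and W leaves (m, 0, 0) invariant as a rotation should.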

As the Wigner rotation of an infinitesimal boost $l_{0i}$ applied to p rotates from p to $-e_i$ in the same sense as from $e_i$ to p, it is represented by $\Gamma_{ij}\, p^j/(p^0 + m)$. Combining both infinitesimal changes, we obtain the generators of boosts of wave functions

$$\bigl(-iM_{0i} \Psi\bigr)(p) = p^0 \partial_{p^i} \Psi(p) + \Gamma_{ij} \frac{p^j}{p^0 + m} \Psi(p) . \tag{8.48}$$

The generators (8.46, 8.48) are skew hermitean with respect to the scalar product (8.8) and represent the Lorentz algebra (6.23). This is simple to check for $[-iM_{ij}, -iM_{kl}]$ and manifest for $[-iM_{ij}, -iM_{0k}]$, because $p^i$, $\partial_{p^i}$ and $\Gamma_{ij}$ transform as vectors or products of vectors under the rotations of momentum and spin; $[-iM_{0i}, -iM_{0j}] = iM_{ij}$ holds as if by miracle,

$$\Bigl[p^0 \partial_{p^i} + \Gamma_{ik} \frac{p^k}{p^0 + m} ,\ p^0 \partial_{p^j} + \Gamma_{jl} \frac{p^l}{p^0 + m}\Bigr]$$
$$= \Bigl(p^0 \frac{p^i}{p^0} \partial_{p^j} + \frac{1}{(p^0 + m)^2} \bigl(p^0 (p^0 + m)\, \delta^{il} - p^i p^l\bigr) \Gamma_{jl} - (i \leftrightarrow j)\Bigr) + \frac{p^k p^l}{(p^0 + m)^2} \bigl(\delta_{ij} \Gamma_{kl} - \delta_{il} \Gamma_{kj} - \delta_{kj} \Gamma_{il} + \delta_{kl} \Gamma_{ij}\bigr)$$
$$= -\bigl(-(p^i \partial_{p^j} - p^j \partial_{p^i}) + \Gamma_{ij}\bigr) . \tag{8.49}$$

For different masses the representations of translations are inequivalent as the spectra of P differ. However, states Ψ with mass m are unitarily related by dilation

$$\bigl(U_{e^\lambda} \Psi\bigr)(p) = e^{-\frac{D-2}{2} \lambda}\, \Psi(e^{-\lambda} p) \tag{8.50}$$

to states $U_{e^\lambda} \Psi$ with mass $e^\lambda m$. The dilation commutes with Lorentz transformations. Stereographic coordinates

$$u^i = \frac{p^i}{p^0 + m} , \quad p^i = m \frac{2 u^i}{1 - u^2} , \quad i = 1, \dots, D - 1 , \quad u^2 = u^i u^i < 1 , \tag{8.51}$$

map each massive shell to the interior $I_{D-1}$ of the unit ball. The generators of rotations in these coordinates are obtained from (8.46) by renaming p to u. The generators of boosts (8.48) are by the chain rule (Ψ(u) := Ψ(p(u)))

$$\bigl(-iM_{0i} \Psi\bigr)(u) = \Bigl(\frac{1 + u^2}{2} \partial_{u^i} - u^i u^k \partial_{u^k}\Bigr) \Psi(u) + \Gamma_{ik} u^k\, \Psi(u) . \tag{8.52}$$
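The stereographic coordinates (8.51) are easily checked numerically. The following Python sketch, with hypothetical function names and sample values, maps a spatial momentum to u, verifies u² < 1 and recovers the momentum from u. It also confirms the factor $p^0/(p^0 + m) = (1 + u^2)/2$ which appears in (8.52).

```python
import math

def to_u(p, m):
    """Stereographic coordinates u^i = p^i / (p^0 + m) of (8.51)."""
    p0 = math.sqrt(m * m + sum(x * x for x in p))
    return tuple(x / (p0 + m) for x in p)

def to_p(u, m):
    """Inverse map p^i = 2 m u^i / (1 - u^2) of (8.51)."""
    u2 = sum(x * x for x in u)
    return tuple(2.0 * m * x / (1.0 - u2) for x in u)

m = 1.5
p = (0.3, -2.0, 4.0)  # hypothetical spatial momentum
u = to_u(p, m)
u2 = sum(x * x for x in u)
p_back = to_p(u, m)
p0 = math.sqrt(m * m + sum(x * x for x in p))

print(u2)                               # inside the unit ball
print(p_back)                           # reproduces p
print(p0 / (p0 + m), (1.0 + u2) / 2.0)  # coefficient of the derivative in (8.52)
```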

8.4 Position of a Massive Particle

Though relativistic quantum theory represents the Poincaré group, the isometry group of space time, there do not exist commuting, hermitean operators $X^m$ which correspond to the measurement of time and position. Their translation

$$e^{iP \cdot a}\, X^m\, e^{-iP \cdot a} = X^m - a^m , \quad m \in \{0, 1, \dots, D - 1\} , \tag{8.53}$$

would be generated by the momentum operators $P_m$ which constitute Heisenberg pairs, i.e. satisfy the Heisenberg commutation relations

$$[P_m, P_n] = 0 = [X^m, X^n] , \quad [P_n, X^m] = i \delta_n{}^m . \tag{8.54}$$

By the Stone-von Neumann theorem [50, 53, 62], up to unitary equivalence, all irreducible representations of D Heisenberg pairs by hermitean operators act on square integrable wave functions $\Psi \in \mathcal H = L^2(\mathbb R^D, d^Dp)$ by

$$\bigl(P_m \Psi\bigr)(p) = p_m \Psi(p) , \quad \bigl(X^n \Psi\bigr)(p) = -i \frac{\partial}{\partial p_n} \Psi(p) . \tag{8.55}$$

More precisely, the largest subspace of $\mathcal H$ which is mapped to itself by the algebra generated by $X^n$ and $P_m$ and by the unitary transformations $e^{i(pX - xP)}$, $(x, p) \in \mathbb R^{2D}$, is the Schwartz space $\mathcal S(\mathbb R^D)$ of rapidly decreasing smooth functions (8.40).

Both the operators $X = (X^0, \dots, X^{D-1})$ and $P = (P_0, \dots, P_{D-1})$ have the purely continuous spectrum $\mathbb R^D$ and $P^2 = P_m P_n \eta^{nm}$ the purely continuous spectrum $\mathbb R$. This is why in an acceptable quantum theory the Hamiltonian $H = P_0$ cannot belong to a Heisenberg pair, $[P_0, X^0] = i$. Otherwise the energies would be unbounded from below. To restrict them to nonnegative values violates the hermiticity of $X^0$.

The space $\mathcal H_S = \mathcal H \times \mathcal H_N$ of one particle states of the covariant string factorizes into a Hilbert space $\mathcal H$, acted upon by Heisenberg pairs of momentum and position operators including $X^0$ and $P_0 = H$, and into $\mathcal H_N$, a denumerable sum of finite dimensional eigenspaces of a level operator N. As $P_0$ belongs to a Heisenberg pair the energy is not bounded from below. This defect is seemingly remedied by the constraint

$$\bigl(P^2 - M^2(N)\bigr)\, \Psi_{\mathrm{phys}} = 0 , \quad P^2 = P_m P_n \eta^{mn} \tag{8.56}$$

that physical states $\Psi_{\mathrm{phys}}$ vanish outside discrete mass shells. Then, however, they differ from zero only in a set of vanishing D-dimensional measure. Consequently their scalar products $\langle \Phi | \Psi_{\mathrm{phys}} \rangle_{\mathcal H_S} = \int d^Dp\ \langle \Phi^*(p) | \Psi_{\mathrm{phys}}(p) \rangle_{\mathcal H_N} = 0$ vanish for all $\Phi \in \mathcal H_S$. So each physical state vanishes, $\Psi_{\mathrm{phys}} = 0$. In other words, the operator $P_m P_n \eta^{mn}$ does not have a point spectrum if each $P_m$ belongs to a Heisenberg pair. So the covariant string predicts the absence of physical states.

The argument that (8.56) follows for physical states from the Schrödinger equation (8.13) with $H = P_0 = \sqrt{M^2 + \mathbf p^2}$ contains a subtle error. The Schrödinger equation does not hold for states but for their time evolution. But paths which states traverse in Hilbert space in the course of time have to be distinguished from the states.

Imposed on states in $L^2(\mathbb R^D, d^Dp)$ the constraint (8.56) has no solution though undeniably it holds for the free time evolution of states in $L^2(\mathbb R^{D-1}, d^{D-1}p)$.

In the lightcone gauge string theory contains D − 1 Heisenberg pairs: the transverse momenta $P_T = (P_1, \dots, P_{D-2})$, their corresponding position operators $X_T = (X^1, \dots, X^{D-2})$ and the pair $P^+$, $X^-$, $[P^+, X^-] = i$. The operator $M^2 = P^+ P^- - P_T^2$ has a discrete spectrum, so $P^- = (m^2 + P_T^2)/P^+$ on one particle states of mass m. In an appropriate basis the operators $(P_T, P^+)$ act multiplicatively and $(X_T, X^-)$ differentiate Schwartz functions $\Psi \in \mathcal S(\mathbb R^{D-1})$. $X^-$ generates translations of $p^+$

$$U_a = e^{-iX^- a} , \quad \bigl(U_a \Psi\bigr)(p_T, p^+) = \Psi(p_T, p^+ - a) . \tag{8.57}$$

The Heisenberg pairs and $P^-$ constitute a consistent algebra only if they, together with the one parameter groups which they generate, map a dense core to itself. But if on some mass shell $\Psi\in\mathcal S(\mathbb R^{D-1})$, $\Psi\neq0$, then there is a neighbourhood of a point $(p_T,a)$ where $|(m^2+p_T^2)\Psi(p)| > \varepsilon > 0$ and $\Phi = U_a\Psi$ does not vanish in a neighbourhood $U$ of $(p_T,0)$. Consequently $\Phi$ is not in the domain of $P^-$,

$$\langle P^-\Phi\,|\,P^-\Phi\rangle \ \ge\ \int_U dp^+\,d^{D-2}p\ \frac{\varepsilon^2}{(p^+)^2} = \infty\,. \tag{8.58}$$

So the Heisenberg pairs and $P^-$ have no dense core. A discrete mass spectrum, i.e. an operator $P^- = (m^2+P_T^2)/P^+$, and Heisenberg pairs $(P_T,P^+)$ and $(X_T,X^-)$, as assumed in string theory in the lightcone gauge, are inconsistent.

Different from $X^0$ or $X^-$ the spatial position operator $\mathbf X$ can be defined for massive relativistic particles. But their localization has strange features. The position operator acts multiplicatively on the Fourier transformation $\tilde\Psi$ of the momentum amplitude $\hat\Psi$ (8.11)

$$\tilde\Psi(\mathbf x) = \int\frac{d^3p}{(2\pi)^{3/2}}\ e^{i\mathbf p\,\mathbf x}\,\hat\Psi(\mathbf p) = \int\widetilde{dp}\ e^{i\mathbf p\,\mathbf x}\,\sqrt{2p^0}\,\Psi(p)\,. \tag{8.59}$$

If the state propagates freely then its Hamiltonian is $H = \sqrt{m^2+\mathbf P^2}$ and by the Schrödinger equation (8.13) its time evolution from initially $\tilde\Psi(0,\mathbf x) = \tilde\Psi(\mathbf x)$ is

$$\tilde\Psi(t,\mathbf x) = \int\widetilde{dp}\ e^{i(\mathbf p\,\mathbf x - p^0 t)}\,\sqrt{2p^0}\,\Psi(p)\,,\quad p^0 = \sqrt{m^2+\mathbf p^2}\,. \tag{8.60}$$

Upon measuring the state at time $t$, the probability to find the particle in a spatial region $U$ with the spin value $n$ is

$$w(n,U,\Psi(t)) = \int_U d^3x\ |\tilde\Psi_n(t,\mathbf x)|^2\,. \tag{8.61}$$

This is not a probability to find the time $t$ upon measuring. In relativistic quantum theory space and time are essentially different though $\mathbf x$ and $t$ enter (8.60) by the Lorentz invariant scalar product $p\cdot x$. $\tilde\Psi(t,\mathbf x)$ is not the value of a wave function $\Psi\in L^2(\mathbb R^4, d^4x)$ as it cannot be square integrable over $\mathbb R^4$. Rather, the time $t$ is the parameter of a path $t\mapsto\Psi(t)$ in the Hilbert space $L^2(\mathbb R^3, d^3x)$.

8.4 Position of a Massive Particle


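At different times the amplitudes are related by

$$\tilde\Psi(x) = \int d^3y\ D(x-y)\,\tilde\Psi(y) \tag{8.62}$$

where $x = (t,\mathbf x)$, $y = (y^0,\mathbf y)$ and

$$D(x) = \int\frac{d^3p}{(2\pi)^3}\ e^{i(\mathbf p\,\mathbf x - \sqrt{m^2+\mathbf p^2}\,t)} = -2i\,\frac{\partial}{\partial x^0}\,\Delta_+(x) \tag{8.63}$$

is the derivative of the Wightman distribution $\Delta_+$ (B.109)

$$\Delta_+(x) = \int\frac{d^3p}{(2\pi)^3\,2p^0}\ e^{-ip\cdot x} \tag{8.64}$$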

and falls off like $e^{-mr}(mr)^{-3/2}$ for large spacelike distances $r = \sqrt{x^2}$ (B.103). This is why one cannot strictly localize a particle such that its position wave function vanishes exactly, at least for some time interval, outside a spatially bounded tube in spacetime. Though one can start the time evolution with such a state, at each later moment the wave function has exponentially damped but nonvanishing tails far away. This holds in relativistic as well as in nonrelativistic quantum theory.

In relativistic physics each path of the position wave function $\tilde\Psi$ of a free particle satisfies the Klein-Gordon equation

$$(\Box + m^2)\,\tilde\Psi(x) = 0 \tag{8.65}$$

as a consequence of the free Schrödinger equation $i\partial_t\Psi = \sqrt{m^2+\mathbf P^2}\,\Psi$ and depends in a domain $U$ only on the initial values in the backward lightcone of $U$ (5.78). Nevertheless $\tilde\Psi$ does not vanish outside the forward light cone of a bounded domain outside of which it vanished initially, because the time derivative belongs to the initial data of the Klein-Gordon equation and, like an uncertainty relation, the Schrödinger equation prevents localizing both $\partial_t\tilde\Psi(0,\mathbf x)$ and $\tilde\Psi(0,\mathbf x)$.

That one has confirmed the pointlike structure of the electron up to distances of $10^{-18}\,$m does not reflect a measurement of its position with this precision but compares the local interactions of the electron to the ones of composite particles.

There is no local probability current, i.e. no jet function $\mathbf j$, which depends on $\tilde\Psi$ and $\tilde\Psi^*$ and finitely many of their spatial derivatives, which together with the probability density $j^0(x) = |\tilde\Psi|^2(x)$ satisfies a continuity equation (5.20) while the particle moves freely. The divergence of such a current $\mathbf j$ is a polynomial in the jet variables and cannot cancel $i\partial_t(\tilde\Psi^*\tilde\Psi) = \tilde\Psi^*(H\Psi)^{\sim} - (H\Psi)^{\sim *}\,\tilde\Psi$ with $H = \sqrt{m^2+\mathbf p^2}$.

The nonlocalizability has nothing to do with pair creation: pair creation occurs in interactions, not during free motion. It is the free Hamiltonian $H = \sqrt{m^2+\mathbf p^2}$ which excludes the precise localization of both $\tilde\Psi$ and $\partial_t\tilde\Psi$.

Because there is no current $\mathbf j$ which vanishes where the probability density $|\tilde\Psi|^2$ does, all attempts of Bohmian mechanics to interpret the Schrödinger equation as a description of a real motion of an initial probability distribution of particles along a velocity field $\mathbf v = \mathbf j/|\tilde\Psi|^2$ are futile.


The expectation value $\mathbf x(t) = \langle\Psi(t)|\mathbf X\Psi(t)\rangle$ of the position of a freely moving particle, its center, changes in the course of time by the expectation value of the velocity

$$\frac{d}{dt}\,x^i(t) = \langle\Psi(t)|-i[X^i,H(\mathbf p)]\,\Psi(t)\rangle = \langle\Psi(t)|\,\partial_{p^i}H(\mathbf p)\,\Psi(t)\rangle = v^i \tag{8.66}$$

which is time independent, because each function of the momenta commutes with $H$. So the center moves uniformly on a straight line, $\mathbf x(t) = \mathbf x(0) + \mathbf v t$. By a Poincaré transformation the center can be made to rest at the origin, $\mathbf x(t) = 0$, which we assume for the following consideration of the spread of its position density.

In the long run, for $t\to\infty$, such a state $\Psi(t)$ yields the position density of a cluster of classical particles [65], which moved from the origin with the velocity $v^i(\mathbf p) = \partial_{p^i}H(\mathbf p)$ to $\mathbf v t$, where the cluster's velocity density $|\hat\Psi(\mathbf p(\mathbf v))|^2\,|\det\frac{\partial p}{\partial v}|$ is determined by its momentum density $|\hat\Psi(\mathbf p)|^2$,

$$\lim_{t\to\infty} t^3\,|\tilde\Psi(t,\mathbf v t)|^2 = |\hat\Psi(\mathbf p)|^2\ \frac{(m^2+\mathbf p^2)^{5/2}}{m^2}\ \theta(1-|\mathbf v|)\,,\quad \mathbf p = \frac{m\,\mathbf v}{\sqrt{1-\mathbf v^2}}\,. \tag{8.67}$$
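The Jacobian factor in (8.67) can be checked numerically. The following Python sketch is only an illustration (the function names are chosen here and do not occur in the text): it compares a finite-difference determinant of $\partial p/\partial v$ for $\mathbf p = m\mathbf v/\sqrt{1-\mathbf v^2}$ with the closed form $(m^2+\mathbf p^2)^{5/2}/m^2$.

```python
import math

def momentum(v, m=1.0):
    """Relativistic momentum p = m v / sqrt(1 - v^2) for a 3-velocity with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [m * gamma * c for c in v]

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def jacobian_det_numeric(v, m=1.0, eps=1e-6):
    """det(dp/dv) from central finite differences."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        v_plus, v_minus = list(v), list(v)
        v_plus[j] += eps
        v_minus[j] -= eps
        p_plus, p_minus = momentum(v_plus, m), momentum(v_minus, m)
        for i in range(3):
            jac[i][j] = (p_plus[i] - p_minus[i]) / (2 * eps)
    return det3(jac)

def jacobian_det_closed(v, m=1.0):
    """Closed form (m^2 + p^2)^(5/2) / m^2 appearing in (8.67)."""
    p2 = sum(c * c for c in momentum(v, m))
    return (m * m + p2) ** 2.5 / (m * m)

if __name__ == "__main__":
    v = [0.3, -0.4, 0.5]
    print(jacobian_det_numeric(v, 2.0), jacobian_det_closed(v, 2.0))
```

Both numbers should agree up to the finite-difference error; in terms of $\gamma = 1/\sqrt{1-\mathbf v^2}$ the determinant is $m^3\gamma^5$.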

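To show this consider the expectation value $\langle\Psi(t)|e^{-i\mathbf a\mathbf X/t}\,\Psi(t)\rangle$ in terms of the position wave function $\tilde\Psi$ and the momentum wave function $\hat\Psi$,

$$\int d^3x\ |\tilde\Psi(t,\mathbf x)|^2\,e^{-i\mathbf a\mathbf x/t} = \int d^3p\ \hat\Psi^*(\mathbf p)\,\bigl(e^{iHt}\,e^{-i\mathbf a\mathbf X/t}\,e^{-iHt}\,\Psi\bigr)^{\wedge}(\mathbf p)\,. \tag{8.68}$$

On momentum wave functions $H$ acts multiplicatively, $(H\Psi)^{\wedge}(\mathbf p) = H(\mathbf p)\hat\Psi(\mathbf p)$, and $e^{-i\mathbf a\mathbf X/t}$ translates the momentum,

$$\bigl(e^{iHt}\,e^{-i\mathbf a\mathbf X/t}\,e^{-iHt}\,\Psi\bigr)^{\wedge}(\mathbf p) = e^{iH(\mathbf p)t}\,\bigl(e^{-iHt}\Psi\bigr)^{\wedge}(\mathbf p+\mathbf a/t) = e^{iH(\mathbf p)t}\,e^{-iH(\mathbf p+\mathbf a/t)t}\,\hat\Psi(\mathbf p+\mathbf a/t)\,. \tag{8.69}$$

For $t\to\infty$ this converges to

$$\lim_{t\to\infty} e^{-i(H(\mathbf p+\mathbf a/t)-H(\mathbf p))t}\,\hat\Psi(\mathbf p+\mathbf a/t) = e^{-i\mathbf a\,\mathbf v(\mathbf p)}\,\hat\Psi(\mathbf p)\,,\quad v^i(\mathbf p) = \partial_{p^i}H(\mathbf p) \tag{8.70}$$

proving

$$\lim_{t\to\infty}\int d^3x\ |\tilde\Psi(t,\mathbf x)|^2\,e^{-i\mathbf a\mathbf x/t} = \int d^3p\ |\hat\Psi(\mathbf p)|^2\,e^{-i\mathbf a\,\mathbf v(\mathbf p)}\,. \tag{8.71}$$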
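The pointwise limit (8.70) of the phase factor can be made plausible numerically. The sketch below is an illustration under the assumption $H(\mathbf p) = \sqrt{m^2+\mathbf p^2}$; the helper names are hypothetical. It evaluates the phase $e^{-i(H(\mathbf p+\mathbf a/t)-H(\mathbf p))t}$ at a large but finite $t$ and compares it with $e^{-i\mathbf a\,\mathbf v(\mathbf p)}$.

```python
import cmath
import math

def energy(p, m):
    """Free relativistic energy H(p) = sqrt(m^2 + p^2)."""
    return math.sqrt(m * m + sum(c * c for c in p))

def velocity(p, m):
    """Group velocity v = dH/dp = p / H(p)."""
    e = energy(p, m)
    return [c / e for c in p]

def phase_at_time(p, a, t, m):
    """exp(-i (H(p + a/t) - H(p)) t), the phase factor in (8.69)."""
    shifted = [pc + ac / t for pc, ac in zip(p, a)]
    return cmath.exp(-1j * (energy(shifted, m) - energy(p, m)) * t)

def phase_limit(p, a, m):
    """exp(-i a.v(p)), the claimed limit in (8.70)."""
    a_dot_v = sum(ac * vc for ac, vc in zip(a, velocity(p, m)))
    return cmath.exp(-1j * a_dot_v)

if __name__ == "__main__":
    p, a, m = [0.3, -1.2, 0.5], [1.0, 2.0, -0.5], 1.0
    for t in (1e1, 1e3, 1e6):
        print(t, abs(phase_at_time(p, a, t, m) - phase_limit(p, a, m)))
```

The printed differences should shrink roughly like $1/t$, reflecting the second order term of the Taylor expansion of $H$.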

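Substituting $\mathbf x(\mathbf v) = \mathbf v t$ on the left and $\mathbf p(\mathbf v) = m\mathbf v/\sqrt{1-\mathbf v^2}$ on the right, where $\mathbf v$ ranges in the unit ball $B = \{\mathbf v : |\mathbf v| < 1\}$, this becomes the equality of Fourier transformed functions

$$\lim_{t\to\infty}\int d^3v\ t^3\,|\tilde\Psi(t,\mathbf v t)|^2\,e^{-i\mathbf a\mathbf v} = \int_B d^3v\ \Bigl|\det\frac{\partial p}{\partial v}\Bigr|\,|\hat\Psi(\mathbf p(\mathbf v))|^2\,e^{-i\mathbf a\,\mathbf v}\,. \tag{8.72}$$

The inverse Fourier transformation and the determinant of the Jacobian yield (8.67). For a massless particle analogous reasoning would lead to

$$\lim_{t\to\infty} t^3\,|\tilde\Psi(t,\mathbf v t)|^2 = \delta(|\mathbf v|-1)\int_0^\infty dk\ k^2\,|\hat\Psi(k\mathbf v)|^2\,. \tag{8.73}$$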

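But this contradicts the existence of $\lim_{t\to\infty} t^{3/2}\,\tilde\Psi(t,\mathbf v t)$ as there is no root $|\tilde\Psi(t,t\mathbf v)|$ of $\delta(|\mathbf v|-1)$.

Under translations the paths of freely evolving position wave functions (8.60) transform as expected (8.21)

$$(U_a\Psi)^{\sim}(x) = \bigl(e^{iP\cdot a}\,\Psi\bigr)^{\sim}(x) = \tilde\Psi(x-a)\,. \tag{8.74}$$

The position wave function of the Lorentz transformed path is

$$(U_\Lambda\Psi)^{\sim}(x) = \int\widetilde{dp}\ e^{-ip\cdot x}\,\sqrt{2p^0}\,(U_\Lambda\Psi)(p)\,. \tag{8.75}$$

Denote the integration variable by $\Lambda p$ and use the Lorentz invariance of both the scalar product and the integration measure,

$$(U_\Lambda\Psi)^{\sim}(x) = \int\widetilde{dp}\ e^{-ip\cdot(\Lambda^{-1}x)}\,\sqrt{2(\Lambda p)^0}\,(U_\Lambda\Psi)(\Lambda p)\,. \tag{8.76}$$

The momentum function transforms by the Wigner rotation (8.33) so one obtains

$$(U_\Lambda\Psi)^{\sim}(x) = \int\widetilde{dp}\ R_\Lambda(p)\,e^{-ip\cdot(\Lambda^{-1}x)}\,\sqrt{2p^0}\,\Psi(p)\,,\quad\text{where}\quad R_\Lambda(p) = \sqrt{\Lambda^0{}_0 + \Lambda^0{}_i\,p^i/p^0}\ R(W(\Lambda,p))\,. \tag{8.77}$$

The integral over the product is the convolution of the Fourier transformation $\tilde R_\Lambda$ of $R_\Lambda$ with $\tilde\Psi\circ\Lambda^{-1}$,

$$(U_\Lambda\Psi)^{\sim}(t,\mathbf x) = \int\frac{d^3y}{(2\pi)^{3/2}}\ \tilde R_\Lambda(\mathbf x-\mathbf y)\,\tilde\Psi\bigl(\Lambda^{-1}(t,\mathbf y)\bigr)\,. \tag{8.78}$$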

If $\Lambda$ is a rotation $D$, then $W(D,p) = D$ and $R_D = R(D)$ is momentum independent. Its Fourier transformation is $\tilde R_D(\mathbf y) = (2\pi)^{3/2}\,\delta^3(\mathbf y)\,R_D$ and the position wave function transforms under rotations as expected by the adjoint transformation (A.35)

$$(U_D\Psi)^{\sim}(t,\mathbf x) = R_D\,\tilde\Psi(t, D^{-1}\mathbf x)\,. \tag{8.79}$$

Under boosts, however, even if the Wigner rotation is represented trivially, $R_\Lambda$ is momentum dependent already because of the ratio of the normalization factors $\sqrt{\Lambda^0{}_0 + \Lambda^0{}_i\,p^i/p^0}$. In marked contrast to quantum fields (9.17), the paths of position wave functions transform nonlocally by convolution with $\tilde R_\Lambda$ and not adjoint to the boosts of space time,

$$(U_\Lambda\Psi)^{\sim}(x) \neq D_\Lambda\,\tilde\Psi(\Lambda^{-1}x)\,. \tag{8.80}$$


8.5 Massless Representations and Helicity

The cone $M_0$ of momenta of a massless particle, the orbit under the restricted Lorentz group of a lightlike momentum such as $\underline p = (1,0,0,\dots,1)$, is the manifold $S^{D-2}\times\mathbb R$,

$$M_0 = \bigl\{(p^0,\mathbf p) : p^0 = |\mathbf p| > 0\bigr\} \sim S^{D-2}\times\mathbb R\,, \tag{8.81}$$

as $\mathbf p\neq0$ is specified by its direction and its nonvanishing modulus $|\mathbf p| = e^\lambda > 0$. Though the choice of $\underline p$ adopts a unit of energy, this does not spoil dilational symmetry as dilations map the massless shell to itself and the linear transformations $U_{e^\lambda}$,

$$(U_{e^\lambda}\Psi)(p) = e^{-\lambda}\,\Psi(e^{-\lambda}p)\,, \tag{8.82}$$

represent them unitarily with respect to the scalar product (8.8) for mass $m = 0$.

The stabilizer $H$ of $\underline p$ is generated by infinitesimal Lorentz transformations $\omega$, $\eta\omega = -(\eta\omega)^T$ (6.21), with $\omega^m{}_0 + \omega^m{}_z = 0$ which consequently are of the form

$$\omega(a,\hat\omega) = \begin{pmatrix} 0 & a^T & 0\\ a & \hat\omega & -a\\ 0 & a^T & 0 \end{pmatrix}. \tag{8.83}$$

Here $\hat\omega$ is a skewsymmetric $(D-2)\times(D-2)$ matrix, which generates a rotation $w\in SO(D-2)$, and $a$ is a $(D-2)$-vector. Because $\omega(a,0)$ and $\omega(b,0)$ commute they generate translations in $D-2$ dimensions. They are rotated by $w$. So $H$ is the Euclidean group $E(D-2)$ of translations and rotations in $D-2$ dimensions.

By Mackey's theorems [38] each massless unitary irreducible representation of the Poincaré group is induced by a unitary irreducible representation of $H = E(D-2)$. By the same theorems each such representation of $E(D-2)$ is characterized by the $SO(D-2)$ orbit of some point $q\in\mathbb R^{D-2}$, a sphere $S^{D-3}$ or $\{0\}$, and a unitary irreducible representation of its stabilizer $SO(D-3)$ or, if $q = 0$, $SO(D-2)$.

In $D = 4$, if $q\neq0$ the orbit is a circle, its stabilizer is trivial and a unitary, irreducible representation of $E(2)$ acts on wave functions of a circle. Contrary to its denomination 'continuous spin' such a representation leads to integer or half integer, though infinitely many, helicities (the Fourier modes of the wave functions). But such a representation is not contained in the restriction of a finite-dimensional representation of the Lorentz group to $E(2)$ as is required for local fields and local interactions (9.25). Even if one could successfully construct interactions for these infinitely many massless states per given momentum, one expects them to make the specific heat of each cavity infinite, contradicting a hot big bang and its observed remnant, the microwave background radiation.

If $q = 0\in\mathbb R^2$, then the orbit is the point $q = 0$ and the translations $T_a$ in $E(2)$ are represented trivially by $T_a = e^{iq\cdot a} = e^0 = 1$. The stabilizer is $H = SO(2)$. Each unitary irreducible (ray) representation $R$ is one-dimensional and represents the rotation $D_\delta$ by the angle $\delta$ by multiplication with $R(D_\delta) = e^{-ih\delta}$ where $h$ is an arbitrary, real number, which characterizes the representation $R$.

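The restriction of $SO(1,3)^\uparrow$ to $E(2)$ is not unitary. The $E(2)$-translations

$$T_a = e^{\omega(a,0)} = \begin{pmatrix} 1+\frac{a^2}{2} & a^T & -\frac{a^2}{2}\\ a & 1 & -a\\ \frac{a^2}{2} & a^T & 1-\frac{a^2}{2} \end{pmatrix} \tag{8.84}$$

acting on the light basis

$$e_0 = \begin{pmatrix}1\\0\\0\\1\end{pmatrix},\quad e_1 = \begin{pmatrix}0\\1\\0\\0\end{pmatrix},\quad e_2 = \begin{pmatrix}0\\0\\1\\0\end{pmatrix},\quad e_{\bar0} = \begin{pmatrix}1\\0\\0\\-1\end{pmatrix}, \tag{8.85}$$

$$T_a e_0 = e_0\,,\quad T_a e_i = e_i + e_0\,a^i\,,\quad T_a e_{\bar0} = e_{\bar0} + e_i\,2a^i + e_0\,a^2\,,\quad i\in\{1,2\}\,, \tag{8.86}$$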

generate $E(2)$ orbits. E.g. the orbit of $e_{\bar0}$ is the intersection of the lightcone $p^2 = 0$ with the plane $p\cdot e_0 = 2$. Its spatial projection is the paraboloid of revolution $p_z = -1 + (p_x^2+p_y^2)/4$. The linear hull of all $T_a e_{\bar0}$ is $\mathbb R^{1,3} = W_4$. It contains the invariant subspaces $W_3$, spanned by $e_0, e_1, e_2$, and $W_1$, spanned by $e_0$, $W_1\subset W_3\subset W_4$. But neither $W_4$ nor $W_3$ decompose into sums of irreducible spaces. This indecomposability can occur because the matrices $T_a$ are unitary only with respect to an indefinite scalar product, in case of which an invariant subspace, $W_1$, and its orthogonal complement, $W_3$, can be nested and need not span the complete space, $W_4$.

Fig. 8.1 E(2) orbit around the fixed points on the positive 3-axis

In $D = 4$ there is no global section with which to relate functions in $\mathcal H_R$ (8.20) to momentum wave functions of $M_0$ (8.29)⁵. Such a global section cannot exist as $S^3$ is simply connected (each closed path in $S^3$ can be continuously contracted to a point) while $S^1$ is not, so $S^3\neq S^2\times S^1$ (page 200). Therefore we use local sections $N_p$ and $S_p$ to derive the transformation of momentum wave functions. They are defined in the north $U_N$ outside the negative 3-axis $A_- = \{(|p_z|,0,0,-|p_z|)\}$ and in the south $U_S$ outside the positive 3-axis $A_+ = \{(|p_z|,0,0,|p_z|)\}$,

$$U_N = \{p\in M_0 : |\mathbf p| + p_z > 0\}\,,\quad U_S = \{p\in M_0 : |\mathbf p| - p_z > 0\}\,. \tag{8.87}$$

⁵ Only in the 4 cases $D-1 = 1, 2, 4, 8$ does the group manifold $SO(1,D-1)\sim SO(D-1)\times\mathbb R^{D-1}$ factorize as $M_0\times E(D-2)\sim S^{D-2}\times SO(D-2)\times\mathbb R^{D-1}$ [27, page 292 ff.].
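The statements around (8.84) to (8.86) can be verified with a few lines of Python. The sketch below is illustrative only (the helper names are made up here): for $D = 4$ it builds $\omega(a,0)$ from (8.83), exponentiates it via the terminating series $1+\omega+\omega^2/2$, and checks that the orbit of $e_{\bar0}$ lies on the light cone, in the plane $p\cdot e_0 = 2$, and on the paraboloid $p_z = -1+(p_x^2+p_y^2)/4$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def omega(a1, a2):
    """Generator omega(a, 0) of (8.83) for D = 4 (rotation part set to zero)."""
    return [[0.0, a1, a2, 0.0],
            [a1, 0.0, 0.0, -a1],
            [a2, 0.0, 0.0, -a2],
            [0.0, a1, a2, 0.0]]

def translation(a1, a2):
    """T_a = exp(omega) = 1 + omega + omega^2 / 2, because omega^3 = 0."""
    w = omega(a1, a2)
    w2 = matmul(w, w)
    return [[(1.0 if i == j else 0.0) + w[i][j] + 0.5 * w2[i][j]
             for j in range(4)] for i in range(4)]

def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]

def minkowski_square(p):
    """p.p with signature (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

if __name__ == "__main__":
    e0, e0bar = [1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, -1.0]
    T = translation(0.7, -1.3)
    p = apply(T, e0bar)
    print("T e0      =", apply(T, e0))
    print("T e0bar   =", p)
    print("p.p       =", minkowski_square(p))
    print("p0 - pz   =", p[0] - p[3])
    print("paraboloid:", p[3], -1 + (p[1] ** 2 + p[2] ** 2) / 4)
```

The series terminates because $\omega(a,0)$ is nilpotent, which is also the reason why $T_a$ cannot be unitary with respect to a positive definite scalar product.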


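The northern section $N_p = D_p B_p$ boosts $\underline p = (1,0,0,1)$ in 3-direction to $B_p\underline p = (|\mathbf p|,0,0,|\mathbf p|)$ and then rotates around $\mathbf n = (-\sin\varphi,\cos\varphi,0)$ by the smallest angle, namely by $\theta$, along a great circle to $p = |\mathbf p|(1,\sin\theta\cos\varphi,\sin\theta\sin\varphi,\cos\theta)$. Rather than to work with $4\times4$ Lorentz matrices $N_p$ one multiplies more easily their corresponding (page 123) $SL(2,\mathbb C)$ matrices $n_p$. In the notation $\theta' = \theta/2$, $\tan\theta' = \sqrt{p_x^2+p_y^2}/(|\mathbf p|+p_z) = \sqrt{(|\mathbf p|-p_z)/(|\mathbf p|+p_z)}$ (3.20) and $p' = \sqrt{|\mathbf p|}$ this preimage in $SL(2,\mathbb C)$ is (6.51, 6.66)

$$n_p = \begin{pmatrix} \dfrac{\cos\theta'}{p'} & -p'\sin\theta'\,e^{-i\varphi}\\[2mm] \dfrac{\sin\theta'}{p'}\,e^{i\varphi} & p'\cos\theta' \end{pmatrix} = \frac{1}{\sqrt2}\begin{pmatrix} \dfrac{\sqrt{|\mathbf p|+p_z}}{|\mathbf p|} & -\dfrac{p_x-ip_y}{\sqrt{|\mathbf p|+p_z}}\\[2mm] \dfrac{p_x+ip_y}{|\mathbf p|\sqrt{|\mathbf p|+p_z}} & \sqrt{|\mathbf p|+p_z} \end{pmatrix}. \tag{8.88}$$

One easily checks that $n_p$ transforms $\hat{\underline p} = 1-\sigma^3$ to $n_p\,\hat{\underline p}\,(n_p)^\dagger = |\mathbf p| - \mathbf p\cdot\boldsymbol\sigma = \hat p$ (6.62). In $U_N$, $0\le\theta<\pi$, the section $n_p$ is real analytic. It cannot be continuously continued to $A_-$ as the limit $\theta\to\pi$ depends on $\varphi$. The section

$$s_p = \begin{pmatrix} \dfrac{\cos\theta'}{p'}\,e^{-i\varphi} & -p'\sin\theta'\\[2mm] \dfrac{\sin\theta'}{p'} & p'\cos\theta'\,e^{i\varphi} \end{pmatrix} = \frac{1}{\sqrt2}\begin{pmatrix} \dfrac{p_x-ip_y}{|\mathbf p|\sqrt{|\mathbf p|-p_z}} & -\sqrt{|\mathbf p|-p_z}\\[2mm] \dfrac{\sqrt{|\mathbf p|-p_z}}{|\mathbf p|} & \dfrac{p_x+ip_y}{\sqrt{|\mathbf p|-p_z}} \end{pmatrix} \tag{8.89}$$

is defined and real analytic in the south, $0<\theta\le\pi$. The corresponding southern section $S_p$ rotates $B_p\underline p$ in the 1-3-plane by $\pi$ to $|\mathbf p|(1,0,0,-1)$ and then along a great circle by the smallest angle to $p$, namely by $\pi-\theta$ around $\mathbf n' = (\sin\varphi,-\cos\varphi,0)$.

In their common domain local $SL(2,\mathbb C)$ sections differ by the multiplication from the right (8.28) with a momentum dependent matrix in the preimage of $H = E(2)$,

$$w = \begin{pmatrix} e^{-i\delta/2} & 0\\ a & e^{i\delta/2} \end{pmatrix}. \tag{8.90}$$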

In the case at hand, $s_p$ differs in $U_N\cap U_S$ from $n_p$ by a preceding rotation around the 3-axis by $2\varphi$. Geometrically this angle is the spherical area of the zone (spherical biangle) with vertices $e_z$ and $-e_z$ and sides through $e_x$ and $\mathbf p/|\mathbf p|$,

$$s_p = n_p\begin{pmatrix} e^{-i\varphi(p)} & \\ & e^{i\varphi(p)} \end{pmatrix},\quad e^{i\varphi(p)} = \frac{p_x+ip_y}{\sqrt{p_x^2+p_y^2}} = \frac{p_x+ip_y}{\sqrt{|\mathbf p|-p_z}\,\sqrt{|\mathbf p|+p_z}}\,. \tag{8.91}$$
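The relations (8.88) to (8.91) lend themselves to a numerical check. The following Python sketch is an illustration, not part of the text (all function names are invented here): it builds the second form of $n_p$ and $s_p$, verifies that both have unit determinant and map $\hat{\underline p} = 1-\sigma^3 = \mathrm{diag}(0,2)$ to $\hat p = |\mathbf p| - \mathbf p\cdot\boldsymbol\sigma$, and confirms $s_p = n_p\,\mathrm{diag}(e^{-i\varphi}, e^{i\varphi})$.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)

def n_section(px, py, pz):
    """SL(2,C) matrix n_p of (8.88); requires |p| + pz > 0."""
    p = math.sqrt(px * px + py * py + pz * pz)
    s = math.sqrt(p + pz)
    return [[complex(s / (p * SQRT2)), -(px - 1j * py) / (s * SQRT2)],
            [(px + 1j * py) / (p * s * SQRT2), complex(s / SQRT2)]]

def s_section(px, py, pz):
    """SL(2,C) matrix s_p of (8.89); requires |p| - pz > 0."""
    p = math.sqrt(px * px + py * py + pz * pz)
    s = math.sqrt(p - pz)
    return [[(px - 1j * py) / (p * s * SQRT2), complex(-s / SQRT2)],
            [complex(s / (p * SQRT2)), (px + 1j * py) / (s * SQRT2)]]

def image_of_reference(n):
    """n diag(0, 2) n^dagger, the image of 1 - sigma^3 under n."""
    b, d = n[0][1], n[1][1]
    return [[2 * b * b.conjugate(), 2 * b * d.conjugate()],
            [2 * d * b.conjugate(), 2 * d * d.conjugate()]]

def p_hat(px, py, pz):
    """Hermitian matrix |p| - p.sigma."""
    p = math.sqrt(px * px + py * py + pz * pz)
    return [[complex(p - pz), -(px - 1j * py)],
            [-(px + 1j * py), complex(p + pz)]]

def rotate_right(n, phi):
    """n diag(exp(-i phi), exp(i phi)), the right multiplication in (8.91)."""
    return [[n[i][0] * cmath.exp(-1j * phi), n[i][1] * cmath.exp(1j * phi)]
            for i in range(2)]

def det2(n):
    return n[0][0] * n[1][1] - n[0][1] * n[1][0]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    px, py, pz = 0.6, -0.8, 0.5
    n, s = n_section(px, py, pz), s_section(px, py, pz)
    phi = math.atan2(py, px)
    print(max_abs_diff(image_of_reference(n), p_hat(px, py, pz)))
    print(max_abs_diff(image_of_reference(s), p_hat(px, py, pz)))
    print(max_abs_diff(s, rotate_right(n, phi)))
```

All three printed numbers should vanish up to rounding for any momentum in $U_N\cap U_S$.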

The angle $\delta$ of the Wigner rotation $W(\Lambda,p)$, which accompanies a Lorentz transformation, $\Lambda N_p = N_{\Lambda p}W(\Lambda,p)$ (8.32), can be easily read from the 22-element of the $SL(2,\mathbb C)$ preimage $\lambda n_p$ (this is why we calculate in $SL(2,\mathbb C)$),

$$\lambda\,n_p = n_{\Lambda p}\,w(\Lambda,p)\,, \tag{8.92}$$

because by (8.90) $\delta/2$ is the phase of $(\lambda n_p)_{22} = |(n_{\Lambda p}w)_{22}|\,e^{i\delta/2}$.


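If $\Lambda$ is a rotation by $\alpha$ around the axis $\mathbf n$ then $\lambda$ is given by (6.51) and the Wigner angle $\delta$ and its infinitesimal value are

$$\tan\frac{\delta}{2} = \frac{n_z|\mathbf p| + \mathbf n\cdot\mathbf p}{(|\mathbf p|+p_z)\cot\frac{\alpha}{2} + n_xp_y - n_yp_x}\,,\quad \frac{d\delta}{d\alpha}\Big|_{\alpha=0} = \frac{n_z|\mathbf p| + \mathbf n\cdot\mathbf p}{|\mathbf p|+p_z}\,. \tag{8.93}$$

As $R(e^{i\delta}) = e^{-ih\delta}$, the representation $-ih\,p_y/(|\mathbf p|+p_z)$ of the infinitesimal Wigner rotation accompanies e.g. the infinitesimal rotation in the 3-1-plane. Analogously one determines the Wigner angle of a boost (6.66)

$$\tan\frac{\delta}{2} = \frac{-(n_xp_y - n_yp_x)}{(|\mathbf p|+p_z)\coth\frac{\beta}{2} + n_z|\mathbf p| + \mathbf n\cdot\mathbf p}\,,\quad \frac{d\delta}{d\beta}\Big|_{\beta=0} = \frac{-(n_xp_y - n_yp_x)}{|\mathbf p|+p_z}\,, \tag{8.94}$$

and, bearing in mind that $-iM_{0i}$ boosts from $e_0$ to $-e_i$, obtains the generators

$$\begin{aligned}
(-iM_{12}\Psi)_N(p) &= -\bigl(p_x\partial_{p_y} - p_y\partial_{p_x}\bigr)\Psi_N(p) - ih\,\Psi_N(p)\,,\\
(-iM_{31}\Psi)_N(p) &= -\bigl(p_z\partial_{p_x} - p_x\partial_{p_z}\bigr)\Psi_N(p) - ih\,\frac{p_y}{|\mathbf p|+p_z}\,\Psi_N(p)\,,\\
(-iM_{32}\Psi)_N(p) &= -\bigl(p_z\partial_{p_y} - p_y\partial_{p_z}\bigr)\Psi_N(p) + ih\,\frac{p_x}{|\mathbf p|+p_z}\,\Psi_N(p)\,,\\
(-iM_{01}\Psi)_N(p) &= |\mathbf p|\,\partial_{p_x}\Psi_N(p) - ih\,\frac{p_y}{|\mathbf p|+p_z}\,\Psi_N(p)\,,\\
(-iM_{02}\Psi)_N(p) &= |\mathbf p|\,\partial_{p_y}\Psi_N(p) + ih\,\frac{p_x}{|\mathbf p|+p_z}\,\Psi_N(p)\,,\\
(-iM_{03}\Psi)_N(p) &= |\mathbf p|\,\partial_{p_z}\Psi_N(p)\,.
\end{aligned} \tag{8.95}$$

In particular the helicity, the angular momentum $\mathbf p\cdot\mathbf J/|\mathbf p|$ in the direction of the momentum $\mathbf p\neq0$, is a multiple of 1, namely the real number $h$,

$$\bigl((p_xM_{23} + p_yM_{31} + p_zM_{12})\Psi\bigr)_N(p) = h\,|\mathbf p|\,\Psi_N(p)\,. \tag{8.96}$$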
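The helicity relation (8.96) involves only the multiplicative parts of the generators, because the derivative parts combine to $\mathbf p\cdot(\mathbf p\times\nabla_p) = 0$. The short Python sketch below is meant as an illustration with made-up function names: it reads the multiplicative parts of $M_{12}$, $M_{31}$ and $M_{23} = -M_{32}$ from (8.95) for the northern patch, and from the analogous southern generators (8.98), and evaluates $\mathbf p\cdot\mathbf J/|\mathbf p|$.

```python
import math

def helicity_north(px, py, pz, h):
    """p.J / |p| from the multiplicative parts of (8.95); needs |p| + pz > 0."""
    p = math.sqrt(px * px + py * py + pz * pz)
    m12 = h                       # -iM12 contains -i h
    m31 = h * py / (p + pz)       # -iM31 contains -i h py / (|p| + pz)
    m23 = h * px / (p + pz)       # -iM32 contains +i h px / (|p| + pz), M23 = -M32
    return (px * m23 + py * m31 + pz * m12) / p

def helicity_south(px, py, pz, h):
    """Same quantity from the southern generators (8.98); needs |p| - pz > 0."""
    p = math.sqrt(px * px + py * py + pz * pz)
    m12 = -h                      # -iM12 contains +i h
    m31 = h * py / (p - pz)       # -iM31 contains -i h py / (|p| - pz)
    m23 = h * px / (p - pz)       # -iM32 contains +i h px / (|p| - pz), M23 = -M32
    return (px * m23 + py * m31 + pz * m12) / p

if __name__ == "__main__":
    for momentum in [(0.3, 0.4, 1.2), (-2.0, 0.5, -0.7)]:
        print(helicity_north(*momentum, h=1.5), helicity_south(*momentum, h=1.5))
```

In both patches the result should equal $h$ for every momentum, although the individual multiplicative terms differ between the patches.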

Applied to functions which are differentiable outside $A_-$ the operators $-iM_{mn}$ satisfy the Lorentz algebra (6.23) for each value of the real number $h$. For $h\neq0$ the functions $(|\mathbf p|+p_z)^{-1}\Psi_N$ are not defined on $A_-$ and seem to show that smooth states in $\mathcal H_R$ are not transformed to each other (8.22). One cannot require the wave functions $\Psi_N$ to vanish on $A_-$ because then they would not span a space which is mapped to itself by rotations. One also cannot turn a blind eye to the singularity of (8.95), taking unwarranted comfort in the misleading argument [18, 55] that singularities in a set of measure zero do not count: $M_{13}\Psi$ is not square integrable if $\Psi$ is differentiable in a neighbourhood $U$ of a point $\hat p = (|p_z|,0,0,-|p_z|)\in A_-$ and does not vanish there, $|\Psi_N|^2(\hat p) = C > 0$. This holds true as for $p_z<0$ by the triangle inequality $|\mathbf p|+p_z = |\mathbf p|-|p_z|$ is smaller than the axial distance $r = \sqrt{p_x^2+p_y^2}$ to $A_-$. If the derivative terms of $M_{13}\Psi$ stay bounded, the integral over $U$ over the multiplicative terms contributes beyond any bound to $\|M_{13}\Psi\|^2$ as $\int_\varepsilon^r dr\,r\,|h\Psi/(|\mathbf p|+p_z)|^2 > C\,h^2\int_\varepsilon^r dr\,r/r^2$ diverges if $\varepsilon$ goes to zero, $h^2\int_\varepsilon^r dr/r = h^2(\ln r - \ln\varepsilon)\to\infty$.

If $\Psi$ does not vanish on $A_-$ then for $\|M_{13}\Psi\|$ to exist $\Psi_N$ must not be differentiable there. The derivative term $D\Psi_N = -p_z\partial_{p_x}\Psi_N$ has to cancel, up to a finite term $\chi$ near $A_-$, the multiplicative singularity $M\Psi_N$ which behaves there as $-2ih\,|p_z|\,p_y/(p_x^2+p_y^2)\,\Psi_N$. This subjects $\Psi_N$ to a condition $(D+M)\Psi_N = \chi$, which by variation of parameters is solved by a solution $f$ of the homogeneous condition $(D+M)f = 0$ times a differentiable function. The homogeneous condition $\bigl(\partial_{p_x} - 2ih\,p_y/(p_x^2+p_y^2)\bigr)f = 0$ and, for $M_{23}\Psi$ to exist, $\bigl(\partial_{p_y} + 2ih\,p_x/(p_x^2+p_y^2)\bigr)f = 0$ determines $f(p) = e^{-2ih\varphi(p)}$. The differentiable function which multiplies this transition function is $\Psi_S$.

If $\Psi_N$ is square integrable and smooth in $U_N$, then $M_{mn}\Psi$ is square integrable and smooth if and only if in $U_N\cap U_S$ the function $\Psi_N$ is the product of the transition function $e^{-2ih\varphi}$ times a function $\Psi_S$, which is smooth in $U_S$,

$$\Psi_S(p) = e^{2ih\varphi(p)}\,\Psi_N(p)\,. \tag{8.97}$$

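Multiplying (8.95) with the transition function $e^{2ih\varphi(p)}$ and using (8.97) one obtains

$$\begin{aligned}
(-iM_{12}\Psi)_S(p) &= -\bigl(p_x\partial_{p_y} - p_y\partial_{p_x}\bigr)\Psi_S(p) + ih\,\Psi_S(p)\,,\\
(-iM_{31}\Psi)_S(p) &= -\bigl(p_z\partial_{p_x} - p_x\partial_{p_z}\bigr)\Psi_S(p) - ih\,\frac{p_y}{|\mathbf p|-p_z}\,\Psi_S(p)\,,\\
(-iM_{32}\Psi)_S(p) &= -\bigl(p_z\partial_{p_y} - p_y\partial_{p_z}\bigr)\Psi_S(p) + ih\,\frac{p_x}{|\mathbf p|-p_z}\,\Psi_S(p)\,,\\
(-iM_{01}\Psi)_S(p) &= |\mathbf p|\,\partial_{p_x}\Psi_S(p) + ih\,\frac{p_y}{|\mathbf p|-p_z}\,\Psi_S(p)\,,\\
(-iM_{02}\Psi)_S(p) &= |\mathbf p|\,\partial_{p_y}\Psi_S(p) - ih\,\frac{p_x}{|\mathbf p|-p_z}\,\Psi_S(p)\,,\\
(-iM_{03}\Psi)_S(p) &= |\mathbf p|\,\partial_{p_z}\Psi_S(p)\,.
\end{aligned} \tag{8.98}$$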

The same generators ensue if, along the lines of the derivation of (8.95), one reads the Wigner angle $\delta$ from the phase of the 12-element $(\lambda s_p)_{12} = -|(s_{\Lambda p}w)_{12}|\,e^{i\delta/2}$.

The transition function $e^{2ih\varphi(p)}$ is defined and smooth in $U_N\cap U_S$ only if $2h\in\mathbb Z$ is integer. This is why the helicity of a massless particle is integer or half integer. For each other value $-iM_{mn}$ are not skew hermitean operators in a Hilbert space.

For helicity $h\neq0$ each continuous momentum wave function $\Psi$ has to vanish along some line, a Dirac string in momentum space, from $\ln|\mathbf p| = -\infty$ to $\ln|\mathbf p| = \infty$. Namely, if one removes the set $\mathcal N$, where $\Psi$ vanishes, from the domains $U_N$ and $U_S$ then the remaining sets $\hat U_N$ and $\hat U_S$ cannot both be simply connected. In $\hat U_N$ the phase of $\Psi_N$ is continuous and its winding number along a closed path, being integer, does not change under deformations of the path. For a contractible path this winding number vanishes, as the phase along the path becomes constant upon its contraction. If $\hat U_N$ is simply connected then it contains a contractible path around $A_+$ which also lies in $\hat U_S$. On this path the phase of $\Psi_N$ has vanishing winding number but the phase of $\Psi_S = e^{2ih\varphi}\Psi_N$ winds $2h$-fold. So the path cannot be contractible in $\hat U_S$.
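The quantization of helicity rests on the single-valuedness of the transition function $e^{2ih\varphi}$ under $\varphi\to\varphi+2\pi$. As a minimal illustration (the function names and the tolerance are chosen here, not taken from the text), the following Python sketch tests this condition for a few values of $h$.

```python
import cmath
import math

def transition(h, phi):
    """Transition function exp(2 i h phi) between the northern and southern patch."""
    return cmath.exp(2j * h * phi)

def is_single_valued(h, phi=0.3, tol=1e-9):
    """True if exp(2 i h phi) returns to its value after one turn around the 3-axis."""
    return abs(transition(h, phi + 2 * math.pi) - transition(h, phi)) < tol

if __name__ == "__main__":
    for h in (0.5, 1.0, -1.5, 2.0, 0.25, 1.0 / 3.0):
        print(h, is_single_valued(h))
```

Only the values with integer $2h$ pass the test, in line with the statement that helicities are integer or half integer.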


8.6 Angular Momentum and Position of a Massless Particle

Restricted to the subgroup of rotations the induced representation is a direct integral of representations. In coordinates $E = |\mathbf p|$ and coordinates $u$ for the sphere $S^2$ of directions (with rotation invariant surface element $d\Omega$) the scalar product

$$\langle\Phi|\Psi\rangle = \frac{1}{2(2\pi)^3}\int_0^\infty dE\,E\int_{S^2} d\Omega\ \Phi^*(E,u)\,\Psi(E,u) \tag{8.99}$$

reveals the Hilbert space as tensor product (A.7) $\mathcal H(\mathbb R_+)\otimes\mathcal H_h(S^2)$ where $\mathcal H(\mathbb R_+)$, the space of wave functions of the energy $E$, is left pointwise invariant under rotations and $\mathcal H_h(S^2)$ is the space of states of helicity $h$ on the unit sphere.

In $\mathcal H_h(S^2)$ the representation of rotations is induced by the representation $R(e^{i\delta}) = e^{-ih\delta}$ of rotations around the 3-axis. To be induced by an irreducible representation does not make the representation of $SO(3)$ irreducible. Rather $\mathcal H_h$ decomposes into angular momentum multiplets, each characterized by its total angular momentum $j$. Such a multiplet contains a state $\Lambda$ which is annihilated by $J_+ = M_{23} + iM_{31}$ (B.65c) and by $M_{12} - j$ (B.65b),

$$(M_{12} - j)\,\Lambda = 0\,,\quad (M_{23} + iM_{31})\,\Lambda = 0\,. \tag{8.100}$$

By (8.95) these are differential equations for $\Lambda_N$. They become easily solvable if we consider $\Lambda_N$ as a function of the complex stereographic coordinates (B.90)

$$w = \frac{p_x+ip_y}{|\mathbf p|+p_z}\,,\quad \bar w = \frac{p_x-ip_y}{|\mathbf p|+p_z}\,, \tag{8.101}$$

which map the northern domain to $\mathbb C$. Then the differential equations read (B.92)

$$(w\partial_w - \bar w\partial_{\bar w} + h - j)\,\Lambda_N = 0\,,\quad (w^2\partial_w + \partial_{\bar w} + h\,w)\,\Lambda_N = 0\,. \tag{8.102}$$

Recollecting that $w\partial_w$ measures the homogeneity in $w$, $w\partial_w w^r = r\,w^r$ (where superscripts denote exponents), the first equation is solved by $\Lambda_N = w^{j-h}\,g(|w|^2)$ and the second equation implies $g$ to be homogeneous in $1+|w|^2$ of degree $-j$,

$$\bigl((j-h+h)\,g + (|w|^2+1)\,g'\bigr)\,w^{j-h+1} = 0\,,\quad \Lambda_N(w,\bar w) = \frac{w^{j-h}}{(1+|w|^2)^j}\,. \tag{8.103}$$
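That $\Lambda_N$ of (8.103) solves both equations (8.102) can also be confirmed numerically. The sketch below is an illustration under the assumption that $j-h$ is a nonnegative integer; the helper names are hypothetical. It approximates the Wirtinger derivatives $\partial_w = \frac12(\partial_x - i\partial_y)$ and $\partial_{\bar w} = \frac12(\partial_x + i\partial_y)$ by central differences and evaluates the residuals of (8.102).

```python
def lam_north(w, j, h):
    """Lambda_N = w^(j-h) / (1 + |w|^2)^j, assuming j - h is a nonnegative integer."""
    k = int(round(j - h))
    return w ** k / (1.0 + abs(w) ** 2) ** j

def wirtinger(f, w, eps=1e-5):
    """Return (df/dw, df/dwbar) from central finite differences."""
    dx = (f(w + eps) - f(w - eps)) / (2 * eps)
    dy = (f(w + 1j * eps) - f(w - 1j * eps)) / (2 * eps)
    return 0.5 * (dx - 1j * dy), 0.5 * (dx + 1j * dy)

def residuals(w, j, h):
    """Left hand sides of the two equations (8.102) evaluated on Lambda_N."""
    f = lambda z: lam_north(z, j, h)
    f_w, f_wbar = wirtinger(f, w)
    first = w * f_w - w.conjugate() * f_wbar + (h - j) * f(w)
    second = w * w * f_w + f_wbar + h * w * f(w)
    return first, second

if __name__ == "__main__":
    for j, h in [(2, 1), (1.5, 0.5), (1, -1), (3, 3)]:
        r1, r2 = residuals(0.4 - 0.7j, j, h)
        print(j, h, abs(r1), abs(r2))
```

The residuals should vanish up to the discretization error for every admissible pair $(j,h)$.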

The state $\Lambda$ is smooth only if $j-h$ is a nonnegative integer. It is square integrable with respect to the measure (B.89) if also $j+h$ is nonnegative, which is just the restriction to be smooth also in southern stereographic coordinates. They are related to the northern coordinates in their common domain by inversion at the unit circle,

$$w' = \frac{p_x+ip_y}{|\mathbf p|-p_z} = \frac{1}{\bar w}\,,\quad \bar w' = \frac{p_x-ip_y}{|\mathbf p|-p_z} = \frac{1}{w}\,, \tag{8.104}$$

and $\Lambda_S$ is smooth only if $j+h$ is a nonnegative integer,

$$\Lambda_S(w',\bar w') = \Bigl(\frac{w}{|w|}\Bigr)^{2h}\Lambda_N = \frac{w'^{\,j+h}}{(1+|w'|^2)^j}\,. \tag{8.105}$$

So $j\ge|h|$: there is no round photon with $j = 0$. The representation $R_h(e^{i\delta}) = e^{-ih\delta}$ of $H = SO(2)$ induces in the space of sections over the sphere no $G = SO(3)$ multiplet with $j<|h|$ and one multiplet for $j = |h|, |h|+1, \dots$, i.e. the representation $j$ is induced by $R_h$ with multiplicity

$$n_h(j) = \begin{cases} 0 & \text{if } j<|h|\,,\\ 1 & \text{if } j\in\{|h|, |h|+1, |h|+2, \dots\}\,. \end{cases} \tag{8.106}$$

The multiplicity $n_h(j)$ exemplifies the Frobenius reciprocity [38]: The representation $R_h$ of the subgroup $H$ induces each representation $j$ of the group $G$ with the same multiplicity $m_j(h)$ with which the restriction of $j$ to the subgroup $H$ contains $R_h$, $m_j(h) = n_h(j)$.

States of definite helicity $h\neq0$ do not allow Heisenberg pairs $X^i$ and $P^j$ of operators which both transform under rotations as vectors. Such Heisenberg pairs require complete spin multiplets ($h = -s, -s+1, \dots, s$ in $D-1 = 3$ spatial dimensions) as the angular momenta $M_{ij}$ and, by the Heisenberg commutation relations, the orbital angular momenta

$$L_{ij} = X^iP^j - X^jP^i\,,\quad i,j\in\{1,2,\dots,D-1\}\,, \tag{8.107}$$

generate the same rotations of $X$ and $P$. So $\Gamma_{ij} = M_{ij} - L_{ij}$ commute with $X$ and $P$ and, by the product rule, with $L_{kl}$, $[\Gamma_{ij},L_{kl}] = 0$. Both $-iM_{ij}$ and $-iL_{ij}$ satisfy the angular momentum algebra (8.45). Denoting for readability index pairs by one letter and the structure constants by $f$, $[M_a,M_b] = i f_{ab}{}^c M_c$, one has $[\Gamma_a,L_b] = 0 = [L_a,\Gamma_b]$ and

$$i f_{ab}{}^c M_c = [L_a+\Gamma_a,\,L_b+\Gamma_b] = i f_{ab}{}^c L_c + [\Gamma_a,\Gamma_b]\,,\quad\text{so}\quad [\Gamma_a,\Gamma_b] = i f_{ab}{}^c\,\Gamma_c\,. \tag{8.108}$$

Consequently $M_{ij} = L_{ij} + \Gamma_{ij}$ (8.46) where $\Gamma_{ij}$ commute with $X$ and $P$ and satisfy the angular momentum algebra, i.e. they generate a matrix representation of $SO(D-1)$.

For $h\neq0$ and if $\Psi$ does not vanish on the positive or negative 3-axis $A_+$ or $A_-$ then the partial derivatives of $\Psi_N$ and $\Psi_S = e^{2ih\varphi(p)}\Psi_N$ are not both square integrable. Well defined in $M_0$ are the covariant derivatives, which by (8.95) and (8.98) are

$$D_i = -i\,|\mathbf p|^{-1}M_{0i} = \partial_{p^i} + A_i(p)\,, \tag{8.109}$$

with the connection $A_i(p)$ (the multiplicative part of the covariant derivative) in the northern and southern coordinate patch given by

$$A_N(p) = \frac{ih}{|\mathbf p|(|\mathbf p|+p_z)}\begin{pmatrix} -p_y\\ p_x\\ 0 \end{pmatrix},\quad A_S(p) = \frac{ih}{|\mathbf p|(|\mathbf p|-p_z)}\begin{pmatrix} p_y\\ -p_x\\ 0 \end{pmatrix}, \tag{8.110}$$

and related in the intersection of their coordinate patches by the transition function

$$D_{S\,i} = e^{2ih\varphi(p)}\,D_{N\,i}\,e^{-2ih\varphi(p)}\,. \tag{8.111}$$

The covariant derivative $D_j$ and the momentum $P^i$ do not constitute Heisenberg pairs as the commutator $[D_i,D_j]$ yields the field strength of a momentum space monopole of charge $h$ at $\mathbf p = 0$,

$$[P^i,P^j] = 0\,,\quad [P^i,D_j] = -\delta^i{}_j\,,\quad [D_i,D_j] = F_{ij} = \partial_iA_j - \partial_jA_i = ih\,\varepsilon_{ijk}\,\frac{P^k}{|\mathbf P|^3}\,. \tag{8.112}$$
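The field strength in (8.112) can be cross-checked against the connections (8.110) by numerical differentiation. In the following Python sketch, which is only an illustration with invented function names, the common factor $ih$ is divided out, so the real vector fields $A_N/(ih)$ and $A_S/(ih)$ are differentiated and the result is compared with the monopole field $\mathbf p/|\mathbf p|^3$.

```python
import math

def norm(p):
    return math.sqrt(sum(c * c for c in p))

def a_north(p):
    """A_N / (i h) from (8.110); regular for |p| + pz > 0."""
    px, py, pz = p
    f = 1.0 / (norm(p) * (norm(p) + pz))
    return (-py * f, px * f, 0.0)

def a_south(p):
    """A_S / (i h) from (8.110); regular for |p| - pz > 0."""
    px, py, pz = p
    f = 1.0 / (norm(p) * (norm(p) - pz))
    return (py * f, -px * f, 0.0)

def field_strength(a, p, eps=1e-5):
    """(F_yz, F_zx, F_xy) with F_ij = d_i a_j - d_j a_i, by central differences."""
    def d(i, j):
        plus, minus = list(p), list(p)
        plus[i] += eps
        minus[i] -= eps
        return (a(plus)[j] - a(minus)[j]) / (2 * eps)
    return (d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0))

def monopole(p):
    """p / |p|^3, the expected value of epsilon_ijk F_ij / 2 divided by i h."""
    r = norm(p)
    return tuple(c / r ** 3 for c in p)

if __name__ == "__main__":
    p = (0.4, -0.7, 0.3)
    print(field_strength(a_north, p))
    print(field_strength(a_south, p))
    print(monopole(p))
```

Both patches should reproduce the same field strength, since the two connections differ only by the gradient of the transition function phase.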

The geometry of massless particles with nonvanishing helicity is noncommutative. In terms of the covariant derivative the generators of Lorentz transformations (8.95, 8.98) take the manifestly rotation equivariant form

$$-iM_{ij} = -(P^iD_j - P^jD_i) - ih\,\varepsilon_{ijk}\,\frac{P^k}{|\mathbf P|}\,,\quad -iM_{0i} = |\mathbf P|\,D_i\,. \tag{8.113}$$

They satisfy the Lorentz algebra on account of (8.112), as is trivial for $h = 0$ and easy to confirm for arbitrary $h$. However, the covariant derivative $D$ is a skew hermitean operator only if $2h$ is integer. The generators are smooth in $S^2\times\mathbb R$ but not in $\mathbb R^3$.

The integrand $\mathcal F = \frac12\,dp^i\,dp^j\,F_{ij}$, the first Chern class, is a topological density: integrated over each surface $S$ which encloses the tip $\mathbf p = 0$ of the cone $p^0 = |\mathbf p|$,

$$\frac{1}{4\pi}\int_S\mathcal F = ih\,, \tag{8.114}$$

it yields a value which depends only on the transition functions of the bundle and remains constant under changes of the connection $A$ of the covariant derivative as the Euler derivative (5.155) of $\mathcal F(A)$ with respect to $A$ vanishes.

8.7 Space Inversion and Time Reflection

Enlarging the restricted Lorentz group by the parity transformation $\mathcal P$ or by the time reflection $\mathcal T$

$$\mathcal P = \begin{pmatrix} 1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix},\quad \mathcal T = \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}, \tag{8.115}$$

one obtains the groups $O(1,3)^\uparrow$ of time orientation preserving Lorentz transformations or $O(1,3)_+$ of space orientation preserving Lorentz transformations, each with two disconnected components. The group $O(1,3)$ which contains both $\mathcal P$ and $\mathcal T$ has four disconnected components.

Parity reflects spatial translations $T_a : x\mapsto x+a$ (3.12) while $\mathcal T$ reflects temporal translations. Both invert boosts $L_p$ (6.42) and leave rotations $O$ invariant,


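$$\begin{aligned}
\mathcal P\,T_a\,\mathcal P^{-1} &= T_{\mathcal Pa}\,, & \mathcal P\,L_p\,\mathcal P^{-1} &= L_{\mathcal Pp} = (L_p)^{-1}\,, & \mathcal P\,O\,\mathcal P^{-1} &= O\,,\\
\mathcal T\,T_a\,\mathcal T^{-1} &= T_{\mathcal Ta}\,, & \mathcal T\,L_p\,\mathcal T^{-1} &= L_{\mathcal Pp} = (L_p)^{-1}\,, & \mathcal T\,O\,\mathcal T^{-1} &= O\,.
\end{aligned} \tag{8.116}$$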

By Wigner's theorem [68, page 233], $\mathcal P$ and $\mathcal T$ have to be realized in Hilbert space by either a unitary or an antiunitary transformation. In a Hilbert space with non-negative energies, however, there is no choice because translations are represented by $U_a = \exp iP\cdot a$. So by (8.116) $\mathcal T\,iP^0\,\mathcal T^{-1} = -iP^0$ and $\mathcal P\,iP^0\,\mathcal P^{-1} = iP^0$. If $\mathcal T$ were linear, then $H = P^0$, the generator of the free time evolution, and $-P^0 = \mathcal T P^0\mathcal T^{-1}$ would have the same spectrum, so the energy would not be bounded from below. The same holds for an antiunitary realization of parity. So $\mathcal P$ acts unitarily and $\mathcal T$ antiunitarily and both commute with $P^0$.

Parity reverts boosts and spatial momentum and preserves angular momentum, so it inverts helicity, while time reversal reverts spatial and angular momentum, commutes with boosts and conserves helicity,

$$\begin{aligned}
\mathcal P\,P^i\,\mathcal P^{-1} &= -P^i\,, & \mathcal P\,M_{ij}\,\mathcal P^{-1} &= M_{ij}\,, & \mathcal P\,M_{0j}\,\mathcal P^{-1} &= -M_{0j}\,,\\
\mathcal T\,P^i\,\mathcal T^{-1} &= -P^i\,, & \mathcal T\,M_{ij}\,\mathcal T^{-1} &= -M_{ij}\,, & \mathcal T\,M_{0j}\,\mathcal T^{-1} &= M_{0j}\,.
\end{aligned} \tag{8.117}$$

The squares $\mathcal P^2$ and $\mathcal T^2$ are unitary transformations, which commute with all Poincaré transformations. Consequently, in irreducible representations, $\mathcal P^2$ is a phase $\lambda$. We absorb it by the redefinition $\mathcal P' = \mathcal P/\sqrt\lambda$ and drop the prime, $\mathcal P^2 = 1$, fixing $\mathcal P$ up to its sign. By associativity $\mathcal T^2\mathcal T = \mathcal T\mathcal T^2$ and antilinearity $\mathcal T^2$ is real and as it is also unitary, the Hilbert space decomposes into a subspace where $\mathcal T^2 = 1$ and an orthogonal complement where $\mathcal T^2 = -1$.

Choosing the sign and phase of $\mathcal P$ and $\mathcal T$ for the state of a massive spin $s$-particle with maximal $m$ one obtains for all $m\in\{-s,-s+1,\dots,s\}$

$$(\mathcal P\Psi)_m(p) = \Psi_m(\mathcal Pp)\,,\quad (\mathcal T\Psi)_m(p) = (-1)^{s-m}\,\Psi^*_{-m}(\mathcal Pp)\,. \tag{8.118}$$

The phase of $\mathcal P$ is independent of $m$ because $\mathcal P$ commutes with $J_\pm = M_{23}\pm iM_{31}$. $\mathcal T$ is antilinear and anticommutes with $M_{jk}$, so $\mathcal T J_- = -J_+\mathcal T$. Therefore the phase of $\mathcal T$ alternates with $m$: at $p = \underline p$ one has $c(s,m)\Psi_m = ((J_+)^{s-m}\Psi)_s$ and with the same real factor (B.65d) $c(s,m)\Psi_{-m} = ((J_-)^{s-m}\Psi)_{-s}$ implying

$$\begin{aligned}
c(s,m)\,(\mathcal T\Psi)_m &= ((J_+)^{s-m}\,\mathcal T\Psi)_s = (-1)^{s-m}\,(\mathcal T(J_-)^{s-m}\Psi)_s\\
&= (-1)^{s-m}\,((J_-)^{s-m}\Psi)^*_{-s} = c(s,m)\,(-1)^{s-m}\,\Psi^*_{-m}\,.
\end{aligned} \tag{8.119}$$

The spatial and temporal reflection have the products

$$\mathcal P^2 = 1\,,\quad \mathcal P\mathcal T = \mathcal T\mathcal P\,,\quad \mathcal T^2 = (-1)^{2s}\,. \tag{8.120}$$
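The relation $\mathcal T^2 = (-1)^{2s}$ follows directly from the alternating phase in (8.118). As an illustration, the Python sketch below models a spin multiplet at fixed momentum as a dictionary from $m$ to a complex amplitude (a hypothetical representation chosen for this example; the reflection $p\to\mathcal Pp$ is suppressed since it squares to the identity) and applies the antilinear map twice.

```python
from fractions import Fraction

def spin_labels(s):
    """The values m = -s, -s+1, ..., s for integer or half-integer s."""
    s = Fraction(s)
    return [-s + k for k in range(int(2 * s) + 1)]

def time_reversal(psi, s):
    """(T psi)_m = (-1)^(s-m) conj(psi_{-m}) as in (8.118), momentum suppressed."""
    s = Fraction(s)
    return {m: (-1) ** int(s - m) * psi[-m].conjugate() for m in psi}

def t_squared_sign(s):
    """Apply T twice to a sample multiplet and return the resulting common sign."""
    labels = spin_labels(s)
    psi = {m: complex(1 + k, 2 - k) for k, m in enumerate(labels)}
    twice = time_reversal(time_reversal(psi, s), s)
    ratios = {round((twice[m] / psi[m]).real) for m in labels}
    assert len(ratios) == 1          # the sign is the same for every m
    return ratios.pop()

if __name__ == "__main__":
    for s in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
        print(s, t_squared_sign(s))
```

Integer spins give $+1$ and half-integer spins give $-1$, as stated in (8.120).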

Parity maps massless states of helicity $h$ to states with the opposite helicity, while time reversal preserves the helicity and, being antilinear, maps to the complex conjugate wave function. Both reflect the northern coordinate patch $U_N$ to the southern $U_S$ preserving the smoothness of the wave functions which constitute the domain of definition of the generators of the Poincaré algebra.⁶ By choice of the phases of $\mathcal P$ and $\mathcal T$ acting on the northern wave function of positive helicity states one obtains

$$\begin{aligned}
(\mathcal P\Psi)^+_N(p) &= \Psi^-_S(\mathcal Pp)\,, & (\mathcal T\Psi)^+_N(p) &= \Psi^{+*}_S(\mathcal Pp)\,,\\
(\mathcal P\Psi)^-_S(p) &= \Psi^+_N(\mathcal Pp)\,, & (\mathcal T\Psi)^-_S(p) &= (-1)^{2h}\,\Psi^{-*}_N(\mathcal Pp)\,,\\
(\mathcal P\Psi)^+_S(p) &= (-1)^{2h}\,\Psi^-_N(\mathcal Pp)\,, & (\mathcal T\Psi)^+_S(p) &= (-1)^{2h}\,\Psi^{+*}_N(\mathcal Pp)\,,\\
(\mathcal P\Psi)^-_N(p) &= (-1)^{2h}\,\Psi^+_S(\mathcal Pp)\,, & (\mathcal T\Psi)^-_N(p) &= \Psi^{-*}_S(\mathcal Pp)\,.
\end{aligned} \tag{8.121}$$

The remaining lines follow from $\mathcal P^2 = 1$, $\mathcal P\mathcal T = \mathcal T\mathcal P$ and the transition functions $\Phi_S(p) = e^{2ih\varphi(p)}\Phi_N(p)$ (8.97) for helicity $h$ together with $e^{2ih\varphi(\mathcal Pp)} = e^{2ih\varphi(p)}(-1)^{2h}$. They imply

$$\mathcal T^2 = (-1)^{2h}\,. \tag{8.122}$$

Check (8.117) to hold for (8.46, 8.48) with (8.118) and for (8.95, 8.98) with (8.121).

One could conceive of a more general time reversal $\mathcal T' = U\mathcal T$, where $U$ commutes with Poincaré transformations and acts unitarily on the species of particles of the same mass and spin or helicity. Then $\mathcal T'^2 = U^*U(-1)^{2s}$ where $U^*U = \eta$ has to be diagonal with elements $1$ or $-1$. So $\mathcal T'$ mixes at most two species, in case of which $U$ is (up to an irrelevant phase)

$$U = \begin{pmatrix} & 1\\ -1 & \end{pmatrix}. \tag{8.123}$$

⁶ Discontinuous $\mathcal P$ and $\mathcal T$ as in [63, cp. 2.6] are incompatible with smooth Lorentz transformations.

Chapter 9

Quantum Fields

9.1 Free and Interacting Time Evolution

Two-particle states consist of products of one-particle states and transform by the product representation $U_{\Lambda,a} \otimes U_{\Lambda,a}$. So the infinitesimal generators satisfy the Leibniz rule. Applied to two-particle states $\Psi^{ij}(p_1, p_2)$ the generator $P^0$ preserves the individual momenta $p_1$ and $p_2$ separately,

$$(P^0\Psi)^{ij}(p_1,p_2) = (p_1^0 + p_2^0)\,\Psi^{ij}(p_1,p_2), \tag{9.1}$$

which means that the time evolution $U_0(t) = e^{-itP^0}$ generated by $H = P^0$ is free and preserves the individual energies, momenta and spins of the single particles. The time evolution is interacting if it does not map products of single particle states to the product of the time evolved factors.

Let $U(\bar t - t)$ denote this unitary, interacting time evolution from states at time $t$ to states at time $\bar t$, $\Psi(\bar t) = U(\bar t - t)\,\Psi(t)$. If $t$ is sufficiently early, the initial state $\Psi(t)$ consists of distant particles, elementary or composite, moving freely before they come near enough to interact. The final state is considered at a sufficiently late time $\bar t$ such that the scattered particles have separated and their mutual interactions have become negligible, so that the particles again move freely.

To abstract from the inessential, we would like to consider the limit $\bar t \to \infty$ and $t \to -\infty$. However, the wave packets of freely moving particles keep on moving and spread and have no limit, just as the straight motion of a free classical particle has no limit. This is why we split off the free time evolution

$$\lim_{t\to\infty} U_0(t)^{-1}\,\Psi(t) = \Psi_{\text{out}}, \qquad \lim_{t\to-\infty} U_0(t)^{-1}\,\Psi(t) = \Psi_{\text{in}}, \tag{9.2}$$

supposing that the limits $\Psi_{\text{in}}$ and $\Psi_{\text{out}}$ of incoming and outgoing states exist. The map $\Psi_{\text{in}} \mapsto \Psi_{\text{out}}$ defines the scattering matrix, the S-matrix,

$$S = \lim_{t\to\infty} U_0(-t)\,U(2t)\,U_0(-t), \tag{9.3}$$

$$\Psi_{\text{out}} = S\,\Psi_{\text{in}}. \tag{9.4}$$
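The step from the limits (9.2) to the S-matrix (9.3) can be spelled out. The following is only a sketch, assuming that the limits in (9.2) exist and using $U_0(t)^{-1} = U_0(-t)$; the symbol $\approx$ stands for "approaches as $t \to \infty$" and is not notation taken from the text.

```latex
\begin{align*}
\Psi(-t) &\approx U_0(-t)\,\Psi_{\text{in}}
  && \text{(early times, second limit in (9.2))}\\
\Psi(t) &= U(2t)\,\Psi(-t)
  && \text{(interacting evolution from } -t \text{ to } t)\\
\Psi_{\text{out}} &= \lim_{t\to\infty} U_0(t)^{-1}\,\Psi(t)
  = \lim_{t\to\infty} U_0(-t)\,U(2t)\,U_0(-t)\,\Psi_{\text{in}}
  = S\,\Psi_{\text{in}}
  && \text{(first limit in (9.2))}
\end{align*}
```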


The S-matrix is unitary, $S^{-1} = S^\dagger$. If there is no interaction then $S = 1$. For the limit $t \to \infty$ to exist, the one-particle states, elementary or composite, have to be stable, and the mass (not only the energy) of their free time evolution has to contain all contributions from switched-on self-interaction and composition; e.g. a single charged particle does not have a self Coulomb field in addition. These self-interacting one-particle states and the vacuum are eigenstates in the pure point spectrum of the S-matrix; the scattering many-particle states are in its continuous spectrum.

We adhere to the Schrödinger picture to view the time evolution of particles as paths $\Gamma: t \mapsto \Psi(t)$ in Hilbert space. The time independent states $\Psi_{\text{in}}$ and $\Psi_{\text{out}}$ are the initial states of the free time evolutions $\Gamma_{\text{in}}$ and $\Gamma_{\text{out}}$ which approach $\Gamma$ at early and late times, just as the path of a classical scattered particle has straight asymptotes which are characterized by their initial position and direction.

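Fig. 9.1 Interacting Path with Asymptotes and Their Initial States (the figure shows the path $\Gamma$ together with the initial states $\Psi_{\text{in}}$ and $\Psi_{\text{out}}$ of its asymptotes)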

In the Heisenberg picture it is not the states that change in the course of time but the operators which represent measuring devices. Note well: operators do not evolve in time by themselves in the Heisenberg picture, e.g. density matrices do not; only the operators which represent measuring devices and enter (8.1) by their eigenvectors do,

$$\underbrace{\big\langle \Lambda(0) \,\big|\, U(t)\,\Psi(0) \big\rangle}_{\text{Schrödinger}} = \underbrace{\big\langle U(t)^{-1}\Lambda(0) \,\big|\, \Psi(0) \big\rangle}_{\text{Heisenberg}}. \tag{9.5}$$

The Schrödinger picture and the Heisenberg picture are invertibly related and therefore termed equivalent. But the Heisenberg picture is as counterintuitive as the Ptolemaic system, which describes the orbit of the earth around the sun in coordinates in which the earth rests. The Ptolemaic system is a memorable misconception which was shared throughout the scientific community and was handed down from generation to generation, hindering scientific progress for 2 000 years.

The quantum fields, which we shall construct, are labeled by spacetime points $x$ and deserve the denomination ‘free’ because their time evolution is generated by $P^0$. They do not represent measuring devices, notwithstanding axiomatic systems which call them ‘observables’. They are used to construct the S-matrix of interacting particles and rightly ignore no-go theorems that they cannot coexist with interacting quantum fields. Interacting fields are not required.

The S-matrix contains all experimentally available information about the interacting particles: its matrix elements determine the differential cross sections [30, page 201]

$$d\sigma_{2\to\Delta_n} = \frac{1}{4\,[(p_1\cdot p_2)^2 - m_1^2 m_2^2]^{1/2}} \int_{\Delta_n} \tilde{d}q_1 \ldots \tilde{d}q_n\, (2\pi)^4\,\delta^4(p_1 + p_2 - q_1 - \cdots - q_n)\, \big|\langle q_1,\ldots,q_n |\, T \,| p_1,p_2\rangle\big|^2 \tag{9.6}$$

where $\langle q_1,\ldots,q_n |\, T \,| p_1,p_2\rangle$ are the amplitudes of the reduced $T$ matrix

$$S = 1 + iT, \tag{9.7}$$

from which a delta function has been split off

$$\big((S-1)\Psi_{\text{in}}\big)(q_1,\ldots,q_n) = i\int \tilde{d}p_1\,\tilde{d}p_2\,(2\pi)^4\,\delta^4(p_1 + p_2 - q_1 - \cdots - q_n)\,\langle q_1,\ldots,q_n |\, T \,| p_1,p_2\rangle\,\Psi_{\text{in}}(p_1,p_2). \tag{9.8}$$

The unitarity condition $S^\dagger S = 1$ relates the imaginary part of the scattering amplitude to a positive definite square, just as $|1 + it|^2 = 1$, $t \in \mathbb{C}$, describes a circle,

$$i\,(T^\dagger - T) = T^\dagger T. \tag{9.9}$$
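The relation between $S^\dagger S = 1$ and (9.9) can be checked numerically in a finite-dimensional toy model. The sketch below is an illustration only: the $2\times 2$ matrix and the helper names (`unitary_2x2`, `reduced_t`, `unitarity_defect`) are assumptions made for this example and do not appear in the text.

```python
import cmath
import math

def dagger(a):
    """Hermitean adjoint of a square matrix given as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unitary_2x2(theta, alpha, beta):
    """A unitary 2x2 matrix, used here as a toy S-matrix (hypothetical example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[cmath.exp(1j * alpha) * c, cmath.exp(1j * beta) * s],
            [-cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]

def reduced_t(s_matrix):
    """T = (S - 1) / i, i.e. S = 1 + iT as in (9.7)."""
    n = len(s_matrix)
    return [[(s_matrix[i][j] - (1 if i == j else 0)) / 1j for j in range(n)]
            for i in range(n)]

def unitarity_defect(t):
    """Largest modulus among the entries of i(T^dagger - T) - T^dagger T."""
    td = dagger(t)
    tdt = matmul(td, t)
    n = len(t)
    return max(abs(1j * (td[i][j] - t[i][j]) - tdt[i][j])
               for i in range(n) for j in range(n))

if __name__ == "__main__":
    good = unitarity_defect(reduced_t(unitary_2x2(0.3, 0.7, -1.1)))
    bad = unitarity_defect(reduced_t([[2, 0], [0, 1]]))  # S is not unitary
    print(good, bad)
```

For a unitary $S$ the defect vanishes up to rounding, while the non-unitary diagonal matrix violates (9.9).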

The S-matrix does not map a Hilbert space $H_{\text{in}}$ of free incoming states to a different space $H_{\text{out}}$, and at intermediate times the states are not in some space of interacting states. Such a view of different spaces is incompatible with a differentiable time evolution and its perturbative evaluation, because vectors from different spaces cannot be added. Not the states but their time evolution is free or interacting. One and the same space $H$ is the stage of the time evolution of interacting particles and of the free time evolution with $H = P^0$, just as in classical mechanics the space $\mathbb{R}^3$ contains the paths of interacting particles and the straight lines which are traced out by free particles. That $\mathbb{R}^3$ contains Kepler ellipses and straight lines does not make the points $x \in \mathbb{R}^3$ elliptic or linear, nor are the states $\Psi \in H$ interacting or free.

The measured cross sections of conveniently chosen processes at some momenta fix the parameters of the theoretical S-matrix, which then predicts the remaining results for varying momenta. Comparing theory and measurement, the present standard model of the interacting, fundamental particles (quarks, electrons, neutrinos, gauge bosons and Higgs scalar) with roughly 20 parameters has been derived in agreement with all experimental evidence of high energy physics. However, the structures of hadrons as bound states of quarks still remain to be understood, as do the values of the parameters, the incorporation of gravity in a quantum theory with only finitely many parameters, and gravity on galactic and cosmic scales, which indicates otherwise unobservable dark matter and dark energy.


9.2 The Perturbative Relativistic S-Matrix

Assume that in spacetime one can switch on the interaction with an intensity function $g$ such that $g(x) = 1$ in the neighbourhood of $x$ corresponds to full interaction at $x$ and $g(x) = 0$ to no interaction. Then the S-matrix is a functional $S[g]$ with $S[0] = 1$, and the S-matrix with fully switched on interactions is $S = S[1]$. The S-matrix is perturbative if $S[g]$ can be expanded in powers of $g$,

$$S[g] = 1 + \sum_{n=1}^{\infty} \frac{1}{n!} \int d^4x_1 \ldots d^4x_n\; g(x_1) \ldots g(x_n)\, S_n(x_1,\ldots,x_n). \tag{9.10}$$

Poincaré covariance of the S-matrix requires the adjoint transformation $\mathrm{Ad}\,U_{\Lambda,a}$ of the S-matrix to yield the S-matrix of the transformed intensity function,

$$U_{\Lambda,a}\, S[g]\, U_{\Lambda,a}^{-1} = S[\mathrm{Ad}_{\Lambda,a}\, g], \qquad (\mathrm{Ad}_{\Lambda,a}\, g)(x) = g(\Lambda^{-1}(x - a)). \tag{9.11}$$

In particular, with fully switched on interactions the S-matrix commutes with translations and rotations, so it conserves momentum and angular momentum.

The support $\mathrm{supp}(g)$ of a function $g$ is the smallest closed subset of its domain containing all points not mapped to zero. If $g_1$ and $g_2$ are functions of spacetime then $g_2$ is said to be later than $g_1$ if all points of $\mathrm{supp}(g_2)$ are later than all points of $\mathrm{supp}(g_1)$. If all points of $\mathrm{supp}(g_1)$ are spacelike to all points of $\mathrm{supp}(g_2)$ then we call $g_1$ spacelike to $g_2$.

The S-matrix is causal if, whenever $g_2$ is later than $g_1$, the S-matrix of $g_2 + g_1$ is the product in chronological order from right to left, the time ordered product, of the S-matrices of $g_1$ and $g_2$,

$$S[g_2 + g_1] = S[g_2]\, S[g_1] \quad \text{whenever } g_2 \text{ is later than } g_1. \tag{9.12}$$

Then, by (9.11), $S[g_2]$ has to commute with $S[g_1]$ if $g_1$ is spacelike to $g_2$, because there is a Lorentz transformation $\Lambda$ such that $\Lambda\,\mathrm{supp}(g_1)$ is later than $\Lambda\,\mathrm{supp}(g_2)$ and a transformation $\Lambda'$ such that $\Lambda'\,\mathrm{supp}(g_2)$ is later than $\Lambda'\,\mathrm{supp}(g_1)$,

$$S[g_1]\, S[g_2] = S[g_2]\, S[g_1] \quad \text{if } g_1 \text{ is spacelike to } g_2. \tag{9.13}$$

Bogoliubov’s analysis [8] of these requirements shows that each unitary, perturbative, Poincaré-covariant and causal S-matrix is the time ordered product (B.51)

$$S[g] = T \exp\Big( i \int d^4x\, \mathcal{L}_{\text{int}}(x, g(x)) \Big), \qquad \big(T\mathcal{L}_{\text{int}}(x, g(x))\big)^\star = T\mathcal{L}_{\text{int}}(x, g(x)), \tag{9.14}$$

where at each point the time ordered $T\mathcal{L}_{\text{int}}(x)$ is a hermitean operator which depends on the value of the intensity function $g$ and which is local in the sense that $T\mathcal{L}_{\text{int}}(x, g(x))$ commutes with $T\mathcal{L}_{\text{int}}(y, g(y))$ if $x$ is spacelike to $y$,

$$[\,T\mathcal{L}_{\text{int}}(x, g(x)),\, T\mathcal{L}_{\text{int}}(y, g(y))\,] = 0 \quad \text{if } x \text{ is spacelike to } y. \tag{9.15}$$


Lorentz covariance requires the time ordered operators $T\mathcal{L}_{\text{int}}$ to transform as a scalar field

$$U_{\Lambda,a}\, T\mathcal{L}_{\text{int}}(x, g(x))\, U_{\Lambda,a}^{-1} = T\mathcal{L}_{\text{int}}(\Lambda x + a, g(x)). \tag{9.16}$$

As the intensity function $g$ has numerical values it is a classical field and commutes with $U_{\Lambda,a}$ rather than being transformed. This is necessary for the Poincaré covariance (9.11) of the S-matrix: by $g(x) = (\mathrm{Ad}_{\Lambda,a}\, g)(\Lambda x + a)$ and because the domain of integration and its measure are Poincaré invariant, $d^4x = d^4(\Lambda x + a)$, the transformed exponent $U_{\Lambda,a}\, T\!\int d^4x\, \mathcal{L}_{\text{int}}(x, g(x))\, U_{\Lambda,a}^{-1}$ equals the one with the transformed intensity function, $T\!\int d^4x\, \mathcal{L}_{\text{int}}\big(x, (\mathrm{Ad}_{\Lambda,a}\, g)(x)\big)$.

9.3 Fields in Hilbert Space

The transformation of the position wave function is in marked contrast to the local transformations of quantum fields $\Phi(x)$,

$$U_\Lambda\, \Phi(x)\, U_\Lambda^{-1} = D_\Lambda^{-1}\, \Phi(\Lambda x), \qquad U_a\, \Phi(x)\, U_a^{-1} = \Phi(x + a), \tag{9.17}$$

which at each point $x$ of spacetime are vectors in a space of operators (more precisely operator valued distributions) on which a finite-dimensional matrix representation $D_\Lambda$ of Lorentz transformations acts, i.e. the components of $\Phi$ constitute a column which allows for matrix multiplication with $D_\Lambda$, and each component is an operator in Hilbert space (or a more general space, the Fock space $\mathcal{F}$, with an indefinite scalar product) which is transformed by $\mathrm{Ad}\, U_\Lambda$. Note that it is the operators $\Phi(x)$ which are transformed. The map $\Phi: x \mapsto \Phi(x)$ is equivariant, $D_\Lambda\, U_\Lambda\, \Phi(\Lambda^{-1}x)\, U_\Lambda^{-1} = \Phi(x)$.

In quantum theory we call quantum fields just fields for short and distinguish them from fields with numerical values (they commute with $U_a$ and $U_\Lambda$) by the denomination classical fields or background fields.

If $D_\Lambda = \Lambda$ is the vector representation we call the quantum field a vector field. We use the same denomination also for sections of the tangent bundle, i.e. for first order differential operators $v^i(x)\,\partial_i$. Usually the context resolves the ambiguous denomination.

As each component of $D_\Lambda^{-1}\Phi$ is a linear combination of operators and as $U_{\Lambda_2}$ is linear, $U_{\Lambda_2} D_{\Lambda_1}^{-1} \Phi = D_{\Lambda_1}^{-1} U_{\Lambda_2} \Phi$. So (9.17) represents the Lorentz group,

$$U_{\Lambda_2}\, D_{\Lambda_1}^{-1}\, \Phi(\Lambda_1 x)\, U_{\Lambda_2}^{-1} = D_{\Lambda_1}^{-1}\, U_{\Lambda_2}\, \Phi(\Lambda_1 x)\, U_{\Lambda_2}^{-1} = D_{\Lambda_1}^{-1} D_{\Lambda_2}^{-1}\, \Phi(\Lambda_2 \Lambda_1 x). \tag{9.18}$$

If the representation DΛ is finite-dimensional and does not represent each g trivially by 1, it cannot be unitary as the restricted Lorentz group is not compact. The only known solutions to the locality requirement (9.15) construct the interaction Lagrangian as a normal ordered real polynomial in elementary local free fields, corresponding to elementary particles or the constituents of bound states e.g. to quarks contained in hadrons.


A field is a vector of operator valued distributions $\Phi(x)$ (which we sloppily call operators) which transforms under translations $U_a = e^{iPa}$ and Lorentz transformations $U_\Lambda$ by (9.17),

$$U_\Lambda\, \Phi(x)\, U_\Lambda^{-1} = D_\Lambda^{-1}\, \Phi(\Lambda x), \qquad e^{iPa}\, \Phi(x)\, e^{-iPa} = \Phi(x + a), \tag{9.19}$$

where $D$ is a finite-dimensional representation of the Lorentz group. It decomposes into a sum of irreducible representations $D^{k/2,l/2}$ (A.60). The field is called scalar if $D_\Lambda \equiv 1$ is the trivial representation.

The Hilbert space $H$ decomposes into a sum or integral of spaces on which the unitary representation of the Poincaré group acts irreducibly. Correspondingly each field decomposes into fields which make transitions from one irreducible space to another. Vacuum to vacuum transitions, the vacuum expectation values $c = \langle \Omega |\, \Phi(x)\, \Omega \rangle$, are constant, as the vacuum is translation invariant, and vanish unless $\Phi$ is a scalar field, as the vacuum is Lorentz invariant. Trivial as a constant scalar field may seem, it is nevertheless crucial for generating masses.

For readability we employ matrix notation. The fields $\Phi$ are column vectors on which the representation matrices $D_\Lambda$ act. Depending on the context $\Phi$ may denote fields from subspaces or all fields including the adjoint operators $\Phi^\star$. If $\Phi$ transforms by the irreducible representation $D^{k/2,l/2}$ then $\Phi^\star$ also denotes a column of operators and transforms by the complex conjugate representation $D^{l/2,k/2}$. We denote by $\Phi^\dagger = \Phi^{\star T}$ row vectors of the adjoint fields. On them matrices act from the right.

The parts of local fields $\Phi(x) = e^{iPx}\, \Phi(0)\, e^{-iPx}$ which map the vacuum $\Omega$ to one-particle states $\Psi$ and vice versa define their creation and annihilation parts. They are also called the negative and positive frequency parts $\Phi^-$ and $\Phi^+$,

$$\langle \Omega |\, \Phi^+(x)\, \Psi \rangle = \int \tilde{d}p\; u(p)\, e^{-ipx}\, \Psi(p), \qquad \langle \Psi |\, \Phi^-(x)\, \Omega \rangle = \int \tilde{d}p\; v(p)\, e^{ipx}\, \Psi^*(p), \tag{9.20}$$

where $u(p)$ and $v(p)$ are matrices which act on the components of $\Psi$ or $\Psi^*$ and are acted on by the representation $D$. Their columns are the $u$- and $v$-spinors or polarization vectors. To combine them to matrices not only simplifies the notation but exhibits their main property, to intertwine the inducing unitary representation acting on states with the restriction of $D$ to the stabilizer $H$ of $\underline{p}$ (9.25).

The vacuum to one-particle amplitudes and the free fields, which we construct from them, satisfy the Klein-Gordon equation with the mass of the state $\Psi$,

$$(\Box + m^2)\, \Phi^+(x) = 0, \qquad (\Box + m^2)\, \Phi^-(x) = 0. \tag{9.21}$$

Consequently, each free massive spin-1/2 field is part of a field which satisfies the Dirac equation (A.72).

Up to normalization each field is determined by the representation $D^{k/2,l/2}$ and the mass and spin or the helicity of the particle, as we are going to show. The unitary representation $U_\Lambda$ preserves scalar products and the vacuum,

$$\langle \Omega |\, \Phi^+(x)\, \Psi \rangle = \langle U_\Lambda \Omega |\, U_\Lambda \Phi^+(x) U_\Lambda^{-1}\, U_\Lambda \Psi \rangle = D_\Lambda^{-1}\, \langle \Omega |\, \Phi^+(\Lambda x)\, (U_\Lambda \Psi) \rangle. \tag{9.22}$$

Integrating over $\Lambda p$ rather than $p$, $\tilde{d}(\Lambda p) = \tilde{d}p$, and by $(\Lambda p)\cdot(\Lambda x) = p \cdot x$ we get

$$\int \tilde{d}p\; u(p)\, e^{-ipx}\, \Psi(p) = D_\Lambda^{-1} \int \tilde{d}p\; u(\Lambda p)\, e^{-i(\Lambda p)(\Lambda x)}\, (U_\Lambda \Psi)(\Lambda p) \overset{(8.33)}{=} D_\Lambda^{-1} \int \tilde{d}p\; u(\Lambda p)\, e^{-ipx}\, R(W(\Lambda, p))\, \Psi(p). \tag{9.23}$$

This holds for all $\Psi$, so $u(p)$ and analogously $v(p)$ satisfy almost everywhere¹

$$D_\Lambda\, u(p) = u(\Lambda p)\, R(W(\Lambda, p)), \qquad D_\Lambda\, v(p) = v(\Lambda p)\, R^*(W(\Lambda, p)). \tag{9.24}$$

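At $\underline{p}$ and for $\Lambda = h$ in the stabilizer $H$, the Wigner rotation is $W(h, \underline{p}) = h$, so $u(\underline{p})$ and $v(\underline{p})$ intertwine the inducing representations $R$ and $R^*$ of the stabilizer $H$ with the restriction to $H$ of the representation $D$,

$$D_h\, u(\underline{p}) = u(\underline{p})\, R(h), \qquad D_h\, v(\underline{p}) = v(\underline{p})\, R^*(h). \tag{9.25}$$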

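By Schur’s lemma each irreducible $R$ is (equivalent to) one of the representations $R_i$ into which $D$ decomposes if restricted to $H$,

$$D_h = \begin{pmatrix} R_1(h) & & \\ & \ddots & \\ & & R_m(h) \end{pmatrix}. \tag{9.26}$$

We fix the normalization of $u(\underline{p})$ by requiring its columns to be the orthonormal standard basis vectors of $R$ in terms of the basis of $D$. For example, if $D_\Lambda = \Lambda$ is the four-dimensional self representation and if the particle is massive, then $H = SO(3)$ and $D$ contains a spin-0 representation with basis vector $e$ and a spin-1 representation with basis $e_1, e_0, e_{-1}$ related by $L_-\, e_m = \sqrt{(1 + m)(2 - m)}\; e_{m-1}$ (B.65d). These basis vectors are the columns of $u(\underline{p})$. The intertwiner of $D_h$ and $R^*(h)$ is $v(\underline{p}) = u^*(\underline{p})$ as the matrices $D_\Lambda = \Lambda$ are real,

$$u(\underline{p}) = \begin{pmatrix} 1 & & & \\ & \frac{-1}{\sqrt{2}} & & \frac{1}{\sqrt{2}} \\ & \frac{-i}{\sqrt{2}} & & \frac{-i}{\sqrt{2}} \\ & & 1 & \end{pmatrix}, \qquad v(\underline{p}) = \begin{pmatrix} 1 & & & \\ & \frac{-1}{\sqrt{2}} & & \frac{1}{\sqrt{2}} \\ & \frac{i}{\sqrt{2}} & & \frac{i}{\sqrt{2}} \\ & & 1 & \end{pmatrix}. \tag{9.27}$$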
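The intertwining property (9.25) of the matrices (9.27) can be checked numerically for the rotations around the 3-axis. The following is a minimal sketch under stated assumptions: the column order ($e$, $e_1$, $e_0$, $e_{-1}$) is read off from (9.27), the rotation by $\alpha$ is assumed to be represented on spin states by the phases $e^{-im\alpha}$, and all function names are hypothetical.

```python
import cmath
import math

R2 = 1 / math.sqrt(2)

# Columns: spin-0 vector e, then the spin-1 vectors e_1, e_0, e_-1, as in (9.27).
U = [[1, 0, 0, 0],
     [0, -R2, 0, R2],
     [0, -1j * R2, 0, -1j * R2],
     [0, 0, 1, 0]]
V = [[complex(x).conjugate() for x in row] for row in U]  # v = u^*

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rotation_about_3_axis(alpha):
    """Vector representation of the rotation by alpha around the 3-axis."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]

def spin_phases(alpha, conjugate=False):
    """Assumed R(r_alpha): diagonal phases e^{-i m alpha} for m = 0 and m = 1, 0, -1."""
    ms = [0, 1, 0, -1]
    sign = 1j if conjugate else -1j
    return [[cmath.exp(sign * m * alpha) if i == j else 0
             for j, m in enumerate(ms)] for i in range(4)]

def intertwiner_defect(alpha):
    """Largest deviation from D_h u = u R(h) and D_h v = v R^*(h)."""
    d = rotation_about_3_axis(alpha)
    pairs = [(matmul(d, U), matmul(U, spin_phases(alpha))),
             (matmul(d, V), matmul(V, spin_phases(alpha, conjugate=True)))]
    return max(abs(lhs[i][j] - rhs[i][j])
               for lhs, rhs in pairs for i in range(4) for j in range(4))

if __name__ == "__main__":
    print(intertwiner_defect(0.8))
```

The check covers only the one-parameter subgroup of rotations around the 3-axis, on which $R$ is diagonal; the other rotations would require the full spin-1 matrices.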

Equation (9.24) determines the functions $u(p)$ and $v(p)$ uniquely by their values at $\underline{p}$. For $\Lambda = L_p$ ($\Lambda = n_p$ in case $m = 0$ (8.88)) the Wigner rotation is trivial, so

$$u(p) = D_{L_p}\, u(\underline{p}), \quad u_N(p) = D_{n_p}\, u(\underline{p}), \quad v(p) = D_{L_p}\, v(\underline{p}), \quad v_N(p) = D_{n_p}\, v(\underline{p}). \tag{9.28}$$

¹ One can drop the restriction ‘almost’.


These functions satisfy (9.24) because $D$ represents (8.32), $D_\Lambda D_{L_p} = D_{L_{\Lambda p}} D_{W(\Lambda, p)}$, $W(\Lambda, p) \in H$, and because of (9.25).

The restriction of $D^{k/2,l/2}$ to $SO(3)$ yields the representation $k/2 \otimes l/2$ which decomposes into a sum of multiplets with spin $(k + l)/2,\ (k + l)/2 - 1,\ \ldots,\ |k - l|/2$. An irreducible $R(h)$ is a representation with one of these spins.

Fields in different representations $D$ and $D'$ which both create and annihilate the same particle are sometimes termed dual, as if they were physically equivalent. But dual fields do not permit the same local interactions.

If the particle is massless, then (9.25) restricts $u(\underline{p})$ not only for $h \in SO(2)$ but for $h \in E(2)$. As Euclidean translations $T_a$ (8.84) are trivially represented, $R(T_a) = 1$, $u(\underline{p})$ is invariant under $D_{T_a}$, a restriction which is largely overlooked,

$$D_{T_a}\, u(\underline{p}) = u(\underline{p}). \tag{9.29}$$

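The fundamental representation $D^{1/2,0}$ represents $T_a$ by (8.90)

$$\begin{pmatrix} 1 & \\ a & 1 \end{pmatrix} \tag{9.30}$$

and leaves invariant only the subspace which is spanned by the second basis spinor.² More generally, the representation $D^{k/2,l/2}$ of $T_a$ leaves invariant only the $2 \ldots 2\, \dot{2} \ldots \dot{2}$ component of $u_{\alpha_1 \ldots \alpha_k \dot\beta_1 \ldots \dot\beta_l}(\underline{p})$. So by (9.28) $u_N(p)$ is proportional to

$$u_{N\, \alpha_1 \ldots \alpha_k \dot\beta_1 \ldots \dot\beta_l}(p) = n_{\alpha_1}(p) \ldots n_{\alpha_k}(p)\; n^*_{\dot\beta_1}(p) \ldots n^*_{\dot\beta_l}(p), \tag{9.31}$$

where $n_\alpha(p)$ are $\sqrt{2}$ times the components of the second column of $n_p$ (8.88),

$$\begin{pmatrix} n_1(p) \\ n_2(p) \end{pmatrix} = \begin{pmatrix} -\dfrac{p_x - i p_y}{\sqrt{|\mathbf{p}| + p_z}} \\[2ex] \sqrt{|\mathbf{p}| + p_z} \end{pmatrix}. \tag{9.32}$$

The spinor $n(p)$ is a square root of the lightlike vector $p$ (6.74),

$$n_\alpha(p)\; n^*_{\dot\beta}(p) = p_{\alpha\dot\beta}. \tag{9.33}$$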
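The statement that the spinor $n(p)$ is a square root of the lightlike momentum can be verified numerically, together with the relation $n_\alpha(p) = s_\alpha(p)\, e^{-i\varphi(p)}$ to the southern spinor of (9.35). The sketch below rests on assumptions: the matrix $p_{\alpha\dot\beta}$ is taken in the convention with diagonal entries $|\mathbf{p}| \mp p_z$, which is inferred from (9.32) and (9.33) rather than quoted from (6.74), and the function names are hypothetical.

```python
import math

def n_spinor(px, py, pz):
    """Northern spinor of (9.32); singular on the negative z-axis."""
    root = math.sqrt(math.sqrt(px * px + py * py + pz * pz) + pz)
    return (-(px - 1j * py) / root, complex(root))

def s_spinor(px, py, pz):
    """Southern spinor of (9.35); singular on the positive z-axis."""
    root = math.sqrt(math.sqrt(px * px + py * py + pz * pz) - pz)
    return (complex(-root), (px + 1j * py) / root)

def p_matrix(px, py, pz):
    """Assumed convention for p_{alpha beta-dot} of the lightlike p = (|p|, px, py, pz)."""
    r = math.sqrt(px * px + py * py + pz * pz)
    return [[r - pz, -(px - 1j * py)],
            [-(px + 1j * py), r + pz]]

def spinor_defects(px, py, pz):
    """Return (deviation from n n^* = p, deviation from n = s e^{-i phi})."""
    n = n_spinor(px, py, pz)
    s = s_spinor(px, py, pz)
    p = p_matrix(px, py, pz)
    square_root = max(abs(n[a] * n[b].conjugate() - p[a][b])
                      for a in range(2) for b in range(2))
    # e^{-i phi} with phi the azimuthal angle of the spatial momentum
    phase = (px - 1j * py) / math.hypot(px, py)
    transition = max(abs(n[a] - s[a] * phase) for a in range(2))
    return square_root, transition

if __name__ == "__main__":
    print(spinor_defects(0.3, -0.4, 1.2))
```

Both deviations vanish up to rounding for momenta away from the 3-axis, where one of the two spinors is singular.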

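So massless fields $\Phi^+$ which transform under $D^{k/2,l/2}$ with $k \cdot l \neq 0$ contain $k$ or $l$, whichever is smaller, factors $P$. Simple massless fields, from which no factor $P$ can be split off, transform under $D^{h,0}$, if they annihilate particles with helicity $-h < 0$ in states $\Psi_-$, or under $D^{0,h}$, if they annihilate positive helicity particles in states $\Psi_+$. Conveniently choosing the normalization, the annihilation and creation amplitudes (they follow by conjugation) are

$$\begin{aligned}
\langle \Omega |\, \Phi^+_{\dot\alpha_1 \ldots \dot\alpha_{2h}}(x)\, \Psi_+ \rangle &= \int \tilde{d}p\; e^{-ipx}\, n^*_{\dot\alpha_1}(p) \ldots n^*_{\dot\alpha_{2h}}(p)\, \Psi_{+N}(p), \\
\langle \Omega |\, \Phi^+_{\alpha_1 \ldots \alpha_{2h}}(x)\, \Psi_- \rangle &= \int \tilde{d}p\; e^{-ipx}\, n_{\alpha_1}(p) \ldots n_{\alpha_{2h}}(p)\, \Psi_{-N}(p), \\
\langle \Psi_+ |\, \Phi^-_{\alpha_1 \ldots \alpha_{2h}}(x)\, \Omega \rangle &= \int \tilde{d}p\; e^{ipx}\, n_{\alpha_1}(p) \ldots n_{\alpha_{2h}}(p)\, \Psi^*_{+N}(p), \\
\langle \Psi_- |\, \Phi^-_{\dot\alpha_1 \ldots \dot\alpha_{2h}}(x)\, \Omega \rangle &= \int \tilde{d}p\; e^{ipx}\, n^*_{\dot\alpha_1}(p) \ldots n^*_{\dot\alpha_{2h}}(p)\, \Psi^*_{-N}(p).
\end{aligned} \tag{9.34}$$

² The projector $\underline{p}^m \sigma_m / 2 = (\sigma^0 - \sigma^3)/2$ distinguishes the second basis vector in spinor space.

By $\Psi_{\pm S}(p) = e^{\pm 2ih\varphi(p)}\, \Psi_{\pm N}(p)$ (8.97) and by

$$n_\alpha(p) = s_\alpha(p)\, e^{-i\varphi(p)}, \qquad \begin{pmatrix} s_1(p) \\ s_2(p) \end{pmatrix} = \begin{pmatrix} -\sqrt{|\mathbf{p}| - p_z} \\[1ex] \dfrac{p_x + i p_y}{\sqrt{|\mathbf{p}| - p_z}} \end{pmatrix}, \tag{9.35}$$

these amplitudes can be equivalently expressed in southern coordinates, e.g. $n^*_{\dot\alpha_1}(p) \ldots n^*_{\dot\alpha_{2h}}(p)\, \Psi_{+N}(p) = s^*_{\dot\alpha_1}(p) \ldots s^*_{\dot\alpha_{2h}}(p)\, \Psi_{+S}(p)$.

The amplitudes are the matrix elements of fields $\Phi = \Phi^- + \Phi^+$,

$$\begin{aligned}
\Phi_{\alpha_1 \ldots \alpha_{2h}}(x) &= \int \tilde{d}p\; n_{\alpha_1}(p) \ldots n_{\alpha_{2h}}(p)\, \big( e^{ipx}\, a^\dagger_{+N}(p) + e^{-ipx}\, a_{-N}(p) \big), \\
\Phi_{\dot\alpha_1 \ldots \dot\alpha_{2h}}(x) &= \int \tilde{d}p\; n^*_{\dot\alpha_1}(p) \ldots n^*_{\dot\alpha_{2h}}(p)\, \big( e^{ipx}\, a^\dagger_{-N}(p) + e^{-ipx}\, a_{+N}(p) \big),
\end{aligned} \tag{9.36}$$

$$\Phi_{\dot\alpha_1 \ldots \dot\alpha_{2h}}(x) = \big( \Phi_{\alpha_1 \ldots \alpha_{2h}}(x) \big)^\dagger, \tag{9.37}$$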

with creation operators $a^\dagger_+(p)$ and $a^\dagger_-(p)$ of massless particles of positive helicity $h$ and negative helicity $-h$ and their annihilation operators $a_+(p)$ and $a_-(p)$ (B.50). Both particles of the helicity pair have to exist for the fields to be local, such that they commute or anticommute with each other at spacelike separation,

$$\Phi(x)\, \Phi(y) = \pm\, \Phi(y)\, \Phi(x) \quad \text{if } (x - y)^2 < 0. \tag{9.38}$$

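In case that both $\Phi(x)$ and $\Phi(y)$ have only undotted or only dotted indices, this locality requirement is automatically satisfied, because $\Phi(x)\, \Phi(y)$ contains only products of creation operators with different annihilation operators. For the mixed index picture we obtain ($\partial_{\alpha\dot\beta} = \sigma^m_{\alpha\dot\beta}\, \partial_{x^m}$)

$$\begin{aligned}
\langle \Omega |\, \Phi_{\alpha_1 \ldots \alpha_{2h}}(x)\, \Phi_{\dot\beta_1 \ldots \dot\beta_{2h}}(y)\, \Omega \rangle &= \int \tilde{d}p\; e^{-ip(x-y)}\, p_{\alpha_1\dot\beta_1} \ldots p_{\alpha_{2h}\dot\beta_{2h}} \\
&= (-1)^{2h}\, (i)^{2h}\, \partial_{\alpha_1\dot\beta_1} \ldots \partial_{\alpha_{2h}\dot\beta_{2h}} \int \tilde{d}p\; e^{-ip(x-y)},
\end{aligned} \tag{9.39}$$

as only the $a_{-N}(p')\, a^\dagger_{-N}(p)$ operators contribute and $n_\alpha(p)\, n^*_{\dot\beta}(p) = p_{\alpha\dot\beta}$ (9.33). In the opposite order only $a_{+N}(p')\, a^\dagger_{+N}(p)$ contribute,

$$\begin{aligned}
\langle \Omega |\, \Phi_{\dot\beta_1 \ldots \dot\beta_{2h}}(y)\, \Phi_{\alpha_1 \ldots \alpha_{2h}}(x)\, \Omega \rangle &= \int \tilde{d}p\; e^{ip(x-y)}\, p_{\alpha_1\dot\beta_1} \ldots p_{\alpha_{2h}\dot\beta_{2h}} \\
&= (i)^{2h}\, \partial_{\alpha_1\dot\beta_1} \ldots \partial_{\alpha_{2h}\dot\beta_{2h}} \int \tilde{d}p\; e^{ip(x-y)}.
\end{aligned} \tag{9.40}$$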

For $x^2 < 0$ the integrals $\int \tilde{d}p\; e^{-ipx} = \int \tilde{d}p\; e^{ipx} = -1/(4\pi^2 x^2)$ coincide (B.111), so both vacuum expectation values agree up to the sign $(-1)^{2h}$: local half integer helicity fields anticommute, local integer helicity fields commute. This is the spin statistics theorem for massless fields in Hilbert space.

Because of $n^\alpha(p)\, n_\alpha(p) = \varepsilon^{\alpha\beta}\, n_\beta(p)\, n_\alpha(p) = 0$ the massless fields do not only satisfy the wave equation but also the Weyl equation ($\partial^{\alpha}{}_{\dot\beta} = \varepsilon^{\alpha\gamma}\, \partial_{\gamma\dot\beta}$),

$$\partial^{\alpha}{}_{\dot\beta}\, \Phi_{\alpha\alpha_2 \ldots \alpha_k}(x) = 0. \tag{9.41}$$

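So $n$-fold derivatives of massless fields $\Phi_{\alpha_1 \ldots \alpha_{2h}}$ transform under $D^{h+n/2,\,n/2}$.

Spatial reflection (A.65) maps $\Phi(x)$ linearly to $\Pi\, \Phi(x)\, \Pi^{-1} = \varepsilon\, \Phi^\dagger(Px)$,

$$n(p) = \varepsilon\, s^*(Pp), \qquad \Pi\, a^\dagger_{+N}(p)\, \Pi^{-1} = a^\dagger_{-S}(Pp), \qquad \Pi\, \Phi_{\alpha_1 \ldots \alpha_{2h}}(x)\, \Pi^{-1} = \Phi^{\dot\alpha_1 \ldots \dot\alpha_{2h}}(Px). \tag{9.42}$$

The fields are operator valued distributions which, integrated with Schwartz functions $f: \mathbb{R}^4 \to \mathbb{C}^{2h+1}$, $x \mapsto f^{\alpha_1 \ldots \alpha_{2h}}(x)$, $f^{\alpha_1 \ldots \alpha_{2h}} = f^{\alpha_{\pi(1)} \ldots \alpha_{\pi(2h)}}$, yield operators

$$\Phi_f = \int d^4x\; f^{\alpha_1 \ldots \alpha_{2h}}(x)\, \Phi_{\alpha_1 \ldots \alpha_{2h}}(x) \tag{9.43}$$

which create a particle in the state $\hat{f}$,

$$(\Phi_f\, \Omega)_N(p) = \hat{f}_N(p) = n_{\alpha_1}(p) \ldots n_{\alpha_{2h}}(p)\, \tilde{f}^{\alpha_1 \ldots \alpha_{2h}}(p)\Big|_{p^0 = \sqrt{m^2 + \mathbf{p}^2}}, \tag{9.44}$$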

where $\tilde{f}$ is the restriction to the mass shell of the Fourier transformation of $f$.³ The test functions $f$ have the gauge symmetry that the continuation of their Fourier transformation away from the mass shell is irrelevant and that also a contribution $\sum_\pi n^{\alpha_{\pi(1)}}(p)\, g^{\alpha_{\pi(2)} \ldots \alpha_{\pi(2h)}}(p)$ to $\tilde{f}$ does not change the state $\Phi_f\, \Omega$.

³ In contrast to the Fourier transformation of a solution of the Klein-Gordon equation, $\tilde{f}$ contains no factor $\delta(p^2 - m^2)$ but is again a Schwartz function. A solution of the Klein-Gordon equation is no Schwartz function of $\mathbb{R}^4$ as it does not fall off faster than any power of $x^0$.

We call a field which effects transitions from one-particle states to one-particle states the current and denote it by $J(x)$. If $J(x) = e^{iPx}\, J(0)\, e^{-iPx}$ can be approximated by a sum $\sum_{ij} |\Gamma'_i\rangle\, J_{ij}(x)\, \langle \Gamma_j|$ where $\Gamma'_i$ and $\Gamma_j$ constitute complete systems, then the states $\Psi'$ and $\Psi$ contribute to the matrix elements $\langle \Psi' |\, J(x)\, \Psi \rangle$ as in a scalar product, and the matrix elements are determined by an integral kernel $J(q, p)$, a vector of matrices which multiply the components of the wave functions and depend on the incoming and outgoing momenta,

$$\langle \Psi' |\, J(x)\, \Psi \rangle = \int \tilde{d}q\, \tilde{d}p\; \Psi'^{*T}(q)\, J(q, p)\, e^{i(q-p)x}\, \Psi(p). \tag{9.45}$$

In the massless case, the integral kernel is specified in coordinate domains with transition functions resulting from (8.97), e.g.

$$\langle \Psi' |\, J(x)\, \Psi \rangle = \int \tilde{d}q\, \tilde{d}p\; \Psi'^{*}_S(q)\, J_{SN}(q, p)\, e^{i(q-p)x}\, \Psi_N(p). \tag{9.46}$$

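The unitary operators $U_\Lambda$ transform states and fields and preserve scalar products,

$$\langle \Psi' |\, J(x)\, \Psi \rangle = \langle U_\Lambda \Psi' |\, U_\Lambda J(x) U_\Lambda^{-1}\, U_\Lambda \Psi \rangle = D_\Lambda^{-1}\, \langle U_\Lambda \Psi' |\, J(\Lambda x)\, (U_\Lambda \Psi) \rangle. \tag{9.47}$$

Integrating over $\Lambda q$ and $\Lambda p$ rather than $q$ and $p$ we get

$$\begin{aligned}
\int \tilde{d}q\, \tilde{d}p\; \Psi'^{*T}(q)\, J(q, p)\, \Psi(p)\, e^{i(q-p)\cdot x} &= D_\Lambda^{-1} \int \tilde{d}q\, \tilde{d}p\; (U_\Lambda \Psi')^{*T}(\Lambda q)\, J(\Lambda q, \Lambda p)\, (U_\Lambda \Psi)(\Lambda p)\, e^{i\Lambda(q-p)\cdot \Lambda x} \\
&\overset{(8.33)}{=} D_\Lambda^{-1} \int \tilde{d}q\, \tilde{d}p\; \Psi'^{*T}(q)\, R'^\dagger(W(\Lambda, q))\, J(\Lambda q, \Lambda p)\, R(W(\Lambda, p))\, \Psi(p)\, e^{i(q-p)\cdot x}.
\end{aligned} \tag{9.48}$$

This holds for all $\Psi'$ and $\Psi$ in the spaces of the transition, so $J(q, p)$ is determined in each orbit $M_{(q,p)} = \{ (\Lambda q, \Lambda p) : \Lambda \in SO(1,3)^\uparrow \}$ by its value in one pair of points,

$$D_\Lambda\, R'(W(\Lambda, q))\, J(q, p)\, R^{-1}(W(\Lambda, p)) = J(\Lambda q, \Lambda p), \tag{9.49}$$

and the value $J$ at some chosen pair of momenta $(q, p)$ has to be invariant under its stabilizer $H = \{ h \in SO(1,3)^\uparrow : (hq, hp) = (q, p) \}$,

$$D_h\, R'(h)\, J\, R^{-1}(h) = J, \tag{9.50}$$

or, equivalently, $J$ intertwines $R$ with the product representation $\hat{D} \otimes R'$ where $\hat{D}$ is the restriction of $D$ to the stabilizer $H$,

$$D_h\, R'(h)\, J = J\, R(h). \tag{9.51}$$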

The stabilizer of a pair of linearly independent momenta $q$ and $p$ leaves the plane which they span pointwise invariant and consists therefore of rotations in $D - 2$ dimensions. In $D = 4$ the stabilizer of $p = \lambda(1, 0, 0, 1)$ and $q = \lambda(1, 0, 0, -1)$ (which are contained in the southern coordinate domain of $q$ and the northern domain of $p$) consists of the rotations around the 3-axis. At these momenta the current has to provide the difference of angular momenta, i.e. if $R(r_\alpha)$ represents the rotation $r_\alpha$ by $\alpha$ around the 3-axis by $e^{-im\alpha}$ and if $R'(r_\alpha) = e^{-im'\alpha}$ then $J$ is an eigenvector of the restriction of $D_\Lambda$ to $D_{r_\alpha}$ with eigenvalue $e^{i(m'-m)\alpha}$.

In physical terms: in back scattering, $\mathbf{p} = -\mathbf{q}$, the orbital angular momentum in direction of $\mathbf{p}$ vanishes and the spin of the current has to absorb the difference of the spins of the states. This restricts the scattering amplitude for all linearly independent $q$ and $p$, because $p + q$ defines a rest system in which the scattering is backwards.

Helicity is the angular momentum in the direction of the spatial momentum (8.96), so $m = h$ and $m' = -h'$, because the spatial momentum of $q$ is opposite to the 3-axis, a fact which also underlies Yang’s theorem (page 207). In the rest system of an incoming and outgoing particle, the difference of their spins is the sum of their helicities,

$$m - m' = h + h'. \tag{9.52}$$

For the current $J$ not to vanish, $D_\Lambda$ restricted to $SO(3)$ has to contain a spin-$s$ representation with $s \geq |h + h'|$. This is the Weinberg-Witten theorem [64]. Consequently all matrix elements of a vector current $\langle \Psi' |\, j^m(x)\, \Psi \rangle$ and therefore the expectation values of charges⁴

$$\langle \Psi |\, Q\, \Psi \rangle = \Big\langle \Psi \Big| \int d^3x\; j^0(x)\, \Psi \Big\rangle = \int \tilde{d}p\; \Psi^*_N(p)\, \lim_{q \to p} j^0_{NN}(q, p)\, \frac{1}{2p^0}\, \Psi_N(p) \tag{9.53}$$

vanish in states $\Psi$ with helicity $h = 1$, as do the matrix elements of the energy momentum tensor $\langle \Psi' |\, T^{mn}(x)\, \Psi \rangle$ in massless helicity-2 states.

Untouched by the theorem, gluons carry charges and gravitational waves carry energy. However, contrary to what we assumed so far, the corresponding fields generate not a Hilbert space with a definite scalar product but a Fock space with an indefinite, nondegenerate scalar product. Representations of the Poincaré group which are unitary with respect to an indefinite scalar product allow massless particles to be contained in indecomposable representations $R$ of $E(2)$, e.g. helicity $\pm 1$ together with helicity 0 (8.86), and to satisfy (9.51) with nonvanishing back scattering amplitude $J$ of a vector current which effects transitions from helicity 0 to helicity $\pm 1$.
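The spin condition $s \geq |h + h'|$ can be turned into a small bookkeeping routine. The sketch below combines it with the decomposition of $D^{k/2,l/2}$ restricted to $SO(3)$ into spins $(k+l)/2, \ldots, |k-l|/2$ stated in Section 9.3. It assumes that a vector current transforms under $D^{1/2,1/2}$ and the traceless symmetric part of the energy momentum tensor under $D^{1,1}$; the function names are hypothetical.

```python
from fractions import Fraction

def spins_in_restriction(k, l):
    """Spins contained in D^{k/2,l/2} restricted to SO(3):
    (k+l)/2, (k+l)/2 - 1, ..., |k-l|/2."""
    top = Fraction(k + l, 2)
    bottom = Fraction(abs(k - l), 2)
    spins = []
    s = top
    while s >= bottom:
        spins.append(s)
        s -= 1
    return spins

def current_may_be_nonzero(k, l, h, h_prime):
    """Necessary condition s >= |h + h'| for a field in D^{k/2,l/2}
    to have matrix elements between helicities h and h'."""
    needed = abs(Fraction(h) + Fraction(h_prime))
    return any(s >= needed for s in spins_in_restriction(k, l))

if __name__ == "__main__":
    half = Fraction(1, 2)
    # vector current (k = l = 1) between helicity-1/2 states: allowed
    print(current_may_be_nonzero(1, 1, half, half))
    # vector current between helicity-1 states: excluded
    print(current_may_be_nonzero(1, 1, 1, 1))
    # tensor in D^{1,1} (k = l = 2) between helicity-2 states: excluded
    print(current_may_be_nonzero(2, 2, 2, 2))
```

The routine checks a necessary condition only; it says nothing about whether an allowed matrix element is actually nonzero.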

9.4 The Massless Vector Field

The restriction of the vector representation $D_\Lambda = \Lambda$ of the Lorentz group to the stabilizer of $\underline{p} = (1, 0, 0, 1)$, the Euclidean group $E(2)$, does not decompose into a sum of 4 helicity transformations with helicities $h = 0, 1, 0, -1$ but contains the helicities $1$ and $-1$ in an indecomposable, reducible three-dimensional representation (8.86). So in a Hilbert space the vector field cannot create and annihilate particles with helicity $\pm 1$ which transform separately.

⁴ Lorentz invariance does not restrict the dependence of $J(q, p)$ on $q \cdot p$. So the forward scattering amplitude $j^0_{NN}(p, p)$ could differ from the coincidence limit $\lim_{q \to p} j^0_{NN}(q, p)$. The distributional limit $\int d^3x\; e^{-i(q-p)x} = (2\pi)^3\, \delta^3(q - p)$ applies only if tested with continuous functions.

One can relax the transformation law of the vector field [63] and only require that it transforms by multiplication with $\Lambda$ modulo the gradient of fields $\Phi_\Lambda$,

$$e^{iPa}\, A_m(x)\, e^{-iPa} = A_m(x + a), \qquad U_\Lambda\, A_m(x)\, U_\Lambda^{-1} = \Lambda^n{}_m \big( A_n(z) + \partial_n \Phi_\Lambda(z) \big)\Big|_{z = \Lambda x}. \tag{9.54}$$

We call such a representation a coset representation. In particular, the fields at $x = 0$ are a representation of $SO(3)$ which decomposes into irreducible, unitary representations with definite spin. Therefore we can take $\Phi_D$ to vanish for rotations $D \in SO(3) \subset SO(1,3)$. The transformation

$$U_D\, A_m(x)\, U_D^{-1} = D^n{}_m\, A_n(\Lambda x) \tag{9.55}$$

fixes its annihilation part $A^-_m$ for helicity $h = 1$ particles,

$$A_{\alpha\dot\beta}(x) = \int \tilde{d}p\; e^{-ipx}\, N_{\alpha\dot\beta}(p)\, a_N(p), \quad \text{where } N_{\alpha\dot\beta} = n_{\alpha 2}\, n_{1\dot\beta}. \tag{9.56}$$

They have to annihilate e.g. helicity-1 states $\Psi$. Otherwise, one would have $\langle \Omega |\, U_\Lambda A_m(0)\, \Psi \rangle = \Lambda^n{}_m\, \langle \Omega |\, A_n(0)\, U_\Lambda \Psi \rangle$, which is inconsistent as helicity-1 states in a Hilbert space with a positive definite scalar product can only be annihilated by (derivatives of) fields $F_{mn} = -F_{nm}$ which transform under $D^{1,0}$ and $D^{0,1}$ (9.34).

The scalar fields $\Phi$ are introduced to compensate for the indecomposability of the $E(2)$ representation which is obtained by restriction of the vector representation of the Lorentz group (9.30) and therefore has to create or annihilate helicity-1 states. But then the derivative $\partial_m \Phi$ is no gauge transformation which deserves its interpretation as changing only irrelevant features.

Classically the real fields $A_m(\lambda, x) = A_m(x) + \lambda\, \partial_{x^m} C(x)$, $\lambda \in \mathbb{R}$, are equivalent for arbitrary real $\lambda$ (5.68) because only the field strengths $F_{mn} = \partial_m A_n - \partial_n A_m$ contribute in electrodynamics to the energy-momentum tensor (5.23) and influence the paths of charged particles (5.1). In the corresponding quantum theory, the fields are operators and create and annihilate particles. They are equivalent if no probability (8.1) of a result of a measurement depends on $\lambda$, a condition which is very different from covariance requirements of differential geometry.

The quantum equivalence of the hermitean fields $A_m(\lambda, x) = A_m(x) + \lambda\, \partial_{x^m} C(x)$ requires that the fields do not act in a Hilbert space $H$ but in a Fock space $\mathcal{F}$ (Krein space in the mathematical literature) with an indefinite scalar product.

Let us integrate in quantum theory the hermitean operator valued distributions $A_m(\lambda, x)$, which correspond to the classical fields, with real Schwartz functions $a^m$,

$$\int d^4x\; a^m(x)\, \big( A_m(x) + \lambda\, \partial_{x^m} C(x) \big) = A + \lambda\, C, \qquad C = \int d^4x\; a(x)\, C(x), \tag{9.57}$$


where $a(x) = -\partial_m a^m(x)$. The operators $A + \lambda C$ are equivalent for each $\lambda$ only if all matrix elements of $C$ vanish in physical states $\Psi$, which by assumption include the vacuum $\Omega$,

$$\langle \Psi |\, C\, \Omega \rangle = 0. \tag{9.58}$$

Otherwise probability amplitudes (8.1) would depend on $\lambda$ and would be inequivalent. As $C\, \Omega$ is orthogonal to all physical states, $C$ generates an unphysical scalar particle, a ghost, with the momentum wave function $\Psi_c(p) = -i\, p_m\, \tilde{a}^m(p)$ where $\tilde{a}^m$ is the restriction of the Fourier transformation of $a^m$ to the mass shell of ghosts.

In quadratic order $\langle \Omega |\, (A + \lambda C)^2\, \Omega \rangle$ is independent of $\lambda$ if and only if

$$\langle C\, \Omega |\, C\, \Omega \rangle = 0. \tag{9.59}$$

Similarly all matrix elements of $C^2$ vanish in physical states. If the scalar product were positive definite, then $C\, \Psi = 0$ for each physical $\Psi$. So the hermitean operator $C$ would vanish in the many-particle Hilbert space $H_{\text{phys}}$ of physical states, which is mapped to itself by Poincaré transformations, the S-matrix and the fields which create and annihilate physical particles. Even if the complete space were larger, $H = H_{\text{phys}} \oplus H'$, the operator $C$ could act only in the unobservable shadow world $H'$ and would vanish for all practical purposes, i.e. in the irreducible $H_{\text{phys}}$.

If, however, the scalar product is not definite, then it allows for nonvanishing states $C\, \Omega$ to satisfy (9.59) in the same way as the Minkowski metric allows for nonvanishing lightlike vectors with vanishing length squared.

By assumption the energy is bounded from below. Therefore the positive frequency part of $C(x)$,

$$C(x) = \int \tilde{d}p\; \big( c^\dagger(p)\, e^{ipx} + c(p)\, e^{-ipx} \big), \tag{9.60}$$

annihilates the vacuum, and $\langle \Omega |\, C^2\, \Omega \rangle = 0$ for all Schwartz functions $a^m$ implies

$$\langle \Omega |\, c(p)\, c^\dagger(p')\, \Omega \rangle = 0. \tag{9.61}$$

By assumption the scalar product is nondegenerate, otherwise hermitean conjugation would not be defined. So there have to be antighost states $\Psi_{\bar{c}}$, different from physical states $\Psi$ and from ghost states $\Psi_c$, but of the same mass, with a nonvanishing scalar product with ghosts,

$$\langle \Psi_{\bar{c}} |\, \Psi_c \rangle = -i \int \tilde{d}p\; \Psi^*_{\bar{c}}(p)\, \Psi_c(p). \tag{9.62}$$

The relative phase of their wave functions is fixed such that the field

$$\bar{C}(x) = \int \tilde{d}p\; \big( \bar{c}^\dagger(p)\, e^{ipx} + \bar{c}(p)\, e^{-ipx} \big), \tag{9.63}$$

which creates antighost states, is hermitean. Its positive frequency part annihilates ghosts,

$$\langle \Omega |\, \bar{c}(p)\, c^\dagger(p')\, \Omega \rangle = -i\, (2\pi)^3\, 2p^0\, \delta^3(\mathbf{p} - \mathbf{p}'). \tag{9.64}$$

Assuming the states to be bosons or fermions, the matrix elements (9.61, 9.64) derive from creation and annihilation operators with graded commutation relations

$$[c(p'), \bar{c}^\dagger(p)] = i\, (2\pi)^3\, 2p^0\, \delta^3(\mathbf{p} - \mathbf{p}') = -[\bar{c}(p), c^\dagger(p')]. \tag{9.65}$$

Adding suitable ghost states to the antighost states one can arrange for their scalar product with themselves to vanish. So the remaining graded commutators of ghost and antighost creation and annihilation operators can be taken to vanish,

$$[c(p), c^\dagger(p')] = 0 = [\bar{c}(p), \bar{c}^\dagger(p')]. \tag{9.66}$$

The ghost fields $C(x)$ and $\bar{C}(x)$ are local, i.e. their graded commutator at spacelike separation vanishes, if the ghost and the antighost are fermionic, as at spacelike distances the Wightman distribution is a symmetric function of $(x - y)$ (B.109),

$$\begin{aligned}
[C(x), \bar{C}(y)] &= \int \tilde{d}q\, \tilde{d}p\; \big( e^{i(-qx+py)}\, [c(q), \bar{c}^\dagger(p)] + e^{i(qx-py)}\, [c^\dagger(q), \bar{c}(p)] \big) \\
&= i \int \tilde{d}p\; \big( e^{-ip(x-y)} + (-1)^{|C|\,|\bar{C}|}\, e^{ip(x-y)} \big).
\end{aligned} \tag{9.67}$$

Appendix A

Notations and Facts

A.1 Linear Algebra

Each linear map $M$ of a real or complex vector space $V$ to a vector space $W$ is determined by its action on a basis $e_1, e_2, \dots, e_d$, because each vector $v \in V$ is a linear combination¹ $v = e_m v^m$ with real or complex components $v^1, v^2, \dots, v^d$ and, by linearity, $M(e_m v^m) = M(e_m)\, v^m$. The images $M(e_m) = e'_n M^n{}_m$ of the basis vectors $e_m$ are linear combinations of the basis vectors $e'_1, \dots, e'_{d'}$ of $W$. The component $M^n{}_m$ is the element of the $d' \times d$ matrix $M$ in row $n$ (first index) and column $m$ (second index). The $m$th column of the matrix $M$ contains the components of the image $M e_m$ of the basis vector $e_m$. The image $Mv$ has components $(Mv)^n = M^n{}_m v^m$.

Endomorphisms of a vector space $V$ are linear maps of $V$ to itself. The linear maps $M$ from $V$ to $W$ constitute a vector space, because one can naturally add and scale them.

Even vectors $v \in V$ can be viewed as linear maps $v : \lambda \mapsto \lambda v$ from $\mathbb{R}$ or $\mathbb{C}$ to $V$. They create vectors out of numbers, which justifies the name creation operators. The linear maps $l$ of $V$ to $\mathbb{R}$ or $\mathbb{C}$ constitute the dual vector space $V^\star$ and correspond to single-row matrices. They are annihilation operators and map vectors to numbers. In terms of its components $l_n = l(e_n)$ each dual vector $l$ acts on $v \in V$ by $l(v) = l(e_n v^n) = l(e_n)\, v^n = l_n v^n$, the matrix product of a row $l$ times a column $v$.

Each dual vector $l$ is a linear combination $l_m f^m$ of the dual basis $f^1, f^2, \dots, f^d$, whose elements map vectors $v$ to their components, $f^m(v) = v^m$, in particular $f^m(e_n) = \delta^m{}_n$, where $\delta^m{}_n$ are the components of the Kronecker $\delta$, the elements of the unit matrix $1$,

$$\delta^m{}_n = \begin{cases} 1 & \text{if } m = n , \\ 0 & \text{if } m \neq n . \end{cases} \tag{A.1}$$

Each linear map $M : V \to W$ is a linear combination $M = \sum_{m,n} e'_n M^n{}_m f^m$ of annihilation maps $f^m$ to numbers followed by creation maps $e'_n$ to vectors.

¹ We employ Einstein's summation convention. Each pair of indices denotes the sum over the range of its values, $e_m v^m := e_1 v^1 + e_2 v^2 + \dots$.


The map $M : V \to W$ defines the transposed map $M^T$ from $W^\star$ to $V^\star$ by composition, also called pullback or right multiplication,

$$M^T : l \mapsto l \circ M . \tag{A.2}$$

Transposition reflects each matrix along its diagonal: the rows of $M^T$ contain the columns of $M$, $(M^T)_n{}^m = M^m{}_n$, because $(M^T l)_n = l(M e_n) = l(e_m M^m{}_n) = l_m M^m{}_n = (M^T)_n{}^m\, l_m$. Under linear transformations $D$ of the vector space $V$ the linear maps $l \in V^\star$ from $V$ to $\mathbb{R}$ transform naturally by the adjoint transformation (A.35), i.e. by right multiplication with $D^{-1}$,

$$\mathrm{Ad}_D : l \mapsto l \circ D^{-1} = (D^{-1})^T\, l , \quad \mathrm{Ad}_D \big|_{V^\star} = D^{T\,-1} . \tag{A.3}$$

This transformation is contragredient to the transformation of vectors in the sense that the transformed dual vector $l' = D^{T\,-1} l$ applied to $v' = Dv$ gives the same result as before the transformation, $l'(v') = (l \circ D^{-1})(Dv) = l(v)$. If $D$ represents a group, $D_{g_2} D_{g_1} = D_{g_2 g_1}$, so does $D^{T\,-1}$, $D^{T\,-1}_{g_2} D^{T\,-1}_{g_1} = D^{T\,-1}_{g_2 g_1}$.

We indicate the transformation property of vectors by the index position and the name of their components. Vector components have upper indices and transform with $D$; components of dual vectors have lower indices and transform with the contragredient representation $D^{T\,-1}$,

$$v'^m = D^m{}_n\, v^n , \quad l'_m = (D^{T\,-1})_m{}^n\, l_n = l_n\, (D^{-1})^n{}_m . \tag{A.4}$$
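The invariance $l'(v') = l(v)$ can be checked on a small example. The following is a minimal sketch in plain Python; the matrix `D`, the components `l` and `v` and all helper names are illustrative assumptions and do not appear in the text.

```python
from fractions import Fraction

def inverse_2x2(D):
    """Inverse of a 2x2 matrix, computed exactly with fractions."""
    (a, b), (c, d) = D
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_vector(D, v):
    """v'^m = D^m_n v^n: the column v is multiplied from the left by D."""
    return [sum(D[m][n] * v[n] for n in range(2)) for m in range(2)]

def transform_dual(D, l):
    """l'_m = l_n (D^-1)^n_m: the row l is multiplied from the right by D^-1."""
    D_inv = inverse_2x2(D)
    return [sum(l[n] * D_inv[n][m] for n in range(2)) for m in range(2)]

def pair(l, v):
    """l(v) = l_n v^n, a row times a column."""
    return sum(ln * vn for ln, vn in zip(l, v))

D = [[2, 1], [3, 4]]   # hypothetical invertible transformation
v = [5, -1]            # vector components (upper index)
l = [7, 2]             # dual vector components (lower index)

print(pair(l, v))                                            # 33
print(pair(transform_dual(D, l), transform_vector(D, v)))    # 33
```

Transforming only the vector or only the dual vector would change the number; the two transformations have to be applied together for the sum over the index pair to stay invariant.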

The sum over a lower and an upper index is invariant, $l'_m v'^m = l_n v^n$. The trace of an endomorphism, $\mathrm{tr}\, M = M^m{}_m$, does not depend on the basis, $\mathrm{tr}\, A M A^{-1} = \mathrm{tr}\, M$.

Tensor is the collective name for maps $T$ to $\mathbb{R}$ or $\mathbb{C}$ which are linear or conjugate linear in several vector arguments. A tensor with no vector argument is called a scalar. A dual vector $l$ is a tensor with one vector argument. Each tensor $T : V \times W \to \mathbb{R}$ corresponds one-to-one to a linear map $\hat T : V \to W^\star$ which maps $u \in V$ to the dual vector $\hat T u \in W^\star$ which maps each $w \in W$ to $T(u, w)$.

The scalar product (2.44) $\eta : V \times V \to \mathbb{R}$, $(u, v) \mapsto \eta(u, v) = u \cdot v = u^m \eta_{mn} v^n$ depends bilinearly on two vectors. In an orthonormal basis it has components

$$\eta_{mn} = \begin{cases} 1 & \text{if } m = n = 0 , \\ -1 & \text{if } m = n \in \{1, 2, 3\} , \\ 0 & \text{if } m \neq n . \end{cases} \tag{A.5}$$

Examples of tensors with three vector arguments are the volume $\mathrm{vol}(a, b, c)$ of a parallelepiped with edges $a$, $b$ and $c$ or the flow of water $J(a, b, c)$ through a parallelogram with edges $a$ and $b$ on which rain falls with current density $c$. Because tensors are linear, they are completely determined by their values on basis vectors, $\eta(u, v) = \eta(e_m, e_n)\, u^m v^n$ or $J(a, b, c) = J(e_i, e_j, e_k)\, a^i b^j c^k$. These values define the components of the tensor, $\eta_{mn} = \eta(e_m, e_n)$ or $J_{ijk} = J(e_i, e_j, e_k)$. If the tensor $T$ depends on a dual vector $l = l_m f^m$, then to this argument there corresponds an upper index of its components, e.g. if $T : V \times V^\star \to \mathbb{R}$ then $T(e_n, f^m) = T_n{}^m$ and

$$T(u, l) = T(e_n, f^m)\, u^n l_m = T_n{}^m\, u^n l_m . \tag{A.6}$$

Each lower index, which enumerates tensor components, relates to a vector argument, each upper index to a dual vector argument. Tensors can be scaled by a factor and they can be added if they are of the same type, i.e. have the same arguments. So the tensors $T$ over the vector spaces $V$ and $W$ map the cartesian product $V \times W = \{ (v, w) : v \in V , w \in W \}$, which is no vector space, bilinearly to $\mathbb{R}$ and constitute a vector space. Its dual space is the tensor product $V \otimes W$. Each pair $(u, w) \in V \times W$ defines the linear map

$$u \otimes w : T \mapsto T(u, w) \tag{A.7}$$

of tensors to numbers. Because tensors are linear, $T(au + v, w) = a\, T(u, w) + T(v, w)$, the tensor product (unlike the cartesian product) is distributive,

$$(au + v) \otimes w = a\, (u \otimes w) + v \otimes w , \quad u \otimes (aw + z) = a\, (u \otimes w) + u \otimes z . \tag{A.8}$$

If $e_1, e_2, \dots, e_d$ is a basis of $V$ and $e'_1, e'_2, \dots, e'_{d'}$ a basis of $W$, then their tensor products are a basis of $V \otimes W$, as the linear combinations $v = e_i \otimes e'_j\, v^{ij}$ constitute the space dual to the tensors,

$$v(T) = v^{ij}\, (e_i \otimes e'_j)(T) = v^{ij}\, T(e_i, e'_j) = v^{ij}\, T_{ij} . \tag{A.9}$$

A generic element of the tensor product does not factorize into the product of two factors but is a linear combination of products. If $V$ and $W$ are equipped with scalar products, then

$$(e_i \otimes e'_j) \cdot (e_k \otimes e'_l) = (e_i \cdot e_k)\, (e'_j \cdot e'_l) \tag{A.10}$$

defines the corresponding scalar product of the basis in $V \otimes W$. Under corresponding linear transformations $D : V \to V$ and $D^\star : W \to W$ the tensor product transforms into the product of the transformed factors,

$$D(u \otimes w) = Du \otimes D^\star w . \tag{A.11}$$

So the components of tensors transform like the products of the components of vectors and dual vectors, with one transformation matrix for each index, e.g.

$$T'_m{}^n = (D^{T\,-1})_m{}^k\, D^n{}_l\, T_k{}^l . \tag{A.12}$$

As tensors transform linearly, $T = 0$ is a fixed point. The Kronecker $\delta$ is invariant under all tensor transformations, $\delta' = \delta$. By their definition Lorentz transformations leave the scalar product invariant, $\eta'(u, v) = \eta(\Lambda^{-1} u, \Lambda^{-1} v) = \eta(u, v)$.

By the orthogonality relation (6.6) the contragredient rotation $D^{T\,-1}$ coincides with $D$. This is why one can identify a Euclidean space with its dual and need not distinguish between upper and lower indices in an orthonormal basis. But the contragredient Lorentz transformation $\Lambda^{T\,-1}$ is only equivalent, not equal, to $\Lambda$ (6.19),

$$\Lambda^{T\,-1} = \eta \Lambda \eta^{-1} , \quad (\Lambda^{T\,-1})_m{}^n = \eta_{mk}\, \Lambda^k{}_l\, (\eta^{-1})^{ln} . \tag{A.13}$$

Therefore one has to distinguish between four-vectors and their dual vectors. The scalar product $\eta : V \times V \to \mathbb{R}$ naturally defines the linear map $\eta : V \to V^\star$ which maps each vector $u$ to the dual vector $\eta u : v \mapsto \eta(u, v)$. The dual vector $\eta u$ has components $(\eta u)_n = u^m \eta_{mn}$, as $(\eta u)(v) = (\eta u)_n v^n = \eta(u, v)$. The scalar product is nondegenerate, so the map $\eta$ is invertible and defines a Lorentz invariant correspondence of vectors and dual vectors. The correspondence allows the shorthand notation of calling $\eta u$ just $u$ again. From the position of the index one deduces whether one deals with the components of the vector or of its corresponding dual vector. The components are related by 'raising and lowering the index with $\eta$',

$$u_m = \eta_{mn} u^n , \quad u^n = (\eta^{-1})^{nk} u_k , \quad (u_0, u_1, u_2, u_3) = (u^0, -u^1, -u^2, -u^3) . \tag{A.14}$$
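Raising and lowering an index with $\eta$ as in (A.14) amounts to a sign flip of the spatial components. The following plain Python sketch illustrates this in an orthonormal basis; the function names and the sample vectors are assumptions chosen for illustration.

```python
# Metric (A.5) in an orthonormal basis, signature (+, -, -, -).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def lower(u_up):
    """u_m = eta_mn u^n."""
    return [sum(ETA[m][n] * u_up[n] for n in range(4)) for m in range(4)]

def raise_index(u_down):
    """u^n = eta^nk u_k; in this basis eta^-1 has the same entries as eta."""
    return [sum(ETA[n][k] * u_down[k] for k in range(4)) for n in range(4)]

def dot(u_up, v_up):
    """u . v = u^m eta_mn v^n = u_m v^m (the seesaw of index positions)."""
    return sum(a * b for a, b in zip(lower(u_up), v_up))

u = [2, 1, 0, 3]
v = [5, 0, 4, 1]
print(lower(u))                 # [2, -1, 0, -3]
print(raise_index(lower(u)))    # [2, 1, 0, 3]
print(dot(u, v))                # 7
```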

We also use the convention to write $\eta^{mn}$ rather than $(\eta^{-1})^{mn}$ and deduce from its upper index pair that we deal with the components of the matrix inverse of $\eta$. Because $\eta_{mn} = \eta_{nm}$ is permutation symmetric, a sum over a lower and an upper index is like a seesaw: one may raise the lower index and lower the upper index, $u_n v^n = u^m \eta_{mn} v^n = u^m v_m$. Lowering and raising indices commutes with differentiation, $\partial_k w_m = \eta_{mn}\, \partial_k w^n$. The map $\eta$ intertwines the Lorentz transformations of the vector space and its dual, $\Lambda^{T\,-1}(\eta u) = \eta \Lambda \eta^{-1} \eta u = \eta(\Lambda u)$. Whether one maps a vector to its dual and transforms later, or transforms first and maps to the dual afterwards, makes no difference: the correspondence between vector and dual vector is Lorentz equivariant.

A.2 Area and Volume

The area $a \wedge b$ of a parallelogram with edge vectors $a$ and $b$ is linear both in $a$ and $b$. It vanishes if the edges coincide,

$$a \wedge a = 0 . \tag{A.15}$$

By $0 = (a + b) \wedge (a + b) = a \wedge a + a \wedge b + b \wedge a + b \wedge b = 0 + a \wedge b + b \wedge a + 0$ the area changes sign upon interchange of $a$ and $b$,

$$a \wedge b = -\, b \wedge a . \tag{A.16}$$

That area can be negative takes getting used to. But a patch mends a hole and a pile of gravel fills a pit, so area and volume can have both signs. They are oriented. The aperture of a camera is the area of a hole. One can choose to call it positive or negative, but a black spot in the aperture contributes to its area with opposite sign. In the plane we count by convention the area of a ∧ c as positive, if c is on the left of a, and as negative, if c is on the right. Signed area is additive as shown in the following figures.

Fig. A.1 Addition of Parallelograms of Equal and Opposite Sign: a ∧ c + b ∧ c = (a + b) ∧ c.

Area is invariant under shear, (a + λ b) ∧ b = a ∧ b + (λ b) ∧ b = a ∧ b. Shearing a and b to the basis directions of ex and ey relates a ∧ b to the basis area ex ∧ ey .

Fig. A.2 Cavalieri’s Principle: Shear Preserves Area

That $a \wedge b = (a_x b_y - a_y b_x)\, e_x \wedge e_y$ correctly expresses the area of a parallelogram with edges $a$ and $b$ as a sum of signed areas is algebraically trivial. Confirming it graphically from the following diagram is harder and is left to the reader.

Fig. A.3 Area of a Parallelogram
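The algebraic properties of the signed area are easy to try out numerically. The sketch below, in plain Python, measures $a \wedge b$ in units of $e_x \wedge e_y$; the function name `wedge` and the sample vectors are illustrative assumptions.

```python
def wedge(a, b):
    """Signed area a ^ b = a_x b_y - a_y b_x in units of e_x ^ e_y."""
    return a[0] * b[1] - a[1] * b[0]

a, b, c = (3, 1), (1, 2), (-2, 5)

print(wedge(a, b))    # 5
print(wedge(b, a))    # -5, antisymmetry (A.16)
print(wedge(a, a))    # 0, coinciding edges (A.15)

# Additivity as in Fig. A.1: a ^ c + b ^ c = (a + b) ^ c
a_plus_b = (a[0] + b[0], a[1] + b[1])
print(wedge(a, c) + wedge(b, c) == wedge(a_plus_b, c))    # True

# Shear invariance as in Fig. A.2: (a + lam b) ^ b = a ^ b
lam = 7
sheared = (a[0] + lam * b[0], a[1] + lam * b[1])
print(wedge(sheared, b))    # 5
```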


As the volume $a \wedge b \wedge c$ of a three-dimensional parallelepiped with edges $a$, $b$ and $c$ is linear in each of its edge vectors and vanishes if two of them are equal, it changes sign under each interchange of two edges (A.16): volume is a completely alternating multilinear form. In particular, in $n = 3$ dimensions the volume $e_i \wedge e_j \wedge e_k$ spanned by three basis vectors vanishes if $i = j$ or $i = k$ or $j = k$, coincides with

$$e = e_1 \wedge e_2 \wedge e_3 \tag{A.17}$$

if $i, j, k$ is an even permutation of $1, 2, 3$, and is $-e$ if $i, j, k$ is an odd permutation of $1, 2, 3$,

$$e_i \wedge e_j \wedge e_k = e\, \varepsilon_{ijk} . \tag{A.18}$$

The components of the $\varepsilon$-symbol are

$$\varepsilon_{ijk} = \begin{cases} 1 & \text{if } i, j, k \text{ is an even permutation of } 1, 2, 3 , \\ -1 & \text{if } i, j, k \text{ is an odd permutation of } 1, 2, 3 , \\ 0 & \text{else} . \end{cases} \tag{A.19}$$

It is antisymmetric in each pair of indices, $\varepsilon_{ijk} = -\varepsilon_{jik} = -\varepsilon_{kji} = -\varepsilon_{ikj}$. Expanding the edges $a = e_i a^i$, $b = e_j b^j$ and $c = e_k c^k$ and using the multilinearity yields the volume as a triple sum,

$$a \wedge b \wedge c = e\, \varepsilon_{ijk}\, a^i b^j c^k , \tag{A.20}$$

which explicitly is

$$a \wedge b \wedge c = e\, \big( a^1 (b^2 c^3 - b^3 c^2) + a^2 (b^3 c^1 - b^1 c^3) + a^3 (b^1 c^2 - b^2 c^1) \big) . \tag{A.21}$$
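The triple sum (A.20) and its explicit form (A.21) can be compared directly. The following plain Python sketch measures volumes in units of $e$; the closed formula used for the $\varepsilon$-symbol and the sample edges are assumptions made for this illustration.

```python
def epsilon(i, j, k):
    """Epsilon symbol (A.19) for indices i, j, k in {1, 2, 3}."""
    # For such indices the product of differences is +2, -2 or 0.
    return (j - i) * (k - i) * (k - j) // 2

def volume(a, b, c):
    """Triple sum (A.20): epsilon_ijk a^i b^j c^k."""
    return sum(epsilon(i, j, k) * a[i - 1] * b[j - 1] * c[k - 1]
               for i in (1, 2, 3) for j in (1, 2, 3) for k in (1, 2, 3))

def volume_explicit(a, b, c):
    """Explicit expression (A.21)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (a1 * (b2 * c3 - b3 * c2)
            + a2 * (b3 * c1 - b1 * c3)
            + a3 * (b1 * c2 - b2 * c1))

a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)
print(volume(a, b, c))             # 1
print(volume_explicit(a, b, c))    # 1
print(volume(b, a, c))             # -1, interchange of two edges
print(volume(a, a, c))             # 0, two equal edges
```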

In a right-handed orthonormal basis, $e_i \cdot e_j = \delta_{ij}$, $e = 1$, the volume spanned by $a$, $b$ and $c$ can also be considered as the scalar product of $a$ with the vector product $b \times c$, the Hodge dual of $b \wedge c$,

$$a \wedge b \wedge c = a \cdot (b \times c) , \quad b \times c = e_i\, \varepsilon_{ijk}\, b^j c^k . \tag{A.22}$$

The determinant of an $n$-dimensional endomorphism $L$ is the factor by which it scales the $n$-dimensional volume,

$$(L a_1) \wedge (L a_2) \wedge \dots \wedge (L a_n) = \det L\ (a_1 \wedge a_2 \wedge \dots \wedge a_n) . \tag{A.23}$$

Rotary reflections $D$ either preserve the volume or change its sign, $\det D = \pm 1$ (6.6). As they leave scalar products invariant, one concludes in 3 dimensions

$$(Da) \cdot (Db \times Dc) = (\det D)\, a \cdot (b \times c) = (\det D)\, (Da) \cdot D(b \times c) . \tag{A.24}$$

This holds for all $a' = Da$, so

$$(Db) \times (Dc) = (\det D)\, D\, (b \times c) . \tag{A.25}$$
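Relation (A.25) can be tested for a rotation and for a reflection. The sketch below uses plain Python with integer matrices; the two sample matrices, the sample vectors and the helper names are assumptions chosen for illustration.

```python
def cross(b, c):
    """Vector product in a right-handed orthonormal basis."""
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]

def apply(D, v):
    return [sum(D[i][j] * v[j] for j in range(3)) for i in range(3)]

def det3(D):
    return (D[0][0] * (D[1][1] * D[2][2] - D[1][2] * D[2][1])
            - D[0][1] * (D[1][0] * D[2][2] - D[1][2] * D[2][0])
            + D[0][2] * (D[1][0] * D[2][1] - D[1][1] * D[2][0]))

def axial_rule_holds(D, b, c):
    """Check (Db) x (Dc) = (det D) D (b x c)."""
    lhs = cross(apply(D, b), apply(D, c))
    rhs = [det3(D) * x for x in apply(D, cross(b, c))]
    return lhs == rhs

ROTATION = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]      # quarter turn around the 3-axis
REFLECTION = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]    # reflection of the 3-direction

b, c = [1, 2, 3], [4, 5, 6]
print(det3(ROTATION), axial_rule_holds(ROTATION, b, c))        # 1 True
print(det3(REFLECTION), axial_rule_holds(REFLECTION, b, c))    # -1 True
```

For the reflection the factor $\det D = -1$ is essential: without it the two sides differ by a sign.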


Because $b \times c$ transforms under rotary reflections of its factors as an axial vector, by multiplication with $(\det D)\, D$, the space of axial vectors is different from the space of polar vectors, which transform by multiplication with $D$. This distinction is often indicated by their different units of measurement, which forbid adding them.

Nevertheless one can identify directions in different spaces $V$ and $V'$ if they carry equivalent, irreducible representations $D_g$ and $D'_g = A D_g A^{-1}$ of rotations. The linear map $A$ defines a correspondence of each basis $e_1, e_2, \dots$ of $V$ to the basis $e'_1 = A e_1, e'_2 = A e_2, \dots$ of $V'$, and in corresponding bases the matrix elements of $D_g$ and $D'_g$ coincide. By Schur's lemma (appendix A.4) the correspondence $A$ is unique up to a nonvanishing factor, i.e. up to the choice of the unit. This is why the scalar product and the vector product are also defined for vectors from different spaces $V$ and $V'$, e.g. for velocities and magnetic fields. The unit of the product is the product of the units of the factors.

The space $\Lambda^k(V)$ is spanned by $k$-parallelepipeds $v = v_1 \wedge v_2 \wedge \dots \wedge v_k$ with edges $v_1, v_2, \dots, v_k$. If $V$ has a nondegenerate scalar product, so has $\Lambda^k(V)$,

$$(u_1 \wedge \dots \wedge u_k) \cdot (v_1 \wedge \dots \wedge v_k) = \sum_{\pi \in S_k} \mathrm{sign}(\pi)\, (u_1 \cdot v_{\pi(1)})\, (u_2 \cdot v_{\pi(2)}) \dots (u_k \cdot v_{\pi(k)}) . \tag{A.26}$$

Let $V$ be an $n$-dimensional vector space with a scalar product $\eta$ and with an orthonormal basis $e_1, e_2, \dots, e_n$ (6.17). The Hodge star $*$ is linear and maps $v \in \Lambda^k(V)$ to its Hodge dual $*v \in \Lambda^{n-k}(V)$, the $(n-k)$-parallelepiped in the space orthogonal to $v$ which, multiplied with $v$, yields the basic volume times $v \cdot v$,

$$v \wedge *v = (v \cdot v)\, e_1 \wedge \dots \wedge e_n . \tag{A.27}$$

As the Hodge star commutes with orientation preserving Lorentz transformations $\Lambda$, $\det \Lambda = 1$, it does not depend on the chosen orthonormal basis but only on its orientation. Because of $**v = (-1)^{k(n-k)}\, \frac{*v \cdot *v}{v \cdot v}\, v$ it squares to

$$** = (-1)^{k(n-k)}\, \mathrm{sign}(\det \eta) . \tag{A.28}$$

A.3 Transformations of Functions

A group is a set $G$ with an associative product

$$G \times G \to G , \quad (a, b) \mapsto a\, b , \quad a\, (b\, c) = (a\, b)\, c , \tag{A.29}$$

a unit element $e$, which leaves all group elements $a$ unchanged,

$$e\, a = a\, e = a , \tag{A.30}$$

where each element has an inverse,

$$a^{-1} a = a\, a^{-1} = e . \tag{A.31}$$

For example, the $n$th roots of $1$ constitute the cyclic group with $n$ elements,

$$\mathbb{Z}_n = \{ 1, z, z^2, \dots, z^{n-1} \} , \quad z = e^{2\pi i / n} . \tag{A.32}$$
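The group axioms can be checked numerically for the roots of unity. The sketch below uses Python's standard `cmath` module and compares complex numbers up to a small tolerance; the tolerance and the helper names are assumptions of this illustration. Associativity is inherited from the multiplication of complex numbers and is not tested.

```python
import cmath

def roots_of_unity(n):
    """The elements 1, z, ..., z^(n-1) with z = exp(2 pi i / n)."""
    z = cmath.exp(2j * cmath.pi / n)
    return [z ** k for k in range(n)]

def contains(elements, x, tol=1e-12):
    return any(abs(x - e) < tol for e in elements)

def is_group(elements):
    """Closure under products, unit element and inverses."""
    closed = all(contains(elements, a * b) for a in elements for b in elements)
    has_unit = contains(elements, 1)
    has_inverses = all(contains(elements, 1 / a) for a in elements)
    return closed and has_unit and has_inverses

Z6 = roots_of_unity(6)
print(is_group(Z6))        # True
print(is_group(Z6[:3]))    # False: z * z^2 = -1 is missing from {1, z, z^2}
print(is_group(Z6[::2]))   # True: {1, z^2, z^4} is the subgroup Z_3
```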

The invertible maps $T : \mathcal{M} \to \mathcal{M}$ of each set $\mathcal{M}$ to itself, the transformations of $\mathcal{M}$, constitute a group where successive application is their group product. For example, the symmetric group $S_n$ is the group of permutations of $\{ 1, 2, \dots, n \}$.

If a subset $H$ of a group $G$ contains each product and each inverse of its elements, then $H$ is a group itself, a subgroup of $G$. For example, the linear transformations $L$ of a $d$-dimensional complex vector space $V$ form the general linear group $GL(d, \mathbb{C})$. The determinants of volume preserving transformations $L$ have the special value $\det L = 1$. Volume preserving linear transformations are called unimodular and constitute the subgroup $SL(d, \mathbb{C})$ of special linear transformations. If the vector spaces are real, the corresponding transformation groups are called the general or special linear group in $d$ real dimensions, $GL(d, \mathbb{R})$ or $SL(d, \mathbb{R})$.

The transformations may leave a set of objects invariant. Then we call them symmetries of this set, e.g. rotations are symmetries of the sphere.

The rotations around the 3-axis constitute the subgroup $H$ of rotations which leave the unit vector $e_3$ in 3-direction invariant. Applying all rotations to $e_3$ one obtains the points of the two-dimensional sphere $S^2$. This exemplifies the notions of the stabilizer and the orbit of a point. The transformations $h \in G$ which leave a chosen point $x$ invariant constitute the subgroup $H = \{ h : h x = x \}$, its little group or stabilizer. The points to which $x$ is transformed constitute its orbit $\mathcal{M} = \{ g x : g \in G \}$ under $G$. It is the smallest manifold which contains $x$ and in which $G$ acts as a group of transformations. It acts transitively: each point $y = g x$ can be transformed to each other point $z = g' x$, e.g. $(g' g^{-1})\, y = z$. The projection $\pi$, which maps $g \in G$ to the point $y = g x \in \mathcal{M}$,

$$\pi : G \to \mathcal{M} , \quad g \mapsto g x , \tag{A.33}$$

is surjective, $\mathcal{M} = \pi(G)$. The set $\pi^{-1}(y) = \{ g : \pi(g) = y \}$ of all elements which are projected to the same point $y = g x$, the fiber over $y$, is the left coset $gH = \{ g h : h \in H \}$, as $g x = g' x$ if and only if $g^{-1} g' \in H$. As $\pi$ is constant on each coset it defines a one-to-one map from the set of left cosets $G/H = \{ gH \}$ to the orbit $\mathcal{M}$. By this map one can identify the orbit $\mathcal{M}$ with $G/H$, and $G$ acts on $G/H$ by left multiplication. No matter which transformations one investigates, the orbit and its transformation under $G$ are completely determined by the group $G$ and the stabilizer $H$.

Cosets $gH$ and $g'H$ are either disjoint or identical. The group $G$ is their disjoint union. It is a bundle of fibers, each of which can be mapped invertibly to each other. However, the fiber bundle $G$ need not be a product $(G/H) \times H$ (page 200).

A transformation $g$ may act on different objects, e.g. a rotation may rotate points, rigid bodies or forces. Then, acting on different sets, the transformations are different. But one also has to identify them, as they correspond to the same group element. This distinction and identification is achieved by the concept of a realization $M$ of a group. It maps the group elements $g \in G$ to the transformations $M_g$ of a manifold $\mathcal{M}$ such that successive transformations correspond to the group product,

$$M_{g_2} \circ M_{g_1} = M_{g_2 g_1} . \tag{A.34}$$
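Stabilizer, orbit and cosets can be made concrete for a small finite group. The sketch below, in plain Python, lets the symmetric group $S_3$ permute the points 0, 1, 2; the choice of group, the chosen point and the helper names are assumptions made for this illustration.

```python
from itertools import permutations

G = list(permutations(range(3)))    # S_3; the tuple g maps the point i to g[i]

def compose(g2, g1):
    """Group product g2 g1: apply g1 first, then g2."""
    return tuple(g2[g1[i]] for i in range(3))

x = 2                                   # chosen point
H = [h for h in G if h[x] == x]         # stabilizer of x
orbit = sorted({g[x] for g in G})       # orbit of x under G

def fiber(y):
    """All group elements which the projection g -> g x maps to y."""
    return {g for g in G if g[x] == y}

def left_coset(g):
    return {compose(g, h) for h in H}

fibers_are_cosets = all(fiber(g[x]) == left_coset(g) for g in G)

print(H)                                 # [(0, 1, 2), (1, 0, 2)]
print(orbit)                             # [0, 1, 2]
print(fibers_are_cosets)                 # True
print(len(G) == len(orbit) * len(H))     # True
```

The last line reflects that the group is the disjoint union of its cosets: there is one coset of size $|H|$ over each point of the orbit.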

If $N_g$ is another realization of $G$ on a different manifold $\mathcal{N}$, then $M_g$ and $N_g$ are not the same transformations but only different realizations of the same group element. The elements $k$ which are realized by $M_k = \mathrm{id}_{\mathcal{M}}$ constitute a normal subgroup $K \subset G$, $gK = Kg\ \ \forall g \in G$. $G/K$ is a group and is realized faithfully, i.e. only the transformation $M_e$, $e \in G/K$, leaves each $x \in \mathcal{M}$ invariant. The realization of a group by linear transformations is called a representation.

If a group is realized on $\mathcal{M}$ by $M_g$ and on $\mathcal{N}$ by $N_g$, then it is naturally realized on the cartesian product $\mathcal{M} \times \mathcal{N} = \{ (x, y) : x \in \mathcal{M} , y \in \mathcal{N} \}$ by the map $M_g \times N_g : (x, y) \mapsto (M_g x, N_g y)$, which maps each pair of points to the pair of their images.

A map $f : \mathcal{M} \to \mathcal{N}$ is defined by the sets $\mathcal{M}$ and $\mathcal{N}$ and a subset $\Gamma_f \subset \mathcal{M} \times \mathcal{N}$ of the cartesian product, the graph of $f$, which contains for each $x \in \mathcal{M}$ precisely one pair $(x, y)$. This pair defines $f(x)$ by $y = f(x)$, $(x, f(x)) \in \Gamma_f$. In other words, the graph is a section of the product $\mathcal{M} \times \mathcal{N}$ and cuts each fiber $\{ x \} \times \mathcal{N}$ once.

As $M_g$ is invertible, the graph is also the set of pairs $(M_g^{-1} x, f(M_g^{-1} x))$. These pairs are mapped by $M_g \times N_g$ to $(x, N_g f(M_g^{-1} x))$. So $g$, which is realized as a transformation of the manifolds $\mathcal{M}$ and $\mathcal{N}$, naturally transforms the space $\mathcal{F}$ of maps $f$ from $\mathcal{M}$ to $\mathcal{N}$ by the adjoint transformation

$$\mathrm{Ad}_g : \mathcal{F} \to \mathcal{F} , \quad f \mapsto \mathrm{Ad}_g(f) = N_g \circ f \circ M_g^{-1} . \tag{A.35}$$

The adjoint transformations $\mathrm{Ad}_g$ realize the group $G$ on the space $\mathcal{F}$ of maps from $\mathcal{M}$ to $\mathcal{N}$, $\mathrm{Ad}_{g_2} \mathrm{Ad}_{g_1} = \mathrm{Ad}_{g_2 g_1}$.

The endomorphisms $O$ of a vector space $\mathcal{N}$, on which a representation $N_g$ acts, transform by $\mathrm{Ad}_g O = N_g\, O\, N_g^{-1}$. If $N_g = e^{\omega}$ is generated by transformations $\omega$, then $O$ changes to first order in $\omega$ by the infinitesimal adjoint transformation,

$$\mathrm{ad}_\omega(O) = [\omega, O] = \omega O - O \omega , \tag{A.36}$$

the commutator of $\omega$ with $O$. The infinitesimal transformation $\mathrm{ad}_\omega$ generates $\mathrm{Ad}_{e^\omega}$,

$$e^{\mathrm{ad}_\omega} = \mathrm{Ad}_{e^\omega} , \tag{A.37}$$

as both $A(t) = \mathrm{Ad}_{e^{t\omega}}$ and $B(t) = e^{\mathrm{ad}_{t\omega}}$ satisfy $\frac{d}{dt} C(t) = [\omega, C(t)]$ with the initial value $C(0) = 1$, so $A(1) = B(1)$.

A map $f : \mathcal{M} \to \mathcal{N}$ which is invariant under the adjoint transformations of a group is called equivariant. It intertwines the realizations $M$ and $N$,

$$f_{\text{equivariant}} \circ M_g = N_g \circ f_{\text{equivariant}} . \tag{A.38}$$
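Relation (A.37) can be checked numerically for $2 \times 2$ matrices by summing both exponential series. The sketch below is plain Python; the truncation after 40 terms, the sample matrices `w` and `O` and the helper names are assumptions of this illustration, not a general-purpose matrix exponential.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_scaled(alpha, A, B):
    """Return alpha * A + B."""
    return [[alpha * a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, terms=40):
    """Matrix exponential as a truncated power series."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = add_scaled(1.0, term, result)
    return result

def commutator(w, O):
    """ad_w(O) = w O - O w, cf. (A.36)."""
    return add_scaled(-1.0, matmul(O, w), matmul(w, O))

def exp_ad(w, O, terms=40):
    """exp(ad_w) applied to O, as a truncated series of nested commutators."""
    result = [row[:] for row in O]
    term = [row[:] for row in O]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in commutator(w, term)]
        result = add_scaled(1.0, term, result)
    return result

w = [[0.2, 0.5], [-0.3, -0.2]]    # hypothetical generator
O = [[1.0, 2.0], [3.0, 4.0]]      # hypothetical endomorphism
minus_w = [[-x for x in row] for row in w]

lhs = exp_ad(w, O)                                   # exp(ad_w) O
rhs = matmul(matmul(expm(w), O), expm(minus_w))      # exp(w) O exp(-w)
error = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(error < 1e-9)    # True
```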

The value $f(x)$ of an equivariant function at some chosen point $x$ is invariant under its stabilizer: $M_h x = x$ implies $N_h f(x) = f(x)$. This value determines $f$ everywhere in the orbit, $f(M_g x) = N_g f(x)$. This way we determined the momentum $p$ as an equivariant jet function of $t$, $x$, $v$ for a particle at rest and for a boosted particle (3.43).

Let $R$ be a representation of some subgroup $H$ of a group $G$, $R_{h_2} R_{h_1} = R_{h_2 h_1}$, which acts in a vector space $V^\star$ of row vectors by multiplication from the right. The vector space $\mathcal{F}_R$ of maps $\Psi : G \to V^\star$ which satisfy

$$\Psi(g' h) = \Psi(g')\, R_h , \quad \forall g' \in G ,\ h \in H , \tag{A.39}$$

is mapped linearly to itself by the adjoint $\mathrm{Ad}_g$ of left multiplication $M_g : g' \mapsto g\, g'$,

$$(\mathrm{Ad}_g \Psi)(g') = \Psi(g^{-1} g') . \tag{A.40}$$

This representation of $G$ in $\mathcal{F}_R$ is called induced by the representation $R$ of $H$.

A.4 Schur’s Lemma and Tensor Operators

A set $\mathcal{K}$ of endomorphisms $K$ of a vector space $V$, e.g. a representation of a group or an algebra, is called reducible if all $K$ map a proper subspace $U$, $\{ 0 \} \neq U \neq V$, to itself. In a basis of $V$ in which the first vectors span $U$, the matrices corresponding to reducible $K$ have a common block of vanishing matrix elements and are of the form

$$K = \begin{pmatrix} * & * \\ 0 & * \end{pmatrix} . \tag{A.41}$$

A set $\mathcal{K}$ of endomorphisms $K$ is irreducible if $\{ 0 \}$ and $V$ are the only subspaces which are mapped by all $K$ to themselves.²

A linear map $I : V \to V'$ intertwines sets (e.g. groups) $\mathcal{K}$ and $\mathcal{K}'$ of endomorphisms of vector spaces $V$ and $V'$ if to each $K \in \mathcal{K}$ there corresponds a $K' \in \mathcal{K}'$ and vice versa such that

$$K' I = I K . \tag{A.42}$$

Schur's Lemma: If $I \neq 0$ intertwines irreducible $\mathcal{K}$ and $\mathcal{K}'$ then $I$ is invertible. Consequently, if $I \neq 0$ then the intertwined $K'$ and $K$ are equivalent, $K' = I K I^{-1}$.

Proof of the lemma: The image $IV$ is invariant under $K'$ since $K'(Iv) = I(Kv)$. Likewise the nullspace $N_I = \{ n \in V : In = 0 \}$ is invariant under $K$ because of $I(Kn) = K'(In) = 0$. By $I \neq 0$ one has $IV \neq \{ 0 \}$ and $N_I \neq V$. As $\mathcal{K}$ and $\mathcal{K}'$ are irreducible the invariant spaces have to be $IV = V'$ and $N_I = \{ 0 \}$. So $I$ is invertible.

Corollary: If endomorphisms $K$ of a finite dimensional complex vector space are irreducible and commute with an endomorphism $W$, $KW = WK$, then $W = \lambda 1$.

Proof: Each endomorphism $W$ of a finite dimensional complex vector space has an eigenvector $v \neq 0$, $(W - \lambda 1) v = 0$. So $W - \lambda 1$ is not invertible. But it intertwines the irreducible set of endomorphisms $K$ with itself, $K (W - \lambda 1) = (W - \lambda 1) K$. By Schur's lemma noninvertible intertwiners of irreducible $K$ vanish, $W - \lambda 1 = 0$.

² Reducibility depends not only on the matrices which represent $\mathcal{K}$. E.g. the group $SO(2)$ acts irreducibly on $\mathbb{R}^2$, but on $\mathbb{C}^2$ the same matrices have invariant subspaces $V_\pm = \{ \lambda\, (1, \pm i) : \lambda \in \mathbb{C} \}$.

The corollary does not hold for endomorphisms of a real vector space: the abelian group $SO(2)$ is an irreducible set of endomorphisms of $\mathbb{R}^2$ and commutes with each of its elements, not only with multiples of $1$.

Let representations $R_g$ and $L_g$ of some group $G$ act on complex vector spaces $\mathcal{R}$ and $\mathcal{L}$ ($L$, $M$, $R$ for left, middle, right) with bases $e_i$ and $e''_m$. A nonvanishing tensor operator consists of linearly independent linear maps

$$T_j : \mathcal{R} \to \mathcal{L} \tag{A.43}$$

which transform under $\mathrm{Ad}_g$ (A.35) into each other,

$$L_g\, T_j\, R_g^{-1} = T_r\, (M_g)^r{}_j . \tag{A.44}$$

Consequently the $M_g$ represent $G$ in a vector space $\mathcal{M}$ with a basis $e'_j$, $M_{g_2} M_{g_1} = M_{g_2 g_1}$. Denote by $n_L$ the multiplicity with which the product representation $M \otimes R$ contains the irreducible representation $L$. In detail, this means that there are linearly independent vectors $E_{m\alpha} = e'_j \otimes e_i\, c^{ji}{}_{m\alpha}$, $\alpha = 1, \dots, n_L$, which for fixed $\alpha$ span spaces $V_\alpha \subset \mathcal{M} \otimes \mathcal{R}$ on which the group acts by $L$ (we skip the argument $g$),

$$M^r{}_j\, R^s{}_i\, c^{ji}{}_{m\alpha} = c^{rs}{}_{n\alpha}\, L^n{}_m . \tag{A.45}$$

The vectors $(T_j e_i)\, c^{ji}{}_{m\alpha} \in \mathcal{L}$ transform by left multiplication with $L$,

$$L\, (T_j e_i)\, c^{ji}{}_{m\alpha} = (L\, T_j\, R^{-1})(R\, e_i)\, c^{ji}{}_{m\alpha} = T_r e_s\, M^r{}_j\, R^s{}_i\, c^{ji}{}_{m\alpha} = T_r e_s\, c^{rs}{}_{n\alpha}\, L^n{}_m , \tag{A.46}$$

and define by $(W_\alpha)^n{}_m = (T_r e_s)^n\, c^{rs}{}_{m\alpha}$ matrices which commute with $L$, $L W_\alpha = W_\alpha L$. As $L$ is irreducible, Schur's lemma implies $W_\alpha = \lambda_\alpha 1$.

The vectors $E_{m\alpha}$ can be completed to a basis of $\mathcal{M} \otimes \mathcal{R}$ (adapted to invariant subspaces). Its coefficients $c^{ji}{}_{m\alpha}$, the Clebsch-Gordan coefficients, are components of an invertible matrix. Multiplying $(T_r e_s)^n\, c^{rs}{}_{m\alpha} = \lambda_\alpha\, \delta^n{}_m$ with the inverse matrix $(c^{-1})^{m\alpha}{}_{ji}$ yields the Wigner-Eckart theorem that the tensor operators $T_j : \mathcal{R} \to \mathcal{L}$, which transform under the representation $M$, constitute an $n_L$-dimensional space. Each tensor operator is determined,

$$(T_j e_i)^n = (c^{-1})^{n\alpha}{}_{ji}\, \lambda_\alpha , \tag{A.47}$$

by $n_L$ coefficients $\lambda_\alpha$. They are termed reduced matrix elements $\langle L\, \alpha \| T \| R \rangle$. The case that $M \otimes R$ does not contain the representation $L$, $n_L = 0$, implies the selection rule $T_j = 0$. If, for example, $L$, $M$ and $R$ are $SO(3)$ representations with spin $s''$, $s'$ and $s$, then $s'' \in \{ s' + s, s' + s - 1, \dots, |s' - s| \}$ or $\langle L \| T \| R \rangle = 0$.

One can choose the basis vectors as eigenvectors of a maximal compact abelian subgroup $A \subset G$, as each representation of a compact abelian group is equivalent to a unitary representation by diagonalized matrices. In this basis (A.45) implies the selection rule $(T_j e_i)^n = 0$ unless the eigenvalues multiply, $\lambda(e''_n) = \lambda(e'_j)\, \lambda(e_i)$. If $A$ is generated by infinitesimal transformations, then their eigenvalues add or the components $(T_j e_i)^n$ vanish, $m(e''_n) = m(e'_j) + m(e_i)$.

A.5 Complex Conjugate and Contragredient Representations

A complex matrix representation $D$ of a group $G$ defines the conjugate matrices $D^*$,

$$(D^*)^{\dot m}{}_{\dot n} = (D^m{}_n)^* , \tag{A.48}$$

which also represent $G$ because $D_{g_2} D_{g_1} = D_{g_2 g_1}$ implies $D^*_{g_2} D^*_{g_1} = D^*_{g_2 g_1}$. The matrices $D^*$ act in a vector space $V^*$ with a basis which we enumerate with dotted indices $e_{\dot 1}, \dots, e_{\dot d}$ to indicate their complex conjugate transformation, $D^* e_{\dot n} = e_{\dot m}\, (D^*)^{\dot m}{}_{\dot n}$.

Some of the representations $D$ and $D^*$ and their contragredient ones $D^{T\,-1}$ and $D^{*\,T\,-1}$ may be equivalent or equal. If in some basis $D = D^*$ then the representation is real. If a real representation $\hat D$ acts irreducibly in a real vector space $V$, it may become reducible if linearly extended to the complexified vector space $V \otimes_{\mathbb{R}} \mathbb{C}$. Then $\hat D = D \oplus D^*$ is the sum of a complex irreducible representation $D$ and its conjugate $D^*$ [19], e.g. for $G = SO(2)$,

$$\hat D_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = \frac{1}{2i} \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \begin{pmatrix} e^{i\alpha} & \\ & e^{-i\alpha} \end{pmatrix} \begin{pmatrix} i & -1 \\ i & 1 \end{pmatrix} . \tag{A.49}$$
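The factorization (A.49) can be confirmed numerically. The sketch below multiplies the three complex matrices with Python's built-in complex numbers and compares the result with the real rotation matrix; the angle, the tolerance and the helper names are assumptions of this illustration.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(alpha):
    """Real SO(2) matrix, the left-hand side of (A.49)."""
    return [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]

def decomposed(alpha):
    """Right-hand side of (A.49): the diagonal form transformed back."""
    left = [[1, 1], [-1j, 1j]]
    diagonal = [[cmath.exp(1j * alpha), 0], [0, cmath.exp(-1j * alpha)]]
    right = [[1j, -1], [1j, 1]]
    product = matmul(matmul(left, diagonal), right)
    return [[entry / 2j for entry in row] for row in product]

def max_deviation(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

alpha = 0.7    # hypothetical rotation angle
print(max_deviation(rotation(alpha), decomposed(alpha)) < 1e-12)    # True
```

The outer matrices are inverse to each other up to the factor $1/(2i)$, so the product is a similarity transformation of the diagonal matrix.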

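For $D$ to be real, it is not sufficient to be equivalent to $D^*$. For example, the self representation of $SU(2)$ (6.45) is equivalent but not equal to its conjugate,

$$SU(2) = \left\{ g = \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix} : |a|^2 + |b|^2 = 1 \right\} , \quad g = \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} g^* \begin{pmatrix} & -1 \\ 1 & \end{pmatrix} . \tag{A.50}$$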

If $D$ is equivalent to the contragredient complex conjugate representation $D^{*\,T\,-1}$, $D^{*\,T\,-1} = A D A^{-1}$, then $A = D^{*\,T} A D$, which means that the nondegenerate sesquilinear form $\langle u | v \rangle = u^{*\,T} A\, v$ is invariant under $D$. In an orthonormal basis it has the form $\langle u | v \rangle = \sum_{n=1}^{p} (u^n)^* v^n - \sum_{n=p+1}^{p+q} (u^n)^* v^n$. So in this basis $D$ takes values in the group $U(p, q)$ of pseudo-unitary transformations and is called pseudo-unitary.

If $D$ is equivalent to the contragredient representation, $D_g^{T\,-1} = A D_g A^{-1}$, then $D_g^{T\,-1} A = A D_g$ and, as this holds for all $g$, i.e. also for $g^{-1}$, $D_g^T A = A D_g^{-1}$. Transposing the last relation yields $D_g^{T\,-1} A^T = A^T D_g$, implying $D_g^{T\,-1} (A \pm A^T) = (A \pm A^T)\, D_g$. So in the subspaces where $A + A^T$ or $A - A^T$ are invertible, the equivalence transformation can be taken to be symmetric or antisymmetric.

If $V$ is real and $A = \eta$ symmetric, $\eta = \eta^T$, then in an orthonormal basis the quadratic form $u^T \eta\, u$ is a signed sum of squares, $u^T \eta\, u = \sum_{n=1}^{p} (u^n)^2 - \sum_{n=1}^{q} (u^{n+p})^2$. The representation $D$ maps $G$ into the orthogonal group $O(p, q)$ and is called an orthogonal or Lorentzian representation. If $A = j$ is antisymmetric, $j^T = -j$, then in a suitable basis it is of the form

$$j = \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} , \quad j_{mn} = \delta_{m+d,n} - \delta_{m-d,n} , \quad m, n \in \{ 1, \dots, 2d \} , \tag{A.51}$$

where $1$ denotes the $d \times d$ unit matrix. Such a matrix $j$ occurs e.g. as value of the Poisson bracket of the phase space coordinates in Hamiltonian mechanics with $d$ degrees of freedom, $\{ x^m, x^n \}_{\text{Poisson}} = j^{mn}$. The linear group which leaves the antisymmetric, bilinear form $\langle u, v \rangle = u^m j_{mn} v^n$ invariant is called $SP(2d)$, the symplectic group in $2d$ dimensions. Correspondingly, a representation which maps $G$ into the symplectic group is termed symplectic.

A.6 Finite Dimensional Representations of SL(2, ℂ)

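The transformations $D \in SL(2, \mathbb{C})$, which act on complex two-dimensional vectors $u \in \mathbb{C}^2$, the Weyl spinors, by³

$$D : u \mapsto Du , \quad (Du)_\alpha = D_\alpha{}^\beta u_\beta , \quad \alpha, \beta \in \{ 1, 2 \} , \quad \det D = 1 , \tag{A.52}$$

are symplectic, because they leave invariant the antisymmetric bilinear form $\langle\ ,\ \rangle$,

$$\langle u, v \rangle = u^T \varepsilon\, v = u_\alpha\, \varepsilon^{\alpha\beta}\, v_\beta , \quad \varepsilon^{\alpha\beta} = -\varepsilon^{\beta\alpha} , \quad \varepsilon^{12} = 1 , \quad \varepsilon = \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} = -\varepsilon^T , \tag{A.53}$$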

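as $D^T \varepsilon D$ is antisymmetric and therefore proportional to $\varepsilon$. The factor of proportionality, $D_\gamma{}^1\, \varepsilon^{\gamma\delta}\, D_\delta{}^2 = D_1{}^1 D_2{}^2 - D_2{}^1 D_1{}^2 = \det D\ \varepsilon^{12}$, is $1$, which proves $D^T \varepsilon D = \varepsilon$. So $\langle u, v \rangle = \langle Du, Dv \rangle$ is invariant, and $D$ and $D^{T\,-1} = \varepsilon D \varepsilon^{-1}$ are equivalent. The nondegenerate symplectic form $\langle u, v \rangle$ defines an invariant, invertible map $\varepsilon : u \mapsto \varepsilon u$ from $V = \mathbb{C}^2$ to its dual $V^\star$, $(\varepsilon u)(v) = -\langle u, v \rangle$, and allows the shorthand notation of raised and lowered indices, $u^1 = u_2$, $u^2 = -u_1$,

$$u^\alpha = \varepsilon^{\alpha\beta} u_\beta , \quad u_\beta = \varepsilon^{-1}_{\beta\gamma}\, u^\gamma , \quad \bar u^{\dot\alpha} = \varepsilon^{\dot\alpha \dot\beta}\, \bar u_{\dot\beta} , \quad \bar u_{\dot\beta} = \varepsilon^{-1}_{\dot\beta \dot\gamma}\, \bar u^{\dot\gamma} . \tag{A.54}$$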
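The statement that $D^T \varepsilon D$ equals $\det D$ times $\varepsilon$ for every $2 \times 2$ matrix, and hence $\varepsilon$ itself for $\det D = 1$, can be checked with Python's built-in complex numbers. The sample matrix `D`, its rescaling to unit determinant, the sample spinors and the helper names are assumptions of this illustration.

```python
EPS = [[0, 1], [-1, 0]]    # the matrix epsilon of (A.53)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [list(column) for column in zip(*A)]

def det(D):
    return D[0][0] * D[1][1] - D[0][1] * D[1][0]

def transformed_epsilon(D):
    """D^T epsilon D."""
    return matmul(matmul(transpose(D), EPS), D)

def form(u, v):
    """<u, v> = u^T epsilon v."""
    return sum(u[a] * EPS[a][b] * v[b] for a in range(2) for b in range(2))

def apply(D, u):
    return [sum(D[a][b] * u[b] for b in range(2)) for a in range(2)]

D = [[1 + 2j, 3], [1j, 2 - 1j]]                     # hypothetical matrix, det D = 4
S = [[entry / 2 for entry in row] for row in D]     # rescaled, det S = 1

print(det(D))                                           # (4+0j)
print(transformed_epsilon(D) == [[0, 4], [-4, 0]])      # True
print(transformed_epsilon(S) == EPS)                    # True

u, v = [1 + 1j, 2], [3, -1j]
print(form(apply(S, u), apply(S, v)) == form(u, v))     # True
```

The entries were chosen as small Gaussian integers so that the floating point arithmetic is exact and the comparisons need no tolerance.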

As $\varepsilon$ is antisymmetric, a sum $u^\alpha v_\alpha = \varepsilon^{\alpha\beta} u_\beta v_\alpha = -u_\beta v^\beta$ changes sign if one lowers the first and raises the second index. If for readability we omit summation indices, then the first undotted index of the pair is understood to be up and the first dotted index down, e.g. $u \sigma^m \bar v$ is the shorthand notation of $u^\alpha\, \sigma^m_{\alpha\dot\beta}\, \bar v^{\dot\beta}$, and $(uv)^* = \bar v \bar u$.

Note that the momentum wavefunctions of spin-1/2 states are not Weyl fields. They transform differently under Lorentz boosts: the wave functions transform by multiplication with the unitary representation of the Wigner rotation, the Weyl fields transform by multiplication with the nonunitary spin-1/2 boost matrices (6.66).

Further representations of $SL(2, \mathbb{C})$ are obtained from the repeated tensor product of the spinor representation with itself,

$$D_{k/2} : u_{\alpha_1 \dots \alpha_k} \mapsto D_{\alpha_1}{}^{\beta_1} \cdots D_{\alpha_k}{}^{\beta_k}\, u_{\beta_1 \dots \beta_k} . \tag{A.55}$$

³ We denote the spinor components by lower Greek indices to conform with (6.62) and our convention to denote the matrix elements of the Pauli matrices by $\sigma^m_{\alpha\dot\beta}$.


Products with the contragredient representation $D^{T\,-1}$ need not be considered, as they are equivalent to products with $D$ and do not yield inequivalent new representations. It is also sufficient to consider only completely symmetric tensor products, because an antisymmetric product such as

$$u_{\alpha\beta} = -u_{\beta\alpha} \ \Leftrightarrow\ u_{\alpha\beta} = \varepsilon^{-1}_{\alpha\beta}\, c , \quad c = \tfrac{1}{2}\, \varepsilon^{\delta\gamma} u_{\gamma\delta} , \tag{A.56}$$

is proportional to $\varepsilon^{-1}$ and therefore equivalent to a lower tensor product.

A basis of the symmetric $k$-fold tensor product $V_{k/2}$ of the spinor space $V$ with basis $e_1$, $e_2$ is given by the symmetrized product of $k - k'$ factors $e_1$ with $k'$ factors $e_2$, $k' = 0, 1, \dots, k$. So $V_{k/2}$ has dimension $k + 1$. Under rotations around the 3-axis by the angle $\alpha$ these basic products are multiplied by $e^{-i (k - 2k') \alpha / 2}$. So they span an irreducible $SU(2)$ multiplet with spin $s = k/2$ (B.65). For $k > 0$ this representation of $SL(2, \mathbb{C})$ is not unitary, though restricted to $SU(2)$ it is.

The representation $D^*$ of $SL(2, \mathbb{C})$ by the complex conjugate matrices

$$D^* : u^* \mapsto D^* u^* , \quad (D^* u^*)_{\dot\alpha} = D^*_{\dot\alpha}{}^{\dot\beta}\, u^*_{\dot\beta} \tag{A.57}$$

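is inequivalent to $D$, $D^* \neq A D A^{-1}$, as $D$ and $A D A^{-1}$ have the same eigenvalues but $D^*$ has the conjugate eigenvalues, and conjugation does not map the eigenvalues $\{ \lambda, 1/\lambda \}$ of each $SL(2, \mathbb{C})$ matrix to themselves, e.g. $\{ 2i, -i/2 \}^* \neq \{ 2i, -i/2 \}$. $SU(2)$ matrices $g = e^{-i\alpha\, n \cdot \sigma}$ are equivalent to $g^*$, $g = \varepsilon\, g^* \varepsilon^{-1}$ (A.50), which means

$$\varepsilon\, \sigma^{i\,*}\, \varepsilon^{-1} = -\sigma^i . \tag{A.58}$$

So $\varepsilon\, (e^H)^* \varepsilon^{-1} = e^{-H}$ for hermitean traceless $2 \times 2$ matrices $H$. The matrices $\varepsilon D^* \varepsilon^{-1}$ represent $g\, e^H \in SL(2, \mathbb{C}) \sim S^3 \times \mathbb{R}^3$ by their image under inversion at $S^3$,

$$\varepsilon\, (g\, e^H)^* \varepsilon^{-1} = g\, e^{-H} . \tag{A.59}$$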

Just as $D$, the complex conjugate transformation $D^*$ is equivalent to its contragredient transformation. To obtain further representations from $D^*$ it therefore suffices to consider the symmetric tensor product $V^*_{l/2}$ of $l$ factors of the vector space $V^* = \mathbb{C}^2$. It is $(l + 1)$-dimensional and carries an irreducible spin $s = l/2$ representation of $SU(2)$.

Finally the tensor product $V_{k/2,l/2} = V_{k/2} \otimes V^*_{l/2}$ is the $(k + 1)(l + 1)$-dimensional complex space on which the representation $D_{k/2,l/2}$ acts. It is the most general finite-dimensional, irreducible representation of $SL(2, \mathbb{C})$,

$$(D_{k/2,l/2}\, u)_{\alpha_1 \dots \alpha_k \dot\beta_1 \dots \dot\beta_l} = D_{\alpha_1}{}^{\gamma_1} \cdots D_{\alpha_k}{}^{\gamma_k}\ D^*_{\dot\beta_1}{}^{\dot\delta_1} \cdots D^*_{\dot\beta_l}{}^{\dot\delta_l}\ u_{\gamma_1 \dots \gamma_k \dot\delta_1 \dots \dot\delta_l} . \tag{A.60}$$

It transforms tensors with components which are denoted by k symmetric undotted indices and l symmetric dotted indices.


Mathematical investigations [7] show that each finite-dimensional representation of $SL(2, \mathbb{C})$ can be decomposed into a sum of irreducible representations and that each irreducible, finite-dimensional representation of $SL(2, \mathbb{C})$ is equivalent to some $D_{k/2,l/2}$. The representation $D_{k/2,l/2}$ is inequivalent to $D_{k'/2,l'/2}$ if $(k, l) \neq (k', l')$.

The representations $D_{k/2,l/2}$ of $SL(2, \mathbb{C})$ are compatible with the reality condition

$$(u_{\alpha_1 \dots \alpha_k \dot\beta_1 \dots \dot\beta_l})^* = u_{\beta_1 \dots \beta_l \dot\alpha_1 \dots \dot\alpha_k} , \tag{A.61}$$

which for $k \neq l$ holds in the reducible sum $V_{k/2,l/2} \oplus V_{l/2,k/2}$ of two conjugate representation spaces. Only for $k = l$ does the reality condition define a real invariant subspace of a complex irreducible representation.

The restricted Lorentz group acts on spinors by $SL(2, \mathbb{C})$ transformations (6.66),

$$SL(2, \mathbb{C}) = \left\{ u\, e^H : u \in SU(2) ,\ H = H^\dagger ,\ \mathrm{tr}\, H = 0 \right\} , \tag{A.62}$$

where $u$ corresponds to a rotation $O$ and $e^H$ to a boost $L_p$. By (8.116) the spinorial representations $\mathcal{P}$ and $\mathcal{T}$ of spatial and temporal reflection have to satisfy, using (A.59),

$$\mathcal{P}\, u\, e^H\, \mathcal{P}^{-1} = \mathcal{T}\, u\, e^H\, \mathcal{T}^{-1} = u\, e^{-H} = \varepsilon\, (u\, e^H)^* \varepsilon^{-1} . \tag{A.63}$$

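The conjugate representation $(u\, e^{H})^{*}$ of $SL(2,\mathbb{C})$ is inequivalent to $(u\, e^{H})$. So by (A.63) $P$ maps $\mathbb{C}^2$ to a different space in which $SL(2,\mathbb{C})$ is represented by $\varepsilon D^{*} \varepsilon^{-1}$. The representation of the spin group, the group which is generated by $P$ and $SL(2,\mathbb{C})$, is reducible when restricted to $SL(2,\mathbb{C})$,

\[ \hat D : \Psi \mapsto \begin{pmatrix} D & \\ & \varepsilon D^{*} \varepsilon^{-1} \end{pmatrix} \Psi . \tag{A.64} \]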

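A vector of this complex four-dimensional representation space of the spin group, a Dirac spinor $\Psi$, consists of two Weyl spinors $\psi$ and $\varepsilon\bar\psi$ which transform under $D$ and $\varepsilon D^{*}\varepsilon^{-1}$,

\[ \Psi = \begin{pmatrix} \psi \\ \varepsilon\bar\psi \end{pmatrix} . \tag{A.65} \]

By choice of their relative phase and by $P^2 = 1$ (8.120) parity simply reflects the two Weyl spinors into each other,

\[ P\,\Psi = \begin{pmatrix} & 1 \\ 1 & \end{pmatrix} \Psi . \tag{A.66} \]

The spinor

\[ C\Psi = \begin{pmatrix} \bar\psi^{*} \\ \varepsilon\psi^{*} \end{pmatrix} = \begin{pmatrix} & \varepsilon^{-1} \\ \varepsilon & \end{pmatrix} \Psi^{*} \tag{A.67} \]

is the charge conjugated spinor. A Dirac spinor which is invariant under charge conjugation, $\psi^{*} = \bar\psi$, is called a Majorana spinor. Taking $T$ to act conjugate linearly by

\[ T = \begin{pmatrix} \varepsilon & \\ & \varepsilon^{-1} \end{pmatrix} \ast \, , \tag{A.68} \]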

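we confirm $T\,(u e^{H})\,T^{-1} = \varepsilon\,(u e^{H})^{*}\,\varepsilon^{-1}$ (A.63) and $T^2 = -1$ (8.120). If one indicates the $SL(2,\mathbb{C})$-transformation of a Dirac spinor $\Psi$ by the index position of its components then $\Psi = (\psi_1, \psi_2, \bar\psi^{\dot 1}, \bar\psi^{\dot 2}) = (\psi_1, \psi_2, \bar\psi_{\dot 2}, -\bar\psi_{\dot 1})$. Define the barred Pauli matrices $\bar\sigma$ as

\[ \bar\sigma^{n} = \varepsilon\, \sigma^{n\,T} \varepsilon^{-1} , \quad (\bar\sigma^{0}, \bar\sigma^{1}, \bar\sigma^{2}, \bar\sigma^{3}) = (\sigma^{0}, -\sigma^{1}, -\sigma^{2}, -\sigma^{3}) . \tag{A.69} \]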

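With (6.48) one confirms

\[ \sigma^{m} \bar\sigma^{n} = \eta^{mn}\, 1_{2\times 2} - \frac{i}{2}\, \varepsilon^{mnkl} \sigma_k \bar\sigma_l , \quad \bar\sigma^{m} \sigma^{n} = \eta^{mn}\, 1_{2\times 2} + \frac{i}{2}\, \varepsilon^{mnkl} \bar\sigma_k \sigma_l . \tag{A.70} \]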

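A Weyl field $\psi(x)$ with an undotted lower index, which satisfies the Klein-Gordon equation $(\Box + m^2)\psi = 0$, $m > 0$, defines by

\[ m\, \varepsilon\bar\psi = i\, \bar\sigma^{n} \partial_n \psi \tag{A.71} \]

a corresponding Weyl field $\varepsilon\bar\psi$ with a dotted upper index. It solves $i \sigma^{n} \partial_n\, \varepsilon\bar\psi = m \psi$ and the Dirac spinor $\Psi = (\psi, \varepsilon\bar\psi)$ fulfills by construction the Dirac equation

\[ \bigl( i \gamma^{n} \partial_n - m \bigr) \Psi = 0 \tag{A.72} \]

with $4\times 4$ matrices $\gamma^{n}$, the gamma matrices,

\[ \gamma^{n} = \begin{pmatrix} & \sigma^{n} \\ \bar\sigma^{n} & \end{pmatrix} . \tag{A.73} \]

They satisfy the Dirac algebra (which mathematicians term Clifford algebra)

\[ \gamma^{m} \gamma^{n} + \gamma^{n} \gamma^{m} = 2\, \eta^{mn}\, 1 . \tag{A.74} \]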

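Vice versa, the projectors $P_L$ and $P_R$,

\[ P_L = \frac{1 - i\gamma^5}{2} , \quad P_R = \frac{1 + i\gamma^5}{2} , \quad \gamma^5 = \gamma^0 \gamma^1 \gamma^2 \gamma^3 = i \begin{pmatrix} 1 & \\ & -1 \end{pmatrix} , \tag{A.75} \]

applied to a solution $\Psi$ of the Dirac equation (A.72) define by $P_L\Psi = (\psi, 0)$ and $P_R\Psi = (0, \varepsilon\bar\psi)$ Weyl fields $\psi$ and $\bar\psi$ which satisfy $(\Box + m^2)\Psi = 0$. If $m = 0$ then the Weyl fields $P_L\Psi$ and $P_R\Psi$ satisfy the Dirac equation separately.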
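The algebra (A.74) and the form of $\gamma^5$ in (A.75) can be checked numerically. The following is a minimal sketch, assuming the metric $\eta = \mathrm{diag}(1,-1,-1,-1)$ and the standard Pauli matrices; the helper names (`matmul`, `block`, `clifford_holds`) are illustrative and not taken from the text.

```python
# Sketch: gamma matrices in the block form of (A.73), built from Pauli matrices.
# Assumption: eta = diag(1, -1, -1, -1); all helper names are hypothetical.

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(ul, ur, ll, lr):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [ul[i] + ur[i] for i in range(2)] + [ll[i] + lr[i] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
ONE4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ETA = [1, -1, -1, -1]  # diagonal of the assumed metric

SIGMA = [ONE2, [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
SIGMA_BAR = [SIGMA[0]] + [scale(-1, s) for s in SIGMA[1:]]          # (A.69)
GAMMA = [block(ZERO2, SIGMA[n], SIGMA_BAR[n], ZERO2) for n in range(4)]  # (A.73)

def clifford_holds():
    """Check gamma^m gamma^n + gamma^n gamma^m = 2 eta^{mn} for all m, n."""
    for m in range(4):
        for n in range(4):
            anti = add(matmul(GAMMA[m], GAMMA[n]), matmul(GAMMA[n], GAMMA[m]))
            eta_mn = ETA[m] if m == n else 0
            if not close(anti, scale(2 * eta_mn, ONE4)):
                return False
    return True

GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
P_LEFT = scale(0.5, add(ONE4, scale(-1j, GAMMA5)))  # (1 - i gamma^5) / 2
```

In this sketch `P_LEFT` comes out as the projector onto the upper two components, in line with $P_L\Psi = (\psi, 0)$.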


A.7 Baker Campbell Hausdorff Formula

Let $X$ denote an endomorphism of a finite-dimensional vector space $V$. Its commutator defines the infinitesimal adjoint map $\mathrm{ad}_X$ of endomorphisms $Y$ (A.36),

\[ \mathrm{ad}_X : Y \mapsto [X, Y] = X Y - Y X . \tag{A.76} \]

The Baker Campbell Hausdorff formula [25] shows that for sufficiently small $X$ and $Y$ the product $e^{X} e^{Y}$ of exponentials is an exponential $e^{Z(X,Y)}$ where $Z$ is $X + Y$ plus a power series in $\mathrm{ad}_X$ and $\mathrm{ad}_Y$ applied to $[X,Y]$.

A Lie algebra is a real vector space $L$ with an antisymmetric, bilinear nonassociative product, the Lie-bracket, ($A, B, C \in L$, $a \in \mathbb{R}$)

\[ [A, B] = -[B, A] , \quad [A + B, C] = [A, C] + [B, C] , \quad [aA, B] = a\,[A, B] \tag{A.77} \]

which satisfies the Jacobi-Identity [A, [B,C]] + [B, [C, A]] + [C, [A, B]] = 0 .

(A.78)

A representation $R$ of a Lie algebra maps it linearly to the real algebra of endomorphisms of some vector space $D$ (not necessarily finite dimensional) such that each Lie bracket is mapped to the commutator of its factors,

\[ R_{[A,B]} = R_A R_B - R_B R_A . \tag{A.79} \]

We call the representation unitary or a $*$-representation if $D$ is a dense subspace of a Hilbert space and if for all $A \in L$ and $\Phi, \Psi \in D$

\[ \langle R(A)\Phi \,|\, \Psi \rangle = - \langle \Phi \,|\, R(A)\Psi \rangle . \tag{A.80} \]

By the Baker Campbell Hausdorff formula there corresponds to each finite dimensional representation of a Lie algebra the group of products of exponentiated representation matrices.

To derive the formula for $Z(X,Y)$ in $e^{Z(X,Y)} = e^{X} e^{Y}$ recall that $\mathrm{ad}_X$ generates the adjoint transformation of $e^{X}$ (A.37),

\[ \mathrm{Ad}_{e^{X}}\, Y = e^{X}\, Y\, e^{-X} = e^{\mathrm{ad}_X}\, Y . \tag{A.81} \]

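Let $t \mapsto Z(t)$ denote a curve in the space of matrices which for $t = 0$ passes the point $Z(0) = X$ with tangent vector $\frac{dZ}{dt}(0) = Y$. For the derivative of $e^{Z(t)}$ at $t = 0$ we write

\[ \left.\frac{d}{dt}\right|_{t=0} e^{Z(t)} = \Delta(X, Y) . \tag{A.82} \]

It is linear in $Y$, $\Delta(X, aY) = a\,\Delta(X, Y)$, and coincides with $Y$ for $X = 0$, $\Delta(0, Y) = Y$. Up to a factor $e^{X}$ the derivative $\Delta(X, Y)$ is a power series in $\mathrm{ad}_X$ applied to $Y$ because for each natural number $m$ one has

\[ e^{X + tY} = \bigl( e^{\frac{X}{m} + t \frac{Y}{m}} \bigr)^{m} . \tag{A.83} \]

Differentiation of the $m$ factors yields

\[ \left.\frac{d}{dt}\right|_{t=0} \bigl( e^{\frac{X}{m} + t \frac{Y}{m}} \bigr)^{m} = \sum_{k=0}^{m-1} \bigl( e^{\frac{X}{m}} \bigr)^{m-k-1}\, \Delta\Bigl(\frac{X}{m}, \frac{Y}{m}\Bigr)\, \bigl( e^{\frac{X}{m}} \bigr)^{k} = e^{\frac{m-1}{m} X}\, \frac{1}{m} \sum_{k=0}^{m-1} \bigl( \mathrm{Ad}_{e^{-X/m}} \bigr)^{k}\, \Delta\Bigl(\frac{X}{m}, Y\Bigr) . \tag{A.84} \]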

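By $\frac{1}{m} \sum_{k=0}^{m-1} q^{k} = \frac{1 - q^{m}}{m(1 - q)}$ and by (A.81) the sum of maps tends for $m \to \infty$ to

\[ \frac{1}{m} \sum_{k=0}^{m-1} \bigl( \mathrm{Ad}_{e^{-X/m}} \bigr)^{k} = \frac{1 - e^{-\mathrm{ad}_X}}{m\,(1 - e^{-\mathrm{ad}_X / m})} \to \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} = \sum_{n=0}^{\infty} \frac{(-1)^{n}}{(n+1)!}\, (\mathrm{ad}_X)^{n} . \tag{A.85} \]

The other factors in (A.84) tend for $m \to \infty$ to $e^{X}$ and $\Delta(0, Y) = Y$. As (A.84) is continuous in these factors and does not depend on $m$, one has

\[ \left.\frac{d}{dt}\right|_{t=0} e^{X + tY} = e^{X}\, \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}\, Y , \qquad \frac{d}{dt}\, e^{Z(t)} = e^{Z}\, \frac{1 - e^{-\mathrm{ad}_Z}}{\mathrm{ad}_Z}\, \frac{dZ}{dt} . \tag{A.86} \]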

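For $Z(t)$, which emerges upon switching on $Y$ in a product of exponentials

\[ e^{Z(t)} = e^{X} e^{tY} , \tag{A.87} \]

this means

\[ e^{-Z(t)}\, \frac{d}{dt}\, e^{Z(t)} = \frac{1 - e^{-\mathrm{ad}_Z}}{\mathrm{ad}_Z}\, \frac{dZ}{dt} = Y . \tag{A.88} \]

This series in $\mathrm{ad}_Z$ is invertible, if $X$ and $Y$ and consequently $Z$ are small enough. The reciprocal of $\frac{1 - e^{-s}}{s}$ is $\frac{\ln e^{s}}{1 - e^{-s}}$. So

\[ \frac{dZ}{dt} = \frac{\ln e^{\mathrm{ad}_Z}}{1 - e^{-\mathrm{ad}_Z}}\, Y = g(e^{\mathrm{ad}_Z})\, Y , \tag{A.89} \]

where the function

\[ g(s) = \frac{\ln s}{1 - s^{-1}} = \frac{s \ln s}{s - 1} = \frac{(1 + (s-1)) \ln(1 + (s-1))}{s - 1} = 1 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n(n+1)}\, (s-1)^{n} \tag{A.90} \]

is analytic within $|s - 1| < 1$. Because of $e^{\mathrm{ad}_Z} = \mathrm{Ad}_{e^{Z}} = \mathrm{Ad}_{e^{X} e^{tY}} = \mathrm{Ad}_{e^{X}} \mathrm{Ad}_{e^{tY}} = e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}$ we finally arrive at

\[ \frac{dZ}{dt} = \frac{e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\, \ln\bigl( e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} \bigr)}{e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - 1}\, Y . \tag{A.91} \]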


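Integrated over $t$ with $Z(0) = X$ one obtains with $Z = Z(1)$ the Baker Campbell Hausdorff formula for $Z$ in $e^{Z} = e^{X} e^{Y}$

\[ Z = X + \int_0^1 \! dt\, \frac{e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}\, \ln\bigl( e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} \bigr)}{e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - 1}\, Y = X + Y + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n(n+1)} \int_0^1 \! dt\, \bigl( e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - 1 \bigr)^{n}\, Y . \tag{A.92} \]

The series $e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - 1$ is at least linear in $\mathrm{ad}_X$ and $\mathrm{ad}_Y$,

\[ \bigl[ 1 + \mathrm{ad}_X + \tfrac{1}{2} (\mathrm{ad}_X)^2 + \dots \bigr] \bigl[ 1 + t\, \mathrm{ad}_Y + \tfrac{t^2}{2} (\mathrm{ad}_Y)^2 + \dots \bigr] - 1 = \mathrm{ad}_X + t\, \mathrm{ad}_Y + \tfrac{1}{2} (\mathrm{ad}_X)^2 + t\, \mathrm{ad}_X \mathrm{ad}_Y + \tfrac{t^2}{2} (\mathrm{ad}_Y)^2 + \dots \tag{A.93} \]

The $t$-integration of the monomials in $(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - 1)^{n}$ with $k$ factors $t\,\mathrm{ad}_Y$ yields a coefficient $1/(k+1)$. Up to terms with three or more commutators one obtains

\[ Z = X + Y + \frac{1}{2} [X, Y] + \frac{1}{12} [X, [X, Y]] - \frac{1}{12} [Y, [X, Y]] + \dots \tag{A.94} \]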
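The truncation (A.94) can be tested numerically for small matrices. The sketch below is only an illustration under stated assumptions: the $2\times 2$ matrices `X` and `Y` are arbitrary small examples, and the matrix exponential is a plain truncated Taylor series (`expm`), which is adequate only because the entries are small.

```python
# Sketch: compare exp(X) exp(Y) with exp(Z), Z truncated as in (A.94).
# Assumptions: X, Y are arbitrary small example matrices; expm is a Taylor sum.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(*ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=30):
    """Truncated Taylor series of the matrix exponential."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def bch3(x, y):
    """X + Y + [X,Y]/2 + [X,[X,Y]]/12 - [Y,[X,Y]]/12, as in (A.94)."""
    c = commutator(x, y)
    return add(x, y, scale(0.5, c),
               scale(1.0 / 12.0, commutator(x, c)),
               scale(-1.0 / 12.0, commutator(y, c)))

def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

X = [[0.0, 0.02], [-0.01, 0.0]]
Y = [[0.01, 0.0], [0.03, -0.01]]

product = matmul(expm(X), expm(Y))
error_naive = max_abs_diff(product, expm(add(X, Y)))   # ignores all commutators
error_bch3 = max_abs_diff(product, expm(bch3(X, Y)))   # keeps terms of (A.94)
```

For these inputs `error_bch3` should be smaller than `error_naive` by several orders of magnitude, since the neglected terms contain three or more commutators.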

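The example

\[ X = \pi \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} , \quad e^{X} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} , \quad Y = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \quad e^{Y} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} , \quad e^{X} e^{Y} = M = \begin{pmatrix} -1 & -1 \\ 0 & -1 \end{pmatrix} \tag{A.95} \]

shows that the Baker Campbell Hausdorff formula holds only for sufficiently small $X$ and $Y$ because $M$ is not the exponential of a matrix (6.71).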

A.8 Hopf Bundle

Each subgroup $H$ decomposes a group $G$ into cosets $gH = \{\, gh : h \in H \,\}$. They are either disjoint, $gH \cap g'H = \emptyset$, or, if $g^{-1} g' \in H$, identical, and are equal in the sense that each can be mapped continuously and invertibly to each other, $g'H = (g' g^{-1})\, gH$. So $G$ is a bundle over $G/H$, the set of all cosets. There is a projection $\pi$ of $G$ to the base manifold $G/H$

\[ \pi : g \mapsto gH \tag{A.96} \]

and each fiber, the preimage of a point in $G/H$, is diffeomorphic to $H$. By left multiplication with $g' \in G$ the fibers $gH$ are mapped to fibers $g' gH$, so $G$ acts as a transformation group on $G/H$ and $H$ is the stabilizer of the coset $eH$.


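The bundle $G$ can fail to be the product $G/H \times H$ of the base manifold with the fiber. We show this for $H = U(1)$ and $G = SU(2)$ (6.45)

\[ H = \left\{ \begin{pmatrix} e^{-i\varphi} & \\ & e^{i\varphi} \end{pmatrix} : \varphi \in \mathbb{R} \right\} , \quad G = \left\{ \begin{pmatrix} a & -b^{*} \\ b & a^{*} \end{pmatrix} : |a|^2 + |b|^2 = 1 \right\} . \tag{A.97} \]

As a manifold $SU(2)$ is the three-dimensional sphere $S^3$ and the subgroup $H$ a circle $S^1$. The decomposition of $S^3$ into $S^1$-fibers $gH$ is its Hopf fibration. Each fiber can be written with a fixed value of $0 \le C \le 1$, $S = \sqrt{1 - C^2}$ and $\alpha \in [0, 2\pi)$ as the set

\[ gH = \left\{ \begin{pmatrix} C\, e^{i\alpha} & -S\, e^{i\alpha} \\ S\, e^{-i\alpha} & C\, e^{-i\alpha} \end{pmatrix} \begin{pmatrix} e^{i\varphi} & \\ & e^{-i\varphi} \end{pmatrix} ,\ \varphi \in \mathbb{R} \right\} \tag{A.98} \]

because for $a\, b = g_{11}\, g_{21} \neq 0$ their common phase defines $\varphi$. For $C = 0$ and $C = 1$ the coset is independent of $\alpha$. So the points of $G/H$ correspond one-to-one to the points of a sphere $S^2$ (there the poles are independent of the longitude).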

Fig. A.4 Stereographic Projection of the Hopf Fibration of S3 [51]

The real and imaginary parts of $2ab^{*} = 2\,(v + iz)(x - iy)$ and also $|a|^2 - |b|^2$ are independent of the common phase of $a$ and $b$. So

\[ \pi : (v, x, y, z) \in S^3 \mapsto \bigl( 2(vx + zy),\ 2(zx - vy),\ v^2 + z^2 - x^2 - y^2 \bigr) \in S^2 \tag{A.99} \]

is constant along each fiber and projects $S^3$ to $S^2 \subset \mathbb{R}^3$ because of

\[ |2ab|^2 + \bigl( |a|^2 - |b|^2 \bigr)^2 = \bigl( |a|^2 + |b|^2 \bigr)^2 = 1 . \tag{A.100} \]
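The two properties of the projection (A.99), that its image lies on $S^2$ and that it is constant along each fiber, can be spot-checked on random points. This is a sketch with hypothetical helper names (`hopf`, `hopf_real`, `random_point_on_s3`, `check`); it assumes $a = v + iz$ and $b = x + iy$ as in the text.

```python
import cmath
import math
import random

def hopf(a, b):
    """Hopf map written with complex a, b: (Re 2ab*, Im 2ab*, |a|^2 - |b|^2)."""
    w = 2 * a * b.conjugate()
    return (w.real, w.imag, abs(a) ** 2 - abs(b) ** 2)

def hopf_real(v, x, y, z):
    """The same map in the real coordinates of (A.99)."""
    return (2 * (v * x + z * y), 2 * (z * x - v * y),
            v * v + z * z - x * x - y * y)

def random_point_on_s3(rng):
    """Normalized Gaussian vector, i.e. a point on the unit sphere in R^4."""
    c = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(t * t for t in c))
    v, x, y, z = (t / norm for t in c)
    return v, x, y, z

def check(samples=100, seed=0, tol=1e-12):
    rng = random.Random(seed)
    for _ in range(samples):
        v, x, y, z = random_point_on_s3(rng)
        a, b = complex(v, z), complex(x, y)
        p = hopf(a, b)
        # Move along the fiber: multiply a and b by a common phase.
        phase = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
        q = hopf(a * phase, b * phase)
        on_sphere = abs(sum(t * t for t in p) - 1.0) < tol
        fiber_constant = all(abs(s - t) < tol for s, t in zip(p, q))
        same_formula = all(abs(s - t) < tol
                           for s, t in zip(p, hopf_real(v, x, y, z)))
        if not (on_sphere and fiber_constant and same_formula):
            return False
    return True
```

A call such as `check()` returns `True` if all sampled points pass the three comparisons within the chosen tolerance.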

Though $S^3$ projects to $S^2$ with $S^1$-fibers, it cannot factorize, $S^3 \neq S^2 \times S^1$, as each closed path in $S^3$ can be continuously contracted to a point, i.e. $S^3$ is simply connected, but $S^1$, and with it $S^2 \times S^1$, is not.

Appendix B

Aspects of Quantization

B.1 Graded Commutative Algebra

The values of classical bosonic and fermionic fields and their derivatives at finitely many points generate a graded, commutative ring over the complex numbers with a unit element. In products the factors can be reordered at the cost of a sign for each transposition of two fermionic factors. We enumerate the generating variables $\phi^{i}$ by an index and distinguish bosons from fermions by their grading $|i|$,

\[ |i| = \begin{cases} 0 & \text{if } \phi^{i} \text{ is bosonic} \\ 1 & \text{if } \phi^{i} \text{ is fermionic} \end{cases} . \tag{B.1} \]

By assumption they have an associative, distributive, graded commutative product

\[ \phi^{i} \phi^{j} = (-1)^{|i|\,|j|}\, \phi^{j} \phi^{i} , \tag{B.2} \]

i.e. bosons commute with bosons and fermions, fermions commute with bosons and anticommute with fermions. Consequently, a permutation $\pi$, which consists of an odd number of transpositions of fermionic factors, changes the sign of a product,

\[ \phi^{i_1} \phi^{i_2} \dots \phi^{i_n} = \mathrm{sign}_F(\pi)\, \phi^{i_{\pi(1)}} \phi^{i_{\pi(2)}} \dots \phi^{i_{\pi(n)}} . \tag{B.3} \]

The algebra does not contain any relations in addition to these graded commutation rules and is assumed to be expandable by further bosonic or fermionic elements, e.g. parameters which had not been listed previously or fields at additional points. In products of monomials $A(\phi)\, B(\phi)$ one can commute the factors of $A$ with the factors of $B$ at the cost of a sign if both $A$ and $B$ contain an odd number of fermionic factors. So monomials are graded commutative and the grading of a product is the sum of the gradings of its factors modulo 2,

\[ |AB| = |A| + |B| \mod 2 . \tag{B.4} \]


In words, boson times boson is a boson, boson times fermion is a fermion and fermion times fermion is a boson. However, a boson which is the product of two fermions $\phi$ and $\psi$ is not a number by which one could divide or from which one could take the square root. Its square $(\psi\phi)^2 = -\psi^2 \phi^2$ vanishes because $\phi\phi = -\phi\phi = 0$ and $\psi\psi = -\psi\psi = 0$. We consider only polynomials which are linear combinations of monomials with the same grading. They are graded commutative,

\[ A\, B = (-1)^{|A|\,|B|}\, B\, A . \tag{B.5} \]
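The sign rules (B.2) and the vanishing square of a product of two fermions can be made concrete in a toy implementation. The following sketch assumes purely fermionic generators labelled by integers and stores a polynomial as a dictionary from sorted index tuples to coefficients; this representation and the names `gen`, `mul` and `mul_monomials` are illustrative choices, not the notation of the text.

```python
# Sketch: a tiny Grassmann algebra with fermionic generators only.
# A polynomial is a dict {sorted tuple of generator labels: coefficient}.

def mul_monomials(m1, m2):
    """Multiply two monomials; return (sign, sorted monomial) or (0, ())."""
    seq = list(m1 + m2)
    if len(set(seq)) < len(seq):
        return 0, ()              # a repeated fermionic generator squares to zero
    sign = 1
    # Bubble sort: every transposition of two fermionic factors flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(a, b):
    """Product of two polynomials."""
    out = {}
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            sign, m = mul_monomials(m1, m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def gen(i):
    """The fermionic generator with label i."""
    return {(i,): 1}

psi, phi, chi = gen(1), gen(2), gen(3)
boson = mul(psi, phi)             # product of two fermions, grading 0
```

Here `mul(boson, boson)` is the empty dictionary, i.e. zero, while `boson` commutes with the third fermion `chi`, as (B.5) requires for a factor of grading 0.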

We call linear maps $O$ of the graded commutative algebra to itself operations in order to distinguish them from operators which are understood to act in a (dense subspace of a) Hilbert space. Each element $A$ of the graded algebra defines the left multiplicative operation to map elements $B$ to $AB$. This multiplicative operation is sloppily also denoted by $A$, leaving it to the context to clarify what is meant, e.g. (B.16) applies to the algebraic element $\phi^{i}$, (B.22) holds for its multiplicative operation.

Operations can be decomposed into bosonic operations, which preserve the grading of their argument, and fermionic operations, which map bosons to fermions and fermions to bosons,

\[ |O_{\text{bosonic}}(A)| = |A| , \quad |O_{\text{fermionic}}(A)| = |A| + 1 \mod 2 . \tag{B.6} \]

They have a natural grading,

\[ |O| = |O(A)| - |A| \mod 2 . \tag{B.7} \]

The grading of successive operations is the sum of the gradings,

\[ |O_1 O_2| = |O_1| + |O_2| \mod 2 . \tag{B.8} \]

A module is a vector space of linear combinations of some basis $e_i$ with grading $|i|$ with graded commutative coefficients. The grading of the components $M^{i}{}_{j}$ of a bosonic operation $M$ acting on such a module is

\[ |M^{i}{}_{j}| = |i| + |j| \mod 2 , \tag{B.9} \]

as $M e_j = e_i\, M^{i}{}_{j}$. If the basis is ordered 'bosons first' then $M$ decomposes

\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{B.10} \]

into transitions $a$ from bosons to bosons, transitions $d$ from fermions to fermions, transitions $b$ from fermions to bosons and transitions $c$ from bosons to fermions. $M$ is invertible as a formal power series if and only if its c-number part $\underline{M}$ is invertible,

\[ \underline{M} = \begin{pmatrix} \underline{a} & \\ & \underline{d} \end{pmatrix} , \quad M = (1 + N)\, \underline{M} , \quad N = (M - \underline{M})\, \underline{M}^{-1} , \quad M^{-1} = \underline{M}^{-1} (1 + N)^{-1} = \underline{M}^{-1} \sum_{n} (-N)^{n} . \tag{B.11} \]

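The trace of $M$, $\operatorname{tr} M = \operatorname{tr} a + \operatorname{tr} d$, is not invariant under $\mathrm{Ad}_A : M \mapsto A M A^{-1}$, as $\operatorname{tr} MN$ is not $\operatorname{tr} NM$ for all bosonic $M$ and $N$. In fact $M^{i}{}_{j} N^{j}{}_{i} = (-1)^{|i| + |j|}\, N^{j}{}_{i} M^{i}{}_{j}$, because of $(|i| + |j|)^2 = |i| + |j|$ modulo 2. So

\[ (-1)^{|i|}\, M^{i}{}_{j} N^{j}{}_{i} = (-1)^{|j|}\, N^{j}{}_{i} M^{i}{}_{j} \tag{B.12} \]

and the supertrace

\[ \operatorname{str} M = \sum_{i} (-1)^{|i|}\, M^{i}{}_{i} = \operatorname{tr} a - \operatorname{tr} d \tag{B.13} \]

is cyclic, $\operatorname{str} MN = \operatorname{str} NM$, and therefore invariant, $\operatorname{str} A M A^{-1} = \operatorname{str} A^{-1} A M = \operatorname{str} M$. Also the definition of the transposed matrices $M^{T} l = l \circ M$ (A.2), which act on the dual vector space, and their product involves a graded sign,

\[ M^{T}{}_{i}{}^{k} = (-1)^{|i|\,|k|}\, M^{k}{}_{i} , \quad (M^{T} M'^{\,T})_{i}{}^{j} = \sum_{k} (-1)^{|k|}\, M^{T}{}_{i}{}^{k}\, M'^{\,T}{}_{k}{}^{j} = (M' M)^{T}{}_{i}{}^{j} . \tag{B.14} \]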

Left derivations $v$ are operations with the graded Leibniz rule

\[ v(A\, B) = (vA)\, B + (-1)^{|v|\,|A|}\, A\, (vB) . \tag{B.15} \]

The sign of the second term is required for consistency with $v(AB) = (-1)^{|A||B|}\, v(BA)$. By the product rule $v$ is determined by its action on the generating variables, $v(\phi^{i}) = v^{i}(\phi)$. Each derivation is the linear combination $v = v^{i}(\phi)\, \partial_{\phi^{i}}$ of the partial derivatives $\partial_{\phi^{i}}$ which satisfy the product rule and

\[ \partial_{\phi^{i}}\, \phi^{j} = \delta_{i}{}^{j} . \tag{B.16} \]

They have the same grading as their corresponding variables,

\[ |\partial_{\phi^{i}}| = |\phi^{i}| , \quad \partial_{\phi^{i}} \partial_{\phi^{j}} = (-1)^{|i|\,|j|}\, \partial_{\phi^{j}} \partial_{\phi^{i}} . \tag{B.17} \]

The algebra is assumed to allow for a conjugate linear involution $*$, $(aA + bB)^{*} = a^{*} A^{*} + b^{*} B^{*}$ and $(A^{*})^{*} = A$, which preserves the grading, $|A^{*}| = |A|$, and satisfies

\[ (A\, B)^{*} = B^{*} A^{*} . \tag{B.18} \]

The conjugate operation $O^{*}$ is defined such that it is linear,

\[ O^{*}(A) = (-1)^{|O|\,|A|}\, \bigl( O(A^{*}) \bigr)^{*} . \tag{B.19} \]

This implies

\[ (O_1 O_2)^{*} = (-1)^{|O_1|\,|O_2|}\, O_1^{*}\, O_2^{*} . \tag{B.20} \]

Conjugation does not reverse the order of successive operations. The conjugate of a left derivation is a left derivation. This is the reason for the sign in (B.19). Without it the conjugate of a derivation would be a right derivation, i.e. in the product rule the right factor would be differentiated without a sign, and there could not be a self conjugate fermionic derivation such as the exterior derivative $d$ or the BRST-transformation in gauge theories.

The partial derivative with respect to a real fermionic variable is purely imaginary: $\phi = \phi^{*}$ and $|\phi| = 1$ imply $(\partial_\phi)^{*}(\phi) = -1 = -\partial_\phi \phi$. Care must be exercised with fermionic variables which carry indices which can be raised or lowered, $\psi_\beta = \varepsilon_{\beta\gamma} \psi^{\gamma}$, with an antisymmetric two form $\varepsilon_{\alpha\beta} = -\varepsilon_{\beta\alpha}$ (A.54). If $\partial_{\psi^{\alpha}} \psi^{\beta} = \delta_{\alpha}{}^{\beta}$ and $\partial^{\alpha} = \varepsilon^{\alpha\gamma} \partial_{\psi^{\gamma}}$ then $\partial^{\alpha} \psi_{\beta} = -\delta^{\alpha}{}_{\beta}$. Likewise $\bar\psi^{\dot\alpha} = (\psi^{\alpha})^{*}$ and $\partial_{\dot\beta} = (\partial_{\psi^{\beta}})^{*}$ imply $\partial_{\dot\alpha} \bar\psi^{\dot\beta} = -\delta_{\dot\alpha}{}^{\dot\beta}$ for fermionic $\psi$.

The graded commutator of operations $O$ and $P$

\[ [O, P] = O\, P - (-1)^{|O|\,|P|}\, P\, O \tag{B.21} \]

is their anticommutator, if both $O$ and $P$ are fermionic, and their commutator, if $O$ is bosonic or $P$ is bosonic. For example, the partial derivative $\partial_{\phi^{i}}$ and the multiplication with $\phi^{j}$ satisfy the graded Heisenberg algebra

\[ [\partial_{\phi^{i}}, \phi^{j}] = \delta_{i}{}^{j} . \tag{B.22} \]

The graded commutator is linear in both arguments, graded antisymmetric,

\[ [O, P] = -(-1)^{|O|\,|P|}\, [P, O] , \tag{B.23} \]

and satisfies the product rule

\[ [O, P\, Q] = [O, P]\, Q + (-1)^{|O|\,|P|}\, P\, [O, Q] . \tag{B.24} \]

Therefore, if bosonic operations $P$ and $Q$ are polynomials in graded commutative operations, then $P$ and $Q$ commute, because by the product rule their commutator can be reduced to vanishing graded commutators. So bosonic polynomials in local bosonic or fermionic fields commute at mutually spacelike separations.

The graded commutator of derivations is a derivation. Each subspace of derivations, which is closed under graded commutation and under conjugation, constitutes a real (self conjugate) graded Lie algebra, e.g. the generators of the Poincaré group and fermionic derivations $D_\alpha$, $D_{\dot\beta}$ span the supersymmetry algebra [66]

\[ \{ D_\alpha , D_{\dot\beta} \} = 2\, P_m\, \sigma^{m}{}_{\alpha\dot\beta} . \tag{B.25} \]

The spacetime derivatives $d_m$ (5.147) are bosonic derivations of the algebra. Applied to spacetime coordinates, $d_m$ are the partial derivatives, $d_m x^{n} = \delta_m{}^{n}$. The variables $\phi$ with $d_m \phi = 0$ are called spacetime constants. Applied to the other generating variables $\phi$ they yield the variables $d_m \phi$ and applied repeatedly one obtains the variables $d_{m_1} d_{m_2} \dots d_{m_k} \phi$ which are completely symmetric under permutations of the derivatives. Collectively we denote by $\hat\phi$ the graded commutative variable $\phi$ and its derivatives.

By this algebraic definition of $d_m$ all variables are infinitely often differentiable though for classical fermionic fields one cannot even define continuity, as the space of continuously many anticommuting variables lacks a natural topology. We restrict our considerations to finitely many anticommuting variables, fermionic fields $\phi$ and their derivatives of finite order at finitely many points $x$.

The price to pay is that integrals over Lagrangians containing fermionic $\phi$ are purely formal, e.g. the partition sum neither converges to an integral nor diverges with finer and finer partitions of the domain of integration. Such an integral over the Lagrangian seems necessary to deduce the Euler-Lagrange equations from the variational principle. But the Euler derivatives (5.155) of the Lagrangian, considered as a bosonic polynomial or series in $x$ and $\hat\phi$, are algebraically well defined even without an integral. The integral is needed only to discard derivatives and can be completely avoided if one simply considers the variation of the differential form $\omega(x, dx, \hat\phi) = \mathcal{L}(x, \hat\phi)\, d^4x$ modulo exact forms $d\eta(x, dx, \hat\phi)$.

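B.2 Two-particle States

The states of two different particles are sums of products of one particle states $\Psi \otimes \Phi$ and correspond to rays in $\mathcal{H}_1 \otimes \mathcal{H}_1$ with scalar product

\[ \langle \Psi \otimes \Phi \,|\, \Psi' \otimes \Phi' \rangle = \langle \Psi | \Psi' \rangle\, \langle \Phi | \Phi' \rangle . \tag{B.26} \]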

If the two-particle state is a product, then the probabilities (8.1) factorise for results of measurements, which are performed on the separate particles and therefore have factorising eigenstates $\Lambda_{1i} \otimes \Lambda_{2j}$. Then the knowledge of the result of a measurement of the first particle does not alter the probabilities of the results of the second particle. If the two-particle state does not factorise, but is the sum of products, it is called entangled and the individual results are correlated; in the extreme case the result of the first particle determines the result of the second particle (section 1.5).

The general two-particle state is given by wave functions $\Psi : M_{m_1} \times M_{m_2} \to \mathbb{C}^{d_1 d_2}$ with values $\Psi^{ij}(p_1, p_2)$ where $p_1$ and $p_2$ are the individual momenta and $i$ and $j$ enumerate the spin or helicity components of the two particles. Their scalar product is

\[ \langle \chi | \Psi \rangle = \sum_{ij} \int_{M_{m_1} \times M_{m_2}} \!\! \tilde{d}p_1\, \tilde{d}p_2\ \chi^{* ij}(p_1, p_2)\, \Psi^{ij}(p_1, p_2) . \tag{B.27} \]

In case of identical bosons, e.g. two photons, the permutation of their spin and momenta in a common coordinate patch leaves $\Psi$ invariant. The permutation of two identical fermions, e.g. two electrons, changes the sign of $\Psi$,

\[ \Psi^{ij}_{\text{Boson}}(p_1, p_2) = \Psi^{ji}_{\text{Boson}}(p_2, p_1) , \quad \Psi^{ij}_{\text{Fermion}}(p_1, p_2) = -\Psi^{ji}_{\text{Fermion}}(p_2, p_1) . \tag{B.28} \]

The unitary transformation of one-particle states under Poincaré transformations extends naturally to two-particle states,

\[ U_{\Lambda,a} (\Psi_1 \otimes \Psi_2) = (U_{\Lambda,a} \Psi_1) \otimes (U_{\Lambda,a} \Psi_2) , \tag{B.29} \]

and implies the product rule for the generators $M_{mn}$ and $P_m$,

\[ M_{mn} (\Psi_1 \otimes \Psi_2) = (M_{mn} \Psi_1) \otimes \Psi_2 + \Psi_1 \otimes (M_{mn} \Psi_2) . \tag{B.30} \]

In particular the individual momenta add to the total momentum $P = p_1 + p_2$,

\[ P\, \Psi(p_1, p_2) = (p_1 + p_2)\, \Psi(p_1, p_2) , \tag{B.31} \]

with a continuous spectrum of $P^2 \ge (m_1 + m_2)^2$. As the product representation is a representation it is by Mackey's theorems the sum or integral of irreducible representations. The total momentum $P$ of states with nonnegative mass and energy lies in the interior of the forward lightcone. Colinear individual momenta have vanishing measure in the set of pairs of momenta and do not contribute to the scalar product. So the two-particle states do not contain a massless representation.

To decompose the transformation of two-particle states into a sum of irreducible transformations we introduce the center variables. They correspond to the center of mass coordinates of nonrelativistic physics. We write the momenta in terms of the total momentum $P = p_1 + p_2$, $P^2 = m^2$, and the relative momentum $p$, $p \cdot P = 0$ (3.57),

\[ \begin{aligned} p_1 &= \Bigl( 1 + \frac{m_1^2 - m_2^2}{m^2} \Bigr) \frac{P}{2} + p , \qquad p_2 = \Bigl( 1 - \frac{m_1^2 - m_2^2}{m^2} \Bigr) \frac{P}{2} - p , \\ p &= \Bigl( 1 - \frac{m_1^2 - m_2^2}{m^2} \Bigr) \frac{p_1}{2} - \Bigl( 1 + \frac{m_1^2 - m_2^2}{m^2} \Bigr) \frac{p_2}{2} , \qquad \sqrt{-p^2} = \frac{w(m^2, m_1^2, m_2^2)}{2m} . \end{aligned} \tag{B.32} \]

The momentum $q$ in the rest system is obtained from $p$ by boosting back with the Lorentz boost $L_P$ (6.42) which maps the total momentum at rest $\underline{P} = (m, 0, 0, 0)$ to $P$,

\[ p = L_P\, q , \quad q = (0, \mathbf{q}) , \quad \mathbf{q} \in \mathbb{R}^3 . \tag{B.33} \]

Both $P$ and $p$ transform under Lorentz transformations $\Lambda$ as four vectors, the three vector $\mathbf{q}$ transforms in conjunction with $P$ in the pair $(P, \mathbf{q})$ by Wigner rotation,

\[ \Lambda\, (P, \mathbf{q}) = \bigl( \Lambda P,\ W(\Lambda, P)\, \mathbf{q} \bigr) , \quad W(\Lambda, P) = (L_{\Lambda P})^{-1}\, \Lambda\, L_P . \tag{B.34} \]

The last relation of (B.32) specifies the contribution of the relative momentum $\mathbf{q}$ to the mass of the two-particle state. Its size $\mathbf{q}^2 = -p^2$ becomes more transparent in units of $m_1 + m_2$,

\[ \frac{w(m^2, m_1^2, m_2^2)}{2m} = (m_1 + m_2)\, \frac{w(x^2, y^2, z^2)}{2x} , \quad (x, y, z) = \frac{(m, m_1, m_2)}{m_1 + m_2} , \tag{B.35} \]

and in its factorised form, $w(x^2, y^2, z^2) = \sqrt{(x^2 - (y+z)^2)(x^2 - (y-z)^2)}$,

\[ \frac{\mathbf{q}^2}{(m_1 + m_2)^2} = \frac{(x^2 - 1)(x^2 - a^2)}{4x^2} , \quad a = \frac{m_1 - m_2}{m_1 + m_2} , \quad 0 \le a \le 1 , \quad x \ge 1 . \tag{B.36} \]

Expanding around $x = 1$ to lowest order in $\mathbf{q}^2$ exhibits $m - (m_1 + m_2)$, the kinetic energy in the rest system, as the nonrelativistic kinetic energy of a particle with the reduced mass $\mu$,

\[ m - (m_1 + m_2) = \frac{\mathbf{q}^2}{2\mu} + O(\mathbf{q}^4) , \quad \mu = \frac{m_1 m_2}{m_1 + m_2} . \tag{B.37} \]

Solving (B.36) for $m^2$ one has to all orders in $\mathbf{q}^2$

\[ m^2 = m_1^2 + m_2^2 + 2\, \mathbf{q}^2 + 2 \sqrt{ m_1^2 m_2^2 + (m_1^2 + m_2^2)\, \mathbf{q}^2 + \mathbf{q}^4 } . \tag{B.38} \]
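Relations (B.36) to (B.38) can be cross-checked with a few lines of arithmetic. The sketch below assumes, as an input not spelled out above, that in the rest system the two particles carry opposite three-momenta of size $|\mathbf{q}|$, so that $m = \sqrt{m_1^2 + \mathbf{q}^2} + \sqrt{m_2^2 + \mathbf{q}^2}$; the function names and the sample masses are hypothetical.

```python
import math

def invariant_mass(m1, m2, q):
    """Assumed rest-system relation: m = E1 + E2 with E_i = sqrt(m_i^2 + q^2)."""
    return math.sqrt(m1 * m1 + q * q) + math.sqrt(m2 * m2 + q * q)

def q_squared_from_mass(m, m1, m2):
    """q^2 as a function of the total mass m, following (B.36)."""
    total = m1 + m2
    x = m / total
    a = (m1 - m2) / total
    return total ** 2 * (x * x - 1.0) * (x * x - a * a) / (4.0 * x * x)

def mass_squared_closed_form(m1, m2, q):
    """m^2 to all orders in q^2, following (B.38)."""
    q2 = q * q
    root = math.sqrt(m1 ** 2 * m2 ** 2 + (m1 ** 2 + m2 ** 2) * q2 + q2 * q2)
    return m1 ** 2 + m2 ** 2 + 2.0 * q2 + 2.0 * root

def kinetic_energy_nonrelativistic(m1, m2, q):
    """Leading term of (B.37): q^2 / (2 mu) with the reduced mass mu."""
    mu = m1 * m2 / (m1 + m2)
    return q * q / (2.0 * mu)

# Sample values in arbitrary units (illustrative only).
M1, M2, Q = 3.0, 1.0, 0.5
M_TOTAL = invariant_mass(M1, M2, Q)
```

For small $|\mathbf{q}|$ the difference `invariant_mass(m1, m2, q) - (m1 + m2)` should agree with `kinetic_energy_nonrelativistic(m1, m2, q)` up to terms of order $\mathbf{q}^4$.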

The rotations $D$ leave invariant the total momentum at rest $\underline{P} = (m, 0, 0, 0)$,

\[ D\, (\underline{P}, \mathbf{q}) = (\underline{P}, D\mathbf{q}) . \tag{B.39} \]

Their representation determines the total spin of the two-particle state. The spins $s_1$ and $s_2$ of two massive particles add to spin $s = s_1 + s_2, s_1 + s_2 - 1, \dots, |s_1 - s_2|$ which combines with orbital angular momentum $l = 0, 1, 2, \dots$ from the $\mathbf{q}$-dependence to the total spin $j = l + s, l + s - 1, \dots, |l - s|$. Positronium consists of a spin 1/2 electron and a spin 1/2 positron with combined spin $s = 1$ and $s = 0$. So for fixed dependence on $|\mathbf{q}|$ there are 2 states with total spin $j = 0$ and 4 states with $j = 1, 2, \dots$.

Photons have helicity $+1$ or $-1$ and are restricted by Bose symmetry to satisfy (B.28) for each fixed $P$,

\[ \chi^{ij}(\mathbf{q}) = \chi^{ji}(-\mathbf{q}) , \tag{B.40} \]

where $\chi^{ij}(\mathbf{q}) = \Psi^{ij}(P/2 + q, P/2 - q)$. The helicity $i$ is the angular momentum of the first photon in direction of $\mathbf{q}$ and $j$ is the angular momentum of the second photon in direction $-\mathbf{q}$ (8.96). In direction of $\mathbf{q}$ the angular momenta add to $m = i - j$. So $\chi^{++}$ and $\chi^{--}$ induce rotation multiplets from $m = 0$. As $\chi^{++}$ and $\chi^{--}$ are even functions of $\mathbf{q}$ each of them induces one multiplet with total spin $j = 0, 2, 4, \dots$. The function $\chi^{+-}$ determines $\chi^{-+}$ and carries combined angular momentum $m = 1 - (-1) = 2$ in some chosen direction $\mathbf{q}$. By (8.106) this induces one multiplet for each $j = 2, 3, 4, \dots$. So the multiplicities $n_j$ of total spin $j = 0, 1, 2, 3, \dots$ of two-photon states are $n_0, n_1, n_2, n_3 = 2, 0, 3, 1$. This proves Yang's theorem: there is no spin 1 multiplet of two photons.

The theorem clarifies why positronium with $j = 1$ does not decay into two photons: the interaction is Poincaré invariant and therefore preserves the total momentum $P$ and the total spin. But there is no two-photon state with $j = 1$ into which $j = 1$ positronium can decay. There is a three-photon state with $j = 1$. But the decay rate to three photons is suppressed by a factor $\alpha \sim 1/137$ for the production of an additional photon, which is why one finds positronium predominantly in its stablest state as orthopositronium with $j = 1$.
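The counting behind Yang's theorem can be tallied mechanically. The sketch below merely encodes the two rules stated above (one multiplet for each even $j$ from each of $\chi^{++}$ and $\chi^{--}$, and one multiplet for each $j \ge 2$ from $\chi^{+-}$); it does not derive them, and the function name is hypothetical.

```python
def two_photon_multiplicities(j_max):
    """Multiplicities n_j of total spin j = 0..j_max of two-photon states.

    Encodes only the counting rules quoted in the text:
      * chi^{++} and chi^{--}: one multiplet each for j = 0, 2, 4, ...
      * chi^{+-} (which fixes chi^{-+}): one multiplet for each j = 2, 3, 4, ...
    """
    n = [0] * (j_max + 1)
    for j in range(j_max + 1):
        if j % 2 == 0:
            n[j] += 2   # contributions of chi^{++} and chi^{--}
        if j >= 2:
            n[j] += 1   # contribution of chi^{+-}
    return n
```

Under these rules `two_photon_multiplicities(3)` reproduces the list $2, 0, 3, 1$, and the entry for $j = 1$ is the only one that vanishes.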

B.3 The Many-Particle Space $\mathcal{H}$

In relativistic quantum theory the Hilbert space $\mathcal{H}_1$ of one-particle states is spanned by the eigenstates of $P^2$ with a continuous spectrum of the spatial momentum $\mathbf{P}$. It decomposes into a finite or denumerable sum of irreducible, faithful, unitary representations of the Poincaré group on discrete mass shells with positive energy. The point spectrum of $P^2$ distinguishes one-particle states from states with several particles and a continuous spectrum of $P^2$. The irreducible representation spaces and consequently $\mathcal{H}_1$ are separable: there are systems of denumerable, orthonormal states $\Lambda_1, \Lambda_2, \dots$ the Cauchy completion of which is $\mathcal{H}_1$.

We choose to specify the states on a mass shell by their wavefunctions $\Psi : (p, i) \mapsto \Psi^{i}(p)$, where the index $i$ enumerates their spin components and other discrete labels which carry names like isospin, charge, color or flavor. We assume $\mathcal{H}_1 = \mathcal{H}_B + \mathcal{H}_F$ to decompose into a space of bosonic states and a space of fermionic states. The $n$-particle states are elements of the subspace

\[ \mathcal{H}_n \subset \mathcal{H}_1 \otimes \dots \otimes \mathcal{H}_1 \tag{B.41} \]

of the $n$-fold tensor product which is graded symmetric under all permutations $\pi$,

\[ \Psi \in \mathcal{H}_n \ \Leftrightarrow\ \Psi^{i_1 \dots i_n}(p_1, \dots, p_n) = \mathrm{sign}_F(\pi)\, \Psi^{i_{\pi(1)} \dots i_{\pi(n)}}(p_{\pi(1)}, \dots, p_{\pi(n)}) \tag{B.42} \]

with $\mathrm{sign}_F(\pi)$ (B.3) being $1$ or $-1$ if $\pi$ consists of an even or odd number of transpositions of fermions. Poincaré transformations map the tensor product naturally to itself (A.11),

\[ U_{\Lambda,a} (\Psi \otimes \dots \otimes \Phi) = (U_{\Lambda,a} \Psi) \otimes \dots \otimes (U_{\Lambda,a} \Phi) , \tag{B.43} \]

preserving the graded permutation symmetry. So the unitary representation extends naturally from $\mathcal{H}_1$ to $\mathcal{H}_n$ on which the generators act by the product rule. This is why $P^0$ cannot generate an interacting time evolution, as by the product rule it conserves the individual momenta of the many-particle states (9.1). Together with the zero particle space $\mathcal{H}_0 = \{\, c\,\Omega : c \in \mathbb{C} \,\}$ of multiples of the vacuum $\Omega$ the $n$-particle spaces span the separable Hilbert space

\[ \mathcal{H} = \bigoplus_{n=0}^{\infty} \mathcal{H}_n . \tag{B.44} \]


It is generated from the vacuum by creation operators $a^{\star}_{f}$ which add a particle in state $f \in \mathcal{H}_1$ to each $n$-particle state $\Psi_n$,

\[ (a^{\star}_{f} \Psi_n)^{i_1, \dots, i_{n+1}}(p_1, \dots, p_{n+1}) = \frac{\sqrt{n+1}}{(n+1)!} \sum_{\pi \in S_{n+1}} \mathrm{sign}_F(\pi)\, f^{i_{\pi(1)}}(p_{\pi(1)})\, \Psi_n^{i_{\pi(2)} \dots i_{\pi(n+1)}}(p_{\pi(2)}, \dots, p_{\pi(n+1)}) . \tag{B.45} \]

The sum over the group $S_{n+1}$ of all permutations $\pi$ of $1, \dots, n+1$ just performs the graded symmetrization up to the factor $\sqrt{n+1}$. As the states $\Psi_n$ are graded symmetric, it suffices to sum with coefficient $1/\sqrt{n+1}$ over representatives (e.g. the transpositions $(i, n+1)$, $i = 1, \dots, n$, and $\mathrm{id}$) of the $n + 1$ cosets in $S_{n+1}/S_n$. The annihilation operator $a_{g^{*}}$, the adjoint of $a^{\star}_{g}$,

\[ a_{g^{*}} = (a^{\star}_{g})^{\star} , \tag{B.46} \]

absorbs a particle in state $g \in \mathcal{H}_1$,

\[ (a_{g^{*}} \Psi_n)^{i_1 \dots i_{n-1}}(p_1, \dots, p_{n-1}) = \sqrt{n}\, \sum_{j} \int_{M} \tilde{d}p\ g^{j}(p)^{*}\, \Psi^{j\, i_1 \dots i_{n-1}}(p, p_1, \dots, p_{n-1}) . \tag{B.47} \]

In particular, for $n = 0$, $a_g$ annihilates the vacuum, $a_g \Omega = 0$. The creation and annihilation operators satisfy the graded commutation relations

\[ [a_g, a_f] = 0 , \quad [a^{\star}_{g}, a^{\star}_{f}] = 0 , \quad [a_{g^{*}}, a^{\star}_{f}] = \langle g | f \rangle . \tag{B.48} \]

For smooth wave functions $f$ and $g$ one writes the creation and annihilation operators as operator valued distributions $a^{\star}_{j}(q)$ and $a^{i}(p)$ applied to test functions,

\[ a^{\star}_{f} = \sum_{j} \int_{M} \tilde{d}q\ a^{\star}_{j}(q)\, f^{j}(q) , \quad a_{g} = \sum_{i} \int_{M} \tilde{d}p\ g_{i}(p)\, a^{i}(p) . \tag{B.49} \]

These distributions satisfy

\[ [a^{i}(p), a^{j}(q)] = 0 , \quad [a^{\star}_{i}(p), a^{\star}_{j}(q)] = 0 , \quad [a^{i}(p), a^{\star}_{j}(q)] = (2\pi)^3\, 2 \sqrt{m^2 + \mathbf{p}^2}\ \delta^3(\mathbf{p} - \mathbf{q})\, \delta^{i}{}_{j} . \tag{B.50} \]

The structure of the Hilbert space as a sum over the graded symmetric powers of the one-particle space, H = ⊕Hn , can be postulated or derived assuming a vacuum and the commutation relations of the creation and annihilation operators as they follow from canonical quantization of graded commutative fields. Historically, mistaking these fields for wave functions of quantum states, their quantization was counted as a second quantization, a name which still sticks and lends quantum field theory a dispensable mysterious flair.


B.4 Time Order, Normal Order and Wick's Theorem

Classical bosonic and fermionic fields $\phi(x_i)$ and their derivatives at $n$ points constitute finitely many generators of a graded commutative algebra which we denote by $\phi_1, \phi_2, \dots$. The time order $T$ maps products of these variables at distinct points, $x_i \neq x_j$ for $i \neq j$, to the products of their corresponding local quantum fields $\Phi(x_i)$ with the factors ordered such that their times increase from right to left. Enumerate the points such that $x_1^0 \ge x_2^0 \ge \dots \ge x_n^0$ and let $\pi$ denote a permutation of $\{ 1, \dots, n \}$, then time order is defined as

\[ T(1) = 1 , \quad T(\phi_{\pi(1)} \phi_{\pi(2)} \dots \phi_{\pi(n)}) = \mathrm{sign}_F(\pi)\, \Phi(x_1)\, \Phi(x_2) \dots \Phi(x_n) . \tag{B.51} \]

The time order $T$ is linearly extended to polynomials. It is a partial quantization, a map from the subspace, restricted by $x_i \neq x_j$ for $i \neq j$, of the algebra of classical fields to operator valued distributions. By its definition $T(\phi_1 \dots \phi_n)$ is graded symmetric under permutations $\pi$ of its arguments: their transposition yields, up to a sign if they are fermionic, the same product of quantum fields ordered by increasing times,

\[ T\bigl( \phi_{\pi(1)} \dots \phi_{\pi(n)} \bigr) = \mathrm{sign}_F(\pi)\, T\bigl( \phi_1 \dots \phi_n \bigr) . \tag{B.52} \]

Therefore the arguments of $T$ cannot be operators $\Phi$ and $\Psi$ with a nonvanishing numerical value $c$ of their graded commutator. Otherwise all operators $c\, T X$ would vanish,

\[ c\, T X = T(cX) = T\bigl( (\Phi\Psi - (-1)^{|\Phi|\,|\Psi|}\, \Psi\Phi)\, X \bigr) = T(\Phi\Psi X) - T(\Phi\Psi X) = 0 . \tag{B.53} \]

The definition of $T$ extends to coordinates $x_i^m$ and bosonic or fermionic parameters $j_i$ which we feel free to introduce when convenient. Consistent with the linearity of the time order and with the definition of derivatives as limit of differences, one defines the time order of differentiated fields $d_m \phi$ by the differentiated quantum fields $\partial_{x^m} \Phi(x)$,

\[ T\, x_i^m := x_i^m\, T , \quad T\, j(x_i) := j(x_i)\, T , \quad T(\dots (d_m \phi)_i \dots) := \partial_{x_i^m}\, T(\dots \phi_i \dots) . \tag{B.54} \]

For the time order $T$ to depend continuously on the times $x_1^0 \dots x_n^0$ even if some of them coincide, the graded commutators of the fields $\Phi$ at equal times but different positions have to vanish, i.e. the quantum fields $\Phi$ have to be local: their graded commutator has to vanish for spacelike separations $(x_i - x_j)^2 < 0$. If, moreover, they transform as fields, $U_\Lambda \Phi(x)\, U_\Lambda^{-1} = D^{-1}(\Lambda)\, \Phi(\Lambda x)$, under time orientation preserving Lorentz transformations $\Lambda$, then time order is equivariant,

\[ T\bigl( D^{-1}(\Lambda) \phi(\Lambda x_1), \dots, D^{-1}(\Lambda) \phi(\Lambda x_n) \bigr) = U_\Lambda\, T\bigl( \phi(x_1), \dots, \phi(x_n) \bigr)\, U_\Lambda^{-1} , \tag{B.55} \]

as $\Lambda$ changes at most the temporal order of spacelike separations.

Local fields $\Phi(x) = \Phi^{-}(x) + \Phi^{+}(x)$ decompose into a creation part and an annihilation part. Corresponding to this decomposition the normal order $N$ maps linearly polynomials of graded commutative variables $\phi_i^{-}$ and $\phi_i^{+}$ to the polynomials of the creation and annihilation parts of the corresponding quantum fields where all creation parts are on the left of the annihilation parts.

To define normal order formally, consider the labels $-$ and $+$ of a product of $n$ creation or annihilation parts as a map $\varepsilon : \{ 1, \dots, n \} \to \{ -, + \}$. It decomposes $\{ 1, \dots, n \}$ into the disjoint union of the sets $\varepsilon^{-1}(-)$ with $k$ elements, $0 \le k \le n$, and $\varepsilon^{-1}(+)$. They can be specified by a permutation $\pi$, $\varepsilon^{-1}(-) = \{ \pi(1), \dots, \pi(k) \}$, $\varepsilon^{-1}(+) = \{ \pi(k+1), \dots, \pi(n) \}$ with $\pi(i) < \pi(j)$ for $i < j \le k$ and for $k < i < j$. Abbreviating $\Phi^{\varepsilon(i)}(x_i)$ by $\Phi_i^{\varepsilon_i}$, the normal order is defined by

\[ N(1) = 1 , \quad N\bigl( \phi_1^{\varepsilon_1} \dots \phi_n^{\varepsilon_n} \bigr) = \mathrm{sign}_F(\pi)\, \Phi^{-}_{\pi(1)} \dots \Phi^{-}_{\pi(k)}\, \Phi^{+}_{\pi(k+1)} \dots \Phi^{+}_{\pi(n)} . \tag{B.56} \]

For the product of $n$ local fields $\phi_i = \phi_i^{-} + \phi_i^{+}$ this means

\[ N\bigl( \phi_1 \dots \phi_n \bigr) = \sum_{\varepsilon} \mathrm{sign}_F(\pi)\, \Phi^{-}_{\pi(1)} \dots \Phi^{-}_{\pi(k)}\, \Phi^{+}_{\pi(k+1)} \dots \Phi^{+}_{\pi(n)} , \tag{B.57} \]

where the sum runs over all $2^n$ decompositions $\varepsilon$ of $\{ 1, \dots, n \} = \varepsilon^{-1}(-) \cup \varepsilon^{-1}(+)$. Normal order is Lorentz invariant and graded symmetric under permutations $\pi$,

\[ \begin{aligned} N\bigl( D^{-1}(\Lambda) \phi^{\varepsilon_1}(\Lambda x_1) \dots D^{-1}(\Lambda) \phi^{\varepsilon_n}(\Lambda x_n) \bigr) &= U_\Lambda\, N\bigl( \phi^{\varepsilon_1}(x_1) \dots \phi^{\varepsilon_n}(x_n) \bigr)\, U_\Lambda^{-1} , \\ N\bigl( \phi_1 \dots \phi_n \bigr) &= \mathrm{sign}_F(\pi)\, N\bigl( \phi_{\pi(1)} \dots \phi_{\pi(n)} \bigr) . \end{aligned} \tag{B.58} \]

So the arguments of N are not operators (B.53). Normal order is a quantization, a linear map from a graded commutative algebra to operator valued distributions. Each operator which is defined on the many-particle space $\mathcal{H}$ with an orthonormal basis (complete system) $f_1, f_2, \dots$ of $\mathcal{H}_1$ can be written in terms of creation and annihilation operators as a sum $O = \sum_{m,n} O_{m,n}$ of normal ordered operators

\[ O_{m,n} = \sum_{i_1 \dots i_{m+n}} O_{i_1 \dots i_m\, i_{m+1} \dots i_{m+n}}\; a^\star_{f_{i_1}} \dots a^\star_{f_{i_m}}\; a_{f_{i_{m+1}}} \dots a_{f_{i_{m+n}}} . \tag{B.59} \]

This follows by induction with respect to $n$ as the products of the creation operators, applied to the vacuum $\Omega$, span $\mathcal{H}$. Wick's theorem relates time order and normal order.

Wick's Theorem: The time ordered product is the sum over all contractions of the normal ordered product,

\[ \mathrm{T}(\phi_1 \dots \phi_n) = \sum_{C} \operatorname{sign}_F(\pi)\, \mathrm{N}(\phi_{a_1} \dots \phi_{a_{n-2k}})\, \Delta(x_{b_1} - x_{c_1}) \dots \Delta(x_{b_k} - x_{c_k}) , \tag{B.60} \]

\[ \Delta(x) = \langle \Omega\,|\,\mathrm{T}\,\phi(x)\phi(0)\,\Omega \rangle , \quad \bigl(\pi(1), \dots, \pi(n)\bigr) = (a_1, \dots, a_{n-2k}, b_1, c_1, \dots, b_k, c_k) . \]

The sum runs over all decompositions $C$ of $\{1, \dots, n\}$ into a set $\{a_1, \dots, a_{n-2k}\}$, $k \ge 0$, contributing a normal ordered operator, and sets $\{b_i, c_i\}$, $i = 1, \dots, k$, which contribute the contractions or propagators $\Delta(x_{b_i} - x_{c_i})$ from $x_{b_i}$ to $x_{c_i}$. Each decomposition $C$ corresponds uniquely to a permutation $\pi$ with $a_i < a_j$ and $b_i < b_j$ for $i < j$ and $b_i < c_i$. There are $n!/\bigl(2^k\,(n-2k)!\,k!\bigr)$ $k$-fold contractions of $n$ fields.

The theorem holds for $n = 0$ and $n = 1$. We leave the induction on $n$ to the reader. As the relation between time order and normal order is graded permutation symmetric it is sufficient to consider the case $x_1^0 \ge x_2^0 \ge \dots \ge x_n^0$.

Wick's theorem, applied to $S = \mathrm{T}\exp\bigl(i\int d^4x\, \mathcal{L}_{\text{int}}(x)\bigr)$ (9.14), allows one to calculate the scattering amplitudes (9.6) as a sum over all connected Feynman diagrams.

One is tempted to continuously extend the definition of time order also to coinciding spacetime points, unsuspecting any difficulties as the sequence of the factors of a square is irrelevant. But the square $\Phi(x)\Phi(x)$ is not defined! The right side of (B.51) diverges if two points approach each other (B.113). There is nothing continuous to be extended. All divergences of naively evaluated Feynman diagrams stem from time ordered products blindly continued to coinciding points [8].

For example the matrix elements $\lim_{\varepsilon \to 0} \langle \chi\,|\,\Phi(x+\varepsilon)\Phi(x)\,\Psi \rangle$ in many-particle states $\chi$ and $\Psi$ do not exist, as $\Delta(\varepsilon)$ diverges with $\varepsilon \to 0$. Well defined are the matrix elements of the normal ordered products. We use them to define $\mathrm{T}\int d^4x\, \mathcal{L}_{\text{int}}(x)$, where $\mathcal{L}_{\text{int}}(x)$ is a polynomial of fields at coinciding points $x$. Diagrammatically this definition of the interaction Lagrangian amounts to dropping all self contractions in the perturbative evaluation of (9.14).

More subtractions are required to define the matrix elements of $\mathrm{T}\bigl(\int d^4x\, \mathcal{L}_{\text{int}}\bigr)^n$ as $\bigl(\int d^4x\, \mathcal{L}_{\text{int}}(x)\bigr)^n = \int d^4x_1 \dots d^4x_n\, \mathcal{L}_{\text{int}}(x_1) \dots \mathcal{L}_{\text{int}}(x_n)$ is a $4n$-dimensional integral which extends also over neighbourhoods of coinciding space-time points $x_i = x_j$. There the time order of products of normal ordered fields requires subtle definitions. This is the subject of subtraction schemes and renormalization theory for which we refer to Bogoliubov, Parasiuk, Hepp, Zimmermann and Lowenstein [49][61].
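The number of $k$-fold contractions quoted with Wick's theorem can be checked by brute force. The following is a minimal sketch, not part of the original text: the helper names `contractions` and `count_formula` are hypothetical, and the enumeration uses the same ordering convention as the theorem ($b_1 < b_2 < \dots$ and $b_i < c_i$) so that every pairing is produced exactly once.

```python
from math import factorial


def contractions(indices, k):
    """Yield every set of k disjoint pairs (b, c) with b < c taken from `indices`.

    Pairs are generated with increasing first element b, so each set of
    pairs appears exactly once (the convention used in Wick's theorem).
    """
    if k == 0:
        yield []
        return
    for pos, b in enumerate(indices):
        later = indices[pos + 1:]            # partners c must come after b
        for q, c in enumerate(later):
            remaining = later[:q] + later[q + 1:]
            for tail in contractions(remaining, k - 1):
                yield [(b, c)] + tail


def count_formula(n, k):
    """n! / (2^k (n - 2k)! k!), the count stated in the text."""
    return factorial(n) // (2 ** k * factorial(n - 2 * k) * factorial(k))


def count_by_enumeration(n, k):
    return sum(1 for _ in contractions(list(range(1, n + 1)), k))
```

For four fields, for instance, the sketch lists six single contractions and three double contractions, in agreement with the formula.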

B.5 Unitary Representations of Rotations

The generators of a unitary representation of rotations in three dimensions are antihermitean and satisfy the Lie algebra (8.45), $i, j, k, l \in \{1, 2, 3\}$,

\[ [\Gamma_{ij}, \Gamma_{kl}] = \delta_{ik}\Gamma_{jl} - \delta_{il}\Gamma_{jk} - \delta_{jk}\Gamma_{il} + \delta_{jl}\Gamma_{ik} , \quad \Gamma_{ij} = -\Gamma_{ij}^\dagger = -\Gamma_{ji} . \tag{B.61} \]

Multiplying the generators with $i$ one obtains hermitean operators, the components of the angular momentum in units with $\hbar = 1$,

\[ L_1 = i\Gamma_{23} , \quad L_2 = i\Gamma_{31} , \quad L_3 = i\Gamma_{12} , \tag{B.62} \]

which satisfy the angular momentum algebra ($i, j, k \in \{1, 2, 3\}$)

\[ [L_i, L_j] = i\,\varepsilon_{ijk} L_k . \tag{B.63} \]
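As a plausibility check of the step from the real Lie algebra to the angular momentum algebra, the following sketch uses the three-dimensional vector representation. The choice $(\Gamma_{ij})_{ab} = \delta_{ja}\delta_{ib} - \delta_{ia}\delta_{jb}$ is an assumption made here for illustration (the text does not fix a representation); the code tests the commutator $[\Gamma_{ij}, \Gamma_{kl}] = \delta_{ik}\Gamma_{jl} - \delta_{il}\Gamma_{jk} - \delta_{jk}\Gamma_{il} + \delta_{jl}\Gamma_{ik}$ for all index values and then $[L_i, L_j] = i\varepsilon_{ijk}L_k$ for $L_1 = i\Gamma_{23}$, $L_2 = i\Gamma_{31}$, $L_3 = i\Gamma_{12}$.

```python
def delta(a, b):
    return 1 if a == b else 0


def gamma(i, j):
    """Assumed vector representation: (Gamma_ij)_ab = d_ja d_ib - d_ia d_jb."""
    return [[delta(j, a) * delta(i, b) - delta(i, a) * delta(j, b)
             for b in range(3)] for a in range(3)]


def matmul(p, q):
    return [[sum(p[a][c] * q[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]


def combine(terms):
    """Linear combination sum(coefficient * matrix) of 3x3 matrices."""
    return [[sum(coef * mat[a][b] for coef, mat in terms) for b in range(3)]
            for a in range(3)]


def commutator(p, q):
    return combine([(1, matmul(p, q)), (-1, matmul(q, p))])


def satisfies_lie_algebra():
    """Check the so(3) commutation relations for all i, j, k, l (0-based)."""
    r = range(3)
    return all(
        commutator(gamma(i, j), gamma(k, l)) == combine([
            (delta(i, k), gamma(j, l)), (-delta(i, l), gamma(j, k)),
            (-delta(j, k), gamma(i, l)), (delta(j, l), gamma(i, k))])
        for i in r for j in r for k in r for l in r)


# L1 = i Gamma_23, L2 = i Gamma_31, L3 = i Gamma_12 (indices shifted to 0..2)
L = [combine([(1j, gamma(1, 2))]),
     combine([(1j, gamma(2, 0))]),
     combine([(1j, gamma(0, 1))])]


def satisfies_angular_momentum_algebra():
    """Check [L_a, L_b] = i L_c for the cyclic index triples."""
    return all(
        commutator(L[a], L[(a + 1) % 3]) == combine([(1j, L[(a + 2) % 3])])
        for a in range(3))


def is_hermitean(m):
    return all(m[a][b] == m[b][a].conjugate() for a in range(3) for b in range(3))
```

All matrix entries are small integers or integer multiples of $i$, so the comparisons are exact.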


As the angular momentum operators $L_i$ are hermitean, their commutator is antihermitean. Therefore the right side of (B.63) contains a factor $i$ which obscures that (B.61) is a real Lie algebra.

If hermitean operators $L_1, L_2, L_3$ satisfy the angular momentum algebra and map a Hilbert space $\mathcal{H}_l$, but not a genuine subspace, to itself then $\mathcal{H}_l$ is spanned by an orthonormal basis $\Lambda_{l,m}$ in terms of which one can completely specify the angular momentum operators. The quantum number $l$ fixes the eigenvalue $l(l+1)$ of the squared total angular momentum, the Casimir operator

\[ \vec{L}^{\,2} = (L_1)^2 + (L_2)^2 + (L_3)^2 . \tag{B.64} \]

The magnetic quantum number $m$ is the eigenvalue of the angular momentum $n \cdot L$ in some chosen direction $n$, which we take to be the 3-direction, i.e. we take $m$ to be the eigenvalue of $L_3$. The angular momentum operators do not change the value of $l$. It characterizes the irreducible space $\mathcal{H}_l$, the angular momentum multiplet with total angular momentum $l$.

As the rotation group $SO(3) \sim S^3/\mathbb{Z}_2$ is compact, each representation is equivalent to a sum of unitary, irreducible and finite-dimensional representations. So each hermitean operator which maps such a finite-dimensional representation space, an angular momentum multiplet, to itself has a spectrum of discrete eigenvalues and is diagonalizable. The angular momentum multiplet is $(2l+1)$-dimensional (consequently the possible values for $l$ are the integers and half integers $0, 1/2, 1, 3/2, \dots$) and is spanned by orthonormal basis vectors $\Lambda_{l,m}$ with $m$ values $-l, -l+1, \dots, +l$. So $l = m_{\max}$ is the largest $m$ number in the angular momentum multiplet. On these basis states $\Lambda_{l,m}$ the angular momentum operators act according to

\[ \vec{L}^{\,2} \Lambda_{l,m} = l(l+1)\, \Lambda_{l,m} , \tag{B.65a} \]
\[ L_3 \Lambda_{l,m} = m\, \Lambda_{l,m} , \tag{B.65b} \]
\[ (L_1 + iL_2)\, \Lambda_{l,m} = \sqrt{(l-m)(l+m+1)}\; \Lambda_{l,m+1} , \tag{B.65c} \]
\[ (L_1 - iL_2)\, \Lambda_{l,m} = \sqrt{(l+m)(l-m+1)}\; \Lambda_{l,m-1} . \tag{B.65d} \]
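The action (B.65) specifies the operators completely, so one can write them as $(2l+1) \times (2l+1)$ matrices and test the angular momentum algebra numerically. The sketch below is an illustration under assumptions that are not in the text: the basis is ordered as $m = l, l-1, \dots, -l$, and the function names `multiplet` and `max_deviation` are hypothetical.

```python
from math import sqrt


def matmul(p, q):
    n = len(p)
    return [[sum(p[a][c] * q[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]


def multiplet(two_l):
    """Matrices L1, L2, L3 on the multiplet with l = two_l / 2, built from (B.65).

    Basis vector number b carries m = l - b (assumed ordering).
    """
    l = two_l / 2
    ms = [l - k for k in range(two_l + 1)]
    dim = len(ms)
    l3 = [[ms[a] if a == b else 0.0 for b in range(dim)] for a in range(dim)]
    lp = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # L+ maps the state with m (column col) to the state with m + 1 (row col - 1)
        lp[col - 1][col] = sqrt((l - m) * (l + m + 1))
    lm = [[lp[b][a] for b in range(dim)] for a in range(dim)]   # L- = (L+)^dagger
    l1 = [[(lp[a][b] + lm[a][b]) / 2 for b in range(dim)] for a in range(dim)]
    l2 = [[(lp[a][b] - lm[a][b]) / 2j for b in range(dim)] for a in range(dim)]
    return l1, l2, l3


def max_deviation(two_l):
    """Largest violation of [L1, L2] = i L3 and of L^2 = l(l+1) on the multiplet."""
    l = two_l / 2
    dim = two_l + 1
    l1, l2, l3 = multiplet(two_l)
    p12, p21 = matmul(l1, l2), matmul(l2, l1)
    comm_err = max(abs(p12[a][b] - p21[a][b] - 1j * l3[a][b])
                   for a in range(dim) for b in range(dim))
    squares = [matmul(x, x) for x in (l1, l2, l3)]
    casimir_err = max(
        abs(sum(s[a][b] for s in squares) - (l * (l + 1) if a == b else 0.0))
        for a in range(dim) for b in range(dim))
    return max(comm_err, casimir_err)
```

For `two_l = 1` the matrices returned by `multiplet` are one half of the Pauli matrices in the usual convention.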

This results from the angular momentum algebra and the hermiticity as follows. The Casimir operator $\vec{L}^{\,2}$ commutes with each component of the angular momentum $L_1$, $L_2$ and $L_3$ because the Leibniz rule of commutators

\[ [A, BC] = [A, B]\,C + B\,[A, C] \tag{B.66} \]

and (B.63) imply $[L_i, L_j L_j] = i\varepsilon_{ijk}(L_k L_j + L_j L_k)$ which vanishes because it is a double sum of a symmetric with an antisymmetric index pair (5.18). In other words, angular momentum operators generate rotations and therefore leave the length squared of vectors, e.g. $x^2 + y^2 + z^2$ or $\vec{L}^{\,2}$, invariant,

\[ [L_i, \vec{L}^{\,2}] = 0 . \tag{B.67} \]


So the operators $L_i$ map the subspaces $\mathcal{H}_l$ of eigenvectors of $\vec{L}^{\,2}$ with a fixed eigenvalue $\lambda$ to themselves. Such a subspace is spanned by normalized eigenstates $\Lambda_{l,m}$ of $L_3$,

\[ \vec{L}^{\,2} \Lambda_{l,m} = \lambda\, \Lambda_{l,m} , \quad L_3 \Lambda_{l,m} = m\, \Lambda_{l,m} . \tag{B.68} \]

In the complex vector space of linear combinations of angular momenta the sums

\[ L_+ = L_1 + iL_2 , \quad L_- = L_1 - iL_2 = (L_+)^\dagger , \tag{B.69} \]

are eigenvectors of the adjoint transformation $\mathrm{ad}_{L_3}$ (A.36) with eigenvalues $1$ and $-1$,

\[ [L_3, L_+] = L_+ , \quad [L_3, L_-] = -L_- . \tag{B.70} \]

Applied to an eigenstate of $L_3$ with eigenvalue $m$ (B.68), $L_+$ and $L_-$ yield eigenstates with eigenvalues $m+1$ and $m-1$ or vanish, as $[A, B] = \alpha B$ and $Av = \lambda v$ imply

\[ A\,(Bv) = \bigl([A, B] + BA\bigr)\,v = (\alpha + \lambda)\,Bv , \tag{B.71} \]
\[ L_3 \bigl(L_+ \Lambda_{l,m}\bigr) = (m+1)\bigl(L_+ \Lambda_{l,m}\bigr) , \quad L_3 \bigl(L_- \Lambda_{l,m}\bigr) = (m-1)\bigl(L_- \Lambda_{l,m}\bigr) . \tag{B.72} \]

As $L_+$ and $L_-$ raise or lower the eigenvalues of $L_3$ with a constant step size by their $\mathrm{ad}_{L_3}$ eigenvalue, they are called ladder operators or raising and lowering operators. The product $L_- L_+ = (L_1 - iL_2)(L_1 + iL_2) = L_1^2 + L_2^2 + i[L_1, L_2]$ is

\[ L_- L_+ = \vec{L}^{\,2} - L_3^2 - L_3 . \tag{B.73} \]

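As $\Lambda_{l,m}$ is normalized this implies

\[ \langle L_+ \Lambda_{l,m}\,|\,L_+ \Lambda_{l,m} \rangle = \langle \Lambda_{l,m}\,|\,L_- L_+ \Lambda_{l,m} \rangle = \langle \Lambda_{l,m}\,|\,(\vec{L}^{\,2} - L_3^2 - L_3)\,\Lambda_{l,m} \rangle = \langle \Lambda_{l,m}\,|\,(\lambda - m(m+1))\,\Lambda_{l,m} \rangle = \lambda - m(m+1) . \tag{B.74} \]

So we obtain (and similarly from $L_+ L_- = \vec{L}^{\,2} - L_3^2 + L_3$)

\[ \| L_+ \Lambda_{l,m} \|^2 = \lambda - m(m+1) , \quad \| L_- \Lambda_{l,m} \|^2 = \lambda - m(m-1) . \tag{B.75} \]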

These norms are nonnegative (8.4), so for fixed $\lambda$ the quantum numbers $m$ are bounded from above by some maximal $m$ value $l = m_{\max}$ and the raising operator $L_+$ applied to the state with $m = l$ has to vanish. Therefore $\lambda = l(l+1)$, which proves (B.65a). Also $L_-$ vanishes if applied to the state with lowest $m$, $l(l+1) = m_{\min}(m_{\min} - 1)$, implying $m_{\min} = -l$ (as $m_{\min} \le m_{\max}$),

\[ m_{\max} = l , \quad m_{\min} = -l . \tag{B.76} \]

The repeated application of $L_+$ to the state with minimal $L_3$ eigenvalue $m_{\min} = -l$ raises the $m$ quantum number in integer steps until one arrives at the state with $m_{\max} = l$. So for each $m \in \{-l, -l+1, \dots, +l\}$ there is a state $\Lambda_{l,m}$. These states are mutually orthogonal and span a $(2l+1)$-dimensional space, the angular momentum multiplet with total angular momentum $l$, on which the operators $L_i$ act. As the dimension is a natural number, $l \in \{0, \tfrac{1}{2}, 1, \dots\}$ is an integer or half integer.

The norms in (B.75) are quadratic polynomials in $l$ which vanish for $l = m$ or $l = -m$, so they are divisible by $(l - m)$ or $(l + m)$,

\[ l(l+1) - m(m+1) = (l-m)(l+m+1) , \quad l(l+1) - m(m-1) = (l+m)(l-m+1) . \tag{B.77} \]

Choosing the phases by $L_- \Lambda_{l,m} = c_{l\,m}\, \Lambda_{l,m-1}$ with

\[ c_{l\,m} = \sqrt{(l+m)(l-m+1)} , \quad c_{l\,-m} = c_{l\,m+1} = \sqrt{(l-m)(l+m+1)} , \tag{B.78} \]

one obtains $L_+ \Lambda_{l,m} = c_{l\,m+1}\, \Lambda_{l,m+1}$ and proves (B.65).

The angular momentum of a state which does not derive from its dependence on position or momentum is called spin; the corresponding operators and their quantum numbers are denoted by $S$ and $s$ rather than $L$ and $l$. In the simplest example, for $s = 1/2$, the spin-1/2 operators $S_1 = (S_+ + S_-)/2$, $S_2 = (S_+ - S_-)/(2i)$ and $S_3$ map by (B.65) the basis $\Lambda_\uparrow$ ($m = 1/2$) and $\Lambda_\downarrow$ ($m = -1/2$) to states with components which constitute $1/2$ times the first and second columns of the Pauli matrices (6.46),

\[ S_i = \frac{1}{2}\sigma^i , \quad \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{B.79} \]

Their products (6.48) and the unitary transformations (6.51) which they generate are investigated in section 6.3.

In case that the angular momentum operators act on wave functions $\Psi : \mathbb{R}^3 \to \mathbb{C}$ by the differential operators $L = x \times p$,

\[ L_1 = -i\bigl(x^2 \partial_{x^3} - x^3 \partial_{x^2}\bigr) , \quad L_2 = -i\bigl(x^3 \partial_{x^1} - x^1 \partial_{x^3}\bigr) , \quad L_3 = -i\bigl(x^1 \partial_{x^2} - x^2 \partial_{x^1}\bigr) , \tag{B.80} \]

they are called orbital angular momentum. Orbital angular momentum generates rotations of the domain $\mathbb{R}^3$ of the wave functions,

\[ \bigl(e^{-i\alpha\, n \cdot L}\, \Psi\bigr)(x) = \Psi\bigl(D_{\alpha n}^{-1} x\bigr) , \tag{B.81} \]

where $D_{\alpha n}$ denotes the rotation around the axis $n$ by the angle $\alpha$ (6.15). As each rotation by $2\pi$ restores the initial position, $D_{2\pi n} = 1$, orbital angular momentum has only integer values of $m$ and consequently of $l$,

\[ e^{-i\,2\pi L_3}\, \Lambda_{l,m} = e^{-2\pi i\, m}\, \Lambda_{l,m} = \Lambda_{l,m} . \tag{B.82} \]

The representation of $SO(3)$ on the space $L^2(\mathbb{R}^3, d^3x)$ of square integrable functions of $\mathbb{R}^3$ is reducible already because $\mathbb{R}^3$ is the union of spheres $S^2$ of radius $r \in \mathbb{R}_+$ and (B.81) represents rotations already in $L^2(S^2, d\Omega)$. Spherical coordinates $(\theta, \varphi)$ do not continuously cover $S^2$ and yield awkward differential operators as generators of rotations. We prefer stereographic coordinates $(u, v)$ and $(u', v')$ (3.28) for the north and the south ($S^2$ without $(0, 0, -1)$ or $(0, 0, 1)$),

\[ (u, v) = \Bigl(\frac{x}{1+z}, \frac{y}{1+z}\Bigr) , \quad (u', v') = \Bigl(\frac{x}{1-z}, \frac{y}{1-z}\Bigr) , \]
\[ (x, y, z) = \Bigl(\frac{2u}{1+u^2+v^2}, \frac{2v}{1+u^2+v^2}, \frac{1-u^2-v^2}{1+u^2+v^2}\Bigr) = \Bigl(\frac{2u'}{1+u'^2+v'^2}, \frac{2v'}{1+u'^2+v'^2}, \frac{-1+u'^2+v'^2}{1+u'^2+v'^2}\Bigr) . \tag{B.83} \]

In their common domain these coordinates are related by inversion at the unit circle

\[ I : (u, v) \mapsto (u', v') = \Bigl(\frac{u}{u^2+v^2}, \frac{v}{u^2+v^2}\Bigr) . \tag{B.84} \]
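The relations (B.83) and (B.84) are easy to confirm numerically. The following sketch (function names are hypothetical, chosen for this illustration) maps northern coordinates to the sphere and back, and compares the southern coordinates of the same point with the inversion at the unit circle.

```python
def north(x, y, z):
    """Northern stereographic coordinates (u, v); undefined at the south pole z = -1."""
    return (x / (1 + z), y / (1 + z))


def south(x, y, z):
    """Southern stereographic coordinates (u', v'); undefined at the north pole z = 1."""
    return (x / (1 - z), y / (1 - z))


def from_north(u, v):
    """Point (x, y, z) on the unit sphere with northern coordinates (u, v)."""
    s = 1 + u * u + v * v
    return (2 * u / s, 2 * v / s, (1 - u * u - v * v) / s)


def inversion(u, v):
    """Inversion at the unit circle, (u, v) -> (u, v) / (u^2 + v^2)."""
    s = u * u + v * v
    return (u / s, v / s)


def close(p, q, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(p, q))
```

For example, the point with $(u, v) = (0.3, -1.2)$ lies on the unit sphere, and `south(*from_north(0.3, -1.2))` agrees with `inversion(0.3, -1.2)` up to rounding.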

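The infinitesimal rotations $(l_{ij} x)^k = -(\delta_i{}^k x^j - \delta_j{}^k x^i)$ (6.22) change the stereographic coordinates by $(l_{ij} x)^k \partial_{x^k} = x^i \partial_{x^j} - x^j \partial_{x^i}$ applied to $(u, v)$,

\[ l_{23} \begin{pmatrix} u \\ v \end{pmatrix} = - \begin{pmatrix} u v \\ \tfrac{1}{2}(1 - u^2 + v^2) \end{pmatrix} , \quad l_{31} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2}(1 + u^2 - v^2) \\ u v \end{pmatrix} , \quad l_{12} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} -v \\ u \end{pmatrix} . \tag{B.85} \]

These changes are the negative components (page 147) of the differential operators which represent the infinitesimal rotations on $L^2(S^2)$, and multiplication with $i$ yields the angular momentum operators in stereographic coordinates acting on states with helicity $h$ (8.95),

\[ \begin{aligned} (L_1 \Psi)_N(u, v) &= i\Bigl(u v\, \partial_u + \tfrac{1}{2}(1 - u^2 + v^2)\, \partial_v\Bigr) \Psi_N(u, v) + h\, u\, \Psi_N(u, v) , \\ (L_2 \Psi)_N(u, v) &= i\Bigl(-\tfrac{1}{2}(1 + u^2 - v^2)\, \partial_u - u v\, \partial_v\Bigr) \Psi_N(u, v) + h\, v\, \Psi_N(u, v) , \\ (L_3 \Psi)_N(u, v) &= i\bigl(-u\, \partial_v + v\, \partial_u\bigr) \Psi_N(u, v) + h\, \Psi_N(u, v) . \end{aligned} \tag{B.86} \]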

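In the south they act on $\Psi_S(u', v') = \bigl((u + iv)/(u - iv)\bigr)^h\, \Psi_N(u, v)$ (8.97) by (8.98)

\[ \begin{aligned} (L_1 \Psi)_S(u', v') &= i\Bigl(-u' v'\, \partial_{u'} - \tfrac{1}{2}(1 - u'^2 + v'^2)\, \partial_{v'}\Bigr) \Psi_S(u', v') + h\, u'\, \Psi_S(u', v') , \\ (L_2 \Psi)_S(u', v') &= i\Bigl(\tfrac{1}{2}(1 + u'^2 - v'^2)\, \partial_{u'} + u' v'\, \partial_{v'}\Bigr) \Psi_S(u', v') + h\, v'\, \Psi_S(u', v') , \\ (L_3 \Psi)_S(u', v') &= i\bigl(-u'\, \partial_{v'} + v'\, \partial_{u'}\bigr) \Psi_S(u', v') - h\, \Psi_S(u', v') . \end{aligned} \tag{B.87} \]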

The generators are hermitean with respect to the scalar product (it has the same form in southern coordinates)

\[ \langle \Phi\,|\,\Psi \rangle = \int du\, dv\, \frac{4}{(1 + u^2 + v^2)^2}\; \Phi^*(u, v)\, \Psi(u, v) \tag{B.88} \]

with the rotation invariant integration measure, the surface element of the sphere expressed in stereographic coordinates,

\[ d\Omega = d\cos\theta\, d\varphi = dz\; d\arctan(x/y) = du\, dv\, \frac{4}{(1 + u^2 + v^2)^2} . \tag{B.89} \]
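A quick consistency check of the measure (B.89) is that it integrates to the full solid angle $4\pi$. The sketch below is an illustration, not taken from the text: it uses polar coordinates $\rho = \sqrt{u^2 + v^2}$ in the stereographic plane, the substitution $\rho = \tan t$ (an assumption chosen here to map the infinite range to a finite interval) and a plain midpoint rule.

```python
from math import cos, pi, tan


def total_solid_angle(steps=2000):
    """Integrate 4 / (1 + u^2 + v^2)^2 over the whole (u, v) plane.

    In polar coordinates du dv = 2 pi rho d rho; with rho = tan(t),
    d rho = dt / cos(t)^2 and t runs over (0, pi/2).
    """
    h = (pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h                      # midpoint of the k-th subinterval
        rho = tan(t)
        density = 4.0 / (1.0 + rho * rho) ** 2
        total += density * 2 * pi * rho / cos(t) ** 2 * h
    return total
```

After the substitution the integrand reduces to $8\pi \sin t \cos t$, so the midpoint rule converges quickly to $4\pi \approx 12.566$.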


A state $\Psi \in L^2(S^2, d\Omega)$ belongs to the domain of the algebra of the generators $L_i$ if $\Psi_N$ is smooth in $\mathbb{R}^2$ and if $\Psi_S = e^{2ih\varphi(u,v)}\, \Psi_N \circ I^{-1}$ can be smoothly continued to $(u', v') = 0$. Then $(L_+)^{n_+} (L_-)^{n_-} (L_3)^{n_3} \Psi$ is square integrable with respect to the scalar product (B.88) for each multiindex $(n_+, n_-, n_3) \in \mathbb{N}_0^3$.

Though the inducing representation $R_h : D_\delta \mapsto e^{-ih\delta}$ of $H = SO(2)$ is irreducible this does not make the induced representation of $G = SO(3)$ irreducible. For each $l \in \{|h|, |h|+1, \dots\}$ the space $L^2(S^2)$ contains exactly one angular momentum multiplet (8.106). In case $h = 0$ it is spanned by the spherical harmonic functions $Y_{lm}$. For different $|h|$, the generators (B.86) are not unitarily equivalent in $L^2(S^2)$ as the spectrum of $\vec{L}^{\,2}$ differs.

For $h = 0$ and in terms of the complex variables and complex partial derivatives

\[ w = u + iv , \quad \bar{w} = u - iv , \quad \partial = \tfrac{1}{2}(\partial_u - i\partial_v) , \quad \bar{\partial} = \tfrac{1}{2}(\partial_u + i\partial_v) , \tag{B.90} \]

the angular momentum operators in the north split into linear combinations of

\[ l_n = -w^{n+1}\partial , \quad \bar{l}_n = -\bar{w}^{n+1}\bar{\partial} , \tag{B.91} \]

\[ \begin{aligned} L_1 &= -\tfrac{1}{2}(1 - w^2)\,\partial + \tfrac{1}{2}(1 - \bar{w}^2)\,\bar{\partial} = \tfrac{1}{2}(l_{-1} - l_1) - \tfrac{1}{2}(\bar{l}_{-1} - \bar{l}_1) , \\ L_2 &= -\tfrac{i}{2}(1 + w^2)\,\partial - \tfrac{i}{2}(1 + \bar{w}^2)\,\bar{\partial} = \tfrac{i}{2}(l_{-1} + l_1) + \tfrac{i}{2}(\bar{l}_{-1} + \bar{l}_1) , \\ L_3 &= w\,\partial - \bar{w}\,\bar{\partial} = -l_0 + \bar{l}_0 , \end{aligned} \tag{B.92} \]

which together with $\partial w = \bar{\partial}\bar{w} = 1$, $\partial \bar{w} = \bar{\partial} w = 0$ are easily checked to satisfy the angular momentum algebra on account of the two copies of the Witt algebra

\[ [l_n, l_m] = (n - m)\, l_{n+m} , \quad [\bar{l}_n, \bar{l}_m] = (n - m)\, \bar{l}_{n+m} , \quad [l_n, \bar{l}_m] = 0 , \tag{B.93} \]

which hold for $n \in \mathbb{Z}$ in the space of smooth functions of $\mathbb{R}^2 - \{0\}$. The subalgebra for $n = -1, 0, 1$ leaves invariant the subspace of differentiable functions of $\mathbb{R}^2$ which composed with inversion at the unit circle $I : w \mapsto 1/\bar{w}$ can be differentiably continued to $0$. This inversion maps the two subalgebras to each other, $I\, l_n\, I^{-1} = -\bar{l}_{-n}$.

The southern coordinate $z' = u' - iv'$ has the holomorphic transition function $z'(w) = 1/w$ and exhibits the Riemann sphere $S^2$ as one-dimensional complex manifold $\mathbb{CP}^1$. For $h = 0$ rotations map the vector space of meromorphic functions, $\bar{\partial}\Psi = 0$, to itself. However, apart from the constant function they are not contained in a Hilbert space in which rotations are unitarily represented.
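The Witt algebra relation $[l_n, l_m] = (n - m)\, l_{n+m}$ for $l_n = -w^{n+1}\partial$ can be verified on Laurent polynomials in $w$, where $l_n w^p = -p\, w^{p+n}$. The sketch below is an illustration with hypothetical helper names; a polynomial is represented as a dictionary mapping powers to integer coefficients, so all comparisons are exact.

```python
def apply_l(n, poly):
    """Apply l_n = -w^(n+1) d/dw to a Laurent polynomial {power: coefficient}."""
    result = {}
    for power, coef in poly.items():
        value = -power * coef                  # l_n w^p = -p w^(p+n)
        if value != 0:
            result[power + n] = value
    return result


def subtract(p, q):
    """Difference p - q with vanishing coefficients removed."""
    result = dict(p)
    for power, coef in q.items():
        result[power] = result.get(power, 0) - coef
    return {power: coef for power, coef in result.items() if coef != 0}


def scale(factor, poly):
    return {power: factor * coef for power, coef in poly.items()
            if factor * coef != 0}


def witt_relation_holds(n, m, poly):
    """Check [l_n, l_m] f = (n - m) l_(n+m) f on the given polynomial f."""
    lhs = subtract(apply_l(n, apply_l(m, poly)), apply_l(m, apply_l(n, poly)))
    rhs = scale(n - m, apply_l(n + m, poly))
    return lhs == rhs
```

Restricting $n, m$ to $-1, 0, 1$ tests the subalgebra from which the angular momentum operators in (B.92) are built.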

B.6 The Wightman Distribution ∆+

We [8, § 15] evaluate the distribution

\[ \Delta_+(x) = \int \frac{d^3p}{(2\pi)^3\, 2p^0}\; e^{-i p \cdot x} , \quad p^0 = \sqrt{m^2 + \mathbf{p}^2} \tag{B.94} \]


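in spherical coordinates, $p \cdot x = \sqrt{m^2 + k^2}\, x^0 - k\, r \cos\theta$, $k = |\mathbf{p}|$, $r = |\mathbf{x}|$. The integration over the angle $\varphi$ simply yields $2\pi$ and the integration over $\cos\theta$ contributes

\[ \begin{aligned} \Delta_+(x) &= \frac{2\pi}{(2\pi)^3} \int_0^\infty \frac{dk\, k^2}{2\sqrt{m^2 + k^2}} \int_{-1}^{+1} d\cos\theta\; e^{i k r \cos\theta}\, e^{-i\sqrt{m^2 + k^2}\, x^0} \\ &= \frac{1}{8\pi^2} \int_0^\infty \frac{dk\, k^2}{\sqrt{m^2 + k^2}}\, \frac{1}{i k r} \bigl(e^{ikr} - e^{-ikr}\bigr)\, e^{-i\sqrt{m^2 + k^2}\, x^0} \\ &= \frac{1}{8 i \pi^2 r} \int_{-\infty}^{\infty} \frac{dk\, k}{\sqrt{m^2 + k^2}}\; e^{i(kr - \sqrt{m^2 + k^2}\, x^0)} \end{aligned} \tag{B.95} \]

leading to

\[ \Delta_+(x) = -\frac{1}{4\pi r} \frac{\partial}{\partial r} f(x^0, r) , \quad f(x^0, r) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{dk}{\sqrt{m^2 + k^2}}\; e^{-i(\sqrt{m^2 + k^2}\, x^0 - kr)} . \tag{B.96} \]

The substitution

\[ k = m \operatorname{sh}\varphi \tag{B.97} \]

casts the integral into its standard form. One has $\sqrt{m^2 + k^2} = m \operatorname{ch}\varphi$, $dk = d\varphi\; m \operatorname{ch}\varphi$,

\[ f(x^0, r) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{-i m (x^0 \operatorname{ch}\varphi - r \operatorname{sh}\varphi)} . \tag{B.98} \]

If $x$ lies in the forward lightcone, $x^0 > r$, then with $x^0 = \sqrt{\lambda} \operatorname{ch}\varphi_0$, $r = \sqrt{\lambda} \operatorname{sh}\varphi_0$ and

\[ \lambda = x \cdot x = (x^0)^2 - r^2 , \tag{B.99} \]

with the addition theorems of the hyperbolic functions and shifting the integration variable, one reduces the integral to the Bessel functions $J_0$ and $N_0$ [23, eq. 3.714],

\[ \begin{aligned} x^0 > r : \quad f(x^0, r) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{-i m \sqrt{\lambda} (\operatorname{ch}\varphi \operatorname{ch}\varphi_0 - \operatorname{sh}\varphi \operatorname{sh}\varphi_0)} \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{-i m \sqrt{\lambda} \operatorname{ch}(\varphi - \varphi_0)} \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{-i m \sqrt{\lambda} \operatorname{ch}\varphi} = -\frac{1}{2} \bigl(N_0(m\sqrt{\lambda}) + i J_0(m\sqrt{\lambda})\bigr) . \end{aligned} \tag{B.100} \]

For spacelike $x$, for $r > |x^0|$, we put $r = \sqrt{-\lambda} \operatorname{ch}\varphi_0$ and $x^0 = \sqrt{-\lambda} \operatorname{sh}\varphi_0$ and obtain

\[ r > |x^0| : \quad f(x^0, r) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{-i m \sqrt{-\lambda} \operatorname{sh}\varphi} = \frac{1}{\pi} K_0(m\sqrt{-\lambda}) . \tag{B.101} \]

For $x$ in the backward lightcone, $-x^0 > r$, with $x^0 = -\sqrt{\lambda} \operatorname{ch}\varphi_0$ and $r = \sqrt{\lambda} \operatorname{sh}\varphi_0$ we get similarly

\[ -x^0 > r : \quad f(x^0, r) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\varphi\; e^{i m \sqrt{\lambda} \operatorname{ch}\varphi} = -\frac{1}{2} \bigl(N_0(m\sqrt{\lambda}) - i J_0(m\sqrt{\lambda})\bigr) . \tag{B.102} \]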

d d J0 (z), N1 (z) = − dz N0 (z), The functions J0 , N0 , K0 and their derivatives J1 (z) = − dz d K1 (z) = − dz K0 (z), behave for large arguments asymptotically as [23, section 8.45]

2 sin z 2 sin z 2 z → ∞ : J0 (z) = − √ , N0 (z) = √ , K0 (z) = − √ e−z , πz πz πz (B.103) 2 cosz 2 −z 2 cos z , N1 (z) = − √ , K1 (z) = √ e , J1 (z) = √ πz πz πz while for small arguments one has [23, section 8.40] z2 z + O(z4 ) , J1 (z) = + O(z3 ) , 4 2 z 2 z z 2 z2 2 2 N0 (z) = (1 − ) ln + C + O(z ln z) , N1 (z) = − + ln + O(z ln z) , π 4 2 π πz π 2 z2 z 1 z z K0 (z) = −(1 + ) ln − C + O(z2 ln z) , K1 (z) = − + ln + O(z ln z) . 4 2 z 2 2 (B.104) Here C is Euler’s constant [23, section 8.367] Z ∞ dt e−t lnt ≈ 0.577 214 . (B.105) C=− J0 (z) = 1 −

0

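Allowing for the different cases of spacelike, future and past $x$ by the step function $\theta(\lambda)$ and the sign function $\varepsilon(x^0)$

\[ \theta(\lambda) = \begin{cases} 1 & \text{if } \lambda \ge 0 \\ 0 & \text{if } \lambda < 0 \end{cases} , \quad \varepsilon(x^0) = \begin{cases} 1 & \text{if } x^0 \ge 0 \\ -1 & \text{if } x^0 < 0 \end{cases} \tag{B.106} \]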

and using ∂ /∂ r = (−2r)∂ /∂ λ one obtains p √ √   1 1 ∂ 1 θ (λ ) −N0 (m λ ) + iε (x0 ) J0 (m λ ) + θ (−λ ) K0 (m −λ ) 2π ∂ λ 2 π 2  1 0 (B.107) δ (λ ) −N0 + iε (x ) J0 − K0 + = 4π π p √ √  1 m m 1 θ (−λ )K1 (m −λ ) . + √ θ (λ ) N1 (m λ ) − iε (x0 ) J1 (m λ ) − 2 √ 8π λ 4π −λ

∆+ (x) =

The factor of δ (λ ) is evaluated at λ = 0,

lim −N0 (z) + iε (x0 )J0 (z) −

z→0

We arrive at

 2 K0 (z) = iε (x0 ) . π

(B.108)


√ √  i 1 m √ θ (λ ) N1 (m λ ) − iε (x0 ) J1 (m λ ) − ε (x0 )δ (λ ) + 4π 8π λ (B.109) p 1 m √ θ (−λ )K1 (m −λ ) . − 2 4π −λ p p p  For small m |λ | and up to an error of order O m |λ | ln(m |λ |) this is

∆+ (x) =

p 1 m2 m |λ | i m2 0 ∆+ (x) ≈ ε (x )δ (λ ) − 2 + 2 ln ε (x0 )θ (λ ) −i 4π 4π λ 8π 2 16π

(B.110)

and for m = 0

i 1 ε (x0 )δ (λ ) − 2 . (B.111) 4π 4π λ For all values of the mass the distribution ∆+ (x) has a δ function singularity on the lightcone and a pole −1/(4 π 2 x · x) . Proportional to the mass squared it has a p logarithmic singularity ln |x · x| and a discontinuity θ (x · x)ε (x0 ) . The vacuum expectation value of the time ordered product of two scalar fields, the propagator, is ∆+ (x) for positive times and ∆+ (−x) for negative times

∆+ |m=0 (x) =

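\[ \langle \mathrm{T}\,\phi(x)\phi(0) \rangle := \langle \Omega\,|\,\mathrm{T}\,\phi(x)\phi(0)\,\Omega \rangle = \theta(x^0)\,\Delta_+(x) + \theta(-x^0)\,\Delta_+(-x) . \tag{B.112} \]

It is an even function of $x$ and does not vanish for spacelike arguments, $x \cdot x < 0$. With $z = m\sqrt{|\lambda|}$,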

  m i 1 m 1 √ θ (λ ) N1 (z)−i J1 (z) − 2 √ θ (−λ )K1 (z) . δ (λ )+ 4π 8π λ 4π −λ (B.113) By contrast, the commutator is uneven and vanishes for spacelike separations

hT φ (x)φ (0)i =

h[Φ (x), Φ (0)]i = ∆+ (x)− ∆+ (−x) =

√ i i m √ θ (λ )ε (x0 ) J1 (m λ ) . ε (x0 )δ (λ )− 2π 4π λ (B.114)

Appendix C

Systems of Units

Physical data $Q$ are products of error intervals of numbers $z \in \mathbb{R}$ and powers of units $q = (q_1, q_2, \dots, q_n)$, which are also called the dimension of the data. One can add and multiply data. In mathematical terms they constitute a polynomial algebra of units and inverse units.

The current standard metric system, the International System of Units (Système international d'unités or SI), is based on the metre (m), kilogram (kg) and second (s) as well as kelvin (K), ampere (A), candela (cd) and mole (mol). In physics one also needs some roots of these units, e.g. in quantum physics where the modulus squared of the position wave function $\tilde{\psi} : \mathbb{R}^3 \to \mathbb{C}$, $x \mapsto \tilde{\psi}(x)$ is a probability density. Therefore the position wave function $\tilde{\psi}(x)$ has SI units $\mathrm{m}^{-3/2}$.

The sum of two data of different dimension is direct: 299 792 458 m + 5 kg can be uniquely decomposed into 299 792 458 m and 5 kg, which is why one usually keeps them separate and considers only sums of data of the same dimension. However, it is not always clear what exactly it means that units are incommensurable. Is "1 foot" different from "$1.646 \cdot 10^{-4}$ nautical miles"? Definitely yes for pilots, as height and distance are completely different in aviation, which is why they are measured in different units. On the other hand, tree fellers transform height into length with the above factor.

Are "kilo" or "milli" units? Definitely not, they denote the numbers 1000 and 1/1000. But just the same holds for "1 mole". It designates the Avogadro number

\[ 1\ \mathrm{mol} = N_A = 6.022\,142 \cdot 10^{23} , \tag{C.1} \]

the number of atoms in 12 g of carbon $^{12}\mathrm{C}$.

The ratio of the arc length to the radius is the same for all segments of circles with the same angle. So one can specify the angle by this dimensionless ratio. The unit radian (symbol rad) is redundant, because 1 rad is 1. The unit just signifies that the number designates an angle. The complete circle corresponds to the angle $2\pi$. It is divided into 360 degrees of arc. So one degree is the number $1^\circ = \pi/180 \approx 0.017\,453\,3$.

The spherical angle of a part of the sphere $S^2$ is the size of its area divided by the square of the radius. This dimensionless ratio of areas amounts to $4\pi$ in case of the complete sphere. The unit steradian or square radian (symbol sr) only means that the number denotes a spherical angle, 1 sr = 1. Caution is required with data which are specified per radian or per square radian, e.g. the brightness of a candle, the current density of energy per spherical angle. The overall flux is the integral over the unit sphere and, in case that the density is isotropic, $4\pi$ times the density per spherical angle.

Many products of units carry their own names. Together with important physical constants they are listed in the following table. The numbers in the columns m, kg, s and A denote the exponents of these units, e.g. the line "electric constant" reads

\[ \varepsilon_0 = 10^7/(4\pi z_c^2)\ \mathrm{m}^{-3}\,\mathrm{kg}^{-1}\,\mathrm{s}^{4}\,\mathrm{A}^{2} \quad \text{where } z_c = 299\,792\,458 . \tag{C.2} \]
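As a small numerical illustration of these dimensionless conversion factors and of (C.2), the sketch below evaluates the degree of arc, the electric constant and its relation to the magnetic constant $4\pi \cdot 10^{-7}$ listed in the table that follows. The constant and function names are hypothetical, chosen for this example.

```python
from math import pi

Z_C = 299_792_458            # numerical value of the velocity of light in m/s

DEGREE = pi / 180            # one degree of arc as a pure number
FULL_SPHERE = 4 * pi         # spherical angle of the complete sphere


def electric_constant():
    """Numerical value of epsilon_0 in m^-3 kg^-1 s^4 A^2, from (C.2)."""
    return 1e7 / (4 * pi * Z_C ** 2)


def magnetic_constant():
    """Numerical value of mu_0 in m kg s^-2 A^-2, as listed in table C.1."""
    return 4 * pi * 1e-7
```

With these values the product $\varepsilon_0\, \mu_0\, c^2$ equals 1 up to rounding.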

Table C.1: Units and Physical Constants

Quantity                 Unit       Symbol  Value                   m   kg    s    A
length                   metre      m       1                       1    0    0    0
mass                     kilogram   kg      1                       0    1    0    0
time                     second     s       1                       0    0    1    0
current                  ampere     A       1                       0    0    0    1
velocity of light                   c       z_c                     1    0   -1    0
electric constant                   ε0      10^7/(4π z_c^2)        -3   -1    4    2
reduced Planck constant             ħ       1.054 571 6 ·10^−34     2    1   -1    0
Newton constant                     G_N     6.673 ·10^−11           3   -1   -2    0
magnetic constant                   µ0      4π ·10^−7               1    1   -2   -2
charge                   coulomb    C       1                       0    0    1    1
capacitance              farad      F       1                      -2   -1    4    2
dose, energy per mass    gray       Gy      1                       2    0   -2    0
frequency                hertz      Hz      1                       0    0   -1    0
inductance               henry      H       1                       2    1   -2   -2
energy                   joule      J       1                       2    1   -2    0
force                    newton     N       1                       1    1   -2    0
resistance               ohm        Ω       1                       2    1   -3   -2
pressure, stress         pascal     Pa      1                      -1    1   -2    0
conductance              siemens    S       1                      -2   -1    3    2
magnetic field (SI)      tesla      T       1                       0    1   -2   -1
voltage                  volt       V       1                       2    1   -3   -1
power                    watt       W       1                       2    1   -3    0
magnetic flux (SI)       weber      Wb      1                       2    1   -2   -1

Becquerel and hertz are the same unit, Bq = Hz. Decays or oscillations per second are both a number per second.


The reaction rate or catalytic activity of one mole per second is called katal in chemistry and abbreviated by the symbol kat = mol/s. The unit candela for luminous intensity is defined as the energy current of cd = 1/683 watt per steradian .

(C.3)

A candle with an isotropic intensity of 1 cd radiates $4\pi/683$ watt. Candela and watt are numerically related just as foot and nautical mile. Apparent temperature, relative luminance and equivalent energy dose are examples of biological quantities which include effects on the human body, e.g. the sensitivity of human eyes.

If two units are universally related then one can identify their dimension and convert them into each other. This simplifies the equations of physics and uncovers their content. For example, with the help of the Boltzmann constant

\[ k_B = 8.617\,342(15) \cdot 10^{-5}\ \mathrm{eV\,K^{-1}} \tag{C.4} \]

one can convert the unit of temperature, kelvin (symbol K), to a unit of energy, electron volt (eV),

\[ k_B = 1 \;\Leftrightarrow\; 1\ \mathrm{K} = 8.617\,342(15) \cdot 10^{-5}\ \mathrm{eV} , \quad 11\,600\ \mathrm{K} = 1\ \mathrm{eV} , \tag{C.5} \]

just as one converts electron volt (eV) to joule (J),

\[ 1\ \mathrm{eV} = 1.602\,176\,462(63) \cdot 10^{-19}\ \mathrm{J} . \tag{C.6} \]

To the surface temperature of the sun, 5 780 K, there corresponds light with a mean photon energy of roughly 0.5 eV, more than an order of magnitude smaller than the ionization energy of the hydrogen atom, the Rydberg energy Ry = 13.6 eV. This explains why photons which are radiated from singly excited atomic hydrogen with an energy of 3/4 Ry are far in the ultraviolet and invisible to the human eye. If one considers temperatures as energies, then the Boltzmann constant $k_B$ is 1 just as the ratio of $2\pi$ to 360 degrees of arc.

Originally all units had been chosen arbitrarily because the continuum of imaginable values did not single out a distinguished value. With the progress of physics this has changed. The velocity of light in the vacuum, $c$, the reduced Planck constant $\hbar$, the electric constant $\varepsilon_0$ and the Newton constant $G_N$ can replace the SI units. The numerical factor, though, for the units $\varepsilon_0$ and $G_N$ is ambiguous: Heaviside-Lorentz units employ $\varepsilon_0$ and cast the Maxwell equations into their simplest form. But then the Coulomb potential is more complicated than in Gauß units of $4\pi\varepsilon_0$. The Newton constant $G_N$ together with the velocity of light $c$ and the Planck constant $\hbar$ marks the energy scale of the Planck mass

\[ m_{\text{Planck}}\, c^2 = \sqrt{\frac{\hbar c^5}{G_N}} = 1.220\,9 \cdot 10^{19}\ \mathrm{GeV} . \tag{C.7} \]
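The conversions used in this passage can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; it takes the numerical values quoted in (C.4), (C.6) and table C.1, and the function names are hypothetical.

```python
from math import sqrt

K_B_EV_PER_K = 8.617342e-5      # Boltzmann constant in eV/K, from (C.4)
EV_IN_JOULE = 1.602176462e-19   # 1 eV in joule, from (C.6)
HBAR = 1.0545716e-34            # reduced Planck constant in J s (table C.1)
C = 299_792_458.0               # velocity of light in m/s
G_N = 6.673e-11                 # Newton constant in m^3 kg^-1 s^-2 (table C.1)


def kelvin_to_ev(temperature_kelvin):
    """Energy k_B T in eV that corresponds to a temperature in kelvin."""
    return K_B_EV_PER_K * temperature_kelvin


def planck_energy_gev():
    """Planck energy sqrt(hbar c^5 / G_N), converted from joule to GeV."""
    energy_joule = sqrt(HBAR * C ** 5 / G_N)
    return energy_joule / EV_IN_JOULE / 1e9
```

`kelvin_to_ev(5780)` gives about 0.5 eV, and `planck_energy_gev()` reproduces the order $1.22 \cdot 10^{19}$ GeV of (C.7); the last digits depend on the value used for $G_N$.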


However, if one investigates theoretically processes in which quantum theory, the finite velocity of light and gravity are all together essential, namely the gravitational scattering of high energy particles, then already at an energy which is a factor $\sqrt{32\pi} \approx 10$ smaller than the Planck energy gravity becomes so strong that the hitherto successful methods to evaluate quantum field theory (perturbation theory, Feynman diagrams) fail. So the energy scale at which gravity becomes strong is probably a factor 10 smaller than $10^{19}$ GeV.

As regards $c$ and $\hbar$ also the numerical factors of the natural units are fixed: the velocity of light $c$, not some multiple, is the limit velocity of each observed transport of energy and momentum. Similarly $\hbar$, not a multiple, is the helicity of a photon and the smallest conceivable change of an angular momentum.

It depends on the considered domain of physics which of the physical constants are suitable units. Relativistic physics uncovers its relations of time and space in units of the velocity of light $c$ if one converts metre, originally defined as the 40 000 000th part of the equatorial perimeter of the earth, to seconds (1.2),

\[ c = 1 \;\Leftrightarrow\; 1\ \mathrm{s} = 299\,792\,458\ \mathrm{m} . \tag{C.8} \]

Quantum theory simplifies in terms of the unit $\hbar$. Electrostatics simplifies in units of $\varepsilon_0$ or $4\pi\varepsilon_0$ rather than ampere. The electric constant $\varepsilon_0$ enters the potential of the hydrogen atom as $e^2/(4\pi\varepsilon_0)$ together with the charge $e$ of the electron. This combination of units has the dimension length times energy, as one obtains the potential energy of two charges by division by their distance $r$. Also $\hbar c$ has this dimension as $\hbar$ has the dimension of an action, i.e. of the product of energy with time or momentum with length. Therefore the fine structure constant

\[ \alpha = \frac{e^2}{4\pi\varepsilon_0\, \hbar c} = 7.297\,352\,569\,8(24) \cdot 10^{-3} \approx \frac{1}{137} \tag{C.9} \]

is a dimensionless number, independent of units, which characterizes the strength of the electromagnetic interactions not only of the electron. The fine structure constant is universal, because all other charges are integer multiples (in case of quarks multiples of one third) of the electron charge. This quantization of charge remains mysterious in Maxwell theory and becomes understandable only in quantum field theory where so called anomalies have to cancel.

The electron mass $m$ corresponds to the rest energy $m c^2 \approx 511$ keV of the electron. The energy scale of the hydrogen atom, the Rydberg energy, is a dimensionless multiple, i.e. a product with a function of the fine structure constant $\alpha$. As the Schrödinger equation which determines the energies does not contain $c$, this function has to be $\alpha^2$. So the Rydberg energy is (the factor 1/2 follows from the detailed calculation of the Schrödinger equation)

\[ \mathrm{Ry} = \frac{1}{2}\, \alpha^2\, m c^2 \approx 13.6\ \mathrm{eV} . \tag{C.10} \]

The quantum of action $\hbar$ has the dimension of length times momentum. Division by $m c$ therefore yields a length, the reduced Compton wave length of the electron,

\[ \lambda = \frac{\hbar}{m c} = 3.861\,592\,642(28) \cdot 10^{-13}\ \mathrm{m} . \tag{C.11} \]
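The atomic scales above follow from $\alpha$, $\hbar$, $c$ and the electron rest energy alone. The sketch below redoes the arithmetic with the rounded value $m c^2 = 511$ keV quoted in the text, so the results agree with the quoted numbers only to about four digits; the function names are hypothetical. It also evaluates $\lambda/\alpha$, the length scale that no longer contains $c$.

```python
ALPHA = 7.2973525698e-3           # fine structure constant, from (C.9)
ELECTRON_REST_ENERGY_EV = 511e3   # m c^2 in eV, rounded value from the text
HBAR = 1.0545716e-34              # reduced Planck constant in J s
C = 299_792_458.0                 # velocity of light in m/s
EV_IN_JOULE = 1.602176462e-19     # 1 eV in joule, from (C.6)


def rydberg_ev():
    """Rydberg energy (1/2) alpha^2 m c^2 in eV, as in (C.10)."""
    return 0.5 * ALPHA ** 2 * ELECTRON_REST_ENERGY_EV


def reduced_compton_wavelength_m():
    """hbar / (m c) = hbar c / (m c^2) in metre, as in (C.11)."""
    hbar_c_ev_m = HBAR * C / EV_IN_JOULE
    return hbar_c_ev_m / ELECTRON_REST_ENERGY_EV


def bohr_radius_m():
    """hbar / (alpha m c) in metre."""
    return reduced_compton_wavelength_m() / ALPHA
```

The three functions return roughly 13.6 eV, $3.86 \cdot 10^{-13}$ m and $0.529 \cdot 10^{-10}$ m.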

As the Schrödinger equation of the hydrogen atom contains no $c$, its length scale is the Bohr radius

\[ a = \frac{\hbar}{\alpha\, m c} \approx 0.529 \cdot 10^{-10}\ \mathrm{m} . \tag{C.12} \]

In a more systematic notation the physical constants and other units $Q_1, Q_2, \dots$ are given by numbers $Z_i \in \mathbb{R}_+$, $Z_i > 0$, and powers of the SI units $q_1, q_2, \dots$,

\[ Q_i = Z_i\, q_1^{e_{1i}}\, q_2^{e_{2i}} \dots q_n^{e_{ni}} = Z_i \prod_{k=1}^{n} (q_k)^{e_{ki}} . \tag{C.13} \]

If the vectors $e_i = (e_{1i}, e_{2i}, \dots, e_{ni}) \in \mathbb{R}^n$, which combine the exponents, are linearly dependent, $\sum_i e_i \lambda_i = 0$, then the corresponding powers of the physical constants combine to numbers $Z(\lambda)$ which are independent of the chosen units, e.g. the fine structure constant or all ratios of particle masses,

\[ Q_1^{\lambda_1} Q_2^{\lambda_2} \dots = \Bigl(\prod_i Z_i^{\lambda_i}\Bigr)\, q_1^{\sum_i e_{1i}\lambda_i}\, q_2^{\sum_i e_{2i}\lambda_i} \dots = \prod_i Z_i^{\lambda_i} = Z(\lambda) . \tag{C.14} \]

On the other hand, if $n$ exponent vectors $e_1, e_2, \dots, e_n$ in (C.13) are linearly independent, they constitute a basis and the rescaled quantities

\[ \hat{Q}_i = \frac{Q_i}{Z_i} = \prod_{k=1}^{n} (q_k)^{e_{ki}} \tag{C.15} \]

can be solved for the units $q_j$,

\[ q_j = \prod_{i=1}^{n} \hat{Q}_i^{\,E_{ij}} , \quad \sum_i e_{ki}\, E_{ij} = \delta_{kj} . \tag{C.16} \]

Suitable roots of $Q_1, Q_2, \dots, Q_n$ can therefore be used as units. Roots arise at such a change of the basis, as the elements of a matrix inverse $E$ of an integer matrix $e$ are usually not integer but rational. The equation shows that the trivial task to eliminate a unit in favour of a physical constant requires the inversion of a matrix in case of several units.

Solved for metre and ampere, the velocity of light $c/z_c = \mathrm{m\,s^{-1}}$ and the electric constant $4\pi z_c^2\, 10^{-7}\, \varepsilon_0 = \mathrm{m^{-3}\,kg^{-1}\,s^{4}\,A^{2}}$ yield

\[ \mathrm{m} = \frac{1}{z_c}\; \mathrm{s}^{1}\, \mathrm{kg}^{0}\, c^{1}\, \varepsilon_0^{\,0} , \quad \mathrm{A} = \Bigl(\frac{4\pi}{10^7 z_c}\Bigr)^{\frac{1}{2}} \mathrm{s}^{-\frac{1}{2}}\, \mathrm{kg}^{\frac{1}{2}}\, c^{\frac{3}{2}}\, \varepsilon_0^{\,\frac{1}{2}} \tag{C.17} \]

and allow one to express all quantities of table C.1 in terms of the units s, kg, $c$ and $\varepsilon_0$.
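The matrix inversion behind the change of units can be carried out with exact rational arithmetic. The sketch below is an illustration with hypothetical names; it treats only the exponents (not the numerical factors such as $z_c$), takes the SI exponent vectors of s, kg, $c$ and $\varepsilon_0$ from table C.1, inverts the exponent matrix by Gauss-Jordan elimination and reads off the powers of the new units in which metre and ampere are expressed.

```python
from fractions import Fraction

SI_UNITS = ("m", "kg", "s", "A")
NEW_UNITS = ("s", "kg", "c", "eps0")

# SI exponents (m, kg, s, A) of the new units, taken from table C.1
SI_EXPONENTS = {
    "s": (0, 0, 1, 0),
    "kg": (0, 1, 0, 0),
    "c": (1, 0, -1, 0),
    "eps0": (-3, -1, 4, 2),
}


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# e[k][i]: exponent of the SI unit number k in the new unit number i
e = [[SI_EXPONENTS[name][k] for name in NEW_UNITS] for k in range(len(SI_UNITS))]
E = invert(e)


def si_unit_in_new_units(si_name):
    """Exponents E_ij of the new units that reproduce the SI unit q_j."""
    j = SI_UNITS.index(si_name)
    return {name: E[i][j] for i, name in enumerate(NEW_UNITS)}
```

`si_unit_in_new_units("A")` returns the half-integer exponents $-1/2, 1/2, 3/2, 1/2$ for s, kg, $c$, $\varepsilon_0$, which shows how roots arise from inverting an integer matrix.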


Table C.2: Conversion to s kg c ε0

Symbol  Value(m kg s A)        m   kg    s    A   Value(s kg c ε0)               s     kg     c     ε0
m       1                      1    0    0    0   z_c^−1                         1     0      1     0
kg      1                      0    1    0    0   1                              0     1      0     0
s       1                      0    0    1    0   1                              1     0      0     0
A       1                      0    0    0    1   (4π/(10^7 z_c))^(1/2)         -1/2   1/2    3/2   1/2
c       z_c                    1    0   -1    0   1                              0     0      1     0
ε0      10^7/(4π z_c^2)       -3   -1    4    2   1                              0     0      0     1
ħ       1.054 571 6 ·10^−34    2    1   -1    0   1.054 571 6 ·10^−34 z_c^−2     1     1      2     0
G_N     6.673 ·10^−11          3   -1   -2    0   6.673 ·10^−11 z_c^−3           1    -1      3     0
µ0      4π ·10^−7              1    1   -2   -2   1                              0     0     -2    -1
C       1                      0    0    1    1   (4π 10^−7/z_c)^(1/2)           1/2   1/2    3/2   1/2
F       1                     -2   -1    4    2   4π 10^−7 z_c                   1     0      1     1
Gy      1                      2    0   -2    0   z_c^−2                         0     0      2     0
Hz      1                      0    0   -1    0   1                             -1     0      0     0
H       1                      2    1   -2   -2   10^7/(4π z_c)                  1     0     -1    -1
J       1                      2    1   -2    0   z_c^−2                         0     1      2     0
N       1                      1    1   -2    0   z_c^−1                        -1     1      1     0
Ω       1                      2    1   -3   -2   10^7/(4π z_c)                  0     0     -1    -1
Pa      1                     -1    1   -2    0   z_c                           -3     1     -1     0
S       1                     -2   -1    3    2   4π 10^−7 z_c                   0     0      1     1
T       1                      0    1   -2   -1   (10^7 z_c/(4π))^(1/2)         -3/2   1/2   -3/2  -1/2
V       1                      2    1   -3   -1   (4π z_c^3 10^−7)^(−1/2)       -1/2   1/2    1/2  -1/2
W       1                      2    1   -3    0   z_c^−2                        -1     1      2     0
Wb      1                      2    1   -2   -1   (4π z_c^3 10^−7)^(−1/2)        1/2   1/2    1/2  -1/2

If a relation holds in the SI units m, kg, s, A, then the corresponding relation holds in the units s, kg, $c$, $\varepsilon_0$ of the relativistic Heaviside system, as the two sets of units are invertibly related. The relation remains valid if one divides all quantities $Q$ by their powers of $c$ and $\varepsilon_0$, e.g. divides each capacitance, which is measured in farad (F), by $c$ and $\varepsilon_0$, or multiplies each inductance, which is measured in henry (H), by $c$ and $\varepsilon_0$. These ratios specify the quantities in units of $c$ and $\varepsilon_0$, just as $l/\mathrm{m}$ specifies a length $l$ in units of the metre. The ratios $Q_H$ satisfy equations from which all coefficients $c$ and $\varepsilon_0$ have disappeared, because $c$ and $\varepsilon_0$ in units of $c$ and $\varepsilon_0$ are 1. Moreover, in the relativistic Heaviside system one drops the distinguishing subscript of $Q_H$, writes $Q$ for short and reads e.g.

1 henry = $10^{7}/(299\,792\,458 \cdot 4\pi)$ s in units of $c$ and $\varepsilon_0$.

Thereby one loses the information about the powers of the units. But one can reconstruct them for each physical quantity from the table: one only has to express each quantity $Q_H$ as $Q$ divided by its powers of $c$ and $\varepsilon_0$ to make these units reappear.

The entry siemens (S) in table C.2 shows that $c\,\varepsilon_0$ is an electric conductance. The fine structure constant $\alpha = e^2/(4\pi \varepsilon_0 \hbar c)$ (C.9) is dimensionless. So

$$\frac{1}{R_{\text{von Klitzing}}} = 2\alpha\, c\, \varepsilon_0 = \frac{e^2}{2\pi\hbar} = \frac{e^2}{h} \tag{C.18}$$

is a conductance which depends on $\hbar$ but not on $c$. It should therefore be related to quantum effects of slow electrons and occur in electric and magnetic properties of low temperature solids. Indeed, within the continuum of conceivable conductances, discrete integer multiples of this conductance occur in the quantum Hall effect. As it can be measured with high precision and comparatively little effort, one considers using its inverse, the von Klitzing resistance,

$$R_{\text{von Klitzing}} = 25\,812.807\ \Omega\,, \tag{C.19}$$

as SI unit to replace the unit ampere. With $\mathrm{m} = 10^{2}\,\mathrm{cm}$, $\mathrm{kg} = 10^{3}\,\mathrm{g}$ and

$$\mathrm{A} = 10\, z_c\, \mathrm{cm}^{3/2}\, \mathrm{g}^{1/2}\, \mathrm{s}^{-2}\, (4\pi\varepsilon_0)^{1/2} \tag{C.20}$$

one easily converts the quantities of table C.1 to the Gaussian units cm, g, s and 4πε0.

Table C.3: Conversion to cm g s 4πε0

Symbol  Value (m kg s A)        m  kg   s   A   Value (cm g s 4πε0)     cm     g     s  4πε0
m       1                       1   0   0   0   10^2                     1     0     0     0
kg      1                       0   1   0   0   10^3                     0     1     0     0
s       1                       0   0   1   0   1                        0     0     1     0
A       1                       0   0   0   1   10 z_c                 3/2   1/2    -2   1/2
c       z_c                     1   0  -1   0   10^2 z_c                 1     0    -1     0
ε0      10^7/(4π z_c^2)        -3  -1   4   2   (4π)^-1                  0     0     0     1
ħ       1.054 571 6·10^-34      2   1  -1   0   1.054 571 6·10^-27       2     1    -1     0
G_N     6.673·10^-11            3  -1  -2   0   6.673·10^-8              3    -1    -2     0
c µ0    4π·10^-7 z_c            2   1  -3  -2   4π 10^-2 z_c^-1         -1     0     1    -1
C       1                       0   0   1   1   10 z_c                 3/2   1/2    -1   1/2
F       1                      -2  -1   4   2   10^-5 z_c^2              1     0     0     1
Gy      1                       2   0  -2   0   10^4                     2     0    -2     0
Hz      1                       0   0  -1   0   1                        0     0    -1     0
c^2 H   z_c^2                   4   1  -4  -2   10^9                     1     0     0    -1
J       1                       2   1  -2   0   10^7                     2     1    -2     0
N       1                       1   1  -2   0   10^5                     1     1    -2     0
Ω       1                       2   1  -3  -2   10^5/z_c^2              -1     0     1    -1
Pa      1                      -1   1  -2   0   10                      -1     1    -2     0
S       1                      -2  -1   3   2   10^-5 z_c^2              1     0    -1     1
c T     z_c                     1   1  -3  -1   10^4                  -1/2   1/2    -1  -1/2
V       1                       2   1  -3  -1   10^6/z_c               1/2   1/2    -1  -1/2
W       1                       2   1  -3   0   10^7                     2     1    -3     0
c Wb    z_c                     3   1  -3  -1   10^8                   3/2   1/2    -1  -1/2

However, the Gaussian system uses the denomination “magnetic field” for the product of $c$ with the magnetic field of the SI system,

$$B_{\text{Gauß}} = c\, B_{\text{SI}}\,. \tag{C.21}$$

In these terms the Lorentz force is $q(E + v \times B_{\text{SI}}) = q(E + \frac{v}{c} \times B_{\text{Gauß}})$. Thereby the electric field $E$ and the magnetic field $B_{\text{Gauß}}$ acquire the same units, which allows one to compare their sizes,

$$10^{-4}\, c\,\mathrm{T} = \mathrm{Gs} = \mathrm{cm}^{-1/2}\, \mathrm{g}^{1/2}\, \mathrm{s}^{-1}\, (4\pi\varepsilon_0)^{-1/2}\,. \tag{C.22}$$
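The conversion to Gaussian units amounts to substituting $\mathrm{m} = 10^2\,\mathrm{cm}$, $\mathrm{kg} = 10^3\,\mathrm{g}$ and (C.20) into the SI expression of a quantity. A minimal Python sketch of this substitution follows; the function name `si_to_gauss` and the dictionary `FACTOR` are illustrative assumptions, and the code returns only the numerical value, not the powers of cm, g, s and 4πε0.

```python
import math

Z_C = 299_792_458  # numerical value z_c of the velocity of light in m/s

# Numerical factor of each SI base unit when expressed in cm, g, s and 4 pi eps0:
# m = 10^2 cm, kg = 10^3 g, and A = 10 z_c cm^(3/2) g^(1/2) s^-2 (4 pi eps0)^(1/2).
FACTOR = {"m": 1e2, "kg": 1e3, "s": 1.0, "A": 10 * Z_C}

def si_to_gauss(value, m=0, kg=0, s=0, A=0):
    """Gaussian numerical value of the SI quantity value * m^m kg^kg s^s A^A."""
    return (value * FACTOR["m"] ** m * FACTOR["kg"] ** kg
            * FACTOR["s"] ** s * FACTOR["A"] ** A)

coulomb = si_to_gauss(1.0, s=1, A=1)                    # C = A s
ohm = si_to_gauss(1.0, m=2, kg=1, s=-3, A=-2)           # Ohm = kg m^2 s^-3 A^-2
c_tesla = si_to_gauss(Z_C, m=1, kg=1, s=-3, A=-1)       # c T = z_c kg m s^-3 A^-1

print(coulomb, ohm, c_tesla)
```

The three results can be compared with the entries for C, Ω and c T in table C.3 and with (C.22).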

The names and the values of the Gaussian units are listed below.

Table C.4: Units and Physical Constants in the Gaussian System

Quantity                  Unit         Symbol   Value                    cm     g     s  4πε0
length                    centimetre   cm       1                         1     0     0     0
mass                      gram         g        1                         0     1     0     0
time                      second       s        1                         0     0     1     0
current                   -            -        1                       3/2   1/2    -2   1/2
velocity of light         -            c        10^2 z_c                  1     0    -1     0
electric constant         -            ε0       (4π)^-1                   0     0     0     1
reduced Planck constant   -            ħ        1.054 571 6·10^-27        2     1    -1     0
Newton constant           -            G_N      6.673·10^-8               3    -1    -2     0
c · magnetic constant     -            c µ0     4π 10^-2 z_c^-1          -1     0     1    -1
charge                    franklin     Fr       1                       3/2   1/2    -1   1/2
capacitance               -            -        1                         1     0     0     1
dose, energy per mass     -            -        1                         2     0    -2     0
frequency                 hertz        Hz       1                         0     0    -1     0
inductance (Gauß)         -            c^2 H    10^9                      1     0     0    -1
energy                    erg          erg      1                         2     1    -2     0
force                     dyn          dyn      1                         1     1    -2     0
resistance                -            -        1                        -1     0     1    -1
pressure, stress          barye        Ba       1                        -1     1    -2     0
conductance               -            -        1                         1     0    -1     1
magnetic field (Gauß)     gauß         Gs       1                      -1/2   1/2    -1  -1/2
voltage                   gilbert      Gb       1                       1/2   1/2    -1  -1/2
power                     -            -        1                         2     1    -3     0
magnetic flux (Gauß)      maxwell      Mx       1                       3/2   1/2    -1  -1/2

The powers of 4πε0 are omitted in the Gaussian system. Moreover one frequently replaces the appropriate (half integer) powers of cm, g and s by the placeholder symbol [esu] (electrostatic units) and calculates only with the numerical values of the physical quantities.

Appendix D

Collapse of the Wave Function

The collapse of the wave function is claimed to be instantaneous and may therefore be a phenomenon which propagates faster than light. This is why we investigate it more closely.

If a particle interacts with a measuring device A, it causes a result and is afterwards in some state which depends on the result. If one analyses the intermediate time evolution, one also has to consider the pointer of A, as not only the particle but also the pointer changes. So we combine the particle which is to be measured and the pointer of the device A into a larger system and measure the particle-pointer state with a device B. Before the measurement of the pair, the particle is in some state $\psi$, the pointer in a state $\phi$ and the pair in the product state $\psi \otimes \phi$, in which all properties of the particle are independent of the pointer and vice versa.

[Fig. D.1 Repeated Measurement. Labels: Particle, Device A, Device B; state ψ ⊗ φ before device A, state (χ_1 ⊗ φ_1) ψ_1 + (χ_2 ⊗ φ_2) ψ_2 + ... after device A, state (χ_i ⊗ φ_i) ψ_i after device B.]

Unitary time evolution in the device maps this product state to a state $U(\psi \otimes \phi)$. If the particle is initially in an eigenstate $\Lambda_i$ of the device A, then after the measurement, by definition of the eigenstate, the pointer in the state $U(\Lambda_i \otimes \phi)$ certainly shows the value $i$ and is in an eigenstate $\phi_i$ of $\hat{A}$, which reads the display of A. So

$$U(\Lambda_i \otimes \phi) = \chi_i \otimes \phi_i\,, \quad \text{no sum over } i\,. \tag{D.1}$$

Different states $\phi_i$ and $\phi_j$ of the pointer are orthogonal, as they can be read off with certainty,

$$\langle \phi_i | \phi_j \rangle = \delta_{ij}\,. \tag{D.2}$$

The particle state $\chi_i$ after the measurement is normalized, as the time evolution $U$ is unitary and because $\phi_i$, $\Lambda_i$ and $\phi$ are normalized. For the device A to be a measuring device it is not necessary that the states $\chi_i$ are mutually orthogonal. In particular, A need not perform a quantum nondemolition measurement, which reproduces from each eigenstate $\Lambda_i$ the eigenstate $\chi_i = \Lambda_i$ for the repeated measurement, although some authors postulate this dispensable axiom. Many devices do not prepare eigenstates of a repeated measurement, e.g. photographic films or photomultipliers count photons by annihilating them.

If, more generally, the initial particle state is not an eigenstate $\Lambda_i$, it is a sum $\psi = \sum_j \Lambda_j \psi_j$ with components $\psi_j = \langle \Lambda_j | \psi \rangle$. As the unitary time evolution $U$ in the device is linear, the combined particle-pointer state is mapped to

$$U(\psi \otimes \phi) = U\Bigl(\sum_j \Lambda_j \otimes \phi\; \psi_j\Bigr) = \sum_j (\chi_j \otimes \phi_j)\, \psi_j\,. \tag{D.3}$$

The time evolution in the device entangles the particle with the pointer, i.e. it generates from a product a state which does not factorize (unless all $\chi_i$ are equal) but which is a sum of products. If the states $\chi_i$ are mutually orthogonal, then one can also determine the pointer reading by measuring the particle. Then the particle states $\chi_i$ and the pointer states $\phi_i$ are factors of a Schmidt decomposition of $U(\psi \otimes \phi)$. But in a generic measurement the states $\chi_i$ are not required to be mutually orthogonal.

If one now measures in the state $U(\psi \otimes \phi)$, after the first measurement, the particle with a device C and reads off with $\hat{A}$ the pointer of the first device, then to this combined device

$$B = C \otimes \hat{A} \tag{D.4}$$

there correspond product eigenstates $\Gamma_k \otimes \phi_i$, where $\Gamma_k$ are eigenstates of C. According to (8.1) the probability to obtain the result number $k$ of the device C and to see the pointer show the result number $i$ is

$$w(k,i) = |\langle \Gamma_k \otimes \phi_i | U(\psi \otimes \phi) \rangle|^2 = \Bigl|\sum_j \langle \Gamma_k \otimes \phi_i | \chi_j \otimes \phi_j \rangle\, \psi_j\Bigr|^2 = |\langle \Gamma_k | \chi_i \rangle\, \psi_i|^2\,. \tag{D.5}$$

In particular, the pointer shows the result $i$ with the probability $|\psi_i|^2 = \sum_l w(l,i)$. So, under the condition that $\hat{A}$ shows the result $i$, the probability $w_i(k)$ to obtain the value number $k$ in a subsequent measurement of the particle with device B is

$$w_i(k) = \frac{w(k,i)}{\sum_l w(l,i)} = |\langle \Gamma_k | \chi_i \rangle|^2 = w(k, C, \chi_i)\,, \tag{D.6}$$

as if the first measurement with the device A had produced the conditional state $\chi_i$.
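The chain (D.3), (D.5), (D.6) can be checked numerically. The following Python sketch builds the entangled state from randomly chosen components $\psi_j$ and non-orthogonal states $\chi_j$ and evaluates the joint, marginal and conditional probabilities. The dimension 3, the fixed random seed, the Fourier vectors chosen as pointer states and the standard basis chosen as eigenstates of C are assumptions made only for illustration.

```python
import cmath
import math
import random

def normalize(v):
    """Rescale a complex vector to unit length."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def inner(u, v):
    """Scalar product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def random_state(dim, rng):
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)])

def fourier_basis(dim):
    """An orthonormal basis of C^dim made of discrete Fourier vectors."""
    return [[cmath.exp(2j * math.pi * k * a / dim) / math.sqrt(dim) for a in range(dim)]
            for k in range(dim)]

rng = random.Random(0)  # fixed seed, so that the run is reproducible
dim = 3
psi = random_state(dim, rng)                         # components psi_j
chi = [random_state(dim, rng) for _ in range(dim)]   # chi_j: normalized, not orthogonal
phi = fourier_basis(dim)                             # orthonormal pointer states phi_j
gamma = [[complex(a == k) for a in range(dim)] for k in range(dim)]  # eigenstates of C

# Entangled state (D.3) in components: state[a][b] = sum_j psi_j chi_j[a] phi_j[b]
state = [[sum(psi[j] * chi[j][a] * phi[j][b] for j in range(dim)) for b in range(dim)]
         for a in range(dim)]

def joint_probability(k, i):
    """w(k, i) from the first expression in (D.5)."""
    amplitude = sum(gamma[k][a].conjugate() * phi[i][b].conjugate() * state[a][b]
                    for a in range(dim) for b in range(dim))
    return abs(amplitude) ** 2

w = [[joint_probability(k, i) for i in range(dim)] for k in range(dim)]
pointer = [sum(w[l][i] for l in range(dim)) for i in range(dim)]   # marginal of the pointer
conditional = [[w[k][i] / pointer[i] for i in range(dim)] for k in range(dim)]  # (D.6)
```

Comparing `pointer[i]` with `abs(psi[i]) ** 2` and `conditional[k][i]` with `abs(inner(gamma[k], chi[i])) ** 2` reproduces the statements made below (D.5) and in (D.6) up to rounding errors.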

The transition of the state $\psi$ to $\chi_i$ is not a discontinuous time evolution which reduces or collapses the state in the device. On the contrary, the probabilities of pairs of results change differentiably because of the Schrödinger equation in the product Hilbert space of particle-pointer states. Before the repeated measurement one had a probability $w(k,i)$ to obtain result number $i$ in the first and $k$ in the second measurement. Once the first result becomes known to be $i$, this result is certain and the second result $k$ occurs with the conditional probability $w_i(k) = w(k,i)/\sum_l w(l,i)$. To call this transition to the conditional probability and the conditional state $\chi_i$ the collapse of the wave function mystifies chance and certainty. One would have to speak with equal logic of the collapse of the lotto gambler when his expectations change with each of the 6 numbers which are drawn out of initially 49.

Note that the collapse only occurs after one includes the information about the outcome of the measurement A. According to all accumulated experimental facts, this information cannot be obtained faster than light.

The conditional state $\chi_i$ is the state which yields the conditional probabilities $w_i$. As $\chi_i$ is independent of $\psi$, one cannot reconstruct the original state $\psi$ by any later measurement of the particle from the sample of events in which the previous measurement with A had produced the nondegenerate result number $i$. Selecting this sample is a way to produce the state $\chi_i$ by measuring with A.

If after the measurement with the device A some or all outcomes are combined into a new state irrespective of what the pointer shows, then the probability for a subsequent measurement C of the particle to yield the result number $k$ is

$$\sum_i w(k,i) = \sum_i |\langle \Gamma_k | \chi_i \rangle\, \psi_i|^2 = \langle \Gamma_k | \rho\, \Gamma_k \rangle \quad \text{where} \quad \rho = \sum_i |\psi_i|^2\, |\chi_i\rangle \langle \chi_i| \tag{D.7}$$

as if the device A had generated from the pure state $\psi$ the mixture $\rho$. The products $(\chi_i \otimes \phi_i)\,\psi_i$ are incoherent as far as measurements of the particle are concerned. The decoherence and the related entropy $S = -\operatorname{tr} \rho \ln \rho$ arise, though the time evolution of the complete system of particle and pointer is unitary, because one disregards knowledge about the complete entangled state and restricts the measurements to those of the particle.

The device A is idealized in figure D.1 by a box which the particle enters and leaves. This takes time. In no real device is the measurement instantaneous, though this simplifying idealization is often appropriate. But the idealization implies the watched-pot behaviour that iterated observation prevents the time evolution of the state. One argues that a closely watched pot will never boil. This is just a curious consequence of the oversimplification that the measurement is instantaneous. No measurement is possible without interaction of the state and the device. So one should not be surprised that repeated observation changes the time evolution of the measured state. In the case of the watched pot, however, I expect it to boil sooner rather than later if one illuminates it intensely in order to obtain the necessary brightness for shorter and shorter repeated observations.
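The mixture $\rho$ of (D.7) and its entropy $S = -\operatorname{tr} \rho \ln \rho$ can be made explicit in the simplest case. The following Python sketch assumes a two-dimensional particle space, equal weights $|\psi_1|^2 = |\psi_2|^2 = 1/2$ and real states $\chi_1 = (1, 0)$, $\chi_2 = (\cos\theta, \sin\theta)$; these choices and the function names are illustrative assumptions. For a $2 \times 2$ density matrix with unit trace the eigenvalues follow from the determinant alone.

```python
import math

def outer(u, v):
    """Matrix |u><v| for real or complex vectors."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def mixture(psi, chi):
    """rho = sum_i |psi_i|^2 |chi_i><chi_i| as in (D.7)."""
    dim = len(chi[0])
    rho = [[0j] * dim for _ in range(dim)]
    for amp, state in zip(psi, chi):
        proj = outer(state, state)
        for a in range(dim):
            for b in range(dim):
                rho[a][b] += abs(amp) ** 2 * proj[a][b]
    return rho

def entropy_2x2(rho):
    """S = -tr(rho ln rho) for a 2x2 density matrix with tr(rho) = 1."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = [(1.0 + root) / 2.0, (1.0 - root) / 2.0]
    # Eigenvalues equal to zero do not contribute to the entropy.
    return -sum(p * math.log(p) for p in eigenvalues if p > 1e-15)

def example(theta):
    """Entropy of the mixture for the angle theta between chi_1 and chi_2."""
    psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]
    chi = [[1.0, 0.0], [math.cos(theta), math.sin(theta)]]
    return entropy_2x2(mixture(psi, chi))

for theta in (0.0, math.pi / 3, math.pi / 2):
    print(theta, example(theta))
```

In this example the entropy vanishes if both states $\chi_i$ coincide, so that the state after the measurement still factorizes, and it reaches $\ln 2$ if they are orthogonal.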

References

1. Vladimir I. Arnold, Mathematical Methods of Classical Mechanics, Springer, Berlin, 1980 2. Alain Aspect, Jean Dalibard and G´erard Roger, Experimental Test of Bell’s Inequalities Using Time-Varying Analyzers, Phys. Rev. Lett. 49 (1982) 1804–1807 3. J. Bailey et al., Il Nuovo Cimento 9A (1972) 369 4. John S. Bell, On the Einstein-Podolsky-Rosen paradox, Physics 1 (1964) 195–200 reprinted in [5] 5. John S. Bell, Speakable and unspeakable in quantum mechanics, Cambridge University Press, 1987 6. Michael Victor Berry, Regular and Irregular Motion, in S. Jorna (ed.), Topics in Nonlinear Dynamics, Amer. Inst. Phys. Conf. Proceedings Nr.46 (1978) 16 7. Hermann Boerner, Representations of Groups with Special Consideration for the Needs of Modern Physics, North Holland Publishing Co., Amsterdam, 1962 8. Nikolay N. Bogoliubov and Dmitry V. Shirkov, Introduction to the Theory of Quantized Fields, John Wiley, New York, 1959 9. Hermann Bondi, Relativity and Common Sense, Heinemann, London, 1965 10. Ignazio Ciufolini and Erricos Pavlis, A Confirmation of the General Relativistic Prediction of the Lense-Thirring Effect, Nature 431 (2004) 958 11. Ignazio Ciufolini and John Archibald Wheeler, Gravitation and Inertia, Princeton Series in Physics, 1995 12. John F. Clausner, Michael A. Horne, Abner Shimony and Richard A. Holt, Proposed experiment to test local hidden-variable theories, Phys. Rev. Lett 23 (1969) 880–884 13. Richard Courant and David Hilbert, Methods of Mathematical Physics II, John Wiley & Sons, Chichester, 1953 14. Norbert Dragon and Friedemann Brandt, BRST Symmetry and Cohomology, in Strings, Gauge Fields, and the Geometry Behind, The Legacy of Maximilian Kreuzer, edited by Anton Rebhan et al., World Scientific Publishing Co., Singapore (2012) 3–86 15. Norbert Dragon and Nicolai Mokros, Relativistic Flight through Stonehenge, http://www.itp.uni-hannover.de/∼dragon, 1999 16. Gert Eilenberger, Regul¨ares und chaotisches Verhalten Hamiltonscher Systeme, 14. 
Ferienkurs Nichtlineare Dynamik in kondensierter Materie, Kernforschungsanlage J¨ulich, 1983 17. C. W. Francis Everitt et al., Gravity Probe B: Final Results of a Space Experiment to Test General Relativity, Phys. Rev. Lett. 106 (2011) 221101 18. Mosh´e Flato, Christian Fronsdal and Daniel Sternheimer, Difficulties with massless particles?, Comm. Math. Phys. 90 (1983) 563–573 19. William Fulton and Joe Harris, Representation theory A first course, Springer, New York, 1991 20. Carl Friederich Gauss Werke, Achter Band, Springer, Berlin, (1900) 220–224, Brief an Wolfgang von Bolyai, G¨ottingen, 6. M¨arz 1833

233

234

References

21. Ferdinando Gliozzi, Joel Sherk and David Olive, Supersymmetry, Supergravity Theories and the Dual Spinor Model, Nucl. Phys. B122 (1977) pp. 253–290 22. Domenico Giulini, Special Relativity A First Encounter, Oxford University Press, 2005 23. Israil S. Gradshteyn and Jossif M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 1965 24. Joseph C. Hafele and Richard E. Keating, Around-the-world atomic clocks: predicted relativistic time gains, Science 177 (1972) pp. 166–167; Around-the-world atomic clocks: observed relativistic time values, Science 177 (1972) pp. 168–170 25. Brian C. Hall, Lie Groups, Lie Algebras, and Representations, Springer, New York, 2003 26. Morton Hamermesh, Group Theory and its Application to Physical Problems, AddisonWesley Publishing Company, Reading, 1962 27. Allen Hatcher, Algebraic Topology, Cambridge University Press, 2002 28. Stephen A. Huggett and K. Paul Tod, An Introduction to Twistor Theory, Cambridge University Press, 1994 29. Ray A. d’Inverno, Introducing Einstein’s Relativity, Oxford University Press, 1992 30. Claude Itzykson and Jean-Bernard Zuber, Quantum Field Theory, McGraw-Hill Inc., New York, 1980 31. John David Jackson and Lev Borisovich Okun, Historical roots of gauge invariance, Rev. Mod. Phys. 73 (2001) 663–680 32. Rudolf Kippenhahn, Light from the Depths of Time, Springer, Berlin, 1987 33. Anton Lampa, Wie erscheint nach der Relativit¨atstheorie ein bewegter Stab einem ruhenden Beobachter?, Zeitschrift f¨ur Physik 27 (1924) 138–148 34. Lew Landau and Evgeny Lifshitz, The Classical Theory of Fields, Pergamon Press, Oxford, 1971 35. Dierck-Ekkehard Liebscher, Einstein’s Relativity and the Geometries of the Plane, WileyVCH Verlag GmbH, 1998 36. Ludvig Valentin Lorenz, On the Identity of the Vibrations of Light with Electrical Currents, Phil. Mag. ser. 4, 34 (1867) 287–301 37. Alan J. Macfarlane, On the Restricted Lorentz Group and Groups Homomorphically Related to It, J. Math. Phys. 
3 (1962) 1116–1129 38. George W. Mackey, Induced Representations of Locally Compact Groups I, Annals of Mathematics, 55 (1952) 101–139; Induced Representations of Locally Compact Groups II, Annals of Mathematics, 58 (1953) 193–221; Induced Representations of Groups and Quantum Mechanics, W. A. Benjamin, New York, 1968 39. J¨urgen Moser, Stable and Random Motion in Dynamical Systems, Princeton University Press, Princeton, 1973 40. Tristan Needham, Visual Complex Analysis, Clarendon Press, Oxford, 1997 41. Peter Nemec, http://www.ohg-sb.de/lehrer/nemec/relativ.htmwww.ohg-sb.de/lehrer/nemec/relativ.htm 42. Theodore Duddell Newton and Eugene Paul Wigner, Localized States for Elementary Systems, Rev. Mod. Phys. 21 (1949) 400–406 43. Emmy Noether, Invariante Variationsprobleme, Nachrichten von der K¨oniglichen Gesellschaft der Wissenschaften zu G¨ottingen, Mathematisch-physikalische Klasse (1918) 235–257 44. John O’Connor and Edmund Robertson, The MacTutor History of Mathematics archive, http://www-groups.dcs.st-and.ac.uk/ history/BiogIndex.html 45. Peter J. Olver, Applications of Lie Groups to Differential Equations, Springer, Berlin, 1986 46. Global Positioning System: Theory and Applications Volume I, American Institute of Aeronautics and Astronautics, edited by Bradford W. Parkinson and James J. Spilker Jr., Washington, DC, 1996 47. Particle Data Group, Keith A. Olive et al., Chinese Phys. C38 (2014) 090001 http://pdg.lbl.gov 48. Roger Penrose, The apparent shape of a relativistically moving sphere, Proc. Camb. Phil. Soc. 55 (1959) 137–139 49. Olivier Piguet and Silvio P. Sorella, Algebraic Renormalization, Springer, Berlin, 1995

References

235

50. Michael Reed and Barry Simon, Methods of Modern Mathematical Physics, Vol. 1 - 4, Academic Press, London, 1980 51. David A. Richter, http://homepages.wmich.edu/∼drichter/hopffibration.htm 52. G¨unter Scharf, Finite Quantum Electrodynamics, Springer, Berlin, 1995 53. Konrad Schm¨udgen, Unbounded Operator Algebras and Representation Theory, Birkh¨auser, Basel, 1990 54. Gunter Seeber, Satellite Geodesy: Foundations, Methods, and Applications, de Gruyther, New York, 1993 55. Roman U. Sexl and Helmuth K. Urbantke, Relativity, Groups, Particles, Springer, Wien, 2001 56. Leo Stodolsky, The speed of light and the speed of neutrinos, Phys. Lett. B 201 (1988) 353 57. John Lighton Synge, 1 Relativity: The General Theory, North-Holland Publishing Company, Amsterdam, 1964 58. Peter Szekeres, A Course in Modern Mathematical Physics, Cambridge University Press, 2006 59. James Terrell, Invisibility of Lorentz Contraction, Phys. Rev. 116 (1959) 1041–1045 60. Abraham A. Ungar, Beyond the Einstein Addition Law and its Gyroscopic Thomas Precession, Kluwer Academic Publishers, Dordrecht, 2001 61. Giorgio Velo and Arthur S. Wightman (Eds.), Renormalization Theory, Nato Science Series C, Vol. 23, Springer, Berlin, 1976 62. John von Neumann, Die Eindeutigkeit der Schr¨odingerschen Operatoren, Mathematische Annalen 104 (1931) 570–578 63. Steven Weinberg, The Quantum Theory of Fields, Volume I Foundations, Cambridge University Press, 1995 64. Steven Weinberg and Edward Witten, Limits on Massless Particles, Phys. Lett. 96B (1980) 59–62 65. Reinhard Werner, Vorlesungsskript Quantenmechanik, unpublished 66. Julius Wess and Jonathan Bagger, Supersymmetry and Supergravity, Princeton University Press, Princeton, 1992 67. Arthur S. Wightman, On the Localizability of Quantum Mechanical Systems, Rev. Mod. Phys. 34 (1962) 845–872 68. Eugene Paul Wigner, Group Theory and its Application to the Quantum Mechanics of Atomic Spectra, Academic Press, New York, 1959 69. Clifford M. 
Will, Theory and Experiment in Gravitational Physics, Cambridge University Press, 1993; The Confrontation between General Relativity and Experiment, Living Rev. Relativity 17 (2014) 4, http://www.livingreviews.org/lrr-2014-4 (cited on 1st September 2015) 70. http://mathworld.wolfram.com/Hyperboloid.html 71. Nicholas M. J. Woodhouse, Special Relativity, Lecture Notes in Physics m6, Springer, Berlin, 1992

1

The name Synge is pronounced sing.
