Quantum Field Theory I - THEP Mainz

October 12, 2011

Quantum Field Theory I

Ulrich Haisch Rudolf Peierls Centre for Theoretical Physics, University of Oxford OX1 3PN Oxford, United Kingdom

Abstract This course deals with modern applications of quantum field theory, with emphasis on the quantization of theories involving scalar and spinor fields.

Recommended Books and Resources

There is a vast array of quantum field theory texts, many of them with redeeming features. Here I mention a few of them, mostly the ones that I used or looked at when preparing this course. To a large extent, I will follow the first section of

• M. Peskin and D. Schroeder, “An Introduction to Quantum Field Theory”
This is a very clear and comprehensive book, covering essentially everything in this course as well as many advanced aspects of quantum field theory that go (far) beyond the scope of this lecture.

• S. Weinberg, “The Quantum Theory of Fields: Volume 1, Foundations”
This is the first in a three volume series by one of the masters of quantum field theory. It takes a unique route through the subject, focussing initially on particles rather than fields. Since it has a very particular viewpoint, it is “difficult to digest”, but certainly worth reading.

• L. Ryder, “Quantum Field Theory”
This elementary text has a nice discussion of much of the material in this course. It is good for a first reading.

• A. Zee, “Quantum Field Theory in a Nutshell”
This is a charming book, where emphasis is placed on physical understanding and the author isn’t afraid to hide the ugly truth when necessary. It contains many gems.

By browsing the web, I also found interesting material. Nice introductions to quantum field theory (of different length and viewpoint) have been written by C. Anastasiou and D. Tong. The corresponding scripts can be found at:

http://www.phys.ethz.ch/~babis/Teaching/QFTI/qft1.pdf
http://www.damtp.cam.ac.uk/user/tong/qft/qft.pdf

Other links to useful resources can be found on the web page of D. Tong:

http://www.damtp.cam.ac.uk/user/tong/qft.html

For completeness, I will also give relevant references at the end of each section of this script. The interested reader can consult them for further details on the discussed topics.


Contents

1 Introduction
  1.1 Why QFT?
  1.2 Scales and Units

2 Elements of Classical Field Theory
  2.1 Dynamics of Fields
  2.2 Noether’s Theorem
  2.3 Example: Electrodynamics
  2.4 Space-Time Symmetries
  2.5 Problems

3 Klein-Gordon Theory
  3.1 Klein-Gordon Field as Harmonic Oscillators
  3.2 Structure of Vacuum
  3.3 Particle States
  3.4 Two Real Klein-Gordon Fields
  3.5 Complex Klein-Gordon Field
  3.6 Heisenberg Picture
  3.7 Klein-Gordon Correlators
  3.8 Non-Relativistic Limit
  3.9 Problems

4 Interacting Fields
  4.1 Classification of Interactions
  4.2 Interaction Picture
  4.3 First Look at Scattering Processes
  4.4 Wick’s Theorem
  4.5 Second Look at Scattering Processes
  4.6 Feynman Diagrams
  4.7 Third Look at Scattering Processes
  4.8 Yukawa Potential
  4.9 Connected and Amputated Feynman Diagrams
  4.10 From Correlation Functions to Scattering Matrix Elements
  4.11 Decay Widths and Cross Sections
  4.12 Problems

5 Dirac Theory
  5.1 Spinor Representation
  5.2 Discrete Symmetries of Dirac Theory
  5.3 Continuous Symmetries of Dirac Theory
  5.4 Solutions to Dirac Equation
  5.5 Quantization of Dirac Theory
  5.6 Problems

1 Introduction

As the term quantum field theory (QFT) suggests, QFT is the application of quantum mechanics (QM) to dynamical systems of fields, in the same sense that QM is concerned mainly with the quantization of dynamical systems of particles. QFT is not only absolutely essential for understanding the current state of elementary particle physics and modern aspects of cosmology, but also plays a crucial role in many active areas of research, ranging from atomic, nuclear, and condensed-matter physics to pure mathematics. Since the ultimate goal of this course is to gain a basic understanding of the fundamental laws of nature, we will in the following focus mainly on the physics of elementary particles and hence deal mostly with relativistic fields.

1.1 Why QFT?

The primary reason for introducing the concept of fields in classical physics is to construct laws of nature that are local. The old laws of Newton (Coulomb) involve “action at a distance”. This means that the force felt by a planet (an electron) changes immediately if a distant star (proton) moves. The laws of Newton and Coulomb thus feature non-local interactions. The field theories of Einstein (general relativity) and Maxwell (electrodynamics) remedied the situation, with all interactions mediated in a local fashion by fields.

The requirement of locality remains a strong motivation for studying QFTs. However, there are further good reasons to treat the quantum field (and not the particle) as fundamental (or as Steven Weinberg puts it in [1]: “Quantum fields are the basic ingredients of the universe, and particles are just bundles of energy and momentum made out of them.”).

QM and Special Relativity

A first reason is that the combination of QM and special relativity implies that particle number is not conserved. Consider a particle of mass $m$ trapped in a box of size $L$. Heisenberg’s uncertainty principle tells us that the uncertainty in the momentum of our particle is $\Delta p \geq \hbar/L$. In the relativistic limit, momentum and energy can be treated on an equal footing, and one has an uncertainty in the energy of order $\Delta E \geq \hbar c/L$. Yet, if $\Delta E = 2mc^2$, there is enough energy available to create a virtual particle-antiparticle pair from the vacuum (Dirac sea). This little exercise shows that when a particle with mass $m$ is localized within a distance $\lambda_{\rm Compton} = \hbar/(mc)$, talking about a single particle loses its sense. For distances smaller than this Compton wavelength there is a high probability that we will detect particle-antiparticle pairs swarming around the single particle that we initially put into the box.

Notice that for non-relativistic momenta, $|\mathbf{p}| < mc$, $\lambda_{\rm Compton}$ is smaller than the de Broglie wavelength given by $\lambda_{\rm de\,Broglie} = \hbar/|\mathbf{p}|$.¹ If you like, the de Broglie wavelength is the distance at which the wavelike nature of particles becomes apparent, while the Compton wavelength is the distance at which the concept of a single pointlike particle breaks down and one has to start thinking about how to describe multiparticle states.

¹ Throughout this course we will use boldface type (ordinary italic type) to denote 3-vectors (4-vectors).
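To get a feeling for the two length scales discussed above, the following minimal Python sketch compares the (reduced) Compton and de Broglie wavelengths of an electron for a few momenta. The function names and the chosen momenta are illustrative assumptions and not part of the original notes; the numerical constants are the standard values of $\hbar c$ and the electron rest energy.

```python
# Illustrative sketch (names are hypothetical, not from the lecture notes).
HBAR_C_MEV_FM = 197.3269804      # hbar * c in MeV * fm
ELECTRON_MASS_MEV = 0.51099895   # electron rest energy m c^2 in MeV


def compton_wavelength_fm(mass_mev):
    """Reduced Compton wavelength hbar/(m c) in fm, given m c^2 in MeV."""
    return HBAR_C_MEV_FM / mass_mev


def de_broglie_wavelength_fm(momentum_mev):
    """Reduced de Broglie wavelength hbar/|p| in fm, given |p| c in MeV."""
    return HBAR_C_MEV_FM / momentum_mev


if __name__ == "__main__":
    lam_c = compton_wavelength_fm(ELECTRON_MASS_MEV)
    print(f"Compton wavelength of the electron: {lam_c:.1f} fm")
    # Momenta |p| c in MeV, from deeply non-relativistic to relativistic.
    for p in (0.001, 0.1, ELECTRON_MASS_MEV, 10.0):
        lam_db = de_broglie_wavelength_fm(p)
        relation = "larger than" if lam_db > lam_c else "not larger than"
        print(f"|p| c = {p:8.4f} MeV: de Broglie = {lam_db:12.1f} fm, "
              f"{relation} Compton")
```

The two wavelengths coincide at $|\mathbf{p}| = mc$; below that momentum the de Broglie wavelength is the larger of the two.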

The presence of a multitude of particles and antiparticles at short distances (or high energies) tells us that any attempt to write down a relativistic version of the one-particle Schrödinger equation is doomed to fail. There is no mechanism in standard non-relativistic QM to deal with changes in the particle number. Indeed, any attempt to naively write down a relativistic version of the one-particle Schrödinger equation meets serious problems: negative probabilities, infinite towers of negative energy states, or a breakdown of causality are the common issues that arise.

QM and Causality

Let us have a closer look at the issue of breakdown of causality. Consider the amplitude

$$A(t) = \langle \mathbf{y} |\, e^{-iEt/\hbar} \,| \mathbf{x} \rangle , \tag{1.1}$$

that describes the propagation of a free particle from the point $\mathbf{x}$ to $\mathbf{y}$. In non-relativistic QM one has $E = \mathbf{p}^2/(2m)$ and hence²

$$\begin{aligned}
A(t) &= \langle \mathbf{y} | \exp\!\left[-i\,\frac{\mathbf{p}^2}{2m}\,\frac{t}{\hbar}\right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \langle \mathbf{y} | \exp\!\left[-i\,\frac{\mathbf{p}^2}{2m}\,\frac{t}{\hbar}\right] | \mathbf{p} \rangle \langle \mathbf{p} | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \exp\!\left[-i\,\frac{\mathbf{p}^2}{2m}\,\frac{t}{\hbar}\right] \exp\!\left[i\,\mathbf{p}\cdot(\mathbf{y}-\mathbf{x})/\hbar\right] \\
&= \left(\frac{m}{2\pi i \hbar t}\right)^{3/2} \exp\!\left[\frac{i m (\mathbf{y}-\mathbf{x})^2}{2\hbar t}\right] .
\end{aligned} \tag{1.2}$$

Here we have made use of the completeness relation $\int d^3p/(2\pi\hbar)^3\, |\mathbf{p}\rangle\langle\mathbf{p}| = 1$ of the states $|\mathbf{p}\rangle$ and a little bit of algebra. The expression (1.2) is non-zero for all $\mathbf{y}$ and $t$, indicating that a particle can propagate between any two points in an arbitrarily short time. In a relativistic theory, this conclusion would signal a violation of causality. One might hope that using the relativistic expression $E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}$ for the energy would cure the problem, but it does not. In fact, in the relativistic case one has

$$\begin{aligned}
A(t) &= \langle \mathbf{y} | \exp\!\left[-\frac{it}{\hbar}\sqrt{\mathbf{p}^2 c^2 + m^2 c^4}\right] | \mathbf{x} \rangle \\
&= \int \frac{d^3p}{(2\pi\hbar)^3}\, \exp\!\left[-\frac{it}{\hbar}\sqrt{\mathbf{p}^2 c^2 + m^2 c^4}\right] \exp\!\left[i\,\mathbf{p}\cdot(\mathbf{y}-\mathbf{x})/\hbar\right] \\
&= \frac{1}{2\pi^2\hbar^2\,|\mathbf{y}-\mathbf{x}|} \int_0^\infty dp\; p\, \sin\!\left(p\,|\mathbf{y}-\mathbf{x}|/\hbar\right) \exp\!\left[-\frac{it}{\hbar}\sqrt{p^2 c^2 + m^2 c^4}\right] .
\end{aligned} \tag{1.3}$$

This integral can be evaluated explicitly in terms of Bessel functions, but for our purposes it is sufficient to consider its asymptotic behavior for $L^2 = |\mathbf{y}-\mathbf{x}|^2 \gg c^2 t^2$, i.e., separations well outside the light-cone. We use the method of stationary phase. The relevant phase function $pL - t\sqrt{p^2 c^2 + m^2 c^4}$ has a stationary point at $p = imcL/\sqrt{L^2 - c^2 t^2}$. Plugging this value into (1.3), we find that (up to a rational function of $L$ and $t$),

$$A(t) \propto \exp\!\left[-\frac{mc}{\hbar}\sqrt{L^2 - c^2 t^2}\right] . \tag{1.4}$$

This expression is small but non-zero outside the light-cone and causality is again violated. In both cases, the observed failure is telling us that we need a new formalism to preserve causality. This formalism is QFT. It solves the causality problem in a miraculous way. We will see later that in QFT the propagation of a particle across a space-like interval is indistinguishable from the propagation of an antiparticle in the opposite direction. When we ask whether an observation made at point $x$ can affect an observation made at point $y$, we will find that the amplitudes for particle and antiparticle propagation cancel in such a way that causality is preserved.

² The symbol $\mathbf{p}$ denotes the momentum operator, which in many QM books is indicated by a hat, $\hat{\mathbf{p}}$. To avoid clutter, I will not use the latter notation, but simply write $\mathbf{p}$.

What else is QFT good for?

Besides solving the causality problem, QFT also provides an elegant framework to describe transitions between states of different particle number and type. An example of such a physical process, studied exhaustively (from 1989 until 2000) at the Large Electron-Positron (LEP) collider in Geneva, is the production of a muon ($\mu^-$) and its antiparticle ($\mu^+$) out of the annihilation of an electron ($e^-$) and its antiparticle (the positron $e^+$):

$$e^- + e^+ \to \mu^- + \mu^+ . \tag{1.5}$$

The experimental confirmation of the QFT predictions for processes such as (1.5), often to an unprecedented level of accuracy, is our real reason for studying QFT. But the power of QFT does not end here. In traditional QM the relation between spin and statistics has to be put in by hand. To agree with experiment, one should choose Bose statistics (no minus sign if one exchanges two identical particles) for integer spin particles, and Fermi statistics (minus sign if one exchanges two identical particles) for half-integer spin particles. On the other hand, in QFT the relationship between spin and statistics is a consequence of the framework, following from the commutation quantization conditions for boson fields and anticommutation quantization conditions for fermion fields.
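Returning briefly to the stationary-phase estimate (1.4): the claim that $p = imcL/\sqrt{L^2 - c^2t^2}$ is a stationary point of the phase, and that it leads to an exponentially damped but non-zero amplitude, can be cross-checked numerically. The following is a minimal sketch in natural units ($\hbar = c = 1$); the function names and the sample values of $L$, $t$, and $m$ are assumptions made for illustration only.

```python
import cmath
import math


def phase(p, L, t, m=1.0):
    """Phase function p*L - t*sqrt(p^2 + m^2) in natural units (hbar = c = 1)."""
    return p * L - t * cmath.sqrt(p * p + m * m)


def phase_derivative(p, L, t, m=1.0):
    """Derivative of the phase function with respect to p."""
    return L - t * p / cmath.sqrt(p * p + m * m)


def stationary_point(L, t, m=1.0):
    """Purely imaginary stationary point i*m*L/sqrt(L^2 - t^2), valid for L > t."""
    return complex(0.0, m * L / math.sqrt(L * L - t * t))


if __name__ == "__main__":
    L, t, m = 3.0, 1.0, 1.0            # space-like separation: L > t
    p0 = stationary_point(L, t, m)
    print("derivative at stationary point:", abs(phase_derivative(p0, L, t, m)))
    print("exp(i * phase):                ", cmath.exp(1j * phase(p0, L, t, m)))
    print("exp(-m * sqrt(L^2 - t^2)):     ", math.exp(-m * math.sqrt(L * L - t * t)))
```

The last two printed numbers agree, which is the content of (1.4) up to the prefactor that the stationary-phase argument does not fix.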

1.2 Scales and Units

There are three fundamental dimensionful constants in nature: the speed of light $c$, Planck’s constant (divided by $2\pi$) $\hbar$, and Newton’s constant $G_N$. Their dimensions are

$$[c] = \text{length}\;\text{time}^{-1} , \qquad [\hbar] = \text{length}^2\;\text{mass}\;\text{time}^{-1} , \qquad [G_N] = \text{length}^3\;\text{mass}^{-1}\;\text{time}^{-2} . \tag{1.6}$$

In order to avoid unnecessary clutter, we will work throughout this course in “natural” units, defined by $c = \hbar = 1$.³ This allows us to express all dimensionful quantities in terms of a single scale which we choose to be mass or, equivalently, energy (since $E = mc^2$ has become $E = m$). Energies will be given in units of eV (the electron volt) or more often GeV $= 10^9$ eV or TeV $= 10^{12}$ eV, since we are typically dealing with high energies. To convert the unit of energy back to units of length or time, we have to insert the relevant powers of $c$ and $\hbar$. E.g., the length scale $\lambda$ associated to a mass $m$ is $\lambda = h/(mc)$. Remembering that

$$hc \approx 1.24 \cdot 10^{-6}\ \text{eV m} , \tag{1.7}$$

one finds that the length scale corresponding to the electron with mass $m_e \approx 511$ keV is $\lambda_e \approx 2 \cdot 10^{-12}$ m. Throughout this course we will refer to the dimension of a quantity, meaning the mass dimension. Newton’s constant, e.g., has $[G_N] = -2$ and defines a mass scale

$$G_N = M_P^{-2} , \tag{1.8}$$

where $M_P \approx 10^{19}$ GeV is the Planck scale. This energy corresponds to a length scale $L_P \approx 10^{-35}$ m, the Planck length. The Planck length is believed to be the smallest length scale that makes sense: beyond this scale quantum gravity effects are likely to become important and it is no longer clear that the concept of space-time can be applied. The largest length scale we can talk of is the size of the cosmological horizon, roughly $10^{60} L_P$.

A number of masses relevant for particle physics and cosmology, together with the corresponding length scales, are shown in Table 1. Let me go through the list and spend some words on the most important quantities. After the size of the observable universe, the first scale we encounter is the cosmological constant ($\Lambda$), measured to be around $10^{-3}$ eV. Since nobody can really explain why the cosmological constant has this particular value, let’s forget about it real quick and turn our attention to the masses of the known elementary particles. These range from less than 1 eV for the neutrinos ($\nu$’s) to around 175 GeV for the top quark ($t$). The (in)famous Higgs boson ($h$), which is the only ingredient of the standard model (SM) of elementary particle physics that has not yet been observed, is believed to weigh in at about 100 to 200 GeV. For scales around 1 TeV, i.e., the terascale, the predictive power of the SM is expected to break down. This is precisely the energy regime that the Large Hadron Collider (LHC) at CERN in Geneva has started to explore, having a design center-of-mass energy of 14 TeV. Beyond the electroweak scale ($v$) of around 250 GeV, again nobody knows with certainty what is going on. One could find a plethora of new (elementary) particles or a “great desert”. There are experimental hints that the coupling constants of electromagnetism, and the weak and strong forces unify at around $M_{\rm GUT} = 10^{16}$ GeV, i.e., the grand unification (GUT) scale.

Everything is topped off at the Planck scale where a QFT description might no longer be possible and a quantum theory including the effects of gravity is needed to describe the physics of fundamental interactions. The most likely possibility for such a theory seems to be some kind of string theory. But many other ideas such as loop quantum gravity, Hořava-Lifshitz gravity, etc. also exist. In fact, the “theory of everything” (TOE) could also be a QFT, but one in which the finite or infinite number of renormalized couplings do not run off to infinity with increasing energy, but hit a fixed point of the renormalization group equation. This possibility goes by the name of asymptotic safety.

Don’t worry if you haven’t understood a single word of what I have mumbled about possible TOEs. All this is way too advanced to be covered in this course. I only mentioned it to make propaganda for the research of Joe Conlon (string theory), Andre Lukas (string theory), and John Wheater (quantum gravity), who work on such theories here in Oxford. Ask them if you want to know more about it.

³ The whole point of units is that you can choose whatever units are most convenient!

Quantity                          Mass                  Length
Observable universe               $10^{-33}$ eV         $10^{27}$ m $\approx 2 \cdot 10^{10}$ ly
Cosmological constant ($\Lambda$) $10^{-3}$ eV          $10^{-3}$ m
Neutrinos ($\nu$’s)               $\lesssim 1$ eV       $\gtrsim 10^{-6}$ m
Electron ($e^-$)                  511 keV               $2 \cdot 10^{-12}$ m
Muon ($\mu^-$)                    106 MeV               $10^{-14}$ m
Charm quark ($c$)                 1.3 GeV               $10^{-15}$ m
Tau ($\tau^-$)                    1.78 GeV              $7 \cdot 10^{-16}$ m
Bottom quark ($b$)                4.6 GeV               $3 \cdot 10^{-16}$ m
Top quark ($t$)                   175 GeV               $7 \cdot 10^{-18}$ m
Higgs boson ($h$)                 [100, 200] GeV        $[6, 12] \cdot 10^{-18}$ m
Electroweak scale ($v$)           250 GeV               $5 \cdot 10^{-18}$ m
LHC energy                        14 TeV                $9 \cdot 10^{-20}$ m
GUT scale ($M_{\rm GUT}$)         $10^{16}$ GeV         $10^{-31}$ m
Planck scale ($M_P$)              $10^{19}$ GeV         $10^{-35}$ m

Table 1: An assortment of masses and corresponding length scales that appear in the context of particle physics and cosmology.
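The conversion between masses and lengths used in this section boils down to $\lambda = hc/(mc^2)$ with the value of $hc$ quoted in (1.7). As a minimal sketch, the following Python snippet applies this rule to a few of the masses listed in Table 1; the function name and the selection of entries are illustrative choices, not part of the original notes.

```python
# Illustrative sketch of the mass-to-length conversion lambda = h c / (m c^2).
HC_EV_M = 1.24e-6  # h * c in eV * m, the rounded value quoted in (1.7)


def length_scale_m(energy_ev):
    """Length scale in metres associated with a rest energy given in eV."""
    return HC_EV_M / energy_ev


# A few rest energies (in eV) taken from Table 1.
MASSES_EV = {
    "electron": 511e3,
    "muon": 106e6,
    "top quark": 175e9,
    "LHC energy": 14e12,
}

if __name__ == "__main__":
    for name, energy in MASSES_EV.items():
        print(f"{name:12s}: {length_scale_m(energy):.1e} m")
```

Up to rounding, the printed lengths agree with the corresponding entries in the last column of Table 1.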

References

[1] S. Weinberg, “What is quantum field theory, and what did we think it was?,” arXiv:hep-th/9702027.
[2] S. Weinberg, “The Search for Unity: Notes for a History of Quantum Field Theory,” Daedalus, Vol. 106, No. 4, Discoveries and Interpretations: Studies in Contemporary Scholarship, Volume II (1977), 17 p.
[3] Chapter 1 of S. Weinberg, “The Quantum theory of fields. Vol. 1: Foundations,” Cambridge, UK, Univ. Pr. (1995), 609 p.
[4] F. Wilczek, Rev. Mod. Phys. 71, S85 (1999) [arXiv:hep-th/9803075].

2 Elements of Classical Field Theory

In this second section we will discuss various aspects of classical fields. We will cover only the bare minimum necessary before turning to the quantum theory, and will return to classical field theory at several later stages in this course when we need to introduce new concepts or ideas.

2.1 Dynamics of Fields

A field is a quantity defined at every space-time point $x = (t, \mathbf{x})$. While classical particle mechanics deals with a finite number of generalized coordinates $q_a(t)$, indexed by a label $a$, in field theory we are interested in the dynamics of fields

$$\phi_a(t, \mathbf{x}) , \tag{2.1}$$

where both $a$ and $\mathbf{x}$ are considered as labels. We are hence dealing with an infinite number of degrees of freedom (dofs), at least one for each point $\mathbf{x}$ in space. Notice that the concept of position has been relegated from a dynamical variable in particle mechanics to a mere label in field theory.

Lagrangian and Action

The dynamics of the fields is governed by the Lagrangian. In all the systems we will study in this course, the Lagrangian is a function of the fields $\phi_a$ and their derivatives $\partial_\mu \phi_a$,⁴ and given by

$$L(t) = \int d^3x\; \mathcal{L}(\phi_a, \partial_\mu \phi_a) , \tag{2.2}$$

where the official name for $\mathcal{L}$ is Lagrangian density. Like everybody else we will, however, simply call it Lagrangian from now on. For any time interval $t \in [t_1, t_2]$, the action corresponding to (2.2) reads

$$S = \int_{t_1}^{t_2} dt \int d^3x\; \mathcal{L} = \int d^4x\; \mathcal{L} . \tag{2.3}$$

Recall that in classical mechanics $L$ depends only on $q_a$ and $\dot{q}_a$, but not on the second time derivatives of the generalized coordinates. In field theory we similarly restrict to Lagrangians $\mathcal{L}$ depending on $\phi_a$ and $\dot{\phi}_a$. Furthermore, with an eye on Lorentz invariance, we will only consider Lagrangians depending on $\nabla\phi_a$ and not higher derivatives.

Notice that since we have set $\hbar = 1$, using the convention described in Section 1.2, the dimension of the action is $[S] = 0$. With (2.3) and $[d^4x] = -4$, it follows that the Lagrangian must necessarily have $[\mathcal{L}] = 4$. Other objects that we will use frequently to construct Lagrangians are derivatives, masses, couplings, and most importantly fields. The dimensions of the former two objects are $[\partial_\mu] = 1$ and $[m] = 1$, while the dimensions of the latter two quantities depend on the specific type of coupling and field one considers. We therefore postpone the discussion of the mass dimension of couplings and fields to the point when we meet the relevant building blocks.

⁴ If there is no (or only little) room for confusion, we will often drop the arguments of functions and write $\phi_a = \phi_a(x)$ etc. to keep the notation short.

Principle of Least Action

The dynamical behavior of fields can be determined by the principle of least action. This principle states that when a system evolves from one given configuration to another between times $t_1$ and $t_2$, it does so along the “path” in configuration space for which the action is an extremum (usually a minimum) and hence satisfies $\delta S = 0$. This condition can be rewritten, using partial integration, as follows

$$\begin{aligned}
\delta S &= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi_a}\, \delta\phi_a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta(\partial_\mu \phi_a) \right] \\
&= \int d^4x \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) \right\} = 0 .
\end{aligned} \tag{2.4}$$

The last term is a total derivative and vanishes for any $\delta\phi_a$ that decays at spatial infinity and obeys $\delta\phi_a(t_1, \mathbf{x}) = \delta\phi_a(t_2, \mathbf{x}) = 0$. For all such paths, we obtain the Euler-Lagrange equations of motion (EOMs) for the fields $\phi_a$, namely

$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) - \frac{\partial \mathcal{L}}{\partial \phi_a} = 0 . \tag{2.5}$$

Hamiltonian Formalism

The link between the Lagrangian formalism and the quantum theory goes via the path integral. While this is a powerful formalism, we will for the time being use canonical quantization, since it makes the transition to QM easier. For this we need the Hamiltonian formalism of field theory. We start by defining the momentum density $\pi^a(x)$ conjugate to $\phi_a(x)$,

$$\pi^a = \frac{\partial \mathcal{L}}{\partial \dot{\phi}_a} . \tag{2.6}$$

In terms of $\pi^a$, $\dot{\phi}_a$, and $\mathcal{L}$ the Hamiltonian density is given by

$$\mathcal{H} = \pi^a \dot{\phi}_a - \mathcal{L} , \tag{2.7}$$

where, as in classical mechanics, we have eliminated $\dot{\phi}_a$ in favor of $\pi^a$ everywhere in $\mathcal{H}$. The Hamiltonian then simply takes the form

$$H = \int d^3x\; \mathcal{H} . \tag{2.8}$$
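To see the definitions (2.5) to (2.7) at work, it may help to run them once on a concrete case. The following worked example is an illustration added here under stated assumptions: a single real scalar field $\phi$ with an unspecified potential $V(\phi)$, and the metric signature $(+,-,-,-)$. Neither the potential nor the specific Lagrangian is taken from this section.

```latex
% Assumed example Lagrangian: one real scalar field with potential V(phi),
% metric signature (+,-,-,-).
\mathcal{L} = \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - V(\phi)
            = \tfrac{1}{2}\,\dot\phi^{\,2} - \tfrac{1}{2}\,(\nabla\phi)^2 - V(\phi)

% Ingredients of the Euler-Lagrange equation (2.5):
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi ,
\qquad
\frac{\partial\mathcal{L}}{\partial\phi} = -V'(\phi)
\quad\Longrightarrow\quad
\partial_\mu\partial^\mu\phi + V'(\phi) = 0

% Conjugate momentum (2.6) and Hamiltonian density (2.7):
\pi = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi ,
\qquad
\mathcal{H} = \pi\,\dot\phi - \mathcal{L}
            = \tfrac{1}{2}\,\pi^2 + \tfrac{1}{2}\,(\nabla\phi)^2 + V(\phi)
```

For a potential that is bounded from below, each term in $\mathcal{H}$ is bounded from below as well, which is the field-theory analogue of a sum of kinetic and potential energy.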

2.2 Noether’s Theorem

The role of symmetries in field theory is possibly even more important than in particle mechanics. There are Lorentz symmetry, internal symmetries, gauge symmetries, supersymmetries, etc. We start here by recasting Noether’s theorem in a field theoretic framework.

Currents and Charges

Noether’s theorem states that every continuous symmetry of the Lagrangian gives rise to a conserved current $J^\mu(x)$, so that the EOMs (2.5) imply

$$\partial_\mu J^\mu = 0 , \tag{2.9}$$

or in components $\partial J^0/\partial t + \nabla \cdot \mathbf{J} = 0$. To every conserved current there exists also a conserved (global) charge $Q$, i.e., a physical quantity which keeps the same value at all times, defined as

$$Q = \int_{\mathbb{R}^3} d^3x\; J^0 . \tag{2.10}$$

The latter statement is readily shown by taking the time derivative of $Q$,

$$\frac{dQ}{dt} = \int_{\mathbb{R}^3} d^3x\; \frac{\partial J^0}{\partial t} = - \int_{\mathbb{R}^3} d^3x\; \nabla \cdot \mathbf{J} , \tag{2.11}$$

which is zero, if one assumes that $\mathbf{J}$ falls off sufficiently fast as $|\mathbf{x}| \to \infty$. Notice, however, that the existence of the conserved current $\mathbf{J}$ is much stronger than the existence of the (global) charge $Q$, because it implies that charge is in fact conserved locally. To see this, we define the charge in a finite volume $V$ by

$$Q_V = \int_V d^3x\; J^0 . \tag{2.12}$$

Repeating the above analysis, we find

$$\frac{dQ_V}{dt} = - \int_V d^3x\; \nabla \cdot \mathbf{J} = - \oint_S d\mathbf{S} \cdot \mathbf{J} , \tag{2.13}$$

where $S$ denotes the area bounding $V$, $d\mathbf{S}$ is a shorthand for $\mathbf{n}\, dS$ with $\mathbf{n}$ being the outward pointing unit normal vector of the boundary $S$, and we have used Gauss’ theorem. In physical terms the result means that any charge leaving $V$ must be accounted for by a flow of the current 3-vector $\mathbf{J}$ out of the volume. This kind of local conservation law of charge holds in any local field theory.

Proof of Theorem

In order to prove Noether’s theorem, we’ll consider infinitesimal transformations. This is always possible in the case of a continuous symmetry. We say that $\delta\phi_a$ is a symmetry of the theory, if the Lagrangian changes by a total derivative

$$\delta\mathcal{L}(\phi_a) = \partial_\mu \mathcal{J}^\mu(\phi_a) , \tag{2.14}$$

for a set of functions $\mathcal{J}^\mu$. We then consider the transformation of $\mathcal{L}$ under an arbitrary change of field $\delta\phi_a$. Glancing at (2.4) tells us that in this case

$$\delta\mathcal{L} = \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) . \tag{2.15}$$

When the EOMs are satisfied, then the term in square brackets vanishes so that we are simply left with the total derivative term. For a symmetry transformation satisfying (2.14), the relation (2.15) hence takes the form

$$\partial_\mu \mathcal{J}^\mu = \delta\mathcal{L} = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a \right) , \tag{2.16}$$

or simply $\partial_\mu J^\mu = 0$ with

$$J^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \delta\phi_a - \mathcal{J}^\mu , \tag{2.17}$$

which completes the proof. Notice that if the Lagrangian is invariant under the infinitesimal transformation $\delta\phi_a$, i.e., $\delta\mathcal{L} = 0$, then $\mathcal{J}^\mu = 0$ and $J^\mu$ contains only the first term on the right-hand side of (2.17).

We stress that our proof only goes through for continuous transformations for which there exists a choice of the transformation parameters resulting in a unit transformation, i.e., no transformation. An example is a Lorentz boost with some velocity $\mathbf{v}$, where for $\mathbf{v} = 0$ the coordinates $x$ remain unchanged. There are examples of symmetry transformations where this does not occur. E.g., a parity transformation does not have this property, and Noether’s theorem is not applicable then.

Energy-Momentum Tensor

Recall that in classical particle mechanics, spatial translation invariance gives rise to the conservation of momentum, while invariance under time translations is responsible for the conservation of energy. What happens in classical field theory? To figure it out, let’s have a look at infinitesimal translations

$$x^\nu \to x^\nu - \epsilon^\nu \quad \Longrightarrow \quad \phi_a(x) \to \phi_a(x + \epsilon) = \phi_a(x) + \epsilon^\nu \partial_\nu \phi_a(x) , \tag{2.18}$$

where the sign in the field transformation is plus, instead of minus, because we are doing an active, as opposed to passive, transformation. If the Lagrangian does not explicitly depend on $x$ but only through $\phi_a(x)$ (which will always be the case in the Lagrangians discussed in this course), the Lagrangian transforms under the infinitesimal translation as

$$\mathcal{L} \to \mathcal{L} + \epsilon^\nu \partial_\nu \mathcal{L} . \tag{2.19}$$

Since the change in $\mathcal{L}$ is a total derivative, we can invoke Noether’s theorem which gives us four conserved currents $T^\mu{}_\nu = (J^\mu)_\nu$, one for each of the translations $\epsilon^\nu$ ($\nu = 0, 1, 2, 3$). Using (2.17) together with (2.18) and (2.19), we readily read off the explicit expressions for $T^\mu{}_\nu$,

$$T^\mu{}_\nu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \partial_\nu \phi_a - \delta^\mu{}_\nu\, \mathcal{L} . \tag{2.20}$$

This quantity is called the energy-momentum (or stress-energy) tensor. It has dimension $[T^\mu{}_\nu] = 4$ and satisfies

$$\partial_\mu T^\mu{}_\nu = 0 . \tag{2.21}$$

The four “conserved charges” are ($\mu = 0, 1, 2, 3$)

$$P^\mu = \int d^3x\; T^{0\mu} . \tag{2.22}$$

Specifically, the “time component” of $P^\mu$ is

$$P^0 = \int d^3x\; T^{00} = \int d^3x \left( \pi^a \dot{\phi}_a - \mathcal{L} \right) , \tag{2.23}$$

which (looking at (2.7) and (2.8)) is nothing but the Hamiltonian $H$. We thus conclude that the charge $P^0$ is the total energy of the field configuration, and it is conserved. In field theory, energy conservation is thus a pure consequence of time translation symmetry, like it was in particle mechanics. Similarly, we can identify the charges $P^i$ ($i = 1, 2, 3$),

$$P^i = \int d^3x\; T^{0i} = - \int d^3x\; \pi^a \partial_i \phi_a , \tag{2.24}$$

as the momentum components of the field configuration in the three space directions, and they are of course also conserved.
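The conservation of $P^0$ can be made tangible with a small numerical experiment. The sketch below is an illustration under several assumptions that are not part of the notes: a free real scalar field with mass $m$ in one space dimension, discretized on a periodic lattice, evolved with a velocity-Verlet (leapfrog) integrator. The lattice energy is the discretized version of (2.23), and it should stay constant up to a small integration error.

```python
import math


def laplacian(phi, dx):
    """Second-order finite-difference Laplacian on a periodic 1D lattice."""
    n = len(phi)
    return [(phi[(i + 1) % n] - 2.0 * phi[i] + phi[i - 1]) / dx**2
            for i in range(n)]


def energy(phi, pi, dx, m):
    """Lattice version of P^0: sum of pi^2/2 + (d_x phi)^2/2 + m^2 phi^2/2."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / dx
        total += 0.5 * pi[i]**2 + 0.5 * grad**2 + 0.5 * m**2 * phi[i]**2
    return total * dx


def force(phi, dx, m):
    """Right-hand side of d(pi)/dt = laplacian(phi) - m^2 phi."""
    return [lap - m**2 * f for lap, f in zip(laplacian(phi, dx), phi)]


def step(phi, pi, dx, dt, m):
    """One velocity-Verlet step: half kick, full drift, half kick."""
    pi_half = [p + 0.5 * dt * f for p, f in zip(pi, force(phi, dx, m))]
    phi_new = [f + dt * p for f, p in zip(phi, pi_half)]
    pi_new = [p + 0.5 * dt * f for p, f in zip(pi_half, force(phi_new, dx, m))]
    return phi_new, pi_new


def evolve(n_sites=64, length=2.0 * math.pi, m=1.0, dt=0.01, n_steps=1000):
    """Evolve a standing wave and return the energy before and after."""
    dx = length / n_sites
    phi = [math.sin(i * dx) for i in range(n_sites)]
    pi = [0.0] * n_sites
    e_start = energy(phi, pi, dx, m)
    for _ in range(n_steps):
        phi, pi = step(phi, pi, dx, dt, m)
    return e_start, energy(phi, pi, dx, m)


if __name__ == "__main__":
    e_start, e_end = evolve()
    print(f"energy at t = 0:  {e_start:.6f}")
    print(f"energy at t = 10: {e_end:.6f}")
```

The residual difference between the two energies comes from the finite time step of the integrator and shrinks when `dt` is reduced; it is not a violation of the conservation law itself.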

2.3 Example: Electrodynamics

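As a simple application of the formalism we have developed so far in this section, let us try to derive Maxwell’s equations using the field theory formulation. In terms of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ and the charge density $\rho$ and 3-vector current $\mathbf{j}$, these equations take the well-known form

$$\nabla \cdot \mathbf{B} = 0 , \tag{2.25}$$

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 , \tag{2.26}$$

$$\nabla \cdot \mathbf{E} = \rho , \tag{2.27}$$

$$\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j} . \tag{2.28}$$

The $\mathbf{E}$ and $\mathbf{B}$ fields are spatial 3-vectors and can be expressed in terms of the components of the 4-vector field $A^\mu = (\phi, \mathbf{A})$ by

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} , \qquad \mathbf{B} = \nabla \times \mathbf{A} . \tag{2.29}$$

This definition ensures that the first two homogeneous Maxwell equations (2.25) and (2.26) are automatically satisfied,

$$\nabla \cdot (\nabla \times \mathbf{A}) = \epsilon^{ijk}\, \partial_i \partial_j A_k = 0 , \tag{2.30}$$

$$\nabla \times \left( -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \right) + \frac{\partial}{\partial t} \left( \nabla \times \mathbf{A} \right) = -\nabla \times (\nabla\phi) = -\epsilon^{ijk}\, \partial_j \partial_k \phi = 0 . \tag{2.31}$$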

Here $\epsilon^{ijk}$ is the fully antisymmetric Levi-Civita tensor with $\epsilon^{123} = -\epsilon_{123} = +1$, and the indices $i, j, k = 1, 2, 3$ are summed over if they appear twice. The remaining two inhomogeneous Maxwell equations (2.27) and (2.28) follow from the Lagrangian

$$\mathcal{L} = -\frac{1}{2}\, (\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac{1}{2}\, (\partial_\mu A^\mu)^2 - A_\mu J^\mu , \tag{2.32}$$

with $J^\mu = (\rho, \mathbf{j})$. From the rules presented in Section 2.1, we gather that the dimension of the vector field and current is $[A_\mu] = 1$ and $[J^\mu] = 3$, respectively. The funny minus sign of the first term on the right-hand side is required to ensure that the kinetic term $\tfrac{1}{2} \dot{A}_i^2$ is positive using the Minkowski metric. Notice also that the Lagrangian (2.32) has no kinetic term $\tfrac{1}{2} \dot{A}_0^2$ and hence $A_0$ is not dynamical. Why this is and necessarily has to be the case will only become fully clear if you attend the advanced QFT course. Yet, we can already get an idea of what is going on by remembering that the photon (the quantum of electrodynamics) has only two polarization states, i.e., two physical dofs, while the massless vector field $A_\mu$ has obviously four dofs. The fact that the time component $A_0$ is not dynamical reduces the number of independent dofs in $A_\mu$ from four to three. But this is still one too many. The last unwanted dof can be “gauged away” using the gauge symmetry of the quantum version of electromagnetism aka quantum electrodynamics (QED).

Enough said, let’s do serious business and compute something. To see that the statement made before (2.32) is indeed correct, we first evaluate

$$\frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} = -\partial^\mu A^\nu + (\partial_\rho A^\rho)\, \eta^{\mu\nu} , \qquad \frac{\partial \mathcal{L}}{\partial A_\nu} = -J^\nu , \tag{2.33}$$

from which we derive the EOMs,

$$0 = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} \right) - \frac{\partial \mathcal{L}}{\partial A_\nu} = -\partial^2 A^\nu + \partial^\nu (\partial_\rho A^\rho) + J^\nu = -\partial_\mu \left( \partial^\mu A^\nu - \partial^\nu A^\mu \right) + J^\nu . \tag{2.34}$$

Introducing now the field-strength tensor

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \tag{2.35}$$

we can write (2.32) and (2.34) quite compactly (in the case of the Lagrangian up to a total derivative, which does not affect the EOMs),

$$\mathcal{L} = -\frac{1}{4}\, F_{\mu\nu} F^{\mu\nu} - J_\mu A^\mu , \tag{2.36}$$

$$\partial_\mu F^{\mu\nu} = J^\nu . \tag{2.37}$$

Does this look familiar? I hope so. Notice that $[F_{\mu\nu}] = 2$. In order to see that (2.37) indeed captures the physics of (2.27) and (2.28), we compute the components of $F^{\mu\nu}$. We find

$$\begin{aligned}
F^{0i} &= -F^{i0} = \partial^0 A^i - \partial^i A^0 = \left( \nabla\phi + \frac{\partial \mathbf{A}}{\partial t} \right)^i = -E^i , \\
F^{ij} &= -F^{ji} = \partial^i A^j - \partial^j A^i = -\epsilon^{ijk} B^k ,
\end{aligned} \tag{2.38}$$

while all other components are zero. With this in hand, we then obtain from $\partial_\mu F^{\mu 0} = \rho$ and $\partial_\mu F^{\mu 1} = j^1$,

$$\begin{aligned}
\partial_\mu F^{\mu 0} &= \partial_0 F^{00} + \partial_i F^{i0} = \nabla \cdot \mathbf{E} = \rho , \\
\partial_\mu F^{\mu 1} &= \partial_0 F^{01} + \partial_i F^{i1} = -\frac{\partial E^1}{\partial t} + \frac{\partial B^3}{\partial x^2} - \frac{\partial B^2}{\partial x^3} = \left( \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} \right)^1 = j^1 .
\end{aligned} \tag{2.39}$$

Similar relations hold for the remaining components $i = 2, 3$. Taken together this proves the second inhomogeneous Maxwell equation (2.28).

Let me also derive the energy-momentum tensor $T^{\mu\nu}$ of electrodynamics, ignoring for the moment the source term $A_\mu J^\mu$. Using (2.33) one finds

$$T^{\mu\nu} = (\partial^\nu A^\mu)(\partial_\rho A^\rho) - (\partial^\mu A^\rho)(\partial^\nu A_\rho) + \frac{1}{4}\, \eta^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} . \tag{2.40}$$

Notice that the first term in (2.40) is not symmetric, which implies that $T^{\mu\nu} \neq T^{\nu\mu}$. In fact, this is not really surprising since the definition of the energy-momentum tensor (2.20) does not exhibit an explicit symmetry in the indices $\mu$ and $\nu$. Nevertheless, there is typically a way to massage the energy-momentum tensor of any theory into a symmetric form.⁵ To learn how this can be done in the case under consideration is the objective of a homework problem.
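The component formulas (2.38) are easy to get wrong by a sign, so a quick numerical check can be reassuring. The sketch below builds $F^{\mu\nu}$ from given $\mathbf{E}$ and $\mathbf{B}$ according to (2.38), lowers the indices with $\eta_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$, and evaluates $-\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}$, which for these components equals the standard result $\tfrac{1}{2}(\mathbf{E}^2 - \mathbf{B}^2)$. All function names and the sample field values are illustrative assumptions.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2} (i.e. spatial 1, 2, 3)."""
    return (i - j) * (j - k) * (k - i) // 2


def field_strength(E, B):
    """Contravariant F^{mu nu} built from E and B as in (2.38)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]          # F^{0i} = -E^i
        F[i + 1][0] = E[i]           # F^{i0} = +E^i
        for j in range(3):
            # F^{ij} = -epsilon^{ijk} B^k
            F[i + 1][j + 1] = -sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F


def lower_indices(F):
    """Covariant F_{mu nu} = eta_{mu rho} eta_{nu sigma} F^{rho sigma}."""
    return [[ETA[mu] * ETA[nu] * F[mu][nu] for nu in range(4)] for mu in range(4)]


def free_lagrangian(E, B):
    """Source-free Lagrangian density -F_{mu nu} F^{mu nu} / 4."""
    F_up = field_strength(E, B)
    F_down = lower_indices(F_up)
    return -0.25 * sum(F_down[mu][nu] * F_up[mu][nu]
                       for mu in range(4) for nu in range(4))


if __name__ == "__main__":
    E = [1.0, 2.0, 3.0]
    B = [0.5, -1.0, 2.0]
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in B)
    print("-F^2/4        =", free_lagrangian(E, B))
    print("(E^2 - B^2)/2 =", 0.5 * (e2 - b2))
```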

2.4 Space-Time Symmetries

One of the main motivations to develop QFT is to reconcile QM with special relativity. We thus want to construct field theories in which space and time are placed on an equal footing and the theory is invariant under Lorentz transformations,

$$x^\mu \to (x')^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{2.41}$$

with

$$\eta^{\mu\nu} = \eta^{\rho\sigma} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma , \tag{2.42}$$

so that the distance $ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu$ is preserved. Here $\eta^{\mu\nu} = \eta_{\mu\nu} = \mathrm{diag}\,(1, -1, -1, -1)$ denotes the Minkowski metric. E.g., a rotation by the angle θ about the z-axis and a boost by v < 1 along the x-axis are respectively described by the following Lorentz transformations

$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \qquad \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.43}$$

with $\gamma = 1/\sqrt{1 - v^2}$. The Lorentz transformations form a Lie group under matrix multiplication. You can learn more about this if you attend the lecture course on group theory held by Andre Lukas. Alternatively, you can study the “group theory crash course” written by Martin Bauer (a PhD student at Mainz University). It can be found on my Oxford homepage.

⁵ One (but not the only) reason that you might want to have a symmetric energy-momentum tensor $T^{\mu\nu}$ is to make contact with general relativity, since such an object sits on the right-hand side of Einstein’s field equations.

The various fields belong to different representations of the Lorentz group. The simplest example is the scalar field φ, which under the Lorentz transformation x → Λx⁶ transforms as

$$\phi(x) \to \phi'(x) = \phi(\Lambda^{-1} x) . \tag{2.44}$$
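The active-transformation rule (2.44) can be tried out on a non-relativistic toy field. The sketch below is only an illustration: the Gaussian "temperature" profile and the function names are assumptions made for this example, not part of the lecture. It rotates a field with a hotspot at (1, 0, 0) by 90° about the z-axis and evaluates the new field as $\phi'(x) = \phi(R^{-1}x)$.

```python
import math


def rotate_z(point, angle):
    """Apply the active rotation R by `angle` about the z-axis to a 3-vector."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def phi(point):
    """Illustrative temperature field: a Gaussian hotspot centred at (1, 0, 0)."""
    x, y, z = point
    return math.exp(-((x - 1.0) ** 2 + y ** 2 + z ** 2))


def phi_transformed(point, angle):
    """Actively rotated field, phi'(x) = phi(R^{-1} x)."""
    # The inverse of a rotation by `angle` is the rotation by -angle.
    return phi(rotate_z(point, -angle))


angle = math.pi / 2
print(phi_transformed((0.0, 1.0, 0.0), angle))  # value at the new hotspot position
print(phi_transformed((1.0, 0.0, 0.0), angle))  # value at the old hotspot position
```

Using R instead of $R^{-1}$ in `phi_transformed` would move the hotspot to (0, -1, 0), i.e. rotate the field the wrong way round.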

The inverse $\Lambda^{-1}$ appears in the argument because we are dealing with an active transformation, in which the field is truly shifted. To see why this means that the inverse appears, it will suffice to consider a non-relativistic example such as a temperature field. Suppose we start with an initial field φ(x) which has a hotspot at, say, x = (1, 0, 0). Let’s now make a rotation x → Rx about the z-axis so that the hotspot ends up at x = (0, 1, 0). If we want to express the new field φ'(x) in terms of the old field φ(x), we have to place ourselves at x = (0, 1, 0) and ask what the old field looked like at the point $R^{-1}x = (1, 0, 0)$ we came from. This $R^{-1}$ is the origin of the $\Lambda^{-1}$ factor in the argument of the transformed field in (2.44).

The Lagrangian formulation of field theory makes it especially easy to discuss Lorentz invariance, since an EOM is automatically Lorentz invariant if it follows from a Lagrangian that is a Lorentz scalar. This is an immediate consequence of the principle of least action. If a Lorentz transformation leaves the Lagrangian unchanged, the transformation of an extremum in the action will be another extremum. To give an example, let’s look at the following Lagrangian

$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} m^2 \phi^2 , \tag{2.45}$$

where φ is a real scalar and, as we will see later, m is the mass of φ (for now just think of m as a parameter). Obviously, the dimension of the field is [φ] = 1. You will show in a homework assignment that the EOM corresponding to (2.45) takes the form

$$\left(\partial^\mu \partial_\mu + m^2\right) \phi = 0 . \tag{2.46}$$

This equation is the famous Klein-Gordon equation. The Laplacian in Minkowski space is sometimes denoted by $\Box$. In this notation, the Klein-Gordon equation reads $(\Box + m^2)\phi = 0$.

Let us first check that a Lorentz transformation Λ leaves the Lagrangian (2.45) and its action invariant. According to (2.44), the mass term transforms as $\frac{1}{2} m^2 \phi^2(x) \to \frac{1}{2} m^2 \phi^2(x')$ with $x' = \Lambda^{-1} x$. The transformation of $\partial_\mu \phi$ is

$$\partial_\mu \phi(x) \to \partial_\mu \left(\phi(x')\right) = (\Lambda^{-1})^\nu{}_\mu\, (\partial_\nu \phi)(x') . \tag{2.47}$$

Using (2.42) we thus find that the derivative term in the Klein-Gordon Lagrangian behaves as

$$\begin{aligned}
\frac{1}{2} \left(\partial_\mu \phi(x)\right)^2 &\to \frac{1}{2}\, (\Lambda^{-1})^\rho{}_\mu (\partial_\rho \phi)(x')\, (\Lambda^{-1})^\sigma{}_\nu (\partial_\sigma \phi)(x')\, \eta^{\mu\nu} \\
&= \frac{1}{2}\, (\partial_\rho \phi)(x') (\partial_\sigma \phi)(x')\, \eta^{\rho\sigma} \\
&= \frac{1}{2} \left(\partial_\mu \phi(x')\right)^2 ,
\end{aligned} \tag{2.48}$$

under the Lorentz transformation Λ. Putting things together, we find that the action of the Klein-Gordon theory is indeed Lorentz invariant,

$$S = \int d^4x\, \mathcal{L}(x) \to \int d^4x\, \mathcal{L}(x') = \int d^4x'\, \mathcal{L}(x') = S . \tag{2.49}$$

Notice that changing the integration variables from $d^4x$ to $d^4x'$ in principle introduces a Jacobian factor det(Λ). This factor is, however, equal to 1 for the Lorentz transformations connected to the identity that we are dealing with. A similar calculation shows that, as promised, the EOM of the Klein-Gordon field φ is also invariant,

$$\begin{aligned}
\left(\partial^2 + m^2\right) \phi(x) &\to \left(\partial^2 + m^2\right) \phi(x') \\
&= \left[ (\Lambda^{-1})^\nu{}_\mu\, \partial_\nu\, (\Lambda^{-1})^{\rho\mu}\, \partial_\rho + m^2 \right] \phi(x') \\
&= \left( \eta^{\nu\rho} \partial_\nu \partial_\rho + m^2 \right) \phi(x') = 0 .
\end{aligned} \tag{2.50}$$

In the case of the Klein-Gordon theory, we hence conclude that the statements made before (2.45) are indeed correct.

⁶ To shorten the notation we will often use matrix notation and drop the indices μ, etc.

Representations of Lorentz Group

The transformation law (2.44) is the simplest possible transformation law for a field. In fact, it is the only possibility for a one-component field, aka a real scalar. Yet, it is also clear that in order to describe nature (think only about electromagnetism) we need multicomponent fields, which have more complicated transformation properties. The most familiar case is that of a vector field, such as the vector potential $A^\mu$, which we have already met in Section 2.3. In this case the quantity that is distributed in space-time also carries an orientation which must be rotated and/or boosted. In fact, we will learn in this course that the Lorentz group has a variety of representations, corresponding to particles with integer (bosons) and half-integer spins (fermions) in QFT. These representations are normally constructed out of spinors.

To start this general (and somewhat formal) discussion, let me examine the allowed possibilities for linear field transformations

$$\phi_a(x) \to \phi'_a(x') = M(\Lambda)_{ab}\, \phi_b(x) , \tag{2.51}$$

under (2.41). The first important point to notice is that the Lorentz transformations form a group.
This means that two successive Lorentz transformations,

$$x \to x' = \Lambda x , \qquad x' \to x'' = \Lambda' x' , \tag{2.52}$$

can also be described in terms of a single one

$$x \to x'' = \Lambda'' x , \tag{2.53}$$

with

$$\Lambda'' = \Lambda' \Lambda . \tag{2.54}$$
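The closure property (2.54) can be checked numerically for the two explicit matrices in (2.43). The following is a minimal sketch in plain Python; the helper names (`rotation_z`, `boost_x`, `is_lorentz`) and the sample values θ = 0.3, v = 0.6 are choices made for this illustration. It tests the defining condition (2.42) in matrix form, $\Lambda \eta \Lambda^T = \eta$, for a rotation, a boost, and their product.

```python
import math

# Minkowski metric diag(1, -1, -1, -1)
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def rotation_z(theta):
    """Rotation by theta about the z-axis, first matrix in (2.43)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def boost_x(v):
    """Boost with velocity v < 1 along the x-axis, second matrix in (2.43)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def is_lorentz(lam, tol=1e-12):
    """Check eta = Lambda eta Lambda^T, i.e. condition (2.42) in matrix form."""
    lhs = matmul(lam, matmul(ETA, transpose(lam)))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))


rotation = rotation_z(0.3)
boost = boost_x(0.6)
composite = matmul(boost, rotation)  # Lambda'' = Lambda' Lambda

print(is_lorentz(rotation), is_lorentz(boost), is_lorentz(composite))
```

A generic matrix, for instance a rescaling of all coordinates by a factor of two, fails the same test, which shows that the check is not trivially satisfied.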

What happens to (2.51) under this set of Lorentz transformations? For x → x'' = Λ''x, we have (in matrix notation)

$$\phi(x) \to \phi''(x'') = M(\Lambda'')\, \phi(x) . \tag{2.55}$$

On the other hand, for x → x' = Λx → x'' = Λ'Λx, we get

$$\phi(x) \to \phi''(x'') = M(\Lambda')\, \phi'(x') = M(\Lambda') M(\Lambda)\, \phi(x) . \tag{2.56}$$

In order for the last two equations to be consistent with each other, the field transformations M must obviously fulfill

$$M(\Lambda' \Lambda) = M(\Lambda') M(\Lambda) . \tag{2.57}$$

In group theory terminology, this means that the matrices M furnish a representation of the Lorentz group. Field Lorentz transformations are therefore not random, but can be found if we find all (finite-dimensional) representations of the Lorentz group. So what do the common representations of the Lorentz group look like, and how do we get all of them? While both questions will be answered in this lecture, I believe it is best to do it case-by-case whenever we meet a new type of (quantum) field. Since we have already talked about the real scalar φ (Klein-Gordon field) and the vector $A^\mu$ (potential in electrodynamics), it nevertheless makes sense to give the representations for these two types of fields already at this point.

Since a scalar field by definition does not change under Lorentz transformations, φ(x) → φ'(x') = φ(x), the scalar representation of the Lorentz group is simply

$$M(\Lambda) = 1 . \tag{2.58}$$

This was easy! The representation of the vector $A^\mu$ is also not difficult to figure out. Let me for the time being only state the result. One finds

$$M(\Lambda) = \Lambda , \tag{2.59}$$

which means that a vector field $A^\mu$ transforms under a Lorentz transformation as (restoring indices)

$$A^\mu(x) \to (A')^\mu(x') = \Lambda^\mu{}_\nu\, A^\nu(x) . \tag{2.60}$$

It is important to notice that the latter transformation property implies that any term built out of $A^\mu$ and $\partial_\mu$ in which all Lorentz indices are contracted is invariant under Lorentz transformations. As an exercise you are supposed to show this explicitly for terms like $\partial_\mu A^\mu$, etc.

Angular Momentum

In classical particle mechanics, rotational invariance gives rise to conservation of angular momentum. What is the analogue in field theory? Moreover, we now have further Lorentz transformations, namely boosts. What conserved quantity do they correspond to? In order to address these questions, we first need the infinitesimal form of the Lorentz transformations

$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu , \tag{2.61}$$

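where $\omega^\mu{}_\nu$ is infinitesimal. The condition (2.42) for Λ to be a Lorentz transformation becomes in infinitesimal form

$$\eta^{\mu\nu} = \eta^{\rho\sigma} \left(\delta^\mu{}_\rho + \omega^\mu{}_\rho\right) \left(\delta^\nu{}_\sigma + \omega^\nu{}_\sigma\right) = \eta^{\mu\nu} + \omega^{\mu\nu} + \omega^{\nu\mu} + O(\omega^2) , \tag{2.62}$$

which implies that $\omega^{\mu\nu}$ must be an antisymmetric matrix,

$$\omega^{\mu\nu} = -\omega^{\nu\mu} . \tag{2.63}$$

Notice that an antisymmetric 4 × 4 matrix has six independent parameters, which agrees with the number of different Lorentz transformations, i.e., three rotations and three boosts. Applying the infinitesimal Lorentz transformation to our real scalar field φ, we have

$$\phi(x) \to \phi(x - \omega x) = \phi(x) - \omega^\mu{}_\nu\, x^\nu \partial_\mu \phi(x) , \tag{2.64}$$

where the minus sign arises from the factor $\Lambda^{-1}$ in (2.44). The variation of the field φ under an infinitesimal Lorentz transformation is hence given by

$$\delta\phi = -\omega^\mu{}_\nu\, x^\nu \partial_\mu \phi . \tag{2.65}$$

By the same line of reasoning, one shows that the variation of the Lagrangian is

$$\delta\mathcal{L} = -\omega^\mu{}_\nu\, x^\nu \partial_\mu \mathcal{L} = -\partial_\mu \left(\omega^\mu{}_\nu\, x^\nu \mathcal{L}\right) , \tag{2.66}$$

where in the last step we used the fact that $\omega^\mu{}_\mu = 0$ due to its antisymmetry. The Lagrangian changes by a total derivative, so we can apply Noether’s theorem (2.17) with $J^\mu = -\omega^\mu{}_\nu\, x^\nu \mathcal{L}$ to find the conserved current,

$$\begin{aligned}
J^\mu &= -\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\, \omega^\rho{}_\nu\, x^\nu \partial_\rho \phi + \omega^\mu{}_\nu\, x^\nu \mathcal{L} \\
&= -\omega^\rho{}_\nu \left[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\, \partial_\rho \phi - \delta^\mu{}_\rho\, \mathcal{L} \right] x^\nu = -\omega^\rho{}_\nu\, T^\mu{}_\rho\, x^\nu .
\end{aligned} \tag{2.67}$$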

Stripping off $\omega^\rho{}_\nu$, we obtain six different currents, which we write as

$$(J^\lambda)^{\mu\nu} = x^\mu T^{\lambda\nu} - x^\nu T^{\lambda\mu} . \tag{2.68}$$

These currents satisfy

$$\partial_\lambda (J^\lambda)^{\mu\nu} = 0 , \tag{2.69}$$

and give (as usual) rise to six conserved charges. For μ, ν ≠ 0, the Lorentz transformation is a rotation and the three conserved charges give the total angular momentum of the field (i, j = 1, 2, 3):

$$Q^{ij} = \int d^3x \left( x^i T^{0j} - x^j T^{0i} \right) . \tag{2.70}$$

What about the boosts? In this case, the conserved charges are

$$Q^{0i} = \int d^3x \left( x^0 T^{0i} - x^i T^{00} \right) . \tag{2.71}$$

The fact that these are conserved tells us that

$$\begin{aligned}
0 = \frac{dQ^{0i}}{dt} &= \int d^3x\, T^{0i} + t \int d^3x\, \frac{dT^{0i}}{dt} - \frac{d}{dt} \int d^3x\, x^i T^{00} \\
&= P^i + t\, \frac{dP^i}{dt} - \frac{d}{dt} \int d^3x\, x^i T^{00} .
\end{aligned} \tag{2.72}$$

Yet, the momentum $P^i$ is also conserved, i.e., $dP^i/dt = 0$, and we conclude that

$$\frac{d}{dt} \int d^3x\, x^i T^{00} = \text{const.} \tag{2.73}$$

This is the statement that the center of energy of the field travels with a constant velocity. In a sense it’s a field-theoretical version of Newton’s first law but, rather surprisingly, appearing here as a conservation law. Notice that after restoring the label a, our results for $(J^\lambda)^{\mu\nu}$ etc. also apply in the case of multicomponent fields.

Poincaré Invariance

We now require that a physical system possesses both space-time translation (2.18) and Lorentz transformation symmetry (2.41). The symmetry group that includes both transformations is called the Poincaré group. Notice that for any Poincaré-invariant theory the two charge conservation equations (2.21) and (2.69) should hold. This is only possible if the energy-momentum tensor $T^{\mu\nu}$ is symmetric. Indeed,

$$\begin{aligned}
0 = \partial_\lambda (J^\lambda)^{\mu\nu} &= \partial_\lambda \left( x^\mu T^{\lambda\nu} - x^\nu T^{\lambda\mu} \right) \\
&= x^\mu\, \partial_\lambda T^{\lambda\nu} + T^{\lambda\nu}\, \partial_\lambda x^\mu - x^\nu\, \partial_\lambda T^{\lambda\mu} - T^{\lambda\mu}\, \partial_\lambda x^\nu \\
&= T^{\lambda\nu}\, \delta_\lambda{}^\mu - T^{\lambda\mu}\, \delta_\lambda{}^\nu = T^{\mu\nu} - T^{\nu\mu} .
\end{aligned} \tag{2.74}$$

Since Maxwell’s theory is Poincaré invariant, this general result tells us that the expression of the energy-momentum tensor in (2.40) can be made symmetric without changing the physics. The key to actually doing so lies in making use of the conservation law (2.21) in an appropriate way.

2.5 Problems

i) Suppose that an otherwise unspecified Lagrangian $\mathcal{L}$ depends not only on φ and $\partial_\mu \phi$ but also on the second derivatives of the fields:⁷

$$\mathcal{L} = \mathcal{L}(\phi, \partial_\mu \phi, \partial_\mu \partial_\nu \phi) . \tag{2.75}$$

For the case that the variations δφ vanish at the endpoints and that $\delta(\partial_{\mu_1} \ldots \partial_{\mu_N} \phi) = \partial_{\mu_1} \ldots \partial_{\mu_N} (\delta\phi)$ holds, derive the Euler-Lagrange EOMs for such a theory. Apply your result to obtain the EOMs for the field φ with Lagrangian

$$\mathcal{L} = \frac{1}{2} (\partial_t \phi)(\partial_x \phi) + \frac{\alpha}{6} (\partial_x \phi)^3 - \frac{\nu}{2} (\partial_\nu \partial_\mu \phi)^2 . \tag{2.76}$$

⁷ For the sake of brevity, we have omitted the subscript a labelling the different fields.

ii) Let us study the dynamics of acoustic waves in an elastic medium (e.g. air), as described by the Lagrangian

$$\mathcal{L} = \frac{1}{2}\, \rho \left( \frac{\partial y}{\partial t} \right)^2 - \frac{1}{2}\, \rho\, v_{\rm sound}^2\, (\nabla y)^2 , \tag{2.77}$$

with ρ the density of the medium and $v_{\rm sound}$ the speed of sound. Find the Euler-Lagrange EOMs for the system and their solutions. What do they describe? Calculate the Hamiltonian H.

iii) Consider the Klein-Gordon Lagrangian (2.45). Derive the kinetic and potential energy (T and V with L = T − V) as well as the Euler-Lagrange EOMs for the field φ. Write down the energy-momentum tensor $T^{\mu\nu}$ and show that it indeed satisfies $\partial_\mu T^{\mu\nu} = 0$. Give the expressions for the conserved energy E and momentum $P^i$.

iv) Using (2.60) show that the terms $\partial_\mu A^\mu$, $(\partial_\mu A_\nu)^2$, and $(\partial_\mu A_\nu)(\partial^\nu A^\mu)$ are Lorentz invariant. What are the dimensions of these terms?

v) We saw that in the case of electrodynamics in vacuum using (2.20) leads to an energy-momentum tensor $T^{\mu\nu}$ that is not symmetric. To remedy that, one can add to $T^{\mu\nu}$ a term of the form $\partial_\lambda \Gamma^{\lambda\mu\nu}$, where $\Gamma^{\lambda\mu\nu}$ is antisymmetric in its first two indices, i.e., $\Gamma^{\lambda\mu\nu} = -\Gamma^{\mu\lambda\nu}$. Show that such an object is automatically divergenceless, i.e., it obeys $\partial_\mu \partial_\lambda \Gamma^{\lambda\mu\nu} = 0$. This feature implies that instead of $T^{\mu\nu}$ one can also use

$$\Theta^{\mu\nu} = T^{\mu\nu} + \partial_\lambda \Gamma^{\lambda\mu\nu} , \tag{2.78}$$

without changing the physics, since $\Theta^{\mu\nu}$ has the same globally conserved energy and momentum as $T^{\mu\nu}$. Show that this construction, with

$$\Gamma^{\lambda\mu\nu} = F^{\mu\lambda} A^\nu , \tag{2.79}$$

leads to an energy-momentum tensor $\Theta^{\mu\nu}$ that is symmetric and yields the standard formulas for the electromagnetic energy and momentum densities:

$$\mathcal{E} = \frac{1}{2} \left( \mathbf{E}^2 + \mathbf{B}^2 \right) , \qquad \mathbf{S} = \mathbf{E} \times \mathbf{B} . \tag{2.80}$$

References

[1] Chapter 4 of L. D. Landau and E. M. Lifshitz, “The Classical Theory of Fields, Fourth Edition: Vol. 2 (Course of Theoretical Physics Series),” Butterworth-Heinemann (1975), 481 p.

[2] Chapter 5 of B. Thidé, “Electromagnetic Field Theory,” revised and extended 2nd edition, http://www.plasma.uu.se/CED/Book/index.html

3 Klein-Gordon Theory

In QM, canonical quantization is a recipe that takes us from the Hamiltonian formalism of classical dynamics to the quantum theory. The recipe tells us to take the generalized coordinates $q_a$ and their conjugate momenta $p^a = \partial L/\partial \dot{q}_a$ and promote them to operators. The Poisson bracket structure of classical mechanics descends to the structure of commutation relations between operators, namely

$$[q_a, q_b] = [p^a, p^b] = 0 , \qquad [q_a, p^b] = i\, \delta_a{}^b , \tag{3.1}$$

where [a, b] = ab − ba is the usual commutator. If one wants to construct a QFT, one can proceed in a similar fashion. The idea is to start with the classical field theory and then to “quantize” it, i.e., reinterpret the dynamical variables as operators that obey canonical commutation relations,⁸

$$[\phi_a(\mathbf{x}), \phi_b(\mathbf{y})] = [\pi^a(\mathbf{x}), \pi^b(\mathbf{y})] = 0 , \qquad [\phi_a(\mathbf{x}), \pi^b(\mathbf{y})] = i\, \delta^{(3)}(\mathbf{x} - \mathbf{y})\, \delta_a{}^b . \tag{3.2}$$

Here $\phi_a(\mathbf{x})$ are field operators and the Kronecker delta in (3.1) has been replaced by a delta function since the momentum conjugates $\pi^a(\mathbf{x})$ are densities. Notice that for now we are working in the Schrödinger picture, which means that the operators $\phi_a(\mathbf{x})$ and $\pi^a(\mathbf{x})$ depend only on the spatial coordinates but not on time. The time dependence sits in the states $|\psi\rangle$, which obey the usual Schrödinger equation

$$i\, \frac{d}{dt} |\psi\rangle = H |\psi\rangle . \tag{3.3}$$

While all this looks pretty much the same as good old QM, there is an important difference. The wavefunction $|\psi\rangle$ in QFT is a functional, i.e., a function of every possible configuration of the field $\phi_a$, and not a simple function.⁹ So things are more complicated in QFT than in QM after all.

The Hamiltonian H, being a function of $\phi_a$ and $\pi^a$, also becomes an operator in QFT. In order to solve the theory, one task is to find the spectrum, i.e., the eigenvalues and eigenstates of H. This is usually very difficult, since there is an infinite number of dofs within QFT, at least one for each point x in space. However, for certain theories, called free theories, one can find a way to write the dynamics such that each dof evolves independently from all the others. Free field theories typically have Lagrangians which are quadratic in the fields, so that the EOMs are linear.

⁸ This procedure is sometimes referred to as second quantization. We will not use this terminology here.

⁹ In functional analysis, a functional is a map from a vector space to the field underlying the vector space, which is usually the real numbers. In other words, it is a function that takes a vector as its argument or input and returns a scalar. Commonly, the vector space is a space of functions, so the functional takes a function as its argument, and so it is sometimes referred to as a function of a function. The use of functionals goes back to the calculus of variations, where one searches for a function which minimizes a certain functional. A particularly important application in physics is to search for a state of a system which minimizes the energy functional.

3.1 Klein-Gordon Field as Harmonic Oscillators

So far the discussion in this section was rather general. Let us be more specific and consider the simplest relativistic free theory as a practical example. It is provided by the classical Klein-Gordon equation (2.46). To exhibit the coordinates in which the dofs decouple from each other, we only have to Fourier transform the field φ,

$$\phi(t, \mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\, e^{i \mathbf{p}\cdot\mathbf{x}}\, \phi(t, \mathbf{p}) . \tag{3.4}$$

In momentum space (2.46) simply reads

$$\left( \frac{\partial^2}{\partial t^2} + \mathbf{p}^2 + m^2 \right) \phi(t, \mathbf{p}) = 0 , \tag{3.5}$$

which tells us that for each value of p, the Fourier transform φ(t, p) solves the equation of a harmonic oscillator with frequency

$$\omega_{\mathbf{p}} = \sqrt{|\mathbf{p}|^2 + m^2} . \tag{3.6}$$

We see that the most general solution of the classical Klein-Gordon equation is a linear superposition of simple harmonic oscillators, each vibrating at a different frequency with a different amplitude. In order to quantize the field φ, we must hence only quantize this infinite number of harmonic oscillators (as Sidney Coleman once said [1]: “The career of a young theoretical physicist consists of treating the harmonic oscillator in ever-increasing levels of abstraction.”). Let’s recall how to do it in QM.

Harmonic Oscillator in QM

Consider the QM Hamiltonian

$$H = \frac{1}{2}\, p^2 + \frac{1}{2}\, \omega^2 q^2 , \tag{3.7}$$

with the canonical commutation relation [q, p] = i. In order to find the spectrum of the system, we define annihilation and creation operators (also known as lowering and raising or ladder operators)

$$a = \sqrt{\frac{\omega}{2}}\, q + \frac{i}{\sqrt{2\omega}}\, p , \qquad a^\dagger = \sqrt{\frac{\omega}{2}}\, q - \frac{i}{\sqrt{2\omega}}\, p . \tag{3.8}$$

Expressing q and p through a and $a^\dagger$ gives

$$q = \frac{1}{\sqrt{2\omega}} \left( a + a^\dagger \right) , \qquad p = -i \sqrt{\frac{\omega}{2}} \left( a - a^\dagger \right) . \tag{3.9}$$

The commutator of the operators introduced in (3.8) is readily computed. One finds $[a, a^\dagger] = 1$. Expressing the Hamiltonian (3.7) through a and $a^\dagger$ gives

$$H = \frac{\omega}{2} \left( a a^\dagger + a^\dagger a \right) = \omega \left( a^\dagger a + \frac{1}{2} \right) . \tag{3.10}$$
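The algebra of (3.8) to (3.10) can be made concrete with finite matrices. The sketch below is an illustration under stated assumptions: it represents a and $a^\dagger$ in a number basis truncated to `dim` states (the truncation, the value ω = 2 and all names are choices made for this example). Because of the truncation, $[a, a^\dagger] = 1$ only holds on the states below the highest one, so the checks are restricted to those.

```python
import math


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ladder_matrices(dim):
    """Matrices of a and a^dagger in the truncated number basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # a|n> = sqrt(n) |n-1>
    a_dag = [list(col) for col in zip(*a)]  # real matrix, so dagger = transpose
    return a, a_dag


omega, dim = 2.0, 8
a, a_dag = ladder_matrices(dim)
a_adag = matmul(a, a_dag)
adag_a = matmul(a_dag, a)

# Diagonal of [a, a^dagger], omitting the last state where the truncation bites.
commutator_diag = [a_adag[n][n] - adag_a[n][n] for n in range(dim - 1)]

# Diagonal of H = (omega/2)(a a^dagger + a^dagger a), first form of (3.10).
energies = [0.5 * omega * (a_adag[n][n] + adag_a[n][n]) for n in range(dim - 1)]

print(commutator_diag)
print(energies)
```

The diagonal entries of H reproduce the second form of (3.10), $\omega(a^\dagger a + 1/2)$, with $a^\dagger a$ taking the integer values 0, 1, 2, ... on the basis states.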

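It is also easy to show that the commutator of H with a and $a^\dagger$ takes the form

$$[H, a^\dagger] = \omega\, a^\dagger , \qquad [H, a] = -\omega\, a . \tag{3.11}$$

These relations imply that if $|\psi\rangle$ is an eigenstate of H with energy E, i.e., $H|\psi\rangle = E|\psi\rangle$, then we can construct other eigenstates by acting with the operators a and $a^\dagger$ on $|\psi\rangle$:

$$H a^\dagger |\psi\rangle = (E + \omega)\, a^\dagger |\psi\rangle , \qquad H a |\psi\rangle = (E - \omega)\, a |\psi\rangle . \tag{3.12}$$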

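This feature explains why a ($a^\dagger$) is called the annihilation (creation) operator. From the latter equation it is also clear that the spectrum of (3.7) has a ladder structure, . . . , E − 2ω, E − ω, E, E + ω, E + 2ω, . . . . If the energy is bounded from below, there must be a ground state $|0\rangle$, which satisfies $a|0\rangle = 0$. This state has the ground state or zero-point energy $H|0\rangle = \omega/2\, |0\rangle$. Excited states $|n\rangle$ are then created by the repeated action of $a^\dagger$,

$$(a^\dagger)^n |0\rangle = \sqrt{n (n-1) \ldots 1}\; |n\rangle = \sqrt{n!}\; |n\rangle , \tag{3.13}$$

and satisfy

$$H |n\rangle = \omega \left( N + \frac{1}{2} \right) |n\rangle = \omega \left( n + \frac{1}{2} \right) |n\rangle , \tag{3.14}$$

where $N = a^\dagger a$ is the number operator with $N|n\rangle = n|n\rangle$. The prefactor on the right-hand side of (3.13) is needed to guarantee that the states $|n\rangle$ are normalized to 1, i.e., $\langle n|n\rangle = 1$.

Quantization of Real Klein-Gordon Field

If we treat each Fourier mode of the field φ as an independent harmonic oscillator, we can apply canonical quantization to the real Klein-Gordon theory, and in this way find the spectrum of the corresponding Hamiltonian. In analogy to (3.9), we write φ and π as a linear sum of an infinite number of operators $a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}$, labelled by the 3-momentum p,

$$\begin{aligned}
\phi(\mathbf{x}) &= \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2\omega_{\mathbf{p}}}} \left[ a_{\mathbf{p}}\, e^{i \mathbf{p}\cdot\mathbf{x}} + a^\dagger_{\mathbf{p}}\, e^{-i \mathbf{p}\cdot\mathbf{x}} \right] , \\
\pi(\mathbf{x}) &= \int \frac{d^3p}{(2\pi)^3}\, (-i) \sqrt{\frac{\omega_{\mathbf{p}}}{2}} \left[ a_{\mathbf{p}}\, e^{i \mathbf{p}\cdot\mathbf{x}} - a^\dagger_{\mathbf{p}}\, e^{-i \mathbf{p}\cdot\mathbf{x}} \right] .
\end{aligned} \tag{3.15}$$

The commutation relations (3.2) become

$$[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = 0 , \qquad [a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = (2\pi)^3\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) . \tag{3.16}$$

Assuming that the latter equations hold, it follows that

$$\begin{aligned}
[\phi(\mathbf{x}), \pi(\mathbf{y})] &= \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{-i}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}} \left( -[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}]\, e^{i \mathbf{p}\cdot\mathbf{x} - i \mathbf{q}\cdot\mathbf{y}} + [a^\dagger_{\mathbf{p}}, a_{\mathbf{q}}]\, e^{-i \mathbf{p}\cdot\mathbf{x} + i \mathbf{q}\cdot\mathbf{y}} \right) \\
&= \int \frac{d^3p\, d^3q}{(2\pi)^6}\, \frac{-i}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}}\, (2\pi)^3\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \left( -e^{i \mathbf{p}\cdot\mathbf{x} - i \mathbf{q}\cdot\mathbf{y}} - e^{-i \mathbf{p}\cdot\mathbf{x} + i \mathbf{q}\cdot\mathbf{y}} \right) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \frac{-i}{2} \left( -e^{i \mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} - e^{-i \mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} \right) = i\, \delta^{(3)}(\mathbf{x} - \mathbf{y}) ,
\end{aligned} \tag{3.17}$$

where we have dropped terms $[a_{\mathbf{p}}, a_{\mathbf{q}}] = [a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = 0$ from the very beginning. To show that $[\phi(\mathbf{x}), \phi(\mathbf{y})] = [\pi(\mathbf{x}), \pi(\mathbf{y})] = 0$ is left as an exercise.

In terms of the ladder operators $a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}$ the Hamiltonian of the real Klein-Gordon theory takes the form

$$\begin{aligned}
H &= \frac{1}{2} \int d^3x \left[ \pi^2 + (\nabla\phi)^2 + m^2 \phi^2 \right] \\
&= \frac{1}{2} \int \frac{d^3x\, d^3p\, d^3q}{(2\pi)^6} \Bigg[ -\frac{\sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}}{2} \left( a_{\mathbf{p}}\, e^{i \mathbf{p}\cdot\mathbf{x}} - a^\dagger_{\mathbf{p}}\, e^{-i \mathbf{p}\cdot\mathbf{x}} \right) \left( a_{\mathbf{q}}\, e^{i \mathbf{q}\cdot\mathbf{x}} - a^\dagger_{\mathbf{q}}\, e^{-i \mathbf{q}\cdot\mathbf{x}} \right) \\
&\qquad + \frac{1}{2 \sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}} \left( i \mathbf{p}\, a_{\mathbf{p}}\, e^{i \mathbf{p}\cdot\mathbf{x}} - i \mathbf{p}\, a^\dagger_{\mathbf{p}}\, e^{-i \mathbf{p}\cdot\mathbf{x}} \right) \cdot \left( i \mathbf{q}\, a_{\mathbf{q}}\, e^{i \mathbf{q}\cdot\mathbf{x}} - i \mathbf{q}\, a^\dagger_{\mathbf{q}}\, e^{-i \mathbf{q}\cdot\mathbf{x}} \right) \\
&\qquad + \frac{m^2}{2 \sqrt{\omega_{\mathbf{p}} \omega_{\mathbf{q}}}} \left( a_{\mathbf{p}}\, e^{i \mathbf{p}\cdot\mathbf{x}} + a^\dagger_{\mathbf{p}}\, e^{-i \mathbf{p}\cdot\mathbf{x}} \right) \left( a_{\mathbf{q}}\, e^{i \mathbf{q}\cdot\mathbf{x}} + a^\dagger_{\mathbf{q}}\, e^{-i \mathbf{q}\cdot\mathbf{x}} \right) \Bigg] \\
&= \frac{1}{4} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\omega_{\mathbf{p}}} \left[ \left( -\omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 \right) \left( a_{\mathbf{p}} a_{-\mathbf{p}} + a^\dagger_{\mathbf{p}} a^\dagger_{-\mathbf{p}} \right) + \left( \omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 \right) \left( a_{\mathbf{p}} a^\dagger_{\mathbf{p}} + a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \right) \right] ,
\end{aligned} \tag{3.18}$$

where we have first used the expressions for φ and π given in (3.15) and then integrated over $d^3x$ to get delta functions $\delta^{(3)}(\mathbf{p} \pm \mathbf{q})$, which, in turn, allow us to perform the $d^3q$ integral. Inserting finally the expression (3.6) for the frequency, the first term in (3.18) vanishes and we are left with

$$\begin{aligned}
H &= \frac{1}{2} \int \frac{d^3p}{(2\pi)^3}\, \omega_{\mathbf{p}} \left( a_{\mathbf{p}} a^\dagger_{\mathbf{p}} + a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \right) = \int \frac{d^3p}{(2\pi)^3}\, \omega_{\mathbf{p}} \left( a^\dagger_{\mathbf{p}} a_{\mathbf{p}} + \frac{1}{2}\, [a_{\mathbf{p}}, a^\dagger_{\mathbf{p}}] \right) \\
&= \int \frac{d^3p}{(2\pi)^3}\, \omega_{\mathbf{p}} \left( a^\dagger_{\mathbf{p}} a_{\mathbf{p}} + \frac{1}{2}\, (2\pi)^3\, \delta^{(3)}(0) \right) .
\end{aligned} \tag{3.19}$$

We see that the result contains a delta function, evaluated at zero where it has an infinite spike. This contribution arises from the infinite sum over all modes vibrating with the zero-point energy $\omega_{\mathbf{p}}/2$. Moreover, the integral over $\omega_{\mathbf{p}}$ diverges at large momenta |p|. To better understand what is going on, let us have a look at the ground state $|0\rangle$, where the former infinity first becomes apparent.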
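The step from the last line of (3.18) to the first line of (3.19) can be spelled out as follows. This is only the intermediate algebra, using nothing beyond the dispersion relation (3.6):

```latex
% With \omega_p^2 = p^2 + m^2 from (3.6), the two brackets in (3.18) become
\begin{aligned}
-\omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 &= 0 , \\
 \omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2 &= 2\,\omega_{\mathbf{p}}^2 ,
\end{aligned}
% so that
H = \frac{1}{4} \int \frac{d^3p}{(2\pi)^3}\, \frac{2\,\omega_{\mathbf{p}}^2}{\omega_{\mathbf{p}}}
    \left( a_{\mathbf{p}} a^\dagger_{\mathbf{p}} + a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \right)
  = \frac{1}{2} \int \frac{d^3p}{(2\pi)^3}\, \omega_{\mathbf{p}}
    \left( a_{\mathbf{p}} a^\dagger_{\mathbf{p}} + a^\dagger_{\mathbf{p}} a_{\mathbf{p}} \right) .
```

The terms that create or destroy pairs of quanta with opposite momenta therefore drop out, and only number-conserving combinations survive.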

3.2 Structure of Vacuum

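As in the case of the harmonic oscillator in QM, we define the vacuum $|0\rangle$ through the condition that it is annihilated by the action of all $a_{\mathbf{p}}$,

$$a_{\mathbf{p}} |0\rangle = 0 , \qquad \forall\, \mathbf{p} . \tag{3.20}$$

With this definition the energy $E_0$ of the vacuum comes entirely from the second term in the last line of (3.19),

$$H |0\rangle = E_0 |0\rangle = \left[ \int \frac{d^3p}{(2\pi)^3}\, \frac{\omega_{\mathbf{p}}}{2}\, (2\pi)^3\, \delta^{(3)}(0) \right] |0\rangle = \infty\, |0\rangle . \tag{3.21}$$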

In fact, the latter expression contains not only one but two infinities. The first arises because space is infinitely large. Infinities of this kind are often referred to as infrared (IR) divergences. To isolate this infinity, we put the theory into a box with sides of length L and impose periodic boundary conditions (BCs) on the field. Then, taking the limit L → ∞, we get

$$(2\pi)^3\, \delta^{(3)}(0) = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x\; e^{i \mathbf{p}\cdot\mathbf{x}} \Big|_{\mathbf{p} = 0} = \lim_{L \to \infty} \int_{-L/2}^{L/2} d^3x = V , \tag{3.22}$$

where V denotes the volume of the box. This result tells us that the delta function singularity arises because we try to compute the total energy $E_0$ of the system rather than its energy density $\mathcal{E}_0$. The energy density is simply calculated from $E_0$ by dividing by the volume V. One finds

$$\mathcal{E}_0 = \frac{E_0}{V} = \int \frac{d^3p}{(2\pi)^3}\, \frac{\omega_{\mathbf{p}}}{2} , \tag{3.23}$$

which is still divergent and resembles the sum of zero-point energies for each harmonic oscillator. Since the integrand grows without bound in the limit |p| → ∞, i.e., at high frequencies (or short distances), this singularity is an ultraviolet (UV) divergence. This divergence arises because we want too much. We have assumed that our theory is valid to arbitrarily short distance scales, corresponding to arbitrarily high energies. Recalling the discussion of energy scales in Section 1.2, this assumption is clearly absurd. The integral should be cut off at high momentum, reflecting the fact that our theory presumably breaks down at some point (most likely far below the GUT or Planck scale).

Fortunately, the infinite energy shift in (3.19) is harmless if we want to measure the energy difference of the energy eigenstates from the vacuum. We can therefore “recalibrate” our energy levels (by an infinite constant), removing from the Hamiltonian operator the energy of the vacuum,

$$: H : \; = H - E_0 = H - \langle 0 | H | 0 \rangle . \tag{3.24}$$

With this definition one has $: H : |0\rangle = 0$. In fact, the difference between the latter Hamiltonian and the previous one is merely an ordering ambiguity in moving from the classical to the quantum theory. E.g., if we had defined our Hamiltonian to take the form

$$H = \frac{1}{2}\, (\omega q - i p)(\omega q + i p) , \tag{3.25}$$

which is classically the same as our original definition (3.7), then after quantization instead of (3.10), we would have gotten

$$H = \omega\, a^\dagger a . \tag{3.26}$$

This type of ordering ambiguity arises often in field theories. The method that we have used above to deal with it is called normal ordering. In practice, normal ordering works by placing all annihilation operators $a_{\mathbf{p}}$ in products of field operators to the right. Applied to the Hamiltonian of the real Klein-Gordon theory this prescription leads to

$$: H : \; = \int \frac{d^3p}{(2\pi)^3}\, \omega_{\mathbf{p}}\, a^\dagger_{\mathbf{p}} a_{\mathbf{p}} . \tag{3.27}$$

In the remainder of this section, we will normal order all operators in this manner (dropping the “: :” for simplicity).

Cosmological Constant

Above we concluded that as long as we are interested in the differences between energy levels, the infinite total energy $E_0$ of the vacuum does not matter (which effectively means that $E_0$ has no effect on particle physics phenomenology). So is the value of $E_0$ unobservable then? No, in fact, not at all, since gravity is supposed to see all energy densities. In particular, the sum of all the zero-point energies should contribute to Einstein’s equations,

$$R_{\mu\nu} - \frac{R}{2}\, g_{\mu\nu} + \Lambda\, g_{\mu\nu} = 8\pi G_N\, T_{\mu\nu} , \tag{3.28}$$

in the form of a cosmological constant $\Lambda = E_0/V$. Here $R_{\mu\nu}$ is the Ricci curvature tensor, R the scalar curvature (for their definitions please consult a text on general relativity), $g_{\mu\nu}$ is the metric tensor (not to be mixed up with the Minkowski metric $\eta_{\mu\nu}$), $G_N$ denotes Newton’s constant, which we have already met in (1.8), and $T_{\mu\nu}$ is the energy-momentum tensor in its symmetric form. Unfortunately, I do not have time to explain (3.28) in detail. If you want to learn more about Einstein’s equations, I suggest that you attend Andrew Steane’s course on general relativity. In order to be able to follow this lecture, it is sufficient to know that these equations contain a term proportional to $E_0/V$.

An assortment of observations (cosmic microwave background, type-Ia supernovae, baryon acoustic oscillations, etc.) tells us that 74% of the energy density in the universe has the properties of a cosmological constant. This constant energy density filling space homogeneously is one form of dark energy. Another possible form of dark energy would be a scalar field such as quintessence, a dynamic quantity whose energy density can vary in space. The rest of the composition of today’s cosmos is made up of dark matter, amounting to 22%, and visible matter (atoms, etc.), giving the missing 4%. Dark matter is dark in the sense that it is inferred to exist from gravitational effects on visible matter and background radiation, but is undetectable by emitted or scattered electromagnetic radiation. So in conclusion, fully 96% of the universe seems to be composed of stuff we’ve never seen directly on earth.

But our lack of understanding does not end there. In the last subsection, we have argued that integrating in (3.23) up to infinity is not the right thing to do, but that one should only consider modes up to a certain UV cut-off $\Lambda_{\rm UV}$, where one stops trusting the underlying theory. The resulting energy density $\mathcal{E}_0$ then scales like $\Lambda_{\rm UV}^4$.
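The quartic scaling can be made explicit with a short estimate. The sketch below assumes a sharp momentum cut-off $|\mathbf{p}| < \Lambda_{\rm UV}$ and neglects the mass, $m \ll \Lambda_{\rm UV}$; both are simplifying assumptions made for this illustration.

```latex
% Energy density (3.23) with a sharp cut-off |p| < \Lambda_{UV} and m -> 0,
% so that \omega_p \approx |p|:
\mathcal{E}_0(\Lambda_{\rm UV})
  = \int_{|\mathbf{p}| < \Lambda_{\rm UV}} \frac{d^3p}{(2\pi)^3}\, \frac{\omega_{\mathbf{p}}}{2}
  \approx \frac{4\pi}{(2\pi)^3}\, \frac{1}{2} \int_0^{\Lambda_{\rm UV}} dp\; p^3
  = \frac{\Lambda_{\rm UV}^4}{16\pi^2} .
```

The numerical prefactor $1/(16\pi^2)$ depends on how the cut-off is implemented, whereas the power $\Lambda_{\rm UV}^4$ follows from dimensional analysis alone.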
While it is not clear which precise value we should take for $\Lambda_{\rm UV}$, let us not be very ambitious and take a value for this scale up to which we truly believe that we understand the physics of fundamental interactions. The electron mass $m_e = 511$ keV could be such a choice. In consequence,

$$\mathcal{E}_0^{\rm predicted} \approx (511\ {\rm keV})^4 \approx 7 \cdot 10^{22}\ {\rm eV}^4 , \tag{3.29}$$

where the superscript “predicted” should probably better read “guessed”. Glancing at Table 1, we see that the observed value of $\mathcal{E}_0$ is

$$\mathcal{E}_0^{\rm observed} \approx (10^{-3}\ {\rm eV})^4 \approx 10^{-12}\ {\rm eV}^4 , \tag{3.30}$$

so it is clearly non-zero but unfortunately also roughly 35 orders of magnitude smaller than our prediction. Notice that the choice $\Lambda_{\rm UV} = m_e$ that led to (3.29) was, in fact, a conservative one, because other educated guesses such as $\Lambda_{\rm UV} = v$, $M_{\rm GUT}$, etc. would have led to a much bigger disagreement, of up to 120 orders of magnitude for the choice $\Lambda_{\rm UV} = M_P$.

From the point of view of QFT, the net cosmological constant is the sum of a number of apparently disparate contributions, including zero-point fluctuations of each field theory dof and potential energies from scalar fields, as well as a bare cosmological constant. There is no obstacle to imagining that all of the large and apparently unrelated contributions add together, with different signs, to produce a net cosmological constant consistent with the limit (3.30), other than the fact that it seems ridiculous. We know of no special symmetry which could enforce a vanishing vacuum energy while remaining consistent with the known laws of physics. This conundrum is the cosmological constant problem. While no widely accepted solution to this problem exists, there are many proposed ones, ranging from the anthropic principle to the string-theory landscape. Don’t worry if you have never even heard of any of them; it is not important at all for what follows.

Casimir Effect

Using the normal ordering prescription we can happily set $E_0 = 0$, while chanting the mantra that only energy differences can be measured. However, it should be possible to see that the vacuum energy is different if, for some reason, the fields vanish in some region of space or if some frequencies $\omega_{\mathbf{p}}$ do not contribute to the vacuum energy. Such a set-up can be realized by forcing the real Klein-Gordon field φ to satisfy appropriate BCs. Let us assume that φ vanishes on the planes with x = 0 and x = L,

$$\phi(0, y, z) = \phi(L, y, z) = 0 . \tag{3.31}$$

The presence of these BCs affects the Fourier decomposition of the field and, in particular, leads to a quantization of the momentum of the field between the planes ($k \in \mathbb{Z}^+$),

$$\mathbf{p} = \left( \frac{k\pi}{L},\, p_y,\, p_z \right) . \tag{3.32}$$

For simplicity let us consider a massless real scalar field. In this case the ground-state energy per unit area S between the planes is given by the following expression

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int \frac{d^2p_\perp}{(2\pi)^2}\, \frac{1}{2} \sqrt{\left( \frac{k\pi}{L} \right)^2 + p_\perp^2} . \tag{3.33}$$

Notice that we only integrate over the perpendicular directions $p_\perp = (p_y, p_z)$, since the momentum $p_x$ is discretized. Consequently, the volume integral has to be replaced by a surface integral over the planes. In analogy to (3.22), this gives a factor $S/(2\pi)^2$ instead of $V/(2\pi)^3$. Let us see if we are able to calculate (3.33). We first switch to polar coordinates,

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp\, p_\perp}{2\pi}\, \frac{1}{2} \sqrt{\left( \frac{k\pi}{L} \right)^2 + p_\perp^2} . \tag{3.34}$$

As it stands this integral is divergent in the limit $p_\perp \to \infty$. We can regulate this singularity in a number of different ways. One way to do it is to introduce a UV cut-off $a \ll L$, so that modes of momentum much bigger than $a^{-1}$ are removed. E.g., multiplying the integrand in (3.34) by the factor $\exp\left[ -a \left( (k\pi/L)^2 + p_\perp^2 \right)^{1/2} \right]$ would do the job, since the resulting expression has the property that as a → 0, one regains the full, infinite result (3.33). The drawback of this method is that the new integral is quite difficult to perform (though doable), so let’s see if we can find an easier way.

The trick is to consider (3.33) not in d = 4 dimensions, but to work in fewer dimensions, say, $d = 4 - 2\epsilon$ with $\epsilon > 0$. While this looks very weird at first sight, let me mention that in general there exists a value of ε for which the integral is well-defined. We shall perform our calculation for such a value, and then try to analytically continue the result to ε = 0. In $d = 4 - 2\epsilon$ dimensions the integral (3.34) takes the form

$$\frac{E_0(L)}{S} = \sum_{k=1}^{\infty} \int_0^\infty \frac{dp_\perp\, p_\perp^{1 - 2\epsilon}}{2\pi}\, \frac{1}{2} \sqrt{\left( \frac{k\pi}{L} \right)^2 + p_\perp^2} . \tag{3.35}$$

To evaluate this expression, we first change variables $p_\perp \to (k\pi/L)\, l_\perp$. We then obtain

$$\begin{aligned}
\frac{E_0(L)}{S} &= \frac{1}{4\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \left( \sum_{k=1}^{\infty} k^{3 - 2\epsilon} \right) \int_0^\infty dl_\perp\, l_\perp^{1 - 2\epsilon} \sqrt{1 + l_\perp^2} \\
&= \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3) \int_0^\infty dl_\perp^2\, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2} ,
\end{aligned} \tag{3.36}$$

where we have identified the infinite sum with a Riemann zeta function, employing

$$\sum_{k=1}^{\infty} \frac{1}{k^a} = \zeta(a) . \tag{3.37}$$

Performing now the change of variables $l_\perp^2 \to x/(1 - x)$, we arrive at

$$\int_0^\infty dl_\perp^2\, (l_\perp^2)^{-\epsilon} \sqrt{1 + l_\perp^2} = \int_0^1 dx\; x^{-\epsilon} (1 - x)^{\epsilon - 5/2} = B(1 - \epsilon, \epsilon - 3/2) , \tag{3.38}$$

where in the last step we have used the definition of the Euler beta function,

$$B(a, b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a + b)} = \int_0^1 dx\; x^{a - 1} (1 - x)^{b - 1} . \tag{3.39}$$

Putting everything together, the final result in $d = 4 - 2\epsilon$ dimensions reads

$$\frac{E_0(L)}{S} = \frac{1}{8\pi} \left( \frac{\pi}{L} \right)^{3 - 2\epsilon} \zeta(2\epsilon - 3)\, B(1 - \epsilon, \epsilon - 3/2) . \tag{3.40}$$

Amazingly, we can even take the limit ε → 0.¹⁰ Using Γ(a + 1) = a Γ(a) with Γ(1) = 1 and $\Gamma(1/2) = \sqrt{\pi}$, and recalling that ζ(−3) = 1/120, we arrive at the finite expression

$$\frac{E_0(L)}{S} = -\frac{\pi^2}{1440\, L^3} . \tag{3.41}$$

¹⁰ Many subtleties have been swept under the carpet in this calculation. E.g., the dimensions of the expressions in (3.35) to (3.40) are wrong by −2ε. All cheats will become clear when the method of dimensional regularization is properly introduced.

29

This result implies that the vacuum energy depends on the distance between the two planes on which φ vanishes. Can we realize this in an experiment? Remember that the electromagnetic field is zero inside a conductor. If we place two uncharged conducting plates parallel to each other at a distance $L$, then we can reproduce the BCs of the set-up that we have just studied. While the quantization of the electromagnetic field is more complicated than that of the real Klein-Gordon field, which we have used to model the effect, this difference becomes (almost) immaterial as far as the vacuum energy is concerned. Our analysis leads to an amazing prediction: two electrically neutral metal plates attract each other. This is known as the Casimir-Polder force, first predicted in 1948 [4]. Notice that the energy of the vacuum gets smaller when the conducting plates are closer, as indicated by the minus sign in (3.41). Therefore, there is an attractive force between them. This is an effect that has by now been verified experimentally with great precision.¹¹ In our example, the force per unit area (pressure or rather anti-pressure) between the two conductor plates is given by

$$F = -\frac{1}{S} \frac{\partial E_0(L)}{\partial L} = -\frac{\pi^2}{480 L^4} \,. \tag{3.42}$$

In fact, the true Casimir-Polder force is twice as large as the latter result, due to the two polarization states of the photon.
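The arithmetic behind (3.41) and (3.42) is easy to check numerically. The following is a minimal sketch, not part of the original derivation: it evaluates (3.40) at ε = 0, obtaining ζ(−3) from the standard identity ζ(−n) = −B_{n+1}/(n+1) with Bernoulli numbers computed exactly. The function names are illustrative choices.

```python
import math
from fractions import Fraction


def bernoulli_numbers(n_max):
    """Return the Bernoulli numbers B_0..B_n_max (convention B_1 = -1/2)."""
    b = [Fraction(0)] * (n_max + 1)
    b[0] = Fraction(1)
    for m in range(1, n_max + 1):
        # Recursion: sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1.
        acc = sum(Fraction(math.comb(m + 1, k)) * b[k] for k in range(m))
        b[m] = -acc / (m + 1)
    return b


def zeta_at_negative_integer(n):
    """zeta(-n) = -B_{n+1} / (n + 1) for integer n >= 1."""
    return -bernoulli_numbers(n + 1)[n + 1] / (n + 1)


def euler_beta(a, b):
    """Euler beta function via Gamma functions, as in (3.39)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def casimir_energy_coefficient():
    """Coefficient c in E0/S = c * pi^2 / L^3, from (3.40) at epsilon = 0.

    The prefactor (1/(8 pi)) * pi^3 equals pi^2 / 8, so c = zeta(-3) * B(1, -3/2) / 8.
    """
    zeta_m3 = float(zeta_at_negative_integer(3))
    return zeta_m3 * euler_beta(1.0, -1.5) / 8.0


def casimir_force_coefficient():
    """Coefficient f in F = f * pi^2 / L^4, using F = -d(E0/S)/dL = 3 c pi^2 / L^4."""
    return 3.0 * casimir_energy_coefficient()


if __name__ == "__main__":
    print(zeta_at_negative_integer(3))      # exact value of zeta(-3)
    print(casimir_energy_coefficient())     # compare with -1/1440
    print(casimir_force_coefficient())      # compare with -1/480
```

The Gamma functions at the negative half-integers −3/2 and −1/2 are finite, which is why the limit ε → 0 can be taken directly in this particular case.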

3.3

Particle States

After the discussion of the properties of the vacuum, we can now turn to the excitations of φ. It's easy to verify (and therefore left as an exercise) that, in full analogy to (3.11), the Hamiltonian and the ladder operators of the real Klein-Gordon theory obey the following commutation relations

$$[H, a_\mathbf{p}] = -\omega_\mathbf{p} \, a_\mathbf{p} \,, \qquad [H, a_\mathbf{p}^\dagger] = \omega_\mathbf{p} \, a_\mathbf{p}^\dagger \,. \tag{3.43}$$

These relations imply that we can construct energy eigenstates by acting on the vacuum state $|0\rangle$ with $a_\mathbf{p}^\dagger$ (remember that they also imply that $a_\mathbf{p} |0\rangle = 0$ for all $\mathbf{p}$). We define

$$|\mathbf{p}\rangle = a_\mathbf{p}^\dagger |0\rangle \,. \tag{3.44}$$

This state has energy

$$H |\mathbf{p}\rangle = E_\mathbf{p} |\mathbf{p}\rangle = \omega_\mathbf{p} |\mathbf{p}\rangle \,, \tag{3.45}$$

with $\omega_\mathbf{p}$ given in (3.6), which is nothing but the relativistic energy of a particle with 3-momentum $\mathbf{p}$ and mass $m$. We thus interpret the state $|\mathbf{p}\rangle$ as the momentum eigenstate of a single scalar particle of mass $m$. Let us check this interpretation by studying the other quantum numbers of $|\mathbf{p}\rangle$. We begin with the total momentum $\mathbf{P}$ introduced in (2.24). Turning this expression into an operator, we arrive, after normal ordering, at

$$\mathbf{P} = -\int d^3x \, \pi \nabla \phi = \int \frac{d^3p}{(2\pi)^3} \, \mathbf{p} \, a_\mathbf{p}^\dagger a_\mathbf{p} \,. \tag{3.46}$$

¹¹ The first experimental test of the Casimir-Polder force was conducted by Marcus Sparnaay in 1958, in a delicate and difficult experiment with parallel plates. Due to the large experimental errors, his results could neither prove the theoretical prediction right nor wrong.

Acting with $\mathbf{P}$ on our state $|\mathbf{p}\rangle$ gives

$$\mathbf{P} |\mathbf{p}\rangle = \int \frac{d^3q}{(2\pi)^3} \, \mathbf{q} \, a_\mathbf{q}^\dagger a_\mathbf{q} a_\mathbf{p}^\dagger |0\rangle = \int \frac{d^3q}{(2\pi)^3} \, \mathbf{q} \, a_\mathbf{q}^\dagger \left[ (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) + a_\mathbf{p}^\dagger a_\mathbf{q} \right] |0\rangle = \mathbf{p} \, |\mathbf{p}\rangle \,, \tag{3.47}$$

where we have employed the second line in (3.16) and used the fact that an annihilation operator acting on the vacuum is zero. The latter result tells us that the state $|\mathbf{p}\rangle$ has momentum $\mathbf{p}$. Another property of $|\mathbf{p}\rangle$ that we can study is its angular momentum. Again we take the classical expression for the total angular momentum (2.67) and turn it into an operator,

$$J^i = \epsilon^{ijk} \int d^3x \, (J^0)^{jk} \,. \tag{3.48}$$

It is a good exercise to show that by acting with $J^i$ on the one-particle state with zero momentum one gets

$$J^i \, |\mathbf{p} = 0\rangle = 0 \,. \tag{3.49}$$

This result tells us that the particle carries no internal angular momentum. In other words, quantizing the real Klein-Gordon field gives rise to a spin-zero particle, aka a scalar.

Multiparticle States

Acting multiple times with the creation operators on the vacuum we can create multiparticle states. We interpret the state

$$|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = a_{\mathbf{p}_1}^\dagger \cdots a_{\mathbf{p}_n}^\dagger |0\rangle \,, \tag{3.50}$$

as an $n$-particle state. Since one has $[a_{\mathbf{p}_i}^\dagger, a_{\mathbf{p}_j}^\dagger] = 0$, the state (3.50) is symmetric under exchange of any two particles. E.g.,

$$|\mathbf{p}, \mathbf{q}\rangle = a_\mathbf{p}^\dagger a_\mathbf{q}^\dagger |0\rangle = a_\mathbf{q}^\dagger a_\mathbf{p}^\dagger |0\rangle = |\mathbf{q}, \mathbf{p}\rangle \,. \tag{3.51}$$
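The exchange symmetry (3.51) can be made concrete in a toy model. The sketch below is an illustration under simplifying assumptions and is not taken from the lecture: momenta are replaced by discrete mode labels (as for a field in a finite box), so that states are normalizable, and a state is stored as a dictionary mapping a sorted tuple of occupied modes to an amplitude. The names `create`, `number` and `VACUUM` are hypothetical.

```python
import math

# The vacuum: no occupied modes, amplitude 1.
VACUUM = {(): 1.0}


def create(state, mode):
    """Apply the creation operator of a discrete mode to a Fock state.

    Uses a^dagger |n> = sqrt(n + 1) |n + 1> for the occupation n of that mode.
    Basis states are sorted tuples of mode labels, so the order in which
    particles were created is not recorded (bosonic symmetry).
    """
    result = {}
    for occupied, amplitude in state.items():
        n = occupied.count(mode)
        new_occupied = tuple(sorted(occupied + (mode,)))
        result[new_occupied] = result.get(new_occupied, 0.0) + math.sqrt(n + 1) * amplitude
    return result


def number(state):
    """Apply the number operator: each basis state is multiplied by its particle count."""
    return {occupied: len(occupied) * amplitude for occupied, amplitude in state.items()}


if __name__ == "__main__":
    pq = create(create(VACUUM, "q"), "p")   # a_p^dagger a_q^dagger |0>
    qp = create(create(VACUUM, "p"), "q")   # a_q^dagger a_p^dagger |0>
    print(pq == qp)                          # the two orderings give the same state
    print(number(pq))                        # two-particle state: eigenvalue 2
```

In this representation the symmetry is built into the data structure, which mirrors the statement that it follows from the vanishing commutator of two creation operators.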

The exchange symmetry (3.51) means that the particles corresponding to the real Klein-Gordon theory are bosons. We see that, as promised already in Section 1.1, the relationship between spin and statistics is, in fact, a consequence of the QFT framework, following, in the case at hand, from the commutator quantization conditions for boson fields (3.2). The full Hilbert space of our theory is spanned by acting on the vacuum with all possible combinations of creation operators,

$$|0\rangle \,, \qquad a_\mathbf{p}^\dagger |0\rangle \,, \qquad a_\mathbf{p}^\dagger a_\mathbf{q}^\dagger |0\rangle \,, \qquad a_\mathbf{p}^\dagger a_\mathbf{q}^\dagger a_\mathbf{r}^\dagger |0\rangle \,, \qquad \ldots \,. \tag{3.52}$$

This space is known as the Fock space and is simply the sum of the $n$-particle Hilbert spaces, for all $n \geq 0$. Like in QM, there is also an operator which counts the number of particles in a given state in the Fock space. It is the number operator

$$N = \int \frac{d^3p}{(2\pi)^3} \, a_\mathbf{p}^\dagger a_\mathbf{p} \,, \tag{3.53}$$

which satisfies $N |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = n \, |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$. Notice that the number operator commutes with the Hamiltonian, i.e., $[N, H] = 0$, ensuring that particle number is conserved. This means that we can place ourselves in the $n$-particle sector, and will remain there. This is a property of free theories, but will no longer be true when we consider interactions. Interactions create and destroy particles, taking us between the different sectors in the Fock space.

Operator-Valued Distributions

We have referred to the states $|\mathbf{p}\rangle$ as "particles". Yet, this name is somewhat misleading, since these states are momentum eigenstates and therefore not localized in space. Recall that in QM both the position and momentum eigenstates are not good elements of the Hilbert space since they are not normalizable (they normalize to delta functions). Similarly, in QFT neither the operators $\phi(\mathbf{x})$, nor $a_\mathbf{p}$ and $a_\mathbf{p}^\dagger$ are good operators acting on the Fock space. This is because these operators all produce states that are not normalizable:

$$\langle 0 | \phi(\mathbf{x}) \phi(\mathbf{x}) | 0 \rangle = \delta^{(3)}(0) \,, \qquad \langle 0 | a_\mathbf{p} a_\mathbf{p}^\dagger | 0 \rangle = (2\pi)^3 \delta^{(3)}(0) \,. \tag{3.54}$$

This feature implies that they are operator-valued distributions and not functions. In the case of $\phi(\mathbf{x})$ one has that although the field operator has a well-defined vacuum expectation value (VEV), $\langle 0 | \phi(\mathbf{x}) | 0 \rangle = 0$, the fluctuations $\langle 0 | \phi(\mathbf{x}) \phi(\mathbf{x}) | 0 \rangle$ of the operator at a fixed point are infinite. We can construct well-defined operators by smearing these distributions over space. E.g., we can create a wavepacket

$$|\psi\rangle = \int \frac{d^3p}{(2\pi)^3} \, e^{-i \mathbf{p} \cdot \mathbf{x}} \, \psi(\mathbf{p}) \, |\mathbf{p}\rangle \,, \tag{3.55}$$

which is partially localized in both position and momentum space. A typical state might be described by the Gaussian $\psi(\mathbf{p}) = \exp \left[ -\mathbf{p}^2/(2m^2) \right]$.

Relativistic Normalization

The vacuum $|0\rangle$ is normalized as $\langle 0 | 0 \rangle = 1$. The one-particle states $|\mathbf{p}\rangle = a_\mathbf{p}^\dagger |0\rangle$ then satisfy

$$\langle \mathbf{p} | \mathbf{q} \rangle = \langle 0 | a_\mathbf{p} a_\mathbf{q}^\dagger | 0 \rangle = \langle 0 | \left[ (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) + a_\mathbf{q}^\dagger a_\mathbf{p} \right] | 0 \rangle = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,, \tag{3.56}$$

where we have made use of (3.16) and (3.20) to arrive at the final answer. Since the latter expression depends on 3-momenta, an immediate question that arises is whether it is Lorentz invariant. What could go wrong? Suppose we perform a Lorentz transformation

$$p^\mu \to (p')^\mu = \Lambda^\mu{}_\nu \, p^\nu \,, \tag{3.57}$$

such that $\mathbf{p} \to \mathbf{p}'$. In our QFT it would be preferable if the state $|\mathbf{p}\rangle$ changed under this Lorentz transformation as

$$|\mathbf{p}\rangle \to |\mathbf{p}'\rangle = U(\Lambda) \, |\mathbf{p}\rangle \,, \tag{3.58}$$

with $U(\Lambda)$ being unitary, i.e., $U^\dagger(\Lambda) U(\Lambda) = U(\Lambda) U^\dagger(\Lambda) = 1$. In such a case the normalization of $|\mathbf{p}\rangle$ would remain unchanged

$$\langle \mathbf{p} | \mathbf{p} \rangle \to \langle \mathbf{p}' | \mathbf{p}' \rangle = \langle \mathbf{p} | U^\dagger(\Lambda) U(\Lambda) | \mathbf{p} \rangle = \langle \mathbf{p} | \mathbf{p} \rangle \,. \tag{3.59}$$

In order to find out whether or not the original and the Lorentz-transformed state, $|\mathbf{p}\rangle$ and $|\mathbf{p}'\rangle$, are related by a unitary transformation, we should look at an object which we know is Lorentz invariant. One such object is the identity operator (which is really the projection operator onto one-particle states). With the normalization (3.56) we know that it is given by

$$1 = \int \frac{d^3p}{(2\pi)^3} \, |\mathbf{p}\rangle \langle \mathbf{p}| \,. \tag{3.60}$$

This operator is Lorentz invariant, but it consists of two terms: the measure $\int d^3p$ and the projector $|\mathbf{p}\rangle \langle \mathbf{p}|$. Are these two objects Lorentz invariant by themselves? In fact, they are not. In order to prove this statement, we start with the measure $\int d^4p$, which is obviously Lorentz invariant. The relativistic dispersion relation for a massive particle, i.e., $p^2 = m^2$, and hence $p_0^2 = E_\mathbf{p}^2 = \mathbf{p}^2 + m^2$, is also Lorentz invariant. Solving for $p_0$, there are two branches of solutions, namely $p_0 = \pm E_\mathbf{p}$. But the choice of branch is another Lorentz-invariant concept. Putting everything together tells us that

$$\int d^4p \, \delta(p_0^2 - \mathbf{p}^2 - m^2) \Big|_{p_0 > 0} = \int \frac{d^3p}{2 p_0} \bigg|_{p_0 = E_\mathbf{p}} = \int \frac{d^3p}{2 E_\mathbf{p}} \,, \tag{3.61}$$

is Lorentz invariant. From the latter result we can figure out everything else. E.g., the Lorentz-invariant delta function for 3-momenta is

$$2 E_\mathbf{p} \, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,, \tag{3.62}$$

since

$$\int \frac{d^3p}{2 E_\mathbf{p}} \, 2 E_\mathbf{p} \, \delta^{(3)}(\mathbf{p} - \mathbf{q}) = 1 \,. \tag{3.63}$$

This finally tells us that the relativistically normalized momentum eigenstates are given by¹²

$$|p\rangle = \sqrt{2 E_\mathbf{p}} \, |\mathbf{p}\rangle = \sqrt{2 E_\mathbf{p}} \, a_\mathbf{p}^\dagger |0\rangle \,, \tag{3.64}$$

and satisfy

$$\langle p | q \rangle = (2\pi)^3 \, 2 E_\mathbf{p} \, \delta^{(3)}(\mathbf{p} - \mathbf{q}) \,. \tag{3.65}$$

We can also express the identity operator in terms of the $|p\rangle$ states. One has

$$1 = \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2 E_\mathbf{p}} \, |p\rangle \langle p| \,. \tag{3.66}$$

We remark that some texts on QFT also define relativistically normalized annihilation (creation) operators by $a(p) = \sqrt{2 E_\mathbf{p}} \, a_\mathbf{p}$ ($a^\dagger(p) = \sqrt{2 E_\mathbf{p}} \, a_\mathbf{p}^\dagger$). In order to avoid (further) confusion, we won't make use of this notation here.

¹² Our notation is rather subtle here, since the relativistically normalized momentum states $|p\rangle$ differ from $|\mathbf{p}\rangle$ just by the fact that they are not set in boldface type.
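The Lorentz invariance of the measure d³p/(2E_p) in (3.61) can also be checked directly. The following sketch is an illustrative numerical test, not the argument of the text: it boosts an on-shell momentum along the x-axis and estimates the Jacobian of the map by a finite difference. Since p_y and p_z are unchanged by such a boost, the Jacobian reduces to ∂p'_x/∂p_x, and invariance of d³p/E requires this to equal E'/E. All function names and the step size are assumptions made for the example.

```python
import math


def energy(px, py, pz, mass):
    """On-shell energy E = sqrt(p^2 + m^2)."""
    return math.sqrt(px * px + py * py + pz * pz + mass * mass)


def boost_x(px, py, pz, mass, beta):
    """Boost an on-shell 3-momentum along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e = energy(px, py, pz, mass)
    return gamma * (px + beta * e), py, pz


def jacobian(px, py, pz, mass, beta, step=1e-6):
    """Central finite difference for d p'_x / d p_x at fixed p_y, p_z.

    Because p'_y = p_y and p'_z = p_z, this is the full Jacobian determinant
    of the map p -> p'.
    """
    upper = boost_x(px + step, py, pz, mass, beta)[0]
    lower = boost_x(px - step, py, pz, mass, beta)[0]
    return (upper - lower) / (2.0 * step)


def measure_ratio(px, py, pz, mass, beta):
    """Ratio (d^3p'/E') / (d^3p/E); equals 1 if d^3p/(2E) is boost invariant."""
    e = energy(px, py, pz, mass)
    e_prime = energy(*boost_x(px, py, pz, mass, beta), mass)
    return jacobian(px, py, pz, mass, beta) * e / e_prime


if __name__ == "__main__":
    print(measure_ratio(0.3, -1.2, 0.7, 1.0, 0.6))   # close to 1
    print(jacobian(0.3, -1.2, 0.7, 1.0, 0.6))        # not 1: d^3p alone is not invariant
```

The second printed number shows that d³p by itself does change under the boost, and that it is the factor 1/E that compensates.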


3.4

Two Real Klein-Gordon Fields

Our task is to describe all known particles and their interactions. It is then interesting to study the quantization of a system with more than one field. In order to keep things simple, let us try to describe a system of two real Klein-Gordon fields $\phi_{1,2}$ which differ only in their mass parameters ($m_1 \neq m_2$),

$$\mathcal{L} = \sum_{i=1,2} \left[ \frac{1}{2} (\partial_\mu \phi_i)^2 - \frac{1}{2} m_i^2 \phi_i^2 \right] . \tag{3.67}$$

This Lagrangian leads to two independent Klein-Gordon equations,

$$(\partial^2 + m_i^2) \, \phi_i = 0 \,. \tag{3.68}$$

The Hamiltonian, the total momentum, and the number operator of the system are given by

$$H = H_1 + H_2 \,, \qquad \mathbf{P} = \mathbf{P}_1 + \mathbf{P}_2 \,, \qquad N = N_1 + N_2 \,, \tag{3.69}$$

where

$$H_i = \int \frac{d^3p}{(2\pi)^3} \, \omega_{i,\mathbf{p}} \, a_{i,\mathbf{p}}^\dagger a_{i,\mathbf{p}} \,, \qquad \mathbf{P}_i = \int \frac{d^3p}{(2\pi)^3} \, \mathbf{p} \, a_{i,\mathbf{p}}^\dagger a_{i,\mathbf{p}} \,, \qquad N_i = \int \frac{d^3p}{(2\pi)^3} \, a_{i,\mathbf{p}}^\dagger a_{i,\mathbf{p}} \,, \tag{3.70}$$

with $\omega_{i,\mathbf{p}} = (\mathbf{p}^2 + m_i^2)^{1/2}$. It should be clear that we can construct particle states in the same fashion as we did with the Lagrangian of just a single real Klein-Gordon field. Products of $a_{1,\mathbf{p}}^\dagger$ operators acting on $|0\rangle$ create relativistic particles with mass $m_1$, while $a_{2,\mathbf{p}}^\dagger$ operators create particles with mass $m_2$. E.g., the states

$$|S_1\rangle = a_{1,\mathbf{p}}^\dagger |0\rangle \,, \qquad |S_2\rangle = a_{2,\mathbf{p}}^\dagger |0\rangle \,, \tag{3.71}$$

satisfy

$$H |S_i\rangle = \omega_{i,\mathbf{p}} |S_i\rangle \,, \qquad \mathbf{P} |S_i\rangle = \mathbf{p} \, |S_i\rangle \,, \qquad N |S_i\rangle = 1 \, |S_i\rangle \,. \tag{3.72}$$

These relations tell us that the states $|S_{1,2}\rangle$ are degenerate in the sense that they are single-particle states with the same momentum $\mathbf{p}$. However, they can be distinguished by measuring the energy of the particles as long as the masses $m_{1,2}$ are different (which we have assumed for the time being).

Equal-Mass Case

Admittedly the case of two real Klein-Gordon fields with different masses $m_{1,2}$ is pretty boring. Things get a little bit more interesting if we consider the special case $m_1 = m_2 = m$. Why? Because in this case the system possesses an additional rotation symmetry in the space of fields $\phi_{1,2}$. According to Noether's theorem this should lead to a new conserved charge. In order to be able to identify the additional charge, we first write the Lagrangian (3.67) in a form that exhibits the symmetry

$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi^T)(\partial^\mu \phi) - \frac{1}{2} m^2 \phi^T \phi \,. \tag{3.73}$$

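Here we have introduced the field vector $\phi = (\phi_1, \phi_2)^T$. Obviously, the latter Lagrangian is invariant under the orthogonal transformations ($O(2)$ transformations or two-dimensional rotations),

$$\phi \to \phi' = R \phi \,, \tag{3.74}$$

with $R^T = R^{-1}$. To calculate the conserved current, we again consider infinitesimal symmetry transformations ($i, j = 1, 2$)

$$R_{ij} = \delta_{ij} + \theta_{ij} + O(\theta^2) \,. \tag{3.75}$$

The orthogonality of the matrix $R$,

$$\delta_{ij} + \theta_{ji} = R^T_{ij} = R^{-1}_{ij} = \delta_{ij} - \theta_{ij} \,, \tag{3.76}$$

tells us that the matrix $\theta$ is antisymmetric. The infinitesimal transformation of the field $\phi_1$ under (3.74) is

$$\phi_1 \to \phi_1' = R_{1i} \phi_i = (\delta_{1i} + \theta_{1i}) \, \phi_i = \phi_1 + \theta_{11} \phi_1 + \theta_{12} \phi_2 = \phi_1 + \theta_{12} \phi_2 \,, \tag{3.77}$$

which tells us that the variation of $\phi_1$ is

$$\delta \phi_1 = \theta_{12} \phi_2 \,. \tag{3.78}$$

An analogous calculation gives

$$\delta \phi_2 = \theta_{21} \phi_1 = -\theta_{12} \phi_1 \,. \tag{3.79}$$

Knowing the variations $\delta \phi_{1,2}$ of the fields, the conserved current corresponding to (3.74) is readily written down,

$$J^\mu = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \phi_i)} \, \delta \phi_i = \theta_{12} \left[ (\partial^\mu \phi_1) \, \phi_2 - (\partial^\mu \phi_2) \, \phi_1 \right] , \tag{3.80}$$

so the conserved charge is

$$Q = \int d^3x \left[ (\partial^0 \phi_1) \, \phi_2 - (\partial^0 \phi_2) \, \phi_1 \right] . \tag{3.81}$$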

Substituting in the above expression the physical solutions (3.15) for the fields $\phi_{1,2}$, and performing the integration over the space coordinates, one obtains (the actual computation is part of an exercise)¹³

$$Q = -i \int \frac{d^3p}{(2\pi)^3} \left[ a_{1,\mathbf{p}}^\dagger a_{2,\mathbf{p}} - a_{2,\mathbf{p}}^\dagger a_{1,\mathbf{p}} \right] , \tag{3.82}$$

which is a hermitian operator, i.e., it satisfies $Q^\dagger = Q$.

¹³ This expression has not been normal ordered.

There is an ambiguity worth noting when applying Noether's theorem to find the conserved charge under the transformation (3.74). Obviously, if $Q$ is conserved, then so is every other operator $c_1 Q + c_2$ with $c_{1,2}$ constant numbers. The expression for $Q$ in (3.82) is therefore unique up to a multiplicative and an additive constant. The ambiguity on the additive constant is removed when we remove the contribution of the vacuum to the charge of particle states (as we have done for the energy). The normal-ordered charge operator

$$: Q : \; = Q - \langle 0 | Q | 0 \rangle \,, \tag{3.83}$$

is ambiguous only up to a multiplicative factor, which essentially denotes the units in which we measure the charge of a state. Notice that we have already used this ambiguity in (3.81) and simply ignored the factor $\theta_{12}$. In the following, we will use the normalization (3.83) of $Q$, dropping as before the ": :" to avoid unnecessary clutter.

So far so good. Next we would like to determine the spectrum of $Q$. This is most easily done using the technique of ladder operators. We first define the following linear combinations,

$$a_{\pm,\mathbf{p}} = \frac{1}{\sqrt{2}} \left( a_{1,\mathbf{p}} \pm i a_{2,\mathbf{p}} \right) , \tag{3.84}$$

of annihilation operators (an analogous definition holds for the hermitian conjugate operators). It is left as a homework problem to show that these new operators satisfy the following commutation relations

$$[Q, a_{\pm,\mathbf{p}}] = \mp a_{\pm,\mathbf{p}} \,, \qquad [Q, a_{\pm,\mathbf{p}}^\dagger] = \pm a_{\pm,\mathbf{p}}^\dagger \,. \tag{3.85}$$

The latter relations imply that we can obtain states with charge $q \pm 1$ from a state $|S\rangle$ of charge $q$, i.e., $Q |S\rangle = q \, |S\rangle$, by the action of $a_{\pm,\mathbf{p}}^\dagger$,

$$Q \, a_{\pm,\mathbf{p}}^\dagger |S\rangle = (q \pm 1) \, a_{\pm,\mathbf{p}}^\dagger |S\rangle \,. \tag{3.86}$$

In other words the operators $a_{\pm,\mathbf{p}}^\dagger$ are ladder operators with respect to $Q$. Since $a_{\pm,\mathbf{p}}^\dagger$ are linear combinations of $a_{1,\mathbf{p}}^\dagger$ and $a_{2,\mathbf{p}}^\dagger$, which are ladder operators for the Hamiltonian $H$ and the total momentum operator $\mathbf{P}$, so are $a_{\pm,\mathbf{p}}^\dagger$. To find now all the common eigenstates of the charge operator $Q$, it is sufficient to start from a single common eigenstate and then to act with $a_{\pm,\mathbf{p}}^\dagger$ on this state. It is not surprising that the vacuum $|0\rangle$ is also an eigenstate of $Q$, namely the one with zero charge¹⁴

$$Q \, |0\rangle = 0 \, |0\rangle = 0 \,. \tag{3.87}$$

¹⁴ Notice that the normal ordering (3.83) of $Q$ plays an essential role here.

Repeated application of the ladder operators,

$$|S_n^\pm\rangle = \prod_{i=1}^{n} a_{\pm,\mathbf{p}_i}^\dagger |0\rangle \,, \tag{3.88}$$

then creates $n$-particle states with positive ($|S_n^+\rangle$) and negative ($|S_n^-\rangle$) charge. Consequently, one has

$$H |S_n^\pm\rangle = \left( \sum_{i=1}^{n} \omega_{\mathbf{p}_i} \right) |S_n^\pm\rangle \,, \qquad \mathbf{P} |S_n^\pm\rangle = \left( \sum_{i=1}^{n} \mathbf{p}_i \right) |S_n^\pm\rangle \,, \qquad N |S_n^\pm\rangle = n \, |S_n^\pm\rangle \,, \qquad Q |S_n^\pm\rangle = \pm n \, |S_n^\pm\rangle \,. \tag{3.89}$$

The main results of this subsection can be summarized as follows. The mass degeneracy of the Klein-Gordon fields φ1,2 results in a new O(2) symmetry of the Lagrangian. This gives rise to a new conserved quantity, the charge Q. A particle state is then characterized by its mass (or equivalently its energy), its momentum, and its charge, which can be either positive or negative. States with the same energy and momentum, but opposite charge, can be interpreted as particles and antiparticles. Notice that for a single real Klein-Gordon field there is only a single type of particle, since a real scalar particle is its own antiparticle.

3.5

Complex Klein-Gordon Field

We can gain further insight into the theory by rewriting the Lagrangian (3.73) a little bit,

$$\mathcal{L} = (\partial_\mu \varphi^*)(\partial^\mu \varphi) - m^2 \varphi^* \varphi \,, \tag{3.90}$$

where

$$\varphi = \frac{1}{\sqrt{2}} \left( \phi_1 + i \phi_2 \right) , \tag{3.91}$$

denotes the complex Klein-Gordon field. We could now compute the Hamiltonian and momentum operators directly in terms of $\varphi$ and $\varphi^*$, arriving at the same expressions as in the representation with two real fields (if you don't believe me you are free to check this yourself). In order to compute the charge $Q$, we then need to identify the internal symmetry of the new Lagrangian. In fact, it is easy to see that (3.90) is invariant under a field phase-redefinition, aka a global $U(1)$ transformation,

$$\varphi \to \varphi' = e^{i\alpha} \varphi \,, \qquad \varphi^* \to (\varphi')^* = e^{-i\alpha} \varphi^* \,. \tag{3.92}$$

Notice that this transformation is the equivalent of the rotation symmetry transformation (3.74) that we have found earlier, in the real field representation (up to the sign of the angle $\alpha$). We verify this by using the explicit form of the matrix $R$ in terms of sine and cosine of the rotation angle $\alpha$,

$$\begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \;\Longrightarrow\; \begin{pmatrix} \phi_1 \\ i\phi_2 \end{pmatrix} \to \begin{pmatrix} \cos\alpha & -i\sin\alpha \\ -i\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \phi_1 \\ i\phi_2 \end{pmatrix} \;\Longrightarrow\; \phi_1 + i\phi_2 \to e^{-i\alpha} (\phi_1 + i\phi_2) \;\Longrightarrow\; \varphi \to e^{-i\alpha} \varphi \,. \tag{3.93}$$

So why should we bother about the complex Klein-Gordon Lagrangian if (3.73) and (3.90) are equivalent? The reason is that the complex field representation is more suggestive of the fact that we have both particle and antiparticle states. To see this we "rederive" the expression for the charge operator (3.82). The variations of the fields $\varphi$ and $\varphi^*$ (treated as independent) under (3.92) are

$$\delta \varphi = i \alpha \varphi \,, \qquad \delta \varphi^* = -i \alpha \varphi^* \,. \tag{3.94}$$

Now we can again use the machinery of Noether's theorem to calculate $Q$. I spare you the details of this computation and simply quote the final result after normal ordering. One finds¹⁵

$$Q = \int \frac{d^3p}{(2\pi)^3} \left[ a_{+,\mathbf{p}}^\dagger a_{+,\mathbf{p}} - a_{-,\mathbf{p}}^\dagger a_{-,\mathbf{p}} \right] = N_+ - N_- \,, \tag{3.95}$$

where in the last step we have introduced the number operators

$$N_\pm = \int \frac{d^3p}{(2\pi)^3} \, a_{\pm,\mathbf{p}}^\dagger a_{\pm,\mathbf{p}} \,. \tag{3.96}$$

¹⁵ You can obtain this expression by simply reexpressing (3.82) in terms of $a_{\pm,\mathbf{p}}$ and $a_{\pm,\mathbf{p}}^\dagger$ using the inverse of (3.84) and its hermitian conjugate analog.

The expression (3.95) implies that $Q$ counts the number of antiparticles (created by $a_{+,\mathbf{p}}^\dagger$) minus the number of particles (created by $a_{-,\mathbf{p}}^\dagger$). Since $[H, Q] = 0$, this difference is a conserved quantity in our quantum theory. Of course, in a free field theory this isn't such a big deal because both $N_+$ and $N_-$, i.e., the numbers of positively and negatively charged states, are separately conserved. However, we will see soon that in interacting theories $Q$ survives as a conserved quantity, while $N_\pm$ individually do not.
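The chain of implications in (3.93) can be verified with ordinary complex arithmetic. The sketch below is a small numerical illustration in which the two real fields are replaced by two real numbers at a single space-time point; the function names are hypothetical. It checks that rotating the pair by the matrix of (3.93) is the same as multiplying the complex combination (3.91) by a phase, and that the combination entering the mass term is unchanged.

```python
import cmath
import math


def rotate(phi1, phi2, alpha):
    """O(2) rotation of the real pair (phi1, phi2) with the matrix used in (3.93)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return c * phi1 + s * phi2, -s * phi1 + c * phi2


def to_complex(phi1, phi2):
    """Complex field value (phi1 + i phi2) / sqrt(2), as in (3.91)."""
    return (phi1 + 1j * phi2) / math.sqrt(2.0)


def phase_mismatch(phi1, phi2, alpha):
    """|varphi(rotated pair) - exp(-i alpha) varphi(original pair)|; zero if (3.93) holds."""
    rotated = to_complex(*rotate(phi1, phi2, alpha))
    return abs(rotated - cmath.exp(-1j * alpha) * to_complex(phi1, phi2))


def mass_term_change(phi1, phi2, alpha):
    """Change of varphi^* varphi = (phi1^2 + phi2^2) / 2 under the rotation."""
    rotated = to_complex(*rotate(phi1, phi2, alpha))
    return abs(rotated) ** 2 - abs(to_complex(phi1, phi2)) ** 2


if __name__ == "__main__":
    print(phase_mismatch(0.7, -1.3, 0.4))     # numerically zero
    print(mass_term_change(0.7, -1.3, 0.4))   # numerically zero
```

The phase comes out as exp(−iα) rather than exp(+iα), in line with the last step of (3.93); the two conventions differ only by the sign of the rotation angle.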

3.6

Heisenberg Picture

Although we started with a Lorentz-invariant Lagrangian, we slowly butchered it as we quantized the theory, introducing a preferred time coordinate $t$. It's not at all obvious that the theory is still Lorentz invariant after quantization. E.g., the various field operators $\phi(\mathbf{x})$ we met depend on space, but not on time. Yet, the one-particle states obey the Schrödinger equation,

$$i \, \frac{d |\mathbf{p}(t)\rangle}{dt} = H \, |\mathbf{p}(t)\rangle \,, \tag{3.97}$$

which means that they evolve in time according to

$$|\mathbf{p}(t)\rangle = e^{-i E_\mathbf{p} t} \, |\mathbf{p}\rangle \,. \tag{3.98}$$

Things start to look better in the Heisenberg picture where the time dependence is assigned to the operators $O$,

$$O_H = e^{iHt} \, O_S \, e^{-iHt} \,, \tag{3.99}$$

so that

$$\frac{dO_H}{dt} = \left( \frac{d}{dt} e^{iHt} \right) O_S \, e^{-iHt} + e^{iHt} \, O_S \left( \frac{d}{dt} e^{-iHt} \right) = iH \, e^{iHt} O_S \, e^{-iHt} + e^{iHt} O_S \, e^{-iHt} \, (-iH) = i \, [H, O_H] \,. \tag{3.100}$$

Here the subscripts $S$ and $H$ tell us whether the operator is in the Schrödinger or Heisenberg picture. In QFT, we drop these subscripts and we will denote the picture by specifying whether the fields depend on space $\phi(\mathbf{x})$ (the Schrödinger picture) or space-time $\phi(t, \mathbf{x}) = \phi(x)$ (the Heisenberg picture). The operators in the two pictures agree at a fixed time, say, $t = 0$. The commutation relations (3.2) become equal-time commutation relations in the Heisenberg picture. In the case of the real Klein-Gordon theory (2.45),

$$[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = [\pi(t, \mathbf{x}), \pi(t, \mathbf{y})] = 0 \,, \qquad [\phi(t, \mathbf{x}), \pi(t, \mathbf{y})] = i \delta^{(3)}(\mathbf{x} - \mathbf{y}) \,. \tag{3.101}$$
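The definition (3.99) and the equation of motion (3.100) can be tested on the simplest possible system. The following sketch is a toy check and not part of the lecture: it uses a single harmonic-oscillator mode with the zero-point energy dropped, H = E N, truncated to a finite number basis, and takes the lowering operator a as the Schrödinger-picture operator. Because H is diagonal in this basis, conjugation by exp(iHt) only multiplies each matrix element by a phase. The truncation level and all function names are assumptions made for the example.

```python
import cmath
import math


def lowering_operator(n_max):
    """Matrix of a in the basis |0>, ..., |n_max>: <n-1| a |n> = sqrt(n)."""
    dim = n_max + 1
    a = [[0j] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = complex(math.sqrt(n))
    return a


def heisenberg(op, energy, t):
    """exp(iHt) op exp(-iHt) for the diagonal H = energy * diag(0, 1, ..., n_max)."""
    dim = len(op)
    return [[cmath.exp(1j * energy * (i - j) * t) * op[i][j] for j in range(dim)]
            for i in range(dim)]


def i_commutator_with_h(op, energy):
    """i [H, op] for the same diagonal H: element (i, j) gets the factor i * energy * (i - j)."""
    dim = len(op)
    return [[1j * energy * (i - j) * op[i][j] for j in range(dim)] for i in range(dim)]


def max_abs_difference(m1, m2):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(x - y) for row1, row2 in zip(m1, m2) for x, y in zip(row1, row2))


def time_derivative(op, energy, t, step=1e-6):
    """Central finite difference of the Heisenberg-picture operator."""
    plus = heisenberg(op, energy, t + step)
    minus = heisenberg(op, energy, t - step)
    return [[(x - y) / (2.0 * step) for x, y in zip(r1, r2)] for r1, r2 in zip(plus, minus)]


if __name__ == "__main__":
    a, energy, t = lowering_operator(5), 1.3, 0.7
    a_h = heisenberg(a, energy, t)
    phase_times_a = [[cmath.exp(-1j * energy * t) * x for x in row] for row in a]
    print(max_abs_difference(a_h, phase_times_a))                 # a_H = exp(-iEt) a
    print(max_abs_difference(time_derivative(a, energy, t),
                             i_commutator_with_h(a_h, energy)))   # dO_H/dt = i [H, O_H]
```

The first check anticipates the result that the annihilation operator simply acquires the phase exp(−iEt) in the Heisenberg picture.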

Now that our operators depend on time, we can study how they evolve when the clock starts ticking. For the field operator φ, we have

$$\dot{\phi}(x) = i \, [H, \phi(x)] = \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + \left( \nabla \phi(y) \right)^2 + m^2 \phi^2(y) \right\} , \phi(x) \right] = i \int d^3y \, \pi(y) \, (-i) \, \delta^{(3)}(\mathbf{x} - \mathbf{y}) = \pi(x) \,. \tag{3.102}$$

Similarly, we get for the conjugate operator π,

$$\dot{\pi}(x) = i \, [H, \pi(x)] = \frac{i}{2} \left[ \int d^3y \left\{ \pi^2(y) + \left( \nabla \phi(y) \right)^2 + m^2 \phi^2(y) \right\} , \pi(x) \right]$$
$$= \frac{i}{2} \int d^3y \left\{ \nabla_y [\phi(y), \pi(x)] \cdot \nabla \phi(y) + \nabla \phi(y) \cdot \nabla_y [\phi(y), \pi(x)] + 2i \, m^2 \phi(y) \, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \right\} = \left( \nabla^2 - m^2 \right) \phi(x) \,, \tag{3.103}$$

where we have included the subscript $y$ on $\nabla_y$ when there may be some confusion about which argument the derivative is acting on. To reach the last line, we have simply integrated by parts. Putting (3.102) and (3.103) together, we then find that φ satisfies (as one could have guessed) the Klein-Gordon equation (2.46). Things start to look more relativistic. We can also write the Fourier expansion (3.15) of the field φ by using the definition of Heisenberg operators (3.99). We first note that

$$(a_\mathbf{p})_H = e^{iHt} a_\mathbf{p} \, e^{-iHt} = \left( [e^{iHt}, a_\mathbf{p}] + a_\mathbf{p} \, e^{iHt} \right) e^{-iHt} = \left( e^{-i E_\mathbf{p} t} a_\mathbf{p} \, e^{iHt} - a_\mathbf{p} \, e^{iHt} + a_\mathbf{p} \, e^{iHt} \right) e^{-iHt} = e^{-i E_\mathbf{p} t} \, a_\mathbf{p} \,, \tag{3.104}$$

where we have applied repeatedly

$$H^n a_\mathbf{p} = a_\mathbf{p} \, (H - E_\mathbf{p})^n \,, \tag{3.105}$$

which holds for any $n$ and follows from the commutation relations (3.43), after expanding the exponential in a power series (this step is actually not shown). A similar relation (with "−" replaced by "+") holds for $a_\mathbf{p}^\dagger$. In the case of $a_\mathbf{p}^\dagger$, we hence have

$$(a_\mathbf{p}^\dagger)_H = e^{iHt} a_\mathbf{p}^\dagger \, e^{-iHt} = e^{i E_\mathbf{p} t} \, a_\mathbf{p}^\dagger \,. \tag{3.106}$$

Using (3.104) and (3.106) then gives

$$\phi(x) = \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{\sqrt{2 E_\mathbf{p}}} \left( a_\mathbf{p} \, e^{-ipx} + a_\mathbf{p}^\dagger \, e^{ipx} \right) , \tag{3.107}$$

which looks pretty much like (3.15) except that the exponentials are now written in terms of 4-vectors, $px = E_\mathbf{p} t - \mathbf{p} \cdot \mathbf{x}$. Note also that the sign has flipped in the exponent due to the Minkowski metric. It's a simple exercise to check that (3.107) indeed satisfies the Klein-Gordon equation (2.45), and is therefore left as a homework. For completeness let me also give the result for the conjugate field π in the Heisenberg picture. One finds

$$\pi(x) = \int \frac{d^3p}{(2\pi)^3} \, (-i) \sqrt{\frac{E_\mathbf{p}}{2}} \left[ a_\mathbf{p} \, e^{-ipx} - a_\mathbf{p}^\dagger \, e^{ipx} \right] , \tag{3.108}$$

as you might have guessed immediately from looking at (3.15) and (3.107).

The equation (3.107) makes explicit the dual particle and wave interpretations of the quantum field φ. On the one hand, φ is written as an operator, which creates and destroys the particles that are the quanta of field excitation. On the other hand, φ is written as a linear combination of solutions (the exponentials) of the Klein-Gordon equation. Both signs of the time dependence, i.e., $\pm i p^0 t$ with $p^0 > 0$, appear in the exponential. If these were single-particle wavefunctions, they would correspond to states of positive and negative energy. Let us refer to them more generally as positive- and negative-frequency modes. The connection between the particle-creation operators and the waveforms displayed here is always valid for free quantum fields. A positive-frequency solution of the field equation has as its coefficient the operator that destroys a particle in that single-particle wavefunction, while a negative-frequency solution of the field equation (being the hermitian conjugate of a positive-frequency solution) has as its coefficient the operator that creates a particle in that positive-energy single-particle wavefunction. In this way, the fact that relativistic wave equations have both positive- and negative-frequency solutions is reconciled with the requirement that a sensible quantum theory should contain only positive excitation energies.

Causality

It looks like we are approaching something Lorentz invariant in the Heisenberg picture, where the field operator φ satisfies the Klein-Gordon equation. Yet, there is still a hint of non-Lorentz invariance because φ and π satisfy the equal-time commutation relations (3.101). The question that we thus have to address is, what happens for arbitrary space-time separations? In particular, for our theory to be causal, we must require that all space-like separated operators commute,

$$[O_1(x), O_2(y)] = 0 \,, \qquad \forall \; (x - y)^2 < 0 \,. \tag{3.109}$$

This ensures that a measurement at $x$ cannot affect a measurement at $y$, when $x$ and $y$ are not causally connected (outside the light-cone). A graphical representation of the latter equation is given in Figure 3.1. Does our theory satisfy the requirement (3.109)? To answer this question, we first define

$$\Delta(x - y) = [\phi(x), \phi(y)] \,. \tag{3.110}$$

While the objects on the right-hand side are operators, it is seen (after a short calculation) that the left-hand side is simply a complex number,

$$\Delta(x - y) = \left[ \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2 E_\mathbf{p}}} \left( a_\mathbf{p} e^{-ipx} + a_\mathbf{p}^\dagger e^{ipx} \right) , \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2 E_\mathbf{q}}} \left( a_\mathbf{q} e^{-iqy} + a_\mathbf{q}^\dagger e^{iqy} \right) \right]$$
$$= \int \frac{d^3p \, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_\mathbf{p} E_\mathbf{q}}} \left( [a_\mathbf{p}, a_\mathbf{q}^\dagger] \, e^{-ipx + iqy} + [a_\mathbf{p}^\dagger, a_\mathbf{q}] \, e^{ipx - iqy} \right)$$
$$= \int \frac{d^3p \, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_\mathbf{p} E_\mathbf{q}}} \, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) \left( e^{-ipx + iqy} - e^{ipx - iqy} \right)$$
$$= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_\mathbf{p}} \left( e^{-ip(x-y)} - e^{ip(x-y)} \right) . \tag{3.111}$$

Figure 3.1: Picture of space-like separated operators $O_1(x)$ and $O_2(y)$ (axes: $t$ and $x$).

So what do we know about $\Delta(x - y)$? First of all, it is Lorentz invariant thanks to the Lorentz-invariant measure $\int d^3p/(2 E_\mathbf{p})$ that we have introduced in (3.61). Second, it does not vanish for time-like separations. E.g., taking $x = (t, 0, 0, 0)$ and $y = (0, 0, 0, 0)$ gives $[\phi(x), \phi(y)] \propto \exp(-imt) - \exp(imt)$, where the $\exp(-imt)$ term arises from

$$\int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_\mathbf{p}} \, e^{-i E_\mathbf{p} t} = \frac{4\pi}{(2\pi)^3} \int_0^\infty dp \, \frac{p^2}{2 \sqrt{p^2 + m^2}} \, e^{-it \sqrt{p^2 + m^2}} = \frac{1}{4\pi^2} \int_m^\infty dE \, \sqrt{E^2 - m^2} \, e^{-iEt} = \frac{m}{8\pi t} \left( Y_1(mt) + i J_1(mt) \right) \underset{t \to \infty}{\propto} e^{-imt} \,, \tag{3.112}$$

where to arrive at the second line, we have simply changed variables $p \to (E^2 - m^2)^{1/2}$. In order to obtain the final answer, one only needs to know that the Bessel functions of first and second kind, $J_1(x)$ and $Y_1(x)$, behave like $J_1(x)/x \propto \sin x - \cos x$ and $Y_1(x)/x \propto \sin x + \cos x$ in the relevant limit $x \to \infty$. An analogous calculation gives the $\exp(imt)$ term. Third, it vanishes for space-like separations. This follows by realizing that $\Delta(x - y) = 0$ at equal times for all $(x - y)^2 = -(\mathbf{x} - \mathbf{y})^2 < 0$, which can be seen explicitly by writing

$$[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 \sqrt{\mathbf{p}^2 + m^2}} \left( e^{i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} - e^{-i \mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} \right) = 0 \,. \tag{3.113}$$

Notice that in order to arrive at the final result, we have flipped the sign of $\mathbf{p}$ in the second exponent. This obviously does not change the result since $\mathbf{p}$ is an integration variable and $(\mathbf{p}^2 + m^2)^{1/2}$ is invariant under such a change. But since $\Delta(x - y)$ is Lorentz invariant, it can only be a function of $(x - y)^2$ and must hence vanish for all $(x - y)^2 < 0$.

Taken together the above findings imply that the real Klein-Gordon theory is indeed causal with commutators vanishing outside the light-cone. This property will continue to hold in the interacting theory. Indeed, it is usually given as one of the axioms of local QFTs. Let me mention, however, that the fact that $[\phi(x), \phi(y)]$ is a complex function, rather than an operator, is a property of free fields only and does not hold in an interacting theory.

3.7

Klein-Gordon Correlators

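The causal structure of the real Klein-Gordon theory (2.45) can also be probed in a different way. Let's create a particle at the space-time point $y$. What is the amplitude to find it at point $x$? This question can be answered by calculating

$$D(x - y) = \langle 0 | \phi(x) \phi(y) | 0 \rangle = \int \frac{d^3p \, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_\mathbf{p} E_\mathbf{q}}} \, \langle 0 | a_\mathbf{p} a_\mathbf{q}^\dagger | 0 \rangle \, e^{-ipx + iqy}$$
$$= \int \frac{d^3p \, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_\mathbf{p} E_\mathbf{q}}} \, \langle 0 | [a_\mathbf{p}, a_\mathbf{q}^\dagger] | 0 \rangle \, e^{-ipx + iqy}$$
$$= \int \frac{d^3p \, d^3q}{(2\pi)^6} \frac{1}{2 \sqrt{E_\mathbf{p} E_\mathbf{q}}} \, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) \, e^{-ipx + iqy} = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_\mathbf{p}} \, e^{-ip(x-y)} \,. \tag{3.114}$$

The function $D(x - y)$ is called the propagator and is a Lorentz-invariant 3-momentum integral. Let us now evaluate (3.114) for purely space-like separations, i.e., $x - y = (0, \mathbf{r})$.¹⁶ The propagator is then

$$D(x - y) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_\mathbf{p}} \, e^{i \mathbf{p} \cdot \mathbf{r}} = \frac{2\pi}{(2\pi)^3} \int_0^\infty dp \, \frac{p^2}{2 E_\mathbf{p}} \, \frac{e^{ipr} - e^{-ipr}}{ipr} = \frac{-i}{2 (2\pi)^2 r} \int_{-\infty}^{\infty} dp \, \frac{p \, e^{ipr}}{\sqrt{p^2 + m^2}} \,. \tag{3.115}$$

¹⁶ Notice that for purely time-like separations one would obtain the result (3.112).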

Figure 3.2: Branch cuts of the propagator D(x − y) for a space-like transition (complex p-plane with axes Re p and Im p; the branch points are at p = ±im).

Here we have first introduced spherical coordinates, then performed the integration over the azimuthal and polar angles, and finally changed variables in the second term from $p \to -p$ in order to combine the result into one term. The integrand in (3.115), considered as a complex function of $p$, has branch cuts on the imaginary axis starting at $\pm im$. In order to evaluate the integral we push the contour up to wrap around the upper branch cut. The chosen integration contour is shown in Figure 3.2. Defining $\rho = -ip$, we then recast (3.115) into

$$D(x - y) = \frac{1}{4\pi^2 r} \int_m^\infty d\rho \, \frac{\rho \, e^{-\rho r}}{\sqrt{\rho^2 - m^2}} = \frac{1}{4\pi^2 r} \, m K_1(mr) \underset{r \to \infty}{\propto} e^{-mr} \,, \tag{3.116}$$

where the modified Bessel function $K_1(x)$ scales like $K_1(x) = \left( \sqrt{\pi/(2x)} + O(x^{-3/2}) \right) e^{-x}$ in the limit of $x \to \infty$. The latter equation tells us that the propagator $D(x - y)$ decays exponentially quickly outside the light-cone but, nonetheless, it is non-vanishing. The quantum field appears to leak out of the causal region. Yet, we have just seen in (3.113) that space-like measurements commute and the theory is causal. How do we reconcile these two facts? We get a first clue of how this puzzle is resolved by realizing that the relation (3.113), expressed in terms of propagators, takes the form

$$\Delta(x - y) = [\phi(x), \phi(y)] = D(x - y) - D(y - x) = 0 \,. \tag{3.117}$$

What is the physical meaning of this result? It simply means that for $(x - y)^2 < 0$, there is no Lorentz-invariant way to order events. If a particle can travel in a space-like direction from $x$ to $y$, it can just as easily travel from $y$ to $x$.¹⁷ In any measurement, the amplitudes for these two possible events cancel, so that the underlying QFT is causal.

¹⁷ When $x - y$ is space-like, a continuous Lorentz transformation can take $x - y$ to $y - x$.
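The exponential fall-off of the propagator outside the light-cone, (3.116), can be reproduced numerically without special-function libraries. The sketch below is an illustration under stated assumptions: substituting ρ = m cosh t in the integral of (3.116) turns it into the standard integral representation K₁(z) = ∫₀^∞ exp(−z cosh t) cosh t dt, which is evaluated here with a simple trapezoidal rule. The cut-off `t_max`, the number of steps and the function names are choices made for this example.

```python
import math


def bessel_k1(z, t_max=12.0, steps=20000):
    """Modified Bessel function K_1(z), z > 0, from its integral representation.

    K_1(z) = int_0^inf exp(-z cosh t) cosh t dt, composite trapezoidal rule.
    The integrand at t_max is negligibly small (it underflows to zero), so the
    upper end point is dropped.
    """
    h = t_max / steps
    total = 0.5 * math.exp(-z)          # end point t = 0, where cosh t = 1
    for i in range(1, steps):
        c = math.cosh(i * h)
        total += math.exp(-z * c) * c
    return total * h


def propagator_spacelike(r, mass):
    """D(x - y) for x - y = (0, r), using D = m K_1(m r) / (4 pi^2 r) from (3.116)."""
    return mass * bessel_k1(mass * r) / (4.0 * math.pi ** 2 * r)


def leading_asymptotic_k1(z):
    """Leading large-z behaviour sqrt(pi / (2 z)) exp(-z) quoted below (3.116)."""
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)


if __name__ == "__main__":
    for r in (1.0, 5.0, 20.0):
        d = propagator_spacelike(r, 1.0)
        ratio = bessel_k1(r) / leading_asymptotic_k1(r)
        print(r, d, ratio)   # D is small but non-zero; the ratio approaches 1 as r grows
```

The values are tiny at large r but never exactly zero, which is the "leakage" discussed in the text; only the difference D(x − y) − D(y − x) vanishes.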


Another way to think about the cancellation of the two contributions in (3.117) is in terms of amplitudes of particles and antiparticles. Let us first consider the case of a complex scalar field. If we look at the equation $[\varphi(x), \varphi^*(y)] = 0$ outside the light-cone, the physical interpretation of (3.117) (or better its analog) is that the amplitude for the particle to propagate from $x$ to $y$ cancels the amplitude for the antiparticle to travel from $y$ to $x$. In fact, this interpretation also applies (maybe in a less obvious way) to the case of the real scalar field, because the particle is then its own antiparticle.

Green's functions

In fact, the statements made after (3.117) can be put on solid mathematical grounds. Let's see how this goes. We start by considering the amplitude

$$\langle 0 | [\phi(x), \phi(y)] | 0 \rangle = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2 E_\mathbf{p}} \left( e^{-ip(x-y)} - e^{ip(x-y)} \right) , \tag{3.118}$$

and assume for now that $x^0 > y^0$. In this case we can rewrite the 3-momentum integral on the right-hand side of (3.118) as a 4-momentum integral,

$$\langle 0 | [\phi(x), \phi(y)] | 0 \rangle = \int \frac{d^3p}{(2\pi)^3} \left\{ \frac{1}{2 E_\mathbf{p}} \, e^{-ip(x-y)} \Big|_{p^0 = E_\mathbf{p}} + \frac{1}{-2 E_\mathbf{p}} \, e^{-ip(x-y)} \Big|_{p^0 = -E_\mathbf{p}} \right\}$$
$$\underset{x^0 > y^0}{=} \int \frac{d^3p}{(2\pi)^3} \int \frac{dp^0}{2\pi i} \, \frac{-1}{p^2 - m^2} \, e^{-ip(x-y)} = \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m^2} \, e^{-ip(x-y)} \,. \tag{3.119}$$

Notice that this is the first time in this course that we have integrated over 4-momentum. Until now, we integrated only over 3-momentum, with $p^0$ fixed by the mass-shell condition to be $p^0 = E_\mathbf{p}$. Barring possible typos, the calculation in (3.119) is certainly correct, but this fact might not be obvious to everybody in the audience right away. So let me do some "reverse engineering". First notice that the denominator in the last line of (3.119) can be written as

$$p^2 - m^2 = (p^0)^2 - \mathbf{p}^2 - m^2 = (p^0)^2 - E_\mathbf{p}^2 = (p^0 - E_\mathbf{p})(p^0 + E_\mathbf{p}) \,, \tag{3.120}$$

which implies that, for each value of $\mathbf{p}$, the denominator produces a pole in the integrand at $p^0 = \pm E_\mathbf{p} = \pm (\mathbf{p}^2 + m^2)^{1/2}$. The 4-momentum integration is hence ill-defined and we need a prescription for avoiding the singularities on the real $p^0$-axis. How do we have to choose the integration contour in order to arrive at (3.119)? It is not difficult to see that in the case $x^0 > y^0$ the contour has to be chosen as shown in Figure 3.3. Notice that closing the contour in the lower half-plane, where $p^0 \to -i\infty$, ensures that the integrand vanishes since $\exp(-ip^0 (x^0 - y^0)) \to 0$. The integral over $p^0$ then picks up the residues at $p^0 = \pm E_\mathbf{p}$, which are $-2\pi i/(p^0 \pm E_\mathbf{p}) \big|_{p^0 = \pm E_\mathbf{p}} = \mp 2\pi i/(2 E_\mathbf{p})$, where the relative minus sign arises because we take a clockwise contour. Combining these elements shows that the calculation that led to the final result in (3.119) is in fact correct.

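[Figure 3.3 omitted: complex $p^0$-plane with axes $\mathrm{Re}\,p^0$ and $\mathrm{Im}\,p^0$, poles at $p^0 = -E_{\mathbf p}$ and $p^0 = +E_{\mathbf p}$, and the contour closed at $p^0 \to -i\infty$.]

Figure 3.3: Integration contour for the retarded Green's function $D_R(x - y)$.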
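The residue bookkeeping above is easy to get wrong, so here is a minimal numerical sketch (not part of the original derivation) that compares the $p^0$ integral in the second line of (3.119) with the two pole contributions in its first line, at fixed $\mathbf{p}$ and with $t = x^0 - y^0$. It assumes that the contour of Figure 3.3 can be replaced by the straight line $\mathrm{Im}\,p^0 = \eta > 0$, which passes above both poles. The function names, the cutoff, and the number of steps are illustrative choices.

```python
import cmath
import math


def retarded_p0_integral(energy, t, eta=0.5, cutoff=2000.0, n_steps=400_000):
    """Trapezoidal estimate of int dp0/(2*pi*i) * (-1)/(p0^2 - E^2) * exp(-i*p0*t)
    along the line Im p0 = eta > 0, truncated to |Re p0| <= cutoff."""
    step = 2.0 * cutoff / n_steps
    total = 0j
    for k in range(n_steps + 1):
        p0 = complex(-cutoff + k * step, eta)
        value = -cmath.exp(-1j * p0 * t) / (p0 * p0 - energy * energy)
        # End points carry half weight in the trapezoidal rule.
        total += value if 0 < k < n_steps else 0.5 * value
    return total * step / (2j * math.pi)


def pole_contributions(energy, t):
    """First line of (3.119) at fixed 3-momentum: (e^{-iEt} - e^{+iEt}) / (2E)."""
    return (cmath.exp(-1j * energy * t) - cmath.exp(1j * energy * t)) / (2.0 * energy)
```

For $t > 0$ the two functions should agree up to the truncation error of the cutoff, while for $t < 0$ the integral should be close to zero, which is the statement that this contour yields a retarded function.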

In the following, we will call the last line of (3.119), together with the prescription for going around the poles, the retarded Green's function,

$$
D_R(x - y) = \theta(x^0 - y^0) \, \langle 0 | [\phi(x), \phi(y)] | 0 \rangle = \theta(x^0 - y^0) \big( D(x - y) - D(y - x) \big) , \tag{3.121}
$$

where the Heaviside step function $\theta(x)$ is defined as $\theta(x) = 0$ for $x < 0$ and $\theta(x) = 1$ for $x > 0$. It seldom matters what value is used for $\theta(0)$, since $\theta(x)$ is mostly used as a distribution (in the half-maximum convention one has $\theta(0) = 1/2$). The name retarded Green's function is in fact the correct one for $D_R(x - y)$, since this mathematical object obeys¹⁸

$$
\begin{aligned}
(\partial^2 + m^2) \, D_R(x - y)
&= \big( \partial^2 \theta(x^0 - y^0) \big) \langle 0 | [\phi(x), \phi(y)] | 0 \rangle
 + 2 \big( \partial_\mu \theta(x^0 - y^0) \big) \big( \partial^\mu \langle 0 | [\phi(x), \phi(y)] | 0 \rangle \big) \\
&\quad + \theta(x^0 - y^0) \, (\partial^2 + m^2) \langle 0 | [\phi(x), \phi(y)] | 0 \rangle \\
&= -\delta(x^0 - y^0) \, \langle 0 | [\pi(x), \phi(y)] | 0 \rangle
 + 2 \delta(x^0 - y^0) \, \langle 0 | [\pi(x), \phi(y)] | 0 \rangle + 0 \\
&= -i \delta^{(4)}(x - y) ,
\end{aligned}
\tag{3.122}
$$

and vanishes for $x^0 < y^0$ by definition. Here all derivatives are understood with respect to $x$. In order to obtain the second equality we have used the two relations $\partial_x \theta(x) = \delta(x)$ and $\big( \partial_x^2 \theta(x) \big) f(x) = -\delta(x) \, \partial_x f(x)$, the latter of which is easily shown by partial integration, as well as the fact that $\phi(x)$ obeys the Klein-Gordon equation. The last line then follows by employing the second equal-time commutation relation in (3.101).

¹⁸ Notice that the same result is obtained by applying the differential operator $(\partial^2 + m^2)$ directly to the expression in the last line of (3.119).

The retarded Green's function is useful in classical field theory if we know the initial value of some field configuration and want to figure out what it evolves into in the presence of a source, meaning that we want to know the solution to the inhomogeneous Klein-Gordon equation, $(\partial^2 + m^2) \phi(x) = J(x)$, for some fixed background function $J(x)$ acting as an external source. Similarly, one can define the advanced Green's function $D_A(x - y)$, which vanishes when $x^0 > y^0$ and is useful if we know the end point of a field configuration and want to figure out where it came from. The integration contour corresponding to the advanced Green's function is shown in Figure 3.4. You will get more familiar with the advanced Green's function in an exercise.

[Figure 3.4 omitted: complex $p^0$-plane with axes $\mathrm{Re}\,p^0$ and $\mathrm{Im}\,p^0$, poles at $p^0 = -E_{\mathbf p}$ and $p^0 = +E_{\mathbf p}$, and the contour closed at $p^0 \to +i\infty$.]

Figure 3.4: Integration contour for the advanced Green's function $D_A(x - y)$.

Feynman Propagator

In fact, the most important quantity in interacting field theory is neither the retarded nor the advanced Green's function but the Feynman propagator,

$$
D_F(x - y) = \langle 0 | T \phi(x) \phi(y) | 0 \rangle = \theta(x^0 - y^0) \, D(x - y) + \theta(y^0 - x^0) \, D(y - x) , \tag{3.123}
$$

where $T$ stands for time ordering, i.e., placing all operators evaluated at later times to the left, so that, e.g.,

$$
T \phi(x) \phi(y) = \theta(x^0 - y^0) \, \phi(x) \phi(y) + \theta(y^0 - x^0) \, \phi(y) \phi(x) . \tag{3.124}
$$

Given the similarity of (3.118) and (3.123), it does not come as a surprise that the Feynman propagator can be written as

$$
D_F(x - y) = \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m^2} \, e^{-ip(x-y)} . \tag{3.125}
$$

Again we distinguish the cases $x^0 > y^0$ and $y^0 > x^0$. In the former case, we perform the $p^0$ integration following the contour shown in Figure 3.5, which encloses the pole at $p^0 = +E_{\mathbf p}$ and contributes $-2\pi i/(2E_{\mathbf p})$, where the minus sign arises again since the path has a clockwise orientation.

[Figure 3.5 omitted: complex $p^0$-plane with axes $\mathrm{Re}\,p^0$ and $\mathrm{Im}\,p^0$, poles at $p^0 = -E_{\mathbf p}$ and $p^0 = +E_{\mathbf p}$, and the contour closed at $p^0 \to -i\infty$.]

Figure 3.5: Integration contour for the Feynman propagator $D_F(x - y)$ for $x^0 > y^0$. In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane.

Consequently, one obtains

$$
\begin{aligned}
D_F(x - y) &= \int \frac{d^3p}{(2\pi)^4} \, \frac{-2\pi i}{2E_{\mathbf p}} \; i \, e^{-iE_{\mathbf p}(x^0 - y^0) + i\mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_{\mathbf p}} \, e^{-ip(x-y)} = D(x - y) .
\end{aligned}
\tag{3.126}
$$

In contrast, in the case $y^0 > x^0$ one finds

$$
\begin{aligned}
D_F(x - y) &= \int \frac{d^3p}{(2\pi)^4} \, \frac{2\pi i}{(-2E_{\mathbf p})} \; i \, e^{iE_{\mathbf p}(x^0 - y^0) + i\mathbf{p} \cdot (\mathbf{x} - \mathbf{y})} \\
&= \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_{\mathbf p}} \, e^{-iE_{\mathbf p}(y^0 - x^0) - i\mathbf{p} \cdot (\mathbf{y} - \mathbf{x})} \\
&= \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_{\mathbf p}} \, e^{-ip(y-x)} = D(y - x) ,
\end{aligned}
\tag{3.127}
$$

where the integration along the real axis is chosen as in Figure 3.5, but the path is closed in the upper half-plane (due to the counter-clockwise orientation of the half-circle, the residue does not pick up a minus sign). To go from the second line in (3.127) to the third, we have flipped the sign of $\mathbf{p}$, which is valid since we integrate over $d^3p$ and all other quantities depend only on $\mathbf{p}^2$. Taken together, the latter two relations prove the equality of (3.123) and (3.125).

Like $D_R(x - y)$ and $D_A(x - y)$, the Feynman propagator is also a Green's function of the Klein-Gordon equation,

$$
\begin{aligned}
(\partial^2 + m^2) \, D_F(x - y) &= \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m^2} \, (-p^2 + m^2) \, e^{-ip(x-y)} \\
&= -i \int \frac{d^4p}{(2\pi)^4} \, e^{-ip(x-y)} = -i \delta^{(4)}(x - y) .
\end{aligned}
\tag{3.128}
$$

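[Figure 3.6 omitted: complex $p^0$-plane with axes $\mathrm{Re}\,p^0$ and $\mathrm{Im}\,p^0$; the pole near $-E_{\mathbf p}$ is shifted by $+i\epsilon$, the pole near $+E_{\mathbf p}$ is shifted by $-i\epsilon$, and the contour is closed at $p^0 \to -i\infty$.]

Figure 3.6: Schematic picture of the "$i\epsilon$" prescription for $x^0 > y^0$. In the case $y^0 > x^0$, the integration contour is closed in the upper half-plane.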

Notice that instead of specifying the contour, we may write the Feynman propagator as follows,

$$
D_F(x - y) = \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m^2 + i\epsilon} \, e^{-ip(x-y)} , \tag{3.129}
$$

with $\epsilon > 0$ and infinitesimal. As shown in Figure 3.6, this has the effect of shifting the poles slightly off the real $p^0$-axis, so that the integration along this axis is equivalent to the integration contour displayed in Figure 3.5. This way of writing $D_F(x - y)$ is, for obvious reasons, called the "$i\epsilon$" prescription.
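To see in which direction the poles move, one can solve $(p^0)^2 - E_{\mathbf p}^2 + i\epsilon = 0$ numerically. The following is a minimal sketch, with a function name chosen for illustration only; it relies on Python's `cmath.sqrt` returning the principal square root, so that the first root is the one near $+E_{\mathbf p}$.

```python
import cmath


def feynman_poles(energy, eps):
    """Roots of p0^2 - E^2 + i*eps = 0, i.e. p0 = +sqrt(E^2 - i*eps) and its negative."""
    root = cmath.sqrt(energy * energy - 1j * eps)  # principal root, real part > 0
    return root, -root


if __name__ == "__main__":
    near_plus, near_minus = feynman_poles(2.0, 1e-6)
    print("pole near +E:", near_plus)
    print("pole near -E:", near_minus)
```

For small $\epsilon$ the roots are approximately $\pm (E_{\mathbf p} - i\epsilon/(2E_{\mathbf p}))$: the pole near $+E_{\mathbf p}$ sits just below the real axis and the pole near $-E_{\mathbf p}$ just above it, in line with the shifts described for Figure 3.6.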

3.8 Non-Relativistic Limit

In order to study the non-relativistic limit of our theory, we return to the classical complex Klein-Gordon field $\varphi$ (for reasons that will become clear later on). We decompose it as¹⁹

$$
\varphi(x) = e^{-imt} \, \tilde\varphi(x) , \tag{3.130}
$$

to single out the large kinematical part of the momentum of $\varphi$. In terms of the new field $\tilde\varphi$, the Klein-Gordon equation reads

$$
(\partial^2 + m^2) \varphi = \left[ \partial_t^2 - \nabla^2 + m^2 \right] e^{-imt} \tilde\varphi = e^{-imt} \left[ \ddot{\tilde\varphi} - 2im \, \dot{\tilde\varphi} - \nabla^2 \tilde\varphi \right] = 0 , \tag{3.131}
$$

where the explicit $m^2$ term cancelled against a term from the time derivatives. The non-relativistic limit is $m \gg |\mathbf{p}|$, which after a Fourier transform is equivalent to saying that $|\ddot{\tilde\varphi}| \ll m |\dot{\tilde\varphi}|$. We are hence allowed to neglect the $\ddot{\tilde\varphi}$ term in (3.131), so that the Klein-Gordon equation in the limit $m \to \infty$ becomes

$$
i \frac{d}{dt} \tilde\varphi = -\frac{1}{2m} \nabla^2 \tilde\varphi . \tag{3.132}
$$

¹⁹ The exponential factor removes the large frequency part from the $x$-dependence in $\varphi$. Consequently, the $x$-dependence of $\tilde\varphi$ is only governed by the small residual momentum, and derivatives acting on $\tilde\varphi$ are suppressed by powers of $1/m$. This way of decomposing a field is often the starting point for the construction of an effective field theory that captures the physics of the full theory in the kinematical limit $m \gg |\mathbf{p}|$. The most well-known example of such a theory in particle physics is heavy quark effective theory.

This looks very similar to the Schrödinger equation for a non-relativistic free particle of mass $m$, except that it does not have any probability interpretation. It is simply a classical field evolving through an equation that is first order in time derivatives.

It is also worthwhile to consider the Lagrangian of the complex scalar field $\varphi$ itself and to investigate what happens to (3.90) in the non-relativistic limit. We again take the limit $|\ddot{\tilde\varphi}| \ll m |\dot{\tilde\varphi}|$, and obtain after a straightforward calculation (where in the last step we have divided by $2m$),

$$
\mathcal{L} = i \tilde\varphi^* \dot{\tilde\varphi} - \frac{1}{2m} (\nabla \tilde\varphi^*) \cdot (\nabla \tilde\varphi) . \tag{3.133}
$$

This Lagrangian has a conserved current related to its invariance under the global phase transformation $\tilde\varphi \to e^{i\alpha} \tilde\varphi$. Employing Noether's theorem (2.17), we find that the conserved current takes the form

$$
J^\mu = \left( \tilde\varphi^* \tilde\varphi , \; \frac{i}{2m} \left[ \tilde\varphi \nabla \tilde\varphi^* - \tilde\varphi^* \nabla \tilde\varphi \right] \right) . \tag{3.134}
$$

To get the Hamiltonian we compute the conjugate momentum

$$
\pi = \frac{\partial \mathcal{L}}{\partial \dot{\tilde\varphi}} = i \tilde\varphi^* , \tag{3.135}
$$

which does not contain a time derivative. This looks a little disconcerting, but it is fully consistent for a theory which is first order in time derivatives. In order to determine the full trajectory of the field, we only need to specify initial conditions for $\tilde\varphi$ and $\tilde\varphi^*$ at some point in time, say $t = 0$ (knowing the time derivatives on the initial slice is not necessary). Since the Lagrangian (3.133) already contains a term "$p \dot q$" (and not the usual "$1/2 \, p \dot q$"), the time derivatives drop out when one computes the Hamiltonian,

$$
\mathcal{H} = \frac{1}{2m} (\nabla \tilde\varphi^*) \cdot (\nabla \tilde\varphi) . \tag{3.136}
$$

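In order to quantize the system, we impose in the Schrödinger picture,

$$
[\tilde\varphi(\mathbf{x}), \tilde\varphi(\mathbf{y})] = [\tilde\varphi^*(\mathbf{x}), \tilde\varphi^*(\mathbf{y})] = 0 , \qquad
[\tilde\varphi(\mathbf{x}), \tilde\varphi^*(\mathbf{y})] = \delta^{(3)}(\mathbf{x} - \mathbf{y}) , \tag{3.137}
$$

and expand the field into its Fourier components,

$$
\tilde\varphi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, a_{\mathbf p} \, e^{i\mathbf{p} \cdot \mathbf{x}} . \tag{3.138}
$$

Inserting this into the commutation relations (3.137) leads to

$$
[a_{\mathbf p}, a_{\mathbf q}^\dagger] = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{q}) , \tag{3.139}
$$

where the trivial expressions have been skipped. As usual the vacuum satisfies $a_{\mathbf p} |0\rangle = 0$, and the excitations are $a_{\mathbf{p}_1}^\dagger \ldots a_{\mathbf{p}_n}^\dagger |0\rangle$. The one-particle states $|\mathbf{p}\rangle = a_{\mathbf p}^\dagger |0\rangle$ have energy

$$
H |\mathbf{p}\rangle = \frac{\mathbf{p}^2}{2m} |\mathbf{p}\rangle , \tag{3.140}
$$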

which is the non-relativistic dispersion relation. From the above, we conclude that quantizing the first-order Lagrangian (3.133) gives rise to non-relativistic particles of mass $m$.

Some comments seem to be in order. Notice that we have a complex field but only a single type of particle. The antiparticle is not in the spectrum. The existence of antiparticles is a consequence of relativity. A related fact is that the conserved charge $Q = \int d^3x \, {:}\tilde\varphi^* \tilde\varphi{:}$ is the particle number. This remains conserved even if we include interactions in the Lagrangian of the form $(\tilde\varphi^* \tilde\varphi)^2$ etc., which are invariant under a global phase rotation. So in non-relativistic theories, particle number is conserved. It is only with relativity, and the appearance of antiparticles, that particle number can change. Finally, there is no non-relativistic limit of a real scalar field. In the relativistic theory, the particles are their own antiparticles, and there is no way to construct a multiparticle theory that conserves particle number.

Recovering QM

In QM, we talk about the position and momentum operators $X$ and $P$. On the other hand, as we saw below (2.1), in QFT position is relegated to a label. How do we get back to good old QM? We already have the operator for the total momentum of the field, namely (3.46). When acting on a single-particle state, it gives $P |\mathbf{p}\rangle = \mathbf{p} \, |\mathbf{p}\rangle$. It is also not too difficult to write down the position operator $X$. Let's do it in the non-relativistic limit. In this case the operator

$$
\tilde\varphi^*(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, a_{\mathbf p}^\dagger \, e^{-i\mathbf{p} \cdot \mathbf{x}} , \tag{3.141}
$$

creates a particle localized with a delta function at $\mathbf{x}$. We hence write $|\mathbf{x}\rangle = \tilde\varphi^*(\mathbf{x}) |0\rangle$. It is now natural to define the position operator $X$ as

$$
X = \int d^3x \; \mathbf{x} \, \tilde\varphi^*(\mathbf{x}) \, \tilde\varphi(\mathbf{x}) , \tag{3.142}
$$

since it has the sought property,

$$
\begin{aligned}
X |\mathbf{x}\rangle &= \int d^3y \; \mathbf{y} \, \tilde\varphi^*(\mathbf{y}) \, \tilde\varphi(\mathbf{y}) \, \tilde\varphi^*(\mathbf{x}) |0\rangle \\
&= \int d^3y \; \mathbf{y} \, \tilde\varphi^*(\mathbf{y}) \left( \delta^{(3)}(\mathbf{y} - \mathbf{x}) + \tilde\varphi^*(\mathbf{x}) \, \tilde\varphi(\mathbf{y}) \right) |0\rangle = \mathbf{x} \, |\mathbf{x}\rangle ,
\end{aligned}
\tag{3.143}
$$

where the second term in the bracket, which follows from the commutator in (3.137), does not contribute because $\tilde\varphi(\mathbf{y})$ annihilates the vacuum. We can now construct a state $|\chi\rangle$ by taking a superposition of the one-particle states $|\mathbf{x}\rangle$,

$$
|\chi\rangle = \int d^3x \; \chi(\mathbf{x}) \, |\mathbf{x}\rangle . \tag{3.144}
$$


Notice that the weight function $\chi(\mathbf{x})$ is what we would usually call the Schrödinger wavefunction (in the position representation). Let's make sure that it indeed has the right properties. First, it is clear that as far as $X$ is concerned it behaves correctly, namely

$$
X |\chi\rangle = \int d^3x \; \mathbf{x} \, \chi(\mathbf{x}) \, |\mathbf{x}\rangle . \tag{3.145}
$$

What about the momentum operator $P$? A straightforward calculation gives

$$
\begin{aligned}
P |\chi\rangle &= \int \frac{d^3x \, d^3p}{(2\pi)^3} \; \mathbf{p} \, a_{\mathbf p}^\dagger a_{\mathbf p} \, \chi(\mathbf{x}) \, \tilde\varphi^*(\mathbf{x}) |0\rangle
= \int \frac{d^3x \, d^3p}{(2\pi)^3} \; \mathbf{p} \, a_{\mathbf p}^\dagger \, e^{-i\mathbf{p} \cdot \mathbf{x}} \, \chi(\mathbf{x}) |0\rangle \\
&= \int \frac{d^3x \, d^3p}{(2\pi)^3} \; a_{\mathbf p}^\dagger \left( i \nabla e^{-i\mathbf{p} \cdot \mathbf{x}} \right) \chi(\mathbf{x}) |0\rangle
= \int \frac{d^3x \, d^3p}{(2\pi)^3} \; e^{-i\mathbf{p} \cdot \mathbf{x}} \left( -i \nabla \chi(\mathbf{x}) \right) a_{\mathbf p}^\dagger |0\rangle \\
&= \int d^3x \; \left( -i \nabla \chi(\mathbf{x}) \right) |\mathbf{x}\rangle .
\end{aligned}
\tag{3.146}
$$

This tells us that $P$ acts as the familiar derivative on wavefunctions $\chi(\mathbf{x})$. To obtain the final result in (3.146), we have used in a first step the relation $[a_{\mathbf p}, \tilde\varphi^*(\mathbf{x})] = e^{-i\mathbf{p} \cdot \mathbf{x}}$, which can be easily checked, and later integrated by parts. We learn that when acting on one-particle states, the operators $X$ and $P$ act as position and momentum operators in QM, with $[X^i, P^j] \, |\chi\rangle = i \delta^{ij} |\chi\rangle$ and $i, j = 1, 2, 3$.

But what about dynamics? In particular, how does our wavefunction $\chi(\mathbf{x})$ change with time? To address this question, we first express the Hamiltonian corresponding to the density (3.136) through ladder operators,

$$
H = \int d^3x \; \frac{1}{2m} (\nabla \tilde\varphi^*) \cdot (\nabla \tilde\varphi) = \int \frac{d^3p}{(2\pi)^3} \, \frac{\mathbf{p}^2}{2m} \, a_{\mathbf p}^\dagger a_{\mathbf p} , \tag{3.147}
$$

which implies that

$$
i \frac{d}{dt} \chi = -\frac{1}{2m} \nabla^2 \chi , \tag{3.148}
$$

which formally looks exactly like the time evolution of the original field $\tilde\varphi$ given in (3.132). Yet this time, it is really the Schrödinger equation, complete with the usual probabilistic interpretation for the wavefunction $\chi$ (and not just a first-order differential equation). Note in particular that the conserved charge arising from the current (3.134) is $Q = \int d^3x \, |\chi(\mathbf{x})|^2$, which is the total probability.

Historically, the fact that the equation for the classical field (3.132) and the one-particle wavefunction (3.148) coincide caused some confusion. It was thought that perhaps one is quantizing the wavefunction itself, and the resulting name "second quantization" is still sometimes used today to mean QFT. However, it is important to stress that, despite the name, nothing is quantized twice. One simply quantizes a classical field once. Nonetheless, it is good to know that, if one treats the one-particle Schrödinger equation as a quantum field, then it will give the correct generalization to multiparticle states.
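To get a feeling for how quickly the non-relativistic limit of this section sets in, the following minimal sketch compares the relativistic kinetic energy $E_{\mathbf p} - m$ with the dispersion relation $\mathbf{p}^2/(2m)$ of (3.140). It also evaluates the ratio $|\ddot{\tilde\varphi}| / (m |\dot{\tilde\varphi}|)$ for a single plane wave, under the assumption that $\tilde\varphi \propto e^{-i(E_{\mathbf p} - m)t}$, so that the ratio equals $(E_{\mathbf p} - m)/m$. All function names are illustrative.

```python
import math


def kinetic_energy_relativistic(p, m):
    """E_p - m with E_p = sqrt(p^2 + m^2)."""
    return math.sqrt(p * p + m * m) - m


def kinetic_energy_nonrelativistic(p, m):
    """The dispersion relation p^2 / (2m) of (3.140)."""
    return p * p / (2.0 * m)


def relative_error(p, m):
    """Relative deviation of p^2/(2m) from E_p - m."""
    exact = kinetic_energy_relativistic(p, m)
    return abs(exact - kinetic_energy_nonrelativistic(p, m)) / exact


def second_derivative_ratio(p, m):
    """|d^2/dt^2| / (m |d/dt|) acting on exp(-i (E_p - m) t): equals (E_p - m) / m."""
    return kinetic_energy_relativistic(p, m) / m


if __name__ == "__main__":
    for p in (0.3, 0.1, 0.01):
        print(p, relative_error(p, 1.0), second_derivative_ratio(p, 1.0))
```

Expanding the square root shows that the relative error behaves like $\mathbf{p}^2/(4m^2)$ for $|\mathbf{p}| \ll m$, so already at $|\mathbf{p}|/m = 0.1$ the two dispersion relations differ only at the level of a few per mille.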

3.9 Problems

i) Consider the Klein-Gordon equation with the mass term set equal to zero and a dilatation transformation with parameter $\alpha$,

$$
x^\mu \to x'^\mu = e^{\alpha} x^\mu , \qquad \phi(x) \to \phi'(x') = \phi(x) \, e^{-d_\phi \alpha} . \tag{3.149}
$$

Show that this transformation is a global symmetry of $\mathcal{L}$ if one chooses the scaling dimension $d_\phi$ in an appropriate way. Compute the associated Noether current and verify that it is conserved. Is the symmetry preserved if you add a quartic term $-\lambda/(4!) \, \phi^4$ to the Lagrangian? What happens if you add a mass term $-m^2 \phi^2$?

ii) Consider the Lagrangian

$$
\mathcal{L} = \frac{1}{2} \left( \partial_\mu \Phi \, \partial^\mu \Phi - M^2 \Phi^2 \right) + \frac{1}{2} \left( \partial_\mu \varphi \, \partial^\mu \varphi - m^2 \varphi^2 \right) - \frac{\lambda}{2} \, \Phi \varphi^2 , \tag{3.150}
$$

with $m \ll M$ and derive the EOMs for the fields $\Phi$ and $\varphi$. Express the heavy field $\Phi$ through the light field $\varphi$ and insert it back into the Lagrangian. Expand your result in $1/M^2$. What has changed compared to the original Lagrangian? Up to which energy scale would you trust the predictions of this effective Lagrangian?

iii) Show that the first relation in (3.2) is satisfied if $[a_{\mathbf p}, a_{\mathbf q}] = [a_{\mathbf p}^\dagger, a_{\mathbf q}^\dagger] = 0$ holds. Prove the commutation relations (3.43). Calculate $J^i |\mathbf{p} = 0\rangle$ with $J^i$ defined in (3.48). Show that the number operator $N$ defined in (3.53) commutes with the Hamiltonian $H$ of (3.27) and satisfies $N |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = n \, |\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$, where $|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$ denotes the $n$-particle state introduced in (3.50).

iv) Consider a real scalar field $\phi(t, x)$ living on a two-dimensional space-time and defined on an interval $x \in [0, L]$ with Dirichlet BCs $\phi(t, 0) = \phi(t, L) = 0$. Show that the (classical) positive- and negative-frequency solutions to the Klein-Gordon equation that also satisfy the BCs have the form

$$
\phi_n^{(\pm)}(t, x) = \frac{1}{\sqrt{\omega_n L}} \, e^{\pm i \omega_n t} \sin(k_n x) . \tag{3.151}
$$

Give the expression for $k_n$ in terms of $L$. How is $\omega_n$ related to $k_n$? We now quantize the field $\phi(t, x)$, keeping in mind that momentum here is discretized, i.e.,

$$
\phi(t, x) = \sum_{n=1}^{\infty} \left[ \phi_n^{(-)}(t, x) \, a_n + \phi_n^{(+)}(t, x) \, a_n^\dagger \right] , \tag{3.152}
$$

with the ladder operators satisfying $[a_n, a_m] = [a_n^\dagger, a_m^\dagger] = 0$ and $[a_n, a_m^\dagger] = \delta_{mn}$. Compute the VEV $\langle 0 | \mathcal{H} | 0 \rangle$ of the Hamiltonian density

$$
\mathcal{H} = \frac{1}{2} \left[ \dot\phi^2 + (\partial_x \phi)^2 + m^2 \phi^2 \right] . \tag{3.153}
$$

Integrate your result over the interval $[0, L]$ and show that the total vacuum energy is

$$
E_0(L) = \frac{1}{2} \sum_{n=1}^{\infty} \omega_n . \tag{3.154}
$$

Since this quantity is infinite, we need some form of regularization in order to handle the divergence. Let us introduce an exponentially damping function $\exp(-\delta \omega_n)$ with $\delta > 0$ in the sum, and consider for simplicity the case of a massless field. Prove that in this case the vacuum energy can be written as

$$
E_0(L, \delta) = \frac{\pi}{8L} \, \sinh^{-2} \left( \frac{\delta \pi}{2L} \right) . \tag{3.155}
$$

Take the limit $\delta \to 0$ and determine the vacuum energy for the case when no BCs are imposed. With all this at hand, calculate the Casimir force.

v) Derive the charge operator $Q$ for the $U(1)$ invariant Lagrangian (3.90) using infinitesimal transformations. Show that the result expressed through creation and annihilation operators takes the form of (3.95). Prove that $Q$ satisfies (3.85). Verify that the charge is conserved (via $[H, Q] = 0$). Are the operators $N_\pm$ as defined in (3.96) conserved as well? What happens if you add an interaction term

$$
\Delta \mathcal{L} = -\frac{\lambda}{4!} \, (\varphi^* \varphi)^2 , \tag{3.156}
$$

to the Lagrangian? What does this result imply for the case in which particles are their own antiparticles?

vi) Consider a theory with two complex scalar fields $\varphi_1$ and $\varphi_2$. Write down all possible terms of the Lagrangian which are Lorentz invariant and renormalizable, i.e., have mass dimension of four and couplings with non-negative mass dimensions. Which terms survive if one additionally requires the Lagrangian to remain invariant under the discrete symmetries

$$
(\varphi_1, \varphi_2) \to (-\varphi_1, \varphi_2) , \tag{3.157}
$$

$$
(\varphi_1, \varphi_2) \to (\varphi_1, -\varphi_2) \; ? \tag{3.158}
$$

Assume further that both fields have the same mass, $m_1 = m_2$, and that all dimensionless couplings are identical. You can now rewrite the Lagrangian in a more economic way if you introduce the scalar doublet

$$
\Phi = \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix} , \tag{3.159}
$$

and its hermitian conjugate. The theory at hand has four conserved global charges. One charge follows from the $U(1)$ invariance,

$$
\Phi \to e^{i\alpha} \Phi , \qquad \delta\Phi = i\alpha \Phi , \tag{3.160}
$$

of the theory, which is already present in the case of a single complex scalar. The other three charges correspond to the mixing of the scalar fields under an $SU(2)$ transformation,

$$
\Phi \to e^{i\alpha_j \tau^j} \Phi , \qquad \delta\Phi = i\alpha_j \tau^j \Phi , \tag{3.161}
$$

where the indices $j = 1, 2, 3$ are summed over and $\tau^j = \sigma^j/2$ with $\sigma^j$ being the usual Pauli matrices. Compute the four conserved charges using Noether's theorem. For the $SU(2)$ charges you should find

$$
Q^j = \int d^3x \; i \left( \varphi_a^* (\tau^j)_{ab} \, \pi_b^* - \pi_a (\tau^j)_{ab} \, \varphi_b \right) , \tag{3.162}
$$

where $a, b = 1, 2$ are field labels. Show further that the latter charges fulfill the $SU(2)$ commutation relation,

$$
[Q^j, Q^k] = i \epsilon^{jkl} Q^l . \tag{3.163}
$$

What symmetries survive if you allow for different masses and dimensionless couplings?

vii) Consider the Lagrangian for a free complex scalar field (3.90), which is invariant under global $U(1)$ transformations (3.92). Is this Lagrangian invariant if the global symmetry gets promoted to a local symmetry, i.e., $\alpha \to e \, \alpha(x)$, where $e$ is just a universal constant and $\alpha(x)$ a function of space-time? If you now add a vector field $A_\mu$ to the Lagrangian with a coupling

$$
\mathcal{L}_{A\varphi} = i\lambda \left( \varphi^* (\partial_\mu \varphi) - \varphi (\partial_\mu \varphi^*) \right) A^\mu + \lambda^2 (A_\mu \varphi^*)(A^\mu \varphi) , \tag{3.164}
$$

how do the vector field and the coupling constant $\lambda$ have to transform under the local $U(1)$, if the Lagrangian $\mathcal{L} + \mathcal{L}_{A\varphi}$ should remain invariant under phase redefinitions? Compute the Noether current for this local symmetry. Add a kinetic term for the vector field to the Lagrangian,

$$
\mathcal{L}_A = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} , \tag{3.165}
$$

and derive the EOMs for the field $A_\mu$ considering the full Lagrangian $\mathcal{L}_\varphi + \mathcal{L}_{A\varphi} + \mathcal{L}_A$.

viii) Compute the advanced Green's function $D_A(x - y)$ for the Klein-Gordon equation using the integration contour in Figure 3.4. Recall which initial conditions one assumes in electrodynamics and why they lead to the use of the retarded Green's function. Can you imagine physical BCs in which the advanced propagator would be the right choice? Prove and explain in this context the following two relations

$$
D_R(-x) = D_A(x) , \qquad D_F(-x) = D_F(x) . \tag{3.166}
$$

ix) Explicitly perform the steps that lead to the Lagrangian (3.133). Show that the corresponding EOM (vary with respect to $\tilde\varphi^*$) is the Schrödinger equation. The Lagrangian has a global $U(1)$ symmetry, $\tilde\varphi \to e^{i\alpha} \tilde\varphi$. Verify the correctness of the expression for the Noether current (3.134) and discuss the physical meaning of the conserved charge. Based on your findings, give a reason why there is no non-relativistic limit for a real scalar field.

x) Prove that the position operator $X$ given in (3.142) satisfies $X |\mathbf{x}\rangle = \mathbf{x} \, |\mathbf{x}\rangle$. Furthermore, show that (3.144), (3.146), and (3.148) are correct.
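For problem iv), it can be reassuring to test the closed form (3.155) numerically before attempting the proof. The sketch below sums the regularized series directly and compares it with (3.155). It assumes the massless Dirichlet spectrum $\omega_n = n\pi/L$, which is part of what the problem asks you to derive, and a finite upper limit `n_max` that is an illustrative choice (the damping factor makes the neglected tail negligible for the values used here).

```python
import math


def regulated_vacuum_energy(length, delta, n_max=5000):
    """(1/2) * sum_n omega_n * exp(-delta * omega_n), assuming omega_n = n*pi/L."""
    total = 0.0
    for n in range(1, n_max + 1):
        omega = n * math.pi / length
        total += 0.5 * omega * math.exp(-delta * omega)
    return total


def closed_form(length, delta):
    """Right-hand side of (3.155): pi/(8L) * sinh^{-2}(delta*pi/(2L))."""
    return math.pi / (8.0 * length) / math.sinh(delta * math.pi / (2.0 * length)) ** 2


if __name__ == "__main__":
    for delta in (0.2, 0.1, 0.05):
        print(delta, regulated_vacuum_energy(1.0, delta), closed_form(1.0, delta))
```

Agreement of the two columns does not replace the analytic proof, but it catches factor-of-two mistakes early. Watching how the numbers grow as $\delta$ decreases also hints at the structure of the divergence that has to be subtracted.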



4 Interacting Fields

Often in QM, we are interested in particles moving in some fixed background potential $V(\mathbf{x})$. This can be easily incorporated into field theory by working with a Lagrangian with explicit $\mathbf{x}$ dependence. E.g., in the case of our non-relativistic complex scalar field $\tilde\varphi$ discussed in Section 3.8, we could simply add a term

$$
\Delta \mathcal{L} = -V(\mathbf{x}) \, \tilde\varphi^*(x) \, \tilde\varphi(x) , \tag{4.1}
$$

to the Lagrangian (3.133). Since this interaction does not respect translational symmetry, there is no associated conserved energy-momentum tensor. While such Lagrangians are useful in condensed matter physics, we rarely (or never) come across them in high-energy physics, where all equations obey translational (and Lorentz) invariance.

One can of course also consider interactions between particles. Obviously, these are only important for $n$-particle states with $n \geq 2$. We therefore expect them to arise from additions to the Lagrangian (3.133) of the form

$$
\Delta \mathcal{L} = \tilde\varphi^*(x) \, \tilde\varphi^*(x) \, \tilde\varphi(x) \, \tilde\varphi(x) , \tag{4.2}
$$

which, in QFT, is an operator that destroys two particles before creating two new ones. Such terms in the Lagrangian will indeed lead to inter-particle forces, both in the non-relativistic and relativistic setting. In the following, we will explore these types of interactions in detail for relativistic theories.

4.1 Classification of Interactions

The free QFTs we have discussed so far are special. We can determine their spectrum, but, as their name suggests, they are dull, since nothing happens. They have particle excitations, but these do not interact. To make things more interesting (i.e., more complicated) let us include interactions in our theory. These will take the form of higher-order terms in the Lagrangian. We start by asking what kind of small perturbations we can add to the theory. E.g., let us consider the Lagrangian for a real scalar field (2.45) and add the infinite tower of additional terms

$$
\Delta \mathcal{L} = -\sum_{n=3}^{\infty} \Delta \mathcal{L}_n , \qquad \Delta \mathcal{L}_n = \frac{\lambda_n}{n!} \, \phi^n , \tag{4.3}
$$

to it. Here the coefficients $\lambda_n$ are called coupling constants. The first question that we have to address is which restrictions the coupling constants have to satisfy in order for the additional terms to be small perturbations. Naively one would think that one simply has to require that "$\lambda_n \ll 1$". But this turns out to be not quite right. In order to see why the naive guess is not correct, we perform a dimensional analysis. Applying the rules gathered in Section 2.1, we find that the dimensions of the coupling constants are

$$
[\lambda_n] = 4 - n . \tag{4.4}
$$

This result makes clear why we cannot simply say "$\lambda_n \ll 1$", because this statement is only sensible for dimensionless quantities, but not dimensionful ones. The interaction terms in (4.3) fall into three different categories. First, dimension-three operators with $[\lambda_3] = 1$. For such terms, we can define a dimensionless parameter $\lambda_3/E$, where $E$ has dimension of mass and represents the energy scale of the process of interest. This means that $\Delta \mathcal{L}_3 = \lambda_3 \phi^3/(3!)$ is a small perturbation at high energies, i.e., $E \gg \lambda_3$, but a big one at low energies, i.e., $E \ll \lambda_3$. Such terms are called relevant, because they are most relevant at low energies which, after all, is where most of the physics that we experience lies. In a relativistic QFT, we have $E > m$, which means that we can always make this sort of perturbation small by taking $\lambda_3 \ll m$. Second, terms of dimension four with $[\lambda_4] = 0$, e.g., $\Delta \mathcal{L}_4 = \lambda_4 \phi^4/(4!)$. Such terms are small if $\lambda_4 \ll 1$ and are called marginal. Third, operators with dimension higher than four, having $[\lambda_n] < 0$. In this case the appropriate dimensionless parameter is $\lambda_n E^{n-4}$, and terms $\Delta \mathcal{L}_n = \lambda_n \phi^n/(n!)$ with $n \geq 5$ are small (large) at low (high) energies. Such contributions are called irrelevant, since in daily life, meaning $\lambda_n E^{n-4} \ll 1$, these operators do not matter. As we will see later, it is typically impossible to avoid high-energy processes in QFT. We have already seen a glimpse of this feature when we were discussing the structure of the vacuum in Section 3.2, which involved the calculation of an integral over infinitely large frequencies of a harmonic oscillator. We hence might expect problems with irrelevant operators that become important at high energies. Indeed, these operators lead to non-renormalizable QFTs in which one cannot make sense of the infinities at arbitrarily high energies.
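The three-way classification above amounts to simple bookkeeping, which can be summarized in a short sketch. It uses only the mass dimension $[\lambda_n] = 4 - n$ from (4.4) and the dimensionless combination $\lambda_n E^{n-4}$; the function names and the numerical values in the usage lines are illustrative and assume natural units in which $\lambda_n$ and $E$ are given in the same mass unit.

```python
def coupling_dimension(n):
    """Mass dimension [lambda_n] = 4 - n of the coupling of phi^n in four dimensions."""
    return 4 - n


def classify(n):
    """Label the operator lambda_n * phi^n / n! as relevant, marginal or irrelevant."""
    dim = coupling_dimension(n)
    if dim > 0:
        return "relevant"
    if dim == 0:
        return "marginal"
    return "irrelevant"


def dimensionless_strength(lambda_n, n, energy):
    """The combination lambda_n * E^(n-4) that has to be small for a weak perturbation."""
    return lambda_n * energy ** (n - 4)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, classify(n), dimensionless_strength(1.0, n, 10.0))
```

With $\lambda_n = 1$ in the chosen units and $E = 10$, the cubic term has strength $0.1$, the quartic term $1$, and the $\phi^6$ term $100$, which illustrates how relevant operators fade and irrelevant operators grow as the energy increases.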
This does not mean that these theories are useless; it just means that they become incomplete at some energy scale and need to be embedded into an appropriate complete theory, aka a UV completion. Let me also add that the above naive assignment of relevant, marginal, and irrelevant operators is not always carved in stone, since quantum corrections can sometimes change the character of an operator.

Low-Energy Description

In typical applications of QFT only the relevant and marginal couplings are important. This is due to the fact that the irrelevant couplings become small at low energies, as we have seen above. In practice this saves us, since instead of considering the infinite number of interaction terms in (4.3), only a handful are actually needed. E.g., in the case of the real scalar field $\phi$ described earlier, we only have to take into account two operators, namely $\Delta \mathcal{L}_3 = \lambda_3 \phi^3/(3!)$ and $\Delta \mathcal{L}_4 = \lambda_4 \phi^4/(4!)$, in the low-energy limit.

Let us have a closer look at this issue. Suppose that some day we discover the true superduper theory, aka the TOE, that describes the world at very high energy scales, say the GUT scale, or, if you wish, even the Planck scale. Whatever this scale is, let's call it $\Lambda$. Since it is an energy scale, we obviously have $[\Lambda] = 1$. What we want to understand are the laws of physics at energy scales $E$ that we can probe directly in a laboratory, which, given today's standards, means $E \ll \Lambda$. Let us further suppose that at energies of order $E$, the laws of physics are described by a real scalar field.²⁰ This scalar field will have some complicated interaction terms (4.3), where the precise form is dictated by all the stuff that is going on in the TOE. Can we get an idea about the interactions? Well, we can write our dimensionful coupling constants $\lambda_n$ in terms of dimensionless couplings $g_n$, multiplied by a suitable power of the relevant scale $\Lambda$,

$$
\lambda_n = \frac{g_n}{\Lambda^{n-4}} . \tag{4.5}
$$

²⁰ Of course, we know that this assumption is plain wrong, since the SM is a non-abelian gauge theory with chiral fermions, but the same argument applies in that case.

The exact values of the dimensionless couplings $g_n$ depend on the details of the TOE,²¹ so we have to do some guesswork. Since the couplings $g_n$ are dimensionless, 1 looks like a pretty good and somehow natural guess. Since we are not completely sure, let's say $g_n = O(1)$. This means that in a laboratory with $E \ll \Lambda$ the interaction terms $\Delta \mathcal{L}_n = \lambda_n \phi^n/(n!)$ of (4.3) will be suppressed by a factor $(E/\Lambda)^{n-4}$ if $n \geq 5$. Given the LHC energy of around 1 TeV, this is a suppression by many orders of magnitude. E.g., for $\Lambda = M_P$ one has $E/\Lambda = 10^{-16}$. It is this simple argument based on dimensional analysis that ensures that we need to focus only on the first few terms in the interaction, namely those that are relevant and marginal. It also means that if we only have access to low-energy experiments, it is going to be very difficult to figure out the precise nature of the TOE, because its effects are highly diluted except for the relevant and marginal interactions. Some people therefore call the superduper theory that everybody is looking for, not TOE, but TOENAIL, which stands for "theory of everything not accessible in laboratories". The discussion given above is a poor man's version of the ideas of effective field theory and Wilson's renormalization group, about which you can learn much more by asking Matthias Neubert.

²¹ If we knew the precise structure of the TOE we could, in fact, calculate the couplings $g_n$.

Weakly Coupled Theories

In this course we will only deal with weakly coupled QFTs, i.e., theories that can be truly considered as small perturbations of the free field theory at all energies. We will look in more detail at two specific examples. The first example of a weakly coupled QFT we will study is the $\phi^4$ theory,

$$
\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} m^2 \phi^2 - \frac{\lambda}{4!} \, \phi^4 , \tag{4.6}
$$

where $\phi$ is our well-known real scalar field. For (4.6) to be weakly coupled we have to require $\lambda \ll 1$. We can get a hint for what the effects of the additional $\phi^4$ term will be. Expanding it in terms of ladder operators, we find terms like

$$
a_{\mathbf p}^\dagger a_{\mathbf p}^\dagger a_{\mathbf p}^\dagger a_{\mathbf p}^\dagger , \qquad a_{\mathbf p}^\dagger a_{\mathbf p}^\dagger a_{\mathbf p}^\dagger a_{\mathbf p} , \tag{4.7}
$$

etc., which create and destroy particles. This signals that the $\phi^4$ Lagrangian (4.6) describes a theory in which particle number is not conserved. In fact, it is not too difficult to check that the number operator $N$ does not commute with the Hamiltonian, i.e., $[H, N] \neq 0$.

The second example we will look at is a scalar Yukawa theory. Its Lagrangian is given by

$$
\mathcal{L} = (\partial_\mu \varphi^*)(\partial^\mu \varphi) + \frac{1}{2} (\partial_\mu \phi)^2 - M^2 \varphi^* \varphi - \frac{1}{2} m^2 \phi^2 - g \, \varphi^* \varphi \, \phi , \tag{4.8}
$$

with $g \ll M, m$. This theory couples a complex scalar $\varphi$ to a real scalar $\phi$. In this theory the individual particle numbers for $\varphi$ and $\phi$ are not conserved. Yet, the Lagrangian (4.8) is invariant under global phase rotations of $\varphi$, which ensures that there will be a conserved charge $Q$ obeying $[H, Q] = 0$. In fact, we have met this charge already in (3.95). In consequence, in the scalar Yukawa theory the number of $\varphi$ particles minus the number of $\varphi$ antiparticles is conserved. Notice also that the potential in (4.8) has a stable minimum at $\varphi = \phi = 0$, but it is unbounded from below if $-g\phi$ becomes too large. This means that we should not mess too much with the scalar Yukawa theory.
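Returning briefly to the estimate based on (4.5): the size of the suppression $(E/\Lambda)^{n-4}$ of irrelevant operators is easy to tabulate. The sketch below assumes $g_n = 1$, an energy $E = 10^3$ GeV, and a rounded Planck scale $\Lambda = 10^{19}$ GeV, chosen to match the ratio $E/\Lambda = 10^{-16}$ quoted in the text; these numbers and the function name are illustrative.

```python
import math


def suppression_factor(n, energy_gev, cutoff_gev):
    """(E / Lambda)^(n - 4) for the operator phi^n, assuming g_n = 1 in (4.5)."""
    return (energy_gev / cutoff_gev) ** (n - 4)


if __name__ == "__main__":
    energy = 1.0e3    # roughly the LHC energy quoted in the text, in GeV
    cutoff = 1.0e19   # rounded Planck scale in GeV (assumption)
    for n in (4, 5, 6, 8):
        factor = suppression_factor(n, energy, cutoff)
        print(n, factor, round(math.log10(factor)))
```

Each additional power of $\phi$ costs another sixteen orders of magnitude under these assumptions, which is why only the relevant and marginal terms are visible in low-energy experiments.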

4.2 Interaction Picture

In QM, there is a useful viewpoint called the interaction picture, which allows one to deal with small perturbations to a well-understood Hamiltonian. Let me briefly recall how this works. In the Schrödinger picture, the states evolve as $i \, d/dt \, |\psi\rangle_S = H |\psi\rangle_S$, while the operators $O_S$ are time independent. In contrast, in the Heisenberg picture the states do not evolve with time, but the operators change with time, namely one has $|\psi\rangle_H = e^{iHt} |\psi\rangle_S$ and $O_H = e^{iHt} O_S \, e^{-iHt}$. The interaction picture is a hybrid of the two. We split the Hamiltonian as

$$
H = H_0 + H_{\rm int} , \tag{4.9}
$$

where in the interaction picture the time dependence of operators $O_I$ is governed by $H_0$, while the time dependence of the states $|\psi\rangle_I$ is governed by $H_{\rm int}$. While this split is arbitrary, things are easiest if one is able to solve the Hamiltonian $H_0$, e.g., if $H_0$ is the Hamiltonian of a free theory. From what I have said so far, it follows that

$$
|\psi\rangle_I = e^{iH_0 t} |\psi\rangle_S , \qquad O_I = e^{iH_0 t} O_S \, e^{-iH_0 t} . \tag{4.10}
$$

Since the Hamiltonian is itself an operator, the latter equation also applies to the interaction Hamiltonian $H_{\rm int}$. In consequence, one has

$$
H_I = (H_{\rm int})_I = e^{iH_0 t} H_{\rm int} \, e^{-iH_0 t} . \tag{4.11}
$$

The Schrödiner equation in the interaction picture is readily derived starting from the Schrödinger picture, d −iH0 t d e |ψiI = (H0 + Hint ) e−iH0 t |ψiI , i |ψiS = H |ψiS , =⇒ i dt dt d (4.12) =⇒ i |ψiI = eiH0 t Hint e−iH0 t |ψiI , dt d =⇒ i |ψiI = HI |ψiI . dt Dyson’s Formula In order to solve the system described by the Hamiltonian (4.9), we have to find a way of how to find a solution to the Schrödinger equation in the interaction basis (4.12). Let us write the solution as |ψ(t)iI = U (t, t0 )|ψ(t0 )iI , (4.13) 59

where $U(t, t_0)$ is a unitary time-evolution operator satisfying $U(t, t) = 1$, $U(t_1, t_2) \, U(t_2, t_3) = U(t_1, t_3)$, and $U(t_1, t_3) \, U^\dagger(t_2, t_3) = U(t_1, t_2)$. Inserting (4.13) into the last line of (4.12) gives

$$ i \frac{d}{dt} U(t, t_0) = H_I(t) \, U(t, t_0) . \tag{4.14} $$

If $H_I$ were an ordinary function, the solution to the differential equation (4.14) would read

$$ U(t, t_0) \stackrel{?}{=} \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) . \tag{4.15} $$

Yet, $H_I$ is not a function but an operator, and this causes ordering issues. Let's have a closer look at the exponential to understand where the trouble comes from. The exponential is defined through its power expansion,

$$ \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) = 1 - i \int_{t_0}^{t} dt' \, H_I(t') + \frac{(-i)^2}{2} \left( \int_{t_0}^{t} dt' \, H_I(t') \right)^2 + \ldots . \tag{4.16} $$

When we differentiate this with respect to $t$, the third term on the right-hand side gives

$$ -\frac{1}{2} \left( \int_{t_0}^{t} dt' \, H_I(t') \right) H_I(t) - \frac{1}{2} \, H_I(t) \left( \int_{t_0}^{t} dt' \, H_I(t') \right) . \tag{4.17} $$

The second term of this expression looks good since it is part of $H_I(t) \, U(t, t_0)$ appearing on the right-hand side of (4.14), but the first term is no good, because the $H_I(t)$ sits on the wrong side of the integral, and we cannot commute it through, given that $[H_I(t'), H_I(t)] \neq 0$ when $t \neq t'$. So what is the correct expression for $U(t, t_0)$ then? The correct answer is provided by Dyson's formula,22 which reads

$$ U(t, t_0) = T \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) . \tag{4.18} $$

Here $T$ denotes time ordering as defined in (3.124). It is easy to prove the latter statement. We start by expanding out (4.18), which leads to

$$ \begin{aligned} U(t, t_0) = 1 &- i \int_{t_0}^{t} dt' \, H_I(t') \\ &+ \frac{(-i)^2}{2} \left[ \int_{t_0}^{t} dt' \int_{t'}^{t} dt'' \, H_I(t'') \, H_I(t') + \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt'' \, H_I(t') \, H_I(t'') \right] + \ldots . \end{aligned} \tag{4.19} $$

In fact, the terms in the last line are actually the same, since

$$ \begin{aligned} \int_{t_0}^{t} dt' \int_{t'}^{t} dt'' \, H_I(t'') \, H_I(t') &= \int_{t_0}^{t} dt'' \int_{t_0}^{t''} dt' \, H_I(t'') \, H_I(t') \\ &= \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt'' \, H_I(t') \, H_I(t'') , \end{aligned} \tag{4.20} $$

where the range of integration in the first expression is over $t'' \geq t'$, while in the second expression one integrates over $t' \leq t''$, which is, of course, the same thing. The final expression is simply obtained by relabelling $t'$ and $t''$. In fact, it is not too difficult to show that one has

$$ \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \ldots \int_{t_0}^{t_{n-1}} dt_n \, H_I(t_1) \ldots H_I(t_n) = \frac{1}{n!} \int_{t_0}^{t} dt_1 \ldots dt_n \, T \big( H_I(t_1) \ldots H_I(t_n) \big) . \tag{4.21} $$

Putting things together, this means that the power expansion of (4.18) takes the form

$$ U(t, t_0) = 1 - i \int_{t_0}^{t} dt' \, H_I(t') + (-i)^2 \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt'' \, H_I(t') \, H_I(t'') + \ldots . \tag{4.22} $$

22 Essentially figured out by Paul Dirac, but in its compact notation due to Freeman Dyson.
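The factor $1/n!$ in (4.21) is easy to check numerically in the simplest setting. The sketch below is an illustration under a simplifying assumption: $H_I$ is replaced by an ordinary commuting function $f(t)$, so that time ordering plays no role and the nested integral for $n = 2$ must equal one half of the square of the single integral. The function names and the midpoint discretization are my own choices.

```python
import math


def single_integral(f, t0, t, steps=400):
    """Midpoint approximation of int_{t0}^{t} dt' f(t')."""
    h = (t - t0) / steps
    return sum(f(t0 + (i + 0.5) * h) for i in range(steps)) * h


def nested_integral(f, t0, t, steps=400):
    """Midpoint approximation of int_{t0}^{t} dt1 int_{t0}^{t1} dt2 f(t1) f(t2)."""
    h = (t - t0) / steps
    total = 0.0
    inner = 0.0  # running value of int_{t0}^{(cell start)} dt2 f(t2)
    for i in range(steps):
        value = f(t0 + (i + 0.5) * h)
        # inner integral evaluated up to the midpoint of the current cell
        total += value * (inner + 0.5 * h * value) * h
        inner += value * h
    return total


# Linear test function on [0, 1]: nested integral is 1/8, single integral is 1/2.
nested_linear = nested_integral(lambda s: s, 0.0, 1.0)
half_square_linear = 0.5 * single_integral(lambda s: s, 0.0, 1.0) ** 2

# A less trivial commuting example.
nested_cos = nested_integral(math.cos, 0.0, 2.0)
half_square_cos = 0.5 * single_integral(math.cos, 0.0, 2.0) ** 2

print(nested_linear, half_square_linear)
print(nested_cos, half_square_cos)
```

For genuine operators the two sides of this check differ, and this difference is precisely what the time-ordering symbol in (4.21) keeps track of.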

The proof of Dyson's formula is straightforward. First, observe that under the $T$ operation, all operators commute, since their order is already fixed by time ordering. Thus,

$$ \begin{aligned} i \frac{d}{dt} U(t, t_0) &= i \frac{d}{dt} \, T \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) = T \left[ H_I(t) \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) \right] \\ &= H_I(t) \, T \exp \left( -i \int_{t_0}^{t} dt' \, H_I(t') \right) = H_I(t) \, U(t, t_0) . \end{aligned} \tag{4.23} $$

Notice that $t$, being the upper limit of the integral, is the latest time, so that the factor $H_I(t)$ can be pulled out to the left. Before moving on, I have to say that Dyson's formula is rather formal. In practice, it turns out to be very difficult to compute the time-ordered exponential in (4.18). The power of (4.18) comes from the expansion (4.22), which is valid when $H_I$ is a small perturbation to $H_0$.
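The ordering issue can be made concrete with a toy model. The following sketch is not a field theory calculation: it assumes a hypothetical two-level system with a piecewise constant interaction Hamiltonian, $H_I(t) = \sigma_x$ for $0 \leq t < 1$ and $H_I(t) = \sigma_z$ for $1 \leq t \leq 2$. For such an $H_I$ the time-ordered exponential (4.18) reduces to a product of two ordinary exponentials with the later time standing to the left, and it differs from the naive exponential (4.15).

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]    # Pauli matrix sigma_x
SZ = [[1, 0], [0, -1]]   # Pauli matrix sigma_z


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def exp_minus_i(theta, s):
    """exp(-i * theta * s) for a 2x2 matrix s that squares to the identity."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[c * I2[i][j] - 1j * sn * s[i][j] for j in range(2)] for i in range(2)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Time-ordered result: the factor belonging to the later interval stands on the left.
u_dyson = matmul(exp_minus_i(1.0, SZ), exp_minus_i(1.0, SX))

# Naive exponential of the integrated Hamiltonian, exp(-i (sigma_x + sigma_z)).
# The matrix (sigma_x + sigma_z) / sqrt(2) squares to the identity.
n_sigma = [[(SX[i][j] + SZ[i][j]) / math.sqrt(2) for j in range(2)] for i in range(2)]
u_naive = exp_minus_i(math.sqrt(2), n_sigma)

print("difference between the two:", max_abs_diff(u_dyson, u_naive))
print("unitarity check:", max_abs_diff(matmul(dagger(u_dyson), u_dyson), I2))
```

Both operators are unitary, but only the time-ordered one solves (4.14) for this choice of $H_I(t)$.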

4.3 First Look at Scattering Processes

Let us now try to apply the interaction picture to QFT, starting with an easy example, namely the interaction Hamiltonian of the Yukawa theory,

$$ H_{\rm int} = g \int d^3x \, \varphi^* \varphi \, \phi . \tag{4.24} $$

Unlike the free theories discussed in Section 2, this interaction does not conserve the particle number of the individual fields, allowing particles of one type to morph into others. In order to see why this is the case, we look at the evolution of the state, i.e., $|\psi(t)\rangle = U(t, t_0) \, |\psi(t_0)\rangle$, in the interaction picture. If $g \ll M, m$, where $M$ and $m$ are the masses of $\varphi$ and $\phi$, respectively, the perturbation (4.24) is small and we can approximate the full time-evolution operator $U(t, t_0)$ in (4.18) by (4.22). Notice that (4.22) is, in fact, an expansion in powers of $H_{\rm int}$.

The interaction Hamiltonian $H_{\rm int}$ contains ladder operators for each type of particle. In particular, glancing at (3.107) tells us that the $\phi$ field contains the operators $a^\dagger$ and $a$ that create or destroy $\phi$ particles.23 Let's call these particles mesons ($M$). On the other hand, from the discussion in Sections 3.4 and 3.5, it follows that the field $\varphi$ contains the operators $a_-^\dagger$ and $a_+$, which implies that it creates $\varphi$ antiparticles and destroys $\varphi$ particles. We will call these particles nucleons ($N$).24 Finally, the action of $\varphi^*$ is to create nucleons through $a_+^\dagger$ and to destroy antinucleons via $a_-$. While the individual particle numbers are not conserved, it is important to emphasize that $Q = N_+ - N_-$ as defined in (3.95) is conserved not only in the free theory, but also in the presence of $H_{\rm int}$. At first order in $H_{\rm int}$, one will have terms of the form $a_+^\dagger a_-^\dagger a$, which destroy a meson and create a nucleon-antinucleon pair, $M \to N \bar N$. At second order in $H_{\rm int}$, we have more complicated processes. E.g., the combination of ladder operators $(a_+^\dagger a_-^\dagger a)(a_+ a_- a^\dagger)$ gives rise to the scattering process $N \bar N \to M \to N \bar N$. The rest of this section is devoted to calculating the quantum amplitudes for such processes to occur.

In order to calculate the amplitude, we have to make an important, but slightly dodgy, assumption. We require that the initial state $|i\rangle$ at $t \to -\infty$ (final state $|f\rangle$ at $t \to \infty$) is an eigenstate of the free theory described by the Hamiltonian $H_0$. At some level, this sounds like a reasonable approximation. If at $t \to \mp\infty$ the particles are well separated, they do not feel the effects of each other. Moreover, we intuitively expect that the states $|i\rangle$ and $|f\rangle$ are eigenstates of the individual number operators $N$ and $N_\pm$. These operators commute with $H_0$, but not with $H_{\rm int}$. As the particles approach each other, they interact briefly, before departing again, each going its own way. The amplitude to go from $|i\rangle$ to $|f\rangle$ is given by

$$ \lim_{t_\mp \to \mp\infty} \langle f | U(t_+, t_-) | i \rangle = \langle f | S | i \rangle , \tag{4.25} $$

23 The additional subscript $p$ of the ladder operators $a^\dagger$ and $a$ etc. is dropped hereafter in the text.

where the unitary operator $S$ is known as the S-matrix. Needless to say, the S in S-matrix stands for scattering.

There are a number of reasons why the assumption of non-interacting initial and final states $|i\rangle$ and $|f\rangle$ is shaky. First, one cannot describe bound states. E.g., naively this formalism cannot deal with the scattering of an $e^-$ and a proton ($p$) which collide, bind, and leave as a hydrogen atom. It is possible to circumvent this objection, since it turns out that bound states show up as poles in the S-matrix. Second, and more importantly, a single particle, a long way from its neighbors, is never alone in field theory. This is true even in classical electrodynamics, where the electron sources the electromagnetic field from which it can never escape. In QED, a related fact is that there is a cloud of virtual photons surrounding the electron. This line of thought gets us into the issues of renormalization and you will hear more on this later. For the time being, let me simply use the assumption of non-interacting asymptotic states. After developing the basics of scattering theory, we will revisit the latter problem.

Example: Meson Decay

Let us consider the relativistically normalized initial and final states,

$$ |i\rangle = \sqrt{2 E_{p_1}} \, a_{p_1}^\dagger |0\rangle , \qquad |f\rangle = \sqrt{4 E_{q_1} E_{q_2}} \, a_{+,q_1}^\dagger a_{-,q_2}^\dagger |0\rangle . \tag{4.26} $$

24 Of course, in reality nucleons are spin-1/2 particles, and do not arise from the quantization of a scalar field. Our scalar Yukawa theory is therefore only a toy model for nucleons interacting with mesons.

The initial state contains a meson with momentum $p_1$, while the final state contains a nucleon-antinucleon pair of momenta $q_1$ and $q_2$. At leading order in the interaction $H_{\rm int}$ (4.24), the amplitude for the process $M \to N \bar N$ is given by

$$ \langle f | S | i \rangle = -ig \, \langle f | \int d^4x \, \varphi_I^*(x) \varphi_I(x) \phi_I(x) \, | i \rangle . \tag{4.27} $$

Let us calculate this matrix element step by step. We first express $\phi_I$ in terms of $a^\dagger$ and $a$ using (3.107). Notice that it is correct to apply the latter equation, since the $\phi_I$ field in (4.27) is in the interaction picture, which is the same as the Heisenberg picture of the free theory. The annihilation operator in (3.107) will turn $|i\rangle$ into something proportional to $|0\rangle$, while the piece containing a creation operator will turn $|i\rangle$ into a two-meson state. A two-meson state has however no overlap with $\langle f|$, which is an $N \bar N$ state, and the ladder operators appearing in $\varphi_I^*$ and $\varphi_I$ cannot change this situation. So we have

$$ \begin{aligned} \langle f | S | i \rangle &= -ig \, \langle f | \int d^4x \, \varphi_I^*(x) \varphi_I(x) \int \frac{d^3k}{(2\pi)^3} \frac{\sqrt{2 E_{p_1}}}{\sqrt{2 E_k}} \, a_k a_{p_1}^\dagger \, e^{-ikx} \, | 0 \rangle \\ &= -ig \, \langle f | \int d^4x \, \varphi_I^*(x) \varphi_I(x) \int \frac{d^3k}{(2\pi)^3} \frac{\sqrt{2 E_{p_1}}}{\sqrt{2 E_k}} \left[ (2\pi)^3 \delta^{(3)}(\boldsymbol{p}_1 - \boldsymbol{k}) + a_{p_1}^\dagger a_k \right] e^{-ikx} \, | 0 \rangle \\ &= -ig \, \langle f | \int d^4x \, \varphi_I^*(x) \varphi_I(x) \, e^{-i p_1 x} \, | 0 \rangle . \end{aligned} \tag{4.28} $$

Now we do the same for $\varphi_I$ and $\varphi_I^*$. To get a non-zero overlap with our nucleon-antinucleon final state, we have to pick up the creation operators $a_+^\dagger$ and $a_-^\dagger$ from the Fourier expansion of the field operators. Altogether we then have

$$ \begin{aligned} \langle f | S | i \rangle &= -ig \, \langle 0 | \int \frac{d^4x \, d^3k_1 \, d^3k_2}{(2\pi)^6} \frac{\sqrt{4 E_{q_1} E_{q_2}}}{\sqrt{4 E_{k_1} E_{k_2}}} \, a_{-,q_2} a_{+,q_1} a_{+,k_1}^\dagger a_{-,k_2}^\dagger \, | 0 \rangle \, e^{i(k_1 + k_2 - p_1)x} \\ &= -ig \int \frac{d^4x \, d^3k_1 \, d^3k_2}{(2\pi)^6} \frac{\sqrt{4 E_{q_1} E_{q_2}}}{\sqrt{4 E_{k_1} E_{k_2}}} \, (2\pi)^6 \, \delta^{(3)}(\boldsymbol{q}_1 - \boldsymbol{k}_1) \, \delta^{(3)}(\boldsymbol{q}_2 - \boldsymbol{k}_2) \, e^{i(k_1 + k_2 - p_1)x} \\ &= -ig \, (2\pi)^4 \, \delta^{(4)}(q_1 + q_2 - p_1) , \end{aligned} \tag{4.29} $$

where we have made repeated use of the commutation relations of the ladder operators as given in (3.16) and ignored contributions where annihilation operators act on the vacuum, since these vanish by definition. We have drawn first blood: the result in (4.29) is our first QFT amplitude.

Notice that the delta function constrains the possible $M \to N \bar N$ decays. In particular, the decay can only happen at all if the mass of the meson is larger than or equal to the mass of the nucleon-antinucleon state, i.e., $m \geq 2M$. In order to see this, we simply boost our reference frame so that the meson is at rest, $p_1 = (m, 0, 0, 0)$. This is always possible. Momentum conservation, as imposed by the delta function, then implies that the nucleon and antinucleon are produced back-to-back, $\boldsymbol{q}_1 = -\boldsymbol{q}_2$, and that $m = 2 \left( M^2 + |\boldsymbol{q}_{1,2}|^2 \right)^{1/2} \geq 2M$.
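The kinematic constraint from the delta function in (4.29) can be turned into a small calculation. The sketch below solves $m = 2 (M^2 + |\boldsymbol{q}|^2)^{1/2}$ for the nucleon momentum in the meson rest frame; the function name and the sample masses are illustrative assumptions.

```python
import math


def decay_momentum(m, M):
    """Return |q| of each nucleon in the meson rest frame, or None if m < 2M.

    Follows from energy conservation, m = 2 * sqrt(M^2 + |q|^2).
    """
    if m < 2 * M:
        return None  # decay is kinematically forbidden
    return math.sqrt(m * m / 4.0 - M * M)


q_allowed = decay_momentum(3.0, 1.0)     # m > 2M: decay possible
q_threshold = decay_momentum(2.0, 1.0)   # m = 2M: nucleons produced at rest
q_forbidden = decay_momentum(1.0, 1.0)   # m < 2M: no decay

print(q_allowed, q_threshold, q_forbidden)
```

At threshold the nucleons are produced at rest, and below threshold the delta function in (4.29) has no support, so the amplitude vanishes.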

4.4 Wick's Theorem

Using Dyson's formulas (4.18) and (4.22), we want to compute matrix elements such as

$$ \langle f | \, T \big( H_I(x_1) \ldots H_I(x_n) \big) \, | i \rangle , \tag{4.30} $$

where $|i\rangle$ and $|f\rangle$ are assumed to be asymptotically free states. The ordering of the operators $H_I$ is fixed by time ordering. However, since the interaction Hamiltonian contains certain creation and annihilation operators, it would be convenient if we could move all annihilation operators to the right, where they can start eliminating particles in $|i\rangle$. Recall that this is the definition of normal ordering as given in (3.27). Wick's theorem tells us how to go from time-ordered products to normal-ordered products. Before stating Wick's theorem in its full generality, let's keep it simple and try to rederive something that we know already. This is always a good idea.

Case of Two Fields

The simplest matrix element of the form (4.30) is

$$ \langle 0 | \, T \phi_I(x) \phi_I(y) \, | 0 \rangle . \tag{4.31} $$

We already calculated this object in Section 3.7 and gave it the name Feynman propagator. What we want to do now is to rewrite it in such a way that it is easy to evaluate and to generalize the obtained result to the case with more than two fields. We start by decomposing the real scalar field in the interaction picture as

$$ \phi_I(x) = \phi_I^+(x) + \phi_I^-(x) , \tag{4.32} $$

with25

$$ \phi_I^+(x) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2 E_k}} \, a_k \, e^{-ikx} , \qquad \phi_I^-(x) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2 E_k}} \, a_k^\dagger \, e^{ikx} . \tag{4.33} $$

This decomposition can be done for any free field. It is useful since

$$ \phi_I^+(x) | 0 \rangle = 0 , \qquad \langle 0 | \phi_I^-(x) = 0 . \tag{4.34} $$

Now we consider the case $x^0 > y^0$ and compute the time-ordered product of the two scalar fields,

$$ \begin{aligned} T \phi_I(x) \phi_I(y) &= \phi_I(x) \phi_I(y) = \big( \phi_I^+(x) + \phi_I^-(x) \big) \big( \phi_I^+(y) + \phi_I^-(y) \big) \\ &= \phi_I^+(x) \phi_I^+(y) + \phi_I^-(x) \phi_I^+(y) + \phi_I^-(y) \phi_I^+(x) + \phi_I^-(x) \phi_I^-(y) + \big[ \phi_I^+(x), \phi_I^-(y) \big] , \end{aligned} \tag{4.35} $$

where we have normal ordered the last line, i.e., brought all $\phi_I^+$'s to the right. To get rid of the $\phi_I^+(x) \phi_I^-(y)$ term, we have added the commutator $\big[ \phi_I^+(x), \phi_I^-(y) \big]$. In the case $x^0 < y^0$, we find, repeating the above exercise,

$$ T \phi_I(x) \phi_I(y) = \; :\phi_I(x) \phi_I(y): + \big[ \phi_I^+(y), \phi_I^-(x) \big] , \tag{4.36} $$

where we have made use of the fact that the first four terms in the last line of (4.35) are simply the normal-ordered product of the two fields, $:\phi_I(x) \phi_I(y):$. In order to combine the results (4.35) and (4.36) into one equation, we define the contraction of two fields,

$$ \wick{\c\phi_I(x) \c\phi_I(y)} = \begin{cases} \big[ \phi_I^+(x), \phi_I^-(y) \big] , & x^0 > y^0 , \\ \big[ \phi_I^+(y), \phi_I^-(x) \big] , & y^0 > x^0 . \end{cases} \tag{4.37} $$

This definition implies that the contraction of two $\phi_I$ fields is nothing but the Feynman propagator:

$$ \wick{\c\phi_I(x) \c\phi_I(y)} = D_F(x - y) . \tag{4.38} $$

For a string of field operators $\phi_I$, the contraction of a pair of fields means replacing the contracted operators with the Feynman propagator, leaving all other operators untouched. Equipped with the definition (4.37), the relation between time-ordered and normal-ordered products of two fields can now be simply written as

$$ T \phi_I(x) \phi_I(y) = \; :\phi_I(x) \phi_I(y): + \wick{\c\phi_I(x) \c\phi_I(y)} . \tag{4.39} $$

25 The superscripts "±" do not make much sense, but I just follow Pauli and Heisenberg here. If you have to, complain to them.

Let me emphasize that while both $T \phi_I(x) \phi_I(y)$ and $:\phi_I(x) \phi_I(y):$ are operators, their difference is a complex function, namely the Feynman propagator or the contraction of two $\phi_I$ fields. The formalism of contractions is also straightforwardly extended to our complex scalar field $\varphi_I$. One has

$$ T \varphi_I(x) \varphi_I^*(y) = \; :\varphi_I(x) \varphi_I^*(y): + \wick{\c\varphi_I(x) \c\varphi_I^*(y)} , \tag{4.40} $$

prompting us to define the contraction in this case as

$$ \wick{\c\varphi_I(x) \c\varphi_I^*(y)} = D_F(x - y) , \qquad \wick{\c\varphi_I(x) \c\varphi_I(y)} = \wick{\c\varphi_I^*(x) \c\varphi_I^*(y)} = 0 . \tag{4.41} $$

For convenience and brevity, I will from here on often drop the subscript $I$ whenever I calculate matrix elements of the form (4.30). There is however little room for confusion, since contractions will always involve interaction-picture fields.

Strings of Fields

With all this new notation at hand, the generalization to arbitrarily many fields is also easy to write down:

$$ T \phi(x_1) \ldots \phi(x_n) = \; :\big( \phi(x_1) \ldots \phi(x_n) + \text{all possible contractions} \big): . \tag{4.42} $$

This identity is known as Wick's theorem. Notice that for $n = 2$ the latter equation is equivalent to (4.39). Before proving Wick's theorem, let me tell you what the phrase "all possible contractions" means by giving a simple example.

For $n = 4$ we have, writing $\phi_i$ instead of $\phi(x_i)$ for brevity,

$$ \begin{aligned} T \phi_1 \phi_2 \phi_3 \phi_4 = \; : \Big( & \phi_1 \phi_2 \phi_3 \phi_4 + \wick{\c\phi_1 \c\phi_2 \phi_3 \phi_4} + \wick{\c\phi_1 \phi_2 \c\phi_3 \phi_4} + \wick{\c\phi_1 \phi_2 \phi_3 \c\phi_4} \\ & + \wick{\phi_1 \c\phi_2 \c\phi_3 \phi_4} + \wick{\phi_1 \c\phi_2 \phi_3 \c\phi_4} + \wick{\phi_1 \phi_2 \c\phi_3 \c\phi_4} \\ & + \wick{\c1\phi_1 \c1\phi_2 \c2\phi_3 \c2\phi_4} + \wick{\c1\phi_1 \c2\phi_2 \c1\phi_3 \c2\phi_4} + \wick{\c1\phi_1 \c2\phi_2 \c2\phi_3 \c1\phi_4} \Big) : . \end{aligned} \tag{4.43} $$

When the contracted field operators are not adjacent, we still define the contraction to give a factor of $D_F$. E.g.,

$$ :\wick{\phi_1 \c\phi_2 \phi_3 \c\phi_4}: \; = D_F(x_2 - x_4) \, :\phi_1 \phi_3: . \tag{4.44} $$

Since the VEV of any normal-ordered operator vanishes, i.e., $\langle 0 | :O: | 0 \rangle = 0$, sandwiching any term of (4.43) in which there remain uncontracted field operators between vacuum states gives zero. This means that only the three fully contracted terms in the last line of that equation survive, and they are all complex functions. We therefore have

$$ \begin{aligned} \langle 0 | T \phi_1 \phi_2 \phi_3 \phi_4 | 0 \rangle = \; & D_F(x_1 - x_2) D_F(x_3 - x_4) + D_F(x_1 - x_3) D_F(x_2 - x_4) \\ & + D_F(x_1 - x_4) D_F(x_2 - x_3) , \end{aligned} \tag{4.45} $$

which is a rather simple result and has, as we will see in the next section, a nice pictorial interpretation.

Proof of Wick's Theorem

We would still like to prove Wick's theorem. Naturally this is done by induction. We have already proved the case $n = 2$. So let's assume that (4.42) is valid for $n - 1$ and try to show that the latter equation also holds for $n$ field operators. Without loss of generality we can assume that $x_1^0 > \ldots > x_n^0$, since if this is not the case we simply relabel the points in an appropriate way. Such a relabeling leaves both sides of (4.42) unchanged. Then applying Wick's theorem to the string $\phi_2 \ldots \phi_n$, we arrive at

$$ \begin{aligned} T \phi_1 \ldots \phi_n &= \phi_1 \ldots \phi_n \\ &= \phi_1 :\big( \phi_2 \ldots \phi_n + \text{all contractions not involving } \phi_1 \big): \\ &= (\phi_1^+ + \phi_1^-) :\big( \phi_2 \ldots \phi_n + \text{all contractions not involving } \phi_1 \big): . \end{aligned} \tag{4.46} $$

We now want to move the $\phi_1^\pm$'s into the $: \ldots :$. For $\phi_1^-$ this is easy, since on moving it in, it is already on the left-hand side and thus the resulting term is normal ordered. The term with $\phi_1^+$ is more complicated because we have to bring it into normal order by commuting $\phi_1^+$ to the right. E.g., consider the term without contractions,

$$ \begin{aligned} \phi_1^+ :\phi_2 \ldots \phi_n: \; &= \; :\phi_2 \ldots \phi_n: \phi_1^+ + \big[ \phi_1^+, :\phi_2 \ldots \phi_n: \big] \\ &= \; :\phi_1^+ \phi_2 \ldots \phi_n: + :\big( [\phi_1^+, \phi_2^-] \phi_3 \ldots \phi_n + \phi_2 [\phi_1^+, \phi_3^-] \phi_4 \ldots \phi_n + \ldots \big): \\ &= \; :\big( \phi_1^+ \phi_2 \ldots \phi_n + \wick{\c\phi_1 \c\phi_2} \phi_3 \ldots \phi_n + \wick{\c\phi_1 \phi_2 \c\phi_3} \phi_4 \ldots \phi_n + \ldots \big): . \end{aligned} \tag{4.47} $$

Here we first used the fact that the commutator of a single operator and a string of operators can be written as a sum of all possible strings of operators with two adjacent operators put into a commutator. The simplest relation of this type reads $[\phi_1, \phi_2 \phi_3] = [\phi_1, \phi_2] \phi_3 + \phi_2 [\phi_1, \phi_3]$ and is easy to prove. In the last step we then realized that under the assumption $x_1^0 > \ldots > x_n^0$ all commutators of two operators are equivalent to a contraction of the relevant fields. The first term in the last line of (4.47) combines with the $\phi_1^-$ term of (4.46) to give $:\phi_1 \ldots \phi_n:$, meaning that we have derived the first term on the right-hand side of Wick's theorem as well as all terms involving only one contraction of $\phi_1$ with another field in (4.42). It is not too difficult to understand that repeating the above exercise (4.47) with all the remaining terms in (4.46) will then give all possible contractions of all the fields, including those of $\phi_1$. Hence the induction step is complete and Wick's theorem is proved.
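The phrase "all possible contractions" in (4.42) is a purely combinatorial statement, so it can be spelled out by brute force. The following minimal sketch enumerates, for a list of field labels, every term of the Wick expansion as a set of contracted pairs plus the remaining uncontracted fields. The function name `wick_terms` and the labelling of fields by integers are my own illustrative choices.

```python
def wick_terms(fields):
    """Yield (contracted_pairs, uncontracted_fields) for every term in (4.42)."""
    if not fields:
        yield [], []
        return
    first, rest = fields[0], fields[1:]
    # Case 1: the first field stays uncontracted.
    for pairs, free in wick_terms(rest):
        yield pairs, [first] + free
    # Case 2: the first field is contracted with one of the remaining fields.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairs, free in wick_terms(remaining):
            yield [(first, partner)] + pairs, free


terms = list(wick_terms([1, 2, 3, 4]))
# Only fully contracted terms survive between vacuum states, cf. (4.45).
full = [pairs for pairs, free in terms if not free]

print("number of terms in (4.43):", len(terms))
for pairs in full:
    print(" * ".join(f"D_F(x{a} - x{b})" for a, b in pairs))
```

For four fields this reproduces the ten terms of (4.43), of which the three fully contracted ones give (4.45).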

4.5 Second Look at Scattering Processes

In order to see the real power of Wick's theorem let's put it to work and try to calculate $NN \to NN$ scattering in the Yukawa theory (4.24). We first write down the expressions for the initial and final states,

$$ \begin{aligned} |i\rangle &= \sqrt{4 E_{p_1} E_{p_2}} \, a_{+,p_1}^\dagger a_{+,p_2}^\dagger |0\rangle = |p_1, p_2\rangle , \\ |f\rangle &= \sqrt{4 E_{q_1} E_{q_2}} \, a_{+,q_1}^\dagger a_{+,q_2}^\dagger |0\rangle = |q_1, q_2\rangle . \end{aligned} \tag{4.48} $$

We now look at the expansion of $\langle f | S | i \rangle$ in powers of the coupling constant $g$. In order to isolate the interesting part of the S-matrix, i.e., the part due to interactions, we define the T-matrix by

$$ S = 1 + iT , \tag{4.49} $$

where the 1 describes the situation where nothing happens. The leading contribution to $iT$ occurs at second order in the interaction (4.24). We find

$$ \frac{(-ig)^2}{2} \int d^4x \, d^4y \; T \big( \varphi^*(x) \varphi(x) \phi(x) \, \varphi^*(y) \varphi(y) \phi(y) \big) . \tag{4.50} $$

Applying Wick's theorem to the time-ordered product entering this expression, we get (besides others) a term

$$ D_F(x - y) \, :\varphi^*(x) \varphi(x) \varphi^*(y) \varphi(y): , \tag{4.51} $$

which features a contraction of the two $\phi$ fields. This term will contribute to the scattering, because the operator $:\varphi^*(x) \varphi(x) \varphi^*(y) \varphi(y):$ destroys the two nucleons in the initial state and generates those appearing in the final state. In fact, (4.51) is the only contribution to the process $NN \to NN$, since any other term of the Wick expansion would lead to a vanishing matrix element. The matrix element of the normal-ordered operator in (4.51) is readily computed:

$$ \begin{aligned} \langle q_1, q_2 | :\varphi^*(x) \varphi(x) \varphi^*(y) \varphi(y): | p_1, p_2 \rangle &= \langle q_1, q_2 | \varphi^*(x) \varphi^*(y) | 0 \rangle \langle 0 | \varphi(x) \varphi(y) | p_1, p_2 \rangle \\ &= \left( e^{i(q_1 x + q_2 y)} + e^{i(q_1 y + q_2 x)} \right) \left( e^{-i(p_1 x + p_2 y)} + e^{-i(p_1 y + p_2 x)} \right) \\ &= e^{i[(q_1 - p_1)x + (q_2 - p_2)y]} + e^{i[(q_2 - p_1)x + (q_1 - p_2)y]} + (x \leftrightarrow y) , \end{aligned} \tag{4.52} $$

where, in going to the second line, we have used the fact that for relativistically normalized states, $\langle 0 | \varphi(x) | p \rangle = e^{-ipx}$. Putting things together, the matrix element of (4.50) takes the form

$$ \frac{(-ig)^2}{2} \int d^4x \, d^4y \, \frac{d^4k}{(2\pi)^4} \left[ e^{i[(q_1 - p_1)x + (q_2 - p_2)y]} + e^{i[(q_2 - p_1)x + (q_1 - p_2)y]} + (x \leftrightarrow y) \right] \frac{i \, e^{ik(x - y)}}{k^2 - m^2 + i\epsilon} , \tag{4.53} $$

where the term in square brackets arises from (4.52), while the final factor stems from the expression for the $\phi$ propagator (3.129). The $(x \leftrightarrow y)$ terms double up with the others to cancel the factor of 1/2 in the prefactor $(-ig)^2/2$, while the $x$ and $y$ integrals give delta functions. One arrives at

$$ (-ig)^2 \int \frac{d^4k}{(2\pi)^4} \, \frac{i (2\pi)^8}{k^2 - m^2 + i\epsilon} \left[ \delta^{(4)}(q_1 - p_1 + k) \, \delta^{(4)}(q_2 - p_2 - k) + \delta^{(4)}(q_2 - p_1 + k) \, \delta^{(4)}(q_1 - p_2 - k) \right] . \tag{4.54} $$

Finally, we perform the $k$ integration using the delta functions. We obtain

$$ i (-ig)^2 \left[ \frac{1}{(p_1 - q_1)^2 - m^2 + i\epsilon} + \frac{1}{(p_1 - q_2)^2 - m^2 + i\epsilon} \right] (2\pi)^4 \, \delta^{(4)}(p_1 + p_2 - q_1 - q_2) , \tag{4.55} $$

where the delta function, like in (4.29), imposes momentum conservation. Let me note that, in fact, we can drop the $i\epsilon$ in the propagators, since the denominators cannot become zero. In order to see this, we go to the center-of-mass (CM) frame, where $\boldsymbol{p}_1 = -\boldsymbol{p}_2$ and, by momentum conservation, $|\boldsymbol{p}_1| = |\boldsymbol{q}_1|$. This ensures that the 4-momentum of the meson is $k = (0, \boldsymbol{p}_1 - \boldsymbol{q}_1)$, and in consequence $k^2 < 0$. We will shortly see another, much simpler way to reproduce the result (4.55) using Feynman diagrams. This will also shed light on the physical interpretation.

Notice that the above calculation is also relevant for the scatterings $\bar N \bar N \to \bar N \bar N$ and $N \bar N \to N \bar N$. Both reactions arise from the term (4.51) in Wick's theorem. However, we will never find a term that contributes to $NN \to \bar N \bar N$ or $\bar N \bar N \to NN$, because these transitions would violate the conservation of the charge $Q$ introduced in (3.95).
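The claim that the denominators in (4.55) cannot vanish is easily checked with explicit numbers. The following sketch builds the four momenta in the CM frame for a scattering angle `theta` and evaluates the two momentum transfers $(p_1 - q_1)^2$ and $(p_1 - q_2)^2$ together with the square bracket of (4.55) without the $i\epsilon$. The helper names, the choice of the scattering plane, and the sample masses are illustrative assumptions.

```python
import math


def minkowski_sq(v):
    """v^2 = v0^2 - |v|^2 with mostly-minus metric."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2


def subtract(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cm_momenta(p_abs, theta, M):
    """On-shell nucleon momenta p1, p2 -> q1, q2 in the CM frame."""
    energy = math.sqrt(M * M + p_abs * p_abs)
    p1 = (energy, 0.0, 0.0, p_abs)
    p2 = (energy, 0.0, 0.0, -p_abs)
    q1 = (energy, p_abs * math.sin(theta), 0.0, p_abs * math.cos(theta))
    q2 = (energy, -p_abs * math.sin(theta), 0.0, -p_abs * math.cos(theta))
    return p1, p2, q1, q2


def exchange_bracket(p_abs, theta, M, m):
    """Return (t, u, 1/(t - m^2) + 1/(u - m^2)), cf. the bracket in (4.55)."""
    p1, p2, q1, q2 = cm_momenta(p_abs, theta, M)
    t = minkowski_sq(subtract(p1, q1))
    u = minkowski_sq(subtract(p1, q2))
    return t, u, 1.0 / (t - m * m) + 1.0 / (u - m * m)


t, u, bracket = exchange_bracket(p_abs=0.7, theta=0.4, M=1.0, m=0.5)
print("t =", t, " u =", u, " bracket =", bracket)
```

Both momentum transfers are negative or zero for any scattering angle, so the denominators are bounded away from zero by $m^2$.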

4.6 Feynman Diagrams

As the above example demonstrates, actually computing scattering amplitudes using Wick's theorem is (still) rather tedious. There's a much better way, which starts by drawing pretty pictures. These pictures represent the expansion of $\langle f | S | i \rangle$ and we will learn how to associate mathematical expressions with those pictures. The pictures, you probably already guessed it, are the famous Feynman diagrams. The Feynman-diagram approach turns out to be a powerful tool to calculate QFT amplitudes (or as Schwinger puts it in [1]: "Like the silicon chips of more recent years, the Feynman diagram was bringing computation to the masses.").

We again start simple and consider the case of four fields, all at different space-time points, which we have already worked out in (4.45). Let us represent each of the points $x_1$ to $x_4$ by a point and the propagators $D_F(x_1 - x_2)$ etc. by a line joining the relevant points. Then the right-hand side of (4.45) can be represented as a sum of three Feynman diagrams,

$\langle 0 | T \phi_1 \phi_2 \phi_3 \phi_4 | 0 \rangle$ = [diagram: lines 1–2 and 3–4] + [diagram: lines 1–3 and 2–4] + [diagram: lines 1–4 and 2–3] . (4.56)

While this matrix element is not a measurable quantity, the pictures suggest a physical interpretation. Two particles are generated at two points and then each propagates to one of the other points, where they are both annihilated. This can happen in three possible ways corresponding to the three shown graphs. The total amplitude for this process is the sum of the three Feynman diagrams.

Things get more interesting if one considers expressions like (4.56) that contain field operators evaluated at the same space-time point. So let us have a look at the expansion of the propagator (4.31) of the real scalar field,

$$ \langle 0 | \, T \left\{ \phi(x) \phi(y) + \phi(x) \phi(y) \left[ -i \int dt \, H_I(t) \right] + \ldots \right\} | 0 \rangle , \tag{4.57} $$

in the presence of the interaction Hamiltonian $H_I = \int d^3z \, \lambda/(4!) \, \phi^4$ of the $\phi^4$ theory (4.6). The first term gives the free-field result, $\langle 0 | T \phi(x) \phi(y) | 0 \rangle = D_F(x - y)$, while the second term takes the form

$$ \begin{aligned} & \langle 0 | \, T \left\{ \phi(x) \phi(y) \, (-i) \int dt \int d^3z \, \frac{\lambda}{4!} \, \phi^4(z) \right\} | 0 \rangle \\ & \quad = \langle 0 | \, T \left\{ \phi(x) \phi(y) \, \frac{-i\lambda}{4!} \int d^4z \, \phi(z) \phi(z) \phi(z) \phi(z) \right\} | 0 \rangle . \end{aligned} \tag{4.58} $$

Now let's apply Wick's theorem (4.42) to (4.58). We get one term for each possible way to contract the six different $\phi$'s with each other in pairs. There are 15 such possibilities, but fortunately only two of these possibilities are really different. If we contract $\phi(x)$ and $\phi(y)$, there are 3 possible ways to contract the remaining $\phi(z)$'s. The other possibility is to contract $\phi(x)$ with $\phi(z)$ (four choices), $\phi(y)$ with $\phi(z)$ (three choices), and $\phi(z)$ with $\phi(z)$ (one choice). There are 12 possible ways to do this, all giving the same result. In consequence, we have

$$ \begin{aligned} & \langle 0 | \, T \left\{ \phi(x) \phi(y) \, (-i) \int dt \int d^3z \, \frac{\lambda}{4!} \, \phi^4(z) \right\} | 0 \rangle \\ & \quad = 3 \, \frac{-i\lambda}{4!} \, D_F(x - y) \int d^4z \, D_F(z - z) \, D_F(z - z) \\ & \qquad + 12 \, \frac{-i\lambda}{4!} \int d^4z \, D_F(x - z) \, D_F(y - z) \, D_F(z - z) . \end{aligned} \tag{4.59} $$

We can understand the latter expression better if we represent each term as a Feynman graph. Again we draw each propagator as a line and each point as a dot. This time we have however to distinguish between the external points $x$ and $y$ and the internal point $z$, which is associated with a factor $(-i\lambda) \int d^4z$. Neglecting the overall factors, we see that the expression (4.59) is equal to the sum of the following two diagrams

[diagram: line x–y times a figure-eight vacuum bubble at z] + [diagram: line x–z–y with a loop attached at z] . (4.60)
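The counting that leads to the coefficients 3 and 12 in (4.59) can be reproduced by enumerating all pairings explicitly. In the sketch below the four copies of $\phi(z)$ are given the hypothetical labels `z1` to `z4` so that they can be told apart; the function name `pairings` is likewise an illustrative choice.

```python
def pairings(items):
    """Yield every way to split the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


fields = ["x", "y", "z1", "z2", "z3", "z4"]
counts = {"x-y contracted": 0, "x-z and y-z contracted": 0}
for pairing in pairings(fields):
    if ("x", "y") in pairing:
        counts["x-y contracted"] += 1          # first diagram in (4.60)
    else:
        counts["x-z and y-z contracted"] += 1  # second diagram in (4.60)

print(counts)
```

The two classes correspond to the two diagrams in (4.60), and their sizes add up to the 15 full contractions of six fields.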

We refer to the lines in these diagrams as propagators, since they represent the propagation amplitudes $D_F(x - y)$ etc. Internal points where four lines meet are called vertices. Since $D_F(x - y)$ is the amplitude for a free Klein-Gordon particle to propagate between $x$ and $y$, the diagrams actually interpret the analytic formula as a process of creation, propagation, and annihilation which takes place in space-time.

Let's now move to a more complicated contraction that arises at order $\lambda^3$ in the $\phi^4$ interaction ($\phi_x = \phi(x)$ etc.):

$$ \begin{aligned} & \langle 0 | \, \wick{ \c1\phi_x \c2\phi_y \, \frac{1}{3!} \left( \frac{-i\lambda}{4!} \right)^3 \int d^4z \, \c1\phi_z \c3\phi_z \c3\phi_z \c4\phi_z \int d^4w \, \c4\phi_w \c2\phi_w \c5\phi_w \c6\phi_w \int d^4u \, \c5\phi_u \c6\phi_u \c7\phi_u \c7\phi_u } \, | 0 \rangle \\ & \quad = \frac{1}{3!} \left( \frac{-i\lambda}{4!} \right)^3 \int d^4z \, d^4w \, d^4u \; D_F(x - z) \, D_F(z - z) \, D_F(z - w) \\ & \qquad \times D_F^2(w - u) \, D_F(u - u) \, D_F(w - y) . \end{aligned} \tag{4.61} $$

The number of "different" contractions that give this result is large. One has

$$ 3! \times 4 \cdot 3 \times 4 \cdot 3 \cdot 2 \times 4 \cdot 3 \times 1/2 , \tag{4.62} $$

which means a total number of 10 368 possibilities. Here the factor 3! arises from the interchange of the vertices $z$, $w$, and $u$, while the first $4 \cdot 3$ factor describes the placement of the contractions into the $z$ vertex. The factor $4 \cdot 3 \cdot 2$ characterizes the placement of the contractions into the $w$ vertex, whereas the second $4 \cdot 3$ factor is associated with the placement of the contractions into the $u$ vertex. Finally, the factor of 1/2 is due to the interchange of the $w$–$u$ contractions. The product in (4.62) is roughly 1/13 of the total number of 135 135 contractions of 14 different field operators. The particular contraction (4.61) can be represented by the following "cactus" diagram:

[cactus diagram: a line running from x through the vertices z and w to y, with a loop attached at z and a vertex u that is connected to w by two lines and carries a loop of its own] (4.63)

It is conventional, for obvious reasons, to let this one diagram represent the sum of all 10 368 identical terms. In practical applications one always draws the Feynman diagrams first, using them as a mnemonic device to write down the analytic expression. If this is done, one still has to figure out the overall multiplicative factor. Of course, one can do this as we have done it above by associating a factor $\int d^4z \, (-i\lambda/(4!))$ with each vertex, putting in the $1/n!$ factor from the Taylor expansion, and then doing the combinatorics by writing out the product of fields as in (4.61) and counting. Yet, typically the $1/n!$ factor from the Taylor series will cancel the $n!$ factor arising from interchanging the vertices, so that one can simply forget about these factors. Furthermore, the generic vertex has four different lines coming from four different places, so that the various contractions into the $\phi\phi\phi\phi$ operator generate a factor of 4! (as in the case of the $w$ vertex in the above example). This factor of 4! cancels the denominator of $-i\lambda/(4!)$. It is therefore conventional to associate the expression $\int d^4z \, (-i\lambda)$ with each vertex.

Applying this scheme to the Feynman graph in (4.63) gives a multiplicative factor that is too large by a factor of $S = 8 = 2 \cdot 2 \cdot 2$, which is called the symmetry factor of the diagram. Two factors of 2 come from lines that start and end on the same vertex, since the diagram is symmetric under the interchange of the ends of such lines ($z$ and $u$ in our case). The other factor of 2 comes from the two propagators connecting $w$ and $u$, since the graph is symmetric under the interchange of these two lines. A third type of symmetry (not arising in the case at hand) is the equivalence of two vertices. In order to arrive at the correct overall factor, one has to divide by the symmetry factor, which is in general the number of possibilities to change parts of the diagram without changing the result of the Feynman graph.

Most people never need to evaluate Feynman graphs with a symmetry factor larger than 2, so there is no need to worry too much about these technicalities. But for completeness let me give some examples of nontrivial symmetry factors. Here they are (dropping the labels $x$ and $y$ at the external points):

[four example diagrams with symmetry factors $S = 2 \cdot 2 \cdot 2 = 8$, $S = 2$, $S = 3! = 6$, and $S = 3! \cdot 2 = 12$] (4.64)
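The relation between the symmetry factor and the number of equivalent contractions can be written as a one-line formula. The sketch below assumes the counting argument given above: the convention of attaching $(-i\lambda)$ to each vertex implicitly assumes $n! \, (4!)^n$ equivalent contractions for a diagram with $n$ vertices, so the symmetry factor is this number divided by the actual count. The function name is an illustrative choice.

```python
from math import factorial


def symmetry_factor(n_vertices, n_equivalent_contractions):
    """S = n! (4!)^n / (number of equivalent contractions) in phi^4 theory."""
    naive = factorial(n_vertices) * factorial(4) ** n_vertices
    if naive % n_equivalent_contractions:
        raise ValueError("contraction count does not divide n! (4!)^n")
    return naive // n_equivalent_contractions


# Cactus diagram (4.63): count of equivalent contractions as in (4.62).
cactus_count = factorial(3) * (4 * 3) * (4 * 3 * 2) * (4 * 3) // 2

print("contractions for the cactus diagram:", cactus_count)
print("S (cactus diagram):", symmetry_factor(3, cactus_count))
# The two diagrams in (4.60), with 3 and 12 equivalent contractions.
print("S (figure-eight bubble times free propagator):", symmetry_factor(1, 3))
print("S (propagator with one loop):", symmetry_factor(1, 12))
```

The values for the two first-order diagrams reproduce the coefficients $3/4! = 1/8$ and $12/4! = 1/2$ appearing in (4.59).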

Clearly, if you are in doubt about the symmetry factor you can always determine it by counting equivalent contractions, as we did above. We are now ready to summarize our rules needed to find the analytic expression for each piece of a given Feynman diagram in the $\phi^4$ theory:

1. For each propagator one has [line from x to y] $= D_F(x - y)$.

2. For each vertex one has [vertex z] $= (-i\lambda) \int d^4z$.

3. For each external point one has [external point x] $= 1$.

4. Divide by the symmetry factor.

Since these rules are written in terms of space-time points $x$, $y$, $z$, etc., they are called position-space Feynman rules. One way to interpret these rules is to think of the factor $(-i\lambda)$ as the amplitude for the emission and/or absorption of particles at a vertex. The integral $\int d^4z$ tells us that we have to "sum" over all points where this process can occur. This is nothing but the superposition principle of QM: when a process can happen in different ways, we add the amplitudes for each possibility. Furthermore, in order to calculate each individual amplitude the Feynman rules tell us to multiply the amplitudes (propagators and vertices) for each independent part of the process.

The above Feynman rules are given in position space. Yet, in actual calculations it is (often) more convenient to work in momentum space by introducing the Fourier transformation of the propagator (3.129). To such a propagator one has to assign a 4-momentum $p$, indicating in general the direction of the momentum with an arrow (since $D_F(x - y) = D_F(y - x)$ the direction of $p$ is arbitrary). The $z$-dependent factors of the vertices in a diagram are then given by

[vertex with incoming momenta $p_1$, $p_2$, $p_3$ and outgoing momentum $p_4$] $\iff \int d^4z \, e^{-i(p_1 + p_2 + p_3 - p_4) \cdot z} = (2\pi)^4 \, \delta^{(4)}(p_1 + p_2 + p_3 - p_4)$ . (4.65)

In other words, momentum is conserved at each vertex. The delta functions from the vertices can now be used to perform some of the momentum integrals from the propagators. We are left with the following momentum-space Feynman rules:

1. For each propagator one has [line carrying momentum p] $= \dfrac{i}{p^2 - m^2 + i\epsilon}$.

2. For each vertex one has [vertex] $= -i\lambda$.

3. For each external point one has [external point x with momentum p] $= e^{-ip \cdot x}$.

4. Impose momentum conservation at each vertex.

5. Integrate over each undetermined momentum: $\displaystyle \int \frac{d^4l}{(2\pi)^4}$.

6. Divide by the symmetry factor.

Again, we can interpret each factor as the amplitude for that part of the process, with the integrations coming from the superposition principle. The exponential factor for an external point is just the amplitude for a particle at that point to have the needed momentum, or, depending on the direction of the arrow, for a particle with a certain momentum to be found at the specific point.
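As an illustration of how the momentum-space rules are used, consider the second diagram in (4.60), i.e., the propagator with a single loop attached at the vertex. The following worked expression is a sketch under stated assumptions: I take the external momentum $p$ to flow from $y$ to $x$, so that the external points contribute $e^{-ip \cdot x}$ and $e^{+ip \cdot y}$, I call the loop momentum $l$, and I use the symmetry factor $S = 2$ of this diagram.

```latex
\frac{1}{2} \int \frac{d^4p}{(2\pi)^4} \, e^{-ip \cdot (x - y)}
\left( \frac{i}{p^2 - m^2 + i\epsilon} \right)^2
(-i\lambda) \int \frac{d^4l}{(2\pi)^4} \, \frac{i}{l^2 - m^2 + i\epsilon}
```

Inserting the Fourier representation of the propagator (3.129) into the second term of (4.59) and performing the $z$ integration with the help of (4.65) leads to the same expression, with the prefactor $12/4! = 1/2$ playing the role of $1/S$.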

4.7

Third Look at Scattering Processes

Let us now apply the things that we have learned to the case of $NN \to NN$ scattering. At order $g^2$ we have to consider the two diagrams shown in Figure 4.1. Employing the relevant momentum-space Feynman rules, it is readily seen that the analytic expression for the sum of the displayed graphs agrees with the final result (4.55) of the calculation that we performed earlier in Section 4.5.

Figure 4.1: Feynman diagrams contributing to $NN \to NN$ scattering at order $g^2$.

In fact, there is a nice physical interpretation of the graphs. We talk, rather loosely, of the nucleons exchanging a meson which, in the first diagram, has momentum $k = p_1 - q_1 = q_2 - p_2$. This meson does not satisfy the usual energy dispersion relation, because $k^2 \neq m^2$, where m is the mass of the meson. The meson is called a virtual particle and is said to be off-shell (or, sometimes, off mass-shell). Heuristically, it can’t live long enough for its energy to be measured to great accuracy. In contrast, the momenta on the external nucleon legs satisfy $p_1^2 = p_2^2 = q_1^2 = q_2^2 = M^2$, which means that the nucleons, having mass M, are on-shell. Similar considerations apply to the second diagram. It is important to notice that the appearance of the two diagrams above ensures that the particles satisfy Bose statistics.

The diagrams describing the scattering of a nucleon and an antinucleon, $N\bar{N} \to N\bar{N}$, are a little bit different from the ones for $NN \to NN$. At lowest order, the corresponding graphs are shown in Figure 4.2. It is a simple matter to write down the amplitude using the relevant Feynman rules,

$$i(-ig)^2 \left[ \frac{1}{(p_1+p_2)^2 - m^2 + i\epsilon} + \frac{1}{(p_1-q_1)^2 - m^2 + i\epsilon} \right] (2\pi)^4 \, \delta^{(4)}(p_1+p_2-q_1-q_2) . \qquad (4.66)$$

Notice that in the CM frame, $\mathbf{p}_1 = -\mathbf{p}_2$, the denominator of the first term in the square bracket is $4(M^2 + \mathbf{p}_1^2) - m^2$. If $m < 2M$, then this term never vanishes and we may drop the $i\epsilon$. In contrast, if $m > 2M$, then the amplitude corresponding to the first diagram diverges at some value of $|\mathbf{p}_1|$. In this case it turns out that we may also neglect the $i\epsilon$ term, although for a different reason: the meson is unstable when $m > 2M$ and thus has a finite width $\Gamma$. When correctly treated, this instability adds a finite imaginary piece $i\Gamma$ to the denominator, which makes the application of the $i\epsilon$ prescription unnecessary. Nonetheless, the increase in the scattering amplitude which we see in the first diagram when $4(M^2 + \mathbf{p}_1^2) = m^2$ is what allows us to discover new particles. These appear as a resonance (a peak or bump) in the cross section (roughly the amplitude squared).

We see that the amplitudes (4.55) and (4.66) (and in general all processes that include the exchange of just a single particle) depend on the same combinations of momenta in the denominators. These sums and differences of momenta have standard names and are known as Mandelstam variables. They are

$$s = (p_1+p_2)^2 = (q_1+q_2)^2 , \qquad t = (p_1-q_1)^2 = (p_2-q_2)^2 , \qquad u = (p_1-q_2)^2 = (p_2-q_1)^2 , \qquad (4.67)$$

where, as in the explicit examples above, $p_1$ and $p_2$ are the momenta of the two initial-state particles, and $q_1$ and $q_2$ are the momenta of the two final-state particles. In order to get a feel for what these variables mean, let us assume (for simplicity) that all four particles are the same. In the CM frame, the initial two particles have the following 4-momenta

$$p_1 = (E, 0, 0, p) , \qquad p_2 = (E, 0, 0, -p) . \qquad (4.68)$$

The particles then scatter at some angle $\theta$ and leave with momenta

$$q_1 = (E, 0, p\sin\theta, p\cos\theta) , \qquad q_2 = (E, 0, -p\sin\theta, -p\cos\theta) . \qquad (4.69)$$

Then from the definitions (4.67), we have that

$$s = 4E^2 , \qquad t = -2p^2 (1 - \cos\theta) , \qquad u = -2p^2 (1 + \cos\theta) . \qquad (4.70)$$
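The expressions in (4.70) are easy to cross-check numerically. The following is a minimal sketch, not part of the original derivation: it builds the CM-frame 4-momenta (4.68) and (4.69) for illustrative values of the mass, momentum and angle (all chosen arbitrarily here), evaluates the definitions (4.67) with the metric signature (+, −, −, −), and compares with (4.70). The helper names `minkowski_sq` and `mandelstam` are introduced only for this illustration.

```python
import math

def minkowski_sq(v):
    """Return the invariant square v.v with metric signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def mandelstam(mass, p, theta):
    """Return (s, t, u) for equal-mass 2 -> 2 scattering in the CM frame."""
    energy = math.sqrt(p ** 2 + mass ** 2)
    # Initial-state momenta, cf. (4.68)
    p1 = (energy, 0.0, 0.0, p)
    p2 = (energy, 0.0, 0.0, -p)
    # Final-state momenta, cf. (4.69)
    q1 = (energy, 0.0, p * math.sin(theta), p * math.cos(theta))
    q2 = (energy, 0.0, -p * math.sin(theta), -p * math.cos(theta))
    s = minkowski_sq(tuple(a + b for a, b in zip(p1, p2)))
    t = minkowski_sq(tuple(a - b for a, b in zip(p1, q1)))
    u = minkowski_sq(tuple(a - b for a, b in zip(p1, q2)))
    return s, t, u

# Illustrative (arbitrary) values: mass 1, momentum 0.5, scattering angle 0.7 rad
s, t, u = mandelstam(1.0, 0.5, 0.7)
print(s, t, u, s + t + u)
```

With these inputs the sum s + t + u comes out as 4E² − 4p² = 4m², which is what one reads off directly from (4.70).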

We see that the variable s is the square of the total center-of-mass energy of the collision, while the variables t and u are measures of the momentum exchanged between the particles (they are basically equivalent, just with the outgoing particles swapped around). Now the amplitudes that involve exchange of a single particle can be written simply in terms of the Mandelstam variables. E.g., for nucleon-nucleon scattering, the amplitude (4.55) is proportional to$^{26}$

$$A(NN \to NN) \propto \frac{1}{t - m^2} + \frac{1}{u - m^2} , \qquad (4.71)$$

while in the case of nucleon-antinucleon scattering one finds

$$A(N\bar{N} \to N\bar{N}) \propto \frac{1}{s - m^2} + \frac{1}{t - m^2} . \qquad (4.72)$$

$^{26}$ Here and in the following we simply drop all $i\epsilon$ terms.

Figure 4.2: Feynman diagrams contributing to $N\bar{N} \to N\bar{N}$ scattering at order $g^2$.

We say that the first case involves t- and u-channel diagrams. On the other hand, the nucleon-antinucleon scattering is said to involve s- and t-channel exchange.

Note finally that there is a relationship between the Mandelstam variables. In the cases of $NN \to NN$ and $N\bar{N} \to N\bar{N}$ scattering, which involve external particles with the same mass, one has

$$s + t + u = 4M^2 . \qquad (4.73)$$

When the masses of the external particles are different this becomes $s + t + u = \sum_{i=1}^{4} m_i^2$, where $m_i$ denotes the individual masses of the initial- and final-state particles.

Let us now consider the case of meson-meson scattering, $MM \to MM$. The simplest diagram we can draw that describes this process is shown in Figure 4.3. It has a single loop, and momentum conservation at each vertex is no longer sufficient to determine every momentum passing through the diagram. Assigning the single undetermined momentum l to the right-hand propagator, all other momenta are fixed by the kinematics (the actual momentum assignments are not displayed in the figure). The amplitude corresponding to the displayed diagram is

$$-i(-ig)^4 \int \frac{d^4l}{(2\pi)^4} \, \frac{1}{l^2 - M^2} \, \frac{1}{(l+q_1)^2 - M^2} \, \frac{1}{(l-p_1+q_1)^2 - M^2} \, \frac{1}{(l-q_2)^2 - M^2} \times (2\pi)^4 \, \delta^{(4)}(p_1+p_2-q_1-q_2) . \qquad (4.74)$$

Figure 4.3: Lowest order contribution to $MM \to MM$ scattering. The momentum assignments are not explicitly shown.

While an explicit calculation of this loop integral is beyond the scope of this lecture, notice that for large l this integral goes as $\int d^4l / l^8$, which means that it is UV finite (the integral is also IR finite since all propagators are massive). In general, however, loop integrals can have both UV ($l^2 \to \infty$) and IR ($l^2 \to 0$) singularities.

The delta function follows from the conservation of 4-momentum which, in turn, follows from space-time translational invariance. It is common to all S-matrix elements. We will define the amplitude $A(f \to i)$ by stripping off this momentum-conserving delta function,

$$\langle f | S - 1 | i \rangle = i \langle f | T | i \rangle = i \, (2\pi)^4 \, \delta^{(4)}(p_f - p_i) \, A(f \to i) , \qquad (4.75)$$

where $p_f$ ($p_i$) is the sum of the final (initial) 4-momenta, and the factor of i out front is a convention which is there to match non-relativistic QM.

4.8

Yukawa Potential

So far we have calculated the quantum amplitudes for various scattering processes. But these quantities are a little bit abstract. In order to make contact with experiment, let me show in the following how to translate the amplitude (4.55) for nucleon-nucleon scattering into something familiar from Newtonian mechanics, namely a potential, or force, between the particles.

We start by asking a simple question in classical field theory that will turn out to be relevant in order to calculate the quantum process. Suppose that we have a fixed delta-function source for our real scalar field $\phi$ that persists for all times. What is the profile of $\phi(\mathbf{x})$? In order to answer this question, we have to solve the static Klein-Gordon equation,

$$\left( -\nabla^2 + m^2 \right) \phi(\mathbf{x}) = \delta^{(3)}(\mathbf{x}) . \qquad (4.76)$$

We can solve this equation by going to momentum space, $\phi(\mathbf{x}) = \int d^3p/(2\pi)^3 \, e^{i\mathbf{p}\cdot\mathbf{x}} \, \tilde\phi(\mathbf{p})$. After this Fourier transformation the relation (4.76) takes the form $(\mathbf{p}^2 + m^2) \, \tilde\phi(\mathbf{p}) = 1$, which means that we can write the field as

$$\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i\mathbf{p}\cdot\mathbf{x}}}{\mathbf{p}^2 + m^2} . \qquad (4.77)$$

Let us compute this integral explicitly. Changing to polar coordinates, and writing $\mathbf{p}\cdot\mathbf{x} = pr\cos\theta$, we get

$$\phi(\mathbf{x}) = \frac{1}{(2\pi)^2} \int_0^\infty dp \, \frac{p^2}{p^2+m^2} \, \frac{2\sin(pr)}{pr} = \frac{1}{(2\pi)^2 \, r} \int_{-\infty}^{\infty} dp \, \frac{p \sin(pr)}{p^2+m^2} = \frac{1}{2\pi r} \, \mathrm{Re} \int_{-\infty}^{\infty} \frac{dp}{2\pi i} \, \frac{p \, e^{ipr}}{p^2+m^2} . \qquad (4.78)$$

We evaluate the last integral by closing the contour in the upper half plane, $p \to i\infty$, picking up the pole at $p = im$. This gives

$$\phi(\mathbf{x}) = \frac{1}{4\pi r} \, e^{-mr} . \qquad (4.79)$$
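For completeness, the contour step that leads to (4.79) can be written out. This is a sketch of the standard residue computation, assuming $r > 0$ so that the contribution of the large semicircle in the upper half plane vanishes:

```latex
\begin{align}
\int_{-\infty}^{\infty} \frac{dp}{2\pi i} \, \frac{p \, e^{ipr}}{p^2 + m^2}
  &= \operatorname*{Res}_{p = im} \, \frac{p \, e^{ipr}}{(p - im)(p + im)}
   = \frac{im \, e^{-mr}}{2im}
   = \frac{1}{2} \, e^{-mr} , \\
\phi(\mathbf{x})
  &= \frac{1}{2\pi r} \, \mathrm{Re} \left[ \frac{1}{2} \, e^{-mr} \right]
   = \frac{1}{4\pi r} \, e^{-mr} .
\end{align}
```

The factor $1/(2\pi i)$ in the integration measure cancels the $2\pi i$ of the residue theorem, so the integral equals the residue itself.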

We see that the field dies off exponentially quickly beyond distances of order 1/m, i.e., the Compton wavelength of the meson. It is now interesting to ask how the profile of the $\phi$ field (the meson) and the force between the $\varphi$ particles (the nucleons) are related. Recall that we face a similar problem in electrostatics, where a charged particle acts as a delta-function source for the component $A_0$ of the gauge potential $A_\mu = (\phi, \mathbf{A})$. In this case one has $-\nabla^2 A_0 = \delta^{(3)}(\mathbf{x})$, which is solved by $A_0 = 1/(4\pi r)$. The profile of $A_0$ then acts as the potential energy for another charged (test) particle moving in this background. Is such an interpretation also possible in the case of $\phi$? Or, phrased slightly differently, is there a classical limit of the scalar Yukawa theory where the nucleons act as delta-function sources for the meson field, creating the profile (4.79)? And, if so, is this profile then felt as a static potential? The answer is essentially yes, at least in the limit $M \gg m$. But the correct way to describe the potential felt by the nucleons is not to talk about classical fields at all, but instead to work directly with the quantum amplitudes. Let us see explicitly how this goes.

We first compare the result of the first diagram in Figure 4.1 to the corresponding amplitude in non-relativistic QM which describes the interaction of two particles through a potential. In order for the comparison to be meaningful, we have to take the non-relativistic limit of (4.55). We work in the CM frame with $\mathbf{p} = \mathbf{p}_1 = -\mathbf{p}_2$ and $\mathbf{q} = \mathbf{q}_1 = -\mathbf{q}_2$, with $|\mathbf{p}| = |\mathbf{q}|$ for elastic scattering. In the non-relativistic limit one has $|\mathbf{p}| \ll M$, which by momentum conservation implies $|\mathbf{q}| \ll M$. It is easy to check that in this limit the first term in (4.55) turns into

$$\frac{ig^2}{(\mathbf{p} - \mathbf{q})^2 + m^2} . \qquad (4.80)$$

We should now compare this result to the scattering amplitude in QM. In order to do this, we consider two particles separated by a distance $\mathbf{x}$, interacting through a potential $V(\mathbf{x})$. The amplitude for the particles to scatter from $\pm\mathbf{p}$ into $\pm\mathbf{q}$ can be computed in perturbation theory, using techniques familiar from non-relativistic QM. In Born approximation, i.e., to leading order in the perturbative expansion, the sought amplitude is given by

$$\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle = -i \int d^3x \, V(\mathbf{x}) \, e^{-i(\mathbf{p} - \mathbf{q})\cdot\mathbf{x}} . \qquad (4.81)$$

Taking into account that there is a relative factor of $(2M)^2$ that arises in comparing the QFT amplitude to $\langle \mathbf{q} | V(\mathbf{x}) | \mathbf{p} \rangle$, which can be traced to the relativistic normalization of the states $|p_1, p_2\rangle$,$^{27}$ we find after equating (4.80) and (4.81) the following relation

$$\int d^3x \, V(\mathbf{x}) \, e^{-i(\mathbf{p} - \mathbf{q})\cdot\mathbf{x}} = \frac{-\lambda^2}{(\mathbf{p} - \mathbf{q})^2 + m^2} . \qquad (4.82)$$

$^{27}$ Notice that this factor is also necessary to get the dimensions of the potential to work out correctly.

Here we have introduced the dimensionless parameter $\lambda = g/(2M)$. The latter equation is trivially inverted, giving

$$V(\mathbf{x}) = -\lambda^2 \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{i\mathbf{p}\cdot\mathbf{x}}}{\mathbf{p}^2 + m^2} = \frac{-\lambda^2}{4\pi r} \, e^{-mr} , \qquad (4.83)$$

where in the last step we have used the results (4.77) through (4.79). The potential $V(\mathbf{x})$ is the famous Yukawa potential. The force has a range 1/m and the minus sign in (4.83) tells us that the potential is attractive. Hideki Yukawa made this potential the basis for his theory of the nuclear force and worked backwards from the range of the force (of about 1 fm) to predict the mass (of about 200 MeV) of the required boson, the pion [2].

It is important to realize that QFT has given us an entirely new perspective on the nature of forces between particles. Rather than being a fundamental concept, the force arises from the virtual exchange of other particles, in this case the meson.
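Yukawa's estimate can be reproduced in a few lines. The sketch below is only an illustration: it restores units with the conversion constant ħc ≈ 197.327 MeV fm, converts a range of 1 fm into a mediator mass via m = ħc/range, and evaluates the potential (4.83) for a hypothetical coupling λ = 1 (the coupling value and the function names are assumptions made for this example, not taken from the text).

```python
import math

HBAR_C_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm

def mediator_mass_mev(range_fm):
    """Mass (in MeV) of an exchanged boson whose force has range 1/m = range_fm."""
    return HBAR_C_MEV_FM / range_fm

def yukawa_potential_mev(r_fm, mass_mev, lam):
    """Yukawa potential -lam^2 exp(-m r) / (4 pi r), cf. (4.83), in MeV for r in fm."""
    m_times_r = mass_mev * r_fm / HBAR_C_MEV_FM  # dimensionless product m*r
    return -lam ** 2 * HBAR_C_MEV_FM * math.exp(-m_times_r) / (4.0 * math.pi * r_fm)

mass = mediator_mass_mev(1.0)  # range of about 1 fm
print(f"mediator mass for a 1 fm range: {mass:.1f} MeV")
for r in (0.5, 1.0, 2.0):
    # lam = 1 is a purely illustrative coupling
    print(r, yukawa_potential_mev(r, mass, 1.0))
```

The potential is negative (attractive) at all distances, and doubling the distance from one to two ranges suppresses it by a factor $e^{-1}/2$, which shows how quickly the force is cut off beyond 1/m.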

4.9

Connected and Amputated Feynman Diagrams

We have explained in some detail how to compute scattering amplitudes by drawing all Feynman diagrams and by writing down the corresponding analytic expressions using the Feynman rules. In fact, there are a couple of caveats about which Feynman diagrams one should draw and calculate. Both of these caveats are related to the assumption made so far that the initial and final states are eigenstates of the free theory which, as we have mentioned before, is not correct. The two caveats are as follows.

First, we consider only connected Feynman diagrams, where every part of the diagram is connected to at least one external line. We shall see shortly that this is related to the fact that the vacuum $|0\rangle$ of the free theory is not the true vacuum $|\Omega\rangle$ of the interacting theory. An example of a disconnected diagram (or piece) is shown on the left-hand side in Figure 4.4.

Second, we do not consider diagrams with loops on external lines, so-called unamputated graphs. An example of such a diagram is depicted on the right-hand side of Figure 4.4. These diagrams are related to the fact that the one-particle states of the free theory are not the same as the one-particle states of the interacting theory. In particular, correctly dealing with these diagrams will account for the fact that particles in interacting QFTs are always surrounded by a swarm of virtual particles. We will refer to diagrams in which all loops on external legs have been removed as amputated graphs.

Figure 4.4: Example of a disconnected (left-hand side) and an unamputated (right-hand side) Feynman diagram in $\phi^4$ theory.

Vacuum of the Interacting Theory

We start out by discussing the properties of the vacuum $|\Omega\rangle$ of the interacting theory. We will normalize the state $|\Omega\rangle$ as $\langle\Omega|\Omega\rangle = 1$, and it satisfies $H|\Omega\rangle = E_0|\Omega\rangle$. Since $|\Omega\rangle$ is the ground state of H, we can isolate it by the following procedure. Imagine starting with the vacuum $|0\rangle$ of the free theory (i.e., $H_0|0\rangle = 0$) and evolving it with H,

$$e^{-iHt} |0\rangle = \sum_n e^{-iE_n t} \, |n\rangle\langle n|0\rangle , \qquad (4.84)$$

where $E_n$ ($|n\rangle$) are the eigenvalues (eigenstates) of H. We must assume that $|\Omega\rangle$ and $|0\rangle$ have some overlap, i.e., $\langle\Omega|0\rangle \neq 0$. If this were not the case, the interaction term $H_I$ would not be a small perturbation compared to $H_0$. Under this assumption, we can rewrite (4.84) as follows

$$e^{-iHt} |0\rangle = e^{-iE_0 t} \, |\Omega\rangle\langle\Omega|0\rangle + \sum_{n \neq 0} e^{-iE_n t} \, |n\rangle\langle n|0\rangle , \qquad (4.85)$$

where $E_0 = \langle\Omega|H|\Omega\rangle$. Since $E_n > E_0$ for all $n \neq 0$, we can get rid of the second term in (4.85) by sending t to infinity in a slightly imaginary direction, $t \to (1 - i\epsilon)\infty$.$^{28}$ It follows that

$$|\Omega\rangle = \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0 t} \, \langle\Omega|0\rangle \right)^{-1} e^{-iHt} |0\rangle . \qquad (4.86)$$
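Since t is very large we can shift it by a small amount, let’s say $t_0$, so that

$$|\Omega\rangle = \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0 (t+t_0)} \, \langle\Omega|0\rangle \right)^{-1} e^{-iH(t+t_0)} |0\rangle$$

$$= \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0 (t_0-(-t))} \, \langle\Omega|0\rangle \right)^{-1} e^{-iH(t_0-(-t))} \, e^{-iH_0(-t-t_0)} |0\rangle$$

$$= \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0 (t_0-(-t))} \, \langle\Omega|0\rangle \right)^{-1} U(t_0, -t) \, |0\rangle . \qquad (4.87)$$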

Here we have used in the second line that $H_0|0\rangle = 0$ and employed in the third line the relation $U(t, t') = \exp[iH_0(t - t_0)] \exp[-iH(t - t')] \exp[-iH_0(t' - t_0)]$, which follows from (4.14). We see that (ignoring the prefactor) we can get the ket $|\Omega\rangle$ from $|0\rangle$ by simply evolving from $-t$ to $t_0$ with the time-evolution operator U. Similarly, we find for the bra $\langle\Omega|$ the expression

$$\langle\Omega| = \lim_{t \to (1-i\epsilon)\infty} \langle 0| \, U(t, t_0) \left( e^{-iE_0 (t-t_0)} \, \langle 0|\Omega\rangle \right)^{-1} . \qquad (4.88)$$
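The projection onto the ground state in (4.86) can be illustrated in a toy model. The sketch below is not field theory: it uses a hypothetical two-level Hamiltonian $H = \begin{pmatrix} 0 & g \\ g & \Delta \end{pmatrix}$ (the values of g and Δ are arbitrary), takes the "free vacuum" to be the first basis vector, and evolves it in small imaginary-time steps, $v \to (1 - d\tau\, H)\, v$, renormalizing after each step. The renormalization plays the role of dividing out the prefactor in (4.86), and the damping of the excited component mimics the effect of the $-i\epsilon$ in $t \to (1-i\epsilon)\infty$.

```python
import math

def ground_state(g, delta):
    """Exact ground-state energy and normalized eigenvector of [[0, g], [g, delta]]."""
    e0 = 0.5 * (delta - math.sqrt(delta ** 2 + 4.0 * g ** 2))
    norm = math.hypot(g, e0)
    return e0, (g / norm, e0 / norm)

def project_vacuum(g, delta, dtau=0.01, steps=4000):
    """Evolve the 'free vacuum' (1, 0) in imaginary time and renormalize each step."""
    v = (1.0, 0.0)
    for _ in range(steps):
        hv = (g * v[1], g * v[0] + delta * v[1])           # H acting on v
        v = (v[0] - dtau * hv[0], v[1] - dtau * hv[1])     # one step of exp(-dtau * H)
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
    return v

e0, omega = ground_state(0.3, 1.0)
v = project_vacuum(0.3, 1.0)
overlap = abs(omega[0] * v[0] + omega[1] * v[1])
print("ground-state energy:", e0)
print("overlap of evolved state with exact ground state:", overlap)
```

The procedure only works because the starting vector has a non-vanishing overlap with the ground state, which is the toy-model analogue of the assumption $\langle\Omega|0\rangle \neq 0$.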

Correlation Functions

There are many questions we want to ask in QFT that are not directly related to scattering experiments. E.g., we might want to compute the viscosity of the quark-gluon plasma, or understand the response of a condensed matter system to an experimental probe, or figure out the non-Gaussianity of density perturbations arising in the cosmic microwave background from novel models of inflation. All of these questions are answered in the framework of QFT by computing elementary objects known as correlation functions. In the following we will define correlation functions, explain how to compute them using Feynman diagrams, and then relate them back to scattering amplitudes. In order to keep the following discussion as simple as possible, we will work in the real Klein-Gordon theory.

$^{28}$ Since the boundary conditions of the Feynman propagator $D_F(x - y)$ correspond to an integration contour that is slightly rotated away from the $\mathrm{Re}\, p^0$-axis, the imaginary piece of t does not alter the final result.

We start by defining the n-point correlation (or Green’s) function

$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega| T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle , \qquad (4.89)$$

where $\phi_H$ denotes the $\phi$ field in the Heisenberg picture of the full theory, rather than the interaction picture that we have been dealing with so far. The first question that one can ask is how to compute $G^{(n)}$ in terms of matrix elements evaluated on $|0\rangle$, the vacuum of the free theory. Let me first state the result and then prove it. The result reads

$$G^{(n)}(x_1, \ldots, x_n) = \lim_{t \to (1-i\epsilon)\infty} \frac{\langle 0| T \left\{ \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} |0\rangle}{\langle 0| T \left\{ \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} |0\rangle} . \qquad (4.90)$$

Notice that both the numerator and denominator appearing on the right-hand side of the latter equation can be calculated using the methods developed for S-matrix elements, namely Feynman diagrams (or alternatively Dyson’s formula and Wick’s theorem) after expanding the exponentials into a Taylor series.

After stating the result (4.90), we still have to prove it. Without loss of generality we assume that $x_1^0 > \ldots > x_n^0 > t_0$. If this is not the case we simply relabel the points in an appropriate way. Such a relabeling leaves both sides of (4.90) unchanged. We then have

$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega| \phi_H(x_1) \ldots \phi_H(x_n) |\Omega\rangle$$

$$= \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0(t-t_0)} \, \langle 0|\Omega\rangle \right)^{-1} \langle 0| \, U(t, t_0) \, \left[ U(x_1^0, t_0) \right]^\dagger \phi_I(x_1) \, U(x_1^0, t_0) \left[ U(x_2^0, t_0) \right]^\dagger \phi_I(x_2) \, U(x_2^0, t_0) \ldots$$

$$\qquad \times \left[ U(x_n^0, t_0) \right]^\dagger \phi_I(x_n) \, U(x_n^0, t_0) \, U(t_0, -t) \, |0\rangle \left( e^{-iE_0(t_0-(-t))} \, \langle\Omega|0\rangle \right)^{-1}$$

$$= \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0(2t)} \, |\langle 0|\Omega\rangle|^2 \right)^{-1} \langle 0| \, U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) \, |0\rangle$$

$$= \lim_{t \to (1-i\epsilon)\infty} \frac{\langle 0| \, U(t, x_1^0) \, \phi_I(x_1) \, U(x_1^0, x_2^0) \ldots U(x_{n-1}^0, x_n^0) \, \phi_I(x_n) \, U(x_n^0, -t) \, |0\rangle}{\langle 0| \, U(t, -t) \, |0\rangle} . \qquad (4.91)$$

Here we have first used (4.87) and (4.88) and rewritten all Heisenberg fields $\phi_H$ in terms of interaction-picture fields,

$$\phi_H(x) = \left[ U(x^0, t_0) \right]^\dagger \phi_I(x) \, U(x^0, t_0) . \qquad (4.92)$$

Remember that U satisfies $U(t_1, t_2) \, U(t_2, t_3) = U(t_1, t_3)$ and $U(t_1, t_3) \left[ U(t_2, t_3) \right]^\dagger = U(t_1, t_2)$. In order to arrive at the last line, we have finally employed

$$1 = \langle\Omega|\Omega\rangle = \left( e^{-iE_0(2t)} \, |\langle 0|\Omega\rangle|^2 \right)^{-1} \langle 0| \, U(t, -t) \, |0\rangle . \qquad (4.93)$$

The proof of (4.90) is complete after noticing that all fields in (4.91) are in time order and that the product of U operators in the numerator reduces to $U(t, -t) = T \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right]$. Hence the last line in (4.91) is nothing but the right-hand side of (4.90).

Exponentiation of Bubble Diagrams

By means of (4.90) we can now (in principle) calculate any n-point correlation function. But what is the physical interpretation of this equation? We first express the denominator of (4.90) in terms of Feynman diagrams,

$$\lim_{t \to (1-i\epsilon)\infty} \langle 0| \, U(t, -t) \, |0\rangle = 1 + \left[ \text{sum of vacuum diagrams without external points} \right] + \ldots . \qquad (4.94)$$

The disconnected Feynman diagrams appearing on the right-hand side of this relation are called vacuum bubbles. What is the value of the first non-trivial graph? Restoring the position label and the integration momenta,

[Diagram: first non-trivial vacuum bubble with loop momenta $l_1$ and $l_2$] (4.95)

it is readily seen that momentum conservation requires $l_1 = l_2$, so that the diagram evaluates to $(2\pi)^4 \delta^{(4)}(0)$. This factor is also easily derived in position space, where one has

$$\int d^4z \, (\text{const.}) \propto 2t \, V . \qquad (4.96)$$

This result just tells us that the space-time process (4.95) can happen at any place in space, and at any time between $-t$ and $t$. Every disconnected diagram will have one such $(2\pi)^4 \delta^{(4)}(0) = 2t \, V$ factor, where V denotes the volume of space.

In fact, the contributions to $G^{(n)}$ from disconnected diagrams can be shown to exponentiate. To prove the linked-cluster theorem, we first label the various possible disconnected pieces:

$$V_i \in \left\{ \text{set of all distinct vacuum-bubble diagrams} \right\} . \qquad (4.97)$$

Now we assume that a given Feynman diagram has $n_i$ pieces of the form $V_i$ for each i, in addition to its one piece that is connected. If we also denote the value of $V_i$ by $v_i$, the value of a single Feynman graph is

$$(\text{value of connected piece}) \times \left( \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right) , \qquad (4.98)$$

where $1/((n_i)!)$ is the symmetry factor associated with interchanging the $n_i$ copies of the piece $V_i$. The value of the sum of all diagrams is then given by

$$\sum_{\text{all connected diagrams}} \; \sum_{\text{all } \{n_i\}} (\text{value of connected piece}) \times \left( \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right) , \qquad (4.99)$$

where “all $\{n_i\}$” means “all ordered sets $\{n_1, n_2, \ldots\}$ of non-negative integers”. The sum of the connected diagrams factors out of this expression, giving

$$\left( \sum_{\text{all connected diagrams}} (\text{value of connected piece}) \right) \times \left( \sum_{\text{all } \{n_i\}} \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right) . \qquad (4.100)$$

In fact, not only the connected pieces factorize, but also the disconnected ones. One has

$$\sum_{\text{all } \{n_i\}} \left( \prod_i \frac{(v_i)^{n_i}}{(n_i)!} \right) = \prod_i \left( \sum_{n_i} \frac{(v_i)^{n_i}}{(n_i)!} \right) = \prod_i \exp(v_i) = \exp\left( \sum_i v_i \right) . \qquad (4.101)$$

We see that the combinatoric factors (as well as the symmetry factors) associated with each diagram are such that the whole series of disconnected pieces sums to an exponential. Taken together, (4.99) through (4.101) imply that the sum of all diagrams is equal to the sum of all connected diagrams multiplied by the exponential of the sum of all disconnected graphs.

Applying our findings concerning the exponentiation of bubble diagrams to (4.94), we arrive at the following pictorial identity

$$\lim_{t \to (1-i\epsilon)\infty} \langle 0| T \left\{ \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} |0\rangle = \exp\left( \left[ \text{sum of all vacuum bubbles } V_i \right] \right) . \qquad (4.102)$$
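The combinatorial identity (4.101) can be checked numerically for a finite set of bubble values. The sketch below is only an illustration: the three complex numbers stand in for hypothetical values $v_i$ (in the actual theory each $v_i$ is proportional to the divergent factor $2tV$), and the sum over multiplicities is truncated at a maximum $n_i$, which is harmless here because the terms fall off factorially.

```python
import cmath
import math
from itertools import product

def sum_over_multiplicities(values, n_max):
    """Left-hand side of (4.101): sum over all {n_i} with each n_i <= n_max."""
    total = 0.0 + 0.0j
    for ns in product(range(n_max + 1), repeat=len(values)):
        term = 1.0 + 0.0j
        for v, n in zip(values, ns):
            term *= v ** n / math.factorial(n)   # (v_i)^(n_i) / (n_i)!
        total += term
    return total

def exponential_of_sum(values):
    """Right-hand side of (4.101): exp of the sum of all bubble values."""
    return cmath.exp(sum(values))

# Hypothetical values for three distinct bubble diagrams
bubble_values = [0.3, -0.2 + 0.1j, 0.05j]
lhs = sum_over_multiplicities(bubble_values, n_max=20)
rhs = exponential_of_sum(bubble_values)
print(lhs, rhs, abs(lhs - rhs))
```

The two numbers agree to floating-point accuracy, which reflects the factorization of the multiple sum into a product of independent exponential series.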

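The exponentiation of disconnected diagrams is also relevant in the case of the numerator of the right-hand side of (4.90). Let us consider the two-point correlation function $G^{(2)}$ for simplicity. In this case the numerator takes the form

$$\lim_{t \to (1-i\epsilon)\infty} \langle 0| T \left\{ \phi_I(x) \, \phi_I(y) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} |0\rangle = \left( \left[ \text{sum of connected diagrams with external points } x \text{ and } y \right] \right) \times \exp\left( \left[ \text{sum of all vacuum bubbles } V_i \right] \right) . \qquad (4.103)$$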

Combining now (4.102) and (4.103), it follows that the exponentials involving the sum of disconnected diagrams cancel between the numerator and denominator in the formula for the correlation functions. In the case of the two-point function, the final form of (4.90) is thus

$$G^{(2)}(x, y) = \left[ \text{sum of all connected diagrams with external points } x \text{ and } y \right] . \qquad (4.104)$$

The generalization to higher correlation functions is straightforward and reads

$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega| T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle = \left( \text{sum of all connected graphs with } n \text{ external points} \right) . \qquad (4.105)$$

The disconnected diagrams exponentiate, factor, and cancel as before. It is important to remember that by “disconnected” we mean “disconnected from all external points”. In higher correlation functions, diagrams can also be disconnected in another sense. Consider, e.g., the four-point function

$$G^{(4)}(x_1, x_2, x_3, x_4) = \left[ \text{sum of diagrams with four external points, including those in which pairs of external points are not linked to each other} \right] + \ldots . \qquad (4.106)$$

In many of the displayed diagrams, external points are disconnected from each other. Such diagrams neither exponentiate nor factor; they contribute to the amplitude just as the fully connected diagrams do, in which any point can be reached from any other by traveling along the lines.

Energy Density of Vacuum

An immediate consequence of the linked-cluster theorem is that all vacuum bubbles cancel when calculating correlation functions. Does this mean that the disconnected diagrams have no physical meaning at all? The place to look for the answer to this question is (4.91) and (4.93). Taken together these two equations imply that

$$\lim_{t \to (1-i\epsilon)\infty} \langle 0| T \left\{ \phi_I(x_1) \ldots \phi_I(x_n) \exp\left[ -i \int_{-t}^{t} dt' \, H_I(t') \right] \right\} |0\rangle = \langle\Omega| T \left( \phi_H(x_1) \ldots \phi_H(x_n) \right) |\Omega\rangle \lim_{t \to (1-i\epsilon)\infty} \left( e^{-iE_0(2t)} \, |\langle 0|\Omega\rangle|^2 \right) . \qquad (4.107)$$

Looking only at the t-dependent parts on both sides, it follows that

$$\exp\left[ \sum_i v_i \right] \propto \exp\left[ -iE_0 (2t) \right] . \qquad (4.108)$$

The sum of all vacuum bubbles is therefore related to the difference in the ground-state zero-point energies of the interacting and the free theory, the latter of which was defined to be zero. Because each bubble graph $V_i$ contains a single factor of $(2\pi)^4 \delta^{(4)}(0) = 2t \, V$, one explicitly finds that the energy density of the ground state of the (interacting) $\phi^4$ theory reads

$$\mathcal{E}_0 = \frac{E_0}{V} = i \left[ \text{sum of all vacuum bubbles } V_i \right] \left[ (2\pi)^4 \delta^{(4)}(0) \right]^{-1} . \qquad (4.109)$$

Notice that the IR divergence arising from the infinite extent of the space-time volume, which we first met in Section 3.2 and then again in (4.96), has been removed in $\mathcal{E}_0$, leaving behind a highly UV-divergent expression that reflects our ignorance about the physics governing the high-energy regime.

One-Particle States in Interacting Theory

We now have an extremely beautiful formula (4.105) for computing an extremely abstract quantity, the n-point correlation function. Our next task is to relate these objects back to S-matrix elements (4.25) or, equivalently, T-matrix elements (4.49), which will allow us to compute quantities that can actually be measured, namely decay rates and cross sections. In order to achieve this goal, we still have to learn how to deal with diagrams involving loops on the external lines.

Let us first try to understand the problem with such graphs, looking at a specific example. We consider the following Feynman diagram

[Diagram: $\phi\phi \to \phi\phi$ scattering with incoming momenta $p_1$, $p_2$ and outgoing momenta $q_1$, $q_2$, where the leg carrying $p_1$ has a loop with momentum l attached and continues with momentum $p_3$]

$$= \frac{1}{2} \int d^4p_3 \, \frac{i}{p_3^2 - m^2} \int d^4l \, \frac{i}{l^2 - m^2} \times (-i\lambda) \, (2\pi)^4 \delta^{(4)}(p_2 + p_3 - q_1 - q_2) \times (-i\lambda) \, (2\pi)^4 \delta^{(4)}(p_1 - p_3) , \qquad (4.110)$$

appearing in $\phi^4$ theory. We can integrate over $p_3$ using the second delta function. It tells us to evaluate

$$\left. \frac{1}{p_3^2 - m^2} \right|_{p_3 = p_1} = \frac{1}{p_1^2 - m^2} = \frac{1}{0} . \qquad (4.111)$$

We get an infinity, since $p_1$, being the momentum of an external particle, is on-shell, i.e., $p_1^2 = m^2$. This is not good! Clearly, diagrams like (4.110) should not contribute to the S-matrix elements. In fact, this is physically reasonable, since the external leg corrections,

[Diagrams: a single external leg with successively more loop corrections attached] $+ \ldots$ , (4.112)

represent the evolution of the one-particle state of the free theory into the one-particle state of the interacting theory, in the same way that the vacuum-bubble diagrams (4.97) represent the evolution of $|0\rangle$ into $|\Omega\rangle$. Since these corrections have nothing to do with the scattering process itself, it is plausible that one should exclude them from the calculation of the S-matrix.

For a generic Feynman diagram with external legs, we define amputation in the following way. Starting from the tip of each external leg, find the last point at which the diagram can be cut by removing a single propagator, such that this operation separates the leg from the rest of the diagram. Cut there. Let me give a non-trivial example of a diagram that appears at $O(\lambda^{10})$, if one wants to compute $\phi\phi \to \phi\phi$ scattering in $\phi^4$ theory. Here it is:

[Diagram: unamputated graph] $\Longrightarrow$ (amputation) [Diagram: the same graph with all external-leg corrections removed] (4.113)

So far we have learnt about the problem with external-leg corrections, which become infinite for on-shell external states as implied by (4.111), and we gave a simple prescription for solving the issue, namely removing these corrections by amputation. A practitioner or an experimental physicist might be happy at this point, but as theorists we want more. So let’s have a closer look at the connection between $G^{(n)}$ and S.

4.10

From Correlation Functions to Scattering Matrix Elements

Before we start, let me warn you that this subsection will be more abstract than the preceding ones. Its main theme will be the singularities of Feynman diagrams viewed as analytic functions of their external momenta. Yet, we will see rather soon that this apparently esoteric subject is full of physical implications, and that it illuminates the relation between Feynman diagrams and the general principles of QFT.


Källén-Lehmann Spectral Representation

We already know that in the free theory the matrix element $\langle 0| T \phi(x) \phi(y) |0\rangle$ has a simple physical interpretation. It gives the amplitude for a particle to propagate from y to x. To what extent does this carry over to the interacting theory? In order to answer this question, we will have a look at the two-point correlation function (4.104). Our analysis of $G^{(2)}$ will rely only on general principles of special relativity and QM, but will neither depend on the nature of the interactions nor on an expansion in perturbation theory. Yet, to simplify matters, we will restrict our consideration to the case of the real scalar field $\phi$. Similar results can be obtained for correlation functions of fields with spin.

We begin by studying the excited states of the interacting theory, with the corresponding energies being defined relative to the ground-state energy $E_0$. Let $|\lambda_0\rangle$ be an excited eigenstate of the full Hamiltonian with vanishing total 3-momentum, i.e., $\mathbf{P} |\lambda_0\rangle = 0$. That $|\lambda_0\rangle$ can be an eigenstate of both H and $\mathbf{P}$ follows from the fact that $[H, \mathbf{P}] = 0$. Such a state can consist of an arbitrary number of particles or it can even be a bound state. The simultaneous eigenvalues of $H - E_0$ and $\mathbf{P}$ can be combined into a 4-vector $p_0^\mu = (m_\lambda, \mathbf{0})$, where $m_\lambda$ denotes the “mass” of the particular zero-momentum state. Being the generator of space-time translations, $P^\mu = (H - E_0, \mathbf{P})$ transforms as a contravariant 4-vector under boosts, i.e., $U^{-1}(\Lambda) P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu$, where $U(\Lambda)$ is the unitary operator that implements the Lorentz boost. This implies that by boosting $|\lambda_0\rangle$ one can generate a new state $|\lambda_p\rangle$, which can have any 3-momentum $\mathbf{p}$ and is an eigenstate of $H - E_0$ with energy $E_p(\lambda) = (|\mathbf{p}|^2 + m_\lambda^2)^{1/2}$. Or the other way round, any eigenstate with definite 3-momentum can be boosted to a zero-momentum eigenstate. You are kindly asked to prove this statement explicitly.
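At the level of the eigenvalues, the relation between the rest-frame vector $(m_\lambda, \mathbf{0})$ and the boosted energy $E_p(\lambda)$ can be made concrete with a short numerical sketch. It is not a substitute for the proof asked for above: it merely applies a boost along the z-axis, parametrized by a rapidity η (a parametrization chosen here for convenience), to the 4-vector $(m_\lambda, 0, 0, 0)$ and checks that the result lies on the hyperboloid $E^2 - |\mathbf{p}|^2 = m_\lambda^2$. The function names and numerical values are illustrative only.

```python
import math

def boost_from_rest(mass, rapidity):
    """4-momentum (E, px, py, pz) obtained by boosting (mass, 0, 0, 0) along z."""
    return (mass * math.cosh(rapidity), 0.0, 0.0, mass * math.sinh(rapidity))

def invariant_mass_sq(p):
    """Invariant square E^2 - |p|^2 of a 4-momentum."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

mass = 1.5  # illustrative value of m_lambda
for eta in (0.0, 0.8, 2.5):
    p = boost_from_rest(mass, eta)
    energy_from_dispersion = math.sqrt(p[3] ** 2 + mass ** 2)
    print(eta, p[0], energy_from_dispersion, invariant_mass_sq(p))
```

Every rapidity gives a point on the same hyperboloid, and every 3-momentum $p_z = m_\lambda \sinh\eta$ is reached for exactly one η, which is why the boost can also be inverted to return to the zero-momentum point.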
The sets of eigenvalues $p^\mu = (E - E_0, \mathbf{p})$ are thus organized into hyperboloids, as is shown in Figure 4.5. The lowest-lying isolated hyperboloid corresponds to the one-particle states of the interacting theory, whereas the other ones correspond to bound states that may or may not be present. Above a certain threshold value of $m_\lambda$, a continuum of “multiparticle” states starts.

From the above it follows that the states $|\lambda_p\rangle$ form a complete set of states in the interacting theory, in the same way the states $|p\rangle$ do in the free theory. In turn, the completeness relation of the one-particle states in the free theory (3.66) is replaced by

$$1 = |\Omega\rangle\langle\Omega| + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_p(\lambda)} \, |\lambda_p\rangle\langle\lambda_p| , \qquad (4.114)$$

where the first term corresponds to the ground state and the second one to all excited states. We now insert (4.114) into the two-point function $G^{(2)}(x, y) = \langle\Omega| T \phi(x) \phi(y) |\Omega\rangle$.$^{29}$ In the case $x^0 > y^0$, we obtain

$$\langle\Omega| \phi(x) \phi(y) |\Omega\rangle = \langle\Omega| \phi(x) |\Omega\rangle \langle\Omega| \phi(y) |\Omega\rangle + \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{1}{2E_p(\lambda)} \, \langle\Omega| \phi(x) |\lambda_p\rangle \langle\lambda_p| \phi(y) |\Omega\rangle . \qquad (4.115)$$

$^{29}$ For the sake of brevity, the labels H indicating Heisenberg fields will be dropped hereafter, whenever we discuss the properties of correlation functions.

Figure 4.5: The eigenvalues of $P^\mu = (H, \mathbf{P})$ are hyperboloids in the P–H plane (labelled in the figure: one particle at rest, one particle in motion, bound state, multiparticle continuum). For a typical theory the states consist of one or more particles of mass m. In consequence, there is a hyperboloid of one-particle states and a continuum of hyperboloids of two-, three-particle states, and so on. There may also be one or more bound-state hyperboloids below the threshold for creation of two free particles.

In the absence of preferred directions in the universe, the vacuum $|\Omega\rangle$ should be invariant under space-time translations and Lorentz transformations, i.e., $e^{iPx}|\Omega\rangle = |\Omega\rangle$ and $U(\Lambda)|\Omega\rangle = |\Omega\rangle$. As part of an exercise you will show that this implies that

$$ \langle\Omega|\phi(x)|\Omega\rangle = \langle\Omega|\phi(0)|\Omega\rangle = v , \tag{4.116} $$

where $v$ denotes the VEV of the field $\phi(x)$, usually taken to be zero. If $v \neq 0$ then one should reformulate the theory using the shifted field $\bar\phi(x) = \phi(x) - v$, which by definition has vanishing VEV. By an appropriate choice of the degrees of freedom of the interacting theory one can hence always get rid of the first term in (4.115). The matrix elements entering the second term can be manipulated as follows

$$ \begin{aligned} \langle\Omega|\phi(x)|\lambda_p\rangle &= \langle\Omega|e^{iPx}\phi(0)e^{-iPx}|\lambda_p\rangle = e^{-ipx}\,\langle\Omega|\phi(0)|\lambda_p\rangle\Big|_{p^0 = E_{\mathbf{p}}(\lambda)} \\ &= e^{-ipx}\,\langle\Omega|U^{-1}(\Lambda)U(\Lambda)\phi(0)U^{-1}(\Lambda)U(\Lambda)|\lambda_p\rangle\Big|_{p^0 = E_{\mathbf{p}}(\lambda)} \\ &= e^{-ipx}\,\langle\Omega|\phi(0)|\lambda_0\rangle\Big|_{p^0 = E_{\mathbf{p}}(\lambda)} , \end{aligned} \tag{4.117} $$

where $U(\Lambda)$ implements a boost from $\mathbf{p}$ to $\mathbf{0}$. In order to arrive at the final expression, we have made use of the fact that $|\Omega\rangle$ and $\phi(0)$ are Lorentz invariant.^{30}

Footnote 30: For a field with spin we would need to keep track of its non-trivial transformation properties under the Lorentz group.


[Figure 4.6: plot of $\rho(s)$ versus $s$; features labeled “one-particle states” (at $s = m^2$), “bound states”, and “multiparticle continuum” (starting at $s = (2m)^2$).]

Figure 4.6: The spectral density $\rho(s)$ for a typical interacting theory. The one-particle states contribute a delta function at $m^2$, i.e., the square of the physical mass of the particle. Multiparticle states form a continuous spectrum starting at $(2m)^2$. There may also be bound states below the two-particle threshold.

Leaving out the VEV and using (4.117), the two-point correlation function (4.115) then takes the form ($x^0 > y^0$)

$$ \begin{aligned} \langle\Omega|\phi(x)\phi(y)|\Omega\rangle &= \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \, \frac{e^{-ip(x-y)}}{2E_{\mathbf{p}}(\lambda)}\bigg|_{p^0 = E_{\mathbf{p}}(\lambda)} \, |\langle\Omega|\phi(0)|\lambda_0\rangle|^2 \\ &= \sum_\lambda \int \frac{d^4p}{(2\pi)^4} \, \frac{i}{p^2 - m_\lambda^2 + i\epsilon} \, e^{-ip(x-y)} \, |\langle\Omega|\phi(0)|\lambda_0\rangle|^2 , \end{aligned} \tag{4.118} $$

where to arrive at the final result we have introduced an integration over $p^0$ employing (3.120). The integral in the last line of (4.118) is the Feynman propagator $D_F(x-y; m_\lambda^2)$ belonging to a “$\phi$-particle” with mass $m_\lambda$. We see that the particle interpretation has in fact changed in the interacting theory from free particles to dressed particles (quasi-particles), so the “particles” we are dealing with here are not the particles that we know from the free theory. An expression analogous to the one in (4.118) holds in the case $x^0 < y^0$. Combining both cases one arrives at the Källén-Lehmann spectral representation of the two-point correlation function

$$ G^{(2)}(x,y) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, D_F(x-y; s) , \tag{4.119} $$

where $\rho(s)$ depends on the squared invariant mass $s$. This spectral density function is positive definite and given by

$$ \rho(s) = \sum_\lambda 2\pi \, \delta(s - m_\lambda^2) \, |\langle\Omega|\phi(0)|\lambda_0\rangle|^2 . \tag{4.120} $$


[Figure 4.7: complex $p^2$-plane with axes $\mathrm{Re}\,(p^2)$ and $\mathrm{Im}\,(p^2)$; features on the real axis labeled “one-particle pole” (at $m^2$), “bound-state poles”, and “multiparticle branch cut” (starting at $(2m)^2$).]

Figure 4.7: Analytic structure in the complex $p^2$-plane of the Fourier transform of the two-point correlation function for a typical interacting theory. The one-particle states lead to an isolated pole at $p^2 = m^2$. States of two or more free particles give a branch cut, while possible bound states show up as additional poles below $(2m)^2$.

The spectral density for a typical theory is plotted in Figure 4.6. We see that the states in the interacting theory that describe one-particle states correspond to an isolated delta function in the spectral density,

$$ \rho(s) = 2\pi \, \delta(s - m^2) \, Z + \text{nothing else until } s \gtrsim (2m)^2 . \tag{4.121} $$

The factor

$$ Z = |\langle\Omega|\phi(0)|\lambda_0\rangle|^2 , \tag{4.122} $$

is called the field-strength renormalization. It is the probability for $\phi(0)$ to create a one-particle state out of the vacuum $|\Omega\rangle$, and $m$ denotes the physical mass of the associated particle, being the energy eigenvalue in its rest frame. Notice that this physical mass is in general not equal to the bare mass parameter occurring in the Lagrangian of the $\phi^4$ theory (4.6). To make the distinction between physical and bare quantities manifest, we will hereafter indicate bare quantities by a subscript 0. It is important to realize that only the physical mass $m$ is directly observable, while the bare mass $m_0$ is not. In momentum space the spectral decomposition (4.119) reads

$$ \begin{aligned} \tilde G^{(2)}(p^2) &= \int d^4x \, e^{ipx} \, G^{(2)}(x,0) = \int_0^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon} \\ &= \frac{iZ}{p^2 - m^2 + i\epsilon} + \int_{\gtrsim (2m)^2}^\infty \frac{ds}{2\pi} \, \rho(s) \, \frac{i}{p^2 - s + i\epsilon} . \end{aligned} \tag{4.123} $$

The analytic structure of this function in the complex $p^2$-plane is depicted in Figure 4.7. The first term gives an isolated simple pole at $p^2 = m^2$, while the second term contributes a branch cut beginning at $p^2 = (2m)^2$. If there are any two-particle bound states, these will appear as additional delta functions in the spectral density entering (4.123) and thus as additional poles below the cut.

Let us compare the results we have obtained in this subsection to those found in Section 3.7 for the free theory. The Fourier transform of the Feynman propagator (i.e., the two-point

correlation function in the theory of a free scalar field) reads ($x^0 > 0$)

$$ \tilde D_F(p^2) = \int d^4x \, e^{ipx} \, D_F(x) = \int d^4x \, e^{ipx} \, \langle 0|T\phi(x)\phi(0)|0\rangle = \frac{i}{p^2 - m_0^2 + i\epsilon} , \tag{4.124} $$

and is the amplitude for a particle to propagate from $0$ to $x$. The relation (4.123) implies that the two-point correlation function of the most general theory of an interacting real scalar field $\phi$ takes a very similar form. The general expression is essentially a sum of scalar propagation amplitudes for states generated from the vacuum by the field $\phi(0)$. There are however two important differences between (4.123) and (4.124). First, (4.123) contains the field renormalization factor $Z$, which is one in the case of free fields. The latter statement is easily shown explicitly by evaluating the matrix elements $\langle 0|\phi(0)|p\rangle$ and is thus left as an exercise. Second, (4.123) contains contributions from multiparticle intermediate states with a continuous mass spectrum. In the free field theory, $\phi(0)$ can create only a single particle from $|0\rangle$. Notice that the generation of multiparticle states is the reason why the factor $Z$ in general differs from unity in the interacting theory.

Lehmann-Symanzik-Zimmermann Reduction Formula

So far we have seen that the Fourier transform of the two-point correlation function (4.123), considered as an analytic function of $p^2$, has a simple pole at the square of the physical mass of the one-particle states, while multiparticle intermediate states give weaker branch cut singularities. In the following we will find that this rather formal observation generalizes to higher-point correlation functions and plays a crucial role in the derivation of a general relation between Green’s functions and S-matrix elements. This relation was first derived by Harry Lehmann, Kurt Symanzik, and Wolfhart Zimmermann [3] and is today known as the LSZ reduction formula. Combining the LSZ reduction formula with our Feynman rules for computing correlation functions (4.105) will then give us a master formula for S-matrix elements in terms of Feynman diagrams. For simplicity, we will again carry out the whole analysis for the case of a real scalar field.
In the following we would like to use the single-particle pole structure of $\tilde G^{(2)}(p^2)$ in the vicinity of $p^2 \approx m^2$ to obtain the asymptotic “in” and “out” states of the theory and in particular their matrix elements,

$$ {}_{\rm out}\langle q_1, \ldots, q_n | p_A, p_B \rangle_{\rm in} = \langle q_1, \ldots, q_n | S | p_A, p_B \rangle . \tag{4.125} $$

These matrix elements are plane-wave amplitudes that describe the scattering of an initial two-particle momentum state $|p_A, p_B\rangle_{\rm in}$, constructed in the far past ($t = t_- \to -\infty$), into an $n$-particle momentum state $|q_1, \ldots, q_n\rangle_{\rm out}$, which represents the final-state particles in the far future ($t = t_+ \to \infty$).^{31} The basic idea to derive the desired master formula is as follows. In order to calculate the S-matrix element for a $2 \to n$ scattering process, we start with the correlation function involving $(n+2)$ Heisenberg fields. If we Fourier-transform this function with respect to the coordinate of any one of these fields, we will find a pole of the form (4.123) in the corresponding Fourier-transformed variable. We will argue that the one-particle states associated with these poles are in fact asymptotic states, i.e., states given by the limit of well-separated wave packets as they become concentrated around definite momenta. Taking the limit in which all $(n+2)$ external particles go on-shell, we can then interpret the coefficient of the multiple pole as an S-matrix element.

Footnote 31: Because human-built detectors are in general not able to resolve positions down to the de Broglie wavelengths of the particles, it is correct to work with plane-wave states in the Heisenberg picture rather than wave packets to describe the collision.

We first study the Fourier transform of the $(n+2)$-point correlation function with respect to one argument $x$,

$$ \int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle . \tag{4.126} $$

Here the shorthands $\phi_x = \phi(x)$, $\phi_1 = \phi(y_1)$, etc. have been used and all $\phi$’s are Heisenberg fields. We would now like to identify poles in the variable $p^0$. To do this, we divide the integral over $x^0$ into three regions,

$$ \int dx^0 = \int_{-\infty}^{t_-} dx^0 + \int_{t_-}^{t_+} dx^0 + \int_{t_+}^{\infty} dx^0 , \tag{4.127} $$

where $t_- < \min\{y_i^0\}$ and $t_+ > \max\{y_i^0\}$ with $i = 1, \ldots, n+1$. In the region $x^0 \in [t_-, t_+]$ the result of the integral is an analytic function of $p^0$ without poles, since the region is bounded and the integrand depends on $p^0$ through the analytic function $\exp(ip^0x^0)$. In the other two regions the integrand still has no poles, but the integration intervals are unbounded. Therefore singularities in $p^0$ may develop upon integration.

Consider the third region, i.e., $x^0 \in [t_+, \infty[$. In this case $x^0$ is the latest time, so $\phi_x$ stands first in the time-ordered product. In order to determine the pole structure of (4.126), we insert the completeness relation (4.114), assuming that the field $\phi$ has a vanishing VEV.^{32} The integral over the third region then becomes

$$ \int_{t_+}^\infty dx^0 \int d^3x \, e^{i(p^0x^0 - \mathbf{p}\cdot\mathbf{x})} \sum_\lambda \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{2E_{\mathbf{k}}(\lambda)} \, \langle\Omega|\phi(x)|\lambda_k\rangle \langle\lambda_k|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle . \tag{4.128} $$

Using (4.117) and including a damping factor $\exp(-\epsilon x^0)$ with infinitesimal $\epsilon$ to ensure that the integral is well-defined,^{33} the above integral takes the form

$$ \begin{aligned} & \sum_\lambda \int_{t_+}^\infty dx^0 \int \frac{d^3k}{(2\pi)^3} \, \frac{1}{2E_{\mathbf{k}}(\lambda)} \, e^{i(p^0 - k^0 + i\epsilon)x^0} \, \langle\Omega|\phi(0)|\lambda_0\rangle \, (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k}) \, \langle\lambda_k|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \Big|_{k^0 = E_{\mathbf{k}}(\lambda)} \\ &= \sum_\lambda \frac{1}{2E_{\mathbf{p}}(\lambda)} \, \frac{i \, e^{i(p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon)t_+}}{p^0 - E_{\mathbf{p}}(\lambda) + i\epsilon} \, \langle\Omega|\phi(0)|\lambda_0\rangle \, \langle\lambda_p|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle . \end{aligned} \tag{4.129} $$

Footnote 32: If this is not the case we reformulate the theory in terms of the $\bar\phi$ field.

Footnote 33: This regularization is equivalent to the $i\epsilon$ prescription used in (3.129) and the tilted time-axis prescription introduced in (4.86).
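The elementary time integral behind the second line of (4.129) can be checked numerically. The following Python sketch is an illustration only: it assumes a small but finite damping parameter `eps` in place of an infinitesimal one, truncates the integration at a finite upper limit, and uses function names that are chosen here and do not appear in the script.

```python
import cmath

def damped_time_integral(p0, energy, t_plus, eps=0.5, length=80.0, steps=80000):
    """Midpoint-rule estimate of the integral of exp(i (p0 - E + i eps) x0)
    over x0 from t_plus to t_plus + length (assumed cutoff; the damping makes
    the neglected tail exponentially small)."""
    a = complex(p0 - energy, eps)
    h = length / steps
    total = 0j
    for k in range(steps):
        x0 = t_plus + (k + 0.5) * h
        total += cmath.exp(1j * a * x0)
    return total * h

def closed_form(p0, energy, t_plus, eps=0.5):
    """Closed form i exp(i (p0 - E + i eps) t_plus) / (p0 - E + i eps), cf. (4.129)."""
    a = complex(p0 - energy, eps)
    return 1j * cmath.exp(1j * a * t_plus) / a

# Example call with arbitrary illustrative numbers
numeric = damped_time_integral(1.3, 1.0, 2.0)
exact = closed_form(1.3, 1.0, 2.0)
print(abs(numeric - exact))
```

The pole at $p^0 = E_{\mathbf{p}}(\lambda) - i\epsilon$ is visible in `closed_form` as the vanishing of the denominator; the numerical integral only confirms the prefactor and the phase.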

In deriving (4.129) we have used $\int d^3x \, \exp(-i(\mathbf{p} - \mathbf{k})\cdot\mathbf{x}) = (2\pi)^3 \delta^{(3)}(\mathbf{p} - \mathbf{k})$. The expression (4.129) has the same residue at $p^0 = E_{\mathbf{p}}(\lambda) - i\epsilon$ as the term $i/(p^2 - m_\lambda^2 + i\epsilon) = i/\big((p^0)^2 - (E_{\mathbf{p}}(\lambda))^2 + i\epsilon\big)$ appearing in the two-point correlation function (4.118). As before, this singularity will be either a single pole or a branch cut, depending on whether the rest energy $m_\lambda$ is isolated or not. The one-particle state in the far future corresponds to an isolated pole at the on-shell energy $p^0 = E_{\mathbf{p}}$. In this case, (4.129) gives

$$ \int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle \; \overset{p^0 \to E_{\mathbf{p}}}{\sim} \; \frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon} \; {}_{\rm out}\langle p|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle . \tag{4.130} $$

In order to obtain this result we have identified the matrix element $\langle\Omega|\phi(0)|\lambda_0\rangle$ appearing in (4.129) with $Z^{1/2}$ using (4.122), absorbing the leftover phase into the definition of $|\lambda_0\rangle$. We have furthermore used the notation $|p\rangle_{\rm out} = |\lambda_p\rangle_{\text{one-particle}}$ for a one-particle eigenstate with momentum $\mathbf{p}$ that is created at asymptotically large times in the future.

In order to evaluate the contribution from the first region, i.e., $x^0 \in \, ]-\infty, t_-]$, one puts $\phi_x$ last in the time-ordered product. Performing steps similar to the ones for the third integration region (the actual calculation is left as an exercise), one finds that the one-particle state in the far past corresponds to an isolated pole at the on-shell energy $p^0 = -E_{\mathbf{p}}$,

$$ \int d^4x \, e^{ipx} \, \langle\Omega|T(\phi_x \phi_1 \ldots \phi_{n+1})|\Omega\rangle \; \overset{p^0 \to -E_{\mathbf{p}}}{\sim} \; \frac{i\sqrt{Z}}{p^2 - m^2 + i\epsilon} \; \langle\Omega|T(\phi_1 \ldots \phi_{n+1})|{-p}\rangle_{\rm in} , \tag{4.131} $$

where $|{-p}\rangle_{\rm in} = |\lambda_{-p}\rangle_{\text{one-particle}}$ denotes the one-particle eigenstate with momentum $-\mathbf{p}$ which is constructed at asymptotically large times in the past.

We now want to repeat the same exercise for the remaining field coordinates $y_1$, etc. In the asymptotic treatment of multiparticle states it is, however, better to use normalized wave packets. In that case $x$ is constrained to lie within a small band about the trajectory of a particle with momentum $\mathbf{p}$, with the spatial extent of the band being determined by the wave packet. In this way the particles do not interfere and can effectively be considered free at asymptotic times, unlike plane-wave states. Instead of a simple Fourier transform $\int d^4x \, \exp(ipx)$, we should hence have used

$$ \int \frac{d^3q}{(2\pi)^3} \int d^4x \, e^{ip^0x^0} \, e^{-i\mathbf{q}\cdot\mathbf{x}} \, \psi(\mathbf{q}) , \tag{4.132} $$

in (4.126), where $\psi(\mathbf{q})$ is a function that is peaked around $\mathbf{p}$, and at the end taken the limit of a sharply peaked wave packet $\psi(\mathbf{q}) \to (2\pi)^3 \delta^{(3)}(\mathbf{q} - \mathbf{p})$. With this modification the right-hand side in (4.129) would turn into

$$ \begin{aligned} & \sum_\lambda \int \frac{d^3q}{(2\pi)^3} \, \psi(\mathbf{q}) \, \frac{1}{2E_{\mathbf{q}}(\lambda)} \, \frac{i}{p^0 - E_{\mathbf{q}}(\lambda) + i\epsilon} \, \langle\Omega|\phi(0)|\lambda_0\rangle \, \langle\lambda_q|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle \\ & \overset{p^0 \to E_{\mathbf{p}}}{\sim} \int \frac{d^3q}{(2\pi)^3} \, \psi(\mathbf{q}) \, \frac{i\sqrt{Z}}{\tilde p^2 - m^2 + i\epsilon} \; {}_{\rm out}\langle q|T(\phi_1 \ldots \phi_{n+1})|\Omega\rangle , \end{aligned} \tag{4.133} $$

where $\tilde p = (p^0, \mathbf{q})$. We see that the one-particle singularity is now a branch cut, whose length is the width in momentum space of the wave packet $\psi(\mathbf{q})$. It follows that if the width of the

$\psi(\mathbf{q})$ is taken to zero, the branch cut sharpens up to a pole. In this limit (4.133) reduces to the simple form (4.130). The same line of reasoning applies to the pole structure that appears in the far past. In this case one recovers (4.131).

The procedure described above can be generalized to the $(n+2)$-particle case we are interested in by integrating each of the coordinates against a wave packet. Let me spare you the gory details of the actual calculation and only tell you about the final result. It turns out that by smearing each coordinate one can extract the leading singularities, which are products of poles in the separate energy variables. The physics behind this factorization is that an $(n+2)$-particle asymptotic state is created/annihilated by $(n+2)$ field operators that are constrained to lie in distant wave packets and therefore are effectively localized. Under these conditions an $(n+2)$-particle excitation in the continuum can be represented by $(n+2)$ distinct (i.e., independent) one-particle excitations of the ground state. At the end one arrives at

$$ \begin{aligned} \tilde G^{(n+2)}(p_A, p_B, q_1, \ldots, q_n) &= \bigg( \prod_{i=A,B} \int d^4x_i \, e^{-ip_ix_i} \bigg) \bigg( \prod_{j=1}^n \int d^4y_j \, e^{iq_jy_j} \bigg) \langle\Omega|T(\phi_A \phi_B \phi_1 \ldots \phi_n)|\Omega\rangle \\ & \overset{p_i^0 \to E_{\mathbf{p}_i}, \; q_j^0 \to E_{\mathbf{q}_j}}{\sim} \bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \bigg) \bigg( \prod_{j=1}^n \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \bigg) \; {}_{\rm out}\langle q_1, \ldots, q_n|p_A, p_B\rangle_{\rm in} \\ &= \bigg( \prod_{i=A,B} \frac{i\sqrt{Z}}{p_i^2 - m^2 + i\epsilon} \bigg) \bigg( \prod_{j=1}^n \frac{i\sqrt{Z}}{q_j^2 - m^2 + i\epsilon} \bigg) \, \langle q_1, \ldots, q_n|S|p_A, p_B\rangle , \end{aligned} \tag{4.134} $$

where, in line with (4.130) and (4.131), the use of $\exp(-ip_ix_i)$ ensures that the particles in the “in” state have positive energy. The latter relation is the famous LSZ reduction formula. It implies that the S-matrix element involving two particles in the “in” state and $n$ particles in the “out” state can be obtained from the corresponding Fourier-transformed $(n+2)$-point correlation function by extracting the leading singularities in the energies $p_i^0$ and $q_j^0$, which coincide with the situations where the external particles become on-shell.

Diagrammatic Master Formula

Our final goal is to reformulate the above procedure in the language of Feynman diagrams. For concreteness, we will first analyze the relation between the diagrammatic expansion of the scalar field four-point function and the S-matrix element and then generalize this result to the case of $2 \to n$ scattering. We will consider explicitly the fully connected Feynman diagrams contributing to the Fourier-transformed correlation functions. By a similar analysis, it is straightforward to show that disconnected diagrams should be disregarded, because they do not have the singularity structure with a product of four (in general, $n+2$) poles appearing on the right-hand side of the LSZ reduction formula (4.134). The exact four-point correlation function is shown in Figure 4.8. In this figure we have indicated explicitly the diagrammatic corrections on each external leg. The light gray blob in

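[Figure 4.8: Feynman diagram with external legs labeled $p_A$, $p_B$, $q_1$, $q_2$ attached to a central blob labeled “amp.”.]

Figure 4.8: Structure of the exact four-point correlation function in scalar field theory.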

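the centre of the diagram represents the sum of all amputated four-point graphs,

[Diagrammatic equation (4.135): the blob labeled “amp.” equals the sum of all amputated four-point diagrams.]

while the dark gray circles indicate the two-point Green’s function, also known as the full propagator. The full propagator can be written as a Dyson series,

[Diagrammatic equation (4.136): the full propagator equals the free propagator, plus the free propagator with one 1PI insertion, plus the free propagator with two 1PI insertions, and so on.]

where

[Diagrammatic equation (4.137): the blob labeled “1PI” equals $-i\Sigma(p^2)$, the sum of all 1PI self-energy diagrams.]

is the collection of all one-particle irreducible (1PI) self-energy diagrams. Diagrams are called 1PI if they cannot be split in two by removing a single line. The Dyson series (4.136) is in fact a geometric series, which can be summed up according to

$$ \begin{aligned} [\text{full propagator}] &= \frac{i}{p^2 - m_0^2 + i\epsilon} + \frac{i}{p^2 - m_0^2 + i\epsilon} \left( -i\Sigma(p^2) \right) \frac{i}{p^2 - m_0^2 + i\epsilon} + \ldots \\ &= \frac{i}{p^2 - m_0^2 - \Sigma(p^2) + i\epsilon} . \end{aligned} \tag{4.138} $$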
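The resummation in (4.138) is easy to check numerically at a fixed value of $p^2$. The sketch below is an illustration under stated assumptions: the self-energy is replaced by a constant number `sigma` (its value at the chosen $p^2$), the $i\epsilon$ is kept as a tiny finite number, and all function names are hypothetical. The series only converges when $|\Sigma/(p^2 - m_0^2)| < 1$, which the example values respect.

```python
def free_propagator(p2, m0_sq, eps=1e-9):
    """Free propagator i / (p^2 - m0^2 + i eps)."""
    return 1j / (p2 - m0_sq + 1j * eps)

def dyson_partial_sum(p2, m0_sq, sigma, terms, eps=1e-9):
    """Sum of the first `terms` terms of the Dyson series (4.136).
    `sigma` is an assumed numerical value of Sigma(p^2) at this p^2."""
    d0 = free_propagator(p2, m0_sq, eps)
    ratio = (-1j * sigma) * d0  # one more 1PI insertion: factor (-i Sigma) * D0
    total, term = 0j, d0
    for _ in range(terms):
        total += term
        term *= ratio
    return total

def full_propagator(p2, m0_sq, sigma, eps=1e-9):
    """Closed form i / (p^2 - m0^2 - Sigma + i eps) from (4.138)."""
    return 1j / (p2 - m0_sq - sigma + 1j * eps)

# Illustrative numbers: ratio = Sigma / (p^2 - m0^2) = 0.2
print(dyson_partial_sum(5.0, 1.0, 0.8, 60))
print(full_propagator(5.0, 1.0, 0.8))
```

Close to the pole the ratio exceeds one and the partial sums diverge, while the closed form (4.138) remains meaningful; this is why the resummed expression, not the truncated series, is used to locate the physical mass.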

From (4.138) we see that the full propagator has a simple pole located at the physical mass $m$, which is shifted away from the bare mass $m_0$ by the self-energy:

$$ \left[ p^2 - m_0^2 - \Sigma(p^2) \right]_{p^2 = m^2} = 0 \quad \Longrightarrow \quad m^2 = m_0^2 + \Sigma(m^2) . \tag{4.139} $$

Notice that our sign convention for the 1PI self-energy $\Sigma(p^2)$ implies that a positive contribution to $\Sigma(p^2)$ corresponds to a positive shift of the scalar particle mass. Close to its simple pole at $p^2 \approx m^2$ the denominator of the full propagator (4.138) can be expanded in the following way

$$ p^2 - m_0^2 - \Sigma(p^2) = (p^2 - m^2) \left( 1 - \Sigma'(m^2) \right) + O\big( (p^2 - m^2)^2 \big) , \tag{4.140} $$

where $\Sigma'(m^2)$ stands for $\partial\Sigma(p^2)/\partial p^2 \big|_{p^2 = m^2}$. This implies that just like in the Källén-Lehmann spectral representation (4.123), the full propagator has a single-particle pole of the form

$$ [\text{full propagator}] \; \overset{p^0 \to E_{\mathbf{p}}}{\sim} \; \frac{iZ}{p^2 - m^2 + i\epsilon} + (\text{regular terms}) , \tag{4.141} $$

with

$$ Z = \frac{1}{1 - \Sigma'(m^2)} . \tag{4.142} $$

As a result, the sum of all fully connected $2 \to 2$ diagrams contains a product of four poles

$$ \frac{iZ}{p_A^2 - m^2 + i\epsilon} \, \frac{iZ}{p_B^2 - m^2 + i\epsilon} \, \frac{iZ}{q_1^2 - m^2 + i\epsilon} \, \frac{iZ}{q_2^2 - m^2 + i\epsilon} , \tag{4.143} $$

multiplying the amputated four-point diagrams. This is exactly the singularity structure on the right-hand side of the LSZ reduction formula (4.134). Comparing the coefficients of the product of poles, we conclude that the S-matrix element of the process $\phi(p_A)\phi(p_B) \to \phi(q_1)\phi(q_2)$ can be expressed through

$$ \langle q_1, q_2|S|p_A, p_B\rangle = \big( \sqrt{Z} \big)^4 \times [\text{blob “amp.” with incoming } p_A, p_B \text{ and outgoing } q_1, q_2] , \tag{4.144} $$

where the light gray blob represents the sum of amputated four-point diagrams with all external momenta being on-shell. This is the sought diagrammatic master formula for the case of $2 \to 2$ scattering of scalar fields. An identical analysis can be applied to the Fourier-transformed $(n+2)$-point correlator. In this case the relation between the S-matrix element and the Feynman graphs reads^{34}

$$ \langle q_1, \ldots, q_n|S|p_A, p_B\rangle = \big( \sqrt{Z} \big)^{n+2} \times [\text{blob “amp.” with incoming } p_A, p_B \text{ and outgoing } q_1, \ldots, q_n] . \tag{4.145} $$

Notice that the renormalization factors $Z^{1/2}$ are irrelevant for calculations at the leading order of perturbation theory, but are important in the calculation of higher-order corrections. This completes the derivation of the connection between scattering matrix elements and fully connected amputated Feynman diagrams.

Footnote 34: If the external particles are of different species, each has its own renormalization factor $Z^{1/2}$. Furthermore, if the particles have spin, there will be additional polarization factors on the right-hand side of the equation.
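The pole condition (4.139) and the residue formula (4.142) can be illustrated with a toy model. In the Python sketch below the self-energy is a hypothetical polynomial chosen purely for illustration (it is not the self-energy of $\phi^4$ theory), the pole is found by fixed-point iteration of $m^2 = m_0^2 + \Sigma(m^2)$, and the residue is compared with a finite-difference estimate; all names are assumptions made here.

```python
def sigma(p2):
    """Hypothetical toy self-energy Sigma(p^2); illustration only."""
    return 0.10 - 0.05 * p2 - 0.01 * p2 * p2

def sigma_prime(p2):
    """Derivative of the toy self-energy with respect to p^2."""
    return -0.05 - 0.02 * p2

def physical_mass_sq(m0_sq, iterations=200):
    """Solve m^2 = m0^2 + Sigma(m^2), eq. (4.139), by fixed-point iteration.
    Converges here because |Sigma'| is much smaller than one."""
    m_sq = m0_sq
    for _ in range(iterations):
        m_sq = m0_sq + sigma(m_sq)
    return m_sq

def field_strength(m_sq):
    """Z = 1 / (1 - Sigma'(m^2)), eq. (4.142)."""
    return 1.0 / (1.0 - sigma_prime(m_sq))

def residue_estimate(m0_sq, m_sq, delta=1e-6):
    """(p^2 - m^2) / (p^2 - m0^2 - Sigma(p^2)) evaluated slightly off the pole."""
    p2 = m_sq + delta
    return delta / (p2 - m0_sq - sigma(p2))

m_sq = physical_mass_sq(1.0)
print(m_sq, field_strength(m_sq), residue_estimate(1.0, m_sq))
```

The last two printed numbers should agree up to corrections of order `delta`, which is the content of the expansion (4.140).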

4.11

Decay Widths and Cross Sections

As in usual QM, also in QFT the probabilities for things to happen are the (modulus) square of the quantum amplitudes. In this subsection we will compute these probabilities, known as decay widths and cross sections. One small subtlety here is that any $T$-matrix element (4.75) comes with a factor of $(2\pi)^4 \delta^{(4)}(p_f - p_i)$, so that we end up with the square of a delta function. As we will see in a moment, this subtlety is a result of the fact that we are working in an infinite space.

Fermi’s Golden Rule

In order to start the discussion, let me derive something familiar, namely Fermi’s golden rule using Dyson’s formula (4.18). For two energy eigenstates $|m\rangle$ and $|n\rangle$ with $E_m \neq E_n$, one has in Born approximation

$$ \begin{aligned} \langle n|U(t)|m\rangle &= -i \, \langle n| \int_0^t dt' \, H_I(t') |m\rangle \\ &= -i \, \langle n|H_{\rm int}|m\rangle \int_0^t dt' \, e^{i\omega t'} \\ &= -\langle n|H_{\rm int}|m\rangle \, \frac{e^{i\omega t} - 1}{\omega} , \end{aligned} \tag{4.146} $$

where $\omega = E_n - E_m$ and we have used in the first step the equality (4.11) to express $H_I$ in terms of $H_{\rm int}$. The probability $P_{m \to n}(t)$ for the transition from $|m\rangle$ to $|n\rangle$ to happen in the time $t$ is thus given by

$$ P_{m \to n}(t) = |\langle n|U(t)|m\rangle|^2 = 2 \, |\langle n|H_{\rm int}|m\rangle|^2 \, \frac{1 - \cos(\omega t)}{\omega^2} . \tag{4.147} $$

The function $(1 - \cos(\omega t))/\omega^2$ is visualized in Figure 4.9. The $\omega$-dependence indicates that most transitions occur in a region between energy eigenstates separated by $\Delta E = 2\pi/t$, i.e., the half-width of the function. Looking at the figure one furthermore observes that as $t \to \infty$, the function shown in the plot approaches a delta function. In order to find the normalization, we evaluate

$$ \int_{-\infty}^{\infty} d\omega \, \frac{1 - \cos(\omega t)}{\omega^2} = \pi t . \tag{4.148} $$

This implies that

$$ \frac{1 - \cos(\omega t)}{\pi \omega^2 t} \; \overset{t \to \infty}{\longrightarrow} \; \delta(\omega) . \tag{4.149} $$

Consider now a distribution of final states with density $\rho(E_n)$. In this case one has to integrate over $E_n$ and obtains

$$ \begin{aligned} P_{m \to n}(t) &= \int dE_n \, \rho(E_n) \, |\langle n|U(t)|m\rangle|^2 = \int dE_n \, \rho(E_n) \, 2 \, |\langle n|H_{\rm int}|m\rangle|^2 \, \frac{1 - \cos(\omega t)}{\omega^2} \\ & \overset{t \to \infty}{\longrightarrow} \; 2\pi t \, |\langle n|H_{\rm int}|m\rangle|^2 \, \rho(E_m) . \end{aligned} \tag{4.150} $$

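[Figure 4.9: plot of the function versus $\omega$, with peak height $t^2/2$ at $\omega = 0$ and tick marks at $\omega = -2\pi/t$ and $\omega = 2\pi/t$.]

Figure 4.9: Graphical representation of $(1 - \cos(\omega t))/\omega^2$ appearing in $P_{m \to n}(t)$.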
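The normalization (4.148) can be confirmed numerically. The following Python sketch is only a rough check under stated assumptions: the integral is cut off at a finite `w_max`, the neglected tails are approximated by $2/w_{\max}$ (the cosine averages out there), and the function names are hypothetical.

```python
import math

def transition_kernel(omega, t):
    """The function (1 - cos(omega t)) / omega^2 from (4.147)."""
    return (1.0 - math.cos(omega * t)) / (omega * omega)

def kernel_integral(t, w_max=2000.0, steps=400000):
    """Midpoint-rule estimate of the integral in (4.148).
    Assumption: the region |omega| > w_max is replaced by its leading
    analytic estimate 2 / w_max."""
    h = 2.0 * w_max / steps
    total = 0.0
    for k in range(steps):
        # with an even number of steps the midpoints never hit omega = 0
        omega = -w_max + (k + 0.5) * h
        total += transition_kernel(omega, t)
    return total * h + 2.0 / w_max

for t in (1.0, 3.0):
    print(t, kernel_integral(t), math.pi * t)
```

Since the area grows like $t$ while the width of the central peak shrinks like $1/t$, dividing by $\pi t$ gives a unit-area function that becomes ever narrower, which is the statement (4.149).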

From (4.150) it follows that the probability for the transition per unit time for states around the same energy $E_m \approx E_n = E$ takes the form

$$ \dot P_{m \to n}(t) = 2\pi \, |\langle n|H_{\rm int}|m\rangle|^2 \, \rho(E) . \tag{4.151} $$

This result is known as Fermi’s golden rule. In the above derivation, we were rather careful with taking the limit $t \to \infty$. Suppose we were a little bit sloppier, and first chose to compute the amplitude for the initial state $|m\rangle$ at $t \to -\infty$ to evolve into the final state $|n\rangle$ at $t \to \infty$. Then we would get

$$ -i \, \langle n| \int_{-\infty}^{\infty} dt' \, H_I(t') |m\rangle = -i \, \langle n|H_{\rm int}|m\rangle \, 2\pi \, \delta(\omega) . \tag{4.152} $$

Now when squaring the amplitude, we find $P_{m \to n}(t) = |\langle n|H_{\rm int}|m\rangle|^2 \, (2\pi)^2 \, [\delta(\omega)]^2$. Tracking through the previous computation, we realize that the extra infinity arises because $P_{m \to n}(t)$ is the probability for the transition to happen in infinite time. We can thus write the delta functions as $(2\pi)^2 [\delta(\omega)]^2 = 2\pi \, \delta(\omega) \, t$, where $t$ is a shorthand for $t \to \infty$. We have stressed this point because the $T$-matrix element in (4.75) has been computed in the same way as (4.152), which means that we have to reinterpret the square of the delta function arising from $|\langle f|T|i\rangle|^2$ as a space-time volume factor.

Decay Rates

We would now like to calculate the probability for a single-particle initial state $|i\rangle$ of momentum $p_i$ and rest mass $m$ to decay into the final state $|f\rangle$ consisting of $n$ particles with total momentum $p_f = \sum_{j=1}^n q_j$. This quantity is given by the ratio

$$ P_n = \frac{|\langle f|S|i\rangle|^2}{\langle i|i\rangle \langle f|f\rangle} . \tag{4.153} $$

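The states $|i\rangle$ and $|f\rangle$ obey the relativistic normalization formula (3.64),

$$ \langle i|i\rangle = (2\pi)^3 \, 2E_{\mathbf{p}_i} \, \delta^{(3)}(0) = 2E_{\mathbf{p}_i} V , \tag{4.154} $$

where we have replaced the delta function $(2\pi)^3 \delta^{(3)}(0)$ by the volume $V$ of the space. Similarly, one has for the final state

$$ \langle f|f\rangle = \prod_{j=1}^n 2E_{\mathbf{q}_j} V . \tag{4.155} $$

If the initial-state particle is at rest, i.e., $E_{\mathbf{p}_i} = m$ and $\mathbf{p}_i = 0$, we get using (4.75) for the $i \to f$ decay probability

$$ \begin{aligned} P_n &= \frac{1}{2mV} \prod_{j=1}^n \frac{1}{2E_{\mathbf{q}_j} V} \left[ (2\pi)^4 \delta^{(4)}(p_f - p_i) \right]^2 |A(i \to f)|^2 \\ &= \frac{1}{2mV} \prod_{j=1}^n \frac{1}{2E_{\mathbf{q}_j} V} \, (2\pi)^4 \delta^{(4)}(p_f - p_i) \, |A(i \to f)|^2 \, V t . \end{aligned} \tag{4.156} $$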

Notice that in order to arrive at the second line we have replaced one of the delta functions $(2\pi)^4 \delta^{(4)}(0)$ by the space-time volume $V t$. We can now divide out $t$ to get the transition probability per unit time. After integrating over all possible momenta of the final-state particles, i.e., $V \int d^3q_j/(2\pi)^3$, we then obtain in terms of the relativistically-invariant $n$-body phase-space element^{35}

$$ d\Pi_n = (2\pi)^4 \delta^{(4)}(p_f - p_i) \prod_{j=1}^n \frac{d^3q_j}{(2\pi)^3} \, \frac{1}{2E_{\mathbf{q}_j}} , \tag{4.157} $$

the following expression for the partial decay width into the considered $n$-particle final state

$$ \Gamma_n = \frac{1}{2m} \int d\Pi_n \, |A(i \to f)|^2 . \tag{4.158} $$

Notice that the factors of the spatial volume $V$ in the measure $V \int d^3q_j/(2\pi)^3$ have cancelled those in (4.156), while the factors $1/(2E_{\mathbf{q}_j})$ in (4.156) have conspired with the 3-momentum integrals in $V \int d^3q_j/(2\pi)^3$ to produce Lorentz-invariant measures (3.61). In consequence, the density of final states (4.157) is a Lorentz-invariant quantity. After summation over all possible $n$-particle final states, one finally finds the so-called total decay width

$$ \Gamma = \frac{1}{2m} \sum_n \int d\Pi_n \, |A(i \to f)|^2 , \tag{4.159} $$

with $d\Pi_n$ corresponding to a given final state. The total decay width is equal to the reciprocal of the mean lifetime $\tau = 1/\Gamma$ of the decaying particle. If the decaying particle is not at rest, the decay rate becomes $m \Gamma/E_{\mathbf{p}_i}$. This leads to an increased lifetime $E_{\mathbf{p}_i} \tau/m = \tau/\sqrt{1 - v^2} = \gamma\tau$, where $v$ is the velocity of the decaying particle. Of course, this is a well-known effect related to time dilation. E.g., taking the muon lifetime at rest to be the laboratory value of about 2.2 µs, the lifetime of a cosmic-ray produced muon traveling at 98% of the speed of light is about five times longer.

Footnote 35: This object is in some textbooks denoted by $d{\rm PS}_n$.
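The time-dilation estimate for the muon is a one-line computation. The short Python sketch below works in units with $c = 1$; the rest lifetime of 2.2 µs is the rounded value quoted in the text, and the function names are chosen here for illustration.

```python
import math

def lorentz_gamma(v):
    """Lorentz factor 1 / sqrt(1 - v^2) for a velocity v in units of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

def dilated_lifetime(tau_rest, v):
    """Lifetime gamma * tau of a particle moving with velocity v."""
    return lorentz_gamma(v) * tau_rest

# Muon at 98% of the speed of light; tau_rest in microseconds (rounded value)
print(lorentz_gamma(0.98))
print(dilated_lifetime(2.2, 0.98))
```

The same factor $\gamma = E_{\mathbf{p}_i}/m$ is what rescales the decay rate to $m\Gamma/E_{\mathbf{p}_i}$ for a particle in flight.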

In terms of the partial and total decay width, (4.158) and (4.159), the branching ratio (or branching fraction) for the $n$-particle decay $i \to f$ reads

$$ B(i \to f) = \frac{\Gamma_n}{\Gamma} . \tag{4.160} $$

Needless to say, $B(i \to f) \in [0, 1]$ and $\sum_n B(i \to f) = 1$.

Cross Sections

Consider now a beam of particles of type $B$ hitting a target at rest consisting of particles of type $A$. The case of two colliding particle beams like $e^+e^-$ (LEP), $p\bar p$ (Tevatron) or $pp$ (LHC) can be obtained from this by an appropriate Lorentz boost. Let’s start by assuming constant densities $\rho_A$ and $\rho_B$ in the target and the beam over their whole extents $\ell_A$ and $\ell_B$. The number of scattering events will then be proportional to

$$ (\rho_A \ell_A) \, (\rho_B \ell_B) \, O , \tag{4.161} $$

where $O$ denotes the cross-sectional overlap area common to both the beam and the target. The experimental set-up is illustrated in Figure 4.10. The ratio

$$ \sigma = \frac{\#\,\text{scattering events}}{(O \rho_A \ell_A) \, (O \rho_B \ell_B)/O} = \frac{\#\,\text{scattering events}}{N_A N_B/O} , \tag{4.162} $$

defines the cross section $\sigma$ as the effective area of a chunk taken out of the beam by each particle in the target. The quantities $N_A$ and $N_B$ are the numbers of $A$ and $B$ particles that are relevant for scattering, i.e., the particles that at some point in time belong to the overlap between target and beam. Notice that all of this can be equally well formulated in terms of time-related quantities like the scattering rate and the incoming particle flux. Simply replace the number (#) of scattering events by the number of scattering events per second and $\ell_B \rho_B$ by the flux $v_B \rho_B$ of beam particles.

In reality $\rho_A$ and $\rho_B$ are not constant, since the colliding particles are described by wave packets and both target and beam have a density profile. However, the range of the interaction between the colliding particles is much smaller than the width of the individual wave packets perpendicular to the beam, which in turn is much smaller than the actual diameter of the beam. Therefore, to very good approximation $\rho_A$ and $\rho_B$ can be considered as locally constant on QM (i.e., interaction) length scales, whereas the density profiles inside the target and beam can be incorporated properly by averaging over the overlap region

$$ \ell_A \ell_B \int d^2x_\perp \, \rho_A(x_\perp) \, \rho_B(x_\perp) = N_A N_B/O . \tag{4.163} $$

Here $x_\perp$ is the spatial coordinate perpendicular to the beam. From this it follows that

$$ \#\,\text{scattering events} = \sigma \, N_A N_B/O , \tag{4.164} $$

where $\sigma$ can be calculated for effectively constant values of $\rho_A$ and $\rho_B$ corresponding to approximately plane-wave initial states. By the way, we do not have to restrict ourselves to the total

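[Figure 4.10: sketch of a beam (arrows, density $\rho_B$, velocity $v_B$, extent $\ell_B$) hitting a target (density $\rho_A$, extent $\ell_A$), with the overlap area $O$ indicated.]

Figure 4.10: Incident beam of particles with density $\rho_B$, extent $\ell_B$, and velocity $v_B$ hitting a target of density $\rho_A$ and extent $\ell_A$. The overlap area of the beam and target is denoted by $O$.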

number of scattering events. In a similar way we can study the cross section for scattering into the region $d^3q_1 \ldots d^3q_n$ around the $n$-particle final-state momentum point $\mathbf{q}_1, \ldots, \mathbf{q}_n$. This is actually what detectors usually do,^{36} since they detect particles with energy and momentum in certain finite bins, which are given by the detector resolution. These bins cannot resolve the momentum spread of any of the wave packets, so in the final state we should use plane waves as well. Calculating cross sections therefore amounts to computing transition probabilities in momentum space. These transition probabilities are universal in the sense that they are independent of details of the experiment, like the properties of the beams, the targets or the preparation of the initial-state particles.

Footnote 36: Provided that the particle positions cannot be resolved at the level of the de Broglie wavelengths of the particles, which typically is the case in human-built detectors.

Consider an initial state consisting of one target and one beam particle in the momentum state $|i\rangle = |p_A, p_B\rangle$ scattering into a final state $|f\rangle = |q_1, \ldots, q_n\rangle$. In analogy with the calculation that led to (4.159), the corresponding differential transition probability per unit time and flux is given by

$$ d\sigma = \frac{1}{F} \, \frac{1}{4E_{\mathbf{p}_A} E_{\mathbf{p}_B} V} \, |A(i \to f)|^2 \, d\Pi_n , \tag{4.165} $$

which is usually referred to as the differential cross section. In the latter expression $F$ stands for the flux associated with the incoming beam of particles. In the CM frame of the collision this flux reads

$$ F = \frac{|\mathbf{v}_{\rm rel}|}{V} = \frac{|\mathbf{v}_A - \mathbf{v}_B|}{V} = \frac{|\mathbf{p}_A/E_A - \mathbf{p}_B/E_B|}{V} = \frac{p_{\rm CM} E_{\rm CM}}{E_A E_B V} , \tag{4.166} $$

where $E_{\rm CM} = E_A + E_B$ is the total CM energy and $p_{\rm CM}$ is the momentum of either of the particles in the CM frame. To find this result we have used that the 4-momentum of a massive particle reads $p_0^\mu = (m, \mathbf{0})$ in its rest frame, and becomes $p^\mu = \big( \gamma (E_0 + \mathbf{v}\cdot\mathbf{p}_0), \gamma (\mathbf{p}_0 + E_0 \mathbf{v}) \big) = m\gamma \, (1, \mathbf{v})$ upon a boost with velocity $\mathbf{v}$. In the CM frame we thus find the expression

$$ \sigma_{\rm CM} = \frac{1}{4 |E_A p_B - E_B p_A|} \int d\Pi_n \, |A(i \to f)|^2 , \tag{4.167} $$

for the total cross section. The flux factor $\big( 4 |E_A p_B - E_B p_A| \big)^{-1}$ is not Lorentz invariant, but invariant under boosts along the beam direction, as expected for a cross-sectional area perpendicular to the beam.

Notice that the expression for $d\sigma$ as given in (4.165) is also valid for identical particles in the final state. Finding a set of particles in the required momentum bin effectively identifies the particles. However, when integrating $d\sigma$ to obtain the total cross section $\sigma_{\rm CM}$ for the scattering into the $n$ particles one has to restrict this integration to inequivalent configurations. E.g., the total cross section for the $2 \to 2$ reaction $N\bar N \to MM$ in the scalar Yukawa theory is obtained as $\sigma_{\rm CM} = 1/2 \int d\sigma$.
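The kinematic identity behind (4.166) and the flux factor in (4.167) can be checked with a few lines of Python. This is a sketch with illustrative inputs: the masses and the CM momentum are arbitrary numbers, momenta are taken along the beam axis with $\mathbf{p}_B = -\mathbf{p}_A$, and the function name is hypothetical.

```python
import math

def cm_flux_quantities(m_a, m_b, p_cm):
    """Return |v_A - v_B| E_A E_B, |E_A p_B - E_B p_A| and p_CM E_CM
    for a head-on collision in the CM frame (units with c = 1)."""
    e_a = math.sqrt(m_a**2 + p_cm**2)
    e_b = math.sqrt(m_b**2 + p_cm**2)
    p_a, p_b = p_cm, -p_cm  # momentum components along the beam axis
    rel_velocity = abs(p_a / e_a - p_b / e_b)
    lhs = rel_velocity * e_a * e_b
    mid = abs(e_a * p_b - e_b * p_a)
    rhs = p_cm * (e_a + e_b)
    return lhs, mid, rhs

# Arbitrary illustrative values for the two masses and the CM momentum
print(cm_flux_quantities(0.938, 0.140, 2.5))
```

All three returned numbers coincide, so that $4 E_A E_B V F = 4 |E_A p_B - E_B p_A| = 4 p_{\rm CM} E_{\rm CM}$ in this frame.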

4.12

Problems

¯ → M M at O(g 2 ) in the Yukawa i) Draw the Feynman diagrams that contribute to N N theory. Write down the corresponding amplitude and express your result through the Mandelstam variables. Can the amplitude develop a pole? Calculate the leading-order contribution (4.74) to M M → M M scattering in the limit of small external momenta, i.e., p21 M 2 etc. To do so, first relate the d-dimensional integral Z i 1 dd l =− Γ(1 − d/2) (M 2 )d/2−1 , (4.168) d 2 2 d/2 (2π) l − M (4π) to the Feynman integral appearing in (4.74), then set d = 4−2, and finally take the limit → 0. Can you recover the qualitative behavior of your result in the effective theory obtained after integrating out the nucleon fields? Think in terms of higher-dimensional operators. ¯ → NN ¯ scattering in the Yukawa theory. ii) Calculate the Born-level potential for N N Compare your result to (4.83). What does your finding tell you about the nature of scalar interactions? Compute the leading-order potential for φφ → φφ scattering in φ4 theory. What is the physical meaning of your result? iii) Consider the anharmonic oscillator with Hamiltonian H=

1 2 1 2 2 λ 3 p + ω x + x . 2 2 3!

(4.169)

The goal of this exercise is it to calculate matrix elements of the form hΩ|xn (0)|Ωi ,

101

(4.170)

where $n = 1, 2, \ldots$ and $|\Omega\rangle$ denotes the ground state of the perturbed Hamiltonian (we write $x^n(0)$ rather than $x^n$ because the time associated with these operators is important), using Feynman diagrams, and to compare the obtained results with those following from standard time-independent perturbation theory. There are two types of vertices in the possible Feynman diagrams, namely external and internal vertices. External vertices have a single line and correspond to the $x(0)$ factors entering the matrix elements (4.170). They are labeled by the time at which the operators are evaluated, $t = 0$ in our case. Internal vertices have three lines, corresponding to the perturbation $\lambda/(3!) \, x^3$ in (4.169), and are labeled by a parameter $t$. Each internal vertex has a different parameter. The Feynman diagrams are constructed using the following Feynman rules:

$$\begin{aligned}
\text{external vertex at time } 0 : &\quad 1 , \\
\text{internal vertex labeled } t : &\quad \frac{(-i\lambda)}{3!} \int dt , \\
\text{propagator between } s \text{ and } t : &\quad D(s,t) = \frac{1}{2\omega} \, e^{-i\omega |s-t|} .
\end{aligned} \tag{4.171}$$

Here the limits of the $t$-integration are $\mp(1 - i\epsilon)\infty$. We need the $i\epsilon$ because without it the Feynman integrals would not converge. Yet, the final results of the integrals turn out to be independent of $\epsilon$, so that in practice we could omit the $i\epsilon$ from our notation.

Draw the relevant Feynman diagrams that contribute at $O(1)$ and $O(\lambda)$ to the matrix elements (4.170) with $n = 1, 2, 3$. Determine the corresponding weight factors and calculate the graphs using ($a > 0$)

$$\int dt \, e^{-ia|t|} = \frac{2}{ia} . \tag{4.172}$$

Try to reproduce your results using standard time-independent perturbation theory. Proceed to compute the $O(\lambda^2)$ correction to $\langle \Omega | x^2(0) | \Omega \rangle$. Remember to employ (4.105). The Feynman integrals appearing in this case are of the form ($a, b, c > 0$)

$$\int ds \int dt \, e^{-ia|s|} \, e^{-ib|t|} \, e^{-ic|s-t|} = \frac{-2}{(a+b)(b+c)} + \frac{-2}{(a+b)(a+c)} + \frac{-2}{(a+c)(b+c)} . \tag{4.173}$$

This result can be easily derived from (4.172). If you want, verify that you get the same answer for the $O(\lambda^2)$ correction to $\langle \Omega | x^2(0) | \Omega \rangle$ using standard QM perturbation theory.

iv) Consider a Lorentz transformation $\Lambda$ that boosts $p_0^\mu = (m_\lambda, \mathbf{0})$ to $\Lambda^\mu{}_\nu \, p_0^\nu = (E_{\mathbf{p}}(\lambda), \mathbf{p})$. Show that $|\lambda_{\mathbf{p}}\rangle = U(\Lambda) |\lambda_0\rangle$ satisfies $P^\mu |\lambda_{\mathbf{p}}\rangle = \Lambda^\mu{}_\nu \, p_0^\nu \, |\lambda_{\mathbf{p}}\rangle$ where $P^\mu = (H - E_0, \mathbf{P})$. The ground state $|\Omega\rangle$ of any interacting theory has to be Poincaré invariant, since the vacuum ought not to have a preferred direction. Show that $\langle \Omega | \phi(x) | \Omega \rangle = v$ for any $\phi(x)$ if $\langle \Omega | \phi(0) | \Omega \rangle = v$.

Compute the spectral density function $\rho(s)$ and the field renormalization factor $Z$ of the free scalar theory by explicitly calculating $\langle 0 | \phi(0) | p \rangle$.

v) Evaluate the leading singularity of the Fourier-transformed $(n+2)$-point correlation function (4.126) arising from the first integration region in (4.127). You should find the result given in (4.131).

vi) In Section 4.10 we learnt that the two-point correlation function of the $\phi^4$ theory, viewed as an analytic function of the momentum $p^2$, has a branch-cut singularity associated with multiparticle intermediate states. This finding should not come as a surprise to those familiar with non-relativistic scattering theory, where the amplitudes considered as a function of energy have branch cuts on the positive real axis. The imaginary part of the scattering amplitude appears as a discontinuity across this branch cut. By the optical theorem the imaginary part of the forward-scattering amplitude is then proportional to the total cross section. In this exercise we will derive the QFT version of the optical theorem for a $2 \to 2$ scattering process.

Derive an equation for the product $T^\dagger T$ involving the $T$-matrix starting from the unitarity of the $S$-matrix. What is the physical reason for the unitarity of the $S$-matrix? Calculate the matrix element of this relation between the two-particle initial and final states $|p_1, p_2\rangle$ and $|q_1, q_2\rangle$. In order to compute the matrix element of $T^\dagger T$ insert a complete set of states $|k_i\rangle$ with $i = 1, \ldots, \infty$. Give a pictorial representation of the resulting identity. Set $p_1 = q_1$ and $p_2 = q_2$ and relate the matrix element of $T^\dagger T$ to a total cross section. Use this result to derive the standard form of the optical theorem, i.e.,

$$\mathrm{Im} \, A(p_1, p_2 \to p_1, p_2) = 2 E_{\rm CM} \, p_{\rm CM} \, \sigma(p_1, p_2 \to \text{anything}) . \tag{4.174}$$

Here $E_{\rm CM}$ is the total CM energy, $p_{\rm CM}$ is the momentum of either of the particles in the CM frame, and $\sigma(p_1, p_2 \to \text{anything})$ is the total cross section for the production of all final states.

vii) The generalized optical theorem

$$2 \, \mathrm{Im} \, A(a \to b) = \sum_f \int d\Pi_f \, A(a \to f) \, A^*(b \to f) \tag{4.175}$$

is true not only for $S$-matrix elements, but for any amplitude that we can define in terms of Feynman diagrams. Here $a$ and $b$ denote asymptotic states, the sum over $f$ runs over all possible sets of final states, and $d\Pi_f$ is the corresponding phase-space element (4.157). In this exercise we will learn that the optical theorem can also be used to deal with unstable particles, which never appear in asymptotic states.

Recall that the exact two-point function of a scalar field takes the form (4.138). Use the diagrammatic master formula (4.157) to derive a relation between the amplitude $A(p \to p)$ describing $1 \to 1$ “scattering” and the quantity $-i\Sigma(p^2)$ that is the sum of all 1PI insertions into the boson propagator (4.137).

The latter relation can be used to study the imaginary part of $\Sigma(p^2)$. In order to do so, we change the definition of the physical mass of the $\phi$-particle from (4.139) into

$$m^2 = m_0^2 + \mathrm{Re} \, \Sigma(m^2) . \tag{4.176}$$

Assume now that the full propagator (4.138) appears in the $s$ channel of a $2 \to 2$ Feynman diagram. Compute the cross section for the process in the vicinity of the resonance. Neglect all overall factors. Compare your result to the relativistic Breit-Wigner formula for the cross section in the region of a resonance,

$$\sigma \propto \left| \frac{1}{s - m^2 + i m \Gamma} \right|^2 . \tag{4.177}$$

Here $m$ is the mass of the resonance and $\Gamma$ is its width. Identify the width $\Gamma$ with the imaginary part of $\Sigma(p^2)$ assuming that the resonance is narrow, i.e., $\Gamma \ll m$. What does this mean for the lifetime of the particle? Calculate $\mathrm{Im} \, \Sigma(m^2)$, and hence $\Gamma$, using the optical theorem (4.175). You should recover the result (4.159).

viii) Derive the explicit form of the 2-body phase-space element $\int d\Pi_2$ from (4.157). Using your result calculate the angular differential cross section $(d\sigma/d\Omega)_{\rm CM}$ for a generic $2 \to 2$ process in the CM frame. The solid angle $d\Omega$ is given by $d\phi \, d\cos\theta$ where $\theta \in [0, \pi]$ is the polar scattering angle (with respect to the beam axis) and $\phi \in [0, 2\pi[$ the azimuthal scattering angle (around the beam axis). Consider also the case where the external particles all have the same mass. In this case you should obtain

$$\left( \frac{d\sigma}{d\Omega} \right)_{\rm CM} = \frac{|A(p_1, p_2 \to q_1, q_2)|^2}{64 \pi^2 s} , \tag{4.178}$$

where $A(p_1, p_2 \to q_1, q_2)$ represents the relevant scattering matrix element.

ix) The interactions of pions at low energy can be described by a phenomenological model called the linear sigma model,

$$H = \int d^3x \left( \frac{1}{2} \, \Pi_i^2 + \frac{1}{2} \, (\nabla \Phi_i)^2 + V(\Phi^2) \right) . \tag{4.179}$$

Here $\Phi_i$ with $i = 1, \ldots, N$ are real scalar fields and $\Pi_i$ denotes the conjugate momentum derived from $\Phi_i$. The potential is given by

$$V(\Phi^2) = \frac{1}{2} \, m^2 (\Phi_i)^2 + \frac{\lambda}{4} \, (\Phi_i^2)^2 . \tag{4.180}$$

Note that for $m^2 > 0$ and $\lambda = 0$ the above Hamiltonian just consists of $N$ copies of the Klein-Gordon Hamiltonian. If one now assumes $\lambda$ to be a small perturbation, one can calculate scattering amplitudes in a series expansion in $\lambda$. Show that the propagator of the $\Phi_i$ fields, i.e., the contraction of $\Phi_i(x)$ with $\Phi_j(y)$, is

$$\Phi_i(x) \Phi_j(y) = \delta_{ij} \, D_F(x - y) , \tag{4.181}$$

where $D_F(x - y)$ is the standard Klein-Gordon propagator with mass $m$. Show furthermore that there is one type of vertex, with four external legs carrying the indices $i$, $j$, $k$, $l$, given by

$$[\text{four-point vertex with legs } i, j, k, l] = -2i\lambda \left( \delta_{ij} \delta_{kl} + \delta_{il} \delta_{jk} + \delta_{ik} \delta_{jl} \right) . \tag{4.182}$$

A vertex involving two $\Phi_1$ and two $\Phi_2$ thus has the value $-2i\lambda$, while a vertex where four fields of the same type attach receives a factor $-6i\lambda$. Compute the cross sections for $\Phi_1 \Phi_2 \to \Phi_1 \Phi_2$, $\Phi_1 \Phi_1 \to \Phi_2 \Phi_2$, and $\Phi_1 \Phi_1 \to \Phi_1 \Phi_1$ scattering to first order in $\lambda$. Work in the CM frame.

Now consider the case $m^2 < 0$. In this case, the potential has a local maximum rather than a minimum at $\Phi_i = 0$. Since the potential is symmetric under $SO(N)$ rotations of $\Phi = (\Phi_1, \ldots, \Phi_N)$, we can choose to write the fields close to the new minimum as

$$\Phi(x) = \big( \pi_1(x), \ldots, \pi_{N-1}(x), v + \sigma(x) \big)^T , \tag{4.183}$$

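where $v$ is a constant chosen to minimize the potential (4.180), $\sigma(x)$ is a small deviation, and $\pi_i(x)$ denote the remaining fields, called pions. Show that with such a potential we have a theory of one massive sigma field and $(N - 1)$ massless pion fields, interacting through cubic and quartic potential terms. Assign Feynman rules to the propagators

$$\sigma(x) \sigma(y) = [\text{sigma propagator line}] , \qquad \pi_i(x) \pi_j(y) = [\text{pion propagator line from } i \text{ to } j] , \tag{4.184}$$

and the vertices

[Diagrams not reproduced: the cubic and quartic vertices of the $\sigma$ and $\pi_i$ fields, with pion legs labeled $i$, $j$ (and $k$, $l$ for the four-pion vertex).] (4.185)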

References

[1] J. Schwinger, “Renormalization Theory of Quantum Electrodynamics: An Individual View,” in “The Birth of Particle Physics”, Cambridge University Press (1983), 329 p.

[2] H. Yukawa, Proc. Phys. Math. Soc. Jap. 17, 48 (1935).

[3] H. Lehmann, K. Symanzik and W. Zimmermann, Nuovo Cim. 1, 205 (1955).


5 Dirac Theory

We have seen that quantization of scalar fields gives rise to spin-zero particles. But most particles in nature have an intrinsic angular momentum or spin. These arise naturally in field theory by considering fields which themselves transform non-trivially under the Lorentz group. In this section we will describe the Dirac equation, whose quantization gives rise to fermionic spin-1/2 particles.

In order to motivate the Dirac equation, we will start by studying the appropriate representation of the Lorentz group. We already know from Section 2.4 that if one considers infinitesimal Lorentz transformations (2.61), the matrices $\omega^{\mu\nu}$ (2.63) entering the transformations have to be antisymmetric. Such an object has six independent parameters, which agrees with the number of transformations of the Lorentz group, i.e., three rotations and three boosts. In the following it will turn out to be useful to introduce a basis of these six $4 \times 4$ antisymmetric matrices. We call our matrices $(M^{\alpha\beta})^{\mu\nu}$ with $\alpha, \beta, \mu, \nu = 0, 1, 2, 3$ and write the basis of six matrices as

$$(M^{\alpha\beta})^{\mu\nu} = \eta^{\alpha\mu} \eta^{\beta\nu} - \eta^{\beta\mu} \eta^{\alpha\nu} , \tag{5.1}$$

where the indices $\alpha$ and $\beta$ denote which basis element we are dealing with, while $\mu$ and $\nu$ belong to the $4 \times 4$ matrices. Notice that $(M^{\alpha\beta})^{\mu\nu}$ is antisymmetric in both $\alpha, \beta$ and $\mu, \nu$. If we use these matrices in practical applications (e.g., if we want to multiply them together or act on some field) we will typically need to lower one index,

$$(M^{\alpha\beta})^\mu{}_\nu = \eta^{\alpha\mu} \delta^\beta{}_\nu - \eta^{\beta\mu} \delta^\alpha{}_\nu . \tag{5.2}$$

Since we lowered the index with the Minkowski metric, we pick up various minus signs, which means that when written in this form, the matrices are no longer necessarily antisymmetric. E.g., one has

$$(M^{01})^\mu{}_\nu = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \qquad (M^{12})^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{5.3}$$

The matrix $M^{01}$, which is real and symmetric, generates boosts in the $x$ direction, while $M^{12}$ is real and antisymmetric and generates rotations in the $x$–$y$ plane. In terms of $M^{\alpha\beta}$, we can now write any infinitesimal $\omega^\mu{}_\nu$ as

$$\omega^\mu{}_\nu = \frac{1}{2} \, \Omega_{\alpha\beta} \, (M^{\alpha\beta})^\mu{}_\nu , \tag{5.4}$$

where the matrix $\Omega_{\alpha\beta}$ contains six numbers and is antisymmetric in $\alpha$ and $\beta$. This matrix parametrizes the Lorentz transformation we are doing. The six matrices $M^{\alpha\beta}$ are the generators of the Lorentz transformations. The generators obey the Lorentz Lie algebra relations,

$$[M^{\alpha\beta}, M^{\gamma\delta}] = \eta^{\beta\gamma} M^{\alpha\delta} - \eta^{\alpha\gamma} M^{\beta\delta} + \eta^{\alpha\delta} M^{\beta\gamma} - \eta^{\beta\delta} M^{\alpha\gamma} . \tag{5.5}$$

Here the matrix indices have been suppressed. A finite Lorentz transformation $\Lambda$ can be constructed from (5.4) by building the exponential

$$\Lambda = \exp\left( \frac{1}{2} \, \Omega_{\alpha\beta} M^{\alpha\beta} \right) . \tag{5.6}$$

Let me stress again what each of these objects is: the $M^{\alpha\beta}$ are six $4 \times 4$ basis elements of the Lorentz group, while the $\Omega_{\alpha\beta}$ are six numbers telling us what kind of Lorentz transformation we are doing.
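The algebra (5.5) can be spot-checked by brute force. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and the mixed-index form (5.2) of the generators; the names `ETA`, `generator`, `combine`, and `algebra_holds` are illustrative and not part of these notes.

```python
# Sketch: numerically check the Lorentz algebra (5.5) for the 4x4 generators (5.2).
# Helper names are illustrative; the metric signature (+,-,-,-) is assumed.
from itertools import product

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def generator(a, b):
    """Return (M^{ab})^mu_nu = eta^{a mu} delta^b_nu - eta^{b mu} delta^a_nu."""
    return [[ETA[a][mu] * (b == nu) - ETA[b][mu] * (a == nu) for nu in range(4)]
            for mu in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def combine(terms):
    """Linear combination sum_i c_i X_i of 4x4 matrices, given as (c_i, X_i) pairs."""
    return [[sum(c * x[i][j] for c, x in terms) for j in range(4)] for i in range(4)]


def commutator(x, y):
    return combine([(1, matmul(x, y)), (-1, matmul(y, x))])


def algebra_holds():
    """Check (5.5) for every choice of the four generator indices."""
    for a, b, c, d in product(range(4), repeat=4):
        lhs = commutator(generator(a, b), generator(c, d))
        rhs = combine([
            (ETA[b][c], generator(a, d)),
            (-ETA[a][c], generator(b, d)),
            (ETA[a][d], generator(b, c)),
            (-ETA[b][d], generator(a, c)),
        ])
        if lhs != rhs:
            return False
    return True


print(generator(0, 1))  # boost generator along x, compare with (5.3)
print(generator(1, 2))  # rotation generator in the x-y plane, compare with (5.3)
print(algebra_holds())
```

Since all entries are integers, the comparison is exact and no numerical tolerance is needed.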

5.1 Spinor Representation

We now want to find other matrices which satisfy the Lorentz algebra commutation relations (5.5). In the following, we will construct the spinor representation of the Lorentz group using a trick due to Dirac. We start by defining something which, at first sight, has nothing to do with the Lorentz group. It is the Clifford algebra (or Dirac algebra)

$$\{\gamma^\mu, \gamma^\nu\} = 2 \, \eta^{\mu\nu} \, 1 , \tag{5.7}$$

where $\{a, b\} = ab + ba$ is the usual anticommutator, $\gamma^\mu$ denotes a set of four matrices (the Dirac matrices), and $1$ is the $n \times n$ unit matrix with $n$ being the dimensionality of the representation. The relation (5.7) implies that we have to look for matrices $\gamma^\mu$ that satisfy $\gamma^\mu \gamma^\nu = -\gamma^\nu \gamma^\mu$ when $\mu \neq \nu$,

$$(\gamma^0)^2 = 1 , \tag{5.8}$$

and

$$(\gamma^i)^2 = -1 , \tag{5.9}$$

for $i = 1, 2, 3$. It is not difficult to convince oneself that the simplest representation of the Clifford algebra (for four-dimensional Minkowski space) is in terms of $4 \times 4$ matrices. In fact, there are many $4 \times 4$ matrices $\gamma^\mu$ that obey (5.7).$^{37}$ E.g., we may take the so-called Weyl or chiral representation,

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} , \tag{5.10}$$

where each element is a $2 \times 2$ matrix itself and $\sigma^i$ denotes the Pauli matrices

$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{5.11}$$

The latter matrices themselves satisfy $\{\sigma^i, \sigma^j\} = 2 \delta^{ij}$. Using these properties one easily shows that (5.10) indeed satisfies (5.7). This is left as a homework problem.

$^{37}$ One can construct any other representation of the Clifford algebra from a specific one by taking $M \gamma^\mu M^{-1}$ for any invertible matrix $M$. However, up to this equivalence, it turns out that there is a unique irreducible representation of the Clifford algebra, and the matrices (5.10) provide an example.
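For readers who want to cross-check their homework solution, the anticommutation relations (5.7) can be tested directly for the Weyl representation (5.10). This is only a sketch: the helper names (`block`, `anticommutator`, `clifford_holds`) are made up for illustration, and the matrices are stored as plain nested lists.

```python
# Sketch: check {gamma^mu, gamma^nu} = 2 eta^{mu nu} 1 for the Weyl representation (5.10).
# Helper names are illustrative; the metric signature (+,-,-,-) is assumed.

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma^1
    [[0, -1j], [1j, 0]],   # sigma^2
    [[1, 0], [0, -1]],     # sigma^3
]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg(m):
    return [[-x for x in row] for row in m]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def anticommutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] + yx[i][j] for j in range(4)] for i in range(4)]


# gamma^0 and gamma^i as given in (5.10)
GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [block(ZERO2, s, neg(s), ZERO2) for s in SIGMA]


def clifford_holds(gammas):
    """Return True if the four matrices satisfy (5.7)."""
    for mu in range(4):
        for nu in range(4):
            expected = [[2 * ETA[mu][nu] * (i == j) for j in range(4)] for i in range(4)]
            if anticommutator(gammas[mu], gammas[nu]) != expected:
                return False
    return True


print(clifford_holds(GAMMA))
```

All entries are $0$, $\pm 1$, or $\pm i$, so the products are exact and the matrices can be compared without a tolerance.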


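So what is the connection between the Clifford algebra and the Lorentz group? In order to answer the question, we consider the commutator of two Dirac matrices $\gamma^\mu$,

$$S^{\mu\nu} = \frac{1}{4} \, [\gamma^\mu, \gamma^\nu] . \tag{5.12}$$

In our representation (5.10), the $0i$ and $ij$ components of $S^{\mu\nu}$ are given explicitly by

$$S^{0i} = \frac{1}{2} \begin{pmatrix} -\sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} , \tag{5.13}$$

and

$$S^{ij} = -\frac{i}{2} \, \epsilon^{ijk} \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} . \tag{5.14}$$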

It is straightforward to show, and thus part of an exercise, that these matrices (irrespective of the representation) satisfy

$$[S^{\mu\nu}, \gamma^\kappa] = \gamma^\mu \eta^{\nu\kappa} - \gamma^\nu \eta^{\kappa\mu} , \tag{5.15}$$

and

$$[S^{\mu\nu}, S^{\kappa\lambda}] = \eta^{\nu\kappa} S^{\mu\lambda} - \eta^{\mu\kappa} S^{\nu\lambda} + \eta^{\mu\lambda} S^{\nu\kappa} - \eta^{\nu\lambda} S^{\mu\kappa} . \tag{5.16}$$

The latter equality tells us that the matrices $S^{\mu\nu}$ form a representation of the Lorentz algebra (5.5). We now also understand the physical meaning of (5.13) and (5.14). The former object induces a Lorentz boost, while the latter generates a three-dimensional rotation.

Dirac Spinors

The $S^{\mu\nu}$ are $4 \times 4$ matrices, because the $\gamma^\mu$ are. So far we haven't given an index to the rows and columns of these matrices. Let's call the indices $\alpha$ and $\beta$. We furthermore need a field that the $(S^{\mu\nu})^\alpha{}_\beta$ act upon. The sought field has to have four complex components labelled $\alpha$ and we call it $\psi^\alpha(x)$. This object is the famous Dirac spinor. Under Lorentz transformations, we have

$$\psi^\alpha(x) \to S(\Lambda)^\alpha{}_\beta \, \psi^\beta(x') , \tag{5.17}$$

where $x' = \Lambda^{-1} x$ and the full Lorentz transformation $S(\Lambda)$ takes the form

$$S(\Lambda) = \exp\left( \frac{1}{2} \, \Omega_{\gamma\delta} S^{\gamma\delta} \right) , \tag{5.18}$$

and the expression for $\Lambda$ is given in (5.6). Although the bases of generators $S^{\gamma\delta}$ and $M^{\gamma\delta}$ are different, we use the same six numbers $\Omega_{\gamma\delta}$ in both $S(\Lambda)$ and $\Lambda$. This ensures that we are doing the same Lorentz transformation on $\psi$ and $x$. Both $S(\Lambda)$ and $\Lambda$ are $4 \times 4$ matrices. So how can we be sure that the spinor representation (5.18) is something new, and isn't equivalent to the familiar vector representation (2.60)?
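Before turning to that question, the commutators (5.15) and (5.16) quoted above can be checked numerically. The sketch below does this in the Weyl representation (5.10) only, so it is a consistency check rather than a representation-independent proof; all helper names are illustrative and the metric signature $(+,-,-,-)$ is assumed.

```python
# Sketch: check the commutators (5.15) and (5.16) in the Weyl representation (5.10).
# Helper names are illustrative; the metric signature (+,-,-,-) is assumed.
from itertools import product

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(4)] for i in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def commutator(x, y):
    return add(matmul(x, y), scale(-1, matmul(y, x)))


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(4) for j in range(4))


GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA
]
# S[m][n] = (1/4) [gamma^m, gamma^n], cf. (5.12)
S = [[scale(0.25, commutator(GAMMA[m], GAMMA[n])) for n in range(4)] for m in range(4)]


def check_5_15():
    return all(
        close(commutator(S[m][n], GAMMA[k]),
              add(scale(ETA[n][k], GAMMA[m]), scale(-ETA[k][m], GAMMA[n])))
        for m, n, k in product(range(4), repeat=3)
    )


def check_5_16():
    return all(
        close(commutator(S[m][n], S[k][l]),
              add(scale(ETA[n][k], S[m][l]), scale(-ETA[m][k], S[n][l]),
                  scale(ETA[m][l], S[n][k]), scale(-ETA[n][l], S[m][k])))
        for m, n, k, l in product(range(4), repeat=4)
    )


print(check_5_15(), check_5_16())
```

Printing `S[0][1]` or `S[1][2]` gives the explicit block-diagonal boost and rotation generators of this representation.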


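In order to convince ourselves that the two representations are truly different, we look at rotations. If we write the rotation parameters as $\Omega_{ij} = -\epsilon_{ijk} \varphi^k$, then (5.6) and (5.18) become

$$\Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & \varphi^3 & -\varphi^2 \\ 0 & -\varphi^3 & 0 & \varphi^1 \\ 0 & \varphi^2 & -\varphi^1 & 0 \end{pmatrix} , \qquad S(\Lambda) = \begin{pmatrix} e^{i/2 \, \varphi \cdot \sigma} & 0 \\ 0 & e^{i/2 \, \varphi \cdot \sigma} \end{pmatrix} , \tag{5.19}$$

where in order to arrive at the right-hand sides one has to remember that $\Omega_{12} = -\Omega_{21} = -\varphi^3$, etc. We now consider a rotation by $2\pi$ around the $z$-axis, which means to take $\varphi = (0, 0, 2\pi)$. It follows that

$$\Lambda = \exp \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 2\pi & 0 \\ 0 & -2\pi & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = 1 , \qquad S(\Lambda) = \begin{pmatrix} e^{i\pi\sigma^3} & 0 \\ 0 & e^{i\pi\sigma^3} \end{pmatrix} = -1 . \tag{5.20}$$

This implies that under a $2\pi$ rotation a vector and a spinor transform as follows

$$A^\mu(x) \to A^\mu(x) , \qquad \psi^\alpha(x) \to -\psi^\alpha(x) . \tag{5.21}$$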

The latter relation tells us that spinors have the unintuitive property that a $2\pi$ rotation does not return them to their initial state, but a $4\pi$ rotation does. So $S(\Lambda)$ definitely differs from the vector representation $\Lambda$.

For later convenience let me also give explicitly the analogs of (5.19) for Lorentz boosts. Writing the boost parameter as $\Omega_{i0} = -\Omega_{0i} = \chi_i$, one finds

$$\Lambda = \exp \begin{pmatrix} 0 & -\chi_1 & -\chi_2 & -\chi_3 \\ -\chi_1 & 0 & 0 & 0 \\ -\chi_2 & 0 & 0 & 0 \\ -\chi_3 & 0 & 0 & 0 \end{pmatrix} , \qquad S(\Lambda) = \begin{pmatrix} e^{1/2 \, \chi \cdot \sigma} & 0 \\ 0 & e^{-1/2 \, \chi \cdot \sigma} \end{pmatrix} . \tag{5.22}$$

Another important question to ask is whether or not $S(\Lambda)$ is a unitary representation of the Lorentz group.$^{38}$ From (5.18), we infer that $S(\Lambda)$ is unitary if $S^{\mu\nu}$ is anti-hermitian, i.e., $(S^{\mu\nu})^\dagger = -S^{\mu\nu}$. But we have

$$(S^{\mu\nu})^\dagger = -\frac{1}{4} \, [(\gamma^\mu)^\dagger, (\gamma^\nu)^\dagger] , \tag{5.23}$$

which can be anti-hermitian if all $\gamma^\mu$ are hermitian or all are anti-hermitian. However, we can never arrange for this to happen since (5.8) and (5.9) imply that $S^{\mu\nu}$ has both real and imaginary eigenvalues, and an anti-hermitian matrix ought only to have imaginary ones. E.g., in the Weyl representation (5.10), we have the property

$$(\gamma^0)^\dagger = \gamma^0 , \qquad (\gamma^i)^\dagger = -\gamma^i . \tag{5.24}$$

$^{38}$ Notice that using the Weyl representation the relations (5.13) and (5.14) already tell us explicitly that rotations are unitary while boosts are not. This observation is also true for the vector representation.
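The $2\pi$-rotation property (5.20) and the (non-)unitarity statement of the footnote can be illustrated numerically. The following sketch assumes the Weyl representation (5.10) and the parametrization $\Omega_{12} = -\Omega_{21} = -\varphi^3$; the matrix exponential is a truncated Taylor series, which is adequate for these small matrices but is not a general-purpose routine, and all helper names are illustrative.

```python
# Sketch: compare a rotation about the z-axis in the vector and spinor representations.
# Helper names are illustrative; expm() is a truncated Taylor series, not a library routine.
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def block(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(4)] for i in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(x, y, tol=1e-9):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(4) for j in range(4))


def expm(a, terms=80):
    """Matrix exponential via a truncated Taylor series."""
    result, term = IDENTITY, IDENTITY
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result


GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA
]
# Spinor generators (5.12) and vector generators (5.2)
S = [[scale(0.25, add(matmul(GAMMA[m], GAMMA[n]), scale(-1, matmul(GAMMA[n], GAMMA[m]))))
      for n in range(4)] for m in range(4)]
M = [[[[ETA[a][mu] * (b == nu) - ETA[b][mu] * (a == nu) for nu in range(4)]
       for mu in range(4)] for b in range(4)] for a in range(4)]


def z_rotation(phi, generators):
    """exp((1/2) Omega_{ab} G^{ab}) with Omega_12 = -Omega_21 = -phi, cf. (5.6) and (5.18)."""
    exponent = add(scale(-0.5 * phi, generators[1][2]), scale(0.5 * phi, generators[2][1]))
    return expm(exponent)


LAMBDA = z_rotation(2 * math.pi, M)
S_LAMBDA = z_rotation(2 * math.pi, S)

print(close(LAMBDA, IDENTITY))                      # vector representation
print(close(S_LAMBDA, scale(-1, IDENTITY)))         # spinor representation
print(close(dagger(S[1][2]), scale(-1, S[1][2])))   # is the rotation generator anti-hermitian?
print(close(dagger(S[0][1]), S[0][1]))              # is the boost generator hermitian?
```

Calling `z_rotation(4 * math.pi, S)` shows how the spinor behaves under a $4\pi$ rotation.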


In fact the Lorentz group, being non-compact, has no finite-dimensional representations that are unitary. But this does not matter to us, since our spinor $\psi$ is not a QM wavefunction, but a classical field.

Dirac Action

With the new field $\psi$ at hand we now want to construct Lorentz-invariant EOMs involving it. In order to do this we try to write down a Lorentz-invariant action that is “bi-linear” in $\psi$. We consider the product

$$\psi^\dagger(x) \psi(x) = (\psi^*)^T(x) \, \psi(x) , \tag{5.25}$$

where $\psi^\dagger(x)$ is the usual adjoint of a multi-component object. Under a Lorentz transformation $\Lambda$, one has

$$\psi^\dagger(x) \psi(x) \to \psi^\dagger(x') \, S^\dagger(\Lambda) S(\Lambda) \, \psi(x') , \tag{5.26}$$

which is not Lorentz invariant since $S(\Lambda)$ is not unitary, i.e., $S^\dagger(\Lambda) S(\Lambda) \neq 1$. This means that $\psi^\dagger \psi$ is not a Lorentz scalar and thus not the right building block for constructing the action. Yet, it is easy to see what went wrong and to correct for it. From (5.24) we find that for $\mu = 0, 1, 2, 3$, one has

$$(\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0 , \tag{5.27}$$

which in turn implies that

$$(S^{\mu\nu})^\dagger = -\gamma^0 S^{\mu\nu} \gamma^0 , \tag{5.28}$$

and

$$S^\dagger(\Lambda) = \gamma^0 S^{-1}(\Lambda) \gamma^0 . \tag{5.29}$$

This suggests that instead of $\psi^\dagger$ we should better use

$$\bar\psi(x) = \psi^\dagger(x) \gamma^0 , \tag{5.30}$$

as a building block in our Dirac action. This object is called the adjoint Dirac spinor.

Equipped with $\psi$ and $\bar\psi$ let us now see what kind of Lorentz covariant objects we can form out of them. We first consider $\bar\psi \psi$. It is a simple exercise to show that this object transforms under a Lorentz transformation $\Lambda$ as

$$\bar\psi(x) \psi(x) \to \bar\psi(x') \psi(x') , \tag{5.31}$$

which tells us that it is a Lorentz scalar. A term $\int d^4x \, \bar\psi(x) \psi(x)$ is thus Lorentz invariant since $\det(\Lambda) = 1$. Next we consider $\bar\psi \gamma^\mu \psi$. This term has the following transformation property

$$\bar\psi(x) \gamma^\mu \psi(x) \to \Lambda^\mu{}_\nu \, \bar\psi(x') \gamma^\nu \psi(x') , \tag{5.32}$$

under Lorentz transformations. This claim is proven as part of an exercise. From (5.32) we infer that $\bar\psi \gamma^\mu \psi$ is a Lorentz vector. This means that we can treat the $\mu$ index on the $\gamma^\mu$ matrices as a true vector index. In particular, we can form Lorentz scalars by contracting it with other Lorentz indices. As a result terms like $\int d^4x \, \bar\psi(x) \slashed{A}(x) \psi(x)$ and $\int d^4x \, \bar\psi(x) \slashed{\partial} \psi(x)$ are Lorentz invariant. Here we have introduced the shorthand notation $\slashed{a} = \gamma^\mu a_\mu$ for any contravariant vector $a^\mu$. Finally, we consider $\bar\psi \sigma^{\mu\nu} \psi$ with $\sigma^{\mu\nu} = i/2 \, [\gamma^\mu, \gamma^\nu]$. Not surprisingly this object behaves like a Lorentz tensor

$$\bar\psi(x) \sigma^{\mu\nu} \psi(x) \to \Lambda^\mu{}_\kappa \Lambda^\nu{}_\lambda \, \bar\psi(x') \sigma^{\kappa\lambda} \psi(x') . \tag{5.33}$$

This result is again easy to derive by considering separately the properties of $\psi$, $\bar\psi$, and $\gamma^\mu$ under Lorentz transformations. From (5.33) it follows that terms like $\int d^4x \, \bar\psi(x) \sigma^{\mu\nu} \psi(x) F_{\mu\nu}(x)$, where all indices are contracted, are Lorentz invariant.

We are now equipped with the three bi-linears $\bar\psi \psi$, $\bar\psi \gamma^\mu \psi$, and $\bar\psi \sigma^{\mu\nu} \psi$, each of which transforms covariantly under the Lorentz group. We can try to build a Lorentz-invariant action from these. In fact, we need only the first two terms. We write

$$S = \int d^4x \, \bar\psi(x) \left( i \slashed{\partial} - m \right) \psi(x) . \tag{5.34}$$

This is the Dirac action we were looking for. Since $[S] = 0$, $[d^4x] = -4$, $[\partial_\mu] = 1$, and $[m] = 1$, we can read off the mass dimension of the spinor field and its adjoint. We have $[\psi] = [\bar\psi] = 3/2$. The factor of “$i$” is there to make the action (5.34) real. Upon complex conjugation, it cancels a minus sign that comes from integration by parts. As we will see soon, after quantization the Dirac theory describes particles and antiparticles of mass $|m|$ and spin 1/2. Notice that the Lagrangian is of first order, rather than the second-order Lagrangians we were working with for scalar fields. Also, the mass parameter appears in the Lagrangian as $m$, which can be positive or negative.

Dirac Equation

The EOMs for $\psi$ and $\bar\psi$ follow from (5.34) by varying independently with respect to $\bar\psi$ and $\psi$, respectively. In the first case, we obtain$^{39}$

$$\left( i \slashed{\partial} - m \right) \psi = 0 . \tag{5.35}$$

This is the Dirac equation. In the second case it follows that

$$\bar\psi \left( i \overleftarrow{\slashed{\partial}} + m \right) = 0 , \tag{5.36}$$

which is the hermitian-conjugate form of (5.35). Here the derivative acts to the left. Both (5.35) and (5.36) are first order in derivatives, yet miraculously Lorentz invariant. As a homework assignment you are asked to show this explicitly. In contrast, in the case of a scalar field a first-order EOM would necessarily break Lorentz invariance, because one would always need to introduce a privileged vector that saturates the open index of $\partial_\mu$. The $\gamma^\mu$ matrices provide this index in the case of the Dirac equation.

It is also important to realize that the Dirac equation mixes up different components of $\psi$ through $\gamma^\mu$. However, each individual component itself solves the Klein-Gordon equation (2.46). In order to see this, we compute

$$\begin{aligned}
0 &= -\left( i \gamma^\mu \partial_\mu + m \right) \left( i \gamma^\nu \partial_\nu - m \right) \psi = \left( \gamma^\mu \gamma^\nu \partial_\mu \partial_\nu + m^2 \right) \psi \\
&= \left( 1/2 \, \{\gamma^\mu, \gamma^\nu\} \, \partial_\mu \partial_\nu + m^2 \right) \psi = \left( \partial_\mu \partial^\mu + m^2 \right) \psi .
\end{aligned} \tag{5.37}$$

$^{39}$ Hereafter we will often drop the coordinate $x$ in $\psi(x)$ etc.


The final expression contains no $\gamma^\mu$ matrices, and so applies to each component $\psi^\alpha$ of the spinor field separately.

Chiral Spinors

We have seen that in the chiral representation (5.10) both the spinor rotations (5.19) and boosts (5.22) are block diagonal. This means that the Dirac representation is reducible. It decomposes into two irreducible representations, acting only on two-component spinors $\psi_{L,R}$, which in the chiral representation are defined by

$$\psi = \begin{pmatrix} \psi_R \\ \psi_L \end{pmatrix} . \tag{5.38}$$

The two-component objects $\psi_{L,R}$ are called chiral spinors and the labels $L, R$ stand for left- and right-handed chirality. They transform in the same way under rotations, but oppositely under boosts:

$$\psi_{L,R} \to e^{i/2 \, \varphi \cdot \sigma} \, \psi_{L,R} , \qquad \psi_{L,R} \to e^{\mp 1/2 \, \chi \cdot \sigma} \, \psi_{L,R} . \tag{5.39}$$

In group theory language $\psi_L$ is in the $(1/2, 0)$ representation of the Lorentz group, $\psi_R$ is in the $(0, 1/2)$ representation, and $\psi$ belongs to $(1/2, 0) \oplus (0, 1/2)$.$^{40}$ Strictly speaking, the Dirac spinor is a representation of the double cover of $SO^+(1,3) \cong SL(2, \mathbb{C})/\mathbb{Z}_2$. Here $SO^+(1,3)$ denotes the proper, orthochronous or restricted Lorentz group, which consists of those Lorentz transformations that preserve the orientation of space and direction of time, while $SL(2, \mathbb{C})$ is the complex special linear group and $\mathbb{Z}_2$ is the two element cyclic group. The fact that the Lorentz group is doubly connected is the source of the rotation-by-$4\pi$ property (5.21) of spinors.

The relations (5.38) and (5.39) correspond to the chiral representation. But what happens if we choose a different representation $\gamma^\mu$ of the Clifford algebra, where the Lorentz group matrices $S(\Lambda)$ are not block diagonal? Is there an invariant way to define chiral spinors? We can do this by defining the “fifth” Dirac matrix

$$\gamma^5 = i \gamma^0 \gamma^1 \gamma^2 \gamma^3 = -\frac{i}{4!} \, \epsilon^{\mu\nu\kappa\lambda} \gamma_\mu \gamma_\nu \gamma_\kappa \gamma_\lambda , \tag{5.40}$$

where $\epsilon^{\mu\nu\kappa\lambda}$ is the totally antisymmetric Levi-Civita tensor with $\epsilon^{0123} = -\epsilon_{0123} = -1$. The $\gamma^5$ matrix has the following properties, all of which can be verified using (5.40) and the anticommutation relations (5.7):

$$(\gamma^5)^\dagger = \gamma^5 , \qquad (\gamma^5)^2 = 1 , \qquad \{\gamma^5, \gamma^\mu\} = 0 . \tag{5.41}$$

The reason that this Dirac matrix is called $\gamma^5$ is that the set of matrices $\gamma^M = \{\gamma^\mu, i\gamma^5\}$ satisfies the five-dimensional Clifford algebra, i.e., $\{\gamma^M, \gamma^N\} = 2 \eta^{MN}$ where $M, N = 0, 1, 2, 3, 4$. It is also not difficult to check that

$$[S^{\mu\nu}, \gamma^5] = 0 , \tag{5.42}$$

$^{40}$ Using this terminology, scalars belong to the $(0, 0)$ representation, vectors are in the $(1/2, 1/2)$ representation, and the electromagnetic field-strength tensor transforms as $(1, 0) \oplus (0, 1)$ under the Lorentz group.


which means that $\gamma^5$ is a scalar$^{41}$ under rotations and boosts. The latter relation also tells us that eigenvectors of $\gamma^5$ whose eigenvalues are different transform without mixing, and as a result the Dirac representation must be reducible. This criterion for reducibility is Schur's lemma. It follows that

$$P_{L,R} = \frac{1 \mp \gamma^5}{2} , \tag{5.43}$$

form Lorentz-invariant projection operators. They satisfy (please show this)

$$P_{L,R}^2 = P_{L,R} , \qquad P_{L,R} P_{R,L} = 0 , \qquad P_L + P_R = 1 . \tag{5.44}$$

One can also check easily that for the Weyl representation (5.10), one has explicitly

$$\gamma^5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{5.45}$$

We see that $P_{L,R}$ project onto left- and right-handed spinors, i.e., $\psi_{L,R} = P_{L,R} \psi$.

Weyl Equations

The Dirac Lagrangian can be written in terms of the chiral fields (5.38) as

$$\mathcal{L} = \bar\psi \left( i \slashed{\partial} - m \right) \psi = i \left( \bar\psi_L \slashed{\partial} \psi_L + \bar\psi_R \slashed{\partial} \psi_R \right) - m \left( \bar\psi_L \psi_R + \bar\psi_R \psi_L \right) , \tag{5.46}$$

where $\bar\psi_{L,R} = \bar\psi P_{R,L}$. After a slight change of notation,

$$\sigma^\mu = (1, \sigma) , \qquad \bar\sigma^\mu = (1, -\sigma) , \tag{5.47}$$

and multiplying with $\bar\psi = \bar\psi_L + \bar\psi_R$ from the left the corresponding EOMs read

$$i \left( \psi_L^\dagger \sigma^\mu \partial_\mu \psi_L + \psi_R^\dagger \bar\sigma^\mu \partial_\mu \psi_R \right) - m \left( \psi_L^\dagger \psi_R + \psi_R^\dagger \psi_L \right) = 0 . \tag{5.48}$$

We see that a massive fermion requires both components $\psi_L$ and $\psi_R$, since they are coupled via the mass term, which is chirality flipping. The kinetic term on the other hand is chirality conserving. This means that a massless fermion can be described by a single Weyl spinor $\psi_L$ or $\psi_R$ alone. The corresponding Euler-Lagrange equations go by the name of Weyl equations:

$$i \sigma^\mu \partial_\mu \psi_L = 0 , \qquad i \bar\sigma^\mu \partial_\mu \psi_R = 0 . \tag{5.49}$$

In many practical applications it is overwhelmingly convenient to employ two-component Weyl spinor notation, rather than the four-component Dirac spinors. This is due to the fact that the Lagrangian of the SM and essentially all of its extensions violate parity, i.e., the left- and right-handed fermionic components couple differently to the electroweak gauge group. If one uses four-component spinor notation, then there are a lot of clumsy left- and right-handed projection operators. This is not the case if one employs the two-component Weyl fermion notation, which treats fermionic dofs with different gauge quantum numbers separately from the start, as nature intended for us to do. Plenty of details on and many useful techniques to deal with Weyl fermions can be found in [1], which I highly recommend for further reading.

$^{41}$ In fact, we will see soon that it is a pseudo-scalar and not a scalar.
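The properties (5.41), (5.42), and (5.44) of $\gamma^5$ and of the chiral projectors used in this subsection can be spot-checked numerically. The sketch below builds $\gamma^5$ from the first equality in (5.40) in the Weyl representation (5.10). The overall sign of $\gamma^5$, and hence which two components $P_L$ selects, depends on the sign convention adopted for $\gamma^5$; the checks below are insensitive to it. All helper names are illustrative.

```python
# Sketch: check the gamma^5 relations (5.41), (5.42), and (5.44) in the Weyl representation.
# Helper names are illustrative; gamma^5 is built from the first equality in (5.40).

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]


def block(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(4)] for i in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(4) for j in range(4))


GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA
]
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))
P_L = scale(0.5, add(IDENTITY, scale(-1, GAMMA5)))
P_R = scale(0.5, add(IDENTITY, GAMMA5))


def gamma5_properties():
    """The three relations in (5.41)."""
    hermitian = close(dagger(GAMMA5), GAMMA5)
    squares_to_one = close(matmul(GAMMA5, GAMMA5), IDENTITY)
    anticommutes = all(close(add(matmul(GAMMA5, g), matmul(g, GAMMA5)), ZERO) for g in GAMMA)
    return hermitian and squares_to_one and anticommutes


def commutes_with_generators():
    """Relation (5.42) for all S^{mu nu} = (1/4) [gamma^mu, gamma^nu]."""
    for m in range(4):
        for n in range(4):
            s = scale(0.25, add(matmul(GAMMA[m], GAMMA[n]),
                                scale(-1, matmul(GAMMA[n], GAMMA[m]))))
            if not close(matmul(s, GAMMA5), matmul(GAMMA5, s)):
                return False
    return True


def projector_properties():
    """The three relations in (5.44)."""
    idempotent = close(matmul(P_L, P_L), P_L) and close(matmul(P_R, P_R), P_R)
    orthogonal = close(matmul(P_L, P_R), ZERO) and close(matmul(P_R, P_L), ZERO)
    complete = close(add(P_L, P_R), IDENTITY)
    return idempotent and orthogonal and complete


def kinetic_term_conserves_chirality():
    """gamma^mu P_L = P_R gamma^mu, so psi-bar gamma^mu psi does not mix psi_L with psi_R."""
    return all(close(matmul(g, P_L), matmul(P_R, g)) for g in GAMMA)


print(gamma5_properties(), commutes_with_generators(),
      projector_properties(), kinetic_term_conserves_chirality())
```

The last check is the matrix statement behind the remark that the kinetic term is chirality conserving while the mass term is chirality flipping.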


Dofs Counting

At this point a couple of comments about the counting of dofs are in order. In classical mechanics, the number of dofs of a system is equal to the dimension of the configuration space or, equivalently, half the dimension of the phase space. In field theory we have an infinite number of dofs, but it makes sense to count the number of dofs per spatial point, which at least should be finite. E.g., in this sense a real scalar field $\phi$ has a single dof. At the quantum level, this translates to the fact that it gives rise to a single type of particle. A classical complex scalar field, on the other hand, has two dofs, corresponding to the particle and its antiparticle in the QFT.

But what about a Dirac spinor? One might think that there are eight dofs, since $\psi$ has four complex components. But this is wrong! Crucially, and in contrast to the scalar field, the EOM of $\psi$ is first order rather than second order. In particular, for the Dirac theory, the momentum conjugate to the spinor $\psi$ is given by

$$\pi_\psi = \frac{\partial \mathcal{L}}{\partial \dot\psi} = i \psi^\dagger , \tag{5.50}$$

which is not proportional to the time derivative of $\psi$. The phase space of a spinor is hence parameterized by $\psi$ and $\psi^\dagger$, while for a scalar it is parameterized by $\phi$ and $\pi_\phi = \dot\phi$. So the phase space of the Dirac spinor $\psi$ has eight real dimensions and correspondingly the number of real dofs is four. We will learn soon that, in the QFT, this counting manifests itself as two dofs (i.e., spin up and down) for the particle, and another two for the antiparticle. A similar counting for the Weyl fermion tells us that it has two dofs.

Majorana Fermions

Our spinor $\psi$ is a complex object. It has to be, since the representation $S(\Lambda)$ is typically also complex. This means that if we were to try to make $\psi$ real, e.g., by imposing $\psi = \psi^*$, then it would not stay real once we make a Lorentz transformation. However, there is a way to impose a reality condition on $\psi$. In order to motivate this possibility, it's simplest to look at a novel basis for the Clifford algebra (5.7), known as the Majorana basis

$$\gamma^0 = \begin{pmatrix} 0 & \sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^1 = \begin{pmatrix} i\sigma^3 & 0 \\ 0 & i\sigma^3 \end{pmatrix} , \quad \gamma^2 = \begin{pmatrix} 0 & -\sigma^2 \\ \sigma^2 & 0 \end{pmatrix} , \quad \gamma^3 = \begin{pmatrix} -i\sigma^1 & 0 \\ 0 & -i\sigma^1 \end{pmatrix} . \tag{5.51}$$

What is special about these matrices is that they are all pure imaginary, i.e., $(\gamma^\mu)^* = -\gamma^\mu$. This implies that the generators (5.12), and hence the full Lorentz transformations (5.18), are real. In the specific basis (5.51), we can therefore work with a real spinor simply by imposing the condition

$$\psi = \psi^* , \tag{5.52}$$

which is preserved under Lorentz transformations. Such spinors are called Majorana spinors.

Can this procedure be generalized to an arbitrary basis of Dirac matrices? We only ask that the basis satisfies (5.24). We then define the charge conjugate of a Dirac spinor $\psi$ as

$$\psi^c = C \psi^* , \tag{5.53}$$

where $C$ is a $4 \times 4$ matrix obeying

$$C^\dagger C = 1 , \qquad C^\dagger \gamma^\mu C = -(\gamma^\mu)^* . \tag{5.54}$$

The first relation tells us that charge conjugation can be described by a unitary operator. Let us first check that (5.53) is a sensible definition, meaning that $\psi^c$ transforms nicely under Lorentz transformations. One has

$$\psi^c(x) \to C \, S^*(\Lambda) \psi^*(x') = S(\Lambda) \, C \psi^*(x') = S(\Lambda) \psi^c(x') . \tag{5.55}$$

Here we made use of the properties (5.54) to commute the matrix $C$ past $S^*(\Lambda)$ to the right. Comparing the latter result to (5.17), we see that $\psi$ and $\psi^c$ transform in the same way under the Lorentz group. In fact, not only does $\psi^c$ transform nicely under rotations and boosts, but it also satisfies the Dirac equation if $\psi$ does. This follows from

$$\left( i \slashed{\partial} - m \right) \psi = 0 \quad \Longrightarrow \quad \left( -i \slashed{\partial}^* - m \right) \psi^* = 0 \quad \Longrightarrow \quad C \left( -i \slashed{\partial}^* - m \right) \psi^* = \left( i \slashed{\partial} - m \right) \psi^c = 0 , \tag{5.56}$$

where we have again employed (5.54). Finally, we can now impose the Lorentz-invariant reality condition on the Dirac spinor, to yield a Majorana spinor,

$$\psi = \psi^c . \tag{5.57}$$

After quantization, the Majorana spinor gives rise to a Majorana fermion that is its own antiparticle. This is exactly the same as in the case of scalar fields, where we have seen that a real scalar field gives rise to a spin-zero boson that is its own antiparticle.

So what does the matrix $C$ look like? This, of course, depends a lot on the basis. In the Majorana basis (5.51), where all the Dirac matrices are purely imaginary, one simply has $C = 1$, and in consequence the condition (5.57) turns into (5.52). In the chiral basis (5.10), on the other hand, only $\gamma^2$ is imaginary, and we may take$^{42}$

$$C = i \gamma^2 . \tag{5.58}$$

It is also interesting to see how the Majorana condition (5.57) looks in terms of the decomposition into left- and right-handed Weyl spinors. Plugging in the various definitions, we find $\psi_R = i \sigma^2 \psi_L^*$ and $\psi_L = -i \sigma^2 \psi_R^*$. In other words, a Majorana spinor can be written in terms of chiral spinors as

$$\psi = \begin{pmatrix} \psi_R \\ -i \sigma^2 \psi_R^* \end{pmatrix} . \tag{5.59}$$

Notice that it is not possible to impose the Majorana condition, $\psi = \psi^c$, at the same time as the Weyl condition, $\psi_L = 0$ or $\psi_R = 0$. Instead the Majorana condition relates left- and right-handed spinors via (5.59). In an exercise you will learn more about Majorana fermions. So let's move on.

$^{42}$ Be aware, in many texts an extra factor of $\gamma^0$ is absorbed into the definition of $C$.
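The statements about the Majorana basis (5.51) and the charge-conjugation matrix (5.54) lend themselves to a direct numerical check. The sketch below verifies that the matrices (5.51) are purely imaginary and satisfy (5.7), that $C = 1$ works in the Majorana basis, and that $C = i\gamma^2$ works in the chiral basis (5.10). All helper names are illustrative and the metric signature $(+,-,-,-)$ is assumed.

```python
# Sketch: check the Majorana basis (5.51) and the charge-conjugation conditions (5.54).
# Helper names are illustrative; the metric signature (+,-,-,-) is assumed.

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
S1, S2, S3 = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def block(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(4)] for i in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def conj(m):
    return [[complex(x).conjugate() for x in row] for row in m]


def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(4) for j in range(4))


GAMMA_CHIRAL = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in (S1, S2, S3)
]
GAMMA_MAJORANA = [
    block(ZERO2, S2, S2, ZERO2),
    block(scale(1j, S3), ZERO2, ZERO2, scale(1j, S3)),
    block(ZERO2, scale(-1, S2), S2, ZERO2),
    block(scale(-1j, S1), ZERO2, ZERO2, scale(-1j, S1)),
]
C_CHIRAL = scale(1j, GAMMA_CHIRAL[2])  # eq. (5.58)


def satisfies_clifford(gammas):
    return all(
        close(add(matmul(gammas[m], gammas[n]), matmul(gammas[n], gammas[m])),
              scale(2 * ETA[m][n], IDENTITY))
        for m in range(4) for n in range(4)
    )


def is_purely_imaginary(gammas):
    return all(complex(x).real == 0 for g in gammas for row in g for x in row)


def is_charge_conjugation(c, gammas):
    """Check (5.54): C^dagger C = 1 and C^dagger gamma^mu C = -(gamma^mu)^*."""
    unitary = close(matmul(dagger(c), c), IDENTITY)
    flips = all(close(matmul(matmul(dagger(c), g), c), scale(-1, conj(g))) for g in gammas)
    return unitary and flips


print(satisfies_clifford(GAMMA_MAJORANA), is_purely_imaginary(GAMMA_MAJORANA))
print(is_charge_conjugation(IDENTITY, GAMMA_MAJORANA))
print(is_charge_conjugation(C_CHIRAL, GAMMA_CHIRAL))
```

Passing `IDENTITY` together with `GAMMA_CHIRAL` shows that $C = 1$ is not an admissible choice in the chiral basis, which is why a non-trivial $C$ is needed there.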


5.2 Discrete Symmetries of Dirac Theory

In addition to the continuous Lorentz transformations we have considered so far, there are two other space-time operations that are potential symmetries of any QFT, namely parity and time reversal. Parity, denoted by $P$, sends

$$x = (t, \mathbf{x}) \xrightarrow{\; P \;} (t, -\mathbf{x}) = x_P , \tag{5.60}$$

reversing the handedness of space. Time reversal, denoted by $T$, sends

$$x = (t, \mathbf{x}) \xrightarrow{\; T \;} (-t, \mathbf{x}) = x_T , \tag{5.61}$$

interchanging the forward and backward light-cone. Since parity has an important role to play in the SM and, in particular, the theory of the electroweak interactions, let's first have a look at the action of $P$ on spinors and bi-linears constructed from them.

Parity

In order to understand what happens to a spinor under parity, we consider how rotations and boosts act on Weyl spinors. In the chiral representation, the corresponding transformation properties have already been spelled out in (5.39). We also know that under parity rotations do not flip sign, while boosts do, since $P$ acting on a particle should reverse its momentum, but not its spin. This tells us that parity exchanges right- and left-handed spinors,

$$\psi_{L,R}(x) \xrightarrow{\; P \;} \psi_{R,L}(x_P) . \tag{5.62}$$

Using this knowledge and the fact that changing the parity twice is the identity, i.e., P 2 = 1, we see that the action of parity on ψ can be described in the Weyl basis by P = γ0 .

(5.63)

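This $4 \times 4$ matrix satisfies

$$ P^\dagger P = 1 \,, \qquad P^\dagger \gamma^\mu P = (\gamma^\mu)^\dagger \,, \tag{5.64} $$

so parity can also be implemented by a unitary operator. Our spinor transforms under $P$ as

$$ \psi(x) \xrightarrow{\;P\;} P\,\psi(x_P) \,. \tag{5.65} $$

Notice that if $\psi(x)$ satisfies the Dirac equation (5.35), so does the parity-transformed spinor $P\psi(x_P)$, since one has

$$ (i\gamma^0\partial_t + i\gamma^i\partial_i - m)\,P\,\psi(t, -\mathbf{x}) = P\,(i\gamma^0\partial_t - i\gamma^i\partial_i - m)\,\psi(t, -\mathbf{x}) = 0 \,. \tag{5.66} $$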

Here the extra minus sign from passing $P$ through $\gamma^i$ is compensated by the derivative acting on $-\mathbf{x}$ instead of $\mathbf{x}$.

Let me now consider how the covariant interaction terms we have constructed before transform under $P$. We start with $\bar\psi\psi$. Obviously, one has

$$ \bar\psi(x)\psi(x) \xrightarrow{\;P\;} \bar\psi(x_P)\psi(x_P) \,, \tag{5.67} $$

given that $(\gamma^0)^2 = 1$ and $(\gamma^0)^\dagger = \gamma^0$. This is the transformation of a scalar. In the case of $\bar\psi\gamma^\mu\psi$, we find instead

$$ \bar\psi(x)\gamma^\mu\psi(x) \xrightarrow{\;P\;} (-1)^\mu\,\bar\psi(x_P)\gamma^\mu\psi(x_P) \,, \tag{5.68} $$

where $(-1)^\mu = 1$ for $\mu = 0$ and $(-1)^\mu = -1$ for $\mu = 1, 2, 3$. Notice that the factor $(-1)^\mu$ arises from the combination of (5.24) and (5.27). The latter transformation property tells us that $\bar\psi\gamma^\mu\psi$ transforms as a vector, with the spatial part changing sign. You can also check easily that $\bar\psi\sigma^{\mu\nu}\psi$ transforms as a tensor, namely

$$ \bar\psi(x)\sigma^{\mu\nu}\psi(x) \xrightarrow{\;P\;} (-1)^\mu(-1)^\nu\,\bar\psi(x_P)\sigma^{\mu\nu}\psi(x_P) \,. \tag{5.69} $$

Using $\gamma^5$, we can form two more Lorentz-covariant objects, i.e., $\bar\psi\gamma^5\psi$ and $\bar\psi\gamma^\mu\gamma^5\psi$. How do these transform under parity? In the first case, we obtain

$$ \bar\psi(x)\gamma^5\psi(x) \xrightarrow{\;P\;} -\bar\psi(x_P)\gamma^5\psi(x_P) \,, \tag{5.70} $$

where we have used the last relation in (5.41) and $(\gamma^0)^2 = 1$. In the second case, a straightforward calculation gives

$$ \bar\psi(x)\gamma^\mu\gamma^5\psi(x) \xrightarrow{\;P\;} -(-1)^\mu\,\bar\psi(x_P)\gamma^\mu\gamma^5\psi(x_P) \,. \tag{5.71} $$

The minus signs in (5.70) and (5.71) earn the objects $\bar\psi\gamma^5\psi$ and $\bar\psi\gamma^\mu\gamma^5\psi$ the names pseudo-scalar and pseudo-vector (or axial-vector). To summarize, we have the following spinor bi-linears,

$$
\begin{aligned}
\bar\psi\psi &: \text{scalar} \,, \\
\bar\psi\gamma^\mu\psi &: \text{vector} \,, \\
\bar\psi\sigma^{\mu\nu}\psi &: \text{tensor} \,, \\
\bar\psi\gamma^5\psi &: \text{pseudo-scalar} \,, \\
\bar\psi\gamma^\mu\gamma^5\psi &: \text{pseudo-vector} \,.
\end{aligned} \tag{5.72}
$$

The total number of bi-linears is $(1 + 4 + (4 \cdot 3)/2 + 4 + 1) = 16$, which is all we could hope for from a 4-component object.

We are now equipped with new terms involving $\gamma^5$ that we can start to add to our Lagrangian to construct new theories. Typically such terms will break parity invariance of the theory, although this is not always true. E.g., the term $\phi\,\bar\psi\gamma^5\psi$ does not break parity if $\phi$ is itself a pseudo-scalar. Nature makes use of these parity-violating interactions by using $\gamma^5$ in the electroweak force. A theory which treats $\psi_{L,R}$ on an equal footing is called a vector-like theory. In contrast, a theory in which $\psi_{L,R}$ appear differently is called a chiral theory.
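The matrix statements used in the parity discussion can be checked numerically. The following is a minimal sketch, assuming that the chiral basis (5.10) has $\gamma^0$ off-diagonal and $\gamma^i$ built from $\pm\sigma^i$, that $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and that $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$; all variable names are illustrative. It tests the relations (5.64), the sign flip of $\gamma^5$ behind (5.70), and the counting of 16 independent matrices.

```python
import numpy as np

# Pauli matrices and chiral-basis Dirac matrices (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
gam = [np.block([[Z, s0], [s0, Z]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]
g5 = 1j * gam[0] @ gam[1] @ gam[2] @ gam[3]  # assumed definition of gamma^5

P = gam[0]  # parity matrix (5.63)
p_unitary = np.allclose(P.conj().T @ P, np.eye(4))
p_gives_dagger = all(np.allclose(P.conj().T @ g @ P, g.conj().T) for g in gam)
g5_flips_sign = np.allclose(P.conj().T @ g5 @ P, -g5)

# Matrices behind the bilinears (5.72): 1, gamma^mu, sigma^{mu nu} (mu < nu),
# gamma^mu gamma^5 and gamma^5.
sigma = [0.5j * (gam[m] @ gam[n] - gam[n] @ gam[m])
         for m in range(4) for n in range(m + 1, 4)]
basis = [np.eye(4, dtype=complex)] + gam + sigma + [g @ g5 for g in gam] + [g5]

# Flatten each 4x4 matrix to a 16-vector; full rank means linear independence.
rank = np.linalg.matrix_rank(np.array([b.reshape(-1) for b in basis]))
print(p_unitary, p_gives_dagger, g5_flips_sign, len(basis), rank)
```

A rank equal to the number of matrices confirms that the bi-linears exhaust the space of $4 \times 4$ matrices, which is the content of the counting above.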


Time Reversal

Another obvious question that we should address is how our building blocks in (5.72) transform under $T$. In order to answer this question, we first have to understand how the time-reversal symmetry is correctly implemented in a QM context. The implementation turns out to be more subtle than in the case of $C$ and $P$, since in the case of $T$ the relevant operator is not unitary but anti-unitary, i.e., $T^\dagger = T^{-1}$ with $\langle\psi|T^\dagger|\psi'\rangle = \langle T\psi|\psi'\rangle^*$, where $|\psi\rangle$ and $|\psi'\rangle$ denote arbitrary multiparticle quantum states. A straightforward way to realize that the operator implementing $T$ must be anti-unitary is to consider the behavior of the Schrödinger equation for a free particle under time reversal. In classical mechanics, a free particle has a time-reversal invariant motion, and it is reasonable that we would like to retain this property in QM as well. But the operator $\partial_t$ is $T$-odd while $\nabla$ is $T$-even. This is impossible to reconcile with the Schrödinger equation unless time reversal changes $i \to -i$ and $\psi \to \psi^*$. The operator thus has to be anti-unitary. Note that the anti-unitarity of $T$ implies that it does not have meaningful eigenvalues, contrary to what happens in the case of $C$ and $P$. As there is no quantum number associated with time reversal, no conservation law exists when the action is invariant under time reversal.

Just as for parity, we define the time-reversal transformation in QFT by its action on states. We require that $T$ should reverse the particle momentum and its spin. It is not too difficult to figure out that the transformation of the Dirac spinor under time reversal involves in the chiral basis the matrix

$$ T = i\gamma^1\gamma^3 \,, \tag{5.73} $$

which satisfies

$$ T^\dagger T = 1 \,, \qquad T^\dagger \gamma^\mu T = (-1)^\mu (\gamma^\mu)^* \,. \tag{5.74} $$
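Both relations in (5.74) can be confirmed by direct matrix multiplication. The sketch below assumes the same chiral-basis Dirac matrices as in (5.10), written out explicitly; the variable names are illustrative only.

```python
import numpy as np

# Pauli matrices and chiral-basis Dirac matrices (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
gam = [np.block([[Z, s0], [s0, Z]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]

T = 1j * gam[1] @ gam[3]   # time-reversal matrix (5.73)
sign = [1, -1, -1, -1]     # (-1)^mu: +1 for mu = 0, -1 for mu = 1, 2, 3

t_unitary = np.allclose(T.conj().T @ T, np.eye(4))
t_relation = all(
    np.allclose(T.conj().T @ gam[mu] @ T, sign[mu] * gam[mu].conj())
    for mu in range(4)
)
print(t_unitary, t_relation)
```

The complex conjugate on the right-hand side matters only for $\gamma^2$, the single imaginary matrix in this basis.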

The transformation itself takes the following form

$$ \psi(x) \xrightarrow{\;T\;} T\,\psi^*(x_T) \,. \tag{5.75} $$

The first thing to notice is that if $\psi(x)$ obeys the Dirac equation, the same is true for $T\psi^*(x_T)$. This follows, because

$$ (i\gamma^0\partial_t + i\gamma^i\partial_i - m)\,T\,\psi^*(-t, \mathbf{x}) = T\,\big(i(\gamma^0)^*\partial_t - i(\gamma^i)^*\partial_i - m\big)\,\psi^*(-t, \mathbf{x}) = 0 \,. \tag{5.76} $$

Notice that the minus sign between the $(\gamma^0)^*$ and $(\gamma^i)^*$ terms is compensated by the derivative acting on $-t$ rather than $t$, and that the final result follows after complex conjugation, which sends $i \to -i$.

We are now ready to consider the transformation properties of the building blocks (5.72). I simply quote the results without proof, leaving the actual derivations to you as a useful exercise. For the scalar $\bar\psi\psi$ one finds that

$$ \bar\psi(x)\psi(x) \xrightarrow{\;T\;} \bar\psi(x_T)\psi(x_T) \,. \tag{5.77} $$

In the case of the vector $\bar\psi\gamma^\mu\psi$, one has instead

$$ \bar\psi(x)\gamma^\mu\psi(x) \xrightarrow{\;T\;} (-1)^\mu\,\bar\psi(x_T)\gamma^\mu\psi(x_T) \,. \tag{5.78} $$

This is exactly the transformation property we want for vectors, since it leaves $\bar\psi\, i\partial\!\!\!/\,\psi$ and $\bar\psi A\!\!\!/\psi$ invariant under time reversal. Notice that the minus sign appearing for the space-components in (5.78) is cancelled by the one appearing in the transformation of the electromagnetic field $A_\mu$. For the derivative $\partial_\mu$, whose time component is the one that changes sign, the invariance is restored by the anti-unitarity of $T$, which sends $i \to -i$. One furthermore shows that the tensor $\bar\psi\sigma^{\mu\nu}\psi$ behaves like

$$ \bar\psi(x)\sigma^{\mu\nu}\psi(x) \xrightarrow{\;T\;} -(-1)^\mu(-1)^\nu\,\bar\psi(x_T)\sigma^{\mu\nu}\psi(x_T) \,, \tag{5.79} $$

under time reversal. We finally want to know the transformation properties of the covariants involving $\gamma^5$. For the pseudo-scalar $\bar\psi\gamma^5\psi$, one obtains

$$ \bar\psi(x)\gamma^5\psi(x) \xrightarrow{\;T\;} -\bar\psi(x_T)\gamma^5\psi(x_T) \,, \tag{5.80} $$

while the action of $T$ on the pseudo-vector $\bar\psi\gamma^\mu\gamma^5\psi$ is given by

$$ \bar\psi(x)\gamma^\mu\gamma^5\psi(x) \xrightarrow{\;T\;} (-1)^\mu\,\bar\psi(x_T)\gamma^\mu\gamma^5\psi(x_T) \,. \tag{5.81} $$

Charge Conjugation

The last of the three discrete symmetries is the particle-antiparticle symmetry $C$, which we already met at the end of Section 5.1 when discussing the properties of Majorana fermions. In physical terms, charge conjugation is conventionally defined to take a fermion with a given spin orientation into an antifermion with the same spin orientation. As we have seen in (5.56), this transformation is a symmetry of the Dirac equation.

Once again we want to know how $C$ acts on fermion bi-linears. I again quote the relevant results without giving the details of their derivation. The scalar $\bar\psi\psi$ transforms under $C$ as

$$ \bar\psi(x)\psi(x) \xrightarrow{\;C\;} \bar\psi(x)\psi(x) \,. \tag{5.82} $$

In the case of the vector $\bar\psi\gamma^\mu\psi$, one has instead

$$ \bar\psi(x)\gamma^\mu\psi(x) \xrightarrow{\;C\;} -\bar\psi(x)\gamma^\mu\psi(x) \,. \tag{5.83} $$

Under $C$ the tensor $\bar\psi\sigma^{\mu\nu}\psi$ behaves like

$$ \bar\psi(x)\sigma^{\mu\nu}\psi(x) \xrightarrow{\;C\;} -\bar\psi(x)\sigma^{\mu\nu}\psi(x) \,. \tag{5.84} $$

For the pseudo-scalar $\bar\psi\gamma^5\psi$, one arrives at

$$ \bar\psi(x)\gamma^5\psi(x) \xrightarrow{\;C\;} \bar\psi(x)\gamma^5\psi(x) \,, \tag{5.85} $$

while the action of $C$ on the pseudo-vector $\bar\psi\gamma^\mu\gamma^5\psi$ reads

$$ \bar\psi(x)\gamma^\mu\gamma^5\psi(x) \xrightarrow{\;C\;} \bar\psi(x)\gamma^\mu\gamma^5\psi(x) \,. \tag{5.86} $$

CP and CPT Symmetry

We saw that the free Dirac equation (5.35) is invariant under $P$, $T$, and $C$ separately. Yet, we can build more general QFTs that violate any of these discrete symmetries by adding appropriate perturbations to the Dirac Lagrangian. These additional terms must transform as Lorentz scalars. The various fermionic bi-linears that can be used to construct such terms are shown in Table 1. The last line of this table tells us that all Lorentz-scalar combinations of $\bar\psi$ and $\psi$ are invariant under the combined symmetry $CPT$. Actually, it is quite generally true that one cannot build a Lorentz-invariant QFT with a hermitian Hamiltonian that violates $CPT$. More precisely, one can prove the following three statements [2]: first, an interacting theory that violates $CPT$ invariance necessarily violates Lorentz invariance, second, $CPT$ invariance is not sufficient for out-of-cone Lorentz invariance, and third, theories that violate $CPT$ by having different particle and antiparticle masses must be non-local. This implies that any study of $CPT$ violation also includes Lorentz violation. Several experimental searches for such violations have been performed during the last few years. A detailed list of the results of these searches is given in [3]. So far no evidence for either $CPT$ or Lorentz violation has been found. The consequences of $CPT$ invariance are far-reaching. The most celebrated ones are the equality of the masses and of the total decay widths (or lifetimes) of particles and antiparticles. Both statements are easy to prove. Try it!

What about the other discrete symmetries in nature? Are they conserved? Although $P$ is conserved in electromagnetism, strong interactions, and gravity, it turns out to be violated in electroweak interactions. The SM incorporates parity violation by expressing the electroweak interaction as a chiral gauge interaction.
Only the left-handed components of particles and right-handed components of antiparticles participate in the electroweak interactions in the SM. This implies that $P$ is not a symmetry of our universe, unless a hidden mirror sector exists in which parity is violated in the opposite way (a “left-right symmetry”). It was suggested several times and in different contexts that parity might not be conserved, but in the absence of compelling evidence these suggestions were not taken seriously. A careful review by Tsung Dao Lee and Chen Ning Yang [4] showed that while $P$ conservation had been verified in decays by the strong or electromagnetic interactions, it was untested in the electroweak interaction. They proposed several possible direct experimental tests. They were almost ignored, but Lee was able to convince his colleague Chien-Shiung Wu to look for $P$ violation. In 1957, Wu’s group conducted an ingenious experiment showing that in the case of the $\beta$-decay of $^{60}$Co, nature knows left from right [5]. The discovery of $P$ violation immediately explained the outstanding $\tau$–$\theta$ puzzle related to the decay of charged kaons.

So $P$ is broken in nature, what about $CP$ then? The first thing to notice in this respect is that a symmetry of a QM system can be restored if another symmetry can be found such that the combined symmetry remains unbroken. This rather subtle point about the structure of Hilbert space was realized shortly after the discovery of $P$ violation, and it was proposed that charge conjugation was the desired symmetry to restore order. As a result, the $CP$ symmetry was proposed in 1957 by Lev Landau as the true symmetry between matter and antimatter. In other words, a process in which all particles are exchanged with their antiparticles was assumed to be equivalent to the mirror image of the original process. The discovery of $CP$ violation in 1964 in the decays of neutral kaons [6], which resulted in the Nobel Prize in Physics in 1980 for its discoverers James Cronin and Val Fitch, shocked particle physics and opened the door to questions still at the core of particle physics and cosmology today.

$CP$ violation is incorporated in the SM by including a complex phase in the matrix describing quark mixing. In such a scheme a necessary condition for $CP$ violation can then be shown to be the presence of at least three generations of quarks. This possibility was suggested by Makoto Kobayashi and Toshihide Maskawa in a seminal paper [7] in 1973, which earned them one half of the Nobel Prize in Physics in 2008. The past decade has seen tremendous progress in the study of $CP$ violation. In particular, the so-called $B$ factories (BaBar and Belle) have collected and analyzed an impressive amount of experimental data, which led to the confirmation of the Kobayashi-Maskawa (KM) mechanism of $CP$ violation. Yet, the dynamical origin of $CP$ violation remains a puzzling mystery waiting to be unraveled.

Another unsolved theoretical question in this context is why the universe is made entirely of matter, rather than consisting of equal parts of matter and antimatter. It can be demonstrated that, to create an imbalance between matter and antimatter from an initial condition of balance, three necessary conditions [8] must be satisfied, one of which is the existence of $CP$ violation. The other two are baryon-number violation and the presence of interactions out of thermal equilibrium. These conditions were first formulated in 1967 by Andrei Sakharov. The SM contains only two sources that can break the $CP$ symmetry. The first of these involves the aforementioned KM phase, but can account for only a small portion of the needed $CP$ violation. The second resides in the quantum chromodynamics (QCD) Lagrangian and goes by the name of $\theta$ parameter. $CP$ violation from this source has not been found experimentally. The fact that one would expect the $\theta$ parameter to lead either to no $CP$ violation at all or to $CP$ violation that is far too large is the essence of the strong $CP$ problem [9]. Several solutions to this problem have been proposed. The most well-known is based on an idea originally due to Robert Peccei and Helen Quinn [10], involving new scalar particles called axions. Oops! Looks like I am getting carried away here. Let’s get focused and return to the discussion of the Dirac theory.

$$
\begin{array}{l|cccccccc}
\text{Symmetry} & \bar\psi\psi & \bar\psi\gamma^\mu\psi & \bar\psi\sigma^{\mu\nu}\psi & \bar\psi\gamma^5\psi & \bar\psi\gamma^\mu\gamma^5\psi & \partial_\mu & A_\mu & F_{\mu\nu} \\
\hline
P & +1 & (-1)^\mu & (-1)^\mu(-1)^\nu & -1 & -(-1)^\mu & (-1)^\mu & (-1)^\mu & (-1)^\mu(-1)^\nu \\
T & +1 & (-1)^\mu & -(-1)^\mu(-1)^\nu & -1 & (-1)^\mu & -(-1)^\mu & (-1)^\mu & -(-1)^\mu(-1)^\nu \\
C & +1 & -1 & -1 & +1 & +1 & +1 & -1 & -1 \\
CP & +1 & -(-1)^\mu & -(-1)^\mu(-1)^\nu & -1 & -(-1)^\mu & (-1)^\mu & -(-1)^\mu & -(-1)^\mu(-1)^\nu \\
CPT & +1 & -1 & +1 & +1 & -1 & -1 & -1 & +1
\end{array}
$$

Table 1: Transformation properties of fermion bi-linears as well as $\partial_\mu$, $A_\mu$, and $F_{\mu\nu}$ under the discrete $P$, $T$, and $C$ symmetries and the combinations $CP$ and $CPT$.
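The $CP$ and $CPT$ rows of Table 1 are just products of the $P$, $T$, and $C$ rows, which makes the table easy to cross-check. The sketch below is only a bookkeeping aid: it encodes the entries for the five fermion bi-linears and for $\partial_\mu$ as a pair (overall sign, flag), where the flag records whether the entry carries one factor $(-1)^\mu$ per Lorentz index. This encoding and all names are assumptions made for the illustration.

```python
# Entry = (overall sign, flag). flag = 1 means the entry carries one factor
# (-1)^mu for every Lorentz index of the object, flag = 0 means it does not.
objects = ["scalar", "vector", "tensor", "pseudo-scalar", "pseudo-vector", "derivative"]
n_indices = [0, 1, 2, 0, 1, 1]  # number of Lorentz indices of each object

P = [(+1, 0), (+1, 1), (+1, 1), (-1, 0), (-1, 1), (+1, 1)]
T = [(+1, 0), (+1, 1), (-1, 1), (-1, 0), (+1, 1), (-1, 1)]
C = [(+1, 0), (-1, 0), (-1, 0), (+1, 0), (+1, 0), (+1, 0)]


def compose(a, b):
    """Multiply two rows entry by entry: signs multiply, and the index
    factors cancel in pairs because ((-1)^mu)^2 = 1 (hence the XOR)."""
    return [(sa * sb, fa ^ fb) for (sa, fa), (sb, fb) in zip(a, b)]


CP = compose(C, P)
CPT = compose(CP, T)

for name, cp, cpt in zip(objects, CP, CPT):
    print(f"{name:14s} CP = {cp}  CPT = {cpt}")

# Under CPT each object picks up a plain sign (-1)^(number of indices).
cpt_counts_indices = all(cpt == ((-1) ** n, 0) for cpt, n in zip(CPT, n_indices))
print(cpt_counts_indices)
```

Since a Lorentz scalar is obtained by contracting indices in pairs, the total number of indices in any such term is even, which is why every Lorentz-scalar combination comes out $CPT$ even.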


5.3 Continuous Symmetries of Dirac Theory

Besides the discrete $P$, $T$, and $C$ symmetries, the Dirac action (5.34) enjoys a number of continuous symmetries. In the following we will discuss space-time translations, Lorentz transformations, the internal vector and axial-vector symmetry, and compute the associated conserved currents.

Space-Time Translations

Under infinitesimal space-time translations (2.18), the Dirac spinor $\psi$ transforms as

$$ \delta\psi = \epsilon^\mu \partial_\mu \psi \,. \tag{5.87} $$

Given that the Dirac Lagrangian depends on $\partial_\mu\psi$ but not on $\bar\psi\overleftarrow{\partial}_\mu$, we can use the standard formula (2.20) to obtain the energy-momentum tensor

$$ T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi - \eta^{\mu\nu}\mathcal{L} \,. \tag{5.88} $$

Since a current is conserved only when the EOMs are obeyed, we do not lose anything by imposing the Euler-Lagrange equation already on $T^{\mu\nu}$. In the case of a scalar field this does not really buy us anything, because the EOMs are second order in derivatives, while the energy-momentum tensor is first order. However, for a spinor field the EOMs are first order (5.35). This means that we can ignore the second term in (5.88), leaving us with

$$ T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi \,. \tag{5.89} $$

It follows that the total energy is given by

$$ E = \int d^3x\, T^{00} = \int d^3x\, i\bar\psi\gamma^0\dot\psi = \int d^3x\, \psi^\dagger\gamma^0\,(-i\boldsymbol{\gamma}\cdot\nabla + m)\,\psi \,, \tag{5.90} $$

where in order to obtain the final expression we have employed the Dirac equation. The components of the total momentum are given by

$$ P^i = \int d^3x\, T^{0i} = \int d^3x\, i\bar\psi\gamma^0\partial^i\psi \,. \tag{5.91} $$

Both the total energy and momentum are of course conserved.

Lorentz Transformations

Under a Lorentz transformation, the Dirac spinor transforms as (5.17) which, in infinitesimal form, reads

$$ \delta\psi^\alpha = -\omega^\mu{}_\nu\, x^\nu \partial_\mu \psi^\alpha + \frac{1}{2}\,\Omega_{\gamma\delta}\,(S^{\gamma\delta})^\alpha{}_\beta\, \psi^\beta \,. \tag{5.92} $$

From (5.6) it follows that $\omega^\mu{}_\nu = \frac{1}{2}\,\Omega_{\gamma\delta}\,(M^{\gamma\delta})^\mu{}_\nu$, where the generators of the Lorentz group $(M^{\gamma\delta})^\mu{}_\nu$ take the form (5.2). After direct substitution, this tells us that $\omega^{\mu\nu} = \Omega^{\mu\nu}$, and as a result (5.92) becomes

$$ \delta\psi^\alpha = -\omega^\mu{}_\nu\, x^\nu \partial_\mu \psi^\alpha + \frac{1}{2}\,\omega_{\gamma\delta}\,(S^{\gamma\delta})^\alpha{}_\beta\, \psi^\beta \,. \tag{5.93} $$

The conserved current arising from Lorentz transformations now follows from the same calculation we saw for the scalar field (2.67). Yet, there are two small differences. First, we are allowed to neglect terms proportional to $\mathcal{L}$ in the computation and, second, we pick up an extra piece in the current from the second term in (5.92). At the end one has

$$ (J^\lambda)^{\mu\nu} = x^\mu T^{\lambda\nu} - x^\nu T^{\lambda\mu} - i\bar\psi\gamma^\lambda S^{\mu\nu}\psi \,. \tag{5.94} $$

After quantization, when $(J^\lambda)^{\mu\nu}$ is turned into an operator, this extra term will be responsible for providing the single-particle states with internal angular momentum, telling us that the quantization of a Dirac spinor gives rise to a particle carrying spin 1/2.

Vector Symmetry

The Dirac Lagrangian is invariant under global phase rotations of the spinor, i.e.,

$$ \psi \to e^{-i\alpha}\psi \,. \tag{5.95} $$

This symmetry gives rise to the conserved current

$$ j_V^\mu = \bar\psi\gamma^\mu\psi \,, \tag{5.96} $$

where the index $V$ stands for vector, reflecting the fact that the left- and right-handed spinors $\psi_{L,R}$ transform in the same way under phase rotations. It is straightforward to check using (5.35) and (5.36) that $j_V^\mu$ is indeed conserved under the EOMs,

$$ \partial_\mu j_V^\mu = (\bar\psi\overleftarrow{\partial}_\mu)\gamma^\mu\psi + \bar\psi\gamma^\mu(\partial_\mu\psi) = im\bar\psi\psi - im\bar\psi\psi = 0 \,. \tag{5.97} $$

The conserved quantity arising from the vector symmetry is

$$ Q = \int d^3x\, j_V^0 = \int d^3x\, \bar\psi\gamma^0\psi = \int d^3x\, \psi^\dagger\psi \,. \tag{5.98} $$

We will see shortly that this has the interpretation of electric charge, or particle number, for fermions.

Axial-Vector Symmetry

In the case of massless fermions, the Dirac Lagrangian possesses an extra internal symmetry, which rotates left- and right-handed fermions in opposite directions,

$$ \psi \to e^{i\alpha\gamma^5}\psi \,, \qquad \bar\psi \to \bar\psi\, e^{i\alpha\gamma^5} \,. \tag{5.99} $$

Here the second transformation follows from the first by noticing that $\exp(-i\alpha\gamma^5)\,\gamma^0 = \gamma^0 \exp(i\alpha\gamma^5)$ as a consequence of the anti-commutation relation in (5.41). Invariance under the global phase rotation (5.99) leads to the conserved current

$$ j_A^\mu = \bar\psi\gamma^\mu\gamma^5\psi \,, \tag{5.100} $$

where the subscript $A$ stands for axial-vector. This current is only conserved if the mass parameter $m$ in the Dirac action (5.34) is equal to zero. Indeed, with the full Dirac Lagrangian we may compute

$$ \partial_\mu j_A^\mu = (\bar\psi\overleftarrow{\partial}_\mu)\gamma^\mu\gamma^5\psi + \bar\psi\gamma^\mu\gamma^5(\partial_\mu\psi) = 2im\,\bar\psi\gamma^5\psi \,, \tag{5.101} $$

which vanishes only if $m = 0$. However, in the quantum theory things become more interesting for the axial-vector current. When the theory is coupled to gauge fields, the axial transformation remains a symmetry of the classical Lagrangian. But the symmetry does not survive the quantization process [11–13]. It is the prototypical example of an anomaly: a symmetry of the classical theory that is not preserved at the quantum level. In fact, the axial anomaly has important physical implications. It not only determines the rate of the neutral pion decay $\pi^0 \to 2\gamma$, but also provides an indirect way to determine the number of color dofs. For further reading, I recommend the recent review article [14].
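The algebra behind the axial rotation (5.99) can be made concrete with explicit matrices. The sketch below assumes the chiral-basis Dirac matrices, $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and uses the closed form $e^{i\alpha\gamma^5} = \cos\alpha + i\gamma^5\sin\alpha$, which holds because $(\gamma^5)^2 = 1$. The helper `axial` and the value of `alpha` are illustrative choices.

```python
import numpy as np

# Pauli matrices and chiral-basis Dirac matrices (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
gam = [np.block([[Z, s0], [s0, Z]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]
g5 = 1j * gam[0] @ gam[1] @ gam[2] @ gam[3]  # assumed definition of gamma^5


def axial(alpha):
    """exp(i alpha gamma^5) = cos(alpha) + i sin(alpha) gamma^5, since (gamma^5)^2 = 1."""
    return np.cos(alpha) * np.eye(4) + 1j * np.sin(alpha) * g5


alpha = 0.37
U = axial(alpha)

# Rule used to obtain the transformation of psi-bar in (5.99).
bar_rule = np.allclose(axial(-alpha) @ gam[0], gam[0] @ U)

# psi-bar gamma^mu d_mu psi -> psi-bar U gamma^mu U d_mu psi: kinetic term unchanged.
kinetic_invariant = all(np.allclose(U @ g @ U, g) for g in gam)

# psi-bar psi -> psi-bar U U psi: the mass term is rotated, so it breaks the symmetry.
mass_invariant = np.allclose(U @ U, np.eye(4))

print(bar_rule, kinetic_invariant, mass_invariant)
```

The kinetic term passes the test while the mass term does not, in line with the current (5.100) being conserved only for $m = 0$.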

5.4 Solutions to Dirac Equation

In order to get some feeling for the physics of the Dirac equation (5.35), we now discuss its plane-wave solutions. The fact that the Dirac field $\psi$ obeys the Klein-Gordon equation tells us that it can be written as a linear combination of plane waves. We make the ansatz

$$ \psi(x) = u(p)\, e^{-ipx} \,, \tag{5.102} $$

where $u(p)$ is a 4-component spinor that is independent of $x$, but does depend on the 3-momentum $\mathbf{p}$.$^{43}$ Notice that (5.102) is a positive frequency solution, because $\psi \propto \exp(-iEt)$. Inserting the above ansatz, the Dirac equation takes the form

$$ (p\!\!\!/ - m)\, u(p) = \begin{pmatrix} -m & p_\mu\sigma^\mu \\ p_\mu\bar\sigma^\mu & -m \end{pmatrix} u(p) = 0 \,, \tag{5.103} $$

where we have used the notation (5.47). In order to find the solution to this equation, we write $u(p) = (u_1, u_2)^T$. In terms of the two-component spinors $u_{1,2}$ the relation (5.103) reads

$$ (p\cdot\sigma)\, u_2 = m u_1 \,, \qquad (p\cdot\bar\sigma)\, u_1 = m u_2 \,, \tag{5.104} $$

where $p\cdot\sigma = p_\mu\sigma^\mu$ and $p\cdot\bar\sigma = p_\mu\bar\sigma^\mu$. However, these equations are not independent of each other, since

$$ (p\cdot\sigma)(p\cdot\bar\sigma) = p_0^2 - p_i p_j \sigma^i\sigma^j = p_0^2 - p_i p_j \delta^{ij} = p_\mu p^\mu = m^2 \,. \tag{5.105} $$

We conclude that any spinor of the form

$$ u(p) = N \begin{pmatrix} (p\cdot\sigma)\,\xi' \\ m\,\xi' \end{pmatrix} , \tag{5.106} $$

with constant $N$ is a solution to (5.103). In order to make this more symmetric, we choose $N = 1/m$ and $\xi' = \sqrt{p\cdot\bar\sigma}\,\xi$. Then $u_1 = (p\cdot\sigma)\sqrt{p\cdot\bar\sigma}\,\xi = m\sqrt{p\cdot\sigma}\,\xi$, and putting things together one obtains

$$ u(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} , \tag{5.107} $$

where $\xi$ is a 2-component spinor that can be chosen to satisfy $\xi^\dagger\xi = 1$. Here it is understood that in taking the square root of a matrix, we take the positive root of each eigenvalue.

Further solutions to the Dirac equation follow from the ansatz

$$ \psi(x) = v(p)\, e^{ipx} \,. \tag{5.108} $$

These solutions oscillate in time as $\psi \propto \exp(iEt)$ and are therefore called negative frequency solutions. Realize however that both (5.102) and (5.108) are solutions to the classical field equations and both have positive total energy (5.90). The Dirac equation (5.35) requires that the 4-component spinor $v(p)$ satisfies

$$ (p\!\!\!/ + m)\, v(p) = \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} v(p) = 0 \,. \tag{5.109} $$

Following the line of reasoning that led to (5.106), it is easy to show that the latter equation is solved by

$$ v(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta \\ -\sqrt{p\cdot\bar\sigma}\,\eta \end{pmatrix} , \tag{5.110} $$

for some constant 2-component spinor $\eta$ taken to be normalized as $\eta^\dagger\eta = 1$.

Spin-Up and Spin-Down Solutions

In order to make contact with QM, consider the positive frequency solution with mass $m$ and vanishing 3-momentum $\mathbf{p} = 0$, i.e., the rest frame of the associated particle. In this case the solution to (5.103) takes the form

$$ u(p) = \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} , \tag{5.111} $$

where $\xi$ is an arbitrary 2-component spinor. We can interpret the spinor $\xi$ by looking at the rotation generator (5.19). We see that $\xi$ transforms under rotations as an ordinary 2-component spinor of the rotation group, and therefore determines the spin orientation of the Dirac solution in the usual way. E.g., when $\xi^T = (1, 0)$, the corresponding field has spin up along the $z$-axis. After quantization, this will become the spin of the associated particle.$^{44}$

$^{43}$ In an abuse of notation we denote hereafter the 4-component Dirac spinors by $u(p)$ and not $u(\mathbf{p})$ etc.

$^{44}$ In the rest of this section, we will indulge in an abuse of terminology and refer to the classical solutions to the Dirac equations as “particles”, even though they have no such interpretation before quantization.
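The plane-wave spinors (5.107) and (5.110) can be tested numerically against (5.103) and (5.109). The sketch below assumes the chiral-basis Dirac matrices with $\sigma^\mu = (1, \sigma^i)$ and $\bar\sigma^\mu = (1, -\sigma^i)$; the matrix square root is taken through an eigendecomposition, keeping the positive root of each eigenvalue. The helper `sqrt_herm`, the mass, and the momentum are illustrative choices.

```python
import numpy as np

# Pauli matrices and chiral-basis Dirac matrices (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
gam = [np.block([[Z, s0], [s0, Z]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]


def sqrt_herm(M):
    """Square root of a positive Hermitian matrix (positive root of each eigenvalue)."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


m = 1.3
pvec = np.array([0.4, -0.7, 1.1])
E = np.sqrt(m**2 + pvec @ pvec)

p_sigma = E * s0 - sum(pi * s for pi, s in zip(pvec, sig))      # p . sigma
p_sigma_bar = E * s0 + sum(pi * s for pi, s in zip(pvec, sig))  # p . sigma-bar
pslash = E * gam[0] - sum(pi * g for pi, g in zip(pvec, gam[1:]))

xi = np.array([1, 0], dtype=complex)
eta = np.array([0, 1], dtype=complex)
u = np.concatenate([sqrt_herm(p_sigma) @ xi, sqrt_herm(p_sigma_bar) @ xi])     # (5.107)
v = np.concatenate([sqrt_herm(p_sigma) @ eta, -sqrt_herm(p_sigma_bar) @ eta])  # (5.110)

res_u = np.linalg.norm((pslash - m * np.eye(4)) @ u)  # residual of (5.103)
res_v = np.linalg.norm((pslash + m * np.eye(4)) @ v)  # residual of (5.109)

# Rest frame: p . sigma = p . sigma-bar = m, so u reduces to sqrt(m) (xi, xi) as in (5.111).
u_rest = np.concatenate([sqrt_herm(m * s0) @ xi, sqrt_herm(m * s0) @ xi])
print(res_u, res_v, u_rest)
```

Both residuals vanish up to rounding, and the rest-frame spinor reproduces the factor $\sqrt{m}$ of (5.111).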


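Starting from (5.111), we now consider the particle with spin $\xi^T = (1, 0)$ and boost it along the $z$-direction with $p^\mu = (E, 0, 0, p_z)$. The solution (5.107) to the Dirac equation becomes

$$ u(p) = \begin{pmatrix} \sqrt{E - p_z} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2ex] \sqrt{E + p_z} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} = \sqrt{m} \begin{pmatrix} e^{-y/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\[2ex] e^{y/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} , \tag{5.112} $$

where in the last step we have introduced the rapidity

$$ y = \frac{1}{2} \ln \frac{E + p_z}{E - p_z} \,, \tag{5.113} $$

which is related to $E$ and $p_z$ via

$$ E = \frac{m}{2} \left( e^{y} + e^{-y} \right) , \qquad p_z = \frac{m}{2} \left( e^{y} - e^{-y} \right) . \tag{5.114} $$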
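The relations (5.112) to (5.114) are simple to verify with a few lines of arithmetic. The sketch below uses illustrative values for $m$ and $p_z$ and, as an additional assumption of the example, the standard relativistic velocity-addition law, to show that rapidities of successive boosts along the same axis add while velocities do not.

```python
import math

m, pz = 1.0, 2.5
E = math.sqrt(m**2 + pz**2)

# Rapidity (5.113) and the inverse relations (5.114).
y = 0.5 * math.log((E + pz) / (E - pz))
E_from_y = 0.5 * m * (math.exp(y) + math.exp(-y))
pz_from_y = 0.5 * m * (math.exp(y) - math.exp(-y))

# Components of the boosted spinor (5.112).
upper = math.sqrt(E - pz)   # should equal sqrt(m) * exp(-y/2)
lower = math.sqrt(E + pz)   # should equal sqrt(m) * exp(+y/2)


def add_velocities(b1, b2):
    """Relativistic addition of collinear velocities (in units of c)."""
    return (b1 + b2) / (1.0 + b1 * b2)


# Two successive boosts with rapidities y1 and y2 along the same axis.
y1, y2 = 0.8, 1.7
beta_total = add_velocities(math.tanh(y1), math.tanh(y2))
y_total = math.atanh(beta_total)

print(E_from_y - E, pz_from_y - pz)
print(upper - math.sqrt(m) * math.exp(-y / 2), lower - math.sqrt(m) * math.exp(y / 2))
print(y_total, y1 + y2, beta_total, math.tanh(y1) + math.tanh(y2))
```

The combined rapidity equals $y_1 + y_2$, whereas the naive sum of the two velocities exceeds the speed of light.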

Notice that rapidities are, unlike speeds at relativistic velocities, additive quantities. This feature explains why in particle physics rapidities are often used instead of velocities. For large boosts, i.e., $E \gg m$ or equivalently $y \gg 1$, the result (5.112) turns into

$$ u(p) \approx \sqrt{2E} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} . \tag{5.115} $$

In the same limit, one obtains for a particle with spin $\xi^T = (0, 1)$ the expression

$$ u(p) \approx \sqrt{2E} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} . \tag{5.116} $$

This implies that in the limit $y \to \infty$ the states degenerate into the 2-component spinors of a massless particle. We now also understand the reason for the factor of $\sqrt{m}$ in (5.111). It is necessary to keep the spinor expressions finite in the massless limit.

Helicity

The solutions (5.115) and (5.116) are the eigenstates of the helicity operator

$$ h = \frac{i}{2}\, \epsilon_{ijk}\, \hat{p}^i S^{jk} = \frac{1}{2}\, \hat{p}_i \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} , \tag{5.117} $$

where $S^{ij}$ is the rotation generator (5.14) and $\hat{p}$ denotes the unit vector in the direction of the 3-momentum. The massless field in (5.115) has helicity $1/2$ and is said to be right-handed, while the one in (5.116) has helicity $-1/2$ and is called left-handed. Notice that the helicity of a massive particle depends on the frame of reference, since one can always boost to a frame in which its momentum is in the opposite direction, but its spin is unchanged. For a massless particle, which travels at the speed of light, one cannot perform such a boost. This also explains the origin of the notation $\psi_{L,R}$ for Weyl spinors. The solutions of the Weyl equations (5.49) are states of definite helicity, corresponding to left- and right-handed particles, respectively. The Lorentz invariance of helicity (for a massless particle) is manifest in the notation of Weyl spinors, since $\psi_L$ and $\psi_R$ live in different representations of the Lorentz group.

Spinor Products

There are a number of identities that will be very useful in the following section, regarding the (inner) products of the spinors $u(p)$ and $v(p)$. For convenience, we define a basis $\xi^r$ and $\eta^r$ with $r = 1, 2$ for the 2-component spinors such that

$$ \xi^{r\dagger}\xi^s = \delta^{rs} \,, \qquad \eta^{r\dagger}\eta^s = \delta^{rs} \,. \tag{5.118} $$

E.g., one can take

$$ \xi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \xi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{5.119} $$

and similarly for $\eta^r$. Let us first look at the positive frequency solutions $u(p)$. We can take the inner product of 4-component spinors in two different ways, i.e., $u^{r\dagger}(p) \cdot u^s(p)$ or $\bar{u}^r(p) \cdot u^s(p)$. Of course, only the latter object is Lorentz invariant, but it will turn out that the former is needed when we quantize the theory. So let me state both. One has

$$
\begin{aligned}
u^{r\dagger}(p) \cdot u^s(p) &= \left( \xi^{r\dagger}\sqrt{p\cdot\sigma} \,,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma} \right) \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix} \\
&= \xi^{r\dagger}(p\cdot\sigma)\,\xi^s + \xi^{r\dagger}(p\cdot\bar\sigma)\,\xi^s = 2\,\xi^{r\dagger} p_0\, \xi^s = 2 p_0\, \delta^{rs} \,,
\end{aligned} \tag{5.120}
$$

while the Lorentz-invariant inner product is

$$
\begin{aligned}
\bar{u}^r(p) \cdot u^s(p) &= \left( \xi^{r\dagger}\sqrt{p\cdot\sigma} \,,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma} \right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma}\,\xi^s + \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma}\,\xi^s = 2m\,\delta^{rs} \,.
\end{aligned} \tag{5.121}
$$

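Here we have used (5.105) in order to arrive at the final expression. For the negative frequency solutions $v(p)$, one derives in an analogous way

$$ v^{r\dagger}(p) \cdot v^s(p) = 2 p_0\, \delta^{rs} \,, \qquad \bar{v}^r(p) \cdot v^s(p) = -2m\, \delta^{rs} \,. \tag{5.122} $$

We can also compute the Lorentz-invariant inner product between $u^r(p)$ and $v^s(p)$. We find

$$
\begin{aligned}
\bar{u}^r(p) \cdot v^s(p) &= \left( \xi^{r\dagger}\sqrt{p\cdot\sigma} \,,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma} \right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta^s \\ -\sqrt{p\cdot\bar\sigma}\,\eta^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma}\,\eta^s - \xi^{r\dagger}\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma}\,\eta^s = 0 \,,
\end{aligned} \tag{5.123}
$$

and similarly $\bar{v}^r(p) \cdot u^s(p) = 0$. The solutions $u(p)$ and $v(p)$ are thus orthogonal to each other. Let us furthermore calculate $u^{r\dagger}(p) \cdot v^s(-p)$ and $v^{r\dagger}(-p) \cdot u^s(p)$.$^{45}$ Defining $\bar{p}^\mu = (p^0, -\mathbf{p})$, one has in the first case

$$
\begin{aligned}
u^{r\dagger}(p) \cdot v^s(-p) &= \left( \xi^{r\dagger}\sqrt{p\cdot\sigma} \,,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma} \right) \begin{pmatrix} \sqrt{\bar{p}\cdot\sigma}\,\eta^s \\ -\sqrt{\bar{p}\cdot\bar\sigma}\,\eta^s \end{pmatrix} \\
&= \xi^{r\dagger}\sqrt{p\cdot\sigma}\sqrt{\bar{p}\cdot\sigma}\,\eta^s - \xi^{r\dagger}\sqrt{p\cdot\bar\sigma}\sqrt{\bar{p}\cdot\bar\sigma}\,\eta^s \,.
\end{aligned} \tag{5.124}
$$

Here the term under the first pair of square roots is given by $(p\cdot\sigma)(\bar{p}\cdot\sigma) = (p_0 - p^i\sigma^i)(p_0 + p^i\sigma^i) = p_0^2 - \mathbf{p}^2 = m^2$. The same result holds for $(p\cdot\bar\sigma)(\bar{p}\cdot\bar\sigma)$. This means that the two terms in the last line of (5.124) cancel, leaving us with

$$ u^{r\dagger}(p) \cdot v^s(-p) = v^{r\dagger}(-p) \cdot u^s(p) = 0 \,. \tag{5.125} $$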
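The spinor products (5.120) to (5.125) lend themselves to a quick numerical test. The sketch below assumes the chiral-basis Dirac matrices and the basis (5.119) for both $\xi^r$ and $\eta^r$; the helper functions `spinors`, `bar`, and `table`, as well as the chosen mass and momentum, are illustrative.

```python
import numpy as np

# Pauli matrices and gamma^0 in the chiral basis (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.block([[Z, s0], [s0, Z]])


def sqrt_herm(M):
    """Square root of a positive Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


def spinors(pvec, m):
    """Return E and the lists [u^1, u^2], [v^1, v^2] of (5.107) and (5.110)."""
    E = np.sqrt(m**2 + pvec @ pvec)
    ps = sqrt_herm(E * s0 - sum(pi * s for pi, s in zip(pvec, sig)))
    psb = sqrt_herm(E * s0 + sum(pi * s for pi, s in zip(pvec, sig)))
    two_basis = np.eye(2, dtype=complex)  # rows are xi^1, xi^2 (also used for eta)
    us = [np.concatenate([ps @ x, psb @ x]) for x in two_basis]
    vs = [np.concatenate([ps @ x, -psb @ x]) for x in two_basis]
    return E, us, vs


def bar(w):
    """Dirac adjoint w-bar = w^dagger gamma^0 as a row vector."""
    return w.conj() @ g0


def table(lefts, rights):
    """2x2 array of products left_r . right_s (lefts are already row vectors)."""
    return np.array([[l @ r for r in rights] for l in lefts])


m = 0.9
pvec = np.array([0.3, 1.2, -0.5])
E, us, vs = spinors(pvec, m)
_, _, vs_flipped = spinors(-pvec, m)  # v(-p): 3-momentum reversed

udag_u = table([u.conj() for u in us], us)           # (5.120): 2 E delta
ubar_u = table([bar(u) for u in us], us)             # (5.121): 2 m delta
vdag_v = table([v.conj() for v in vs], vs)           # (5.122): 2 E delta
vbar_v = table([bar(v) for v in vs], vs)             # (5.122): -2 m delta
ubar_v = table([bar(u) for u in us], vs)             # (5.123): 0
udag_vflip = table([u.conj() for u in us], vs_flipped)  # (5.125): 0
print(np.round(udag_u.real, 12), np.round(ubar_u.real, 12))
```

Each 2x2 array is proportional to the identity or vanishes, as stated in the corresponding equation.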

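Spin Sums

In evaluating Feynman diagrams, we will often wish to sum over the polarization states of a fermion. We can derive the relevant spin sums (or completeness relations) by simple calculations. We start by computing

$$
\begin{aligned}
\sum_{r=1,2} u^r(p)\, \bar{u}^r(p) &= \sum_{r=1,2} \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^r \\ \sqrt{p\cdot\bar\sigma}\,\xi^r \end{pmatrix} \left( \xi^{r\dagger}\sqrt{p\cdot\sigma} \,,\; \xi^{r\dagger}\sqrt{p\cdot\bar\sigma} \right) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} \sqrt{p\cdot\sigma}\sqrt{p\cdot\bar\sigma} & \sqrt{p\cdot\sigma}\sqrt{p\cdot\sigma} \\ \sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\bar\sigma} & \sqrt{p\cdot\bar\sigma}\sqrt{p\cdot\sigma} \end{pmatrix} = \begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar\sigma & m \end{pmatrix} = p\!\!\!/ + m \,.
\end{aligned} \tag{5.126}
$$

Notice that the two spinors appearing on the left-hand side of (5.126) are not contracted. In the derivation of the latter equation, we have used that

$$ \sum_{r=1,2} \xi^r \xi^{r\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{5.127} $$

Similarly, one derives

$$ \sum_{r=1,2} v^r(p)\, \bar{v}^r(p) = p\!\!\!/ - m \,. \tag{5.128} $$

Again, it is crucial that

$$ \sum_{r=1,2} \eta^r \eta^{r\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{5.129} $$

$^{45}$ Our notation is such that with $u(-p)$ we in fact mean $u(-\mathbf{p})$ etc.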
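The completeness relations (5.126) and (5.128) can be confirmed in the same numerical setting. The sketch below assumes the chiral-basis Dirac matrices and the basis (5.119) for $\xi^r$ and $\eta^r$; the outer product `np.outer(u, u_bar)` builds the uncontracted $4 \times 4$ matrix $u\,\bar{u}$. Mass and momentum are illustrative values.

```python
import numpy as np

# Pauli matrices and chiral-basis Dirac matrices (assumed form of (5.10)).
s0 = np.eye(2, dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
Z = np.zeros((2, 2), dtype=complex)
gam = [np.block([[Z, s0], [s0, Z]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]


def sqrt_herm(M):
    """Square root of a positive Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


m = 0.9
pvec = np.array([0.3, 1.2, -0.5])
E = np.sqrt(m**2 + pvec @ pvec)
ps = sqrt_herm(E * s0 - sum(pi * s for pi, s in zip(pvec, sig)))   # sqrt(p . sigma)
psb = sqrt_herm(E * s0 + sum(pi * s for pi, s in zip(pvec, sig)))  # sqrt(p . sigma-bar)
pslash = E * gam[0] - sum(pi * g for pi, g in zip(pvec, gam[1:]))

two_basis = np.eye(2, dtype=complex)  # rows are xi^1, xi^2 (also used for eta)
us = [np.concatenate([ps @ x, psb @ x]) for x in two_basis]
vs = [np.concatenate([ps @ x, -psb @ x]) for x in two_basis]

# Uncontracted products w w-bar with w-bar = w^dagger gamma^0, summed over r.
sum_u = sum(np.outer(u, u.conj() @ gam[0]) for u in us)
sum_v = sum(np.outer(v, v.conj() @ gam[0]) for v in vs)

dev_u = np.linalg.norm(sum_u - (pslash + m * np.eye(4)))  # (5.126)
dev_v = np.linalg.norm(sum_v - (pslash - m * np.eye(4)))  # (5.128)
print(dev_u, dev_v)
```

Both deviations are zero up to rounding errors.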

5.5 Quantization of Dirac Theory

We are now ready to construct the quantum version of the free Dirac field, starting from the relevant action (5.34). We will first proceed naively and treat $\psi$ as we have done in the case of the scalar field. Yet, we will see pretty fast that things go wrong, and we will have to reconsider how to quantize the Dirac theory. Walking down this blind alley will, however, allow us to better understand the relation between spin and statistics. So in the end, it will be quite a useful detour.

Little Detour

We start in the usual way by calculating the momentum conjugate to $\psi$. In fact, we already did this in (5.50), and know that $\pi = i\psi^\dagger$, which does not involve the time derivative of $\psi$. This makes perfect sense, because the Dirac equation is first order in time, so that we need only to specify $\psi$ and $\psi^\dagger$ on an initial time slice to determine the full evolution. In order to quantize the theory we then proceed in analogy with the Klein-Gordon field, and promote $\psi$ and $\psi^\dagger$ to operators, satisfying the following canonical (equal time) commutation relations

$$
\begin{aligned}
[\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})] &= [\psi_\alpha^\dagger(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})] = 0 \,, \\
[\psi_\alpha(\mathbf{x}), \psi_\beta^\dagger(\mathbf{y})] &= \delta^{(3)}(\mathbf{x} - \mathbf{y})\, \delta_{\alpha\beta} \,,
\end{aligned} \tag{5.130}
$$

where $\alpha$ and $\beta$ denote spinor indices. This already looks peculiar. If $\psi$ were real-valued the left-hand side would be antisymmetric under exchange of $\mathbf{x}$ and $\mathbf{y}$, while the right-hand side is symmetric. But $\psi$ is complex, so we do not have a contradiction yet. In fact, we will soon learn that much worse problems arise when we impose commutation relations on the Dirac field. But it is instructive to see how far we can get, in order to better understand the relation between spin and statistics. So let’s press on.

Since we are dealing with a free theory, where any classical solution is a sum of plane waves, we may write the quantum operators in the Schrödinger picture as

$$
\begin{aligned}
\psi(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \right] , \\
\psi^\dagger(\mathbf{x}) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left[ a^{r\dagger}_{\mathbf{p}}\, u^{r\dagger}(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} + b^r_{\mathbf{p}}\, v^{r\dagger}(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} \right] ,
\end{aligned} \tag{5.131}
$$

where the operators $a^{r\dagger}_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$ create particles associated to the positive energy solutions $u^r(p) \exp(i\mathbf{p}\cdot\mathbf{x})$ and negative energy solutions $v^r(p) \exp(-i\mathbf{p}\cdot\mathbf{x})$, respectively. The $a^r_{\mathbf{p}}$ and $b^r_{\mathbf{p}}$ are the corresponding annihilation operators. As in the case of scalar fields, the commutation relations of the fields (5.130) lead to commutation relations for the ladder operators. The nonvanishing commutators are

$$
\begin{aligned}
[a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}] &= (2\pi)^3\, \delta^{(3)}(\mathbf{p} - \mathbf{q})\, \delta^{rs} \,, \\
[b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}] &= -(2\pi)^3\, \delta^{(3)}(\mathbf{p} - \mathbf{q})\, \delta^{rs} \,.
\end{aligned} \tag{5.132}
$$

Notice that the commutator [brp , bsq † ] has a strange minus sign on the right-hand side. It is not obvious that this sign causes trouble, but we should be aware of it. With the commutation relations (5.132) at hand, it is straightforward to show that the relations (5.130) hold. One has h X Z d3 pd3 q 1 † p [arp , asq † ] ur (p)us † (q) ei(p·x−y·q) [ψ(x), ψ (y)] = 6 (2π) 4E E p q r,s=1,2 i r† s r s† −i(p·x−y·q) + [bp , bq ] v (p)v (q) e (5.133) i X Z d3 p 1 h r r† 0 ip·(x−y) r r† 0 −ip·(x−y) = u (p) u ¯ (p) γ e + v (p)¯ v (p) γ e . (2π)3 2Ep r=1,2 In order to simplify this further, we now employ the completeness relations (5.126) and (5.128). It follows that Z 3 i dp 1 h † 0 ip·(x−y) 0 −ip·(x−y) [ψ(x), ψ (y)] = (p / + m) γ e + (p / − m) γ e (2π)3 2Ep Z 3 0 0 i ip·(x−y) dp 1 h 0 0 (5.134) = p γ + p · γ + m γ + p γ − p · γ − m γ e 0 0 (2π)3 2Ep Z 3 d p ip·(x−y) e = δ (3) (x − y) , = (2π)3 as promised. Notice that to obtain the second line we have change the integration from p to −p for what concerns the second term. We also see that the minus sign in the second relation of (5.132) is crucial here, since it is necessary so that the terms p · γ = pi γ i cancel in the final expression. It is also easy to show that the first commutation relation in (5.130) is satisfied once the equations (5.132) are imposed. I leave it to the reader to perform the explicit computation. Equipped with (5.132), we can find the explicit form of the Dirac Hamiltonian in terms of R ladder operators. The Hamiltonian can be simply read off from (5.90), since E = H = d3 x H. Hence, we have H = ψ¯ (−iγ · ∇ + m) ψ , (5.135) as a starting point, which we would like to turn into an operator. We first look at X Z d3 p 1 h r p ap (−p · γ + m) ur (p) eip·x (−iγ · ∇ + m) ψ = 3 (2π) 2E p r=1,2 i r† r −ip·x + bp (p · γ + m) v (p) e .

(5.136)

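In order to find this result it is important to notice that $\mathbf{p}\cdot\mathbf{x} = -p_i x^i$, which explains the "additional" minus sign of the $\mathbf{p}\cdot\boldsymbol{\gamma}$ terms. Using now (5.103) and (5.109) to replace the $\mathbf{p}\cdot\boldsymbol{\gamma}$ terms leads to

$$
(-i\boldsymbol{\gamma}\cdot\nabla + m)\,\psi = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \sqrt{\frac{E_p}{2}}\; \gamma^0 \Big[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \Big] . \tag{5.137}
$$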
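The step from (5.136) to (5.137) rests on the two spinor identities $(-\mathbf{p}\cdot\boldsymbol{\gamma} + m)\,u = E_p\gamma^0 u$ and $(\mathbf{p}\cdot\boldsymbol{\gamma} + m)\,v = -E_p\gamma^0 v$, with $\mathbf{p}\cdot\boldsymbol{\gamma} = p_i\gamma^i$. The following is a minimal numerical sketch of this check. It assumes the chiral representation $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0\end{pmatrix}$ and the spinors $u = (\sqrt{p\cdot\sigma}\,\xi, \sqrt{p\cdot\bar\sigma}\,\xi)$, $v = (\sqrt{p\cdot\sigma}\,\xi, -\sqrt{p\cdot\bar\sigma}\,\xi)$; the helper name `spinors` and the sample values of the mass and momentum are purely illustrative.

```python
import numpy as np

# Pauli matrices and gamma matrices in the chiral representation (an assumption
# of this sketch; the identities themselves are representation independent).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[Z2, I2], [I2, Z2]])
GI = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def spinors(p3, m):
    """Hypothetical helper: return E_p, [u^1, u^2], [v^1, v^2] for three-momentum p3."""
    p3 = np.asarray(p3, dtype=float)
    E = np.sqrt(m**2 + p3 @ p3)
    p_sigma = sum(pi * s for pi, s in zip(p3, SIGMA))
    norm = np.sqrt(2 * (E + m))
    # Closed forms of the matrix square roots: sqrt(p.sigma) = (p.sigma + m)/sqrt(2(E+m)).
    sqrt_ps = ((E + m) * I2 - p_sigma) / norm
    sqrt_psb = ((E + m) * I2 + p_sigma) / norm
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    us = [np.concatenate([sqrt_ps @ xi, sqrt_psb @ xi]) for xi in basis]
    vs = [np.concatenate([sqrt_ps @ xi, -sqrt_psb @ xi]) for xi in basis]
    return E, us, vs


m = 1.3
p3 = np.array([0.4, -0.7, 1.1])
E, us, vs = spinors(p3, m)

# p.gamma in the convention of the text, p_i gamma^i = -(p^i gamma^i).
p_dot_gamma = -sum(pi * g for pi, g in zip(p3, GI))
one = np.eye(4)

lhs = [(-p_dot_gamma + m * one) @ u for u in us] + [(p_dot_gamma + m * one) @ v for v in vs]
rhs = [E * (G0 @ u) for u in us] + [-E * (G0 @ v) for v in vs]
identities_hold = all(np.allclose(l, r) for l, r in zip(lhs, rhs))
print("identities used for (5.137) hold:", identities_hold)
```

The relative sign between the two right-hand sides is what produces the minus sign in front of the $b^{r\dagger}_{\mathbf{p}}$ term in (5.137).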

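We now use this expression to write the Hamiltonian as

$$
\begin{aligned}
H &= \sum_{r,s=1,2} \int \frac{d^3x\, d^3p\, d^3q}{(2\pi)^6} \sqrt{\frac{E_p}{4E_q}}\; \Big[ a^{s\dagger}_{\mathbf{q}}\, u^{s\dagger}(q)\, e^{-i\mathbf{q}\cdot\mathbf{x}} + b^{s}_{\mathbf{q}}\, v^{s\dagger}(q)\, e^{i\mathbf{q}\cdot\mathbf{x}} \Big] \Big[ a^r_{\mathbf{p}}\, u^r(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} - b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}} \Big] \\
&= \sum_{r,s=1,2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2}\, \Big[ a^{s\dagger}_{\mathbf{p}} a^r_{\mathbf{p}}\; u^{s\dagger}(p)\cdot u^r(p) - b^{s}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}}\; v^{s\dagger}(p)\cdot v^r(p) \\
&\qquad\qquad - a^{s\dagger}_{\mathbf{p}} b^{r\dagger}_{-\mathbf{p}}\; u^{s\dagger}(p)\cdot v^r(-p) + b^{s}_{\mathbf{p}} a^{r}_{-\mathbf{p}}\; v^{s\dagger}(p)\cdot u^r(-p) \Big] ,
\end{aligned}
\tag{5.138}
$$

where in the last two terms we have changed $\mathbf{p}$ to $-\mathbf{p}$. Now is the right time to employ the formulas in (5.120), (5.122), and (5.125), which allow us to get rid of the spinor products; in particular, the two mixed terms drop out. We arrive at the simple result

$$
\begin{aligned}
H &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_p\, \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}} \Big] \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_p\, \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} + (2\pi)^3 \delta^{(3)}(0) \Big] .
\end{aligned}
\tag{5.139}
$$

The delta-function term should be familiar to you by now. It is easily dealt with by normal ordering. However, the term $-b^{r\dagger}_{\mathbf{p}} b^r_{\mathbf{p}}$ is a disaster, since it implies that the Hamiltonian is not bounded below, meaning that our quantum theory makes no sense. Taken seriously, it would tell us that we could tumble to states of lower and lower energy by continually producing particles by the action of $b^{r\dagger}_{\mathbf{p}}$. Since the above calculation was a little subtle, you might think that it is possible to rescue the theory by getting the minus signs to work out right. You can play around with different things, but you will always find this minus sign cropping up somewhere. And, in fact, it is telling us something important that we missed.

Further insight into the structure of the Dirac theory can be gained by investigating the causality of the theory. To do this we should calculate $[\psi(x), \psi^\dagger(y)]$, or more conveniently $[\psi(x), \bar{\psi}(y)]$, at non-equal times and hope to find that this commutator is zero outside the light-cone. We start this exercise by switching to the Heisenberg picture, thereby restoring the time-dependence of $\psi$ and $\bar{\psi}$. From (3.104) and (3.106), we infer that

$$
(a^r_{\mathbf{p}})_H = e^{iHt}\, a^r_{\mathbf{p}}\, e^{-iHt} = e^{-iE_p t}\, a^r_{\mathbf{p}} , \qquad (a^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{iE_p t}\, a^{r\dagger}_{\mathbf{p}} , \tag{5.140}
$$

while

$$
(b^r_{\mathbf{p}})_H = e^{iHt}\, b^r_{\mathbf{p}}\, e^{-iHt} = e^{iE_p t}\, b^r_{\mathbf{p}} , \qquad (b^{r\dagger}_{\mathbf{p}})_H = e^{iHt}\, b^{r\dagger}_{\mathbf{p}}\, e^{-iHt} = e^{-iE_p t}\, b^{r\dagger}_{\mathbf{p}} . \tag{5.141}
$$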

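It immediately follows that

$$
\begin{aligned}
\psi(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \Big[ a^r_{\mathbf{p}}\, u^r(p)\, e^{-ipx} + b^{r\dagger}_{\mathbf{p}}\, v^r(p)\, e^{ipx} \Big] , \\
\bar{\psi}(x) &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \Big[ a^{r\dagger}_{\mathbf{p}}\, \bar{u}^r(p)\, e^{ipx} + b^{r}_{\mathbf{p}}\, \bar{v}^r(p)\, e^{-ipx} \Big] .
\end{aligned}
\tag{5.142}
$$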

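We can now compute the commutator. One has

$$
\begin{aligned}
[\psi^\alpha(x), \bar{\psi}^\beta(y)] &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \Big[ \big(u^r(p)\, \bar{u}^r(p)\big)^{\alpha\beta}\, e^{-ip(x-y)} + \big(v^r(p)\, \bar{v}^r(p)\big)^{\alpha\beta}\, e^{ip(x-y)} \Big] \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \Big[ (\slashed{p} + m)^{\alpha\beta}\, e^{-ip(x-y)} + (\slashed{p} - m)^{\alpha\beta}\, e^{ip(x-y)} \Big] \\
&= (i\slashed{\partial}_x + m)^{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \Big[ e^{-ip(x-y)} - e^{ip(x-y)} \Big] .
\end{aligned}
\tag{5.143}
$$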

Looking back at (3.110) and (3.111), we see that this means that

$$
[\psi^\alpha(x), \bar{\psi}^\beta(y)] = (i\slashed{\partial}_x + m)^{\alpha\beta}\, \Delta(x-y) . \tag{5.144}
$$
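The second line of (5.143), like (5.134) before it, relies on the spin sums $\sum_r u^r(p)\bar{u}^r(p) = \slashed{p} + m$ and $\sum_r v^r(p)\bar{v}^r(p) = \slashed{p} - m$. As a hedged cross-check, the sketch below verifies both sums numerically, together with the cancellation of the $\mathbf{p}\cdot\boldsymbol{\gamma}$ terms used in (5.134). It assumes the chiral representation of the gamma matrices and spinors of the form $u = (\sqrt{p\cdot\sigma}\,\xi, \sqrt{p\cdot\bar\sigma}\,\xi)$, $v = (\sqrt{p\cdot\sigma}\,\xi, -\sqrt{p\cdot\bar\sigma}\,\xi)$; the function names and numerical values are illustrative only.

```python
import numpy as np

# Chiral representation (assumed for this sketch).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[Z2, I2], [I2, Z2]])
GI = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def spinors(p3, m):
    """Hypothetical helper: E_p and the u- and v-spinors for three-momentum p3."""
    E = np.sqrt(m**2 + p3 @ p3)
    p_sigma = sum(pi * s for pi, s in zip(p3, SIGMA))
    norm = np.sqrt(2 * (E + m))
    sqrt_ps = ((E + m) * I2 - p_sigma) / norm    # sqrt(p.sigma)
    sqrt_psb = ((E + m) * I2 + p_sigma) / norm   # sqrt(p.sigma-bar)
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    us = [np.concatenate([sqrt_ps @ xi, sqrt_psb @ xi]) for xi in basis]
    vs = [np.concatenate([sqrt_ps @ xi, -sqrt_psb @ xi]) for xi in basis]
    return E, us, vs


def slash(E, p3):
    """p-slash = E gamma^0 - p^i gamma^i for the four-vector (E, p3)."""
    return E * G0 - sum(pi * g for pi, g in zip(p3, GI))


m = 0.9
p3 = np.array([0.2, 0.5, -1.4])
E, us, vs = spinors(p3, m)
one = np.eye(4)

# Spin sums: sum_r u u-bar and sum_r v v-bar, with u-bar = u^dagger gamma^0.
sum_u = sum(np.outer(u, u.conj() @ G0) for u in us)
sum_v = sum(np.outer(v, v.conj() @ G0) for v in vs)
u_ok = np.allclose(sum_u, slash(E, p3) + m * one)
v_ok = np.allclose(sum_v, slash(E, p3) - m * one)

# After p -> -p in the v-term, the bracket in (5.134) collapses to 2 E_p.
bracket = (slash(E, p3) + m * one) @ G0 + (slash(E, -p3) - m * one) @ G0
cancel_ok = np.allclose(bracket, 2 * E * one)
print(u_ok, v_ok, cancel_ok)
```

The last check makes explicit why the factor $1/(2E_p)$ disappears and a pure delta function remains in the equal-time relation.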

This expression vanishes outside the light-cone, because the commutator of the real scalar field $\Delta(x-y) = [\phi(x), \phi(y)]$ does. As a result the quantum version of the Dirac theory is causal. Although there is no problem with causality, it is worthwhile to stare at the commutator in (5.144) a bit longer. If $|0\rangle$ is the vacuum state of the theory,

$$
a^r_{\mathbf{p}}\, |0\rangle = b^r_{\mathbf{p}}\, |0\rangle = 0 , \tag{5.145}
$$

for all $r$ and $\mathbf{p}$, then

$$
[\psi^\alpha(x), \bar{\psi}^\beta(y)] = \langle 0|\, [\psi^\alpha(x), \bar{\psi}^\beta(y)]\, |0\rangle = \langle 0|\, \psi^\alpha(x)\, \bar{\psi}^\beta(y)\, |0\rangle - \langle 0|\, \bar{\psi}^\beta(y)\, \psi^\alpha(x)\, |0\rangle . \tag{5.146}
$$

It is important to realize now that the first (second) matrix element receives contributions only from terms containing the $u^r(p)$ ($v^r(p)$) spinors. Explicitly, one has in the first case

$$
\langle 0|\, \psi^\alpha(x)\, \bar{\psi}^\beta(y)\, |0\rangle = \sum_{r,s=1,2} \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{\sqrt{4E_p E_q}}\, \big(u^r(p)\, \bar{u}^s(q)\big)^{\alpha\beta}\, e^{-i(px-qy)}\, \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle , \tag{5.147}
$$

and a similar expression holds in the second case. It is now crucial to ask the following questions. Can we say something about the matrix elements $\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle$ based on the classical symmetries of the Dirac theory? In particular, how does Lorentz invariance constrain the form of the relevant matrix elements? For the ground

state $|0\rangle$ to be invariant under translations, we must have $\exp(iP\cdot x)\,|0\rangle = |0\rangle$. In analogy to (5.140) the action of $\exp(iP\cdot x)$ on the ladder operators can be shown to lead to

$$
e^{iP\cdot x}\, a^r_{\mathbf{p}}\, e^{-iP\cdot x} = e^{ip\cdot x}\, a^r_{\mathbf{p}} , \qquad e^{iP\cdot x}\, a^{r\dagger}_{\mathbf{p}}\, e^{-iP\cdot x} = e^{-ip\cdot x}\, a^{r\dagger}_{\mathbf{p}} . \tag{5.148}
$$

Analogous expressions hold in the case of $b^r_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$. Therefore,

$$
\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, e^{iP\cdot x}\, |0\rangle = e^{i(p-q)\cdot x}\, \langle 0|\, e^{iP\cdot x}\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = e^{i(p-q)\cdot x}\, \langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle . \tag{5.149}
$$

This implies that the matrix element can only be non-zero if $\mathbf{p} = \mathbf{q}$. Similarly, it can be shown that rotational invariance of $|0\rangle$ requires that $r = s$, which should be intuitively clear. From these considerations, one concludes that the matrix element can be written as

$$
\langle 0|\, a^r_{\mathbf{p}}\, a^{s\dagger}_{\mathbf{q}}\, |0\rangle = (2\pi)^3\, \delta^{(3)}(\mathbf{p}-\mathbf{q})\, \delta^{rs}\, A(p) , \tag{5.150}
$$

where $A(p)$ is an arbitrary function that is so far undetermined. Inserting the latter result into (5.147) gives

$$
\begin{aligned}
\langle 0|\, \psi^\alpha(x)\, \bar{\psi}^\beta(y)\, |0\rangle &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, \big(u^r(p)\, \bar{u}^r(p)\big)^{\alpha\beta}\, e^{-ip(x-y)}\, A(p) \\
&= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, (\slashed{p} + m)^{\alpha\beta}\, e^{-ip(x-y)}\, A(p) .
\end{aligned}
\tag{5.151}
$$

For this expression to be invariant under boosts, $A(p)$ must be a Lorentz scalar, i.e., $A(p) = A(p^2)$. In fact, since $p^2 = m^2$ it follows that $A$ has to be a positive constant. The positivity of $A$ is the result of the positivity of the norm of states in any self-respecting Hilbert space. Hence,

$$
\langle 0|\, \psi^\alpha(x)\, \bar{\psi}^\beta(y)\, |0\rangle = A\, (i\slashed{\partial}_x + m)^{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, e^{-ip(x-y)} . \tag{5.152}
$$

In a similar fashion, we can also calculate the second matrix element in (5.146). The final result reads

$$
\langle 0|\, \bar{\psi}^\beta(y)\, \psi^\alpha(x)\, |0\rangle = -B\, (i\slashed{\partial}_x + m)^{\alpha\beta} \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, e^{ip(x-y)} , \tag{5.153}
$$

where $B$ is another positive constant. The minus sign is important. It arises from the completeness relation (5.128) of the $v^r(p)$ spinors and the sign of $x$ in the exponential. From (5.152) and (5.153) we see that the two terms in the last line of (5.146) would indeed cancel if $A = -B$. Yet, this is impossible since $A$ and $B$ must both be positive. So how can we resolve this apparent contradiction? Setting $A = B = 1$, it follows from (5.152) and (5.153) that (outside the light-cone)

$$
\langle 0|\, \psi^\alpha(x)\, \bar{\psi}^\beta(y)\, |0\rangle = -\langle 0|\, \bar{\psi}^\beta(y)\, \psi^\alpha(x)\, |0\rangle , \tag{5.154}
$$

which means that the spinor fields anticommute at space-like separation. This suggests that postulating the commutation relations (5.130) for the spinor fields was the mistake that led to the negative energy problem in (5.139).

Fermionic Quantization

The key piece of physics that we obviously missed before is that spin-1/2 particles are fermions, meaning that they obey Fermi-Dirac statistics with the quantum state picking up a minus sign upon the interchange of any two particles, as indicated by (5.154). This fact is embedded into the structure of relativistic QFT: the spin-statistics theorem tells us that integer spin fields must be quantized as bosons, while half-integer spin fields must be quantized as fermions. Any attempt to do otherwise will lead to an inconsistency. All inconsistencies are removed by postulating the equal-time anticommutation relations for the Dirac field,

$$
\begin{aligned}
\{\psi^\alpha(\mathbf{x}), \psi^\beta(\mathbf{y})\} &= \{\psi^{\alpha\dagger}(\mathbf{x}), \psi^{\beta\dagger}(\mathbf{y})\} = 0 , \\
\{\psi^\alpha(\mathbf{x}), \psi^{\beta\dagger}(\mathbf{y})\} &= \delta^{(3)}(\mathbf{x}-\mathbf{y})\, \delta^{\alpha\beta} ,
\end{aligned}
\tag{5.155}
$$

instead of (5.130). In this case we still have the expansions (5.131) and (5.142) in terms of the ladder operators $a^r_{\mathbf{p}}$, $a^{r\dagger}_{\mathbf{p}}$, $b^r_{\mathbf{p}}$, and $b^{r\dagger}_{\mathbf{p}}$, but the line of reasoning that led to (5.132) now tells us that

$$
\{a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\, \delta^{(3)}(\mathbf{p}-\mathbf{q})\, \delta^{rs} , \qquad \{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\, \delta^{(3)}(\mathbf{p}-\mathbf{q})\, \delta^{rs} , \tag{5.156}
$$

while all other anticommutators vanish identically. Using these anticommutation relations, we can now compute the Hamiltonian again, finding

$$
\begin{aligned}
H &= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_p\, \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} - b^{r}_{\mathbf{p}} b^{r\dagger}_{\mathbf{p}} \Big] \\
&= \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, E_p\, \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} + b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} - (2\pi)^3 \delta^{(3)}(0) \Big] .
\end{aligned}
\tag{5.157}
$$

We see that the anticommutators have saved us from the indignity of an unbounded Hamiltonian. Notice that when normal ordering, we now throw away a negative infinite contribution proportional to $-(2\pi)^3 \delta^{(3)}(0)$ and not a positive one as in the case of the scalar field (3.19). In principle, the negative contribution from fermionic fields could (partially) cancel the positive contribution arising from bosonic fields. So one could hope that if there is a symmetry relating fermions and bosons to each other, a so-called supersymmetry, the cosmological constant problem might be solvable. In fact, it can be shown that supersymmetry solves the cosmological constant problem "halfway", but does not render a complete solution. If you want to figure out what "halfway" actually means, I recommend having a look at the excellent review [15] and the relevant references therein.

For completeness let me also quote the expression for the momentum operator. Inserting the expansions (5.131) into (5.91), one finds after a straightforward calculation and normal ordering the following result

$$
\mathbf{P} = \sum_{r=1,2} \int \frac{d^3p}{(2\pi)^3}\, \mathbf{p}\, \Big[ a^{r\dagger}_{\mathbf{p}} a^r_{\mathbf{p}} + b^{r\dagger}_{\mathbf{p}} b^{r}_{\mathbf{p}} \Big] . \tag{5.158}
$$
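To make the sign flip between $-b\,b^\dagger$ and $+b^\dagger b$ in (5.157) tangible, one can look at a toy model with a single momentum and spin mode for the particle and the antiparticle. The sketch below is only an illustration under stated assumptions: the two modes are represented by $4\times 4$ matrices built with a Jordan-Wigner-type construction (not used in the text), the continuum normalization $(2\pi)^3\delta^{(3)}(0)$ is replaced by $1$, and the names `a`, `b`, `anti`, `comm` and the value of `E` are hypothetical.

```python
import numpy as np

# Single fermionic mode: basis |0> = (1, 0), |1> = (0, 1); `lower` maps |1> to |0>.
lower = np.array([[0.0, 1.0], [0.0, 0.0]])
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
I4 = np.eye(4)

# Two anticommuting modes (toy stand-ins for a^r_p and b^r_p).
a = np.kron(lower, I2)
b = np.kron(Z, lower)      # the Z factor makes a and b anticommute


def anti(x, y):
    """Anticommutator {x, y}."""
    return x @ y + y @ x


def comm(x, y):
    """Commutator [x, y]."""
    return x @ y - y @ x


E = 2.0
# All matrices are real, so the Hermitian conjugate is the transpose.
H = E * (a.T @ a - b @ b.T)        # first line of (5.157) for one mode

levels = np.linalg.eigvalsh(H)     # sorted energy levels
print("energy levels:", levels)

algebra_ok = (np.allclose(anti(a, a.T), I4) and np.allclose(anti(b, b.T), I4)
              and np.allclose(anti(a, b), 0) and np.allclose(anti(a, b.T), 0))
rewrite_ok = np.allclose(H, E * (a.T @ a + b.T @ b - I4))   # second line of (5.157)
pauli_ok = np.allclose(a.T @ a.T, 0)                        # no double occupation
ladder_ok = np.allclose(comm(H, a.T), E * a.T) and np.allclose(comm(H, b.T), E * b.T)
```

In this toy model the spectrum is bounded below, the constant shift $-E$ plays the role of the term removed by normal ordering, and both $a^\dagger$ and $b^\dagger$ raise the energy by $E$.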

Fermi-Dirac Statistics

Although the ladder operators now obey anticommutation relations, the Hamiltonian (5.157) has nice commutation relations with them. You can check easily that

$$
[H, a^{r\dagger}_{\mathbf{p}}] = E_p\, a^{r\dagger}_{\mathbf{p}} , \qquad [H, a^{r}_{\mathbf{p}}] = -E_p\, a^{r}_{\mathbf{p}} , \tag{5.159}
$$

and likewise in the case of $b^r_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$. As in the scalar case (3.43), this implies that we can again construct a tower of energy eigenstates by acting on $|0\rangle$ with $a^{r\dagger}_{\mathbf{p}}$ and $b^{r\dagger}_{\mathbf{p}}$ to create particles and antiparticles. E.g., we have the one-particle state

$$
|\mathbf{p}, r\rangle = a^{r\dagger}_{\mathbf{p}}\, |0\rangle , \tag{5.160}
$$

with momentum $\mathbf{p}$ and spin quantum number $r$. The two-particle state

$$
|\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle = a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle , \tag{5.161}
$$

obeys

$$
|\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle = a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle = -a^{r_2\dagger}_{\mathbf{p}_2}\, a^{r_1\dagger}_{\mathbf{p}_1}\, |0\rangle = -|\mathbf{p}_2, r_2; \mathbf{p}_1, r_1\rangle , \tag{5.162}
$$

due to (5.156). This confirms that the particles do satisfy Fermi-Dirac statistics as anticipated. In particular, we have Pauli's exclusion principle $|\mathbf{p}, r; \mathbf{p}, r\rangle = 0$ for all $\mathbf{p}$ and $r$. Finally, if one wants to be sure about the spin of the particle, one could act with the angular momentum operator $J^i = \int d^3x\, \epsilon^{ijk} (J^0)^{jk}$ constructed from (5.94) to confirm that a stationary particle $|\mathbf{p} = 0, r\rangle$ does indeed carry intrinsic angular momentum 1/2. This exercise, which is left to the reader, will show that in the case of $|\mathbf{p} = 0, r\rangle$ only the third term in (5.94) gives a non-vanishing contribution to the internal angular momentum.

Dirac's Hole Interpretation

Before discussing the propagator of the Dirac field, a historical remark seems to be in order. Dirac originally viewed his equation (5.35) as a relativistic version of the Schrödinger equation, considering $\psi$ as the wavefunction of a single particle with spin 1/2 (a fact which is put in by hand in Dirac's theory). In order to reinforce this interpretation, he wrote (5.35) as

$$
i\, \frac{\partial \psi}{\partial t} = -i\boldsymbol{\alpha}\cdot\nabla\psi + m\beta\, \psi = H\psi , \tag{5.163}
$$

with $\boldsymbol{\alpha} = -\gamma^0 \boldsymbol{\gamma}$ and $\beta = \gamma^0$. The operator $H$ appearing in the above equation is then understood as the one-particle Hamiltonian. Notice that this viewpoint is quite different from the one we have held so far, where $\psi$ is a classical field that gets quantized. In Dirac's view, the Hamiltonian is defined by (5.163), while for us the Hamiltonian is given by the field operator (5.157). But for the moment let us stick to (5.163) and see where it led Dirac and where it leads us. With the interpretation of $\psi$ as a single-particle wavefunction, the plane-wave solutions (5.102) and (5.108) are thought of as energy eigenstates, satisfying

$$
i\, \frac{\partial \psi}{\partial t} = i\, \frac{\partial}{\partial t}\, u(p)\, e^{-ipx} = E_p\, u(p)\, e^{-ipx} = E_p\, \psi , \tag{5.164}
$$

and an analogous relation for $\psi = v(p)\, e^{ipx}$ with $E_p$ replaced by $-E_p$. The plane-wave solutions thus look like positive and negative energy solutions. The spectrum is again unbounded from below, because there are states $v(p)$ with arbitrarily low energy $-E_p$. At first glance this is disastrous, just like the unbounded field theory Hamiltonian of (5.139). Paul Dirac's ingenious solution to this problem was to turn to the Pauli exclusion principle. In 1930, Dirac proposed that in the true vacuum of the universe, all the negative energy states are filled, so that only the positive energy states are accessible. The filled negative energy states are referred to as the Dirac sea. Although you might worry about the infinite negative charge of the vacuum, Dirac argued that only charge differences would be observable (a trick reminiscent of the normal ordering prescription we use for field operators).

Having avoided the problem with the anomalous negative-energy quantum states by introducing an infinite sea comprised of occupied negative energy states, Dirac realized that his theory made a shocking prediction. Suppose that a negative energy state is excited to a positive energy state, leaving behind a hole in the Dirac sea. The hole would have all the properties of the electron, except it would carry positive charge. After flirting with the idea that it may be the proton,$^{46}$ Dirac concluded that the hole is a new particle, the positron. It took only a couple of years before the positron was discovered experimentally in 1932 by Carl Anderson, with all the physical properties predicted for the Dirac hole.

Although Dirac's physical insight led him to the right answer, we now understand that the interpretation of the Dirac equation as an equation for a single-particle wavefunction is not really correct. E.g., Dirac's argument for antimatter relies crucially on the particles being fermions while, as we have seen already in this course, antiparticles exist for both fermions and bosons.
What we really learn from Dirac's analysis is that there is no consistent way to interpret the solution of the Dirac equation as a single-particle wavefunction. It is instead to be thought of as a classical field which has only positive energy solutions, since the Hamiltonian (5.90) is positive definite. Quantization of this field then gives rise to both particle and antiparticle excitations and makes the vacuum the state in which no particles exist instead of an infinite sea of particles. This picture is much more convincing, especially since it recaptures all the valid predictions of the Dirac sea, such as electron-positron annihilation. On the other hand, the field formulation does not eliminate all the difficulties raised by the Dirac sea. In particular, the problem of the vacuum possessing infinite energy is still present.

Feynman Propagator

We now look at the anticommutator of the fields $\psi(x)$ and $\bar{\psi}(y)$. Dropping the indices $\alpha$ and $\beta$ from here on, we simply write

$$
iS(x-y) = \{\psi(x), \bar{\psi}(y)\} . \tag{5.165}
$$

Footnote 46: Robert Oppenheimer pointed out that an electron and its hole would be able to annihilate each other, releasing energy on the order of the electron's rest energy in the form of energetic photons. If holes were protons, stable atoms would thus not exist, which is clearly in contradiction with observations. Hermann Weyl also noted that a hole should have the same mass as an electron, whereas the proton is about 2000 times heavier.

Inserting the expansions (5.142), we essentially only have to repeat the calculation that led to (5.143), to obtain

$$
iS(x-y) = (i\slashed{\partial}_x + m)\, \big[ D(x-y) - D(y-x) \big] , \tag{5.166}
$$

where $D(x-y)$ is the propagator (3.114) of the real scalar field. The object $iS(x-y)$ is called the fermionic propagator. Some comments seem to be in order here. For space-like separated points, $(x-y)^2 < 0$, we have already seen in (3.117) that $D(x-y) - D(y-x) = 0$. In the bosonic theory, we made a big deal out of this, since it ensured that $[\phi(x), \phi(y)] = 0$ for $(x-y)^2 < 0$, which we took as a proof of causality. However, in the case of fermions we now have $\{\psi(x), \bar{\psi}(y)\} = 0$ for $(x-y)^2 < 0$. What happened to causality? The best that we can say is that all our observables are bi-linear in $\psi$ and $\bar{\psi}$, e.g., the Hamiltonian operator (5.157) or the momentum operator (5.158). These objects still commute outside the light-cone. The theory remains causal as long as individual fermionic operators are not observable. If you think this is a weak argument, remember that no one has ever seen a physical device come back to minus itself when you rotate it by $2\pi$!

Notice furthermore that the propagator satisfies $(i\slashed{\partial}_x - m)\, S(x-y) = 0$, since $(\partial_x^2 + m^2)\, D(x-y) = 0$ using the on-shell condition $p^2 = m^2$. By a calculation similar to that above, we can determine the VEVs of the bi-linears,

$$
\begin{aligned}
\langle 0|\, \psi(x)\, \bar{\psi}(y)\, |0\rangle &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, (\slashed{p} + m)\, e^{-ip(x-y)} , \\
\langle 0|\, \bar{\psi}(y)\, \psi(x)\, |0\rangle &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p}\, (\slashed{p} - m)\, e^{ip(x-y)} ,
\end{aligned}
\tag{5.167}
$$

which allows us to define the fermionic Feynman propagator $S_F(x-y)$, which is a $4\times 4$ matrix, as the following time-ordered product

$$
S_F(x-y) = \langle 0|\, T\, \psi(x)\, \bar{\psi}(y)\, |0\rangle = \theta(x^0 - y^0)\, \langle 0|\, \psi(x)\, \bar{\psi}(y)\, |0\rangle - \theta(y^0 - x^0)\, \langle 0|\, \bar{\psi}(y)\, \psi(x)\, |0\rangle , \tag{5.168}
$$

where the minus sign in front of the second term is crucial in the QFT of fermions. When $(x-y)^2 < 0$, there is no invariant way to determine whether $x^0 > y^0$ or $x^0 < y^0$. In this case the minus sign is necessary to make the two definitions agree, since $\{\psi(x), \bar{\psi}(y)\} = 0$ outside the light-cone. In full analogy to the scalar case, there is also a 4-momentum integral representation for the Feynman propagator. It reads

$$
S_F(x-y) = i \int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)}\, \frac{\slashed{p} + m}{p^2 - m^2 + i\epsilon} , \tag{5.169}
$$

and satisfies

$$
(i\slashed{\partial}_x - m)\, S_F(x-y) = i\, \delta^{(4)}(x-y) , \tag{5.170}
$$

which means that $S_F(x-y)$ is a Green's function of the Dirac operator.

The minus sign that we see in (5.168) also occurs for any string of operators inside any time-ordered product. While bosonic operators commute inside $T$, fermionic operators anticommute. We have this same behavior for normal-ordered products as well, with fermionic operators receiving a minus sign when their order is changed. With the understanding that all fermionic operators anticommute inside time- or normal-ordered products, Wick's theorem proceeds just as in the bosonic case, which has been outlined in great detail in Section 4.4. In the fermionic case, we define the contraction as

$$
\overbrace{\psi(x)\, \bar{\psi}(y)} = T\, \psi(x)\, \bar{\psi}(y) - :\!\psi(x)\, \bar{\psi}(y)\!: \; = S_F(x-y) . \tag{5.171}
$$
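In momentum space the Green's function property (5.170) reduces to the algebraic statement $(\slashed{p} - m)\, \frac{i(\slashed{p} + m)}{p^2 - m^2} = i\,\mathbb{1}$, which follows from $\slashed{p}\slashed{p} = p^2$. The short sketch below checks this for an arbitrary off-shell momentum. The chiral representation of the gamma matrices and the sample values of $p^\mu$ and $m$ are assumptions made for illustration; the $i\epsilon$ term is dropped because the chosen momentum is away from the mass shell.

```python
import numpy as np

# Gamma matrices in the chiral representation (assumed; any representation works).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G0 = np.block([[Z2, I2], [I2, Z2]])
GI = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
I4 = np.eye(4, dtype=complex)

# An arbitrary off-shell four-momentum p^mu = (p^0, p^1, p^2, p^3) and a mass.
p = np.array([0.9, 0.3, -0.5, 1.7])
m = 1.2

p_slash = p[0] * G0 - sum(pi * g for pi, g in zip(p[1:], GI))
p_squared = p[0]**2 - p[1:] @ p[1:]

# Integrand of (5.169) without the plane-wave factor.
propagator = 1j * (p_slash + m * I4) / (p_squared - m**2)

# Acting with the Dirac operator in momentum space should give i times the identity.
residual = (p_slash - m * I4) @ propagator
is_green_function = np.allclose(residual, 1j * I4)
print("(p-slash - m) S_F(p) = i:", is_green_function)
```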

Yukawa Theory

Based on the experience gained in Section 4.6, it is now straightforward to work out the Feynman rules needed to calculate fermion correlation functions. Let us for definiteness consider the case of the Yukawa theory,

$$
\mathcal{L} = \frac{1}{2}\, (\partial_\mu \phi)^2 - \frac{1}{2}\, \mu^2 \phi^2 + \bar{\psi}\, (i\slashed{\partial} - m)\, \psi - \lambda\, \phi\, \bar{\psi}\psi , \tag{5.172}
$$

which describes the interaction of a scalar field with mass $\mu$ and a Dirac field with mass $m$. Couplings of this type appear in the SM between fermions and the Higgs boson, and give mass to the fermionic degrees of freedom after electroweak symmetry breaking. In that context, the fermions can be charged leptons (possibly neutrinos) or quarks. If you wish, (5.172) is thus the proper version of the scalar Yukawa theory of (4.8). Notice, however, that there is an important difference following from the dimensions of the involved fields. We still have $[\phi] = 1$, but the kinetic term of the fermion requires that $[\psi] = 3/2$. Thus, unlike in the case with only scalars, the coupling is dimensionless, i.e., $[\lambda] = 0$.

In order to get a grip on the Feynman rules, let us study $\psi\psi \to \psi\psi$ scattering. This is pretty much the same calculation we have already performed in Section 4.5. The only minor modification is that now the particles that scatter have spin, while the nucleons $N$ we considered earlier on are scalars. In analogy to (4.48) we write the initial and final states as

$$
\begin{aligned}
|i\rangle &= \sqrt{4E_{p_1} E_{p_2}}\; a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle = |\mathbf{p}_1, r_1; \mathbf{p}_2, r_2\rangle , \\
|f\rangle &= \sqrt{4E_{q_1} E_{q_2}}\; a^{s_1\dagger}_{\mathbf{q}_1}\, a^{s_2\dagger}_{\mathbf{q}_2}\, |0\rangle = |\mathbf{q}_1, s_1; \mathbf{q}_2, s_2\rangle .
\end{aligned}
\tag{5.173}
$$

Notice that for these states one has to be careful when taking the adjoint, since the fermionic creation operators anticommute. E.g., the final-state bra is

$$
\langle f| = \sqrt{4E_{q_1} E_{q_2}}\; \langle 0|\, a^{s_2}_{\mathbf{q}_2}\, a^{s_1}_{\mathbf{q}_1} . \tag{5.174}
$$

To get a contribution to the scattering of two fermions, we have to calculate the $O(\lambda^2)$ corrections to the $T$-matrix element $i\langle f|T|i\rangle$. The relevant contribution to $iT$ takes the form

$$
\frac{(-i\lambda)^2}{2} \int d^4x\, d^4y\; T\Big( \phi(x)\, \bar{\psi}(x)\psi(x)\; \phi(y)\, \bar{\psi}(y)\psi(y) \Big) , \tag{5.175}
$$

where all fields are in the interaction picture. Just like in the case of the bosonic calculation, the contribution to $\psi\psi \to \psi\psi$ scattering comes from the term where the two $\phi$ fields are contracted,

$$
D_F(x-y)\; :\!\bar{\psi}(x)\psi(x)\, \bar{\psi}(y)\psi(y)\!: \; . \tag{5.176}
$$

Figure 5.1: Feynman diagrams contributing to $\psi\psi \to \psi\psi$ scattering at order $\lambda^2$. [Two diagrams with external fermion lines labelled $p_1$, $p_2$, $q_1$, $q_2$ and an internal $\phi$ line.]

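We can now study the action of the fermionic operators on $|i\rangle$. By expanding the $\psi$ operators, but not the $\bar{\psi}$ fields, we find

$$
\begin{aligned}
:\!\bar{\psi}(x)\psi(x)\, \bar{\psi}(y)\psi(y)\!: \; a^{r_1\dagger}_{\mathbf{p}_1} a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle = &- \int \frac{d^3k_1\, d^3k_2}{(2\pi)^6}\, \big[ \bar{\psi}(x)\cdot u^{t_1}(k_1) \big] \big[ \bar{\psi}(y)\cdot u^{t_2}(k_2) \big] \\
&\times \frac{e^{-i(k_1\cdot x + k_2\cdot y)}}{\sqrt{4E_{k_1} E_{k_2}}}\; a^{t_1}_{\mathbf{k}_1}\, a^{t_2}_{\mathbf{k}_2}\, a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle .
\end{aligned}
\tag{5.177}
$$

Here the $b^{t_1\dagger}_{\mathbf{k}_1}$ and $b^{t_2\dagger}_{\mathbf{k}_2}$ terms in the expansion of $\psi$ have been ignored, since they do not contribute to the considered process at $O(\lambda^2)$, and the brackets indicate how the spinor indices are contracted. Notice finally that the overall minus sign arises from moving $\psi(x)$ past $\bar{\psi}(y)$. By anticommuting the annihilation operators past the creation operators and performing the momentum integrations using the delta functions, we then get for the right-hand side of (5.177) the following expression

$$
\begin{aligned}
-\frac{1}{\sqrt{4E_{p_1} E_{p_2}}}\, \Big( &- \big[ \bar{\psi}(x)\cdot u^{r_1}(p_1) \big] \big[ \bar{\psi}(y)\cdot u^{r_2}(p_2) \big]\, e^{-i(p_1\cdot x + p_2\cdot y)} \\
&+ \big[ \bar{\psi}(x)\cdot u^{r_2}(p_2) \big] \big[ \bar{\psi}(y)\cdot u^{r_1}(p_1) \big]\, e^{-i(p_1\cdot y + p_2\cdot x)} \Big)\, |0\rangle .
\end{aligned}
\tag{5.178}
$$

Note the relative minus sign between the two individual terms. We now act with $\langle f|$ on this expression from the left. Let us first have a look at what happens to the first term in (5.178). Ignoring prefactors and exponentials, we have

$$
\begin{aligned}
\langle 0|\, a^{s_2}_{\mathbf{q}_2} a^{s_1}_{\mathbf{q}_1}\, \big[ \bar{\psi}(x)\cdot u^{r_1}(p_1) \big] \big[ \bar{\psi}(y)\cdot u^{r_2}(p_2) \big]\, |0\rangle = &\; \frac{e^{i(q_1\cdot x + q_2\cdot y)}}{\sqrt{4E_{q_1} E_{q_2}}}\, \big( \bar{u}^{s_1}(q_1)\cdot u^{r_1}(p_1) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_2}(p_2) \big) \\
&- \frac{e^{i(q_1\cdot y + q_2\cdot x)}}{\sqrt{4E_{q_1} E_{q_2}}}\, \big( \bar{u}^{s_1}(q_1)\cdot u^{r_2}(p_2) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_1}(p_1) \big) .
\end{aligned}
\tag{5.179}
$$

In fact, the second term in (5.178) can be shown to give the same result up to a sign. Both terms thus add, which cancels the factor of 1/2 in (5.175). Furthermore, the square roots of energies in (5.179) cancel against the relativistic normalizations of the states (5.173). Putting everything together and including the Feynman propagator of the $\phi$ field, we end up with

$$
\begin{aligned}
(-i\lambda)^2 \int \frac{d^4x\, d^4y\, d^4k}{(2\pi)^4}\, \frac{i\, e^{ik\cdot(x-y)}}{k^2 - \mu^2 + i\epsilon}\, \Big[ &\big( \bar{u}^{s_1}(q_1)\cdot u^{r_1}(p_1) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_2}(p_2) \big)\, e^{i[(q_1 - p_1)\cdot x + (q_2 - p_2)\cdot y]} \\
&- \big( \bar{u}^{s_1}(q_1)\cdot u^{r_2}(p_2) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_1}(p_1) \big)\, e^{i[(q_2 - p_1)\cdot x + (q_1 - p_2)\cdot y]} \Big] .
\end{aligned}
\tag{5.180}
$$

Performing the integrations over $x$ and $y$ and suppressing a factor $i(2\pi)^4$, which will end up in $i\langle f|T|i\rangle = i(2\pi)^4\, \delta^{(4)}(p_1 + p_2 - q_1 - q_2)\, \mathcal{A}(\psi\psi \to \psi\psi)$, this becomes

$$
\begin{aligned}
(-i\lambda)^2 \int \frac{d^4k}{k^2 - \mu^2 + i\epsilon}\, \Big[ &\big( \bar{u}^{s_1}(q_1)\cdot u^{r_1}(p_1) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_2}(p_2) \big)\, \delta^{(4)}(q_1 - p_1 + k)\, \delta^{(4)}(q_2 - p_2 - k) \\
&- \big( \bar{u}^{s_1}(q_1)\cdot u^{r_2}(p_2) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_1}(p_1) \big)\, \delta^{(4)}(q_2 - p_1 + k)\, \delta^{(4)}(q_1 - p_2 - k) \Big] ,
\end{aligned}
\tag{5.181}
$$

from which we can immediately read off the result for the scattering amplitude

$$
\mathcal{A}(\psi\psi \to \psi\psi) = (-i\lambda)^2 \left[ \frac{\big( \bar{u}^{s_1}(q_1)\cdot u^{r_1}(p_1) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_2}(p_2) \big)}{(p_1 - q_1)^2 - \mu^2 + i\epsilon} - \frac{\big( \bar{u}^{s_1}(q_1)\cdot u^{r_2}(p_2) \big) \big( \bar{u}^{s_2}(q_2)\cdot u^{r_1}(p_1) \big)}{(p_1 - q_2)^2 - \mu^2 + i\epsilon} \right] . \tag{5.182}
$$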
(5.182)

Honestly, the derivation of the expression for A(ψψ → ψψ) was a bit tedious. Can it be done more easily? Yes, it can! Of course, the trick is again to use Feynman diagrams and rules. The lowest-order Feynman graphs for the scattering of two fermions into two fermions are shown in Figure 5.1. Starring at those diagrams as well as (5.182), it is, in fact, easy to guess the Feynman rules that reproduce the final result for the ψψ → ψψ scattering amplitude. The relevant momentum-space Feynman rules involving fermions and antifermions turn out to be: i (p/ + m) = 2 . 1. For each propagator one has p − m2 + i p

= −iλ .

2. For each vertex one has

3. For each external fermion one has

p

= us (p) (initial state) ,

141

p

= u¯s (p) (final state) .

4. For each external antifermion one has

= v¯s (p) (initial state) ,

←p

p→

= v s (p) (final state) .

5. Impose momentum conservation at each vertex. Z 6. Integrate over each undetermined momentum

d4 l . (2π)4

7. Figure out the overall sign of the diagram. The Feynman rule for the propagator of the scalar field φ (indicated by a dashed line) has already been given in Section 4.6 and external scalar legs just give a trivial factor of 1. Several comments regarding the above rules are in order. First, the direction of the momentum on a fermion line is significant. On external lines, the direction of the momentum is always ingoing (outgoing) for initial-state (final-state) particles. This follows from the expan¯ where the annihilation (creation) operators are multiplied by sion of the operators ψ and ψ, exp (−ipx) (exp (ipx)) as can be seen from (5.142). On internal lines, represented by propagators, the momentum must be assigned in the direction of the particle-number flow (for electrons, this is the direction of the negative charge flow). It is conventional to draw arrows on fermion lines to represent the direction of the particle-number flow. The momentum assigned to a fermion then flows in the direction of this arrow, while in the case of an antifermion particle-number and momentum flow are opposite to each other. Hence an additional arrow, identifying the momentum flow, has been drawn next to the antifermion line. Second, in the case of the Yukawa theory the 1/n! factor from the Taylor expansion of the time-ordered exponential is always cancelled by the n! ways of interchanging the vertices to obtain the same contraction. In the case at hand there is thus no need for symmetry factors, ¯ cannot replace each other in a contraction. given that the fields in the interaction term −λφ ψψ Third, the Dirac indices contract together along fermion lines. This happened in the case of ψψ → ψψ scattering (5.182), but will also happen in more complicated diagrams like e.g. p4

p3

p2

p1

∝ u¯(p4 )

i (p/3 + m) i (p/2 + m) u(p1 ) . (p23 − m2 ) (p22 − m2 )

(5.183)

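Fourth and finally, we should understand how to determine the correct overall sign of the diagrams. Let us return to the case of fermion-fermion scattering (5.182). Here the $t$-channel diagram has a plus sign, while the $u$-channel contribution receives a minus sign. Where does the relative minus sign between the two graphs come from? Let us look at the Wick contractions. For the contractions corresponding to the $t$-channel diagram in Figure 5.1, we have

$$
\langle 0|\, a^{s_2}_{\mathbf{q}_2}\, a^{s_1}_{\mathbf{q}_1}\; \bar{\psi}_x \psi_x\, \bar{\psi}_y \psi_y\; a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle , \tag{5.184}
$$

with the external operators contracted with the fields as in the $t$-channel diagram (the contraction lines are not reproduced here). This contraction can be untangled by moving $\bar{\psi}_y = \bar{\psi}(y)$ two spaces to the left, and so one picks up a factor $(-1)^2 = 1$. On the other hand, the contraction corresponding to the $u$-channel diagram in Figure 5.1 reads

$$
\langle 0|\, a^{s_2}_{\mathbf{q}_2}\, a^{s_1}_{\mathbf{q}_1}\; \bar{\psi}_x \psi_x\, \bar{\psi}_y \psi_y\; a^{r_1\dagger}_{\mathbf{p}_1}\, a^{r_2\dagger}_{\mathbf{p}_2}\, |0\rangle , \tag{5.185}
$$

now with the external operators contracted with the fields as in the $u$-channel diagram.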

Here we only have to move $\bar{\psi}_y$ one space to the left, giving a factor of $-1$. The relative minus sign between the two diagrams is a reflection of the Fermi-Dirac statistics. In more complicated graphs the overall sign can be determined most easily by noting that $(\bar{\psi}\psi)_x = \bar{\psi}(x)\psi(x)$, as well as any other pair of fermions, commutes with any operator. Thus, e.g., for a chain of contractions between neighbouring vertices,

$$
\begin{aligned}
\ldots\, (\bar{\psi}\psi)_x\, (\bar{\psi}\psi)_y\, (\bar{\psi}\psi)_z\, (\bar{\psi}\psi)_w\, \ldots &= \ldots\, (+1)\, (\bar{\psi}\psi)_x\, (\bar{\psi}\psi)_z\, (\bar{\psi}\psi)_y\, (\bar{\psi}\psi)_w\, \ldots \\
&= \ldots\, S_F(x-z)\, S_F(z-y)\, S_F(y-w)\, \ldots ,
\end{aligned}
\tag{5.186}
$$

with $S_F(x-y)$ given in (5.169). Notice that in the case of the simplest closed fermion loop in the Yukawa theory the latter prescription leads to

$$
\begin{aligned}
(\bar{\psi}\psi)_x\, (\bar{\psi}\psi)_y &= (-1)\; \mathrm{tr}\big[ \psi_y \bar{\psi}_x\; \psi_x \bar{\psi}_y \big] \\
&= (-1)\; \mathrm{tr}\big[ S_F(y-x)\, S_F(x-y) \big] ,
\end{aligned}
\tag{5.187}
$$

where all fields are fully contracted. Due to the cyclic property of the trace, changing the ordering of $S_F(y-x)$ and $S_F(x-y)$ of course gives the same result. The result (5.187) extends straightforwardly to all closed fermion lines. A fermion loop hence always gives a factor of $-1$ and the trace of the product of fermion propagators that make up the loop.

Equipped with the Feynman rules for the Yukawa theory, we can now calculate the cross sections for some simple scattering processes. This is quite a good exercise. You should try it!

5.6 Problems

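i) Show explicitly that the Weyl representation (5.10) satisfies the Clifford algebra (5.7). Derive the properties (5.15) and (5.16) of the matrices $S^{\mu\nu}$ introduced in (5.12).

Show that the terms $\bar{\psi}\psi$, $\bar{\psi}\gamma^\mu\psi$, and $\bar{\psi}\sigma^{\mu\nu}\psi$ transform as in (5.30), (5.31), and (5.32), i.e., they are a Lorentz scalar, vector, and tensor, respectively. It might be advantageous to look at infinitesimal transformations and consider separately the transformation properties of $\psi$, $\bar{\psi}$, and $\gamma^\mu$ under the action of (5.6).

Calculate the transformation properties of (5.35) and (5.36) under Lorentz transformations. You should find that these EOMs are form invariant.

Verify that the fifth Dirac matrix $\gamma^5$ defined as in (5.40) satisfies (5.41) and (5.42). Prove that the chiral projectors $P_{L,R}$ introduced in (5.43) obey the relations (5.44).

ii) Prove the Gordon identity

$$
\bar{u}(p')\, \gamma^\mu\, u(p) = \bar{u}(p') \left[ \frac{(p' + p)^\mu}{2m} + \frac{i\sigma^{\mu\nu} q_\nu}{2m} \right] u(p) . \tag{5.188}
$$

Here $q_\mu = (p' - p)_\mu$.

iii) Use (5.7) to show that the following identities involving contractions of 4-dimensional Dirac matrices are correct:

$$
\begin{aligned}
\gamma^\mu \gamma_\mu &= 4 , \\
\gamma^\mu \gamma^\nu \gamma_\mu &= -2\gamma^\nu , \\
\gamma^\mu \gamma^\nu \gamma^\kappa \gamma_\mu &= 4\eta^{\nu\kappa} , \\
\gamma^\mu \gamma^\nu \gamma^\kappa \gamma^\lambda \gamma_\mu &= -2\gamma^\lambda \gamma^\kappa \gamma^\nu .
\end{aligned}
\tag{5.189}
$$

Employ the anticommutation relation (5.7) in combination with the cyclic property of the trace to prove the following identities:

$$
\begin{aligned}
\mathrm{tr}\,(1) &= 4 , \\
\mathrm{tr}\,(\text{any odd number of Dirac matrices}) &= 0 , \\
\mathrm{tr}\,(\gamma^\mu \gamma^\nu) &= 4\eta^{\mu\nu} , \\
\mathrm{tr}\,(\gamma^\mu \gamma^\nu \gamma^\kappa \gamma^\lambda) &= 4 \left( \eta^{\mu\nu} \eta^{\kappa\lambda} - \eta^{\mu\kappa} \eta^{\nu\lambda} + \eta^{\mu\lambda} \eta^{\nu\kappa} \right) , \\
\mathrm{tr}\,(\gamma^5) &= 0 , \\
\mathrm{tr}\,(\gamma^\mu \gamma^\nu \gamma^5) &= 0 , \\
\mathrm{tr}\,(\gamma^\mu \gamma^\nu \gamma^\kappa \gamma^\lambda \gamma^5) &= -4i\, \epsilon^{\mu\nu\kappa\lambda} .
\end{aligned}
\tag{5.190}
$$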
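Before attempting the algebraic proofs, it can be reassuring to confirm several of the identities in (5.189) and (5.190) numerically. The sketch below does this for an explicit representation. It is a sanity check and not a proof, and it rests on assumptions: the chiral representation of the gamma matrices, the metric $\eta = \mathrm{diag}(1,-1,-1,-1)$, and the definition $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$. The last identity in (5.190) is left out because its sign depends on the convention chosen for $\epsilon^{0123}$.

```python
import itertools
import numpy as np

# Chiral representation and metric (assumptions of this sketch).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
G = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

# Lower-index matrices gamma_mu = eta_{mu nu} gamma^nu and gamma^5 (assumed definition).
G_LOW = [ETA[mu, mu] * G[mu] for mu in range(4)]
G5 = 1j * G[0] @ G[1] @ G[2] @ G[3]
idx = range(4)

checks = {
    "clifford": all(np.allclose(G[m] @ G[n] + G[n] @ G[m], 2 * ETA[m, n] * I4)
                    for m, n in itertools.product(idx, repeat=2)),
    "contract_0": np.allclose(sum(G[m] @ G_LOW[m] for m in idx), 4 * I4),
    "contract_1": all(np.allclose(sum(G[m] @ G[n] @ G_LOW[m] for m in idx), -2 * G[n])
                      for n in idx),
    "contract_2": all(np.allclose(sum(G[m] @ G[n] @ G[k] @ G_LOW[m] for m in idx),
                                  4 * ETA[n, k] * I4)
                      for n, k in itertools.product(idx, repeat=2)),
    "contract_3": all(np.allclose(sum(G[m] @ G[n] @ G[k] @ G[l] @ G_LOW[m] for m in idx),
                                  -2 * G[l] @ G[k] @ G[n])
                      for n, k, l in itertools.product(idx, repeat=3)),
    "trace_2": all(np.isclose(np.trace(G[m] @ G[n]), 4 * ETA[m, n])
                   for m, n in itertools.product(idx, repeat=2)),
    "trace_3": all(np.isclose(np.trace(G[m] @ G[n] @ G[k]), 0)
                   for m, n, k in itertools.product(idx, repeat=3)),
    "trace_4": all(np.isclose(np.trace(G[m] @ G[n] @ G[k] @ G[l]),
                              4 * (ETA[m, n] * ETA[k, l] - ETA[m, k] * ETA[n, l]
                                   + ETA[m, l] * ETA[n, k]))
                   for m, n, k, l in itertools.product(idx, repeat=4)),
    "trace_g5": np.isclose(np.trace(G5), 0),
    "trace_2_g5": all(np.isclose(np.trace(G[m] @ G[n] @ G5), 0)
                      for m, n in itertools.product(idx, repeat=2)),
}
for name, ok in checks.items():
    print(name, ok)
```

Since the identities follow from the Clifford algebra alone, the same checks should pass in any other representation of the Dirac matrices.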

iv) Products of Dirac bi-linears obey relations known as Fierz identities. The simplest of these formulas reads

$$
(\bar{u}_1 \gamma^\mu P_L u_2)\, (\bar{u}_3 \gamma_\mu P_L u_4) = -(\bar{u}_1 \gamma^\mu P_L u_4)\, (\bar{u}_3 \gamma_\mu P_L u_2) , \tag{5.191}
$$

where $u_i$ with $i = 1, \ldots, 4$ are 4-component Dirac spinors (the momentum dependence has been dropped here for simplicity) and $P_L$ is the left-handed projector introduced in (5.43). In fact, there are similar rearrangement formulas for any product

$$
(\bar{u}_1 \Gamma^A u_2)\, (\bar{u}_3 \Gamma^B u_4) . \tag{5.192}
$$

Here $\Gamma^A$ and $\Gamma^B$ are any of the 16 combinations of Dirac matrices listed in (5.72). The goal of this exercise is to derive these Fierz identities. To begin with, normalize the 16 matrices $\Gamma^A$ such that

$$
\mathrm{tr}\,\big( \Gamma^A \Gamma^B \big) = 4\delta^{AB} . \tag{5.193}
$$

This gives $\Gamma^A = \{1, \gamma^0, i\gamma^j, \ldots\}$. Write down all elements of this set. The general form of the Fierz identity is

$$
(\bar{u}_1 \Gamma^A u_2)\, (\bar{u}_3 \Gamma^B u_4) = \sum_{M,N} C^{AB}_{MN}\, (\bar{u}_1 \Gamma^M u_4)\, (\bar{u}_3 \Gamma^N u_2) , \tag{5.194}
$$

with unknown coefficients $C^{AB}_{MN}$. Using the completeness of the set $\Gamma^A$, show that

$$
C^{AB}_{MN} = \frac{1}{16}\, \mathrm{tr}\,\big( \Gamma^M \Gamma^A \Gamma^N \Gamma^B \big) . \tag{5.195}
$$

Employing (5.194) and (5.195), prove (5.191). In addition, work out the explicit Fierz transformation of the product $(\bar{u}_1 u_2)(\bar{u}_3 u_4)$.

v) In Section 4.8 we saw that the Yukawa potential for $NN \to NN$ scattering is attractive. Repeat the calculation for $\psi\psi \to \psi\psi$, $\psi\bar{\psi} \to \psi\bar{\psi}$, and $\bar{\psi}\bar{\psi} \to \bar{\psi}\bar{\psi}$ scattering in the non-relativistic limit. You might want to use (5.120) to (5.122).

If you understood how to calculate the Yukawa potential, the derivation of the Coulomb potential, which encodes the interactions of electrons/positrons and the photon field $A_\mu$ in the non-relativistic limit, is also not difficult. Consider again the three different cases of particle-particle, particle-antiparticle, and antiparticle-antiparticle scattering. The QED Feynman rules for the photon ($\gamma$) propagator with momentum $p$ and Lorentz indices $\mu$, $\nu$, and for the vertex between the photon and the electron ($Q = -1$) or positron ($Q = 1$), are given by

$$
\text{photon propagator:} \quad \frac{-i g_{\mu\nu}}{p^2 + i\epsilon} , \qquad\qquad \text{photon-fermion vertex:} \quad -iQe\gamma_\mu .
$$

The propagator of a tensor boson with momentum $p$ and Lorentz indices $\mu\nu$, $\rho\sigma$, such as the graviton ($G$), i.e., the force carrier of gravity, looks like

$$
\frac{1}{2}\, \Big[ (-g_{\mu\rho})(-g_{\nu\sigma}) + (-g_{\mu\sigma})(-g_{\nu\rho}) \Big]\, \frac{i}{p^2 + i\epsilon} .
$$

Can you derive from this result the orientation of the gravitational force?

vi) The Furry theorem states that the sum of all Feynman graphs in QED with an odd number of external photons (off or on the photon mass shell) and no other external lines vanishes. In order to prove Furry's theorem, consider ($n = 0, 1, \ldots$)

$$
\langle \Omega|\, T\, j_V^{\mu_1}(x_1) \ldots j_V^{\mu_{2n+1}}(x_{2n+1})\, |\Omega\rangle , \tag{5.196}
$$

and show by invoking symmetry arguments that a matrix element of this form vanishes. Here $|\Omega\rangle$ denotes the true vacuum of the interacting theory and $j_V^\mu(x)$ is the vector current introduced in (5.96).

vii) The goal of this exercise is to introduce the spinor method and to derive some identities that will be very useful for calculating scattering amplitudes in the high-energy limit, where the involved particles can be treated as massless. Derive an explicit solution of the Dirac equation

\[
\slashed{p}\, u(p) = 0 ,
\tag{5.197}
\]

for a massless fermion. To do so, write out $\slashed{p}$ using the basis

\[
\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad
\gamma^i = \begin{pmatrix} 0 & -\sigma^i \\ \sigma^i & 0 \end{pmatrix} , \qquad
\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} ,
\tag{5.198}
\]

of Dirac matrices. To keep the notation compact you might want to introduce

\[
p^\pm = p^0 \pm p^3 , \qquad
e^{\pm i \varphi_p} = \frac{p^1 \pm i p^2}{\sqrt{(p^1)^2 + (p^2)^2}} .
\tag{5.199}
\]
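A numerical cross-check can be handy while working through this part. The following Python sketch (assuming NumPy is available) builds the Dirac matrices in the basis (5.198), verifies the Clifford algebra and the form of $\gamma_5$, and tests one candidate pair of massless spinors against (5.197). The function names and, in particular, the phase convention chosen for `u_plus` and `u_minus` are assumptions made here for illustration; your own solution may differ by momentum-dependent phases.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# gamma^0, gamma^1, gamma^2, gamma^3 and gamma_5 in the basis (5.198)
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
GAMMA5 = np.diag([1, 1, -1, -1]).astype(complex)
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus signature


def slash(p):
    """gamma^mu p_mu for contravariant components p = (p^0, p^1, p^2, p^3)."""
    return sum(METRIC[m, m] * p[m] * GAMMA[m] for m in range(4))


def massless_momentum(energy, theta, phi):
    """Light-like momentum; theta = 0 or pi is excluded (azimuth undefined)."""
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def u_plus(p):
    """Candidate solution with P_R u = u (assumed phase convention)."""
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])  # e^{+i phi_p} of (5.199)
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    """Candidate solution with P_L u = u (assumed phase convention)."""
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


p = massless_momentum(3.0, 0.7, 1.9)

clifford_ok = all(
    np.allclose(GAMMA[m] @ GAMMA[n] + GAMMA[n] @ GAMMA[m], 2 * METRIC[m, n] * np.eye(4))
    for m in range(4) for n in range(4))
gamma5_ok = np.allclose(GAMMA5, 1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3])
dirac_ok = all(np.allclose(slash(p) @ u(p), 0) for u in (u_plus, u_minus))
norms = [np.vdot(u(p), u(p)).real for u in (u_plus, u_minus)]  # u^dagger u

print("Clifford algebra holds:   ", clifford_ok)
print("gamma_5 = i g0 g1 g2 g3:  ", gamma5_ok)
print("pslash u = 0 for u_+, u_-:", dirac_ok)
print("u^dagger u:", norms, " 2 p^0:", 2 * p[0])
```

Momenta along the third axis are excluded in this sketch because the azimuthal phase in (5.199) is not defined there.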

Use the projection operators $P_{L,R}$ in the basis (5.198) to decompose the original solution into two helicity solutions

\[
u_\pm(p) = P_{R,L}\, u(p) .
\tag{5.200}
\]

Give the explicit form of $u_\pm$ and $\bar u_\pm$. Show that $u_\pm^\dagger u_\pm = 2p^0$, which fixes the normalization of the spinors. Relate $u_+$ ($\bar u_+$) to $(\bar u_-)^T$ ($(u_-)^T$) using $\gamma^0$ and $\gamma^2$. What is the physics behind these relations? So far we have only talked about the positive-energy solutions $u$. How do the negative-energy solutions $v$ fit into the picture? In particular, how are $u_\pm$ ($\bar u_\pm$) and $v_\pm$ ($\bar v_\pm$) related in the case of a massless fermion?

Consider now a set of massless momenta $p_i$ with $i = 1, 2, \ldots, n$. We introduce a bra and ket notation with the spinor labelled by the index $i$ corresponding to the momentum $p_i$,

\[
|i^\pm\rangle = u_\pm(p_i) , \qquad \langle i^\pm| = \bar u_\pm(p_i) .
\tag{5.201}
\]

The basic spinor products are defined as

\[
\langle ij \rangle = \langle i^- | j^+ \rangle , \qquad [ij] = \langle i^+ | j^- \rangle .
\tag{5.202}
\]

What happens to $\langle i^\pm | j^\pm \rangle$? Show the antisymmetry of the spinor products, i.e.,

\[
\langle ij \rangle = -\langle ji \rangle , \qquad [ij] = -[ji] ,
\tag{5.203}
\]

by using either the explicit expressions for $u_\pm$ and $\bar u_\pm$ you have derived earlier or the charge conjugation properties of the spinors. For the case when both energies are positive, i.e., $p_i^0 > 0$ and $p_j^0 > 0$, derive analytic expressions for the spinor products (5.202). Express your result through

\[
s_{ij} = (p_i + p_j)^2 = 2\, p_i \cdot p_j ,
\tag{5.204}
\]

and

\[
\cos\phi_{ij} = \frac{p_i^1 p_j^+ - p_j^1 p_i^+}{\sqrt{|s_{ij}|\, p_i^+ p_j^+}} , \qquad
\sin\phi_{ij} = \frac{p_i^2 p_j^+ - p_j^2 p_i^+}{\sqrt{|s_{ij}|\, p_i^+ p_j^+}} .
\tag{5.205}
\]
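Your analytic expressions can be compared against a direct numerical evaluation of the spinor products. The Python sketch below (assuming NumPy) evaluates $\langle ij \rangle$ and $[ij]$ from explicit spinors and compares $\langle ij \rangle$ with $\sqrt{|s_{ij}|}\,(\cos\phi_{ij} + i \sin\phi_{ij})$ built from (5.204) and (5.205). The spinor phase convention and all function names are assumptions of this sketch; with a different phase choice the modulus stays the same but the phase of the products can change.

```python
import numpy as np

# gamma^0 in the basis (5.198), written out as a 4x4 matrix
GAMMA0 = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]], dtype=complex)


def massless_momentum(energy, theta, phi):
    """Light-like momentum with positive energy; theta must not be 0 or pi."""
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def u_plus(p):
    """Right-handed massless spinor (assumed phase convention)."""
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    """Left-handed massless spinor (assumed phase convention)."""
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


def bar(u):
    """Dirac adjoint u^dagger gamma^0 as a row vector."""
    return u.conj() @ GAMMA0


def angle(p_i, p_j):
    """<ij> = ubar_-(p_i) u_+(p_j), cf. (5.202)."""
    return bar(u_minus(p_i)) @ u_plus(p_j)


def square(p_i, p_j):
    """[ij] = ubar_+(p_i) u_-(p_j), cf. (5.202)."""
    return bar(u_plus(p_i)) @ u_minus(p_j)


def s_inv(p_i, p_j):
    """s_ij = 2 p_i . p_j for massless momenta, cf. (5.204)."""
    return 2 * (p_i[0] * p_j[0] - p_i[1:] @ p_j[1:])


def phase_5205(p_i, p_j):
    """cos(phi_ij) + i sin(phi_ij) assembled from (5.205)."""
    pip, pjp = p_i[0] + p_i[3], p_j[0] + p_j[3]
    norm = np.sqrt(abs(s_inv(p_i, p_j)) * pip * pjp)
    return ((p_i[1] * pjp - p_j[1] * pip) + 1j * (p_i[2] * pjp - p_j[2] * pip)) / norm


p1 = massless_momentum(2.0, 0.6, 0.4)
p2 = massless_momentum(5.0, 2.1, 3.3)

print("<12>                =", angle(p1, p2))
print("sqrt(s12) e^{i phi} =", np.sqrt(s_inv(p1, p2)) * phase_5205(p1, p2))
print("[21]                =", square(p2, p1))
print("<12>[21], s12       =", angle(p1, p2) * square(p2, p1), s_inv(p1, p2))
```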

So what is the connection between spinor products and Lorentz products of momenta? Use your explicit result to show that the two types of spinor products are related by complex conjugation,

\[
\langle ij \rangle^* = [ji] .
\tag{5.206}
\]

Since spinor products should have simple properties under crossing symmetry, one defines the spinor product $\langle ij \rangle$ for negative energies by analytic continuation from the positive-energy case, but with $p_{i,j}$ replaced by $-p_{i,j}$ if $p^0_{i,j} < 0$. The spinor product $[ij]$ is then defined through the identity

\[
\langle ij \rangle [ji] = \mathrm{tr} \left( P_L\, \slashed{p}_i\, \slashed{p}_j \right) = s_{ij} .
\tag{5.207}
\]

Consider now the spinor string

\[
[i|\gamma^\mu|j\rangle = \bar u_+(p_i)\, \gamma^\mu\, u_+(p_j) ,
\tag{5.208}
\]

a quantity that naturally appears as the current describing the emission of a vector boson from a right-handed massless fermion line. Notice that the helicity labels on the spinors can always be suppressed in favor of angle or square brackets as in the spinor products. So one has $\langle i| = \langle i^-|$, $[i| = \langle i^+|$, $|i\rangle = |i^+\rangle$, and $|i] = |i^-\rangle$. Show the charge conjugation property of the current

\[
[i|\gamma^\mu|j\rangle = \langle j|\gamma^\mu|i] .
\tag{5.209}
\]

Prove that

\[
|i\rangle [i| = P_R\, \slashed{p}_i , \qquad |i] \langle i| = P_L\, \slashed{p}_i ,
\tag{5.210}
\]

and use these projection operators to show the correctness of the Gordon identity

\[
[i|\gamma^\mu|i\rangle = \langle i|\gamma^\mu|i] = 2\, p_i^\mu .
\tag{5.211}
\]
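Before attempting the proofs, it may help to see (5.209) to (5.211) hold numerically for a concrete pair of momenta. The Python sketch below (assuming NumPy) represents $|i\rangle[i|$ and $|i]\langle i|$ as outer products of explicit spinors. The spinor phase convention and the helper names are assumptions of this sketch, chosen such that $u_-$ is the charge conjugate of $u_+$; (5.209) is sensitive to that choice.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
GAMMA5 = np.diag([1, 1, -1, -1]).astype(complex)
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])
P_R = 0.5 * (np.eye(4) + GAMMA5)
P_L = 0.5 * (np.eye(4) - GAMMA5)


def slash(p):
    """gamma^mu p_mu for contravariant components of p."""
    return sum(METRIC[m, m] * p[m] * GAMMA[m] for m in range(4))


def massless_momentum(energy, theta, phi):
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def u_plus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


def bar(u):
    return u.conj() @ GAMMA[0]


def current(ubar, u):
    """Components ubar gamma^mu u for mu = 0..3 (upper Lorentz index)."""
    return np.array([ubar @ g @ u for g in GAMMA])


p_i = massless_momentum(2.0, 0.6, 0.4)
p_j = massless_momentum(5.0, 2.1, 3.3)

# (5.210): |i>[i| and |i]<i| as 4x4 matrices
proj_R_ok = np.allclose(np.outer(u_plus(p_i), bar(u_plus(p_i))), P_R @ slash(p_i))
proj_L_ok = np.allclose(np.outer(u_minus(p_i), bar(u_minus(p_i))), P_L @ slash(p_i))
# (5.211): Gordon identity for both helicities
gordon_ok = (np.allclose(current(bar(u_plus(p_i)), u_plus(p_i)), 2 * p_i)
             and np.allclose(current(bar(u_minus(p_i)), u_minus(p_i)), 2 * p_i))
# (5.209): [i|gamma^mu|j> = <j|gamma^mu|i]
conj_ok = np.allclose(current(bar(u_plus(p_i)), u_plus(p_j)),
                      current(bar(u_minus(p_j)), u_minus(p_i)))

print("(5.210):", proj_R_ok, proj_L_ok)
print("(5.211):", gordon_ok)
print("(5.209):", conj_ok)
```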

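Show by the use of (5.208) that

\[
|i\rangle \langle j| - |j\rangle \langle i| = \langle ji \rangle\, P_R ,
\tag{5.212}
\]

holds and derive from this relation the Schouten identity

\[
\langle ij \rangle \langle kl \rangle = \langle ik \rangle \langle jl \rangle + \langle il \rangle \langle kj \rangle .
\tag{5.213}
\]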

The same identity applies when angle brackets are replaced by square brackets. In explicit calculations (5.208) is a powerful tool, since its application can lead to enormous algebraic simplifications. Use the Fierz transformation

\[
(\gamma^\mu P_R)_{ij}\, (\gamma_\mu P_L)_{kl} = 2\, (P_L)_{il}\, (P_R)_{kj} ,
\tag{5.214}
\]

to show the simple relation

\[
\langle i|\gamma^\mu|j]\, [k|\gamma_\mu|l\rangle = 2\, \langle il \rangle [kj] ,
\tag{5.215}
\]

as well as the Fierz identity

\[
\gamma^\mu\, [i|\gamma_\mu|j\rangle = 2 \left( |i] \langle j| + |j\rangle [i| \right) .
\tag{5.216}
\]
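The identities (5.212), (5.213), and (5.216) can be spot-checked numerically for random-looking massless momenta, as in the Python sketch below (assuming NumPy). The spinor phase convention and helper names are assumptions of this sketch. For the contraction $\langle i|\gamma^\mu|j]\,[k|\gamma_\mu|l\rangle$ the sketch compares against $2\langle il \rangle [kj]$, which is what one obtains by sandwiching (5.216) between $\langle i|$ and $|j]$ with the definitions (5.202); note the ordering of the labels inside the square bracket.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
SIGN = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric
P_R = np.diag([1, 1, 0, 0]).astype(complex)


def massless_momentum(energy, theta, phi):
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def u_plus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


def bar(u):
    return u.conj() @ GAMMA[0]


def angle(p_i, p_j):
    return bar(u_minus(p_i)) @ u_plus(p_j)


def square(p_i, p_j):
    return bar(u_plus(p_i)) @ u_minus(p_j)


def current(ubar, u):
    """Upper-index components ubar gamma^mu u."""
    return np.array([ubar @ g @ u for g in GAMMA])


pi_, pj_, pk_, pl_ = (massless_momentum(2.0, 0.6, 0.4), massless_momentum(5.0, 2.1, 3.3),
                      massless_momentum(1.5, 1.2, 5.0), massless_momentum(3.2, 2.7, 1.1))

# (5.212): |i><j| - |j><i| = <ji> P_R
lhs_212 = (np.outer(u_plus(pi_), bar(u_minus(pj_)))
           - np.outer(u_plus(pj_), bar(u_minus(pi_))))
ok_212 = np.allclose(lhs_212, angle(pj_, pi_) * P_R)

# (5.213): Schouten identity
ok_213 = np.isclose(angle(pi_, pj_) * angle(pk_, pl_),
                    angle(pi_, pk_) * angle(pj_, pl_) + angle(pi_, pl_) * angle(pk_, pj_))

# Contraction <i|gamma^mu|j] [k|gamma_mu|l> compared with 2 <il> [kj]
a = current(bar(u_minus(pi_)), u_minus(pj_))
b = current(bar(u_plus(pk_)), u_plus(pl_))
contraction = sum(SIGN[m] * a[m] * b[m] for m in range(4))
ok_fierz = np.isclose(contraction, 2 * angle(pi_, pl_) * square(pk_, pj_))

# (5.216): gamma^mu [i|gamma_mu|j> = 2 (|i]<j| + |j>[i|)
c = current(bar(u_plus(pi_)), u_plus(pj_))
lhs_216 = sum(SIGN[m] * c[m] * GAMMA[m] for m in range(4))
rhs_216 = 2 * (np.outer(u_minus(pi_), bar(u_minus(pj_)))
               + np.outer(u_plus(pj_), bar(u_plus(pi_))))
ok_216 = np.allclose(lhs_216, rhs_216)

print("(5.212):", ok_212, " (5.213):", ok_213)
print("vector-current contraction:", ok_fierz, " (5.216):", ok_216)
```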

A relation similar to (5.216) holds for $\gamma^\mu \langle i|\gamma_\mu|j]$.

viii) The goal of this exercise is to calculate the squared tree-level matrix elements of the processes $d\bar u \to e^- \bar\nu_e$ and $d\bar u \to e^- \bar\nu_e\, g$ using the spinor formalism developed above. The first process, $d\bar u \to e^- \bar\nu_e$, describes the production of a massive $W^-$ boson from the collision of a down ($d$) and an anti-up quark ($\bar u$) and the subsequent decay of the $W^-$ boson into an electron ($e^-$) and an electron antineutrino ($\bar\nu_e$). Draw the relevant Feynman diagram and write down the corresponding amplitude in spinor notation using the Feynman rules:

\[
W \text{ propagator } (\mu \to \nu,\ \text{momentum } p): \quad \frac{-i g_{\mu\nu}}{p^2 - M_W^2 + i\epsilon} ,
\]
\[
\bar u\, d\, W \text{ vertex (index } \mu): \quad -i\, \frac{g_w}{\sqrt{2}}\, V_{ud}\, \gamma_\mu P_L ,
\qquad
\bar\nu_e\, e^-\, W \text{ vertex (index } \mu): \quad -i\, \frac{g_w}{\sqrt{2}}\, \gamma_\mu P_L .
\]

Here $M_W$ denotes the mass of the $W^-$ boson, $g_w$ is the weak gauge coupling, and $V_{ud} \approx 1$ is the complex $(1,1)$ element of the Cabibbo-Kobayashi-Maskawa (CKM) matrix, which describes quark mixing in the SM. Notice that the $W^-$ boson only couples to the left-handed component of the quark and lepton fields. The deeper significance of this property will become clear once you learn more about the SM of particle physics. Simplify your result for the amplitude using the charge conjugation property of the current (5.209) and the Fierz identity (5.215). Calculate the squared matrix element and express your result in terms of scalar products of momenta.

The second process is similar to the first one, but more complicated, since it involves the emission of an additional gluon ($g$) from one of the external quark legs. Draw the possible Feynman graphs for $d\bar u \to e^- \bar\nu_e\, g$ at tree level and write down the amplitude. In addition to the Feynman rules given already you will need:

\[
q\, q\, g \text{ vertex (index } \mu,\ \text{color } a): \quad -i\, g_s\, T^a \gamma_\mu ,
\]
\[
\text{external gluon with momentum } p: \quad \epsilon_\mu(p) \ \ \text{(initial state)} , \qquad \epsilon^*_\mu(p) \ \ \text{(final state)} .
\]

Here $g_s$ is the coupling constant of QCD and $T^a$ with $a = 1, \ldots, 8$ are the generators of the associated gauge group, i.e., $SU(3)_c$. The symbol $\epsilon_\mu(p)$ stands for the polarization vector of the initial- or final-state gluon. In order to calculate the squared matrix element for $d\bar u \to e^- \bar\nu_e\, g$, we also need to introduce a spinor representation for the polarization vector for gluons with definite helicity $a = \pm$,

\[
\epsilon^+_\mu(p, \xi) = \frac{\langle p|\gamma_\mu|\xi]}{\sqrt{2}\, \langle \xi p \rangle} , \qquad
\epsilon^-_\mu(p, \xi) = -\frac{[p|\gamma_\mu|\xi\rangle}{\sqrt{2}\, [\xi p]} ,
\tag{5.217}
\]

where $p$ is the gluon momentum and $\xi$ is an auxiliary massless vector, called the reference momentum, reflecting the freedom of on-shell gauge transformations. The objects introduced in (5.217) have the following properties. Since $\slashed{p}\, |p^\pm\rangle = 0$, the polarization vector $\epsilon^\pm(p, \xi)$ is transverse to $p$, i.e.,

\[
\epsilon^\pm(p, \xi) \cdot p = 0 ,
\tag{5.218}
\]

for any choice of $\xi$ with $p \cdot \xi \neq 0$. Complex conjugation acts on the polarization vectors like

\[
\left( \epsilon^\pm_\mu(p, \xi) \right)^* = \epsilon^\mp_\mu(p, \xi) ,
\tag{5.219}
\]

and they are normalized as follows

\[
\epsilon^\pm(p, \xi) \cdot \left( \epsilon^\pm(p, \xi) \right)^* = -1 , \qquad
\epsilon^\pm(p, \xi) \cdot \left( \epsilon^\mp(p, \xi) \right)^* = 0 .
\tag{5.220}
\]

They also fulfill a completeness relation, which reads

\[
\sum_{a = \pm} \epsilon^a_\mu(p, \xi) \left( \epsilon^a_\nu(p, \xi) \right)^* = -\eta_{\mu\nu} + \frac{p_\mu \xi_\nu + \xi_\mu p_\nu}{p \cdot \xi} .
\tag{5.221}
\]
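The properties (5.218) to (5.221) can be confirmed numerically for one choice of gluon and reference momentum. The Python sketch below (assuming NumPy) builds the polarization vectors of (5.217) from explicit spinors; the spinor phase convention, the sample momenta, and all helper names are assumptions of this sketch. Lorentz indices are kept upstairs throughout, and contractions use the mostly-minus metric.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def massless_momentum(energy, theta, phi):
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def u_plus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


def bar(u):
    return u.conj() @ GAMMA[0]


def mdot(a, b):
    """Minkowski product a . b (no complex conjugation)."""
    return a[0] * b[0] - a[1:] @ b[1:]


def eps_plus(p, xi):
    """epsilon^+ of (5.217): <p|gamma^mu|xi] / (sqrt(2) <xi p>)."""
    num = np.array([bar(u_minus(p)) @ g @ u_minus(xi) for g in GAMMA])
    return num / (np.sqrt(2) * (bar(u_minus(xi)) @ u_plus(p)))


def eps_minus(p, xi):
    """epsilon^- of (5.217): -[p|gamma^mu|xi> / (sqrt(2) [xi p])."""
    num = np.array([bar(u_plus(p)) @ g @ u_plus(xi) for g in GAMMA])
    return -num / (np.sqrt(2) * (bar(u_plus(xi)) @ u_minus(p)))


p = massless_momentum(4.0, 1.1, 0.3)    # gluon momentum (sample value)
xi = massless_momentum(1.0, 2.4, 4.2)   # reference momentum (sample value)
ep, em = eps_plus(p, xi), eps_minus(p, xi)

transverse_ok = np.isclose(mdot(ep, p), 0) and np.isclose(mdot(em, p), 0)
conjugation_ok = np.allclose(ep.conj(), em)
norm_ok = (np.isclose(mdot(ep, ep.conj()), -1) and np.isclose(mdot(em, em.conj()), -1)
           and np.isclose(mdot(ep, em.conj()), 0))
pol_sum = np.outer(ep, ep.conj()) + np.outer(em, em.conj())
completeness_ok = np.allclose(
    pol_sum, -METRIC + (np.outer(p, xi) + np.outer(xi, p)) / mdot(p, xi))

print("(5.218):", transverse_ok, " (5.219):", conjugation_ok)
print("(5.220):", norm_ok, " (5.221):", completeness_ok)
```

Changing `xi` to another massless vector with $p \cdot \xi \neq 0$ shifts the polarization vectors by a multiple of $p$, which is one way to see the gauge freedom mentioned above.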

Equipped with the definition and the properties of the polarization vectors you can now actually calculate the matrix element for $W^- + g$ production. Consider the case of the emission of a positive and negative helicity gluon separately and keep the gauge vector $\xi$ arbitrary. Use the charge conjugation, Schouten, and Fierz identities, (5.209), (5.213), and (5.215), as well as the projection operators (5.210), to reduce both amplitudes to combinations of basic spinor products (5.202). Also employ momentum conservation, $\sum_{i=1}^{n} p_i^\mu = 0$, which leads to the identity

\[
\sum_{i=1,\, i \neq j,k}^{n} [ji] \langle ik \rangle = 0 .
\tag{5.222}
\]
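The identity (5.222) is written for the convention in which all $n$ momenta sum to zero, so some of them carry negative energy and their spinors are defined by analytic continuation. A simpler check, sketched below in Python (assuming NumPy), keeps all energies positive for a $2 \to 2$ configuration and moves the outgoing momenta to the other side: the sum of $[ji]\langle ik \rangle$ over the incoming momenta should then equal the sum over the outgoing ones. The kinematic point, the spinor phase convention, and the helper names are assumptions of this sketch.

```python
import numpy as np

GAMMA0 = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]], dtype=complex)


def massless_momentum(energy, theta, phi):
    return energy * np.array([1.0, np.sin(theta) * np.cos(phi),
                              np.sin(theta) * np.sin(phi), np.cos(theta)])


def back_to_back(energy, theta, phi):
    """Pair of massless momenta with opposite three-momenta (centre-of-mass frame)."""
    return [massless_momentum(energy, theta, phi),
            massless_momentum(energy, np.pi - theta, phi + np.pi)]


def u_plus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([np.sqrt(p[0] + p[3]), np.sqrt(p[0] - p[3]) * phase, 0, 0],
                    dtype=complex)


def u_minus(p):
    phase = (p[1] + 1j * p[2]) / np.hypot(p[1], p[2])
    return np.array([0, 0, np.sqrt(p[0] - p[3]) * np.conj(phase), -np.sqrt(p[0] + p[3])],
                    dtype=complex)


def bar(u):
    return u.conj() @ GAMMA0


def angle(p_i, p_j):
    return bar(u_minus(p_i)) @ u_plus(p_j)


def square(p_i, p_j):
    return bar(u_plus(p_i)) @ u_minus(p_j)


incoming = back_to_back(10.0, 0.9, 0.2)
outgoing = back_to_back(10.0, 2.0, 4.1)
momentum_conserved = np.allclose(sum(incoming), sum(outgoing))

# Pick p_j from the initial state and p_k from the final state.
# The terms with i = j or i = k drop out automatically because [jj] = <kk> = 0.
p_j, p_k = incoming[0], outgoing[1]
sum_in = sum(square(p_j, p_i) * angle(p_i, p_k) for p_i in incoming)
sum_out = sum(square(p_j, p_i) * angle(p_i, p_k) for p_i in outgoing)

print("momentum conserved:", momentum_conserved)
print("sum over incoming :", sum_in)
print("sum over outgoing :", sum_out)
```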

It is important that your final result for the helicity amplitudes is independent of the choice of $\xi$. Could you have obtained your results far more simply by a specific choice of $\xi$? Square the $d\bar u \to e^- \bar\nu_e\, g$ amplitude and simplify your answer as much as possible. In the fundamental representation the generators $T^a$ fulfill $\mathrm{tr}\,(T^a T^b) = T_F\, \delta^{ab}$ with $T_F = 1/2$ and $T^a T^a = C_F$, where $C_F = (N_c^2 - 1)/(2N_c) = 4/3$ for $N_c = 3$, corresponding to the QCD gauge group $SU(3)_c$.

ix) Heavy quark effective theory (HQET) is an effective field theory designed to systematically exploit the simplifications of the interactions of QCD in the heavy-quark limit for the case of hadrons containing a single heavy quark, such as the $B$ and $D$ mesons. The first goal of this exercise is to derive the interactions of a heavy quark with the light degrees of freedom starting from the Lagrangian

\[
\mathcal{L} = \bar Q\, ( i \slashed{D} - m_Q )\, Q ,
\tag{5.223}
\]

where $Q$ is a Dirac spinor representing the heavy quark of mass $m_Q$ and

\[
D_\mu = \partial_\mu - i g_s\, T^a G^a_\mu ,
\tag{5.224}
\]

is the covariant derivative, which describes the minimal coupling of quarks to gluons. It depends on the QCD coupling constant $g_s$, the gluon fields $G^a_\mu$ with $a = 1, \ldots, 8$, and the generators $T^a$ of $SU(3)_c$. The obtained effective description will not only allow us to show that the Lagrangian (5.223) has a spin-flavor symmetry in the limit $m_Q \to \infty$, but also provide a systematic and rigorous way to obtain corrections to the infinite-mass limit. To warm up, solve the free Dirac equation for a heavy quark at rest. Use the decomposition

\[
Q(x) = e^{-i m_Q t}\, Q(0) ,
\tag{5.225}
\]

and plug it into (5.35). What do you observe? The heavy-quark momentum $p_\mu$ can always be decomposed as

\[
p_\mu = m_Q v_\mu + k_\mu ,
\tag{5.226}
\]

where $v_\mu$ is the 4-velocity of the heavy hadron. Once $m_Q v_\mu$, the large kinematical part of the momentum, is singled out, the remaining component $k_\mu$ is determined by soft QCD bound-state interactions, and thus $k^2 \ll m_Q^2$. In order to work in an arbitrary frame one defines

\[
P_\pm = \frac{1 \pm \slashed{v}}{2} ,
\tag{5.227}
\]

with $\slashed{v}\slashed{v} = v^2 = 1$. Show that $P_\pm$ are projection operators and find their explicit form in the rest frame. Remove the large-frequency part of the $x$-dependence in $Q(x)$ resulting from the large momentum $m_Q v_\mu$ by plugging

\[
\begin{aligned}
Q(x) &= e^{-i m_Q v \cdot x}\, \tilde Q(x) \\
     &= e^{-i m_Q v \cdot x} \left[ P_+ \tilde Q(x) + P_- \tilde Q(x) \right] \\
     &= e^{-i m_Q v \cdot x} \left[ h_v(x) + H_v(x) \right] ,
\end{aligned}
\tag{5.228}
\]

into the Lagrangian (5.223). Notice that (5.228) is the covariant generalization of decomposing $Q(x)$ into upper and lower components. Why? To decouple the simplified Dirac equation, multiply it by the projection operators and use

\[
P_\pm\, \slashed{a} = \slashed{a}_\perp P_\mp \pm (v \cdot a)\, P_\pm ,
\tag{5.229}
\]

where $a_\perp^\mu = a^\mu - (v \cdot a)\, v^\mu$ for any 4-vector $a^\mu$. From the two resulting equations derive a relation between $H_v(x)$ and $h_v(x)$ valid up to terms of $O(1/m_Q^2)$. Employ the relation between $H_v(x)$ and $h_v(x)$ to eliminate the field $H_v(x)$ from the system of equations. Using

\[
[ D_\perp^\mu , D_\perp^\nu ] = -i g_s\, G_\perp^{\mu\nu} ,
\tag{5.230}
\]

find the final form of the EOM of the heavy-quark field $h_v(x)$. In (5.230), we have introduced the QCD field strength tensor $G_{\mu\nu} = G^a_{\mu\nu} T^a$. In its explicit form the field strength is given by $G^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu + g_s f^{abc} G^b_\mu G^c_\nu$ with $[T^a, T^b] = i f^{abc} T^c$, where $f^{abc}$ are the fully antisymmetric structure constants.
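The algebra of the projectors (5.227) and the identity (5.229) can be tested numerically for a moving heavy quark. The Python sketch below (assuming NumPy) uses the Dirac matrices in the basis (5.198); the rapidity, the direction of motion, the test vector `a`, and all helper names are arbitrary choices made for this sketch. Note that in this chiral basis the rest-frame projectors $(1 \pm \gamma^0)/2$ are not diagonal, so the split into upper and lower components only becomes manifest after changing to a basis with diagonal $\gamma^0$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
SIGN = np.array([1.0, -1.0, -1.0, -1.0])  # diagonal of the metric


def slash(a):
    """gamma^mu a_mu for contravariant components of a."""
    return sum(SIGN[m] * a[m] * GAMMA[m] for m in range(4))


def mdot(a, b):
    """Minkowski product a . b."""
    return float(np.sum(SIGN * a * b))


# Four-velocity with v^2 = 1: rapidity eta along the unit vector n (sample values)
eta = 0.8
n = np.array([1.0, 2.0, -2.0]) / 3.0
v = np.concatenate(([np.cosh(eta)], np.sinh(eta) * n))

P_plus = 0.5 * (np.eye(4) + slash(v))
P_minus = 0.5 * (np.eye(4) - slash(v))

projectors_ok = (np.allclose(P_plus @ P_plus, P_plus)
                 and np.allclose(P_minus @ P_minus, P_minus)
                 and np.allclose(P_plus @ P_minus, 0)
                 and np.allclose(P_plus + P_minus, np.eye(4)))

# Identity (5.229) for an arbitrary four-vector a
a = np.array([0.3, -1.2, 0.7, 2.0])
va = mdot(v, a)
a_perp = a - va * v
identity_plus_ok = np.allclose(P_plus @ slash(a), slash(a_perp) @ P_minus + va * P_plus)
identity_minus_ok = np.allclose(P_minus @ slash(a), slash(a_perp) @ P_plus - va * P_minus)

print("v^2 =", mdot(v, v))
print("projector algebra:", projectors_ok)
print("(5.229), upper and lower sign:", identity_plus_ok, identity_minus_ok)
```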

Write down the Lagrangian that leads to the EOM for $h_v(x)$ including $O(1/m_Q)$ terms. Discuss the spin and flavor properties of the leading term and the power corrections in the $1/m_Q$ expansion. By going to the heavy-quark rest frame, determine the physical meaning of the two $O(1/m_Q)$ corrections. Explain the appearance of the spin and flavor symmetry (and its breaking) in physical terms. Compare your findings for heavy-light meson systems with the physics of the hydrogen atom. Point out similarities and differences. Derive the Feynman rules for the heavy-quark propagator once starting from the HQET Lagrangian and once by expanding the propagator of the free Dirac theory. Also give the Feynman rule for the interaction of the heavy quark with the gluon.

The masses of the vector and pseudoscalar $B$ and $D$ mesons are experimentally determined to be $M_{B^*} = 5.33$ GeV, $M_B = 5.28$ GeV and $M_{D^*} = 2.00$ GeV, $M_D = 1.86$ GeV, respectively. These numbers imply that

\[
M_{B^*}^2 - M_B^2 = 0.53\ \mathrm{GeV}^2 , \qquad
M_{D^*}^2 - M_D^2 = 0.54\ \mathrm{GeV}^2 ,
\tag{5.231}
\]

which suggests that the difference between the square of a heavy-light vector meson mass and the square of a heavy-light pseudoscalar meson mass is a constant. Can you explain this behavior qualitatively using the heavy-quark symmetries you have derived above?
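The arithmetic behind (5.231) is easy to reproduce. The short Python sketch below uses only the four masses quoted above (the dictionary layout and variable names are choices made here) and also prints the linear splittings $M_V - M_P$ for comparison, since these differ noticeably between the $B$ and $D$ systems while the differences of squares do not.

```python
# Meson masses in GeV as quoted in the text: (vector, pseudoscalar)
MASSES_GEV = {
    "B": (5.33, 5.28),
    "D": (2.00, 1.86),
}

splittings = {}
for system, (m_vector, m_pseudoscalar) in MASSES_GEV.items():
    linear = m_vector - m_pseudoscalar            # M_V - M_P in GeV
    squared = m_vector**2 - m_pseudoscalar**2     # M_V^2 - M_P^2 in GeV^2
    splittings[system] = (linear, squared)
    print(f"{system}: M_V - M_P = {linear:.2f} GeV, "
          f"M_V^2 - M_P^2 = {squared:.2f} GeV^2")
```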

References

[1] S. P. Martin, arXiv:hep-ph/9709356.
[2] O. W. Greenberg, Phys. Rev. Lett. 89, 231602 (2002) [arXiv:hep-ph/0201258].
[3] V. A. Kostelecky and N. Russell, arXiv:0801.0287 [hep-ph].
[4] T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).
[5] C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. P. Hudson, Phys. Rev. 105, 1413 (1957).
[6] J. H. Christenson, J. W. Cronin, V. L. Fitch and R. Turlay, Phys. Rev. Lett. 13, 138 (1964).
[7] M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973).
[8] A. D. Sakharov, Pisma Zh. Eksp. Teor. Fiz. 5, 32 (1967) [JETP Lett. 5, 24 (1967)] [Sov. Phys. Usp. 34, 392 (1991)] [Usp. Fiz. Nauk 161, 61 (1991)].
[9] M. Dine, arXiv:hep-ph/0011376.
[10] R. D. Peccei and H. R. Quinn, Phys. Rev. Lett. 38, 1440 (1977).
[11] S. L. Adler, Phys. Rev. 177, 2426 (1969).
[12] J. S. Bell and R. Jackiw, Nuovo Cim. A 60, 47 (1969).
[13] S. L. Adler and W. A. Bardeen, Phys. Rev. 182, 1517 (1969).
[14] S. L. Adler, arXiv:hep-th/0405040.
[15] S. M. Carroll, "The Cosmological Constant," Living Rev. Relativity 3, 1 (2001), http://relativity.livingreviews.org/Articles/lrr-2001-1