Download THE CONCEPTUAL BASIS OF QUANTUM FIELD THEORY

THE CONCEPTUAL BASIS OF QUANTUM FIELD THEORY Gerard ’t Hooft Institute for Theoretical Physics Utrecht University, Leuvenlaan 4 3584 CC Utrecht, the Netherlands and Spinoza Institute Postbox 80.195 3508 TD Utrecht, the Netherlands e-mail: [email protected] internet: http://www.phys.uu.nl/~thooft/ Published in Handbook of the Philosophy of Science, Elsevier. December 23, 2004 Version 08/06/16 1 Abstract Relativistic Quantum Field Theory is a mathematical scheme to describe the sub-atomic particles and forces. The basic starting point is that the axioms of Special Relativity on the one hand and those of Quantum Mechanics on the other, should be combined into one theory. The fundamental ingredients for this construction are reviewed. A remarkable feature is that the construction is not perfect; it will not allow us to compute all amplitudes with unlimited precision. Yet in practice this theory is more than accurate enough to cover the entire domain between the atomic scale and the Planck scale, some 20 orders of magnitude. Keywords: Scalar fields, Spinors, Yang-Mills fields, Gauge transformations, Feynman rules, BroutEnglert-Higgs mechanism, The Standard Model, Renormalization, Asymptotic freedom, Vortices, Magnetic monopoles, Instantons, Confinement. This version (08/06/16) has some minor typos corrected, and a sign corrected in Eq. (4.8). Contents 1 Introduction to the notion of quantized fields 3 2 Scalar fields 5 2.1 Classical Theory. Feynman rules . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Spontaneous symmetry breaking. Goldstone modes . . . . . . . . . . . . . 9 2.3 Quantization of a classical theory . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 The Feynman path integral . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.5 Feynman rules for the quantized theory . . . . . . . . . . . . . . . . . . . . 16 3 Spinor fields 20 3.1 The Dirac equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.2 Fermi-Dirac statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.3 The path integral for anticommuting fields . . . . . . . . . . . . . . . . . . 22 3.4 The Feynman rules for Dirac fields . . . . . . . . . . . . . . . . . . . . . . 24 1 4 Gauge fields 25 4.1 The Yang-Mills equations [3] . . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.2 The need for local gauge-invariance . . . . . . . . . . . . . . . . . . . . . . 29 4.3 Gauge fixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 4.4 Feynman rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.5 BRST symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 5 The Brout-Englert-Higgs mechanism 35 5.1 The SO(3) case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.2 Fixing the gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 5.3 Coupling to other fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.4 The Standard Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 6 Unitarity 40 6.1 The largest time equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 6.2 Dressed propagators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 6.3 Wave functions for in- and out-going particles . . . . . . . . . . . . . . . . 45 6.4 Dispersion relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 7 Renormalization 7.1 7.2 47 Regularization schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 7.1.1 Pauli-Villars regularization . . . . . . . . . . . . . . . . . . . . . . . 50 7.1.2 Dimensional regularization . . . . . . . . . . . . . . . . . . . . . . . 51 7.1.3 Equivalence of regularization schemes . . . . . . . . . . . . . . . . . 52 Renormalization of gauge theories . . . . . . . . . . . . . . . . . . . . . . . 53 8 Anomalies 53 9 Asymptotic freedom 56 9.1 The Renormalization Group . . . . . . . . . . . . . . . . . . . . . . . . . . 56 9.2 An algebra for the beta functions . . . . . . . . . . . . . . . . . . . . . . . 58 10 Topological Twists 59 2 10.1 Vortices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 10.2 Magnetic Monopoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 10.3 Instantons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 11 Confinement 63 11.1 The maximally Abelian gauge . . . . . . . . . . . . . . . . . . . . . . . . . 64 12 Outlook 65 12.1 Naturalness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 12.2 Supersymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 12.3 Resummation of the Perturbation Expansion . . . . . . . . . . . . . . . . . 66 12.4 General Relativity and Superstring Theory . . . . . . . . . . . . . . . . . . 67 1. Introduction to the notion of quantized fields Quantum Field Theory is one of those cherished scientific achievements that have become considerably more successful than they should have, if one takes into consideration the apparently shaky logic on which it is based. With awesome accuracy, all known subatomic particles appear to obey the rules of one example of a quantum field theory that goes under the uninspiring name of “The Standard Model”. The creators of this model had hardly anticipated such a success, and one can rightfully ask to what it can be attributed. We have long been aware of the fact that, in spite of its successes, the Standard Model cannot be exactly right. Most quantum field theories are not asymptotically free, which means that they cannot be extended to arbitrarily small distance scales. We could easily cure the Standard Model, but this would not improve our understanding at all, because we know that, at those extremely tiny distance scales where the problems would become relevant, a force appears that we cannot yet describe unambiguously: the gravitational force. It would have to be understood first. Perhaps this is the real strength of Quantum Field Theory: we know where its limits are, and these limits are far away. The gravitational force acting between two subatomic particles is tremendously weak. As long as we disregard that, the theory is perfect. And, as I will explain, its internal logic is not shaky at all. Subatomic particles all live in the domain of physics where spins and actions are comparable to Planck’s constant h̄. One obviously needs Quantum Mechanics to describe them. Since the energies available in sub-atomic interactions are comparable to, and often larger than, the rest mass energy mc2 of these particles, they often travel with velocities 3 close to that of light, c, and so relativistic effects will also be important. Thus, in the first half of the twentieth century, the question was asked: “How should one reconcile Quantum Mechanics with Einstein’s theory of Special Relativity? ” As we shall explain, Quantum Field Theory is the answer to this question. Our first intuitions would be, and indeed were, quite different[1]. One would set up abstract Hilbert spaces of states, each containing fixed or variable numbers of particles. Subsequently, one would postulate a consistent scheme of interactions. What would ‘consistent’ mean? In Quantum Mechanics, we have learned how to describe a process where we start with a certain number of particles that are all far apart but moving towards one another. This is the ‘in’ state |ψiin . After the interaction has taken place, we end up with particles all moving away from one another, a state |ψ 0 iout . The probability that a certain in-state evolves into a given out-state is described by a quantum mechanical transition amplitude, out hψ 0 |ψiin . The set of all such amplitudes in the vector spaces formed by all in- and out-states is called the scattering matrix. One can ask how to construct the scattering matrix in such a way that (i) it is invariant under Lorentz transformations, and (ii) obeys the strict laws of quantum causality. By ‘quantum causality’ we mean that under no circumstance a measurable effect may proceed with a velocity faster than that of light. In practice, this means that one must demand that any set of local operators Oi (x, t) obeys commutation rules such that the commutators [Oi (x, t), Oj (x0 , t0 )] vanish as soon as the vector (x − x0 , t − t0 ) is space-like. One can show that this implies that the scattering matrix must obey dispersion relations. This is indeed how physicists started to think about their problem. But how should one construct such a scattering matrix? Does any systematic procedure exist? A quantized field may seem to be something altogether different, yet it does appear to allow for the construction of an interacting medium that does obey the laws of Lorentz invariance and causality. The local operators can be constructed from the fields. All we then have to do is to set up schemes of relativistically covariant field equations, such as Maxwell’s laws. Even the introduction of non-linear terms in these equations appears to be straightforward, and if we were to subject such systems to a mathematically welldefined procedure called “quantization”, we would have candidates for a solution to the aforementioned problem. Realizing that the energy in a quantized field comes in quantized energy packages, which in all respects behave like elementary particles, and, conversely, realizing that operators in the form of fields could be defined also when one starts up with Hilbert spaces consisting of elementary particles, it was discovered that quantized fields do indeed describe subatomic particles. Subsequently, it was discovered that, in a quantized field, the number of ways in which interactions can be introduced (basically by adding non-linear 4 terms in the field equations), is quite limited. Quantization requires that all interactions can be given in the form of a Lagrange function L; relativity requires this L to be Lorentzinvariant, and, most strikingly, self-consistency of Quantum Field Theory then provides further restrictions, which leads to the possibility of writing down a complete list of all possible interactions. The Standard Model is just one element of this list. The scope of this concise treatise on Quantum Field Theory is too limited to admit detailed descriptions of all technical details. Instead, special emphasis is put on the conceptual issues that arise when addressing the numerous questions and problems associated with this doctrine. One could use this text to learn Quantum Field Theory, but for many technical details, more literature must be consulted.[2] We also limited ourselves to applications of Quantum Field Theory in elementary particle physics. There are many examples in low-temperature physics where these and similar techniques are useful, but they will not be addressed here. 2. Scalar fields 2.1. Classical Theory. Feynman rules A field is here taken to mean a physical variable that is a function of space-time coordinates x = (x, t). In order for our theories to be in accordance with special relativity, we will have to specify how a field transforms under a homogeneous Lorentz transformation, x0 = Lx . (2.1) φ0 (x) = φ(x0 ) , (2.2) If a field φ transforms as then φ is called a scalar field. The improper Lorentz transformations, such as parity reflection P and time reversal T , are of lesser importance since we know that Nature is not exactly invariant under those. Let us first restrict ourselves to real scalar fields; generalization to the case where fields are denoted by complex numbers will be straightforward. Upon quantization, scalar fields will come in energy packets that behave as spinless Bose-Einstein particles, such as π 0 , π ± and η 0 . Conceptually, the scalar field is the easiest to work with, but in section 9 we shall find reasons why other kinds of fields can actually improve the internal consistency of our theories. 5 Lorentz-invariant field equations typically take the form1 ∂µ2 ≡ ∂~x2 − ∂t2 . (∂µ2 − m2(i) )φi = Fi (φ) ; (2.3) Here, the index i labels different possible species of scalar fields, and Fi (φ) could be any function of the field(s) φj (x). Usually, however, we assume that there is a potential function V int (φ), such that Fi (φ) is the gradient of V int , and furthermore we assume that V int is a polynomial whose degree is at most four: V int (φ) = ∂V int (φ) Fi (φ) = = ∂φi 1 g φφφ 6 ijk i j k 1 g φφ 2 ijk j k + 1 λ 24 ijk` φi φj φk φ` ; + 61 λijk` φj φk φ` , (2.4) where g and λ must be totally symmetric under all permutations of their indices.1 This is actually a limitation on the forms that Fi (φ) can take. Without this limitation, we would not have a conserved energy, and quantization of the theory would not be possible. Later, we will see why higher terms in the polynomial are not permitted (section 7). In order to understand the general structure of the classical solutions to this set of equations, we temporarily add a function −Ji (x) to Fi (φ) in Eq. (2.3). Multiplying Ji temporarily with a small parameter ε (later to be replaced by 1), we subsequently expand the solution in powers of ε: (m2(i) − ∂µ2 )φi (x) = εJi (x) − (1) ∂ V ∂φi (2) int (φ(x)) ; (3) φi (x) = εφi (x) + ε2 φi (x) + ε3 φi (x) + · · · = Z 4 (1) 2 (2) 3 (3) d y Gij (x − y) εJj (y) − Fj εφ (y) + ε φ (y) + ε φ (y) + · · · . (2.5) The function Gij (x − y) is a solution to the equation (m2i − ∂µ2 )Gij (x − y) = δij δ(x − y) . (2.6) Assembling terms of equal order in ε we find a recursive procedure to solve the field equations (2.5). At the end of our calculation, we might set ε equal to 1 and Ji (x) equal to zero, or better, have J non-vanishing only in the far-away region where the particles originated, so that the J interaction is a simplified model for the machine that produced the particles in the far past. Indeed, in the quantum theory it will also turn out to be convenient to use J as a model for the particle detector at the end of the experiment. 1 We use summation convention: repeated indices that are not put between brackets are automatically summed over. Greek indices µ are Lorentz indices taking 4 values, Latin indices i, j, · · · run from 1 to 3. Our metric convention is gµν = diag (−1, 1, 1, 1) . 6 We see that the solution to Eq. (2.5) can be written as the sum of a large number of terms. Each of these terms can be written in the form of a diagram, called a Feynman diagram. In these diagrams, we represent a space-time point as a dot, and the function Gij (x − y) as a line connecting x with y . The index i may be indicated at each line. A dot may either be associated with a term Ji (y), or it is a three-point vertex associated with a coefficient gijk or a four-point vertex, going with a coefficient λijk` . A typical Feynman diagram is sketched in Fig. 1. φ1 (x) 1 λ2456 4 2 6 x′ g123 J4 (x′′) 5 J5 (x′′′) J6 (x′ν) J3 (xν) G33 (x′−xν) Figure 1: Example of a Feynman diagram for classical scalar fields Observe the general structure of these diagrams. There are factors 12 , 16 , etc., which can easily be read off from the symmetries of the diagram. By construction, there are no closed loops: the diagram is simply connected. This will be different in the quantized theory. There is one important issue to be addressed: the Green function Gij (x − y) is not completely determined by the equation (2.6): one may add arbitrary combinations of the solutions of the homogeneous equation (m2i − ∂µ2 )Gij (x − y) = 0. In Fourier space, this ambiguity is reflected in the fact that one has some freedom in choosing the integration curve C in the solution2 −4 Gij (x − y) = (2π) Z d4 k eik·(x−y) C k2 δij . + m2i (2.7) Our choice can be indicated by shifting the pole by an infinitesimal imaginary number, after which we choose the contour C to be along the real axis of all integrands. A reasonable choice is −4 G+ ij (x − y) = (2π) Z d4 k eik·(x−y) k2 − (k 0 δij , + iε)2 + m2(i) (2.8) where ε is an infinitesimal, positive number. With this choice, the integration contour in the complex k 0 plane can be shifted such that the imaginary part of k 0 can be given an 2 An inner product k · x stands for ~k · ~x − k 0 x0 . 7 arbitrarily large positive value, and from this one deduces that the Green function will vanish as soon as the time difference, x0 − y 0 , is negative. This Green function, called the forward Green function, gives our expressions the desired causality structure: There are obviously no effects that propagate backwards in time, or indeed faster than light. The converse choice, G− (x−y), gives us the backward solution. However, in the quantized theory, we will often be interested in yet another choice, the Feynman propagator, defined as Z δij F −4 , (2.9) Gij (x − y) = (2π) d4 k eik·(x−y) 2 2 0 k − k + m2i − iε where, again, the infinitesimal number ε > 0. The rules to obtain the complete expansion of the solution can now be summarized as follows: 1) Each term can be depicted as a diagram consisting of points (vertices) connected by , corresponds to a point x where lines (called propagators). One end-point, , refer to factors J(y (i) ) we want to know the field φ; the other end points, for the corresponding points y (i) , see Fig. 1. 2) There are no “closed loops”. i.e. the diagrams must be simply connected (this will be different in the quantum theory). i 3) There are vertices with three prongs (3-vertices), j k , each being associated i with a factor gijk , and vertices with four prongs (4-vertices), a factor λijk` . j l k , each giving i j k x(2) , is associated with 4) Each line connecting two points x(1) and x(2) , x(1) a factor Gij (x(1) − x(2) ) when we work in ordinary space-time (configuration space), or a factor δij , (2.10) 2 k + m2i − iε in momentum space (the reason for this iε choice will only become apparent in the quantized theory). 5) If we work in configuration space, we must integrate over all x values at each vertex except the one where φ was defined; if we work in momentum space, we must integrate over the k values, subject to the restriction of momentum conservation at P each vertex: kout = in kin . 8 6) A ‘combinatorial factor’. For the classical theories it is 1/N , where N is the number of permutations of the source vertices that leave the diagram unaltered. It is not difficult to generalize the rules for the case of higher polynomials in the interactions, but this will not be needed for the time being. 2.2. Spontaneous symmetry breaking. Goldstone modes In the classical theory, the Hamilton density is ~ i )2 + V (φ) ; H(x, t) = 21 φ̇2i + 12 (∂φ V (φ) = 21 m2i φ2i + V int (φ) . (2.11) The theory is invariant under the group of transformations φ0i (x) = Aij φj (x) , (2.12) if A is orthogonal and the potential function V (φ) is invariant under that group. The simplest example is the transformation φ ↔ −φ : A = ±1 ; V = V (φ2 ) = 12 a φ2 + λ 4 φ . 24 (2.13) There are two cases to consider: i) a > 0. In this case, φ = 0 is the absolute minimum of V . We write a = m2 , (2.14) and find that m indeed describes the mass of the particle. All Feynman diagrams have an even number of external lines. Since, in the quantum theory, these lines will be associated with particles, we find that states with an odd number of particles can never evolve into states with an even number of particles, and vice versa. If we define the quantum number PC = (−1)N , where N is the number of φ particles, then we find that PC is conserved during interactions. ii) a < 0. In this case, we see that: — trying to identify the mass of the particle using Eq. (2.14) yields the strange result that the mass would be purely imaginary. Such objects (“tachyons”) are not known to exist and probably difficult to reconcile with causality, and furthermore: — the configuration φ = 0 does not correspond to the lowest energy configuration of the system. The lowest energy is achieved when F 2 = −6a/λ . φ = ±F ; 9 (2.15) It is now convenient to rewrite the potential V as V = λ 2 (φ − F 2 )2 − C , 24 (2.16) where we did not bother to write down the value of the constant C , since it does not occur in the evolution equations (2.5). There are now two equivalent vacuum states, the minima of V . Choosing one of them, we introduce a new field variable φ̃ to write φ ≡ F + φ̃ ; λ 2 λF 2 2 λF 3 λ V = φ̃ (2F + φ̃)2 = φ̃ + φ̃ + φ̃4 , 24 6 6 24 (2.17) and we see that a) for the new field φ̃, the mass-squared m̃2 = λF 2 /3 is positive, and b) a three-prong vertex appeared, with associated factor λF . The quantum number PC is no longer apparently conserved. This phenomenon is called ‘spontaneous symmetry breaking’, and it plays an important role in Quantum Field Theory. Next, let us consider the case of a continuous symmetry. The prototype example is the U (1) symmetry of a complex field. The symmetry group consists of the transformations A(θ), where θ is an angle: Φ≡ √1 (φ1 2 + iφ2 ) ; Φ0 = A(θ)Φ = eiθ Φ , (2.18) Again, the most general potential3 invariant under these transformations is V (Φ, Φ∗ ) = a Φ∗ Φ + 12 λ(Φ∗ Φ)2 − C , (2.19) In the case where the U (1) symmetry is apparent, one can rewrite the Feynman rules to apply directly to the complex field Φ, noticing that one can write the potential V as a real function of the two independent variables Φ and Φ∗ . With ∂µ2 Φ = ∂V (Φ, Φ∗ ) , ∂Φ∗ (2.20) one notices that the Feynman propagators can be written with an arrow in them: an arrow points towards a point x where the function Φ(x) is called for, and away from a 3 Observe how we adjusted the combinatorial factors. The choices made here are the most natural ones to keep these coefficients as predictable as possible in future calculations. 10 point x0 where a factor Φ∗ (x0 ) is extracted from the potential V . At every vertex, as many arrows enter as they leave, and so, during an interaction, the total number of lines pointing forward in time minus the number of lines pointing backward is conserved. This is an additively conserved quantum number, to be interpreted as a ‘charge’ Q. According to Noether’s theorem, every symmetry is associated to such a conservation law. However, if a < 0, this U (1) symmetry is spontaneously broken. Then we write V = 21 λ(Φ∗ Φ − F 2 )2 − C , F 2 ≡ −a/λ . (2.21) This time, the stable vacuum states form a closed circle in the complex plane of Φ values. Let us write Φ ≡ F + Φ̃ ; V = 1 λ 2 Φ̃ ≡ √1 (φ̃1 2 2 ∗ + iφ̃2 ) ; F (Φ̃ + Φ̃∗ ) + Φ Φ λF λ = λF φ̃21 + √ φ̃1 (φ̃21 + φ̃22 ) + (φ̃21 + φ̃22 )2 . 8 2 (2.22) The striking thing about this potential is that the mass term for the field φ̃2 is missing. The mass squared for the φ̃1 field is m̃21 = 2λF . The fact that one of the effective fields is massless is a fundamental consequence of the fact that we have spontaneous breakdown of a continuous symmetry. Quite generally, there is a theorem, called the Goldstone theorem: If a continuous symmetry whose symmetry group has N independent generators, is broken down spontaneously into a (residual) symmetry whose group has N1 independent generators, then N − N1 massless effective fields emerge. The propagators for massless fields obey Eq. (2.6) without the m2 term, which gives these expressions an ‘infinite range’: such a Green’s function drops off only slowly for large spatial or timelike separations. These massless oscillating modes are called ‘Goldstone modes’. 2.3. Quantization of a classical theory How does one “quantize” a field theory? In the old days of Quantum Mechanics, it was taught that “you take the Poisson brackets of the classical system, and replace these by commutators.” Here and there, one had to readjust the order in which classical expressions emerge, when they are replaced by operators, but the rules appeared to leave no essential ambiguities. Indeed, if such a procedure is possible, one may get a quantum theory which reproduces the original classical system in the limit of vanishing h̄. Also, the group of symmetry transformations under which the classical system was invariant, often re-emerges in the quantum system. 11 A field theory, however, has a strictly infinite set of physical degrees of freedom (the field values at every point in 3-space, or, the complete set of Fourier modes). More often than not, upon “quantization”, this leads to infinities that render the theory ill-defined. One has to formulate the notion of “quantization” much more carefully, going through several intermediate steps. Since, today, the answers to our questions are so well known, it is often forgotten how these answers can be derived rigorously and why they take the form they have. What is the strictly logical sequence of arguments? First of all, it is unreasonable to expect that every classical field theory should have a quantum mechanical counterpart. What we wish to do, is construct some quantum system, its Hilbert space and its Hamiltonian, such that in one or more special limits, it reproduces a known classical theory. We demand certain properties of the theory, such as Lorentz invariance and causality, but most of all we demand that it be internally logically impeccable, allowing us to calculate how in such a system particles interact, under all imaginable circumstances. We will, however, continue to use the phrase ‘quantization’, meaning that we attempt to construct a quantum theory with a given classical field theory as its h̄ → 0 limit. Often, authors forget to mention the first, very important, step in this logical procedure: replace the classical field theory one wishes to quantize by a strictly finite theory. Assuming that physical structures smaller than a certain size will not be important for our considerations, we replace the continuum of three-dimensional space by a discrete but dense lattice of points. In the differential equations, we replace all derivatives ∂/∂xi by finite ratios of differences: ∆/∆xi , where ∆φ stands for φ(x + ∆x) − φ(x) . In Fourier space, this means that wave numbers ~k are limited to a finite range (the Brillouin zone), so that integrations over ~k can never diverge. If this lattice is sufficiently dense, the solutions we are interested in will hardly depend on the details of this lattice, and so, the classical system will resume Lorentz invariance and the speed of light will be the practical limit for the velocity of perturbances. If necessary, we can also impose periodic boundary conditions in 3-space, and in that case our system is completely finite. Finite systems of this sort allow for ‘quantization’ in the old-fashioned sense: replace the Poisson brackets by commutators. Note that we did not (yet) discretize time, so the Hamiltonian of our theory has the form H =T +V = 3 XY (∆xa ) xa a=1 X 1 2 (∂φi /∂t)2 + i 1 2 X ∆φi 2 i,a ∆xa + V (φ) . (2.23) The canonical momenta associated to the fields φi (x) are pi (x) = (∂φi /∂t) 3 Y (∆xa ) , a=1 12 (2.24) and so, we will assume these to be operators obeying: [φi (x), φj (x0 )] = 0 ; [φi (x), pj (x0 )] = iδij δx, x0 . [pi (x), pj (x0 )] = 0 ; (2.25) Now, we have to wait and see what happens in the limit of an infinitely dense spacelattice. Will, like the classical theory, our quantum concoction turn out to be Lorentzinvariant? How do we perform Lorentz transformations on physical states? This question turns out to be far from trivial to answer, but the answer is known. We first need some useful technical tools. 2.4. The Feynman path integral The Feynman path integral is often introduced as an “infinite dimensional” integral. Again, we insist on at first keeping everything finite. Label the generalized coordinates (here the φi fields) as qi . The momenta are pi . The Hamiltonian (2.23) is of the convenQ3 tional type (the volume elements a=1 (∆xa ) act as masses). For future use, we need a slightly more general one, a Hamiltonian that also contains pieces linear in the momenta pi : H =T +V ; T = 2 X pi − Ai (q) i 2 m(i) ; V = V (q) . (2.26) In principle, we keep the number n of coordinates and momenta finite, in which case there is no doubt that the differential equations in question have unique, finite solutions (assuming the functions Ai and V to be sufficiently smooth; indeed we will mostly work with polynomials). Consider the configuration states |qi and the momentum states |pi. We have hq|q0 i = δ n (q − q0 ) , hp|p0 i = δ n (p − p0 ) ; hq|pi = (2π)−n/2 eipi qi . (2.27) Taking the order of the operators into account, we write for the kinetic energy X p2i − 2Ai pi + A2i + iW (q) ; T = 2 m(i) i X [Ai (q), pi ] X ∂i Ai (q) W (q) = = . 2i m(i) 2 m(i) i i (2.28) This enables us to compute swiftly the matrix elements hq|H|pi = hq|pi(h(q, p) + iW (q)) ; hp|H|qi = hp|qi(h(q, p) − iW (q)) , 13 (2.29) (2.30) where h(q, p) is the classical Hamiltonian as a function of the two sets of variables q and p. The evolution operator U (t, δt) for a short time interval δt is U (t, δt) e−iH(t)δt = I − iH δt + O(δt)2 = . (2.31) Its matrix elements between states hp| and |qi are easy to derive now: hp|U (t, δt)|qi = (2π) = (2π) −n/2 −ipi qi e −n/2 hp|qi − iδthp|H|qi + O(δt)2 = exp 2 1 − iδt{h(q, p) − iW (q)} + O(δt) − ip · q − iδt{h(q, p) − iW (q)} + O(δt) 2 . (2.32) What makes this expression very useful is the fact that it does not become singular in the limit δt ↓ 0. The momentum-momentum and the coordinate-coordinate matrix elements do become singular in that limit. Next, let us consider a finite time interval T . The evolution operator over that time interval can formally be viewed as a sequence of many evolution operators over short time intervals δt, with T = N δt. Using closure, both in p space and in q space, at all time intervals, I = Z dn q |qihq| = Z dn p |pihp| , (2.33) we can write |ψ(qN , T )i = hqN |U (0, T )|ψ(0)i = Z dn q0 Z dn p0 · · · Z dn qN −1 hqN |pN −1 i hpN −1 |U (tN −1 , δt)|qN −1 i hqN −1 |pN −2 i hpN −2 |U (tN −2 , δt)|qN −2 i · · · hp0 |U (0, δt)|q0 i hq0 |ψ(0)i . Z dn pN −1 (2.34) Plugging in Eq. (2.32), we see that |ψ(qN , T )i = NY −1 Z dn qτ τ =0 exp i N −1 X τ =0 δt pτ Z dn pτ e−W (qτ )δt × (2π)n qτ +1 − qτ − h(qτ , pτ , tτ ) hq0 |ψ(0)i . δt (2.35) Define q̇τ ≡ qτ +1 − qτ , δt 14 (2.36) and L(p, q, q̇, t) = p · q̇ − h(q, p, t) , (2.37) and the measure NY −1 Z dn qτ Z dn pτ τ =0 e−W (qτ )δt Z ≡ DqDp , (2π)n (2.38) then we obtain an expression that seems to be easy to extend to infinitely fine grids in the time variable: hqN |ψ(T )i = Z DqDp exp i N −1 X δt L(p, q, q̇, t) hq0 |ψ(0)i . (2.39) τ =0 In these expressions, we actually allowed the parameters in the Hamiltonian H and the Lagrangian L to depend explicitly on time t, so as to expose the physical structure of these expressions. Note that L(p, q, q̇, t) = − X i X (pi − Ai − m(i) q̇i )2 − V (q) + (Ai q̇i + 21 m(i) q̇i2 ) , 2 m(i) i (2.40) and the integrals over all momentum variables are easy to perform, giving some constant that only depends on the masses m(i) : hqN |ψ(T )i = Z −1 M X Dq exp i δt L(q, q̇, t) hq0 |ψ(0)i , (2.41) τ =0 with L(q, q̇, t) = T − V ; T = ( 12 m(i) q̇i2 + Ai q̇i ) ; X i Dq = e − P τ W (qτ ) δt NY −1 n d qτ τ =0 Y i m(i) 12 2π δt . (2.42) Actually, L(q, q̇, t) is obtained from L(p, q, q̇, t) by extremizing the latter with respect to p: ∂ L(p, q, q̇, t) = 0 ; ∂pi q̇i = ∂h(q, p, t) . ∂pi (2.43) This is exactly the standard relation between Lagrangian and Hamiltonian of the classical theory. So, L is indeed the Lagrangian. 15 If the continuum limit exists, the exponent in Eq. (2.41) is exactly i times the classical action, S= Z dtL(q, q̇, t) . (2.44) It is tempting to assume that the O(δt)2 terms in Eq. (2.31) disappear in the limit; after all, they are only multiplied by factors N ≈ C/δt. In that case, the evolution operator in Eq. (2.41) clearly takes the form of an integral over all paths going from q0 to qN . This is Feynman’s path integral. In the case of a field theory, one considers the field defined on a lattice in space, and since the path integral starts with a lattice in the time variable, we end up dealing with a lattice in space and time. In conclusion: The evolution operator in a field theory is described by first rephrasing the theory on a dense lattice in space-time. Replacing partial derivatives by the corresponding finite difference ratios, one writes an expression for the action S of the theory. Normally, it can be written as an integral over a Lagrange density, L(φ, ∂µ φ). The evolution operator of the theory is obtained by integrating eiS over all field configurations φ(x, t) in a given space-time patch. The integration measure is defined from Eq. (2.42). The Ai terms, linear in the time derivatives, do not play a role in scalar field theories but they do in vector theories, and the fact that they occur in the measure (2.42) is usually ignored. Indeed, in most cases, W (q) vanishes, but we must be aware that it might cause problems in some special cases. We ignore the W term for the time being. 2.5. Feynman rules for the quantized theory The Feynman rules for quantized field theories were first derived by careful analysis of perturbation theory. Writing the quantum Hamiltonian H as H = H0 + H int , one assembles all terms bilinear in the fields and their derivatives in H0 and performs the perturbation expansion for small values of H int . This leads to a set of calculation rules very similar to the rules derived for a classical theory, see subsection 2.1. Most of these rules (but not everything) can now most elegantly be derived from the path integral. Let us first derive these rules for computing a finite dimensional integral of the type (2.41). Although often our action will not contain terms linear in the variables qi (t), we do need such terms now, so, if necessary, we add them by hand, only to remove them at the end of the calculations. There is no need to indicate the time variable t explicitly; we absorb it in the indices i. the action is then S(q) = X L(x, t) = Ji qi − 21 Mij qi qj − 16 Aijk qi qj qk − x,t 16 1 B qqq q 24 ijk` i j k ` . (2.45) To calculate dN q eiS(q) , we keep only the bilinear part (the term with the coefficients Mij ) inside the exponent, and expand the exponent of all other terms: R out h0|0iin = C Z dN q exp − 21 iMij qi qj ∞ X ∞ X ∞ X 1 × k=0 `=0 m=0 k!`!m! (iJi1 qi1 ) · · · (iJik qik ) (− 6i Ai1 j1 k1 qi1 qj1 qk1 ) · · · (− 6i Ai` j` k` qi` qj` qk` ) (− 24i Bi1 j1 k1 `1 qi1 qj1 qk1 q`1 ) · · · (− 24i Bim jm km `m qim qjm qkm q`m ) . (2.46) (C is a constant not depending on the coefficients, but only on their dimensionality). We can calculate all of these integrals if we know how to do the J terms. These however can be done to all orders since we know exactly how to do the Gaussian integral N Z C N d q exp i(− 12 Mij qi qj + Ji qi ) = (2π) 2 (det(M )) 1 2 exp 1 iJ M −1 Jj 2 i ij = ∞ X 1 1 −1 −1 1 · · · iJ M J iJ M J i j i j i1 j1 1 ik jk k . 2 2 1 k k=0 k! (2.47) This expression tells us how to do the integrals in Eq. (2.46) by collecting terms that go with given powers of Ji . The outcome of this calculation can be summarized in a concise way: 1) Each term can be depicted as a diagram consisting of points (vertices) connected by lines (propagators). The lines may end at points i, , which refer to factors Ji . i 2) There are vertices with three prongs (3-vertices), j k , each being associated i with a factor Aijk , and vertices with four prongs (4-vertices), a factor Bijk` . j l k , each giving 3) Each line connecting two points i and j , is associated with a factor Mij−1 . 4) In contrast with the classical theory, however, the diagrams may contain disconnected pieces, or multiply connected parts: closed loops. See Fig. 2. 5) There are combinatorial factors arising from the coefficients such as k! in Eq. (2.46). One can gain experience in deriving these factors; they follow directly from the symmetry structure of a diagram. This technical detail will not be further addressed here. 17 Apparently nothing changes if one re-inserts the (x, t) dependence of these coefficients, when the variables qi are replaced by the fields φi (x, t), and the action by that of a field theory: S= Z d4 xL(x, t) ; L(x, t) = − 21 (∂µ φi )2 − 12 m2(i) φ2i − V (φ) + Ji (x)φi (x) . (2.48) The rules are as in Subsection 2.1, with the only real distinction that, in the quantum theory, diagrams with closed loops in them contribute. These diagrams may be regarded as the ”quantum corrections” to the classical field theory. The disconnected diagrams mentioned under point (4), arise for technical reasons that we will not further elaborate; in practical calculations they may usually be ignored. Figure 2: Example of a Feynman diagram for quantized scalar fields At one point, however, we made an omission: the overall constant C was not computed. It comes from the cancellation of two coefficients (the one in the measure and the one coming from the Gaussian integrals) each of which tend to infinity in the limit of an infinitely dense grid. In most cases, we are not interested in this coefficient (it refers to vacuum-energy), but this does imply that more is needed to extract relevant physical information from these Feynman diagrams. Fortunately, this deficit is easy to cure. The “source insertions”, Ji (x)φi (x) can serve as a model both for the production and for the detection of particles. Let both |0iin and |0iout be the vacuum, or ground state of the theory. At early times, the insertion −J(x, t)φ in the Hamiltonian acts on this vacuum state to excite it into the initial state we are interested in. By differentiating with respect to J , we can reach any initial state we want to consider. Similarly, at the end of the experiment, at late times, Jφ can link the particle state that we wish to detect to the final vacuum state. In short, differentiating with respect to J(x, t) gives us any matrix element that we wish to study. This is easier than one might think: Ji refers to particles of type i, and if we give it the same space-time dependence as the wave function of the particle we want to see (put it on the ‘mass shell’ of that particle), then we can be sure that there will be no contamination from unwanted particle states. One only has to check 18 the normalization, but also that is not hard: we adjust the 1-particle to 1-particle amplitude to be one; a single particle cannot scatter (it could be unstable, but that is another matter). The constant C always drops out of these calculations. An important point is the ambiguity of the inverse matrix M −1 . As in the classical case, there are homogeneous solutions, so, if we work in momentum space, there will be the question how to integrate around the poles of the propagator. The iε prescription mentioned in subsection 2.1 is now imperative. This is explained as follows. Consider the propagator in position space, and choose its poles situated as follows: Z 0 d4 k eik·x−ik t ; m2 + k2 − k 0 2 − iε ε↓0. (2.49) √ The poles are at k 0 = ±( m2 + k2 − iε). Now consider this propagator at time t = −T + iβ with both T and β large. Since β is large, the choice of the contour at negative k 0 is immaterial, since the exponential there is very small. At positive k 0 , we choose the contour to go above the pole, so the imaginary part of k 0 is chosen positive. We see that then the exponential vanishes rapidly at negative time. In short, our propagator tends to zero if the time t tends to −T + iβ when both T and β are large and positive. The same holds for t → +T − iβ . Indeed, we want our evolution operator to be dominated by the empty diagram in these two limits. Write: hψ|U (0, +T − iβ)|ψ 0 i = hψ|Ei exp(−iET − βE) hE|ψ 0 i , X (2.50) E where |Ei are the energy eigenstates. At large β , the vacuum state should dominate. Conversely, if we consider evolution backwards in time, the other iε prescription is needed. One then works with the Feynman rules for the inverse, or the complex conjugate, of the scattering matrix. Now, we are in a position to add the prescription how to identify the external lines (the lines sticking out of the diagram) with in- and out-going particles. For an ingoing particle, we use a source function J(x) whose Fourier components emit a positive amount of energy k 0 . For an out-going particle the source emits a negative k 0 . According to the rules formulated above, these sources would be connected to the rest of the diagram by propagators, in Fourier space (kµ2 + m2 − iε)−1 . Since the in- and out-going particles have kµ2 + m2 = 0 , we must take the residue of the pole. In practice, this means that we have to remove the external propagators, a procedure called ‘amputation’. One then still has to establish a normalization factor. This factor is most easily obtained by checking unitarity of the scattering matrix, using the optical theorem. At first sight, this seems to be just a simple numerical coefficient, but there is a slight complication at higher orders, when self-energy corrections affect the propagator. These corrections also remove unstable particles from the physical scattering matrix. We return to this in Section 6. The complete Feynman rules are listed in subsection 4.4 19 3. Spinor fields 3.1. The Dirac equation The fields introduced in the previous section can only be used to describe particles with spin 0. In a quantum theory, particles can come in any representation of the little group, which is the subgroup of the inhomogeneous Lorentz group that leaves the 4-momentum of a particle unaffected. For massive particles in ordinary space, this is the group of rotations of a three-vector, SO(3). Its representations are labelled by either an integer ≥ 0, or an integer + 12 , representing the total spin of a particle. So, next in line are the particles with spin 12 . The wave function for such a particle has two components, one for spin up and one for spin down. Therefore, to describe a relativistic theory with such particles, we should use a two-component field obeying a relativistically covariant field equation. Paul Dirac was the first to find an appropriate relativistically covariant equation for a free particle with spin 12 : (m + X γ µ ∂µ )ψ(x) = 0 , (3.1) µ but the field ψ(x, t) has four complex components. Here, γ µ , µ = 0, 1, 2, 3, are four 4 × 4 matrices, obeying {γ µ , γ ν } = γ µ γ ν + γ ν γ µ = 2g µν ; γµ† = gµν γ ν . (3.2) In contrast to the scalar case, the Dirac equation is first order in the space- and timederivatives, and furthermore, one could impose a ‘reality condition’ (Majorana condition) on the fields, of the form ψ(x) = Cψ ∗ (x) , γ µ C = C(γ µ )∗ , µ = 0, 1, 2, 3. (3.3) These two features combined give the Dirac field the same multiplicity as two scalar fields. Usually, we do not impose the Majorana condition, so that the Dirac field is truly complex, having a conserved U (1) charge much like a complex scalar field. We briefly recapitulate the most salient features of the Dirac equation. The 4 × 4 Dirac matrices can conveniently be expressed in terms of two commuting sets of Pauli matrices, σa and τa . Define σ1 = 0 1 1 0 , σ2 = 0 −i i 0 , σ3 = 1 0 0 −1 , (3.4) and similarly for the τ matrices, except that they act in different spaces: a Dirac index is then viewed as a pair (iα) of indices i and α, such that the matrices σa act on the first index i, and the matrices τA act on the indices α. We have: σa σb = δab + iεabc σc , τA τB = δAB + iεABC τC , 20 [σa , τB ] = 0 . (3.5) Define (with the convention gµν =diag(−1, 1, 1, 1)) γ 1 = σ1 τ1 , γ 2 = σ2 τ1 , γ 3 = σ 3 τ1 , γ 0 = −iτ3 . (3.6) The matrix C in Eq. (3.3) is then: C = γ2 γ4 . (3.7) In the non-relativistic limit, the Dirac equation reads (m + iγ µ k µ )ψ ≈ (m − iγ 0 k 0 )ψ ≈ m(1 − τ3 )ψ = 0 , (3.8) so that only two of the four field components survive (the ones with τ3 |ψi = |ψi ). 3.2. Fermi-Dirac statistics At this point, we could now attempt to pursue our fundamental quantization program: produce the Poisson brackets of the system, replace these by commutators, rewrite the Hamiltonian of the system in operator form, and solve the resulting Schrödinger equation. Unfortunately, if one uses ordinary (commuting) numbers, this does not work. The Lagrangian associated to the Dirac equation will read L= Z d3~x L(x) ; L(x) = −ψ(x)(m + 4 X γ µ ∂µ )ψ(x) , (3.9) µ=0 and the canonical procedure would give as momentum fields: pψ (~x) = ∂L = ψ(~x)γ 0 , ∂(∂0 ψ (~x)) pψ̄ (~x) = 0 . (3.10) From this, one finds the Hamiltonian: H= Z 3 d ~x H(~x) ; H(~x) = pψ ψ̇ − L(~x) = ψ(x)(m + 3 X γ i ∂i )ψ(x) , (3.11) i=1 Here, the index i is a spatial one, runnunig from 1 to 3. This, however, is not bounded from below! Such a quantum theory would not possess a vacuum state, and hence be unsuitable as a model for Nature. For a better understanding of the situation, we strip the Dirac equation to its bare bones. After diagonalizing it, we find that the Lagrangian consists of elementary units of the form L = ψ(i∂t ψ − M ψ) ; pψ = iψ ; 21 H = ψM ψ . (3.12) If we were using ordinary numbers, the only way to obtain a lower bound on H would be by identifying ψ with ψ . Then, however, the kinetic part of the Lagrangian would become a time-derivative: ψ∂t ψ → 21 ∂t (ψ ψ) , (3.13) so that it could not contribute to the action. One concludes that, only in the space of anticommuting numbers, can the Lagrangian (3.12) make sense. Thus, one replaces the Poisson brackets for ψ and ψ by anti commutators: {ψ, ψ} ≡ ψ ψ + ψ ψ = 1 ; {ψ, ψ} = 0 ; {ψ, ψ} = 0 . (3.14) The elementary representation of this algebra is in a ‘Hilbert space’ consisting of just two states (the empty state and the one-particle state), in which the operators ψ and ψ act as annihilators and creators: ψ= 0 1 0 0 ; ψ= 0 0 1 0 ; H= 0 0 0 M . (3.15) Returning to the non-diagonalized case, we can keep the Lagrangian (3.9) and Hamiltonian (3.11) when the anticommutation rules (3.14) are replaced by i {ψ (x), ψj (x0 )} = δji δ(x − x0 ) ; i {ψi (x), ψj (x0 )} = 0 ; j {ψ (x), ψ (x0 )} = 0 . (3.16) The anticommutation rules (3.16) turn Dirac particles into fermions. It appears to be a condition for any Lorentz-invariant quantum theory to be consistent, that integer spin particles must be bosons and particles whose spin is an integer + 21 must be fermions. 3.3. The path integral for anticommuting fields Let us now extend the notion of path integrals to include Dirac fields. This means we have to integrate over anticommuting numbers, to be called θi , where i is some index (possibly including x). They are numbers, not operators, so all anticommutators vanish. Consider the Taylor expansion of a function of a variable θ . Since θ2 = 0, this expansion has only two coefficients: f (θ) = f (0) + f 0 (0)θ . (3.17) So, this is the most general function of θ that one can have. It is generally agreed that one should define integrals for anticommuting numbers θ by postulating Z dθ 1 ≡ 0 ; Z 22 dθ θ ≡ 1 . (3.18) The reason for this definition is that one can manipulate these expressions in the same way as integrals over ordinary numbers: Z dθ f (θ + α) = Z Z dθ f (θ) ; dθ ∂f (θ) =0, ∂θ (3.19) etc. Now, consider the Hamiltonian for just one fermionic degree of freedom, (3.15), which we write as H = M b† b ; and a wave function ψ = {b, b† } = 1 ; b2 = (b† )2 = 0 , (3.20) ψ0 . Define the following function of θ : ψ1 ψ(θ) ≡ ψ0 θ + ψ1 , (3.21) This now serves as our wave function. It is not hard to derive how the annihilation operator b and the creation operator b† act on these wave functions: if φ = bψ then φ(θ) = θ ψ(θ) , (3.22) ∂ . ∂θ (3.23) or: b† = b=θ; We now wish to express the evolution of a fermionic wave function in terms of a path integral, just as in subsection 2.4. Consider a short time interval δt. Then, ignoring all terms of order (δt)2 , one derives e−iδt H ψ(θ1 ) = ψ0 θ1 + (1 − iM δt)ψ1 = = = Z Z Z dθ0 (−θ1 + θ0 − iM δtθ0 )(ψ0 θ0 + ψ1 ) dθ0 dθ0 Z Z dθ 1 + θ(−θ1 + θ0 − iM δtθ0 ) ψ(θ0 ) dθ eθ(−θ1 +θ0 −iM δtθ0 ) ψ(θ0 ) . (3.24) Repeating this procedure over many infinitesimal time intervals, with T = N δt, one arrives at the formal expression ψ(θT ) = Z dθT −1 dθT −1 · · · dθ0 dθ0 exp N −1 X τ =0 23 δt θτ ( −θτ +1 + θτ − iM θτ ) ψ(θ0 ) . (3.25) δt The exponential tends to i Z dt L(t) . (3.26) Thus, as in the bosonic case, the evolution operator is formally the path integral of eiS over all (anticommuting) fields ψi (x, t), where the action S is the time integral4 . of the Lagrangian L, and indeed the space-time integral of the Lagrange density L(x, t). 3.4. The Feynman rules for Dirac fields Let Mij be any matrix that can be diagonalized. Using Eqs. (3.18), we find the integral YZ dθi Z dθi eθi Mij θj = det(M ) , ij i (3.27) which can be easily checked by diagonalizing M , and writing Z dθ Z dθ eθM θ = Z dθ Z dθ (1 + θM θ) = M . (3.28) Thus, a Gaussian integral over anticommuting numbers gives a result very similar to that over commuting numbers, except that we get det(M ) rather than C/ det(M ). Writing M = M0 + δM ; det(M ) = eTr(log M )) = 1 + Tr (log M ) + 12 (Tr log M )2 + · · · , (3.29) we see that this can be obtained from det(M −1 ) by switching the signs of all odd terms in this expansion. Since the N th term corresponds to a Feynman diagram with N closed fermionic loops, one derives that the Feynman rules can be read off from the ones for ordinary commuting fields, by switching a sign whenever a closed fermionic loop is encountered. We have −Tr log M = −Tr log M0 + ∞ X (−1)n Tr (M0−1 δM )n . n n=1 (3.30) Here, as in the bosonic case, −M0 is the propagator of the theory, and δM represents the contribution from any perturbation. Thus, if our Lagrangian, including possible interaction terms, is L = −ψ i (m(i) + γ µ ∂µ )ψi + ψ i gij (φ)ψj , 4 (3.31) In some applications, careful considerations of the boundary conditions for Dirac’s equation, require an extra boundary term to be added to the action (3.25) 24 then the propagator, in Fourier space, is (m(i) + iγ µ kµ )−1 = m(i) − iγ µ kµ , m2(i) + k 2 − iε (3.32) while gij (φ) generates the interaction vertices of a Feynman diagram. The iε term is chosen as in bosonic theories, for the same reason as there: the vacuum state must be the state with lowest energy. The poles in the propagator can be used to define in- and out-going particles, by adding source terms to the Lagrangian: δL = η(x)ψ + ψη(x) , (3.33) where η(x) and η(x) are kept fixed, as anticommuting numbers. We could proceed to derive the precise rules for in- and out-going particles with spin up or down, but it is more convenient to postpone this until we discuss the unitarity property of the S -matrix, where these rules are required explicitly, and where we find the precise prescription for the normalization of these states (section 6). Note that our Lagrangian is always kept to be bilinear in the anticommuting fields. This is because we insist that L itself must be a commuting number and, furthermore, terms that are quartic in the fermionic fields have too high a dimension. We will see in section 7 why such terms have to be avoided. 4. Gauge fields We continue to search for elementary fields, whose Lorentz covariant field equations can be subject to our quantization program. In principle, such fields could come as any arbitrary representation of the Poincaré algebra, that is, we might consider any kind of tensor field, Aµνλ··· (x, t). It turns out, however, that tensors with more than one Lorentz index cannot be used. This is because we wish the energy density of a field to be bounded from below, and in addition, we wish the dimensionality of the interactions to be sufficiently low, such that all coupling strengths have mass dimension zero or positive. A theory is called “renormalizable” if all of its interaction parameters λi (that is, all parameters with respect to which we need to make a perturbation expansion) have a mass-dimensionality that is positive or zero. In practice, the dimensionality of coupling coefficients is easy to establish: in n space-time dimensions, all terms in the Lagrangian L have dimensionality [mass]n . The mass-dimensionality of all fields and parameters can then be read off: dim(λ) = [m]4−n , dim(λ3 ) = [m]3−n/2 , etc. Coupling strengths with mass dimension less than zero give rise to unacceptably divergent expressions for the contributions of the interactions at short scales. A prime example 25 of a field one would like to include is the gravitational field described by the metric gµν (x), but its only possible interaction is the gravitational one, whose coupling strength, Newton’s constant GN , has the wrong dimension. The non-renormalizable theories one then obtains are the subject of intense investigations but fall outside the scope of this paper (see C. Rovelli’s contribution in this book). So, only spin-one fields Aaµ (x) are left for consideration. Here, µ is a Lorentz index, while the number of field types is counted by the index a = 1, · · · , NV . These fields should describe the creation and annihilation of spin-one particles. When at rest, such a particle will be in one of three possible spin states. Yet, to be Lorentz-invariant, a vector field Aµ should have four components. One of these, at least, should be unphysical, although one might think of accepting an extra, spinless particle to be associated to the vector particles. More important therefore is the consideration that, in the corresponding classical theory, the energy should be bounded from below. This then rules out the treatment of a four-vector field as if we had four scalar fields, because the Lorentz-invariant product has an indefinite metric. Can we construct a Lagrangian for a vector field that gives a Hamiltonian that is bounded from below? Let us look at the high-momentum limit for one of these vector fields. The only two terms in a Lagrangian that can survive there are: L = − 12 α (∂µ Aν )2 + 21 β ∂µ Aµ ∂ν Aν , (4.1) since other terms of this dimensionality can be reduced to these ones by partial integration of the action, while mass terms (terms without partial derivatives) become insignificant. We have for the canonical momentum fields ∂L = α ∂0 Ai (i = 1, 2, 3); ∂∂0 Ai ∂L = = (β − α) ∂0 A0 − β ∂i Ai . ∂∂0 A0 Ei = E0 (4.2) Now, consider the Hamiltonian density H = E µ ∂0 Aµ − L. It must be bounded from below for all field configurations Aµ (x, t). Let us first consider the case when the spacelike components Ai and all spacelike derivatives ∂i are negligible compared to ∂0 A0 : H → 21 (β − α)(∂0 A0 )2 , (4.3) then, when A0 and all time-derivatives are negligible: H → 21 α(∂i Aj )2 − 21 β(∂i Ai )2 . (4.4) These must all be bounded from below. Eq. (4.3) dictates that β ≥ α, while Eq. (4.4) dictates that α ≥ β . We conclude that α = β , which we can both normalize to one. 26 Since total derivatives in the Lagrangian do not count, we can then rewrite the original Lagrangian (4.1) as a a L → − 41 Fµν Fµν , a Fµν = ∂µ Aaν − ∂ν Aaµ . (4.5) Realizing that this is the Lagrangian for ordinary QED, we know that its energy-density is properly bounded from below. We conclude that every vector field theory must have a Lagrangian that approaches Eq. (4.5) at high energies and momenta. We do note, that with the choice α = β , both (4.3) and (4.4) tend to zero. Indeed, a = 0, any field Aaµ that can be written as a space-time gradient, Aaµ = ∂µ Λa (x, t), has Fµν and hence contributes neither to the Lagrangian nor to the Hamiltonian. Such fields could be arbitrarily strong, yet carry zero energy. They would represent particles and forces without energy. This is unacceptable in a decent Quantum Field Theory. How do we protect our theory against such features? There is exactly one way to do this. We must make sure that field replacements of the type Aaµ → Aaµ + ∂µ Λa (x) + · · · , (4.6) do not affect at all the physical state that we are describing. This is what we call a local gauge transformation. We must insist that our theory is invariant under local gauge transformations. The ellipses in Eq. (4.6) indicate that we allow extra terms that do not contribute to the bilinear part of the Lagrangian (4.5). Thus, we arrive at Yang-Mills field theory. 4.1. The Yang-Mills equations [3] Our conclusion from the above is that every vector field is associated to a local gauge symmetry. The dimensionality of the local gauge group must be equal to NV , the number of vector fields present. Besides the vector fields, the local symmetry transformations may also affect the scalar and spinor fields. In short, the vector fields must be Yang-Mills fields. We here give a brief summary of Yang-Mills theory. We have a local Lie group with elements Ω(x) at the point x. Let the matrices T , a = 1, · · · , NV be its infinitesimal generators: a Ω(x) = I + i X Λa (x)T a ; T a = (T a )† . (4.7) a Characteristic for the group are its structure constants fabc : [T a , T b ] = −ifabc T c . 27 (4.8) As is well-known in group theory, one can choose the normalization of T a in such a way that the fabc are totally antisymmetric: fabc = −fbac = fbca . (4.9) Usually, the spinor fields ψ(x) and scalar fields φ(x) are introduced in such a way that they transform as (sets of irreducible) representations of the gauge group. A local gauge transformation is then: ψ 0 (x) = Ω(x)ψ(x) ; φ0 (x) = Ω(x)φ(x) , (4.10) and in infinitesimal form: ψ 0 (x) = ψ(x) + iΛa (x)T a ψ(x) + O(Λ)2 , (4.11) and similarly for φ(x). The dimension of the irreducible representation can be different for different field types. So, scalar and spinor fields usually form gauge-vectors of various dimensionalities. In these, and in the subsequent expressions, the indices labelling the various components of the fields ψ , Ω and T a have been suppressed. Our vector fields Aaµ (x) are most conveniently introduced by demanding the possibility of constructing gauge-covariant gradients of these fields: Dµ ψ(x) ≡ (∂µ + igAaµ (x)T a )ψ(x) , (4.12) where g is a freely adjustable coupling parameter. The repeated indices a, denoting the different species of vector fields, are to be summed over. By demanding the transformation rule (Dµ ψ(x))0 = Ω(x)Dµ ψ(x) = Dµ ψ(x) + iΛa (x)T a Dµ ψ(x) + O(Λ)2 , (4.13) one easily derives the transformation rule for the vector fields Aaµ (x): igAaµ 0 (x)T a = Ω(x) ∂µ + igAaµ (x)T a Ω−1 (x) = igAaµ (x)T a − i∂µ Λa (x)T a + g[T a , T b ]Λa (x)Abµ (x) (4.14) (omitting the O(Λ)2 terms). With Eq. (4.8), this becomes Aaµ 0 (x) = Aaµ (x) − g1 ∂µ Λa (x) + fabc Λb (x)Acµ (x) . (4.15) If we ensure that all gradients used are covariant gradients, we can directly construct the general expressions for Lagrangians for scalar and spinor fields that are locally gaugeinvariant: 2 2 1 Linv scalar (x) = − 2 (Dµ φ) − V (φ ) ; µ Linv Dirac (x) = −ψ(γ Dµ + m)ψ , 28 (4.16) (4.17) and in addition other possible invariant local interaction terms without derivatives. The commutator of two covariant derivatives is a (x)T a ψ(x) ; [Dµ , Dν ]ψ(x) = igFµν a Fµν (x) = ∂µ Aaν − ∂ν Aaµ + gfabc Abµ Acν . (4.18) a Unlike Aaµ (x) or the direct gradients of Aaµ (x), this Yang-Mills field Fµν transforms as a true adjoint representation of the local gauge group: a 0 a c (x) + fabc Λb (x)Fµν (x) . Fµν (x) = Fµν (4.19) This allows us to construct a locally gauge invariant Lagrangian for the vector field: a 1 a Linv YM (x) = − 4 Fµν (x)Fµν (x) . (4.20) The structure constants fabc in the definition 4.18 of the field F µν implies the presence of interaction terms in the Yang-Mills Lagrangian 4.20. If fabc is non-vanishing, we talk of a non-Abelian gauge theory. There is one important complication in the case of fermions: the Dirac matrix γ 5 ≡ γ 1 γ 2 γ 3 γ 4 can be used to project out the chiral sectors: ψ ≡ ψL + ψR ; ψL = 21 (1 + γ 5 )ψ ; ψR = 12 (1 − γ 5 )ψ . (4.21) Since the kinetic part of a Dirac Lagrangian can be split according to LDirac = −ψ L (γD)ψL − ψ R (γD)ψR , (4.22) we may choose the left-handed fields ψL to be in representations different from the righthanded ones, ψR . However, since a mass term joins left to right: −mψ ψ = −mψ L ψR − mψ R ψL , (4.23) such terms would then be forbidden, hence such chiral fields must be massless. Secondly, not all combinations of chiral fermions are allowed. An important restriction is discussed in section 8. The fields ψL turn out to describe spin- 21 massless particles with only the leftrotating helicity, while their antiparticles, described by ψ L , have only the right-rotating helicity. 4.2. The need for local gauge-invariance In the early days of Gauge Theory, it was thought that local gauge-invariance could be an ‘approximate’ symmetry. Perhaps one could add mass terms for the vector field that 29 violate local symmetry, but make the model look more like the observed situation in particle physics. We now know, however, that such models suffer from a serious defect: they are non-renormalizable. The reason is that renormalizability requires our theory to be consistent up to the very tiniest distance scales. A mass term would, at least in principle, turn the field configurations described by the Λ(x) contributions in Eq. (4.15) into physically observable fields (the Lagrangian now does depend on Λ(x)). But, since the kinetic term for Λ(x) is lacking, violently oscillating Λ fields carry no sizeable amount of energy, so they would not be properly suppressed by energy conservation. Uncontrolled short distance oscillations are the real, physical cause for a theory being non-renormalizable. It is similar uncontrolled short-distance fluctuations of the space-time metric that cause the quantized version of General Relativity (“Quantum Gravity”) to be non-renormalizable. Drastic measures (String Theory?) are needed to repair such a theory. Since renormalizability provides the required coherence of our theories, local gauge symmetry, described by Eqs. (4.10) and (4.15), must be an exact, not an approximate symmetry of any Quantum Field Theory.5 Obviously, the fact that most vector particles in the sub-atomic world do carry mass must be explained in some other way. It is here that the Brout-Englert-Higgs mechanism comes to the rescue, see Section 5. 4.3. Gauge fixing The longitudinal parts of the vector fields do not occur directly in the Yang-Mills Lagrangian (4.20), exactly because of its invariance under transformations of the form (4.15). Yet if we wish to describe solutions, we need to choose a longitudinal component. This is why we wish to impose some additional constraint, the so-called gauge condition, on our description of the solutions, both in the classical and in the quantized theory. In electrodynamics, we usually impose a constraint such as ∂µ Aµ (x) = 0 or A0 = 0. In a Yang-Mills theory, such a constraint is needed for each value of the index a. A gauge fixing term is indicated by a field C a (x) which is put equal to zero: either or C a (x) = 0 ; C a (x) = ∂µ Aaµ (x) C a (x) = Aa0 (x) a = 1, · · · , NV ; where (Feynman gauge), (timelike gauge), (4.24) (4.25) (4.26) or other possible gauge choices. It is always possible to find a Λa (x) that obeys such a condition. For instance, to obtain the Feynman gauge (4.25), all one has to do is extremize an integral under variations of the gauge group: δ Z 4 dx 2 Aaµ (x) = Z d4 x 2Aµ Dµ Λ = 0 5 → Dµ Aµ (x) = ∂µ Aaµ (x) = 0 . (4.27) One apparent exception could be the case where the longitudinal component decouples completely, which happens in massive QED. But even in that case, it is better to view the longitudinal photon as a Higgs field, see section 5. 30 For the classical theory, the most elegant way to impose such a gauge condition is by adding a Lagrange multiplier term to the Lagrangian: L(x) = Linv (x) + λa (x)C a (x) , (4.28) where C a (x) is any of the possible gauge fixing terms and λa (x) a free kinematical variable. Here, Linv stands for the collection of all gauge-invariant terms in the Lagrangian. The Euler-Lagrange equations of the theory then automatically yield the Yang-Mills field equations plus the constraint, apart from a minor detail: the boundary condition. Varying the gauge transformations, one finds, since Linv does not vary, Dµ λa (x) = 0. We need to impose the stricter equation λa (x) = 0, which is obtained by imposing λa (x) = 0 at the boundaries of our system. Alternatively, one can replace the invariant Lagrangian by L(x) = Linv (x) − 1 2 2 C a (x) , (4.29) which has the advantage that, after partial integration, the bilinear part becomes very simple: L = − 14 Fµν Fµν − 12 (∂µ Aµ )2 → − 12 (∂µ Aν )2 , so that the vector field can be treated as if it were just 4 scalars. Again, varying the gauge transformation Λa (x), one finds Dµ C a (x) = 0, which must be replaced by the more stringent condition C a (x) = 0 by adding the appropriate boundary condition. Note that the Lagrange-Hamilton formalism could give the wrong sign to the energy of some field components; we should continue to use the energy deduced before imposing the gauge constraint. If we use the timelike gauge (4.26), the energy is correct, but the theory appears to lack Lorentz invariance. Lorentz transformations must now be accompanied by gauge transformations. How is the gauge constraint to be handled in the quantized theory? This problem was solved by B.S. DeWitt[4],[5] and by Faddeev and Popov[6]. The gauge constraint is to be imposed in the integrand of the functional integral: Z= Z DA(x) Z Dφ(x) · · · ei R d4 xLinv (x) Y δ(C a (x))∆{A, φ} . (4.30) a,x Thus, we integrate only over those field configurations that obey the gauge condition. ∆{A, φ} is a Jacobian factor, which we will discuss in a moment. The formal delta function can be replaced by a Lagrange multiplier: Z Dλa (x)ei R d4 x λa (x)C a (x) , (4.31) and indeed, if λa (x) is simply added to the list of dynamical field variables of the theory, the Feynman rules can be derived unambiguously as they were for the scalar and the spinor case. 31 There is, however, a problem. It appears to be difficult to prove gauge-invariance. More precisely: we need to ascertain that, if we make the transition to a different gauge fixing function C a (x), the physical contents of the theory, in particular the scattering matrix, remains the same. The difficulty has to do with the measure of the integral. It is not gauge-invariant, unless we add the extra term ∆{A, φ} in Eq. (4.30). This term is associated to the volume of an infinitesimal gauge transformation. Suppose that the field combination C a (x) transforms under a gauge transformation as C a0 (x) = C a (x) + ∂C a (x) b 0 Λ (x ) , ∂Λb (x0 ) (4.32) then the required volume term is the Jacobian ∆{A, φ} = det ∂C a (x) ∂Λb (x0 ) . (4.33) The determinant is computed elegantly by using the observation in subsection 3.4 that that integral over anticommuting variables gives a determinant (Eq. (3.27)). So, we introduce anticommuting scalar fields η and η , and then write (4.33) = Z Dη a (x) Z Dη a (x) exp η a (x) ∂C a (x) b 0 η (x ) . ∂Λb (x0 ) (4.34) This is called the Faddeev-Popov term in the action. Taking everything together, we arrive at the following action for a Yang-Mills theory: L(x) = Linv (x) + λa (x)C a (x) + η a (x) ∂C a (x) b 0 η (x ) . ∂Λb (x0 ) (4.35) It is also possible to find the quantum analogue for the classical Lagrangian (4.29). First, replace C a (x) by C a (x) − F a (x), where F a (x) is a fixed but x-dependent quantity in the functional integral (4.35). Physical effects should be completely independent of F a (x). Therefore, we can functionally integrate over F a (x), using any weight factor we − 1 2 R 4 a 2 d x(F (x)) like. Choose the weight factor e . The Lagrange multiplier λa (x) now simply a a forces C (x) to be equal to F (x). We end up with the effective Lagrangian6 L(x) = Linv (x) − 1 2 2 C a (x) + η a (x) ∂C a (x) b 0 η (x ) . ∂Λb (x0 ) (4.36) This is the most frequently used Lagrangian for gauge theories. In contrast to the Lagrangians for scalar and spinor fields, not all fields here represent physical particles. The longitudunal part of the vector fields, and the fermionic yet scalar fields η and η are “ghosts”. These are the only fields to which no particles are associated. 6 One usually absorbs the factor 1/g of Eq. (4.15) into the definition of the η field. 32 4.4. Feynman rules The Feynman rules, needed for the computation of the scattering matrix elements using perturbation theory, can be read off directly from the gauge-fixed Lagrangian (4.35) or (4.36). In both cases, we first split off the bilinear parts7 , writing the Lagrangian as L = −Aα (x)M̂αβ Aβ (x) − ψ α (x)D̂αβ ψβ (x) + Lint , (4.37) where Lint contains all trilinear and quadrilinear terms. Here, Aα (x) is short for all bosonic (scalar and vector) fields, and ψ and ψ for both the Dirac fermions and the Faddeev-Popov fermions. The coefficients M̂αβ , D̂αβ and the trilinear coefficients may contain the gradient operator ∂/∂xµ . After Fourier expansion, this will turn into a factor ikµ . ferm — The propagators P̂αβ and P̂αβ will be the inverse of the coefficients M̂ − iε and D̂ − iε, so, for instance δαβ ; if M̂αβ = (m(α) − ∂µ2 )δαβ then P̂αβ = 2 m(α) + k 2 − iε if D̂αβ = (m(α) + γ µ ∂µ )δαβ then ferm P̂αβ = (m(α) − iγ µ kµ )δαβ . m2(α) + k 2 − iε (4.38) — The vertices are generated by the trilinear and quadrilinear terms of Lint , just as in subsection 2.5. If we have source terms such as J a (x)φa (x), η i (x)ψi (x) or i ψ (x)ηi (x), then these correspond to propagators ending into points, where the momentum k has to match a given Fourier component of the source. All this can be read off neatly from formal expansions of the functional integral such as (2.46). — There is an overall minus sign for every fermionic closed loop. — Every diagram comes with canonical coefficients such as 1/k! and (2π)−4N where k! is the dimension of the diagram’s internal symmetry group, and N counts the number of loop integrations. These coefficients can be obtained by comparing functional integrals with ordinary integrals. — There is a normalization coefficient for every external line, depending on the wave function chosen for the in- and out-going particles. We return to this in section 6. Note that any terms in the Lagrangian that can be written as a gradient of some (locally defined) field configuration can be replaced by zero. This is because (under sufficiently carefully chosen boundary conditions) such terms do not contribute to the total action R S = d4 xL(x). 7 One may decide to leave small corrections to the bilinear parts of the Lagrangian to be treated together with the higher order terms as if they were ‘two-point vertices’. 33 4.5. BRST symmetry As the reader may have noted, we departed from our original intention, to keep space and time on a lattice and only turn to the continuum limit at the very end of a calculation. We have not even started doing calculations, and already the Feynman rules were formulated as if the fields lived on a space-time continuum. Indeed, we should have kept space and time discrete, so that the functional integral is nothing but an ordinary integral in a space with very many, but still a finite number of, dimensions. In practice, however, the continuum is a lot easier to handle, so, often we do not explicitly mention the finite size meshes of space and time. Our first attempt to formulate the continuum limit will be in section 7. We will then see that the coefficients in the Lagrangian (4.36) have to be renormalized. The following question then comes up: If we see a Lagrangian that looks like (4.36), how can we check that its coefficients are those of a genuine gauge theory? The answer to this question is that the gauge-fixed Lagrangians (4.35) and (4.36) possess a symmetry. The first attempts to identify the symmetry in question gave negative results, because the ghost field is fermionic while the gauge fixing terms are bosonic. In the early days we thought that the required relation between the gauge fixing terms and the ghost terms had to be checked by inspection.[15] But the complete answer was discovered by Becchi, Rouet and Stora[7], and independently by Tyutin[8]. The symmetry, called BRST symmetry, is a supersymmetry. For the Lagrangian (4.36), which is slightly more general than (4.35), the transformation rules are ∂Aα (x) b 0 A0α (x) = Aα (x) + ε ∂Λ b (x0 ) η (x ) ; (a) η a0 (x) = η a (x) + 21 ε fabc η b (x)η c (x) ; (b) η a0 (x) = η a (x) + ε C a (x) , (c) (4.39) where the anticommuting number ε is the infinitesimal generator of this (global) supersymmetry transformation. The invariance of the Lagrangian (4.36) under this supersymmetry transformation is easy to check, except perhaps the cancellation of the variation of the last term against the contribution of (4.39 b): ηa a ∂C a 1 c d a ∂ ∂C ε f η η + η ηcηd = · · · . bcd ∂Λb 2 ∂Λc ∂Λd (4.40) Substituting some practical examples for the gauge constraint function C a , one discovers that these terms always cancel out. The reason for (4.40) to vanish is the fact that gauge 34 transformations form a group, implying the Jacobi identity: fasb fscd + fasc fsdb + fasd fsbc = 0 . (4.41) The converse is more difficult to prove: If a theory is invariant under a transformation of the form (4.39) (BRST invariance), then it is a gauge-fixed local gauge theory. What is really needed in practice, is to show that the ghost particles do not contribute to the S -matrix. This indeed follows from BRST invariance, via the so-called Slavnov-Taylor identities[16][17], relations between amplitudes that follow from this symmetry. 5. The Brout-Englert-Higgs mechanism The way it is described above, Yang-Mills gauge theory does not appear to be suitable to describe massive particles with spin one. However, in our approach we concentrated only on the high-energy, high-momentum limit of theories for vector particles, by assuming the Lagrangian to take the form (4.1) there. Mass terms dominate in the infra-red, or low energy domain. Here, one may note that we have not yet exploited all possibilities. We need to impose exact local gauge-invariance, as explained in subsection 4.2. So our theory must be constructed along the lines expounded in subsection 4.1. All scalar and spinor fields must come as representations of the gauge group. So, what did we overlook? In our description of the most general, locally gauge-invariant Lagrangian, it was tacitly assumed that the minimum of the scalar potential function V (φ) occurs at φ = 0, so that, as one may have in global symmetries, the symmetry is evident in the particle spectrum: physical particles come as representations of the full local symmetry group. But, as we have seen in the case of a global symmetry, in subsection 2.2, the minimum of the potential may occur at other values of φ. If these values are not invariant under the gauge group, then they form a non-trivial representation of the group, invariant only under a subgroup of the gauge group. It is the invariant subgroup, if at all non-trivial, of which the physical particles will form representations, but the rest of the symmetry is hidden. Indeed, if we switch off the coupling to the vector fields, we obtain again the situation described in subsection 2.2. As was emphasized there, the particle spectrum then contains massless particles, the Goldstone bosons. These Goldstone bosons represent the field excitations associated to a global symmetry transformation, which does not affect the energy: hence the absence of mass. But, Global gauge Goldstone bosons do carry a kinetic term. Therefore, they do carry away energy when moving with the speed of light. This is because a global symmetry only dictates the Goldstone field to carry no energy if the field is space-time independent. In contrast, local gauge symmetries demand that Goldstone fields also carry no energy when they do depend on space and time. In the case of a local symmetry, therefore, 35 Goldstone modes are entirely in the ghost sector of the theory; Goldstone particles then are unphysical. Let us see how this happens in an example. 5.1. The SO(3) case As a prototype, we take the group SO(3) as our local gauge group, and for simplicity we ignore the contributions of loop diagrams, which represent the higher order quantum corrections to the field equations. Let the scalar field φa be in the 3-representation. The invariant part of the Lagrangian is then: a 2 ) − 21 (Dµ φa )2 − V (φ) ; Linv = − 14 (Fµν V (φ) = 18 λ (φa )2 − F 2 2 . (5.1) Here, Dµ stands for the covariant derivative: Dµ φa = ∂µ φa + gabc Abµ φc . As in section 2.2, Eq. (2.17), we define shifted fields φ̃a by   0   φa ≡ φ̃a +  0  ; F V (φ̃) = 12 λF 2 φ̃23 + 12 λF φ̃2 φ̃3 + 81 λ(φ̃2 )2 . (5.2) The shift must also be carried out in the kinetic term for φ: A2µ   = Dµ φ̃a + gF  −A1µ  ; 0  Dµ φa  − 12 (Dµ φa )2 = 2 − 21 (Dµ φ̃a )2 − gF A2µ Dµ φ̃1 − A1µ Dµ φ̃2 − 12 g 2 F 2 A1µ + A2µ 2 . (5.3) Defining the complex fields Φ̃ = √1 (φ̃1 2 Aµ = + iφ̃2 ) ; √1 (A1 µ 2 + iA2µ ) ; Dµ Φ̃ = (∂µ + iA3µ )Φ̃ − iAµ φ̃3 , (5.4) we see that the Lagrangian (5.1) becomes a 2 Linv = − 41 (Fµν ) − 12 (Dµ φ̃3 )2 − Dµ Φ̃∗ Dµ Φ̃ where − 21 MH2 φ̃23 − MV2 A∗µ Aµ + MV =(A∗µ Dµ Φ̃) − V int (φ̃) , √ MH = λF ; MV = gF , (5.5) and V int is the remainder of the potential term. = stands for imaginary part. Thus, the ‘neutral’ component of the scalar field, the Higgs particle, gets a mass MH (see Eq. 5.2) and the ‘charged’ components of the vector field receive a mass term with mass MV . The mechanism that removes (some of) the Goldstone bosons and generates mass for the vector particles, is called the Brout-Englert-Higgs (BEH) mechanism.[10][11] In every respect, the neutral, massless component of the vector field behaves like an electromagnetic vector potential, and the complex vector particle is electrically charged. 36 5.2. Fixing the gauge If one would try to use the rules of Subsection 4.4 to derive the Feynman rules directly from Linv , one would find that the matrix M̂ describing the bilinear part of the Lagrangian has no inverse. This is because the gauge must first be fixed. Choosing ∂µ Aaµ (x) = 0 has the advantage that the somewhat awkward term =(A∗µ ∂µ Φ̃) can be put equal to zero by partial integration. The vector propagator (in momentum space) is then easily computed to be ab Pµν (k) δµν − kµ kν /(k 2 − iε) δab , = k 2 + m2(a) − iε (5.6) where m(a) = MV for the charged vector field and 0 for the neutral one. This indeed appears to describe a vector particle with mass m(a) and an additional transversality constraint. One can do something smarter, though. If, in the gauge-fixed lagrangian (4.36), we choose C 3 = ∂µ A3µ ; C 1 = ∂µ A1µ − MV φ̃2 ; C 2 = ∂µ A2µ + MV φ̃1 , (5.7) then we find that the scalar-vector mixing terms cancel out, but now also the (∂µ Aµ )2 term cancels out, so that the vector propagator looses its kµ kν term. The vector propagator is then ab Pµν (k) = k2 δµν δab , + m2(a) − iε (5.8) and the charged scalar ghost gets a mass MV . The physical field φ̃3 is unaffected. It is instructive to compute the Faddeev-Popov ghost Lagrangian in this gauge. One easily finds it to be Lghost = η a ∂ 2 η a − MV2 (η 1 η 1 + η 2 η 2 ) + interaction terms . (5.9) As will be confirmed by more explicit calculations, the theory has physical, charged vector particles with masses MV , a neutral (massless) photon and a neutral scalar particle with mass MH . The latter is called the Higgs particle of this theory. All other fields in the Lagrangian describe ghost fields. Apparently, in the gauge described above, all ‘unphysical’ charged particles, the ghosts, the timelike components of the vector fields, as well as the Goldstone bosons, have the same mass MV . The unphysical neutral particles all have mass zero. One concludes that the symmetry pattern of this example is as follows: the local gauge group, SO(3), is broken by the Brout-Englert-Higgs mechanism into its subgroup SO(2) 37 (the rotations about a fixed axis, formed by the vacuum value of φa ), or equivalently, U (1). Therefore two of the three vector bosons obtain a mass, while one massless U (1) photon remains. At the same time, two of the three scalars turn into ghosts, the third into a Higgs particle. The Brout-Englert-Higgs mechanism does not alter the total number of independent physical states in the particle spectrum. In our example, two of the three scalar particles disappeared, but the two massive spin-1 particles now each have three spin helicities, whereas the massless photons only had two. 5.3. Coupling to other fields The shift (5.2) in the definition of the fields, gives all interactions an asymmetric appearance. This is why, in the literature, one talks of “spontaneous breaking of the local symmetry”. Actually, this is something of a misnomer. In the case of a global symmetry, spontaneous breakdown means that the vacuum state is degenerate. After a global symmetry transformation, the vacuum state is transformed into a physically inequivalent vacuum state, which is not realized in the system. The existence of a massless Goldstone boson testifies to that. In the case of a local symmetry, nothing of the sort happens. There is only one vacuum state, and it is invariant under the local symmetry, always. This is why the Goldstone boson became unphysical. In fact, all physical states are formally invariant under local gauge transformations. Apparent exceptions to this rule are, of course, the charged particles in QED, but this is because we usually wish to ignore their interactions with the vector potential at infinity. In reality a full discussion of charged particles is obscured by their long-range interactions. In view of all of this, it is better not to say that a local symmetry is spontaneously broken, but, rather, to talk of the Brout-Englert-Higgs mechanism [10][11], which is the phenomenon that the spectrum of physical particles do not form a representation of the local symmetry group. The local symmetry can only be recognized by shifting the scalar fields back to their symmetric notation, the original fields φ. Local symmetry must not be regarded as a property of the physical states, but rather as a property of our way of describing the physical states. If, however, we perform a perturbation expansion for small values of the gauge field coupling, we find that at vanishing gauge coupling a local symmetry is spontaneously broken. Therefore, it is still quite useful to characterize our perturbative description by listing the gauge groups and the subgroups into which they are broken. Now, let us assume that there are other fields present, such as the Dirac fermions, ψi . In the symmetric notation, they must form a representation of the local gauge group. So, 38 we have i i LDirac = −ψ (γµ Dµ + m(i) ) ψi − ψ gY taij φa ψj , (5.10) where Dµ is the appropriate covariant derivative, containing those matrices T a that are appropriate for the given representation (see 4.11 and 4.12), and g Y stands for one or more Yukawa coupling parameters. The mass terms m(i) and coupling coefficients taij are invariant tensors of the gauge group (masses are only allowed if the fermions are not chiral, see the discussion following Eq. (4.22)). i i LDirac = −ψ (γµ Dµ + m(i) ) ψi − ψ gY taij φa ψj , (5.11) Here, again, we started with the more transparent symmetric fields φa , but the physical fields φ̃ are obtained by the shift φa = Fa + φ̃a . Thus, the lowest order bilinear part of the Dirac Lagrangian becomes i LDirac → −ψ (γµ ∂µ + m(i) )δij + gY taij Fa ψj , (5.12) In particular, if the symmetry acts distinctly on the chiral parts of the fermion fields, the mass term m(i) is forbidden, but the less symmetric second term may generate masses and in any case mass differences for the fermions. Thus, not only do the vector and scalar particles no longer form representations of the original local gauge group, but neither do the fermions. 5.4. The Standard Model What is presently called the ‘Standard Model’ is just an example of a Higgs theory. The gauge group is SU (3) × SU (2) × U (1). This means that the set of vector fields falls apart into three groups: 8 associated to SU (3), then 3 for SU (2), and finally one for U (1). The scalar fields φi form one two-dimensional, complex representation of two of the three groups: it is a doublet under SU (2) and rotates as a particle with charge 21 under U (1). Representing the Higgs scalar in terms of four real field components, the Brout-EnglertHiggs mechanism is found to remove three of them, leaving only one neutral, physical Higgs particle. SU (2) × U (1) is broken into a diagonal subgroup U (1). Three of the four gauge fields gain a mass. The one surviving photon field is obtained after re-diagonalizing the vector fields; it is a linear composition of the original U (1) field and one of the three components of the SU (2) gauge fields. The SU (3) group is not affected by the Brout-Englert-Higgs mechanism, so one would expect all ‘physical’ particles to come in representations of SU (3). What happens instead 39 is further explained in section 11: only gauge-invariant combinations of fields are observable as particles in our detectors. The fermions in the Standard Model form three ‘families’. In each family, we see the same pattern. The left handed fields, ψL , all form doublets under SU (2), and a combination of a triplet (‘quarks’) and singlets (‘leptons’) under SU (3). The right handed components, ψR , form the same representations under SU (3), but form a pair of two singlets under SU (2); so they do not couple to the SU (2) vector fields. The U (1) charges of the left-handed SU (2) doublets are − 12 for the leptons and 16 for the quarks; the U(1) charges of the right-handed singlets are −1 and 0 (for the leptons) or − 31 and 23 (for the quarks). The Standard model owes its structure to the various possible Yukawa interaction terms with the Higgs scalars. They are all of the form ψ φ ψ , and invariant under the entire gauge group, but since there are three families of fermions, each having left and right handed chiral components, there are still a fairly large number of such terms, each of which describes an interaction strength whose value is not dictated by the principles of our theory.[12] 6. Unitarity As we saw in subsection 4.4, the Feynman rules unambiguously follow from the expression one has for the Lagrangian of the theory. More precisely, what was derived there was the set of rules for the vacuum-to-vacuum amplitude in the presence of possible source insertions Ji (x), including anticommuting sources ηi , η i . The overall multiplicative constant C in our Gaussian integrals such as (2.46) is completely fixed by the demand that, in the absence of sources, the vacuum-to-vacuum amplitude should be 1. By construction then, the resulting scattering matrix should turn out to be unitary. In practice, however, things are not quite that simple. In actual calculations, one often encounters divergent, hence meaningless expressions. This happens when one makes the transition to the continuum limit too soon — remember that we insisted that space and time are first kept discrete. Unitarity of the S -matrix turns out to be a sensitive criterion to check whether we are performing the continuum limit correctly. It was one of our primary demands when we initiated the program of constructing workable models for relativistic, quantized particles. Another demand, the validity of dispersion relations, can be handled the same way as unitarity; these two concepts will be shown to be closely related. The formalism described below is based on work by Cutkosky and others, but was greatly simplified by Veltman.[13] Parts of this section are fairly technical and could be skipped at first reading. 40 6.1. The largest time equation Let us start with the elementary Feynman propagator, (k 2 + m2 − iε)−1 , and its Fourier transform back to configuration space (omitting for simplicity a factor (2π)4 ): ∆F (x) = −i Z d4 k eikx , k 2 + m2 − iε x = x(1) − x(2) . (6.1) In addition, we define the on-shell propagators ∆± (x) = 2π Z d4 k eikx δ(k 2 + m2 )θ(±k 0 ) ; k x = ~k · ~x − k 0 x0 , (6.2) and θ is the Heaviside step function, θ(x) = 1 for x ≥ 0 and = 0 otherwise. The integrals are over Minkowski variables ~k, k 0 . These operators propagate particles on mass shell with the given sign of the energy from x(2) to x(1) , or back with the opposite sign. We have ∆+ (x) = (∆− (x))∗ ; ∆+ (x) = ∆− (−x) . (6.3) Our starting point is the decomposition of the propagator into forward and backward parts: ∆F (x) = θ(x0 )∆+ (x) + θ(−x0 )∆− (x) . (6.4) Obviously: ∗ ∆F (x) = θ(x0 )∆− (x) + θ(−x0 )∆+ (x) . (6.5) One easily proves this by deforming the contour integration in the complex k 0 plane. Consider now a Feynman diagram with n vertices, where lines are attached with a given topological structure, which will be kept fixed. The external lines are assumed to be ‘amputated’: there are no propagators attached to them. The Feynman rules are applied as described in Subsections 2.5 and 4.4. The diagram is then part of our calculation of an S -matrix element. We consider the diagram in momentum representation and in the coordinate representation. The expression we get in coordinate representation is called F (x(1) , x(2) , · · · , x(n) ). Next, we introduce an expression associated to the same diagram, but where some of the vertices are underlined: F (x(1) , x(2) , · · · , x(i) , · · · , x(j) , · · · , x(n) ), where x(i) refer to the coordinates that must be integrated over when one elaborates the Feynman rules. The rules for computing this new amplitude are as follows: i) A propagator ∆F (x(i) − x(j) ) is used if neither x(i) nor x(j) are underlined. 41 ii) A propagator ∆+ (x(i) − x(j) ) is used if x(i) but not x(j) is underlined. iii) A propagator ∆− (x(i) − x(j) ) is used if x(j) but not x(i) is underlined. ∗ iv ) A propagator ∆F (x(i) − x(j) ) is used if both x(i) and x(j) are underlined. v ) A minus sign is added for every underlined vertex. In all other respects, the rules for the calculation of the amplitude are unchanged. Figure 3: Diagram with underlined vertices, which are indicated by little circles One now derives the largest time equation: Let x(k) be the coordinate with the largest time: x(k)0 ≥ x(i)0 , ∀i . Then, F (x(1) , x(2) , · · · , x(k) , · · · , x(n) ) = −F (x(1) , x(2) , · · · , x(k) , · · · , x(n) ) , (6.6) where in both terms the points other than x(k) are underlined or not in identical ways. One easily proves this using Eqs. (6.4) and (6.5). One consequence of this theorem is F ({x(i) }) = 0 . X all 2n (6.7) possible underlinings We now show that these are the diagrams contributing to the unitarity equation, or ‘optical theorem’: X S|nihn|S † = I . n 42 (6.8) The diagrams for the matrix S are as described earlier. The diagrams for S † contain the complex conjugates of the propagators. Since also the vertices in the functional integral are all multiplied by i, they must all change sign in S † . Also the momenta k in eikx switch sign. In short, the diagrams needed for the computation of S † indeed are the underlined Green functions. Note that, in momentum space, the largest time equation (6.6) cannot be applied to individual vertices, since, while being integrated over, the vertex with largest time switches position. However, the summed equation (6.7) is valid. The identity I on the r.h.s. of Eq. (6.8) comes from the one structure that survives: the diagram with no vertices at all. We observe that unitarity may follow if we add all possible ways in which a diagram with given topology can be cut in two, as depicted in Fig. 3. The shaded line separates S from S † . The lines joining S to S † represent the intermediate states |ni in Eq. (6.8). They are on mass shell and have positive energies, which is why we need the factors δ(k 2 +m2 )θ(k 0 ) there. If a propagator is equipped with some extra coefficients Rij : Pij (k) = −i Rij (k) , + m2 − iε k2 (6.9) then we can still use the same decomposition (6.4), provided Rij is local : it must be a finite polynomial in k . Writing Rij = X fi (k)fj∗ (k) , (6.10) k we can absorb the factors fi (k) into the definition of S , provided that all eigenvalues of Rij are non-negative. Indeed, kinetic terms in the Lagrangian must all have the same sign. Note that we are not allowed to replace the terms in the Lagrangian by their complex conjugates. This implies that, for the unitarity proof, it is mandatory that the Lagrange density is a real function of the fields. An important feature of these equations is the theta functions for k 0 . They guarantee that the intermediate states contribute only if their total energy does not exceed the energy available in the given channel. 6.2. Dressed propagators In the previous subsection, not all diagrams that contribute to S S † have yet been handled correctly. There is a complication when self-energy diagrams occur. If one of the lines at both sides of a self-energy blob is replaced by ∆± , then the other propagator ∆F places a pole on top of that Dirac delta. In this case, we have to use a more sophisticated 43 prescription. To see what happens, we must first sum the geometric series of propagator insertions, see Fig. 4(a). We obtain what is called the dressed propagator. In momentum space, let us write the contribution of a single blob in Fig. 4(a) as −iδM (k). It represents the summed contribution of all irreducible diagrams, which are the diagrams with two external lines that cannot fall apart if one cuts one internal line. We need its real and imaginary parts: δM (k) ≡ δm2 (k) − iΓ(k). Write the full propagator as P dr (k) = P 0 (k) − P 0 (k)iδM (k)P 0 (k) + · · · 0 = P (k) ∞ X 0 n − iδM (k)P (k) n=0 P 0 (k) = ; 1 + iδM (k)P 0 (k) if P 0 (k) = −i(M (k) − iε)−1 then P dr (k) = −i(M (k) + δM (k) − iε)−1 , (6.11) (6.12) where P 0 (k) is the unperturbed (‘bare’) propagator. If we define the real part of the dressed propagator (in momentum space) to be <(P dr (k)) = (k 2 Γ(k) = π%(−k 2 ) , + M + δm2 )2 + Γ2 (6.13) %(m2 ) ; k 2 + m2 − iε (6.14) then, by contour integration, dr P (k) = Z ∞ 0 dm2 we call this the Källen-Lehmann representation of the propagator. Later, it will be assured that %(m2 ) = 0 if m2 < 0 . = + + + ⋅⋅⋅ (a) (b) = Figure 4: (a) The dressed propagator as a geometric series; (b) Cutting the dressed propagator The best strategy now is to apply a largest time equation to the entire dressed propagator. Write as for Eqs. (6.4) and (6.5), P dr (x) = θ(x0 )∆+dr (x) + θ(−x0 )∆−dr (x) ; P dr (x)∗ = θ(x0 )∆−dr (x) + θ(−x0 )∆+dr (x) . 44 (6.15) Then, ∆±dr (k) = 2π Z d4 k eikx %(−k 2 )θ(±k 0 ) . (6.16) The imaginary part Γ(k) of the irreducible diagrams can itself again be found by applying the cutting rules. Writing S = I + iT , we find that unitarity for all non-trivial diagrams corresponds to i(T − T † ) + T T † = 0, and the diagrams for T T † are depicted in Fig. 4b. They are exactly the diagrams needed for unitarity of the entire scattering matrix involving a single virtual particle in the channel of two external ones. One observes that the function %(−k 2 ) must be non-negative, and only nonvanishing for timelike k . The latter is guaranteed by the theta functions in k 0 . Only the delta peaks in % are associated to stable particles that occur in the initial and final states of the scattering matrix. Resonances with finite widths contribute to the unitarity of the scattering matrix via their stable decay products. 6.3. Wave functions for in- and out-going particles Many technical details would require too much space for a full discussion, so we have to keep this sketchy. In case we are dealing with vector or spinor particles, the residues Rij of the propagators represent the summed absolute squares of the particle wave functions. We have seen in Eq. (6.9) how this comes about. If, for example, a vector particle is described by a propagator Pµν = −i δµν + kµ kν /M 2 , k 2 + M 2 − iε (6.17) then we see that, first of all, the numerator is a polynomial in k , as was required, and, if we go on mass shell, k 2 = −M 2 , then we see that the field component proportional to kµ is projected out. In particular, if we put k = (0, 0, 0, iM ), then Rij = δij and its timelike components disappear, so indeed there are three independent states for the particle described. For the fermions, the bare propagator is P Dirac = −i k2 m − iγk . + m2 − iε (6.18) Before relating this to the renormalization of the wave functions, we must note that all γ µ are hermitean, while ki are real and k4 is imaginary. We observe that the Feynman rules for S † are like those of S , but with γ 4 replaced by −γ 4 . Next, the arrows in the propagators must be reversed. This leads to an extra minus sign for every vector kµ , while γ µ are replaced by γ µ† . All together, one requires that γ i → −γ i while γ 4 remains 45 unchanged. This amounts to the replacement γ µ → γ 4 γ µ γ 4 . One concludes that the rules for S † are like those for S if all fermion lines enter or leave the diagram with an extra factor γ 4 . This means that the wave functions for external fermions in a diagram are to be normalized as i 4 0 4 (m − iγi k + γ k )γ = 2 X |ψi (k)ihψi (k)| , (k 0 > 0) ; (6.19) i=1 while for the anti-fermions, we must demand γ 4 (m − iγi k i − γ 4 k 0 ) = − 4 X |ψi (k)ihψi (k)| , (k 0 > 0) . (6.20) i=3 The minus sign is necessary because the operator in (6.20) has two negative eigenvalues. One concludes that unitarity requires spin- 21 particles to carry one extra minus sign for each closed loop of these particles. This leads to the necessity of Fermi-Dirac statistics. Again, it is important that none of the higher order corrections ever affect the signs of the eigenvalues for these projection operators, since these can never be accommodated for by a renormalization of the particle wave function. The conclusion of this section, that the scattering matrix is unitary in the space of physical particle states, should not come as a surprise because our theory has been constructed to be that way. Yet it is important that we see here in what way the Feynman diagrams intertwine to produce unitarity explicitly. We also see that unitarity is much more difficult to control when we have ghosts due to the gauge fixing procedure. Our vector particles then have propagators where Eq. (6.17) is replaced by expressions such as ren Pµν = −i gµν . k 2 + M 2 − iε (6.21) We write here gµν rather than δµν in order to emphasize that our arguments are applied in Minkowski space, where clearly the time components ‘carry the wrong sign’. The field components associated to that would correspond to particles that contribute negatively to the scattering probabilities. To correct this, one would have to replace |nihn| by −|nihn|, which cannot be achieved by renormalizing the states |ni. Here, we use the BRST relations to show that all unphysical states can be transformed away. In practice, we use the fact that the scattering matrix does not depend on the choice of the gauge fixing function C a (x), so we choose it such that all ghost particles have a mass exceeding some critical value Λ. In the intermediate states, their projector operators ∆± (k) then only contribute if the total energy in the given channel exceeds Λ. This then means that there are no ghosts in the intermediate states, so the scattering matrix is unitary in the space of physical particles only — an absolutely essential step in the argument that these theories 46 are internally consistent. The required gauge fixing functions C a (x) are not difficult to construct, but their existence is only needed to complete this formal argument. They are rather cumbersome to use in practical calculations. 6.4. Dispersion relations The largest time equation can also be employed to derive very important dispersion relations for the diagrams. These imply that any diagram D can be regarded as a combination of two sets of diagrams Di and Di† : D= XZ ∞ i 0 dk 0 Di (k 0 ) Di† (k 0 ) . 0 −k − iε (6.22) Here, Di (k 0 ) and Di† (k 0 ) stand for amplitudes depending on various external momenta k , where one of the timelike components, k 0 , is integrated over. This, one derives by singling out two points, x(1) and x(2) in a diagram, and time-ordering them. The details of the derivation go beyond the scope of this paper (although they are not more difficult than the previous derivations in this section). Eq. (6.22) can be used to express diagrams with closed loops in terms of diagrams with fewer closed loops, and discuss the subtraction procedures needed for renormalization. 7. Renormalization For a proper discussion of the renormalization concept, we must emphasize what our starting point was: first, replace the continuum of space by a dense lattice of points, and only at the very end of all calculations do we make an attempt to take the continuum limit. The path integral procedure, illuminated in subsection 2.4, implies that time, also, can be replaced by a lattice. In Fourier space, the space-time lattice leads to finite domains for the values of energies and momenta (the Brillouin zones), so that all ultra-violet divergences disappear. If we also wish to ensure the absence of infra-red divergences, we must replace the infinite volume of space and time by a finite box. This is often required if complications arise due to divergent contributions of soft virtual particles, typically photons. Nasty infra-red divergences occur in theories with confinement, to be discussed in section 11. The instruments that we shall use for the ultra-violet divergences of a theory are as follows. We assume that all freely adjustable physical constants of the theory, referred to as the ‘bare’ parameters, such as the ’bare’ mass and charge of a particle, should be carefully tuned to agree with observation, but the tuning process may depend critically on the mesh size a of the space-time lattice. Thus, while we vary a, we allow all bare 47 parameters, λ say, in the theories to depend on a, often tending either to infinity or to zero as a → 0. If this procedure is combined with a perturbation expansion, say in terms of a small coupling g , we expect to find that observable features depend minimally on a provided that the bare couplings g(a) remain small in the limit a ↓ 0. This will be an important condition for our theories to make sense at all. How do we know whether g(a) tends to zero or not? The simplest thing to look at, is the dimensionality of g . All parameters of a field theory have a dimension of a length to some power. These dimensions usually depend on the dimensionality n of space-time. The rules to compute them are easy to obtain: - An action S = dn xL(x) is dimensionless; R - The dimension of a Lagrange density L is therefore (length)−n = mn , where m is a mass. - The dimension of the fields can be read off from the kinetic terms in the Lagrangian, because they contain no further parameters. A scalar field φ has dimension m(n−2)/2 , a fermionic field ψ has dimension m(n−1)/2 . - A gauge coupling constant g has dimension m(4−n)/2 and the coupling parameter λ in an interaction term of the form λφk has dimension mn+k−nk/2 , and so on. A theory is called power-counting renormalizable, if all expansion parameters have mass-dimension positive or zero. This is why, in 4 space-time dimensions, we cannot accept higher than quartic interactions among scalars. In practice, in 4 space-time dimensions, most expansion parameters have dimension zero. In Section 9, we will see that dimensionless coupling parameters nevertheless depend on the size of a, but only logarithmically: λ(a) ≈ λ0 + Cλ20 log(a) + higher orders. (7.1) Regardless of whether this tends to zero or to infinity in the continuum limit, one finds that, in the continuum theory, the perturbative corrections to the bare parameters λ diverge. This is nothing to be alarmed about. However, if λ itself is also a small parameter in terms of which we wish to perform a perturbation expansion, then clearly trouble is to be expected if its bare value tends to infinity. Indeed, we shall argue that, in general, such theories are inconsistent. There are two very important remarks to be made: 48 — Theories can be constructed where all couplings really tend to zero in the continuum limit. These theories are called asymptotically free (Section 9), and they allow for accurate approximations in the ultra-violet. It is generally believed that such theories can be defined in a completely unambiguous fashion through their perturbation expansions in the ultra-violet; in any case, they allow for very accurate calculation of all their physical properties. QCD is the prime example. — If a theory is not asymptotically free, but has only small coupling parameters, the perturbation expansion formally diverges, and the continuum limit formally does not exist. But the first N terms of perturbation expansion do make sense, where N = O(1/g). This means that uncontrollable margins of error are exponentially small, of 2 order e−C/g or e−C/g , which in practice is much smaller than other uncertainties in the theory, so they are of hardly any practical consequence. Thus, in such a case, our theory does have intrinsic inaccuracies, but these are exponentially suppressed. In practice, such theories are still highly valuable. The Standard Model is an example. A useful approach is to substitute all numbers in a theory by formal series expansions, where the expansion parameter, a factor common to all coupling parameters of the theory, is formally kept infinitesimal. In that case, all perturbation coefficients are uniquely defined, though one has little direct knowledge concerning the convergence or divergence of the expansions. In both the cases mentioned above, our theories are defined from their perturbation expansion; clearly, the perturbation expansion is not only a convenient device for calculations, it is an essential ingredient in our theories. Let us therefore study how renormalization works, order-by-order in perturbation theory. In a connected diagram, let the number of external lines be E , the number of propagators be P , and let Vn be the number of vertices with n prongs. By drawing two dots on each propagator and one on each external line, one finds that the number of dots is 2P + E = X nVn = 3V3 + 4V4 . (7.2) n For tree diagrams (simply connected diagrams), one finds by induction, with V the numP ber of vertices, V = n Vn , V =P +1 . (7.3) A diagram with L closed loops (an L-fold connected diagram) turns into a tree by cutting away L propagators. Therefore, one has P =V −1+L . 49 (7.4) Combining Eqs. (7.2) and (7.4), one has E + 2L − 2 = X (n − 2)Vn = V3 + 2V4 . (7.5) n Consequently, if every 3-vertex comes with a factor g and every 4-vertex with a factor λ, and if a diagram with a given number E of external lines, behaves as g 2n λk , it must have L = n + k + 1 − 21 E closed loops. Perturbation expansion is therefore often regarded as an expansion in terms of the number of closed loops. 7.1. Regularization schemes In a tree diagram, in momentum space, no integrations are needed to be done — the momentum flowing through every propagator is fixed by the momenta of the in- and out-going particles. But if there are L loops, one has to perform 4L integrations in momentum space. It is these integrations that often tend to diverge at large momenta. Of course, these divergences are stopped if momentum space is cut off, as is the case in a finite lattice. However, since our lattice is not Lorentz-invariant and may lack other symmetries such as gauge-invariance, it is useful to find other ways to modify our theory so that UV divergences disappear. This is called ‘regularization’. We give two examples. 7.1.1. Pauli-Villars regularization Assume that a propagator of the form shown is replaced as follows: k2 A(k) + m2 − iε → X i ei k2 A(k) ; + Λ2i − iε X i ei = 0 , X ei Λ2i = 0 . (7.6) i If we take e1 = 1, Λ1 = m, while all other Λi tend to +∞, we get back the original propagator. With finite Λi , however, we can make all momentum integrations converge at infinity. Our theory is then finite. This is (a somewhat simplified version of) Pauli-Villars regularization. However, the new propagators cannot describe ordinary particles. The ones with ei < 0 contribute to the unitarity relation with the wrong sign! On the other hand, the iε prescription is as usual, so that these particles do carry positive energy. In any channel where the total energy is less than Λi , the ‘Pauli-Villars ghosts’ do not contribute to the unitarity relation at all. So, in a theory where we put a limit to the total energy considered, Pauli-Villars regularization is physically acceptable. In practice, we will try to send all ghost masses Λi to infinity. 50 7.1.2. Dimensional regularization Dimensional regularization[9] consists of formally performing all loop integrations in 4 − ε dimensions, where ε may be any (possibly complex) number. As long as ε is irrational, all integrations can be replaced by finite expressions following an unambiguous prescription, to be explained below. If ε = 0, one can also subtract the integrals, but the prescription is then often not unambiguous, so that anomalies might arise. This is why dimensional regularization will be particularly important whenever the emergence of anomalies is a problem one wishes to understand and control. It is important to realize that also when ε 6= 0, integrals may be divergent, but that, for irrational ε, unambiguous subtractions may be made. This needs to be explained, but first, one needs to define what it means to have non-integral dimensions. Such a definition is only well understood within the frame of the perturbation-, or loop-, expansion. Consider an irreducible diagram with L loops and N external lines, where we keep the external momenta p(1) , · · · , p(N ) fixed. It is obvious from the construction of the theory that the integrand is a purely rational function in L(4 − ε) variables. Observing that the external momenta span some N − 1 dimensional space, we now employ the fact that the integration in the remaining dimensions is rotationally invariant. There, we write the formula for the `-dimensional (Euclidean) sphere of radius r as Z d` kδ(k 2 − r2 ) = π `/2 `−2 r . Γ(`/2) (7.7) Here, Γ stands for the Euler gamma function, Γ(z) = (z − 1)! for integral z . It is at this point where we can decide that this expression defines the integral for any, possibly complex, value for `. It converges towards the usual values whenever ` happens to be a positive integer. After having used Eq. (7.7), one ends up with an integral over s variables kµ of a function f (k), where s is an integer, but f (k) contains ε-dependent powers of polynomials in k . Convergence or divergence of an integral can be read off from simple power counting arguments, and, at first sight, one sees hardly any improvement when ε is close to zero. However, what is achieved is that infra-red divergences (kµ → 0) are separated from the ultra-violet divergences (kµ → ∞), and this allows us to define the “finite parts” of the integrals unambiguously: • All integrals ds kf (k) are replaced by functionals I({f (k)}) that obey the same combinatorial rules as ordinary integrals: R I(αf1 + βf2 ) = αI(f1 ) + βI(f2 ) , I({f (k + q)}) = I({f (k)}) , 51 (7.8) • I(f ) = R ds kf (k) if this converges. • I(f ) = 0 if f (k) = (k 2 )p when 2p + s is not an integer. This latter condition is usually fulfilled, if we started with ε not integer. These rules are sufficient to replace any integral one encounters in a Feynman diagram by some finite expression. Note, however, that complications arise if one wants to use these rules when 2p + s is an integer, particularly when it is zero. In that case, the expression diverges in the ultra-violet and in the infra-red, so, in this case, it cannot be used to remove all divergences — it can only replace one by another. Consequently, our finite expressions tend to infinity as ε → 0. It is important to verify that dimensional regularization fully respects unitarity and the dispersion relations discussed above. Therefore, the ‘dimensionally regularized’ diagrams correspond to solutions of the dispersion relations and the unitarity relations, providing some ‘natural’ subtraction. 7.1.3. Equivalence of regularization schemes The subtractions provided by the various regularization schemes discussed above, in general, are not the same. At any given order, they do all obey the same dispersion relations of the form (6.22). If we ask, which amplitudes can be added to one scheme to reproduce the other, or, what is the amplitude of the difference between the two schemes (after having eliminated these differences at the order where the subdiagrams Di (k 0 ) had been computed), we find the following. This difference must be a Lorentz-covariant expression; and it can only come from the dimensionally regularized contributions of the unphysical Pauli-Villars ghosts in Eq. (6.22). Because of their large masses, only very large values of k 0 in this equation contribute. The p0 -dependence then must reduce to being a polynomial one (p being the momenta of the fixed external lines), and because of Lorentz-invariance, the expression must be polynomial in all components of pµ . This is exactly what can be achieved by putting a counter term inside the bare Lagrangian of the theory. This way, one derives that the various regulators differ from one another by different effective couplings in the bare Lagrangian. It is then a question of taste which regulator one prefers. Since dimensional regularization often completely respects local gauge-invariance8 , and also because it turned out to be very convenient and efficient in practice, one often prefers that. It should always be kept in mind, however, that dimensional regularization is something of a mathematical trick, and the physical expressions only make sense in the limit ε → 0. 8 Only in one case, there is a complication, namely, when there are Adler-Bell-Jackiw anomalies; see Section 8. 52 7.2. Renormalization of gauge theories Using the results from the previous Sections, we decide to treat quantum field theories in general, and gauge theories in particular, as follows: first, we regularize the theory, by using a ‘lattice cut-off’, or a Pauli-Villars cut-off, or by turning towards n = 4 − ε dimensions. All these procedures are characterized by a small parameter, such as ε, such that the physical theory is formally obtained in the limit ε → 0. These procedures are all equivalent, in the sense that by adding local interaction terms to the Lagrangian, one can map the results of one scheme onto those of another. Subsequently, we renormalize the theory. This means that all parameters in the Lagrangian are modified by finite corrections, which however may diverge in the limit ε → 0. If these counter terms have been chosen well, the theory may stay finite and well defined in this limit. In particular, we should have a unitary, causal theory. Unitarity is only guaranteed if the theory is gauge-invariant. Therefore, one prefers regulator schemes that preserve gauge-invariance throughout. This is what dimensional regularization often does. In that case, the renormalization procedure respects BRSTinvariance, see Subsect. 4.5. 8. Anomalies The Sections that follow will (again) be too brief to form a complete text for learning Quantum Field Theory. Our aim is here to give a summary of the features that are all extremely important to understand the general structure of relativistic Quantum Field Theories. If, for a given theory, no obviously gauge-invariant regularization procedure appears to exist, this might be for a reason: such a theory might not be renormalizable at all. In principle, this could be checked, as follows. One may always decide to use a regularization procedure that does not respect the symmetries one wants, provided that the symmetry can be restored in the limit where the physically observable effects of the regulator go away, such as ε → 0, or Λi → ∞, i > 0. If a gauge-invariant regulator does exist, but it hasn’t yet been explicitly constructed, then we know that it differs from any other regulator by a bunch of finite counter terms. To find such counter terms is not hard, in practice; just add all terms needed to restore BRST invariance of the amplitudes. But, in case that regulator is not known, how can we then be sure that such terms exist at all? BRST invariance requires the validity of the Slavnov-Taylor identities, but they appear to overdetermine the subtraction terms. This is the way we originally phrased the problem in Ref. [14]. In fact, indeed there may be a clash. If this happens, it is called an anomaly.[20] 53 Actually, the incidence of such anomalies is limited, fortunately. This is because for most theories completely gauge-invariant regulator techniques were found. Dimensional regularization often works. The one case where it does not is when there are chiral fermions. Classically, one may separate any fermionic field into a left-handed and a right handed part, as was mentioned in Subsection 4.1: P± = 12 (1 ± γ 5 ) ψ(x) = P+ ψL (x) + P− ψR (x) ; γ5 = 1 ε γ µγ ν γ αγ β 24 µναβ ; . (8.1) Indeed, since (γ 5 )2 = 1, the operators P± are genuine projection operators: P±2 = P± . The left- and right sectors of the fermions, see Eq. (4.22), may be separately gaugeinvariant, transforming differently under gauge transformations. This, however, requires γ 5 to anti-commute with all other γ µ , µ = 1, · · · , n. But, as we see from their definition, Eq. (8.1), γ 5 only anti-commutes with four of the γ µ , not all n. This is why the contributions from the −ε remaining dimensions will not be gauge-invariant. It was discovered by Bell and Jackiw[18], and independently by Adler[19], that no local counter term exists that obeys all symmetry conditions and has the desired dimensionality; Bell and Jackiw tried to use unconventional regulators, but those turned out not to be admissible. The basic culprit is the triangle diagram, Fig. 5(a), representing the matrix element of the axial vector current ψ γ µ γ 5 ψ in the field of two photons, each being coupled to the vector current ψ γ α ψ . p, α k, µ q, β (a) (b) Figure 5: (a) The anomalous triangle diagram. µ, α and β are the polarizations, k, p and q = k − p are the external momenta. (b) An anomalous diagram in non-Abelian theories For simplicity, we assume here the fermions to be massless. Let us call this amplitude then Γα,β µ (p, q). It is linearly divergent. Upon regularization, there are two counter terms, or subtraction terms, whose coefficients should be determined, in a correct combination with the finite parts of the amplitude. Limiting ourselves to the correct quantum numbers and dimensions, we find the two quantities, δ1 Γα,β µ (p, q) = εµαβγ pγ ; δ2 Γα,β µ (p, q) = εµαβγ qγ . 54 (8.2) We can determine their coefficients by applying the condition that the total amplitude be invariant under gauge transformations of the photon field. This implies that the expression must vanish when any of the two photons are longitudinal: Aµ = ∂µ Λ, which means pα Γα,β µ (p, q) = 0 ; qβ Γα,β µ (p, q) = 0 . (8.3) Since pα δ1 Γα,β µ (p, q) = qβ δ1 Γα,β µ (p, q) = Aµ,α ; 0; pα δ2 Γα,β µ (p, q) = Aµ,β ; Aµ,α = εµαβγ pγ qβ , qβ δ2 Γα,β µ (p, q) = 0 ; (8.4) this fixes the coefficients in front of δ1 Γ and δ2 Γ. When now we investigate whether this amplitude is also transversal with respect to the axial vector current, we are struck by a surprise. The counter terms, fixed by condition (8.4), also contribute here: α,β kµ δ1 Γα,β µ (p, q) = −kµ δ2 Γµ (p, q) = Aα,β , (8.5) but they do not cancel against the contribution of the finite part. After imposing gaugeinvariance with respect to the two vector insertions, one finds (in the case of a single chiral fermion) 2 −1 kµ Γα,β µ (p, q) = (4π ) εµαβγ pµ qγ , (8.6) and this can be rewritten as an effective divergence property of a vector current: ∂µ Jµ5 = − iLe2 Fµν F̃µν , 8π 2 (8.7) where F̃µν = 21 εµναβ Fαβ , and it was assumed that the photons couple with charges e. What is surprising about this is, that the triangle diagram itself, Fig. 5a, appears to be totally symmetric under all permutations, since γ 5 can be permuted to any of the other end-points. Imposing gauge-invariance at two of its three end-points implies breaking of the invariance at the third. This result is very important. It implies an induced violation of a conservation law, apparently to be attributed to the regularization procedure. It also means that it is not possible to couple three gauge bosons to such a triangle graph, because this cannot be done in a gauge-invariant way. In most theories, however, we have couplings both to left-handed and to right-handed fermions. Their contributions are of opposite sign, which means that they can cancel out. Therefore, one derives an important constraint on gauge theories with chiral fermions: The triangle anomalies must cancel out. 55 Let the left handed chiral fermions be in representations of the total set of gauge groups that transform as ψLi → ψLi + iΛa TLa ij ψLj (8.8) where Λ is infinitesimal, and TLa are the gauge generators for the left-handed fermions. Similarly for the right-handed ψRi . Define a b c b a c dabc L = Tr(TL TL TL + TL TL TL ) , (8.9) and similarly dabc R . The anomaly constraint is then X dabc L = X dabc R , (8.10) where the sum is over all fermion species in the theory. In the Standard Model, the only contributions could come if either one or all three indices of dabc refer to the U (1) group. One quickly verifies that indeed the U (1) charges of the quarks and leptons are distributed in such a way that (8.10) is completely verified, but only if the number of quark generations and lepton generations are equal. In Chapter 10.3, we will see the physical significance of this observation. Note, that in the non-Abelian case, there are also anomalies in diagrams with 4 external legs, see Fig. 5(b). They arise from the trilinear terms in Fµν F̃µν (the quadrilinear terms cancel). These are the only cases where the regularization procedure may violate gauge invariance. In diagrams with more loops, or sub diagrams with more external lines, regularization procedures could be found that preserve gauge invariance. 9. Asymptotic freedom 9.1. The Renormalization Group It was observed by Stueckelberg and Peterman[21] in 1953, that, although the perturbative expansion of a theory depends on how one splits up the bare parameters in the Lagrangian into lowest order parameters, and counter terms required for the renormalization, the entire theory should not depend on this. This they interpreted as an invariance, and the action of replacing parameters from lowest order to higher order corrections as a group operation. One obtains the ‘Renormalization Group’. There is only one instance where such transformtions really matter, and that is when one compares a theory at one mass- or distance-scale to the same theory at a different scale. A scale transformation must be associated with a replacement of counter terms. Thus, physicists began to identify the notion of a scale transformation as a ‘renormalization group transformation’. 56 Gell-Mann and Low[22] observed that this procedure can be used to derive the smalldistance behavior of QED. One finds that the effective fine-structure constant depends on the scale µ, described by the equation µ2 d α2 N + O(α3 ) , α(µ) = β(α) = dµ2 3π (9.1) where N is the number of charged fermion types. As long as α(µ) stays small, so that the O(α3 ) terms can be neglected, we see that its µ-dependence is α(µ) = α0 , 1 − (α0 N/3π) log(µ2 /µ20 ) if α(µ0 ) = α0 . (9.2) Things run out of control when µ reaches values comparable to exp(3π/2N α0 ), but, at least in the case of QED, where α0 ≈ 1/137, this mass scale is so large that in practice no problems are expected. The pole in Eq. (9.2) is called the Landau pole; Landau concluded that quantum field theories such as QED have no true continuum limit because of this pole. Gell-Mann and Low suspected, however, that β(α) might have a zero at some large value of α, so that, at high values of µ, α approaches this value, but does not exceed this stationary point. What exactly happens at or near the Landau pole, cannot be established using perturbation expansion alone, since this will depend on all higher order terms in Eq. (9.1); in fact, it is not even known whether Quantum Field Theory can be reformulated accurately enough to decide. The question, however, might be not as important as it seems, since the Landau pole will be way beyond the Planck mass, where we know that gravitational terms will take over; it will be more important to solve Quantum Gravity first. An entirely different situation emerges in theories where the function β(λ) is negative. It was long thought that this situation can never arise, unless the coupling strength λ itself is given the wrong sign (the sign that would render the energy density of the classical theory unbounded from below), but this turns out only to be the case in theories that only contain scalar and spinor fields. If there is a non-Abelian Yang-Mills component in the theory, negative β functions do occur. In the simplest case, an SU (2) gauge theory with Nf fermions in the elementary doublet representation, the beta function is Nf − 11 4 µ2 d 2 g (µ) = β(g 2 ) = g (µ) + O(g 6 ) , 2 dµ 24π 2 (9.3) so, as long as Nf < 11 we have that the coupling strength g(µ) tends to zero, logarithmically, as µ → ∞. This feature is called asymptotic freedom. In an SU (Nc ) gauge theory, Nc , so, with the present number of Nf = 6 the β function is proportional to Nf − 11 2 quark flavors, QCD (Nc = 3) is asymptotically free. In line with a notation often used, the subscript c here stands for ”colour”; in QCD, the number of colours is Nc = 3. 57 9.2. An algebra for the beta functions In theories with gauge fields, fermions, and scalars, the situation is more complex. A general algorithm for the beta functions has been worked out. The most compact notation for the general result can be given by writing how the entire Lagrange density L scales under a scale transformation. Let the Lagrangian L be L = − 41 Gaµν Gaµν − 12 (Dµ φi )2 − V (φ) − ψ i γµ Dµ ψi −ψ i Sij (φ) + iγ5 Pij (φ) ψj , (9.4) where the covariant derivatives are defined as follows:9 Dµ φi ≡ ∂µ φi + iTija Aaµ φj ; Dµ ψi = ∂µ ψi + iUija Aaµ ψj , (9.5) and the structure constants f abc are defined by [T a , T b ] = −if abc Tc , (9.6) Gaµν = ∂µ Aaν − ∂ν Aaµ + f abc Abµ Acν . (9.7) so that We split the fermions into right- and left-handed representations, so that U a = ULa PL + URa PR ; PL = 1 + γ5 , 2 PR = 1 − γ5 . 2 (9.8) The functions S(φ) and P (φ) are at most linear in φ and V (φ) is at most quartic. The Lagrangian (9.4) is the gauge-invariant part; we do not write the gauge-fixing part or the ghost; the final result will not depend on those details. The result of an algebraical calculation is that µ2 dL = dµ2 11 1 1 Gaµν Gbµν [− C1ab + C2ab + C3ab ] − ∆V − ψ(∆S + iγ5 ∆P )ψ , 12 24 6 16π 2 (9.9) in which C1ab C2ab C3ab ∆V = f apq f bpq , = Tr (T a T b ) , = Tr (ULa ULb + URa URb ) , = 41 Vij2 − 23 Vi (T 2 φ)i + 34 (φT a T b φ)2 +φi Vj Tr (S,i S,j + P,i P,j ) − Tr (S 2 + P 2 )2 + Tr [S, P ]2 , 9 T and U are hermitean, but since φ is real, the elements of T must be imaginary. 58 (9.10) (9.11) (9.12) (9.13) where Vi ≡ ∂V (φ) ; ∂φi S,i ≡ ∂S , ∂φi etc., (9.14) and writing S + iP = W , one finds ∆S and ∆P from ∆W = 1 Wi Wi∗ W + 14 W Wi∗ Wi + Wi W ∗ Wi 4 − 32 (UR2 W ) − 23 W (UL )2 + Wi φj Tr (Si Sj + Pi P j ) . (9.15) This expression does not include information on how fields φi and ψi transform under scaling. The fields are not directly observable. This algebraic expression can be used to find how, in general, coupling strengths run under rescalings of the momenta. It is an interesting exercise to work out what the conditions are for asymptotic freedom, that is, for all coupling strengths to run to zero at infinite momentum. In general, one finds that scalar fields can only exist if there are also gauge fields and fermions present; the latter must be in sufficiently high representations of the gauge group. 10. Topological Twists The Lagrangian (9.4) is the most general one allowed if we wish to limit ourselves to coupling strengths that run logarithmically under rescalings of the momenta, see for instance Eq. (9.2). Such theories have a domain of validity that ranges over exponentially large values of the momenta (in principle over all momenta if the theory is asymptotically free). The most striking feature of this general Lagrangian is that it is topologically highly non-trivial. Locally stable field configurations may exist that have some topological twist in them. In particular, this can be made explicit in the case of a Brout-Englert-Higgs mechanism. Here, these twists can already be seen at the classical level (i.e., ignoring quantum effects). If we say that a scalar field φi has a vacuum expectation value, then this means that we perform our perturbation expansion starting with a field value of the form φi = (F, 0, · · ·) in the vacuum, after which field fluctuations δφ around this value are assumed to be small. One assumes that the potential V (φ) has its minimum there. This may appear to violate gauge-invariance, if φi transform into each other under local gauge transformations, but strictly speaking the phrase “spontaneous breakdown of local gauge symmetry” is inappropriate, because it may also simply mean that we choose a gauge condition. It is however a fact that the spectrum of physical particles comes out to be altogether different if we perturb around φi = 0, so this ‘Higgs mode’ is an important notion in any case. 59 10.1. Vortices If the Higgs field has only two real components (such as when U (1) is broken into the identity group), one may consider a configuration where this field makes a full twist over 360◦ when following a closed curve. Inside the curve there must be a zero. The zeros must form a curve themselves, and they cost energy. This is the Abrikosov vortex. Away from its center, one may transform φi back to a constant value, but this generates a vector potential Aµ (x), obeying I Aµ dxµ = 2π , e (10.1) which means that this vortex carries an amount of magnetic flux, of magnitude exactly 2π/e. Apparently, in this model, magnetic field lines condense into locally stable vortices.[23] This is also what happens to magnetic fields inside a superconductor. 10.2. Magnetic Monopoles Something similar may happen if the Higgs field has three real components. In that case, one can map the S2 sphere of minima of V (φ), onto a sphere in 3-space. There will be isolated zeros inside this sphere. These objects behave as locally stable particles. If one tries to transform the field locally to a constant value, one finds that a vector potential again may emerge. If, for example, in an SU (2) theory, a Higgs in the adjoint representation (which has 3 real components) breaks the gauge group down to U (1), then one finds the vector potential of an isolated magnetic source inside the sphere. This means that the source , where e is the original coupling is a magnetic monopole with magnetic charge gm = 4π e strength of the SU (2) theory. Indeed, Dirac[24] has derived, back in 1931, that magnetic charges gm and electric charges q must obey the Dirac quantization condition q gm = 2πn . (10.2) Apparently, for the monopole in this model, n = 2. However, it is easy to introduce particles in the elementary representation, which have q = 21 e; these then saturate the Dirac condition (10.2). Dirac could not say much about the mass of his magnetic monopoles. In the present theories, however, the mass is calculable. In general, the magnetic monopole mass turns out to be the mass of an ordinary particle divided by a number of the order of the gauge coupling strength squared. Careful analysis of the existing Lie groups and the way they may be broken spontaneously into one or more subgroups U (1), reveals a general feature: Only if the underlying 60 gauge group is compact, and has a compact covering group, must electric charges in the U (1) gauge groups be quantized (otherwise, it would not be forbidden to add arbitrary real numbers to the U (1) charges), and whenever the covering group of the underlying gauge group is compact, magnetic monopole solutions can be constructed. Apparently, whenever the gauge group structure provides for a compelling reason for electric charges to be quantized, the existence of magnetic monopole solutions is guaranteed. Thus, assuming that Nature has compelling reasons for the charge units of electrons and protons to be equal, and quantized into multiples of e, we must assume that magnetic monopole solutions must exist. However, in most ‘Grand Unification Schemes’, the relevant mass scale is many orders of magnitude higher than the mass scale of particles studied today, so the monopoles, whose mass is that divided by a coupling strength squared, are even heavier. From the structure of the Higgs field of a monopole, one derives that the system is invariant under rotations provided that rotations are associated with gauge rotations. A consequence of this is, that elementary particles with half-odd isospin, when bound to a monopole, produce bound states with half-odd integer orbital angular momentum[25]. What is strange about this, is that such particles should develop Dirac statistics. Indeed, one can derive that both the spin and the statistics of bound states of electric and magnetic charges, flip from Bose-Einstein to Fermi-Dirac or back[26] if they form odd values of the Dirac quantum n (Eq. 10.2). 10.3. Instantons A Higgs field with two real components gives rise to vortices, a Higgs with three components gives magnetic monopoles, so what do we get if a Higgs field has four real components? This is the case if, for instance, SU (2) is broken spontaneously into the identity by a Higgs in the fundamental representation (two complex = 4 real components). The topologically stable objects one finds are stable points in four-dimensional space-time. They represent events, and, referring to their particle-like appearance, the resulting solutions (in Euclidean space) were called ‘instantons’. Because this Higgs field, in the case of SU (2), breaks the gauge symmetry completely, one can argue that this solution is also topologically stable in pure gauge theories, without a Higgs mechanism at all. Far from the origin, the vector potential field is described as a local gauge rotation of the value Aaµ (x) = 0. The gauge rotation in question, Ω(x), is described by noting that the SU (2) matrices form an S3 space, i.e., the three dimensional surface of a sphere in four dimensions. Mapping this S3 one-to-one onto the boundary of some large region in (Euclidean) space-time, gives the field configuration of an instanton. It was noted by Belavin, Polyakov, Schwarz and Tyupkin[27] (who also were the first 61 to write down this solution) that this solution has a non-vanishing value of Z a a d4 x Fµν F̃µν = 32π 2 . g2 (10.3) The integrand is the divergence of a current: a a Fµν F̃µν = ∂µ Kµ ; Kµ = 2εµναβ Aaν (∂α Aaβ + 31 gf abc Abα Acβ ) , (10.4) the so-called Chern-Simons current. This current, however, is not gauge-invariant, which is why it does not vanish at infinity. It does vanish after the gauge transformation Ω(x) that replaces Aaµ at infinity by 0. Eq. (10.3) is most easily derived by using this ChernSimons current. It so happens that the instanton is also a solution of the equation Fµν = F̃µν , (10.5) so that we also find the action to be given by −8π 2 /g 2 . In a pure gauge theory (one without fermions), instantons can be interpreted as representing tunneling transitions. In ordinary Quantum Mechanics, tunneling is an exponentially suppressed transition. The exponential suppression is turned into an oscillating expression if we replace time t by an imaginary quantity iτ . The oscillating exponent is the action of a classical transition in imaginary time. One may also say that a tunneling transition can be described by a classical mechanical transition if the potential V (~q) is replaced by 2E − V (~q), where E is the energy. The classical action then represents the quantity in the exponent of the (exponentially suppressed) tunneling transition. The above substitution is exactly what one gets by replacing time t by iτ . In relativistic Quantum Field Theory, this is also exactly the Wick rotation from Minkowski space-time into Euclidean space-time. In short, instantons represent tunneling that is 2 2 associated with the suppression factor e−8π /g . The transition can be further understood by formulating a gauge theory in the temporal gauge, A0 = 0. In this gauge, there is a residual invariance under gauge transformations Λ(x) that are time-independent. All ‘physical states’, therefore, come as representations of this local gauge group. Normally, R however, we restrict ourselves to the i Λ(x)d3 x trivial representation, Ω|ψi = |ψi, where Ω = e , because this configuration is conserved in time, and because any other choice would violate Lorentz invariance. However, closer analysis shows that one only has to impose this constraint for those gauge transformations that can be continuously reached from the identity transformation. This is not the case for transformations obtained by mapping the S3 space of the SU (2) transformations non-trivially onto three-space R3 . These transformations form a discrete set, characterized by the integers k = 0, ±1, ±2, . . .. Writing Ωk (x) = Ω1 (x)k , Ωk |ψi = eiθk |ψi , 62 (10.6) we find that the tunneling transitions described by instantons cause an exponentially suppressed θ dependence of physical phenomena in the theory. Since, under parity transformations P , the angle θ turns into −θ , a non-vanishing θ also implies an explicit parity (eventually, P C ) violation of the strong interactions. In the presence of fermions, the situation is altogether different. Due to the chiral anomaly, we have for the current of chiral fermions Jµ5 (x), the equation (8.7). The total R number of chiral fermions, Q5 = d3 xJ05 (x) changes by one unit due to an instanton: ∆Q5 = ±1. This can be understood by noting that the Dirac equation for massless, chiral fermions has one localized solution in the Euclidean space of an instanton. In Minkowski space-time, this solution turns into a state that describes a chiral fermion either disappearing into the Dirac sea, or emerging from it, so that, indeed, the number of particles minus anti-particles changes by one unit for every chiral fermion species. If left- and right handed fermions are coupled the same way to the gauge field, as in QCD, the instanton removes a left-handed fermion and creates a right-handed one, or, in other words, it flips the chirality. This ∆Q5 = ±2 event has exactly the quantum numbers of a mass term for the Goldstone boson that would be associated to the conservation of chiral charge, the η particle. This explains why the η particle is considerably heavier than the pions, which have lost most of their mass due to chiral symmetry of the quarks.[28] What one concludes from the study of instantons is that QCD, the theory for the strong interactions, neatly explains the observed symmetry structure of the hadron spectrum, including the violation of chiral charge conservation that accounts for the η mass. In the electro-weak sector, one also has instantons. We now see that the cancellation of the anomalies in the quark and the lepton sector implies an important property of the electro-weak theory: since the anomalies do not respect gauge-invariance of the quark sector alone, quarks can be shown not to be exactly conserved. One finds that instantons induce baryon number violating events: three baryons (nine different quarks all together) may transmute into three anti-leptons, or vice versa. 11. Confinement An important element in the Standard Model is the gauge theory for the strong interactions, based on the gauge group SU (3). Quarks are fermions in the elementary representation of SU (3). The observed hadronic particles all are bound states of quarks and/or anti-quarks, in combinations that are gauge-invariant under SU (3). An important question is: what is the nature of the forces that binds these quarks together? We have seen that vortex solutions can be written down that would cause an interesting force pattern among magnetic monopoles: in a Higgs theory with magnetic monopoles, these monopoles could be bound together with Abrikosov vortices. 63 Indeed, this would be a confining force: every magnetic monopole must be the end point of a vortex, whose other end point is a monopole of opposite magnetic charge. Indeed, the confinement would be absolute: isolated monopoles cannot exist. It was once thought that, therefore, quarks must be magnetic monopoles. This, however, would be incompatible with the finding that quarks only interact weakly at small distances, magnetic charges being always quite strong. A more elegant idea is that the binding force forms electric rather than magnetic vortices. An electric vortex can be understood as the dual transformation of a magnetic vortex. It comes about when the Brout-EnglertHiggs mechanism affects freely moving magnetically charged particles. Further analytic arguments, as well as numerical investigations, have revealed that indeed such objects are present in QCD, and that the Higgs mechanism may occur in this sector. Let us briefly explain the situation in words. 11.1. The maximally Abelian gauge A feature that distinguishes non-Abelian gauge theories from Abelian ones, is that a reference frame for the gauge choice, the gauge condition, can partly be fixed locally in terms of the pure gauge fields alone; noticing that the covariant field strengths Gµν transform as the adjoint representation, one may choose the gauge such that one of these components, say G12 , is diagonal. This then removes the non-Abelian part of the gauge group, but the diagonal part, called the Cartan subgroup, remains. In this way, a non-Abelian gauge theory turns into an Abelian one. A slightly smarter, but non-local gauge that does the P same is the condition that i6=j (Aµij )2 is minimized. It is called the maximally Abelian gauge. However, such a gauge choice does produce singularities. These typically occur when two eigenvalues of G12 coincide. It is not difficult to convince oneself that these singularities behave as particles, and that these particles carry magnetic charges with respect to the Cartan subgroup. Absolute confinement occurs as soon as these magnetically charged particles undergo a Brout-Englert-Higgs mechanism. Although this still is the preferred picture explaining the absolute nature of the quark confining force, it may be noted that the magnetically charged particles do not have to be directly involved with the confinement mechanism. Rather, they are indicators. This, we deduce from the fact that confinement also occurs in theories with a very large number Nc of colors; in the limit Nc → ∞, magnetically charged particles appear to be suppressed in the perturbative regime, but the electric vortices are nevertheless stable. The strength of a vortex is determined by its finite width, and this width is controlled by the lightest gluonic state, the ‘glueball’. At distance scales large compared to the inverse mass of the lightest glueball, an electric vortex cannot break. Confinement is a condensation phase that is a logical alternative of the Brout-Englert64 Higgs phase. In some cases, however, these two phases may coexist. An example of such a coexistence is the SU (2) sector of the Standard Model. Conventionally, this sector is viewed as a prototype of the Higgs mechanism, but it so happens that the SU (2) sector of the Standard Model can be treated exactly like the colour SU (3) sector: as if there is confinement. To see this, one must observe that the Higgs doublet field can be used to fix the SU (2) sector of the gauge group unambiguously. This means that all physical particles can be connected to gauge-invariant sources by viewing them as gauge-invariant bound states of the Higgs particle with the other elementary doublets of the model. For instance, writing the Higgs doublet as φa = F0 + φ̃a , and the lepton doublet as ψa , the electron is seen to be associated to the ‘baryonic’ field εab φa ψb , the neutrino is φ∗a ψa , the Z0 boson is φ∗a Dµ φa , and so on. Theories in which the confinement phase is truly distinct from the Higgs phase are those where the Higgs field is not a one-to-one representation of the gauge group, such as the adjoint representation of SU (2). 12. Outlook Quantum Field Theory has reached a respectable status as an accurate and well-studied description of sub-atomic particles. From a purely mathematical point of view, there are some inherent limitations to the accuracy by which it defines the desired amplitudes, but in nearly all conceivable circumstances, its intrinsic accuracy is much higher than what can be reached in experiments. This does not mean that we can reach such accuracy in real calculations, which more often than not suffer from technical limitations, particularly where the interactions are strong, as in QCD. In this domain, there is still a need for considerable technical advances. 12.1. Naturalness When the Standard Model, as known today, is extrapolated to energy domains beyond approximately 1 TeV, a difficulty is encountered that is not of a mathematical nature, but rather a physical one: it becomes difficult to believe that it represents the real world. The bare Lagrangian, when considered on a very fine lattice, is required to have parameters that must be tuned very precisely in order to produce particles such as the Higgs particle and the weak vector bosons, whose masses are much less than 1 TeV. This fine-tuning is considered to be unnatural. In a respectable physical theory, such a coincidence is not expected. With some certainty, one can state that the fundamental laws of Nature must allow for a more elegant description at high energies than a lattice with such fine-tuning. What is generally expected is either a new symmetry principle or possibly a new regime with an altogether different set of physical fields. 65 A candidate for a radically different regime is the so-called technicolour theory, a repetition of QCD but with a typical energy scale of a TeV rather than a GeV. The quarks, leptons and Higgs particles of the Standard Model would then all turn out to be the hadrons of this technicolour theory. Different gauge groups could replace SU (3) here. However, according to this scheme, a new strong interaction regime would be reached, where perturbation expansions used in the weak sector of the Standard Model would have to break down. As precision measurements and calculations continue to confirm the reliability of these perturbation expansions, the technicolour scenario is considered to be unlikely. 12.2. Supersymmetry A preferred scenario is a simple but beautiful enhancement of the symmetries of the Standard Model: supersymmetry. This symmetry, which puts fermions and bosons into single multiplets, does not really modify the fundamental aspects of the theory. But it does bring about considerable simplifications in the expressions for the amplitudes, not only in the perturbative sector, but also, in many cases, it allows us to look deeper into the non-perturbative domains of the theories. There is a vast amount of literature on supersymmetry, but some aspects of it are still somewhat obscure. We would like to know more about the physical origin and meaning of supersymmetry, as well as the mechanism(s) causing it to be broken — and made almost invisible — at the domain of the Standard Model that is today accessible to experimental observation. 12.3. Resummation of the Perturbation Expansion The perturbation expansion in Quantum Field Theory is almost certain to be divergent for any value of the coupling parameter(s). A simple argument for its divergence has been put forward by Dyson[29]: imagine that in the theory of QED there were a bound ε such that, whenever |α| < ε, where α is the fine-structure constant, perturbation expansions would converge. Then it would converge for some negative real value of α. However, one can easily ascertain that for any negative value of α, the vacuum would be unstable: vacuum fluctuations would allow large numbers of electrons to be pair-created, and since like charges attract, highly charged clouds of electrons could have negative energies. Theories with asymptotic freedom may allow for a natural way to re-sum the perturbation series, by first solving the theory at high energy with extreme precision, after which one has to integrate the Schrödinger equation to obtain the physical amplitudes at lower energy. Such a program has not yet been carried out, because integrating these Schrödinger equations is beyond our present capabilities, but one may suspect that, as a matter of principle, it should be possible. Theories that are not asymptotically free may 66 perhaps allow for more precise treatments if an ultra-violet fixed point can be established. The extent of the divergence of the perturbation expansion can be studied or predicted. This one does using the Borel resummation technique. An amplitude Γ(λ) = ∞ X an λ n , (12.1) n=1 can be rewritten as Γ(λ) = B(z) = Z ∞ 0 ∞ X B(z)e−z/λ dz , ak+1 z k /k! . (12.2) k=0 The series for B(z) is generally expected to have a finite radius of convergence. If B(z) can be analytically extended to the domain 0 ≤ z < ∞, then that (re-)defines our amplitude. In general, however, one can derive that B(z) must have singularities on the real axis, for instance where z corresponds to the action of instantons or instanton pairs. In addition, singularities associated to the infrared and/or ultraviolet divergences of the theory are expected. Sometimes, these different singularities interfere. 12.4. General Relativity and Superstring Theory It is dubious, however, whether the issue of convergence or divergence of the perturbation expansion is of physical relevance. We know that Quantum Field Theory cannot contain the entire truth concerning the sub-atomic world; the gravitational force is guaranteed not to be renormalizable, so at those scales where this force becomes comparable to the other forces, the so-called Planck scale, a radically new theory is called for. Superstring Theory is presently holding the best promise to evolve into such a theory. With this theory, physicists are opening a new chapter, where we leave conventional Quantum Field Theory, as described in this paper, behind. In its present form, superstring theory appears to have turned into a collection of wild ideas called M -theory, whose foundations are still extremely shaky. Some of the best minds of the world are competing to turn this theory into something that can be used to provide for reliable predictions and that can be taught in a text book, but this has not yet been achieved. Acknowledgement Te author thanks Xingyang Yu for informing him of some typos and a sign error in Eq. (4.8) that may have caused confusion. 67 References [1] A. Pais. Inward bound: of matter and forces in the physical world, Oxford University Press 1986. See also: R.P.Crease and C.C. Mann, The Second Creation: makers of the revolution in twentieth-century physics, New York, Macmillan, 1986. [2] B. de Wit and J. Smith, Field Theory in Particle Physics, North Holland, Amsterdam 1986, ISBN 0444 86 999 9; I.J.R. Aitchison and A.J.G. Hey, Gauge Theories in Particle Physics, Adam Hilger, 1989, ISBN 0-85274-328-9; L.H. Ryder, Quantum Field Theory, 2nd edition, Cambridge Univ. Press, 1985, 1996, ISBN 0521 47242 3 hb 0521 47814 6 pb; C. Itzykson and J.-B. Zuber, Quantum Field Theory, McGraw Hill, New York, 1980; T.-P. Cheng and L.-F. Li, Gauge theory of elementary particle physics, Clarendon Press, Oxford, 1984, ISBN 0 19 851961 3 (Pbk). [3] C.N. Yang and R.L. Mills, Physical Review 95 (1954) 631. [4] B.S. DeWitt, Physical Review Letters 12 (1964) 742. [5] B.S. DeWitt, Physical Review 160 (1967) 1113; ibid. 162 (1967) 1195, 1239. [6] L.D. Faddeev and V.N. Popov, Physics Letters 25B (1967) 29; L.D. Faddeev, Theoretical and Mathematical Physics 1 (1969) 3 (in Russian) L.D. Faddeev, Theoretical and Mathematical Physics 1 (1969) 1 (Engl. transl) L. Faddeev, L. Takhtajan, Soviet Journal of Mathematics 24 (1984) 241. [7] C. Becchi, A. Rouet and R. Stora, Communications in Mathematical Physics 42 (1975) 127; id., Annals of Physics (N.Y.) 98 (1976) 287. [8] I.V. Tyutin, Lebedev Prepr. FIAN 39 (1975), unpubl. [9] G. ’t Hooft and M. Veltman, Nuclear Physics B44 (1972) 189. [10] F. Englert and R. Brout, Physical Review Letters 13 (1964) 321. [11] P.W. Higgs, Physics Letters 12 (1964) 132; Physical Review Letters 13 (1964) 508; Physical Review 145 (1966) 1156. [12] L. Hoddeson et al, The Rise of the Standard Model, Particle Physics in the 1960s and 1970s, Cambridge Univ. Press, 1997, ISBN 0-521-57816-7. [13] G. ’t Hooft and M. Veltman, DIAGRAMMAR, CERN Report 73/9 (1973), reprinted in Particle Interactions at Very High Energies, Nato Adv. Study Inst. Series, Sect. B, vol. 4b, p. 177; reprinted in G. ’t Hooft, Under the Spell of the Gauge Principle, Adv Series in Math. Physics Vol 19, World Scientific, Singapore, 1994, p. 28. 68 [14] G. ’t Hooft, Nuclear Physics B33 (1971) 173. [15] G. ’t Hooft and M. Veltman, Nuclear Physics B50 (1972) 318. [16] A. Slavnov, Theoretical and Mathematical Physics 10 (1972) 153 (in Russian), Theoretical and Mathematical Physics 10 (1972) 99 (Engl. Transl.) [17] J.C. Taylor, Nuclear Physics B33 (1971) 436. [18] J.S. Bell and R. Jackiw, Il Nuovo Cimento 60A (1969) 47. [19] S.L. Adler, Physical Review 177 (1969) 2426; S.L. Adler and W.A. Bardeen, Physical Review 182 (1969) 1517; W.A. Bardeen, Physical Review 184 (1969) 1848. [20] R. Jackiw, Diverse topics in Theoretical and Mathematical Physics, World Scientific 1995, Chapter 1. [21] E,C,G. Stueckelberg and A. Peterman, Helvetica Physica Acta 26 (1953) 499; N.N. Bogoliubov and D.V. Shirkov, Introduction to the theory of quantized fields (Interscience, New Yark, 1959). [22] M. Gell-Mann and F. Low, Physical Review 95 (1954) 1300. [23] H.B. Nielsen and P. Olesen, Nuclear Physics B61 (1973) 45. [24] P.A.M. Dirac, Proceedings of the Royal Society A133 (1931) 60; Physical Review 74 (1948) 817. 9. [25] P. Hasenfratz and G. ’t Hooft , Physical Review Letters 36 (1976) 1119. [26] A.S. Goldhaber, Physical Review Letters 36 (1976) 1122. [27] A.A. Belavin, A.M. Polyakov, A.S. Schwartz and Y.S. Tyupkin, Physics Letters 59 B (1975) 85. [28] G. ’t Hooft, Physics Reports 142 (1986) 357. [29] F. Dyson, Physical Review 85 (1952) 631. 69

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download THE CONCEPTUAL BASIS OF QUANTUM FIELD THEORY