Landau
Lifshitz
The Classical
Theory of Fields
Third Revised English Edition
Course of Theoretical Physics
Volume 2
L. D. Landau (Deceased) and E. M. Lifshitz
Institute of Physical Problems
USSR Academy of Sciences
Pergamon Press
Course of Theoretical Physics
Volume 2
THE CLASSICAL THEORY
OF FIELDS
Third Revised English Edition
L. D. LANDAU (Deceased) and
E. M. LIFSHITZ
Institute of Physical Problems, USSR Academy
of Sciences
This third English edition of the book
has been translated from the fifth
revised and extended Russian edition
published in 1967. Although much
new material has been added, the
subject matter is basically that of the
second English translation, being a
systematic presentation of
electromagnetic and gravitational fields for
postgraduate courses. The largest
additions are four new sections
entitled "Gravitational Collapse",
"Homogeneous Spaces", "Oscillating
Regime of Approach to a Singular
Point", and "Character of the
Singularity in the General Cosmological
Solution of the Gravitational Equations".
These additions cover some of the
main areas of research in general
relativity.
OTHER TITLES IN THE SERIES
Vol. 1. Mechanics
Vol. 3. Quantum Mechanics: Non-Relativistic Theory
Vol. 4. Relativistic Quantum Theory
Vol. 5. Statistical Physics
Vol. 6. Fluid Mechanics
Vol. 7. Theory of Elasticity
Vol. 8. Electrodynamics of Continuous Media
Vol. 9. Physical Kinetics
THE CLASSICAL THEORY
OF
FIELDS
Third Revised English Edition
L. D. LANDAU AND E. M. LIFSHITZ
Institute for Physical Problems, Academy of Sciences of the U.S.S.R.
Translated from the Russian
by
MORTON HAMERMESH
University of Minnesota
PERGAMON PRESS
OXFORD • NEW YORK • TORONTO
SYDNEY • BRAUNSCHWEIG
Pergamon Press Ltd., Headington Hill Hall, Oxford
Pergamon Press Inc., Maxwell House, Fairview Park, Elmsford,
New York 10523
Pergamon of Canada Ltd., 207 Queen's Quay West, Toronto 1
Pergamon Press (Aust.) Pty. Ltd., 19a Boundary Street,
Rushcutters Bay, N.S.W. 2011, Australia
Vieweg & Sohn GmbH, Burgplatz 1, Braunschweig
Copyright © 1971 Pergamon Press Ltd.
All Rights Reserved. No part of this publication may be
reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without the prior permission of
Pergamon Press Ltd.
First English edition 1951
Second English edition 1962
Third English edition 1971
Library of Congress Catalog Card No. 73140427
Translated from the 5th revised edition
of Teoriya Pola, Nauka, Moscow, 1967
Printed in Great Britain by
THE WHITEFRIARS PRESS LTD., LONDON AND TONBRIDGE
08 016019
CONTENTS
Preface to the Second English Edition ix
Preface to the Third English Edition x
Notation xi
Chapter 1. The Principle of Relativity 1
1 Velocity of propagation of interaction 1
2 Intervals 3
3 Proper time 7
4 The Lorentz transformation 9
5 Transformation of velocities 12
6 Four-vectors 14
7 Four-dimensional velocity 21
Chapter 2. Relativistic Mechanics 24
8 The principle of least action 24
9 Energy and momentum 25
10 Transformation of distribution functions 29
11 Decay of particles 30
12 Invariant cross-section 34
13 Elastic collisions of particles 36
14 Angular momentum 40
Chapter 3. Charges in Electromagnetic Fields 43
15 Elementary particles in the theory of relativity 43
16 Four-potential of a field 44
17 Equations of motion of a charge in a field 46
18 Gauge invariance 49
19 Constant electromagnetic field 50
20 Motion in a constant uniform electric field 52
21 Motion in a constant uniform magnetic field 53
22 Motion of a charge in constant uniform electric and magnetic fields 55
23 The electromagnetic field tensor 60
24 Lorentz transformation of the field 62
25 Invariants of the field 63
Chapter 4. The Electromagnetic Field Equations 66
26 The first pair of Maxwell's equations 66
27 The action function of the electromagnetic field 67
28 The four-dimensional current vector 69
29 The equation of continuity 71
30 The second pair of Maxwell equations 73
31 Energy density and energy flux 75
32 The energy-momentum tensor 77
33 Energy-momentum tensor of the electromagnetic field 80
34 The virial theorem 84
35 The energy-momentum tensor for macroscopic bodies 85
Chapter 5. Constant Electromagnetic Fields 88
36 Coulomb's law 88
37 Electrostatic energy of charges 89
38 The field of a uniformly moving charge 91
39 Motion in the Coulomb field 93
40 The dipole moment 96
41 Multipole moments 97
42 System of charges in an external field 100
43 Constant magnetic field 101
44 Magnetic moments 103
45 Larmor's theorem 105
Chapter 6. Electromagnetic Waves 108
46 The wave equation 108
47 Plane waves 110
48 Monochromatic plane waves 114
49 Spectral resolution 118
50 Partially polarized light 119
51 The Fourier resolution of the electrostatic field 124
52 Characteristic vibrations of the field 125
Chapter 7. The Propagation of Light 129
53 Geometrical optics 129
54 Intensity 132
55 The angular eikonal 134
56 Narrow bundles of rays 136
57 Image formation with broad bundles of rays 141
58 The limits of geometrical optics 143
59 Diffraction 145
60 Fresnel diffraction 150
61 Fraunhofer diffraction 153
Chapter 8. The Field of Moving Charges 158
62 The retarded potentials 158
63 The Lienard-Wiechert potentials 160
64 Spectral resolution of the retarded potentials 163
65 The Lagrangian to terms of second order 165
Chapter 9. Radiation of Electromagnetic Waves 170
66 The field of a system of charges at large distances 170
67 Dipole radiation 173
68 Dipole radiation during collisions 177
69 Radiation of low frequency in collisions 179
70 Radiation in the case of Coulomb interaction 181
71 Quadrupole and magnetic dipole radiation 188
72 The field of the radiation at near distances 190
73 Radiation from a rapidly moving charge 193
74 Synchrotron radiation (magnetic bremsstrahlung) 197
75 Radiation damping 203
76 Radiation damping in the relativistic case 208
77 Spectral resolution of the radiation in the ultrarelativistic case 211
78 Scattering by free charges 215
79 Scattering of low-frequency waves 220
80 Scattering of high-frequency waves 221
Chapter 10. Particle in a Gravitational Field 225
81 Gravitational fields in nonrelativistic mechanics 225
82 The gravitational field in relativistic mechanics 226
83 Curvilinear coordinates 229
84 Distances and time intervals 233
85 Covariant differentiation 236
86 The relation of the Christoffel symbols to the metric tensor 241
87 Motion of a particle in a gravitational field 243
88 The constant gravitational field 247
89 Rotation 253
90 The equations of electrodynamics in the presence of a gravitational field 254
Chapter 11. The Gravitational Field Equations 258
91 The curvature tensor 258
92 Properties of the curvature tensor 260
93 The action function for the gravitational field 266
94 The energy-momentum tensor 268
95 The gravitational field equations 272
96 Newton's law 278
97 The centrally symmetric gravitational field 282
98 Motion in a centrally symmetric gravitational field 287
99 The synchronous reference system 290
100 Gravitational collapse 296
101 The energy-momentum pseudotensor 304
102 Gravitational waves 311
103 Exact solutions of the gravitational field equations depending on one variable 314
104 Gravitational fields at large distances from bodies 318
105 Radiation of gravitational waves 323
106 The equations of motion of a system of bodies in the second approximation 325
Chapter 12. Cosmological Problems 333
107 Isotropic space 333
108 Space-time metric in the closed isotropic model 336
109 Space-time metric for the open isotropic model 340
110 The red shift 343
111 Gravitational stability of an isotropic universe 350
112 Homogeneous spaces 355
113 Oscillating regime of approach to a singular point 360
114 The character of the singularity in the general cosmological solution of the gravitational equations 367
Index 371
PREFACE
TO THE SECOND ENGLISH EDITION
This book is devoted to the presentation of the theory of the electromagnetic and
gravitational fields. In accordance with the general plan of our "Course of Theoretical
Physics", we exclude from this volume problems of the electrodynamics of continuous
media, and restrict the exposition to "microscopic electrodynamics", the electrodynamics
of the vacuum and of point charges.
A complete, logically connected theory of the electromagnetic field includes the special
theory of relativity, so the latter has been taken as the basis of the presentation. As the
starting-point of the derivation of the fundamental equations we take the variational
principles, which make possible the achievement of maximum generality, unity and simplicity
of the presentation.
The last three chapters are devoted to the presentation of the theory of gravitational
fields, i.e. the general theory of relativity. The reader is not assumed to have any previous
knowledge of tensor analysis, which is presented in parallel with the development of the
theory.
The present edition has been extensively revised from the first English edition, which
appeared in 1951.
We express our sincere gratitude to L. P. Gor'kov, I. E. Dzyaloshinskii and L. P. Pitaevskii
for their assistance in checking formulas.
Moscow, September 1961 L. D. Landau, E. M. Lifshitz
PREFACE
TO THE THIRD ENGLISH EDITION
This third English edition of the book has been translated from the revised and extended
Russian edition, published in 1967. The changes have, however, not affected the general
plan or the style of presentation.
An essential change is the shift to a different four-dimensional metric, which required
the introduction right from the start of both contra- and covariant presentations of the
four-vectors. We thus achieve uniformity of notation in the different parts of this book
and also agreement with the system that is at present gaining universal use in the physics
literature. The advantages of this notation are particularly significant for further
applications in quantum theory.
I should like here to express my sincere gratitude to all my colleagues who have made
valuable comments about the text and especially to L. P. Pitaevskii, with whom I discussed
many problems related to the revision of the book.
For the new English edition, it was not possible to add additional material throughout
the text. However, three new sections have been added at the end of the book, §§ 112-114.
April, 1970 E. M. Lifshitz
NOTATION
Three-dimensional quantities
Three-dimensional tensor indices are denoted by Greek letters
Element of volume, area and length: $dV$, $d\mathbf{f}$, $d\mathbf{l}$
Momentum and energy of a particle: $\mathbf{p}$ and $\mathcal{E}$
Hamiltonian function: $\mathcal{H}$
Scalar and vector potentials of the electromagnetic field: $\phi$ and $\mathbf{A}$
Electric and magnetic field intensities: $\mathbf{E}$ and $\mathbf{H}$
Charge and current density: $\rho$ and $\mathbf{j}$
Electric dipole moment: $\mathbf{d}$
Magnetic dipole moment: $\mathbf{m}$
Four-dimensional quantities
Four-dimensional tensor indices are denoted by Latin letters $i, k, l, \ldots$ and take on the
values 0, 1, 2, 3
We use the metric with signature $(+\,-\,-\,-)$
Rule for raising and lowering indices: see p. 14
Components of four-vectors are enumerated in the form $A^i = (A^0, \mathbf{A})$
Antisymmetric unit tensor of rank four is $e^{iklm}$, where $e^{0123} = 1$ (for the definition see
p. 17)
Radius four-vector: $x^i = (ct, \mathbf{r})$
Velocity four-vector: $u^i = dx^i/ds$
Momentum four-vector: $p^i = (\mathcal{E}/c, \mathbf{p})$
Current four-vector: $j^i = (c\rho, \rho\mathbf{v})$
Four-potential of the electromagnetic field: $A^i = (\phi, \mathbf{A})$
Electromagnetic field four-tensor $F_{ik} = \dfrac{\partial A_k}{\partial x^i} - \dfrac{\partial A_i}{\partial x^k}$ (for the relation of the components of
$F_{ik}$ to the components of $\mathbf{E}$ and $\mathbf{H}$, see p. 77)
Energy-momentum four-tensor $T^{ik}$ (for the definition of its components, see p. 78)
CHAPTER 1
THE PRINCIPLE OF RELATIVITY
§ 1. Velocity of propagation of interaction
For the description of processes taking place in nature, one must have a system of
reference. By a system of reference we understand a system of coordinates serving to indicate
the position of a particle in space, as well as clocks fixed in this system serving to indicate
the time.
There exist systems of reference in which a freely moving body, i.e. a moving body which
is not acted upon by external forces, proceeds with constant velocity. Such reference systems
are said to be inertial.
If two reference systems move uniformly relative to each other, and if one of them is an
inertial system, then clearly the other is also inertial (in this system too every free motion will
be linear and uniform). In this way one can obtain arbitrarily many inertial systems of
reference, moving uniformly relative to one another.
Experiment shows that the so-called principle of relativity is valid. According to this
principle all the laws of nature are identical in all inertial systems of reference. In other
words, the equations expressing the laws of nature are invariant with respect to transformations
of coordinates and time from one inertial system to another. This means that the
equation describing any law of nature, when written in terms of coordinates and time in
different inertial reference systems, has one and the same form.
The interaction of material particles is described in ordinary mechanics by means of a
potential energy of interaction, which appears as a function of the coordinates of the
interacting particles. It is easy to see that this manner of describing interactions contains the
assumption of instantaneous propagation of interactions. For the forces exerted on each
of the particles by the other particles at a particular instant of time depend, according to this
description, only on the positions of the particles at this one instant. A change in the position
of any of the interacting particles influences the other particles immediately.
However, experiment shows that instantaneous interactions do not exist in nature. Thus a
mechanics based on the assumption of instantaneous propagation of interactions contains
within itself a certain inaccuracy. In actuality, if any change takes place in one of the
interacting bodies, it will influence the other bodies only after the lapse of a certain interval of
time. It is only after this time interval that processes caused by the initial change begin to
take place in the second body. Dividing the distance between the two bodies by this time
interval, we obtain the velocity of propagation of the interaction.
We note that this velocity should, strictly speaking, be called the maximum velocity of
propagation of interaction. It determines only that interval of time after which a change
occurring in one body begins to manifest itself in another. It is clear that the existence of a
maximum velocity of propagation of interactions implies, at the same time, that motions of
bodies with greater velocity than this are in general impossible in nature. For if such a motion
could occur, then by means of it one could realize an interaction with a velocity exceeding
the maximum possible velocity of propagation of interactions.
Interactions propagating from one particle to another are frequently called "signals",
sent out from the first particle and "informing" the second particle of changes which the
first has experienced. The velocity of propagation of interaction is then referred to as the
signal velocity.
From the principle of relativity it follows in particular that the velocity of propagation
of interactions is the same in all inertial systems of reference. Thus the velocity of
propagation of interactions is a universal constant. This constant velocity (as we shall show later) is
also the velocity of light in empty space. The velocity of light is usually designated by the
letter c, and its numerical value is
$$ c = 2.99793 \times 10^{10}\ \text{cm/sec}. \tag{1.1} $$
The large value of this velocity explains the fact that in practice classical mechanics
appears to be sufficiently accurate in most cases. The velocities with which we have occasion
to deal are usually so small compared with the velocity of light that the assumption that the
latter is infinite does not materially affect the accuracy of the results.
The combination of the principle of relativity with the finiteness of the velocity of
propagation of interactions is called the principle of relativity of Einstein (it was formulated by
Einstein in 1905) in contrast to the principle of relativity of Galileo, which was based on an
infinite velocity of propagation of interactions.
The mechanics based on the Einsteinian principle of relativity (we shall usually refer to it
simply as the principle of relativity) is called relativistic. In the limiting case when the
velocities of the moving bodies are small compared with the velocity of light we can neglect
the effect on the motion of the finiteness of the velocity of propagation. Then relativistic
mechanics goes over into the usual mechanics, based on the assumption of instantaneous
propagation of interactions; this mechanics is called Newtonian or classical. The limiting
transition from relativistic to classical mechanics can be produced formally by the transition
to the limit $c \to \infty$ in the formulas of relativistic mechanics.
In classical mechanics distance is already relative, i.e. the spatial relations between
different events depend on the system of reference in which they are described. The
statement that two non-simultaneous events occur at one and the same point in space or, in
general, at a definite distance from each other, acquires a meaning only when we indicate the
system of reference which is used.
On the other hand, time is absolute in classical mechanics; in other words, the properties
of time are assumed to be independent of the system of reference; there is one time for all
reference frames. This means that if any two phenomena occur simultaneously for any one
observer, then they occur simultaneously also for all others. In general, the interval of time
between two given events must be identical for all systems of reference.
It is easy to show, however, that the idea of an absolute time is in complete contradiction
to the Einstein principle of relativity. For this it is sufficient to recall that in classical
mechanics, based on the concept of an absolute time, a general law of combination of
velocities is valid, according to which the velocity of a composite motion is simply equal to
the (vector) sum of the velocities which constitute this motion. This law, being universal,
should also be applicable to the propagation of interactions. From this it would follow
that the velocity of propagation must be different in different inertial systems of reference,
in contradiction to the principle of relativity. In this matter experiment completely confirms
the principle of relativity. Measurements first performed by Michelson (1881) showed
complete lack of dependence of the velocity of light on its direction of propagation; whereas
according to classical mechanics the velocity of light should be smaller in the direction of the
earth's motion than in the opposite direction.
Thus the principle of relativity leads to the result that time is not absolute. Time elapses
differently in different systems of reference. Consequently the statement that a definite time
interval has elapsed between two given events acquires meaning only when the reference
frame to which this statement applies is indicated. In particular, events which are
simultaneous in one reference frame will not be simultaneous in other frames.
To clarify this, it is instructive to consider the following simple example. Let us look at
two inertial reference systems K and K' with coordinate axes XYZ and X' Y'Z' respectively,
where the system K' moves relative to K along the X(X') axis (Fig. 1).
[Figure: the systems K (axes X, Y) and K' (axes X', Y'), with the points B, A, C marked in that order along the X' axis.]
Fig. 1.
Suppose signals start out from some point A on the X' axis in two opposite directions.
Since the velocity of propagation of a signal in the K' system, as in all inertial systems, is
equal (for both directions) to c, the signals will reach points B and C, equidistant from A,
at one and the same time (in the K' system). But it is easy to see that the same two events
(arrival of the signal at B and C) can by no means be simultaneous for an observer in the K
system. In fact, the velocity of a signal relative to the K system has, according to the principle
of relativity, the same value c, and since the point B moves (relative to the K system)
toward the source of its signal, while the point C moves in the direction away from the
signal (sent from A to C), in the K system the signal will reach point B earlier than point C.
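The argument can be checked with elementary kinematics in the K system alone. The following is only an illustrative sketch, not part of the original text: the function name `arrival_times_in_K` and its parameters `half_length` (the distance from A to B and from A to C as measured in K) and `V` (the speed of K' along the X axis) are assumptions introduced here. In K both signals travel with speed c, while B approaches its signal and C recedes from its signal with speed V.

```python
C_LIGHT = 2.99793e10  # velocity of light in cm/sec, the value quoted in (1.1)


def arrival_times_in_K(half_length, V, c=C_LIGHT):
    """Return (t_B, t_C): arrival times of the two signals as judged in K.

    half_length: distance A-B (and A-C) measured in K, in cm (hypothetical input).
    V: speed of K' relative to K along the X axis, in cm/sec, with 0 <= V < c.
    """
    if not 0 <= V < c:
        raise ValueError("V must satisfy 0 <= V < c")
    # B moves toward the signal, so the gap closes at the rate c + V.
    t_B = half_length / (c + V)
    # C moves away from the signal, so the gap closes at the rate c - V.
    t_C = half_length / (c - V)
    return t_B, t_C


t_B, t_C = arrival_times_in_K(half_length=3.0e10, V=0.5 * C_LIGHT)
print(t_B < t_C)  # the signal reaches B before it reaches C
```

For V = 0 the two times coincide, which is the situation seen from K' itself; for any V > 0 the arrival at B comes first in K.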
Thus the principle of relativity of Einstein introduces very drastic and fundamental
changes in basic physical concepts. The notions of space and time derived by us from our
daily experiences are only approximations linked to the fact that in daily life we happen to
deal only with velocities which are very small compared with the velocity of light.
§ 2. Intervals
In what follows we shall frequently use the concept of an event. An event is described by
the place where it occurred and the time when it occurred. Thus an event occurring in a
certain material particle is defined by the three coordinates of that particle and the time
when the event occurs.
It is frequently useful for reasons of presentation to use a fictitious four-dimensional
space, on the axes of which are marked three space coordinates and the time. In this space
events are represented by points, called world points. In this fictitious four-dimensional space
there corresponds to each particle a certain line, called a world line. The points of this line
determine the coordinates of the particle at all moments of time. It is easy to show that to a
particle in uniform rectilinear motion there corresponds a straight world line.
We now express the principle of the invariance of the velocity of light in mathematical
form. For this purpose we consider two reference systems K and K' moving relative to each
other with constant velocity. We choose the coordinate axes so that the axes X and X'
coincide, while the Y and Z axes are parallel to Y' and Z'; we designate the time in the
systems K and K' by t and t'.
Let the first event consist of sending out a signal, propagating with light velocity, from a
point having coordinates $x_1\,y_1\,z_1$ in the K system, at time $t_1$ in this system. We observe the
propagation of this signal in the K system. Let the second event consist of the arrival of the
signal at point $x_2\,y_2\,z_2$ at the moment of time $t_2$. The signal propagates with velocity c;
the distance covered by it is therefore $c(t_2 - t_1)$. On the other hand, this same distance
equals $[(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2]^{1/2}$. Thus we can write the following relation
between the coordinates of the two events in the K system:
$$ (x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2-c^2(t_2-t_1)^2 = 0. \tag{2.1} $$
The same two events, i.e. the propagation of the signal, can be observed from the K'
system:
Let the coordinates of the first event in the K' system be $x'_1\,y'_1\,z'_1\,t'_1$, and of the second:
$x'_2\,y'_2\,z'_2\,t'_2$. Since the velocity of light is the same in the K and K' systems, we have, similarly
to (2.1):
$$ (x'_2-x'_1)^2+(y'_2-y'_1)^2+(z'_2-z'_1)^2-c^2(t'_2-t'_1)^2 = 0. \tag{2.2} $$
If $x_1\,y_1\,z_1\,t_1$ and $x_2\,y_2\,z_2\,t_2$ are the coordinates of any two events, then the quantity
$$ s_{12} = [c^2(t_2-t_1)^2-(x_2-x_1)^2-(y_2-y_1)^2-(z_2-z_1)^2]^{1/2} \tag{2.3} $$
is called the interval between these two events.
Thus it follows from the principle of invariance of the velocity of light that if the interval
between two events is zero in one coordinate system, then it is equal to zero in all other
systems.
If two events are infinitely close to each other, then the interval ds between them is
$$ ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2. \tag{2.4} $$
The form of expressions (2.3) and (2.4) permits us to regard the interval, from the formal
point of view, as the distance between two points in a fictitious four-dimensional space
(whose axes are labelled by x, y, z, and the product ct). But there is a basic difference
between the rule for forming this quantity and the rule in ordinary geometry: in forming the
square of the interval, the squares of the coordinate differences along the different axes are
summed, not with the same sign, but rather with varying signs.†
As already shown, if $ds = 0$ in one inertial system, then $ds' = 0$ in any other system. On
the other hand, ds and ds' are infinitesimals of the same order. From these two conditions
it follows that $ds^2$ and $ds'^2$ must be proportional to each other:
$$ ds^2 = a\,ds'^2 $$
where the coefficient a can depend only on the absolute value of the relative velocity of the
two inertial systems. It cannot depend on the coordinates or the time, since then different
points in space and different moments in time would not be equivalent, which would be in
contradiction to the homogeneity of space and time. Similarly, it cannot depend on the
direction of the relative velocity, since that would contradict the isotropy of space.
† The four-dimensional geometry described by the quadratic form (2.4) was introduced by H. Minkowski,
in connection with the theory of relativity. This geometry is called pseudo-euclidean, in contrast to ordinary
euclidean geometry.
Let us consider three reference systems $K$, $K_1$, $K_2$, and let $V_1$ and $V_2$ be the velocities of
systems $K_1$ and $K_2$ relative to $K$. We then have:
$$ ds^2 = a(V_1)\,ds_1^2, \qquad ds^2 = a(V_2)\,ds_2^2. $$
Similarly we can write
$$ ds_1^2 = a(V_{12})\,ds_2^2, $$
where $V_{12}$ is the absolute value of the velocity of $K_2$ relative to $K_1$. Comparing these relations
with one another, we find that we must have
$$ \frac{a(V_2)}{a(V_1)} = a(V_{12}). \tag{2.5} $$
But $V_{12}$ depends not only on the absolute values of the vectors $\mathbf{V}_1$ and $\mathbf{V}_2$, but also on the
angle between them. However, this angle does not appear on the left side of formula (2.5).
It is therefore clear that this formula can be correct only if the function a(V) reduces to a
constant, which is equal to unity according to this same formula.
Thus,
ds 2 = ds' 2 ,
and from the equality of the infinitesimal intervals there follows the equality of finite
intervals: s = s'.
Thus we arrive at a very important result: the interval between two events is the same in all
inertial systems of reference, i.e. it is invariant under transformation from one inertial
system to any other. This invariance is the mathematical expression of the constancy of the
velocity of light.
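The invariance can be illustrated numerically. The sketch below is not from the original text and rests on one stated assumption: it uses the standard Lorentz transformation for motion along the X axis, which this book only derives later (§ 4). The helper names `boost_x` and `interval_squared`, the units with c = 1, and the sample events are all hypothetical choices made for the illustration.

```python
import math


def boost_x(t, x, y, z, V, c=1.0):
    """Standard Lorentz boost along X with velocity V (assumed here; see § 4)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    t_prime = gamma * (t - V * x / c**2)
    x_prime = gamma * (x - V * t)
    return t_prime, x_prime, y, z


def interval_squared(event1, event2, c=1.0):
    """Square of the interval (2.3) between two events given as (t, x, y, z)."""
    dt, dx, dy, dz = (b - a for a, b in zip(event1, event2))
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2


e1 = (0.0, 0.0, 0.0, 0.0)
e2 = (5.0, 3.0, 1.0, 2.0)
s2_K = interval_squared(e1, e2)
s2_Kp = interval_squared(boost_x(*e1, V=0.6), boost_x(*e2, V=0.6))
print(math.isclose(s2_K, s2_Kp))  # the squared interval agrees in K and K'
```

The coordinates of the second event change under the boost, but the combination entering (2.3) does not.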
Again let $x_1\,y_1\,z_1\,t_1$ and $x_2\,y_2\,z_2\,t_2$ be the coordinates of two events in a certain
reference system K. Does there exist a coordinate system K', in which these two events
occur at one and the same point in space?
We introduce the notation
$$ t_2 - t_1 = t_{12}, \qquad (x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2 = l_{12}^2. $$
Then the interval between events in the K system is:
$$ s_{12}^2 = c^2 t_{12}^2 - l_{12}^2 $$
and in the K' system
$$ s'^2_{12} = c^2 t'^2_{12} - l'^2_{12}, $$
whereupon, because of the invariance of intervals,
$$ c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} - l'^2_{12}. $$
We want the two events to occur at the same point in the K' system, that is, we require
$l'_{12} = 0$. Then
$$ s_{12}^2 = c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} > 0. $$
Consequently a system of reference with the required property exists if $s_{12}^2 > 0$, that is, if
the interval between the two events is a real number. Real intervals are said to be timelike.
Thus, if the interval between two events is timelike, then there exists a system of reference
in which the two events occur at one and the same place. The time which elapses between
the two events in this system is
$$ t'_{12} = \frac{1}{c}\sqrt{c^2 t_{12}^2 - l_{12}^2} = \frac{s_{12}}{c}. \tag{2.6} $$
If two events occur in one and the same body, then the interval between them is always
timelike, for the distance which the body moves between the two events cannot be greater
than $ct_{12}$, since the velocity of the body cannot exceed c. So we have always
$$ l_{12} < c\,t_{12}. $$
Let us now ask whether or not we can find a system of reference in which the
two events occur at one and the same time. As before, we have for the K and K' systems
$c^2 t_{12}^2 - l_{12}^2 = c^2 t'^2_{12} - l'^2_{12}$. We want to have $t'_{12} = 0$, so that
$$ s_{12}^2 = -l'^2_{12} < 0. $$
Consequently the required system can be found only for the case when the interval $s_{12}$
between the two events is an imaginary number. Imaginary intervals are said to be spacelike.
Thus if the interval between two events is spacelike, there exists a reference system in
which the two events occur simultaneously. The distance between the points where the
events occur in this system is
$$ l'_{12} = \sqrt{l_{12}^2 - c^2 t_{12}^2} = i s_{12}. \tag{2.7} $$
The division of intervals into space- and timelike intervals is, because of their invariance,
an absolute concept. This means that the timelike or spacelike character of an interval is
independent of the reference system.
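The classification, together with formulas (2.6) and (2.7), can be put into a few lines of code. This is a minimal sketch under stated assumptions rather than anything given in the text: the function name `classify_interval`, the default c = 1, and the label "null" for a zero interval (the case of two events connected by a light signal) are introduced here for illustration only.

```python
import math


def classify_interval(t12, l12, c=1.0):
    """Classify an interval from the time separation t12 and distance l12.

    Returns (kind, value):
      'timelike'  -> value is t'_12 = s_12 / c from (2.6)
      'spacelike' -> value is l'_12 = sqrt(l12^2 - c^2 t12^2) from (2.7)
      'null'      -> value is 0.0 (hypothetical label for a zero interval)
    """
    s2 = c**2 * t12**2 - l12**2  # square of the interval, as in (2.3)
    if s2 > 0:
        return "timelike", math.sqrt(s2) / c
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)
    return "null", 0.0


print(classify_interval(5.0, 3.0))  # ('timelike', 4.0)
print(classify_interval(3.0, 5.0))  # ('spacelike', 4.0)
```

Because only the invariant combination $c^2 t_{12}^2 - l_{12}^2$ enters, the same label is obtained whichever inertial system supplies `t12` and `l12`.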
Let us take some event O as our origin of time and space coordinates. In other words, in
the four-dimensional system of coordinates, the axes of which are marked x, y, z, t, the
world point of the event O is the origin of coordinates. Let us now consider what relation
other events bear to the given event O. For visualization, we shall consider only one space
dimension and the time, marking them on two axes (Fig. 2). Uniform rectilinear motion of a
particle, passing through $x = 0$ at $t = 0$, is represented by a straight line going through O
and inclined to the t axis at an angle whose tangent is the velocity of the particle. Since the
maximum possible velocity is c, there is a maximum angle which this line can subtend with
the t axis. In Fig. 2 are shown the two lines representing the propagation of two signals
(with the velocity of light) in opposite directions passing through the event O (i.e. going
through $x = 0$ at $t = 0$).
Fig. 2. [Figure: the lines ab and cd through O in the x, t plane.]
All lines representing the motion of particles can lie only in the
regions aOc and dOb. On the lines ab and cd, $x = \pm ct$. First consider events whose world
points lie within the region aOc. It is easy to show that for all the points of this region
$c^2t^2 - x^2 > 0$. In other words, the interval between any event in this region and the event O
is timelike. In this region t > 0, i.e. all the events in this region occur "after" the event O.
But two events which are separated by a timelike interval cannot occur simultaneously in
any reference system. Consequently it is impossible to find a reference system in which any
of the events in region aOc occurred "before" the event O, i.e. at time t < 0. Thus all the
events in region aOc are future events relative to O in all reference systems. Therefore this
region can be called the absolute future relative to O.
In exactly the same way, all events in the region bOd are in the absolute past relative to O;
i.e. events in this region occur before the event O in all systems of reference.
Next consider regions dOa and cOb. The interval between any event in this region and
the event O is spacelike. These events occur at different points in space in every reference
system. Therefore these regions can be said to be absolutely remote relative to O. However,
the concepts "simultaneous", "earlier", and "later" are relative for these regions. For any
event in these regions there exist systems of reference in which it occurs after the event
O, systems in which it occurs earlier than O, and finally one reference system in which it
occurs simultaneously with O.
Note that if we consider all three space coordinates instead of just one, then instead of
the two intersecting lines of Fig. 2 we would have a "cone" $x^2+y^2+z^2-c^2t^2 = 0$ in the
four-dimensional coordinate system x, y, z, t, the axis of the cone coinciding with the t axis.
(This cone is called the light cone.) The regions of absolute future and absolute past are then
represented by the two interior portions of this cone.
Two events can be related causally to each other only if the interval between them is
timelike; this follows immediately from the fact that no interaction can propagate with a
velocity greater than the velocity of light. As we have just seen, it is precisely for these events
that the concepts "earlier" and "later" have an absolute significance, which is a necessary
condition for the concepts of cause and effect to have meaning.
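The classification of events relative to O can be sketched in a few lines. This is only an illustration: the function names are ours, and the default of units with c = 1 is an assumption made for convenience.

```python
def interval_squared(t, x, y=0.0, z=0.0, c=1.0):
    """Square of the interval between the event (t, x, y, z) and the event O."""
    return (c * t) ** 2 - x ** 2 - y ** 2 - z ** 2


def classify(t, x, y=0.0, z=0.0, c=1.0):
    """Name the region of the light-cone diagram in which the event lies."""
    s2 = interval_squared(t, x, y, z, c)
    if s2 > 0:
        # Timelike interval: the sign of t is the same in all reference systems.
        return "absolute future" if t > 0 else "absolute past"
    if s2 < 0:
        # Spacelike interval: the time order depends on the reference system.
        return "absolutely remote"
    return "on the light cone"


print(classify(2.0, 1.0))   # timelike, t > 0
print(classify(-2.0, 1.0))  # timelike, t < 0
print(classify(1.0, 2.0))   # spacelike
```

Only events in the first two categories can be causally related to O.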
§ 3. Proper time
Suppose that in a certain inertial reference system we observe clocks which are moving
relative to us in an arbitrary manner. At each different moment of time this motion can be
considered as uniform. Thus at each moment of time we can introduce a coordinate system
rigidly linked to the moving clocks, which with the clocks constitutes an inertial reference
system.
In the course of an infinitesimal time interval dt (as read by a clock in our rest frame) the
moving clocks go a distance $\sqrt{dx^2+dy^2+dz^2}$. Let us ask what time interval dt' is indicated
for this period by the moving clocks. In a system of coordinates linked to the moving
clocks, the latter are at rest, i.e., dx' = dy' = dz' = 0. Because of the invariance of intervals
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 = c^2dt'^2,$$
from which
$$dt' = dt\sqrt{1 - \frac{dx^2+dy^2+dz^2}{c^2dt^2}}.$$
But
$$\frac{dx^2+dy^2+dz^2}{dt^2} = v^2,$$
where v is the velocity of the moving clocks; therefore
$$dt' = \frac{ds}{c} = dt\sqrt{1-\frac{v^2}{c^2}}. \tag{3.1}$$
Integrating this expression, we can obtain the time interval indicated by the moving clocks
when the elapsed time according to a clock at rest is $t_2 - t_1$:
$$t'_2 - t'_1 = \int_{t_1}^{t_2} dt\,\sqrt{1-\frac{v^2}{c^2}}. \tag{3.2}$$
The time read by a clock moving with a given object is called the proper time for this object.
Formulas (3.1) and (3.2) express the proper time in terms of the time for a system of reference
from which the motion is observed.
As we see from (3.1) or (3.2), the proper time of a moving object is always less than the
corresponding interval in the rest system. In other words, moving clocks go more slowly
than those at rest.
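As a numerical illustration of (3.2), the following sketch integrates $\sqrt{1-v^2/c^2}$ over the rest-frame time by the midpoint rule. The helper name `proper_time`, the velocity profiles, and the choice of units with c = 1 are assumptions made only for this example.

```python
import math


def proper_time(v_of_t, t1, t2, c=1.0, steps=10_000):
    """Midpoint-rule estimate of the integral (3.2) between t1 and t2."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * h      # midpoint of the k-th subinterval
        v = v_of_t(t)               # speed of the moving clock at time t
        total += math.sqrt(1.0 - (v / c) ** 2) * h
    return total


# Uniform motion at v = 0.6c for 10 units of rest-frame time.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0)

# Arbitrary (non-uniform) motion; the speed stays below c at all times.
tau_varying = proper_time(lambda t: 0.9 * math.sin(t), 0.0, 10.0)

print(tau_uniform, tau_varying)
```

In both cases the proper time comes out smaller than the rest-frame interval of 10 units, as (3.1) requires whenever v differs from zero.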
Suppose some clocks are moving in uniform rectilinear motion relative to an inertial
system K. A reference frame K' linked to the latter is also inertial. Then from the point of
view of an observer in the K system the clocks in the K' system fall behind. And conversely,
from the point of view of the K' system, the clocks in K lag. To convince ourselves
that there is no contradiction, let us note the following. In order to establish that the clocks
in the K' system lag behind those in the K system, we must proceed in the following fashion.
Suppose that at a certain moment the clock in K' passes by the clock in K, and at that
moment the readings of the two clocks coincide. To compare the rates of the two clocks in
K and K' we must once more compare the readings of the same moving clock in K' with the
clocks in K. But now we compare this clock with different clocks in K, namely with those past
which the clock in K' goes at this new time. Then we find that the clock in K' lags behind the
clocks in K with which it is being compared. We see that to compare the rates of clocks in
two reference frames we require several clocks in one frame and one in the other, and that
therefore this process is not symmetric with respect to the two systems. The clock that appears
to lag is always the one which is being compared with different clocks in the other
system.
If we have two clocks, one of which describes a closed path returning to the starting point
(the position of the clock which remained at rest), then clearly the moving clock appears to
lag relative to the one at rest. The converse reasoning, in which the moving clock would be
considered to be at rest (and vice versa) is now impossible, since the clock describing a
closed trajectory does not carry out a uniform rectilinear motion, so that a coordinate
system linked to it will not be inertial.
Since the laws of nature are the same only for inertial reference frames, the frames linked
to the clock at rest (inertial frame) and to the moving clock (noninertial) have different
properties, and the argument which leads to the result that the clock at rest must lag is not
valid.
The time interval read by a clock is equal to the integral
$$\frac{1}{c}\int_a^b ds,$$
taken along the world line of the clock. If the clock is at rest then its world line is clearly a
line parallel to the t axis; if the clock carries out a nonuniform motion in a closed path and
returns to its starting point, then its world line will be a curve passing through the two points,
on the straight world line of a clock at rest, corresponding to the beginning and end of the
motion. On the other hand, we saw that the clock at rest always indicates a greater time
interval than the moving one. Thus we arrive at the result that the integral
$$\int_a^b ds,$$
taken between a given pair of world points, has its maximum value if it is taken along the
straight world line joining these two points.†
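The maximum property can be checked on world lines made of straight segments. The sketch below is illustrative only: the function name, the sample events, and the units with c = 1 are our assumptions, and motion is restricted to one space dimension.

```python
import math


def proper_time_along(events, c=1.0):
    """Sum (1/c) * ds over consecutive events (t, x) of a piecewise-straight world line."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        ds2 = (c * (t1 - t0)) ** 2 - (x1 - x0) ** 2
        if ds2 < 0:
            raise ValueError("segment is spacelike; no clock can follow it")
        total += math.sqrt(ds2) / c
    return total


# Clock at rest between the world points a = (0, 0) and b = (10, 0).
straight = [(0.0, 0.0), (10.0, 0.0)]

# Clock that moves out at v = 0.6 and returns at v = -0.6 between the same points.
bent = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

print(proper_time_along(straight), proper_time_along(bent))
```

The straight world line gives the larger value, in contrast to euclidean geometry, where the straight line is the shortest path.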
§ 4. The Lorentz transformation
Our purpose is now to obtain the formula of transformation from one inertial reference
system to another, that is, a formula by means of which, knowing the coordinates x, y, z, t,
of a certain event in the K system, we can find the coordinates x', y', z', t' of the same event
in another inertial system K'.
In classical mechanics this question is resolved very simply. Because of the absolute
nature of time we there have t = t'; if, furthermore, the coordinate axes are chosen as usual
(axes X, X' coincident, Y, Z axes parallel to Y', Z', motion along X, X') then the
coordinates y, z clearly are equal to y', z', while the coordinates x and x' differ by the distance
traversed by one system relative to the other. If the time origin is chosen as the moment when
the two coordinate systems coincide, and if the velocity of the K' system relative to K is V,
then this distance is Vt. Thus
$$x = x'+Vt, \quad y = y', \quad z = z', \quad t = t'. \tag{4.1}$$
This formula is called the Galileo transformation. It is easy to verify that this transformation,
as was to be expected, does not satisfy the requirements of the theory of relativity; it does
not leave the interval between events invariant.
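The failure of the Galileo transformation to preserve the interval is easy to see on one numerical example. The sample event and the units with c = 1 are assumptions chosen so that the arithmetic is exact.

```python
def galileo(x_prime, t_prime, V):
    """Formula (4.1): x and t in the K system from x' and t' in the K' system."""
    return x_prime + V * t_prime, t_prime


def interval_squared(x, t, c=1.0):
    """Square of the interval between the event (t, x) and the origin."""
    return (c * t) ** 2 - x ** 2


x_p, t_p, V = 1.0, 2.0, 0.5
x, t = galileo(x_p, t_p, V)

s2_prime = interval_squared(x_p, t_p)  # interval computed in K'
s2 = interval_squared(x, t)            # interval computed in K

print(s2_prime, s2)
```

The two values differ (a timelike interval in K' has become a null interval in K), which is incompatible with the invariance of the interval.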
We shall obtain the relativistic transformation precisely as a consequence of the requirement
that it leave the interval between events invariant.
As we saw in § 2, the interval between events can be looked on as the distance between the
corresponding pair of world points in a four-dimensional system of coordinates. Consequently
we may say that the required transformation must leave unchanged all distances in
the four-dimensional x, y, z, ct space. But such transformations consist only of parallel
displacements and rotations of the coordinate system. Of these the displacement of the
coordinate system parallel to itself is of no interest, since it leads only to a shift in the origin
of the space coordinates and a change in the time reference point. Thus the required
transformation must be expressible mathematically as a rotation of the four-dimensional
x, y, z, ct coordinate system.
† It is assumed, of course, that the points a and b and the curves joining them are such that all elements ds
along the curves are timelike.
This property of the integral is connected with the pseudo-euclidean character of the four-dimensional
geometry. In euclidean space the integral would, of course, be a minimum along the straight line.
Every rotation in the four-dimensional space can be resolved into six rotations, in the
planes xy, zy, xz, tx, ty, tz (just as every rotation in ordinary space can be resolved into three
rotations in the planes xy, zy, and xz). The first three of these rotations transform only the
space coordinates; they correspond to the usual space rotations.
Let us consider a rotation in the tx plane; under this, the y and z coordinates do not
change. In particular, this transformation must leave unchanged the difference $(ct)^2 - x^2$,
the square of the "distance" of the point (ct, x) from the origin. The relation between the
old and the new coordinates is given in most general form by the formulas:
$$x = x'\cosh\psi + ct'\sinh\psi, \quad ct = x'\sinh\psi + ct'\cosh\psi, \tag{4.2}$$
where $\psi$ is the "angle of rotation"; a simple check shows that in fact $c^2t^2 - x^2 = c^2t'^2 - x'^2$.
Formula (4.2) differs from the usual formulas for transformation under rotation of the
coordinate axes in having hyperbolic functions in place of trigonometric functions. This is the
difference between pseudo-euclidean and euclidean geometry.
We try to find the formula of transformation from an inertial reference frame K to a
system K' moving relative to K with velocity V along the x axis. In this case clearly only the
coordinate x and the time t are subject to change. Therefore this transformation must have
the form (4.2). Now it remains only to determine the angle $\psi$, which can depend only on the
relative velocity V.†
Let us consider the motion, in the K system, of the origin of the K' system. Then x' = 0
and formulas (4.2) take the form:
$$x = ct'\sinh\psi, \quad ct = ct'\cosh\psi,$$
or dividing one by the other,
$$\frac{x}{ct} = \tanh\psi.$$
But x/t is clearly the velocity V of the K' system relative to K. So
$$\tanh\psi = \frac{V}{c}.$$
From this
$$\sinh\psi = \frac{V/c}{\sqrt{1-\frac{V^2}{c^2}}}, \quad \cosh\psi = \frac{1}{\sqrt{1-\frac{V^2}{c^2}}}.$$
Substituting in (4.2), we find:
$$x = \frac{x'+Vt'}{\sqrt{1-\frac{V^2}{c^2}}}, \quad y = y', \quad z = z', \quad t = \frac{t'+\frac{V}{c^2}x'}{\sqrt{1-\frac{V^2}{c^2}}}. \tag{4.3}$$
This is the required transformation formula. It is called the Lorentz transformation, and is of
fundamental importance for what follows.
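The following sketch evaluates (4.3) for one sample event and compares it with the hyperbolic rotation (4.2) taken at $\tanh\psi = V/c$. The function names, the sample numbers, and the units with c = 1 are assumptions for illustration; only the x and t coordinates are transformed, since y and z are unchanged.

```python
import math


def lorentz(x_prime, t_prime, V, c=1.0):
    """Formula (4.3): x and t in K from x' and t' in K'."""
    root = math.sqrt(1.0 - (V / c) ** 2)
    x = (x_prime + V * t_prime) / root
    t = (t_prime + V * x_prime / c ** 2) / root
    return x, t


def lorentz_via_rotation(x_prime, t_prime, V, c=1.0):
    """Formula (4.2) with the rotation angle fixed by tanh(psi) = V/c."""
    psi = math.atanh(V / c)
    x = x_prime * math.cosh(psi) + c * t_prime * math.sinh(psi)
    ct = x_prime * math.sinh(psi) + c * t_prime * math.cosh(psi)
    return x, ct / c


x_p, t_p, V = 1.0, 2.0, 0.6
x, t = lorentz(x_p, t_p, V)
x_r, t_r = lorentz_via_rotation(x_p, t_p, V)

s2_prime = t_p ** 2 - x_p ** 2   # interval squared in K' (c = 1)
s2 = t ** 2 - x ** 2             # interval squared in K  (c = 1)

print((x, t), (x_r, t_r), s2_prime, s2)
```

Both routes give the same coordinates, and the interval from the origin is the same in K and K' up to rounding.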
† Note that to avoid confusion we shall always use V to signify the constant relative velocity of two
inertial systems, and v for the velocity of a moving particle, not necessarily constant.
The inverse formulas, expressing x', y', z', t' in terms of x, y, z, t, are most easily obtained
by changing V to $-V$ (since the K system moves with velocity $-V$ relative to the K'
system). The same formulas can be obtained directly by solving equations (4.3) for x', y', z', t'.
It is easy to see from (4.3) that on making the transition to the limit $c \to \infty$ and classical
mechanics, the formula for the Lorentz transformation actually goes over into the Galileo
transformation.
For V > c in formula (4.3) the coordinates x, t are imaginary; this corresponds to the fact
that motion with a velocity greater than the velocity of light is impossible. Moreover, one
cannot use a reference system moving with the velocity of light; in that case the
denominators in (4.3) would go to zero.
For velocities V small compared with the velocity of light, we can use in place of (4.3)
the approximate formulas:
$$x = x'+Vt', \quad y = y', \quad z = z', \quad t = t'+\frac{V}{c^2}x'. \tag{4.4}$$
Suppose there is a rod at rest in the K system, parallel to the X axis. Let its length,
measured in this system, be $\Delta x = x_2 - x_1$ ($x_2$ and $x_1$ are the coordinates of the two ends of
the rod in the K system). We now determine the length of this rod as measured in the K'
system. To do this we must find the coordinates of the two ends of the rod ($x'_2$ and $x'_1$) in
this system at one and the same time t'. From (4.3) we find:
$$x_1 = \frac{x'_1+Vt'}{\sqrt{1-\frac{V^2}{c^2}}}, \quad x_2 = \frac{x'_2+Vt'}{\sqrt{1-\frac{V^2}{c^2}}}.$$
The length of the rod in the K' system is $\Delta x' = x'_2 - x'_1$; subtracting $x_1$ from $x_2$, we find
$$\Delta x = \frac{\Delta x'}{\sqrt{1-\frac{V^2}{c^2}}}.$$
The proper length of a rod is its length in a reference system in which it is at rest. Let us
denote it by $l_0 = \Delta x$, and the length of the rod in any other reference frame K' by $l$. Then
$$l = l_0\sqrt{1-\frac{V^2}{c^2}}. \tag{4.5}$$
Thus a rod has its greatest length in the reference system in which it is at rest. Its length
in a system in which it moves with velocity V is decreased by the factor $\sqrt{1-V^2/c^2}$. This
result of the theory of relativity is called the Lorentz contraction.
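The essential point of the derivation, that both ends of the rod must be located at one and the same time t' in K', can be made explicit in a short sketch. The function names and the units with c = 1 are our assumptions; `to_K_prime` is (4.3) with V replaced by $-V$.

```python
import math


def to_K_prime(x, t, V, c=1.0):
    """Coordinates (x', t') in K' of an event (x, t) in K: (4.3) with V -> -V."""
    root = math.sqrt(1.0 - (V / c) ** 2)
    return (x - V * t) / root, (t - V * x / c ** 2) / root


def rod_length_in_K_prime(x1, x2, V, t_prime=0.0, c=1.0):
    """Length in K' of a rod at rest in K with ends at x1 and x2."""
    root = math.sqrt(1.0 - (V / c) ** 2)
    ends = []
    for x in (x1, x2):
        # Time in K at which this end's world line meets the K' instant t_prime.
        t = t_prime * root + V * x / c ** 2
        x_p, t_p = to_K_prime(x, t, V, c)
        assert math.isclose(t_p, t_prime, abs_tol=1e-12)  # both ends at the same t'
        ends.append(x_p)
    return ends[1] - ends[0]


length = rod_length_in_K_prime(0.0, 1.0, V=0.6)
print(length)  # close to sqrt(1 - 0.36) = 0.8 times the proper length
```

The two events used to mark the ends are simultaneous in K' but not in K, which is why the result differs from the proper length.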
Since the transverse dimensions do not change because of its motion, the volume $\mathcal{V}$ of a
body decreases according to the similar formula
$$\mathcal{V} = \mathcal{V}_0\sqrt{1-\frac{V^2}{c^2}},$$
where $\mathcal{V}_0$ is the proper volume of the body.
From the Lorentz transformation we can obtain anew the results already known to us
concerning the proper time (§ 3). Suppose a clock to be at rest in the K' system. We take
two events occurring at one and the same point x', y', z' in space in the K' system. The time
between these events in the K' system is $\Delta t' = t'_2 - t'_1$. Now we find the time $\Delta t$ which
elapses between these two events in the K system. From (4.3), we have
$$t_1 = \frac{t'_1+\frac{V}{c^2}x'}{\sqrt{1-\frac{V^2}{c^2}}}, \quad t_2 = \frac{t'_2+\frac{V}{c^2}x'}{\sqrt{1-\frac{V^2}{c^2}}},$$
or, subtracting one from the other,
$$t_2 - t_1 = \Delta t = \frac{\Delta t'}{\sqrt{1-\frac{V^2}{c^2}}},$$
in complete agreement with (3.1).
Finally we mention another general property of Lorentz transformations which distinguishes
them from Galilean transformations. The latter have the general property of commutativity,
i.e. the combined result of two successive Galilean transformations (with
different velocities $\mathbf{V}_1$ and $\mathbf{V}_2$) does not depend on the order in which the transformations
are performed. On the other hand, the result of two successive Lorentz transformations does
depend, in general, on their order. This is already apparent purely mathematically from our
formal description of these transformations as rotations of the four-dimensional coordinate
system: we know that the result of two rotations (about different axes) depends on the order
in which they are carried out. The sole exception is the case of transformations with parallel
vectors $\mathbf{V}_1$ and $\mathbf{V}_2$ (which are equivalent to two rotations of the four-dimensional coordinate
system about the same axis).
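The dependence on order can be seen by writing the transformations as 4 x 4 matrices acting on (ct, x, y, z) and multiplying them in both orders. The matrix form below follows from (4.3) applied along the chosen axis; the helper names, the sample velocities, and the units with c = 1 are assumptions for this sketch.

```python
import math


def boost(axis, V, c=1.0):
    """Matrix of the Lorentz transformation along axis 1 (x), 2 (y) or 3 (z)."""
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    m = [[1.0 if i == k else 0.0 for k in range(4)] for i in range(4)]
    m[0][0] = m[axis][axis] = gamma
    m[0][axis] = m[axis][0] = gamma * beta
    return m


def matmul(a, b):
    """Product of two 4 x 4 matrices given as nested lists."""
    return [[sum(a[i][j] * b[j][k] for j in range(4)) for k in range(4)]
            for i in range(4)]


def max_abs_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][k] - b[i][k]) for i in range(4) for k in range(4))


bx, by = boost(1, 0.6), boost(2, 0.8)          # perpendicular velocities
perpendicular_gap = max_abs_diff(matmul(bx, by), matmul(by, bx))

bx1, bx2 = boost(1, 0.6), boost(1, 0.3)        # parallel velocities
parallel_gap = max_abs_diff(matmul(bx1, bx2), matmul(bx2, bx1))

print(perpendicular_gap, parallel_gap)
```

The first gap is of order unity, while the second vanishes to rounding accuracy, in line with the exception noted for parallel velocities.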
§ 5. Transformation of velocities
In the preceding section we obtained formulas which enable us to find from the coordinates
of an event in one reference frame, the coordinates of the same event in a second reference
frame. Now we find formulas relating the velocity of a material particle in one reference
system to its velocity in a second reference system.
Let us suppose once again that the K' system moves relative to the K system with velocity
V along the x axis. Let $v_x = dx/dt$ be the component of the particle velocity in the K system
and $v'_x = dx'/dt'$ the velocity component of the same particle in the K' system. From (4.3),
we have
$$dx = \frac{dx'+V\,dt'}{\sqrt{1-\frac{V^2}{c^2}}}, \quad dy = dy', \quad dz = dz', \quad dt = \frac{dt'+\frac{V}{c^2}dx'}{\sqrt{1-\frac{V^2}{c^2}}}.$$
Dividing the first three equations by the fourth and introducing the velocities
$$\mathbf{v} = \frac{d\mathbf{r}}{dt}, \quad \mathbf{v}' = \frac{d\mathbf{r}'}{dt'},$$
we find
$$v_x = \frac{v'_x+V}{1+v'_x\frac{V}{c^2}}, \quad v_y = \frac{v'_y\sqrt{1-\frac{V^2}{c^2}}}{1+v'_x\frac{V}{c^2}}, \quad v_z = \frac{v'_z\sqrt{1-\frac{V^2}{c^2}}}{1+v'_x\frac{V}{c^2}}. \tag{5.1}$$
These formulas determine the transformation of velocities. They describe the law of
composition of velocities in the theory of relativity. In the limiting case of $c \to \infty$, they go over
into the formulas $v_x = v'_x + V$, $v_y = v'_y$, $v_z = v'_z$ of classical mechanics.
In the special case of motion of a particle parallel to the X axis, $v_x = v$, $v_y = v_z = 0$.
Then $v'_y = v'_z = 0$, $v'_x = v'$, so that
$$v = \frac{v'+V}{1+v'\frac{V}{c^2}}. \tag{5.2}$$
It is easy to convince oneself that the sum of two velocities each smaller than the velocity
of light is again not greater than the light velocity.
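A quick numerical check of this statement, using (5.1) directly, might look as follows. The function names, the sample velocities, and the units with c = 1 are assumptions made for the example.

```python
import math


def transform_velocity(v_prime, V, c=1.0):
    """Formula (5.1): velocity components in K from those in K'."""
    vx_p, vy_p, vz_p = v_prime
    denom = 1.0 + vx_p * V / c ** 2
    root = math.sqrt(1.0 - (V / c) ** 2)
    return ((vx_p + V) / denom, vy_p * root / denom, vz_p * root / denom)


def speed(v):
    """Absolute value of a three-dimensional velocity."""
    return math.sqrt(sum(comp ** 2 for comp in v))


# Parallel case (5.2): 0.9c combined with 0.9c.
parallel = transform_velocity((0.9, 0.0, 0.0), 0.9)

# Oblique case: a particle with speed below c in K', fast relative motion.
oblique = transform_velocity((0.5, 0.8, 0.0), 0.99)

# Light moving along y' in K' still has speed c in K.
light = transform_velocity((0.0, 1.0, 0.0), 0.5)

print(parallel[0], speed(oblique), speed(light))
```

The combined speeds stay below c in the first two cases, and the speed of light is reproduced unchanged in the third.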
For a velocity V significantly smaller than the velocity of light (the velocity v can be
arbitrary), we have approximately, to terms of order V/c:
$$v_x = v'_x + V\left(1-\frac{v'^2_x}{c^2}\right), \quad v_y = v'_y - v'_xv'_y\frac{V}{c^2}, \quad v_z = v'_z - v'_xv'_z\frac{V}{c^2}.$$
These three formulas can be written as a single vector formula
$$\mathbf{v} = \mathbf{v}' + \mathbf{V} - \frac{1}{c^2}(\mathbf{V}\cdot\mathbf{v}')\mathbf{v}'. \tag{5.3}$$
We may point out that in the relativistic law of addition of velocities (5.1) the two
velocities v' and V which are combined enter unsymmetrically (provided they are not both
directed along the x axis). This fact is related to the noncommutativity of Lorentz
transformations which we mentioned in the preceding section.
Let us choose our coordinate axes so that the velocity of the particle at the given moment
lies in the XY plane. Then the velocity of the particle in the K system has components
$v_x = v\cos\theta$, $v_y = v\sin\theta$, and in the K' system $v'_x = v'\cos\theta'$, $v'_y = v'\sin\theta'$ ($v$, $v'$, $\theta$, $\theta'$ are
the absolute values and the angles subtended with the X, X' axes respectively in the K, K'
systems). With the help of formula (5.1), we then find
$$\tan\theta = \frac{v'\sqrt{1-\frac{V^2}{c^2}}\,\sin\theta'}{v'\cos\theta'+V}. \tag{5.4}$$
This formula describes the change in the direction of the velocity on transforming from
one reference system to another.
Let us consider a very important special case of this formula, namely, the deviation of
light in transforming to a new reference system, a phenomenon known as the aberration
of light. In this case $v = v' = c$, so that the preceding formula goes over into
$$\tan\theta = \frac{\sqrt{1-\frac{V^2}{c^2}}}{\frac{V}{c}+\cos\theta'}\,\sin\theta'. \tag{5.5}$$
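From the same transformation formulas (5.1) it is easy to obtain for $\sin\theta$ and $\cos\theta$:
$$\sin\theta = \frac{\sqrt{1-\frac{V^2}{c^2}}}{1+\frac{V}{c}\cos\theta'}\,\sin\theta', \quad \cos\theta = \frac{\cos\theta'+\frac{V}{c}}{1+\frac{V}{c}\cos\theta'}. \tag{5.6}$$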
In case $V \ll c$, we find from this formula, correct to terms of order V/c:
$$\sin\theta - \sin\theta' = -\frac{V}{c}\sin\theta'\cos\theta'.$$
Introducing the angle $\Delta\theta = \theta' - \theta$ (the aberration angle), we find to the same order of
accuracy
$$\Delta\theta = \frac{V}{c}\sin\theta', \tag{5.7}$$
which is the well-known elementary formula for the aberration of light.
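To see how well the first-order formula (5.7) reproduces the exact result (5.6), one can compare the two numerically. In this sketch the function names are ours, c = 1, and the value V/c = 1e-4 is an illustrative choice of the order of the Earth's orbital velocity.

```python
import math


def theta_exact(theta_prime, V, c=1.0):
    """Angle theta in K from (5.6); atan2 keeps the correct quadrant."""
    beta = V / c
    denom = 1.0 + beta * math.cos(theta_prime)
    sin_t = math.sqrt(1.0 - beta ** 2) * math.sin(theta_prime) / denom
    cos_t = (math.cos(theta_prime) + beta) / denom
    return math.atan2(sin_t, cos_t)


def aberration_first_order(theta_prime, V, c=1.0):
    """Aberration angle (5.7), valid to first order in V/c."""
    return (V / c) * math.sin(theta_prime)


theta_p = math.radians(60.0)
V = 1.0e-4

exact = theta_p - theta_exact(theta_p, V)      # delta theta = theta' - theta
approx = aberration_first_order(theta_p, V)

print(exact, approx, abs(exact - approx))
```

The difference between the two values is of second order in V/c, far below the first-order term itself.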
§ 6. Four-vectors
The coordinates of an event (ct, x, y, z) can be considered as the components of a
four-dimensional radius vector (or, for short, a four-radius vector) in a four-dimensional space.
We shall denote its components by $x^i$, where the index i takes on the values 0, 1, 2, 3, and
$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z.$$
The square of the "length" of the radius four-vector is given by
$$(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2.$$
It does not change under any rotations of the four-dimensional coordinate system, in
particular under Lorentz transformations.
In general a set of four quantities $A^0$, $A^1$, $A^2$, $A^3$ which transform like the components
of the radius four-vector $x^i$ under transformations of the four-dimensional coordinate
system is called a four-dimensional vector (four-vector) $A^i$. Under Lorentz transformations,
$$A^0 = \frac{A'^0+\frac{V}{c}A'^1}{\sqrt{1-\frac{V^2}{c^2}}}, \quad A^1 = \frac{A'^1+\frac{V}{c}A'^0}{\sqrt{1-\frac{V^2}{c^2}}}, \quad A^2 = A'^2, \quad A^3 = A'^3. \tag{6.1}$$
The square magnitude of any four-vector is defined analogously to the square of the radius
four-vector:
$$(A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2.$$
For convenience of notation, we introduce two "types" of components of four-vectors,
denoting them by the symbols $A^i$ and $A_i$, with superscripts and subscripts. These are related
by
$$A_0 = A^0, \quad A_1 = -A^1, \quad A_2 = -A^2, \quad A_3 = -A^3. \tag{6.2}$$
The quantities $A^i$ are called the contravariant, and the $A_i$ the covariant components of the
four-vector. The square of the four-vector then appears in the form
$$\sum_{i=0}^{3} A^iA_i = A^0A_0 + A^1A_1 + A^2A_2 + A^3A_3.$$
Such sums are customarily written simply as $A^iA_i$, omitting the summation sign. One
agrees that one sums over any repeated index, and omits the summation sign. Of the pair
of indices, one must be a superscript and the other a subscript. This convention for
summation over "dummy" indices is very convenient and considerably simplifies the writing of
formulas.
We shall use Latin letters i, k, l, . . . , for four-dimensional indices, taking on the values
0, 1, 2, 3.
In analogy to the square of a four-vector, one forms the scalar product of two different
four-vectors:
$$A^iB_i = A^0B_0 + A^1B_1 + A^2B_2 + A^3B_3.$$
It is clear that this can be written either as $A^iB_i$ or $A_iB^i$; the result is the same. In general
one can switch upper and lower indices in any pair of dummy indices.†
The product $A^iB_i$ is a four-scalar: it is invariant under rotations of the four-dimensional
coordinate system. This is easily verified directly,‡ but it is also apparent beforehand (from
the analogy with the square $A^iA_i$) from the fact that all four-vectors transform according to
the same rule.
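The direct verification can be carried out numerically for a pair of sample four-vectors. The function names, the component values, and the units with c = 1 are assumptions for this sketch; the transformation is (6.1) and the lowering of indices is (6.2).

```python
import math


def transform(a_prime, V, c=1.0):
    """Formula (6.1): contravariant components in K from those in K'."""
    beta = V / c
    root = math.sqrt(1.0 - beta ** 2)
    a0, a1, a2, a3 = a_prime
    return ((a0 + beta * a1) / root, (a1 + beta * a0) / root, a2, a3)


def lower(a):
    """Formula (6.2): covariant components from contravariant ones."""
    return (a[0], -a[1], -a[2], -a[3])


def scalar_product(a, b):
    """A^i B_i for two four-vectors given by their contravariant components."""
    return sum(x * y for x, y in zip(a, lower(b)))


A_prime = (3.0, 1.0, 2.0, 0.0)
B_prime = (1.0, -1.0, 0.5, 2.0)
V = 0.6

A, B = transform(A_prime, V), transform(B_prime, V)

print(scalar_product(A_prime, B_prime), scalar_product(A, B))
```

The two printed values agree to rounding accuracy, although the individual components of A and B differ from those of A' and B'.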
The component $A^0$ is called the time component, and $A^1$, $A^2$, $A^3$ the space components of
the four-vector (in analogy to the radius four-vector). The square of a four-vector can be
positive, negative, or zero; such vectors are called timelike, spacelike, and null-vectors,
respectively (again in analogy to the terminology for intervals).
Under purely spatial rotations (i.e. transformations not affecting the time axis) the three
space components of the four-vector $A^i$ form a three-dimensional vector $\mathbf{A}$. The time
component of the four-vector is a three-dimensional scalar (with respect to these
transformations). In enumerating the components of a four-vector, we shall often write them as
$$A^i = (A^0, \mathbf{A}).$$
The covariant components of the same four-vector are $A_i = (A^0, -\mathbf{A})$, and the square of
the four-vector is $A^iA_i = (A^0)^2 - \mathbf{A}^2$. Thus, for the radius four-vector:
$$x^i = (ct, \mathbf{r}), \quad x_i = (ct, -\mathbf{r}), \quad x^ix_i = c^2t^2 - \mathbf{r}^2.$$
For three-dimensional vectors (with coordinates x, y, z) there is no need to distinguish
between contra- and covariant components. Whenever this can be done without causing
confusion, we shall write their components as $A_\alpha$ ($\alpha = x, y, z$) using Greek letters for
subscripts. In particular we shall assume a summation over x, y, z for any repeated index (for
example, $\mathbf{A}\cdot\mathbf{B} = A_\alpha B_\alpha$).
A four-dimensional tensor (four-tensor) of the second rank is a set of sixteen quantities
$A^{ik}$, which under coordinate transformations transform like the products of components of
two four-vectors. We similarly define four-tensors of higher rank.
† In the literature the indices are often omitted on four-vectors, and their squares and scalar products are
written as $A^2$, $AB$. We shall not use this notation in the present text.
‡ One should remember that the law for transformation of a four-vector expressed in covariant components
differs (in signs) from the same law expressed for contravariant components. Thus, instead of (6.1), one will
have:
$$A_0 = \frac{A'_0-\frac{V}{c}A'_1}{\sqrt{1-\frac{V^2}{c^2}}}, \quad A_1 = \frac{A'_1-\frac{V}{c}A'_0}{\sqrt{1-\frac{V^2}{c^2}}}, \quad A_2 = A'_2, \quad A_3 = A'_3.$$
The components of a second-rank tensor can be written in three forms: covariant, $A_{ik}$,
contravariant, $A^{ik}$, and mixed, $A^i{}_k$ (where, in the last case, one should distinguish between
$A^i{}_k$ and $A_i{}^k$, i.e. one should be careful about which of the two is superscript and which a
subscript). The connection between the different types of components is determined from
the general rule: raising or lowering a space index (1, 2, 3) changes the sign of the
component, while raising or lowering the time index (0) does not. Thus:
$$A_{00} = A^{00}, \quad A_{01} = -A^{01}, \quad A_{11} = A^{11}, \ldots,$$
$$A^0{}_0 = A^{00}, \quad A_0{}^1 = A^{01}, \quad A^0{}_1 = -A^{01}, \quad A^1{}_1 = -A^{11}, \ldots.$$
Under purely spatial transformations, the nine quantities $A^{11}$, $A^{12}$, . . . form a
three-tensor. The three components $A^{01}$, $A^{02}$, $A^{03}$ and the three components $A^{10}$, $A^{20}$, $A^{30}$
constitute three-dimensional vectors, while the component $A^{00}$ is a three-dimensional
scalar.
A tensor $A^{ik}$ is said to be symmetric if $A^{ik} = A^{ki}$, and antisymmetric if $A^{ik} = -A^{ki}$. In an
antisymmetric tensor, all the diagonal components (i.e. the components $A^{00}$, $A^{11}$, . . . )
are zero, since, for example, we must have $A^{00} = -A^{00}$. For a symmetric tensor $A^{ik}$, the
mixed components $A^i{}_k$ and $A_k{}^i$ obviously coincide; in such cases we shall simply write $A^i_k$,
putting the indices one above the other.
In every tensor equation, the two sides must contain identical and identically placed
(i.e. above or below) free indices (as distinguished from dummy indices). The free indices in
tensor equations can be shifted up or down, but this must be done simultaneously in all
terms in the equation. Equating covariant and contravariant components of different
tensors is "illegal" ; such an equation, even if it happened by chance to be valid in a particular
reference system, would be violated on going to another frame.
From the tensor components $A^{ik}$ one can form a scalar by taking the sum
$$A^i{}_i = A^0{}_0 + A^1{}_1 + A^2{}_2 + A^3{}_3$$
(where, of course, $A_i{}^i = A^i{}_i$). This sum is called the trace of the tensor, and the operation for
obtaining it is called contraction.
The formation of the scalar product of two vectors, considered earlier, is a contraction
operation: it is the formation of the scalar $A^iB_i$ from the tensor $A^iB_k$. In general, contracting
on any pair of indices reduces the rank of the tensor by 2. For example, $A^i{}_{kli}$ is a tensor of
second rank, $A^i{}_kB^k$ is a four-vector, $A^{ik}{}_{ik}$ is a scalar, etc.
The unit four-tensor $\delta^k_i$ satisfies the condition that for any four-vector $A^i$,
$$\delta^k_iA^i = A^k. \tag{6.3}$$
It is clear that the components of this tensor are
$$\delta^k_i = \begin{cases} 1, & \text{if } i = k \\ 0, & \text{if } i \neq k \end{cases} \tag{6.4}$$
Its trace is $\delta^i_i = 4$.
By raising the one index or lowering the other in $\delta^k_i$, we can obtain the contra- or covariant
tensor $g^{ik}$ or $g_{ik}$, which is called the metric tensor. The tensors $g^{ik}$ and $g_{ik}$ have identical
components, which can be written as a matrix:
$$(g^{ik}) = (g_{ik}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{6.5}$$
(the index i labels the rows, and k the columns, in the order 0, 1, 2, 3). It is clear that
$$g_{ik}A^k = A_i, \quad g^{ik}A_k = A^i. \tag{6.6}$$
The scalar product of two four-vectors can therefore be written in the form:
$$A^iA_i = g_{ik}A^iA^k = g^{ik}A_iA_k. \tag{6.7}$$
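Formulas (6.5) to (6.7) translate directly into a few lines of code. This is a minimal sketch: the names `G`, `lower_index`, `raise_index` and `square` are ours, and the sample components are arbitrary.

```python
# Matrix (6.5); g_ik and g^ik have identical components.
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]


def lower_index(a_up):
    """First formula of (6.6): A_i = g_ik A^k."""
    return [sum(G[i][k] * a_up[k] for k in range(4)) for i in range(4)]


def raise_index(a_down):
    """Second formula of (6.6): A^i = g^ik A_k."""
    return [sum(G[i][k] * a_down[k] for k in range(4)) for i in range(4)]


def square(a_up):
    """Formula (6.7): g_ik A^i A^k."""
    return sum(G[i][k] * a_up[i] * a_up[k] for i in range(4) for k in range(4))


# Raising one index of g_ik with g^il gives back the unit tensor.
unit = [[sum(G[i][l] * G[l][k] for l in range(4)) for k in range(4)]
        for i in range(4)]

A_up = [5, 1, 2, 3]
print(lower_index(A_up), square(A_up), sum(unit[i][i] for i in range(4)))
```

Lowering and then raising an index returns the original components, and the trace of the unit tensor comes out as 4.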
The tensors $\delta^i_k$, $g_{ik}$, $g^{ik}$ are special in the sense that their components are the same in all
coordinate systems. The completely antisymmetric unit tensor of fourth rank, $e^{iklm}$, has the
same property. This is the tensor whose components change sign under interchange of any
pair of indices, and whose nonzero components are ±1. From the antisymmetry it follows
that all components in which two indices are the same are zero, so that the only
nonvanishing components are those for which all four indices are different. We set
$$e^{0123} = +1 \tag{6.8}$$
(hence $e_{0123} = -1$). Then all the other nonvanishing components $e^{iklm}$ are equal to +1 or
-1, according as the numbers i, k, l, m can be brought to the arrangement 0, 1, 2, 3 by an
even or an odd number of transpositions. The number of such components is 4! = 24. Thus,
$$e^{iklm}e_{iklm} = -24. \tag{6.9}$$
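The counting behind (6.9) can be reproduced by enumerating the permutations of 0, 1, 2, 3. In this sketch the names `parity`, `e_up` and `e_down` are ours; the covariant components are obtained from the rule that lowering each space index changes the sign.

```python
from itertools import permutations


def parity(perm):
    """+1 if perm is brought to (0, 1, 2, 3) by an even number of transpositions, else -1."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # one transposition puts the value j in place
            sign = -sign
    return sign


# Nonvanishing contravariant components e^{iklm}, normalised by (6.8).
e_up = {perm: parity(perm) for perm in permutations(range(4))}


def e_down(perm):
    """Covariant component e_{iklm}: one sign change per lowered space index."""
    sign = 1
    for index in perm:
        if index != 0:
            sign = -sign
    return sign * e_up[perm]


contraction = sum(e_up[perm] * e_down(perm) for perm in e_up)
print(len(e_up), e_up[(0, 1, 2, 3)], e_down((0, 1, 2, 3)), contraction)
```

Every nonvanishing component carries exactly three space indices, so lowering all four indices reverses the sign of each of the 24 components, which is the origin of the minus sign in the complete contraction.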
With respect to rotations of the coordinate system, the quantities $e^{iklm}$ behave like the
components of a tensor; but if we change the sign of one or three of the coordinates the
components $e^{iklm}$, being defined as the same in all coordinate systems, do not change, whereas
the components of a tensor should change sign. Thus $e^{iklm}$ is, strictly speaking, not a tensor,
but rather a pseudotensor. Pseudotensors of any rank, in particular pseudoscalars, behave
like tensors under all coordinate transformations except those that cannot be reduced to
rotations, i.e. reflections, which are changes in sign of the coordinates that are not reducible
to a rotation.
The products $e^{iklm}e^{prst}$ form a four-tensor of rank 8, which is a true tensor; by contracting
on one or more pairs of indices, one obtains tensors of rank 6, 4, and 2. All these tensors
have the same form in all coordinate systems. Thus their components must be expressed as
combinations of products of components of the unit tensor $\delta^i_k$, the only true tensor whose
components are the same in all coordinate systems. These combinations can easily be found
by starting from the symmetries that they must possess under permutation of indices.†
If $A^{ik}$ is an antisymmetric tensor, the tensor $A^{ik}$ and the pseudotensor $A^{*ik} = \frac{1}{2}e^{iklm}A_{lm}$
are said to be dual to one another. Similarly, $e^{iklm}A_m$ is an antisymmetric pseudotensor of
rank 3, dual to the vector $A^i$. The product $A^{ik}A^*_{ik}$ of dual tensors is obviously a pseudoscalar.
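† For reference we give the following formulas:
$$e^{iklm}e_{prst} = -\begin{vmatrix} \delta^i_p & \delta^i_r & \delta^i_s & \delta^i_t \\ \delta^k_p & \delta^k_r & \delta^k_s & \delta^k_t \\ \delta^l_p & \delta^l_r & \delta^l_s & \delta^l_t \\ \delta^m_p & \delta^m_r & \delta^m_s & \delta^m_t \end{vmatrix}, \quad e^{iklm}e_{prsm} = -\begin{vmatrix} \delta^i_p & \delta^i_r & \delta^i_s \\ \delta^k_p & \delta^k_r & \delta^k_s \\ \delta^l_p & \delta^l_r & \delta^l_s \end{vmatrix},$$
$$e^{iklm}e_{prlm} = -2(\delta^i_p\delta^k_r - \delta^i_r\delta^k_p), \quad e^{iklm}e_{pklm} = -6\delta^i_p.$$
The overall coefficient in these formulas can be checked using the result of a complete contraction, which
should give (6.9).
As a consequence of these formulas we have:
$$e^{prst}A_{ip}A_{kr}A_{ls}A_{mt} = -Ae_{iklm},$$
$$e^{iklm}e^{prst}A_{ip}A_{kr}A_{ls}A_{mt} = 24A,$$
where A is the determinant formed from the quantities $A_{ik}$.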
In this connection we note some analogous properties of three-dimensional vectors and
tensors. The completely antisymmetric unit pseudotensor of rank 3 is the set of quantities
$e_{\alpha\beta\gamma}$ which change sign under any transposition of a pair of indices. The only nonvanishing
components of $e_{\alpha\beta\gamma}$ are those with three different indices. We set $e_{xyz} = 1$; the others are 1
or -1, depending on whether the sequence $\alpha$, $\beta$, $\gamma$ can be brought to the order x, y, z by an
even or an odd number of transpositions.†
The products $e_{\alpha\beta\gamma}e_{\lambda\mu\nu}$ form a true three-dimensional tensor of rank 6, and are therefore
expressible as combinations of products of components of the unit three-tensor $\delta_{\alpha\beta}$.‡
Under a reflection of the coordinate system, i.e. under a change in sign of all the
coordinates, the components of an ordinary vector also change sign. Such vectors are said to
be polar. The components of a vector that can be written as the cross product of two polar
vectors do not change sign under inversion. Such vectors are said to be axial. The scalar
product of a polar and an axial vector is not a true scalar, but rather a pseudoscalar: it
changes sign under a coordinate inversion. An axial vector is a pseudovector, dual to some
antisymmetric tensor. Thus, if $\mathbf{C} = \mathbf{A}\times\mathbf{B}$, then
$$C_\alpha = \tfrac{1}{2}e_{\alpha\beta\gamma}C_{\beta\gamma}, \quad \text{where } C_{\beta\gamma} = A_\beta B_\gamma - A_\gamma B_\beta.$$
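The duality between the cross product and the antisymmetric tensor $C_{\beta\gamma}$ can be verified componentwise. In the sketch below the indices x, y, z are represented by 0, 1, 2; the function names and the closed-form expression used for $e_{\alpha\beta\gamma}$ are our own illustrative choices.

```python
def e3(a, b, c):
    """Antisymmetric unit pseudotensor with e_xyz = 1 (indices 0, 1, 2 for x, y, z)."""
    # The product is +2 or -2 for even or odd permutations and 0 for repeated indices.
    return (a - b) * (b - c) * (c - a) // 2


def cross(A, B):
    """Ordinary cross product of two three-dimensional vectors."""
    return [A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0]]


def dual_vector(A, B):
    """C_alpha = (1/2) e_{alpha beta gamma} C_{beta gamma}, with C_{beta gamma} = A_b B_g - A_g B_b."""
    C2 = [[A[b] * B[g] - A[g] * B[b] for g in range(3)] for b in range(3)]
    return [sum(e3(a, b, g) * C2[b][g] for b in range(3) for g in range(3)) / 2
            for a in range(3)]


A, B = [1, 2, 3], [4, 5, 6]
print(cross(A, B), dual_vector(A, B))

# Inversion of the coordinates changes the sign of A and B but not of their cross product.
A_inv, B_inv = [-x for x in A], [-x for x in B]
print(cross(A_inv, B_inv))
```

Both constructions give the same components, and the last line shows the behaviour under inversion that characterizes an axial vector.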
Now consider four-tensors. The space components (i, k = 1, 2, 3) of the antisymmetric
tensor $A^{ik}$ form a three-dimensional antisymmetric tensor with respect to purely spatial
transformations; according to our statement its components can be expressed in terms of
the components of a three-dimensional axial vector. With respect to these same
transformations the components $A^{01}$, $A^{02}$, $A^{03}$ form a three-dimensional polar vector. Thus the
components of an antisymmetric four-tensor can be written as a matrix:
$$(A^{ik}) = \begin{pmatrix} 0 & p_x & p_y & p_z \\ -p_x & 0 & -a_z & a_y \\ -p_y & a_z & 0 & -a_x \\ -p_z & -a_y & a_x & 0 \end{pmatrix} \tag{6.10}$$
where, with respect to spatial transformations, $\mathbf{p}$ and $\mathbf{a}$ are polar and axial vectors,
respectively. In enumerating the components of an antisymmetric four-tensor, we shall write
them in the form
$$A^{ik} = (\mathbf{p}, \mathbf{a});$$
then the covariant components of the same tensor are
$$A_{ik} = (-\mathbf{p}, \mathbf{a}).$$
Finally we consider certain differential and integral operations of four-dimensional tensor
analysis.
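† The fact that the components of the four-tensor $e^{iklm}$ are unchanged under rotations of the
four-dimensional coordinate system, and that the components of the three-tensor $e_{\alpha\beta\gamma}$ are unchanged by rotations
of the space axes are special cases of a general rule: any completely antisymmetric tensor of rank equal to
the number of dimensions of the space in which it is defined is invariant under rotations of the coordinate
system in the space.
‡ For reference, we give the appropriate formulas:
$$e_{\alpha\beta\gamma}e_{\lambda\mu\nu} = \begin{vmatrix} \delta_{\alpha\lambda} & \delta_{\alpha\mu} & \delta_{\alpha\nu} \\ \delta_{\beta\lambda} & \delta_{\beta\mu} & \delta_{\beta\nu} \\ \delta_{\gamma\lambda} & \delta_{\gamma\mu} & \delta_{\gamma\nu} \end{vmatrix}.$$
Contracting this tensor on one, two and three pairs of indices, we get:
$$e_{\alpha\beta\gamma}e_{\lambda\mu\gamma} = \delta_{\alpha\lambda}\delta_{\beta\mu} - \delta_{\alpha\mu}\delta_{\beta\lambda},$$
$$e_{\alpha\beta\gamma}e_{\lambda\beta\gamma} = 2\delta_{\alpha\lambda},$$
$$e_{\alpha\beta\gamma}e_{\alpha\beta\gamma} = 6.$$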
The four-gradient of a scalar $\phi$ is the four-vector
$$\frac{\partial\phi}{\partial x^i} = \left(\frac{1}{c}\frac{\partial\phi}{\partial t}, \nabla\phi\right).$$
We must remember that these derivatives are to be regarded as the covariant components
of the four-vector. In fact, the differential of the scalar
$$d\phi = \frac{\partial\phi}{\partial x^i}dx^i$$
is also a scalar; from its form (scalar product of two four-vectors) our assertion is obvious.
In general, the operators of differentiation with respect to the coordinates $x^i$, $\partial/\partial x^i$,
should be regarded as the covariant components of the operator four-vector. Thus, for
example, the divergence of a four-vector, the expression $\partial A^i/\partial x^i$, in which we differentiate
the contravariant components $A^i$, is a scalar.†
In three-dimensional space one can extend integrals over a volume, a surface or a curve.
In four-dimensional space there are four types of integrations:
(1) Integral over a curve in four-space. The element of integration is the line element, i.e.
the four-vector $dx^i$.
(2) Integral over a (two-dimensional) surface in four-space. As we know, in three-space
the projections of the area of the parallelogram formed from the vectors $d\mathbf{r}$
and $d\mathbf{r}'$ on the coordinate planes $x_\alpha x_\beta$ are $dx_\alpha dx'_\beta - dx_\beta dx'_\alpha$. Analogously, in four-space
the infinitesimal element of surface is given by the antisymmetric tensor of
second rank $df^{ik} = dx^idx'^k - dx^kdx'^i$; its components are the projections of the element of
area on the coordinate planes. In three-dimensional space, as we know, one uses as surface
element in place of the tensor $df_{\alpha\beta}$ the vector $df_\alpha$ dual to the tensor $df_{\alpha\beta}$: $df_\alpha = \frac{1}{2}e_{\alpha\beta\gamma}df_{\beta\gamma}$.
Geometrically this is a vector normal to the surface element and equal in absolute magnitude
to the area of the element. In four-space we cannot construct such a vector, but we
can construct the tensor $df^{*ik}$ dual to the tensor $df^{ik}$,
$$df^{*ik} = \tfrac{1}{2}e^{iklm}df_{lm}. \tag{6.11}$$
Geometrically it describes an element of surface equal to and "normal" to the element of
surface $df^{ik}$; all segments lying in it are orthogonal to all segments in the element $df^{ik}$.
It is obvious that $df^{ik}df^*_{ik} = 0$.
† If we differentiate with respect to the "covariant coordinates" $x_i$, then the derivatives
$$\frac{\partial\phi}{\partial x_i} = \left(\frac{1}{c}\frac{\partial\phi}{\partial t},\ -\nabla\phi\right)$$
form the contravariant components of a four-vector. We shall use this form only in exceptional cases [for example, for writing the square of the four-gradient $(\partial\phi/\partial x^i)(\partial\phi/\partial x_i)$].
We note that in the literature partial derivatives with respect to the coordinates are often abbreviated using the symbols
$$\partial^i = \frac{\partial}{\partial x_i},\qquad \partial_i = \frac{\partial}{\partial x^i}.$$
In this form of writing of the differentiation operators, the co- or contravariant character of quantities formed with them is explicit.

(3) Integral over a hypersurface, i.e. over a three-dimensional manifold. In three-dimensional space the volume of the parallelepiped spanned by three vectors is equal to the determinant of the third rank formed from the components of the vectors. One obtains analogously the projections of the volume of the parallelepiped (i.e. the "areas" of the hypersurface) spanned by three four-vectors $dx^i$, $dx'^i$, $dx''^i$; they are given by the determinants
$$dS^{ikl} = \begin{vmatrix} dx^i & dx'^i & dx''^i \\ dx^k & dx'^k & dx''^k \\ dx^l & dx'^l & dx''^l \end{vmatrix},$$
which form a tensor of rank 3, antisymmetric in all three indices. As element of integration over the hypersurface, it is more convenient to use the four-vector $dS^i$, dual to the tensor $dS^{ikl}$:
$$dS^i = -\tfrac{1}{6}e^{iklm}\,dS_{klm},\qquad dS_{klm} = e_{nklm}\,dS^n. \tag{6.12}$$
Here
$$dS^0 = dS^{123},\qquad dS^1 = dS^{023},\ \ldots$$
Geometrically $dS^i$ is a four-vector equal in magnitude to the "area" of the hypersurface element, and normal to this element (i.e. perpendicular to all lines drawn in the hypersurface element). In particular, $dS^0 = dx\,dy\,dz$, i.e. it is the element of three-dimensional volume $dV$, the projection of the hypersurface element on the hyperplane $x^0 = \text{const}$.
(4) Integral over a four-dimensional volume; the element of integration is the scalar
$$d\Omega = dx^0\,dx^1\,dx^2\,dx^3 = c\,dt\,dV. \tag{6.13}$$
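Analogous to the theorems of Gauss and Stokes in three-dimensional vector analysis, there are theorems that enable us to transform four-dimensional integrals.
The integral over a closed hypersurface can be transformed into an integral over the four-volume contained within it by replacing the element of integration $dS_i$ by the operator
$$dS_i \to d\Omega\,\frac{\partial}{\partial x^i}. \tag{6.14}$$
For example, for the integral of a vector $A^i$ we have:
$$\oint A^i\,dS_i = \int \frac{\partial A^i}{\partial x^i}\,d\Omega. \tag{6.15}$$
This formula is the generalization of Gauss' theorem.
An integral over a two-dimensional surface is transformed into an integral over the hypersurface "spanning" it by replacing the element of integration $df^*_{ik}$ by the operator
$$df^*_{ik} \to dS_i\,\frac{\partial}{\partial x^k} - dS_k\,\frac{\partial}{\partial x^i}. \tag{6.16}$$
For example, for the integral of an antisymmetric tensor $A^{ik}$ we have:
$$\frac{1}{2}\int A^{ik}\,df^*_{ik} = \frac{1}{2}\int\left(dS_i\,\frac{\partial A^{ik}}{\partial x^k} - dS_k\,\frac{\partial A^{ik}}{\partial x^i}\right) = \int dS_i\,\frac{\partial A^{ik}}{\partial x^k}. \tag{6.17}$$
The integral over a four-dimensional closed curve is transformed into an integral over the surface spanning it by the substitution:
$$dx^i \to df^{ki}\,\frac{\partial}{\partial x^k}. \tag{6.18}$$
Thus for the integral of a vector, we have:
$$\oint A_i\,dx^i = \int df^{ki}\,\frac{\partial A_i}{\partial x^k} = \frac{1}{2}\int df^{ik}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right), \tag{6.19}$$
which is the generalization of Stokes' theorem.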
PROBLEMS
1. Find the law of transformation of the components of a symmetric four-tensor $A^{ik}$ under Lorentz transformations (6.1).
Solution: Considering the components of the tensor as products of components of two four-vectors, we get:
$$A^{00} = \frac{1}{1-\frac{V^2}{c^2}}\left(A'^{00} + 2\frac{V}{c}A'^{01} + \frac{V^2}{c^2}A'^{11}\right),\qquad A^{11} = \frac{1}{1-\frac{V^2}{c^2}}\left(A'^{11} + 2\frac{V}{c}A'^{01} + \frac{V^2}{c^2}A'^{00}\right),$$
$$A^{22} = A'^{22},\qquad A^{23} = A'^{23},\qquad A^{12} = \frac{1}{\sqrt{1-\frac{V^2}{c^2}}}\left(A'^{12} + \frac{V}{c}A'^{02}\right),$$
$$A^{01} = \frac{1}{1-\frac{V^2}{c^2}}\left[A'^{01}\left(1+\frac{V^2}{c^2}\right) + \frac{V}{c}A'^{00} + \frac{V}{c}A'^{11}\right],$$
$$A^{02} = \frac{1}{\sqrt{1-\frac{V^2}{c^2}}}\left(A'^{02} + \frac{V}{c}A'^{12}\right),$$
and analogous formulas for $A^{33}$, $A^{13}$ and $A^{03}$.
2. The same for the antisymmetric tensor $A^{ik}$.
Solution: Since the coordinates $x^2$ and $x^3$ do not change, the tensor component $A^{23}$ does not change, while the components $A^{12}$, $A^{13}$ and $A^{02}$, $A^{03}$ transform like $x^1$ and $x^0$:
$$A^{23} = A'^{23},\qquad A^{12} = \frac{A'^{12} + \frac{V}{c}A'^{02}}{\sqrt{1-\frac{V^2}{c^2}}},\qquad A^{02} = \frac{A'^{02} + \frac{V}{c}A'^{12}}{\sqrt{1-\frac{V^2}{c^2}}},$$
and similarly for $A^{13}$, $A^{03}$.
With respect to rotations of the two-dimensional coordinate system in the plane $x^0x^1$ (which are the transformations we are considering) the components $A^{01} = -A^{10}$, $A^{00} = A^{11} = 0$, form an antisymmetric tensor of rank two, equal to the number of dimensions of the space. Thus (see the remark on p. 18) these components are not changed by the transformations:
$$A^{01} = A'^{01}.$$
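The transformation laws found in these two problems can be verified numerically by applying the boost matrix twice to a rank-two tensor. This is only a sketch: it assumes that (6.1) is the boost along the $x$ axis written as $x^0 = \gamma(x'^0 + \beta x'^1)$, $x^1 = \gamma(x'^1 + \beta x'^0)$ with $\beta = V/c$, and the function names are illustrative.

```python
import numpy as np

def boost_matrix(beta):
    """Matrix L with x = L @ x_prime for a boost along x (beta = V/c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = gamma * beta
    return L

def transform_tensor(A_prime, beta):
    """Contravariant rank-2 tensor: A^{ik} = L^i_l L^k_m A'^{lm}."""
    L = boost_matrix(beta)
    return L @ A_prime @ L.T

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# Symmetric tensor (Problem 1): compare A^{00} with the closed formula.
S_prime = M + M.T
S = transform_tensor(S_prime, beta)
S00_formula = gamma**2 * (S_prime[0, 0] + 2 * beta * S_prime[0, 1]
                          + beta**2 * S_prime[1, 1])

# Antisymmetric tensor (Problem 2): A^{01} and A^{23} should be unchanged.
F_prime = M - M.T
F = transform_tensor(F_prime, beta)
```

The matrix form makes it clear why $A^{01}$ of an antisymmetric tensor is unchanged: the two contributions $\gamma^2 A'^{01}$ and $\gamma^2\beta^2 A'^{10}$ combine to $A'^{01}$.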
§ 7. Four-dimensional velocity
From the ordinary three-dimensional velocity vector one can form a four-vector. This four-dimensional velocity (four-velocity) of a particle is the vector
$$u^i = \frac{dx^i}{ds}. \tag{7.1}$$
To find its components, we note that according to (3.1),
$$ds = c\,dt\,\sqrt{1-\frac{v^2}{c^2}},$$
where $v$ is the ordinary three-dimensional velocity of the particle. Thus
$$u^1 = \frac{dx^1}{ds} = \frac{dx}{c\,dt\,\sqrt{1-\frac{v^2}{c^2}}} = \frac{v_x}{c\sqrt{1-\frac{v^2}{c^2}}},$$
etc. Thus
$$u^i = \left(\frac{1}{\sqrt{1-\frac{v^2}{c^2}}},\ \frac{\mathbf{v}}{c\sqrt{1-\frac{v^2}{c^2}}}\right). \tag{7.2}$$
Note that the four-velocity is a dimensionless quantity.
The components of the four-velocity are not independent. Noting that $dx_i\,dx^i = ds^2$, we have
$$u^iu_i = 1. \tag{7.3}$$
Geometrically, $u^i$ is a unit four-vector tangent to the world line of the particle.
Similarly to the definition of the four-velocity, the second derivative
$$w^i = \frac{d^2x^i}{ds^2} = \frac{du^i}{ds}$$
may be called the four-acceleration. Differentiating formula (7.3), we find:
$$u_iw^i = 0, \tag{7.4}$$
i.e. the four-vectors of velocity and acceleration are "mutually perpendicular".
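As a small illustration of (7.2) and (7.3), the components of the four-velocity can be built from an ordinary velocity and the normalization checked directly. The sketch below assumes the metric signature (+, -, -, -) used in this chapter; the names `four_velocity` and `minkowski_dot` are introduced only for this example.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # signature (+, -, -, -)

def four_velocity(v, c=1.0):
    """Contravariant components u^i = (gamma, gamma * v / c) for a 3-velocity v."""
    v = np.asarray(v, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v / c**2)
    return np.concatenate(([gamma], gamma * v / c))

def minkowski_dot(a, b):
    """Scalar product a^i b_i of two four-vectors given by contravariant components."""
    return a @ METRIC @ b

u = four_velocity([0.3, -0.4, 0.5])      # velocity in units of c
norm = minkowski_dot(u, u)               # should equal 1 by (7.3)
```

Because the velocity enters only through $\mathbf{v}/c$, the same result is obtained in any system of units, which reflects the remark that the four-velocity is dimensionless.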
PROBLEM
Determine the relativistic uniformly accelerated motion, i.e. the rectilinear motion for which the acceleration $w$ in the proper reference frame (at each instant of time) remains constant.
Solution: In the reference frame in which the particle velocity is $v = 0$, the components of the four-acceleration are $w^i = (0, w/c^2, 0, 0)$ (where $w$ is the ordinary three-dimensional acceleration, which is directed along the $x$ axis). The relativistically invariant condition for uniform acceleration must be expressed by the constancy of the four-scalar which coincides with $w^2$ in the proper reference frame:
$$w^iw_i = \text{const} \equiv -\frac{w^2}{c^4}.$$
In the "fixed" frame, with respect to which the motion is observed, writing out the expression for $w^iw_i$ gives the equation
$$\frac{d}{dt}\frac{v}{\sqrt{1-\frac{v^2}{c^2}}} = w,\qquad\text{or}\qquad \frac{v}{\sqrt{1-\frac{v^2}{c^2}}} = wt + \text{const}.$$
Setting $v = 0$ for $t = 0$, we find that const = 0, so that
$$v = \frac{wt}{\sqrt{1+\frac{w^2t^2}{c^2}}}.$$
Integrating once more and setting $x = 0$ for $t = 0$, we find:
$$x = \frac{c^2}{w}\left(\sqrt{1+\frac{w^2t^2}{c^2}} - 1\right).$$
For $wt \ll c$, these formulas go over into the classical expressions $v = wt$, $x = wt^2/2$. For $wt \to \infty$, the velocity tends toward the constant value $c$.
The proper time of a uniformly accelerated particle is given by the integral
$$\int_0^t \sqrt{1-\frac{v^2}{c^2}}\,dt = \frac{c}{w}\sinh^{-1}\frac{wt}{c}.$$
As $t \to \infty$, it increases much more slowly than $t$, according to the law $(c/w)\ln(2wt/c)$.
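The limiting behaviour quoted in this solution is easy to check numerically. The sketch below assumes the closed forms $v = wt/\sqrt{1+w^2t^2/c^2}$, $x = (c^2/w)(\sqrt{1+w^2t^2/c^2}-1)$ and proper time $(c/w)\sinh^{-1}(wt/c)$; the function names are illustrative only.

```python
import math

def velocity(t, w, c=1.0):
    """Velocity of the uniformly accelerated particle in the fixed frame."""
    return w * t / math.sqrt(1.0 + (w * t / c) ** 2)

def position(t, w, c=1.0):
    """Coordinate x(t) with x = 0 at t = 0."""
    return (c**2 / w) * (math.sqrt(1.0 + (w * t / c) ** 2) - 1.0)

def proper_time(t, w, c=1.0):
    """Proper time elapsed on the particle's clock up to fixed-frame time t."""
    return (c / w) * math.asinh(w * t / c)

w = 1.0
# Classical limit (w t << c): v ~ w t and x ~ w t^2 / 2.
v_small, x_small = velocity(1e-3, w), position(1e-3, w)
# Late times (w t >> c): v approaches c, proper time grows logarithmically.
v_large = velocity(1e4, w)
tau_large = proper_time(1e4, w)
tau_asymptotic = math.log(2.0 * w * 1e4)
```

For $wt/c = 10^4$ the proper time differs from the asymptotic law $(c/w)\ln(2wt/c)$ only at the level of $c^2/(4w^2t^2)$.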
CHAPTER 2
RELATIVISTIC MECHANICS
§ 8. The principle of least action
In studying the motion of material particles, we shall start from the Principle of Least Action. The principle of least action is defined, as we know, by the statement that for each mechanical system there exists a certain integral $S$, called the action, which has a minimum value for the actual motion, so that its variation $\delta S$ is zero.†

† Strictly speaking, the principle of least action asserts that the integral $S$ must be a minimum only for infinitesimal lengths of the path of integration. For paths of arbitrary length we can say only that $S$ must be an extremum, not necessarily a minimum. (See Mechanics, § 2.)

To determine the action integral for a free material particle (a particle not under the influence of any external force), we note that this integral must not depend on our choice of reference system, that is, it must be invariant under Lorentz transformations. Then it follows that it must depend on a scalar. Furthermore, it is clear that the integrand must be a differential of the first order. But the only scalar of this kind that one can construct for a free particle is the interval $ds$, or $\alpha\,ds$, where $\alpha$ is some constant. So for a free particle the action must have the form
$$S = -\alpha\int_a^b ds,$$
where $\int_a^b$ is an integral along the world line of the particle between the two particular events of the arrival of the particle at the initial position and at the final position at definite times $t_1$ and $t_2$, i.e. between two given world points; and $\alpha$ is some constant characterizing the particle. It is easy to see that $\alpha$ must be a positive quantity for all particles. In fact, as we saw in § 3, $\int_a^b ds$ has its maximum value along a straight world line; by integrating along a curved world line we can make the integral arbitrarily small. Thus the integral $\int_a^b ds$ with the positive sign cannot have a minimum; with the opposite sign it clearly has a minimum, along the straight world line.
The action integral can be represented as an integral with respect to the time
$$S = \int_{t_1}^{t_2} L\,dt.$$
The coefficient $L$ of $dt$ represents the Lagrange function of the mechanical system. With the aid of (3.1), we find:
$$S = -\int_{t_1}^{t_2} \alpha c\,\sqrt{1-\frac{v^2}{c^2}}\,dt,$$
where $v$ is the velocity of the material particle. Consequently the Lagrangian for the particle is
$$L = -\alpha c\,\sqrt{1-v^2/c^2}.$$
The quantity $\alpha$, as already mentioned, characterizes the particle. In classical mechanics each particle is characterized by its mass $m$. Let us find the relation between $\alpha$ and $m$. It can be determined from the fact that in the limit as $c \to \infty$, our expression for $L$ must go over into the classical expression $L = mv^2/2$. To carry out this transition we expand $L$ in powers of $v/c$. Then, neglecting terms of higher order, we find
$$L = -\alpha c\,\sqrt{1-\frac{v^2}{c^2}} \approx -\alpha c + \frac{\alpha v^2}{2c}.$$
Constant terms in the Lagrangian do not affect the equation of motion and can be omitted. Omitting the constant $\alpha c$ in $L$ and comparing with the classical expression $L = mv^2/2$, we find that $\alpha = mc$.
Thus the action for a free material point is
$$S = -mc\int_a^b ds \tag{8.1}$$
and the Lagrangian is
$$L = -mc^2\sqrt{1-\frac{v^2}{c^2}}. \tag{8.2}$$
§ 9. Energy and momentum
By the momentum of a particle we can mean the vector $\mathbf{p} = \partial L/\partial\mathbf{v}$ ($\partial L/\partial\mathbf{v}$ is the symbolic representation of the vector whose components are the derivatives of $L$ with respect to the corresponding components of $\mathbf{v}$). Using (8.2), we find:
$$\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1-\frac{v^2}{c^2}}}. \tag{9.1}$$
For small velocities ($v \ll c$) or, in the limit as $c \to \infty$, this expression goes over into the classical $\mathbf{p} = m\mathbf{v}$. For $v = c$, the momentum becomes infinite.
The time derivative of the momentum is the force acting on the particle. Suppose the velocity of the particle changes only in direction, that is, suppose the force is directed perpendicular to the velocity. Then
$$\frac{d\mathbf{p}}{dt} = \frac{m}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{d\mathbf{v}}{dt}. \tag{9.2}$$
If the velocity changes only in magnitude, that is, if the force is parallel to the velocity, then
$$\frac{d\mathbf{p}}{dt} = \frac{m}{\left(1-\frac{v^2}{c^2}\right)^{3/2}}\,\frac{d\mathbf{v}}{dt}. \tag{9.3}$$
We see that the ratio of force to acceleration is different in the two cases.
The energy $\mathcal{E}$ of the particle is defined as the quantity†
$$\mathcal{E} = \mathbf{p}\cdot\mathbf{v} - L.$$
Substituting the expressions (8.2) and (9.1) for $L$ and $\mathbf{p}$, we find
$$\mathcal{E} = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}}. \tag{9.4}$$
This very important formula shows, in particular, that in relativistic mechanics the energy of a free particle does not go to zero for $v = 0$, but rather takes on a finite value
$$\mathcal{E} = mc^2. \tag{9.5}$$
This quantity is called the rest energy of the particle.
For small velocities ($v/c \ll 1$), we have, expanding (9.4) in series in powers of $v/c$,
$$\mathcal{E} \approx mc^2 + \frac{mv^2}{2},$$
which, except for the rest energy, is the classical expression for the kinetic energy of a particle.
We emphasize that, although we speak of a "particle", we have nowhere made use of the
fact that it is "elementary". Thus the formulas are equally applicable to any composite body
consisting of many particles, where by m we mean the total mass of the body and by v the
velocity of its motion as a whole. In particular, formula (9.5) is valid for any body which is at
rest as a whole. We call attention to the fact that in relativistic mechanics the energy of a free
body (i.e. the energy of any closed system) is a completely definite quantity which is always
positive and is directly related to the mass of the body. In this connection we recall that in
classical mechanics the energy of a body is defined only to within an arbitrary constant, and
can be either positive or negative.
The energy of a body at rest contains, in addition to the rest energies of its constituent particles, the kinetic energy of the particles and the energy of their interactions with one another. In other words, $mc^2$ is not equal to $\sum m_ac^2$ (where $m_a$ are the masses of the particles), and so $m$ is not equal to $\sum m_a$. Thus in relativistic mechanics the law of conservation of mass does not hold: the mass of a composite body is not equal to the sum of the masses of its parts. Instead only the law of conservation of energy, in which the rest energies of the particles are included, is valid.
Squaring (9.1) and (9.4) and comparing the results, we get the following relation between the energy and momentum of a particle:
$$\frac{\mathcal{E}^2}{c^2} = p^2 + m^2c^2. \tag{9.6}$$
The energy expressed in terms of the momentum is called the Hamiltonian function $\mathcal{H}$:
$$\mathcal{H} = c\sqrt{p^2 + m^2c^2}. \tag{9.7}$$
For low velocities, $p \ll mc$, and we have approximately
$$\mathcal{H} \approx mc^2 + \frac{p^2}{2m},$$
i.e., except for the rest energy we get the familiar classical expression for the Hamiltonian.

† See Mechanics, § 6.

From (9.1) and (9.4) we get the following relation between the energy, momentum, and velocity of a free particle:
$$\mathbf{p} = \frac{\mathcal{E}\mathbf{v}}{c^2}. \tag{9.8}$$
For $v = c$, the momentum and energy of the particle become infinite. This means that a particle with mass $m$ different from zero cannot move with the velocity of light. Nevertheless, in relativistic mechanics, particles of zero mass moving with the velocity of light can exist.† From (9.8) we have for such particles:
$$p = \frac{\mathcal{E}}{c}. \tag{9.9}$$
The same formula also holds approximately for particles with nonzero mass in the so-called ultrarelativistic case, when the particle energy $\mathcal{E}$ is large compared to its rest energy $mc^2$.
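The relations (9.1), (9.4) and (9.6)-(9.8) are mutually consistent, which a short numerical sketch makes concrete. The helper names below are illustrative, and the motion is taken along a single axis for simplicity.

```python
import math

def energy(m, v, c=1.0):
    """Energy of a free particle, formula (9.4)."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)

def momentum(m, v, c=1.0):
    """Magnitude of the momentum, formula (9.1)."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)

def hamiltonian(m, p, c=1.0):
    """Energy expressed through the momentum, formula (9.7)."""
    return c * math.sqrt(p**2 + (m * c) ** 2)

m, v = 2.0, 0.6                 # units with c = 1
E, p = energy(m, v), momentum(m, v)

# Ultrarelativistic case: p c / E tends to 1 as v approaches c.
ratio_ultra = momentum(m, 0.999999) / energy(m, 0.999999)
```

With $m = 2$ and $v = 0.6$ (in units of $c$) the factor $1/\sqrt{1-v^2}$ is 1.25, so $\mathcal{E} = 2.5$ and $p = 1.5$, and indeed $2.5^2 = 1.5^2 + 2^2$.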
We now write all our formulas in four-dimensional form. According to the principle of least action,
$$\delta S = -mc\,\delta\int_a^b ds = 0.$$
To set up the expression for $\delta S$, we note that $ds = \sqrt{dx_i\,dx^i}$ and therefore
$$\delta S = -mc\int_a^b \frac{dx_i\,\delta dx^i}{ds} = -mc\int_a^b u_i\,d\delta x^i.$$
Integrating by parts, we obtain
$$\delta S = -mc\,u_i\,\delta x^i\Big|_a^b + mc\int_a^b \delta x^i\,\frac{du_i}{ds}\,ds. \tag{9.10}$$
As we know, to get the equations of motion we compare different trajectories between the same two points, i.e. at the limits $(\delta x^i)_a = (\delta x^i)_b = 0$. The actual trajectory is then determined from the condition $\delta S = 0$. From (9.10) we thus obtain the equations $du_i/ds = 0$; that is, a constant velocity for the free particle in four-dimensional form.
To determine the variation of the action as a function of the coordinates, one must consider the point $a$ as fixed, so that $(\delta x^i)_a = 0$. The second point is to be considered as variable, but only actual trajectories are admissible, i.e., those which satisfy the equations of motion. Therefore the integral in expression (9.10) for $\delta S$ is zero. In place of $(\delta x^i)_b$ we may write simply $\delta x^i$, and thus obtain
$$\delta S = -mc\,u_i\,\delta x^i. \tag{9.11}$$
The four-vector
$$p_i = -\frac{\partial S}{\partial x^i} \tag{9.12}$$
is called the momentum four-vector. As we know from mechanics, the derivatives $\partial S/\partial x$, $\partial S/\partial y$, $\partial S/\partial z$ are the three components of the momentum vector $\mathbf{p}$ of the particle, while the derivative $-\partial S/\partial t$ is the particle energy $\mathcal{E}$. Thus the covariant components of the four-momentum are $p_i = (\mathcal{E}/c, -\mathbf{p})$, while the contravariant components are†
$$p^i = (\mathcal{E}/c, \mathbf{p}). \tag{9.13}$$
From (9.11) we see that the components of the four-momentum of a free particle are:
$$p^i = mcu^i. \tag{9.14}$$
Substituting the components of the four-velocity from (7.2), we see that we actually get expressions (9.1) and (9.4) for $\mathbf{p}$ and $\mathcal{E}$.

† For example, light quanta and neutrinos.
Thus, in relativistic mechanics, momentum and energy are the components of a single four-vector. From this we immediately get the formulas for transformation of momentum and energy from one inertial system to another. Substituting (9.13) in the general formulas (6.1) for transformation of four-vectors, we find:
$$p_x = \frac{p'_x + \frac{V}{c^2}\mathcal{E}'}{\sqrt{1-\frac{V^2}{c^2}}},\qquad p_y = p'_y,\qquad p_z = p'_z,\qquad \mathcal{E} = \frac{\mathcal{E}' + Vp'_x}{\sqrt{1-\frac{V^2}{c^2}}}, \tag{9.15}$$
where $p_x$, $p_y$, $p_z$ are the components of the three-dimensional vector $\mathbf{p}$.
From the definition (9.14) of the four-momentum, and the identity $u^iu_i = 1$, we have, for the square of the four-momentum of a free particle:
$$p_ip^i = m^2c^2. \tag{9.16}$$
Substituting the expressions (9.13), we get back (9.6).
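A brief numerical sketch of the transformation (9.15) and of the invariance of the square (9.16) follows. It assumes the convention that the primed system moves with velocity $V$ along the $x$ axis relative to the unprimed one; the function names are illustrative.

```python
import math

def boost_four_momentum(E_prime, p_prime, V, c=1.0):
    """Transform (E', p') in the system K' to (E, p) in K, following (9.15)."""
    px_p, py_p, pz_p = p_prime
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    E = g * (E_prime + V * px_p)
    px = g * (px_p + V * E_prime / c**2)
    return E, (px, py_p, pz_p)

def invariant_square(E, p, c=1.0):
    """The square p_i p^i = E^2/c^2 - p^2, equal to m^2 c^2 for a free particle."""
    return (E / c) ** 2 - sum(x * x for x in p)

# A particle of unit mass at rest in K' (c = 1), viewed from K with V = 0.6.
E, p = boost_four_momentum(1.0, (0.0, 0.0, 0.0), 0.6)

# A moving particle of mass 2: the square is the same before and after the boost.
m = 2.0
p_prime = (0.3, 0.4, 1.2)
E_prime = math.sqrt(m**2 + sum(x * x for x in p_prime))
E2, p2 = boost_four_momentum(E_prime, p_prime, -0.8)
```

For the particle at rest in $K'$ the result reduces to (9.1) and (9.4) with $v = V$: $\mathcal{E} = 1.25$ and $p_x = 0.75$.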
By analogy with the usual definition of the force, the force four-vector is defined as the derivative:
$$g^i = \frac{dp^i}{ds} = mc\,\frac{du^i}{ds}. \tag{9.17}$$
Its components satisfy the identity $g_iu^i = 0$. The components of this four-vector are expressed in terms of the usual three-dimensional force vector $\mathbf{f} = d\mathbf{p}/dt$:
$$g^i = \left(\frac{\mathbf{f}\cdot\mathbf{v}}{c^2\sqrt{1-\frac{v^2}{c^2}}},\ \frac{\mathbf{f}}{c\sqrt{1-\frac{v^2}{c^2}}}\right). \tag{9.18}$$
The time component is related to the work done by the force.
† We call attention to a mnemonic for remembering the definition of the physical four-vectors: the contravariant components are related to the corresponding three-dimensional vectors ($\mathbf{r}$ for $x^i$, $\mathbf{p}$ for $p^i$) with the "right", positive sign.

The relativistic Hamilton-Jacobi equation is obtained by substituting the derivatives $-\partial S/\partial x^i$ for $p_i$ in (9.16):
$$\frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x^i} \equiv g^{ik}\frac{\partial S}{\partial x^i}\frac{\partial S}{\partial x^k} = m^2c^2, \tag{9.19}$$
or, writing the sum explicitly:
$$\frac{1}{c^2}\left(\frac{\partial S}{\partial t}\right)^2 - \left(\frac{\partial S}{\partial x}\right)^2 - \left(\frac{\partial S}{\partial y}\right)^2 - \left(\frac{\partial S}{\partial z}\right)^2 = m^2c^2. \tag{9.20}$$
The transition to the limiting case of classical mechanics in equation (9.19) is made as follows. First of all we must notice that just as in the corresponding transition with (9.7), the energy of a particle in relativistic mechanics contains the term $mc^2$, which it does not in classical mechanics. Inasmuch as the action $S$ is related to the energy by $\mathcal{E} = -(\partial S/\partial t)$, in making the transition to classical mechanics we must in place of $S$ substitute a new action $S'$ according to the relation:
$$S = S' - mc^2t.$$
Substituting this in (9.20), we find
$$\frac{1}{2m}\left[\left(\frac{\partial S'}{\partial x}\right)^2 + \left(\frac{\partial S'}{\partial y}\right)^2 + \left(\frac{\partial S'}{\partial z}\right)^2\right] - \frac{1}{2mc^2}\left(\frac{\partial S'}{\partial t}\right)^2 + \frac{\partial S'}{\partial t} = 0.$$
In the limit as $c \to \infty$, this equation goes over into the classical Hamilton-Jacobi equation.
§ 10. Transformation of distribution functions
In various physical problems we have to deal with distribution functions for the momenta of particles: $f(\mathbf{p})\,dp_x\,dp_y\,dp_z$ is the number of particles having momenta with components in given intervals $dp_x$, $dp_y$, $dp_z$ (or, as we say for brevity, the number of particles in a given volume element $dp_x\,dp_y\,dp_z$ in "momentum space"). We are then faced with the problem of finding the law of transformation of the distribution function $f(\mathbf{p})$ when we transform from one reference system to another.
To solve this problem, we first determine the properties of the "volume element" $dp_x\,dp_y\,dp_z$ with respect to Lorentz transformations. If we introduce a four-dimensional coordinate system, on whose axes are marked the components of the four-momentum of a particle, then $dp_x\,dp_y\,dp_z$ can be considered as the fourth component of an element of the hypersurface defined by the equation $p^ip_i = m^2c^2$. The element of hypersurface is a four-vector directed along the normal to the hypersurface; in our case the direction of the normal obviously coincides with the direction of the four-vector $p_i$. From this it follows that the ratio
$$\frac{dp_x\,dp_y\,dp_z}{\mathcal{E}} \tag{10.1}$$
is an invariant quantity, since it is the ratio of corresponding components of two parallel four-vectors.†

† The integration with respect to the element (10.1) can be expressed in four-dimensional form by means of the $\delta$-function (cf. the footnote on p. 70) as an integration with respect to
$$\frac{2}{c}\,\delta(p^ip_i - m^2c^2)\,d^4p,\qquad d^4p = dp^0\,dp^1\,dp^2\,dp^3. \tag{10.1a}$$
The four components $p^i$ are treated as independent variables (with $p^0$ taking on only positive values). Formula (10.1a) is obvious from the following representation of the delta function appearing in it:
$$\delta(p^ip_i - m^2c^2) = \delta\left(p_0^2 - \frac{\mathcal{E}^2}{c^2}\right) = \frac{c}{2\mathcal{E}}\left[\delta\left(p_0 + \frac{\mathcal{E}}{c}\right) + \delta\left(p_0 - \frac{\mathcal{E}}{c}\right)\right], \tag{10.1b}$$
where $\mathcal{E} = c\sqrt{p^2 + m^2c^2}$. This formula in turn follows from formula (V) of the footnote on p. 70.

The number of particles, $f\,dp_x\,dp_y\,dp_z$, is also obviously an invariant, since it does not depend on the choice of reference frame. Writing it in the form
$$f(\mathbf{p})\,\mathcal{E}\,\frac{dp_x\,dp_y\,dp_z}{\mathcal{E}}$$
and using the invariance of the ratio (10.1), we conclude that the product $f(\mathbf{p})\mathcal{E}$ is invariant.
Thus the distribution function in the $K'$ system is related to the distribution function in the $K$ system by the formula
$$f'(\mathbf{p}') = \frac{f(\mathbf{p})\,\mathcal{E}}{\mathcal{E}'}, \tag{10.2}$$
where $\mathbf{p}$ and $\mathcal{E}$ must be expressed in terms of $\mathbf{p}'$ and $\mathcal{E}'$ by using the transformation formulas (9.15).
Let us now return to the invariant expression (10.1). If we introduce "spherical coordinates" in momentum space, the volume element $dp_x\,dp_y\,dp_z$ becomes $p^2\,dp\,do$, where $do$ is the element of solid angle around the direction of the vector $\mathbf{p}$. Noting that $p\,dp = \mathcal{E}\,d\mathcal{E}/c^2$ [from (9.6)], we have:
$$\frac{p^2\,dp\,do}{\mathcal{E}} = \frac{p\,d\mathcal{E}\,do}{c^2}.$$
Thus we find that the expression
$$p\,d\mathcal{E}\,do \tag{10.3}$$
is also invariant.
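The invariance of the ratio (10.1) can also be seen from the Jacobian of the momentum transformation. For a boost along $x$ the components $p_y$, $p_z$ are unchanged, so the Jacobian reduces to $\partial p_x/\partial p'_x$ taken on the mass shell, and it should equal $\mathcal{E}/\mathcal{E}'$. The sketch below checks this by a finite difference, in units with $c = 1$; the function names are illustrative.

```python
import math

def on_shell_energy(p, m):
    """Energy of a particle with momentum p = (px, py, pz) and mass m (c = 1)."""
    return math.sqrt(m * m + sum(x * x for x in p))

def boost_px(p_prime, m, V):
    """x-component of the momentum in K for a particle with momentum p' in K'."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return g * (p_prime[0] + V * on_shell_energy(p_prime, m))

def jacobian(p_prime, m, V, h=1e-6):
    """Central-difference estimate of d(px)/d(px') at fixed py', pz'."""
    px, py, pz = p_prime
    upper = boost_px((px + h, py, pz), m, V)
    lower = boost_px((px - h, py, pz), m, V)
    return (upper - lower) / (2.0 * h)

m, V = 1.0, 0.8
p_prime = (0.3, -1.2, 0.5)
E_prime = on_shell_energy(p_prime, m)
E = (E_prime + V * p_prime[0]) / math.sqrt(1.0 - V * V)
ratio = E / E_prime            # expected value of the Jacobian
```

Since $d^3p = (\mathcal{E}/\mathcal{E}')\,d^3p'$, the quotient $d^3p/\mathcal{E}$ takes the same value in both systems.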
If we are dealing with particles moving with the velocity of light, so that the relation $\mathcal{E} = pc$ (9.9) is valid, the invariant quantity (10.3) can be written as $p\,dp\,do$ or $\mathcal{E}\,d\mathcal{E}\,do$.
§11. Decay of particles
Let us consider the spontaneous decay of a body of mass $M$ into two parts with masses $m_1$ and $m_2$. The law of conservation of energy in the decay, applied in the system of reference in which the body is at rest, gives†
$$M = \mathcal{E}_{10} + \mathcal{E}_{20}, \tag{11.1}$$
where $\mathcal{E}_{10}$ and $\mathcal{E}_{20}$ are the energies of the emerging particles. Since $\mathcal{E}_{10} > m_1$ and $\mathcal{E}_{20} > m_2$, the equality (11.1) can be satisfied only if $M > m_1 + m_2$, i.e. a body can disintegrate spontaneously into parts the sum of whose masses is less than the mass of the body. On the other hand, if $M < m_1 + m_2$, the body is stable (with respect to the particular decay) and does not decay spontaneously. To cause the decay in this case, we would have to supply to the body from outside an amount of energy at least equal to its "binding energy" $(m_1 + m_2 - M)$.
Momentum as well as energy must be conserved in the decay process. Since the initial momentum of the body was zero, the sum of the momenta of the emerging particles must be zero: $\mathbf{p}_{10} + \mathbf{p}_{20} = 0$. Consequently $p_{10}^2 = p_{20}^2$, or
$$\mathcal{E}_{10}^2 - m_1^2 = \mathcal{E}_{20}^2 - m_2^2. \tag{11.2}$$
† In §§ 11-13 we set $c = 1$. In other words the velocity of light is taken as the unit of velocity (so that the dimensions of length and time become the same). This choice is a natural one in relativistic mechanics and greatly simplifies the writing of formulas. However, in this book (which also contains a considerable amount of nonrelativistic theory) we shall not usually use this system of units, and will explicitly indicate when we do.
If $c$ has been set equal to unity in formulas, it is easy to convert back to ordinary units: the velocity is introduced to assure correct dimensions.

The two equations (11.1) and (11.2) uniquely determine the energies of the emerging particles:
$$\mathcal{E}_{10} = \frac{M^2 + m_1^2 - m_2^2}{2M},\qquad \mathcal{E}_{20} = \frac{M^2 - m_1^2 + m_2^2}{2M}. \tag{11.3}$$
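The two-body decay energies (11.3) can be computed and checked against the conservation laws (11.1) and (11.2) in a few lines. This is a minimal sketch in units with $c = 1$; the function name and the sample masses are chosen only for illustration.

```python
import math

def decay_energies(M, m1, m2):
    """Energies of the two decay products in the rest frame of the parent, (11.3)."""
    if M < m1 + m2:
        raise ValueError("spontaneous decay is impossible when M < m1 + m2")
    E1 = (M**2 + m1**2 - m2**2) / (2.0 * M)
    E2 = (M**2 - m1**2 + m2**2) / (2.0 * M)
    return E1, E2

M, m1, m2 = 5.0, 3.0, 1.0
E1, E2 = decay_energies(M, m1, m2)

# Magnitudes of the momenta of the two products; they must coincide by (11.2).
p1 = math.sqrt(E1**2 - m1**2)
p2 = math.sqrt(E2**2 - m2**2)
```

For these sample masses the energies add up to $M$ and the two momenta have equal magnitude, as required by conservation of energy and momentum.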
In a certain sense the inverse of this problem is the calculation of the total energy $M$ of two colliding particles in the system of reference in which their total momentum is zero. (This is abbreviated as the "system of the center of inertia" or the "C-system".) The computation of this quantity gives a criterion for the possible occurrence of various inelastic collision processes, accompanied by a change in state of the colliding particles, or the "creation" of new particles. A process of this type can occur only if the sum of the masses of the "reaction products" does not exceed $M$.
Suppose that in the initial reference system (the "laboratory" system) a particle with mass $m_1$ and energy $\mathcal{E}_1$ collides with a particle of mass $m_2$ which is at rest. The total energy of the two particles is
$$\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}_1 + m_2,$$
and their total momentum is $\mathbf{p} = \mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}_1$. Considering the two particles together as a single composite system, we find the velocity of its motion as a whole from (9.8):
$$\mathbf{V} = \frac{\mathbf{p}}{\mathcal{E}} = \frac{\mathbf{p}_1}{\mathcal{E}_1 + m_2}. \tag{11.4}$$
This quantity is the velocity of the C-system with respect to the laboratory system (the L-system).
However, in determining the mass $M$, there is no need to transform from one reference frame to the other. Instead we can make direct use of formula (9.6), which is applicable to the composite system just as it is to each particle individually. We thus have
$$M^2 = \mathcal{E}^2 - p^2 = (\mathcal{E}_1 + m_2)^2 - (\mathcal{E}_1^2 - m_1^2),$$
from which
$$M^2 = m_1^2 + m_2^2 + 2m_2\mathcal{E}_1. \tag{11.5}$$
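Formula (11.5), combined with the criterion that the sum of the masses of the reaction products must not exceed $M$, gives the minimum laboratory energy for a given process. The sketch below (units with $c = 1$) illustrates this; the threshold expression is obtained here simply by solving (11.5) for $\mathcal{E}_1$, and the function names are illustrative.

```python
import math

def invariant_mass(m1, m2, E1):
    """Total C-system energy M for a projectile (m1, E1) hitting a target m2 at rest."""
    return math.sqrt(m1**2 + m2**2 + 2.0 * m2 * E1)

def c_system_velocity(m1, m2, E1):
    """Velocity of the C-system relative to the L-system, formula (11.4)."""
    p1 = math.sqrt(E1**2 - m1**2)
    return p1 / (E1 + m2)

def threshold_energy(m1, m2, total_final_mass):
    """Smallest E1 for which M reaches the summed mass of the reaction products."""
    return (total_final_mass**2 - m1**2 - m2**2) / (2.0 * m2)

# Equal unit masses, projectile energy 7: M^2 = 1 + 1 + 14 = 16.
M = invariant_mass(1.0, 1.0, 7.0)
V = c_system_velocity(1.0, 1.0, 7.0)
# Energy needed for the products to have total mass 4 (e.g. four unit masses).
E_threshold = threshold_energy(1.0, 1.0, 4.0)
```

As a consistency check, the total laboratory energy $\mathcal{E}_1 + m_2$ multiplied by $\sqrt{1 - V^2}$ reproduces $M$, in agreement with (9.4) applied to the composite system.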
PROBLEMS
1. A particle moving with velocity $V$ dissociates "in flight" into two particles. Determine the relation between the angles of emergence of these particles and their energies.
Solution: Let $\mathcal{E}_0$ be the energy of one of the decay particles in the C-system [i.e. $\mathcal{E}_{10}$ or $\mathcal{E}_{20}$ in (11.3)], $\mathcal{E}$ the energy of this same particle in the L-system, and $\theta$ its angle of emergence in the L-system (with respect to the direction of $\mathbf{V}$). By using the transformation formulas we find:
$$\mathcal{E}_0 = \frac{\mathcal{E} - Vp\cos\theta}{\sqrt{1-V^2}},$$
so that
$$\cos\theta = \frac{\mathcal{E} - \mathcal{E}_0\sqrt{1-V^2}}{V\sqrt{\mathcal{E}^2 - m^2}}. \tag{1}$$
For the determination of $\mathcal{E}$ from $\cos\theta$ we then get the quadratic equation
$$\mathcal{E}^2(1 - V^2\cos^2\theta) - 2\mathcal{E}\mathcal{E}_0\sqrt{1-V^2} + \mathcal{E}_0^2(1-V^2) + V^2m^2\cos^2\theta = 0, \tag{2}$$
which has one positive root (if the velocity $v_0$ of the decay particle in the C-system satisfies $v_0 > V$) or two positive roots (if $v_0 < V$).
The source of this ambiguity is clear from the following graphical construction. According to (9.15), the momentum components in the L-system are expressed in terms of quantities referring to the C-system by the formulas
$$p_x = \frac{p_0\cos\theta_0 + \mathcal{E}_0V}{\sqrt{1-V^2}},\qquad p_y = p_0\sin\theta_0.$$
Eliminating $\theta_0$, we get
$$p_y^2 + \left(p_x\sqrt{1-V^2} - \mathcal{E}_0V\right)^2 = p_0^2.$$
[Fig. 3. (a) $V < v_0$; (b) $V > v_0$.]
With respect to the variables $p_x$, $p_y$, this is the equation of an ellipse with semiaxes $p_0/\sqrt{1-V^2}$, $p_0$, whose center (the point O in Fig. 3) has been shifted a distance $\mathcal{E}_0V/\sqrt{1-V^2}$ from the point $\mathbf{p} = 0$ (point A in Fig. 3).†
If $V > p_0/\mathcal{E}_0 = v_0$, the point A lies outside the ellipse (Fig. 3b), so that for a fixed angle $\theta$ the vector $\mathbf{p}$ (and consequently the energy $\mathcal{E}$) can have two different values. It is also clear from the construction that in this case the angle $\theta$ cannot exceed a definite value $\theta_{\max}$ (corresponding to the position of the vector $\mathbf{p}$ in which it is tangent to the ellipse). The value of $\theta_{\max}$ is most easily determined analytically from the condition that the discriminant of the quadratic equation (2) go to zero:
$$\sin\theta_{\max} = \frac{p_0\sqrt{1-V^2}}{mV}.$$
2. Find the energy distribution of the decay particles in the L-system.
Solution: In the C-system the decay particles are distributed isotropically in direction, i.e. the number of particles within the element of solid angle $do_0 = 2\pi\sin\theta_0\,d\theta_0$ is
$$dN = \frac{1}{4\pi}\,do_0 = \frac{1}{2}\,d|\cos\theta_0|. \tag{1}$$
The energy in the L-system is given in terms of quantities referring to the C-system by
$$\mathcal{E} = \frac{\mathcal{E}_0 + p_0V\cos\theta_0}{\sqrt{1-V^2}}$$
and runs through the range of values from
$$\frac{\mathcal{E}_0 - Vp_0}{\sqrt{1-V^2}}\qquad\text{to}\qquad\frac{\mathcal{E}_0 + Vp_0}{\sqrt{1-V^2}}.$$
Expressing $d|\cos\theta_0|$ in terms of $d\mathcal{E}$, we obtain the normalized energy distribution (for each of the two types of decay particles):
$$dN = \frac{1}{2Vp_0}\sqrt{1-V^2}\,d\mathcal{E}.$$
† In the classical limit, the ellipse reduces to a circle. (See Mechanics, § 16.)

3. Determine the range of values in the L-system for the angle between the two decay particles (their separation angle) for the case of decay into two identical particles.
Solution: In the C-system, the particles fly off in opposite directions, so that $\theta_{10} = \pi - \theta_{20} \equiv \theta_0$. According to (5.4), the connection between angles in the C- and L-systems is given by the formulas:
$$\cot\theta_1 = \frac{v_0\cos\theta_0 + V}{v_0\sin\theta_0\sqrt{1-V^2}},\qquad \cot\theta_2 = \frac{-v_0\cos\theta_0 + V}{v_0\sin\theta_0\sqrt{1-V^2}}$$
(since $v_{10} = v_{20} \equiv v_0$ in the present case). The required separation angle is $\Theta = \theta_1 + \theta_2$, and a simple calculation gives:
$$\cot\Theta = \frac{V^2 - v_0^2 + V^2v_0^2\sin^2\theta_0}{2Vv_0\sqrt{1-V^2}\,\sin\theta_0}.$$
An examination of the extrema of this expression gives the following ranges of possible values of $\Theta$:
for $V < v_0$: $\quad 2\tan^{-1}\left(\dfrac{v_0}{V}\sqrt{1-V^2}\right) < \Theta < \pi$;
for $v_0 < V < \dfrac{v_0}{\sqrt{1-v_0^2}}$: $\quad 0 < \Theta < \sin^{-1}\sqrt{\dfrac{1-V^2}{1-v_0^2}} < \dfrac{\pi}{2}$;
for $V > \dfrac{v_0}{\sqrt{1-v_0^2}}$: $\quad 0 < \Theta < 2\tan^{-1}\left(\dfrac{v_0}{V}\sqrt{1-V^2}\right) < \dfrac{\pi}{2}$.
4. Find the angular distribution in the L-system for decay particles of zero mass.
Solution: According to (5.6) the connection between the angles of emergence in the C- and L-systems for particles with $m = 0$ is
$$\cos\theta_0 = \frac{\cos\theta - V}{1 - V\cos\theta}.$$
Substituting this expression in formula (1) of Problem 2, we find:
$$dN = \frac{(1-V^2)\,do}{4\pi(1 - V\cos\theta)^2}.$$
5. Find the distribution of separation angles in the L-system for a decay into two particles of zero mass.
Solution: The relation between the angles of emergence, $\theta_1$, $\theta_2$ in the L-system and the angles $\theta_{10} \equiv \theta_0$, $\theta_{20} = \pi - \theta_0$ in the C-system is given by (5.6), so that we have for the separation angle $\Theta = \theta_1 + \theta_2$:
$$\cos\Theta = \frac{2V^2 - 1 - V^2\cos^2\theta_0}{1 - V^2\cos^2\theta_0}$$
and conversely,
$$\cos\theta_0 = \sqrt{1 - \frac{1-V^2}{V^2}\cot^2\frac{\Theta}{2}}.$$
Substituting this expression in formula (1) of Problem 2, we find:
$$dN = \frac{1-V^2}{16\pi V}\,\frac{do}{\sin^3\dfrac{\Theta}{2}\,\sqrt{V^2 - \cos^2\dfrac{\Theta}{2}}}.$$
The angle $\Theta$ takes on values from $\pi$ to $\Theta_{\min} = 2\cos^{-1}V$.
6. Determine the maximum energy which can be carried off by one of the decay particles, when a particle of mass $M$ at rest decays into three particles with masses $m_1$, $m_2$, and $m_3$.
Solution: The particle $m_1$ has its maximum energy if the system of the other two particles $m_2$ and $m_3$ has the least possible mass; the latter is equal to the sum $m_2 + m_3$ (and corresponds to the case where the two particles move together with the same velocity). Having thus reduced the problem to the decay of a body into two parts, we obtain from (11.3):
$$\mathcal{E}_{1\max} = \frac{M^2 + m_1^2 - (m_2 + m_3)^2}{2M}.$$
§ 12. Invariant cross-section
Collision processes are characterized by their invariant cross-sections, which determine the number of collisions (of the particular type) occurring between beams of colliding particles.
Suppose that we have two colliding beams; we denote by $n_1$ and $n_2$ the particle densities in them (i.e. the numbers of particles per unit volume) and by $\mathbf{v}_1$ and $\mathbf{v}_2$ the velocities of the particles. In the reference system in which particle 2 is at rest (or, as one says, in the rest frame of particle 2), we are dealing with the collision of the beam of particles 1 with a stationary target. Then according to the usual definition of the cross-section $\sigma$, the number of collisions occurring in volume $dV$ in time $dt$ is
$$d\nu = \sigma v_{\text{rel}}\,n_1n_2\,dV\,dt,$$
where $v_{\text{rel}}$ is the velocity of particle 1 in the rest system of particle 2 (which is just the definition of the relative velocity of two particles in relativistic mechanics).
The number $d\nu$ is by its very nature an invariant quantity. Let us try to express it in a form which is applicable in any reference system:
$$d\nu = A\,n_1n_2\,dV\,dt, \tag{12.1}$$
where $A$ is a number to be determined, for which we know that its value in the rest frame of one of the particles is $v_{\text{rel}}\sigma$. We shall always mean by $\sigma$ precisely the cross-section in the rest frame of one of the particles, i.e. by definition, an invariant quantity. From its definition, the relative velocity $v_{\text{rel}}$ is also invariant.
In the expression (12.1) the product $dV\,dt$ is an invariant. Therefore the product $An_1n_2$ must also be an invariant.
The law of transformation of the particle density $n$ is easily found by noting that the number of particles in a given volume element $dV$, $n\,dV$, is invariant. Writing $n\,dV = n_0\,dV_0$ (the index 0 refers to the rest frame of the particles) and using formula (4.6) for the transformation of the volume, we find:
$$n = \frac{n_0}{\sqrt{1-v^2}} \tag{12.2}$$
or $n = n_0\mathcal{E}/m$, where $\mathcal{E}$ is the energy and $m$ the mass of the particles.
Thus the statement that $An_1n_2$ is invariant is equivalent to the invariance of the expression $A\mathcal{E}_1\mathcal{E}_2$. This condition is more conveniently represented in the form
$$A\,\frac{\mathcal{E}_1\mathcal{E}_2}{p_{1i}p_2^i} = A\,\frac{\mathcal{E}_1\mathcal{E}_2}{\mathcal{E}_1\mathcal{E}_2 - \mathbf{p}_1\cdot\mathbf{p}_2} = \text{inv}, \tag{12.3}$$
where the denominator is an invariant: the product of the four-momenta of the two particles.
In the rest frame of particle 2, we have $\mathcal{E}_2 = m_2$, $\mathbf{p}_2 = 0$, so that the invariant quantity (12.3) reduces to $A$. On the other hand, in this frame $A = \sigma v_{\text{rel}}$. Thus in an arbitrary reference system,
$$A = \sigma v_{\text{rel}}\,\frac{p_{1i}p_2^i}{\mathcal{E}_1\mathcal{E}_2}. \tag{12.4}$$
To give this expression its final form, we express v Tel in terms of the momenta or velocities
of the particles in an arbitrary reference frame. To do this we note that in the rest frame of
§ 12 INVARIANT CROSSSECTION 35
particle 2,
PuPz = / r m 2 .
Then
'rel
L m\m\
V (PuPi) 2 '
(12.5)
Expressing the quantity $p_{1i}p_2^i = \mathscr{E}_1\mathscr{E}_2 - \mathbf{p}_1\cdot\mathbf{p}_2$ in terms of the velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ by using
formulas (9.1) and (9.4):
$$p_{1i}p_2^i = m_1 m_2\frac{1-\mathbf{v}_1\cdot\mathbf{v}_2}{\sqrt{(1-v_1^2)(1-v_2^2)}},$$
and substituting in (12.5), after some simple transformations we get the following expression
for the relative velocity:
$$v_{\rm rel} = \frac{\sqrt{(\mathbf{v}_1-\mathbf{v}_2)^2 - (\mathbf{v}_1\times\mathbf{v}_2)^2}}{1-\mathbf{v}_1\cdot\mathbf{v}_2} \tag{12.6}$$
(we note that this expression is symmetric in $\mathbf{v}_1$ and $\mathbf{v}_2$, i.e. the magnitude of the relative
velocity is independent of the choice of particle used in defining it).
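Substituting (12.5) or (12.6) in (12.4) and then in (12.1), we get the final formulas for
solving our problem:
$$d\nu = \sigma\frac{\sqrt{(p_{1i}p_2^i)^2 - m_1^2 m_2^2}}{\mathscr{E}_1\mathscr{E}_2}\,n_1 n_2\,dV\,dt \tag{12.7}$$
or
$$d\nu = \sigma\sqrt{(\mathbf{v}_1-\mathbf{v}_2)^2 - (\mathbf{v}_1\times\mathbf{v}_2)^2}\;n_1 n_2\,dV\,dt \tag{12.8}$$
(W. Pauli, 1933).
If the velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ are collinear, then $\mathbf{v}_1\times\mathbf{v}_2 = 0$, so that formula (12.8) takes the
form:
$$d\nu = \sigma|\mathbf{v}_1-\mathbf{v}_2|\,n_1 n_2\,dV\,dt. \tag{12.9}$$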
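As a numerical illustration (a minimal sketch, not part of the original text), the two forms (12.5) and (12.6) of the relative velocity can be compared for arbitrarily chosen velocities, in units with $c = 1$. The function names and the sample values are assumptions made only for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def v_rel_from_velocities(v1, v2):
    """Relative velocity from (12.6), with c = 1."""
    d = tuple(x - y for x, y in zip(v1, v2))
    w = cross(v1, v2)
    return math.sqrt(dot(d, d) - dot(w, w)) / (1.0 - dot(v1, v2))

def v_rel_from_momenta(m1, v1, m2, v2):
    """Relative velocity from (12.5), via the invariant p_1i p_2^i."""
    e1 = m1 / math.sqrt(1.0 - dot(v1, v1))
    e2 = m2 / math.sqrt(1.0 - dot(v2, v2))
    p1 = tuple(e1 * x for x in v1)
    p2 = tuple(e2 * x for x in v2)
    p1p2 = e1 * e2 - dot(p1, p2)
    return math.sqrt(1.0 - (m1 * m2 / p1p2) ** 2)

# Sample velocities and masses (hypothetical values).
v1 = (0.6, 0.1, 0.0)
v2 = (-0.3, 0.4, 0.2)
a = v_rel_from_velocities(v1, v2)
b = v_rel_from_momenta(1.0, v1, 2.5, v2)
print(a, b)
```

The two numbers should agree to rounding error, and the result does not depend on the masses, as expected for a purely kinematic quantity.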
PROBLEM
Find the "element of length" in relativistic "velocity space".
Solution: The required line element $dl_v$ is the relative velocity of two points with velocities $\mathbf{v}$ and
$\mathbf{v}+d\mathbf{v}$. We therefore find from (12.6)
$$dl_v^2 = \frac{(d\mathbf{v})^2 - (\mathbf{v}\times d\mathbf{v})^2}{(1-v^2)^2} = \frac{dv^2}{(1-v^2)^2} + \frac{v^2}{1-v^2}\,(d\theta^2 + \sin^2\theta\,d\phi^2),$$
where $\theta$, $\phi$ are the polar angle and azimuth of the direction of $\mathbf{v}$. If in place of $v$ we introduce the
new variable $\chi$ through the equation $v = \tanh\chi$, the line element is expressed as:
$$dl_v^2 = d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2).$$
From the geometrical point of view this is the line element in three-dimensional Lobachevskii
space, the space of constant negative curvature (see (107.8)).
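The role of the variable $\chi$ can be illustrated with a short check (a sketch under the assumption $c = 1$; the function name is hypothetical). For two collinear velocities $v = \tanh\chi$, the relative velocity (12.6) is $\tanh$ of the difference of the two values of $\chi$, which is the finite counterpart of $dl_v = d\chi$ along a fixed direction.

```python
import math

def v_rel_collinear(v1, v2):
    # (12.6) for parallel velocities, c = 1: the cross product vanishes
    return abs(v1 - v2) / (1.0 - v1 * v2)

chi1, chi2 = 0.3, 1.7  # arbitrary sample values
lhs = v_rel_collinear(math.tanh(chi1), math.tanh(chi2))
rhs = math.tanh(abs(chi1 - chi2))
print(lhs, rhs)
```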
§ 13. Elastic collisions of particles
Let us consider, from the point of view of relativistic mechanics, the elastic collision of
particles. We denote the momenta and energies of the two colliding particles (with masses
$m_1$ and $m_2$) by $\mathbf{p}_1$, $\mathscr{E}_1$ and $\mathbf{p}_2$, $\mathscr{E}_2$; we use primes for the corresponding quantities after
collision. The laws of conservation of momentum and energy in the collision can be written
together as the equation for conservation of the four-momentum:
$$p_1^i + p_2^i = p_1'^i + p_2'^i. \tag{13.1}$$
From this four-vector equation we construct invariant relations which will be helpful in
further computations. To do this we rewrite (13.1) in the form:
$$p_1^i + p_2^i - p_1'^i = p_2'^i$$
and square both sides (i.e. we write the scalar product of each side with itself). Noting that
the squares of the four-momenta $p_1^i$ and $p_1'^i$ are equal to $m_1^2$, and the squares of $p_2^i$ and $p_2'^i$
are equal to $m_2^2$, we get:
$$m_1^2 + p_{1i}p_2^i - p_{1i}p_1'^i - p_{2i}p_1'^i = 0. \tag{13.2}$$
Similarly, squaring the equation $p_1^i + p_2^i - p_2'^i = p_1'^i$, we get:
$$m_2^2 + p_{1i}p_2^i - p_{2i}p_2'^i - p_{1i}p_2'^i = 0. \tag{13.3}$$
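Let us consider the collision in a reference system (the L-system) in which one of the
particles ($m_2$) was at rest before the collision. Then $\mathbf{p}_2 = 0$, $\mathscr{E}_2 = m_2$, and the scalar products
appearing in (13.2) are:
$$p_{1i}p_2^i = \mathscr{E}_1 m_2,$$
$$p_{2i}p_1'^i = m_2\mathscr{E}_1', \tag{13.4}$$
$$p_{1i}p_1'^i = \mathscr{E}_1\mathscr{E}_1' - \mathbf{p}_1\cdot\mathbf{p}_1' = \mathscr{E}_1\mathscr{E}_1' - p_1 p_1'\cos\theta_1,$$
where $\theta_1$ is the angle of scattering of the incident particle $m_1$. Substituting these expressions
in (13.2) we get:
$$\cos\theta_1 = \frac{\mathscr{E}_1'(\mathscr{E}_1+m_2) - \mathscr{E}_1 m_2 - m_1^2}{p_1 p_1'}. \tag{13.5}$$
Similarly, we find from (13.3):
$$\cos\theta_2 = \frac{(\mathscr{E}_1+m_2)(\mathscr{E}_2'-m_2)}{p_1 p_2'}, \tag{13.6}$$
where $\theta_2$ is the angle between the transferred momentum $\mathbf{p}_2'$ and the momentum of the
incident particle $\mathbf{p}_1$.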
The formulas (13.5-6) relate the angles of scattering of the two particles in the L-system
to the changes in their energy in the collision. Inverting these formulas, we can express the
energies $\mathscr{E}_1'$, $\mathscr{E}_2'$ in terms of the angles $\theta_1$ or $\theta_2$. Thus, substituting in (13.6) $p_1 = \sqrt{\mathscr{E}_1^2 - m_1^2}$,
$p_2' = \sqrt{\mathscr{E}_2'^2 - m_2^2}$ and squaring both sides, we find after a simple computation:
$$\mathscr{E}_2' = m_2\frac{(\mathscr{E}_1+m_2)^2 + (\mathscr{E}_1^2-m_1^2)\cos^2\theta_2}{(\mathscr{E}_1+m_2)^2 - (\mathscr{E}_1^2-m_1^2)\cos^2\theta_2}. \tag{13.7}$$
Inversion of formula (13.5) leads in the general case to a very complicated formula for $\mathscr{E}_1'$
in terms of $\theta_1$.
We note that if $m_1 > m_2$, i.e. if the incident particle is heavier than the target particle, the
scattering angle $\theta_1$ cannot exceed a certain maximum value. It is easy to find by elementary
computations that this value is given by the equation
$$\sin\theta_{1\max} = \frac{m_2}{m_1}, \tag{13.8}$$
which coincides with the familiar classical result.
Formulas (13.5-6) simplify in the case when the incident particle has zero mass: $m_1 = 0$,
and correspondingly $p_1 = \mathscr{E}_1$, $p_1' = \mathscr{E}_1'$. For this case let us write the formula for the energy
of the incident particle after the collision, expressed in terms of its angle of deflection:
$$\mathscr{E}_1' = \frac{m_2}{1 - \cos\theta_1 + \dfrac{m_2}{\mathscr{E}_1}}. \tag{13.9}$$
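The consistency of (13.9) with (13.5) can be checked numerically. The following is only a sketch in units with $c = 1$; the function names and the sample numbers are assumptions introduced for the example.

```python
import math

def scattered_energy_massless(e1, m2, theta1):
    """Energy (13.9) of a massless particle scattered through theta1 (c = 1)."""
    return m2 / (1.0 - math.cos(theta1) + m2 / e1)

def cos_theta1(e1, e1_after, m1, m2):
    """Right-hand side of (13.5) for incident mass m1 and target mass m2."""
    p1 = math.sqrt(e1**2 - m1**2)
    p1_after = math.sqrt(e1_after**2 - m1**2)
    return (e1_after * (e1 + m2) - e1 * m2 - m1**2) / (p1 * p1_after)

e1, m2, theta1 = 2.0, 1.0, 1.0  # hypothetical sample values
e1_after = scattered_energy_massless(e1, m2, theta1)
# Feeding the result back into (13.5) with m1 = 0 should return cos(theta1).
print(cos_theta1(e1, e1_after, 0.0, m2), math.cos(theta1))
```

For $\theta_1 = 0$ the function returns the initial energy, i.e. forward scattering transfers no energy.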
Let us now turn once again to the general case of collision of particles of arbitrary mass.
The collision is most simply treated in the C-system. Designating quantities in this system
by the additional subscript 0, we have $\mathbf{p}_{10} = -\mathbf{p}_{20} \equiv \mathbf{p}_0$. From the conservation of momentum,
during the collision the momenta of the two particles merely rotate, remaining equal in
magnitude and opposite in direction. From the conservation of energy, the value of each of
the momenta remains unchanged.
Let $\chi$ be the angle of scattering in the C-system: the angle through which the momenta
$\mathbf{p}_{10}$ and $\mathbf{p}_{20}$ are rotated by the collision. This quantity completely determines the scattering
process in the C-system, and therefore also in any other reference system. It is also convenient
in describing the collision in the L-system and serves as the single parameter which
remains undetermined after the conservation of momentum and energy are applied.
We express the final energies of the two particles in the L-system in terms of this parameter.
To do this we return to (13.2), but this time write out the product $p_{1i}p_1'^i$ in the C-system:
$$p_{1i}p_1'^i = \mathscr{E}_{10}\mathscr{E}_{10}' - \mathbf{p}_{10}\cdot\mathbf{p}_{10}' = \mathscr{E}_{10}^2 - p_0^2\cos\chi = p_0^2(1-\cos\chi) + m_1^2$$
(in the C-system the energies of the particles do not change in the collision: $\mathscr{E}_{10}' = \mathscr{E}_{10}$).
We write out the other two products in the L-system, i.e. we use (13.4). As a result we get:
$\mathscr{E}_1 - \mathscr{E}_1' = (p_0^2/m_2)(1-\cos\chi)$. We must still express $p_0^2$ in terms of quantities referring to
the L-system. This is easily done by equating the values of the invariant $p_{1i}p_2^i$ in the L- and
C-systems:
$$\mathscr{E}_{10}\mathscr{E}_{20} - \mathbf{p}_{10}\cdot\mathbf{p}_{20} = \mathscr{E}_1 m_2,$$
or
$$\sqrt{(p_0^2+m_1^2)(p_0^2+m_2^2)} = \mathscr{E}_1 m_2 - p_0^2.$$
Solving the equation for $p_0^2$, we get:
$$p_0^2 = \frac{m_2^2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.10}$$
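Thus, we finally have:
$$\mathscr{E}_1' = \mathscr{E}_1 - \frac{m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}\,(1-\cos\chi). \tag{13.11}$$
The energy of the second particle is obtained from the conservation law: $\mathscr{E}_1 + m_2 = \mathscr{E}_1' + \mathscr{E}_2'$.
Therefore
$$\mathscr{E}_2' = m_2 + \frac{m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}\,(1-\cos\chi). \tag{13.12}$$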
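The second terms in these formulas represent the energy lost by the first particle and
transferred to the second particle. The maximum energy transfer occurs for $\chi = \pi$, and is equal to
$$\mathscr{E}_{2\max}' - m_2 = \mathscr{E}_1 - \mathscr{E}_{1\min}' = \frac{2m_2(\mathscr{E}_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.13}$$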
The ratio of the minimum kinetic energy of the incident particle after collision to its
initial kinetic energy is:
$$\frac{\mathscr{E}_{1\min}' - m_1}{\mathscr{E}_1 - m_1} = \frac{(m_1-m_2)^2}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}. \tag{13.14}$$
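The chain of formulas above can be verified numerically. The sketch below (units with $c = 1$; function names and sample values are assumptions for the example) combines the relation $\mathscr{E}_1 - \mathscr{E}_1' = (p_0^2/m_2)(1-\cos\chi)$ with (13.10) and checks the results against (13.13) and (13.14).

```python
import math

def p0_squared(e1, m1, m2):
    """Square of the C-system momentum, formula (13.10), with c = 1."""
    return m2**2 * (e1**2 - m1**2) / (m1**2 + m2**2 + 2.0 * m2 * e1)

def final_energies(e1, m1, m2, chi):
    """L-system energies after the collision for C-system scattering angle chi."""
    transfer = p0_squared(e1, m1, m2) / m2 * (1.0 - math.cos(chi))
    return e1 - transfer, m2 + transfer

e1, m1, m2 = 5.0, 1.0, 2.0  # hypothetical sample values
e1_min, e2_max = final_energies(e1, m1, m2, math.pi)

# Ratio of final to initial kinetic energy at maximum transfer, cf. (13.14).
ratio = (e1_min - m1) / (e1 - m1)
expected = (m1 - m2)**2 / (m1**2 + m2**2 + 2.0 * m2 * e1)
print(ratio, expected)
```

The value of $p_0^2$ can also be checked directly against the condition $\sqrt{(p_0^2+m_1^2)(p_0^2+m_2^2)} = \mathscr{E}_1 m_2 - p_0^2$ from which it was derived.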
In the limiting case of low velocities (when $\mathscr{E} \approx m + mv^2/2$), this relation tends to a constant
limit, equal to
$$\left(\frac{m_1-m_2}{m_1+m_2}\right)^2.$$
In the opposite limit of large energies $\mathscr{E}_1$, relation (13.14) tends to zero; the quantity $\mathscr{E}_{1\min}'$
tends to a constant limit. This limit is
$$\mathscr{E}_{1\min}' = \frac{m_1^2 + m_2^2}{2m_2}.$$
Let us assume that $m_2 \gg m_1$, i.e. the mass of the incident particle is small compared to
the mass of the particle at rest. According to classical mechanics the light particle could
transfer only a negligible part of its energy (see Mechanics, § 17). This is not the case in
relativistic mechanics. From formula (13.14) we see that for sufficiently large energies $\mathscr{E}_1$
the fraction of the energy transferred can reach the order of unity. For this it is not sufficient
that the velocity of $m_1$ be of order 1, but one must have $\mathscr{E}_1 \sim m_2$, i.e. the light particle must
have an energy of the order of the rest energy of the heavy particle.
A similar situation occurs for $m_2 \ll m_1$, i.e. when a heavy particle is incident on a light
one. Here too, according to classical mechanics, the energy transfer would be insignificant.
The fraction of the energy transferred begins to be significant only for energies $\mathscr{E}_1 \sim m_1^2/m_2$.
We note that we are not talking simply of velocities of the order of the light velocity, but
of energies large compared to $m_1$, i.e. we are dealing with the ultrarelativistic case.
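These estimates can be made concrete with a few numbers. The sketch below uses (13.14) to evaluate the largest fraction of the incident kinetic energy that can be transferred; the mass ratio 1:1000 and the function name are assumptions chosen only for illustration, with $c = 1$.

```python
def max_fraction_transferred(e1, m1, m2):
    """Largest transferable fraction of the incident kinetic energy, from (13.14)."""
    return 1.0 - (m1 - m2)**2 / (m1**2 + m2**2 + 2.0 * m2 * e1)

light, heavy = 1.0, 1000.0

# Light particle on heavy target: nearly at rest, then with energy ~ m2.
slow_light_on_heavy = max_fraction_transferred(light * (1.0 + 1e-6), light, heavy)
fast_light_on_heavy = max_fraction_transferred(heavy, light, heavy)

# Heavy particle on light target: energy 2*m1 (velocity already of order 1),
# then energy ~ m1^2/m2.
mild_heavy_on_light = max_fraction_transferred(2.0 * heavy, heavy, light)
ultra_heavy_on_light = max_fraction_transferred(heavy**2 / light, heavy, light)

print(slow_light_on_heavy, fast_light_on_heavy)
print(mild_heavy_on_light, ultra_heavy_on_light)
```

In both mass orderings the fraction stays far below unity until the energy reaches the scale named in the text, even when the velocity is already comparable to that of light.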
PROBLEMS
1. The triangle $ABC$ in Fig. 4 is formed by the momentum vector $\mathbf{p}_1$ of the impinging particle
and the momenta $\mathbf{p}_1'$, $\mathbf{p}_2'$ of the two particles after the collision. Find the locus of the points $C$
corresponding to all possible values of $\mathbf{p}_1'$, $\mathbf{p}_2'$.
Solution: The required curve is an ellipse whose semiaxes can be found by using the formulas
obtained in problem 1 of § 11. In fact, the construction given there determined the locus of the
vectors $\mathbf{p}$ in the L-system which are obtained from arbitrarily directed vectors $\mathbf{p}_0$ with given length
$p_0$ in the C-system.
Fig. 4. (a) $m_1 > m_2$; (b) $m_1 < m_2$.
Since the absolute values of the momenta of the colliding particles are identical in the C-system,
and do not change in the collision, we are dealing with a similar construction for the vector $\mathbf{p}_1'$, for
which
$$p_0 \equiv p_{10} = p_{20} = \frac{m_2 V}{\sqrt{1-V^2}}$$
in the C-system, where $V$ is the velocity of particle $m_2$ in the C-system; it coincides in magnitude with
the velocity of the center of inertia, and is equal to $V = p_1/(\mathscr{E}_1+m_2)$ (see (11.4)). As a result we find
that the minor and major semiaxes of the ellipse are
$$p_0 = \frac{m_2 p_1}{\sqrt{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}}, \qquad \frac{p_0}{\sqrt{1-V^2}} = \frac{m_2 p_1(\mathscr{E}_1+m_2)}{m_1^2 + m_2^2 + 2m_2\mathscr{E}_1}$$
(the first of these is, of course, the same as (13.10)).
For $\theta_1 = 0$, the vector $\mathbf{p}_1'$ coincides with $\mathbf{p}_1$, so that the distance $AB$ is equal to $p_1$. Comparing $p_1$
with the length of the major axis of the ellipse, it is easily shown that the point $A$ lies outside the
ellipse if $m_1 > m_2$ (Fig. 4a), and inside it if $m_1 < m_2$ (Fig. 4b).
2. Determine the minimum separation angle $\Theta_{\min}$ of two particles after collision if the masses
of the two particles are the same ($m_1 = m_2 \equiv m$).
Solution: If $m_1 = m_2$, the point $A$ of the diagram lies on the ellipse, while the minimum separation
angle corresponds to the situation where point $C$ is at the end of the minor axis (Fig. 5). From the
construction it is clear that $\tan(\Theta_{\min}/2)$ is the ratio of the lengths of the semiaxes, and we find:
$$\tan\frac{\Theta_{\min}}{2} = \sqrt{\frac{2m}{\mathscr{E}_1+m}},$$
or
$$\cos\Theta_{\min} = \frac{\mathscr{E}_1 - m}{\mathscr{E}_1 + 3m}.$$
Fig. 5.
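3. For the collision of two particles of equal mass $m$, express $\mathscr{E}_1'$, $\mathscr{E}_2'$, $\chi$ in terms of the angle $\theta_1$
of scattering in the L-system.
Solution: Inversion of formula (13.5) in this case gives:
$$\mathscr{E}_1' = m\,\frac{(\mathscr{E}_1+m) + (\mathscr{E}_1-m)\cos^2\theta_1}{(\mathscr{E}_1+m) - (\mathscr{E}_1-m)\cos^2\theta_1}, \qquad \mathscr{E}_2' = m + \frac{(\mathscr{E}_1^2-m^2)\sin^2\theta_1}{2m + (\mathscr{E}_1-m)\sin^2\theta_1}.$$
Comparing with the expression for $\mathscr{E}_1'$ in terms of $\chi$:
$$\mathscr{E}_1' = \mathscr{E}_1 - \frac{\mathscr{E}_1-m}{2}(1-\cos\chi),$$
we find the angle of scattering in the C-system:
$$\cos\chi = \frac{2m - (\mathscr{E}_1+3m)\sin^2\theta_1}{2m + (\mathscr{E}_1-m)\sin^2\theta_1}.$$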
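The three expressions of this solution can be cross-checked against each other numerically. The following is a sketch only (units with $c = 1$; the function name and the sample values are assumptions): the energies must add up to $\mathscr{E}_1 + m$, and $\mathscr{E}_1'$ computed from $\theta_1$ must agree with $\mathscr{E}_1'$ computed from $\chi$.

```python
import math

def equal_mass_scattering(e1, m, theta1):
    """Return (E1', E2', cos chi) for equal masses as functions of theta1."""
    s2 = math.sin(theta1) ** 2
    c2 = math.cos(theta1) ** 2
    e1_after = m * ((e1 + m) + (e1 - m) * c2) / ((e1 + m) - (e1 - m) * c2)
    e2_after = m + (e1**2 - m**2) * s2 / (2.0 * m + (e1 - m) * s2)
    cos_chi = (2.0 * m - (e1 + 3.0 * m) * s2) / (2.0 * m + (e1 - m) * s2)
    return e1_after, e2_after, cos_chi

e1, m, theta1 = 3.0, 1.0, 0.4  # hypothetical sample values
e1_after, e2_after, cos_chi = equal_mass_scattering(e1, m, theta1)

# E1' re-expressed through the C-system angle chi.
e1_from_chi = e1 - 0.5 * (e1 - m) * (1.0 - cos_chi)
print(e1_after, e1_from_chi, e1_after + e2_after)
```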
§ 14. Angular momentum
As is well known from classical mechanics, for a closed system, in addition to conservation
of energy and momentum, there is conservation of angular momentum, that is, of the
vector
$$\mathbf{M} = \sum\mathbf{r}\times\mathbf{p}$$
where $\mathbf{r}$ and $\mathbf{p}$ are the radius vector and momentum of the particle; the summation runs over
all the particles making up the system. The conservation of angular momentum is a consequence
of the fact that because of the isotropy of space, the Lagrangian of a closed system
does not change under a rotation of the system as a whole.
By carrying through a similar derivation in four-dimensional form, we obtain the
relativistic expression for the angular momentum. Let $x^i$ be the coordinates of one of the
particles of the system. We make an infinitesimal rotation in the four-dimensional space.
Under such a transformation, the coordinates $x^i$ take on new values $x'^i$ such that the
differences $x'^i - x^i$ are linear functions
$$x'^i - x^i = x_k\,\delta\Omega^{ik} \tag{14.1}$$
with infinitesimal coefficients $\delta\Omega_{ik}$. The components of the four-tensor $\delta\Omega_{ik}$ are connected
to one another by the relations resulting from the requirement that, under a rotation, the
length of the radius vector must remain unchanged, that is, $x'_i x'^i = x_i x^i$. Substituting for
$x'^i$ from (14.1) and dropping terms quadratic in $\delta\Omega_{ik}$, as infinitesimals of higher order, we
find
$$x^i x^k\,\delta\Omega_{ik} = 0.$$
This equation must be fulfilled for arbitrary $x^i$. Since $x^i x^k$ is a symmetric tensor, $\delta\Omega_{ik}$ must
be an antisymmetric tensor (the product of a symmetrical and an antisymmetrical tensor is
clearly identically zero). Thus we find that
$$\delta\Omega_{ki} = -\delta\Omega_{ik}. \tag{14.2}$$
The change $\delta S$ in the action $S$ for an infinitesimal change in coordinates has the form
(see 9.11):
$$\delta S = -\sum p^i\,\delta x_i$$
(the summation extends over all the particles of the system). In the case of rotation which
we are now considering, $\delta x_i = \delta\Omega_{ik}x^k$, and so
$$\delta S = -\delta\Omega_{ik}\sum p^i x^k.$$
If we resolve the tensor $\sum p^i x^k$ into symmetric and antisymmetric parts, then the first of
these when multiplied by an antisymmetric tensor gives identically zero. Therefore, taking
the antisymmetric part of $\sum p^i x^k$, we can write the preceding equality in the form
$$\delta S = -\delta\Omega_{ik}\,\frac{1}{2}\sum(p^i x^k - p^k x^i). \tag{14.3}$$
For a closed system, because of the isotropy of space and time, the Lagrangian does not
change under a rotation in four-space, i.e. the parameters $\delta\Omega_{ik}$ for the rotation are cyclic
coordinates. Therefore the corresponding generalized momenta are conserved. These
generalized momenta are the quantities $\partial S/\partial\Omega_{ik}$. From (14.3), we have
$$\frac{\partial S}{\partial\Omega_{ik}} = \frac{1}{2}\sum(p^k x^i - p^i x^k).$$
Consequently we see that for a closed system the tensor
$$M^{ik} = \sum(x^i p^k - x^k p^i) \tag{14.4}$$
is conserved. This antisymmetric tensor is called the four-tensor of angular momentum. The
space components of this tensor are the components of the three-dimensional angular
momentum vector $\mathbf{M} = \sum\mathbf{r}\times\mathbf{p}$:
$$M^{23} = M_x, \qquad -M^{13} = M_y, \qquad M^{12} = M_z.$$
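The components $M^{01}$, $M^{02}$, $M^{03}$ form the vector $c\sum(t\mathbf{p} - \mathscr{E}\mathbf{r}/c^2)$. Thus, we can write the
components of the tensor $M^{ik}$ in the form:
$$M^{ik} = \left(c\sum\left(t\mathbf{p} - \frac{\mathscr{E}\mathbf{r}}{c^2}\right),\ -\mathbf{M}\right). \tag{14.5}$$
(Compare (6.10).)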
Because of the conservation of $M^{ik}$ for a closed system, we have, in particular,
$$\sum\left(t\mathbf{p} - \frac{\mathscr{E}\mathbf{r}}{c^2}\right) = \text{const}.$$
Since, on the other hand, the total energy $\sum\mathscr{E}$ is also conserved, this equality can be written
in the form
$$\frac{\sum\mathscr{E}\mathbf{r}}{\sum\mathscr{E}} - \frac{c^2\sum\mathbf{p}}{\sum\mathscr{E}}\,t = \text{const}.$$
From this we see that the point with the radius vector
$$\mathbf{R} = \frac{\sum\mathscr{E}\mathbf{r}}{\sum\mathscr{E}} \tag{14.6}$$
moves uniformly with the velocity
$$\mathbf{V} = \frac{c^2\sum\mathbf{p}}{\sum\mathscr{E}}, \tag{14.7}$$
which is none other than the velocity of motion of the system as a whole. [It relates the total
energy and momentum, according to formula (9.8).] Formula (14.6) gives the relativistic
definition of the coordinates of the center of inertia of the system. If the velocities of all the
particles are small compared to $c$, we can approximately set $\mathscr{E} \approx mc^2$ so that (14.6) goes
over into the usual classical expression
$$\mathbf{R} = \frac{\sum m\mathbf{r}}{\sum m}.$$ †
We note that the components of the vector (14.6) do not constitute the space components
of any four-vector, and therefore under a transformation of reference frame they do not
transform like the coordinates of a point. Thus we get different points for the center of
inertia of a given system with respect to different reference frames.
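The uniform motion of the point (14.6) can be illustrated for a set of free particles. This is a minimal sketch, assuming units with $c = 1$ and noninteracting particles moving on straight lines; the particle data and function names are hypothetical.

```python
import math

C = 1.0  # assumption: units with c = 1

def energy(m, v):
    v2 = sum(x * x for x in v)
    return m * C**2 / math.sqrt(1.0 - v2 / C**2)

def center_of_inertia(particles, t):
    """R from (14.6) for free particles given as (m, r0, v), at time t."""
    total = sum(energy(m, v) for m, r0, v in particles)
    return tuple(
        sum(energy(m, v) * (r0[k] + v[k] * t) for m, r0, v in particles) / total
        for k in range(3)
    )

def system_velocity(particles):
    """V from (14.7): c^2 times total momentum over total energy."""
    total = sum(energy(m, v) for m, r0, v in particles)
    # For each particle p = E v / c^2, so c^2 * sum(p) = sum(E v).
    return tuple(
        sum(energy(m, v) * v[k] for m, r0, v in particles) / total
        for k in range(3)
    )

particles = [
    (1.0, (0.0, 0.0, 0.0), (0.8, 0.0, 0.0)),
    (2.0, (1.0, -1.0, 0.5), (-0.1, 0.3, 0.0)),
    (0.5, (0.0, 2.0, -1.0), (0.0, -0.6, 0.6)),
]
R0 = center_of_inertia(particles, 0.0)
R2 = center_of_inertia(particles, 2.0)
V = system_velocity(particles)
print(R0, R2, V)
```

The displacement of $\mathbf{R}$ over the interval equals $\mathbf{V}$ times the elapsed time, with the fast particle weighted by its energy rather than by its mass.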
† We note that whereas the classical formula for the center of inertia applies equally well to interacting
and noninteracting particles, formula (14.6) is valid only if we neglect interaction. In relativistic mechanics,
the definition of the center of inertia of a system of interacting particles requires us to include explicitly the
momentum and energy of the field produced by the particles.
PROBLEM
Find the connection between the angular momentum $\mathbf{M}$ of a body (system of particles) in the
reference frame $K$ in which the body moves with velocity $\mathbf{V}$, and its angular momentum $\mathbf{M}^{(0)}$ in
the frame $K_0$ in which the body is at rest as a whole; in both cases the angular momentum is defined
with respect to the same point: the center of inertia of the body in the system $K_0$.‡
Solution: The $K_0$ system moves relative to the $K$ system with velocity $V$; we choose its direction
for the $x$ axis. The components of $M^{ik}$ that we want transform according to the formulas (see
problem 2 in § 6):
$$M^{12} = \frac{M^{(0)12} + \dfrac{V}{c}M^{(0)02}}{\sqrt{1-\dfrac{V^2}{c^2}}}, \qquad M^{13} = \frac{M^{(0)13} + \dfrac{V}{c}M^{(0)03}}{\sqrt{1-\dfrac{V^2}{c^2}}}, \qquad M^{23} = M^{(0)23}.$$
Since the origin of coordinates was chosen at the center of inertia of the body (in the $K_0$ system),
in that system $\sum\mathscr{E}\mathbf{r} = 0$, and since in that system $\sum\mathbf{p} = 0$, $M^{(0)02} = M^{(0)03} = 0$. Using the
connection between the components of $M^{ik}$ and the vector $\mathbf{M}$, we find for the latter:
$$M_x = M_x^{(0)}, \qquad M_y = \frac{M_y^{(0)}}{\sqrt{1-\dfrac{V^2}{c^2}}}, \qquad M_z = \frac{M_z^{(0)}}{\sqrt{1-\dfrac{V^2}{c^2}}}.$$
‡ We remind the reader that although in the system $K_0$ (in which $\sum\mathbf{p} = 0$) the angular momentum is
independent of the choice of the point with respect to which it is defined, in the $K$ system (in which $\sum\mathbf{p} \neq 0$)
the angular momentum does depend on this choice (see Mechanics, § 9).
CHAPTER 3
CHARGES IN ELECTROMAGNETIC FIELDS
§ 15. Elementary particles in the theory of relativity
The interaction of particles can be described with the help of the concept of a field of force.
Namely, instead of saying that one particle acts on another, we may say that the particle
creates a field around itself; a certain force then acts on every other particle located in this
field. In classical mechanics, the field is merely a mode of description of the physical
phenomenon — the interaction of particles. In the theory of relativity, because of the finite
velocity of propagation of interactions, the situation is changed fundamentally. The forces
acting on a particle at a given moment are not determined by the positions at that same
moment. A change in the position of one of the particles influences other particles only
after the lapse of a certain time interval. This means that the field itself acquires physical
reality. We cannot speak of a direct interaction of particles located at a distance from one
another. Interactions can occur at any one moment only between neighboring points in
space (contact interaction). Therefore we must speak of the interaction of the one particle
with the field, and of the subsequent interaction of the field with the second particle.
We shall consider two types of fields, gravitational and electromagnetic. The study of
gravitational fields is left to Chapters 10 to 12 and in the other chapters we consider only
electromagnetic fields.
Before considering the interactions of particles with the electromagnetic field, we shall
make some remarks concerning the concept of a "particle" in relativistic mechanics.
In classical mechanics one can introduce the concept of a rigid body, i.e., a body which is
not deformable under any conditions. In the theory of relativity it should follow similarly
that we would consider as rigid those bodies whose dimensions all remain unchanged in the
reference system in which they are at rest. However, it is easy to see that the theory of
relativity makes the existence of rigid bodies impossible in general.
Consider, for example, a circular disk rotating around its axis, and let us assume that it is
rigid. A reference frame fixed in the disk is clearly not inertial. It is possible, however, to
introduce for each of the infinitesimal elements of the disk an inertial system in which this
element would be at rest at the moment; for different elements of the disk, having different
velocities, these systems will, of course, also be different. Let us consider a series of line
elements, lying along a particular radius vector. Because of the rigidity of the disk, the
length of each of these segments (in the corresponding inertial system of reference) will be
the same as it was when the disk was at rest. This same length would be measured by an
observer at rest, past whom this radius swings at the given moment, since each of its segments
is perpendicular to its velocity and consequently a Lorentz contraction does not
occur. Therefore the total length of the radius as measured by the observer at rest, being the
sum of its segments, will be the same as when the disk was at rest. On the other hand, the
length of each element of the circumference of the disk, passing by the observer at rest at a
given moment, undergoes a Lorentz contraction, so that the length of the whole circumference
(measured by the observer at rest as the sum of the lengths of its various segments)
turns out to be smaller than the length of the circumference of the disk at rest. Thus we
arrive at the result that due to the rotation of the disk, the ratio of circumference to radius
(as measured by an observer at rest) must change, and not remain equal to $2\pi$. The absurdity
of this result shows that actually the disk cannot be rigid, and that in rotation it must
necessarily undergo some complex deformation depending on the elastic properties of the
material of the disk.
The impossibility of the existence of rigid bodies can be demonstrated in another way.
Suppose some solid body is set in motion by an external force acting at one of its points. If
the body were rigid, all of its points would have to be set in motion at the same time as the
point to which the force is applied; if this were not so the body would be deformed. However,
the theory of relativity makes this impossible, since the force at the particular point is
transmitted to the others with a finite velocity, so that all the points cannot begin moving
simultaneously.
From this discussion we can draw certain conclusions concerning the treatment of
"elementary" particles, i.e., particles whose state we assume to be described completely by
giving its three coordinates and the three components of its velocity as a whole. It is obvious
that if an elementary particle had finite dimensions, i.e. if it were extended in space, it could
not be deformable, since the concept of deformability is related to the possibility of independent
motion of individual parts of the body. But, as we have seen, the theory of
relativity shows that it is impossible for absolutely rigid bodies to exist.
Thus we come to the conclusion that in classical (non-quantum) relativistic mechanics,
we cannot ascribe finite dimensions to particles which we regard as elementary. In other
words, within the framework of classical theory elementary particles must be treated as
points.†
† Quantum mechanics makes a fundamental change in this situation, but here again relativity theory
makes it extremely difficult to introduce anything other than point interactions.
§ 16. Four-potential of a field
For a particle moving in a given electromagnetic field, the action is made up of two parts:
the action (8.1) for the free particle, and a term describing the interaction of the particle with
the field. The latter term must contain quantities characterizing the particle and quantities
characterizing the field.
It turns out‡ that the properties of a particle with respect to interaction with the electromagnetic
field are determined by a single parameter, the charge $e$ of the particle, which can
be either positive or negative (or equal to zero). The properties of the field are characterized
by a four-vector $A_i$, the four-potential, whose components are functions of the coordinates
and time. These quantities appear in the action function in the term
$$-\frac{e}{c}\int_a^b A_i\,dx^i,$$
where the functions $A_i$ are taken at points on the world line of the particle. The factor $1/c$
has been introduced for convenience. It should be pointed out that, so long as we have no
formulas relating the charge or the potentials with already known quantities, the units for
measuring these new quantities can be chosen arbitrarily.†
‡ The assertions which follow should be regarded as being, to a certain extent, the consequence of experimental
data. The form of the action for a particle in an electromagnetic field cannot be fixed on the basis of
general considerations alone (such as, for example, the requirement of relativistic invariance). The latter
would permit the occurrence in formula (16.1) of terms of the form $\int A\,ds$, where $A$ is a scalar function.
To avoid any misunderstanding, we repeat that we are considering classical (and not quantum) theory, and
therefore do not include effects which are related to the spins of particles.
† Concerning the establishment of these units, see § 27.
Thus the action function for a charge in an electromagnetic field has the form
$$S = \int_a^b\left(-mc\,ds - \frac{e}{c}A_i\,dx^i\right). \tag{16.1}$$
The three space components of the four-vector $A^i$ form a three-dimensional vector $\mathbf{A}$
called the vector potential of the field. The time component is called the scalar potential; we
denote it by $A^0 = \phi$. Thus
$$A^i = (\phi, \mathbf{A}). \tag{16.2}$$
Therefore the action integral can be written in the form
$$S = \int_a^b\left(-mc\,ds + \frac{e}{c}\mathbf{A}\cdot d\mathbf{r} - e\phi\,dt\right).$$
Introducing $d\mathbf{r}/dt = \mathbf{v}$, and changing to an integration over $t$,
$$S = \int_{t_1}^{t_2}\left(-mc^2\sqrt{1-\frac{v^2}{c^2}} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi\right)dt. \tag{16.3}$$
The integrand is just the Lagrangian for a charge in an electromagnetic field:
$$L = -mc^2\sqrt{1-\frac{v^2}{c^2}} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi. \tag{16.4}$$
This function differs from the Lagrangian for a free particle (8.2) by the terms $(e/c)\mathbf{A}\cdot\mathbf{v} - e\phi$,
which describe the interaction of the charge with the field.
The derivative $\partial L/\partial\mathbf{v}$ is the generalized momentum of the particle; we denote it by $\mathbf{P}$.
Carrying out the differentiation, we find
$$\mathbf{P} = \frac{m\mathbf{v}}{\sqrt{1-\dfrac{v^2}{c^2}}} + \frac{e}{c}\mathbf{A} = \mathbf{p} + \frac{e}{c}\mathbf{A}. \tag{16.5}$$
Here we have denoted by $\mathbf{p}$ the ordinary momentum of the particle, which we shall refer to
simply as its momentum.
From the Lagrangian we can find the Hamiltonian function for a particle in a field from
the general formula
$$\mathscr{H} = \mathbf{v}\cdot\frac{\partial L}{\partial\mathbf{v}} - L.$$
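Substituting (16.4), we get
$$\mathscr{H} = \frac{mc^2}{\sqrt{1-\dfrac{v^2}{c^2}}} + e\phi. \tag{16.6}$$
However, the Hamiltonian must be expressed not in terms of the velocity, but rather in terms
of the generalized momentum of the particle.
From (16.5) and (16.6) it is clear that the relation between $\mathscr{H} - e\phi$ and $\mathbf{P} - (e/c)\mathbf{A}$ is the
same as the relation between $\mathscr{H}$ and $\mathbf{p}$ in the absence of the field, i.e.
$$\left(\frac{\mathscr{H} - e\phi}{c}\right)^2 = m^2c^2 + \left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2, \tag{16.7}$$
or else
$$\mathscr{H} = \sqrt{m^2c^4 + c^2\left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2} + e\phi. \tag{16.8}$$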
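The equivalence of (16.6) and (16.8) can be confirmed numerically for a given velocity and given potentials. This is only a sketch: the numerical values of $m$, $e$, $c$, $\mathbf{v}$, $\mathbf{A}$, $\phi$ and the function names are arbitrary assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def generalized_momentum(m, e, c, v, A):
    """P = p + (e/c) A, formula (16.5)."""
    gamma = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    return tuple(m * gamma * vi + e / c * Ai for vi, Ai in zip(v, A))

def hamiltonian_from_velocity(m, e, c, v, phi):
    """Formula (16.6)."""
    return m * c**2 / math.sqrt(1.0 - dot(v, v) / c**2) + e * phi

def hamiltonian_from_momentum(m, e, c, P, A, phi):
    """Formula (16.8), expressed through the generalized momentum."""
    kin = tuple(Pi - e / c * Ai for Pi, Ai in zip(P, A))
    return math.sqrt(m**2 * c**4 + c**2 * dot(kin, kin)) + e * phi

m, e, c = 1.0, -0.5, 2.0
v = (0.6, -0.8, 1.0)     # |v|^2 = 2 < c^2 = 4
A = (0.3, 0.1, -0.7)
phi = 1.2
P = generalized_momentum(m, e, c, v, A)
h_v = hamiltonian_from_velocity(m, e, c, v, phi)
h_p = hamiltonian_from_momentum(m, e, c, P, A, phi)
print(h_v, h_p)
```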
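For low velocities, i.e. for classical mechanics, the Lagrangian (16.4) goes over into
$$L = \frac{mv^2}{2} + \frac{e}{c}\mathbf{A}\cdot\mathbf{v} - e\phi. \tag{16.9}$$
In this approximation
$$\mathbf{p} = m\mathbf{v} = \mathbf{P} - \frac{e}{c}\mathbf{A},$$
and we find the following expression for the Hamiltonian:
$$\mathscr{H} = \frac{1}{2m}\left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2 + e\phi. \tag{16.10}$$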
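Finally we write the Hamilton-Jacobi equation for a particle in an electromagnetic field.
It is obtained by replacing, in the equation for the Hamiltonian, $\mathbf{P}$ by $\partial S/\partial\mathbf{r}$, and $\mathscr{H}$ by
$-(\partial S/\partial t)$. Thus we get from (16.7)
$$\left(\nabla S - \frac{e}{c}\mathbf{A}\right)^2 - \frac{1}{c^2}\left(\frac{\partial S}{\partial t} + e\phi\right)^2 + m^2c^2 = 0. \tag{16.11}$$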
§ 17. Equations of motion of a charge in a field
A charge located in a field not only is subjected to a force exerted by the field, but also in
turn acts on the field, changing it. However, if the charge e is not large, the action of the
charge on the field can be neglected. In this case, when considering the motion of the charge
in a given field, we may assume that the field itself does not depend on the coordinates or the
velocity of the charge. The precise conditions which the charge must fulfil in order to be
considered as small in the present sense, will be clarified later on (see § 75). In what follows
we shall assume that this condition is fulfilled.
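So we must find the equations of motion of a charge in a given electromagnetic field.
These equations are obtained by varying the action, i.e. they are given by the Lagrange
equations
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{\partial L}{\partial\mathbf{r}}, \tag{17.1}$$
where $L$ is given by formula (16.4).
The derivative $\partial L/\partial\mathbf{v}$ is the generalized momentum of the particle (16.5). Further, we
write
$$\frac{\partial L}{\partial\mathbf{r}} \equiv \nabla L = \frac{e}{c}\,\text{grad}\,\mathbf{A}\cdot\mathbf{v} - e\,\text{grad}\,\phi.$$
But from a formula of vector analysis,
$$\text{grad}\,(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\mathbf{b} + (\mathbf{b}\cdot\nabla)\mathbf{a} + \mathbf{b}\times\text{curl}\,\mathbf{a} + \mathbf{a}\times\text{curl}\,\mathbf{b},$$
where $\mathbf{a}$ and $\mathbf{b}$ are two arbitrary vectors. Applying this formula to $\mathbf{A}\cdot\mathbf{v}$, and remembering
that differentiation with respect to $\mathbf{r}$ is carried out for constant $\mathbf{v}$, we find
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{e}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times\text{curl}\,\mathbf{A} - e\,\text{grad}\,\phi.$$
So the Lagrange equation has the form:
$$\frac{d}{dt}\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right) = \frac{e}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times\text{curl}\,\mathbf{A} - e\,\text{grad}\,\phi.$$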
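But the total differential $(d\mathbf{A}/dt)\,dt$ consists of two parts: the change $(\partial\mathbf{A}/\partial t)\,dt$ of the vector
potential with time at a fixed point in space, and the change due to motion from one point
in space to another at distance $d\mathbf{r}$. This second part is equal to $(d\mathbf{r}\cdot\nabla)\mathbf{A}$. Thus
$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}.$$
Substituting this in the previous equation, we find
$$\frac{d\mathbf{p}}{dt} = -\frac{e}{c}\frac{\partial\mathbf{A}}{\partial t} - e\,\text{grad}\,\phi + \frac{e}{c}\,\mathbf{v}\times\text{curl}\,\mathbf{A}. \tag{17.2}$$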
This is the equation of motion of a particle in an electromagnetic field. On the left side
stands the derivative of the particle's momentum with respect to the time. Therefore the
expression on the right of (17.2) is the force exerted on the charge in an electromagnetic
field. We see that this force consists of two parts. The first part (first and second terms on the
right side of (17.2)) does not depend on the velocity of the particle. The second part (third
term) depends on the velocity, being proportional to the velocity and perpendicular to it.
The force of the first type, per unit charge, is called the electric field intensity; we denote
it by $\mathbf{E}$. So by definition,
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \text{grad}\,\phi. \tag{17.3}$$
The factor of $\mathbf{v}/c$ in the force of the second type, per unit charge, is called the magnetic
field intensity. We designate it by $\mathbf{H}$. So by definition,
$$\mathbf{H} = \text{curl}\,\mathbf{A}. \tag{17.4}$$
If in an electromagnetic field, $\mathbf{E} \neq 0$ but $\mathbf{H} = 0$, then we speak of an electric field; if
$\mathbf{E} = 0$ but $\mathbf{H} \neq 0$, then the field is said to be magnetic. In general, the electromagnetic field is
a superposition of electric and magnetic fields.
We note that $\mathbf{E}$ is a polar vector while $\mathbf{H}$ is an axial vector.
The equation of motion of a charge in an electromagnetic field can now be written as
$$\frac{d\mathbf{p}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}. \tag{17.5}$$
The expression on the right is called the Lorentz force. The first term (the force which the
electric field exerts on the charge) does not depend on the velocity of the charge, and is
along the direction of $\mathbf{E}$. The second part (the force exerted by the magnetic field on the
charge) is proportional to the velocity of the charge and is directed perpendicular to the
velocity and to the magnetic field $\mathbf{H}$.
For velocities small compared with the velocity of light, the momentum $\mathbf{p}$ is approximately
equal to its classical expression $m\mathbf{v}$, and the equation of motion (17.5) becomes
$$m\frac{d\mathbf{v}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}. \tag{17.6}$$
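Next we derive the equation for the rate of change of the kinetic energy of the particle†
with time, i.e. the derivative
$$\frac{d\mathscr{E}_{\rm kin}}{dt} = \frac{d}{dt}\left(\frac{mc^2}{\sqrt{1-\dfrac{v^2}{c^2}}}\right).$$
It is easy to check that
$$\frac{d\mathscr{E}_{\rm kin}}{dt} = \mathbf{v}\cdot\frac{d\mathbf{p}}{dt}.$$
Substituting $d\mathbf{p}/dt$ from (17.5) and noting that $\mathbf{v}\times\mathbf{H}\cdot\mathbf{v} = 0$, we have
$$\frac{d\mathscr{E}_{\rm kin}}{dt} = e\mathbf{E}\cdot\mathbf{v}. \tag{17.7}$$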
The rate of change of the kinetic energy is the work done by the field on the particle per
unit time. From (17.7) we see that this work is equal to the product of the velocity by the
force which the electric field exerts on the charge. The work done by the field during a time
$dt$, i.e. during a displacement of the charge by $d\mathbf{r}$, is clearly equal to $e\mathbf{E}\cdot d\mathbf{r}$.
We emphasize the fact that work is done on the charge only by the electric field; the magnetic
field does no work on a charge moving in it. This is connected with the fact that the
force which the magnetic field exerts on a charge is always perpendicular to the velocity of
the charge.
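A short numerical sketch makes the last statement explicit: the power delivered by the Lorentz force (17.5) equals $e\mathbf{E}\cdot\mathbf{v}$ whatever the magnetic field. The field values, charge and function names below are hypothetical and serve only as an example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(e, c, E, H, v):
    """Right-hand side of (17.5): e E + (e/c) v x H."""
    vxh = cross(v, H)
    return tuple(e * Ei + e / c * wi for Ei, wi in zip(E, vxh))

e, c = 1.5, 1.0
E = (0.2, -0.1, 0.4)
H = (0.0, 0.3, -0.6)
v = (0.1, 0.5, -0.2)

force = lorentz_force(e, c, E, H, v)
power = dot(force, v)  # rate of change of the kinetic energy, cf. (17.7)
magnetic_power = dot(lorentz_force(e, c, (0.0, 0.0, 0.0), H, v), v)
print(power, e * dot(E, v), magnetic_power)
```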
The equations of mechanics are invariant with respect to a change in sign of the time, that
is, with respect to interchange of future and past. In other words, in mechanics the two time
directions are equivalent. This means that if a certain motion is possible according to the
equations of mechanics, then the reverse motion is also possible, in which the system passes
through the same states in reverse order.
It is easy to see that this is also valid for the electromagnetic field in the theory of relativity.
In this case, however, in addition to changing $t$ into $-t$, we must reverse the sign of the magnetic
field. In fact it is easy to see that the equations of motion (17.5) are not altered if we
make the changes
$$t \to -t, \qquad \mathbf{E} \to \mathbf{E}, \qquad \mathbf{H} \to -\mathbf{H}. \tag{17.8}$$
According to (17.3) and (17.4), this does not change the scalar potential, while the vector
potential changes sign:
$$\phi \to \phi, \qquad \mathbf{A} \to -\mathbf{A}. \tag{17.9}$$
Thus, if a certain motion is possible in an electromagnetic field, then the reversed motion
is possible in a field in which the direction of $\mathbf{H}$ is reversed.
† By "kinetic" we mean the energy (9.4), which includes the rest energy.
PROBLEM
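Express the acceleration of a particle in terms of its velocity and the electric and magnetic field
intensities.
Solution: Substitute in the equation of motion (17.5) $\mathbf{p} = \mathbf{v}\mathscr{E}_{\rm kin}/c^2$, and take the expression for
$d\mathscr{E}_{\rm kin}/dt$ from (17.7). As a result, we get
$$\dot{\mathbf{v}} = \frac{e}{m}\sqrt{1-\frac{v^2}{c^2}}\left\{\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H} - \frac{1}{c^2}\,\mathbf{v}(\mathbf{v}\cdot\mathbf{E})\right\}.$$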
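The result of this problem can be checked by inserting the acceleration back into $d\mathbf{p}/dt$. The sketch below differentiates $\mathbf{p} = m\mathbf{v}/\sqrt{1-v^2/c^2}$ analytically, which gives $d\mathbf{p}/dt = m\gamma\dot{\mathbf{v}} + m\gamma^3\mathbf{v}(\mathbf{v}\cdot\dot{\mathbf{v}})/c^2$ with $\gamma = 1/\sqrt{1-v^2/c^2}$, and compares the outcome with the Lorentz force (17.5). All numerical values and function names are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def acceleration(e, m, c, E, H, v):
    """Acceleration of the charge expressed through v, E and H."""
    root = math.sqrt(1.0 - dot(v, v) / c**2)
    vxh = cross(v, H)
    ve = dot(v, E)
    return tuple(e / m * root * (E[k] + vxh[k] / c - v[k] * ve / c**2)
                 for k in range(3))

def dp_dt(m, c, v, a):
    """Time derivative of p = m gamma v for velocity v and acceleration a."""
    gamma = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    va = dot(v, a)
    return tuple(m * gamma * a[k] + m * gamma**3 * v[k] * va / c**2
                 for k in range(3))

e, m, c = 1.5, 2.0, 1.0
E = (0.2, -0.1, 0.4)
H = (0.0, 0.3, -0.6)
v = (0.3, 0.5, -0.4)

a = acceleration(e, m, c, E, H, v)
lhs = dp_dt(m, c, v, a)
vxh = cross(v, H)
rhs = tuple(e * E[k] + e / c * vxh[k] for k in range(3))  # Lorentz force
print(lhs, rhs)
```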
§ 18. Gauge invariance
Let us consider to what extent the potentials are uniquely determined. First of all we call
attention to the fact that the field is characterized by the effect which it produces on the
motion of a charge located in it. But in the equation of motion (17.5) there appear not the
potentials, but the field intensities E and H. Therefore two fields are physically identical if
they are characterized by the same vectors E and H.
If we are given potentials A and φ, then these uniquely determine (according to (17.3) and
(17.4)) the fields E and H. However, to one and the same field there can correspond different
potentials. To show this, let us add to each component of the potential the quantity $-\partial f/\partial x^k$,
where f is an arbitrary function of the coordinates and the time. Then the potential $A_k$ goes
over into
$$A'_k = A_k - \frac{\partial f}{\partial x^k}. \qquad (18.1)$$
As a result of this change there appears in the action integral (16.1) the additional term
$$\frac{e}{c}\frac{\partial f}{\partial x^k}\,dx^k = d\left(\frac{e}{c}f\right), \qquad (18.2)$$
which is a total differential and has no effect on the equations of motion. (See Mechanics,
§ 2.)
If in place of the four-potential we introduce the scalar and vector potentials, and in place
of $x^i$, the coordinates ct, x, y, z, then the four equations (18.1) can be written in the form
$$\mathbf{A}' = \mathbf{A} + \operatorname{grad} f, \quad \phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t}. \qquad (18.3)$$
It is easy to check that electric and magnetic fields determined from equations (17.3) and
(17.4) actually do not change upon replacement of A and φ by A' and φ', defined by (18.3).
Thus the transformation of potentials (18.1) does not change the fields. The potentials are
therefore not uniquely defined; the vector potential is determined to within the gradient of
an arbitrary function, and the scalar potential to within the time derivative of the same
function.
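The invariance of the fields under (18.3) can also be checked numerically. The following sketch is an illustration only: it assumes the definitions E = −(1/c) ∂A/∂t − grad φ and H = curl A of (17.3) and (17.4), evaluates them by central differences, and compares the fields before and after a gauge transformation. The sample potentials and the gauge function f are arbitrary, hypothetical choices.

```python
import math

def grad(f, r, t, h=1e-3):
    """Central-difference gradient of a scalar function f(r, t)."""
    out = []
    for i in range(3):
        rp, rm = list(r), list(r)
        rp[i] += h
        rm[i] -= h
        out.append((f(rp, t) - f(rm, t)) / (2 * h))
    return out

def fields(A, phi, r, t, c=1.0, h=1e-3):
    """E = -(1/c) dA/dt - grad(phi), H = curl A, by central differences."""
    dAdt = [(p - m) / (2 * h) for p, m in zip(A(r, t + h), A(r, t - h))]
    gphi = grad(phi, r, t, h)
    E = [-dAdt[i] / c - gphi[i] for i in range(3)]
    # jac[i][j] approximates dA_i/dx_j
    jac = [grad(lambda rr, tt, i=i: A(rr, tt)[i], r, t, h) for i in range(3)]
    H = [jac[2][1] - jac[1][2], jac[0][2] - jac[2][0], jac[1][0] - jac[0][1]]
    return E, H

def gauge_transform(A, phi, f, c=1.0, h=1e-3):
    """Return (A', phi') with A' = A + grad f, phi' = phi - (1/c) df/dt."""
    def A_new(r, t):
        return [a + g for a, g in zip(A(r, t), grad(f, r, t, h))]
    def phi_new(r, t):
        return phi(r, t) - (f(r, t + h) - f(r, t - h)) / (2 * h * c)
    return A_new, phi_new

# Arbitrary sample potentials and gauge function (hypothetical choices).
def A_sample(r, t):
    x, y, z = r
    return [y * t, x * x, math.sin(z)]

def phi_sample(r, t):
    x, y, z = r
    return x * y + t * z

def f_sample(r, t):
    x, y, z = r
    return math.sin(x) * y * t + z * z * t * t
```

Because the same difference operators are used in both places, the contributions of f cancel up to rounding errors, mirroring the analytic cancellation of the mixed derivatives.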
In particular, we see that we can add an arbitrary constant vector to the vector potential,
and an arbitrary constant to the scalar potential. This is also clear directly from the fact that
the definitions of E and H contain only derivatives of A and φ, and therefore the addition of
constants to the latter does not affect the field intensities.
Only those quantities have physical meaning which are invariant with respect to the transformation
(18.3) of the potentials; in particular all equations must be invariant under this
transformation. This invariance is called gauge invariance (in German, Eichinvarianz).†
This nonuniqueness of the potentials gives us the possibility of choosing them so that they
fulfill one auxiliary condition chosen by us. We emphasize that we can set one condition,
since we may choose the function f in (18.3) arbitrarily. In particular, it is always possible to
choose the potentials so that the scalar potential φ is zero. If the vector potential is not zero,
then it is not generally possible to make it zero, since the condition A = 0 represents three
auxiliary conditions (for the three components of A).
§ 19. Constant electromagnetic field
By a constant electromagnetic field we mean a field which does not depend on the time.
Clearly the potentials of a constant field can be chosen so that they are functions only of the
coordinates and not of the time. A constant magnetic field is equal, as before, to H = curl A.
A constant electric field is equal to
$$\mathbf{E} = -\operatorname{grad}\phi. \qquad (19.1)$$
Thus a constant electric field is determined only by the scalar potential and a constant
magnetic field only by the vector potential.
We saw in the preceding section that the potentials are not uniquely determined. However,
it is easy to convince oneself that if we describe the constant electromagnetic field in terms of
potentials which do not depend on the time, then we can add to the scalar potential, without
changing the fields, only an arbitrary constant (not depending on either the coordinates or
the time). Usually φ is subjected to the additional requirement that it have a definite value
at some particular point in space; most frequently φ is chosen to be zero at infinity. Thus the
arbitrary constant previously mentioned is determined, and the scalar potential of the constant
field is thus determined uniquely.
On the other hand, just as before, the vector potential is not uniquely determined even
for the constant electromagnetic field; namely, we can add to it the gradient of an arbitrary
function of the coordinates.
We now determine the energy of a charge in a constant electromagnetic field. If the field
is constant, then the Lagrangian for the charge also does not depend explicitly on the time.
As we know, in this case the energy is conserved and coincides with the Hamiltonian.
According to (16.6), we have
$$\mathcal{E} = \frac{mc^2}{\sqrt{1 - v^2/c^2}} + e\phi. \qquad (19.2)$$
Thus the presence of the field adds to the energy of the particle the term eφ, the potential
energy of the charge in the field. We note the important fact that the energy depends only on
the scalar and not on the vector potential. This means that the magnetic field does not affect
the energy of the charge. Only the electric field can change the energy of the particle. This is
related to the fact that the magnetic field, unlike the electric field, does no work on the charge.
† We emphasize that gauge invariance is related to the assumed constancy of e in (18.2). Thus the gauge invariance of
the equations of electrodynamics (see below) and the conservation of charge are closely related to one
another.
If the field intensities are the same at all points in space, then the field is said to be uniform.
The scalar potential of a uniform electric field can be expressed in terms of the field intensity
as
$$\phi = -\mathbf{E}\cdot\mathbf{r}. \qquad (19.3)$$
In fact, since E = const, $\nabla(\mathbf{E}\cdot\mathbf{r}) = (\mathbf{E}\cdot\nabla)\mathbf{r} = \mathbf{E}$.
The vector potential of a uniform magnetic field can be expressed in terms of its field
intensity as
$$\mathbf{A} = \tfrac{1}{2}\,\mathbf{H}\times\mathbf{r}. \qquad (19.4)$$
In fact, recalling that H = const, we obtain with the aid of well-known formulas of vector
analysis:
$$\operatorname{curl}(\mathbf{H}\times\mathbf{r}) = \mathbf{H}\operatorname{div}\mathbf{r} - (\mathbf{H}\cdot\nabla)\mathbf{r} = 2\mathbf{H}$$
(noting that div r = 3).
The vector potential of a uniform magnetic field can also be chosen in the form
$$A_x = -Hy, \quad A_y = A_z = 0 \qquad (19.5)$$
(the z axis is along the direction of H). It is easily verified that with this choice for A we
have H = curl A. In accordance with the transformation formulas (18.3), the potentials
(19.4) and (19.5) differ from one another by the gradient of some function: formula (19.5)
is obtained from (19.4) by adding ∇f, where f = −xyH/2.
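As a quick illustration of the last statement, the sketch below (hypothetical helper names, field taken along z) evaluates the two potentials (19.4) and (19.5), checks that they differ by the gradient of f = −xyH/2, and confirms by central differences that both have curl equal to H.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def A_symmetric(r, H):
    """(19.4): A = (1/2) H x r for a uniform field H (any direction)."""
    return tuple(0.5 * comp for comp in cross(H, r))

def A_asymmetric(r, Hz):
    """(19.5): A_x = -H y, A_y = A_z = 0, with H along the z axis."""
    return (-Hz * r[1], 0.0, 0.0)

def grad_f(r, Hz):
    """Gradient of the gauge function f = -x y H / 2."""
    return (-0.5 * Hz * r[1], -0.5 * Hz * r[0], 0.0)

def curl(A, r, h=1e-3):
    """Central-difference curl of A(r); exact up to rounding for A linear in r."""
    def d(i, j):
        # dA_i/dx_j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (A(rp)[i] - A(rm)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
```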
PROBLEM
Give the variational principle for the trajectory of a particle (Maupertuis' principle) in a constant
electromagnetic field in relativistic mechanics.
Solution: Maupertuis' principle consists in the statement that if the energy of a particle is conserved
(motion in a constant field), then its trajectory can be determined from the variational
equation
$$\delta\int \mathbf{P}\cdot d\mathbf{r} = 0,$$
where P is the generalized momentum of the particle, expressed in terms of the energy and the
coordinate differentials, and the integral is taken along the trajectory of the particle.† Substituting
P = p + (e/c)A and noting that the directions of p and dr coincide, we have
$$\delta\int\left(p\,dl + \frac{e}{c}\,\mathbf{A}\cdot d\mathbf{r}\right) = 0,$$
where $dl = \sqrt{d\mathbf{r}^2}$ is the element of arc. Determining p from
$$p^2 + m^2c^2 = \left(\frac{\mathcal{E} - e\phi}{c}\right)^2,$$
we obtain finally
$$\delta\int\left\{\sqrt{\frac{1}{c^2}(\mathcal{E} - e\phi)^2 - m^2c^2}\;dl + \frac{e}{c}\,\mathbf{A}\cdot d\mathbf{r}\right\} = 0.$$
† See Mechanics, § 44.
§ 20. Motion in a constant uniform electric field
Let us consider the motion of a charge e in a uniform constant electric field E. We take
the direction of the field as the X axis. The motion will obviously proceed in a plane, which
we choose as the XY plane. Then the equations of motion (17.5) become
$$\dot{p}_x = eE, \quad \dot{p}_y = 0$$
(where the dot denotes differentiation with respect to t), so that
$$p_x = eEt, \quad p_y = p_0. \qquad (20.1)$$
The time reference point has been chosen at the moment when $p_x = 0$; $p_0$ is the momentum
of the particle at that moment.
The kinetic energy of the particle (the energy omitting the potential energy in the field) is
$\mathcal{E}_{kin} = c\sqrt{m^2c^2 + p^2}$. Substituting (20.1), we find in our case
$$\mathcal{E}_{kin} = \sqrt{m^2c^4 + c^2p_0^2 + (ceEt)^2} = \sqrt{\mathcal{E}_0^2 + (ceEt)^2}, \qquad (20.2)$$
where $\mathcal{E}_0$ is the energy at t = 0.
According to (9.8) the velocity of the particle is $\mathbf{v} = \mathbf{p}c^2/\mathcal{E}_{kin}$. For the velocity $v_x = \dot{x}$
we have therefore
$$\frac{dx}{dt} = \frac{p_xc^2}{\mathcal{E}_{kin}} = \frac{c^2eEt}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}}.$$
Integrating, we find
$$x = \frac{1}{eE}\sqrt{\mathcal{E}_0^2 + (ceEt)^2}. \qquad (20.3)$$
The constant of integration we set equal to zero.†
For determining y, we have
$$\frac{dy}{dt} = \frac{p_yc^2}{\mathcal{E}_{kin}} = \frac{p_0c^2}{\sqrt{\mathcal{E}_0^2 + (ceEt)^2}},$$
from which
$$y = \frac{p_0c}{eE}\sinh^{-1}\frac{ceEt}{\mathcal{E}_0}. \qquad (20.4)$$
We obtain the equation of the trajectory by expressing t in terms of y from (20.4) and substituting
in (20.3). This gives:
$$x = \frac{\mathcal{E}_0}{eE}\cosh\frac{eEy}{p_0c}. \qquad (20.5)$$
Thus in a uniform electric field a charge moves along a catenary curve.
If the velocity of the particle is v ≪ c, then we can set $p_0 = mv_0$, $\mathcal{E}_0 = mc^2$, and expand
(20.5) in series in powers of 1/c. Then we get, to within terms of higher order,
$$x = \frac{eE}{2mv_0^2}\,y^2 + \text{const},$$
that is, the charge moves along a parabola, a result well known from classical mechanics.
† This result (for $p_0 = 0$) coincides with the solution of the problem of relativistic motion with constant
"proper acceleration" $w_0 = eE/m$ (see the problem in § 7). For the present case, the constancy of the acceleration
is related to the fact that the electric field does not change for Lorentz transformations having velocities
V along the direction of the field (see § 24).
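The formulas of § 20 lend themselves to a direct numerical check. The sketch below is illustrative only, with hypothetical function names: it evaluates the parametric solution (20.3), (20.4), compares it with the catenary (20.5), and lets one test the parabolic limit for v ≪ c by choosing a large value of c.

```python
import math

def initial_energy(p0, m=1.0, c=1.0):
    """Energy at t = 0, when the momentum is p0 (directed along y)."""
    return math.sqrt(m**2 * c**4 + c**2 * p0**2)

def position(t, E, p0, e=1.0, m=1.0, c=1.0):
    """Parametric trajectory (20.3), (20.4) in a uniform electric field E along x."""
    E0 = initial_energy(p0, m, c)
    x = math.sqrt(E0**2 + (c * e * E * t)**2) / (e * E)
    y = p0 * c / (e * E) * math.asinh(c * e * E * t / E0)
    return x, y

def catenary_x(y, E, p0, e=1.0, m=1.0, c=1.0):
    """Trajectory equation (20.5): x as a function of y."""
    E0 = initial_energy(p0, m, c)
    return E0 / (e * E) * math.cosh(e * E * y / (p0 * c))

def parabola_dx(y, E, v0, e=1.0, m=1.0):
    """Nonrelativistic limit: x - x(0) = e E y^2 / (2 m v0^2)."""
    return e * E * y**2 / (2.0 * m * v0**2)
```

For example, with c = 1000 and p0 = m v0 = 1, the difference `catenary_x(y, ...) - catenary_x(0, ...)` should be close to `parabola_dx(y, ...)` for moderate y.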
§ 21. Motion in a constant uniform magnetic field
We now consider the motion of a charge e in a uniform magnetic field H. We choose the
direction of the field as the Z axis. We rewrite the equation of motion
$$\dot{\mathbf{p}} = \frac{e}{c}\,\mathbf{v}\times\mathbf{H}$$
in another form, by substituting for the momentum, from (9.8),
$$\mathbf{p} = \frac{\mathcal{E}\mathbf{v}}{c^2},$$
where $\mathcal{E}$ is the energy of the particle, which is constant in the magnetic field. The equation of
motion then goes over into the form
$$\frac{\mathcal{E}}{c^2}\frac{d\mathbf{v}}{dt} = \frac{e}{c}\,\mathbf{v}\times\mathbf{H}, \qquad (21.1)$$
or, expressed in terms of components,
$$\dot{v}_x = \omega v_y, \quad \dot{v}_y = -\omega v_x, \quad \dot{v}_z = 0, \qquad (21.2)$$
where we have introduced the notation
$$\omega = \frac{ecH}{\mathcal{E}}. \qquad (21.3)$$
We multiply the second equation of (21.2) by i, and add it to the first:
$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y),$$
so that
$$v_x + iv_y = ae^{-i\omega t},$$
where a is a complex constant. This can be written in the form $a = v_{0t}e^{-i\alpha}$, where $v_{0t}$ and α
are real. Then
$$v_x + iv_y = v_{0t}e^{-i(\omega t + \alpha)}$$
and, separating real and imaginary parts, we find
$$v_x = v_{0t}\cos(\omega t + \alpha), \quad v_y = -v_{0t}\sin(\omega t + \alpha). \qquad (21.4)$$
The constants $v_{0t}$ and α are determined by the initial conditions; α is the initial phase, and
as for $v_{0t}$, from (21.4) it is clear that
$$v_{0t} = \sqrt{v_x^2 + v_y^2},$$
that is, $v_{0t}$ is the velocity of the particle in the XY plane, and stays constant throughout the
motion.
From (21.4) we find, integrating once more,
$$x = x_0 + r\sin(\omega t + \alpha), \quad y = y_0 + r\cos(\omega t + \alpha), \qquad (21.5)$$
where
$$r = \frac{v_{0t}}{\omega} = \frac{v_{0t}\mathcal{E}}{ecH} = \frac{cp_t}{eH} \qquad (21.6)$$
($p_t$ is the projection of the momentum on the XY plane). From the third equation of (21.2),
we find $v_z = v_{0z}$ and
$$z = z_0 + v_{0z}t. \qquad (21.7)$$
From (21.5) and (21.7), it is clear that the charge moves in a uniform magnetic field along
a helix having its axis along the direction of the magnetic field and with a radius r given by
(21.6). The magnitude of the velocity of the particle is constant. In the special case where $v_{0z} = 0$, that is, the
charge has no velocity component along the field, it moves along a circle in the plane
perpendicular to the field.
The quantity ω, as we see from the formulas, is the angular frequency of rotation of the
particle in the plane perpendicular to the field.
If the velocity of the particle is low, then we can approximately set $\mathcal{E} = mc^2$. Then the
frequency ω is changed to
$$\omega = \frac{eH}{mc}. \qquad (21.8)$$
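To make the relations (21.3) to (21.8) concrete, here is a minimal sketch (hypothetical function names, Gaussian units with e = m = c = 1 by default) that computes the rotation frequency and orbit radius from the momentum and evaluates the solution (21.4), (21.5).

```python
import math

def gyration(p_t, H, p_z=0.0, e=1.0, m=1.0, c=1.0):
    """Return (omega, radius, v_t) for transverse momentum p_t in a field H."""
    energy = math.sqrt(m**2 * c**4 + c**2 * (p_t**2 + p_z**2))
    omega = e * c * H / energy      # (21.3)
    radius = c * p_t / (e * H)      # (21.6)
    v_t = p_t * c**2 / energy       # transverse velocity, from (9.8)
    return omega, radius, v_t

def velocity_xy(t, v_t, omega, alpha=0.0):
    """Velocity components (21.4)."""
    return (v_t * math.cos(omega * t + alpha),
            -v_t * math.sin(omega * t + alpha))

def position_xy(t, radius, omega, alpha=0.0, x0=0.0, y0=0.0):
    """Coordinates (21.5)."""
    return (x0 + radius * math.sin(omega * t + alpha),
            y0 + radius * math.cos(omega * t + alpha))
```

The product `omega * radius` should equal `v_t`, and for small momenta `omega` approaches the nonrelativistic value eH/mc of (21.8).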
We shall now assume that the magnetic field remains uniform but varies slowly in
magnitude and direction. Let us see how the motion of a charged particle changes in this
case.
We know that when the conditions of the motion are changed slowly, certain quantities
called adiabatic invariants remain constant. Since the motion in the plane perpendicular to
the magnetic field is periodic, the adiabatic invariant is the integral
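$$I = \frac{1}{2\pi}\oint \mathbf{P}_t\cdot d\mathbf{r},$$
taken over a complete period of the motion, i.e. over the circumference of a circle in the
present case ($\mathbf{P}_t$ is the projection of the generalized momentum on the plane perpendicular
to H).† Substituting $\mathbf{P}_t = \mathbf{p}_t + (e/c)\mathbf{A}$, we have:
$$I = \frac{1}{2\pi}\oint \mathbf{P}_t\cdot d\mathbf{r} = \frac{1}{2\pi}\oint \mathbf{p}_t\cdot d\mathbf{r} + \frac{e}{2\pi c}\oint \mathbf{A}\cdot d\mathbf{r}.$$
In the first term we note that $\mathbf{p}_t$ is constant in magnitude and directed along dr; we apply
Stokes' theorem to the second term and write curl A = H. The contour is traversed in the
direction of the motion, which for e > 0 is opposite to the right-handed sense about H, so
that the flux of H through the orbit enters with a minus sign:
$$I = rp_t - \frac{e}{2c}Hr^2,$$
where r is the radius of the orbit. Substituting the expression (21.6) for r, we find:
$$I = \frac{cp_t^2}{2eH}. \qquad (21.9)$$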
From this we see that, for slow variation of H, the tangential momentum $p_t$ varies proportionally
to $\sqrt{H}$.
This result can also be applied to another case, when the particle moves along a helical
path in a magnetic field that is not strictly homogeneous (so that the field varies little over
distances comparable with the radius and step of the helix). Such a motion can be considered
as a motion in a circular orbit that shifts in the course of time, while relative to the orbit the
field appears to change in time but remain uniform. One can then state that the component
of the momentum transverse to the direction of the field varies according to the law
$p_t = \sqrt{CH}$, where C is a constant and H is a given function of the coordinates. On the other
hand, just as for the motion in any constant magnetic field, the energy of the particle (and
consequently the square of its momentum $p^2$) remains constant. Therefore the longitudinal
component of the momentum varies according to the formula:
$$p_l^2 = p^2 - p_t^2 = p^2 - CH(x, y, z). \qquad (21.10)$$
Since we should always have $p_l^2 \ge 0$, we see that penetration of the particle into regions of
sufficiently high field ($CH > p^2$) is impossible. During motion in the direction of increasing
field, the radius of the helical trajectory decreases proportionally to $p_t/H$ (i.e. proportionally
to $1/\sqrt{H}$), and the step proportionally to $p_l$. On reaching the boundary where $p_l$ vanishes,
the particle is reflected: while continuing to rotate in the same direction it begins to move
opposite to the gradient of the field.
Inhomogeneity of the field also leads to another phenomenon: a slow transverse shift
(drift) of the guiding center of the helical trajectory of the particle (the name given to the
center of the circular orbit); problem 3 of the next section deals with this question.
† See Mechanics, § 49. In general the integrals $\oint p\,dq$, taken over a period of the particular coordinate
q, are adiabatic invariants. In the present case the periods for the two coordinates in the plane perpendicular
to H coincide, and the integral I which we have written is the sum of the two corresponding adiabatic invariants.
However, each of these invariants individually has no special significance, since it depends on the
(nonunique) choice of the vector potential of the field. The nonuniqueness of the adiabatic invariants which
results from this is a reflection of the fact that, when we regard the magnetic field as uniform over all of space,
we cannot in principle determine the electric field which results from changes in H, since it will actually
depend on the specific conditions at infinity.
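The reflection condition that follows from (21.10) can be illustrated with a few lines of code. This is a sketch under the assumptions of the text (adiabatic motion, conserved total momentum p); the function names and the idea of specifying the motion by its initial transverse momentum at a reference field H0 are choices made here for illustration.

```python
import math

def longitudinal_momentum_sq(p, p_t0, H0, H):
    """(21.10): p_l^2 = p^2 - C H, with C fixed by p_t0^2 = C H0."""
    C = p_t0**2 / H0
    return p**2 - C * H

def mirror_field(p, p_t0, H0):
    """Field strength at which p_l vanishes and the particle is reflected."""
    return H0 * p**2 / p_t0**2

def radius_ratio(H0, H):
    """Ratio r(H)/r(H0) of the helix radius, which varies as 1/sqrt(H)."""
    return math.sqrt(H0 / H)
```

For instance, a particle whose transverse momentum is half of its total momentum where the field is H0 cannot penetrate beyond the region where the field reaches 4 H0.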
PROBLEM
Determine the frequency of vibration of a charged spatial oscillator, placed in a constant,
uniform magnetic field; the proper frequency of vibration of the oscillator (in the absence of the
field) is $\omega_0$.
Solution: The equations of forced vibration of the oscillator in a magnetic field (directed along
the z axis) are:
$$\ddot{x} + \omega_0^2x = \frac{eH}{mc}\dot{y}, \quad \ddot{y} + \omega_0^2y = -\frac{eH}{mc}\dot{x}, \quad \ddot{z} + \omega_0^2z = 0.$$
Multiplying the second equation by i and combining with the first, we find
$$\ddot{\zeta} + \omega_0^2\zeta = -i\frac{eH}{mc}\dot{\zeta},$$
where ζ = x + iy. From this we find that the frequency of vibration of the oscillator in a plane
perpendicular to the field is
$$\omega = \sqrt{\omega_0^2 + \frac{1}{4}\left(\frac{eH}{mc}\right)^2} \pm \frac{eH}{2mc}.$$
If the field H is weak, this formula goes over into
$$\omega = \omega_0 \pm \frac{eH}{2mc}.$$
The vibration along the direction of the field remains unchanged.
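The two frequencies can be verified against the characteristic equation obtained by substituting ζ ∝ e^{−iωt} into the equation for ζ, which gives ω² ∓ (eH/mc)ω − ω₀² = 0 for the two senses of rotation. The sketch below uses hypothetical function names and is meant only as a check of the algebra.

```python
import math

def oscillator_frequencies(omega0, H, e=1.0, m=1.0, c=1.0):
    """Return the two transverse frequencies sqrt(w0^2 + (eH/2mc)^2) +/- eH/2mc."""
    half = e * H / (2.0 * m * c)
    root = math.sqrt(omega0**2 + half**2)
    return root + half, root - half

def weak_field_frequencies(omega0, H, e=1.0, m=1.0, c=1.0):
    """Weak-field approximation omega0 +/- eH/2mc."""
    half = e * H / (2.0 * m * c)
    return omega0 + half, omega0 - half
```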
§ 22. Motion of a charge in constant uniform electric and magnetic fields
Finally we consider the motion of a charge in the case where there are present both
electric and magnetic fields, constant and uniform. We limit ourselves to the case where the
velocity of the charge v ≪ c, so that its momentum p = mv; as we shall see later, it is necessary
for this that the electric field be small compared to the magnetic.
We choose the direction of H as the Z axis, and the plane passing through H and E as the
YZ plane. Then the equation of motion
$$m\dot{\mathbf{v}} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}$$
can be written in the form
$$m\ddot{x} = \frac{e}{c}\dot{y}H, \quad m\ddot{y} = eE_y - \frac{e}{c}\dot{x}H, \quad m\ddot{z} = eE_z. \qquad (22.1)$$
From the third equation we see that the charge moves with uniform acceleration in the Z
direction, that is,
$$z = \frac{eE_z}{2m}t^2 + v_{0z}t. \qquad (22.2)$$
Multiplying the second equation of (22.1) by i and combining with the first, we find
$$\frac{d}{dt}(\dot{x} + i\dot{y}) + i\omega(\dot{x} + i\dot{y}) = i\frac{e}{m}E_y$$
(ω = eH/mc). The integral of this equation, where $\dot{x} + i\dot{y}$ is considered as the unknown, is
equal to the sum of the integral of the same equation without the right-hand term and a
particular integral of the equation with the right-hand term. The first of these is $ae^{-i\omega t}$, the
second is $eE_y/m\omega = cE_y/H$. Thus
$$\dot{x} + i\dot{y} = ae^{-i\omega t} + \frac{cE_y}{H}.$$
The constant a is in general complex. Writing it in the form $a = be^{i\alpha}$, with real b and α, we
see that since a is multiplied by $e^{-i\omega t}$, we can, by a suitable choice of the time origin, give
the phase α any arbitrary value. We choose this so that a is real. Then breaking up $\dot{x} + i\dot{y}$
into real and imaginary parts, we find
$$\dot{x} = a\cos\omega t + \frac{cE_y}{H}, \quad \dot{y} = -a\sin\omega t. \qquad (22.3)$$
At t = 0 the velocity is along the X axis.
We see that the components of the velocity of the particle are periodic functions of the
time. Their average values are:
$$\bar{\dot{x}} = \frac{cE_y}{H}, \quad \bar{\dot{y}} = 0.$$
This average velocity of motion of a charge in crossed electric and magnetic fields is often
called the electrical drift velocity. Its direction is perpendicular to both fields and independent
of the sign of the charge. It can be written in vector form as:
$$\bar{\mathbf{v}} = \frac{c\,\mathbf{E}\times\mathbf{H}}{H^2}. \qquad (22.4)$$
All the formulas of this section assume that the velocity of the particle is small compared
with the velocity of light; we see that for this to be so, it is necessary in particular that the
electric and magnetic fields satisfy the condition
$$\frac{E_y}{H} \ll 1, \qquad (22.5)$$
while the absolute magnitudes of $E_y$ and H can be arbitrary.
Fig. 6.
Integrating equation (22.3) again, and choosing the constant of integration so that at
t = 0, x = y = 0, we obtain
$$x = \frac{a}{\omega}\sin\omega t + \frac{cE_y}{H}t, \quad y = \frac{a}{\omega}(\cos\omega t - 1). \qquad (22.6)$$
Considered as parametric equations of a curve, these equations define a trochoid. Depending
on whether a is larger or smaller in absolute value than the quantity $cE_y/H$, the projection
of the trajectory on the plane XY has the forms shown in Figs. 6a and 6b, respectively.
If $a = -cE_y/H$, then (22.6) becomes
$$x = \frac{cE_y}{\omega H}(\omega t - \sin\omega t), \quad y = \frac{cE_y}{\omega H}(1 - \cos\omega t), \qquad (22.7)$$
that is, the projection of the trajectory on the XY plane is a cycloid (Fig. 6c).
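The trochoid (22.6) can be compared with a direct numerical integration of the equations of motion (22.1) in the XY plane. The sketch below is an illustration with hypothetical function names; it uses a standard fourth-order Runge-Kutta step (a choice made here, not part of the text) and the initial conditions x = y = 0, with initial velocity a + cE_y/H along X as in (22.3).

```python
import math

def trochoid(t, a, Ey, H, e=1.0, m=1.0, c=1.0):
    """Analytic solution (22.6) for the projection of the motion on the XY plane."""
    w = e * H / (m * c)
    drift = c * Ey / H
    return (a / w * math.sin(w * t) + drift * t,
            a / w * (math.cos(w * t) - 1.0))

def integrate(a, Ey, H, t_end, n=2000, e=1.0, m=1.0, c=1.0):
    """Integrate (22.1) in the XY plane with a fourth-order Runge-Kutta scheme."""
    w = e * H / (m * c)

    def deriv(s):
        x, y, vx, vy = s
        return (vx, vy, w * vy, e * Ey / m - w * vx)

    s = (0.0, 0.0, a + c * Ey / H, 0.0)
    dt = t_end / n
    for _ in range(n):
        k1 = deriv(s)
        k2 = deriv(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = deriv(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = deriv(tuple(si + dt * ki for si, ki in zip(s, k3)))
        s = tuple(si + dt / 6.0 * (q1 + 2 * q2 + 2 * q3 + q4)
                  for si, q1, q2, q3, q4 in zip(s, k1, k2, k3, k4))
    return s[0], s[1]
```

Over one full period 2π/ω the analytic solution returns to y = 0 with a net displacement along X equal to the drift velocity cE_y/H times the period.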
PROBLEMS
1. Determine the relativistic motion of a charge in parallel uniform electric and magnetic fields.
Solution: The magnetic field has no influence on the motion along the common direction of E
and H (the z axis), which therefore occurs under the influence of the electric field alone; therefore
according to § 20 we find:
$$z = \frac{\mathcal{E}_{kin}}{eE}, \quad \mathcal{E}_{kin} = \sqrt{\mathcal{E}_0^2 + (ceEt)^2}.$$
For the motion in the xy plane we have the equation
$$\dot{p}_x = \frac{e}{c}Hv_y, \quad \dot{p}_y = -\frac{e}{c}Hv_x,$$
or
$$\frac{d}{dt}(p_x + ip_y) = -i\frac{eH}{c}(v_x + iv_y) = -\frac{ieHc}{\mathcal{E}_{kin}}(p_x + ip_y).$$
Consequently
$$p_x + ip_y = p_te^{-i\varphi},$$
where $p_t$ is the constant value of the projection of the momentum on the xy plane, and the auxiliary
quantity φ is defined by the relation
$$d\varphi = eHc\,\frac{dt}{\mathcal{E}_{kin}},$$
from which
$$ct = \frac{\mathcal{E}_0}{eE}\sinh\frac{E}{H}\varphi. \qquad (1)$$
Furthermore we have:
$$p_x + ip_y = p_te^{-i\varphi} = \frac{\mathcal{E}_{kin}}{c^2}(\dot{x} + i\dot{y}) = \frac{eH}{c}\frac{d(x + iy)}{d\varphi},$$
so that
$$x = \frac{cp_t}{eH}\sin\varphi, \quad y = \frac{cp_t}{eH}\cos\varphi. \qquad (2)$$
Formulas (1), (2) together with the formula
$$z = \frac{\mathcal{E}_0}{eE}\cosh\frac{E}{H}\varphi \qquad (3)$$
determine the motion of the particle in parametric form. The trajectory is a helix with radius
$cp_t/eH$ and monotonically increasing step, along which the particle moves with decreasing angular
velocity $\dot{\varphi} = eHc/\mathcal{E}_{kin}$ and with a velocity along the z axis which tends toward the value c.
2. Determine the relativistic motion of a charge in electric and magnetic fields which are mutually
perpendicular and equal in magnitude.†
Solution: Choosing the z axis along H and the y axis along E and setting E = H, we write the
equations of motion:
$$\frac{dp_x}{dt} = \frac{e}{c}Ev_y, \quad \frac{dp_y}{dt} = eE\left(1 - \frac{v_x}{c}\right), \quad \frac{dp_z}{dt} = 0,$$
and, as a consequence of them, formula (17.7),
$$\frac{d\mathcal{E}_{kin}}{dt} = eEv_y.$$
From these equations we have:
$$p_z = \text{const}, \quad \mathcal{E}_{kin} - cp_x = \text{const} \equiv \alpha.$$
Also using the equation
$$\mathcal{E}_{kin}^2 - c^2p_x^2 = (\mathcal{E}_{kin} + cp_x)(\mathcal{E}_{kin} - cp_x) = c^2p_y^2 + \varepsilon^2$$
(where $\varepsilon^2 = m^2c^4 + c^2p_z^2 = \text{const}$), we find:
$$\mathcal{E}_{kin} + cp_x = \frac{1}{\alpha}(c^2p_y^2 + \varepsilon^2),$$
and so
$$\mathcal{E}_{kin} = \frac{\alpha}{2} + \frac{c^2p_y^2 + \varepsilon^2}{2\alpha}, \quad p_x = -\frac{\alpha}{2c} + \frac{c^2p_y^2 + \varepsilon^2}{2\alpha c}.$$
Furthermore, we write
$$\mathcal{E}_{kin}\frac{dp_y}{dt} = eE\left(\mathcal{E}_{kin} - \frac{\mathcal{E}_{kin}v_x}{c}\right) = eE(\mathcal{E}_{kin} - cp_x) = eE\alpha,$$
from which
$$2eEt = \left(1 + \frac{\varepsilon^2}{\alpha^2}\right)p_y + \frac{c^2}{3\alpha^2}p_y^3. \qquad (1)$$
To determine the trajectory, we make a transformation of variables in the equations
$$\frac{dx}{dt} = \frac{c^2p_x}{\mathcal{E}_{kin}}, \; \ldots$$
to the variable $p_y$ by using the relation $dt = \mathcal{E}_{kin}\,dp_y/eE\alpha$, after which integration gives the formulas:
$$x = \frac{c}{2eE}\left(-1 + \frac{\varepsilon^2}{\alpha^2}\right)p_y + \frac{c^3}{6\alpha^2eE}p_y^3, \quad y = \frac{c^2}{2\alpha eE}p_y^2, \quad z = \frac{p_zc^2}{eE\alpha}p_y. \qquad (2)$$
Formulas (1) and (2) completely determine the motion of the particle in parametric form (parameter
$p_y$). We call attention to the fact that the velocity increases most rapidly in the direction perpendicular
to E and H (the x axis).
† The problem of motion in mutually perpendicular fields E and H which are not equal in magnitude can,
by a suitable transformation of the reference system, be reduced to the problem of motion in a pure electric
or a pure magnetic field (see § 25).
3. Determine the velocity of drift of the guiding center of the orbit of a nonrelativistic charged
particle in a quasihomogeneous magnetic field (H. Alfvén, 1940).
Solution: We assume first that the particle is moving in a circular orbit, i.e. its velocity has no
longitudinal component (along the field). We write the equation of the trajectory in the form
r = R(t) + ζ(t), where R(t) is the radius vector of the guiding center (a slowly varying function of
the time), while ζ(t) is a rapidly oscillating quantity describing the rotational motion about the
guiding center. We average the force $(e/c)\,\dot{\mathbf{r}}\times\mathbf{H}(\mathbf{r})$ acting on the particle over a period of the oscillatory
(circular) motion (compare Mechanics, § 30). We expand the function H(r) in this expression
in powers of ζ:
$$\mathbf{H}(\mathbf{r}) = \mathbf{H}(\mathbf{R}) + (\boldsymbol{\zeta}\cdot\nabla)\mathbf{H}(\mathbf{R}).$$
On averaging, the terms of first order in ζ(t) vanish, while the second-degree terms give rise to
an additional force
$$\mathbf{f} = \frac{e}{c}\,\overline{\dot{\boldsymbol{\zeta}}\times(\boldsymbol{\zeta}\cdot\nabla)\mathbf{H}}.$$
For a circular orbit
$$\dot{\boldsymbol{\zeta}} = \omega\,\boldsymbol{\zeta}\times\mathbf{n}, \quad \zeta = \frac{v_\perp}{\omega},$$
where n is a unit vector along H; the frequency ω = eH/mc; $v_\perp$ is the velocity of the particle in its
circular motion. The average values of products of components of the vector ζ, rotating in a plane
(the plane perpendicular to n), are:
$$\overline{\zeta_\alpha\zeta_\beta} = \tfrac{1}{2}\zeta^2\delta_{\alpha\beta},$$
where $\delta_{\alpha\beta}$ is the unit tensor in this plane. As a result we find:
$$\mathbf{f} = -\frac{mv_\perp^2}{2H}(\mathbf{n}\times\nabla)\times\mathbf{H}.$$
Because of the equations div H = 0 and curl H = 0 which the constant field H(R) satisfies, we
have:
$$(\mathbf{n}\times\nabla)\times\mathbf{H} = -\mathbf{n}\operatorname{div}\mathbf{H} + (\mathbf{n}\cdot\nabla)\mathbf{H} + \mathbf{n}\times(\nabla\times\mathbf{H}) = (\mathbf{n}\cdot\nabla)\mathbf{H} = H(\mathbf{n}\cdot\nabla)\mathbf{n} + \mathbf{n}(\mathbf{n}\cdot\nabla H).$$
We are interested in the force transverse to n, giving rise to a shift of the orbit; it is equal to
$$\mathbf{f} = -\frac{mv_\perp^2}{2}(\mathbf{n}\cdot\nabla)\mathbf{n} = \frac{mv_\perp^2}{2\rho}\boldsymbol{\nu},$$
where ρ is the radius of curvature of the force line of the field at the given point, and ν is a unit
vector directed from the center of curvature to this point.
The case where the particle also has a longitudinal velocity $v_\parallel$ (along n) reduces to the previous case
if we go over to a reference frame which is rotating about the instantaneous center of curvature of
the force line (which is the trajectory of the guiding center) with angular velocity $v_\parallel/\rho$. In this
reference system the particle has no longitudinal velocity, but there is an additional transverse force,
the centrifugal force $mv_\parallel^2/\rho$. Thus the total transverse force is
$$\mathbf{f}_\perp = \boldsymbol{\nu}\,\frac{m}{\rho}\left(v_\parallel^2 + \frac{v_\perp^2}{2}\right).$$
This force is equivalent to a constant electric field of strength $\mathbf{f}_\perp/e$. According to (22.4) it
causes a drift of the guiding center of the orbit with a velocity
$$\mathbf{v}_d = \frac{1}{\omega\rho}\left(v_\parallel^2 + \frac{v_\perp^2}{2}\right)\boldsymbol{\nu}\times\mathbf{n}.$$
The sign of this velocity depends on the sign of the charge.
§ 23. The electromagnetic field tensor
In § 17, we derived the equation of motion of a charge in a field, starting from the
Lagrangian (16.4) written in three-dimensional form. We now derive the same equation
directly from the action (16.1) written in four-dimensional notation.
The principle of least action states
$$\delta S = \delta\int_a^b\left(-mc\,ds - \frac{e}{c}A_i\,dx^i\right) = 0. \qquad (23.1)$$
Noting that $ds = \sqrt{dx_i\,dx^i}$, we find (the limits of integration a and b are omitted for brevity):
$$\delta S = -\int\left(mc\,\frac{dx_i\,d\delta x^i}{ds} + \frac{e}{c}A_i\,d\delta x^i + \frac{e}{c}\,\delta A_i\,dx^i\right) = 0.$$
We integrate the first two terms in the integrand by parts. Also, in the first term we set
$dx_i/ds = u_i$, where $u_i$ are the components of the four-velocity. Then
$$\int\left(mc\,du_i\,\delta x^i + \frac{e}{c}\,\delta x^i\,dA_i - \frac{e}{c}\,\delta A_i\,dx^i\right) - \left[\left(mcu_i + \frac{e}{c}A_i\right)\delta x^i\right] = 0. \qquad (23.2)$$
The second term in this equation is zero, since the integral is varied with fixed coordinate
values at the limits. Furthermore:
$$\delta A_i = \frac{\partial A_i}{\partial x^k}\,\delta x^k, \quad dA_i = \frac{\partial A_i}{\partial x^k}\,dx^k,$$
and therefore
$$\int\left(mc\,du_i\,\delta x^i + \frac{e}{c}\frac{\partial A_i}{\partial x^k}\,\delta x^i\,dx^k - \frac{e}{c}\frac{\partial A_i}{\partial x^k}\,dx^i\,\delta x^k\right) = 0.$$
In the first term we write $du_i = (du_i/ds)\,ds$, in the second and third, $dx^i = u^i\,ds$. In addition,
in the third term we interchange the indices i and k (this changes nothing since the indices i
and k are summed over). Then
$$\int\left[mc\frac{du_i}{ds} - \frac{e}{c}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right)u^k\right]\delta x^i\,ds = 0.$$
In view of the arbitrariness of $\delta x^i$, it follows that the integrand is zero, that is,
$$mc\frac{du_i}{ds} = \frac{e}{c}\left(\frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}\right)u^k.$$
We now introduce the notation
$$F_{ik} = \frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}. \qquad (23.3)$$
The antisymmetric tensor $F_{ik}$ is called the electromagnetic field tensor. The equation of
motion then takes the form:
$$mc\frac{du^i}{ds} = \frac{e}{c}F^{ik}u_k. \qquad (23.4)$$
These are the equations of motion of a charge in four-dimensional form.
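The meaning of the individual components of the tensor $F_{ik}$ is easily seen by substituting
the values $A_i = (\phi, -\mathbf{A})$ in the definition (23.3). The result can be written as a matrix in
which the index i = 0, 1, 2, 3 labels the rows, and the index k the columns:
$$F_{ik} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -H_z & H_y \\ -E_y & H_z & 0 & -H_x \\ -E_z & -H_y & H_x & 0 \end{pmatrix}, \quad F^{ik} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -H_z & H_y \\ E_y & H_z & 0 & -H_x \\ E_z & -H_y & H_x & 0 \end{pmatrix}. \qquad (23.5)$$
More briefly, we can write (see § 6):
$$F_{ik} = (\mathbf{E}, \mathbf{H}), \quad F^{ik} = (-\mathbf{E}, \mathbf{H}).$$
Thus the components of the electric and magnetic field strengths are components of the
same electromagnetic field four-tensor.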
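The component form (23.5) can be exercised numerically. The sketch below (hypothetical function names, metric signature +, −, −, − as in the text) builds $F_{ik}$ from E and H, raises both indices, and evaluates the right-hand side $(e/c)F^{ik}u_k$ of (23.4). For a particle with velocity v, its space part should equal γ/c times the force eE + (e/c) v × H, and its time part γ/c times eE·v/c.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)

def field_tensor_lower(E, H):
    """Covariant components F_ik of (23.5); rows i, columns k."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, -Hz, Hy],
            [-Ey, Hz, 0.0, -Hx],
            [-Ez, -Hy, Hx, 0.0]]

def raise_both(F):
    """Contravariant components F^ik obtained by raising both indices."""
    return [[METRIC[i] * METRIC[k] * F[i][k] for k in range(4)]
            for i in range(4)]

def four_force(E, H, v, e=1.0, c=1.0):
    """Right-hand side of (23.4): (e/c) F^ik u_k for a particle of velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    u_upper = (gamma, gamma * v[0] / c, gamma * v[1] / c, gamma * v[2] / c)
    u_lower = tuple(METRIC[k] * u_upper[k] for k in range(4))
    F_up = raise_both(field_tensor_lower(E, H))
    return tuple(e / c * sum(F_up[i][k] * u_lower[k] for k in range(4))
                 for i in range(4))

def lorentz_force(E, H, v, e=1.0, c=1.0):
    """Three-dimensional force e E + (e/c) v x H of (17.5)."""
    vxH = (v[1] * H[2] - v[2] * H[1],
           v[2] * H[0] - v[0] * H[2],
           v[0] * H[1] - v[1] * H[0])
    return tuple(e * (E[i] + vxH[i] / c) for i in range(3))
```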
Changing to three-dimensional notation, it is easy to verify that the three space components
(i = 1, 2, 3) of (23.4) are identical with the vector equation of motion (17.5), while
the time component (i = 0) gives the work equation (17.7). The latter is a consequence of
the equations of motion; the fact that only three of the four equations are independent can
also easily be found directly by multiplying both sides of (23.4) by $u_i$. Then the left side of the
equation vanishes because of the orthogonality of the four-vectors $u^i$ and $du_i/ds$, while the
right side vanishes because of the antisymmetry of $F^{ik}$.
If we admit only possible trajectories when we vary S, the first term in (23.2) vanishes
identically. Then the second term, in which the upper limit is considered as variable, gives the
differential of the action as a function of the coordinates. Thus
$$\delta S = -\left(mcu_i + \frac{e}{c}A_i\right)\delta x^i. \qquad (23.6)$$
Then
$$-\frac{\partial S}{\partial x^i} = mcu_i + \frac{e}{c}A_i = p_i + \frac{e}{c}A_i. \qquad (23.7)$$
The four-vector $-\partial S/\partial x^i$ is the four-vector $P_i$ of the generalized momentum of the particle.
Substituting the values of the components $p_i$ and $A_i$, we find that
$$P^i = \left(\frac{\mathcal{E}_{kin} + e\phi}{c},\; \mathbf{p} + \frac{e}{c}\mathbf{A}\right). \qquad (23.8)$$
As expected, the space components of the four-vector form the three-dimensional generalized
momentum vector (16.5), while the time component is $\mathcal{E}/c$, where $\mathcal{E}$ is the total energy
of the charge in the field.
§ 24. Lorentz transformation of the field
In this section we find the transformation formulas for fields, that is, formulas by means
of which we can determine the field in one inertial system of reference, knowing the same
field in another system.
The formulas for transformation of the potentials are obtained directly from the general
formulas for transformation of four-vectors (6.1). Remembering that $A^i = (\phi, \mathbf{A})$, we get
easily
$$\phi = \frac{\phi' + \frac{V}{c}A'_x}{\sqrt{1 - \frac{V^2}{c^2}}}, \quad A_x = \frac{A'_x + \frac{V}{c}\phi'}{\sqrt{1 - \frac{V^2}{c^2}}}, \quad A_y = A'_y, \quad A_z = A'_z. \qquad (24.1)$$
The transformation formulas for an antisymmetric second-rank tensor (like $F^{ik}$) were
found in problem 2 of § 6: the components $F^{23}$ and $F^{01}$ do not change, while the components
$F^{02}$, $F^{03}$, and $F^{12}$, $F^{13}$ transform like $x^0$ and $x^1$, respectively. Expressing the
components of $F^{ik}$ in terms of the components of the fields E and H, according to (23.5),
we then find the following formulas of transformation for the electric field:
$$E_x = E'_x, \quad E_y = \frac{E'_y + \frac{V}{c}H'_z}{\sqrt{1 - \frac{V^2}{c^2}}}, \quad E_z = \frac{E'_z - \frac{V}{c}H'_y}{\sqrt{1 - \frac{V^2}{c^2}}}, \qquad (24.2)$$
and for the magnetic field:
$$H_x = H'_x, \quad H_y = \frac{H'_y - \frac{V}{c}E'_z}{\sqrt{1 - \frac{V^2}{c^2}}}, \quad H_z = \frac{H'_z + \frac{V}{c}E'_y}{\sqrt{1 - \frac{V^2}{c^2}}}. \qquad (24.3)$$
Thus the electric and magnetic fields, like the majority of physical quantities, are relative;
that is, their properties are different in different reference systems. In particular, the electric
or the magnetic field can be equal to zero in one reference system and at the same time be
present in another system.
The formulas (24.2), (24.3) simplify considerably for the case V ≪ c. To terms of order
V/c, we have:
$$E_x = E'_x, \quad E_y = E'_y + \frac{V}{c}H'_z, \quad E_z = E'_z - \frac{V}{c}H'_y;$$
$$H_x = H'_x, \quad H_y = H'_y - \frac{V}{c}E'_z, \quad H_z = H'_z + \frac{V}{c}E'_y.$$
These formulas can be written in vector form
$$\mathbf{E} = \mathbf{E}' + \frac{1}{c}\,\mathbf{H}'\times\mathbf{V}, \quad \mathbf{H} = \mathbf{H}' - \frac{1}{c}\,\mathbf{E}'\times\mathbf{V}. \qquad (24.4)$$
The formulas for the inverse transformation from K' to K are obtained directly from
(24.2)-(24.4) by changing the sign of V and shifting the prime.
If the magnetic field H' = 0 in the K' system, then, as we easily verify on the basis of
(24.2) and (24.3), the following relation exists between the electric and magnetic fields in
the K system:
$$\mathbf{H} = \frac{1}{c}\,\mathbf{V}\times\mathbf{E}. \qquad (24.5)$$
If in the K' system, E' = 0, then in the K system
$$\mathbf{E} = -\frac{1}{c}\,\mathbf{V}\times\mathbf{H}. \qquad (24.6)$$
Consequently, in both cases, in the K system the magnetic and electric fields are mutually
perpendicular.
These formulas also have a significance when used in the reverse direction: if the fields E
and H are mutually perpendicular (but not equal in magnitude) in some reference system K,
then there exists a reference system K' in which the field is pure electric or pure magnetic.
The velocity V of this system (relative to K) is perpendicular to E and H and equal in
magnitude to cH/E in the first case (where we must have H < E) and to cE/H in the second
case (where E < H).
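The transformation formulas (24.2), (24.3) and their consequences can be tried out numerically. The sketch below is illustrative, with a hypothetical function name; it assumes, as in the text, that the K' system moves relative to K with velocity V along the x axis, and it obtains the inverse transformation by changing the sign of V.

```python
import math

def boost_fields(E_prime, H_prime, V, c=1.0):
    """Fields (E, H) in K from the fields in K' via (24.2) and (24.3)."""
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    Ex, Ey, Ez = E_prime
    Hx, Hy, Hz = H_prime
    E = (Ex, gamma * (Ey + beta * Hz), gamma * (Ez - beta * Hy))
    H = (Hx, gamma * (Hy - beta * Ez), gamma * (Hz + beta * Ey))
    return E, H

# Example: perpendicular fields with E < H in K (E along y, H along z).
E_K, H_K = (0.0, 0.3, 0.0), (0.0, 0.0, 1.0)
V_pure = 0.3  # magnitude c E / H with c = 1, directed along x (along E x H)
# Fields in K' follow from the inverse transformation (sign of V reversed).
E_Kp, H_Kp = boost_fields(E_K, H_K, -V_pure)
print(E_Kp, H_Kp)
```

In this example the electric field in K' vanishes to rounding accuracy and the remaining magnetic field has the magnitude √(H² − E²).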
§ 25. Invariants of the field
From the electric and magnetic field intensities we can form invariant quantities, which
remain unchanged in the transition from one inertial reference system to another.
The form of these invariants is easily found starting from the fourdimensional representa
tion of the field using the antisymmetric fourtensor F ik . It is obvious that we can form the
following invariant quantities from the components of this tensor:
F ik F ik = inv, (25.1)
e iklm F ik F lm = mv, (25 .2)
where e lklm is the completely antisymmetric unit tensor of the fourth rank (cf. § 6). The first
quantity is a scalar, while the second is a pseudoscalar (the product of the tensor F ik with its
dual tensor, f
Expressing F ik in terms of the components of E and H using (23.5), it is easily shown that,
in threedimensional form, these invariants have the form:
H 2 E 2 = inv, (25.3)
EH = inv. (25.4)
The pseudoscalar character of the second of these is here apparent from the fact that it is the
product of the polar vector E with the axial vector H (whereas its square (E • H) 2 is a true
scalar) .
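† We also note that the pseudoscalar (25.2) can also be expressed as a four-divergence:
$$e^{iklm}F_{ik}F_{lm} = 4\,\frac{\partial}{\partial x^i}\left(e^{iklm}A_k\frac{\partial A_m}{\partial x^l}\right),$$
as can be easily verified by using the antisymmetry of $e^{iklm}$.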
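The reduction of (25.1) and (25.2) to the three-dimensional forms (25.3) and (25.4) can be checked numerically. The sketch below is only an illustration: it assumes the component layout of $F_{ik}$ referred to as (23.5), the metric signature (+, −, −, −), and the convention $e^{0123} = +1$; the function names and sample field values are hypothetical. Under these assumptions the contractions come out as 2(H² − E²) and −8 E·H.

```python
from itertools import permutations

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g^{ik}, signature (+, -, -, -)

def field_tensor(E, H):
    """Covariant components F_ik, arranged as in (23.5); indices run over 0..3."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    return [
        [0.0,  Ex,  Ey,  Ez],
        [-Ex, 0.0, -Hz,  Hy],
        [-Ey,  Hz, 0.0, -Hx],
        [-Ez, -Hy,  Hx, 0.0],
    ]

def parity(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # puts the value j into slot j
            sign = -sign
    return sign

def scalar_invariant(F):
    """F_ik F^ik; indices are raised with the diagonal metric."""
    return sum(F[i][k] * F[i][k] * METRIC[i] * METRIC[k]
               for i in range(4) for k in range(4))

def pseudoscalar_invariant(F):
    """e^{iklm} F_ik F_lm with e^{0123} = +1 (assumed convention)."""
    return sum(parity(p) * F[p[0]][p[1]] * F[p[2]][p[3]]
               for p in permutations(range(4)))

E = (1.0, 2.0, 3.0)      # hypothetical sample values
H = (-1.0, 0.5, 2.0)
F = field_tensor(E, H)
E2 = sum(c * c for c in E)
H2 = sum(c * c for c in H)
EdotH = sum(a * b for a, b in zip(E, H))

print(scalar_invariant(F), 2.0 * (H2 - E2))       # the two numbers agree
print(pseudoscalar_invariant(F), -8.0 * EdotH)    # the two numbers agree
```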
From the invariance of the two expressions presented, we get the following theorems. If
the electric and magnetic fields are mutually perpendicular in any reference system, that is,
E • H = 0, then they are also perpendicular in every other inertial reference system. If the
absolute values of E and H are equal to each other in any reference system, then they are the
same in any other system.
The following inequalities are also clearly valid. If in any reference system E > H (or
H > E), then in every other system we will have E > H (or H > E). If in any system of
reference the vectors E and H make an acute (or obtuse) angle, then they will make an acute
(or obtuse) angle in every other reference system.
By means of a Lorentz transformation we can always give E and H any arbitrary values,
subject only to the condition that $E^2 - H^2$ and $\mathbf{E}\cdot\mathbf{H}$ have fixed values. In particular, we
can always find an inertial system in which the electric and magnetic fields are parallel to
each other at a given point. In this system $\mathbf{E}\cdot\mathbf{H} = EH$, and from the two equations
$$E^2 - H^2 = E_0^2 - H_0^2, \qquad EH = \mathbf{E}_0\cdot\mathbf{H}_0,$$
we can find the values of E and H in this system of reference ($\mathbf{E}_0$ and $\mathbf{H}_0$ are the electric and
magnetic fields in the original system of reference).
The case where both invariants are zero is excluded. In this case, E and H are equal in magnitude and
mutually perpendicular in all reference systems.
If $\mathbf{E}\cdot\mathbf{H} = 0$, then we can always find a reference system in which E = 0 or H = 0 (according
as $E^2 - H^2 < 0$ or $> 0$), that is, the field is purely magnetic or purely electric. Conversely,
if in any reference system E = 0 or H = 0, then they are mutually perpendicular in
every other system, in accordance with the statement at the end of the preceding section.
We shall give still another approach to the problem of finding the invariants of an antisymmetric
four-tensor. From this method we shall, in particular, see that (25.3) and (25.4) are
actually the only two independent invariants and at the same time we will explain some
instructive mathematical properties of the Lorentz transformations when applied to such
a four-tensor.
Let us consider the complex vector
$$\mathbf{F} = \mathbf{E} + i\mathbf{H}. \tag{25.5}$$
Using formulas (24.2) and (24.3), it is easy to see that a Lorentz transformation (along the x axis)
for this vector has the form
$$F_x = F'_x, \qquad F_y = F'_y\cosh\phi - iF'_z\sinh\phi = F'_y\cos i\phi - F'_z\sin i\phi,$$
$$F_z = F'_z\cos i\phi + F'_y\sin i\phi, \qquad \tanh\phi = \frac{V}{c}. \tag{25.6}$$
We see that a rotation in the x, t plane in four-space (which is what this Lorentz transformation
is) for the vector F is equivalent to a rotation in the y, z plane through an imaginary
angle in three-dimensional space. The set of all possible rotations in four-space (including
also the simple rotations around the x, y, and z axes) is equivalent to the set of all possible
rotations through complex angles in three-dimensional space (where the six angles of
rotation in four-space correspond to the three complex angles of rotation of the three-dimensional
system).
The only invariant of a vector with respect to rotation is its square:
$\mathbf{F}^2 = E^2 - H^2 + 2i\,\mathbf{E}\cdot\mathbf{H}$; thus the real quantities $E^2 - H^2$ and $\mathbf{E}\cdot\mathbf{H}$ are the only two independent
invariants of the tensor $F_{ik}$.
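The equivalence between the boost (24.2), (24.3) and the complex rotation (25.6) can be illustrated with ordinary complex arithmetic. The following is a minimal sketch under stated assumptions: the helper names `rotate_complex`, `transform_fields` and `square`, the sample field values, and the choice β = V/c = 0.6 are all made up for the example.

```python
import cmath
import math

def rotate_complex(F_prime, beta):
    """Apply (25.6): rotate F' = E' + iH' in the y, z plane through the angle i*phi."""
    phi = math.atanh(beta)              # tanh(phi) = V/c
    cos_i_phi = cmath.cos(1j * phi)     # equals cosh(phi)
    sin_i_phi = cmath.sin(1j * phi)     # equals i*sinh(phi)
    Fx, Fy, Fz = F_prime
    return (Fx, Fy * cos_i_phi - Fz * sin_i_phi, Fz * cos_i_phi + Fy * sin_i_phi)

def transform_fields(E_prime, H_prime, beta):
    """Fields in K from fields in K' according to (24.2) and (24.3)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E_prime
    Hx, Hy, Hz = H_prime
    E = (Ex, g * (Ey + beta * Hz), g * (Ez - beta * Hy))
    H = (Hx, g * (Hy - beta * Ez), g * (Hz + beta * Ey))
    return E, H

def square(F):
    """Complex square F^2 = E^2 - H^2 + 2i E.H (no complex conjugation)."""
    return sum(c * c for c in F)

E_prime = (1.0, 2.0, -0.5)     # hypothetical sample values
H_prime = (0.3, -1.0, 4.0)
beta = 0.6

F_prime = tuple(complex(e, h) for e, h in zip(E_prime, H_prime))
F_rotated = rotate_complex(F_prime, beta)
E, H = transform_fields(E_prime, H_prime, beta)
F_direct = tuple(complex(e, h) for e, h in zip(E, H))

print(F_rotated)
print(F_direct)                            # same components as F_rotated
print(square(F_prime), square(F_rotated))  # the square is unchanged
```

The real and imaginary parts of the unchanged square are the two invariants $E^2 - H^2$ and $2\,\mathbf{E}\cdot\mathbf{H}$.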
If $\mathbf{F}^2 \neq 0$, the vector F can be written as $\mathbf{F} = a\,\mathbf{n}$, where n is a complex unit vector ($\mathbf{n}^2 = 1$).
By a suitable complex rotation we can point n along one of the coordinate axes; it is clear
that then n becomes real and determines the directions of the two vectors E and H:
$\mathbf{F} = (E + iH)\,\mathbf{n}$; in other words we get the result that E and H become parallel to one
another.
PROBLEM
Determine the velocity of the system of reference in which the electric and magnetic fields are
parallel.
Solution: Systems of reference K' satisfying the required condition exist in infinite numbers. If
we have found one such, then the same property will be had by any other system moving relative
to the first with its velocity directed along the common direction of E and H. Therefore it is sufficient
to find one of these systems which has a velocity perpendicular to both fields. Choosing the
direction of the velocity as the x axis, and making use of the fact that in K': $E'_x = H'_x = 0$,
$E'_y H'_z - E'_z H'_y = 0$, we obtain with the aid of formulas (24.2) and (24.3) for the velocity V of the
K' system relative to the original system the following equation:
$$\frac{\mathbf{V}/c}{1 + V^2/c^2} = \frac{\mathbf{E}\times\mathbf{H}}{E^2 + H^2}$$
(we must choose that root of the quadratic equation for which V < c).
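The solution can be tried out numerically. The sketch below is an illustration only: it assumes the x axis has already been chosen along E × H (so both fields lie in the y, z plane), writes β = V/c, and uses hypothetical function names and sample values. It solves the quadratic for the root with β < 1 and then checks that the transformed fields are parallel.

```python
import math

def parallel_frame_beta(E, H):
    """Root beta < 1 of beta / (1 + beta^2) = (E x H)_x / (E^2 + H^2).

    E and H are assumed to lie in the y, z plane, with x along E x H.
    """
    k = (E[1] * H[2] - E[2] * H[1]) / (sum(c * c for c in E) + sum(c * c for c in H))
    if k == 0.0:
        return 0.0                      # fields are already parallel
    return (1.0 - math.sqrt(1.0 - 4.0 * k * k)) / (2.0 * k)

def boost_to_K_prime(E, H, beta):
    """Fields in K' moving along x with velocity beta*c: inverse of (24.2)-(24.3)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    E_prime = (E[0], g * (E[1] - beta * H[2]), g * (E[2] + beta * H[1]))
    H_prime = (H[0], g * (H[1] + beta * E[2]), g * (H[2] - beta * E[1]))
    return E_prime, H_prime

E = (0.0, 3.0, 1.0)     # hypothetical sample values
H = (0.0, 1.0, 2.0)

beta = parallel_frame_beta(E, H)
E_prime, H_prime = boost_to_K_prime(E, H, beta)
cross_x = E_prime[1] * H_prime[2] - E_prime[2] * H_prime[1]

print("V/c =", beta)
print("(E' x H')_x =", cross_x)   # vanishes up to rounding: E' and H' are parallel
```

The other root of the quadratic is 1/β, which exceeds unity and is therefore discarded.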
CHAPTER 4
THE ELECTROMAGNETIC FIELD EQUATIONS
§ 26. The first pair of Maxwell's equations
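From the expressions
$$\mathbf{H} = \operatorname{curl}\mathbf{A}, \qquad \mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \operatorname{grad}\phi$$
it is easy to obtain equations containing only E and H. To do this we find curl E:
$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl}\mathbf{A} - \operatorname{curl}\operatorname{grad}\phi.$$
But the curl of any gradient is zero. Consequently,
$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}. \tag{26.1}$$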
Taking the divergence of both sides of the equation curl A = H, and recalling that div
curl = 0, we find
div H = 0. (26.2)
The equations (26.1) and (26.2) are called the first pair of Maxwell's equations.† We note
that these two equations still do not completely determine the properties of the fields. This is
clear from the fact that they determine the change of the magnetic field with time (the
derivative ∂H/∂t), but do not determine the derivative ∂E/∂t.
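That (26.1) and (26.2) hold for any fields derived from potentials can be illustrated with finite differences. The sketch below is not from the original text: the potentials `phi` and `A` are arbitrary made-up functions, the unit choice c = 1 and the step `h` are assumptions, and all derivatives are central differences. Because central differences along different coordinates commute, the residuals come out at rounding level rather than at truncation level.

```python
import math

C = 1.0    # speed of light in illustrative units (assumption)
h = 1e-3   # finite-difference step (assumption)

def phi(x, y, z, t):
    """Hypothetical scalar potential."""
    return math.sin(x + 2.0 * y) * math.cos(t)

def A(x, y, z, t):
    """Hypothetical vector potential."""
    return (math.sin(y - t), x * z * math.cos(t), math.exp(-x * x) * t)

def d(f, i, p):
    """Central difference of f (scalar or 3-tuple valued) in coordinate i of p = (x, y, z, t)."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    a, b = f(*hi), f(*lo)
    if isinstance(a, tuple):
        return tuple((u - v) / (2.0 * h) for u, v in zip(a, b))
    return (a - b) / (2.0 * h)

def curl(f, p):
    dx, dy, dz = d(f, 0, p), d(f, 1, p), d(f, 2, p)
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def div(f, p):
    return d(f, 0, p)[0] + d(f, 1, p)[1] + d(f, 2, p)[2]

def E(x, y, z, t):
    """E = -(1/c) dA/dt - grad phi."""
    p = (x, y, z, t)
    dA_dt = d(A, 3, p)
    grad_phi = (d(phi, 0, p), d(phi, 1, p), d(phi, 2, p))
    return tuple(-dA_dt[k] / C - grad_phi[k] for k in range(3))

def H(x, y, z, t):
    """H = curl A."""
    return curl(A, (x, y, z, t))

def residuals(p):
    """Left-hand sides of curl E + (1/c) dH/dt = 0 and div H = 0 at the point p."""
    curl_E = curl(E, p)
    dH_dt = d(H, 3, p)
    faraday = tuple(curl_E[k] + dH_dt[k] / C for k in range(3))
    return faraday, div(H, p)

faraday, div_H = residuals((0.3, -0.2, 0.5, 0.7))
print(faraday, div_H)   # all entries are tiny compared with the field values
```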
Equations (26.1) and (26.2) can be written in integral form. According to Gauss' theorem
$$\int\operatorname{div}\mathbf{H}\,dV = \oint\mathbf{H}\cdot d\mathbf{f},$$
where the integral on the right goes over the entire closed surface surrounding the volume
over which the integral on the left is extended. On the basis of (26.2), we have
$$\oint\mathbf{H}\cdot d\mathbf{f} = 0. \tag{26.3}$$
The integral of a vector over a surface is called the flux of the vector through the surface.
Thus the flux of the magnetic field through every closed surface is zero.
According to Stokes' theorem,
$$\int\operatorname{curl}\mathbf{E}\cdot d\mathbf{f} = \oint\mathbf{E}\cdot d\mathbf{l},$$
where the integral on the right is taken over the closed contour bounding the surface over
which the left side is integrated. From (26.1) we find, integrating both sides for any surface,
$$\oint\mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{\partial}{\partial t}\int\mathbf{H}\cdot d\mathbf{f}. \tag{26.4}$$
† Maxwell's equations (the fundamental equations of electrodynamics) were first formulated by him in
the 1860's.
The integral of a vector over a closed contour is called the circulation of the vector around
the contour. The circulation of the electric field is also called the electromotive force in the
given contour. Thus the electromotive force in any contour is equal to minus the time
derivative of the magnetic flux through a surface bounded by this contour.
The Maxwell equations (26.1) and (26.2) can be expressed in four-dimensional notation.
Using the definition of the electromagnetic field tensor
$$F_{ik} = \frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k},$$
it is easy to verify that
$$\frac{\partial F_{ik}}{\partial x^l} + \frac{\partial F_{kl}}{\partial x^i} + \frac{\partial F_{li}}{\partial x^k} = 0. \tag{26.5}$$
The expression on the left is a tensor of third rank, which is antisymmetric in all three indices.
The only components which are not identically zero are those with $i \neq k \neq l$. Thus there are
altogether four different equations which we can easily show [by substituting from (23.5)]
coincide with equations (26.1) and (26.2).
We can construct the four-vector which is dual to this antisymmetric four-tensor of rank
three by multiplying the tensor by $e^{iklm}$ and contracting on three pairs of indices (see § 6).
Thus (26.5) can be written in the form
$$e^{iklm}\frac{\partial F_{lm}}{\partial x^k} = 0, \tag{26.6}$$
which shows explicitly that there are only four independent equations.
§ 27. The action function of the electromagnetic field
The action function S for the whole system, consisting of an electromagnetic field as well
as the particles located in it, must consist of three parts :
$$S = S_f + S_m + S_{mf}, \tag{27.1}$$
where $S_m$ is that part of the action which depends only on the properties of the particles,
that is, just the action for free particles. For a single free particle, it is given by (8.1). If there
are several particles, then their total action is the sum of the actions for each of the individual
particles. Thus,
$$S_m = -\sum mc\int ds. \tag{27.2}$$
The quantity $S_{mf}$ is that part of the action which depends on the interaction between
the particles and the field. According to § 16, we have for a system of particles:
$$S_{mf} = -\sum\frac{e}{c}\int A_k\,dx^k. \tag{27.3}$$
In each term of this sum, $A_k$ is the potential of the field at that point of space-time at which
the corresponding particle is located. The sum $S_m + S_{mf}$ is already familiar to us as the action
(16.1) for charges in a field.
Finally $S_f$ is that part of the action which depends only on the properties of the field itself,
that is, $S_f$ is the action for a field in the absence of charges. Up to now, because we were
interested only in the motion of charges in a given electromagnetic field, the quantity $S_f$,
which does not depend on the particles, did not concern us, since this term cannot affect
the motion of the particles. Nevertheless this term is necessary when we want to find
equations determining the field itself. This corresponds to the fact that from the parts
$S_m + S_{mf}$ of the action we found only two equations for the field, (26.1) and (26.2), which
are not yet sufficient for complete determination of the field.
To establish the form of the action S f for the field, we start from the following very
important property of electromagnetic fields. As experiment shows, the electromagnetic field
satisfies the so-called principle of superposition. This principle consists in the statement that
the field produced by a system of charges is the result of a simple composition of the fields
produced by each of the particles individually. This means that the resultant field intensity
at each point is equal to the vector sum of the individual field intensities at that point.
Every solution of the field equations gives a field that can exist in nature. According to the
principle of superposition, the sum of any such fields must be a field that can exist in nature,
that is, must satisfy the field equations.
As is well known, linear differential equations have just this property, that the sum of any
solutions is also a solution. Consequently the field equations must be linear differential
equations.
From the discussion, it follows that under the integral sign for the action $S_f$ there must
stand an expression quadratic in the field. Only in this case will the field equations be linear;
the field equations are obtained by varying the action, and in the variation the degree of the
expression under the integral sign decreases by unity.
The potentials cannot enter into the expression for the action $S_f$, since they are not
uniquely determined (in $S_{mf}$ this lack of uniqueness was not important). Therefore $S_f$ must
be the integral of some function of the electromagnetic field tensor $F_{ik}$. But the action must
be a scalar and must therefore be the integral of some scalar. The only such quantity is the
product $F_{ik}F^{ik}$.†
Thus $S_f$ must have the form:
$$S_f = a\iint F_{ik}F^{ik}\,dV\,dt, \qquad dV = dx\,dy\,dz,$$
where the integral extends over all of space and the time between two given moments; a is
some constant. Under the integral stands $F_{ik}F^{ik} = 2(H^2 - E^2)$. The field E contains the
derivative ∂A/∂t; but it is easy to see that $(\partial\mathbf{A}/\partial t)^2$ must appear in the action with the
positive sign (and therefore $E^2$ must have a positive sign). For if $(\partial\mathbf{A}/\partial t)^2$ appeared in $S_f$
with a minus sign, then sufficiently rapid change of the potential with time (in the time
interval under consideration) could always make $S_f$ a negative quantity with arbitrarily
large absolute value. Consequently $S_f$ could not have a minimum, as is required by the
principle of least action. Thus, a must be negative.
† The function in the integrand of $S_f$ must not include derivatives of $F_{ik}$, since the Lagrangian can contain,
aside from the coordinates, only their first time derivatives. The role of "coordinates" (i.e., parameters to be
varied in the principle of least action) is in this case played by the field potential $A_k$; this is analogous to the
situation in mechanics where the Lagrangian of a mechanical system contains only the coordinates of the
particles and their first time derivatives.
As for the quantity $e^{iklm}F_{ik}F_{lm}$ (§ 25), as pointed out in the footnote on p. 63, it is a complete
four-divergence, so that adding it to the integrand in $S_f$ would have no effect on the "equations of motion". It is
interesting that this quantity is already excluded from the action for a reason independent of the fact that it is
a pseudoscalar and not a true scalar.
The numerical value of a depends on the choice of units for measurement of the field.
We note that after the choice of a definite value for a and for the units of measurement of
field, the units for measurement of all other electromagnetic quantities are determined.
From now on we shall use the Gaussian system of units; in this system a is a dimensionless
quantity, equal to −1/16π.†
Thus the action for the field has the form
$$S_f = -\frac{1}{16\pi c}\int F_{ik}F^{ik}\,d\Omega, \qquad d\Omega = c\,dt\,dx\,dy\,dz. \tag{27.4}$$
In three-dimensional form:
$$S_f = \frac{1}{8\pi}\iint(E^2 - H^2)\,dV\,dt. \tag{27.5}$$
In other words, the Lagrangian for the field is
$$L_f = \frac{1}{8\pi}\int(E^2 - H^2)\,dV. \tag{27.6}$$
The action for field plus particles has the form
$$S = -\sum\int mc\,ds - \sum\int\frac{e}{c}A_k\,dx^k - \frac{1}{16\pi c}\int F_{ik}F^{ik}\,d\Omega. \tag{27.7}$$
We note that now the charges are not assumed to be small, as in the derivation of the
equation of motion of a charge in a given field. Therefore $A_k$ and $F_{ik}$ refer to the actual field,
that is, the external field plus the field produced by the particles themselves; $A_k$ and $F_{ik}$ now
depend on the positions and velocities of the charges.
† In addition to the Gaussian system, one also uses the Heaviside system, in which a = −1/4. In this
system of units the field equations have a more convenient form (4π does not appear) but on the other
hand, π appears in the Coulomb law. Conversely, in the Gaussian system the field equations contain 4π, but
the Coulomb law has a simple form.
§ 28. The four-dimensional current vector
Instead of treating charges as points, for mathematical convenience we frequently
consider them to be distributed continuously in space. Then we can introduce the "charge
density" ρ such that ρ dV is the charge contained in the volume dV. The density ρ is in general
a function of the coordinates and the time. The integral of ρ over a certain volume is the
charge contained in that volume.
Here we must remember that charges are actually pointlike, so that the density ρ is zero
everywhere except at points where the point charges are located, and the integral ∫ρ dV
must be equal to the sum of the charges contained in the given volume. Therefore ρ can be
expressed with the help of the δ-function in the following form:†
$$\rho = \sum_a e_a\,\delta(\mathbf{r} - \mathbf{r}_a), \tag{28.1}$$
where the sum goes over all the charges and $\mathbf{r}_a$ is the radius vector of the charge $e_a$.
The charge on a particle is, from its very definition, an invariant quantity, that is, it does
not depend on the choice of reference system. On the other hand, the density ρ is not generally
an invariant; only the product ρ dV is invariant.
Multiplying the equality de = ρ dV on both sides with $dx^i$:
$$de\,dx^i = \rho\,dV\,dx^i = \rho\,dV\,dt\,\frac{dx^i}{dt}.$$
On the left stands a four-vector (since de is a scalar and $dx^i$ is a four-vector). This means
that the right side must be a four-vector. But dV dt is a scalar, and so $\rho\,(dx^i/dt)$ is a four-vector.
This vector (we denote it by $j^i$) is called the current four-vector:
$$j^i = \rho\,\frac{dx^i}{dt}. \tag{28.2}$$
The space components of this vector form a vector in ordinary space,
$$\mathbf{j} = \rho\mathbf{v}, \tag{28.3}$$
where v is the velocity of the charge at the given point. The vector j is called the current
density vector. The time component of the current four-vector is cρ. Thus
$$j^i = (c\rho, \mathbf{j}). \tag{28.4}$$
† The δ-function δ(x) is defined as follows: δ(x) = 0 for all nonzero values of x; for x = 0, δ(0) = ∞, in
such a way that the integral
$$\int_{-\infty}^{+\infty}\delta(x)\,dx = 1. \tag{I}$$
From this definition there result the following properties: if f(x) is any continuous function, then
$$\int_{-\infty}^{+\infty} f(x)\,\delta(x - a)\,dx = f(a), \tag{II}$$
and in particular,
$$\int_{-\infty}^{+\infty} f(x)\,\delta(x)\,dx = f(0). \tag{III}$$
(The limits of integration, it is understood, need not be ±∞; the range of integration can be arbitrary,
provided it includes the point at which the δ-function does not vanish.)
The meaning of the following equalities is that the left and right sides give the same result when introduced
as factors under an integral sign:
$$\delta(-x) = \delta(x), \qquad \delta(ax) = \frac{1}{|a|}\,\delta(x). \tag{IV}$$
The last equality is a special case of the more general relation
$$\delta[\phi(x)] = \sum_i\frac{1}{|\phi'(a_i)|}\,\delta(x - a_i), \tag{V}$$
where φ(x) is a single-valued function (whose inverse need not be single-valued) and the $a_i$ are the roots of
the equation φ(x) = 0.
Just as δ(x) was defined for one variable x, we can introduce a three-dimensional δ-function, δ(r), equal to
zero everywhere except at the origin of the three-dimensional coordinate system, and whose integral over all
space is unity. As such a function we can clearly use the product δ(x) δ(y) δ(z).
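The δ-function properties (II), (IV) and (V) quoted in the footnote can be illustrated by replacing δ(x) with a very narrow normalized Gaussian. This is a sketch under stated assumptions, not a definition: the Gaussian width `eps`, the midpoint-rule integrator, the test functions, and the choice φ(x) = x² − 1 (roots ±1, |φ'| = 2) are all hypothetical choices for the example.

```python
import math

def delta_eps(x, eps=1e-3):
    """Narrow normalized Gaussian used as a smooth stand-in for delta(x)."""
    return math.exp(-x * x / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def integrate(g, lo, hi, n=100_000):
    """Midpoint rule on [lo, hi] with n equal cells."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

a = 0.4
scale = -2.5

# Property (II): integral of f(x) delta(x - a) equals f(a).
val_II = integrate(lambda x: math.cos(x) * delta_eps(x - a), -3.0, 3.0)

# Property (IV): delta(scale * x) acts as delta(x) / |scale|.
val_IV = integrate(lambda x: math.cos(x) * delta_eps(scale * x), -3.0, 3.0)

# Property (V) with phi(x) = x^2 - 1: result is [f(1) + f(-1)] / 2.
val_V = integrate(lambda x: math.exp(x) * delta_eps(x * x - 1.0), -3.0, 3.0)

print(val_II, math.cos(a))
print(val_IV, 1.0 / abs(scale))
print(val_V, 0.5 * (math.exp(1.0) + math.exp(-1.0)))
```

Each printed pair agrees to within the small error introduced by the finite width of the Gaussian.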
The total charge present in all of space is equal to the integral ∫ρ dV over all space. We
can write this integral in four-dimensional form:
$$\int\rho\,dV = \frac{1}{c}\int j^0\,dV = \frac{1}{c}\int j^i\,dS_i, \tag{28.5}$$
where the integral is taken over the entire four-dimensional hyperplane perpendicular to the
$x^0$ axis (clearly this integration means integration over the whole three-dimensional space).
Generally, the integral
$$\frac{1}{c}\int j^i\,dS_i$$
over an arbitrary hypersurface is the sum of the charges whose world lines pass through this
surface.
Let us introduce the current four-vector into the expression (27.7) for the action and
transform the second term in that expression. Introducing in place of the point charges e a
continuous distribution of charge with density ρ, we must write this term as
$$-\frac{1}{c}\int\rho A_i\,dx^i\,dV,$$
replacing the sum over the charges by an integral over the whole volume. Rewriting in the
form
$$-\frac{1}{c}\int\rho\,\frac{dx^i}{dt}A_i\,dV\,dt,$$
we see that this term is equal to
$$-\frac{1}{c^2}\int A_i j^i\,d\Omega.$$
Thus the action S takes the form
$$S = -\sum\int mc\,ds - \frac{1}{c^2}\int A_i j^i\,d\Omega - \frac{1}{16\pi c}\int F_{ik}F^{ik}\,d\Omega. \tag{28.6}$$
§ 29. The equation of continuity
The change with time of the charge contained in a certain volume is determined by the
derivative
$$\frac{\partial}{\partial t}\int\rho\,dV.$$
On the other hand, the change in unit time, say, is determined by the quantity of charge
which in unit time leaves the volume and goes to the outside or, conversely, passes to its
interior. The quantity of charge which passes in unit time through the element df of the
surface bounding our volume is equal to $\rho\mathbf{v}\cdot d\mathbf{f}$, where v is the velocity of the charge at the
point in space where the element df is located. The vector df is directed, as always, along
the external normal to the surface, that is, along the normal toward the outside of the volume
under consideration. Therefore $\rho\mathbf{v}\cdot d\mathbf{f}$ is positive if charge leaves the volume, and negative
if charge enters the volume. The total amount of charge leaving the given volume per
unit time is consequently $\oint\rho\mathbf{v}\cdot d\mathbf{f}$, where the integral extends over the whole of the closed
surface bounding the volume.
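From the equality of these two expressions, we get
$$\frac{\partial}{\partial t}\int\rho\,dV = -\oint\rho\mathbf{v}\cdot d\mathbf{f}. \tag{29.1}$$
The minus sign appears on the right, since the left side is positive if the total charge in the
given volume increases. The equation (29.1) is the so-called equation of continuity, expressing
the conservation of charge in integral form. Noting that ρv is the current density, we can
rewrite (29.1) in the form
$$\frac{\partial}{\partial t}\int\rho\,dV = -\oint\mathbf{j}\cdot d\mathbf{f}. \tag{29.2}$$
We also write this equation in differential form. To do this we apply Gauss' theorem to
(29.2):
$$\oint\mathbf{j}\cdot d\mathbf{f} = \int\operatorname{div}\mathbf{j}\,dV,$$
and we find
$$\int\left(\operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t}\right)dV = 0.$$
Since this must hold for integration over an arbitrary volume, the integrand must be zero:
$$\operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t} = 0. \tag{29.3}$$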
This is the equation of continuity in differential form.
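It is easy to check that the expression (28.1) for ρ in δ-function form automatically
satisfies the equation (29.3). For simplicity we assume that we have altogether only one
charge, so that
$$\rho = e\,\delta(\mathbf{r} - \mathbf{r}_0).$$
The current j is then
$$\mathbf{j} = e\mathbf{v}\,\delta(\mathbf{r} - \mathbf{r}_0),$$
where v is the velocity of the charge. We determine the derivative ∂ρ/∂t. During the motion
of the charge its coordinates change, that is, the vector $\mathbf{r}_0$ changes. Therefore
$$\frac{\partial\rho}{\partial t} = \frac{\partial\rho}{\partial\mathbf{r}_0}\cdot\frac{\partial\mathbf{r}_0}{\partial t}.$$
But $\partial\mathbf{r}_0/\partial t$ is just the velocity v of the charge. Furthermore, since ρ is a function of $\mathbf{r} - \mathbf{r}_0$,
$$\frac{\partial\rho}{\partial\mathbf{r}_0} = -\frac{\partial\rho}{\partial\mathbf{r}}.$$
Consequently
$$\frac{\partial\rho}{\partial t} = -\mathbf{v}\cdot\operatorname{grad}\rho = -\operatorname{div}(\rho\mathbf{v})$$
(the velocity v of the charge of course does not depend on r). Thus we arrive at the equation
(29.3).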
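The same argument can be followed numerically if the δ-function is smeared into a narrow bump that moves rigidly with the charge. The sketch below is illustrative only: the Gaussian profile, its width, the constant velocity, the sample point, and the central-difference step are all assumptions. It evaluates ∂ρ/∂t + div j for ρ depending on r − r₀(t) and j = ρv.

```python
import math

E_CHARGE = 1.0               # charge e (hypothetical units)
V = (0.3, -0.2, 0.5)         # constant velocity of the charge (hypothetical)
R0 = (0.1, 0.0, -0.4)        # position of the charge at t = 0 (hypothetical)
WIDTH = 1.0                  # width of the smeared delta-function (assumption)

def rho(r, t):
    """Smeared point charge: a Gaussian centred on r0(t) = R0 + V t."""
    d2 = sum((r[k] - R0[k] - V[k] * t) ** 2 for k in range(3))
    norm = (2.0 * math.pi * WIDTH * WIDTH) ** -1.5
    return E_CHARGE * norm * math.exp(-d2 / (2.0 * WIDTH * WIDTH))

def current(r, t):
    """j = rho v."""
    density = rho(r, t)
    return tuple(density * V[k] for k in range(3))

def continuity_residual(r, t, h=1e-4):
    """Central-difference estimate of d(rho)/dt + div j at the point (r, t)."""
    drho_dt = (rho(r, t + h) - rho(r, t - h)) / (2.0 * h)
    div_j = 0.0
    for k in range(3):
        hi, lo = list(r), list(r)
        hi[k] += h
        lo[k] -= h
        div_j += (current(hi, t)[k] - current(lo, t)[k]) / (2.0 * h)
    return drho_dt + div_j

point, time = (0.5, 0.2, -0.1), 0.7
print(continuity_residual(point, time))                  # close to zero
print((rho(point, time + 1e-4) - rho(point, time - 1e-4)) / 2e-4)  # d(rho)/dt itself is not small
```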
It is easily verified that, in four-dimensional form, the continuity equation (29.3) is
expressed by the statement that the four-divergence of the current four-vector is zero:
$$\frac{\partial j^i}{\partial x^i} = 0. \tag{29.4}$$
In the preceding section we saw that the total charge present in all of space can be written
as
$$\frac{1}{c}\int j^i\,dS_i,$$
where the integration is extended over the hyperplane $x^0$ = const. At each moment of time,
the total charge is given by such an integral taken over a different hyperplane perpendicular
to the $x^0$ axis. It is easy to verify that the equation (29.4) actually leads to conservation of
charge, that is, to the result that the integral $\int j^i\,dS_i$ is the same no matter what hyperplane
$x^0$ = const we integrate over. The difference between the integrals $\int j^i\,dS_i$ taken over two
such hyperplanes can be written in the form $\oint j^i\,dS_i$, where the integral is taken over the
whole closed hypersurface surrounding the four-volume between the two hyperplanes under
consideration (this integral differs from the required integral because of the presence of the
integral over the infinitely distant "sides" of the hypersurface which, however, drop out,
since there are no charges at infinity). Using Gauss' theorem (6.15) we can transform this to
an integral over the four-volume between the two hyperplanes and verify that
$$\oint j^i\,dS_i = \int\frac{\partial j^i}{\partial x^i}\,d\Omega = 0. \tag{29.5}$$
The proof presented clearly remains valid also for any two integrals $\int j^i\,dS_i$, in which
the integration is extended over any two infinite hypersurfaces (and not just the hyperplanes
$x^0$ = const) which each contain all of three-dimensional space. From this it follows that
the integral
$$\frac{1}{c}\int j^i\,dS_i$$
is actually identical in value (and equal to the total charge in space) no matter over what
such hypersurface the integration is taken.
We have already mentioned (see the footnote on p. 50) the close connection between the
gauge invariance of the equations of electrodynamics and the law of conservation of charge.
Let us show this once again using the expression for the action in the form (28.6). On replacing
$A_i$ by $A_i - (\partial f/\partial x^i)$, the integral
$$\frac{1}{c^2}\int j^i\frac{\partial f}{\partial x^i}\,d\Omega$$
is added to the second term in this expression. It is precisely the conservation of charge, as
expressed in the continuity equation (29.4), that enables us to write the integrand as a
four-divergence $\partial(fj^i)/\partial x^i$, after which, using Gauss' theorem, the integral over the four-volume
is transformed into an integral over the bounding hypersurface; on varying the action, these
integrals drop out and thus have no effect on the equations of motion.
§ 30. The second pair of Maxwell equations
In finding the field equations with the aid of the principle of least action we must assume
the motion of the charges to be given and vary only the potentials (which serve as the
"coordinates" of the system) ; on the other hand, to find the equations of motion we assumed
the field to be given and varied the trajectory of the particle.
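Therefore the variation of the first term in (28.6) is zero, and in the second we must not
vary the current $j^i$. Thus,
$$\delta S = -\frac{1}{c}\int\left\{\frac{1}{c}j^i\,\delta A_i + \frac{1}{8\pi}F^{ik}\,\delta F_{ik}\right\}d\Omega = 0,$$
(where we have used the fact that $F^{ik}\,\delta F_{ik} \equiv F_{ik}\,\delta F^{ik}$). Substituting $F_{ik} = \partial A_k/\partial x^i - \partial A_i/\partial x^k$,
we have
$$\delta S = -\frac{1}{c}\int\left\{\frac{1}{c}j^i\,\delta A_i + \frac{1}{8\pi}F^{ik}\frac{\partial}{\partial x^i}\delta A_k - \frac{1}{8\pi}F^{ik}\frac{\partial}{\partial x^k}\delta A_i\right\}d\Omega.$$
In the second term we interchange the indices i and k, over which the expressions are
summed, and in addition replace $F^{ki}$ by $-F^{ik}$. Then we obtain
$$\delta S = -\frac{1}{c}\int\left\{\frac{1}{c}j^i\,\delta A_i - \frac{1}{4\pi}F^{ik}\frac{\partial}{\partial x^k}\delta A_i\right\}d\Omega.$$
The second of these integrals we integrate by parts, that is, we apply Gauss' theorem:
$$\delta S = -\frac{1}{c}\int\left\{\frac{1}{c}j^i + \frac{1}{4\pi}\frac{\partial F^{ik}}{\partial x^k}\right\}\delta A_i\,d\Omega + \frac{1}{4\pi c}\int F^{ik}\,\delta A_i\,dS_k\,\Big|. \tag{30.1}$$
In the second term we must insert the values at the limits of integration. The limits for the
coordinates are at infinity, where the field is zero. At the limits of the time integration, that is,
at the given initial and final time values, the variation of the potentials is zero, since in accord
with the principle of least action the potentials are given at these times. Thus the second term
in (30.1) is zero, and we find
$$\int\left(\frac{1}{c}j^i + \frac{1}{4\pi}\frac{\partial F^{ik}}{\partial x^k}\right)\delta A_i\,d\Omega = 0.$$
Since according to the principle of least action, the variations $\delta A_i$ are arbitrary, the coefficients
of the $\delta A_i$ must be set equal to zero:
$$\frac{\partial F^{ik}}{\partial x^k} = -\frac{4\pi}{c}j^i. \tag{30.2}$$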
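Let us express these four (i = 0, 1, 2, 3) equations in three-dimensional form. For i = 1:
$$\frac{\partial F^{11}}{\partial x} + \frac{\partial F^{12}}{\partial y} + \frac{\partial F^{13}}{\partial z} + \frac{1}{c}\frac{\partial F^{10}}{\partial t} = -\frac{4\pi}{c}j^1.$$
Substituting the values for the components of $F^{ik}$, we find
$$\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - \frac{1}{c}\frac{\partial E_x}{\partial t} = \frac{4\pi}{c}j_x.$$
This together with the two succeeding equations (i = 2, 3) can be written as one vector
equation:
$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}. \tag{30.3}$$
Finally, the fourth equation (i = 0) gives
$$\operatorname{div}\mathbf{E} = 4\pi\rho. \tag{30.4}$$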
Equations (30.3) and (30.4) are the second pair of Maxwell equations.† Together with the
first pair of Maxwell equations they completely determine the electromagnetic field, and are
the fundamental equations of the theory of such fields, i.e. of electrodynamics.
† The Maxwell equations in a form applicable to point charges in the electromagnetic field in vacuum
were formulated by Lorentz.
Let us write these equations in integral form. Integrating (30.4) over a volume and
applying Gauss' theorem
$$\int\operatorname{div}\mathbf{E}\,dV = \oint\mathbf{E}\cdot d\mathbf{f},$$
we get
$$\oint\mathbf{E}\cdot d\mathbf{f} = 4\pi\int\rho\,dV. \tag{30.5}$$
Thus the flux of the electric field through a closed surface is equal to 4π times the total charge
contained in the volume bounded by the surface.
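As a numerical illustration of (30.5), the sketch below integrates the Coulomb field of a point charge at rest, E = eR/R³ in Gaussian units, over the faces of a cube. It is only a sketch: the cube size, the charge value and positions, the midpoint rule, and the function names are assumptions made for the example. With the charge inside the cube the flux comes out close to 4πe, and with the charge outside it comes out close to zero.

```python
import math

def coulomb_field(point, charge, position):
    """Gaussian-units field E = e R / R^3 of a point charge at rest."""
    R = [point[k] - position[k] for k in range(3)]
    r3 = sum(c * c for c in R) ** 1.5
    return [charge * c / r3 for c in R]

def flux_through_cube(charge, position, half=1.0, n=150):
    """Midpoint-rule flux of E through the surface of the cube |x|, |y|, |z| <= half."""
    step = 2.0 * half / n
    total = 0.0
    for axis in range(3):                 # direction of the face normal
        for sign in (-1.0, 1.0):          # the two opposite faces; outward normal = sign
            for a in range(n):
                for b in range(n):
                    u = -half + (a + 0.5) * step
                    v = -half + (b + 0.5) * step
                    point = [u, v]
                    point.insert(axis, sign * half)   # place the face coordinate
                    normal_component = coulomb_field(point, charge, position)[axis]
                    total += sign * normal_component * step * step
    return total

charge = 1.5                                                  # hypothetical value
flux_inside = flux_through_cube(charge, (0.2, -0.1, 0.3))     # charge inside the cube
flux_outside = flux_through_cube(charge, (3.0, 0.0, 0.0))     # charge outside the cube

print(flux_inside, 4.0 * math.pi * charge)
print(flux_outside)
```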
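Integrating (30.3) over an open surface and applying Stokes' theorem
$$\int\operatorname{curl}\mathbf{H}\cdot d\mathbf{f} = \oint\mathbf{H}\cdot d\mathbf{l},$$
we find
$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{1}{c}\frac{\partial}{\partial t}\int\mathbf{E}\cdot d\mathbf{f} + \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{f}. \tag{30.6}$$
The quantity
$$\frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t} \tag{30.7}$$
is called the "displacement current". From (30.6) written in the form
$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int\left(\mathbf{j} + \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{f},$$
we see that the circulation of the magnetic field around any contour is equal to 4π/c times
the sum of the true current and displacement current passing through a surface bounded by
this contour.
From the Maxwell equations we can obtain the already familiar continuity equation (29.3).
Taking the divergence of both sides of (30.3), we find
$$\operatorname{div}\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{E} + \frac{4\pi}{c}\operatorname{div}\mathbf{j}.$$
But div curl H = 0 and div E = 4πρ, according to (30.4). Thus we arrive once more at
equation (29.3). In four-dimensional form, from (30.2), we have:
$$\frac{\partial^2 F^{ik}}{\partial x^i\,\partial x^k} = -\frac{4\pi}{c}\frac{\partial j^i}{\partial x^i}.$$
But when the operator $\partial^2/\partial x^i\,\partial x^k$, which is symmetric in the indices i and k, is applied to
the antisymmetric tensor $F^{ik}$, it gives zero identically and we arrive at the continuity
equation (29.4) expressed in four-dimensional form.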
§ 31. Energy density and energy flux
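Let us multiply both sides of (30.3) by E and both sides of (26.1) by H and combine the
resultant equations. Then we get
$$\frac{1}{c}\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} + \frac{1}{c}\mathbf{H}\cdot\frac{\partial\mathbf{H}}{\partial t} = -\frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E} - (\mathbf{H}\cdot\operatorname{curl}\mathbf{E} - \mathbf{E}\cdot\operatorname{curl}\mathbf{H}).$$
Using the well-known formula of vector analysis,
$$\operatorname{div}(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot\operatorname{curl}\mathbf{a} - \mathbf{a}\cdot\operatorname{curl}\mathbf{b},$$
we rewrite this relation in the form
$$\frac{1}{2c}\frac{\partial}{\partial t}(E^2 + H^2) = -\frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E} - \operatorname{div}(\mathbf{E}\times\mathbf{H})$$
or
$$\frac{\partial}{\partial t}\left(\frac{E^2 + H^2}{8\pi}\right) = -\mathbf{j}\cdot\mathbf{E} - \operatorname{div}\mathbf{S}. \tag{31.1}$$
The vector
$$\mathbf{S} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} \tag{31.2}$$
is called the Poynting vector.
We integrate (31.1) over a volume and apply Gauss' theorem to the second term on the
right. Then we obtain
$$\frac{\partial}{\partial t}\int\frac{E^2 + H^2}{8\pi}\,dV = -\int\mathbf{j}\cdot\mathbf{E}\,dV - \oint\mathbf{S}\cdot d\mathbf{f}. \tag{31.3}$$
If the integral extends over all space, then the surface integral vanishes (the field is zero
at infinity). Furthermore, we can express the integral $\int\mathbf{j}\cdot\mathbf{E}\,dV$ as a sum $\sum e\mathbf{v}\cdot\mathbf{E}$ over all
the charges, and substitute from (17.7):
$$e\mathbf{v}\cdot\mathbf{E} = \frac{d}{dt}\mathcal{E}_{\text{kin}}.$$
Then (31.3) becomes
$$\frac{d}{dt}\left\{\int\frac{E^2 + H^2}{8\pi}\,dV + \sum\mathcal{E}_{\text{kin}}\right\} = 0. \tag{31.4}$$
Thus for the closed system consisting of the electromagnetic field and particles present in
it, the quantity in brackets in this equation is conserved. The second term in this expression
is the kinetic energy (including the rest energy of all the particles; see the footnote on p. 48),
the first term is consequently the energy of the field itself. We can therefore call the quantity
$$W = \frac{E^2 + H^2}{8\pi} \tag{31.5}$$
the energy density of the electromagnetic field; it is the energy per unit volume of the field.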
If we integrate over any finite volume, then the surface integral in (31.3) generally does
not vanish, so that we can write the equation in the form
$$\frac{\partial}{\partial t}\left\{\int\frac{E^2 + H^2}{8\pi}\,dV + \sum\mathcal{E}_{\text{kin}}\right\} = -\oint\mathbf{S}\cdot d\mathbf{f}, \tag{31.6}$$
where now the second term in the brackets is summed only over the particles present in the
volume under consideration. On the left stands the change in the total energy of field and
particles per unit time. Therefore the integral $\oint\mathbf{S}\cdot d\mathbf{f}$ must be interpreted as the flux of
field energy across the surface bounding the given volume, so that the Poynting vector S is
this flux density, that is, the amount of field energy passing through unit area of the surface in
unit time.†
† We assume that at the given moment there are no charges on the surface itself. If this were not the case,
then on the right we would have to include the energy flux transported by particles passing through the
surface.
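The definitions (31.2) and (31.5) are simple enough to evaluate directly. The sketch below is an illustration, not part of the original text: the helper names and field values are hypothetical, and Gaussian units with c in cm/s are assumed. It also checks an elementary consequence of the two definitions that is not stated in the text: since |E × H| ≤ EH ≤ (E² + H²)/2, the magnitude of S never exceeds cW, with equality when E and H are perpendicular and equal in magnitude.

```python
import math

C = 2.99792458e10   # speed of light in cm/s (Gaussian units)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def energy_density(E, H):
    """W = (E^2 + H^2) / (8 pi), equation (31.5)."""
    return (dot(E, E) + dot(H, H)) / (8.0 * math.pi)

def poynting(E, H):
    """S = (c / 4 pi) E x H, equation (31.2)."""
    return tuple(C / (4.0 * math.pi) * comp for comp in cross(E, H))

def magnitude(v):
    return math.sqrt(dot(v, v))

# Hypothetical generic fields: |S| stays below c W.
E1, H1 = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)
ratio_generic = magnitude(poynting(E1, H1)) / (C * energy_density(E1, H1))

# Perpendicular fields of equal magnitude: |S| equals c W.
E2, H2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
ratio_equal = magnitude(poynting(E2, H2)) / (C * energy_density(E2, H2))

print(ratio_generic)   # less than 1
print(ratio_equal)     # equal to 1 up to rounding
```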
§ 32. The energy-momentum tensor
In the preceding section we derived an expression for the energy of the electromagnetic
field. Now we derive this expression, together with one for the field momentum, in
four-dimensional form. In doing this we shall for simplicity consider for the present an electromagnetic
field without charges. Having in mind later applications (to the gravitational field),
and also to simplify the calculation, we present the derivation in a general form, not
specializing the nature of the system. So we consider any system whose action integral has
the form
$$S = \int\Lambda\left(q, \frac{\partial q}{\partial x^i}\right)dV\,dt = \frac{1}{c}\int\Lambda\,d\Omega, \tag{32.1}$$
where Λ is some function of the quantities q, describing the state of the system, and of their
first derivatives with respect to coordinates and time (for the electromagnetic field the
components of the four-potential are the quantities q); for brevity we write here only one
of the q's. We note that the space integral ∫Λ dV is the Lagrangian of the system, so that Λ
can be considered as the Lagrangian "density". The mathematical expression of the fact that
the system is closed is the absence of any explicit dependence of Λ on the $x^i$, similarly to the
situation for a closed system in mechanics, where the Lagrangian does not depend explicitly
on the time.
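The "equations of motion" (i.e. the field equations, if we are dealing with some field) are
obtained in accordance with the principle of least action by varying S. We have (for brevity
we write $q_{,i} \equiv \partial q/\partial x^i$),
$$\delta S = \frac{1}{c}\int\left(\frac{\partial\Lambda}{\partial q}\delta q + \frac{\partial\Lambda}{\partial q_{,i}}\delta q_{,i}\right)d\Omega = \frac{1}{c}\int\left[\frac{\partial\Lambda}{\partial q}\delta q + \frac{\partial}{\partial x^i}\left(\frac{\partial\Lambda}{\partial q_{,i}}\delta q\right) - \delta q\,\frac{\partial}{\partial x^i}\frac{\partial\Lambda}{\partial q_{,i}}\right]d\Omega = 0.$$
The second term in the integrand, after transformation by Gauss' theorem, vanishes upon
integration over all space, and we then find the following "equations of motion":
$$\frac{\partial}{\partial x^i}\frac{\partial\Lambda}{\partial q_{,i}} - \frac{\partial\Lambda}{\partial q} = 0 \tag{32.2}$$
(it is, of course, understood that we sum over any repeated index).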
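The remainder of the derivation is similar to the procedure in mechanics for deriving the
conservation of energy. Namely, we write:
$$\frac{\partial\Lambda}{\partial x^i} = \frac{\partial\Lambda}{\partial q}\frac{\partial q}{\partial x^i} + \frac{\partial\Lambda}{\partial q_{,k}}\frac{\partial q_{,k}}{\partial x^i}.$$
Substituting (32.2) and noting that $q_{,k,i} = q_{,i,k}$, we find
$$\frac{\partial\Lambda}{\partial x^i} = \frac{\partial}{\partial x^k}\left(\frac{\partial\Lambda}{\partial q_{,k}}\right)q_{,i} + \frac{\partial\Lambda}{\partial q_{,k}}\frac{\partial q_{,i}}{\partial x^k} = \frac{\partial}{\partial x^k}\left(q_{,i}\frac{\partial\Lambda}{\partial q_{,k}}\right).$$
On the other hand, we can write
$$\frac{\partial\Lambda}{\partial x^i} = \delta_i^k\frac{\partial\Lambda}{\partial x^k},$$
so that, introducing the notation
$$T_i^k = q_{,i}\frac{\partial\Lambda}{\partial q_{,k}} - \delta_i^k\Lambda, \tag{32.3}$$
we can express the relation in the form
$$\frac{\partial T_i^k}{\partial x^k} = 0. \tag{32.4}$$
We note that if there is not one but several quantities $q^{(l)}$, then in place of (32.3) we must
write
$$T_i^k = \sum_l q^{(l)}_{,i}\frac{\partial\Lambda}{\partial q^{(l)}_{,k}} - \delta_i^k\Lambda. \tag{32.5}$$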
But in § 29 we saw that an equation of the form $\partial A^k/\partial x^k = 0$, i.e. the vanishing of the
four-divergence of a vector, is equivalent to the statement that the integral $\int A^k\,dS_k$ of the
vector over a hypersurface which contains all of three-dimensional space is conserved. It is
clear that an analogous result holds for the divergence of a tensor; the equation (32.4)
asserts that the vector $P^i = \text{const}\int T^{ik}\,dS_k$ is conserved.
This vector must be identified with the four-vector of momentum of the system. We
choose the constant factor in front of the integral so that, in accord with our previous
definition, the time component $P^0$ is equal to the energy of the system multiplied by $1/c$.
To do this we note that
$$P^0 = \text{const}\int T^{0k}\,dS_k = \text{const}\int T^{00}\,dV$$
if the integration is extended over the hyperplane $x^0 = \text{const}$. On the other hand, according
to (32.3),
$$T^{00} = \dot q\,\frac{\partial\Lambda}{\partial\dot q} - \Lambda \qquad \left(\text{where } \dot q \equiv \frac{\partial q}{\partial t}\right).$$
Comparing with the usual formulas relating the energy and the Lagrangian, we see that
this quantity must be considered as the energy density of the system, and therefore $\int T^{00}\,dV$
is the total energy of the system. Thus we must set const $= 1/c$, and we get finally for the
four-momentum of the system the expression
$$P^i = \frac{1}{c}\int T^{ik}\,dS_k. \tag{32.6}$$
The tensor $T^{ik}$ is called the energy-momentum tensor of the system.
It is necessary to point out that the definition of the tensor $T^{ik}$ is not unique. In fact, if
$T^{ik}$ is defined by (32.3), then any other tensor of the form
$$T^{ik} + \frac{\partial}{\partial x^l}\psi^{ikl}, \qquad \psi^{ikl} = -\psi^{ilk} \tag{32.7}$$
will also satisfy equation (32.4), since we have identically $\partial^2\psi^{ikl}/\partial x^k\,\partial x^l = 0$. The total
four-momentum of the system does not change, since according to (6.17) we can write
$$\int\frac{\partial\psi^{ikl}}{\partial x^l}\,dS_k = \frac{1}{2}\int\left(dS_k\,\frac{\partial\psi^{ikl}}{\partial x^l} - dS_l\,\frac{\partial\psi^{ikl}}{\partial x^k}\right) = \frac{1}{2}\oint\psi^{ikl}\,df^*_{kl},$$
where the integration on the right side of the equation is extended over the (ordinary) surface
which "bounds" the hypersurface over which the integration on the left is taken. This surface
is clearly located at infinity in the three-dimensional space, and since neither field nor particles
are present at infinity this integral is zero. Thus the four-momentum of the system is, as it
must be, a uniquely determined quantity. To define the tensor $T^{ik}$ uniquely we can use the
requirement that the four-tensor of angular momentum (see § 14) of the system be expressed
in terms of the four-momentum by
$$M^{ik} = \int\left(x^i\,dP^k - x^k\,dP^i\right) = \frac{1}{c}\int\left(x^iT^{kl} - x^kT^{il}\right)dS_l, \tag{32.8}$$
that is, its "density" is expressed in terms of the "density" of momentum by the usual
formula.
It is easy to determine what conditions the energy-momentum tensor must satisfy in order
that this be valid. We note that the law of conservation of angular momentum can be
expressed, as we already know, by setting equal to zero the divergence of the expression under
the integral sign in $M^{ik}$. Thus
$$\frac{\partial}{\partial x^l}\left(x^iT^{kl} - x^kT^{il}\right) = 0.$$
Noting that $\partial x^i/\partial x^l = \delta^i_l$ and that $\partial T^{kl}/\partial x^l = 0$, we find from this
$$\delta^i_l\,T^{kl} - \delta^k_l\,T^{il} = T^{ki} - T^{ik} = 0$$
or
$$T^{ik} = T^{ki}, \tag{32.9}$$
that is, the energy-momentum tensor must be symmetric.
We note that $T^{ik}$, defined by formula (32.5), is generally speaking not symmetric, but can
be made so by transformation (32.7) with suitable $\psi^{ikl}$. Later on (§ 94) we shall see that
there is a direct method for obtaining a symmetric tensor $T^{ik}$.
As we mentioned above, if we carry out the integration in (32.6) over the hyperplane
$x^0 = \text{const}$, then $P^i$ takes on the form
$$P^i = \frac{1}{c}\int T^{i0}\,dV, \tag{32.10}$$
where the integration extends over the whole (three-dimensional) space. The space components
of $P^i$ form the three-dimensional momentum vector of the system and the time
component is its energy multiplied by $1/c$. Thus the vector with components
$$\frac{1}{c}T^{10}, \qquad \frac{1}{c}T^{20}, \qquad \frac{1}{c}T^{30}$$
may be called the "momentum density", and the quantity
$$W = T^{00}$$
the "energy density".
To clarify the meaning of the remaining components of $T^{ik}$, we separate the conservation
equation (32.4) into space and time parts:
$$\frac{1}{c}\frac{\partial T^{00}}{\partial t} + \frac{\partial T^{0\alpha}}{\partial x^\alpha} = 0, \qquad \frac{1}{c}\frac{\partial T^{\alpha 0}}{\partial t} + \frac{\partial T^{\alpha\beta}}{\partial x^\beta} = 0. \tag{32.11}$$
We integrate these equations over a volume $V$ in space. From the first equation
$$\frac{1}{c}\frac{\partial}{\partial t}\int T^{00}\,dV + \int\frac{\partial T^{0\alpha}}{\partial x^\alpha}\,dV = 0$$
or, transforming the second integral by Gauss' theorem,
$$\frac{\partial}{\partial t}\int T^{00}\,dV = -c\oint T^{0\alpha}\,df_\alpha, \tag{32.12}$$
where the integral on the right is taken over the surface surrounding the volume $V$ ($df_x$,
$df_y$, $df_z$ are the components of the three-vector of the surface element $d\mathbf{f}$). The expression on
the left is the rate of change of the energy contained in the volume $V$; from this it is clear
that the expression on the right is the amount of energy transferred across the boundary of
the volume $V$, and the vector $\mathbf{S}$ with components
$$cT^{01}, \qquad cT^{02}, \qquad cT^{03}$$
is its flux density, that is, the amount of energy passing through unit surface in unit time. Thus we
arrive at the important conclusion that the requirements of relativistic invariance, as
expressed by the tensor character of the quantities $T^{ik}$, automatically lead to a definite
connection between the energy flux and the momentum: the energy flux density is equal
to the momentum density multiplied by $c^2$.
From the second equation in (32.11) we find similarly:
$$\frac{\partial}{\partial t}\int\frac{1}{c}T^{\alpha 0}\,dV = -\oint T^{\alpha\beta}\,df_\beta. \tag{32.13}$$
On the left is the change of the momentum of the system in volume $V$ per unit time, therefore
$\oint T^{\alpha\beta}\,df_\beta$ is the momentum emerging from the volume $V$ per unit time. Thus the components
$T^{\alpha\beta}$ of the energy-momentum tensor constitute the three-dimensional tensor of momentum
flux density, the so-called stress tensor; we denote it by $\sigma_{\alpha\beta}$ ($\alpha, \beta = x, y, z$). The energy flux
density is a vector; the density of flux of momentum, which is itself a vector, must obviously
be a tensor (the component $\sigma_{\alpha\beta}$ of this tensor is the amount of the $\alpha$-component of the
momentum passing per unit time through unit surface perpendicular to the $x^\beta$ axis).
We give a table indicating the meanings of the individual components of the energy-momentum
tensor:
$$T^{ik} = \begin{pmatrix} W & S_x/c & S_y/c & S_z/c \\ S_x/c & \sigma_{xx} & \sigma_{xy} & \sigma_{xz} \\ S_y/c & \sigma_{yx} & \sigma_{yy} & \sigma_{yz} \\ S_z/c & \sigma_{zx} & \sigma_{zy} & \sigma_{zz} \end{pmatrix} \tag{32.14}$$
§ 33. Energy-momentum tensor of the electromagnetic field
We now apply the general relations obtained in the previous section to the electromagnetic
field. For the electromagnetic field, the quantity standing under the integral sign in (32.1) is
equal, according to (27.4), to
$$\Lambda = -\frac{1}{16\pi}F_{kl}F^{kl}.$$
The quantities $q$ are the components of the four-potential of the field, $A_k$, so that the definition
(32.5) of the tensor $T_i^k$ becomes
$$T_i^k = \frac{\partial A_l}{\partial x^i}\,\frac{\partial\Lambda}{\partial\left(\dfrac{\partial A_l}{\partial x^k}\right)} - \delta_i^k\Lambda.$$
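To calculate the derivatives of $\Lambda$ which appear here, we find the variation $\delta\Lambda$. We have
$$\delta\Lambda = -\frac{1}{8\pi}F^{kl}\,\delta F_{kl} = -\frac{1}{8\pi}F^{kl}\,\delta\left(\frac{\partial A_l}{\partial x^k} - \frac{\partial A_k}{\partial x^l}\right)$$
or, interchanging indices and making use of the fact that $F_{kl} = -F_{lk}$,
$$\delta\Lambda = -\frac{1}{4\pi}F^{kl}\,\delta\frac{\partial A_l}{\partial x^k}.$$
From this we see that
$$\frac{\partial\Lambda}{\partial\left(\dfrac{\partial A_l}{\partial x^k}\right)} = -\frac{1}{4\pi}F^{kl},$$
and therefore
$$T_i^k = -\frac{1}{4\pi}\frac{\partial A_l}{\partial x^i}F^{kl} + \frac{1}{16\pi}\delta_i^k\,F_{lm}F^{lm},$$
or, for the contravariant components:
$$T^{ik} = -\frac{1}{4\pi}\frac{\partial A^l}{\partial x_i}F^k{}_l + \frac{1}{16\pi}g^{ik}F_{lm}F^{lm}.$$
But this tensor is not symmetric. To symmetrize it we add the quantity
$$\frac{1}{4\pi}\frac{\partial A^i}{\partial x_l}F^k{}_l.$$
According to the field equation (30.2) in the absence of charges, $\partial F^k{}_l/\partial x_l = 0$, and therefore
$$\frac{1}{4\pi}\frac{\partial A^i}{\partial x_l}F^k{}_l = \frac{1}{4\pi}\frac{\partial}{\partial x_l}\left(A^iF^k{}_l\right),$$
so that the change made in $T^{ik}$ is of the form (32.7) and is admissible. Since $\partial A^l/\partial x_i - \partial A^i/\partial x_l
= F^{il}$, we get finally the following expression for the energy-momentum tensor of the
electromagnetic field:
$$T^{ik} = \frac{1}{4\pi}\left(-F^{il}F^k{}_l + \frac{1}{4}g^{ik}F_{lm}F^{lm}\right). \tag{33.1}$$
This tensor is obviously symmetric. In addition it has the property that
$$T_i^i = 0, \tag{33.2}$$
i.e. the sum of its diagonal terms is zero.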
Let us express the components of the tensor $T^{ik}$ in terms of the electric and magnetic field
intensities. By using the values (23.5) for the components $F_{ik}$, we easily verify that the
quantity $T^{00}$ coincides with the energy density (31.5), while the components $cT^{0\alpha}$ are the
same as the components of the Poynting vector $\mathbf{S}$ (31.2). The space components $T^{\alpha\beta}$ form
a three-dimensional tensor with components
$$\sigma_{xx} = \frac{1}{8\pi}\left(E_y^2 + E_z^2 - E_x^2 + H_y^2 + H_z^2 - H_x^2\right),$$
$$\sigma_{xy} = -\frac{1}{4\pi}\left(E_xE_y + H_xH_y\right),$$
etc., or
$$\sigma_{\alpha\beta} = \frac{1}{4\pi}\left\{-E_\alpha E_\beta - H_\alpha H_\beta + \frac{1}{2}\delta_{\alpha\beta}\left(E^2 + H^2\right)\right\}. \tag{33.3}$$
This tensor is called the Maxwell stress tensor.
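The statements above about (33.1) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the sign convention of (23.5) for $F^{ik}$ and the metric $g^{ik} = \mathrm{diag}(1, -1, -1, -1)$, and the field values and all function names are arbitrary illustrations.

```python
import math

# Diagonal of the metric g_ik = g^ik = diag(1, -1, -1, -1).
G = [1.0, -1.0, -1.0, -1.0]

def field_tensor_up(E, H):
    """Contravariant F^{ik} built from E and H (convention assumed from (23.5))."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Hz, Hy],
        [Ey, Hz, 0.0, -Hx],
        [Ez, -Hy, Hx, 0.0],
    ]

def energy_momentum_tensor(E, H):
    """T^{ik} = (1/4pi) * (-F^{il} F^k_l + (1/4) g^{ik} F_lm F^lm), eq. (33.1)."""
    F = field_tensor_up(E, H)
    # Invariant F_lm F^lm: lowering both indices multiplies by g_ll * g_mm.
    inv = sum(G[l] * G[m] * F[l][m] ** 2 for l in range(4) for m in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for k in range(4):
            # F^{il} F^k_l, with F^k_l = F^{kl} g_ll for a diagonal metric.
            contraction = sum(F[i][l] * G[l] * F[k][l] for l in range(4))
            g_ik = G[i] if i == k else 0.0
            T[i][k] = (-contraction + 0.25 * g_ik * inv) / (4.0 * math.pi)
    return T

E = (0.3, -1.2, 0.7)
H = (1.1, 0.4, -0.5)
T = energy_momentum_tensor(E, H)

E2 = sum(x * x for x in E)
H2 = sum(x * x for x in H)
W = (E2 + H2) / (8.0 * math.pi)                     # energy density
ExH = (E[1] * H[2] - E[2] * H[1],
       E[2] * H[0] - E[0] * H[2],
       E[0] * H[1] - E[1] * H[0])                   # S/c = (E x H) / 4pi
trace = sum(G[i] * T[i][i] for i in range(4))       # T^i_i
sigma_xx = (E[1]**2 + E[2]**2 - E[0]**2
            + H[1]**2 + H[2]**2 - H[0]**2) / (8.0 * math.pi)
sigma_xy = -(E[0] * E[1] + H[0] * H[1]) / (4.0 * math.pi)
```

For these sample fields, `T[0][0]` agrees with `W`, the components `T[0][1..3]` agree with $(\mathbf{E}\times\mathbf{H})/4\pi$, `trace` vanishes to rounding error, and `T[1][1]`, `T[1][2]` agree with the expressions for $\sigma_{xx}$ and $\sigma_{xy}$.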
To bring the tensor $T^{ik}$ to diagonal form, we must transform to a reference system in
which the vectors $\mathbf{E}$ and $\mathbf{H}$ (at the given point in space and moment in time) are parallel to
one another or where one of them is equal to zero; as we know (§ 25), such a transformation
is always possible except when $\mathbf{E}$ and $\mathbf{H}$ are mutually perpendicular and equal in magnitude.
It is easy to see that after the transformation the only non-zero components of $T^{ik}$ will be
$$T^{00} = -T^{11} = T^{22} = T^{33} = W$$
(the $x$ axis has been taken along the direction of the field).
But if the vectors $\mathbf{E}$ and $\mathbf{H}$ are mutually perpendicular and equal in magnitude, the tensor
$T^{ik}$ cannot be brought to diagonal form.† The non-zero components in this case are
$$T^{00} = T^{33} = T^{30} = W$$
(where the $x$ axis is taken along the direction of $\mathbf{E}$ and the $y$ axis along $\mathbf{H}$).
Up to now we have considered fields in the absence of charges. When charged particles are
present, the energy-momentum tensor of the whole system is the sum of the energy-momentum
tensors for the electromagnetic field and for the particles, where in the latter the
particles are assumed not to interact with one another.
To determine the form of the energy-momentum tensor of the particles we must describe
their mass distribution in space by using a "mass density" in the same way as we describe a
distribution of point charges in terms of their density. Analogously to formula (28.1) for
the charge density, we can write the mass density in the form
$$\mu = \sum_a m_a\,\delta(\mathbf{r} - \mathbf{r}_a), \tag{33.4}$$
where $\mathbf{r}_a$ are the radius-vectors of the particles, and the summation extends over all the
particles of the system.
The "four-momentum density" of the particles is given by $\mu cu^i$. We know that this density
is the component $T^{0\alpha}/c$ of the energy-momentum tensor, i.e. $T^{0\alpha} = \mu c^2u^\alpha$ ($\alpha = 1, 2, 3$). But
the mass density is the time component of the four-vector $(\mu/c)(dx^k/dt)$ (in analogy to the
charge density; see § 28). Therefore the energy-momentum tensor of the system of
non-interacting particles is
$$T^{ik} = \mu c\,\frac{dx^i}{ds}\frac{dx^k}{dt} = \mu c\,u^iu^k\,\frac{ds}{dt}. \tag{33.5}$$
As expected, this tensor is symmetric.
We verify by a direct computation that the energy and momentum of the system, defined
as the sum of the energies and momenta of field and particles, are actually conserved. In
other words we shall verify the equations
$$\frac{\partial}{\partial x^k}\left(T^{(f)k}_i + T^{(p)k}_i\right) = 0, \tag{33.6}$$
which express these conservation laws.
† The fact that the reduction of the symmetric tensor $T^{ik}$ to principal axes may be impossible is related to
the fact that the four-space is pseudo-euclidean. (See also the problem in § 94.)
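Differentiating (33.1), we write
$$\frac{\partial T^{(f)k}_i}{\partial x^k} = \frac{1}{4\pi}\left(\frac{1}{2}F^{lm}\frac{\partial F_{lm}}{\partial x^i} - F^{kl}\frac{\partial F_{il}}{\partial x^k} - F_{il}\frac{\partial F^{kl}}{\partial x^k}\right).$$
Substituting from the Maxwell equations (26.5) and (30.2),
$$\frac{\partial F^{kl}}{\partial x^k} = \frac{4\pi}{c}j^l, \qquad \frac{\partial F_{lm}}{\partial x^i} = -\frac{\partial F_{mi}}{\partial x^l} - \frac{\partial F_{il}}{\partial x^m},$$
we have:
$$\frac{\partial T^{(f)k}_i}{\partial x^k} = \frac{1}{4\pi}\left(-\frac{1}{2}\frac{\partial F_{mi}}{\partial x^l}F^{lm} - \frac{1}{2}\frac{\partial F_{il}}{\partial x^m}F^{lm} - \frac{\partial F_{il}}{\partial x^k}F^{kl} - \frac{4\pi}{c}F_{il}j^l\right).$$
By permuting the indices, we easily show that the first three terms on the right cancel one
another, and we arrive at the result:
$$\frac{\partial T^{(f)k}_i}{\partial x^k} = -\frac{1}{c}F_{ik}j^k. \tag{33.7}$$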
Differentiating the expression (33.5) for the energy-momentum tensor of the particles gives
$$\frac{\partial T^{(p)k}_i}{\partial x^k} = cu_i\,\frac{\partial}{\partial x^k}\left(\mu\frac{dx^k}{dt}\right) + \mu c\,\frac{dx^k}{dt}\frac{\partial u_i}{\partial x^k}.$$
The first term in this expression is zero because of the conservation of mass for
non-interacting particles. In fact, the quantities $\mu(dx^k/dt)$ constitute the "mass current"
four-vector, analogous to the charge current four-vector (28.2); the conservation of mass is
expressed by equating to zero the divergence of this four-vector:
$$\frac{\partial}{\partial x^k}\left(\mu\frac{dx^k}{dt}\right) = 0, \tag{33.8}$$
just as the conservation of charge is expressed by equation (29.4).
Thus we have:
$$\frac{\partial T^{(p)k}_i}{\partial x^k} = \mu c\,\frac{dx^k}{dt}\frac{\partial u_i}{\partial x^k} = \mu c\,\frac{du_i}{dt}.$$
Next we use the equation of motion of the charges in the field, expressed in the
four-dimensional form (23.4):
$$mc\,\frac{du_i}{ds} = \frac{e}{c}F_{ik}u^k.$$
Changing to continuous distributions of charge and mass, we have, from the definitions of
the densities $\mu$ and $\varrho$: $\mu/m = \varrho/e$. We can therefore write the equation of motion in the form
$$\mu c\,\frac{du_i}{ds} = \frac{\varrho}{c}F_{ik}u^k$$
or
$$\mu c\,\frac{du_i}{dt} = \frac{1}{c}F_{ik}\,\varrho u^k\frac{ds}{dt} = \frac{1}{c}F_{ik}j^k.$$
Thus,
$$\frac{\partial T^{(p)k}_i}{\partial x^k} = \frac{1}{c}F_{ik}j^k.$$
Combining this with (33.7), we find that we actually get zero, i.e. we arrive at equation
(33.6).
§ 34. The virial theorem
Since the sum of the diagonal terms of the energy-momentum tensor of the electromagnetic
field is equal to zero, the sum $T^i_i$ for any system of interacting particles reduces to
the trace of the energy-momentum tensor for the particles alone. Using (33.5), we therefore
have:
$$T^i_i = T^{(p)i}_i = \mu c\,u_iu^i\,\frac{ds}{dt} = \mu c\,\frac{ds}{dt} = \mu c^2\sqrt{1 - \frac{v^2}{c^2}}.$$
Let us rewrite this result, shifting to a summation over the particles, i.e. writing $\mu$ as the sum
(33.4). We then get finally:
$$T^i_i = \sum_a m_ac^2\sqrt{1 - \frac{v_a^2}{c^2}}\;\delta(\mathbf{r} - \mathbf{r}_a). \tag{34.1}$$
We note that, according to this formula, we have for every system:
$$T^i_i \geq 0, \tag{34.2}$$
where the equality sign holds only for the electromagnetic field without charges.
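Let us consider a closed system of charged particles carrying out a finite motion, in which
all the quantities (coordinates, momenta) characterizing the system vary over finite ranges.†
We average the equation
$$\frac{1}{c}\frac{\partial T^{\alpha 0}}{\partial t} + \frac{\partial T^{\alpha\beta}}{\partial x^\beta} = 0$$
[see (32.11)] with respect to the time. The average of the derivative $\partial T^{\alpha 0}/\partial t$, like the average
of the derivative of any bounded quantity, is zero.‡ Therefore we get
$$\frac{\partial}{\partial x^\beta}\,\overline{T^\beta_\alpha} = 0.$$
We multiply this equation by $x^\alpha$ and integrate over all space. We transform the integral by
Gauss' theorem, keeping in mind that at infinity $T^\beta_\alpha = 0$, and so the surface integral vanishes:
$$\int x^\alpha\,\frac{\partial\overline{T^\beta_\alpha}}{\partial x^\beta}\,dV = -\int\frac{\partial x^\alpha}{\partial x^\beta}\,\overline{T^\beta_\alpha}\,dV = -\int\delta^\alpha_\beta\,\overline{T^\beta_\alpha}\,dV = 0,$$
or finally,
$$\int\overline{T^\alpha_\alpha}\,dV = 0. \tag{34.3}$$
On the basis of this equality we can write for the integral of $\overline{T^i_i} = \overline{T^\alpha_\alpha} + \overline{T^0_0}$:
$$\int\overline{T^i_i}\,dV = \int\overline{T^0_0}\,dV = \mathscr{E},$$
where $\mathscr{E}$ is the total energy of the system.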
† Here we also assume that the electromagnetic field of the system vanishes at infinity. This means
that, if there is a radiation of electromagnetic waves by the system, it is assumed that special "reflecting
walls" prevent these waves from going off to infinity.
‡ Let $f(t)$ be such a quantity. Then the average value of the derivative $df/dt$ over a certain time interval $T$ is
$$\overline{\frac{df}{dt}} = \frac{1}{T}\int_0^T\frac{df}{dt}\,dt = \frac{f(T) - f(0)}{T}.$$
Since $f(t)$ varies only within finite limits, then as $T$ increases without limit, the average value of $df/dt$ clearly
goes to zero.
Finally, substituting (34.1) we get:
$$\mathscr{E} = \sum_a m_ac^2\,\overline{\sqrt{1 - \frac{v_a^2}{c^2}}}. \tag{34.4}$$
This relation is the relativistic generalization of the virial theorem of classical mechanics.†
For low velocities, it becomes
$$\mathscr{E} - \sum_a m_ac^2 = -\sum_a\overline{\frac{m_av_a^2}{2}},$$
that is, the total energy (minus the rest energy) is equal to the negative of the average value
of the kinetic energy, in agreement with the result given by the classical virial theorem for a
system of charged particles (interacting according to the Coulomb law).
§ 35. The energy-momentum tensor for macroscopic bodies
In addition to the energy-momentum tensor for a system of point particles (33.5), we shall
also need the expression for this tensor for macroscopic bodies which are treated as being
continuous.
The flux of momentum through the element $d\mathbf{f}$ of the surface of the body is just the force
acting on this surface element. Therefore $\sigma_{\alpha\beta}\,df_\beta$ is the $\alpha$-component of the force acting on
the element. Now we introduce a reference system in which a given element of volume of the
body is at rest. In such a reference system, Pascal's law is valid, that is, the pressure $p$ applied
to a given portion of the body is transmitted equally in all directions and is everywhere
perpendicular to the surface on which it acts.‡ Therefore we can write $\sigma_{\alpha\beta}\,df_\beta = p\,df_\alpha$, so
that the stress tensor is $\sigma_{\alpha\beta} = p\,\delta_{\alpha\beta}$. As for the components $T^{\alpha 0}$, which represent the momentum
density, they are equal to zero for the given volume element in the reference system we
are using. The component $T^{00}$ is as always the energy density of the body, which we denote
by $\varepsilon$; $\varepsilon/c^2$ is then the mass density of the body, i.e. the mass per unit volume. We emphasize
that we are talking here about the unit "proper" volume, that is, the volume in the reference
system in which the given portion of the body is at rest.
Thus, in the reference system under consideration, the energy-momentum tensor (for the
given portion of the body) has the form:
$$T^{ik} = \begin{pmatrix}\varepsilon & 0 & 0 & 0\\ 0 & p & 0 & 0\\ 0 & 0 & p & 0\\ 0 & 0 & 0 & p\end{pmatrix}. \tag{35.1}$$
† See Mechanics, § 10.
‡ Strictly speaking, Pascal's law is valid only for liquids and gases. However, for solid bodies the maximum
possible difference in the stress in different directions is negligible in comparison with the stresses which can
play a role in the theory of relativity, so that its consideration is of no interest.
Now it is easy to find the expression for the energy-momentum tensor in an arbitrary
reference system. To do this we introduce the four-velocity $u^i$ for the macroscopic motion
of an element of volume of the body. In the rest frame of the particular element, $u^i = (1, 0)$.
The expression for $T^{ik}$ must be chosen so that in this reference system it takes on the form
(35.1). It is easy to verify that this is
$$T^{ik} = (p + \varepsilon)u^iu^k - pg^{ik}, \tag{35.2}$$
or, for the mixed components,
$$T_i^k = (p + \varepsilon)u_iu^k - p\,\delta_i^k.$$
This expression gives the energy-momentum tensor for a macroscopic body. The
expressions for the energy density $W$, energy flow vector $\mathbf{S}$ and stress tensor $\sigma_{\alpha\beta}$ are:
$$W = \frac{\varepsilon + p\,\dfrac{v^2}{c^2}}{1 - \dfrac{v^2}{c^2}}, \qquad \mathbf{S} = \frac{(p + \varepsilon)\mathbf{v}}{1 - \dfrac{v^2}{c^2}}, \qquad \sigma_{\alpha\beta} = \frac{(p + \varepsilon)v_\alpha v_\beta}{c^2\left(1 - \dfrac{v^2}{c^2}\right)} + p\,\delta_{\alpha\beta}. \tag{35.3}$$
If the velocity $v$ of the macroscopic motion is small compared with the velocity of light, then
we have approximately:
$$\mathbf{S} = (p + \varepsilon)\mathbf{v}.$$
Since $\mathbf{S}/c^2$ is the momentum density, we see that in this case the sum $(p + \varepsilon)/c^2$ plays the
role of the mass density of the body.
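The expressions (35.3) can be compared numerically with the components of (35.2). The sketch below is an illustration only: the values of $\varepsilon$, $p$ and $\mathbf{v}$ are arbitrary, units with $c = 1$ are assumed, and the function name `fluid_tensor` is hypothetical.

```python
import math

def fluid_tensor(eps, p, v, c=1.0):
    """T^{ik} = (p + eps) u^i u^k - p g^{ik}, eq. (35.2), for velocity v = (vx, vy, vz)."""
    v2 = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    u = [gamma] + [gamma * x / c for x in v]        # four-velocity u^i
    g = [1.0, -1.0, -1.0, -1.0]                     # diagonal of g^{ik}
    return [[(p + eps) * u[i] * u[k] - (p * g[i] if i == k else 0.0)
             for k in range(4)] for i in range(4)]

eps, p, c = 2.0, 0.5, 1.0
v = (0.3, -0.4, 0.1)
T = fluid_tensor(eps, p, v, c)

v2 = sum(x * x for x in v)
denom = 1.0 - v2 / c**2
W = (eps + p * v2 / c**2) / denom                   # energy density from (35.3)
S = [(p + eps) * x / denom for x in v]              # energy flux from (35.3)
sigma = [[(p + eps) * v[a] * v[b] / (c**2 * denom) + (p if a == b else 0.0)
          for b in range(3)] for a in range(3)]     # stress tensor from (35.3)
```

Here `T[0][0]` reproduces `W`, `c * T[0][a]` reproduces the components of `S`, and the space block of `T` reproduces `sigma`.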
The expression for $T^{ik}$ simplifies in the case where the velocities of all the particles making
up the body are small compared with the velocity of light (the velocity of the macroscopic
motion itself can be arbitrary). In this case we can neglect, in the energy density $\varepsilon$, all terms
small compared with the rest energy, that is, we can write $\mu_0c^2$ in place of $\varepsilon$, where $\mu_0$ is
the sum of the masses of the particles present in unit (proper) volume of the body (we
emphasize that in the general case, $\mu_0$ must differ from the actual mass density $\varepsilon/c^2$ of the
body, which includes also the mass corresponding to the energy of microscopic motion of
the particles in the body and the energy of their interactions). As for the pressure determined
by the energy of microscopic motion of the molecules, in the case under consideration it is
also clearly small compared with the rest energy $\mu_0c^2$. Thus we find
$$T^{ik} = \mu_0c^2u^iu^k. \tag{35.4}$$
From the expression (35.2), we get
$$T^i_i = \varepsilon - 3p. \tag{35.5}$$
The general property (34.2) of the energy-momentum tensor of an arbitrary system now
shows that the following inequality is always valid for the pressure and density of a
macroscopic body:
$$p < \frac{\varepsilon}{3}. \tag{35.6}$$
Let us compare the relation (35.5) with the general formula (34.1) which we saw was valid
for an arbitrary system. Since we are at present considering a macroscopic body, the expression
(34.1) must be averaged over all the values of $\mathbf{r}$ in unit volume. We obtain the result
$$\varepsilon - 3p = \sum_a m_ac^2\,\overline{\sqrt{1 - \frac{v_a^2}{c^2}}} \tag{35.7}$$
(the summation extends over all particles in unit volume).
We apply our formula to an ideal gas, which we assume to consist of identical particles;
since the particles of an ideal gas do not interact with one another, we can use formula
(33.5) after averaging it. Thus for an ideal gas,
$$T^{ik} = nmc\;\overline{\frac{dx^i}{dt}\frac{dx^k}{ds}},$$
where $n$ is the number of particles in unit volume and the bar means an average over all the
particles. If there is no macroscopic motion in the gas then we can use for $T^{ik}$ the expression
(35.1). Comparing the two formulas, we arrive at the equations:
$$\varepsilon = nm\;\overline{\left(\frac{c^2}{\sqrt{1 - \dfrac{v^2}{c^2}}}\right)}, \qquad p = \frac{nm}{3}\;\overline{\left(\frac{v^2}{\sqrt{1 - \dfrac{v^2}{c^2}}}\right)}. \tag{35.8}$$
These equations determine the density and pressure of a relativistic ideal gas in terms of the
velocity of its particles; the second of these replaces the well-known formula $p = nm\overline{v^2}/3$ of
the non-relativistic kinetic theory of gases.
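As an illustration, the formulas (35.8) can be evaluated for a small sample of particle speeds and compared with (35.6), (35.7) and the non-relativistic limit. This is a sketch under stated assumptions: the speeds are arbitrary, units with $n = m = c = 1$ are used, and the function name `ideal_gas_eps_p` is hypothetical.

```python
import math

def ideal_gas_eps_p(n, m, speeds, c=1.0):
    """Energy density and pressure from (35.8), averaged over a sample of particle speeds."""
    N = len(speeds)
    eps = n * m * sum(c**2 / math.sqrt(1 - v**2 / c**2) for v in speeds) / N
    p = (n * m / 3.0) * sum(v**2 / math.sqrt(1 - v**2 / c**2) for v in speeds) / N
    return eps, p

n, m, c = 1.0, 1.0, 1.0
speeds = [0.1, 0.5, 0.9, 0.99]
eps, p = ideal_gas_eps_p(n, m, speeds, c)

# Right-hand side of (35.7) for the same sample of speeds.
rhs_35_7 = n * m * c**2 * sum(math.sqrt(1 - v**2 / c**2) for v in speeds) / len(speeds)

# Slow particles: the pressure should approach the classical value n m <v^2> / 3.
slow = [1e-3, 2e-3, 3e-3]
_, p_slow = ideal_gas_eps_p(n, m, slow, c)
p_classical = n * m * sum(v**2 for v in slow) / len(slow) / 3.0
```

For any such sample `eps - 3 * p` equals `rhs_35_7`, so that `p` stays below `eps / 3`, and for the slow sample `p_slow` differs from `p_classical` only by terms of relative order $v^2/c^2$.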
CHAPTER 5
CONSTANT ELECTROMAGNETIC FIELDS
§ 36. Coulomb's law
For a constant electric, or as it is usually called, electrostatic field, the Maxwell equations
have the form:
$$\operatorname{div}\mathbf{E} = 4\pi\varrho, \tag{36.1}$$
$$\operatorname{curl}\mathbf{E} = 0. \tag{36.2}$$
The electric field $\mathbf{E}$ is expressed in terms of the scalar potential alone by the relation
$$\mathbf{E} = -\operatorname{grad}\phi. \tag{36.3}$$
Substituting (36.3) in (36.1), we get the equation which is satisfied by the potential of a
constant electric field:
$$\Delta\phi = -4\pi\varrho. \tag{36.4}$$
This equation is called the Poisson equation. In particular, in vacuum, i.e., for $\varrho = 0$, the
potential satisfies the Laplace equation
$$\Delta\phi = 0. \tag{36.5}$$
From the last equation it follows, in particular, that the potential of the electric field can
nowhere have a maximum or a minimum. For in order that $\phi$ have an extreme value, it
would be necessary that the first derivatives of $\phi$ with respect to the coordinates be zero,
and that the second derivatives $\partial^2\phi/\partial x^2$, $\partial^2\phi/\partial y^2$, $\partial^2\phi/\partial z^2$ all have the same sign. The last
is impossible, since in that case (36.5) could not be satisfied.
We now determine the field produced by a point charge. From symmetry considerations,
it is clear that it is directed along the radius-vector from the point at which the charge $e$ is
located. From the same consideration it is clear that the value $E$ of the field depends only on
the distance $R$ from the charge. To find this absolute value, we apply equation (36.1) in the
integral form (30.5). The flux of the electric field through a spherical surface of radius $R$
circumscribed around the charge $e$ is equal to $4\pi R^2E$; this flux must equal $4\pi e$. From this we
get
$$E = \frac{e}{R^2}.$$
In vector notation:
$$\mathbf{E} = \frac{e\mathbf{R}}{R^3}. \tag{36.6}$$
Thus the field produced by a point charge is inversely proportional to the square of the
distance from the charge. This is the Coulomb law. The potential of this field is, clearly,
$$\phi = \frac{e}{R}. \tag{36.7}$$
If we have a system of charges, then the field produced by this system is equal, according
to the principle of superposition, to the sum of the fields produced by each of the particles
individually. In particular, the potential of such a field is
$$\phi = \sum_a\frac{e_a}{R_a},$$
where $R_a$ is the distance from the charge $e_a$ to the point at which we are determining the
potential. If we introduce the charge density $\varrho$, this formula takes on the form
$$\phi = \int\frac{\varrho}{R}\,dV, \tag{36.8}$$
where $R$ is the distance from the volume element $dV$ to the given point of the field.
We note a mathematical relation which is obtained from (36.4) by substituting the values
of $\varrho$ and $\phi$ for a point charge, i.e. $\varrho = e\,\delta(\mathbf{R})$ and $\phi = e/R$. We then find
$$\Delta\left(\frac{1}{R}\right) = -4\pi\,\delta(\mathbf{R}). \tag{36.9}$$
§ 37. Electrostatic energy of charges
We determine the energy of a system of charges. We start from the energy of the field, that
is, from the expression (31.5) for the energy density. Namely, the energy of the system of
charges must be equal to
$$U = \frac{1}{8\pi}\int E^2\,dV,$$
where $\mathbf{E}$ is the field produced by these charges, and the integral goes over all space.
Substituting $\mathbf{E} = -\operatorname{grad}\phi$, $U$ can be changed to the following form:
$$U = -\frac{1}{8\pi}\int\mathbf{E}\cdot\operatorname{grad}\phi\,dV = -\frac{1}{8\pi}\int\operatorname{div}(\mathbf{E}\phi)\,dV + \frac{1}{8\pi}\int\phi\,\operatorname{div}\mathbf{E}\,dV.$$
According to Gauss' theorem, the first integral is equal to the integral of $\mathbf{E}\phi$ over the surface
bounding the volume of integration, but since the integral is taken over all space and since
the field is zero at infinity, this integral vanishes. Substituting in the second integral,
$\operatorname{div}\mathbf{E} = 4\pi\varrho$, we find the following expression for the energy of a system of charges:
$$U = \frac{1}{2}\int\varrho\phi\,dV. \tag{37.1}$$
For a system of point charges, $e_a$, we can write in place of the integral a sum over the
charges
$$U = \frac{1}{2}\sum_a e_a\phi_a, \tag{37.2}$$
where $\phi_a$ is the potential of the field produced by all the charges, at the point where the
charge $e_a$ is located.
If we apply our formula to a single elementary charged particle (say, an electron), and the
field which the charge itself produces, we arrive at the result that the charge must have a
certain "self-potential" energy equal to $e\phi/2$, where $\phi$ is the potential of the field produced
by the charge at the point where it is located. But we know that in the theory of relativity
every elementary particle must be considered as point-like. The potential $\phi = e/R$ of its field
becomes infinite at the point $R = 0$. Thus according to electrodynamics, the electron would
have to have an infinite "self-energy", and consequently also an infinite mass. The physical
absurdity of this result shows that the basic principles of electrodynamics itself lead to the
result that its application must be restricted to definite limits.
We note that in view of the infinity obtained from electrodynamics for the self-energy and
mass, it is impossible within the framework of classical electrodynamics itself to pose the
question whether the total mass of the electron is electrodynamic (that is, associated with the
electromagnetic self-energy of the particle).†
Since the occurrence of the physically meaningless infinite self-energy of the elementary
particle is related to the fact that such a particle must be considered as point-like, we can
conclude that electrodynamics as a logically closed physical theory presents internal
contradictions when we go to sufficiently small distances. We can pose the question as to the
order of magnitude of such distances. We can answer this question by noting that for the
electromagnetic self-energy of the electron we should obtain a value of the order of the rest
energy $mc^2$. If, on the other hand, we consider an electron as possessing a certain radius $R_0$,
then its self-potential energy would be of order $e^2/R_0$. From the requirement that these two
quantities be of the same order, $e^2/R_0 \sim mc^2$, we find
$$R_0 \sim \frac{e^2}{mc^2}. \tag{37.3}$$
This dimension (the "radius" of the electron) determines the limit of applicability of
electrodynamics to the electron, and follows already from its fundamental principles. We
must, however, keep in mind that actually the limits of applicability of the classical
electrodynamics which is presented here lie much higher, because of the occurrence of quantum
phenomena.‡
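To give an idea of the orders of magnitude involved, the estimate (37.3) can be evaluated in Gaussian units. The numerical constants below are rounded values supplied for illustration (they are not taken from the text), and the quantum scale is computed with the reduced Planck constant, which is an assumption about the constant meant in the footnote.

```python
# Rounded constants in Gaussian (CGS) units; illustrative values, not from the text.
e = 4.80320e-10      # electron charge, esu
m = 9.10938e-28      # electron mass, g
c = 2.99792e10       # velocity of light, cm/s
hbar = 1.05457e-27   # reduced Planck constant, erg*s (assumed to be the constant intended)

R0 = e**2 / (m * c**2)       # "radius" of the electron, eq. (37.3), in cm
compton = hbar / (m * c)     # distance at which quantum effects become important, in cm
ratio = compton / R0         # equals hbar*c/e^2
```

With these values `R0` is about $2.8 \times 10^{-13}$ cm, while `compton` is about $3.9 \times 10^{-11}$ cm, larger by a factor `ratio` of roughly 137; this is the sense in which the actual limit of applicability lies much higher.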
† From the purely formal point of view, the finiteness of the electron mass can be handled by introducing
an infinite negative mass of non-electromagnetic origin which compensates the infinity of the electromagnetic
mass (mass "renormalization"). However, we shall see later (§ 75) that this does not eliminate all the internal
contradictions of classical electrodynamics.
‡ Quantum effects become important for distances of the order of $\hbar/mc$, where $\hbar$ is Planck's constant.
We now turn again to formula (37.2). The potentials $\phi_a$ which appear there are equal, from
Coulomb's law, to
$$\phi_a = \sum_b\frac{e_b}{R_{ab}}, \tag{37.4}$$
where $R_{ab}$ is the distance between the charges $e_a$, $e_b$. The expression for the energy (37.2)
consists of two parts. First, it contains an infinite constant, the self-energy of the charges, not
depending on their mutual separations. The second part is the energy of interaction of the
charges, depending on their separations. Only this part has physical interest. It is equal to
$$U' = \frac{1}{2}\sum_a e_a\phi'_a, \tag{37.5}$$
where
$$\phi'_a = \sum_{b\neq a}\frac{e_b}{R_{ab}} \tag{37.6}$$
is the potential at the point of location of $e_a$, produced by all the charges other than $e_a$. In
other words, we can write
$$U' = \frac{1}{2}\sum_{a\neq b}\frac{e_ae_b}{R_{ab}}. \tag{37.7}$$
In particular, the energy of interaction of two charges is
$$U' = \frac{e_1e_2}{R_{12}}. \tag{37.8}$$
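The interaction energy (37.7) is easily evaluated for a given configuration of point charges. The following sketch uses Gaussian units; the charges, positions and function names are arbitrary illustrations. It computes $U'$ both as a sum over pairs and through the potentials (37.6).

```python
import math
from itertools import combinations

def interaction_energy(charges, positions):
    """U' = (1/2) * sum over a != b of e_a e_b / R_ab, eq. (37.7).

    Each unordered pair is visited once, which absorbs the factor 1/2.
    """
    total = 0.0
    for a, b in combinations(range(len(charges)), 2):
        total += charges[a] * charges[b] / math.dist(positions[a], positions[b])
    return total

def potential_excluding_self(a, charges, positions):
    """phi'_a of (37.6): potential at charge a due to all the other charges."""
    return sum(charges[b] / math.dist(positions[a], positions[b])
               for b in range(len(charges)) if b != a)

charges = [1.0, -2.0, 0.5]
positions = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

U = interaction_energy(charges, positions)
U_from_potentials = 0.5 * sum(e * potential_excluding_self(a, charges, positions)
                              for a, e in enumerate(charges))
U_two = interaction_energy([2.0, 3.0], [(0.0, 0.0, 0.0), (0.0, 0.0, 6.0)])
```

The two ways of computing the energy agree, as (37.5) and (37.7) require, and for two charges the result reduces to (37.8), here $2 \cdot 3/6 = 1$.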
§ 38. The field of a uniformly moving charge
We determine the field produced by a charge $e$, moving uniformly with velocity $V$. We
call the laboratory frame the system $K$; the system of reference moving with the charge is the
$K'$ system. Let the charge be located at the origin of coordinates of the $K'$ system. The
system $K'$ moves relative to $K$ along the $X$ axis; the axes $Y$ and $Z$ are parallel to $Y'$ and $Z'$.
At the time $t = 0$ the origins of the two systems coincide. The coordinates of the charge in
the $K$ system are consequently $x = Vt$, $y = z = 0$. In the $K'$ system, we have a constant
electric field with vector potential $\mathbf{A}' = 0$, and scalar potential equal to $\phi' = e/R'$, where
$R'^2 = x'^2 + y'^2 + z'^2$. In the $K$ system, according to (24.1) for $\mathbf{A}' = 0$,
$$\phi = \frac{\phi'}{\sqrt{1 - \dfrac{V^2}{c^2}}} = \frac{e}{R'\sqrt{1 - \dfrac{V^2}{c^2}}}. \tag{38.1}$$
We must now express $R'$ in terms of the coordinates $x$, $y$, $z$ in the $K$ system. According to
the formulas for the Lorentz transformation
$$x' = \frac{x - Vt}{\sqrt{1 - \dfrac{V^2}{c^2}}}, \qquad y' = y, \qquad z' = z,$$
from which
$$R'^2 = \frac{(x - Vt)^2 + \left(1 - \dfrac{V^2}{c^2}\right)(y^2 + z^2)}{1 - \dfrac{V^2}{c^2}}. \tag{38.2}$$
Substituting this in (38.1) we find
$$\phi = \frac{e}{R^*}, \tag{38.3}$$
where we have introduced the notation
$$R^{*2} = (x - Vt)^2 + \left(1 - \frac{V^2}{c^2}\right)(y^2 + z^2). \tag{38.4}$$
The vector potential in the $K$ system is equal to
$$\mathbf{A} = \phi\,\frac{\mathbf{V}}{c} = \frac{e\mathbf{V}}{cR^*}. \tag{38.5}$$
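In the $K'$ system the magnetic field $\mathbf{H}'$ is absent and the electric field is
$$\mathbf{E}' = \frac{e\mathbf{R}'}{R'^3}.$$
From formula (24.2), we find
$$E_x = E'_x = \frac{ex'}{R'^3}, \qquad E_y = \frac{E'_y}{\sqrt{1 - \dfrac{V^2}{c^2}}} = \frac{ey'}{R'^3\sqrt{1 - \dfrac{V^2}{c^2}}}, \qquad E_z = \frac{ez'}{R'^3\sqrt{1 - \dfrac{V^2}{c^2}}}.$$
Substituting for $R'$, $x'$, $y'$, $z'$ their expressions in terms of $x$, $y$, $z$, we obtain
$$\mathbf{E} = \left(1 - \frac{V^2}{c^2}\right)\frac{e\mathbf{R}}{R^{*3}}, \tag{38.6}$$
where $\mathbf{R}$ is the radius vector from the charge $e$ to the field point with coordinates $x$, $y$, $z$ (its
components are $x - Vt$, $y$, $z$).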
This expression for $\mathbf{E}$ can be written in another form by introducing the angle $\theta$ between
the direction of motion and the radius vector $\mathbf{R}$. It is clear that $y^2 + z^2 = R^2\sin^2\theta$, and therefore
$R^{*2}$ can be written in the form:
$$R^{*2} = R^2\left(1 - \frac{V^2}{c^2}\sin^2\theta\right). \tag{38.7}$$
Then we have for $\mathbf{E}$,
$$\mathbf{E} = \frac{e\mathbf{R}}{R^3}\,\frac{1 - \dfrac{V^2}{c^2}}{\left(1 - \dfrac{V^2}{c^2}\sin^2\theta\right)^{3/2}}. \tag{38.8}$$
For a fixed distance $R$ from the charge, the value of the field $E$ increases as $\theta$ increases
from $0$ to $\pi/2$ (or as $\theta$ decreases from $\pi$ to $\pi/2$). The field along the direction of motion
($\theta = 0, \pi$) has the smallest value; it is equal to
$$E_\parallel = \frac{e}{R^2}\left(1 - \frac{V^2}{c^2}\right).$$
The largest field is that perpendicular to the velocity ($\theta = \pi/2$), equal to
$$E_\perp = \frac{e}{R^2}\,\frac{1}{\sqrt{1 - \dfrac{V^2}{c^2}}}.$$
We note that as the velocity increases, the field $E_\parallel$ decreases, while $E_\perp$ increases. We can
describe this pictorially by saying that the electric field of a moving charge is "contracted"
in the direction of motion. For velocities $V$ close to the velocity of light, the denominator
in formula (38.8) is close to zero in a narrow interval of values $\theta$ around the value $\theta = \pi/2$.
The "width" of this interval is, in order of magnitude,
$$\Delta\theta \sim \sqrt{1 - \frac{V^2}{c^2}}.$$
Thus the electric field of a rapidly moving charge at a given distance from it is large only in a
narrow range of angles in the neighborhood of the equatorial plane, and the width of this
interval decreases with increasing $V$ like $\sqrt{1 - (V^2/c^2)}$.
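The angular dependence of (38.8) can be tabulated directly. The sketch below is an illustration with arbitrary values ($e = 1$, $R = 2$, $V/c = 0.9$); the function name `field_magnitude` is hypothetical.

```python
import math

def field_magnitude(e, R, theta, beta):
    """|E| from (38.8) for a charge moving with velocity V = beta * c."""
    return (e / R**2) * (1 - beta**2) / (1 - beta**2 * math.sin(theta)**2) ** 1.5

e, R, beta = 1.0, 2.0, 0.9
E_par = field_magnitude(e, R, 0.0, beta)              # along the motion, theta = 0
E_perp = field_magnitude(e, R, math.pi / 2, beta)     # perpendicular, theta = pi/2

# Field on a grid of angles between 0 and pi/2.
angles = [k * (math.pi / 2) / 50 for k in range(51)]
values = [field_magnitude(e, R, th, beta) for th in angles]
```

The values grow monotonically from `E_par` $= (e/R^2)(1 - V^2/c^2)$ at $\theta = 0$ to `E_perp` $= (e/R^2)/\sqrt{1 - V^2/c^2}$ at $\theta = \pi/2$, in agreement with the limiting expressions given above.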
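The magnetic field in the $K$ system is
$$\mathbf{H} = \frac{1}{c}\,\mathbf{V}\times\mathbf{E} \tag{38.9}$$
[see (24.5)]. In particular, for $V \ll c$ the electric field is given approximately by the usual
formula for the Coulomb law, $\mathbf{E} = e\mathbf{R}/R^3$, and the magnetic field is
$$\mathbf{H} = \frac{e}{c}\,\frac{\mathbf{V}\times\mathbf{R}}{R^3}. \tag{38.10}$$
PROBLEM
Determine the force (in the $K$ system) between two charges moving with the same velocity $\mathbf{V}$.
Solution: We shall determine the force $\mathbf{F}$ by computing the force acting on one of the charges ($e_1$)
in the field produced by the other ($e_2$). Using (38.9), we have
$$\mathbf{F} = e_1\mathbf{E}_2 + \frac{e_1}{c}\,\mathbf{V}\times\mathbf{H}_2 = e_1\left(1 - \frac{V^2}{c^2}\right)\mathbf{E}_2 + \frac{e_1}{c^2}\,\mathbf{V}(\mathbf{V}\cdot\mathbf{E}_2).$$
Substituting for $\mathbf{E}_2$ from (38.8), we get for the components of the force in the direction of motion
($F_x$) and perpendicular to it ($F_y$):
$$F_x = \frac{e_1e_2}{R^2}\,\frac{\left(1 - \dfrac{V^2}{c^2}\right)\cos\theta}{\left(1 - \dfrac{V^2}{c^2}\sin^2\theta\right)^{3/2}}, \qquad F_y = \frac{e_1e_2}{R^2}\,\frac{\left(1 - \dfrac{V^2}{c^2}\right)^2\sin\theta}{\left(1 - \dfrac{V^2}{c^2}\sin^2\theta\right)^{3/2}},$$
where $\mathbf{R}$ is the radius vector from $e_2$ to $e_1$, and $\theta$ is the angle between $\mathbf{R}$ and $\mathbf{V}$.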
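The result of the problem can be confirmed by evaluating the Lorentz force directly from (38.8) and (38.9). This is a sketch under stated assumptions: both charges move along the $x$ axis, $\mathbf{R}$ lies in the $xy$ plane, units with $c = 1$ are used, and the numerical values and function names are arbitrary.

```python
import math

def cross(a, b):
    """Vector product of two three-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def force_on_comoving_charge(e1, e2, R, theta, beta, c=1.0):
    """Force on e1 in the field of e2; both charges move along x with V = beta * c."""
    V = (beta * c, 0.0, 0.0)
    E_mag = (e2 / R**2) * (1 - beta**2) / (1 - beta**2 * math.sin(theta)**2) ** 1.5
    E = (E_mag * math.cos(theta), E_mag * math.sin(theta), 0.0)  # E is directed along R
    H = tuple(h / c for h in cross(V, E))                        # eq. (38.9)
    VxH = cross(V, H)
    return tuple(e1 * E[i] + (e1 / c) * VxH[i] for i in range(3))

e1, e2, R, theta, beta = 1.5, -0.7, 2.0, 0.6, 0.8
Fx, Fy, Fz = force_on_comoving_charge(e1, e2, R, theta, beta)

# Closed-form components from the solution of the problem.
denom = (1 - beta**2 * math.sin(theta)**2) ** 1.5
Fx_formula = e1 * e2 / R**2 * (1 - beta**2) * math.cos(theta) / denom
Fy_formula = e1 * e2 / R**2 * (1 - beta**2) ** 2 * math.sin(theta) / denom
```

The components `Fx` and `Fy` obtained from the fields coincide with `Fx_formula` and `Fy_formula`, and `Fz` vanishes.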
§ 39. Motion in the Coulomb field
We consider the motion of a particle with mass $m$ and charge $e$ in the field produced by a
second charge $e'$; we assume that the mass of this second charge is so large that it can be
considered as fixed. Then our problem becomes the study of the motion of a charge $e$ in a
centrally symmetric electric field with potential $\phi = e'/r$.
The total energy $\mathscr{E}$ of the particle is equal to
$$\mathscr{E} = c\sqrt{p^2 + m^2c^2} + \frac{\alpha}{r},$$
where $\alpha = ee'$. If we use polar coordinates in the plane of motion of the particle, then as we
know from mechanics,
$$p^2 = \frac{M^2}{r^2} + p_r^2,$$
where $p_r$ is the radial component of the momentum, and $M$ is the constant angular momentum
of the particle. Then
$$\mathscr{E} = c\sqrt{p_r^2 + \frac{M^2}{r^2} + m^2c^2} + \frac{\alpha}{r}. \tag{39.1}$$
We discuss the question whether the particle during its motion can approach arbitrarily
close to the center. First of all, it is clear that this is never possible if the charges e and e'
repel each other, that is, if e and e' have the same sign. Furthermore, in the case of attraction
(e and e' of opposite sign), arbitrarily close approach to the center is not possible if $Mc > |\alpha|$,
for in this case the first term in (39.1) is always larger than the second, and for $r \to 0$ the
right side of the equation would approach infinity. On the other hand, if $Mc < |\alpha|$, then
as $r \to 0$, this expression can remain finite (here it is understood that $p_r$ approaches infinity).
Thus, if
$$cM < |\alpha|, \tag{39.2}$$
the particle during its motion "falls in" toward the charge attracting it, in contrast to
nonrelativistic mechanics, where for the Coulomb field such a collapse is generally impossible
(with the exception of the one case M = 0, where the particle e moves on a line toward the
particle e').
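A complete determination of the motion of a charge in a Coulomb field starts most
conveniently from the Hamilton-Jacobi equation. We choose polar coordinates r, $\phi$ in
the plane of the motion. The Hamilton-Jacobi equation (16.11) has the form
$$-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}+\frac{\alpha}{r}\right)^2+\left(\frac{\partial S}{\partial r}\right)^2+\frac{1}{r^2}\left(\frac{\partial S}{\partial\phi}\right)^2+m^2c^2=0.$$
We seek an S of the form
$$S=-\mathcal{E}t+M\phi+f(r),$$
where $\mathcal{E}$ and M are the constant energy and angular momentum of the moving particle. The
result is
$$S=-\mathcal{E}t+M\phi+\int\sqrt{\frac{1}{c^2}\left(\mathcal{E}-\frac{\alpha}{r}\right)^2-\frac{M^2}{r^2}-m^2c^2}\;dr. \tag{39.3}$$
The trajectory is determined by the equation $\partial S/\partial M = \text{const}$. Integration of (39.3) leads to
the following results for the trajectory:
(a) If $Mc > |\alpha|$,
$$(c^2M^2-\alpha^2)\,\frac{1}{r} = c\sqrt{(M\mathcal{E})^2-m^2c^2(M^2c^2-\alpha^2)}\;\cos\left(\phi\sqrt{1-\frac{\alpha^2}{c^2M^2}}\right)-\mathcal{E}\alpha. \tag{39.4}$$
(b) If $Mc < |\alpha|$,
$$(\alpha^2-M^2c^2)\,\frac{1}{r} = \pm c\sqrt{(M\mathcal{E})^2+m^2c^2(\alpha^2-M^2c^2)}\;\cosh\left(\phi\sqrt{\frac{\alpha^2}{c^2M^2}-1}\right)+\mathcal{E}\alpha. \tag{39.5}$$
(c) If $Mc = |\alpha|$,
$$\frac{2\mathcal{E}\alpha}{r} = \mathcal{E}^2-m^2c^4-\phi^2\left(\frac{\mathcal{E}\alpha}{cM}\right)^2. \tag{39.6}$$
The integration constant is contained in the arbitrary choice of the reference line for
measurement of the angle $\phi$.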
In (39.4) the ambiguity of sign in front of the square root is unimportant, since it already
contains the arbitrary reference origin of the angle $\phi$ under the cos. In the case of attraction
($\alpha < 0$) the trajectory corresponding to this equation lies entirely at finite values of r (finite
motion), if $\mathcal{E} < mc^2$. If $\mathcal{E} > mc^2$, then r can go to infinity (infinite motion). The finite motion
corresponds to motion in a closed orbit (ellipse) in nonrelativistic mechanics. From (39.4)
it is clear that in relativistic mechanics the trajectory can never be closed; when the angle $\phi$
changes by $2\pi$, the distance r from the center does not return to its initial value. In place of
ellipses we here get orbits in the form of open "rosettes". Thus, whereas in nonrelativistic
mechanics the finite motion in a Coulomb field leads to a closed orbit, in relativistic
mechanics the Coulomb field loses this property.
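The failure of the orbit to close can be made quantitative. According to (39.4), r returns to its initial value only after the argument of the cosine has grown by $2\pi$, that is, after $\phi$ has grown by $2\pi/\sqrt{1-\alpha^2/c^2M^2}$. The following sketch evaluates this advance of the turning point per revolution; the function names and the sample numbers in the usage note are illustrative assumptions, not part of the text.

```python
import math

def precession_per_revolution(alpha, M, c=1.0):
    """Angle by which the turning point advances per radial period, from (39.4)."""
    k2 = 1.0 - (alpha / (c * M)) ** 2
    if k2 <= 0.0:
        raise ValueError("requires c*M > |alpha|, i.e. case (a)")
    return 2.0 * math.pi / math.sqrt(k2) - 2.0 * math.pi

def inverse_radius(phi, alpha, M, energy, m, c=1.0):
    """1/r as a function of phi according to (39.4)."""
    A = (c * M) ** 2 - alpha ** 2
    root = c * math.sqrt((M * energy) ** 2 - m ** 2 * c ** 2 * A)
    return (root * math.cos(phi * math.sqrt(A) / (c * M)) - energy * alpha) / A
```

For instance, with the assumed values alpha = -0.3, M = 1, energy = 0.98, m = c = 1 the motion is finite, `inverse_radius` repeats after $\phi$ grows by $2\pi/\sqrt{0.91}$ but not after $2\pi$, and for $|\alpha| \ll cM$ the advance tends to $\pi\alpha^2/c^2M^2$.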
In (39.5) we must choose the positive sign for the root in case $\alpha < 0$, and the negative sign
if $\alpha > 0$ [the opposite choice of sign would correspond to a reversal of the sign of the root in
(39.1)].
For $\alpha < 0$ the trajectories (39.5) and (39.6) are spirals in which the distance r approaches
0 as $\phi \to \infty$. The time required for the "falling in" of the charge to the coordinate origin is
finite. This can be verified by noting that the dependence of the coordinate r on the time is
determined by the equation $\partial S/\partial\mathcal{E} = \text{const}$; substituting (39.3), we see that the time is
determined by an integral which converges for $r \to 0$.
PROBLEMS
1. Determine the angle of deflection of a charge passing through a repulsive Coulomb field
(a > 0).
Solution: The angle of deflection $\chi$ equals $\chi = \pi - 2\phi_0$, where $2\phi_0$ is the angle between the two
asymptotes of the trajectory (39.4). We find
$$\chi = \pi - \frac{2cM}{\sqrt{c^2M^2-\alpha^2}}\,\tan^{-1}\left(\frac{v\sqrt{c^2M^2-\alpha^2}}{c\alpha}\right),$$
where v is the velocity of the charge at infinity.
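As a check on this formula one may compare it with the nonrelativistic result $\tan(\chi/2) = \alpha/mv^2\varrho$ in the limit $c \to \infty$. The sketch below assumes that the angular momentum is written as $M = p\varrho$, with p the momentum at infinity and $\varrho$ the impact parameter; the function names are ours.

```python
import math

def deflection_angle(alpha, m, v, rho, c=1.0):
    """Deflection angle chi for repulsion (alpha > 0), valid for c*M > alpha."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    M = gamma * m * v * rho          # angular momentum = momentum * impact parameter
    s = math.sqrt((c * M) ** 2 - alpha ** 2)
    return math.pi - (2.0 * c * M / s) * math.atan(v * s / (c * alpha))

def rutherford_angle(alpha, m, v, rho):
    """Nonrelativistic deflection: tan(chi/2) = alpha / (m v^2 rho)."""
    return 2.0 * math.atan(alpha / (m * v ** 2 * rho))
```

With c taken very large compared with v the two functions agree, and for fixed v the deflection decreases as the impact parameter grows.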
2. Determine the effective scattering cross section at small angles for the scattering of particles
in a Coulomb field.
Solution: The effective cross section $d\sigma$ is the ratio of the number of particles scattered per second
into a given element do of solid angle to the flux density of impinging particles (i.e., to the number
of particles crossing one square centimeter, per second, of a surface perpendicular to the beam of
particles).
Since the angle of deflection $\chi$ of the particle during its passage through the field is determined
by the impact parameter $\varrho$ (i.e. the distance from the center to the line along which the particle
would move in the absence of the field),
$$d\sigma = 2\pi\varrho\,d\varrho = 2\pi\varrho\,\frac{d\varrho}{d\chi}\,d\chi = \varrho\,\frac{d\varrho}{d\chi}\,\frac{do}{\sin\chi},$$
where $do = 2\pi\sin\chi\,d\chi$.† The angle of deflection (for small angles) can be taken equal to the ratio
of the change in momentum to its initial value. The change in momentum is equal to the time integral
of the force acting on the charge, in the direction perpendicular to the direction of motion; it is
approximately $(\alpha/r^2)(\varrho/r)$. Thus we have
$$\chi = \frac{1}{p}\int_{-\infty}^{+\infty}\frac{\alpha\varrho\,dt}{(\varrho^2+v^2t^2)^{3/2}} = \frac{2\alpha}{p\varrho v}$$
(v is the velocity of the particles). From this we find the effective cross section for small $\chi$:
$$d\sigma = 4\left(\frac{\alpha}{pv}\right)^2\frac{do}{\chi^4}.$$
In the nonrelativistic case, $p \approx mv$, and the expression coincides with the one obtained from the
Rutherford formula‡ for small $\chi$.
† See Mechanics, § 18.
‡ See Mechanics, § 19.
§ 40. The dipole moment
We consider the field produced by a system of charges at large distances, that is, at
distances large compared with the dimensions of the system.
We introduce a coordinate system with origin anywhere within the system of charges.
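Let the radius vectors of the various charges be $\mathbf{r}_a$. The potential of the field produced by
all the charges at the point having the radius vector $\mathbf{R}_0$ is
$$\phi = \sum_a \frac{e_a}{|\mathbf{R}_0-\mathbf{r}_a|} \tag{40.1}$$
(the summation goes over all charges); here $\mathbf{R}_0-\mathbf{r}_a$ are the radius vectors from the charges
$e_a$ to the point where we are finding the potential.
We must investigate this expression for large $R_0$ ($R_0 \gg r_a$). To do this, we expand it in
powers of $r_a/R_0$, using the formula
$$f(\mathbf{R}_0-\mathbf{r}) \approx f(\mathbf{R}_0)-\mathbf{r}\cdot\operatorname{grad}f(\mathbf{R}_0)$$
(in the grad, the differentiation applies to the coordinates of the vector $\mathbf{R}_0$). To terms of first
order,
$$\phi = \frac{\sum e_a}{R_0} - \sum e_a\mathbf{r}_a\cdot\operatorname{grad}\frac{1}{R_0}. \tag{40.2}$$
The sum
$$\mathbf{d} = \sum e_a\mathbf{r}_a \tag{40.3}$$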
is called the dipole moment of the system of charges. It is important to note that if the sum
of all the charges, $\sum e_a$, is zero, then the dipole moment does not depend on the choice of
the origin of coordinates, for the radius vectors $\mathbf{r}_a$ and $\mathbf{r}'_a$ of one and the same charge in two
different coordinate systems are related by
$$\mathbf{r}'_a = \mathbf{r}_a + \mathbf{a},$$
where a is some constant vector. Therefore if $\sum e_a = 0$, the dipole moment is the same
in both systems:
$$\mathbf{d}' = \sum e_a\mathbf{r}'_a = \sum e_a\mathbf{r}_a + \mathbf{a}\sum e_a = \mathbf{d}.$$
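If we denote by $e_a^+$, $\mathbf{r}_a^+$ and $e_a^-$, $\mathbf{r}_a^-$ the positive and negative charges of the system and
their radius vectors, then we can write the dipole moment in the form
$$\mathbf{d} = \sum e_a^+\mathbf{r}_a^+ - \sum e_a^-\mathbf{r}_a^- = \mathbf{R}^+\sum e_a^+ - \mathbf{R}^-\sum e_a^-, \tag{40.4}$$
where
$$\mathbf{R}^+ = \frac{\sum e_a^+\mathbf{r}_a^+}{\sum e_a^+},\qquad \mathbf{R}^- = \frac{\sum e_a^-\mathbf{r}_a^-}{\sum e_a^-} \tag{40.5}$$
are the radius vectors of the "charge centers" for the positive and negative charges. If
$\sum e_a^+ = \sum e_a^- = e$, then
$$\mathbf{d} = e\mathbf{R}_{+-}, \tag{40.6}$$
where $\mathbf{R}_{+-} = \mathbf{R}^+ - \mathbf{R}^-$ is the radius vector from the center of negative to the center of
positive charge. In particular, if we have altogether two charges, then $\mathbf{R}_{+-}$ is the radius
vector between them.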
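The independence of the dipole moment from the choice of origin for a neutral system is easily illustrated numerically. The following minimal sketch uses an assumed pair of charges and helper names of our own (`dipole_moment`, `shifted`).

```python
def dipole_moment(charges, positions):
    """d = sum_a e_a r_a for point charges; positions are (x, y, z) tuples."""
    return tuple(sum(e * r[i] for e, r in zip(charges, positions)) for i in range(3))

def shifted(positions, a):
    """Radius vectors referred to a displaced origin, r' = r + a."""
    return [tuple(r[i] + a[i] for i in range(3)) for r in positions]

charges = [1.0, -1.0]                                  # neutral pair (assumed example)
positions = [(0.0, 0.0, 0.5), (0.0, 0.0, -0.5)]
a = (3.0, -2.0, 7.0)
d = dipole_moment(charges, positions)
d_shifted = dipole_moment(charges, shifted(positions, a))

# A system with nonzero total charge: d changes by a * (sum of charges).
d_charged = dipole_moment([1.0, 1.0], positions)
d_charged_shifted = dipole_moment([1.0, 1.0], shifted(positions, a))
```

For the neutral pair `d` and `d_shifted` coincide, whereas for the charged pair the two moments differ by a times the total charge.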
If the total charge of the system is zero, then the potential of the field of this system at large
distances is
$$\phi = -\mathbf{d}\cdot\nabla\frac{1}{R_0} = \frac{\mathbf{d}\cdot\mathbf{R}_0}{R_0^3}. \tag{40.7}$$
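The field intensity is:
$$\mathbf{E} = -\operatorname{grad}\frac{\mathbf{d}\cdot\mathbf{R}_0}{R_0^3} = -\frac{1}{R_0^3}\operatorname{grad}(\mathbf{d}\cdot\mathbf{R}_0) - (\mathbf{d}\cdot\mathbf{R}_0)\operatorname{grad}\frac{1}{R_0^3},$$
or finally,
$$\mathbf{E} = \frac{3(\mathbf{n}\cdot\mathbf{d})\mathbf{n}-\mathbf{d}}{R_0^3}, \tag{40.8}$$
where n is a unit vector along $\mathbf{R}_0$. Another useful expression for the field is
$$\mathbf{E} = (\mathbf{d}\cdot\nabla)\nabla\frac{1}{R_0}. \tag{40.9}$$
Thus the potential of the field at large distances produced by a system of charges with
total charge equal to zero is inversely proportional to the square of the distance, and the
field intensity is inversely proportional to the cube of the distance. This field has axial
symmetry around the direction of d. In a plane passing through this direction (which we
choose as the z axis), the components of the vector E are:
$$E_z = d\,\frac{3\cos^2\theta-1}{R_0^3},\qquad E_x = d\,\frac{3\sin\theta\cos\theta}{R_0^3}. \tag{40.10}$$
The radial and tangential components in this plane (the latter taken in the direction of increasing $\theta$) are
$$E_R = d\,\frac{2\cos\theta}{R_0^3},\qquad E_\theta = d\,\frac{\sin\theta}{R_0^3}. \tag{40.11}$$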
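To see how well (40.8) represents the field of an actual neutral system, we can compare it with the exact Coulomb field of two opposite charges. The sketch below is an illustration only: the test system (charges ±1 separated by 0.01 along z), the field point and the function names are assumptions made for the example.

```python
import math

def dipole_field(d, R):
    """E = (3 (n.d) n - d) / R^3, eq. (40.8); d and R are 3-component sequences."""
    Rn = math.sqrt(sum(x * x for x in R))
    n = [x / Rn for x in R]
    nd = sum(ni * di for ni, di in zip(n, d))
    return [(3.0 * nd * ni - di) / Rn ** 3 for ni, di in zip(n, d)]

def coulomb_field(charges, positions, R):
    """Exact field of point charges at the point R (Gaussian units)."""
    E = [0.0, 0.0, 0.0]
    for e, r in zip(charges, positions):
        s = [R[i] - r[i] for i in range(3)]
        s3 = math.sqrt(sum(x * x for x in s)) ** 3
        for i in range(3):
            E[i] += e * s[i] / s3
    return E

charges = [1.0, -1.0]
positions = [(0.0, 0.0, 0.005), (0.0, 0.0, -0.005)]
d = (0.0, 0.0, 0.01)                      # d = sum e_a r_a for this pair
R = (3.0, 0.0, 4.0)
E_exact = coulomb_field(charges, positions, R)
E_dipole = dipole_field(d, R)
```

At this distance the two results agree to a small fraction of a percent, the discrepancy being of relative order (separation/$R_0$)$^2$.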
§ 41. Multipole moments
In the expansion of the potential in powers of $1/R_0$,
$$\phi = \phi^{(0)}+\phi^{(1)}+\phi^{(2)}+\dots, \tag{41.1}$$
the term $\phi^{(n)}$ is proportional to $1/R_0^{n+1}$. We saw that the first term, $\phi^{(0)}$, is determined by
the sum of all the charges; the second term, $\phi^{(1)}$, sometimes called the dipole potential of
the system, is determined by the dipole moment of the system.
The third term in the expansion is
$$\phi^{(2)} = \frac{1}{2}\sum e\,x_\alpha x_\beta\,\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\left(\frac{1}{R_0}\right), \tag{41.2}$$
where the sum goes over all charges; we here drop the index numbering the charges; $x_\alpha$ are
the components of the vector r, and $X_\alpha$ those of the vector $\mathbf{R}_0$. This part of the potential is
usually called the quadrupole potential. If the sum of the charges and the dipole moment of
the system are both equal to zero, the expansion begins with $\phi^{(2)}$.
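In the expression (41.2) there enter the six quantities $\sum e\,x_\alpha x_\beta$. However, it is easy to see
that the field depends not on six independent quantities, but only on five. This follows from
the fact that the function $1/R_0$ satisfies the Laplace equation, that is,
$$\Delta\left(\frac{1}{R_0}\right) \equiv \delta_{\alpha\beta}\,\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\left(\frac{1}{R_0}\right) = 0.$$
We can therefore write $\phi^{(2)}$ in the form
$$\phi^{(2)} = \frac{1}{2}\sum e\left(x_\alpha x_\beta-\frac{1}{3}r^2\delta_{\alpha\beta}\right)\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\left(\frac{1}{R_0}\right).$$
The tensor
$$D_{\alpha\beta} = \sum e\,(3x_\alpha x_\beta - r^2\delta_{\alpha\beta}) \tag{41.3}$$
is called the quadrupole moment of the system. From the definition of $D_{\alpha\beta}$ it is clear that
the sum of its diagonal elements is zero:
$$D_{\alpha\alpha} = 0. \tag{41.4}$$
Therefore the symmetric tensor $D_{\alpha\beta}$ has altogether five independent components. With the
aid of $D_{\alpha\beta}$, we can write
$$\phi^{(2)} = \frac{D_{\alpha\beta}}{6}\,\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\left(\frac{1}{R_0}\right), \tag{41.5}$$
or, performing the differentiation,
$$\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\,\frac{1}{R_0} = \frac{3X_\alpha X_\beta}{R_0^5} - \frac{\delta_{\alpha\beta}}{R_0^3},$$
and using the fact that $\delta_{\alpha\beta}D_{\alpha\beta} = D_{\alpha\alpha} = 0$,
$$\phi^{(2)} = \frac{D_{\alpha\beta}\,n_\alpha n_\beta}{2R_0^3}. \tag{41.6}$$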
Like every symmetric three-dimensional tensor, the tensor $D_{\alpha\beta}$ can be brought to principal
axes. Because of (41.4), in general only two of the three principal values will be independent.
If it happens that the system of charges is symmetric around some axis (the z axis)† then
this axis must be one of the principal axes of the tensor $D_{\alpha\beta}$, the location of the other two
axes in the x, y plane is arbitrary, and the three principal values are related to one another:
$$D_{xx} = D_{yy} = -\tfrac{1}{2}D_{zz}. \tag{41.7}$$
Denoting the component $D_{zz}$ by D (in this case it is simply called the quadrupole moment),
we get for the potential
$$\phi^{(2)} = \frac{D}{4R_0^3}\,(3\cos^2\theta-1) = \frac{D}{2R_0^3}\,P_2(\cos\theta), \tag{41.8}$$
where $\theta$ is the angle between $\mathbf{R}_0$ and the z axis, and $P_2$ is a Legendre polynomial.
Just as we did for the dipole moment in the preceding section, we can easily show that
the quadrupole moment of a system does not depend on the choice of the coordinate origin,
if both the total charge and the dipole moment of the system are equal to zero.
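The definitions (41.3) and (41.6) can be tried out on a simple system whose total charge and dipole moment both vanish. The sketch below takes an assumed "linear quadrupole" (charges +1 at z = ±0.1 and −2 at the origin) and compares the quadrupole potential with the exact potential; all names and numbers are illustrative.

```python
import math

def quadrupole_tensor(charges, positions):
    """D_ab = sum e (3 x_a x_b - r^2 delta_ab), eq. (41.3)."""
    D = [[0.0] * 3 for _ in range(3)]
    for e, r in zip(charges, positions):
        r2 = sum(x * x for x in r)
        for a in range(3):
            for b in range(3):
                D[a][b] += e * (3.0 * r[a] * r[b] - (r2 if a == b else 0.0))
    return D

def quadrupole_potential(D, R):
    """phi2 = D_ab n_a n_b / (2 R^3), eq. (41.6)."""
    Rn = math.sqrt(sum(x * x for x in R))
    n = [x / Rn for x in R]
    return sum(D[a][b] * n[a] * n[b] for a in range(3) for b in range(3)) / (2.0 * Rn ** 3)

charges = [1.0, 1.0, -2.0]
positions = [(0.0, 0.0, 0.1), (0.0, 0.0, -0.1), (0.0, 0.0, 0.0)]
D = quadrupole_tensor(charges, positions)
R = (6.0, 0.0, 8.0)
exact = sum(e / math.dist(R, r) for e, r in zip(charges, positions))
approx = quadrupole_potential(D, R)
```

Here the tensor is traceless with $D_{xx} = D_{yy} = -\tfrac12 D_{zz}$, in agreement with (41.4) and (41.7), and the quadrupole term reproduces the exact potential up to corrections of relative order $(r/R_0)^2$.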
In similar fashion we could also write the succeeding terms of the expansion (41.1). The
l'th term of the expansion defines a tensor (which is called the tensor of the $2^l$-pole moment)
of rank l, symmetric in all its indices and vanishing when contracted on any pair of indices;
it can be shown that such a tensor has 2l+1 independent components.‡
We shall express the general term in the expansion of the potential in another form, by
using the well-known formula of the theory of spherical harmonics
$$\frac{1}{|\mathbf{R}_0-\mathbf{r}|} = \frac{1}{\sqrt{R_0^2+r^2-2rR_0\cos\chi}} = \sum_{l=0}^{\infty}\frac{r^l}{R_0^{l+1}}\,P_l(\cos\chi), \tag{41.9}$$
† We are assuming a symmetry axis of any order higher than the second.
‡ Such a tensor is said to be irreducible. The vanishing on contraction means that no tensor of lower
rank can be formed from the components.
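where $\chi$ is the angle between $\mathbf{R}_0$ and r. We introduce the spherical angles $\Theta$, $\Phi$ and $\theta$, $\phi$,
formed by the vectors $\mathbf{R}_0$ and r, respectively, with the fixed coordinate axes, and use the
addition theorem for the spherical harmonics:
$$P_l(\cos\chi) = \sum_{m=-l}^{l}\frac{(l-|m|)!}{(l+|m|)!}\,P_l^{|m|}(\cos\Theta)\,P_l^{|m|}(\cos\theta)\,e^{-im(\Phi-\phi)}, \tag{41.10}$$
where the $P_l^m$ are the associated Legendre polynomials.
We also introduce the spherical functions†
$$Y_{lm}(\theta,\phi) = (-1)^m\,i^l\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;P_l^m(\cos\theta)\,e^{im\phi},\qquad m \ge 0,$$
$$Y_{l,-|m|}(\theta,\phi) = (-1)^{l-m}\,Y^*_{l|m|}(\theta,\phi).$$
Then the expansion (41.9) takes the form:
$$\frac{1}{|\mathbf{R}_0-\mathbf{r}|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{r^l}{R_0^{l+1}}\,\frac{4\pi}{2l+1}\,Y^*_{lm}(\Theta,\Phi)\,Y_{lm}(\theta,\phi). \tag{41.11}$$
Carrying out this expansion in each term of (40.1), we finally get the following expression
for the l'th term of the expansion of the potential:
$$\phi^{(l)} = \frac{1}{R_0^{l+1}}\sum_{m=-l}^{l}\sqrt{\frac{4\pi}{2l+1}}\;Q_m^{(l)}\,Y^*_{lm}(\Theta,\Phi), \tag{41.12}$$
where
$$Q_m^{(l)} = \sum_a e_a r_a^l\sqrt{\frac{4\pi}{2l+1}}\;Y_{lm}(\theta_a,\phi_a). \tag{41.13}$$
The set of 2l+1 quantities $Q_m^{(l)}$ form the $2^l$-pole moment of the system of charges.
The quantities $Q_m^{(1)}$ defined in this way are related to the components of the dipole
moment vector d by the formulas
$$Q_0^{(1)} = id_z,\qquad Q_{\pm1}^{(1)} = \mp\frac{i}{\sqrt{2}}\,(d_x \pm id_y). \tag{41.14}$$
The quantities $Q_m^{(2)}$ are related to the tensor components $D_{\alpha\beta}$ by the relations
$$Q_0^{(2)} = -\frac{1}{2}D_{zz},\qquad Q_{\pm1}^{(2)} = \pm\frac{1}{\sqrt{6}}\,(D_{xz} \pm iD_{yz}),\qquad Q_{\pm2}^{(2)} = -\frac{1}{2\sqrt{6}}\,(D_{xx}-D_{yy} \pm 2iD_{xy}). \tag{41.15}$$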
PROBLEM
Determine the quadrupole moment of a uniformly charged ellipsoid with respect to its center.
Solution: Replacing the summation in (41.3) by an integration over the volume of the ellipsoid,
we have:
$$D_{xx} = \varrho\iiint(2x^2-y^2-z^2)\,dx\,dy\,dz,\quad\text{etc.}$$
Let us choose the coordinate axes along the axes of the ellipsoid with the origin at its center; from
symmetry considerations it is obvious that these axes are the principal axes of the tensor $D_{\alpha\beta}$. By
means of the transformation
$$x = x'a,\qquad y = y'b,\qquad z = z'c$$
the integration over the volume of the ellipsoid
$$\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2} = 1$$
is reduced to integration over the volume of the unit sphere
$$x'^2+y'^2+z'^2 = 1.$$
As a result we obtain:
$$D_{xx} = \frac{e}{5}\,(2a^2-b^2-c^2),\qquad D_{yy} = \frac{e}{5}\,(2b^2-a^2-c^2),\qquad D_{zz} = \frac{e}{5}\,(2c^2-a^2-b^2),$$
where $e = (4\pi/3)\,abc\,\varrho$ is the total charge of the ellipsoid.
† In accordance with the definition used in quantum mechanics.
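The result for the ellipsoid may be confirmed by a direct numerical integration. The sketch below is a crude midpoint-rule estimate on a cubic grid in the rescaled (unit-sphere) coordinates; the grid size n and the function names are assumptions made for the illustration, and the accuracy is only of the order of a percent.

```python
import math

def ellipsoid_quadrupole_numeric(a, b, c, rho, n=40):
    """Midpoint-rule estimate of (D_xx, D_yy, D_zz) for a uniformly charged ellipsoid."""
    h = 2.0 / n                               # cell size in unit-sphere coordinates
    sx2 = sy2 = sz2 = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            for k in range(n):
                z = -1.0 + (k + 0.5) * h
                if x * x + y * y + z * z <= 1.0:
                    sx2 += x * x
                    sy2 += y * y
                    sz2 += z * z
    w = rho * a * b * c * h ** 3              # charge per cell after rescaling
    X2, Y2, Z2 = w * a * a * sx2, w * b * b * sy2, w * c * c * sz2
    return 2 * X2 - Y2 - Z2, 2 * Y2 - X2 - Z2, 2 * Z2 - X2 - Y2

def ellipsoid_quadrupole_exact(a, b, c, rho):
    """Closed-form principal values from the solution of the problem."""
    e = 4.0 * math.pi / 3.0 * a * b * c * rho
    return (e / 5 * (2 * a * a - b * b - c * c),
            e / 5 * (2 * b * b - a * a - c * c),
            e / 5 * (2 * c * c - a * a - b * b))
```

For a sphere (a = b = c) both functions give zero quadrupole moment, as they must.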
§ 42. System of charges in an external field
We now consider a system of charges located in an external electric field. We designate
the potential of this external field by $\phi(\mathbf{r})$. The potential energy of each of the charges is
$e_a\phi(\mathbf{r}_a)$, and the total potential energy of the system is
$$U = \sum_a e_a\phi(\mathbf{r}_a). \tag{42.1}$$
We introduce another coordinate system with its origin anywhere within the system of
charges; $\mathbf{r}_a$ is the radius vector of the charge $e_a$ in these coordinates.
Let us assume that the external field changes slowly over the region of the system of
charges, i.e. is quasi-uniform with respect to the system. Then we can expand the energy U
in powers of $\mathbf{r}_a$. In this expansion,
$$U = U^{(0)}+U^{(1)}+U^{(2)}+\dots, \tag{42.2}$$
the first term is
$$U^{(0)} = \phi_0\sum_a e_a, \tag{42.3}$$
where $\phi_0$ is the value of the potential at the origin. In this approximation, the energy of the
system is the same as it would be if all the charges were located at one point (the origin).
The second term in the expansion is
$$U^{(1)} = (\operatorname{grad}\phi)_0\cdot\sum_a e_a\mathbf{r}_a.$$
Introducing the field intensity $\mathbf{E}_0$ at the origin and the dipole moment d of the system, we
have
$$U^{(1)} = -\mathbf{d}\cdot\mathbf{E}_0. \tag{42.4}$$
The total force acting on the system in the external quasi-uniform field is, to the order we
are considering,
$$\mathbf{F} = \mathbf{E}_0\sum_a e_a + [\nabla(\mathbf{d}\cdot\mathbf{E})]_0.$$
If the total charge is zero, the first term vanishes, and
$$\mathbf{F} = (\mathbf{d}\cdot\nabla)\mathbf{E}, \tag{42.5}$$
i.e. the force is determined by the derivatives of the field intensity (taken at the origin). The
total moment of the forces acting on the system is
$$\mathbf{K} = \sum_a(\mathbf{r}_a\times e_a\mathbf{E}_0) = \mathbf{d}\times\mathbf{E}_0, \tag{42.6}$$
i.e. it is determined by the field intensity itself.
Let us assume that there are two systems, each having total charge zero, and with dipole
moments $\mathbf{d}_1$ and $\mathbf{d}_2$, respectively. Their mutual distance is assumed to be large in comparison
with their internal dimensions. Let us determine their potential energy of interaction, U. To
do this we regard one of the systems as being in the field of the other. Then
$$U = -\mathbf{d}_2\cdot\mathbf{E}_1,$$
where $\mathbf{E}_1$ is the field of the first system. Substituting (40.8) for $\mathbf{E}_1$, we find:
$$U = \frac{(\mathbf{d}_1\cdot\mathbf{d}_2)R^2 - 3(\mathbf{d}_1\cdot\mathbf{R})(\mathbf{d}_2\cdot\mathbf{R})}{R^5}, \tag{42.7}$$
where R is the vector separation between the two systems.
For the case where one of the systems has a total charge different from zero (and equal
to e), we obtain similarly
$$U = e\,\frac{\mathbf{d}\cdot\mathbf{R}}{R^3}, \tag{42.8}$$
where R is the vector directed from the dipole to the charge.
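Formula (42.7) is easily evaluated for particular orientations. The following short sketch (the function name is ours) reproduces two familiar special cases: parallel dipoles placed side by side, for which $U = d_1d_2/R^3$, and dipoles placed head to tail along their common direction, for which $U = -2d_1d_2/R^3$.

```python
def dipole_dipole_energy(d1, d2, R):
    """U = [(d1.d2) R^2 - 3 (d1.R)(d2.R)] / R^5, eq. (42.7)."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    R2 = dot(R, R)
    return (dot(d1, d2) * R2 - 3.0 * dot(d1, R) * dot(d2, R)) / R2 ** 2.5

side_by_side = dipole_dipole_energy((0, 0, 1.0), (0, 0, 1.0), (2.0, 0, 0))
head_to_tail = dipole_dipole_energy((0, 0, 1.0), (0, 0, 1.0), (0, 0, 2.0))
```

The head-to-tail arrangement has the lower energy, so that two freely rotating dipoles tend to line up along the line joining them.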
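The next term in the expansion (42.2) is
$$U^{(2)} = \frac{1}{2}\sum e\,x_\alpha x_\beta\,\frac{\partial^2\phi_0}{\partial x_\alpha\,\partial x_\beta}.$$
Here, as in § 41, we omit the index numbering the charge; the value of the second
derivative of the potential is taken at the origin; but the potential $\phi$ satisfies Laplace's
equation,
$$\frac{\partial^2\phi}{\partial x_\alpha^2} = \delta_{\alpha\beta}\,\frac{\partial^2\phi}{\partial x_\alpha\,\partial x_\beta} = 0.$$
Therefore we can write
$$U^{(2)} = \frac{1}{2}\,\frac{\partial^2\phi_0}{\partial x_\alpha\,\partial x_\beta}\sum e\left(x_\alpha x_\beta - \frac{1}{3}\delta_{\alpha\beta}r^2\right),$$
or, finally,
$$U^{(2)} = \frac{D_{\alpha\beta}}{6}\,\frac{\partial^2\phi_0}{\partial x_\alpha\,\partial x_\beta}. \tag{42.9}$$
The general term in the series (42.2) can be expressed in terms of the $2^l$-pole moments
$Q_m^{(l)}$ defined in the preceding section. To do this, we first expand the potential $\phi(\mathbf{r})$ in
spherical harmonics; the general form of this expansion is
$$\phi(\mathbf{r}) = \sum_{l=0}^{\infty} r^l\sum_{m=-l}^{l}\sqrt{\frac{4\pi}{2l+1}}\;a_{lm}\,Y_{lm}(\theta,\phi), \tag{42.10}$$
where r, $\theta$, $\phi$ are the spherical coordinates of a point and the $a_{lm}$ are constants. Forming the
sum (42.1) and using the definition (41.13), we obtain:
$$U^{(l)} = \sum_{m=-l}^{l} a_{lm}\,Q_m^{(l)}. \tag{42.11}$$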
§ 43. Constant magnetic field
Let us consider the magnetic field produced by charges which perform a finite motion, in
which the particles are always within a finite region of space and the momenta also always
remain finite. Such a motion has a "stationary" character, and it is of interest to consider
the time average magnetic field $\overline{\mathbf{H}}$, produced by the charges; this field will now be a function
only of the coordinates and not of the time, that is, it will be constant.
In order to find equations for the average magnetic field $\overline{\mathbf{H}}$, we take the time average of the
Maxwell equations
$$\operatorname{div}\mathbf{H} = 0,\qquad \operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}+\frac{4\pi}{c}\,\mathbf{j}.$$
The first of these gives simply
$$\operatorname{div}\overline{\mathbf{H}} = 0. \tag{43.1}$$
In the second equation the average value of the derivative $\partial\mathbf{E}/\partial t$, like the derivative of any
quantity which varies over a finite range, is zero (cf. the footnote on p. 84). Therefore the
second Maxwell equation becomes
$$\operatorname{curl}\overline{\mathbf{H}} = \frac{4\pi}{c}\,\overline{\mathbf{j}}. \tag{43.2}$$
These two equations determine the constant field $\overline{\mathbf{H}}$.
We introduce the average vector potential $\overline{\mathbf{A}}$ in accordance with
$$\operatorname{curl}\overline{\mathbf{A}} = \overline{\mathbf{H}}.$$
We substitute this in equation (43.2). We find
$$\operatorname{grad}\operatorname{div}\overline{\mathbf{A}} - \Delta\overline{\mathbf{A}} = \frac{4\pi}{c}\,\overline{\mathbf{j}}.$$
But we know that the vector potential of a field is not uniquely defined, and we can impose
an arbitrary auxiliary condition on it. On this basis, we choose the potential $\overline{\mathbf{A}}$ so that
$$\operatorname{div}\overline{\mathbf{A}} = 0. \tag{43.3}$$
Then the equation defining the vector potential of the constant magnetic field becomes
$$\Delta\overline{\mathbf{A}} = -\frac{4\pi}{c}\,\overline{\mathbf{j}}. \tag{43.4}$$
It is easy to find the solution of this equation by noting that (43.4) is completely analogous
to the Poisson equation (36.4) for the scalar potential of a constant electric field, where in
place of the charge density $\varrho$ we here have the current density $\overline{\mathbf{j}}/c$. By analogy with the solution
(36.8) of the Poisson equation, we can write
$$\overline{\mathbf{A}} = \frac{1}{c}\int\frac{\overline{\mathbf{j}}}{R}\,dV, \tag{43.5}$$
where R is the distance from the field point to the volume element dV.
In formula (43.5) we can go over from the integral to a sum over the charges, by
substituting in place of j the product $\varrho\mathbf{v}$, and recalling that all the charges are pointlike. In this
we must keep in mind that in the integral (43.5), R is simply an integration variable, and is
therefore not subject to the averaging process. If we write in place of the integral
$$\int\frac{\mathbf{j}}{R}\,dV\quad\text{the sum}\quad\sum\frac{e_a\mathbf{v}_a}{R_a},$$
then $R_a$ here are the radius vectors of the various particles, which change during the motion
of the charges. Therefore we must write
$$\overline{\mathbf{A}} = \frac{1}{c}\sum\overline{\frac{e_a\mathbf{v}_a}{R_a}}, \tag{43.6}$$
where we average the whole expression under the summation sign.
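Knowing $\overline{\mathbf{A}}$, we can also find the magnetic field,
$$\overline{\mathbf{H}} = \operatorname{curl}\overline{\mathbf{A}} = \operatorname{curl}\frac{1}{c}\int\frac{\overline{\mathbf{j}}}{R}\,dV.$$
The curl operator refers to the coordinates of the field point. Therefore the curl can be
brought under the integral sign and $\overline{\mathbf{j}}$ can be treated as constant in the differentiation.
Applying the well-known formula
$$\operatorname{curl}f\mathbf{a} = f\operatorname{curl}\mathbf{a} + \operatorname{grad}f\times\mathbf{a},$$
where f and a are an arbitrary scalar and vector, to the product $\overline{\mathbf{j}}\cdot 1/R$, we get
$$\operatorname{curl}\frac{\overline{\mathbf{j}}}{R} = \operatorname{grad}\frac{1}{R}\times\overline{\mathbf{j}} = \frac{\overline{\mathbf{j}}\times\mathbf{R}}{R^3},$$
and consequently,
$$\overline{\mathbf{H}} = \frac{1}{c}\int\frac{\overline{\mathbf{j}}\times\mathbf{R}}{R^3}\,dV \tag{43.7}$$
(the radius vector R is directed from dV to the field point). This is the law of Biot and
Savart.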
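As an illustration of (43.7), consider a thin circular wire of radius a carrying a current I. For a thin wire we assume the usual replacement $\mathbf{j}\,dV \to I\,d\mathbf{l}$ (this substitution and the function names are assumptions of the sketch, not statements of the text), so that the integral becomes a sum over line elements. On the axis of the loop the sum should reproduce the elementary result $H_z = 2\pi Ia^2/c(a^2+z^2)^{3/2}$.

```python
import math

def loop_field_on_axis(I, a, z, c=1.0, n=2000):
    """H_z on the axis of a circular loop, summing (I/c) dl x R / R^3."""
    Hz = 0.0
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        # source point and line element on the loop (in the plane z = 0)
        sx, sy = a * math.cos(phi), a * math.sin(phi)
        dlx, dly = -a * math.sin(phi) * dphi, a * math.cos(phi) * dphi
        # R points from the current element to the field point (0, 0, z)
        Rx, Ry, Rz = -sx, -sy, z
        R3 = (Rx * Rx + Ry * Ry + Rz * Rz) ** 1.5
        Hz += (I / c) * (dlx * Ry - dly * Rx) / R3   # z component of dl x R
    return Hz

def loop_field_exact(I, a, z, c=1.0):
    """Closed-form on-axis field of the loop in Gaussian units."""
    return 2.0 * math.pi * I * a * a / (c * (a * a + z * z) ** 1.5)
```

At the center of the loop (z = 0) both expressions give $2\pi I/ca$.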
§ 44. Magnetic moments
Let us consider the average magnetic field produced by a system of charges in stationary
motion, at large distances from the system.
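We introduce a coordinate system with its origin anywhere within the system of charges,
just as we did in § 40. Again we denote the radius vectors of the various charges by $\mathbf{r}_a$, and
the radius vector of the point at which we calculate the field by $\mathbf{R}_0$. Then $\mathbf{R}_0-\mathbf{r}_a$ is the radius
vector from the charge $e_a$ to the field point. According to (43.6), we have for the vector
potential:
$$\overline{\mathbf{A}} = \frac{1}{c}\sum\overline{\frac{e_a\mathbf{v}_a}{|\mathbf{R}_0-\mathbf{r}_a|}}. \tag{44.1}$$
As in § 40, we expand this expression in powers of $\mathbf{r}_a$. To terms of first order (we omit the
index a), we have
$$\overline{\mathbf{A}} = \frac{1}{cR_0}\sum e\overline{\mathbf{v}} - \frac{1}{c}\sum\overline{e\mathbf{v}\left(\mathbf{r}\cdot\nabla\frac{1}{R_0}\right)}.$$
In the first term we can write
$$\sum e\overline{\mathbf{v}} = \overline{\frac{d}{dt}\sum e\mathbf{r}}.$$
But the average value of the derivative of a quantity changing within a finite interval (like
$\sum e\mathbf{r}$) is zero. Thus there remains for $\overline{\mathbf{A}}$ the expression
$$\overline{\mathbf{A}} = -\frac{1}{c}\sum\overline{e\mathbf{v}\left(\mathbf{r}\cdot\nabla\frac{1}{R_0}\right)} = \frac{1}{cR_0^3}\sum\overline{e\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0)}.$$
We transform this expression as follows. Noting that $\mathbf{v} = \dot{\mathbf{r}}$, we can write (remembering
that $\mathbf{R}_0$ is a constant vector)
$$\sum e(\mathbf{R}_0\cdot\mathbf{r})\mathbf{v} = \frac{1}{2}\frac{d}{dt}\sum e\mathbf{r}(\mathbf{r}\cdot\mathbf{R}_0) + \frac{1}{2}\sum e\,[\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0)-\mathbf{r}(\mathbf{v}\cdot\mathbf{R}_0)].$$
Upon substitution of this expression in $\overline{\mathbf{A}}$, the average of the first term (containing the time
derivative) again goes to zero, and we get
$$\overline{\mathbf{A}} = \frac{1}{2cR_0^3}\sum e\,\overline{[\mathbf{v}(\mathbf{r}\cdot\mathbf{R}_0)-\mathbf{r}(\mathbf{v}\cdot\mathbf{R}_0)]}.$$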
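We introduce the vector
$$\mathbf{m} = \frac{1}{2c}\sum e\,\mathbf{r}\times\mathbf{v}, \tag{44.2}$$
which is called the magnetic moment of the system. Then we get for $\overline{\mathbf{A}}$:
$$\overline{\mathbf{A}} = \frac{\overline{\mathbf{m}}\times\mathbf{R}_0}{R_0^3} = \nabla\frac{1}{R_0}\times\overline{\mathbf{m}}. \tag{44.3}$$
Knowing the vector potential, it is easy to find the magnetic field. With the aid of the
formula
$$\operatorname{curl}(\mathbf{a}\times\mathbf{b}) = (\mathbf{b}\cdot\nabla)\mathbf{a}-(\mathbf{a}\cdot\nabla)\mathbf{b}+\mathbf{a}\operatorname{div}\mathbf{b}-\mathbf{b}\operatorname{div}\mathbf{a},$$
we find
$$\overline{\mathbf{H}} = \operatorname{curl}\overline{\mathbf{A}} = \operatorname{curl}\left(\frac{\overline{\mathbf{m}}\times\mathbf{R}_0}{R_0^3}\right) = \overline{\mathbf{m}}\operatorname{div}\frac{\mathbf{R}_0}{R_0^3}-(\overline{\mathbf{m}}\cdot\nabla)\frac{\mathbf{R}_0}{R_0^3}.$$
Furthermore,
$$\operatorname{div}\frac{\mathbf{R}_0}{R_0^3} = \mathbf{R}_0\cdot\operatorname{grad}\frac{1}{R_0^3}+\frac{1}{R_0^3}\operatorname{div}\mathbf{R}_0 = 0$$
and
$$(\overline{\mathbf{m}}\cdot\nabla)\frac{\mathbf{R}_0}{R_0^3} = \frac{1}{R_0^3}(\overline{\mathbf{m}}\cdot\nabla)\mathbf{R}_0+\mathbf{R}_0\left(\overline{\mathbf{m}}\cdot\nabla\frac{1}{R_0^3}\right) = \frac{\overline{\mathbf{m}}}{R_0^3}-\frac{3\mathbf{R}_0(\overline{\mathbf{m}}\cdot\mathbf{R}_0)}{R_0^5}.$$
Thus,
$$\overline{\mathbf{H}} = \frac{3\mathbf{n}(\overline{\mathbf{m}}\cdot\mathbf{n})-\overline{\mathbf{m}}}{R_0^3}, \tag{44.4}$$
where n is again the unit vector along $\mathbf{R}_0$. We see that the magnetic field is expressed in terms
of the magnetic moment by the same formula by which the electric field was expressed in
terms of the dipole moment [see (40.8)].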
If all the charges of the system have the same ratio of charge to mass, then we can write
$$\mathbf{m} = \frac{1}{2c}\sum e\,\mathbf{r}\times\mathbf{v} = \frac{e}{2mc}\sum m\,\mathbf{r}\times\mathbf{v}.$$
If the velocities of all the charges $v \ll c$, then $m\mathbf{v}$ is the momentum p of the charge and we
get
$$\mathbf{m} = \frac{e}{2mc}\sum\mathbf{r}\times\mathbf{p} = \frac{e}{2mc}\,\mathbf{M}, \tag{44.5}$$
where $\mathbf{M} = \sum\mathbf{r}\times\mathbf{p}$ is the mechanical angular momentum of the system. Thus in this case,
the ratio of magnetic moment to the angular momentum is constant and equal to e/2mc.
PROBLEM
Find the ratio of the magnetic moment to the angular momentum for a system of two charges
(velocities $v \ll c$).
Solution: Choosing the origin of coordinates as the center of mass of the two particles we have
$m_1\mathbf{r}_1+m_2\mathbf{r}_2 = 0$ and $\mathbf{p}_1 = -\mathbf{p}_2 = \mathbf{p}$, where p is the momentum of the relative motion. With the
aid of these relations, we find
$$\mathbf{m} = \frac{1}{2c}\left(\frac{e_1}{m_1^2}+\frac{e_2}{m_2^2}\right)\frac{m_1m_2}{m_1+m_2}\,\mathbf{M}.$$
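Two limiting cases provide a simple check of this result: for equal charge-to-mass ratios it must reduce to (44.5), and for a very heavy second particle it must reduce to $e_1/2m_1c$. A minimal sketch, with a function name of our own choosing:

```python
def moment_to_angular_momentum_ratio(e1, m1, e2, m2, c=1.0):
    """Ratio of magnetic moment to angular momentum for two slow charges."""
    return (e1 / m1 ** 2 + e2 / m2 ** 2) * m1 * m2 / (2.0 * c * (m1 + m2))

equal_ratio = moment_to_angular_momentum_ratio(2.0, 1.0, 6.0, 3.0)    # e/m = 2 for both
heavy_partner = moment_to_angular_momentum_ratio(2.0, 1.0, 5.0, 1.0e12)
```

In the first case the ratio equals e/2mc = 1 (with c = 1); in the second the heavy particle hardly moves and contributes nothing.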
§ 45. Larmor's theorem
Let us consider a system of charges in an external constant uniform magnetic field.
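The time average of the force acting on the system,
$$\overline{\mathbf{F}} = \sum\frac{e}{c}\,\overline{\mathbf{v}\times\mathbf{H}} = \overline{\frac{d}{dt}\sum\frac{e}{c}\,\mathbf{r}\times\mathbf{H}},$$
is zero, as is the time average of the time derivative of any quantity which varies over a
finite range. The average value of the moment of the forces is
$$\overline{\mathbf{K}} = \sum\frac{e}{c}\,\overline{\mathbf{r}\times(\mathbf{v}\times\mathbf{H})}$$
and is different from zero. It can be expressed in terms of the magnetic moment of the system,
by expanding the vector triple product:
$$\overline{\mathbf{K}} = \sum\frac{e}{c}\,\overline{\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H})-\mathbf{H}(\mathbf{v}\cdot\mathbf{r})\}} = \sum\frac{e}{c}\,\overline{\left\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H})-\frac{1}{2}\mathbf{H}\frac{d}{dt}\mathbf{r}^2\right\}}.$$
The second term gives zero after averaging, so that
$$\overline{\mathbf{K}} = \sum\frac{e}{c}\,\overline{\mathbf{v}(\mathbf{r}\cdot\mathbf{H})} = \frac{1}{2c}\sum e\,\overline{\{\mathbf{v}(\mathbf{r}\cdot\mathbf{H})-\mathbf{r}(\mathbf{v}\cdot\mathbf{H})\}}$$
[the last transformation is analogous to the one used in deriving (44.3)], or finally
$$\overline{\mathbf{K}} = \overline{\mathbf{m}}\times\mathbf{H}. \tag{45.1}$$
We call attention to the analogy with formula (42.6) for the electrical case.
The Lagrangian for a system of charges in an external constant uniform magnetic field
contains (compared with the Lagrangian for a closed system) the additional term
$$L_H = \sum\frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} = \sum\frac{e}{2c}\,(\mathbf{H}\times\mathbf{r})\cdot\mathbf{v} = \sum\frac{e}{2c}\,(\mathbf{r}\times\mathbf{v})\cdot\mathbf{H} \tag{45.2}$$
[where we have used the expression (19.4) for the vector potential of a uniform field].
Introducing the magnetic moment of the system, we have:
$$L_H = \mathbf{m}\cdot\mathbf{H}. \tag{45.3}$$
We call attention to the analogy with the electric field; in a uniform electric field, the
Lagrangian of a system of charges with total charge zero contains the term
$$L_E = \mathbf{d}\cdot\mathbf{E},$$
which in that case is the negative of the potential energy of the charge system (see § 42).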
We now consider a system of charges performing a finite motion (with velocities $v \ll c$)
in the centrally symmetric electric field produced by a certain fixed charge. We transform
from the laboratory coordinate system to a system rotating uniformly around an axis
passing through the fixed particle. From the well-known formula, the velocity v of the particle
in the new coordinate system is related to its velocity v' in the old system by the relation
$$\mathbf{v}' = \mathbf{v}+\boldsymbol{\Omega}\times\mathbf{r},$$
where r is the radius vector of the particle and $\boldsymbol{\Omega}$ is the angular velocity of the rotating
coordinate system. In the fixed system the Lagrangian of the system of charges is
$$L = \sum\frac{mv'^2}{2}-U,$$
where U is the potential energy of the charges in the external field plus the energy of their
mutual interactions. The quantity U is a function of the distances of the charges from the
fixed particle and of their mutual separations; when transformed to the rotating system it
obviously remains unchanged. Therefore in the new system the Lagrangian is
$$L = \sum\frac{m}{2}\,(\mathbf{v}+\boldsymbol{\Omega}\times\mathbf{r})^2-U.$$
Let us assume that all the charges have the same charge-to-mass ratio e/m, and set
$$\boldsymbol{\Omega} = \frac{e}{2mc}\,\mathbf{H}. \tag{45.4}$$
Then for sufficiently small H (when we can neglect terms in $H^2$) the Lagrangian becomes:
$$L = \sum\frac{mv^2}{2}+\frac{1}{2c}\sum e\,(\mathbf{H}\times\mathbf{r})\cdot\mathbf{v}-U.$$
We see that it coincides with the Lagrangian which would have described the motion of the
charges in the laboratory system of coordinates in the presence of a constant magnetic field
(see (45.2)).
Thus we arrive at the result that, in the nonrelativistic case, the behavior of a system of
charges all having the same e/m, performing a finite motion in a centrally symmetric electric
field and in a weak uniform magnetic field H, is equivalent to the behavior of the same system
of charges in the same electric field in a coordinate system rotating uniformly with the angular
velocity (45.4). This assertion is the content of the Larmor theorem, and the angular velocity
$\Omega = eH/2mc$ is called the Larmor frequency.
We can approach this same problem from a different point of view. If the magnetic field
H is sufficiently weak, the Larmor frequency will be small compared to the frequencies of the
finite motion of the system of charges. Then we may consider the averages, over times small
compared to the period $2\pi/\Omega$, of quantities describing the system. These new quantities will
vary slowly in time (with frequency $\Omega$).
Let us consider the change in the average angular momentum $\overline{\mathbf{M}}$ of the system. According
to a well-known equation of mechanics, the derivative of $\overline{\mathbf{M}}$ is equal to the moment $\overline{\mathbf{K}}$ of
the forces acting on the system. We therefore have, using (45.1):
$$\frac{d\overline{\mathbf{M}}}{dt} = \overline{\mathbf{K}} = \overline{\mathbf{m}}\times\mathbf{H}.$$
If the e/m ratio is the same for all particles of the system, the angular momentum and
magnetic moment are proportional to one another, and we find by using formulas (44.5)
and (45.4):
$$\frac{d\overline{\mathbf{M}}}{dt} = -\boldsymbol{\Omega}\times\overline{\mathbf{M}}. \tag{45.5}$$
This equation states that the vector $\overline{\mathbf{M}}$ (and with it the magnetic moment $\overline{\mathbf{m}}$) rotates with
angular velocity $-\boldsymbol{\Omega}$ around the direction of the field, while its absolute magnitude and the
angle which it makes with this direction remain fixed. (This motion is called the Larmor
precession.)
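The content of equation (45.5) can be seen by integrating it numerically. The sketch below uses a classical fourth-order Runge-Kutta step (our choice of method; any standard integrator would serve) with $\boldsymbol{\Omega}$ along the z axis and an assumed initial vector. The length of M and its projection on the field direction should remain fixed, while the transverse part turns through the angle $\Omega t$ in the sense opposite to $\boldsymbol{\Omega}$.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def precess(M, Omega, t, steps=2000):
    """Integrate dM/dt = -Omega x M over the time t with classical RK4."""
    def f(m):
        return tuple(-x for x in cross(Omega, m))
    h = t / steps
    for _ in range(steps):
        k1 = f(M)
        k2 = f(tuple(M[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = f(tuple(M[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = f(tuple(M[i] + h * k3[i] for i in range(3)))
        M = tuple(M[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(3))
    return M

Omega = (0.0, 0.0, 2.0)
M_end = precess((1.0, 0.0, 1.0), Omega, math.pi / 4)   # Omega * t = pi / 2
```

Starting from M = (1, 0, 1), a quarter period brings the vector to (0, −1, 1): the z component is unchanged and the transverse component has turned clockwise as seen from the tip of $\boldsymbol{\Omega}$.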
CHAPTER 6
ELECTROMAGNETIC WAVES
§ 46. The wave equation
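The electromagnetic field in vacuum is determined by the Maxwell equations in which we
must set $\varrho = 0$, $\mathbf{j} = 0$. We write them once more:
$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t},\qquad \operatorname{div}\mathbf{H} = 0, \tag{46.1}$$
$$\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t},\qquad \operatorname{div}\mathbf{E} = 0. \tag{46.2}$$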
These equations possess nonzero solutions. This means that an electromagnetic field can
exist even in the absence of any charges.
Electromagnetic fields occurring in vacuum in the absence of charges are called
electromagnetic waves. We now take up the study of the properties of such waves.
First of all we note that such fields must necessarily be time-varying. In fact, in the
contrary case, $\partial\mathbf{H}/\partial t = \partial\mathbf{E}/\partial t = 0$ and the equations (46.1) and (46.2) go over into the
equations (36.1), (36.2) and (43.1), (43.2) of a constant field in which, however, we now
have $\varrho = 0$, $\mathbf{j} = 0$. But the solution of these equations which is given by formulas (36.8) and
(43.5) becomes zero for $\varrho = 0$, $\mathbf{j} = 0$.
We derive the equations determining the potentials of electromagnetic waves.
As we already know, because of the ambiguity in the potentials we can always subject
them to an auxiliary condition. For this reason, we choose the potentials of the
electromagnetic wave so that the scalar potential is zero:
$$\phi = 0. \tag{46.3}$$
Then
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t},\qquad \mathbf{H} = \operatorname{curl}\mathbf{A}. \tag{46.4}$$
Substituting these two expressions in the first of equations (46.2), we get
$$\operatorname{curl}\operatorname{curl}\mathbf{A} = -\Delta\mathbf{A}+\operatorname{grad}\operatorname{div}\mathbf{A} = -\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}. \tag{46.5}$$
Despite the fact that we have already imposed one auxiliary condition on the potentials,
the potential A is still not completely unique. Namely, we can add to it the gradient of an
arbitrary function which does not depend on the time (meantime leaving $\phi$ unchanged).
In particular, we can choose the potentials of the electromagnetic wave so that
$$\operatorname{div}\mathbf{A} = 0. \tag{46.6}$$
In fact, substituting for E from (46.4) in div E = 0, we have
$$\operatorname{div}\frac{\partial\mathbf{A}}{\partial t} = \frac{\partial}{\partial t}\operatorname{div}\mathbf{A} = 0,$$
that is, div A is a function only of the coordinates. This function can always be made zero
by adding to A the gradient of a suitable time-independent function.
The equation (46.5) now becomes
$$\Delta\mathbf{A}-\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0. \tag{46.7}$$
This is the equation which determines the potentials of electromagnetic waves. It is called
the d'Alembert equation, or the wave equation.†
Applying to (46.7) the operators curl and $\partial/\partial t$, we can verify that the electric and magnetic
fields E and H satisfy the same wave equation.
We repeat the derivation of the wave equation in four-dimensional form. We write the
second pair of Maxwell equations for the field in the absence of charges in the form
$$\frac{\partial F^{ik}}{\partial x^k} = 0.$$
(This is equation (30.2) with $j^i = 0$.) Substituting $F^{ik}$, expressed in terms of the potentials,
$$F^{ik} = \frac{\partial A^k}{\partial x_i}-\frac{\partial A^i}{\partial x_k},$$
we get
$$\frac{\partial^2A^k}{\partial x_i\,\partial x^k}-\frac{\partial^2A^i}{\partial x_k\,\partial x^k} = 0. \tag{46.8}$$
We impose on the potentials the auxiliary condition:
$$\frac{\partial A^k}{\partial x^k} = 0. \tag{46.9}$$
(This condition is called the Lorentz condition, and potentials that satisfy it are said to be in
the Lorentz gauge.) Then the first term in (46.8) drops out and there remains
$$\frac{\partial^2A^i}{\partial x_k\,\partial x^k} \equiv g^{kl}\,\frac{\partial^2A^i}{\partial x^k\,\partial x^l} = 0. \tag{46.10}$$
This is the wave equation expressed in four-dimensional form.‡
In three-dimensional form, the condition (46.9) is:
$$\frac{1}{c}\frac{\partial\phi}{\partial t}+\operatorname{div}\mathbf{A} = 0. \tag{46.11}$$
It is more general than the conditions $\phi = 0$ and div A = 0 that were used earlier; potentials
that satisfy those conditions also satisfy (46.11). But unlike them the Lorentz condition has
a relativistically invariant character: potentials satisfying it in one frame satisfy it in any
other frame (whereas condition (46.6) is generally violated if the frame is changed).
† The wave equation is sometimes written in the form $\Box\mathbf{A} = 0$, where
$$\Box = -\frac{\partial^2}{\partial x_l\,\partial x^l} = \Delta-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$
is called the d'Alembertian operator.
‡ It should be mentioned that the condition (46.9) still does not determine the choice of the potentials
uniquely. We can add to A a term grad f, and subtract a term $(1/c)(\partial f/\partial t)$ from $\phi$, where the function f is
not arbitrary but must satisfy the wave equation $\Box f = 0$.
§ 47. Plane waves
We consider the special case of electromagnetic waves in which the field depends only on
one coordinate, say x (and on the time). Such waves are said to be plane. In this case the
equation for the field becomes
dt 2 c dx 2
c 2 ^, = 0, (47.1)
where by /is understood any component of the vectors E or H.
To solve this equation, we rewrite it in the form
and introduce new variables
(d d\/d d\ r n
{dt C Tx){dt +C dx) f=0 >
x x
£ = t, ri = t+
c c
so that t = i(n + 0, x = c/2(r}£). Then
d__\(o__ d_\ d 1/d d
d£~2\dt~ C dx
= {  c ) i = \(l +c E)>
so that the equation for /becomes
d 2 f
= o.
d£dri
The solution obviously has the formf=f 1 (^)+f 2 (ri), where/! and/ 2 are arbitrary functions.
Thus
/=/ 'H) +/ *(' + ;;)' (47  2)
Suppose, for example, f 2 = 0, so that
Cj
Let us clarify the meaning of this solution. In each plane x = const, the field changes with
the time; at each given moment the field is different for different x. It is clear that the field
has the same values for coordinates x and times t which satisfy the relation $t - (x/c) = \text{const}$,
that is,
$$x = \text{const} + ct.$$
This means that if, at some time t = 0, the field at a certain point x in space had some
definite value, then after an interval of time t the field has that same value at a distance ct
along the X axis from the original place. We can say that all the values of the electromagnetic
field are propagated in space along the X axis with a velocity equal to the velocity of light, c.
Thus,
$$f_1\!\left(t - \frac{x}{c}\right)$$
represents a plane wave moving in the positive direction along the X axis. It is easy to show
that
$$f_2\!\left(t + \frac{x}{c}\right)$$
represents a wave moving in the opposite, negative, direction along the X axis.
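As a quick numerical illustration of (47.2), the following sketch checks that a right-moving profile $f_1(t - x/c)$ satisfies (47.1) and is carried a distance ct in a time t. The Gaussian profile, the function names and the value of c in cm/s are illustrative assumptions, not part of the text.

```python
import math

C = 2.99792458e10  # speed of light in cm/s (assumed Gaussian units)

def pulse(s):
    """Arbitrary profile f1(s); a Gaussian of width 1 ns is an illustrative choice."""
    return math.exp(-(s / 1e-9) ** 2)

def field(x, t):
    """Right-moving solution f = f1(t - x/c) of equation (47.1)."""
    return pulse(t - x / C)

def wave_residual(x, t, ht=1e-11):
    """Central-difference estimate of ht^2 * (d2f/dt2 - c^2 d2f/dx2).

    The spatial step is chosen as hx = c*ht, so the factor c^2 cancels
    and the two second differences can be compared directly.
    """
    hx = C * ht
    d2t = field(x, t + ht) - 2 * field(x, t) + field(x, t - ht)
    d2x = field(x + hx, t) - 2 * field(x, t) + field(x - hx, t)
    return d2t - d2x
```

The residual vanishes up to rounding, and `field(x0 + C * t, t)` reproduces `field(x0, 0.0)`, which is the statement that the profile moves rigidly with velocity c.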
In § 46 we showed that the potentials of the electromagnetic wave can be chosen so that
$\phi = 0$, and div A = 0. We choose the potentials of the plane wave which we are now considering
in this same way. The condition div A = 0 gives in this case
$$\frac{\partial A_x}{\partial x} = 0,$$
since all quantities are independent of y and z. According to (47.1) we then have also
$\partial^2 A_x/\partial t^2 = 0$, that is, $\partial A_x/\partial t = \text{const}$. But the derivative $\partial\mathbf{A}/\partial t$ determines the electric field,
and we see that the nonzero component $A_x$ represents in this case the presence of a constant
longitudinal electric field. Since such a field has no relation to the electromagnetic wave, we
can set $A_x = 0$.
Thus the vector potential of the plane wave can always be chosen perpendicular to the X
axis, i.e. to the direction of propagation of that wave.
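We consider a plane wave moving in the positive direction of the X axis; in this wave, all
quantities, in particular also A, are functions only of $t - (x/c)$. From the formulas
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A},$$
we therefore obtain
$$\mathbf{E} = -\frac{1}{c}\mathbf{A}', \qquad \mathbf{H} = \nabla\times\mathbf{A} = \nabla\!\left(t - \frac{x}{c}\right)\times\mathbf{A}' = -\frac{1}{c}\,\mathbf{n}\times\mathbf{A}', \tag{47.3}$$
where the prime denotes differentiation with respect to $t - (x/c)$ and n is a unit vector along
the direction of propagation of the wave. Substituting the first equation in the second, we
obtain
$$\mathbf{H} = \mathbf{n}\times\mathbf{E}. \tag{47.4}$$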
We see that the electric and magnetic fields E and H of a plane wave are directed perpendicular
to the direction of propagation of the wave. For this reason, electromagnetic waves
are said to be transverse. From (47.4) it is clear also that the electric and magnetic fields of
the plane wave are perpendicular to each other and equal to each other in absolute value.
The energy flux in the plane wave, i.e. its Poynting vector, is
$$\mathbf{S} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} = \frac{c}{4\pi}\,\mathbf{E}\times(\mathbf{n}\times\mathbf{E}),$$
and since $\mathbf{E}\cdot\mathbf{n} = 0$,
$$\mathbf{S} = \frac{c}{4\pi}E^2\,\mathbf{n} = \frac{c}{4\pi}H^2\,\mathbf{n}.$$
Thus the energy flux is directed along the direction of propagation of the wave. Since
$$W = \frac{1}{8\pi}(E^2 + H^2) = \frac{E^2}{4\pi}$$
is the energy density of the wave, we can write
$$\mathbf{S} = cW\mathbf{n}, \tag{47.5}$$
in accordance with the fact that the field propagates with the velocity of light.
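The relations (47.4) and (47.5) can be checked numerically for any transverse field. The sketch below is only an illustration: the helper names `cross`, `dot` and `plane_wave`, and the sample field values, are assumptions introduced here.

```python
import math

C = 2.99792458e10  # speed of light, cm/s (assumed Gaussian units)

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(p * q for p, q in zip(a, b))

def plane_wave(E, n):
    """Return (H, S, W) for a plane wave with transverse field E and unit vector n."""
    H = cross(n, E)                                          # H = n x E, (47.4)
    S = tuple(C / (4 * math.pi) * s for s in cross(E, H))    # Poynting vector
    W = (dot(E, E) + dot(H, H)) / (8 * math.pi)              # energy density
    return H, S, W
```

For E = (0, 2, 1) and n along x this gives H = (0, -1, 2), equal in magnitude to E and perpendicular to it, and a Poynting vector equal to cW directed along n.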
The momentum per unit volume of the electromagnetic field is $\mathbf{S}/c^2$. For a plane wave this
gives $(W/c)\mathbf{n}$. We call attention to the fact that the relation between energy W and momentum
W/c for the electromagnetic wave is the same as for a particle moving with the velocity
of light [see (9.9)].
The flux of momentum of the field is determined by the components $\sigma_{\alpha\beta}$ of the Maxwell
stress tensor. Choosing the direction of propagation of the wave as the X axis, we find that
the only nonzero component of $\sigma_{\alpha\beta}$ is
$$\sigma_{xx} = W. \tag{47.6}$$
As it must be, the flux of momentum is along the direction of propagation of the wave, and
is equal in magnitude to the energy density.
Let us find the law of transformation of the energy density of a plane electromagnetic
wave when we change from one inertial reference system to another. To do this we start
from the formula
$$W = \frac{1}{1 - \dfrac{V^2}{c^2}}\left(W' + 2\frac{V}{c^2}S'_x + \frac{V^2}{c^2}\sigma'_{xx}\right)$$
(see problem 1 in § 6) and substitute
$$S'_x = cW'\cos\alpha', \qquad \sigma'_{xx} = W'\cos^2\alpha',$$
where $\alpha'$ is the angle (in the K' system) between the X' axis (along which the velocity V
is directed) and the direction of propagation of the wave. We find:
$$W = W'\,\frac{\left(1 + \dfrac{V}{c}\cos\alpha'\right)^2}{1 - \dfrac{V^2}{c^2}}. \tag{47.7}$$
Since $W = E^2/4\pi = H^2/4\pi$, the absolute values of the field intensities in the wave transform
like $\sqrt{W}$.
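A small sketch of (47.7) may help with orders of magnitude. The function names are assumptions made for this illustration; `beta` stands for V/c and `alpha_prime` for the angle measured in K'.

```python
import math

def energy_density_in_K(W_prime, beta, alpha_prime):
    """Energy density W in K from W' in K', following (47.7); beta = V/c."""
    return W_prime * (1 + beta * math.cos(alpha_prime)) ** 2 / (1 - beta ** 2)

def field_amplitude_factor(beta, alpha_prime):
    """Ratio |E|/|E'| of field amplitudes, which transform like sqrt(W)."""
    return math.sqrt(energy_density_in_K(1.0, beta, alpha_prime))
```

For V/c = 0.6 a wave travelling along V has its energy density multiplied by 4 and its field amplitude by 2, while a wave travelling against V has its energy density reduced to one quarter.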
PROBLEMS
1. Determine the force exerted on a wall from which an incident plane electromagnetic wave is
reflected (with reflection coefficient R).
Solution: The force f acting on unit area of the wall is given by the flux of momentum through
this area, i.e., it is the vector with components
$$f_\alpha = \sigma_{\alpha\beta}N_\beta + \sigma'_{\alpha\beta}N_\beta,$$
where N is the vector normal to the surface of the wall, and $\sigma_{\alpha\beta}$ and $\sigma'_{\alpha\beta}$ are the components of the
energy-momentum tensors for the incident and reflected waves. Using (47.6), we obtain:
$$\mathbf{f} = W\mathbf{n}(\mathbf{N}\cdot\mathbf{n}) + W'\mathbf{n}'(\mathbf{N}\cdot\mathbf{n}').$$
From the definition of the reflection coefficient, we have: $W' = RW$. Also introducing the angle
of incidence $\theta$ (which is equal to the reflection angle) and writing out components, we find the
normal force ("light pressure")
$$f_N = W(1 + R)\cos^2\theta$$
and the tangential force
$$f_t = W(1 - R)\sin\theta\cos\theta.$$
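The two results of Problem 1 are easy to tabulate. The following sketch simply evaluates them; the function name `light_pressure` and the sample values are hypothetical choices for illustration.

```python
import math

def light_pressure(W, R, theta):
    """Normal and tangential force per unit area on a wall.

    W is the energy density of the incident wave, R the reflection
    coefficient and theta the angle of incidence in radians.
    """
    f_normal = W * (1 + R) * math.cos(theta) ** 2
    f_tangential = W * (1 - R) * math.sin(theta) * math.cos(theta)
    return f_normal, f_tangential
```

A perfectly reflecting wall at normal incidence (R = 1, theta = 0) feels a pressure 2W and no tangential force; a perfectly absorbing wall (R = 0) at 45 degrees feels equal normal and tangential forces W/2.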
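2. Use the Hamilton-Jacobi method to find the motion of a charge in the field of a plane electromagnetic
wave with vector potential $\mathbf{A}[t - (x/c)]$.
Solution: We write the Hamilton-Jacobi equation in four-dimensional form:
$$g^{ik}\left(\frac{\partial S}{\partial x^i} + \frac{e}{c}A_i\right)\left(\frac{\partial S}{\partial x^k} + \frac{e}{c}A_k\right) = m^2c^2. \tag{1}$$
The fact that the field is a plane wave means that the $A^i$ are functions of one independent variable,
which can be written in the form $\xi = k_i x^i$, where $k^i$ is a constant four-vector with its square equal
to zero, $k_i k^i = 0$ (see the following section). We subject the potentials to the Lorentz condition
$$\frac{\partial A^i}{\partial x^i} = \frac{dA^i}{d\xi}k_i = 0;$$
for the variable field this is equivalent to the condition $A^i k_i = 0$.
We seek a solution of equation (1) in the form
$$S = -f_i x^i + F(\xi),$$
where $f^i = (f^0, \mathbf{f})$ is a constant vector satisfying the condition $f_i f^i = m^2c^2$ ($S = -f_i x^i$ is the
solution of the Hamilton-Jacobi equation for a free particle with four-momentum $p^i = f^i$). Substitution
in (1) gives the equation
$$\frac{e^2}{c^2}A_iA^i - 2\gamma\frac{dF}{d\xi} - \frac{2e}{c}f_iA^i = 0,$$
where the constant $\gamma = k_i f^i$. Having determined F from this equation, we get
$$S = -f_i x^i - \frac{e}{c\gamma}\int f_iA^i\,d\xi + \frac{e^2}{2\gamma c^2}\int A_iA^i\,d\xi. \tag{2}$$
Changing to three-dimensional notation with a fixed reference frame, we choose the direction of
propagation of the wave as the x axis. Then $\xi = ct - x$, while the constant $\gamma = f^0 - f^1$. Denoting
the two-dimensional vector $f_y, f_z$ by $\boldsymbol{\kappa}$, we find from the condition $f_i f^i = (f^0)^2 - (f^1)^2 - \kappa^2 = m^2c^2$,
$$f^0 + f^1 = \frac{m^2c^2 + \kappa^2}{\gamma}.$$
We choose the potentials in the gauge in which $\phi = 0$, while $\mathbf{A}(\xi)$ lies in the yz plane. Then equation
(2) takes the form:
$$S = \boldsymbol{\kappa}\cdot\mathbf{r} - \frac{\gamma}{2}(ct + x) - \frac{m^2c^2 + \kappa^2}{2\gamma}\,\xi + \frac{e}{c\gamma}\int\boldsymbol{\kappa}\cdot\mathbf{A}\,d\xi - \frac{e^2}{2\gamma c^2}\int\mathbf{A}^2\,d\xi.$$
According to the general rules (Mechanics, § 47), to determine the motion we must equate the
derivatives $\partial S/\partial\boldsymbol{\kappa}$, $\partial S/\partial\gamma$ to certain new constants, which can be made to vanish by a suitable choice
of the coordinate and time origins. We thus obtain the parametric equations in $\xi$:
$$y = \frac{1}{\gamma}\kappa_y\xi - \frac{e}{c\gamma}\int A_y\,d\xi, \qquad z = \frac{1}{\gamma}\kappa_z\xi - \frac{e}{c\gamma}\int A_z\,d\xi,$$
$$x = \frac{1}{2}\left(\frac{m^2c^2 + \kappa^2}{\gamma^2} - 1\right)\xi - \frac{e}{c\gamma^2}\int\boldsymbol{\kappa}\cdot\mathbf{A}\,d\xi + \frac{e^2}{2\gamma^2c^2}\int\mathbf{A}^2\,d\xi, \qquad ct = \xi + x.$$
The generalized momentum $\mathbf{P} = \mathbf{p} + (e/c)\mathbf{A}$ and the energy $\mathscr{E}$ are found by differentiating the
action with respect to the coordinates and the time; this gives:
$$p_y = \kappa_y - \frac{e}{c}A_y, \qquad p_z = \kappa_z - \frac{e}{c}A_z,$$
$$p_x = -\frac{\gamma}{2} + \frac{m^2c^2 + \kappa^2}{2\gamma} - \frac{e}{c\gamma}\boldsymbol{\kappa}\cdot\mathbf{A} + \frac{e^2}{2\gamma c^2}\mathbf{A}^2,$$
$$\mathscr{E} = (\gamma + p_x)c.$$
If we average these over the time, the terms of first degree in the periodic function $\mathbf{A}(\xi)$ vanish.
We assume that the reference system has been chosen so that the particle is at rest in it on the average,
i.e. so that its averaged momentum is zero. Then
$$\boldsymbol{\kappa} = 0, \qquad \gamma^2 = m^2c^2 + \frac{e^2}{c^2}\overline{\mathbf{A}^2}.$$
The final formulas for determining the motion have the form:
$$x = \frac{e^2}{2\gamma^2c^2}\int(\mathbf{A}^2 - \overline{\mathbf{A}^2})\,d\xi, \qquad y = -\frac{e}{c\gamma}\int A_y\,d\xi, \qquad z = -\frac{e}{c\gamma}\int A_z\,d\xi,$$
$$ct = \xi + \frac{e^2}{2\gamma^2c^2}\int(\mathbf{A}^2 - \overline{\mathbf{A}^2})\,d\xi; \tag{3}$$
$$p_x = \frac{e^2}{2\gamma c^2}(\mathbf{A}^2 - \overline{\mathbf{A}^2}), \qquad p_y = -\frac{e}{c}A_y, \qquad p_z = -\frac{e}{c}A_z,$$
$$\mathscr{E} = c\gamma + \frac{e^2}{2\gamma c}(\mathbf{A}^2 - \overline{\mathbf{A}^2}). \tag{4}$$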
§ 48. Monochromatic plane waves
A very important special case of electromagnetic waves is a wave in which the field is a
simply periodic function of the time. Such a wave is said to be monochromatic. All quantities
(potentials, field components) in a monochromatic wave depend on the time through a
factor of the form $\cos(\omega t + \alpha)$. The quantity $\omega$ is called the cyclic frequency of the wave (we
shall simply call it the frequency).
In the wave equation, the second derivative of the field with respect to the time is now
$\partial^2 f/\partial t^2 = -\omega^2 f$, so that the distribution of the field in space is determined for a monochromatic
wave by the equation
$$\Delta f + \frac{\omega^2}{c^2}f = 0. \tag{48.1}$$
In a plane wave (propagating along the x axis), the field is a function only of $t - (x/c)$.
Therefore, if the plane wave is monochromatic, its field is a simply periodic function of
$t - (x/c)$. The vector potential of such a wave is most conveniently written as the real part of
a complex expression:
$$\mathbf{A} = \operatorname{Re}\{\mathbf{A}_0 e^{-i\omega(t - x/c)}\}. \tag{48.2}$$
Here $\mathbf{A}_0$ is a certain constant complex vector. Obviously, the fields E and H of such a
wave have analogous forms with the same frequency $\omega$. The quantity
$$\lambda = \frac{2\pi c}{\omega} \tag{48.3}$$
is called the wavelength; it is the period of variation of the field with the coordinate x at a
fixed time t.
The vector
$$\mathbf{k} = \frac{\omega}{c}\,\mathbf{n} \tag{48.4}$$
(where n is a unit vector along the direction of propagation of the wave) is called the wave
vector. In terms of it we can write (48.2) in the form
$$\mathbf{A} = \operatorname{Re}\{\mathbf{A}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\}, \tag{48.5}$$
which is independent of the choice of coordinate axes. The quantity which appears multiplied
by $i$ in the exponent is called the phase of the wave.
So long as we perform only linear operations, we can omit the sign Re for taking the real
part, and operate with complex quantities as such.† Thus, substituting
$$\mathbf{A} = \mathbf{A}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$$
in (47.3), we find the relation between the intensities and the vector potential of a plane
monochromatic wave in the form
$$\mathbf{E} = ik\mathbf{A}, \qquad \mathbf{H} = i\mathbf{k}\times\mathbf{A}. \tag{48.6}$$
We now treat in more detail the direction of the field of a monochromatic wave. To be
specific, we shall talk of the electric field
$$\mathbf{E} = \operatorname{Re}\{\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\}$$
(everything stated below applies equally well, of course, to the magnetic field). The quantity
$\mathbf{E}_0$ is a certain complex vector. Its square $\mathbf{E}_0^2$ is (in general) a complex number. If the argument
of this number is $-2\alpha$ (i.e. $\mathbf{E}_0^2 = |\mathbf{E}_0^2|e^{-2i\alpha}$), the vector b defined by
$$\mathbf{E}_0 = \mathbf{b}e^{-i\alpha} \tag{48.7}$$
will have its square real, $\mathbf{b}^2 = |\mathbf{E}_0^2|$. With this definition, we write:
$$\mathbf{E} = \operatorname{Re}\{\mathbf{b}e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t - \alpha)}\}. \tag{48.8}$$
We write b in the form
$$\mathbf{b} = \mathbf{b}_1 + i\mathbf{b}_2,$$
where $\mathbf{b}_1$ and $\mathbf{b}_2$ are real vectors. Since $\mathbf{b}^2 = b_1^2 - b_2^2 + 2i\,\mathbf{b}_1\cdot\mathbf{b}_2$ must be a real quantity,
$\mathbf{b}_1\cdot\mathbf{b}_2 = 0$, i.e. the vectors $\mathbf{b}_1$ and $\mathbf{b}_2$ are mutually perpendicular. We choose the direction
of $\mathbf{b}_1$ as the y axis (and the x axis along the direction of propagation of the wave). We then
have from (48.8):
$$E_y = b_1\cos(\omega t - \mathbf{k}\cdot\mathbf{r} + \alpha), \qquad E_z = \pm b_2\sin(\omega t - \mathbf{k}\cdot\mathbf{r} + \alpha), \tag{48.9}$$
where we use the plus (minus) sign if $\mathbf{b}_2$ is along the positive (negative) z axis. From (48.9) it
follows that
$$\frac{E_y^2}{b_1^2} + \frac{E_z^2}{b_2^2} = 1. \tag{48.10}$$
Thus we see that, at each point in space, the electric field vector rotates in a plane perpendicular
to the direction of propagation of the wave, while its endpoint describes the ellipse
(48.10). Such a wave is said to be elliptically polarized. The rotation occurs in the direction
of (opposite to) a right-hand screw rotating along the x axis, if we have the plus (minus) sign
in (48.9).
† If two quantities $\mathbf{A}(t)$ and $\mathbf{B}(t)$ are written in complex form
$$\mathbf{A}(t) = \mathbf{A}_0 e^{-i\omega t}, \qquad \mathbf{B}(t) = \mathbf{B}_0 e^{-i\omega t},$$
then in forming their product we must first, of course, separate out the real part. But if, as it frequently
happens, we are interested only in the time average of this product, it can be computed as
$$\tfrac{1}{2}\operatorname{Re}\{\mathbf{A}\cdot\mathbf{B}^*\}.$$
In fact, we have:
$$\operatorname{Re}\mathbf{A}\cdot\operatorname{Re}\mathbf{B} = \tfrac{1}{4}(\mathbf{A}_0 e^{-i\omega t} + \mathbf{A}_0^* e^{i\omega t})\cdot(\mathbf{B}_0 e^{-i\omega t} + \mathbf{B}_0^* e^{i\omega t}).$$
When we average, the terms containing factors $e^{\pm 2i\omega t}$ vanish, so that we are left with
$$\overline{\operatorname{Re}\mathbf{A}\cdot\operatorname{Re}\mathbf{B}} = \tfrac{1}{4}(\mathbf{A}\cdot\mathbf{B}^* + \mathbf{A}^*\cdot\mathbf{B}) = \tfrac{1}{2}\operatorname{Re}(\mathbf{A}\cdot\mathbf{B}^*).$$
If $b_1 = b_2$, the ellipse (48.10) reduces to a circle, i.e. the vector E rotates while remaining
constant in magnitude. In this case we say that the wave is circularly polarized. The choice
of the directions of the y and z axes is now obviously arbitrary. We note that in such a wave
the ratio of the y and z components of the complex amplitude $\mathbf{E}_0$ is
$$\frac{E_{0z}}{E_{0y}} = \pm i \tag{48.11}$$
for rotation in the same (opposite) direction as that of a right-hand screw (right and left
polarizations).†
Finally, if $b_1$ or $b_2$ equals zero, the field of the wave is everywhere and always parallel
(or antiparallel) to one and the same direction. In this case the wave is said to be linearly
polarized, or plane polarized. An elliptically polarized wave can clearly be treated as the
superposition of two plane polarized waves.
Now let us turn to the definition of the wave vector and introduce the four-dimensional
wave vector with components
$$k^i = \left(\frac{\omega}{c}, \mathbf{k}\right). \tag{48.12}$$
That these quantities actually form a four-vector is obvious from the fact that we get a
scalar (the phase of the wave) when we multiply by $x^i$:
$$k_i x^i = \omega t - \mathbf{k}\cdot\mathbf{r}. \tag{48.13}$$
From the definitions (48.4) and (48.12) we see that the square of the wave four-vector is
zero:
$$k^i k_i = 0. \tag{48.14}$$
This relation also follows directly from the fact that the expression
$$\mathbf{A} = \mathbf{A}_0 e^{-ik_i x^i}$$
must be a solution of the wave equation (46.10).
As is the case for every plane wave, in a monochromatic wave propagating along the x
axis only the following components of the energy-momentum tensor are different from zero
(see § 47):
$$T^{00} = T^{01} = T^{11} = W.$$
By means of the wave four-vector, these equations can be written in tensor form as
$$T^{ik} = \frac{Wc^2}{\omega^2}k^i k^k. \tag{48.15}$$
† We assume that the coordinate axes form a right-handed system.
Finally, by using the law of transformation of the wave four-vector we can easily treat the
so-called Doppler effect: the change in frequency $\omega$ of the wave emitted by a source moving
with respect to the observer, as compared to the "true" frequency $\omega_0$ of the same source in
the reference system ($K_0$) in which it is at rest.
Let V be the velocity of the source, i.e. the velocity of the $K_0$ system relative to K. According
to the general formula for transformation of four-vectors, we have:
$$k^{(0)0} = \frac{k^0 - \dfrac{V}{c}k^1}{\sqrt{1 - \dfrac{V^2}{c^2}}}$$
(the velocity of the K system relative to $K_0$ is $-V$). Substituting $k^{(0)0} = \omega_0/c$, $k^0 = \omega/c$,
$k^1 = k\cos\alpha = (\omega/c)\cos\alpha$, where $\alpha$ is the angle (in the K system) between the direction of
emission of the wave and the direction of motion of the source, and expressing $\omega$ in terms
of $\omega_0$, we obtain:
$$\omega = \omega_0\,\frac{\sqrt{1 - \dfrac{V^2}{c^2}}}{1 - \dfrac{V}{c}\cos\alpha}. \tag{48.16}$$
This is the required formula. For $V \ll c$, and if the angle $\alpha$ is not too close to $\pi/2$, it gives:
$$\omega \cong \omega_0\left(1 + \frac{V}{c}\cos\alpha\right). \tag{48.17}$$
For $\alpha = \pi/2$, we have:
$$\omega = \omega_0\sqrt{1 - \frac{V^2}{c^2}} \cong \omega_0\left(1 - \frac{V^2}{2c^2}\right); \tag{48.18}$$
in this case the relative change in frequency is proportional to the square of the ratio V/c.
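The Doppler formula (48.16) and its two limiting forms can be compared numerically. In the sketch below the function name `doppler` and the argument `beta` (standing for V/c) are assumptions made for illustration.

```python
import math

def doppler(omega0, beta, alpha):
    """Observed frequency from (48.16).

    omega0 is the frequency in the rest frame of the source, beta = V/c,
    and alpha is the angle in the observer's frame between the direction
    of emission and the velocity of the source.
    """
    return omega0 * math.sqrt(1 - beta ** 2) / (1 - beta * math.cos(alpha))
```

For V/c = 0.6 a source approaching head-on (alpha = 0) is seen at twice its proper frequency, while at alpha = pi/2 only the second-order factor 0.8 remains. For small V/c the result is close to the first-order expression (48.17).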
PROBLEMS
1. Determine the direction and magnitude of the axes of the polarization ellipse in terms of the
complex amplitude $\mathbf{E}_0$.
Solution: The problem consists in determining the vector $\mathbf{b} = \mathbf{b}_1 + i\mathbf{b}_2$, whose square is real. We
have from (48.7):
$$\mathbf{E}_0\cdot\mathbf{E}_0^* = b_1^2 + b_2^2, \qquad \mathbf{E}_0\times\mathbf{E}_0^* = -2i\,\mathbf{b}_1\times\mathbf{b}_2, \tag{1}$$
or
$$b_1^2 + b_2^2 = A^2 + B^2, \qquad b_1 b_2 = AB\sin\delta,$$
where we have introduced the notation
$$|E_{0y}| = A, \qquad |E_{0z}| = B, \qquad \frac{E_{0z}}{B} = \frac{E_{0y}}{A}e^{i\delta}$$
for the absolute values of $E_{0y}$ and $E_{0z}$ and for the phase difference $\delta$ between them. Then
$$b_{1,2} = \tfrac{1}{2}\left\{\sqrt{A^2 + B^2 + 2AB\sin\delta} \pm \sqrt{A^2 + B^2 - 2AB\sin\delta}\right\}, \tag{2}$$
from which we get the magnitudes of the semiaxes of the polarization ellipse.
To determine their directions (relative to the arbitrary initial axes y and z) we start from the
equality
$$\operatorname{Re}\{(\mathbf{E}_0\cdot\mathbf{b}_1)(\mathbf{E}_0^*\cdot\mathbf{b}_2)\} = 0,$$
which is easily verified by substituting $\mathbf{E}_0 = (\mathbf{b}_1 + i\mathbf{b}_2)e^{-i\alpha}$. Writing out this equality in the y, z
coordinates, we get for the angle $\theta$ between the direction of $\mathbf{b}_1$ and the y axis:
$$\tan 2\theta = \frac{2AB\cos\delta}{A^2 - B^2}. \tag{3}$$
The direction of rotation of the field is determined by the sign of the x component of the vector
$\mathbf{b}_1\times\mathbf{b}_2$. Taking its expression from (1)
$$2i(\mathbf{b}_1\times\mathbf{b}_2)_x = E_{0z}E_{0y}^* - E_{0z}^*E_{0y} = |E_{0y}|^2\left\{\frac{E_{0z}}{E_{0y}} - \left(\frac{E_{0z}}{E_{0y}}\right)^*\right\},$$
we see that the direction of $\mathbf{b}_1\times\mathbf{b}_2$ (whether it is along or opposite to the positive direction of the
x axis), and the sign of the rotation (whether in the same direction, or opposite to the direction of a
right-hand screw along the x axis) are given by the sign of the imaginary part of the ratio $E_{0z}/E_{0y}$
(plus for the first case and minus for the second). This is a generalization of the rule (48.11) for the
case of circular polarization.
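Formulas (2) and (3) of this problem can be evaluated directly from the two complex components of the amplitude. The following sketch is only an illustration: the function name is hypothetical, the sign of the returned `b2` is taken to encode the sense of rotation (positive when the imaginary part of $E_{0z}/E_{0y}$ is positive), and `atan2` is used as an assumed way of choosing the branch of (3) that points along the larger semiaxis.

```python
import cmath
import math

def polarization_ellipse(E0y, E0z):
    """Return (b1, b2, theta) for the complex amplitude (E0y, E0z).

    b1 >= |b2| are the semiaxes from formula (2); b2 carries the sign of
    sin(delta), i.e. of the sense of rotation. theta is the angle between
    the b1 axis and the y axis, from formula (3).
    """
    A, B = abs(E0y), abs(E0z)
    delta = cmath.phase(E0z) - cmath.phase(E0y)   # phase difference
    s = 2 * A * B * math.sin(delta)
    root_plus = math.sqrt(max(0.0, A * A + B * B + s))
    root_minus = math.sqrt(max(0.0, A * A + B * B - s))
    b1 = 0.5 * (root_plus + root_minus)
    b2 = 0.5 * (root_plus - root_minus)
    theta = 0.5 * math.atan2(2 * A * B * math.cos(delta), A * A - B * B)
    return b1, b2, theta
```

For the amplitude (1, i) the ellipse is a circle of unit radius with positive sense of rotation, in agreement with (48.11); for (1, 1) it degenerates into a line at 45 degrees to the y axis.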
2. Determine the motion of a charge in the field of a plane monochromatic linearly polarized
wave.
Solution: Choosing the direction of the field E of the wave as the y axis, we write:
$$E_y = E = E_0\cos\omega\xi, \qquad A_y = A = -\frac{cE_0}{\omega}\sin\omega\xi$$
($\xi = t - x/c$). From formulas (3) and (4) of problem 2, § 47, we find (in the reference system in
which the particle is at rest on the average) the following representation of the motion in terms of
the parameter $\eta = \omega\xi$:
$$x = -\frac{e^2E_0^2c}{8\gamma^2\omega^3}\sin 2\eta, \qquad y = -\frac{eE_0c}{\gamma\omega^2}\cos\eta, \qquad z = 0,$$
$$t = \frac{\eta}{\omega} - \frac{e^2E_0^2}{8\gamma^2\omega^3}\sin 2\eta, \qquad \gamma^2 = m^2c^2 + \frac{e^2E_0^2}{2\omega^2};$$
$$p_x = -\frac{e^2E_0^2}{4\gamma\omega^2}\cos 2\eta, \qquad p_y = \frac{eE_0}{\omega}\sin\eta, \qquad p_z = 0.$$
The charge moves in the x, y plane in a symmetric figure-8 curve with its longitudinal axis along
the y axis.
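The figure-8 shape can be seen by evaluating the parametric formulas of this problem. The sketch below uses a hypothetical function name and leaves the units to the caller; setting all constants to 1 is an arbitrary illustrative choice.

```python
import math

def linear_wave_orbit(eta, e, E0, m, omega, c):
    """Coordinates (x, y) of the charge at parameter eta = omega * xi.

    Valid in the frame where the charge is at rest on the average.
    """
    gamma = math.sqrt(m * m * c * c + e * e * E0 * E0 / (2 * omega * omega))
    x = -(e * e * E0 * E0 * c) / (8 * gamma ** 2 * omega ** 3) * math.sin(2 * eta)
    y = -(e * E0 * c) / (gamma * omega ** 2) * math.cos(eta)
    return x, y
```

Advancing eta by pi leaves x unchanged and reverses y, and changing the sign of eta reverses x and leaves y unchanged; these two symmetries give the curve its two equal loops stacked along the y axis.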
3. Determine the motion of a charge in the field of a circularly polarized wave.
Solution: For the field of the wave we have:
$$E_y = E_0\cos\omega\xi, \qquad E_z = E_0\sin\omega\xi,$$
$$A_y = -\frac{cE_0}{\omega}\sin\omega\xi, \qquad A_z = \frac{cE_0}{\omega}\cos\omega\xi.$$
The motion is given by the formulas:
$$x = 0, \qquad y = -\frac{ecE_0}{\gamma\omega^2}\cos\omega t, \qquad z = -\frac{ecE_0}{\gamma\omega^2}\sin\omega t,$$
$$p_x = 0, \qquad p_y = \frac{eE_0}{\omega}\sin\omega t, \qquad p_z = -\frac{eE_0}{\omega}\cos\omega t,$$
$$\gamma^2 = m^2c^2 + \frac{e^2E_0^2}{\omega^2}.$$
Thus the charge moves in the y, z plane along a circle of radius $ecE_0/\gamma\omega^2$ with a momentum having
the constant magnitude $p = eE_0/\omega$; at each instant the momentum is $\mathbf{p} = -(e/\omega)\mathbf{H}$, i.e. for $e > 0$
it is directed opposite to the magnetic field H of the wave.
§ 49. Spectral resolution
Every wave can be subjected to the process of spectral resolution, i.e. can be represented as
a superposition of monochromatic waves with various frequencies. The character of this
expansion varies according to the character of the time dependence of the field.
One category consists of those cases where the expansion contains frequencies forming a
discrete sequence of values. The simplest case of this type arises in the resolution of a purely
periodic (though not monochromatic) field. This is the usual expansion in Fourier series;
it contains the frequencies which are integral multiples of the "fundamental" frequency
$\omega_0 = 2\pi/T$, where T is the period of the field. We write it in the form
$$f = \sum_{n=-\infty}^{\infty} f_n e^{-i\omega_0 nt} \tag{49.1}$$
(where $f$ is any of the quantities describing the field). The quantities $f_n$ are defined in terms
of the function $f$ by the integrals
$$f_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)\,e^{in\omega_0 t}\,dt. \tag{49.2}$$
Because $f(t)$ must be real,
$$f_{-n} = f_n^*. \tag{49.3}$$
In more complicated cases, the expansion may contain integral multiples (and sums of
integral multiples) of several different incommensurable fundamental frequencies.
When the sum (49.1) is squared and averaged over the time, the products of terms with
different frequencies give zero because they contain oscillating factors. Only terms of the
form $f_n f_{-n} = |f_n|^2$ remain. Thus the average of the square of the field, i.e. the average
intensity of the wave, is the sum of the intensities of its monochromatic components:
$$\overline{f^2} = \sum_{n=-\infty}^{\infty}|f_n|^2 = 2\sum_{n=1}^{\infty}|f_n|^2 \tag{49.4}$$
(where it is assumed that the average of the function $f$ over a period is zero, i.e. $f_0 = \bar{f} = 0$).
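The relation (49.4) can be checked on a simple periodic signal. The sketch below approximates the integral (49.2) by a uniform sum over one period, which is an assumed numerical choice; the function names and the test signal are likewise illustrative.

```python
import cmath
import math

def fourier_coefficient(f, n, T, samples=256):
    """Approximate f_n of (49.2) by a uniform sum over one period T."""
    w0 = 2 * math.pi / T
    dt = T / samples
    total = 0j
    for j in range(samples):
        t = -T / 2 + j * dt
        total += f(t) * cmath.exp(1j * n * w0 * t)
    return total * dt / T

def mean_square(f, T, samples=256):
    """Time average of f^2 over one period."""
    dt = T / samples
    return sum(f(-T / 2 + j * dt) ** 2 for j in range(samples)) / samples

def intensity_from_harmonics(f, T, n_max):
    """Right-hand side of (49.4), truncated at harmonic n_max."""
    return 2 * sum(abs(fourier_coefficient(f, n, T)) ** 2
                   for n in range(1, n_max + 1))
```

For a signal such as 3 cos(w0 t) + 4 sin(2 w0 t) the two harmonics have |f_1| = 3/2 and |f_2| = 2, and both sides of (49.4) equal 12.5.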
Another category consists of fields which are expandable in a Fourier integral containing
a continuous sequence of different frequencies. For this to be possible, the function f(t )
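must satisfy certain definite conditions; usually we consider functions which vanish for
$t \to \pm\infty$. Such an expansion has the form
$$f(t) = \int_{-\infty}^{\infty} f_\omega e^{-i\omega t}\,\frac{d\omega}{2\pi}, \tag{49.5}$$
where the Fourier components are given in terms of the function $f(t)$ by the integrals
$$f_\omega = \int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt. \tag{49.6}$$
Analogously to (49.3),
$$f_{-\omega} = f_\omega^*. \tag{49.7}$$
Let us express the total intensity of the wave, i.e. the integral of $f^2$ over all time, in terms
of the intensity of the Fourier components. Using (49.5) and (49.6), we have:
$$\int_{-\infty}^{\infty} f^2\,dt = \int_{-\infty}^{\infty}\left\{f\int_{-\infty}^{\infty} f_\omega e^{-i\omega t}\frac{d\omega}{2\pi}\right\}dt = \int_{-\infty}^{\infty}\left\{f_\omega\int_{-\infty}^{\infty} f e^{-i\omega t}\,dt\right\}\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} f_\omega f_{-\omega}\frac{d\omega}{2\pi},$$
or, using (49.7),
$$\int_{-\infty}^{\infty} f^2\,dt = \int_{-\infty}^{\infty}|f_\omega|^2\frac{d\omega}{2\pi} = 2\int_0^{\infty}|f_\omega|^2\frac{d\omega}{2\pi}. \tag{49.8}$$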
§ 50. Partially polarized light
Every monochromatic wave is, by definition, necessarily polarized. However we usually
have to deal with waves which are only approximately monochromatic, and which contain
frequencies in a small interval $\Delta\omega$. We consider such a wave, and let $\omega$ be some average
frequency for it. Then its field (to be specific we shall consider the electric field E) at a fixed
point in space can be written in the form
$$\mathbf{E} = \mathbf{E}_0(t)\,e^{-i\omega t},$$
where the complex amplitude $\mathbf{E}_0(t)$ is some slowly varying function of the time (for a strictly
monochromatic wave $\mathbf{E}_0$ would be constant). Since $\mathbf{E}_0$ determines the polarization of the
wave, this means that at each point of the wave, its polarization changes with time; such a
wave is said to be partially polarized.
The polarization properties of electromagnetic waves, and of light in particular, are
observed experimentally by passing the light to be investigated through various bodies†
and then observing the intensity of the transmitted light. From the mathematical point of
view this means that we draw conclusions concerning the polarization properties of the light
from the values of certain quadratic functions of its field. Here of course we are considering
the time averages of such functions.
Quadratic functions of the field are made up of terms proportional to the products $E_\alpha E_\beta$,
$E_\alpha^* E_\beta^*$ or $E_\alpha E_\beta^*$. Products of the form
$$E_\alpha E_\beta = E_{0\alpha}E_{0\beta}\,e^{-2i\omega t}, \qquad E_\alpha^* E_\beta^* = E_{0\alpha}^* E_{0\beta}^*\,e^{2i\omega t},$$
which contain the rapidly oscillating factors $e^{\pm 2i\omega t}$, give zero when the time average is taken.
The products $E_\alpha E_\beta^* = E_{0\alpha}E_{0\beta}^*$ do not contain such factors, and so their averages are not
zero. Thus we see that the polarization properties of the light are completely characterized
by the tensor
$$J_{\alpha\beta} = \overline{E_{0\alpha}E_{0\beta}^*}. \tag{50.1}$$
Since the vector $\mathbf{E}_0$ always lies in a plane perpendicular to the direction of the wave, the
tensor $J_{\alpha\beta}$ has altogether four components (in this section the indices $\alpha$, $\beta$ are understood
to take on only two values: $\alpha, \beta = 1, 2$, corresponding to the y and z axes; the x axis is
along the direction of propagation of the wave).
The sum of the diagonal elements of the tensor $J_{\alpha\beta}$ (we denote it by J) is a real quantity,
the average value of the square modulus of the vector $\mathbf{E}_0$ (or E):
$$J \equiv J_{\alpha\alpha} = \overline{\mathbf{E}_0\cdot\mathbf{E}_0^*}. \tag{50.2}$$
This quantity determines the intensity of the wave, as measured by the energy flux density.
To eliminate this quantity which is not directly related to the polarization properties, we
introduce in place of $J_{\alpha\beta}$ the tensor
$$\rho_{\alpha\beta} = \frac{J_{\alpha\beta}}{J}, \tag{50.3}$$
for which $\rho_{\alpha\alpha} = 1$; we call it the polarization tensor.
From the definition (50.1) we see that the components of the tensor $J_{\alpha\beta}$, and consequently
also $\rho_{\alpha\beta}$, are related by
$$\rho_{\alpha\beta} = \rho_{\beta\alpha}^* \tag{50.4}$$
(i.e. the tensor is hermitian). Consequently the diagonal components $\rho_{11}$ and $\rho_{22}$ are real
(with $\rho_{11} + \rho_{22} = 1$) while $\rho_{21} = \rho_{12}^*$. Thus the polarization is characterized by three real
parameters.
† For example, through a Nicol prism.
Let us study the conditions that the tensor $\rho_{\alpha\beta}$ must satisfy for completely polarized light.
In this case $\mathbf{E}_0$ = const, and so we have simply
$$J_{\alpha\beta} = J\rho_{\alpha\beta} = E_{0\alpha}E_{0\beta}^* \tag{50.5}$$
(without averaging), i.e. the components of the tensor can be written as products of components
of some constant vector. The necessary and sufficient condition for this is that the
determinant vanish:
$$|\rho_{\alpha\beta}| = \rho_{11}\rho_{22} - \rho_{12}\rho_{21} = 0. \tag{50.6}$$
The opposite case is that of unpolarized or natural light. Complete absence of polarization
means that all directions (in the yz plane) are equivalent. In other words the polarization
tensor must have the form:
$$\rho_{\alpha\beta} = \tfrac{1}{2}\delta_{\alpha\beta}. \tag{50.7}$$
The determinant is $|\rho_{\alpha\beta}| = \tfrac{1}{4}$.
In the general case of arbitrary polarization the determinant has values between 0 and $\tfrac{1}{4}$.†
By the degree of polarization we mean the positive quantity P, defined from
$$|\rho_{\alpha\beta}| = \tfrac{1}{4}(1 - P^2). \tag{50.8}$$
It runs from the value 0 for unpolarized to 1 for polarized light.
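The definitions (50.1), (50.3) and (50.8) translate directly into a short computation on sampled amplitudes. In the sketch below the time average is replaced by an average over a list of samples of $(E_{0y}, E_{0z})$; this replacement and the function names are assumptions made for illustration.

```python
import math

def polarization_tensor(samples):
    """rho from a list of complex amplitude pairs (E0y, E0z), per (50.1) and (50.3)."""
    n = len(samples)
    J = [[sum(a[i] * a[j].conjugate() for a in samples) / n for j in (0, 1)]
         for i in (0, 1)]
    trace = (J[0][0] + J[1][1]).real          # total intensity J of (50.2)
    return [[J[i][j] / trace for j in (0, 1)] for i in (0, 1)]

def degree_of_polarization(rho):
    """P defined by det(rho) = (1 - P^2)/4, see (50.8)."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    return math.sqrt(max(0.0, 1 - 4 * det))
```

A single constant amplitude, linear or circular, gives P = 1, while an equal mixture of two samples polarized along y and along z gives the tensor (50.7) and P = 0.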
An arbitrary tensor $\rho_{\alpha\beta}$ can be split into two parts: a symmetric and an antisymmetric
part. Of these, the first
$$S_{\alpha\beta} = \tfrac{1}{2}(\rho_{\alpha\beta} + \rho_{\beta\alpha})$$
is real because of the hermiticity of $\rho_{\alpha\beta}$. The antisymmetric part is pure imaginary. Like any
antisymmetric tensor of rank equal to the number of dimensions, it reduces to a pseudoscalar
(see the footnote on p. 17):
$$\tfrac{1}{2}(\rho_{\alpha\beta} - \rho_{\beta\alpha}) = -\frac{i}{2}e_{\alpha\beta}A,$$
where A is a real pseudoscalar, $e_{\alpha\beta}$ is the unit antisymmetric tensor (with components
$e_{12} = -e_{21} = 1$). Thus the polarization tensor has the form:
$$\rho_{\alpha\beta} = S_{\alpha\beta} - \frac{i}{2}e_{\alpha\beta}A, \qquad S_{\alpha\beta} = S_{\beta\alpha}, \tag{50.9}$$
i.e. it reduces to one real symmetric tensor and one pseudoscalar.
For a circularly polarized wave, the vector $\mathbf{E}_0$ = const, where
$$E_{02} = \pm iE_{01}.$$
It is easy to see that then $S_{\alpha\beta} = \tfrac{1}{2}\delta_{\alpha\beta}$, while $A = \pm 1$. On the other hand, for a linearly polarized
wave the constant vector $\mathbf{E}_0$ can be chosen to be real, so that A = 0. In the general case the
quantity A may be called the degree of circular polarization; it runs through values from
+1 to -1, where the limiting values correspond to right- and left-circularly polarized
waves, respectively.
The real symmetric tensor $S_{\alpha\beta}$, like any symmetric tensor, can be brought to principal
axes, with different principal values which we denote by $\lambda_1$ and $\lambda_2$. The directions of the
principal axes are mutually perpendicular. Denoting the unit vectors along these directions
by $\mathbf{n}^{(1)}$ and $\mathbf{n}^{(2)}$, we can write $S_{\alpha\beta}$ in the form:
$$S_{\alpha\beta} = \lambda_1 n_\alpha^{(1)}n_\beta^{(1)} + \lambda_2 n_\alpha^{(2)}n_\beta^{(2)}, \qquad \lambda_1 + \lambda_2 = 1. \tag{50.10}$$
The quantities $\lambda_1$ and $\lambda_2$ are positive and take on values from 0 to 1.
† The fact that the determinant is positive for any tensor of the form (50.1) is easily seen by considering
the averaging, for simplicity, as a summation over discrete values, and using the well-known algebraic
inequality
$$\Big|\sum_a x_a y_a^*\Big|^2 \le \sum_a |x_a|^2\sum_b |y_b|^2.$$
Suppose that A = 0, so that $\rho_{\alpha\beta} = S_{\alpha\beta}$. Each of the two terms in (50.10) has the form of
a product of two components of a constant vector ($\sqrt{\lambda_1}\,\mathbf{n}^{(1)}$ or $\sqrt{\lambda_2}\,\mathbf{n}^{(2)}$). In other words,
each of the terms corresponds to linearly polarized light. Furthermore, we see that there
is no term in (50.10) containing products of components of the two waves. This means that
the two parts can be regarded as physically independent of one another, or, as one says, they
are incoherent. In fact, if two waves are independent, the average value of the product
$E_\alpha^{(1)}E_\beta^{(2)}$ is equal to the product of the averages of each of the factors, and since each of them
is zero,
$$\overline{E_\alpha^{(1)}E_\beta^{(2)}} = 0.$$
Thus we arrive at the result that in this case (A = 0) the partially polarized light can be
represented as a superposition of two incoherent waves (with intensities proportional to
$\lambda_1$ and $\lambda_2$), linearly polarized along mutually perpendicular directions.† (In the general case
of a complex tensor $\rho_{\alpha\beta}$ one can show that the light can be represented as a superposition of
two incoherent elliptically polarized waves, whose polarization ellipses are similar and
mutually perpendicular (see problem 2).)
Let $\phi$ be the angle between the axis 1 (the y axis) and the unit vector $\mathbf{n}^{(1)}$; then
$$\mathbf{n}^{(1)} = (\cos\phi, \sin\phi), \qquad \mathbf{n}^{(2)} = (-\sin\phi, \cos\phi).$$
Introducing the quantity $l = \lambda_1 - \lambda_2$ (assume $\lambda_1 > \lambda_2$), we write the components of the
tensor (50.10) in the following form:
$$S_{\alpha\beta} = \frac{1}{2}\begin{pmatrix} 1 + l\cos 2\phi & l\sin 2\phi \\ l\sin 2\phi & 1 - l\cos 2\phi \end{pmatrix}. \tag{50.11}$$
Thus, for an arbitrary choice of the axes y and z, the polarization properties of the wave can
be characterized by the following three real parameters: A, the degree of circular polarization;
$l$, the degree of maximum linear polarization; and $\phi$, the angle between the direction
$\mathbf{n}^{(1)}$ of maximum polarization and the y axis.
In place of these parameters one can choose another set of three parameters:
$$\xi_1 = l\sin 2\phi, \qquad \xi_2 = A, \qquad \xi_3 = l\cos 2\phi \tag{50.12}$$
(the Stokes parameters). The polarization tensor is expressed in terms of them as
$$\rho_{\alpha\beta} = \frac{1}{2}\begin{pmatrix} 1 + \xi_3 & \xi_1 - i\xi_2 \\ \xi_1 + i\xi_2 & 1 - \xi_3 \end{pmatrix}. \tag{50.13}$$
All three parameters run through values from -1 to +1. The parameter $\xi_3$ characterizes
the linear polarization along the y and z axes: the value $\xi_3 = 1$ corresponds to complete
linear polarization along the y axis, and $\xi_3 = -1$ to complete polarization along the z axis.
The parameter $\xi_1$ characterizes the linear polarization along directions making an angle of
45° with the y axis: the value $\xi_1 = 1$ means complete polarization at an angle $\phi = \pi/4$,
while $\xi_1 = -1$ means complete polarization at $\phi = -\pi/4$.‡
† The determinant $|S_{\alpha\beta}| = \lambda_1\lambda_2$; suppose that $\lambda_1 > \lambda_2$; then the degree of polarization, as defined in
(50.8), is $P = 1 - 2\lambda_2$. In the present case (A = 0) one frequently characterizes the degree of polarization by
using the depolarization coefficient, defined as the ratio $\lambda_2/\lambda_1$.
‡ For a completely elliptically polarized wave with axes of the ellipse $b_1$ and $b_2$ (see § 48), the Stokes
parameters are:
$$\xi_1 = 0, \qquad \xi_2 = \pm 2b_1b_2, \qquad \xi_3 = b_1^2 - b_2^2.$$
Here the y axis is along $\mathbf{b}_1$, while the two signs in $\xi_2$ correspond to directions of $\mathbf{b}_2$ along and opposite to the
direction of the z axis.
The determinant of (50.13) is equal to
$$|\rho_{\alpha\beta}| = \tfrac{1}{4}(1 - \xi_1^2 - \xi_2^2 - \xi_3^2). \tag{50.14}$$
Comparing with (50.8), we see that
$$P = \sqrt{\xi_1^2 + \xi_2^2 + \xi_3^2}. \tag{50.15}$$
Thus, for a given overall degree of polarization P, different types of polarization are
possible, characterized by the values of the three quantities $\xi_2$, $\xi_1$, $\xi_3$, the sum of whose
squares is fixed; they form a sort of vector of fixed length.
We note that the quantities $\xi_2 = A$ and $\sqrt{\xi_1^2 + \xi_3^2} = l$ are invariant under Lorentz transformations.
This remark is already almost obvious from the very meaning of these quantities
as degrees of circular and linear polarization.†
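Reading (50.13) backwards gives the Stokes parameters from the components of the polarization tensor, and (50.15) then gives P. The sketch below assumes the tensor is supplied as a nested 2x2 list; the function names are hypothetical.

```python
import math

def stokes_parameters(rho):
    """(xi1, xi2, xi3) from a hermitian 2x2 polarization tensor, inverting (50.13)."""
    rho12 = complex(rho[0][1])
    xi1 = 2 * rho12.real                  # rho12 = (xi1 - i xi2)/2
    xi2 = -2 * rho12.imag
    xi3 = (complex(rho[0][0]) - complex(rho[1][1])).real
    return xi1, xi2, xi3

def degree_of_polarization(xi):
    """P from (50.15)."""
    return math.sqrt(sum(x * x for x in xi))
```

The tensor of a right-circularly polarized wave, with off-diagonal element -i/2, gives (0, 1, 0); natural light gives (0, 0, 0); light polarized along y gives (0, 0, 1).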
PROBLEMS
1. Resolve an arbitrary partially polarized light wave into its "natural" and "polarized" parts.
Solution: This resolution means the representation of the tensor $J_{\alpha\beta}$ in the form
$$J_{\alpha\beta} = \tfrac{1}{2}J^{(n)}\delta_{\alpha\beta} + E_{0\alpha}^{(p)}E_{0\beta}^{(p)*}.$$
The first term corresponds to the natural, and the second to the polarized parts of the light. To
determine the intensities of the parts we note that the determinant
$$\left|J_{\alpha\beta} - \tfrac{1}{2}J^{(n)}\delta_{\alpha\beta}\right| = \left|E_{0\alpha}^{(p)}E_{0\beta}^{(p)*}\right| = 0.$$
Writing $J_{\alpha\beta} = J\rho_{\alpha\beta}$ in the form (50.13) and solving the equation, we get
$$J^{(n)} = J(1 - P).$$
The intensity of the polarized part is $J^{(p)} = |\mathbf{E}_0^{(p)}|^2 = J - J^{(n)} = JP$.
The polarized part of the light is in general an elliptically polarized wave, where the directions
of the axes of the ellipse coincide with the principal axes of the tensor $S_{\alpha\beta}$. The lengths $b_1$ and $b_2$ of
the axes of the ellipse and the angle $\phi$ formed by the axis $\mathbf{b}_1$ and the y axis are given
by the equations:
$$b_1^2 + b_2^2 = JP, \qquad 2b_1b_2 = J\xi_2, \qquad \tan 2\phi = \frac{\xi_1}{\xi_3}.$$
2. Represent an arbitrary partially polarized wave as a superposition of two incoherent ellip
tically polarized waves.
Solution: For the hermitian tensor p a0 the "principal axes" are determined by two unit complex
vectors n (n n* = 1), satisfying the equations
Pann = An a . (1)
The principal values X x and A 2 are the roots of the equation
\pap — ^dae\ =0.
Multiplying (1) on both sides by n*, we have:
^■ = Pann*n & = j\E 0a n*\ 2 ,
from which we see that X u X 2 are real and positive. Multiplying the equations
n M (i) _ ; W (D ,* w (2)* _ ;„ w (2)*
f For a direct proof, we note that since the field of the wave is transverse in any reference frame, it is clear
from the start that the tensor p aB remains twodimensional in any new frame. The transformation of p aB into
Pae leaves unchanged the sum of absolute squares p a0 pt (in fact, the form of the transformation does not
depend on the specific polarization properties of the light, while for a completely polarized wave this sum
is 1 in any reference system). Because this transformation is real, the real and imaginary parts of the tensor
Pap (50.9) transform independently, so that the sums of the squares of the components of each separately
remain constant, and are expressed in terms of / and A.
124 ELECTROMAGNETIC WAVES § 51
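for the first by $n_\alpha^{(2)*}$ and for the second by $n_\alpha^{(1)}$, taking the difference of the results and using the
hermiticity of $\rho_{\alpha\beta}$, we get:
$$(\lambda_1 - \lambda_2)\, n_\alpha^{(1)} n_\alpha^{(2)*} = 0.$$
It then follows that $\mathbf{n}^{(1)}\cdot\mathbf{n}^{(2)*} = 0$, i.e. the unit vectors $\mathbf{n}^{(1)}$ and $\mathbf{n}^{(2)}$ are mutually orthogonal.
The expansion of the wave is provided by the formula
$$\rho_{\alpha\beta} = \lambda_1 n_\alpha^{(1)} n_\beta^{(1)*} + \lambda_2 n_\alpha^{(2)} n_\beta^{(2)*}.$$
One can always choose the complex amplitude so that, of the two mutually perpendicular
components, one is real and the other imaginary (compare § 48). Setting
$$n_1^{(1)} = b_1, \qquad n_2^{(1)} = i b_2$$
(where now $b_1$ and $b_2$ are understood to be normalized by the condition $b_1^2 + b_2^2 = 1$), we get from
the equation $\mathbf{n}^{(1)}\cdot\mathbf{n}^{(2)*} = 0$:
$$n_1^{(2)} = i b_2, \qquad n_2^{(2)} = b_1.$$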
We then see that the ellipses of the two elliptically polarized vibrations are similar (have equal axis
ratio), and one of them is turned through 90° relative to the other.
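The decomposition of Problem 2 can be checked numerically. The following is only a sketch: it assumes the usual parametrization of the polarization tensor by Stokes parameters, $\rho = \tfrac12\begin{pmatrix}1+\xi_3 & \xi_1 - i\xi_2\\ \xi_1 + i\xi_2 & 1-\xi_3\end{pmatrix}$, and the function names `polarization_tensor` and `principal_axes` are illustrative, not taken from the text.

```python
import math

def polarization_tensor(xi1, xi2, xi3):
    """Hermitian 2x2 tensor rho from Stokes parameters (assumed parametrization)."""
    return [[0.5 * (1 + xi3), 0.5 * (xi1 - 1j * xi2)],
            [0.5 * (xi1 + 1j * xi2), 0.5 * (1 - xi3)]]

def principal_axes(rho):
    """Return [(lam1, n1), (lam2, n2)] with rho n = lam n and n . n* = 1."""
    a, d = rho[0][0].real, rho[1][1].real
    b = complex(rho[0][1])
    mean = 0.5 * (a + d)
    root = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    axes = []
    for index, lam in enumerate((mean + root, mean - root)):
        # Each row of (rho - lam) n = 0 suggests an eigenvector; keep the longer one.
        candidates = [(b, complex(lam - a)), (complex(lam - d), b.conjugate())]
        v = max(candidates, key=lambda w: abs(w[0]) ** 2 + abs(w[1]) ** 2)
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        if norm < 1e-14:
            # Degenerate (unpolarized) case: any orthonormal pair will do.
            v, norm = ((1 + 0j, 0j), 1.0) if index == 0 else ((0j, 1 + 0j), 1.0)
        axes.append((lam, (v[0] / norm, v[1] / norm)))
    return axes

rho = polarization_tensor(0.3, 0.4, 0.5)
(lam1, n1), (lam2, n2) = principal_axes(rho)
# Orthogonality n1 . n2* of the two principal axes (should be numerically zero).
overlap = n1[0] * n2[0].conjugate() + n1[1] * n2[1].conjugate()
```

With these inputs the two principal values should come out as $\tfrac12(1 \pm P)$, where $P = \sqrt{\xi_1^2+\xi_2^2+\xi_3^2}$, so that both are real and non-negative.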
3. Find the law of transformation of the Stokes parameters for a rotation of the $y$, $z$ axes through
an angle $\phi$.
Solution: The law is determined by the connection of the Stokes parameters to the components of
the two-dimensional tensor in the $yz$ plane, and is given by the formulas
$$\xi'_1 = \xi_1\cos 2\phi - \xi_3\sin 2\phi, \qquad \xi'_3 = \xi_1\sin 2\phi + \xi_3\cos 2\phi, \qquad \xi'_2 = \xi_2.$$
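As a small illustration of the transformation law in Problem 3, the sketch below applies the rotation to a triple of Stokes parameters. The function name `rotate_stokes` is an assumption made for this example.

```python
import math

def rotate_stokes(xi1, xi2, xi3, phi):
    """Stokes parameters after rotating the y, z axes through the angle phi."""
    c, s = math.cos(2.0 * phi), math.sin(2.0 * phi)
    return (xi1 * c - xi3 * s, xi2, xi1 * s + xi3 * c)

before = (0.3, 0.4, 0.5)
after = rotate_stokes(*before, phi=0.7)
# The combination xi1**2 + xi3**2 is the same before and after the rotation.
invariant_before = before[0] ** 2 + before[2] ** 2
invariant_after = after[0] ** 2 + after[2] ** 2
```

The parameter $\xi_2$ and the combination $\xi_1^2 + \xi_3^2$ are unchanged by the rotation, while a rotation through $\pi$ restores all three parameters.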
§ 51. The Fourier resolution of the electrostatic field
The field produced by charges can also be formally expanded in plane waves (in a Fourier
integral). This expansion, however, is essentially different from the expansion of
electromagnetic waves in vacuum, for the field produced by charges does not satisfy the
homogeneous wave equation, and therefore each term of this expansion does not satisfy the
equation. From this it follows that for the plane waves into which the field of charges can be
expanded, the relation $k^2 = \omega^2/c^2$, which holds for plane monochromatic electromagnetic
waves, is not fulfilled.
In particular, if we formally represent the electrostatic field as a superposition of plane
waves, then the "frequency" of these waves is clearly zero, since the field under consideration
does not depend on the time. The wave vectors themselves are, of course, different from zero.
We consider the field produced by a point charge e, located at the origin of coordinates.
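The potential $\phi$ of this field is determined by the equation (see § 36)
$$\Delta\phi = -4\pi e\,\delta(\mathbf{r}). \qquad (51.1)$$
We expand $\phi$ in a Fourier integral, i.e., we represent it in the form
$$\phi = \int_{-\infty}^{+\infty} e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_{\mathbf{k}}\,\frac{d^3k}{(2\pi)^3}, \qquad (51.2)$$
where $d^3k$ denotes $dk_x\,dk_y\,dk_z$. In this formula $\phi_{\mathbf{k}} = \int \phi(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV$. Applying the Laplace
operator to both sides of (51.2), we obtain
$$\Delta\phi = -\int_{-\infty}^{+\infty} k^2 e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_{\mathbf{k}}\,\frac{d^3k}{(2\pi)^3},$$
so that the Fourier component of the expression $\Delta\phi$ is
$$(\Delta\phi)_{\mathbf{k}} = -k^2\phi_{\mathbf{k}}.$$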
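On the other hand, we can find $(\Delta\phi)_{\mathbf{k}}$ by taking Fourier components of both sides of
equation (51.1),
$$(\Delta\phi)_{\mathbf{k}} = -\int_{-\infty}^{+\infty} 4\pi e\,\delta(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV = -4\pi e.$$
Equating the two expressions obtained for $(\Delta\phi)_{\mathbf{k}}$, we find
$$\phi_{\mathbf{k}} = \frac{4\pi e}{k^2}. \qquad (51.3)$$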
This formula solves our problem.
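Just as for the potential $\phi$, we can expand the field
$$\mathbf{E} = \int_{-\infty}^{+\infty} \mathbf{E}_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3}. \qquad (51.4)$$
With the aid of (51.2), we have
$$\mathbf{E} = -\operatorname{grad}\int_{-\infty}^{+\infty} \phi_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3} = -\int_{-\infty}^{+\infty} i\mathbf{k}\,\phi_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3k}{(2\pi)^3}.$$
Comparing with (51.4), we obtain
$$\mathbf{E}_{\mathbf{k}} = -i\mathbf{k}\phi_{\mathbf{k}} = -i\,\frac{4\pi e\,\mathbf{k}}{k^2}. \qquad (51.5)$$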
From this we see that the field of the waves, into which we have resolved the Coulomb field,
is directed along the wave vector. Therefore these waves can be said to be longitudinal.
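As a numerical cross-check of (51.3), one can carry out the angular integration in (51.2) by hand, which leaves $\phi(r) = (2e/\pi)\int_0^\infty \frac{\sin kr}{kr}\,dk$, and evaluate the remaining radial integral. The sketch below is an illustration under stated assumptions: it inserts a convergence factor $e^{-\epsilon k}$ (not part of the text) so that the integral can be truncated, and the names `coulomb_from_fourier`, `eps`, `k_max` and `steps` are hypothetical.

```python
import math

def coulomb_from_fourier(r, e=1.0, eps=0.05, k_max=400.0, steps=40000):
    """Damped radial Fourier integral of phi_k = 4*pi*e/k**2 by Simpson's rule.

    Approximates (2e/pi) * integral_0^inf sin(k r)/(k r) * exp(-eps k) dk,
    which tends to the Coulomb potential e/r as eps -> 0.
    """
    h = k_max / steps  # steps must be even for Simpson's rule

    def integrand(k):
        sinc = 1.0 if k == 0.0 else math.sin(k * r) / (k * r)
        return sinc * math.exp(-eps * k)

    total = integrand(0.0) + integrand(k_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (2.0 * e / math.pi) * total * h / 3.0

r = 2.0
numeric = coulomb_from_fourier(r)
# Closed form of the damped integral: (2e/pi) * atan(r/eps) / r.
damped_exact = (2.0 / math.pi) * math.atan(r / 0.05) / r
coulomb = 1.0 / r
```

The numerical value should agree with the closed form of the damped integral, and both approach $e/r$ as the damping parameter is reduced.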
§ 52. Characteristic vibrations of the field
We consider an electromagnetic field (in the absence of charges) in some finite volume of
space. To simplify further calculations we assume that this volume has the form of a
rectangular parallelepiped with sides $A$, $B$, $C$, respectively. Then we can expand all quantities
characterizing the field in this parallelepiped in a triple Fourier series (for the three
coordinates). This expansion can be written (e.g. for the vector potential) in the form:
$$\mathbf{A} = \sum_{\mathbf{k}} \left(\mathbf{a}_{\mathbf{k}} e^{i\mathbf{k}\cdot\mathbf{r}} + \mathbf{a}_{\mathbf{k}}^* e^{-i\mathbf{k}\cdot\mathbf{r}}\right), \qquad (52.1)$$
explicitly indicating that $\mathbf{A}$ is real. The summation extends here over all possible values of
the vector $\mathbf{k}$ whose components run through the values
$$k_x = \frac{2\pi n_x}{A}, \qquad k_y = \frac{2\pi n_y}{B}, \qquad k_z = \frac{2\pi n_z}{C}, \qquad (52.2)$$
where $n_x$, $n_y$, $n_z$ are positive and negative integers. From the equation $\operatorname{div}\mathbf{A} = 0$ it follows
that for each $\mathbf{k}$,
$$\mathbf{k}\cdot\mathbf{a}_{\mathbf{k}} = 0, \qquad (52.3)$$
i.e., the complex vectors $\mathbf{a}_{\mathbf{k}}$ are "perpendicular" to the corresponding wave vectors $\mathbf{k}$. The
vectors $\mathbf{a}_{\mathbf{k}}$ are, of course, functions of the time; they satisfy the equation
$$\ddot{\mathbf{a}}_{\mathbf{k}} + c^2k^2\,\mathbf{a}_{\mathbf{k}} = 0. \qquad (52.4)$$
If the dimensions $A$, $B$, $C$ of the volume are sufficiently large, then neighboring values
of $k_x$, $k_y$, $k_z$ (for which $n_x$, $n_y$, $n_z$ differ by unity) are very close to one another. In this case
we may speak of the number of possible values of $k_x$, $k_y$, $k_z$ in the small intervals $\Delta k_x$, $\Delta k_y$,
$\Delta k_z$.
Since to neighboring values of, say, $k_x$, there correspond values of $n_x$ differing by unity,
the number $\Delta n_x$ of possible values of $k_x$ in the interval $\Delta k_x$ is equal simply to the number of
values of $n_x$ in the corresponding interval. Thus, we obtain
$$\Delta n_x = \frac{A}{2\pi}\Delta k_x, \qquad \Delta n_y = \frac{B}{2\pi}\Delta k_y, \qquad \Delta n_z = \frac{C}{2\pi}\Delta k_z.$$
The total number $\Delta n$ of possible values of the vector $\mathbf{k}$ with components in the intervals
$\Delta k_x$, $\Delta k_y$, $\Delta k_z$ is equal to the product $\Delta n_x\,\Delta n_y\,\Delta n_z$, that is,
$$\Delta n = \frac{V}{(2\pi)^3}\,\Delta k_x\,\Delta k_y\,\Delta k_z, \qquad (52.5)$$
where $V = ABC$ is the volume of the field.
It is easy to determine from this the number of possible values of the wave vector having
absolute values in the interval $\Delta k$, and directed into the element of solid angle $\Delta o$. To get
this we need only transform to polar coordinates in the "$\mathbf{k}$ space" and write in place of
$\Delta k_x\,\Delta k_y\,\Delta k_z$ the element of volume in these coordinates. Thus
$$\Delta n = \frac{V}{(2\pi)^3}\,k^2\,\Delta k\,\Delta o. \qquad (52.6)$$
Finally, the number of possible values of the wave vector with absolute value $k$ in the interval
$\Delta k$ and pointing in all directions is (we write $4\pi$ in place of $\Delta o$)
$$\Delta n = \frac{V}{2\pi^2}\,k^2\,\Delta k. \qquad (52.7)$$
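The counting formula can be compared with a direct enumeration of the allowed wave vectors (52.2). The sketch below counts all $\mathbf{k}$ with $|\mathbf{k}| \le k_{max}$ and compares the result with the integral of (52.7) from $0$ to $k_{max}$, namely $Vk_{max}^3/6\pi^2$. The box dimensions and the cutoff are arbitrary illustrative values, the function names are assumptions, and only wave vectors are counted, not polarizations.

```python
import math

def count_wave_vectors(A, B, C, k_max):
    """Count vectors k = 2*pi*(nx/A, ny/B, nz/C) with |k| <= k_max."""
    two_pi = 2.0 * math.pi
    limits = [int(k_max * side / two_pi) for side in (A, B, C)]
    count = 0
    for nx in range(-limits[0], limits[0] + 1):
        for ny in range(-limits[1], limits[1] + 1):
            for nz in range(-limits[2], limits[2] + 1):
                k2 = ((two_pi * nx / A) ** 2 + (two_pi * ny / B) ** 2
                      + (two_pi * nz / C) ** 2)
                if k2 <= k_max ** 2:
                    count += 1
    return count

def estimate_wave_vectors(A, B, C, k_max):
    """Integral of (52.7) from 0 to k_max: V * k_max**3 / (6 * pi**2)."""
    return A * B * C * k_max ** 3 / (6.0 * math.pi ** 2)

exact = count_wave_vectors(10.0, 12.0, 15.0, 10.0)
approx = estimate_wave_vectors(10.0, 12.0, 15.0, 10.0)
relative_error = abs(exact - approx) / approx
```

The agreement improves as the box grows, in line with the assumption that neighboring values of $\mathbf{k}$ lie close together.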
The vectors $\mathbf{a}_{\mathbf{k}}$ as functions of the time behave like simply periodic functions with
frequencies $\omega_k = ck$ (see 52.4). We present the expansion of the field in such a form that it
appears as an expansion in propagating plane waves. To do this we assume that each of the
$\mathbf{a}_{\mathbf{k}}$ depends on the time through the factor $e^{-i\omega_k t}$:
$$\mathbf{a}_{\mathbf{k}} \sim e^{-i\omega_k t}, \qquad \omega_k = ck. \qquad (52.8)$$
Then each individual term in the sum (52.1) is a function only of the difference $\mathbf{k}\cdot\mathbf{r} - \omega_k t$,
which corresponds to a wave propagating in the direction of the vector $\mathbf{k}$.
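We calculate the total energy
$$\mathscr{E} = \frac{1}{8\pi}\int (E^2 + H^2)\,dV$$
of the field in the volume $V$, expressing it in terms of the quantities $\mathbf{a}_{\mathbf{k}}$. For the electric field
we have
$$\mathbf{E} = -\frac{1}{c}\dot{\mathbf{A}} = -\frac{1}{c}\sum_{\mathbf{k}}\left(\dot{\mathbf{a}}_{\mathbf{k}} e^{i\mathbf{k}\cdot\mathbf{r}} + \dot{\mathbf{a}}_{\mathbf{k}}^* e^{-i\mathbf{k}\cdot\mathbf{r}}\right),$$
or, keeping in mind (52.8),
$$\mathbf{E} = i\sum_{\mathbf{k}} k\left(\mathbf{a}_{\mathbf{k}} e^{i\mathbf{k}\cdot\mathbf{r}} - \mathbf{a}_{\mathbf{k}}^* e^{-i\mathbf{k}\cdot\mathbf{r}}\right). \qquad (52.9)$$
For the magnetic field $\mathbf{H} = \operatorname{curl}\mathbf{A}$, we obtain
$$\mathbf{H} = i\sum_{\mathbf{k}}\left(\mathbf{k}\times\mathbf{a}_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}} - \mathbf{k}\times\mathbf{a}_{\mathbf{k}}^*\,e^{-i\mathbf{k}\cdot\mathbf{r}}\right). \qquad (52.10)$$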
When calculating the squares of these sums, we must keep in mind that all products of
terms with wave vectors $\mathbf{k} \neq \mathbf{k}'$ give zero on integration over the whole volume. In fact,
such terms contain factors of the form $e^{\pm i\mathbf{q}\cdot\mathbf{r}}$, $\mathbf{q} = \mathbf{k} \pm \mathbf{k}'$, and the integral, e.g. of
$$\int_0^A e^{i\frac{2\pi}{A} n_x x}\,dx,$$
with integer $n_x$ different from zero, gives zero. For the same reason, products containing the
factors $e^{\pm 2i\mathbf{k}\cdot\mathbf{r}}$ vanish. In those terms from which the exponentials drop out, integration
over $dV$ gives just the volume $V$.
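The vanishing of these integrals is easy to confirm numerically. The following sketch evaluates $\int_0^A e^{2\pi i n x/A}\,dx$ with a midpoint rule. The function name `box_integral` and the number of sample points are assumptions made for the illustration.

```python
import cmath
import math

def box_integral(n, A, steps=1000):
    """Midpoint-rule estimate of the integral of exp(2*pi*i*n*x/A) over 0..A."""
    h = A / steps
    total = sum(cmath.exp(2j * math.pi * n * (j + 0.5) * h / A) for j in range(steps))
    return total * h

vanishing = box_integral(3, 2.5)    # integer n != 0: the integral is zero
full_length = box_integral(0, 2.5)  # n = 0: the integral is the side length A
```

Only the term with $n_x = 0$ survives, which is what turns the double sum over $\mathbf{k}$, $\mathbf{k}'$ in the energy into a single sum.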
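As a result, we obtain
$$\mathscr{E} = \frac{V}{4\pi}\sum_{\mathbf{k}}\left\{k^2\,\mathbf{a}_{\mathbf{k}}\cdot\mathbf{a}_{\mathbf{k}}^* + (\mathbf{k}\times\mathbf{a}_{\mathbf{k}})\cdot(\mathbf{k}\times\mathbf{a}_{\mathbf{k}}^*)\right\}.$$
Since $\mathbf{a}_{\mathbf{k}}\cdot\mathbf{k} = 0$, we have
$$(\mathbf{k}\times\mathbf{a}_{\mathbf{k}})\cdot(\mathbf{k}\times\mathbf{a}_{\mathbf{k}}^*) = k^2\,\mathbf{a}_{\mathbf{k}}\cdot\mathbf{a}_{\mathbf{k}}^*,$$
and we obtain finally
$$\mathscr{E} = \sum_{\mathbf{k}}\mathscr{E}_{\mathbf{k}}, \qquad \mathscr{E}_{\mathbf{k}} = \frac{k^2V}{2\pi}\,\mathbf{a}_{\mathbf{k}}\cdot\mathbf{a}_{\mathbf{k}}^*. \qquad (52.11)$$
Thus the total energy of the field is expressed as a sum of the energies $\mathscr{E}_{\mathbf{k}}$, associated with
each of the plane waves individually.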
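In a completely analogous fashion, we can calculate the total momentum of the field,
$$\frac{1}{c^2}\int\mathbf{S}\,dV = \frac{1}{4\pi c}\int\mathbf{E}\times\mathbf{H}\,dV,$$
for which we obtain
$$\sum_{\mathbf{k}}\frac{\mathbf{k}}{k}\,\frac{\mathscr{E}_{\mathbf{k}}}{c}. \qquad (52.12)$$
This result could have been anticipated in view of the relation between the energy and
momentum of a plane wave (see § 47).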
The expansion succeeds in expressing the field in terms of a series of discrete parameters
(the vectors a k ), in place of the description in terms of a continuous series of parameters,
which is essentially what is done when we give the potential A(x, y,z,t) at all points of space.
We now make a transformation of the variables a k , which has the result that the equations
of the field take on a form similar to the canonical equations (Hamilton equations) of
mechanics.
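We introduce the real "canonical variables" $\mathbf{Q}_{\mathbf{k}}$ and $\mathbf{P}_{\mathbf{k}}$ according to the relations
$$\mathbf{Q}_{\mathbf{k}} = \sqrt{\frac{V}{4\pi c^2}}\,(\mathbf{a}_{\mathbf{k}} + \mathbf{a}_{\mathbf{k}}^*), \qquad \mathbf{P}_{\mathbf{k}} = -i\omega_k\sqrt{\frac{V}{4\pi c^2}}\,(\mathbf{a}_{\mathbf{k}} - \mathbf{a}_{\mathbf{k}}^*) = \dot{\mathbf{Q}}_{\mathbf{k}}. \qquad (52.13)$$
The Hamiltonian of the field is obtained by substituting these expressions in the energy
(52.11):
$$\mathscr{H} = \sum_{\mathbf{k}}\mathscr{H}_{\mathbf{k}} = \sum_{\mathbf{k}}\tfrac{1}{2}\left(\mathbf{P}_{\mathbf{k}}^2 + \omega_k^2\,\mathbf{Q}_{\mathbf{k}}^2\right). \qquad (52.14)$$
Then the Hamilton equations $\partial\mathscr{H}/\partial\mathbf{P}_{\mathbf{k}} = \dot{\mathbf{Q}}_{\mathbf{k}}$ coincide with $\mathbf{P}_{\mathbf{k}} = \dot{\mathbf{Q}}_{\mathbf{k}}$, which is thus a
consequence of the equations of motion. (This was achieved by an appropriate choice of the
coefficient in (52.13).) The equations of motion, $\partial\mathscr{H}/\partial\mathbf{Q}_{\mathbf{k}} = -\dot{\mathbf{P}}_{\mathbf{k}}$, become the equations
$$\ddot{\mathbf{Q}}_{\mathbf{k}} + \omega_k^2\,\mathbf{Q}_{\mathbf{k}} = 0, \qquad (52.15)$$
that is, they are identical with the equations of the field.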
Each of the vectors $\mathbf{Q}_{\mathbf{k}}$ and $\mathbf{P}_{\mathbf{k}}$ is perpendicular to the wave vector $\mathbf{k}$, i.e. has two
independent components. The direction of these vectors determines the direction of
polarization of the corresponding travelling wave. Denoting the two components of the vector $\mathbf{Q}_{\mathbf{k}}$
(in the plane perpendicular to $\mathbf{k}$) by $Q_{\mathbf{k}j}$, $j = 1, 2$, we have
$$\mathbf{Q}_{\mathbf{k}}^2 = \sum_j Q_{\mathbf{k}j}^2,$$
and similarly for $\mathbf{P}_{\mathbf{k}}$. Then
$$\mathscr{H} = \sum_{\mathbf{k}j}\mathscr{H}_{\mathbf{k}j}, \qquad \mathscr{H}_{\mathbf{k}j} = \tfrac{1}{2}\left(P_{\mathbf{k}j}^2 + \omega_k^2\,Q_{\mathbf{k}j}^2\right). \qquad (52.16)$$
We see that the Hamiltonian splits into a sum of independent terms $\mathscr{H}_{\mathbf{k}j}$, each of which
contains only one pair of the quantities $Q_{\mathbf{k}j}$, $P_{\mathbf{k}j}$. Each such term corresponds to a traveling
wave with a definite wave vector and polarization. The quantity $\mathscr{H}_{\mathbf{k}j}$ has the form of the
Hamiltonian of a one-dimensional "oscillator", performing a simple harmonic vibration.
For this reason, one sometimes refers to this result as the expansion of the field in terms of
oscillators.
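We give the formulas which express the field explicitly in terms of the variables $\mathbf{P}_{\mathbf{k}}$, $\mathbf{Q}_{\mathbf{k}}$.
From (52.13), we have
$$\mathbf{a}_{\mathbf{k}} = \frac{i}{k}\sqrt{\frac{\pi}{V}}\,(\mathbf{P}_{\mathbf{k}} - i\omega_k\mathbf{Q}_{\mathbf{k}}), \qquad \mathbf{a}_{\mathbf{k}}^* = -\frac{i}{k}\sqrt{\frac{\pi}{V}}\,(\mathbf{P}_{\mathbf{k}} + i\omega_k\mathbf{Q}_{\mathbf{k}}). \qquad (52.17)$$
Substituting these expressions in (52.1), we obtain for the vector potential of the field:
$$\mathbf{A} = 2\sqrt{\frac{\pi}{V}}\sum_{\mathbf{k}}\frac{1}{k}\left(ck\,\mathbf{Q}_{\mathbf{k}}\cos\mathbf{k}\cdot\mathbf{r} - \mathbf{P}_{\mathbf{k}}\sin\mathbf{k}\cdot\mathbf{r}\right). \qquad (52.18)$$
For the electric and magnetic fields, we find
$$\mathbf{E} = -2\sqrt{\frac{\pi}{V}}\sum_{\mathbf{k}}\left(ck\,\mathbf{Q}_{\mathbf{k}}\sin\mathbf{k}\cdot\mathbf{r} + \mathbf{P}_{\mathbf{k}}\cos\mathbf{k}\cdot\mathbf{r}\right),$$
$$\mathbf{H} = -2\sqrt{\frac{\pi}{V}}\sum_{\mathbf{k}}\frac{1}{k}\left\{ck\,(\mathbf{k}\times\mathbf{Q}_{\mathbf{k}})\sin\mathbf{k}\cdot\mathbf{r} + (\mathbf{k}\times\mathbf{P}_{\mathbf{k}})\cos\mathbf{k}\cdot\mathbf{r}\right\}. \qquad (52.19)$$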
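The change of variables (52.13) and its inverse (52.17) can be verified for a single polarization component. The sketch below treats one complex amplitude `a` (one component of $\mathbf{a}_{\mathbf{k}}$) and checks that the oscillator energy (52.14) reproduces the mode energy (52.11). All function names and numerical values are illustrative assumptions.

```python
import math

def to_canonical(a, k, V, c=1.0):
    """(52.13) for one component: real Q and P from the complex amplitude a."""
    norm = math.sqrt(V / (4.0 * math.pi * c ** 2))
    omega = c * k
    Q = norm * 2.0 * a.real          # norm * (a + a*)
    P = omega * norm * 2.0 * a.imag  # -i * omega * norm * (a - a*)
    return Q, P

def from_canonical(Q, P, k, V, c=1.0):
    """(52.17): a = (i/k) * sqrt(pi/V) * (P - i*omega*Q)."""
    omega = c * k
    return (1j / k) * math.sqrt(math.pi / V) * (P - 1j * omega * Q)

def mode_energy(a, k, V):
    """(52.11) for one component: k**2 * V * |a|**2 / (2*pi)."""
    return k ** 2 * V * abs(a) ** 2 / (2.0 * math.pi)

def oscillator_energy(Q, P, k, c=1.0):
    """(52.14) for one component: (P**2 + omega**2 * Q**2) / 2."""
    return 0.5 * (P ** 2 + (c * k) ** 2 * Q ** 2)

a, k, V, c = 0.3 - 0.7j, 2.0, 5.0, 3.0
Q, P = to_canonical(a, k, V, c)
a_back = from_canonical(Q, P, k, V, c)
```

Here `a_back` should coincide with `a`, and `oscillator_energy(Q, P, k, c)` with `mode_energy(a, k, V)`, up to rounding.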
CHAPTER 7
THE PROPAGATION OF LIGHT
§ 53. Geometrical optics
A plane wave is characterized by the property that its direction of propagation and
amplitude are the same everywhere. Arbitrary electromagnetic waves, of course, do not
have this property. Nevertheless, a great many electromagnetic waves, which are not plane,
have the property that within each small region of space they can be considered to be plane.
For this, it is clearly necessary that the amplitude and direction of the wave remain practically
constant over distances of the order of the wavelength. If this condition is satisfied, we can
introduce the so-called wave surface, i.e. a surface at all of whose points the phase of the
wave is the same (at a given time). (The wave surfaces of a plane wave are obviously
planes perpendicular to the direction of propagation of the wave.) In each small region of
space we can speak of a direction of propagation of the wave, normal to the wave surface.
In this way we can introduce the concept of rays — curves whose tangents at each point
coincide with the direction of propagation of the wave.
The study of the laws of propagation of waves in this case constitutes the domain of
geometrical optics. Consequently, geometrical optics considers the propagation of waves,
in particular of light, as the propagation of rays, completely divorced from their wave
properties. In other words, geometrical optics corresponds to the limiting case of small
wavelength, $\lambda \to 0$.
We now take up the derivation of the fundamental equation of geometrical optics, the
equation determining the direction of the rays. Let $f$ be any quantity describing the field
of the wave (any component of $\mathbf{E}$ or $\mathbf{H}$). For a plane monochromatic wave, $f$ has the form
$$f = a\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t + \alpha)} = a\,e^{i(-k_i x^i + \alpha)} \qquad (53.1)$$
(we omit the Re; it is understood that we take the real part of all expressions).
We write the expression for the field in the form
$$f = a\,e^{i\psi}. \qquad (53.2)$$
In case the wave is not plane, but geometrical optics is applicable, the amplitude $a$ is,
generally speaking, a function of the coordinates and time, and the phase $\psi$, which is called
the eikonal, does not have a simple form, as in (53.1). It is essential, however, that $\psi$ be a
large quantity. This is clear immediately from the fact that it changes by $2\pi$ when we move
through one wavelength, and geometrical optics corresponds to the limit $\lambda \to 0$.
Over small space regions and time intervals the eikonal $\psi$ can be expanded in series; to
terms of first order, we have
$$\psi = \psi_0 + \mathbf{r}\cdot\frac{\partial\psi}{\partial\mathbf{r}} + t\,\frac{\partial\psi}{\partial t}$$
(the origin for coordinates and time has been chosen within the space region and time
interval under consideration; the derivatives are evaluated at the origin). Comparing this
expression with (53.1), we can write
$$\mathbf{k} = \frac{\partial\psi}{\partial\mathbf{r}} \equiv \operatorname{grad}\psi, \qquad \omega = -\frac{\partial\psi}{\partial t}, \qquad (53.3)$$
which corresponds to the fact that in each small region of space (and each small interval of
time) the wave can be considered as plane. In four-dimensional form, the relation (53.3) is
expressed as
$$k_i = -\frac{\partial\psi}{\partial x^i}, \qquad (53.4)$$
where $k_i$ is the wave four-vector.
We saw in § 48 that the components of the four-vector $k^i$ are related by $k_i k^i = 0$.
Substituting (53.4), we obtain the equation
$$\frac{\partial\psi}{\partial x_i}\,\frac{\partial\psi}{\partial x^i} = 0. \qquad (53.5)$$
This equation, the eikonal equation, is the fundamental equation of geometrical optics.
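The eikonal equation can also be derived by direct transition to the limit $\lambda \to 0$ in the
wave equation. The field $f$ satisfies the wave equation
$$\frac{\partial^2 f}{\partial x_i\,\partial x^i} = 0.$$
Substituting $f = a\,e^{i\psi}$, we obtain
$$\frac{\partial^2 a}{\partial x_i\,\partial x^i}\,e^{i\psi} + 2i\,\frac{\partial a}{\partial x_i}\frac{\partial\psi}{\partial x^i}\,e^{i\psi} + if\,\frac{\partial^2\psi}{\partial x_i\,\partial x^i} - \frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x^i}\,f = 0.$$
But the eikonal $\psi$, as we pointed out above, is a large quantity; therefore we can neglect
the first three terms compared with the fourth, and we arrive once more at equation (53.5).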
We shall give certain relations which, in their application to the propagation of light in
vacuum, lead only to completely obvious results. Nevertheless, they are important because,
in their general form, these derivations apply also to the propagation of light in material
media.
From the form of the eikonal equation there results a remarkable analogy between
geometrical optics and the mechanics of material particles. The motion of a material
particle is determined by the Hamilton-Jacobi equation (16.11). This equation, like the
eikonal equation, is an equation in the first partial derivatives and is of second degree. As
we know, the action $S$ is related to the momentum $\mathbf{p}$ and the Hamiltonian $\mathscr{H}$ of the particle
by the relations
$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}}, \qquad \mathscr{H} = -\frac{\partial S}{\partial t}.$$
Comparing these formulas with the formulas (53.3), we see that the wave vector plays the
same role in geometrical optics as the momentum of the particle in mechanics, while the
frequency plays the role of the Hamiltonian, i.e., the energy of the particle. The absolute
magnitude $k$ of the wave vector is related to the frequency by the formula $k = \omega/c$. This
relation is analogous to the relation $p = \mathscr{E}/c$ between the momentum and energy of a particle
with zero mass and velocity equal to the velocity of light.
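For a particle, we have the Hamilton equations
$$\dot{\mathbf{p}} = -\frac{\partial\mathscr{H}}{\partial\mathbf{r}}, \qquad \mathbf{v} = \dot{\mathbf{r}} = \frac{\partial\mathscr{H}}{\partial\mathbf{p}}.$$
In view of the analogy we have pointed out, we can immediately write the corresponding
equations for rays:
$$\dot{\mathbf{k}} = -\frac{\partial\omega}{\partial\mathbf{r}}, \qquad \dot{\mathbf{r}} = \frac{\partial\omega}{\partial\mathbf{k}}.$$
In vacuum, $\omega = ck$, so that $\dot{\mathbf{k}} = 0$, $\mathbf{v} = c\mathbf{n}$ ($\mathbf{n}$ is a unit vector along the direction of
propagation); in other words, as it must be, in vacuum the rays are straight lines, along which the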
light travels with velocity c.
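The ray equations can be integrated numerically for any given dependence of the frequency on position and wave vector. The sketch below is an illustration under stated assumptions: it takes a user-supplied function `omega(r, k)`, differentiates it by central differences, and advances the ray with simple Euler steps. Units with $c = 1$, the step sizes and all names are choices made for this example. For the vacuum relation $\omega = ck$ the ray comes out as a straight line traversed with velocity $c$.

```python
import math

C_LIGHT = 1.0  # units with c = 1 (an assumption of this sketch)

def omega_vacuum(r, k):
    """Vacuum dispersion relation omega = c |k| (independent of r)."""
    return C_LIGHT * math.sqrt(k[0] ** 2 + k[1] ** 2 + k[2] ** 2)

def partial(func, vec, i, step=1e-6):
    """Central-difference derivative of func with respect to vec[i]."""
    up, down = list(vec), list(vec)
    up[i] += step
    down[i] -= step
    return (func(up) - func(down)) / (2.0 * step)

def trace_ray(omega, r, k, dt, steps):
    """Euler steps of dr/dt = d(omega)/dk, dk/dt = -d(omega)/dr."""
    r, k = list(r), list(k)
    for _ in range(steps):
        velocity = [partial(lambda kk: omega(r, kk), k, i) for i in range(3)]
        k_rate = [-partial(lambda rr: omega(rr, k), r, i) for i in range(3)]
        r = [r[i] + dt * velocity[i] for i in range(3)]
        k = [k[i] + dt * k_rate[i] for i in range(3)]
    return r, k

# After unit time the ray has moved a unit distance along k/|k| = (0.6, 0.8, 0).
r_end, k_end = trace_ray(omega_vacuum, (0.0, 0.0, 0.0), (3.0, 4.0, 0.0), 0.01, 100)
```

A position-dependent `omega` could be passed in the same way, but the Euler scheme used here is only first-order accurate and is meant as a demonstration rather than a production integrator.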
The analogy between the wave vector of a wave and the momentum of a particle is made
especially clear by the following consideration. Let us consider a wave which is a
superposition of monochromatic waves with frequencies in a certain small interval and occupying
some finite region in space (this is called a wave packet). We calculate the four-momentum of
the field of this wave, using formula (32.6) with the energy-momentum tensor (48.15) (for
each monochromatic component). Replacing $k^i$ in this formula by some average value,
we obtain an expression of the form
$$P^i = Ak^i, \qquad (53.8)$$
where the coefficient of proportionality $A$ between the two four-vectors $P^i$ and $k^i$ is some
scalar. In three-dimensional form this relation gives:
$$\mathbf{P} = A\mathbf{k}, \qquad \mathscr{E} = A\omega. \qquad (53.9)$$
Thus we see that the momentum and energy of a wave packet transform, when we go from
one reference system to another, like the wave vector and the frequency.
Pursuing the analogy, we can establish for geometrical optics a principle analogous to the
principle of least action in mechanics. However, it cannot be written in Hamiltonian form as
$\delta\int L\,dt = 0$, since it turns out to be impossible to introduce, for rays, a function analogous
to the Lagrangian of a particle. Since the Lagrangian of a particle is related to the
Hamiltonian $\mathscr{H}$ by the equation $L = \mathbf{p}\cdot\partial\mathscr{H}/\partial\mathbf{p} - \mathscr{H}$, replacing the Hamiltonian $\mathscr{H}$ by the
frequency $\omega$ and the momentum by the wave vector $\mathbf{k}$, we should have to write for the
Lagrangian in optics $\mathbf{k}\cdot\partial\omega/\partial\mathbf{k} - \omega$. But this expression is equal to zero, since $\omega = ck$. The
impossibility of introducing a Lagrangian for rays is also clear directly from the
consideration mentioned earlier that the propagation of rays is analogous to the motion of particles
with zero mass.
If the wave has a definite constant frequency $\omega$, then the time dependence of its field is
given by a factor of the form $e^{-i\omega t}$. Therefore for the eikonal of such a wave we can write
$$\psi = -\omega t + \psi_0(x, y, z), \qquad (53.10)$$
where $\psi_0$ is a function only of the coordinates. The eikonal equation (53.5) now takes the
form
$$(\operatorname{grad}\psi_0)^2 = \frac{\omega^2}{c^2}. \qquad (53.11)$$
The wave surfaces are the surfaces of constant eikonal, i.e. the family of surfaces of the form
$\psi_0(x, y, z) = \text{const}$. The rays themselves are at each point normal to the corresponding
wave surface; their direction is determined by the gradient $\nabla\psi_0$.
As is well known, in the case where the energy is constant, the principle of least action
for particles can also be written in the form of the so-called principle of Maupertuis:
$$\delta S = \delta\int\mathbf{p}\cdot d\mathbf{l} = 0,$$
where the integration extends over the trajectory of the particle between two of its points.
In this expression the momentum is assumed to be a function of the energy and the
coordinate differentials. The analogous principle for rays is called Fermat's principle. In this
case, we can write by analogy:
$$\delta\psi = \delta\int\mathbf{k}\cdot d\mathbf{l} = 0. \qquad (53.12)$$
In vacuum, $\mathbf{k} = (\omega/c)\mathbf{n}$, and we obtain ($d\mathbf{l}\cdot\mathbf{n} = dl$):
$$\delta\int dl = 0, \qquad (53.13)$$
which corresponds to rectilinear propagation of the rays.
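Condition (53.13) can be illustrated with a very small numerical experiment: among two-segment paths joining two fixed points, the one through the undisplaced midpoint (the straight line) is the shortest, and the length changes only in second order for small displacements. The function names below are assumptions made for this sketch.

```python
import math

def path_length(points):
    """Total length of the polygonal path through the given points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def detour_length(start, end, offset):
    """Length of the two-segment path through the midpoint shifted by offset."""
    mid = [(s + e) / 2.0 + o for s, e, o in zip(start, end, offset)]
    return path_length([start, mid, end])

start, end = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
straight = detour_length(start, end, (0.0, 0.0, 0.0))
bent = detour_length(start, end, (0.0, 0.1, 0.0))
# First variation: the change in length per unit of a small sideways shift.
small = 1e-4
first_variation = (detour_length(start, end, (0.0, small, 0.0)) - straight) / small
```

The quantity `first_variation` tends to zero with the displacement, which is the discrete counterpart of $\delta\int dl = 0$ for the straight ray.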
§ 54. Intensity
In geometrical optics, the light wave can be considered as a bundle of rays. The rays
themselves, however, determine only the direction of propagation of the light at each point;
there remains the question of the distribution of the light intensity in space.
On some wave surface of the bundle of rays under consideration, we isolate an
infinitesimal surface element. From differential geometry it is known that every surface has,
at each of its points, two (generally different) principal radii of curvature. Let $ac$ and $bd$
(Fig. 7) be elements of the principal circles of curvature, constructed at a given element of
the wave surface. Then the rays passing through $a$ and $c$ meet at the corresponding center of
curvature $O_1$, while the rays passing through $b$ and $d$ meet at the other center of curvature
$O_2$.
Fig. 7.
For fixed angular openings of the beams starting from $O_1$ and $O_2$, the lengths of the arcs
$ac$ and $bd$ are, clearly, proportional to the corresponding radii of curvature $R_1$ and $R_2$ (i.e.
to the lengths $O_1O$ and $O_2O$). The area of the surface element is proportional to the product
of the lengths $ac$ and $bd$, i.e., proportional to $R_1R_2$. In other words, if we consider the
element of the wave surface bounded by a definite set of rays, then as we move along them
the area of the element will change proportionally to $R_1R_2$.
On the other hand, the intensity, i.e. the energy flux density, is inversely proportional to
the surface area through which a given amount of light energy passes. Thus we arrive at the
result that the intensity is
$$I = \frac{\text{const}}{R_1R_2}. \qquad (54.1)$$
This formula must be understood as follows. On each ray ($AB$ in Fig. 7) there are definite
points $O_1$ and $O_2$, which are the centers of curvature of all the wave surfaces intersecting
the given ray. The distances $OO_1$ and $OO_2$ from the point $O$ where the wave surface
intersects the ray, to the points $O_1$ and $O_2$, are the radii of curvature $R_1$ and $R_2$ of the wave
surface at the point $O$. Thus formula (54.1) determines the change in intensity of the light
along a given ray as a function of the distances from definite points on this ray. We emphasize
that this formula cannot be used to compare intensities at different points on a single wave
surface.
Since the intensity is determined by the square modulus of the field, we can write for the
change of the field itself along the ray
$$f = \frac{\text{const}}{\sqrt{R_1R_2}}\,e^{ikR}, \qquad (54.2)$$
where in the phase factor $e^{ikR}$ we can write either $e^{ikR_1}$ or $e^{ikR_2}$. The quantities $e^{ikR_1}$ and
$e^{ikR_2}$ (for a given ray) differ from each other only by a constant factor, since the difference
$R_1 - R_2$, the distance between the two centers of curvature, is a constant.
If the two radii of curvature of the wave surface coincide, then (54.1) and (54.2) have the
form:
$$I = \frac{\text{const}}{R^2}, \qquad f = \frac{\text{const}}{R}\,e^{ikR}. \qquad (54.3)$$
This happens always when the light is emitted from a point source (the wave surfaces are
then concentric spheres and R is the distance from the light source).
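A short sketch of how (54.1) is used along a single ray: if the wave surface advances a distance `s` along the ray, away from both centers of curvature, each radius grows by `s`, and the intensity changes by the ratio $R_1R_2/[(R_1+s)(R_2+s)]$. The function names and the sign convention for `s` are assumptions of this illustration.

```python
import math

def relative_intensity(R1, R2, s):
    """I(s)/I(0) from (54.1) after advancing s along the ray, away from both centers."""
    return (R1 * R2) / ((R1 + s) * (R2 + s))

def relative_amplitude(R1, R2, s):
    """Modulus of f(s)/f(0) from (54.2): the square root of the intensity ratio."""
    return math.sqrt(relative_intensity(R1, R2, s))

point_source = relative_intensity(1.0, 1.0, 1.0)  # R1 = R2: inverse-square law
line_focus = relative_intensity(2.0, 1e9, 2.0)    # one very large radius: close to 1/R
```

With equal radii this reduces to the inverse-square law (54.3). With one radius much larger than the other, the intensity falls off approximately as the inverse first power of the distance from the nearer center.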
From (54.1) we see that the intensity becomes infinite at the points $R_1 = 0$, $R_2 = 0$, i.e.
at the centers of curvature of the wave surface. Applying this to all the rays in a bundle, we
find that the intensity of the light in the given bundle becomes infinite, generally, on two
surfaces: the geometrical loci of all the centers of curvature of the wave surfaces. These
surfaces are called caustics. In the special case of a beam of rays with spherical wave surfaces,
the two caustics fuse into a single point (focus).
We note from well-known results of differential geometry concerning the properties of the
loci of centers of curvature of a family of surfaces, that the rays are tangent to the caustic.
It is necessary to keep in mind that (for convex wave surfaces) the centers of curvature of
the wave surfaces can turn out to lie not on the rays themselves, but on their extensions
beyond the optical system from which they emerge. In such cases we speak of imaginary
caustics (or foci). In this case the intensity of the light does not become infinite anywhere.
As for the increase of intensity to infinity, in actuality we must understand that the
intensity does become large at points on the caustic, but it remains finite (see the problem
in § 59). The formal increase to infinity means that the approximation of geometrical optics
is never applicable in the neighborhood of the caustic. To this is related the fact that the
change in phase along the ray can be determined from formula (54.2) only over sections of
the ray which do not include its point of tangency to the caustic. Later (in § 59), we shall
show that actually in passing through the caustic the phase of the field decreases by $\pi/2$.
This means that if, on the section of the ray before its first intersection with the caustic, the
field is proportional to the factor $e^{ikx}$ ($x$ is the coordinate along the ray), then after passage
through the caustic the field will be proportional to $e^{i(kx - \pi/2)}$. The same thing occurs in the
neighborhood of the point of tangency to the second caustic, and beyond that point the field
is proportional to $e^{i(kx - \pi)}$.†
§ 55. The angular eikonal
A light ray traveling in vacuum and impinging on a transparent body will, on its emergence
from this body, generally have a direction different from its initial direction. This change in
direction will, of course, depend on the specific properties of the body and on its form.
However, it turns out that one can derive general laws relating to the change in direction of a
light ray on passage through an arbitrary material body. In this it is assumed only that
geometrical optics is applicable to rays propagating in the interior of the body under
consideration. As is customary, we shall call such transparent bodies, through which rays of
light propagate, optical systems.
Because of the analogy mentioned in § 53, between the propagation of rays and the motion
of particles, the same general laws are valid for the change in direction of motion of a particle,
initially moving in a straight line in vacuum, then passing through some electromagnetic
field, and once more emerging into vacuum. For definiteness, we shall, however, always
speak later of the propagation of light rays.
We saw in a previous section that the eikonal equation, describing the propagation of the
rays, can be written in the form (53.11) (for light of a definite frequency). From now on we
shall, for convenience, designate by $\psi$ the eikonal $\psi_0$ divided by the constant $\omega/c$. Then the
basic equation of geometrical optics has the form:
$$(\nabla\psi)^2 = 1. \qquad (55.1)$$
Each solution of this equation describes a definite beam of rays, in which the direction
of the rays passing through a given point in space is determined by the gradient of $\psi$ at that
point. However, for our purposes this description is insufficient, since we are seeking general
relations determining the passage through an optical system not of a single definite bundle of
rays, but of arbitrary rays. Therefore we must use an eikonal expressed in such a form
that it describes all the generally possible rays of light, i.e. rays passing through any pair of
points in space. In its usual form the eikonal $\psi(\mathbf{r})$ is the phase of the rays in a certain bundle
passing through the point $\mathbf{r}$. Now we must introduce the eikonal as a function $\psi(\mathbf{r}, \mathbf{r}')$ of the
coordinates of two points ($\mathbf{r}$, $\mathbf{r}'$ are the radius vectors of the initial and end points of the ray).
A ray can pass through each pair of points $\mathbf{r}$, $\mathbf{r}'$, and $\psi(\mathbf{r}, \mathbf{r}')$ is the phase difference (or, as it
is called, the optical path length) of this ray between the points $\mathbf{r}$ and $\mathbf{r}'$. From now on we
shall always understand by $\mathbf{r}$ and $\mathbf{r}'$ the radius vectors to points on the ray before and after
its passage through the optical system.
† Although formula (54.2) itself is not valid near the caustic, the change in phase of the field corresponds
formally to a change in sign (i.e. multiplication by $e^{i\pi}$) of $R_1$ or $R_2$ in this formula.
If in $\psi(\mathbf{r}, \mathbf{r}')$ one of the radius vectors, say $\mathbf{r}'$, is fixed, then $\psi$ as a function of $\mathbf{r}$ describes
a definite bundle of rays, namely, the bundle of rays passing through the point $\mathbf{r}'$. Then $\psi$
must satisfy equation (55.1), where the differentiations are applied to the components of $\mathbf{r}$.
Similarly, if $\mathbf{r}$ is assumed fixed, we again obtain an equation for $\psi(\mathbf{r}, \mathbf{r}')$, so that
$$(\nabla_{\mathbf{r}}\psi)^2 = 1, \qquad (\nabla_{\mathbf{r}'}\psi)^2 = 1. \qquad (55.2)$$
The direction of the ray is determined by the gradient of its phase. Since $\psi(\mathbf{r}, \mathbf{r}')$ is the
difference in phase at the points $\mathbf{r}$ and $\mathbf{r}'$, the direction of the ray at the point $\mathbf{r}'$ is given by
the vector $\mathbf{n}' = \partial\psi/\partial\mathbf{r}'$, and at the point $\mathbf{r}$ by the vector $\mathbf{n} = -\partial\psi/\partial\mathbf{r}$. From (55.2) it is clear
that $\mathbf{n}$ and $\mathbf{n}'$ are unit vectors:
$$\mathbf{n}^2 = \mathbf{n}'^2 = 1. \qquad (55.3)$$
The four vectors $\mathbf{r}$, $\mathbf{r}'$, $\mathbf{n}$, $\mathbf{n}'$ are interrelated, since two of them ($\mathbf{n}$, $\mathbf{n}'$) are derivatives of a
certain function $\psi$ with respect to the other two ($\mathbf{r}$, $\mathbf{r}'$). The function $\psi$ itself satisfies the
auxiliary conditions (55.2).
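To obtain the relation between $\mathbf{n}$, $\mathbf{n}'$, $\mathbf{r}$, $\mathbf{r}'$, it is convenient to introduce, in place of $\psi$,
another quantity, on which no auxiliary condition is imposed (i.e., is not required to satisfy
any differential equations). This can be done as follows. In the function $\psi$ the independent
variables are $\mathbf{r}$ and $\mathbf{r}'$, so that for the differential $d\psi$ we have
$$d\psi = \frac{\partial\psi}{\partial\mathbf{r}}\cdot d\mathbf{r} + \frac{\partial\psi}{\partial\mathbf{r}'}\cdot d\mathbf{r}' = -\mathbf{n}\cdot d\mathbf{r} + \mathbf{n}'\cdot d\mathbf{r}'.$$
We now make a Legendre transformation from $\mathbf{r}$, $\mathbf{r}'$ to the new independent variables $\mathbf{n}$,
$\mathbf{n}'$, that is, we write
$$d\psi = -d(\mathbf{n}\cdot\mathbf{r}) + \mathbf{r}\cdot d\mathbf{n} + d(\mathbf{n}'\cdot\mathbf{r}') - \mathbf{r}'\cdot d\mathbf{n}',$$
from which, introducing the function
$$\chi = \mathbf{n}'\cdot\mathbf{r}' - \mathbf{n}\cdot\mathbf{r} - \psi, \qquad (55.4)$$
we have
$$d\chi = -\mathbf{r}\cdot d\mathbf{n} + \mathbf{r}'\cdot d\mathbf{n}'. \qquad (55.5)$$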
The function $\chi$ is called the angular eikonal; as we see from (55.5), the independent
variables in it are $\mathbf{n}$ and $\mathbf{n}'$. No auxiliary conditions are imposed on $\chi$. In fact, equation (55.3)
now states only a condition referring to the independent variables: of the three components
$n_x$, $n_y$, $n_z$, of the vector $\mathbf{n}$ (and similarly for $\mathbf{n}'$), only two are independent. As independent
variables we shall use $n_y$, $n_z$, $n'_y$, $n'_z$; then
$$n_x = \sqrt{1 - n_y^2 - n_z^2}, \qquad n'_x = \sqrt{1 - n_y'^2 - n_z'^2}.$$
Substituting these expressions in
$$d\chi = -x\,dn_x - y\,dn_y - z\,dn_z + x'\,dn'_x + y'\,dn'_y + z'\,dn'_z,$$
we obtain for the differential $d\chi$:
$$d\chi = -\left(y - \frac{n_y}{n_x}x\right)dn_y - \left(z - \frac{n_z}{n_x}x\right)dn_z + \left(y' - \frac{n'_y}{n'_x}x'\right)dn'_y + \left(z' - \frac{n'_z}{n'_x}x'\right)dn'_z.$$
From this we obtain, finally, the following equations:
$$y - \frac{n_y}{n_x}x = -\frac{\partial\chi}{\partial n_y}, \qquad z - \frac{n_z}{n_x}x = -\frac{\partial\chi}{\partial n_z},$$
$$y' - \frac{n'_y}{n'_x}x' = \frac{\partial\chi}{\partial n'_y}, \qquad z' - \frac{n'_z}{n'_x}x' = \frac{\partial\chi}{\partial n'_z}, \qquad (55.6)$$
which is the relation sought between $\mathbf{n}$, $\mathbf{n}'$, $\mathbf{r}$, $\mathbf{r}'$. The function $\chi$ characterizes the special
properties of the body through which the rays pass (or the properties of the field, in the case
of the motion of a charged particle).
For fixed values of $\mathbf{n}$, $\mathbf{n}'$, each of the two pairs of equations (55.6) represents a straight line.
These lines are precisely the rays before and after passage through the optical system. Thus
the equation (55.6) directly determines the path of the ray on the two sides of the optical
system.
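To make the use of (55.6) concrete, the sketch below evaluates the first pair of equations for a given angular eikonal and returns a point of the incident ray. It is only an illustration: the quadratic form of $\chi$ is the small-angle form written later as (56.1), the constants `g`, `f`, `h` are made-up values, the derivatives are taken by central differences, and all function names are assumptions.

```python
import math

def make_quadratic_chi(g, f, h):
    """Quadratic angular eikonal of the form (56.1), with the constant term dropped."""
    def chi(ny, nz, npy, npz):
        return (0.5 * g * (ny * ny + nz * nz)
                + f * (ny * npy + nz * npz)
                + 0.5 * h * (npy * npy + npz * npz))
    return chi

def incident_ray_point(chi, n_yz, n_prime_yz, x, step=1e-6):
    """Point (y, z) of the incident ray at abscissa x, from the first pair of (55.6)."""
    ny, nz = n_yz
    npy, npz = n_prime_yz
    dchi_dny = (chi(ny + step, nz, npy, npz) - chi(ny - step, nz, npy, npz)) / (2.0 * step)
    dchi_dnz = (chi(ny, nz + step, npy, npz) - chi(ny, nz - step, npy, npz)) / (2.0 * step)
    nx = math.sqrt(1.0 - ny * ny - nz * nz)
    # y - (n_y/n_x) x = -d(chi)/d(n_y), and similarly for z.
    return (ny / nx * x - dchi_dny, nz / nx * x - dchi_dnz)

chi = make_quadratic_chi(g=2.0, f=-1.0, h=3.0)  # hypothetical constants
y0, z0 = incident_ray_point(chi, (0.1, 0.0), (0.05, 0.0), x=0.0)
y1, z1 = incident_ray_point(chi, (0.1, 0.0), (0.05, 0.0), x=1.0)
```

The two points `(0, y0, z0)` and `(1, y1, z1)` lie on one straight line with direction $\mathbf{n}$, as the text states for fixed $\mathbf{n}$, $\mathbf{n}'$. The second pair of equations in (55.6) can be treated in the same way for the emerging ray.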
§ 56. Narrow bundles of rays
In studying the passage of beams of rays through optical systems, special interest attaches
to bundles whose rays all pass through one point (such bundles are said to be homocentric).
After passage through an optical system, homocentric bundles in general cease to be
homocentric, i.e. after passing through a body the rays no longer come together in any one
point. Only in exceptional cases will the rays starting from a luminous point come together
after passage through an optical system and all meet at one point (the image of the luminous
point).†
One can show (see § 57) that the only case for which all homocentric bundles remain
strictly homocentric after passage through the optical system is the case of identical imaging,
i.e., the case where the image differs from the object only in its position or orientation, or is
mirror inverted.
Thus no optical system can give a completely sharp image of an object (having finite
dimensions) except in the trivial case of identical imaging.‡ Only approximate, but not
completely sharp images can be produced of an extended body, in any case other than for
identical imaging.
The most important case where there is approximate transition of homocentric bundles
into homocentric bundles is that of sufficiently narrow beams (i.e. beams with a small
opening angle) passing close to a particular line (for a given optical system). This line is
called the optic axis of the system.
Nevertheless, we must note that even infinitely narrow bundles of rays (in the
three-dimensional case) are in general not homocentric; we have seen (Fig. 7) that even in such a
bundle different rays intersect at different points (this phenomenon is called astigmatism).
Exceptions are those points of the wave surface at which the two principal radii of curvature
are equal: a small region of the surface in the neighborhood of such points can be considered
as spherical, and the corresponding narrow bundle of rays is homocentric.
We consider an optical system having axial symmetry.§ The axis of symmetry of the
system is also its optical axis. The wave surface of a bundle of rays traveling along this axis
also has axial symmetry; as we know, surfaces of rotation have equal radii of curvature at
their points of intersection with the symmetry axis. Therefore a narrow bundle moving in this
direction remains homocentric.
† The point of intersection can lie either on the rays themselves or on their continuations; depending on
this, the image is said to be real or virtual.
‡ Such imaging can be produced with a plane mirror.
§ It can be shown that the problem of image formation with the aid of narrow bundles, moving in the
neighborhood of the optical axis in a non-axially-symmetric system, can be reduced to image formation in
an axially-symmetric system plus a subsequent rotation of the image thus obtained, relative to the object.
To obtain general quantitative relations, determining image formation with the aid of
narrow bundles, passing through an axially-symmetric optical system, we use the general
equations (55.6) after determining first of all the form of the function $\chi$ in the case under
consideration.
Since the bundles of rays are narrow and move in the neighborhood of the optical axis,
the vectors $\mathbf{n}$, $\mathbf{n}'$ for each bundle are directed almost along this axis. If we choose the optical
axis as the X axis, then the components $n_y$, $n_z$, $n'_y$, $n'_z$ will be small compared with unity. As
for the components $n_x$, $n'_x$: $n_x \approx 1$, and $n'_x$ can be approximately equal to either +1 or −1.
In the first case the rays continue to travel almost in their original direction, emerging into
the space on the other side of the optical system, which in this case is called a lens. In the
second the rays change their direction to almost the reverse; such an optical system is called
a mirror.
Making use of the smallness of $n_y$, $n_z$, $n'_y$, $n'_z$, we expand the angular eikonal
$\chi(n_y, n_z, n'_y, n'_z)$ in series and stop at the first terms. Because of the axial symmetry of the whole system,
$\chi$ must be invariant with respect to rotations of the coordinate system around the optical
axis. From this it is clear that in the expansion of $\chi$ there can be no terms of first order,
proportional to the first powers of the y and z components of the vectors $\mathbf{n}$ and $\mathbf{n}'$; such
terms would not have the required invariance. The terms of second order which have the
required property are the squares $\mathbf{n}^2$ and $\mathbf{n}'^2$ and the scalar product $\mathbf{n}\cdot\mathbf{n}'$. Thus, to terms of
second order, the angular eikonal of an axially-symmetric optical system has the form
$$\chi = \text{const} + \frac{g}{2}(n_y^2 + n_z^2) + f(n_y n'_y + n_z n'_z) + \frac{h}{2}(n'^2_y + n'^2_z), \tag{56.1}$$
where f, g, h are constants.
For definiteness, we now consider a lens, so that we set $n'_x \approx 1$; for a mirror, as we shall
show later, all the formulas have a similar appearance. Now substituting the expression
(56.1) in the general equations (55.6), we obtain:
$$n_y(x-g) - f n'_y = y, \qquad f n_y + n'_y(x'+h) = y',$$
$$n_z(x-g) - f n'_z = z, \qquad f n_z + n'_z(x'+h) = z'. \tag{56.2}$$
We consider a homocentric bundle emanating from the point x, y, z; let the point x', y', z' be
the point in which all the rays of the bundle intersect after passing through the lens. If the first
and second pairs of equations (56.2) were independent, then these four equations, for given
x, y, z, x', y', z', would determine one definite set of values $n_y$, $n_z$, $n'_y$, $n'_z$, that is, there would
be just one ray starting from the point x, y, z, which would pass through the point x', y', z'.
In order that all rays starting from x, y, z shall pass through x', y', z', it is consequently
necessary that the equations (56.2) not be independent, that is, one pair of these equations
must be a consequence of the other. The necessary condition for this dependence is that the
coefficients in the one pair of equations be proportional to the coefficients of the other pair.
Thus we must have
$$\frac{x-g}{f} = -\frac{f}{x'+h} = \frac{y}{y'} = \frac{z}{z'}. \tag{56.3}$$
In particular,
$$(x-g)(x'+h) = -f^2. \tag{56.4}$$
The equations we have obtained give the required connection between the coordinates of
the image and object for image formation using narrow bundles.
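As a numerical illustration of this argument, the following Python sketch takes the ray equations in the form $n_y(x-g) - f n'_y = y$, $f n_y + n'_y(x'+h) = y'$ and checks that, once $(x-g)(x'+h) = -f^2$, rays leaving one object point with different slopes all reach the same image height $y' = yf/(x-g)$. The function names and the sample values of f, g, h are assumptions made only for this example.

```python
def image_position(x, f, g, h):
    """Image abscissa x' from the condition (x - g)(x' + h) = -f**2."""
    return -h - f**2 / (x - g)


def exit_slope(n_y, x, y, f, g):
    """Solve n_y*(x - g) - f*n'_y = y for the outgoing slope n'_y."""
    return (n_y * (x - g) - y) / f


def height_at(x_prime, n_y, x, y, f, g, h):
    """Height y' of the outgoing ray at x', from f*n_y + n'_y*(x' + h) = y'."""
    return f * n_y + exit_slope(n_y, x, y, f, g) * (x_prime + h)


# Hypothetical constants of the system and an off-axis object point.
f, g, h = 2.0, -1.5, -1.0
x, y = -6.0, 0.01

x_img = image_position(x, f, g, h)
# Several rays of the same homocentric bundle, distinguished by their slope n_y.
heights = [height_at(x_img, n_y, x, y, f, g, h) for n_y in (-0.02, 0.0, 0.01, 0.03)]
print(x_img, heights)
```

All entries of `heights` coincide to rounding error, which is the statement that the bundle is homocentric again at the image point.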
The points x = g and x' = −h on the optical axis are called the principal foci of the optical
system. Let us consider bundles of rays parallel to the optical axis. The source point of such
rays is, clearly, located at infinity on the optical axis, that is, x = ∞. From (56.3) we see that
in this case, x' = −h. Thus a parallel bundle of rays, after passage through the optical
system, intersects at the principal focus. Conversely, a bundle of rays emerging from the
principal focus becomes parallel after passage through the system.
In the equation (56.3) the coordinates x and x' are measured from the same origin of
coordinates, lying on the optical axis. It is, however, more convenient to measure the
coordinates of object and image from different origins, choosing them at the corresponding
principal foci. As positive direction of the coordinates we choose the direction from the
corresponding focus toward the side to which the light travels. Designating the new
coordinates of object and image by capital letters, we have
$$X = x - g, \quad X' = x' + h, \quad Y = y, \quad Y' = y', \quad Z = z, \quad Z' = z'.$$
The equations of image formation (56.3) and (56.4) in the new coordinates take the form
$$XX' = -f^2, \tag{56.5}$$
$$\frac{Y'}{Y} = \frac{Z'}{Z} = \frac{f}{X} = -\frac{X'}{f}. \tag{56.6}$$
The quantity f is called the principal focal length of the system.
The ratio Y'/Y is called the lateral magnification. As for the longitudinal magnification,
since the coordinates are not simply proportional to each other, it must be written in
differential form, comparing the length of an element of the object (along the direction of the
axis) with the length of the corresponding element in the image. From (56.5) we get for the
"longitudinal magnification"
$$\frac{dX'}{dX} = \frac{f^2}{X^2} = \left(\frac{Y'}{Y}\right)^2. \tag{56.7}$$
We see from this that even for an infinitely small object, it is impossible to obtain a
geometrically similar image. The longitudinal magnification is never equal to the transverse
(except in the trivial case of identical imaging).
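The relation between the two magnifications can be checked numerically. The sketch below assumes Newton's form $XX' = -f^2$ and the lateral magnification $Y'/Y = f/X$, estimates $dX'/dX$ by a finite difference, and compares it with $(Y'/Y)^2$. The focal length and the sample object positions are hypothetical.

```python
def image_X(X, f):
    """Image coordinate X' from X * X' = -f**2 (both measured from the foci)."""
    return -f**2 / X


def lateral_magnification(X, f):
    """Y'/Y = f / X."""
    return f / X


def longitudinal_magnification(X, f, dX=1e-6):
    """dX'/dX estimated by a central finite difference."""
    return (image_X(X + dX, f) - image_X(X - dX, f)) / (2 * dX)


f = 0.5  # hypothetical focal length
for X in (0.2, 1.0, 3.0):
    m_lat = lateral_magnification(X, f)
    m_long = longitudinal_magnification(X, f)
    # The longitudinal magnification equals the square of the lateral one.
    print(X, m_lat, m_long, m_lat**2)
```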
A bundle passing through the point X = f on the optical axis intersects once more at the
point X' = −f on the axis; these two points are called principal points. From equation
(56.2) ($n_y X - f n'_y = Y$, $n_z X - f n'_z = Z$) it is clear that in this case (X = f, Y = Z = 0), we
have the equations $n_y = n'_y$, $n_z = n'_z$. Thus every ray starting from a principal point crosses
the optical axis again at the other principal point in a direction parallel to its original
direction.
If the coordinates of object and image are measured from the principal points (and not
from the principal foci), then for these coordinates $\xi$ and $\xi'$, we have
$$\xi' = X' + f, \qquad \xi = X - f.$$
Substituting in (56.5) it is easy to obtain the equations of image formation in the form
$$\frac{1}{\xi} - \frac{1}{\xi'} = -\frac{1}{f}. \tag{56.8}$$
One can show that for an optical system with small thickness (for example, a mirror or a
thin lens), the two principal points almost coincide. In this case the equation (56.8) is
particularly convenient, since in it $\xi$ and $\xi'$ are then measured practically from one and the
same point.
If the focal distance is positive, then objects located in front of the focus (X > 0) are
imaged erect (Y'/Y > 0); such optical systems are said to be converging. If f < 0, then for
X > 0 we have Y'/Y < 0, that is, the object is imaged in inverted form; such systems are
said to be diverging.
There is one limiting case of image formation which is not contained in the formulas
(56.8); this is the case where all three coefficients f, g, h are infinite (i.e. the optical system has
an infinite focal distance and its principal foci are located at infinity). Going to the limit of
infinite f, g, h in (56.4) we obtain
$$x' = \frac{h}{g}x + \frac{f^2 - gh}{g}.$$
Since we are interested only in the case where the object and its image are located at finite
distances from the optical system, f, g, h must approach infinity in such fashion that the
ratios $h/g$, $(f^2 - gh)/g$ are finite. Denoting them, respectively, by $\alpha^2$ and $\beta$, we have
$$x' = \alpha^2 x + \beta.$$
For the other two coordinates we now have from the general equation (56.7):
$$\frac{y'}{y} = \frac{z'}{z} = \pm\alpha.$$
Again measuring the coordinates x and x' from different origins, namely from some
arbitrary point on the axis and from the image of this point, respectively, we finally obtain
the equations of image formation in the simple form
$$X' = \alpha^2 X, \quad Y' = \pm\alpha Y, \quad Z' = \pm\alpha Z. \tag{56.9}$$
Thus the longitudinal and transverse magnifications are constants (but not equal to each
other). This case of image formation is called telescopic.
All the equations (56.5) through (56.9), derived by us for lenses, apply equally to mirrors,
and even to an optical system without axial symmetry, if only the image formation occurs
by means of narrow bundles of rays traveling near the optical axis. In this, the reference
points for the x coordinates of object and image must always be chosen along the optical
axis from corresponding points (principal foci or principal points) in the direction of
propagation of the ray. In doing this, we must keep in mind that for an optical system not possessing
axial symmetry, the directions of the optical axis in front of and beyond the system do not lie
in the same plane.
PROBLEMS
1. Find the focal distance for image formation with the aid of two axially-symmetric optical
systems whose optical axes coincide.
Solution: Let $f_1$ and $f_2$ be the focal lengths of the two systems. For each system separately, we
have
$$X_1 X_1' = -f_1^2, \qquad X_2 X_2' = -f_2^2.$$
Since the image produced by the first system acts as the object for the second, then denoting by l the
distance between the rear principal focus of the first system and the front focus of the second, we
have $X_2 = X_1' - l$; expressing $X_2'$ in terms of $X_1$, we obtain
$$X_2' = \frac{X_1 f_2^2}{f_1^2 + l X_1}$$
or
$$\left(X_1 + \frac{f_1^2}{l}\right)\left(X_2' - \frac{f_2^2}{l}\right) = -\left(\frac{f_1 f_2}{l}\right)^2,$$
from which it is clear that the principal foci of the composite system are located at the points
$X_1 = -f_1^2/l$, $X_2' = f_2^2/l$ and the focal length is
$$f = -\frac{f_1 f_2}{l}$$
(to choose the sign of this expression, we must write the corresponding equation for the transverse
magnification).
Fig. 8. (Axis labels: x = 0, x = l.)
In case l = 0, the focal length f = ∞, that is, the composite system gives telescopic image
formation. In this case we have $X_2' = X_1 (f_2/f_1)^2$, that is, the parameter $\alpha$ in the general formula
(56.9) is $\alpha = f_2/f_1$.
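The composition rule of Problem 1 can be verified by chaining Newton's equation $XX' = -f^2$ for the two systems, with $X_2 = X_1' - l$. The Python sketch below does this for a few object positions and compares the result with the composite foci $X_1 = -f_1^2/l$, $X_2' = f_2^2/l$ and focal length $f = -f_1 f_2/l$. The numerical values and function names are illustrative assumptions.

```python
def newton_image(X, f):
    """Image coordinate X' from X * X' = -f**2 (both measured from the foci)."""
    return -f**2 / X


def through_both(X1, f1, f2, l):
    """Chain two systems: the image of the first is the object of the second."""
    X2 = newton_image(X1, f1) - l   # X2 = X1' - l
    return newton_image(X2, f2)     # X2'


def composite_constants(f1, f2, l):
    """Foci (in the X1 and X2' coordinates) and focal length of the pair."""
    return -f1**2 / l, f2**2 / l, -f1 * f2 / l


f1, f2, l = 0.3, 0.2, 0.05   # hypothetical values
X1_focus, X2p_focus, f = composite_constants(f1, f2, l)
for X1 in (-2.0, -0.5, 0.7):
    X2p = through_both(X1, f1, f2, l)
    # Newton's equation for the composite system, measured from its own foci.
    print((X1 - X1_focus) * (X2p - X2p_focus), -f**2)
```

Only $f^2$ enters this check; as noted in the solution, the sign of f must be fixed separately from the transverse magnification.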
2. Find the focal length for charged particles of a "magnetic lens" in the form of a longitudinal
homogeneous field in the section of length l (Fig. 8).†
Solution: The kinetic energy of the particle is conserved during its motion in a magnetic field;
therefore the Hamilton-Jacobi equation for the reduced action $S_0(\mathbf{r})$ (where the total action is
$S = -\mathcal{E}t + S_0$) is
$$\left(\nabla S_0 - \frac{e}{c}\mathbf{A}\right)^2 = p^2,$$
where
$$p^2 = \frac{\mathcal{E}^2}{c^2} - m^2c^2 = \text{const}.$$
Using formula (19.4) for the vector potential of the homogeneous magnetic field, choosing the x
axis along the field direction and considering this axis as the optical axis of an axially-symmetric
optical system, we get the Hamilton-Jacobi equation in the form:
$$\left(\frac{\partial S_0}{\partial x}\right)^2 + \left(\frac{\partial S_0}{\partial r}\right)^2 + \frac{e^2}{4c^2}H^2r^2 = p^2, \tag{1}$$
where r is the distance from the x axis, and $S_0$ is a function of x and r.
For narrow beams of particles propagating close to the optical axis, the coordinate r is small, so
that accordingly we try to find $S_0$ as a power series in r. The first two terms of this series are
$$S_0 = px + \tfrac{1}{2}\sigma(x)r^2, \tag{2}$$
where $\sigma(x)$ satisfies the equation
$$p\sigma'(x) + \sigma^2 + \frac{e^2}{4c^2}H^2 = 0. \tag{3}$$
In region 1 in front of the lens, we have:
$$\sigma^{(1)} = \frac{p}{x - x_1},$$
where $x_1 < 0$ is a constant. This solution corresponds to a free beam of particles, emerging along
straight line rays from the point $x = x_1$ on the optical axis in region 1. In fact, the action function
† This might be the field inside a long solenoid, when we neglect the disturbance of the homogeneity of the
field near the ends of the solenoid.
for the free motion of a particle with a momentum p in a direction out from the point $x = x_1$ is
$$S_0 = p\sqrt{r^2 + (x - x_1)^2} \cong p(x - x_1) + \frac{pr^2}{2(x - x_1)}.$$
Similarly, in region 2 behind the lens we write:
$$\sigma^{(2)} = \frac{p}{x - x_2},$$
where the constant $x_2$ is the coordinate of the image of the point $x_1$.
In region 3 inside the lens, the solution of equation (3) is obtained by separation of variables,
and gives:
$$\sigma^{(3)} = \frac{eH}{2c}\cot\left(\frac{eH}{2cp}x + C\right),$$
where C is an arbitrary constant.
The constant C and $x_2$ (for given $x_1$) are determined by the requirements of continuity of $\sigma(x)$
for x = 0 and x = l:
$$-\frac{p}{x_1} = \frac{eH}{2c}\cot C, \qquad \frac{p}{l - x_2} = \frac{eH}{2c}\cot\left(\frac{eH}{2cp}l + C\right).$$
Eliminating the constant C from these equations, we find:
$$(x_1 - g)(x_2 + h) = -f^2,$$
where†
$$g = -\frac{2cp}{eH}\cot\frac{eHl}{2cp}, \qquad h = g - l,$$
$$f = \frac{2cp}{eH\sin\dfrac{eHl}{2cp}}.$$
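The elimination of C can be checked numerically. In the sketch below we assume the continuity conditions in the form $-q/x_1 = \cot C$ and $q/(l - x_2) = \cot(l/q + C)$, where $q = 2cp/(eH)$ is a length introduced here only for brevity; $x_2$ is found directly from these conditions and then compared with the lens equation $(x_1 - g)(x_2 + h) = -f^2$, with $g = -q\cot(l/q)$, $h = g - l$, $f = q/\sin(l/q)$. The numerical values are hypothetical.

```python
import math


def lens_constants(q, l):
    """g, h, f of the magnetic lens; q = 2cp/(eH) (a length), l the field length."""
    theta = l / q
    g = -q / math.tan(theta)
    return g, g - l, q / math.sin(theta)


def image_by_matching(x1, q, l):
    """Find x2 from continuity of sigma at x = 0 and x = l, without using g, h, f."""
    C = math.atan2(-x1, q)              # cot C = -q/x1, with x1 < 0 and q > 0
    return l - q * math.tan(l / q + C)  # q/(l - x2) = cot(l/q + C)


q, l = 1.0, 0.5   # hypothetical values, in arbitrary length units
g, h, f = lens_constants(q, l)
for x1 in (-2.0, -5.0):
    x2 = image_by_matching(x1, q, l)
    print(x1, x2, (x1 - g) * (x2 + h), -f**2)
```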
§ 57. Image formation with broad bundles of rays
The formation of images with the aid of narrow bundles of rays, which was considered
in the previous section, is approximate; it is the more exact (i.e. the sharper) the narrower
the bundles. We now go over to the question of image formation with bundles of rays of
arbitrary breadth.
In contrast to the formation of an image of an object by narrow beams, which can be
achieved for any optical system having axial symmetry, image formation with broad beams
is possible only for specially constituted optical systems. Even with this limitation, as already
pointed out in § 56, image formation is not possible for all points in space.
The later derivations are based on the following essential remark. Suppose that all rays,
starting from a certain point O and traveling through the optical system, intersect again at
some other point O'. It is easy to see that the optical path length $\psi$ is the same for all these
rays. In the neighborhood of each of the points O, O', the wave surfaces for the rays
intersecting in them are spheres with centers at O and O', respectively, and, in the limit as we
approach O and O', degenerate to these points. But the wave surfaces are the surfaces of
constant phase, and therefore the change in phase along different rays, between their points
of intersection with two given wave surfaces, is the same. From what has been said, it follows
that the total change in phase between the points O and O' is the same (for the different
rays).
† The value of f is given with the correct sign. However, to show this requires additional investigation.
Let us consider the conditions which must be fulfilled in order to have formation of an
image of a small line segment using broad beams ; the image is then also a small line segment.
We choose the directions of these segments as the directions of the $\xi$ and $\xi'$ axes, with origins
at any two corresponding points O and O' of the object and image. Let $\psi$ be the optical path
length for the rays starting from O and reaching O'. For the rays starting from a point
infinitely near to O with coordinate $d\xi$, and arriving at a point of the image with coordinate
$d\xi'$, the optical path length is $\psi + d\psi$, where
$$d\psi = \frac{\partial\psi}{\partial\xi}\,d\xi + \frac{\partial\psi}{\partial\xi'}\,d\xi'.$$
We introduce the "magnification"
$$\alpha_\xi = \frac{d\xi'}{d\xi}$$
as the ratio of the length $d\xi'$ of the element of the image to the length $d\xi$ of the imaged
element. Because of the smallness of the line segment which is being imaged, the quantity $\alpha_\xi$
can be considered constant along the line segment. Writing, as usual, $\partial\psi/\partial\xi = -n_\xi$,
$\partial\psi/\partial\xi' = n'_\xi$ ($n_\xi$, $n'_\xi$ are the cosines of the angles between the directions of the ray and the
corresponding axes $\xi$ and $\xi'$), we obtain
$$d\psi = (\alpha_\xi n'_\xi - n_\xi)\,d\xi.$$
As for every pair of corresponding points of object and image, the optical path length
$\psi + d\psi$ must be the same for all rays starting from the point $d\xi$ and arriving at the point $d\xi'$.
From this we obtain the condition:
$$\alpha_\xi n'_\xi - n_\xi = \text{const}. \tag{57.1}$$
This is the condition we have been seeking, which the paths of the rays in the optical system
must satisfy in order to have image formation for a small line segment using broad beams.
The relation (57.1) must be fulfilled for all rays starting from the point O.
Let us apply this condition to image formation by means of an axially-symmetric optical
system. We start with the image of a line segment coinciding with the optical axis (x axis);
clearly the image also coincides with the axis. A ray moving along the optical axis ($n_x = 1$),
because of the axial symmetry of the system, does not change its direction after passing
through it, that is, $n'_x$ is also 1. From this it follows that const in (57.1) is equal in this case
to $\alpha_x - 1$, and we can rewrite (57.1) in the form
$$\frac{1 - n_x}{1 - n'_x} = \alpha_x.$$
Denoting by $\theta$ and $\theta'$ the angles subtended by the rays with the optical axis at points of the
object and image, we have
$$1 - n_x = 1 - \cos\theta = 2\sin^2\frac{\theta}{2}, \qquad 1 - n'_x = 1 - \cos\theta' = 2\sin^2\frac{\theta'}{2}.$$
Thus we obtain the condition for image formation in the form
$$\frac{\sin\dfrac{\theta}{2}}{\sin\dfrac{\theta'}{2}} = \text{const} = \sqrt{\alpha_x}. \tag{57.2}$$
Next, let us consider the imaging of a small portion of a plane perpendicular to the optical
axis of an axially symmetric system; the image will obviously also be perpendicular to this
axis. Applying (57.1) to an arbitrary segment lying in the plane which is to be imaged, we get:
$$\alpha_r\sin\theta' - \sin\theta = \text{const},$$
where $\theta$ and $\theta'$ are again the angles made by the beam with the optical axis. For rays
emerging from the point of intersection of the object plane with the optical axis, and directed
along this axis ($\theta = 0$), we must have $\theta' = 0$, because of symmetry. Therefore const is zero,
and we obtain the condition for imaging in the form
$$\frac{\sin\theta}{\sin\theta'} = \text{const} = \alpha_r. \tag{57.3}$$
As for the formation of an image of a three-dimensional object using broad beams, it is
easy to see that this is impossible even for a small volume, since the conditions (57.2) and
(57.3) are incompatible.
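The incompatibility of the two conditions can be seen numerically. The sketch below assumes that the sine condition $\sin\theta/\sin\theta' = \alpha_r$ holds exactly for every ray and then evaluates the ratio $\sin(\theta/2)/\sin(\theta'/2)$, which the other condition would require to be constant over the same broad bundle. The value of $\alpha_r$ and the sample angles are hypothetical.

```python
import math


def half_angle_ratio(theta, alpha_r):
    """sin(theta/2)/sin(theta'/2) when sin(theta) = alpha_r * sin(theta')."""
    theta_prime = math.asin(math.sin(theta) / alpha_r)
    return math.sin(theta / 2) / math.sin(theta_prime / 2)


alpha_r = 2.0  # hypothetical transverse magnification
ratios = [half_angle_ratio(t, alpha_r) for t in (0.05, 0.5, 1.0, 1.5)]
print(ratios)
```

For small angles the ratio tends to $\alpha_r$, but it drifts as $\theta$ grows, so both conditions can be met only for narrow bundles.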
§ 58. The limits of geometrical optics
From the definition of a monochromatic plane wave, its amplitude is the same everywhere
and at all times. Such a wave is infinite in extent in all directions in space, and exists over the
whole range of time from −∞ to +∞. Any wave whose amplitude is not constant
everywhere at all times can only be more or less monochromatic. We now take up the question of
the "degree of non-monochromaticity" of a wave.
Let us consider an electromagnetic wave whose amplitude at each point is a function of
the time. Let $\omega_0$ be some average frequency of the wave. Then the field of the wave, for
example the electric field, at a given point has the form $\mathbf{E}_0(t)e^{-i\omega_0 t}$. This field, although it is of
course not monochromatic, can be expanded in monochromatic waves, that is, in a Fourier
integral. The amplitude of the component in this expansion, with frequency $\omega$, is
proportional to the integral
$$\int_{-\infty}^{+\infty}\mathbf{E}_0(t)\,e^{i(\omega-\omega_0)t}\,dt.$$
The factor $e^{i(\omega-\omega_0)t}$ is a periodic function whose average value is zero. If $\mathbf{E}_0$ were exactly
constant, then the integral would be exactly zero, for $\omega \neq \omega_0$. If, however, $\mathbf{E}_0(t)$ is variable,
but hardly changes over a time interval of order $1/|\omega-\omega_0|$, then the integral is almost equal
to zero, the more exactly the slower the variation of $\mathbf{E}_0$. In order for the integral to be
significantly different from zero, it is necessary that $\mathbf{E}_0(t)$ vary significantly over a time interval
of the order of $1/|\omega-\omega_0|$.
We denote by $\Delta t$ the order of magnitude of the time interval during which the amplitude
of the wave at a given point in space changes significantly. From these considerations, it now
follows that the frequencies deviating most from $\omega_0$, which appear with reasonable intensity
in the spectral resolution of this wave, are determined by the condition $1/|\omega-\omega_0| \sim \Delta t$. If
we denote by $\Delta\omega$ the frequency interval (around the average frequency $\omega_0$) which enters in
the spectral resolution of the wave, then we have the relation
$$\Delta\omega\,\Delta t \sim 1. \tag{58.1}$$
We see that a wave is the more monochromatic (i.e. the smaller $\Delta\omega$) the larger $\Delta t$, i.e. the
slower the variation of the amplitude at a given point in space.
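The relation between the duration of the amplitude variation and the spectral width can be illustrated on a concrete envelope. The Python sketch below assumes a Gaussian amplitude $E_0(t) = e^{-t^2/2\tau^2}$ (our choice, not taken from the text), evaluates the Fourier integral over $e^{i(\omega-\omega_0)t}$ by a direct sum, and computes root-mean-square widths of the intensity in time and in frequency. For this particular envelope the product of the two widths is 1/2, that is, of order unity.

```python
import math


def rms_width(xs, weights):
    """Root-mean-square width of a distribution sampled on the grid xs."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total
    return math.sqrt(var)


def pulse_widths(tau, n_t=801, n_w=201):
    """Return (delta_t, delta_omega) for the envelope exp(-t^2 / (2 tau^2))."""
    ts = [(-8 + 16 * i / (n_t - 1)) * tau for i in range(n_t)]
    env = [math.exp(-t * t / (2 * tau * tau)) for t in ts]
    dt = ts[1] - ts[0]
    # Detunings omega - omega_0 on a grid wide enough to contain the spectrum.
    oms = [(-8 + 16 * j / (n_w - 1)) / tau for j in range(n_w)]
    # The envelope is real and even, so only the cosine part of the integral survives.
    spec = [sum(e * math.cos(om * t) for e, t in zip(env, ts)) * dt for om in oms]
    return rms_width(ts, [e * e for e in env]), rms_width(oms, [s * s for s in spec])


for tau in (0.5, 2.0):
    d_t, d_w = pulse_widths(tau)
    print(tau, d_t, d_w, d_t * d_w)
```

Lengthening the pulse narrows the spectrum in inverse proportion, while the product of the widths stays fixed.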
Relations similar to (58.1) are easily derived for the wave vector. Let $\Delta x$, $\Delta y$, $\Delta z$ be the
orders of magnitude of distances along the X, Y, Z axes, in which the wave amplitude
changes significantly. At a given time, the field of the wave as a function of the coordinates
has the form
$$\mathbf{E}_0(\mathbf{r})\,e^{i\mathbf{k}_0\cdot\mathbf{r}},$$
where $\mathbf{k}_0$ is some average value of the wave vector. By a completely analogous derivation to
that for (58.1) we can obtain the interval $\Delta\mathbf{k}$ of values contained in the expansion of the wave
into a Fourier integral:
$$\Delta k_x\,\Delta x \sim 1, \quad \Delta k_y\,\Delta y \sim 1, \quad \Delta k_z\,\Delta z \sim 1. \tag{58.2}$$
Let us consider, in particular, a wave which is radiated during a finite time interval. We
denote by $\Delta t$ the order of magnitude of this interval. The amplitude at a given point in space
changes significantly during the time $\Delta t$ in the course of which the wave travels completely
past the point. Because of the relations (58.1) we can now say that the "lack of
monochromaticity" of such a wave, $\Delta\omega$, cannot be smaller than $1/\Delta t$ (it can of course be larger):
$$\Delta\omega \gtrsim \frac{1}{\Delta t}. \tag{58.3}$$
Similarly, if $\Delta x$, $\Delta y$, $\Delta z$ are the orders of magnitude of the extension of the wave in space,
then for the spread in the values of components of the wave vector, entering in the resolution
of the wave, we obtain
$$\Delta k_x \gtrsim \frac{1}{\Delta x}, \quad \Delta k_y \gtrsim \frac{1}{\Delta y}, \quad \Delta k_z \gtrsim \frac{1}{\Delta z}. \tag{58.4}$$
From these formulas it follows that if we have a beam of light of finite width, then the
direction of propagation of the light in such a beam cannot be strictly constant. Taking the
X axis along the (average) direction of light in the beam, we obtain
$$\theta_y \gtrsim \frac{1}{k\,\Delta y} \sim \frac{\lambda}{\Delta y}, \tag{58.5}$$
where $\theta_y$ is the order of magnitude of the deviation of the beam from its average direction in
the XY plane and $\lambda$ is the wavelength.
On the other hand, the formula (58.5) answers the question of the limit of sharpness of
optical image formation. A beam of light whose rays, according to geometrical optics, would
all intersect in a point, actually gives an image not in the form of a point but in the form of a
spot. For the width $\Delta$ of this spot, we obtain, according to (58.5),
$$\Delta \sim \frac{1}{k\theta} \sim \frac{\lambda}{\theta}, \tag{58.6}$$
where $\theta$ is the opening angle of the beam. This formula can be applied not only to the image
but also to the object. Namely, we can state that in observing a beam of light emerging
from a luminous point, this point cannot be distinguished from a body of dimensions $\lambda/\theta$.
In this way formula (58.6) determines the limiting resolving power of a microscope. The
minimum value of $\Delta$, which is reached for $\theta \sim 1$, is $\lambda$, in complete agreement with the fact
that the limit of geometrical optics is determined by the wavelength of the light.
PROBLEM
Determine the order of magnitude of the smallest width of a light beam produced from a parallel
beam at a distance l from a diaphragm.
Solution: Denoting the size of the aperture in the diaphragm by d, we have from (58.5) for the
angle of deflection of the beam (the "diffraction angle"), $\sim\lambda/d$, so that the width of the beam is of order
$d + (\lambda/d)l$. The smallest value of this quantity $\sim\sqrt{\lambda l}$.
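The minimization in this solution is elementary, and a short Python sketch makes the estimate concrete. It takes the width as $d + \lambda l/d$, whose stationary point is $d = \sqrt{\lambda l}$, and compares a few aperture sizes around that value. The wavelength and distance are hypothetical sample values, and the factor 2 in the minimum lies beyond the order-of-magnitude accuracy of the estimate.

```python
import math


def beam_width(d, wavelength, l):
    """Order-of-magnitude width at distance l: aperture size plus diffraction spread."""
    return d + wavelength * l / d


wavelength, l = 5e-7, 1.0            # hypothetical: 500 nm light, 1 m distance
d_best = math.sqrt(wavelength * l)   # stationary point of d + wavelength*l/d
candidates = [d_best * s for s in (0.25, 0.5, 1.0, 2.0, 4.0)]
widths = [beam_width(d, wavelength, l) for d in candidates]
print(d_best, min(widths))
```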
§ 59. Diffraction
The laws of geometrical optics are strictly correct only in the ideal case when the
wavelength can be considered to be infinitely small. The more poorly this condition is fulfilled,
the greater are the deviations from geometrical optics. Phenomena which are the
consequence of such deviations are called diffraction phenomena.
Diffraction phenomena can be observed, for example, if along the path of propagation of
the light† there is an obstacle, namely an opaque body (we call it a screen) of arbitrary form, or, for
example, if the light passes through holes in opaque screens. If the laws of geometrical optics
were strictly satisfied, there would be beyond the screen regions of "shadow" sharply
delineated from regions where light falls. The diffraction has the consequence that, instead
of a sharp boundary between light and shadow, there is a quite complex distribution of the
intensity of the light. These diffraction phenomena appear the more strongly the smaller the
dimensions of the screens and the apertures in them, or the greater the wavelength.
The problem of the theory of diffraction consists in determining, for given positions and
shapes of the objects (and locations of the light sources), the distribution of the light, that is,
the electromagnetic field over all space. The exact solution of this problem is possible only
through solution of the wave equation with suitable boundary conditions at the surface of
the body, these conditions being determined also by the optical properties of the material.
Such a solution usually presents great mathematical difficulties.
However, there is an approximate method which for many cases is a satisfactory solution
of the problem of the distribution of light near the boundary between light and shadow. This
method is applicable to cases of small deviation from geometrical optics, i.e. when firstly, the
dimensions of all bodies are large compared with the wavelength (this requirement applies
both to the dimensions of screens and apertures and also to the distances from the bodies to
the points of emission and observation of the light) ; and secondly when there are only small
deviations of the light from the directions of the rays given by geometrical optics.
Let us consider a screen with an aperture through which the light passes from given
sources. Figure 9 shows the screen in profile (the heavy line); the light travels from left to
right. We denote by u some one of the components of E or H. Here we shall understand u
to mean a function only of the coordinates, i.e. without the factor $e^{-i\omega t}$ determining the time
dependence. Our problem is to determine the light intensity, that is, the field u, at any
point of observation P beyond the screen. For an approximate solution of this problem in
cases where the deviations from geometrical optics are small, we may assume that at the
points of the aperture the field is the same as it would have been in the absence of the screen.
In other words, the values of the field here are those which follow directly from geometrical
† In what follows, in discussing diffraction we shall talk of the diffraction of light; all these same
considerations also apply, of course, to any electromagnetic wave.
Fig. 9.
optics. At all points immediately behind the screen, the field can be set equal to zero. In this
the properties of the screen (i.e. of the screen material) obviously play no part. It is also
obvious that in the cases we are considering, what is important for the diffraction is only the
shape of the edge of the aperture, while the shape of the opaque screen is unimportant.
We introduce some surface which covers the aperture in the screen and is bounded by its
edges (a profile of such a surface is shown in Fig. 9 as a dashed line). We break up this surface
into sections with area df whose dimensions are small compared with the size of the aperture,
but large compared with the wavelength of the light. We can then consider each of these
sections through which the light passes as if it were itself a source of light waves spreading
out on all sides from this section. We shall consider the field at the point P to be the result of
superposition of the fields produced by all the sections df of the surface covering the aperture.
(This is called Huygens' principle.)
The field produced at the point P by the section df is obviously proportional to the value u
of the field at the section df itself (we recall that the field at df is assumed to be the same as it
would have been in the absence of the screen). In addition, it is proportional to the projection
$df_n$ of the area df on the plane perpendicular to the direction n of the ray coming from the
light source at df. This follows from the fact that no matter what shape the element df has,
the same rays will pass through it provided its projection $df_n$ remains fixed, and therefore its
effect on the field at P will be the same.
Thus the field produced at the point P by the section df is proportional to $u\,df_n$.
Furthermore, we must still take into account the change in the amplitude and phase of the wave
during its propagation from df to P. The law of this change is determined by formula
(54.3). Therefore $u\,df_n$ must be multiplied by $(1/R)e^{ikR}$ (where R is the distance from df to P,
and k is the absolute value of the wave vector of the light), and we find that the required
field is
$$au\,\frac{e^{ikR}}{R}\,df_n,$$
where a is an as yet unknown constant. The field at the point P, being the result of the
addition of the fields produced by all the elements df, is consequently equal to
$$u_P = a\int\frac{u\,e^{ikR}}{R}\,df_n, \tag{59.1}$$
where the integral extends over the surface bounded by the edge of the aperture. In the
approximation we are considering, this integral cannot, of course, depend on the form of this
surface. Formula (59.1) is, obviously, applicable not only to diffraction by an aperture
in a screen, but also to diffraction by a screen around which the light passes freely. In that
case the surface of integration in (59.1) extends on all sides from the edge of the screen.
To determine the constant a, we consider a plane wave propagating along the X axis ;
the wave surfaces are parallel to the plane YZ. Let u be the value of the field in the YZ plane.
Then at the point P, which we choose on the X axis, the field is equal to $u_P = ue^{ikx}$. On the
other hand, the field at the point P can be determined starting from formula (59.1), choosing
as surface of integration, for example, the YZ plane. In doing this, because of the smallness
of the angle of diffraction, only those points of the YZ plane are important in the integral
which lie close to the origin, i.e. the points for which $y, z \ll x$ (x is the coordinate of the
point P). Then
$$R = \sqrt{x^2 + y^2 + z^2} \approx x + \frac{y^2 + z^2}{2x},$$
and (59.1) gives
$$u_P = au\,\frac{e^{ikx}}{x}\int_{-\infty}^{+\infty} e^{iky^2/2x}\,dy\int_{-\infty}^{+\infty} e^{ikz^2/2x}\,dz,$$
where u is a constant (the field in the YZ plane); in the factor 1/R, we can put $R \cong x = \text{const}$.
By the substitution $y = \xi\sqrt{2x/k}$ these two integrals can be transformed to the integral
$$\int_{-\infty}^{+\infty} e^{i\xi^2}\,d\xi = \int_{-\infty}^{+\infty}\cos\xi^2\,d\xi + i\int_{-\infty}^{+\infty}\sin\xi^2\,d\xi = \sqrt{\frac{\pi}{2}}\,(1+i),$$
and we get
$$u_P = au\,e^{ikx}\,\frac{2i\pi}{k}.$$
On the other hand, $u_P = ue^{ikx}$, and consequently
$$a = \frac{k}{2\pi i}.$$
Substituting in (59.1), we obtain the solution to our problem in the form
$$u_P = \int\frac{ku}{2\pi iR}\,e^{ikR}\,df_n. \tag{59.2}$$
In deriving formula (59.2), the light source was assumed to be essentially a point, and the
light was assumed to be strictly monochromatic. The case of a real, extended source, which
emits nonmonochromatic light, does not, however, require special treatment. Because of the
complete independence (incoherence) of the light emitted by different points of the source,
and the incoherence of the different spectral components of the emitted light, the total
diffraction pattern is simply the sum of the intensity distributions obtained from the
diffraction of the independent components of the light.
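The value of the constant a rests on the integral $\int e^{i\xi^2}d\xi = \sqrt{\pi/2}\,(1+i)$, which converges only through cancellation of oscillations. As a hedged numerical check, the sketch below adds a small damping factor $e^{-\varepsilon\xi^2}$ (a regularization introduced here, not used in the text), for which the closed form is $\sqrt{\pi/(\varepsilon - i)}$, compares a trapezoidal sum with that closed form, and then confirms that $a = k/(2\pi i)$ reproduces the plane wave. All function names are illustrative.

```python
import cmath
import math


def damped_fresnel(eps, half_range=20.0, n=40001):
    """Trapezoidal estimate of the integral of exp((i - eps) * xi**2) over the real line."""
    step = 2 * half_range / (n - 1)
    total = 0j
    for j in range(n):
        xi = -half_range + j * step
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * cmath.exp((1j - eps) * xi * xi)
    return total * step


def exact(eps):
    """Closed form sqrt(pi / (eps - i)); eps = 0 gives sqrt(pi/2) * (1 + i)."""
    return cmath.sqrt(math.pi / (eps - 1j))


def huygens_constant(k):
    """a = k / (2*pi*i), chosen so that a * (2/k) * I**2 = 1 with I**2 = i*pi."""
    return k / (2j * math.pi)


print(damped_fresnel(0.05), exact(0.05))
print(exact(0.0), math.sqrt(math.pi / 2) * (1 + 1j))
```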
Let us apply formula (59.2) to the solution of the problem of the change in phase of a ray
on passing through its point of tangency to the caustic (see the end of § 54). We choose as
our surface of integration in (59.2) any wave surface, and determine the field u p at a point P,
lying on some given ray at a distance x from its point of intersection with the wave surface
we have chosen (we choose this point as coordinate origin O, and as YZ plane the plane
tangent to the wave surface at the point O). In the integration of (59.2) only a small area of
the wave surface in the neighborhood of O is important. If the XY and XZ planes are chosen
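to coincide with the principal planes of curvature of the wave surface at the point O, then
near this point the equation of the surface is
$$X = \frac{y^2}{2R_1} + \frac{z^2}{2R_2},$$
where $R_1$ and $R_2$ are the radii of curvature. The distance R from the point on the wave
surface with coordinates X, y, z, to the point P with coordinates x, 0, 0, is
$$R = \sqrt{(x - X)^2 + y^2 + z^2} \approx x + \frac{y^2}{2}\left(\frac{1}{x} - \frac{1}{R_1}\right) + \frac{z^2}{2}\left(\frac{1}{x} - \frac{1}{R_2}\right).$$
On the wave surface, the field u can be considered constant; the same applies to the factor
1/R. Since we are interested only in changes in the phase of the wave, we drop coefficients
and write simply
$$u_P \sim \frac{1}{i}\int e^{ikR}\,df_n \approx \frac{e^{ikx}}{i}\int_{-\infty}^{+\infty} dy\,e^{ik\frac{y^2}{2}\left(\frac{1}{x}-\frac{1}{R_1}\right)}\int_{-\infty}^{+\infty} dz\,e^{ik\frac{z^2}{2}\left(\frac{1}{x}-\frac{1}{R_2}\right)}. \tag{59.3}$$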
The centers of curvature of the wave surface lie on the ray we are considering, at the
points $x = R_1$ and $x = R_2$; these are the points where the ray is tangent to the caustic.
Suppose $R_2 < R_1$. For $x < R_2$, the coefficients of i in the exponentials appearing in the two
integrands are positive, and each of these integrals is proportional to $(1+i)$. Therefore on
the part of the ray before its first tangency to the caustic, we have $u_P \sim e^{ikx}$. For $R_2 < x < R_1$,
that is, on the segment of the ray between its two points of tangency, the integral over y is
proportional to $1+i$, but the integral over z is proportional to $1-i$, so that their product does
not contain i. Thus we have here $u_P \sim -ie^{ikx} = e^{i(kx-\pi/2)}$, that is, as the ray passes in the
neighborhood of the first caustic, its phase undergoes an additional change of $-\pi/2$.
Finally, for $x > R_1$, we have $u_P \sim -e^{ikx} = e^{i(kx-\pi)}$, that is, on passing in the neighborhood
of the second caustic, the phase once more changes by $-\pi/2$.
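The phase bookkeeping above can be checked with a short numerical sketch. It assumes the standard closed form for the integral of $e^{iay^2}$ over the whole axis, $\sqrt{\pi/|a|}\,e^{\pm i\pi/4}$; the radii, the wave number and the function names are illustrative choices, not taken from the text.

```python
import cmath
import math

def fresnel_factor(a):
    """Integral of exp(i*a*y**2) over all real y, for a != 0.

    Closed form: sqrt(pi/|a|) * exp(+-i*pi/4), i.e. proportional to
    (1 + i) for a > 0 and to (1 - i) for a < 0.
    """
    sign = 1.0 if a > 0 else -1.0
    return math.sqrt(math.pi / abs(a)) * cmath.exp(sign * 1j * math.pi / 4)

def extra_phase(x, r1, r2, k=1.0):
    """Phase of u_P relative to exp(i*k*x), from the two factors in (59.3)."""
    a_y = 0.5 * k * (1.0 / x - 1.0 / r1)   # coefficient of i*y**2
    a_z = 0.5 * k * (1.0 / x - 1.0 / r2)   # coefficient of i*z**2
    return cmath.phase(fresnel_factor(a_y) * fresnel_factor(a_z) / 1j)

R1, R2 = 3.0, 1.0   # illustrative radii of curvature with R2 < R1
for x in (0.5, 2.0, 5.0):   # before, between and beyond the two caustics
    print(x, extra_phase(x, R1, R2) / math.pi)
```

The three sample points lie before the first tangency, between the two tangencies, and beyond the second; the printed phases are in units of $\pi$ (the last one is defined only modulo $2\pi$, so it may appear as $+1$ or $-1$).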
PROBLEM
Determine the distribution of the light intensity in the neighborhood of the point where the ray
is tangent to the caustic.
Solution: To solve the problem, we use formula (59.2), taking the integral in it over any wave
surface which is sufficiently far from the point of tangency of the ray to the caustic. In Fig. 10, ab
is a section of this wave surface, and a'b' is a section of the caustic; a'b' is the evolute of the curve
Fig. 10.
ab. We are interested in the intensity distribution in the neighborhood of the point O where the ray
QO is tangent to the caustic; we assume the length D of the segment QO of the ray to be large. We
denote by x the distance from the point O along the normal to the caustic, and assume positive
values x for points on the normal in the direction of the center of curvature.
The integrand in (59.2) is a function of the distance R from the arbitrary point Q' on the wave
surface to the point P. From a well-known property of the evolute, the sum of the length of the
segment Q'O' of the tangent at the point O' and the length of the arc OO' is equal to the length QO
of the tangent at the point O. For points O and O' which are near to each other we have $OO' = \theta\rho$
($\rho$ is the radius of curvature of the caustic at the point O). Therefore the length $Q'O' = D - \theta\rho$. The
distance Q'O (along a straight line) is approximately (the angle $\theta$ is assumed to be small)
$$Q'O \cong Q'O' + \rho\sin\theta = D - \theta\rho + \rho\sin\theta \cong D - \frac{\rho\theta^3}{6}.$$
Finally, the distance R = Q'P is equal to $R \cong Q'O - x\sin\theta \cong Q'O - x\theta$, that is,
$$R \cong D - x\theta - \tfrac{1}{6}\rho\theta^3.$$
Substituting this expression in (59.2), we obtain
$$u_P \sim \int_{-\infty}^{+\infty} e^{-ikx\theta - \frac{ik\rho}{6}\theta^3}\,d\theta = 2\int_0^{\infty}\cos\left(kx\theta + \frac{k\rho}{6}\theta^3\right)d\theta$$
(the slowly varying factor 1/D in the integrand is unimportant compared with the exponential
factor, so we assume it constant). Introducing the new integration variable $\xi = (k\rho/2)^{1/3}\theta$, we get
$$u_P \sim \Phi\left(x\left(\frac{2k^2}{\rho}\right)^{1/3}\right),$$
where $\Phi(t)$ is the Airy function.†
For the intensity $I \sim |u_P|^2$, we write:
$$I = 2A\left(\frac{2k^2}{\rho}\right)^{1/6}\Phi^2\left(x\left(\frac{2k^2}{\rho}\right)^{1/3}\right)$$
(concerning the choice of the constant factor, cf. below).
For large positive values of x, we have from this the asymptotic formula
$$I \approx \frac{A}{2\sqrt{x}}\exp\left\{-\frac{4x^{3/2}}{3}\sqrt{\frac{2k^2}{\rho}}\right\},$$
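† The Airy function $\Phi(t)$ is defined as
$$\Phi(t) = \frac{1}{\sqrt{\pi}}\int_0^{\infty}\cos\left(\frac{\xi^3}{3} + \xi t\right)d\xi \qquad (1)$$
(see Quantum Mechanics, Mathematical Appendices, § b). For large positive values of the argument, the
asymptotic expression for $\Phi(t)$ is
$$\Phi(t) \approx \frac{1}{2t^{1/4}}\exp\left(-\frac{2}{3}t^{3/2}\right), \qquad (2)$$
that is, $\Phi(t)$ goes exponentially to zero. For large negative values of t, the function $\Phi(t)$ oscillates with
decreasing amplitude according to the law:
$$\Phi(t) \approx \frac{1}{(-t)^{1/4}}\sin\left(\frac{2}{3}(-t)^{3/2} + \frac{\pi}{4}\right). \qquad (3)$$
The Airy function is related to the MacDonald function (modified Hankel function) of order 1/3:
$$\Phi(t) = \sqrt{t/3\pi}\,K_{1/3}\left(\tfrac{2}{3}t^{3/2}\right). \qquad (4)$$
Formula (2) corresponds to the asymptotic expansion of $K_\nu(t)$:
$$K_\nu(t) \approx \sqrt{\frac{\pi}{2t}}\,e^{-t}.$$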
that is, the intensity drops exponentially (shadow region). For large negative values of x, we have
$$I \approx \frac{2A}{\sqrt{-x}}\sin^2\left\{\frac{2(-x)^{3/2}}{3}\sqrt{\frac{2k^2}{\rho}} + \frac{\pi}{4}\right\},$$
that is, the intensity oscillates rapidly; its average value over these oscillations is
$$\bar{I} = \frac{A}{\sqrt{-x}}.$$
From this the meaning of the constant A is clear: it is the intensity far from the caustic which would be
obtained from geometrical optics neglecting diffraction effects.
The function $\Phi(t)$ attains its largest value, 0.949, for $t = -1.02$; correspondingly, the maximum
intensity is reached at $x(2k^2/\rho)^{1/3} = -1.02$, where
$$I = 2.03\,Ak^{1/3}\rho^{-1/6}.$$
At the point where the ray is tangent to the caustic (x = 0), we have $I = 0.89\,Ak^{1/3}\rho^{-1/6}$ [since
$\Phi(0) = 0.629$].
Thus near the caustic the intensity is proportional to $k^{1/3}$, that is, to $\lambda^{-1/3}$ ($\lambda$ is the wavelength).
For $\lambda \to 0$, the intensity goes to infinity, as it should (see § 54).
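The numerical values quoted for the Airy function can be reproduced with a minimal sketch. It assumes the identification $\Phi(t) = \sqrt{\pi}\,\mathrm{Ai}(t)$, which follows from definition (1) of the footnote and the usual integral representation of Ai, and it uses the Maclaurin series of Ai, which is adequate only for moderate $|t|$. The function name `phi` and the scan grid are illustrative.

```python
import math

# Ai(0) and -Ai'(0), the two constants of the standard series for Ai(t).
C1 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
C2 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))

def phi(t, terms=60):
    """Phi(t) = sqrt(pi) * Ai(t) from the Maclaurin series (moderate |t| only)."""
    f_term, g_term = 1.0, t          # leading terms of the two series
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= t ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= t ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return math.sqrt(math.pi) * (C1 * f_sum - C2 * g_sum)

# Scan -3 <= t <= 1 for the largest value of Phi.
grid = [-3.0 + 0.001 * j for j in range(4001)]
t_peak = max(grid, key=phi)
print("Phi(0)      =", phi(0.0))
print("peak at t   =", t_peak, "with Phi =", phi(t_peak))
print("I/(A k^(1/3) rho^(-1/6)) at x = 0:", 2 * 2 ** (1 / 6) * phi(0.0) ** 2)
```

The last line evaluates the coefficient $2\cdot 2^{1/6}\,\Phi^2(0)$ that multiplies $Ak^{1/3}\rho^{-1/6}$ at the point of tangency.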
§ 60. Fresnel diffraction
If the light source and the point P at which we determine the intensity of the light are
located at finite distances from the screen, then in determining the intensity at the point P,
only those points are important which lie in a small region of the wave surface over which we
integrate in (59.2) — the region which lies near the line joining the source and the point P.
In fact, since the deviations from geometrical optics are small, the intensity of the light
arriving at P from various points of the wave surface decreases very rapidly as we move
away from this line. Diffraction phenomena in which only a small portion of the wave
surface plays a role are called Fresnel diffraction phenomena.
Let us consider the Fresnel diffraction by a screen. From what we have just said, for a
given point P only a small region at the edge of the screen is important for this diffraction.
But over sufficiently small regions, the edge of the screen can always be considered to be a
straight line. We shall therefore, from now on, understand the edge of the screen to mean
just such a small straight line segment.
We choose as the XY plane a plane passing through the light source Q (Fig. 11) and
through the line of the edge of the screen. Perpendicular to this, we choose the plane XZ so
that it passes through the point Q and the point of observation P, at which we try to determine
the light intensity. Finally, we choose the origin of coordinates O on the line of the edge
of the screen, after which the positions of all three axes are completely determined.
Fig. 11.
Let the distance from the light source Q to the origin be $D_q$. We denote the x-coordinate
of the point of observation P by $D_p$, and its z-coordinate, i.e. its distance from the XY
plane, by d. According to geometrical optics, the light should pass only through points
lying above the XY plane; the region below the XY plane is the region which according to
geometrical optics should be in shadow (region of geometrical shadow).
We now determine the distribution of light intensity on the screen near the edge of the
geometrical shadow, i.e. for values of d small compared with $D_p$ and $D_q$. A negative d
means that the point P is located within the geometrical shadow.
As the surface of integration in (59.2) we choose the half-plane passing through the line
of the edge of the screen and perpendicular to the XY plane. The coordinates x and y of
points on this surface are related by the equation $x = y\tan\alpha$ ($\alpha$ is the angle between the line
of the edge of the screen and the Y axis), and the z-coordinate is positive. The field of the
wave produced by the source Q, at the distance $R_q$ from it, is proportional to the factor
$e^{ikR_q}$. Therefore the field u on the surface of integration is proportional to
$$u \sim \exp\left\{ik\sqrt{y^2 + z^2 + (D_q + y\tan\alpha)^2}\right\}.$$
In the integral (59.2) we must now substitute for R,
$$R = \sqrt{y^2 + (z-d)^2 + (D_p - y\tan\alpha)^2}.$$
The slowly varying factors in the integrand are unimportant compared with the exponential.
Therefore we may consider 1/R constant, and write dy dz in place of $df_n$. We then find that
the field at the point P is
$$u_P \sim \int_{-\infty}^{+\infty}\int_0^{\infty}\exp\left\{ik\left(\sqrt{(D_q + y\tan\alpha)^2 + y^2 + z^2} + \sqrt{(D_p - y\tan\alpha)^2 + (z-d)^2 + y^2}\right)\right\}dy\,dz. \qquad (60.1)$$
As we have already said, the light passing through the point P comes mainly from points
of the plane of integration which are in the neighborhood of O. Therefore in the integral
(60.1) only values of y and z which are small (compared with $D_q$ and $D_p$) are important. For
this reason we can write
$$\sqrt{(D_q + y\tan\alpha)^2 + y^2 + z^2} \approx D_q + \frac{y^2\sec^2\alpha + z^2}{2D_q} + y\tan\alpha,$$
$$\sqrt{(D_p - y\tan\alpha)^2 + (z-d)^2 + y^2} \approx D_p + \frac{(z-d)^2 + y^2}{2D_p} - y\tan\alpha.$$
We substitute this in (60.1). Since we are interested only in the field as a function of the
distance d, the constant factor $\exp\{ik(D_p + D_q)\}$ can be omitted; the integral over y also
gives an expression not containing d, so we omit it also. We then find
$$u_P \sim \int_0^{\infty}\exp\left\{ik\left(\frac{1}{2D_q}z^2 + \frac{1}{2D_p}(z-d)^2\right)\right\}dz.$$
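This expression can also be written in the form
$$u_P \sim \exp\left\{ik\frac{d^2}{2(D_p + D_q)}\right\}\int_0^{\infty}\exp\left\{ik\,\frac{\left[\left(\frac{1}{D_p} + \frac{1}{D_q}\right)z - \frac{d}{D_p}\right]^2}{2\left(\frac{1}{D_p} + \frac{1}{D_q}\right)}\right\}dz. \qquad (60.2)$$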
The light intensity is determined by the square of the field, that is, by the square modulus
$|u_P|^2$. Therefore, when calculating the intensity, the factor standing in front of the integral
is irrelevant, since when multiplied by the complex conjugate expression it gives unity. An
obvious substitution reduces the integral to
$$\int_{-w}^{\infty} e^{i\eta^2}\,d\eta, \qquad (60.3)$$
where
$$w = d\sqrt{\frac{kD_q}{2D_p(D_q + D_p)}}. \qquad (60.4)$$
Thus, the intensity I at the point P is:
$$I = \frac{I_0}{2}\left|\sqrt{\frac{2}{\pi}}\int_{-w}^{\infty} e^{i\eta^2}\,d\eta\right|^2 = \frac{I_0}{2}\left\{\left(C(w^2) + \tfrac{1}{2}\right)^2 + \left(S(w^2) + \tfrac{1}{2}\right)^2\right\}, \qquad (60.5)$$
where
$$C(z) = \sqrt{\frac{2}{\pi}}\int_0^{\sqrt{z}}\cos\eta^2\,d\eta, \qquad S(z) = \sqrt{\frac{2}{\pi}}\int_0^{\sqrt{z}}\sin\eta^2\,d\eta$$
are called the Fresnel integrals. Formula (60.5) solves our problem of determining the light
intensity as a function of d. The quantity $I_0$ is the intensity in the illuminated region at
points not too near the edge of the shadow; more precisely, at those points with $w \gg 1$
($C(\infty) = S(\infty) = \tfrac{1}{2}$ in the limit $w \to \infty$).
The region of geometrical shadow corresponds to negative w. It is easy to find the
asymptotic form of the function I(w) for large negative values of w. To do this we proceed
as follows. Integrating by parts, we have
$$\int_{|w|}^{\infty} e^{i\eta^2}\,d\eta = -\frac{1}{2i|w|}e^{iw^2} + \frac{1}{2i}\int_{|w|}^{\infty} e^{i\eta^2}\frac{d\eta}{\eta^2}.$$
Integrating by parts once more on the right side of the equation and repeating this process,
we obtain an expansion in powers of $1/|w|$:
$$\int_{|w|}^{\infty} e^{i\eta^2}\,d\eta = e^{iw^2}\left(-\frac{1}{2i|w|} + \frac{1}{4|w|^3} + \cdots\right). \qquad (60.6)$$
Although an infinite series of this type does not converge, nevertheless, because the successive
terms decrease very rapidly for large values of $|w|$, the first term already gives a good
representation of the function on the left for sufficiently large $|w|$ (such a series is said to be
asymptotic). Thus, for the intensity I(w), (60.5), we obtain the following asymptotic formula,
valid for large negative values of w:
$$I = \frac{I_0}{4\pi w^2}. \qquad (60.7)$$
We see that in the region of geometric shadow, far from its edge, the intensity goes to zero
as the inverse square of the distance from the edge of the shadow.
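We now consider positive values of w, that is, the region above the XY plane. We write
$$\int_{-w}^{\infty} e^{i\eta^2}\,d\eta = \int_{-\infty}^{+\infty} e^{i\eta^2}\,d\eta - \int_{-\infty}^{-w} e^{i\eta^2}\,d\eta = (1+i)\sqrt{\frac{\pi}{2}} - \int_{w}^{\infty} e^{i\eta^2}\,d\eta.$$
For sufficiently large w, we can use an asymptotic representation for the integral standing
on the right side of the equation, and we have
$$\int_{-w}^{\infty} e^{i\eta^2}\,d\eta \approx (1+i)\sqrt{\frac{\pi}{2}} + \frac{1}{2iw}e^{iw^2}. \qquad (60.8)$$
Substituting this expression in (60.5), we obtain
$$I = I_0\left(1 + \sqrt{\frac{1}{\pi}}\,\frac{\sin\left(w^2 - \frac{\pi}{4}\right)}{w}\right). \qquad (60.9)$$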
Thus in the illuminated region, far from the edge of the shadow, the intensity has an infinite
sequence of maxima and minima, so that the ratio $I/I_0$ oscillates on both sides of unity.
With increasing w, the amplitude of these oscillations decreases inversely with the distance
from the edge of the geometric shadow, and the positions of the maxima and minima steadily
approach one another.
For small w, the function I(w) has qualitatively this same character (Fig. 12). In the region
of the geometric shadow, the intensity decreases monotonically as we move away from the
boundary of the shadow. (On the boundary itself, $I/I_0 = \tfrac{1}{4}$.) For positive w, the intensity has
alternating maxima and minima. At the first (largest) maximum, $I/I_0 = 1.37$.
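The behavior of I(w) described in this section can be explored with a rough numerical sketch of (60.5). It evaluates the integral in (60.3) as the known half-line value $(1+i)\sqrt{\pi/8}$ plus a Simpson-rule estimate of the integral from 0 to w; the function names, step counts and sample points are illustrative assumptions.

```python
import cmath
import math

def edge_integral(w, n=1000):
    """Integral of exp(i*eta**2) from -w to +infinity (n must be even)."""
    half_line = (1 + 1j) * math.sqrt(math.pi / 8)   # integral from 0 to infinity
    h = w / n                                        # signed step, so w < 0 works
    total = 1.0 + cmath.exp(1j * w * w)              # end points eta = 0 and eta = w
    for j in range(1, n):
        eta = j * h
        total += (4 if j % 2 else 2) * cmath.exp(1j * eta * eta)
    return half_line + total * h / 3                 # Simpson's rule for 0..w

def relative_intensity(w):
    """I/I0 from (60.5), written as |integral|**2 / pi."""
    return abs(edge_integral(w)) ** 2 / math.pi

grid = [0.01 * j for j in range(301)]                # 0 <= w <= 3
w_peak = max(grid, key=relative_intensity)
print("edge of shadow:", relative_intensity(0.0))
print("first maximum :", w_peak, relative_intensity(w_peak))
print("deep shadow   :", relative_intensity(-6.0), 1 / (4 * math.pi * 36))
```

The three printed lines correspond to the boundary of the shadow, the first maximum in the illuminated region, and a comparison of the exact value at $w = -6$ with the asymptotic formula (60.7).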
§ 61. Fraunhofer diffraction
Of special interest for physical applications are those diffraction phenomena which occur
when a plane parallel bundle of rays is incident on a screen. As a result of the diffraction,
the beam ceases to be parallel, and there is light propagation along directions other than the
initial one. Let us consider the problem of determining the distribution over direction of the
intensity of the diffracted light at large distances beyond the screen (this formulation of the
problem corresponds to Fraunhofer diffraction). Here we shall again restrict ourselves to the
case of small deviations from geometrical optics, i.e. we shall assume that the angles of
deviation of the rays from the initial direction (the diffraction angles) are small.
Fig. 12.
This problem can be solved by starting from the general formula (59.2) and passing to the
limit where the light source and the point of observation are at infinite distances from the
screen. A characteristic feature of the case we are considering is that, in the integral which
determines the intensity of the diffracted light, the whole wave surface over which the integral
is taken is important (in contrast to the case of Fresnel diffraction, where only the portions
of the wave surface near the edge of the screens are important).†
However, it is simpler to treat this problem anew, without recourse to the general formula
(59.2).
Let us denote by $u_0$ the field which would exist beyond the screens if geometrical optics
were rigorously valid. This field is a plane wave, but its cross section has certain regions
(corresponding to the "shadows" of opaque screens) in which the field is zero. We denote
by S the part of the plane cross-section on which the field $u_0$ is different from zero; since
each such plane is a wave surface of the plane wave, $u_0$ = const over the whole surface S.
Actually, however, a wave with a limited cross-sectional area cannot be strictly plane
(see § 58). In its spatial Fourier expansion there appear components with wave vectors
having different directions, and this is precisely the origin of the diffraction.
Let us expand the field $u_0$ into a two-dimensional Fourier integral with respect to the
coordinates y, z in the plane of the transverse cross-section of the wave. For the Fourier
components, we have:
$$u_{\mathbf{q}} = \iint u_0\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,dy\,dz, \qquad (61.1)$$
where the vectors q are constant vectors in the y, z plane; the integration actually extends
only over that portion S of the y, z plane on which $u_0$ is different from zero. If k is the wave
vector of the incident wave, the field component $u_{\mathbf{q}}e^{i\mathbf{q}\cdot\mathbf{r}}$ gives the wave vector k' = k+q.
Thus the vector q = k' − k determines the change in the wave vector of the light in the diffraction.
Since the absolute values $k = k' = \omega/c$, the small diffraction angles $\theta_y$, $\theta_z$ in the xy- and
xz-planes are related to the components of the vector q by the equations
$$q_y = \frac{\omega}{c}\theta_y, \qquad q_z = \frac{\omega}{c}\theta_z. \qquad (61.2)$$
For small deviations from geometrical optics, the components in the expansion of the
field $u_0$ can be assumed to be identical with the components of the actual field of the diffracted
light, so that formula (61.1) solves our problem.
The intensity distribution of the diffracted light is given by the square $|u_{\mathbf{q}}|^2$ as a function
of the vector q. The quantitative connection with the intensity of the incident light is
established by the formula
$$\iint u_0^2\,dy\,dz = \iint |u_{\mathbf{q}}|^2\,\frac{dq_y\,dq_z}{(2\pi)^2} \qquad (61.3)$$
† The criteria for Fresnel and Fraunhofer diffraction are easily found by returning to formula (60.2) and
applying it, for example, to a slit of width a (instead of to the edge of an isolated screen). The integration
over z in (60.2) should then be taken between the limits from 0 to a. Fresnel diffraction corresponds to the
case when the term containing $z^2$ in the exponent of the integrand is important, and the upper limit of the
integral can be replaced by $\infty$. For this to be the case, we must have
$$ka^2\left(\frac{1}{D_p} + \frac{1}{D_q}\right) \gg 1.$$
On the other hand, if this inequality is reversed, the term in $z^2$ can be dropped; this corresponds to the case of
Fraunhofer diffraction.
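[compare (49.8)]. From this we see that the relative intensity diffracted into the solid angle
$do = d\theta_y\,d\theta_z$ is given by
$$\frac{|u_{\mathbf{q}}|^2}{u_0^2}\frac{dq_y\,dq_z}{(2\pi)^2} = \left(\frac{\omega}{2\pi c}\right)^2\left|\frac{u_{\mathbf{q}}}{u_0}\right|^2 do. \qquad (61.4)$$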
Let us consider the Fraunhofer diffraction from two screens which are "complementary":
the first screen has holes where the second is opaque and conversely. We denote by $u^{(1)}$ and
$u^{(2)}$ the field of the light diffracted by these screens (when the same light is incident in both
cases). Since $u_q^{(1)}$ and $u_q^{(2)}$ are expressed by integrals (61.1) taken over the surfaces of the
apertures in the screens, and since the apertures in the two screens complement one another
to give the whole plane, the sum $u_q^{(1)} + u_q^{(2)}$ is the Fourier component of the field obtained
in the absence of the screens, i.e. it is simply the incident light. But the incident light is a
rigorously plane wave with definite direction of propagation, so that $u_q^{(1)} + u_q^{(2)} = 0$ for all
nonzero values of q. Thus we have $u_q^{(1)} = -u_q^{(2)}$, or for the corresponding intensities,
$$|u_q^{(1)}|^2 = |u_q^{(2)}|^2 \quad \text{for } q \neq 0. \qquad (61.5)$$
This means that complementary screens give the same distribution of intensity of the
diffracted light (this is called Babinet's principle).
We call attention here to one interesting consequence of the Babinet principle. Let us
consider a blackbody, i.e. one which absorbs completely all the light falling on it. According
to geometrical optics, when such a body is illuminated, there is produced behind it a region
of geometrical shadow, whose cross-sectional area is equal to the area of the body in the
direction perpendicular to the direction of incidence of the light. However, the presence of
diffraction causes the light passing by the body to be partially deflected from its initial
direction. As a result, at large distances behind the body there will not be complete shadow
but, in addition to the light propagating in the original direction, there will also be a certain
amount of light propagating at small angles to the original direction. It is easy to determine
the intensity of this scattered light. To do this, we point out that according to Babinet's
principle, the amount of light deviated because of diffraction by the body under consideration
is equal to the amount of light which would be deviated by diffraction from an aperture
cut in an opaque screen, the shape and size of the aperture being the same as that of the
transverse section of the body. But in Fraunhofer diffraction from an aperture all the light
passing through the aperture is deflected. From this it follows that the total amount of light
scattered by a blackbody is equal to the amount of light falling on its surface and absorbed
by it.
PROBLEMS
1. Calculate the Fraunhofer diffraction of a plane wave normally incident on an infinite slit
(of width 2a) with parallel sides cut in an opaque screen.
Solution: We choose the plane of the slit as the yz plane, with the z axis along the slit (Fig. 13
shows a section of the screen). For normally incident light, the plane of the slit is one of the wave
surfaces, and we choose it as the surface of integration in (61.1). Since the slit is infinitely long, the
light is deflected only in the xy plane [since the integral (61.1) becomes zero for $q_z \neq 0$].
Therefore the field should be expanded only in the y coordinate:
$$u_q = u_0\int_{-a}^{a} e^{-iqy}\,dy = \frac{2u_0}{q}\sin qa.$$
Fig. 13.
The intensity of the diffracted light in the angular range $d\theta$ is
$$dI = \frac{I_0}{2a}\left|\frac{u_q}{u_0}\right|^2\frac{dq}{2\pi} = \frac{I_0}{\pi ak}\frac{\sin^2 ka\theta}{\theta^2}\,d\theta,$$
where $k = \omega/c$, and $I_0$ is the total intensity of the light incident on the slit.
$dI/d\theta$ as a function of diffraction angle has the form shown in Fig. 14. As $\theta$ increases toward
either side from $\theta = 0$, the intensity goes through a series of maxima with rapidly decreasing
height. The successive maxima are separated by minima at the points $\theta = n\pi/ka$ (where n is a
nonzero integer); at the minima, the intensity falls to zero.
Fig. 14.
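As a rough check on the slit formula, the sketch below integrates $dI/d\theta$ numerically and evaluates it at the first minimum. The value of the product ka, the integration range and the step count are arbitrary illustrative choices; the integral over a finite range falls short of $I_0$ by the weight of the neglected tails.

```python
import math

def slit_pattern(theta, ka):
    """(1/I0) dI/dtheta for a slit of half-width a; ka is the product k*a."""
    if theta == 0.0:
        return ka / math.pi                     # limiting value at the center
    return math.sin(ka * theta) ** 2 / (math.pi * ka * theta ** 2)

KA = 50.0                 # illustrative value of k*a
T, STEPS = 2.0, 40000     # integrate over -T <= theta <= T with the midpoint rule
h = 2 * T / STEPS
total = sum(slit_pattern(-T + (j + 0.5) * h, KA) for j in range(STEPS)) * h
first_zero = math.pi / KA  # theta = n*pi/(k*a) with n = 1
print("integrated intensity / I0:", total)
print("pattern at first minimum :", slit_pattern(first_zero, KA))
```

The integrated intensity comes out close to unity, consistent with $I_0$ being the total intensity passing through the slit.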
2. Calculate the Fraunhofer diffraction by a diffraction grating: a plane screen in which are cut
a series of identical parallel slits (the width of the slits is 2a, the width of opaque screen between
neighboring slits is 2b, and the number of slits is N).
Solution: We choose the plane of the grating as the yz plane, with the z axis parallel to the slits.
Diffraction occurs only in the xy plane, and integration of (61.1) gives:
$$u_q = u'_q\sum_{n=0}^{N-1} e^{-2inqd} = u'_q\,\frac{1 - e^{-2iNqd}}{1 - e^{-2iqd}},$$
where d = a+b, and $u'_q$ is the result of the integration over a single slit. Using the results of problem
1, we get:
$$dI = \frac{I_0 a}{N\pi}\left(\frac{\sin Nqd}{\sin qd}\right)^2\left(\frac{\sin qa}{qa}\right)^2 dq = \frac{I_0}{N\pi ak}\left(\frac{\sin Nk\theta d}{\sin k\theta d}\right)^2\frac{\sin^2 ka\theta}{\theta^2}\,d\theta$$
($I_0$ is the total intensity of the light passing through all the slits).
For the case of a large number of slits ($N \to \infty$), this formula can be written in another form. For
values $q = \pi n/d$, where n is an integer, dI/dq has a maximum; near such a maximum (i.e. for
$qd = n\pi + \varepsilon$, with $\varepsilon$ small)
$$dI = I_0 a\left(\frac{\sin qa}{qa}\right)^2\frac{\sin^2 N\varepsilon}{\pi N\varepsilon^2}\,dq.$$
But for $N \to \infty$, we have the formula†
$$\lim_{N\to\infty}\frac{\sin^2 Nx}{\pi Nx^2} = \delta(x).$$
We therefore have, in the neighborhood of each maximum:
$$dI = I_0\frac{a}{d}\left(\frac{\sin qa}{qa}\right)^2\delta(\varepsilon)\,d\varepsilon,$$
i.e., in the limit the widths of the maxima are infinitely narrow and the total light intensity in the
n'th maximum is
$$I^{(n)} = I_0\frac{d}{\pi^2 a}\frac{\sin^2(n\pi a/d)}{n^2}.$$
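One consistency check on the last formula is that the intensities of all the principal maxima should add up to $I_0$. The sketch below sums $I^{(n)}/I_0$ over a large but finite range of orders; it takes the $n = 0$ term as the limit $a/d$ of the formula, and the values of a, d and the cutoff are illustrative assumptions. The sum is taken over all integer n, as the idealized small-angle formula permits.

```python
import math

def order_intensity(n, a, d):
    """I^(n)/I0 for the n-th principal maximum (slit half-width a, d = a + b)."""
    if n == 0:
        return a / d                             # limit of the formula as n -> 0
    return d * math.sin(n * math.pi * a / d) ** 2 / (math.pi ** 2 * a * n ** 2)

def total_intensity(a, d, n_max=200000):
    """Sum of I^(n)/I0 over -n_max <= n <= n_max, using the symmetry n -> -n."""
    side = sum(order_intensity(n, a, d) for n in range(1, n_max + 1))
    return order_intensity(0, a, d) + 2.0 * side

print(total_intensity(1.0, 3.0))   # illustrative grating with a/d = 1/3
```

The central maximum carries the fraction a/d of the light, and the remaining orders together carry the rest.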
3. Find the distribution of intensity over direction for the diffraction of light which is incident
normal to the plane of a circular aperture of radius a.
Solution: We introduce cylindrical coordinates z, r, $\phi$ with the z axis passing through the center
of the aperture and perpendicular to its plane. It is obvious that the diffraction is symmetric about
the z axis, so that the vector q has only a radial component $q_r = q = k\theta$. Measuring the angle $\phi$
from the direction q, and integrating in (61.1) over the plane of the aperture, we find:
$$u_q = u_0\int_0^a\int_0^{2\pi} e^{-iqr\cos\phi}\,r\,d\phi\,dr = 2\pi u_0\int_0^a J_0(qr)\,r\,dr,$$
where $J_0$ is the zeroth-order Bessel function. Using the well-known formula
$$\int_0^a J_0(qr)\,r\,dr = \frac{a}{q}J_1(aq),$$
we then have
$$u_q = \frac{2\pi u_0 a}{q}J_1(aq),$$
and according to (61.4) we obtain for the intensity of the light diffracted into the element of solid
angle do:
$$dI = I_0\frac{J_1^2(ak\theta)}{\pi\theta^2}\,do,$$
where $I_0$ is the total intensity of the light incident on the aperture.
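The Bessel-function identity used in this solution can be verified numerically without any special-function library. The sketch below assumes Bessel's integral representation $J_m(x) = \frac{1}{\pi}\int_0^{\pi}\cos(m\tau - x\sin\tau)\,d\tau$ for integer m; the aperture radius, the value of q and the grid sizes are illustrative.

```python
import math

def bessel_j(order, x, n=400):
    """J_order(x) for integer order, from Bessel's integral (midpoint rule)."""
    h = math.pi / n
    acc = 0.0
    for j in range(n):
        tau = (j + 0.5) * h
        acc += math.cos(order * tau - x * math.sin(tau))
    return acc * h / math.pi

def radial_integral(q, a, n=200):
    """Simpson's rule for the integral of J0(q*r)*r over 0 <= r <= a (n even)."""
    h = a / n
    total = bessel_j(0, q * a) * a               # the integrand vanishes at r = 0
    for j in range(1, n):
        r = j * h
        total += (4 if j % 2 else 2) * bessel_j(0, q * r) * r
    return total * h / 3

A, Q = 1.0, 7.3            # illustrative aperture radius and transverse wave number
lhs = radial_integral(Q, A)
rhs = A / Q * bessel_j(1, Q * A)
print(lhs, rhs)
```

The two printed numbers are the left and right sides of the identity $\int_0^a J_0(qr)\,r\,dr = (a/q)J_1(aq)$.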
† For $x \neq 0$ the function on the left side of the equation is zero, while according to a well-known formula
of the theory of Fourier series,
$$\lim_{N\to\infty}\left(\frac{1}{\pi}\int_{-a}^{a} f(x)\frac{\sin^2 Nx}{Nx^2}\,dx\right) = f(0).$$
From this we see that the properties of this function actually coincide with those of the $\delta$-function (see the
footnote on p. 7).
CHAPTER 8
THE FIELD OF MOVING CHARGES
§ 62. The retarded potentials
In Chapter 5 we studied the constant field, produced by charges at rest, and in Chapter 6,
the variable field in the absence of charges. Now we take up the study of varying fields in the
presence of arbitrarily moving charges.
We derive equations determining the potentials for arbitrarily moving charges. This
derivation is most conveniently done in four-dimensional form, repeating the derivation at
the end of § 46, with the one change that we use the second pair of Maxwell equations in the
form (30.2)
$$\frac{\partial F^{ik}}{\partial x^k} = -\frac{4\pi}{c}j^i.$$
The same right-hand side also appears in (46.8), and after imposing the Lorentz condition
$$\frac{\partial A^i}{\partial x^i} = 0, \quad \text{i.e.} \quad \frac{1}{c}\frac{\partial\phi}{\partial t} + \operatorname{div}\mathbf{A} = 0, \qquad (62.1)$$
on the potentials, we get
$$\frac{\partial^2 A^i}{\partial x_k\,\partial x^k} = \frac{4\pi}{c}j^i. \qquad (62.2)$$
This is the equation which determines the potentials of an arbitrary electromagnetic field.
In three-dimensional form it is written as two equations, for A and for $\phi$:
$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\mathbf{j}, \qquad (62.3)$$
$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho. \qquad (62.4)$$
For constant fields, these reduce to the already familiar equations (36.4) and (43.4), and for
variable fields without charges, to the homogeneous wave equation.
As we know, the solution of the inhomogeneous linear equations (62.3) and (62.4) can be
represented as the sum of the solution of these equations without the right-hand side, and a
particular integral of these equations with the right-hand side. To find the particular solution,
we divide the whole space into infinitely small regions and determine the field produced by
the charges located in one of these volume elements. Because of the linearity of the field
equations, the actual field will be the sum of the fields produced by all such elements.
The charge de in a given volume element is, generally speaking, a function of the time.
If we choose the origin of coordinates in the volume element under consideration, then the
charge density is $\rho = de(t)\,\delta(\mathbf{R})$, where R is the distance from the origin. Thus we must
solve the equation
$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\,de(t)\,\delta(\mathbf{R}). \qquad (62.5)$$
Everywhere, except at the origin, $\delta(\mathbf{R}) = 0$, and we have the equation
$$\Delta\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0. \qquad (62.6)$$
It is clear that in the case we are considering $\phi$ has central symmetry, i.e., $\phi$ is a function
only of R. Therefore if we write the Laplace operator in spherical coordinates, (62.6) reduces
to
$$\frac{1}{R^2}\frac{\partial}{\partial R}\left(R^2\frac{\partial\phi}{\partial R}\right) - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0.$$
To solve this equation, we make the substitution $\phi = \chi(R, t)/R$. Then, we find for $\chi$
$$\frac{\partial^2\chi}{\partial R^2} - \frac{1}{c^2}\frac{\partial^2\chi}{\partial t^2} = 0.$$
But this is the equation of plane waves, whose solution has the form (see § 47):
$$\chi = f_1\left(t - \frac{R}{c}\right) + f_2\left(t + \frac{R}{c}\right).$$
Since we only want a particular solution of the equation, it is sufficient to choose only one
of the functions $f_1$ and $f_2$. Usually it turns out to be convenient to take $f_2 = 0$ (concerning
this, see below). Then, everywhere except at the origin, $\phi$ has the form
$$\phi = \frac{\chi\left(t - \frac{R}{c}\right)}{R}. \qquad (62.7)$$
So far the function $\chi$ is arbitrary; we now choose it so that we also obtain the correct
value for the potential at the origin. In other words, we must select $\chi$ so that at the origin
equation (62.5) is satisfied. This is easily done noting that as $R \to 0$, the potential increases
to infinity, and therefore its derivatives with respect to the coordinates increase more rapidly
than its time derivative. Consequently as $R \to 0$, we can, in equation (62.5), neglect the term
$(1/c^2)(\partial^2\phi/\partial t^2)$ compared with $\Delta\phi$. Then (62.5) goes over into the familiar equation (36.9)
leading to the Coulomb law. Thus, near the origin, (62.7) must go over into the Coulomb
law, from which it follows that $\chi(t) = de(t)$, that is,
$$\phi = \frac{de\left(t - \frac{R}{c}\right)}{R}.$$
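That a function of the form (62.7) satisfies the radial wave equation away from the origin can be checked by finite differences. In the sketch below the Gaussian time dependence `pulse`, the units with c = 1, and the sample point are purely illustrative assumptions; for contrast, an unretarded potential with the same time dependence is tested as well.

```python
import math

C = 1.0   # speed of light in illustrative units

def pulse(s):
    """Hypothetical time dependence de(s); any smooth function would serve."""
    return math.exp(-s * s)

def retarded(R, t):
    """Candidate solution of the form (62.7): chi(t - R/c) / R."""
    return pulse(t - R / C) / R

def instantaneous(R, t):
    """Unretarded potential de(t)/R, for comparison."""
    return pulse(t) / R

def wave_residual(field, R, t, h=1e-3):
    """Finite-difference value of (1/R^2) d/dR (R^2 dphi/dR) - (1/c^2) d2phi/dt2."""
    flux_out = (R + h / 2) ** 2 * (field(R + h, t) - field(R, t)) / h
    flux_in = (R - h / 2) ** 2 * (field(R, t) - field(R - h, t)) / h
    radial = (flux_out - flux_in) / (h * R * R)
    d2t = (field(R, t + h) - 2 * field(R, t) + field(R, t - h)) / (h * h)
    return radial - d2t / C ** 2

print("retarded     :", wave_residual(retarded, 2.0, 2.5))
print("instantaneous:", wave_residual(instantaneous, 2.0, 0.0))
```

The residual of the retarded form is at the level of the discretization error, while the unretarded form leaves a residual of order unity.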
From this it is easy to get to the solution of equation (62.4) for an arbitrary distribution of
charges $\rho(x, y, z, t)$. To do this, it is sufficient to write $de = \rho\,dV$ (dV is the volume element)
and integrate over the whole space. To this solution of the inhomogeneous equation (62.4)
we can still add the solution $\phi_0$ of the same equation without the right-hand side. Thus,
the general solution has the form:
$$\phi(\mathbf{r}, t) = \int\frac{1}{R}\,\rho\left(\mathbf{r}', t - \frac{R}{c}\right)dV' + \phi_0, \qquad (62.8)$$
$$R = |\mathbf{r} - \mathbf{r}'|, \qquad dV' = dx'\,dy'\,dz',$$
where
$$\mathbf{r} = (x, y, z), \qquad \mathbf{r}' = (x', y', z');$$
R is the distance from the volume element dV to the "field point" at which we determine the
potential. We shall write this expression briefly as
$$\phi = \int\frac{\rho_{t-R/c}}{R}\,dV + \phi_0, \qquad (62.9)$$
where the subscript means that the quantity $\rho$ is to be taken at the time $t - (R/c)$, and the
prime on dV has been omitted.
Similarly we have for the vector potential:
$$\mathbf{A} = \frac{1}{c}\int\frac{\mathbf{j}_{t-R/c}}{R}\,dV + \mathbf{A}_0, \qquad (62.10)$$
where $\mathbf{A}_0$ is the solution of equation (62.3) without the right-hand term.
The potentials (62.9) and (62.10) (without $\phi_0$ and $\mathbf{A}_0$) are called the retarded potentials.
In case the charges are at rest (i.e. density $\rho$ independent of the time), formula (62.9) goes
over into the well-known formula (36.8) for the electrostatic field; for the case of stationary
motion of the charges, formula (62.10), after averaging, goes over into formula (43.5) for the
vector potential of a constant magnetic field.
The quantities $\mathbf{A}_0$ and $\phi_0$ in (62.9) and (62.10) are to be determined so that the conditions
of the problem are fulfilled. To do this it is clearly sufficient to impose initial conditions, that
is, to fix the values of the field at the initial time. However we do not usually have to deal
with such initial conditions. Instead we are usually given conditions at large distances from
the system of charges throughout all of time. Thus, we may be told that radiation is incident
on the system from outside. Corresponding to this, the field which is developed as a result
of the interaction of this radiation with the system can differ from the external field only by
the radiation originating from the system. This radiation emitted by the system must, at large
distances, have the form of waves spreading out from the system, that is, in the direction of
increasing R. But precisely this condition is satisfied by the retarded potentials. Thus these
solutions represent the field produced by the system, while $\phi_0$ and $\mathbf{A}_0$ must be set equal to
the external field acting on the system.
§ 63. The Lienard-Wiechert potentials
Let us determine the potentials for the field produced by a charge carrying out an assigned
motion along a trajectory $\mathbf{r} = \mathbf{r}_0(t)$.
According to the formulas for the retarded potentials, the field at the point of observation
P(x, y, z) at time t is determined by the state of motion of the charge at the earlier time t',
for which the time of propagation of the light signal from the point $\mathbf{r}_0(t')$, where the charge
was located, to the field point P just coincides with the difference t − t'. Let $\mathbf{R}(t) = \mathbf{r} - \mathbf{r}_0(t)$
be the radius vector from the charge e to the point P; like $\mathbf{r}_0(t)$ it is a given function of the
time. Then the time t' is determined by the equation
$$t' + \frac{R(t')}{c} = t. \qquad (63.1)$$
For each value of t this equation has just one root t'.†
In the system of reference in which the particle is at rest at time t', the potential at the
point of observation at time t is just the Coulomb potential,
$$\phi = \frac{e}{R(t')}, \qquad \mathbf{A} = 0. \qquad (63.2)$$
The expressions for the potentials in an arbitrary reference system can be found directly
by finding a four-vector which for v = 0 coincides with the expressions just given for $\phi$ and
A. Noting that, according to (63.1), $\phi$ in (63.2) can also be written in the form
$$\phi = \frac{e}{c(t - t')},$$
we find that the required four-vector is:
$$A^i = e\frac{u^i}{R_k u^k}, \qquad (63.3)$$
where $u^k$ is the four-velocity of the charge, $R^k = [c(t - t'), \mathbf{r} - \mathbf{r}']$, where x', y', z', t' are
related by the equation (63.1), which in four-dimensional form is
$$R_k R^k = 0. \qquad (63.4)$$
Now once more transforming to three-dimensional notation, we obtain, for the potentials
of the field produced by an arbitrarily moving point charge, the following expressions:
$$\phi = \frac{e}{R - \frac{\mathbf{v}\cdot\mathbf{R}}{c}}, \qquad \mathbf{A} = \frac{e\mathbf{v}}{c\left(R - \frac{\mathbf{v}\cdot\mathbf{R}}{c}\right)}, \qquad (63.5)$$
where R is the radius vector, taken from the point where the charge is located to the point
of observation P, and all the quantities on the right sides of the equations must be evaluated
at the time t', determined from (63.1). The potentials of the field, in the form (63.5), are
called the Lienard-Wiechert potentials.
To calculate the intensities of the electric and magnetic fields from the formulas
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \operatorname{grad}\phi, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A},$$
we must differentiate $\phi$ and A with respect to the coordinates x, y, z of the point, and the
time t of observation. But the formulas (63.5) express the potentials as functions of t', and
only through the relation (63.1) as implicit functions of x, y, z, t. Therefore to calculate the
† This point is obvious but it can be verified directly. To do this we choose the field point P and the time
of observation t as the origin O of the four-dimensional coordinate system and construct the light cone (§ 2)
with its vertex at O. The lower half of the cone, containing the absolute past (with respect to the event O), is
the geometrical locus of world points such that signals sent from them reach O. The points in which this
hypersurface intersects the world line of the charge are precisely the roots of (63.1). But since the velocity of a
particle is always less than the velocity of light, the inclination of its world line relative to the time axis is
everywhere less than the slope of the light cone. It then follows that the world line of the particle can intersect
the lower half of the light cone in only one point.
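required derivatives we must first calculate the derivatives of t'. Differentiating the relation
$R(t') = c(t - t')$ with respect to t, we get
$$\frac{\partial R}{\partial t} = \frac{\partial R}{\partial t'}\frac{\partial t'}{\partial t} = -\frac{\mathbf{R}\cdot\mathbf{v}}{R}\frac{\partial t'}{\partial t} = c\left(1 - \frac{\partial t'}{\partial t}\right).$$
(The value of $\partial R/\partial t'$ is obtained by differentiating the identity $R^2 = \mathbf{R}^2$ and substituting
$\partial\mathbf{R}(t')/\partial t' = -\mathbf{v}(t')$. The minus sign is present because R is the radius vector from the charge
e to the point P, and not the reverse.)
Thus,
$$\frac{\partial t'}{\partial t} = \frac{1}{1 - \frac{\mathbf{v}\cdot\mathbf{R}}{Rc}}. \qquad (63.6)$$
Similarly, differentiating the same relation with respect to the coordinates, we find
$$\operatorname{grad} t' = -\frac{1}{c}\operatorname{grad} R(t') = -\frac{1}{c}\left(\frac{\partial R}{\partial t'}\operatorname{grad} t' + \frac{\mathbf{R}}{R}\right),$$
so that
$$\operatorname{grad} t' = -\frac{\mathbf{R}}{c\left(R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right)}. \qquad (63.7)$$
With the aid of these formulas, there is no difficulty in carrying out the calculation of the
fields E and H. Omitting the intermediate calculations, we give the final results:
$$\mathbf{E} = e\frac{1 - \frac{v^2}{c^2}}{\left(R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right)^3}\left(\mathbf{R} - \frac{\mathbf{v}}{c}R\right) + \frac{e}{c^2\left(R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right)^3}\,\mathbf{R}\times\left\{\left(\mathbf{R} - \frac{\mathbf{v}}{c}R\right)\times\dot{\mathbf{v}}\right\}, \qquad (63.8)$$
$$\mathbf{H} = \frac{1}{R}\,\mathbf{R}\times\mathbf{E}. \qquad (63.9)$$
Here, $\dot{\mathbf{v}} = \partial\mathbf{v}/\partial t'$; all quantities on the right sides of the equations refer to the time t'. It is
interesting to note that the magnetic field turns out to be everywhere perpendicular to the
electric.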
The electric field (63.8) consists of two parts of different type. The first term depends only
on the velocity of the particle (and not on its acceleration) and varies at large distances like
1/R 2 . The second term depends on the acceleration, and for large R it varies like l/R. Later
(§ 66) we shall see that this latter term is related to the electromagnetic waves radiated by the
particle.
As for the first term, since it is independent of the acceleration it must correspond to the
field produced by a uniformly moving charge. In fact, for constant velocity the difference
R f i^ = R t ,v(*0
c
is the distance R t from the charge to the point of observation at precisely the moment of
observation. It is also easy to show directly that
R t ' c R t 'y=jRf^(^^t) 2 = R t Jl V ^m 2 e t ,
where $\theta_t$ is the angle between $\mathbf{R}_t$ and $\mathbf{v}$. Consequently the first term in (63.8) is identical
with the expression (38.8).
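The identity relating the retarded quantities to the instantaneous distance can be spot-checked numerically. The following is only an illustrative sketch, not part of the original text: the function names, the bisection solver for the retardation condition (63.1), and the sample values are all assumptions made for the example, and the charge is taken to move uniformly along x0 + v t.

```python
import math

def retarded_time(t, point, x0, v, c=1.0):
    """Solve R(t') = c (t - t') by bisection for a charge at x0 + v t' (|v| < c)."""
    def g(tp):
        pos = [x0[i] + v[i] * tp for i in range(3)]
        return math.dist(point, pos) - c * (t - tp)
    lo, hi = t - 1.0e6, t  # g(lo) < 0 <= g(hi); g is monotonic because |v| < c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def both_sides(t, point, x0, v, c=1.0):
    """Return (R_t' - R_t'.v/c, R_t sqrt(1 - v^2/c^2 sin^2 theta_t))."""
    tp = retarded_time(t, point, x0, v, c)
    r_ret = [point[i] - (x0[i] + v[i] * tp) for i in range(3)]  # R at time t'
    r_now = [point[i] - (x0[i] + v[i] * t) for i in range(3)]   # R at time t
    len_ret = math.hypot(*r_ret)
    len_now = math.hypot(*r_now)
    v2 = sum(a * a for a in v)
    lhs = len_ret - sum(a * b for a, b in zip(r_ret, v)) / c
    cos_t = sum(a * b for a, b in zip(r_now, v)) / (len_now * math.sqrt(v2))
    rhs = len_now * math.sqrt(1.0 - v2 / c**2 * (1.0 - cos_t**2))
    return lhs, rhs

lhs, rhs = both_sides(0.0, (1.0, 2.0, 0.5), (0.0, 0.0, 0.0), (0.6, 0.0, 0.0))
print(lhs, rhs)
```

In units with c = 1, the two printed numbers should agree to rounding error for any field point and any speed below c.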
PROBLEM
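Derive the Lienard-Wiechert potentials by integrating (62.9)-(62.10).
Solution: We write formula (62.8) in the form:
$$\phi(\mathbf{r}, t) = \int\!\!\int \frac{\rho(\mathbf{r}', \tau)}{|\mathbf{r} - \mathbf{r}'|}\,\delta\!\left(\tau - t + \frac{1}{c}|\mathbf{r} - \mathbf{r}'|\right) d\tau\,dV'$$
(and similarly for $\mathbf{A}(\mathbf{r}, t)$), introducing the additional delta function and thus eliminating the
implicit arguments in the function $\rho$. For a point charge, moving in a trajectory $\mathbf{r} = \mathbf{r}_0(t)$, we have:
$$\rho(\mathbf{r}', \tau) = e\,\delta[\mathbf{r}' - \mathbf{r}_0(\tau)].$$
Substituting this expression and integrating over $dV'$, we get:
$$\phi(\mathbf{r}, t) = e\int \frac{d\tau}{|\mathbf{r} - \mathbf{r}_0(\tau)|}\,\delta\!\left[\tau - t + \frac{1}{c}|\mathbf{r} - \mathbf{r}_0(\tau)|\right].$$
The $\tau$ integration is done using the formula
$$\delta[F(\tau)] = \frac{\delta(\tau - t')}{F'(t')}$$
[where $t'$ is the root of $F(t') = 0$], and gives formula (63.5).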
§ 64. Spectral resolution of the retarded potentials
The field produced by moving charges can be expanded into monochromatic waves. The
potentials of the different monochromatic components of the field have the form $\phi_\omega e^{-i\omega t}$,
$\mathbf{A}_\omega e^{-i\omega t}$. The charge and current densities of the system of charges producing the field can
also be expanded in a Fourier series or integral. It is clear that each Fourier component of $\rho$
and $\mathbf{j}$ is responsible for the creation of the corresponding monochromatic component of the
field.
In order to express the Fourier components of the field in terms of the Fourier components
of the charge density and current, we substitute in (62.9) for $\phi$ and $\rho$ respectively, $\phi_\omega e^{-i\omega t}$
and $\rho_\omega e^{-i\omega t}$. We then obtain
$$\phi_\omega e^{-i\omega t} = \int \rho_\omega\,\frac{e^{-i\omega(t - R/c)}}{R}\,dV.$$
Factoring $e^{-i\omega t}$ and introducing the absolute value of the wave vector $k = \omega/c$, we have:
$$\phi_\omega = \int \rho_\omega\,\frac{e^{ikR}}{R}\,dV. \tag{64.1}$$
Similarly, for $\mathbf{A}_\omega$ we get
$$\mathbf{A}_\omega = \int \mathbf{j}_\omega\,\frac{e^{ikR}}{cR}\,dV. \tag{64.2}$$
We note that formula (64.1) represents a generalization of the solution of the Poisson
equation to a more general equation of the form
$$\Delta\phi_\omega + k^2\phi_\omega = -4\pi\rho_\omega \tag{64.3}$$
(obtained from equations (62.4) for $\rho$, $\phi$ depending on the time through the factor $e^{-i\omega t}$).
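Formula (64.1) superposes the elementary solutions $e^{ikR}/R$, each of which satisfies the homogeneous form of (64.3) away from the source point. As a hedged illustration (the helper name and the step size are assumptions for this sketch, and the Laplacian of a spherically symmetric function is written as $(1/R)\,d^2(R\phi)/dR^2$), this can be checked by finite differences:

```python
import cmath

def helmholtz_residual(k, R, h=1e-3):
    """Finite-difference value of (Laplacian + k^2) applied to exp(ikR)/R at R > 0."""
    def u(r):
        # u = R * phi, so the radial Laplacian of phi is u'' / R
        return cmath.exp(1j * k * r)
    d2u = (u(R + h) - 2.0 * u(R) + u(R - h)) / h**2
    phi = u(R) / R
    return d2u / R + k**2 * phi

print(abs(helmholtz_residual(2.0, 1.5)))
```

The printed residual is small compared with $k^2|\phi|$ and shrinks as the step h is reduced, until rounding error takes over.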
If we were dealing with expansion into a Fourier integral, then the Fourier components
of the charge density would be
$$\rho_\omega = \int_{-\infty}^{+\infty} \rho\,e^{i\omega t}\,dt.$$
Substituting this expression in (64.1), we get
$$\phi_\omega = \int\!\!\int_{-\infty}^{+\infty} \frac{\rho}{R}\,e^{i(\omega t + kR)}\,dV\,dt. \tag{64.4}$$
We must still go over from the continuous distribution of charge density to the point charges
whose motion we are actually considering. Thus, if there is just one point charge, we set
$$\rho = e\,\delta[\mathbf{r} - \mathbf{r}_0(t)],$$
where $\mathbf{r}_0(t)$ is the radius vector of the charge, and is a given function of the time. Substituting
this expression in (64.4) and carrying out the space integration [which reduces to replacing
$\mathbf{r}$ by $\mathbf{r}_0(t)$], we get:
$$\phi_\omega = e\int_{-\infty}^{+\infty} \frac{1}{R(t)}\,e^{i\omega[t + R(t)/c]}\,dt, \tag{64.5}$$
where now $R(t)$ is the distance from the moving particle to the point of observation.
Similarly we find for the vector potential:
$$\mathbf{A}_\omega = \frac{e}{c}\int_{-\infty}^{+\infty} \frac{\mathbf{v}(t)}{R(t)}\,e^{i\omega[t + R(t)/c]}\,dt, \tag{64.6}$$
where $\mathbf{v} = \dot{\mathbf{r}}_0(t)$ is the velocity of the particle.
Formulas analogous to (64.5), (64.6) can also be written for the case where the spectral
resolution of the charge and current densities contains a discrete series of frequencies. Thus,
for a periodic motion of a point charge (with period $T = 2\pi/\omega_0$) the spectral resolution of the
field contains only frequencies of the form $n\omega_0$, and the corresponding components of the
vector potential are
$$\mathbf{A}_n = \frac{e}{cT}\int_0^T \frac{\mathbf{v}(t)}{R(t)}\,e^{in\omega_0[t + R(t)/c]}\,dt \tag{64.7}$$
(and similarly for $\phi_n$). In both (64.6) and (64.7) the Fourier components are defined in
accordance with § 49.
PROBLEM
Find the expansion in plane waves of the field of a charge in uniform rectilinear motion.
Solution: We proceed in similar fashion to that used in § 51. We write the charge density in the
form $\rho = e\,\delta(\mathbf{r} - \mathbf{v}t)$, where $\mathbf{v}$ is the velocity of the particle. Taking Fourier components of the
equation $\Box\phi = -4\pi e\,\delta(\mathbf{r} - \mathbf{v}t)$, we find $(\Box\phi)_{\mathbf{k}} = -4\pi e\,e^{-i(\mathbf{v}\cdot\mathbf{k})t}$.
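On the other hand, from
$$\phi = \int e^{i\mathbf{k}\cdot\mathbf{r}}\,\phi_{\mathbf{k}}\,\frac{d^3k}{(2\pi)^3}$$
we have
$$(\Box\phi)_{\mathbf{k}} = -k^2\phi_{\mathbf{k}} - \frac{1}{c^2}\frac{\partial^2\phi_{\mathbf{k}}}{\partial t^2}.$$
Thus,
$$\frac{1}{c^2}\frac{\partial^2\phi_{\mathbf{k}}}{\partial t^2} + k^2\phi_{\mathbf{k}} = 4\pi e\,e^{-i(\mathbf{k}\cdot\mathbf{v})t},$$
from which, finally
$$\phi_{\mathbf{k}} = 4\pi e\,\frac{e^{-i(\mathbf{k}\cdot\mathbf{v})t}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}.$$
From this it follows that the wave with wave vector $\mathbf{k}$ has the frequency $\omega = \mathbf{k}\cdot\mathbf{v}$. Similarly, we
obtain for the vector potential,
$$\mathbf{A}_{\mathbf{k}} = \frac{4\pi e}{c}\,\frac{\mathbf{v}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}.$$
Finally, we have for the fields,
$$\mathbf{E}_{\mathbf{k}} = -i\mathbf{k}\phi_{\mathbf{k}} + i\,\frac{\mathbf{k}\cdot\mathbf{v}}{c}\,\mathbf{A}_{\mathbf{k}} = i\,4\pi e\,\frac{-\mathbf{k} + \dfrac{(\mathbf{k}\cdot\mathbf{v})}{c^2}\mathbf{v}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t},$$
$$\mathbf{H}_{\mathbf{k}} = i\mathbf{k}\times\mathbf{A}_{\mathbf{k}} = i\,\frac{4\pi e}{c}\,\frac{\mathbf{k}\times\mathbf{v}}{k^2 - \left(\dfrac{\mathbf{k}\cdot\mathbf{v}}{c}\right)^2}\,e^{-i(\mathbf{k}\cdot\mathbf{v})t}.$$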
§ 65. The Lagrangian to terms of second order
In ordinary classical mechanics, we can describe a system of particles interacting with
each other with the aid of a Lagrangian which depends only on the coordinates and velocities
of these particles (at one and the same time). The possibility of doing this is, in the last
analysis, dependent on the fact that in mechanics the velocity of propagation of interactions
is assumed to be infinite.
We already know that because of the finite velocity of propagation, the field must be
considered as an independent system with its own "degrees of freedom". From this it follows
that if we have a system of interacting particles (charges), then to describe it we must consider
the system consisting of these particles and the field. Therefore, when we take into account
the finite velocity of propagation of interactions, it is impossible to describe the system of
interacting particles rigorously with the aid of a Lagrangian, depending only on the
coordinates and velocities of the particles and containing no quantities related to the internal
"degrees of freedom" of the field.
However, if the velocity v of all the particles is small compared with the velocity of light,
then the system can be described by a certain approximate Lagrangian. It turns out to be
possible to introduce a Lagrangian describing the system, not only when all powers of v/c
are neglected (classical Lagrangian), but also to terms of second order, $v^2/c^2$. This last
remark is related to the fact that the radiation of electromagnetic waves by moving charges
(and consequently, the appearance of a "self"-field) occurs only in the third approximation
in v/c (see later, in § 67).†
As a preliminary, we note that in zero'th approximation, that is, when we completely
neglect the retardation of the potentials, the Lagrangian for a system of charges has the form
$$L^{(0)} = \sum_a \tfrac{1}{2} m_a v_a^2 - \sum_{a>b} \frac{e_a e_b}{R_{ab}} \tag{65.1}$$
(the summation extends over the charges which make up the system). The second term is the
potential energy of interaction as it would be for charges at rest.
To get the next approximation, we proceed in the following fashion. The Lagrangian for a
charge $e_a$ in an external field is
$$L_a = -m_a c^2\sqrt{1 - \frac{v_a^2}{c^2}} - e_a\phi + \frac{e_a}{c}\,\mathbf{A}\cdot\mathbf{v}_a. \tag{65.2}$$
Choosing any one of the charges of the system, we determine the potentials of the field
produced by all the other charges at the position of the first, and express them in terms of the
coordinates and velocities of the charges which produce this field (this can be done only
approximately: for $\phi$, to terms of order $v^2/c^2$, and for $\mathbf{A}$, to terms in v/c). Substituting the
expressions for the potentials obtained in this way in (65.2), we get the Lagrangian for one of
the charges of the system (for a given motion of the other charges). From this, one can then
easily find the Lagrangian for the whole system.
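We start from the expressions for the retarded potentials
$$\phi = \int \frac{\rho_{t - R/c}}{R}\,dV, \qquad \mathbf{A} = \frac{1}{c}\int \frac{\mathbf{j}_{t - R/c}}{R}\,dV.$$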
If the velocities of all the charges are small compared with the velocity of light, then the
charge distribution does not change significantly during the time R/c. Therefore we can
expand $\rho_{t - R/c}$ and $\mathbf{j}_{t - R/c}$ in series of powers of R/c. For the scalar potential we thus find, to
terms of second order:
$$\phi = \int \frac{\rho\,dV}{R} - \frac{1}{c}\frac{\partial}{\partial t}\int \rho\,dV + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int R\rho\,dV$$
($\rho$ without indices is the value of $\rho$ at time t; the time differentiations can clearly be taken
out from under the integral sign). But $\int\rho\,dV$ is the constant total charge of the system.
Therefore the second term in our expression is zero, so that
$$\phi = \int \frac{\rho\,dV}{R} + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int R\rho\,dV. \tag{65.3}$$
We can proceed similarly with $\mathbf{A}$. But the expression for the vector potential in terms of
the current density already contains 1/c, and when substituted in the Lagrangian is multiplied
once more by 1/c. Since we are looking for a Lagrangian which is correct only to terms of
second order, we can limit ourselves to the first term in the expansion of $\mathbf{A}$, that is,
$$\mathbf{A} = \frac{1}{c}\int \frac{\rho\mathbf{v}}{R}\,dV \tag{65.4}$$
(we have substituted $\mathbf{j} = \rho\mathbf{v}$).
† In special cases the appearance of the radiation terms can even be put off until the fifth approximation
in v/c; in this case a Lagrangian even exists up to terms of order $(v/c)^4$. (See problem 2 of § 75.)
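Let us first assume that there is only a single point charge e. Then we obtain from (65.3)
and (65.4),
$$\phi = \frac{e}{R} + \frac{e}{2c^2}\frac{\partial^2 R}{\partial t^2}, \qquad \mathbf{A} = \frac{e\mathbf{v}}{cR}, \tag{65.5}$$
where R is the distance from the charge.
We choose in place of $\phi$ and $\mathbf{A}$ other potentials $\phi'$ and $\mathbf{A}'$, making the transformation (see
§ 18):
$$\phi' = \phi - \frac{1}{c}\frac{\partial f}{\partial t}, \qquad \mathbf{A}' = \mathbf{A} + \operatorname{grad} f,$$
in which we choose for f the function
$$f = \frac{e}{2c}\frac{\partial R}{\partial t}.$$
Then we get†
$$\phi' = \frac{e}{R}, \qquad \mathbf{A}' = \frac{e\mathbf{v}}{cR} + \frac{e}{2c}\nabla\frac{\partial R}{\partial t}.$$
To calculate $\mathbf{A}'$ we note first of all that $\nabla(\partial R/\partial t) = (\partial/\partial t)\nabla R$. The grad operator here
means differentiation with respect to the coordinates of the field point at which we seek the
value of $\mathbf{A}'$. Therefore $\nabla R$ is the unit vector $\mathbf{n}$, directed from the charge e to the field point,
so that
$$\mathbf{A}' = \frac{e\mathbf{v}}{cR} + \frac{e}{2c}\,\dot{\mathbf{n}}.$$
We also write:
$$\dot{\mathbf{n}} = \frac{\partial}{\partial t}\left(\frac{\mathbf{R}}{R}\right) = \frac{\dot{\mathbf{R}}}{R} - \frac{\mathbf{R}\dot{R}}{R^2}.$$
But the derivative $-\dot{\mathbf{R}}$ for a given field point is the velocity $\mathbf{v}$ of the charge, and the derivative
$\dot{R}$ is easily determined by differentiating $R^2 = \mathbf{R}^2$, that is, by writing
$$R\dot{R} = \mathbf{R}\cdot\dot{\mathbf{R}} = -\mathbf{R}\cdot\mathbf{v}.$$
Thus,
$$\dot{\mathbf{n}} = \frac{-\mathbf{v} + \mathbf{n}(\mathbf{n}\cdot\mathbf{v})}{R}.$$
Substituting this in the expression for $\mathbf{A}'$, we get finally:
$$\phi' = \frac{e}{R}, \qquad \mathbf{A}' = \frac{e[\mathbf{v} + (\mathbf{v}\cdot\mathbf{n})\mathbf{n}]}{2cR}. \tag{65.6}$$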
If there are several charges then we must, clearly, sum these expressions over all the charges.
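The expression used above for the time derivative of the unit vector n, namely (-v + n(n·v))/R, can be compared with a direct finite-difference derivative. This is only a sketch: the helical trajectory, the field point, and the helper names are assumptions chosen for illustration.

```python
import math

def unit_vector(point, pos):
    """Return (n, R): unit vector from the charge at pos to the field point, and the distance."""
    sep = [p - q for p, q in zip(point, pos)]
    dist = math.sqrt(sum(x * x for x in sep))
    return [x / dist for x in sep], dist

def n_dot_comparison(t, point, traj, vel, h=1e-5):
    """Central-difference dn/dt versus the closed form (-v + n (n.v)) / R."""
    n_plus, _ = unit_vector(point, traj(t + h))
    n_minus, _ = unit_vector(point, traj(t - h))
    numeric = [(a - b) / (2.0 * h) for a, b in zip(n_plus, n_minus)]
    n, dist = unit_vector(point, traj(t))
    v = vel(t)
    nv = sum(a * b for a, b in zip(n, v))
    formula = [(-v[i] + n[i] * nv) / dist for i in range(3)]
    return numeric, formula

# Hypothetical trajectory r_0(t) of the charge and its velocity
traj = lambda t: (math.cos(t), math.sin(t), 0.1 * t)
vel = lambda t: (-math.sin(t), math.cos(t), 0.1)

numeric, formula = n_dot_comparison(0.4, (3.0, 1.0, 2.0), traj, vel)
print(numeric, formula)
```

The two printed vectors should agree component by component up to the finite-difference error.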
Substituting these expressions in (65.2), we obtain the Lagrangian $L_a$ for the charge $e_a$
(for a fixed motion of the other charges). In doing this we must also expand the first term in
(65.2) in powers of $v_a/c$, retaining terms up to the second order. Thus we find:
$$L_a = \frac{m_a v_a^2}{2} + \frac{1}{8}\frac{m_a v_a^4}{c^2} - e_a{\sum_b}'\frac{e_b}{R_{ab}} + \frac{e_a}{2c^2}{\sum_b}'\frac{e_b}{R_{ab}}\left[\mathbf{v}_a\cdot\mathbf{v}_b + (\mathbf{v}_a\cdot\mathbf{n}_{ab})(\mathbf{v}_b\cdot\mathbf{n}_{ab})\right]$$
(the summation goes over all the charges except $e_a$; $\mathbf{n}_{ab}$ is the unit vector from $e_b$ to $e_a$).
† These potentials no longer satisfy the Lorentz condition (62.1), nor the equations (62.3)-(62.4).
From this, it is no longer difficult to get the Lagrangian for the whole system. It is easy
to convince oneself that this function is not the sum of the $L_a$ for all the charges, but has the
form
$$L = \sum_a \frac{m_a v_a^2}{2} + \sum_a \frac{m_a v_a^4}{8c^2} - \sum_{a>b}\frac{e_a e_b}{R_{ab}} + \sum_{a>b}\frac{e_a e_b}{2c^2 R_{ab}}\left[\mathbf{v}_a\cdot\mathbf{v}_b + (\mathbf{v}_a\cdot\mathbf{n}_{ab})(\mathbf{v}_b\cdot\mathbf{n}_{ab})\right]. \tag{65.7}$$
Actually, for each of the charges under a given motion of all the others, this function L
goes over into $L_a$ as given above. The expression (65.7) determines the Lagrangian of a
system of charges correctly to terms of second order. (It was first obtained by Darwin, 1920.)
Finally we find the Hamiltonian of a system of charges in this same approximation. This
could be done by the general rule for calculating $\mathscr{H}$ from L; however it is simpler to proceed
as follows. The second and fourth terms in (65.7) are small corrections to $L^{(0)}$ (65.1). On the
other hand, we know from mechanics that for small changes of L and $\mathscr{H}$, the additions to
them are equal in magnitude and opposite in sign (here the variations of L are considered
for constant coordinates and velocities, while the changes in $\mathscr{H}$ refer to constant coordinates
and momenta).†
Therefore we can at once write $\mathscr{H}$, subtracting from
$$\mathscr{H}^{(0)} = \sum_a \frac{p_a^2}{2m_a} + \sum_{a>b}\frac{e_a e_b}{R_{ab}}$$
the second and fourth terms of (65.7), replacing the velocities in them by the first
approximation $\mathbf{v}_a = \mathbf{p}_a/m_a$. Thus,
$$\mathscr{H} = \sum_a \frac{p_a^2}{2m_a} - \sum_a \frac{p_a^4}{8c^2 m_a^3} + \sum_{a>b}\frac{e_a e_b}{R_{ab}} - \sum_{a>b}\frac{e_a e_b}{2c^2 m_a m_b R_{ab}}\left[\mathbf{p}_a\cdot\mathbf{p}_b + (\mathbf{p}_a\cdot\mathbf{n}_{ab})(\mathbf{p}_b\cdot\mathbf{n}_{ab})\right]. \tag{65.8}$$
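The shortcut used to pass from (65.7) to (65.8) can be checked against the general rule $\mathscr{H} = \sum \mathbf{p}\cdot\mathbf{v} - L$ for two charges. The sketch below is illustrative only: the function name, the units (c = 1) and the sample values are assumptions, and the momenta are obtained by differentiating (65.7) with respect to the velocities by hand. The two Hamiltonians should differ only by terms of fourth order in v/c.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def darwin_two_charges(m, e, r, v, c=1.0):
    """Return (H from the Legendre transform of (65.7), H from (65.8), H^(0))."""
    (m1, m2), (e1, e2), (r1, r2), (v1, v2) = m, e, r, v
    sep = [a - b for a, b in zip(r1, r2)]
    R = math.sqrt(dot(sep, sep))
    n = [s / R for s in sep]
    g = e1 * e2 / (2.0 * c**2 * R)
    # Lagrangian (65.7) for two charges
    L = (0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)
         + (m1 * dot(v1, v1)**2 + m2 * dot(v2, v2)**2) / (8.0 * c**2)
         - e1 * e2 / R
         + g * (dot(v1, v2) + dot(v1, n) * dot(v2, n)))
    # Momenta p_a = dL/dv_a, differentiated analytically
    def momentum(ma, va, vb):
        k = ma * (1.0 + dot(va, va) / (2.0 * c**2))
        return [k * va[i] + g * (vb[i] + n[i] * dot(vb, n)) for i in range(3)]
    p1, p2 = momentum(m1, v1, v2), momentum(m2, v2, v1)
    h_legendre = dot(p1, v1) + dot(p2, v2) - L
    h0 = dot(p1, p1) / (2.0 * m1) + dot(p2, p2) / (2.0 * m2) + e1 * e2 / R
    h_658 = (h0
             - dot(p1, p1)**2 / (8.0 * c**2 * m1**3)
             - dot(p2, p2)**2 / (8.0 * c**2 * m2**3)
             - e1 * e2 / (2.0 * c**2 * m1 * m2 * R)
             * (dot(p1, p2) + dot(p1, n) * dot(p2, n)))
    return h_legendre, h_658, h0

h_leg, h_658, h0 = darwin_two_charges(
    m=(1.0, 2.0), e=(0.01, -0.01),
    r=((0.0, 0.0, 0.0), (1.0, 0.5, 0.0)),
    v=((0.01, 0.02, 0.0), (-0.015, 0.005, 0.01)))
print(h_leg - h_658, h_leg - h0)
```

For velocities of order 0.01c the first printed difference should be several orders of magnitude smaller than the second, which measures the size of the second-order corrections themselves.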
PROBLEMS
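1. Determine (correctly to terms of second order) the center of inertia of a system of interacting
particles.
Solution: The problem is solved most simply by using the formula
$$\mathbf{R} = \frac{\sum_a \mathscr{E}_a\mathbf{r}_a + \int W\mathbf{r}\,dV}{\sum_a \mathscr{E}_a + \int W\,dV}$$
[see (14.6)], where $\mathscr{E}_a$ is the kinetic energy of the particle (including its rest energy), and W is the
energy density of the field produced by the particles. Since the $\mathscr{E}_a$ contain the large quantities
$m_a c^2$, it is sufficient, in obtaining the next approximation, to consider only those terms in $\mathscr{E}_a$ and
W which do not contain c, i.e. we need consider only the nonrelativistic kinetic energy of the particles
and the energy of the electrostatic field. We then have:
$$\int W\mathbf{r}\,dV = \frac{1}{8\pi}\int E^2\mathbf{r}\,dV = \frac{1}{8\pi}\int(\nabla\phi)^2\mathbf{r}\,dV = \frac{1}{8\pi}\oint\left(d\mathbf{f}\cdot\nabla\frac{\phi^2}{2}\right)\mathbf{r} - \frac{1}{8\pi}\int\nabla\frac{\phi^2}{2}\,dV - \frac{1}{8\pi}\int\phi\,\Delta\phi\,\mathbf{r}\,dV;$$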
† See Mechanics, § 40.
the integral over the infinitely distant surface vanishes; the second integral also is transformed
into a surface integral and vanishes, while we substitute $\Delta\phi = -4\pi\rho$ in the third integral and obtain:
$$\int W\mathbf{r}\,dV = \frac{1}{2}\int\rho\phi\,\mathbf{r}\,dV = \frac{1}{2}\sum_a e_a\phi_a\mathbf{r}_a,$$
where $\phi_a$ is the potential produced at the point $\mathbf{r}_a$ by all the charges other than $e_a$.†
Finally, we get:
$$\mathbf{R} = \frac{1}{\mathscr{E}}\sum_a \mathbf{r}_a\left(m_a c^2 + \frac{p_a^2}{2m_a} + \frac{e_a}{2}{\sum_b}'\frac{e_b}{R_{ab}}\right)$$
(with a summation over all b except b = a), where
$$\mathscr{E} = \sum_a\left(m_a c^2 + \frac{p_a^2}{2m_a}\right) + \sum_{a>b}\frac{e_a e_b}{R_{ab}}$$
is the total energy of the system. Thus in this approximation the coordinates of the center of inertia
can actually be expressed in terms of quantities referring only to the particles.
2. Write the Hamiltonian in second approximation for a system of two particles, omitting the
motion of the system as a whole.
Solution: We choose a system of reference in which the total momentum of the two particles is
zero. Expressing the momenta as derivatives of the action, we have
$$\mathbf{p}_1 + \mathbf{p}_2 = \partial S/\partial\mathbf{r}_1 + \partial S/\partial\mathbf{r}_2 = 0.$$
From this it is clear that in the reference system chosen the action is a function of $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$, the
difference of the radius vectors of the two particles. Therefore we have $\mathbf{p}_2 = -\mathbf{p}_1 = \mathbf{p}$, where
$\mathbf{p} = \partial S/\partial\mathbf{r}$ is the momentum of the relative motion of the particles. The Hamiltonian is
$$\mathscr{H} = \frac{1}{2}\left(\frac{1}{m_1} + \frac{1}{m_2}\right)p^2 - \frac{1}{8c^2}\left(\frac{1}{m_1^3} + \frac{1}{m_2^3}\right)p^4 + \frac{e_1 e_2}{r} + \frac{e_1 e_2}{2m_1 m_2 c^2 r}\left[p^2 + (\mathbf{p}\cdot\mathbf{n})^2\right].$$
† The elimination of the self-field of the particles corresponds to the mass "renormalization" mentioned
in the footnote on p. 90.
CHAPTER 9
RADIATION OF ELECTROMAGNETIC WAVES
§ 66. The field of a system of charges at large distances
We consider the field produced by a system of moving charges at distances large compared
with the dimensions of the system.
We choose the origin of coordinates O anywhere in the interior of the system of charges.
The radius vector from O to the point P, where we determine the field, we denote by $\mathbf{R}_0$,
and the unit vector in this direction by $\mathbf{n}$. Let the radius vector of the charge element
$de = \rho\,dV$ be $\mathbf{r}$, and the radius vector from de to the point P be $\mathbf{R}$. Obviously $\mathbf{R} = \mathbf{R}_0 - \mathbf{r}$.
At large distances from the system of charges, $R_0 \gg r$, and we have approximately,
$$R = |\mathbf{R}_0 - \mathbf{r}| \approx R_0 - \mathbf{r}\cdot\mathbf{n}.$$
We substitute this in formulas (62.9), (62.10) for the retarded potentials. In the denominator
of the integrands we can neglect $\mathbf{r}\cdot\mathbf{n}$ compared with $R_0$. In $t - (R/c)$, however, this is
generally not possible; whether it is possible to neglect these terms is determined not by the
relative values of $R_0/c$ and $\mathbf{r}\cdot(\mathbf{n}/c)$, but by how much the quantities $\rho$ and $\mathbf{j}$ change during
the time $\mathbf{r}\cdot(\mathbf{n}/c)$. Since $R_0$ is constant in the integration and can be taken out from under the
integral sign, we get for the potentials of the field at large distances from the system of
charges the expressions:
$$\phi = \frac{1}{R_0}\int \rho_{\,t - \frac{R_0}{c} + \frac{\mathbf{r}\cdot\mathbf{n}}{c}}\,dV, \tag{66.1}$$
$$\mathbf{A} = \frac{1}{cR_0}\int \mathbf{j}_{\,t - \frac{R_0}{c} + \frac{\mathbf{r}\cdot\mathbf{n}}{c}}\,dV. \tag{66.2}$$
At sufficiently large distances from the system of charges, the field over small regions of
space can be considered to be a plane wave. For this it is necessary that the distance be large
compared not only with the dimensions of the system, but also with the wavelength of the
electromagnetic waves radiated by the system. We refer to this region of space as the wave
zone of the radiation.
In a plane wave, the fields E and H are related to each other by (47.4), $\mathbf{E} = \mathbf{H}\times\mathbf{n}$. Since
$\mathbf{H} = \operatorname{curl}\mathbf{A}$, it is sufficient for a complete determination of the field in the wave zone to
calculate only the vector potential. In a plane wave we have $\mathbf{H} = (1/c)\dot{\mathbf{A}}\times\mathbf{n}$ [see (47.3)],
where the dot indicates differentiation with respect to time.† Thus, knowing $\mathbf{A}$, we find H
† In the present case, this formula is easily verified also by direct computation of the curl of the expression
(66.2), and dropping terms in $1/R_0^2$ in comparison with terms $\sim 1/R_0$.
and E from the formulas:†
$$\mathbf{H} = \frac{1}{c}\,\dot{\mathbf{A}}\times\mathbf{n}, \qquad \mathbf{E} = \frac{1}{c}\,(\dot{\mathbf{A}}\times\mathbf{n})\times\mathbf{n}. \tag{66.3}$$
We note that the field at large distances is inversely proportional to the first power of the
distance $R_0$ from the radiating system. We also note that the time t enters into the expressions
(66.1) to (66.3) always in the combination $t - (R_0/c)$.
For the radiation produced by a single arbitrarily moving point charge, it turns out to be
convenient to use the Lienard-Wiechert potentials. At large distances, we can replace the
radius vector $\mathbf{R}$ in formula (63.5) by the constant vector $\mathbf{R}_0$, and in the condition (63.1)
determining $t'$, we must set $R = R_0 - \mathbf{r}_0\cdot\mathbf{n}$ ($\mathbf{r}_0(t)$ is the radius vector of the charge). Thus,‡
$$\mathbf{A} = \frac{e\,\mathbf{v}(t')}{cR_0\left(1 - \dfrac{\mathbf{n}\cdot\mathbf{v}(t')}{c}\right)}, \tag{66.4}$$
where $t'$ is determined from the equality
$$t' - \frac{\mathbf{r}_0(t')}{c}\cdot\mathbf{n} = t - \frac{R_0}{c}. \tag{66.5}$$
The radiated electromagnetic waves carry off energy. The energy flux is given by the
Poynting vector which, for a plane wave, is
$$\mathbf{S} = c\,\frac{H^2}{4\pi}\,\mathbf{n}.$$
The intensity dI of radiation into the element of solid angle do is defined as the amount of
energy passing in unit time through the element $df = R_0^2\,do$ of the spherical surface with
center at the origin and radius $R_0$. This quantity is clearly equal to the energy flux density S
multiplied by df, i.e.
$$dI = c\,\frac{H^2}{4\pi}\,R_0^2\,do. \tag{66.6}$$
Since the field H is inversely proportional to $R_0$, we see that the amount of energy radiated
by the system in unit time into the element of solid angle do is the same for all distances (if
the values of $t - (R_0/c)$ are the same for them). This is, of course, as it should be, since the
energy radiated from the system spreads out with velocity c into the surrounding space, not
accumulating or disappearing anywhere.
We derive the formulas for the spectral resolution of the field of the waves radiated by the
system. These formulas can be obtained directly from those in § 64. Substituting in (64.2)
$R = R_0 - \mathbf{r}\cdot\mathbf{n}$ (in which we can set $R = R_0$ in the denominator of the integrand), we get for
the Fourier components of the vector potential:
$$\mathbf{A}_\omega = \frac{e^{ikR_0}}{cR_0}\int \mathbf{j}_\omega\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV \tag{66.7}$$
(where $\mathbf{k} = k\mathbf{n}$). The components $\mathbf{H}_\omega$ and $\mathbf{E}_\omega$ are determined using formula (66.3).
Substituting in it for H, E, A, respectively, $\mathbf{H}_\omega e^{-i\omega t}$, $\mathbf{E}_\omega e^{-i\omega t}$, $\mathbf{A}_\omega e^{-i\omega t}$, and then dividing by
† The formula $\mathbf{E} = -(1/c)\dot{\mathbf{A}}$ [see (47.3)] is here not applicable to the potentials $\phi$, $\mathbf{A}$, since they do not
satisfy the same auxiliary condition as was imposed on them in § 47.
‡ In formula (63.8) for the electric field, the present approximation corresponds to dropping the first
term in comparison with the second.
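$e^{-i\omega t}$, we find
$$\mathbf{H}_\omega = i\mathbf{k}\times\mathbf{A}_\omega, \qquad \mathbf{E}_\omega = \frac{ic}{\omega}\,(\mathbf{k}\times\mathbf{A}_\omega)\times\mathbf{k}. \tag{66.8}$$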
When speaking of the spectral distribution of the intensity of radiation, we must
distinguish between expansions in Fourier series and Fourier integrals. We deal with the
expansion into a Fourier integral in the case of the radiation accompanying the collision of charged
particles. In this case the quantity of interest is the total amount of energy radiated during the
time of the collision (and correspondingly lost by the colliding particles). Suppose $d\mathscr{E}_{\mathbf{n}\omega}$ is the
energy radiated into the element of solid angle do in the form of waves with frequencies in
the interval $d\omega$. According to the general formula (49.8), the part of the total radiation lying
in the frequency interval $d\omega/2\pi$ is obtained from the usual formula for the intensity by
replacing the square of the field by the square modulus of its Fourier component and
multiplying by 2. Therefore we have in place of (66.6):
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{c}{2\pi}\,|\mathbf{H}_\omega|^2 R_0^2\,do\,\frac{d\omega}{2\pi}. \tag{66.9}$$
If the charges carry out a periodic motion, then the radiation field must be expanded in a
Fourier series. According to the general formula (49.4) the intensities of the various
components of the Fourier resolution are obtained from the usual formula for the intensity by
replacing the field by the Fourier components and then multiplying by two. Thus the intensity
of the radiation into the element of solid angle do, with frequency $\omega = n\omega_0$ equals
$$dI_n = \frac{c}{2\pi}\,|\mathbf{H}_n|^2 R_0^2\,do. \tag{66.10}$$
Finally, we give the formulas for determining the Fourier components of the radiation
field directly from the given motion of the radiating charges. For the Fourier integral
expansion, we have:
$$\mathbf{j}_\omega = \int_{-\infty}^{+\infty} \mathbf{j}\,e^{i\omega t}\,dt.$$
Substituting this in (66.7) and changing from the continuous distribution of currents to a
point charge moving along a trajectory $\mathbf{r}_0 = \mathbf{r}_0(t)$ (see § 64), we obtain:
$$\mathbf{A}_\omega = \frac{e^{ikR_0}}{cR_0}\int_{-\infty}^{+\infty} e\,\mathbf{v}(t)\,e^{i[\omega t - \mathbf{k}\cdot\mathbf{r}_0(t)]}\,dt. \tag{66.11}$$
Since $\mathbf{v} = d\mathbf{r}_0/dt$, $\mathbf{v}\,dt = d\mathbf{r}_0$ and this formula can also be written in the form of a line
integral taken along the trajectory of the charge:
$$\mathbf{A}_\omega = e\,\frac{e^{ikR_0}}{cR_0}\int e^{i(\omega t - \mathbf{k}\cdot\mathbf{r}_0)}\,d\mathbf{r}_0. \tag{66.12}$$
According to (66.8), the Fourier components of the magnetic field have the form:
$$\mathbf{H}_\omega = e\,\frac{i\omega\,e^{ikR_0}}{c^2R_0}\int e^{i(\omega t - \mathbf{k}\cdot\mathbf{r}_0)}\,\mathbf{n}\times d\mathbf{r}_0. \tag{66.13}$$
If the charge carries out a periodic motion in a closed trajectory, then the field must be
expanded in a Fourier series. The components of the Fourier series expansion are obtained
by replacing the integration over all times in formulas (66.11) to (66.13) by an average over
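the period T of the motion (see § 49). For the Fourier component of the magnetic field with
frequency $\omega = n\omega_0 = n(2\pi/T)$, we have
$$\mathbf{H}_n = e\,\frac{2\pi i n\,e^{ikR_0}}{c^2T^2R_0}\int_0^T e^{i[n\omega_0 t - \mathbf{k}\cdot\mathbf{r}_0(t)]}\,\mathbf{n}\times\mathbf{v}(t)\,dt = e\,\frac{2\pi i n\,e^{ikR_0}}{c^2T^2R_0}\oint e^{i(n\omega_0 t - \mathbf{k}\cdot\mathbf{r}_0)}\,\mathbf{n}\times d\mathbf{r}_0. \tag{66.14}$$
In the second integral, the integration goes over the closed orbit of the particle.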
PROBLEM
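Find the four-dimensional expression for the spectral resolution of the four-momentum radiated
by a charge moving along a given trajectory.
Solution: Substituting (66.8) in (66.9), and using the fact that, because of the condition (62.1),
$k\phi_\omega = \mathbf{k}\cdot\mathbf{A}_\omega$, we find:
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{c}{2\pi}\left(k^2|\mathbf{A}_\omega|^2 - |\mathbf{k}\cdot\mathbf{A}_\omega|^2\right)R_0^2\,do\,\frac{d\omega}{2\pi} = \frac{c}{2\pi}\,k^2\left(|\mathbf{A}_\omega|^2 - |\phi_\omega|^2\right)R_0^2\,do\,\frac{d\omega}{2\pi} = -\frac{c}{2\pi}\,k^2 A_{i\omega}A^{i*}_{\omega}\,R_0^2\,do\,\frac{d\omega}{2\pi}.$$
Representing the four-potential $A_{i\omega}$ in a form analogous to (66.12), we get:
$$d\mathscr{E}_{\mathbf{n}\omega} = -\frac{k^2e^2}{4\pi^2}\,\chi_i\chi^{i*}\,do\,dk,$$
where $\chi^i$ denotes the four-vector
$$\chi^i = \int \exp(-ik_lx^l)\,dx^i$$
and the integration is performed along the world line of the trajectory of the particle. Finally,
changing to four-dimensional notation [including the four-dimensional "volume element" in
k-space, as in (10.1a)], we find for the radiated four-momentum:
$$dP^i = -\frac{e^2k^i}{2\pi^2c}\,\chi_l\chi^{l*}\,\delta(k_mk^m)\,d^4k.$$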
§ 67. Dipole radiation
The time $\mathbf{r}\cdot(\mathbf{n}/c)$ in the integrands of the expressions (66.1) and (66.2) for the retarded
potentials can be neglected in cases where the distribution of charge changes little during
this time. It is easy to find the conditions for satisfying this requirement. Let T denote the
order of magnitude of the time during which the distribution of the charges in the system
changes significantly. The radiation of the system will obviously contain periods of order
T (i.e. frequencies of order 1/T). We further denote by a the order of magnitude of the
dimensions of the system. Then the time $\mathbf{r}\cdot(\mathbf{n}/c) \sim a/c$. In order that the distribution of the
charges in the system shall not undergo a significant change during this time, it is necessary
that $a/c \ll T$. But cT is just the wavelength $\lambda$ of the radiation. Thus the condition $a \ll cT$
can be written in the form
$$a \ll \lambda, \tag{67.1}$$
that is, the dimensions of the system must be small compared with the radiated wavelength.
We note that this same condition (67.1) can also be obtained from (66.7). In the integrand,
$\mathbf{r}$ goes through values in an interval of the order of the dimensions of the system, since outside
the system $\mathbf{j}$ is zero. Therefore the exponent $i\mathbf{k}\cdot\mathbf{r}$ is small, and can be neglected for those
waves in which $ka \ll 1$, which is equivalent to (67.1).
This condition can be written in still another form by noting that $T \sim a/v$, so that $\lambda \sim ca/v$,
if v is of the order of magnitude of the velocities of the charges. From $a \ll \lambda$, we then find
$$v \ll c, \tag{67.2}$$
that is, the velocities of the charges must be small compared with the velocity of light.
We shall assume that this condition is fulfilled, and take up the study of the radiation at
distances from the radiating system large compared with the wavelength (and consequently,
in any case, large compared with the dimensions of the system). As was pointed out in § 66,
at such distances the field can be considered as a plane wave, and therefore in determining
the field it is sufficient to calculate only the vector potential.
The vector potential (66.2) of the field now has the form
$$\mathbf{A} = \frac{1}{cR_0}\int \mathbf{j}_{t'}\,dV, \tag{67.3}$$
where the time $t' = t - (R_0/c)$ now no longer depends on the variable of integration.
Substituting $\mathbf{j} = \rho\mathbf{v}$, we rewrite (67.3) in the form
$$\mathbf{A} = \frac{1}{cR_0}\sum e\mathbf{v}$$
(the summation goes over all the charges of the system; for brevity, we omit the index $t'$;
all quantities on the right side of the equation refer to time $t'$). But
$$\sum e\mathbf{v} = \frac{d}{dt}\sum e\mathbf{r} = \dot{\mathbf{d}},$$
where $\mathbf{d}$ is the dipole moment of the system. Thus,
$$\mathbf{A} = \frac{1}{cR_0}\,\dot{\mathbf{d}}. \tag{67.4}$$
With the aid of formula (66.3) we find that the magnetic field is equal to
$$\mathbf{H} = \frac{1}{c^2R_0}\,\ddot{\mathbf{d}}\times\mathbf{n}, \tag{67.5}$$
and the electric field to
$$\mathbf{E} = \frac{1}{c^2R_0}\,(\ddot{\mathbf{d}}\times\mathbf{n})\times\mathbf{n}. \tag{67.6}$$
We note that in the approximation considered here, the radiation is determined by the
second derivative of the dipole moment of the system. Radiation of this kind is called dipole
radiation.
Since $\mathbf{d} = \sum e\mathbf{r}$, $\ddot{\mathbf{d}} = \sum e\dot{\mathbf{v}}$. Thus the charges can radiate only if they move with acceleration.
Charges in uniform motion do not radiate. This also follows directly from the principle of
relativity, since a charge in uniform motion can be considered in the inertial system in which
it is at rest, and a charge at rest does not radiate.
Substituting (67.5) in (66.6), we get the intensity of the dipole radiation:
$$dI = \frac{1}{4\pi c^3}\,(\ddot{\mathbf{d}}\times\mathbf{n})^2\,do = \frac{\ddot{\mathbf{d}}^2}{4\pi c^3}\,\sin^2\theta\,do, \tag{67.7}$$
where $\theta$ is the angle between $\ddot{\mathbf{d}}$ and $\mathbf{n}$. This is the amount of energy radiated by the system in
unit time into the element of solid angle do. We note that the angular distribution of the
radiation is given by the factor $\sin^2\theta$.
Substituting $do = 2\pi\sin\theta\,d\theta$ and integrating over $\theta$ from 0 to $\pi$, we find for the total
radiation
$$I = \frac{2}{3c^3}\,\ddot{\mathbf{d}}^2. \tag{67.8}$$
If we have just one charge moving in the external field, then $\mathbf{d} = e\mathbf{r}$ and $\ddot{\mathbf{d}} = e\mathbf{w}$, where $\mathbf{w}$
is the acceleration of the charge. Thus the total radiation of the moving charge is
$$I = \frac{2e^2w^2}{3c^3}. \tag{67.9}$$
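The step from the angular distribution (67.7) to the total radiation (67.8) is a single integration over the solid angle, which can be reproduced numerically. This is a minimal sketch under stated assumptions: the function name, the midpoint rule and the sample values are illustrative and not part of the text.

```python
import math

def total_dipole_power(d_ddot_sq, c=1.0, steps=20000):
    """Integrate (67.7) over all directions with the midpoint rule.

    d_ddot_sq is the squared second time derivative of the dipole moment.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        # sin^2(theta) pattern times the solid-angle element 2 pi sin(theta) d(theta)
        total += math.sin(theta)**2 * 2.0 * math.pi * math.sin(theta) * h
    return d_ddot_sq / (4.0 * math.pi * c**3) * total

numeric = total_dipole_power(3.0, c=2.0)
closed_form = 2.0 * 3.0 / (3.0 * 2.0**3)  # formula (67.8)
print(numeric, closed_form)
```

Setting `d_ddot_sq` to $e^2w^2$ gives the single-charge result (67.9) in the same way.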
We note that a closed system of particles, for all of which the ratio of charge to mass is
the same, cannot radiate (by dipole radiation). In fact, for such a system, the dipole moment
$$\mathbf{d} = \sum e\mathbf{r} = \sum\frac{e}{m}\,m\mathbf{r} = \text{const}\sum m\mathbf{r},$$
where const is the charge-to-mass ratio common to all the charges. But $\sum m\mathbf{r} = \mathbf{R}\sum m$,
where $\mathbf{R}$ is the radius vector of the center of inertia of the system (remember that all of the
velocities are small, $v \ll c$, so that nonrelativistic mechanics is applicable). Therefore $\ddot{\mathbf{d}}$ is
proportional to the acceleration of the center of inertia, which is zero, since the center of
inertia moves uniformly.
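Finally, we give the formula for the spectral resolution of the intensity of dipole radiation.
For radiation accompanying a collision, we introduce the quantity $d\mathscr{E}_\omega$ of energy radiated
throughout the time of the collision in the form of waves with frequencies in the interval
$d\omega/2\pi$ (see § 66). It is obtained by replacing the vector $\ddot{\mathbf{d}}$ in (67.8) by its Fourier component
$\ddot{\mathbf{d}}_\omega$ and multiplying by 2:
$$d\mathscr{E}_\omega = \frac{4}{3c^3}\,|\ddot{\mathbf{d}}_\omega|^2\,\frac{d\omega}{2\pi}.$$
For determining the Fourier components, we have
$$\ddot{\mathbf{d}}_\omega e^{-i\omega t} = \frac{d^2}{dt^2}\left(\mathbf{d}_\omega e^{-i\omega t}\right) = -\omega^2\mathbf{d}_\omega e^{-i\omega t},$$
from which $\ddot{\mathbf{d}}_\omega = -\omega^2\mathbf{d}_\omega$. Thus, we get
$$d\mathscr{E}_\omega = \frac{4\omega^4}{3c^3}\,|\mathbf{d}_\omega|^2\,\frac{d\omega}{2\pi}. \tag{67.10}$$
For periodic motion of the particles, we obtain in similar fashion the intensity of radiation
with frequency $\omega = n\omega_0$ in the form
$$I_n = \frac{4\omega_0^4n^4}{3c^3}\,|\mathbf{d}_n|^2. \tag{67.11}$$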
PROBLEMS
1. Find the radiation from a dipole $\mathbf{d}$, rotating in a plane with constant angular velocity $\Omega$.†
Solution: Choosing the plane of the rotation as the x, y plane, we have:
$$d_x = d_0\cos\Omega t, \qquad d_y = d_0\sin\Omega t.$$
Since these functions are monochromatic, the radiation is also monochromatic, with frequency
$\omega = \Omega$. From formula (67.7) we find for the angular distribution of the radiation (averaged over
the period of the rotation):
$$dI = \frac{d_0^2\Omega^4}{8\pi c^3}\,(1 + \cos^2\theta)\,do,$$
where $\theta$ is the angle between the direction $\mathbf{n}$ of the radiation and the z axis. The total radiation is
$$I = \frac{2d_0^2\Omega^4}{3c^3}.$$
The polarization of the radiation is along the vector $\ddot{\mathbf{d}}\times\mathbf{n} = \omega^2\mathbf{n}\times\mathbf{d}$. Resolving it into
components in the n, z plane and perpendicular to it, we find that the radiation is elliptically polarized,
and that the ratio of the axes of the ellipse is equal to $n_z = \cos\theta$; in particular, the radiation
along the z axis is circularly polarized.
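The factor $(1 + \cos^2\theta)$ in the averaged angular distribution can be reproduced by averaging the square of the vector product of the second derivative of d with n over one period. The sketch below is illustrative: the function name, the choice of n in the x, z plane, and the number of samples are assumptions.

```python
import math

def averaged_pattern(theta, d0=1.0, omega=1.0, samples=64):
    """Period average of |d_ddot x n|^2 for n = (sin theta, 0, cos theta)."""
    n = (math.sin(theta), 0.0, math.cos(theta))
    acc = 0.0
    for j in range(samples):
        phase = 2.0 * math.pi * j / samples
        # second time derivative of d = d0 (cos(omega t), sin(omega t), 0)
        dd = (-d0 * omega**2 * math.cos(phase), -d0 * omega**2 * math.sin(phase), 0.0)
        cx = dd[1] * n[2] - dd[2] * n[1]
        cy = dd[2] * n[0] - dd[0] * n[2]
        cz = dd[0] * n[1] - dd[1] * n[0]
        acc += cx * cx + cy * cy + cz * cz
    return acc / samples

theta = 0.7
expected = 1.0 * (1.0 + math.cos(theta)**2) / 2.0  # d0^2 Omega^4 (1 + cos^2 theta) / 2
print(averaged_pattern(theta), expected)
```

Dividing the averaged square by $4\pi c^3$, as in (67.7), gives the coefficient $d_0^2\Omega^4/8\pi c^3$ of the problem's result.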
2. Determine the angular distribution of the radiation from a system of charges, moving as a
whole (with velocity v), if the distribution of the radiation is known in the reference system in which
the system is at rest as a whole.
Solution: Let
$$dI' = f(\cos\theta', \phi')\,do', \qquad do' = d(\cos\theta')\,d\phi'$$
be the intensity of the radiation in the K' frame which is attached to the moving charge system
($\theta'$, $\phi'$ are the polar coordinates; the polar axis is along the direction of motion of the system, whose
velocity is denoted by $\mathbf{V}$ in the formulas below). The
energy $d\mathscr{E}$ radiated during a time interval dt in the fixed (laboratory) reference frame K, is related
to the energy $d\mathscr{E}'$ radiated in the K' system by the transformation formula
$$d\mathscr{E}' = \frac{d\mathscr{E} - \mathbf{V}\cdot d\mathbf{P}}{\sqrt{1 - \dfrac{V^2}{c^2}}} = d\mathscr{E}\,\frac{1 - \dfrac{V}{c}\cos\theta}{\sqrt{1 - \dfrac{V^2}{c^2}}}$$
(the momentum of radiation propagating in a given direction is related to its energy by the equation
$|d\mathbf{P}| = d\mathscr{E}/c$). The polar angles $\theta$, $\theta'$ of the direction of the radiation in the K and K' frames are
related by formulas (5.6), and the azimuths $\phi$ and $\phi'$ are equal. Finally, the time interval $dt'$ in the
K' system corresponds to the time
$$dt = \frac{dt'}{\sqrt{1 - \dfrac{V^2}{c^2}}}$$
in the K system.
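As a result, we find for the intensity $dI = d\mathscr{E}/dt$ in the K system:
$$dI = \frac{\left(1 - \dfrac{V^2}{c^2}\right)^2}{\left(1 - \dfrac{V}{c}\cos\theta\right)^3}\,f\!\left(\frac{\cos\theta - \dfrac{V}{c}}{1 - \dfrac{V}{c}\cos\theta},\ \phi\right)do.$$
Thus, for a dipole moving along the direction of its own axis, $f = \text{const}\cdot\sin^2\theta'$, and by using the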
† The radiation from a rotator or a symmetric top which has a dipole moment is of this type. In the first
case, d is the total dipole moment of the rotator; in the second case d is the projection of the dipole moment
of the top on a plane perpendicular to its axis of precession (i.e. the direction of the total angular momentum).
formula just obtained, we find:
$$dI = \text{const}\cdot\frac{\left(1 - \dfrac{V^2}{c^2}\right)^3\sin^2\theta}{\left(1 - \dfrac{V}{c}\cos\theta\right)^5}\,do.$$
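The special case of a dipole moving along its own axis can be checked against the general transformation formula of this problem. The following is a sketch only: the function names are hypothetical, the rest-frame distribution f is assumed to be independent of the azimuth, and beta stands for the ratio of the velocity of the system to c.

```python
import math

def lab_intensity(f, theta, beta):
    """dI/do in K, given dI'/do' = f(cos theta') in K' (azimuth-independent f)."""
    cos_prime = (math.cos(theta) - beta) / (1.0 - beta * math.cos(theta))
    return (1.0 - beta**2)**2 / (1.0 - beta * math.cos(theta))**3 * f(cos_prime)

def dipole_along_axis(theta, beta):
    """Closed form for f = sin^2(theta'), with the constant factor set to 1."""
    return (1.0 - beta**2)**3 * math.sin(theta)**2 / (1.0 - beta * math.cos(theta))**5

rest_frame = lambda cos_prime: 1.0 - cos_prime**2  # sin^2(theta')
for theta in (0.3, 1.0, 2.5):
    print(lab_intensity(rest_frame, theta, 0.6), dipole_along_axis(theta, 0.6))
```

For beta = 0 the first function reduces to f itself, as it should when the two frames coincide.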
§ 68. Dipole radiation during collisions
In problems of radiation during collisions, one is seldom interested in the radiation
accompanying the collision of two particles moving along definite trajectories. Usually we
have to consider the scattering of a whole beam of particles moving parallel to each other,
and the problem consists in determining the total radiation per unit current density of
particles.
If the current density is unity, i.e. if one particle passes per unit time across unit area of
the cross-section of the beam, then the number of particles in the flux which have "impact
parameters" between $\rho$ and $\rho + d\rho$ is $2\pi\rho\,d\rho$ (the area of the ring bounded by the circles of
radius $\rho$ and $\rho + d\rho$). Therefore the required total radiation is gotten by multiplying the total
radiation $\Delta\mathscr{E}$ from a single particle (with given impact parameter) by $2\pi\rho\,d\rho$ and
integrating over $\rho$ from 0 to $\infty$. The quantity determined in this way has the dimensions of energy
times area. We call it the effective radiation (in analogy to the effective cross-section for
scattering) and denote it by $\varkappa$:
$$\varkappa = \int_0^\infty \Delta\mathscr{E}\cdot 2\pi\rho\,d\rho. \tag{68.1}$$
We can determine in completely analogous manner the effective radiation in a given solid
angle element do, in a given frequency interval $d\omega$, etc.†
We derive the general formula for the angular distribution of radiation emitted in the
scattering of a beam of particles by a centrally symmetric field, assuming dipole radiation.
The intensity of the radiation (at a given time) from each of the particles of the beam under
consideration is determined by formula (67.7), in which d is the dipole moment of the particle
relative to the scattering center.‡ First of all we average this expression over all directions of the vector $\ddot{\mathbf{d}}$ in the plane perpendicular to the beam direction. Since $(\ddot{\mathbf{d}}\times\mathbf{n})^2 = \ddot{\mathbf{d}}^2 - (\mathbf{n}\cdot\ddot{\mathbf{d}})^2$, the averaging affects only $(\mathbf{n}\cdot\ddot{\mathbf{d}})^2$. Because the scattering field is centrally symmetric and the incident beam is parallel, the scattering, and also the radiation, has axial symmetry around an axis passing through the center. We choose this axis as the $x$ axis. From symmetry, it is obvious that the first powers $\ddot d_y$, $\ddot d_z$ give zero on averaging, and since $\ddot d_x$ is not subjected to the averaging process,
$$\overline{\ddot d_x\ddot d_y} = \overline{\ddot d_x\ddot d_z} = 0.$$
The average values of $\ddot d_y^2$ and $\ddot d_z^2$ are equal to each other, so that
$$\overline{\ddot d_y^2} = \overline{\ddot d_z^2} = \tfrac{1}{2}\left[(\ddot{\mathbf{d}})^2 - \ddot d_x^2\right].$$
† If the expression to be integrated depends on the angle of orientation of the projection of the dipole moment of the particle on the plane transverse to the beam, then we must first average over all directions in this plane and only then multiply by $2\pi\varrho\,d\varrho$ and integrate.
‡ Actually one usually deals with the dipole moment of two particles, the scatterer and the scattered particle, relative to their common center of inertia.
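Keeping all this in mind, we find without difficulty:
$$\overline{(\ddot{\mathbf{d}}\times\mathbf{n})^2} = \tfrac{1}{2}(\ddot{\mathbf{d}}^2 + \ddot d_x^2) + \tfrac{1}{2}(\ddot{\mathbf{d}}^2 - 3\ddot d_x^2)\cos^2\theta,$$
where $\theta$ is the angle between the direction $\mathbf{n}$ of the radiation and the $x$ axis.
Integrating the intensity over the time and over all impact parameters, we obtain the following final expression giving the effective radiation as a function of the direction of radiation:
$$d\varkappa_{\mathbf{n}} = \frac{do}{4\pi c^3}\left[A + B\,\frac{3\cos^2\theta - 1}{2}\right], \tag{68.2}$$
where
$$A = \frac{2}{3}\int_0^\infty\!\!\int_{-\infty}^{+\infty}\ddot{\mathbf{d}}^2\,dt\,2\pi\varrho\,d\varrho,\qquad B = \frac{1}{3}\int_0^\infty\!\!\int_{-\infty}^{+\infty}(\ddot{\mathbf{d}}^2 - 3\ddot d_x^2)\,dt\,2\pi\varrho\,d\varrho. \tag{68.3}$$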
The second term in (68.2) is written in such a form that it gives zero when averaged over all directions, so that the total effective radiation is $\varkappa = A/c^3$. We call attention to the fact that the angular distribution of the radiation is symmetric with respect to the plane passing through the scattering center and perpendicular to the beam, since the expression (68.2) is unchanged if we replace $\theta$ by $\pi - \theta$. This property is specific to dipole radiation, and is no longer true for higher approximations in $v/c$.
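Both statements about the angular factor are easy to confirm numerically. The sketch below averages over all directions with Simpson's rule; the sample values `dd2` and `ddx2` (standing for $\ddot{\mathbf{d}}^2$ and $\ddot d_x^2$) are arbitrary numbers chosen only for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sphere_average(g):
    """Average of an axially symmetric function g(theta) over all directions."""
    return simpson(lambda t: g(t) * math.sin(t), 0.0, math.pi) / 2.0

# The factor multiplying B in (68.2) averages to zero.
legendre_average = sphere_average(lambda t: (3 * math.cos(t) ** 2 - 1) / 2)

# The averaged square of (d'' x n) has the direction average (2/3) d''^2,
# which is what fixes the coefficient 2/3 in A.
dd2, ddx2 = 1.7, 0.4  # illustrative values only
intensity_average = sphere_average(
    lambda t: 0.5 * (dd2 + ddx2) + 0.5 * (dd2 - 3 * ddx2) * math.cos(t) ** 2
)

if __name__ == "__main__":
    print(legendre_average, intensity_average, 2 * dd2 / 3)
```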
The intensity of the radiation accompanying the scattering can be separated into two parts: radiation polarized in the plane passing through the $x$ axis and the direction $\mathbf{n}$ (we choose this plane as the $xy$ plane), and radiation polarized in the perpendicular plane $xz$.
The vector of the electric field has the direction of the vector
$$(\ddot{\mathbf{d}}\times\mathbf{n})\times\mathbf{n} = \mathbf{n}(\mathbf{n}\cdot\ddot{\mathbf{d}}) - \ddot{\mathbf{d}}$$
[see (67.6)]. The component of this vector in the direction perpendicular to the $xy$ plane is $-\ddot d_z$, and its projection on the $xy$ plane is $|\sin\theta\,\ddot d_x - \cos\theta\,\ddot d_y|$. This latter quantity is most conveniently determined from the $z$-component of the magnetic field, which has the direction $\ddot{\mathbf{d}}\times\mathbf{n}$.
Squaring $\mathbf{E}$ and averaging over all directions of the vector $\ddot{\mathbf{d}}$ in the $yz$ plane, we see first of all that the product of the projections of the field on the $xy$ plane and perpendicular to it vanishes. This means that the intensity can actually be represented as the sum of two independent parts: the intensities of the radiation polarized in the two mutually perpendicular planes.
The intensity of the radiation with its electric vector perpendicular to the $xy$ plane is determined by the mean square $\overline{\ddot d_z^2} = \tfrac{1}{2}(\ddot{\mathbf{d}}^2 - \ddot d_x^2)$. For the corresponding part of the effective radiation, we obtain the expression
$$d\varkappa_{\mathbf{n}}^{\perp} = \frac{do}{4\pi c^3}\,\frac{1}{2}\int_0^\infty\!\!\int_{-\infty}^{+\infty}(\ddot{\mathbf{d}}^2 - \ddot d_x^2)\,dt\,2\pi\varrho\,d\varrho. \tag{68.4}$$
We note that this part of the radiation is isotropic. It is unnecessary to give the expression for the effective radiation with electric vector in the $xy$ plane since it is clear that
$$d\varkappa_{\mathbf{n}}^{\parallel} + d\varkappa_{\mathbf{n}}^{\perp} = d\varkappa_{\mathbf{n}}.$$
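In a similar way we can get the expression for the angular distribution of the effective radiation in a given frequency interval $d\omega$:
$$d\varkappa_{\mathbf{n},\omega} = \left[A(\omega) + B(\omega)\,\frac{3\cos^2\theta - 1}{2}\right]\frac{do}{2\pi c^3}\,\frac{d\omega}{2\pi}, \tag{68.5}$$
where
$$A(\omega) = \frac{2\omega^4}{3}\int_0^\infty \mathbf{d}_\omega^2\,2\pi\varrho\,d\varrho,\qquad B(\omega) = \frac{\omega^4}{3}\int_0^\infty(\mathbf{d}_\omega^2 - 3d_{x\omega}^2)\,2\pi\varrho\,d\varrho. \tag{68.6}$$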
§ 69. Radiation of low frequency in collisions
In the spectral distribution of the radiation accompanying a collision, the main part of the intensity is contained in frequencies $\omega \sim 1/\tau$, where $\tau$ is the order of magnitude of the duration of the collision. However, we shall not here consider this region of the spectrum (for which one cannot obtain any general formulas) but rather the "tail" of the distribution at low frequencies, satisfying the condition
$$\omega\tau \ll 1. \tag{69.1}$$
We shall not assume that the velocities of the colliding particles are small compared to the
velocity of light, as we did in the preceding section; the formulas which follow will be valid
for arbitrary velocities.
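In the integral
$$\mathbf{H}_\omega = \int_{-\infty}^{+\infty}\mathbf{H}e^{i\omega t}\,dt,$$
the field $\mathbf{H}$ of the radiation is significantly different from zero only during a time interval of the order of $\tau$. Therefore, in accord with condition (69.1), we can assume that $\omega\tau \ll 1$ in the integral, so that we can replace $e^{i\omega t}$ by unity; then
$$\mathbf{H}_\omega = \int_{-\infty}^{+\infty}\mathbf{H}\,dt.$$
Substituting $\mathbf{H} = \dot{\mathbf{A}}\times\mathbf{n}/c$ and carrying out the time integration, we get:
$$\mathbf{H}_\omega = \frac{1}{c}(\mathbf{A}_2 - \mathbf{A}_1)\times\mathbf{n}, \tag{69.2}$$
where $\mathbf{A}_2 - \mathbf{A}_1$ is the change in the vector potential produced by the colliding particles during the time of the collision.
The total radiation (with frequency $\omega$) during the time of the collision is found by substituting (69.2) in (66.9):
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{R_0^2}{4c\pi^2}\left[(\mathbf{A}_2 - \mathbf{A}_1)\times\mathbf{n}\right]^2 do\,d\omega. \tag{69.3}$$
We can use the Liénard-Wiechert expression (66.4) for the vector potential, and obtain
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{1}{4\pi^2c^3}\left\{\sum e\left(\frac{\mathbf{v}_2\times\mathbf{n}}{1 - \dfrac{1}{c}\,\mathbf{n}\cdot\mathbf{v}_2} - \frac{\mathbf{v}_1\times\mathbf{n}}{1 - \dfrac{1}{c}\,\mathbf{n}\cdot\mathbf{v}_1}\right)\right\}^2 do\,d\omega, \tag{69.4}$$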
where $\mathbf{v}_1$ and $\mathbf{v}_2$ are the velocities of the particle before and after the collision, and the sum is taken over the two colliding particles. We note that the coefficient of $d\omega$ is independent of frequency. In other words, at low frequencies [condition (69.1)], the spectral distribution is independent of frequency, i.e. $d\mathscr{E}_{\mathbf{n}\omega}/d\omega$ tends toward a constant limit as $\omega \to 0$.†
If the velocities of the colliding particles are small compared with the velocity of light, then (69.4) becomes
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{1}{4\pi^2c^3}\left[\sum e(\mathbf{v}_2 - \mathbf{v}_1)\times\mathbf{n}\right]^2 do\,d\omega. \tag{69.5}$$
This expression corresponds to the case of dipole radiation, with the vector potential given by formula (67.4).
An interesting application of these formulas is to the radiation produced in the emission of a new charged particle (e.g. the emergence of a $\beta$-particle from a nucleus). This process is to be treated as an instantaneous change in the velocity of the particle from zero to its actual value. [Because of the symmetry of formula (69.5) with respect to interchange of $\mathbf{v}_1$ and $\mathbf{v}_2$, the radiation originating in this process is identical with the radiation which would be produced in the inverse process, the instantaneous stopping of the particle.] The essential point is that, since the "time" for the process is $\tau \to 0$, condition (69.1) is actually satisfied for all frequencies.‡
PROBLEM
Find the spectral distribution of the total radiation produced when a charged particle is emitted
which moves with velocity v.
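Solution: According to formula (69.4) (in which we set $\mathbf{v}_2 = \mathbf{v}$, $\mathbf{v}_1 = 0$), we have:
$$d\mathscr{E}_\omega = d\omega\,\frac{e^2v^2}{4\pi^2c^3}\int_0^\pi\frac{\sin^2\theta}{\left(1 - \dfrac{v}{c}\cos\theta\right)^2}\,2\pi\sin\theta\,d\theta.$$
Evaluation of the integral gives:§
$$d\mathscr{E}_\omega = \frac{e^2}{\pi c}\left(\frac{c}{v}\ln\frac{c+v}{c-v} - 2\right)d\omega. \tag{1}$$
For $v \ll c$, this formula goes over into
$$d\mathscr{E}_\omega = \frac{2e^2v^2}{3\pi c^3}\,d\omega,$$
which can also be obtained directly from (69.5).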
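The evaluation of the angular integral leading to (1) can be verified numerically. The following sketch works in units of $e^2/c$ and writes `beta` for $v/c$; the function names are ours and purely illustrative.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def spectrum_numeric(beta):
    """dE/d(omega) in units of e^2/c, from the angular integral of (69.4)."""
    integral = simpson(
        lambda t: math.sin(t) ** 3 / (1.0 - beta * math.cos(t)) ** 2, 0.0, math.pi
    )
    return beta**2 / (4.0 * math.pi**2) * 2.0 * math.pi * integral

def spectrum_closed(beta):
    """Closed form (1) in the same units."""
    return (math.log((1.0 + beta) / (1.0 - beta)) / beta - 2.0) / math.pi

def spectrum_slow(beta):
    """Low-velocity limit 2 e^2 v^2 / (3 pi c^3) in the same units."""
    return 2.0 * beta**2 / (3.0 * math.pi)

if __name__ == "__main__":
    for beta in (0.01, 0.1, 0.5, 0.9):
        print(beta, spectrum_numeric(beta), spectrum_closed(beta), spectrum_slow(beta))
```

For small `beta` all three columns coincide; as `beta` approaches 1 the closed form grows logarithmically while the low-velocity expression no longer applies.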
† By integrating over the impact parameters, we can obtain an analogous result for the effective radiation in the scattering of a beam of particles. However it must be remembered that this result is not valid for the effective radiation when there is a Coulomb interaction of the colliding particles, because then the integral over $\varrho$ is divergent (logarithmically) for large $\varrho$. We shall see in the next section that in this case the effective radiation at low frequencies depends logarithmically on frequency and does not remain constant.
‡ However, the applicability of these formulas is limited by the quantum condition that $\hbar\omega$ be small compared with the total kinetic energy of the particle.
§ Even though, as we have already pointed out, condition (69.1) is satisfied for all frequencies because the process is "instantaneous", we cannot find the total radiated energy by integrating (1) over $\omega$: the integral diverges at high frequencies. We mention that, aside from the violation of the conditions for classical behavior at high frequencies, in the present case the cause of the divergence lies in the incorrect formulation of the classical problem, in which the particle has an infinite acceleration at the initial time.
§ 70. Radiation in the case of Coulomb interaction
In this section we present, for reference purposes, a series of formulas relating to the
dipole radiation of a system of two charged particles; it is assumed that the velocities of the
particles are small compared with the velocity of light.
Uniform motion of the system as a whole, i.e. motion of its center of mass, is not of interest, since it does not lead to radiation; therefore we need only consider the relative motion of the particles. We choose the origin of coordinates at the center of mass. Then the dipole moment of the system $\mathbf{d} = e_1\mathbf{r}_1 + e_2\mathbf{r}_2$ has the form
$$\mathbf{d} = \frac{e_1m_2 - e_2m_1}{m_1 + m_2}\,\mathbf{r} = \mu\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)\mathbf{r}, \tag{70.1}$$
where the indices 1 and 2 refer to the two particles, $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is the radius vector between them, and
$$\mu = \frac{m_1m_2}{m_1 + m_2}$$
is the reduced mass.
We start with the radiation accompanying the elliptical motion of two particles attracting each other according to the Coulomb law. As we know from mechanics†, this motion can be expressed as the motion of a particle with mass $\mu$ in the ellipse whose equation in polar coordinates is
$$1 + \varepsilon\cos\phi = \frac{a(1 - \varepsilon^2)}{r}, \tag{70.2}$$
where the semimajor axis $a$ and the eccentricity $\varepsilon$ are
$$a = \frac{\alpha}{2|\mathscr{E}|},\qquad \varepsilon = \sqrt{1 - \frac{2|\mathscr{E}|M^2}{\mu\alpha^2}}. \tag{70.3}$$
Here $\mathscr{E}$ is the total energy of the particles (omitting their rest energy!) and is negative for a finite motion; $M = \mu r^2\dot\phi$ is the angular momentum, and $\alpha$ is the constant in the Coulomb law:
$$\alpha = |e_1e_2|.$$
The time dependence of the coordinates can be expressed in terms of the parametric equations
$$r = a(1 - \varepsilon\cos\xi),\qquad t = \sqrt{\frac{\mu a^3}{\alpha}}\,(\xi - \varepsilon\sin\xi). \tag{70.4}$$
One full revolution in the ellipse corresponds to a change of the parameter $\xi$ from 0 to $2\pi$; the period of the motion is
$$T = 2\pi\sqrt{\frac{\mu a^3}{\alpha}}.$$
We calculate the Fourier components of the dipole moment. Since the motion is periodic we are dealing with an expansion in Fourier series. Since the dipole moment is proportional to the radius vector $\mathbf{r}$, the problem reduces to the calculation of the Fourier components of the coordinates $x = r\cos\phi$, $y = r\sin\phi$. The time dependence of $x$ and $y$ is given by the parametric equations
$$x = a(\cos\xi - \varepsilon),\qquad y = a\sqrt{1 - \varepsilon^2}\,\sin\xi,\qquad \omega_0t = \xi - \varepsilon\sin\xi. \tag{70.5}$$
Here we have introduced the frequency
$$\omega_0 = \frac{2\pi}{T} = \sqrt{\frac{\alpha}{\mu a^3}} = \frac{(2|\mathscr{E}|)^{3/2}}{\alpha\mu^{1/2}}.$$
† See Mechanics, § 15.
Instead of the Fourier components of the coordinates, it is more convenient to calculate the Fourier components of the velocities, using the fact that $\dot x_n = -i\omega_0nx_n$; $\dot y_n = -i\omega_0ny_n$. We have
$$x_n = \frac{\dot x_n}{-i\omega_0n} = \frac{i}{\omega_0nT}\int_0^T e^{i\omega_0nt}\,\dot x\,dt.$$
But $\dot x\,dt = dx = -a\sin\xi\,d\xi$; transforming from an integral over $t$ to one over $\xi$, we have
$$x_n = -\frac{ia}{2\pi n}\int_0^{2\pi}e^{in(\xi - \varepsilon\sin\xi)}\sin\xi\,d\xi.$$
Similarly, we find
$$y_n = \frac{ia\sqrt{1 - \varepsilon^2}}{2\pi n}\int_0^{2\pi}e^{in(\xi - \varepsilon\sin\xi)}\cos\xi\,d\xi = \frac{ia\sqrt{1 - \varepsilon^2}}{2\pi n\varepsilon}\int_0^{2\pi}e^{in(\xi - \varepsilon\sin\xi)}\,d\xi$$
(in going from the first to the second integral, we write the integrand as $\cos\xi \equiv (\cos\xi - 1/\varepsilon) + 1/\varepsilon$; then the integral with $\cos\xi - 1/\varepsilon$ can be done, and gives identically zero). Finally, we use a formula of the theory of Bessel functions,
$$\frac{1}{2\pi}\int_0^{2\pi}e^{i(n\xi - x\sin\xi)}\,d\xi = \frac{1}{\pi}\int_0^\pi\cos(n\xi - x\sin\xi)\,d\xi = J_n(x), \tag{70.6}$$
where $J_n(x)$ is the Bessel function of integral order $n$. As a final result, we obtain the following expression for the required Fourier components:
$$x_n = \frac{a}{n}J_n'(n\varepsilon),\qquad y_n = \frac{ia\sqrt{1 - \varepsilon^2}}{n\varepsilon}J_n(n\varepsilon) \tag{70.7}$$
(the prime on the Bessel function means differentiation with respect to its argument).
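The result (70.7) can be compared with a direct numerical evaluation of the Fourier coefficients $x_n = \frac{1}{T}\int_0^T x\,e^{i\omega_0nt}\,dt$ over the parametrized orbit. The sketch below is a minimal check under the following assumptions of ours: $J_n$ is computed from the integral representation (70.6) on a uniform grid, its derivative from the standard recurrence $J_n' = (J_{n-1} - J_{n+1})/2$, and lengths are measured in units of $a$. All function names are illustrative.

```python
import cmath
import math

def bessel_j(n, x, N=512):
    """J_n(x) from (70.6); a uniform grid is very accurate for periodic integrands."""
    return sum(
        math.cos(n * xi - x * math.sin(xi))
        for xi in (2.0 * math.pi * k / N for k in range(N))
    ) / N

def bessel_j_prime(n, x, N=512):
    """Derivative of J_n with respect to its argument, via the recurrence relation."""
    return 0.5 * (bessel_j(n - 1, x, N) - bessel_j(n + 1, x, N))

def fourier_xy(n, eps, a=1.0, N=512):
    """Fourier coefficients x_n, y_n computed directly from the orbit (70.5)."""
    xn = 0j
    yn = 0j
    for k in range(N):
        xi = 2.0 * math.pi * k / N
        phase = cmath.exp(1j * n * (xi - eps * math.sin(xi)))  # exp(i n omega0 t)
        weight = (1.0 - eps * math.cos(xi)) / N                # omega0 dt / (2 pi)
        xn += a * (math.cos(xi) - eps) * phase * weight
        yn += a * math.sqrt(1.0 - eps**2) * math.sin(xi) * phase * weight
    return xn, yn

def formula_xy(n, eps, a=1.0):
    """The closed expressions (70.7)."""
    xn = a / n * bessel_j_prime(n, n * eps)
    yn = 1j * a * math.sqrt(1.0 - eps**2) / (n * eps) * bessel_j(n, n * eps)
    return xn, yn

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, fourier_xy(n, 0.6), formula_xy(n, 0.6))
```

Note that $x_n$ comes out real and $y_n$ purely imaginary, as (70.7) requires.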
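The expression for the intensity of the monochromatic components of the radiation is obtained by substituting $x_n$ and $y_n$ into the formula
$$I_n = \frac{4\omega_0^4n^4}{3c^3}\,\mu^2\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left(|x_n|^2 + |y_n|^2\right)$$
[see (67.11)]. Expressing $a$ and $\omega_0$ in terms of the characteristics of the particles, we obtain finally:
$$I_n = \frac{64n^2\mathscr{E}^4}{3c^3\alpha^2}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left[J_n'^2(n\varepsilon) + \frac{1 - \varepsilon^2}{\varepsilon^2}J_n^2(n\varepsilon)\right]. \tag{70.8}$$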
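In particular, we shall give the asymptotic formula for the intensity of very high harmonics (large $n$) for motion in an orbit which is close to a parabola ($\varepsilon$ close to 1). For this purpose, we use the formula
$$J_n(n\varepsilon) \approx \frac{1}{\sqrt{\pi}}\left(\frac{2}{n}\right)^{1/3}\Phi\!\left[\left(\frac{n}{2}\right)^{2/3}(1 - \varepsilon^2)\right],\qquad n \gg 1,\quad 1 - \varepsilon \ll 1, \tag{70.9}$$
where $\Phi$ is the Airy function defined on p. 179.†
Substituting in (70.8) gives:
$$I_n = \frac{64\cdot2^{2/3}}{3\pi}\,\frac{n^{4/3}\mathscr{E}^4}{c^3\alpha^2}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left\{(1 - \varepsilon^2)\,\Phi^2\!\left[\left(\frac{n}{2}\right)^{2/3}(1 - \varepsilon^2)\right] + \left(\frac{2}{n}\right)^{2/3}\Phi'^2\!\left[\left(\frac{n}{2}\right)^{2/3}(1 - \varepsilon^2)\right]\right\}. \tag{70.10}$$
This result can also be expressed in terms of the MacDonald function $K_\nu$:
$$I_n = \frac{64}{9\pi^2}\,\frac{n^2\mathscr{E}^4}{c^3\alpha^2}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2(1 - \varepsilon^2)^2\left\{K_{1/3}^2\!\left[\frac{n}{3}(1 - \varepsilon^2)^{3/2}\right] + K_{2/3}^2\!\left[\frac{n}{3}(1 - \varepsilon^2)^{3/2}\right]\right\}$$
(the necessary formulas are given in the footnote on p. 201).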
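Next, we consider the collision of two attracting charged particles. Their relative motion is described as the motion of a particle with mass $\mu$ in the hyperbola
$$1 + \varepsilon\cos\phi = \frac{a(\varepsilon^2 - 1)}{r}, \tag{70.11}$$
where
$$a = \frac{\alpha}{2\mathscr{E}},\qquad \varepsilon = \sqrt{1 + \frac{2\mathscr{E}M^2}{\mu\alpha^2}} \tag{70.12}$$
(now $\mathscr{E} > 0$). The time dependence of $r$ is given by the parametric equations
$$r = a(\varepsilon\cosh\xi - 1),\qquad t = \sqrt{\frac{\mu a^3}{\alpha}}\,(\varepsilon\sinh\xi - \xi), \tag{70.13}$$
where the parameter $\xi$ runs through values from $-\infty$ to $+\infty$. For the coordinates $x$, $y$, we have
$$x = a(\varepsilon - \cosh\xi),\qquad y = a\sqrt{\varepsilon^2 - 1}\,\sinh\xi. \tag{70.14}$$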
† For $n \gg 1$, the main contributions to the integral
$$J_n(n\varepsilon) = \frac{1}{\pi}\int_0^\pi\cos\left[n(\xi - \varepsilon\sin\xi)\right]d\xi$$
come from small values of $\xi$ (for larger values of $\xi$, the integrand oscillates rapidly). In accordance with this, we expand the argument of the cosine in powers of $\xi$:
$$J_n(n\varepsilon) = \frac{1}{\pi}\int_0^\infty\cos\left[n\left(\frac{1 - \varepsilon^2}{2}\,\xi + \frac{\xi^3}{6}\right)\right]d\xi;$$
because of the rapid convergence of the integral, the upper limit has been replaced by $\infty$; the term in $\xi^3$ must be kept because the first order term contains the small coefficient $1 - \varepsilon \approx (1 - \varepsilon^2)/2$. The integral above is reduced to the form (70.9) by an obvious substitution.
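The calculation of the Fourier components (we are now dealing with expansion in a Fourier integral) proceeds in complete analogy to the preceding case. We find the result:
$$x_\omega = \frac{\pi a}{\omega}H^{(1)\prime}_{i\nu}(i\nu\varepsilon),\qquad y_\omega = -\frac{\pi a\sqrt{\varepsilon^2 - 1}}{\omega\varepsilon}H^{(1)}_{i\nu}(i\nu\varepsilon), \tag{70.15}$$
where $H^{(1)}_{i\nu}$ is the Hankel function of the first kind, of order $i\nu$, and we have introduced the notation
$$\nu = \frac{\omega}{\sqrt{\alpha/\mu a^3}} = \frac{\omega\alpha}{\mu v_0^3} \tag{70.16}$$
($v_0$ is the relative velocity of the particles at infinity; the energy $\mathscr{E} = \mu v_0^2/2$).† In the calculation we have used the formula from the theory of Bessel functions:
$$\int_{-\infty}^{+\infty}e^{p\xi - ix\sinh\xi}\,d\xi = i\pi H_p^{(1)}(ix). \tag{70.17}$$
Substituting (70.15) in the formula
$$d\mathscr{E}_\omega = \frac{4\omega^4\mu^2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left(|x_\omega|^2 + |y_\omega|^2\right)\frac{d\omega}{2\pi}$$
[see (67.10)], we get:
$$d\mathscr{E}_\omega = \frac{\pi\mu^2\alpha^2\omega^2}{6c^3\mathscr{E}^2}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left\{\left[H^{(1)\prime}_{i\nu}(i\nu\varepsilon)\right]^2 - \frac{\varepsilon^2 - 1}{\varepsilon^2}\left[H^{(1)}_{i\nu}(i\nu\varepsilon)\right]^2\right\}d\omega. \tag{70.18}$$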
A quantity of greater interest is the "effective radiation" during the scattering of a parallel beam of particles (see § 68). To calculate it, we multiply $d\mathscr{E}_\omega$ by $2\pi\varrho\,d\varrho$ and integrate over all $\varrho$ from zero to infinity. We transform from an integral over $\varrho$ to one over $\varepsilon$ (between the limits 1 and $\infty$) using the fact that $2\pi\varrho\,d\varrho = 2\pi a^2\varepsilon\,d\varepsilon$; this relation follows from the definition (70.12), in which the angular momentum $M$ and the energy $\mathscr{E}$ are related to the impact parameter $\varrho$ and the velocity $v_0$ by
$$M = \mu\varrho v_0,\qquad \mathscr{E} = \frac{\mu v_0^2}{2}.$$
The resultant integral can be directly integrated with the aid of the formula
$$z\left[Z_p'^2 + \left(\frac{p^2}{z^2} - 1\right)Z_p^2\right] = \frac{d}{dz}\left(zZ_pZ_p'\right),$$
where $Z_p(z)$ is an arbitrary solution of the Bessel equation of order $p$.‡ Keeping in mind that for $\varepsilon \to \infty$, the Hankel function $H^{(1)}_{i\nu}(i\nu\varepsilon)$ goes to zero, we get as our result the following formula:
$$d\varkappa_\omega = \frac{4\pi^2\alpha^3\omega}{3c^3\mu v_0^5}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left|H^{(1)}_{i\nu}(i\nu)\right|H^{(1)\prime}_{i\nu}(i\nu)\,d\omega. \tag{70.19}$$
† Note that the function $H^{(1)}_{i\nu}(i\nu\varepsilon)$ is purely imaginary, while its derivative $H^{(1)\prime}_{i\nu}(i\nu\varepsilon)$ is real.
‡ This formula is a direct consequence of the Bessel equation
$$Z'' + \frac{1}{z}Z' + \left(1 - \frac{p^2}{z^2}\right)Z = 0.$$
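Let us consider the limiting cases of low and high frequencies. In the integral
$$\int_{-\infty}^{+\infty}e^{i\nu(\xi - \sinh\xi)}\,d\xi = i\pi H^{(1)}_{i\nu}(i\nu) \tag{70.20}$$
defining the Hankel function, the only important range of the integration parameter $\xi$ is that in which the exponent is of order unity. For low frequencies ($\nu \ll 1$), only the region of large $\xi$ is important. But for large $\xi$ we have $\sinh\xi \gg \xi$. Thus, approximately,
$$H^{(1)}_{i\nu}(i\nu) \approx -\frac{i}{\pi}\int_{-\infty}^{+\infty}e^{-i\nu\sinh\xi}\,d\xi = H_0^{(1)}(i\nu).$$
Similarly, we find that
$$H^{(1)\prime}_{i\nu}(i\nu) \approx H_0^{(1)\prime}(i\nu).$$
Using the approximate expression (for small $x$) from the theory of Bessel functions:
$$iH_0^{(1)}(ix) \approx \frac{2}{\pi}\ln\frac{2}{\gamma x}$$
($\gamma = e^C$, where $C$ is the Euler constant; $\gamma = 1.781\ldots$), we get the following expression for the effective radiation at low frequencies:
$$d\varkappa_\omega = \frac{16\alpha^2}{3v_0^2c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\ln\left(\frac{2\mu v_0^3}{\gamma\omega\alpha}\right)d\omega\quad\text{for}\quad\omega \ll \frac{\mu v_0^3}{\alpha}. \tag{70.21}$$
It depends logarithmically on the frequency.
For high frequencies ($\nu \gg 1$), on the other hand, the region of small $\xi$ is important in the integral (70.20). In accordance with this, we expand the exponent of the integrand in powers of $\xi$ and get, approximately,
$$H^{(1)}_{i\nu}(i\nu) \approx -\frac{i}{\pi}\int_{-\infty}^{+\infty}e^{-i\nu\xi^3/6}\,d\xi = -\frac{2i}{\pi}\operatorname{Re}\int_0^\infty e^{-i\nu\xi^3/6}\,d\xi.$$
By the substitution $i\nu\xi^3/6 = \eta$, the integral goes over into the $\Gamma$-function, and we obtain the result:
$$H^{(1)}_{i\nu}(i\nu) \approx -\frac{i}{\pi\sqrt{3}}\left(\frac{6}{\nu}\right)^{1/3}\Gamma\!\left(\frac{1}{3}\right).$$
Similarly, we find
$$H^{(1)\prime}_{i\nu}(i\nu) \approx \frac{1}{\pi\sqrt{3}}\left(\frac{6}{\nu}\right)^{2/3}\Gamma\!\left(\frac{2}{3}\right).$$
Next, using the formula of the theory of the $\Gamma$-function,
$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin\pi x},$$
we obtain for the effective radiation at high frequencies:
$$d\varkappa_\omega = \frac{16\pi\alpha^2}{3^{3/2}v_0^2c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2d\omega\quad\text{for}\quad\omega \gg \frac{\mu v_0^3}{\alpha}, \tag{70.22}$$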
that is, an expression which is independent of the frequency.
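The numerical coefficient in the high-frequency formula can be traced through the $\Gamma$-function step explicitly. The sketch below assumes the asymptotic forms $|H^{(1)}_{i\nu}(i\nu)| \approx (6/\nu)^{1/3}\Gamma(1/3)/\pi\sqrt{3}$ and $H^{(1)\prime}_{i\nu}(i\nu) \approx (6/\nu)^{2/3}\Gamma(2/3)/\pi\sqrt{3}$ and writes (70.19) as $d\varkappa_\omega = C\,\alpha^2 v_0^{-2}c^{-3}(e_1/m_1 - e_2/m_2)^2\,d\omega$ with $C = \tfrac{4}{3}\pi^2\nu\,|H|H'$; the function names are ours.

```python
import math

def hankel_product_high(nu):
    """|H| * H' for nu >> 1 from the Gamma-function asymptotics."""
    h_abs = (6.0 / nu) ** (1.0 / 3.0) * math.gamma(1.0 / 3.0) / (math.pi * math.sqrt(3.0))
    h_prime = (6.0 / nu) ** (2.0 / 3.0) * math.gamma(2.0 / 3.0) / (math.pi * math.sqrt(3.0))
    return h_abs * h_prime

def high_frequency_coefficient(nu):
    """Dimensionless coefficient C multiplying alpha^2 / (v0^2 c^3) in (70.19)."""
    return 4.0 * math.pi**2 * nu * hankel_product_high(nu) / 3.0

if __name__ == "__main__":
    # Reflection formula: Gamma(1/3) Gamma(2/3) = pi / sin(pi/3)
    print(math.gamma(1.0 / 3.0) * math.gamma(2.0 / 3.0), math.pi / math.sin(math.pi / 3.0))
    # The coefficient does not depend on nu and equals 16 pi / 3^(3/2)
    print(high_frequency_coefficient(50.0), 16.0 * math.pi / 3.0**1.5)
```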
We now proceed to the radiation accompanying the collision of two particles repelling each other according to the Coulomb law $U = \alpha/r$ ($\alpha > 0$). The motion occurs in a hyperbola,
$$-1 + \varepsilon\cos\phi = \frac{a(\varepsilon^2 - 1)}{r}, \tag{70.23}$$
$$x = a(\varepsilon + \cosh\xi),\qquad y = a\sqrt{\varepsilon^2 - 1}\,\sinh\xi,\qquad t = \sqrt{\frac{\mu a^3}{\alpha}}\,(\varepsilon\sinh\xi + \xi) \tag{70.24}$$
[$a$ and $\varepsilon$ as in (70.12)]. All the calculations for this case reduce immediately to those given above, so it is not necessary to present them. Namely, the integral
$$x_\omega = \frac{ia}{\omega}\int_{-\infty}^{+\infty}e^{i\nu(\varepsilon\sinh\xi + \xi)}\sinh\xi\,d\xi$$
for the Fourier component of the coordinate $x$ reduces, by making the substitution $\xi \to i\pi - \xi$, to the integral for the case of attraction, multiplied by $e^{-\pi\nu}$; the same holds for $y_\omega$.
Thus the expressions for the Fourier components $x_\omega$, $y_\omega$ in the case of repulsion differ from the corresponding expressions for the case of attraction by the factor $e^{-\pi\nu}$. So the only change in the formulas for the radiation is an additional factor $e^{-2\pi\nu}$. In particular, for low frequencies we get the previous formula (70.21) (since for $\nu \ll 1$, $e^{-2\pi\nu} \approx 1$). For high frequencies, the effective radiation has the form
$$d\varkappa_\omega = \frac{16\pi\alpha^2}{3^{3/2}v_0^2c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\exp\left(-\frac{2\pi\omega\alpha}{\mu v_0^3}\right)d\omega\quad\text{for}\quad\omega \gg \frac{\mu v_0^3}{\alpha}. \tag{70.25}$$
It drops exponentially with increasing frequency.
PROBLEMS
1. Calculate the average total intensity of the radiation for elliptical motion of two attracting charges.
Solution: From the expression (70.1) for the dipole moment, we have for the total intensity of the radiation:
$$I = \frac{2\mu^2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\ddot{\mathbf{r}}^2 = \frac{2\alpha^2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\frac{1}{r^4},$$
where we have used the equation of motion $\mu\ddot{\mathbf{r}} = -\alpha\mathbf{r}/r^3$. We express the coordinate $r$ in terms of $\phi$ from the orbit equation (70.2) and, by using the equation $dt = \mu r^2\,d\phi/M$, we replace the time integration by an integration over the angle $\phi$ (from 0 to $2\pi$). As a result, we find for the average intensity:
$$\bar I = \frac{1}{T}\int_0^T I\,dt = \frac{2^{3/2}}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\frac{\mu^{5/2}\alpha^3|\mathscr{E}|^{3/2}}{M^5}\left(3 - \frac{2|\mathscr{E}|M^2}{\mu\alpha^2}\right).$$
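The average can be checked by integrating $1/r^4$ over one period with the parametrization (70.4), for which $\omega_0\,dt = (1 - \varepsilon\cos\xi)\,d\xi$. The sketch below is a minimal numerical comparison; it assumes the standard Kepler relations $a = \alpha/2|\mathscr{E}|$ and $\varepsilon^2 = 1 - 2|\mathscr{E}|M^2/\mu\alpha^2$, writes `q` for $e_1/m_1 - e_2/m_2$, and uses arbitrary illustrative parameter values.

```python
import math

def mean_inverse_r4(a, eps, N=2000):
    """Time average of 1/r^4 over one period, using r = a (1 - eps cos xi)."""
    total = 0.0
    for k in range(N):
        xi = 2.0 * math.pi * k / N
        # (1/r^4) * (1 - eps cos xi) = 1 / (a^4 (1 - eps cos xi)^3)
        total += 1.0 / (a**4 * (1.0 - eps * math.cos(xi)) ** 3)
    return total / N

def average_intensity_numeric(E_abs, M, mu, alpha, q, c):
    a = alpha / (2.0 * E_abs)
    eps = math.sqrt(1.0 - 2.0 * E_abs * M**2 / (mu * alpha**2))
    return 2.0 * alpha**2 * q**2 / (3.0 * c**3) * mean_inverse_r4(a, eps)

def average_intensity_closed(E_abs, M, mu, alpha, q, c):
    return (2.0**1.5 / (3.0 * c**3) * q**2 * mu**2.5 * alpha**3 * E_abs**1.5 / M**5
            * (3.0 - 2.0 * E_abs * M**2 / (mu * alpha**2)))

if __name__ == "__main__":
    args = (0.5, 0.8, 1.0, 1.0, 1.0, 1.0)  # |E|, M, mu, alpha, q, c (illustrative)
    print(average_intensity_numeric(*args), average_intensity_closed(*args))
```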
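2. Calculate the total radiation $\Delta\mathscr{E}$ for the collision of two charged particles.
Solution: In the case of attraction the trajectory is the hyperbola (70.11) and in the case of repulsion, (70.23). The angle between the asymptotes of the hyperbola and its axis is $\phi_0$, determined from $\pm\cos\phi_0 = 1/\varepsilon$, and the angle of deflection of the particles (in the system of coordinates in which the center of mass is at rest) is $\chi = |\pi - 2\phi_0|$. The calculation proceeds the same as in Problem 1 (the integral over $\phi$ is taken between the limits $-\phi_0$ and $+\phi_0$). The result for the case of attraction is
$$\Delta\mathscr{E} = \frac{\mu^3v_0^5}{3c^3\alpha}\tan^3\frac{\chi}{2}\left\{(\pi + \chi)\left(1 + 3\tan^2\frac{\chi}{2}\right) + 6\tan\frac{\chi}{2}\right\}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2,$$
and for the case of repulsion:
$$\Delta\mathscr{E} = \frac{\mu^3v_0^5}{3c^3\alpha}\tan^3\frac{\chi}{2}\left\{(\pi - \chi)\left(1 + 3\tan^2\frac{\chi}{2}\right) - 6\tan\frac{\chi}{2}\right\}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2.$$
In both, $\chi$ is understood to be a positive angle, determined from the relation
$$\cot\frac{\chi}{2} = \frac{\mu v_0^2\varrho}{\alpha}.$$
Thus for a head-on collision ($\varrho \to 0$, $\chi \to \pi$) of charges repelling each other:
$$\Delta\mathscr{E} = \frac{8\mu^3v_0^5}{45c^3\alpha}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2.$$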
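These results can be checked against a direct integration of $1/r^4$ along the hyperbola. The sketch below is written under our own conventions: energies are measured in units of $\mu^3v_0^5(e_1/m_1 - e_2/m_2)^2/3c^3\alpha$, the orbit is taken as $p/r = \pm1 + \varepsilon\cos\phi$ (upper sign for attraction), and $\cot(\chi/2) = \sqrt{\varepsilon^2 - 1}$. In these units the orbit integral works out to $2\tan^5(\chi/2)\int_{-\phi_0}^{\phi_0}(\pm1 + \varepsilon\cos\phi)^2\,d\phi$. The function names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def radiated_closed(chi, attraction):
    """Closed-form total radiation in the units stated above."""
    T = math.tan(chi / 2.0)
    if attraction:
        return T**3 * ((math.pi + chi) * (1.0 + 3.0 * T**2) + 6.0 * T)
    return T**3 * ((math.pi - chi) * (1.0 + 3.0 * T**2) - 6.0 * T)

def radiated_numeric(chi, attraction):
    """Same quantity from the integral over the angle phi along the orbit."""
    T = math.tan(chi / 2.0)
    eps = math.sqrt(1.0 + 1.0 / T**2)
    sign = 1.0 if attraction else -1.0
    phi0 = math.acos(-sign / eps)  # direction of the asymptote, where 1/r = 0
    integral = simpson(lambda p: (sign + eps * math.cos(p)) ** 2, -phi0, phi0)
    return 2.0 * T**5 * integral

if __name__ == "__main__":
    for chi in (0.5, 1.5, 2.5):
        print(chi, radiated_numeric(chi, True), radiated_closed(chi, True),
              radiated_numeric(chi, False), radiated_closed(chi, False))
    # Head-on repulsive limit: the bracketed combination tends to 8/15
    print(radiated_closed(math.pi - 0.05, False), 8.0 / 15.0)
```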
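3. Calculate the total effective radiation in the scattering of a beam of particles in a repulsive Coulomb field.
Solution: The required quantity is
$$\varkappa = \int_0^\infty\!\!\int_{-\infty}^{+\infty}I\,dt\,2\pi\varrho\,d\varrho = \frac{2\alpha^2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 2\pi\int_0^\infty\!\!\int_{-\infty}^{+\infty}\frac{1}{r^4}\,dt\,\varrho\,d\varrho.$$
We replace the time integration by integration over $r$ along the trajectory of the charge, writing $dt = dr/v_r$, where the radial velocity $v_r = \dot r$ is expressed in terms of $r$ by the formula
$$v_r = \sqrt{v_0^2 - \frac{\varrho^2v_0^2}{r^2} - \frac{2\alpha}{\mu r}}.$$
The integration over $r$ goes between the limits from $\infty$ to the distance of closest approach $r_0 = r_0(\varrho)$ (the point at which $v_r = 0$), and then from $r_0$ once again to infinity; this reduces to twice the integral from $r_0$ to $\infty$. The calculation of the double integral is conveniently done by changing the order of integration: integrating first over $\varrho$ and then over $r$. The result of the calculation is:
$$\varkappa = \frac{8\pi}{9}\,\frac{\alpha\mu v_0}{c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2.$$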
4. Calculate the angular distribution of the total radiation emitted when one charge passes by another, if the velocity is so large (though still small compared with the velocity of light) that the deviation from straight-line motion can be considered small.
Solution: The angle of deflection is small if the kinetic energy $\mu v^2/2$ is large compared to the potential energy, which is of order $\alpha/\varrho$ ($\mu v^2 \gg \alpha/\varrho$). We choose the plane of the motion as the $x$, $y$ plane, with the origin at the center of inertia and the $x$ axis along the direction of the velocity. In first approximation, the trajectory is the straight line $x = vt$, $y = \varrho$. In the next approximation, the equations of motion give
$$\mu\ddot x = \frac{\alpha x}{r^3} \approx \frac{\alpha vt}{r^3},\qquad \mu\ddot y = \frac{\alpha y}{r^3} \approx \frac{\alpha\varrho}{r^3},$$
with
$$r = \sqrt{x^2 + y^2} \approx \sqrt{\varrho^2 + v^2t^2}.$$
Using formula (67.7), we have:
$$d\mathscr{E}_{\mathbf{n}} = do\,\frac{\mu^2}{4\pi c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\int_{-\infty}^{+\infty}\left[\ddot x^2 + \ddot y^2 - (\ddot xn_x + \ddot yn_y)^2\right]dt,$$
where $\mathbf{n}$ is the unit vector in the direction of $do$. Expressing the integrand in terms of $t$ and performing the integration, we get:
$$d\mathscr{E}_{\mathbf{n}} = \frac{\alpha^2}{32vc^3\varrho^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2\left(4 - n_x^2 - 3n_y^2\right)do.$$
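The time integrals behind the last formula can be evaluated numerically. With the substitution $vt = \varrho\tan u$ they become integrals over a finite interval, and the cross term $\ddot x\ddot y$ drops out because it is odd in $t$. The sketch below works in units of $\alpha^2(e_1/m_1 - e_2/m_2)^2/vc^3\varrho^3$; the function names are ours.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def distribution_numeric(nx, ny):
    """d(E_n)/do in the stated units, from the integrals of x''^2 and y''^2."""
    half_pi = math.pi / 2.0
    # integral of s^2/(1+s^2)^3 ds with s = tan(u)
    i_xx = simpson(lambda u: math.sin(u) ** 2 * math.cos(u) ** 2, -half_pi, half_pi)
    # integral of 1/(1+s^2)^3 ds with s = tan(u)
    i_yy = simpson(lambda u: math.cos(u) ** 4, -half_pi, half_pi)
    return (i_xx * (1.0 - nx**2) + i_yy * (1.0 - ny**2)) / (4.0 * math.pi)

def distribution_closed(nx, ny):
    return (4.0 - nx**2 - 3.0 * ny**2) / 32.0

if __name__ == "__main__":
    for nx, ny in ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.6, 0.3)):
        print(nx, ny, distribution_numeric(nx, ny), distribution_closed(nx, ny))
```

The radiation is thus weakest along the $y$ axis, the direction of the (nearly constant in direction) acceleration at closest approach.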
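§ 71. Quadrupole and magnetic dipole radiation
We now consider the radiation associated with the succeeding terms in the expansion of the vector potential in powers of the ratio $a/\lambda$ of the dimensions of the system to the wavelength. Since $a/\lambda$ is assumed to be small, these terms are generally small compared with the first (dipole) term, but they are important in those cases where the dipole moment of the system is zero, so that dipole radiation does not occur.
Expanding the integrand in (66.2),
$$\mathbf{A} = \frac{1}{cR_0}\int\mathbf{j}_{t' + \mathbf{r}\cdot\mathbf{n}/c}\,dV,$$
in powers of $\mathbf{r}\cdot\mathbf{n}/c$, we find, correct to terms of first order:
$$\mathbf{A} = \frac{1}{cR_0}\int\mathbf{j}_{t'}\,dV + \frac{1}{c^2R_0}\frac{\partial}{\partial t'}\int(\mathbf{r}\cdot\mathbf{n})\,\mathbf{j}_{t'}\,dV.$$
Substituting $\mathbf{j} = \rho\mathbf{v}$ and changing to point charges, we obtain:
$$\mathbf{A} = \frac{\sum e\mathbf{v}}{cR_0} + \frac{1}{c^2R_0}\frac{\partial}{\partial t}\sum e\mathbf{v}(\mathbf{r}\cdot\mathbf{n}). \tag{71.1}$$
(From now on, as in § 67, we drop the index $t'$ in all quantities.)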
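In the second term we write
$$\mathbf{v}(\mathbf{r}\cdot\mathbf{n}) = \frac{1}{2}\frac{\partial}{\partial t}\mathbf{r}(\mathbf{n}\cdot\mathbf{r}) + \frac{1}{2}\mathbf{v}(\mathbf{n}\cdot\mathbf{r}) - \frac{1}{2}\mathbf{r}(\mathbf{n}\cdot\mathbf{v}) = \frac{1}{2}\frac{\partial}{\partial t}\mathbf{r}(\mathbf{n}\cdot\mathbf{r}) + \frac{1}{2}(\mathbf{r}\times\mathbf{v})\times\mathbf{n}.$$
We then find for $\mathbf{A}$ the expression
$$\mathbf{A} = \frac{\dot{\mathbf{d}}}{cR_0} + \frac{1}{2c^2R_0}\frac{\partial^2}{\partial t^2}\sum e\mathbf{r}(\mathbf{n}\cdot\mathbf{r}) + \frac{1}{cR_0}(\dot{\mathbf{m}}\times\mathbf{n}), \tag{71.2}$$
where $\mathbf{d}$ is the dipole moment of the system, and
$$\mathbf{m} = \frac{1}{2c}\sum e\,\mathbf{r}\times\mathbf{v}$$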
is its magnetic moment. For further transformation, we note that we can, without changing
the field, add to A any vector proportional to n, since according to formula (66.3), H and E
are unchanged by this. For this reason we can replace (71.2) by
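$$\mathbf{A} = \frac{\dot{\mathbf{d}}}{cR_0} + \frac{1}{6c^2R_0}\frac{\partial^2}{\partial t^2}\sum e\left[3\mathbf{r}(\mathbf{n}\cdot\mathbf{r}) - \mathbf{n}r^2\right] + \frac{1}{cR_0}\dot{\mathbf{m}}\times\mathbf{n}.$$
But the expression under the summation sign is just the product $n_\beta D_{\alpha\beta}$ of the vector $\mathbf{n}$ and the quadrupole moment tensor $D_{\alpha\beta} = \sum e(3x_\alpha x_\beta - \delta_{\alpha\beta}r^2)$ (see § 41). We introduce the vector $\mathbf{D}$ with components $D_\alpha = D_{\alpha\beta}n_\beta$, and get the final expression for the vector potential:
$$\mathbf{A} = \frac{\dot{\mathbf{d}}}{cR_0} + \frac{1}{6c^2R_0}\ddot{\mathbf{D}} + \frac{1}{cR_0}\dot{\mathbf{m}}\times\mathbf{n}. \tag{71.3}$$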
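Knowing $\mathbf{A}$, we can now determine the fields $\mathbf{H}$ and $\mathbf{E}$ of the radiation, using the general formula (66.3):
$$\mathbf{H} = \frac{1}{c^2R_0}\left\{\ddot{\mathbf{d}}\times\mathbf{n} + \frac{1}{6c}\dddot{\mathbf{D}}\times\mathbf{n} + (\ddot{\mathbf{m}}\times\mathbf{n})\times\mathbf{n}\right\},$$
$$\mathbf{E} = \frac{1}{c^2R_0}\left\{(\ddot{\mathbf{d}}\times\mathbf{n})\times\mathbf{n} + \frac{1}{6c}(\dddot{\mathbf{D}}\times\mathbf{n})\times\mathbf{n} + \mathbf{n}\times\ddot{\mathbf{m}}\right\}. \tag{71.4}$$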
The intensity $dI$ of the radiation in the solid angle $do$ is given by the general formula (66.6). We calculate here the total radiation, i.e., the energy radiated by the system in unit time in all directions. To do this, we average $dI$ over all directions of $\mathbf{n}$; the total radiation is equal to this average multiplied by $4\pi$. In averaging the square of the magnetic field, all the cross-products of the three terms in $\mathbf{H}$ vanish, so that there remain only the mean squares of the three. A simple calculation† gives the following result for $I$:
$$I = \frac{2}{3c^3}\ddot{\mathbf{d}}^2 + \frac{1}{180c^5}\dddot D_{\alpha\beta}^2 + \frac{2}{3c^3}\ddot{\mathbf{m}}^2. \tag{71.5}$$
Thus the total radiation consists of three independent parts; they are called, respectively, dipole, quadrupole, and magnetic dipole radiation.
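The averaging over directions rests on the mean values $\overline{n_\alpha n_\beta} = \tfrac{1}{3}\delta_{\alpha\beta}$ and $\overline{n_\alpha n_\beta n_\gamma n_\delta} = \tfrac{1}{15}(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma})$. The following sketch confirms both by brute-force quadrature over the unit sphere; the grid sizes and function names are our own choices, and the midpoint rule in $\cos\theta$ limits the accuracy to a few parts in $10^6$.

```python
import math
from itertools import product

def average_over_directions(func, nz=400, nphi=16):
    """Average func(n) over the unit sphere: midpoint rule in z, uniform grid in phi."""
    total = 0.0
    for i in range(nz):
        z = -1.0 + (i + 0.5) * 2.0 / nz
        s = math.sqrt(1.0 - z * z)
        for j in range(nphi):
            phi = 2.0 * math.pi * j / nphi
            total += func((s * math.cos(phi), s * math.sin(phi), z))
    return total / (nz * nphi)

def delta(a, b):
    return 1.0 if a == b else 0.0

def max_deviation_rank2():
    """Largest deviation of the averaged n_a n_b from delta_ab / 3."""
    return max(
        abs(average_over_directions(lambda n: n[a] * n[b]) - delta(a, b) / 3.0)
        for a, b in product(range(3), repeat=2)
    )

def max_deviation_rank4():
    """Largest deviation of the averaged four-fold product from the delta formula."""
    worst = 0.0
    for a, b, c, d in product(range(3), repeat=4):
        expected = (delta(a, b) * delta(c, d) + delta(a, c) * delta(b, d)
                    + delta(a, d) * delta(b, c)) / 15.0
        got = average_over_directions(lambda n: n[a] * n[b] * n[c] * n[d])
        worst = max(worst, abs(got - expected))
    return worst

if __name__ == "__main__":
    print(max_deviation_rank2(), max_deviation_rank4())
```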
We note that the magnetic dipole radiation is actually not present for many systems. Thus it is not present for a system in which the charge-to-mass ratio is the same for all the moving charges (in this case the dipole radiation also vanishes, as already shown in § 67). Namely, for such a system the magnetic moment is proportional to the angular momentum (see § 44) and therefore, since the latter is conserved, $\dot{\mathbf{m}} = 0$. For the same reason, magnetic dipole radiation does not occur for a system consisting of just two particles (cf. the problem in § 44; in this case we cannot draw any conclusion concerning the dipole radiation).
PROBLEM
Calculate the total effective radiation in the scattering of a beam of charged particles by particles
identical with them.
Solution: In the collision of identical particles, dipole radiation (and also magnetic dipole radiation) does not occur, so that we must calculate the quadrupole radiation. The quadrupole moment tensor of a system of two identical particles (relative to their center of mass) is
$$D_{\alpha\beta} = \frac{e}{2}\left(3x_\alpha x_\beta - r^2\delta_{\alpha\beta}\right),$$
where $x_\alpha$ are the components of the radius vector $\mathbf{r}$ between the particles. After threefold differentiation of $D_{\alpha\beta}$, we express the first, second, and third derivatives with respect to time of $x_\alpha$ in terms of the relative velocity of the particles $v_\alpha$ as:
$$\dot x_\alpha = v_\alpha,\qquad \mu\ddot x_\alpha = \frac{m}{2}\ddot x_\alpha = \frac{e^2x_\alpha}{r^3},\qquad \frac{m}{2}\dddot x_\alpha = e^2\,\frac{v_\alpha r - 3x_\alpha v_r}{r^4},$$
where $v_r = \mathbf{v}\cdot\mathbf{r}/r$ is the radial component of the velocity (the second equality is the equation of motion of the charge, and the third is obtained by differentiating the second). The calculation leads to the following expression for the intensity:
$$I = \frac{\dddot D_{\alpha\beta}^2}{180c^5} = \frac{2e^6}{15m^2c^5}\,\frac{1}{r^4}\left(v^2 + 11v_\phi^2\right)$$
($v^2 = v_r^2 + v_\phi^2$); $v$ and $v_\phi$ are expressible in terms of $r$ by using the equalities
$$v^2 = v_0^2 - \frac{4e^2}{mr},\qquad v_\phi = \frac{\varrho v_0}{r}.$$
We replace the time integration by an integration over $r$ in the same way as was done in Problem 3 of § 70, namely, we write
$$dt = \frac{dr}{v_r} = \frac{dr}{\sqrt{v_0^2 - \dfrac{\varrho^2v_0^2}{r^2} - \dfrac{4e^2}{mr}}}.$$
In the double integral (over $\varrho$ and $r$), we first carry out the integration over $\varrho$ and then over $r$. The result of the calculation is:
$$\varkappa = \frac{4\pi}{9}\,\frac{e^4v_0^3}{mc^5}.$$
† We present a convenient method for averaging the products of components of a unit vector. Since the tensor $\overline{n_\alpha n_\beta}$ is symmetric, it can be expressed in terms of the unit tensor $\delta_{\alpha\beta}$. Also noting that its trace is 1, we have:
$$\overline{n_\alpha n_\beta} = \tfrac{1}{3}\delta_{\alpha\beta}.$$
The average value of the product of four components is:
$$\overline{n_\alpha n_\beta n_\gamma n_\delta} = \tfrac{1}{15}\left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right).$$
The right side is constructed from unit tensors to give a fourth-rank tensor that is symmetric in all its indices; the overall coefficient is determined by contracting on two pairs of indices, which must give unity.
§ 72. The field of the radiation at near distances
The formulas for the dipole radiation were derived by us for the field at distances large
compared with the wavelength (and, all the more, large compared with the dimensions of the
radiating system). In this section we shall assume, as before, that the wavelength is large
compared with the dimensions of the system, but shall consider the field at distances which
are not large compared with, but of the same order as, the wavelength.
The formula (67.4) for the vector potential
$$\mathbf{A} = \frac{1}{cR_0}\dot{\mathbf{d}} \tag{72.1}$$
is still valid, since in deriving it we used only the fact that $R_0$ was large compared with the dimensions of the system. However, now the field cannot be considered to be a plane wave even over small regions. Therefore the formulas (67.5) and (67.6) for the electric and magnetic fields are no longer applicable, so that to calculate them, we must first determine both $\mathbf{A}$ and $\phi$.
The formula for the scalar potential can be derived directly from that for the vector potential, using the general condition (62.1),
$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0,$$
imposed on the potentials. Substituting (72.1) in this, and integrating over the time, we get
$$\phi = -\operatorname{div}\frac{\mathbf{d}}{R_0}. \tag{72.2}$$
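The integration constant (an arbitrary function of the coordinates) is omitted, since we are interested only in the variable part of the potential. We recall that in the formula (72.2) as well as in (72.1) the value of $\mathbf{d}$ must be taken at the time $t' = t - (R_0/c)$.†
Now it is no longer difficult to calculate the electric and magnetic field. From the usual formulas, relating $\mathbf{E}$ and $\mathbf{H}$ to the potentials,
$$\mathbf{H} = \frac{1}{c}\operatorname{curl}\frac{\dot{\mathbf{d}}}{R_0}, \tag{72.3}$$
$$\mathbf{E} = \operatorname{grad}\operatorname{div}\frac{\mathbf{d}}{R_0} - \frac{1}{c^2}\frac{\ddot{\mathbf{d}}}{R_0}. \tag{72.4}$$
The expression for $\mathbf{E}$ can be rewritten in another form, noting that $\mathbf{d}_{t'}/R_0$, just as any function of coordinates and time of the form
$$\frac{1}{R_0}f\!\left(t - \frac{R_0}{c}\right),$$
satisfies the wave equation:
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\frac{\mathbf{d}}{R_0} = \Delta\frac{\mathbf{d}}{R_0}.$$
Also using the formula
$$\operatorname{curl}\operatorname{curl}\mathbf{a} = \operatorname{grad}\operatorname{div}\mathbf{a} - \Delta\mathbf{a},$$
we find that
$$\mathbf{E} = \operatorname{curl}\operatorname{curl}\frac{\mathbf{d}}{R_0}. \tag{72.5}$$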
The results obtained determine the field at distances of the order of the wavelength. It is understood that in all these formulas it is not permissible to take $1/R_0$ out from under the differentiation sign, since the ratio of terms containing $1/R_0^2$ to terms with $1/R_0$ is just of the same order as $\lambda/R_0$.
Finally, we give the formulas for the Fourier components of the field. To determine $\mathbf{H}_\omega$ we substitute in (72.3) for $\mathbf{H}$ and $\mathbf{d}$ their monochromatic components $\mathbf{H}_\omega e^{-i\omega t}$ and $\mathbf{d}_\omega e^{-i\omega t}$, respectively. However, we must remember that the quantities on the right sides of equations (72.1) to (72.5) refer to the time $t' = t - (R_0/c)$. Therefore we must substitute in place of $\mathbf{d}$ the expression
$$\mathbf{d}_\omega e^{-i\omega(t - R_0/c)} = \mathbf{d}_\omega e^{-i\omega t + ikR_0}.$$
Making the substitution and dividing by $e^{-i\omega t}$, we get
$$\mathbf{H}_\omega = -ik\operatorname{curl}\left(\mathbf{d}_\omega\frac{e^{ikR_0}}{R_0}\right) = ik\,\mathbf{d}_\omega\times\nabla\frac{e^{ikR_0}}{R_0},$$
or, performing the differentiation,
$$\mathbf{H}_\omega = ik\,\mathbf{d}_\omega\times\mathbf{n}\left(\frac{ik}{R_0} - \frac{1}{R_0^2}\right)e^{ikR_0}, \tag{72.6}$$
where $\mathbf{n}$ is a unit vector along $\mathbf{R}_0$.
† Sometimes one introduces the so-called Hertz vector, defined by
$$\mathbf{Z} = \frac{1}{R_0}\,\mathbf{d}\!\left(t - \frac{R_0}{c}\right).$$
Then
$$\mathbf{A} = \frac{1}{c}\dot{\mathbf{Z}},\qquad \phi = -\operatorname{div}\mathbf{Z}.$$
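In similar fashion, we find from (72.4):
$$\mathbf{E}_\omega = k^2\mathbf{d}_\omega\frac{e^{ikR_0}}{R_0} + (\mathbf{d}_\omega\cdot\nabla)\nabla\frac{e^{ikR_0}}{R_0},$$
and differentiation gives
$$\mathbf{E}_\omega = \mathbf{d}_\omega\left(\frac{k^2}{R_0} + \frac{ik}{R_0^2} - \frac{1}{R_0^3}\right)e^{ikR_0} + \mathbf{n}(\mathbf{n}\cdot\mathbf{d}_\omega)\left(-\frac{k^2}{R_0} - \frac{3ik}{R_0^2} + \frac{3}{R_0^3}\right)e^{ikR_0}. \tag{72.7}$$
At distances large compared to the wavelength ($kR_0 \gg 1$), we can neglect the terms in $1/R_0^2$ and $1/R_0^3$ in formulas (72.6) and (72.7), and we arrive at the field in the "wave zone",
$$\mathbf{E}_\omega = \frac{k^2}{R_0}\,\mathbf{n}\times(\mathbf{d}_\omega\times\mathbf{n})\,e^{ikR_0},\qquad \mathbf{H}_\omega = -\frac{k^2}{R_0}\,\mathbf{d}_\omega\times\mathbf{n}\,e^{ikR_0}.$$
At distances which are small compared to the wavelength ($kR_0 \ll 1$), we neglect the terms in $1/R_0$ and $1/R_0^2$ and set $e^{ikR_0} \approx 1$; then
$$\mathbf{E}_\omega = \frac{1}{R_0^3}\left\{3\mathbf{n}(\mathbf{d}_\omega\cdot\mathbf{n}) - \mathbf{d}_\omega\right\},$$
which corresponds to the static field of an electric dipole (§ 40); in this approximation, the magnetic field vanishes.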
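The two limits can be illustrated numerically. The sketch below evaluates the Fourier components of the dipole field at an arbitrary distance, taking them in the form obtained by carrying out the differentiations explicitly, namely $\mathbf{E}_\omega = \mathbf{d}_\omega(k^2/R_0 + ik/R_0^2 - 1/R_0^3)e^{ikR_0} + \mathbf{n}(\mathbf{n}\cdot\mathbf{d}_\omega)(-k^2/R_0 - 3ik/R_0^2 + 3/R_0^3)e^{ikR_0}$ and $\mathbf{H}_\omega = ik\,\mathbf{d}_\omega\times\mathbf{n}\,(ik/R_0 - 1/R_0^2)e^{ikR_0}$, and compares them with the wave-zone and static expressions. The function names and the sample vectors are ours.

```python
import cmath

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dipole_E(d, n, k, R):
    """Fourier component of E at distance R in the direction of the unit vector n."""
    phase = cmath.exp(1j * k * R)
    c_d = (k**2 / R + 1j * k / R**2 - 1.0 / R**3) * phase
    c_n = (-(k**2) / R - 3j * k / R**2 + 3.0 / R**3) * phase
    nd = dot(n, d)
    return tuple(c_d * di + c_n * nd * ni for di, ni in zip(d, n))

def dipole_H(d, n, k, R):
    """Fourier component of H at distance R."""
    coeff = 1j * k * (1j * k / R - 1.0 / R**2) * cmath.exp(1j * k * R)
    return tuple(coeff * c for c in cross(d, n))

def wave_zone_E(d, n, k, R):
    """Limit k R >> 1: (k^2 / R) n x (d x n) exp(i k R)."""
    nd = dot(n, d)
    phase = cmath.exp(1j * k * R)
    return tuple(k**2 / R * (di - ni * nd) * phase for di, ni in zip(d, n))

def static_E(d, n, R):
    """Limit k R << 1: field of a static electric dipole."""
    nd = dot(n, d)
    return tuple((3.0 * ni * nd - di) / R**3 for di, ni in zip(d, n))

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

if __name__ == "__main__":
    d = (1.0, 0.5, -0.3)   # illustrative dipole amplitude
    n = (0.6, 0.0, 0.8)    # unit vector
    print(max_diff(dipole_E(d, n, 2.0, 1e5), wave_zone_E(d, n, 2.0, 1e5)))
    print(max_diff(dipole_E(d, n, 1e-3, 1.0), static_E(d, n, 1.0)))
    print(max(abs(h) for h in dipole_H(d, n, 1e-3, 1.0)))
```

In the near zone the magnetic field is smaller than the electric field by a factor of order $kR_0$, which is the sense in which it "vanishes" in this approximation.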
PROBLEM
Calculate the quadrupole and magnetic dipole radiation fields at near distances.
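Solution: Assuming, for brevity, that dipole radiation is not present, we have (see the calculation
carried out in § 71)
$$\mathbf{A} = \frac{1}{c}\int\mathbf{j}_{t-\frac{R}{c}}\frac{dV}{R} \approx -\frac{1}{c}\int(\mathbf{r}\cdot\nabla)\frac{\mathbf{j}_{t-\frac{R_0}{c}}}{R_0}\,dV,$$
where we have expanded in powers of $\mathbf{r} = \mathbf{R}_0-\mathbf{R}$. In contrast to what was done in § 71, the factor
$1/R_0$ cannot here be taken out from under the differentiation sign. We take the differential operator
out of the integral and rewrite the integral in tensor notation:
$$A_\alpha = -\frac{1}{c}\frac{\partial}{\partial X_\beta}\int\frac{x_\beta j_\alpha}{R_0}\,dV$$
($X_\beta$ are the components of the radius vector $\mathbf{R}_0$). Transforming from the integral to a sum over the
charges, we find
$$A_\alpha = -\frac{1}{c}\frac{\partial}{\partial X_\beta}\frac{\left(\sum ev_\alpha x_\beta\right)_{t'}}{R_0}.$$
In the same way as in § 71, this expression breaks up into a quadrupole part and a magnetic dipole
part. The corresponding scalar potentials are calculated from the vector potentials in the same way
as in the text. As a result, we obtain for the quadrupole radiation:
$$A_\alpha = -\frac{1}{6c}\frac{\partial}{\partial X_\beta}\frac{\dot{D}_{\alpha\beta}}{R_0}, \qquad \phi = \frac{1}{6}\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\frac{D_{\alpha\beta}}{R_0},$$
and for the magnetic dipole radiation:
$$\mathbf{A} = \operatorname{curl}\frac{\mathbf{m}}{R_0}, \qquad \phi = 0$$
[all quantities on the right sides of the equations refer as usual to the time $t' = t-(R_0/c)$].
The field intensities for magnetic dipole radiation are:
$$\mathbf{E} = -\frac{1}{c}\operatorname{curl}\frac{\dot{\mathbf{m}}}{R_0}, \qquad \mathbf{H} = \operatorname{curl}\operatorname{curl}\frac{\mathbf{m}}{R_0}.$$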
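Comparing with (72.3), (72.4), we see that in the magnetic dipole case, $\mathbf{E}$ and $\mathbf{H}$ are expressed in
terms of $\mathbf{m}$ in the same way as $-\mathbf{H}$ and $\mathbf{E}$ are expressed in terms of $\mathbf{d}$ for the electric dipole case.
The spectral components of the potentials of the quadrupole radiation are:
$$A_\alpha^{(\omega)} = \frac{ik}{6}D_{\alpha\beta}^{(\omega)}\frac{\partial}{\partial X_\beta}\frac{e^{ikR_0}}{R_0}, \qquad \phi^{(\omega)} = \frac{1}{6}D_{\alpha\beta}^{(\omega)}\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\frac{e^{ikR_0}}{R_0}.$$
Because of their complexity, we shall not give the expressions for the field.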
§ 73. Radiation from a rapidly moving charge
Now we consider a charged particle moving with a velocity which is not small compared
with the velocity of light.
The formulas of § 67, derived under the assumption that $v \ll c$, are not immediately
applicable to this case. We can, however, consider the particle in that system of reference in
which the particle is at rest at a given moment; in this system of reference the formulas
referred to are of course valid (we call attention to the fact that this can be done only for the
case of a single moving particle ; for a system of several particles there is generally no system
of reference in which all the particles are at rest simultaneously).
Thus in this particular system of reference the particle radiates, in time $dt$, the energy
$$d\mathscr{E} = \frac{2e^2}{3c^3}w^2\,dt \tag{73.1}$$
[in accordance with formula (67.9)], where $w$ is the acceleration of the particle in this system
of reference. In this system of reference, the "radiated" momentum is zero:
$$d\mathbf{P} = 0. \tag{73.2}$$
In fact, the radiated momentum is given by the integral of the momentum flux density in the
radiation field over a closed surface surrounding the particle. But because of the symmetry
of the dipole radiation, the momenta carried off in opposite directions are equal in magnitude
and opposite in direction; therefore the integral is identically zero.
For the transformation to an arbitrary reference system, we rewrite formulas (73.1) and
(73.2) in four-dimensional form. It is easy to see that the "radiated four-momentum" $dP^i$
must be written as
$$dP^i = -\frac{2e^2}{3c}\frac{du^k}{ds}\frac{du_k}{ds}\,dx^i = -\frac{2e^2}{3c}\frac{du^k}{ds}\frac{du_k}{ds}\,u^i\,ds. \tag{73.3}$$
In fact, in the reference frame in which the particle is at rest, the space components of the
four-velocity $u^i$ are equal to zero, and
$$\frac{du^k}{ds}\frac{du_k}{ds} = -\frac{w^2}{c^4};$$
therefore the space components of $dP^i$ become zero and the time component gives equation
(73.1).
The total four-momentum radiated during the time of passage of the particle through a
given electromagnetic field is equal to the integral of (73.3), that is,
$$\Delta P^i = -\frac{2e^2}{3c}\int\frac{du^k}{ds}\frac{du_k}{ds}\,dx^i. \tag{73.4}$$
We rewrite this formula in another form, expressing the four-acceleration $du^i/ds$ in terms of
the electromagnetic field tensor, using the equation of motion (23.4):
$$mc\frac{du_k}{ds} = \frac{e}{c}F_{kl}u^l.$$
We then obtain
$$\Delta P^i = -\frac{2e^4}{3m^2c^5}\int(F_{kl}u^l)(F^{km}u_m)\,dx^i. \tag{73.5}$$
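The time component of (73.4) or (73.5) gives the total radiated energy $\Delta\mathscr{E}$. Substituting
for all the four-dimensional quantities their expressions in terms of three-dimensional
quantities, we find
$$\Delta\mathscr{E} = \frac{2e^2}{3c^3}\int_{-\infty}^{\infty}\frac{w^2-\frac{(\mathbf{v}\times\mathbf{w})^2}{c^2}}{\left(1-\frac{v^2}{c^2}\right)^3}\,dt \tag{73.6}$$
($\mathbf{w} = \dot{\mathbf{v}}$ is the acceleration of the particle), or, in terms of the external electric and magnetic
fields:
$$\Delta\mathscr{E} = \frac{2e^4}{3m^2c^3}\int_{-\infty}^{\infty}\frac{\left\{\mathbf{E}+\frac{1}{c}\mathbf{v}\times\mathbf{H}\right\}^2-\frac{1}{c^2}(\mathbf{E}\cdot\mathbf{v})^2}{1-\frac{v^2}{c^2}}\,dt. \tag{73.7}$$
The expressions for the total radiated momentum differ by having an extra factor $\mathbf{v}$ in the
integrand.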
It is clear from formula (73.7) that for velocities close to the velocity of light, the total
energy radiated per unit time varies with the velocity essentially like $[1-(v^2/c^2)]^{-1}$, that is,
proportionally to the square of the energy of the moving particle. The only exception is
motion in an electric field, along the direction of the field. In this case the factor $[1-(v^2/c^2)]$
standing in the denominator is cancelled by an identical factor in the numerator, and the
radiation does not depend on the energy of the particle.
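Both statements can be illustrated directly from the integrand of (73.7). The following is a minimal sketch in Gaussian units with $e = m = c = 1$; the function name `radiated_power` and the sample field and velocity values are assumptions made for illustration only.

```python
def radiated_power(v_vec, E_vec, H_vec, e=1.0, m=1.0, c=1.0):
    """Integrand of (73.7): energy radiated per unit time in given external fields."""
    cross = (v_vec[1] * H_vec[2] - v_vec[2] * H_vec[1],
             v_vec[2] * H_vec[0] - v_vec[0] * H_vec[2],
             v_vec[0] * H_vec[1] - v_vec[1] * H_vec[0])
    lorentz = [E + x / c for E, x in zip(E_vec, cross)]   # E + (1/c) v x H
    lor2 = sum(x * x for x in lorentz)
    ev = sum(E * v for E, v in zip(E_vec, v_vec))         # E . v
    v2 = sum(v * v for v in v_vec)
    return 2 * e**4 / (3 * m**2 * c**3) * (lor2 - ev**2 / c**2) / (1 - v2 / c**2)

E_x = (1.0, 0.0, 0.0)    # illustrative electric field along x
no_H = (0.0, 0.0, 0.0)

# Motion along the field: the factor 1 - v^2/c^2 cancels.
p_slow = radiated_power((0.5, 0.0, 0.0), E_x, no_H)
p_fast = radiated_power((0.99, 0.0, 0.0), E_x, no_H)

# Motion across the field: the power grows like 1 / (1 - v^2/c^2).
p_transverse = radiated_power((0.0, 0.6, 0.0), E_x, no_H)
```

Here `p_slow` and `p_fast` agree, while `p_transverse` exceeds the low-velocity value $2/3$ by the factor $1/(1-0.36)$.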
Finally there is the question of the angular distribution of the radiation from a rapidly
moving charge. To solve this problem, it is convenient to use the Liénard-Wiechert expressions
for the fields, (63.8) and (63.9). At large distances we must retain only the term of
lowest order in $1/R$ [the second term in (63.8)]. Introducing the unit vector $\mathbf{n}$ in the direction
of the radiation ($\mathbf{R} = \mathbf{n}R$), we get the formulas
$$\mathbf{E} = \frac{e}{c^2R}\,\frac{\mathbf{n}\times\left\{\left(\mathbf{n}-\frac{\mathbf{v}}{c}\right)\times\mathbf{w}\right\}}{\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^3}, \qquad \mathbf{H} = \mathbf{n}\times\mathbf{E}, \tag{73.8}$$
where all the quantities on the right sides of the equations refer to the retarded time
$t' = t-(R/c)$.
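The intensity radiated into the solid angle $do$ is $dI = (c/4\pi)E^2R^2\,do$. Expanding $E^2$, we get
$$dI = \frac{e^2}{4\pi c^3}\left\{\frac{2(\mathbf{n}\cdot\mathbf{w})(\mathbf{v}\cdot\mathbf{w})}{c\left(1-\frac{\mathbf{v}\cdot\mathbf{n}}{c}\right)^5}+\frac{w^2}{\left(1-\frac{\mathbf{v}\cdot\mathbf{n}}{c}\right)^4}-\frac{\left(1-\frac{v^2}{c^2}\right)(\mathbf{n}\cdot\mathbf{w})^2}{\left(1-\frac{\mathbf{v}\cdot\mathbf{n}}{c}\right)^6}\right\}do. \tag{73.9}$$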
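If we want to determine the angular distribution of the total radiation throughout the
whole motion of the particle, we must integrate the intensity over the time. In doing this, it is
important to remember that the integrand is a function of $t'$; therefore we must write
$$dt = \frac{\partial t}{\partial t'}\,dt' = \left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)dt' \tag{73.10}$$
[see (63.6)], after which the integration over $t'$ is immediately done. Thus we have the
following expression for the total radiation into the solid angle $do$:
$$d\mathscr{E}_{\mathbf{n}} = \frac{e^2}{4\pi c^3}\,do\int\left\{\frac{2(\mathbf{n}\cdot\mathbf{w})(\mathbf{v}\cdot\mathbf{w})}{c\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^4}+\frac{w^2}{\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^3}-\frac{\left(1-\frac{v^2}{c^2}\right)(\mathbf{n}\cdot\mathbf{w})^2}{\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^5}\right\}dt'. \tag{73.11}$$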
As we see from (73.9), in the general case the angular distribution of the radiation is
quite complicated. In the ultrarelativistic case ($1-(v/c) \ll 1$) it has a characteristic
appearance, which is related to the presence of high powers of the difference $1-(\mathbf{v}\cdot\mathbf{n}/c)$ in
the denominators of the various terms in this expression. Thus, the intensity is large within
the narrow range of angles in which the difference $1-(\mathbf{v}\cdot\mathbf{n}/c)$ is small. Denoting by $\theta$ the
small angle between $\mathbf{n}$ and $\mathbf{v}$, we have:
$$1-\frac{v}{c}\cos\theta \cong 1-\frac{v}{c}+\frac{\theta^2}{2};$$
this difference is small for
$$\theta \sim \sqrt{1-\frac{v^2}{c^2}}. \tag{73.12}$$
Thus an ultrarelativistic particle radiates mainly along the direction of its own motion,
within the small range (73.12) of angles around the direction of its velocity.
We also point out that, for arbitrary velocity and acceleration of the particle, there are
always two directions for which the radiated intensity is zero. These are the directions for
which the vector n— (v/c) is parallel to the vector w, so that the field (73.8) becomes zero.
(See also problem 2 of this section.)
Finally, we give the simpler formulas to which (73.9) reduces in two special cases.
If the velocity and acceleration of the particle are parallel,
$$\mathbf{H} = \frac{e}{c^2R}\,\frac{\mathbf{w}\times\mathbf{n}}{\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^3},$$
and the intensity is
$$dI = \frac{e^2}{4\pi c^3}\,\frac{w^2\sin^2\theta}{\left(1-\frac{v}{c}\cos\theta\right)^6}\,do. \tag{73.13}$$
It is, naturally, symmetric around the common direction of $\mathbf{v}$ and $\mathbf{w}$, and vanishes along
($\theta = 0$) and opposite to ($\theta = \pi$) the direction of the velocity. In the ultrarelativistic case,
the intensity as a function of $\theta$ has a sharp double maximum in the region (73.12), with a
steep drop to zero for $\theta = 0$.
If the velocity and acceleration are perpendicular to one another, we have from (73.9):
$$dI = \frac{e^2w^2}{4\pi c^3}\,\frac{\left(1-\frac{v}{c}\cos\theta\right)^2-\left(1-\frac{v^2}{c^2}\right)\sin^2\theta\cos^2\phi}{\left(1-\frac{v}{c}\cos\theta\right)^6}\,do, \tag{73.14}$$
where $\theta$ is again the angle between $\mathbf{n}$ and $\mathbf{v}$, and $\phi$ is the azimuthal angle of the vector $\mathbf{n}$
relative to the plane passing through $\mathbf{v}$ and $\mathbf{w}$. This intensity is symmetric only with respect
to the plane of $\mathbf{v}$ and $\mathbf{w}$, and vanishes along the two directions in this plane which form the
angle $\theta = \cos^{-1}(v/c)$ with the velocity.
PROBLEMS
1. Find the total radiation from a relativistic particle with charge $e_1$ which passes with impact
parameter $\varrho$ through the Coulomb field of a fixed center (with potential $\phi = e_2/r$).
Solution: In passing through the field, the relativistic particle is hardly deflected at all.† We may
therefore regard the velocity $\mathbf{v}$ in (73.7) as constant, so that the field at the position of the particle is
$$\mathbf{E} = \frac{e_2\mathbf{r}}{r^3} \cong \frac{e_2\mathbf{r}}{(\varrho^2+v^2t^2)^{3/2}},$$
with $x = vt$, $y = \varrho$. Performing the time integration in (73.7), we obtain:
$$\Delta\mathscr{E} = \frac{\pi e_1^4e_2^2}{12m^2c^3\varrho^3v}\,\frac{4c^2-v^2}{c^2-v^2}.$$
Fig. 15.
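The time integration in problem 1 can be verified numerically. The following is a minimal sketch in units $e_1 = e_2 = m = c = 1$, assuming a straight-line path with constant velocity as in the solution; the function names, the cutoff $T = 200\varrho/v$ and the step count are illustrative choices.

```python
import math

def radiated_energy_numeric(v, rho, e1=1.0, e2=1.0, m=1.0, c=1.0, steps=200000):
    """Midpoint-rule integration of (73.7) for H = 0 and the Coulomb field on the path x = v t, y = rho."""
    t_max = 200.0 * rho / v            # the integrand falls off like 1/t^4
    dt = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps):
        t = -t_max + (i + 0.5) * dt
        r2 = rho**2 + (v * t)**2
        field_sq = e2**2 / r2**2                 # E^2
        e_dot_v = e2 * (v * t) * v / r2**1.5     # E . v (only E_x contributes)
        total += (field_sq - e_dot_v**2 / c**2) / (1 - v**2 / c**2) * dt
    return 2 * e1**4 / (3 * m**2 * c**3) * total

def radiated_energy_closed(v, rho, e1=1.0, e2=1.0, m=1.0, c=1.0):
    """Closed-form answer quoted in the solution."""
    return (math.pi * e1**4 * e2**2 / (12 * m**2 * c**3 * rho**3 * v)
            * (4 * c**2 - v**2) / (c**2 - v**2))

numeric = radiated_energy_numeric(0.9, 1.0)
closed = radiated_energy_closed(0.9, 1.0)
```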
2. Find the directions along which the intensity of the radiation from a moving particle vanishes.
Solution: From the geometrical construction (Fig. 15) we find that the required directions $\mathbf{n}$ lie
in the plane passing through $\mathbf{v}$ and $\mathbf{w}$, and form an angle $\chi$ with the direction of $\mathbf{w}$ where
$$\sin\chi = \frac{v}{c}\sin\alpha,$$
and $\alpha$ is the angle between $\mathbf{v}$ and $\mathbf{w}$.
† For $v \sim c$, deviations through sizable angles can occur only for impact parameters $\varrho \sim e^2/mc^2$, which
cannot in general be treated classically.
§ 74. Synchrotron radiation (magnetic bremsstrahlung)
We consider the radiation from a charge moving with arbitrary velocity in a circle in a
uniform constant magnetic field; such radiation is called magnetic bremsstrahlung. The
radius of the orbit $r$ and the cyclic frequency of the motion $\omega_H$ are expressible in terms of the
field intensity $H$ and the velocity of the particle $v$, by the formulas (see § 21):
$$r = \frac{mcv}{eH\sqrt{1-\frac{v^2}{c^2}}}, \qquad \omega_H = \frac{v}{r} = \frac{eH}{mc}\sqrt{1-\frac{v^2}{c^2}}. \tag{74.1}$$
The total intensity of the radiation over all directions is given directly by (73.7), omitting
the time integration, in which we must set $\mathbf{E} = 0$ and $\mathbf{H}\perp\mathbf{v}$:
$$I = \frac{2e^4H^2v^2}{3m^2c^5\left(1-\frac{v^2}{c^2}\right)}. \tag{74.2}$$
We see that the total intensity is proportional to the square of the momentum of the particle.
If we are interested in the angular distribution of the radiation, then we must use formula
(73.11). One quantity of interest is the average intensity during a period of the motion. For
this we integrate (73.11) over the time of revolution of the particle in the circle and divide
the result by the period $T = 2\pi/\omega_H$.
We choose the plane of the orbit as the XY plane (the origin is at the center of the circle),
and we draw the YZ plane to pass through the direction $\mathbf{k}$ of the radiation (Fig. 16). The
magnetic field is along the negative Z axis (the direction of motion of the particle in Fig. 16
corresponds to a positive charge $e$). Further, let $\theta$ be the angle between the direction $\mathbf{k}$ of the
radiation and the Y axis, and $\phi = \omega_Ht$ be the angle between the radius vector of the particle
and the X axis. Then the cosine of the angle between $\mathbf{k}$ and the velocity $\mathbf{v}$ is $\cos\theta\cos\phi$
(the vector $\mathbf{v}$ lies in the XY plane, and at each moment is perpendicular to the radius vector
of the particle). We express the acceleration $\mathbf{w}$ of the particle in terms of the field $\mathbf{H}$ and the
velocity $\mathbf{v}$ by means of the equation of motion [see (21.1)]:
$$\mathbf{w} = \frac{e}{mc}\sqrt{1-\frac{v^2}{c^2}}\;\mathbf{v}\times\mathbf{H}.$$
Fig. 16.
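After a simple calculation, we get:
$$dI = do\,\frac{e^4H^2v^2}{8\pi^2m^2c^5}\left(1-\frac{v^2}{c^2}\right)\int_0^{2\pi}\frac{\left(1-\frac{v^2}{c^2}\right)\sin^2\theta+\left(\frac{v}{c}-\cos\theta\cos\phi\right)^2}{\left(1-\frac{v}{c}\cos\theta\cos\phi\right)^5}\,d\phi \tag{74.3}$$
(the time integration has been converted into integration over $\phi = \omega_Ht$). The integration is
elementary, though rather lengthy. As a result one finds the following formula:
$$dI = do\,\frac{e^4H^2v^2\left(1-\frac{v^2}{c^2}\right)}{8\pi m^2c^5}\left[\frac{2+\frac{v^2}{c^2}\cos^2\theta}{\left(1-\frac{v^2}{c^2}\cos^2\theta\right)^{5/2}}-\frac{\left(1-\frac{v^2}{c^2}\right)\left(4+\frac{v^2}{c^2}\cos^2\theta\right)\cos^2\theta}{4\left(1-\frac{v^2}{c^2}\cos^2\theta\right)^{7/2}}\right]. \tag{74.4}$$
The ratio of the intensity of radiation for $\theta = 0$ (in the plane of the orbit)
to the intensity for $\theta = \pi/2$ (perpendicular to the plane of the orbit) is
$$\frac{(dI/do)_0}{(dI/do)_{\pi/2}} = \frac{4+3\frac{v^2}{c^2}}{8\left(1-\frac{v^2}{c^2}\right)^{5/2}}. \tag{74.5}$$
As $v\to0$, this ratio approaches $\tfrac{1}{2}$, but for velocities close to the velocity of light, it becomes
very large.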
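The lengthy integration can be checked numerically. The following is a minimal sketch, assuming that the integral over $\phi$ in (74.3) equals $\pi$ times the square bracket of (74.4), with $\beta = v/c$; the function names, the step count and the sample values of $\beta$ and $\theta$ are illustrative assumptions.

```python
import math

def angular_integral(beta, theta, steps=20000):
    """Integral over phi in (74.3), midpoint rule over one period."""
    h = 2 * math.pi / steps
    ct, st = math.cos(theta), math.sin(theta)
    total = 0.0
    for i in range(steps):
        cp = math.cos((i + 0.5) * h)
        total += ((1 - beta**2) * st**2 + (beta - ct * cp)**2) / (1 - beta * ct * cp)**5
    return total * h

def closed_form_bracket(beta, theta):
    """Square bracket of (74.4)."""
    c2 = math.cos(theta)**2
    q = 1 - beta**2 * c2
    return ((2 + beta**2 * c2) / q**2.5
            - (1 - beta**2) * (4 + beta**2 * c2) * c2 / (4 * q**3.5))

def in_plane_to_axis_ratio(beta):
    """Ratio (dI/do) at theta = 0 to (dI/do) at theta = pi/2, from (74.4)."""
    return closed_form_bracket(beta, 0.0) / closed_form_bracket(beta, math.pi / 2)

beta = 0.6
mismatch = max(
    abs(angular_integral(beta, th) - math.pi * closed_form_bracket(beta, th))
    / (math.pi * closed_form_bracket(beta, th))
    for th in (0.0, 0.7, math.pi / 2)
)
```

The helper `in_plane_to_axis_ratio` reproduces $(4+3\beta^2)/8(1-\beta^2)^{5/2}$ and tends to $1/2$ for small $\beta$.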
Next we consider the spectral distribution of the radiation. Since the motion of the charge
is periodic, we are dealing with expansion in a Fourier series. The calculation starts conveniently
with the vector potential. For the Fourier components of the vector potential we
have the formula [see (66.12)]:
$$\mathbf{A}_n = e\,\frac{e^{ikR_0}}{cR_0T}\oint e^{i(\omega_Hnt-\mathbf{k}\cdot\mathbf{r})}\,d\mathbf{r},$$
where the integration is taken along the trajectory of the particle (the circle). For the coordinates
of the particle we have $x = r\cos\omega_Ht$, $y = r\sin\omega_Ht$. As integration variable we
choose the angle $\phi = \omega_Ht$. Noting that
$$\mathbf{k}\cdot\mathbf{r} = kr\cos\theta\sin\phi = \frac{nv}{c}\cos\theta\sin\phi$$
($k = n\omega_H/c = nv/cr$), we find for the Fourier components of the x-component of the vector
potential:
$$A_{xn} = -\frac{ev}{2\pi cR_0}\,e^{ikR_0}\int_0^{2\pi}e^{in\left(\phi-\frac{v}{c}\cos\theta\sin\phi\right)}\sin\phi\,d\phi.$$
We have already had to deal with such an integral in § 70. It can be expressed in terms of the
derivative of a Bessel function:
$$A_{xn} = -\frac{iev}{cR_0}\,e^{ikR_0}J_n'\left(\frac{nv}{c}\cos\theta\right). \tag{74.6}$$
Similarly, one calculates $A_{yn}$:
$$A_{yn} = \frac{e}{R_0\cos\theta}\,e^{ikR_0}J_n\left(\frac{nv}{c}\cos\theta\right). \tag{74.7}$$
The component along the Z axis obviously vanishes.
From the formulas of § 66 we have for the intensity of radiation with frequency $\omega = n\omega_H$,
in the element of solid angle $do$:
$$dI_n = \frac{c}{2\pi}|\mathbf{H}_n|^2R_0^2\,do = \frac{c}{2\pi}|\mathbf{k}\times\mathbf{A}_n|^2R_0^2\,do.$$
Noting that
$$|\mathbf{A}\times\mathbf{k}|^2 = |A_x|^2k^2+|A_y|^2k^2\sin^2\theta,$$
and substituting (74.6) and (74.7), we get for the intensity of radiation the following formula
(Schott, 1912):
$$dI_n = \frac{n^2e^4H^2}{2\pi c^3m^2}\left(1-\frac{v^2}{c^2}\right)\left[\tan^2\theta\,J_n^2\left(\frac{nv}{c}\cos\theta\right)+\frac{v^2}{c^2}J_n'^2\left(\frac{nv}{c}\cos\theta\right)\right]do. \tag{74.8}$$
To determine the total intensity over all directions of the radiation with frequency $\omega = n\omega_H$,
this expression must be integrated over all angles. However, the integration cannot be carried
out in finite form. By a series of transformations, making use of certain relations from the
theory of Bessel functions, the required integral can be written in the following form:†
$$I_n = \frac{2e^4H^2\left(1-\frac{v^2}{c^2}\right)}{m^2c^2v}\left[\frac{nv^2}{c^2}J_{2n}'\left(\frac{2nv}{c}\right)-n^2\left(1-\frac{v^2}{c^2}\right)\int_0^{v/c}J_{2n}(2n\xi)\,d\xi\right]. \tag{74.9}$$
We consider in more detail the ultrarelativistic case where the velocity of motion of the
particle is close to the velocity of light.
Setting $v = c$ in the numerator of (74.2), we find that in the ultrarelativistic case the total
intensity of the synchrotron radiation is proportional to the square of the particle energy $\mathscr{E}$:
$$I = \frac{2e^4H^2}{3m^2c^3}\left(\frac{\mathscr{E}}{mc^2}\right)^2. \tag{74.10}$$
The angular distribution of the radiation is highly anisotropic. The radiation is concentrated
mainly in the plane of the orbit. The "width" $\Delta\theta$ of the angular range within
which most of the radiation is included is easily evaluated from the condition
$1-(v^2/c^2)\cos^2\theta \sim 1-(v^2/c^2)$, writing $\theta = \pm\Delta\theta$, $\cos\theta \cong 1-(\Delta\theta)^2/2$. It is clear that‡
$$\Delta\theta \sim \sqrt{1-\frac{v^2}{c^2}} = \frac{mc^2}{\mathscr{E}}. \tag{74.11}$$
We shall see below that in the ultrarelativistic case the main role in the radiation is played
by frequencies with large $n$ (Arzimovich and Pomeranchuk, 1945). We can therefore use
the asymptotic formula (70.9), according to which:
$$J_{2n}(2n\xi) \cong \frac{1}{\sqrt{\pi}\,n^{1/3}}\,\Phi[n^{2/3}(1-\xi^2)]. \tag{74.12}$$
† The computations can be found in the book by G. A. Schott, Electromagnetic Radiation, § 84, Cambridge, 1912.
‡ This result is, of course, in agreement with the angular distribution of the instantaneous intensity which
we found in the preceding section [see (73.12)]; however, the reader should not confuse the angle $\theta$ of this
section with the angle between $\mathbf{n}$ and $\mathbf{v}$ in § 73!
Substituting in (74.9), we get the following formula for the spectral distribution of the
radiation for large values of $n$:†
$$I_n = -\frac{2e^4H^2}{\sqrt{\pi}\,m^2c^3}\,\frac{mc^2}{\mathscr{E}}\,\sqrt{u}\left\{\Phi'(u)+\frac{u}{2}\int_u^\infty\Phi(u)\,du\right\}, \tag{74.13}$$
$$u = n^{2/3}\left(\frac{mc^2}{\mathscr{E}}\right)^2.$$
For $u\to0$ the function in the curly brackets approaches the constant limit
$\Phi'(0) = -0.4587\ldots$‡ Therefore for $u \ll 1$, we have
$$I_n = 0.52\,\frac{e^4H^2}{m^2c^3}\left(\frac{mc^2}{\mathscr{E}}\right)^2n^{1/3}, \qquad 1 \ll n \ll \left(\frac{\mathscr{E}}{mc^2}\right)^3. \tag{74.14}$$
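The numerical constants quoted here follow from the closed form $\Phi'(0) = -3^{1/6}\,\Gamma(\tfrac{2}{3})/2\sqrt{\pi}$. The following is a minimal sketch; the variable names are illustrative, and the coefficient is formed on the assumption that the factor multiplying $(e^4H^2/m^2c^3)(mc^2/\mathscr{E})^2n^{1/3}$ in (74.14) is $-2\Phi'(0)/\sqrt{\pi}$.

```python
import math

# Phi'(0) for the Airy function as normalized in the text
phi_prime_0 = -3**(1.0 / 6.0) * math.gamma(2.0 / 3.0) / (2.0 * math.sqrt(math.pi))

# Numerical coefficient of (74.14), assuming it equals -2 Phi'(0) / sqrt(pi)
coefficient = -2.0 * phi_prime_0 / math.sqrt(math.pi)
```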
For $u \gg 1$, we can use the asymptotic expression for the Airy function (see the footnote
on p. 179), and obtain:
$$I_n = \frac{e^4H^2n^{1/2}}{2\sqrt{\pi}\,m^2c^3}\left(\frac{mc^2}{\mathscr{E}}\right)^{5/2}\exp\left\{-\frac{2}{3}n\left(\frac{mc^2}{\mathscr{E}}\right)^3\right\}, \qquad n \gg \left(\frac{\mathscr{E}}{mc^2}\right)^3, \tag{74.15}$$
that is, the intensity drops exponentially for large $n$.
Consequently the spectrum has a maximum for
$$n \sim \left(\frac{\mathscr{E}}{mc^2}\right)^3,$$
and the main part of the radiation is concentrated in the region of frequencies for which
$$\omega \sim \omega_H\left(\frac{\mathscr{E}}{mc^2}\right)^3 = \frac{eH}{mc}\left(\frac{\mathscr{E}}{mc^2}\right)^2. \tag{74.16}$$
These values of $\omega$ are very large, compared to the distance $\omega_H$ between neighboring frequencies.
We may say that the spectrum has a "quasicontinuous" character, consisting of a
large number of closely spaced lines.
In place of the distribution function $I_n$ we can therefore introduce a distribution over the
continuous series of frequencies $\omega = n\omega_H$, writing
$$dI = I_n\,dn = I_n\frac{d\omega}{\omega_H}.$$
† In making the substitution, the limit $n^{2/3}$ of the integral can be changed to infinity, to within the required
accuracy; we have also set $v = c$ wherever possible. Even though values of $\xi$ close to 1 are important in the
integral (74.9), the use of formula (74.12) is still permissible, since the integral converges rapidly at the lower
limit.
‡ From the definition of the Airy function, we have:
$$\Phi'(0) = -\frac{1}{\sqrt{\pi}}\int_0^\infty\xi\sin\frac{\xi^3}{3}\,d\xi = -\frac{1}{\sqrt{\pi}\,3^{1/3}}\int_0^\infty x^{-1/3}\sin x\,dx = -\frac{3^{1/6}\,\Gamma(\tfrac{2}{3})}{2\sqrt{\pi}}.$$
For numerical computations it is convenient to express this distribution in terms of the
MacDonald function $K_\nu$.† After some simple transformations of formula (74.13), it can be
written as
$$dI = d\omega\,\frac{\sqrt{3}}{2\pi}\,\frac{e^3H}{mc^2}\,F\left(\frac{\omega}{\omega_c}\right), \qquad F(\xi) = \xi\int_\xi^\infty K_{5/3}(\xi)\,d\xi, \tag{74.17}$$
where we use the notation
$$\omega_c = \frac{3eH}{2mc}\left(\frac{\mathscr{E}}{mc^2}\right)^2. \tag{74.18}$$
Figure 17 shows a graph of the function $F(\xi)$.
Fig. 17. [Graph of $F(\xi)$ against $\xi$; the maximum value 0.92 is marked at $\xi = 0.29$.]
Finally, a few comments on the case when the particle moves, not in a plane orbit, but in
a helical trajectory, i.e. has a longitudinal velocity (along the field) $v_\parallel = v\cos\chi$ (where $\chi$
is the angle between $\mathbf{H}$ and $\mathbf{v}$). The frequency of the rotational motion is given by the same
formula (74.1), but now the vector $\mathbf{v}$ moves, not in a circle, but on the surface of a cone with
its axis along $\mathbf{H}$ and with vertex angle $2\chi$. The total intensity of the radiation (defined as the
total energy loss per sec from the particle) will differ from (74.2) in having $H$ replaced by
$H_\perp = H\sin\chi$.
In the ultrarelativistic case the radiation is concentrated in directions near the generators
of the "velocity cone". The spectral distribution and the total intensity (in the same sense as
above) are obtained from (74.17) and (74.10) by the substitution $H \to H_\perp$. If we are talking
about the intensity as seen in these directions by an observer at rest, then we must introduce
an additional factor $(1-(v_\parallel/c)\cos\chi)^{-1} \cong \sin^{-2}\chi$, which takes into account the approach
of the radiator to the observer, which occurs at a velocity $v_\parallel\cos\chi$.
† The connection between the Airy function and the function $K_{1/3}$ is given by formula (4) of the footnote
on p. 179. In the further transformations one uses the recursion relations
$$K_{\nu-1}(x)-K_{\nu+1}(x) = -\frac{2\nu}{x}K_\nu, \qquad 2K_\nu'(x) = -K_{\nu-1}(x)-K_{\nu+1}(x),$$
where $K_{-\nu}(x) = K_\nu(x)$. In particular, it is easy to show that
$$\Phi'(t) = -\frac{t}{\sqrt{3\pi}}\,K_{2/3}\left(\frac{2}{3}t^{3/2}\right).$$
PROBLEMS
1. Find the law of variation of energy with time for a charge moving in a circular orbit in a
constant uniform magnetic field, and losing energy by radiation.
Solution: According to (74.2), we have for the energy loss per unit time:
$$-\frac{d\mathscr{E}}{dt} = \frac{2e^4H^2}{3m^4c^7}(\mathscr{E}^2-m^2c^4)$$
($\mathscr{E}$ is the energy of the particle). From this we find:
$$\frac{\mathscr{E}}{mc^2} = \coth\left(\frac{2e^4H^2}{3m^3c^5}\,t+\text{const}\right).$$
As $t$ increases, the energy decreases monotonically, approaching the value $\mathscr{E} = mc^2$ (for complete
stopping of the particle) asymptotically as $t\to\infty$.
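The solution can be compared with a direct numerical integration. The following is a minimal sketch in the dimensionless variables $\varepsilon = \mathscr{E}/mc^2$ and $\tau = (2e^4H^2/3m^3c^5)\,t$, in which the energy-loss equation reads $d\varepsilon/d\tau = -(\varepsilon^2-1)$; these variables, the function names, the Runge-Kutta scheme and the initial value are assumptions chosen for illustration.

```python
import math

def energy_exact(eps0, tau):
    """coth(tau + const), with the constant fixed by the initial energy eps0 > 1."""
    tau0 = 0.5 * math.log((eps0 + 1.0) / (eps0 - 1.0))   # arccoth(eps0)
    return 1.0 / math.tanh(tau + tau0)

def energy_rk4(eps0, tau, steps=10000):
    """Fourth-order Runge-Kutta integration of d(eps)/d(tau) = -(eps^2 - 1)."""
    rhs = lambda eps: -(eps * eps - 1.0)
    h = tau / steps
    eps = eps0
    for _ in range(steps):
        k1 = rhs(eps)
        k2 = rhs(eps + 0.5 * h * k1)
        k3 = rhs(eps + 0.5 * h * k2)
        k4 = rhs(eps + h * k3)
        eps += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return eps

eps_numeric = energy_rk4(10.0, 1.0)
eps_closed = energy_exact(10.0, 1.0)
```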
2. Find the asymptotic formula for the spectral distribution of the radiation at large values of $n$
for a particle moving in a circle with a velocity which is not close to the velocity of light.
Solution: We use the well known asymptotic formula of the theory of Bessel functions
$$J_n(n\varepsilon) = \frac{1}{\sqrt{2\pi n}\,(1-\varepsilon^2)^{1/4}}\left[\frac{\varepsilon}{1+\sqrt{1-\varepsilon^2}}\,e^{\sqrt{1-\varepsilon^2}}\right]^n,$$
which is valid for $n(1-\varepsilon^2)^{3/2} \gg 1$. Using this formula, we find from (74.9):
$$I_n = \frac{e^4H^2n^{1/2}}{2\sqrt{\pi}\,m^2c^3}\left(1-\frac{v^2}{c^2}\right)^{5/4}\left[\frac{v/c}{1+\sqrt{1-\frac{v^2}{c^2}}}\,e^{\sqrt{1-\frac{v^2}{c^2}}}\right]^{2n}.$$
This formula is applicable for $n[1-(v^2/c^2)]^{3/2} \gg 1$; if in addition $1-(v^2/c^2)$ is small, the formula
goes over into (74.15).
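The quality of the asymptotic formula for $J_n(n\varepsilon)$ can be judged from a numerical comparison. The following is a minimal sketch that evaluates $J_n(x)$ from Bessel's integral $(1/\pi)\int_0^\pi\cos(n\tau-x\sin\tau)\,d\tau$ by the midpoint rule; the function names and the sample values $n = 20$, $\varepsilon = 0.5$ (for which $n(1-\varepsilon^2)^{3/2} \approx 13$) are illustrative assumptions.

```python
import math

def bessel_j(n, x, steps=4000):
    """J_n(x) for integer n from Bessel's integral, midpoint rule on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi

def bessel_j_asymptotic(n, eps):
    """Asymptotic form of J_n(n * eps) for n (1 - eps^2)^(3/2) >> 1."""
    s = math.sqrt(1.0 - eps**2)
    return (eps * math.exp(s) / (1.0 + s))**n / (math.sqrt(2.0 * math.pi * n) * math.sqrt(s))

n, eps = 20, 0.5
exact = bessel_j(n, n * eps)
approx = bessel_j_asymptotic(n, eps)
relative_error = abs(approx - exact) / abs(exact)
```

For these values the two agree to about one per cent, the residual being of relative order $1/n$.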
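3. Find the polarization of the synchrotron radiation.
Solution: The electric field $\mathbf{E}_n$ is calculated from the vector potential $\mathbf{A}_n$ (74.6-7) according to the
formula
$$\mathbf{E}_n = \frac{i}{k}(\mathbf{k}\times\mathbf{A}_n)\times\mathbf{k} = -\frac{i}{k}\,\mathbf{k}(\mathbf{k}\cdot\mathbf{A}_n)+ik\mathbf{A}_n.$$
Let $\mathbf{e}_1$, $\mathbf{e}_2$ be unit vectors in the plane perpendicular to $\mathbf{k}$, where $\mathbf{e}_1$ is parallel to the x axis and $\mathbf{e}_2$
lies in the yz plane (their components are $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, \sin\theta, -\cos\theta)$); the vectors $\mathbf{e}_1$, $\mathbf{e}_2$
and $\mathbf{k}$ form a right-hand system. Then the electric field will be:
$$\mathbf{E}_n = ikA_{xn}\mathbf{e}_1+ik\sin\theta\,A_{yn}\mathbf{e}_2,$$
or, dropping the unimportant common factors:
$$\mathbf{E}_n \sim \frac{v}{c}J_n'\left(\frac{nv}{c}\cos\theta\right)\mathbf{e}_1+\tan\theta\,J_n\left(\frac{nv}{c}\cos\theta\right)i\mathbf{e}_2.$$
The wave is elliptically polarized (see § 48).
In the ultrarelativistic case, for large $n$ and small angles $\theta$, the functions $J_n$ and $J_n'$ are expressed
in terms of $K_{1/3}$ and $K_{2/3}$, where we set
$$1-\frac{v^2}{c^2}\cos^2\theta \approx 1-\frac{v^2}{c^2}+\theta^2 = \left(\frac{mc^2}{\mathscr{E}}\right)^2+\theta^2$$
in the arguments. As a result we get:
$$\mathbf{E}_n = \mathbf{e}_1\psi K_{2/3}\left(\frac{n}{3}\psi^3\right)+i\mathbf{e}_2\theta K_{1/3}\left(\frac{n}{3}\psi^3\right), \qquad \psi = \sqrt{\left(\frac{mc^2}{\mathscr{E}}\right)^2+\theta^2}.$$
For $\theta = 0$ the elliptical polarization degenerates into linear polarization along $\mathbf{e}_1$. For large
$\theta$ ($|\theta| \gg mc^2/\mathscr{E}$, $n\theta^3 \gg 1$), we have $K_{1/3}(x) \approx K_{2/3}(x) \approx \sqrt{\pi/2x}\,e^{-x}$, and the polarization tends to
become circular: $\mathbf{E}_n \sim \mathbf{e}_1\pm i\mathbf{e}_2$; the intensity of the radiation, however, also becomes exponentially
small. In the intermediate range of angles the minor axis of the ellipse lies along $\mathbf{e}_2$ and the major
axis along $\mathbf{e}_1$. The direction of rotation depends on the sign of the angle $\theta$ ($\theta > 0$ if the directions
of $\mathbf{H}$ and $\mathbf{k}$ lie on opposite sides of the orbit plane, as shown in Fig. 16).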
§ 75. Radiation damping
In § 65 we showed that the expansion of the potentials of the field of a system of charges
in a series of powers of v/c leads in the second approximation to a Lagrangian completely
describing (in this approximation) the motion of the charges. We now continue the expansion
of the field to terms of higher order and discuss the effects to which these terms lead.
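In the expansion of the scalar potential
$$\phi = \int\frac{1}{R}\,\varrho_{t-\frac{R}{c}}\,dV,$$
the term of third order in $1/c$ is
$$\phi^{(3)} = -\frac{1}{6c^3}\frac{\partial^3}{\partial t^3}\int R^2\varrho\,dV. \tag{75.1}$$
For the same reason as in the derivation following (65.3), in the expansion of the vector
potential we need only take the term of second order in $1/c$, that is,
$$\mathbf{A}^{(2)} = -\frac{1}{c^2}\frac{\partial}{\partial t}\int\mathbf{j}\,dV. \tag{75.2}$$
We make a transformation of the potentials:
$$\phi' = \phi-\frac{1}{c}\frac{\partial f}{\partial t}, \qquad \mathbf{A}' = \mathbf{A}+\operatorname{grad}f,$$
choosing the function $f$ so that the scalar potential $\phi^{(3)}$ becomes zero:
$$f = -\frac{1}{6c^2}\frac{\partial^2}{\partial t^2}\int R^2\varrho\,dV.$$
Then the new vector potential is equal to
$$\mathbf{A}'^{(2)} = -\frac{1}{c^2}\frac{\partial}{\partial t}\int\mathbf{j}\,dV-\frac{1}{6c^2}\frac{\partial^2}{\partial t^2}\nabla\int R^2\varrho\,dV = -\frac{1}{c^2}\frac{\partial}{\partial t}\int\mathbf{j}\,dV-\frac{1}{3c^2}\frac{\partial^2}{\partial t^2}\int\mathbf{R}\varrho\,dV.$$
Making the transition from the integral to a sum over individual charges, we get for the
first term on the right the expression
$$-\frac{1}{c^2}\sum e\dot{\mathbf{v}}.$$
In the second term, we write $\mathbf{R} = \mathbf{R}_0-\mathbf{r}$, where $\mathbf{R}_0$ and $\mathbf{r}$ have their usual meaning (see § 66);
then $\dot{\mathbf{R}} = -\dot{\mathbf{r}} = -\mathbf{v}$ and the second term takes the form
$$\frac{1}{3c^2}\sum e\dot{\mathbf{v}}.$$
Thus,
$$\mathbf{A}'^{(2)} = -\frac{2}{3c^2}\sum e\dot{\mathbf{v}}. \tag{75.3}$$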
The magnetic field corresponding to this potential is zero ($\mathbf{H} = \operatorname{curl}\mathbf{A}'^{(2)} = 0$), since
$\mathbf{A}'^{(2)}$ does not contain the coordinates explicitly. The electric field $\mathbf{E} = -(1/c)\dot{\mathbf{A}}'^{(2)}$ is
$$\mathbf{E} = \frac{2}{3c^3}\dddot{\mathbf{d}}, \tag{75.4}$$
where $\mathbf{d}$ is the dipole moment of the system.
Thus the third order terms in the expansion of the field lead to certain additional forces
acting on the charges, not contained in the Lagrangian (65.7); these forces depend on the
time derivatives of the accelerations of the charges.
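Let us consider a system of charges carrying out a stationary motion† and calculate the
average work done by the field (75.4) per unit time. The force acting on each charge $e$ is
$\mathbf{f} = e\mathbf{E}$, that is,
$$\mathbf{f} = \frac{2e}{3c^3}\dddot{\mathbf{d}}. \tag{75.5}$$
The work done by this force in unit time is $\mathbf{f}\cdot\mathbf{v}$, so that the total work performed on all the
charges is equal to the sum, taken over all the charges:
$$\sum\mathbf{f}\cdot\mathbf{v} = \frac{2}{3c^3}\dddot{\mathbf{d}}\cdot\sum e\mathbf{v} = \frac{2}{3c^3}\dddot{\mathbf{d}}\cdot\dot{\mathbf{d}} = \frac{2}{3c^3}\frac{d}{dt}(\dot{\mathbf{d}}\cdot\ddot{\mathbf{d}})-\frac{2}{3c^3}\ddot{\mathbf{d}}^2.$$
When we average over the time, the first term vanishes, so that the average work is equal to
$$\sum\overline{\mathbf{f}\cdot\mathbf{v}} = -\frac{2}{3c^3}\overline{\ddot{\mathbf{d}}^2}. \tag{75.6}$$
The expression standing on the right is (except for a sign reversal) just the average energy
radiated by the system in unit time [see (67.8)]. Thus, the forces (75.5) appearing in third
approximation, describe the reaction of the radiation on the charges. These forces are called
radiation damping or Lorentz frictional forces.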
Simultaneously with the energy loss from a radiating system of charges, there also occurs
a certain loss of angular momentum. The decrease in angular momentum per unit time,
$d\mathbf{M}/dt$, is easily calculated with the aid of the expression for the damping forces. Taking the
time derivative of the angular momentum $\mathbf{M} = \sum\mathbf{r}\times\mathbf{p}$, we have $\dot{\mathbf{M}} = \sum\mathbf{r}\times\dot{\mathbf{p}}$, since
$\sum\dot{\mathbf{r}}\times\mathbf{p} = \sum m(\mathbf{v}\times\mathbf{v}) \equiv 0$. We replace the time derivative of the momentum of the particle
by the friction force (75.5) acting on it, and find
$$\dot{\mathbf{M}} = \sum\mathbf{r}\times\mathbf{f} = \frac{2}{3c^3}\sum e\,\mathbf{r}\times\dddot{\mathbf{d}} = \frac{2}{3c^3}\,\mathbf{d}\times\dddot{\mathbf{d}}.$$
We are interested in the time average of the loss of angular momentum for a stationary
motion, just as before we considered the time average of the energy loss. Writing
$$\mathbf{d}\times\dddot{\mathbf{d}} = \frac{d}{dt}(\mathbf{d}\times\ddot{\mathbf{d}})-\dot{\mathbf{d}}\times\ddot{\mathbf{d}}$$
and noting that the time derivative (first term) vanishes on averaging, we finally obtain the
following expression for the average loss of angular momentum of a radiating system:
$$\overline{\frac{d\mathbf{M}}{dt}} = -\frac{2}{3c^3}\,\overline{\dot{\mathbf{d}}\times\ddot{\mathbf{d}}}. \tag{75.7}$$
† More precisely, a motion which, although it would have been stationary if radiation were neglected,
proceeds with continual slowing down.
Radiation damping occurs also for a single charge moving in an external field. It is
equal to
$$\mathbf{f} = \frac{2e^2}{3c^3}\ddot{\mathbf{v}}. \tag{75.8}$$
For a single charge, we can always choose such a system of reference that the charge at the
given moment is at rest in it. If, in this reference frame, we calculate the higher terms in the
expansion of the field produced by the charge, it turns out that they have the following
property. As the radius vector R from the charge to the field point approaches zero, all these
terms become zero. Thus in the case of a single charge, formula (75.8) is an exact formula for
the reaction of the radiation, in the system of reference in which the charge is at rest.
Nevertheless, we must keep in mind that the description of the action of the charge "on
itself" with the aid of the damping force is unsatisfactory in general, and contains contradictions.
The equation of motion of a charge, in the absence of an external field, on which only
the force (75.8) acts, has the form
$$m\dot{\mathbf{v}} = \frac{2e^2}{3c^3}\ddot{\mathbf{v}}.$$
This equation has, in addition to the trivial solution $\mathbf{v} = \text{const}$, another solution in which the
acceleration $\dot{\mathbf{v}}$ is proportional to $\exp(3mc^3t/2e^2)$, that is, increases indefinitely with the time.
This means, for example, that a charge passing through any field, upon emergence from the
field, would have to be infinitely "self-accelerated". The absurdity of this result is evidence for
the limited applicability of formula (75.8).
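To see how violent this "self-acceleration" would be, one can evaluate the characteristic time $2e^2/3mc^3$ in the exponent. The following is a minimal sketch for an electron in Gaussian units; the rounded values of the constants and all names are assumptions introduced for illustration.

```python
import math

E_CHARGE = 4.8032e-10    # electron charge, esu (rounded)
E_MASS = 9.1094e-28      # electron mass, g (rounded)
C_LIGHT = 2.9979e10      # velocity of light, cm/s (rounded)

# Characteristic time of the runaway solution exp(3 m c^3 t / 2 e^2)
runaway_time = 2.0 * E_CHARGE**2 / (3.0 * E_MASS * C_LIGHT**3)

def acceleration_growth(t):
    """Factor by which the acceleration of the runaway solution grows in time t (seconds)."""
    return math.exp(t / runaway_time)

growth_in_1e_22_s = acceleration_growth(1.0e-22)
```

The time comes out at roughly $6\times10^{-24}$ sec, so that the acceleration would grow by many orders of magnitude within $10^{-22}$ sec.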
One can raise the question of how electrodynamics, which satisfies the law of conservation
of energy, can lead to the absurd result that a free charge increases its energy without limit.
Actually the root of this difficulty lies in the earlier remarks (§ 37) concerning the infinite
electromagnetic "intrinsic mass" of elementary particles. When in the equation of motion
we write a finite mass for the charge, then in doing this we essentially assign to it formally an
infinite negative "intrinsic mass" of nonelectromagnetic origin, which together with the
electromagnetic mass should result in a finite mass for the particle. Since, however, the subtraction
of one infinity from another is not an entirely correct mathematical operation, this
leads to a series of further difficulties, among which is the one mentioned here.
In a system of coordinates in which the velocity of the particle is small, the equation of
motion when we include the radiation damping has the form
$$m\dot{\mathbf{v}} = e\mathbf{E}+\frac{e}{c}\mathbf{v}\times\mathbf{H}+\frac{2e^2}{3c^3}\ddot{\mathbf{v}}. \tag{75.9}$$
From our discussion, this equation is applicable only to the extent that the damping force
is small compared with the force exerted on the charge by the external field.
To clarify the physical meaning of this condition, we proceed as follows. In the system of
reference in which the charge is at rest at a given moment, the second time derivative of the
velocity is equal, neglecting the damping force, to
$$\ddot{\mathbf{v}} = \frac{e}{m}\dot{\mathbf{E}}+\frac{e}{mc}\dot{\mathbf{v}}\times\mathbf{H}.$$
In the second term we substitute (to the same order of accuracy) $\dot{\mathbf{v}} = (e/m)\mathbf{E}$, and obtain
$$\ddot{\mathbf{v}} = \frac{e}{m}\dot{\mathbf{E}}+\frac{e^2}{m^2c}\mathbf{E}\times\mathbf{H}.$$
Corresponding to this, the damping force consists of two terms:
$$\mathbf{f} = \frac{2e^3}{3mc^3}\dot{\mathbf{E}}+\frac{2e^4}{3m^2c^4}\mathbf{E}\times\mathbf{H}. \tag{75.10}$$
If $\omega$ is the frequency of the motion, then $\dot{\mathbf{E}}$ is proportional to $\omega\mathbf{E}$ and, consequently, the
first term is of order $(e^3\omega/mc^3)E$; the second is of order $(e^4/m^2c^4)EH$. Therefore the condition
for the damping force to be small compared with the force $e\mathbf{E}$ exerted by the external field on
the charge gives, first of all,
$$\frac{e^2}{mc^3}\,\omega \ll 1,$$
or, introducing the wavelength $\lambda \sim c/\omega$,
$$\lambda \gg \frac{e^2}{mc^2}. \tag{75.11}$$
Thus formula (75.8) for the radiation damping is applicable only if the wavelength of the
radiation incident on the charge is large compared with the "radius" of the charge $e^2/mc^2$.
We see that once more a distance of order $e^2/mc^2$ appears as the limit at which electrodynamics
leads to internal contradictions (see § 37).
Secondly, comparing the second term in the damping force to the force $e\mathbf{E}$, we find the
condition
$$H \ll \frac{m^2c^4}{e^3}. \tag{75.12}$$
Thus it is also necessary that the field itself be not too large. A field of order $m^2c^4/e^3$ also
represents a limit at which classical electrodynamics leads to internal contradictions. Also
we must remember here that actually, because of quantum effects, electrodynamics is
already not applicable for considerably smaller fields.†
To avoid misunderstanding, we remind the reader that the wavelength in (75.11) and the
field value in (75.12) refer to the system of reference in which the particle is at rest at the
given moment.
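For orientation, the limits (75.11) and (75.12) can be evaluated for an electron and compared with the quantum limit $m^2c^3/\hbar e$ mentioned in the footnote. The following is a minimal sketch in Gaussian units; the rounded values of the constants and the variable names are assumptions introduced for illustration.

```python
E_CHARGE = 4.8032e-10    # electron charge, esu (rounded)
E_MASS = 9.1094e-28      # electron mass, g (rounded)
C_LIGHT = 2.9979e10      # velocity of light, cm/s (rounded)
H_BAR = 1.0546e-27       # Planck's constant divided by 2 pi, erg s (rounded)

# "Radius" of the charge entering (75.11), in cm
charge_radius = E_CHARGE**2 / (E_MASS * C_LIGHT**2)

# Classical field limit of (75.12) and the quantum limit of the footnote
classical_field_limit = E_MASS**2 * C_LIGHT**4 / E_CHARGE**3
quantum_field_limit = E_MASS**2 * C_LIGHT**3 / (H_BAR * E_CHARGE)

# The ratio of the two limits is hbar c / e^2, i.e. about 137
ratio = classical_field_limit / quantum_field_limit
```

The radius is about $2.8\times10^{-13}$ cm, and the quantum limit lies below the classical one by the factor $\hbar c/e^2 \approx 137$.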
PROBLEMS
1. Calculate the time in which two attracting charges, performing an elliptic motion (with
velocity small compared with the velocity of light) and losing energy due to radiation, "fall in"
toward each other.
Solution: Assuming that the relative energy loss in one revolution is small, we can equate the
time derivative of the energy to the average intensity of the radiation (which was determined in
problem 1 of § 70):
$$\frac{d|\mathscr{E}|}{dt} = \frac{(2|\mathscr{E}|)^{3/2}\mu^{5/2}\alpha^3}{3c^3M^5}\left(\frac{e_1}{m_1}-\frac{e_2}{m_2}\right)^2\left(3-\frac{2|\mathscr{E}|M^2}{\mu\alpha^2}\right), \tag{1}$$
where $\alpha = |e_1e_2|$. Together with the energy, the particles lose angular momentum. The loss of
angular momentum per unit time is given by formula (75.7); substituting the expression (70.1) for
$\mathbf{d}$, and noting that $\mu\ddot{\mathbf{r}} = -\alpha\mathbf{r}/r^3$ and $\mathbf{M} = \mu\,\mathbf{r}\times\mathbf{v}$, we find:
$$\frac{d\mathbf{M}}{dt} = -\frac{2\alpha}{3c^3}\left(\frac{e_1}{m_1}-\frac{e_2}{m_2}\right)^2\frac{\mathbf{M}}{r^3}.$$
† For fields of order $m^2c^3/\hbar e$, where $\hbar$ is Planck's constant.
We average this expression over a period of the motion. Because of the slowness of the changes in
$M$, it is sufficient to average on the right only over $r^{-3}$; this average value is computed in precisely
the same way as the average of $r^{-4}$ was found in problem 1 of § 70. As a result we find for the
average loss of angular momentum per unit time the following expression:
$$\frac{dM}{dt} = -\frac{2\alpha(2\mu|\mathscr{E}|)^{3/2}}{3c^3M^2}\left(\frac{e_1}{m_1}-\frac{e_2}{m_2}\right)^2 \tag{2}$$
[as in equation (1), we omit the average sign]. Dividing (1) by (2), we get the differential equation
$$\frac{d|\mathscr{E}|}{dM} = -\frac{\mu\alpha^2}{2M^3}\left(3-\frac{2|\mathscr{E}|M^2}{\mu\alpha^2}\right),$$
which, on integration, gives:
$$|\mathscr{E}| = \frac{\mu\alpha^2}{2M^2}\left(1-\frac{M^3}{M_0^3}\right)+\frac{|\mathscr{E}_0|}{M_0}M. \tag{3}$$
The constant of integration is chosen so that for $M = M_0$, we have $\mathscr{E} = \mathscr{E}_0$, where $M_0$ and $\mathscr{E}_0$ are
the initial angular momentum and energy of the particles.
The "falling in" of the particles toward one another corresponds to $M\to0$. From (3) we see
that then $\mathscr{E}\to-\infty$.
We note that the product $|\mathscr{E}|M^2$ tends toward $\mu\alpha^2/2$, and from formula (70.3) it is clear that the
eccentricity $\varepsilon\to0$, i.e. as the particles approach one another, the orbit approaches a circle. Substituting
(3) in (2), we determine the derivative $dt/dM$ expressed as a function of $M$, after which
integration with respect to $M$ between the limits $M_0$ and 0 gives the time of fall:
$$t_{\text{fall}} = \frac{c^3M_0^5}{\alpha\sqrt{2|\mathscr{E}_0|\mu^3}}\left(\frac{e_1}{m_1}-\frac{e_2}{m_2}\right)^{-2}\left(\sqrt{\mu\alpha^2}+\sqrt{2M_0^2|\mathscr{E}_0|}\right)^{-2}.$$
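The final integration can be checked numerically. The following is a minimal sketch in dimensionless units with $c = \mu = \alpha = 1$ and $q = e_1/m_1-e_2/m_2 = 1$, assuming the energy law $|\mathscr{E}| = (\mu\alpha^2/2M^2)(1-M^3/M_0^3)+|\mathscr{E}_0|M/M_0$ and the averaged rate $|dM/dt| = 2\alpha(2\mu|\mathscr{E}|)^{3/2}q^2/3c^3M^2$; the function names and the sample initial values are illustrative (they satisfy $2|\mathscr{E}_0|M_0^2 \le \mu\alpha^2$, as required for a bound orbit).

```python
import math

def energy_abs(M, M0, E0, mu=1.0, alpha=1.0):
    """|E| as a function of the angular momentum M, from the integrated equation."""
    return mu * alpha**2 / (2 * M**2) * (1 - M**3 / M0**3) + E0 * M / M0

def fall_time_numeric(M0, E0, mu=1.0, alpha=1.0, q=1.0, c=1.0, steps=100000):
    """Midpoint-rule integral of dM / |dM/dt| from M = 0 to M = M0."""
    h = M0 / steps
    total = 0.0
    for i in range(steps):
        M = (i + 0.5) * h
        rate = (2 * alpha * (2 * mu * energy_abs(M, M0, E0, mu, alpha))**1.5 * q**2
                / (3 * c**3 * M**2))
        total += h / rate
    return total

def fall_time_closed(M0, E0, mu=1.0, alpha=1.0, q=1.0, c=1.0):
    """Closed-form time of fall quoted in the solution."""
    return (c**3 * M0**5
            / (alpha * math.sqrt(2 * E0 * mu**3) * q**2
               * (math.sqrt(mu * alpha**2) + M0 * math.sqrt(2 * E0))**2))

t_numeric = fall_time_numeric(1.0, 0.3)
t_closed = fall_time_closed(1.0, 0.3)
```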
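2. Find the Lagrangian for a system of two identical charged particles, correct to terms of fourth
order† (Ya. A. Smorodinskii and V. N. Golubenkov, 1956).
Solution: The computation is conveniently done by a scheme which is somewhat different from
the one used in § 65. We start from the expression for the Lagrangian of the particles and the
field produced by them,
$$L = \int\left\{\frac{1}{8\pi}(E^{2}-H^{2})+\frac{1}{c}\mathbf{j}\cdot\mathbf{A}-\rho\phi\right\}dV-\sum_a m_a c^{2}\sqrt{1-\frac{v_a^{2}}{c^{2}}}.$$
Writing
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}-\nabla\phi, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A},$$
and carrying out an integration by parts, we get:
$$\frac{1}{8\pi}\int(E^{2}-H^{2})\,dV = -\frac{1}{8\pi}\oint\{\mathbf{E}\phi+\mathbf{A}\times\mathbf{H}\}\cdot d\mathbf{f}-\frac{1}{8\pi c}\frac{d}{dt}\int\mathbf{E}\cdot\mathbf{A}\,dV-\frac{1}{2}\int\left(\frac{1}{c}\mathbf{j}\cdot\mathbf{A}-\rho\phi\right)dV.$$
For a system which does not emit dipole radiation, the integral over the infinitely distant surface
gives no contribution to the terms of order $1/c^{4}$. The term with the total time derivative can be
dropped from the Lagrangian. Thus the required fourth-order terms in the Lagrangian are contained
in the expression
$$L = \frac{1}{2}\int\left(\frac{1}{c}\mathbf{j}\cdot\mathbf{A}-\rho\phi\right)dV-\sum_a m_a c^{2}\sqrt{1-\frac{v_a^{2}}{c^{2}}}.$$
Continuing the expansion which was done in § 65, we find the terms of fourth order in the
potentials ($\phi$ and $\mathbf{A}/c$) of the field produced by charge 1 at the position of charge 2:
$$\phi_1(2) = \frac{e}{24c^{4}}\frac{\partial^{4}R^{3}}{\partial t^{4}}, \qquad \frac{1}{c}\mathbf{A}_1(2) = \frac{e}{2c^{4}}\frac{\partial^{2}}{\partial t^{2}}(R\mathbf{v}_1).$$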
† See the footnote on p. 166. The third order terms in the Lagrangian drop out automatically: the terms
of this order in the field produced by the particles are determined by the time derivative of the dipole moment
[see (75.3)], which is conserved in the present case.
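By the transformation (18.3) with the appropriate function f, we can bring these potentials to the
equivalent form
$$\phi_1(2) = 0, \qquad \frac{1}{c}\mathbf{A}_1(2) = \frac{e}{2c^{4}}\frac{\partial^{2}}{\partial t^{2}}(R\mathbf{v}_1)+\frac{e}{24c^{4}}\frac{\partial^{3}}{\partial t^{3}}(\nabla R^{3}) \qquad (1)$$
(the differentiation $\partial/\partial t$ is done for a fixed position of the field point, i.e. of charge 2; the
differentiation $\nabla$ is with respect to the coordinates of the field point).
The fourth-order terms in the Lagrangian now give the expression†
$$L^{(4)} = \frac{e}{2c}[\mathbf{A}_1(2)\cdot\mathbf{v}_2+\mathbf{A}_2(1)\cdot\mathbf{v}_1]+\frac{m}{16c^{4}}(v_1^{6}+v_2^{6}). \qquad (2)$$
After performing some of the differentiations in (1), we can represent $\mathbf{A}_1(2)$ as
$$\frac{1}{c}\mathbf{A}_1(2) = \frac{e}{8c^{4}}\frac{\partial\mathbf{F}_1}{\partial t}, \qquad \mathbf{F}_1 = \frac{\partial}{\partial t}[3R\mathbf{v}_1-R\mathbf{n}(\mathbf{n}\cdot\mathbf{v}_1)]$$
(where $\mathbf{n}$ is a unit vector in the direction from point 1 to point 2). Before making any further
calculations, it is convenient to eliminate from $L^{(4)}$ those terms which contain time derivatives of
the velocity which are higher than first order; for this purpose we note that
$$\frac{1}{c}\mathbf{A}_1(2)\cdot\mathbf{v}_2 = \frac{e}{8c^{4}}\mathbf{v}_2\cdot\frac{\partial\mathbf{F}_1}{\partial t} = \frac{e}{8c^{4}}\left\{\frac{d}{dt}(\mathbf{v}_2\cdot\mathbf{F}_1)-(\mathbf{v}_2\cdot\nabla)(\mathbf{v}_2\cdot\mathbf{F}_1)-\mathbf{F}_1\cdot\dot{\mathbf{v}}_2\right\},$$
where
$$\frac{d}{dt}(\mathbf{v}_2\cdot\mathbf{F}_1) = \frac{\partial}{\partial t}(\mathbf{v}_2\cdot\mathbf{F}_1)+(\mathbf{v}_2\cdot\nabla)(\mathbf{v}_2\cdot\mathbf{F}_1)$$
is the total time derivative (differentiation with respect to both ends of the vector $\mathbf{R}$!) and can be
dropped from the Lagrangian. The accelerations are eliminated from the resulting expression by
using the equation of motion of the first approximation: $m\dot{\mathbf{v}}_1 = -e^{2}\mathbf{n}/R^{2}$, $m\dot{\mathbf{v}}_2 = e^{2}\mathbf{n}/R^{2}$. After
a rather long computation, we finally get: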
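$$L^{(4)} = \frac{e^{2}}{8c^{4}R}\Big\{[-v_1^{2}v_2^{2}+2(\mathbf{v}_1\cdot\mathbf{v}_2)^{2}-3(\mathbf{n}\cdot\mathbf{v}_1)^{2}(\mathbf{n}\cdot\mathbf{v}_2)^{2}+(\mathbf{n}\cdot\mathbf{v}_1)^{2}v_2^{2}+(\mathbf{n}\cdot\mathbf{v}_2)^{2}v_1^{2}]+\dots\Big\}+\dots$$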
From the symmetry in the two identical particles, it is clear beforehand that $\mathbf{v}_1 = -\mathbf{v}_2$ in the
system of reference in which their center of inertia is at rest. Then the fourth-order terms in the
Lagrangian are:
where $\mathbf{v} = \mathbf{v}_2-\mathbf{v}_1$.
§ 76. Radiation damping in the relativistic case
We derive the relativistic expression for the radiation damping (for a single charge), which
is applicable also to motion with velocity comparable to that of light. This force is now a
four-vector $g^i$, which must be included in the equation of motion of the charge, written in
four-dimensional form:
$$mc\frac{du^i}{ds} = \frac{e}{c}F^{ik}u_k+g^i. \qquad (76.1)$$
† Here we omit the infinite terms associated with the action on the particles of their "self" fields. This
operation corresponds to a "renormalization" of the masses appearing in the Lagrangian (see the footnote
on p. 90).
To determine $g^i$ we note that for $v \ll c$, its three space components must go over into the
components of the vector $\mathbf{f}/c$ (75.8). It is easy to see that the vector $(2e^2/3c)(d^2u^i/ds^2)$
has this property. However, it does not satisfy the identity $g^i u_i = 0$, which is valid for any
force four-vector. In order to satisfy this condition, we must add to the expression given a
certain auxiliary four-vector, made up from the four-velocity $u^i$ and its derivatives. The
three space components of this vector must become zero in the limiting case $\mathbf{v} = 0$, in order
not to change the correct values of $\mathbf{f}$ which are already given by $(2e^2/3c)(d^2u^i/ds^2)$. The
four-vector $u^i$ has this property, and therefore the required auxiliary term has the form $\alpha u^i$.
The scalar $\alpha$ must be chosen so that we satisfy the auxiliary relation $g^i u_i = 0$. As a result we
find
$$g^i = \frac{2e^2}{3c}\left(\frac{d^2u^i}{ds^2}-u^i u^k\frac{d^2u_k}{ds^2}\right). \qquad (76.2)$$
In accordance with the equations of motion, this expression can be written in another form,
by expressing $d^2u^i/ds^2$ directly in terms of the field tensor of the external field acting on the
particle:
$$\frac{du^i}{ds} = \frac{e}{mc^2}F^{ik}u_k,$$
$$\frac{d^2u^i}{ds^2} = \frac{e}{mc^2}\frac{\partial F^{ik}}{\partial x^l}u_k u^l+\frac{e^2}{m^2c^4}F^{ik}F_{kl}u^l.$$
In making substitutions, we must keep in mind that the product of the tensor $\partial F^{ik}/\partial x^l$,
which is antisymmetric in the indices i, k, and the symmetric tensor $u_i u_k$ gives identically
zero. So,
$$g^i = \frac{2e^3}{3mc^3}\frac{\partial F^{ik}}{\partial x^l}u_k u^l-\frac{2e^4}{3m^2c^5}F^{il}F_{kl}u^k+\frac{2e^4}{3m^2c^5}(F_{kl}u^l)(F^{km}u_m)u^i. \qquad (76.3)$$
The integral of the four-force $g^i$ over the world line of the motion of a charge, passing
through a given field, must coincide (except for opposite sign) with the total four-momentum
$\Delta P^i$ of the radiation from the charge [just as the average value of the work of the force $\mathbf{f}$ in
the nonrelativistic case coincides with the intensity of dipole radiation; see (75.6)]. It is
easy to check that this is actually so. The first term in (76.2) goes to zero on performing the
integration, since at infinity the particle has no acceleration, i.e. $du^i/ds = 0$. We integrate
the second term by parts and get:
$$-\int g^i\,ds = \frac{2e^2}{3c}\int u^i u^k\frac{d^2u_k}{ds^2}\,ds = -\frac{2e^2}{3c}\int\frac{du_k}{ds}\frac{du^k}{ds}\,dx^i,$$
which coincides exactly with (73.4).
When the velocity of the particle approaches the velocity of light, those terms in the space
components of the four-vector (76.3) increase most rapidly which are of third order in the
components of the four-velocity. Therefore, keeping only these terms in
(76.3) and using the relation (9.18) between the space components of the four-vector $g^i$ and
the three-dimensional force $\mathbf{f}$, we find for the latter:
$$\mathbf{f} = \frac{2e^4}{3m^2c^4}(F_{kl}u^l)(F^{km}u_m)\,\mathbf{n},$$
where $\mathbf{n}$ is a unit vector in the direction of $\mathbf{v}$.
Consequently, in this case the force $\mathbf{f}$ is opposite to the velocity of the particle; choosing the
latter as the X axis, and writing out the four-dimensional expressions, we obtain:
$$f_x = -\frac{2e^4}{3m^2c^4}\frac{(E_y-H_z)^2+(E_z+H_y)^2}{1-\dfrac{v^2}{c^2}} \qquad (76.4)$$
(where we have set v = c everywhere except in the denominator).
We see that for an ultrarelativistic particle, the radiation damping is proportional to the
square of its energy.
Let us call attention to the following interesting situation. Earlier we pointed out that the
expression obtained for the radiation damping is applicable only to fields which (in the
reference system $K_0$, in which the particle is at rest) are small compared with $m^2c^4/e^3$. Let
F be the order of magnitude of the external field in the reference system K, in which the
particle moves with velocity v. Then in the $K_0$ frame, the field has the order of magnitude
$F/\sqrt{1-v^2/c^2}$ (see the transformation formulas in § 24). Therefore F must satisfy the
condition
$$\frac{e^3F}{m^2c^4\sqrt{1-\dfrac{v^2}{c^2}}} \ll 1. \qquad (76.5)$$
At the same time, the ratio of the damping force (76.4) to the external force (~ eF) is of
the order of
$$\frac{e^3F}{m^2c^4\left(1-\dfrac{v^2}{c^2}\right)},$$
and we see that, even though the condition (76.5) is satisfied, it may happen (for sufficiently
high energy of the particle) that the damping force is large compared with the ordinary
Lorentz force acting on the particle in the electromagnetic field.† Thus for an ultrarelativistic
particle we can have the case where the radiation damping is the main force acting on the
particle.
In this case the loss of (kinetic) energy of the particle per unit length of path can be
equated to the damping force $f_x$ alone; keeping in mind that the latter is proportional to the
square of the energy of the particle, we write
$$-\frac{d\mathscr{E}_{\text{kin}}}{dx} = k(x)\,\mathscr{E}_{\text{kin}}^2,$$
where we denote by k(x) the coefficient, depending on the x coordinate and expressed in
terms of the transverse components of the field in accordance with (76.4). Integrating this
differential equation, we find
$$\frac{1}{\mathscr{E}_{\text{kin}}} = \frac{1}{\mathscr{E}_0}+\int_{-\infty}^{x}k(x)\,dx,$$
† We should emphasize that this result does not in any way contradict the derivation given earlier of the
relativistic expression for the four-force $g^i$, in which it was assumed to be "small" compared with the
four-force $(e/c)F^{ik}u_k$. It is sufficient to satisfy the requirement that the components of one vector be small
compared to those of another in just one frame of reference; by virtue of relativistic invariance, the
four-dimensional formulas obtained on the basis of such an assumption will be valid in any other reference frame.
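where $\mathscr{E}_0$ represents the initial energy of the particle (its energy for $x \to -\infty$). In particular,
the final energy $\mathscr{E}_1$ of the particle (after passage of the particle through the field) is given by
the formula
$$\frac{1}{\mathscr{E}_1} = \frac{1}{\mathscr{E}_0}+\int_{-\infty}^{+\infty}k(x)\,dx.$$
We see that for $\mathscr{E}_0 \to \infty$, the final $\mathscr{E}_1$ approaches a constant limit independent of $\mathscr{E}_0$
(I. Pomeranchuk, 1939). In other words, after passing through the field, the energy of the
particle cannot exceed the energy $\mathscr{E}_{\text{crit}}$, defined by the equation
$$\frac{1}{\mathscr{E}_{\text{crit}}} = \int_{-\infty}^{+\infty}k(x)\,dx,$$
or, substituting the expression for k(x),
$$\frac{1}{\mathscr{E}_{\text{crit}}} = \frac{2}{3m^2c^4}\left(\frac{e^2}{mc^2}\right)^2\int_{-\infty}^{+\infty}[(E_y-H_z)^2+(E_z+H_y)^2]\,dx. \qquad (76.6)$$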
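The saturation of the final energy is easy to see numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it evaluates the relation "one over the final energy equals one over the initial energy plus the integral of k(x) over the path" for increasing initial energies. The name `final_energy`, the symbol `K` for the integral of k(x) and the numerical values are hypothetical; energies are in arbitrary units.

```python
def final_energy(E0, K):
    """Final energy from 1/E1 = 1/E0 + K, where K is the integral of k(x) over the path."""
    return 1.0 / (1.0 / E0 + K)

if __name__ == "__main__":
    K = 0.5                      # hypothetical value of the integral of k(x)
    E_crit = 1.0 / K             # limiting energy defined by 1/E_crit = K
    for E0 in (1.0, 10.0, 1.0e3, 1.0e6):
        # E1 grows with E0 but never exceeds E_crit
        print(E0, final_energy(E0, K), E_crit)
```

However large the initial energy is made, the final energy stays below 1/K, which is the content of the limit pointed out by Pomeranchuk.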
PROBLEMS
1. Calculate the limiting energy which a particle can have after passing through the field of a
magnetic dipole $\mathfrak{m}$; the vector $\mathfrak{m}$ and the direction of motion lie in a plane.
Solution: We choose the plane passing through the vector $\mathfrak{m}$ and the direction of motion as the
XZ plane, where the particle moves parallel to the X axis at a distance $\varrho$ from it. For the transverse
components of the field of the magnetic dipole we have [see (44.4)]:
$$H_y = 0,$$
$$H_z = \frac{3(\mathfrak{m}\cdot\mathbf{r})z-\mathfrak{m}_z r^2}{r^5} = \frac{\mathfrak{m}}{(\varrho^2+x^2)^{5/2}}\{3(\varrho\cos\phi+x\sin\phi)\varrho-(\varrho^2+x^2)\cos\phi\}$$
($\phi$ is the angle between $\mathfrak{m}$ and the Z axis). Substituting in (76.6) and performing the integration,
we obtain
$$\frac{1}{\mathscr{E}_{\text{crit}}} = \frac{\mathfrak{m}^2\pi}{64m^2c^4\varrho^5}\left(\frac{e^2}{mc^2}\right)^2(15+26\cos^2\phi).$$
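The angular factor 15 + 26 cos²φ can be checked by direct numerical integration. The sketch below is an illustration only: it sets the dipole moment and the impact distance to unity (an assumption made to keep the example dimensionless), integrates the square of H_z along the path, and compares with 3π(15 + 26 cos²φ)/128, which is the closed form that reproduces the coefficient π/64 after multiplication by the factor 2/3 of (76.6). The function names are hypothetical.

```python
import math

def dipole_integral(phi, n=2000):
    """Integral over x of H_z**2 for unit dipole moment and unit impact distance.

    With x = tan(theta) the integrand becomes the trigonometric polynomial
    [(2cos^2 - sin^2) cos(phi) + 3 sin cos sin(phi)]^2 * cos^4(theta),
    integrated over theta from -pi/2 to pi/2 by the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        th = -math.pi / 2 + (i + 0.5) * h
        c, s = math.cos(th), math.sin(th)
        amp = (2 * c * c - s * s) * math.cos(phi) + 3 * s * c * math.sin(phi)
        total += amp * amp * c**4 * h
    return total

def closed_form(phi):
    """Closed-form value of the same integral."""
    return 3 * math.pi / 128 * (15 + 26 * math.cos(phi)**2)

if __name__ == "__main__":
    for phi in (0.0, 0.7, math.pi / 2):
        print(phi, dipole_integral(phi), closed_form(phi))
```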
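2. Write the three-dimensional expression for the damping force in the relativistic case.
Solution: Calculating the space components of the four-vector (76.3), we find
$$\mathbf{f} = \frac{2e^3}{3mc^3}\left(1-\frac{v^2}{c^2}\right)^{-1/2}\left\{\left(\frac{\partial}{\partial t}+\mathbf{v}\cdot\nabla\right)\mathbf{E}+\frac{1}{c}\mathbf{v}\times\left(\frac{\partial}{\partial t}+\mathbf{v}\cdot\nabla\right)\mathbf{H}\right\}$$
$$+\frac{2e^4}{3m^2c^4}\left\{\mathbf{E}\times\mathbf{H}+\frac{1}{c}\mathbf{H}\times(\mathbf{H}\times\mathbf{v})+\frac{1}{c}\mathbf{E}(\mathbf{v}\cdot\mathbf{E})\right\}$$
$$-\frac{2e^4}{3m^2c^5\left(1-\dfrac{v^2}{c^2}\right)}\mathbf{v}\left\{\left(\mathbf{E}+\frac{1}{c}\mathbf{v}\times\mathbf{H}\right)^2-\frac{1}{c^2}(\mathbf{E}\cdot\mathbf{v})^2\right\}.$$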
§ 77. Spectral resolution of the radiation in the ultrarelativistic case
Earlier (in § 73) it was shown that the radiation from an ultrarelativistic particle is directed
mainly in the forward direction, along the velocity of the particle: it is contained almost
entirely within the small range of angles
$$\Delta\theta \sim \sqrt{1-\frac{v^2}{c^2}}$$
around the direction of $\mathbf{v}$.
In evaluating the spectral resolution of the radiation, the relation between the magnitude
of the angular range $\Delta\theta$ and the angle of deflection $\alpha$ of the particle in passing through the
external electromagnetic field is essential.
The angle $\alpha$ can be calculated as follows. The change in the transverse (to the direction of
motion) momentum of the particle is of the order of the product of the transverse force eF†
and the time of passage through the field, $t \sim a/v \cong a/c$ (where a is the distance within
which the field is significantly different from 0). The ratio of this quantity to the momentum
$$p = \frac{mv}{\sqrt{1-v^2/c^2}} \cong \frac{mc}{\sqrt{1-v^2/c^2}}$$
determines the order of magnitude of the small angle $\alpha$:
$$\alpha \sim \frac{eFa}{mc^2}\sqrt{1-\frac{v^2}{c^2}}.$$
Dividing by $\Delta\theta$, we find:
$$\frac{\alpha}{\Delta\theta} \sim \frac{eFa}{mc^2}. \qquad (77.1)$$
We call attention to the fact that it does not depend on the velocity of the particle, and is
completely determined by the properties of the external field itself.
We assume first that
$$eFa \gg mc^2, \qquad (77.2)$$
that is, the total deflection of the particle is large compared with $\Delta\theta$. Then we can say that
radiation in a given direction occurs mainly from that portion of the trajectory in which the
velocity of the particle is almost parallel to that direction (subtending with it an angle in the
interval $\Delta\theta$) and the length of this segment is small compared with a. The field F can be
considered constant within this segment, and since a small segment of a curve can be
considered as an arc of a circle, we can apply the results obtained in § 74 for radiation during
uniform motion in a circle (replacing H by F). In particular, we may state that the main part
of the radiation is concentrated in the frequency range
$$\omega \sim \frac{eF}{mc\left(1-\dfrac{v^2}{c^2}\right)} \qquad (77.3)$$
[see (74.16)].
In the opposite limiting case,
$$eFa \ll mc^2, \qquad (77.4)$$
the total angle of deflection of the particle is small compared with $\Delta\theta$. In this case the
radiation is directed mainly into the narrow angular range $\Delta\theta$ around the direction of motion,
while radiation arrives at a given point from the whole trajectory.
† If we choose the X axis along the direction of motion of the particle, then $(eF)^2$ is the sum of the squares
of the y and z components of the Lorentz force, $e\mathbf{E}+(e/c)\mathbf{v}\times\mathbf{H}$, in which we can here set $v \cong c$:
$$F^2 = (E_y-H_z)^2+(E_z+H_y)^2.$$
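To compute the spectral distribution of the intensity, it is convenient to start in this case
from the Lienard-Wiechert expressions (73.8) for the field in the wave zone. Let us compute
the Fourier component
$$\mathbf{E}_\omega = \int_{-\infty}^{+\infty}\mathbf{E}\,e^{i\omega t}\,dt.$$
The expression on the right of formula (73.8) is a function of the retarded time t', which is
determined by the condition $t' = t-R(t')/c$. At large distances from a particle which is
moving with an almost constant velocity $\mathbf{v}$, we have:
$$t' \cong t-\frac{R_0}{c}+\frac{1}{c}\mathbf{n}\cdot\mathbf{r}(t') \cong t-\frac{R_0}{c}+\frac{1}{c}\mathbf{n}\cdot\mathbf{v}\,t'$$
($\mathbf{r} = \mathbf{r}(t) \cong \mathbf{v}t$ is the radius vector of the particle), or
$$t = t'\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)+\frac{R_0}{c}.$$
We replace the t integration by an integration over t', by setting
$$dt = \left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)dt',$$
and obtain:
$$\mathbf{E}_\omega = \frac{e}{c^2}\frac{e^{ikR_0}}{R_0\left(1-\dfrac{\mathbf{n}\cdot\mathbf{v}}{c}\right)^2}\int_{-\infty}^{+\infty}\mathbf{n}\times\left\{\left(\mathbf{n}-\frac{\mathbf{v}}{c}\right)\times\mathbf{w}(t')\right\}e^{i\omega t'\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right)}dt'.$$
We treat the velocity $\mathbf{v}$ as constant; only the acceleration $\mathbf{w}(t')$ is variable. Introducing the
notation
$$\omega' = \omega\left(1-\frac{\mathbf{n}\cdot\mathbf{v}}{c}\right) \qquad (77.5)$$
and the corresponding frequency component of the acceleration, we write $\mathbf{E}_\omega$ in the form
$$\mathbf{E}_\omega = \frac{e}{c^2}\frac{e^{ikR_0}}{R_0}\left(\frac{\omega}{\omega'}\right)^2\mathbf{n}\times\left\{\left(\mathbf{n}-\frac{\mathbf{v}}{c}\right)\times\mathbf{w}_{\omega'}\right\}.$$
Finally from (66.9) we get for the energy radiated into solid angle do, with frequency in $d\omega$:
$$d\mathscr{E}_{\mathbf{n}\omega} = \frac{e^2}{2\pi c^3}\left(\frac{\omega}{\omega'}\right)^4\left|\mathbf{n}\times\left\{\left(\mathbf{n}-\frac{\mathbf{v}}{c}\right)\times\mathbf{w}_{\omega'}\right\}\right|^2 do\,\frac{d\omega}{2\pi}. \qquad (77.6)$$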
An estimate of the order of magnitude of the frequencies in which the radiation is mainly
concentrated in the case of (77.4) is easily made by noting that the Fourier component $\mathbf{w}_{\omega'}$
is significantly different from zero only if the time $1/\omega'$, or
$$\frac{1}{\omega\left(1-\dfrac{v^2}{c^2}\right)},$$
is of the same order as the time a/v ~ a/c during which the acceleration of the particle
changes significantly. Therefore we find:
$$\omega \sim \frac{c}{a\left(1-\dfrac{v^2}{c^2}\right)}. \qquad (77.7)$$
The energy dependence of the frequency is the same as in (77.3), but the coefficient is
different.
In the treatment of both cases (77.2) and (77.4) it was assumed that the total loss of energy
by the particle during its passage through the field was relatively small. We shall now show
that the first of these cases also covers the problem of the radiation by an ultrarelativistic
particle, whose total loss of energy is comparable with its initial energy.
The total loss of energy by the particle in the field can be determined from the work of the
Lorentz frictional force. The work done by the force (76.4) over the path ~ a is of order
$$af \sim \frac{e^4F^2a}{m^2c^4\left(1-\dfrac{v^2}{c^2}\right)}.$$
In order for this to be comparable with the total energy of the particle,
$$\frac{mc^2}{\sqrt{1-v^2/c^2}},$$
the field must exist at distances
$$a \sim \frac{m^3c^6}{e^4F^2}\sqrt{1-\frac{v^2}{c^2}}.$$
But then condition (77.2) is satisfied automatically:
$$aeF \sim \frac{m^3c^6}{e^3F}\sqrt{1-\frac{v^2}{c^2}} \gg mc^2,$$
since the field F must necessarily satisfy condition (76.5),
$$\frac{F}{\sqrt{1-v^2/c^2}} \ll \frac{m^2c^4}{e^3},$$
since otherwise we could not even apply ordinary electrodynamics.
PROBLEMS
1. Determine the spectral distribution of the total (over all directions) radiation intensity for the
condition (77.2).
Solution: For each element of length of the trajectory, the radiation is determined by (74.11),
where we must replace H by the value of the transverse force F at the given point and, in addition,
we must go over from a discrete to a continuous frequency spectrum. This transformation is
accomplished by formally multiplying by dn and the replacement
$$I_n\,dn = I_n\frac{dn}{d\omega}\,d\omega = I_n\frac{d\omega}{\omega_0}.$$
Next, integrating over all time, we obtain the spectral distribution of the total radiation in the
following form:
$$d\mathscr{E}_\omega = -d\omega\,\frac{2e^2\omega}{\sqrt{\pi}\,c}\int_{-\infty}^{+\infty}\left(1-\frac{v^2}{c^2}\right)\left[\frac{\Phi'(u)}{u}+\frac{1}{2}\int_u^\infty\Phi(u)\,du\right]dt,$$
where $\Phi(u)$ is the Airy function of the argument
$$u = \left[\frac{mc\omega}{eF}\left(1-\frac{v^2}{c^2}\right)\right]^{2/3}.$$
The integrand depends on the integration variable t implicitly through the quantity u (F, and with
it u, varies along the trajectory of the particle; for a given motion this variation can be considered
as a time dependence).
2. Determine the spectral distribution of the total (over all directions) radiated energy for the
condition (77.4).
Solution: Keeping in mind that the main role is played by the radiation at small angles to the
direction of motion, we write:
$$\omega' = \omega\left(1-\frac{v}{c}\cos\theta\right) \cong \omega\left(1-\frac{v}{c}+\frac{\theta^2}{2}\right) \cong \frac{\omega}{2}\left(1-\frac{v^2}{c^2}+\theta^2\right).$$
We replace the integration over angles $do = \sin\theta\,d\theta\,d\phi \cong \theta\,d\theta\,d\phi$ in (77.6) by an integration over
$d\phi\,d\omega'/\omega$. In writing out the square of the vector triple product in (77.6) it must be remembered that
in the ultrarelativistic case the longitudinal component of the acceleration is small compared with
the transverse component [in the ratio $1-(v^2/c^2)$], and that in the present case we can, to sufficient
accuracy, consider $\mathbf{w}$ and $\mathbf{v}$ to be mutually perpendicular. As a result, we find for the spectral
distribution of the total radiation the following formula:
$$d\mathscr{E}_\omega = \frac{e^2\omega\,d\omega}{2\pi c^3}\int_{\frac{\omega}{2}\left(1-\frac{v^2}{c^2}\right)}^{\infty}\frac{|\mathbf{w}_{\omega'}|^2}{\omega'^2}\left[1-\frac{\omega}{\omega'}\left(1-\frac{v^2}{c^2}\right)+\frac{\omega^2}{2\omega'^2}\left(1-\frac{v^2}{c^2}\right)^2\right]d\omega'.$$
§ 78. Scattering by free charges
If an electromagnetic wave falls on a system of charges, then under its action the charges
are set in motion. This motion in turn produces radiation in all directions; there occurs, we
say, a scattering of the original wave.
The scattering is most conveniently characterized by the ratio of the amount of energy
emitted by the scattering system in a given direction per unit time, to the energy flux density
of the incident radiation. This ratio clearly has dimensions of area, and is called the effective
scattering cross-section (or simply the cross-section).
Let dI be the energy radiated by the system into solid angle do per second for an incident
wave with Poynting vector $\mathbf{S}$. Then the effective cross-section for scattering (into the solid
angle do) is
$$d\sigma = \frac{\overline{dI}}{\overline{S}} \qquad (78.1)$$
(the dash over a symbol means a time average). The integral $\sigma$ of $d\sigma$ over all directions is the
total scattering cross-section.
Let us consider the scattering produced by a free charge at rest. Suppose there is incident
on this charge a plane monochromatic linearly polarized wave. Its electric field can be
written in the form
$$\mathbf{E} = \mathbf{E}_0\cos(\mathbf{k}\cdot\mathbf{r}-\omega t+\alpha).$$
We shall assume that the velocity acquired by the charge under the influence of the incident
wave is small compared with the velocity of light (which is usually the case). Then we can
consider the force acting on the charge to be $e\mathbf{E}$, while the force $(e/c)\mathbf{v}\times\mathbf{H}$ due to the
magnetic field can be neglected. In this case we can also neglect the effect of the displacement of
the charge during its vibrations under the influence of the field. If the charge carries out
vibrations around the coordinate origin, then we can assume that the field which acts on the
charge at all times is the same as that at the origin, that is,
$$\mathbf{E} = \mathbf{E}_0\cos(\omega t-\alpha).$$
Since the equation of motion of the charge is
$$m\ddot{\mathbf{r}} = e\mathbf{E}$$
and its dipole moment $\mathbf{d} = e\mathbf{r}$, then
$$\ddot{\mathbf{d}} = \frac{e^2}{m}\mathbf{E}. \qquad (78.2)$$
To calculate the scattered radiation, we use formula (67.7) for dipole radiation (this is
justified, since the velocity acquired by the charge is assumed to be small). We also note that
the frequency of the wave radiated by the charge (i.e., scattered by it) is clearly the same as
the frequency of the incident wave.
Substituting (78.2) in (67.7), we find
$$dI = \frac{e^4}{4\pi m^2c^3}(\mathbf{E}\times\mathbf{n})^2\,do.$$
On the other hand, the Poynting vector of the incident wave is
$$S = \frac{c}{4\pi}E^2. \qquad (78.3)$$
From this we find, for the cross-section for scattering into the solid angle do,
$$d\sigma = \left(\frac{e^2}{mc^2}\right)^2\sin^2\theta\,do, \qquad (78.4)$$
where $\theta$ is the angle between the direction of scattering (the vector $\mathbf{n}$), and the direction of
the electric field $\mathbf{E}$ of the incident wave. We see that the effective scattering cross-section of a
free charge is independent of frequency.
We determine the total cross-section $\sigma$. To do this, we choose the polar axis along $\mathbf{E}$.
Then $do = \sin\theta\,d\theta\,d\phi$; substituting this and integrating with respect to $\theta$ from 0 to $\pi$,
and over $\phi$ from 0 to $2\pi$, we find
$$\sigma = \frac{8\pi}{3}\left(\frac{e^2}{mc^2}\right)^2. \qquad (78.5)$$
(This is the Thomson formula.)
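As a numerical illustration of the Thomson formula, the sketch below integrates sin²θ over the full solid angle, which should give 8π/3, and then evaluates the total cross-section for an electron. This is an example added for orientation, not part of the original text; the rounded Gaussian (CGS) constants and all names in the code are assumptions of the example.

```python
import math

def angular_integral(n_theta=2000):
    """Integral of sin^2(theta) over the full solid angle (midpoint rule in theta)."""
    h = math.pi / n_theta
    s = sum(math.sin((i + 0.5) * h)**3 for i in range(n_theta)) * h
    return 2.0 * math.pi * s      # the integral over the azimuth gives 2*pi

# Rounded Gaussian (CGS) values for the electron; assumed inputs for the example.
E_CHARGE = 4.803e-10   # esu
E_MASS = 9.109e-28     # g
C_LIGHT = 2.998e10     # cm/s

def thomson_cross_section(e=E_CHARGE, m=E_MASS, c=C_LIGHT):
    """Total cross-section (8*pi/3) * (e^2 / (m c^2))^2 in cm^2."""
    r0 = e**2 / (m * c**2)        # the length e^2/(m c^2)
    return 8.0 * math.pi / 3.0 * r0**2

if __name__ == "__main__":
    print(angular_integral(), 8.0 * math.pi / 3.0)
    print(thomson_cross_section())
```

For the electron values assumed here the cross-section comes out at roughly 6.65 × 10⁻²⁵ cm².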
Finally, we calculate the differential cross-section $d\sigma$ in the case where the incident wave
is unpolarized (ordinary light). To do this we must average (78.4) over all directions of the
vector $\mathbf{E}$ in a plane perpendicular to the direction of propagation of the incident wave
(direction of the wave vector $\mathbf{k}$). Denoting by $\mathbf{e}$ the unit vector along the direction of $\mathbf{E}$, we
write:
$$\overline{\sin^2\theta} = 1-\overline{(\mathbf{n}\cdot\mathbf{e})^2} = 1-n_\alpha n_\beta\,\overline{e_\alpha e_\beta}.$$
The averaging is done using the formula†
$$\overline{e_\alpha e_\beta} = \frac{1}{2}\left(\delta_{\alpha\beta}-\frac{k_\alpha k_\beta}{k^2}\right) \qquad (78.6)$$
and gives
$$\overline{\sin^2\theta} = \frac{1}{2}\left(1+\frac{(\mathbf{n}\cdot\mathbf{k})^2}{k^2}\right) = \frac{1}{2}(1+\cos^2\Theta),$$
where $\Theta$ is the angle between the directions of the incident and scattered waves (the scattering
angle). Thus the effective cross-section for scattering of an unpolarized wave by a free charge
is
$$d\sigma = \frac{1}{2}\left(\frac{e^2}{mc^2}\right)^2(1+\cos^2\Theta)\,do. \qquad (78.7)$$
The occurrence of scattering leads, in particular, to the appearance of a certain force
acting on the scattering particle. One can verify this by the following considerations. On the
average, in unit time, the wave incident on the particle loses energy $c\overline{W}\sigma$, where $\overline{W}$ is the
average energy density, and $\sigma$ is the total effective scattering cross-section. Since the
momentum of the field is equal to its energy divided by the velocity of light, the incident wave loses
momentum equal in magnitude to $\overline{W}\sigma$. On the other hand, in a system of reference in which
the charge carries out only small vibrations under the action of the force $e\mathbf{E}$, and its velocity
v is small, the total flux of momentum in the scattered wave is zero, to terms of higher order
in v/c (in § 73 it was shown that in a reference system in which v = 0, radiation of momentum
by the particle does not occur). Therefore all the momentum lost by the incident wave is
"absorbed" by the scattering particle. The average force $\overline{\mathbf{f}}$ acting on the particle is equal to
the average momentum absorbed per unit time, i.e.
$$\overline{\mathbf{f}} = \sigma\overline{W}\mathbf{n}_0 \qquad (78.8)$$
($\mathbf{n}_0$ is a unit vector in the direction of propagation of the incident wave). We note that the
average force appears as a second order quantity in the field of the incident wave, while the
"instantaneous" force (the main part of which is $e\mathbf{E}$) is of first order in the field.
Formula (78.8) can also be obtained directly by averaging the damping force (75.10). The
first term, proportional to $\dot{\mathbf{E}}$, goes to zero on averaging, as does the average of the main part
of the force, $e\mathbf{E}$. The second term gives
$$\overline{\mathbf{f}} = \frac{2e^4}{3m^2c^4}\overline{E^2}\,\mathbf{n}_0 = \frac{8\pi}{3}\left(\frac{e^2}{mc^2}\right)^2\frac{\overline{E^2}}{4\pi}\,\mathbf{n}_0,$$
which, using (78.5), coincides with (78.8).
PROBLEMS
1. Determine the effective cross-section for scattering of an elliptically polarized wave by a free
charge.
Solution: The field of the wave has the form $\mathbf{E} = \mathbf{A}\cos(\omega t+\alpha)+\mathbf{B}\sin(\omega t+\alpha)$, where $\mathbf{A}$ and $\mathbf{B}$
are mutually perpendicular vectors (see § 48). By a derivation similar to the one in the text, we find
$$d\sigma = \left(\frac{e^2}{mc^2}\right)^2\frac{(\mathbf{A}\times\mathbf{n})^2+(\mathbf{B}\times\mathbf{n})^2}{A^2+B^2}\,do.$$
† In fact, $\overline{e_\alpha e_\beta}$ is a symmetric tensor with trace equal to 1, which gives zero when multiplied by $k_\alpha$, because
$\mathbf{e}$ and $\mathbf{k}$ are perpendicular. The expression given here satisfies these conditions.
2. Determine the effective cross-section for scattering of a linearly polarized wave by a charge
carrying out small vibrations under the influence of an elastic force (oscillator).
Solution: The equation of motion of the charge in the incident field $\mathbf{E} = \mathbf{E}_0\cos(\omega t+\alpha)$ is
$$\ddot{\mathbf{r}}+\omega_0^2\mathbf{r} = \frac{e}{m}\mathbf{E}_0\cos(\omega t+\alpha),$$
where $\omega_0$ is the frequency of its free vibrations. For the forced vibrations, we then have
$$\mathbf{r} = \frac{e\mathbf{E}_0\cos(\omega t+\alpha)}{m(\omega_0^2-\omega^2)}.$$
Calculating $\ddot{\mathbf{d}}$ from this, we find
$$d\sigma = \left(\frac{e^2}{mc^2}\right)^2\frac{\omega^4}{(\omega_0^2-\omega^2)^2}\sin^2\theta\,do$$
($\theta$ is the angle between $\mathbf{E}$ and $\mathbf{n}$).
3. Determine the total effective cross-section for scattering of light by an electric dipole which,
mechanically, is a rotator. The frequency $\omega$ of the wave is assumed to be large compared with the
frequency $\Omega_0$ of free rotation of the rotator.
Solution: Because of the condition $\omega \gg \Omega_0$, we can neglect the free rotation of the rotator, and
consider only the forced rotation under the action of the moment of the forces $\mathbf{d}\times\mathbf{E}$ exerted on it
by the incident wave. The equation for this motion is $J\dot{\boldsymbol{\Omega}} = \mathbf{d}\times\mathbf{E}$, where J is the moment of
inertia of the rotator and $\boldsymbol{\Omega}$ is the angular velocity of rotation. The change in the dipole moment
vector, as it rotates without changing its absolute value, is given by the formula $\dot{\mathbf{d}} = \boldsymbol{\Omega}\times\mathbf{d}$. From
these two equations, we find (omitting the quadratic term in the small quantity $\Omega$):
$$\ddot{\mathbf{d}} = \frac{1}{J}(\mathbf{d}\times\mathbf{E})\times\mathbf{d} = \frac{1}{J}[\mathbf{E}d^2-(\mathbf{E}\cdot\mathbf{d})\mathbf{d}].$$
Assuming that all orientations of the dipole in space are equally probable, and averaging $\ddot{\mathbf{d}}^2$ over
them, we find for the total effective cross-section,
$$\sigma = \frac{16\pi d^4}{9c^4J^2}.$$
4. Determine the degree of depolarization in the scattering of ordinary light by a free charge.
Solution: From symmetry considerations, it is clear that the two incoherent polarized
components of the scattered light (see § 50) will be linearly polarized: one in the plane of scattering (the
plane passing through the incident and scattered waves) and the other perpendicular to this plane.
The intensities of these components are determined by the components of the field of the incident
wave in the plane of scattering ($\mathbf{E}_\parallel$) and perpendicular to it ($\mathbf{E}_\perp$), and, according to (78.3), are
proportional respectively to
$$(\mathbf{E}_\parallel\times\mathbf{n})^2 = E_\parallel^2\cos^2\Theta \quad\text{and}\quad (\mathbf{E}_\perp\times\mathbf{n})^2 = E_\perp^2$$
(where $\Theta$ is the angle of scattering). Since for the ordinary incident light $\overline{E_\parallel^2} = \overline{E_\perp^2}$, the degree of
depolarization [see the definition in (50.9)] is:
$$\rho = \cos^2\Theta.$$
5. Determine the frequency $\omega'$ of the light scattered by a moving charge.
Solution: In a system of coordinates in which the charge is at rest, the frequency of the light does
not change on scattering ($\omega = \omega'$). This relation can be written in invariant form as
$$k'_i u^i = k_i u^i,$$
where $u^i$ is the four-velocity of the charge. From this we find without difficulty
$$\omega'\left(1-\frac{v}{c}\cos\theta'\right) = \omega\left(1-\frac{v}{c}\cos\theta\right),$$
where $\theta$ and $\theta'$ are the angles made by the incident and scattered waves with the direction of motion
(v is the velocity of the charge).
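The frequency relation for scattering by a moving charge is simple to evaluate. The helper below is a hypothetical illustration (its name and argument conventions are assumptions, not taken from the text); angles are in radians and are measured from the direction of the velocity of the charge, and beta stands for v/c.

```python
import math

def scattered_frequency(omega, beta, theta, theta_prime):
    """Frequency of the scattered wave from
    omega' * (1 - beta*cos(theta')) = omega * (1 - beta*cos(theta)).

    theta, theta_prime: angles of the incident and scattered waves
    with the direction of motion of the charge; beta = v/c < 1.
    """
    return omega * (1.0 - beta * math.cos(theta)) / (1.0 - beta * math.cos(theta_prime))

if __name__ == "__main__":
    # Wave meeting the charge head-on (theta = pi) and scattered straight
    # back along the velocity (theta' = 0): the frequency rises by (1+beta)/(1-beta).
    print(scattered_frequency(1.0, 0.5, math.pi, 0.0))
    # Forward scattering (theta' = theta) leaves the frequency unchanged.
    print(scattered_frequency(1.0, 0.5, 0.3, 0.3))
```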
6. Determine the angular distribution of the scattering of a linearly polarized wave by a charge
moving with velocity v in the direction of propagation of the wave.
Solution: The velocity of the particle is perpendicular to the fields $\mathbf{E}$ and $\mathbf{H}$ of the incident wave,
and is therefore also perpendicular to the acceleration $\mathbf{w}$ given to the particle. The scattered intensity
is given by (73.14), where the acceleration $\mathbf{w}$ of the particle must be expressed in terms of the fields
$\mathbf{E}$ and $\mathbf{H}$ of the incident wave by the formulas obtained in the problem in § 17. Dividing the intensity
dI by the Poynting vector of the incident wave, we get the following expression for the scattering
cross-section:
$$d\sigma = \left(\frac{e^2}{mc^2}\right)^2\frac{\left(1-\dfrac{v^2}{c^2}\right)\left(1-\dfrac{v}{c}\right)^2}{\left(1-\dfrac{v}{c}\sin\theta\cos\phi\right)^6}\left[\left(1-\frac{v}{c}\sin\theta\cos\phi\right)^2-\left(1-\frac{v^2}{c^2}\right)\cos^2\theta\right]do,$$
where $\theta$ and $\phi$ are the polar angle and azimuth of the direction $\mathbf{n}$ relative to a system of coordinates
with Z axis along $\mathbf{E}$, and X axis along $\mathbf{v}$ [$\cos(\mathbf{n},\mathbf{E}) = \cos\theta$; $\cos(\mathbf{n},\mathbf{v}) = \sin\theta\cos\phi$].
7. Calculate the motion of a charge under the action of the average force exerted upon it by the
wave scattered by it.
Solution: The force (78.8), and therefore the velocity of the motion under consideration, is along
the direction of propagation of the incident wave (X axis). In the auxiliary reference system $K_0$,
in which the particle is at rest (we recall that we are dealing with the motion averaged over the
period of the small vibrations), the force acting on it is $\sigma\overline{W}_0$, and the acceleration acquired by it
under the action of this force is
$$w_0 = \frac{\sigma}{m}\overline{W}_0$$
(the index zero refers to the reference system $K_0$). The transformation to the original reference
system K (in which the charge moves with velocity v) is given by the formulas obtained in the problem
of § 7 and by formula (47.7), and gives:
$$\frac{d}{dt}\frac{v}{\sqrt{1-\dfrac{v^2}{c^2}}} = \frac{1}{\left(1-\dfrac{v^2}{c^2}\right)^{3/2}}\frac{dv}{dt} = \frac{\overline{W}\sigma}{m}\,\frac{1-\dfrac{v}{c}}{1+\dfrac{v}{c}}.$$
Integrating this expression, we find
$$\frac{\overline{W}\sigma}{mc}\,t = \frac{1}{3}\sqrt{\frac{1+\dfrac{v}{c}}{1-\dfrac{v}{c}}}\;\frac{2-\dfrac{v}{c}}{1-\dfrac{v}{c}}-\frac{2}{3},$$
which determines the velocity v = dx/dt as an implicit function of the time (the integration
constant has been chosen so that v = 0 at t = 0).
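The implicit relation between velocity and time can be checked against a direct quadrature of the equation of motion. The sketch below is illustrative only: it works with the dimensionless time tau = W σ t / (m c) and beta = v/c (both shorthand introduced for the example), for which the equation of motion reads d(beta)/d(tau) = (1 - beta)^{5/2} (1 + beta)^{1/2}.

```python
import math

def tau_closed(beta):
    """Dimensionless time W*sigma*t/(m*c) at which the charge reaches beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta)) * (2.0 - beta) / (3.0 * (1.0 - beta)) - 2.0 / 3.0

def tau_numeric(beta, n=20000):
    """Midpoint-rule integral of d(beta) / [(1+beta)^(1/2) (1-beta)^(5/2)] from 0 to beta."""
    h = beta / n
    total = 0.0
    for i in range(n):
        b = (i + 0.5) * h
        total += h / (math.sqrt(1.0 + b) * (1.0 - b)**2.5)
    return total

if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        print(beta, tau_closed(beta), tau_numeric(beta))
```

The time needed to reach a given beta grows without bound as beta approaches 1, since the energy density of the wave seen by the charge falls off as the charge is accelerated along the direction of propagation.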
8. Determine the average force exerted on a charge moving in an electromagnetic field consisting
of a superposition of waves in all directions with an isotropic distribution of directions of
propagation.
Solution: We write the equation of motion of the charge in four-dimensional form,
$$mc\frac{du^i}{ds} = g^i.$$
To determine the four-vector $g^i$, we note that in a system of reference in which the charge is at rest
at a given moment, in the presence of a single wave propagating along a definite direction (say,
along the X axis), the equation of motion is ($v_x = v$)
$$m\frac{dv}{dt} = \sigma W$$
(we omit the average sign throughout). This means that the X-component of the vector $g^i$ must
become $(W/c)\sigma$. The four-vector $(\sigma/c)T^{ik}u_k$ has this property, where $T^{ik}$ is the energy-momentum
tensor of the wave, and $u^i$ is the four-velocity of the charge. In addition, $g^i$ must satisfy the condition
$g^i u_i = 0$. This can be achieved by adding to the previous expression a four-vector of the form $\alpha u^i$,
where $\alpha$ is a scalar. Determining $\alpha$ suitably, we obtain
$$mc\frac{du^i}{ds} = \frac{\sigma}{c}(T^{ik}u_k-u^i u_k u_l T^{kl}). \qquad (1)$$
In an electromagnetic field of isotropic radiation, the Poynting vector vanishes because of
symmetry, and the stress tensor $\sigma_{\alpha\beta}$ must have the form const $\cdot\,\delta_{\alpha\beta}$. Noting also that we must have
$T^i_i = 0$, so that $\sigma_{\alpha\alpha} = T^{00} = W$, we find
$$\sigma_{\alpha\beta} = \frac{W}{3}\delta_{\alpha\beta}.$$
Substituting these expressions in (1), we find for the force acting on the charge:
$$\frac{d}{dt}\left(\frac{m\mathbf{v}}{\sqrt{1-\dfrac{v^2}{c^2}}}\right) = -\frac{4W\sigma\mathbf{v}}{3c\left(1-\dfrac{v^2}{c^2}\right)}.$$
This force acts in the direction opposite to the motion of the charge, i.e. the charge experiences
a retardation. We note that for $v \ll c$, the retarding force is proportional to the velocity of the charge:
$$m\frac{d\mathbf{v}}{dt} = -\frac{4W\sigma\mathbf{v}}{3c}.$$
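9. Determine the effective cross-section for scattering of a linearly polarized wave by an oscillator,
taking into account the radiation damping.
Solution: We write the equation of motion of the charge in the incident field in the form
$$\ddot{\mathbf{r}}+\omega_0^2\mathbf{r} = \frac{e}{m}\mathbf{E}_0e^{-i\omega t}+\frac{2e^2}{3mc^3}\dddot{\mathbf{r}}.$$
In the damping force, we can substitute approximately $\dddot{\mathbf{r}} = -\omega_0^2\dot{\mathbf{r}}$; then we find
$$\ddot{\mathbf{r}}+\gamma\dot{\mathbf{r}}+\omega_0^2\mathbf{r} = \frac{e}{m}\mathbf{E}_0e^{-i\omega t},$$
where $\gamma = (2e^2/3mc^3)\omega_0^2$. From this we obtain
$$\mathbf{r} = \frac{e}{m}\mathbf{E}_0\frac{e^{-i\omega t}}{\omega_0^2-\omega^2-i\omega\gamma}.$$
The effective cross-section is
$$\sigma = \frac{8\pi}{3}\left(\frac{e^2}{mc^2}\right)^2\frac{\omega^4}{(\omega_0^2-\omega^2)^2+\omega^2\gamma^2}.$$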
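The behaviour of the oscillator cross-section in its limiting regimes can be explored with a few lines of code. The sketch below is an illustration under stated assumptions: it evaluates the expression (8π/3) r0² ω⁴ / [(ω0² − ω²)² + ω²γ²], with r0 standing for e²/(mc²), in arbitrary units. The function name and the sample values of ω0 and γ are hypothetical; the formula itself was derived with the damping treated as small.

```python
import math

def sigma_oscillator(omega, omega0, gamma, r0=1.0):
    """Total scattering cross-section of a damped oscillator; r0 = e^2/(m c^2)."""
    resonance = (omega0**2 - omega**2)**2 + omega**2 * gamma**2
    return (8.0 * math.pi / 3.0) * r0**2 * omega**4 / resonance

if __name__ == "__main__":
    omega0, gamma = 1.0, 1.0e-3          # hypothetical values, gamma << omega0
    thomson = 8.0 * math.pi / 3.0        # free-charge value for r0 = 1
    for omega in (1.0e-3, 0.5, 1.0, 2.0, 1.0e3):
        print(omega, sigma_oscillator(omega, omega0, gamma) / thomson)
```

In units of the free-charge value the ratio behaves as (ω/ω0)⁴ far below the resonance, reaches (ω0/γ)² at ω = ω0, and tends to 1 far above it.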
§ 79. Scattering of lowfrequency waves
The scattering of a wave by a system of charges differs from the scattering by a single
charge (at rest), first of all in the fact that because of the presence of internal motion of the
charges of the system, the frequency of the scattered radiation can be different from the
frequency of the incident wave. Namely, in the spectral resolution of the scattered wave
there appear, in addition to the frequency co of the incident wave, frequencies co' differing
from co by one of the internal frequencies of motion of the scattering system. The scattering
with changed frequency is called incoherent (or combinational), in contrast to the coherent
scattering without change in frequency.
Assuming that the field of the incident wave is weak, we can represent the current density
in the form j = j + j', where j is the current density in the absence of the external field, and
j' is the change in the current under the action of the incident wave. Correspondingly, the
vector potential (and other quantities) of the field of the system also has the form A = A + A',
where A and A' are determined by the currents j and j'. Clearly, A' describes the wave
scattered by the system.
Let us consider the scattering of a wave whose frequency $\omega$ is small compared with all the
internal frequencies of the system. The scattering will consist of an incoherent as well as a
coherent part, but we shall here consider only the coherent scattering.
In calculating the field of the scattered wave, for sufficiently low frequency $\omega$, we can use
the expansion of the retarded potentials which was presented in §§ 67 and 71, even if the
velocities of the particles of the system are not small compared with the velocity of light.
Namely, for the validity of the expansion of the integral
$$\mathbf{A}' = \frac{1}{cR_0}\int \mathbf{j}'_{t-\frac{R_0}{c}+\frac{\mathbf{r}\cdot\mathbf{n}'}{c}}\,dV$$
it is necessary only that the time $\mathbf{r}\cdot\mathbf{n}'/c \sim a/c$ be small compared with the time $1/\omega$; for
sufficiently low frequencies ($\omega \ll c/a$), this condition is fulfilled independently of the velocities
of the particles of the system.
The first terms in the expansion give
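$$\mathbf{H}' = \frac{1}{c^2R_0}\left\{\ddot{\mathbf{d}}'\times\mathbf{n}' + (\ddot{\mathbf{m}}'\times\mathbf{n}')\times\mathbf{n}'\right\},$$
where $\mathbf{d}'$, $\mathbf{m}'$ are the parts of the dipole and magnetic moments of the system which are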
produced by the radiation falling on the system. The succeeding terms contain higher time
derivatives than the second, and we drop them.
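The component $\mathbf{H}'_\omega$ of the spectral resolution of the field of the scattered wave, with
frequency equal to that of the incident wave, is given by this same formula, when we substitute
for all quantities their Fourier components: $\ddot{\mathbf{d}}'_\omega = -\omega^2\mathbf{d}'_\omega$, $\ddot{\mathbf{m}}'_\omega = -\omega^2\mathbf{m}'_\omega$. Then we
obtain
$$\mathbf{H}'_\omega = \frac{\omega^2}{c^2R_0}\left\{\mathbf{n}'\times\mathbf{d}'_\omega + \mathbf{n}'\times(\mathbf{m}'_\omega\times\mathbf{n}')\right\}. \tag{79.1}$$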
The later terms in the expansion of the field would give quantities proportional to higher
powers of the small frequency. If the velocities of all the particles of the system are small
($v \ll c$), then in (79.1) we can neglect the second term in comparison to the first, since the
magnetic moment contains the ratio $v/c$. Then
$$\mathbf{H}'_\omega = \frac{1}{c^2R_0}\,\omega^2\,\mathbf{n}'\times\mathbf{d}'_\omega. \tag{79.2}$$
If the total charge of the system is zero, then for $\omega \to 0$, $\mathbf{d}'_\omega$ and $\mathbf{m}'_\omega$ approach constant
limits (if the sum of the charges were different from zero, then for $\omega = 0$, i.e. for a constant
field, the system would begin to move as a whole). Therefore for low frequencies ($\omega \ll v/a$)
we can consider $\mathbf{d}'_\omega$ and $\mathbf{m}'_\omega$ as independent of frequency, so that the field of the scattered
wave is proportional to the square of the frequency. Its intensity is consequently proportional
to $\omega^4$. Thus for the scattering of a low-frequency wave, the effective cross-section for
(coherent) scattering is proportional to the fourth power of the frequency of the incident
radiation.†
† This also applies to the scattering of light by ions as well as by neutral atoms. Because of the large mass
of the nucleus, the scattering resulting from the motion of the ion as a whole can be neglected.
§ 80. Scattering of high-frequency waves
We consider the scattering of a wave by a system of charges in the opposite limit, when the
frequency $\omega$ of the wave is large compared with the fundamental internal frequencies of the
system. The latter have the order of magnitude $\omega_0 \sim v/a$, so that $\omega$ must satisfy the condition
$$\omega \gg \omega_0 \sim \frac{v}{a}. \tag{80.1}$$
In addition, we assume that the velocities of the charges of the system are small ($v \ll c$).
According to condition (80.1), the periods of the motion of the charges of the system are
large compared with the period of the wave. Therefore during a time interval of the order
of the period of the wave, the motion of the charges of the system can be considered uniform.
This means that in considering the scattering of short waves, we need not take into account
the interaction of the charges of the system with each other, that is, we can consider them as
free.
Thus in calculating the velocity $\mathbf{v}'$ acquired by a charge in the field of the incident wave,
we can consider each of the charges in the system separately, and write for it an equation of
motion of the form
$$m\frac{d\mathbf{v}'}{dt} = e\mathbf{E} = e\mathbf{E}_0 e^{-i(\omega t - \mathbf{k}\cdot\mathbf{r})},$$
where $\mathbf{k} = (\omega/c)\mathbf{n}$ is the wave vector of the incident wave. The radius vector of the charge is, of
course, a function of the time. In the exponent on the right side of this equation the time
rate of change of the first term is large compared with that of the second (the first is $\omega$, while
the second is of order $kv \sim v(\omega/c) \ll \omega$). Therefore in integrating the equation of motion,
we can consider the term $\mathbf{r}$ on the right side as constant. Then
$$\mathbf{v}' = -\frac{e}{i\omega m}\mathbf{E}_0 e^{-i(\omega t - \mathbf{k}\cdot\mathbf{r})}. \tag{80.2}$$
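For the vector potential of the scattered wave (at large distances from the system), we have
from the general formula (66.2):
$$\mathbf{A}' = \frac{1}{cR_0}\int \mathbf{j}'_{t-\frac{R_0}{c}+\frac{\mathbf{r}\cdot\mathbf{n}'}{c}}\,dV = \frac{1}{cR_0}\sum (e\mathbf{v}')_{t-\frac{R_0}{c}+\frac{\mathbf{r}\cdot\mathbf{n}'}{c}},$$
where the sum goes over all the charges of the system; $\mathbf{n}'$ is a unit vector in the direction of
scattering. Substituting (80.2), we find
$$\mathbf{A}' = -\frac{1}{icR_0\omega}\,e^{-i\omega\left(t-\frac{R_0}{c}\right)}\,\mathbf{E}_0\sum\frac{e^2}{m}e^{-i\mathbf{q}\cdot\mathbf{r}}, \tag{80.3}$$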
where $\mathbf{q} = \mathbf{k}' - \mathbf{k}$ is the difference between the wave vector $\mathbf{k} = (\omega/c)\mathbf{n}$ of the incident wave,
and the wave vector $\mathbf{k}' = (\omega/c)\mathbf{n}'$ of the scattered wave.† The value of the sum in (80.3)
must be taken at the time $t' = t - (R_0/c)$ (for brevity as usual, we omit the index $t'$ on $\mathbf{r}$);
the change of $\mathbf{r}$ in the time $\mathbf{r}\cdot\mathbf{n}'/c$ can be neglected in view of our assumption that the velocities
of the particles are small. The absolute value of the vector $\mathbf{q}$ is
$$q = 2\frac{\omega}{c}\sin\frac{\vartheta}{2}, \tag{80.4}$$
where $\vartheta$ is the scattering angle.
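Formula (80.4) follows from $|\mathbf{k}| = |\mathbf{k}'| = \omega/c$ and can be confirmed by forming the difference of the two wave vectors directly. The sketch below is only an illustration: the function name, the choice of the incident direction along the x axis, and the sample values of $\omega$ and $\vartheta$ are assumptions made for the example.

```python
import math

C_LIGHT = 2.998e10  # speed of light, cm/s

def wave_vector_difference(omega, theta):
    """|q| = |k' - k| for elastic scattering through the angle theta."""
    k = omega / C_LIGHT
    k_in = (k, 0.0, 0.0)                                     # incident wave along x
    k_out = (k * math.cos(theta), k * math.sin(theta), 0.0)  # scattered wave in the x, y plane
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(k_in, k_out)))

omega, theta = 1.0e18, 0.7   # illustrative frequency (rad/s) and scattering angle (rad)
q_direct = wave_vector_difference(omega, theta)
q_formula = 2 * omega / C_LIGHT * math.sin(theta / 2)        # right side of (80.4)
print(q_direct, q_formula)
```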
For scattering by an atom (or molecule), we can neglect the terms in the sum in (80.3)
which come from the nuclei, because their masses are large compared with the electron mass.
Later we shall be looking at just this case, so that we remove the factor $e^2/m$ from the summation
sign, and understand by $e$ and $m$ the charge and mass of the electron.
† Strictly speaking the wave vector $\mathbf{k}' = \omega'\mathbf{n}'/c$, where the frequency $\omega'$ of the scattered wave may differ
from $\omega$. However, in the present case of high frequencies the difference $\omega' - \omega \sim \omega_0$ can be neglected.
For the field $\mathbf{H}'$ of the scattered wave we find from (66.3):
$$\mathbf{H}' = \frac{\mathbf{E}_0\times\mathbf{n}'}{c^2R_0}\,e^{-i\omega\left(t-\frac{R_0}{c}\right)}\,\frac{e^2}{m}\sum e^{-i\mathbf{q}\cdot\mathbf{r}}. \tag{80.5}$$
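The energy flux into an element of solid angle in the direction $\mathbf{n}'$ is
$$\frac{c|\mathbf{H}'|^2}{8\pi}R_0^2\,do = \frac{e^4}{8\pi c^3m^2}(\mathbf{n}'\times\mathbf{E}_0)^2\left|\sum e^{-i\mathbf{q}\cdot\mathbf{r}}\right|^2 do.$$
Dividing this by the energy flux $c|\mathbf{E}_0|^2/8\pi$ of the incident wave, and introducing the angle $\theta$
between the direction of the field $\mathbf{E}$ of the incident wave and the direction of scattering, we
finally obtain the effective scattering cross-section in the form
$$d\sigma = \left(\frac{e^2}{mc^2}\right)^2\overline{\left|\sum e^{-i\mathbf{q}\cdot\mathbf{r}}\right|^2}\,\sin^2\theta\,do. \tag{80.6}$$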
The dash means a time average, i.e. an average over the motion of the charges of the system;
it appears because the scattering is observed over a time interval large compared with the
periods of motion of the charges of the system.
For the wavelength of the incident radiation, there follows from the condition (80.1) the
inequality $\lambda \ll ac/v$. As for the relative values of $\lambda$ and $a$, both the limiting cases $\lambda \gg a$
and $\lambda \ll a$ are possible. In both these cases the general formula (80.6) simplifies considerably.
In the case of $\lambda \gg a$, in the expression (80.6) $\mathbf{q}\cdot\mathbf{r} \ll 1$, since $q \sim 1/\lambda$, and $r$ is of order of $a$.
Replacing $e^{-i\mathbf{q}\cdot\mathbf{r}}$ by unity in accordance with this, we have:
$$d\sigma = \left(\frac{Ze^2}{mc^2}\right)^2\sin^2\theta\,do, \tag{80.7}$$
that is, the scattering is proportional to the square of the atomic number $Z$.
We now go over to the case of $\lambda \ll a$. In the square of the sum which appears in (80.6), in
addition to the square modulus of each term, there appear products of the form $e^{-i\mathbf{q}\cdot(\mathbf{r}_1-\mathbf{r}_2)}$.
In averaging over the motion of the charges, i.e. over their mutual separations, $\mathbf{r}_1 - \mathbf{r}_2$
takes on values in an interval of order $a$. Since $q \sim 1/\lambda$, $\lambda \ll a$, the exponential factor
$e^{-i\mathbf{q}\cdot(\mathbf{r}_1-\mathbf{r}_2)}$ is a rapidly oscillating function in this interval, and its average value vanishes.
Thus for $\lambda \ll a$, the effective scattering cross-section is
$$d\sigma = Z\left(\frac{e^2}{mc^2}\right)^2\sin^2\theta\,do, \tag{80.8}$$
that is, the scattering is proportional to the first power of the atomic number. We note that
this formula is not applicable for small angles of scattering ($\vartheta \sim \lambda/a$), since in this case
$q \sim \vartheta/\lambda \sim 1/a$ and the exponent $\mathbf{q}\cdot\mathbf{r}$ is not large compared to unity.
To determine the effective coherent scattering cross-section, we must separate out that
part of the field of the scattered wave which has the frequency $\omega$. The expression (80.5)
depends on the time through the factor $e^{-i\omega t}$, and also involves the time in the sum
$\sum e^{-i\mathbf{q}\cdot\mathbf{r}}$. This latter dependence leads to the result that in the field of the scattered wave there
are contained, along with the frequency $\omega$, other (though close to $\omega$) frequencies. That part
of the field which has the frequency $\omega$ (i.e. depends on the time only through the factor
$e^{-i\omega t}$), is obtained if we average the sum $\sum e^{-i\mathbf{q}\cdot\mathbf{r}}$ over time. In accordance with this, the
expression for the effective coherent scattering cross-section $d\sigma_{\rm coh}$ differs from the total
cross-section $d\sigma$ in that it contains, in place of the average value of the square modulus of
the sum, the square modulus of the average value of the sum,
$$d\sigma_{\rm coh} = \left(\frac{e^2}{mc^2}\right)^2\left|\overline{\sum e^{-i\mathbf{q}\cdot\mathbf{r}}}\right|^2\sin^2\theta\,do. \tag{80.9}$$
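It is useful to note that this average value of the sum is (except for a factor) just the space
Fourier component of the average distribution $\varrho(\mathbf{r})$ of the electric charge density in the
atom:
$$e\,\overline{\sum e^{-i\mathbf{q}\cdot\mathbf{r}}} = \int\varrho(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,dV = \varrho_{\mathbf{q}}. \tag{80.10}$$
In case $\lambda \gg a$, we can again replace $e^{-i\mathbf{q}\cdot\mathbf{r}}$ by unity, so that
$$d\sigma_{\rm coh} = Z^2\left(\frac{e^2}{mc^2}\right)^2\sin^2\theta\,do. \tag{80.11}$$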
Comparing this with the total effective cross-section (80.7), we see that $d\sigma_{\rm coh} = d\sigma$, that is,
all the scattering is coherent.
If $\lambda \ll a$, then when we average in (80.9) all the terms of the sum (being rapidly oscillating
functions of the time) vanish, so that $d\sigma_{\rm coh} = 0$. Thus in this case the scattering is completely
incoherent.
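The difference between the averaged square modulus in (80.6) and the square modulus of the average in (80.9) can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the $Z$ electrons move independently and that the component of each $\mathbf{r}$ along $\mathbf{q}$ is uniformly distributed over an interval of length $a$; neither assumption comes from the text, and the function names are hypothetical. Under these assumptions the average of $e^{-i\mathbf{q}\cdot\mathbf{r}}$ for one electron is $f = \sin(qa/2)/(qa/2)$, the total factor is $Z + Z(Z-1)f^2$, and the coherent factor is $Z^2f^2$.

```python
import math

def single_electron_average(q, a):
    """Mean of exp(-i q x) for x uniform on [-a/2, a/2] (toy distribution)."""
    u = q * a / 2
    return 1.0 if u == 0 else math.sin(u) / u

def sum_factors(Z, q, a):
    """Return (total, coherent) factors multiplying (e^2/mc^2)^2 sin^2(theta) do."""
    f = single_electron_average(q, a)
    total = Z + Z * (Z - 1) * f**2   # Z squared moduli plus averaged cross terms
    coherent = Z**2 * f**2           # square modulus of the averaged sum
    return total, coherent

Z, a = 10, 1.0
long_wave = sum_factors(Z, q=1e-3, a=a)   # q a << 1, i.e. wavelength >> a
short_wave = sum_factors(Z, q=1e3, a=a)   # q a >> 1, i.e. wavelength << a
print(long_wave)    # both entries close to Z**2 = 100
print(short_wave)   # total close to Z = 10, coherent close to 0
```

In the long-wave limit both factors approach $Z^2$, as in (80.7) and (80.11); in the short-wave limit the total factor approaches $Z$, as in (80.8), while the coherent factor becomes negligible.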
CHAPTER 10
PARTICLE IN A GRAVITATIONAL FIELD
§ 81. Gravitational fields in nonrelativistic mechanics
Gravitational fields, or fields of gravity, have the basic property that all bodies move in
them in the same manner, independently of mass, provided the initial conditions are the
same.
For example, the laws of free fall in the gravity field of the earth are the same for all
bodies; whatever their mass, all acquire one and the same acceleration.
This property of gravitational fields provides the possibility of establishing an analogy
between the motion of a body in a gravitational field and the motion of a body not located
in any external field, but which is considered from the point of view of a noninertial system
of reference. Namely, in an inertial reference system, the free motion of all bodies is uniform
and rectilinear, and if, say, at the initial time their velocities are the same, they will be the
same for all times. Clearly, therefore, if we consider this motion in a given noninertial
system, then relative to this system all the bodies will move in the same way.
Thus the properties of the motion in a noninertial system are the same as those in an
inertial system in the presence of a gravitational field. In other words, a noninertial reference
system is equivalent to a certain gravitational field. This is called the principle of equivalence.
Let us consider, for example, motion in a uniformly accelerated reference system. A body
of arbitrary mass, freely moving in such a system of reference, clearly has relative to this
system a constant acceleration, equal and opposite to the acceleration of the system itself.
The same applies to motion in a uniform constant gravitational field, e.g. the field of gravity
of the earth (over small regions, where the field can be considered uniform). Thus a uniformly
accelerated system of reference is equivalent to a constant, uniform external field.
A somewhat more general case is a nonuniformly accelerated linear motion of the reference
system — it is clearly equivalent to a uniform but variable gravitational field.
However, the fields to which noninertial reference systems are equivalent are not completely
identical with "actual" gravitational fields which occur also in inertial frames. For
there is a very essential difference with respect to their behavior at infinity. At infinite
distances from the bodies producing the field, "actual" gravitational fields always go to
zero. Contrary to this, the fields to which noninertial frames are equivalent increase without
limit at infinity, or, in any event, remain finite in value. Thus, for example, the centrifugal
force which appears in a rotating reference system increases without limit as we move away
from the axis of rotation; the field to which a reference system in accelerated linear motion is
equivalent is the same over all space and also at infinity.
The fields to which noninertial systems are equivalent vanish as soon as we transform to
an inertial system. In contrast to this, "actual" gravitational fields (existing also in an inertial
reference frame) cannot be eliminated by any choice of reference system. This is already
clear from what has been said above concerning the difference in conditions at infinity
between "actual" gravitational fields and fields to which noninertial systems are equivalent;
since the latter do not approach zero at infinity, it is clear that it is impossible, by any
choice of reference frame, to eliminate an "actual" field, since it vanishes at infinity.
All that can be done by a suitable choice of reference system is to eliminate the gravitational
field in a given region of space, sufficiently small so that the field can be considered
uniform over it. This can be done by choosing a system in accelerated motion, the acceleration
of which is equal to that which would be acquired by a particle placed in the region of
the field which we are considering.
The motion of a particle in a gravitational field is determined, in nonrelativistic mechanics,
by a Lagrangian having (in an inertial reference frame) the form
$$L = \frac{mv^2}{2} - m\phi, \tag{81.1}$$
where $\phi$ is a certain function of the coordinates and time which characterizes the field and
is called the gravitational potential.† Correspondingly, the equation of motion of the particle
is
$$\dot{\mathbf{v}} = -\operatorname{grad}\phi. \tag{81.2}$$
It does not contain the mass or any other constant characterizing the properties of the
particle; this is the mathematical expression of the basic property of gravitational fields.
§ 82. The gravitational field in relativistic mechanics
The fundamental property of gravitational fields that all bodies move in them in the same
way, remains valid also in relativistic mechanics. Consequently there remains also the
analogy between gravitational fields and noninertial reference systems. Therefore in studying
the properties of gravitational fields in relativistic mechanics, we naturally also start from
this analogy.
In an inertial reference system, in cartesian coordinates, the interval $ds$ is given by the
relation:
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2.$$
Upon transforming to any other inertial reference system (i.e. under Lorentz transformation),
the interval, as we know, retains the same form. However, if we transform to a noninertial
system of reference, $ds^2$ will no longer be a sum of squares of the four coordinate
differentials.
† In what follows we shall seldom have to use the electromagnetic potential $\phi$, so that the designation of
the gravitational potential by the same symbol cannot lead to misunderstanding.
So, for example, when we transform to a uniformly rotating system of coordinates,
$$x = x'\cos\Omega t - y'\sin\Omega t, \qquad y = x'\sin\Omega t + y'\cos\Omega t, \qquad z = z'$$
($\Omega$ is the angular velocity of the rotation, directed along the $z$ axis), the interval takes on the
form
$$ds^2 = [c^2 - \Omega^2(x'^2 + y'^2)]\,dt^2 - dx'^2 - dy'^2 - dz'^2 + 2\Omega y'\,dx'\,dt - 2\Omega x'\,dy'\,dt.$$
No matter what the law of transformation of the time coordinate, this expression cannot be
represented as a sum of squares of the coordinate differentials.
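The form of the interval in the rotating system can be verified by inserting the differentials of the transformation into $ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$. The following sketch does this numerically at one arbitrarily chosen point; the function names, the sample values and the choice of units with $c = 1$ are assumptions made for the illustration. Because the interval is a quadratic form in the differentials, the comparison is exact up to rounding.

```python
import math

def interval_inertial(dt, dx, dy, dz, c=1.0):
    """ds^2 in the inertial system, cartesian coordinates."""
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2

def interval_rotating(xp, yp, dt, dxp, dyp, dzp, Omega, c=1.0):
    """ds^2 expressed through the rotating coordinates x', y', z' and t."""
    return ((c**2 - Omega**2 * (xp**2 + yp**2)) * dt**2
            - dxp**2 - dyp**2 - dzp**2
            + 2 * Omega * yp * dxp * dt - 2 * Omega * xp * dyp * dt)

def inertial_differentials(t, xp, yp, dt, dxp, dyp, Omega):
    """Differentials of x = x' cos(Omega t) - y' sin(Omega t), y = x' sin(Omega t) + y' cos(Omega t)."""
    s, co = math.sin(Omega * t), math.cos(Omega * t)
    dx = dxp * co - dyp * s - Omega * (xp * s + yp * co) * dt
    dy = dxp * s + dyp * co + Omega * (xp * co - yp * s) * dt
    return dx, dy

t, xp, yp, Omega = 0.3, 1.2, -0.7, 0.4          # arbitrary sample point, units with c = 1
dt, dxp, dyp, dzp = 0.01, 0.02, -0.03, 0.005    # arbitrary coordinate differentials
dx, dy = inertial_differentials(t, xp, yp, dt, dxp, dyp, Omega)
ds2_inertial = interval_inertial(dt, dx, dy, dzp)   # dz = dz'
ds2_rotating = interval_rotating(xp, yp, dt, dxp, dyp, dzp, Omega)
print(ds2_inertial, ds2_rotating)
```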
Thus in a noninertial system of reference the square of an interval appears as a quadratic
form of general type in the coordinate differentials, that is, it has the form
$$ds^2 = g_{ik}\,dx^i\,dx^k, \tag{82.1}$$
where the $g_{ik}$ are certain functions of the space coordinates $x^1, x^2, x^3$ and the time coordinate
$x^0$. Thus, when we use a noninertial system, the four-dimensional coordinate
system $x^0, x^1, x^2, x^3$ is curvilinear. The quantities $g_{ik}$, determining all the geometric properties
in each curvilinear system of coordinates, represent, we say, the space-time metric.
The quantities $g_{ik}$ can clearly always be considered symmetric in the indices $i$ and $k$
($g_{ki} = g_{ik}$), since they are determined from the symmetric form (82.1), where $g_{ik}$ and $g_{ki}$ enter
as factors of one and the same product $dx^i\,dx^k$. In the general case, there are ten different
quantities $g_{ik}$: four with equal, and $4\cdot 3/2 = 6$ with different indices. In an inertial reference
system, when we use cartesian space coordinates $x^{1,2,3} = x, y, z$, and the time, $x^0 = ct$,
the quantities $g_{ik}$ are
$$g_{00} = 1, \qquad g_{11} = g_{22} = g_{33} = -1, \qquad g_{ik} = 0 \ \text{ for } i \neq k. \tag{82.2}$$
We call a four-dimensional system of coordinates with these values of $g_{ik}$ galilean.
In the previous section it was shown that a noninertial system of reference is equivalent
to a certain field of force. We now see that in relativistic mechanics, these fields are determined
by the quantities $g_{ik}$.
The same applies also to "actual" gravitational fields. Any gravitational field is just a
change in the metric of space-time, as determined by the quantities $g_{ik}$. This important fact
means that the geometrical properties of space-time (its metric) are determined by physical
phenomena, and are not fixed properties of space and time.
The theory of gravitational fields, constructed on the basis of the theory of relativity, is
called the general theory of relativity. It was established by Einstein (and finally formulated
by him in 1916), and represents probably the most beautiful of all existing physical theories.
It is remarkable that it was developed by Einstein in a purely deductive manner and only
later was substantiated by astronomical observations.
As in nonrelativistic mechanics, there is a fundamental difference between "actual"
gravitational fields and fields to which noninertial reference systems are equivalent. Upon
transforming to a noninertial reference system, the quadratic form (82.1), i.e. the quantities
$g_{ik}$, are obtained from their galilean values (82.2) by a simple transformation of coordinates,
and can be reduced over all space to their galilean values by the inverse coordinate transformation.
That such forms for $g_{ik}$ are very special is clear from the fact that it is impossible
by a mere transformation of the four coordinates to bring the ten quantities $g_{ik}$ to a preassigned
form.
An "actual" gravitational field cannot be eliminated by any transformation of coordinates.
In other words, in the presence of a gravitational field space-time is such that the
quantities $g_{ik}$ determining its metric cannot, by any coordinate transformation, be brought
to their galilean values over all space. Such a space-time is said to be curved, in contrast to
flat space-time, where such a reduction is possible.
By an appropriate choice of coordinates, we can, however, bring the quantities $g_{ik}$ to
galilean form at any individual point of the non-galilean space-time: this amounts to the
reduction to diagonal form of a quadratic form with constant coefficients (the values of $g_{ik}$
at the given point). Such a coordinate system is said to be galilean for the given point.†
We note that, after reduction to diagonal form at a given point, the matrix of the quantities
$g_{ik}$ has one positive and three negative principal values.‡ From this it follows, in particular,
that the determinant $g$, formed from the quantities $g_{ik}$, is always negative for a real
space-time:
$$g < 0. \tag{82.3}$$
A change in the metric of space-time also means a change in the purely spatial metric.
To a galilean $g_{ik}$ in flat space-time, there corresponds a euclidean geometry of space. In a
gravitational field, the geometry of space becomes non-euclidean. This applies both to
"true" gravitational fields, in which space-time is "curved", as well as to fields resulting
from the fact that the reference system is noninertial, which leave the space-time flat.
The problem of spatial geometry in a gravitational field will be considered in more detail
in § 84. It is useful to give here a simple argument which shows pictorially that space will
become non-euclidean when we change to a noninertial system of reference. Let us consider
two reference frames, of which one ($K$) is inertial, while the other ($K'$) rotates uniformly
with respect to $K$ around their common $z$ axis. A circle in the $x, y$ plane of the $K$ system
(with its center at the origin) can also be regarded as a circle in the $x', y'$ plane of the $K'$
system. Measuring the length of the circle and its diameter with a yardstick in the $K$ system,
we obtain values whose ratio is equal to $\pi$, in accordance with the euclidean character of
the geometry in the inertial reference system. Now let the measurement be carried out with
a yardstick at rest relative to $K'$. Observing this process from the $K$ system, we find that the
yardstick laid along the circumference suffers a Lorentz contraction, whereas the yardstick
placed radially is not changed. It is therefore clear that the ratio of the circumference to the
diameter, obtained from such a measurement, will be greater than $\pi$.
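The size of this effect can be estimated from the argument just given. If a rod laid along the circumference of radius $r$ is contracted by the factor $\sqrt{1 - \Omega^2r^2/c^2}$ as seen from $K$, while a radial rod is unchanged, the measured ratio becomes $\pi/\sqrt{1 - \Omega^2r^2/c^2}$. The sketch below simply evaluates this expression; the function name and the sample values are illustrative assumptions, and the formula is a consequence drawn from the contraction argument rather than a result stated in this section.

```python
import math

C_LIGHT = 2.998e10  # speed of light, cm/s

def circumference_to_diameter(radius, Omega):
    """Ratio measured with yardsticks at rest in the rotating system K'."""
    beta = Omega * radius / C_LIGHT   # speed of the rim relative to K, in units of c
    if beta >= 1.0:
        raise ValueError("the rim speed must remain below the velocity of light")
    return math.pi / math.sqrt(1.0 - beta**2)

ratio_no_rotation = circumference_to_diameter(radius=100.0, Omega=0.0)
ratio_fast = circumference_to_diameter(radius=100.0, Omega=1.5e8)   # rim speed about c/2
print(ratio_no_rotation, ratio_fast)
```

For $\Omega = 0$ the euclidean value $\pi$ is recovered, and the ratio grows without limit as the rim speed approaches $c$.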
In the general case of an arbitrary, varying gravitational field, the metric of space is not
only non-euclidean, but also varies with the time. This means that the relations between
different geometrical distances change with time. As a result, the relative position of "test
bodies" introduced into the field cannot remain unchanged in any coordinate system.§ Thus
if the particles are placed around the circumference of a circle and along a diameter, since
the ratio of the circumference to the diameter is not equal to $\pi$ and changes with time, it is
clear that if the separations of the particles along the diameter remain unchanged the separations
around the circumference must change, and conversely. Thus in the general theory
of relativity it is impossible in general to have a system of bodies which are fixed relative to
one another.
† To avoid misunderstanding, we state immediately that the choice of such a coordinate system does not
mean that the gravitational field has been eliminated over the corresponding infinitesimal volume of four-space.
Such an elimination is also always possible, by virtue of the principle of equivalence, and has a greater
significance (see § 87).
‡ This set of signs is called the signature of the matrix.
§ Strictly speaking, the number of particles should be greater than four. Since we can construct a tetrahedron
from any six line segments, we can always, by a suitable definition of the reference system, make a
system of four particles form an invariant tetrahedron. A fortiori, we can fix the particles relative to one
another in systems of three or two particles.
This result essentially changes the very concept of a system of reference in the general
theory of relativity, as compared to its meaning in the special theory. In the latter we meant
by a reference system a set of bodies at rest relative to one another in unchanging relative
positions. Such systems of bodies do not exist in the presence of a variable gravitational field,
and for the exact determination of the position of a particle in space we must, strictly speaking,
have an infinite number of bodies which fill all the space like some sort of "medium".
Such a system of bodies with arbitrarily running clocks fixed on them constitutes a reference
system in the general theory of relativity.
In connection with the arbitrariness of the choice of a reference system, the laws of nature
must be written in the general theory of relativity in a form which is appropriate to any four-dimensional
system of coordinates (or, as one says, in "covariant" form). This, of course,
does not imply the physical equivalence of all these reference systems (like the physical
equivalence of all inertial reference systems in the special theory). On the contrary, the
specific appearances of physical phenomena, including the properties of the motion of
bodies, become different in all systems of reference.
§ 83. Curvilinear coordinates
Since, in studying gravitational fields we are confronted with the necessity of considering
phenomena in an arbitrary reference frame, it is necessary to develop four-dimensional
geometry in arbitrary curvilinear coordinates. Sections 83, 85 and 86 are devoted to this.
Let us consider the transformation from one coordinate system, $x^0, x^1, x^2, x^3$, to another,
$x'^0, x'^1, x'^2, x'^3$:
$$x^i = f^i(x'^0, x'^1, x'^2, x'^3),$$
where the $f^i$ are certain functions. When we transform the coordinates, their differentials
transform according to the relation
$$dx^i = \frac{\partial x^i}{\partial x'^k}\,dx'^k. \tag{83.1}$$
Every aggregate of four quantities $A^i$ ($i = 0, 1, 2, 3$), which under a transformation of
coordinates transform like the coordinate differentials, is called a contravariant four-vector:
$$A^i = \frac{\partial x^i}{\partial x'^k}\,A'^k. \tag{83.2}$$
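Let $\phi$ be some scalar. Under a coordinate transformation, the four quantities $\partial\phi/\partial x^i$
transform according to the formula
$$\frac{\partial\phi}{\partial x^i} = \frac{\partial\phi}{\partial x'^k}\,\frac{\partial x'^k}{\partial x^i}, \tag{83.3}$$
which is different from formula (83.2). Every aggregate of four quantities $A_i$ which, under a
coordinate transformation, transform like the derivatives of a scalar, is called a covariant
four-vector:
$$A_i = \frac{\partial x'^k}{\partial x^i}\,A'_k. \tag{83.4}$$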
Because two types of vectors appear in curvilinear coordinates, there are three types of
tensors of the second rank. We call a contravariant tensor of the second rank, $A^{ik}$, an aggregate
of sixteen quantities which transform like the products of the components of two contravariant
vectors, i.e. according to the law
$$A^{ik} = \frac{\partial x^i}{\partial x'^l}\,\frac{\partial x^k}{\partial x'^m}\,A'^{lm}. \tag{83.5}$$
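A covariant tensor of rank two transforms according to the formula
$$A_{ik} = \frac{\partial x'^l}{\partial x^i}\,\frac{\partial x'^m}{\partial x^k}\,A'_{lm}, \tag{83.6}$$
and a mixed tensor transforms as follows:
$$A^i{}_k = \frac{\partial x^i}{\partial x'^l}\,\frac{\partial x'^m}{\partial x^k}\,A'^l{}_m. \tag{83.7}$$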
The definitions given here are the natural generalization of the definitions of four-vectors
and four-tensors in galilean coordinates (§ 6), according to which the differentials $dx^i$
constitute a contravariant four-vector and the derivatives $\partial\phi/\partial x^i$ form a covariant
four-vector.†
The rules for forming four-tensors by multiplication or contraction of products of other
four-tensors remain the same in curvilinear coordinates as they were in galilean coordinates.
For example, it is easy to see that, by virtue of the transformation laws (83.2) and (83.4), the
scalar product of two four-vectors $A^iB_i$ is invariant:
$$A^iB_i = \frac{\partial x^i}{\partial x'^l}\,\frac{\partial x'^m}{\partial x^i}\,A'^lB'_m = \frac{\partial x'^m}{\partial x'^l}\,A'^lB'_m = A'^lB'_l.$$
The unit four-tensor $\delta^i_k$ is defined the same as before in curvilinear coordinates: its
components are again $\delta^i_k = 0$ for $i \neq k$, and are equal to 1 for $i = k$. If $A^k$ is a four-vector,
then multiplying by $\delta^i_k$ we get:
$$A^k\delta^i_k = A^i,$$
i.e. another four-vector; this proves that $\delta^i_k$ is a tensor.
The square of the line element in curvilinear coordinates is a quadratic form in the
differentials $dx^i$:
$$ds^2 = g_{ik}\,dx^i\,dx^k, \tag{83.8}$$
where the $g_{ik}$ are functions of the coordinates; $g_{ik}$ is symmetric in the indices $i$ and $k$:
$$g_{ik} = g_{ki}. \tag{83.9}$$
Since the (contracted) product of $g_{ik}$ and the contravariant tensor $dx^i\,dx^k$ is a scalar, the
$g_{ik}$ form a covariant tensor; it is called the metric tensor.
Two tensors $A_{ik}$ and $B^{ik}$ are said to be reciprocal to each other if
$$A_{ik}B^{kl} = \delta^l_i.$$
In particular the contravariant metric tensor is the tensor $g^{ik}$ reciprocal to the tensor $g_{ik}$,
that is,
$$g_{ik}\,g^{kl} = \delta^l_i. \tag{83.10}$$
The same physical quantity can be represented in contra- or covariant components. It is
obvious that the only quantities that can determine the connection between the different
forms are the components of the metric tensor. This connection is given by the formulas:
$$A^i = g^{ik}A_k, \qquad A_i = g_{ik}A^k. \tag{83.11}$$
† Nevertheless, while in a galilean system the coordinates $x^i$ themselves (and not just their differentials)
also form a four-vector, this is, of course, not the case in curvilinear coordinates.
In a galilean coordinate system the metric tensor has components:
$$g^{(0)}_{ik} = g^{(0)ik} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{83.12}$$
Then formula (83.11) gives the familiar relation $A^0 = A_0$, $A^{1,2,3} = -A_{1,2,3}$, etc.†
These remarks also apply to tensors. The transition between the different forms of a given
physical tensor is accomplished by using the metric tensor according to the formulas:
$$A^i{}_k = g^{il}A_{lk}, \qquad A^{ik} = g^{il}g^{km}A_{lm},$$
etc.
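As an illustration of formulas (83.11), the sketch below lowers and raises the index of a four-vector with a diagonal metric. The galilean metric (83.12) is taken from the text; the second example uses the metric of flat space-time in spherical space coordinates, $g_{ik} = \mathrm{diag}(1, -1, -r^2, -r^2\sin^2\theta)$, which is a standard form assumed here for illustration and not quoted in this section. The function names and the sample components are likewise hypothetical.

```python
import math

GALILEAN_DIAG = [1.0, -1.0, -1.0, -1.0]   # diagonal of the galilean metric (83.12)

def spherical_metric_diag(r, theta):
    """Diagonal of g_ik for flat space-time in coordinates (ct, r, theta, phi)."""
    return [1.0, -1.0, -r**2, -(r * math.sin(theta))**2]

def lower_index(a_up, g_diag):
    """A_i = g_ik A^k for a diagonal metric."""
    return [g * a for g, a in zip(g_diag, a_up)]

def raise_index(a_down, g_diag):
    """A^i = g^ik A_k; for a diagonal metric the reciprocal tensor has g^ii = 1/g_ii."""
    return [a / g for g, a in zip(g_diag, a_down)]

A_up = [1.0, 0.5, -0.2, 0.3]                     # arbitrary contravariant components
A_down_galilean = lower_index(A_up, GALILEAN_DIAG)
print(A_down_galilean)                           # time component kept, space components sign-reversed

g_diag = spherical_metric_diag(r=2.0, theta=0.9)
A_down = lower_index(A_up, g_diag)
A_back = raise_index(A_down, g_diag)             # should reproduce A_up
scalar = sum(u * d for u, d in zip(A_up, A_down))  # the invariant A^i A_i
print(A_back, scalar)
```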
In § 6 we defined (in galilean coordinates) the completely antisymmetric unit pseudotensor
$e^{iklm}$. Let us transform it to an arbitrary system of coordinates, and now denote it by
$E^{iklm}$. We keep the notation $e^{iklm}$ for the quantities defined as before by $e^{0123} = 1$ (or
$e_{0123} = -1$).
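Let the $x'^i$ be galilean, and the $x^i$ be arbitrary curvilinear coordinates. According to the
general rules for transformation of tensors, we have:
$$E^{iklm} = \frac{\partial x^i}{\partial x'^p}\,\frac{\partial x^k}{\partial x'^r}\,\frac{\partial x^l}{\partial x'^s}\,\frac{\partial x^m}{\partial x'^t}\,e^{prst},$$
or
$$E^{iklm} = J\,e^{iklm},$$
where $J$ is the determinant formed from the derivatives $\partial x^i/\partial x'^p$, i.e. it is just the Jacobian
of the transformation from the galilean to the curvilinear coordinates:
$$J = \frac{\partial(x^0, x^1, x^2, x^3)}{\partial(x'^0, x'^1, x'^2, x'^3)}.$$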
This Jacobian can be expressed in terms of the determinant of the metric tensor $g_{ik}$ (in the
system $x^i$). To do this we write the formula for the transformation of the metric tensor:
$$g^{ik} = \frac{\partial x^i}{\partial x'^l}\,\frac{\partial x^k}{\partial x'^m}\,g^{(0)lm},$$
and equate the determinants of the two sides of this equation. The determinant of the reciprocal
tensor $|g^{ik}| = 1/g$. The determinant $|g^{(0)lm}| = -1$. Thus we have $1/g = -J^2$, and
so $J = 1/\sqrt{-g}$.
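The relation $J = 1/\sqrt{-g}$ can be checked on a familiar example. The sketch below takes the curvilinear coordinates to be $(ct, r, \theta, \phi)$, with spherical space coordinates, builds the matrix $\partial x'^l/\partial x^i$ of the galilean coordinates with respect to the curvilinear ones, forms $g_{ik}$ from the covariant transformation law, and compares the two determinants. The choice of coordinates, the helper names and the sample point are assumptions made for the illustration.

```python
import math

def det(m):
    """Determinant by expansion along the first row (adequate for a 4 x 4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def galilean_wrt_spherical(r, th, ph):
    """Matrix of d x'^l / d x^i for x' = (ct, x, y, z) and x = (ct, r, theta, phi)."""
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, st * cp, r * ct * cp, -r * st * sp],
        [0.0, st * sp, r * ct * sp, r * st * cp],
        [0.0, ct, -r * st, 0.0],
    ]

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the galilean metric (83.12)

def metric(r, th, ph):
    """g_ik = (dx'^l/dx^i)(dx'^m/dx^k) g^(0)_lm."""
    m = galilean_wrt_spherical(r, th, ph)
    return [[sum(ETA[l] * m[l][i] * m[l][k] for l in range(4)) for k in range(4)]
            for i in range(4)]

r, th, ph = 2.0, 0.9, 1.7                           # arbitrary sample point
g_matrix = metric(r, th, ph)
g = det(g_matrix)                                   # determinant of g_ik
J = 1.0 / det(galilean_wrt_spherical(r, th, ph))    # d(curvilinear)/d(galilean)
print(g, J, 1.0 / math.sqrt(-g))
```

At this point $g = -r^4\sin^2\theta$ is negative, in agreement with (82.3), and $J$ coincides with $1/\sqrt{-g} = 1/(r^2\sin\theta)$.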
Thus, in curvilinear coordinates the antisymmetric unit tensor of rank four must be
defined as
$$E^{iklm} = \frac{1}{\sqrt{-g}}\,e^{iklm}. \tag{83.13}$$
† Whenever, in giving analogies, we use galilean coordinate systems, one should realize that such a
system can be selected only in a flat space. In the case of a curved four-space, one should speak of a coordinate
system that is galilean over a given infinitesimal element of four-volume, which can always be
found. None of the derivations are affected by this change.
The indices of this tensor are lowered by using the formula
$$e^{prst}\,g_{ip}\,g_{kr}\,g_{ls}\,g_{mt} = -g\,e_{iklm},$$
so that its covariant components are
$$E_{iklm} = \sqrt{-g}\,e_{iklm}. \tag{83.14}$$
In a galilean coordinate system $x'^i$ the integral of a scalar with respect to
$d\Omega' = dx'^0\,dx'^1\,dx'^2\,dx'^3$ is also a scalar, i.e. the element $d\Omega'$ behaves like a scalar in
the integration (§ 6). On transforming to curvilinear coordinates $x^i$, the element of integration
$d\Omega'$ goes over into
$$d\Omega' \to \frac{1}{J}\,d\Omega = \sqrt{-g}\,d\Omega.$$
Thus, in curvilinear coordinates, when integrating over a four-volume the quantity $\sqrt{-g}\,d\Omega$
behaves like an invariant.†
All the remarks at the end of § 6 concerning elements of integration over hypersurfaces,
surfaces and lines remain valid for curvilinear coordinates, with the one difference that the
definition of dual tensors changes. The element of "area" of the hypersurface spanned by
three infinitesimal displacements is the contravariant antisymmetric tensor $dS^{ikl}$; the vector
dual to it is gotten by multiplying by the tensor $\sqrt{-g}\,e_{iklm}$, so it is equal to
$$\sqrt{-g}\,dS_i = -\tfrac{1}{6}\,e_{iklm}\,dS^{klm}\,\sqrt{-g}. \tag{83.15}$$
Similarly, if $df^{ik}$ is the element of (two-dimensional) surface spanned by two infinitesimal
displacements, the dual tensor is defined as‡
$$\sqrt{-g}\,df^*_{ik} = \tfrac{1}{2}\sqrt{-g}\,e_{iklm}\,df^{lm}. \tag{83.16}$$
We keep the designations $dS_i$ and $df^*_{ik}$ as before for $-\tfrac{1}{6}e_{iklm}\,dS^{klm}$ and $\tfrac{1}{2}e_{iklm}\,df^{lm}$ (and
not their products by $\sqrt{-g}$); the rules (6.14)-(6.19) for transforming the various integrals
into one another remain the same, since their derivation was formal in character and not
related to the tensor properties of the different quantities. Of particular importance is the
rule for transforming the integral over a hypersurface into an integral over a four-volume
(Gauss' theorem), which is accomplished by the substitution:
$$dS_i \to d\Omega\,\frac{\partial}{\partial x^i}. \tag{83.17}$$
† If $\phi$ is a scalar, the quantity $\sqrt{-g}\,\phi$, which gives an invariant when integrated over $d\Omega$, is called a
scalar density. Similarly, we speak of vector and tensor densities $\sqrt{-g}\,A^i$, $\sqrt{-g}\,A^{ik}$, etc. These quantities
give a vector or tensor on multiplication by the four-volume element $d\Omega$ (the integral $\int A^i\sqrt{-g}\,d\Omega$ over
a finite region cannot, generally speaking, be a vector, since the laws of transformation of the vector $A^i$ are
different at different points).
‡ It is understood that the elements $dS^{klm}$ and $df^{ik}$ are constructed on the infinitesimal displacements
$dx^i$, $dx'^i$, $dx''^i$ in the same way as in § 6, no matter what the geometrical significance of the coordinates $x^i$.
Then the formal significance of the elements $dS_i$ and $df^*_{ik}$ is the same as before. In particular, as before
$dS_0 = dx^1\,dx^2\,dx^3 \equiv dV$. We keep the earlier definition of $dV$ for the product of differentials of the three
space coordinates; we must, however, remember that the element of geometrical spatial volume is given in
curvilinear coordinates not by $dV$, but by $\sqrt{\gamma}\,dV$, where $\gamma$ is the determinant of the spatial metric tensor
(which will be defined in the next section).
§ 84. Distances and time intervals
We have already said that in the general theory of relativity the choice of a coordinate system is not limited in any way; the triplet of space coordinates $x^1$, $x^2$, $x^3$ can be any quantities defining the position of bodies in space, and the time coordinate $x^0$ can be defined by an arbitrarily running clock. The question arises of how, in terms of the values of the quantities $x^1$, $x^2$, $x^3$, $x^0$, we can determine actual distances and time intervals.
First we find the relation of the proper time, which from now on we shall denote by $\tau$, to the coordinate $x^0$. To do this we consider two infinitesimally separated events, occurring at one and the same point in space. Then the interval $ds$ between the two events is, as we know, just $c\, d\tau$, where $d\tau$ is the (proper) time interval between the two events. Setting $dx^1 = dx^2 = dx^3 = 0$ in the general expression $ds^2 = g_{ik}\, dx^i dx^k$, we consequently find
$$ds^2 = c^2 d\tau^2 = g_{00} (dx^0)^2,$$
from which
$$d\tau = \frac{1}{c} \sqrt{g_{00}}\, dx^0, \tag{84.1}$$
or else, for the time between any two events occurring at the same point in space,
$$\tau = \frac{1}{c} \int \sqrt{g_{00}}\, dx^0. \tag{84.2}$$
This relation determines the actual time interval (or as it is also called, the proper time for the given point in space) for a change of the coordinate $x^0$. We note in passing that the quantity $g_{00}$, as we see from these formulas, is positive:
$$g_{00} > 0. \tag{84.3}$$
It is necessary to emphasize the difference between the meaning of (84.3) and the meaning of the signature [the signs of three principal values of the tensor $g_{ik}$ (§ 82)]. A tensor $g_{ik}$ which does not satisfy the second of these conditions cannot correspond to any real gravitational field, i.e. cannot be the metric of a real space-time. Nonfulfilment of the condition (84.3) would mean only that the corresponding system of reference cannot be realized with real bodies; if the condition on the principal values is fulfilled, then a suitable transformation of the coordinates can make $g_{00}$ positive (an example of such a system is given by the rotating system of coordinates, see § 89).
We now determine the element $dl$ of spatial distance. In the special theory of relativity we can define $dl$ as the interval between two infinitesimally separated events occurring at one and the same time. In the general theory of relativity, it is usually impossible to do this, i.e. it is impossible to determine $dl$ by simply setting $dx^0 = 0$ in $ds$. This is related to the fact that in a gravitational field the proper time at different points in space has a different dependence on the coordinate $x^0$.
To find $dl$, we now proceed as follows.
Suppose a light signal is directed from some point B in space (with coordinates $x^\alpha + dx^\alpha$) to a point A infinitely near to it (and having coordinates $x^\alpha$) and then back over the same path. Obviously, the time (as observed from the one point B) required for this, when multiplied by $c$, is twice the distance between the two points.
Let us write the interval, separating the space and time coordinates:
$$ds^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta + 2 g_{0\alpha}\, dx^0 dx^\alpha + g_{00} (dx^0)^2 \tag{84.4}$$
where it is understood that we sum over repeated Greek indices from 1 to 3. The interval between the events corresponding to the departure and arrival of the signal from one point to the other is equal to zero. Solving the equation $ds^2 = 0$ with respect to $dx^0$, we find two roots:
$$dx^{0(1)} = \frac{1}{g_{00}} \left\{ -g_{0\alpha}\, dx^\alpha - \sqrt{(g_{0\alpha} g_{0\beta} - g_{\alpha\beta} g_{00})\, dx^\alpha dx^\beta} \right\},$$
$$dx^{0(2)} = \frac{1}{g_{00}} \left\{ -g_{0\alpha}\, dx^\alpha + \sqrt{(g_{0\alpha} g_{0\beta} - g_{\alpha\beta} g_{00})\, dx^\alpha dx^\beta} \right\}, \tag{84.5}$$
corresponding to the propagation of the signal in the two directions between A and B. If $x^0$ is the moment of arrival of the signal at A, the times when it left B and when it will return to B are, respectively, $x^0 + dx^{0(1)}$ and $x^0 + dx^{0(2)}$. In the schematic diagram of Fig. 18 the solid lines are the world lines corresponding to the given coordinates $x^\alpha$ and $x^\alpha + dx^\alpha$, while the dashed lines are the world lines of the signals.† It is clear that the total interval of "time" between the departure of the signal and its return to the original point is equal to
$$dx^{0(2)} - dx^{0(1)} = \frac{2}{g_{00}} \sqrt{(g_{0\alpha} g_{0\beta} - g_{\alpha\beta} g_{00})\, dx^\alpha dx^\beta}.$$
[Fig. 18: world lines of the points A and B (solid) and of the light signals (dashed); the signal leaves B at $x^0 + dx^{0(1)}$, arrives at A at $x^0$, and returns to B at $x^0 + dx^{0(2)}$.]
The corresponding interval of proper time is obtained, according to (84.1), by multiplying by $\sqrt{g_{00}}/c$, and the distance $dl$ between the two points by multiplying once more by $c/2$. As a result, we obtain
$$dl^2 = \left( -g_{\alpha\beta} + \frac{g_{0\alpha} g_{0\beta}}{g_{00}} \right) dx^\alpha dx^\beta.$$
This is the required expression, defining the distance in terms of the space coordinate elements. We rewrite it in the form
$$dl^2 = \gamma_{\alpha\beta}\, dx^\alpha dx^\beta, \tag{84.6}$$
where
$$\gamma_{\alpha\beta} = -g_{\alpha\beta} + \frac{g_{0\alpha} g_{0\beta}}{g_{00}} \tag{84.7}$$
† In Fig. 18, it is assumed that $dx^{0(2)} > 0$, $dx^{0(1)} < 0$, but this is not necessary: $dx^{0(1)}$ and $dx^{0(2)}$ may have the same sign. The fact that in this case the value $x^0(A)$ at the moment of arrival of the signal at A might be less than the value $x^0(B)$ at the moment of its departure from B contains no contradiction, since the rates of clocks at different points in space are not assumed to be synchronized in any way.
is the three-dimensional metric tensor, determining the metric, i.e., the geometric properties of the space. The relations (84.7) give the connection between the metric of real space and the metric of the four-dimensional space-time.†
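As a numerical illustration (a sketch, not part of the original text), the spatial metric $\gamma_{\alpha\beta} = -g_{\alpha\beta} + g_{0\alpha} g_{0\beta}/g_{00}$ can be checked against the light-signal construction: the two roots of $ds^2 = 0$ give the round-trip coordinate time, which is converted to a distance by the factors $\sqrt{g_{00}}$ and $1/2$. The sample metric is that of a uniformly rotating frame in cylindrical coordinates (treated in § 89), assumed here with $x^0 = ct$; all function names are hypothetical.

```python
import math

def spatial_metric(g):
    """gamma_ab = -g_ab + g_0a g_0b / g_00 for a 4x4 metric g (index 0 is time)."""
    g00 = g[0][0]
    return [[-g[a][b] + g[0][a] * g[0][b] / g00 for b in (1, 2, 3)]
            for a in (1, 2, 3)]

def light_round_trip_length(g, dx):
    """dl from the two roots dx0 of ds^2 = 0 for spatial displacement dx."""
    g00 = g[0][0]
    lin = sum(g[0][a + 1] * dx[a] for a in range(3))            # g_0a dx^a
    quad = sum(g[a + 1][b + 1] * dx[a] * dx[b]
               for a in range(3) for b in range(3))             # g_ab dx^a dx^b
    root = math.sqrt(lin * lin - quad * g00)
    dx0_first = (-lin - root) / g00
    dx0_second = (-lin + root) / g00
    # proper time factor sqrt(g00), then half of the round trip (units with c = 1)
    return 0.5 * math.sqrt(g00) * (dx0_second - dx0_first)

def rotating_frame_metric(r, omega_over_c):
    """Assumed sample metric: rotating frame, coordinates (ct, r, phi, z)."""
    w = omega_over_c
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0 - (w * r) ** 2
    g[0][2] = g[2][0] = -w * r * r
    g[1][1] = -1.0
    g[2][2] = -r * r
    g[3][3] = -1.0
    return g

g = rotating_frame_metric(r=2.0, omega_over_c=0.25)
gamma = spatial_metric(g)
dx = (1e-3, 2e-3, -1e-3)
dl_from_gamma = math.sqrt(sum(gamma[a][b] * dx[a] * dx[b]
                              for a in range(3) for b in range(3)))
dl_from_light = light_round_trip_length(g, dx)
print(gamma[1][1], dl_from_gamma, dl_from_light)
```

The two lengths agree to rounding error, and the $\phi\phi$ component of the spatial metric comes out as $r^2/(1 - \Omega^2 r^2/c^2)$ rather than $r^2$.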
However, we must remember that the $g_{ik}$ generally depend on $x^0$, so that the space metric (84.6) also changes with time. For this reason, it is meaningless to integrate $dl$; such an integral would depend on the world line chosen between the two given space points. Thus, generally speaking, in the general theory of relativity the concept of a definite distance between bodies loses its meaning, remaining valid only for infinitesimal distances. The only case where the distance can be defined also over a finite domain is that in which the $g_{ik}$ do not depend on the time, so that the integral $\int dl$ along a space curve has a definite meaning.
It is worth noting that the tensor $-\gamma_{\alpha\beta}$ is the reciprocal of the contravariant three-dimensional tensor $g^{\alpha\beta}$. In fact, from $g^{ik} g_{kl} = \delta^i_l$, we have, in particular,
$$g^{\alpha\beta} g_{\beta\gamma} + g^{\alpha 0} g_{0\gamma} = \delta^\alpha_\gamma, \qquad g^{\alpha\beta} g_{\beta 0} + g^{\alpha 0} g_{00} = 0, \qquad g^{0\beta} g_{\beta 0} + g^{00} g_{00} = 1. \tag{84.8}$$
Determining $g^{\alpha 0}$ from the second equation and substituting in the first, we obtain:
$$-g^{\alpha\beta} \gamma_{\beta\gamma} = \delta^\alpha_\gamma.$$
This result can be formulated differently, by the statement that the quantities $-g^{\alpha\beta}$ form the contravariant three-dimensional metric tensor corresponding to the metric (84.6):
$$\gamma^{\alpha\beta} = -g^{\alpha\beta}. \tag{84.9}$$
We also state that the determinants $g$ and $\gamma$, formed respectively from the quantities $g_{ik}$ and $\gamma_{\alpha\beta}$, are related to one another by
$$-g = g_{00}\, \gamma. \tag{84.10}$$
In some of the later applications it will be convenient to introduce the three-dimensional vector $\mathbf{g}$, whose covariant components are defined as
$$g_\alpha = -\frac{g_{0\alpha}}{g_{00}}. \tag{84.11}$$
Considering $\mathbf{g}$ as a vector in the space with metric (84.6), we must define its contravariant components as $g^\alpha = \gamma^{\alpha\beta} g_\beta$. Using (84.9) and the second of equations (84.8), it is easy to see that
$$g^\alpha = \gamma^{\alpha\beta} g_\beta = -g^{0\alpha}. \tag{84.12}$$
We also note the formula
$$g^{00} = \frac{1}{g_{00}} - g_\alpha g^\alpha, \tag{84.13}$$
which follows from the third of equations (84.8).
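The algebraic relations (84.9), (84.10), (84.12) and (84.13) can be confirmed on a concrete example. The following sketch uses exact rational arithmetic on an arbitrarily chosen symmetric matrix with $g_{00} > 0$ (the numerical entries are an assumption made purely for illustration, as are the helper names).

```python
from fractions import Fraction as F

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Inverse via the adjugate: inv[i][j] = cofactor(j, i) / det."""
    d, n = det(m), len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]

# Illustrative covariant metric g_ik (index 0 is time); values are arbitrary.
g = [[F(3, 4), F(1, 5), F(-1), F(0)],
     [F(1, 5), F(-1), F(0), F(1, 10)],
     [F(-1), F(0), F(-4), F(0)],
     [F(0), F(1, 10), F(0), F(-1)]]
g_up = inverse(g)                       # contravariant g^ik
g00 = g[0][0]

gamma = [[-g[a][b] + g[0][a] * g[0][b] / g00 for b in (1, 2, 3)]
         for a in (1, 2, 3)]            # (84.7)
gamma_up = inverse(gamma)
g_vec_lo = [-g[0][a] / g00 for a in (1, 2, 3)]                    # (84.11)
g_vec_up = [sum(gamma_up[a][b] * g_vec_lo[b] for b in range(3))
            for a in range(3)]

check_84_9 = all(gamma_up[a][b] == -g_up[a + 1][b + 1]
                 for a in range(3) for b in range(3))
check_84_10 = (-det(g) == g00 * det(gamma))
check_84_12 = all(g_vec_up[a] == -g_up[0][a + 1] for a in range(3))
check_84_13 = (g_up[0][0] ==
               1 / g00 - sum(lo * up for lo, up in zip(g_vec_lo, g_vec_up)))
print(check_84_9, check_84_10, check_84_12, check_84_13)
```

Because the arithmetic is exact, the comparisons test the identities themselves and not merely their agreement to rounding error.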
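† The quadratic form (84.6) must clearly be positive definite. For this, its coefficients must, as we know from the theory of forms, satisfy the conditions
$$\gamma_{11} > 0, \qquad \begin{vmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{vmatrix} > 0, \qquad \begin{vmatrix} \gamma_{11} & \gamma_{12} & \gamma_{13} \\ \gamma_{21} & \gamma_{22} & \gamma_{23} \\ \gamma_{31} & \gamma_{32} & \gamma_{33} \end{vmatrix} > 0.$$
Expressing $\gamma_{\alpha\beta}$ in terms of $g_{ik}$, it is easy to show that these conditions take the form
$$\begin{vmatrix} g_{00} & g_{01} \\ g_{10} & g_{11} \end{vmatrix} < 0, \qquad \begin{vmatrix} g_{00} & g_{01} & g_{02} \\ g_{10} & g_{11} & g_{12} \\ g_{20} & g_{21} & g_{22} \end{vmatrix} > 0, \qquad g < 0.$$
These conditions, together with the condition (84.3), must be satisfied by the components of the metric tensor in every system of reference which can be realized with the aid of real bodies.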
We now turn to the definition of the concept of simultaneity in the general theory of relativity. In other words, we discuss the question of the possibility of synchronizing clocks located at different points in space, i.e. the setting up of a correspondence between the readings of these clocks.
Such a synchronization must obviously be achieved by means of an exchange of light signals between the two points. We again consider the process of propagation of signals between two infinitely near points A and B, as shown in Fig. 18. We should regard as simultaneous with the moment $x^0$ at the point A that reading of the clock at point B which is half-way between the moments of departure and return of the signal to that point, i.e. the moment
$$x^0 + \Delta x^0 = x^0 + \tfrac{1}{2} \left( dx^{0(2)} + dx^{0(1)} \right).$$
Substituting (84.5), we thus find that the difference in the values of the "time" $x^0$ for two simultaneous events occurring at infinitely near points is given by
$$\Delta x^0 = -\frac{g_{0\alpha}\, dx^\alpha}{g_{00}} \equiv g_\alpha\, dx^\alpha. \tag{84.14}$$
This relation enables us to synchronize clocks in any infinitesimal region of space. Carrying out a similar synchronization from the point A, we can synchronize clocks, i.e. we can define simultaneity of events, along any open curve.†
However, synchronization of clocks along a closed contour turns out to be impossible in general. In fact, starting out along the contour and returning to the initial point, we would obtain for $\Delta x^0$ a value different from zero. Thus it is, a fortiori, impossible to synchronize clocks over all space. The exceptional cases are those reference systems in which all the components $g_{0\alpha}$ are equal to zero.‡
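The failure of synchronization around a closed contour can be made concrete. The sketch below accumulates $\Delta x^0 = g_\alpha\, dx^\alpha$, with $g_\alpha = -g_{0\alpha}/g_{00}$, around a circle of radius $r$ in a uniformly rotating frame. The metric components $g_{00} = 1 - \Omega^2 r^2/c^2$ and $g_{0\phi} = -\Omega r^2/c$ are those of § 89 and are an assumption as far as the present section is concerned; the function name is hypothetical.

```python
import math

def sync_mismatch_around_circle(r, omega_over_c, steps=1000):
    """Sum of Delta x^0 = g_alpha dx^alpha around a circle r = const, z = const.

    Assumed metric (rotating frame, x^0 = ct):
        g_00 = 1 - (omega r / c)^2,   g_0phi = -omega r^2 / c.
    """
    w = omega_over_c
    g00 = 1.0 - (w * r) ** 2
    g0phi = -w * r * r
    g_phi = -g0phi / g00          # covariant component g_alpha of (84.11)
    dphi = 2.0 * math.pi / steps
    return sum(g_phi * dphi for _ in range(steps))

mismatch_rotating = sync_mismatch_around_circle(r=2.0, omega_over_c=0.25)
mismatch_static = sync_mismatch_around_circle(r=2.0, omega_over_c=0.0)
closed_form = 2.0 * math.pi * 0.25 * 2.0 ** 2 / (1.0 - (0.25 * 2.0) ** 2)
print(mismatch_rotating, closed_form, mismatch_static)
```

For the rotating frame the accumulated offset is nonzero, so a clock carried round the circle by successive synchronizations disagrees with itself on return; with $\Omega = 0$ all $g_{0\alpha}$ vanish and the offset is zero.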
It should be emphasized that the impossibility of synchronization of all clocks is a property of the arbitrary reference system, and not of the space-time itself. In any gravitational field, it is always possible (in infinitely many ways) to choose the reference system so that the three quantities $g_{0\alpha}$ become identically equal to zero, and thus make possible a complete synchronization of clocks (see § 90).
Even in the special theory of relativity, proper time elapses differently for clocks moving
relative to one another. In the general theory of relativity, proper time elapses differently
even at different points of space in the same reference system. This means that the interval
of proper time between two events occurring at some point in space, and the interval of
time between two events simultaneous with these at another point in space, are in general
different from one another.
§ 85. Covariant differentiation
In galilean coordinates§ the differentials $dA_i$ of a vector $A_i$ form a vector, and the derivatives $\partial A_i/\partial x^k$ of the components of a vector with respect to the coordinates form a tensor. In curvilinear coordinates this is not so; $dA_i$ is not a vector, and $\partial A_i/\partial x^k$ is not a
† Multiplying (84.14) by $g_{00}$ and bringing both terms to one side, we can state the condition for synchronization in the form $dx_0 = g_{0i}\, dx^i = 0$: the "covariant differential" $dx_0$ between two infinitely near simultaneous events must be equal to zero.
‡ We should also assign to this class those cases where the $g_{0\alpha}$ can be made equal to zero by a simple transformation of the time coordinate, which does not involve any choice of the system of objects serving for the definition of the space coordinates.
§ In general, whenever the quantities $g_{ik}$ are constant.
tensor. This is due to the fact that $dA_i$ is the difference of vectors located at different (infinitesimally separated) points of space; at different points in space vectors transform differently, since the coefficients in the transformation formulas (83.2), (83.4) are functions of the coordinates.
It is also easy to verify these statements directly. To do this we determine the transformation formulas for the differentials $dA_i$ in curvilinear coordinates. A covariant vector is transformed according to the formula
$$A_i = \frac{\partial x'^k}{\partial x^i} A'_k;$$
therefore
$$dA_i = \frac{\partial x'^k}{\partial x^i}\, dA'_k + A'_k\, d\frac{\partial x'^k}{\partial x^i} = \frac{\partial x'^k}{\partial x^i}\, dA'_k + A'_k \frac{\partial^2 x'^k}{\partial x^i\, \partial x^l}\, dx^l.$$
Thus $dA_i$ does not transform at all like a vector (the same also applies, of course, to the differential of a contravariant vector). Only if the second derivatives $\partial^2 x'^k/\partial x^i \partial x^l = 0$, i.e. if the $x'^k$ are linear functions of the $x^k$, do the transformation formulas have the form
$$dA_i = \frac{\partial x'^k}{\partial x^i}\, dA'_k,$$
that is, $dA_i$ transforms like a vector.
We now undertake the definition of a tensor which in curvilinear coordinates plays the same role as $\partial A_i/\partial x^k$ in galilean coordinates. In other words, we must transform $\partial A_i/\partial x^k$ from galilean to curvilinear coordinates.
In curvilinear coordinates, in order to obtain a differential of a vector which behaves like a vector, it is necessary that the two vectors to be subtracted from each other be located at the same point in space. In other words, we must somehow "translate" one of the vectors (which are separated infinitesimally from each other) to the point where the second is located, after which we determine the difference of two vectors which now refer to one and the same point in space. The operation of translation itself must be defined so that in galilean coordinates the difference shall coincide with the ordinary differential $dA_i$. Since $dA_i$ is just the difference of the components of two infinitesimally separated vectors, this means that when we use galilean coordinates the components of the vector should not change as a result of the translation operation. But such a translation is precisely the translation of a vector parallel to itself. Under a parallel translation of a vector, its components in galilean coordinates do not change. If, on the other hand, we use curvilinear coordinates, then in general the components of the vector will change under such a translation. Therefore in curvilinear coordinates, the difference in the components of the two vectors after translating one of them to the point where the other is located will not coincide with their difference before the translation (i.e. with the differential $dA_i$).
Thus to compare two infinitesimally separated vectors we must subject one of them to a parallel translation to the point where the second is located. Let us consider an arbitrary contravariant vector; if its value at the point $x^i$ is $A^i$, then at the neighboring point $x^i + dx^i$ it is equal to $A^i + dA^i$. We subject the vector $A^i$ to an infinitesimal parallel displacement to the point $x^i + dx^i$; the change in the vector which results from this we denote by $\delta A^i$. Then the difference $DA^i$ between the two vectors which are now located at the same point is
$$DA^i = dA^i - \delta A^i. \tag{85.1}$$
The change $\delta A^i$ in the components of a vector under an infinitesimal parallel displacement depends on the values of the components themselves, where the dependence must clearly be linear. This follows directly from the fact that the sum of two vectors must transform according to the same law as each of the constituents. Thus $\delta A^i$ has the form
$$\delta A^i = -\Gamma^i_{kl} A^k\, dx^l, \tag{85.2}$$
where the $\Gamma^i_{kl}$ are certain functions of the coordinates. Their form depends, of course, on the coordinate system; for a galilean coordinate system $\Gamma^i_{kl} = 0$.
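The rule $\delta A^i = -\Gamma^i_{kl} A^k\, dx^l$ can be tried out in the simplest curvilinear case, plane polar coordinates $(r, \phi)$ on a flat plane. The sketch below assumes the standard nonzero coefficients $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$ for these coordinates, applies the displacement rule in many small steps (with a midpoint correction, a numerical choice made here), and confirms that the cartesian components of the transported vector stay fixed although its polar components change. All names are hypothetical.

```python
import math

def christoffel_polar(r):
    """Gamma[i][k][l] for plane polar coordinates (index 0 = r, 1 = phi)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r                          # Gamma^r_{phi phi}
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r    # Gamma^phi_{r phi}
    return gamma

def delta_A(A, x, dx):
    """Change of A^i under parallel displacement by dx: -Gamma^i_kl A^k dx^l."""
    G = christoffel_polar(x[0])
    return [-sum(G[i][k][l] * A[k] * dx[l] for k in range(2) for l in range(2))
            for i in range(2)]

def parallel_transport(A, start, end, steps=4000):
    """Transport A along the coordinate-straight path from start to end."""
    dx = [(e - s) / steps for s, e in zip(start, end)]
    A, x = list(A), list(start)
    for _ in range(steps):
        half = delta_A(A, x, dx)
        A_mid = [a + 0.5 * d for a, d in zip(A, half)]
        x_mid = [xi + 0.5 * d for xi, d in zip(x, dx)]
        full = delta_A(A_mid, x_mid, dx)          # midpoint (second-order) step
        A = [a + d for a, d in zip(A, full)]
        x = [xi + d for xi, d in zip(x, dx)]
    return A

def to_cartesian(A, x):
    """Cartesian components of a vector given by (A^r, A^phi) at (r, phi)."""
    r, phi = x
    return (A[0] * math.cos(phi) - r * A[1] * math.sin(phi),
            A[0] * math.sin(phi) + r * A[1] * math.cos(phi))

start, end = (1.0, 0.0), (2.0, math.pi / 2)
A_start = [1.0, 0.5]
A_end = parallel_transport(A_start, start, end)
cart_start = to_cartesian(A_start, start)
cart_end = to_cartesian(A_end, end)
print(A_end, cart_start, cart_end)
```

The polar components go from $(1, 0.5)$ to approximately $(0.5, -0.5)$, while the cartesian components remain $(1, 0.5)$ to within the integration error, as expected for a translation of a vector parallel to itself.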
From this it is already clear that the quantities $\Gamma^i_{kl}$ do not form a tensor, since a tensor which is equal to zero in one coordinate system is equal to zero in every other one. In a curvilinear space it is, of course, impossible to make all the $\Gamma^i_{kl}$ vanish over all of space. But we can choose a coordinate system for which the $\Gamma^i_{kl}$ become zero over a given infinitesimal region (see the end of this section†). The quantities $\Gamma^i_{kl}$ are called Christoffel symbols. In addition to the quantities $\Gamma^i_{kl}$ we shall later also use quantities $\Gamma_{i,kl}$‡ defined as follows:
$$\Gamma_{i,kl} = g_{im} \Gamma^m_{kl}. \tag{85.3}$$
Conversely,
$$\Gamma^i_{kl} = g^{im} \Gamma_{m,kl}. \tag{85.4}$$
It is also easy to relate the change in the components of a covariant vector under a parallel displacement to the Christoffel symbols. To do this we note that under a parallel displacement, a scalar is unchanged. In particular, the scalar product of two vectors does not change under a parallel displacement.
Let $A_i$ and $B^i$ be any covariant and contravariant vectors. Then from $\delta(A_i B^i) = 0$, we have
$$B^i\, \delta A_i = -A_i\, \delta B^i = \Gamma^i_{kl} B^k A_i\, dx^l$$
or, changing the indices,
$$B^i\, \delta A_i = \Gamma^k_{il} A_k B^i\, dx^l.$$
From this, in view of the arbitrariness of the $B^i$,
$$\delta A_i = \Gamma^k_{il} A_k\, dx^l, \tag{85.5}$$
which determines the change in a covariant vector under a parallel displacement.
Substituting (85.2) and $dA^i = (\partial A^i/\partial x^l)\, dx^l$ in (85.1), we have
$$DA^i = \left( \frac{\partial A^i}{\partial x^l} + \Gamma^i_{kl} A^k \right) dx^l. \tag{85.6}$$
Similarly, we find for a covariant vector
$$DA_i = \left( \frac{\partial A_i}{\partial x^l} - \Gamma^k_{il} A_k \right) dx^l. \tag{85.7}$$
The expressions in parentheses in (85.6) and (85.7) are tensors, since when multiplied by the vector $dx^l$ they give a vector. Clearly, these are the tensors which give the desired generalization of the concept of a derivative to curvilinear coordinates. These tensors are called the covariant derivatives of the vectors $A^i$ and $A_i$ respectively. We shall denote them by $A^i_{;k}$ and $A_{i;k}$. Thus,
$$DA^i = A^i_{;l}\, dx^l, \qquad DA_i = A_{i;l}\, dx^l, \tag{85.8}$$
† This is precisely the coordinate system which we have in mind in arguments where we, for brevity's sake, speak of a "galilean" system; still all the proofs remain applicable not only to flat, but also to curved space.
‡ In place of $\Gamma^i_{kl}$ and $\Gamma_{i,kl}$, the symbols $\left\{ {kl \atop i} \right\}$ and $\left[ {kl \atop i} \right]$ are sometimes used.
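while the covariant derivatives themselves are:
$$A^i_{;l} = \frac{\partial A^i}{\partial x^l} + \Gamma^i_{kl} A^k, \tag{85.9}$$
$$A_{i;l} = \frac{\partial A_i}{\partial x^l} - \Gamma^k_{il} A_k. \tag{85.10}$$
In galilean coordinates, $\Gamma^i_{kl} = 0$, and covariant differentiation reduces to ordinary differentiation.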
It is also easy to calculate the covariant derivative of a tensor. To do this we must determine the change in the tensor under an infinitesimal parallel displacement. For example, let us consider any contravariant tensor, expressible as a product of two contravariant vectors $A^i B^k$. Under parallel displacement,
$$\delta(A^i B^k) = A^i\, \delta B^k + B^k\, \delta A^i = -A^i \Gamma^k_{lm} B^l\, dx^m - B^k \Gamma^i_{lm} A^l\, dx^m.$$
By virtue of the linearity of this transformation we must also have, for an arbitrary tensor $A^{ik}$,
$$\delta A^{ik} = -\left( A^{im} \Gamma^k_{ml} + A^{mk} \Gamma^i_{ml} \right) dx^l. \tag{85.11}$$
Substituting this in
$$DA^{ik} = dA^{ik} - \delta A^{ik} \equiv A^{ik}_{;l}\, dx^l,$$
we get the covariant derivative of the tensor $A^{ik}$ in the form
$$A^{ik}_{;l} = \frac{\partial A^{ik}}{\partial x^l} + \Gamma^i_{ml} A^{mk} + \Gamma^k_{ml} A^{im}. \tag{85.12}$$
In completely similar fashion we obtain the covariant derivative of the mixed tensor $A^i_k$ and the covariant tensor $A_{ik}$ in the form
$$A^i_{k;l} = \frac{\partial A^i_k}{\partial x^l} - \Gamma^m_{kl} A^i_m + \Gamma^i_{ml} A^m_k, \tag{85.13}$$
$$A_{ik;l} = \frac{\partial A_{ik}}{\partial x^l} - \Gamma^m_{il} A_{mk} - \Gamma^m_{kl} A_{im}. \tag{85.14}$$
One can similarly determine the covariant derivative of a tensor of arbitrary rank. In doing this one finds the following rule of covariant differentiation. To obtain the covariant derivative of the tensor $A^{\ldots}_{\ldots}$ with respect to $x^l$, we add to the ordinary derivative $\partial A^{\ldots}_{\ldots}/\partial x^l$ for each covariant index $i$ ($A^{\ldots}_{\cdot i \cdot}$) a term $-\Gamma^k_{il} A^{\ldots}_{\cdot k \cdot}$, and for each contravariant index $i$ ($A^{\cdot i \cdot}_{\ldots}$) a term $+\Gamma^i_{kl} A^{\cdot k \cdot}_{\ldots}$.
One can easily verify that the covariant derivative of a product is found by the same rule as for ordinary differentiation of products. In doing this we must consider the covariant derivative of a scalar as an ordinary derivative, that is, as the covariant vector $\phi_{;k} = \partial\phi/\partial x^k$, in accordance with the fact that for a scalar $\delta\phi = 0$, and therefore $D\phi = d\phi$. For example, the covariant derivative of the product $A_i B_k$ is
$$(A_i B_k)_{;l} = A_{i;l} B_k + A_i B_{k;l}.$$
If in a covariant derivative we raise the index signifying the differentiation, we obtain the so-called contravariant derivative. Thus,
$$A_i^{\ ;k} = g^{kl} A_{i;l}, \qquad A^{i;k} = g^{kl} A^i_{;l}.$$
We prove that the Christoffel symbols $\Gamma^i_{kl}$ are symmetric in the subscripts. Since the covariant derivative of a vector $A_{i;k}$ is a tensor, the difference $A_{i;k} - A_{k;i}$ is also a tensor.
Let the vector $A_i$ be the gradient of a scalar, that is, $A_i = \partial\phi/\partial x^i$. Since $\partial A_k/\partial x^i = \partial^2\phi/\partial x^k \partial x^i = \partial A_i/\partial x^k$, with the help of (85.10) we have
$$A_{k;i} - A_{i;k} = \left( \Gamma^l_{ik} - \Gamma^l_{ki} \right) \frac{\partial\phi}{\partial x^l}.$$
In a galilean coordinate system the covariant derivative reduces to the ordinary derivative, and therefore the left side of our equation becomes zero. But since $A_{k;i} - A_{i;k}$ is a tensor, then being zero in one system it must also be zero in any coordinate system. Thus we find that
$$\Gamma^i_{kl} = \Gamma^i_{lk}. \tag{85.15}$$
Clearly, also,
$$\Gamma_{i,kl} = \Gamma_{i,lk}. \tag{85.16}$$
In general, there are altogether forty different quantities $\Gamma^i_{kl}$; for each of the four values of the index $i$ there are ten different pairs of values of the indices $k$ and $l$ (counting pairs obtained by interchanging $k$ and $l$ as the same).
In concluding this section we present the formulas for transforming the Christoffel symbols from one coordinate system to another. These formulas can be obtained by comparing the laws of transformation of the two sides of the equations defining the covariant derivatives, and requiring that these laws be the same for both sides. A simple calculation gives
$$\Gamma^i_{kl} = \Gamma'^m_{np} \frac{\partial x^i}{\partial x'^m} \frac{\partial x'^n}{\partial x^k} \frac{\partial x'^p}{\partial x^l} + \frac{\partial^2 x'^m}{\partial x^k\, \partial x^l} \frac{\partial x^i}{\partial x'^m}. \tag{85.17}$$
From this formula it is clear that the quantity $\Gamma^i_{kl}$ behaves like a tensor only under linear transformations [for which the second term in (85.17) drops out].
Formula (85.17) enables us to prove easily the assertion made above that it is always possible to choose a coordinate system in which all the $\Gamma^i_{kl}$ become zero at a previously assigned point (such a system is said to be locally-geodesic; see § 87).†
In fact, let the given point be chosen as the origin of coordinates, and let the values of the $\Gamma^i_{kl}$ at it be initially (in the coordinates $x^i$) equal to $(\Gamma^i_{kl})_0$. In the neighborhood of this point, we now make the transformation
$$x'^i = x^i + \tfrac{1}{2} (\Gamma^i_{kl})_0\, x^k x^l. \tag{85.18}$$
Then
$$\left( \frac{\partial^2 x'^m}{\partial x^k\, \partial x^l} \frac{\partial x^i}{\partial x'^m} \right)_0 = (\Gamma^i_{kl})_0$$
and according to (85.17), all the $\Gamma'^m_{np}$ become equal to zero.
We note that for the transformation (85.18)
$$\left( \frac{\partial x'^i}{\partial x^k} \right)_0 = \delta^i_k,$$
so that it does not change the value of any tensor (including the tensor $g_{ik}$) at the given point; we can therefore make the Christoffel symbols vanish at the same time as we bring the $g_{ik}$ to galilean form.
† It can also be shown that, by a suitable choice of the coordinate system, one can make all the $\Gamma^i_{kl}$ go to zero not just at a point but all along a given curve.
§ 86. The relation of the Christoffel symbols to the metric tensor
Let us show that the covariant derivative of the metric tensor $g_{ik}$ is zero. To do this we note that the relation
$$DA_i = g_{ik}\, DA^k$$
is valid for the vector $DA_i$, as for any vector. On the other hand, $A_i = g_{ik} A^k$, so that
$$DA_i = D(g_{ik} A^k) = g_{ik}\, DA^k + A^k\, Dg_{ik}.$$
Comparing with $DA_i = g_{ik}\, DA^k$, and remembering that the vector $A^k$ is arbitrary,
$$Dg_{ik} = 0.$$
Therefore the covariant derivative
$$g_{ik;l} = 0. \tag{86.1}$$
Thus $g_{ik}$ may be considered as a constant during covariant differentiation.
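The equation $g_{ik;l} = 0$ can be used to express the Christoffel symbols $\Gamma^i_{kl}$ in terms of the metric tensor $g_{ik}$. To do this we write in accordance with the general definition (85.14):
$$g_{ik;l} = \frac{\partial g_{ik}}{\partial x^l} - g_{mk} \Gamma^m_{il} - g_{im} \Gamma^m_{kl} = \frac{\partial g_{ik}}{\partial x^l} - \Gamma_{k,il} - \Gamma_{i,kl} = 0.$$
Thus the derivatives of $g_{ik}$ are expressed in terms of the Christoffel symbols.† We write the values of the derivatives of $g_{ik}$, permuting the indices $i$, $k$, $l$:
$$\frac{\partial g_{ik}}{\partial x^l} = \Gamma_{k,il} + \Gamma_{i,kl},$$
$$\frac{\partial g_{li}}{\partial x^k} = \Gamma_{i,kl} + \Gamma_{l,ik},$$
$$-\frac{\partial g_{kl}}{\partial x^i} = -\Gamma_{l,ki} - \Gamma_{k,li}.$$
Taking half the sum of these equations, we find (remembering that $\Gamma_{i,kl} = \Gamma_{i,lk}$)
$$\Gamma_{i,kl} = \frac{1}{2} \left( \frac{\partial g_{ik}}{\partial x^l} + \frac{\partial g_{il}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^i} \right). \tag{86.2}$$
From this we have for the symbols $\Gamma^i_{kl} = g^{im} \Gamma_{m,kl}$
$$\Gamma^i_{kl} = \frac{1}{2} g^{im} \left( \frac{\partial g_{mk}}{\partial x^l} + \frac{\partial g_{ml}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^m} \right). \tag{86.3}$$
These formulas give the required expressions for the Christoffel symbols in terms of the metric tensor.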
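The expression of the Christoffel symbols through the metric, $\Gamma^i_{kl} = \tfrac12 g^{im}(\partial g_{mk}/\partial x^l + \partial g_{ml}/\partial x^k - \partial g_{kl}/\partial x^m)$, lends itself to direct numerical evaluation. The following sketch does this for a two-dimensional metric, approximating the derivatives by central differences; the choice of plane polar coordinates, the restriction to two dimensions (made only to keep the matrix inverse short) and all names are assumptions for illustration.

```python
def metric_polar(x):
    """Covariant metric of the flat plane in polar coordinates x = (r, phi)."""
    r, _phi = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (this sketch is limited to two dimensions)."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def christoffel(metric, x, h=1e-5):
    """Return G[i][k][l] = Gamma^i_kl computed from the metric function."""
    n = len(x)
    dg = []                       # dg[m][i][k] = d g_ik / d x^m
    for m in range(n):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][k] - gm[i][k]) / (2.0 * h) for k in range(n)]
                   for i in range(n)])
    g_up = inverse_2x2(metric(x))
    return [[[0.5 * sum(g_up[i][m] * (dg[l][m][k] + dg[k][m][l] - dg[m][k][l])
                        for m in range(n))
              for l in range(n)] for k in range(n)] for i in range(n)]

G = christoffel(metric_polar, [2.0, 0.3])
print(G[0][1][1], G[1][0][1], G[1][1][0])
```

At $r = 2$ the result reproduces $\Gamma^r_{\phi\phi} = -r = -2$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r = 0.5$, and exhibits the symmetry in the lower indices.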
We now derive an expression for the contracted Christoffel symbol $\Gamma^i_{ki}$ which will be important later on. To do this we calculate the differential $dg$ of the determinant $g$ made up from the components of the tensor $g_{ik}$; $dg$ can be obtained by taking the differential of each component of the tensor $g_{ik}$ and multiplying it by its coefficient in the determinant, i.e. by the corresponding minor. On the other hand, the components of the tensor $g^{ik}$ reciprocal to $g_{ik}$ are equal to the minors of the determinant of the $g_{ik}$, divided by the determinant.
† Choosing a locally-geodesic system of coordinates therefore means that at the given point all the first derivatives of the components of the metric tensor vanish.
Therefore the minors of the determinant $g$ are equal to $g g^{ik}$. Thus,
$$dg = g g^{ik}\, dg_{ik} = -g g_{ik}\, dg^{ik} \tag{86.4}$$
(since $g_{ik} g^{ik} = \delta^i_i = 4$, $g^{ik}\, dg_{ik} = -g_{ik}\, dg^{ik}$).
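The rule $dg = g g^{ik}\, dg_{ik} = -g g_{ik}\, dg^{ik}$ for differentiating a determinant holds for symmetric matrices of any order. A quick check on a $2 \times 2$ example (the numerical values and helper names are arbitrary choices for illustration) compares the actual change of the determinant under a small change of the components with both first-order expressions.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse (reciprocal tensor) of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

g = [[2.0, 0.3], [0.3, -1.5]]                 # sample covariant components
dg = [[1e-6, -2e-6], [-2e-6, 3e-6]]           # small symmetric increment
g_new = [[g[i][k] + dg[i][k] for k in range(2)] for i in range(2)]

g_up, g_up_new = inv2(g), inv2(g_new)
dg_up = [[g_up_new[i][k] - g_up[i][k] for k in range(2)] for i in range(2)]

exact_change = det2(g_new) - det2(g)
first_form = det2(g) * sum(g_up[i][k] * dg[i][k]
                           for i in range(2) for k in range(2))
second_form = -det2(g) * sum(g[i][k] * dg_up[i][k]
                             for i in range(2) for k in range(2))
print(exact_change, first_form, second_form)
```

The three numbers differ only by terms of second order in the increment.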
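From (86.3), we have
$$\Gamma^i_{ki} = \frac{1}{2} g^{im} \left( \frac{\partial g_{mk}}{\partial x^i} + \frac{\partial g_{mi}}{\partial x^k} - \frac{\partial g_{ki}}{\partial x^m} \right).$$
Changing the positions of the indices $m$ and $i$ in the third and first terms in parentheses, we see that these two terms cancel each other, so that
$$\Gamma^i_{ki} = \frac{1}{2} g^{im} \frac{\partial g_{im}}{\partial x^k},$$
or, according to (86.4),
$$\Gamma^i_{ki} = \frac{1}{2g} \frac{\partial g}{\partial x^k} = \frac{\partial \ln \sqrt{-g}}{\partial x^k}. \tag{86.5}$$
It is useful to note also the expression for the quantity $g^{kl} \Gamma^i_{kl}$; we have
$$g^{kl} \Gamma^i_{kl} = \frac{1}{2} g^{kl} g^{im} \left( \frac{\partial g_{mk}}{\partial x^l} + \frac{\partial g_{lm}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^m} \right) = g^{kl} g^{im} \left( \frac{\partial g_{mk}}{\partial x^l} - \frac{1}{2} \frac{\partial g_{kl}}{\partial x^m} \right).$$
With the help of (86.4) this can be transformed to
$$g^{kl} \Gamma^i_{kl} = -\frac{1}{\sqrt{-g}} \frac{\partial (\sqrt{-g}\, g^{ik})}{\partial x^k}. \tag{86.6}$$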
For various calculations it is important to remember that the derivatives of the contravariant tensor $g^{ik}$ are related to the derivatives of $g_{ik}$ by the relations
$$g_{il} \frac{\partial g^{lk}}{\partial x^m} = -g^{lk} \frac{\partial g_{il}}{\partial x^m} \tag{86.7}$$
(which are obtained by differentiating the equality $g_{il} g^{lk} = \delta^k_i$). Finally we point out that the derivatives of $g^{ik}$ can also be expressed in terms of the quantities $\Gamma^i_{kl}$. Namely, from the identity $g^{ik}_{\ \ ;l} = 0$ it follows directly that
$$\frac{\partial g^{ik}}{\partial x^l} = -\Gamma^i_{ml} g^{mk} - \Gamma^k_{ml} g^{im}. \tag{86.8}$$
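With the aid of the formulas which we have obtained we can put the expression for $A^i_{;i}$, the generalized divergence of a vector in curvilinear coordinates, in convenient form. Using (86.5), we have
$$A^i_{;i} = \frac{\partial A^i}{\partial x^i} + \Gamma^l_{li} A^i = \frac{\partial A^i}{\partial x^i} + A^i \frac{\partial \ln \sqrt{-g}}{\partial x^i}$$
or, finally,
$$A^i_{;i} = \frac{1}{\sqrt{-g}} \frac{\partial (\sqrt{-g}\, A^i)}{\partial x^i}. \tag{86.9}$$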
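The same structure of the divergence appears in ordinary three-dimensional space, where the metric is positive definite and $\sqrt{g}$ takes the place of $\sqrt{-g}$. As a sketch under that assumption, the code below evaluates $(1/\sqrt{g})\, \partial(\sqrt{g} A^i)/\partial x^i$ in spherical coordinates $(r, \theta, \phi)$, for which $\sqrt{g} = r^2 \sin\theta$, using central differences. The test field is the position vector, with contravariant components $(r, 0, 0)$, whose divergence is 3 in cartesian coordinates. Function names are hypothetical.

```python
import math

def sqrt_g_spherical(x):
    """Square root of the metric determinant in spherical coordinates."""
    r, theta, _phi = x
    return r * r * math.sin(theta)

def position_field(x):
    """Contravariant components (A^r, A^theta, A^phi) of the position vector."""
    return (x[0], 0.0, 0.0)

def divergence(field, sqrt_g, x, h=1e-5):
    """(1/sqrt g) d(sqrt g A^i)/dx^i by central differences."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (sqrt_g(xp) * field(xp)[i] - sqrt_g(xm) * field(xm)[i]) / (2.0 * h)
    return total / sqrt_g(x)

div_value = divergence(position_field, sqrt_g_spherical, [2.0, 1.0, 0.5])
print(div_value)
```

The computed value agrees with 3 to within the finite-difference error, without any Christoffel symbols having to be written out.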
We can derive an analogous expression for the divergence of an antisymmetric tensor $A^{ik}$. From (85.12), we have
$$A^{ik}_{\ \ ;k} = \frac{\partial A^{ik}}{\partial x^k} + \Gamma^i_{mk} A^{mk} + \Gamma^k_{mk} A^{im}.$$
But, since $A^{mk} = -A^{km}$,
$$\Gamma^i_{mk} A^{mk} = -\Gamma^i_{km} A^{km} = 0.$$
Substituting the expression (86.5) for $\Gamma^k_{mk}$, we obtain
$$A^{ik}_{\ \ ;k} = \frac{1}{\sqrt{-g}} \frac{\partial (\sqrt{-g}\, A^{ik})}{\partial x^k}. \tag{86.10}$$
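Now suppose $A^{ik}$ is a symmetric tensor; we calculate the expression $A^k_{i;k}$ for its mixed components. We have
$$A^k_{i;k} = \frac{\partial A^k_i}{\partial x^k} + \Gamma^k_{lk} A^l_i - \Gamma^l_{ik} A^k_l = \frac{1}{\sqrt{-g}} \frac{\partial (A^k_i \sqrt{-g})}{\partial x^k} - \Gamma^l_{ki} A^k_l.$$
The last term here is equal to
$$-\frac{1}{2} \left( \frac{\partial g_{il}}{\partial x^k} + \frac{\partial g_{kl}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^l} \right) A^{kl}.$$
Because of the symmetry of the tensor $A^{kl}$, two of the terms in parentheses cancel each other, leaving
$$A^k_{i;k} = \frac{1}{\sqrt{-g}} \frac{\partial (\sqrt{-g}\, A^k_i)}{\partial x^k} - \frac{1}{2} \frac{\partial g_{kl}}{\partial x^i} A^{kl}. \tag{86.11}$$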
In cartesian coordinates, $\partial A_i/\partial x^k - \partial A_k/\partial x^i$ is an antisymmetric tensor. In curvilinear coordinates this tensor is $A_{i;k} - A_{k;i}$. However, with the help of the expression for $A_{i;k}$ and since $\Gamma^i_{kl} = \Gamma^i_{lk}$, we have
$$A_{i;k} - A_{k;i} = \frac{\partial A_i}{\partial x^k} - \frac{\partial A_k}{\partial x^i}. \tag{86.12}$$
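Finally, we transform to curvilinear coordinates the sum $\partial^2\phi/\partial x_i\, \partial x^i$ of the second derivatives of a scalar $\phi$. It is clear that in curvilinear coordinates this sum goes over into $\phi_{;i}^{\ ;i}$. But $\phi_{;i} = \partial\phi/\partial x^i$, since covariant differentiation of a scalar reduces to ordinary differentiation. Raising the index $i$, we have
$$\phi^{;i} = g^{ik} \frac{\partial\phi}{\partial x^k},$$
and using formula (86.9), we find
$$\phi_{;i}^{\ ;i} = \frac{1}{\sqrt{-g}} \frac{\partial}{\partial x^i} \left( \sqrt{-g}\, g^{ik} \frac{\partial\phi}{\partial x^k} \right). \tag{86.13}$$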
It is important to note that Gauss' theorem (83.17) for the transformation of the integral of a vector over a hypersurface into an integral over a four-volume can, in view of (86.9), be written as
$$\oint A^i \sqrt{-g}\, dS_i = \int A^i_{;i} \sqrt{-g}\, d\Omega. \tag{86.14}$$
§ 87. Motion of a particle in a gravitational field
The motion of a free material particle is determined in the special theory of relativity from the principle of least action,
$$\delta S = -mc\, \delta \int ds = 0, \tag{87.1}$$
according to which the particle moves so that its world line is an extremal between a given pair of world points, in our case a straight line (in ordinary three-dimensional space this corresponds to uniform rectilinear motion).
The motion of a particle in a gravitational field is determined by the principle of least action in this same form (87.1), since the gravitational field is nothing but a change in the metric of space-time, manifesting itself only in a change in the expression for $ds$ in terms of the $dx^i$. Thus, in a gravitational field the particle moves so that its world point moves along an extremal or, as it is called, a geodesic line in the four-space $x^0$, $x^1$, $x^2$, $x^3$; however, since in the presence of the gravitational field space-time is not galilean, this line is not a "straight line", and the real spatial motion of the particle is neither uniform nor rectilinear.
Instead of starting once again directly from the principle of least action (see the problem at the end of this section), it is simpler to obtain the equations of motion of a particle in a gravitational field by an appropriate generalization of the differential equations for the free motion of a particle in the special theory of relativity, i.e. in a galilean four-dimensional coordinate system. These equations are $du^i/ds = 0$ or $du^i = 0$, where $u^i = dx^i/ds$ is the four-velocity. Clearly, in curvilinear coordinates this equation is generalized to the equation
$$Du^i = 0. \tag{87.2}$$
From the expression (85.6) for the covariant differential of a vector, we have
$$du^i + \Gamma^i_{kl} u^k\, dx^l = 0.$$
Dividing this equation by $ds$, we have
$$\frac{d^2 x^i}{ds^2} + \Gamma^i_{kl} \frac{dx^k}{ds} \frac{dx^l}{ds} = 0. \tag{87.3}$$
This is the required equation of motion. We see that the motion of a particle in a gravitational field is determined by the quantities $\Gamma^i_{kl}$. The derivative $d^2 x^i/ds^2$ is the four-acceleration of the particle. Therefore we may call the quantity $-m \Gamma^i_{kl} u^k u^l$ the "four-force", acting on the particle in the gravitational field. Here, the tensor $g_{ik}$ plays the role of the "potential" of the gravitational field: its derivatives determine the field "intensity" $\Gamma^i_{kl}$.†
In § 85 it was shown that by a suitable choice of the coordinate system one can always make all the $\Gamma^i_{kl}$ zero at an arbitrary point of space-time. We now see that the choice of such a locally-inertial system of reference means the elimination of the gravitational field in the given infinitesimal element of space-time, and the possibility of making such a choice is an expression of the principle of equivalence in the relativistic theory of gravitation.
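The form of the equation of motion, $d^2 x^i/ds^2 + \Gamma^i_{kl}\, (dx^k/ds)(dx^l/ds) = 0$, can be illustrated where no gravitational field is present at all and the $\Gamma^i_{kl}$ arise purely from the choice of coordinates. The sketch below integrates this equation in plane polar coordinates, assuming $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = 1/r$, with a fourth-order Runge-Kutta step (a numerical choice made here, not taken from the text), and confirms that the resulting curve is a straight line in cartesian coordinates. The names are hypothetical.

```python
import math

def geodesic_rhs(state):
    """Right-hand side for state = (r, phi, dr/ds, dphi/ds) in plane polar coords."""
    r, _phi, ur, uphi = state
    return (ur,
            uphi,
            r * uphi * uphi,             # -Gamma^r_{phi phi} (u^phi)^2
            -2.0 * ur * uphi / r)        # -2 Gamma^phi_{r phi} u^r u^phi

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = geodesic_rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, s_end, steps=1000):
    """Integrate the geodesic equation from s = 0 to s = s_end."""
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at cartesian (1, 0) moving in the +y direction with unit speed:
# r = 1, phi = 0, dr/ds = 0, dphi/ds = 1.
r_end, phi_end, _, _ = integrate_geodesic((1.0, 0.0, 0.0, 1.0), s_end=1.0)
x_end, y_end = r_end * math.cos(phi_end), r_end * math.sin(phi_end)
print(x_end, y_end)
```

After unit path length the point has reached cartesian $(1, 1)$: the "force" terms in polar coordinates only reproduce uniform rectilinear motion, in line with the statement that they can be removed by a choice of coordinates.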
As before, we define the four-momentum of a particle in a gravitational field as
$$p^i = mc u^i. \tag{87.4}$$
Its square is
$$p_i p^i = m^2 c^2. \tag{87.5}$$
† We also give the form of the equations of motion expressed in terms of covariant components of the four-acceleration. From the condition $Du_i = 0$, we find
$$\frac{du_i}{ds} - \Gamma_{k,il} u^k u^l = 0.$$
Substituting for $\Gamma_{k,il}$ from (86.2), two of the terms cancel and we are left with
$$\frac{du_i}{ds} - \frac{1}{2} \frac{\partial g_{kl}}{\partial x^i} u^k u^l = 0.$$
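Substituting $-\partial S/\partial x^i$ for $p_i$, we find the Hamilton-Jacobi equation for a particle in a gravitational field:
$$g^{ik} \frac{\partial S}{\partial x^i} \frac{\partial S}{\partial x^k} - m^2 c^2 = 0. \tag{87.6}$$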
The equation of a geodesic in the form (87.3) is not applicable to the propagation of a light signal, since along the world line of the propagation of a light ray the interval $ds$, as we know, is zero, so that all the terms in equation (87.3) become infinite. To get the equations of motion in the form needed for this case, we use the fact that the direction of propagation of a light ray in geometrical optics is determined by the wave vector tangent to the ray. We can therefore write the four-dimensional wave vector in the form $k^i = dx^i/d\lambda$, where $\lambda$ is some parameter varying along the ray. In the special theory of relativity, in the propagation of light in vacuum the wave vector does not vary along the path, that is, $dk^i = 0$ (see § 53). In a gravitational field this equation clearly goes over into $Dk^i = 0$ or
$$\frac{dk^i}{d\lambda} + \Gamma^i_{kl} k^k k^l = 0 \tag{87.7}$$
(these equations also determine the parameter $\lambda$).
The absolute square of the wave fourvector (see § 48) is zero, that is,
fc,fc f = 0. (87.8)
Substituting dif/Jdx 1 in place of k t (^ is the eikonal), we find the eikonal equation in a gravita
tional field
In the limiting case of small velocities, the relativistic equations of motion of a particle
in a gravitational field must go over into the corresponding nonrelativistic equations. In
this we must keep in mind that the assumption of small velocity implies the requirement that
the gravitational field itself be weak; if this were not so a particle located in it would acquire
a high velocity.
Let us examine how, in this limiting case, the metric tensor $g_{ik}$ determining the field is
related to the nonrelativistic potential $\phi$ of the gravitational field.
In nonrelativistic mechanics the motion of a particle in a gravitational field is determined
by the Lagrangian (81.1). We now write it in the form
$$L = -mc^2 + \frac{mv^2}{2} - m\phi, \tag{87.10}$$
adding the constant $-mc^2$.† This must be done so that the nonrelativistic Lagrangian in the
absence of the field, $L = -mc^2 + mv^2/2$, shall be exactly the same as that to which the
corresponding relativistic function $L = -mc^2\sqrt{1 - v^2/c^2}$ reduces in the limit as $v/c \to 0$.
Consequently, the nonrelativistic action function $S$ for a particle in a gravitational field
has the form
$$S = \int L\,dt = -mc\int\left(c - \frac{v^2}{2c} + \frac{\phi}{c}\right)dt.$$
† The potential $\phi$ is, of course, defined only to within an arbitrary additive constant. We assume throughout
that one makes the natural choice of this constant so that the potential vanishes far from the bodies
producing the field.
Comparing this with the expression $S = -mc\int ds$, we see that in the limiting case under
consideration
$$ds = \left(c - \frac{v^2}{2c} + \frac{\phi}{c}\right)dt.$$
Squaring and dropping terms which vanish for $c \to \infty$, we find
$$ds^2 = (c^2 + 2\phi)\,dt^2 - d\mathbf{r}^2, \tag{87.11}$$
where we have used the fact that $\mathbf{v}\,dt = d\mathbf{r}$.
Thus in the limiting case the component $g_{00}$ of the metric tensor is
$$g_{00} = 1 + \frac{2\phi}{c^2}. \tag{87.12}$$
As for the other components, from (87.11) it would follow that $g_{\alpha\beta} = -\delta_{\alpha\beta}$, $g_{0\alpha} = 0$.
Actually, however, the corrections to them are, generally speaking, of the same order of
magnitude as the corrections to $g_{00}$ (for more detail, see § 106). The impossibility of determining
these corrections by the method given above is related to the fact that the corrections
to the $g_{\alpha\beta}$, though of the same order of magnitude as the correction to $g_{00}$, would give rise
to terms in the Lagrangian of a higher order of smallness (because in the expression for $ds^2$
the components $g_{\alpha\beta}$ are not multiplied by $c^2$, while this is the case for $g_{00}$).
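The step from the expression for $ds$ to (87.11) can be checked numerically. The sketch below, in which the function names and the sample values of $v$, $\phi$ and $c$ are assumptions chosen only for illustration (arbitrary units, with $c$ treated as an adjustable parameter), compares the exact square of $c - v^2/2c + \phi/c$ with the truncated form $c^2 + 2\phi - v^2$. The discarded term is $(v^2/2 - \phi)^2/c^2$, so the discrepancy should fall off as $1/c^2$.

```python
def ds_dt_squared_exact(v, phi, c):
    """Exact square of ds/dt = c - v^2/(2c) + phi/c."""
    return (c - v * v / (2.0 * c) + phi / c) ** 2

def ds_dt_squared_truncated(v, phi, c):
    """ds^2/dt^2 from (87.11): (c^2 + 2 phi) - v^2."""
    return c * c + 2.0 * phi - v * v

def g00_weak_field(phi, c):
    """Component g00 of (87.12)."""
    return 1.0 + 2.0 * phi / (c * c)

v, phi = 1.0, -1.0  # illustrative values in arbitrary units
for c in (10.0, 100.0):
    dropped = ds_dt_squared_exact(v, phi, c) - ds_dt_squared_truncated(v, phi, c)
    # dropped * c^2 should approach (v^2/2 - phi)^2 = 2.25 for these values
    print(c, dropped, dropped * c * c, g00_weak_field(phi, c))
```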
PROBLEM
Derive the equation of motion (87.3) from the principle of least action (87.1).
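Solution: We have:
$$\delta\,ds^2 = 2\,ds\,\delta ds = \delta(g_{ik}\,dx^i\,dx^k) = dx^i\,dx^k\frac{\partial g_{ik}}{\partial x^l}\delta x^l + 2g_{ik}\,dx^i\,d\delta x^k.$$
Therefore
$$\delta S = -mc\int\left\{\frac{1}{2}\frac{dx^i}{ds}\frac{dx^k}{ds}\frac{\partial g_{ik}}{\partial x^l}\delta x^l + g_{ik}\frac{dx^i}{ds}\frac{d\delta x^k}{ds}\right\}ds$$
$$= -mc\int\left\{\frac{1}{2}\frac{dx^i}{ds}\frac{dx^k}{ds}\frac{\partial g_{ik}}{\partial x^l}\delta x^l - \frac{d}{ds}\left(g_{ik}\frac{dx^i}{ds}\right)\delta x^k\right\}ds$$
(in integrating by parts, we use the fact that $\delta x^k = 0$ at the limits). In the second term in the integral,
we replace the index $k$ by the index $l$. We then find, by equating to zero the coefficient of the arbitrary
variation $\delta x^l$:
$$\frac{1}{2}u^iu^k\frac{\partial g_{ik}}{\partial x^l} - \frac{d}{ds}(g_{il}u^i) = \frac{1}{2}u^iu^k\frac{\partial g_{ik}}{\partial x^l} - g_{il}\frac{du^i}{ds} - u^iu^k\frac{\partial g_{il}}{\partial x^k} = 0.$$
Noting that the third term can be written as
$$-\frac{1}{2}u^iu^k\left(\frac{\partial g_{il}}{\partial x^k} + \frac{\partial g_{kl}}{\partial x^i}\right),$$
and introducing the Christoffel symbols $\Gamma_{l,ik}$ in accordance with (86.2), we have:
$$g_{il}\frac{du^i}{ds} + \Gamma_{l,ik}u^iu^k = 0.$$
Equation (87.3) is obtained from this by raising the index $l$.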
§ 88. The constant gravitational field
A gravitational field is said to be constant if one can choose a system of reference in which
all the components of the metric tensor are independent of the time coordinate $x^0$; the
latter is then called the world time.
The choice of a world time is not completely unique. Thus, if we add to $x^0$ an arbitrary
function of the space coordinates, the $g_{ik}$ will still not contain $x^0$; this transformation
corresponds to the arbitrariness in the choice of the time origin at each point in space.†
In addition, of course, the world time can be multiplied by an arbitrary constant, i.e. the
units for measuring it are arbitrary.
Strictly speaking, only the field produced by a single body can be constant. In a system of
several bodies, their mutual gravitational attraction will give rise to motion, as a result of
which the field produced by them cannot be constant.
If the body producing the field is fixed (in the reference system in which the $g_{ik}$ do not
depend on $x^0$), then both directions of time are equivalent. For a suitable choice of the time
origin at all the points in space, the interval $ds$ should in this case not be changed when we
change the sign of $x^0$, and therefore all the components $g_{0\alpha}$ of the metric tensor must be
identically equal to zero. Such constant gravitational fields are said to be static.
However, for the field produced by a body to be constant, it is not necessary for the body
to be at rest. Thus the field of an axially symmetric body rotating uniformly about its axis
will also be constant. However in this case the two time directions are no longer equivalent
by any means — if the sign of the time is changed, the sign of the angular velocity is changed.
Therefore in such constant gravitational fields (we shall call them stationary fields) the
components $g_{0\alpha}$ of the metric tensor are in general different from zero.
The meaning of the world time in a constant gravitational field is that an interval of world
time between events at a certain point in space coincides with the interval of world time
between any other two events at any other point in space, if these events are respectively
simultaneous (in the sense explained in § 84) with the first pair of events. But to the same
interval of world time $x^0$ there correspond, at different points of space, different intervals
of proper time $\tau$.
The relation between world time and proper time, formula (84.1), can now be written in
the form
$$\tau = \frac{1}{c}\sqrt{g_{00}}\,x^0, \tag{88.1}$$
applicable to any finite time interval.
If the gravitational field is weak, then we may use the approximate expression (87.12),
and (88.1) gives
$$\tau = \frac{x^0}{c}\left(1 + \frac{\phi}{c^2}\right). \tag{88.2}$$
† It is easy to see that under such a transformation the spatial metric, as expected, does not change.
In fact, under the substitution
$$x^0 \to x^0 + f(x^1, x^2, x^3)$$
with an arbitrary function $f(x^1, x^2, x^3)$, the components $g_{ik}$ change to
$$g_{\alpha\beta} \to g_{\alpha\beta} + g_{00}f_{,\alpha}f_{,\beta} + g_{0\alpha}f_{,\beta} + g_{0\beta}f_{,\alpha},$$
$$g_{0\alpha} \to g_{0\alpha} + g_{00}f_{,\alpha}, \qquad g_{00} \to g_{00},$$
where $f_{,\alpha} = \partial f/\partial x^\alpha$. This obviously does not change the tensor (84.7).
Thus proper time elapses the more slowly the smaller the gravitational potential at a given
point in space, i.e., the larger its absolute value (later, in § 96, it will be shown that the potential
$\phi$ is negative). If one of two identical clocks is placed in a gravitational field for some
time, the clock which has been in the field will thereafter appear to be slow.
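To give a feeling for the size of the effect in (88.2), the following sketch estimates how much a clock held at rest at a high altitude gains per day on an identical clock at the surface of the earth. It assumes the Newtonian point-mass potential $\phi = -GM/r$ (not derived at this point in the text), ignores the motion of both clocks and the rotation of the earth, and uses standard values for the constants; the altitude of 20 200 km and all names are assumptions made for the example.

```python
C = 299_792_458.0          # speed of light, m/s
GM_EARTH = 3.986004418e14  # gravitational parameter of the earth, m^3/s^2
R_EARTH = 6.371e6          # mean radius of the earth, m

def potential(r, gm=GM_EARTH):
    """Newtonian potential of a point mass (assumed form), vanishing at infinity."""
    return -gm / r

def fractional_rate_difference(r_low, r_high):
    """(d tau_high - d tau_low) / dt from (88.2): (phi_high - phi_low) / c^2."""
    return (potential(r_high) - potential(r_low)) / C**2

rate = fractional_rate_difference(R_EARTH, R_EARTH + 2.02e7)
gain_per_day = rate * 86400.0  # seconds gained per day by the higher clock
print(rate, gain_per_day)
```

The result is positive, in agreement with the statement that the clock deeper in the field (at the smaller, more negative potential) runs slow.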
As was already indicated above, in a static gravitational field the components $g_{0\alpha}$ of the
metric tensor are zero. According to the results of § 84, this means that in such a field
synchronization of clocks is possible over all space. We note also that the element of spatial
distance in a static field is simply:
$$dl^2 = -g_{\alpha\beta}\,dx^\alpha\,dx^\beta. \tag{88.3}$$
In a stationary field the $g_{0\alpha}$ are different from zero and the synchronization of clocks
over all space is impossible. Since the $g_{ik}$ do not depend on $x^0$, formula (84.14) for the difference
between the values of world time for two simultaneous events occurring at different
points in space can be written in the form
$$\Delta x^0 = -\int\frac{g_{0\alpha}\,dx^\alpha}{g_{00}} \tag{88.4}$$
for any two points on the line along which the synchronization of clocks is carried out. In
the synchronization of clocks along a closed contour, the difference in the value of the world
time which would be recorded upon returning to the starting point is equal to the integral
$$\Delta x^0 = -\oint\frac{g_{0\alpha}\,dx^\alpha}{g_{00}} \tag{88.5}$$
taken along the closed contour.†
Let us consider the propagation of a light ray in a constant gravitational field. We have
seen in § 53 that the frequency of the light is the time derivative of the eikonal $\psi$ (with
opposite sign). The frequency expressed in terms of the world time $x^0/c$ is therefore
$\omega_0 = -c(\partial\psi/\partial x^0)$. Since the eikonal equation (87.9) in a constant field does not contain $x^0$
explicitly, the frequency $\omega_0$ remains constant during the propagation of the light ray. The
frequency measured in terms of the proper time is $\omega = -(\partial\psi/\partial\tau)$; this frequency is different
at different points of space.
From the relation
$$\frac{\partial\psi}{\partial\tau} = \frac{\partial\psi}{\partial x^0}\frac{\partial x^0}{\partial\tau} = \frac{\partial\psi}{\partial x^0}\frac{c}{\sqrt{g_{00}}},$$
we have
$$\omega = \frac{\omega_0}{\sqrt{g_{00}}}. \tag{88.6}$$
In a weak gravitational field we obtain from this, approximately,
$$\omega = \omega_0\left(1 - \frac{\phi}{c^2}\right). \tag{88.7}$$
We see that the light frequency increases with increasing absolute value of the potential of
the gravitational field, i.e. as we approach the bodies producing the field; conversely, as the
light recedes from these bodies the frequency decreases. If a ray of light, emitted at a point
† The integral (88.5) is identically zero if the sum $g_{0\alpha}\,dx^\alpha/g_{00}$ is an exact differential of some function of
the space coordinates. However, such a case would simply mean that we are actually dealing with a static
field, and that all the $g_{0\alpha}$ could be made equal to zero by a transformation of the form $x^0 \to x^0 + f(x^\alpha)$.
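where the gravitational potential is $\phi_1$, has (at that point) the frequency $\omega$, then upon
arriving at a point where the potential is $\phi_2$, it will have a frequency (measured in units of
the proper time at that point) equal to
$$\frac{\omega}{1 - \dfrac{\phi_1}{c^2}}\left(1 - \frac{\phi_2}{c^2}\right) = \omega\left(1 + \frac{\phi_1 - \phi_2}{c^2}\right).$$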
A line spectrum emitted by some atoms located, for example, on the sun, looks the same
there as the spectrum emitted by the same atoms located on the earth would appear on it.
If, however, we observe on the earth the spectrum emitted by the atoms located on the sun,
then, as follows from what has been said above, its lines appear to be shifted with respect
to the lines of the same spectrum emitted on the earth. Namely, each line with frequency $\omega$
will be shifted through the interval $\Delta\omega$ given by the formula
$$\Delta\omega = \frac{\phi_1 - \phi_2}{c^2}\,\omega, \tag{88.8}$$
where $\phi_1$ and $\phi_2$ are the potentials of the gravitational field at the points of emission and
observation of the spectrum respectively. If we observe on the earth a spectrum emitted on
the sun or the stars, then $|\phi_1| > |\phi_2|$, and from (88.8) it follows that $\Delta\omega < 0$, i.e. the shift
occurs in the direction of lower frequency. The phenomenon we have described is called the
"red shift".
The occurrence of this phenomenon can be explained directly on the basis of what has
been said above about world time. Because the field is constant, the interval of world time
during which a certain vibration in the light wave propagates from one given point of space
to another is independent of $x^0$. Therefore it is clear that the number of vibrations occurring
in a unit interval of world time will be the same at all points along the ray. But to one and
the same interval of world time there corresponds a larger and larger interval of proper time,
the further away we are from the bodies producing the field. Consequently, the frequency,
i.e. the number of vibrations per unit proper time, will decrease as the light recedes from these
masses.
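Formula (88.8) is easily evaluated for light emitted at the surface of the sun and observed on the earth. The sketch below assumes Newtonian point-mass potentials $\phi = -GM/r$, includes in $\phi_2$ both the potential of the sun at the earth's orbit and that of the earth at its own surface, and neglects all Doppler shifts due to motion; the constants are standard values, and the function names are assumptions made for the example.

```python
C = 299_792_458.0          # speed of light, m/s
GM_SUN = 1.32712440018e20  # gravitational parameter of the sun, m^3/s^2
R_SUN = 6.957e8            # radius of the sun, m
AU = 1.495978707e11        # mean sun-earth distance, m
GM_EARTH = 3.986004418e14  # gravitational parameter of the earth, m^3/s^2
R_EARTH = 6.371e6          # mean radius of the earth, m

def relative_shift(phi_emit, phi_obs, c=C):
    """Delta omega / omega = (phi_1 - phi_2) / c^2, formula (88.8)."""
    return (phi_emit - phi_obs) / c**2

phi_1 = -GM_SUN / R_SUN                    # emission at the solar surface
phi_2 = -GM_SUN / AU - GM_EARTH / R_EARTH  # observation at the earth's surface
shift = relative_shift(phi_1, phi_2)
print(shift)
```

Since $|\phi_1| > |\phi_2|$, the relative shift comes out negative, of the order of $10^{-6}$, i.e. toward lower frequency.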
During the motion of a particle in a constant field, its energy, defined as
$$-c\frac{\partial S}{\partial x^0},$$
the derivative of the action with respect to the world time, is conserved; this follows, for
example, from the fact that $x^0$ does not appear explicitly in the Hamilton-Jacobi equation.
The energy defined in this way is the time component of the covariant four-vector of
momentum $p_k = mcu_k = mcg_{ki}u^i$. In a static field, $ds^2 = g_{00}(dx^0)^2 - dl^2$, and we have for
the energy, which we here denote by $\mathcal{E}_0$,
$$\mathcal{E}_0 = mc^2g_{00}\frac{dx^0}{ds} = mc^2g_{00}\frac{dx^0}{\sqrt{g_{00}(dx^0)^2 - dl^2}}.$$
We introduce the velocity
$$v = \frac{dl}{d\tau} = \frac{c\,dl}{\sqrt{g_{00}}\,dx^0}$$
of the particle, measured in terms of the proper time, that is, by an observer located at the
given point. Then we obtain for the energy
$$\mathcal{E}_0 = \frac{mc^2\sqrt{g_{00}}}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \tag{88.9}$$
This is the quantity which is conserved during the motion of the particle.
It is easy to show that the expression (88.9) remains valid also for a stationary field, if
only the velocity $v$ is measured in terms of the proper time, as determined by clocks synchronized
along the trajectory of the particle. If the particle departs from point $A$ at the
moment of world time $x^0$ and arrives at the infinitesimally distant point $B$ at the moment
$x^0 + dx^0$, then to determine the velocity we must now take, not the time interval
$(x^0 + dx^0) - x^0 = dx^0$, but rather the difference between $x^0 + dx^0$ and the moment
$x^0 - (g_{0\alpha}/g_{00})\,dx^\alpha$ which is simultaneous at the point $B$ with the moment $x^0$ at the point $A$:
$$(x^0 + dx^0) - \left(x^0 - \frac{g_{0\alpha}}{g_{00}}\,dx^\alpha\right) = dx^0 + \frac{g_{0\alpha}}{g_{00}}\,dx^\alpha.$$
Multiplying by $\sqrt{g_{00}}/c$, we obtain the corresponding interval of proper time, so that the
velocity is
$$v^\alpha = \frac{c\,dx^\alpha}{\sqrt{h}\,(dx^0 - g_\alpha\,dx^\alpha)}, \tag{88.10}$$
where we have introduced the notation
$$h = g_{00}, \qquad g_\alpha = -\frac{g_{0\alpha}}{g_{00}} \tag{88.11}$$
for the three-dimensional vector $\mathbf{g}$ (which was already mentioned in § 84) and for the
three-dimensional scalar $g_{00}$. The covariant components of the velocity $\mathbf{v}$ form a three-dimensional
vector in the space with metric $\gamma_{\alpha\beta}$, and correspondingly the square of this vector is
to be taken as†
$$v_\alpha = \gamma_{\alpha\beta}v^\beta, \qquad v^2 = v_\alpha v^\alpha. \tag{88.12}$$
We note that with such a definition, the interval $ds$ is expressed in terms of the velocity in
the usual fashion:
$$ds^2 = g_{00}(dx^0)^2 + 2g_{0\alpha}\,dx^0\,dx^\alpha + g_{\alpha\beta}\,dx^\alpha\,dx^\beta$$
$$= h(dx^0 - g_\alpha\,dx^\alpha)^2 - dl^2$$
$$= h(dx^0 - g_\alpha\,dx^\alpha)^2\left(1 - \frac{v^2}{c^2}\right). \tag{88.13}$$
The components of the four-velocity
$$u^i = \frac{dx^i}{ds}$$
† In our further work we shall repeatedly introduce, in addition to four-vectors and four-tensors, three-dimensional
vectors and tensors defined in the space with metric $\gamma_{\alpha\beta}$; in particular the vectors $\mathbf{g}$ and $\mathbf{v}$, which
we have already used, are of this type. Just as in four dimensions the tensor operations (in particular, raising
and lowering of indices) are done using the metric tensor $g_{ik}$, so in three dimensions these are done using the
tensor $\gamma_{\alpha\beta}$. To avoid misunderstandings that may arise, we shall denote three-dimensional quantities by
symbols other than those used for four-dimensional quantities.
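are
$$u^\alpha = \frac{v^\alpha}{c\sqrt{1 - \dfrac{v^2}{c^2}}}, \qquad u^0 = \frac{1}{\sqrt{h}\sqrt{1 - \dfrac{v^2}{c^2}}} + \frac{g_\alpha v^\alpha}{c\sqrt{1 - \dfrac{v^2}{c^2}}}. \tag{88.14}$$
The energy is
$$\mathcal{E}_0 = mc^2g_{0i}u^i = mc^2h(u^0 - g_\alpha u^\alpha),$$
and after substituting (88.14), takes the form (88.9).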
In the limiting case of a weak gravitational field and low velocities, by substituting
$g_{00} = 1 + (2\phi/c^2)$ in (88.9), we get approximately:
$$\mathcal{E}_0 = mc^2 + \frac{mv^2}{2} + m\phi, \tag{88.15}$$
where $m\phi$ is the potential energy of the particle in the gravitational field, which is in agreement
with the Lagrangian (87.10).
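The passage from (88.9) to (88.15) can be verified numerically. In the sketch below, the function names and the sample values are assumptions made for illustration; arbitrary units are used and $c$ is treated as an adjustable parameter, so that the neglected terms, which are of relative order $1/c^2$, can be seen to decrease as $c$ grows.

```python
import math

def energy_exact(m, v, phi, c):
    """Conserved energy (88.9) with g00 = 1 + 2 phi / c^2 taken from (87.12)."""
    g00 = 1.0 + 2.0 * phi / c**2
    return m * c**2 * math.sqrt(g00) / math.sqrt(1.0 - v**2 / c**2)

def energy_weak_field(m, v, phi, c):
    """Approximate energy (88.15): rest energy + kinetic energy + potential energy."""
    return m * c**2 + 0.5 * m * v**2 + m * phi

m, v, phi = 1.0, 1.0, -1.0  # illustrative values in arbitrary units
for c in (100.0, 1000.0):
    error = energy_exact(m, v, phi, c) - energy_weak_field(m, v, phi, c)
    print(c, error)  # the error should shrink roughly as 1/c^2
```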
PROBLEMS
1. Determine the force acting on a particle in a constant gravitational field.
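Solution: For the components of $\Gamma^i_{kl}$ which we need, we find the following expressions:
$$\Gamma^\alpha_{00} = \frac{1}{2}h^{;\alpha},$$
$$\Gamma^\alpha_{0\beta} = \frac{h}{2}\left(g^\alpha{}_{;\beta} - g_\beta{}^{;\alpha}\right) - \frac{1}{2}g_\beta h^{;\alpha}, \tag{1}$$
$$\Gamma^\alpha_{\beta\gamma} = \lambda^\alpha_{\beta\gamma} + \frac{h}{2}\left[g_\beta\left(g_\gamma{}^{;\alpha} - g^\alpha{}_{;\gamma}\right) + g_\gamma\left(g_\beta{}^{;\alpha} - g^\alpha{}_{;\beta}\right)\right] + \frac{1}{2}g_\beta g_\gamma h^{;\alpha}.$$
In these expressions all the tensor operations (covariant differentiation, raising and lowering of
indices) are carried out in the three-dimensional space with metric $\gamma_{\alpha\beta}$, on the three-dimensional
vector $g^\alpha$ and the three-dimensional scalar $h$ (88.11); $\lambda^\alpha_{\beta\gamma}$ is the three-dimensional Christoffel
symbol, constructed from the components of the tensor $\gamma_{\alpha\beta}$ in just the same way as $\Gamma^i_{kl}$ is constructed
from the components of $g_{ik}$; in the computations we use (84.9-12).
Substituting (1) in the equation of motion
$$\frac{du^\alpha}{ds} = -\Gamma^\alpha_{00}(u^0)^2 - 2\Gamma^\alpha_{0\beta}u^0u^\beta - \Gamma^\alpha_{\beta\gamma}u^\beta u^\gamma$$
and using the expression (88.14) for the components of the four-velocity, we find after some simple
transformations:
$$\frac{d}{ds}\frac{v^\alpha}{c\sqrt{1 - v^2/c^2}} = -\frac{h^{;\alpha}}{2h(1 - v^2/c^2)} - \frac{\sqrt{h}\left(g^\alpha{}_{;\beta} - g_\beta{}^{;\alpha}\right)v^\beta}{c(1 - v^2/c^2)} - \frac{\lambda^\alpha_{\beta\gamma}v^\beta v^\gamma}{c^2(1 - v^2/c^2)}. \tag{2}$$
The force $\mathbf{f}$ acting on the particle is the derivative of its momentum $\mathbf{p}$ with respect to the (synchronized)
proper time, as defined by the three-dimensional covariant differential:
$$f^\alpha = c\sqrt{1 - \frac{v^2}{c^2}}\,\frac{Dp^\alpha}{ds} = c\sqrt{1 - \frac{v^2}{c^2}}\,\frac{d}{ds}\frac{mv^\alpha}{\sqrt{1 - v^2/c^2}} + \lambda^\alpha_{\beta\gamma}\frac{mv^\beta v^\gamma}{\sqrt{1 - v^2/c^2}}.$$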
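From (2) we therefore have (for convenience we lower the index $\alpha$):
$$f_\alpha = \frac{mc^2}{\sqrt{1 - v^2/c^2}}\left\{-\frac{\partial}{\partial x^\alpha}\ln\sqrt{h} + \sqrt{h}\left(\frac{\partial g_\beta}{\partial x^\alpha} - \frac{\partial g_\alpha}{\partial x^\beta}\right)\frac{v^\beta}{c}\right\},$$
or, in the usual three-dimensional notation,†
$$\mathbf{f} = \frac{mc^2}{\sqrt{1 - v^2/c^2}}\left\{-\nabla\ln\sqrt{h} + \sqrt{h}\,\frac{\mathbf{v}}{c}\times(\operatorname{curl}\mathbf{g})\right\}. \tag{3}$$
We note that if the body is at rest, then the force acting on it [the first term in (3)] has a potential.
For low velocities of motion the second term in (3) has the form $mc\sqrt{h}\,\mathbf{v}\times(\operatorname{curl}\mathbf{g})$, analogous to
the Coriolis force which would appear (in the absence of the field) in a coordinate system rotating
with angular velocity
$$\mathbf{\Omega} = \frac{c}{2}\sqrt{h}\,\operatorname{curl}\mathbf{g}.$$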
2. Derive Fermat's principle for the propagation of a ray in a constant gravitational field.
Solution: Fermat's principle (§ 53) states:
$$\delta\int k_\alpha\,dx^\alpha = 0,$$
where the integral is taken along the ray, and the integrand must be expressed in terms of the frequency
$\omega_0$ (which is constant along the ray) and the coordinate differentials. Noting that $k_0 = -\partial\psi/\partial x^0 = \omega_0/c$,
we write:
$$\frac{\omega_0}{c} = k_0 = g_{0i}k^i = g_{00}k^0 + g_{0\alpha}k^\alpha = h(k^0 - g_\alpha k^\alpha).$$
Substituting this in the relation $k_ik^i = g_{ik}k^ik^k = 0$, written in the form
$$h(k^0 - g_\alpha k^\alpha)^2 - \gamma_{\alpha\beta}k^\alpha k^\beta = 0,$$
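† In three-dimensional curvilinear coordinates, the unit antisymmetric tensor is defined as
$$\eta_{\alpha\beta\gamma} = \sqrt{\gamma}\,e_{\alpha\beta\gamma}, \qquad \eta^{\alpha\beta\gamma} = \frac{1}{\sqrt{\gamma}}\,e^{\alpha\beta\gamma},$$
where $e_{123} = e^{123} = 1$, and the sign changes under transposition of indices [compare (83.13-14)]. Accordingly
the vector $\mathbf{c} = \mathbf{a}\times\mathbf{b}$, defined as the vector dual to the antisymmetric tensor $c_{\beta\gamma} = a_\beta b_\gamma - a_\gamma b_\beta$, has
components:
$$c_\alpha = \frac{1}{2}\sqrt{\gamma}\,e_{\alpha\beta\gamma}c^{\beta\gamma} = \sqrt{\gamma}\,e_{\alpha\beta\gamma}a^\beta b^\gamma, \qquad c^\alpha = \frac{1}{2\sqrt{\gamma}}\,e^{\alpha\beta\gamma}c_{\beta\gamma} = \frac{1}{\sqrt{\gamma}}\,e^{\alpha\beta\gamma}a_\beta b_\gamma.$$
Conversely,
$$c_{\alpha\beta} = \sqrt{\gamma}\,e_{\alpha\beta\gamma}c^\gamma, \qquad c^{\alpha\beta} = \frac{1}{\sqrt{\gamma}}\,e^{\alpha\beta\gamma}c_\gamma.$$
In particular, $\operatorname{curl}\mathbf{a}$ should be understood in this same sense as the vector dual to the tensor
$a_{\beta;\alpha} - a_{\alpha;\beta} = (\partial a_\beta/\partial x^\alpha) - (\partial a_\alpha/\partial x^\beta)$, so that its contravariant components are
$$(\operatorname{curl}\mathbf{a})^\alpha = \frac{1}{2\sqrt{\gamma}}\,e^{\alpha\beta\gamma}\left(\frac{\partial a_\gamma}{\partial x^\beta} - \frac{\partial a_\beta}{\partial x^\gamma}\right).$$
In this same connection we repeat that for the three-dimensional divergence of a vector [see (86.9)]:
$$\operatorname{div}\mathbf{a} = \frac{1}{\sqrt{\gamma}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{\gamma}\,a^\alpha\right).$$
To avoid misunderstandings when comparing with formulas frequently used for the three-dimensional
vector operations in orthogonal curvilinear coordinates (see, for example, Electrodynamics of Continuous
Media, appendix), we point out that in these formulas the components of the vectors are understood to be
the quantities $\sqrt{g_{11}}\,A^1\ (= \sqrt{A_1A^1})$, $\sqrt{g_{22}}\,A^2$, $\sqrt{g_{33}}\,A^3$.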
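we obtain:
$$\frac{1}{h}\left(\frac{\omega_0}{c}\right)^2 - \gamma_{\alpha\beta}k^\alpha k^\beta = 0.$$
Noting that the vector $k^\alpha$ must have the direction of the vector $dx^\alpha$, we then find:
$$k^\alpha = \frac{\omega_0}{c\sqrt{h}}\frac{dx^\alpha}{dl},$$
where $dl$ (84.6) is the element of spatial distance along the ray. In order to obtain the expression for
$k_\alpha$, we write
$$k^\alpha = g^{\alpha i}k_i = g^{\alpha 0}k_0 + g^{\alpha\beta}k_\beta = -g^\alpha\frac{\omega_0}{c} - \gamma^{\alpha\beta}k_\beta,$$
so that
$$k_\alpha = -\gamma_{\alpha\beta}\left(k^\beta + \frac{\omega_0}{c}g^\beta\right) = -\frac{\omega_0}{c}\left(\frac{\gamma_{\alpha\beta}}{\sqrt{h}}\frac{dx^\beta}{dl} + g_\alpha\right).$$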
Finally, multiplying by $dx^\alpha$, we obtain Fermat's principle in the form (dropping the constant factor
$\omega_0/c$):
$$\delta\int\left(\frac{dl}{\sqrt{h}} + g_\alpha\,dx^\alpha\right) = 0.$$
In a static field, we have simply:
$$\delta\int\frac{dl}{\sqrt{h}} = 0.$$
We call attention to the fact that in a gravitational field the ray does not propagate along the
shortest line in space, since the latter would be defined by the equation $\delta\int dl = 0$.
§ 89. Rotation
As a special case of a stationary gravitational field, let us consider a uniformly rotating
reference system. To calculate the interval ds we carry out the transformation from a system
at rest (inertial system) to the uniformly rotating one. In the coordinates $r'$, $\phi'$, $z'$, $t$ of the
system at rest (we use cylindrical coordinates $r'$, $\phi'$, $z'$), the interval has the form
$$ds^2 = c^2\,dt^2 - dr'^2 - r'^2\,d\phi'^2 - dz'^2. \tag{89.1}$$
Let the cylindrical coordinates in the rotating system be $r$, $\phi$, $z$. If the axis of rotation
coincides with the axes $Z$ and $Z'$, then we have $r' = r$, $z' = z$, $\phi' = \phi + \Omega t$, where $\Omega$ is the
angular velocity of rotation. Substituting in (89.1), we find the required expression for $ds^2$
in the rotating system of reference:
$$ds^2 = (c^2 - \Omega^2r^2)\,dt^2 - 2\Omega r^2\,d\phi\,dt - dz^2 - r^2\,d\phi^2 - dr^2. \tag{89.2}$$
It is necessary to note that the rotating system of reference can be used only out to distances
equal to $c/\Omega$. In fact, from (89.2) we see that for $r > c/\Omega$, $g_{00}$ becomes negative, which is
not admissible. The inapplicability of the rotating reference system at large distances is
related to the fact that there the velocity would become greater than the velocity of light,
and therefore such a system cannot be made up from real bodies.
As in every stationary field, clocks on the rotating body cannot be uniquely synchronized
at all points. Proceeding with the synchronization along any closed curve, we find, upon
returning to the starting point, a time differing from the initial value by an amount [see
(88.5)]
$$\Delta t = -\frac{1}{c}\oint\frac{g_{0\alpha}}{g_{00}}\,dx^\alpha = \frac{1}{c^2}\oint\frac{\Omega r^2\,d\phi}{1 - \dfrac{\Omega^2r^2}{c^2}},$$
or, assuming that $\Omega r/c \ll 1$ (i.e. that the velocity of the rotation is small compared with the
velocity of light),
$$\Delta t = \frac{\Omega}{c^2}\int r^2\,d\phi = \pm\frac{2\Omega}{c^2}S, \tag{89.3}$$
where $S$ is the projected area of the contour on a plane perpendicular to the axis of rotation
(the sign $+$ or $-$ holding according as we traverse the contour in, or opposite to, the direction
of rotation).
Let us assume that a ray of light propagates along a certain closed contour. Let us cal
culate to terms of order v/c the time t that elapses between the starting out of the light ray
and its return to the initial point. The velocity of light, by definition, is always equal to c,
if the times are synchronized along the given closed curve and if at each point we use the
proper time. Since the difference between proper and world time is of order v 2 /c 2 , then
in calculating the required time interval t to terms of order v/c this difference can be neglected.
Therefore we have
$$t = \frac{L}{c} \pm \frac{2\Omega}{c^2}S,$$
where $L$ is the length of the contour. Corresponding to this, the velocity of light, measured
as the ratio $L/t$, appears equal to
$$c \pm 2\Omega\frac{S}{L}. \tag{89.4}$$
This formula, like the first approximation for the Doppler effect, can also be easily derived
in a purely classical manner.
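The magnitude of the time difference (89.3) can be illustrated by a small numerical sketch. For a circular contour of radius $r$ centred on the axis of rotation, the exact integral (88.5) evaluates in closed form to $2\pi\Omega r^2/[c^2(1 - \Omega^2r^2/c^2)]$, which is compared below with the first-order result $2\Omega S/c^2$, $S = \pi r^2$. The use of the earth's angular velocity, the assumption that the normal to the contour lies along the axis of rotation, and all function names are choices made only for this example.

```python
import math

C = 299_792_458.0         # speed of light, m/s
OMEGA_EARTH = 7.2921159e-5  # angular velocity of the earth, rad/s

def sync_offset_exact(omega, r, c=C):
    """Integral (88.5) divided by c for a circle of radius r centred on the axis."""
    return 2.0 * math.pi * omega * r**2 / (c**2 * (1.0 - (omega * r / c)**2))

def sync_offset_slow(omega, area, c=C):
    """First-order result (89.3): |Delta t| = 2 Omega S / c^2."""
    return 2.0 * omega * area / c**2

def apparent_light_speeds(omega, area, length, c=C):
    """The two values c + 2 Omega S / L and c - 2 Omega S / L of formula (89.4)."""
    delta = 2.0 * omega * area / length
    return c + delta, c - delta

r = 1.0  # contour radius in metres (illustrative)
print(sync_offset_exact(OMEGA_EARTH, r))
print(sync_offset_slow(OMEGA_EARTH, math.pi * r**2))
print(apparent_light_speeds(OMEGA_EARTH, math.pi * r**2, 2.0 * math.pi * r))
```

For laboratory-sized contours the two expressions for $\Delta t$ are indistinguishable, since the correction is of relative order $\Omega^2r^2/c^2$.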
PROBLEM
Calculate the element of spatial distance in a rotating coordinate system.
Solution: With the help of (84.6) and (84.7), we find
$$dl^2 = dr^2 + dz^2 + \frac{r^2\,d\phi^2}{1 - \Omega^2r^2/c^2},$$
which determines the spatial geometry in the rotating reference system. We note that the ratio of
the circumference of a circle in the plane $z = \text{constant}$ (with center on the axis of rotation) to its
radius $r$ is
$$\frac{2\pi}{\sqrt{1 - \Omega^2r^2/c^2}},$$
i.e. larger than $2\pi$.
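As a brief numerical illustration of this result, the ratio can be tabulated as a function of the dimensionless rim velocity $\Omega r/c$. The function name and the sample values are assumptions made for the example.

```python
import math

def circumference_to_radius(beta):
    """Ratio of circumference to radius for a circle with rim velocity beta = Omega r / c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("the rotating frame is applicable only for Omega r / c < 1")
    return 2.0 * math.pi / math.sqrt(1.0 - beta * beta)

for beta in (0.0, 0.1, 0.6, 0.9):
    print(beta, circumference_to_radius(beta) / (2.0 * math.pi))
```

The ratio equals $2\pi$ on the axis and grows without limit as $r \to c/\Omega$, the boundary beyond which the rotating system cannot be used.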
§ 90. The equations of electrodynamics in the presence of a gravitational field
The electromagnetic field equations of the special theory of relativity can be easily
generalized so that they are applicable in an arbitrary fourdimensional curvilinear system
of coordinates, i.e., in the presence of a gravitational field.
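The electromagnetic field tensor in the special theory of relativity is defined as
$F_{ik} = (\partial A_k/\partial x^i) - (\partial A_i/\partial x^k)$. Clearly it must now be defined correspondingly as
$F_{ik} = A_{k;i} - A_{i;k}$. But because of (86.12),
$$F_{ik} = A_{k;i} - A_{i;k} = \frac{\partial A_k}{\partial x^i} - \frac{\partial A_i}{\partial x^k}, \tag{90.1}$$
and therefore the relation of $F_{ik}$ to the potential $A_i$ does not change. Consequently the first
pair of Maxwell equations (26.5) also does not change its form:†
$$\frac{\partial F_{ik}}{\partial x^l} + \frac{\partial F_{li}}{\partial x^k} + \frac{\partial F_{kl}}{\partial x^i} = 0. \tag{90.2}$$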
In order to write the second pair of Maxwell equations, we must first determine the current
four-vector in curvilinear coordinates. We do this in a fashion completely analogous to that
which we followed in § 28. The spatial volume element, constructed on the space coordinate
elements $dx^1$, $dx^2$, and $dx^3$, is $\sqrt{\gamma}\,dV$, where $\gamma$ is the determinant of the spatial metric
tensor (84.7) and $dV = dx^1\,dx^2\,dx^3$ (see the footnote on p. 232). We introduce the charge
density $\rho$ according to the definition $de = \rho\sqrt{\gamma}\,dV$, where $de$ is the charge located within
the volume element $\sqrt{\gamma}\,dV$. Multiplying this equation on both sides by $dx^i$, we have:
$$de\,dx^i = \rho\,dx^i\sqrt{\gamma}\,dx^1\,dx^2\,dx^3 = \frac{\rho}{\sqrt{g_{00}}}\sqrt{-g}\,d\Omega\,\frac{dx^i}{dx^0}$$
[where we have used the formula $-g = \gamma g_{00}$ (84.10)]. The product $\sqrt{-g}\,d\Omega$ is the invariant
element of four-volume, so that the current four-vector is defined by the expression
$$j^i = \frac{\rho c}{\sqrt{g_{00}}}\frac{dx^i}{dx^0} \tag{90.3}$$
(the quantities $dx^i/dx^0$ are the rates of change of the coordinates with the "time" $x^0$, and
do not constitute a four-vector). The component $j^0$ of the current four-vector, multiplied
by $\sqrt{g_{00}}/c$, is the spatial density of charge.
For point charges the density $\rho$ is expressed as a sum of $\delta$-functions, as in formula (28.1).
We must, however, correct the definition of these functions for the case of curvilinear coordinates.
By $\delta(\mathbf{r})$ we shall again mean the product $\delta(x^1)\,\delta(x^2)\,\delta(x^3)$, regardless of the
geometrical meaning of the coordinates $x^1$, $x^2$, $x^3$; then the integral over $dV$ (and not over
$\sqrt{\gamma}\,dV$) is unity: $\int\delta(\mathbf{r})\,dV = 1$. With this same definition of the $\delta$-functions, the charge
density is
$$\rho = \sum_a\frac{e_a}{\sqrt{\gamma}}\,\delta(\mathbf{r} - \mathbf{r}_a),$$
and the current four-vector is
$$j^i = \sum_a\frac{e_ac}{\sqrt{-g}}\,\delta(\mathbf{r} - \mathbf{r}_a)\,\frac{dx^i}{dx^0}. \tag{90.4}$$
Conservation of charge is expressed by the equation of continuity, which differs from
† It is easily seen that the equation can also be written in the form
$$F_{ik;l} + F_{li;k} + F_{kl;i} = 0,$$
from which its covariance is obvious.
(29.4) only in replacement of the ordinary derivatives by covariant derivatives:
$$j^i{}_{;i} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^i}\left(\sqrt{-g}\,j^i\right) = 0 \tag{90.5}$$
[using formula (86.9)].
The second pair of Maxwell equations (30.2) is generalized similarly; replacing the
ordinary derivatives by covariant derivatives, we find:
$$F^{ik}{}_{;k} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,F^{ik}\right) = -\frac{4\pi}{c}j^i \tag{90.6}$$
[using formula (86.10)].
Finally the equations of motion of a charged particle in gravitational and electromagnetic
fields are obtained by replacing the four-acceleration $du^i/ds$ in (23.4) by $Du^i/ds$:
$$mc\frac{Du^i}{ds} = mc\left(\frac{du^i}{ds} + \Gamma^i_{kl}u^ku^l\right) = \frac{e}{c}F^{ik}u_k. \tag{90.7}$$
PROBLEM
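Write the Maxwell equations in a given gravitational field in three-dimensional form (in the
three-dimensional space with metric $\gamma_{\alpha\beta}$), introducing the three-vectors $\mathbf{E}$, $\mathbf{D}$ and the antisymmetric
three-tensors $B_{\alpha\beta}$ and $H^{\alpha\beta}$ according to the definitions:
$$E_\alpha = F_{0\alpha}, \qquad B_{\alpha\beta} = F_{\alpha\beta},$$
$$D^\alpha = -\sqrt{g_{00}}\,F^{0\alpha}, \qquad H^{\alpha\beta} = \sqrt{g_{00}}\,F^{\alpha\beta}. \tag{1}$$
Solution: The quantities introduced above are not independent. Writing out the equations
$$F_{0\alpha} = g_{0l}\,g_{\alpha m}F^{lm}, \qquad F^{\alpha\beta} = g^{\alpha l}g^{\beta m}F_{lm},$$
and introducing the three-dimensional metric tensor $\gamma_{\alpha\beta} = -g_{\alpha\beta} + hg_\alpha g_\beta$ [with $\mathbf{g}$ and $h$ from (88.11)],
and using formulas (84.9) and (84.12), we get:
$$D_\alpha = \frac{E_\alpha}{\sqrt{h}} + g^\beta H_{\alpha\beta}, \qquad B^{\alpha\beta} = \frac{H^{\alpha\beta}}{\sqrt{h}} + g^\beta E^\alpha - g^\alpha E^\beta. \tag{2}$$
We introduce the vectors $\mathbf{B}$ and $\mathbf{H}$, dual to the tensors $B_{\alpha\beta}$ and $H^{\alpha\beta}$, in accordance with the definition:
$$B^\alpha = -\frac{1}{2\sqrt{\gamma}}\,e^{\alpha\beta\gamma}B_{\beta\gamma}, \qquad H_\alpha = -\frac{1}{2}\sqrt{\gamma}\,e_{\alpha\beta\gamma}H^{\beta\gamma} \tag{3}$$
(see the footnote on p. 252; the minus sign is introduced so that in galilean coordinates the vectors
$\mathbf{H}$ and $\mathbf{B}$ coincide with the ordinary magnetic field intensity). Then (2) can be written in the form
$$\mathbf{D} = \frac{\mathbf{E}}{\sqrt{h}} + \mathbf{H}\times\mathbf{g}, \qquad \mathbf{B} = \frac{\mathbf{H}}{\sqrt{h}} + \mathbf{g}\times\mathbf{E}. \tag{4}$$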
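Introducing definition (1) in (90.2), we get the equations:
$$\frac{\partial B_{\alpha\beta}}{\partial x^\gamma} + \frac{\partial B_{\gamma\alpha}}{\partial x^\beta} + \frac{\partial B_{\beta\gamma}}{\partial x^\alpha} = 0,$$
$$\frac{\partial B_{\alpha\beta}}{\partial x^0} + \frac{\partial E_\alpha}{\partial x^\beta} - \frac{\partial E_\beta}{\partial x^\alpha} = 0,$$
or, changing to the dual quantities (3):
$$\operatorname{div}\mathbf{B} = 0, \qquad \operatorname{curl}\mathbf{E} = -\frac{1}{c\sqrt{\gamma}}\frac{\partial}{\partial t}\left(\sqrt{\gamma}\,\mathbf{B}\right) \tag{5}$$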
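($x^0 = ct$; the definitions of the operations div and curl are given in the footnote on p. 252). Similarly
we find from (90.6) the equations
$$\frac{1}{\sqrt{\gamma}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{\gamma}\,D^\alpha\right) = 4\pi\rho,$$
$$\frac{1}{\sqrt{\gamma}}\frac{\partial}{\partial x^\beta}\left(\sqrt{\gamma}\,H^{\alpha\beta}\right) + \frac{1}{\sqrt{\gamma}}\frac{\partial}{\partial x^0}\left(\sqrt{\gamma}\,D^\alpha\right) = -4\pi\rho\,\frac{dx^\alpha}{dx^0},$$
or, in three-dimensional notation:
$$\operatorname{div}\mathbf{D} = 4\pi\rho, \qquad \operatorname{curl}\mathbf{H} = \frac{1}{c\sqrt{\gamma}}\frac{\partial}{\partial t}\left(\sqrt{\gamma}\,\mathbf{D}\right) + \frac{4\pi}{c}\mathbf{s}, \tag{6}$$
where $\mathbf{s}$ is the vector with components $s^\alpha = \rho\,dx^\alpha/dt$.
We also write the continuity equation (90.5) in three-dimensional form:
$$\frac{1}{\sqrt{\gamma}}\frac{\partial}{\partial t}\left(\sqrt{\gamma}\,\rho\right) + \operatorname{div}\mathbf{s} = 0. \tag{7}$$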
The reader should note the analogy (purely formal, of course) of equations (5) and (6) to the
Maxwell equations for the electromagnetic field in material media. In particular, in a static gravitational
field the quantity $\sqrt{\gamma}$ drops out of the terms containing time derivatives, and relation (4)
reduces to $\mathbf{D} = \mathbf{E}/\sqrt{h}$, $\mathbf{B} = \mathbf{H}/\sqrt{h}$. We may say that with respect to its effect on the electromagnetic
field a static gravitational field plays the role of a medium with electric and magnetic permeabilities
$\varepsilon = \mu = 1/\sqrt{h}$.
CHAPTER 11
THE GRAVITATIONAL FIELD EQUATIONS
§ 91. The curvature tensor
Let us go back once more to the concept of parallel displacement of a vector. As we said
in § 85, in the general case of a curved four-space, the infinitesimal parallel displacement of a
vector is defined as a displacement in which the components of the vector are not changed in
a system of coordinates which is galilean in the given infinitesimal volume element.
If $x^i = x^i(s)$ is the parametric equation of a certain curve ($s$ is the arc length measured
from some point), then the vector $u^i = dx^i/ds$ is a unit vector tangent to the curve. If the
curve we are considering is a geodesic, then along it $Du^i = 0$. This means that if the vector
$u^i$ is subjected to a parallel displacement from a point $x^i$ on a geodesic curve to the point
$x^i + dx^i$ on the same curve, then it coincides with the vector $u^i + du^i$ tangent to the curve at
the point $x^i + dx^i$. Thus when the tangent to a geodesic moves along the curve, it is displaced
parallel to itself.
On the other hand, during the parallel displacement of two vectors, the "angle" between
them clearly remains unchanged. Therefore we may say that during the parallel displace
ment of any vector along a geodesic curve, the angle between the vector and the tangent
to the geodesic remains unchanged. In other words, during the parallel displacement of a
vector, its component along the geodesic must be the same at all points of the path.
Now the very important result appears that in a curved space the parallel displacement of
a vector from one given point to another gives different results if the displacement is carried
out over different paths. In particular, it follows from this that if we displace a vector parallel
to itself along some closed contour, then upon returning to the starting point, it will not
coincide with its original value.
In order to make this clear, let us consider a curved twodimensional space, i.e., any
curved surface. Figure 19 shows a portion of such a surface, bounded by three geodesic
curves. Let us subject the vector 1 to a parallel displacement along the contour made up of
Fig. 19.
these three curves. In moving along the line AB, the vector 1, always retaining its angle with
the curve unchanged, goes over into the vector 2. In the same way, on moving along BC
it goes over into 3. Finally, on moving from C to A along the curve CA, maintaining a
constant angle with this curve, the vector under consideration goes over into 1', not
coinciding with the vector 1.
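The failure of a vector to return to its original value can be seen in a simple numerical experiment. The sketch below transports a tangent vector once around a circle of latitude $\theta = \theta_0$ on the unit sphere, integrating $dA^i = -\Gamma^i_{kl}A^k\,dx^l$ with the sphere's symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \cot\theta$. Note that the contour here is a circle of latitude rather than the geodesic triangle of Fig. 19, and that the choice of surface, the Runge-Kutta integrator and the names used are assumptions made for this illustration only.

```python
import math

def transport_around_latitude(theta0, steps=4000):
    """Parallel-transport a tangent vector once around the circle theta = theta0.

    Unit sphere, dl^2 = dtheta^2 + sin^2(theta) dphi^2. The vector starts as the
    unit vector along theta; the function returns its final orthonormal components
    (A^theta, sin(theta0) * A^phi).
    """
    sin_t, cos_t = math.sin(theta0), math.cos(theta0)
    gamma_theta_phiphi = -sin_t * cos_t   # Gamma^theta_{phi phi}
    gamma_phi_thetaphi = cos_t / sin_t    # Gamma^phi_{theta phi}

    def rhs(A):
        # dA^i/dphi = -Gamma^i_{k phi} A^k along a curve of constant theta
        A_theta, A_phi = A
        return (-gamma_theta_phiphi * A_phi, -gamma_phi_thetaphi * A_theta)

    A = (1.0, 0.0)
    h = 2.0 * math.pi / steps
    for _ in range(steps):  # classical fourth-order Runge-Kutta in phi
        k1 = rhs(A)
        k2 = rhs((A[0] + 0.5 * h * k1[0], A[1] + 0.5 * h * k1[1]))
        k3 = rhs((A[0] + 0.5 * h * k2[0], A[1] + 0.5 * h * k2[1]))
        k4 = rhs((A[0] + h * k3[0], A[1] + h * k3[1]))
        A = (A[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             A[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return A[0], sin_t * A[1]

theta0 = math.pi / 3
a, b = transport_around_latitude(theta0)
enclosed_area = 2.0 * math.pi * (1.0 - math.cos(theta0))  # area of the polar cap
print(a, b, math.cos(enclosed_area))
```

The length of the vector is preserved, but it returns rotated; the cosine of the angle between the initial and final vectors agrees with the cosine of the area of the polar cap enclosed by the contour. For $\theta_0 = \pi/3$ the vector comes back reversed.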
We derive the general formula for the change in a vector after parallel displacement
around any infinitesimal closed contour. This change $\Delta A_k$ can clearly be written in the form
$\oint\delta A_k$, where the integral is taken over the given contour. Substituting in place of $\delta A_k$ the
expression (85.5), we have
$$\Delta A_k = \oint\Gamma^i_{kl}A_i\,dx^l \tag{91.1}$$
(the vector $A_i$ which appears in the integrand changes as we move along the contour).
For the further transformation of this integral, we must note the following. The values of
the vector $A_i$ at points inside the contour are not unique; they depend on the path along
which we approach the particular point. However, as we shall see from the result obtained
below, this non-uniqueness is related to terms of second order. We may therefore, with the
first-order accuracy which is sufficient for the transformation, regard the components of the
vector $A_i$ at points inside the infinitesimal contour as being uniquely determined by their
values on the contour itself by the formulas $\delta A_i = \Gamma^n_{il}A_n\,dx^l$, i.e., by the derivatives
$$\frac{\partial A_i}{\partial x^l} = \Gamma^n_{il}A_n. \tag{91.2}$$
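Now applying Stokes' theorem (6.19) to the integral (91.1) and considering that the area
enclosed by the contour has the infinitesimal value $\Delta f^{lm}$, we get:
$$\Delta A_k = \frac{1}{2}\left[\frac{\partial(\Gamma^i_{km}A_i)}{\partial x^l} - \frac{\partial(\Gamma^i_{kl}A_i)}{\partial x^m}\right]\Delta f^{lm}$$
$$= \frac{1}{2}\left[\frac{\partial\Gamma^i_{km}}{\partial x^l}A_i - \frac{\partial\Gamma^i_{kl}}{\partial x^m}A_i + \Gamma^i_{km}\frac{\partial A_i}{\partial x^l} - \Gamma^i_{kl}\frac{\partial A_i}{\partial x^m}\right]\Delta f^{lm}.$$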
Substituting the values of the derivatives (91.2), we get finally:
AA k = $Ri lm A i Af lm ,
where JRL, is a tensor of the fourth rank:
Rklm —
arL art,
dx l dx n
iri r n —T l F"
"t" 1 nl L km l nm L kh
(91.3)
(91.4)
That $R^i_{klm}$ is a tensor is clear from the fact that in (91.3) the left side is a vector: the
difference $\Delta A_k$ between the values of vectors at one and the same point. The tensor $R^i_{klm}$ is
called the curvature tensor or the Riemann tensor.
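Formula (91.4) can be evaluated mechanically once the Christoffel symbols are known. The following is a minimal sketch, assuming the unit sphere with coordinates (θ, φ) as a test space; the function names `christoffel` and `riemann` are hypothetical, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def christoffel(x):
    """G[i][k][l] = Gamma^i_{kl} for the unit sphere, x = (theta, phi)."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(th) * math.cos(th)               # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)   # Gamma^phi_{theta phi}
    return G

def riemann(x, h=1e-5):
    """R[i][k][l][m] = R^i_{klm} according to (91.4)."""
    n = len(x)
    G = christoffel(x)
    dG = []  # dG[l][i][k][m] approximates d Gamma^i_{km} / d x^l
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        Gp, Gm = christoffel(xp), christoffel(xm)
        dG.append([[[(Gp[i][k][m] - Gm[i][k][m]) / (2 * h) for m in range(n)]
                    for k in range(n)] for i in range(n)])
    R = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for l in range(n):
                for m in range(n):
                    val = dG[l][i][k][m] - dG[m][i][k][l]
                    for p in range(n):
                        val += G[i][p][l] * G[p][k][m] - G[i][p][m] * G[p][k][l]
                    R[i][k][l][m] = val
    return R
```

For the unit sphere the only independent component is R^θ_{φθφ} = sin²θ; the other nonzero components follow from it by the symmetries of the tensor.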
It is easy to obtain a similar formula for a contravariant vector $A^k$. To do this we note,
since under parallel displacement a scalar does not change, that $\Delta(A^k B_k) = 0$, where $B_k$ is
any covariant vector. With the help of (91.3), we then have
$$\Delta(A^k B_k) = A^k\,\Delta B_k + B_k\,\Delta A^k = \tfrac{1}{2}A^k B_i R^i_{klm}\,\Delta f^{lm} + B_k\,\Delta A^k
= B_k\left(\Delta A^k + \tfrac{1}{2}A^i R^k_{ilm}\,\Delta f^{lm}\right) = 0,$$
or, in view of the arbitrariness of the vector $B_k$,
$$\Delta A^k = -\tfrac{1}{2}R^k_{ilm}A^i\,\Delta f^{lm}. \tag{91.5}$$
If we twice differentiate a vector $A_i$ covariantly with respect to $x^k$ and $x^l$, then the result
generally depends on the order of differentiation, contrary to the situation for ordinary
differentiation. It turns out that the difference $A_{i;k;l} - A_{i;l;k}$ is given by the same curvature
tensor which we introduced above. Namely, one finds the formula
$$A_{i;k;l} - A_{i;l;k} = A_m R^m_{ikl}, \tag{91.6}$$
which is easily verified by direct calculation in the locally-geodesic coordinate system.
Similarly, for a contravariant vector,†
$$A^i_{;k;l} - A^i_{;l;k} = -A^m R^i_{mkl}. \tag{91.7}$$
Finally, it is easy to obtain similar formulas for the second derivatives of tensors [this is
done most easily by considering, for example, a tensor of the form $A_i B_k$, and using formulas
(91.6) and (91.7); because of the linearity, the formulas thus obtained must be valid for an
arbitrary tensor $A_{ik}$]. Thus
$$A_{ik;l;m} - A_{ik;m;l} = A_{in}R^n_{klm} + A_{nk}R^n_{ilm}. \tag{91.8}$$
Clearly, in a flat space the curvature tensor is zero, for, in a flat space, we can choose
coordinates such that over all the space all the $\Gamma^i_{kl} = 0$, and therefore also $R^i_{klm} = 0$. Because
of the tensor character of $R^i_{klm}$ it is then equal to zero also in any other coordinate system.
This is related to the fact that in a flat space parallel displacement is a single-valued operation,
so that in making a circuit of a closed contour a vector does not change.
The converse theorem is also valid: if $R^i_{klm} = 0$, then the space is flat. Namely, in any
space we can choose a coordinate system which is galilean over a given infinitesimal region.
If $R^i_{klm} = 0$, then parallel displacement is a unique operation, and then by a parallel
displacement of the galilean system from the given infinitesimal region to all the rest of the
space, we can construct a galilean system over the whole space, which proves that the space
is Euclidean.
Thus the vanishing or nonvanishing of the curvature tensor is a criterion which enables
us to determine whether a space is flat or curved.
We note that although in a curved space we can also choose a coordinate system which
will be locally geodesic at a given point, at the same time the curvature tensor at this same
point does not go to zero (since the derivatives of the $\Gamma^i_{kl}$ do not become zero along with
the $\Gamma^i_{kl}$).
§ 92. Properties of the curvature tensor
From the expression (91.4) it follows immediately that the curvature tensor is
antisymmetric in the indices l and m:
$$R^i_{klm} = -R^i_{kml}. \tag{92.1}$$
Furthermore, one can easily verify that the following identity is valid:
$$R^i_{klm} + R^i_{mkl} + R^i_{lmk} = 0. \tag{92.2}$$
† Formula (91.7) can also be obtained directly from (91.6) by raising the index i and using the symmetry
properties of the tensor $R_{iklm}$ (§ 92).

In addition to the mixed curvature tensor $R^i_{klm}$, one also uses the covariant curvature
tensor†
$$R_{iklm} = g_{in}R^n_{klm}. \tag{92.3}$$
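By means of simple transformations it is easy to obtain the following expression for
$R_{iklm}$:
$$R_{iklm} = \frac{1}{2}\left(\frac{\partial^2 g_{im}}{\partial x^k\,\partial x^l} + \frac{\partial^2 g_{kl}}{\partial x^i\,\partial x^m} - \frac{\partial^2 g_{il}}{\partial x^k\,\partial x^m} - \frac{\partial^2 g_{km}}{\partial x^i\,\partial x^l}\right) + g_{np}\left(\Gamma^n_{kl}\Gamma^p_{im} - \Gamma^n_{km}\Gamma^p_{il}\right) \tag{92.4}$$
(for actual calculations the last term is more conveniently written as
$g^{np}(\Gamma_{n,kl}\Gamma_{p,im} - \Gamma_{n,km}\Gamma_{p,il})$).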
From this expression one sees immediately the following symmetry properties:
$$R_{iklm} = -R_{kilm} = -R_{ikml}, \tag{92.5}$$
$$R_{iklm} = R_{lmik}, \tag{92.6}$$
i.e. the tensor is antisymmetric in each of the index pairs i, k and l, m, and is symmetric
under the interchange of the two pairs with one another.
From these formulas it follows, in particular, that all components $R_{iklm}$ in which i = k
or l = m are zero.
For $R_{iklm}$, as for $R^i_{klm}$, the identity (92.2) is valid:
$$R_{iklm} + R_{imkl} + R_{ilmk} = 0. \tag{92.7}$$
Furthermore, from the relations (92.5)-(92.6) it follows that if we cyclically permute any
three indices in $R_{iklm}$ and add the three components obtained, then the result will be zero.
Finally, we also prove the Bianchi identity:
$$R^n_{ikl;m} + R^n_{imk;l} + R^n_{ilm;k} = 0. \tag{92.8}$$
It is most conveniently verified by using a locally-geodesic coordinate system. Because of
its tensor character, the relation (92.8) will then be valid in any other system. Differentiating
(91.4) and then substituting in it $\Gamma^i_{kl} = 0$, we find for the point under consideration
$$R^n_{ikl;m} = \frac{\partial R^n_{ikl}}{\partial x^m} = \frac{\partial^2\Gamma^n_{il}}{\partial x^m\,\partial x^k} - \frac{\partial^2\Gamma^n_{ik}}{\partial x^m\,\partial x^l}.$$
With the aid of this expression it is easy to verify that (92.8) actually holds.
From the curvature tensor we can, by contraction, construct a tensor of the second rank.
This contraction can be carried out in only one way: contraction of the tensor $R_{iklm}$ on the
indices i and k or l and m gives zero because of the antisymmetry in these indices, while
contraction on any other pair always gives the same result, except for sign. We define the
tensor $R_{ik}$ (the Ricci tensor) as‡
$$R_{ik} = g^{lm}R_{limk} = R^l_{ilk}. \tag{92.9}$$
According to (91.4), we have:
$$R_{ik} = \frac{\partial\Gamma^l_{ik}}{\partial x^l} - \frac{\partial\Gamma^l_{il}}{\partial x^k} + \Gamma^l_{ik}\Gamma^m_{lm} - \Gamma^m_{il}\Gamma^l_{km}. \tag{92.10}$$
This tensor is clearly symmetric:
$$R_{ik} = R_{ki}. \tag{92.11}$$
† In this connection it would be more correct to use the notation $R^i{}_{klm}$ which clearly shows the position
of the index which has been raised.
‡ In the literature one also finds another definition of the tensor $R_{ik}$, using contraction of $R_{iklm}$ on the first
and last indices. This definition differs in sign from the one used here.
Finally, contracting $R_{ik}$, we obtain the invariant
$$R = g^{ik}R_{ik} = g^{il}g^{km}R_{iklm}, \tag{92.12}$$
which is called the scalar curvature of the space.
The components of the tensor $R_{ik}$ satisfy a differential identity obtained by contracting
the Bianchi identity (92.8) on the pairs of indices ik and ln:
$$R^l_{m;l} = \frac{1}{2}\frac{\partial R}{\partial x^m}. \tag{92.13}$$
Because of the relations (92.5)-(92.7) not all the components of the curvature tensor are
independent. Let us determine the number of independent components.
The definition of the curvature tensor as given by the formulas written above applies to
a space of an arbitrary number of dimensions. Let us first consider the case of two
dimensions, i.e. an ordinary surface; in this case (to distinguish them from four-dimensional
quantities) we denote the curvature tensor by $P_{abcd}$ and the metric tensor by $\gamma_{ab}$, where the
indices a, b, ... run through the values 1, 2. Since in each of the pairs ab and cd the two
indices must have different values, it is obvious that all the nonvanishing components of
the curvature tensor coincide or differ in sign. Thus in this case there is only one independent
component, for example $P_{1212}$. It is easily found that the scalar curvature is
$$P = \frac{2P_{1212}}{\gamma}, \qquad \gamma \equiv |\gamma_{ab}| = \gamma_{11}\gamma_{22} - (\gamma_{12})^2. \tag{92.14}$$
The quantity P/2 coincides with the Gaussian curvature K of the surface:
$$\frac{P}{2} = K = \frac{1}{\varrho_1\varrho_2}, \tag{92.15}$$
where $\varrho_1$, $\varrho_2$ are the principal radii of curvature of the surface at the particular point
(remember that $\varrho_1$ and $\varrho_2$ are assumed to have the same sign if the corresponding centers
of curvature are on one side of the surface, and opposite signs if the centers of curvature
lie on opposite sides of the surface; in the first case K > 0, while in the second K < 0).†
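As an illustration of (92.14) and (92.15), here is a short worked example for a sphere. The radius a and the choice of coordinates $x^1 = \theta$, $x^2 = \varphi$ are assumptions introduced for this sketch; they do not appear in the text.

```latex
% Sphere of radius a, coordinates x^1 = \theta, x^2 = \varphi (illustrative choice)
dl^2 = a^2\,d\theta^2 + a^2\sin^2\theta\,d\varphi^2,
\qquad \gamma = \gamma_{11}\gamma_{22} - (\gamma_{12})^2 = a^4\sin^2\theta .

% Nonzero Christoffel symbols
\Gamma^1_{22} = -\sin\theta\cos\theta, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \cot\theta .

% Formula (91.4), then lowering the index with \gamma_{11} = a^2
P^1{}_{212} = \frac{\partial\Gamma^1_{22}}{\partial\theta} - \Gamma^1_{22}\Gamma^2_{12} = \sin^2\theta,
\qquad P_{1212} = a^2\sin^2\theta .

% Scalar and Gaussian curvature
P = \frac{2P_{1212}}{\gamma} = \frac{2}{a^2}, \qquad K = \frac{P}{2} = \frac{1}{a^2} .
```

Both principal radii of a sphere equal a, so the result agrees with $K = 1/\varrho_1\varrho_2$.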
† Formula (92.15) is easy to get by writing the equation of the surface in the vicinity of the given point
(x = y = 0) in the form $z = (x^2/2\varrho_1) + (y^2/2\varrho_2)$.
Then the square of the line element on it is
$$dl^2 = \left(1 + \frac{x^2}{\varrho_1^2}\right)dx^2 + \left(1 + \frac{y^2}{\varrho_2^2}\right)dy^2 + 2\frac{xy}{\varrho_1\varrho_2}\,dx\,dy.$$
Calculation of $P_{1212}$ at the point x = y = 0 using formula (92.4) (in which only terms with second derivatives
of the $\gamma_{\alpha\beta}$ are needed) leads to (92.15).

Next we consider the curvature tensor in three-dimensional space; we denote it by $P_{\alpha\beta\gamma\delta}$
and the metric tensor by $\gamma_{\alpha\beta}$, where the indices α, β run through values 1, 2, 3. The index
pairs αβ and γδ run through three essentially different sets of values: 23, 31, and 12
(permutation of indices in a pair merely changes the sign of the tensor component). Since the
tensor $P_{\alpha\beta\gamma\delta}$ is symmetric under interchange of these pairs, there are all together 3·2/2
independent components with different pairs of indices, and three components with
identical pairs. The identity (92.7) adds no new restrictions. Thus, in three-dimensional
space the curvature tensor has six independent components. The symmetric tensor $P_{\alpha\beta}$
has the same number. Thus, from the linear relations $P_{\alpha\beta} = \gamma^{\gamma\delta}P_{\gamma\alpha\delta\beta}$ all the components
of the tensor $P_{\alpha\beta\gamma\delta}$ can be expressed in terms of $P_{\alpha\beta}$ and the metric tensor $\gamma_{\alpha\beta}$ (see problem
1). If we choose a system of coordinates that is cartesian at the particular point, then by a
suitable rotation we can bring the tensor $P_{\alpha\beta}$ to principal axes.† Thus the curvature tensor
of a three-dimensional space at a given point is determined by three quantities.‡
Finally we go to four-dimensional space. The pairs of indices ik and lm in this case run
through six different sets of values: 01, 02, 03, 23, 31, 12. Thus there are six components
of $R_{iklm}$ with identical, and 6·5/2 with different, pairs of indices. The latter, however, are
still not independent of one another; the three components for which all four indices are
different are related, because of (92.7), by the identity:
$$R_{0123} + R_{0312} + R_{0231} = 0. \tag{92.16}$$
Thus, in four-space the curvature tensor has a total of twenty independent components.
By choosing a coordinate system that is galilean at the given point and considering the
transformations that rotate this system (so that the $g_{ik}$ at the point are not changed), one
can achieve the vanishing of six of the components of the curvature tensor (since there are
six independent rotations of a four-dimensional coordinate system). Thus, in the general
case the curvature of four-space is determined at each point by fourteen quantities.
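The counting argument used above for two, three and four dimensions can be written as a small function. This is a sketch of that argument only, with the hypothetical name `independent_components`; it assumes that the cyclic identity (92.7) gives one new restriction for each set of four different index values and none otherwise, as in the cases treated in the text.

```python
from math import comb

def independent_components(n):
    """Number of independent components of the curvature tensor in n dimensions."""
    pairs = n * (n - 1) // 2              # essentially different antisymmetric index pairs
    symmetric = pairs * (pairs + 1) // 2  # identical pairs plus different pairs, by (92.6)
    cyclic = comb(n, 4)                   # restrictions from (92.7), all four indices different
    return symmetric - cyclic

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, independent_components(n))
```

The result agrees with the closed form n²(n² − 1)/12, which is a standard expression and not taken from the text.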
If $R_{ik} = 0$,§ then the curvature tensor has a total of ten independent components in an
arbitrary coordinate system. By a suitable transformation we can then bring the tensor
$R_{iklm}$ (at the given point of four-space) to a "canonical" form, in which its components are
expressed in general in terms of four independent quantities; in special cases this number may
be even smaller. (The classification of the possible canonical types for the tensor $R_{iklm}$ was
found by A. Z. Petrov, 1950; see problem 3.)
If, however, $R_{ik} \neq 0$, then the same classification can be used for the curvature tensor
after one has subtracted from it a particular part that is expressible in terms of the
components $R_{ik}$. Namely, we construct the tensor¶
$$C_{iklm} = R_{iklm} - \tfrac{1}{2}R_{il}g_{km} + \tfrac{1}{2}R_{im}g_{kl} + \tfrac{1}{2}R_{kl}g_{im} - \tfrac{1}{2}R_{km}g_{il} + \tfrac{1}{6}R\,(g_{il}g_{km} - g_{im}g_{kl}). \tag{92.17}$$
It is easy to see that this tensor has all the symmetry properties of the tensor $R_{iklm}$, but
vanishes when contracted on a pair of indices (il or km).
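The vanishing of the contraction can be checked term by term. The following worked step is a sketch for four dimensions, where $g^{il}g_{il} = 4$; it uses only (92.17), the definition $R_{km} = g^{il}R_{iklm}$ and the symmetry of $R_{ik}$.

```latex
g^{il}C_{iklm}
  = R_{km} - \tfrac{1}{2}R\,g_{km} + \tfrac{1}{2}R_{km} + \tfrac{1}{2}R_{km}
    - \tfrac{1}{2}\cdot 4\,R_{km} + \tfrac{1}{6}R\,(4g_{km} - g_{km})
  = \left(1 + \tfrac{1}{2} + \tfrac{1}{2} - 2\right)R_{km}
    + \left(-\tfrac{1}{2} + \tfrac{1}{2}\right)R\,g_{km} = 0 .
```

The contraction on km follows in the same way because of the symmetry of $C_{iklm}$ under interchange of the two index pairs.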
† For the actual determination of the principal values of the tensor $P_{\alpha\beta}$ there is no need to transform to a
coordinate system that is cartesian at the given point. These values can be found by determining the roots λ
of the equation $|P_{\alpha\beta} - \lambda\gamma_{\alpha\beta}| = 0$.
‡ Knowledge of the tensor $P_{\alpha\beta\gamma\delta}$ enables us to determine the Gaussian curvature K of an arbitrary surface
in the space. Here we note only that if the $x^1$, $x^2$, $x^3$ are an orthogonal coordinate system, then
$$K = \frac{P_{1212}}{\gamma_{11}\gamma_{22} - (\gamma_{12})^2}$$
is the Gaussian curvature for the "plane" perpendicular (at the given point) to the $x^3$ axis; by a "plane"
we mean a surface formed by geodesic lines.
§ We shall see later (§ 95) that the curvature tensor for the gravitational field in vacuum has this property.
¶ This complicated expression can be written more compactly in the form:
$$C_{iklm} = R_{iklm} - R_{l[i}g_{k]m} + R_{m[i}g_{k]l} + \tfrac{1}{3}R\,g_{l[i}g_{k]m},$$
where the square brackets imply antisymmetrization over the indices contained in them:
$$A_{[ik]} = \tfrac{1}{2}(A_{ik} - A_{ki}).$$

PROBLEMS
1. Express the curvature tensor $P_{\alpha\beta\gamma\delta}$ of three-dimensional space in terms of the second-rank
tensor $P_{\alpha\beta}$.
Solution: We look for $P_{\alpha\beta\gamma\delta}$ in the form
$$P_{\alpha\beta\gamma\delta} = A_{\alpha\gamma}\gamma_{\beta\delta} - A_{\alpha\delta}\gamma_{\beta\gamma} + A_{\beta\delta}\gamma_{\alpha\gamma} - A_{\beta\gamma}\gamma_{\alpha\delta},$$
which satisfies the symmetry conditions; here $A_{\alpha\beta}$ is some symmetric tensor whose relation to $P_{\alpha\beta}$
is determined by contracting the expression we have written on the indices α and γ. We thus find:
$$P_{\alpha\beta} = A\gamma_{\alpha\beta} + A_{\alpha\beta}, \qquad A_{\alpha\beta} = P_{\alpha\beta} - \tfrac{1}{4}P\gamma_{\alpha\beta},$$
and finally,
$$P_{\alpha\beta\gamma\delta} = P_{\alpha\gamma}\gamma_{\beta\delta} - P_{\alpha\delta}\gamma_{\beta\gamma} + P_{\beta\delta}\gamma_{\alpha\gamma} - P_{\beta\gamma}\gamma_{\alpha\delta} + \frac{P}{2}\left(\gamma_{\alpha\delta}\gamma_{\beta\gamma} - \gamma_{\alpha\gamma}\gamma_{\beta\delta}\right).$$
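The result of Problem 1 can be checked numerically. The sketch below assumes coordinates that are cartesian at the given point, so that the three-dimensional metric is the unit matrix; the function names and the sample tensor used to exercise them are hypothetical. It builds the fourth-rank tensor from a symmetric second-rank tensor and contracts it back.

```python
def curvature_from_ricci(P):
    """P_{abcd} built from a symmetric P_{ab} by the final formula of Problem 1,
    with gamma_{ab} = delta_{ab} (coordinates cartesian at the point)."""
    d = [[1.0 if a == b else 0.0 for b in range(3)] for a in range(3)]
    trace = sum(P[a][a] for a in range(3))
    R = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for a in range(3):
        for b in range(3):
            for c in range(3):
                for e in range(3):
                    R[a][b][c][e] = (P[a][c] * d[b][e] - P[a][e] * d[b][c]
                                     + P[b][e] * d[a][c] - P[b][c] * d[a][e]
                                     + 0.5 * trace * (d[a][e] * d[b][c] - d[a][c] * d[b][e]))
    return R

def contract(R):
    """P_{ab} = gamma^{cd} P_{c a d b}, again with gamma^{cd} = delta^{cd}."""
    return [[sum(R[c][a][c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]
```

Contracting the constructed tensor returns the original second-rank tensor, and the constructed tensor has the symmetries (92.5) and (92.6).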
2. Calculate the components of the tensors $R_{iklm}$ and $R_{ik}$ for a metric in which $g_{ik} = 0$ for $i \neq k$
(B. K. Harrison, 1960).
Solution: We represent the nonzero components of the metric tensor in the form
$$g_{ii} = e_i e^{2F_i}, \qquad e_0 = 1, \quad e_\alpha = -1.$$
The calculation according to formula (92.4) gives the following expressions for the nonzero
components of the curvature tensor:
$$R_{lilk} = e_l e^{2F_l}\left[F_{l,k}F_{k,i} + F_{i,k}F_{l,i} - F_{l,i}F_{l,k} - F_{l,i,k}\right], \qquad i \neq k \neq l;$$
$$R_{lili} = e_l e^{2F_l}\left(F_{i,i}F_{l,i} - F_{l,i}^2 - F_{l,i,i}\right) + e_i e^{2F_i}\left(F_{l,l}F_{i,l} - F_{i,l}^2 - F_{i,l,l}\right) - e_l e^{2F_l}\sum_{m \neq i,l} e_i e_m e^{2(F_i - F_m)}F_{i,m}F_{l,m}, \qquad i \neq l$$
(no summation over repeated indices!). The subscripts preceded by a comma denote ordinary
differentiation with respect to the corresponding coordinate.
Contracting the curvature tensor on two indices, we obtain:
$$R_{ik} = \sum_{l \neq i,k}\left(F_{l,k}F_{k,i} + F_{i,k}F_{l,i} - F_{l,i}F_{l,k} - F_{l,i,k}\right), \qquad i \neq k;$$
$$R_{ii} = \sum_{l \neq i}\left[F_{i,i}F_{l,i} - F_{l,i}^2 - F_{l,i,i} + e_i e_l e^{2(F_i - F_l)}\left(F_{l,l}F_{i,l} - F_{i,l}^2 - F_{i,l,l} - F_{i,l}\sum_{m \neq i,l}F_{m,l}\right)\right].$$
3. Consider the possible types of canonical forms of the curvature tensor when $R_{ik} = 0$.
Solution: We shall assume that the metric at the given point in four-dimensional space has been
brought to galilean form. We write the set of twenty independent components of the tensor $R_{iklm}$
as a collection of three three-dimensional tensors defined as follows:
$$A_{\alpha\beta} = R_{0\alpha 0\beta}, \qquad C_{\alpha\beta} = \tfrac{1}{4}e_{\alpha\gamma\delta}e_{\beta\lambda\mu}R_{\gamma\delta\lambda\mu}, \qquad B_{\alpha\beta} = \tfrac{1}{2}e_{\alpha\gamma\delta}R_{0\beta\gamma\delta} \tag{1}$$
($e_{\alpha\beta\gamma}$ is the unit antisymmetric tensor; since the three-dimensional metric is cartesian, there is no
need to deal with the difference between upper and lower indices in the summation). The tensors
$A_{\alpha\beta}$ and $C_{\alpha\beta}$ are symmetric by definition; the tensor $B_{\alpha\beta}$ is asymmetric, while its trace is zero
because of (92.16). According to the definitions (1) we have, for example,
$$B_{11} = R_{0123}, \quad B_{21} = R_{0131}, \quad B_{31} = R_{0112}, \quad C_{11} = R_{2323}, \ \ldots$$
It is easy to see that the conditions $R_{km} = g^{il}R_{iklm} = 0$ are equivalent to the following relations
between the components of the tensors (1):
$$A_{\alpha\alpha} = 0, \qquad B_{\alpha\beta} = B_{\beta\alpha}, \qquad A_{\alpha\beta} = -C_{\alpha\beta}. \tag{2}$$
We also introduce the symmetric complex tensor
$$D_{\alpha\beta} = \tfrac{1}{2}(A_{\alpha\beta} + 2iB_{\alpha\beta} - C_{\alpha\beta}) = A_{\alpha\beta} + iB_{\alpha\beta}. \tag{3}$$
This combining of the two real three-dimensional tensors $A_{\alpha\beta}$ and $B_{\alpha\beta}$ into one complex tensor
corresponds precisely to the combination (in § 25) of the two vectors E and H into the complex
vector F, while the resulting relation between $D_{\alpha\beta}$ and the four-tensor $R_{iklm}$ corresponds to the
relation between F and the four-tensor $F_{ik}$. It then follows that four-dimensional transformations
of the tensor $R_{iklm}$ are equivalent to three-dimensional complex rotations carried out on the tensor
$D_{\alpha\beta}$.
With respect to these rotations one can define eigenvalues $\lambda = \lambda' + i\lambda''$ and eigenvectors $n_\alpha$
(complex, in general) as solutions of the system of equations
$$D_{\alpha\beta}n_\beta = \lambda n_\alpha. \tag{4}$$
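The quantities λ are the invariants of the curvature tensor. Since the trace $D_{\alpha\alpha} = 0$, the sum of the
roots of equation (4) is zero:
$$\lambda^{(1)} + \lambda^{(2)} + \lambda^{(3)} = 0.$$
Depending on the number of independent eigenvectors $n_\alpha$, we arrive at the following classification
of possible cases of reduction of the curvature tensor to the canonical Petrov types I-III.
(I) There are three independent eigenvectors. Then their squares $n_\alpha n_\alpha$ are different from zero
and by a suitable rotation we can bring the tensor $D_{\alpha\beta}$, and with it $A_{\alpha\beta}$ and $B_{\alpha\beta}$, to diagonal form:
$$A_{\alpha\beta} = \begin{pmatrix} \lambda^{(1)\prime} & 0 & 0 \\ 0 & \lambda^{(2)\prime} & 0 \\ 0 & 0 & -\lambda^{(1)\prime} - \lambda^{(2)\prime} \end{pmatrix}, \qquad
B_{\alpha\beta} = \begin{pmatrix} \lambda^{(1)\prime\prime} & 0 & 0 \\ 0 & \lambda^{(2)\prime\prime} & 0 \\ 0 & 0 & -\lambda^{(1)\prime\prime} - \lambda^{(2)\prime\prime} \end{pmatrix}. \tag{I}$$
In this case the curvature tensor has four independent invariants.†
The complex invariants $\lambda^{(1)}$, $\lambda^{(2)}$ are expressed algebraically in terms of the complex scalars
$$I_1 = \frac{1}{48}\left(R_{iklm}R^{iklm} - iR_{iklm}\overset{*}{R}{}^{iklm}\right),$$
$$I_2 = \frac{1}{96}\left(R_{iklm}R^{lmpr}R_{pr}{}^{ik} + iR_{iklm}R^{lmpr}\overset{*}{R}_{pr}{}^{ik}\right),$$
where the asterisk over a symbol denotes the dual tensor:
$$\overset{*}{R}_{iklm} = \tfrac{1}{2}E_{ikpr}R^{pr}{}_{lm}.$$
Calculating $I_1$ and $I_2$ using (I), we obtain:
$$I_1 = \tfrac{1}{3}\left(\lambda^{(1)2} + \lambda^{(2)2} + \lambda^{(1)}\lambda^{(2)}\right), \qquad I_2 = \tfrac{1}{2}\lambda^{(1)}\lambda^{(2)}\left(\lambda^{(1)} + \lambda^{(2)}\right). \tag{5}$$
These formulas enable us to calculate $\lambda^{(1)}$, $\lambda^{(2)}$ starting from the values of $R_{iklm}$ in any reference
system.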
(II) There are two independent eigenvectors. The square of one of them is then equal to zero, so
that it cannot be chosen as the direction of one of the coordinate axes. One can, however, take it
to lie in the $x^1$, $x^2$ plane; then $n_2 = in_1$, $n_3 = 0$. The corresponding equations (4) give:
$$D_{11} + iD_{12} = \lambda, \qquad D_{22} - iD_{12} = \lambda,$$
so that
$$D_{11} = \lambda - i\mu, \qquad D_{22} = \lambda + i\mu, \qquad D_{12} = \mu.$$
The complex quantity $\lambda = \lambda' + i\lambda''$ is a scalar and cannot be changed. But the quantity μ can be
given any nonzero value by a suitable complex rotation; we can therefore, without loss of generality,
assume it to be real. As a result we get the following canonical type for the real tensors $A_{\alpha\beta}$ and
$B_{\alpha\beta}$:
$$A_{\alpha\beta} = \begin{pmatrix} \lambda' & \mu & 0 \\ \mu & \lambda' & 0 \\ 0 & 0 & -2\lambda' \end{pmatrix}, \qquad
B_{\alpha\beta} = \begin{pmatrix} \lambda'' - \mu & 0 & 0 \\ 0 & \lambda'' + \mu & 0 \\ 0 & 0 & -2\lambda'' \end{pmatrix}. \tag{II}$$
In this case there are just two invariants λ' and λ''. Then, in accordance with (5), $I_1 = \lambda^2$, $I_2 = \lambda^3$,
so that $I_1^3 = I_2^2$.
(III) There is just one eigenvector, and its square is zero. All the eigenvalues λ are then identical
and consequently equal to zero. The solutions of equations (4) can be brought to the form
$D_{11} = D_{22} = D_{12} = 0$, $D_{13} = \mu$, $D_{23} = i\mu$, so that
$$A_{\alpha\beta} = \begin{pmatrix} 0 & 0 & \mu \\ 0 & 0 & 0 \\ \mu & 0 & 0 \end{pmatrix}, \qquad
B_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \mu \\ 0 & \mu & 0 \end{pmatrix}. \tag{III}$$
In this case the curvature tensor has no invariants at all and we have a peculiar situation: the
four-space is curved, but there are no invariants which could be used as a measure of its curvature. (The
same situation occurs in the degenerate case (II) when λ' = λ'' = 0; this case is called type N.)
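The statements made for type II can be checked with a few lines of complex arithmetic. The sketch below is an illustration only: the function names and the numerical values chosen for λ and μ are hypothetical, and the invariants are written as symmetric functions of two eigenvalues in the manner of formula (5).

```python
def type_II_tensor(lam, mu):
    """Complex tensor D = A + iB for the canonical form (II); lam complex, mu real."""
    return [[lam - 1j * mu, mu, 0],
            [mu, lam + 1j * mu, 0],
            [0, 0, -2 * lam]]

def apply_tensor(D, n):
    """Components D_{ab} n_b (cartesian three-dimensional metric)."""
    return [sum(D[a][b] * n[b] for b in range(3)) for a in range(3)]

def invariants(l1, l2):
    """I_1 and I_2 expressed through two of the three eigenvalues."""
    I1 = (l1 * l1 + l2 * l2 + l1 * l2) / 3
    I2 = 0.5 * l1 * l2 * (l1 + l2)
    return I1, I2

if __name__ == "__main__":
    lam, mu = 0.7 + 0.2j, 1.5
    n = [1, 1j, 0]                      # eigenvector with n_2 = i n_1, n_3 = 0
    print(apply_tensor(type_II_tensor(lam, mu), n))
    print(invariants(lam, lam))
```

The vector n = (1, i, 0) has zero square and satisfies equation (4) with eigenvalue λ for any μ, and with two coinciding eigenvalues the invariants satisfy I₁³ = I₂².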
† The degenerate case when $\lambda^{(1)\prime} = \lambda^{(2)\prime}$, $\lambda^{(1)\prime\prime} = \lambda^{(2)\prime\prime}$ is called Type D in the literature.
§ 93. The action function for the gravitational field
To arrive at the equations determining the gravitational field, it is necessary first to
determine the action $S_g$ for this field. The required equations can then be obtained by varying
the sum of the actions of field plus material particles.
Just as for the electromagnetic field, the action $S_g$ must be expressed in terms of a scalar
integral $\int G\sqrt{-g}\,d\Omega$, taken over all space and over the time coordinate $x^0$ between two
given values. To determine this scalar we shall start from the fact that the equations of the
gravitational field must contain derivatives of the "potentials" no higher than the second
(just as is the case for the electromagnetic field). Since the field equations are obtained by
varying the action, it is necessary that the integrand G contain derivatives of $g_{ik}$ no higher
than first order; thus G must contain only the tensor $g_{ik}$ and the quantities $\Gamma^i_{kl}$.
However, it is impossible to construct an invariant from the quantities $g_{ik}$ and $\Gamma^i_{kl}$ alone.
This is immediately clear from the fact that by a suitable choice of coordinate system we can
always make all the quantities $\Gamma^i_{kl}$ zero at a given point. There is, however, the scalar R
(the curvature of the four-space), which though it contains in addition to the $g_{ik}$ and its
first derivatives also the second derivatives of $g_{ik}$, is linear in the second derivatives. Because
of this linearity, the invariant integral $\int R\sqrt{-g}\,d\Omega$ can be transformed by means of Gauss'
theorem to the integral of an expression not containing the second derivatives. Namely,
$\int R\sqrt{-g}\,d\Omega$ can be presented in the form
$$\int R\sqrt{-g}\,d\Omega = \int G\sqrt{-g}\,d\Omega + \int \frac{\partial(\sqrt{-g}\,w^i)}{\partial x^i}\,d\Omega,$$
where G contains only the tensor $g_{ik}$ and its first derivatives, and the integrand of the second
integral has the form of a divergence of a certain quantity $w^i$ (the detailed calculation is
given at the end of this section). According to Gauss' theorem, this second integral can be
transformed into an integral over a hypersurface surrounding the four-volume over which
the integration is carried out in the other two integrals. When we vary the action, the
variation of the second term on the right vanishes, since in the principle of least action, the
variations of the field at the limits of the region of integration are zero. Consequently, we
may write
$$\delta\int R\sqrt{-g}\,d\Omega = \delta\int G\sqrt{-g}\,d\Omega.$$
The left side is a scalar; therefore the expression on the right is also a scalar (the quantity G
itself is, of course, not a scalar).
The quantity G satisfies the condition imposed above, since it contains only the $g_{ik}$ and
its derivatives. Thus we may write
$$\delta S_g = -\frac{c^3}{16\pi k}\,\delta\int G\sqrt{-g}\,d\Omega = -\frac{c^3}{16\pi k}\,\delta\int R\sqrt{-g}\,d\Omega, \tag{93.1}$$
where k is a new universal constant. Just as was done for the action of the electromagnetic
field in § 27, we can see that the constant k must be positive (see the end of this section).
The constant k is called the gravitational constant. The dimensions of k follow from (93.1).
The action has dimensions gm·cm²·sec⁻¹; all the coordinates have the dimensions cm,
the $g_{ik}$ are dimensionless, and so R has dimensions cm⁻². As a result, we find that k has
the dimensions cm³·gm⁻¹·sec⁻². Its numerical value is
$$k = 6.67 \times 10^{-8}\ \text{cm}^3\cdot\text{gm}^{-1}\cdot\text{sec}^{-2}. \tag{93.2}$$
We note that we could have set k equal to unity (or any other dimensionless constant).
However, this would fix the unit of mass.†
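The numbers quoted in the footnote to this remark follow from (93.2) by simple arithmetic. The sketch below assumes a value for the speed of light, which is not quoted in this section, and uses hypothetical function names.

```python
import math

K_GRAV = 6.67e-8    # gravitational constant k in cm^3 gm^-1 sec^-2, from (93.2)
C_LIGHT = 2.998e10  # speed of light in cm/sec (assumed value, not given in this section)

def mass_of_one_cm():
    """Mass in grams that corresponds to 1 cm when one sets k = c^2."""
    return C_LIGHT ** 2 / K_GRAV

def einstein_constant():
    """Einstein gravitational constant 8*pi*k/c^2 in cm/gm."""
    return 8.0 * math.pi * K_GRAV / C_LIGHT ** 2

if __name__ == "__main__":
    print(mass_of_one_cm())      # about 1.35e28 gm
    print(einstein_constant())   # about 1.86e-27 cm/gm
```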
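Finally, let us calculate the quantity G of (93.1). From the expression (92.10) for $R_{ik}$, we
have
$$\sqrt{-g}\,R = \sqrt{-g}\,g^{ik}R_{ik} = \sqrt{-g}\left\{g^{ik}\frac{\partial\Gamma^l_{ik}}{\partial x^l} - g^{ik}\frac{\partial\Gamma^l_{il}}{\partial x^k} + g^{ik}\Gamma^l_{ik}\Gamma^m_{lm} - g^{ik}\Gamma^m_{il}\Gamma^l_{km}\right\}.$$
In the first two terms on the right, we have
$$\sqrt{-g}\,g^{ik}\frac{\partial\Gamma^l_{ik}}{\partial x^l} = \frac{\partial}{\partial x^l}\left(\sqrt{-g}\,g^{ik}\Gamma^l_{ik}\right) - \Gamma^l_{ik}\frac{\partial}{\partial x^l}\left(\sqrt{-g}\,g^{ik}\right),$$
$$\sqrt{-g}\,g^{ik}\frac{\partial\Gamma^l_{il}}{\partial x^k} = \frac{\partial}{\partial x^k}\left(\sqrt{-g}\,g^{ik}\Gamma^l_{il}\right) - \Gamma^l_{il}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,g^{ik}\right).$$
Dropping the total derivatives, we find
$$\sqrt{-g}\,G = \Gamma^m_{im}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,g^{ik}\right) - \Gamma^l_{ik}\frac{\partial}{\partial x^l}\left(\sqrt{-g}\,g^{ik}\right) - \left(\Gamma^m_{il}\Gamma^l_{km} - \Gamma^l_{ik}\Gamma^m_{lm}\right)g^{ik}\sqrt{-g}.$$
With the aid of formulas (86.5)-(86.8), we find that the first two terms on the right are
equal to $\sqrt{-g}$ multiplied by
$$2\Gamma^l_{ik}\Gamma^i_{lm}g^{mk} - \Gamma^m_{im}\Gamma^i_{kl}g^{kl} - \Gamma^l_{ik}\Gamma^m_{lm}g^{ik} = 2g^{ik}\left(\Gamma^m_{il}\Gamma^l_{km} - \Gamma^l_{ik}\Gamma^m_{lm}\right).$$
Finally, we have
$$G = g^{ik}\left(\Gamma^m_{il}\Gamma^l_{km} - \Gamma^l_{ik}\Gamma^m_{lm}\right). \tag{93.3}$$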
† If one sets k = c², the mass is measured in cm, where 1 cm = 1.35 × 10²⁸ gm. Sometimes one uses in
place of k the quantity
$$\varkappa = \frac{8\pi k}{c^2} = 1.86 \times 10^{-27}\ \text{cm}\cdot\text{gm}^{-1},$$
which is called the Einstein gravitational constant.

The components of the metric tensor are the quantities which determine the gravitational
field. Therefore in the principle of least action for the gravitational field it is the quantities
$g_{ik}$ which are subjected to variation. However, it is necessary here to make the following
fundamental reservation. Namely, we cannot claim now that in an actually realizable field
the action integral has a minimum (and not just an extremum) with respect to all possible
variations of the $g_{ik}$. This is related to the fact that not every change in the $g_{ik}$ is associated
with a change in the space-time metric, i.e. with a real change in the gravitational field.
The components $g_{ik}$ also change under a simple transformation of coordinates connected
merely with the shift from one system to another in one and the same space-time. Each such
coordinate transformation is generally an aggregate of four independent transformations.
In order to exclude such changes in $g_{ik}$ which are not associated with a change in the metric,
we can impose four auxiliary conditions and require the fulfillment of these conditions under
the variation. Thus, when the principle of least action is applied to a gravitational field, we
can assert only that we can impose auxiliary conditions on the $g_{ik}$, such that when they are
fulfilled the action has a minimum with respect to variations of the $g_{ik}$.†
Keeping these remarks in mind, we now show that the gravitational constant must be
positive. As the four auxiliary conditions mentioned, we use the vanishing of the three
components $g_{0\alpha}$, and the constancy of the determinant $|g_{\alpha\beta}|$ made up from the components
$g_{\alpha\beta}$:
$$g_{0\alpha} = 0, \qquad |g_{\alpha\beta}| = \text{const};$$
from the last of these conditions we have
$$g^{\alpha\beta}\frac{\partial g_{\alpha\beta}}{\partial x^0} = \frac{\partial}{\partial x^0}\ln|g_{\alpha\beta}| = 0.$$
We are here interested in those terms in the integrand of the expression for the action which
contain derivatives of $g_{ik}$ with respect to $x^0$ (cf. p. 68). A simple calculation using (93.3)
shows that these terms in G are
$$-\frac{1}{4}\,g^{\alpha\beta}g^{\gamma\delta}g^{00}\,\frac{\partial g_{\alpha\gamma}}{\partial x^0}\frac{\partial g_{\beta\delta}}{\partial x^0}.$$
It is easy to see that this quantity is essentially negative. Namely, choosing a spatial system
of coordinates which is cartesian at a given point at a given moment of time (so that
$g_{\alpha\beta} = g^{\alpha\beta} = -\delta_{\alpha\beta}$), we obtain:
$$-\frac{1}{4}\,g^{00}\left(\frac{\partial g_{\alpha\beta}}{\partial x^0}\right)^2,$$
and, since $g^{00} = 1/g_{00} > 0$, the sign of the quantity is obvious.
By a sufficiently rapid change of the components $g_{\alpha\beta}$ with the time $x^0$ (within the time
interval between the limits of integration of $x^0$) the quantity −G can consequently be made
as large as one likes. If the constant k were negative, the action would then decrease without
limit (taking on negative values of arbitrarily large absolute magnitude), that is, there could
be no minimum.
§ 94. The energy-momentum tensor
In § 32 the general rule was given for calculating the energy-momentum tensor of any
physical system whose action is given in the form of an integral (32.1) over four-space. In
curvilinear coordinates this integral must be written in the form
$$S = \frac{1}{c}\int \Lambda\sqrt{-g}\,d\Omega \tag{94.1}$$
(in galilean coordinates g = −1, and S goes over into $\int\Lambda\,dV\,dt$). The integration extends
over all the three-dimensional space and over the time between two given moments, i.e., over
the infinite region of four-space contained between two hypersurfaces.
† We must emphasize, however, that everything we have said has no effect on the derivation of the field
equations from the principle of least action (§ 95). These equations are already obtained as a result of the
requirement that the action be an extremum (i.e., vanishing of the first derivative), and not necessarily a
minimum. Therefore in deriving them we can vary all of the $g_{ik}$ independently.

As already discussed in § 32, the energy-momentum tensor, calculated from the formula
(32.5), is generally not symmetric, as it should be. In order to symmetrize it, we had to add
to (32.5) suitable terms of the form $(\partial/\partial x^l)\psi_{ikl}$, where $\psi_{ikl} = -\psi_{ilk}$. We shall now give
another method of calculating the energy-momentum tensor which has the advantage of
leading at once to the correct expression.
In (94.1) we carry out a transformation from the coordinates $x^i$ to the coordinates
$x'^i = x^i + \xi^i$, where the $\xi^i$ are small quantities. Under this transformation the $g^{ik}$ are
transformed according to the formulas:
$$g'^{ik}(x'^l) = g^{lm}(x^l)\frac{\partial x'^i}{\partial x^l}\frac{\partial x'^k}{\partial x^m} = g^{lm}\left(\delta^i_l + \frac{\partial\xi^i}{\partial x^l}\right)\left(\delta^k_m + \frac{\partial\xi^k}{\partial x^m}\right)
\approx g^{ik}(x^l) + g^{im}\frac{\partial\xi^k}{\partial x^m} + g^{kl}\frac{\partial\xi^i}{\partial x^l}.$$
Here the tensor $g'^{ik}$ is a function of the $x'^l$, while the tensor $g^{ik}$ is a function of the original
coordinates $x^l$. In order to represent all terms as functions of one and the same variables, we
expand $g'^{ik}(x^l + \xi^l)$ in powers of $\xi^l$. Furthermore, if we neglect terms of higher order in $\xi^l$,
we can, in all terms containing $\xi^l$, replace $g'^{ik}$ by $g^{ik}$. Thus we find
$$g'^{ik}(x^l) = g^{ik}(x^l) - \xi^l\frac{\partial g^{ik}}{\partial x^l} + g^{il}\frac{\partial\xi^k}{\partial x^l} + g^{kl}\frac{\partial\xi^i}{\partial x^l}.$$
It is easy to verify by direct trial that the last three terms on the right can be written as a
sum $\xi^{i;k} + \xi^{k;i}$ of contravariant derivatives of the $\xi^i$. Thus we finally obtain the
transformation of the $g^{ik}$ in the form
$$g'^{ik} = g^{ik} + \delta g^{ik}, \qquad \delta g^{ik} = \xi^{i;k} + \xi^{k;i}. \tag{94.2}$$
For the covariant components, we have:
$$g'_{ik} = g_{ik} + \delta g_{ik}, \qquad \delta g_{ik} = -\xi_{i;k} - \xi_{k;i} \tag{94.3}$$
(so that, to terms of first order, we satisfy the condition $g'_{il}g'^{kl} = \delta^k_i$).†
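A simple numerical illustration of (94.2) is possible in a flat space. The sketch below makes several assumptions not found in the text: a two-dimensional euclidean metric in cartesian coordinates (so that covariant derivatives reduce to ordinary ones and index position does not matter), central differences for the derivatives, and two hypothetical displacement fields, an infinitesimal rotation and an infinitesimal dilation.

```python
def delta_g(xi, x, h=1e-6):
    """delta g^{ik} = d xi^i / d x^k + d xi^k / d x^i at the point x,
    for a euclidean metric in cartesian coordinates."""
    n = len(x)
    grad = [[0.0] * n for _ in range(n)]  # grad[i][k] = d xi^i / d x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        fp, fm = xi(xp), xi(xm)
        for i in range(n):
            grad[i][k] = (fp[i] - fm[i]) / (2 * h)
    return [[grad[i][k] + grad[k][i] for k in range(n)] for i in range(n)]

def rotation(x, eps=1e-3):
    """Infinitesimal rotation in the plane."""
    return [-eps * x[1], eps * x[0]]

def dilation(x, eps=1e-3):
    """Infinitesimal uniform stretching."""
    return [eps * x[0], eps * x[1]]
```

The rotation leaves the metric unchanged, since the symmetrized derivatives of its displacement field vanish, while the dilation changes the diagonal components by 2ε.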
Since the action S is a scalar, it does not change under a transformation of coordinates.
On the other hand, the change δS in the action under a transformation of coordinates can
be written in the following form. As in § 32, let q denote the quantities defining the physical
system to which the action S applies. Under coordinate transformation the quantities q
change by δq. In calculating δS we need not write terms containing the changes in q. All
such terms must cancel each other by virtue of the "equations of motion" of the physical
system, since these equations are obtained by equating to zero the variation of S with
respect to the quantities q. Therefore it is sufficient to write the terms associated with changes
in the $g^{ik}$. Using Gauss' theorem, and setting $\delta g^{ik} = 0$ at the integration limits, we find δS
in the form‡
† We note that the equations
$$\xi^{i;k} + \xi^{k;i} = 0$$
determine the infinitesimal coordinate transformations that do not change the metric. In the literature these
are often called the Killing equations.
‡ It is necessary to emphasize that the notation of differentiation with respect to the components of the
symmetric tensor $g_{ik}$, which we introduce here, has in a certain sense a symbolic character. Namely, the
derivative $\partial F/\partial g_{ik}$ (F is some function of the $g_{ik}$) actually has a meaning only as the expression of the fact
that $dF = (\partial F/\partial g_{ik})\,dg_{ik}$. But in the sum $(\partial F/\partial g_{ik})\,dg_{ik}$, the terms with differentials $dg_{ik}$ of components with
$i \neq k$ appear twice. Therefore in differentiating the actual expression for F with respect to any definite
component $g_{ik}$ with $i \neq k$, we would obtain a value which is twice as large as that which we denote by
$\partial F/\partial g_{ik}$. This remark must be kept in mind if we assign definite values to the indices i, k in formulas in which
the derivatives with respect to $g_{ik}$ appear.
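$$\delta S = \frac{1}{c}\int\left\{\frac{\partial\sqrt{-g}\,\Lambda}{\partial g^{ik}}\,\delta g^{ik} + \frac{\partial\sqrt{-g}\,\Lambda}{\partial\dfrac{\partial g^{ik}}{\partial x^l}}\,\delta\frac{\partial g^{ik}}{\partial x^l}\right\}d\Omega
= \frac{1}{c}\int\left\{\frac{\partial\sqrt{-g}\,\Lambda}{\partial g^{ik}} - \frac{\partial}{\partial x^l}\frac{\partial\sqrt{-g}\,\Lambda}{\partial\dfrac{\partial g^{ik}}{\partial x^l}}\right\}\delta g^{ik}\,d\Omega.$$
Here we introduce the notation
$$\frac{1}{2}\sqrt{-g}\,T_{ik} = \frac{\partial\sqrt{-g}\,\Lambda}{\partial g^{ik}} - \frac{\partial}{\partial x^l}\frac{\partial\sqrt{-g}\,\Lambda}{\partial\dfrac{\partial g^{ik}}{\partial x^l}}. \tag{94.4}$$
Then δS takes the form†
$$\delta S = \frac{1}{2c}\int T_{ik}\,\delta g^{ik}\sqrt{-g}\,d\Omega = -\frac{1}{2c}\int T^{ik}\,\delta g_{ik}\sqrt{-g}\,d\Omega \tag{94.5}$$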
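(note that $g^{ik}\delta g_{lk} = -g_{lk}\delta g^{ik}$, and therefore $T^{ik}\delta g_{ik} = -T_{ik}\delta g^{ik}$). Substituting for $\delta g^{ik}$
the expression (94.2), we have, making use of the symmetry of the tensor $T_{ik}$,
$$\delta S = \frac{1}{2c}\int T_{ik}\left(\xi^{i;k} + \xi^{k;i}\right)\sqrt{-g}\,d\Omega = \frac{1}{c}\int T_{ik}\,\xi^{i;k}\sqrt{-g}\,d\Omega.$$
Furthermore, we transform this expression in the following way:
$$\delta S = \frac{1}{c}\int\left(T^k_i\xi^i\right)_{;k}\sqrt{-g}\,d\Omega - \frac{1}{c}\int T^k_{i;k}\,\xi^i\sqrt{-g}\,d\Omega. \tag{94.6}$$
Using (86.9), the first integral can be written in the form
$$\frac{1}{c}\int\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,T^k_i\xi^i\right)d\Omega,$$
and transformed into an integral over a hypersurface. Since the $\xi^i$ vanish at the limits of
integration, this integral drops out.
Thus, equating δS to zero, we find
$$\delta S = -\frac{1}{c}\int T^k_{i;k}\,\xi^i\sqrt{-g}\,d\Omega = 0.$$
Because of the arbitrariness of the $\xi^i$ it then follows that
$$T^k_{i;k} = 0. \tag{94.7}$$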
Comparing this with equation (32.4), $\partial T_{ik}/\partial x_k = 0$, valid in galilean coordinates, we see that
the tensor $T_{ik}$, defined by formula (94.4), must be identical with the energy-momentum
tensor, at least to within a constant factor. It is easy to verify, carrying out, for example,
the calculation from formula (94.4) for the electromagnetic field
$$\left(\Lambda = -\frac{1}{16\pi}F_{ik}F^{ik} = -\frac{1}{16\pi}F_{ik}F_{lm}g^{il}g^{km}\right),$$
that this factor is equal to unity.
† In the case we are considering, the ten quantities $\delta g_{ik}$ are not independent, since they are the result of a
transformation of the coordinates, of which there are only four. Therefore from the vanishing of δS it does
not follow that $T_{ik} = 0$!
Thus, formula (94.4) enables us to calculate the energy-momentum tensor by differentiating
the function $\Lambda$ with respect to the components of the metric tensor (and their
derivatives). The tensor $T_{ik}$ obtained in this way is symmetric. Formula (94.4) is convenient
for calculating the energy-momentum tensor not only in the case of the presence of a gravitational
field, but also in its absence, in which case the metric tensor has no independent
significance and the transition to curvilinear coordinates occurs formally as an intermediate
step in the calculation of $T_{ik}$.
The expression (33.1) for the energy-momentum tensor of the electromagnetic field must
be written in curvilinear coordinates in the form
$$T_{ik} = \frac{1}{4\pi}\left(-F_{il}F_k{}^{l} + \frac{1}{4}F_{lm}F^{lm}\,g_{ik}\right). \tag{94.8}$$
For a macroscopic body the energy-momentum tensor is
$$T_{ik} = (p+\varepsilon)\,u_i u_k - p\,g_{ik}. \tag{94.9}$$
We note that the quantity $T_{00}$ is always positive:†
$$T_{00} > 0. \tag{94.10}$$
(No general statement can be made about the mixed component $T_0^0$.)
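As a rough numerical illustration of (94.10), the sketch below evaluates $T_{00}$ from (94.9) in the special case of galilean coordinates, where $g_{00} = 1$ and $u_0 = 1/\sqrt{1 - v^2/c^2}$. The function name `t00_flat` and the sample values of $\varepsilon$, $p$ and $v/c$ are illustrative assumptions, not taken from the text.

```python
import math

def t00_flat(eps, p, v_over_c):
    """T_00 from (94.9) in galilean coordinates (assumed: g_00 = 1, u_0 = gamma)."""
    gamma = 1.0 / math.sqrt(1.0 - v_over_c ** 2)
    u0 = gamma
    return (p + eps) * u0 * u0 - p  # (p + eps) u_0 u_0 - p g_00

# Sample values only: eps = 1, p = eps / 3, several velocities below c.
for beta in (0.0, 0.5, 0.9, 0.99):
    print(beta, t00_flat(eps=1.0, p=1.0 / 3.0, v_over_c=beta))
```

In this flat-space case $T_{00} = \varepsilon u_0^2 + p(u_0^2 - 1)$, so for $\varepsilon > 0$, $p \geq 0$ the value never falls below $\varepsilon$.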
PROBLEM
Consider the possible cases of reduction to canonical form of a symmetric tensor of second rank
in a pseudo-euclidean space.
Solution: The reduction of a symmetric tensor $A_{ik}$ to principal axes means that we find "eigenvectors"
$n^i$ for which
$$A_{ik}\,n^k = \lambda\,n_i. \tag{1}$$
The corresponding principal (or "proper") values $\lambda$ are obtained from the condition for consistency
of equation (1), i.e. as the roots of the fourth degree equation
$$|A_{ik} - \lambda g_{ik}| = 0, \tag{2}$$
and are invariants of the tensor. Both the quantities $\lambda$ and the eigenvectors corresponding to them
may be complex. (The components of the tensor $A_{ik}$ itself are of course assumed to be real.)
From equation (1) it is easily shown in the usual fashion that two vectors $n_i^{(1)}$ and $n_i^{(2)}$ which
correspond to different principal values $\lambda^{(1)}$ and $\lambda^{(2)}$ are "mutually perpendicular":
$$n_i^{(1)}\,n^{(2)i} = 0. \tag{3}$$
In particular, if equation (2) has complex-conjugate roots $\lambda$ and $\lambda^*$, to which there correspond the
complex-conjugate vectors $n_i$ and $n_i^*$, then we must have
$$n_i\,n^{i*} = 0. \tag{4}$$
The tensor $A_{ik}$ is expressed in terms of its principal values and the corresponding eigenvectors by
the formula
$$A_{ik} = \sum \lambda\,\frac{n_i n_k}{n_l n^l} \tag{5}$$
(so long as none of the quantities $n_l n^l$ is equal to zero; cf. below).
† We have $T_{00} = \varepsilon u_0^2 + p\,(u_0^2 - g_{00})$. The first term is always positive. In the second term we write
$$u_0 = g_{00}u^0 + g_{0\alpha}u^\alpha = \frac{g_{00}\,dx^0 + g_{0\alpha}\,dx^\alpha}{ds}$$
and obtain after a simple transformation $g_{00}\,p\,(dl/ds)^2$, where $dl$ is the element of spatial distance (84.6);
from this it is clear that the second term of $T_{00}$ is also positive. The same result can also be shown for the
tensor (94.8).
Depending on the character of the roots of equation (2), the following three situations may occur.
(a) All four eigenvalues $\lambda$ are real. Then the vectors $n_i$ are also real, and since they are mutually
perpendicular, three of them must have spacelike directions and one a timelike direction (and are
normalized by the conditions $n_l n^l = -1$ and $n_l n^l = 1$, respectively). Choosing the directions of the
coordinates along these vectors, we bring the tensor $A_{ik}$ to the form
$$A_{ik} = \begin{pmatrix} \lambda^{(0)} & 0 & 0 & 0 \\ 0 & -\lambda^{(1)} & 0 & 0 \\ 0 & 0 & -\lambda^{(2)} & 0 \\ 0 & 0 & 0 & -\lambda^{(3)} \end{pmatrix}. \tag{6}$$
(b) Equation (2) has two real roots ($\lambda^{(2)}$, $\lambda^{(3)}$) and two complex-conjugate roots ($\lambda' \pm i\lambda''$). We
write the complex-conjugate vectors $n_i$, $n_i^*$, corresponding to the last two roots, in the form $a_i \pm i b_i$;
since they are defined only to within an arbitrary complex factor, we can normalize them by the
condition $n_i n^i = n_i^* n^{i*} = 1$. Also using equation (4), we find
$$a_i a^i + b_i b^i = 0, \qquad a_i b^i = 0, \qquad a_i a^i - b_i b^i = 1,$$
so that
$$a_i a^i = \tfrac{1}{2}, \qquad b_i b^i = -\tfrac{1}{2},$$
i.e. one of these vectors must be spacelike and the other timelike.† Choosing the coordinate axes
along the vectors $a^i$, $b^i$, $n^{(2)i}$, $n^{(3)i}$, we bring the tensor to the form:
$$A_{ik} = \begin{pmatrix} \lambda' & \lambda'' & 0 & 0 \\ \lambda'' & -\lambda' & 0 & 0 \\ 0 & 0 & -\lambda^{(2)} & 0 \\ 0 & 0 & 0 & -\lambda^{(3)} \end{pmatrix}. \tag{7}$$
(c) If the square of one of the vectors $n^i$ is equal to zero ($n_l n^l = 0$), then this vector cannot be
chosen as the direction of a coordinate axis. We can however choose one of the planes $x^0$, $x^\alpha$ so
that the vector $n^i$ lies in it. Suppose this is the $x^0$, $x^1$ plane; then it follows from $n_l n^l = 0$ that
$n^0 = n^1$, and from equation (1) we have $A_{00} + A_{01} = \lambda$, $A_{10} + A_{11} = -\lambda$, so that $A_{11} = -\lambda + \mu$,
$A_{00} = \lambda + \mu$, $A_{01} = -\mu$, where $\mu$ is a quantity which is not invariant but changes under rotations
in the $x^0$, $x^1$ plane; it can always be made real by a suitable rotation. Choosing the axes $x^2$, $x^3$
along the other two (spacelike) vectors $n^{(2)i}$, $n^{(3)i}$, we bring the tensor $A_{ik}$ to the form
$$A_{ik} = \begin{pmatrix} \lambda+\mu & -\mu & 0 & 0 \\ -\mu & -\lambda+\mu & 0 & 0 \\ 0 & 0 & -\lambda^{(2)} & 0 \\ 0 & 0 & 0 & -\lambda^{(3)} \end{pmatrix}. \tag{8}$$
This case corresponds to the situation when two of the roots ($\lambda^{(0)}$, $\lambda^{(1)}$) of equation (2) are equal.
We note that for the physical energy-momentum tensor $T_{ik}$ of matter moving with velocities less
than the velocity of light only case (a) can occur; this is related to the fact that there must always
exist a reference system in which the flux of the energy of the matter, i.e. the components $T_{\alpha 0}$, are
equal to zero. For the energy-momentum tensor of electromagnetic waves we have case (c) with
$\lambda^{(2)} = \lambda^{(3)} = \lambda = 0$ (cf. p. 82); it can be shown that if this were not the case there would exist a
reference frame in which the energy flux would exceed the value $c$ times the energy density.
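The canonical form of case (c) can be checked directly. The sketch below, under the assumption of galilean metric signature $(+,-,-,-)$ and arbitrary sample values for $\lambda$, $\mu$, $\lambda^{(2)}$, $\lambda^{(3)}$, builds the matrix (8) and confirms that the null vector $n^i = (1, 1, 0, 0)$ satisfies equation (1) with eigenvalue $\lambda$. All function names are hypothetical helpers introduced only for this illustration.

```python
G = [1.0, -1.0, -1.0, -1.0]  # diagonal of the galilean metric g_ik (assumed signature)

def canonical_form_c(lam, mu, lam2, lam3):
    """Covariant components A_ik of the form (8)."""
    return [
        [lam + mu, -mu, 0.0, 0.0],
        [-mu, -lam + mu, 0.0, 0.0],
        [0.0, 0.0, -lam2, 0.0],
        [0.0, 0.0, 0.0, -lam3],
    ]

def lower(vec):
    """n_i = g_ik n^k for a diagonal metric."""
    return [G[i] * vec[i] for i in range(4)]

def apply(A, vec):
    """A_ik n^k."""
    return [sum(A[i][k] * vec[k] for k in range(4)) for i in range(4)]

lam, mu = 2.0, 0.7                      # sample values, not from the text
A = canonical_form_c(lam, mu, lam2=1.5, lam3=-0.3)
n_up = [1.0, 1.0, 0.0, 0.0]             # n^0 = n^1, so n_l n^l = 0
n_down = lower(n_up)

lhs = apply(A, n_up)                    # A_ik n^k
rhs = [lam * x for x in n_down]         # lambda n_i
norm = sum(n_down[i] * n_up[i] for i in range(4))
print(lhs, rhs, norm)
```

The parameter $\mu$ drops out of $A_{ik} n^k$, which is consistent with the statement that $\mu$ is not an invariant of the tensor.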
§ 95. The gravitational field equations
We can now proceed to the derivation of the equations of the gravitational field. These
equations are obtained from the principle of least action $\delta(S_m + S_g) = 0$, where $S_g$ and $S_m$
are the actions of the gravitational field and matter respectively. We now subject the
gravitational field, that is, the quantities $g_{ik}$, to variation.
† Since only one of the vectors can have a timelike direction, it then follows that equation (2) cannot
have two pairs of complex-conjugate roots.
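Calculating the variation $\delta S_g$, we have
$$\delta\int R\sqrt{-g}\,d\Omega = \delta\int g^{ik}R_{ik}\sqrt{-g}\,d\Omega = \int\left\{R_{ik}\sqrt{-g}\,\delta g^{ik} + R_{ik}\,g^{ik}\,\delta\sqrt{-g} + g^{ik}\sqrt{-g}\,\delta R_{ik}\right\}d\Omega.$$
From formula (86.4), we have
$$\delta\sqrt{-g} = -\frac{1}{2\sqrt{-g}}\,\delta g = -\frac{1}{2}\sqrt{-g}\,g_{ik}\,\delta g^{ik};$$
substituting this, we find
$$\delta\int R\sqrt{-g}\,d\Omega = \int\left(R_{ik} - \frac{1}{2}g_{ik}R\right)\delta g^{ik}\sqrt{-g}\,d\Omega + \int g^{ik}\,\delta R_{ik}\sqrt{-g}\,d\Omega. \tag{95.1}$$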
For the calculation of $\delta R_{ik}$ we note that although the quantities $\Gamma^i_{kl}$ do not constitute a
tensor, their variations $\delta\Gamma^i_{kl}$ do form a tensor, for $\Gamma^k_{il}A_k\,dx^l$ is the change in a vector under
parallel displacement [see (85.5)] from some point $P$ to an infinitesimally separated point
$P'$. Therefore $\delta\Gamma^k_{il}A_k\,dx^l$ is the difference between the two vectors, obtained as the result of
two parallel displacements (one with the unvaried, the other with the varied $\Gamma^i_{kl}$) from the
point $P$ to one and the same point $P'$. The difference between two vectors at the same point
is a vector, and therefore $\delta\Gamma^i_{kl}$ is a tensor.
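Let us use a locally geodesic system of coordinates. Then at that point all the $\Gamma^i_{kl} = 0$.
With the help of expression (92.10) for the $R_{ik}$, we have (remembering that the first derivatives
of the $g^{ik}$ are now equal to zero)
$$g^{ik}\,\delta R_{ik} = g^{ik}\left\{\frac{\partial}{\partial x^l}\delta\Gamma^l_{ik} - \frac{\partial}{\partial x^k}\delta\Gamma^l_{il}\right\} = g^{ik}\frac{\partial}{\partial x^l}\delta\Gamma^l_{ik} - g^{il}\frac{\partial}{\partial x^l}\delta\Gamma^k_{ik} = \frac{\partial w^l}{\partial x^l},$$
where
$$w^l = g^{ik}\,\delta\Gamma^l_{ik} - g^{il}\,\delta\Gamma^k_{ik}.$$
Since $w^l$ is a vector, we may write the relation we have obtained, in an arbitrary coordinate
system, in the form
$$g^{ik}\,\delta R_{ik} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^l}\left(\sqrt{-g}\,w^l\right)$$
[replacing $\partial w^l/\partial x^l$ by $w^l{}_{;l}$ and using (86.9)]. Consequently the second integral on the right
side of (95.1) is equal to
$$\int g^{ik}\,\delta R_{ik}\sqrt{-g}\,d\Omega = \int\frac{\partial(\sqrt{-g}\,w^l)}{\partial x^l}\,d\Omega,$$
and by Gauss' theorem can be transformed into an integral of $w^l$ over the hypersurface
surrounding the whole four-volume. Since the variations of the field are zero at the integration
limits, this term drops out. Thus, the variation $\delta S_g$ is equal to†
$$\delta S_g = -\frac{c^3}{16\pi k}\int\left(R_{ik} - \frac{1}{2}g_{ik}R\right)\delta g^{ik}\sqrt{-g}\,d\Omega. \tag{95.2}$$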
We note that if we had started from the expression
$$S_g = -\frac{c^3}{16\pi k}\int G\sqrt{-g}\,d\Omega$$
† We note here the following curious fact. If we calculate the variation $\delta\int R\sqrt{-g}\,d\Omega$ [with $R_{ik}$ from
(92.10)], considering the $\Gamma^i_{kl}$ as independent variables and the $g_{ik}$ as constants, and then use expression
(86.3) for the $\Gamma^i_{kl}$, we would obtain, as one easily verifies, identically zero. Conversely, one could determine
the relation between the $\Gamma^i_{kl}$ and the metric tensor by requiring that the variation we have mentioned should
vanish.
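for the action of the field, then we would have obtained
$$\delta S_g = -\frac{c^3}{16\pi k}\int\left\{\frac{\partial(G\sqrt{-g})}{\partial g^{ik}} - \frac{\partial}{\partial x^l}\,\frac{\partial(G\sqrt{-g})}{\partial\dfrac{\partial g^{ik}}{\partial x^l}}\right\}\delta g^{ik}\,d\Omega.$$
Comparing this with (95.2), we find the following relation:
$$R_{ik} - \frac{1}{2}g_{ik}R = \frac{1}{\sqrt{-g}}\left\{\frac{\partial(G\sqrt{-g})}{\partial g^{ik}} - \frac{\partial}{\partial x^l}\,\frac{\partial(G\sqrt{-g})}{\partial\dfrac{\partial g^{ik}}{\partial x^l}}\right\}. \tag{95.3}$$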
For the variation of the action of the matter we can write immediately from (94.5):
$$\delta S_m = \frac{1}{2c}\int T_{ik}\,\delta g^{ik}\sqrt{-g}\,d\Omega, \tag{95.4}$$
where $T_{ik}$ is the energy-momentum tensor of the matter (including the electromagnetic field).
Gravitational interaction plays a role only for bodies with sufficiently large mass (because
of the smallness of the gravitational constant), and therefore in studying the gravitational
field we usually have to deal with macroscopic bodies. Corresponding to this we must usually
write for $T_{ik}$ the expression (94.9).
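Thus, from the principle of least action $\delta S_m + \delta S_g = 0$ we find:
$$-\frac{c^3}{16\pi k}\int\left(R_{ik} - \frac{1}{2}g_{ik}R - \frac{8\pi k}{c^4}T_{ik}\right)\delta g^{ik}\sqrt{-g}\,d\Omega = 0,$$
from which, in view of the arbitrariness of the $\delta g^{ik}$:
$$R_{ik} - \frac{1}{2}g_{ik}R = \frac{8\pi k}{c^4}T_{ik}, \tag{95.5}$$
or, in mixed components,
$$R_i^k - \frac{1}{2}\delta_i^k R = \frac{8\pi k}{c^4}T_i^k. \tag{95.6}$$
These are the required equations of the gravitational field, the basic equations of the general
theory of relativity. They are called the Einstein equations.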
Contracting (95.6) on the indices $i$ and $k$, we find
$$R = -\frac{8\pi k}{c^4}T \tag{95.7}$$
($T = T_i^i$). Therefore the equations of the field can also be written in the form
$$R_{ik} = \frac{8\pi k}{c^4}\left(T_{ik} - \frac{1}{2}g_{ik}T\right). \tag{95.8}$$
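The passage from (95.5) to (95.8) is pure index algebra at a point, so it can be checked with numbers. The sketch below is a minimal illustration under stated assumptions: a diagonal metric $g_{ik} = \mathrm{diag}(1,-1,-1,-1)$, a diagonal $T_{ik}$ with sample values $\varepsilon$, $p$ (matter at rest), and the constant $8\pi k/c^4$ replaced by a placeholder `KAPPA = 1`. The helper names are hypothetical.

```python
KAPPA = 1.0  # placeholder for 8*pi*k/c**4 (assumption: units chosen so that it equals 1)
g = [1.0, -1.0, -1.0, -1.0]  # diagonal g_ik; g^ik then has the reciprocal entries

def trace(diag_cov):
    """Contraction g^ik X_ik for a diagonal covariant tensor X_ik."""
    return sum(diag_cov[i] / g[i] for i in range(4))

def ricci_from_source(T):
    """R_ik from (95.8)."""
    t = trace(T)
    return [KAPPA * (T[i] - 0.5 * g[i] * t) for i in range(4)]

def einstein_lhs(R_diag):
    """Left side of (95.5): R_ik - (1/2) g_ik R."""
    r = trace(R_diag)
    return [R_diag[i] - 0.5 * g[i] * r for i in range(4)]

eps, p = 3.0, 0.5            # sample values, not from the text
T = [eps, p, p, p]           # T_ik = (p + eps) u_i u_k - p g_ik with u_i = (1, 0, 0, 0)
R_diag = ricci_from_source(T)
print(trace(R_diag), -KAPPA * trace(T))   # both sides of (95.7)
print(einstein_lhs(R_diag), T)            # (95.5) recovered from (95.8)
```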
Note that the equations of the gravitational field are nonlinear equations. Therefore for
gravitational fields the principle of superposition is not valid, contrary to the case for the
electromagnetic field in the special theory of relativity.
It is necessary, however, to remember that actually one has usually to deal with weak
gravitational fields, for which the equations of the field in first approximation are linear (see
the following section). For such fields, in this approximation, the principle of superposition
is valid.
In empty space $T_{ik} = 0$, and the equations of the gravitational field reduce to the equation
$$R_{ik} = 0. \tag{95.9}$$
We mention that this does not at all mean that in vacuum, spacetime is flat; for this we
would need the stronger conditions $R^i{}_{klm} = 0$.
The energy-momentum tensor of the electromagnetic field has the property that $T_i^i = 0$
[see (33.2)]. From (95.7), it follows that in the presence of an electromagnetic field without
any masses the scalar curvature of spacetime is zero.
As we know, the divergence of the energy-momentum tensor is zero:
$$T^k_{i;k} = 0; \tag{95.10}$$
therefore the divergence of the left side of equation (95.6) must be zero. This is actually the
case because of the identity (92.13).
Thus the equation (95.10) is essentially contained in the field equations (95.6). On the
other hand, the equation (95.10), expressing the law of conservation of energy and momentum,
contains the equation of motion of the physical system to which the energy-momentum
tensor under consideration refers (i.e., the equations of motion of the material particles or
the second pair of Maxwell equations). Thus the equations of the gravitational field also
contain the equations for the matter which produces this field. Therefore the distribution
and motion of the matter producing the gravitational field cannot be assigned arbitrarily.
On the contrary, they must be determined (by solving the field equations under given initial
conditions) at the same time as we find the field produced by the matter.
We call attention to the difference in principle between the present situation and the one
we had in the case of the electromagnetic field. The equations of that field (the Maxwell
equations) contain only the equation of conservation of the total charge (the continuity
equation), but not the equations of motion of the charges themselves. Therefore the distribution
and motion of the charges can be assigned arbitrarily, so long as the total charge is
constant. Assignment of this charge distribution then determines, through Maxwell's
equations, the electromagnetic field produced by the charges.
We must, however, make it clear that for a complete determination of the distribution
and motion of the matter in the case of the Einstein equations one must still add to them the
equation of state of the matter, i.e. an equation relating the pressure and density. This
equation must be given along with the field equations.†
The four coordinates $x^i$ can be subjected to an arbitrary transformation. By means of these
transformations we can arbitrarily assign four of the ten components of the tensor $g_{ik}$.
Therefore there are only six independent quantities $g_{ik}$. Furthermore, the four components
of the four-velocity $u^i$, which appear in the energy-momentum tensor of the matter, are
related to one another by $u_i u^i = 1$, so that only three of them are independent. Thus we have
ten field equations (95.5) for ten unknowns, namely, six components of $g_{ik}$, three components
of $u^i$, and the density $\varepsilon/c^2$ of the matter (or its pressure $p$).
For the gravitational field in vacuum there remain a total of six unknown quantities
(components of $g_{ik}$) and the number of independent field equations is reduced correspondingly:
the ten equations $R_{ik} = 0$ are connected by the four identities (92.13).
† Actually the equation of state relates to one another not two but three thermodynamic quantities, for
example the pressure, density and temperature of the matter. In applications in the theory of gravitation, this
point is however not important, since the approximate equations of state used here actually do not depend
on the temperature (as, for example, the equation $p = 0$ for rarefied matter, the limiting extreme-relativistic
equation $p = \varepsilon/3$ for highly compressed matter, etc.).
We mention some peculiarities of the structure of the Einstein equations. They are a
system of second-order partial differential equations. But the equations do not contain the
time derivatives of all ten components $g_{ik}$. In fact it is clear from (92.4) that second derivatives
with respect to the time are contained only in the components $R_{0\alpha 0\beta}$ of the curvature tensor,
where they enter in the form of the term $-\tfrac{1}{2}\ddot g_{\alpha\beta}$ (the dot denotes differentiation with
respect to $x^0$); the second derivatives of the components $g_{0\alpha}$ and $g_{00}$ do not appear at all.
It is therefore clear that the tensor $R_{ik}$, which is obtained by contraction of the curvature
tensor, and with it the equations (95.5), also contain the second derivatives with respect to
the time of only the six spatial components $g_{\alpha\beta}$.
It is also easy to see that these derivatives enter only in the $^\beta_\alpha$ equations of (95.6), i.e. the
equations
$$R_\alpha^\beta - \frac{1}{2}\delta_\alpha^\beta R = \frac{8\pi k}{c^4}T_\alpha^\beta. \tag{95.11}$$
The $^0_0$ and $^0_\alpha$ equations, i.e. the equations
$$R_0^0 - \frac{1}{2}R = \frac{8\pi k}{c^4}T_0^0, \qquad R_\alpha^0 = \frac{8\pi k}{c^4}T_\alpha^0, \tag{95.12}$$
contain only first-order time derivatives. One can verify this by checking that in forming the
quantities $R_\alpha^0$ and $R_0^0 - \tfrac{1}{2}R = \tfrac{1}{2}(R_0^0 - R_\alpha^\alpha)$ from $R_{iklm}$ by contraction, the components of the
form $R_{0\alpha 0\beta}$ actually drop out. This can be seen even more simply from the identity (92.13),
by writing it in the form
$$\left(R_i^0 - \frac{1}{2}\delta_i^0 R\right)_{;0} = -\left(R_i^\alpha - \frac{1}{2}\delta_i^\alpha R\right)_{;\alpha} \tag{95.13}$$
($i = 0, 1, 2, 3$). The highest time derivatives appearing on the right side of this equation are
second derivatives (appearing in the quantities $R_i^\alpha$, $R$). Since (95.13) is an identity, its left
side must consequently contain no time derivatives of higher than second order. But one
time differentiation already appears explicitly in it; therefore the expressions $R_i^0 - \tfrac{1}{2}\delta_i^0 R$
themselves cannot contain time derivatives of order higher than the first.
Furthermore, the left sides of equations (95.12) also do not contain the first derivatives
$\dot g_{0\alpha}$ and $\dot g_{00}$ (but only the derivatives $\dot g_{\alpha\beta}$). In fact, of all the $\Gamma_{i,kl}$, only $\Gamma_{\alpha,00}$ and $\Gamma_{0,00}$
contain these quantities, but these latter in turn appear only in the components of the
curvature tensor of the form $R_{0\alpha 0\beta}$ which, as we already know, drop out when we form the
left sides of equations (95.12).
If we are interested in the solution of the Einstein equations for given initial conditions
(in the time), we must consider the question of the number of quantities for which the initial
spatial distribution can be assigned arbitrarily.
The initial conditions for a set of equations of second order must include both the
quantities to be differentiated and their first time derivatives. But since in the present
case the equations contain second derivatives of only the six $g_{\alpha\beta}$, not all the $g_{ik}$ and $\dot g_{ik}$ can
be arbitrarily assigned. Thus, we may assign (in addition to the velocity and density of the
matter) the initial values of the functions $g_{\alpha\beta}$ and $\dot g_{\alpha\beta}$, after which the four equations (95.12)
determine the admissible initial values of $g_{0\alpha}$ and $g_{00}$; in (95.11) the initial values of $\dot g_{0\alpha}$
still remain arbitrary.
Among the initial conditions thus assigned there are some functions whose arbitrariness
is related simply to the arbitrariness in choice of the four-dimensional coordinate system.
But the only thing that has real physical meaning is the number of "physically different"
arbitrary functions, which cannot be reduced by any choice of coordinate system. From
physical arguments it is easy to see that this number is eight: the initial conditions must
assign the distribution of the matter density and of its three velocity components, and also
of four other quantities characterizing the free gravitational field in the absence of matter
(see later in § 102); for the free gravitational field in vacuum only the last four quantities
should be fixed by the initial conditions.
PROBLEM
Write the equations for a constant gravitational field, expressing all the operations of differentiation
with respect to the space coordinates as covariant derivatives in a space with the metric $\gamma_{\alpha\beta}$
(84.7).
Solution: We introduce the notation $g_{00} = h$, $g_{0\alpha} = -h g_\alpha$ (88.11) and the three-dimensional
velocity $v^\alpha$ (89.10). In the following all operations of raising and lowering indices and of covariant
differentiation are carried out in the three-dimensional space with the metric $\gamma_{\alpha\beta}$, on the
three-dimensional vectors $g_\alpha$, $v^\alpha$ and the three-dimensional scalar $h$.
The desired equations must be invariant with respect to the transformation
$$x^\alpha \to x^\alpha, \qquad x^0 \to x^0 + f(x^\alpha), \tag{1}$$
which does not change the stationary character of the field. But under such a transformation, as is
easily shown (see the footnote on p. 247), $g_\alpha \to g_\alpha - \partial f/\partial x^\alpha$, while the scalar $h$ and the tensor
$\gamma_{\alpha\beta} = -g_{\alpha\beta} + h g_\alpha g_\beta$ are unchanged. It is therefore clear that the required equations, when expressed
in terms of $\gamma_{\alpha\beta}$, $h$ and $g_\alpha$, can contain $g_\alpha$ only in the form of combinations of derivatives that
constitute a three-dimensional antisymmetric tensor:
$$f_{\alpha\beta} = g_{\beta;\alpha} - g_{\alpha;\beta} = \frac{\partial g_\beta}{\partial x^\alpha} - \frac{\partial g_\alpha}{\partial x^\beta}, \tag{2}$$
which is invariant under such transformations. Taking this fact into account, we can drastically
simplify the computations by setting (after computing all the derivatives appearing in $R_{ik}$) $g_\alpha = 0$
and $g_{\alpha;\beta} + g_{\beta;\alpha} = 0$.†
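The Christoffel symbols are:
$$\Gamma^0_{00} = \frac{1}{2}g^\alpha h_{;\alpha}, \qquad \Gamma^\alpha_{00} = \frac{1}{2}h^{;\alpha},$$
$$\Gamma^0_{\alpha 0} = \frac{1}{2h}h_{;\alpha} + \frac{h}{2}g^\beta f_{\alpha\beta} + \dots, \qquad \Gamma^\alpha_{0\beta} = \frac{h}{2}f_\beta{}^{\alpha} - \frac{1}{2}g_\beta h^{;\alpha},$$
$$\Gamma^0_{\alpha\beta} = -\frac{1}{2}\left(\frac{\partial g_\alpha}{\partial x^\beta} + \frac{\partial g_\beta}{\partial x^\alpha}\right) - \frac{1}{2h}\left(g_\alpha h_{;\beta} + g_\beta h_{;\alpha}\right) + g_\gamma\lambda^\gamma_{\alpha\beta} + \dots,$$
$$\Gamma^\alpha_{\beta\gamma} = \lambda^\alpha_{\beta\gamma} - \frac{h}{2}\left(g_\beta f_\gamma{}^{\alpha} + g_\gamma f_\beta{}^{\alpha}\right) + \dots.$$
The terms omitted (indicated by the dots) are quadratic in the components of $g_\alpha$; these terms do
drop out when we set $g_\alpha = 0$ after performing the differentiations in $R_{ik}$ (92.10). In the calculations
one uses formulas (84.9), (84.12-13); the $\lambda^\alpha_{\beta\gamma}$ are three-dimensional Christoffel symbols constructed
from the metric $\gamma_{\alpha\beta}$.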
† To avoid any misunderstanding we emphasize that this simplified method for making the computations,
which gives the correct field equations, would not be applicable to the calculation of arbitrary components
of the $R_{ik}$ itself, since they are not invariant under the transformation (1). In equations (3)-(5) on the left
are given those components of the Ricci tensor which are actually equal to the expressions given. These
components are invariant under (1).
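The tensor $T_{ik}$ is calculated using formula (94.9) with the $u^i$ from (88.14) (where again we set
$g_\alpha = 0$). As a result of the calculations, we obtain the following equations from (95.8):
$$\frac{1}{h}R_{00} = \frac{1}{\sqrt{h}}\left(\sqrt{h}\right)^{;\alpha}{}_{;\alpha} + \frac{h}{4}f_{\alpha\beta}f^{\alpha\beta} = \frac{8\pi k}{c^4}\left(\frac{\varepsilon + p}{1 - \dfrac{v^2}{c^2}} - \frac{\varepsilon - p}{2}\right), \tag{3}$$
$$\frac{1}{\sqrt{h}}R_0^\alpha = -\frac{\sqrt{h}}{2}f^{\alpha\beta}{}_{;\beta} - \frac{3}{2}f^{\alpha\beta}\left(\sqrt{h}\right)_{;\beta} = \frac{8\pi k}{c^4}\,\frac{p+\varepsilon}{1 - \dfrac{v^2}{c^2}}\,\frac{v^\alpha}{c}, \tag{4}$$
$$R^{\alpha\beta} = P^{\alpha\beta} + \frac{h}{2}f^{\alpha\gamma}f^{\beta}{}_{\gamma} - \frac{1}{\sqrt{h}}\left(\sqrt{h}\right)^{;\alpha;\beta} = \frac{8\pi k}{c^4}\left[\frac{(p+\varepsilon)\,v^\alpha v^\beta}{c^2\left(1 - \dfrac{v^2}{c^2}\right)} + \frac{\varepsilon - p}{2}\gamma^{\alpha\beta}\right]. \tag{5}$$
Here $P^{\alpha\beta}$ is a three-dimensional tensor constructed from the $\gamma_{\alpha\beta}$ in the same way as $R^{ik}$ is
constructed from the $g_{ik}$.†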
§ 96. Newton's law
In the Einstein field equations we now carry out the transition to the limit of non-relativistic
mechanics. As was stated in § 87, the assumption of small velocities of all particles
requires also that the gravitational field be weak.
The expression for the component $g_{00}$ of the metric tensor (the only one which we need)
was found, for the limiting case which we are considering, in § 87:
$$g_{00} = 1 + \frac{2\phi}{c^2}.$$
Further, we can use for the components of the energy-momentum tensor the expression
(35.4) $T_i^k = \mu c^2 u_i u^k$, where $\mu$ is the mass density of the body (the sum of the rest masses
of the particles in a unit volume; we drop the subscript 0 on $\mu$). As for the four-velocity $u^i$,
since the macroscopic motion is also considered to be slow, we must neglect all its space
components and retain only the time component, that is, we must set $u^\alpha = 0$, $u^0 = u_0 = 1$.
Of all the components $T_i^k$, there thus remains only
$$T_0^0 = \mu c^2. \tag{96.1}$$
The scalar $T = T_i^i$ will be equal to this same value $\mu c^2$.
We write the field equations in the form (95.8):
$$R_i^k = \frac{8\pi k}{c^4}\left(T_i^k - \frac{1}{2}\delta_i^k T\right);$$
for $i = k = 0$
$$R_0^0 = \frac{4\pi k}{c^2}\mu.$$
One easily verifies that in the approximation we are considering all the other equations vanish
identically.
† The Einstein equations can also be written in an analogous way for the general case of a time-dependent
metric. In addition to space derivatives they will also contain time derivatives of the quantities $\gamma_{\alpha\beta}$, $g_\alpha$, and $h$.
See A. L. Zel'manov, Doklady Acad. Sci., U.S.S.R. 107, 815 (1956).
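For the calculation of $R_0^0$ from the general formula (92.10), we note that terms containing
products of the quantities $\Gamma^i_{kl}$ are in every case quantities of the second order. Terms containing
derivatives with respect to $x^0 = ct$ are small (compared with terms with derivatives
with respect to the coordinates $x^\alpha$) since they contain extra powers of $1/c$. As a result, there
remains $R_{00} = R_0^0 = \partial\Gamma^\alpha_{00}/\partial x^\alpha$. Substituting
$$\Gamma^\alpha_{00} \approx -\frac{1}{2}g^{\alpha\beta}\frac{\partial g_{00}}{\partial x^\beta} = \frac{1}{c^2}\frac{\partial\phi}{\partial x^\alpha},$$
we find
$$R_0^0 = \frac{1}{c^2}\frac{\partial^2\phi}{\partial x^{\alpha 2}} \equiv \frac{1}{c^2}\Delta\phi.$$
Thus the field equations give
$$\Delta\phi = 4\pi k\mu. \tag{96.2}$$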
This is the equation of the gravitational field in non-relativistic mechanics. It is completely
analogous to the Poisson equation (36.4) for the electric potential, where here in place of
the charge density we have the mass density multiplied by $-k$. Therefore we can immediately
write the general solution of equation (96.2) by analogy with (36.8) in the form
$$\phi = -k\int\frac{\mu\,dV}{R}. \tag{96.3}$$
This formula determines the potential of the gravitational field of an arbitrary mass distribution
in the non-relativistic approximation.
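Equation (96.2) can be checked numerically on a simple case. The sketch below takes a homogeneous sphere of radius $a$ and density $\mu$, uses the interior potential $\phi = -2\pi k\mu(a^2 - r^2/3)$ quoted in a footnote later in this section together with the exterior point-mass potential $-km/r$, and applies a second-order finite-difference Laplacian. The numerical values of `K`, `MU`, `A`, the step `h` and the sample points are illustrative assumptions.

```python
import math

K, MU, A = 1.0, 2.0, 1.0              # sample values of k, mu and the radius a (assumptions)
M = 4.0 * math.pi / 3.0 * A ** 3 * MU  # total mass of the sphere

def phi(x, y, z):
    """Newtonian potential of a homogeneous sphere centred at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    if r <= A:
        return -2.0 * math.pi * K * MU * (A * A - r * r / 3.0)
    return -K * M / r                  # point-mass form outside the sphere

def laplacian(f, x, y, z, h=1e-3):
    """Central-difference approximation to the Laplacian of f at (x, y, z)."""
    total = (f(x + h, y, z) + f(x - h, y, z)
             + f(x, y + h, z) + f(x, y - h, z)
             + f(x, y, z + h) + f(x, y, z - h)
             - 6.0 * f(x, y, z))
    return total / (h * h)

inside = laplacian(phi, 0.2, 0.1, -0.3)    # should approximate 4 pi k mu
outside = laplacian(phi, 1.5, 0.4, 0.2)    # should approximate 0
print(inside, 4.0 * math.pi * K * MU, outside)
```

The two branches of `phi` agree at $r = a$, since $-2\pi k\mu\cdot\tfrac{2}{3}a^2 = -km/a$ for $m = \tfrac{4}{3}\pi a^3\mu$.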
In particular, we have for the potential of the field of a single particle of mass $m$
$$\phi = -\frac{km}{R} \tag{96.4}$$
and, consequently, the force $F = -m'(\partial\phi/\partial R)$, acting in this field on another particle
(mass $m'$), is equal to
$$F = -\frac{kmm'}{R^2}. \tag{96.5}$$
This is the well-known law of attraction of Newton.
The potential energy of a particle in a gravitational field is equal to its mass multiplied by
the potential of the field, in analogy to the fact that the potential energy in an electric field is
equal to the product of the charge and the potential of the field. Therefore, we may write, by
analogy with (37.1), for the potential energy of an arbitrary mass distribution, the expression
$$U = \frac{1}{2}\int\mu\phi\,dV. \tag{96.6}$$
For the Newtonian potential of a constant gravitational field at large distances from the
masses producing it, we can give an expansion analogous to that obtained in §§ 40-41 for
the electrostatic field. We choose the coordinate origin at the inertial center of the masses.
Then the integral $\int\mu\mathbf{r}\,dV$, which is analogous to the dipole moment of a system of charges,
vanishes identically. Thus, unlike the case of the electrostatic field, in the case of the gravitational
field we can always eliminate the "dipole terms". Consequently, the expansion of the
potential has the form:
$$\phi = -k\left\{\frac{M}{R_0} + \frac{1}{6}D_{\alpha\beta}\frac{\partial^2}{\partial X_\alpha\,\partial X_\beta}\frac{1}{R_0} + \dots\right\} \tag{96.7}$$
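where $M = \int\mu\,dV$ is the total mass of the system, and the quantity
$$D_{\alpha\beta} = \int\mu\left(3x_\alpha x_\beta - r^2\delta_{\alpha\beta}\right)dV \tag{96.8}$$
may be called the mass quadrupole moment tensor.† It is related to the usual moment of
inertia tensor
$$J_{\alpha\beta} = \int\mu\left(r^2\delta_{\alpha\beta} - x_\alpha x_\beta\right)dV$$
by the obvious relation
$$D_{\alpha\beta} = J_{\gamma\gamma}\delta_{\alpha\beta} - 3J_{\alpha\beta}. \tag{96.9}$$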
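Relation (96.9) can be illustrated on a discrete mass distribution, where the integrals in (96.8) and in the definition of $J_{\alpha\beta}$ become sums over point masses. The sketch below is only an illustration: the function name and the sample masses and positions are assumptions, not taken from the text.

```python
def quadrupole_and_inertia(masses, points):
    """Return (D, J) as 3x3 nested lists for point masses at the given positions."""
    D = [[0.0] * 3 for _ in range(3)]
    J = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, points):
        r2 = sum(c * c for c in x)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                D[a][b] += m * (3.0 * x[a] * x[b] - r2 * delta)   # discrete form of (96.8)
                J[a][b] += m * (r2 * delta - x[a] * x[b])         # moment of inertia tensor
    return D, J

masses = [1.0, 2.5, 0.7]                                  # sample data (assumption)
points = [(0.3, -1.0, 2.0), (1.2, 0.4, -0.5), (-2.0, 0.1, 0.9)]
D, J = quadrupole_and_inertia(masses, points)
trace_J = J[0][0] + J[1][1] + J[2][2]

# Right side of (96.9): J_gg delta_ab - 3 J_ab
rhs = [[(trace_J if a == b else 0.0) - 3.0 * J[a][b] for b in range(3)] for a in range(3)]
print(D)
print(rhs)
```

The trace of $D_{\alpha\beta}$ vanishes identically, which the same data also confirm.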
The determination of the Newtonian potential from a given distribution of masses is the
subject of one of the branches of mathematical physics; the exposition of the various
methods for this is not the subject of the present book. Here we shall for reference purposes
give only the formulas for the potential of the gravitational field produced by a homogeneous
ellipsoidal body.‡
Let the surface of the ellipsoid be given by the equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad a > b > c. \tag{96.10}$$
Then the potential of the field at an arbitrary point outside the body is given by the following
formula:
$$\phi = -\pi\mu abck\int_\xi^\infty\left(1 - \frac{x^2}{a^2+s} - \frac{y^2}{b^2+s} - \frac{z^2}{c^2+s}\right)\frac{ds}{R_s}, \tag{96.11}$$
$$R_s = \sqrt{(a^2+s)(b^2+s)(c^2+s)},$$
where $\xi$ is the positive root of the equation
$$\frac{x^2}{a^2+\xi} + \frac{y^2}{b^2+\xi} + \frac{z^2}{c^2+\xi} = 1. \tag{96.12}$$
The potential of the field in the interior of the ellipsoid is given by the formula
$$\phi = -\pi\mu abck\int_0^\infty\left(1 - \frac{x^2}{a^2+s} - \frac{y^2}{b^2+s} - \frac{z^2}{c^2+s}\right)\frac{ds}{R_s}, \tag{96.13}$$
which differs from (96.11) in having the lower limit replaced by zero; we note that this expression
is a quadratic function of the coordinates $x$, $y$, $z$.
The gravitational energy of the body is obtained, according to (96.6), by integrating the
expression (96.13) over the volume of the ellipsoid. This integral can be done by elementary
methods,§ and gives:
$$U = \frac{3km^2}{8}\int_0^\infty\left[\frac{1}{5}\left(\frac{a^2}{a^2+s} + \frac{b^2}{b^2+s} + \frac{c^2}{c^2+s}\right) - 1\right]\frac{ds}{R_s} =$$
† We here write all indices $\alpha$, $\beta$ as subscripts, not distinguishing between co- and contravariant components,
in accordance with the fact that all operations are carried out in ordinary Newtonian (Euclidean) space.
‡ The derivation of these formulas can be found in the book of L. N. Sretenskii, Theory of the Newtonian
Potential, Gostekhizdat, 1946.
§ The integration of the squares $x^2$, $y^2$, $z^2$ is most simply done by making the substitution $x = ax'$,
$y = by'$, $z = cz'$, which reduces the integral over the volume of the ellipsoid to an integral over the volume of
the unit sphere.
$$= \frac{3km^2}{8}\int_0^\infty\left[\frac{2}{5}\,s\,d\!\left(\frac{1}{R_s}\right) - \frac{2}{5}\,\frac{ds}{R_s}\right] \tag{96.14}$$
($m = \frac{4\pi}{3}abc\,\mu$ is the total mass of the body); integrating the first term by parts, we
obtain finally:
$$U = -\frac{3km^2}{10}\int_0^\infty\frac{ds}{R_s}. \tag{96.15}$$
All the integrals appearing in formulas (96.11)-(96.14) can be expressed in terms of
elliptic integrals of the first and second kind. For ellipsoids of rotation, these integrals are
expressed in terms of elementary functions. In particular, the gravitational energy of an
oblate ellipsoid of rotation ($a = b > c$) is
$$U = -\frac{3km^2}{5\sqrt{a^2-c^2}}\cos^{-1}\frac{c}{a}, \tag{96.16}$$
and for a prolate ellipsoid of rotation ($a > b = c$):
$$U = -\frac{3km^2}{5\sqrt{a^2-c^2}}\cosh^{-1}\frac{a}{c}. \tag{96.17}$$
For a sphere ($a = c$) both formulas give the value $U = -3km^2/5a$, which, of course, can
also be obtained by elementary methods.†
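The closed forms (96.16) and (96.17) can be compared with a direct numerical evaluation of the integral in (96.15). The sketch below works in units where $k m^2 = 1$ and uses a midpoint rule after the substitution $s = (u/(1-u))^2$, which maps $0 < s < \infty$ onto $0 < u < 1$ and turns $ds/R_s$ into a smooth integrand. The substitution, the function names and the sample semiaxes are illustrative choices, not part of the text.

```python
import math

def shape_integral(a, b, c, n=4000):
    """Midpoint-rule value of the integral of ds / R_s over 0 < s < infinity.

    With s = (u / w)**2, w = 1 - u, one has ds = 2 u / w**3 du and
    R_s = sqrt((a^2 w^2 + u^2)(b^2 w^2 + u^2)(c^2 w^2 + u^2)) / w**3.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        w = 1.0 - u
        total += 2.0 * u / math.sqrt(
            (a * a * w * w + u * u) * (b * b * w * w + u * u) * (c * c * w * w + u * u)
        )
    return total * h

def energy_numeric(a, b, c):
    """U / (k m^2) from (96.15)."""
    return -0.3 * shape_integral(a, b, c)

def energy_oblate(a, c):
    """U / (k m^2) from (96.16), valid for a = b > c."""
    return -3.0 * math.acos(c / a) / (5.0 * math.sqrt(a * a - c * c))

def energy_prolate(a, c):
    """U / (k m^2) from (96.17), valid for a > b = c."""
    return -3.0 * math.acosh(a / c) / (5.0 * math.sqrt(a * a - c * c))

print(energy_numeric(2.0, 2.0, 1.0), energy_oblate(2.0, 1.0))
print(energy_numeric(2.0, 1.0, 1.0), energy_prolate(2.0, 1.0))
print(energy_numeric(1.0, 1.0, 1.0))   # sphere of unit radius: compare with -3/5
```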
PROBLEM
Determine the equilibrium shape of a homogeneous gravitating mass of liquid which is rotating
as a whole.
Solution: The condition of equilibrium is the constancy on the surface of the body of the sum of
the gravitational potential and the potential of the centrifugal forces:
$$\phi - \frac{\Omega^2}{2}(x^2+y^2) = \text{const.}$$
($\Omega$ is the angular velocity; the axis of rotation is the $z$ axis). The required shape is that of an oblate
ellipsoid of rotation. To determine its parameters we substitute (96.13) in the condition of equilibrium,
and eliminate $z^2$ by using equation (96.10); this gives:
$$(x^2+y^2)\left[\int_0^\infty\frac{ds}{(a^2+s)^2\sqrt{c^2+s}} - \frac{\Omega^2}{2\pi\mu ka^2c} - \frac{c^2}{a^2}\int_0^\infty\frac{ds}{(a^2+s)(c^2+s)^{3/2}}\right] = \text{const},$$
from which it follows that the expression in the square brackets must vanish. Performing the
integration, we get the equation
$$\frac{(a^2+2c^2)\,c}{(a^2-c^2)^{3/2}}\cos^{-1}\frac{c}{a} - \frac{3c^2}{a^2-c^2} = \frac{\Omega^2}{2\pi k\mu} = \frac{25}{6}\left(\frac{4\pi}{3}\right)^{1/3}\frac{M^2\mu^{1/3}}{m^{10/3}k}\left(\frac{c}{a}\right)^{4/3}$$
($M = \tfrac{2}{5}ma^2\Omega$ is the angular momentum of the body around the $z$ axis), which determines the
† The potential of the field inside a homogeneous sphere of radius $a$ is
$$\phi = -2\pi k\mu\left(a^2 - \frac{r^2}{3}\right).$$
ratio of the semiaxes $c/a$ for given $\Omega$ or $M$. The dependence of the ratio $c/a$ on $M$ is single-valued;
$c/a$ decreases monotonically with increasing $M$.
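The step "performing the integration" in the solution above can be checked numerically. The sketch below evaluates $\Omega^2/2\pi k\mu$ in two ways: from the closed-form left side of the equilibrium equation, and from the two integrals in the square brackets, using a midpoint rule with the substitution $s = (u/(1-u))^2$. The substitution, the function names and the sample ratio $c/a = 0.58$ with $a = 1$ are illustrative assumptions.

```python
import math

def omega2_closed(a, c):
    """Omega^2 / (2 pi k mu) from the closed-form equilibrium equation (requires a > c)."""
    d = a * a - c * c
    return (a * a + 2.0 * c * c) * c * math.acos(c / a) / d ** 1.5 - 3.0 * c * c / d

def omega2_numeric(a, c, n=4000):
    """Same quantity from the two bracketed integrals, by the midpoint rule.

    With s = (u / w)**2, w = 1 - u: ds = 2 u / w**3 du, and each factor
    (q^2 + s) becomes (q^2 w^2 + u^2) / w^2.
    """
    h = 1.0 / n
    first = 0.0
    second = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        w = 1.0 - u
        pa = a * a * w * w + u * u
        pc = c * c * w * w + u * u
        first += 2.0 * u * w * w / (pa * pa * math.sqrt(pc))   # ds / ((a^2+s)^2 sqrt(c^2+s))
        second += 2.0 * u * w * w / (pa * pc ** 1.5)           # ds / ((a^2+s)(c^2+s)^(3/2))
    return a * a * c * (first - (c * c / (a * a)) * second) * h

print(omega2_closed(1.0, 0.58), omega2_numeric(1.0, 0.58))
```

Both routes depend only on the ratio $c/a$, so setting $a = 1$ loses no generality.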
However, it turns out that the symmetrical form which we have found is stable (with respect to
small perturbations) only for not too large values of $M$.† The stability is lost for $M =
2.89\,k^{1/2}m^{5/3}\mu^{-1/6}$ (when $c/a = 0.58$). With further increase of $M$, the equilibrium shape becomes a
general ellipsoid with gradually decreasing values of $b/a$ and $c/a$ (from 1 and from 0.58, respectively).
This shape in turn becomes unstable for $M = 3.84\,k^{1/2}m^{5/3}\mu^{-1/6}$ (when $a:b:c = 1:0.43:0.34$).
§ 97. The centrally symmetric gravitational field
Let us consider a gravitational field possessing central symmetry. Such a field can be
produced by any centrally symmetric distribution of matter; for this, of course, not only the
distribution but also the motion of the matter must be centrally symmetric, i.e. the velocity
at each point must be directed along the radius.
The central symmetry of the field means that the space-time metric, that is, the expression
for the interval $ds$, must be the same for all points located at the same distance from the
center. In euclidean space this distance is equal to the radius vector; in a non-euclidean
space, such as we have in the presence of a gravitational field, there is no quantity which has
all the properties of the euclidean radius vector (for example to be equal both to the distance
from the center and to the length of the circumference divided by $2\pi$). Therefore the choice
of a "radius vector" is now arbitrary.
If we use "spherical" space coordinates $r$, $\theta$, $\phi$, then the most general centrally symmetric
expression for $ds^2$ is
$$ds^2 = h(r,t)\,dr^2 + k(r,t)\left(\sin^2\theta\,d\phi^2 + d\theta^2\right) + l(r,t)\,dt^2 + a(r,t)\,dr\,dt, \tag{97.1}$$
where $a$, $h$, $k$, $l$ are certain functions of the "radius vector" $r$ and the "time" $t$. But because
of the arbitrariness in the choice of a reference system in the general theory of relativity, we
can still subject the coordinates to any transformation which does not destroy the central
symmetry of $ds^2$; this means that we can transform the coordinates $r$ and $t$ according to the
formulas
$$r = f_1(r', t'), \qquad t = f_2(r', t'),$$
where $f_1$, $f_2$ are any functions of the new coordinates $r'$, $t'$.
Making use of this possibility, we choose the coordinate $r$ and the time $t$ in such a way
that, first of all, the coefficient $a(r,t)$ of $dr\,dt$ in the expression for $ds^2$ vanishes and, secondly,
the coefficient $k(r,t)$ becomes equal simply to $-r^2$.‡ The latter condition implies that the
radius vector $r$ is defined in such a way that the circumference of a circle with center at the
origin of coordinates is equal to $2\pi r$ (the element of arc of a circle in the plane $\theta = \pi/2$
is equal to $dl = r\,d\phi$). It will be convenient to write the quantities $h$ and $l$ in exponential
form, as $-e^\lambda$ and $c^2e^\nu$ respectively, where $\lambda$ and $\nu$ are some functions of $r$ and $t$. Thus we
obtain the following expression for $ds^2$:
$$ds^2 = e^\nu c^2\,dt^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) - e^\lambda\,dr^2. \tag{97.2}$$
Denoting by $x^0$, $x^1$, $x^2$, $x^3$, respectively, the coordinates $ct$, $r$, $\theta$, $\phi$, we have for the
† References to the literature concerning this question can be found in the book by H. Lamb, Hydrodynamics,
chap. XII.
‡ These conditions do not determine the choice of the time coordinate uniquely. It can still be subjected
to an arbitrary transformation $t = f(t')$, not containing $r$.
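nonzero components of the metric tensor the expressions
$$g_{00} = e^\nu, \qquad g_{11} = -e^\lambda, \qquad g_{22} = -r^2, \qquad g_{33} = -r^2\sin^2\theta.$$
Clearly,
$$g^{00} = e^{-\nu}, \qquad g^{11} = -e^{-\lambda}, \qquad g^{22} = -r^{-2}, \qquad g^{33} = -r^{-2}\sin^{-2}\theta.$$
With these values it is easy to calculate the $\Gamma^i_{kl}$ from formula (86.3). The calculation leads
to the following expressions (the prime means differentiation with respect to $r$, while a dot
on a symbol means differentiation with respect to $ct$):
$$\Gamma^1_{11} = \frac{\lambda'}{2}, \qquad \Gamma^0_{10} = \frac{\nu'}{2}, \qquad \Gamma^2_{33} = -\sin\theta\cos\theta,$$
$$\Gamma^0_{11} = \frac{\dot\lambda}{2}e^{\lambda-\nu}, \qquad \Gamma^1_{22} = -re^{-\lambda}, \qquad \Gamma^1_{00} = \frac{\nu'}{2}e^{\nu-\lambda},$$
$$\Gamma^2_{12} = \Gamma^3_{13} = \frac{1}{r}, \qquad \Gamma^3_{23} = \cot\theta, \qquad \Gamma^0_{00} = \frac{\dot\nu}{2}, \tag{97.3}$$
$$\Gamma^1_{10} = \frac{\dot\lambda}{2}, \qquad \Gamma^1_{33} = -r\sin^2\theta\,e^{-\lambda}.$$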
All other components (except for those which differ from the ones we have written by a
transposition of the indices k and /) are zero.
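To get the equations of gravitation we must calculate the components of the tensor $R^i_k$
according to formula (92.10). A simple calculation leads to the following equations:
$$\frac{8\pi k}{c^4}\,T^1_1 = -e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) + \frac{1}{r^2}, \tag{97.4}$$
$$\frac{8\pi k}{c^4}\,T^2_2 = \frac{8\pi k}{c^4}\,T^3_3 = -\frac{1}{2}e^{-\lambda}\left(\nu'' + \frac{\nu'^2}{2} + \frac{\nu' - \lambda'}{r} - \frac{\nu'\lambda'}{2}\right) + \frac{1}{2}e^{-\nu}\left(\ddot\lambda + \frac{\dot\lambda^2}{2} - \frac{\dot\lambda\dot\nu}{2}\right), \tag{97.5}$$
$$\frac{8\pi k}{c^4}\,T^0_0 = -e^{-\lambda}\left(\frac{1}{r^2} - \frac{\lambda'}{r}\right) + \frac{1}{r^2}, \tag{97.6}$$
$$\frac{8\pi k}{c^4}\,T^1_0 = -e^{-\lambda}\,\frac{\dot\lambda}{r}. \tag{97.7}$$
The other components vanish identically. Using (94.9), the components of the energy-momentum
tensor can be expressed in terms of the energy density $\varepsilon$ of the matter, its
pressure p, and the radial velocity v.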
The equations (97.4)-(97.7) can be integrated exactly in the very important case of a centrally
symmetric field in vacuum, that is, outside of the masses producing the field. Setting the
energy-momentum tensor equal to zero, we get the following equations:
$$e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) - \frac{1}{r^2} = 0, \tag{97.8}$$
$$e^{-\lambda}\left(\frac{\lambda'}{r} - \frac{1}{r^2}\right) + \frac{1}{r^2} = 0, \tag{97.9}$$
$$\dot\lambda = 0 \tag{97.10}$$
[we do not write the fourth equation, that is, equation (97.5), since it follows from the other
three equations].
From (97.10) we see directly that $\lambda$ does not depend on the time. Further, adding equations
(97.8) and (97.9), we find $\lambda' + \nu' = 0$, that is,
$$\lambda + \nu = f(t), \tag{97.11}$$
where f(t) is a function only of the time. But when we chose the interval $ds^2$ in the form
(97.2), there still remained the possibility of an arbitrary transformation of the time of the
form $t = f(t')$. Such a transformation is equivalent to adding to $\nu$ an arbitrary function of
the time, and with its aid we can always make f(t) in (97.11) vanish. And so, without any
loss in generality, we can set $\lambda + \nu = 0$. Note that the centrally symmetric gravitational
field in vacuum is automatically static.
The equation (97.9) is easily integrated and gives:
$$e^{-\lambda} = e^{\nu} = 1 + \frac{\text{const}}{r}. \tag{97.12}$$
Thus, at infinity ($r \to \infty$), $e^{-\lambda} = e^{\nu} = 1$, that is, far from the gravitating bodies the metric
automatically becomes galilean.† The constant is easily expressed in terms of the mass of the
body by requiring that at large distances, where the field is weak, Newton's law should
hold. In other words, we should have $g_{00} = 1 + (2\phi/c^2)$, where the potential $\phi$ has its
Newtonian value (96.4) $\phi = -(km/r)$ (m is the total mass of the bodies producing the field).
From this it is clear that const $= -(2km/c^2)$. This quantity has the dimensions of length;
it is called the gravitational radius $r_g$ of the body:
$$r_g = \frac{2km}{c^2}. \tag{97.13}$$
Thus we finally obtain the space-time metric in the form:
$$ds^2 = \left(1 - \frac{r_g}{r}\right)c^2\,dt^2 - r^2(\sin^2\theta\,d\phi^2 + d\theta^2) - \frac{dr^2}{1 - r_g/r}. \tag{97.14}$$
This solution of the Einstein equations was found by K. Schwarzschild (1916). It completely
determines the gravitational field in vacuum produced by any centrally symmetric distribution
of masses. We emphasize that this solution is valid not only for masses at rest, but also
when they are moving, so long as the motion has the required symmetry (for example, a
centrally symmetric pulsation). We note that the metric (97.14) depends only on the total
mass of the gravitating body, just as in the analogous problem in Newtonian theory.
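To give a feeling for the size of the gravitational radius (97.13), the following minimal sketch evaluates $r_g = 2km/c^2$ in SI units. The numerical values of the products km for the Sun and the Earth are standard astronomical constants assumed here; they do not appear in the text, and the function names are illustrative only.

```python
# Gravitational radius r_g = 2km/c^2, formula (97.13), in SI units.
# ASSUMPTION: the values of k*m below are standard astronomical constants,
# not quantities quoted in the text.
C = 299_792_458.0            # speed of light, m/s
KM_SUN = 1.32712440018e20    # k * (mass of the Sun), m^3/s^2
KM_EARTH = 3.986004418e14    # k * (mass of the Earth), m^3/s^2


def gravitational_radius(km: float) -> float:
    """Return r_g in metres for a body with gravitational parameter k*m."""
    return 2.0 * km / C**2


def g00(r: float, r_g: float) -> float:
    """Coefficient of c^2 dt^2 in the metric (97.14); tends to 1 for r -> infinity."""
    return 1.0 - r_g / r


rg_sun = gravitational_radius(KM_SUN)
rg_earth = gravitational_radius(KM_EARTH)
print(f"Sun:   r_g = {rg_sun / 1e3:.2f} km")
print(f"Earth: r_g = {rg_earth * 1e2:.2f} cm")
```

For the Sun this comes out close to 3 km and for the Earth a little under 1 cm, so that $r_g/r$ is extremely small everywhere outside ordinary bodies.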
The spatial metric is determined by the expression for the element of spatial distance:
$$dl^2 = \frac{dr^2}{1 - r_g/r} + r^2(\sin^2\theta\,d\phi^2 + d\theta^2). \tag{97.15}$$
The geometrical meaning of the coordinate r is determined by the fact that in the metric
(97.15) the circumference of a circle with its center at the center of the field is $2\pi r$. But the
distance between two points $r_1$ and $r_2$ along the same radius is given by the integral
$$\int_{r_1}^{r_2}\frac{dr}{\sqrt{1 - r_g/r}} > r_2 - r_1. \tag{97.16}$$
† For the field in the interior of a spherical cavity in a centrally symmetric distribution, we must have
const = 0, since otherwise the metric would have a singularity at r = 0. Thus the metric inside such a cavity
is automatically galilean, i.e., there is no gravitational field in the interior of the cavity (just as in Newtonian
theory).
Furthermore, we see that $g_{00} \le 1$. Combining with the formula (84.1) $d\tau = \sqrt{g_{00}}\,dt$,
defining the proper time, it follows that
$$d\tau \le dt. \tag{97.17}$$
The equality sign holds only at infinity, where t coincides with the proper time. Thus at
finite distances from the masses there is a "slowing down" of the time compared with the
time at infinity.
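The two statements (97.16) and (97.17) can be checked numerically. The sketch below, in arbitrary units with $r_g = 1$, compares a Simpson-rule evaluation of the radial-distance integral with a closed-form antiderivative (worked out here, not quoted from the text) and evaluates the clock rate $d\tau/dt = \sqrt{1 - r_g/r}$. The function names and the sample radii are assumptions made for illustration.

```python
import math


def clock_rate(r, r_g):
    """d(tau)/dt = sqrt(g_00) = sqrt(1 - r_g/r) for a clock at rest at r > r_g."""
    return math.sqrt(1.0 - r_g / r)


def radial_distance_closed(r1, r2, r_g):
    """Integral (97.16) from an antiderivative of 1/sqrt(1 - r_g/r); needs r_g < r1 < r2."""
    def antiderivative(r):
        s = math.sqrt(r * (r - r_g))
        return s + r_g * math.log(math.sqrt(r) + math.sqrt(r - r_g))
    return antiderivative(r2) - antiderivative(r1)


def radial_distance_simpson(r1, r2, r_g, n=2000):
    """Composite Simpson approximation of the same integral; n must be even."""
    h = (r2 - r1) / n
    f = lambda r: 1.0 / math.sqrt(1.0 - r_g / r)
    total = f(r1) + f(r2)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(r1 + i * h)
    return total * h / 3.0


r_g, r1, r2 = 1.0, 2.0, 10.0          # sample values, arbitrary units
d_closed = radial_distance_closed(r1, r2, r_g)
d_numeric = radial_distance_simpson(r1, r2, r_g)
print(d_closed, d_numeric, r2 - r1, clock_rate(r1, r_g))
```

The radial distance exceeds the coordinate difference $r_2 - r_1$, and the clock rate is below unity at any finite r.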
Finally, we present an approximate expression for $ds^2$ at large distances from the origin
of coordinates:
$$ds^2 = ds_0^2 - \frac{2km}{c^2 r}\,(dr^2 + c^2\,dt^2). \tag{97.18}$$
The second term represents a small correction to the galilean metric $ds_0^2$. At large distances
from the masses producing it, every field appears centrally symmetric. Therefore (97.18)
determines the metric at large distances from any system of bodies.
Certain general considerations can also be made concerning the behavior of a centrally
symmetric gravitational field in the interior of the gravitating masses. From equation (97.6)
we see that for $r \to 0$, $\lambda$ must also vanish at least like $r^2$; if this were not so the right side
of the equation would become infinite for $r \to 0$, that is, $T^0_0$ would have a singular point at
r = 0, which is physically impossible. Formally integrating (97.6) with the limiting condition
$\lambda|_{r=0} = 0$, we obtain
$$\lambda = -\ln\left\{1 - \frac{8\pi k}{c^4 r}\int_0^r T^0_0\,r^2\,dr\right\}. \tag{97.19}$$
Since, from (94.10), $T^0_0 = e^{-\nu}T_{00} \ge 0$, it is clear that $\lambda \ge 0$, that is,
$$e^{\lambda} \ge 1. \tag{97.20}$$
Subtracting equation (97.6) term by term from (97.4), we get:
$$\frac{e^{-\lambda}}{r}(\nu' + \lambda') = \frac{8\pi k}{c^4}\,(T^0_0 - T^1_1) = \frac{8\pi k}{c^4}\,\frac{(\varepsilon + p)(1 + v^2/c^2)}{1 - v^2/c^2} \ge 0,$$
i.e. $\nu' + \lambda' \ge 0$. But for $r \to \infty$ (far from the masses) the metric becomes galilean, i.e.
$\nu \to 0$, $\lambda \to 0$. Therefore, from $\nu' + \lambda' \ge 0$ it follows that over all space
$$\nu + \lambda \le 0. \tag{97.21}$$
Since $\lambda \ge 0$, it then follows that $\nu \le 0$, i.e.
$$e^{\nu} \le 1. \tag{97.22}$$
The inequalities obtained show that the above properties (97.16) and (97.17) of the
spatial metric and the behaviour of clocks in a centrally symmetric field in vacuum apply
equally well to the field in the interior of the gravitating masses.
If the gravitational field is produced by a spherical body of "radius" a, then for r > a,
we have $T^0_0 = 0$. For points with r > a, formula (97.19) therefore gives
$$\lambda = -\ln\left\{1 - \frac{8\pi k}{c^4 r}\int_0^a T^0_0\,r^2\,dr\right\}.$$
On the other hand, we can here apply the expression (97.14) referring to vacuum, according
to which
$$\lambda = -\ln\left(1 - \frac{2km}{c^2 r}\right).$$
Equating the two expressions, we get the formula
$$m = \frac{4\pi}{c^2}\int_0^a T^0_0\,r^2\,dr, \tag{97.23}$$
expressing the total mass of a body in terms of its energy-momentum tensor.
PROBLEMS
1. Determine the spatial curvature in a centrally symmetric gravitational field in vacuum.
Solution: The components of the spatial curvature tensor $P_{\alpha\beta\gamma\delta}$ can be expressed in terms of the
components of the tensor $P_{\alpha\beta}$ (and the tensor $\gamma_{\alpha\beta}$) so that we need only calculate $P_{\alpha\beta}$ (see problem 1
in § 92). The tensor $P_{\alpha\beta}$ is expressed in terms of $\gamma_{\alpha\beta}$ just as $R_{ik}$ is expressed in terms of $g_{ik}$. Using
the values of $\gamma_{\alpha\beta}$ from (97.15), we find from the calculations:
$$P^\theta_\theta = P^\phi_\phi = \frac{r_g}{2r^3}, \qquad P^r_r = -\frac{r_g}{r^3},$$
and $P^\beta_\alpha = 0$ for $\alpha \ne \beta$. We note that $P^\theta_\theta,\ P^\phi_\phi > 0$, $P^r_r < 0$, while $P \equiv P^\alpha_\alpha = 0$.
From the formula given in problem 1 of § 92, we find:
$$P_{r\theta r\theta} = (P^r_r + P^\theta_\theta)\,\gamma_{rr}\gamma_{\theta\theta} = -P^\phi_\phi\,\gamma_{rr}\gamma_{\theta\theta},$$
$$P_{r\phi r\phi} = -P^\theta_\theta\,\gamma_{rr}\gamma_{\phi\phi},$$
$$P_{\theta\phi\theta\phi} = -P^r_r\,\gamma_{\theta\theta}\gamma_{\phi\phi}.$$
It then follows (see the footnote on p. 263) that for a "plane" perpendicular to the radius, the
Gaussian curvature is
$$K = \frac{P_{\theta\phi\theta\phi}}{\gamma_{\theta\theta}\gamma_{\phi\phi}} = -P^r_r > 0$$
(which means that, in a small triangle drawn on the "plane" in the neighborhood of its intersection
with the radius perpendicular to it, the sum of the angles of the triangle is greater than $\pi$). As to
the "planes" which pass through the centre, their Gaussian curvature K < 0; this means that the
sum of the angles of a small triangle in such a "plane" is less than $\pi$ (however this does not refer
to the triangles embracing the centre: the sum of the angles in such a triangle is greater than $\pi$).
2. Determine the form of the surface of rotation on which the geometry would be the same as on
a "plane" passing through the origin in a centrally symmetric gravitational field in vacuo.
Solution: The geometry on the surface of rotation z = z(r) is determined (in cylindrical
coordinates) by the element of length:
$$dl^2 = dr^2 + dz^2 + r^2\,d\phi^2 = dr^2(1 + z'^2) + r^2\,d\phi^2.$$
Comparing with the element of length (97.15) in the "plane" $\theta = \pi/2$
$$dl^2 = r^2\,d\phi^2 + \frac{dr^2}{1 - r_g/r},$$
we find
$$1 + z'^2 = \left(1 - \frac{r_g}{r}\right)^{-1},$$
from which
$$z = 2\sqrt{r_g(r - r_g)}.$$
For $r = r_g$ this function has a singularity, a branch point. The reason for this is that the spatial
metric (97.15), in contrast to the space-time metric (97.14), actually has a singularity at $r = r_g$.
The general properties of the geometry on "planes" passing through the center, which were
mentioned in the preceding problem, can also be found by considering the curvature in the pictorial
model given here.
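As a quick numerical check of the result of Problem 2, the sketch below differentiates $z = 2\sqrt{r_g(r - r_g)}$ by central differences and verifies that $1 + z'^2$ reproduces the radial coefficient $(1 - r_g/r)^{-1}$ of the spatial metric. Units with $r_g = 1$ and the sample radii are assumptions chosen only for illustration.

```python
import math


def z(r, r_g):
    """Height of the surface of rotation found in Problem 2, for r >= r_g."""
    return 2.0 * math.sqrt(r_g * (r - r_g))


def dz_dr(r, r_g, h=1e-6):
    """Central-difference derivative of z with respect to r."""
    return (z(r + h, r_g) - z(r - h, r_g)) / (2.0 * h)


def embedding_residual(r, r_g):
    """Difference between 1 + z'^2 and the coefficient of dr^2 in (97.15)."""
    return 1.0 + dz_dr(r, r_g) ** 2 - 1.0 / (1.0 - r_g / r)


r_g = 1.0
residuals = [embedding_residual(r, r_g) for r in (1.5, 2.0, 5.0, 50.0)]
print(residuals)
```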
3. Transform the interval (97.14) to such coordinates that its element of spatial distance has
conformal-euclidean form, i.e. dl is proportional to its euclidean expression.
Solution: Setting
$$r = \rho\left(1 + \frac{r_g}{4\rho}\right)^2,$$
we get from (97.14)
$$ds^2 = \left(\frac{1 - r_g/4\rho}{1 + r_g/4\rho}\right)^2 c^2\,dt^2 - \left(1 + \frac{r_g}{4\rho}\right)^4(d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2\theta\,d\phi^2).$$
The coordinates $\rho, \theta, \phi$ are called isotropic spherical coordinates; instead of them we can also
introduce isotropic cartesian coordinates x, y, z. In particular, at large distances ($\rho \gg r_g$) we have
approximately:
$$ds^2 = \left(1 - \frac{r_g}{\rho}\right)c^2\,dt^2 - \left(1 + \frac{r_g}{\rho}\right)(dx^2 + dy^2 + dz^2).$$
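The transformation to isotropic coordinates in Problem 3 can be checked numerically. The sketch below assumes the substitution $r = \rho(1 + r_g/4\rho)^2$ and compares, at a few sample values of $\rho$, the time and radial coefficients of the Schwarzschild metric (97.14) with those of the isotropic form. The helper names and sample values are illustrative assumptions.

```python
def r_from_rho(rho, r_g):
    """Schwarzschild radial coordinate in terms of the isotropic one (assumed substitution)."""
    return rho * (1.0 + r_g / (4.0 * rho)) ** 2


def coefficient_mismatch(rho, r_g, h=1e-6):
    """Return (time mismatch, radial mismatch) between the two forms of the metric."""
    a = r_g / (4.0 * rho)
    r = r_from_rho(rho, r_g)
    # time part: 1 - r_g/r against ((1 - a)/(1 + a))^2
    dt_part = (1.0 - r_g / r) - ((1.0 - a) / (1.0 + a)) ** 2
    # radial part: (dr/drho)^2 / (1 - r_g/r) against (1 + a)^4
    dr_drho = (r_from_rho(rho + h, r_g) - r_from_rho(rho - h, r_g)) / (2.0 * h)
    dr_part = dr_drho ** 2 / (1.0 - r_g / r) - (1.0 + a) ** 4
    return dt_part, dr_part


r_g = 1.0
mismatches = [coefficient_mismatch(rho, r_g) for rho in (0.5, 1.0, 10.0)]
print(mismatches)
```

The angular part needs no separate check, since $r^2 = (1 + r_g/4\rho)^4\rho^2$ follows directly from the substitution.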
§ 98. Motion in a centrally symmetric gravitational field
Let us consider the motion of a body in a centrally symmetric gravitational field. As in
every centrally symmetric field, the motion occurs in a single "plane" passing through the
origin; we choose this plane as the plane $\theta = \pi/2$.
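To determine the trajectory of the body (with mass m), we use the Hamilton-Jacobi
equation:
$$g^{ik}\frac{\partial S}{\partial x^i}\frac{\partial S}{\partial x^k} - m^2c^2 = 0.$$
Using the $g^{ik}$ given in the expression (97.14), we find the following equation:
$$e^{-\nu}\left(\frac{\partial S}{c\,\partial t}\right)^2 - e^{\nu}\left(\frac{\partial S}{\partial r}\right)^2 - \frac{1}{r^2}\left(\frac{\partial S}{\partial\phi}\right)^2 - m^2c^2 = 0, \tag{98.1}$$
where
$$e^{\nu} = 1 - \frac{r_g}{r} \tag{98.2}$$
(m' is the mass of the body producing the field; $r_g = 2km'/c^2$ is its gravitational radius). By
the general procedure for solving the Hamilton-Jacobi equation, we look for an S in the
form
$$S = -\mathcal{E}_0 t + M\phi + S_r(r), \tag{98.3}$$
with constant energy $\mathcal{E}_0$ and angular momentum M. Substituting this in (98.1), we find the
equation
$$e^{-\nu}\frac{\mathcal{E}_0^2}{c^2} - \frac{M^2}{r^2} - e^{\nu}\left(\frac{\partial S_r}{\partial r}\right)^2 - m^2c^2 = 0,$$
from which
$$S_r = \int\sqrt{\frac{\mathcal{E}_0^2}{c^2}\,e^{-2\nu} - \left(m^2c^2 + \frac{M^2}{r^2}\right)e^{-\nu}}\;dr
= \int\left[\frac{r^2(\mathcal{E}_0^2 - m^2c^4)}{c^2(r - r_g)^2} + \frac{m^2c^2\,r\,r_g}{(r - r_g)^2} - \frac{M^2}{r(r - r_g)}\right]^{1/2}dr. \tag{98.4}$$
The trajectory is determined† by the equation $\partial S/\partial M = \text{const}$, from which
$$\phi = \int\frac{M\,dr}{r^2\sqrt{\dfrac{\mathcal{E}_0^2}{c^2} - \left(m^2c^2 + \dfrac{M^2}{r^2}\right)\left(1 - \dfrac{r_g}{r}\right)}}. \tag{98.5}$$
This integral reduces to an elliptic integral.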
For the motion of a planet in the field of attraction of the Sun, the relativistic theory
leads to only an insignificant correction compared to Newton's theory, since the velocities
of the planets are very small compared to the velocity of light. In the integrand in the equation
(98.5) for the trajectory, this corresponds to a small value for the ratio $r_g/r$, where $r_g$ is the
gravitational radius of the Sun.‡
To investigate the relativistic corrections to the trajectory, it is convenient to start from
the expression (98.4) for the radial part of the action, before differentiation with respect to M.
We make a transformation of the integration variable, writing
$$r(r - r_g) = r'^2, \quad\text{i.e.}\quad r - \frac{r_g}{2} \approx r',$$
as a result of which the second term under the square root takes the form $M^2/r'^2$. In the
first term we make an expansion in powers of $r_g/r'$, and obtain to the required accuracy:
$$S_r = \int\left[\left(2\mathcal{E}'m + \frac{\mathcal{E}'^2}{c^2}\right) + \frac{1}{r}\left(2m^2m'k + 4\mathcal{E}'m\,r_g\right) - \frac{1}{r^2}\left(M^2 - \frac{3m^2c^2r_g^2}{2}\right)\right]^{1/2}dr, \tag{98.6}$$
where for brevity we have dropped the prime on r' and introduced the nonrelativistic
energy $\mathcal{E}'$ (without the rest energy).
The correction terms in the coefficients of the first two terms under the square root have
only the not particularly interesting effect of changing the relation between the energy and
momentum of the particle and changing the parameters of its Newtonian orbit (ellipse). But
the change in the coefficient of $1/r^2$ leads to a more fundamental effect: a systematic
(secular) shift in the perihelion of the orbit.
† See Mechanics, § 47.
‡ For the Sun, $r_g = 3$ km; for Earth, $r_g = 0.9$ cm.
Since the trajectory is defined by the equation $\phi + (\partial S_r/\partial M) = \text{const}$, the change of
the angle $\phi$ after one revolution of the planet in its orbit is
$$\Delta\phi = -\frac{\partial}{\partial M}\Delta S_r,$$
where $\Delta S_r$ is the corresponding change in $S_r$. Expanding $S_r$ in powers of the small correction
to the coefficient of $1/r^2$, we get:
$$\Delta S_r = \Delta S_r^{(0)} - \frac{3m^2c^2r_g^2}{4M}\,\frac{\partial\Delta S_r^{(0)}}{\partial M},$$
where $\Delta S_r^{(0)}$ corresponds to the motion in the closed ellipse which is unshifted. Differentiating
this relation with respect to M, and using the fact that
$$-\frac{\partial}{\partial M}\Delta S_r^{(0)} = \Delta\phi^{(0)} = 2\pi,$$
we find:
$$\Delta\phi = 2\pi + \frac{3\pi m^2c^2r_g^2}{2M^2} = 2\pi + \frac{6\pi k^2m^2m'^2}{c^2M^2}.$$
The second term is the required angular displacement $\delta\phi$ of the Newtonian ellipse during
one revolution, i.e. the shift in the perihelion of the orbit. Expressing it in terms of the length
a of the semimajor axis and the eccentricity e of the ellipse by means of the formula
$$\frac{M^2}{km'm^2} = a(1 - e^2),$$
we obtain:†
$$\delta\phi = \frac{6\pi km'}{c^2a(1 - e^2)}. \tag{98.7}$$
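Formula (98.7) is easy to evaluate numerically. The following sketch converts the shift per revolution into seconds of arc per century for Mercury and the Earth. The orbital elements (semimajor axis, eccentricity, period) and the value of km' for the Sun are standard astronomical data assumed here, not values given in the text.

```python
import math

C = 299_792_458.0                       # speed of light, m/s
KM_SUN = 1.32712440018e20               # k * m' for the Sun, m^3/s^2 (assumed value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0


def shift_per_revolution(km_central, a, e):
    """Perihelion shift (98.7) in radians per revolution."""
    return 6.0 * math.pi * km_central / (C**2 * a * (1.0 - e**2))


def shift_per_century(km_central, a, e, period_days):
    """Accumulated shift in seconds of arc over one Julian century."""
    revolutions = DAYS_PER_CENTURY / period_days
    return shift_per_revolution(km_central, a, e) * revolutions * ARCSEC_PER_RAD


# ASSUMPTION: approximate orbital elements (a in metres, period in days).
mercury = shift_per_century(KM_SUN, 5.7909e10, 0.2056, 87.969)
earth = shift_per_century(KM_SUN, 1.49598e11, 0.0167, 365.256)
print(f"Mercury: {mercury:.1f} arcsec/century, Earth: {earth:.1f} arcsec/century")
```

With these inputs the results land close to the theoretical values quoted in the footnote to (98.7).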
Next we consider the path of a light ray in a centrally symmetric gravitational field. This
path is determined by the eikonal equation (87.9)
$$g^{ik}\frac{\partial\psi}{\partial x^i}\frac{\partial\psi}{\partial x^k} = 0,$$
which differs from the Hamilton-Jacobi equation only in having m set equal to zero. Therefore
the trajectory of the ray can be obtained immediately from (98.5) by setting m = 0; at
the same time, in place of the energy $\mathcal{E}_0 = -(\partial S/\partial t)$ of the particle we must write the
frequency of the light, $\omega_0 = -(\partial\psi/\partial t)$. Also introducing in place of the constant M a
constant $\varrho$ defined by $\varrho = cM/\omega_0$, we get:
$$\phi = \int\frac{dr}{r^2\sqrt{\dfrac{1}{\varrho^2} - \dfrac{1}{r^2}\left(1 - \dfrac{r_g}{r}\right)}}. \tag{98.8}$$
If we neglect the relativistic corrections ($r_g \to 0$), this equation gives $r = \varrho/\cos\phi$, i.e. a
straight line passing at a distance $\varrho$ from the origin. To study the relativistic corrections,
we proceed in the same way as in the previous case.
For the radial part of the eikonal we have [see (98.4)]:
$$\psi_r(r) = \frac{\omega_0}{c}\int\sqrt{\frac{r^2}{(r - r_g)^2} - \frac{\varrho^2}{r(r - r_g)}}\;dr.$$
Making the same transformations as were used to go from (98.4) to (98.6), we find:
$$\psi_r(r) = \frac{\omega_0}{c}\int\sqrt{1 + \frac{2r_g}{r} - \frac{\varrho^2}{r^2}}\;dr.$$
Expanding the integrand in powers of $r_g/r$, we have:
$$\psi_r = \psi_r^{(0)} + \frac{r_g\omega_0}{c}\int\frac{dr}{\sqrt{r^2 - \varrho^2}} = \psi_r^{(0)} + \frac{r_g\omega_0}{c}\cosh^{-1}\frac{r}{\varrho},$$
where $\psi_r^{(0)}$ corresponds to the classical straight ray.
† Numerical values of the shifts determined from formula (98.7) for Mercury and Earth are equal,
respectively, to 43.0" and 3.8" per century. Astronomical measurements give 43.1 ± 0.4 and 5.0 ± 1.2, in
excellent agreement with theory.
The total change in $\psi_r$ during the propagation of the light from some very large distance R
to the point $r = \varrho$ nearest to the center and then back to the distance R is equal to
$$\Delta\psi_r = \Delta\psi_r^{(0)} + 2\,\frac{r_g\omega_0}{c}\cosh^{-1}\frac{R}{\varrho}.$$
The corresponding change in the polar angle $\phi$ along the ray is obtained by differentiation
with respect to $M = \varrho\omega_0/c$:
$$\Delta\phi = -\frac{\partial\Delta\psi_r}{\partial M} = -\frac{\partial\Delta\psi_r^{(0)}}{\partial M} + \frac{2r_gR}{\varrho\sqrt{R^2 - \varrho^2}}.$$
Finally, going to the limit $R \to \infty$, and noting that the straight ray corresponds to $\Delta\phi = \pi$,
we get:
$$\Delta\phi = \pi + \frac{2r_g}{\varrho}.$$
This means that under the influence of the field of attraction the light ray is bent: its
trajectory is a curve which is concave toward the center (the ray is "attracted" toward the
center), so that the angle between its two asymptotes differs from $\pi$ by
$$\delta\phi = \frac{2r_g}{\varrho} = \frac{4km'}{c^2\varrho}; \tag{98.9}$$
in other words, the ray of light, passing at a distance $\varrho$ from the center of the field, is
deflected through an angle $\delta\phi$.†
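As an illustration of (98.9), the sketch below evaluates the deflection angle for a ray grazing the edge of the Sun. The solar radius and the value of km' are standard astronomical constants assumed for this example; they are not given in the text.

```python
import math

C = 299_792_458.0                       # speed of light, m/s
KM_SUN = 1.32712440018e20               # k * m' for the Sun, m^3/s^2 (assumed value)
R_SUN = 6.957e8                         # solar radius, m (assumed value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deflection_angle(km_central, impact_distance):
    """Deflection (98.9) in radians for a ray passing at the given distance from the center."""
    return 4.0 * km_central / (C**2 * impact_distance)


grazing_arcsec = deflection_angle(KM_SUN, R_SUN) * ARCSEC_PER_RAD
print(f"deflection at the solar limb: {grazing_arcsec:.2f} arcsec")
```

Since the angle falls off as the inverse of the impact distance, a ray passing at twice the solar radius is deflected by half as much.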
§ 99. The synchronous reference system
As we know from § 84, the condition for it to be possible to synchronize clocks at different
points in space is that the components $g_{0\alpha}$ of the metric tensor be equal to zero. If,
in addition, $g_{00} = 1$, the time coordinate $x^0 = t$ is the proper time at each point in space.‡
A reference system satisfying the conditions
$$g_{0\alpha} = 0, \qquad g_{00} = 1 \tag{99.1}$$
is said to be synchronous. The interval element in such a system is given by the expression
$$ds^2 = dt^2 - \gamma_{\alpha\beta}\,dx^\alpha\,dx^\beta, \tag{99.2}$$
where the components of the spatial metric tensor are the same (except for sign) as the $g_{\alpha\beta}$:
$$\gamma_{\alpha\beta} = -g_{\alpha\beta}. \tag{99.3}$$
The three-dimensional tensor $\gamma_{\alpha\beta}$ determines the spatial metric.
In the synchronous reference system the time lines are geodesics in the four-space. The
four-vector $u^i = dx^i/ds$, which is tangent to the world line $x^1, x^2, x^3 = \text{const}$, has components
$u^\alpha = 0$, $u^0 = 1$, and automatically satisfies the geodesic equations:
$$\frac{du^i}{ds} + \Gamma^i_{kl}u^ku^l = \Gamma^i_{00} = 0,$$
since, from the conditions (99.1), the Christoffel symbols $\Gamma^\alpha_{00}$ and $\Gamma^0_{00}$ vanish identically.
† For a ray just skirting the edge of the Sun, $\delta\phi = 1.75$".
‡ In this section we set c = 1.
It is also easy to see that these lines are normal to the hypersurfaces t = const. In fact,
the four-vector normal to such a hypersurface, $n_i = \partial t/\partial x^i$, has covariant components
$n_0 = 1$, $n_\alpha = 0$. With the conditions (99.1), the corresponding contravariant components
are also $n^0 = 1$, $n^\alpha = 0$, i.e., they coincide with the components of the four-vector $u^i$ which is
tangent to the time lines.
Conversely, these properties can be used for the geometrical construction of a synchronous
reference system in any space-time. For this purpose we choose as our starting
surface any spacelike hypersurface, i.e., a hypersurface whose normals at each point have a
timelike direction (they lie inside the light cone with its vertex at this point); all elements of
interval on such a hypersurface are spacelike. Next we construct the family of geodesic lines
normal to this hypersurface. If we now choose these lines as the time coordinate lines and
determine the time coordinate t as the length s of the geodesic line measured from the
initial hypersurface, we obtain a synchronous reference system.
It is clear that such a construction, and the selection of a synchronous reference system, is
always possible in principle.
Furthermore, this choice is still not unique. A metric of the form (99.2) allows any transformation
of the space coordinates which does not affect the time, and also transformations
corresponding to the arbitrariness in the choice of the initial hypersurface for the geometrical
construction.
The transformation to the synchronous reference system can, in principle, be done
analytically by using the Hamilton-Jacobi equation; the basis of this method is the fact that
the trajectories of a particle in a gravitational field are just the geodesic lines.
The Hamilton-Jacobi equation for a particle (whose mass we set equal to unity) in a
gravitational field is
$$g^{ik}\frac{\partial\tau}{\partial x^i}\frac{\partial\tau}{\partial x^k} = 1 \tag{99.4}$$
(where we denote the action by $\tau$). Its complete integral has the form:
$$\tau = f(\xi^\alpha, x^i) + A(\xi^\alpha), \tag{99.5}$$
where f is a function of the four coordinates $x^i$ and the three parameters $\xi^\alpha$; the fourth
constant A we treat as an arbitrary function of the three $\xi^\alpha$. With such a representation for $\tau$,
the equations for the trajectory of the particle can be obtained by equating the derivatives
$\partial\tau/\partial\xi^\alpha$ to zero, i.e.
$$\frac{\partial f}{\partial\xi^\alpha} = -\frac{\partial A}{\partial\xi^\alpha}. \tag{99.6}$$
For each set of assigned values of the parameters $\xi^\alpha$, the right sides of equations (99.6)
have definite constant values, and the world line determined by these equations is one of the
possible trajectories of the particle. Choosing the quantities $\xi^\alpha$, which are constant along
the trajectory, as new space coordinates, and the quantity $\tau$ as the new time coordinate, we
get the synchronous reference system; the transformation which takes us from the old
coordinates to the new is given by equations (99.5)-(99.6). In fact it is guaranteed that for such a
transformation the time lines will be geodesics and will be normal to the hypersurfaces
$\tau = \text{const}$. The latter point is obvious from the mechanical analogy: the four-vector $-\partial\tau/\partial x^i$
which is normal to the hypersurface coincides in mechanics with the four-momentum of the
particle, and therefore coincides in direction with its four-velocity $u^i$, i.e. with the four-vector
tangent to the trajectory. Finally the condition $g_{00} = 1$ is obviously satisfied, since the
derivative $-d\tau/ds$ of the action along the trajectory is the mass of the particle, which we set
equal to 1; therefore $|d\tau/ds| = 1$.
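We write the Einstein equations in the synchronous reference system, separating the
operations of space and time differentiation in the equations.
We introduce the notation
$$\kappa_{\alpha\beta} = \frac{\partial\gamma_{\alpha\beta}}{\partial t} \tag{99.7}$$
for the time derivatives of the three-dimensional metric tensor; these quantities also form a
three-dimensional tensor. All operations of shifting indices and covariant differentiation of
the three-dimensional tensor $\kappa_{\alpha\beta}$ will be done in three-dimensional space with the metric
$\gamma_{\alpha\beta}$.† We note that the sum $\kappa^\alpha_\alpha$ is the logarithmic derivative of the determinant $\gamma \equiv |\gamma_{\alpha\beta}|$:
$$\kappa^\alpha_\alpha = \gamma^{\alpha\beta}\frac{\partial\gamma_{\alpha\beta}}{\partial t} = \frac{\partial}{\partial t}\ln\gamma. \tag{99.8}$$
For the Christoffel symbols we find the expressions:
$$\Gamma^0_{00} = \Gamma^\alpha_{00} = \Gamma^0_{0\alpha} = 0,$$
$$\Gamma^0_{\alpha\beta} = \tfrac{1}{2}\kappa_{\alpha\beta}, \qquad \Gamma^\alpha_{0\beta} = \tfrac{1}{2}\kappa^\alpha_\beta, \qquad \Gamma^\alpha_{\beta\gamma} = \lambda^\alpha_{\beta\gamma}, \tag{99.9}$$
where $\lambda^\alpha_{\beta\gamma}$ are the three-dimensional Christoffel symbols formed from the tensor $\gamma_{\alpha\beta}$. A
calculation using formula (92.10) gives the following expressions for the components of the
tensor $R_{ik}$:
$$\begin{aligned}
R_{00} &= -\frac{1}{2}\frac{\partial}{\partial t}\kappa^\alpha_\alpha - \frac{1}{4}\kappa^\beta_\alpha\kappa^\alpha_\beta, \\
R_{0\alpha} &= \frac{1}{2}\left(\kappa^\beta_{\alpha;\beta} - \kappa^\beta_{\beta;\alpha}\right), \\
R_{\alpha\beta} &= \frac{1}{2}\frac{\partial}{\partial t}\kappa_{\alpha\beta} + \frac{1}{4}\left(\kappa_{\alpha\beta}\kappa^\gamma_\gamma - 2\kappa^\gamma_\alpha\kappa_{\beta\gamma}\right) + P_{\alpha\beta}.
\end{aligned} \tag{99.10}$$
Here $P_{\alpha\beta}$ is the three-dimensional Ricci tensor which is expressed in terms of $\gamma_{\alpha\beta}$ in the same
way as $R_{ik}$ is expressed in terms of $g_{ik}$. All operations of raising indices and of covariant
differentiation are carried out in the three-dimensional space with the metric $\gamma_{\alpha\beta}$.
We write the Einstein equations in mixed components:
$$R^0_0 = -\frac{1}{2}\frac{\partial}{\partial t}\kappa^\alpha_\alpha - \frac{1}{4}\kappa^\beta_\alpha\kappa^\alpha_\beta = 8\pi k\left(T^0_0 - \tfrac{1}{2}T\right), \tag{99.11}$$
$$R^0_\alpha = \frac{1}{2}\left(\kappa^\beta_{\alpha;\beta} - \kappa^\beta_{\beta;\alpha}\right) = 8\pi k\,T^0_\alpha, \tag{99.12}$$
$$R^\beta_\alpha = -P^\beta_\alpha - \frac{1}{2\sqrt{\gamma}}\frac{\partial}{\partial t}\left(\sqrt{\gamma}\,\kappa^\beta_\alpha\right) = 8\pi k\left(T^\beta_\alpha - \tfrac{1}{2}\delta^\beta_\alpha T\right). \tag{99.13}$$
† But this does not, of course, apply to operations of shifting indices in the space components of the
four-tensors $R_{ik}$, $T_{ik}$ (see the footnote on p. 250). Thus $T^\beta_\alpha$ must be understood as before to be $g^{\beta\gamma}T_{\gamma\alpha} + g^{\beta 0}T_{0\alpha}$,
which in the present case reduces to $g^{\beta\gamma}T_{\gamma\alpha}$ and differs in sign from $\gamma^{\beta\gamma}T_{\gamma\alpha}$.
A characteristic feature of synchronous reference systems is that they are not stationary:
the gravitational field cannot be constant in such a system. In fact, in a constant field we
would have $\kappa_{\alpha\beta} = 0$. But in the presence of matter the vanishing of all the $\kappa_{\alpha\beta}$ would
contradict (99.11) (which has a right side different from zero). In empty space we would find
from (99.13) that all the $P_{\alpha\beta}$, and with them all the components of the three-dimensional
curvature tensor $P_{\alpha\beta\gamma\delta}$, vanish, i.e. the field vanishes entirely (in a synchronous system with
a euclidean spatial metric the space-time is flat).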
At the same time the matter filling the space cannot in general be at rest relative to the
synchronous reference frame. This is obvious from the fact that particles of matter within
which there are pressures generally move along lines that are not geodesics; the world line
of a particle at rest is a time line, and thus is a geodesic in the synchronous reference system.
An exception is the case of "dust" (p = 0). Here the particles interacting with one another
will move along geodesic lines; consequently, in this case the condition for a synchronous
reference system does not contradict the condition that it be comoving with the matter.†
For other equations of state a similar situation can occur only in special cases when the
pressure gradient vanishes in all or in certain directions.
From (99.11) one can show that the determinant $-g = \gamma$ of the metric tensor in a synchronous
system must necessarily go to zero in a finite length of time.
To prove this we note that the expression on the right side of this equation is positive
for any distribution of the matter. In fact, in a synchronous reference system we have for
the energy-momentum tensor (94.9):
$$T^0_0 - \tfrac{1}{2}T = \tfrac{1}{2}(\varepsilon + 3p) + \frac{(p + \varepsilon)v^2}{1 - v^2}$$
[the components of the four-velocity are given by (88.14)]; this quantity is clearly positive.
The same statement is also true for the energy-momentum tensor of the electromagnetic
field (T = 0, $T^0_0$ is the positive energy density of the field). Thus we have from (99.11):
$$-R^0_0 = \frac{1}{2}\frac{\partial}{\partial t}\kappa^\alpha_\alpha + \frac{1}{4}\kappa^\beta_\alpha\kappa^\alpha_\beta \le 0 \tag{99.14}$$
(where the equality sign applies in empty space).
Using the algebraic inequality‡
$$\kappa^\beta_\alpha\kappa^\alpha_\beta \ge \tfrac{1}{3}\left(\kappa^\alpha_\alpha\right)^2$$
we can rewrite (99.14) in the form
$$\frac{\partial}{\partial t}\kappa^\alpha_\alpha + \frac{1}{6}\left(\kappa^\alpha_\alpha\right)^2 \le 0$$
or
$$\frac{\partial}{\partial t}\left(\frac{1}{\kappa^\alpha_\alpha}\right) \ge \frac{1}{6}. \tag{99.15}$$
Suppose, for example, that at a certain time $\kappa^\alpha_\alpha > 0$. Then as t decreases the quantity $1/\kappa^\alpha_\alpha$
decreases, with a finite (nonzero) derivative, so that it must go to zero (from positive values)
in the course of a finite time. In other words, $\kappa^\alpha_\alpha$ goes to $+\infty$, but since $\kappa^\alpha_\alpha = \partial\ln\gamma/\partial t$, this
means that the determinant $\gamma$ goes to zero [no faster than $t^6$, according to the inequality
(99.15)]. If on the other hand $\kappa^\alpha_\alpha < 0$ at the initial time, we get the same result for increasing
times.
† Even in this case, in order to be able to choose a "synchronously comoving" system of reference, it is
still necessary that the matter move "without rotation". In the comoving system the contravariant components
of the velocity are $u^0 = 1$, $u^\alpha = 0$. If the reference system is also synchronous, the covariant components
must satisfy $u_0 = 1$, $u_\alpha = 0$, so that its four-dimensional curl must vanish:
$$u_{i;k} - u_{k;i} \equiv \frac{\partial u_i}{\partial x^k} - \frac{\partial u_k}{\partial x^i} = 0.$$
But this tensor equation must then also be valid in any other reference frame. Thus, in a synchronous, but
not comoving, system, we then get the condition curl v = 0 for the three-dimensional velocity v.
‡ Its validity can easily be seen by bringing the tensor $\kappa_{\alpha\beta}$ to diagonal form (at a given instant of time).
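The chain of inequalities leading to (99.15) can be illustrated on a concrete synchronous vacuum metric. The sketch below uses the Kasner form $\gamma_{\alpha\beta} = \mathrm{diag}(t^{2p_1}, t^{2p_2}, t^{2p_3})$ with $\sum p = \sum p^2 = 1$; this example is an assumption brought in for illustration and is not taken from this section. It checks that (99.14) holds with the equality sign, that the algebraic inequality is satisfied, and that the determinant $\gamma = t^2$ vanishes within the time $6/\kappa^\alpha_\alpha$ allowed by (99.15).

```python
# ASSUMPTION: Kasner exponents chosen for illustration (sum p = sum p^2 = 1).
P = (-1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)


def kappa_diag(t):
    """Diagonal of the mixed tensor kappa^beta_alpha for gamma = diag(t^(2p))."""
    return [2.0 * p / t for p in P]


def kappa_trace(t):
    """kappa^alpha_alpha, the logarithmic time derivative of the determinant."""
    return sum(kappa_diag(t))


def kappa_square(t):
    """kappa^beta_alpha kappa^alpha_beta for the diagonal tensor."""
    return sum(k * k for k in kappa_diag(t))


def lhs_99_14(t, h=1e-5):
    """(1/2) d(kappa^a_a)/dt + (1/4) kappa^b_a kappa^a_b, using a central difference."""
    dtrace = (kappa_trace(t + h) - kappa_trace(t - h)) / (2.0 * h)
    return 0.5 * dtrace + 0.25 * kappa_square(t)


def d_inverse_trace(t, h=1e-5):
    """Time derivative of 1/kappa^a_a, bounded below by 1/6 according to (99.15)."""
    return (1.0 / kappa_trace(t + h) - 1.0 / kappa_trace(t - h)) / (2.0 * h)


t0 = 2.0
bound = 6.0 / kappa_trace(t0)     # longest time before t0 at which gamma may still be nonzero
print(lhs_99_14(t0), kappa_square(t0), kappa_trace(t0) ** 2 / 3.0, d_inverse_trace(t0), bound)
```

In this example the determinant actually vanishes at t = 0, i.e. a time $t_0$ before the chosen instant, well inside the bound $6/\kappa^\alpha_\alpha = 3t_0$.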
This result does not, however, by any means prove that there must be a real physical
singularity of the metric. A physical singularity is one that is characteristic of the space-time
itself, and is not related to the character of the reference frame chosen (such a singularity
should be characterized by the tending to infinity of various scalar quantities, such as the matter
density or the invariants of the curvature tensor). The singularity in the synchronous
reference system, which we have proven to be inevitable, is in general actually fictitious,
and disappears when we change to another (nonsynchronous) reference frame. Its origin
is evident from simple geometrical arguments.
We saw earlier that setting up a synchronous system reduces to the construction of a
family of geodesic lines orthogonal to any spacelike hypersurface. But the geodesic lines
of an arbitrary family will, in general, intersect one another on certain enveloping
hypersurfaces, the four-dimensional analogues of the caustic surfaces of geometrical optics.
We know that intersection of the coordinate lines gives rise to a singularity of the metric in
the particular coordinate system. Thus there is a geometrical reason for the appearance of a
singularity, associated with the specific properties of the synchronous system and therefore
not physical in character. In general an arbitrary metric of four-space also permits the
existence of nonintersecting families of geodesic lines. The unavoidable vanishing of the
determinant in the synchronous system means that the curvature properties of a real
(non-flat) space-time (which are expressed by the inequality $R^0_0 \ge 0$) that are permitted by the
field equations exclude the possibility of existence of such families, so that the time lines in a
synchronous reference system necessarily intersect one another.†
We mentioned earlier that for dustlike matter the synchronous reference system can also
be comoving. In this case the density of the matter goes to infinity at the caustic, simply
as a result of the intersection of the world trajectories of the particles, which coincide with
the time lines. It is, however, clear that this singularity of the density can be eliminated by
introducing an arbitrarily small nonzero pressure of the matter, and in this sense is not
physical in character.
† For the analytic structure of the metric in the vicinity of a fictitious singularity in a synchronous
reference system, see E. M. Lifshitz, V. V. Sudakov and I. M. Khalatnikov, JETP 40, 1847, 1961 (Soviet
Phys. JETP 13, 1298, 1961).
The general character of the metric is clear from geometrical considerations. Since the caustic
hypersurface always contains timelike intervals (the line elements of the geodesic time lines at their points of
tangency to the caustic), it is not spacelike. Furthermore, on the caustic one of the principal values of the metric
tensor $\gamma_{\alpha\beta}$ vanishes, corresponding to the vanishing of the distance ($\delta$) between two neighboring geodesics
that intersect one another at their point of tangency to the caustic. The quantity $\delta$ goes to zero as the first
power of the distance (l) to the point of intersection. Thus the principal value of the metric tensor, and with
it the determinant $\gamma$, goes to zero like $l^2$.
The synchronous reference system can also be constructed so that the time lines intersect on a set of points
having lower dimensionality than a hypersurface, namely on a two-dimensional surface that may be called the
focal surface corresponding to the family of geodesics. The analytic construction of such a metric is given by
V. A. Belinskii and I. M. Khalatnikov, JETP 49, 1000, 1965 (Soviet Phys. JETP 22, 694, 1966).
PROBLEMS
1. Find the form of the solution of the gravitational field equations in vacuum in the vicinity of a
point that is not singular, but regular in the time.
Solution: Having agreed on the convention that the time under consideration is the time origin,
we look for $\gamma_{\alpha\beta}$ in the form:
$$\gamma_{\alpha\beta} = a_{\alpha\beta} + t\,b_{\alpha\beta} + t^2c_{\alpha\beta} + \ldots, \tag{1}$$
where $a_{\alpha\beta}$, $b_{\alpha\beta}$, $c_{\alpha\beta}$ are functions of the space coordinates. In this same approximation the reciprocal
tensor is:
$$\gamma^{\alpha\beta} = a^{\alpha\beta} - t\,b^{\alpha\beta} + t^2\left(b^{\alpha\gamma}b^\beta_\gamma - c^{\alpha\beta}\right),$$
where $a^{\alpha\beta}$ is the tensor reciprocal to $a_{\alpha\beta}$ and the raising of indices of the other tensors is done by
using $a^{\alpha\beta}$. We also have:
$$\kappa_{\alpha\beta} = b_{\alpha\beta} + 2t\,c_{\alpha\beta}, \qquad \kappa^\beta_\alpha = b^\beta_\alpha + t\left(2c^\beta_\alpha - b_{\alpha\gamma}b^{\beta\gamma}\right).$$
The Einstein equations (99.11)-(99.13) lead to the following relations:
$$R^0_0 = -c + \tfrac{1}{4}b^\beta_\alpha b^\alpha_\beta = 0, \tag{2}$$
$$R^0_\alpha = \tfrac{1}{2}\left(b^\beta_{\alpha;\beta} - b_{;\alpha}\right) + t\left[-c_{;\alpha} + \tfrac{3}{8}\left(b^\gamma_\beta b^\beta_\gamma\right)_{;\alpha} + c^\beta_{\alpha;\beta} + \tfrac{1}{4}b^\beta_\alpha b_{;\beta} - \tfrac{1}{2}\left(b^\gamma_\alpha b^\beta_\gamma\right)_{;\beta}\right] = 0, \tag{3}$$
$$R^\beta_\alpha = -P^\beta_\alpha - \tfrac{1}{4}b^\beta_\alpha b + \tfrac{1}{2}b^\gamma_\alpha b^\beta_\gamma - c^\beta_\alpha = 0 \tag{4}$$
($b \equiv b^\alpha_\alpha$, $c \equiv c^\alpha_\alpha$). The operations of covariant differentiation are carried out in the three-dimensional
space with metric $a_{\alpha\beta}$; the tensor $P_{\alpha\beta}$ is also defined with respect to this metric.
From (4) the coefficients $c_{\alpha\beta}$ are completely determined in terms of the coefficients $a_{\alpha\beta}$ and $b_{\alpha\beta}$.
Then (2) gives the relation
$$P + \tfrac{1}{4}b^2 - \tfrac{1}{4}b^\beta_\alpha b^\alpha_\beta = 0. \tag{5}$$
From the terms of zero order in (3) we have:
$$b^\beta_{\alpha;\beta} = b_{;\alpha}. \tag{6}$$
The terms ~ t in this equation vanish identically when we use (5) and (6) and the identity
$P^\beta_{\alpha;\beta} = \tfrac{1}{2}P_{;\alpha}$ [see (92.13)].
Thus the twelve quantities $a_{\alpha\beta}$, $b_{\alpha\beta}$ are related to one another by the one relation (5) and the
three relations (6), so that there remain eight arbitrary functions of the three space coordinates.
Of these, three are related to the possibility of arbitrary transformations of the three space
coordinates, and one to the arbitrariness in choosing the initial hypersurface for setting up the
synchronous reference system. Therefore we are left with the correct number (see the end of § 95)
of four "physically different" arbitrary functions.
2. Calculate the components of the curvature tensor R iklm in the synchronous reference system.
Solution: Using the Christoffel symbols (99.10) we find from (92.4):
R a eyi= —Patrt + lOt'aXer — X'rXf)*
RoaPy == HyXay; B XaB; y)>
18 1
Robots ~~2 8f Xafi ~~ 4 Ka v X/3 '
where P af1i is the threedimensional curvature tensor corresponding to the threedimensional
m t^Find 'the general form of the infinitesimal transformation from one synchronous reference
system to another.
Solution: The transformation has the form
,_»,+ ,,(*!, x \ x 3 ), x a +x a + ?(x\ x 2 , x 3 , t),
where m and {« are small quantities. We are guaranteed that the condition goo = 1 is satisfied by
keeping <p independent of t\ to maintain the condition g 0a = 0, we must satisfy the equations
d£ e __ 8<p
7a "Tt ~aP
from which
where the/* are again small quantities (forming a threedimensional vector f). The spatial metric
tensor y a0 is replaced by
[as can be easily verified using (94.3)].
Thus the transformation contains four arbitrary functions ($\varphi$, $f^\alpha$) of the space coordinates.
§ 100. Gravitational collapse
In the Schwarzschild metric (97.14), $g_{00}$ goes to zero and $g_{11}$ to infinity at $r = r_g$ (on the
"Schwarzschild sphere"). This could give the basis for concluding that there must be a
singularity of the space-time metric and that it is therefore impossible for bodies to exist
that have a "radius" (for a given mass) that is less than the gravitational radius. Actually,
however, this conclusion would be wrong. This is already evident from the fact that the
determinant $g = -r^4 \sin^2\theta$ has no singularity at $r = r_g$, so that the condition $g < 0$ (82.3)
is not violated. We shall see that in fact we are dealing simply with the impossibility of
establishing a suitable reference system for $r < r_g$.
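To make clear the true character of the space-time metric in this domain† we make a
transformation of the coordinates of the form:
$$c\tau = \pm ct \pm \int\frac{f(r)\,dr}{1 - \frac{r_g}{r}}, \qquad R = ct + \int\frac{dr}{\left(1 - \frac{r_g}{r}\right)f(r)}. \tag{100.1}$$
Then
$$ds^2 = \frac{1 - \frac{r_g}{r}}{1 - f^2}\left(c^2\,d\tau^2 - f^2\,dR^2\right) - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right).$$
We eliminate the singularity at $r = r_g$ by choosing $f(r)$ so that $f(r_g) = 1$. If we set $f(r) = \sqrt{r_g/r}$,
then the new coordinate system will also be synchronous ($g_{\tau\tau} = 1$). First choosing the upper
sign in (100.1), we have:
$$R - c\tau = \int\frac{(1 - f^2)\,dr}{\left(1 - \frac{r_g}{r}\right)f} = \int\sqrt{\frac{r}{r_g}}\,dr = \frac{2}{3}\frac{r^{3/2}}{r_g^{1/2}},$$
or
$$r = \left[\frac{3}{2}(R - c\tau)\right]^{2/3} r_g^{1/3} \tag{100.2}$$
(we set the integration constant, which depends on the time origin, equal to zero). The
element of interval is:
$$ds^2 = c^2\,d\tau^2 - \frac{dR^2}{\left[\frac{3}{2r_g}(R - c\tau)\right]^{2/3}} - \left[\frac{3}{2}(R - c\tau)\right]^{4/3} r_g^{2/3}\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{100.3}$$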
In these coordinates the singularity on the Schwarzschild sphere [to which there corresponds
the equality $\frac{3}{2}(R - c\tau) = r_g$] is absent. The coordinate $R$ is everywhere spacelike,
while $\tau$ is timelike. The metric (100.3) is nonstationary. As in every synchronous reference
system, the time lines are geodesics. In other words, "test" particles at rest relative to the
reference system are particles moving freely in the given field.
To given values of $r$ there correspond world lines $R - c\tau = \text{const}$ (the sloping straight
lines in Fig. 20). The world lines of particles at rest relative to the reference system are
shown on this diagram as vertical lines; moving along these lines, after a finite interval of
proper time the particles "fall in" to the center of the field ($r = 0$), which is the location of
the true singularity of the metric.
Fig. 20.
† This was first done by D. Finkelstein (1958) using a different transformation. The particular metric
(100.3) was first found in a different way by Lemaitre (1938) and in the present connection by Yu. Rylov
(1961).
Let us consider the propagation of radial light signals. The equation $ds = 0$ (for
$\theta$, $\varphi$ = const) gives for the derivative $d\tau/dR$ along the ray:
$$c\frac{d\tau}{dR} = \pm\frac{1}{\left[\frac{3}{2r_g}(R - c\tau)\right]^{1/3}} = \pm\sqrt{\frac{r_g}{r}}, \tag{100.4}$$
the two signs corresponding to the two boundaries of the light "cone" with its vertex at
the given world point. When $r > r_g$ (point $a$ in Fig. 20) the slope of these boundaries
satisfies $|c\,d\tau/dR| < 1$, so that the straight line $r$ = const (along which $c\,d\tau/dR = 1$) falls
inside the cone. But in the region $r < r_g$ (point $a'$) we have $|c\,d\tau/dR| > 1$, so that the line
$r$ = const, the world line of a particle at rest relative to the center of the field, lies outside
the cone. Both boundaries of the cone intersect the line $r = 0$ at a finite distance, approaching
it along a vertical. Since no causally related events can lie on the world line outside the
light cone, it follows that in the region $r < r_g$ no particles can be at rest. Here all interactions
and signals propagate in the direction toward the center, reaching it after a finite interval
of time $\tau$.
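The light-cone argument can be illustrated numerically. The following is a minimal sketch in Python; the function names `radius` and `light_cone_slope` are illustrative and do not come from the text. It assumes only (100.2) and (100.4) and evaluates $|c\,d\tau/dR|$ outside, on, and inside the Schwarzschild sphere.

```python
import math

def radius(R, c_tau, r_g):
    """Radial coordinate r from (100.2); valid for R > c_tau."""
    return (1.5 * (R - c_tau)) ** (2.0 / 3.0) * r_g ** (1.0 / 3.0)

def light_cone_slope(R, c_tau, r_g):
    """|c dtau/dR| along a radial light ray, from (100.4): sqrt(r_g / r)."""
    return math.sqrt(r_g / radius(R, c_tau, r_g))

if __name__ == "__main__":
    r_g = 1.0
    # Values of R - c*tau (in units of r_g): outside, on, and inside the sphere.
    for offset in (10.0, 2.0 / 3.0, 0.1):
        r = radius(offset, 0.0, r_g)
        slope = light_cone_slope(offset, 0.0, r_g)
        print(f"r = {r:.3f} r_g, |c dtau/dR| = {slope:.3f}")
```

The slope is below unity for $r > r_g$, equal to unity on the Schwarzschild sphere, and above unity for $r < r_g$, which is the statement that the line $r$ = const leaves the light cone inside the sphere.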
Similarly, choosing the lower signs in (100.1) we would obtain an "expanding" reference
system with a metric differing from (100.3) by a change of the sign of $\tau$. It corresponds to a
space-time in which (in the region $r < r_g$) again rest is impossible, but all signals propagate
outward from the center.
The results described here can be applied to the problem of the behavior of massive
bodies in the general theory of relativity.†
† We caution against any applications to elementary particles: the entire theory presented in this book
already loses its validity for dimensions $\sim\hbar/mc$, which is by an enormous factor ($\sim 10^{40}$) greater than $km/c^2$.
The investigation of the relativistic conditions for equilibrium of a spherical body shows
that for a body of sufficiently large mass, states of static equilibrium cannot exist.‡ It is
clear that such a body must contract without limit (i.e. it must undergo "gravitational
collapse").†
‡ See Statistical Physics, § 111.
In a reference system not attached to the body and galilean at infinity [metric (97.14)],
the radius of the central body cannot be less than $r_g$. This means that according to the clocks
$t$ of a distant observer the radius of the contracting body only approaches the gravitational
radius asymptotically as $t \to \infty$. It is easy to find the limiting form of this dependence.
A particle on the surface of the contracting body is at all times in the field of attraction of
a constant mass $m$, the total mass of the body. As $r \to r_g$ the gravitational force becomes very
large; but the density of the body (and with it, the pressure) remains finite. Neglecting the
pressure forces for this reason, we reduce the determination of the time dependence $r = r(t)$
of the radius of the body to a consideration of the free fall of a test particle in the field of
the mass $m$.
The function $r(t)$ for fall in a Schwarzschild field can be found (using the Hamilton-Jacobi
method) from the equation $\partial S/\partial\mathcal{E}_0$ = const, with the action $S$ from (98.34), where,
for the case of purely radial motion, the angular momentum $M = 0$. Thus we get
$$ct = \frac{\mathcal{E}}{mc^2}\int\frac{dr}{\left(1 - \frac{r_g}{r}\right)\sqrt{\left(\frac{\mathcal{E}}{mc^2}\right)^2 - 1 + \frac{r_g}{r}}} \tag{100.5}$$
(for brevity we omit the subscript on $\mathcal{E}_0$). This integral diverges like $-r_g\ln(r - r_g)$ for $r \to r_g$.
Thus we find the asymptotic formula for the approach of $r$ to $r_g$:
$$r - r_g = \text{const}\cdot e^{-ct/r_g}. \tag{100.6}$$
Although the rate of contraction as observed from outside goes to zero asymptotically,
the velocity of fall of the particles, as measured in their proper time, increases and approaches
the velocity of light. In fact, according to the definition (88.10):
$$v^2 = -\frac{g_{11}}{g_{00}}\left(\frac{dr}{dt}\right)^2.$$
Taking $g_{11}$ and $g_{00}$ from (97.14) and $dr/dt$ from (100.5), we find that $v^2 \to c^2$.
The approach to the gravitational radius, which according to the clocks of the outside
observer takes an infinite time, occupies only a finite interval of proper time (i.e. time in the
reference system comoving with the body). This is already clear from the analysis given above,
but one can also verify it directly by computing the proper time $\tau$ as the invariant integral
$\frac{1}{c}\int ds$. Carrying out the calculation in the Schwarzschild reference system and taking
$dr/dt$ for the falling particle from (100.5), we get:
$$c\tau = \int\sqrt{\left(1 - \frac{r_g}{r}\right)c^2\,dt^2 - \frac{dr^2}{1 - \frac{r_g}{r}}} = \int\frac{dr}{\sqrt{\frac{r_g}{r} - 1 + \left(\frac{\mathcal{E}}{mc^2}\right)^2}}.$$
This integral converges for $r \to r_g$.
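The contrast between the two integrals can be made concrete with a short numerical sketch in Python. It assumes the special case $\mathcal{E} = mc^2$ (fall from rest at infinity) and units with $r_g = 1$; the function names and the midpoint-rule integrator are illustrative choices, not part of the text. For the coordinate time the substitution $u = \ln(r - r_g)$ is used, which turns the integrand of (100.5) into $r^{3/2}/r_g^{1/2}$.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def coordinate_time(r_start, r_end, r_g=1.0):
    """c*t for radial fall from r_start to r_end > r_g, from (100.5) with E = m c^2.

    With u = ln(r - r_g) the integrand becomes r**1.5 / sqrt(r_g).
    """
    def integrand(u):
        r = r_g + math.exp(u)
        return r ** 1.5 / math.sqrt(r_g)
    return midpoint(integrand, math.log(r_end - r_g), math.log(r_start - r_g))

def proper_time(r_start, r_end, r_g=1.0):
    """c*tau for the same fall: integral of dr / sqrt(r_g / r)."""
    return midpoint(lambda r: math.sqrt(r / r_g), r_end, r_start)

if __name__ == "__main__":
    for delta in (1e-3, 1e-6, 1e-9):
        ct = coordinate_time(2.0, 1.0 + delta)
        ctau = proper_time(2.0, 1.0 + delta)
        print(f"r_end - r_g = {delta:.0e}: ct = {ct:.4f}, c tau = {ctau:.6f}")
```

Each factor of $10^3$ in the closeness to $r_g$ adds about $r_g\ln 10^3$ to $ct$, in line with (100.6), while $c\tau$ settles to the finite value $\frac{2}{3}(2^{3/2} - 1)\,r_g$.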
Having reached the gravitational radius (as measured by proper time), the body will continue
to contract, with all of its particles arriving at the center within a finite time. We do
not, however, observe this process from outside the system; we have seen that no signals
emerge from the Schwarzschild sphere (in the "contracting" reference system).
† The essential properties of this phenomenon were first explained by J. R. Oppenheimer and H. Snyder
(1939).
With respect to an external observer the contraction to the gravitational radius is
accompanied by a "closing up" of the body. The time for propagation of signals sent from
the body tends to infinity: for a light signal $c\,dt = dr/(1 - r_g/r)$, and the integral
$$ct = \int\frac{dr}{1 - \frac{r_g}{r}} = r + r_g\ln(r - r_g) + \text{const}$$
[like the integral (100.5)] diverges for $r \to r_g$. Intervals of proper time on the surface of the
body are shortened, as compared to intervals of time $t$ for the distant observer, in the ratio
$\sqrt{1 - r_g/r}$; consequently, as $r \to r_g$ all processes on the body appear to be "frozen" with
respect to the external observer. Such a "congealing" body interacts with surrounding
bodies only through its static gravitational field.
The question of gravitational collapse of nonspherical bodies has not been clarified much
at present. One can apparently assert that for small deviations from sphericity collapse
occurs (relative to the system of a distant observer) to the same state of "congealing" of
the body, and in the comoving reference system collapse occurs until it reaches the
Schwarzschild sphere; the ensuing fate of the body in the comoving system is, however, not clear.†
In conclusion we make one further remark of a methodological nature. We have seen
that for the central field in vacuum the "system of the outside observer" that is inertial at
infinity is not complete: there is no place in it for the world lines of particles moving inside
the Schwarzschild sphere. The metric (100.3) is still applicable inside the Schwarzschild
sphere, but this system too is not complete in a certain sense. Consider, in this system, a
particle carrying out a radial motion in the direction away from the center. As $\tau \to \infty$ its
world line goes out to infinity, while for $\tau \to -\infty$ it must approach asymptotically to $r = r_g$,
since in this metric, within the Schwarzschild sphere motion can occur only along the direction
to the center. On the other hand, emergence of the particle from $r = r_g$ to any given
point $r > r_g$ occurs within a finite interval of proper time. In terms of proper time the
particle must approach the Schwarzschild sphere from inside before it can begin to move
outside it; but this part of the history of the particle is not kept by the particular reference
system.‡
PROBLEMS
1. For a particle in the field of a spherical body contracting to the gravitational radius, find the
range of distances within which motion in a circular orbit is possible (S. A. Kaplan, 1949).
Solution: The dependence $r = r(t)$ for a particle moving in the Schwarzschild field with an
angular momentum $M$ different from zero is obtained in a way analogous to (100.5), in differential
form,
$$\frac{1}{1 - \frac{r_g}{r}}\frac{dr}{c\,dt} = \frac{mc^2}{\mathcal{E}}\sqrt{\left(\frac{\mathcal{E}}{mc^2}\right)^2 - 1 + \frac{r_g}{r} - \frac{M^2}{m^2c^2r^2} + \frac{M^2 r_g}{m^2c^2r^3}} \tag{1}$$
(where $m$ is the mass of the particle and $r_g$ the gravitational radius of the body). Equating the
expression under the square root in (1) to zero, we get the function $\mathcal{E}(r)$, which here replaces the
potential curve of nonrelativistic theory; in Fig. 21 these curves are shown for different values of
the angular momentum $M$.
t See A. G. Doroshkevich, Ya. B. Zel'dovich and I. D. Novikov, JETP 49, 170, 1965, {Soviet Phys.~
‡ The construction of a reference system that is not incomplete in this way is considered in problem 5
of this section.
Fig. 21. Curves of $\mathcal{E}/mc^2$ for different values of the angular momentum $M$ (marked ordinate value: 0.943).
The radii of the circular orbits and the corresponding energies are determined by the extrema of
the curves, where the minima correspond to stable, and the maxima to unstable orbits. For
$M > \sqrt{3}\,mcr_g$, each curve has one minimum and one maximum. As $M$ increases from $\sqrt{3}\,mcr_g$
to $\infty$ the coordinates of the minimum increase from $3r_g$ to $\infty$ (and the corresponding energies from
$\sqrt{8/9}\,mc^2$ to $mc^2$); the coordinates of the maximum decrease from $3r_g$ to $3r_g/2$ (while the
corresponding energies go from $\sqrt{8/9}\,mc^2$ to $\infty$). For $r < 3r_g/2$ there are no circular orbits.
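These limits can be reproduced with a few lines of Python. The sketch below is an illustration under stated assumptions: it introduces the shorthand $a = M/mc$ (not used in the text), takes the curve $\mathcal{E}^2/m^2c^4 = (1 - r_g/r)(1 + a^2/r^2)$ obtained by setting the radicand of (1) to zero, and uses the fact that its extrema satisfy $r_g r^2 - 2a^2 r + 3a^2 r_g = 0$. The function names are hypothetical.

```python
import math

def circular_orbit_radii(a, r_g=1.0):
    """Return (r_unstable, r_stable) for a = M/(m c), or None if a < sqrt(3) r_g.

    The radii are the roots of r_g r^2 - 2 a^2 r + 3 a^2 r_g = 0.
    """
    disc = a * a * (a * a - 3.0 * r_g * r_g)
    if disc < 0.0:
        return None
    r_stable = (a * a + math.sqrt(disc)) / r_g
    # The product of the two roots is 3 a^2; this avoids cancellation for large a.
    r_unstable = 3.0 * a * a / r_stable
    return r_unstable, r_stable

def orbit_energy(r, a, r_g=1.0):
    """E/(m c^2) on the curve: sqrt((1 - r_g/r)(1 + a^2/r^2))."""
    return math.sqrt((1.0 - r_g / r) * (1.0 + a * a / (r * r)))

if __name__ == "__main__":
    for a in (1.7321, 2.0, 10.0, 1000.0):
        r_u, r_s = circular_orbit_radii(a)
        print(f"a = {a}: unstable r = {r_u:.4f} (E = {orbit_energy(r_u, a):.4f}), "
              f"stable r = {r_s:.4f} (E = {orbit_energy(r_s, a):.4f})")
```

For $a = 2r_g$ the maximum lies at $r = 2r_g$ with $\mathcal{E} = mc^2$, which is the threshold value of the angular momentum used in problem 2(a).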
2. For motion in this same field determine the cross-section for gravitational capture of (a) nonrelativistic,
(b) ultrarelativistic, particles coming from infinity (Ya. B. Zel'dovich and I. D. Novikov,
1964).
Solution: (a) For a nonrelativistic velocity $v_\infty$ (at infinity) the energy of the particle is $\mathcal{E} \approx mc^2$.
From Fig. 21 we see that the line $\mathcal{E} = mc^2$ lies above all the potential curves with angular momenta
$M < 2mcr_g$, i.e. all those with impact parameters $\rho < 2cr_g/v_\infty$. All particles with such values of $\rho$
undergo gravitational capture: they reach the Schwarzschild sphere (asymptotically, as $t \to \infty$)
and do not emerge again to infinity. The capture cross-section is
$$\sigma = 4\pi r_g^2\left(\frac{c}{v_\infty}\right)^2.$$
(b) In equation (1) of problem 1 the transition to the ultrarelativistic particle (or to a light ray)
is achieved by the substitution $m \to 0$. Also introducing the impact parameter $\rho = cM/\mathcal{E}$, we get:
$$\frac{1}{1 - \frac{r_g}{r}}\frac{dr}{c\,dt} = \sqrt{1 - \frac{\rho^2}{r^2} + \frac{\rho^2 r_g}{r^3}}.$$
Setting the expression under the square root equal to zero, we get the closest distance to the center,
$r_{min}$, which the orbit reaches. This quantity attains its smallest value ($r_{min} = 3r_g/2$) for
$\rho = 3\sqrt{3}\,r_g/2$; for smaller values of $\rho$ the particle moves to the Schwarzschild sphere.
Thus we get for the capture cross-section
$$\sigma = \frac{27}{4}\pi r_g^2.$$
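The critical impact parameter of part (b) can be checked by a direct search. This is a rough sketch in Python, assuming units with $r_g = 1$; the function names and the grid search are illustrative choices only. Setting the radicand $1 - \rho^2/r^2 + \rho^2 r_g/r^3$ to zero gives $\rho^2 = r_{min}^3/(r_{min} - r_g)$, and the smallest such $\rho$ separates capture from scattering.

```python
import math

def rho_squared(r_min, r_g=1.0):
    """Impact parameter squared for which the turning point lies at r_min > r_g."""
    return r_min ** 3 / (r_min - r_g)

def critical_impact_parameter(r_g=1.0, n=200000):
    """Grid search for the smallest rho over turning points in (r_g, 5 r_g]."""
    best_r, best_rho2 = None, float("inf")
    for i in range(1, n + 1):
        r = r_g * (1.0 + 4.0 * i / n)
        rho2 = rho_squared(r, r_g)
        if rho2 < best_rho2:
            best_r, best_rho2 = r, rho2
    return best_r, math.sqrt(best_rho2)

if __name__ == "__main__":
    r_min, rho = critical_impact_parameter()
    print(f"r_min = {r_min:.4f} r_g, rho = {rho:.6f} r_g")
    print(f"sigma / (pi r_g^2) = {rho * rho:.6f}")
```

The search returns $r_{min}$ close to $3r_g/2$ and $\rho^2$ close to $27r_g^2/4$, reproducing the cross-section above.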
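3. Find the equations of a centrally-symmetric gravitational field in matter in the comoving
reference system.
Solution: We make use of the two possible transformations of the coordinates $r$, $t$ in the element
of interval (97.1) in order to, first, make the coefficient $a(r, t)$ of $dr\,dt$ vanish, and second, to make
the radial velocity of the matter vanish at each point (because of the central symmetry, other
components of the velocity are not present). After this is done, $r$ and $t$ can still be subjected to an
arbitrary transformation of the form $r = r(r')$ and $t = t(t')$.
We denote the radial coordinate and time selected in this way by $R$ and $\tau$, and the coefficients
$h$, $k$, $l$ by $-e^\lambda$, $-e^\mu$, $e^\nu$ respectively (where $\lambda$, $\mu$ and $\nu$ are functions of $R$ and $\tau$). We then have for
the line element:
$$ds^2 = c^2 e^\nu\,d\tau^2 - e^\lambda\,dR^2 - e^\mu\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{1}$$
In the comoving reference system the components of the energy-momentum tensor are:
$$T_0^0 = \varepsilon, \qquad T_1^1 = T_2^2 = T_3^3 = -p.$$
A quite lengthy calculation leads to the following field equations:
$$-\frac{8\pi k}{c^4}T_1^1 = \frac{8\pi k}{c^4}p = \frac{1}{2}e^{-\lambda}\left(\frac{\mu'^2}{2} + \mu'\nu'\right) - e^{-\nu}\left(\ddot\mu - \frac{1}{2}\dot\mu\dot\nu + \frac{3}{4}\dot\mu^2\right) - e^{-\mu}, \tag{2}$$
$$-\frac{8\pi k}{c^4}T_2^2 = \frac{8\pi k}{c^4}p = \frac{1}{4}e^{-\lambda}\left(2\nu'' + \nu'^2 + 2\mu'' + \mu'^2 - \mu'\lambda' - \nu'\lambda' + \mu'\nu'\right) + \frac{1}{4}e^{-\nu}\left(\dot\lambda\dot\nu + \dot\mu\dot\nu - \dot\lambda\dot\mu - 2\ddot\lambda - \dot\lambda^2 - 2\ddot\mu - \dot\mu^2\right), \tag{3}$$
$$\frac{8\pi k}{c^4}T_0^0 = \frac{8\pi k}{c^4}\varepsilon = -e^{-\lambda}\left(\mu'' + \frac{3}{4}\mu'^2 - \frac{1}{2}\mu'\lambda'\right) + \frac{1}{2}e^{-\nu}\left(\dot\lambda\dot\mu + \frac{\dot\mu^2}{2}\right) + e^{-\mu}, \tag{4}$$
$$\frac{8\pi k}{c^4}T_0^1 = 0 = \frac{1}{2}e^{-\lambda}\left(2\dot\mu' + \dot\mu\mu' - \dot\lambda\mu' - \nu'\dot\mu\right) \tag{5}$$
(where the prime denotes differentiation with respect to $R$, and the dot with respect to $c\tau$).
General relations for the $\lambda$, $\mu$, $\nu$ can be easily found if we start from the equations $T^k_{i;k} = 0$ which
are contained in the field equations. Using formula (86.11) we get the following two equations:
$$\dot\lambda + 2\dot\mu = -\frac{2\dot\varepsilon}{p + \varepsilon}, \qquad \nu' = -\frac{2p'}{p + \varepsilon}. \tag{6}$$
If $p$ is known as a function of $\varepsilon$, equation (6) is integrated in the form:
$$\lambda + 2\mu = -2\int\frac{d\varepsilon}{p + \varepsilon} + f_1(R), \qquad \nu = -2\int\frac{dp}{p + \varepsilon} + f_2(\tau), \tag{7}$$
where the functions $f_1(R)$ and $f_2(\tau)$ can be chosen arbitrarily in view of the possibility of making
arbitrary transformations of the type $R = R(R')$ and $\tau = \tau(\tau')$.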
4. Find the general solution of the equations for a centrally-symmetric gravitational field in the
comoving reference system for the case of dustlike matter, i.e. for $p = 0$ (R. Tolman, 1934).†
† In problems 4 and 5, we set $c = 1$.
Solution: From equations (6) we see that if $p = 0$ we can set $\nu = 0$, which gives a unique choice
of the time $\tau$ (in other words the reference system can be chosen to be comoving and at the same time
synchronous, in accordance with the general statement on p. 293). In place of $\mu(R, \tau)$ we introduce
the function
$$r(R, \tau) = e^{\mu/2},$$
representing the "radius", defined so that $2\pi r$ is the circumference (of a circle with center at the
origin); then the line element is
$$ds^2 = d\tau^2 - e^\lambda\,dR^2 - r^2(R, \tau)\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right).$$
Equation (5) takes the form $\dot\lambda r' = 2\dot r'$, and is immediately integrated over the time, giving
$$e^\lambda = \frac{r'^2}{1 + f}, \tag{8}$$
where $f(R)$ is an arbitrary function subject only to the condition $1 + f > 0$. Substituting this
expression in equation (2) (substitution in (3) gives nothing new), we get:
$$2r\ddot r + \dot r^2 - f = 0.$$
The first integral of this equation is
$$\dot r^2 = f + \frac{F}{r}, \tag{9}$$
where $F(R)$ is another arbitrary function. Integrating once more, we get:
$$\tau_0(R) - \tau = \frac{1}{f}\sqrt{fr^2 + Fr} - \frac{F}{f^{3/2}}\sinh^{-1}\sqrt{\frac{fr}{F}} \quad \text{for } f > 0,$$
$$\tau_0(R) - \tau = \frac{1}{f}\sqrt{fr^2 + Fr} + \frac{F}{(-f)^{3/2}}\sin^{-1}\sqrt{\frac{-fr}{F}} \quad \text{for } f < 0, \tag{10}$$
$$r = \left(\frac{3}{2}\right)^{2/3}F^{1/3}(\tau_0 - \tau)^{2/3} \quad \text{for } f = 0.$$
In the first two cases the dependence $r(R, \tau)$ can also be given in parametric form:
$$r = \frac{F}{2f}(\cosh\eta - 1), \qquad \tau_0 - \tau = \frac{F}{2f^{3/2}}(\sinh\eta - \eta) \quad \text{for } f > 0,$$
$$r = \frac{F}{-2f}(1 - \cos\eta), \qquad \tau_0 - \tau = \frac{F}{2(-f)^{3/2}}(\eta - \sin\eta) \quad \text{for } f < 0 \tag{10a}$$
(where $\eta$ is the parameter). Substituting (8) in (4) and eliminating $f$ by using (9), we find for $\varepsilon$:
$$8\pi k\varepsilon = \frac{F'}{r' r^2}. \tag{11}$$
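As a consistency check, one can verify numerically that the parametric form (10a) satisfies the first integral (9). The Python sketch below is illustrative only: the function names are hypothetical, $c = 1$ as in this problem, and $\dot r$ is estimated by a central finite difference in the parameter $\eta$ at fixed $R$ (that is, at fixed $f$ and $F$).

```python
import math

def tolman_point(eta, f, F):
    """Return (r, tau0 - tau) from the parametric form (10a) for f != 0."""
    if f > 0:
        r = F / (2.0 * f) * (math.cosh(eta) - 1.0)
        s = F / (2.0 * f ** 1.5) * (math.sinh(eta) - eta)
    elif f < 0:
        r = F / (-2.0 * f) * (1.0 - math.cos(eta))
        s = F / (2.0 * (-f) ** 1.5) * (eta - math.sin(eta))
    else:
        raise ValueError("the parametric form applies only for f != 0")
    return r, s

def rdot_squared(eta, f, F, h=1e-5):
    """(dr/dtau)^2 from central differences in eta; the sign of dtau drops out."""
    r1, s1 = tolman_point(eta - h, f, F)
    r2, s2 = tolman_point(eta + h, f, F)
    return ((r2 - r1) / (s2 - s1)) ** 2

if __name__ == "__main__":
    for f, F, eta in ((0.5, 1.0, 1.2), (-0.3, 2.0, 2.0)):
        r, _ = tolman_point(eta, f, F)
        print(f"f = {f}: rdot^2 = {rdot_squared(eta, f, F):.8f}, "
              f"f + F/r = {f + F / r:.8f}")
```

The two printed columns agree to within the finite-difference error for both signs of $f$.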
Formulas (8)-(11) give the required general solution. We note that it depends essentially not on
three, but only on two arbitrary functions determining the relation between $f$, $F$ and $\tau_0$, since the
coordinate $R$ can still be subjected to an arbitrary transformation $R = R(R')$. This number
corresponds exactly to the maximum possible number of "physically different" arbitrary functions for
this case (see p. 276): the initial centrally-symmetric distribution of the matter is fixed by two
quantities (the distributions of density and of radial velocity), while a free gravitational field with
central symmetry does not exist.
The overall sign in equations (10) is chosen so that the contraction of the sphere corresponds to
$\tau - \tau_0 \to -0$. A complete solution for the collapse of the sphere requires specific inclusion of the
initial conditions and "matching" on the boundary of the sphere with the Schwarzschild solution
for empty space. But the limiting character of the metric inside the sphere follows immediately
from the formulas given here.
For $\tau \to \tau_0(R)$ the function $r(R, \tau)$ goes to zero according to the law
$$r^2 \approx \left(\frac{3}{2}\right)^{4/3}F^{2/3}(\tau_0 - \tau)^{4/3},$$
while the function $e^\lambda$ goes to infinity like
$$e^\lambda \approx \left(\frac{2}{3}\right)^{2/3}\frac{\tau_0'^2 F^{2/3}}{(1 + f)(\tau_0 - \tau)^{2/3}}.$$
This means that all radial distances (in the comoving reference system) go to infinity, and all
azimuthal distances go to zero, while all volumes also go to zero (like $\tau - \tau_0$).† Correspondingly, the
matter density increases without limit:
$$8\pi k\varepsilon \approx \frac{2F'}{3F\tau_0'(\tau_0 - \tau)}.$$
Thus, in accordance with the remarks in the text, there is a collapse of the whole distribution in to
the center.‡
† The geometry on "planes" passing through the center is thus like that on a cone-shaped surface which is
stretching in the course of time along its generators and at the same time contracting along the circles drawn
on it.
‡ It is understood that for $\varepsilon \to \infty$ the assumption of dustlike matter is not permissible from the physical
point of view; one should use the ultrarelativistic equation of state $p = \varepsilon/3$. It appears that the general
character of the contraction is to a large extent independent of the equation of state; see E. M. Lifshitz and I. M.
Khalatnikov, JETP 39, 149, 1960 (Soviet Phys. JETP, 12, 108, 1961).
In the special case where the function $\tau_0(R)$ = const (i.e. all the particles of the sphere reach the
center simultaneously) the metric inside the contracting sphere has a different character. In this case
$$r^2 \approx \left(\frac{3}{2}\right)^{4/3}F^{2/3}(\tau_0 - \tau)^{4/3}, \qquad e^\lambda \approx \left(\frac{2}{3}\right)^{2/3}\frac{F'^2}{4F^{4/3}(f + 1)}(\tau_0 - \tau)^{4/3}, \qquad 8\pi k\varepsilon \approx \frac{4}{3(\tau_0 - \tau)^2},$$
i.e. as $\tau \to \tau_0$ all distances, both tangential and radial, tend to zero according to the same law
[$\sim(\tau_0 - \tau)^{2/3}$]; the matter density goes to infinity like $(\tau_0 - \tau)^{-2}$, and in the limit its distribution
tends toward uniformity.
The case $\tau_0$ = const includes, in particular, the collapse of a completely homogeneous sphere.
Assuming (for example, for $f > 0$) $F/2f^{3/2} = a_0$, $f = \sinh^2 R$ (where $a_0$ is a constant), we get the
metric
$$ds^2 = d\tau^2 - a^2(\tau)\left[dR^2 + \sinh^2 R\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right)\right],$$
where the dependence $a(\tau)$ is given by the parametric equations
$$a = a_0(\cosh\eta - 1), \qquad \tau_0 - \tau = a_0(\sinh\eta - \eta).$$
The density is
$$8\pi k\varepsilon = \frac{6a_0}{a^3}.$$
This solution coincides with the metric of a universe that is completely filled with homogeneous
matter (§ 109). This is an entirely natural result, since a sphere cut out of a uniform distribution
of matter has central symmetry.
5. By a suitable choice of the functions $F$, $f$, $\tau_0$ in the Tolman solution (problem 4), construct
the most complete reference system for the field of a point mass.†
Solution: When $F$ = const $\neq 0$, we have from (11), $\varepsilon = 0$, so that the solution applies to empty
space, i.e. it describes the field produced by a point mass (which is located at the center, where
there is a singularity of the metric). So setting $F = r_g$, $f = 0$, $\tau_0(R) = R$, we get the metric (100.3).‡
To achieve our purpose, we must start from a solution that contains both "expanding" and
"contracting" space-time regions. These are the Tolman solutions with $f < 0$; from (10a) we see that
as the parameter $\eta$ changes monotonically (from 0 to $2\pi$) for given $R$, the time $\tau$ changes
monotonically, while $r$ goes through a maximum. We set
$$F = r_g, \qquad f = -\frac{1}{(R/r_g)^2 + 1}, \qquad \tau_0 = \frac{\pi}{2}r_g\left(\frac{R^2}{r_g^2} + 1\right)^{3/2}.$$
Then we have:
$$\frac{r}{r_g} = \frac{1}{2}\left(\frac{R^2}{r_g^2} + 1\right)(1 - \cos\eta), \qquad \frac{\tau}{r_g} = \frac{1}{2}\left(\frac{R^2}{r_g^2} + 1\right)^{3/2}(\pi - \eta + \sin\eta)$$
(where the parameter $\eta$ runs through values 0 to $2\pi$).
In Fig. 22, the curves $ACB$ and $A'C'B'$ correspond to $r = 0$ (parameter values $\eta = 2\pi$ and $\eta = 0$).
The curves $AOA'$ and $BOB'$ correspond to the Schwarzschild sphere $r = r_g$. Between $A'C'B'$ and
$A'OB'$ is a region of space-time in which only motion out from the center is possible, and between
$ACB$ and $AOB$ a region in which motion occurs only toward the center.
The world line of a particle that is at rest relative to this reference system is a vertical line
($R$ = const). It starts from $r = 0$ (point $a$), cuts the Schwarzschild sphere at point $b$, and at time
$\tau = 0$ reaches its farthest distance [$r = r_g(R^2/r_g^2 + 1)$], after which the particle again begins to fall in
toward the Schwarzschild sphere, passing through it at point $c$ and arriving once more at $r = 0$
(point $d$) at the time $\tau = (\pi/2)\,r_g\,(R^2/r_g^2 + 1)^{3/2}$.
This reference system is complete: both ends of the world line of any particle moving in the field
lie either on the true singularity $r = 0$ or go out to infinity. The metric (100.3) covers only the region
to the right of $AOA'$ (or to the left of $BOB'$), and the "expanding" reference system covers the
region to the right of $BOB'$ (or to the left of $AOA'$). The system with metric (97.14) covers only the
region to the right of $BOA'$ (or to the left of $AOB'$).
Fig. 22.
† Such a system was first found by M. Kruskal, Phys. Rev. 119, 1743 (1960). The form of the solution
given here (in which the reference system is synchronous) is due to I. D. Novikov (1963).
‡ The case $F = 0$ corresponds to the absence of a field; by a suitable transformation of the variables,
the metric can be brought to galilean form.
§ 101. The energy-momentum pseudotensor
In the absence of a gravitational field, the law of conservation of energy and momentum
of the material (and electromagnetic field) is expressed by the equation $\partial T^{ik}/\partial x^k = 0$.
The generalization of this equation to the case where a gravitational field is present is
equation (94.7):
$$T^k_{i;k} = \frac{1}{\sqrt{-g}}\frac{\partial(T^k_i\sqrt{-g})}{\partial x^k} - \frac{1}{2}\frac{\partial g_{kl}}{\partial x^i}T^{kl} = 0. \tag{101.1}$$
In this form, however, this equation does not generally express any conservation law
whatever.† This is related to the fact that in a gravitational field the four-momentum of the
matter alone must not be conserved, but rather the four-momentum of matter plus gravitational
field; the latter is not included in the expression for $T^k_i$.
To determine the conserved total four-momentum for a gravitational field plus the matter
located in it, we proceed as follows.‡ We choose a system of coordinates of such form that
at some particular point in space-time all the first derivatives of the $g_{ik}$ vanish (the $g_{ik}$ need
not, for this, necessarily have their galilean values). Then at this point the second term in
equation (101.1) vanishes, and in the first term we can take $\sqrt{-g}$ out from under the
derivative sign, so that there remains
$$\frac{\partial}{\partial x^k}T^k_i = 0,$$
or, in contravariant components,
$$\frac{\partial}{\partial x^k}T^{ik} = 0.$$
Quantities $T^{ik}$, identically satisfying this equation, can be written in the form
$$T^{ik} = \frac{\partial}{\partial x^l}\eta^{ikl},$$
where the $\eta^{ikl}$ are quantities antisymmetric in the indices $k$, $l$:
$$\eta^{ikl} = -\eta^{ilk}.$$
† Because the integral $\int T^k_i\sqrt{-g}\,dS_k$ is conserved only if the condition
$$\frac{\partial(\sqrt{-g}\,T^k_i)}{\partial x^k} = 0$$
is fulfilled, and not (101.1). This is easily verified by carrying out in curvilinear coordinates all those
calculations which in § 29 were done in galilean coordinates. Besides, it is sufficient simply to note that these
calculations have a purely formal character not connected with the tensor properties of the corresponding
quantities, like the proof of Gauss' theorem, which has the same form (83.17) in curvilinear as in cartesian
coordinates.
‡ One might get the notion to apply to the gravitational field the formula (94.4), substituting
$\Lambda = -(c^4/16\pi k)G$. We emphasize, however, that this formula applies only to physical systems described by
quantities $q$ different from the $g_{ik}$; therefore it cannot be applied to the gravitational field which is determined
by the quantities $g_{ik}$ themselves. Note, by the way, that upon substituting $G$ in place of $\Lambda$ in (94.4) we would
obtain simply zero, as is immediately clear from the relation (95.3) and the equations of the field in vacuum.
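Actually it is not difficult to bring $T^{ik}$ to this form. To do this we start from the field
equation
$$T^{ik} = \frac{c^4}{8\pi k}\left(R^{ik} - \frac{1}{2}g^{ik}R\right),$$
and for $R^{ik}$ we have, according to (92.4),
$$R^{ik} = \frac{1}{2}g^{im}g^{kp}g^{ln}\left\{\frac{\partial^2 g_{lp}}{\partial x^m\partial x^n} + \frac{\partial^2 g_{mn}}{\partial x^l\partial x^p} - \frac{\partial^2 g_{ln}}{\partial x^m\partial x^p} - \frac{\partial^2 g_{mp}}{\partial x^l\partial x^n}\right\}$$
(we recall that at the point under consideration, all the $\Gamma^i_{kl} = 0$). After simple transformations
the tensor $T^{ik}$ can be put in the form
$$T^{ik} = \frac{\partial}{\partial x^l}\left\{\frac{c^4}{16\pi k}\frac{1}{(-g)}\frac{\partial}{\partial x^m}\left[(-g)\left(g^{ik}g^{lm} - g^{il}g^{km}\right)\right]\right\}.$$
The expression in the curly brackets is antisymmetric in $k$ and $l$, and is the quantity
which we designated above as $\eta^{ikl}$. Since the first derivatives of $g_{ik}$ are zero at the point
under consideration, the factor $1/(-g)$ can be taken out from under the sign of differentiation
$\partial/\partial x^l$. We introduce the notation
$$h^{ikl} = \frac{c^4}{16\pi k}\frac{\partial}{\partial x^m}\left[(-g)\left(g^{ik}g^{lm} - g^{il}g^{km}\right)\right]. \tag{101.2}$$
These quantities are antisymmetric in $k$ and $l$:
$$h^{ikl} = -h^{ilk}. \tag{101.3}$$
Then we can write
$$\frac{\partial h^{ikl}}{\partial x^l} = (-g)T^{ik}.$$
This relation, derived under the assumption $\partial g_{ik}/\partial x^l = 0$, is no longer valid when we go
to an arbitrary system of coordinates. In the general case, the difference $\partial h^{ikl}/\partial x^l - (-g)T^{ik}$
is different from zero; we denote it by $(-g)t^{ik}$. Then we have, by definition,
$$(-g)(T^{ik} + t^{ik}) = \frac{\partial h^{ikl}}{\partial x^l}. \tag{101.4}$$
The quantities $t^{ik}$ are symmetric in $i$ and $k$:
$$t^{ik} = t^{ki}. \tag{101.5}$$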
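This is clear immediately from their definition, since like the tensor $T^{ik}$, the derivatives
$\partial h^{ikl}/\partial x^l$ are symmetric quantities.† Expressing $T^{ik}$ in terms of $R^{ik}$, according to the Einstein
equations, and using expression (101.2) for the $h^{ikl}$, one can obtain, after a rather lengthy
calculation, the following expression for $t^{ik}$:
$$t^{ik} = \frac{c^4}{16\pi k}\Big\{\left(2\Gamma^n_{lm}\Gamma^p_{np} - \Gamma^n_{lp}\Gamma^p_{mn} - \Gamma^n_{ln}\Gamma^p_{mp}\right)\left(g^{il}g^{km} - g^{ik}g^{lm}\right)$$
$$+\, g^{il}g^{mn}\left(\Gamma^k_{lp}\Gamma^p_{mn} + \Gamma^k_{mn}\Gamma^p_{lp} - \Gamma^k_{np}\Gamma^p_{lm} - \Gamma^k_{lm}\Gamma^p_{np}\right)$$
$$+\, g^{kl}g^{mn}\left(\Gamma^i_{lp}\Gamma^p_{mn} + \Gamma^i_{mn}\Gamma^p_{lp} - \Gamma^i_{np}\Gamma^p_{lm} - \Gamma^i_{lm}\Gamma^p_{np}\right)$$
$$+\, g^{lm}g^{np}\left(\Gamma^i_{ln}\Gamma^k_{mp} - \Gamma^i_{lm}\Gamma^k_{np}\right)\Big\}, \tag{101.6}$$
or, in terms of derivatives of the components of the metric tensor,
$$(-g)t^{ik} = \frac{c^4}{16\pi k}\Big\{\mathfrak{g}^{ik}{}_{,l}\,\mathfrak{g}^{lm}{}_{,m} - \mathfrak{g}^{il}{}_{,l}\,\mathfrak{g}^{km}{}_{,m} + \tfrac{1}{2}g^{ik}g_{lm}\,\mathfrak{g}^{ln}{}_{,p}\,\mathfrak{g}^{pm}{}_{,n}$$
$$-\left(g^{il}g_{mn}\,\mathfrak{g}^{kn}{}_{,p}\,\mathfrak{g}^{mp}{}_{,l} + g^{kl}g_{mn}\,\mathfrak{g}^{in}{}_{,p}\,\mathfrak{g}^{mp}{}_{,l}\right) + g_{lm}g^{np}\,\mathfrak{g}^{il}{}_{,n}\,\mathfrak{g}^{km}{}_{,p}$$
$$+\,\tfrac{1}{8}\left(2g^{il}g^{km} - g^{ik}g^{lm}\right)\left(2g_{np}g_{qr} - g_{pq}g_{nr}\right)\mathfrak{g}^{nr}{}_{,l}\,\mathfrak{g}^{pq}{}_{,m}\Big\}, \tag{101.7}$$
where $\mathfrak{g}^{ik} = \sqrt{-g}\,g^{ik}$, while the index $,i$ denotes a simple differentiation with respect to $x^i$.
An essential property of the $t^{ik}$ is that they do not constitute a tensor; this is clear from
the fact that in $\partial h^{ikl}/\partial x^l$ there appears the ordinary, and not the covariant derivative.
However, $t^{ik}$ is expressed in terms of the quantities $\Gamma^i_{kl}$, and the latter behave like a tensor
with respect to linear transformations of the coordinates (see § 85), so the same applies
to the $t^{ik}$.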
From the definition (101.4) it follows that for the sum $T^{ik} + t^{ik}$ the equation
$$\frac{\partial}{\partial x^k}(-g)(T^{ik} + t^{ik}) = 0 \tag{101.8}$$
is identically satisfied. This means that there is a conservation law for the quantities
$$P^i = \frac{1}{c}\int(-g)(T^{ik} + t^{ik})\,dS_k. \tag{101.9}$$
In the absence of a gravitational field, in galilean coordinates, $t^{ik} = 0$, and the integral
we have written goes over into $(1/c)\int T^{ik}\,dS_k$, that is, into the four-momentum of the
material. Therefore the quantity (101.9) must be identified with the total four-momentum
of matter plus gravitational field. The aggregate of quantities $t^{ik}$ is called the energy-momentum
pseudotensor of the gravitational field.
The integration in (101.9) can be taken over any infinite hypersurface, including all of
the three-dimensional space. If we choose for this the hypersurface $x^0$ = const, then $P^i$
can be written in the form of a three-dimensional space integral:
$$P^i = \frac{1}{c}\int(-g)(T^{i0} + t^{i0})\,dV. \tag{101.10}$$
† For just this aim we took $(-g)$ out from under the derivative sign in the expression for $T^{ik}$. If this
had not been done, $\partial h^{ikl}/\partial x^l$ and therefore also $t^{ik}$ would turn out not to be symmetric in $i$ and $k$.
This fact, that the total four-momentum of matter plus field is expressible as an integral
of the quantity $(-g)(T^{ik} + t^{ik})$ which is symmetric in the indices $i$, $k$, is very important. It
means that there is a conservation law for the angular momentum, defined as (see § 32)†
$$M^{ik} = \int(x^i\,dP^k - x^k\,dP^i) = \frac{1}{c}\int\left\{x^i(T^{kl} + t^{kl}) - x^k(T^{il} + t^{il})\right\}(-g)\,dS_l. \tag{101.11}$$
Thus, also in the general theory of relativity, for a closed system of gravitating bodies
the total angular momentum is conserved, and, moreover, one can again define a center of
inertia which carries out a uniform motion. This latter point is related to the conservation
of the components $M^{0\alpha}$ (see § 14) which is expressed by the equation
$$x^0\int(T^{\alpha 0} + t^{\alpha 0})(-g)\,dV - \int x^\alpha(T^{00} + t^{00})(-g)\,dV = \text{const},$$
so that the coordinates of the center of inertia are given by the formula
$$X^\alpha = \frac{\int x^\alpha(T^{00} + t^{00})(-g)\,dV}{\int(T^{00} + t^{00})(-g)\,dV}. \tag{101.12}$$
By choosing a coordinate system which is inertial in a given volume element, we can
make all the $t^{ik}$ vanish at any point in space-time (since then all the $\Gamma^i_{kl}$ vanish). On the other
hand, we can get values of the $t^{ik}$ different from zero in flat space, i.e. in the absence of a
gravitational field, if we simply use curvilinear coordinates instead of cartesian. Thus, in
any case, it has no meaning to speak of a definite localization of the energy of the gravitational
field in space. If the tensor $T_{ik}$ is zero at some world point, then this is the case for
any reference system, so that we may say that at this point there is no matter or electromagnetic
field. On the other hand, from the vanishing of a pseudotensor at some point in
one reference system it does not at all follow that this is so for another reference system, so
that it is meaningless to talk of whether or not there is gravitational energy at a given place.
This corresponds completely to the fact that by a suitable choice of coordinates, we can
"annihilate" the gravitational field in a given volume element, in which case, from what
has been said, the pseudotensor $t^{ik}$ also vanishes in this volume element.
The quantities $P^i$ (the four-momentum of field plus matter) have a completely definite
meaning and are independent of the choice of reference system to just the extent that is
necessary on the basis of physical considerations.
Let us draw around the masses under consideration a region of space sufficiently large
so that outside of it we may say that there is no gravitational field. In the course of time, this
region cuts out a "channel" in four-dimensional space-time. Outside of this channel there
is no field, so that four-space is flat. Because of this we must, when calculating the energy
and momentum of the field, choose a four-dimensional reference system such that outside
the channel it goes over into a galilean system and all the $t^{ik}$ vanish.
By this requirement the reference system is, of course, not at all uniquely determined;
it can still be chosen arbitrarily in the interior of the channel. However the $P^i$, in full accord
with their physical meaning, turn out to be completely independent of the choice of coordinate
system in the interior of the channel. Consider two coordinate systems, different in
the interior of the channel, but reducing outside of it to one and the same galilean system,
and compare the values of the four-momentum $P^i$ and $P'^i$ in these two systems at definite
moments of "time" $x^0$ and $x'^0$. Let us introduce a third coordinate system, coinciding in
the interior of the channel at the moment $x^0$ with the first system, and at the moment $x'^0$
with the second, while outside of the channel it is galilean. But by virtue of the law of
conservation of energy and momentum the quantities $P^i$ are constant ($dP^i/dx^0 = 0$). This is
the case for the third coordinate system as well as for the first two, and from this it follows
that $P^i = P'^i$.
† It is necessary to note that the expression obtained by us for the four-momentum of matter plus field is
by no means the only possible one. On the contrary, one can, in an infinity of ways (see, for example, the
problem in this section), form expressions which in the absence of a field reduce to $T^{ik}$, and which upon
integration over $dS_k$ give conservation of some quantity. However, the choice made by us is the only one for
which the energy-momentum pseudotensor of the field contains only first (and not higher) derivatives of $g_{ik}$
(a condition which is completely natural from the physical point of view), and is also symmetric, so that it is
possible to formulate a conservation law for the angular momentum.
Earlier it was mentioned that the quantities $t^{ik}$ behave like a tensor with respect to linear
transformations of the coordinates. Therefore the quantities $P^i$ form a four-vector with
respect to such transformations, in particular with respect to Lorentz transformations
which, at infinity, take one galilean reference frame into another.† The four-momentum $P^i$
can also be expressed as an integral over a distant three-dimensional surface surrounding
"all space". Substituting (101.4) in (101.9), we find
$$P^i = \frac{1}{c}\int \frac{\partial h^{ikl}}{\partial x^l}\,dS_k.$$
This integral can be transformed into an integral over an ordinary surface by means of
(6.17):
$$P^i = \frac{1}{2c}\oint h^{ikl}\,df^*_{kl}. \tag{101.13}$$
If for the surface of integration in (101.9) we choose the hypersurface $x^0 = \text{const}$, then in
(101.13) the surface of integration turns out to be a surface in ordinary space:‡
$$P^i = \frac{1}{c}\oint h^{i0\alpha}\,df_\alpha. \tag{101.14}$$
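† Strictly speaking, in the definition (101.9) $P^i$ is a four-vector only with respect to linear transformations
with determinant equal to unity; among these are the Lorentz transformations, which alone are of physical
interest. If we also admit transformations with determinant not equal to unity, then we must introduce into
the definition of $P^i$ the value of $g$ at infinity by writing $\sqrt{-g_\infty}\,P^i$ in place of $P^i$ on the left side of (101.9).
‡ The quantity $df^*_{ik}$ is the "normal" to the surface element, related to the "tangential" element $df^{ik}$ by
(6.11): $df^*_{ik} = \tfrac12 e_{iklm}\,df^{lm}$. On the surface bounding the hypersurface which is perpendicular to the $x^0$ axis,
the only nonzero components of $df^{lm}$ are those with $l, m = 1, 2, 3$, and so $df^*_{ik}$ has only those components
in which one of $i$ and $k$ is 0. The components $df^*_{0\alpha}$ are just the components of the three-dimensional element
of ordinary surface, which we denote by $df_\alpha$.
To derive the analogous formula for the angular momentum, we write formula (101.2)
in the form
$$h^{ikl} = \frac{\partial}{\partial x^m}\lambda^{iklm}; \tag{101.15}$$
the expression for the quantities $\lambda^{iklm}$ in terms of the components of the metric tensor is
obvious from (101.2). Substituting (101.4) in (101.11) and integrating by parts, we obtain:
$$M^{ik} = \frac{1}{c}\int\left(x^i\frac{\partial^2\lambda^{klmn}}{\partial x^m\partial x^n} - x^k\frac{\partial^2\lambda^{ilmn}}{\partial x^m\partial x^n}\right)dS_l$$
$$= \frac{1}{2c}\oint\left(x^i\frac{\partial\lambda^{klmn}}{\partial x^n} - x^k\frac{\partial\lambda^{ilmn}}{\partial x^n}\right)df^*_{lm} - \frac{1}{c}\int\left(\delta^i_m\frac{\partial\lambda^{klmn}}{\partial x^n} - \delta^k_m\frac{\partial\lambda^{ilmn}}{\partial x^n}\right)dS_l$$
$$= \frac{1}{2c}\oint\left(x^i h^{klm} - x^k h^{ilm}\right)df^*_{lm} - \frac{1}{c}\int\frac{\partial}{\partial x^n}\left(\lambda^{klin} - \lambda^{ilkn}\right)dS_l.$$
From the definition of the quantities $\lambda^{iklm}$ it is easy to see that
$$\lambda^{ilkn} - \lambda^{klin} = \lambda^{ilnk}.$$
Thus the remaining integral over $dS_l$ is equal to
$$\frac{1}{c}\int\frac{\partial\lambda^{ilnk}}{\partial x^n}\,dS_l = \frac{1}{2c}\oint\lambda^{ilnk}\,df^*_{ln}.$$
Finally, again choosing a purely spatial surface for the integration, we obtain:
$$M^{ik} = \frac{1}{c}\oint\left(x^i h^{k0\alpha} - x^k h^{i0\alpha} + \lambda^{i0\alpha k}\right)df_\alpha. \tag{101.16}$$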
We remind the reader that in applying formulas (101.14) and (101.16), in accordance
with what was said above, the system of space coordinates should be chosen so that at
infinity the $g_{ik}$ tend toward their constant galilean values. Thus for the calculation according
to formula (101.14) of the four-momentum of an isolated system of bodies which always
remain close to the origin of coordinates, we can use for the metric at large distances from
the bodies the expression (101.14), transforming it from spherical spatial coordinates to
cartesian (for which we must replace $dr$ by $n_\alpha\,dx^\alpha$, where $\mathbf{n}$ is a unit vector along the direction
of $\mathbf{r}$); the corresponding metric tensor is
$$g_{00} = 1 - \frac{2km}{c^2 r}, \qquad g_{\alpha\beta} = -\delta_{\alpha\beta} - \frac{2km}{c^2}\frac{n_\alpha n_\beta}{r}, \qquad g_{0\alpha} = 0, \tag{101.17}$$
where $m$ is the total mass of the system. Computing the required components of $h^{ikl}$ using
formula (101.2), we obtain to the required accuracy (we keep terms $\sim 1/r^2$):
$$h^{\alpha 0\beta} = 0,$$
$$h^{00\alpha} = \frac{c^4}{16\pi k}\frac{\partial}{\partial x^\beta}\left(g^{00}g^{\alpha\beta}\right) = \frac{mc^2}{8\pi}\frac{\partial}{\partial x^\beta}\left(-\frac{\delta^{\alpha\beta}}{r} + \frac{x^\alpha x^\beta}{r^3}\right) = \frac{mc^2}{4\pi}\frac{n^\alpha}{r^2}.$$
Now integrating (101.14) over a sphere of radius $r$, we obtain finally:
$$P^\alpha = 0, \qquad P^0 = mc, \tag{101.18}$$
a result which was to be expected. It is an expression of the equality of "gravitational" and
"inertial" mass. (Gravitational mass means the mass which determines the gravitational
field produced by a body; this is the mass which enters in the expression for the interval in a
gravitational field, or, in particular, in Newton's law. The inertial mass determines the
relation between momentum and energy of a body and, in particular, the rest energy of a
body is equal to this mass multiplied by $c^2$.)
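The result (101.18) can be checked numerically. The following Python sketch is an illustration, not part of the original derivation. It assumes the first-order form $h^{00\alpha} = (mc^2/8\pi)\,\partial(-\delta^{\alpha\beta}/r + x^\alpha x^\beta/r^3)/\partial x^\beta$ that follows from the metric (101.17), differentiates it by central differences, and evaluates the surface integral (101.14) over a sphere by a midpoint rule. The function names, the default units $m = c = 1$, the step size and the grid resolution are all arbitrary choices made for the example.

```python
import math


def bracket(x, a, b):
    """Quantity -delta_ab / r + x_a x_b / r^3 whose divergence gives h^{00a}."""
    r = math.sqrt(sum(v * v for v in x))
    delta = 1.0 if a == b else 0.0
    return -delta / r + x[a] * x[b] / r**3


def h00(x, a, m=1.0, c=1.0, eps=1e-5):
    """h^{00a} = (m c^2 / 8 pi) * sum_b d(bracket_ab)/dx^b, by central differences."""
    total = 0.0
    for b in range(3):
        xp = list(x)
        xm = list(x)
        xp[b] += eps
        xm[b] -= eps
        total += (bracket(xp, a, b) - bracket(xm, a, b)) / (2.0 * eps)
    return m * c**2 / (8.0 * math.pi) * total


def four_momentum_p0(radius, m=1.0, c=1.0, n_theta=40, n_phi=80):
    """P^0 = (1/c) * integral of h^{00a} df_a over a sphere (midpoint rule)."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            x = [radius * v for v in n]
            flux = sum(h00(x, a, m, c) * n[a] for a in range(3))
            total += flux * radius**2 * math.sin(theta) * d_theta * d_phi
    return total / c


if __name__ == "__main__":
    # With m = c = 1 the result should be close to m*c = 1 for any radius.
    print(four_momentum_p0(10.0), four_momentum_p0(50.0))
```

Because the integrand falls off as $1/r^2$, the computed value does not depend on the radius of the sphere, up to quadrature error.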
In the case of a constant gravitational field it turns out to be possible to derive a simple
expression for the total energy of matter plus field in the form of an integral extended only
over the space occupied by the matter. To obtain this expression one can, for example,
start from the following identity, valid (as one easily verifies) when all quantities are
independent of $x^0$:†
$$R_0^0 = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{-g}\,g^{i0}\Gamma^\alpha_{0i}\right).$$
Integrating $R_0^0\sqrt{-g}$ over (three-dimensional) space and applying Gauss' theorem, we
obtain
$$\int R_0^0\sqrt{-g}\,dV = \oint\sqrt{-g}\,g^{i0}\Gamma^\alpha_{0i}\,df_\alpha.$$
We choose a sufficiently distant surface of integration and use on it the expressions for the
$g_{ik}$ given by formula (101.17), and obtain after a simple calculation
$$\int R_0^0\sqrt{-g}\,dV = \frac{4\pi k}{c^2}\,m = \frac{4\pi k}{c^3}\,P^0.$$
Noting also from the equations of the field that
$$R_0^0 = \frac{8\pi k}{c^4}\left(T_0^0 - \tfrac12 T\right) = \frac{4\pi k}{c^4}\left(T_0^0 - T_1^1 - T_2^2 - T_3^3\right),$$
we get the required expression
$$P^0 = mc = \frac{1}{c}\int\left(T_0^0 - T_1^1 - T_2^2 - T_3^3\right)\sqrt{-g}\,dV. \tag{101.19}$$
This formula expresses the total energy of matter plus constant gravitational field (i.e. the
total mass of the bodies) in terms of the energy-momentum tensor of the matter alone
(R. Tolman, 1930). We recall that in the case of central symmetry of the field, we had still
another expression for this same quantity, formula (97.23).
PROBLEM
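Find the expression for the total four-momentum of matter plus gravitational field, using
formula (32.5).
Solution: In curvilinear coordinates one has
$$S = \int\Lambda\sqrt{-g}\,dV\,dt,$$
and therefore to obtain a quantity which is conserved we must in (32.5) write $\Lambda\sqrt{-g}$ in place
of $\Lambda$, so that the four-momentum has the form
$$P_i = \frac{1}{c}\int\left[-\Lambda\sqrt{-g}\,\delta_i^k + \sum_l\frac{\partial q^{(l)}}{\partial x^i}\,\frac{\partial(\sqrt{-g}\,\Lambda)}{\partial\dfrac{\partial q^{(l)}}{\partial x^k}}\right]dS_k.$$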
† From (92.10), we have
$$R_0^0 = g^{0i}R_{i0} = g^{0i}\left(\frac{\partial\Gamma^l_{i0}}{\partial x^l} + \Gamma^l_{i0}\Gamma^m_{lm} - \Gamma^m_{il}\Gamma^l_{0m}\right),$$
and with the aid of (86.5) and (86.8), we find that this expression can be written as
$$R_0^0 = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^l}\left(\sqrt{-g}\,g^{0i}\Gamma^l_{i0}\right) + g^{im}\Gamma^0_{ml}\Gamma^l_{i0}.$$
With the help of these same relations (86.8), one can easily verify that the second term on the right is identically
equal to $-\tfrac12\Gamma^0_{lm}(\partial g^{lm}/\partial x^0)$, and vanishes as a consequence of the fact that all quantities are independent of
$x^0$. Finally, for the same reason, replacing the summation over $l$ in the first term by a summation over $\alpha$, we
obtain the formula of the text.
In applying this formula to matter, for which the quantities $q^{(l)}$ are different from the $g_{ik}$, we can
take $\sqrt{-g}$ out from under the sign of differentiation, and the integrand turns out to be equal to
$\sqrt{-g}\,T_i^k$, where $T_i^k$ is the energy-momentum tensor of the matter. When applying this same
formula to the gravitational field, we must set $\Lambda = -(c^4/16\pi k)G$, while the quantities $q^{(l)}$ are the
components $g_{ik}$ of the metric tensor. The total four-momentum of field plus matter is thus equal to
$$P_i = \frac{1}{c}\int T_i^k\sqrt{-g}\,dS_k + \frac{c^3}{16\pi k}\int\left[G\sqrt{-g}\,\delta_i^k - \frac{\partial g^{lm}}{\partial x^i}\,\frac{\partial(G\sqrt{-g})}{\partial\dfrac{\partial g^{lm}}{\partial x^k}}\right]dS_k.$$
Using the expression (93.3) for G, we can rewrite this expression in the form:
The second term in the curly brackets gives the four-momentum of the gravitational field in the
absence of matter. The integrand is not symmetric in the indices $i$, $k$, so that one cannot formulate
a law of conservation of angular momentum.
§ 102. Gravitational waves
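Let us consider a weak gravitational field in vacuo. In a weak field the space-time metric
is "almost galilean", i.e. we can choose a system of reference in which the components of the
metric tensor are almost equal to their galilean values, which we denote by
$$g^{(0)}_{\alpha\beta} = -\delta_{\alpha\beta}, \qquad g^{(0)}_{0\alpha} = 0, \qquad g^{(0)}_{00} = 1. \tag{102.1}$$
We can therefore write the $g_{ik}$ in the form
$$g_{ik} = g^{(0)}_{ik} + h_{ik}, \tag{102.2}$$
where the $h_{ik}$ are small corrections, determined by the gravitational field.
With small $h_{ik}$, the components $\Gamma^i_{kl}$, which are expressed in terms of the derivatives of
$g_{ik}$, are also small. Neglecting powers of $h_{ik}$ higher than the first, we may retain in the tensor
$R_{iklm}$ (92.4) only the terms in the first bracket:
$$R_{iklm} = \frac12\left(\frac{\partial^2 h_{im}}{\partial x^k\partial x^l} + \frac{\partial^2 h_{kl}}{\partial x^i\partial x^m} - \frac{\partial^2 h_{km}}{\partial x^i\partial x^l} - \frac{\partial^2 h_{il}}{\partial x^k\partial x^m}\right). \tag{102.3}$$
For the contracted tensor $R_{ik}$, we have to this same accuracy
$$R_{ik} = g^{lm}R_{limk} \approx g^{(0)lm}R_{limk}$$
or
$$R_{ik} = \frac12\left(-g^{(0)lm}\frac{\partial^2 h_{ik}}{\partial x^l\partial x^m} + \frac{\partial^2 h_i^l}{\partial x^k\partial x^l} + \frac{\partial^2 h_k^l}{\partial x^i\partial x^l} - \frac{\partial^2 h}{\partial x^i\partial x^k}\right), \tag{102.4}$$
where $h = h_i^i$.†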
We have chosen our reference system so that the $g_{ik}$ differ little from the $g^{(0)}_{ik}$. But this
condition is also fulfilled for any infinitesimal coordinate transformation, so that we can
still apply to the $h_{ik}$ four conditions (equal to the number of coordinates) which do not
violate the condition that the $h_{ik}$ be small. We choose for these auxiliary conditions the
equations
$$\frac{\partial\psi_i^k}{\partial x^k} = 0, \qquad \psi_i^k = h_i^k - \tfrac12\delta_i^k h. \tag{102.5}$$
† In accordance with the approximation, all operations of raising and lowering indices of small tensors
and vectors are performed here and in the sequel using the "unperturbed" metric tensor $g^{(0)}_{ik}$. Thus
$h_i^k = g^{(0)kl}h_{il}$, etc.
Then we have for the contravariant components $g^{ik}$:
$$g^{ik} = g^{(0)ik} - h^{ik} \tag{102.2a}$$
(so that, to terms of first order, the condition $g_{il}g^{lk} = \delta_i^k$ is satisfied). The determinant of the metric tensor is
$$g = g^{(0)}\left(1 + g^{(0)ik}h_{ik}\right) = g^{(0)}(1 + h). \tag{102.2b}$$
It should be pointed out that even with these conditions the coordinates are not uniquely
determined; let us see what transformations are still admissible. Under the transformation
$x'^i = x^i + \xi^i$, where the $\xi^i$ are small quantities, the tensor $g_{ik}$ goes over into
$$g'_{ik} = g_{ik} - \frac{\partial\xi_i}{\partial x^k} - \frac{\partial\xi_k}{\partial x^i},$$
i.e.
$$h'_{ik} = h_{ik} - \frac{\partial\xi_i}{\partial x^k} - \frac{\partial\xi_k}{\partial x^i} \tag{102.6}$$
[see formula (94.3), in which the covariant differentiation reduces for the present case to
ordinary differentiation, because of the constancy of $g^{(0)}_{ik}$]. It is then easy to show that, if
the $h_{ik}$ satisfy the condition (102.5), the $h'_{ik}$ will also satisfy this condition, if the $\xi_i$ are
solutions of the equation
$$\Box\,\xi^i = 0, \tag{102.7}$$
where $\Box$ denotes the d'Alembertian operator
$$\Box = -g^{(0)lm}\frac{\partial^2}{\partial x^l\partial x^m} = \frac{\partial^2}{\partial x_\alpha^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}.$$
From condition (102.5), the last three terms in the expression (102.4) for $R_{ik}$ cancel one
another, and we find:
$$R_{ik} = \tfrac12\,\Box\,h_{ik}.$$
Thus the equation for the gravitational field in vacuum takes on the form
$$\Box\,h_i^k = 0. \tag{102.8}$$
This is the ordinary wave equation. Thus gravitational fields, like electromagnetic fields,
propagate in vacuum with the velocity of light.
Let us consider a plane gravitational wave. In such a wave the field changes only along
one direction in space; for this direction we choose the axis $x^1 = x$. Equation (102.8) then
changes to
$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)h_i^k = 0, \tag{102.9}$$
the solution of which is any function of $t \pm x/c$ (§ 47).
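That an arbitrary function of $t \pm x/c$ satisfies (102.9) can be illustrated numerically. The Python sketch below is not from the text: the wave profile, the sample point and the step sizes are hypothetical choices. It estimates both second derivatives by central differences and compares $c^2\,\partial^2 h/\partial x^2$ with $\partial^2 h/\partial t^2$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def profile(u):
    """Hypothetical smooth wave profile; any twice-differentiable function would do."""
    return math.exp(-u * u) * math.cos(3.0 * u)


def h(t, x, sign=-1):
    """Field component depending only on t - x/c (sign=-1) or t + x/c (sign=+1)."""
    return profile(t + sign * x / C)


def second_derivatives(t, x, sign=-1, dt=1e-3):
    """Return (c^2 d^2h/dx^2, d^2h/dt^2) estimated by central differences."""
    dx = 0.7 * C * dt  # deliberately not equal to c*dt, so the check is not trivial
    d2x = (h(t, x + dx, sign) - 2.0 * h(t, x, sign) + h(t, x - dx, sign)) / dx**2
    d2t = (h(t + dt, x, sign) - 2.0 * h(t, x, sign) + h(t - dt, x, sign)) / dt**2
    return C**2 * d2x, d2t


if __name__ == "__main__":
    for sign in (-1, +1):
        c2_d2x, d2t = second_derivatives(0.3, 0.2 * C, sign)
        print(sign, c2_d2x, d2t, c2_d2x - d2t)
```

The two printed derivatives agree up to the truncation error of the finite differences, for either sign of the argument.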
Consider a wave propagating in the positive direction along the $x$ axis. Then all the
quantities $h_i^k$ are functions of $t - x/c$. The auxiliary condition (102.5) in this case gives
$\dot\psi_i^1 - \dot\psi_i^0 = 0$, where the dot denotes differentiation with respect to $t$. This equality can be
integrated by simply dropping the sign of differentiation; the integration constants can
be set equal to zero since we are here interested only (as in the case of electromagnetic
waves) in the varying part of the field. Thus, among the components $\psi_i^k$ that are left, we
have the relations
$$\psi_1^1 = \psi_1^0, \qquad \psi_2^1 = \psi_2^0, \qquad \psi_3^1 = \psi_3^0, \qquad \psi_0^1 = \psi_0^0. \tag{102.10}$$
As we pointed out, the conditions (102.5) still do not determine the system of reference
uniquely. We can still subject the coordinates to a transformation of the form
$x'^i = x^i + \xi^i(t - x/c)$. These transformations can be employed to make the four quantities
$\psi_1^0$, $\psi_2^0$, $\psi_3^0$, $\psi_2^2 + \psi_3^3$ vanish; from the equalities (102.10) it then follows that the components
$\psi_1^1$, $\psi_2^1$, $\psi_3^1$, $\psi_0^0$ also vanish. As for the remaining quantities $\psi_2^3$, $\psi_2^2 - \psi_3^3$, they cannot be
made to vanish by any choice of reference system since, as we see from (102.6), these
components do not change under a transformation $\xi_i = \xi_i(t - x/c)$. We note that $\psi = \psi_i^i$ also
vanishes, and therefore $\psi_i^k = h_i^k$.
Thus a plane gravitational wave is determined by two quantities, $h_{23}$ and $h_{22} = -h_{33}$.
In other words, gravitational waves are transverse waves whose polarization is determined
by a symmetric tensor of the second rank in the $yz$ plane, the sum of whose diagonal terms,
$h_{22} + h_{33}$, is zero.
We calculate the energy flux in a plane gravitational wave. The energy flux in a gravitational
field is determined by the quantities $c(-g)t^{0\alpha} \approx ct^{0\alpha}$. In a wave propagating along
the $x^1$ axis, it is clear that only the component $t^{01}$ is different from zero.
The pseudotensor $t^{ik}$ is of second order; we must calculate the $t^{01}$ only to this accuracy.
A calculation making use of the formula (101.6), and the fact that in a plane wave the only
components of $h_{ik}$ different from zero are $h_{23}$, $h_{22} = -h_{33}$, leads to the result:
$$t^{01} = -\frac{c^3}{32\pi k}\left(\frac{\partial h_{22}}{\partial x}\frac{\partial h_{22}}{\partial t} + \frac{\partial h_{33}}{\partial x}\frac{\partial h_{33}}{\partial t} + 2\frac{\partial h_{23}}{\partial x}\frac{\partial h_{23}}{\partial t}\right).$$
If all quantities are functions only of $t - x/c$, then we get from this, finally,
$$t^{01} = \frac{c^2}{16\pi k}\left[\dot h_{23}^2 + \tfrac14\left(\dot h_{22} - \dot h_{33}\right)^2\right]. \tag{102.11}$$
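To get a feeling for the magnitudes involved, the energy flux $ct^{01} = (c^3/16\pi k)\,[\dot h_{23}^2 + \tfrac14(\dot h_{22} - \dot h_{33})^2]$ can be evaluated for a monochromatic wave. The Python sketch below is an illustration only. The waveform $h_{22} = -h_{33} = A\cos\omega t$, $h_{23} = 0$, the amplitude and the frequency are assumptions chosen for the example, and the numerical values of $c$ and $k$ are approximate.

```python
import math

C = 299_792_458.0  # speed of light, m/s
K = 6.674e-11      # gravitational constant k, m^3 kg^-1 s^-2 (approximate)


def energy_flux(h23_dot, h22_dot, h33_dot):
    """Energy flux c*t^{01} in W/m^2 from the time derivatives of h_23, h_22, h_33."""
    bracket = h23_dot**2 + 0.25 * (h22_dot - h33_dot) ** 2
    return C**3 / (16.0 * math.pi * K) * bracket


def mean_flux_monochromatic(amplitude, frequency_hz, samples=1000):
    """Flux averaged over one period for h22 = -h33 = A cos(omega t), h23 = 0."""
    omega = 2.0 * math.pi * frequency_hz
    period = 1.0 / frequency_hz
    total = 0.0
    for i in range(samples):
        t = (i + 0.5) * period / samples
        h22_dot = -amplitude * omega * math.sin(omega * t)
        total += energy_flux(0.0, h22_dot, -h22_dot)
    return total / samples


if __name__ == "__main__":
    # Assumed example values: dimensionless amplitude 1e-21 at 100 Hz.
    print(mean_flux_monochromatic(1e-21, 100.0), "W/m^2")
```

For this waveform the average reduces to $(c^3/16\pi k)\,A^2\omega^2/2$, which the numerical average reproduces.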
Since it has a definite energy, a gravitational wave produces around itself a certain
additional gravitational field. This field is a quantity of higher (second) order compared to
the field of the wave itself, since the energy producing it is a quantity of second order.
As initial conditions for the arbitrary field of a gravitational wave we must assign four
arbitrary functions of the coordinates: because of the transversality of the field there are
just two independent components of $h_{\alpha\beta}$, in addition to which we must also assign their
first time derivatives. Although we have made this enumeration here by starting from the
properties of a weak gravitational field, it is clear that the result, the number 4, cannot be
related to this assumption and applies for any free gravitational field, i.e. for any field which
is not associated with gravitating masses.
PROBLEMS
1. Determine the curvature tensor in a weak plane gravitational wave.
Solution: Calculating $R_{iklm}$ from (92.4) in the linear approximation in the $h_{ik}$, we find the following
nonzero components:
$$-R_{0202} = R_{0303} = -R_{1212} = R_{0212} = R_{0331} = R_{3131} = \sigma,$$
$$R_{0203} = -R_{1231} = -R_{0312} = R_{0231} = \mu,$$
where we use the notation
$$\sigma = -\tfrac12\ddot h_{33} = \tfrac12\ddot h_{22}, \qquad \mu = -\tfrac12\ddot h_{23}.$$
In terms of the three-dimensional tensors $A_{\alpha\beta}$ and $B_{\alpha\beta}$ introduced in problem 3 of § 92, we have:
$$A_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\sigma & \mu \\ 0 & \mu & \sigma \end{pmatrix}, \qquad B_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \mu & \sigma \\ 0 & \sigma & -\mu \end{pmatrix}.$$
By a suitable rotation of the $x^2$, $x^3$ axes, we can make one of the quantities $\sigma$ or $\mu$ vanish (at a
given point of four-space); if we make $\sigma$ vanish in this way, we reduce the curvature tensor to the
degenerate Petrov type II (type N).
2. Find the small corrections to the tensor $R_{ik}$ for an arbitrary "unperturbed" metric $g^{(0)}_{ik}$.
Solution: The corrections to the Christoffel symbols are expressed in terms of $\delta g_{ik} = h_{ik}$ as
$$\delta\Gamma^i_{kl} = \tfrac12\left(h^i_{k;l} + h^i_{l;k} - h_{kl}{}^{;i}\right),$$
which can be verified by direct calculation (all operations of raising and lowering indices, and of
covariant differentiation, are done with the metric $g^{(0)}_{ik}$). For the corrections to the curvature tensor
we find
$$\delta R^i_{\;klm} = \tfrac12\left(h^i_{k;m;l} + h^i_{m;k;l} - h_{km}{}^{;i}{}_{;l} - h^i_{k;l;m} - h^i_{l;k;m} + h_{kl}{}^{;i}{}_{;m}\right).$$
The corrections to the Ricci tensor are:
$$\delta R_{ik} = \delta R^l_{\;ilk} = \tfrac12\left(h^l_{i;k;l} + h^l_{k;i;l} - h_{ik}{}^{;l}{}_{;l} - h_{;i;k}\right). \tag{1}$$
From the relation
$$R^{(0)k}_i + \delta R_i^k = \left(R^{(0)}_{il} + \delta R_{il}\right)\left(g^{(0)kl} - h^{kl}\right)$$
we have for the corrections to the mixed components $R_i^k$:
$$\delta R_i^k = g^{(0)kl}\,\delta R_{il} - h^{kl}R^{(0)}_{il}. \tag{2}$$
§ 103. Exact solutions of the gravitational field equations depending on one variable
In this section we shall consider the possible types of exact solutions of the gravitational
field equations in vacuum, in which all the components of the metric tensor, for a suitable
choice of reference system, are functions of a single variable.† This variable may have either
timelike or spacelike character; to be specific, we shall assume first that it is timelike, and
shall denote it by $x^0 = t$.‡
As we shall see, essentially different types of solutions are obtained depending on whether
or not it is possible to choose a reference system for which all the components $g_{0\alpha} = 0$
while at the same time all other components still depend on only a single variable.
The last condition obviously permits transformations of the coordinates $x^\alpha$ of the form
$$x^\alpha \to x^\alpha + \phi^\alpha(t),$$
where the $\phi^\alpha$ are arbitrary functions of $t$. For such a transformation,
$$g_{0\alpha} \to g_{0\alpha} + g_{\alpha\beta}\dot\phi^\beta$$
(where the dot denotes differentiation with respect to $t$). If the determinant $|g_{\alpha\beta}| \neq 0$, the
system of equations
$$g_{0\alpha} + g_{\alpha\beta}\dot\phi^\beta = 0 \tag{103.1}$$
determines functions $\phi^\alpha(t)$ which accomplish the transformation to a reference system
with $g_{0\alpha} = 0$. By a transformation of the variable $t$ according to $\sqrt{g_{00}}\,dt \to dt$, we can
then make $g_{00}$ equal to unity, so that we obtain a synchronous reference system, in which
$$g_{00} = 1, \qquad g_{0\alpha} = 0, \qquad g_{\alpha\beta} = -\gamma_{\alpha\beta}(t). \tag{103.2}$$
† Exact solutions of the field equations in vacuum, depending on a large number of variables, can be
found in the paper: B. K. Harrison, Phys. Rev. 116, 1285 (1959).
‡ In this section, to simplify the writing of formulas, we set $c = 1$.
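We can now use the equations of gravitation in the form (99.11)-(99.13). Since the
quantities $\gamma_{\alpha\beta}$, and with them the components of the three-dimensional tensor $\kappa_{\alpha\beta} = \dot\gamma_{\alpha\beta}$,
do not depend on the coordinates $x^\alpha$, $R_\alpha^0 = 0$. For the same reason, $P_{\alpha\beta} = 0$, and as a result
the equations of the gravitational field in vacuum reduce to the following system:
$$\dot\kappa_\alpha^\alpha + \tfrac12\kappa_\alpha^\beta\kappa_\beta^\alpha = 0, \tag{103.3}$$
$$\frac{1}{\sqrt{\gamma}}\left(\sqrt{\gamma}\,\kappa_\alpha^\beta\right)^{\boldsymbol{\cdot}} = 0. \tag{103.4}$$
From equation (103.4) it follows that
$$\sqrt{\gamma}\,\kappa_\alpha^\beta = 2\lambda_\alpha^\beta, \tag{103.5}$$
where the $\lambda_\alpha^\beta$ are constants. Contracting on the indices $\alpha$ and $\beta$, we then obtain
$$\kappa_\alpha^\alpha = \frac{\dot\gamma}{\gamma} = \frac{2}{\sqrt{\gamma}}\lambda_\alpha^\alpha,$$
from which we see that $\gamma = \text{const}\cdot t^2$; without loss of generality we may set the constant
equal to unity (simply by a scale change of the coordinates $x^\alpha$); then $\lambda_\alpha^\alpha = 1$. Substitution
of (103.5) into equation (103.3) now gives the relation
$$\lambda_\alpha^\beta\lambda_\beta^\alpha = 1, \tag{103.6}$$
which relates the constants $\lambda_\alpha^\beta$.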
Next we lower the index $\beta$ in equations (103.5) and rewrite them as a system of ordinary
differential equations:
$$\dot\gamma_{\alpha\beta} = \frac{2}{t}\lambda_\alpha^\gamma\gamma_{\gamma\beta}. \tag{103.7}$$
The set of coefficients $\lambda_\alpha^\gamma$ may be regarded as the matrix of some linear substitution. By a
suitable linear transformation of the coordinates $x^1$, $x^2$, $x^3$ (or, what is equivalent, of $g_{1\beta}$,
$g_{2\beta}$, $g_{3\beta}$), we can in general bring this matrix to diagonal form. We shall denote its principal
values (roots of the characteristic equation) by $p_1$, $p_2$, $p_3$, and assume that they are all real
and distinct (concerning other cases, cf. below); the unit vectors along the corresponding
principal axes are $\mathbf{n}^{(1)}$, $\mathbf{n}^{(2)}$ and $\mathbf{n}^{(3)}$. Then the solution of equations (103.7) can be written
in the form
$$\gamma_{\alpha\beta} = t^{2p_1}n^{(1)}_\alpha n^{(1)}_\beta + t^{2p_2}n^{(2)}_\alpha n^{(2)}_\beta + t^{2p_3}n^{(3)}_\alpha n^{(3)}_\beta \tag{103.8}$$
(where the coefficients of the powers of $t$ have been made equal to unity by a suitable scale
change of the coordinates). Finally, choosing the directions of the vectors $\mathbf{n}^{(1)}$, $\mathbf{n}^{(2)}$, $\mathbf{n}^{(3)}$ as
the directions of our axes (we call them $x$, $y$, $z$), we bring the metric to the final form
(E. Kasner, 1922):
$$ds^2 = dt^2 - t^{2p_1}dx^2 - t^{2p_2}dy^2 - t^{2p_3}dz^2. \tag{103.9}$$
Here $p_1$, $p_2$ and $p_3$ are any three numbers satisfying the two relations
$$p_1 + p_2 + p_3 = 1, \qquad p_1^2 + p_2^2 + p_3^2 = 1 \tag{103.10}$$
[the first of these follows from $-g = t^2$, and the second from (103.6)].
The three numbers $p_1$, $p_2$ and $p_3$ obviously cannot all have the same value. The case
where two of them are equal occurs for the triples $0, 0, 1$ and $-1/3, 2/3, 2/3$. In all other cases
the numbers $p_1$, $p_2$ and $p_3$ are all different, one of them being negative and the other two
positive. If we arrange them in the order $p_1 < p_2 < p_3$, their values will lie in the intervals
$$-\tfrac13 \le p_1 \le 0, \qquad 0 \le p_2 \le \tfrac23, \qquad \tfrac23 \le p_3 \le 1. \tag{103.10a}$$
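The relations (103.10) and the intervals (103.10a) are easy to explore numerically. The Python sketch below uses a one-parameter representation of the exponents, $p_1 = -u/(1+u+u^2)$, $p_2 = (1+u)/(1+u+u^2)$, $p_3 = u(1+u)/(1+u+u^2)$ with $u \ge 1$. This parametrization is an assumption brought in for the illustration and does not appear in the text of this section; the function names are likewise hypothetical.

```python
def kasner_exponents(u):
    """Exponents (p1, p2, p3) for a parameter u >= 1 (assumed parametrization)."""
    d = 1.0 + u + u * u
    return (-u / d, (1.0 + u) / d, u * (1.0 + u) / d)


def satisfies_kasner_relations(p, tol=1e-12):
    """Check p1 + p2 + p3 = 1 and p1^2 + p2^2 + p3^2 = 1 within a tolerance."""
    return abs(sum(p) - 1.0) < tol and abs(sum(x * x for x in p) - 1.0) < tol


def within_intervals(p, tol=1e-12):
    """Check the ordering p1 <= p2 <= p3 and the intervals of (103.10a)."""
    p1, p2, p3 = p
    return (-1.0 / 3.0 - tol <= p1 <= tol
            and -tol <= p2 <= 2.0 / 3.0 + tol
            and 2.0 / 3.0 - tol <= p3 <= 1.0 + tol
            and p1 <= p2 <= p3)


if __name__ == "__main__":
    for u in (1.0, 1.5, 3.0, 10.0, 1000.0):
        p = kasner_exponents(u)
        print(u, p, satisfies_kasner_relations(p), within_intervals(p))
```

The value $u = 1$ gives the triple $-1/3, 2/3, 2/3$, and for large $u$ the exponents approach $0, 0, 1$.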
Thus the metric (103.9) corresponds to a homogeneous but anisotropic space whose total
volume increases (with increasing $t$) proportionally to $t$; the linear distances along two of
the axes ($y$ and $z$) increase, while they decrease along the third axis ($x$). The moment
$t = 0$ is a singular point of the solution; at this point the metric has a singularity which cannot
be eliminated by any transformation of the reference system. The only exception is the
case where $p_1 = p_2 = 0$, $p_3 = 1$. For these values we simply have a flat space-time; by the
transformation $t\sinh z = \zeta$, $t\cosh z = \tau$ we can bring the metric (103.9) to galilean form.
A solution of the type of (103.9) also exists in the case where the parameter is spacelike;
we need only make the appropriate changes of sign, for example,
$$ds^2 = x^{2p_1}dt^2 - dx^2 - x^{2p_2}dy^2 - x^{2p_3}dz^2.$$
However, in this case there also exist solutions of another type, which occur when the
characteristic equation of the matrix $\lambda_\alpha^\beta$ in equations (103.7) has complex or coincident
roots (cf. problems 1 and 2). For the case of a timelike parameter $t$, these solutions are not
possible, since the determinant $g$ in them would not satisfy the necessary condition $g < 0$.
A completely different type of solution corresponds to the case where the determinant of
the tensor $g_{\alpha\beta}$ which appears in equations (103.1) is equal to zero. In this case there is no
reference system satisfying conditions (103.2). Instead we can now choose the reference
frame so that:
$$g_{10} = 1, \qquad g_{00} = g_{20} = g_{30} = 0, \qquad g_{\alpha\beta} = g_{\alpha\beta}(x^0),$$
where the determinant $|g_{\alpha\beta}| = 0$. The variable $x^0$ then has "lightlike" character: for $dx^\alpha = 0$,
$dx^0 \neq 0$, the interval $ds$ goes to zero; we denote this variable by $x^0 = \eta$. The corresponding
interval element can be represented in the form
$$ds^2 = 2\,dx^1 d\eta + g_{ab}\left(dx^a + g^a dx^1\right)\left(dx^b + g^b dx^1\right).$$
Here and in the following equations the indices $a, b, c, \ldots$ run through the values 2, 3; we may
treat $g_{ab}$ as a two-dimensional tensor and $g^a$ as the components of a two-dimensional vector.
Computation of the quantities $R_{ab}$, which we shall omit here, gives the following field
equations:
$$R_{ab} = -\tfrac12 g_{ac}\dot g^c g_{bd}\dot g^d = 0$$
(where the dot denotes differentiation with respect to $\eta$). From this it follows that $g_{ac}\dot g^c = 0$,
or $\dot g^c = 0$, i.e., $g^c = \text{const}$. By the transformation $x^a + g^a x^1 \to x^a$ we can therefore bring the
metric to the form
$$ds^2 = 2\,dx^1 d\eta + g_{ab}(\eta)\,dx^a dx^b. \tag{103.11}$$
The determinant $-g$ of this metric tensor coincides with the determinant $|g_{ab}|$, while
the only Christoffel symbols which are different from zero are the following:
$$\Gamma^a_{b0} = \tfrac12\kappa_b^a, \qquad \Gamma^1_{ab} = -\tfrac12\kappa_{ab},$$
where we have introduced the two-dimensional tensor $\kappa_{ab} = \dot g_{ab}$. The only component of
the tensor $R_{ik}$ which is not identically zero is $R_{00}$, so that we have the equation
$$R_{00} = -\tfrac12\dot\kappa_a^a - \tfrac14\kappa_a^b\kappa_b^a = 0. \tag{103.12}$$
Thus the three functions $g_{22}(\eta)$, $g_{33}(\eta)$, $g_{23}(\eta)$ must satisfy just one equation. Therefore
two of them may be assigned arbitrarily. It is convenient to write equation (103.12) in
another form, by representing the quantities $g_{ab}$ as
$$g_{ab} = -\chi^2\gamma_{ab}, \qquad |\gamma_{ab}| = 1. \tag{103.13}$$
Then the determinant $-g = |g_{ab}| = \chi^4$, and substitution in (103.12) and a simple
transformation gives:
$$\ddot\chi + \tfrac18\left(\dot\gamma_{ac}\gamma^{bc}\right)\left(\dot\gamma_{bd}\gamma^{ad}\right)\chi = 0 \tag{103.14}$$
(where $\gamma^{ab}$ is the two-dimensional tensor which is the inverse of $\gamma_{ab}$). If we arbitrarily assign
the functions $\gamma_{ab}(\eta)$ (which are related to one another by the relation $|\gamma_{ab}| = 1$), the function
$\chi(\eta)$ is determined by this equation.
We thus arrive at a solution containing two arbitrary functions. It is easy to see that it
represents a generalization of the treatment in § 102 of a weak plane gravitational wave
(propagating along one direction).† The latter is obtained if we make the transformation
$$\eta = \frac{t + x}{\sqrt2}, \qquad x^1 = \frac{t - x}{\sqrt2}$$
and set $\gamma_{ab} = \delta_{ab} + h_{ab}(\eta)$ (where the $h_{ab}$ are small quantities, which are subjected to the
condition $h_{22} + h_{33} = 0$) and $\chi = 1$; a constant value of $\chi$ satisfies equation (103.14) if we
neglect terms of second order.
Suppose that a weak gravitational wave of finite extension (a "wave packet") passes
through some point $x$ in space. Before the arrival of the packet we have $h_{ab} = 0$ and $\chi = 1$;
after its passage we again have $h_{ab} = 0$, $d^2\chi/dt^2 = 0$, but the inclusion of second order terms
in equation (103.14) leads to the appearance of a nonzero negative value of $d\chi/dt$:
$$\frac{d\chi}{dt} = -\frac18\int\left(\frac{\partial h_{ab}}{\partial t}\right)^2 dt < 0$$
(where the integral is taken over the time during which the wave passes). Thus after the
passage of the wave we will have $\chi = 1 - \text{const}\cdot t$, and after some finite time interval has
elapsed $\chi$ will change sign. But a null value of $\chi$ means that the determinant $g$ of the metric
tensor is zero, i.e. there is a singularity of the metric. However, this singularity is not
physically significant; it is related only to the inadequacy of the reference system furnished by the
passing gravitational wave and can be eliminated by appropriate transformation; after
passage of the wave the space-time will again be flat.
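The behaviour of $\chi$ described above can be reproduced by integrating equation (103.14) numerically for a weak wave packet. The Python sketch below is an illustration under stated assumptions: the packet is taken as $h_{22} = -h_{33} = A\,e^{-\eta^2}$ with $h_{23} = 0$ (a hypothetical profile), the coefficient in (103.14) is replaced by its weak-field form $\tfrac18\dot h_{ab}\dot h_{ab} = \tfrac14\dot h_{22}^2$, and a standard fourth-order Runge-Kutta step is used. The names and step counts are arbitrary.

```python
import math


def h22_dot(eta, amplitude):
    """Derivative of the assumed packet profile h22 = A * exp(-eta^2)."""
    return -2.0 * eta * amplitude * math.exp(-eta * eta)


def coefficient(eta, amplitude):
    """Weak-field coefficient (1/8) * hdot_ab * hdot_ab, equal to hdot22^2 / 4 here."""
    return 0.25 * h22_dot(eta, amplitude) ** 2


def integrate_chi(amplitude, eta_start=-6.0, eta_end=6.0, steps=12000):
    """Integrate chi'' = -coefficient * chi with chi = 1, chi' = 0 before the packet."""
    d = (eta_end - eta_start) / steps
    eta, chi, dchi = eta_start, 1.0, 0.0

    def deriv(e, c, v):
        return v, -coefficient(e, amplitude) * c

    for _ in range(steps):
        k1 = deriv(eta, chi, dchi)
        k2 = deriv(eta + d / 2, chi + d / 2 * k1[0], dchi + d / 2 * k1[1])
        k3 = deriv(eta + d / 2, chi + d / 2 * k2[0], dchi + d / 2 * k2[1])
        k4 = deriv(eta + d, chi + d * k3[0], dchi + d * k3[1])
        chi += d / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dchi += d / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        eta += d
    return chi, dchi


if __name__ == "__main__":
    chi, slope = integrate_chi(0.1)
    # Leading-order estimate: slope = -(1/4) * integral of hdot22^2.
    estimate = -(0.1 ** 2) * math.sqrt(math.pi) / (4.0 * math.sqrt(2.0))
    print(chi, slope, estimate)
```

After the packet has passed, the slope is constant and negative, so $\chi$ decreases linearly and reaches zero after a finite interval of order $1/|\dot\chi|$.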
One can show this directly. If we measure the values of the parameter $\eta$ from its value for
the singular point, then $\chi = \eta$, so that
$$ds^2 = 2\,d\eta\,dx^1 - \eta^2\left[(dx^2)^2 + (dx^3)^2\right].$$
It is easily seen that for this metric $R_{iklm} = 0$, so that the corresponding space-time is flat.
And, in fact, after the transformation
$$\eta x^2 = y, \qquad \eta x^3 = z, \qquad x^1 = \xi - \frac{y^2 + z^2}{2\eta},$$
we get
$$ds^2 = 2\,d\eta\,d\xi - dy^2 - dz^2,$$
after which the substitution $\eta = (t + x)/\sqrt2$, $\xi = (t - x)/\sqrt2$ finally brings the metric to
galilean form.
† The possibility of such a generalization was first pointed out by I. Robinson and H. Bondi (1957). We
also cite papers in which solutions of a related character are found for a larger number of variables: A. Peres,
Phys. Rev. Letters 3, 571 (1959); I. Robinson and A. Trautman, ibid. 4, 431 (1960).
This property of a gravitational wave, the appearance of a fictitious singularity, is, of
course, not related to the fact that the wave is weak, but also occurs in the general solution of
equation (103.12).† As in the example treated here, near the singularity $\chi \sim \eta$, i.e. $-g \sim \eta^4$.
Finally we point out that, in addition to the general solution given above, equation
(103.12) also has special solutions of the form
$$ds^2 = 2\,d\eta\,dx^1 - \eta^{2s_2}(dx^2)^2 - \eta^{2s_3}(dx^3)^2, \tag{103.15}$$
where $s_2$, $s_3$ are numbers related to one another by the relation
$$s_2^2 + s_3^2 = s_2 + s_3.$$
In these solutions the metric has a true singular point (at $\eta = 0$) which cannot be eliminated
by any transformation of the reference system.
PROBLEMS
1. Find the solution of equations (103.7) corresponding to the case where the characteristic
equation of the matrix $\lambda_\alpha^\beta$ has one real ($p_3$) and two complex ($p_{1,2} = p' \pm ip''$) roots.
Solution: In this case the parameter $x^0$, on which all the quantities depend, must have spacelike
character; we denote it by $x$. Correspondingly, we must now have $g_{00} = -1$ in (103.2). Equations
(103.3)-(103.4) are not changed.
The vectors $\mathbf{n}^{(1)}$, $\mathbf{n}^{(2)}$ in (103.8) become complex: $\mathbf{n}^{(1,2)} = (\mathbf{n}' \pm i\mathbf{n}'')/\sqrt2$, where $\mathbf{n}'$, $\mathbf{n}''$ are unit
vectors. Choosing the axes $x^1$, $x^2$, $x^3$ along the directions $\mathbf{n}'$, $\mathbf{n}''$, $\mathbf{n}^{(3)}$, we obtain the solution in the
form
$$-g_{11} = g_{22} = x^{2p'}\cos\left(2p''\ln\frac{x}{a}\right), \qquad g_{12} = -x^{2p'}\sin\left(2p''\ln\frac{x}{a}\right),$$
$$g_{33} = -x^{2p_3}, \qquad -g = -g_{00}|g_{\alpha\beta}| = x^2,$$
where $a$ is a constant (which can no longer be eliminated by a scale change along the $x$ axis, without
changing other coefficients in the expressions given). The numbers $p_1$, $p_2$, $p_3$ again satisfy the
relations (103.10), where the real number $p_3$ is either $< -1/3$ or $> 1$.
2. Do the same for the case where two of the roots coincide ($p_2 = p_3$).
Solution: We know from the general theory of linear differential equations that in this case the
system (103.7) can be brought to the following canonical form:
$$\dot g_{11} = \frac{2p_1}{x}g_{11}, \qquad \dot g_{2\alpha} = \frac{2p_2}{x}g_{2\alpha}, \qquad \dot g_{3\alpha} = \frac{2p_2}{x}g_{3\alpha} + \frac{\lambda}{x}g_{2\alpha}, \qquad \alpha = 2, 3,$$
where $\lambda$ is a constant. If $\lambda = 0$, we return to (103.9). If $\lambda \neq 0$, we can put $\lambda = 1$; then
$$g_{11} = -x^{2p_1}, \qquad g_{2\alpha} = a_\alpha x^{2p_2}, \qquad g_{3\alpha} = b_\alpha x^{2p_2} + a_\alpha x^{2p_2}\ln x.$$
From the condition $g_{23} = g_{32}$, we find that $a_2 = 0$, $a_3 = b_2$. By appropriate choice of scale along
the $x^2$ and $x^3$ axes, we finally bring the metric to the following form:
$$ds^2 = -dx^2 - x^{2p_1}(dx^1)^2 \pm 2x^{2p_2}dx^2dx^3 \pm x^{2p_2}\ln\frac{x}{a}\,(dx^3)^2.$$
The numbers $p_1$, $p_2$ can have the values $1, 0$ or $-1/3, 2/3$.
† This can be shown by using equation (103.12) in exactly the same way as was done in § 100 for the
analogous three-dimensional equation.
§ 104. Gravitational fields at large distances from bodies
Let us consider the stationary gravitational field at large distances $r$ from the body which
produces it, and determine the first terms of its expansion in powers of $1/r$.
In the first approximation, to terms of order $1/r$, the small corrections to the galilean
values are given by the corresponding terms in the expansion of the Schwarzschild solution
(97.14), i.e. by the formulas already given in (101.17):
$$h^{(1)}_{00} = -\frac{2km}{c^2 r}, \qquad h^{(1)}_{\alpha\beta} = -\frac{2km}{c^2 r}n_\alpha n_\beta, \qquad h^{(1)}_{0\alpha} = 0. \tag{104.1}$$
Among the second order terms, proportional to $1/r^2$, there are terms which come from
two different sources. Some of the terms arise, as a result of the nonlinearity of the equations
of gravitation, from the first-order terms. Since the latter depend only on the total mass
(and on no other characteristics) of the body, these second-order terms also can only
depend on the total mass. It is therefore clear that these terms can be obtained by expanding
the Schwarzschild solution (97.14), from which we find:†
$$h^{(2)}_{00} = 0, \qquad h^{(2)}_{\alpha\beta} = -\left(\frac{2km}{c^2 r}\right)^2 n_\alpha n_\beta. \tag{104.2}$$
† It should be noted that the specific appearance of the $h^{(1)}_{\alpha\beta}$, $h^{(2)}_{\alpha\beta}$, $h^{(2)}_{00}$ depends on the particular choice of
the spatial coordinates (galilean at infinity); the form given in the text corresponds to just that definition of $r$
for which the Schwarzschild solution is given by (97.14). So the transformation $x'^\alpha = x^\alpha + \xi^\alpha$, $\xi^\alpha = ax^\alpha/2r$
results [see (102.6)] in the addition to $h^{(1)}_{\alpha\beta}$ of the term $(a/r)(\delta_{\alpha\beta} - n_\alpha n_\beta)$, and by a suitable choice of $a$ we can
obtain:
$$h^{(1)}_{\alpha\beta} = -\frac{2km}{c^2 r}\delta_{\alpha\beta}, \tag{104.1a}$$
which corresponds to the Schwarzschild solution in the form given in problem 3 of § 97.
The remaining second order terms appear as solutions of the already linearized equations
of the field. To calculate them, we use the linearized equations in the form (102.8). In the
stationary case, the wave equation reduces to the Laplace equation
$$\Delta h_i^k = 0. \tag{104.3}$$
The quantities $h_i^k$ are coupled by the auxiliary conditions (102.5), which take the following
form, since the $h_i^k$ are independent of the time:
$$\frac{\partial}{\partial x^\beta}\left(h_\alpha^\beta - \tfrac12 h\,\delta_\alpha^\beta\right) = 0, \tag{104.4}$$
$$\frac{\partial h_0^\beta}{\partial x^\beta} = 0. \tag{104.5}$$
The component $h_{00}$ must be given by a scalar solution of the Laplace equation. We know
that such a solution, proportional to $1/r^2$, has the form $\mathbf{a}\cdot\nabla(1/r)$, where $\mathbf{a}$ is a constant
vector. But a term of this type in $h_{00}$ can always be eliminated by a simple displacement of
the coordinate origin in the first order term in $1/r$. Thus the presence of such a term in $h_{00}$
would simply indicate that we had made a poor choice of the coordinate origin, and is
therefore not of interest.
The components $h_{0\alpha}$ are given by a vector solution of the Laplace equation, i.e. they must
have the form
$$h_{0\alpha} = \lambda_{\alpha\beta}\frac{\partial}{\partial x^\beta}\frac{1}{r},$$
where $\lambda_{\alpha\beta}$ is a constant tensor. The condition (104.5) gives:
$$\lambda_{\alpha\beta}\frac{\partial^2}{\partial x^\alpha\partial x^\beta}\frac{1}{r} = 0,$$
from which it follows that $\lambda_{\alpha\beta}$ must have the form
$$\lambda_{\alpha\beta} = a_{\alpha\beta} + \lambda\delta_{\alpha\beta},$$
where $a_{\alpha\beta}$ is an antisymmetric tensor. But a solution of the form $\lambda(\partial/\partial x^\alpha)(1/r)$ can be
eliminated by the transformation $x'^0 = x^0 + \xi^0$, with $\xi^0 = \lambda/r$ [see (102.6)]. Therefore the
only solution which has a real meaning is of the form
$$h_{0\alpha} = a_{\alpha\beta}\frac{\partial}{\partial x^\beta}\frac{1}{r}, \qquad a_{\alpha\beta} = -a_{\beta\alpha}. \tag{104.6}$$
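Finally, by a similar but more complicated argument, one can show that by a suitable
transformation of the space coordinates one can always eliminate the quantities $h_{\alpha\beta}$ given
by a tensor (symmetric in $\alpha$, $\beta$) solution of the Laplace equation.
There remains the task of examining the meaning of the tensor $a_{\alpha\beta}$ in (104.6). For this
purpose we use (101.16) to compute the total angular momentum tensor $M_{\alpha\beta}$ in terms of the
expressions we have found for the $h_{0\alpha}$ (assuming that all other components of $h_{ik}$ are absent).
To terms of second order in the $h_{0\alpha}$ we have from formula (101.2) (we note that
$g^{\alpha 0} = -h^{\alpha 0} = h_{\alpha 0}$):
$$h^{\alpha 0\beta} = -\frac{c^4}{16\pi k}\frac{\partial}{\partial x^\gamma}\left(h_{\alpha 0}\delta_{\beta\gamma} - h_{\gamma 0}\delta_{\alpha\beta}\right) = -\frac{c^4}{16\pi k}\frac{\partial h_{\alpha 0}}{\partial x^\beta}$$
$$= -\frac{c^4}{16\pi k}a_{\alpha\gamma}\frac{\partial^2}{\partial x^\beta\partial x^\gamma}\frac{1}{r} = -\frac{c^4}{16\pi k}a_{\alpha\gamma}\frac{3n_\beta n_\gamma - \delta_{\beta\gamma}}{r^3}$$
(where $\mathbf{n}$ is a unit vector along the radius vector). By means of these expressions, we find,
after performing the integration over a sphere of radius $r$ ($df_\gamma = n_\gamma r^2\,do$):
$$\frac{1}{c}\oint\left(x^\alpha h^{\beta 0\gamma} - x^\beta h^{\alpha 0\gamma}\right)df_\gamma = -\frac{c^3}{8\pi k}\int\left(n_\alpha n_\gamma a_{\beta\gamma} - n_\beta n_\gamma a_{\alpha\gamma}\right)do$$
$$= -\frac{c^3}{8\pi k}\,\frac{4\pi}{3}\left(\delta_{\alpha\gamma}a_{\beta\gamma} - \delta_{\beta\gamma}a_{\alpha\gamma}\right) = \frac{c^3}{3k}a_{\alpha\beta}.$$
A similar calculation gives:
$$\frac{1}{c}\oint\lambda^{\alpha 0\gamma\beta}\,df_\gamma = -\frac{c^3}{16\pi k}\oint\left(h_{\alpha 0}\,df_\beta - h_{\beta 0}\,df_\alpha\right) = \frac{c^3}{6k}a_{\alpha\beta}.$$
Combining these two results, we get:
$$M_{\alpha\beta} = \frac{c^3}{2k}a_{\alpha\beta}.$$
Thus we finally have:
$$h^{(2)}_{0\alpha} = \frac{2k}{c^3}M_{\alpha\beta}\frac{\partial}{\partial x^\beta}\frac{1}{r} = -\frac{2k}{c^3}M_{\alpha\beta}\frac{n_\beta}{r^2}. \tag{104.7}$$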
We emphasize that in the general case, when the field near the bodies may not be weak,
$M_{\alpha\beta}$ is the angular momentum of the body together with its gravitational field. Only when
the field is weak at all distances can its contribution to the angular momentum be neglected.
We also note that in the case of a rotating body of spherical shape, producing a weak field
everywhere, formula (104.7) is valid over the whole space outside the body.
Formulas (104.1), (104.2) and (104.7) solve our problem to terms of order $1/r^2$. The
covariant components of the metric tensor are
$$g_{ik} = g^{(0)}_{ik} + h^{(1)}_{ik} + h^{(2)}_{ik}. \tag{104.8}$$
To this same accuracy, the contravariant components are
$$g^{ik} = g^{(0)ik} - h^{(1)ik} - h^{(2)ik} + h^{(1)i}{}_{l}\,h^{(1)lk}. \tag{104.9}$$
Formula (104.7) can be written in vector form as†
$$\mathbf g = \frac{2k}{c^3 r^2}\,\mathbf n\times\mathbf M. \tag{104.10}$$
In problem 1 of § 88 it was shown that in a stationary gravitational field there acts on the
particle a "Coriolis force" equal to that which would act on the particle if it were on a body
rotating with angular velocity $\mathbf\Omega = (c/2)\sqrt{g_{00}}\,\nabla\times\mathbf g$. Therefore we may say that in the field
produced by a rotating body (with total angular momentum $\mathbf M$) there acts on a particle
distant from the body a force which is equivalent to the Coriolis force which would appear
for a rotation with angular velocity
$$\mathbf\Omega \approx \frac{c}{2}\,\nabla\times\mathbf g = \frac{k}{c^2 r^3}\left[\mathbf M - 3\mathbf n(\mathbf M\cdot\mathbf n)\right].$$
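The step from (104.10) to the angular velocity above is a curl of a dipole-like field, and it can be cross-checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample point and the units $k = c = 1$ are illustrative assumptions, and $\sqrt{g_{00}}$ is set equal to 1, as is appropriate to this accuracy.

```python
import math

def cross(a, b):
    """Cartesian vector product."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def g_field(x, M, k=1.0, c=1.0):
    """g = (2k / c^3 r^2) n x M, i.e. formula (104.10)."""
    r = math.sqrt(sum(xi*xi for xi in x))
    n = [xi/r for xi in x]
    return tuple(2.0*k/(c**3*r**2)*comp for comp in cross(n, M))

def omega_numeric(x, M, k=1.0, c=1.0, h=1e-5):
    """(c/2) curl g by central differences (sqrt(g00) taken as 1)."""
    def d(i, j):
        # partial derivative of g_i with respect to x_j
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (g_field(xp, M, k, c)[i] - g_field(xm, M, k, c)[i])/(2.0*h)
    curl = (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return tuple(0.5*c*comp for comp in curl)

def omega_closed_form(x, M, k=1.0, c=1.0):
    """(k / c^2 r^3) [M - 3 n (M.n)], the expression quoted in the text."""
    r = math.sqrt(sum(xi*xi for xi in x))
    n = [xi/r for xi in x]
    Mn = sum(Mi*ni for Mi, ni in zip(M, n))
    return tuple(k/(c**2*r**3)*(Mi - 3.0*ni*Mn) for Mi, ni in zip(M, n))
```

Calling `omega_numeric` and `omega_closed_form` at the same point should give the same three components up to the finite-difference error.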
PROBLEM
Determine the systematic ("secular") shift of the orbit of a particle moving in the field of a central
body, associated with the rotation of the latter (J. Lense, H. Thirring, 1918).
Solution: Because all the relativistic effects are small, they superpose linearly with one another,
so in calculating the effects resulting from the rotation of the central body we can neglect the
influence of the non-Newtonian centrally symmetric force field which we considered in § 98; in other
words, we can make the computations assuming that of all the $h_{ik}$ only the $h_{0\alpha}$ are different from
zero.
The orientation of the classical orbit of the particle is determined by two conserved quantities:
the angular momentum of the particle, $\mathbf M = \mathbf r\times\mathbf p$, and the vector
$$\mathbf A = \frac{\mathbf p}{m}\times\mathbf M - \frac{kmm'\,\mathbf r}{r},$$
whose conservation is peculiar to the Newtonian field $\varphi = -km'/r$ (where $m'$ is the mass of the
central body).‡ The vector $\mathbf M$ is perpendicular to the plane of the orbit, while the vector $\mathbf A$ is directed
along the major axis of the ellipse toward the perihelion (and is equal in magnitude to $kmm'e$,
where $e$ is the eccentricity of the orbit). The required secular shift of the orbit can be described in
terms of the change in direction of these vectors.
The Lagrangian for a particle moving in the field (104.10) is
$$L = -mc\,\frac{ds}{dt} = L_0 + \delta L,\qquad \delta L = mc\,\mathbf g\cdot\mathbf v = \frac{2km}{c^2 r^3}\,\mathbf M'\cdot\mathbf v\times\mathbf r \tag{1}$$
† To the present accuracy, the vector $g_\alpha = -g_{0\alpha}/g_{00} \approx -g_{0\alpha}$. For the same reason, in the definitions of
vector product and curl (cf. the footnote on p. 252), we must set $\gamma = 1$, so that they may be taken as usual
for cartesian vectors.
‡ See Mechanics, § 15.
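(where we denote the angular momentum of the central body by $\mathbf M'$ to distinguish it from the angular
momentum $\mathbf M$ of the particle). Then the Hamiltonian is:†
$$\mathscr H = \mathscr H_0 + \delta\mathscr H,\qquad \delta\mathscr H = \frac{2k}{c^2 r^3}\,\mathbf M'\cdot\mathbf r\times\mathbf p.$$
Computing the derivative $\dot{\mathbf M} = \dot{\mathbf r}\times\mathbf p + \mathbf r\times\dot{\mathbf p}$ using the Hamilton equations $\dot{\mathbf r} = \partial\mathscr H/\partial\mathbf p$,
$\dot{\mathbf p} = -\partial\mathscr H/\partial\mathbf r$, we get:
$$\dot{\mathbf M} = \frac{2k}{c^2 r^3}\,\mathbf M'\times\mathbf M. \tag{2}$$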
Since we are interested in the secular variation of $\mathbf M$, we should average this expression over the
period of rotation of the particle. The averaging is conveniently done using the parametric
representation of the dependence of $r$ on the time for motion in an elliptical orbit, in the form
$$r = a(1 - e\cos\xi),\qquad t = \frac{T}{2\pi}(\xi - e\sin\xi)$$
($a$ and $e$ are the semimajor axis and eccentricity of the ellipse):‡
$$\overline{r^{-3}} = \frac{1}{T}\int_0^T\frac{dt}{r^3} = \frac{1}{2\pi a^3}\int_0^{2\pi}\frac{d\xi}{(1 - e\cos\xi)^2} = \frac{1}{a^3(1 - e^2)^{3/2}}.$$
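The orbital average of $r^{-3}$ can be checked by direct quadrature over the eccentric anomaly $\xi$. The sketch below is only an illustration; the function names and the midpoint rule are choices made here, not taken from the text.

```python
import math

def mean_inverse_r_cubed(a, e, steps=2000):
    """Time average of r**-3 over one period of a Kepler ellipse.

    Uses r = a(1 - e cos xi) and dt = (T / 2 pi)(1 - e cos xi) d xi, so that
    the average equals (1 / (2 pi a^3)) * integral of d xi / (1 - e cos xi)^2
    over [0, 2 pi]; the integral is done with the midpoint rule.
    """
    h = 2.0*math.pi/steps
    total = sum(1.0/(1.0 - e*math.cos((i + 0.5)*h))**2 for i in range(steps))
    return total*h/(2.0*math.pi*a**3)

def closed_form(a, e):
    """The result quoted in the text: 1 / (a^3 (1 - e^2)^(3/2))."""
    return 1.0/(a**3*(1.0 - e*e)**1.5)
```

For any eccentricity $0 \le e < 1$ the two functions should agree to high accuracy, since the midpoint rule converges very rapidly for a smooth periodic integrand.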
Thus the secular change of $\mathbf M$ is given by the formula
$$\frac{d\mathbf M}{dt} = \frac{2k\,\mathbf M'\times\mathbf M}{c^2 a^3(1 - e^2)^{3/2}}, \tag{3}$$
i.e. the vector $\mathbf M$ rotates around the axis of rotation of the central body, remaining fixed in
magnitude.
An analogous calculation for the vector $\mathbf A$ gives:
$$\dot{\mathbf A} = \frac{2k}{c^2 r^3}\,\mathbf M'\times\mathbf A + \frac{6k}{c^2 m r^5}\,(\mathbf M\cdot\mathbf M')(\mathbf r\times\mathbf M).$$
The averaging of this expression is carried out in the same way as before; from symmetry
considerations it is clear beforehand that the averaged vector $\mathbf r/r^5$ will be along the major axis of the ellipse,
i.e. along the direction of the vector $\mathbf A$. The computation leads to the following expression for the
secular change of the vector $\mathbf A$:
$$\frac{d\mathbf A}{dt} = \mathbf\Omega\times\mathbf A,\qquad \mathbf\Omega = \frac{2kM'}{c^2 a^3(1 - e^2)^{3/2}}\left\{\mathbf n' - 3\mathbf n(\mathbf n\cdot\mathbf n')\right\} \tag{4}$$
($\mathbf n$ and $\mathbf n'$ are unit vectors along the directions of $\mathbf M$ and $\mathbf M'$), i.e. the vector $\mathbf A$ rotates with angular
velocity $\mathbf\Omega$, remaining fixed in magnitude; this last point shows that the eccentricity of the orbit
does not undergo any secular change.
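Formula (4) is easy to evaluate for given orbital elements. The following sketch uses hypothetical helper names and arbitrary consistent units; it computes $\mathbf\Omega$ from the two angular momentum vectors and can be used to confirm that $\mathbf\Omega\times\mathbf M$ reproduces the right side of (3), because the term along $\mathbf n$ does not contribute to the vector product with $\mathbf M$.

```python
import math

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def unit(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x/norm for x in v)

def secular_omega(M_particle, M_body, a, e, k, c):
    """Omega of formula (4) for orbital angular momentum M_particle and
    angular momentum M_body of the central body (any consistent units)."""
    n = unit(M_particle)        # normal to the orbital plane
    n_prime = unit(M_body)      # rotation axis of the central body
    M_prime = math.sqrt(dot(M_body, M_body))
    coef = 2.0*k*M_prime/(c**2*a**3*(1.0 - e*e)**1.5)
    nn = dot(n, n_prime)
    return tuple(coef*(n_prime[i] - 3.0*n[i]*nn) for i in range(3))
```

When the orbit lies in the equatorial plane ($\mathbf n$ parallel to $\mathbf n'$) the braces in (4) reduce to $-2\mathbf n'$, so the only effect is a rotation of the ellipse within its own plane.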
Formula (3) can be written in the form
$$\frac{d\mathbf M}{dt} = \mathbf\Omega\times\mathbf M,$$
with the same $\mathbf\Omega$ as in (4); in other words, $\mathbf\Omega$ is the angular velocity of rotation of the ellipse "as
a whole". This rotation includes both the additional (compared to that considered in § 98) shift
of the perihelion of the orbit, and the secular rotation of its plane about the direction of the axis of
the body (where the latter effect is absent if the plane of the orbit coincides with the equatorial
plane of the body).
For comparison we note that to the effect considered in § 98 there corresponds
$$\mathbf\Omega = \frac{6\pi km'}{c^2 a(1 - e^2)T}\,\mathbf n.$$
† See Mechanics, § 40.
‡ See Mechanics, § 15.
§ 105. Radiation of gravitational waves
Let us consider next a weak gravitational field, produced by arbitrary bodies, moving with
velocities small compared with the velocity of light.
Because of the presence of matter, the equations of the gravitational field will differ from
the simple wave equation of the form $\Box h_i^k = 0$ (102.8) by having, on the right side of the
equality, terms coming from the energy-momentum tensor of the matter. We write these
equations in the form
$$\tfrac12\,\Box\psi_i^k = \frac{8\pi k}{c^4}\,\tau_i^k, \tag{105.1}$$
where we have introduced in place of the $h_i^k$ the more convenient quantities $\psi_i^k = h_i^k - \tfrac12\delta_i^k h$,
and where $\tau_i^k$ denotes the auxiliary quantities which are obtained upon going over from the
exact equations of gravitation to the case of a weak field in the approximation we are
considering. It is easy to verify that the components $\tau_0^0$ and $\tau_\alpha^0$ are obtained directly from the
corresponding components $T_i^k$ by taking out from them the terms of the order of magnitude
in which we are interested; as for the components $\tau_\beta^\alpha$, they contain along with terms obtained
from the $T_\beta^\alpha$, also terms of second order from $R_i^k - \tfrac12\delta_i^k R$.
The quantities $\psi_i^k$ satisfy the condition (102.5) $\partial\psi_i^k/\partial x^k = 0$. From (105.1) it follows that
this same equation holds for the $\tau_i^k$:
$$\frac{\partial\tau_i^k}{\partial x^k} = 0. \tag{105.2}$$
This equation here replaces the general relation $T^k_{i;k} = 0$.
Using the equations which we have obtained, let us consider the problem of the energy
radiated by moving bodies in the form of gravitational waves. The solution of this problem
requires the determination of the gravitational field in the "wave zone", i.e. at distances
large compared with the wavelength of the radiated waves.
In principle, all the calculations are completely analogous to those which we carried out
for electromagnetic waves. Equation (105.1) for a weak gravitational field coincides in
form with the equation of the retarded potentials (§ 62). Therefore we can immediately
write its general solution in the form
$$\psi_i^k = -\frac{4k}{c^4}\int\left(\tau_i^k\right)_{t - R/c}\frac{dV}{R}. \tag{105.3}$$
Since the velocities of all the bodies in the system are small, we can write, for the field
at large distances from the system (see §§ 66 and 67),
$$\psi_i^k = -\frac{4k}{c^4 R_0}\int\left(\tau_i^k\right)_{t - R_0/c}\,dV, \tag{105.4}$$
where $R_0$ is the distance from the origin, chosen anywhere in the interior of the system.
From now on we shall, for brevity, omit the index $t - (R_0/c)$ in the integrand.
For the evaluation of these integrals we use equation (105.2). Lowering the index on the
$\tau_i^k$ and separating space and time components, we write (105.2) in the form
$$\frac{\partial\tau_{\alpha\gamma}}{\partial x^\gamma} - \frac{\partial\tau_{\alpha 0}}{\partial x^0} = 0,\qquad
\frac{\partial\tau_{0\gamma}}{\partial x^\gamma} - \frac{\partial\tau_{00}}{\partial x^0} = 0. \tag{105.5}$$
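Multiplying the first equation by $x^\beta$, we integrate over all space,
$$\frac{\partial}{\partial x^0}\int\tau_{\alpha 0}\,x^\beta\,dV = \int\frac{\partial\tau_{\alpha\gamma}}{\partial x^\gamma}\,x^\beta\,dV
= \int\frac{\partial(\tau_{\alpha\gamma}x^\beta)}{\partial x^\gamma}\,dV - \int\tau_{\alpha\beta}\,dV.$$
Since at infinity $\tau_{ik} = 0$, the first integral on the right, after transformation by Gauss'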
theorem, vanishes. Taking half the sum of the remaining equation and the same equation
with transposed indices, we find
$$\int\tau_{\alpha\beta}\,dV = -\frac12\,\frac{\partial}{\partial x^0}\int\left(\tau_{\alpha 0}x^\beta + \tau_{\beta 0}x^\alpha\right)dV.$$
Next, we multiply the second equation of (105.5) by $x^\alpha x^\beta$, and again integrate over all
space. An analogous transformation leads to
$$\frac{\partial}{\partial x^0}\int\tau_{00}\,x^\alpha x^\beta\,dV = -\int\left(\tau_{\alpha 0}x^\beta + \tau_{\beta 0}x^\alpha\right)dV.$$
Comparing the two results, we find
$$\int\tau_{\alpha\beta}\,dV = \frac12\left(\frac{\partial}{\partial x^0}\right)^2\int\tau_{00}\,x^\alpha x^\beta\,dV. \tag{105.6}$$
Thus the integrals of all the $\tau_{\alpha\beta}$ appear as expressions in terms of integrals containing only
the component $\tau_{00}$. But this component, as was shown earlier, is simply equal to the
corresponding component $T_{00}$ of the energy-momentum tensor and can be written to sufficient
accuracy [see (96.1)] as:
$$\tau_{00} = \mu c^2. \tag{105.7}$$
Substituting this in (105.6) and introducing the time $t = x^0/c$, we find for (105.4)
$$\psi_{\alpha\beta} = -\frac{2k}{c^4 R_0}\,\frac{\partial^2}{\partial t^2}\int\mu\,x^\alpha x^\beta\,dV. \tag{105.8}$$
At large distances from the bodies, we can consider the waves as plane (over not too large
regions of space). Therefore we can calculate the flux of energy radiated by the system, say
along the direction of the $x^1$ axis, by using formula (102.11). In this formula there enter the
components $h_{23} = \psi_{23}$ and $h_{22} - h_{33} = \psi_{22} - \psi_{33}$. From (105.8), we find for them the
expressions
$$h_{23} = -\frac{2k}{3c^4 R_0}\,\ddot D_{23},\qquad h_{22} - h_{33} = -\frac{2k}{3c^4 R_0}\left(\ddot D_{22} - \ddot D_{33}\right)$$
(the dot denotes time differentiation), where we have introduced the tensor
$$D_{\alpha\beta} = \int\mu\left(3x^\alpha x^\beta - \delta_{\alpha\beta}x_\gamma^2\right)dV, \tag{105.9}$$
the "quadrupole moment" of the mass (see § 96). As a result, we obtain the energy flux
along the $x^1$ axis in the form
$$ct^{10} = \frac{k}{36\pi c^5 R_0^2}\left[\left(\frac{\dddot D_{22} - \dddot D_{33}}{2}\right)^2 + \dddot D_{23}^{\,2}\right]. \tag{105.10}$$
Knowing the radiation in the direction of the $x^1$ axis, it is easy to determine the radiation
in an arbitrary direction characterized by the unit vector $\mathbf n$. To do this we must construct
from the components of the tensor $D_{\alpha\beta}$ and the vector $n_\alpha$ a scalar, quadratic in the $D_{\alpha\beta}$,
which for $n_1 = 1$, $n_2 = n_3 = 0$ reduces to the expression in square brackets in (105.10).
The result for the intensity of energy radiated into solid angle $do$ turns out to be
$$dI = \frac{k}{36\pi c^5}\left[\frac14\left(\dddot D_{\alpha\beta}n_\alpha n_\beta\right)^2 + \frac12\,\dddot D_{\alpha\beta}^{\,2} - \dddot D_{\alpha\beta}\dddot D_{\alpha\gamma}n_\beta n_\gamma\right]do. \tag{105.11}$$
The total radiation in all directions, i.e., the energy loss of the system per unit time
($-d\mathscr E/dt$), can be found by averaging the flux over all directions and multiplying the result
by $4\pi$. The averaging is easily performed using the formulas given in the footnote on p. 189.
This averaging leads to the following expression for the energy loss:
$$-\frac{d\mathscr E}{dt} = \frac{k}{45c^5}\,\dddot D_{\alpha\beta}^{\,2}. \tag{105.12}$$
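The passage from (105.11) to (105.12) amounts to the statement that the bracket in (105.11), integrated over all directions, equals $(4\pi/5)\,\dddot D_{\alpha\beta}^{\,2}$ for a symmetric traceless tensor, since $(4\pi/5)/(36\pi) = 1/45$. The sketch below checks this by brute-force quadrature over the sphere; the function name, the grid sizes and the sample tensor are illustrative assumptions.

```python
import math

def angular_integral(T, n_u=400, n_phi=64):
    """Integrate [ (1/4)(T_ab n_a n_b)^2 + (1/2) T_ab^2 - T_ab T_ac n_b n_c ]
    over all directions n, for a symmetric traceless 3x3 matrix T that stands
    in for the third time derivative of D_ab.  Midpoint rule in u = cos(theta)
    and in the azimuth phi."""
    T2 = sum(T[a][b]**2 for a in range(3) for b in range(3))
    du = 2.0/n_u
    dphi = 2.0*math.pi/n_phi
    total = 0.0
    for i in range(n_u):
        u = -1.0 + (i + 0.5)*du
        s = math.sqrt(1.0 - u*u)
        for j in range(n_phi):
            phi = (j + 0.5)*dphi
            n = (s*math.cos(phi), s*math.sin(phi), u)
            Tn = [sum(T[a][b]*n[b] for b in range(3)) for a in range(3)]
            nTn = sum(n[a]*Tn[a] for a in range(3))
            # T_ab T_ac n_b n_c is the squared length of the vector T n
            total += (0.25*nTn**2 + 0.5*T2 - sum(t*t for t in Tn))*du*dphi
    return total
```

The same routine also illustrates the reduction to (105.10): for $\mathbf n$ along $x^1$ the bracket equals $\tfrac14(T_{22} - T_{33})^2 + T_{23}^2$ when $T$ is traceless.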
It is necessary to note that the numerical value of this energy loss, even for astronomical
objects, is so small that its effect on the motion, even over cosmic time intervals, is
completely negligible (thus, for double stars, the energy loss in a year turns out to be $\sim 10^{-12}$
of the total energy).
PROBLEM
Two bodies attracting each other according to Newton's law move in a circular orbit (around
their common center of mass). Calculate the velocity of approach of the two bodies, due to the loss
of energy by radiation of gravitational waves.
Solution: If $m_1$, $m_2$ are the masses of the bodies, and $r$ their mutual distance (constant for motion
in a circular orbit), then a calculation using (105.12) gives
$$-\frac{d\mathscr E}{dt} = \frac{32k}{5c^5}\left(\frac{m_1 m_2}{m_1 + m_2}\right)^2 r^4\omega^6,$$
where $\omega = 2\pi/T$, and $T$ is the period of rotation. The frequency $\omega$ is related to $r$ by $\omega^2 r^3 = k(m_1 + m_2)$.
Since
$$\mathscr E = -\frac{km_1 m_2}{2r},\qquad \dot r = \frac{2r^2}{km_1 m_2}\,\frac{d\mathscr E}{dt},$$
we get finally
$$\dot r = -\frac{64k^3 m_1 m_2(m_1 + m_2)}{5c^5 r^3}.$$
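The two results of this problem can be put in numerical form as follows. This is a small sketch with hypothetical function names; the constants `k` and `c` must be supplied by the caller in a consistent system of units.

```python
def energy_loss_rate(m1, m2, r, k, c):
    """-dE/dt for a circular orbit of separation r, from the first formula
    of the solution with omega^2 r^3 = k (m1 + m2)."""
    omega2 = k*(m1 + m2)/r**3
    reduced = m1*m2/(m1 + m2)
    return 32.0*k/(5.0*c**5)*reduced**2*r**4*omega2**3

def approach_velocity(m1, m2, r, k, c):
    """dr/dt = -64 k^3 m1 m2 (m1 + m2) / (5 c^5 r^3)."""
    return -64.0*k**3*m1*m2*(m1 + m2)/(5.0*c**5*r**3)
```

The two functions are tied together by $\dot r = (2r^2/km_1m_2)\,d\mathscr E/dt$, which gives a quick consistency check on the algebra.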
§ 106. The equations of motion of a system of bodies in the second approximation
The expression (105.12) found in the preceding section for the loss of energy of a system
in the form of radiation of gravitational waves contains a factor $1/c^5$, i.e. this loss appears
only in the fifth approximation in $1/c$. In the first four approximations, the energy of the
system remains constant. From this it follows that a system of gravitating bodies can be
described by a Lagrangian correctly to terms of order $1/c^4$ in the absence of an
electromagnetic field, for which a Lagrangian exists in general only to terms of second order (§ 65).
Here we shall give the derivation of the Lagrangian of a system of bodies to terms of
second order. We thus find the equations of motion of the system in the next approximation
after the Newtonian.
We shall neglect the dimensions and internal structure of the bodies, regarding them as
"pointlike"; in other words, we shall restrict ourselves to the zeroth approximation in the
expansion in powers of the ratios of the dimensions $a$ of the bodies to their mutual
separations $l$.
To solve our problem we must start with the determination, in this same approximation,
of the weak gravitational field produced by the bodies at distances large compared to their
dimensions, but at the same time small compared to the wavelength $\lambda$ of the gravitational
waves radiated by the system ($a \ll r \ll \lambda \sim lc/v$).
In the first approximation, in equations (105.1) we must neglect terms containing second
time derivatives, with the factor $1/c^2$, and of all the components $\tau_i^k$ assume different from
zero only the component $\tau_0^0 = \mu c^2$ which contains $c^2$ (whereas the other components contain
the first or second power of the velocities of the bodies). We then obtain the equations
$$\Delta\psi_\alpha^\beta = 0,\qquad \Delta\psi_0^\alpha = 0,\qquad \Delta\psi_0^0 = \frac{16\pi k}{c^2}\,\mu.$$
We must look for solutions of these equations which go to zero at infinity (galilean metric).
It therefore follows from the first two equations that $\psi_\alpha^\beta = 0$, $\psi_0^\alpha = 0$. Comparing the third
equation with equation (96.2) for the Newtonian potential $\varphi$, we find $\psi_0^0 = 4\varphi/c^2$. Then
we have for the components of the tensor $h_i^k = \psi_i^k - \tfrac12\psi\,\delta_i^k$ the following values:†
$$h_\alpha^\beta = -\frac{2}{c^2}\,\varphi\,\delta_\alpha^\beta, \tag{106.1}$$
$$h_0^\alpha = 0,\qquad h_0^0 = \frac{2}{c^2}\,\varphi, \tag{106.2}$$
and for the interval,
$$ds^2 = \left(1 + \frac{2}{c^2}\varphi\right)c^2dt^2 - \left(1 - \frac{2}{c^2}\varphi\right)\left(dx^2 + dy^2 + dz^2\right). \tag{106.3}$$
We note that first order terms containing $\varphi$ appear not only in $g_{00}$ but also in $g_{\alpha\beta}$; in
§ 87 it was already stated that, in the equations of motion of the particle, the correction
terms in $g_{\alpha\beta}$ give quantities of higher order than the terms coming from $g_{00}$; as a consequence
of this, by a comparison with the Newtonian equations of motion we can determine only
$g_{00}$.
As will be seen from the sequel, to obtain the required equations of motion it is sufficient
to know the spatial components $h_{\alpha\beta}$ to the accuracy ($\sim 1/c^2$) with which they are given in
(106.1); the mixed components (which are absent in the $1/c^2$ approximation) are needed to
terms of order $1/c^3$, and the time component $h_{00}$ to terms in $1/c^4$. To calculate them we turn
once again to the general equations of gravitation, and consider the terms of corresponding
order in these equations.
Disregarding the fact that the bodies are macroscopic, we must write the energy-momentum
tensor of the matter in the form (33.4), (33.5). In curvilinear coordinates, this expression
is rewritten as
$$T^{ik} = \sum_a\frac{m_a c}{\sqrt{-g}}\,\frac{dx^i}{ds}\,\frac{dx^k}{dt}\,\delta(\mathbf r - \mathbf r_a) \tag{106.4}$$
[for the appearance of the factor $1/\sqrt{-g}$, see the analogous transition in (90.4)]; the
summation extends over all the bodies in the system.
The component
$$T_{00} = \sum_a\frac{m_a c^3}{\sqrt{-g}}\,g_{00}^2\,\frac{dt}{ds}\,\delta(\mathbf r - \mathbf r_a)$$
in first approximation (for galilean $g_{ik}$) is equal to $\sum_a m_a c^2\delta(\mathbf r - \mathbf r_a)$; in the next
approximation, we substitute for $g_{ik}$ from (106.3) and find, after a simple computation:
$$T_{00} = \sum_a m_a c^2\left(1 + \frac{5\varphi_a}{c^2} + \frac{v_a^2}{2c^2}\right)\delta(\mathbf r - \mathbf r_a), \tag{106.5}$$
where $\mathbf v$ is the ordinary three-dimensional velocity ($v^\alpha = dx^\alpha/dt$) and $\varphi_a$ is the potential
of the field at the point $\mathbf r_a$. (As yet we pay no attention to the fact that $\varphi_a$ contains an
infinite part, the potential of the self-field of the particle $m_a$; concerning this, see below.)
† This result is, of course, in complete agreement with the formulas found in § 104 for $h^{(1)}_{ik}$ [where $h^{(1)}_{\alpha\beta}$ is
represented in the form (104.1a)].
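As regards the components $T_{\alpha\beta}$, $T_{0\alpha}$ of the energy-momentum tensor, in this
approximation it is sufficient to keep for them only the first terms in the expansion of the expression
(106.4):
$$T_{\alpha\beta} = \sum_a m_a v_{a\alpha}v_{a\beta}\,\delta(\mathbf r - \mathbf r_a),\qquad
T_{0\alpha} = -\sum_a m_a c\,v_{a\alpha}\,\delta(\mathbf r - \mathbf r_a). \tag{106.6}$$
Next we proceed to compute the components of the tensor $R_{ik}$. The calculation is
conveniently done using the formula $R_{ik} = g^{lm}R_{limk}$ with $R_{limk}$ given by (92.4). Here we must
remember that the quantities $h_{\alpha\beta}$ and $h_{00}$ contain no terms of order lower than $1/c^2$, and $h_{0\alpha}$
no terms lower than $1/c^3$; differentiation with respect to $x^0 = ct$ raises the order of smallness
of quantities by unity.
The main terms in $R_{00}$ are of order $1/c^2$; in addition to them we must also keep terms of
the next nonvanishing order $1/c^4$. A simple computation gives the result:
$$R_{00} = \frac{1}{c}\frac{\partial}{\partial t}\left(\frac{\partial h_0^\alpha}{\partial x^\alpha} - \frac{1}{2c}\frac{\partial h_\alpha^\alpha}{\partial t}\right)
+ \frac12\Delta h_{00} + \frac12 h^{\alpha\beta}\frac{\partial^2 h_{00}}{\partial x^\alpha\,\partial x^\beta}
- \frac14\left(\frac{\partial h_{00}}{\partial x^\alpha}\right)^2
- \frac14\frac{\partial h_{00}}{\partial x^\beta}\left(2\frac{\partial h_\beta^\alpha}{\partial x^\alpha} - \frac{\partial h_\alpha^\alpha}{\partial x^\beta}\right).$$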
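In this computation we have still not used any auxiliary condition for the quantities $h_{ik}$.
Making use of this freedom, we now impose the condition
$$\frac{\partial h_0^\alpha}{\partial x^\alpha} - \frac{1}{2c}\frac{\partial h_\alpha^\alpha}{\partial t} = 0, \tag{106.7}$$
as a result of which all the terms containing the components $h_{0\alpha}$ drop out of $R_{00}$. In the
remaining terms we substitute
$$h_\alpha^\beta = -\frac{2}{c^2}\,\varphi\,\delta_\alpha^\beta,\qquad h_{00} = \frac{2}{c^2}\,\varphi + O\!\left(\frac{1}{c^4}\right)$$
and obtain, to the required accuracy,
$$R_{00} = \frac12\Delta h_{00} + \frac{2}{c^4}\,\varphi\Delta\varphi - \frac{2}{c^4}(\nabla\varphi)^2, \tag{106.8}$$
where we have gone over to three-dimensional notation; here $\varphi$ is the Newtonian potential
of the system of point particles, i.e.
$$\varphi = -k\sum_a\frac{m_a}{|\mathbf r - \mathbf r_a|}.$$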
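In computing the components $R_{0\alpha}$ it is sufficient to keep only the terms of the first
nonvanishing order, $1/c^3$. In similar fashion, we find:
$$R_{0\alpha} = \frac{1}{2c}\frac{\partial^2 h_\alpha^\beta}{\partial t\,\partial x^\beta}
+ \frac12\frac{\partial^2 h_0^\beta}{\partial x^\alpha\,\partial x^\beta}
- \frac{1}{2c}\frac{\partial^2 h_\beta^\beta}{\partial t\,\partial x^\alpha}
+ \frac12\Delta h_{0\alpha},$$
and then, using the condition (106.7):
$$R_{0\alpha} = \frac12\Delta h_{0\alpha} + \frac{1}{2c^3}\frac{\partial^2\varphi}{\partial t\,\partial x^\alpha}. \tag{106.9}$$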
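Using the expressions (106.5)-(106.9), we now write the Einstein equations
$$R_{ik} = \frac{8\pi k}{c^4}\left(T_{ik} - \frac12 g_{ik}T\right). \tag{106.10}$$
The time component of equation (106.10) gives:
$$\Delta h_{00} + \frac{4}{c^4}\,\varphi\Delta\varphi - \frac{4}{c^4}(\nabla\varphi)^2
= \frac{8\pi k}{c^4}\sum_a m_a c^2\left(1 + \frac{5\varphi_a}{c^2} + \frac{3v_a^2}{2c^2}\right)\delta(\mathbf r - \mathbf r_a);$$
making use of the identity
$$4(\nabla\varphi)^2 = 2\Delta(\varphi^2) - 4\varphi\Delta\varphi$$
and the equation of the Newtonian potential
$$\Delta\varphi = 4\pi k\sum_a m_a\,\delta(\mathbf r - \mathbf r_a), \tag{106.11}$$
we rewrite this equation in the form
$$\Delta\left(h_{00} - \frac{2}{c^4}\,\varphi^2\right) = \frac{8\pi k}{c^2}\sum_a m_a\left(1 + \frac{\varphi'_a}{c^2} + \frac{3v_a^2}{2c^2}\right)\delta(\mathbf r - \mathbf r_a). \tag{106.12}$$
After completing all the computations, we have replaced $\varphi_a$ on the right side of (106.12) by
$$\varphi'_a = -k\sum_b{}'\frac{m_b}{|\mathbf r_a - \mathbf r_b|},$$
i.e. by the potential at the point $\mathbf r_a$ of the field produced by all the particles except for the
particle $m_a$; the exclusion of the infinite self-potential of the particles (in the method used
by us, which regards the particles as pointlike) corresponds to a "renormalization" of their
masses, as a result of which they take on their true values, which take into account the field
produced by the particles themselves.†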
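The solution of (106.12) can be given immediately, using the familiar relation (36.9)
$$\Delta\frac{1}{r} = -4\pi\delta(\mathbf r).$$
We thus find:
$$h_{00} = \frac{2\varphi}{c^2} + \frac{2\varphi^2}{c^4} - \frac{2k}{c^4}\sum_a\frac{m_a\varphi'_a}{|\mathbf r - \mathbf r_a|} - \frac{3k}{c^4}\sum_a\frac{m_a v_a^2}{|\mathbf r - \mathbf r_a|}. \tag{106.13}$$
The mixed component of equation (106.10) gives
$$\Delta h_{0\alpha} = -\frac{16\pi k}{c^3}\sum_a m_a v_{a\alpha}\,\delta(\mathbf r - \mathbf r_a) - \frac{1}{c^3}\frac{\partial^2\varphi}{\partial t\,\partial x^\alpha}. \tag{106.14}$$
† Actually, if there is only one particle at rest, the right side of the equation will have
simply $(8\pi k/c^2)m_a\delta(\mathbf r - \mathbf r_a)$, and this equation will determine correctly (in second approximation) the field
produced by the particle.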
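The solution of this linear equation is†
$$h_{0\alpha} = \frac{4k}{c^3}\sum_a\frac{m_a v_{a\alpha}}{|\mathbf r - \mathbf r_a|} - \frac{1}{c^3}\frac{\partial^2 f}{\partial t\,\partial x^\alpha},$$
where $f$ is the solution of the auxiliary equation
$$\Delta f = \varphi = -\sum_a\frac{km_a}{|\mathbf r - \mathbf r_a|}.$$
Using the relation $\Delta r = 2/r$, we find:
$$f = -\frac{k}{2}\sum_a m_a|\mathbf r - \mathbf r_a|,$$
and then, after a simple computation, we finally obtain:
$$h_{0\alpha} = \frac{k}{2c^3}\sum_a\frac{m_a}{|\mathbf r - \mathbf r_a|}\left[7v_{a\alpha} + (\mathbf v_a\cdot\mathbf n_a)\,n_{a\alpha}\right], \tag{106.15}$$
where $\mathbf n_a$ is a unit vector along the direction of the vector $\mathbf r - \mathbf r_a$.
The expressions (106.1), (106.13) and (106.15) are sufficient for computing the required
Lagrangian to terms of second order.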
The Lagrangian for a single particle, in a gravitational field produced by other particles
and assumed to be given, is
$$L_a = -m_a c\,\frac{ds}{dt} = -m_a c^2\left(1 + h_{00} + 2h_{0\alpha}\frac{v_a^\alpha}{c} - \frac{v_a^2}{c^2} + h_{\alpha\beta}\frac{v_a^\alpha v_a^\beta}{c^2}\right)^{1/2}.$$
Expanding the square root and dropping the irrelevant constant $-m_a c^2$, we rewrite this
expression, to the required accuracy, as
$$L_a = \frac{m_a v_a^2}{2} + \frac{m_a v_a^4}{8c^2} - m_a c^2\left(\frac{h_{00}}{2} + h_{0\alpha}\frac{v_a^\alpha}{c} + \frac{1}{2c^2}h_{\alpha\beta}v_a^\alpha v_a^\beta - \frac{h_{00}^2}{8} + \frac{h_{00}}{4c^2}v_a^2\right). \tag{106.16}$$
Here the values of all the $h_{ik}$ are taken at the point $\mathbf r_a$; again we must drop terms which become
infinite, which amounts to a "renormalization" of the mass $m_a$ appearing as a coefficient
in $L_a$.
The further course of the calculations is the following. The total Lagrangian of the system
is, of course, not equal to the sum of the Lagrangians $L_a$ for the individual bodies, but
must be constructed so that it leads to the correct values of the forces $\mathbf f_a$ acting on each of
the bodies for a given motion of the others. For this purpose we compute the forces $\mathbf f_a$ by
differentiating the Lagrangian $L_a$:
$$\mathbf f_a = \left(\frac{\partial L_a}{\partial\mathbf r}\right)_{\mathbf r = \mathbf r_a}$$
(the differentiation is carried out with respect to the running coordinate $\mathbf r$ of the "field
point" in the expressions for $h_{ik}$). It is then easy to form the total Lagrangian $L$, from which
all of the forces $\mathbf f_a$ are obtained by taking the partial derivatives $\partial L/\partial\mathbf r_a$.
† In the stationary case, the second term on the right of equation (106.14) is absent. At large distances
from the system, its solution can be written immediately by analogy with the solution (44.3) of equation (43.4)
$$h_{0\alpha} = \frac{2k}{c^3 r^2}\,(\mathbf M\times\mathbf n)_\alpha$$
(where $\mathbf M = \int\mathbf r\times\mu\mathbf v\,dV = \sum_a m_a\,\mathbf r_a\times\mathbf v_a$ is the angular momentum of the system), in complete agreement
with formula (104.10).
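Omitting the simple intermediate computations, we give immediately the final result for
the Lagrangian:†
$$L = \sum_a\frac{m_a v_a^2}{2} + \sum_a\sum_b{}'\frac{3km_a m_b v_a^2}{2c^2 r_{ab}} + \sum_a\frac{m_a v_a^4}{8c^2} + \sum_a\sum_b{}'\frac{km_a m_b}{2r_{ab}}
- \sum_a\sum_b{}'\frac{km_a m_b}{4c^2 r_{ab}}\left[7\,\mathbf v_a\cdot\mathbf v_b + (\mathbf v_a\cdot\mathbf n_{ab})(\mathbf v_b\cdot\mathbf n_{ab})\right]
- \sum_a\sum_b{}'\sum_c{}'\frac{k^2 m_a m_b m_c}{2c^2 r_{ab}r_{ac}}, \tag{106.17}$$
where $r_{ab} = |\mathbf r_a - \mathbf r_b|$, $\mathbf n_{ab}$ is a unit vector along the direction $\mathbf r_a - \mathbf r_b$, and the prime on the
summation sign means that we should omit the term with $b = a$ or $c = a$.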
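The structure of the primed sums in (106.17) is perhaps easiest to see in code. The sketch below evaluates the Lagrangian for $N$ point masses and, separately, the two-body form used in Problem 3 below; the function names and the list-based data layout are illustrative assumptions, not part of the text.

```python
import math

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def separation(ra, rb):
    """Return (r_ab, n_ab) for the vector r_a - r_b."""
    d = [x - y for x, y in zip(ra, rb)]
    r = math.sqrt(dot(d, d))
    return r, [x/r for x in d]

def lagrangian_106_17(m, pos, vel, k, c):
    """Second-approximation Lagrangian (106.17) for N point masses."""
    N = len(m)
    L = 0.0
    for a in range(N):
        va2 = dot(vel[a], vel[a])
        L += 0.5*m[a]*va2 + m[a]*va2**2/(8.0*c**2)
        for b in range(N):
            if b == a:          # primed sum: omit b = a
                continue
            r_ab, n_ab = separation(pos[a], pos[b])
            L += 3.0*k*m[a]*m[b]*va2/(2.0*c**2*r_ab)
            L += k*m[a]*m[b]/(2.0*r_ab)
            L -= k*m[a]*m[b]/(4.0*c**2*r_ab)*(
                7.0*dot(vel[a], vel[b])
                + dot(vel[a], n_ab)*dot(vel[b], n_ab))
            for cc in range(N):
                if cc == a:     # primed sum: omit c = a (c = b is allowed)
                    continue
                r_ac, _ = separation(pos[a], pos[cc])
                L -= k**2*m[a]*m[b]*m[cc]/(2.0*c**2*r_ab*r_ac)
    return L

def lagrangian_two_body(m1, m2, r1, r2, v1, v2, k, c):
    """Two-body form of the same Lagrangian (see Problem 3)."""
    r, n = separation(r1, r2)
    v1s, v2s = dot(v1, v1), dot(v2, v2)
    return (0.5*m1*v1s + 0.5*m2*v2s + k*m1*m2/r
            + (m1*v1s**2 + m2*v2s**2)/(8.0*c**2)
            + k*m1*m2/(2.0*c**2*r)*(3.0*(v1s + v2s) - 7.0*dot(v1, v2)
                                      - dot(v1, n)*dot(v2, n))
            - k**2*m1*m2*(m1 + m2)/(2.0*c**2*r**2))
```

For two bodies the general routine and the closed two-body expression should return the same number, which is a convenient check that the factors of 1/2 in the double sums have been counted correctly.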
PROBLEMS
1. Find the action function for the gravitational field in the Newtonian approximation.
Solution: Using the $g_{ik}$ from (106.3), we find from the general formula (93.3), $G = -(2/c^4)(\nabla\varphi)^2$,
so that the action for the field is
$$S_g = -\frac{1}{8\pi k}\int\!\!\int(\nabla\varphi)^2\,dV\,dt.$$
The total action, for the field plus the masses distributed in space with density $\mu$, is:
$$S = \int\!\!\int\left[\frac{\mu v^2}{2} - \mu\varphi - \frac{1}{8\pi k}(\nabla\varphi)^2\right]dV\,dt. \tag{1}$$
One easily verifies that variation of $S$ with respect to $\varphi$ gives the Poisson equation (96.2), as it
should.
The energy density is found from the Lagrangian density $\Lambda$ [the integrand in (1)] by using the
general formula (32.5), which reduces in the present case (because of the absence of time derivatives
of $\varphi$ in $\Lambda$) to changing the signs of the second and third terms. Integrating the energy density over
all space, where we substitute $\mu\varphi = (1/4\pi k)\varphi\Delta\varphi$ in the second term and integrate by parts, we
finally obtain the total energy of field plus matter in the form
$$\int\left[\frac{\mu v^2}{2} - \frac{1}{8\pi k}(\nabla\varphi)^2\right]dV.$$
Consequently the energy density of the gravitational field in the Newtonian theory is
$W = -(1/8\pi k)(\nabla\varphi)^2$.‡
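2. Find the coordinates of the center of inertia of a system of gravitating bodies in the second
approximation.
Solution: In view of the complete formal analogy between Newton's law for gravitational
interaction and Coulomb's law for electrostatic interaction, the coordinates of the center of inertia
are given by the formula
$$\mathbf R = \frac{1}{\mathscr E}\sum_a\mathbf r_a\left(m_a c^2 + \frac{p_a^2}{2m_a} - \frac{km_a}{2}\sum_b{}'\frac{m_b}{r_{ab}}\right),\qquad
\mathscr E = \sum_a\left(m_a c^2 + \frac{p_a^2}{2m_a} - \frac{km_a}{2}\sum_b{}'\frac{m_b}{r_{ab}}\right),$$
which is analogous to the formula found in Problem 1 of § 65.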
3. Find the secular shift of the perihelion of the orbit of two gravitating bodies of comparable
mass (H. Robertson, 1938).
† The equations of motion corresponding to this Lagrangian were first obtained by A. Einstein, L. Infeld
and B. Hoffmann (1938) and by A. Eddington and G. Clark (1938).
‡ To avoid any misunderstanding, we state that this expression is not the same as the component $(-g)t_{00}$
of the energy-momentum pseudotensor (as calculated with the $g_{ik}$ from (106.3)); there is also a contribution
to $W$ from $(-g)T_{ik}$.
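Solution: The Lagrangian of the system of two bodies is
$$L = \frac{m_1 v_1^2}{2} + \frac{m_2 v_2^2}{2} + \frac{km_1 m_2}{r} + \frac{1}{8c^2}\left(m_1 v_1^4 + m_2 v_2^4\right)
+ \frac{km_1 m_2}{2c^2 r}\left[3(v_1^2 + v_2^2) - 7\,\mathbf v_1\cdot\mathbf v_2 - (\mathbf v_1\cdot\mathbf n)(\mathbf v_2\cdot\mathbf n)\right]
- \frac{k^2 m_1 m_2(m_1 + m_2)}{2c^2 r^2}.$$
Going over to the Hamiltonian function and eliminating from it the motion of the center of inertia
(see problem 2 in § 65), we get:
$$\mathscr H = \frac{p^2}{2}\left(\frac{1}{m_1} + \frac{1}{m_2}\right) - \frac{km_1 m_2}{r} - \frac{p^4}{8c^2}\left(\frac{1}{m_1^3} + \frac{1}{m_2^3}\right)
- \frac{k}{2c^2 r}\left[3p^2\left(\frac{m_2}{m_1} + \frac{m_1}{m_2}\right) + 7p^2 + (\mathbf p\cdot\mathbf n)^2\right]
+ \frac{k^2 m_1 m_2(m_1 + m_2)}{2c^2 r^2},$$
where $\mathbf p$ is the momentum of the relative motion.
We determine the radial component of momentum $p_r$ as a function of the variable $r$ and the
parameters $M$ (the angular momentum) and $\mathscr E$ (the energy). This function is determined from the
equation $\mathscr H = \mathscr E$ (in which, in the second-order terms, we must replace $p^2$ by its expression from
the zeroth approximation):
$$\mathscr E = \frac12\left(\frac{1}{m_1} + \frac{1}{m_2}\right)\left(p_r^2 + \frac{M^2}{r^2}\right) - \frac{km_1 m_2}{r}
- \frac{1}{8c^2}\left(\frac{1}{m_1^3} + \frac{1}{m_2^3}\right)\left(\frac{2m_1 m_2}{m_1 + m_2}\right)^2\left(\mathscr E + \frac{km_1 m_2}{r}\right)^2$$
$$\qquad - \frac{k}{2c^2 r}\left[3\left(\frac{m_2}{m_1} + \frac{m_1}{m_2}\right) + 7\right]\frac{2m_1 m_2}{m_1 + m_2}\left(\mathscr E + \frac{km_1 m_2}{r}\right)
- \frac{k}{2c^2 r}\,p_r^2 + \frac{k^2 m_1 m_2(m_1 + m_2)}{2c^2 r^2}.$$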
The further course of the computations is analogous to that used in § 98. Having determined $p_r$
from the algebraic equation given above, we make a transformation of the variable $r$ in the integral
$$S_r = \int p_r\,dr,$$
so that the term containing $M^2$ is brought to the form $M^2/r^2$. Then expanding the expression under
the square root in terms of the small relativistic corrections, we obtain:
$$S_r = \int\sqrt{A + \frac{B}{r} - \left(M^2 - \frac{6k^2 m_1^2 m_2^2}{c^2}\right)\frac{1}{r^2}}\;dr$$
[see (98.6)], where $A$ and $B$ are constant coefficients whose explicit computation is not necessary.
As a result we find for the shift in the perihelion of the orbit of the relative motion:
$$\delta\varphi = \frac{6\pi k^2 m_1^2 m_2^2}{c^2 M^2} = \frac{6\pi k(m_1 + m_2)}{c^2 a(1 - e^2)}.$$
Comparing with (98.7) we see that for given dimensions and shape of the orbit, the shift in the
perihelion will be the same as it would be for the motion of one body in the field of a fixed center of
mass $m_1 + m_2$.
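To get a feeling for the size of this shift, the second form of the result can be evaluated directly. In the sketch below the function name and the default SI values of $k$ and $c$ are assumptions made for illustration; the orbital data are supplied by the caller.

```python
import math

def perihelion_shift(m1, m2, a, e, k=6.674e-11, c=2.998e8):
    """Shift of the perihelion per revolution, in radians:
    6 pi k (m1 + m2) / (c^2 a (1 - e^2)).  SI units are assumed by default."""
    return 6.0*math.pi*k*(m1 + m2)/(c**2*a*(1.0 - e*e))
```

With approximate data for the Sun and Mercury (total mass about $1.99\times10^{30}$ kg, $a \approx 5.79\times10^{10}$ m, $e \approx 0.206$; these figures are not taken from the text) the function gives a shift of order $5\times10^{-7}$ radian per revolution. Only the sum of the masses enters, in agreement with the closing remark of the solution.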
4. Determine the frequency of precession of a spherical top, performing an orbital motion in the
gravitational field of a central body that is rotating about its axis.
Solution: In the first approximation the effect is the sum of two independent parts, one of which
is related to the non-Newtonian character of the centrally symmetric field, and the other to the
rotation of the central body.†
The first part is described by an additional term in the Lagrangian of the top, corresponding to
the second term in (106.17). We write the velocities of individual elements of the top (with mass
$dm$) in the form $\mathbf v = \mathbf V + \boldsymbol\omega\times\mathbf r$, where $\mathbf V$ is the velocity of the orbital motion, $\boldsymbol\omega$ is the angular velocity,
and $\mathbf r$ is the radius vector of the element $dm$ relative to the center of the top (so that the integral over
the volume of the top $\int\mathbf r\,dm = 0$). Dropping terms independent of $\boldsymbol\omega$ and also neglecting terms
quadratic in $\boldsymbol\omega$, we have:
$$\delta^{(1)}L = \frac{3km'}{2c^2}\int\frac{2\,\mathbf V\cdot\boldsymbol\omega\times\mathbf r}{R}\,dm,$$
† The first effect was treated by H. Weyl (1923) and the second by L. Schiff (1960).
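where $m'$ is the mass of the central body, $R = |\mathbf R_0 + \mathbf r|$ is the distance from the center of the field
to the element $dm$, $\mathbf R_0$ is the radius vector of the center of inertia of the top. In the expansion
$1/R \approx 1/R_0 - (\mathbf n\cdot\mathbf r/R_0^2)$ (where $\mathbf n = \mathbf R_0/R_0$) the integral of the first term vanishes, while integration
of the second term is done using the formula
$$\int x_\alpha x_\beta\,dm = \tfrac12 I\,\delta_{\alpha\beta},$$
where $I$ is the moment of inertia of the top. As a result we get:
$$\delta^{(1)}L = \frac{3km'}{2c^2 R_0^2}\,\mathbf M\cdot\mathbf V\times\mathbf n,$$
where $\mathbf M = I\boldsymbol\omega$ is the angular momentum of the top.
The additional term in the Lagrangian, due to the rotation of the central body, can also be found
from (106.17), but it is even simpler to calculate it using formula (1) of the problem in § 104:
$$\delta^{(2)}L = \frac{2k}{c^2}\int\frac{\mathbf M'\cdot(\boldsymbol\omega\times\mathbf r)\times\mathbf R}{R^3}\,dm,$$
where $\mathbf M'$ is the angular momentum of the central body. Expanding,
$$\frac{\mathbf R}{R^3} \approx \frac{\mathbf n}{R_0^2} + \frac{1}{R_0^3}\left(\mathbf r - 3\mathbf n(\mathbf n\cdot\mathbf r)\right)$$
and performing the integration, we get:
$$\delta^{(2)}L = \frac{k}{c^2 R_0^3}\left\{\mathbf M\cdot\mathbf M' - 3(\mathbf n\cdot\mathbf M)(\mathbf n\cdot\mathbf M')\right\}.$$
Thus the total correction to the Lagrangian is
$$\delta L = -\mathbf M\cdot\mathbf\Omega,\qquad \mathbf\Omega = \frac{3km'}{2c^2 R_0^2}\,\mathbf n\times\mathbf V + \frac{k}{c^2 R_0^3}\left\{3\mathbf n(\mathbf n\cdot\mathbf M') - \mathbf M'\right\}.$$
To this function there corresponds the equation of motion
$$\frac{d\mathbf M}{dt} = \mathbf\Omega\times\mathbf M$$
[see equation (2) of the problem in § 104]. This means that the angular momentum $\mathbf M$ of the top
precesses with angular velocity $\mathbf\Omega$, remaining constant in magnitude.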
CHAPTER 12
COSMOLOGICAL PROBLEMS
§ 107. Isotropic space
The general theory of relativity opens new avenues of approach to the solution of
problems related to the properties of the universe on a cosmic scale. The new possibilities which
arise from the non-galilean nature of space-time are remarkable.
These possibilities are the more important because Newtonian mechanics here leads to
contradictions which cannot be avoided in a sufficiently general way within the framework
of nonrelativistic theory. Thus, applying the Newtonian formula for the gravitational
potential (96.3) to a flat (as it is in Newtonian mechanics) infinite space filled with matter
having an arbitrarily distributed average density that vanishes nowhere, we find that the
potential becomes infinite at every point. This would lead to infinite forces acting on the
matter, which is absurd.
We know that the stars are distributed in space in an extremely nonuniform fashion:
they are concentrated in individual star systems (galaxies). But in studying the universe on a
"large scale" we should disregard these "local" inhomogeneities which result from the
agglomeration of matter into stars and star systems. Thus by the mass density we must
understand the density averaged over regions of space whose dimensions are large compared
to the separations between galaxies.
The solutions of the gravitational equations which are considered here, the so-called
isotropic cosmological model (first found by A. Friedmann, 1922), are based on the
assumption that the matter is distributed uniformly over all space.† Existing astronomical data
do not contradict such an assumption. But by its very nature it inevitably can have only an
approximate character, since the uniformity is surely violated when we go to a smaller
scale. However, there is every reason to believe that the isotropic model gives, in its general
features, an adequate description of the present state of the universe. We shall see that a
basic feature of this model is its nonstationarity. There is no doubt that this property
gives a correct explanation of a phenomenon so fundamental for the whole of cosmology
as the "red shift" (see § 110).
Space uniformly filled with matter is completely homogeneous and isotropic. This means
that we can choose a "world" time so that at every moment the metric of the space is the
same at all points and in all directions.
First we take up the study of the metric of the isotropic space as such, disregarding for
the moment any possible time dependence. As we did previously, we denote the
three-dimensional metric tensor by $\gamma_{\alpha\beta}$, i.e. we write the element of spatial distance in the form
$$dl^2 = \gamma_{\alpha\beta}\,dx^\alpha dx^\beta. \tag{107.1}$$
† We shall not consider at all equations containing the so-called cosmological constant, since there is
no sufficient physical basis at present for such a change in the form of the gravitational equations.
The curvature of the space is completely determined by its three-dimensional curvature
tensor, which we shall denote by $P^\alpha_{\beta\gamma\delta}$ in distinction to the four-dimensional tensor $R^i_{klm}$ (the
properties of the tensor $P^\alpha_{\beta\gamma\delta}$ are of course completely analogous to those of the tensor
$R^i_{klm}$). In the case of complete isotropy, the tensor $P^\alpha_{\beta\gamma\delta}$ must clearly be expressible in terms
of the metric tensor $\gamma_{\alpha\beta}$ alone. It is easy to see from the symmetry properties of $P^\alpha_{\beta\gamma\delta}$ that it
must have the form:
$$P^\alpha_{\beta\gamma\delta} = \lambda\left(\delta^\alpha_\gamma\gamma_{\beta\delta} - \delta^\alpha_\delta\gamma_{\beta\gamma}\right), \tag{107.2}$$
where $\lambda$ is some constant. The tensor of the second rank, $P_{\alpha\beta} = P^\gamma_{\alpha\gamma\beta}$, is accordingly equal to
$$P_{\alpha\beta} = 2\lambda\gamma_{\alpha\beta} \tag{107.3}$$
and the scalar curvature
$$P = 6\lambda. \tag{107.4}$$
Thus we see that the curvature properties of an isotropic space are determined by just
one constant $\lambda$. Corresponding to this there are altogether three different possible cases
for the spatial metric: (1) the so-called space of constant positive curvature (corresponding
to a positive value of $\lambda$), (2) space of constant negative curvature (corresponding to values
of $\lambda < 0$), and (3) space with zero curvature ($\lambda = 0$). Of these, the last will be a flat, i.e.
euclidean, space.
To investigate the metric it is convenient to start from a geometrical analogy, by considering
the geometry of isotropic three-dimensional space as the geometry on a hypersurface known
to be isotropic, in a fictitious four-dimensional space.† Such a space is a hypersphere; the
three-dimensional space corresponding to this has a positive constant curvature. The
equation of a hypersphere of radius $a$ in the four-dimensional space $x_1, x_2, x_3, x_4$ has the
form
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = a^2,$$
and the element of length on it can be expressed as
$$dl^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2.$$
Considering $x_1, x_2, x_3$ as the three space coordinates, and eliminating the fictitious
coordinate $x_4$ with the aid of the first equation, we get the element of spatial distance in the
form
$$dl^2 = dx_1^2 + dx_2^2 + dx_3^2 + \frac{(x_1\,dx_1 + x_2\,dx_2 + x_3\,dx_3)^2}{a^2 - x_1^2 - x_2^2 - x_3^2}. \tag{107.5}$$
From this expression, it is easy to calculate the constant $\lambda$ in (107.2). Since we know
beforehand that $P_{\alpha\beta}$ has the form (107.3) over all space, it is sufficient to calculate it only for
points located near the origin, where the $\gamma_{\alpha\beta}$ are equal to
$$\gamma_{\alpha\beta} = \delta_{\alpha\beta} + \frac{x_\alpha x_\beta}{a^2}.$$
Since the first derivatives of the $\gamma_{\alpha\beta}$, which determine the quantities $\Gamma^\alpha_{\beta\gamma}$, vanish at the origin,
the calculation from the general formula (92.10) turns out to be very simple and gives the
result
$$\lambda = \frac{1}{a^2}. \tag{107.6}$$
† This four-space is understood to have nothing to do with four-dimensional space-time.
We may call the quantity $a$ the "radius of curvature" of the space. We introduce in place
of the coordinates $x_1, x_2, x_3$ the corresponding "spherical" coordinates $r, \theta, \phi$. Then the
line element takes the form
$$dl^2 = \frac{dr^2}{1 - r^2/a^2} + r^2(\sin^2\theta\,d\phi^2 + d\theta^2). \tag{107.7}$$
The coordinate origin can of course be chosen at any point in space. The circumference of a
circle in these coordinates is equal to $2\pi r$, and the surface of a sphere to $4\pi r^2$. The "radius"
of a circle (or sphere) is equal to
$$\int_0^r \frac{dr}{\sqrt{1 - r^2/a^2}} = a\sin^{-1}(r/a),$$
that is, is larger than $r$. Thus the ratio of circumference to radius in this space is less than $2\pi$.
Another convenient form for the $dl^2$ in "four-dimensional spherical coordinates" is
obtained by introducing in place of the coordinate $r$ the "angle" $\chi$ according to $r = a\sin\chi$
($\chi$ goes between the limits 0 to $\pi$).† Then
$$dl^2 = a^2\left[d\chi^2 + \sin^2\chi\,(\sin^2\theta\,d\phi^2 + d\theta^2)\right]. \tag{107.8}$$
The coordinate $\chi$ determines the distance from the origin, given by $a\chi$. The surface of a
sphere in these coordinates equals $4\pi a^2\sin^2\chi$. We see that as we move away from the
origin, the surface of a sphere increases, reaching its maximum value $4\pi a^2$ at a distance
of $\pi a/2$. After that it begins to decrease, reducing to a point at the "opposite pole" of the
space, at distance $\pi a$, the largest distance which can in general exist in such a space [all this
is also clear from (107.7) if we note that the coordinate $r$ cannot take on values greater
than $a$].
The volume of a space with positive curvature is equal to
$$V = \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\pi} a^3\sin^2\chi\,\sin\theta\,d\chi\,d\theta\,d\phi,$$
so that
$$V = 2\pi^2 a^3. \tag{107.9}$$
Thus a space of positive curvature turns out to be "closed on itself". Its volume is finite
though of course it has no boundaries.
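The two geometric statements above (circumference-to-radius ratio below $2\pi$, and the finite volume $2\pi^2a^3$) can be checked numerically. The following is only an illustrative sketch: the function names and the sample values of $a$ and $r$ are assumptions introduced here, not part of the text.

```python
import math


def proper_radius(r, a):
    """Proper radius of a circle of coordinate radius r in the metric (107.7)."""
    return a * math.asin(r / a)


def closed_space_volume(a, n=2000):
    """Midpoint-rule estimate of the volume integral preceding (107.9).

    The angular integrals over theta and phi give 4*pi; only the chi
    integral of sin^2(chi) from 0 to pi is done numerically.
    """
    h = math.pi / n
    chi_integral = h * sum(math.sin((i + 0.5) * h) ** 2 for i in range(n))
    return 4.0 * math.pi * a**3 * chi_integral


a = 1.0          # radius of curvature (arbitrary illustrative value)
r = 0.5 * a      # coordinate radius of a sample circle
ratio = 2.0 * math.pi * r / proper_radius(r, a)   # circumference / radius
volume = closed_space_volume(a)
print(ratio, 2.0 * math.pi)              # the ratio should be the smaller number
print(volume, 2.0 * math.pi**2 * a**3)   # the two values should agree closely
```

Because the integrand $\sin^2\chi$ is integrated over a full period, even a simple midpoint rule reproduces $2\pi^2a^3$ to rounding accuracy.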
It is interesting to note that in a closed space the total electric charge must be zero.
Namely, every closed surface in a finite space encloses on each side of itself a finite region
of space. Therefore the flux of the electric field through this surface is equal, on the one hand,
to the total charge located in the interior of the surface, and on the other hand to the total
charge outside of it, with opposite sign. Consequently, the sum of the charges on the two
sides of the surface is zero.
Similarly, from the expression (101.14) for the four-momentum in the form of a surface
integral there follows the vanishing of the total four-momentum $P^i$ over all space. Thus the
definition of the total four-momentum loses its meaning, since the corresponding
conservation law degenerates into the empty identity $0 = 0$.
† The "cartesian" coordinates $x_1, x_2, x_3, x_4$ are related to the four-dimensional spherical coordinates
$a, \theta, \phi, \chi$ by the relations:
$$x_1 = a\sin\chi\sin\theta\cos\phi, \quad x_2 = a\sin\chi\sin\theta\sin\phi, \quad x_3 = a\sin\chi\cos\theta, \quad x_4 = a\cos\chi.$$
We now go on to consider the geometry of a space having a constant negative curvature.
From (107.6) we see that the constant $\lambda$ is negative if $a$ is imaginary. Therefore all the
formulas for a space with negative curvature can be immediately obtained from the
preceding ones by replacing $a$ by $ia$. In other words, the geometry of a space with negative
curvature is obtained mathematically as the geometry on a four-dimensional pseudosphere
with imaginary radius.
Thus the constant $\lambda$ is now
$$\lambda = -\frac{1}{a^2}, \tag{107.10}$$
and the element of length in a space of negative curvature has, in coordinates $r, \theta, \phi$, the
form
$$dl^2 = \frac{dr^2}{1 + r^2/a^2} + r^2(\sin^2\theta\,d\phi^2 + d\theta^2), \tag{107.11}$$
where the coordinate $r$ can go through all values from 0 to $\infty$. The ratio of the circumference
of a circle to its radius is now greater than $2\pi$. The expression for $dl^2$ corresponding to
(107.8) is obtained if we introduce the coordinate $\chi$ according to $r = a\sinh\chi$ ($\chi$ here goes
from 0 to $\infty$). Then
$$dl^2 = a^2\{d\chi^2 + \sinh^2\chi\,(\sin^2\theta\,d\phi^2 + d\theta^2)\}. \tag{107.12}$$
The surface of a sphere is now equal to $4\pi a^2\sinh^2\chi$ and, as we move away from the
origin (increasing $\chi$), it increases without limit. The volume of a space of negative curvature
is, clearly, infinite.
PROBLEM
Transform the element of length (107.7) to a form in which it is proportional to its euclidean
expression.
Solution: The substitution
$$r = \frac{r_1}{1 + r_1^2/4a^2}$$
leads to the result:
$$dl^2 = \left(1 + \frac{r_1^2}{4a^2}\right)^{-2}\left(dr_1^2 + r_1^2\,d\theta^2 + r_1^2\sin^2\theta\,d\phi^2\right).$$
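The result of this problem can be checked numerically: inserting $r = r_1/(1 + r_1^2/4a^2)$ into (107.7) should give the same factor $(1 + r_1^2/4a^2)^{-2}$ in front of both the radial and the angular parts. The sketch below does this with a finite-difference derivative; the function names, the step size and the sample point are assumptions made for illustration.

```python
import math


def r_from_r1(r1, a):
    """The substitution r = r1 / (1 + r1^2 / 4a^2)."""
    return r1 / (1.0 + r1**2 / (4.0 * a**2))


def conformal_factor(r1, a):
    """Factor multiplying the euclidean line element in the result."""
    return (1.0 + r1**2 / (4.0 * a**2)) ** -2


def radial_coefficient(r1, a, step=1e-6):
    """Coefficient of dr1^2 obtained by inserting r(r1) into (107.7).

    dr/dr1 is estimated with a central finite difference.
    """
    dr_dr1 = (r_from_r1(r1 + step, a) - r_from_r1(r1 - step, a)) / (2.0 * step)
    r = r_from_r1(r1, a)
    return dr_dr1**2 / (1.0 - r**2 / a**2)


def angular_coefficient(r1, a):
    """Coefficient r^2 of the angular part of (107.7), divided by r1^2."""
    return (r_from_r1(r1, a) / r1) ** 2


a, r1 = 1.0, 0.7   # arbitrary sample values with r1 < 2a
print(radial_coefficient(r1, a), angular_coefficient(r1, a), conformal_factor(r1, a))
```

The point $r_1 = 2a$ corresponds to $r = a$, where (107.7) is singular, so the sample point is kept away from it.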
§ 108. Space-time metric in the closed isotropic model
Going on now to the study of the space-time metric of the isotropic model, we must
first of all make a choice of our reference system. The most convenient is a "co-moving"
reference system, moving, at each point in space, along with the matter located at that
point. In other words, the reference system is just the matter filling the space; the velocity
of the matter in this system is by definition zero everywhere. It is clear that this reference
system is reasonable for the isotropic model: for any other choice the direction of the
velocity of the matter would lead to an apparent nonequivalence of different directions in
space. The time coordinate must be chosen in the manner discussed in the preceding
section, i.e. so that at each moment of time the metric is the same over all of the space.
In view of the complete equivalence of all directions, the components $g_{0\alpha}$ of the metric
tensor are equal to zero in the reference system we have chosen. Namely, the three
components $g_{0\alpha}$ can be considered as the components of a three-dimensional vector which, if it
were different from zero, would lead to a nonequivalence of different directions. Thus $ds^2$
must have the form $ds^2 = g_{00}(dx^0)^2 - dl^2$. The component $g_{00}$ is here a function only of $x^0$.
Therefore we can always choose the time coordinate so that $g_{00}$ reduces to $c^2$. Denoting it
by $t$, we have
$$ds^2 = c^2dt^2 - dl^2. \tag{108.1}$$
This time $t$ is clearly the proper time at each point in space.
Let us begin with the consideration of a space with positive curvature; from now on we
shall, for brevity, refer to the corresponding solution of the equations of gravitation as the
"closed model". For $dl$ we use the expression (107.8) in which the "radius of curvature" $a$
is, in general, a function of the time. Thus we write $ds^2$ in the form
$$ds^2 = c^2dt^2 - a^2(t)\{d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}. \tag{108.2}$$
The function $a(t)$ is determined by the equations of the gravitational field. For the
solution of these equations it is convenient to use, in place of the time, the quantity $\eta$ defined by
the relation
$$c\,dt = a\,d\eta. \tag{108.3}$$
Then $ds^2$ can be written as
$$ds^2 = a^2(\eta)\{d\eta^2 - d\chi^2 - \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}. \tag{108.4}$$
To set up the field equations we must begin with the calculation of the components of
the tensor $R_{ik}$ (the coordinates $x^0, x^1, x^2, x^3$ are $\eta, \chi, \theta, \phi$). Using the values of the
components of the metric tensor,
$$g_{00} = a^2, \quad g_{11} = -a^2, \quad g_{22} = -a^2\sin^2\chi, \quad g_{33} = -a^2\sin^2\chi\,\sin^2\theta,$$
we calculate the quantities $\Gamma^i_{kl}$:
$$\Gamma^0_{00} = \frac{a'}{a}, \quad \Gamma^0_{\alpha\beta} = -\frac{a'}{a^3}\,g_{\alpha\beta}, \quad \Gamma^\alpha_{0\beta} = \frac{a'}{a}\,\delta^\alpha_\beta, \quad \Gamma^0_{\alpha 0} = \Gamma^\alpha_{00} = 0,$$
where the prime denotes differentiation with respect to $\eta$. (There is no need to compute the
components $\Gamma^\alpha_{\beta\gamma}$ explicitly.) Using these values, we find from the general formula (92.10):
$$R^0_0 = \frac{3}{a^4}\,(a'^2 - aa'').$$
From the same symmetry arguments as we used earlier for the $g_{0\alpha}$, it is clear from the start
that $R_{0\alpha} = 0$. For the calculation of the components $R^\beta_\alpha$ we note that if we separate in them
the terms containing $g_{\alpha\beta}$ (i.e. only the $\Gamma^\alpha_{\beta\gamma}$), these terms must constitute the components of a
three-dimensional tensor $-P^\beta_\alpha$, whose values are already known from (107.3) and (107.6):
$$R^\beta_\alpha = -P^\beta_\alpha + \ldots = -\frac{2}{a^2}\,\delta^\beta_\alpha + \ldots,$$
where the dots represent terms containing $g_{00}$ in addition to the $g_{\alpha\beta}$. From the computation
of these latter terms we find:
$$R^\beta_\alpha = -\frac{1}{a^4}\,(2a^2 + a'^2 + aa'')\,\delta^\beta_\alpha,$$
so that
$$R = R^0_0 + R^\alpha_\alpha = -\frac{6}{a^3}\,(a + a'').$$
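Since the matter is at rest in the frame of reference we are using, $u^\alpha = 0$, $u^0 = 1/a$, and
we have from (94.9) $T^0_0 = \varepsilon$, where $\varepsilon$ is the energy density of the matter. Substituting these
expressions in the equation
$$R^0_0 - \frac{1}{2}R = \frac{8\pi k}{c^4}\,T^0_0,$$
we obtain:
$$\frac{8\pi k}{c^4}\,\varepsilon = \frac{3}{a^4}\,(a^2 + a'^2). \tag{108.5}$$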
Here there enter two unknown functions $\varepsilon$ and $a$; therefore we must obtain still another
equation. For this it is convenient to choose (in place of the spatial components of the field
equations) the equation $T^i_{0;i} = 0$, which is one of the four equations (94.7) contained, as
we know, in the equations of gravitation. This equation can also be derived directly with
the help of thermodynamic relations, in the following fashion.
When in the field equations we use the expression (94.9) for the energy-momentum
tensor, we are neglecting all those processes which involve energy dissipation and lead to an
increase in entropy. This neglect is here completely justified, since the auxiliary terms which
should be added to $T^k_i$ in connection with such processes of energy dissipation are negligibly
small compared with the energy density $\varepsilon$, which contains the rest energy of the material
bodies.
Thus in deriving the field equations we may consider the total entropy as constant. We
now use the well-known thermodynamic relation $d\mathcal{E} = T\,dS - p\,dV$, where $\mathcal{E}$, $S$, $V$ are
the energy, entropy, and volume of the system, and $p$, $T$, its pressure and temperature. At
constant entropy, we have simply $d\mathcal{E} = -p\,dV$. Introducing the energy density $\varepsilon = \mathcal{E}/V$
we easily find
$$d\varepsilon = -(\varepsilon + p)\,\frac{dV}{V}.$$
The volume $V$ of the space is, according to (107.9), proportional to the cube of the radius of
curvature $a$. Therefore $dV/V = 3\,da/a = 3\,d(\ln a)$, and we can write
$$-\frac{d\varepsilon}{\varepsilon + p} = 3\,d(\ln a),$$
or, integrating,
$$3\ln a = -\int\frac{d\varepsilon}{p + \varepsilon} + \text{const} \tag{108.6}$$
(the lower limit in the integral is constant).
If the relation between $\varepsilon$ and $p$ (the "equation of state" of the matter) is known, then
equation (108.6) determines $\varepsilon$ as a function of $a$. Then from (108.5) we can determine $\eta$
in the form
$$\eta = \pm\int\frac{da}{a\sqrt{\dfrac{8\pi k}{3c^4}\,\varepsilon a^2 - 1}}. \tag{108.7}$$
Equations (108.6), (108.7) solve, in general form, the problem of determining the metric in
the closed isotropic model.
If the material is distributed in space in the form of discrete macroscopic bodies, then to
calculate the gravitational field produced by it, we may treat these bodies as material particles
having definite masses, and take no account at all of their internal structure. Considering
the velocities of the bodies as relatively small (compared with $c$), we can set $\varepsilon = \mu c^2$, where
$\mu$ is the sum of the masses of the bodies contained in unit volume. For the same reason
the pressure of the "gas" made up of these bodies is extremely small compared with $\varepsilon$,
and can be neglected (from what we have said, the pressure in the interior of the bodies has
nothing to do with the question under consideration). As for the radiation present in space,
its amount is relatively small, and its energy and pressure can also be neglected.
Thus, to describe the present state of the Universe in terms of this model, we should use
the equation of state for "dustlike" matter,
$$\varepsilon = \mu c^2, \quad p = 0.$$
The integration in (108.6) then gives $\mu a^3 = \text{const}$. This equation could have been written
immediately, since it merely expresses the constancy of the sum $M$ of the masses of the bodies
in all of space, which should be so for the case of dustlike matter.† Since the volume of space
in the closed model is $V = 2\pi^2a^3$, $\text{const} = M/2\pi^2$. Thus
$$\mu a^3 = \text{const} = \frac{M}{2\pi^2}. \tag{108.8}$$
Substituting (108.8) in equation (108.7) and performing the integration, we get:
$$a = a_0(1 - \cos\eta), \tag{108.9}$$
where the constant
$$a_0 = \frac{2kM}{3\pi c^2}.$$
Finally, for the relation between $t$ and $\eta$ we find from (108.3):
$$t = \frac{a_0}{c}\,(\eta - \sin\eta). \tag{108.10}$$
The equations (108.9-10) determine the function $a(t)$ in parametric form. The function
$a(t)$ grows from zero at $t = 0$ ($\eta = 0$) to a maximum value of $a = 2a_0$, which is reached
when $t = \pi a_0/c$ ($\eta = \pi$), and then decreases once more to zero when $t = 2\pi a_0/c$ ($\eta = 2\pi$).
For $\eta \ll 1$ we have approximately $a = a_0\eta^2/2$, $t = a_0\eta^3/6c$, so that
$$a \approx \left(\frac{9a_0c^2}{2}\right)^{1/3} t^{2/3}. \tag{108.11}$$
The matter density is
$$\mu = \frac{1}{6\pi k t^2} = \frac{8\times 10^5}{t^2} \tag{108.12}$$
(where the numerical value is given for density in gm cm$^{-3}$ and $t$ in sec). We call attention
to the fact that in this limit the function $\mu(t)$ has a universal character in the sense that it
does not depend on the parameter $a_0$.
† To avoid misunderstandings (that might arise if one considers the remark in § 107 that the total
four-momentum of a closed universe is zero), we emphasize that $M$ is the sum of the masses of the bodies taken
one by one, without taking account of their gravitational interaction.
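The parametric solution (108.9), (108.10) and its early-time limits (108.11), (108.12) lend themselves to a quick numerical check. The sketch below is illustrative only: the CGS values of $k$ and $c$, the sample value of $a_0$ and all function names are assumptions introduced here.

```python
import math

K_CGS = 6.67e-8     # gravitational constant k in cm^3 g^-1 s^-2 (assumed value)
C_CGS = 2.998e10    # speed of light in cm/s (assumed value)


def closed_dust(eta, a0, c=C_CGS):
    """Return (a, t) from the parametric solution (108.9), (108.10)."""
    return a0 * (1.0 - math.cos(eta)), a0 / c * (eta - math.sin(eta))


def dust_density(a, a0, k=K_CGS, c=C_CGS):
    """Mass density from (108.8), with M expressed through a0 = 2kM / (3 pi c^2)."""
    total_mass = 3.0 * math.pi * c**2 * a0 / (2.0 * k)
    return total_mass / (2.0 * math.pi**2 * a**3)


a0 = 1.0e28                                   # cm, hypothetical illustrative value
a_max, t_max = closed_dust(math.pi, a0)       # maximum of a(t), at eta = pi
a_small, t_small = closed_dust(1.0e-3, a0)    # early stage, eta << 1

# Early-time approximations (108.11) and (108.12)
a_approx = (4.5 * a0 * C_CGS**2) ** (1.0 / 3.0) * t_small ** (2.0 / 3.0)
mu_t2 = dust_density(a_small, a0) * t_small**2

print(a_max / a0, t_max * C_CGS / a0)
print(a_small / a_approx, mu_t2, 1.0 / (6.0 * math.pi * K_CGS))
```

The product $\mu t^2$ computed this way does not change when `a0` is varied, which is the "universal character" mentioned above.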
When $a \to 0$ the density $\mu$ goes to infinity. But as $\mu \to \infty$ the pressure also becomes large,
so that in investigating the metric in this region we must consider the opposite case of
maximum possible pressure (for a given energy density $\varepsilon$), i.e. we must describe the matter
by the equation of state
$$p = \frac{\varepsilon}{3}$$
(see § 35). From formula (108.6) we then get:
$$\varepsilon a^4 = \text{const} = \frac{3c^4a_1^2}{8\pi k} \tag{108.13}$$
(where $a_1$ is a new constant), after which equations (108.7) and (108.3) give the relations
$$a = a_1\sin\eta, \quad t = \frac{a_1}{c}\,(1 - \cos\eta).$$
Since it makes sense to consider this solution only for very large values of $\varepsilon$ (i.e. small $a$),
we assume $\eta \ll 1$. Then $a \approx a_1\eta$, $t \approx a_1\eta^2/2c$, so that
$$a = \sqrt{2a_1ct}. \tag{108.14}$$
Then
$$\frac{\varepsilon}{c^2} = \frac{3}{32\pi k t^2} = \frac{4.5\times 10^5}{t^2} \tag{108.15}$$
(which again contains no parameters).
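The same kind of check applies to the case $p = \varepsilon/3$: for $\eta \ll 1$ the exact parametric relations should reproduce (108.14) and (108.15). This is a sketch under stated assumptions; the CGS constants, the sample $a_1$ and the function names are introduced here for illustration.

```python
import math

K_CGS = 6.67e-8     # gravitational constant k in cm^3 g^-1 s^-2 (assumed value)
C_CGS = 2.998e10    # speed of light in cm/s (assumed value)


def closed_radiation(eta, a1, c=C_CGS):
    """Return (a, t) for p = eps/3: a = a1 sin(eta), t = (a1/c)(1 - cos(eta))."""
    return a1 * math.sin(eta), a1 / c * (1.0 - math.cos(eta))


def energy_density(a, a1, k=K_CGS, c=C_CGS):
    """Energy density from (108.13): eps = 3 c^4 a1^2 / (8 pi k a^4)."""
    return 3.0 * c**4 * a1**2 / (8.0 * math.pi * k * a**4)


a1 = 1.0e28                                        # cm, hypothetical illustrative value
a_small, t_small = closed_radiation(1.0e-3, a1)    # eta << 1

a_approx = math.sqrt(2.0 * a1 * C_CGS * t_small)                       # (108.14)
eps_t2_over_c2 = energy_density(a_small, a1) * t_small**2 / C_CGS**2   # (108.15)

print(a_small / a_approx)
print(eps_t2_over_c2, 3.0 / (32.0 * math.pi * K_CGS))
```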
Thus, here too $a \to 0$ for $t \to 0$, so that the value $t = 0$ is actually a singular point of the
space-time metric of the isotropic model (and the same remark applies in the closed model
also to the second point at which $a = 0$). We also see from (108.14) that if the sign of $t$ is
changed, the quantity $a(t)$ would become imaginary, and its square negative. All four
components $g_{ik}$ in (108.2) would then be negative, and the determinant $g$ would be positive.
But such a metric is physically meaningless. This means that it makes no sense physically
to continue the metric analytically beyond the singularity.
§ 109. Space-time metric for the open isotropic model
The solution corresponding to an isotropic space of negative curvature ("open model")
is obtained by a method completely analogous to the preceding. In place of (108.2), we now
have
$$ds^2 = c^2dt^2 - a^2(t)\{d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}. \tag{109.1}$$
Again we introduce in place of $t$ the variable $\eta$, according to $c\,dt = a\,d\eta$; then we get
$$ds^2 = a^2(\eta)\{d\eta^2 - d\chi^2 - \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}. \tag{109.2}$$
This expression can be obtained formally from (108.4) by changing $\eta$, $\chi$, $a$ respectively to
$i\eta$, $i\chi$, $ia$. Therefore the equations of the field can also be gotten directly by this same
substitution in equations (108.5) and (108.6). Equation (108.6) retains its previous form:
$$3\ln a = -\int\frac{d\varepsilon}{p + \varepsilon} + \text{const}, \tag{109.3}$$
while in place of (108.5), we have
$$\frac{8\pi k}{c^4}\,\varepsilon = \frac{3}{a^4}\,(a'^2 - a^2). \tag{109.4}$$
Corresponding to this we find, instead of (108.7)
$$\eta = \pm\int\frac{da}{a\sqrt{\dfrac{8\pi k}{3c^4}\,\varepsilon a^2 + 1}}. \tag{109.5}$$
For material in the form of dust, we find:†
$$a = a_0(\cosh\eta - 1), \quad t = \frac{a_0}{c}\,(\sinh\eta - \eta), \tag{109.6}$$
$$\mu a^3 = \frac{3c^2}{4\pi k}\,a_0. \tag{109.7}$$
The first two determine the function $a(t)$ in parametric form.
In contrast to the closed model, here the radius of curvature changes monotonically,
increasing from zero at $t = 0$ ($\eta = 0$) to infinity for $t \to \infty$ ($\eta \to \infty$). Correspondingly, the
matter density decreases monotonically from an infinite value when $t = 0$ (when $\eta \ll 1$,
the monotonic decrease is given by the same approximate formula (108.12) as in the closed
model).
For large densities the solution (109.6-7) is not applicable, and we must again go to the
case $p = \varepsilon/3$. We again get the relation
$$\varepsilon a^4 = \text{const} = \frac{3c^4a_1^2}{8\pi k} \tag{109.8}$$
and find for the function $a(t)$:
$$a = a_1\sinh\eta, \quad t = \frac{a_1}{c}\,(\cosh\eta - 1)$$
or, when $\eta \ll 1$,
$$a = \sqrt{2a_1ct} \tag{109.9}$$
[with the earlier formula (108.15) for $\varepsilon(t)$]. Thus in the open model, also, the metric has a
singularity (but only one, in contrast to the closed model).
† We note that, by the transformation
$$r = Ae^\eta\sinh\chi, \quad c\tau = Ae^\eta\cosh\chi,$$
$$Ae^\eta = \sqrt{c^2\tau^2 - r^2}, \quad \tanh\chi = \frac{r}{c\tau},$$
the expression (109.2) is reduced to the "conformal-galilean" form
$$ds^2 = f(r, \tau)\,[c^2d\tau^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2)].$$
Specifically, in the case of (109.6), setting $A$ equal to $a_0/2$,
$$ds^2 = \left(1 - \frac{a_0}{2\sqrt{c^2\tau^2 - r^2}}\right)^4[c^2d\tau^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2)]$$
(V. A. Fock, 1955). For large values of $\sqrt{c^2\tau^2 - r^2}$ (which correspond to $\eta \gg 1$), this metric tends toward a
galilean form, as was to be expected since the radius of curvature tends toward infinity.
In the coordinates $r, \theta, \phi, \tau$, the matter is not at rest and its distribution is not uniform; the distribution
and motion of the matter turns out to be centrally symmetric about any point of space chosen as the origin
of coordinates $r, \theta, \phi$.
Finally, in the limiting case of the solutions under consideration, corresponding to an
infinite radius of curvature of the space, we have a model with a flat (euclidean) space. The
interval $ds^2$ in the corresponding space-time can be written in the form
$$ds^2 = c^2dt^2 - b^2(t)\,(dx^2 + dy^2 + dz^2) \tag{109.10}$$
(for the space coordinates we have chosen the "cartesian" coordinates $x, y, z$). The
time-dependent factor in the element of spatial distance does not change the euclidean nature of
the space metric, since for a given $t$ this factor is a constant, and can be made unity by a simple
coordinate transformation. A calculation similar to those in the previous section leads to
the following equations:
$$\frac{8\pi k}{c^2}\,\varepsilon = \frac{3}{b^2}\left(\frac{db}{dt}\right)^2, \quad 3\ln b = -\int\frac{d\varepsilon}{p + \varepsilon} + \text{const}.$$
For the case of low pressures, we find
$$\mu b^3 = \text{const}, \quad b = \text{const}\cdot t^{2/3}. \tag{109.11}$$
For small $t$ we must again consider the case $p = \varepsilon/3$, for which we find
$$\varepsilon b^4 = \text{const}, \quad b = \text{const}\cdot\sqrt{t}. \tag{109.12}$$
Thus in this case also the metric has a singular point ($t = 0$).
We note that all the isotropic solutions found exist only when the matter density is
different from zero; for empty space the Einstein equations have no such solutions.† We
also mention that mathematically they are a special case of a more general class of solutions
that contain three physically different arbitrary functions of the space coordinates (see the
problem).
† For $\varepsilon = 0$ we would get from (109.5) $a = a_0e^\eta = ct$ [whereas the equations (108.7) are meaningless
because the roots are imaginary]. But the metric
$$ds^2 = c^2dt^2 - c^2t^2\{d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}$$
can be transformed by the substitution $r = ct\sinh\chi$, $\tau = t\cosh\chi$, to the form
$$ds^2 = c^2d\tau^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
i.e. to a galilean space-time.
PROBLEM
Find the general form near the singular point for the metric in which the expansion of the space
proceeds "quasihomogeneously", i.e. so that all components $\gamma_{\alpha\beta} = -g_{\alpha\beta}$ (in the synchronous
reference system) tend to zero according to the same law. The space is filled with matter with the
equation of state $p = \varepsilon/3$ (E. M. Lifshitz and I. M. Khalatnikov, 1960).
Solution: We look for a solution near the singularity ($t = 0$) in the form:
$$\gamma_{\alpha\beta} = t\,a_{\alpha\beta} + t^2\,b_{\alpha\beta} + \ldots, \tag{1}$$
where $a_{\alpha\beta}$ and $b_{\alpha\beta}$ are functions of the (space) coordinates‡; below, we shall set $c = 1$. The reciprocal
tensor is
$$\gamma^{\alpha\beta} = \frac{1}{t}\,a^{\alpha\beta} - b^{\alpha\beta},$$
where the tensor $a^{\alpha\beta}$ is reciprocal to $a_{\alpha\beta}$, while $b^{\alpha\beta} = a^{\alpha\gamma}a^{\beta\delta}b_{\gamma\delta}$; all raising and lowering of indices
and covariant differentiation is done using the time-independent metric $a_{\alpha\beta}$.
‡ The Friedmann solution corresponds to a special choice of the functions $a_{\alpha\beta}$, corresponding to a space
of constant curvature.
Calculating the left sides of equations (99.11) and (99.12) to the necessary order in $1/t$, we get
$$-\frac{3}{4t^2} + \frac{b}{2t} = \frac{8\pi k}{3}\,\varepsilon\,(-4u_0^2 + 1),$$
$$\frac{1}{2}\,(b_{;\alpha} - b^\beta_{\alpha;\beta}) = -\frac{32\pi k}{3}\,\varepsilon\,u_\alpha u_0$$
(where $b = b^\alpha_\alpha$). Also using the identity
$$1 = u_iu^i \approx u_0^2 - \frac{1}{t}\,u_\alpha u_\beta a^{\alpha\beta},$$
we find:
$$8\pi k\varepsilon = \frac{3}{4t^2} - \frac{b}{2t}, \quad u_\alpha = -\frac{t^2}{2}\,(b_{;\alpha} - b^\beta_{\alpha;\beta}). \tag{2}$$
The three-dimensional Christoffel symbols, and with them the tensor $P_{\alpha\beta}$, are independent of the
time in the first approximation in $1/t$; the $P_{\alpha\beta}$ coincide with the expressions obtained when calculating
simply with the metric $a_{\alpha\beta}$. Using this, we find that in equation (99.13) the terms of order $t^{-2}$
cancel, while the terms $\sim 1/t$ give
$$P^\beta_\alpha + \frac{3}{4}\,b^\beta_\alpha + \frac{5}{12}\,\delta^\beta_\alpha\,b = 0,$$
from which
$$b^\beta_\alpha = -\frac{4}{3}\,P^\beta_\alpha + \frac{5}{18}\,\delta^\beta_\alpha\,P \tag{3}$$
(where $P = a^{\beta\gamma}P_{\beta\gamma}$). In view of the identity
$$P^\beta_{\alpha;\beta} - \frac{1}{2}\,P_{;\alpha} = 0$$
[see (92.13)] the relation
$$b^\beta_{\alpha;\beta} = \frac{7}{9}\,b_{;\alpha}$$
is valid, so that the $u_\alpha$ can be written in the form:
$$u_\alpha = -\frac{t^2}{9}\,b_{;\alpha}. \tag{4}$$
Thus, all six functions $a_{\alpha\beta}$ remain arbitrary, while the coefficients $b_{\alpha\beta}$ of the next term in the
expansion (1) are determined in terms of them. The choice of the time in the metric (1) is completely
determined by the condition $t = 0$ at the singularity; the space coordinates still permit arbitrary
transformations that do not involve the time (which can be used, for example, to bring $a_{\alpha\beta}$ to
diagonal form). Thus the solution contains altogether three "physically different" arbitrary
functions.
We note that in this solution the spatial metric is inhomogeneous and anisotropic, while the
density of the matter tends to become homogeneous as $t \to 0$. In the approximation (4) the
three-dimensional velocity $v$ has zero curl, while its magnitude tends to zero according to the law
$$v^2 = v_\alpha v_\beta\gamma^{\alpha\beta} \sim t^3.$$
§ 110. The red shift
The main feature characteristic of the solutions we have considered is the nonstationary
metric; the radius of curvature of the space is a function of the time. A change in the radius
of curvature leads to a change in all distances between bodies in the space, as is already seen
from the fact that the element of spatial distance $dl$ is proportional to $a$. Thus as $a$ increases
the bodies in such a space "run away" from one another (in the open model, increasing $a$
corresponds to $\eta > 0$, and in the closed model, to $0 < \eta < \pi$). From the point of view of an
observer located on one of the bodies, it will appear as if all the other bodies were moving
in radial directions away from the observer. The speed of this "running away" at a given
time $t$ is proportional to the separation of the bodies.
This prediction of the theory must be compared with a fundamental astronomical fact:
the red shift of lines in the spectra of galaxies. If we regard this as a Doppler shift, we arrive
at the conclusion that the galaxies are receding, i.e. at the present time the Universe is
expanding.†
Let us consider the propagation of a light ray in an isotropic space. For this purpose it is
simplest to use the fact that along the world line of the propagation of a light signal the
interval $ds = 0$. We choose the point from which the light emerges as the origin of
coordinates $\chi, \theta, \phi$. From symmetry considerations it is clear that the light ray will propagate
"radially", i.e. along a line $\theta = \text{const}$, $\phi = \text{const}$. In accordance with this, we set $d\theta = d\phi = 0$
in (108.4) or (109.2) and obtain $ds^2 = a^2(d\eta^2 - d\chi^2)$. Setting this equal to zero, we find
$d\eta = \pm d\chi$ or, integrating,
$$\chi = \pm\eta + \text{const}. \tag{110.1}$$
The plus sign applies to a ray going out from the coordinate origin, and the minus sign to a
ray approaching the origin. In this form, equation (110.1) applies to the open as well as to
the closed model. With the help of the formulas of the preceding section, we can from this
express the distance traversed by the beam as a function of the time.
In the open model, a ray of light, starting from some point, in the course of its
propagation recedes farther and farther from it. In the closed model, a ray of light, starting out from
the initial point, can finally arrive at the "conjugate pole" of the space (this corresponds to a
change in $\chi$ from 0 to $\pi$); during the subsequent propagation, the ray begins to approach the
initial point. A circuit of the ray "around the space", and return to the initial point, would
correspond to a change of $\chi$ from 0 to $2\pi$. From (110.1) we see that then $\eta$ would also have
to change by $2\pi$, which is, however, impossible (except for the one case when the light starts
at a moment corresponding to $\eta = 0$). Thus a ray of light cannot return to the starting point
after a circuit "around the space".
To a ray of light approaching the point of observation (the origin of coordinates), there
corresponds the negative sign on $\eta$ in equation (110.1). If the moment of arrival of the ray
at this point is $t(\eta_0)$, then for $\eta = \eta_0$ we must have $\chi = 0$, so that the equation of propagation
of such rays is
$$\chi = \eta_0 - \eta. \tag{110.2}$$
From this it is clear that for an observer located at the point $\chi = 0$, only those rays of light
can reach him at the time $t(\eta_0)$, which started from points located at "distances" not
exceeding $\chi = \eta_0$.
This result, which applies to the open as well as to the closed model, is very essential.
We see that at each given moment of time $t(\eta)$, at a given point in space, there is accessible
to physical observation not all of space, but only that part of it which corresponds to
$\chi \le \eta$. Mathematically speaking, the "visible region" of the space is the section of the
four-dimensional space by the light cone. This section turns out to be finite for the open as well
as the closed model (the quantity which is infinite for the open model is its section by the
hypersurface $t = \text{const}$, corresponding to the space where all points are observed at one and
the same time $t$). In this sense, the difference between the open and closed models turns out
to be much less drastic than one might have thought at first glance.
† The conclusion that the bodies are running away with increasing $a(t)$ can only be made if the energy of
interaction of the matter is small compared to the kinetic energy of its motion in the recession; this condition
is always satisfied for sufficiently distant galaxies. In the opposite case the mutual separations of the bodies
are determined mainly by their interactions; therefore, for example, the effect considered here should have
practically no influence on the dimensions of the nebulae themselves, and even less so on the dimensions of
stars.
The farther the region observed by the observer at a given moment of time recedes from
him, the earlier the moment of time to which it corresponds. Let us look at the spherical
surface which is the geometrical locus of the points from which light started out at the time
$t(\eta - \chi)$ and is observed at the origin at the time $t(\eta)$. The area of this surface is
$4\pi a^2(\eta - \chi)\sin^2\chi$ (in the closed model), or $4\pi a^2(\eta - \chi)\sinh^2\chi$ (in the open model). As it recedes from
the observer, the area of the "visible sphere" at first increases from zero (for $\chi = 0$) and then
reaches a maximum, after which it decreases once more, dropping back to zero for $\chi = \eta$
(where $a(\eta - \chi) = a(0) = 0$). This means that the section through the light cone is not only
finite but also closed. It is as if it closed at the point "conjugate" to the observer; it can be
seen by observing along any direction in space. At this point $\varepsilon \to \infty$, so that matter in all
stages of its evolution is, in principle, accessible to observation.
The total amount of observed matter is equal in the open model to
$$M_{\text{obs}} = 4\pi\int_0^\eta \mu a^3\sinh^2\chi\,d\chi.$$
Substituting $\mu a^3$ from (109.7), we get
$$M_{\text{obs}} = \frac{3c^2a_0}{2k}\,(\sinh\eta\cosh\eta - \eta). \tag{110.3}$$
This quantity increases without limit as $\eta \to \infty$. In the closed model, the increase of $M_{\text{obs}}$
is limited by the total mass $M$; in similar fashion, we find for this case:
$$M_{\text{obs}} = \frac{M}{\pi}\,(\eta - \sin\eta\cos\eta). \tag{110.4}$$
As $\eta$ increases from 0 to $\pi$, this quantity increases from 0 to $M$; the further increase of $M_{\text{obs}}$
according to this formula is fictitious, and corresponds simply to the fact that in a
"contracting" universe distant bodies would be observed twice (by means of the light "circling
the space" in the two directions).
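The closed-model formula (110.4) can be compared with a direct numerical evaluation of the integral $4\pi\int_0^\eta \mu a^3\sin^2\chi\,d\chi$ with $\mu a^3 = M/2\pi^2$ from (108.8). The sketch below works with the fraction $M_{\text{obs}}/M$; the function names and the number of integration steps are assumptions made for illustration.

```python
import math


def observed_fraction_closed(eta):
    """M_obs / M in the closed model, from (110.4)."""
    return (eta - math.sin(eta) * math.cos(eta)) / math.pi


def observed_fraction_numeric(eta, n=10000):
    """Midpoint-rule value of (2/pi) * integral of sin^2(chi) from 0 to eta.

    This is 4*pi * mu * a^3 * integral / M with mu * a^3 = M / (2 pi^2).
    """
    h = eta / n
    return 2.0 / math.pi * h * sum(math.sin((i + 0.5) * h) ** 2 for i in range(n))


for eta in (0.5, 1.0, math.pi / 2.0, math.pi):
    print(eta, observed_fraction_closed(eta), observed_fraction_numeric(eta))
```

At $\eta = \pi/2$ half of the total mass is observable, and at $\eta = \pi$ (the moment of maximum expansion) the fraction reaches unity.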
Let us now consider the change in the frequency of light during its propagation in an
isotropic space. For this we first point out the following fact. Let there occur at a certain
point in space two events, separated by a time interval $dt = (1/c)\,a(\eta)\,d\eta$. If at the moments
of these events light signals are sent out, which are observed at another point in space, then
between the moments of their observation there elapses a time interval corresponding to the
same change $d\eta$ in the quantity $\eta$ as for the starting point. This follows immediately from
equation (110.1), according to which the change in the quantity $\eta$ during the time of
propagation of a light ray from one point to another depends only on the difference in the
coordinates $\chi$ for these points. But since during the time of propagation the radius of curvature
$a$ changes, the time intervals $t$ between the moments of sending out of the two signals and between the
moments of their observation are different; the ratio of these intervals is equal to the ratio of
the corresponding values of $a$.
From this it follows, in particular, that the periods of light vibrations, measured in terms
of the world time $t$, also change along the ray, proportionally to $a$. Thus, during the
propagation of a light ray, along its path,
$$\omega a = \text{const}. \tag{110.5}$$
Let us suppose that at the time $t(\eta)$ we observe light emitted by a source located at a
distance corresponding to a definite value of the coordinate $\chi$. According to (110.1), the
moment of emission of this light is $t(\eta - \chi)$. If $\omega_0$ is the frequency of the light at the time of
emission, then from (110.5), the frequency $\omega$ observed by us is
$$\omega = \omega_0\,\frac{a(\eta - \chi)}{a(\eta)}. \tag{110.6}$$
Because of the monotonic increase of the function $a(\eta)$, we have $\omega < \omega_0$, that is, a decrease
in the light frequency occurs. This means that when we observe the spectrum of light coming
toward us, all of its lines must appear to be shifted toward the red compared with the
spectrum of the same matter observed under ordinary conditions. The "red shift" phenomenon
is essentially the Doppler effect of the bodies' "running away" from each other.
The magnitude of the red shift measured, for example, as the ratio $\omega/\omega_0$ of the displaced
to the undisplaced frequency, depends (for a given time of observation) on the distance at
which the observed source is located [in relation (110.6) there enters the coordinate $\chi$ of
the light source]. For not too large distances, we can expand $a(\eta - \chi)$ in a power series in $\chi$,
limiting ourselves to the first two terms:
$$\frac{\omega}{\omega_0} = 1 - \chi\,\frac{a'(\eta)}{a(\eta)}$$
(the prime denotes differentiation with respect to $\eta$). Further, we note that the product
$\chi a(\eta)$ is here just the distance $l$ from the observed source. Namely, the "radial" line element
is equal to $dl = a\,d\chi$; in integrating this relation the question arises of how the distance is to
be determined by physical observation. In determining this distance we must take the values
of $a$ at different points along the path of integration at different moments of time (integration
for $\eta = \text{const}$ would correspond to simultaneous observation of all the points along the
path, which is physically not feasible). But for "small" distances we can neglect the change
in $a$ along the path of integration and write simply $l = a\chi$, with the value of $a$ taken for the
moment of observation.
As a result, we find for the relative change in the frequency the following formula:
$$\frac{\omega-\omega_0}{\omega_0} = -\frac{h}{c}\,l, \tag{110.7}$$
where we have introduced the notation
$$h = c\,\frac{a'(\eta)}{a^2(\eta)} = \frac{1}{a}\frac{da}{dt} \tag{110.8}$$
for the so-called "Hubble constant". For a given instant of observation, this quantity is independent of $l$. Thus the relative shift in spectral lines must be proportional to the distance to the observed light source.
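As a rough numerical illustration of how good the two-term expansion is, the following sketch compares the exact ratio (110.6) with the linear approximation. It assumes the open-model radius $a(\eta) = a_0(\cosh\eta - 1)$ used elsewhere in this section; the value $\eta = 5$, the unit $a_0 = 1$ and the function names are illustrative choices, not part of the text.

```python
import math

def scale_factor_open(eta, a0=1.0):
    """Open-model radius a(eta) = a0 (cosh eta - 1) (assumed pressureless model)."""
    return a0 * (math.cosh(eta) - 1.0)

def frequency_ratio_exact(eta, chi):
    """omega/omega_0 = a(eta - chi)/a(eta), eq. (110.6)."""
    return scale_factor_open(eta - chi) / scale_factor_open(eta)

def frequency_ratio_linear(eta, chi, a0=1.0):
    """First two terms of the expansion in chi: 1 - chi a'(eta)/a(eta)."""
    a = scale_factor_open(eta, a0)
    a_prime = a0 * math.sinh(eta)  # derivative of a with respect to eta
    return 1.0 - chi * a_prime / a

eta_obs = 5.0  # illustrative value of eta at the moment of observation
for chi in (0.001, 0.01, 0.1):
    exact = frequency_ratio_exact(eta_obs, chi)
    linear = frequency_ratio_linear(eta_obs, chi)
    print(f"chi={chi}: exact={exact:.6f} linear={linear:.6f} diff={exact - linear:.2e}")
```

The difference between the two columns shrinks quadratically with $\chi$, which is the sense in which the linear law holds only for "not too large distances".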
Considering the red shift as a result of a Doppler effect, one can determine the corresponding velocity $v$ of recession of the body from the observer. Writing $(\omega-\omega_0)/\omega_0 = -v/c$, and comparing with (110.7), we have
$$v = hl \tag{110.9}$$
[this formula can also be obtained directly by calculating the derivative $v = d(a\chi)/dt$].
Astronomical data confirm the law (110.7), but the determination of the value of the Hubble constant is hampered by the uncertainty in the establishment of a scale of cosmic distances suitable for distant galaxies. The latest determinations give the value
$$h \approx 0.8\times10^{-10}\ \text{yr}^{-1} = 0.25\times10^{-17}\ \text{sec}^{-1}, \qquad \frac{1}{h} \approx 4\times10^{17}\ \text{sec} = 1.3\times10^{10}\ \text{yr}. \tag{110.10}$$
It corresponds to an increase in the "velocity of recession" by 25 km/sec for each million light years distance.
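The unit conversions behind (110.10) and the quoted recession velocity can be checked with a few lines of arithmetic. This is only a sketch: the rounded values for the length of a year and for the velocity of light are assumptions supplied here, and the variable names are illustrative.

```python
SECONDS_PER_YEAR = 3.156e7                   # approximate length of a year, sec (assumed)
LIGHT_YEAR_CM = 2.998e10 * SECONDS_PER_YEAR  # c times one year, in cm (assumed c)

h_per_year = 0.8e-10                         # Hubble constant quoted in (110.10), yr^-1
h_per_sec = h_per_year / SECONDS_PER_YEAR    # the same constant in sec^-1

hubble_time_sec = 1.0 / h_per_sec            # 1/h in seconds
hubble_time_yr = 1.0 / h_per_year            # 1/h in years

# v = h l, eq. (110.9), for l = one million light years, converted to km/sec
v_km_per_sec = h_per_sec * 1.0e6 * LIGHT_YEAR_CM / 1.0e5

print(f"h = {h_per_sec:.2e} sec^-1")
print(f"1/h = {hubble_time_sec:.1e} sec = {hubble_time_yr:.2e} yr")
print(f"v per million light years = {v_km_per_sec:.0f} km/sec")
```

With these rounded inputs the velocity comes out close to the 25 km/sec per million light years quoted in the text.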
Substituting in equation (109.4) $\varepsilon = \mu c^2$ and $h = ca'/a^2$, we get for the open model the following relation:
$$\frac{c^2}{a^2} = h^2 - \frac{8\pi k}{3}\mu. \tag{110.11}$$
Combining this equation with the equality
$$h = \frac{c\sinh\eta}{a_0(\cosh\eta-1)^2} = \frac{c}{a}\coth\frac{\eta}{2},$$
we obtain
$$\cosh\frac{\eta}{2} = h\sqrt{\frac{3}{8\pi k\mu}}. \tag{110.12}$$
For the closed model we would get:
$$\frac{c^2}{a^2} = \frac{8\pi k}{3}\mu - h^2, \tag{110.13}$$
$$\cos\frac{\eta}{2} = h\sqrt{\frac{3}{8\pi k\mu}}. \tag{110.14}$$
Comparing (110.11) and (110.13), we see that the curvature of the space is negative or positive according as the difference $(8\pi k/3)\mu - h^2$ is negative or positive. This difference goes to zero for $\mu = \mu_k$, where
$$\mu_k = \frac{3h^2}{8\pi k}. \tag{110.15}$$
With the value (110.10), we get $\mu_k \approx 1\times10^{-29}$ g/cm³. In the present state of astronomical knowledge, the value of the average density of matter in space can be estimated only with very low accuracy. For an estimate, based on the number of galaxies and their average mass, one now takes a value of about $3\times10^{-31}$ g/cm³. This value is 30 times less than $\mu_k$ and thus would speak in favor of the open model. But even if we forget about the doubtful reliability of this number, we should keep in mind that it does not take into account the possible existence of a metagalactic dark gas, which could greatly increase the average matter density.†
† The uncertainty in the value of $\mu$ does not allow any sort of exact calculation of $\eta$, especially since even the sign of $\mu - \mu_k$ is unknown. Setting $\mu = 3\times10^{-31}$ g/cm³ in (110.12), and taking $h$ from (110.10), we get $\eta = 5.0$. If we set $\mu = 10^{-31}$ g/cm³, then $\eta = 6.1$.
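The numbers quoted for $\mu_k$ and $\eta$ follow directly from (110.15) and (110.12). A minimal sketch of that arithmetic is given below; the CGS value of the gravitational constant $k$ is an assumption supplied here, and the function names are illustrative.

```python
import math

K_GRAV = 6.67e-8   # gravitational constant k, cm^3 g^-1 sec^-2 (assumed CGS value)
h = 0.25e-17       # Hubble constant from (110.10), sec^-1

def critical_density(h):
    """mu_k = 3 h^2 / (8 pi k), eq. (110.15), in g/cm^3."""
    return 3.0 * h**2 / (8.0 * math.pi * K_GRAV)

def eta_open(mu, h):
    """Solve cosh(eta/2) = h sqrt(3/(8 pi k mu)), eq. (110.12), for eta (open model)."""
    return 2.0 * math.acosh(h * math.sqrt(3.0 / (8.0 * math.pi * K_GRAV * mu)))

mu_k = critical_density(h)
eta = eta_open(3.0e-31, h)
print(f"mu_k = {mu_k:.2e} g/cm^3, eta = {eta:.2f}")
```

Note that `eta_open` is only defined for $\mu < \mu_k$; for larger densities the argument of `acosh` drops below 1 and the closed-model relation (110.14) would have to be used instead.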
Let us note here a certain inequality which one can obtain for a given value of the quantity $h$. For the open model we have
$$h = \frac{c\sinh\eta}{a_0(\cosh\eta-1)^2},$$
and therefore
$$t = \frac{a_0}{c}(\sinh\eta - \eta) = \frac{\sinh\eta\,(\sinh\eta-\eta)}{h(\cosh\eta-1)^2}.$$
Since $0 < \eta < \infty$, we must have
$$\frac{2}{3h} < t < \frac{1}{h}. \tag{110.16}$$
Similarly, for the closed model we obtain
$$t = \frac{\sin\eta\,(\eta-\sin\eta)}{h(1-\cos\eta)^2}.$$
To the increase of $a(\eta)$ there corresponds the interval $0 < \eta < \pi$; therefore we get
$$0 < t < \frac{2}{3h}. \tag{110.17}$$
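The bounds (110.16) and (110.17) can be checked numerically by evaluating the dimensionless product $ht$ at a few values of $\eta$. The sample values of $\eta$ below are arbitrary, and the function names are illustrative.

```python
import math

def age_times_h_open(eta):
    """h*t for the open model: sinh(eta)(sinh(eta) - eta)/(cosh(eta) - 1)^2."""
    return math.sinh(eta) * (math.sinh(eta) - eta) / (math.cosh(eta) - 1.0) ** 2

def age_times_h_closed(eta):
    """h*t for the closed model: sin(eta)(eta - sin(eta))/(1 - cos(eta))^2."""
    return math.sin(eta) * (eta - math.sin(eta)) / (1.0 - math.cos(eta)) ** 2

# Sample points: 0 < eta < infinity (open), 0 < eta < pi (closed, expanding phase)
open_values = [age_times_h_open(eta) for eta in (0.1, 1.0, 5.0, 20.0)]
closed_values = [age_times_h_closed(eta) for eta in (0.1, 1.0, 2.0, 3.0)]

print("open:  ", [round(v, 4) for v in open_values])
print("closed:", [round(v, 4) for v in closed_values])
```

Both expressions tend to 2/3 as $\eta \to 0$; the open-model value then rises toward 1, while the closed-model value falls toward 0 as $\eta \to \pi$.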
Next we determine the intensity $I$ of the light arriving at the observer from a source located at a distance corresponding to a definite value of the coordinate $\chi$. The flux density of light energy at the point of observation is inversely proportional to the surface of the sphere, drawn through the point under consideration with center at the location of the source; in a space of negative curvature the area of the surface of the sphere equals $4\pi a^2\sinh^2\chi$. Light emitted by the source during the interval $dt = (1/c)\,a(\eta-\chi)\,d\eta$ will reach the point of observation during a time interval
$$dt\,\frac{a(\eta)}{a(\eta-\chi)} = \frac{1}{c}\,a(\eta)\,d\eta.$$
Since the intensity is defined as the flux of light energy per unit time, there appears in $I$ a factor $a(\eta-\chi)/a(\eta)$. Finally, the energy of a wave packet is proportional to its frequency [see (53.9)]; since the frequency changes during propagation of the light according to the law (110.5), this results in the factor $a(\eta-\chi)/a(\eta)$ appearing in $I$ once more. As a result, we finally obtain the intensity in the form
$$I = \text{const}\cdot\frac{a^2(\eta-\chi)}{a^4(\eta)\sinh^2\chi}. \tag{110.18}$$
For the closed model we would similarly obtain
$$I = \text{const}\cdot\frac{a^2(\eta-\chi)}{a^4(\eta)\sin^2\chi}. \tag{110.19}$$
These formulas determine the dependence of the apparent brightness of an observed object on its distance (for a given absolute brightness). For small $\chi$ we can set $a(\eta-\chi) \approx a(\eta)$, and then $I \sim 1/a^2(\eta)\chi^2 = 1/l^2$, that is, we have the usual law of decrease of intensity inversely as the square of the distance.
Finally, let us consider the question of the so-called proper motions of bodies. In speaking of the density and motion of matter, we have always understood this to be the average density and average motion; in particular, in the system of reference which we have always used, the velocity of the average motion is zero. The actual velocities of the bodies will undergo a certain fluctuation around this average value. In the course of time, the velocities of proper motion of the bodies change. To determine the law of this change, let us consider a freely moving body and choose the origin of coordinates at any point along its trajectory.
Then the trajectory will be a radial line, $\theta = \text{const}$, $\phi = \text{const}$. The Hamilton-Jacobi equation (87.6), after substitution of the values of $g^{ik}$, takes the form
$$\left(\frac{\partial S}{\partial\chi}\right)^2 - \left(\frac{\partial S}{\partial\eta}\right)^2 + m^2c^2a^2(\eta) = 0. \tag{110.20}$$
Since $\chi$ does not enter into the coefficients in this equation (i.e., $\chi$ is a cyclic coordinate), the conservation law $\partial S/\partial\chi = \text{const}$ is valid. The momentum $p$ of a moving body is equal, by definition, to $p = \partial S/\partial l = \partial S/a\,\partial\chi$. Thus for a moving body the product $pa$ is constant:
$$pa = \text{const}. \tag{110.21}$$
Introducing the velocity $v$ of proper motion of the body according to
$$p = \frac{mv}{\sqrt{1-\dfrac{v^2}{c^2}}},$$
we obtain
$$\frac{va}{\sqrt{1-\dfrac{v^2}{c^2}}} = \text{const}. \tag{110.22}$$
The law of change of velocity with time is determined by these relations. With increasing $a$, the velocity $v$ decreases monotonically.
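Relation (110.22) can be inverted to give the velocity of proper motion after the radius has grown by a given factor. The helper below is a small sketch of that inversion; the function name and the choice of units with $c = 1$ are assumptions made for the illustration.

```python
import math

def proper_velocity_after_expansion(v_initial, expansion_factor, c=1.0):
    """Velocity of proper motion after a grows by expansion_factor.

    Uses (110.22): v a / sqrt(1 - v^2/c^2) = const along the motion.
    """
    # Conserved combination divided by the initial radius a.
    w = v_initial / math.sqrt(1.0 - (v_initial / c) ** 2)
    # After the expansion the same combination is smaller by the expansion factor.
    w_new = w / expansion_factor
    # Invert w = v / sqrt(1 - v^2/c^2) for v.
    return w_new / math.sqrt(1.0 + (w_new / c) ** 2)

# A relativistic body (v = 0.6 c) after the radius has doubled
print(proper_velocity_after_expansion(0.6, 2.0))
# A slow body: v falls off very nearly as 1/a
print(proper_velocity_after_expansion(1.0e-3, 10.0))
```

For $v \ll c$ the result reduces to $v \propto 1/a$, while for velocities close to $c$ the decrease is slower at first.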
PROBLEMS
1. Find the first two terms in the expansion of the apparent brightness of a galaxy as a function of its red shift; the absolute brightness of a galaxy varies with time according to an exponential law, $I_{\text{abs}} = \text{const}\cdot e^{\alpha t}$ (H. Robertson, 1955).
Solution: The dependence on distance $\chi$ of the apparent brightness of a galaxy at the "instant" $\eta$ is given (for the closed model) by the formula
$$I = \text{const}\cdot e^{\alpha[t(\eta-\chi)-t(\eta)]}\,\frac{a^2(\eta-\chi)}{a^4(\eta)\sin^2\chi}.$$
We define the red shift as the relative change in wave length:
$$z = \frac{\lambda-\lambda_0}{\lambda_0} = \frac{\omega_0-\omega}{\omega} = \frac{a(\eta)-a(\eta-\chi)}{a(\eta-\chi)}.$$
Expanding $I$ and $z$ in powers of $\chi$ [using the functions $a(\eta)$ and $t(\eta)$ from (108.9) and (108.10)] and then eliminating $\chi$ from the resulting equations, we find the result:
$$I = \text{const}\cdot\frac{1}{z^2}\left[1 - \left(1 - \frac{q}{2} + \frac{\alpha}{h}\right)z\right],$$
where we have introduced the notation
$$q = \frac{2}{1+\cos\eta} = \frac{\mu}{\mu_k} > 1.$$
For the open model, we get the same formula with
$$q = \frac{2}{1+\cosh\eta} = \frac{\mu}{\mu_k} < 1.$$
2. Find the leading terms in the expansion of the number of galaxies contained inside a "sphere" of given radius, as a function of the red shift at the boundary of the sphere (where the spatial distribution of galaxies is assumed to be uniform).
Solution: The number $N$ of galaxies at "distances" $\le \chi$ is (in the closed model)
$$N = \text{const}\cdot\int_0^\chi \sin^2\chi\,d\chi \approx \text{const}\cdot\chi^3.$$
Substituting the first two terms in the expansion of the function $\chi(z)$, we obtain:
$$N = \text{const}\cdot z^3\left[1 - \tfrac{3}{4}(2+q)z\right].$$
In this form the formula also holds for the open model.
§ 111. Gravitational stability of an isotropic universe
Let us consider the question of the behavior of small perturbations in the isotropic model, i.e. the question of its gravitational stability (E. M. Lifshitz, 1946). We shall restrict our treatment to perturbations over relatively small regions of space, that is, regions whose linear dimensions are small compared to the radius $a$.†
In every such region the spatial metric can be assumed to be euclidean in the first approximation, i.e. the metric (107.8) or (107.12) is replaced by the metric
$$dl^2 = a^2(\eta)(dx^2 + dy^2 + dz^2), \tag{111.1}$$
where $x$, $y$, $z$ are cartesian coordinates, measured in units of the radius $a$. We again use the parameter $\eta$ as time coordinate.
Without loss of generality we shall again describe the perturbed field in the synchronous reference system, i.e. we impose on the variations $\delta g_{ik}$ of the metric tensor the conditions $\delta g_{00} = \delta g_{0\alpha} = 0$. Varying the identity $g_{ik}u^iu^k = 1$ under these conditions (and remembering that the unperturbed values of the components of the four-velocity of the matter are $u^0 = 1/a$, $u^\alpha = 0$),‡ we get $g_{00}u^0\delta u^0 = 0$, so that $\delta u^0 = 0$. The perturbations $\delta u^\alpha$ are in general different from zero, so that the reference system is no longer comoving.
We denote the perturbations of the spatial metric tensor by $h_{\alpha\beta} \equiv \delta\gamma_{\alpha\beta} = -\delta g_{\alpha\beta}$. Then $\delta\gamma^{\alpha\beta} = -h^{\alpha\beta}$, where the raising of indices on $h_{\alpha\beta}$ is done by using the unperturbed metric $\gamma_{\alpha\beta}$.
In the linear approximation, the small perturbations of the gravitational field satisfy the equations
$$\delta R_i^k - \tfrac{1}{2}\delta_i^k\,\delta R = \frac{8\pi k}{c^4}\,\delta T_i^k. \tag{111.2}$$
In the synchronous reference system the variations of the components of the energy-momentum tensor (94.9) are:
$$\delta T_\alpha^\beta = -\delta_\alpha^\beta\,\delta p, \qquad \delta T_0^\alpha = a(p+\varepsilon)\,\delta u^\alpha, \qquad \delta T_0^0 = \delta\varepsilon. \tag{111.3}$$
† A more detailed presentation of this question, including the investigation of perturbations over regions whose size is comparable to $a$, is given in Adv. in Physics 12, 208 (1963).
‡ In this section we denote unperturbed values of quantities by letters without the auxiliary superscript $^{(0)}$.
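Because of the smallness of $\delta\varepsilon$ and $\delta p$, we can write $\delta p = (dp/d\varepsilon)\,\delta\varepsilon$, and we obtain the relations:
$$\delta T_\alpha^\beta = -\delta_\alpha^\beta\,\frac{dp}{d\varepsilon}\,\delta T_0^0. \tag{111.4}$$
Formulas for $\delta R_i^k$ can be gotten by varying the expression (99.10). Since the unperturbed metric tensor $\gamma_{\alpha\beta} = a^2\delta_{\alpha\beta}$, the unperturbed values are
$$\kappa_{\alpha\beta} = \frac{2\dot a}{a}\gamma_{\alpha\beta} = \frac{2a'}{a^2}\gamma_{\alpha\beta}, \qquad \kappa_\alpha^\beta = \frac{2a'}{a^2}\delta_\alpha^\beta,$$
where the dot denotes differentiation with respect to $ct$, and the prime with respect to $\eta$. The perturbations of $\kappa_{\alpha\beta}$ and $\kappa_\alpha^\beta = \kappa_{\alpha\gamma}\gamma^{\gamma\beta}$ are:
$$\delta\kappa_{\alpha\beta} = \dot h_{\alpha\beta} = \frac{1}{a}h'_{\alpha\beta}, \qquad \delta\kappa_\alpha^\beta = -h^{\beta\gamma}\kappa_{\alpha\gamma} + \gamma^{\beta\gamma}\dot h_{\alpha\gamma} = \dot h_\alpha^\beta = \frac{1}{a}h_\alpha^{\beta\prime},$$
where $h_\alpha^\beta = \gamma^{\beta\gamma}h_{\alpha\gamma}$. For the euclidean metric (111.1) the unperturbed values of the three-dimensional $P_\alpha^\beta$ are zero. The variations $\delta P_\alpha^\beta$ are calculated from formulas (1) and (2) of problem 2 in § 102: it is obvious that $\delta P_\alpha^\beta$ is expressed in terms of the $\delta\gamma_{\alpha\beta}$ just as the four-tensor $\delta R_{ik}$ is expressed in terms of the $\delta g_{ik}$, all tensor operations being done in the three-dimensional space with the metric (111.1); because this metric is euclidean, all the covariant differentiations reduce to simple differentiations with respect to the coordinates $x^\alpha$ (for the contravariant derivatives we must still divide by $a^2$). Taking all this into account (and changing from derivatives with respect to $t$ to derivatives with respect to $\eta$), we get, after some simple calculations:
$$\delta R_\alpha^\beta = -\frac{1}{2a^2}\left(h_\alpha^{\gamma,\beta}{}_{,\gamma} + h_\gamma^{\beta,\gamma}{}_{,\alpha} - h_\alpha^{\beta,\gamma}{}_{,\gamma} - h^{,\beta}{}_{,\alpha}\right) - \frac{1}{2a^2}h_\alpha^{\beta\prime\prime} - \frac{a'}{a^3}h_\alpha^{\beta\prime} - \frac{a'}{2a^3}h'\delta_\alpha^\beta,$$
$$\delta R_0^0 = -\frac{1}{2a^2}h'' - \frac{a'}{2a^3}h', \qquad \delta R_0^\alpha = \frac{1}{2a^2}\left(h^{,\alpha} - h_\beta^{\alpha,\beta}\right)' \tag{111.5}$$
($h \equiv h_\alpha^\alpha$). Here both the upper and lower indices following the comma denote simple differentiations with respect to the $x^\alpha$ (we continue to write indices above and below only to retain uniformity of the notation).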
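We obtain the final equations for the perturbations by substituting in (111.4) the components $\delta T_i^k$, expressed in terms of the $\delta R_i^k$ according to (111.2). For these equations it is convenient to choose the equations obtained from (111.4) for $\alpha \ne \beta$, and those obtained by contracting on $\alpha$, $\beta$. They are:
$$\left(h_\alpha^{\gamma,\beta}{}_{,\gamma} + h_\gamma^{\beta,\gamma}{}_{,\alpha} - h_\alpha^{\beta,\gamma}{}_{,\gamma} - h^{,\beta}{}_{,\alpha}\right) + h_\alpha^{\beta\prime\prime} + \frac{2a'}{a}h_\alpha^{\beta\prime} = 0, \qquad \alpha \ne \beta,$$
$$\frac{1}{2}\left(h_\gamma^{\delta,\gamma}{}_{,\delta} - h^{,\gamma}{}_{,\gamma}\right)\left(1 + 3\frac{dp}{d\varepsilon}\right) + h'' + h'\frac{a'}{a}\left(2 + 3\frac{dp}{d\varepsilon}\right) = 0. \tag{111.6}$$
The perturbations of the density and matter velocity can be determined from the known $h_\alpha^\beta$ using formulas (111.2) and (111.3). Thus we have for the relative change of the density:
$$\frac{\delta\varepsilon}{\varepsilon} = \frac{c^4}{8\pi k\varepsilon}\left(\delta R_0^0 - \tfrac{1}{2}\delta R\right) = \frac{c^4}{16\pi k\varepsilon a^2}\left(h_\alpha^{\beta,\alpha}{}_{,\beta} - h^{,\alpha}{}_{,\alpha} + \frac{2a'}{a}h'\right). \tag{111.7}$$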
Among the solutions of equations (111.6) there are some that can be eliminated by a simple transformation of the reference system (without destroying the condition of synchronism), and so do not represent a real physical change of the metric. The form of such solutions can be established by using formulas (1) and (2) in problem 3 of § 99. Substituting the unperturbed values $\gamma_{\alpha\beta} = a^2\delta_{\alpha\beta}$, we get from them the following expressions for fictitious perturbations of the metric:
$$h_\alpha^\beta = f_{0,\alpha}{}^{,\beta}\int\frac{d\eta}{a} + \frac{a'}{a^2}f_0\,\delta_\alpha^\beta + \left(f_\alpha{}^{,\beta} + f^\beta{}_{,\alpha}\right), \tag{111.8}$$
where the $f_0$, $f_\alpha$ are arbitrary (small) functions of the coordinates $x$, $y$, $z$.
Since the metric in the small regions of space we are considering is assumed to be euclidean, an arbitrary perturbation in such a region can be expanded in plane waves. Using $x$, $y$, $z$ for cartesian coordinates measured in units of $a$, we can write the periodic space factor for the plane waves in the form $e^{i\mathbf{n}\cdot\mathbf{r}}$, where $\mathbf{n}$ is a dimensionless vector, which represents the wave vector measured in units of $1/a$ (the wave vector is $\mathbf{k} = \mathbf{n}/a$). If we have a perturbation over a portion of space of dimensions $\sim l$, the expansion will involve waves of length $\lambda = 2\pi a/n \sim l$. If we restrict the perturbations to regions of size $l \ll a$, we automatically assume the number $n$ to be quite large ($n \gg 2\pi$).
Gravitational perturbations can be divided into three types. This classification reduces to a determination of the possible types of plane waves in terms of which the symmetric tensor $h_{\alpha\beta}$ can be represented. We thus obtain the following classification:
1. Using the scalar function
$$Q = e^{i\mathbf{n}\cdot\mathbf{r}}, \tag{111.9}$$
we can form the vector $\mathbf{P} = \mathbf{n}Q$ and the tensors†
$$Q_\alpha^\beta = \tfrac{1}{3}\delta_\alpha^\beta Q, \qquad P_\alpha^\beta = \left(\tfrac{1}{3}\delta_\alpha^\beta - \frac{n_\alpha n^\beta}{n^2}\right)Q. \tag{111.10}$$
These plane waves correspond to perturbations in which, in addition to the gravitational field, there are changes in the velocity and density of the matter, i.e. we are dealing with perturbations accompanied by condensations or rarefactions of the matter. The perturbation of $h_\alpha^\beta$ is expressed in terms of the tensors $Q_\alpha^\beta$ and $P_\alpha^\beta$, the perturbation of the velocity is expressed in terms of the vector $\mathbf{P}$, and the perturbation of the density, in terms of the scalar $Q$.
2. Using the transverse vector wave
$$\mathbf{S} = \mathbf{s}\,e^{i\mathbf{n}\cdot\mathbf{r}}, \qquad \mathbf{s}\cdot\mathbf{n} = 0, \tag{111.11}$$
we can form the tensor $(n^\beta S_\alpha + n_\alpha S^\beta)$; the scalar corresponding to this does not exist, since $\mathbf{n}\cdot\mathbf{S} = 0$. These waves correspond to perturbations in which, in addition to the gravitational field, we have a change in velocity but no change of the density of the matter; they may be called rotational perturbations.
3. The transverse tensor wave
$$G_\alpha^\beta = g_\alpha^\beta\,e^{i\mathbf{n}\cdot\mathbf{r}}, \qquad g_\alpha^\beta n_\beta = 0. \tag{111.12}$$
We can construct neither a vector nor a scalar by using it. These waves correspond to perturbations of the gravitational field in which the matter remains at rest and uniformly distributed throughout space. In other words, these are gravitational waves in an isotropic universe.
† We write upper and lower indices on the components of ordinary cartesian tensors only to preserve uniformity of notation.
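The perturbations of the first type are of principal interest. We set
$$h_\alpha^\beta = \lambda(\eta)P_\alpha^\beta + \mu(\eta)Q_\alpha^\beta, \qquad h = \mu Q. \tag{111.13}$$
From (111.7) we find for the relative change of the density
$$\frac{\delta\varepsilon}{\varepsilon} = \frac{c^4}{24\pi k\varepsilon a^2}\left[n^2(\lambda+\mu) + \frac{3a'}{a}\mu'\right]Q. \tag{111.14}$$
The equations for determining $\lambda$ and $\mu$ are gotten by substituting (111.13) in (111.6):
$$\lambda'' + \frac{2a'}{a}\lambda' - \frac{n^2}{3}(\lambda+\mu) = 0,$$
$$\mu'' + \mu'\frac{a'}{a}\left(2 + 3\frac{dp}{d\varepsilon}\right) + \frac{n^2}{3}(\lambda+\mu)\left(1 + 3\frac{dp}{d\varepsilon}\right) = 0. \tag{111.15}$$
These equations have the following two partial integrals, corresponding to those fictitious changes of metric that can be eliminated by transforming the reference system:
$$\lambda = -\mu = \text{const}, \tag{111.16}$$
$$\lambda = -n^2\int\frac{d\eta}{a}, \qquad \mu = n^2\int\frac{d\eta}{a} - \frac{3a'}{a^2} \tag{111.17}$$
[the first of these is gotten from (111.8) by choosing $f_0 = 0$, $f_\alpha = P_\alpha$; the second by choosing $f_0 = Q$, $f_\alpha = 0$].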
In the early stages of expansion of the universe, when the matter is described by the equation of state $p = \varepsilon/3$, we have $a \approx a_1\eta$, $\eta \ll 1$ (in both the open and closed models). Equations (111.15) take the form:
$$\lambda'' + \frac{2}{\eta}\lambda' - \frac{n^2}{3}(\lambda+\mu) = 0, \qquad \mu'' + \frac{3}{\eta}\mu' + \frac{2n^2}{3}(\lambda+\mu) = 0. \tag{111.18}$$
These equations are conveniently investigated separately for the two limiting cases depending on the ratio of the large quantities $n$ and $1/\eta$.
Let us assume first that $n$ is not too large (or that $\eta$ is sufficiently small), so that $n\eta \ll 1$. To the order of accuracy for which the equations (111.18) are valid, we find from them for this case:
$$\lambda = \frac{3C_1}{\eta} + C_2\left(1 + \frac{n^2}{9}\eta^2\right), \qquad \mu = -\frac{2n^2}{3}C_1\eta + C_2\left(1 - \frac{n^2}{6}\eta^2\right),$$
where $C_1$, $C_2$ are constants; solutions of the form (111.16) and (111.17) are excluded (in the present case these are the solution with $\lambda - \mu = \text{const}$ and the one with $\lambda + \mu \sim 1/\eta^2$). Calculating $\delta\varepsilon/\varepsilon$ from (111.14) and (108.15), we get the following expressions for the perturbations of the metric and the density:
$$h_\alpha^\beta = \frac{3C_1}{\eta}P_\alpha^\beta + C_2\left(Q_\alpha^\beta + P_\alpha^\beta\right), \qquad \frac{\delta\varepsilon}{\varepsilon} = \frac{n^2}{9}\left(C_1\eta + C_2\eta^2\right)Q \qquad \text{for } p = \frac{\varepsilon}{3},\ \eta \ll \frac{1}{n}. \tag{111.19}$$
The constants $C_1$ and $C_2$ must satisfy conditions expressing the smallness of the perturbation at the time $\eta_0$ of its start: we must have $h_\alpha^\beta \ll 1$ (so that $\lambda \ll 1$ and $\mu \ll 1$) and $\delta\varepsilon/\varepsilon \ll 1$. As applied to (111.19) these conditions give the inequalities $C_1 \ll \eta_0$, $C_2 \ll 1$.
In (111.19) there are various terms that increase in the expanding universe like different powers of the radius $a = a_1\eta$. But this growth does not cause the perturbation to become large: if we apply formula (111.19) for an order of magnitude to $\eta \sim 1/n$, we see that (because of the inequalities found above for $C_1$ and $C_2$) the perturbations remain small even at the upper limit of application of these formulas.
Now suppose that $n$ is so large that $n\eta \gg 1$. Solving (111.18) for this condition, we find that the leading terms in $\lambda$ and $\mu$ are:†
$$\lambda = -\frac{\mu}{2} = \text{const}\cdot\frac{1}{\eta^2}\,e^{in\eta/\sqrt{3}}.$$
We then find for the perturbations of the metric and the density:
$$h_\alpha^\beta = \frac{C}{n^2\eta^2}\left(P_\alpha^\beta - 2Q_\alpha^\beta\right)e^{in\eta/\sqrt{3}}, \qquad \frac{\delta\varepsilon}{\varepsilon} = -\frac{C}{9}\,Q\,e^{in\eta/\sqrt{3}} \qquad \text{for } p = \frac{\varepsilon}{3},\ \frac{1}{n} \ll \eta \ll 1, \tag{111.20}$$
where $C$ is a complex constant satisfying the condition $|C| \ll 1$. The presence of a periodic factor in these expressions is entirely natural. For large $n$ we are dealing with a perturbation whose spatial periodicity is determined by the large wave vector $k = n/a$. Such perturbations must propagate like sound waves with velocity
$$u = \sqrt{\frac{dp}{d(\varepsilon/c^2)}} = \frac{c}{\sqrt{3}}.$$
Correspondingly the time part of the phase is determined, as in geometrical acoustics, by the large integral $\int ku\,dt = n\eta/\sqrt{3}$. As we see, the amplitude of the relative change of density remains constant, while the amplitude of the perturbations of the metric itself decreases like $a^{-2}$ in the expanding universe.‡
Now we consider later stages of the expansion, when the matter is already so rarefied that we can neglect its pressure ($p = 0$). We shall limit ourselves to the case of small $\eta$, corresponding to that stage of the expansion when the radius $a$ was still small compared to its present value, but the matter was already quite rarefied.
For $p = 0$ and $\eta \ll 1$, we have $a \approx a_0\eta^2/2$, and (111.15) takes the form:
$$\lambda'' + \frac{4}{\eta}\lambda' - \frac{n^2}{3}(\lambda+\mu) = 0,$$
$$\mu'' + \frac{4}{\eta}\mu' + \frac{n^2}{3}(\lambda+\mu) = 0.$$
The solution of these equations is
$$\lambda + \mu = 2C_1 - \frac{6C_2}{\eta^3}, \qquad \lambda - \mu = 2n^2\left(\frac{C_1\eta^2}{15} + \frac{2C_2}{\eta^3}\right).$$
† The factor $1/\eta^2$ in front of the exponential is the first term in the expansion in powers of $1/n\eta$. To find it we must consider the first two terms in the expansion simultaneously [which is justified within the limits of accuracy of (111.18)].
‡ It is easy to verify that (for $p = \varepsilon/3$) $n\eta \sim L/\lambda$, where $L \sim u/\sqrt{k\varepsilon/c^2}$. It is natural that the characteristic length $L$, which determines the behavior of perturbations with wave length $\lambda \ll a$, contains only hydrodynamic quantities: the matter density $\varepsilon/c^2$ and the sound velocity $u$ (and the gravitational constant $k$). We note that there is a growth of the perturbations when $\lambda \gg L$ [in (111.19)].
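Also calculating $\delta\varepsilon/\varepsilon$ by using (111.14) and (108.12), we find:
$$h_\alpha^\beta = C_1\left(P_\alpha^\beta + Q_\alpha^\beta\right) + \frac{2n^2C_2}{\eta^3}\left(P_\alpha^\beta - Q_\alpha^\beta\right) \qquad \text{for } \eta \ll \frac{1}{n},$$
$$h_\alpha^\beta = \frac{C_1n^2\eta^2}{15}\left(P_\alpha^\beta - Q_\alpha^\beta\right) + \frac{2n^2C_2}{\eta^3}\left(P_\alpha^\beta - Q_\alpha^\beta\right) \qquad \text{for } \frac{1}{n} \ll \eta \ll 1, \tag{111.21}$$
$$\frac{\delta\varepsilon}{\varepsilon} = \left(\frac{C_1n^2\eta^2}{30} + \frac{C_2n^2}{\eta^3}\right)Q.$$
We see that $\delta\varepsilon/\varepsilon$ contains terms that increase proportionally with $a$. But if $n\eta_0 \ll 1$, then $\delta\varepsilon/\varepsilon$ does not become large even for $\eta \sim 1/n$ because of the condition $C_1 \ll 1$. If, however, $\eta_0 n \gg 1$, then for $\eta \sim 1$ the relative change of density becomes of order $C_1n^2$, while the smallness of the initial perturbation requires only that $C_1n^2\eta_0^2 \ll 1$. Thus, although the growth of the perturbation occurs slowly, nevertheless its total growth may be considerable, so that it becomes quite large.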
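The growing ($C_1$) part of the pressureless solution can be checked numerically against the equations for $\lambda$ and $\mu$ and against the density formula. The sketch below sets $C_2 = 0$, uses central finite differences, and takes the prefactor of (111.14) to be $c^4/24\pi k\varepsilon a^2 = \eta^2/36$, which is what the relation $8\pi k\varepsilon a^2/c^4 = 12/\eta^2$ for $a \approx a_0\eta^2/2$ gives; that prefactor, the sample values of $n$, $C_1$, $\eta$ and all names are assumptions made for the illustration.

```python
def lam(eta, n, c1):
    """lambda(eta) for the growing mode (C_2 = 0): C_1 (1 + n^2 eta^2 / 15)."""
    return c1 * (1.0 + n**2 * eta**2 / 15.0)

def mu(eta, n, c1):
    """mu(eta) for the growing mode (C_2 = 0): C_1 (1 - n^2 eta^2 / 15)."""
    return c1 * (1.0 - n**2 * eta**2 / 15.0)

def first_derivative(f, x, step):
    """Central-difference estimate of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def second_derivative(f, x, step):
    """Central-difference estimate of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2

n, c1, eta = 100.0, 1.0e-6, 0.05   # illustrative values with n*eta > 1, eta << 1
step = 1.0e-3 * eta

def f_lam(x):
    return lam(x, n, c1)

def f_mu(x):
    return mu(x, n, c1)

coupling = n**2 / 3.0 * (f_lam(eta) + f_mu(eta))
# Residuals of the two p = 0 equations; both should vanish up to rounding.
residual_lam = second_derivative(f_lam, eta, step) + 4.0 / eta * first_derivative(f_lam, eta, step) - coupling
residual_mu = second_derivative(f_mu, eta, step) + 4.0 / eta * first_derivative(f_mu, eta, step) + coupling

# Amplitude of delta-epsilon/epsilon from (111.14) with the assumed prefactor eta^2/36.
density_amplitude = eta**2 / 36.0 * (n**2 * (f_lam(eta) + f_mu(eta)) + 6.0 / eta * first_derivative(f_mu, eta, step))
expected_amplitude = c1 * n**2 * eta**2 / 30.0

print(residual_lam, residual_mu, density_amplitude, expected_amplitude)
```

The density amplitude reproduces the $C_1n^2\eta^2/30$ term of (111.21), i.e. the part of the perturbation that grows in proportion to $a$.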
One can similarly treat perturbations of the second and third types listed above. But the laws for the damping of these perturbations can also be found without detailed calculations by starting from the following simple arguments.
If over a small region of the matter (with linear dimensions $l$) there is a rotational perturbation with velocity $\delta v$, the angular momentum of this region is $\sim (\varepsilon/c^2)\,l^3\cdot l\cdot\delta v$. During the expansion of the universe $l$ increases proportionally with $a$, while $\varepsilon$ decreases like $a^{-3}$ (in the case of $p = 0$) or like $a^{-4}$ (for $p = \varepsilon/3$). From the conservation of angular momentum, we have
$$\delta v = \text{const} \quad \text{for } p = \varepsilon/3, \qquad \delta v \sim \frac{1}{a} \quad \text{for } p = 0. \tag{111.22}$$
Finally, the energy density of gravitational waves must decrease during the expansion of the universe like $a^{-4}$. On the other hand, this density is expressed in terms of the perturbation of the metric by $\sim k^2(h_\alpha^\beta)^2$, where $k = n/a$ is the wave vector of the perturbation. It then follows that the amplitude of perturbations of the type of gravitational waves decreases with time like $1/a$.
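The two scaling arguments above can be written out step by step. The following is only a restatement of the reasoning in the text, using the relations $l \propto a$ and $k = n/a$; the symbol $\varepsilon_g$ for the energy density of the gravitational waves is introduced here for convenience and does not appear in the original.

```latex
% Rotational perturbations: conservation of angular momentum
\frac{\varepsilon}{c^2}\, l^4\, \delta v = \text{const}, \qquad l \propto a
\;\Longrightarrow\;
\delta v \propto \frac{1}{\varepsilon a^4} \propto
\begin{cases}
  \text{const}, & \varepsilon \propto a^{-4}\ (p = \varepsilon/3),\\
  a^{-1},       & \varepsilon \propto a^{-3}\ (p = 0).
\end{cases}

% Gravitational waves: energy density falls like a^{-4}
\varepsilon_g \sim k^2 \left(h_\alpha^\beta\right)^2 \propto a^{-4}, \qquad k = \frac{n}{a}
\;\Longrightarrow\;
\left(h_\alpha^\beta\right)^2 \propto a^{-2}
\;\Longrightarrow\;
h_\alpha^\beta \propto \frac{1}{a}.
```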
§112. Homogeneous spaces
The assumption of homogeneity and isotropy of space determines the metric completely
(leaving free only the sign of the curvature). Considerably more freedom is left if one assumes
only homogeneity of space, with no additional symmetry. Let us see what metric properties
a homogeneous space can have.
We shall be discussing the metric of a space at a given instant of time $t$. We assume that the space-time reference system is chosen to be synchronous, so that $t$ is the same synchronized time for the whole space.
† A more detailed analysis taking into account the small pressure $p(\varepsilon)$ shows that the possibility of neglecting the pressure requires that one satisfy the condition $u\eta n/c \ll 1$ (where $u = c\sqrt{dp/d\varepsilon}$ is the small sound velocity); it is easy to show that in this case also it coincides with the condition $\lambda/L \gg 1$. Thus, growth of the perturbation always occurs if $\lambda/L \gg 1$.
Homogeneity implies identical metric properties at all points of the space. An exact definition of this concept involves considering sets of coordinate transformations that transform the space into itself, i.e. leave its metric unchanged: if the line element before transformation is
$$dl^2 = \gamma_{\alpha\beta}(x^1, x^2, x^3)\,dx^\alpha dx^\beta,$$
then after transformation the same line element is
$$dl^2 = \gamma_{\alpha\beta}(x'^1, x'^2, x'^3)\,dx'^\alpha dx'^\beta,$$
with the same functional dependence of the $\gamma_{\alpha\beta}$ on the new coordinates. A space is homogeneous if it admits a set of transformations (a group of motions) that enables us to bring any given point to the position of any other point. Since space is three-dimensional, the different transformations of the group are labelled by three independent parameters.
Thus, in euclidean space the homogeneity of space is expressed by the invariance of the metric under parallel displacements (translations) of the cartesian coordinate system. Each translation is determined by three parameters, the components of the displacement vector of the coordinate origin. All these transformations leave invariant the three independent differentials ($dx$, $dy$, $dz$) from which the line element is constructed.
In the general case of a noneuclidean homogeneous space, the transformations of its group of motions again leave invariant three independent linear differential forms, which do not, however, reduce to total differentials of any coordinate functions. We write these forms as
$$e^a_\alpha\,dx^\alpha, \tag{112.1}$$
where the Latin index $a$ labels three independent vectors (coordinate functions); we call these vectors a frame.
Using the forms (112.1) we construct a spatial metric invariant under the given group of motions:
$$dl^2 = \gamma_{ab}\,(e^a_\alpha\,dx^\alpha)(e^b_\beta\,dx^\beta),$$
i.e. the metric tensor is
$$\gamma_{\alpha\beta} = \gamma_{ab}\,e^a_\alpha e^b_\beta, \tag{112.2}$$
where the coefficients $\gamma_{ab}$, which are symmetric in the indices $a$ and $b$, are functions of the time.† The contravariant components of the metric tensor are written as
$$\gamma^{\alpha\beta} = \gamma^{ab}\,e_a^\alpha e_b^\beta, \tag{112.3}$$
where the coefficients $\gamma^{ab}$ form a matrix reciprocal to the matrix $\gamma_{ab}$ ($\gamma^{ac}\gamma_{cb} = \delta^a_b$), while the quantities $e_a^\alpha$ form three vectors, "reciprocal" to the vectors $e^a_\alpha$:
$$e_a^\alpha e^b_\alpha = \delta_a^b, \qquad e_a^\alpha e^a_\beta = \delta^\alpha_\beta \tag{112.4}$$
(each of these equations following automatically from the other). We note that the relation between $e_a^\alpha$ and $e^a_\alpha$ can be written explicitly as
$$\mathbf{e}_1 = \frac{1}{v}\,\mathbf{e}^2\times\mathbf{e}^3, \qquad \mathbf{e}_2 = \frac{1}{v}\,\mathbf{e}^3\times\mathbf{e}^1, \qquad \mathbf{e}_3 = \frac{1}{v}\,\mathbf{e}^1\times\mathbf{e}^2, \tag{112.5}$$
where $v = \mathbf{e}^1\cdot\mathbf{e}^2\times\mathbf{e}^3$, while $\mathbf{e}_a$ and $\mathbf{e}^a$ are to be regarded as cartesian vectors with components $e_a^\alpha$ and $e^a_\alpha$ respectively.† The determinant of the metric tensor (112.2) is
$$\gamma = |\gamma_{ab}|\,|e^a_\alpha|^2 = |\gamma_{ab}|\,v^2, \tag{112.6}$$
where $|\gamma_{ab}|$ is the determinant of the matrix $\gamma_{ab}$.
† Throughout this section we sum over repeated indices, both Greek indices and the Latin indices ($a$, $b$, $c$, ...) that label the frame vectors.
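The construction (112.5) of the reciprocal vectors is easy to try out on numbers. The sketch below builds $\mathbf{e}_a$ from three frame vectors $\mathbf{e}^a$ and checks the first relation in (112.4); the particular frame components are arbitrary illustrative values (taken at a single point of space), and the function names are not from the text.

```python
def cross(u, w):
    """Cartesian vector product u x w."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def dot(u, w):
    """Cartesian scalar product u . w."""
    return sum(x * y for x, y in zip(u, w))

def reciprocal_frame(e1, e2, e3):
    """Return (e_1, e_2, e_3) reciprocal to the frame (e^1, e^2, e^3), eq. (112.5)."""
    v = dot(e1, cross(e2, e3))
    if v == 0:
        raise ValueError("frame vectors are linearly dependent")
    return (tuple(x / v for x in cross(e2, e3)),
            tuple(x / v for x in cross(e3, e1)),
            tuple(x / v for x in cross(e1, e2)))

# Illustrative frame components e^a_alpha (hypothetical numbers)
upper = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 1.0, 3.0))
lower = reciprocal_frame(*upper)

# e_a . e^b should be the unit matrix, eq. (112.4)
for a in range(3):
    print([round(dot(lower[a], upper[b]), 12) for b in range(3)])
```

The triple product $v$ computed inside `reciprocal_frame` is the same quantity that enters the determinant (112.6).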
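The invariance of the differential forms (112.1) means that
$$e^a_\alpha(x)\,dx^\alpha = e^a_\alpha(x')\,dx'^\alpha, \tag{112.7}$$
where the $e^a_\alpha$ on the two sides of the equation are the same functions of the old and new coordinates, respectively. Multiplying this equation by $e_a^\beta(x')$, setting
$$dx'^\beta = \frac{\partial x'^\beta}{\partial x^\alpha}\,dx^\alpha,$$
and comparing coefficients of the same differentials $dx^\alpha$, we find
$$\frac{\partial x'^\beta}{\partial x^\alpha} = e_a^\beta(x')\,e^a_\alpha(x). \tag{112.8}$$
These equations are a system of differential equations that determine the functions $x'^\beta(x)$ for a given frame.‡ In order to be integrable, the equations (112.8) must satisfy identically the conditions
$$\frac{\partial^2 x'^\beta}{\partial x^\alpha\,\partial x^\gamma} = \frac{\partial^2 x'^\beta}{\partial x^\gamma\,\partial x^\alpha}.$$
Calculating the derivatives, we find
$$\left[\frac{\partial e_a^\beta(x')}{\partial x'^\delta}\,e_b^\delta(x') - \frac{\partial e_b^\beta(x')}{\partial x'^\delta}\,e_a^\delta(x')\right]e^b_\gamma(x)\,e^a_\alpha(x) = e_a^\beta(x')\left[\frac{\partial e^a_\gamma(x)}{\partial x^\alpha} - \frac{\partial e^a_\alpha(x)}{\partial x^\gamma}\right].$$
Multiplying both sides of the equations by $e_d^\alpha(x)\,e_c^\gamma(x)\,e^f_\beta(x')$ and shifting the differentiation from one factor to the other by using (112.4), we get for the left side:
$$e^f_\beta(x')\left[\frac{\partial e_d^\beta(x')}{\partial x'^\delta}\,e_c^\delta(x') - \frac{\partial e_c^\beta(x')}{\partial x'^\delta}\,e_d^\delta(x')\right] = e_c^\beta(x')\,e_d^\delta(x')\left[\frac{\partial e^f_\beta(x')}{\partial x'^\delta} - \frac{\partial e^f_\delta(x')}{\partial x'^\beta}\right],$$
and for the right, the same expression in the variable $x$. Since $x$ and $x'$ are arbitrary, these expressions must reduce to constants:
$$\left(\frac{\partial e^c_\alpha}{\partial x^\beta} - \frac{\partial e^c_\beta}{\partial x^\alpha}\right)e_a^\alpha e_b^\beta = C^c{}_{ab}. \tag{112.9}$$
The constants $C^c{}_{ab}$ are called the structure constants of the group. Multiplying by $e_c^\gamma$, we can rewrite (112.9) in the form
$$e_a^\alpha\frac{\partial e_b^\gamma}{\partial x^\alpha} - e_b^\beta\frac{\partial e_a^\gamma}{\partial x^\beta} = C^c{}_{ab}\,e_c^\gamma. \tag{112.10}$$
† Do not confuse the $e_a^\alpha$ with the contravariant components of the vectors $e^a_\alpha$! The latter are equal to: $e^{a\alpha} = \gamma^{\alpha\beta}e^a_\beta = \gamma^{ab}e_b^\alpha$.
‡ For a transformation of the form $x'^\beta = x^\beta + \xi^\beta$, where the $\xi^\beta$ are small quantities, we obtain from (112.8) the equations
$$\frac{\partial\xi^\beta}{\partial x^\alpha} = \xi^\gamma\,\frac{\partial e_a^\beta}{\partial x^\gamma}\,e^a_\alpha. \tag{112.8a}$$
The three linearly independent solutions of these equations, $\xi_b^\beta$ ($b = 1, 2, 3$), determine the infinitesimal transformations of the group of motions of the space. The vectors $\xi_b^\beta$ are called the Killing vectors.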
As we see from their definition, the structure constants are antisymmetric in their lower indices:
$$C^c{}_{ab} = -C^c{}_{ba}. \tag{112.11}$$
We can obtain still another condition on them by noting that (112.10) can be written in the form of commutation relations
$$[X_a, X_b] \equiv X_aX_b - X_bX_a = C^c{}_{ab}X_c \tag{112.12}$$
for the linear differential operators†
$$X_a = e_a^\alpha\frac{\partial}{\partial x^\alpha}. \tag{112.13}$$
Then the condition mentioned above follows from the identity
$$[[X_a, X_b], X_c] + [[X_b, X_c], X_a] + [[X_c, X_a], X_b] = 0$$
(the Jacobi identity), and has the form:
$$C^e{}_{ab}C^d{}_{ec} + C^e{}_{bc}C^d{}_{ea} + C^e{}_{ca}C^d{}_{eb} = 0. \tag{112.14}$$
We note that equation (112.9) can be written in vector form as
$$(\mathbf{e}_a\times\mathbf{e}_b)\cdot\operatorname{curl}\mathbf{e}^c = -C^c{}_{ab},$$
where again the vectorial operations are carried out as if the coordinates $x^\alpha$ were cartesian. Using (112.5) we then have
$$\frac{1}{v}\,\mathbf{e}^1\cdot\operatorname{curl}\mathbf{e}^1 = C^1{}_{32}, \qquad \frac{1}{v}\,\mathbf{e}^2\cdot\operatorname{curl}\mathbf{e}^1 = C^1{}_{13}, \qquad \frac{1}{v}\,\mathbf{e}^3\cdot\operatorname{curl}\mathbf{e}^1 = C^1{}_{21}, \tag{112.15}$$
and six other equations, obtained by cyclic permutation of the indices 1, 2, 3.
The Einstein equations for a universe with a homogeneous space can be written as a system of ordinary differential equations containing only functions of the time. To do this all three-dimensional vectors and tensors must be expanded in the triple of frame vectors of the given space. Denoting the components of this expansion by indices $a$, $b$, ..., we have, by definition:
$$R_{ab} = R_{\alpha\beta}\,e_a^\alpha e_b^\beta, \qquad R_{0a} = R_{0\alpha}\,e_a^\alpha, \qquad u^a = u^\alpha e^a_\alpha,$$
where all these quantities are functions only of $t$ (as are the scalar quantities $\varepsilon$ and $p$). Any further raising or lowering of indices is done using the $\gamma_{ab}$: $R_a^b = \gamma^{bc}R_{ac}$, $u_a = \gamma_{ab}u^b$, etc.
According to (99.11-13) the Einstein equations in the synchronous reference system are given in terms of the three-dimensional tensors $\kappa_{\alpha\beta}$ and $P_{\alpha\beta}$. For the first of these we have simply
$$\kappa_{ab} = \dot\gamma_{ab} \tag{112.16}$$
(the dot denoting differentiation with respect to $t$). The components $P_{ab}$ can be expressed in terms of the quantities $\gamma_{ab}$ and the structure constants of the group:
$$P_{ab} = -a^c{}_{ad}\,a^d{}_{bc} - C^d{}_{dc}\,a^c{}_{ab},$$
$$a^c{}_{ab} = \tfrac{1}{2}\left(C^c{}_{ab} + C^e{}_{bd}\gamma_{ea}\gamma^{dc} - C^e{}_{da}\gamma_{eb}\gamma^{dc}\right). \tag{112.17}$$
The covariant derivatives $\kappa_{\alpha;\gamma}^\beta$ [which appear in (99.12)] are also expressed in terms of these quantities, and we find for $R_a^0$:
$$R_a^0 = -\tfrac{1}{2}\,\dot\gamma_{bc}\gamma^{bd}\left(C^c{}_{da} - \delta^c_a\,C^e{}_{ed}\right). \tag{112.18}$$
We emphasize that, in forming the Einstein equations, there is thus no need to use explicit expressions for the frame vectors as functions of the coordinates.†
† The results presented belong to the mathematical theory of continuous groups (Lie groups). In this theory the operators $X_a$ satisfying conditions of the form (112.12) are called the generators of the group. We mention, however (to avoid confusion when comparing with other presentations), that the systematic theory usually starts from operators defined using the Killing vectors: $X_a = \xi_a^\alpha\,\partial/\partial x^\alpha$.
The choice of the three frame vectors in the differential forms (112.1) and, with them, of the operators (112.13), is clearly not unique. They can be subjected to any linear transformation with constant (real) coefficients:
$$e'^a_\alpha = A^a{}_b\,e^b_\alpha. \tag{112.19}$$
Relative to such transformations the quantities $\gamma_{ab}$ behave like a covariant tensor, and the constants $C^c{}_{ab}$ like a tensor covariant in the indices $a$, $b$ and contravariant in the index $c$.
The conditions (112.11) and (112.14) are the only ones that the structure constants must satisfy. But among the sets of constants admissible under these conditions there are equivalent ones, in the sense that their difference is caused only by a transformation (112.19). The problem of the classification of homogeneous spaces reduces to the determination of all nonequivalent sets of structure constants.
A simple procedure for doing this is to make use of the "tensor" properties of the constants $C^c{}_{ab}$, expressing these nine quantities in terms of the six components of a symmetric "tensor" $n^{ab}$ and the three components of a "vector" $a_c$ as
$$C^c{}_{ab} = e_{abd}\,n^{dc} + \delta^c_b\,a_a - \delta^c_a\,a_b, \tag{112.20}$$
where $e_{abd}$ is the unit antisymmetric "tensor" (C. G. Behr, 1962). The condition for antisymmetry of (112.11) has already been met, while the Jacobi identity (112.14) gives the condition
$$n^{ab}a_b = 0. \tag{112.21}$$
By means of the transformations (112.19) the symmetric "tensor" $n^{ab}$ can be brought to
diagonal form: let $n^{(1)}$, $n^{(2)}$, $n^{(3)}$ be its eigenvalues. Equation (112.21) shows that the "vector"
$a_b$ (if it exists) lies along one of the principal directions of the "tensor" $n^{ab}$, the one
corresponding to the eigenvalue zero. Without loss of generality we can therefore set $a_b = (a, 0, 0)$.
Then (112.21) reduces to $a\,n^{(1)} = 0$, i.e. one of the quantities a or $n^{(1)}$ must be zero. The
commutation relations (112.12) take the form:
$[X_1, X_2] = aX_2 + n^{(3)}X_3$,
$[X_2, X_3] = n^{(1)}X_1$, (112.22)
$[X_3, X_1] = n^{(2)}X_2 - aX_3$.
The only remaining freedom is a change of sign of the operators X a and arbitrary scale
transformations of them (multiplication by constants). This permits us simultaneously to
change the sign of all the $n^{(a)}$ and also to make the quantity a positive (if it is different from
zero). We can also make all the structure constants equal to ±1, if at least one of the
quantities a, $n^{(2)}$, $n^{(3)}$ vanishes. But if all three of these quantities differ from zero, the scale
transformations leave invariant the ratio $a^2/n^{(2)}n^{(3)}$.
† The derivation of formulas (112.17-18) can be found in the paper of E. Schücking in the book
Gravitation: an Introduction to Current Research, ed. L. Witten, J. Wiley, New York, 1962, p. 454.
360
COSMOLOGICAL PROBLEMS
§ 113
Thus we arrive at the following list of possible types of homogeneous spaces; in the first
column of the table we give the roman numeral by which the type is labelled according to
the Bianchi classification (L. Bianchi, 1918):†

Type                          a     $n^{(1)}$   $n^{(2)}$   $n^{(3)}$
I                             0     0           0           0
II                            0     1           0           0
VII_0                         0     1           1           0
VI_0                          0     1           -1          0
IX                            0     1           1           1
VIII                          0     1           1           -1
V                             1     0           0           0
IV                            1     0           0           1
VII_a                         a     0           1           1
III (a = 1), VI_a (a ≠ 1)     a     0           1           -1

Type I is euclidean space (all components of the spatial curvature tensor vanish). In
addition to the trivial case of a galilean metric, the metric (103.9) belongs to this type.
If for the space of type IX one puts $\gamma_{ab} = (a^2/4)\delta_{ab}$, one finds for the Ricci tensor
$P_{ab} = \tfrac{1}{2}\delta_{ab}$ and hence:
$P_{\alpha\beta} = P_{ab}\, e^a_\alpha e^b_\beta = \frac{2}{a^2}\gamma_{\alpha\beta}$,
which corresponds to a space of constant positive curvature [cf. (107.3), (107.6)]; this space
is thus contained in type IX as a special case.
Similarly the space of constant negative curvature is contained as a special case in type V.
This is easily seen by transforming the structure constants of this group by the
substitution $X_2 + X_3 = X'_2$, $X_2 - X_3 = X'_3$, $X_1 = X'_1$. Then $[X'_1, X'_2] = X'_2$, $[X'_2, X'_3] = 0$,
$[X'_3, X'_1] = -X'_3$, and if one puts $\gamma_{ab} = a^2\delta_{ab}$, the Ricci tensor becomes $P_{ab} = -2\delta_{ab}$,
$P_{\alpha\beta} = -(2/a^2)\gamma_{\alpha\beta}$, which corresponds to a space of constant negative curvature.
§ 113. Oscillating regime of approach to a singular point
On the model of a universe with a homogeneous space of type IX we shall study the time
singularity of the metric, whose character is basically different from that of the singularity
in the homogeneous and isotropic model (V. A. Belinskii, I. M. Khalatnikov, E. M. Lifshitz,
1969; C. W. Misner, 1969). We shall see in the next section that such a situation has
a very general significance.
We shall be interested in the behavior of the model near the singularity (which we choose
as the time origin t = 0). We shall see later that the presence of matter does not affect the
qualitative properties of this behavior. For simplicity we shall therefore assume at first
† The parameter a runs through all positive values. The corresponding types are actually a one-parameter
family of different groups.
From given structure constants one can find the basis vectors by solving the differential equations (112.10).
They have been given for all types (together with the corresponding Killing vectors) in the paper of A. H.
Taub, Ann. Math. 53, 472, 1951.
that the space is empty. A physical singularity for such a space means that the invariants
of the four-dimensional curvature tensor go to infinity at t = 0.
We take the quantities $\gamma_{ab}(t)$ in (112.2) to be diagonal, denoting the diagonal elements by
$a^2$, $b^2$, $c^2$; we here denote the three frame vectors $e^1$, $e^2$, $e^3$ by l, m, n. Then the spatial
metric is written as:
$\gamma_{\alpha\beta} = a^2 l_\alpha l_\beta + b^2 m_\alpha m_\beta + c^2 n_\alpha n_\beta$. (113.1)
For a space of type IX the structure constants are:†
$C^1_{23} = C^2_{31} = C^3_{12} = 1$. (113.2)
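From (112.16-18) it can be seen that for these constants and a diagonal matrix $\gamma_{ab}$, the
components $R^0_l$, $R^0_m$, $R^0_n$, $R^m_l$, $R^n_l$, $R^n_m$ of the Ricci tensor vanish identically in the synchronous
reference system. The remaining components of the Einstein equations give the following
system of equations for the functions a, b, c:
$\frac{(\dot a b c)^{\cdot}}{abc} = \frac{1}{2a^2b^2c^2}\left[(\mu b^2 - \nu c^2)^2 - \lambda^2 a^4\right]$,
$\frac{(a \dot b c)^{\cdot}}{abc} = \frac{1}{2a^2b^2c^2}\left[(\lambda a^2 - \nu c^2)^2 - \mu^2 b^4\right]$, (113.3)
$\frac{(a b \dot c)^{\cdot}}{abc} = \frac{1}{2a^2b^2c^2}\left[(\lambda a^2 - \mu b^2)^2 - \nu^2 c^4\right]$,
$\frac{\ddot a}{a} + \frac{\ddot b}{b} + \frac{\ddot c}{c} = 0$. (113.4)
[Equations (113.3) are the equation set $R^l_l = R^m_m = R^n_n = 0$; equation (113.4) is the equation
$R^0_0 = 0$.] The letters λ, μ, ν here denote the structure constants $C^1_{23}$, $C^2_{31}$, $C^3_{12}$; although
they are set equal to 1 everywhere from now on, they here illustrate the origin of the different
terms in the equations.
The time derivatives in the system (113.3-4) take on a simpler form if we introduce in
place of the functions a, b, c, their logarithms α, β, γ:
$a = e^\alpha$, $b = e^\beta$, $c = e^\gamma$, (113.5)
and in place of t, the variable τ:
$dt = abc\, d\tau$. (113.6)
Then:
$2\alpha_{\tau\tau} = (b^2 - c^2)^2 - a^4$,
$2\beta_{\tau\tau} = (a^2 - c^2)^2 - b^4$, (113.7)
$2\gamma_{\tau\tau} = (a^2 - b^2)^2 - c^4$;
$\tfrac{1}{2}(\alpha + \beta + \gamma)_{\tau\tau} = \alpha_\tau\beta_\tau + \alpha_\tau\gamma_\tau + \beta_\tau\gamma_\tau$, (113.8)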
† The frame vectors corresponding to these constants are:
$\mathbf{l} = (\sin x^3, -\cos x^3 \sin x^1, 0)$, $\mathbf{m} = (\cos x^3, \sin x^3 \sin x^1, 0)$, $\mathbf{n} = (0, \cos x^1, 1)$.
The element of volume is:
$dV = \sqrt{\gamma}\, dx^1 dx^2 dx^3 = abc \sin x^1\, dx^1 dx^2 dx^3$.
The coordinates run through values in the ranges $0 \le x^1 \le \pi$, $0 \le x^2 \le 2\pi$, $0 \le x^3 \le 4\pi$. The space is closed,
and its volume is $V = 16\pi^2 abc$ (when a = b = c it goes over into a space of constant positive curvature with
radius of curvature 2a).
where the subscript τ denotes differentiation with respect to τ. Adding equations (113.7)
and replacing the sum of second derivatives on the left by (113.8), we obtain:
$\alpha_\tau\beta_\tau + \alpha_\tau\gamma_\tau + \beta_\tau\gamma_\tau = \tfrac{1}{4}\left(a^4 + b^4 + c^4 - 2a^2b^2 - 2a^2c^2 - 2b^2c^2\right)$. (113.9)
This relation contains only first derivatives, and is a first integral of the equations (113.7).
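That (113.9) is preserved by the equations (113.7) can be illustrated by direct numerical integration. The sketch below is only an illustration under stated assumptions: it uses a classical fourth-order Runge-Kutta step written out by hand, the function names are hypothetical, and the starting values (a = b = c = 1 with τ-derivatives 1, 0, -3/4) are chosen here simply because they satisfy (113.9) exactly.

```python
import math

def rhs(state):
    """First-order form of (113.7).
    state = (alpha, beta, gamma, alpha_tau, beta_tau, gamma_tau)."""
    al, be, ga, dal, dbe, dga = state
    a2, b2, c2 = math.exp(2 * al), math.exp(2 * be), math.exp(2 * ga)
    return (dal, dbe, dga,
            0.5 * ((b2 - c2) ** 2 - a2 ** 2),
            0.5 * ((a2 - c2) ** 2 - b2 ** 2),
            0.5 * ((a2 - b2) ** 2 - c2 ** 2))

def constraint(state):
    """Left side minus right side of (113.9); zero on solutions of interest."""
    al, be, ga, dal, dbe, dga = state
    a2, b2, c2 = math.exp(2 * al), math.exp(2 * be), math.exp(2 * ga)
    kinetic = dal * dbe + dal * dga + dbe * dga
    potential = 0.25 * (a2 ** 2 + b2 ** 2 + c2 ** 2
                        - 2 * a2 * b2 - 2 * a2 * c2 - 2 * b2 * c2)
    return kinetic - potential

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the variable tau."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def integrate(state, h, steps):
    """Advance the state by `steps` Runge-Kutta steps."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Illustrative initial data (an assumption of this sketch) obeying (113.9).
start = (0.0, 0.0, 0.0, 1.0, 0.0, -0.75)
end = integrate(start, 1e-3, 500)
```

The value of `constraint(end)` stays at the level of the integration error, while the state itself changes appreciably over the same interval.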
Equations (113.3-4) cannot be solved exactly in analytic form, but permit a detailed
qualitative study.
We note first that if the right sides of equations (113.3) were absent, the system would have
an exact solution, in which
$a \sim t^{p_l}$, $b \sim t^{p_m}$, $c \sim t^{p_n}$, (113.10)
where $p_l$, $p_m$, and $p_n$ are numbers connected by the relations
$p_l + p_m + p_n = p_l^2 + p_m^2 + p_n^2 = 1$ (113.11)
[the analog of the Kasner solution (103.9)]. We have denoted the exponents by $p_l$, $p_m$, $p_n$,
without assuming any order of their size; we shall retain the notation $p_1$, $p_2$, $p_3$ of § 103
for the triple of numbers arranged in the order $p_1 < p_2 < p_3$ and taking on values in the
intervals (103.10a) respectively. These numbers can be written in parametric form as
$p_1(s) = \frac{-s}{1+s+s^2}$, $p_2(s) = \frac{1+s}{1+s+s^2}$, $p_3(s) = \frac{s(1+s)}{1+s+s^2}$. (113.12)
All the different values of the $p_1$, $p_2$, $p_3$ (preserving the assumed order) are obtained if the
parameter s runs through values in the range $s \ge 1$. The values s < 1 are reduced to this
same region as follows:
$p_1(1/s) = p_1(s)$, $p_2(1/s) = p_3(s)$, $p_3(1/s) = p_2(s)$. (113.13)
Let us assume that within some time interval the right sides of equations (113.3) are
small, so that they can be neglected and we have the "Kasner-like" regime (113.10). To be
specific, let us suppose that the exponent in the function a is negative: $p_l = p_1 < 0$. We shall
follow the evolution of the metric in the direction of decreasing t.
The left sides of (113.3) have a "potential" order of magnitude $\sim t^{-2}$. Noting that in the
regime (113.10), $abc \sim t$, we see that on the right sides, all the terms increase (for $t \to 0$)
more slowly than $t^{-2}$, except for the terms $a^4/a^2b^2c^2 \sim t^{-2}\, t^{-4|p_1|}$. These are the terms that
will play the role of a perturbation that destroys the Kasner regime. The terms $a^4$ on the right
sides of (113.7) correspond to them. Keeping only these terms, we find
$\alpha_{\tau\tau} = -\tfrac{1}{2}e^{4\alpha}$, $\beta_{\tau\tau} = \gamma_{\tau\tau} = \tfrac{1}{2}e^{4\alpha}$. (113.14)
To the "initial" state† (113.10) there correspond the conditions
$\alpha_\tau = p_l$, $\beta_\tau = p_m$, $\gamma_\tau = p_n$.
The first of equations (113.14) has the form of the equation of one-dimensional motion
of a particle in the field of an exponential potential wall, where α plays the role of the
coordinate. In this analogy, to the initial Kasner regime there corresponds a free motion with
constant velocity $\alpha_\tau = p_l$. After reflection from the wall, the particle will again move freely
with the opposite sign of the velocity: $\alpha_\tau = -p_l$. We also note that from equations (113.14),
$\alpha_\tau + \beta_\tau = \text{const}$, and $\alpha_\tau + \gamma_\tau = \text{const}$, hence we find that $\beta_\tau$ and $\gamma_\tau$ take the values
$\beta_\tau = p_m + 2p_l$, $\gamma_\tau = p_n + 2p_l$.
† We emphasize once again that we are considering the evolution of the metric as $t \to 0$; thus the "initial"
conditions refer to a later, and not an earlier time.
Now determining α, β, γ, and then t, using (113.6), we find
$e^\alpha \sim e^{-p_l\tau}$, $e^\beta \sim e^{(p_m + 2p_l)\tau}$, $e^\gamma \sim e^{(p_n + 2p_l)\tau}$,
i.e.
$a \sim t^{p'_l}$, $b \sim t^{p'_m}$, $c \sim t^{p'_n}$,
where
$p'_l = \frac{|p_l|}{1 - 2|p_l|}$, $p'_m = -\frac{2|p_l| - p_m}{1 - 2|p_l|}$, $p'_n = \frac{p_n - 2|p_l|}{1 - 2|p_l|}$. (113.15)
If we had $p_l < p_m < p_n$, $p_l < 0$, then now $p'_m < p'_l < p'_n$, $p'_m < 0$; the function b, which was
decreasing (for $t \to 0$) begins to increase, the rising function a now drops, while the function
c continues to fall. The perturbation itself [$\sim a^4$ in (113.7)], which previously was increasing,
now damps out.
The law of change of the exponents (113.15) is conveniently represented using the
parametrization (113.12): if
$p_l = p_1(s)$, $p_m = p_2(s)$, $p_n = p_3(s)$,
then
$p'_l = p_2(s-1)$, $p'_m = p_1(s-1)$, $p'_n = p_3(s-1)$. (113.16)
The larger of the two positive exponents remains positive.
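The equivalence of the replacement rule (113.15) and the shift s → s - 1 of (113.16) can be checked with exact rational arithmetic. The following is a small sketch; the function names `kasner` and `bounce` are hypothetical, and the sample value s = 7/2 is an arbitrary choice made here.

```python
from fractions import Fraction

def kasner(s):
    """Exponents (p_1, p_2, p_3) from the parametrization (113.12)."""
    D = 1 + s + s * s
    return (-s / D, (1 + s) / D, s * (1 + s) / D)

def bounce(p_l, p_m, p_n):
    """New exponents (p'_l, p'_m, p'_n) from (113.15); p_l must be the negative one."""
    q = 1 - 2 * abs(p_l)
    return (abs(p_l) / q, -(2 * abs(p_l) - p_m) / q, (p_n - 2 * abs(p_l)) / q)

s = Fraction(7, 2)                 # arbitrary sample value with s > 1
p_l, p_m, p_n = kasner(s)          # p_l = p_1(s) < 0
new = bounce(p_l, p_m, p_n)
shifted = kasner(s - 1)            # (p_1, p_2, p_3) evaluated at s - 1
```

Here `new` coincides with `(shifted[1], shifted[0], shifted[2])`, which is (113.16): the negative exponent has moved from the direction l to the direction m.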
Thus the action of the perturbation results in the replacement of one Kasner regime by
another, with the negative power shifting from the direction l to the direction m. Further
evolution of the metric leads in an analogous way to an increase in the perturbation given
by the terms $\sim b^4$ in (113.7), another shift of the Kasner regime, etc.
The successive shifts (113.16) with bouncing of the negative exponent $p_1$ between the
directions l and m continue so long as s remains greater than 1. Values s < 1 are transformed
into s > 1 according to (113.13); at this moment either $p_l$ or $p_m$ is negative, while $p_n$ is the
smaller of the two positive numbers ($p_n = p_2$). The following series of shifts will now bounce
the negative exponent between the directions n and l or between n and m. For an arbitrary
(irrational) initial value of s, the process of shifting continues without end.
In an exact solution of the equations, the powers $p_l$, $p_m$, $p_n$ will, of course, lose their literal
meaning. But the regularities in the shifting of exponents allow one to conclude that the
course of change of the metric as we approach the singularity will have the following
qualitative properties. The process of evolution of the metric is made up of successive
periods (we shall call them eras), during which distances along two of the axes oscillate,
while distances along the third axis decrease. On going from one era to the next, the
direction along which distances decrease monotonically bounces from one axis to another. The
order of this bouncing acquires asymptotically the character of a random process.
The successive eras crowd together as we approach t = 0. But the natural variable for
describing the behavior of this time evolution appears to be not the time t, but its logarithm,
ln t, in terms of which the whole process of approach to the singularity is stretched out
to -∞.
The qualitative analysis presented above must, however, be supplemented with respect to
the following point.
In this analysis there correspond to the n-th era values of the numbers $s^{(n)}$ starting from
some largest value $s^{(n)}_{\max}$ down to some smallest, $s^{(n)}_{\min} < 1$. The length of the era (as measured
by the number of oscillations) is the integer $s^{(n)}_{\max} - s^{(n)}_{\min}$. For the next era, $s^{(n+1)}_{\max} = 1/s^{(n)}_{\min}$.
In the infinite sequence of numbers formed in this way one will find arbitrarily small (but
never zero) values of $s^{(n)}_{\min}$ and correspondingly arbitrarily large values of $s^{(n+1)}_{\max}$; such values
correspond to "long" eras. But to large values of the parameter s there correspond exponents
($p_1$, $p_2$, $p_3$) close to the values (0, 0, 1). Two of the exponents which are close to zero are
thus close to one another, hence also the laws of the change of two of the functions a, b, c,
are close to one another. If in the beginning of such a long era these two functions happen
to be close also in their absolute magnitude, they will remain so during the larger
part of the entire era. In such a case it becomes necessary to keep not one term ($a^4$) on
the right sides of (113.7), but two terms.
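The succession of era lengths described above can be generated directly from the rule $s^{(n+1)}_{\max} = 1/s^{(n)}_{\min}$. The sketch below is illustrative only: the function name `era_lengths` and the cutoff `max_eras` are assumptions of this example, and a rational starting value is used so that the arithmetic is exact. For such a value the sequence terminates, whereas for an irrational value it continues without end, as stated in the text.

```python
from fractions import Fraction
from math import floor

def era_lengths(s_max, max_eras=20):
    """Lengths of successive eras for a starting value s_max >= 1."""
    lengths = []
    for _ in range(max_eras):
        k = floor(s_max)        # integer s_max - s_min: oscillations in this era
        s_min = s_max - k       # smallest value of s in the era, 0 <= s_min < 1
        lengths.append(k)
        if s_min == 0:          # rational start: no further era is defined
            break
        s_max = 1 / s_min       # starting value of s for the next era
    return lengths

example = era_lengths(Fraction(415, 93))
```

The era lengths are the partial quotients of the continued-fraction expansion of the starting value; a very small $s_{\min}$ in one era produces a long era next, which is the case analyzed in what follows.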
Let c be that one of the functions a, b, c that decreases monotonically in the course of a
long era. It then rapidly becomes smaller than the other two; let us consider the solution
of equations (113.7-8) in just that region of the variable τ where we can neglect c compared
to a and b. Let the upper limit of this region be $\tau = \tau_0$.
In this case the first two equations of (113.7) give
$\alpha_{\tau\tau} + \beta_{\tau\tau} = 0$, (113.17)
$\alpha_{\tau\tau} - \beta_{\tau\tau} = -e^{4\alpha} + e^{4\beta}$, (113.18)
while for the third equation we use (113.9), which gives:
$\gamma_\tau(\alpha_\tau + \beta_\tau) = -\alpha_\tau\beta_\tau + \tfrac{1}{4}\left(e^{2\alpha} - e^{2\beta}\right)^2$. (113.19)
We write the solution of (113.17) in the form
$\alpha + \beta = \frac{2a_0^2}{\xi_0}(\tau - \tau_0) + 2\ln a_0$,
where $a_0$ and $\xi_0$ are positive constants. It will be convenient to introduce in place of τ a
new variable
$\xi = \xi_0 \exp\left[\frac{2a_0^2}{\xi_0}(\tau - \tau_0)\right]$. (113.20)
Then
$\alpha + \beta = \ln\frac{\xi}{\xi_0} + 2\ln a_0$. (113.21)
We also transform equations (113.18-19), introducing the notation $\chi = \alpha - \beta$:
$\chi_{\xi\xi} + \frac{1}{\xi}\chi_\xi + \tfrac{1}{2}\sinh 2\chi = 0$, (113.22)
$\gamma_\xi = -\frac{1}{4\xi} + \frac{\xi}{8}\left(2\chi_\xi^2 + \cosh 2\chi - 1\right)$. (113.23)
To the decrease of t from ∞ to 0 there corresponds the drop of τ from ∞ to -∞;
correspondingly ξ drops from ∞ to 0. As we shall see later, a long era is obtained if $\xi_0$ (the value
of ξ corresponding to the instant $\tau = \tau_0$) is a very large quantity. We shall consider the
solution of equations (113.22-23) in the two regions $\xi \gg 1$ and $\xi \ll 1$.
For large ξ the solution of (113.22) in first approximation (in 1/ξ) is:
$\chi = \alpha - \beta = \frac{2A}{\sqrt{\xi}}\sin(\xi - \xi_0)$ (113.24)
(where A is a constant); the factor $1/\sqrt{\xi}$ makes χ a small quantity, so that we can make the
substitution $\sinh 2\chi \approx 2\chi$ in (113.22). From (113.23) we now find:
$\gamma_\xi \approx \tfrac{1}{4}\xi\left(\chi_\xi^2 + \chi^2\right) = A^2$, $\gamma = A^2(\xi - \xi_0) + \text{const}$.
Having determined α and β from (113.21) and (113.24) and expanded $e^\alpha$ and $e^\beta$ in accordance
with our approximation, we finally obtain:
$a, b = a_0\sqrt{\frac{\xi}{\xi_0}}\left[1 \pm \frac{A}{\sqrt{\xi}}\sin(\xi - \xi_0)\right]$, (113.25)
$c = c_0\, e^{-A^2(\xi_0 - \xi)}$.
The relation of ξ to the time t is obtained by integrating the defining equation (113.6), and is
given by the formula
$t/t_0 = e^{-A^2(\xi_0 - \xi)}$. (113.26)
The constant $c_0$ (the value of c when $\xi = \xi_0$) must satisfy $c_0 \ll a_0$.
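The integration leading to (113.26) can be written out explicitly. The lines below are a sketch using only (113.6), (113.20), (113.21) and the expression for c in (113.25); setting the additive constant of integration to zero, and the resulting identification of $t_0$ with $c_0/2A^2$, are assumptions of this sketch rather than statements taken from the text.

```latex
% From (113.21): ab = e^{\alpha+\beta} = a_0^2\,\xi/\xi_0 .
% From (113.20): d\tau = \frac{\xi_0}{2a_0^2}\,\frac{d\xi}{\xi} .
\begin{align*}
dt &= abc\,d\tau
    = a_0^2\frac{\xi}{\xi_0}\; c_0\,e^{-A^2(\xi_0-\xi)}\;
      \frac{\xi_0}{2a_0^2}\,\frac{d\xi}{\xi}
    = \frac{c_0}{2}\,e^{-A^2(\xi_0-\xi)}\,d\xi ,\\
t  &= \frac{c_0}{2A^2}\,e^{-A^2(\xi_0-\xi)}
      \quad\text{(additive constant set to zero)},\\
\frac{t}{t_0} &= e^{-A^2(\xi_0-\xi)}, \qquad t_0 = \frac{c_0}{2A^2}.
\end{align*}
```

Comparison with the expression for c in (113.25) then gives $c = c_0\,t/t_0$, the linear law of decrease used below.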
We now turn to the region $\xi \ll 1$. Here the leading terms in the solution of (113.22) are:
$\chi = \alpha - \beta = k\ln\xi + \text{const}$, (113.27)
where k is a constant lying in the range $-1 < k < +1$; this condition assures the smallness
of the last term in (113.22) (since $\sinh 2\chi$ contains $\xi^{2k}$ and $\xi^{-2k}$) compared to the first two
($\sim \xi^{-2}$). Having determined α and β from (113.27) and (113.21), γ from (113.23) and t from
(113.6), we get
$a \sim \xi^{(1+k)/2}$, $b \sim \xi^{(1-k)/2}$, $c \sim \xi^{-(1-k^2)/4}$, $t \sim \xi^{(3+k^2)/4}$. (113.28)
This is again a Kasner regime, where the negative power of t appears in the function c(t).
Thus we arrive again at a picture of the same qualitative character. Over a long period
of time (corresponding to large decreasing values of ξ) two of the functions (a and b)
oscillate, while equation (113.25) shows, in addition, that these oscillations proceed on
the background of a slow ($\sim \sqrt{\xi}$) fall off of their mean values. Throughout all this time
the functions a and b remain close in value. The third function c falls monotonically, the
decrease following the law $c = c_0 t/t_0$. This evolution lasts until $\xi \sim 1$, when formulas
(113.25-26) become inapplicable. After this, as we see from (113.28), the decreasing
function c begins to rise, and the functions a and b drop. This will continue until the terms
$\sim c^2/a^2b^2$ on the right sides of (113.3) become $\sim t^{-2}$, when the next series of oscillations
begins.
These qualitative features of the behavior of the metric near a singular point are not
changed by the presence of matter; near the singularity the matter can be "written into"
the metric of empty space, neglecting its back reaction on the gravitational field. In other
words, the evolution of the matter is determined simply by the equations of its motion in
the given field. These equations are the hydrodynamic equations
$\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^i}\left(\sqrt{-g}\,\sigma u^i\right) = 0$ (113.29)
(cf. Fluid Mechanics, § 126). Here σ is the entropy density; near the singularity we must use
the ultrarelativistic equation of state $p = \varepsilon/3$, when $\sigma \sim \varepsilon^{3/4}$.
Applying these equations to the motion of matter in the Kasner metric and the metric
(113.25), we find that in both cases the energy density increases monotonically (cf. the
Problems). This proves that the energy density tends to infinity (when $t \to 0$) in this model.
PROBLEMS
1. Find the law of variation with time of the density of matter, uniformly distributed in a space
with the metric (103.9), for small t.
Solution: Denote the time factors in (103.9) by $a = t^{p_1}$, $b = t^{p_2}$, $c = t^{p_3}$. Since all quantities
depend only on the time, and $\sqrt{-g} = abc$, equations (113.29) give
$\frac{d}{dt}\left(abc\,u_0\,\varepsilon^{3/4}\right) = 0$, $4\varepsilon\frac{du_\alpha}{dt} + u_\alpha\frac{d\varepsilon}{dt} = 0$.
Then
$abc\,u_0\,\varepsilon^{3/4} = \text{const}$, (1)
$u_\alpha\,\varepsilon^{1/4} = \text{const}$. (2)
According to (2) all the covariant components $u_\alpha$ have the same order of magnitude. Among the
contravariant components, the largest (when $t \to 0$) is $u^3 = u_3/c^2$. Keeping only the largest terms
in the identity $u_i u^i = 1$, we get $u_0^2 \approx u_3 u^3 = (u_3)^2/c^2$, and then, from (1) and (2),
$\varepsilon \sim 1/a^2b^2$, $u_\alpha \sim \sqrt{ab}$, (3)
or
$\varepsilon \sim t^{-2(p_1+p_2)} = t^{-2(1-p_3)}$, $u_\alpha \sim t^{(1-p_3)/2}$.
As it should, ε goes to infinity when $t \to 0$ for all values of $p_3$ except $p_3 = 1$, in accordance with the
fact that the singularity in the metric with exponents (0, 0, 1) is unphysical.
The validity of this approximation is verified by estimating the components $T^k_i$ omitted on the
right sides of (103.3-4). The leading terms are:
$T^0_0 \sim \varepsilon u_0^2 \sim t^{-(1+p_3)}$, $T^1_1 \sim \varepsilon \sim t^{-2(1-p_3)}$,
$T^2_2 \sim \varepsilon u_2 u^2 \sim t^{-(1+2p_2-p_3)}$, $T^3_3 \sim \varepsilon u_3 u^3 \sim t^{-(1+p_3)}$.
They all actually increase more slowly, when $t \to 0$, than the left sides of the equations, which
increase like $t^{-2}$.
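The closing estimate, that every listed component grows more slowly than $t^{-2}$, can be checked for sample Kasner exponents. This is a minimal sketch with hypothetical function names; the exponents are generated from the parametrization (113.12) of § 113, and the sample values of s are arbitrary choices made here.

```python
from fractions import Fraction

def kasner(s):
    """Ordered exponents (p_1, p_2, p_3) from (113.12), for s >= 1."""
    D = 1 + s + s * s
    return (-s / D, (1 + s) / D, s * (1 + s) / D)

def stress_powers(p1, p2, p3):
    """Powers k in T ~ t^(-k) for the four leading components of the solution."""
    return {"T00": 1 + p3, "T11": 2 * (1 - p3),
            "T22": 1 + 2 * p2 - p3, "T33": 1 + p3}

def density_power(p3):
    """Power k in epsilon ~ t^(-k), from result (3) of the solution."""
    return 2 * (1 - p3)

samples = [Fraction(1), Fraction(3, 2), Fraction(5), Fraction(40)]
slower_than_t2 = all(k < 2
                     for s in samples
                     for k in stress_powers(*kasner(s)).values())
```

For all sampled s the powers stay below 2, so the neglected components are indeed small compared with the left sides, while `density_power` remains positive as long as $p_3 < 1$.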
2. The same problem for the metric (113.25).
Solution: With the functions a and b from (113.25), we find from (3): $\varepsilon \sim \xi^{-2}$. Through the whole
time when ξ varies from $\xi_0$ to $\xi \sim 1$, the density increases by a factor $\xi_0^2$. Considering the connection
(113.26) between ξ and t, this means an increase by the factor $\ln^2(t_0/t_1)$, where $t_0$ and $t_1$ are the
upper and lower limits of the era in terms of the time t.
§ 114. The character of the singularity in the general cosmological solution of the gravitational
equations
The adequacy of the isotropic model for describing the present state of the Universe is
no basis for expecting that it is equally suitable for describing its early stages of evolution.
One may even ask to what extent the existence of a time singularity (i.e. finiteness of the
time) is a necessary general property of cosmological models, or whether it is really caused
by the specific simplifying assumptions on which the models are based.
If the presence of the singularity were independent of these assumptions, it would mean
that it is inherent not only to special solutions, but also to the general solution of the
gravitational equations.† Finding such a solution in exact form, for all space and over all time, is
clearly impossible. But to solve our problem it is sufficient to study the form of the solution
only near the singularity. The criterion of generality of the solution is the number of
"physically arbitrary" functions of the space coordinates contained in it. In the general
solution the number of such functions must be sufficient for arbitrary assignment of initial
conditions at any chosen time [4 for empty space, 8 for space filled with matter (cf. § 95)].‡
The singularity of the Friedmann solution for t = 0 is characterized by the fact that the
vanishing of spatial distances occurs according to the same law in all directions. This type
of singularity is not, however, sufficiently general: it is typical of a class of solutions that
contain only three physically arbitrary coordinate functions (cf. the problem in § 109). We
also note that these solutions exist only for a space filled with matter.
The singularity that is characteristic of the Kasner solution (103.9) has a much more
general character.§ It belongs to the class of solutions in which the leading terms in the
expansion of the spatial metric tensor (in the synchronous reference system) near the singular
point t = 0 have the form
$\gamma_{\alpha\beta} = t^{2p_l} l_\alpha l_\beta + t^{2p_m} m_\alpha m_\beta + t^{2p_n} n_\alpha n_\beta$, (114.1)
where l, m and n are three vector functions of the coordinates, while $p_l$, $p_m$ and $p_n$ are
coordinate functions related by the two equations (113.11). For the metric (114.1), the equation
$R^0_0 = 0$ for the field in vacuum is satisfied automatically in its leading terms. Satisfying the
equations $R^\beta_\alpha = 0$ requires fulfilment of the additional condition
$\mathbf{l} \cdot \operatorname{curl} \mathbf{l} = 0$ (114.2)
for that one of the vectors l, m, n that has a negative power in (114.1) (which we have chosen
to be $p_l = p_1 < 0$). The origin of this condition can be followed using the equations
(113.3) of the preceding section which correspond to a definite choice of the vectors l, m, n.
These equations could have the solution (113.10) valid down to t = 0 only under the
condition λ = 0, when on the right sides of the equations the terms $\lambda^2 a^2/2b^2c^2$, that grow
faster than $t^{-2}$ for $t \to 0$, would vanish. But according to (112.15), the requirement that the
structure constant $\lambda = C^1_{23}$ vanish precisely implies the condition (114.2).
As for the equations $R^0_\alpha = 0$, which contain only first time derivatives of the components
of the tensor $\gamma_{\alpha\beta}$, they lead to three more relations (not containing the time) that must be
imposed on the coordinate functions in (114.1). Together with (114.2) there are thus all
† When we speak of a singularity in the cosmological solution we have in mind a singularity that is
attainable in all of the space (and not over some restricted part, as in the gravitational collapse of a finite
body).
‡ We emphasize that for a system of nonlinear equations, such as the Einstein equations, the notion of a
general solution is not unambiguous. In principle more than one general integral may exist, each of the
integrals covering not the entire manifold of conceivable initial conditions, but only some finite part of it.
Each such integral will contain the whole required set of arbitrary functions, but they may be subject to
specific conditions in the form of inequalities. The existence of a general solution possessing a singularity
does not therefore preclude the existence of other solutions that do not have a singularity.
§ This section gives only a general outline of the situation. For a more detailed presentation, cf. I. M.
Khalatnikov and E. M. Lifshitz, Adv. in Phys. 12, 185, 1963; V. A. Belinskii, I. M. Khalatnikov, and
E. M. Lifshitz, Adv. in Phys. 1970.
together four conditions. These conditions connect ten different coordinate functions: three
components of each of the three vectors l, m and n, and one of the functions that appears as
a power of t [any one of the three functions $p_l$, $p_m$ and $p_n$, which are related by the two
equations (113.11)]. In determining the number of physically arbitrary functions we must also
remember that the synchronous reference system used still allows arbitrary transformations
of the three spatial coordinates, not affecting the time. Thus the solution (114.1) contains
altogether 10 - 4 - 3 = 3 physically arbitrary functions, which is one fewer than required
for the general solution in empty space.
The degree of generality achieved is not reduced when matter is introduced: the matter is
"written into" the metric (114.1) with its four new coordinate functions needed for assigning
the initial distribution of the matter density and the three velocity components (cf. the
problem in § 113).
Of the four conditions that must be imposed on the coordinate functions in (114.1), the
three conditions that arise from the equations $R^0_\alpha = 0$ are "natural"; they follow from the
very structure of the equations of gravitation. The imposition of the additional condition
(114.2) results in the "loss" of one arbitrary function.
By definition the general solution is completely stable. Application of any perturbation is
equivalent to changing the initial conditions at some moment of time, but since the general
solution admits arbitrary initial conditions, the perturbation cannot change its character.
But for the solution (114.1) the presence of the restrictive condition (114.2) means, in other
words, instability with respect to perturbations that violate this condition. The application
of such a perturbation should carry the model into a different regime, which ipso facto
will be completely general.
This is precisely the study made in the previous section for the special case of the
homogeneous model. The structure constants (113.2) mean precisely that for the homogeneous
space of type IX all three products $\mathbf{l} \cdot \operatorname{curl} \mathbf{l}$, $\mathbf{m} \cdot \operatorname{curl} \mathbf{m}$, $\mathbf{n} \cdot \operatorname{curl} \mathbf{n}$ are different from zero
[cf. (112.15)]. Thus the condition (114.2) cannot be fulfilled, no matter which direction we
assign the negative power of the time. The discussion given in § 113 of the equations
(113.3-4) consisted in an explanation of the effects produced on the Kasner regime by the
perturbation associated with a nonvanishing $\lambda = (\mathbf{l} \cdot \operatorname{curl} \mathbf{l})/v$.
Although the investigation of a special case cannot exhibit all the details of the general
case, it does give a basis for concluding that the singularity in the general cosmological
solution has the oscillating character described in § 113. We emphasize once more that this
character is not related to the presence of matter, and is already a feature of empty
space-time itself.
The oscillating regime of approach to the singularity gives a whole new aspect to the
concept of finiteness of time. An infinite set of oscillations are included between any finite
moment of world time t and the moment t = 0. In this sense the process has infinite
character. Instead of the time t, a more natural variable (as already noted in § 113) appears to be
the logarithm ln t, in terms of which the process is stretched out to -∞.
We have spoken throughout of the direction of approach to the singularity as the
direction of decreasing time; but in view of the symmetry of the equations of gravitation under
time reversal, we could equally well have talked of an approach to the singularity in the
direction of increasing time. Actually, however, because of the physical inequivalence of
future and past, there is an essential difference between these two cases with respect to the
formulation of the problem. A singularity in the future can have physical meaning only if it
is attainable from arbitrary initial conditions, assigned at any previous instant of time. It is
clear that there is no reason why the distribution of matter and field that is attainable at
some instant in the process of evolution of the Universe should correspond to the specific
conditions required for the existence of some particular solution of the gravitational
equations.
As for the question of the type of singularity in the past, an investigation based solely on
the equations of gravitation can hardly give an unambiguous answer. It is natural to think
that the choice of the solution corresponding to the real universe is connected with some
profound physical requirements, whose establishment solely on the basis of the present
theory is impossible and whose clarification will come only from a further synthesis of
physical theories. In this sense it could, in principle, turn out that this choice corresponds
to some special (for example, isotropic) type of singularity. Nevertheless, it appears more
natural a priori to suppose that, in view of the general character of the oscillating regime,
just this regime should describe the early stages of evolution of the universe.
INDEX
Aberration of light 13
Absolute future 6
Absolute past 6
Action function 24, 67, 266
Adiabatic invariant 54
Airy function 149, 183, 200
Angular momentum 40, 79
Antisymmetric tensor 16, 21
Astigmatism 136
Asymptotic series 152
Axial vector 18, 47
Babinet's principle 155
Bessel functions 182
Bianchi classification 360
Bianchi identity 261
Binding energy 30
Biot-Savart law 103
Bremsstrahlung 179
magnetic 197
Caustic 133, 148
Center of inertia 41
Center-of-mass system 31
Centrally symmetric gravitational field 282, 287
Characteristic vibrations 141
Charge 44
density 69
Christoffel symbols 238
Circular polarization 114
Circulation 67
Classical mechanics 2
Closed model 336
Coherent scattering 220
Collapse 296
Combinational scattering 220
Comoving reference system 293, 294, 301, 336
Conformal-galilean system 341
Contraction
of a field 92
of a tensor 16
Contravariant
derivative 239
tensor 16, 229
vector 14
Coriolis force 252, 321
Coulomb field 88
Coulomb law 88
Covariant
derivative 236
tensor 16, 229, 313
vector 14
Cross section 34, 215
C-system 31
Current four-vector 69
Curvature tensor 258, 295
canonical forms of 264
Curved space-time 227
Curvilinear coordinates 229
D'Alembert equation 109
D'Alembertian 109
Decay of particles 30
Degree of polarization 121
Delta function 29, 70
Depolarization, coefficient 122
Diffraction 145
Dipole
moment 96
radiation 173
Displacement current 75
Distribution function 29
Doppler effect 116, 254
Drift velocity 55, 59
Dual tensor 17
Dustlike matter 301
Effective radiation 177
Eikonal 129, 145
angular 134
equation 130
Elastic collision 36
Electric dipole moment 96
Electric field intensity 47
Electromagnetic field tensor 60
Electromagnetic waves 108
Electromotive force 67
Electrostatic energy 89
Electrostatic field 88
Element of spatial distance 233
Elementary particles 26, 43
Elliptical polarization 115
Energy 26
density 75
flux 75
Energy-momentum pseudotensor 304
Energy-momentum tensor 77, 80, 268
for macroscopic bodies 97
Equation of continuity 71
Era 363
Euler constant 185
Events 3
Exact solutions of gravitational equations 314
Fermat's principle 132, 252
Field
constant 50
Lorentz transformation of 62
quasiuniform 54
uniform electric 52
uniform magnetic 53
Flat space-time 227
Flux 66
Focus 133
Four-acceleration 22
Four-dimensional geometry 4
Four-force 28
Four-gradient 19
Four-momentum 27
Four-potential 45
Four-scalar 15
Four-vector 14
Four-velocity 21
Fourier resolution 119, 124, 125
Frame vector 356
Fraunhofer diffraction 153
Frequency 114
Fresnel diffraction 150, 154
integrals 152
Friedmann solution 333
Galilean system 228
Galileo transformation 9
Gamma function 185
Gauge invariance 49
Gauss' theorem 20
Gaussian curvature 262
Gaussian system of units 69
General cosmological solution 367
Generalized momentum 45
Geodesic line 244
Geometrical optics 129
Gravitational collapse 296
Gravitational constant 266
Gravitational field 225, 274
centrally symmetric 282, 287
Gravitational mass 309
Gravitational potential 226
Gravitational radius 284
Gravitational stability 350
Gravitational waves 311, 352
Group of motions 356
Guiding center 55, 59
Hamiltonian 26
Hamilton-Jacobi equation 28, 46, 94, 112, 130, 349
Hankel function 149
Heaviside system 69
Homocentric bundle 136
Homogeneous space 355
Hubble constant 346
Huygens' principle 146
Hypersurface 19, 73
Impact parameter 95
Incoherent scattering 220
Incoherent waves 122
Inertial mass 309
Inertial system 1
Interval 3
Invariants of a field 63
Isotropic coordinates 287
Isotropic space 333
Jacobi identity 358
Kasner solution 315, 362
Killing equations 269
Killing vector 357
Laboratory system 31
Lagrangian 24, 69
density 77
to fourth order 236, 367
to second order 165
Laplace equation 88
Larmor precession 107
Larmor theorem 105
Legendre polynomials 98
Lens 137
Lie group 358
Lienard-Wiechert potentials 160, 171, 179
Light
aberration of 13
cone 7
pressure 112
Linearly polarized wave 116
Locally-geodesic system 240, 260
Locally-inertial system 244
Longitudinal waves 125
Lorentz condition 109
Lorentz contraction 11
Lorentz force 48
Lorentz frictional force 204
Lorentz gauge 109
Lorentz transformation 9, 62
L-system 31
MacDonald function 183
Macroscopic bodies 85
Magnetic bremsstrahlung 197
Magnetic dipole radiation 188
Magnetic field intensity 47
Magnetic lens 140
Magnetic moment 103
Magnification 142
Mass current vector 83
Mass density 82
Mass quadrupole moment tensor 280
Maupertuis' principle 51, 132
Maxwell equations 66, 73, 254
Maxwell stress tensor 82, 112
Metric
spacetime 227
tensor 16, 230
Mirror 137
Mixed tensor 230
Moment-of-inertia tensor 280
Momentum 25
density 79
four-vector of 27
space 29
Monochromatic wave 114
Multipole moment 97
Natural light 121
Near zone 190
Newtonian mechanics
Newton's law 278
Nicol prism 120
Null vector 15
Observed matter 345
Open model 340
Optic axis 136
Optical path length 134
Optical system 134
Oscillator 55
Parallel translation 237
Partially polarized light 119
Pascal's law 85
Petrov classification 263
Phase 114
Plane wave 110, 129
Poisson equation 88
Polarization
circular 114
elliptic 114
tensor 120
Polar vector 18, 47
Poynting vector 76
Principal focus 138
Principal points 138
Principle
of equivalence 225
of least action 24, 27
of relativity 1
superposition 68
Proper
acceleration 52
length 11
time 7
volume 11
Pseudo-euclidean geometry 4, 10
Pseudoscalar 17, 68
Pseudotensor 17
Quadrupole moment 98
Quadrupole potential 97
Quadrupole radiation 188
Quantum mechanics 90
Quasicontinuous spectrum 200
Radiation
damping 203, 208
of gravitational waves 323
Radius of electron 90
Rays 129
Real image 136
Recession of nebulae 343
Red shift 249, 343
Reference system 1
Renormalization 90, 208
Resolving power 144
Rest energy 26
Rest frame 34
Retarded potentials 158
Ricci tensor 261, 360
Riemann tensor 259
Rotation 253
Rutherford formula 95
Scalar curvature 262
Scalar density 232
Scalar potential 45
Scalar product 15
Scattering 215
Schwarzschild sphere 296
Secular shift
of orbit 321
of perihelion 330
Self-energy 90
Signal velocity 2
Signature 228
Space component 15
Spacelike interval 6
Spacelike vector 15
Spatial
curvature 286
distance 233
metric tensor 250
Spectral resolution 118, 163, 211
Spherical harmonics 98
Static gravitational field 247
Stationary gravitational field 247
Stokes' parameters 122
Stokes' theorem 20
Stress tensor 80
Structure constants 357
Symmetric tensor 16, 21
Synchronization 236
Synchronous reference system 290
Synchrotron radiation 197, 202
Telescopic imaging 139
Tensor 15
angular momentum 41
antisymmetric 16, 21
completely antisymmetric unit 17
contraction of 16
contravariant 16, 229
covariant 16, 229, 313
density 232
dual 17
electromagnetic field 60
hermitian 120
irreducible 98
mixed 230
moment-of-inertia 280
symmetric 16, 21
Thomson formula 216
Time component 15
Timelike interval 5
Transverse vector 15
wave 111
Trochoidal motion 57
Ultrarelativistic region 27, 195, 201
Unit four-tensor 16
Unpolarized light 121
Vector
axial 18, 47
density 232
polar 18, 47
potential 45
Poynting 76
Velocity space 35
Virial theorem 84
Virtual image 136
Wave
equation 108
length 114
packet 131
surface 129
vector 116
zone 170
World
line 4
point 4
time 247
Pergamon Press
Headington Hill Hall, Oxford OX3 0BW
Maxwell House, Fairview Park,
Elmsford, New York 10523
207 Queen's Quay West, Toronto 1
19a Boundary Street, Rushcutters Bay.
NSW 2011, Australia
Vieweg & Sohn GmbH, Burgplatz 1,
Braunschweig
Printed in Great Britain/Bradley
COURSE OF THEORETICAL PHYSICS
by L. D. LANDAU (Deceased) and E. M. LIFSHITZ
Institute of Physical Problems, USSR Academy of Sciences
The complete Course of Theoretical Physics by Landau and Lifshitz, recognized as two of the world's
outstanding physicists, is being published in full by Pergamon Press. It comprises nine volumes,
covering all branches of the subject; translations from the Russian are by leading scientists.
Typical of the many statements made by experts, reviewing the series, are the following:
"The titles of the volumes in this series cover a vast range of topics, and there seems to be little in
physics on which the authors are not very well informed." Nature
"The remarkable nine-volume Course of Theoretical Physics . . . the clearness and accuracy of the
authors' treatment of theoretical physics is well maintained."
Proceedings of the Physical Society
Of individual volumes, reviewers have written:
MECHANICS
"The entire book is a masterpiece of scientific writing. There is not a superfluous sentence and the
authors know exactly where they are going. ... It is certain that this volume will be able to hold its
own amongst more conventional texts in classical mechanics, as a scholarly and economic exposition
of the subject." Science Progress
QUANTUM MECHANICS (Nonrelativistic Theory)
". . . throughout the five hundred large pages, the authors' discussion proceeds with the clarity and
succinctness typical of the very best works on theoretical physics." Technology
FLUID MECHANICS
"The ground covered includes ideal fluids, viscous fluids, turbulence, boundary layers, conduction
and diffusion, surface phenomena and sound. Compressible fluids are treated under the headings of
shock waves, one-dimensional gas flow and flow past finite bodies. There is a chapter on the fluid
dynamics of combustion while unusual topics discussed are relativistic fluid dynamics, dynamics of
superfluids and fluctuations in fluid dynamics ... a valuable addition to any library covering the
mechanics of fluids." Science Progress
THE CLASSICAL THEORY OF FIELDS (Second Edition)
"This is an excellent and readable volume. It is a valuable and unique addition to the literature of
theoretical physics." Science
"The clarity of style, the conciseness of treatment, and the originality and variety of illustrative problems
make this a book which can be highly recommended." Proceedings of the Physical Society
STATISTICAL PHYSICS
". . . stimulating reading, partly because of the clarity and compactness of some of the treatments put
forward, and partly by reason of contrasts with texts on statistical mechanics and statistical
thermodynamics better known to English sciences. . . . Other features attract attention since they do not
always receive comparable mention in other textbooks." New Scientist
THEORY OF ELASTICITY
"I shall be surprised if this book does not come to be regarded as a masterpiece."
Journal of the Royal Institute of Physics (now the Physics Bulletin)
". . . the book is well constructed, ably translated, and excellently produced."
Journal of the Royal Aeronautical Society
ELECTRODYNAMICS OF CONTINUOUS MEDIA
"Within the volume one finds everything expected of a textbook on classical electricity and magnetism,
and a great deal more. It is quite certain that this book will remain unique and indispensable for many
years to come." Science Progress
"The volume on electrodynamics conveys a sense of mastery of the subject on the part of the authors
which is truly astonishing." Nature
08 01 601 9
Pergamon